Debian Med Project
Help us to see Debian used by medical practitioners and biomedical researchers! Join us on the Salsa page.
Summary
Covid-19
This task exists only for tagging COVID-19 relevant cases.

The Debian Med team intends to take part in the COVID-19 Biohackathon (April 5-11, 2020). This task was created only to list relevant packages.

Description

For a better overview of the project's availability as a Debian package, each head row has a color code according to this scheme:

If you discover a project which looks like a good candidate for Debian Med, or if you have prepared an unofficial Debian package, please do not hesitate to send a description of that project to the Debian Med mailing list.


Debian Med Covid-19 packages

Official Debian packages with high relevance

abacas
close gaps in genomic alignments from short reads
Versions of package abacas
Release   Version  Architectures
trixie    1.3.1-9  all
sid       1.3.1-9  all
stretch   1.3.1-3  all
buster    1.3.1-5  all
jessie    1.3.1-2  all
bullseye  1.3.1-9  all
bookworm  1.3.1-9  all
Debtags of package abacas:
role: program
Popcon: 2 users (2 upd.)*
License: DFSG free

ABACAS (Algorithm Based Automatic Contiguation of Assembled Sequences) intends to rapidly contiguate (align, order, orientate), visualize and design primers to close gaps on shotgun assembled contigs based on a reference sequence.

ABACAS uses MUMmer to find alignment positions and identify syntenies of assembled contigs against the reference. The output is then processed to generate a pseudomolecule taking overlapping contigs and gaps into account. ABACAS generates a comparison file that can be used to visualize ordered and oriented contigs in ACT. Synteny is represented by red bars where colour intensity decreases with lower values of percent identity between comparable blocks. Information on contigs such as the orientation, percent identity, coverage and overlap with other contigs can also be visualized by loading the output feature file in ACT.

The package is enhanced by the following packages: abacas-examples
Please cite: Samuel Assefa, Thomas M. Keane, Thomas D. Otto, Chris Newbold and Matthew Berriman: ABACAS: algorithm-based automatic contiguation of assembled sequences. (PubMed,eprint) Bioinformatics 25(15):1968-1969 (2009)
Topics: Probes and primers
abyss
de novo, parallel, sequence assembler for short reads
Versions of package abyss
Release            Version             Architectures
bullseye           2.2.5+dfsg-1        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm           2.3.5+dfsg-2        amd64,arm64,mips64el,ppc64el,s390x
jessie             1.5.2-1 (non-free)  amd64
stretch            2.0.2-2             amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
stretch-backports  2.1.5-7~bpo9+1      amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster             2.1.5-7             amd64,arm64,armhf,i386
sid                2.3.10-1            amd64,arm64,mips64el,ppc64el,riscv64
trixie             2.3.9-1             amd64,arm64,mips64el,ppc64el,riscv64
Debtags of package abyss:
role: program
Popcon: 3 users (5 upd.)*
License: DFSG free

ABySS is a de novo, parallel, sequence assembler that is designed for short reads. It may be used to assemble genome or transcriptome sequence data. Parallelization is achieved using MPI, OpenMP and pthread.

Please cite: Shaun D. Jackman, Benjamin P. Vandervalk, Hamid Mohamadi, Justin Chu, Sarah Yeo, S. Austin Hammond, Golnaz Jahesh, Hamza Khan, Lauren Coombe, Rene L. Warren and İnanç Birol: "ABySS 2.0: resource-efficient assembly of large genomes using a Bloom filter". (PubMed,eprint) Genome Research 27(5):768-777 (2017)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence assembly
allelecount
NGS copy number algorithms
Versions of package allelecount
Release   Version  Architectures
trixie    4.3.0-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  4.3.0-2  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  4.2.1-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       4.3.0-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (0 upd.)*
License: DFSG free

Support code for NGS copy number algorithms. Takes a file of locations and a [cr|b]am file and generates a count of coverage of each allele [ACGT] at that location (given any filter settings).
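
The counting step described above can be illustrated with a minimal Python sketch. It is not the alleleCount code and does not read [cr|b]am files: the `(start, sequence)` read tuples, the gap-free alignment and the function name `count_alleles` are assumptions made for illustration, and no filter settings are applied.

```python
from collections import Counter

def count_alleles(reads, positions):
    """Count A/C/G/T coverage at each 1-based position.

    `reads` is an iterable of (start, sequence) tuples, where `start` is the
    1-based reference coordinate of the first base. Reads are assumed to align
    without gaps (a simplification; real alignments carry CIGAR strings).
    """
    counts = {pos: Counter({base: 0 for base in "ACGT"}) for pos in positions}
    for start, seq in reads:
        end = start + len(seq) - 1
        for pos in positions:
            if start <= pos <= end:
                base = seq[pos - start].upper()
                if base in "ACGT":  # skip N and other ambiguity codes
                    counts[pos][base] += 1
    return counts

# Hypothetical toy data: three short reads and three locations of interest.
reads = [(1, "ACGTAC"), (3, "GTTC"), (4, "TANA")]
result = count_alleles(reads, [4, 5, 6])
for pos in sorted(result):
    print(pos, dict(result[pos]))
```

A real implementation would take the locations from a file and apply base-quality and mapping-quality filters before counting.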

The alleleCount package primarily exists to prevent code duplication between some other projects, specifically AscatNGS and Battenberg.

Registry entries: Bioconda 
assembly-stats
get assembly statistics from FASTA and FASTQ files
Versions of package assembly-stats
Release   Version     Architectures
sid       1.0.1+ds-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    1.0.1+ds-6  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.0.1+ds-6  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1.0.1+ds-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
License: DFSG free

Get statistics from a list of files.

Detection of FASTA or FASTQ format of each file is automatic from the file contents, so file names and extensions are irrelevant.

The default output format is human readable. You can change the output format and ignore sequences shorter than a given length.
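
To make the behaviour above concrete, here is a small Python sketch of content-based format detection and a length filter. It is not the assembly-stats implementation: the choice of N50 as the example statistic, the four-lines-per-record FASTQ layout and the names `read_lengths` and `assembly_stats` are assumptions.

```python
def read_lengths(lines):
    """Yield sequence lengths, detecting FASTA vs FASTQ from the first character."""
    lines = [ln.rstrip("\n") for ln in lines if ln.strip()]
    if not lines:
        return
    if lines[0].startswith(">"):      # FASTA: a sequence may span several lines
        length = None
        for ln in lines:
            if ln.startswith(">"):
                if length is not None:
                    yield length
                length = 0
            else:
                length += len(ln)
        if length is not None:
            yield length
    elif lines[0].startswith("@"):    # FASTQ: assumes exactly 4 lines per record
        for i in range(1, len(lines), 4):
            yield len(lines[i])
    else:
        raise ValueError("input is neither FASTA nor FASTQ")

def assembly_stats(lengths, min_length=0):
    """Return count, total length and N50, ignoring sequences below min_length."""
    kept = sorted((n for n in lengths if n >= min_length), reverse=True)
    total = sum(kept)
    running, n50 = 0, 0
    for n in kept:
        running += n
        if running * 2 >= total:      # first length covering half of the total
            n50 = n
            break
    return {"count": len(kept), "total": total, "n50": n50}

# Hypothetical input: three contigs, the shortest one filtered out.
fasta = [">c1", "ACGTACGT", "ACGT", ">c2", "ACGTAC", ">c3", "AC"]
stats = assembly_stats(read_lengths(fasta), min_length=3)
print(stats)
```

Because detection looks only at the first character of the content, the same functions work regardless of what the file is called.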

Registry entries: Bioconda 
augur
pipeline components for real-time virus analysis
Versions of package augur
Release           Version          Architectures
trixie            24.4.0-1         all
bullseye          11.0.0-1         all
buster-backports  6.4.2-2~bpo10+1  all
bookworm          20.0.0-1         all
sid               24.4.0-1         all
upstream          26.0.0
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free

The nextstrain project is an attempt to make flexible informatic pipelines and visualization tools to track ongoing pathogen evolution as sequence data emerges. The nextstrain project derives from nextflu, which was specific to influenza evolution.

nextstrain is comprised of three components:

  • fauna: database and IO scripts for sequence and serological data
  • augur: informatic pipelines to conduct inferences from raw data
  • auspice: web app to visualize resulting inferences
bamclipper
Remove gene-specific primer sequences from SAM/BAM alignments
Versions of package bamclipper
Release   Version  Architectures
bookworm  1.0.0-3  all
trixie    1.0.0-3  all
sid       1.0.0-3  all
Popcon: 1 users (2 upd.)*
License: DFSG free

Remove gene-specific primer sequences from SAM/BAM alignments of PCR amplicons by soft-clipping.

bamclipper.sh soft-clips gene-specific primers from a BAM alignment file based on the genomic coordinates of primer pairs in BEDPE format.

Please cite: Chun Hang Au, Dona N Ho, Ava Kwong, Tsun Leung Chan and Edmond S K Ma: BAMClipper: removing primers from alignments to minimize false-negative mutations in amplicon next-generation sequencing. (PubMed,eprint) Scientific Reports 7(1):1567 (2017)
Registry entries: Bioconda 
bamkit
tools for common BAM file manipulations
Versions of package bamkit
Release   Version                      Architectures
bullseye  0.0.1+git20170413.ccd079d-2  all
sid       0.0.1+git20170413.ccd079d-3  all
bookworm  0.0.1+git20170413.ccd079d-3  all
trixie    0.0.1+git20170413.ccd079d-3  all
Popcon: 3 users (2 upd.)*
License: DFSG free

This package provides some Python3 tools for common BAM file manipulations.

bbmap
BBTools genomic aligner and other tools for short sequences
Versions of package bbmap
Release           Version               Architectures
bookworm          39.01+dfsg-2          all
buster-backports  38.63+dfsg-1~bpo10+1  all
bullseye          38.90+dfsg-1          all
trixie            39.09+dfsg-1          all
sid               39.11+dfsg-1          all
Popcon: 0 users (4 upd.)*
License: DFSG free

The BBTools are a collection of small programs to solve recurrent tasks for the creative handling of short biological RNA/DNA sequences. This suite may be best known for its mapper, which is also the name of the project on sourceforge, but several tools have been added over time. All tools are multi-threaded, implemented platform-independently in Java:

BBMap: Short read aligner for DNA and RNA-seq data. Capable of handling arbitrarily large genomes with millions of scaffolds. Handles Illumina, PacBio, 454, and other reads; very high sensitivity and tolerant of errors and numerous large indels.

BBNorm: Kmer-based error-correction and normalization tool.

Dedupe: Simplifies assemblies by removing duplicate or contained subsequences that share a target percent identity.

Reformat: Reformats reads between fasta/fastq/scarf/fasta+qual/sam, interleaved/paired, and ASCII-33/64, at over 500 MB/s.

BBDuk: Filters, trims, or masks reads with kmer matches to an artifact/contaminant file.
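
As an illustration of the ASCII-33/64 conversion that Reformat is described as handling, the following Python sketch re-encodes a FASTQ quality string between the two offsets. It is not BBTools code (BBTools is written in Java); the function names are hypothetical and scores below zero, as in the old Solexa scale, are simply rejected.

```python
def phred64_to_phred33(quality):
    """Convert a FASTQ quality string from ASCII-64 to ASCII-33 encoding."""
    scores = [ord(ch) - 64 for ch in quality]
    if min(scores, default=0) < 0:
        raise ValueError("quality string is not ASCII-64 encoded")
    return "".join(chr(score + 33) for score in scores)

def phred33_to_phred64(quality):
    """Convert a FASTQ quality string from ASCII-33 to ASCII-64 encoding."""
    return "".join(chr(ord(ch) - 33 + 64) for ch in quality)

# 'h' encodes score 40 and '@' encodes score 0 in the ASCII-64 scheme.
converted = phred64_to_phred33("hhh@")
print(converted)
```

Only the offset changes; the underlying quality scores are identical before and after conversion.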

The package is enhanced by the following packages: multiqc
Please cite: Brian Bushnell, Jonathan Rood and Esther Singer: BBMerge – Accurate paired shotgun read merging via overlap. (PubMed,eprint) PLOS One (2017)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bcalm
de Bruijn compaction in low memory
Versions of package bcalm
Release   Version  Architectures
trixie    2.2.3-5  amd64,arm64,mips64el,ppc64el,riscv64
bookworm  2.2.3-4  amd64,arm64,mips64el,ppc64el
sid       2.2.3-5  amd64,arm64,mips64el,ppc64el,riscv64
bullseye  2.2.3-1  amd64,arm64,i386,mips64el,ppc64el,s390x
Popcon: 2 users (2 upd.)*
License: DFSG free

A bioinformatics tool for constructing the compacted de Bruijn graph from sequencing data.

This is the parallel version of the BCALM software using gatb-core library.
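
For readers unfamiliar with the term, a compacted de Bruijn graph merges every non-branching chain of k-mers into a single sequence (a unitig). The Python sketch below shows the idea on toy data only. It is not BCALM's algorithm: it keeps all k-mers in memory, ignores reverse complements and abundance filtering, and the function name `compact_de_bruijn` is an assumption.

```python
def compact_de_bruijn(reads, k):
    """Return the unitigs (maximal non-branching paths) of the k-mer graph."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}

    def successors(kmer):
        return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in kmers]

    def predecessors(kmer):
        return [b + kmer[:-1] for b in "ACGT" if b + kmer[:-1] in kmers]

    def mergeable(left, right):
        # left -> right is the only edge out of left and the only edge into right
        return successors(left) == [right] and predecessors(right) == [left]

    unitigs, visited = [], set()
    for kmer in sorted(kmers):
        if kmer in visited:
            continue
        # Walk backwards to the first k-mer of this unitig.
        start, walked = kmer, {kmer}
        while True:
            preds = predecessors(start)
            if len(preds) == 1 and mergeable(preds[0], start) and preds[0] not in walked:
                start = preds[0]
                walked.add(start)
            else:
                break
        # Walk forwards, appending one base per merged k-mer.
        path, current = start, start
        visited.add(start)
        while True:
            succs = successors(current)
            if len(succs) == 1 and mergeable(current, succs[0]) and succs[0] not in visited:
                current = succs[0]
                visited.add(current)
                path += current[-1]
            else:
                break
        unitigs.append(path)
    return sorted(unitigs)

# Two hypothetical reads that share a prefix and then diverge.
unitigs = compact_de_bruijn(["ACGTT", "ACGTA"], 3)
print(unitigs)
```

The shared prefix collapses into one unitig and the graph branches where the two reads differ.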

Please cite: Rayan Chikhi, Antoine Limasset and Paul Medvedev: Compacting de Bruijn graphs from sequencing data quickly and in low memory. (eprint) Bioinformatics 32(12):208 (2016)
bcftools
genomic variant calling and manipulation of VCF/BCF files
Versions of package bcftools
Release            Version       Architectures
trixie             1.20-2        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch            1.3.1-1       amd64,arm64,armel,mips64el,mipsel,ppc64el
buster             1.9-1         amd64,arm64,armhf
sid                1.20-2        amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm           1.16-1        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bullseye           1.11-1        amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
stretch-backports  1.8-1~bpo9+1  amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el
upstream           1.21
Popcon: 23 users (2 upd.)*
Newer upstream!
License: DFSG free

BCFtools is a set of utilities that manipulate variant calls in the Variant Call Format (VCF) and its binary counterpart BCF. All commands work transparently with both VCFs and BCFs, both uncompressed and BGZF-compressed.

The package is enhanced by the following packages: multiqc
Please cite: Petr Danecek and Shane A. McCarthy: BCFtools/csq: Haplotype-aware variant consequences. (2016)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bedtools
suite of utilities for comparing genomic features
Versions of package bedtools
Release   Version        Architectures
buster    2.27.1+dfsg-4  amd64,arm64,armhf
sid       2.31.1+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    2.31.1+dfsg-2  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  2.30.0+dfsg-3  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  2.30.0+dfsg-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch   2.26.0+dfsg-3  amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
jessie    2.21.0-1       amd64,armhf,i386
Debtags of package bedtools:
field: biology, biology:bioinformatics
interface: commandline
role: program
scope: suite
use: analysing, comparing, converting, filtering
works-with: biological-sequence
Popcon: 30 users (10 upd.)*
License: DFSG free

The BEDTools utilities allow one to address common genomics tasks such as finding feature overlaps and computing coverage. The utilities are largely based on four widely-used file formats: BED, GFF/GTF, VCF, and SAM/BAM. Using BEDTools, one can develop sophisticated pipelines that answer complicated research questions by streaming several BEDTools together.
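
The notion of a feature overlap can be shown with a short Python sketch. This is not how BEDTools is implemented; it is a naive pairwise comparison, and the function name `overlaps` and the sample features are assumptions. Coordinates follow the BED convention of a 0-based start and an exclusive end.

```python
def overlaps(features_a, features_b):
    """Yield (a, b, overlap_bp) for every pair of overlapping features.

    Features are (chrom, start, end) tuples in BED-style coordinates:
    0-based start, end exclusive.
    """
    by_chrom = {}
    for feat in features_b:
        by_chrom.setdefault(feat[0], []).append(feat)
    for a in features_a:
        for b in by_chrom.get(a[0], []):
            shared = min(a[2], b[2]) - max(a[1], b[1])
            if shared > 0:            # features that merely touch do not overlap
                yield a, b, shared

# Hypothetical gene and peak intervals.
genes = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 100, 200)]
peaks = [("chr1", 150, 250), ("chr1", 190, 510), ("chr2", 200, 300)]
hits = list(overlaps(genes, peaks))
for gene, peak, shared in hits:
    print(gene, peak, shared)
```

Production tools avoid the pairwise loop by sorting or indexing the intervals, which matters once the files hold millions of features.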

The groupBy utility is distributed in the filo package.

Please cite: Aaron R. Quinlan and Ira M. Hall: BEDTools: a flexible suite of utilities for comparing genomic features. (PubMed,eprint) Bioinformatics 26(6):841-842 (2010)
Registry entries: Bio.tools  SciCrunch  Bioconda 
biobambam2
tools for early stage alignment file processing
Versions of package biobambam2
Release   Version                         Architectures
sid       2.0.185+ds-2                    amd64,i386,mips64el,ppc64el,riscv64
trixie    2.0.185+ds-2                    amd64,i386,mips64el,ppc64el,riscv64
bookworm  2.0.185+ds-1                    amd64,arm64,i386,ppc64el
bullseye  2.0.179+ds-1                    amd64,arm64,i386,ppc64el
upstream  2.0.185-release-20221211202123
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free

This package contains some tools for processing BAM files, including

  bamsormadup:       parallel sorting and duplicate marking
  bamcollate2:       reads BAM and writes BAM reordered such that alignments
                     are collated by query name
  bammarkduplicates: reads BAM and writes BAM with duplicate alignments
                     marked using the BAM flags field
  bammaskflags:      reads BAM and writes BAM while masking (removing) bits
                     from the flags column
  bamrecompress:     reads BAM and writes BAM with a defined compression
                     setting. This tool is capable of multi-threading.
  bamsort:           reads BAM and writes BAM resorted by coordinates or
                     query name
  bamtofastq:        reads BAM and writes FastQ; output can be collated
                     or uncollated by query name
The package is enhanced by the following packages: multiqc
Please cite: German Tischler and Steven Leonard: biobambam: tools for read pair collation based algorithms on BAM files. (eprint) Source Code Biol Med. 9:13 (2014)
Registry entries: Bio.tools  SciCrunch  Bioconda 
bowtie2
ultrafast memory-efficient short read aligner
Versions of package bowtie2
Release   Version    Architectures
stretch   2.3.0-2    amd64
jessie    2.2.4-1    amd64
buster    2.3.4.3-1  amd64
bullseye  2.4.2-2    amd64,arm64,mips64el,ppc64el
sid       2.5.4-1    amd64,arm64,mips64el,ppc64el,riscv64
trixie    2.5.4-1    amd64,arm64,mips64el,ppc64el,riscv64
bookworm  2.5.0-3    amd64,arm64,mips64el,ppc64el
Popcon: 20 users (16 upd.)*
License: DFSG free

Bowtie 2 is an ultrafast and memory-efficient tool for aligning sequencing reads to long reference sequences. It is particularly good at aligning reads of about 50 up to 100s or 1,000s of characters, and particularly good at aligning to relatively long (e.g. mammalian) genomes.

Bowtie 2 indexes the genome with an FM Index to keep its memory footprint small: for the human genome, its memory footprint is typically around 3.2 GB. Bowtie 2 supports gapped, local, and paired-end alignment modes.
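
To give a feel for what an FM index does, the Python sketch below builds a Burrows-Wheeler transform of a toy sequence and counts exact pattern occurrences by backward search. It is a teaching example, not Bowtie 2's index: it sorts full rotations, stores uncompressed occurrence tables, and supports exact matching only. All function names are assumptions.

```python
def bwt(text):
    """Burrows-Wheeler transform via sorted rotations (fine for tiny inputs)."""
    text += "$"                       # sentinel, sorts before A, C, G and T
    rotations = sorted(text[i:] + text[:i] for i in range(len(text)))
    return "".join(rot[-1] for rot in rotations)

def fm_index(bwt_text):
    """Build the first-column offsets and naive occurrence tables."""
    first, total = {}, 0
    for ch in sorted(set(bwt_text)):
        first[ch] = total             # number of characters smaller than ch
        total += bwt_text.count(ch)
    occ = {ch: [0] for ch in first}   # occ[ch][i] = count of ch in bwt_text[:i]
    for seen in bwt_text:
        for ch in occ:
            occ[ch].append(occ[ch][-1] + (ch == seen))
    return first, occ

def count_matches(pattern, first, occ, n):
    """Count exact occurrences of pattern by backward search."""
    lo, hi = 0, n
    for ch in reversed(pattern):
        if ch not in first:
            return 0
        lo = first[ch] + occ[ch][lo]
        hi = first[ch] + occ[ch][hi]
        if lo >= hi:
            return 0
    return hi - lo

# Hypothetical toy "genome".
genome = "ACGTACGTTACG"
transformed = bwt(genome)
first, occ = fm_index(transformed)
hits = count_matches("ACG", first, occ, len(transformed))
print(hits)
```

The search touches one table entry per pattern character, so its cost depends on the read length rather than on the genome size, which is what makes this family of indexes attractive for read alignment.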

The package is enhanced by the following packages: bowtie2-examples multiqc
Please cite: Ben Langmead and Steven L Salzberg: Fast gapped-read alignment with Bowtie 2. (PubMed) Nature Methods 9:357–359 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Genomics
busco
benchmarking sets of universal single-copy orthologs
Versions of package busco
Release   Version  Architectures
bullseye  5.0.0-1  all
sid       5.5.0-2  amd64,arm64,i386
trixie    5.5.0-2  amd64,arm64,i386
bookworm  5.4.4-1  amd64,i386
Popcon: 4 users (2 upd.)*
License: DFSG free

Assessing genome assembly and annotation completeness with Benchmarking Universal Single-Copy Orthologs (BUSCO).

  • Automated selection of lineages issued from https://www.orthodb.org/
  • Automated download of all necessary files and datasets to conduct a run
  • Use prodigal for non-eukaryotic genomes
The package is enhanced by the following packages: multiqc
Please cite: Mathieu Seppey, Mosè Manni and Evgeny M. Zdobnov: BUSCO: Assessing Genome Assembly and Annotation Completeness. (PubMed) Methods Mol Biol. 1962:227-245 (2019)
Registry entries: Bio.tools  Bioconda 
bustools
program for manipulating BUS files for single cell RNA-Seq datasets
Versions of package bustools
Release   Version        Architectures
trixie    0.43.2+dfsg-1  amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm  0.42.0+dfsg-1  amd64,arm64,mips64el,ppc64el,s390x
bullseye  0.40.0-4       amd64,arm64,mips64el,ppc64el,s390x
sid       0.43.2+dfsg-1  amd64,arm64,mips64el,ppc64el,riscv64,s390x
upstream  0.44.1
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free

This package contains the BUStools program, which can be used to error-correct barcodes, collapse UMIs, and produce gene count or transcript compatibility count matrices.
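
The two operations named above, barcode error correction and UMI collapsing, can be sketched in a few lines of Python. This is not the BUStools implementation and does not read BUS files: the whitelist, the Hamming-distance-1 rule, the `(barcode, umi, gene)` record layout and the function names are all assumptions for illustration.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def correct_barcode(barcode, whitelist):
    """Return the whitelist barcode matching exactly or at Hamming distance 1.

    Returns None when there is no match or the match is ambiguous.
    """
    if barcode in whitelist:
        return barcode
    candidates = [w for w in whitelist
                  if len(w) == len(barcode) and hamming(w, barcode) == 1]
    return candidates[0] if len(candidates) == 1 else None

def collapse_umis(records, whitelist):
    """Count distinct UMIs per (corrected barcode, gene) pair."""
    seen = set()
    for barcode, umi, gene in records:
        fixed = correct_barcode(barcode, whitelist)
        if fixed is not None:
            seen.add((fixed, gene, umi))
    counts = {}
    for barcode, gene, _umi in seen:
        counts[(barcode, gene)] = counts.get((barcode, gene), 0) + 1
    return counts

# Hypothetical records: one barcode carries a sequencing error, one is unrecoverable.
whitelist = {"AAAA", "CCCC"}
records = [("AAAA", "U1", "g1"), ("AAAT", "U1", "g1"), ("AAAA", "U2", "g1"),
           ("CCCC", "U1", "g2"), ("ACAC", "U3", "g2")]
counts = collapse_umis(records, whitelist)
print(counts)
```

The resulting dictionary is a sparse gene count matrix keyed by cell barcode and gene.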

Please cite: Páll Melsted, A. Sina Booeshaghi, Fan Gao, Eduardo Beltrame, Lambda Lu, Kristján Eldjárn Hjorleifsson, Jase Gehring and Lior Pachter: Modular and efficient pre-processing of single-cell RNA-seq. BioRxiv :673285 (2019)
Registry entries: Bio.tools  Bioconda 
bwa
Burrows-Wheeler Aligner
Versions of package bwa
Release            Version          Architectures
trixie             0.7.18-1         amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch-backports  0.7.17-1~bpo9+1  amd64
buster             0.7.17-3         amd64
bullseye           0.7.17-6         amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm           0.7.17-7         amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid                0.7.18-1         amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch            0.7.15-2+deb9u1  amd64
jessie             0.7.10-1         amd64
Debtags of package bwa:
biology: nuceleic-acids, peptidic
field: biology, biology:bioinformatics
interface: commandline, text-mode
role: program
use: analysing, comparing
Popcon: 20 users (9 upd.)*
License: DFSG free

BWA is a software package for mapping low-divergent sequences against a large reference genome, such as the human genome. It consists of three algorithms: BWA-backtrack, BWA-SW and BWA-MEM. The first algorithm is designed for Illumina sequence reads up to 100bp, while the other two are for longer sequences ranging from 70bp to 1Mbp. BWA-MEM and BWA-SW share similar features such as long-read support and split alignment, but BWA-MEM, which is the latest, is generally recommended for high-quality queries as it is faster and more accurate. BWA-MEM also has better performance than BWA-backtrack for 70-100bp Illumina reads.

Please cite: Heng Li and Richard Durbin: Fast and accurate short read alignment with Burrows-Wheeler transform. (PubMed,eprint) Bioinformatics 25(14):1754-1760 (2009)
Registry entries: Bio.tools  SciCrunch  Bioconda 
cat-bat
taxonomic classification of contigs and metagenome-assembled genomes (MAGs)
Versions of package cat-bat
Release   Version  Architectures
bullseye  5.2.2-1  amd64,arm64,ppc64el,s390x
trixie    5.3-2    amd64,arm64,ppc64el,riscv64,s390x
bookworm  5.2.3-2  amd64,arm64,ppc64el,s390x
sid       5.3-2    amd64,arm64,ppc64el,riscv64,s390x
upstream  6.0.1
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free

Contig Annotation Tool (CAT) and Bin Annotation Tool (BAT) are pipelines for the taxonomic classification of long DNA sequences and metagenome assembled genomes (MAGs/bins) of both known and (highly) unknown microorganisms, as generated by contemporary metagenomics studies. The core algorithm of both programs involves gene calling, mapping of predicted ORFs against the nr protein database, and voting-based classification of the entire contig / MAG based on classification of the individual ORFs. CAT and BAT can be run from intermediate steps if files are formatted appropriately.
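
The voting step can be pictured with the following Python sketch, in which each ORF votes for its own lineage and the contig receives the deepest taxon that a majority of the weighted votes agree on. This is an illustration of the general idea, not the CAT/BAT algorithm: the score weighting, the `min_fraction` threshold and the function name are assumptions.

```python
def classify_contig(orf_hits, min_fraction=0.5):
    """Assign a lineage to a contig from the lineages of its ORFs.

    `orf_hits` is a list of (lineage, score) pairs, where lineage is a tuple
    of taxa from root to leaf. A taxon is accepted when the ORFs carrying it
    contribute more than `min_fraction` of the total score. With a threshold
    of at least 0.5 the accepted taxa always form a single lineage.
    """
    total = sum(score for _lineage, score in orf_hits)
    if total <= 0:
        return ()
    support = {}
    for lineage, score in orf_hits:
        for depth in range(1, len(lineage) + 1):
            prefix = lineage[:depth]
            support[prefix] = support.get(prefix, 0) + score
    accepted = [p for p, s in support.items() if s / total > min_fraction]
    return max(accepted, key=len, default=())

# Hypothetical ORF hits on one contig: the genus is contested, the phylum is not.
hits = [
    (("Bacteria", "Proteobacteria", "Escherichia"), 200),
    (("Bacteria", "Proteobacteria", "Salmonella"), 150),
    (("Bacteria", "Firmicutes", "Bacillus"), 50),
]
lineage = classify_contig(hits)
print(lineage)
```

When the ORFs disagree at a low rank, the classification falls back to the deepest rank that still has majority support instead of forcing a call.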

Please cite: F. A. Bastiaan von Meijenfeldt, Ksenia Arkhipova, Diego D. Cambuy, Felipe H. Coutinho and Bas E. Dutilh: Robust taxonomic classification of uncharted microbial sequences and bins with CAT and BAT. (PubMed,eprint) Genome Biology 20(1):217 (2019)
Registry entries: Bioconda 
centrifuge
rapid and memory-efficient system for classification of DNA sequences
Versions of package centrifuge
Release   Version    Architectures
buster    1.0.3-2    amd64
bookworm  1.0.3-11   amd64,arm64,armel,armhf,i386,mips64el,ppc64el
sid       1.0.4.2-1  amd64,arm64,mips64el,ppc64el,riscv64
bullseye  1.0.3-8    amd64,arm64,armel,armhf,i386,mips64el,ppc64el
trixie    1.0.4.2-1  amd64,arm64,mips64el,ppc64el,riscv64
Popcon: 1 users (4 upd.)*
License: DFSG free

Centrifuge is a very rapid and memory-efficient system for the classification of DNA sequences from microbial samples, with better sensitivity than and comparable accuracy to other leading systems. The system uses a novel indexing scheme based on the Burrows-Wheeler transform (BWT) and the Ferragina-Manzini (FM) index, optimized specifically for the metagenomic classification problem. Centrifuge requires a relatively small index (e.g., 4.3 GB for ~4,100 bacterial genomes) yet provides very fast classification speed, allowing it to process a typical DNA sequencing run within an hour. Together these advances enable timely and accurate analysis of large metagenomics data sets on conventional desktop computers.

Please cite: Daehwan Kim, Li Song, Florian P. Breitwieser and Steven L. Salzberg: Centrifuge: rapid and sensitive classification of metagenomic sequences. (PubMed,eprint) Genome Research 26(12):1721-1729 (2016)
Registry entries: Bio.tools  Bioconda 
changeo
Repertoire clonal assignment toolkit (Python 3)
Versions of package changeo
Release   Version  Architectures
buster    0.4.5-1  all
sid       1.3.0-2  all
trixie    1.3.0-2  all
bookworm  1.3.0-1  all
bullseye  1.0.2-1  all
Popcon: 1 users (2 upd.)*
License: DFSG free

Change-O is a collection of tools for processing the output of V(D)J alignment tools, assigning clonal clusters to immunoglobulin (Ig) sequences, and reconstructing germline sequences.

Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of Ig repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of B cells and T cells. Change-O is a suite of utilities to facilitate advanced analysis of Ig and TCR sequences following germline segment assignment. Change-O handles output from IMGT/HighV-QUEST and IgBLAST, and provides a wide variety of clustering methods for assigning clonal groups to Ig sequences. Record sorting, grouping, and various database manipulation operations are also included.

This package installs the library for Python 3.

Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Link to publication (PubMed,eprint) Bioinformatics 31(20):3356-3358 (2015)
Registry entries: Bioconda 
chip-seq
tools performing common ChIP-Seq data analysis tasks
Versions of package chip-seq
Release           Version          Architectures
bookworm          1.5.5-3          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie            1.5.5-3          amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster-backports  1.5.5-3~bpo10+1  amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye          1.5.5-3          amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid               1.5.5-3          amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (0 upd.)*
License: DFSG free

The ChIP-Seq software provides a set of tools performing common genome-wide ChIP-seq analysis tasks, including positional correlation analysis, peak detection, and genome partitioning into signal-rich and signal-poor regions. These tools exist as stand-alone C programs and perform the following tasks:

 1. Positional correlation analysis and generation of an aggregation
    plot (AP) (chipcor),
 2. Extraction of specific genome annotation features around reference
    anchor points (chipextract),
 3. Read centering or shifting (chipcenter),
 4. Narrow peak caller using a fixed width peak size (chippeak),
 5. Broad peak caller used for large regions of enrichment (chippart),
 6. Feature selection tool based on a read count threshold (chipscore).

Because the ChIP-Seq tools are primarily optimized for speed, they use their own compact format for ChIP-seq data representation called SGA (Simplified Genome Annotation). SGA is a line-oriented, tab-delimited plain text format.

Please cite: Giovanna Ambrosini, René Dreos, Sunil Kumar and Philipp Bucher: The ChIP-Seq tools and web server: a resource for analyzing ChIP-seq and other types of genomic data. (PubMed,eprint) BMC Genomics 17(1):938 (2016)
clonalframeml
Efficient Inference of Recombination in Whole Bacterial Genomes
Versions of package clonalframeml
Release   Version  Architectures
sid       1.13-1   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster    1.11-3   amd64,arm64,armhf,i386
bullseye  1.12-1   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm  1.12-3   amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie    1.13-1   amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
License: DFSG free

ClonalFrameML is a software package that performs efficient inference of recombination in bacterial genomes. ClonalFrameML was created by Xavier Didelot and Daniel Wilson. ClonalFrameML can be applied to any type of aligned sequence data, but is especially aimed at analysis of whole genome sequences. It is able to compare hundreds of whole genomes in a matter of hours on a standard desktop computer. There are three main outputs from a run of ClonalFrameML: a phylogeny with branch lengths corrected to account for recombination, an estimation of the key parameters of the recombination process, and a genomic map of where recombination took place for each branch of the phylogeny.

ClonalFrameML is a maximum likelihood implementation of the Bayesian software ClonalFrame, which was previously described by Didelot and Falush (2007). The recombination model underpinning ClonalFrameML is exactly the same as for ClonalFrame, but this new implementation is a lot faster, is able to deal with much larger genomic datasets, and does not suffer from MCMC convergence issues.

Please cite: Xavier Didelot and Daniel J. Wilson: ClonalFrameML: Efficient Inference of Recombination in Whole Bacterial Genomes. (PubMed,eprint) PLoS Comput Biology 11(2):e1004041 (2015)
Registry entries: Bioconda 
cutadapt
Clean biological sequences from high-throughput sequencing reads
Versions of package cutadapt
Release   Version  Architectures
bullseye  3.2-2    all
trixie    4.7-2    all
buster    1.18-1   all
sid       4.7-2    all
stretch   1.12-2   all
bookworm  4.2-1    all
upstream  4.9
Popcon: 8 users (5 upd.)*
Newer upstream!
License: DFSG free

Cutadapt helps with biological sequence cleaning tasks by finding adapter or primer sequences in an error-tolerant way. It can also modify and filter reads in various ways. Adapter sequences can contain IUPAC wildcard characters. Paired-end reads and even colorspace data are also supported. If you want, you can also just demultiplex your input data, without removing adapter sequences at all.
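
What error-tolerant adapter finding means can be sketched as follows. This Python example is not Cutadapt's algorithm: it allows mismatches only (no insertions or deletions), handles only the `N` wildcard rather than the full IUPAC set, and the function name, the `max_error_rate` and `min_overlap` parameters and their defaults are assumptions chosen for illustration.

```python
def trim_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Remove a 3' adapter, tolerating mismatches (no indels in this sketch).

    The adapter may be cut off by the end of the read, so every start position
    is tried and only the overlapping part is compared.
    """
    for start in range(len(read) - min_overlap + 1):
        overlap = min(len(adapter), len(read) - start)
        mismatches = sum(
            1 for r, a in zip(read[start:start + overlap], adapter)
            if r != a and a != "N"    # N in the adapter matches any base
        )
        if mismatches <= int(max_error_rate * overlap):
            return read[:start]
    return read

# Hypothetical reads: exact adapter, adapter with one error, truncated adapter.
adapter = "AGATCGGAAGAGC"
exact = trim_3prime_adapter("ACGTACGT" + adapter, adapter)
noisy = trim_3prime_adapter("ACGTACGT" + "AGATCGGTAGAGC", adapter)
partial = trim_3prime_adapter("ACGTACGT" + "AGATC", adapter)
print(exact, noisy, partial)
```

Scaling the number of allowed mismatches with the overlap length keeps short, partial adapter matches strict while still tolerating sequencing errors in full-length matches.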

This package contains the user interface.

The package is enhanced by the following packages: multiqc
Please cite: Marcel Martin: Cutadapt removes adapter sequences from high-throughput sequencing reads. (eprint) EMBnet.journal 17(1):10-12 (2011)
Registry entries: Bio.tools  SciCrunch  Bioconda 
cwltool
Common Workflow Language reference implementation
Versions of package cwltool
Release   Version                       Architectures
stretch   1.0.20170114120503-1          all
bullseye  3.0.20210124104916-3+deb11u1  all
sid       3.1.20241024121129-1          all
trixie    3.1.20241024121129-1          all
bookworm  3.1.20230209161050-1          all
buster    1.0.20181217162649+dfsg-10    all
upstream  3.1.20241112140730
Popcon: 24 users (23 upd.)*
Newer upstream!
License: DFSG free

This is the reference implementation of the Common Workflow Language standards.

The CWL open standards are for describing analysis workflows and tools in a way that makes them portable and scalable across a variety of software and hardware environments, from workstations to cluster, cloud, and high performance computing (HPC) environments. CWL is designed to meet the needs of data-intensive science, such as Bioinformatics, Medical Imaging, Astronomy, Physics, and Chemistry.

The CWL reference implementation (cwltool) is intended to be feature complete and to provide comprehensive validation of CWL files as well as provide other tools related to working with CWL descriptions.

Please cite: Michael R. Crusoe, Sanne Abeln, Alexandru Iosup, Peter Amstutz, John Chilton, Nebojša Tijanić, Hervé Ménager, Stian Soiland-Reye, Bogdan Gavrilović, Carole Goble and The CWL Community: Methods included: standardizing computational reuse and portability with the Common Workflow Language. Communications of the ACM 65(6):54-63 (2022)
Registry entries: SciCrunch  Bioconda 
dcmtk
OFFIS DICOM toolkit command line utilities
Versions of package dcmtk
Release             Version            Architectures
bullseye            3.6.5-1            amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie              3.6.0-15+deb8u1    amd64,armel,armhf,i386
trixie              3.6.8-6            amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie-security     3.6.0-15+deb8u1    amd64,armel,armhf,i386
bookworm            3.6.7-9~deb12u1    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch             3.6.1~20160216-4   amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster              3.6.4-2.1          amd64,arm64,armhf,i386
buster-security     3.6.4-2.1+deb10u1  amd64,arm64,armhf,i386
bullseye-backports  3.6.7-6~bpo11+1    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid                 3.6.8-6            amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package dcmtk:
interface: commandline
role: program
scope: utility
use: converting, downloading
works-with: image, image:raster
Popcon: 92 users (113 upd.)*
License: DFSG free

DCMTK includes a collection of libraries and applications for examining, constructing and converting DICOM image files, handling offline media, sending and receiving images over a network connection, as well as demonstrative image storage and worklist servers.

This package contains the DCMTK utility applications.

Note: This version was compiled with libssl support.

Please cite: Chung-Yueh Lien, Michael Onken, Marco Eichelberg, Tsair Kao and Andreas Hein: Open Source Tools for Standardized Privacy Protection of Medical Images. (eprint) Progress in Biomedical Optics and Imaging - Proceedings of SPIE 7967:79670M-79670M (2011)
delly
Structural variant discovery by read analysis
Versions of package delly
Release   Version  Architectures
bullseye  0.8.7-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster    0.8.1-2  amd64,arm64,armhf
trixie    1.1.8-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.1.6-1  amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid       1.1.8-1  amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream  1.3.1
Popcon: 0 users (2 upd.)*
Newer upstream!
License: DFSG free

Delly performs structural variant discovery by integrated paired-end and split-read analysis. It discovers, genotypes and visualizes deletions, tandem duplications, inversions and translocations at single-nucleotide resolution in short-read massively parallel sequencing data. It uses paired-ends, split-reads and read-depth to sensitively and accurately delineate genomic rearrangements throughout the genome.

Please cite: Tobias Rausch, Thomas Zichner, Andreas Schlattl, Adrian M. Stuetz, Vladimir Benes and Jan O. Korbel: DELLY: structural variant discovery by integrated paired-end and split-read analysis. Bioinformatics 28:i333-i339 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
dextractor
(d)extractor and compression command library
Versions of package dextractor
Release   Version  Architectures
sid       1.0-6    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie    1.0-6    amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm  1.0-6    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye  1.0-4    amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 3 users (2 upd.)*
License: DFSG free

Dextractor commands allow one to pull exactly and only the information needed for assembly and reconstruction from the source HDF5 files produced by the PacBio RS II sequencer, or from the source BAM files produced by the PacBio Sequel sequencer.

For each of the three extracted file types -- fasta, quiva, and arrow -- the library contains commands to compress the given file type, and to decompress it, which is a reversible process delivering the original uncompressed file. The compressed .fasta files, with the extension .dexta, consume 1/4 byte per base. The compressed .quiva files, with the extension .dexqv, consume 1.5 bytes per base on average, and the compressed .arrow files, with the extension .dexar, consume 1/4 byte per base.
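
The figure of 1/4 byte per base corresponds to storing each of the four bases in two bits. The Python sketch below shows one way such a reversible packing could work. It does not reproduce the actual .dexta file layout, which is not described here; the base-to-code mapping, the padding of the final byte and the function names are assumptions.

```python
CODES = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"

def pack(seq):
    """Pack an ACGT string into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODES[base]
        byte <<= 2 * (4 - len(chunk))     # left-align a short final chunk
        out.append(byte)
    return bytes(out)

def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])

# Hypothetical sequence; its length must be stored alongside the packed bytes.
sequence = "ACGTAC"
packed = pack(sequence)
restored = unpack(packed, len(sequence))
print(len(packed), restored)
```

Because the last byte may contain padding, the original length has to be kept with the data for the round trip to be exact.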

For more information, please view the available documentation at https://github.com/thegenemyers/DEXTRACTOR

Registry entries: Bioconda 
diamond-aligner
accelerated BLAST compatible local sequence aligner
Versions of package diamond-aligner
ReleaseVersionArchitectures
bullseye2.0.7-1amd64,arm64,ppc64el,s390x
stretch-backports0.9.22+dfsg-2~bpo9+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie2.1.9-1amd64,arm64,ppc64el,riscv64,s390x
buster0.9.24+dfsg-1amd64
bookworm2.1.3-1amd64,arm64,ppc64el,s390x
sid2.1.9-1amd64,arm64,ppc64el,riscv64,s390x
upstream2.1.10
Popcon: 1 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

DIAMOND is a sequence aligner for protein and translated DNA searches and functions as a drop-in replacement for the NCBI BLAST software tools. It is suitable for protein-protein search as well as DNA-protein search on short reads and longer sequences including contigs and assemblies, providing a speedup over BLAST of up to 20,000x.

Please cite: Benjamin Buchfink, Chao Xie and Daniel H Huson: Fast and sensitive protein alignment using DIAMOND. (PubMed) Nature methods 12(1):59-60 (2015)
Registry entries: Bio.tools  SciCrunch  Bioconda 
discosnp
discovering Single Nucleotide Polymorphism from raw set(s) of reads
Versions of package discosnp
ReleaseVersionArchitectures
stretch1.2.6-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm2.6.2-2amd64,arm64,mips64el,ppc64el
trixie2.6.2-3amd64,arm64,mips64el,ppc64el,riscv64
sid2.6.2-3amd64,arm64,mips64el,ppc64el,riscv64
buster2.3.0-2amd64,arm64,i386
jessie1.2.5-1amd64,armel,armhf,i386
bullseye4.4.4-1amd64,arm64,i386,mips64el,ppc64el,s390x
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

discoSnp is designed for discovering Single Nucleotide Polymorphisms (SNPs) from raw set(s) of reads obtained with Next Generation Sequencers (NGS).

Note that the number of input read sets is not constrained: it can be one, two, or more. Note also that no other data, such as a reference genome or annotations, are needed.

The software is composed of two modules. The first module, kissnp2, detects SNPs from read sets. The second module, kissreads, enhances the kissnp2 results by computing, per read set and for each SNP found:

 1) its mean read coverage
 2) the (phred) quality of reads generating the polymorphism.
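
As background for the quality figure in item 2, a Phred score Q encodes a base-calling error probability p as Q = -10 log10(p). The sketch below decodes a FASTQ quality string and averages it. It assumes the common Phred+33 encoding, and the function names are illustrative; they are not part of kissreads.

```python
def phred_scores(quality, offset=33):
    """Decode a FASTQ quality string into Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q):
    """Convert a Phred score into a base-calling error probability."""
    return 10 ** (-q / 10)


def mean_quality(quality, offset=33):
    """Average Phred score over one read's quality string."""
    scores = phred_scores(quality, offset)
    return sum(scores) / len(scores)
```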

This program is superseded by DiscoSnp++.

Registry entries: Bio.tools  SciCrunch  Bioconda 
drop-seq-tools
analyzing Drop-seq data
Versions of package drop-seq-tools
ReleaseVersionArchitectures
bullseye2.4.0+dfsg-6all
sid3.0.2+dfsg-1all
bookworm2.5.2+dfsg-1all
trixie3.0.2+dfsg-1all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

This software provides core computational analysis of Drop-seq data, showing how to transform raw sequence data into an expression measurement for each gene in each individual cell.

Registry entries: Bioconda 
fasta3
tools for searching collections of biological sequences
Versions of package fasta3
ReleaseVersionArchitectures
bookworm36.3.8i.14-Nov-2020-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster36.3.8g-1 (non-free)amd64
bullseye36.3.8h.2020-02-11-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid36.3.8i.14-Nov-2020-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie36.3.8i.14-Nov-2020-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (6 upd.)*
Versions and Archs
License: DFSG free
Git

The FASTA programs find regions of local or global similarity between Protein or DNA sequences, either by searching Protein or DNA databases, or by identifying local duplications within a sequence. Other programs provide information on the statistical significance of an alignment. Like BLAST, FASTA can be used to infer functional and evolutionary relationships between sequences as well as help identify members of gene families.

  • Protein
      • Protein-protein FASTA
      • Protein-protein Smith-Waterman (ssearch)
      • Global Protein-protein (Needleman-Wunsch) (ggsearch)
      • Global/Local protein-protein (glsearch)
      • Protein-protein with unordered peptides (fasts)
      • Protein-protein with mixed peptide sequences (fastf)

  • Nucleotide
      • Nucleotide-Nucleotide (DNA/RNA fasta)
      • Ordered Nucleotides vs Nucleotide (fastm)
      • Un-ordered Nucleotides vs Nucleotide (fasts)

  • Translated
      • Translated DNA (with frameshifts, e.g. ESTs) vs Proteins (fastx/fasty)
      • Protein vs Translated DNA (with frameshifts) (tfastx/tfasty)
      • Peptides vs Translated DNA (tfasts)

  • Statistical Significance
      • Protein vs Protein shuffle (prss)
      • DNA vs DNA shuffle (prss)
      • Translated DNA vs Protein shuffle (prfx)

  • Local Duplications
      • Local Protein alignments (lalign)
      • Plot Protein alignment "dot-plot" (plalign)
      • Local DNA alignments (lalign)
      • Plot DNA alignment "dot-plot" (plalign)

This software is often used via a web service at the EBI with readily indexed reference databases at http://www.ebi.ac.uk/Tools/fasta/.

Please cite: William R. Pearson and D. J. Lipman: Improved tools for biological sequence comparison. (PubMed,eprint) Proc Natl Acad Sci U S A 85(8):2444-8 (1988)
Registry entries: Bioconda 
fastani
Fast alignment-free computation of whole-genome Average Nucleotide Identity
Versions of package fastani
ReleaseVersionArchitectures
sid1.33-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.33-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.33-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

ANI is defined as mean nucleotide identity of orthologous gene pairs shared between two microbial genomes. FastANI supports pairwise comparison of both complete and draft genome assemblies.
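
The definition above can be illustrated with a small sketch that averages the identity of already aligned gene pairs. This only demonstrates the definition: FastANI itself is alignment-free and estimates the value differently. The function names and the gap-free, equal-length input are assumptions made for the example.

```python
def pair_identity(a, b):
    """Percent identity of two equal-length aligned sequences (no gaps)."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def average_nucleotide_identity(pairs):
    """Mean identity over (seq_a, seq_b) orthologous gene pairs."""
    identities = [pair_identity(a, b) for a, b in pairs]
    return sum(identities) / len(identities)
```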

fastp
Ultra-fast all-in-one FASTQ preprocessor
Versions of package fastp
ReleaseVersionArchitectures
trixie0.23.4+dfsg-1amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x
buster0.19.6+dfsg-1amd64,arm64,armhf,i386
bullseye0.20.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.23.2+dfsg-2amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el,s390x
sid0.23.4+dfsg-1amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x
upstream0.24.0
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

All-in-one FASTQ preprocessor, fastp provides functions including quality profiling, adapter trimming, read filtering and base correction. It supports both single-end and paired-end short read data and also provides basic support for long-read data.

The package is enhanced by the following packages: multiqc
Please cite: Shifu Chen, Yanqing Zhou, Yaru Chen and Jia Gu: fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics 34(17):i884-i890 (2018)
Registry entries: Bioconda 
fastqc
quality control for high throughput sequence data
Versions of package fastqc
ReleaseVersionArchitectures
buster0.11.8+dfsg-2all
bookworm0.11.9+dfsg-6all
trixie0.12.1+dfsg-4all
sid0.12.1+dfsg-4all
jessie0.11.2+dfsg-3all
stretch0.11.5+dfsg-6all
bullseye0.11.9+dfsg-4all
Popcon: 19 users (7 upd.)*
Versions and Archs
License: DFSG free
Git

FastQC aims to provide a simple way to do some quality control checks on raw sequence data coming from high throughput sequencing pipelines. It provides a modular set of analyses which you can use to get a quick impression of whether your data has any problems of which you should be aware before doing any further analysis.

The main functions of FastQC are

  • Import of data from BAM, SAM or FastQ files (any variant)
  • Providing a quick overview to tell you in which areas there may be problems
  • Summary graphs and tables to quickly assess your data
  • Export of results to an HTML based permanent report
  • Offline operation to allow automated generation of reports without running the interactive application.
The package is enhanced by the following packages: multiqc
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequencing
filtlong
quality filtering tool for long reads of genome sequences
Versions of package filtlong
ReleaseVersionArchitectures
bullseye0.2.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.2.1-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.2.1-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.2.1-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Filtlong is a tool for filtering long reads by quality. It can take a set of long reads and produce a smaller, better subset. It uses both read length (longer is better) and read identity (higher is better) when choosing which reads pass the filter.

Registry entries: Bio.tools  Bioconda 
flash
Fast Length Adjustment of SHort reads
Versions of package flash
ReleaseVersionArchitectures
bullseye1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.2.11-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

FLASH (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments. FLASH is designed to merge pairs of reads when the original DNA fragments are shorter than twice the length of reads. The resulting longer reads can significantly improve genome assemblies. They can also improve transcriptome assembly when FLASH is used to merge RNA-seq data.
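
The merging idea can be sketched as follows: when a fragment is shorter than twice the read length, the two reads of a pair overlap, and the pair can be joined at that overlap. This is a deliberately simplified sketch that accepts only exact overlaps and prefers the longest one. It is not FLASH's actual scoring, and the function names and the `min_overlap` parameter are assumptions for illustration.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(read1, read2, min_overlap=10):
    """Merge a read pair whose ends overlap exactly; return None otherwise.

    read2 is given as sequenced (from the opposite strand), so it is
    reverse-complemented before the overlap search.
    """
    mate = reverse_complement(read2)
    # Try the longest possible overlap first.
    for size in range(min(len(read1), len(mate)), min_overlap - 1, -1):
        if read1[-size:] == mate[:size]:
            return read1 + mate[size:]
    return None
```

For example, two 16-base reads taken from the two ends of a 24-base fragment overlap by 8 bases, and merging them recovers the full fragment.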

The package is enhanced by the following packages: multiqc
Please cite: Tanja Magoč and Steven L Salzberg: FLASH: Fast Length Adjustment of Short Reads to Improve Genome Assemblies. (PubMed,eprint) Bioinformatics 27(21):2957-2963 (2011)
Registry entries: Bio.tools  Bioconda 
flye
de novo assembler for single molecule sequencing reads using repeat graphs
Versions of package flye
ReleaseVersionArchitectures
sid2.9.5+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
trixie2.9.5+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm2.9.1+dfsg-1amd64,arm64,mips64el,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Flye is a de novo assembler for single molecule sequencing reads, such as those produced by PacBio and Oxford Nanopore Technologies. It is designed for a wide range of datasets, from small bacterial projects to large mammalian-scale assemblies. The package represents a complete pipeline: it takes raw PacBio / ONT reads as input and outputs polished contigs. Flye also has a special mode for metagenome assembly.

Please cite: Mikhail Kolmogorov, Jeffrey Yuan, Yu Lin and Pavel A. Pevzner: Assembly of long, error-prone reads using repeat graphs. (PubMed) Nature Biotechnology 37(5):540–546 (2019)
Registry entries: Bio.tools  Bioconda 
freebayes
Bayesian haplotype-based polymorphism discovery and genotyping
Versions of package freebayes
ReleaseVersionArchitectures
sid1.3.7-1amd64,arm64,mips64el,ppc64el,riscv64
experimental1.3.7-1~expamd64,arm64,armel,armhf,i386,mips64el,ppc64el,s390x
bookworm1.3.6-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.3.5-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.2.0-2amd64
stretch-backports1.2.0-1~bpo9+1amd64
upstream1.3.8-pre3
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

FreeBayes is a Bayesian genetic variant detector designed to find small polymorphisms, specifically SNPs (single-nucleotide polymorphisms), indels (insertions and deletions), MNPs (multi-nucleotide polymorphisms), and complex events (composite insertion and substitution events) smaller than the length of a short-read sequencing alignment.

Please cite: Erik Garrison and Gabor Marth: Haplotype-based variant detection from short-read sequencing. (eprint) arXiv (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
genometools
versatile genome analysis toolkit
Versions of package genometools
ReleaseVersionArchitectures
bullseye-backports-sloppy1.6.5+ds-2~bpo11+1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie1.5.3-2amd64,armel,armhf,i386
stretch1.5.9+ds-4amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster1.5.10+ds-3amd64,arm64,armhf,i386
buster-backports1.6.1+ds-3~bpo10+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye1.6.1+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.6.2+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.6.5+ds-2.2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.6.5+ds-2.2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm-backports1.6.5+ds-2~bpo12+1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Debtags of package genometools:
biologynuceleic-acids
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
uitoolkitncurses
Popcon: 4 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

GenomeTools contains a collection of useful tools for biological sequence analysis and presentation combined into a single binary.

The toolkit contains binaries for sequence and annotation handling, sequence compression, index structure generation and access, annotation visualization, and much more.

Please cite: Gordon Gremme, Sascha Steinbiss and Stefan Kurtz: GenomeTools: a comprehensive software library for efficient processing of structured genome annotations. (PubMed) IEEE/ACM Transactions on Computational Biology and Bioinformatics 10(3):645-656 (2013)
Registry entries: Bio.tools 
gffread
GFF/GTF format conversions, region filtering, FASTA sequence extraction
Versions of package gffread
ReleaseVersionArchitectures
bullseye0.12.1-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.12.7-8amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.12.7-8amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.12.7-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

Gffread is a GFF/GTF parsing utility providing format conversions, region filtering, FASTA sequence extraction and more.

Registry entries: Bio.tools  Bioconda 
ginkgocadx
Medical Imaging Software and complete DICOM Viewer
Versions of package ginkgocadx
ReleaseVersionArchitectures
buster3.8.8-1amd64,i386
bullseye3.8.8-5amd64,i386
stretch3.8.4-1amd64,i386
jessie3.7.0.1465.37+dfsg-1amd64,armel,armhf,i386
Debtags of package ginkgocadx:
fieldmedicine, medicine:imaging
roleprogram
uitoolkitgtk, wxwidgets
useviewing
Popcon: 20 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Ginkgo CADx provides a complete DICOM viewer solution with advanced capabilities and support for extensions.

  • Easy and customizable interface through profiles.
  • Full featured DICOM image visualization.
  • Complete tool set (measure, markers, text, ...).
  • Multiple modalities support (Neurological, Radiological, Dermatological, Ophthalmological, Ultrasound, Endoscopy, ...)
  • Dicomization support from JPEG, PNG, GIF and TIFF.
  • Full EMH integration support: HL7 standard and IHE compliant workflows.
  • PACS Workstation (C-FIND, C-MOVE, C-STORE...)
  • Extensible through custom extensions.
  • Retinal image mosaic composition.
  • Automatic retinal analysis diagnostics.
  • Psoriasis automatic diagnostics.
gnumed-client
medical practice management - Client
Versions of package gnumed-client
ReleaseVersionArchitectures
buster1.7.5+dfsg-3all
jessie1.4.12+dfsg-1all
trixie1.8.19+dfsg-1all
bookworm1.8.9+dfsg-1all
bullseye1.8.5+dfsg-2all
stretch1.6.11+dfsg-3all
sid1.8.19+dfsg-1all
Debtags of package gnumed-client:
fieldmedicine
interfacex11
networkclient
roleprogram
scopeapplication
uitoolkitwxwidgets
useorganizing
works-withdb, people
x11application
Popcon: 4 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This is the GNUmed Electronic Medical Record. Its purpose is to enable doctors to keep a medically sound record on their patients' health. It does not currently provide functionality for stock keeping. Clinical features are well-tested by real doctors in the field.

While the GNUmed team has taken the utmost care to make sure the medical records are safe at all times, you still need to make sure you are taking appropriate steps to back up the medical data to a safe place at appropriate intervals. Do not forget to test your recovery procedures, too!

Protect your data! GNUmed itself comes without any warranty whatsoever. You have been warned.

This package contains the wxpython client.

The package is enhanced by the following packages: entangle gnumed-doc
Screenshots of package gnumed-client
gnumed-server
medical practice management - server
Versions of package gnumed-server
ReleaseVersionArchitectures
bookworm22.19-1all
bullseye22.15-1all
jessie19.12-1all
buster22.5-1all
sid22.28-1all
stretch21.11-1all
trixie22.28-1all
upstream22.29
Debtags of package gnumed-server:
fieldmedicine
roleprogram
Popcon: 11 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

This is the GNUmed Electronic Medical Record. Its purpose is to enable doctors to keep a medically sound record on their patients' health. It does not currently provide functionality for billing and stock keeping. Clinical features are well-tested by real doctors in the field.

While the GNUmed team has taken the utmost care to make sure the medical records are safe at all times, you still need to make sure you are taking appropriate steps to back up the medical data to a safe place at appropriate intervals. Do not forget to test your recovery procedures, too!

Protect your data! GNUmed itself comes without any warranty whatsoever. You have been warned.

This package contains the PostgreSQL server part.

Note: The package does currently NOT build the GNUmed database but just installs the needed SQL files. Please see README.Debian.

gromacs
Molecular dynamics simulator, with building and analysis tools
Versions of package gromacs
ReleaseVersionArchitectures
stretch2016.1-2amd64,arm64,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm2022.5-2amd64,arm64,mips64el,ppc64el,s390x
trixie2024.3-2amd64,arm64,mips64el,ppc64el,riscv64,s390x
sid2024.4-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
bullseye2020.6-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie5.0.2-1amd64,armel,armhf,i386
buster2019.1-1amd64,arm64,armhf,i386
Debtags of package gromacs:
fieldbiology, biology:structural, chemistry
interfacecommandline, x11
roleprogram
uitoolkitxlib
x11application
Popcon: 21 users (16 upd.)*
Versions and Archs
License: DFSG free
Git

GROMACS is a versatile package to perform molecular dynamics, i.e. simulate the Newtonian equations of motion for systems with hundreds to millions of particles.

It is primarily designed for biochemical molecules like proteins and lipids that have a lot of complicated bonded interactions, but since GROMACS is extremely fast at calculating the nonbonded interactions (that usually dominate simulations), many groups are also using it for research on non-biological systems, e.g. polymers.

This package contains variants both for execution on a single machine, and using the MPI interface across multiple machines.

Please cite: Berk Hess, Carsten Kutzner, David van der Spoel and Erik Lindahl: GROMACS 4: Algorithms for Highly Efficient, Load-Balanced, and Scalable Molecular Simulation. (eprint) J. Chem. Theory Comput. 4(3):435-447 (2008)
Registry entries: Bio.tools  SciCrunch  Bioconda 
gubbins
phylogenetic analysis of genome sequences
Versions of package gubbins
ReleaseVersionArchitectures
stretch2.2.0-1amd64,i386
sid3.3.5-1amd64,i386
trixie3.3.5-1amd64,i386
bookworm2.4.1-5amd64,i386
bullseye2.4.1-4amd64,i386
buster2.3.4-1amd64,i386
Popcon: 4 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Gubbins supports rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences.

Gubbins (Genealogies Unbiased By recomBinations In Nucleotide Sequences) is an algorithm that iteratively identifies loci containing elevated densities of base substitutions while concurrently constructing a phylogeny based on the putative point mutations outside of these regions. Simulations demonstrate the algorithm generates highly accurate reconstructions under realistic models of short-term bacterial evolution, and can be run in only a few hours on alignments of hundreds of bacterial genome sequences.

Please cite: Nicholas J. Croucher, Andrew J. Page, Thomas R. Connor, Aidan J. Delaney, Jacqueline A. Keane, Stephen D. Bentley, Julian Parkhill and Simon R. Harris: Rapid phylogenetic analysis of large samples of recombinant bacterial whole genome sequences using Gubbins. (PubMed,eprint) Nucleic Acids Research 43(3):e15 (2014)
Registry entries: Bioconda 
imagej
Image processing program with a focus on microscopy images
Versions of package imagej
ReleaseVersionArchitectures
jessie1.49i+dfsg-1all
bookworm1.53t-1all
stretch1.51i+dfsg-2all
bullseye1.53g-2all
sid1.54g-1all
trixie1.54g-1all
buster1.52j-1all
Debtags of package imagej:
roleprogram
useanalysing, editing, viewing
works-withimage, image:raster
works-with-formatgif, jpg, tiff
Popcon: 65 users (28 upd.)*
Versions and Archs
License: DFSG free
Git

It can display, edit, analyze, process, save and print 8-bit, 16-bit and 32-bit images. It can read many image formats including TIFF, GIF, JPEG, BMP, DICOM, FITS and "raw". It supports "stacks", a series of images that share a single window.

It can calculate area and pixel value statistics of user-defined selections. It can measure distances and angles. It can create density histograms and line profile plots. It supports standard image processing functions such as contrast manipulation, sharpening, smoothing, edge detection and median filtering.

Spatial calibration is available to provide real world dimensional measurements in units such as millimeters. Density or gray scale calibration is also available.

ImageJ is developed by Wayne Rasband (wayne@codon.nih.gov), who is at the Research Services Branch, National Institute of Mental Health, Bethesda, Maryland, USA.

Please cite: Caroline A Schneider, Wayne S Rasband and Kevin W Eliceiri: NIH Image to ImageJ: 25 years of image analysis. (PubMed,eprint) Nature methods 9:671-675 (2012)
Registry entries: SciCrunch 
Screenshots of package imagej
ivar
functions broadly useful for viral amplicon-based sequencing
Versions of package ivar
ReleaseVersionArchitectures
bookworm1.3.1+dfsg-7amd64,arm64,i386,mips64el,mipsel
sid1.4.3+dfsg-2amd64,arm64,i386,mips64el,riscv64
bullseye1.3+dfsg-1amd64,arm64,i386,mips64el,mipsel
trixie1.4.3+dfsg-2amd64,arm64,i386,mips64el,riscv64
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

iVar is a computational package that contains functions broadly useful for viral amplicon-based sequencing. Additional tools for metagenomic sequencing are actively being incorporated into iVar. While each of these functions can be accomplished using existing tools, iVar contains an intersection of functionality from multiple tools that are required to call iSNVs and consensus sequences from viral sequencing data across multiple replicates. iVar provides the following functions:

 1. trimming of primers and low-quality bases,
 2. consensus calling,
 3. variant calling - both iSNVs and insertions/deletions, and
 4. identifying mismatches to primer sequences and excluding the
    corresponding reads from alignment files.
The package is enhanced by the following packages: multiqc
Please cite: Nathan D. Grubaugh, Karthik Gangavarapu, Joshua Quick, Nathaniel L. Matteson, Jaqueline Goes De Jesus, Bradley J. Main, Amanda L. Tan, Lauren M. Paul, Doug E. Brackney, Saran Grewal, Nikos Gurfield, Koen K. A. Van Rompay, Sharon Isern, Scott F. Michael, Lark L. Coffey, Nicholas J. Loman and Kristian G. Andersen: An amplicon-based sequencing framework for accurately measuring intrahost virus diversity using PrimalSeq and iVar. (PubMed,eprint) Genome Biology 20(1):8 (2019)
Registry entries: Bioconda 
kalign
Global and progressive multiple sequence alignment
Versions of package kalign
ReleaseVersionArchitectures
trixie3.4.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie2.03+20110620-2amd64,armel,armhf,i386
stretch2.03+20110620-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.03+20110620-5amd64,arm64,armhf,i386
bullseye3.3-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm3.3.5-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid3.4.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package kalign:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
usecomparing
works-with-formatplaintext
Popcon: 12 users (9 upd.)*
Versions and Archs
License: DFSG free
Git

Kalign is a command-line tool for multiple alignment of biological sequences. It employs the Muth-Manber string-matching algorithm to improve both the accuracy and the speed of the alignment. It uses a global, progressive alignment approach, enriched by an approximate string-matching algorithm to calculate sequence distances and by the incorporation of local matches into the otherwise global alignment.

Please cite: Timo Lassmann: Kalign 3: multiple sequence alignment of large datasets. (eprint) Bioinformatics 36(6):1928-1929 (2020)
Registry entries: Bio.tools  SciCrunch 
kallisto
near-optimal RNA-Seq quantification
Versions of package kallisto
ReleaseVersionArchitectures
bookworm0.48.0+dfsg-3amd64,arm64,mips64el,ppc64el,s390x
trixie0.48.0+dfsg-4amd64,arm64,mips64el,ppc64el,riscv64,s390x
sid0.48.0+dfsg-4amd64,arm64,mips64el,ppc64el,riscv64,s390x
bullseye0.46.2+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream0.51.1
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

Kallisto is a program for quantifying abundances of transcripts from RNA-Seq data, or more generally of target sequences using high-throughput sequencing reads. It is based on the novel idea of pseudoalignment for rapidly determining the compatibility of reads with targets, without the need for alignment. On benchmarks with standard RNA-Seq data, kallisto can quantify 30 million human reads in less than 3 minutes on a Mac desktop computer using only the read sequences and a transcriptome index that itself takes less than 10 minutes to build. Pseudoalignment of reads preserves the key information needed for quantification, and kallisto is therefore not only fast, but also as accurate as existing quantification tools. In fact, because the pseudoalignment procedure is robust to errors in the reads, in many benchmarks kallisto significantly outperforms existing tools.
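
The idea of pseudoalignment, determining which targets a read is compatible with rather than where it aligns, can be illustrated with a toy sketch. It assumes a plain k-mer dictionary in place of kallisto's real index structure, and the function names are hypothetical.

```python
from collections import defaultdict


def build_index(transcripts, k):
    """Map every k-mer to the set of transcript names containing it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].add(name)
    return index


def pseudoalign(read, index, k):
    """Return the transcripts compatible with every k-mer of the read."""
    compatible = None
    for i in range(len(read) - k + 1):
        hits = index.get(read[i:i + k], set())
        compatible = hits if compatible is None else compatible & hits
        if not compatible:
            return set()
    # Copy so callers never receive a set owned by the index.
    return set(compatible) if compatible else set()
```

A read drawn from a region shared by two transcripts is reported as compatible with both, while a read covering a distinguishing region is assigned to one. No base-level alignment is computed in either case.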

The package is enhanced by the following packages: multiqc
Please cite: Nicolas L Bray, Harold Pimentel, Páll Melsted and Lior Pachter: Near-optimal probabilistic RNA-seq quantification. (PubMed) Nature Biotechnology 34(5):525–527 (2016)
Registry entries: Bio.tools  Bioconda 
kraken2
taxonomic classification system using exact k-mer matches
Versions of package kraken2
ReleaseVersionArchitectures
bookworm2.1.2-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.1.3-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye2.1.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.1.3-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Kraken 2 is the newest version of Kraken, a taxonomic classification system using exact k-mer matches to achieve high accuracy and fast classification speeds. This classifier matches each k-mer within a query sequence to the lowest common ancestor (LCA) of all genomes containing the given k-mer. The k-mer assignments inform the classification algorithm. [see: Kraken 1's Webpage for more details].
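
The mapping from k-mers to lowest common ancestors can be sketched as below. This is a toy illustration under stated assumptions: the taxonomy is a small child-to-parent dictionary, the function names are hypothetical, and the final step that turns k-mer assignments into a classification is not shown.

```python
from collections import Counter


def lineage(taxon, parent):
    """Return the path from a taxon up to the root (inclusive)."""
    path = [taxon]
    while taxon in parent:
        taxon = parent[taxon]
        path.append(taxon)
    return path


def lca(a, b, parent):
    """Lowest common ancestor of two taxa in a child -> parent tree."""
    ancestors = set(lineage(a, parent))
    for taxon in lineage(b, parent):
        if taxon in ancestors:
            return taxon
    raise ValueError("taxa do not share a root")


def build_kmer_lca(genomes, parent, k):
    """Map each k-mer to the LCA of all genomes that contain it."""
    table = {}
    for taxon, seq in genomes.items():
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            table[kmer] = lca(table[kmer], taxon, parent) if kmer in table else taxon
    return table


def kmer_hits(query, table, k):
    """Count the taxa assigned to the k-mers of a query sequence."""
    return Counter(table[query[i:i + k]]
                   for i in range(len(query) - k + 1)
                   if query[i:i + k] in table)
```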

Kraken 2 provides significant improvements to Kraken 1, with faster database build times, smaller database sizes, and faster classification speeds. These improvements were achieved by the following updates to the Kraken classification program:

 1. Storage of Minimizers: Instead of storing/querying entire k-mers,
    Kraken 2 stores minimizers (l-mers) of each k-mer. The length of
    each l-mer must be ≤ the k-mer length. Each k-mer is treated by
    Kraken 2 as if its LCA is the same as its minimizer's LCA.
 2. Introduction of Spaced Seeds: Kraken 2 also uses spaced seeds to
    store and query minimizers to improve classification accuracy.
 3. Database Structure: While Kraken 1 saved an indexed and sorted list
    of k-mer/LCA pairs, Kraken 2 uses a compact hash table. This hash
    table is a probabilistic data structure that allows for faster
    queries and lower memory requirements. However, this data structure
    does have a <1% chance of returning the incorrect LCA or returning
    an LCA for a non-inserted minimizer. Users can compensate for this
    possibility by using Kraken's confidence scoring thresholds.
 4. Protein Databases: Kraken 2 allows for databases built from amino
    acid sequences. When queried, Kraken 2 performs a six-frame
    translated search of the query sequences against the database.
 5. 16S Databases: Kraken 2 also provides support for databases not
    based on NCBI's taxonomy. Currently, these include the 16S
    databases: Greengenes, SILVA, and RDP.
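
The minimizer idea from item 1 can be sketched in a few lines. The sketch assumes plain lexicographic ordering of l-mers, which is a simplification and not necessarily the ordering Kraken 2 uses. The function names are illustrative.

```python
def minimizer(kmer, l):
    """Return the lexicographically smallest l-mer inside a k-mer."""
    if not 0 < l <= len(kmer):
        raise ValueError("l must satisfy 0 < l <= k")
    return min(kmer[i:i + l] for i in range(len(kmer) - l + 1))


def minimizers(seq, k, l):
    """Return the distinct minimizers of all k-mers of a sequence."""
    return {minimizer(seq[i:i + k], l) for i in range(len(seq) - k + 1)}
```

Adjacent k-mers often share the same minimizer, so fewer distinct keys need to be stored than there are k-mers.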
Please cite: Derrick E Wood and Steven L Salzberg: Kraken: ultrafast metagenomic sequence classification using exact alignments. (PubMed,eprint) Genome Biol. 15(3):R46 (2014)
Registry entries: Bio.tools  Bioconda 
lastz
pairwise aligning DNA sequences
Versions of package lastz
ReleaseVersionArchitectures
bullseye1.04.03-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.04.22-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.04.22-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.04.22-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

LASTZ is a drop-in replacement for BLASTZ, and is backward compatible with BLASTZ’s command-line syntax. That is, it supports all of BLASTZ’s options but also has additional ones, and may produce slightly different alignment results.

Registry entries: Bioconda 
libbbhash-dev
bloom-filter based minimal perfect hash function library
Versions of package libbbhash-dev
ReleaseVersionArchitectures
trixie1.0.0-6amd64,arm64,mips64el,ppc64el,riscv64,s390x
bullseye1.0.0-3amd64,arm64,mips64el,ppc64el,s390x
sid1.0.0-6amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm1.0.0-5amd64,arm64,mips64el,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

BBHash is a simple library for building minimal perfect hash functions. It is designed to handle large-scale datasets. The function is just a little bit larger than those of other state-of-the-art libraries: it takes approximately 3 bits/element (compared to 2.62 bits/element for the emphf library), but construction is faster and does not require additional memory.

Please cite: Antoine Limasset, Guillaume Rizk, Rayan Chikhi and Pierre Peterlongo: Fast and scalable minimal perfect hashing for massive key sets. HAL-Inria (2017)
libchipcard-dev
API for smartcard readers
Maintainer: Micha Lenk
Versions of package libchipcard-dev
ReleaseVersionArchitectures
trixie5.1.6-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
experimental5.99.1beta-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie5.0.3beta-5amd64,armel,armhf,i386
stretch5.0.4-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster5.1.0beta-3amd64,arm64,armhf,i386
buster-backports5.1.5rc2-7~bpo10+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye5.1.5rc2-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm5.1.6-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid5.1.6-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream5.99.1beta
Debtags of package libchipcard-dev:
devellang:c, library
roledevel-lib
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

libchipcard provides an API for accessing smartcards. Examples are memory cards, as well as HBCI (home banking), German GeldKarte (electronic small change), and KVK (health insurance) cards.

This package contains the development files for libchipcard.

libgclib-dev
header files for Genome Code Lib (GCLib)
Versions of package libgclib-dev
ReleaseVersionArchitectures
bookworm0.12.7+ds-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye0.11.10+ds-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.12.7+ds-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.12.7+ds-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This is an eclectic gathering of (mostly) C++ code which upstream used for some bioinformatics projects. The main idea is to provide lean code and efficient data structures, trying to avoid too many code dependencies of heavy libraries while minimizing production cycles (and this also implies a decent compile/build time -- looking at you, bloated configure scripts and lengthy compile times of Boost code or other heavy C++ template code..).

This code was gathered even before the C++ STL had been fully adopted as a cross-platform "standard". Since the STL by itself is heavier than most C++ needs require, simpler and leaner C++ classes or templates are preferred for basic strings, containers, basic algorithms, etc.

Header files of Genome Code Lib. It is mainly known for being used by StringTie but with its own release cycle.

libgdcm-tools
Grassroots DICOM tools and utilities
Versions of package libgdcm-tools
ReleaseVersionArchitectures
stretch2.6.6-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie2.4.4-3+deb8u1amd64,armel,armhf,i386
buster2.8.8-9amd64,arm64,armhf,i386
bullseye3.0.8-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye-backports3.0.17-4~bpo11+1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm3.0.21-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie3.0.24-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid3.0.24-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package libgdcm-tools:
fieldmedicine:imaging
interfacecommandline
roleprogram
scopeutility
useconverting
works-withimage, image:raster
Popcon: 23 users (10 upd.)*
Versions and Archs
License: DFSG free
Git

Grassroots DiCoM is a C++ library for DICOM medical files. It is automatically wrapped to Python/C#/Java (using swig). It supports RAW, JPEG (lossy/lossless), J2K, JPEG-LS, RLE and deflated.

Install this package for the gdcmanon, gdcmclean, gdcmconv, gdcmdiff, gdcmdump, gdcmpap3, gdcmgendir, gdcmimg, gdcminfo, gdcmpdf, gdcmraw, gdcmscanner, gdcmscu, gdcmtar, gdcmxml programs.

Please cite: David Rodríguez González, Trevor Carpenter, Jano I. van Hemert and Joanna Wardlaw: An open source toolkit for medical imaging de-identification. (PubMed,eprint) European Radiology 20(8):1896-1904 (2010)
libhtscodecs-dev
Development headers for custom compression for CRAM and others
Versions of package libhtscodecs-dev
ReleaseVersionArchitectures
sid1.6.1-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.5-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.3.0-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.6.1-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This library implements the custom CRAM codecs used for "EXTERNAL" block types. These consist of two variants of the rANS codec (8-bit and 16-bit renormalisation, with run-length encoding and bit-packing also supported in the latter), a dynamic arithmetic coder, and custom codecs for name/ID compression and quality score compression derived from fqzcomp.

They come with small command line test tools to act as both compression exploration programs and as part of the test harness.

This package contains the development headers.

libics-dev
Image Cytometry Standard file reading and writing (devel)
Versions of package libics-dev
ReleaseVersionArchitectures
bullseye1.6.4-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.6.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch1.5.2-6amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm1.6.6-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie1.5.2-6amd64,armel,armhf,i386
buster1.6.2-2amd64,arm64,armhf,i386
sid1.6.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Debtags of package libics-dev:
devellibrary
roledevel-lib
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

This is the reference library for ICS (Image Cytometry Standard), an open standard for writing images of any dimensionality and data type to file, together with associated information regarding the recording equipment or recorded subject.

This package contains the libraries needed to build ICS applications.

libmaus2-dev
collection of data structures and algorithms for biobambam (devel)
Versions of package libmaus2-dev
ReleaseVersionArchitectures
sid2.0.813+ds-3amd64,i386,mips64el,ppc64el,riscv64
bookworm2.0.813+ds-1amd64,arm64,i386,ppc64el
bullseye2.0.768+dfsg-2amd64,arm64,i386,ppc64el
trixie2.0.813+ds-3amd64,i386,mips64el,ppc64el,riscv64
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Libmaus2 is a collection of data structures and algorithms. It contains

  • I/O classes (single byte and UTF-8)
  • bitio classes (input, output and various forms of bit level manipulation)
  • text indexing classes (suffix and LCP arrays, full-text minute-space (FM) index, ...)
  • BAM sequence alignment files input/output (simple and collating)

and many lower level support classes.

This package contains header files and static libraries.

Registry entries: Bio.tools  Bioconda 
libmilib-java
library for Next Generation Sequencing (NGS) data processing
Versions of package libmilib-java
ReleaseVersionArchitectures
bookworm2.2.0+dfsg-1all
bullseye1.13-1all
sid2.2.0+dfsg-1all
buster1.10-2all
trixie2.2.0+dfsg-1all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

A helper Java library adopted by a range of open-source tools for the analysis of B and T cell repertoires.

libseqan3-dev
C++ library for the analysis of biological sequences v3 (development)
Versions of package libseqan3-dev
ReleaseVersionArchitectures
experimental3.4.0~rc.1+ds-1~0exp0all
buster-backports3.0.1+ds-3~bpo10+1amd64,arm64,mips64el,ppc64el,s390x
bullseye3.0.2+ds-9all
trixie3.3.0+ds-2all
sid3.3.0+ds-2all
bookworm3.2.0+ds-6all
upstream3.4.0~rc.1
Popcon: 0 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

SeqAn is a C++ template library of efficient algorithms and data structures for the analysis of sequences with the focus on biological data. This library applies a unique generic design that guarantees high performance, generality, extensibility, and integration with other libraries. SeqAn is easy to use and simplifies the development of new software tools with a minimal loss of performance.

This package contains the developer files.

Please cite: Andreas Doring, David Weese, Tobias Rausch and Knut Reinert: SeqAn An efficient, generic C++ library for sequence analysis. (PubMed,eprint) BMC Bioinformatics 9(1):11 (2008)
Registry entries: Bio.tools  Bioconda 
lighter
fast and memory-efficient sequencing error corrector
Versions of package lighter
ReleaseVersionArchitectures
bookworm1.1.2-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.1.3-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.1.2-7amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.1.2-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Lighter is a fast, memory-efficient tool for correcting sequencing errors. Lighter avoids counting k-mers. Instead, it uses a pair of Bloom filters, one holding a sample of the input k-mers and the other holding k-mers likely to be correct. As long as the sampling fraction is adjusted in inverse proportion to the depth of sequencing, Bloom filter size can be held constant while maintaining near-constant accuracy. Lighter is parallelized, uses no secondary storage, and is both faster and more memory-efficient than competing approaches while achieving comparable accuracy.
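
The Bloom filter idea can be sketched in a few lines of Python. The example below is not Lighter's implementation: it shows only a plain Bloom filter and the sampling of k-mers into it with a probability that shrinks as sequencing depth grows. The class and function names, and the constant `target`, are assumptions chosen for illustration.

```python
import hashlib
import random


class BloomFilter:
    """Fixed-size Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits: int, num_hashes: int):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item: str):
        # Derive num_hashes bit positions by prefixing the item with an index.
        for i in range(self.num_hashes):
            digest = hashlib.blake2b(f"{i}:{item}".encode(), digest_size=8).digest()
            yield int.from_bytes(digest, "big") % self.num_bits

    def add(self, item: str) -> None:
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item: str) -> bool:
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


def kmers(read: str, k: int):
    """Yield every k-mer of a read, left to right."""
    return (read[i:i + k] for i in range(len(read) - k + 1))


def sample_kmers(reads, k, depth, bloom, target=7.0, seed=0):
    """Insert each k-mer with probability alpha = target / depth.

    `target` is a hypothetical constant; the point is only that alpha is
    inversely proportional to the sequencing depth, so the expected number
    of sampled k-mers (and hence the filter size) stays roughly constant.
    """
    rng = random.Random(seed)
    alpha = min(1.0, target / depth)
    for read in reads:
        for kmer in kmers(read, k):
            if rng.random() < alpha:
                bloom.add(kmer)
    return alpha
```

Doubling the depth halves alpha, which is why the same filter size can serve datasets of very different coverage.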

lumpy-sv
general probabilistic framework for structural variant discovery
Versions of package lumpy-sv
ReleaseVersionArchitectures
sid0.3.1+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm0.3.1+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bullseye0.3.1+dfsg-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie0.3.1+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

LUMPY is a novel structural variant (SV) discovery framework that naturally integrates multiple SV signals jointly across multiple samples. LUMPY yields improved sensitivity, especially when the SV signal is reduced owing to either low-coverage data or low intra-sample variant allele frequency.

The package is enhanced by the following packages: lumpy-sv-examples
mecat2
ultra-fast and accurate de novo assembly tools for SMRT reads
Versions of package mecat2
ReleaseVersionArchitectures
trixie0.0+git20200428.f54c542+ds-4amd64
bullseye0.0+git20200428.f54c542+ds-3amd64
bookworm0.0+git20200428.f54c542+ds-3amd64
sid0.0+git20200428.f54c542+ds-4amd64
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

An improved version of MECAT. It is an ultra-fast and accurate mapping, error correction and de novo assembly tool for single-molecule sequencing (SMRT) reads. MECAT2 consists of the following three modules:

 1. mecat2map: a fast and accurate alignment tool for SMRT reads.
 2. mecat2cns: correct noisy reads based on their pairwise overlaps.
 3. fsa: a string graph based assembly tool.
Please cite: Chuan-Le Xiao, Ying Chen, Shang-Qian Xie, Kai-Ning Chen, Yan Wang, Yue Han, Feng Luo and Zhi Xie: MECAT: fast mapping, error correction, and de novo assembly for single-molecule sequencing reads. Nature Methods 14(11):1078 (2017)
megahit
ultra-fast and memory-efficient meta-genome assembler
Versions of package megahit
ReleaseVersionArchitectures
sid1.2.9-5amd64
bullseye1.2.9-2amd64
trixie1.2.9-5amd64
bookworm1.2.9-4amd64
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Megahit is an ultra-fast and memory-efficient NGS assembler. It is optimized for metagenomes, but also works well on generic single genome assembly (small or mammalian size) and single-cell assembly.

The software was praised in a Briefings in Bioinformatics 5/2020 review (DOI: 10.1093/bib/bbaa085).

Please cite: Dinghua Li, Chi-Man Liu, Ruibang Luo, Kunihiko Sadakane and Tak-Wah Lam: MEGAHIT: an ultra-fast single-node solution for large and complex metagenomics assembly via succinct de Bruijn graph. (PubMed) 31:1674-1676 (2015)
Registry entries: Bio.tools  Bioconda 
metabat
robust statistical framework for reconstructing genomes from metagenomic data
Versions of package metabat
ReleaseVersionArchitectures
bullseye2.15-3amd64,i386
bookworm2.15-4amd64,i386
trixie2.15-4amd64,i386
sid2.15-4amd64,i386
upstream2.17
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

MetaBAT integrates empirical probabilistic distances of genome abundance and tetranucleotide frequency for accurate metagenome binning. MetaBAT outperforms alternative methods in accuracy and computational efficiency on both synthetic and real metagenome datasets. It automatically forms hundreds of high-quality genome bins on a very large assembly consisting of millions of contigs in a matter of hours on a single node.
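
To make the notion of tetranucleotide frequency concrete, here is a small Python sketch that turns a contig into a normalised vector of 4-mer frequencies, merging each 4-mer with its reverse complement. It illustrates only the feature itself; MetaBAT's probabilistic distance model and its abundance component are not reproduced, and the function names are assumptions made for this example.

```python
from collections import Counter
from itertools import product

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


# All canonical tetramers: 256 4-mers collapse to 136 strand-independent classes.
CANONICAL_TETRAMERS = sorted({canonical("".join(p)) for p in product("ACGT", repeat=4)})


def tnf(contig: str):
    """Tetranucleotide frequency vector of a contig (entries sum to 1)."""
    seq = contig.upper()
    counts = Counter()
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if set(kmer) <= set("ACGT"):  # skip windows containing N or other symbols
            counts[canonical(kmer)] += 1
    total = sum(counts.values())
    return [counts[t] / total if total else 0.0 for t in CANONICAL_TETRAMERS]
```

Contigs from the same genome tend to have similar vectors, which is what makes this feature useful for binning.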

Please cite: Dongwan D. Kang, Jeff Froula, Rob Egan and Zhong Wang: MetaBAT, an efficient tool for accurately reconstructing single genomes from complex microbial communities. (PubMed) PeerJ 3:e1165 (2015)
Registry entries: Bioconda 
minia
short-read biological sequence assembler
Versions of package minia
ReleaseVersionArchitectures
bullseye3.2.1+git20200522.4960a99-1amd64,arm64,i386,mips64el,ppc64el,s390x
buster1.6906-2amd64,arm64,armhf,i386
stretch1.6906-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie1.6088-1amd64,armel,armhf,i386
sid3.2.6-4amd64,arm64,mips64el,ppc64el,riscv64
trixie3.2.6-4amd64,arm64,mips64el,ppc64el,riscv64
bookworm3.2.6-3amd64,arm64,mips64el,ppc64el
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

What was referred to as "next-generation" DNA sequencing up to the year 2020 delivered only "short" reads of up to ~600 base pairs in length, which then had to be pieced together via random overlaps in their sequence to obtain a complete genome. This process is genome assembly. Many biological pitfalls, such as long stretches of low-complexity regions, copy number variations and other sorts of redundancy, render it difficult.

This package provides a short-read DNA sequence assembler based on a de Bruijn graph, capable of assembling a human genome on a desktop computer in a day.

The output of Minia is a set of contigs, i.e. stretches of gap-free linear overlaps of short reads. In the best possible case this is a whole chromosome.

Minia produces results of similar contiguity and accuracy to other de Bruijn assemblers (e.g. Velvet).
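
A toy version of de Bruijn graph assembly can be written in Python as follows. This is a sketch under strong simplifying assumptions: it keeps the k-mers in an ordinary Python set rather than in the Bloom-filter-based structure Minia is cited for, and it ignores reverse complements, sequencing errors and coverage. All function names are invented for the example.

```python
def build_kmers(reads, k):
    """Collect the set of all k-mers occurring in the reads (the graph nodes)."""
    return {r[i:i + k] for r in reads for i in range(len(r) - k + 1)}


def successors(kmer, kmers):
    """Nodes reachable by shifting one base to the right."""
    return [kmer[1:] + b for b in "ACGT" if kmer[1:] + b in kmers]


def predecessors(kmer, kmers):
    """Nodes that reach `kmer` by shifting one base to the right."""
    return [b + kmer[:-1] for b in "ACGT" if b + kmer[:-1] in kmers]


def contigs(reads, k):
    """Spell out maximal unambiguous (non-branching) paths as contigs."""
    kmers = build_kmers(reads, k)
    used = set()
    result = []
    for start in sorted(kmers):
        if start in used:
            continue
        # Walk left to the first node of the non-branching path.
        node = start
        seen = {node}
        while True:
            preds = predecessors(node, kmers)
            if len(preds) != 1 or len(successors(preds[0], kmers)) != 1 or preds[0] in seen:
                break
            node = preds[0]
            seen.add(node)
        # Walk right, appending one base per step, until a branch or dead end.
        contig = node
        used.add(node)
        while True:
            succs = successors(node, kmers)
            if len(succs) != 1 or len(predecessors(succs[0], kmers)) != 1 or succs[0] in used:
                break
            node = succs[0]
            used.add(node)
            contig += node[-1]
        result.append(contig)
    return result
```

With the two overlapping reads ACGTTGCA and GTTGCATT and k=4, the graph is a single path and the sketch returns one contig spanning both reads.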

Please cite: Rayan Chikhi and Guillaume Rizk: Space-Efficient and Exact de Bruijn Graph Representation Based on a Bloom Filter. (PubMed,eprint) Algorithms for Molecular Biology 8(1):22 (2013)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence assembly
minimap2
versatile pairwise aligner for genomic and spliced nucleotide sequences
Versions of package minimap2
ReleaseVersionArchitectures
bullseye2.17+dfsg-12amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2.24+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.27+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.27+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch-backports2.15+dfsg-1~bpo9+1amd64,i386
buster2.15+dfsg-1amd64,i386
upstream2.28
Popcon: 8 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

Minimap2 is a versatile sequence alignment program that aligns DNA or mRNA sequences against a large reference database. Typical use cases include: (1) mapping PacBio or Oxford Nanopore genomic reads to the human genome; (2) finding overlaps between long reads with error rate up to ~15%; (3) splice-aware alignment of PacBio Iso-Seq or Nanopore cDNA or Direct RNA reads against a reference genome; (4) aligning Illumina single- or paired-end reads; (5) assembly-to-assembly alignment; (6) full-genome alignment between two closely related species with divergence below ~15%.

For ~10kb noisy read sequences, minimap2 is tens of times faster than mainstream long-read mappers such as BLASR, BWA-MEM, NGMLR and GMAP. It is more accurate on simulated long reads and produces biologically meaningful alignments ready for downstream analyses. For >100bp Illumina short reads, minimap2 is three times as fast as BWA-MEM and Bowtie2, and as accurate on simulated data. Detailed evaluations are available from the minimap2 paper or the preprint.

Please cite: Heng Li: Minimap2: pairwise alignment for nucleotide sequences. (PubMed,eprint) Bioinformatics :2103-2110 (2018)
Registry entries: Bioconda 
mmb
model the structure and dynamics of macromolecules
Versions of package mmb
ReleaseVersionArchitectures
experimental4.0.0+dfsg-3.1~exp1amd64,arm64,armhf
bookworm4.0.0+dfsg-2amd64,arm64
trixie4.0.0+dfsg-4amd64,arm64
sid4.0.0+dfsg-4amd64,arm64
bullseye3.2+dfsg-2+deb11u1amd64,arm64,ppc64el
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

MacroMoleculeBuilder, previously known as RNABuilder, can be used for morphing, homology modeling, folding (e.g. using base pairing contacts), redesigning complexes, fitting to low-resolution density maps, predicting local rearrangements upon mutation, and many other applications.

mmseqs2
ultra fast and sensitive protein search and clustering
Versions of package mmseqs2
ReleaseVersionArchitectures
bullseye12-113e3+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bookworm14-7e284+ds-1amd64,arm64,mips64el,ppc64el
trixie15-6f452+ds-2amd64,arm64,mips64el,ppc64el,riscv64
sid15-6f452+ds-2amd64,arm64,mips64el,ppc64el,riscv64
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

MMseqs2 (Many-against-Many sequence searching) is a software suite to search and cluster huge protein/nucleotide sequence sets. MMseqs2 is open source GPL-licensed software implemented in C++ for Linux, MacOS, and (as beta version, via cygwin) Windows. The software is designed to run on multiple cores and servers and exhibits very good scalability. MMseqs2 can run 10000 times faster than BLAST. At 100 times its speed it achieves almost the same sensitivity. It can perform profile searches with the same sensitivity as PSI-BLAST at over 400 times its speed.

Please cite: Martin Steinegger and Johannes Söding: Clustering huge protein sequence sets in linear time. Nature Communications 9(1) (2018)
Registry entries: Bio.tools  Bioconda 
multiqc
output integration for RNA sequencing across tools and samples
Versions of package multiqc
ReleaseVersionArchitectures
bullseye1.9+dfsg-3all
trixie1.21+dfsg-2all
sid1.21+dfsg-2all
bookworm1.14+dfsg-1all
upstream1.25.1
Popcon: 97 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

The sequencing of DNA or RNA with current high-throughput technologies involves an array of tools, and these are applied over a range of samples. It is easy to lose oversight. And gathering the data and forwarding them in a readable manner to the individuals who took the samples is a challenge for a tool in itself. Well. Here it is. MultiQC aggregates the output of multiple tools into a single report.

Reports are generated by scanning given directories for recognised log files. These are parsed and a single HTML report is generated summarising the statistics for all logs found. MultiQC reports can describe multiple analysis steps and large numbers of samples within a single plot, and can cover multiple analysis tools, making them ideal for routine fast quality control.

Please cite: Philip Ewels, Måns Magnusson, Sverker Lundin and Max Käller: MultiQC: summarize analysis results for multiple tools and samples in a single report. (PubMed,eprint) Bioinformatics 31(19):3047-8 (2016)
Registry entries: Bio.tools  SciCrunch  Bioconda 
muscle
Multiple alignment program of protein sequences
Versions of package muscle
ReleaseVersionArchitectures
sid5.1.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm5.1.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye3.8.1551-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster3.8.1551-2amd64,arm64,armhf,i386
stretch3.8.31+dfsg-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie3.8.31-1amd64,armel,armhf,i386
trixie5.1.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream5.3
Debtags of package muscle:
biologyformat:aln, nuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
usecomparing
works-with-formatplaintext
Popcon: 17 users (12 upd.)*
Newer upstream!
License: DFSG free
Git

MUSCLE is a multiple alignment program for protein sequences. MUSCLE stands for multiple sequence comparison by log-expectation. In the authors' tests, MUSCLE achieved the highest scores of all tested programs on several alignment accuracy benchmarks, and is also one of the fastest programs out there.

Muscle v5 is a major re-write of MUSCLE based on new algorithms.

Users should be aware that command line arguments compared to version 3.x of MUSCLE have changed!

Highest accuracy, scalable to thousands of sequences

Compared to previous versions, Muscle v5 is much more accurate, is often faster, and scales to much larger datasets. At the time of writing (late 2021), Muscle v5 has the highest scores on multiple alignment benchmarks including Balibase, Bralibase, Prefab and Balifam. It can align tens of thousands of sequences with high accuracy on a low-cost commodity computer (say, an 8-core Intel CPU with 32 GB of RAM). On large datasets, Muscle v5 is 20-30% more accurate than MAFFT and Clustal-Omega.

Alignment ensembles

Muscle v5 can generate ensembles of high-accuracy alternative alignments. All replicates have equal average accuracy on benchmark tests, including the MSA made with default parameters. By comparing results of downstream analysis (trees, structure prediction...) on different replicates, you can assess the effects of alignment errors on your study.

Please cite: Robert C. Edgar: MUSCLE: multiple sequence alignment with high accuracy and high throughput. (PubMed,eprint) Nucleic Acids Research 32(5):1792-1797 (2004)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence analysis
muscle3
multiple alignment program of protein sequences
Versions of package muscle3
ReleaseVersionArchitectures
trixie3.8.1551-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid3.8.1551-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm3.8.1551-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

MUSCLE is a multiple alignment program for protein sequences. MUSCLE stands for multiple sequence comparison by log-expectation. In the authors' tests, MUSCLE achieved the highest scores of all tested programs on several alignment accuracy benchmarks, and is also one of the fastest programs out there.

This is version 3 of the muscle program. It is a different program than muscle version 5 which is packaged as muscle in Debian.

Please cite: Robert C. Edgar: MUSCLE: multiple sequence alignment with high accuracy and high throughput. (PubMed,eprint) Nucleic Acids Research 32(5):1792-1797 (2004)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence analysis
nanofilt
filtering and trimming of long read sequencing data
Versions of package nanofilt
ReleaseVersionArchitectures
bullseye2.6.0-3all
sid2.8.0-1all
bookworm2.8.0-1all
trixie2.8.0-1all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Filtering and trimming of long read sequencing data. Filtering on quality and/or read length, and optional trimming after passing filters. Reads from stdin, writes to stdout. Optionally reads directly from an uncompressed file specified on the command line.

Intended to be used:

 1. directly after fastq extraction.
 2. prior to mapping.
 3. in a stream between extraction and mapping.
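
The filter-then-trim behaviour described above can be sketched in Python as a generator over FASTQ lines. This is a simplified illustration, not NanoFilt's code: the parameter names min_quality, min_length and headcrop belong to the sketch, Phred+33 quality encoding is assumed, and the mean quality is computed by averaging error probabilities.

```python
import math


def mean_quality(qual: str, offset: int = 33) -> float:
    """Mean read quality, averaged over error probabilities (Phred+33 assumed)."""
    if not qual:
        return 0.0
    probs = [10 ** (-(ord(c) - offset) / 10) for c in qual]
    return -10 * math.log10(sum(probs) / len(probs))


def filter_fastq(lines, min_quality=0.0, min_length=0, headcrop=0):
    """Yield the lines of records that pass the filters, trimmed afterwards.

    `lines` is any iterable of text lines holding 4-line FASTQ records.
    Filtering looks at the untrimmed read; `headcrop` bases are removed
    from the start of reads that passed.
    """
    for header, seq, plus, qual in zip(*[iter(lines)] * 4):
        seq = seq.rstrip("\n")
        qual = qual.rstrip("\n")
        if len(seq) < min_length or mean_quality(qual) < min_quality:
            continue
        yield header.rstrip("\n") + "\n"
        yield seq[headcrop:] + "\n"
        yield plus.rstrip("\n") + "\n"
        yield qual[headcrop:] + "\n"
```

Used in a stream, the generator would be fed sys.stdin and its output passed to sys.stdout.writelines(), mirroring the stdin-to-stdout design described above.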
Please cite: Wouter De Coster, Svenn D'Hert, Darrin T. Schultz and Christine Van Broeckhoven: NanoPack: visualizing and processing long-read sequencing data. Bioinformatics 34 (2018)
Registry entries: Bioconda 
nanolyse
remove lambda phage reads from a fastq file
Versions of package nanolyse
ReleaseVersionArchitectures
bullseye1.2.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.2.0-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.2.0-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.2.0-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

NanoLyse is a tool for rapid removal of contaminant DNA, using the Minimap2 aligner through the mappy Python binding. A typical application would be the removal of the lambda phage control DNA fragment supplied by ONT, for which the reference sequence is included in the package. However, this approach may lead to unwanted loss of reads from regions highly homologous to the lambda phage genome.

Please cite: Wouter De Coster, Svenn D’Hert, Darrin T Schultz, Marc Cruts and Christine Van Broeckhoven: NanoPack: visualizing and processing long-read sequencing data. (PubMed,eprint) Bioinformatics 34(15):2666-2669 (2018)
Registry entries: Bioconda 
nanook
pre- and post-alignment analysis of nanopore sequencing data
Versions of package nanook
ReleaseVersionArchitectures
stretch-backports1.33+dfsg-1~bpo9+1all
buster1.33+dfsg-1all
bullseye1.33+dfsg-2.1all
bookworm1.33+dfsg-5all
trixie1.33+dfsg-5all
sid1.33+dfsg-5all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

NanoOK is a flexible, multi-reference software for pre- and post-alignment analysis of nanopore sequencing data, quality and error profiles.

NanoOK (pronounced na-nook) is a tool for extraction, alignment and analysis of Nanopore reads. NanoOK will extract reads as FASTA or FASTQ files, align them (with a choice of alignment tools), then generate a comprehensive multi-page PDF report containing yield, accuracy and quality analysis. Along the way, it generates plain text files which can be used for further analysis, as well as graphs suitable for inclusion in presentations and papers.

The package is enhanced by the following packages: nanook-examples
Please cite: Richard M. Leggett, Darren Heavens, Mario Caccamo, Matthew D. Clark and Robert P. Davey: NanoOK: multi-reference alignment analysis of nanopore sequencing data, quality and error profiles. (PubMed,eprint) Bioinformatics 32(1):142-144 (2016)
Registry entries: Bio.tools 
nanopolish
consensus caller for nanopore sequencing data
Versions of package nanopolish
ReleaseVersionArchitectures
stretch0.5.0-1amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
stretch-backports0.10.2-1~bpo9+1amd64
buster0.11.0-2amd64
bullseye0.13.2-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.14.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie0.14.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
sid0.14.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Nanopolish uses a signal-level hidden Markov model for consensus calling of nanopore genome sequencing data. It can perform signal-level analysis of Oxford Nanopore sequencing data. Nanopolish can calculate an improved consensus sequence for a draft genome assembly, detect base modifications, call SNPs and indels with respect to a reference genome and more.

Registry entries: Bio.tools  SciCrunch  Bioconda 
nanosv
structural variant caller for nanopore data
Versions of package nanosv
ReleaseVersionArchitectures
trixie1.2.4+git20190409.c1ae30c-6all
bookworm1.2.4+git20190409.c1ae30c-6all
sid1.2.4+git20190409.c1ae30c-6all
bullseye1.2.4+git20190409.c1ae30c-3all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

NanoSV is a software package that can be used to identify structural genomic variations in long-read sequencing data, such as data produced by Oxford Nanopore Technologies’ MinION, GridION or PromethION instruments, or Pacific Biosciences RSII or Sequel sequencers. NanoSV has been extensively tested using Oxford Nanopore MinION sequencing data.

Please cite: Mircea Cretu Stancu, Markus J. van Roosmalen, Ivo Renkens, Marleen M. Nieboer, Sjors Middelkamp, Joep de Ligt, Giulia Pregno, Daniela Giachino, Giorgia Mandrile, Jose Espejo Valle-Inclan, Jerome Korzelius, Ewart de Bruijn, Edwin Cuppen, Michael E. Talkowski, Tobias Marschall, Jeroen de Ridder and Wigard P. Kloosterman: Mapping and phasing of structural variation in patient genomes using nanopore sequencing. (eprint) Nature Communications 8:1326 (2017)
Registry entries: Bioconda 
ncbi-blast+
next generation suite of BLAST sequence search tools
Versions of package ncbi-blast+
ReleaseVersionArchitectures
bullseye2.11.0+ds-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.16.0+ds-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2.12.0+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye-backports2.12.0+ds-3~bpo11+1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch2.6.0-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster-backports2.9.0-4~bpo10+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
stretch-backports-sloppy2.9.0-3~bpo9+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.8.1-1+deb10u1amd64,arm64,armhf,i386
sid2.16.0+ds-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie2.2.29-3amd64,armel,armhf,i386
Debtags of package ncbi-blast+:
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
useanalysing
works-withbiological-sequence
Popcon: 45 users (32 upd.)*
Versions and Archs
License: DFSG free
Git

The Basic Local Alignment Search Tool (BLAST) is the most widely used sequence similarity tool. There are versions of BLAST that compare protein queries to protein databases, nucleotide queries to nucleotide databases, as well as versions that translate nucleotide queries or databases in all six frames and compare to protein databases or queries. PSI-BLAST produces a position-specific scoring matrix (PSSM) starting with a protein query, and then uses that PSSM to perform further searches. It is also possible to compare a protein or nucleotide query to a database of PSSMs. The NCBI supports a BLAST web page at blast.ncbi.nlm.nih.gov as well as a network service.
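
Translating a nucleotide sequence "in all six frames" means reading codons from three offsets on the forward strand and three on the reverse complement. The Python sketch below shows the idea with the standard genetic code; it is an illustration only and does not call any BLAST program. The function names and the frame numbering (+1..+3, -1..-3) are choices made for this example.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
_BASES = "TCAG"
_AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(_BASES, repeat=3), _AMINO_ACIDS)
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> dict:
    """Map frame number (+1..+3 forward, -1..-3 reverse) to its translation."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[offset + 1] = translate(seq[offset:])
        frames[-(offset + 1)] = translate(rc[offset:])
    return frames
```

For the input ATGGCCTAA, frame +1 reads ATG GCC TAA and yields "MA*", where "*" marks a stop codon.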

Registry entries: Bio.tools  SciCrunch  Bioconda 
ngmlr
CoNvex Gap-cost alignMents for Long Reads
Versions of package ngmlr
ReleaseVersionArchitectures
bookworm0.2.7+git20210816.a2a31fb+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.2.7+git20210816.a2a31fb+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.2.7+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
experimental0.2.7+git20210816.a2a31fb+dfsg-4~0exp0simdeamd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid0.2.7+git20210816.a2a31fb+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Ngmlr is a long-read mapper designed to sensitively align PacBio or Oxford Nanopore reads to (large) reference genomes. It was designed to quickly and correctly align the reads, including those spanning (complex) structural variations. Ngmlr uses an SV-aware k-mer search to find approximate mapping locations for a read and then a banded Smith-Waterman alignment algorithm to compute the final alignment. To compute precise alignments, Ngmlr uses a convex gap cost model that penalizes gap extensions for longer gaps less than for shorter ones.
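
The difference between a conventional affine gap cost and a gap cost whose extension penalty shrinks with gap length can be shown with a small Python sketch. The functional form and every parameter value below (gap_open, extend_max, extend_min, decay) are assumptions picked for illustration; they are not claimed to be Ngmlr's actual scoring scheme.

```python
def affine_gap_cost(length: int, gap_open: float = 5.0, gap_extend: float = 1.0) -> float:
    """Classic affine cost: every extension costs the same."""
    if length <= 0:
        return 0.0
    return gap_open + gap_extend * length


def convex_gap_cost(
    length: int,
    gap_open: float = 5.0,
    extend_max: float = 1.0,
    extend_min: float = 0.25,
    decay: float = 0.05,
) -> float:
    """Gap cost whose per-base extension penalty decreases as the gap grows.

    The i-th extended base costs max(extend_min, extend_max - decay * i),
    so long gaps (as expected around structural variations) are penalised
    less per base than short ones.
    """
    if length <= 0:
        return 0.0
    cost = gap_open
    for i in range(length):
        cost += max(extend_min, extend_max - decay * i)
    return cost
```

Under these hypothetical values a 1-base gap costs the same in both models, while a 100-base gap is considerably cheaper in the second one, so a single long gap is favoured over many scattered short ones.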

Please cite: Fritz J. Sedlazeck, Philipp Rescheneder, Moritz Smolka, Han Fang, Maria Nattestad, Arndt von Haeseler and Michael C. Schatz: Accurate detection of complex structural variations using single-molecule sequencing. Nature Methods 15:461–468 (2018)
Registry entries: Bio.tools  SciCrunch  Bioconda 
nthash
Methods to evaluate runtime and uniformity tests for hashing methods
Versions of package nthash
ReleaseVersionArchitectures
sid2.3.0+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64
trixie2.3.0+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64
bookworm2.3.0+dfsg-1amd64,arm64,mips64el,ppc64el
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

This package contains the nttest binary, which has options for evaluating the runtime and uniformity of different hashing methods.

Please cite: Hamid Mohamadi, Justin Chu, Benjamin P. Vandervalk and Inanc Birol: ntHash: recursive nucleotide hashing. (PubMed,eprint) Bioinformatics 32(22):3492-3494 (2016)
odil
C++11 library for the DICOM standard (application)
Versions of package odil
ReleaseVersionArchitectures
sid0.12.2-5all
stretch0.7.3-1all
trixie0.12.2-5all
bookworm0.12.2-2all
buster0.10.0-3all
bullseye0.12.1-1all
upstream0.13.0
Popcon: 6 users (5 upd.)*
Newer upstream!
License: DFSG free
Git

Odil leverages C++ constructs to provide a user-friendly API of the different parts of the DICOM standard. Included in Odil are exception-based error handling, generic access to datasets elements, standard JSON and XML representation of datasets, and generic implementation of messages, clients and servers for the various DICOM protocols.

This package contains the command-line application.

orthanc
Lightweight, RESTful DICOM server for medical imaging
Versions of package orthanc
ReleaseVersionArchitectures
buster1.5.6+dfsg-1amd64,arm64,armhf,i386
bullseye1.9.2+really1.9.1+dfsg-1+deb11u1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie0.8.4+dfsg-1amd64,armel,armhf,i386
sid1.12.4+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.12.4+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster-security1.5.6+dfsg-1+deb10u1amd64,arm64,armhf,i386
stretch1.2.0+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bullseye-security1.9.2+really1.9.1+dfsg-1+deb11u1amd64,arm64,armhf,i386
bookworm1.10.1+dfsg-2+deb12u1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm-security1.10.1+dfsg-2+deb12u1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 145 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Orthanc aims at providing a simple, yet powerful DICOM server for medical imaging. Orthanc can turn any computer running Windows or Linux into a Vendor Neutral Archive (in other words, a mini-PACS system). Its architecture is lightweight, meaning that no complex database administration is required, nor the installation of third-party dependencies.

What makes Orthanc unique is the fact that it provides a RESTful API. Thanks to this major feature, it is possible to drive Orthanc from any computer language. The DICOM tags of the stored medical images can be downloaded in the JSON file format. Furthermore, standard PNG images can be generated on-the-fly from the DICOM instances by Orthanc.
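
As a rough sketch of driving Orthanc over HTTP, the Python snippet below builds the URLs for an instance's tags (JSON) and its PNG preview. The base URL assumes a local server on port 8042, the routes `/instances/{id}/simplified-tags` and `/instances/{id}/preview` are taken to be those of the Orthanc REST API, and the helper names are hypothetical. A real deployment may also require authentication.

```python
import json
import urllib.request

# Assumption: Orthanc listens locally on its usual HTTP port.
BASE_URL = "http://localhost:8042"


def instance_tags_url(instance_id, base_url=BASE_URL):
    """URL of the DICOM tags of one stored instance, as JSON."""
    return f"{base_url}/instances/{instance_id}/simplified-tags"


def instance_preview_url(instance_id, base_url=BASE_URL):
    """URL of a PNG preview rendered on the fly from one instance."""
    return f"{base_url}/instances/{instance_id}/preview"


def fetch_json(url):
    """Download a URL and decode its JSON body (needs a running server)."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)
```

With a running server, `fetch_json(instance_tags_url(some_id))` would return the tags as a Python dictionary, where `some_id` is an Orthanc instance identifier.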

Orthanc lets its users focus on the content of the DICOM files, hiding the complexity of the DICOM format and of the DICOM protocol.

Please cite: Sébastien Jodogne: The Orthanc Ecosystem for Medical Imaging. (PubMed,eprint) Journal of Digital Imaging 31(3):341–352 (2018)
Other screenshots of package orthanc
VersionURL
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25317/simage/large-fb8fe23a6a96ff9dabdb57bd4685f3e8.jpg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25359/simage/large-efddd415e09a8e5f0efce5b95e22635b.jpg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25360/simage/large-9a4c4b22f179c74187ca035359d54cf2.png
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25361/simage/large-7e419f029caa48e227bb8a06e8c40ea0.jpeg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25362/simage/large-db8cdd3a739846c211c7176ef5b8865c.jpeg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25556/simage/large-810b340170fb6fc94744a07601cc0a75.png
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/23986/simage/large-ee46469958cc3a481db4394575b50bc7.jpeg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/24799/simage/large-11bc719ff4e27568ed1c5bef0ac4b03c.png
https://screenshots.debian.net/shrine/screenshot/10544/simage/large-d296180c6a85fb7ebe2cb7344822e27c.png
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/24778/simage/large-75ba00cccf400179059e3fd5642f66d6.jpeg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/simage/large-885dac742e205e84dffae75f502ca5fb.jpg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/23295/simage/large-05d63d7cd1c089d5cf01af8b16aaf4c0.png
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/24707/simage/large-ef28a9b493f0813a40749436bbb5115f.jpg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/25105/simage/large-e77d632ec6ddff97e0afd364546e8d14.jpeg
1.9.7+dfsghttps://screenshots.debian.net/shrine/screenshot/simage/large-8f8dbdc8a4597ba1030e7745c62f340f.jpeg
Screenshots of package orthanc
orthanc-dicomweb
Plugin to extend Orthanc with support of WADO and DICOMweb
Versions of package orthanc-dicomweb
ReleaseVersionArchitectures
bullseye1.5+dfsg-3amd64,arm64,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch0.3+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm1.7+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.17+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster0.6+dfsg-1amd64,arm64,armhf,i386
sid1.17+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 18 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Orthanc DICOMweb is a plugin to Orthanc, the lightweight, RESTful Vendor Neutral Archive for medical imaging. It extends the Orthanc core with support of the WADO (now known as WADO-URI) and DICOMweb (QIDO-RS, STOW-RS, WADO-RS) standards.

Please cite: Sebastien Jodogne: The Orthanc Ecosystem for Medical Imaging. J Digit Imaging (2018)
Screenshots of package orthanc-dicomweb
orthanc-python
Develop plugins for Orthanc using the Python programming language
Versions of package orthanc-python
ReleaseVersionArchitectures
trixie4.3+ds-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid4.3+ds-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye3.1+ds-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm4.0+ds-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 8 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

This plugin can be used to write Orthanc plugins using the Python programming language instead of the more complex C/C++ programming languages. It can be used to gain access to Python modules directly in Orthanc.

This plugin can be of great help to anyone wishing to automate her imaging workflow, to design/train new machine learning algorithms, or to deploy AI systems directly in clinical setups.

Please cite: Sebastien Jodogne: The Orthanc Ecosystem for Medical Imaging. J Digit Imaging (2018)
orthanc-wsi
Whole-slide imaging support for Orthanc (digital pathology)
Versions of package orthanc-wsi
ReleaseVersionArchitectures
sid2.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.1-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.0-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.6-2amd64,arm64,armhf,i386
trixie2.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 73 users (6 upd.)*
Versions and Archs
License: DFSG free
Git

Orthanc-WSI brings support of whole-slide imaging for digital pathology into Orthanc, the lightweight, RESTful Vendor Neutral Archive for medical imaging.

This package contains two command-line tools to convert whole-slide images to and from DICOM. Support for proprietary file formats is available through OpenSlide. The package also contains an Orthanc plugin to display such DICOM images by any standard Web browser. The implementation follows DICOM Supplement 145.

Please cite: Sebastien Jodogne: The Orthanc Ecosystem for Medical Imaging. J Digit Imaging (2018)
paleomix
pipelines and tools for the processing of ancient and modern HTS data
Versions of package paleomix
ReleaseVersionArchitectures
bullseye1.3.2-1amd64,arm64,mips64el,ppc64el
buster1.2.13.3-1amd64
bookworm1.3.7-3amd64,arm64
trixie1.3.8-2amd64,arm64
sid1.3.8-2amd64,arm64
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The PALEOMIX pipelines are a set of pipelines and tools designed to aid the rapid processing of High-Throughput Sequencing (HTS) data: The BAM pipeline processes de-multiplexed reads from one or more samples, through sequence processing and alignment, to generate BAM alignment files useful in downstream analyses; the Phylogenetic pipeline carries out genotyping and phylogenetic inference on BAM alignment files, either produced using the BAM pipeline or generated elsewhere; and the Zonkey pipeline carries out a suite of analyses on low coverage equine alignments, in order to detect the presence of F1-hybrids in archaeological assemblages. In addition, PALEOMIX aids in metagenomic analysis of the extracts.

The pipelines have been designed with ancient DNA (aDNA) in mind and include several features especially useful for the analysis of ancient samples, but they can all be used for processing modern samples as well, in order to ensure consistent data processing.

Please cite: Mikkel Schubert, Luca Ermini, Clio Der Sarkissian, Hákon Jónsson, Aurélien Ginolhac, Robert Schaefer, Michael D Martin, Ruth Fernández, Martin Kircher, Molly McCue, Eske Willerslev and Ludovic Orlando: Characterization of ancient and modern genomes by SNP detection and phylogenomic and metagenomic analysis using PALEOMIX. (PubMed) Nature Protocols 9(5):1056-82 (2014)
Registry entries: Bio.tools  SciCrunch 
parallel-fastq-dump
parallel fastq-dump wrapper
Versions of package parallel-fastq-dump
ReleaseVersionArchitectures
bullseye0.6.6-3amd64
trixie0.6.7-3amd64,arm64
sid0.6.7-3amd64,arm64
bookworm0.6.7-3amd64,arm64
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

NCBI fastq-dump can be very slow sometimes, even if you have the resources (network, IO, CPU) to go faster, even if you already downloaded the sra file. This tool speeds up the process by dividing the work into multiple threads.

This is possible because fastq-dump has options (-N and -X) to query specific ranges of the sra file. This tool works by dividing the work into the requested number of threads, running multiple fastq-dump processes in parallel and concatenating the results back together, as if you had just executed a plain fastq-dump call.
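
The range-splitting step can be sketched in Python as follows. This is an illustration under stated assumptions, not the tool's own code: the function names are hypothetical, the total number of spots is assumed to be known in advance, and output handling and concatenation are left out.

```python
def split_spot_ranges(total_spots, threads):
    """Split spots 1..total_spots into contiguous (first, last) blocks."""
    if total_spots < 1 or threads < 1:
        raise ValueError("total_spots and threads must be positive")
    threads = min(threads, total_spots)  # never create empty blocks
    base, extra = divmod(total_spots, threads)
    ranges = []
    start = 1
    for i in range(threads):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def fastq_dump_commands(sra_file, total_spots, threads):
    """Build one fastq-dump argument list per block using -N and -X."""
    return [["fastq-dump", "-N", str(first), "-X", str(last), sra_file]
            for first, last in split_spot_ranges(total_spots, threads)]
```

Each argument list could then be run as a separate process, with the per-block outputs concatenated in block order.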

Registry entries: Bioconda 
parasail
Aligner based on libparasail
Versions of package parasail
ReleaseVersionArchitectures
sid2.6.2+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie2.6.2+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2.6+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye2.4.3+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This package contains a command-line aligner based on libparasail. Parasail is a SIMD C library containing implementations of the Smith-Waterman, Needleman-Wunsch, and various semi-global pairwise sequence alignment algorithms.
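
For readers unfamiliar with Smith-Waterman, the following plain Python sketch computes the local alignment score with a linear gap penalty. It is a scalar textbook version for illustration only; it does not use libparasail or SIMD, and the scoring parameters are arbitrary assumptions.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score of strings a and b."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

Needleman-Wunsch differs mainly in dropping the zero floor and scoring the alignment end to end.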

picard-tools
Command line tools to manipulate SAM and BAM files
Versions of package picard-tools
ReleaseVersionArchitectures
jessie1.113-1all
trixie3.1.1+dfsg-1all
bullseye2.24.1+dfsg-1all
sid3.1.1+dfsg-1all
stretch2.8.1+dfsg-1all
buster2.18.25+dfsg-2amd64
bookworm2.27.5+dfsg-2all
upstream3.3.0
Popcon: 10 users (5 upd.)*
Newer upstream!
License: DFSG free
Git

SAM (Sequence Alignment/Map) format is a generic format for storing large nucleotide sequence alignments. Picard Tools includes these utilities to manipulate SAM and BAM files:

 AddCommentsToBam                  FifoBuffer
 AddOrReplaceReadGroups            FilterSamReads
 BaitDesigner                      FilterVcf
 BamIndexStats                     FixMateInformation
                                   GatherBamFiles
 BedToIntervalList                 GatherVcfs
 BuildBamIndex                     GenotypeConcordance
 CalculateHsMetrics                IlluminaBasecallsToFastq
 CalculateReadGroupChecksum        IlluminaBasecallsToSam
 CheckIlluminaDirectory            LiftOverIntervalList
 CheckTerminatorBlock              LiftoverVcf
 CleanSam                          MakeSitesOnlyVcf
 CollectAlignmentSummaryMetrics    MarkDuplicates
 CollectBaseDistributionByCycle    MarkDuplicatesWithMateCigar
 CollectGcBiasMetrics              MarkIlluminaAdapters
 CollectHiSeqXPfFailMetrics        MeanQualityByCycle
 CollectIlluminaBasecallingMetrics MergeBamAlignment
 CollectIlluminaLaneMetrics        MergeSamFiles
 CollectInsertSizeMetrics          MergeVcfs
 CollectJumpingLibraryMetrics      NormalizeFasta
 CollectMultipleMetrics            PositionBasedDownsampleSam
 CollectOxoGMetrics                QualityScoreDistribution
 CollectQualityYieldMetrics        RenameSampleInVcf
 CollectRawWgsMetrics              ReorderSam
 CollectRnaSeqMetrics              ReplaceSamHeader
 CollectRrbsMetrics                RevertOriginalBaseQualitiesAndAddMateCigar
 CollectSequencingArtifactMetrics  RevertSam
 CollectTargetedPcrMetrics         SamFormatConverter
 CollectVariantCallingMetrics      SamToFastq
 CollectWgsMetrics                 ScatterIntervalsByNs
 CompareMetrics                    SortSam
 CompareSAMs                       SortVcf
 ConvertSequencingArtifactToOxoG   SplitSamByLibrary
 CreateSequenceDictionary          SplitVcfs
 DownsampleSam                     UpdateVcfSequenceDictionary
 EstimateLibraryComplexity         ValidateSamFile
 ExtractIlluminaBarcodes           VcfFormatConverter
 ExtractSequences                  VcfToIntervalList
 FastqToSam                        ViewSam
The package is enhanced by the following packages: multiqc
Please cite: Broad Institute: Picard toolkit. Broad Institute, GitHub repository (2019)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequencing; Document, record and content management
picopore
lossless compression of Nanopore files
Versions of package picopore
ReleaseVersionArchitectures
bullseye1.2.0-2all
trixie1.2.0-3all
sid1.2.0-3all
bookworm1.2.0-2all
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The Nanopore is a device that determines the sequence of single molecules of DNA without amplification. The output is gigantic, and tools like this one help to reduce it.

Over time, other means have superseded the need for this tool, and upstream has halted development. Some Nanopore tutorials and pipelines still refer to it, though.

Registry entries: Bio.tools  Bioconda 
pigx-rnaseq
pipeline for checkpointed and distributed RNA-seq analyses
Versions of package pigx-rnaseq
ReleaseVersionArchitectures
bookworm0.1.0-1.1all
trixie0.1.1-1all
sid0.1.1-1all
bullseye0.0.10+ds-2all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

This package provides an automated workflow for the analysis of RNA-seq experiments. A series of well-accepted tools are connected in Python scripts and controlled via snakemake. This supports the parallel execution of these workflows and provides checkpointing, such that interrupted workflows can resume their work.

Please cite: Ricardo Wurmus, Bora Uyar, Brendan Osberg, Vedran Franke, Alexander Gosdschan, Katarzyna Wreczycka, Jonathan Ronen and Altuna Akalin: PiGx: Reproducible Genomics Analysis Pipelines with GNU Guix. (PubMed,eprint) GigaScience 7(12):giy123 (2018)
Registry entries: SciCrunch 
pinfish
Collection of tools to annotate genomes using long read transcriptomics data
Versions of package pinfish
ReleaseVersionArchitectures
sid0.1.0+ds-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.1.0+ds-2amd64,arm64,mips64el,ppc64el,s390x
bookworm0.1.0+ds-3amd64,arm64,mips64el,ppc64el,s390x
trixie0.1.0+ds-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The toolchain is composed of the following tools:

  1. spliced_bam2gff - a tool for converting sorted BAM files containing spliced alignments into GFF2 format. Each read will be represented as a distinct transcript. This tool comes in handy when visualizing spliced reads at particular loci and to provide input to the rest of the toolchain.

  2. cluster_gff - this tool takes a sorted GFF2 file as input, clusters together reads having similar exon/intron structure and creates a rough consensus of the clusters by taking the median of exon boundaries from all transcripts in the cluster.

  3. polish_clusters - this tool takes the cluster definitions generated by cluster_gff and for each cluster creates an error corrected read by mapping all reads on the read with the median length and polishing it using racon. The polished reads can be mapped to the genome using minimap2 or GMAP.

  4. collapse_partials - this tool takes GFFs generated by either cluster_gff or polish_clusters and filters out transcripts which are likely to be based on RNA degradation products from the 5' end. The tool clusters the input transcripts into "loci" by the 3' ends and discards transcripts which have a compatible transcript in the locus with more exons.
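
The "median of exon boundaries" consensus described for cluster_gff can be sketched in Python as below. This is a simplified illustration, not pinfish code: the function name is hypothetical, and all transcripts in a cluster are assumed to have the same number of exons.

```python
from statistics import median


def consensus_exons(transcripts):
    """Median exon boundaries for one cluster.

    transcripts: list of transcripts, each a list of (start, end) exons.
    """
    n_exons = len(transcripts[0])
    if any(len(t) != n_exons for t in transcripts):
        raise ValueError("all transcripts must have the same exon count")
    consensus = []
    for k in range(n_exons):
        start = median(t[k][0] for t in transcripts)
        end = median(t[k][1] for t in transcripts)
        consensus.append((start, end))
    return consensus
```

Taking the median rather than the mean keeps a single badly aligned read from shifting the consensus boundary.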

plasmidid
mapping-based, assembly-assisted plasmid identification tool
Versions of package plasmidid
ReleaseVersionArchitectures
bookworm1.6.5+dfsg-2amd64
bullseye1.6.3+dfsg-3amd64
sid1.6.5+dfsg-2amd64,arm64
trixie1.6.5+dfsg-2amd64,arm64
Popcon: 0 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

PlasmidID is a mapping-based, assembly-assisted plasmid identification tool that analyzes samples and provides a graphical solution for plasmid identification.

PlasmidID is a computational pipeline that maps Illumina reads over plasmid database sequences. The k-mer filtered, most covered sequences are clustered by identity to avoid redundancy and the longest are used as scaffold for plasmid reconstruction. Reads are assembled and annotated by automatic and specific annotation. All information generated from mapping, assembly, annotation and local alignment analyses is gathered and accurately represented in a circular image which allows the user to determine the plasmid composition of any bacterial sample.

Registry entries: Bioconda 
plink1.9
whole-genome association analysis toolset
Versions of package plink1.9
ReleaseVersionArchitectures
trixie1.90~b7.2-231211-1amd64,armel,armhf,i386
sid1.90~b7.2-231211-1amd64,armel,armhf,i386
buster1.90~b6.6-181012-1amd64,armhf,i386
stretch1.90~b3.45-170113-1amd64,armel,armhf,i386,mipsel
bullseye1.90~b6.21-201019-1amd64,armel,armhf,i386,mipsel
bookworm1.90~b6.26-220402-1amd64,armel,armhf,i386,mipsel
Popcon: 85 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

plink expects as input the data from SNP (single nucleotide polymorphism) chips of many individuals and their phenotypical description of a disease. It finds associations of single or pairs of DNA variations with a phenotype and can retrieve SNP annotation from an online source.

SNPs can be evaluated individually or as pairs for their association with the disease phenotypes. The joint investigation of copy number variations is supported. A variety of statistical tests have been implemented.
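
As an example of the kind of statistical test meant here, the Python sketch below computes a basic allelic chi-square statistic for a single SNP from a 2x2 table of allele counts in cases and controls. It is a textbook formula shown for illustration, not plink's implementation, and the function name is hypothetical.

```python
def allelic_chi_square(case_a, case_b, control_a, control_b):
    """Pearson chi-square (1 degree of freedom) for a 2x2 allele table.

    case_a/case_b: counts of allele A/B in cases;
    control_a/control_b: counts of allele A/B in controls.
    """
    n = case_a + case_b + control_a + control_b
    row_case = case_a + case_b
    row_control = control_a + control_b
    col_a = case_a + control_a
    col_b = case_b + control_b
    denominator = row_case * row_control * col_a * col_b
    if denominator == 0:
        raise ValueError("every row and column needs at least one count")
    return n * (case_a * control_b - case_b * control_a) ** 2 / denominator
```

A large statistic suggests that allele frequencies differ between cases and controls; in a genome-wide scan the resulting p-values must be corrected for multiple testing.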

plink1.9 is a comprehensive update of plink with new algorithms and new methods, and it is faster and consumes less memory than the first plink.

Please note: The executable was renamed to plink1.9 because of a name clash. Please read more about this in /usr/share/doc/plink1.9/README.Debian.

Please cite: Christopher C. Chang, Carson C. Chow, Laurent C.A.M. Tellier, Shashaank Vattikuti, Shaun M. Purcell and James J. Lee: Second-generation PLINK: rising to the challenge of larger and richer datasets. (eprint) GigaScience 4(1):7 (2015)
Registry entries: Bio.tools  SciCrunch  Bioconda 
plink2
whole-genome association analysis toolset
Versions of package plink2
ReleaseVersionArchitectures
bullseye2.00~a3-210203+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.00~a5.8-231123+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie2.00~a5.8-231123+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2.00~a3.5-220809+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 2 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

plink expects as input the data from SNP (single nucleotide polymorphism) chips of many individuals and their phenotypical description of a disease. It finds associations of single or pairs of DNA variations with a phenotype and can retrieve SNP annotation from an online source.

SNPs can be evaluated individually or as pairs for their association with the disease phenotypes. The joint investigation of copy number variations is supported. A variety of statistical tests have been implemented.

plink2 is a comprehensive update of plink and plink1.9 with new algorithms and new methods, and it is faster and consumes less memory than the first plink.

Please cite: Christopher C. Chang, Carson C. Chow, Laurent C.A.M. Tellier, Shashaank Vattikuti, Shaun M. Purcell and James J. Lee: Second-generation PLINK: rising to the challenge of larger and richer datasets. (eprint) GigaScience 4(1):7 (2015)
Registry entries: SciCrunch 
plip
fully automated protein-ligand interaction profiler
Versions of package plip
ReleaseVersionArchitectures
trixie2.3.0+dfsg-2all
buster1.4.3~b+dfsg-2all
stretch1.3.3+dfsg-1all
bookworm2.2.2+dfsg-1all
bullseye2.1.7+dfsg-1all
sid2.3.0+dfsg-2all
Popcon: 3 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

The Protein-Ligand Interaction Profiler (PLIP) is a tool to analyze and visualize protein-ligand interactions in PDB files.

Features include:

  • Detection of eight different types of noncovalent interactions
  • Automatic detection of relevant ligands in a PDB file
  • Direct download of PDB structures from wwPDB server if valid PDB ID is given
  • Processing of custom PDB files containing protein-ligand complexes (e.g. from docking)
  • No need for special preparation of a PDB file, works out of the box
  • Atom-level interaction reports in rST and XML formats for easy parsing
  • Generation of PyMOL session files (.pse) for each pairing, enabling easy preparation of images for publications and talks
  • Rendering of preview image for each ligand and its interactions with the protein
Please cite: Sebastian Salentin, Sven Schreiber, V. Joachim Haupt, Melissa F. Adasme and Michael Schroeder: PLIP: fully automated protein–ligand interaction profiler. (eprint) Nucleic Acids Research (W1) (2015)
Registry entries: Bio.tools 
porechop
adapter trimmer for Oxford Nanopore reads
Versions of package porechop
ReleaseVersionArchitectures
bullseye0.2.4+dfsg-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.2.4+dfsg-1amd64,arm64,armhf,i386
bookworm0.2.4+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.2.4+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid0.2.4+dfsg-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Porechop is a tool for finding and removing adapters from Oxford Nanopore reads. Adapters on the ends of reads are trimmed off, and when a read has an adapter in its middle, it is treated as chimeric and chopped into separate reads. Porechop performs thorough alignments to effectively find adapters, even at low sequence identity. Porechop also supports demultiplexing of Nanopore reads that were barcoded with the Native Barcoding Kit, PCR Barcoding Kit or Rapid Barcoding Kit.

Registry entries: Bioconda 
poretools
toolkit for nanopore nucleotide sequencing data
Versions of package poretools
ReleaseVersionArchitectures
trixie0.6.0+dfsg-7all
stretch0.6.0+dfsg-2all
buster0.6.0+dfsg-3all
bullseye0.6.0+dfsg-5all
bookworm0.6.0+dfsg-6all
sid0.6.0+dfsg-7all
Popcon: 2 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

poretools is a flexible toolkit for exploring datasets generated by nanopore sequencing devices from MinION for the purposes of quality control and downstream analysis. Poretools operates directly on the native FAST5 (a variant of the HDF5 standard) file format produced by ONT and provides a wealth of format conversion utilities and data exploration and visualization tools.

Please cite: Nicholas Loman and Aaron Quinlan: Poretools: a toolkit for analyzing nanopore sequence data. (PubMed,eprint) Bioinformatics 30(23):3399-3401 (2014)
Registry entries: Bio.tools  Bioconda 
pplacer
phylogenetic placement and downstream analysis
Versions of package pplacer
ReleaseVersionArchitectures
bullseye1.1~alpha19-4amd64,arm64,ppc64el,s390x
sid1.1~alpha19-8amd64,arm64,ppc64el,riscv64,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Pplacer places reads on a phylogenetic tree. guppy (Grand Unified Phylogenetic Placement Yanalyzer) yanalyzes them. rppr is a helpful tool for working with reference packages.

Pplacer places query sequences on a fixed reference phylogenetic tree to maximize phylogenetic likelihood or posterior probability according to a reference alignment. Pplacer is designed to be fast, to give useful information about uncertainty, and to offer advanced visualization and downstream analysis.

Please cite: Frederick A Matsen, Robin B Kodner and E Virginia Armbrust: pplacer: linear time maximum-likelihood and Bayesian phylogenetic placement of sequences onto a fixed reference tree. (PubMed,eprint) BMC Bioinformatics 11:538 (2010)
Registry entries: Bioconda 
presto
toolkit for processing B and T cell sequences
Versions of package presto
ReleaseVersionArchitectures
bookworm0.7.1-1all
sid0.7.2-1all
trixie0.7.2-1all
bullseye0.6.2-1all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

pRESTO is a toolkit for processing raw reads from high-throughput sequencing of B cell and T cell repertoires.

Dramatic improvements in high-throughput sequencing technologies now enable large-scale characterization of lymphocyte repertoires, defined as the collection of trans-membrane antigen-receptor proteins located on the surface of B cells and T cells. The REpertoire Sequencing TOolkit (pRESTO) is composed of a suite of utilities to handle all stages of sequence processing prior to germline segment assignment. pRESTO is designed to handle either single reads or paired-end reads. It includes features for quality control, primer masking, annotation of reads with sequence embedded barcodes, generation of unique molecular identifier (UMI) consensus sequences, assembly of paired-end reads and identification of duplicate sequences. Numerous options for sequence sorting, sampling and conversion operations are also included.
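
The idea of a UMI consensus sequence can be illustrated with a simple majority vote per position. The sketch below is not pRESTO's implementation: the function name is hypothetical, reads sharing a UMI are assumed to be already aligned and of equal length, and quality scores are ignored.

```python
from collections import Counter, defaultdict


def umi_consensus(reads):
    """Build one majority-vote consensus sequence per UMI.

    reads: iterable of (umi, sequence) pairs.
    """
    groups = defaultdict(list)
    for umi, seq in reads:
        groups[umi].append(seq)
    consensus = {}
    for umi, seqs in groups.items():
        # zip(*seqs) yields one tuple of bases per alignment column.
        consensus[umi] = "".join(
            Counter(column).most_common(1)[0][0] for column in zip(*seqs)
        )
    return consensus
```

Because reads with the same UMI derive from the same original molecule, the vote suppresses isolated sequencing and PCR errors.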

Please cite: Jason A. Vander Heiden, Gur Yaari, Mohamed Uduman, Joel N.H. Stern, Kevin C. O’Connor, David A. Hafler, Francois Vigneault and Steven H. Kleinstein: pRESTO: a toolkit for processing high-throughput sequencing raw reads of lymphocyte receptor repertoires. (PubMed,eprint) Bioinformatics 30(13):1930-1932 (2014)
Registry entries: Bio.tools  SciCrunch  Bioconda 
prinseq-lite
PReprocessing and INformation of SEQuence data (lite version)
Versions of package prinseq-lite
ReleaseVersionArchitectures
sid0.20.4-6all
bullseye0.20.4-6all
bookworm0.20.4-6all
trixie0.20.4-6all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

PRINSEQ will help you to preprocess your genomic or metagenomic sequence data in FASTA or FASTQ format. It is a tool that generates summary statistics of sequence and quality data and that is used to filter, reformat and trim next-generation sequence data. It is particularly designed for 454/Roche data, but can also be used for other types of sequence data. The standalone version is primarily designed for data preprocessing and does not generate summary statistics in graphical form.

Please cite: Schmieder R and Edwards R: Quality control and preprocessing of metagenomic datasets. (PubMed,eprint) Bioinformatics 27(6):863-864 (2011)
prokka
rapid annotation of prokaryotic genomes
Versions of package prokka
ReleaseVersionArchitectures
sid1.14.6+dfsg-6all
bullseye1.14.6+dfsg-3amd64
bookworm1.14.6+dfsg-4amd64
trixie1.14.6+dfsg-6all
Popcon: 2 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

A typical 4 Mbp genome can be fully annotated in less than 10 minutes on a quad-core computer, and Prokka scales well to 32-core SMP systems. It produces GFF3, GBK and SQN files that are ready for editing in Sequin and, ultimately, for submission to GenBank/DDBJ/ENA.

The package is enhanced by the following packages: multiqc
Please cite: Torsten Seemann: Prokka: rapid prokaryotic genome annotation. (PubMed,eprint) Bioinformatics 30(14):2068-2069 (2014)
Registry entries: Bio.tools  SciCrunch  Bioconda 
proteinortho
Detection of (Co-)orthologs in large-scale protein analysis
Versions of package proteinortho
ReleaseVersionArchitectures
bookworm6.1.7+dfsg-1amd64,arm64,ppc64el,s390x
stretch5.15+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid6.3.1+dfsg-1amd64,arm64,ppc64el,riscv64,s390x
buster5.16.b+dfsg-1amd64,arm64,armhf,i386
trixie6.3.1+dfsg-1amd64,arm64,ppc64el,riscv64,s390x
bullseye6.0.28+dfsg-1amd64,arm64,ppc64el,s390x
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Proteinortho is a stand-alone tool that is geared towards large datasets and makes use of distributed computing techniques when run on multi-core hardware. It implements an extended version of the reciprocal best alignment heuristic. Proteinortho was applied to compute orthologous proteins in the complete set of all 717 eubacterial genomes available at NCBI at the beginning of 2009. The authors succeeded in identifying thirty proteins present in 99% of all bacterial proteomes.
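
The plain reciprocal best hit heuristic, which Proteinortho extends, can be sketched in Python as follows. This shows only the basic idea under assumed inputs (dictionaries of pairwise similarity scores in both search directions); the function names are hypothetical and none of Proteinortho's extensions are modelled.

```python
def best_hits(scores):
    """Map each query to its highest-scoring target.

    scores: dict mapping (query, target) -> similarity score.
    """
    best = {}
    for (query, target), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (target, score)
    return {query: target for query, (target, _) in best.items()}


def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    best_ab = best_hits(a_vs_b)
    best_ba = best_hits(b_vs_a)
    return sorted((a, b) for a, b in best_ab.items() if best_ba.get(b) == a)
```

A pair is reported only if the relationship holds in both directions, which filters out many one-sided hits to paralogs.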

Please cite: Marcus Lechner, Sven Findeiß, Lydia Steiner, Manja Marz, Peter F Stadler and Sonja J Prohaska: Proteinortho: Detection of (Co-)orthologs in large-scale analysis. (PubMed,eprint) BMC Bioinformatics 12:124 (2011)
pybedtools-bin
Scripts produced for pybedtools
Versions of package pybedtools-bin
ReleaseVersionArchitectures
buster0.8.0-1all
trixie0.10.0-1all
bookworm0.9.0-4all
bullseye0.8.0-5all
sid0.10.0-1all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The BEDTools suite of programs is widely used for genomic interval manipulation or “genome algebra”. pybedtools wraps and extends BEDTools and offers feature-level manipulations from within Python.

This package provides scripts that are executable with the Python 3 version of this package.

Please cite: R. K. Dale, B. S. Pedersen and A. R. Quinlan: Pybedtools: a flexible Python library for manipulating genomic datasets and annotations. Bioinformatics 27(24):3423-3424 (2011)
Registry entries: Bio.tools  Bioconda 
pycoqc
computes metrics and generates Interactive QC plots
Versions of package pycoqc
ReleaseVersionArchitectures
bookworm2.5.2+dfsg-3all
trixie2.5.2+dfsg-3all
sid2.5.2+dfsg-3all
bullseye2.5.2+dfsg-1all
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

PycoQC computes metrics and generates interactive QC plots for Oxford Nanopore Technologies sequencing data.

PycoQC relies on the sequencing_summary.txt file generated by Albacore and Guppy, but if needed it can also generate a summary file from basecalled fast5 files. The package supports 1D and 1D2 runs generated with Minion, Gridion and Promethion devices and basecalled with Albacore 1.2.1+ or Guppy 2.1.3+.

The package is enhanced by the following packages: multiqc
Registry entries: Bioconda 
python3-biom-format
Biological Observation Matrix (BIOM) format (Python 3)
Versions of package python3-biom-format
ReleaseVersionArchitectures
trixie2.1.16-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch2.1.5+dfsg-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
buster2.1.7+dfsg-2amd64,arm64,armhf,i386
bullseye2.1.10-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bookworm2.1.12-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.1.16-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 21 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

The BIOM file format (canonically pronounced biome) is designed to be a general-use format for representing biological sample by observation contingency tables. BIOM is a recognized standard for the Earth Microbiome Project and is a Genomic Standards Consortium candidate project.

The BIOM format is designed for general use in broad areas of comparative -omics. For example, in marker-gene surveys, the primary use of this format is to represent OTU tables: the observations in this case are OTUs and the matrix contains counts corresponding to the number of times each OTU is observed in each sample. With respect to metagenome data, this format would be used to represent metagenome tables: the observations in this case might correspond to SEED subsystems, and the matrix would contain counts corresponding to the number of times each subsystem is observed in each metagenome. Similarly, with respect to genome data, this format may be used to represent a set of genomes: the observations in this case again might correspond to SEED subsystems, and the counts would correspond to the number of times each subsystem is observed in each genome.

This package provides the BIOM format library for the Python 3 interpreter.

Please cite: Daniel McDonald, Jose C. Clemente, Justin Kuczynski, Jai R. Rideout, Jesse Stombaugh, Doug Wendel, Andreas Wilke, Susan Huse, John Hufnagle, Folker Meyer, Rob Knight and J. G. Caporaso: The Biological Observation Matrix (BIOM) format or: how I learned to stop worrying and love the ome-ome. (eprint) GigaScience 1:7 (2012)
python3-biopython
Python3 library for bioinformatics
Versions of package python3-biopython
ReleaseVersionArchitectures
sid1.84+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.80+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.78+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.73+dfsg-1amd64,arm64,armhf,i386
stretch1.68+dfsg-3amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie1.64+dfsg-5amd64,armel,armhf,i386
trixie1.84+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 49 users (43 upd.)*
Versions and Archs
License: DFSG free
Git

The Biopython Project is an international association of developers of freely available Python tools for computational molecular biology.

It is a distributed collaborative effort to develop Python3 libraries and applications which address the needs of current and future work in bioinformatics. The source code is made available under the Biopython License, which is extremely liberal and compatible with almost every license in the world. The project works along with the Open Bioinformatics Foundation, who generously provide web and CVS space for the project.

Please cite: Peter J. A. Cock, Tiago Antao, Jeffrey T. Chang, Brad A. Chapman, Cymon J. Cox, Andrew Dalke, Iddo Friedberg, Thomas Hamelryck, Frank Kauff, Bartek Wilczynski and Michiel J. L. de Hoon: Biopython: freely available Python tools for computational molecular biology and bioinformatics. (PubMed,eprint) Bioinformatics 25(11):1422-1423 (2009)
Registry entries: Bio.tools  SciCrunch 
python3-bx
library to manage genomic data and its alignment
Versions of package python3-bx
ReleaseVersionArchitectures
bookworm0.9.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye0.8.9-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.13.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster0.8.2-1amd64,arm64,armhf,i386
sid0.13.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The bx-python project is a Python3 library and associated set of scripts to allow for rapid implementation of genome scale analyses. The library contains a variety of useful modules, but the particular strengths are:

  • Classes for reading and working with genome-scale multiple local alignments (in MAF, AXT, and LAV formats)
  • Generic data structure for indexing on disk files that contain blocks of data associated with intervals on various sequences (used, for example, to provide random access to individual alignments in huge files; optimized for use over network filesystems)
  • Data structures for working with intervals on sequences
  • "Binned bitsets" which act just like chromosome sized bit arrays, but lazily allocate regions and allow large blocks of all set or all unset bits to be stored compactly
  • "Intersecter" for performing fast intersection tests that preserve both query and target intervals and associated annotation
Registry entries: Bioconda 
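The interval intersection idea listed above can be illustrated with a small standard-library sketch. This is not the bx-python API: the class name SimpleIntersecter and its add/find methods are hypothetical, and intervals are assumed to be half-open [start, end).

```python
import bisect


class SimpleIntersecter:
    """Hypothetical stand-in for an interval intersecter (not bx-python's class)."""

    def __init__(self):
        self._starts = []   # interval starts, kept sorted
        self._items = []    # (start, end, annotation), same order as _starts
        self._max_len = 0   # longest interval seen so far

    def add(self, start, end, annotation=None):
        """Insert a half-open interval [start, end) with an optional annotation."""
        pos = bisect.bisect_left(self._starts, start)
        self._starts.insert(pos, start)
        self._items.insert(pos, (start, end, annotation))
        self._max_len = max(self._max_len, end - start)

    def find(self, start, end):
        """Return stored intervals overlapping the query [start, end)."""
        # An overlapping interval must start before the query ends and
        # cannot start earlier than `start - max_len`.
        lo = bisect.bisect_left(self._starts, start - self._max_len)
        hi = bisect.bisect_left(self._starts, end)
        return [item for item in self._items[lo:hi] if item[1] > start]


features = SimpleIntersecter()
features.add(10, 20, "a")
features.add(15, 30, "b")
features.add(40, 50, "c")
hits = features.find(18, 41)
```

The query returns the stored tuples unchanged, so the target intervals and their annotations are both preserved. A production implementation would normally use an interval tree rather than a sorted list.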
python3-cgecore
Python3 module for the Center for Genomic Epidemiology
Versions of package python3-cgecore
ReleaseVersionArchitectures
sid1.5.6+ds-1all
trixie1.5.6+ds-1all
bookworm1.5.6+ds-1all
bullseye1.5.6+ds-1all
Popcon: 0 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This Python3 module contains classes and functions needed to run the service wrappers and pipeline scripts developed by the Center for Genomic Epidemiology.

Registry entries: Bioconda 
python3-cogent3
framework for genomic biology
Versions of package python3-cogent3
ReleaseVersionArchitectures
sid2023.12.15a1+dfsg-1s390x
bullseye2020.12.21a+dfsg-4+deb11u1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2023.2.12a1+dfsg-2+deb12u1amd64,arm64,mips64el,ppc64el,s390x
sid2024.5.7a1+dfsg-3amd64,arm64,mips64el,ppc64el
upstream2024.7.19a9
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

PyCogent is a software library for genomic biology. It is a fully integrated and thoroughly tested framework for:

  • controlling third-party applications,
  • devising workflows; querying databases,
  • conducting novel probabilistic analyses of biological sequence evolution, and
  • generating publication-quality graphics.

It is distinguished by many unique built-in capabilities (such as true codon alignment) and the frequent addition of entirely new methods for the analysis of genomic data.

Please cite: Rob Knight, Peter Maxwell, Amanda Birmingham, Jason Carnes, J Gregory Caporaso, Brett C Easton, Michael Eaton, Micah Hamady, Helen Lindsay, Zongzhi Liu, Catherine Lozupone, Daniel McDonald, Michael Robeson, Raymond Sammut, Sandra Smit, Matthew J Wakefield, Jeremy Widmann, Shandy Wikman, Stephanie Wilson, Hua Ying and Gavin A Huttley: PyCogent: a toolkit for making sense from sequence. (PubMed,eprint) Genome Biology 8(8):R171 (2007)
Registry entries: Bio.tools  Bioconda 
python3-cooler
library for a sparse, compressed, binary persistent storage
Versions of package python3-cooler
ReleaseVersionArchitectures
trixie0.10.2-1amd64,arm64,mips64el,ppc64el,riscv64
sid0.10.2-1amd64,arm64,mips64el,ppc64el,riscv64
bookworm0.9.1-1amd64,arm64,mips64el,ppc64el
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Cooler is a support library for a sparse, compressed, binary persistent storage format, also called cooler, used to store genomic interaction data, such as Hi-C contact matrices.

The cooler file format is an implementation of a genomic matrix data model using HDF5 as the container format. The cooler package includes a suite of command line tools and a Python API to facilitate creating, querying and manipulating cooler files.

The package is enhanced by the following packages: python3-cooler-examples
Please cite: Nezar Abdennur and Leonid A Mirny: Cooler: scalable storage for Hi-C data and other genomically labeled arrays. (PubMed) Bioinformatics 36(1):311–316 (2019)
Registry entries: Bioconda 
python3-cyvcf2
VCF parser based on htslib (Python 3)
Versions of package python3-cyvcf2
ReleaseVersionArchitectures
bullseye0.30.4-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.31.1-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.30.18-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster0.10.4-1amd64,arm64,armhf,i386
sid0.31.1-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

This module allows fast parsing of VCF and BCF files, including region queries, with Python. This is essential for efficient analyses of nucleotide variation with Python on high-throughput sequencing data.

cyvcf2 is a cython wrapper around htslib. Attributes like variant.gt_ref_depths return a numpy array directly so they are immediately ready for downstream use.

This package installs the library for Python 3.

Please cite: Brent S. Pedersen and Aaron R. Quinlan: cyvcf2: fast, flexible variant analysis with Python. (eprint) Bioinformatics 33(12):1867–1869 (2017)
Registry entries: Bio.tools  Bioconda 
python3-depinfo
retrieve and print Python 3 package dependencies
Versions of package python3-depinfo
ReleaseVersionArchitectures
bookworm2.2.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.4.0-1amd64,arm64,armhf,i386
trixie2.2.0-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.2.0-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.6.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream2.2.0rc3
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

This is a utility Python package intended for other library packages. It provides a function that, when called with your package name, prints platform and dependency information.

python3-drmaa
interface to DRMAA-compliant distributed resource management systems
Versions of package python3-drmaa
ReleaseVersionArchitectures
sid0.7.9-3all
bookworm0.7.9-3all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

This is a Python implementation of the Distributed Resource Management (DRM) Application API (DRMAA). It provides all high-level functionality necessary to consign a job to a DRM system (e.g. Sun Grid Engine), including common operations on jobs, such as termination or suspension.

python3-etelemetry
lightweight Python3 client to communicate with the etelemetry server
Versions of package python3-etelemetry
ReleaseVersionArchitectures
bullseye0.2.0-4all
bookworm0.3.0-3all
trixie0.3.1-1all
sid0.3.1-1all
Popcon: 10 users (9 upd.)*
Versions and Archs
License: DFSG free
Git

This Python3 package provides a lightweight Python3 client interface to communicate with the etelemetry server. It can be used for nipy or nipype.

python3-gffutils
Work with GFF and GTF files in a flexible database framework
Versions of package python3-gffutils
ReleaseVersionArchitectures
trixie0.13-1all
sid0.13-1all
buster0.9-1all
bullseye0.10.1-2all
bookworm0.11.1-3all
Popcon: 2 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

A Python package for working with and manipulating the GFF and GTF format files typically used for genomic annotations. Files are loaded into a sqlite3 database, allowing much more complex manipulation of hierarchical features (e.g., genes, transcripts, and exons) than is possible with plain-text methods alone.
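To illustrate why a database helps with hierarchical features, here is a minimal sketch using only the standard sqlite3 module. It does not use gffutils or reproduce its schema: the table layout, the helper names load_gff and children, and the sample GFF3 lines are all assumptions made for this example.

```python
import sqlite3

# Hypothetical GFF3 sample: one gene, one transcript, two exons.
GFF_TEXT = (
    "chr1\tdemo\tgene\t100\t900\t.\t+\t.\tID=gene1\n"
    "chr1\tdemo\tmRNA\t100\t900\t.\t+\t.\tID=tx1;Parent=gene1\n"
    "chr1\tdemo\texon\t100\t300\t.\t+\t.\tID=exon1;Parent=tx1\n"
    "chr1\tdemo\texon\t600\t900\t.\t+\t.\tID=exon2;Parent=tx1\n"
)


def load_gff(text):
    """Load GFF3 text into an in-memory SQLite database (illustrative schema)."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE features (id TEXT PRIMARY KEY, seqid TEXT, "
        "featuretype TEXT, start_pos INTEGER, end_pos INTEGER, strand TEXT)"
    )
    db.execute("CREATE TABLE relations (parent TEXT, child TEXT)")
    for line in text.splitlines():
        if not line or line.startswith("#"):
            continue
        cols = line.split("\t")
        attrs = dict(f.split("=", 1) for f in cols[8].split(";") if "=" in f)
        db.execute(
            "INSERT INTO features VALUES (?, ?, ?, ?, ?, ?)",
            (attrs["ID"], cols[0], cols[2], int(cols[3]), int(cols[4]), cols[6]),
        )
        for parent in filter(None, attrs.get("Parent", "").split(",")):
            db.execute("INSERT INTO relations VALUES (?, ?)", (parent, attrs["ID"]))
    return db


def children(db, parent_id, featuretype):
    """Return all descendants of parent_id with the given feature type."""
    query = """
        WITH RECURSIVE descendants(id) AS (
            SELECT child FROM relations WHERE parent = ?
            UNION
            SELECT r.child FROM relations r JOIN descendants d ON r.parent = d.id
        )
        SELECT f.id, f.start_pos, f.end_pos
        FROM features f JOIN descendants d ON f.id = d.id
        WHERE f.featuretype = ?
        ORDER BY f.start_pos
    """
    return db.execute(query, (parent_id, featuretype)).fetchall()


gff_db = load_gff(GFF_TEXT)
gene_exons = children(gff_db, "gene1", "exon")
```

The recursive query walks from the gene through its transcript to the exons in one step, which would take several passes over a plain-text file.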

Registry entries: Bio.tools  Bioconda 
python3-htseq
Python3 high-throughput genome sequencing read analysis utilities
Versions of package python3-htseq
ReleaseVersionArchitectures
bookworm1.99.2-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie2.0.5-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
sid2.0.5-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bullseye0.13.5-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
buster0.11.2-1amd64,arm64
Popcon: 21 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

HTSeq can be used to perform a number of common analysis tasks when working with high-throughput genome sequencing reads:

  • Getting statistical summaries about the base-call quality scores to study the data quality.
  • Calculating a coverage vector and exporting it for visualization in a genome browser.
  • Reading in annotation data from a GFF file.
  • Assigning aligned reads from RNA-Seq experiments to exons and genes.
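As an illustration of the coverage-vector task, the following standard-library sketch counts how many reads cover each base. It does not use HTSeq; the function name coverage_vector is hypothetical, and reads are assumed to be half-open (start, end) intervals on a single sequence.

```python
def coverage_vector(length, reads):
    """Per-base read depth for a sequence of the given length.

    `reads` is an iterable of half-open (start, end) alignments.
    """
    # Difference array: +1 where a read starts, -1 where it ends.
    diff = [0] * (length + 1)
    for start, end in reads:
        start = max(start, 0)
        end = min(end, length)
        if start < end:
            diff[start] += 1
            diff[end] -= 1

    # A running sum of the differences gives the depth at each position.
    coverage = []
    depth = 0
    for delta in diff[:length]:
        depth += delta
        coverage.append(depth)
    return coverage


depths = coverage_vector(10, [(0, 4), (2, 6), (5, 12)])
```

The difference-array approach touches each read only twice, regardless of read length, before a single pass produces the final vector.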
Please cite: Simon Anders, Paul Theodor Pyl and Wolfgang Huber: HTSeq—a Python framework to work with high-throughput sequencing data. (PubMed,eprint) Bioinformatics 31(2):166-169 (2015)
Registry entries: Bio.tools  SciCrunch  Bioconda 
python3-nanoget
extract information from Oxford Nanopore sequencing data and alignments
Versions of package python3-nanoget
ReleaseVersionArchitectures
trixie1.19.3-1all
bookworm1.16.1-2all
bullseye1.12.2-4all
sid1.19.3-1all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The Python3 module nanoget provides functions to extract useful metrics from Oxford Nanopore sequencing reads and alignments.

Data can be presented in the following formats, using the following functions:

  • sorted bam file: process_bam(bamfile, threads)
  • standard fastq file: process_fastq_plain(fastqfile, 'threads')
  • fastq file with metadata from MinKNOW or Albacore: process_fastq_rich(fastqfile)
  • sequencing_summary file generated by Albacore: process_summary(sequencing_summary.txt, 'readtype')

Fastq files can be compressed using gzip, bzip2 or bgzip. The data is returned as a pandas DataFrame with standardized header names for convenient extraction. The functions log their progress while extracting data.

The package is enhanced by the following packages: python3-nanoget-examples
python3-nanomath
simple math functions for other Oxford Nanopore processing scripts
Versions of package python3-nanomath
ReleaseVersionArchitectures
bullseye1.2.0+ds-1all
bookworm1.2.1+ds-1all
trixie1.2.1+ds-1all
sid1.2.1+ds-1all
upstream1.4.0
Popcon: 0 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

This Python3 module provides a few simple math and statistics functions for other scripts processing Oxford Nanopore sequencing data.

  • Calculate read N50 from a set of lengths: get_N50(readlengths)
  • Remove extreme length outliers from a dataset: remove_length_outliers(dataframe, columnname)
  • Calculate the average Phred quality of a read: ave_qual(qualscores)
  • Write out the statistics report after calling readstats function: write_stats(dataframe, outputname)
  • Compute a number of statistics, return a dictionary: calc_read_stats(dataframe)
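Two of the statistics listed above can be sketched in plain Python. These are not the nanomath implementations: the function names n50 and mean_phred are hypothetical, and the N50 here follows the common textbook definition, which may differ from the library in edge cases.

```python
import math


def n50(lengths):
    """Length L such that reads of length >= L hold at least half of all bases."""
    if not lengths:
        raise ValueError("need at least one read length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


def mean_phred(quals):
    """Average Phred quality, computed via the mean error probability.

    Phred scores are logarithmic, so they are converted to error
    probabilities, averaged, and converted back.
    """
    probs = [10 ** (-q / 10) for q in quals]
    return -10 * math.log10(sum(probs) / len(probs))


read_n50 = n50([2, 3, 4, 5, 6])
read_quality = mean_phred([10, 30])
```

Averaging the raw scores of a Q10 and a Q30 base would give 20, but the mean error probability corresponds to roughly Q13, which is why the conversion step matters.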
python3-pairix
1D/2D indexing and querying with a pair of genomic coordinates
Versions of package python3-pairix
ReleaseVersionArchitectures
sid0.3.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bullseye0.3.7-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bookworm0.3.7-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie0.3.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Pairix is a tool for indexing and querying on a block-compressed text file containing pairs of genomic coordinates.

Pairix is a stand-alone C program that was written on top of tabix as a tool for the 4DN-standard pairs file format describing Hi-C data: pairs_format_specification.md

However, Pairix can be used as a generic tool for indexing and querying any bgzipped text file containing genomic coordinates, for either 2D- or 1D- indexing and querying.

For example, suppose you want to extract specific lines from a Pairs file. An awk command would read the Pairs file from beginning to end. Pairix instead creates an index and uses it to access the file from a relevant position by taking advantage of bgzf compression, allowing fast queries on large files.

The package is enhanced by the following packages: python-pairix-examples
Registry entries: Bioconda 
python3-pairtools
Framework to process sequencing data from a Hi-C experiment
Versions of package python3-pairtools
ReleaseVersionArchitectures
bullseye0.3.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.0.3-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
sid1.0.3-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm1.0.2-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
upstream1.1.0
Popcon: 1 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

Simple and fast command-line framework to process sequencing data from a Hi-C experiment.

Process paired-end sequence alignments and perform the following operations:

  • Detect ligation junctions (a.k.a. Hi-C pairs) in aligned paired-end sequences of Hi-C DNA molecules
  • Sort .pairs files for downstream analyses
  • Detect, tag and remove PCR/optical duplicates
  • Generate extensive statistics of Hi-C datasets
  • Select Hi-C pairs given flexibly defined criteria
  • Restore .sam alignments from Hi-C pairs
The package is enhanced by the following packages: python3-pairtools-examples
Registry entries: Bioconda 
python3-pauvre
QC and genome browser plotting Oxford Nanopore and PacBio long reads
Versions of package python3-pauvre
ReleaseVersionArchitectures
bookworm0.2.3-2all
trixie0.2.3-4all
sid0.2.3-4all
bullseye0.2.2-2all
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Pauvre is a plotting package designed for nanopore and PacBio long reads.

This package currently hosts four scripts for plotting and/or printing stats.

 pauvre marginplot
    Takes a fastq file as input and outputs a marginal histogram with a
    heatmap.
 pauvre stats
    Takes a fastq file as input and prints out a table of stats, including
    how many basepairs/reads there are for a length/mean quality cutoff.
    This is also called automatically when using pauvre marginplot.
 pauvre redwood
    Method of representing circular genomes. A redwood plot contains long
    reads as "rings" on the inside, a gene annotation "cambium/phloem",
    and an RNAseq "bark". The input is .bam files for the long reads and
    RNAseq data, and a .gff file for the annotation.
 pauvre synteny
    Makes a synteny plot of circular genomes. Finds the most parsimonious
    rotation to display the synteny of all the input genomes with the
    fewest crossings-over. Input is one .gff file per circular genome
    and one directory of gene alignments.
Registry entries: Bioconda 
python3-pbcommand
common command-line interface for Pacific Biosciences analysis modules
Versions of package python3-pbcommand
ReleaseVersionArchitectures
trixie2.1.1+git20220616.3f2e6c2-3all
bookworm2.1.1+git20220616.3f2e6c2-2all
bullseye2.1.1+git20201023.cc0ed3d-1all
sid2.1.1+git20220616.3f2e6c2-3all
upstream2.1.1+git20231020.28d1635
Popcon: 0 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

To integrate with the pbsmrtpipe workflow engine, one must be able to generate a Tool Contract and to run from a Resolved Tool Contract. A Tool Contract contains the metadata of the executable, such as the file types of its inputs, outputs and options. There are two principal use cases: first, wrapping or calling Python functions that have been defined in external Python 3 packages or scripts; second, creating a CLI tool that supports emitting tool contracts, running resolved tool contracts, and a complete argparse-style CLI.

Registry entries: Bioconda 
python3-pbcore
Python 3 library for processing PacBio data files
Versions of package python3-pbcore
ReleaseVersionArchitectures
sid2.1.2+dfsg-9all
bookworm2.1.2+dfsg-5all
trixie2.1.2+dfsg-8all
bullseye1.7.1+git20200430.a127b1e+dfsg-1all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The pbcore package provides Python modules for processing Pacific Biosciences data files and building PacBio bioinformatics applications. These modules include tools to read/write PacBio data formats, sample data files for testing and debugging, base classes, and utilities for building bioinformatics applications.

This package is part of the SMRTAnalysis suite.

This is the Python 3 module.

Registry entries: Bioconda 
python3-pyani
Python3 module for average nucleotide identity analyses
Versions of package python3-pyani
ReleaseVersionArchitectures
sid0.2.12-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.2.10-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.2.12-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.2.12-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream0.2.13.1
Popcon: 1 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

Pyani is a Python3 module and script that provides support for calculating average nucleotide identity (ANI) and related measures for whole genome comparisons, and rendering relevant graphical summary output. Where available, it takes advantage of multicore systems, and can integrate with SGE/OGE-type job schedulers for the sequence comparisons.

Please cite: Leighton Pritchard, Rachel H. Glover, Sonia Humphris, John G. Elphinstone and Ian K. Toth: Genomics and taxonomy in diagnostics for food security: soft-rotting enterobacterial plant pathogens. (eprint) Anal. Methods 8(1):12-24 (2016)
Registry entries: Bioconda 
python3-pychopper
identify, orient and trim full-length Nanopore cDNA reads
Versions of package python3-pychopper
ReleaseVersionArchitectures
sid2.7.10-1all
bullseye2.5.0-1all
bookworm2.7.2-1all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Pychopper v2 is a Python module to identify, orient and trim full-length Nanopore cDNA reads. It is also able to rescue fused reads and provides the script 'pychopper.py'. The general approach of Pychopper v2 is the following:

  • Pychopper first identifies alignment hits of the primers across the length of the sequence. The default method for doing this is using nhmmscan with the pre-trained strand specific profile HMMs, included with the package. Alternatively, one can use the edlib backend, which uses a combination of global and local alignment to identify the primers within the read.
  • After identifying the primer hits by either of the backends, the reads are divided into segments defined by two consecutive primer hits. The score of a segment is its length if the configuration of the flanking primer hits is valid (such as SSP,-VNP for forward reads) or zero otherwise.
  • The segments are assigned to rescued reads using a dynamic programming algorithm maximizing the sum of used segment scores (hence the amount of rescued bases). A crucial observation about the algorithm is that if a segment is included as a rescued read, then the next segment must be excluded, as one of the primer hits defining it was "used up" by the previous segment. This puts constraints on the dynamic programming graph. For example, the optimal path for rescuing two fused reads spanning three segments has a total score of l1 + l3.
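The constraint described above (a chosen segment excludes its immediate successor) can be sketched as a small dynamic program. This is an illustration written for this page, not Pychopper's own code; the function name rescue_segments is hypothetical, and segment scores are assumed to be given as a list in read order.

```python
def rescue_segments(scores):
    """Pick non-adjacent segments with the maximum total score.

    Returns (best_total, chosen_indices).
    """
    n = len(scores)
    # best[i] is the optimum using only the first i segments.
    best = [0] * (n + 1)
    for i in range(1, n + 1):
        take = scores[i - 1] + (best[i - 2] if i >= 2 else 0)
        best[i] = max(best[i - 1], take)

    # Walk backwards to recover which segments were used.
    chosen = []
    i = n
    while i >= 1:
        take = scores[i - 1] + (best[i - 2] if i >= 2 else 0)
        if take > best[i - 1]:
            chosen.append(i - 1)
            i -= 2   # the neighbouring segment shares a primer hit, skip it
        else:
            i -= 1
    chosen.reverse()
    return best[n], chosen


# Three segments with lengths l1=100, l2=50, l3=80 as scores.
total, used = rescue_segments([100, 50, 80])
```

With these scores the first and third segments are kept, giving a total of l1 + l3 = 180.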

A crucial parameter of Pychopper v2 is -q, which determines the stringency of primer alignment (E-value in the case of the pHMM backend). This can be explicitly specified by the user; by default, however, it is optimized on a random sample of input reads to produce the maximum number of classified reads.

Registry entries: Bioconda 
python3-pydicom
DICOM medical file reading and writing (Python 3)
Versions of package python3-pydicom
ReleaseVersionArchitectures
buster1.2.1-1all
bullseye2.0.0-1all
sid2.4.3-1all
trixie2.4.3-1all
bookworm2.3.1-1all
upstream3.0.1
Popcon: 36 users (19 upd.)*
Newer upstream!
License: DFSG free
Git

pydicom is a pure Python module for parsing DICOM files. DICOM is a standard (http://medical.nema.org) for communicating medical images and related information such as reports and radiotherapy objects.

pydicom makes it easy to read DICOM files into natural pythonic structures for easy manipulation. Modified datasets can be written again to DICOM format files.

This package installs the module for Python 3.

python3-pyfaidx
efficient random access to fasta subsequences for Python 3
Versions of package python3-pyfaidx
ReleaseVersionArchitectures
stretch0.4.8.1-1all
sid0.8.1.3-1all
trixie0.8.1.3-1all
bookworm0.7.1-2all
bullseye0.5.9.2-1all
buster0.5.5.2-1all
Popcon: 5 users (18 upd.)*
Versions and Archs
License: DFSG free
Git

Samtools provides a function "faidx" (FAsta InDeX), which creates a small flat index file ".fai" allowing for fast random access to any subsequence in the indexed FASTA file, while loading a minimal amount of the file into memory. This Python module implements pure Python classes for indexing, retrieval, and in-place modification of FASTA files using a samtools compatible index. The pyfaidx module is API compatible with the pygr seqdb module. A command-line script "faidx" is installed alongside the pyfaidx module, and facilitates complex manipulation of FASTA files without any programming knowledge.

This package provides the Python 3 modules to access fasta files.
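How a flat index enables random access can be shown with a short standard-library sketch. It does not use pyfaidx: the helpers build_index and fetch are hypothetical, the index is kept in a dictionary rather than written to a ".fai" file, and every sequence is assumed to use a uniform line length (as faidx-style indexing requires).

```python
import os
import tempfile


def build_index(path):
    """Record length, byte offset and line geometry for each FASTA record."""
    index = {}
    name = None
    offset = 0
    with open(path, "rb") as handle:
        for line in handle:
            if line.startswith(b">"):
                name = line[1:].split()[0].decode()
                index[name] = {"length": 0, "offset": offset + len(line),
                               "linebases": 0, "linewidth": 0}
            elif name is not None and line.strip():
                record = index[name]
                bases = len(line.rstrip(b"\r\n"))
                if record["linebases"] == 0:
                    record["linebases"] = bases     # bases per line
                    record["linewidth"] = len(line)  # bytes per line incl. newline
                record["length"] += bases
            offset += len(line)
    return index


def fetch(path, index, name, start, end):
    """Return bases [start, end) (0-based, half-open) by seeking into the file."""
    record = index[name]
    end = min(end, record["length"])
    if start >= end:
        return ""

    def byte_pos(i):
        full_lines, within = divmod(i, record["linebases"])
        return record["offset"] + full_lines * record["linewidth"] + within

    with open(path, "rb") as handle:
        handle.seek(byte_pos(start))
        raw = handle.read(byte_pos(end) - byte_pos(start))
    return raw.replace(b"\n", b"").replace(b"\r", b"").decode()


# Demonstration on a throwaway FASTA file.
with tempfile.NamedTemporaryFile("wb", suffix=".fa", delete=False) as tmp:
    tmp.write(b">chr1 demo\nACGTACGTAC\nGGGTTTAAAC\nCC\n>chr2\nTTTT\n")
fasta_index = build_index(tmp.name)
subseq = fetch(tmp.name, fasta_index, "chr1", 8, 13)
tail = fetch(tmp.name, fasta_index, "chr2", 1, 3)
os.unlink(tmp.name)
```

Only the requested bytes are read from disk, so the cost of a lookup does not depend on the size of the FASTA file.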

Please cite: Matthew D. Shirley, Zhaorong Ma, Brent S. Pedersen and Sarah J. Wheelan: Efficient "pythonic" access to FASTA files using pyfaidx. PeerJ PrePrints 3:e1196 (2015)
Registry entries: Bio.tools 
python3-pynn
simulator-independent specification of neuronal network models
Versions of package python3-pynn
ReleaseVersionArchitectures
bookworm0.10.1-2all
bullseye0.9.6-1all
sid0.10.1-3all
upstream0.12.3
Popcon: 0 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

PyNN allows a model to be coded once and run without modification on any simulator that PyNN supports (currently NEURON, NEST, PCSIM and Brian). PyNN translates standard cell-model names and parameter names into simulator-specific names.

python3-pysam
interface for the SAM/BAM sequence alignment and mapping format (Python 3)
Versions of package python3-pysam
ReleaseVersionArchitectures
bookworm0.20.0+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
buster0.15.2+ds-2amd64,arm64
stretch0.10.0+ds-2amd64,arm64,mips64el,ppc64el
sid0.22.1+ds-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
stretch-backports0.14+ds-2~bpo9+1amd64,arm64,mips64el,ppc64el
trixie0.22.1+ds-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bullseye0.15.4+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
Popcon: 30 users (15 upd.)*
Versions and Archs
License: DFSG free
Git

Pysam is a Python module for reading and manipulating Samfiles. It's a lightweight wrapper of the samtools C-API. Pysam also includes an interface for tabix.

This package installs the module for Python 3.

The package is enhanced by the following packages: python-pysam-tests
Registry entries: Bio.tools  Bioconda 
python3-questplus
QUEST+ implementation in Python3
Versions of package python3-questplus
ReleaseVersionArchitectures
bullseye2019.4-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2023.1-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2023.1-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2019.4-6amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

QUEST+ is a Bayesian adaptive psychometric testing method that allows an arbitrary number of stimulus dimensions, psychometric function parameters, and trial outcomes.

This package provides an implementation in Python3.

python3-scitrack
Python3 library to track scientific data
Versions of package python3-scitrack
ReleaseVersionArchitectures
bullseye2020.6.5-1all
bookworm2021.5.3-3all
sid2024.10.8-1all
trixie2024.10.8-1all
Popcon: 1 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

Scitrack is a library aimed at application developers writing scientific software to support tracking of scientific computation. The library provides elementary functionality to support logging. The primary capabilities concern generating checksums on input and output files and facilitating logging of the computational environment.
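The two capabilities mentioned above, file checksums and a record of the computational environment, can be sketched with the standard library alone. This is not the scitrack API: the helper names are hypothetical, and the use of SHA-256 is an assumption made for this example.

```python
import hashlib
import os
import platform
import sys
import tempfile


def file_sha256(path, chunk_size=65536):
    """Checksum a file in chunks so that large inputs need little memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def environment_record():
    """Collect basic facts about the interpreter and platform for a log."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


# Demonstration on a throwaway input file.
with tempfile.NamedTemporaryFile("wb", delete=False) as tmp:
    tmp.write(b"hello")
input_digest = file_sha256(tmp.name)
run_environment = environment_record()
os.unlink(tmp.name)
```

Logging the digest of every input and output next to the environment record makes it possible to check later whether a result was produced from the same data.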

python3-screed
short nucleotide read sequence utils in Python 3
Versions of package python3-screed
ReleaseVersionArchitectures
stretch0.9-2all
sid1.1.3-1all
bookworm1.0.5-4all
bullseye1.0.5-1all
trixie1.1.3-1all
buster1.0-3all
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Screed parses FASTA and FASTQ files, generates databases, and lets you query these databases. Values such as sequence name, sequence description, sequence quality, and the sequence itself can be retrieved from these databases.

Registry entries: Bioconda 
python3-seirsplus
Models of SEIRS epidemic dynamics with extensions
Versions of package python3-seirsplus
ReleaseVersionArchitectures
trixie1.0.9-2all
bookworm1.0.9-1all
bullseye0.1.4+git20200528.5c04080+ds-2all
sid1.0.9-2all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

This package implements generalized SEIRS infectious disease dynamics models with extensions that model the effect of factors including population structure, social distancing, testing, contact tracing, and quarantining detected cases.

Notably, this package includes stochastic implementations of these models on dynamic networks.
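For orientation, the basic deterministic SEIRS dynamics can be sketched in a few lines of plain Python. This is not the seirsplus API and omits its network, testing and quarantine extensions; the function name, the parameter names and the forward Euler integration scheme are all assumptions made for this example.

```python
def simulate_seirs(beta, sigma, gamma, xi, population, initial_infected,
                   days, dt=0.1):
    """Integrate a basic SEIRS model with forward Euler steps.

    beta: transmission rate, sigma: rate of progression from exposed to
    infectious, gamma: recovery rate, xi: rate of loss of immunity.
    Returns a list of (time, S, E, I, R) tuples.
    """
    s = float(population - initial_infected)
    e = 0.0
    i = float(initial_infected)
    r = 0.0
    history = [(0.0, s, e, i, r)]
    steps = int(round(days / dt))
    for step in range(1, steps + 1):
        new_exposed = beta * s * i / population * dt
        new_infectious = sigma * e * dt
        new_recovered = gamma * i * dt
        new_susceptible = xi * r * dt   # recovered individuals losing immunity
        s += new_susceptible - new_exposed
        e += new_exposed - new_infectious
        i += new_infectious - new_recovered
        r += new_recovered - new_susceptible
        history.append((step * dt, s, e, i, r))
    return history


epidemic = simulate_seirs(beta=0.5, sigma=0.2, gamma=0.1, xi=0.0,
                          population=1000, initial_infected=1, days=100)
peak_infectious = max(row[3] for row in epidemic)
```

Setting xi to zero reduces the model to SEIR; a positive xi returns recovered individuals to the susceptible pool, which is what the final "S" in SEIRS refers to.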

python3-streamz
build pipelines to manage continuous streams of data
Versions of package python3-streamz
ReleaseVersionArchitectures
bookworm0.6.4-1all
sid0.6.4-2all
bullseye0.6.2-1all
trixie0.6.4-2all
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

Streamz is simple to use in simple cases, but it also supports complex pipelines that involve branching, joining, flow control, feedback, back pressure, and so on. Optionally, Streamz can also work with both Pandas and cuDF dataframes, to provide sensible streaming operations on continuous tabular data.
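The idea of a push-based pipeline with branching can be sketched with a tiny stand-in class. This is not the Streamz implementation: MiniStream and its helper classes are hypothetical, and features such as back pressure, joining and asynchronous execution are left out.

```python
class MiniStream:
    """Hypothetical miniature of a push-based stream node."""

    def __init__(self):
        self._downstreams = []

    def emit(self, item):
        """Push an item to every node attached downstream."""
        for node in self._downstreams:
            node._receive(item)

    def _receive(self, item):
        self.emit(item)

    def _attach(self, node):
        self._downstreams.append(node)
        return node

    def map(self, func):
        return self._attach(_Map(func))

    def filter(self, predicate):
        return self._attach(_Filter(predicate))

    def sink(self, func):
        return self._attach(_Sink(func))


class _Map(MiniStream):
    def __init__(self, func):
        super().__init__()
        self._func = func

    def _receive(self, item):
        self.emit(self._func(item))


class _Filter(MiniStream):
    def __init__(self, predicate):
        super().__init__()
        self._predicate = predicate

    def _receive(self, item):
        if self._predicate(item):
            self.emit(item)


class _Sink(MiniStream):
    def __init__(self, func):
        super().__init__()
        self._func = func

    def _receive(self, item):
        self._func(item)


source = MiniStream()
even_squares = []
everything = []
# Two branches hang off the same source.
source.map(lambda x: x * x).filter(lambda x: x % 2 == 0).sink(even_squares.append)
source.sink(everything.append)
for number in range(6):
    source.emit(number)
```

Each emitted value flows through both branches immediately, which is the essential difference from building a list first and processing it afterwards.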

python3-tinyalign
numerical representation of differences between strings
Versions of package python3-tinyalign
ReleaseVersionArchitectures
bullseye0.2-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.2.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.2.2-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.2.2-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

A small Python module providing edit distance (aka Levenshtein distance, that is, counting insertions, deletions and substitutions) and Hamming distance computation.

Its main purpose is to speed up computation of edit distance by allowing a maximum number of differences, maxdiff, to be specified (banding). If that parameter is provided, the returned edit distance is only accurate up to maxdiff. That is, if the actual edit distance is higher than maxdiff, a value larger than maxdiff is returned, but not necessarily the actual edit distance.
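The maxdiff behaviour can be illustrated with a pure-Python sketch. It is not tinyalign's implementation: the function names are chosen for this example, and instead of a true banded matrix it uses the simpler trick of stopping as soon as every cell in a row exceeds maxdiff.

```python
def edit_distance(s, t, maxdiff=None):
    """Levenshtein distance; with maxdiff, only accurate up to maxdiff."""
    if maxdiff is not None and abs(len(s) - len(t)) > maxdiff:
        return maxdiff + 1
    previous = list(range(len(t) + 1))
    for i, a in enumerate(s, start=1):
        current = [i]
        for j, b in enumerate(t, start=1):
            cost = 0 if a == b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # match or substitution
        # Values never decrease along an alignment path, so once a whole
        # row exceeds maxdiff the final distance must exceed it as well.
        if maxdiff is not None and min(current) > maxdiff:
            return maxdiff + 1
        previous = current
    return previous[-1]


def hamming_distance(s, t):
    """Number of mismatching positions between equal-length strings."""
    if len(s) != len(t):
        raise ValueError("strings must have the same length")
    return sum(a != b for a, b in zip(s, t))


exact = edit_distance("kitten", "sitting")
capped = edit_distance("kitten", "sitting", maxdiff=1)
```

With maxdiff=1 the result is only guaranteed to be larger than 1, which is enough when the caller merely needs to reject poor matches quickly.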

Registry entries: Bioconda 
python3-toolz
List processing tools and functional utilities
Versions of package python3-toolz
ReleaseVersionArchitectures
stretch0.8.2-1all
sid1.0.0-1all
bullseye0.9.0-1.1all
trixie1.0.0-1all
bookworm0.12.0-1all
buster0.9.0-1all
Popcon: 161 users (115 upd.)*
Versions and Archs
License: DFSG free
Git

A set of utility functions for iterators, functions, and dictionaries. These functions interoperate well and form the building blocks of common data analytic operations. They extend the standard libraries itertools and functools and borrow heavily from the standard libraries of contemporary functional languages.
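The style of composition described above can be sketched with the standard library. The three helpers below share their names with toolz functions (pipe, frequencies, groupby), but they are simplified re-implementations written for this example, not the library's code.

```python
from collections import defaultdict
from functools import reduce


def pipe(value, *funcs):
    """Thread a value through a sequence of single-argument functions."""
    return reduce(lambda acc, func: func(acc), funcs, value)


def frequencies(items):
    """Count how often each item occurs."""
    counts = defaultdict(int)
    for item in items:
        counts[item] += 1
    return dict(counts)


def groupby(key, items):
    """Group items into lists keyed by key(item)."""
    groups = defaultdict(list)
    for item in items:
        groups[key(item)].append(item)
    return dict(groups)


base_counts = pipe("ACGTAC", frequencies)
by_length = groupby(len, ["AT", "GC", "ACG"])
```

Because each helper takes plain iterables and returns plain dictionaries, they can be chained freely with functions from itertools and functools.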

This package contains the Python 3 version.

python3-torch
Tensors and Dynamic neural networks in Python (Python Interface)
Versions of package python3-torch
ReleaseVersionArchitectures
bullseye1.7.1-7amd64,arm64,armhf,ppc64el,s390x
sid2.5.0+dfsg-1amd64,arm64,ppc64el,riscv64,s390x
bookworm1.13.1+dfsg-4amd64,arm64,ppc64el,s390x
upstream2.5.1
Popcon: 137 users (30 upd.)*
Newer upstream!
License: DFSG free
Git

PyTorch is a Python package that provides two high-level features:

(1) Tensor computation (like NumPy) with strong GPU acceleration
(2) Deep neural networks built on a tape-based autograd system

You can reuse your favorite Python packages such as NumPy, SciPy and Cython to extend PyTorch when needed.

This is the CPU-only version of PyTorch (Python interface).

Please cite: Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai and Soumith Chintala:
Registry entries: SciCrunch 
python3-tornado
scalable, non-blocking web server and tools - Python 3 package
Versions of package python3-tornado
ReleaseVersionArchitectures
bookworm6.2.0-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie3.2.2-1.1amd64,armel,armhf,i386
buster5.1.1-4amd64,arm64,armhf,i386
trixie6.4.1-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye6.1.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch4.4.3-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid6.4.1-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 6602 users (5432 upd.)*
Versions and Archs
License: DFSG free
Git

Tornado is a Python web framework and asynchronous networking library, originally developed at FriendFeed. By using non-blocking network I/O, Tornado can scale to tens of thousands of open connections, making it ideal for long polling, WebSockets, and other applications that require a long-lived connection to each user.

This is the Python 3 version of the package.

python3-treetime
inference of time stamped phylogenies and ancestral reconstruction (Python 3)
Versions of package python3-treetime
ReleaseVersionArchitectures
bullseye0.8.1-1all
sid0.11.4-1all
trixie0.11.4-1all
buster0.5.3-1all
bookworm0.9.4-1all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

TreeTime provides routines for ancestral sequence reconstruction and the maximum likelihood inference of molecular-clock phylogenies, i.e., a tree where all branches are scaled such that the locations of terminal nodes correspond to their sampling times and internal nodes are placed at the most likely time of divergence.

TreeTime aims at striking a compromise between sophisticated probabilistic models of evolution and fast heuristics. It implements GTR models of ancestral inference and branch length optimization, but takes the tree topology as given. To optimize the likelihood of time-scaled phylogenies, TreeTime uses an iterative approach that first infers ancestral sequences given the branch lengths of the tree, then optimizes the positions of unconstrained nodes on the time axis, and then repeats this cycle. The only topology optimization is the (optional) resolution of polytomies in a way that is most (approximately) consistent with the sampling time constraints on the tree. The package is designed to be used as a stand-alone tool or as a library used in larger phylogenetic analysis workflows.

Features

  • ancestral sequence reconstruction (marginal and joint maximum likelihood)
  • molecular clock tree inference (marginal and joint maximum likelihood)
  • inference of GTR models
  • rerooting to obtain best root-to-tip regression
  • auto-correlated relaxed molecular clock (with normal prior)
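
The root-to-tip regression mentioned in the feature list can be sketched as an ordinary least-squares fit of root-to-tip distance against sampling date. This is only an illustration for one fixed root, not TreeTime's code or API: the function name and the input layout (two parallel lists) are assumptions, and the search over candidate roots is left out.

```python
def root_to_tip_regression(sampling_dates, root_to_tip_distances):
    """Least-squares line through (date, distance) points.

    Returns (rate, intercept): the slope estimates the clock rate in
    substitutions per site per time unit, and -intercept / rate
    estimates the date of the root.
    """
    n = len(sampling_dates)
    if n < 2 or n != len(root_to_tip_distances):
        raise ValueError("need two equally long lists with at least 2 tips")
    mean_x = sum(sampling_dates) / n
    mean_y = sum(root_to_tip_distances) / n
    sxx = sum((x - mean_x) ** 2 for x in sampling_dates)
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(sampling_dates, root_to_tip_distances))
    if sxx == 0:
        raise ValueError("all tips share one sampling date; no clock signal")
    rate = sxy / sxx
    intercept = mean_y - rate * mean_x
    return rate, intercept


# Hypothetical tips sampled in 2000, 2005 and 2010.
rate, intercept = root_to_tip_regression(
    [2000.0, 2005.0, 2010.0], [0.010, 0.015, 0.020])
root_date = -intercept / rate
```

A rerooting procedure would repeat such a fit for candidate root positions and keep the one with the best fit.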

This package provides the Python 3 module.

Registry entries: Bioconda 
python3-vcf
Variant Call Format (VCF) parser for Python 3
Versions of package python3-vcf
ReleaseVersionArchitectures
bullseye0.6.8+git20170215.476169c-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid0.6.8+git20170215.476169c-10amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
trixie0.6.8+git20170215.476169c-10amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm0.6.8+git20170215.476169c-9amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
Popcon: 4 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

The Variant Call Format (VCF) specifies the format of a text file used in bioinformatics for storing gene sequence variations. The format has been developed with the advent of large-scale genotyping and DNA sequencing projects, such as the 1000 Genomes Project.

The intent of this module is to mimic the csv module in the Python stdlib, as opposed to more flexible serialization formats like JSON or YAML. vcf will attempt to parse the content of each record based on the data types specified in the meta-information lines -- specifically the ##INFO and ##FORMAT lines. If these lines are missing or incomplete, it will check against the reserved types mentioned in the spec. Failing that, it will just return strings.
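
The type-driven parsing described above can be sketched in plain Python. This is not the package's API: `read_info_types` and `parse_info` are hypothetical helpers, only ##INFO lines are handled, and the intermediate fallback to the reserved types of the specification is skipped, so unknown keys go straight to strings.

```python
import re

# Matches the leading part of a meta-information line such as
# ##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">
INFO_META = re.compile(r"##INFO=<ID=([^,]+),Number=([^,]+),Type=([^,>]+)")

CONVERTERS = {"Integer": int, "Float": float, "String": str, "Character": str}


def read_info_types(header_lines):
    """Map each declared INFO ID to its declared Type."""
    types = {}
    for line in header_lines:
        match = INFO_META.match(line)
        if match:
            field_id, _number, type_name = match.groups()
            types[field_id] = type_name
    return types


def parse_info(info_column, types):
    """Convert one INFO column into a dict of typed values."""
    parsed = {}
    if info_column == ".":
        return parsed
    for entry in info_column.split(";"):
        key, separator, raw = entry.partition("=")
        if not separator:
            parsed[key] = True  # a flag carries no value
            continue
        # Undeclared keys fall back to plain strings.
        convert = CONVERTERS.get(types.get(key), str)
        values = [None if v == "." else convert(v) for v in raw.split(",")]
        parsed[key] = values[0] if len(values) == 1 else values
    return parsed


header = [
    '##INFO=<ID=DP,Number=1,Type=Integer,Description="Total Depth">',
    '##INFO=<ID=AF,Number=A,Type=Float,Description="Allele Frequency, per ALT">',
    '##INFO=<ID=DB,Number=0,Type=Flag,Description="dbSNP membership">',
]
info = parse_info("DP=14;AF=0.5,0.25;DB;XX=foo", read_info_types(header))
```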

This package provides the Python 3 modules.

Registry entries: Bioconda 
q2-cutadapt
QIIME 2 plugin to work with adapters in sequence data
Versions of package q2-cutadapt
ReleaseVersionArchitectures
bookworm2022.11.1-2amd64,arm64,mips64el,ppc64el
sid2024.5.0-1amd64,arm64,mips64el,ppc64el,riscv64
bullseye2020.11.1-1amd64,arm64,mips64el,ppc64el
upstream2024.10.0
Popcon: 15 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results. Key features:

  • Integrated and automatic tracking of data provenance
  • Semantic type system
  • Plugin system for extending microbiome analysis functionality
  • Support for multiple types of user interfaces (e.g. API, command line, graphical)

QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis pipeline. QIIME 2 will address many of the limitations of QIIME 1, while retaining the features that make QIIME 1 a powerful and widely-used analysis pipeline.

QIIME 2 currently supports an initial end-to-end microbiome analysis pipeline. New functionality will regularly become available through QIIME 2 plugins. You can view a list of plugins that are currently available on the QIIME 2 plugin availability page. The future plugins page lists plugins that are being developed.

Please cite: Evan Bolyen, Jai Ram Rideout, Matthew R Dillon, Nicholas A Bokulich, Christian Abnet, Gabriel A Al-Ghalith, Harriet Alexander, Eric J Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J Brislawn, C Titus Brown, Benjamin J Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily Cope, Ricardo Da Silva, Pieter C Dorrestein, Gavin M Douglas, Daniel M Durall, Claire Duvallet, Christian F Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M Gauglitz, Deanna L Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin Huttley, Stefan Janssen, Alan K Jarmusch, Lingjing Jiang, Benjamin Kaehler, Kyo Bin Kang, Christopher R Keefe, Paul Keim, Scott T Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan GI Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J McIver, Alexey V Melnik, Jessica L Metcalf, Sydney C Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A Navas-Molina, Louis Felix Nothias, Stephanie B Orchanian, Talima Pearson, Samuel L Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R Spear, Austin D Swafford, Luke R Thompson, Pedro J Torres, Pauline Trinh, Anupriya Tripathi, Peter J Turnbaugh, Sabah Ul-Hasan, Justin JJ van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C Weber, Chase HD Williamson, Amy D Willis, Zhenjiang Zech Xu, Jesse R Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight and J Gregory Caporaso: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. 
(eprint) Nature Biotechnology 37 (2019)
q2-feature-table
QIIME 2 plugin supporting operations on feature tables
Versions of package q2-feature-table
ReleaseVersionArchitectures
bookworm2022.11.1+dfsg-2all
sid2024.5.0+dfsg-1all
bullseye2020.11.1+dfsg-1all
upstream2024.10.0
Popcon: 16 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results. Key features:

  • Integrated and automatic tracking of data provenance
  • Semantic type system
  • Plugin system for extending microbiome analysis functionality
  • Support for multiple types of user interfaces (e.g. API, command line, graphical)

QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis pipeline. QIIME 2 will address many of the limitations of QIIME 1, while retaining the features that make QIIME 1 a powerful and widely-used analysis pipeline.

QIIME 2 currently supports an initial end-to-end microbiome analysis pipeline. New functionality will regularly become available through QIIME 2 plugins. You can view a list of plugins that are currently available on the QIIME 2 plugin availability page. The future plugins page lists plugins that are being developed.

Please cite: Evan Bolyen, Jai Ram Rideout, Matthew R Dillon, Nicholas A Bokulich, Christian Abnet, Gabriel A Al-Ghalith, Harriet Alexander, Eric J Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J Brislawn, C Titus Brown, Benjamin J Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily Cope, Ricardo Da Silva, Pieter C Dorrestein, Gavin M Douglas, Daniel M Durall, Claire Duvallet, Christian F Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M Gauglitz, Deanna L Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin Huttley, Stefan Janssen, Alan K Jarmusch, Lingjing Jiang, Benjamin Kaehler, Kyo Bin Kang, Christopher R Keefe, Paul Keim, Scott T Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan GI Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J McIver, Alexey V Melnik, Jessica L Metcalf, Sydney C Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A Navas-Molina, Louis Felix Nothias, Stephanie B Orchanian, Talima Pearson, Samuel L Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R Spear, Austin D Swafford, Luke R Thompson, Pedro J Torres, Pauline Trinh, Anupriya Tripathi, Peter J Turnbaugh, Sabah Ul-Hasan, Justin JJ van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C Weber, Chase HD Williamson, Amy D Willis, Zhenjiang Zech Xu, Jesse R Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight and J Gregory Caporaso: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. 
(eprint) Nature Biotechnology 37 (2019)
q2-quality-filter
QIIME 2 plugin for PHRED-based filtering and trimming
Versions of package q2-quality-filter
ReleaseVersionArchitectures
sid2024.5.0-1all
bullseye2020.11.1-2all
bookworm2022.11.1-2all
upstream2024.10.0
Popcon: 15 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

QIIME 2 is a powerful, extensible, and decentralized microbiome analysis package with a focus on data and analysis transparency. QIIME 2 enables researchers to start an analysis with raw DNA sequence data and finish with publication-quality figures and statistical results. Key features:

  • Integrated and automatic tracking of data provenance
  • Semantic type system
  • Plugin system for extending microbiome analysis functionality
  • Support for multiple types of user interfaces (e.g. API, command line, graphical)

QIIME 2 is a complete redesign and rewrite of the QIIME 1 microbiome analysis pipeline. QIIME 2 will address many of the limitations of QIIME 1, while retaining the features that make QIIME 1 a powerful and widely-used analysis pipeline.

QIIME 2 currently supports an initial end-to-end microbiome analysis pipeline. New functionality will regularly become available through QIIME 2 plugins. You can view a list of plugins that are currently available on the QIIME 2 plugin availability page. The future plugins page lists plugins that are being developed.

Please cite: Evan Bolyen, Jai Ram Rideout, Matthew R Dillon, Nicholas A Bokulich, Christian Abnet, Gabriel A Al-Ghalith, Harriet Alexander, Eric J Alm, Manimozhiyan Arumugam, Francesco Asnicar, Yang Bai, Jordan E Bisanz, Kyle Bittinger, Asker Brejnrod, Colin J Brislawn, C Titus Brown, Benjamin J Callahan, Andrés Mauricio Caraballo-Rodríguez, John Chase, Emily Cope, Ricardo Da Silva, Pieter C Dorrestein, Gavin M Douglas, Daniel M Durall, Claire Duvallet, Christian F Edwardson, Madeleine Ernst, Mehrbod Estaki, Jennifer Fouquier, Julia M Gauglitz, Deanna L Gibson, Antonio Gonzalez, Kestrel Gorlick, Jiarong Guo, Benjamin Hillmann, Susan Holmes, Hannes Holste, Curtis Huttenhower, Gavin Huttley, Stefan Janssen, Alan K Jarmusch, Lingjing Jiang, Benjamin Kaehler, Kyo Bin Kang, Christopher R Keefe, Paul Keim, Scott T Kelley, Dan Knights, Irina Koester, Tomasz Kosciolek, Jorden Kreps, Morgan GI Langille, Joslynn Lee, Ruth Ley, Yong-Xin Liu, Erikka Loftfield, Catherine Lozupone, Massoud Maher, Clarisse Marotz, Bryan D Martin, Daniel McDonald, Lauren J McIver, Alexey V Melnik, Jessica L Metcalf, Sydney C Morgan, Jamie Morton, Ahmad Turan Naimey, Jose A Navas-Molina, Louis Felix Nothias, Stephanie B Orchanian, Talima Pearson, Samuel L Peoples, Daniel Petras, Mary Lai Preuss, Elmar Pruesse, Lasse Buur Rasmussen, Adam Rivers, Michael S Robeson, Patrick Rosenthal, Nicola Segata, Michael Shaffer, Arron Shiffer, Rashmi Sinha, Se Jin Song, John R Spear, Austin D Swafford, Luke R Thompson, Pedro J Torres, Pauline Trinh, Anupriya Tripathi, Peter J Turnbaugh, Sabah Ul-Hasan, Justin JJ van der Hooft, Fernando Vargas, Yoshiki Vázquez-Baeza, Emily Vogtmann, Max von Hippel, William Walters, Yunhu Wan, Mingxun Wang, Jonathan Warren, Kyle C Weber, Chase HD Williamson, Amy D Willis, Zhenjiang Zech Xu, Jesse R Zaneveld, Yilong Zhang, Qiyun Zhu, Rob Knight and J Gregory Caporaso: Reproducible, interactive, scalable and extensible microbiome data science using QIIME 2. 
(eprint) Nature Biotechnology 37 (2019)
qcat
demultiplexing Oxford Nanopore reads from FASTQ files
Versions of package qcat
ReleaseVersionArchitectures
trixie1.1.0-6all
bullseye1.1.0-2all
sid1.1.0-6all
bookworm1.1.0-6all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Qcat is a command-line tool for demultiplexing Oxford Nanopore reads from FASTQ files. It accepts basecalled FASTQ files and splits the reads into separate FASTQ files based on their barcode. Qcat makes the demultiplexing algorithms used in albacore/guppy and EPI2ME available to be used locally with FASTQ files. Currently qcat implements the EPI2ME algorithm.

The package is enhanced by the following packages: qcat-examples
Registry entries: Bioconda 
quicktree
Neighbor-Joining algorithm for phylogenies
Versions of package quicktree
ReleaseVersionArchitectures
bookworm2.5-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye2.5-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.5-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie2.5-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

QuickTree is an efficient implementation of the Neighbor-Joining algorithm (PMID: 3447015), capable of reconstructing phylogenies from huge alignments in time less than the age of the universe.

QuickTree accepts both distance matrix and multiple-sequence-alignment inputs. The former should be in PHYLIP format. The latter should be in Stockholm format, which is the native alignment format for the Pfam database. Alignments in various formats can be converted to Stockholm format with the sreformat program, which is part of the HMMer package (hmmer.org).

The trees are written to stdout, in the Newick/New-Hampshire format used by PHYLIP and many other programs.
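
For readers unfamiliar with the method, the following is a compact, unoptimized sketch of textbook Neighbor-Joining that turns a distance matrix into a Newick string. It is not QuickTree's code and makes no attempt at its efficiency; the function name and the list-of-lists input are assumptions made for the example.

```python
def neighbor_joining(names, dist):
    """Build an unrooted tree from a symmetric distance matrix.

    names: list of leaf labels; dist: list of lists of pairwise distances.
    Returns a Newick string with a trifurcation at the top level.
    """
    if len(names) < 3:
        raise ValueError("neighbor joining needs at least three taxa")
    nodes = list(names)            # Newick text of each current subtree
    d = [row[:] for row in dist]   # working copy of the distances
    while len(nodes) > 3:
        n = len(nodes)
        totals = [sum(row) for row in d]
        # Choose the pair (i, j) that minimizes the Q criterion.
        best = None
        for i in range(n):
            for j in range(i + 1, n):
                q = (n - 2) * d[i][j] - totals[i] - totals[j]
                if best is None or q < best[0]:
                    best = (q, i, j)
        _, i, j = best
        # Branch lengths from the new internal node to i and to j.
        length_i = 0.5 * d[i][j] + (totals[i] - totals[j]) / (2 * (n - 2))
        length_j = d[i][j] - length_i
        merged = f"({nodes[i]}:{length_i:g},{nodes[j]}:{length_j:g})"
        # Distances from the new node to every remaining node.
        keep = [k for k in range(n) if k not in (i, j)]
        new_row = [0.5 * (d[i][k] + d[j][k] - d[i][j]) for k in keep]
        d = [[d[a][b] for b in keep] + [new_row[idx]]
             for idx, a in enumerate(keep)]
        d.append(new_row + [0.0])
        nodes = [nodes[k] for k in keep] + [merged]
    # Join the last three subtrees at a single central node.
    a = (d[0][1] + d[0][2] - d[1][2]) / 2
    b = (d[0][1] + d[1][2] - d[0][2]) / 2
    c = (d[0][2] + d[1][2] - d[0][1]) / 2
    return f"({nodes[0]}:{a:g},{nodes[1]}:{b:g},{nodes[2]}:{c:g});"


# Small five-taxon example with an additive distance matrix.
example = neighbor_joining(
    ["a", "b", "c", "d", "e"],
    [[0, 5, 9, 9, 8],
     [5, 0, 10, 10, 9],
     [9, 10, 0, 8, 7],
     [9, 10, 8, 0, 3],
     [8, 9, 7, 3, 0]],
)
```

Each iteration scans all pairs, so the sketch runs in cubic time overall, which is fine for illustration but not for huge alignments.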

r-bioc-htsfilter
GNU R filter replicated high-throughput transcriptome sequencing data
Versions of package r-bioc-htsfilter
ReleaseVersionArchitectures
bullseye1.30.1+dfsg-1all
bookworm1.38.0+dfsg-2all
trixie1.44.0+dfsg-1all
sid1.44.0+dfsg-1all
upstream1.46.0
Popcon: 3 users (5 upd.)*
Newer upstream!
License: DFSG free
Git

This package implements a filtering procedure for replicated transcriptome sequencing data based on a global Jaccard similarity index in order to identify genes with low, constant levels of expression across one or more experimental conditions.
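
The building block of such a filter, the Jaccard index between two replicates at a given expression threshold, can be sketched as follows. This is an illustration only, not the package's procedure: the function name is hypothetical, the counts are assumed to be normalized already, and the way the package combines pairwise values into a global index and chooses the threshold is not reproduced.

```python
def jaccard_index(replicate_a, replicate_b, threshold):
    """Jaccard similarity of the gene sets expressed above a threshold.

    replicate_a and replicate_b hold one count per gene, in the same
    gene order. Returns 0.0 when no gene passes in either replicate
    (a convention chosen for this sketch).
    """
    above_a = {gene for gene, count in enumerate(replicate_a) if count > threshold}
    above_b = {gene for gene, count in enumerate(replicate_b) if count > threshold}
    union = above_a | above_b
    if not union:
        return 0.0
    return len(above_a & above_b) / len(union)


# Four genes measured in two replicates of one condition.
similarity = jaccard_index([0, 5, 20, 100], [1, 12, 18, 90], threshold=10)
```

Scanning a range of thresholds and keeping the one with the highest similarity is one way to turn this measure into a filtering cut-off.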

r-bioc-limma
linear models for microarray data
Versions of package r-bioc-limma
ReleaseVersionArchitectures
jessie3.22.1+dfsg-1amd64,armel,armhf,i386
sid3.60.6+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
stretch3.30.8+dfsg-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie3.60.6+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
buster3.38.3+dfsg-1amd64,arm64,armhf,i386
bullseye3.46.0+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm3.54.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream3.62.1
Popcon: 29 users (21 upd.)*
Newer upstream!
License: DFSG free
Git

Microarrays are microscopic plates with carefully arranged short DNA strands and/or chemically prepared surfaces to which other DNA preferably binds. The amount of DNA binding at different locations of these chips, typically determined by a fluorescent dye, is to be interpreted. The technology is typically used with DNA that is derived from RNA, i.e. to determine the activity of a gene and/or its splice variants. But the technology is also used to determine sequence variations in genomic DNA.

This Bioconductor package supports the analysis of gene expression microarray data, especially the use of linear models for analysing designed experiments and the assessment of differential expression. The package includes pre-processing capabilities for two-colour spotted arrays. The differential expression methods apply to all array platforms and treat Affymetrix, single channel and two channel experiments in a unified way.

Please cite: Gordon K. Smyth: Limma: linear models for microarray data. (eprint) :397-420 (2005)
Registry entries: Bio.tools  SciCrunch  Bioconda 
r-bioc-mutationalpatterns
GNU R comprehensive genome-wide analysis of mutational processes
Versions of package r-bioc-mutationalpatterns
ReleaseVersionArchitectures
sid3.14.0+dfsg-1all
bullseye3.0.1+dfsg-2all
trixie3.14.0+dfsg-1all
bookworm3.8.1+dfsg-1all
upstream3.16.0
Popcon: 3 users (5 upd.)*
Newer upstream!
License: DFSG free
Git

This BioConductor package provides an extensive toolset for the characterization and visualization of a wide range of mutational patterns in base substitution catalogs.

r-bioc-pwmenrich
PWM enrichment analysis
Versions of package r-bioc-pwmenrich
ReleaseVersionArchitectures
bullseye4.26.0-1all
sid4.40.0-1all
bookworm4.34.0-1all
trixie4.40.0-1all
upstream4.42.0
Popcon: 1 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

A toolkit of high-level functions for DNA motif scanning and enrichment analysis built upon Biostrings. The main functionality is PWM enrichment analysis of already known PWMs (e.g. from databases such as MotifDb), but the package also implements high-level functions for PWM scanning and visualisation. The package does not perform "de novo" motif discovery, but is instead focused on using motifs that are either experimentally derived or computationally constructed by other tools.

r-bioc-rcpi
molecular informatics toolkit for compound-protein interaction
Versions of package r-bioc-rcpi
ReleaseVersionArchitectures
sid1.40.3+ds-1all
trixie1.40.3+ds-1all
bookworm1.34.0+ds-1all
upstream1.42.0
Popcon: 0 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

Rcpi offers a molecular informatics toolkit with a comprehensive integration of bioinformatics and chemoinformatics tools for drug discovery.

Please cite: Dong-Sheng Cao, Nan Xiao, Qing-Song Xu and Alex F. Chen: Rcpi: R/Bioconductor package to generate various descriptors of proteins, compounds and their interactions. Bioinformatics 31(2):279-281 (2015)
Registry entries: Bio.tools  Bioconda 
r-bioc-rgsepd
GNU R gene set enrichment / projection displays
Versions of package r-bioc-rgsepd
ReleaseVersionArchitectures
bookworm1.30.0-1all
sid1.36.0-1all
bullseye1.22.0-1all
trixie1.36.0-1all
upstream1.38.0
Popcon: 1 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

R/GSEPD is a bioinformatics package for R to help disambiguate transcriptome samples (a matrix of RNA-Seq counts at transcript IDs) by automating differential expression (with DESeq2), then gene set enrichment (with GOSeq), and finally an N-dimensional projection to quantify in which ways each sample is like either treatment group.

r-bioc-rsamtools
GNU R binary alignment (BAM), variant call (BCF), or tabix file import
Versions of package r-bioc-rsamtools
ReleaseVersionArchitectures
jessie1.16.1-2amd64,armel,armhf,i386
stretch1.26.1-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid2.20.0+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
buster1.34.1-1amd64,arm64,armhf,i386
trixie2.20.0+dfsg-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
bullseye2.6.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2.14.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream2.22.0
Popcon: 26 users (9 upd.)*
Newer upstream!
License: DFSG free
Git

This package provides an interface to the 'samtools', 'bcftools', and 'tabix' utilities for manipulating SAM (Sequence Alignment / Map), binary variant call (BCF) and compressed indexed tab-delimited (tabix) files.

Registry entries: Bio.tools  Bioconda 
r-bioc-tcgabiolinks
GNU R/Bioconductor package for integrative analysis with GDC data
Versions of package r-bioc-tcgabiolinks
ReleaseVersionArchitectures
bullseye2.18.0+dfsg-1all
bookworm2.25.3+dfsg-1all
sid2.32.0+dfsg-2all
upstream2.34.0
Popcon: 3 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

The aims of TCGAbiolinks are to:

 1) facilitate open-access data retrieval from the GDC,
 2) prepare the data using the appropriate pre-processing strategies,
 3) provide the means to carry out different standard analyses and
 4) easily reproduce earlier research results.

In more detail, the package provides multiple methods for analysis (e.g., differential expression analysis, identifying differentially methylated regions) and methods for visualization (e.g., survival plots, volcano plots, starburst plots) in order to easily develop complete analysis pipelines.

r-cran-alakazam
Immunoglobulin Clonal Lineage and Diversity Analysis
Versions of package r-cran-alakazam
ReleaseVersionArchitectures
bullseye1.1.0-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
experimental1.3.0-2~0exp0amd64,arm64,mips64el,ppc64el,riscv64,s390x
sid1.3.0-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
buster0.2.11-1amd64,arm64,armhf,i386
trixie1.3.0-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm1.2.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 3 users (6 upd.)*
Versions and Archs
License: DFSG free
Git

Alakazam is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) and provides a set of tools to investigate lymphocyte receptor clonal lineages, diversity, gene usage, and other repertoire level properties, with a focus on high-throughput immunoglobulin (Ig) sequencing.

Alakazam serves five main purposes:

  • Providing core functionality for other R packages in the Immcantation framework. This includes common tasks such as file I/O, basic DNA sequence manipulation, and interacting with V(D)J segment and gene annotations.
  • Providing an R interface for interacting with the output of the pRESTO and Change-O tool suites.
  • Performing lineage reconstruction on clonal populations of Ig sequences and analyzing the topology of the resultant lineage trees.
  • Performing clonal abundance and diversity analysis on lymphocyte repertoires.
  • Performing physicochemical property analyses of lymphocyte receptor sequences.
Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. (eprint) 31(20):3356–3358 (2017)
r-cran-covid19us
cases of COVID-19 in the United States prepared for GNU R
Versions of package r-cran-covid19us
ReleaseVersionArchitectures
bookworm0.1.9-1all
bullseye0.1.7-1all
sid0.1.9-1all
trixie0.1.9-1all
Popcon: 3 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This package provides a GNU R wrapper around the 'COVID Tracking Project API' https://covidtracking.com/api/ providing data on cases of COVID-19 in the US.

r-cran-diagnosismed
medical diagnostic test accuracy analysis toolkit
Versions of package r-cran-diagnosismed
ReleaseVersionArchitectures
bullseye0.2.3-7all
bookworm0.2.3-7all
sid0.2.3-7all
trixie0.2.3-7all
jessie0.2.3-3all
stretch0.2.3-4all
buster0.2.3-6all
Debtags of package r-cran-diagnosismed:
devellang:r
fieldmedicine
interfacecommandline
roleprogram
useanalysing
Popcon: 6 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

DiagnosisMed is a GNU R package to analyze the accuracy of data from diagnostic tests evaluating health conditions. It was designed to be used by health professionals. This package helps estimate sensitivity and specificity from categorical and continuous test results, including some evaluations of indeterminate results, compare different categorical tests, and estimate reasonable test cut-offs. It displays the results in a way commonly used by health professionals. No graphical interface is available yet.
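
To make the quantities concrete, here is a small sketch of the standard accuracy measures for a binary test compared with a reference standard. It is not DiagnosisMed's interface: the function name and the dictionary layout are assumptions, and confidence intervals are left out.

```python
def diagnostic_accuracy(tp, fp, fn, tn):
    """Basic accuracy measures from the four cells of a 2x2 table.

    tp/fn: diseased subjects testing positive/negative;
    fp/tn: healthy subjects testing positive/negative.
    """
    if min(tp + fn, tn + fp, tp + fp, tn + fn) == 0:
        raise ValueError("every row and column of the table must be non-empty")
    return {
        "sensitivity": tp / (tp + fn),  # diseased subjects detected
        "specificity": tn / (tn + fp),  # healthy subjects cleared
        "ppv": tp / (tp + fp),          # positive predictive value
        "npv": tn / (tn + fn),          # negative predictive value
    }


# Hypothetical study: 100 diseased and 200 healthy subjects.
measures = diagnostic_accuracy(tp=90, fp=20, fn=10, tn=180)
```

Note that the predictive values depend on the disease prevalence in the sample, while sensitivity and specificity do not.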

r-cran-epi
GNU R epidemiological analysis
Versions of package r-cran-epi
ReleaseVersionArchitectures
stretch2.7-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.32-2amd64,arm64,armhf,i386
bullseye2.43-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2.47-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.53-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.53-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie1.1.67-4amd64,armel,armhf,i386
upstream2.56
Debtags of package r-cran-epi:
fieldmedicine
interfacecommandline
roleprogram
Popcon: 29 users (24 upd.)*
Newer upstream!
License: DFSG free
Git

Functions for demographic and epidemiological analysis in the Lexis diagram, i.e. register and cohort follow-up data, including interval censored data and representation of multistate data. Also some useful functions for tabulation and plotting. Contains some epidemiological datasets.

The Epi package is mainly focused on "classical" chronic disease epidemiology. The package has grown out of the course Statistical Practice in Epidemiology using R (see http://www.pubhealth.ku.dk/~bxc/SPE).

There is a short introduction to R for Epidemiology available at http://staff.pubhealth.ku.dk/%7Ebxc/Epi/R-intro.pdf Beware that pages 38-120 of it are merely the manual pages for the Epi package.

Epi is not the only R package for epidemiological analysis; a package with more affinity to infectious disease epidemiology is the epitools package, which is also available in Debian.

Epi is used in the Department of Biostatistics of the University of Copenhagen.

Please cite: Martyn Plummer and Bendix Carstensen: Lexis: An R Class for Epidemiological Studies with Long-Term Follow-Up. Journal of Statistical Software 38(5):1-12 (2011)
r-cran-epibasix
GNU R Elementary Epidemiological Functions
Versions of package r-cran-epibasix
ReleaseVersionArchitectures
stretch1.3-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster1.5-1all
bullseye1.5-2all
bookworm1.5-2all
trixie1.5-2all
sid1.5-2all
jessie1.3-1amd64,armel,armhf,i386
Debtags of package r-cran-epibasix:
fieldmedicine
interfacecommandline
roleprogram
Popcon: 5 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Elementary Epidemiological Functions for a Graduate Epidemiology / Biostatistics Course.

This package contains elementary tools for analysis of common epidemiological problems, ranging from sample size estimation, through 2x2 contingency table analysis and basic measures of agreement (kappa, sensitivity/specificity). Appropriate print and summary statements are also written to facilitate interpretation wherever possible. This package is a work in progress, so any comments or suggestions would be appreciated. Source code is commented throughout to facilitate modification. The target audience includes graduate students in various epi/biostatistics courses.
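
One of the agreement measures named above, Cohen's kappa for a 2x2 table, can be sketched in a few lines. The function name and argument order are hypothetical and do not mirror the epibasix interface.

```python
def cohen_kappa_2x2(a, b, c, d):
    """Cohen's kappa for two raters and two categories.

    a: both raters say yes, d: both say no,
    b: rater 1 yes and rater 2 no, c: rater 1 no and rater 2 yes.
    """
    n = a + b + c + d
    if n == 0:
        raise ValueError("empty table")
    observed = (a + d) / n
    # Agreement expected by chance from the marginal totals.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# 50 subjects: observed agreement 0.7, chance agreement 0.5.
kappa = cohen_kappa_2x2(a=20, b=5, c=10, d=15)
```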

Epibasix was developed in Canada.

r-cran-epicalc
GNU R Epidemiological calculator
Versions of package r-cran-epicalc
ReleaseVersionArchitectures
sid2.15.1.0-5all
bullseye2.15.1.0-5all
bookworm2.15.1.0-5all
trixie2.15.1.0-5all
jessie2.15.1.0-1all
stretch2.15.1.0-2all
buster2.15.1.0-4all
Debtags of package r-cran-epicalc:
devellang:r
fieldmedicine, statistics
interfacecommandline
roleprogram
Popcon: 6 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Functions making R easy for epidemiological calculation.

Datasets from Dbase (.dbf), Stata (.dta), SPSS (.sav), EpiInfo (.rec) and comma-separated value (.csv) formats as well as R data frames can be processed to make several epidemiological calculations.

r-cran-epiestim
GNU R estimate time varying reproduction numbers from epidemic curves
Versions of package r-cran-epiestim
ReleaseVersionArchitectures
sid2.2-4+dfsg-1all
trixie2.2-4+dfsg-1all
bookworm2.2-4+dfsg-1all
bullseye2.2-4+dfsg-1all
buster-backports2.2-4+dfsg-1~bpo10+1all
Popcon: 3 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Tools to quantify transmissibility throughout an epidemic from the analysis of time series of incidence as described in Cori et al. (2013) and Wallinga and Teunis (2004).
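
The core idea behind estimating a time-varying reproduction number from an incidence series can be sketched as the ratio of new cases to the total infectiousness of earlier cases, weighted by the serial interval distribution. This is a deliberately simplified point estimate under stated assumptions, not the package's method: the function names are hypothetical, and the Bayesian estimation over sliding time windows used by Cori et al. is not reproduced.

```python
def total_infectiousness(incidence, si_weights, t):
    """Sum of past incidence weighted by the serial interval.

    si_weights[s - 1] is the probability that the serial interval
    equals s time steps, for s = 1, 2, ...
    """
    return sum(incidence[t - s] * w
               for s, w in enumerate(si_weights, start=1)
               if t - s >= 0)


def reproduction_numbers(incidence, si_weights):
    """Naive estimate R_t = I_t / Lambda_t for each time step.

    Returns None where no earlier cases contribute infectiousness.
    """
    estimates = []
    for t in range(len(incidence)):
        infectiousness = total_infectiousness(incidence, si_weights, t)
        estimates.append(incidence[t] / infectiousness
                         if infectiousness > 0 else None)
    return estimates


# Cases double each step and the serial interval is exactly one step.
r_values = reproduction_numbers([1, 2, 4, 8], [1.0])
```

Raw ratios like these are very noisy for small counts, which is why smoothing over a time window is used in practice.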

r-cran-epir
GNU R Functions for analysing epidemiological data
Versions of package r-cran-epir
ReleaseVersionArchitectures
stretch0.9-79-1all
buster0.9-99-1all
sid2.0.76+dfsg-1all
bullseye2.0.19-1all
jessie0.9-59-1all
bookworm2.0.57+dfsg-1all
Debtags of package r-cran-epir:
devellang:r
fieldmedicine
interfacecommandline
roleprogram
useanalysing
Popcon: 46 users (40 upd.)*
Versions and Archs
License: DFSG free
Git

A package for analysing epidemiological data. Contains functions for directly and indirectly adjusting measures of disease frequency, quantifying measures of association on the basis of single or multiple strata of count data presented in a contingency table, and computing confidence intervals around incidence risk and incidence rate estimates. Miscellaneous functions for use in meta-analysis, diagnostic test interpretation, and sample size calculations.

r-cran-epitools
GNU R Epidemiology Tools for Data and Graphics
Versions of package r-cran-epitools
ReleaseVersionArchitectures
jessie0.5-7-1all
buster0.5-10-2all
bullseye0.5-10.1-2all
bookworm0.5-10.1-2all
trixie0.5-10.1-2all
sid0.5-10.1-2all
stretch0.5-7-1all
Debtags of package r-cran-epitools:
fieldmedicine
interfacecommandline
roleprogram
Popcon: 7 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

GNU R Tools for public health epidemiologists and data analysts. Epitools provides numerical tools and programming solutions that have been used and tested in real-world epidemiologic applications.

Many practical problems in the analysis of public health data require programming or special software, and investigators in different locations may duplicate programming efforts. Often, simple analyses, such as the construction of confidence intervals, are not calculated and thereby complicate appropriate statistical inferences for small geographic areas. There are many examples of simple and useful numerical tools that would enhance the work of epidemiologists at local health departments and yet are not readily available for the problem in front of them. The availability of these tools will encourage wider use of appropriate methods and promote evidence-based public health practices.

r-cran-hms
GNU R pretty time of day
Versions of package r-cran-hms
ReleaseVersionArchitectures
stretch-backports0.4.2-1~bpo9+1all
sid1.1.3-1all
trixie1.1.3-1all
bookworm1.1.2-1all
bullseye1.0.0-1all
buster0.4.2-2all
Popcon: 280 users (79 upd.)*
Versions and Archs
License: DFSG free
Git

This GNU R package implements an S3 class for storing and formatting time-of-day values, based on the 'difftime' class.

r-cran-incidence
GNU R compute, handle, plot and model incidence of dated events
Versions of package r-cran-incidence
ReleaseVersionArchitectures
bookworm1.7.3-1all
trixie1.7.5-1all
sid1.7.5-1all
buster-backports1.7.3-1~bpo10+1all
bullseye1.7.3-1all
Popcon: 3 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Provides functions and classes to compute, handle and visualise incidence from dated events for a defined time interval. Dates can be provided in various standard formats. The class 'incidence' is used to store computed incidence and can be easily manipulated, subsetted, and plotted. In addition, log-linear models can be fitted to 'incidence' objects using 'fit'. This package is part of the RECON (http://www.repidemicsconsortium.org/) toolkit for outbreak analysis.
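
The two steps mentioned above, computing incidence from dated events and fitting a log-linear model, can be sketched in plain Python as follows. This is not the package's 'incidence' class or its 'fit' function; the helper names and the onset dates are hypothetical, and skipping zero-count days in the regression is an assumption of this sketch.

```python
import math
from collections import Counter
from datetime import date, timedelta


def daily_incidence(dates):
    """Count events per day, filling days without events with zero."""
    counts = Counter(dates)
    first, last = min(counts), max(counts)
    days = [first + timedelta(days=i) for i in range((last - first).days + 1)]
    return [(d, counts.get(d, 0)) for d in days]


def fit_log_linear(counts):
    """Least-squares fit of log(count) = intercept + rate * day_index.

    Days with zero counts are skipped because log(0) is undefined.
    """
    points = [(i, math.log(c)) for i, c in enumerate(counts) if c > 0]
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    rate = sxy / sxx
    return mean_y - rate * mean_x, rate


# Hypothetical onset dates: 1, 2, 4 and 8 cases on four consecutive days.
onsets = ([date(2020, 3, 1)] + [date(2020, 3, 2)] * 2
          + [date(2020, 3, 3)] * 4 + [date(2020, 3, 4)] * 8)
series = daily_incidence(onsets)
intercept, growth_rate = fit_log_linear([c for _, c in series])
doubling_time = math.log(2) / growth_rate
```

The fitted slope is the daily growth rate, so the doubling time follows directly as log(2) divided by that rate.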

r-cran-kernelheaping
GNU R kernel density estimation for heaped and rounded data
Versions of package r-cran-kernelheaping
ReleaseVersionArchitectures
sid2.3.0-1all
bookworm2.3.0-1all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

In self-reported or anonymised data the user often encounters heaped data, i.e. data which are rounded (to a possibly different degree of coarseness). While this is mostly a minor problem in parametric density estimation, the bias can be very large for non-parametric methods such as kernel density estimation. This package implements a partly Bayesian algorithm treating the true unknown values as additional parameters and estimates the rounding parameters to give a corrected kernel density estimate. It supports various standard bandwidth selection methods. Varying rounding probabilities (depending on the true value) and asymmetric rounding are estimable as well: Gross, M. and Rendtel, U. (2016). Additionally, bivariate non-parametric density estimation for rounded data, Gross, M. et al. (2016), as well as data aggregated on areas is supported.

r-cran-lexrankr
extractive summarization of text with the LexRank algorithm
Versions of package r-cran-lexrankr
ReleaseVersionArchitectures
buster0.5.0-2amd64,arm64,armhf,i386
bookworm0.5.2-8amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.5.2-8amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.5.2-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.5.2-8amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 4 users (4 upd.)*
Versions and Archs
License: DFSG free
Git

An R implementation of the LexRank algorithm, a stochastic graph-based method for computing the relative importance of textual units for Natural Language Processing. The technique is tested on the problem of Text Summarization (TS). Extractive TS relies on the concept of sentence salience to identify the most important sentences in a document or set of documents. Salience is typically defined in terms of the presence of particular important words or in terms of similarity to a centroid pseudo-sentence.
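
The graph-based notion of salience can be sketched in a few lines of Python. This is a simplified illustration, not the lexRankr interface: it uses raw word counts instead of tf-idf weights, and the function names, similarity threshold, damping factor and example sentences are assumptions.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def lexrank(sentences, threshold=0.1, damping=0.85, iterations=50):
    """Score sentences by power iteration on a thresholded similarity graph."""
    bags = [Counter(s.lower().split()) for s in sentences]
    n = len(bags)
    # Connect two sentences when their similarity exceeds the threshold.
    adj = [[1.0 if cosine(bags[i], bags[j]) > threshold else 0.0
            for j in range(n)] for i in range(n)]
    # Turn each row into transition probabilities of a random walk.
    for row in adj:
        total = sum(row) or 1.0
        for j in range(n):
            row[j] /= total
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [(1 - damping) / n
                  + damping * sum(adj[j][i] * scores[j] for j in range(n))
                  for i in range(n)]
    return scores


sentences = [
    "masks reduce droplet spread indoors",
    "masks reduce risk",
    "droplet spread indoors increases",
]
scores = lexrank(sentences)
most_salient = sentences[scores.index(max(scores))]
```

The first sentence shares words with both of the others, which do not overlap with each other, so it ends up with the highest score.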

Please cite: Güneş Erkan and Dragomir R. Radev: LexRank: Graph-based Lexical Centrality as Salience in Text Summarization. (eprint) Journal of Artificial Intelligence Research 22:457-479 (2004)
r-cran-mediana
clinical trial simulations
Versions of package r-cran-mediana
ReleaseVersionArchitectures
bullseye1.0.8-3all
trixie1.0.8-3all
sid1.0.8-3all
bookworm1.0.8-3all
Popcon: 1 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Provides a general framework for clinical trial simulations based on the Clinical Scenario Evaluation (CSE) approach. The package supports a broad class of data models (including clinical trials with continuous, binary, survival-type and count-type endpoints as well as multivariate outcomes that are based on combinations of different endpoints), analysis strategies and commonly used evaluation criteria.

r-cran-msm
GNU R Multi-state Markov and hidden Markov models in continuous time
Versions of package r-cran-msm
ReleaseVersionArchitectures
stretch1.6.4-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie1.4-2amd64,armel,armhf,i386
trixie1.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.7-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.6.6-2amd64,arm64,armhf,i386
bullseye1.6.8-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.8-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream1.8.2
Debtags of package r-cran-msm:
interfacecommandline
roleprogram
Popcon: 54 users (47 upd.)*
Newer upstream!
License: DFSG free
Git

Functions for fitting general continuous-time Markov and hidden Markov multi-state models to longitudinal data. Both Markov transition rates and the hidden Markov output process can be modelled in terms of covariates. A variety of observation schemes are supported, including processes observed at arbitrary times, completely-observed processes, and censored states.
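
For intuition about what transition rates mean in a continuous-time Markov model, the sketch below computes transition probabilities over an interval for the simplest case of two states, where a closed form exists. It does not fit anything to data and is unrelated to the msm function interface; the function name and the rate values are hypothetical.

```python
import math


def two_state_transition_probs(rate_12, rate_21, t):
    """Transition probability matrix P(t) for a two-state continuous-time Markov chain.

    rate_12: intensity of moving from state 1 to state 2
    rate_21: intensity of moving from state 2 to state 1
    t      : length of the time interval between observations
    """
    total = rate_12 + rate_21
    decay = math.exp(-total * t)
    # Probability of being in the starting state again after time t.
    p11 = rate_21 / total + rate_12 / total * decay
    p22 = rate_12 / total + rate_21 / total * decay
    return [[p11, 1.0 - p11], [1.0 - p22, p22]]


# Hypothetical rates per year, evaluated over half a year.
p_half_year = two_state_transition_probs(rate_12=0.2, rate_21=0.1, t=0.5)
```

Because the probabilities depend on the interval length t, the same rates can describe observations made at arbitrary, unequally spaced times.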

Please cite: Christopher H. Jackson: Multi-State Models for Panel Data: The msm Package for R. Journal of Statistical Software 38(8):1-29 (2011)
r-cran-qtl
GNU R package for genetic marker linkage analysis
Versions of package r-cran-qtl
ReleaseVersionArchitectures
bookworm1.58-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.70-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.70-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster1.44-9-1amd64,arm64,armhf,i386
bullseye1.47-9-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie1.33-7-1amd64,armel,armhf,i386
stretch1.40-8-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
Debtags of package r-cran-qtl:
devellang:r, library
fieldbiology, statistics
roleapp-data
suitegnu
Popcon: 17 users (9 upd.)*
Versions and Archs
License: DFSG free
Git

R/qtl is an extensible, interactive environment for mapping quantitative trait loci (QTLs) in experimental crosses. It is implemented as an add-on-package for the freely available and widely used statistical language/software R (see http://www.r-project.org).

The development of this software as an add-on to R allows one to take advantage of the basic mathematical and statistical functions, and powerful graphics capabilities, that are provided with R. Further, the user will benefit by the seamless integration of the QTL mapping software into a general statistical analysis program. The goal is to make complex QTL mapping methods widely accessible and allow users to focus on modeling rather than computing.

A key component of computational methods for QTL mapping is the hidden Markov model (HMM) technology for dealing with missing genotype data. The main HMM algorithms, with allowance for the presence of genotyping errors, for backcrosses, intercrosses, and phase-known four-way crosses were implemented.

The current version of R/qtl includes facilities for estimating genetic maps, identifying genotyping errors, and performing single-QTL genome scans and two-QTL, two-dimensional genome scans, by interval mapping (with the EM algorithm), Haley-Knott regression, and multiple imputation. All of this may be done in the presence of covariates (such as sex, age or treatment). One may also fit higher-order QTL models by multiple imputation.

Please cite: Karl W. Broman, Hao Wu, Saunak Sen and Gary A. Churchill: R/qtl: QTL mapping in experimental crosses. (PubMed,eprint) Bioinformatics 19:889-890 (2003)
Registry entries: Bio.tools  SciCrunch  Bioconda 
r-cran-seroincidence
GNU R seroincidence calculator tool
Versions of package r-cran-seroincidence
ReleaseVersionArchitectures
bookworm2.0.0-3all
sid2.0.0-3all
trixie2.0.0-3all
stretch1.0.5-1all
bullseye2.0.0-2all
buster2.0.0-1all
Popcon: 4 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

Antibody levels measured in cross-sectional population samples can be translated into an estimate of the frequency with which seroconversions (new infections) occur. In order to interpret the measured cross-sectional antibody levels, parameters which predict the decay of antibodies must be known. In previously published reports (Simonsen et al. 2009 and Versteegh et al. 2005), this information has been obtained from longitudinal studies on subjects who had culture-confirmed Salmonella and Campylobacter infections. A Bayesian back-calculation model was used to convert antibody measurements into an estimation of time since infection. This can be used to estimate the seroincidence in the cross-sectional sample of the population. For both the longitudinal and cross-sectional measurements of antibody concentrations, the indirect ELISA was used. The models are only valid for persons over 18 years. The seroincidence estimates are suitable for monitoring the effect of control programmes when representative cross-sectional serum samples are available for analyses. These provide more accurate information on the infection pressure in humans across countries.

Please cite: PFM Teunis, JCH van Eijkeren, CW Ang, YTHP van Duynhoven, JB Simonsen, MA Strid and W van Pelt: Biomarker dynamics: estimating infection rates from serological data. (PubMed) Statistics in Medicine 31(20):2240–2248 (2012)
r-cran-sf
Simple Features for R
Versions of package r-cran-sf
ReleaseVersionArchitectures
bullseye0.9-7+dfsg-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch-backports0.6-3+dfsg-1~bpo9+1arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm1.0-9+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.0-17+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch-backports0.7-2+dfsg-1~bpo9+1amd64
buster0.7-2+dfsg-1amd64,arm64,armhf,i386
upstream1.0-19
Popcon: 178 users (71 upd.)*
Newer upstream!
License: DFSG free
Git

Support for simple features, a standardized way to encode spatial vector data. Binds to 'GDAL' for reading and writing data, to 'GEOS' for geometrical operations, and to 'PROJ' for projection conversions and datum transformations.

r-cran-shazam
Immunoglobulin Somatic Hypermutation Analysis
Versions of package r-cran-shazam
ReleaseVersionArchitectures
trixie1.2.0-1all
bookworm1.1.2-1all
buster0.1.11-1all
bullseye1.0.2-1all
sid1.2.0-1all
Popcon: 3 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

Provides a computational framework for Bayesian estimation of antigen-driven selection in immunoglobulin (Ig) sequences, providing an intuitive means of analyzing selection by quantifying the degree of selective pressure. Also provides tools to profile mutations in Ig sequences, build models of somatic hypermutation (SHM) in Ig sequences, and make model-dependent distance comparisons of Ig repertoires.

SHazaM is part of the Immcantation analysis framework for Adaptive Immune Receptor Repertoire sequencing (AIRR-seq) and provides tools for advanced analysis of somatic hypermutation (SHM) in immunoglobulin (Ig) sequences. Shazam focuses on the following analysis topics:

  • Quantification of mutational load: SHazaM includes methods for determining the rate of observed and expected mutations under various criteria. Mutational profiling criteria include rates under SHM targeting models, mutations specific to CDR and FWR regions, and physicochemical property dependent substitution rates.
  • Statistical models of SHM targeting patterns: Models of SHM may be divided into two independent components: 1) a mutability model that defines where mutations occur and 2) a nucleotide substitution model that defines the resulting mutation. Collectively these two components define an SHM targeting model. SHazaM provides empirically derived SHM 5-mer context mutation models for both humans and mice, as well as tools to build SHM targeting models from data.
  • Analysis of selection pressure using BASELINe: The Bayesian Estimation of Antigen-driven Selection in Ig Sequences (BASELINe) method is a novel method for quantifying antigen-driven selection in high-throughput Ig sequence data. BASELINe uses SHM targeting models to estimate the null distribution of expected mutation frequencies, and provides measures of selection pressure informed by known AID targeting biases.
  • Model-dependent distance calculations: SHazaM provides methods to compute evolutionary distances between sequences or sets of sequences based on SHM targeting models. This information is particularly useful in understanding and defining clonal relationships.
Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. (PubMed,eprint) Bioinformatics 31(20):3356-3358 (2015)
Registry entries: Bioconda 
r-cran-sjplot
GNU R data visualization for statistics in social science
Versions of package r-cran-sjplot
ReleaseVersionArchitectures
sid2.8.16+dfsg-1all
bookworm2.8.12+dfsg-1all
stretch-backports2.6.2-1~bpo9+1all
bullseye2.8.7-1all
buster2.6.2-1all
Popcon: 19 users (15 upd.)*
Versions and Archs
License: DFSG free
Git

Collection of plotting and table output functions for data visualization. Results of various statistical analyses (that are commonly used in social sciences) can be visualized using this package, including simple and cross tabulated frequencies, histograms, box plots, (generalized) linear models, mixed effects models, principal component analysis and correlation matrices, cluster analyses, scatter plots, stacked scales, effects plots of regression models (including interaction terms) and much more. This package supports labelled data.

r-cran-spp
GNU R ChIP-seq processing pipeline
Versions of package r-cran-spp
ReleaseVersionArchitectures
sid1.16.0-2amd64,arm64,mips64el,ppc64el,riscv64,s390x
buster1.15.5-1amd64,arm64,armhf,i386
bullseye1.16.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.16.0-2amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm1.16.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 3 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

R package for analysis of ChIP-seq and other functional sequencing data

  • Assess overall DNA-binding signals in the data and select appropriate quality of tag alignment.
  • Discard or restrict positions with abnormally high number of tags.
  • Calculate genome-wide profiles of smoothed tag density and save them in WIG files for viewing in other browsers.
  • Calculate genome-wide profiles providing conservative statistical estimates of fold enrichment ratios along the genome. These can be exported for browser viewing, or thresholded to determine regions of significant enrichment/depletion.
  • Determine statistically significant point binding positions.
  • Assess whether the set of point binding positions detected at a current sequencing depth meets saturation criteria, and if it does not, estimate what sequencing depth would be required to do so.
Please cite: Peter V Kharchenko, Michael Y Tolstorukov and Peter J Park: Design and analysis of ChIP-seq experiments for DNA-binding proteins. (PubMed) Nature biotechnology 26(12):1351–1359 (2008)
r-cran-stringi
GNU R character string processing facilities
Versions of package r-cran-stringi
ReleaseVersionArchitectures
sid1.8.4-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster1.2.4-2amd64,arm64,armhf,i386
bookworm1.7.12-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch-backports1.2.4-2~bpo9+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
stretch1.1.2-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
trixie1.8.4-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.5.3-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 302 users (171 upd.)*
Versions and Archs
License: DFSG free
Git

Allows for fast, correct, consistent, portable, as well as convenient character string/text processing in every locale and any native encoding. Owing to the use of the ICU library, the package provides R users with platform-independent functions known to Java, Perl, Python, PHP, and Ruby programmers. Among available features there are: pattern searching (e.g. via regular expressions), random string generation, string collation, transliteration, concatenation, date-time formatting and parsing, etc.

Please cite: Marek Gagolewski, Bartlomiej Tartanus, Oliver Keyes and Marcin Pawel Bujarski: R package stringi: Character string processing facilities. zenodo (2015)
r-cran-surveillance
GNU R package for the Modeling and Monitoring of Epidemic Phenomena
Versions of package r-cran-surveillance
ReleaseVersionArchitectures
stretch1.13.0-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
jessie1.8-0-1amd64,armel,armhf,i386
sid1.24.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.24.0-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.20.3-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.19.0-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.16.2-1amd64,arm64,armhf,i386
upstream1.24.1
Debtags of package r-cran-surveillance:
fieldmedicine
interfacecommandline
roleprogram
Popcon: 6 users (7 upd.)*
Newer upstream!
License: DFSG free
Git

Statistical methods for the modeling and monitoring of time series of counts, proportions and categorical data, as well as for the modeling of continuous-time point processes of epidemic phenomena.

The monitoring methods focus on aberration detection in count data time series from public health surveillance of communicable diseases, but applications could just as well originate from environmetrics, reliability engineering, econometrics, or social sciences. The package implements many typical outbreak detection procedures such as the (improved) Farrington algorithm, or the negative binomial GLR-CUSUM method of Höhle and Paul (2008). A novel CUSUM approach combining logistic and multinomial logistic modeling is also included. The package contains several real-world data sets, the ability to simulate outbreak data, and to visualize the results of the monitoring in a temporal, spatial or spatio-temporal fashion. A recent overview of the available monitoring procedures is given by Salmon et al. (2016).
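
To illustrate what aberration detection on a count series looks like, here is a minimal one-sided CUSUM in Python. It is a textbook variant and not the negative binomial GLR-CUSUM or the Farrington algorithm mentioned above; the function name, the reference value k, the threshold h and the weekly counts are assumptions made for the example.

```python
def cusum_alarms(counts, expected, k=1.0, h=4.0):
    """Return the time indices at which a one-sided CUSUM raises an alarm.

    counts  : observed counts per time unit
    expected: in-control mean count (assumed known here)
    k       : reference value subtracted at every step
    h       : decision threshold
    """
    s = 0.0
    alarms = []
    for t, x in enumerate(counts):
        # Accumulate only the excess over expected + k, never below zero.
        s = max(0.0, s + (x - expected - k))
        if s > h:
            alarms.append(t)
            s = 0.0  # restart the statistic after an alarm
    return alarms


counts = [3, 2, 4, 3, 2, 9, 10, 3]   # hypothetical weekly case counts
alarms = cusum_alarms(counts, expected=3.0)
```

With these toy values the statistic stays at zero for the first five weeks and raises alarms in the two weeks with 9 and 10 cases.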

For the retrospective analysis of epidemic spread, the package provides three endemic-epidemic modeling frameworks with tools for visualization, likelihood inference, and simulation. hhh4() estimates models for (multivariate) count time series following Paul and Held (2011) and Meyer and Held (2014). twinSIR() models the susceptible-infectious-recovered (SIR) event history of a fixed population, e.g., epidemics across farms or networks, as a multivariate point process as proposed by Höhle (2009). twinstim() estimates self-exciting point process models for a spatio-temporal point pattern of infective events, e.g., time-stamped geo-referenced surveillance data, as proposed by Meyer et al. (2012). A recent overview of the implemented space-time modeling frameworks for epidemic phenomena is given by Meyer et al. (2017).

Please cite: Maëlle Salmon, Dirk Schumacher and Michael Höhle: Monitoring Count Time Series in R: Aberration Detection in Public Health Surveillance. Journal of Statistical Software 70(10):1-35 (2016)
r-cran-tigger
Infers new Immunoglobulin alleles from Rep-Seq Data
Versions of package r-cran-tigger
ReleaseVersionArchitectures
trixie1.1.0-1all
sid1.1.0-1all
buster0.3.1-1all
bookworm1.0.1-1all
bullseye1.0.0-1all
Popcon: 2 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

Summary: Infers the V genotype of an individual from immunoglobulin (Ig) repertoire-sequencing (Rep-Seq) data, including detection of any novel alleles. This information is then used to correct existing V allele calls from among the sample sequences.

High-throughput sequencing of B cell immunoglobulin receptors is providing unprecedented insight into adaptive immunity. A key step in analyzing these data involves assignment of the germline V, D and J gene segment alleles that comprise each immunoglobulin sequence by matching them against a database of known V(D)J alleles. However, this process will fail for sequences that utilize previously undetected alleles, whose frequency in the population is unclear.

TIgGER is a computational method that significantly improves V(D)J allele assignments by first determining the complete set of gene segments carried by an individual (including novel alleles) from V(D)J-rearranged sequences. TIgGER can then infer a subject’s genotype from these sequences, and use this genotype to correct the initial V(D)J allele assignments.

The application of TIgGER continues to identify a surprisingly high frequency of novel alleles in humans, highlighting the critical need for this approach. TIgGER, however, can and has been used with data from other species.

Core Abilities:

  • Detecting novel alleles
  • Inferring a subject’s genotype
  • Correcting preliminary allele calls

Required Input

  • A table of sequences from a single individual, with columns containing the following:
      • V(D)J-rearranged nucleotide sequence (in IMGT-gapped format)
      • Preliminary V allele calls
      • Preliminary J allele calls
      • Length of the junction region
  • Germline Ig sequences in IMGT-gapped fasta format (e.g., as those downloaded from IMGT/GENE-DB)

The former can be created through the use of IMGT/HighV-QUEST and Change-O.

Please cite: Namita T. Gupta, Jason A. Vander Heiden, Mohamed Uduman, Daniel Gadala-Maria, Gur Yaari and Steven H. Kleinstein: Change-O: a toolkit for analyzing large-scale B cell immunoglobulin repertoire sequencing data. (eprint) Bioinformatics 31(20):3356–3358 (2015)
Registry entries: Bioconda 
r-other-ascat
Allele-Specific Copy Number Analysis of Tumours
Versions of package r-other-ascat
ReleaseVersionArchitectures
sid3.1.2-1all
bookworm3.1.1-1all
trixie3.1.2-1all
bullseye2.5.2-3all
upstream3.2.0
Popcon: 3 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

ASCAT (allele-specific copy number analysis of tumors) is an allele-specific copy number analysis of the in vivo breast cancer genome. It can be used to accurately dissect the allele-specific copy number of solid tumors, simultaneously estimating and adjusting for both tumor ploidy and nonaberrant cell admixture.

Please cite: Peter Van Loo, Silje H Nordgard, Ole Christian Lingjærde, Hege G Russnes, Inga H Rye, Wei Sun, Victor J Weigman, Peter Marynen, Anders Zetterberg, Bjørn Naume, Charles M Perou, Anne-Lise Børresen-Dale and Vessela N Kristensen: Allele-specific Copy Number Analysis of Tumors. (PubMed) PNAS 107(39):16910-5 (2010)
Registry entries: Bio.tools  Bioconda 
ragout
Reference-Assisted Genome Ordering UTility
Versions of package ragout
ReleaseVersionArchitectures
sid2.3-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye2.3-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm2.3-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.3-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Ragout (Reference-Assisted Genome Ordering UTility) is a tool for chromosome-level scaffolding using multiple references. Given initial assembly fragments (contigs/scaffolds) and one or multiple related references (complete or draft), it produces a chromosome-scale assembly (as a set of scaffolds).

The approach is based on the analysis of genome rearrangements (like inversions or chromosomal translocations) between the input genomes and reconstructing the most parsimonious structure of the target genome.

Ragout now supports both small and large genomes (of mammalian scale and complexity). The assembly of highly polymorphic genomes is currently limited.

Please cite: Mikhail Kolmogorov, Joel Armstrong, Brian J. Raney, Ian Streeter, Matthew Dunn, Fengtang Yang, Duncan Odom, Paul Flicek, Thomas M. Keane, David Thybert, Benedict Paten and Son Pham: Chromosome assembly of large and complex genomes using multiple references. (PubMed,eprint) Genome Research 28(11):1720-1732 (2018)
Registry entries: Bioconda 
readucks
Nanopore read de-multiplexer (read demux -> readux -> readucks, innit)
Versions of package readucks
ReleaseVersionArchitectures
bullseye0.0.3-2all
sid0.0.3-5all
bookworm0.0.3-5all
Popcon: 0 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

This package is inspired by the demultiplexing options in porechop but without the adapter trimming options - it just demuxes. It uses the parasail library with its Python bindings to do pairwise alignment which provides a considerable speed up over the seqan library used by porechop due to its low-level use of vector processor instructions.

recan
genetic distance plotting for recombination events analysis
Versions of package recan
ReleaseVersionArchitectures
bullseye0.1.2-2all
trixie0.5+dfsg-1all
bookworm0.1.5+dfsg-2all
sid0.5+dfsg-1all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

recan is a Python package which allows one to construct genetic distance plots to explore and discover recombination events in viral genomes.

This method has been previously implemented in desktop software tools: RAT, Simplot and RDP4.
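
The idea behind a genetic distance plot can be sketched as follows: slide a window along an alignment and record, per window, how far the query is from a candidate parent sequence. The Python below is an illustration only and does not use the recan API; the function name, the choice of p-distance, the window settings and the toy sequences are assumptions.

```python
def windowed_distances(query, reference, window=4, step=4):
    """p-distance (fraction of mismatching sites) per sliding window.

    Both sequences must come from the same alignment and have equal length.
    Returns a list of (window_start, distance) pairs ready for plotting.
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be aligned to the same length")
    distances = []
    for start in range(0, len(query) - window + 1, step):
        q = query[start:start + window]
        r = reference[start:start + window]
        mismatches = sum(1 for x, y in zip(q, r) if x != y)
        distances.append((start, mismatches / window))
    return distances


# Toy alignment: the query matches the parent except in its last third.
query = "ACGTACGTTTTT"
parent = "ACGTACGTACGT"
profile = windowed_distances(query, parent)
```

A sudden jump in distance to one parent along the genome, as in the last window here, is the kind of pattern that suggests a recombination breakpoint.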

rna-star
ultrafast universal RNA-seq aligner
Versions of package rna-star
ReleaseVersionArchitectures
buster2.7.0a+dfsg-1amd64,arm64
bullseye2.7.8a+dfsg-2amd64,arm64,mips64el,ppc64el
bookworm2.7.10b+dfsg-2amd64,arm64,mips64el,ppc64el
trixie2.7.11b+dfsg-2amd64,arm64,mips64el,ppc64el,riscv64
stretch2.5.2b+dfsg-1amd64,arm64,mips64el,ppc64el
sid2.7.11b+dfsg-2amd64,arm64,mips64el,ppc64el,riscv64
stretch-backports2.7.0a+dfsg-1~bpo9+1amd64,arm64,mips64el,ppc64el
Popcon: 6 users (9 upd.)*
Versions and Archs
License: DFSG free
Git

Spliced Transcripts Alignment to a Reference (STAR) is software based on a previously undescribed RNA-seq alignment algorithm that uses sequential maximum mappable seed search in uncompressed suffix arrays followed by a seed clustering and stitching procedure. STAR outperforms other aligners by a factor of >50 in mapping speed, aligning to the human genome 550 million 2 × 76 bp paired-end reads per hour on a modest 12-core server, while at the same time improving alignment sensitivity and precision. In addition to unbiased de novo detection of canonical junctions, STAR can discover non-canonical splices and chimeric (fusion) transcripts, and is also capable of mapping full-length RNA sequences. Using Roche 454 sequencing of reverse transcription polymerase chain reaction amplicons, the authors experimentally validated 1960 novel intergenic splice junctions with an 80-90% success rate, corroborating the high precision of the STAR mapping strategy.

The package is enhanced by the following packages: multiqc
Please cite: Alexander Dobin, Carrie A. Davis, Felix Schlesinger, Jorg Drenkow, Chris Zaleski, Sonali Jha, Philippe Batut, Mark Chaisson and Thomas R. Gingeras: STAR: ultrafast universal RNA-seq aligner. (PubMed,eprint) Bioinformatics 29(1):15-21 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
Topics: Sequence analysis
rsem
RNA-Seq by Expectation-Maximization
Versions of package rsem
ReleaseVersionArchitectures
trixie1.3.3+dfsg-3amd64,arm64,mips64el,ppc64el,riscv64,s390x
sid1.3.3+dfsg-3amd64,arm64,mips64el,ppc64el,riscv64,s390x
stretch1.2.31+dfsg-1amd64,arm64,mips64el,ppc64el,s390x
buster1.3.1+dfsg-1amd64,arm64
bookworm1.3.3+dfsg-2amd64,arm64,mips64el,ppc64el,s390x
bullseye1.3.3+dfsg-1amd64,arm64,mips64el,ppc64el,s390x
Popcon: 3 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

RSEM is a software package for estimating gene and isoform expression levels from RNA-Seq data. The RSEM package provides a user-friendly interface, supports threads for parallel computation of the EM algorithm, single-end and paired-end read data, quality scores, variable-length reads and RSPD estimation. In addition, it provides posterior mean and 95% credibility interval estimates for expression levels. For visualization, it can generate BAM and Wiggle files in both transcript and genomic coordinates. Genomic-coordinate files can be visualized by both the UCSC Genome Browser and the Broad Institute’s Integrative Genomics Viewer (IGV). Transcript-coordinate files can be visualized by IGV. RSEM also has its own scripts to generate transcript read depth plots in PDF format. A unique feature of RSEM is that the read depth plots can be stacked, with read depth contributed by unique reads shown in black and read depth contributed by multi-reads shown in red. In addition, models learned from data can also be visualized. Last but not least, RSEM contains a simulator.
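
The role of the EM algorithm in handling multi-reads can be illustrated with a deliberately small Python sketch. It ignores transcript lengths, quality scores and read-start position distributions, so it is far simpler than RSEM's actual model; the function name and the toy read assignments are assumptions.

```python
def em_abundance(read_compat, n_transcripts, iterations=100):
    """Estimate relative transcript abundances from ambiguous read assignments.

    read_compat  : one entry per read, listing the indices of the transcripts
                   the read is compatible with (each list must be non-empty)
    n_transcripts: number of transcripts
    """
    theta = [1.0 / n_transcripts] * n_transcripts
    for _ in range(iterations):
        counts = [0.0] * n_transcripts
        # E-step: split every read among its transcripts in proportion
        # to the current abundance estimates.
        for compat in read_compat:
            total = sum(theta[t] for t in compat)
            for t in compat:
                counts[t] += theta[t] / total
        # M-step: renormalise the expected counts into abundances.
        theta = [c / len(read_compat) for c in counts]
    return theta


# Toy data: 3 reads unique to transcript 0, 1 unique to transcript 1,
# and 4 multi-reads compatible with both.
reads = [[0]] * 3 + [[1]] + [[0, 1]] * 4
theta = em_abundance(reads, n_transcripts=2)
```

In this toy case the multi-reads are gradually shifted towards transcript 0, which has more unique support, and the estimate converges to 0.75 versus 0.25.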

The package is enhanced by the following packages: multiqc
Please cite: Bo Li and Colin Dewey: RSEM: accurate transcript quantification from RNA-Seq data with or without a reference genome. (PubMed,eprint) BMC Bioinformatics 12(1):323 (2011)
Registry entries: Bio.tools  SciCrunch  Bioconda 
ruby-bio
Ruby tools for computational molecular biology
Versions of package ruby-bio
ReleaseVersionArchitectures
jessie1.4.3.0001-2all
sid2.0.5-1all
bookworm2.0.4-1all
bullseye2.0.1-2all
stretch1.5.0-2all
trixie2.0.5-1all
buster1.5.2-1all
Popcon: 0 users (1 upd.)*
Versions and Archs
License: DFSG free
Git

The BioRuby project aims to implement an integrated environment for bioinformaticians using the Ruby language. The design philosophy of the BioRuby library is KISS (keep it simple, stupid), to maximize the usability and efficiency of an everyday tool for biologists. The project was started in Japan with the support of the University of Tokyo (Human Genome Center), Kyoto University (Bioinformatics Center) and the Open Bio Foundation.

salmon
wicked-fast transcript quantification from RNA-seq data
Versions of package salmon
ReleaseVersionArchitectures
bullseye1.4.0+ds1-1amd64,arm64
stretch0.7.2+ds1-2amd64
bookworm1.10.1+ds1-1amd64,arm64
trixie1.10.2+ds1-1amd64,arm64
sid1.10.2+ds1-1amd64,arm64
buster0.12.0+ds1-1amd64
upstream1.10.3
Popcon: 1 users (4 upd.)*
Newer upstream!
License: DFSG free
Git

Salmon is a wicked-fast program to produce highly accurate, transcript-level quantification estimates from RNA-seq data. Salmon achieves its accuracy and speed via a number of different innovations, including the use of lightweight alignments (accurate but fast-to-compute proxies for traditional read alignments) and massively-parallel stochastic collapsed variational inference. The result is a versatile tool that fits nicely into many different pipelines. For example, you can choose to make use of the lightweight alignments by providing Salmon with raw sequencing reads, or, if it is more convenient, you can provide Salmon with regular alignments (e.g. computed with your favorite aligner), and it will use the same wicked-fast, state-of-the-art inference algorithm to estimate transcript-level abundances for your experiment.

The package is enhanced by the following packages: multiqc
Please cite: Rob Patro, Geet Duggal, Michael I Love, Rafael A Irizarry and Carl Kingsford: Salmon provides fast and bias-aware quantification of transcript expression. (eprint) Nature Methods 14(4):417-419 (2017)
Registry entries: Bio.tools  SciCrunch  Bioconda 
samblaster
marks duplicates, extracts discordant/split reads
Versions of package samblaster
ReleaseVersionArchitectures
bookworm0.1.26-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie0.1.26-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid0.1.26-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
buster0.1.24-2amd64,arm64,armhf,i386
bullseye0.1.26-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Current "next-generation" sequencing technologies cannot tell what exact sequence they will be reading. They take what is available. And if some sequences are read very often, then this needs some extra biomedical thinking. The genome could for instance be duplicated.

samblaster is a fast and flexible program for marking duplicates in read-id grouped paired-end SAM files. It can also optionally output discordant read pairs and/or split read mappings to separate SAM files, and/or unmapped/clipped reads to a separate FASTQ file. When marking duplicates, samblaster will require approximately 20MB of memory per 1M read pairs.
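
As a rough planning aid, the memory figure quoted above can be turned into a back-of-the-envelope estimate. The sketch below only applies the "approximately 20MB per 1M read pairs" rule of thumb from this description; the function name is hypothetical and real usage may differ.

```python
MB_PER_MILLION_PAIRS = 20  # approximate figure quoted in the description


def estimate_memory_mb(read_pairs):
    """Rough memory estimate (in MB) for duplicate marking."""
    return read_pairs / 1_000_000 * MB_PER_MILLION_PAIRS


# A hypothetical run with 250 million read pairs.
estimated = estimate_memory_mb(250_000_000)
```

For 250 million read pairs this gives about 5000 MB, i.e. roughly 5 GB.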

The package is enhanced by the following packages: multiqc
Please cite: Gregory G. Faust and Ira M. Hall: SAMBLASTER: fast duplicate marking and structural variant read extraction. (PubMed,eprint) Bioinformatics 30(17):2503-2505 (2014)
Registry entries: Bio.tools  SciCrunch  Bioconda 
samclip
filter SAM file for soft and hard clipped alignments
Versions of package samclip
ReleaseVersionArchitectures
trixie0.4.0-4all
sid0.4.0-4all
bullseye0.4.0-2all
bookworm0.4.0-4all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Most short read aligners perform local alignment of reads to the reference genome. Examples include bwa mem, minimap2, and bowtie2 (unless in --end-to-end mode). This means the ends of the read may not be part of the best alignment.

This can be caused by:

  • adapter sequences (aren't in the reference)
  • poor quality bases (mismatches only make the alignment score worse)
  • structural variation in your sample compared to the reference
  • reads overlapping the start and end of contigs (including circular genomes)

Read aligners output a SAM file. Column 6 in this format stores the CIGAR string, which describes which parts of the read aligned and which didn't. The unaligned ends of the read can be "soft" or "hard" clipped, denoted with S and H at each end of the CIGAR string. It is possible for both types to be present, but that is not common. Soft and hard don't mean anything biologically; they just refer to whether the full read sequence is in the SAM file or not.
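
The clipping information can be read directly from the CIGAR string. The following is a minimal sketch, not samclip's own code: it parses a CIGAR string and reports how many bases are soft- and hard-clipped at each end. The function name and return layout are assumptions made for this example.

```python
import re

# One CIGAR operation: a length followed by an operation letter.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")


def clipped_ends(cigar):
    """Return ((left_soft, left_hard), (right_soft, right_hard))."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]

    def leading_clips(operations):
        soft = hard = 0
        for length, op in operations:
            if op == "S":
                soft += length
            elif op == "H":
                hard += length
            else:
                break  # first non-clipping operation ends the clipped run
        return soft, hard

    return leading_clips(ops), leading_clips(reversed(ops))


left, right = clipped_ends("5S90M5H")
```

A filter in the spirit of this package could then discard alignments whose clip lengths exceed a chosen threshold.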

Registry entries: Bioconda 
samtools
processing sequence alignments in SAM, BAM and CRAM formats
Versions of package samtools
ReleaseVersionArchitectures
bookworm1.16.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.9-4amd64,arm64,armhf
bullseye1.11-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
jessie0.1.19-1amd64,armhf,i386
stretch-backports1.7-2~bpo9+1amd64,arm64,armel,armhf,mips,mips64el,mipsel,ppc64el,s390x
stretch1.3.1-3amd64,arm64,armel,i386,mips64el,mipsel,ppc64el
trixie1.20-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.20-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream1.21
Debtags of package samtools:
fieldbiology
interfacecommandline
networkclient
roleprogram
scopeutility
uitoolkitncurses
useanalysing, calculating, filtering
works-withbiological-sequence
Popcon: 53 users (29 upd.)*
Newer upstream!
License: DFSG free
Git

Samtools is a set of utilities that manipulate nucleotide sequence alignments in the binary BAM format. It imports from and exports to the ascii SAM (Sequence Alignment/Map) and CRAM formats, does sorting, merging and indexing, and allows one to retrieve reads in any regions swiftly. It is designed to work on a stream, and is able to open a BAM or CRAM (not SAM) file on a remote FTP or HTTP server.

The package is enhanced by the following packages: libbio-samtools-perl multiqc
Please cite: Heng Li, Bob Handsaker, Alec Wysoker, Tim Fennell, Jue Ruan, Nils Homer, Gabor Marth, Goncalo Abecasis, Richard Durbin and 1000 Genome Project Data Processing Subgroup: The Sequence Alignment/Map (SAM) Format and SAMtools. (PubMed,eprint) Bioinformatics 25(16):2078-2079 (2009)
Registry entries: Bio.tools  SciCrunch  Bioconda 
scrappie
basecaller for Nanopore sequencer
Versions of package scrappie
ReleaseVersionArchitectures
sid1.4.2-8amd64,arm64,armhf,i386,mips64el,ppc64el,riscv64,s390x
experimental1.4.2-9~0exp0simdeamd64,arm64,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.4.2-8amd64,arm64,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.4.2-8amd64,arm64,armhf,i386,mips64el,ppc64el,s390x
bullseye1.4.2-7amd64,arm64,armhf,i386,mips64el,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The Nanopore is a device for DNA/RNA sequencing that does not require an amplification of the material. The polynucleotides are threaded through a pore and while these pass through, the change in the electrostatic potential allows one to identify ("call") the actual base that resides in the pore. Scrappie goes a step further and also attempts to describe modifications to the nucleic acid.

Please cite: Ryan R. Wick, Louise M. Judd and Kathryn E. Holt: Performance of neural network basecalling tools for Oxford Nanopore sequencing. (eprint) Genome Biol. 20:129 (2019)
Registry entries: Bioconda 
sepp
phylogeny with ensembles of Hidden Markov Models
Versions of package sepp
ReleaseVersionArchitectures
sid4.5.5+dfsg-1amd64,arm64
bullseye4.3.10+dfsg-5amd64
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

SEPP uses ensembles of Hidden Markov Models (HMMs) in different ways, each focusing on a different problem.

SEPP stands for "SATe-enabled Phylogenetic Placement", and addresses the problem of phylogenetic placement of short reads into reference alignments and trees.

Registry entries: Bioconda 
seqkit
cross-platform and ultrafast toolkit for FASTA/Q file manipulation
Versions of package seqkit
ReleaseVersionArchitectures
bookworm2.3.1+ds-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.8.2+ds-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.8.2+ds-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.15.0+ds-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream2.9.0
Popcon: 3 users (4 upd.)*
Newer upstream!
License: DFSG free
Git

SeqKit is a cross-platform, ultrafast and comprehensive toolkit for FASTA/Q processing. SeqKit provides executable binary files for all major operating systems, including Windows, Linux, and Mac OS X, and can be used directly without any dependencies or pre-configuration. SeqKit demonstrates competitive performance in execution time and memory usage compared to similar tools. The efficiency and usability of SeqKit enable researchers to rapidly accomplish common FASTA/Q file manipulations.

Please cite: Wei Shen, Shuai Le, Yan Li and Fuquan Hu: SeqKit: A Cross-Platform and Ultrafast Toolkit for FASTA/Q File Manipulation. (PubMed,eprint) PlosOne 11(10):e0163962 (2016)
Registry entries: Bio.tools  Bioconda 
seqmagick
imagemagick-like frontend to Biopython SeqIO
Versions of package seqmagick
ReleaseVersionArchitectures
trixie0.8.6-3all
bookworm0.8.4-3all
buster0.7.0-1all
bullseye0.8.4-1all
sid0.8.6-3all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Seqmagick is a little utility to expose the file format conversion in BioPython in a convenient way.

Features include:

  • Modifying sequences:
      • Remove gaps
      • Reverse & reverse complement
      • Trim to a range of residues
      • Change case
      • Sort by length or ID
  • Displaying information about sequence files
  • Subsetting sequence files by:
      • Position
      • ID
      • Deduplication
  • Filtering sequences by quality score
  • Trimming alignments to a region of interest defined by the forward and reverse primers
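
To illustrate what some of the listed operations mean, here is a small plain-Python sketch of gap removal, reverse complementing and deduplication. It is not Seqmagick's implementation and does not use its command line or Biopython; all function names are hypothetical.

```python
# Translation table for DNA complements (upper and lower case).
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def remove_gaps(seq):
    """Drop the gap characters '-' and '.' from an aligned sequence."""
    return seq.replace("-", "").replace(".", "")


def reverse_complement(seq):
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def deduplicate(records):
    """Keep the first (id, sequence) record for each distinct sequence."""
    seen = set()
    unique = []
    for record_id, seq in records:
        if seq not in seen:
            seen.add(seq)
            unique.append((record_id, seq))
    return unique


records = [("r1", "AACG"), ("r2", "AACG"), ("r3", "TTGA")]
unique_records = deduplicate(records)
```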
Registry entries: Bioconda 
shapeit4
fast and accurate method for estimation of haplotypes (phasing)
Versions of package shapeit4
ReleaseVersionArchitectures
trixie4.2.2+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid4.2.2+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye4.2.0+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm4.2.2+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Segmented HAPlotype Estimation and Imputation Tools version 4 (SHAPEIT4). SHAPEIT4 is a fast and accurate method for estimation of haplotypes (aka phasing) for SNP array and sequencing data.

The package is enhanced by the following packages: shapeit4-example
Please cite: Olivier Delaneau, Jean-Francois Zagury, Matthew R Robinson, Jonathan L Marchini and Emmanouil T Dermitzakis: Accurate, scalable and integrative haplotype estimation. (eprint) Nature Communications (2019)
Registry entries: Bioconda 
shiny-server
put Shiny web apps online
Versions of package shiny-server
ReleaseVersionArchitectures
bookworm1.5.20.1002-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid1.5.20.1002-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.5.20.1002-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream1.5.23.1030
Popcon: 45 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

Shiny Server lets you put shiny web applications and interactive documents online. Take your Shiny apps and share them with your organization or the world.

Shiny Server lets you go beyond static charts, and lets you manipulate the data. Users can sort, filter, or change assumptions in real-time. Shiny Server empowers your users to customize your analysis for their specific needs and extract more insight from the data.

shovill
Assemble bacterial isolate genomes from Illumina paired-end reads
Versions of package shovill
ReleaseVersionArchitectures
bullseye1.1.0-4amd64
trixie1.1.0-9amd64
bookworm1.1.0-9amd64
sid1.1.0-9amd64
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Shovill is a pipeline which uses SPAdes at its core, but alters the steps before and after the primary assembly step to get similar results in less time. Shovill also supports other assemblers like SKESA, Velvet and Megahit, so you can take advantage of the pre- and post-processing that Shovill provides with those too.

smrtanalysis
software suite for single molecule, real-time sequencing
Versions of package smrtanalysis
ReleaseVersionArchitectures
stretch0~20161126all
bullseye0~20210111all
bookworm0~20210112all
sid0~20210112all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

SMRT® Analysis is a powerful, open-source bioinformatics software suite available for analysis of DNA sequencing data from Pacific Biosciences’ SMRT technology. Users can choose from a variety of analysis protocols that utilize PacBio® and third-party tools. Analysis protocols include de novo genome assembly, cDNA mapping, DNA base-modification detection, and long-amplicon analysis to determine phased consensus sequences.

This is a metapackage that depends on the components of SMRT Analysis.

Registry entries: Bio.tools  SciCrunch 
snakemake
pythonic workflow management system
Versions of package snakemake
ReleaseVersionArchitectures
buster5.4.0-1all
stretch3.10.0-1all
sid7.32.4-6all
bullseye5.24.1-2all
trixie7.32.4-6all
bookworm7.21.0-1all
upstream8.25.3
Popcon: 31 users (6 upd.)*
Newer upstream!
License: DFSG free
Git

Build systems like GNU Make are frequently used to create complicated workflows, e.g. in bioinformatics. This project aims to reduce the complexity of creating workflows by providing a clean and modern domain specific language (DSL) in Python style, together with a fast and comfortable execution environment.

Please cite: Johannes Köster and Sven Rahmann: Snakemake-a scalable bioinformatics workflow engine. Bioinformatics (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
snpeff
genetic variant annotation and effect prediction toolbox - tool
Versions of package snpeff
ReleaseVersionArchitectures
sid5.2.e+dfsg-1all
bookworm5.1+d+dfsg-3all
trixie5.2.e+dfsg-1all
Popcon: 1 users (3 upd.)*
Versions and Archs
License: DFSG free
Git

"We are all different!" Geneticists agree on this. Even twins, who are said to be identical, are on a molecular level only "mostly" identical. And even within the exact same individual, healthy cells acquire mutations such that we are all genetic mosaics. Changes to individual cells may be induced by environmental factors, e.g. UV light, or happen sporadically as mishaps during cellular divisions.

Because there are so many genetic differences, and most have no particular meaning for the development of a phenotype, i.e. most have no effect, it would be nice to have heuristics implemented that direct the researcher towards single-nucleotide polymorphisms (SNPs) that are most likely to be relevant. This identifies the gene that causes or contributes to, e.g., an illness, and possibly also genes that are affected by that change. Such mechanistic understanding of a disease, particularly when multiple genes and multiple genetic variants are contributing to the then "polygenic" phenotype, is at the onset of drug development and increasingly also for selecting individualized therapies in the clinic.

SnpEff is a variant annotation and effect prediction tool. It annotates and predicts the effects of variants on genes (such as amino acid changes). The inputs are predicted variants (SNPs, insertions, deletions and MNPs). The input file is usually obtained as a result of a sequencing experiment, and it is usually in variant call format (VCF).

SnpEff analyzes the input variants. It annotates the variants and calculates the effects they produce on known genes (e.g. amino acid changes).

This package contains the command line tool.

The package is enhanced by the following packages: multiqc
Please cite: Pablo Cingolani, Adrian Platts, Le Lily Wang, Melissa Coon, Tung Nguyen, Luan Wang, Susan J. Land, Douglas M. Ruden and Xiangyi Lu: A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w^1118; iso-2; iso-3. (PubMed,eprint) Fly 6(2):80-92 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
snpsift
tool to annotate and manipulate genome variants - tool
Versions of package snpsift
ReleaseVersionArchitectures
trixie5.2.e+dfsg-1all
sid5.2.e+dfsg-1all
bookworm5.1+dfsg2-2all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

SnpSift is a toolbox that allows one to filter and manipulate annotated files. Once the genomic variants have been annotated, one needs to filter them in order to find the "interesting / relevant variants". Given the large data files, this is not a trivial task (e.g. one cannot load all the variants into an XLS spreadsheet). SnpSift helps to perform the VCF file manipulation and filtering required at this stage in data processing pipelines.
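
Because such files are too large to load at once, filtering is typically done as a stream, one line at a time. The sketch below shows that idea with a simple quality threshold on the QUAL column of a VCF file. It is not SnpSift's filter syntax or code; the function name and the example records are invented for illustration.

```python
def filter_vcf_by_quality(lines, min_qual):
    """Yield header lines unchanged and variant lines with QUAL >= min_qual."""
    for line in lines:
        if line.startswith("#"):
            yield line
            continue
        # VCF columns: CHROM POS ID REF ALT QUAL FILTER INFO ...
        qual = line.rstrip("\n").split("\t")[5]
        if qual != "." and float(qual) >= min_qual:
            yield line


# A tiny made-up VCF; a real file would be read lazily from disk.
example_vcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n",
    "chr1\t100\t.\tA\tG\t50\tPASS\t.\n",
    "chr1\t200\t.\tC\tT\t5\tPASS\t.\n",
    "chr1\t300\t.\tG\tA\t.\tPASS\t.\n",
]
kept = list(filter_vcf_by_quality(example_vcf, min_qual=30))
```

Since the function is a generator, passing an open file object instead of a list keeps memory use independent of file size.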

This package contains the command line tool.

spades
genome assembler for single-cell and isolates data sets
Versions of package spades
ReleaseVersionArchitectures
buster3.13.0+dfsg2-2amd64
stretch3.9.1+dfsg-1amd64
trixie3.15.5+dfsg-7amd64
experimental4.0.0+dfsg1-1amd64
bookworm3.15.5+dfsg-2amd64
bullseye3.13.1+dfsg-2amd64
sid3.15.5+dfsg-7amd64
stretch-backports-sloppy3.13.1+dfsg-2~bpo9+1amd64
stretch-backports3.12.0+dfsg-1~bpo9+1amd64
upstream4.0.0
Popcon: 2 users (5 upd.)*
Newer upstream!
License: DFSG free
Git

The SPAdes – St. Petersburg genome assembler is intended for both standard isolates and single-cell MDA bacteria assemblies. It works with Illumina or IonTorrent reads and is capable of providing hybrid assemblies using PacBio and Sanger reads. You can also provide additional contigs that will be used as long reads.

This package provides the following additional pipelines:

  • metaSPAdes – a pipeline for metagenomic data sets
  • plasmidSPAdes – a pipeline for extracting and assembling plasmids from WGS data sets
  • metaplasmidSPAdes – a pipeline for extracting and assembling plasmids from metagenomic data sets
  • rnaSPAdes – a de novo transcriptome assembler from RNA-Seq data
  • truSPAdes – a module for TruSeq barcode assembly
  • biosyntheticSPAdes – a module for biosynthetic gene cluster assembly with paired-end reads

SPAdes provides several stand-alone binaries with a relatively simple command-line interface: k-mer counting (spades-kmercounter), assembly graph construction (spades-gbuilder) and long read to graph aligner (spades-gmapper).
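
For readers unfamiliar with the term, k-mer counting means tallying every substring of length k across all reads. The following toy sketch shows the idea only; it is not how spades-kmercounter is implemented, it does not merge a k-mer with its reverse complement, and the function name is hypothetical.

```python
from collections import Counter


def count_kmers(reads, k):
    """Count every substring of length k over all reads."""
    counts = Counter()
    for read in reads:
        for i in range(len(read) - k + 1):
            counts[read[i:i + k]] += 1
    return counts


kmer_counts = count_kmers(["ACGTAC", "CGTA"], k=3)
```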

Please cite: Anton Bankevich, Sergey Nurk, Dmitry Antipov, Alexey A. Gurevich, Mikhail Dvorkin, Alexander S. Kulikov, Valery M. Lesin, Sergey I. Nikolenko, Son Pham, Andrey D. Prjibelski, Alexey V. Pyshkin, Alexander V. Sirotkin, Nikolay Vyahhi, Glenn Tesler, Max A. Alekseyev and Pavel A. Pevzner: SPAdes: A New Genome Assembly Algorithm and Its Applications to Single-Cell Sequencing. (PubMed,eprint) Journal of Computational Biology 19(5):455-477 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
spaln
splicing-aware transcript-alignment to genomic DNA
Versions of package spaln
ReleaseVersionArchitectures
trixie3.0.2+dfsg-2amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm2.4.13f+dfsg-1amd64,arm64,mips64el,ppc64el,s390x
sid3.0.2+dfsg-2amd64,arm64,mips64el,ppc64el,riscv64,s390x
bullseye2.4.1+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream3.0.6b
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

Spaln (space-efficient spliced alignment) is a stand-alone program that maps and aligns a set of cDNA or protein sequences onto a whole genomic sequence in a single job. It also performs spliced or ordinary alignment after rapid similarity search against a protein sequence database, if a genomic segment or an amino acid sequence is given as a query.

Spaln supports a combination of protein sequence database and a given genomic segment and performs rapid similarity searches and (semi-)global alignments of a set of protein sequence queries against a protein sequence database. Spaln adopts multi-phase heuristics that make it possible to perform the job on a conventional personal computer.

Registry entries: Bioconda 
staden-io-lib-utils
programs for manipulating DNA sequencing files
Versions of package staden-io-lib-utils
ReleaseVersionArchitectures
stretch1.14.8-2amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm1.14.15-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.15.0-1.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.15.0-1.1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
jessie1.13.7-1amd64,armel,armhf,i386
bullseye1.14.13-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.14.11-6amd64,arm64,armhf,i386
Debtags of package staden-io-lib-utils:
biologynuceleic-acids, peptidic
fieldbiology, biology:bioinformatics
interfacecommandline
roleprogram
scopeutility
useanalysing
Popcon: 5 users (5 upd.)*
Versions and Archs
License: DFSG free
Git

The io_lib from the Staden package is a library of file reading and writing code to provide a general purpose trace file (and Experiment File) reading interface. It has been compiled and tested on a variety of unix systems, MacOS X and MS Windows.

This package contains the programs that are distributed with the Staden io_lib for manipulating and converting sequencing data files, and in particular files to manipulate short reads generated by second and third generation sequencers and stored in SRF format.

Registry entries: Bioconda 
stringtie
assemble short RNAseq reads to transcripts
Versions of package stringtie
ReleaseVersionArchitectures
trixie2.2.1+ds-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye2.1.4+ds-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.2.1+ds-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2.2.1+ds-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
upstream2.2.3
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

The abundance of transcripts in a human tissue sample can be determined by RNA sequencing. The exact sequence sampled may be random, depending on the technology used. And it may be short, i.e. shorter than the transcript. At some point, many shorter reads need to be assembled to model the complete transcripts.

StringTie assembles RNA-Seq reads into potential transcripts without the need for a reference genome and also provides a quantification of the splice variants.

Please cite: Mihaela Pertea, Geo M. Pertea, Corina M. Antonescu, Tsung-Cheng Chang, Joshua T. Mendell and Steven L. Salzberg: StringTie enables improved reconstruction of a transcriptome from RNA-seq reads. Nature Biotechnology 33:290–295 (2015)
Registry entries: Bio.tools  SciCrunch  Bioconda 
sumaclust
fast and exact clustering of genomic sequences
Versions of package sumaclust
ReleaseVersionArchitectures
sid1.0.36+ds-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
stretch1.0.20-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster1.0.31-2amd64,arm64,armhf,i386
bullseye1.0.36+ds-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm1.0.36+ds-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

With the development of next-generation sequencing, efficient tools are needed to handle millions of sequences in reasonable amounts of time. Sumaclust is a program developed by the LECA. Sumaclust aims to cluster sequences in a way that is fast and exact at the same time. This tool has been developed to be adapted to the type of data generated by DNA metabarcoding, i.e. entirely sequenced, short markers. Sumaclust clusters sequences using the same clustering algorithm as UCLUST and CD-HIT. This algorithm is mainly useful to detect the 'erroneous' sequences created during amplification and sequencing protocols, deriving from 'true' sequences.
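
The general greedy strategy associated with UCLUST and CD-HIT can be sketched as follows: sequences are visited from most to least abundant, and each one either joins the first existing cluster centre it resembles closely enough or becomes a new centre. This is a toy illustration, not Sumaclust's code. In particular, the position-by-position `identity` function is a simplification assumed here in place of a proper alignment, and all names and data are invented.

```python
def identity(a, b):
    """Fraction of matching positions (toy stand-in for an alignment score)."""
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def greedy_cluster(abundances, threshold):
    """Cluster sequences greedily; abundances maps sequence -> read count.

    Returns a dict mapping each cluster centre to its member sequences.
    """
    clusters = {}
    # Most abundant first; ties broken alphabetically for determinism.
    for seq in sorted(abundances, key=lambda s: (-abundances[s], s)):
        for centre in clusters:
            if identity(seq, centre) >= threshold:
                clusters[centre].append(seq)
                break
        else:
            clusters[seq] = [seq]  # no centre was similar enough
    return clusters


clusters = greedy_cluster(
    {"ACGTACGTAC": 50, "ACGTACGTAT": 2, "TTTTGGGGCC": 30, "TTTTGGGGCA": 1},
    threshold=0.9,
)
```

In this example the two rare variants are absorbed by the abundant sequences they differ from by one base, which mirrors the idea of attaching 'erroneous' sequences to the 'true' ones.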

Registry entries: Bioconda 
texlive-science
TeX Live: Mathematics, natural sciences, computer science packages
Versions of package texlive-science
ReleaseVersionArchitectures
bookworm2022.20230122-4all
trixie2024.20241102-1all
sid2024.20241102-1all
buster2018.20190227-2all
stretch2016.20170123-5all
jessie2014.20141024-1all
bullseye2020.20210202-3all
Debtags of package texlive-science:
fieldbiology, chemistry, electronics, mathematics, physics
made-oftex
roleapp-data
sciencepublishing
usetypesetting
works-withgraphs, text
works-with-formattex
Popcon: 2862 users (847 upd.)*
Versions and Archs
License: DFSG free
Git
thesias
Testing Haplotype Effects In Association Studies
Versions of package thesias
ReleaseVersionArchitectures
buster-backports3.1.1-1~bpo10+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
bookworm3.1.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid3.1.1-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie3.1.1-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye3.1.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

The objective of the THESIAS program is to perform haplotype-based association analysis in unrelated individuals. This program is based on the maximum likelihood model described in Tregouet et al. 2002 (Hum Mol Genet 2002,11: 2015-2023) and is linked to the SEM algorithm (Tregouet et al. Ann Hum Genet 2004,68: 165-177). THESIAS allows one to simultaneously estimate haplotype frequencies and their associated effects on the phenotype of interest. In this new THESIAS release, quantitative, qualitative (logistic and matched-pair analysis), categorical and survival outcomes can be studied. X-linked haplotype analysis is also feasible. Covariate-adjusted haplotype effects as well as haplotype x covariate interactions can also be investigated.

Please cite: David-Alexandre Trégouët and Valérie Garelle: "A new JAVA interface implementation of THESIAS: testing haplotype effects in association studies". (eprint) Bioinformatics 23(8):1038-1039 (2007)
Registry entries: Bio.tools  SciCrunch  Bioconda 
tiddit
structural variant calling
Versions of package tiddit
ReleaseVersionArchitectures
sid3.6.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bullseye2.12.0+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm3.5.2+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie3.6.1+dfsg-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
upstream3.8.0
Popcon: 1 users (3 upd.)*
Newer upstream!
License: DFSG free
Git

TIDDIT is a tool used to identify chromosomal rearrangements using mate-pair or paired-end sequencing data. TIDDIT identifies intra- and inter-chromosomal translocations, deletions, tandem duplications and inversions, using supplementary alignments as well as discordant pairs.

TIDDIT has two analysis modules: the sv mode, which searches for structural variants, and the cov mode, which analyses the read depth of a BAM file and generates a coverage report.
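
What a read-depth (coverage) report contains can be illustrated with a small sketch that computes the mean depth in fixed-size bins. This is not TIDDIT's implementation and does not read BAM files; alignments are assumed to be given as 0-based, half-open (start, end) intervals inside the region, and the function name and data are hypothetical.

```python
def binned_coverage(alignments, region_length, bin_size):
    """Mean read depth per bin for (start, end) alignment intervals."""
    n_bins = -(-region_length // bin_size)  # ceiling division
    covered_bases = [0] * n_bins
    for start, end in alignments:
        first_bin = start // bin_size
        last_bin = (end - 1) // bin_size
        for b in range(first_bin, last_bin + 1):
            bin_start = b * bin_size
            bin_end = min(bin_start + bin_size, region_length)
            # Add the number of bases of this alignment that fall in bin b.
            covered_bases[b] += min(end, bin_end) - max(start, bin_start)
    depths = []
    for b, bases in enumerate(covered_bases):
        bin_start = b * bin_size
        bin_end = min(bin_start + bin_size, region_length)
        depths.append(bases / (bin_end - bin_start))
    return depths


depths = binned_coverage([(0, 100), (50, 150), (120, 200)], 200, 100)
```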

Registry entries: Bio.tools  Bioconda 
tipp
tool for Taxonomic Identification and Phylogenetic Profiling
Versions of package tipp
ReleaseVersionArchitectures
sid1.0+dfsg-3amd64,arm64
Popcon: users ( upd.)*
Versions and Archs
License: DFSG free
Git

TIPP is a modification of SEPP for classifying query sequences (i.e. reads) using phylogenetic placement.

TIPP inserts each read into a taxonomic tree and uses the insertion location to identify the taxonomic lineage of the read. The novel idea behind TIPP is that rather than using the single best alignment and placement for taxonomic identification, it uses a collection of alignments and placements and considers statistical support for each alignment and placement.

TIPP can also be used for abundance estimation by computing an abundance profile on the reads binned to marker genes in a reference dataset.

tnseq-transit
statistical calculations of essentiality of genes or genomic regions
Versions of package tnseq-transit
ReleaseVersionArchitectures
bookworm3.2.7-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie3.3.4-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye3.2.1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
stretch-backports2.2.1-2~bpo9+1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
buster2.3.4-1amd64
sid3.3.4-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
upstream3.3.8
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

This software can be used to analyze Tn-Seq datasets. It includes various statistical calculations of essentiality of genes or genomic regions (including conditional essentiality between two conditions). These methods were developed and tested in a collaboration between the Sassetti lab (UMass) and the Ioerger lab (Texas A&M).

TRANSIT is capable of analyzing TnSeq libraries constructed with Himar1 or Tn5 datasets.

TRANSIT assumes you have already done pre-processing of raw sequencing files (.fastq) and extracted read counts into a .wig formatted file. The .wig file should contain the counts at all sites where an insertion could take place (including sites with no reads). For Himar1 datasets this is all TA sites in the genome. For Tn5 datasets this would be all nucleotides in the genome.
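
The requirement that the .wig file lists every possible insertion site, including those without reads, can be sketched for the Himar1 case as follows. The code enumerates all TA dinucleotides in a sequence and emits one line per site in the generic variableStep wiggle layout. The exact header TRANSIT expects is not specified here, and the function names, chromosome name and toy genome are assumptions for illustration.

```python
def ta_sites(genome):
    """1-based positions of every 'TA' dinucleotide in the sequence."""
    genome = genome.upper()
    return [i + 1 for i in range(len(genome) - 1) if genome[i:i + 2] == "TA"]


def wig_lines(genome, counts, chrom="chr_example"):
    """Build wiggle lines; counts maps site position -> insertion count.

    Sites without any reads are written with a count of 0.
    """
    lines = [f"variableStep chrom={chrom}"]
    for pos in ta_sites(genome):
        lines.append(f"{pos} {counts.get(pos, 0)}")
    return lines


toy_genome = "GATTACATAG"
lines = wig_lines(toy_genome, {8: 12})
```

For a Tn5 dataset the same layout would apply, but every nucleotide position would be listed instead of only the TA sites.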

Please cite: Michael A. DeJesus, Chaitra Ambadipudi, Richard Baker, Christopher Sassetti and Thomas R. Ioerger: TRANSIT - A Software Tool for Himar1 TnSeq Analysis. (PubMed,eprint) PLOS 11(10):e1004401 (2015)
Registry entries: Bio.tools  Bioconda 
toil
cross-platform workflow engine
Versions of package toil
ReleaseVersionArchitectures
buster3.18.0-2all
bullseye5.2.0-5all
bookworm5.9.2-2+deb12u1all
sid6.1.0-4all
upstream7.0.0
Popcon: 2 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

Toil is a scalable, efficient, cross-platform and easy-to-use workflow engine in pure Python. It works with several well established load balancers like Slurm or the Sun Grid Engine. Toil is also compatible with the Common Workflow Language (CWL) via the "toil-cwl-runner" interface, which this package makes available via the Debian alternatives system under the alias "cwl-runner".

Please cite: John Vivian, Arjun Arkal Rao, Frank Austin Nothaft, Christopher Ketchum, Joel Armstrong, Adam Novak, Jacob Pfeil, Jake Narkizian, Alden D. Deran, Audrey Musselman-Brown, Hannes Schmidt, Peter Amstutz, Brian Craft, Mary Goldman, Kate Rosenbloom, Melissa Cline, Brian O'Connor, Megan Hanna, Chet Birger, W. James Kent, David A. Patterson, Anthony D. Joseph, Jingchun Zhu, Sasha Zaranek, Gad Getz, David Haussler and Benedict Paten: Toil enables reproducible, open source, big biomedical data analyses. Nature Biotechnology 35(4):314–316 (2017)
Registry entries: Bioconda 
tombo
identification of modified nucleotides from raw nanopore sequencing data
Versions of package tombo
ReleaseVersionArchitectures
bookworm1.5.1-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie1.5.1-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid1.5.1-6amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye1.5.1-2amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Tombo is a suite of tools primarily for the identification of modified nucleotides from nanopore sequencing data. Tombo also provides tools for the analysis and visualization of raw nanopore signal.

Please cite: Marcus Stoiber, Joshua Quick, Rob Egan, Ji Eun Lee, Susan Celniker, Robert K. Neely, Nicholas Loman, Len A Pennacchio and James Brown: De novo Identification of DNA Modifications Enabled by Genome-Guided Nanopore Signal Processing. (eprint) bioRxiv (2016)
Registry entries: Bioconda 
tophat-recondition
post-processor for TopHat unmapped reads
Versions of package tophat-recondition
ReleaseVersionArchitectures
bookworm1.4-3all
bullseye1.4-3all
sid1.4-3all
trixie1.4-3all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

tophat-recondition is a post-processor for TopHat unmapped reads (contained in unmapped.bam), making them compatible with downstream tools (e.g., the Picard suite, samtools, GATK) (TopHat issue #17). It also works around bugs in TopHat:

  • the "mate is unmapped" SAM flag is not set on any reads in the unmapped.bam file (TopHat issue #3)
  • the mapped mate of an unmapped read can be absent from accepted_hits.bam, creating a mismatch between the file and the unmapped read's flags (TopHat issue #16)
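
The "mate is unmapped" flag mentioned above is one bit (0x8) of the numeric FLAG field defined by the SAM format. The sketch below shows how such a bit can be tested and set. It is not tophat-recondition's code; the helper names are hypothetical, while the bit values come from the SAM specification.

```python
# Selected SAM FLAG bits as defined in the SAM specification.
FLAG_PAIRED = 0x1
FLAG_UNMAPPED = 0x4
FLAG_MATE_UNMAPPED = 0x8
FLAG_FIRST_IN_PAIR = 0x40


def is_mate_unmapped(flag):
    """True if the 'mate is unmapped' bit is set."""
    return bool(flag & FLAG_MATE_UNMAPPED)


def mark_mate_unmapped(flag):
    """Return the flag with the 'mate is unmapped' bit set."""
    return flag | FLAG_MATE_UNMAPPED


# 69 = paired (1) + read unmapped (4) + first in pair (64).
original_flag = FLAG_PAIRED | FLAG_UNMAPPED | FLAG_FIRST_IN_PAIR
fixed_flag = mark_mate_unmapped(original_flag)
```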
Please cite: Christian Brueffer and Lao H. Saal: A post-processor for TopHat unmapped reads. Bioinformatics 17(1):199 (2016)
Registry entries: Bio.tools  Bioconda 
trinculo
toolkit to carry out genetic association for multi-category phenotypes
Versions of package trinculo
ReleaseVersionArchitectures
trixie0.96+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid0.96+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bullseye0.96+dfsg-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bookworm0.96+dfsg-4amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

An efficient toolkit for carrying out genetic association for multi-category phenotypes. Implements multinomial and ordinal association incorporating covariates, conditional analysis, empirical and non-empirical priors and fine-mapping.

umap-learn
Uniform Manifold Approximation and Projection
Versions of package umap-learn
ReleaseVersionArchitectures
bullseye0.4.5+dfsg-2all
bookworm0.5.3+dfsg-2all
sid0.5.4+dfsg-1all
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Uniform Manifold Approximation and Projection (UMAP) is a dimension reduction technique that can be used for visualisation similarly to t-SNE, but also for general non-linear dimension reduction. The algorithm is founded on three assumptions about the data:

 1. The data is uniformly distributed on a Riemannian manifold;
 2. The Riemannian metric is locally constant (or can be
    approximated as such);
 3. The manifold is locally connected.

From these assumptions it is possible to model the manifold with a fuzzy topological structure. The embedding is found by searching for a low dimensional projection of the data that has the closest possible equivalent fuzzy topological structure.

Please cite: Leland McInnes, John Healy and James Melville: UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. (eprint) arXiv (2018)
umis
tools for processing UMI RNA-tag data
Versions of package umis
ReleaseVersionArchitectures
trixie1.0.9-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm1.0.8-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
bullseye1.0.7-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
sid1.0.9-1amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Umis provides tools for estimating expression from RNA-Seq protocols that sequence the end tags of transcripts and incorporate molecular tags to correct for amplification bias.

There are four steps in this process.

 1. Formatting reads
 2. Filtering noisy cellular barcodes
 3. Pseudo-mapping to cDNAs
 4. Counting molecular identifiers
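
The last step above can be sketched as follows. This is an illustration under simplifying assumptions (reads already formatted, filtered and assigned to a transcript; no UMI error correction), not the umis implementation. The function name and the record layout are hypothetical.

```python
from collections import defaultdict


def count_molecules(records):
    """Count distinct UMIs per (cell barcode, transcript).

    `records` is an iterable of (cell_barcode, umi, transcript) tuples.
    Reads sharing all three values are treated as amplification
    duplicates of a single molecule.
    """
    seen = defaultdict(set)
    for cell, umi, transcript in records:
        seen[(cell, transcript)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


reads = [
    ("AAAC", "GGT", "tx1"),
    ("AAAC", "GGT", "tx1"),  # same UMI: counted once
    ("AAAC", "CCA", "tx1"),
    ("TTTG", "GGT", "tx2"),
]
counts = count_molecules(reads)
print(counts)
```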
Please cite: Valentine Svensson, Kedar Nath Natarajan, Lam-Ha Ly, Ricardo J Miragaia, Charlotte Labalette, Iain C Macaulay, Ana Cvejic and Sarah A Teichmann: Power analysis of single-cell RNA-sequencing experiments. (PubMed) Nature methods 14:381–387 (2017)
Registry entries: Bioconda 
uncalled
Utility for Nanopore Current Alignment to Large Expanses of DNA
Versions of package uncalled
ReleaseVersionArchitectures
bullseye2.2+ds-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
trixie2.3+ds-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
sid2.3+ds-2amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm2.2+ds1-1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Streaming algorithm for mapping raw nanopore signal to DNA references

Enables real-time enrichment or depletion on Oxford Nanopore Technologies (ONT) MinION runs via ReadUntil.

Also supports standalone signal mapping of fast5 reads

Please cite: Sam Kovaka, Yunfan Fan, Bohan Ni, Winston Timp and Michael C. Schatz: Targeted nanopore sequencing by real-time mapping of raw electrical signal with UNCALLED. (eprint) Nature Biotechnology (2020)
unicycler
hybrid assembly pipeline for bacterial genomes
Versions of package unicycler
ReleaseVersionArchitectures
stretch-backports0.4.7+dfsg-1~bpo9+1amd64
buster0.4.7+dfsg-2amd64
stretch-backports-sloppy0.4.8+dfsg-1~bpo9+1amd64
bullseye0.4.8+dfsg-2amd64
bookworm0.5.0+dfsg-1amd64
trixie0.5.0+dfsg-1amd64
sid0.5.0+dfsg-1amd64
upstream0.5.1
Popcon: 1 users (2 upd.)*
Newer upstream!
License: DFSG free
Git

Unicycler is an assembly pipeline for bacterial genomes. It can assemble Illumina-only read sets where it functions as a SPAdes-optimiser. It can also assemble long-read-only sets (PacBio or Nanopore) where it runs a miniasm+Racon pipeline. For the best possible assemblies, give it both Illumina reads and long reads, and it will conduct a hybrid assembly.

Please cite: Ryan R. Wick, Louise M. Judd, Claire L. Gorrie and Kathryn E. Holt: Unicycler: Resolving bacterial genome assemblies from short and long sequencing reads. (PubMed,eprint) PLOS Computational Biology 13(6):e1005595 (2017)
Registry entries: Bio.tools  Bioconda 
vg
tools for working with genome variation graphs
Versions of package vg
ReleaseVersionArchitectures
bullseye1.30.0+ds-1amd64,mips64el
sid1.30.0+ds-1amd64,mips64el
upstream1.61.0
Popcon: 1 users (0 upd.)*
Newer upstream!
License: DFSG free
Git

variation graph data structures, interchange formats, alignment, genotyping, and variant calling methods

Variation graphs provide a succinct encoding of the sequences of many genomes. A variation graph (in particular as implemented in vg) is composed of:

  • nodes, which are labeled by sequences and ids
  • edges, which connect two nodes via either of their respective ends
  • paths, which describe genomes, sequence alignments, and annotations (such as gene models and transcripts) as walks through nodes connected by edges

This model is similar to a number of sequence graphs that have been used in assembly and multiple sequence alignment. Paths provide coordinate systems relative to genomes encoded in the graph, allowing stable mappings to be produced even if the structure of the graph is changed.
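
A toy version of this model may help make the three components concrete. The sketch below is not vg's data structure or API: the class and method names are hypothetical, and edges are simplified to a single forward orientation, whereas vg edges can attach to either end of a node.

```python
from dataclasses import dataclass, field


@dataclass
class VariationGraph:
    nodes: dict = field(default_factory=dict)  # node id -> sequence label
    edges: set = field(default_factory=set)    # (source id, target id)
    paths: dict = field(default_factory=dict)  # path name -> list of node ids

    def add_node(self, node_id, sequence):
        self.nodes[node_id] = sequence

    def add_edge(self, source, target):
        self.edges.add((source, target))

    def add_path(self, name, walk):
        # A path is only valid if each step follows an existing edge.
        for source, target in zip(walk, walk[1:]):
            if (source, target) not in self.edges:
                raise ValueError(f"no edge {source} -> {target}")
        self.paths[name] = list(walk)

    def path_sequence(self, name):
        # Spell out the genome encoded by walking the path.
        return "".join(self.nodes[n] for n in self.paths[name])


# A single-base variant: nodes 2 and 3 are alternative alleles.
graph = VariationGraph()
for node_id, seq in [(1, "GATT"), (2, "A"), (3, "C"), (4, "ACA")]:
    graph.add_node(node_id, seq)
for edge in [(1, 2), (1, 3), (2, 4), (3, 4)]:
    graph.add_edge(*edge)
graph.add_path("ref", [1, 2, 4])
graph.add_path("alt", [1, 3, 4])
print(graph.path_sequence("ref"), graph.path_sequence("alt"))
```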

Please cite: Erik Garrison, Jouni Sirén, Adam M Novak, Glenn Hickey, Jordan M Eizenga, Eric T Dawson, William Jones, Shilpa Garg, Charles Markello, Michael F Lin, Benedict Paten and Richard Durbin: Variation graph toolkit improves read mapping by representing genetic variation in the reference. (PubMed) Nature Biotechnology 36(9):875–879 (2018)
Registry entries: Bioconda 
vsearch
tool for processing metagenomic sequences
Versions of package vsearch
ReleaseVersionArchitectures
sid2.29.1-1amd64,arm64,mips64el,ppc64el,riscv64
stretch2.3.4-1amd64
buster2.10.4-1amd64
bookworm2.22.1-1amd64,arm64,ppc64el
bullseye2.15.2-3amd64,arm64,ppc64el
trixie2.29.1-1amd64,arm64,mips64el,ppc64el,riscv64
Popcon: 4 users (9 upd.)*
Versions and Archs
License: DFSG free
Git

Versatile 64-bit multithreaded tool for processing metagenomic sequences, including searching, clustering, chimera detection, dereplication, sorting, masking and shuffling

The aim of this project is to create an alternative to the USEARCH tool developed by Robert C. Edgar (2010). The new tool should:

  • have a 64-bit design that handles very large databases and much more than 4GB of memory
  • be as accurate as or more accurate than usearch
  • be as fast as or faster than usearch
The package is enhanced by the following packages: vsearch-examples
Please cite: Torbjørn Rognes, Tomáš Flouri, Ben Nichols, Christopher Quince and Frédéric Mahé: VSEARCH: a versatile open source tool for metagenomics. (eprint) PeerJ 4:e2584
Registry entries: Bio.tools  Bioconda 
vt
toolset for short variant discovery in genetic sequence data
Versions of package vt
ReleaseVersionArchitectures
bullseye0.57721+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid0.57721+ds-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie0.57721+ds-3amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm0.57721+ds-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

vt is a variant tool set that discovers short variants from Next Generation Sequencing data.

Vt-normalize is a tool to normalize representation of genetic variants in the VCF. Variant normalization is formally defined as the consistent representation of genetic variants in an unambiguous and concise way. In vt a simple general algorithm to enforce this is implemented.
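
As an illustration of what normalization means here, the sketch below left-aligns and trims a single biallelic variant, following the general idea described in the cited paper. It is not vt's code: the function name and arguments are hypothetical, positions are 1-based, and the reference is passed as a plain string.

```python
def normalize(pos, ref, alt, genome):
    """Left-align and trim one variant against `genome`.

    `pos` is the 1-based position of the first base of `ref`.
    Returns the normalized (pos, ref, alt).
    """
    changed = True
    while changed:
        changed = False
        # Drop a shared rightmost base.
        if ref and alt and ref[-1] == alt[-1]:
            ref, alt = ref[:-1], alt[:-1]
            changed = True
        # If an allele became empty, extend both one base to the left.
        if not ref or not alt:
            if pos == 1:
                raise ValueError("cannot extend left of the sequence start")
            pos -= 1
            base = genome[pos - 1]
            ref, alt = base + ref, base + alt
            changed = True
    # Drop shared leftmost bases while both alleles keep at least one base.
    while len(ref) > 1 and len(alt) > 1 and ref[0] == alt[0]:
        ref, alt = ref[1:], alt[1:]
        pos += 1
    return pos, ref, alt


genome = "GGGCACACACAGGG"
# A CA deletion reported in the middle of the CA repeat.
deletion = normalize(7, "ACA", "A", genome)
# A substitution padded with two unnecessary leading bases.
snp = normalize(4, "CAC", "CAT", genome)
print(deletion, snp)
```

Both differently written deletions of one CA unit in the repeat end up with the same left-most, shortest representation, which is the point of normalizing before comparing call sets.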

The package is enhanced by the following packages: vt-examples
Please cite: Adrian Tan, Gonçalo R. Abecasis and Hyun Min Kang: Unified representation of genetic variants. (PubMed,eprint) Bioinformatics 31(13):2202–2204 (2015)
Registry entries: Bio.tools  Bioconda 
workrave
Tool to prevent repetitive strain injury
Maintainer: Francois Marier
Versions of package workrave
ReleaseVersionArchitectures
jessie1.10.4-3amd64,armel,armhf,i386
stretch1.10.16-1amd64,arm64,armel,armhf,i386,mips,mips64el,mipsel,ppc64el,s390x
sid1.10.52-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
trixie1.10.52-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64,s390x
bookworm1.10.50-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
bullseye1.10.44-7.1amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
buster1.10.23-5amd64,arm64,armhf,i386
Debtags of package workrave:
hardwareinput
interfacex11
roleprogram
scopeutility
suitegnome
uitoolkitgtk
usemonitor
x11applet, application
Popcon: 213 users (19 upd.)*
Versions and Archs
License: DFSG free
Git

Workrave is a program that assists in the recovery from and prevention of Repetitive Strain Injury (RSI). The program shows periodic alerts prompting you to take micro-pauses and rest breaks, and restricts you to a maximum daily usage limit.

It includes a taskbar applet that works with GNOME and KDE, and it can monitor your activity over the network if you switch between different computers as part of your daily work.

Workrave offers many more configuration options than other similar tools.

wtdbg2
de novo sequence assembler for long noisy reads
Versions of package wtdbg2
ReleaseVersionArchitectures
bullseye2.5-7amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el,s390x
sid2.5-10amd64
trixie2.5-10amd64
bookworm2.5-9amd64
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

Wtdbg2 is a de novo sequence assembler for long noisy reads produced by PacBio or Oxford Nanopore Technologies (ONT). It assembles raw reads without error correction and then builds the consensus from intermediate assembly output. Wtdbg2 is able to assemble the human and even the 32Gb Axolotl genome at a speed tens of times faster than CANU and FALCON while producing contigs of comparable base accuracy.

During assembly, wtdbg2 chops reads into 1024bp segments, merges similar segments into a vertex and connects vertices based on the segment adjacency on reads. The resulting graph is called a fuzzy Bruijn graph (FBG). It is akin to a De Bruijn graph but permits mismatches/gaps and keeps read paths when collapsing k-mers. The use of FBG distinguishes wtdbg2 from the majority of long-read assemblers.

The package is enhanced by the following packages: wtdbg2-examples
Please cite: Jue Ruan and Heng Li: Fast and accurate long-read assembly with wtdbg2. (PubMed,eprint) Nature Methods 17(2):155-158 (2020)
yanagiba
filter low quality Oxford Nanopore reads basecalled with Albacore
Versions of package yanagiba
ReleaseVersionArchitectures
bullseye1.0.0-2all
bookworm1.0.0-5all
trixie1.0.0-5all
sid1.0.0-5all
Popcon: 0 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Yanagiba is used to filter short or low quality Oxford Nanopore reads which have been basecalled with Albacore. It takes fastq.gz and an Albacore summary file as input. If no Albacore summary file is provided, it attempts to calculate the mean qscore directly from the fastq file using NanoMath. Note: Calculated quality scores appear to be lower for reads called with Metrichor; you may need to lower your minqual setting in this case.
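
To illustrate what a mean qscore computed directly from a fastq record can look like, here is a small sketch. It assumes Phred+33 encoding and averages error probabilities rather than raw scores, which is one common convention; whether this matches the exact calculation used by Yanagiba or NanoMath is an assumption. The function names and the `min_len` and `min_qual` parameters are hypothetical.

```python
import math


def mean_qscore(quality_string, offset=33):
    """Mean Phred quality, averaged in error-probability space."""
    if not quality_string:
        return 0.0
    probs = [10 ** (-(ord(char) - offset) / 10) for char in quality_string]
    return -10 * math.log10(sum(probs) / len(probs))


def keep_read(sequence, quality_string, min_len, min_qual):
    """Keep a read only if it is long enough and of sufficient quality."""
    return len(sequence) >= min_len and mean_qscore(quality_string) >= min_qual


# '+' encodes Q10 and '5' encodes Q20 under Phred+33.
uniform = mean_qscore("++++")
mixed = mean_qscore("5+")
print(uniform, mixed, keep_read("ACGT", "++++", min_len=4, min_qual=7))
```

Averaging probabilities means a few very poor bases pull the mean down more than a plain average of the scores would.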

yanosim
read simulator for nanopore DRS datasets
Versions of package yanosim
ReleaseVersionArchitectures
bullseye0.1-3amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
trixie0.1-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
sid0.1-5amd64,arm64,armel,armhf,i386,mips64el,ppc64el,riscv64
bookworm0.1-5amd64,arm64,armel,armhf,i386,mips64el,mipsel,ppc64el
Popcon: 1 users (2 upd.)*
Versions and Archs
License: DFSG free
Git

Yanosim has three options:

  1. yanosim model:

    Creates a model of mismatches, insertions and deletions based on an alignment of nanopore DRS reads to a reference. Reads should be aligned to a transcriptome, i.e. without spliced alignment, using minimap2. They should have the cs tag.

  2. yanosim quantify:

    Quantifies the number of reads mapping to each transcript in a reference, so that the right number of reads can be simulated.

  3. yanosim simulate:

    Given a model created using yanosim model, and per-transcript read counts created using yanosim quantify, simulates error-prone long reads from the given fasta file.

Official Debian packages with lower relevance

libsimde-dev
Implementations of SIMD instructions for all systems
Versions of package libsimde-dev
ReleaseVersionArchitectures
trixie0.8.2-1all
bullseye0.7.2-4all
bookworm0.7.4~rc2-2all
sid0.8.2-1all
experimental0.8.2~rc1-1amd64,arm64,armel,armhf,i386,ppc64el,riscv64,s390x
Popcon: 12 users (31 upd.)*
Versions and Archs
License: DFSG free
Git

SIMDe provides fast, portable implementations of SIMD intrinsics on hardware which doesn't natively support them, such as calling SSE functions on ARM. There is no performance penalty if the hardware supports the native implementation (e.g., SSE/AVX runs at full speed on x86, NEON on ARM, etc.).

This makes porting code to other architectures much easier in a few key ways:

First, instead of forcing you to rewrite everything for each architecture, SIMDe lets you get a port up and running almost effortlessly. You can then start working on switching the most performance-critical sections to native intrinsics, improving performance gradually. SIMDe lets (for example) SSE/AVX and NEON code exist side-by-side, in the same implementation.

Second, SIMDe makes it easier to write code targeting ISA extensions you don't have convenient access to. You can run NEON code on your x86 machine without an emulator. Obviously you'll eventually want to test on the actual hardware you're targeting, but for most development, SIMDe can provide a much easier path.

SIMDe takes a very different approach from most other SIMD abstraction layers in that it aims to expose the entire functionality of the underlying instruction set. Instead of limiting functionality to the lowest common denominator, SIMDe tries to minimize the amount of effort required to port while still allowing you the space to optimize as needed.

The current focus is on writing complete portable implementations, though a large number of functions already have accelerated implementations using one (or more) of the following:

  • SIMD intrinsics from other ISA extensions (e.g., using NEON to implement SSE).
  • Compiler-specific vector extensions and built-ins such as __builtin_shufflevector and __builtin_convertvector
  • Compiler auto-vectorization hints, using:
      • OpenMP 4 SIMD
      • Cilk Plus
      • GCC loop-specific pragmas
      • clang pragma loop hint directives
python3-anndata
annotated gene by sample numpy matrix
Versions of package python3-anndata
ReleaseVersionArchitectures
bullseye0.7.5+ds-3all
bookworm0.8.0-4all
sid0.10.6-1all
upstream0.11.0
Popcon: 0 users (1 upd.)*
Newer upstream!
License: DFSG free
Git

AnnData provides a scalable way of keeping track of data together with learned annotations. It is used within Scanpy, for which it was initially developed. Both packages have been introduced in Genome Biology (2018).

Please cite: F. Alexander Wolf, Philipp Angerer and Fabian J. Theis: SCANPY: large-scale single-cell gene expression data analysis. (PubMed) Genome Biol. 19:15 (2018)
Registry entries: Bioconda 
python3-mmtf
binary encoding of biological structures (Python 3)
Versions of package python3-mmtf
ReleaseVersionArchitectures
trixie1.1.3-1all
bullseye1.1.2-3all
bookworm1.1.3-1all
sid1.1.3-1all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

The macromolecular transmission format (MMTF) is a binary encoding of biological structures.

This package installs the library for Python 3.

r-bioc-rsubread
Subread Sequence Alignment and Counting for R
Versions of package r-bioc-rsubread
ReleaseVersionArchitectures
sid2.18.0-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
trixie2.18.0-1amd64,arm64,mips64el,ppc64el,riscv64,s390x
bookworm2.12.2-1amd64,arm64,mips64el,ppc64el,s390x
bullseye2.4.2-1amd64,arm64,mips64el,ppc64el,s390x
upstream2.20.0
Popcon: 14 users (4 upd.)*
Newer upstream!
License: DFSG free
Git

Alignment, quantification and analysis of second and third generation sequencing data. Includes functionality for read mapping, read counting, SNP calling, structural variant detection and gene fusion discovery.

Can be applied to all major sequencing technologies and to both short and long sequence reads.

Please cite: Yang Liao, Gordon K Smyth and Wei Shi: The R package Rsubread is easier, faster, cheaper and better for alignment and quantification of RNA sequencing reads. (eprint) Nucleic Acids Research 47(8):e47 (2019)
Registry entries: Bio.tools  Bioconda 

Debian packages in contrib or non-free

bcbio
toolkit for analysing high-throughput sequencing data
Versions of package bcbio
ReleaseVersionArchitectures
bullseye1.2.5-1 (contrib)all
sid1.2.9-2 (contrib)all
bookworm1.2.9-2 (contrib)all
buster1.1.2-3all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free, but needs non-free components
Git

This package installs the command line tools of the bcbio-nextgen toolkit implementing best-practice pipelines for fully automated high throughput sequencing analysis.

A high-level configuration file specifies inputs and analysis parameters to drive a parallel pipeline that handles distributed execution, idempotent processing restarts and safe transactional steps. The project contributes a shared community resource that handles the data processing component of sequencing analysis, providing researchers with more time to focus on the downstream biology.

This package builds, and having it in Debian unstable helps the Debian developers to synchronize their efforts. But unless a series of external dependencies is installed manually, the functionality of bcbio in Debian is only a shadow of itself. Please use the official distribution of bcbio for the time being, which means "use conda". The TODO file in the Debian directory should give an overview of progress on the Debian packaging.

Registry entries: Bio.tools  Bioconda 
python3-seqcluster
analysis of small RNA in NGS data
Versions of package python3-seqcluster
ReleaseVersionArchitectures
trixie1.2.9+ds-4 (contrib)all
sid1.2.9+ds-4 (contrib)all
bullseye1.2.7+ds-1 (contrib)all
bookworm1.2.9+ds-3 (contrib)all
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free, but needs non-free components
Git

Identifies small RNA sequences of all sorts in RNA sequencing data. This is especially helpful for the identification of RNA that is neither coding nor belonging to the already well-established group of miRNA, to which many tools are constrained.

This package provides the Python module. For executables see the package 'seqcluster'.

Please cite: Lorena Pantano, Marc R. Friedländer, Georgia Escaramís, Esther Lizano, Joan Pallarès-Albanell, Isidre Ferrer, Xavier Estivill and Eulàlia Martí: Specific small-RNA signatures in the amygdala at premotor and motor stages of Parkinson's disease revealed by deep sequencing analysis. (PubMed) Bioinformatics (2015)
Registry entries: Bio.tools 
varscan
variant detection in next-generation sequencing data
Versions of package varscan
ReleaseVersionArchitectures
bookworm2.4.3+dfsg-4 (non-free)amd64
bullseye2.4.3+dfsg-3 (non-free)amd64
jessie2.3.7+dfsg-1 (non-free)amd64
sid2.4.3+dfsg-4 (non-free)amd64
trixie2.4.3+dfsg-4 (non-free)amd64
stretch2.4.3+dfsg-1 (non-free)amd64
buster2.4.3+dfsg-3 (non-free)amd64
Popcon: 1 users (1 upd.)*
Versions and Archs
License: non-free
Git

Variant detection in massively parallel sequencing. For one sample, calls SNPs, indels, and consensus genotypes. For tumor-normal pairs, further classifies each variant as Germline, Somatic, or LOH, and also detects somatic copy number changes.

Please cite: Daniel C. Koboldt, Qunyuan Zhang, David E. Larson, Dong Shen, Michael D. McLellan, Ling Lin, Christopher A. Miller, Elaine R. Mardis, Li Ding and Richard K. Wilson: VarScan 2: somatic mutation and copy number alteration discovery in cancer by exome sequencing. (PubMed,eprint) Genome Res. 22(3):568-576 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
vienna-rna
RNA sequence analysis
Versions of package vienna-rna
ReleaseVersionArchitectures
sid2.6.4+dfsg-1 (non-free)amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x
bullseye2.4.17+dfsg-2 (non-free)amd64,arm64,armel,armhf,mips64el,mipsel,ppc64el
bookworm2.5.1+dfsg-1 (non-free)amd64,arm64,mips64el,ppc64el,s390x
trixie2.6.4+dfsg-1 (non-free)amd64,arm64,armel,armhf,mips64el,ppc64el,riscv64,s390x
upstream2.7.0
Popcon: 0 users (1 upd.)*
Newer upstream!
License: non-free
Git

The Vienna RNA Package consists of a C code library and several stand-alone programs for the prediction and comparison of RNA secondary structures. It is developed and maintained by the group of Ivo Hofacker in Vienna.

RNA secondary structure prediction through energy minimization is the most used function in the package. It provides three kinds of dynamic programming algorithms for structure prediction:

  • the minimum free energy algorithm of (Zuker & Stiegler 1981) which yields a single optimal structure,
  • the partition function algorithm of (McCaskill 1990) which calculates base pair probabilities in the thermodynamic ensemble, and
  • the suboptimal folding algorithm of (Wuchty et al. 1999) which generates all suboptimal structures within a given energy range of the optimal energy.

For secondary structure comparison, the package contains several measures of distance (dissimilarities) using either string alignment or tree-editing (Shapiro & Zhang 1990). Finally, an algorithm is provided to design sequences with a predefined structure (inverse folding). The RNAforester package is a tool for aligning RNA secondary structures, and its user interface integrates with those of the tools of the Vienna RNA package.

Please cite: Ronny Lorenz, Stephan H. Bernhart, Christian Höner zu Siederdissen, Hakim Tafer, Christoph Flamm, Peter F. Stadler and Ivo L. Hofacker: ViennaRNA Package 2.0. (eprint) Algorithms for Molecular Biology 6(1):26 (2011)
Registry entries: Bio.tools  SciCrunch 

Debian packages in experimental

libtensorflow-framework2
Computation using data flow graphs for scalable machine learning
Versions of package libtensorflow-framework2
ReleaseVersionArchitectures
experimental2.3.1-1amd64
Popcon: 0 users (0 upd.)*
Versions and Archs
License: DFSG free
Git

TensorFlow is an open source software library for numerical computation using data flow graphs. The graph nodes represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture enables you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code.

This package ships shared object libtensorflow_framework.so.2.0

A shared object which includes registration mechanisms for ops and kernels. Does not include the implementations of any ops or kernels. Instead, the library which loads libtensorflow_framework.so (e.g. _pywrap_tensorflow_internal.so for Python, libtensorflow.so for the C API) is responsible for registering ops with libtensorflow_framework.so. In addition to this core set of ops, user libraries which are loaded (via TF_LoadLibrary/tf.load_op_library) register their ops and kernels with this shared object directly.

For example, from Python tf.load_op_library loads a custom op library (via dlopen() on Linux), the library finds libtensorflow_framework.so (no filesystem search takes place, since libtensorflow_framework.so has already been loaded by pywrap_tensorflow) and registers its ops and kernels via REGISTER_OP and REGISTER_KERNEL_BUILDER (which use symbols from libtensorflow_framework.so), and pywrap_tensorflow can then use these ops. Since other languages use the same libtensorflow_framework.so, op libraries are language agnostic.

Please cite: Martín Abadi, Ashish Agarwal, Paul Barham, Eugene Brevdo, Zhifeng Chen, Craig Citro, Greg S. Corrado, Andy Davis, Jeffrey Dean, Matthieu Devin, Sanjay Ghemawat, Ian Goodfellow, Andrew Harp, Geoffrey Irving, Michael Isard, Rafal Jozefowicz, Yangqing Jia, Lukasz Kaiser, Manjunath Kudlur, Josh Levenberg, Dan Mané, Mike Schuster, Rajat Monga, Sherry Moore, Derek Murray, Chris Olah, Jonathon Shlens, Benoit Steiner, Ilya Sutskever, Kunal Talwar, Paul Tucker, Vincent Vanhoucke, Vijay Vasudevan, Fernanda Viégas, Oriol Vinyals, Pete Warden, Martin Wattenberg, Martin Wicke, Yuan Yu and Xiaoqiang Zheng: TensorFlow: Large-scale machine learning on heterogeneous systems. (2015)
Remark of Debian Med team: https://lists.debian.org/debian-devel/2019/10/msg00168.html

The framework should also generate python3-tensorflow as a pre-dependency for streamlit

Packaging has started and developers might try the packaging code in VCS

arvados
managing and analyzing biomedical big data
Versions of package arvados
ReleaseVersionArchitectures
VCS2.0.3-1all
Versions and Archs
License: Apache-2.0 #FIXME
Debian package not available
Git
Version: 2.0.3-1

Arvados is an open source platform for managing, processing, and sharing genomic and other large scientific and biomedical data. With Arvados, bioinformaticians run and scale compute-intensive workflows, developers create biomedical applications, and IT administrators manage large compute and storage resources.

auspice
web app for visualizing pathogen evolution
Versions of package auspice
ReleaseVersionArchitectures
VCS2.10.0-1all
Versions and Archs
License: AGPL-3
Debian package not available
Git
Version: 2.10.0-1

Nextstrain is an open-source project to harness the scientific and public health potential of pathogen genome data. We provide a continually updated view of publicly available data with powerful analytics and visualizations showing pathogen evolution and epidemic spread. Our goal is to aid epidemiological understanding and improve outbreak response.

Remark of Debian Med team: That's a nodejs package. Unfortunately lots of preconditions are missing

Comments welcome whether it makes sense to package about 20 nodejs packages. A list can be found as "debian/TODO" or as comments in debian/control in Salsa.

blat
BLAST-Like Alignment Tool
Versions of package blat
ReleaseVersionArchitectures
VCS35-1all
Versions and Archs
License: FreeForScientificUse
Debian package not available
Git
Version: 35-1

BLAT on DNA is designed to quickly find sequences of 95% and greater similarity of length 25 bases or more. It may miss more divergent or shorter sequence alignments. It will find perfect sequence matches of 25 bases, and sometimes find them down to 20 bases. BLAT on proteins finds sequences of 80% and greater similarity of length 20 amino acids or more. In practice DNA BLAT works well on primates, and protein blat on land vertebrates.

BLAT is not BLAST. DNA BLAT works by keeping an index of the entire genome in memory. The index consists of all non-overlapping 11-mers except for those heavily involved in repeats. The index takes up a bit less than a gigabyte of RAM. The genome itself is not kept in memory, allowing BLAT to deliver high performance on a reasonably priced Linux box. The index is used to find areas of probable homology, which are then loaded into memory for a detailed alignment. Protein BLAT works in a similar manner, except with 4-mers rather than 11-mers. The protein index takes a little more than 2 gigabytes.
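
The indexing idea described above can be sketched in a few lines. This is an illustration, not BLAT's implementation: it omits the exclusion of repeat-heavy k-mers and the detailed alignment stage, the function names are hypothetical, and the example uses k=3 instead of 11 to stay readable.

```python
from collections import defaultdict


def build_index(genome, k=11):
    """Index the non-overlapping k-mers of the genome by start offset."""
    index = defaultdict(list)
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return index


def find_hits(index, query, k=11):
    """Look up every overlapping k-mer of the query in the index.

    Returns (query offset, genome offset) pairs marking candidate
    regions that a later stage would align in detail.
    """
    hits = []
    for q_start in range(len(query) - k + 1):
        for g_start in index.get(query[q_start:q_start + k], ()):
            hits.append((q_start, g_start))
    return hits


genome = "ACGTTGCAAGGC"
index = build_index(genome, k=3)
hits = find_hits(index, "GTTGCA", k=3)
print(sorted(index), hits)
```

Indexing only non-overlapping k-mers keeps the index roughly k times smaller than indexing every position, while scanning all overlapping k-mers of the query still finds matches at any offset.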

Please cite: W. Jim Kent: BLAT--the BLAST-like alignment tool. (PubMed,eprint) Genome Research 12(4):656-64 (2002)
Registry entries: Bio.tools  SciCrunch 
chime
COVID-19 Hospital Impact Model for Epidemics
Versions of package chime
ReleaseVersionArchitectures
VCS0.2.1-1all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 0.2.1-1

Penn Medicine - COVID-19 Hospital Impact Model for Epidemics

This tool was developed by the Predictive Healthcare team at Penn Medicine. For questions and comments please see our contact page. Code can be found on Github. Join our Slack channel if you would like to get involved!

The estimated number of currently infected individuals is 533. The 91 confirmed cases in the region imply a 17% rate of detection. This is based on current inputs for Hospitalizations (4), Hospitalization rate (5%), Region size (4119405), and Hospital market share (15%).

An initial doubling time of 6 days and a recovery time of 14.0 days imply an R_0 of 2.71.

Mitigation: A 0% reduction in social contact after the onset of the outbreak reduces the doubling time to 6.0 days, implying an effective R_t of 2.71.
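
The example figures quoted above can be reproduced with a few lines of arithmetic. The sketch below assumes a simple SIR-style relation between doubling time, recovery time and R_0; that these are exactly the formulas used by the tool is an assumption, and the function names are hypothetical.

```python
def estimated_infected(hospitalized, hospitalization_rate, market_share):
    """Scale hospital admissions up to a regional infection estimate."""
    return hospitalized / market_share / hospitalization_rate


def basic_reproduction_number(doubling_time, recovery_days):
    """R_0 from doubling time, assuming SIR dynamics early in an outbreak."""
    growth_rate = 2 ** (1 / doubling_time) - 1  # daily growth of infections
    gamma = 1 / recovery_days                   # daily recovery rate
    beta = growth_rate + gamma                  # daily transmission rate
    return beta / gamma


infected = estimated_infected(4, hospitalization_rate=0.05, market_share=0.15)
detection_rate = 91 / infected
r0 = basic_reproduction_number(doubling_time=6, recovery_days=14.0)
print(int(infected), round(detection_rate * 100), round(r0, 2))
```

With no reduction in social contact the transmission rate is unchanged, so under these assumptions the effective R_t equals R_0.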

Remark of Debian Med team: Needs streamlit (see below)
covpipe
pipeline to generate consensus sequences from NGS reads
Versions of package covpipe
ReleaseVersionArchitectures
VCS3.0.6-1all
Versions and Archs
License: GPL-3+
Debian package not available
Git
Version: 3.0.6-1

CovPipe is a pipeline to generate consensus sequences from NGS reads based on a reference sequence. The pipeline is tailored to be used for SARS-CoV-2 data, but may be used for other viruses.

Genomic variants of your NGS data in comparison to a reference will be determined. These variants will be included into the reference and form the consensus sequences. See below for further details on the determined set of consensus sequences.
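
The idea of including variants into the reference to form a consensus can be illustrated with a deliberately small sketch. It handles single-base substitutions only and is not CovPipe's implementation; the function name and the variant tuple layout are hypothetical.

```python
def apply_snvs(reference, snvs):
    """Return a consensus with single-base substitutions applied.

    `snvs` is an iterable of (1-based position, reference base,
    alternative base) tuples.
    """
    consensus = list(reference)
    for pos, ref_base, alt_base in snvs:
        if consensus[pos - 1] != ref_base:
            raise ValueError(f"reference mismatch at position {pos}")
        consensus[pos - 1] = alt_base
    return "".join(consensus)


consensus = apply_snvs("ACGTACGT", [(2, "C", "T"), (7, "G", "A")])
print(consensus)
```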

ensembl-vep
Variant Effect Predictor predicting the functional effects of genomic variants
Versions of package ensembl-vep
ReleaseVersionArchitectures
VCS100.2-1all
Versions and Archs
License: Apache-2.0
Debian package not available
Git
Version: 100.2-1

The Ensembl Variant Effect Predictor predicts the functional effects of genomic variants. It has three components:

  • VEP (Variant Effect Predictor) predicts the functional effects of genomic variants.
  • Haplosaurus uses phased genotype data to predict whole-transcript haplotype sequences.
  • Variant Recoder translates between different variant encodings.
Please cite: William McLaren, Laurent Gil, Sarah E. Hunt, Harpreet Singh Riat, Graham R. S. Ritchie, Anja Thormann, Paul Flicek and Fiona Cunningham: The Ensembl Variant Effect Predictor. (PubMed,eprint) Genome Biology 17(1):122 (2016)
Registry entries: Bioconda 
fieldbioinformatics
pipeline for virus identification with a Nanopore sequencer
Versions of package fieldbioinformatics
ReleaseVersionArchitectures
VCS1.1.3-1all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 1.1.3-1

This is the ARTIC bioinformatics pipeline for working with virus sequencing data, sequenced with nanopore. It implements a complete bioinformatics protocol to take the output from the Nanopore sequencer and determine consensus genome sequences. Includes basecalling, de-multiplexing, mapping, polishing and consensus generation.

An outbreak of SARS-CoV-2, Ebola, ... something unknown? This software is field-proven.

Registry entries: Bio.tools  Bioconda 
flappie
flip-flop basecaller for Oxford Nanopore reads
Versions of package flappie
ReleaseVersionArchitectures
VCS2.1.3+ds-1all
Versions and Archs
License: Oxford-Nanopore-PL-1.0
Debian package not available
Git
Version: 2.1.3+ds-1

Basecall Fast5 reads using flip-flop basecalling.

Features

  • Flip-flop basecalling for the MinION platform
      • R9.4.1 (Native or PCR libraries)
      • R10C (PCR libraries only)
  • Basecalling of 5mC in CpG context for R9.4.1, PromethION platform
graphmap2
highly sensitive and accurate mapper for long, error-prone reads
Versions of package graphmap2
ReleaseVersionArchitectures
VCS0.6.4-1all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 0.6.4-1

GraphMap2 is a highly sensitive and accurate mapper for long, error-prone reads. The mapping algorithm is designed to analyse nanopore sequencing reads: it progressively refines candidate alignments to robustly handle potentially high error rates and uses a fast graph traversal to align long reads with speed and high precision (>95%). Evaluation on MinION sequencing data sets against short- and long-read mappers indicates that GraphMap increases mapping sensitivity by 10–80% and maps >95% of bases. GraphMap alignments enabled single-nucleotide variant calling on the human genome with increased sensitivity (15%) over the next best mapper, precise detection of structural variants from length 100 bp to 4 kbp, and species and strain-specific identification of pathogens using MinION reads.

Please cite: Ivan Sović, Mile Šikić, Andreas Wilm, Shannon Nicole Fenlon, Swaine Chen and Niranjan Nagarajan: Fast and sensitive mapping of nanopore sequencing reads with GraphMap. (PubMed,eprint) Nature Communications 7(11307) (2016)
Registry entries: Bioconda 
manta
structural variant and indel caller for mapped sequencing data
Versions of package manta
Release  Version       Architectures
VCS      1.6.0+dfsg-1  all
Versions and Archs
License: GPL-3+
Debian package not available
Git
Version: 1.6.0+dfsg-1

Manta calls structural variants (SVs) and indels from mapped paired-end sequencing reads. It is optimized for analysis of germline variation in small sets of individuals and somatic variation in tumor/normal sample pairs. Manta discovers, assembles and scores large-scale SVs, medium-sized indels and large insertions within a single efficient workflow. The method is designed for rapid analysis on standard compute hardware: NA12878 at 50x genomic coverage is analyzed in less than 20 minutes on a 20 core server, and most WGS tumor/normal analyses can be completed within 2 hours. Manta combines paired and split-read evidence during SV discovery and scoring to improve accuracy, but does not require split-reads or successful breakpoint assemblies to report a variant in cases where there is strong evidence otherwise. It provides scoring models for germline variants in small sets of diploid samples and somatic variants in matched tumor/normal sample pairs. There is experimental support for analysis of unmatched tumor samples as well. Manta accepts input read mappings from BAM or CRAM files and reports all SV and indel inferences in VCF 4.1 format.

Please cite: Xiaoyu Chen, Ole Schulz-Trieglaff, Richard Shaw, Bret Barnes, Felix Schlesinger, Morten Källberg, Anthony J. Cox, Semyon Kruglyak and Christopher T. Saunders: Manta: rapid detection of structural variants and indels for germline and cancer sequencing applications. (PubMed,eprint) Bioinformatics 32(8):1220-1222 (2015)
Registry entries: Bio.tools  Bioconda 
medaka
sequence correction provided by ONT Research
Versions of package medaka
Release  Version       Architectures
VCS      1.0.3+dfsg-1  all
Versions and Archs
License: MPL-2.0
Debian package not available
Git
Version: 1.0.3+dfsg-1

Medaka is a tool to create a consensus sequence from nanopore sequencing data. This task is performed using neural networks applied to a pileup of individual sequencing reads against a draft assembly. It outperforms graph-based methods operating on basecalled data, and can be competitive with state-of-the-art signal-based methods, whilst being much faster.
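To illustrate what a "pileup of individual sequencing reads against a draft assembly" looks like, here is a minimal Python sketch. It is not Medaka's method: Medaka applies neural networks to the pileup, whereas this sketch substitutes a plain per-column majority vote. The function names are hypothetical, and reads are assumed to be already aligned to the draft without gaps.

```python
from collections import Counter


def pileup_columns(draft_length, aligned_reads):
    """Count the bases observed at each draft position.

    aligned_reads is a list of (start, read) pairs; ungapped alignment
    is an assumption made only to keep the sketch short.
    """
    columns = [Counter() for _ in range(draft_length)]
    for start, read in aligned_reads:
        for offset, base in enumerate(read):
            position = start + offset
            if 0 <= position < draft_length:
                columns[position][base] += 1
    return columns


def majority_consensus(draft, aligned_reads):
    """Replace each draft base by the most frequent base in its column.

    Positions without any read coverage keep the draft base.
    """
    columns = pileup_columns(len(draft), aligned_reads)
    return "".join(
        column.most_common(1)[0][0] if column else draft_base
        for column, draft_base in zip(columns, draft)
    )


draft = "ACGTT"
reads = [(0, "ACCT"), (1, "CCTT"), (0, "ACC")]
consensus = majority_consensus(draft, reads)
```

In this toy input all three reads disagree with the draft at the third position, so the vote replaces the draft base there.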

Features

  • Requires only basecalled data. (.fasta or .fastq)
  • Improved accuracy over graph-based methods (e.g. Racon).
  • 50X faster than Nanopolish (and can run on GPUs).
  • Methylation aggregation from Guppy .fast5 files.
  • Benchmarks are provided here.
  • Includes extras for implementing and training bespoke correction networks.
Registry entries: Bioconda 
nanoplot
plotting scripts for long read sequencing data
Versions of package nanoplot
Release  Version   Architectures
VCS      1.36.2-1  all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 1.36.2-1

NanoPlot provides plotting scripts for long read sequencing data.

These scripts perform data extraction from Oxford Nanopore sequencing data in the following formats:

  • fastq files (optionally compressed)
  • fastq files generated by albacore, guppy or MinKNOW containing additional information (optionally compressed)
  • sorted bam files
  • sequencing_summary.txt output table generated by albacore, guppy or MinKNOW basecalling (optionally compressed)
  • fasta files (optionally compressed)
  • multiple files of the same type can be offered simultaneously
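As a rough idea of the kind of data extraction involved, the following Python sketch collects read lengths from a FASTQ file (optionally gzip-compressed) and summarises them. It is not NanoPlot code; the function names are hypothetical, and records are assumed to use the common four-line FASTQ layout.

```python
import gzip
import statistics


def open_fastq(path):
    """Open a plain or gzip-compressed FASTQ file in text mode."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def read_lengths(lines):
    """Yield the length of every sequence line (second line of each record)."""
    for line_number, line in enumerate(lines):
        if line_number % 4 == 1:
            yield len(line.strip())


def summarize(lengths):
    """Return basic read-length statistics as a dictionary."""
    lengths = list(lengths)
    return {
        "reads": len(lengths),
        "mean": statistics.mean(lengths),
        "median": statistics.median(lengths),
        "max": max(lengths),
    }


# Any iterable of lines works, so a list stands in for an open file here.
example = ["@r1\n", "ACGT\n", "+\n", "IIII\n", "@r2\n", "AC\n", "+\n", "II\n"]
summary = summarize(read_lengths(example))
```

With a real file, `summarize(read_lengths(open_fastq(path)))` would produce the same kind of summary.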
Please cite: Wouter De Coster, Svenn D'Hert, Darrin T Schultz, Marc Cruts and Christine Van Broeckhoven: NanoPack: visualizing and processing long-read sequencing data. (PubMed,eprint) Bioinformatics 34(15):2666-2669 (2018)
Registry entries: Bioconda 
ncbi-magicblast
RNA-seq mapping tool
Versions of package ncbi-magicblast
Release  Version     Architectures
VCS      1.5.0+ds-1  all
Versions and Archs
License: PD
Debian package not available
Git
Version: 1.5.0+ds-1

Magic-BLAST is a tool for mapping large next-generation RNA or DNA sequencing runs against a whole genome or transcriptome. Each alignment optimizes a composite score, taking into account simultaneously the two reads of a pair, and in case of RNA-seq, locating the candidate introns and adding up the score of all exons. This is very different from other versions of BLAST, where each exon is scored as a separate hit and read-pairing is ignored.

Please cite: Grzegorz M. Boratyn, Jean Thierry-Mieg, Danielle Thierry-Mieg, Ben Busby and Thomas L. Madden: Magic-BLAST, an accurate RNA-seq aligner for long and short reads. (PubMed,eprint) BMC Bioinformatics 20(1):405 (2019)
Registry entries: Bioconda 
nextflow
DSL for data-driven computational pipelines
Versions of package nextflow
Release  Version         Architectures
VCS      23.10.1+dfsg-1  all
Versions and Archs
License: Apache-2.0
Debian package not available
Git
Version: 23.10.1+dfsg-1

Nextflow is a bioinformatics workflow manager that enables the development of portable and reproducible workflows. It supports deploying workflows on a variety of execution platforms including local, HPC schedulers, AWS Batch, Google Genomics Pipelines, and Kubernetes. Additionally, it helps manage workflow dependencies through built-in support for Conda, Docker, Singularity, and Modules.

Please cite: Paolo Di Tommaso, Maria Chatzou, Evan W Floden, Pablo Prieto Barja, Emilio Palumbo and Cedric Notredame: Nextflow enables reproducible computational workflows. (PubMed,eprint) Nature Biotechnology 35(4):316-319 (2017)
nextstrain-ncov
Nextstrain build for novel coronavirus (nCoV)
Versions of package nextstrain-ncov
Release  Version                     Architectures
VCS      0.0+git20200320.392dc1c-1   all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 0.0+git20200320.392dc1c-1

This is a Nextstrain build for novel coronavirus, alternately known as hCoV-19 or SARS-CoV-2, visible at https://nextstrain.org/ncov .

Remark of Debian Med team: needs auspice
nf-core-artic
nf-core ARTIC field bioinformatics viral genome pipeline
Versions of package nf-core-artic
Release  Version                     Architectures
VCS      0.0+git20200324.9edd884-1   all
Versions and Archs
License: free
Debian package not available
Git
Version: 0.0+git20200324.9edd884-1

RNA-seq workflow for nextflow, meant to form a bioinformatics pipeline for working with virus sequencing data sequenced with nanopore. This is a reimplementation for the nextflow workflow suite of the ARTIC fieldbioinformatics protocol.

At the moment, this package is not much more than a technical exercise. Upstream tagged it as "under development", and that is what it is here, too.

Please cite: Philip Ewels, Alexander Peltzer, Sven Fillinger, Harshil Patel, Johannes Alneberg, Andreas Wilm, Maxime Ulysse Garcia, Paolo Di Tommaso and Sven Nahnsen: The nf-core framework for community-curated bioinformatics pipelines. Nat Biotechnol. (2020)
oncofuse
predicting oncogenic potential of gene fusions
Versions of package oncofuse
Release  Version  Architectures
VCS      1.1.1-1  all
Versions and Archs
License: Apache-2.0
Debian package not available
Git
Version: 1.1.1-1

Oncofuse is a framework designed to estimate the oncogenic potential of de-novo discovered gene fusions. It uses several hallmark features and employs a Bayesian classifier to provide the probability of a given gene fusion being a driver mutation.

Please cite: Mikhail Shugay, Iñigo Ortiz de Mendíbil, José L. Vizmanos and Francisco J. Novo: Oncofuse: a computational framework for the prediction of the oncogenic potential of gene fusions. (PubMed,eprint) Bioinformatics 29(20):2539–2546 (2013)
Registry entries: SciCrunch 
optitype
precision HLA typing from next-generation sequencing data
Versions of package optitype
Release  Version  Architectures
VCS      1.3.2-1  all
Versions and Archs
License: <license>
Debian package not available
Git
Version: 1.3.2-1

OptiType is a novel HLA genotyping algorithm based on integer linear programming, capable of producing accurate 4-digit HLA genotyping predictions from NGS data by simultaneously selecting all major and minor HLA Class I alleles.

Please cite: András Szolek, Benjamin Schubert, Christopher Mohr, Marc Sturm, Magdalena Feldhahn and Oliver Kohlbacher: OptiType: precision HLA typing from next-generation sequencing data. (PubMed,eprint) Bioinformatics 30(23):3310–3316 (2014)
pangolin
Phylogenetic Assignment of Named Global Outbreak LINeages
Versions of package pangolin
Release  Version  Architectures
VCS      4.3.1-1  all
Versions and Archs
License: GPL-3+
Debian package not available
Git
Version: 4.3.1-1

Pangolin runs a multinomial logistic regression model trained against lineage assignments based on GISAID data.

Legacy pangolin runs using a guide tree and alignment hosted at cov-lineages/lineages. Some of this data is sourced from GISAID, but anonymised and encrypted to fit with guidelines. Appropriate permissions have been given and acknowledgements for the teams that have worked to provide the original SARS-CoV-2 genome sequences to GISAID are also hosted here.

Registry entries: Bioconda 
pomoxis
analysis components from Oxford Nanopore Research
Versions of package pomoxis
Release  Version  Architectures
VCS      0.3.4-1  all
Versions and Archs
License: MPL-2.0
Debian package not available
Git
Version: 0.3.4-1

Pomoxis comprises a set of basic bioinformatic tools tailored to nanopore sequencing. Notably tools are included for generating and analysing draft assemblies. Many of these tools are used by the research data analysis group at Oxford Nanopore Technologies.

Features

  • Wraps third party tools with known good default parameters and methods of use.
  • Creates an isolated environment with all third-party tools.
  • Streamlines common short analysis chains.
  • Integrates into katuali for performing more complex analysis pipelines.
Registry entries: Bioconda 
python3-idseq-dag
Pipeline engine for IDseq (Python 3)
Versions of package python3-idseq-dag
Release  Version  Architectures
VCS      4.2.3-1  all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 4.2.3-1

Idseq_dag is the pipeline execution engine for idseq (see idseq.net). It is a pipelining system that implements a directed acyclic graph (DAG) where the nodes (steps) correspond to individual python classes. The graph is defined using JSON.

The pipeline is executed locally, using the resources of the local machine. idseq-dag can also be installed and run inside a Docker container.

This package installs the library for Python 3.
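The idea of a JSON-defined DAG whose nodes map to Python classes can be sketched as follows. This is a generic illustration, not idseq-dag's actual schema or API: the JSON keys (`steps`, `name`, `class`, `after`), the step classes and `run_dag` are all hypothetical.

```python
import json
from graphlib import TopologicalSorter  # standard library, Python 3.9+


class LoadReads:
    def run(self, inputs):
        return ["acgt", "ttga"]


class Uppercase:
    def run(self, inputs):
        return [read.upper() for reads in inputs for read in reads]


class CountReads:
    def run(self, inputs):
        return sum(len(reads) for reads in inputs)


# Registry mapping the class names used in the JSON to Python classes.
STEP_CLASSES = {cls.__name__: cls for cls in (LoadReads, Uppercase, CountReads)}

DAG_JSON = """
{
  "steps": [
    {"name": "count", "class": "CountReads", "after": ["upper"]},
    {"name": "upper", "class": "Uppercase", "after": ["load"]},
    {"name": "load", "class": "LoadReads", "after": []}
  ]
}
"""


def run_dag(dag_json):
    """Run every step after its dependencies and return all step outputs."""
    steps = {step["name"]: step for step in json.loads(dag_json)["steps"]}
    sorter = TopologicalSorter({name: step["after"] for name, step in steps.items()})
    results = {}
    for name in sorter.static_order():  # dependencies come first
        step = steps[name]
        inputs = [results[dependency] for dependency in step["after"]]
        results[name] = STEP_CLASSES[step["class"]]().run(inputs)
    return results


results = run_dag(DAG_JSON)
```

The steps are deliberately listed out of order in the JSON: the topological sort, not the file order, decides when each step runs.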

python3-scanpy
Single-Cell Analysis in Python
Versions of package python3-scanpy
Release  Version  Architectures
VCS      1.9.6-1  all
Versions and Archs
License: BSD-3-Clause
Debian package not available
Git
Version: 1.9.6-1

Scanpy is a scalable toolkit for analyzing single-cell gene expression data built jointly with anndata. It includes preprocessing, visualization, clustering, trajectory inference and differential expression testing. The Python-based implementation efficiently deals with datasets of more than one million cells.

Please cite: F. Alexander Wolf, Philipp Angerer and Fabian J. Theis: SCANPY: large-scale single-cell gene expression data analysis. (eprint) Genome Biology 19(15) (2018)
Registry entries: Bio.tools  SciCrunch  Bioconda 
qualimap
evaluating next generation sequencing alignment data
Versions of package qualimap
Release  Version       Architectures
VCS      2.2.1+dfsg-1  all
Versions and Archs
License: GPL-2+
Debian package not available
Git
Version: 2.2.1+dfsg-1

Qualimap 2 provides both a Graphical User Interface (GUI) and a command-line interface to facilitate the quality control of alignment sequencing data and its derivatives like feature counts.

Supported types of experiments include:

  • Whole-genome sequencing
  • Whole-exome sequencing
  • RNA-seq (special mode available)
  • ChIP-seq

Qualimap examines sequencing alignment data in SAM/BAM files according to the features of the mapped reads and provides an overall view of the data that helps to detect biases in the sequencing and/or mapping of the data and eases decision-making for further analysis.

Qualimap provides multi-sample comparison of alignment and counts data.

  • Fast analysis across the reference of genome coverage and nucleotide distribution;
  • Easy to interpret summary of the main properties of the alignment data;
  • Analysis of the reads mapped inside/outside of the regions provided in GFF format;
  • Computation and analysis of read counts obtained from intersection of read alignments with genomic features;
  • Analysis of the adequacy of the sequencing depth in RNA-seq experiments;
  • Multi-sample comparison of alignment and counts data;
  • Clustering of epigenomic profiles.
Please cite: Fernando García-Alcalde, Konstantin Okonechnikov, José Carbonell, Luis M. Cruz, Stefan Götz, Sonia Tarazona, Joaquín Dopazo, Thomas F. Meyer and Ana Conesa: Qualimap: evaluating next-generation sequencing alignment data. (PubMed,eprint) Bioinformatics 28(20):2678-2679 (2012)
Registry entries: Bio.tools  SciCrunch  Bioconda 
quast
Quality Assessment Tool for Genome Assemblies
Versions of package quast
Release  Version       Architectures
VCS      5.0.2+dfsg-1  all
Versions and Archs
License: GPL-2
Debian package not available
Git
Version: 5.0.2+dfsg-1

QUAST evaluates genome assemblies. For metagenomes, see the MetaQUAST project. It works both with and without a given reference genome. The tool accepts multiple assemblies and thus allows comparisons between them.

Please cite: Alla Mikheenko, Andrey Prjibelski, Vladislav Saveliev, Dmitry Antipov and Alexey Gurevich: Versatile genome assembly evaluation with QUAST-LG. Bioinformatics 34(13):i142-i150 (2018)
Registry entries: Bio.tools  Bioconda 
r-cran-covid19
GNU R Coronavirus COVID-19 data acquisition and visualization
Versions of package r-cran-covid19
Release  Version  Architectures
VCS      0.2.1-1  all
Versions and Archs
License: GPL-3
Debian package not available
Git
Version: 0.2.1-1

This GNU R package provides pre-processed, ready-to-use, tidy format datasets of the 2019 Novel Coronavirus COVID-19 (2019-nCoV) epidemic. The latest data are downloaded in real-time, processed and merged with demographic indicators from several trusted sources. The package implements advanced data visualization across the space and time dimensions by means of animated mapping. Besides worldwide data, the package includes granular data for Italy, Switzerland and the Diamond Princess.

r-other-fastbaps
A fast genetic clustering algorithm that approximates a Dirichlet Process Mixture model
Versions of package r-other-fastbaps
Release  Version  Architectures
VCS      1.0.4-1  all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 1.0.4-1

Takes a multiple sequence alignment as input and clusters according to the 'no-admixture' model. It combines ideas from the Bayesian Hierarchical Clustering algorithm of Heller et al. and hierBAPS to produce a rapid and accurate clustering algorithm.

Please cite: Gerry Tonkin-Hill, John A Lees, Stephen D Bentley, Simon D W Frost and Jukka Corander: Fast hierarchical Bayesian analysis of population structure. (PubMed,eprint) Nucleic Acids Research 47(11):5539–5549 (2019)
Registry entries: Bioconda 
rosa
Removal of Spurious Antisense in biological RNA sequences
Versions of package rosa
Release  Version  Architectures
VCS      1.0-1    all
Versions and Archs
License: GPL-3.0+
Debian package not available
Git
Version: 1.0-1

In stranded RNA-Seq experiments it is possible to detect and measure antisense transcription, important since antisense transcripts impact gene transcription in several different ways. Stranded RNA-Seq determines the strand from which an RNA fragment originates, and so can be used to identify where antisense transcription may be implicated in gene regulation.

However, spurious antisense reads are often present in experiments, and can manifest at levels greater than 1% of sense transcript levels. This is enough to disrupt analyses by causing false antisense counts to dominate the set of genes with high antisense transcription levels.

The RoSA (Removal of Spurious Antisense) tool detects the presence of high levels of spurious antisense transcripts, by:

  • analysing ERCC spike-in data to find the ratio of antisense:sense transcripts in the spike-ins; or
  • using antisense and sense counts around splice sites to provide a set of gene-specific estimates; or
  • both.

Once RoSA has an estimate of the spurious antisense, expressed as a ratio of antisense:sense counts, RoSA will calculate a correction to the antisense counts based on the ratio. Where a gene-specific estimate is available for a gene, it will be used in preference to the global estimate obtained from either spike-ins or spliced reads.
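The correction step described above can be sketched in Python as follows. This is an illustration only and not RoSA's implementation (RoSA is an R package): the formula used here, subtracting ratio × sense count from the antisense count and clamping at zero, is an assumption, and the function name is hypothetical. The preference for a gene-specific ratio over the global one follows the description.

```python
def corrected_antisense(sense, antisense, global_ratio, gene_ratios=None):
    """Subtract the estimated spurious antisense counts for each gene.

    sense, antisense: dicts mapping gene name to read counts.
    global_ratio: spurious antisense:sense ratio (e.g. from spike-ins).
    gene_ratios: optional gene-specific ratios, preferred when present.
    """
    gene_ratios = gene_ratios or {}
    corrected = {}
    for gene, sense_count in sense.items():
        ratio = gene_ratios.get(gene, global_ratio)
        spurious = ratio * sense_count  # expected spurious antisense reads
        # Assumed rule: never report a negative corrected count.
        corrected[gene] = max(antisense.get(gene, 0) - spurious, 0.0)
    return corrected


sense = {"g1": 1000, "g2": 200}
antisense = {"g1": 30, "g2": 1}
corrected = corrected_antisense(sense, antisense, 0.01, gene_ratios={"g1": 0.02})
```

Here `g1` uses its gene-specific ratio of 0.02, while `g2` falls back to the global ratio of 0.01 and ends up with no remaining antisense signal.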

This package provides the library for the statistics suite R.

sailfish
RNA-seq expression estimation
Versions of package sailfish
Release  Version        Architectures
VCS      0.10.1+dfsg-1  all
Versions and Archs
License: GPL-3.0+
Debian package not available
Git
Version: 0.10.1+dfsg-1

RNA-seq is a technology for reading at least parts of the individual RNA sequences in a tissue sample. Assigning these reads to the genes that most likely encoded them (mapping) yields an estimate of how active (expressed) those genes were in that sample. The tricky parts of this process are the similarity between genes and the ability of genes to skip parts of their sequence (introns) in a variable but deterministic way. A single alternatively spliced gene may thus yield different sequences (isoforms), and the RNA-seq evaluation should ideally report on these, since they may be relevant for a disease.

Sailfish is particularly efficient at this process. It sidesteps the complexity by introducing an intermediate level of artificial, very short reads for which alternative splicing is of no concern. These can then be handled by "telephone-book"-like hashing techniques that are simple and very fast. The final results were found to be competitive with established tools like eXpress and Cufflinks.
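The hashing idea can be illustrated with a short Python sketch: transcripts are cut into very short fixed-length words (k-mers), these are stored in a hash table, and reads are looked up k-mer by k-mer without any alignment. This is not Sailfish's implementation; it only counts k-mer hits and performs no abundance estimation, and all function names are hypothetical.

```python
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return all overlapping substrings of length k."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def build_index(transcripts, k):
    """Map every k-mer to the set of transcripts that contain it."""
    index = defaultdict(set)
    for name, seq in transcripts.items():
        for kmer in kmers(seq, k):
            index[kmer].add(name)
    return index


def count_hits(reads, index, k):
    """Count, per transcript, how many read k-mers are found in the index."""
    hits = Counter()
    for read in reads:
        for kmer in kmers(read, k):
            for name in index.get(kmer, ()):
                hits[name] += 1
    return hits


transcripts = {"t1": "ACGTAC", "t2": "TTGACC"}
index = build_index(transcripts, k=3)
hits = count_hits(["CGTA", "GACC"], index, k=3)
```

Each lookup is a constant-time dictionary access, which is where the speed of this family of methods comes from.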

Please cite: Rob Patro, Stephen M Mount and Carl Kingsford: Sailfish enables alignment-free isoform quantification from RNA-seq reads using lightweight algorithms. (PubMed) Nature Biotechnology 32(5):462-464 (2014)
Registry entries: Bio.tools 
seqwish
alignment to variation graph inducer
Versions of package seqwish
Release  Version  Architectures
VCS      0.7.1-1  all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 0.7.1-1

Seqwish implements a lossless conversion from pairwise alignments between sequences to a variation graph encoding the sequences and their alignments. As input we typically take all-versus-all alignments, but the exact structure of the alignment set may be defined in an application specific way. This algorithm uses a series of disk-backed sorts and passes over the alignment and sequence inputs to allow the graph to be constructed from very large inputs that are commonly encountered when working with large numbers of noisy input sequences. Memory usage during construction and traversal is limited by the use of sorted disk-backed arrays and succinct rank/select dictionaries to record a queryable version of the graph.
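The core idea of turning pairwise alignments into a graph can be sketched with a union-find structure: every aligned pair of characters is merged into one graph node, and each input sequence becomes a path through those nodes. The Python sketch below is an in-memory toy under stated assumptions, not seqwish's disk-backed algorithm; the match tuple layout (sequence, start, sequence, start, length) and the function name are hypothetical.

```python
def induce_graph(sequences, matches):
    """Merge aligned characters and return each sequence as a path of node ids."""
    # Give every character of every sequence a global index.
    offsets, total = {}, 0
    for name, seq in sequences.items():
        offsets[name] = total
        total += len(seq)

    parent = list(range(total))  # union-find forest over all characters

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    # Each match block states that `length` characters are pairwise identical.
    for a, a_start, b, b_start, length in matches:
        for i in range(length):
            root_a = find(offsets[a] + a_start + i)
            root_b = find(offsets[b] + b_start + i)
            if root_a != root_b:
                parent[root_b] = root_a

    return {
        name: [find(offsets[name] + i) for i in range(len(seq))]
        for name, seq in sequences.items()
    }


sequences = {"a": "ACGT", "b": "ACCT"}
matches = [("a", 0, "b", 0, 2), ("a", 3, "b", 3, 1)]
paths = induce_graph(sequences, matches)
node_count = len({node for path in paths.values() for node in path})
```

The two sequences share their first two and last characters and diverge at the third, so the eight input characters collapse into five nodes, and each original sequence can still be read back along its path.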

Registry entries: Bioconda 
signalalign
HMM-HDP models for MinION signal alignments
Versions of package signalalign
Release  Version                          Architectures
VCS      0.0+git20170131.3293fad+dfsg-1   all
Versions and Archs
License: MIT
Debian package not available
Git
Version: 0.0+git20170131.3293fad+dfsg-1

MinION signal-level alignment and methylation detection using hidden Markov Models with hierarchical Dirichlet process kmer learning.

Nanopore sequencing is based on the principle of isolating a nanopore in a membrane separating buffered salt solutions, then applying a voltage across the membrane and monitoring the ionic current through the nanopore. The Oxford Nanopore Technologies (ONT) MinION sequences DNA by recording the ionic current as DNA strands are enzymatically guided through the nanopore. SignalAlign will align the ionic current from the MinION to a reference sequence using a trainable hidden Markov model (HMM). The emissions model for the HMM can either be the table of parametric normal distributions provided by ONT or a hierarchical Dirichlet process (HDP) mixture of normal distributions. The HDP models enable mapping of methylated bases to your reference sequence.

Registry entries: Bio.tools 
streamlit
fast way to build custom ML tools
Versions of package streamlit
Release  Version   Architectures
VCS      0.56.0-1  all
Versions and Archs
License: Apache-2.0
Debian package not available
Git
Version: 0.56.0-1

Streamlit lets you create apps for your machine learning projects with deceptively simple Python scripts. It supports hot-reloading, so your app updates live as you edit and save your file. No need to mess with HTTP requests, HTML, JavaScript, etc. All you need is your favorite editor and a browser.

Remark of Debian Med team: Help is urgently needed - no idea how to package this :-(

This is a machine learning framework which is required by chime. Needs python3-tensorflow

strelka
strelka2 germline and somatic small variant caller
Versions of package strelka
Release  Version        Architectures
VCS      2.9.10+dfsg-1  all
Versions and Archs
License: GPL-3+
Debian package not available
Git
Version: 2.9.10+dfsg-1

Strelka2 is a fast and accurate small variant caller optimized for analysis of germline variation in small cohorts and somatic variation in tumor/normal sample pairs. The germline caller employs an efficient tiered haplotype model to improve accuracy and provide read-backed phasing, adaptively selecting between assembly and a faster alignment-based haplotyping approach at each variant locus. The germline caller also analyzes input sequencing data using a mixture-model indel error estimation method to improve robustness to indel noise. The somatic calling model improves on the original Strelka method for liquid and late-stage tumor analysis by accounting for possible tumor cell contamination in the normal sample. A final empirical variant re-scoring step using random forest models trained on various call quality features has been added to both callers to further improve precision.

Please cite: Sangtae Kim, Konrad Scheffler, Aaron L. Halpern, Mitchell A. Bekritsky, Eunho Noh, Morten Källberg, Xiaoyu Chen, Yeonbin Kim, Doruk Beyter, Peter Krusche and Christopher T. Saunders: Strelka2: fast and accurate calling of germline and somatic variants. (PubMed) Nature Methods 15(8):591–594 (2018)
Registry entries: Bioconda 
ufasta
utility to manipulate fasta files
Versions of package ufasta
Release  Version                       Architectures
VCS      0.0.3+git20190131.85d60d1-1   all
Versions and Archs
License: to_be_clarified
Debian package not available
Git
Version: 0.0.3+git20190131.85d60d1-1

Description of ufasta subcommands:

  • one: remove the new lines in the data section. Hence, all the sequences are written on one line. In some sense, it is the opposite of the format subcommand.
  • format: reformat the data sections. The data is written in lines of the same length; it can also change the content to upper or lower case.
  • sizes: print the amount of sequence in each section
  • head: like UNIX head. Display the first 10 sequences
  • tail: like UNIX tail. Display the last 10 sequences
  • rc: reverse complement every sequence
  • n50, stats: display stats about the sequences: N50, E size, total size, etc.
  • extract: extract sequences whose headers match the given names
  • hsort, sort: sort file based on header content
  • dsort: sort the data sections
  • hgrep: output sequences whose header matches the regular expression
  • dgrep: output sequences whose sequence matches the regular expression
  • split: split a fasta file into many files
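To make two of the subcommands above concrete, the following Python sketch mimics what `rc` and `n50` compute. It is an illustration only, not ufasta's implementation; the function names are hypothetical, and N50 is taken here in its usual sense: the largest length L such that sequences of length at least L together cover at least half of all bases.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (case is preserved)."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def n50(lengths):
    """Return the N50 of a collection of sequence lengths (0 if empty)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0


example_rc = reverse_complement("ACGTN")
example_n50 = n50([2, 3, 4, 5, 6])
```

For the lengths 2, 3, 4, 5 and 6 (20 bases in total), the two longest sequences already cover 11 bases, so the N50 is 5.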
vadr
classification and annotation of viral sequences
Versions of package vadr
Release  Version  Architectures
VCS      1.2.1-1  all
Versions and Archs
License: public_domain
Debian package not available
Git
Version: 1.2.1-1

VADR (Viral Annotation DefineR) is a suite of tools for classifying and analyzing sequences homologous to a set of reference models of viral genomes or gene families. It has been mainly tested for analysis of Norovirus and Dengue virus sequences in preparation for submission to the GenBank database and finds its application also for the ongoing pandemics.

Please cite: Alejandro A Schäffer, Eneida L Hatcher, Linda Yankie, Lara Shonkwiler, J Rodney Brister, Ilene Karsch-Mizrachi and Eric P Nawrocki: VADR: validation and annotation of virus sequence submissions to GenBank. (eprint) bioRxiv (2020)
Registry entries: Bio.tools 
*Popularity contest results: number of people who use this package regularly (number of people who upgraded this package recently) out of 246672