Microbial 'omics



Anvi'o is an advanced analysis and visualization platform for 'omics data. Its interactive interface facilitates the management of metagenomic contigs and associated data for automatic or human-guided identification of genome bins, and their curation. The extensible visualization approach distills multiple dimensions of information for each contig into a single, intuitive display, offering a dynamic and unified work environment for data exploration, manipulation and reporting. Beyond its easy-to-use interface, the modular architecture of anvi'o as a platform allows users with programming skills to implement and test novel ideas with minimal effort. Please see the anvi'o project page for details, and the codebase to follow its development or to contribute.


Oligotyping is a human-guided computational approach that makes it possible to decompose very closely related taxa at single-nucleotide resolution. It is generally applied to high-throughput sequencing data of bacterial marker gene amplicons (such as the 16S rRNA gene) amplified from environmental samples. See the project page for details.

Minimum Entropy Decomposition

MED is an information theory-based clustering algorithm for sensitive partitioning of high-throughput marker gene sequences. The source code is distributed through the oligotyping pipeline. See the project page for more.
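
To make "information theory-based" concrete, the sketch below computes the Shannon entropy of each position across a set of equal-length, aligned reads and splits the reads on the most variable position. It is a minimal illustration under those assumptions, not the code shipped with the oligotyping pipeline; the function names (`column_entropy`, `entropy_profile`, `split_on_max_entropy`) and the toy reads are hypothetical.

```python
import math
from collections import Counter


def column_entropy(column):
    """Shannon entropy (in bits) of the characters in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(reads):
    """Entropy of every position across equal-length, aligned reads."""
    return [column_entropy(column) for column in zip(*reads)]


def split_on_max_entropy(reads):
    """Partition reads by the nucleotide found at the highest-entropy position."""
    profile = entropy_profile(reads)
    position = max(range(len(profile)), key=profile.__getitem__)
    groups = {}
    for read in reads:
        groups.setdefault(read[position], []).append(read)
    return position, groups


# Toy input: the reads differ only at the third position (index 2).
reads = ["ACGT", "ACGT", "ACTT", "ACTT"]
position, groups = split_on_max_entropy(reads)
print(position)  # 2
print(groups)    # {'G': ['ACGT', 'ACGT'], 'T': ['ACTT', 'ACTT']}
```

An iterative decomposition would repeat the split inside each group until no position exceeds a chosen entropy threshold; that loop and the threshold are left out here to keep the sketch short.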

Illumina Utilities Library

A lightweight and high-performance library to analyze raw Illumina data. It contains programs for demultiplexing, quality filtering, and merging partially or fully overlapping reads. Illumina utils has been a core component of the sequencing operations at the MBL. The source code, installation instructions and examples are available through its GitHub repository.
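
As a rough illustration of what merging overlapping reads involves, the sketch below reverse-complements the second read of a pair and joins the two on their longest exact overlap. It is a simplified stand-in, not the library's implementation or API: it ignores quality scores and tolerates no mismatches, and `reverse_complement`, `merge_pair`, and the `min_overlap` parameter are hypothetical names.

```python
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def merge_pair(forward, reverse, min_overlap=4):
    """Merge a read pair on its longest exact overlap; return None if none is found."""
    mate = reverse_complement(reverse)
    # Try the longest possible overlap first, then shrink toward min_overlap.
    for size in range(min(len(forward), len(mate)), min_overlap - 1, -1):
        if forward[-size:] == mate[:size]:
            return forward + mate[size:]
    return None


# Toy fragment sequenced from both ends with 10-base reads (6-base overlap).
fragment = "ACGTTGCAAGGCTA"
forward = fragment[:10]
reverse = reverse_complement(fragment[4:])
merged = merge_pair(forward, reverse)
print(merged == fragment)  # True
```

A pair whose reads cover the same bases end to end (a fully overlapping pair) falls out of the same loop, since the first overlap size tried is the full read length.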


BLAST Filtering Pipeline

A metagenomic short-read filtering tool that uses a flexible configuration format. It allows users to define a chain of genomic filters, each of which operates on the output of the previous filter, to remove reads from sequencing data. It can use Sun Grid Engine to distribute individual processes. The source code is available through its GitHub repository.
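
The chaining idea can be sketched in a few lines: each filter takes a list of reads and returns the subset it keeps, and the chain feeds every filter the output of the one before it. This is only an illustration of the pattern, not the pipeline's code or configuration format; `remove_exact_hits` is a hypothetical stand-in that uses exact substring matching where the real tool would run a BLAST search, and the sequences are made up.

```python
from functools import reduce


def remove_exact_hits(reference):
    """Build a filter that drops reads found verbatim in `reference`.

    Hypothetical stand-in for a BLAST search against a genome.
    """
    def apply(reads):
        return [read for read in reads if read not in reference]
    return apply


def run_chain(reads, filters):
    """Apply filters in order; each one sees the previous filter's output."""
    return reduce(lambda current, flt: flt(current), filters, reads)


# Toy data: two reference sequences, each removing one read.
host = "AAACGTACAA"
contaminant = "GGTTTAAAGG"
reads = ["ACGTAC", "GGGTTT", "TTTAAA", "CCCGGG"]

kept = run_chain(reads, [remove_exact_hits(host), remove_exact_hits(contaminant)])
print(kept)  # ['GGGTTT', 'CCCGGG']
```

Because every filter shares the same signature (reads in, reads out), the order and number of filters can be changed without touching `run_chain`, which is the property a configurable filter chain relies on.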