Microbial 'omics

Help pages for anvi'o programs and artifacts

Here you will find a list of all anvi’o programs and artifacts that enable constructing workflows for integrated multi ‘omics investigations.

If you need an introduction to the terminology used in ‘omics research or in anvi’o, please take a look at our vocabulary page. The anvi’o community is with you! If you have practical, technical, or science questions, see this page to learn about the resources available to you. If you are feeling overwhelmed, you can always scream towards the anvi’o Slack channel.

The help contents were last updated on 02 Jan 21 22:00:26 for anvi’o version 7 (hope).

The latest version of anvi’o is v7. See the release notes.

Anvi’o artifacts

Anvi’o artifacts represent concepts, file types, or data types anvi’o programs can work with. A given anvi’o artifact can be provided by the user (such as a FASTA file), produced by anvi’o (such as a profile database), or both (such as phylogenomic trees). Anvi’o artifacts link anvi’o programs to each other to build novel workflows.
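
To make the idea of artifacts linking programs concrete, here is a minimal Python sketch. It copies four entries from the program listing further down this page and assumes that the 🧀 line names the artifacts a program uses and the 🍕 line names the artifacts it provides; the listing suggests this reading but does not state it. The `PROGRAMS` table and the `downstream_programs` helper are hypothetical names for illustration only and are not part of anvi'o.

```python
# Hypothetical excerpt of the listing on this page. Assumption: the 🧀 line
# lists what a program uses and the 🍕 line lists what it provides.
PROGRAMS = {
    "anvi-init-bam": {
        "uses": {"raw-bam-file"},
        "provides": {"bam-file"},
    },
    "anvi-gen-contigs-database": {
        "uses": {"contigs-fasta", "external-gene-calls"},
        "provides": {"contigs-db"},
    },
    "anvi-profile": {
        "uses": {"bam-file", "contigs-db"},
        "provides": {"single-profile-db", "misc-data-items-order", "variability-profile"},
    },
    "anvi-merge": {
        "uses": {"single-profile-db", "contigs-db"},
        "provides": {"profile-db", "misc-data-items-order"},
    },
}


def downstream_programs(program, programs=PROGRAMS):
    """Return the sorted names of programs that use an artifact `program` provides."""
    provided = programs[program]["provides"]
    return sorted(
        name
        for name, io in programs.items()
        if name != program and provided & io["uses"]
    )
```

With this excerpt, `downstream_programs("anvi-init-bam")` returns `["anvi-profile"]`, because the bam-file artifact is what connects the two programs.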

A total of 105 artifacts are listed below.

pan-db contigs-db trnaseq-db modules-db structure-db pdb-db kegg-data single-profile-db profile-db genes-db genomes-storage-db
fasta contigs-fasta trnaseq-fasta concatenated-gene-alignment-fasta short-reads-fasta genes-fasta locus-fasta
configuration-ini external-gene-calls protein-structure-txt samples-txt fasta-txt collection-txt misc-data-items-txt misc-data-layers-txt misc-data-nucleotides-txt misc-data-amino-acids-txt misc-data-layer-orders-txt misc-data-items-order-txt linkmers-txt gene-calls-txt binding-frequencies-txt functions-txt functional-enrichment-txt view-data layer-taxonomy-txt gene-taxonomy-txt genome-taxonomy-txt external-genomes internal-genomes metagenomes coverages-txt detection-txt variability-profile-txt codon-frequencies-txt aa-frequencies-txt fixation-index-matrix kegg-metabolism augustus-gene-calls vcf blast-table splits-txt genbank-file groups-txt splits-taxonomy-txt hmm-hits-matrix-txt clustering-configuration
bam-file raw-bam-file
contigs-stats genes-stats
hmm-hits completion misc-data-items misc-data-layers misc-data-nucleotides misc-data-amino-acids genome-similarity misc-data-layer-orders misc-data-items-order metapangenome oligotypes functions kegg-functions layer-taxonomy gene-taxonomy genome-taxonomy scgs-taxonomy-db scgs-taxonomy trna-taxonomy-db trna-taxonomy variability-profile split-bins state ngrams pn-ps-data
cogs-data pfams-data interacdome-data
dendrogram phylogeny
state-json workflow-config
contigs-workflow metagenomics-workflow pangenomics-workflow phylogenomics-workflow trnaseq-workflow

Anvi’o programs

Anvi’o programs perform atomic tasks that can be woven together to implement complete ‘omics workflows. Please note that there may be programs that are not listed on this page. You can type anvi- in your terminal and press the TAB key twice to see the full list of programs available on your system, and type anvi-program-name --help to read the full list of command line options for a given program.
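
If TAB completion is not available in your shell, the same list can be approximated from Python. The sketch below is not an anvi'o utility: `list_anvio_programs` is a hypothetical helper that simply scans the directories on your PATH for executable files whose names start with anvi-.

```python
import os


def list_anvio_programs(path=None):
    """Return the sorted names of executables on PATH that start with 'anvi-'."""
    if path is None:
        path = os.environ.get("PATH", "")
    found = set()
    for directory in path.split(os.pathsep):
        # Skip empty PATH entries and directories that do not exist.
        if not directory or not os.path.isdir(directory):
            continue
        for name in os.listdir(directory):
            full_path = os.path.join(directory, name)
            if (
                name.startswith("anvi-")
                and os.path.isfile(full_path)
                and os.access(full_path, os.X_OK)
            ):
                found.add(name)
    return sorted(found)
```

Calling `list_anvio_programs()` with no arguments uses the PATH of the current environment, so activate the environment that holds your anvi'o installation first.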

A total of 117 programs are listed below.

🔥 anvi-analyze-synteny. Extract ngrams, as in 'co-occurring genes in synteny', from genomes.
🧀 genomes-storage-db functions pan-db
🍕 ngrams
🔥 anvi-cluster-contigs. A program to cluster items in a merged anvi'o profile using automatic binning algorithms.
🧀 profile-db contigs-db collection
🍕 collection bin
🔥 anvi-compute-completeness. A script to generate completeness info for a given list of splits.
🧀 contigs-db splits-txt hmm-source
🔥 anvi-compute-functional-enrichment. This is a driver program for anvi-script-enrichment-stats, a script that computes enrichment scores and group associations for annotated entities (i.e., functions, KEGG Modules) across groups of genomes or samples.
🧀 kegg-metabolism groups-txt misc-data-layers pan-db genomes-storage-db external-genomes internal-genomes
🍕 functional-enrichment-txt
🔥 anvi-compute-gene-cluster-homogeneity. Compute homogeneity for gene clusters.
🧀 pan-db genomes-storage-db
🔥 anvi-compute-genome-similarity. Export sequences from sequence sources and compute a similarity metric (e.g. ANI). If a pan database is given, anvi'o will write the computed output to the misc data tables of the pan database.
🧀 external-genomes internal-genomes pan-db
🍕 genome-similarity
🔥 anvi-convert-trnaseq-database. A program that processes one or more anvi'o tRNA-seq databases generated by anvi-trnaseq to generate anvi'o contigs and merged profile databases that are accessible to the rest of the tools in the anvi'o software ecosystem. Briefly, this program will determine final seed sequences from input tRNA-seq databases, determine their coverages across samples, identify tRNA modification sites and INDELs associated with transcripts in each sample against the seed sequences, and store all these data in the resulting databases for interactive visualization or in-depth analysis using other anvi'o frameworks.
🧀 trnaseq-db
🍕 contigs-db profile-db
🔥 anvi-db-info. Access self tables, display values, or set new ones totally at your own risk.
🧀 pan-db profile-db contigs-db genomes-storage-db structure-db genes-db
🔥 anvi-delete-collection. Remove a collection from a given profile database.
🧀 profile-db collection
🔥 anvi-delete-hmms. Remove HMM hits from an anvi'o contigs database.
🧀 contigs-db hmm-source hmm-hits
🔥 anvi-delete-misc-data. Remove stuff from 'additional data' or 'order' tables for either items or layers in either pan or profile databases. OR, remove stuff from the 'additional data' tables for nucleotides or amino acids in contigs databases.
🧀 pan-db profile-db misc-data-items misc-data-layers misc-data-layer-orders misc-data-nucleotides misc-data-amino-acids
🔥 anvi-delete-state. Delete an anvi'o state from a pan or profile database.
🧀 pan-db profile-db state
🔥 anvi-dereplicate-genomes. Identify redundant (highly similar) genomes.
🧀 external-genomes internal-genomes fasta genome-similarity
🍕 fasta
🔥 anvi-display-contigs-stats. Start the anvi'o interactive interface for viewing or comparing contigs statistics.
🧀 contigs-db
🍕 contigs-stats interactive svg
🔥 anvi-display-metabolism. Start the anvi'o interactive interface for viewing KEGG metabolism data.
🧀 contigs-db kegg-data kegg-functions profile-db collection bin
🍕 interactive
🔥 anvi-display-pan. Start an anvi'o server to display a pan-genome.
🧀 pan-db genomes-storage-db
🍕 collection bin interactive svg
🔥 anvi-display-structure. Interactively visualize sequence variants on protein structures.
🧀 structure-db variability-profile-txt contigs-db profile-db splits-txt
🍕 interactive
🔥 anvi-estimate-genome-completeness. Estimate completion and redundancy using domain-specific single-copy core genes.
🧀 contigs-db profile-db external-genomes collection
🍕 completion
🔥 anvi-estimate-metabolism. Reconstructs metabolic pathways and estimates pathway completeness for a given set of contigs.
🧀 contigs-db kegg-data kegg-functions profile-db collection bin external-genomes internal-genomes metagenomes
🍕 kegg-metabolism
🔥 anvi-estimate-scg-taxonomy. Estimates taxonomy at genome and metagenome level. This program is the entry point to estimate taxonomy for a given set of contigs (i.e., all contigs in a contigs database, or contigs described in collections as bins). For this, it uses single-copy core gene sequences and the GTDB database.
🧀 profile-db contigs-db scgs-taxonomy collection bin metagenomes
🍕 genome-taxonomy genome-taxonomy-txt
🔥 anvi-estimate-trna-taxonomy. Estimates taxonomy at genome and metagenome level using tRNA sequences.
🧀 profile-db contigs-db trna-taxonomy collection bin metagenomes
🍕 genome-taxonomy genome-taxonomy-txt
🔥 anvi-experimental-organization. Create an experimental clustering dendrogram.
🧀 clustering-configuration
🍕 dendrogram
🔥 anvi-export-collection. Export a collection from an anvi'o database.
🧀 profile-db collection
🍕 collection-txt
🔥 anvi-export-contigs. Export contigs (or splits) from an anvi'o contigs database.
🧀 contigs-db
🍕 contigs-fasta
🔥 anvi-export-functions. Export functions of genes from an anvi'o contigs database for a given annotation source.
🧀 contigs-db functions
🍕 functions-txt
🔥 anvi-export-gene-calls. Export gene calls from an anvi'o contigs database.
🧀 contigs-db
🍕 gene-calls-txt
🔥 anvi-export-gene-coverage-and-detection. Export gene coverage and detection data for all genes associated with contigs described in a profile database.
🧀 profile-db contigs-db
🍕 coverages-txt detection-txt
🔥 anvi-export-items-order. Export an item order from an anvi'o database.
🧀 pan-db profile-db
🍕 misc-data-items-order-txt dendrogram phylogeny
🔥 anvi-export-locus. This program helps you cut a 'locus' from a larger genetic context (e.g., contigs, genomes). By default, anvi'o will locate a user-defined anchor gene, extend its selection upstream and downstream based on the --num-genes argument, then extract the locus to create a new contigs database. The anchor gene must be provided as --search-term, --gene-caller-ids, or --hmm-sources. If --flank-mode is designated, you MUST provide TWO flanking genes that define the locus region (please see the --flank-mode help for more information). If everything goes as planned, anvi'o will give you individual locus contigs databases for every matching anchor gene found in the original contigs database provided. Enjoy your mini contigs databases!
🧀 contigs-db
🍕 locus-fasta
🔥 anvi-export-misc-data. Export additional data or order tables in pan or profile databases for items or layers.
🧀 pan-db profile-db contigs-db misc-data-items misc-data-layers misc-data-layer-orders misc-data-nucleotides misc-data-amino-acids
🍕 misc-data-items-txt misc-data-layers-txt misc-data-layer-orders-txt misc-data-nucleotides-txt misc-data-amino-acids-txt
🔥 anvi-export-splits-and-coverages. Export split or contig sequences and coverages across samples stored in an anvi'o profile database. This program is especially useful if you would like to 'bin' your splits or contigs outside of anvi'o and import the binning results into anvi'o using the anvi-import-collection program.
🧀 profile-db contigs-db
🍕 contigs-fasta coverages-txt
🔥 anvi-export-splits-taxonomy. Export taxonomy for splits found in an anvi'o contigs database.
🧀 contigs-db
🍕 splits-taxonomy-txt
🔥 anvi-export-state. Export an anvi'o state from a pan or profile database.
🧀 pan-db profile-db state
🍕 state-json
🔥 anvi-export-structures. Export .pdb structure files from a structure database.
🧀 structure-db
🍕 protein-structure-txt
🔥 anvi-gen-contigs-database. Generate a new anvi'o contigs database.
🧀 contigs-fasta external-gene-calls
🍕 contigs-db
🔥 anvi-gen-fixation-index-matrix. Generate a pairwise matrix of fixation indices between samples.
🧀 contigs-db profile-db structure-db bin variability-profile-txt splits-txt
🍕 fixation-index-matrix
🔥 anvi-gen-gene-consensus-sequences. Collapse variability for a set of genes across samples.
🧀 profile-db contigs-db
🍕 genes-fasta
🔥 anvi-gen-gene-level-stats-databases. A program to compute genes databases for a given set of bins stored in an anvi'o collection. Genes databases store gene-level coverage and detection statistics, and they are usually computed and generated automatically when they are required (such as when running anvi-interactive with the --gene-mode flag). This program allows you to pre-compute them if you don't want them to be done all at once.
🧀 profile-db contigs-db collection bin
🍕 genes-db
🔥 anvi-gen-genomes-storage. Create a genome storage from internal and/or external genomes for a pangenome analysis.
🧀 external-genomes internal-genomes
🍕 genomes-storage-db
🔥 anvi-gen-phylogenomic-tree. Generate a phylogenomic tree from an alignment file.
🧀 concatenated-gene-alignment-fasta
🍕 phylogeny
🔥 anvi-gen-structure-database. Identifies genes in your contigs database that encode proteins that are homologous to proteins with solved structures. If sufficiently similar homologs are identified, they are used as structural templates to predict the 3D structure of proteins in your contigs database.
🧀 contigs-db pdb-db
🍕 structure-db
🔥 anvi-gen-variability-network. A program to generate a network description from an anvi'o variability profile (potentially outdated program).
🧀 variability-profile
🔥 anvi-gen-variability-profile. Generate a table that comprehensively summarizes the variability of nucleotide, codon, or amino acid positions. We call these single nucleotide variants (SNVs), single codon variants (SCVs), and single amino acid variants (SAAVs), respectively.
🧀 contigs-db profile-db structure-db bin variability-profile splits-txt
🍕 variability-profile-txt
🔥 anvi-get-aa-counts. Fetches the number of times each amino acid occurs from a contigs database in a given bin, set of contigs, or set of genes.
🧀 splits-txt contigs-db profile-db collection
🍕 aa-frequencies-txt
🔥 anvi-get-codon-frequencies. Get amino acid or codon frequencies of genes in a contigs database.
🧀 contigs-db
🍕 codon-frequencies-txt aa-frequencies-txt
🔥 anvi-get-sequences-for-gene-calls. A script to get back sequences for gene calls.
🧀 contigs-db genomes-storage-db
🍕 genes-fasta external-gene-calls
🔥 anvi-get-sequences-for-gene-clusters. Do cool stuff with gene clusters in anvi'o pan genomes.
🧀 pan-db genomes-storage-db
🍕 genes-fasta concatenated-gene-alignment-fasta misc-data-items
🔥 anvi-get-sequences-for-hmm-hits. Get sequences for HMM hits from many inputs.
🧀 contigs-db profile-db external-genomes internal-genomes hmm-source hmm-hits
🍕 genes-fasta concatenated-gene-alignment-fasta
🔥 anvi-get-short-reads-from-bam. Get short reads back from a BAM file with options for compression, splitting of forward and reverse reads, etc.
🧀 profile-db contigs-db bin bam-file
🍕 short-reads-fasta
🔥 anvi-get-short-reads-mapping-to-a-gene. Recover short reads from BAM files that were mapped to genes you are interested in. It is possible to work with a single gene call, or a bunch of them. Similarly, you can get short reads from a single BAM file, or from many of them.
🧀 contigs-db bam-file
🍕 short-reads-fasta
🔥 anvi-get-split-coverages. Export splits and the coverage table from a database.
🧀 profile-db contigs-db collection bin
🍕 coverages-txt
🔥 anvi-import-collection. Import an external binning result into anvi'o.
🧀 contigs-db profile-db pan-db collection-txt
🍕 collection
🔥 anvi-import-functions. Parse and store functional annotation of genes.
🧀 contigs-db functions-txt
🍕 functions
🔥 anvi-import-items-order. Import a new items order into an anvi'o database.
🧀 pan-db profile-db misc-data-items-order-txt dendrogram phylogeny
🍕 misc-data-items-order
🔥 anvi-import-misc-data. Populate additional data or order tables in pan or profile databases for items and layers, OR additional data in contigs databases for nucleotides and amino acids (the Swiss army knife-level serious stuff).
🧀 pan-db profile-db contigs-db misc-data-items-txt dendrogram phylogeny misc-data-layers-txt misc-data-layer-orders-txt misc-data-nucleotides-txt misc-data-amino-acids-txt
🍕 misc-data-items misc-data-layers misc-data-layer-orders misc-data-nucleotides misc-data-amino-acids
🔥 anvi-import-state. Import an anvi'o state into a profile database.
🧀 pan-db profile-db state-json
🍕 state
🔥 anvi-import-taxonomy-for-genes. Import gene-level taxonomy into an anvi'o contigs database.
🧀 contigs-db gene-taxonomy-txt
🍕 gene-taxonomy
🔥 anvi-import-taxonomy-for-layers. Import layer-level taxonomy into the additional layer data table of an anvi'o single-profile database.
🧀 single-profile-db layer-taxonomy-txt
🍕 layer-taxonomy
🔥 anvi-init-bam. Sort/Index BAM files.
🧀 raw-bam-file
🍕 bam-file
🔥 anvi-inspect. Start an anvi'o inspect interactive interface.
🧀 profile-db contigs-db bin
🍕 interactive
🔥 anvi-interactive. Start an anvi'o server for the interactive interface.
🧀 profile-db single-profile-db contigs-db genes-db bin view-data dendrogram phylogeny
🍕 collection bin interactive svg
🔥 anvi-matrix-to-newick. Takes a distance matrix, returns a newick tree.
🧀 view-data
🍕 dendrogram
🔥 anvi-merge. Merge multiple anvi'o profiles.
🧀 single-profile-db contigs-db
🍕 profile-db misc-data-items-order
🔥 anvi-merge-bins. Merge a given set of bins in an anvi'o collection.
🧀 pan-db profile-db collection bin
🔥 anvi-meta-pan-genome. Convert a pangenome into a metapangenome.
🧀 internal-genomes pan-db genomes-storage-db
🍕 metapangenome
🔥 anvi-oligotype-linkmers. Takes an anvi'o linkmers report, generates an oligotyping output.
🧀 linkmers-txt
🍕 oligotypes
🔥 anvi-pan-genome. An anvi'o program to compute a pangenome from an anvi'o genome storage.
🧀 genomes-storage-db
🍕 pan-db misc-data-items-order
🔥 anvi-profile. Creates a single anvi'o profile database. When it is run on a BAM file, depending on the user parameters, the program quantifies coverage per nucleotide position (and averages them per contig), calculates single-nucleotide, single-codon, and single-amino acid variants, as well as structural variants such as insertions and deletions, and stores these data in the appropriate tables.
🧀 bam-file contigs-db
🍕 single-profile-db misc-data-items-order variability-profile
🔥 anvi-refine. Start an anvi'o interactive interface to manually curate or refine a genome, whether it is a metagenome-assembled, single-cell, or isolate genome.
🧀 profile-db contigs-db bin
🍕 bin
🔥 anvi-rename-bins. Rename all bins in a given collection (so they have pretty names).
🧀 collection bin profile-db contigs-db
🍕 collection bin
🔥 anvi-report-linkmers. Reports sequences stored in one or more BAM files that cover one or more specific nucleotide positions in a reference.
🧀 bam-file
🍕 linkmers-txt
🔥 anvi-run-hmms. This program deals with populating tables that store HMM hits in an anvi'o contigs database.
🧀 contigs-db hmm-source
🍕 hmm-hits
🔥 anvi-run-interacdome. Run InteracDome on a contigs database.
🧀 contigs-db interacdome-data
🍕 binding-frequencies-txt misc-data-amino-acids
🔥 anvi-run-kegg-kofams. Run KOfam HMMs on an anvi'o contigs database.
🧀 contigs-db kegg-data
🍕 kegg-functions functions
🔥 anvi-run-ncbi-cogs. This program runs NCBI's COGs to associate genes in an anvi'o contigs database with functions. The COGs database was designed as an attempt to classify proteins from completely sequenced genomes on the basis of the orthology concept.
🧀 contigs-db cogs-data
🍕 functions
🔥 anvi-run-pfams. Run Pfam on Contigs Database.
🧀 contigs-db pfams-data
🍕 functions
🔥 anvi-run-scg-taxonomy. The purpose of this program is to affiliate single-copy core genes in an anvi'o contigs database with taxonomic names. A properly set up local SCG taxonomy database is required for this program to perform properly. After its successful run, anvi-estimate-scg-taxonomy will be useful to estimate taxonomy at genome-, collection-, or metagenome-level.
🧀 contigs-db scgs-taxonomy-db
🍕 scgs-taxonomy
🔥 anvi-run-trna-taxonomy. The purpose of this program is to affiliate tRNA gene sequences in an anvi'o contigs database with taxonomic names. A properly set up local tRNA taxonomy database is required for this program to perform properly. After its successful run, anvi-estimate-trna-taxonomy will be useful to estimate taxonomy at genome-, collection-, or metagenome-level.
🧀 contigs-db trna-taxonomy-db
🍕 trna-taxonomy
🔥 anvi-run-workflow. Execute, manage, parallelize, and troubleshoot entire 'omics workflows and chain together anvi'o and third party programs.
🧀 samples-txt fasta-txt workflow-config
🍕 contigs-workflow metagenomics-workflow pangenomics-workflow phylogenomics-workflow trnaseq-workflow
🔥 anvi-scan-trnas. Identify and store tRNA genes in a contigs database.
🧀 contigs-db
🍕 hmm-hits
🔥 anvi-search-functions. Search functions in an anvi'o contigs database or genomes storage. Basically, this program searches for one or more search terms you define in the functional annotations of genes in an anvi'o contigs database, and generates multiple reports. The default report simply tells you which contigs contain genes with functions matching the search terms you used, which is useful for viewing in the interface. You can also request a much more comprehensive report, which gives you anything you might need to know for each hit and search term.
🧀 contigs-db genomes-storage-db
🍕 functions-txt
🔥 anvi-setup-interacdome. Set up InteracDome data.
🍕 interacdome-data
🔥 anvi-setup-kegg-kofams. Download and set up KEGG KOfam HMM profiles and KEGG MODULE data.
🍕 kegg-data modules-db
🔥 anvi-setup-ncbi-cogs. Download and set up NCBI's Clusters of Orthologous Groups database.
🍕 cogs-data
🔥 anvi-setup-pdb-database. Set up or update an offline database of representative PDB structures clustered at 95%.
🍕 pdb-db
🔥 anvi-setup-pfams. Download and set up Pfam data from the EBI.
🍕 pfams-data
🔥 anvi-setup-scg-taxonomy. The purpose of this program is to download necessary information from GTDB (https://gtdb.ecogenomic.org/), and set it up in such a way that your anvi'o installation is able to assign taxonomy to single-copy core genes using anvi-run-scg-taxonomy and estimate taxonomy for genomes or metagenomes using anvi-estimate-scg-taxonomy.
🍕 scgs-taxonomy-db
🔥 anvi-setup-trna-taxonomy. The purpose of this program is to set up the necessary databases for tRNA genes collected from GTDB (https://gtdb.ecogenomic.org/) genomes in your local anvi'o installation, so that taxonomy information for a given set of tRNA sequences can be identified using anvi-run-trna-taxonomy and made sense of via anvi-estimate-trna-taxonomy.
🍕 trna-taxonomy-db
🔥 anvi-show-collections-and-bins. A script to display collections stored in an anvi'o profile or pan database.
🧀 pan-db profile-db
🔥 anvi-show-misc-data. Show all misc data keys in all misc data tables.
🧀 pan-db profile-db contigs-db
🔥 anvi-split. Split an anvi'o pan or profile database into smaller, self-contained pieces. Provide either a genomes-storage and pan database or a profile and contigs database pair, and you'll get back directories of individual projects for each bin that can be treated as smaller anvi'o projects.
🧀 profile-db contigs-db genomes-storage-db pan-db collection
🍕 split-bins
🔥 anvi-summarize. Summarizer for anvi'o pan or profile databases. Essentially, this program takes a collection id along with either a profile database and a contigs database, or a pan database and a genomes storage, and generates a static HTML output for what is described in the given collection. The output directory will contain almost everything any downstream analysis may need, and can be displayed using a browser without the need for an anvi'o installation. For this reason alone, reporting summary outputs as supplementary data with publications is a great idea for transparency and reproducibility.
🧀 profile-db contigs-db collection pan-db genomes-storage-db
🍕 summary
🔥 anvi-trnaseq. A program to process raw tRNA-seq dataset, which is the sequencing of tRNA transcripts in a given sample, to generate an anvi'o tRNA-seq database.
🧀 trnaseq-fasta
🍕 trnaseq-db
🔥 anvi-update-db-description. Update the description in an anvi'o database.
🧀 pan-db profile-db contigs-db genomes-storage-db
🔥 anvi-update-structure-database. Add or re-run genes from an already existing structure database. All settings used to generate your database will be used in this program.
🧀 contigs-db structure-db
🔥 anvi-script-add-default-collection. A script to add a 'DEFAULT' collection in an anvi'o pan or profile database with a bin named 'EVERYTHING' that describes all items available in the profile database.
🧀 pan-db profile-db contigs-db
🍕 collection bin
🔥 anvi-script-augustus-output-to-external-gene-calls. Takes in gene calls by AUGUSTUS v3.3.3, generates an anvi'o external gene calls file. It may work well with other versions of AUGUSTUS, too. It is just that no one has tested the script with different versions of the program.
🧀 augustus-gene-calls
🍕 external-gene-calls
🔥 anvi-script-calculate-pn-ps-ratio. This program calculates, for each gene, the ratio of pN/pS (the metagenomic analog of dN/dS) based on metagenomic read recruitment. Unlike standard pN/pS calculations, however, it relies on codons rather than nucleotides for accurate estimations of synonymity.
🧀 contigs-db variability-profile-txt
🍕 pn-ps-data
🔥 anvi-script-compute-ani-for-fasta. Run ANI between contigs in a single FASTA file.
🧀 fasta
🍕 genome-similarity
🔥 anvi-script-filter-fasta-by-blast. Filter FASTA file according to BLAST table (remove sequences with bad BLAST alignment).
🧀 contigs-fasta blast-table
🍕 contigs-fasta
🔥 anvi-script-fix-homopolymer-indels. Corrects homopolymer-region associated INDELs in a given genome based on a reference genome. The most effective use of this script is when the input genome is a genome reconstructed from MinION long reads, and the reference genome is one of high quality. Essentially, this script will BLAST the genome you wish to correct against the reference genome you provide, identify INDELs in the BLAST results that are exclusively associated with homopolymer regions, take the reference genome as a guide to correct the input sequences, and report a new FASTA file. You can use the corrected output FASTA file as the input FASTA file over and over again to see if you can eliminate all homopolymer-associated INDELs.
🧀 fasta
🍕 fasta
🔥 anvi-script-gen-distribution-of-genes-in-a-bin. Quantify the detection of genes in genomes in metagenomes to identify the environmental core. This is a helper script for anvi'o metapangenomic workflow.
🧀 contigs-db profile-db collection bin
🍕 view-data misc-data-items-txt
🔥 anvi-script-gen-hmm-hits-matrix-across-genomes. A simple script to generate a TAB-delimited file that reports the frequency of HMM hits for a given HMM source across contigs databases.
🧀 external-genomes internal-genomes hmm-source hmm-hits
🍕 hmm-hits-matrix-txt
🔥 anvi-script-gen-pseudo-paired-reads-from-fastq. A script that takes a FASTQ file that is not paired-end (i.e., R1 alone) and converts it into two FASTQ files that are paired-end (i.e., R1 and R2). This is a quick-and-dirty workaround that halves each read from the original FASTQ and puts one half in the FASTQ file for R1 and puts the reverse-complement of the second half in the FASTQ file for R2. If you've ended up here, things have clearly not gone very well for you, and Evan, who battled similar battles and ended up implementing this solution wholeheartedly sympathizes.
🧀 short-reads-fasta
🍕 short-reads-fasta
🔥 anvi-script-gen-short-reads. Generate short reads from contigs. Useful to reconstruct mock data sets from already assembled contigs.
🧀 configuration-ini
🍕 short-reads-fasta
🔥 anvi-script-gen_stats_for_single_copy_genes.py. A simple script to generate info from search tables, given a contigs-db.
🍕 genes-stats
🔥 anvi-script-get-coverage-from-bam. Get nucleotide-level, contig-level, or bin-level coverage values from a BAM file.
🧀 bam-file collection-txt
🍕 coverages-txt
🔥 anvi-script-get-hmm-hits-per-gene-call. A simple script to generate a TAB-delimited file of gene caller IDs and their HMM hits for a given HMM source.
🧀 contigs-db hmm-source hmm-hits
🍕 functions-txt
🔥 anvi-script-get-primer-matches. You provide this program with FASTQ files for one or more samples AND one or more short sequences, and it collects reads from the FASTQ files that match your sequences. This tool can be most powerful if you want to collect all short reads from one or more metagenomes that are downstream of a known sequence. Using the comprehensive output files, you can analyze the diversity of sequences visually, manually, or using established strategies such as oligotyping.
🧀 samples-txt
🍕 short-reads-fasta
🔥 anvi-script-merge-collections. Generate an additional data file from multiple collections.
🧀 contigs-db collection-txt
🔥 anvi-script-pfam-accessions-to-hmms-directory. You give this program one or more PFAM accession ids, and it generates an anvi'o compatible HMM directory to be used with anvi-run-hmms.
🍕 hmm-source
🔥 anvi-script-process-genbank. This script takes a GenBank file, and outputs a FASTA file, as well as two additional TAB-delimited output files for external gene calls and gene functions that can be used with the programs anvi-gen-contigs-database and anvi-import-functions.
🧀 genbank-file
🍕 contigs-fasta external-gene-calls functions-txt
🔥 anvi-script-process-genbank-metadata. This script takes the 'metadata' output of the program ncbi-genome-download (see https://github.com/kblin/ncbi-genome-download for details), and processes each GenBank file found in the metadata file to generate a FASTA file, as well as genes and functions files for each entry. Plus, it automatically generates a FASTA TXT file descriptor for anvi'o snakemake workflows. So it is a multi-talented program like that.
🍕 contigs-fasta functions-txt external-gene-calls
🔥 anvi-script-reformat-fasta. Reformat FASTA file (remove contigs based on length, or based on a given list of deflines, and/or generate an output with simpler names).
🧀 fasta
🍕 contigs-fasta
🔥 anvi-script-snvs-to-interactive. Take the output of anvi-gen-variability-profile, prepare an output for interactive interface.
🧀 variability-profile-txt
🍕 interactive
🔥 anvi-script-transpose-matrix. Transpose a TAB-delimited file.
🧀 view-data functions-txt misc-data-items-txt misc-data-layers-txt gene-calls-txt linkmers-txt
🍕 view-data functions-txt misc-data-items-txt misc-data-layers-txt gene-calls-txt linkmers-txt
🔥 anvi-script-variability-to-vcf. A script to convert SNV output obtained from anvi-gen-variability-profile to the standard VCF format.
🧀 variability-profile-txt
🍕 vcf
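
The workaround described for anvi-script-gen-pseudo-paired-reads-from-fastq above (halve each read, keep the first half as R1, and use the reverse complement of the second half as R2) can be sketched in a few lines of Python. This is an illustration of the idea under stated assumptions, not the script's actual implementation: `split_read` and `reverse_complement` are hypothetical helpers, odd-length reads are assumed to give the extra base to R2, and the quality string of R2 is assumed to be reversed so that it stays aligned with the reverse-complemented bases.

```python
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def split_read(sequence, quality):
    """Split one single-end read into a pseudo R1/R2 pair.

    Returns ((r1_sequence, r1_quality), (r2_sequence, r2_quality)).
    """
    # Assumption: with an odd read length, R2 receives the extra base.
    midpoint = len(sequence) // 2
    r1 = (sequence[:midpoint], quality[:midpoint])
    # Assumption: qualities are reversed to match the reverse-complemented bases.
    r2 = (reverse_complement(sequence[midpoint:]), quality[midpoint:][::-1])
    return r1, r2
```

For example, `split_read("AAAACCGT", "ABCDEFGH")` returns `(("AAAA", "ABCD"), ("ACGG", "HGFE"))`.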