contigs-db [artifact]

A DB-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).

Provided by

anvi-convert-trnaseq-database anvi-gen-contigs-database

Required or used by

anvi-cluster-contigs anvi-compute-completeness anvi-db-info anvi-delete-functions anvi-delete-hmms anvi-display-contigs-stats anvi-display-metabolism anvi-display-structure anvi-estimate-genome-completeness anvi-estimate-metabolism anvi-estimate-scg-taxonomy anvi-estimate-trna-taxonomy anvi-export-contigs anvi-export-functions anvi-export-gene-calls anvi-export-gene-coverage-and-detection anvi-export-locus anvi-export-misc-data anvi-export-splits-and-coverages anvi-export-splits-taxonomy anvi-gen-fixation-index-matrix anvi-gen-gene-consensus-sequences anvi-gen-gene-level-stats-databases anvi-gen-structure-database anvi-gen-variability-profile anvi-get-aa-counts anvi-get-codon-frequencies anvi-get-sequences-for-gene-calls anvi-get-sequences-for-hmm-hits anvi-get-short-reads-from-bam anvi-get-short-reads-mapping-to-a-gene anvi-get-split-coverages anvi-import-collection anvi-import-functions anvi-import-misc-data anvi-import-taxonomy-for-genes anvi-inspect anvi-interactive anvi-merge anvi-migrate anvi-profile anvi-refine anvi-rename-bins anvi-run-hmms anvi-run-interacdome anvi-run-kegg-kofams anvi-run-ncbi-cogs anvi-run-pfams anvi-run-scg-taxonomy anvi-run-trna-taxonomy anvi-scan-trnas anvi-search-functions anvi-show-misc-data anvi-split anvi-summarize anvi-update-db-description anvi-update-structure-database anvi-script-add-default-collection anvi-script-calculate-pn-ps-ratio anvi-script-gen-distribution-of-genes-in-a-bin anvi-script-get-hmm-hits-per-gene-call anvi-script-merge-collections


A contigs database is an anvi’o database that contains key information associated with your sequences.

In a way, an anvi’o contigs database is a modern, more talented form of a FASTA file: it stores additional information about your sequences that you and others can query and use. Storing and accessing this information is primarily done through anvi’o programs, but it can also be done through the command line interface or programmatically.
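As a minimal sketch of command-line access, the snippet below prints a database's metadata in two ways. The file name contigs.db is hypothetical. The snippet assumes that anvi-db-info takes the database path as its argument, and that a contigs database is a SQLite file with a key/value table named self; check both against your anvi’o version before relying on them.

```shell
# Hypothetical file name; replace with your own contigs database.
CONTIGS_DB="contigs.db"

# Print anvi'o's own summary of the database, if anvi'o is installed.
if command -v anvi-db-info >/dev/null 2>&1 && [ -f "$CONTIGS_DB" ]; then
    anvi-db-info "$CONTIGS_DB"
fi

# Assumption: the file is SQLite and has a key/value table named "self".
if command -v sqlite3 >/dev/null 2>&1 && [ -f "$CONTIGS_DB" ]; then
    sqlite3 "$CONTIGS_DB" "SELECT key, value FROM self;"
fi
```

Both commands only read from the database, so they are safe to try on an existing file.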

The information a contigs database contains about its sequences includes the positions of open reading frames, tetra-nucleotide frequencies, functional and taxonomic annotations, information on individual nucleotide or amino acid positions, and more.

Another (less computation-heavy) way of thinking about it

When working in anvi’o, you’ll need to be able to access previous analyses done on a genome or transcriptome. To do this, anvi’o uses tools like contigs databases instead of regular FASTA files. So, to use other anvi’o programs, you’ll want to convert the data that you have into a contigs database (using anvi-gen-contigs-database). As seen on the page for metagenomes, you can then use this contigs database instead of your FASTA file for all of your anvi’o needs.

In short, to get the most out of your data in anvi’o, you’ll want to use your data (which was probably originally in a FASTA file) to create both a contigs-db and a profile-db. That way, anvi’o is able to keep track of many different kinds of analyses and you can easily interact with other anvi’o programs.

Usage Information

Creating and populating a contigs database

A contigs database is initialized by running anvi-gen-contigs-database on a contigs-fasta. This computes the k-mer frequencies for each contig, soft-splits your contigs, and identifies open reading frames. To populate a contigs database with more information, you can then run various other programs.
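The initialization step could look roughly like the following. The file names and project name are hypothetical placeholders, and the guard simply skips the command when anvi’o or the input file is missing.

```shell
# Hypothetical file and project names; replace with your own.
FASTA="contigs.fa"
CONTIGS_DB="contigs.db"
PROJECT_NAME="my_project"

# Run only when anvi'o is installed and the input FASTA exists.
if command -v anvi-gen-contigs-database >/dev/null 2>&1 && [ -f "$FASTA" ]; then
    anvi-gen-contigs-database --contigs-fasta "$FASTA" \
                              --output-db-path "$CONTIGS_DB" \
                              --project-name "$PROJECT_NAME"
fi
```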

Key programs that populate an anvi’o contigs database with essential information include

Once an anvi’o contigs database is generated and populated with information, it is always a good idea to run anvi-display-contigs-stats to see a numerical summary of its contents.
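As a sketch of this populate-then-check pattern, the snippet below runs anvi-run-hmms (one of the programs that use a contigs database) and then writes the summary to a text file. The file names are hypothetical, and the --report-as-text and -o options are assumed to be available in your anvi’o version; without them the program is expected to open an interactive display instead.

```shell
# Hypothetical file names; replace with your own.
CONTIGS_DB="contigs.db"
STATS_FILE="contigs_stats.txt"

# Run only when anvi'o is installed and the database exists.
if command -v anvi-run-hmms >/dev/null 2>&1 && [ -f "$CONTIGS_DB" ]; then
    # Populate the database with HMM hits.
    anvi-run-hmms -c "$CONTIGS_DB"

    # Assumption: --report-as-text and -o write the summary to a file.
    anvi-display-contigs-stats "$CONTIGS_DB" --report-as-text -o "$STATS_FILE"
fi
```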

Other programs you can run to populate a contigs database include

Analysis on a populated contigs database

Other essential programs that read from a contigs database and yield key information include anvi-estimate-genome-completeness, anvi-get-sequences-for-hmm-hits, and anvi-estimate-scg-taxonomy.
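Two of these read-only analyses might be run as sketched below. The file names are hypothetical, the commands assume anvi-run-hmms has already populated the database, and Bacteria_71 is assumed to be one of the HMM sources present in it.

```shell
# Hypothetical file names; replace with your own.
CONTIGS_DB="contigs.db"
HMM_SOURCE="Bacteria_71"   # assumption: this HMM source exists in the database
HITS_FASTA="hmm_hits.fa"

# Run only when anvi'o is installed and the database exists.
if command -v anvi-estimate-genome-completeness >/dev/null 2>&1 && [ -f "$CONTIGS_DB" ]; then
    # Estimate completeness from the HMM hits stored in the database.
    anvi-estimate-genome-completeness -c "$CONTIGS_DB"

    # Export the sequences of genes matched by the chosen HMM source.
    anvi-get-sequences-for-hmm-hits -c "$CONTIGS_DB" \
                                    --hmm-source "$HMM_SOURCE" \
                                    -o "$HITS_FASTA"
fi
```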

If you wish to run programs like anvi-cluster-contigs, anvi-estimate-metabolism, and anvi-gen-gene-level-stats-databases, or view your database with anvi-interactive, you’ll need to first use your contigs database to create a profile-db.
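A minimal sketch of that step, assuming you have a sorted and indexed BAM file of reads mapped to the same contigs: anvi-profile builds the profile-db, and anvi-interactive then opens both databases together. All file and directory names here are hypothetical, and PROFILE.db is assumed to be the file name anvi-profile writes into its output directory.

```shell
# Hypothetical file and directory names; replace with your own.
CONTIGS_DB="contigs.db"
BAM_FILE="sample_01.bam"
PROFILE_DIR="sample_01_profile"

# Run only when anvi'o is installed and both inputs exist.
if command -v anvi-profile >/dev/null 2>&1 && [ -f "$CONTIGS_DB" ] && [ -f "$BAM_FILE" ]; then
    # Create a profile database from the mapping results.
    anvi-profile -i "$BAM_FILE" -c "$CONTIGS_DB" -o "$PROFILE_DIR"

    # Assumption: the profile database is written as PROFILE.db in the output directory.
    if [ -f "$PROFILE_DIR/PROFILE.db" ]; then
        anvi-interactive -p "$PROFILE_DIR/PROFILE.db" -c "$CONTIGS_DB"
    fi
fi
```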

