A DB-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).
Required or used by
anvi-cluster-contigs anvi-compute-completeness anvi-db-info anvi-delete-hmms anvi-display-contigs-stats anvi-display-metabolism anvi-display-structure anvi-estimate-genome-completeness anvi-estimate-metabolism anvi-estimate-scg-taxonomy anvi-estimate-trna-taxonomy anvi-export-contigs anvi-export-functions anvi-export-gene-calls anvi-export-gene-coverage-and-detection anvi-export-locus anvi-export-misc-data anvi-export-splits-and-coverages anvi-export-splits-taxonomy anvi-gen-fixation-index-matrix anvi-gen-gene-consensus-sequences anvi-gen-gene-level-stats-databases anvi-gen-structure-database anvi-gen-variability-profile anvi-get-aa-counts anvi-get-codon-frequencies anvi-get-sequences-for-gene-calls anvi-get-sequences-for-hmm-hits anvi-get-short-reads-from-bam anvi-get-short-reads-mapping-to-a-gene anvi-get-split-coverages anvi-import-collection anvi-import-functions anvi-import-misc-data anvi-import-taxonomy-for-genes anvi-inspect anvi-interactive anvi-merge anvi-profile anvi-refine anvi-rename-bins anvi-run-hmms anvi-run-interacdome anvi-run-kegg-kofams anvi-run-ncbi-cogs anvi-run-pfams anvi-run-scg-taxonomy anvi-run-trna-taxonomy anvi-scan-trnas anvi-search-functions anvi-show-misc-data anvi-split anvi-summarize anvi-update-db-description anvi-update-structure-database anvi-script-add-default-collection anvi-script-calculate-pn-ps-ratio anvi-script-gen-distribution-of-genes-in-a-bin anvi-script-get-hmm-hits-per-gene-call anvi-script-merge-collections
A contigs database is an anvi’o database that contains key information associated with your sequences.
In a way, an anvi’o contigs database is a modern, more talented form of a FASTA file: it stores additional information about your sequences, and others can query and use it. Information is primarily stored and accessed through anvi’o programs, but this can also be done through the command line interface or programmatically.
The information a contigs database contains about its sequences includes the positions of open reading frames, tetra-nucleotide frequencies, functional and taxonomic annotations, information on individual nucleotide or amino acid positions, and more.
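As a rough illustration of programmatic access, the sketch below uses Python's standard `sqlite3` module to list the tables in a database file and read a key/value metadata table. It assumes that a contigs database is a SQLite file containing a table named `self` with `key` and `value` columns; the helper names and the stand-in file are hypothetical and exist only so the example runs without anvi’o installed.

```python
import os
import sqlite3
import tempfile
from contextlib import closing


def list_tables(db_path):
    """Return the table names stored in a SQLite file, sorted by name."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def read_self_table(db_path):
    """Read the key/value metadata table into a dict.

    Assumption: the table is called 'self' and has 'key' and 'value' columns.
    """
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute("SELECT key, value FROM self").fetchall()
    return dict(rows)


def build_stand_in_db(db_path):
    """Create a tiny stand-in file (hypothetical content, not a real contigs database)."""
    with closing(sqlite3.connect(db_path)) as conn:
        conn.execute("CREATE TABLE self (key TEXT, value TEXT)")
        conn.execute("INSERT INTO self VALUES ('db_type', 'contigs')")
        conn.commit()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "stand-in.db")
        build_stand_in_db(path)
        print(list_tables(path))
        print(read_self_table(path))
```

To try this on a real contigs database, pass its path to `list_tables` and `read_self_table` instead of the stand-in. Reading this way is harmless, but writing to the file directly bypasses anvi’o's own checks, so anvi’o programs remain the safer route for changes.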
Another (less computation-heavy) way of thinking about it
When working in anvi’o, you’ll need to be able to access previous analyses done on a genome or transcriptome. To do this, anvi’o uses tools like contigs databases instead of regular FASTA files. So, to use other anvi’o programs, you’ll want to convert the data that you have into a contigs database (using anvi-gen-contigs-database). As seen on the page for metagenomes, you can then use this contigs database instead of your FASTA file for all of your anvi’o needs.
In short, to get the most out of your data in anvi’o, you’ll want to use your data (which was probably originally in a FASTA file) to create both a contigs-db and a profile-db. That way, anvi’o is able to keep track of many different kinds of analyses and you can easily interact with other anvi’o programs.
Creating and populating a contigs database
A contigs database is initialized by running anvi-gen-contigs-database on a contigs-fasta. This computes the k-mer frequencies for each contig, soft-splits your contigs, and identifies open reading frames. To populate a contigs database with more information, you can then run various other programs.
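To make the k-mer frequency step more concrete, here is a minimal sketch that counts overlapping tetranucleotides (k = 4) in a single sequence. It is not anvi’o's implementation: the function name is hypothetical, windows containing characters other than A, C, G, or T are simply skipped, and no reverse-complement merging or normalization is applied.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=4):
    """Count every overlapping k-mer made only of A, C, G, and T.

    All 4**k possible k-mers are present in the result, with a count
    of 0 for those that never occur in the sequence.
    """
    sequence = sequence.upper()
    # Start with every possible k-mer at zero so the output has a fixed shape.
    counts = Counter({"".join(p): 0 for p in product("ACGT", repeat=k)})
    for i in range(len(sequence) - k + 1):
        kmer = sequence[i:i + k]
        # Skip windows with ambiguous bases such as N.
        if kmer in counts:
            counts[kmer] += 1
    return counts


if __name__ == "__main__":
    counts = kmer_counts("ACGTACGT")
    print(counts["ACGT"], counts["CGTA"], len(counts))
```

Keeping all 256 tetranucleotides in the result, including those with a zero count, gives every contig a vector of the same length, which is convenient when comparing contigs by sequence composition.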
Key programs that populate an anvi’o contigs database with essential information include
- anvi-run-hmms (which uses HMMs to annotate your genes against an hmm-source)
- anvi-run-scg-taxonomy (which associates the single-copy core genes in your contigs database with taxonomic data)
- anvi-scan-trnas (which identifies the tRNA genes)
- anvi-run-ncbi-cogs (which tries to assign functions to your genes using the COGs database)
Once an anvi’o contigs database is generated and populated with information, it is always a good idea to run anvi-display-contigs-stats to see a numerical summary of its contents.
Other programs you can run to populate a contigs database include
- anvi-run-kegg-kofams (which annotates the genes in the database with the KEGG KOfam database)
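The steps in this section can be strung together as a small driver script. The sketch below only assembles the command lines and prints them unless `dry_run` is turned off. The file names and project name are placeholders, the helper names are hypothetical, and the flags shown (`-f`, `-o`, `-n`, `-c`) are assumptions that should be checked against each program's `--help` output for your anvi’o version. Some of these programs may also need reference data to be set up before they will run.

```python
import subprocess


def build_commands(fasta="contigs.fa", db="contigs.db", project="my_project"):
    """Return the command lines in the order described above.

    All file names and the project name are placeholders.
    """
    return [
        ["anvi-gen-contigs-database", "-f", fasta, "-o", db, "-n", project],
        ["anvi-run-hmms", "-c", db],
        ["anvi-run-scg-taxonomy", "-c", db],
        ["anvi-scan-trnas", "-c", db],
        ["anvi-run-ncbi-cogs", "-c", db],
        ["anvi-display-contigs-stats", db],
    ]


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(" ".join(cmd))
        if not dry_run:
            # check=True stops the pipeline at the first failing step.
            subprocess.run(cmd, check=True)


if __name__ == "__main__":
    run_all(build_commands(), dry_run=True)
```

Running it with `dry_run=True` is a cheap way to review the planned commands before committing compute time to them.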
Analysis on a populated contigs database
If you wish to run programs like anvi-cluster-contigs, anvi-estimate-metabolism, and anvi-gen-gene-level-stats-databases, or view your database with anvi-interactive, you’ll need to first use your contigs database to create a profile-db.