A DB-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).
Required or used by
anvi-analyze-synteny anvi-compute-functional-enrichment anvi-compute-gene-cluster-homogeneity anvi-db-info anvi-display-pan anvi-get-sequences-for-gene-calls anvi-get-sequences-for-gene-clusters anvi-meta-pan-genome anvi-migrate anvi-pan-genome anvi-search-functions anvi-split anvi-summarize anvi-update-db-description
This is an anvi’o database that stores information about your genomes, primarily for use in pangenomic analyses.
You can think of it like this: a genomes-storage-db is to the pangenomic workflow what a contigs-db is to the metagenomic workflow. Both describe key information unique to your particular dataset, and both are required to run the vast majority of programs in their respective workflows.
What kind of information?
A genomes storage database contains information about the genomes that you provided to create it, as well as the genes within them.
Specifically, there are three tables stored in a genomes storage database:
- A table describing each of your genomes, including its name, type (internal or external), GC content, number of contigs, completion, redundancy, number of genes, etc.
- A table describing the genes within your genomes. For each gene, this includes its gene caller id, associated genome and position, sequence, length, and whether or not it is partial.
- A table describing the functions of your genes, including their sources and e-values.
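Since anvi’o databases are SQLite files, the three-table layout described above can be illustrated with Python's built-in `sqlite3` module. The following is only a minimal sketch: the table and column names (`genome_info`, `gene_info`, `gene_functions`, and everything inside them) are hypothetical stand-ins chosen to mirror the description, not the actual anvi’o schema, and the database is built in memory rather than read from a real genomes storage file.

```python
import sqlite3
from contextlib import closing


def build_mock_storage(conn):
    """Create a simplified, HYPOTHETICAL schema mirroring the three tables above."""
    conn.executescript(
        """
        CREATE TABLE genome_info (
            genome_name TEXT, genome_type TEXT, gc_content REAL,
            num_contigs INTEGER, completion REAL, redundancy REAL,
            num_genes INTEGER
        );
        CREATE TABLE gene_info (
            genome_name TEXT, gene_caller_id INTEGER, position INTEGER,
            sequence TEXT, length INTEGER, partial INTEGER
        );
        CREATE TABLE gene_functions (
            genome_name TEXT, gene_caller_id INTEGER, source TEXT,
            function TEXT, e_value REAL
        );
        """
    )


def table_columns(conn):
    """Map every table in a SQLite connection to its list of column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    # PRAGMA table_info returns one row per column; index 1 holds the column name.
    return {
        table: [col[1] for col in conn.execute(f'PRAGMA table_info("{table}")')]
        for table in tables
    }


# Build the mock database in memory and print its layout.
with closing(sqlite3.connect(":memory:")) as conn:
    build_mock_storage(conn)
    schema = table_columns(conn)

for table, columns in schema.items():
    print(table, columns)
```

Pointing `sqlite3.connect` at a real genomes storage file instead of `":memory:"` (and skipping `build_mock_storage`) would list whatever tables that file actually contains. For routine inspection, anvi-db-info is the safer choice.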
Cool. How do I make one?
You can generate one with the program anvi-gen-genomes-storage, starting from an external-genomes file, an internal-genomes file, or both.
Cool cool. What can I do with one?
With one of these, you can run anvi-pan-genome to get a pan-db. If a genomes storage database is the contigs-db of pangenomics, then a pan-db is the profile-db. It contains lots of information that is vital for analysis, and most pangenomic programs will require both the pan-db and its genomes storage database as input.