The purpose of this program is to affiliate single-copy core genes in an anvi'o contigs database with taxonomic names. A properly set up local SCG taxonomy database is required for this program to work. After a successful run,
anvi-estimate-scg-taxonomy can be used to estimate taxonomy at the genome, collection, or metagenome level.
This program associates the single-copy core genes in your contigs-db with taxonomy information.
Once this information is stored in your contigs-db (in the form of a scgs-taxonomy artifact), you can run anvi-estimate-scg-taxonomy, or use anvi-interactive and enable “Realtime taxonomy estimate for bins.” Check out this tutorial for more information.
What does this program do?
In short, this program searches all of the single-copy core genes that it uses for this workflow (which are the 22 listed on this page) against the GTDB databases that you downloaded, and stores the hits in your contigs-db. In other words, it finds your single-copy core genes and assigns them taxonomy. When you later run anvi-estimate-scg-taxonomy, it can use these annotated genes to estimate the taxonomy of larger groups of contigs that contain them.
Sweet. How do I run it?
anvi-run-scg-taxonomy -c contigs-db
In case you’re running this on a genome and not getting any hits, you have the option to try lowering the percent identity required for a hit (as long as you’re careful with it). The default value is 90 percent.
anvi-run-scg-taxonomy -c contigs-db \
                      --min-percent-identity 70
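To put the steps above in order, here is a minimal sketch of the full workflow: set up the local SCG taxonomy database, annotate the contigs-db, then estimate taxonomy. The `step` helper, the `RUN` switch, and the `CONTIGS_DB` and `MIN_IDENTITY` variables are assumptions of this sketch, not part of anvi'o. By default the script only prints the commands it would run, so you can review them first.

```shell
#!/usr/bin/env bash
# Sketch of the SCG taxonomy workflow. Variable names and the RUN switch
# are conventions of this example, not anvi'o features.

# Hypothetical path: point this at your own contigs database.
CONTIGS_DB="${CONTIGS_DB:-contigs-db}"
# Optional: set to a lower value (e.g. 70) if a genome yields no hits.
MIN_IDENTITY="${MIN_IDENTITY:-}"

# Print a command; execute it only when RUN=1 is set.
step() {
    echo "+ $*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

# 1. Set up the local SCG taxonomy database (needed once per installation).
step anvi-setup-scg-taxonomy || exit 1

# 2. Search the single-copy core genes and store the hits in the contigs-db.
if [ -n "$MIN_IDENTITY" ]; then
    step anvi-run-scg-taxonomy -c "$CONTIGS_DB" \
                               --min-percent-identity "$MIN_IDENTITY" || exit 1
else
    step anvi-run-scg-taxonomy -c "$CONTIGS_DB" || exit 1
fi

# 3. Use the stored hits to estimate taxonomy.
step anvi-estimate-scg-taxonomy -c "$CONTIGS_DB" || exit 1
```

Running the script as is prints the three commands. Setting `RUN=1` executes them, and setting `MIN_IDENTITY=70` adds the lower percent identity threshold to the second step.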