This article describes various ways to import taxonomy into anvi’o.
If what you find here does not solve your problem, please feel free to suggest new ways to deal with taxonomy by opening an issue.
Anvi’o accepts taxonomic annotation at the gene level. Annotations in your files should correspond to open reading frames in your contigs; hence, taxonomic annotation should be done on FASTA files exported from the anvi’o contigs database. The basic workflow goes like this: (1) generate your contigs database, (2) export your gene sequences, (3) annotate them with taxonomy, and (4) import the results back into your contigs database using anvi-import-taxonomy.
Important note: There are many ways to have your genes annotated with taxonomy. But, there is only one way to make sure the gene IDs in your taxonomy files correspond to the gene caller IDs in the database: export your DNA or AA sequences from the anvi’o contigs database you wish to annotate using anvi’o programs
This is the simplest way to get the taxonomic annotation of genes into the contigs database. The TAB-delimited input matrix should follow this format:
Not every gene call has to be in the matrix, and not every level of taxonomy has to be present. Anvi’o will find a way to deal with missing entries, but the more complete the matrix, the better.
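As an illustration, the following Python sketch writes a small TAB-delimited matrix in which some taxonomic levels are left empty. The column names (gene_callers_id and the t_phylum to t_species levels), the output file name, and the example lineages are assumptions made for this sketch; check the header your anvi’o version expects before importing.

```python
import csv

# Hypothetical column names for this sketch; confirm them against your anvi'o version.
COLUMNS = ["gene_callers_id", "t_phylum", "t_class", "t_order",
           "t_family", "t_genus", "t_species"]


def write_taxonomy_matrix(rows, path):
    """Write gene-level taxonomy as a TAB-delimited matrix.

    rows: list of dicts keyed by COLUMNS. Levels missing from a row
    are written as empty cells.
    """
    with open(path, "w", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, delimiter="\t",
                                restval="", lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)


# Made-up example rows: one full lineage, one gene annotated only at the phylum level.
example_rows = [
    {"gene_callers_id": 1, "t_phylum": "Bacteroidetes", "t_class": "Bacteroidia",
     "t_order": "Bacteroidales", "t_family": "Bacteroidaceae",
     "t_genus": "Bacteroides", "t_species": "Bacteroides fragilis"},
    {"gene_callers_id": 5, "t_phylum": "Firmicutes"},
]

write_taxonomy_matrix(example_rows, "taxonomy_matrix.txt")
```

Gene calls that have no annotation at all can simply be left out of the list of rows.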
Once you have your matrix ready, this is the command line to import it using the parser
This section assumes you have already generated an anvi’o contigs database. To import taxonomy into this contigs database, first export all gene calls:
Then you will run the following command:
If the environment variable $CENTRIFUGE_BASE is not properly set, you will get an error. See the export instructions here to try again.
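If you would rather catch this before launching centrifuge, a check along the following lines may help. It is only a sketch: it assumes that CENTRIFUGE_BASE is meant to point to an existing directory, and the function name is made up for this example.

```python
import os


def centrifuge_base_problem(environ=None):
    """Return a short description of what is wrong with CENTRIFUGE_BASE,
    or None if the variable is set and points to an existing directory.

    Assumption of this sketch: the variable should hold a directory path.
    """
    environ = os.environ if environ is None else environ
    value = environ.get("CENTRIFUGE_BASE", "")
    if not value:
        return "CENTRIFUGE_BASE is not set"
    if not os.path.isdir(value):
        return "CENTRIFUGE_BASE points to a missing directory: " + value
    return None


problem = centrifuge_base_problem()
print(problem or "CENTRIFUGE_BASE looks fine")
```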
This step takes about one minute on my laptop for 40,000 genes.
When centrifuge is done running, you should find two files in your work directory, which you will import into anvi’o: the report file and centrifuge_hits.tsv. Just to make sure that they are not empty, feel free to run this command:
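The same sanity check can also be sketched in Python. The helper names below are made up for this example, and the script takes the file paths as command-line arguments rather than assuming what the files are called.

```python
import os
import sys


def count_lines(path):
    """Count the lines in a file without loading it into memory."""
    with open(path, "rb") as handle:
        return sum(1 for _ in handle)


def report_line_counts(paths):
    """Map each path to its line count, or to None if the file does not exist."""
    return {path: count_lines(path) if os.path.isfile(path) else None
            for path in paths}


if __name__ == "__main__":
    # Pass the two centrifuge output files as arguments.
    for path, n_lines in report_line_counts(sys.argv[1:]).items():
        if n_lines is None:
            print(path + ": missing")
        elif n_lines == 0:
            print(path + ": empty")
        else:
            print(path + ": " + str(n_lines) + " lines")
```

A file reported as missing or empty suggests that the centrifuge run did not finish properly and should be repeated before importing anything.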
Fine. It is time to import these results! To do this, you will use the program anvi-import-taxonomy with the parser for centrifuge:
You can use any file names you like; however, the order of the input files is important: following the parameter -i, you should first declare the report file, and then the hits file.
This is it. If everything went well, the interactive interface and the anvi’o summary results should now contain taxonomy information.