Microbial 'omics

Citing anvi'o like a pro

Anvi’o is an evolving software ecosystem, and its components are often described in multiple studies. Thus, the best practice for your study may be to cite multiple publications if it benefits from multiple anvi’o features.

We know that finding the best studies to cite can be a lot of work. This page offers up-to-date suggestions to help you finalize your anvi’o citations. If you are unsure, please feel free to drop us a line or find us on Slack.

Anvi’o often uses third-party software or resources (such as HMMER, Prodigal, MCL, GTDB, or NCBI) and the platform typically guides you to cite relevant work when they are used for an anvi’o analysis. Suggestions on this page are specific to anvi’o, and do not include third-party software that you should also make sure to cite properly.

We know this is difficult work and we are thankful for your attention.

Default citation

If you have used anvi’o for anything at all, please consider citing this work. It describes the software ecosystem as a whole, which currently comprises more than 120,000 lines of code, and any given anvi’o program benefits from the entirety of this ecosystem:

Community-led, integrated, reproducible multi-omics with anvi'o

Eren AM, Kiefl E, Shaiber A, Veseli I, Miller SE, Schechter MS, Fink I, Pan JN, Yousef M, Fogarty EC, Trigodet F, Watson AR, Esen ÖC, Moore RM, Clayssen Q, Lee MD, Kivenson V, Graham ED, Merrill BD, Karkman A, Blankenberg D, Eppley JM, Sjödin A, Scott JJ, Vázquez-Campos X, McKay LJ, McDaniel EA, Stevens SLR, Anderson RE, Fuessel J, Fernandez-Guerra A, Maignien L, Delmont TO, Willis AD. Nature Microbiology, 6(1):3-6.

The rest of the citations on this page are specific to certain anvi’o features.

Functional enrichment analyses

This feature was described for the first time in this study:

Functional and genetic markers of niche partitioning among enigmatic members of the human oral microbiome

Shaiber A, Willis AD, Delmont TO, Roux S, Chen L, Schmid AC, Yousef M, Watson AR, Lolans K, Esen ÖC, Lee STM, Downey N, Morrison HG, Dewhirst FE, Mark Welch JL, Eren AM. Genome Biology, 21:292. (Author list includes co-senior authors.)

In a recent study, we cited this work the following way:

(…)

Functional enrichment analyses. The statistical approach for enrichment analysis is defined elsewhere (Shaiber et al. 2020), but briefly the program anvi-compute-functional-enrichment determined enrichment scores for functions (or metabolic modules) within groups of genomes by fitting a binomial generalized linear model (GLM) to the occurrence of each function (or complete metabolic module) in each group, and then computing a Rao test statistic, uncorrected p-values, and corrected q-values. We considered any function or metabolic module with a q-value less than 0.05 to be ‘enriched’ in its associated group (…)
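To make the statistic in the excerpt above more concrete, here is a minimal Python sketch for the two-group case. It relies on a standard result: for a binomial GLM with a single group indicator, the Rao score test of "no group effect" coincides with Pearson's chi-square test of equal proportions. The function names, the toy counts, and the Benjamini-Hochberg adjustment are illustrative assumptions. This is not the code that anvi-compute-functional-enrichment runs, and the program's q-value procedure may differ.

```python
import math


def rao_score_two_groups(hits_a, n_a, hits_b, n_b):
    """Score statistic and p-value for H0: equal occurrence probability in both groups."""
    pooled = (hits_a + hits_b) / (n_a + n_b)  # occurrence probability fitted under H0
    if pooled in (0.0, 1.0):
        return 0.0, 1.0  # function absent everywhere or present everywhere
    var = pooled * (1.0 - pooled)
    stat = ((hits_a - n_a * pooled) ** 2 / (n_a * var)
            + (hits_b - n_b * pooled) ** 2 / (n_b * var))
    # Survival function of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2.0))
    return stat, p_value


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p-values (an assumed stand-in for q-values)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value to the smallest
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Toy data (made up): genomes carrying each function in group A and in group B.
GROUP_SIZE = 10
toy_counts = {"function_1": (9, 1), "function_2": (5, 5), "function_3": (10, 0)}

names = list(toy_counts)
p_values = [rao_score_two_groups(a, GROUP_SIZE, b, GROUP_SIZE)[1]
            for a, b in toy_counts.values()]
q_values = dict(zip(names, benjamini_hochberg(p_values)))
enriched = [name for name in names if q_values[name] < 0.05]
print(enriched)
```

The sketch only flags functions whose occurrence differs between the two groups. Deciding which group a function is "enriched" in would additionally require comparing the per-group proportions.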

Snakemake workflows

There is not yet a published study that describes our workflows, although they were first introduced in the following work:

Functional and genetic markers of niche partitioning among enigmatic members of the human oral microbiome

Shaiber A, Willis AD, Delmont TO, Roux S, Chen L, Schmid AC, Yousef M, Watson AR, Lolans K, Esen ÖC, Lee STM, Downey N, Morrison HG, Dewhirst FE, Mark Welch JL, Eren AM. Genome Biology, 21:292. (Author list includes co-senior authors.)

In a recent study, we cited our workflows the following way:

(…)

‘Omics workflows. Whenever applicable, we automated and scaled our ‘omics analyses using the bioinformatics workflows implemented by the program anvi-run-workflow (Shaiber et al. 2020) in anvi’o (Eren et al. 2021). Anvi’o workflows implement numerous steps of bioinformatics tasks including short-read quality filtering, assembly, gene calling, functional annotation, hidden Markov model search, metagenomic read-recruitment, metagenomic binning, pangenomics, and phylogenomics. Workflows use Snakemake (Köster and Rahmann 2012) and a tutorial is available at the URL http://merenlab.org/anvio-workflows/. The following sections detail these steps.

(…)

Metabolic reconstruction

There is not yet a published study that describes anvi’o metabolic reconstruction capabilities, although in a recent study we mentioned this framework the following way:

(…)

Analysis of metabolic modules and enrichment. We calculated the level of completeness for a given KEGG module (Kanehisa et al. 2014; Kanehisa et al. 2017) in our genomes using the program anvi-estimate-metabolism, which leveraged previous annotation of genes with KEGG orthologs (KOs) (see the section ‘Processing of contigs’). Then, the program anvi-compute-functional-enrichment determined whether a given metabolic module was enriched in a group of genomes based on the output from anvi-estimate-metabolism. The URL https://merenlab.org/m/anvi-estimate-metabolism serves a tutorial for this program which details the modes of usage and output file formats (…)
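As a rough illustration of what a "level of completeness" can mean, the Python sketch below scores a module as the fraction of its steps for which at least one alternative KO was annotated in a genome. The module definition, the KO labels, and the function name are hypothetical, and this simplified version should not be read as the algorithm that anvi-estimate-metabolism implements. The tutorial linked in the excerpt describes the program's actual behavior.

```python
def module_completeness(module_steps, genome_kos):
    """Fraction of module steps satisfied by at least one annotated KO."""
    if not module_steps:
        return 0.0
    satisfied = sum(
        1 for alternatives in module_steps if genome_kos & set(alternatives)
    )
    return satisfied / len(module_steps)


# Hypothetical module with three steps; the second step accepts either of two KOs.
toy_module = [["KO_A"], ["KO_B", "KO_C"], ["KO_D"]]

# Hypothetical set of KOs annotated in one genome.
genome_kos = {"KO_A", "KO_C", "KO_X"}

completeness = module_completeness(toy_module, genome_kos)
print(f"{completeness:.2f}")
```

Per-genome scores like this one can then be thresholded to call a module "complete" before testing for enrichment across groups. The threshold itself is an analysis choice and is not specified here.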

Single-amino acid variants

If you are using anvi’o to study microbial population genetics through single-codon or single-amino acid variants, please consider also citing this work:

Single-amino acid variants reveal evolutionary processes that shape the biogeography of a global SAR11 subclade

Delmont TO, Kiefl E, Kilinc O, Esen ÖC, Uysal I, Rappé MS, Giovannoni S, Eren AM. eLife, 8:e46497. (Author list includes co-first authors.)

Metapangenomics

Metapangenomics was first introduced in this study. If you are using anvi’o to investigate how to bring together pangenomes and metagenomes, please consider citing this work as well.

Linking pangenomes and metagenomes: the Prochlorococcus metapangenome

Delmont TO, Eren AM PeerJ, 6:e4320.