Microbial 'omics

A tutorial on the anvi'o interactive interface

This tutorial is tailored for anvi’o v2.3.0 or later. You can learn the version of your installation by typing anvi-interactive -v. If you have an older version, some things will not work the way they should.

The purpose of this tutorial is to give you a brief idea about the capabilities of the anvi’o interactive interface using an intuitive dataset: the taxonomic profiles of 690 metagenomes from the Human Microbiome Project (HMP) generated by MetaPhlAn. To follow this tutorial, create a new directory anywhere on your computer, and go into it from your terminal.

While this tutorial will take you through a simple analysis of a real dataset, a more comprehensive (but more abstract) tutorial on the data types the anvi’o interactive interface understands is also available.

The data matrix

We often represent our data in the following form:

  item_1 item_2 item_3 item_4 item_5 item_6 item_7 (…)
sample_1 ? ? ? ? ? ? ? (…)
sample_2 ? ? ? ? ? ? ? (…)
sample_3 ? ? ? ? ? ? ? (…)
sample_4 ? ? ? ? ? ? ? (…)
sample_5 ? ? ? ? ? ? ? (…)
sample_6 ? ? ? ? ? ? ? (…)
sample_7 ? ? ? ? ? ? ? (…)
sample_8 ? ? ? ? ? ? ? (…)
sample_9 ? ? ? ? ? ? ? (…)
(…) (…) (…) (…) (…) (…) (…) (…) (…)

The dataset we will go through in this tutorial is not different, and it follows the same structural organization:

Metagenome Streptococcus_mitis Propionibacterium_acnes Haemophilus_parainfluenzae Lactobacillus_crispatus Bacteroides_unclassified Corynebacterium_matruchotii (…)
SRS011061 0 0 0.0375 0 0.5463 0 (…)
SRS011090 78.99923 0.01181 1.86651 0 0 0 (…)
SRS011098 1.03629 0.00202 3.1655 0 0.00442 10.7104 (…)
SRS011126 0.80909 0 8.07113 0 0.06489 21.49041 (…)
SRS011132 1.8407 75.61046 0.15936 0 0 0 (…)
SRS011134 0.20981 0 0.0731 0 14.23341 0 (…)
SRS011140 2.70361 0.00204 18.00913 0 0.0154 0.04016 (…)
SRS011144 32.22543 0.06306 3.622 0 0.03833 0.18776 (…)
SRS011152 1.50179 0 14.26581 0 0.01726 19.2156 (…)
(…) (…) (…) (…) (…) (…) (…) (…)
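
For any file laid out like this, a quick sanity check of its shape can be sketched in the shell. The sketch assumes the matrix is TAB-delimited with a header row and item names in the first column; the helper name `matrix_dims` is made up for illustration, and the small file is a three-row excerpt of the table above rather than the real dataset.

```shell
# Report how many items (rows) and layers (columns) a TAB-delimited matrix has.
# Assumes the first row is a header and the first column holds item names.
matrix_dims() {
    awk -F'\t' 'NR == 1 { layers = NF - 1 }
                END     { printf "%d items x %d layers\n", NR - 1, layers }' "$1"
}

# A tiny stand-in for the real file, built from values shown in the table above.
demo=$(mktemp)
printf 'Metagenome\tStreptococcus_mitis\tPropionibacterium_acnes\tHaemophilus_parainfluenzae\n' > "$demo"
printf 'SRS011061\t0\t0\t0.0375\n'               >> "$demo"
printf 'SRS011090\t78.99923\t0.01181\t1.86651\n' >> "$demo"
printf 'SRS011098\t1.03629\t0.00202\t3.1655\n'   >> "$demo"

matrix_dims "$demo"
```

Once data.txt is on disk, `matrix_dims data.txt` should report one item per metagenome, provided the file follows the assumed layout.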

You can download the full dataset,

 $ wget http://merenlab.org/tutorials/interactive-interface/files/data.txt

and you can take a very quick look at it in anvi’o:

$ anvi-interactive -d data.txt \
                   -p profile.db \
                   --title "Taxonomic profiles of 690 HMP metagenomes" \
                   --manual

[Figure]

Because we haven’t provided any specific organization, anvi’o organizes all samples alphabetically (very smart, anvi’o, really, thanks). But we will fix that.

If you press m on your keyboard, you can toggle the information window on

Organizing items

Clearly there is not much to see in the previous display. We already know that these samples come from different environments (such as the gut and the oral cavity), so they can be organized much better than by sorting them alphabetically.

We can do a quick hierarchical clustering on the data using the program anvi-matrix-to-newick, and get back a newick “tree” that would make things a bit easier on our eyes:

 $ anvi-matrix-to-newick data.txt \
                         -o tree.txt
 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    --manual

[Figure]

A bit better! While we are here, though, click “save state” and save a default state so we don’t have to press draw every time we run the interface going forward (later we will update that state). Although the program anvi-matrix-to-newick uses Euclidean distance and ward linkage by default to organize things, other distance metric and linkage options are available:

 $ anvi-matrix-to-newick data.txt \
                         -o tree.txt \
                         --distance braycurtis \
                         --linkage average

Available distance metrics include braycurtis, canberra, chebyshev, cityblock, correlation, cosine, dice, euclidean, hamming, jaccard, kulsinski, matching, minkowski, rogerstanimoto, russellrao, sokalmichener, sokalsneath, sqeuclidean, and yule. Available linkage algorithms include single, complete, average, weighted, centroid, median, and ward.
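
To get a feel for how the choice of metric changes what “similar” means, here is a minimal sketch that computes two of these distances between a pair of rows with awk. It uses the standard textbook definitions (Euclidean: square root of the summed squared differences; Bray-Curtis: summed absolute differences divided by the summed totals) and is only an illustration, not anvi’o’s own code. The helper name `row_distance` and the toy values are assumptions, and the input is assumed to be TAB-delimited.

```shell
# Compute Euclidean and Bray-Curtis distances between two rows of a
# TAB-delimited matrix (header row, item names in the first column).
# Usage: row_distance FILE ITEM_A ITEM_B   (hypothetical helper)
row_distance() {
    awk -F'\t' -v a="$2" -v b="$3" '
        $1 == a { for (i = 2; i <= NF; i++) x[i] = $i; n = NF }
        $1 == b { for (i = 2; i <= NF; i++) y[i] = $i }
        END {
            for (i = 2; i <= n; i++) {
                d = x[i] - y[i]
                sq  += d * d               # squared differences
                ad  += (d < 0 ? -d : d)    # absolute differences
                tot += x[i] + y[i]         # total abundance in both rows
            }
            printf "euclidean=%.4f braycurtis=%.4f\n", sqrt(sq), (tot > 0 ? ad / tot : 0)
        }' "$1"
}

# Toy matrix: two samples that share no taxa at all.
toy=$(mktemp)
printf 'item\ttaxon_1\ttaxon_2\n' > "$toy"
printf 's1\t3\t0\n'              >> "$toy"
printf 's2\t0\t4\n'              >> "$toy"

row_distance "$toy" s1 s2
```

For these two toy samples the Bray-Curtis distance reaches its maximum of 1 because they have no taxon in common, while the Euclidean distance depends on the magnitude of the abundances.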

Organizing layers

We definitely improved the organization of items based on their taxonomic makeup. But layers could have been organized better, as well. The initial step is somewhat similar (but notice that we now add the flag --transpose):

 $ anvi-matrix-to-newick data.txt \
                         --transpose \
                         -o layers-tree.txt
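
Conceptually, transposing swaps the roles of rows and columns, so the clustering operates on taxa instead of metagenomes. The sketch below illustrates the idea on a small excerpt of the data; it assumes a TAB-delimited file, the helper name `transpose_tsv` is made up, and it is not how anvi’o implements the flag.

```shell
# Transpose a TAB-delimited matrix: rows become columns and vice versa.
transpose_tsv() {
    awk -F'\t' '
        { for (i = 1; i <= NF; i++) cell[NR, i] = $i
          if (NF > ncol) ncol = NF }
        END {
            for (i = 1; i <= ncol; i++) {
                line = cell[1, i]
                for (r = 2; r <= NR; r++) line = line "\t" cell[r, i]
                print line
            }
        }' "$1"
}

# Two metagenomes and two taxa taken from the table shown earlier.
small=$(mktemp)
printf 'Metagenome\tStreptococcus_mitis\tPropionibacterium_acnes\n' > "$small"
printf 'SRS011090\t78.99923\t0.01181\n' >> "$small"
printf 'SRS011132\t1.8407\t75.61046\n'  >> "$small"

transpose_tsv "$small"
```

In the transposed output each line starts with a taxon name, which is what lets the same clustering procedure produce a tree of layers.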

Now we have a tree file to organize our layers. However, using this tree file requires a minimal understanding of what we call the ‘samples database’ in anvi’o jargon. A rather comprehensive description of the samples database concept is laid out here:

http://merenlab.org/2015/11/10/samples-db/

If you read that, you already know about the samples order file. If you don’t want to spend time on it, you can download it here:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/samples-order.txt

Now you can generate a samples database,

 $ anvi-gen-samples-info-database -R samples-order.txt \
                                  -o samples.db

and re-run anvi-interactive,

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    -s samples.db \
                    --manual

You will be surprised to see that nothing has really changed. Why? Because you need to instruct anvi’o to use the new organization to order layers. This can be done from the “samples” tab (which clearly should be called ‘layers’ at this point):

[Figure]

Then, if you click draw again, you will feel that we are getting somewhere.

[Figure]

Going all corners

We are aware that most people have quite strong feelings against circular plots.

We like them because they display more data in the media we use for publishing (e.g., the A4 page size). But as you probably know, anvi’o can also give you ugly, cornered displays (hehe).

To honor all those who like corners better, we shall continue with the phylogram display for the rest of this tutorial. When you change the “Drawing type” to phylogram, it will initially look quite ugly. But after playing with settings a little bit, you can make it look more reasonable:

[Figure]

Additional data for the items

Anvi’o can extend any view with additional data. For instance, we have some information about these metagenomes, such as the sampling site or the gender of the individual they originate from. We could display that information to improve our understanding of the data.

You can download the pre-prepared additional data file from here:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/additional-data.txt

The first column of the additional data file is pretty much identical to that of the data file, but it is followed by other data columns:

Metagenome Body_Site Body_Subsite Host_Gender
SRS011061 GastrointestinalTract Stool Female
SRS011090 Oral Buccal_mucosa Female
SRS011098 Oral Supragingival_plaque Female
SRS011126 Oral Supragingival_plaque Male
SRS011132 Airways Nares Male
SRS011134 GastrointestinalTract Stool Male
SRS011140 Oral Tongue_dorsum Male
SRS011144 Oral Buccal_mucosa Male
SRS011152 Oral Supragingival_plaque Male
(…) (…) (…) (…)
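
Whether the item names in two such files really agree can be checked before handing them to anvi’o. The following is a minimal sketch under the assumption that both files are TAB-delimited with a header row; `item_names` and `compare_items` are hypothetical helper names, and the two small files stand in for the real data and additional data files.

```shell
# Print the item names (first column, header skipped) of a TAB-delimited file, sorted.
item_names() {
    tail -n +2 "$1" | cut -f 1 | sort
}

# List names that occur in only one of the two files; no output means they agree.
# Names only in the first file are printed as-is, names only in the second
# file are printed with a leading TAB (this is how `comm -3` formats them).
compare_items() {
    _a=$(mktemp); _b=$(mktemp)
    item_names "$1" > "$_a"
    item_names "$2" > "$_b"
    comm -3 "$_a" "$_b"
    rm -f "$_a" "$_b"
}

# Tiny stand-ins for the data matrix and the additional data file.
d=$(mktemp); m=$(mktemp)
printf 'Metagenome\tStreptococcus_mitis\nSRS011061\t0\nSRS011090\t78.99923\n' > "$d"
printf 'Metagenome\tBody_Site\nSRS011090\tOral\nSRS011061\tGastrointestinalTract\n' > "$m"

compare_items "$d" "$m"   # prints nothing when the names agree
```

The order of rows does not matter here because both name lists are sorted before they are compared.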

Adding this additional information to an anvi’o display as new layers is quite straightforward:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    -s samples.db \
                    -A additional-data.txt \
                    --manual

[Figure]

Just a small tip while we are here: you can always zoom-in to a particular part of a given display by making a selection while pressing your shift key:

[Figure]

Additional data for the layers

How about extending layers with extra information? At this point we can at least add some taxonomy for these taxa. This information must be included in the samples database, just like the layers order information.

Here is one for the lazy:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/samples-information.txt

Now you can remove the old samples database, and generate a new one:

 $ rm samples.db
 $ anvi-gen-samples-info-database -R samples-order.txt \
                                  -D samples-information.txt \
                                  -o samples.db

And rerun the interactive interface:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    -s samples.db \
                    -A additional-data.txt \
                    --manual

to get this one:

[Figure]

I guess we can all agree that this figure looks unbearably ugly, and quite useless :(

Prettification

I had started this section by saying “prettification is clearly not a real word, but it absolutely should have been”. Then I Googled it just for fun, and there it was! It is a real word, which means there is no reason for you not to do it:

[Figure]

One of the most powerful aspects of anvi’o is how much control it gives you to communicate your results as effectively as possible. Prettification means working with the anvi’o display above and not letting it go until it starts to look like something that helps you convey your message.

Working with large SVG files can be challenging. We have some suggestions here to ameliorate that burden.

Let’s prettify this display step by step to get here:

[Figure]

And this is the circular version, if you are curious AND not stubborn (hehe):

[Figure]

You can import this visual display into your version by downloading the anvi’o state file:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/pretty-state.json
 $ anvi-import-state -p profile.db -s pretty-state.json -n default

Now you can re-run your interface, and you will have it, too:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    -s samples.db \
                    -A additional-data.txt \
                    --manual

Sharing anvi’o displays interactively

[If it wasn’t so late, Meren was going to write a paragraph here that first would talk about how important it is to give our peers access to our interactive displays, and then would introduce our ongoing project, anvi’server].

Here, you can view the same figure interactively on anvi’server:

https://anvi-server.org/merenlab/hmp_metagenomes

Final words

Do you want more examples? Do you have questions? Please don’t hesitate to get in touch with us!