Microbial 'omics

A tutorial on the anvi'o interactive interface

This tutorial is tailored for anvi’o v2.3.0 or later. You can learn the version of your installation by typing anvi-interactive -v. If you have an older version, some things will not work the way they should.

The purpose of this tutorial is to give you a brief idea about the capabilities of the anvi’o interactive interface using an intuitive dataset: the taxonomic profiles of 690 metagenomes from the Human Microbiome Project (HMP), generated by MetaPhlAn. To follow this tutorial, create a new directory anywhere on your computer and go into it from your terminal.

While this tutorial will take you through a simple analysis of a real dataset, a more comprehensive (but more abstract) tutorial on the data types the anvi’o interactive interface understands is also available.

The data matrix

We often represent our data in the following form:

  item_1 item_2 item_3 item_4 item_5 item_6 item_7 (…)
sample_1 ? ? ? ? ? ? ? (…)
sample_2 ? ? ? ? ? ? ? (…)
sample_3 ? ? ? ? ? ? ? (…)
sample_4 ? ? ? ? ? ? ? (…)
sample_5 ? ? ? ? ? ? ? (…)
sample_6 ? ? ? ? ? ? ? (…)
sample_7 ? ? ? ? ? ? ? (…)
sample_8 ? ? ? ? ? ? ? (…)
sample_9 ? ? ? ? ? ? ? (…)
(…) (…) (…) (…) (…) (…) (…) (…) (…)

The dataset we will go through in this tutorial is not different, and it follows the same structural organization:

Metagenome Streptococcus_mitis Propionibacterium_acnes Haemophilus_parainfluenzae Lactobacillus_crispatus Bacteroides_unclassified Corynebacterium_matruchotii (…)
SRS011061 0 0 0.0375 0 0.5463 0 (…)
SRS011090 78.99923 0.01181 1.86651 0 0 0 (…)
SRS011098 1.03629 0.00202 3.1655 0 0.00442 10.7104 (…)
SRS011126 0.80909 0 8.07113 0 0.06489 21.49041 (…)
SRS011132 1.8407 75.61046 0.15936 0 0 0 (…)
SRS011134 0.20981 0 0.0731 0 14.23341 0 (…)
SRS011140 2.70361 0.00204 18.00913 0 0.0154 0.04016 (…)
SRS011144 32.22543 0.06306 3.622 0 0.03833 0.18776 (…)
SRS011152 1.50179 0 14.26581 0 0.01726 19.2156 (…)
(…) (…) (…) (…) (…) (…) (…) (…)
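
If you would like to poke at a matrix like this outside of anvi’o, here is a minimal Python sketch that reads it into memory. It assumes the file is TAB-delimited with the row names in the first column (an assumption of this sketch, so check your copy), and the function name `read_matrix` is made up for illustration. The embedded sample is just three rows and three columns copied from the excerpt above:

```python
import csv
import io

# A few values copied from the excerpt above. The real data.txt has many
# more rows and columns. TAB as the delimiter is an assumption of this sketch.
SAMPLE = (
    "Metagenome\tStreptococcus_mitis\tPropionibacterium_acnes\tHaemophilus_parainfluenzae\n"
    "SRS011061\t0\t0\t0.0375\n"
    "SRS011090\t78.99923\t0.01181\t1.86651\n"
    "SRS011132\t1.8407\t75.61046\t0.15936\n"
)


def read_matrix(handle):
    """Return (column_names, {row_name: [float, ...]}) from a TAB-delimited matrix."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    columns = header[1:]  # the first header cell only labels the row names
    rows = {}
    for record in reader:
        if not record:  # skip empty lines
            continue
        rows[record[0]] = [float(value) for value in record[1:]]
    return columns, rows


columns, rows = read_matrix(io.StringIO(SAMPLE))
print(len(rows), "rows x", len(columns), "columns")
```

To try it on the real file, you could replace `io.StringIO(SAMPLE)` with `open("data.txt", newline="")`.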

You can download the full dataset,

 $ wget http://merenlab.org/tutorials/interactive-interface/files/data.txt

and you can take a very quick look at it in anvi’o:

$ anvi-interactive -d data.txt \
                   -p profile.db \
                   --title "Taxonomic profiles of 690 HMP metagenomes" \
                   --manual

[Figure: screenshot of the anvi’o interactive interface]

Because we haven’t provided any specific organization, anvi’o organizes all samples alphabetically (very smart, anvi’o, really, thanks). But we will fix that.

If you press m on your keyboard, you can toggle the information window.

Organizing items

Clearly there is not much to see in the previous display. We already know that these samples come from different environments (such as the gut and the oral cavity), so they can be organized much better than by sorting them alphabetically.

We can do a quick hierarchical clustering on the data using the program anvi-matrix-to-newick, and get back a newick “tree” that would make things a bit easier on our eyes:

 $ anvi-matrix-to-newick data.txt \
                         -o tree.txt
 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    --manual

[Figure: screenshot of the anvi’o interactive interface]

A bit better! While we are here, though, click “save state” and save a default state so we don’t have to press draw every time we run the interface going forward (later we will update that state). Although the program anvi-matrix-to-newick uses Euclidean distance and ward linkage by default to organize things, other distance metric and linkage options are available:

 $ anvi-matrix-to-newick data.txt \
                         -o tree.txt \
                         --distance braycurtis \
                         --linkage average

Available distance metrics include braycurtis, canberra, chebyshev, cityblock, correlation, cosine, dice, euclidean, hamming, jaccard, kulsinski, matching, minkowski, rogerstanimoto, russellrao, sokalmichener, sokalsneath, sqeuclidean, and yule. Available linkage algorithms include single, complete, average, weighted, centroid, median, and ward.
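
To get a feeling for why the choice of distance metric matters, here is a small sketch that computes two of the metrics above by hand, using their textbook definitions. This is not how anvi’o computes them internally; it is only an illustration. The three profiles are the first three taxa of three metagenomes from the excerpt shown earlier (two buccal mucosa samples and one nares sample, according to the additional data shown later in this tutorial):

```python
import math


def euclidean(u, v):
    """Straight-line distance between two profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def bray_curtis(u, v):
    """Sum of absolute differences divided by the sum of absolute sums."""
    total = sum(abs(a + b) for a, b in zip(u, v))
    if total == 0:
        # Both profiles are all zeros. Returning 0.0 here is a choice made
        # for this sketch; libraries may handle this case differently.
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / total


# Streptococcus_mitis, Propionibacterium_acnes, Haemophilus_parainfluenzae
buccal_1 = [78.99923, 0.01181, 1.86651]  # SRS011090
buccal_2 = [32.22543, 0.06306, 3.622]    # SRS011144
nares = [1.8407, 75.61046, 0.15936]      # SRS011132

for name, metric in (("euclidean", euclidean), ("braycurtis", bray_curtis)):
    print(name,
          "buccal vs buccal:", round(metric(buccal_1, buccal_2), 3),
          "buccal vs nares:", round(metric(buccal_1, nares), 3))
```

With only three taxa this is a toy comparison, but both metrics place the two buccal mucosa samples closer to each other than to the nares sample, while scaling the gap between them quite differently.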

Organizing layers

We definitely improved the organization of items based on their taxonomic makeup. But layers could have been organized better, as well (right? RIGHT?).

Doing it with anvi’o is somewhat similar to the initial step of organizing items (but notice that we now add the flag --transpose):

 $ anvi-matrix-to-newick data.txt \
                         --transpose \
                         -o layers-tree.txt
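
If it is not obvious what transposing buys us, the following sketch may help. It is a conceptual illustration only (not anvi’o code, and `transpose` is a made-up helper name): once rows and columns are swapped, every taxon becomes a row with one value per metagenome, so the same clustering procedure now groups taxa instead of metagenomes.

```python
def transpose(row_names, column_names, values):
    """Swap rows and columns of a matrix stored as a list of rows."""
    transposed = [list(column) for column in zip(*values)]
    # What used to be column names now label the rows, and vice versa.
    return column_names, row_names, transposed


# Three metagenomes and two taxa, copied from the excerpt shown earlier.
row_names = ["SRS011061", "SRS011090", "SRS011098"]
column_names = ["Streptococcus_mitis", "Haemophilus_parainfluenzae"]
values = [
    [0.0, 0.0375],
    [78.99923, 1.86651],
    [1.03629, 3.1655],
]

new_rows, new_columns, new_values = transpose(row_names, column_names, values)
for name, profile in zip(new_rows, new_values):
    print(name, profile)
```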

Now we have a tree file to organize our layers. However, using this tree file is not going to be as straightforward as passing the --tree parameter to the anvi-interactive command.

It will require us to add this information into the ‘layer orders’ table. This table is one of the additional data tables anvi’o often uses to enrich its displays.

A rather comprehensive description of these tables, and how to operate on them, is laid out in a separate article on anvi’o additional data tables.

If you read that article, you already know about the simple structure of the input file used to add new layer orders into a profile database. If you don’t want to spend time on it, you can download a ready-made file here:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/layer-orders.txt
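
If you would rather build such a file yourself from a tree like layers-tree.txt, a minimal sketch could look like the following. It assumes the three-column, TAB-delimited layout (item_name, data_type, data_value) that the anvi’o documentation describes for layer orders; please compare it with the downloaded file before relying on it. The function name, the order name `taxonomy_tree`, and the tiny newick string are all hypothetical and only there for illustration:

```python
def layer_orders_lines(order_name, newick):
    """Build the lines of a layer orders file holding a single newick order.

    The column names are an assumption based on the anvi'o documentation.
    """
    single_line_tree = newick.strip().replace("\n", "")
    header = "\t".join(["item_name", "data_type", "data_value"])
    record = "\t".join([order_name, "newick", single_line_tree])
    return [header, record]


# Hypothetical tree for three taxa. In practice this would be the content
# of the file produced by anvi-matrix-to-newick with --transpose.
example_tree = (
    "(Streptococcus_mitis:0.5,"
    "(Haemophilus_parainfluenzae:0.2,Corynebacterium_matruchotii:0.2):0.3);\n"
)

lines = layer_orders_lines("taxonomy_tree", example_tree)
print("\n".join(lines))
```

Writing these lines to a text file would give you something you could then hand to the import step.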

After taking a look at the contents of this file, you can import it in your profile database:

 $ anvi-import-misc-data layer-orders.txt \
                         --target-data-table layer_orders \
                         --pan-or-profile-db profile.db

and re-run anvi-interactive:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    --manual

You will be surprised to see that nothing has really changed. Why? Because you need to instruct anvi’o to use the new organization to order layers. This can be done from the “samples” tab (which clearly should be called ‘layers’ at this point):

[Figure: screenshot of the anvi’o interactive interface]

Then, if you click draw again, you will feel that we are getting somewhere.

[Figure: screenshot of the anvi’o interactive interface]

This may be a good time to save your default state by using the buttons down below in your settings panel so you don’t have to click Draw every time you start a new interactive interface.

Let’s go all corners

We are aware that most people have quite strong feelings against circular plots.

We like them because they display more data in the media we use for publishing (e.g., the A4 page size). But as you probably know, anvi’o can also give you ugly, cornered displays (hehe).

To honor all those who like corners better, we shall continue with the phylogram display for the rest of this tutorial. When you change the “Drawing type” to phylogram, it will initially look quite ugly. But after playing with settings a little bit, you can make it look more reasonable:

[Figure: screenshot of the anvi’o interactive interface]

Additional data for the items

Anvi’o can extend any view with additional data. For instance, we have some information about these metagenomes, such as the sampling site or the gender of the individual they originate from. We could display that information to improve our understanding of the data.

You can download the pre-prepared items additional data file from here:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/additional-items-data.txt

The first column of the additional data file is pretty much identical to the first column of the data file, but it is followed by some other data columns:

Metagenome Body_Site Body_Subsite Host_Gender
SRS011061 GastrointestinalTract Stool Female
SRS011090 Oral Buccal_mucosa Female
SRS011098 Oral Supragingival_plaque Female
SRS011126 Oral Supragingival_plaque Male
SRS011132 Airways Nares Male
SRS011134 GastrointestinalTract Stool Male
SRS011140 Oral Tongue_dorsum Male
SRS011144 Oral Buccal_mucosa Male
SRS011152 Oral Supragingival_plaque Male
(…) (…) (…) (…)
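
Since the two files are tied together only by the names in their first columns, a quick sanity check before importing does not hurt. The sketch below is an illustration under a couple of assumptions: both files are TAB-delimited with a header line, and the helper names are made up. The two embedded samples are small subsets of the excerpts above, chosen so that their last rows do not match:

```python
def first_column(text):
    """Return the first field of every non-empty line after the header."""
    lines = [line for line in text.splitlines() if line.strip()]
    return [line.split("\t")[0] for line in lines[1:]]


def compare_names(data_text, additional_text):
    """Return names missing from the additional data, and names missing from the data."""
    data_names = set(first_column(data_text))
    additional_names = set(first_column(additional_text))
    return sorted(data_names - additional_names), sorted(additional_names - data_names)


DATA = (
    "Metagenome\tStreptococcus_mitis\tPropionibacterium_acnes\n"
    "SRS011061\t0\t0\n"
    "SRS011090\t78.99923\t0.01181\n"
    "SRS011098\t1.03629\t0.00202\n"
)

ADDITIONAL = (
    "Metagenome\tBody_Site\tBody_Subsite\tHost_Gender\n"
    "SRS011061\tGastrointestinalTract\tStool\tFemale\n"
    "SRS011090\tOral\tBuccal_mucosa\tFemale\n"
    "SRS011126\tOral\tSupragingival_plaque\tMale\n"
)

missing_in_additional, missing_in_data = compare_names(DATA, ADDITIONAL)
print("In the data file but not in the additional data:", missing_in_additional)
print("In the additional data but not in the data file:", missing_in_data)
```

For the real files, you could read their contents with `open("data.txt").read()` and `open("additional-items-data.txt").read()` and pass those strings to `compare_names`.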

Careful readers know what’s up. We need to add these additional data into the profile database, right? Yes. And it will go exactly the way you imagine it would (note the change in target data table):

 $ anvi-import-misc-data additional-items-data.txt \
                         --target-data-table items \
                         --pan-or-profile-db profile.db

Now you can re-run your interactive interface:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    --manual

to get this one (you will really have to squint to see the new layer at the bottom):

[Figure: screenshot of the anvi’o interactive interface]

Just a small tip while we are here: you can always zoom-in to a particular part of a given display by making a selection while pressing your shift key:

[Figure: screenshot of the anvi’o interactive interface]

You could have shown your items additional data in the interactive interface without importing it by using the --additional-layers parameter. But it is always better practice to import additional data into the profile database to minimize the number of files that need to be carried around for full reproducibility.

Additional data for the layers

How about extending layers with extra information? At this point we can at least add some taxonomy for these layers.

Here is a layers additional data file for the lazy:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/additional-layers-data.txt

After taking a look at the file, you can import it into the profile database:

 $ anvi-import-misc-data additional-layers-data.txt \
                         --target-data-table layers \
                         --pan-or-profile-db profile.db

And rerun the interactive interface:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    --manual

to get this one:

[Figure: screenshot of the anvi’o interactive interface]

I guess we can all agree that this figure looks unbearably ugly, and quite useless :(

Prettification

I had started this section by saying “prettification is clearly not a real word, but it absolutely should have been”. Then I Google’d it just for fun, and there it was! It is a real word, which means there is no reason for you to not do it:

[Figure: screenshot of the anvi’o interactive interface]

One of the most powerful aspects of anvi’o is how much control it gives you to communicate your results as clearly as possible. Prettification is working with the anvi’o display above and not letting it go until it starts to look like something that helps you convey your message.

Working with large SVG files can be challenging. We have some suggestions here to ameliorate that burden.

Let’s prettify this display step by step to get here:

[Figure: screenshot of the anvi’o interactive interface]

And this is the circular version, if you are curious AND not stubborn (hehe):

[Figure: screenshot of the anvi’o interactive interface]

You can import this visual display into your version by downloading the anvi’o state file:

 $ wget http://merenlab.org/tutorials/interactive-interface/files/pretty-state.json
 $ anvi-import-state -p profile.db -s pretty-state.json -n default

Now you can re-run your interface, and you will have it, too:

 $ anvi-interactive -d data.txt \
                    -p profile.db \
                    --title "Taxonomic profiles of 690 HMP metagenomes" \
                    --tree tree.txt \
                    --manual

Sharing anvi’o displays interactively

[If it wasn’t so late, Meren was going to write a paragraph here that first would talk about how important it is to give our peers access to our interactive displays, and then would introduce our ongoing project, anvi’server].

Here, you can view the same figure interactively on anvi’server:

https://anvi-server.org/merenlab/hmp_metagenomes

Final words

Do you want more examples? Do you have questions? Please don’t hesitate to get in touch with us!