Microbial 'omics

Brought to you by

# A tutorial on the anvi'o interactive interface

This tutorial is tailored for anvi’o v2.3.0 or later. You can learn the version of your installation by typing anvi-interactive -v in your terminal.

The purpose of this tutorial is to give you a brief idea about the capabilities of the anvi’o interactive interface using an intuitive dataset without using any of the actual anvi’o functionality. The dataset we will use throughout this page is the taxonomic profiles of 690 metagenomes from the Human Microbiome Project (HMP)) generated by MetaPhlAn.

While this tutorial will take you through a simple analysis of a real dataset, there also is available a more comprehensive (but more abstract) tutorial on data types in the anvi’o interactive interface understands.

## The data matrix

Anvi’o interactive interface is often initiated to display and manupilate data stored in anvi’o databases. However, in this simple tutorial we will use the interactive interface in ‘manual mode’ by providing it with the data to diplay manually.

We often work with data tables that look like this:

item_1 item_2 item_3 item_4 item_5 item_6 item_7 (…)
sample_1 ? ? ? ? ? ? ? (…)
sample_2 ? ? ? ? ? ? ? (…)
sample_3 ? ? ? ? ? ? ? (…)
sample_4 ? ? ? ? ? ? ? (…)
sample_5 ? ? ? ? ? ? ? (…)
sample_6 ? ? ? ? ? ? ? (…)
sample_7 ? ? ? ? ? ? ? (…)
sample_8 ? ? ? ? ? ? ? (…)
sample_9 ? ? ? ? ? ? ? (…)
(…) (…) (…) (…) (…) (…) (…) (…) (…)

The dataset we will go through in this tutorial is not any different, and it follows the same structural organization:

Metagenome Streptococcus_mitis Propionibacterium_acnes Haemophilus_parainfluenzae Lactobacillus_crispatus Bacteroides_unclassified Corynebacterium_matruchotii (…)
SRS011061 0 0 0.0375 0 0.5463 0 (…)
SRS011090 78.99923 0.01181 1.86651 0 0 0 (…)
SRS011098 1.03629 0.00202 3.1655 0 0.00442 10.7104 (…)
SRS011126 0.80909 0 8.07113 0 0.06489 21.49041 (…)
SRS011132 1.8407 75.61046 0.15936 0 0 0 (…)
SRS011134 0.20981 0 0.0731 0 14.23341 0 (…)
SRS011140 2.70361 0.00204 18.00913 0 0.0154 0.04016 (…)
SRS011144 32.22543 0.06306 3.622 0 0.03833 0.18776 (…)
SRS011152 1.50179 0 14.26581 0 0.01726 19.2156 (…)
(…) (…) (…) (…) (…) (…) (…) (…)

Each row in this table represents a gut metagenome and every column represents a microbial taxon. The cells display the percent abundance of a given taxon in a given metagenome.

 $wget http://merenlab.org/tutorials/interactive-interface/files/data.txt  Then you can take a very quick look at it in anvi’o: $ anvi-interactive -d data.txt \
-p profile.db \
--title "Taxonomic profiles of 690 gut metagenomes" \
--manual


Because we haven’t provided any specific organization, anvi’o organizes all samples alphabetically. But we will recover from that.

If you press m on your keyboard, you can toggle the information window on that would show you the actual data points under your mouse pointer.

Once you are done looking at an interactive display and would like to continue running more commands, you should close the browser tab and kill the server by pressing CTRL+C key combination in your terminal.

## Organizing items

Clearly there is not much to see in the previous display. Because we already know that those samples come from different environments (such as gut, and oral cavity), and they can be organized much better than sorting them alphabetically.

We can do a quick hierarchical clustering on the data using the program anvi-matrix-to-newick , and get back a newick “tree” that would make things a bit easier on our eyes:

 $anvi-matrix-to-newick data.txt \ -o tree.txt  $ anvi-interactive -d data.txt \
-p profile.db \
--title "Taxonomic profiles of 690 gut metagenomes" \
--tree tree.txt \
--manual


A bit better!

Although the program anvi-matrix-to-newick uses Euclidean distance and ward linkage by default to organize things, available distance metrics include braycurtis, canberra, chebyshev, cityblock, correlation, cosine, dice, euclidean, hamming, jaccard, kulsinski, matching, minkowski, rogerstanimoto, russellrao, sokalmichener, sokalsneath, sqeuclidean, and yule with available linkage algorithms, single, complete, average, weighted, centroid, median, and ward.

## State files

Anvi’o enables you to a lot with its interactive interface when it comes to making visualization decisions. It also offers you a way to store these changes through so called states. A state is a JSON-formatted description of what you see in the interface and is stored in a anvi’o profile databases. There can be more than one state files in a given database, but if anvi’o finds a state in a given profile database called default, then it will automatically load it and click the Draw button on your behalf.

While we are here, let’s click “save state” and save a default state so we don’t have to press draw every time we run the interface going forward (later we will have to update this state as we continue editing the interface).

There are other ways to deal with states. For instance, if you go to your terminal and run the command,

anvi-help state


You will see that there are multiple programs that can export these state files. Just for fun, let’s export the state we just stored and take a look at it:

anvi-export-state -p profile.db \
-s default \
-o default.json


As you can imagine, an edited state file can also be imported to a profile database.

## Organizing layers

Going back to the dataset we have been playing with, we definitely have improved the display of items when we organized them based on the occurrence patterns of different taxa on metagenomes. But layers, which, in this particular dataset represent taxa, could have been organized better, as well.

Doing it with anvi’o is somewhat similar to the initial step of organizing items (but notice that we now add the flag --transpose):

 $anvi-matrix-to-newick data.txt \ --transpose \ -o layers-tree.txt  Now we have a tree file to organize our layers, however, the utilization of this tree file is not going to be as straightforward as using the --tree parameter for the anvi-interactive command. It will require us to add this information into the ‘layer orders’ table. This table is one of additional data table anvi’o often uses to enrich its displays. A rather comprehensive description of these tables, and how to operate on them is laid out here: If you read that article, you already know about the simple structure of the input file to add new layers orders into a profile database. If you don’t want to spend time on it you can download it here: $ wget http://merenlab.org/tutorials/interactive-interface/files/layer-orders.txt


After taking a look at the contents of this file, you can import it in your profile database:

 $anvi-import-misc-data layer-orders.txt \ --target-data-table layer_orders \ --pan-or-profile-db profile.db  and re-run the anvi-interactive, $ anvi-interactive -d data.txt \
-p profile.db \
--title "Taxonomic profiles of 690 gut metagenomes" \
--tree tree.txt \
--manual


You will be surprised to see that nothing has really changed. Why? Because you need to instruct anvi’o to use the new organization to order layers. This can be done from the “layers” tab:

Then, if you click draw again, you will feel that we are getting somewhere.

This may be a good time to update your default state.

## Let’s go all corners

We are aware that most people have quite strong feelings against circular plots.

We like them because they display more data in media we use for publishing (i.e., the A4 page size, etc). But as you probably know, anvi’o can also give you ugly, cornered displays (hehe).

To honor all those who like corners better, we shall continue with the phylogram display for the rest of this tutorial. When you change the “Drawing type” to phylogram, it will initially look quite ugly. But after playing with settings a little bit, you can make it look more reasonable:

## Additional data for the items

Anvi’o can extend any view with additional data. For instance, we have some information about these metagenomes. Such as the sampling site, or the gender of the individual they originate from. We could display that information to improve our understanding of the data.

 $wget http://merenlab.org/tutorials/interactive-interface/files/additional-items-data.txt  The first column of the additional data file is pretty much identical to the data file, but there are some other data columns in it: Metagenome Body_Site Body_Subsite Host_Gender SRS011061 GastrointestinalTract Stool Female SRS011090 Oral Buccal_mucosa Female SRS011098 Oral Supragingival_plaque Female SRS011126 Oral Supragingival_plaque Male SRS011132 Airways Nares Male SRS011134 GastrointestinalTract Stool Male SRS011140 Oral Tongue_dorsum Male SRS011144 Oral Buccal_mucosa Male SRS011152 Oral Supragingival_plaque Male (…) (…) (…) (…) Careful readers know what’s up. We need to add these additional data into the profile database, right? Yes. And it is will go exactly the way you imagine it would (note the change in target data table): $ anvi-import-misc-data additional-items-data.txt \
--target-data-table items \
--pan-or-profile-db profile.db


Now you can re-run your interactive interface:

 $anvi-interactive -d data.txt \ -p profile.db \ --title "Taxonomic profiles of 690 HMP metagenomes" \ --tree tree.txt \ --manual  to get this one (you will really squint your eyes to see the new layer at the bottom): Just a small tip while we are here: you can always zoom-in to a particular part of a given display by making a selection while pressing your shift key: You could have shown your items additional data in the interactive interface without importing it, but using the --additional-layers parameter. But it is always a better practice to import additional data into the profile database to minimize the number of files that need to be carried around for full reproducibility. ## Additional data for the layers How about extending layers with extra information? At this point we can at least add some taxonomy for these layers. Here is a layers additional data file for the lazy: $ wget http://merenlab.org/tutorials/interactive-interface/files/additional-layers-data.txt


After taking a look at the file, you can import it into the profile database:

 $anvi-import-misc-data additional-layers-data.txt \ --target-data-table layers \ --pan-or-profile-db profile.db  And rerun the interactive interface: $ anvi-interactive -d data.txt \
-p profile.db \
--title "Taxonomic profiles of 690 HMP metagenomes" \
--tree tree.txt \
--manual


to get this one:

I guess we can all agree that this figure looks unbearably ugly, and quite useless :(

## Prettification

I had started this section by saying “prettification is clearly not a real word, but it absolutely should should have been”. Then I Google’d it just for fun, and there it was! It is a real word, which means there is no reason for you to not do it:

One of the most powerful aspects of anvi’o is its ability to give you so much power to communicate your results as best as possible. Prettification is working with the anvi’o display above and not letting it go until it starts to look like something that helps you convey your message.

Working with large SVG files can be challenging. We have some suggestions here to ameliorate that burden.

Let’s step by step prettify this display, to get to here:

And this is the circular version, if you are curious AND not stubborn (hehe):

 $wget http://merenlab.org/tutorials/interactive-interface/files/pretty-state.json$ anvi-import-state -p profile.db -s pretty-state.json -n default


Now you can re-run your interface, and you will have it, too:

 \$ anvi-interactive -d data.txt \
-p profile.db \
--title "Taxonomic profiles of 690 HMP metagenomes" \
--tree tree.txt \
--manual