- 1. Data types and input files
- 2. Using the anvi’o interactive interface
This document last updated on April 1st, 2016.
With anvi’o you can do metagenomic binning, characterize single-nucleotide variation, study bacterial pangenomes, benchmark software tools, predict the number of bacterial genomes in a metagenomic assembly, or even remove contamination from eukaryotic assembly projects. Anvi’o’s ‘versatility’ partly comes from its integrated visualization framework, which allows the user to see all these different types of data and interact with them.
The anvi’o interactive interface is a fully customizable visualization environment for efficiently visualizing complex data. It can handle large datasets, and its source code is freely available within the anvi’o platform.
Although it is fully integrated with core anvi’o operations detailed in the metagenomic workflow tutorial, the visualization environment can be initiated in an ad hoc manner by using the anvi-interactive program with the --manual-mode flag, or through anvi’server, without an anvi’o installation. In summary, if you have a matrix file, anvi’o may be useful to you to generate high-quality, publication-ready figures with mouse clicks.
The purpose of this article is to provide a more detailed description of the interface, first by demonstrating the data types it can work with, and then by describing the user interface itself.
A little note on our ongoing project, anvi'server
To make the anvi’o interactive interface more accessible, we teamed up with Tobias Paczian, and with his remarkable efforts created a web service. This new service, which we call anvi’server, is now running at http://anvi-server.org. Through anvi’server, you can perform anvi’o visualizations by uploading your data through a simple interface:
Topics covered for the remainder of this article are directly applicable to the interactive interface whether it is accessed through local anvi’o installations, or through the anvi’server.
Please note that the anvi’server is under active development, and your testing efforts will be greatly appreciated. Please don’t hesitate to get in touch with us if you have any questions.
1. Data types and input files
The purpose of this section is to provide examples for each of the data types (and input files) the interactive interface can work with. For this, I will start with a simple tree, and add layers step by step to describe different data types.
You can follow these examples in two ways:
- Using anvi-interactive in your terminal: For each data type I will either provide a link to the files used in the example command line, or give an example file structure so you can try them on your own files.
- Using http://anvi-server.org: The other option is to use our new anvi’server without installing anvi’o. If your only purpose with the interactive interface is to do an ad hoc visualization, I think this would be the best way to go. Otherwise you can read about the ways to install the platform on your own server or laptop.
Command lines mentioned in this article are run on anvi’o version 2 or later. You can check your version using
OK. Let’s start.
1.1 Newick tree
The least you can do with the anvi’o interactive interface would be to visualize a newick-formatted tree. The tree file I use for this example contains 300 leaves. This is how I run the interactive interface from the command line:
Which gives me this:
Here is the anvi’server link for this visualization: http://anvi-server.org/public/meren/interface_demo_I
Let’s assume that for each item in the previous tree, you have multiple numerical values you want to overlay on the tree, stored in a TAB-delimited matrix file that looks like this:
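The exact file used in this example is not reproduced here. As a hypothetical illustration only, the Python sketch below writes a small TAB-delimited matrix in which the first column holds item names (these would need to match the leaf names in the tree) and the remaining columns hold numerical values. The header of the first column, the item names, and the values are all made up; the column names c1 and c2 are assumptions modeled on the c3 layer mentioned later in this article.

```python
import csv
import io


def write_matrix(rows, columns, handle, first_header="contig"):
    """Write a TAB-delimited matrix: one header line, then one line per item.

    `first_header` is a hypothetical name for the item-name column.
    """
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow([first_header] + list(columns))
    for name, values in rows.items():
        if len(values) != len(columns):
            raise ValueError(f"{name}: expected {len(columns)} values")
        writer.writerow([name] + list(values))


# Made-up items and values, purely for illustration.
rows = {"item_001": [0.5, 12, 3], "item_002": [1.5, 0, 7]}
buffer = io.StringIO()
write_matrix(rows, ["c1", "c2", "c3"], buffer)
matrix_text = buffer.getvalue()
print(matrix_text)
```

Writing to a file instead of a buffer only requires passing an open file handle in place of the StringIO object.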
This data can be visualized along with the tree this way:
Here is the anvi’server link for this visualization: http://anvi-server.org/public/meren/interface_demo_II
If you only have this matrix file but not the tree file, you can run this command to get the tree file:
Adding a text layer is as simple as adding a column of text values in your matrix file:
Now I can run the same command on this updated matrix file:
And here is the result:
Anvi’server link: http://anvi-server.org/public/meren/interface_demo_III
Please note that when I first visualized only the tree file, there were labels for each leaf. However, when I visualized the tree file along with the data, labels disappeared. Anvi’o is designed to visualize trees with thousands of items, where labels are rarely useful. Therefore the default behavior of the visualization interface is to omit labels when there is data, and labels are shown only when the user wants to visualize a single tree. If you would like to show labels, you can simply use the text data type to duplicate the first column in your matrix file:
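One way to produce such a file without editing it by hand is sketched below in Python. This is only an illustration: the function name, the header given to the new column, and the tiny example matrix are assumptions, not part of anvi’o.

```python
import csv
import io


def add_label_column(matrix_text, label_header="labels"):
    """Return a copy of a TAB-delimited matrix with the first column
    duplicated as a new last column, so it can act as a text layer."""
    rows = list(csv.reader(io.StringIO(matrix_text), delimiter="\t"))
    if not rows:
        return ""
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    header, body = rows[0], rows[1:]
    writer.writerow(header + [label_header])
    for row in body:
        # The item name in the first field becomes the label value.
        writer.writerow(row + [row[0]])
    return out.getvalue()


# Hypothetical two-item matrix used only to show the transformation.
example = "contig\tc1\nitem_001\t0.5\nitem_002\t1.5\n"
labeled = add_label_column(example)
print(labeled)
```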
Here I am adding a column of categorical data in the same file:
The same command line for the matrix file above:
Anvi’server link: http://anvi-server.org/public/meren/interface_demo_IV
Some of you are probably asking yourselves ‘what is the difference between text and categorical data?’. Very good question! Essentially they are both the same. If the number of unique items in a given column of non-integer values is more than 12, the interactive interface assumes that this is a text layer, and instead of assigning random colors to each item, it shows the values as text. In contrast, if there are 12 or fewer unique values, the interface by default treats it as a categorical data layer, and assigns random colors. Of course these colors can be changed quite easily through the interface, or programmatically by processing the ‘state’ file (more on this later).
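The rule above can be sketched in a few lines of Python. This is a minimal illustration of the threshold described in the text, not the interface’s actual code: the function name is made up, and treating a column as numerical whenever every value parses as a number is an assumption.

```python
def guess_layer_type(values, max_categories=12):
    """Guess how a column would be displayed, following the 12-value rule.

    Assumption: a column counts as numerical if every value parses as a number.
    """
    def is_number(value):
        try:
            float(value)
            return True
        except ValueError:
            return False

    if values and all(is_number(v) for v in values):
        return "numerical"
    # More than `max_categories` unique values: shown as text, no colors.
    if len(set(values)) > max_categories:
        return "text"
    return "categorical"


print(guess_layer_type(["1", "2.5", "3"]))
print(guess_layer_type(["soil", "marine", "soil"]))
print(guess_layer_type([f"label_{i}" for i in range(13)]))
```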
1.5 Stacked bars
Stacked bars are a bit tricky compared to the other data types, but nothing too complicated. Here is an example addition to our data file:
As you can see, the header field contains multipart information. What is before the exclamation mark is the ‘name’ of this data, which will appear in the interface as a label. The column separated values after the exclamation mark are subsets of the data. This header notation indicates that three numerical values will be present in each following field.
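To make the notation concrete, here is a small Python sketch that splits such a header into its name and parts, and checks that a data field carries one number per part. The example header is not reproduced in this text, so the separator between parts is an assumption (a semicolon by default, exposed as a parameter), and the header and values below are hypothetical.

```python
def parse_stacked_bar_header(header, separator=";"):
    """Split 'name!part1<sep>part2...' into (name, [parts]).

    The separator is an assumption; adjust it to match your file.
    """
    name, bang, parts = header.partition("!")
    if not bang or not name or not parts:
        raise ValueError(f"not a stacked-bar header: {header!r}")
    return name, parts.split(separator)


def parse_stacked_bar_field(field, n_parts, separator=";"):
    """Parse one data field and check it has one number per part."""
    values = [float(v) for v in field.split(separator)]
    if len(values) != n_parts:
        raise ValueError(f"expected {n_parts} values, got {len(values)}")
    return values


# Hypothetical header and field, for illustration only.
name, parts = parse_stacked_bar_header("bars!A;B;C")
values = parse_stacked_bar_field("3;4;5", len(parts))
print(name, parts, values)
```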
The command line for this matrix:
Anvi’server link: http://anvi-server.org/public/meren/interface_demo_V
1.6 Samples information
The samples information file contains one or more data types (numerical, categorical, or stacked-bar data) to provide contextual data for layers of interest. Layers of interest, in most cases, also correspond to samples in the study. In our test case, these layers correspond to c3 (see the previous matrix file).
Here is an example samples information file for the view data matrix file we have been using in previous steps for visualization:
Please note that this time the first column is composed of layer names that appeared as rows in the data matrix file. If you are using the interactive interface via the command line, you first need to create a samples database using this file. If you are using anvi’server, you can simply provide the TAB-delimited file via the new project window. Here is the command line (assuming the view data is identical to the one used in the previous example):
Which produces this:
Please consider reading this article if you are interested in learning more about the samples database.
Anvi’server link: http://anvi-server.org/public/meren/interface_demo_VI
1.7 Samples order
The samples order file contains information about different orderings of samples, so the user can access this information from the interface to arrange layers of interest. An order can be a comma-separated list of sample names, or it can be a newick tree for the organization of samples.
A samples order file can be used with or without a samples information file (see anvi-gen-samples-db -h for help if you are following these examples from your terminal). Here is an example samples order file for the data matrix file we have been using:
There can be as many lines as you like for different orderings; however, there must be exactly two columns, both of which should contain all names for layers of interest.
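As a rough illustration of the two kinds of orderings, the Python sketch below parses one line of such a file and decides whether the order is a newick tree or a comma-separated list. The function is hypothetical, the newick check (starts with an opening parenthesis and ends with a semicolon) is a simplifying assumption, and the layer names c1 and c2 are made up alongside the c3 layer used in this article.

```python
def parse_order_line(line):
    """Parse one 'name<TAB>order' line into (name, kind, order).

    kind is 'newick' for a tree string, 'list' for comma-separated names.
    """
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 2:
        raise ValueError(f"expected exactly two columns, got {len(fields)}")
    name, value = fields[0], fields[1].strip()
    # Simplifying assumption: a newick string starts with '(' and ends with ';'.
    if value.startswith("(") and value.endswith(";"):
        return name, "newick", value
    return name, "list", [sample.strip() for sample in value.split(",")]


# Hypothetical lines, for illustration only.
print(parse_order_line("test_list\tc3,c2,c1"))
print(parse_order_line("test_tree\t(c2:0.1,(c1:0.2,c3:0.3));"))
```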
Similar to the utilization of samples information file, you need to create a samples database for this file as well. To make things simpler, I will use the previous samples information file, along with this new order file to create a new samples database prior to calling anvi-interactive:
In this new display you will find that the two orderings (test_tree and test_list) appear in the ‘Sample order’ combo box under the ‘Samples’ tab on the left panel, along with all the orderings automatically generated based on the samples information file:
Selecting the test_tree order and re-drawing the tree will result in a tiny dendrogram for the layers of interest:
Anvi’server link (don’t forget to play with the Samples order combo box): http://anvi-server.org/public/meren/interface_demo_VII
2. Using the anvi’o interactive interface
We tried our best to make the anvi’o interface intuitive, easy to learn, and easy to use. In an ideal world, you shouldn’t need a tutorial to start using it, and should be able to learn it through trial and error. But here is a very general overview of the major components of the interface.
If you happen to realize there is something missing, or it would have been helpful to you if something was better explained in this document, please do not hesitate to drop us a line.
2.1 An overview of the display
The interactive interface has two major areas of interaction: the space for visualization on the right, and the left panel. The left panel gives access to various controls to work with the visualized data and improve its presentation.
2.2 The left panel
At the bottom of the layers tab there is a section with tiny controls that are available in all tabs. Through these controls you can,
- Create or refresh the display when necessary using the draw button (some changes require you to do that),
- Zoom in, zoom out, and center the display.
- Download your display as an SVG file.
See the post “Working with SVG files anvi’o generate” for tips about how to work with large SVG files using Inkscape.
2.2.1 Layers tab
Through the layers tab you can,
- Change general settings for the tree (e.g., switching between circular and rectangular displays, changing tree radius or width), and layers (e.g., editing layer margins, or activating custom layer margins).
- Save states to store all visual settings, or load a previously saved state.
- Customize individual layers by switching between different display modes depending on the layer type (e.g., ‘text’ or ‘color’ mode for categorical layers, or ‘bar’ or ‘intensity’ mode for numerical layers), set normalization (e.g., ‘square-root’ or ‘log’ normalization) and minimum and maximum cutoff values for numerical layers, or set layer height and layer margin (i.e., its distance from the previous layer).
- Use the multi-selector at the bottom to change settings for multiple layers at once.
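To give a feel for what the normalization and cutoff settings do to a numerical layer, here is a small Python sketch. It is not the interface’s implementation: the function names are made up, computing ‘log’ as log10(x + 1) is an assumption chosen so that zeros stay defined, and the way cutoffs and normalization are combined here is only one possibility.

```python
import math


def normalize(values, mode="none"):
    """Apply a per-value normalization to a numerical layer.

    Assumption: 'log' is computed as log10(x + 1) so that zeros stay defined.
    """
    if mode == "none":
        return list(values)
    if mode == "sqrt":
        return [math.sqrt(v) for v in values]
    if mode == "log":
        return [math.log10(v + 1) for v in values]
    raise ValueError(f"unknown normalization: {mode}")


def apply_cutoffs(values, minimum=None, maximum=None):
    """Clamp values to the [minimum, maximum] range; None means no cutoff."""
    clamped = []
    for v in values:
        if minimum is not None and v < minimum:
            v = minimum
        if maximum is not None and v > maximum:
            v = maximum
        clamped.append(v)
    return clamped


print(normalize([0, 9, 99], "log"))
print(normalize([4, 9], "sqrt"))
print(apply_cutoffs([1, 5, 10], minimum=2, maximum=8))
```

Compressing the range with a square-root or log transform tends to keep a few very large values from dominating the bar heights of a layer.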
2.2.2 Samples tab
The Samples tab is for the additional data you provide the interface through a samples database (see the samples order and samples information sections above). Through this tab you can,
- Change the order of layers using automatically-generated or user-provided orders of layers using the Sample order combo box,
- Customize individual samples information entries.
Changes in this tab can be reflected in the current display without re-drawing the entire tree, unless the sample order is changed.
2.2.3 Bins tab
Anvi’o allows you to create selections of items shown in the display (whether they are metagenomic contigs, 16S rRNA tags, or any other type of information). The Bins tab allows you to maintain these selections. Any selection on the tree will be added to the active bin in this tab (the state radio button next to a bin defines its activity). Through this tab you can,
- Create or delete bins, set bin names, change the color of a given bin, or sort bins based on their name, the number of units they carry, or completion and contamination estimates (completion / contamination estimates are only computed for genomic or metagenomic analyses).
- View the number of selected units in a given bin, and see the list of names in the selection by clicking the button that shows the number of units described in the bin.
- Store a collection of bins, or load a previously stored collection.
2.2.4 Mouse tab
The mouse tab displays the value of items underneath the mouse pointer while the user browses the tree.
Displaying the numerical or categorical value of an item shown on the tree is not an easy task. We originally thought that displaying pop-up windows would solve it, but besides the great overhead, it often became a nuisance while browsing parts of the tree. We could show those pop-up displays only when the user clicks on the tree; however, click behavior is much more appropriate for adding or removing individual items from a bin, so it wasn’t the best solution either. So we came up with the ‘mouse tab’. You have a better idea? I am not surprised! We would love to try to improve your experience: please enter an issue, and let’s discuss.
2.2.5 Search tab
It does what the name suggests. Using this tab you can,
- Build expressions to search items visualized in the main display.
- Highlight matches, and append them to, or remove them from the selected bin in the Bins tab.
Tips and tricks
Here are some small conveniences that may help the interface serve you better (we are happy to expand these little tricks with your suggestions).
You can zoom to a section of the display by making a rectangular selection of the area while the pressing the
You can click an entire branch to add items into the selected bin, and remove them by right-clicking a branch.
If you click a branch while pressing the CTRL button, it will create a new bin, and add the content of the selection into that bin.
5, you can go between Layers, Bins, Samples, Mouse, and Search tabs!