anvi-script-calculate-pn-ps-ratio [program]

This program calculates, for each gene, the pN/pS ratio (the metagenomic analog of dN/dS) from metagenomic read recruitment. Unlike standard pN/pS calculations, it relies on codons rather than nucleotides to estimate synonymity accurately.

Can provide

pn-ps-data

Can consume

contigs-db variability-profile-txt

Usage

This program calculates the pN/pS ratio for each gene in a contigs-db and outputs it as a pn-ps-data artifact.

What is the pN/pS ratio?

The pN/pS ratio (first described in Schloissnig et al. 2012) is the ratio of two rates: the rate of non-synonymous polymorphism (pN) and the rate of synonymous polymorphism (pS). It is analogous to dN/dS, the ratio of the rates of non-synonymous (dN) and synonymous (dS) substitutions between two strains or species. We calculate pN/pS from allele frequencies obtained through SCVs and SAAVs; see the publication in preparation for exact implementation details.
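To make the idea concrete, here is a minimal Python sketch of how codon frequencies at one variable position could be split into synonymous and non-synonymous parts, and how the two rates could then be combined. It is an illustration only, not the implementation this program uses: the function names, the example frequencies, and the per-gene counts of potential sites are all hypothetical, and normalizing by the number of potential sites is an assumed, generic formulation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (TTT, TTC, TTA, ...).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def split_codon_frequencies(reference_codon, codon_freqs):
    """Sum the frequencies of non-reference codons, split by whether they
    encode the same amino acid as the reference (synonymous) or not."""
    reference_aa = CODON_TABLE[reference_codon]
    syn = nonsyn = 0.0
    for codon, freq in codon_freqs.items():
        if codon == reference_codon:
            continue
        if CODON_TABLE[codon] == reference_aa:
            syn += freq
        else:
            nonsyn += freq
    return syn, nonsyn


def pn_ps(nonsyn_total, syn_total, n_nonsyn_sites, n_syn_sites):
    """Assumed formulation: each total is divided by the number of potential
    sites of that kind before taking the ratio. Returns NaN if pS is zero."""
    pn = nonsyn_total / n_nonsyn_sites
    ps = syn_total / n_syn_sites
    return pn / ps if ps > 0 else float("nan")


# Hypothetical position: reference GAA (Glu), with GAG (Glu) and GAT (Asp) competing.
syn, nonsyn = split_codon_frequencies("GAA", {"GAA": 0.80, "GAG": 0.15, "GAT": 0.05})
# Hypothetical counts of potential non-synonymous and synonymous sites.
ratio = pn_ps(nonsyn, syn, n_nonsyn_sites=2.5, n_syn_sites=0.5)
```

Working on whole codons means a variant such as GAA to GAG is judged by the amino acid it encodes rather than by the single nucleotide that changed.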

Neat. How do I use this program?

First, you’ll need to run anvi-gen-variability-profile twice with the same parameters on the same databases. The first time, use the flag --engine AA to get a variability-profile-txt for SAAVs (single amino acid variants), which we’ll name SAAVs.txt in this example. The second time, use the flag --engine CDN to get a variability-profile-txt for SCVs (single codon variants), which we’ll name SCVs.txt in this example.
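One way to keep the two runs identical apart from the engine is to build both commands from a single list of shared arguments. The Python sketch below only assembles and prints the commands; it does not run them. The helper name is hypothetical, PROFILE.db and CONTIGS.db are placeholder file names, and the -p, -c, and -o flags are assumptions that should be checked against the help menu of anvi-gen-variability-profile.

```python
import shlex


def variability_profile_commands(shared_args):
    """Return two argument lists that differ only in engine and output file."""
    runs = {"AA": "SAAVs.txt", "CDN": "SCVs.txt"}
    return [
        # -o (output file) is an assumed flag; verify it in the program help.
        ["anvi-gen-variability-profile", *shared_args, "--engine", engine, "-o", output]
        for engine, output in runs.items()
    ]


# Placeholder database names; -p and -c are assumed flags for the two databases.
shared = ["-p", "PROFILE.db", "-c", "CONTIGS.db"]
for command in variability_profile_commands(shared):
    print(shlex.join(command))
```

Each list can be passed to subprocess.run once the flags have been confirmed for your anvi’o version.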

Then you can run this program like so:

anvi-script-calculate-pn-ps-ratio -a SAAVs.txt \
                                  -b SCVs.txt \
                                  -c contigs-db \
                                  -o output_dir

This will result in a directory called output_dir that contains several tables that describe each of your genes. See pn-ps-data for more information.

Other parameters

By default, this program ignores some of the genes and variable positions in your variability profiles. You can make it more sensitive, or have it ignore more positions, by changing any of these three parameters:

  • The minimum departure from consensus for a variable position (default: 0.10).
  • The minimum number of SCVs in a gene (default: 4).
  • The minimum coverage at a variable position (default: 30).
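The effect of these three thresholds can be sketched in Python as follows. This is a simplified illustration rather than the program’s own filtering code: the function name is hypothetical, the column names (departure_from_consensus, coverage, corresponding_gene_call) are assumed to match those in your variability profile, and both the order of the filters and the use of inclusive comparisons are assumptions.

```python
from collections import defaultdict


def filter_scvs(scv_rows, min_departure=0.10, min_coverage=30, min_scvs_per_gene=4):
    """Keep variable positions that pass the per-position thresholds, then
    keep only the genes that still have enough SCVs."""
    kept = defaultdict(list)
    for row in scv_rows:
        passes_departure = row["departure_from_consensus"] >= min_departure
        passes_coverage = row["coverage"] >= min_coverage
        if passes_departure and passes_coverage:
            kept[row["corresponding_gene_call"]].append(row)
    return {
        gene: rows for gene, rows in kept.items() if len(rows) >= min_scvs_per_gene
    }


# Hypothetical rows: gene 1 has four well-covered SCVs, gene 2 has only one.
rows = [
    {"corresponding_gene_call": 1, "departure_from_consensus": 0.2, "coverage": 50}
    for _ in range(4)
] + [
    {"corresponding_gene_call": 1, "departure_from_consensus": 0.05, "coverage": 50},
    {"corresponding_gene_call": 2, "departure_from_consensus": 0.3, "coverage": 40},
]
surviving = filter_scvs(rows)  # only gene 1 remains, with four positions
```

Lowering any of the three defaults lets more positions and genes through; raising them makes the filter stricter.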
