Microbial 'omics


anvi-get-sequences-for-gene-calls [program]

A script to get back sequences for gene calls.

Provides

genes-fasta

Requires or uses

contigs-db, genomes-storage-db

Usage

This program allows you to export the sequences of your gene calls from a contigs-db or genomes-storage-db in the form of a genes-fasta.

If you want other information about your gene calls from a contigs-db, you can run anvi-export-gene-calls (which outputs a gene-calls-txt) or get the coverage and detection information with anvi-export-gene-coverage-and-detection.

Running on a contigs database

You can run this program on a contigs-db like so:

anvi-get-sequences-for-gene-calls -c contigs-db \
                                  -o path/to/output

This will create a genes-fasta that contains every gene in your contigs database. If you only want a specific subset of genes, you can run the following:

anvi-get-sequences-for-gene-calls -c contigs-db \
                                  -o path/to/output \
                                  --gene-caller-ids 897,898,1312 \
                                  --delimiter ,

Now the resulting genes-fasta will contain only those three genes.
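If your gene caller IDs live in a file rather than on the command line, a small shell helper can build the comma-separated list for you, and a second one can confirm how many sequences ended up in the output. The following is a minimal sketch, not part of anvi'o: the function names `ids_to_csv` and `count_fasta_records` and the file name `gene_ids.txt` are hypothetical, and the input is assumed to hold one gene caller ID per line.

```shell
# Hypothetical helper: join a file with one gene caller ID per line
# into a single comma-separated string, skipping blank lines.
ids_to_csv() {
    grep -v '^[[:space:]]*$' "$1" | paste -sd, -
}

# Hypothetical helper: count the records in a FASTA file.
# Every FASTA record starts with a '>' defline, so counting those lines is enough.
count_fasta_records() {
    grep -c '^>' "$1"
}

# Example use (assumes gene_ids.txt exists and anvi'o is installed):
#
#   anvi-get-sequences-for-gene-calls -c contigs-db \
#                                     -o path/to/output \
#                                     --gene-caller-ids "$(ids_to_csv gene_ids.txt)" \
#                                     --delimiter ,
#   count_fasta_records path/to/output
```

If the count does not match the number of IDs you supplied, it is worth checking the ID file for typos or for IDs that are not present in the contigs database.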

You also have the option to report the output in gff3 format, report extended deflines for each gene, or report amino acid sequences instead of nucleotide sequences.

Running on a genomes storage database

You can also get the sequences from gene calls in a genomes-storage-db, like so:

anvi-get-sequences-for-gene-calls -g genomes-storage-db \
                                  -o path/to/output

This will create a genes-fasta that contains every gene in your genomes storage database. To focus on only a subset of the genomes contained in your database, use the flag --genome-names. You can provide a comma-delimited list of genome names or a flat text file that contains one genome name per line. Alternatively, you can restrict the output to specific genes by providing a list of gene caller IDs with --gene-caller-ids, as described above.
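As an illustration of the flat-file option, the sketch below writes a genome list with one name per line and shows how it could be passed to --genome-names. The helper `write_genome_names`, the file name `genomes.txt`, and the genome names `genome_A` and `genome_B` are all hypothetical placeholders; use the names actually stored in your genomes-storage-db.

```shell
# Hypothetical helper: write the given genome names to a flat text file,
# one name per line. Usage: write_genome_names OUT_FILE NAME [NAME ...]
write_genome_names() {
    out_file="$1"
    shift
    # Refuse to write an empty list.
    [ "$#" -gt 0 ] || return 1
    printf '%s\n' "$@" > "$out_file"
}

# Example use (assumes anvi'o is installed and the names exist in the database):
#
#   write_genome_names genomes.txt genome_A genome_B
#   anvi-get-sequences-for-gene-calls -g genomes-storage-db \
#                                     -o path/to/output \
#                                     --genome-names genomes.txt
```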

You also have the option to report the output in gff3 format, report extended deflines for each gene, or report amino acid sequences instead of nucleotide sequences.
