# anvi-export-locus [program]

This program helps you cut a 'locus' from a larger genetic context (e.g., contigs, genomes). By default, anvi'o will locate a user-defined anchor gene, extend its selection upstream and downstream based on the --num-genes argument, then extract the locus to create a new contigs database. The anchor gene must be provided as --search-term, --gene-caller-ids, or --hmm-sources. If --flank-mode is set, you must provide two flanking genes that define the locus region (please see the --flank-mode help for more information). If everything goes as planned, anvi'o will give you an individual locus contigs database for every matching anchor gene found in the original contigs database. Enjoy your mini contigs databases!

## Authors

Matthew Schechter
Evan Kiefl
Alon Shaiber

## Usage

This program lets you export selections of your contigs-db around all occurrences of a user-defined anchor gene.

The output is a directory that contains a separate contigs-db for the region around each hit of the anchor gene. (In fact, you’ll get a FASTA file, a contigs-db, a profile-db, and a copy of the runlog.)

For example, you could specify the recognition site for a specific enzyme and use this program to pull out all potential sites where that enzyme could bind.

### Required Parameters

You’ll need to provide a contigs-db (of course), as well as the name of the output directory and a prefix to use when naming all of the output databases.

You can define the region of interest either by naming the two genes that flank it or by searching for an anchor gene and setting the number of genes around it that you want to include. For example, if you set --num-genes to 1, then each locus will contain the gene of interest, one gene upstream of it, and one gene downstream of it, for a total of three genes.
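The gene count in the example above can be sketched as simple arithmetic. This is only an illustration: `locus_window` is a hypothetical helper, not part of anvi'o, and it assumes that gene caller IDs are consecutive integers along a single contig and that the locus does not run off the end of the contig.

```shell
# Illustrative arithmetic only (hypothetical helper, not an anvi'o command).
# Assumes consecutive integer gene caller IDs on one contig and ignores
# contig boundaries.
locus_window() {
    local anchor="$1"
    local num_genes="$2"
    local first=$(( anchor - num_genes ))   # furthest gene on one side
    local last=$(( anchor + num_genes ))    # furthest gene on the other side
    local total=$(( 2 * num_genes + 1 ))    # both flanks plus the anchor itself
    echo "${first} ${last} ${total}"
}

# Anchor gene 10 with one gene on each side: prints "9 11 3"
locus_window 10 1
```

With a value of 2, as used in the commands in this document, the same arithmetic gives five genes per locus.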

### Defining the region of interest

There are four ways to indicate the desired anchor gene:

1. Provide a search term to look for in the functional annotations of all of your genes. (If you’re trying to find a gene with a vague function, you might want to first use anvi-search-functions to see which genes will show up. Alternatively, you can use anvi-export-functions to look at a full list of the functional annotations in this database.)

        anvi-export-locus -c contigs-db \
                          --num-genes 2 \
                          -o GLYCO_DIRECTORY \
                          -O Glyco \
                          --search-term "Glycosyltransferase involved in cell wall bisynthesis"

You also have the option to specify an annotation source with the flag --annotation-source.

2. Provide a specific gene caller ID.

        anvi-export-locus -c contigs-db \
                          --num-genes 2 \
                          -o output_directory \
                          -O GENE_1 \
                          --gene-caller-ids 1

3. Provide a search term for the HMM source annotations. To do this, you must also specify an hmm-source. (You can use the flag --list-hmm-sources to list the available sources).

        anvi-export-locus -c contigs-db \
                          --num-genes 2 \
                          -o Ribosomal_S20p \
                          -O Ribosomal_S20p \
                          --use-hmm \
                          --hmm-source Bacteria_71 \
                          --search-term Ribosomal_S20p

4. Run in flank-mode and provide two flanking genes that define the locus region.

        anvi-export-locus -c contigs-db \
                          --flank-mode \
                          -o locus_output \
                          -O gyclo_to_acyl \
                          --search-term "Glycosyltransferase involved in cell wall bisynthesis","Acyl carrier protein"
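If you want to export loci around several known gene caller IDs, it may help to review the commands before running them. The sketch below is a dry run: `print_locus_cmds` is a hypothetical helper that only prints one command per ID, using the flags shown in the examples above. The contigs-db name, output directory, and IDs are placeholders, and the `GENE_<id>` prefix is just a naming choice that mirrors the gene caller ID example.

```shell
# Dry-run sketch (hypothetical helper): prints the commands instead of
# running them, so nothing is written to disk.
# Usage: print_locus_cmds CONTIGS_DB OUT_DIR NUM_GENES GENE_ID [GENE_ID ...]
print_locus_cmds() {
    local contigs_db="$1"
    local out_dir="$2"
    local num_genes="$3"
    shift 3                      # the remaining arguments are gene caller IDs
    local gene_id
    for gene_id in "$@"; do
        echo "anvi-export-locus -c ${contigs_db} --num-genes ${num_genes} -o ${out_dir} -O GENE_${gene_id} --gene-caller-ids ${gene_id}"
    done
}

# Placeholder inputs: two gene caller IDs, two genes on each side of the anchor
print_locus_cmds contigs-db output_directory 2 1 7
```

Once the printed commands look right, they can be copied into a terminal or piped to a shell.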

