This program helps you cut a 'locus' from a larger genetic context (e.g., contigs, genomes). By default, anvi'o will locate a user-defined anchor gene, extend its selection upstream and downstream based on the --num-genes argument, then extract the locus to create a new contigs database. The anchor gene must be provided as --search-term, --gene-caller-ids, or --hmm-sources. If --flank-mode is designated, you MUST provide TWO flanking genes that define the locus region (please see the --flank-mode help for more information). If everything goes as planned, anvi'o will give you individual locus contigs databases for every matching anchor gene found in the original contigs database provided. Enjoy your mini contigs databases!
This program lets you export selections of your contigs-db around all occurrences of a user-defined anchor gene.
The output of this is a folder that contains a separate contigs-db for the region around each hit of the anchor gene. (In fact, you’ll get a FASTA file, contigs-db, profile-db, and a copy of the runlog).
For example, you could specify the recognition site for a specific enzyme and use this program to pull out all potential sites where that enzyme could bind.
You’ll need to provide a contigs-db (of course), as well as the name of the output directory and a prefix to use when naming all of the output databases.
You can define the region of interest either by defining the two flanking genes or by searching for an anchor gene and defining a number of genes around this gene that you want to look at. For example, if you set num-genes as 1, then each locus will contain the gene of interest, a gene upstream of it, and a gene downstream of it, for a total of three genes.
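The window arithmetic described above can be sketched in a few lines of shell. This is only an illustration and not anvi'o's own code: `locus_window` is a hypothetical helper, genes are represented by their 1-based position along a contig, and clamping the window at the contig edges is an assumption made for the sake of the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of anvi'o): print the gene positions covered
# by a symmetric window of NUM_GENES genes on each side of an anchor gene.
# Arguments: anchor position, num-genes, total number of genes on the contig.
locus_window() {
    local anchor=$1 num_genes=$2 total=$3
    local start=$(( anchor - num_genes ))
    local end=$(( anchor + num_genes ))

    # Assumption for this sketch: truncate the window at the contig edges.
    if (( start < 1 )); then start=1; fi
    if (( end > total )); then end=$total; fi

    # Build a space-separated list of positions from start to end.
    local out="" i
    for (( i = start; i <= end; i++ )); do
        out+="${out:+ }$i"
    done
    echo "$out"
}

# Anchor is the 5th gene of a 10-gene contig, with num-genes set to 1.
locus_window 5 1 10   # prints "4 5 6"
```

With num-genes set to 1 the window holds three genes, which matches the count given above. An anchor near the start of a contig, such as `locus_window 1 2 10`, yields a shorter window under the clamping assumption.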
Defining the region of interest
There are four ways to indicate the desired anchor gene:
- Provide a search term in the functional annotations of all of your genes. (If you’re trying to find a gene with a vague function, you might want to use anvi-search-functions to find out which genes will show up first. Alternatively, you can use anvi-export-functions to look at a full list of the functional annotations in this database).
    anvi-export-locus -c contigs-db \
                      --num-genes 2 \
                      -o GLYCO_DIRECTORY \
                      -O Glyco \
                      --search-term "Glycosyltransferase involved in cell wall bisynthesis"
You also have the option to specify an annotation source with the flag
- Provide a specific gene caller ID.
    anvi-export-locus -c contigs-db \
                      --num-genes 2 \
                      -o output_directory \
                      -O GENE_1 \
                      --gene-caller-ids 1
- Provide a search term for the HMM source annotations. To do this, you must also specify an hmm-source. (You can use the flag --list-hmm-sources to list the available sources).
    anvi-export-locus -c contigs-db \
                      --num-genes 2 \
                      -o Ribosomal_S20p \
                      -O Ribosomal_S20p \
                      --use-hmm \
                      --hmm-source Bacteria_71 \
                      --search-term Ribosomal_S20p
- Run in flank-mode and provide two flanking genes that define the locus region.
    anvi-export-locus -c contigs-db \
                      --flank-mode \
                      -o locus_output \
                      -O gyclo_to_acyl \
                      --search-term "Glycosyltransferase involved in cell wall bisynthesis","Acyl carrier protein"
You can also remove partial hits, ignore reverse complement hits, or overwrite all files in a pre-existing output.