Microbial 'omics


DB

A DB-type anvi’o artifact. This artifact is typically generated, used, and/or exported by anvi’o (and not provided by the user).


Provided by

anvi-pan-genome

Required or used by

anvi-analyze-synteny anvi-compute-genome-similarity anvi-display-pan anvi-export-items-order anvi-export-misc-data anvi-export-state anvi-get-enriched-functions-per-pan-group anvi-get-sequences-for-gene-clusters anvi-import-collection anvi-import-items-order anvi-import-misc-data anvi-import-state anvi-meta-pan-genome anvi-show-collections-and-bins anvi-summarize

Description

An anvi’o database that contains key information associated with your gene clusters.

Advanced information for programmers

While it is possible to read and write a given anvi’o pan database through SQLite functions directly, one can also use anvi’o libraries to initiate a pan database to read from.
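Since a pan database can be read through SQLite directly, Python's standard `sqlite3` module is enough to take a first look inside one. The sketch below lists the tables in an SQLite file without assuming any anvi’o-specific table names. The helper `list_tables` is hypothetical (it is not part of anvi’o), and the demo builds a throwaway database with a made-up table so that the example runs on its own.

```python
import os
import pathlib
import sqlite3
import tempfile


def list_tables(db_path):
    """Return the sorted table names in an SQLite file (hypothetical helper, not an anvi'o function)."""
    # Open the file read-only so an existing database is never modified by accident.
    uri = pathlib.Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table'"
        ).fetchall()
    finally:
        conn.close()
    return sorted(name for (name,) in rows)


# Demo on a throwaway database; 'demo_table' is a made-up name, not an anvi'o table.
with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = os.path.join(tmp_dir, "DEMO.db")
    conn = sqlite3.connect(demo_path)
    conn.execute("CREATE TABLE demo_table (key TEXT, value TEXT)")
    conn.commit()
    conn.close()
    demo_tables = list_tables(demo_path)

print(demo_tables)
```

Pointing `list_tables` at an actual pan database file (for example the `PAN.db` used below) should show which tables are available to query. Writing to the file through raw SQL is best left to anvi’o programs.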

Initiate a pan database instance

import argparse

from anvio.dbops import PanSuperclass

args = argparse.Namespace(pan_db="PAN.db", genomes_storage="GENOMES.db")

pan_db = PanSuperclass(args)

Gene clusters dictionary

Once an instance of PanSuperclass is initiated, the following member function gives access to gene clusters:

pan_db.init_gene_clusters()
print(pan_db.gene_clusters)
{
  "GC_00000001": {
    "Genome_A": [19, 21],
    "Genome_B": [30, 32],
    "Genome_C": [122, 125],
    "Genome_D": [44, 42]
  },
  "GC_00000002": {
    "Genome_A": [123],
    "Genome_B": [176],
    "Genome_C": [175],
    "Genome_D": []
  },
  (...)
  "GC_00000036": {
    "Genome_A": [],
    "Genome_B": [24],
    "Genome_C": [],
    "Genome_D": []
  }
  (...)

Each item in this dictionary is a gene cluster. For each genome, it lists the anvi’o gene caller ids of the genes that genome contributes to the cluster; a genome with no genes in the cluster has an empty list.
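Because the structure is a plain nested dictionary, it can be summarized with ordinary Python. The following is a minimal sketch that counts, for each gene cluster, how many genes it holds and which genomes contribute to it. The function name `summarize_gene_clusters` is hypothetical, and the input is a hard-coded copy of a few entries from the output above rather than a live `pan_db.gene_clusters`.

```python
# A small stand-in for pan_db.gene_clusters, copied from the example output above.
gene_clusters = {
    "GC_00000001": {"Genome_A": [19, 21], "Genome_B": [30, 32],
                    "Genome_C": [122, 125], "Genome_D": [44, 42]},
    "GC_00000002": {"Genome_A": [123], "Genome_B": [176],
                    "Genome_C": [175], "Genome_D": []},
    "GC_00000036": {"Genome_A": [], "Genome_B": [24],
                    "Genome_C": [], "Genome_D": []},
}


def summarize_gene_clusters(gene_clusters):
    """Count genes and contributing genomes per gene cluster (hypothetical helper)."""
    summary = {}
    for gc_name, genomes in gene_clusters.items():
        # A genome contributes to the cluster only if its list of gene caller ids is non-empty.
        contributing = sorted(g for g, gene_ids in genomes.items() if gene_ids)
        summary[gc_name] = {
            "num_genes": sum(len(gene_ids) for gene_ids in genomes.values()),
            "num_genomes": len(contributing),
            "genomes": contributing,
        }
    return summary


summary = summarize_gene_clusters(gene_clusters)
for gc_name, info in summary.items():
    print(gc_name, info["num_genes"], info["num_genomes"], info["genomes"])
```

With a real database, passing `pan_db.gene_clusters` after `pan_db.init_gene_clusters()` should work the same way, as long as the dictionary has the shape shown above.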

Sequences in gene clusters

gene_clusters_of_interest = set(["GC_00000006", "GC_00000036"])
gene_cluster_sequences = pan_db.get_sequences_for_gene_clusters(gene_cluster_names=gene_clusters_of_interest)

print(gene_cluster_sequences)
{
  "GC_00000006": {
    "Genome_A": {
      23: "MDVKKGWSGNNLND--NNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
    },
    "Genome_B": {
      34: "MDVKKGWSGNNLND--NNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
    },
    "Genome_C": {
      23: "MDVKKGWSGNNLNDWVNNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
    },
    "Genome_D": {
      23: "MDVKKGWSGNNLNDWVNNAGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
    }
  },
  "GC_00000036": {
    "Genome_A": {},
    "Genome_B": {
      24: "MSKRHKFKQFMKKKNLNPMNNRKKVGIILFATSIGLFFLFAFRTTYIVATGKVAGVSLKEKTA"
    },
    "Genome_C": {},
    "Genome_D": {}
  }
}
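The nested dictionary above maps gene cluster, then genome, then gene caller id to a sequence, so it can be flattened into FASTA text with a few loops. The sketch below does that under two assumptions: the `-` characters are alignment gaps that can be stripped, and the defline format `gene_cluster|genome|gene_callers_id` is a made-up convention for illustration. The function `to_fasta` is hypothetical and not part of anvi’o; the input is a hard-coded subset of the output above.

```python
# A stand-in for the dictionary returned above, using two entries from the example output.
gene_cluster_sequences = {
    "GC_00000006": {
        "Genome_A": {
            23: "MDVKKGWSGNNLND--NNNGSFTLFNAYLPQAKLANEAMHQKIMEMSAKAPNATMSITGHSLGTMISIQAVANLPQAD"
        },
    },
    "GC_00000036": {
        "Genome_A": {},
        "Genome_B": {
            24: "MSKRHKFKQFMKKKNLNPMNNRKKVGIILFATSIGLFFLFAFRTTYIVATGKVAGVSLKEKTA"
        },
        "Genome_C": {},
        "Genome_D": {},
    },
}


def to_fasta(gene_cluster_sequences, strip_gaps=True):
    """Flatten the nested sequences dictionary into FASTA text (hypothetical helper)."""
    lines = []
    for gc_name in sorted(gene_cluster_sequences):
        genomes = gene_cluster_sequences[gc_name]
        for genome_name in sorted(genomes):
            # Genomes without genes in this cluster map to an empty dict and are skipped naturally.
            for gene_callers_id, sequence in sorted(genomes[genome_name].items()):
                if strip_gaps:
                    # Assumption: '-' marks an alignment gap, not a residue.
                    sequence = sequence.replace("-", "")
                lines.append(f">{gc_name}|{genome_name}|{gene_callers_id}")
                lines.append(sequence)
    return "".join(line + "\n" for line in lines)


fasta = to_fasta(gene_cluster_sequences)
print(fasta, end="")
```

Setting `strip_gaps=False` keeps the sequences exactly as they are stored in the dictionary, which is the safer choice if the gap characters matter downstream.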
