Team:Alberta/Project/Bioinformatics

From 2009.igem.org

Revision as of 23:44, 19 October 2009 by Emera (Talk | contribs)

University of Alberta - BioBytes

Why design a minimal genome?

Genomes are extremely complex. Producing a minimal genome allows for a better understanding of the function and interaction of cellular components. This understanding can lead to the optimization of synthetic processes and provides a well-characterized chassis for synthetic biology. Moreover, a simplified cell can be used to study cellular processes in a controlled, characterized genetic background. Using bioinformatics, our team has attempted to produce a new essential gene list for a minimal genome in the bacterium ''E. coli''.

Why We Need Bioinformatics

''E. coli'' has over 4,500 genes. The size and complexity of this genome make it almost impossible to process manually. An ''in silico'' approach allows this complex data to be collected, manipulated, and interpreted more easily. Bioinformatics has aided us in accomplishing the following:
  • Review lists of essential genes in the literature and existing databases and compile a preliminary essential gene list
  • Model the metabolic reactions and net growth rate of ''E. coli'' with given gene sets. This identified additional metabolic genes essential to a minimal genome.
  • Identify knockout combinations that could be tested in the wet lab to verify the accuracy of our metabolic model.
  • Select standardized promoters and terminators that would replace the natural promoters and terminators of essential genes.
  • Determine which promoter should be used with which gene, by analyzing expression level data.
  • Design primers to amplify all essential genes from genomic DNA.
These steps have all been completed and are described on the following pages.
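As a rough illustration of the primer-design step in the list above, the sketch below takes the first and last bases of a gene's coding sequence as a forward and reverse primer. It is not the team's actual primer-design program: the function names, the default primer length of 20 nt, and the example sequence are all assumptions made for illustration, and the melting temperature uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a crude estimate for short oligos.

```python
# Illustrative sketch only; names and defaults are hypothetical,
# not taken from the BioBytes primer-design code.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def wallace_tm(primer):
    """Estimate melting temperature in deg C with the Wallace rule."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def design_primers(gene_seq, length=20):
    """Return (forward, reverse) primers flanking the whole gene.

    The forward primer copies the first `length` bases of the coding
    strand; the reverse primer is the reverse complement of the last
    `length` bases, so both are written 5' to 3'.
    """
    gene_seq = gene_seq.upper()
    if len(gene_seq) < length:
        raise ValueError("gene is shorter than the requested primer length")
    forward = gene_seq[:length]
    reverse = reverse_complement(gene_seq[-length:])
    return forward, reverse


if __name__ == "__main__":
    # Made-up sequence, used only to exercise the functions.
    toy_gene = "ATGAAACGCGGCTGA"
    fwd, rev = design_primers(toy_gene, length=6)
    print(fwd, wallace_tm(fwd))
    print(rev, wallace_tm(rev))
```

A practical design would also need to check primer specificity, secondary structure, and any overhangs required by the assembly method, none of which this sketch attempts.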

Literature Review

To produce a preliminary essential gene list, we drew on various databases and papers. The essential gene sets reported in these sources were determined through a variety of experimental methods and have very limited overlap. Each gene was carefully considered, and a list of 332 genes was produced. Additionally, 29 genes were found to be essential for the RNAs. For more information on the literature reviews used, please click on the Gene Selection tab.
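The limited overlap between published lists can be made concrete with a small set comparison. The sketch below is a minimal example, assuming each source has already been reduced to a list of gene identifiers; the source names and gene names are placeholders, not data from the literature review, and the function name is hypothetical.

```python
# Illustrative sketch only; source and gene names are placeholders.
from collections import Counter


def summarize_overlap(gene_lists):
    """Compare essential gene lists from several sources.

    gene_lists maps a source name to an iterable of gene identifiers.
    Returns (union, core, counts): every gene seen in any source, the
    genes reported by all sources, and how many sources list each gene.
    """
    counts = Counter(
        gene for genes in gene_lists.values() for gene in set(genes)
    )
    union = set(counts)
    core = {gene for gene, n in counts.items() if n == len(gene_lists)}
    return union, core, counts


if __name__ == "__main__":
    example = {
        "source_A": ["gene1", "gene2", "gene3"],
        "source_B": ["gene2", "gene3", "gene4"],
        "source_C": ["gene3", "gene5"],
    }
    union, core, counts = summarize_overlap(example)
    print("genes in any list:", sorted(union))
    print("genes in every list:", sorted(core))
    # Genes supported by only one source are candidates for manual review.
    print("single-source genes:", sorted(g for g, n in counts.items() if n == 1))
```

Genes that appear in only some sources are the ones that need case-by-case judgment, which is why an automated tally like this can only support, not replace, the manual review.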

Metabolic Modeling

To verify that our essential gene list includes all genes necessary for metabolism, a computer model was used. The model was produced by the Palsson group at the University of California, San Diego, and was used in conjunction with the COBRA Toolbox developed by the Systems Biology Research Group. It provides a new ''in silico'' approach to identifying essential genes. The results of the computational analysis suggest that many more genes are required to produce a viable minimal genome: the model added 118 essential genes. Together with the genes from the literature review, 450 genes make up our essential gene list. To accomplish this, a series of programs was developed for use with the COBRA Toolbox. These programs allow for '''the determination of any organism's minimal metabolic network.''' The results of the metabolic modeling are currently being tested in the wet lab to demonstrate their accuracy. See the Gene Selection tab for the final BioBytes list of essential genes, and the Modeling section for more information on how modeling was done.
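The idea of ''in silico'' knockout screening can be illustrated with a much simpler stand-in than the actual model. The COBRA Toolbox runs in MATLAB and predicts growth by flux balance analysis, a linear-programming method; the Python sketch below does not do that. It only checks, on a made-up four-reaction network, whether the biomass precursors remain reachable from the nutrients once a gene's reactions are removed. All reaction, metabolite, and gene names are hypothetical.

```python
# Illustrative sketch only: a reachability check on a toy network,
# not flux balance analysis and not the model used by the team.
from itertools import combinations

# reaction name -> (substrates, products, genes all required for the reaction)
REACTIONS = {
    "R1": ({"nutrient"}, {"A"}, {"g1"}),
    "R2": ({"A"}, {"B"}, {"g2"}),
    "R3": ({"A"}, {"B"}, {"g3"}),  # alternative route to B
    "R4": ({"B"}, {"C"}, {"g4"}),
}
NUTRIENTS = {"nutrient"}
BIOMASS = {"B", "C"}  # metabolites the toy cell must be able to make


def producible(knocked_out):
    """Return every metabolite reachable from the nutrients."""
    available = set(NUTRIENTS)
    changed = True
    while changed:
        changed = False
        for substrates, products, genes in REACTIONS.values():
            if genes & knocked_out:
                continue  # reaction lost with any of its genes
            if substrates <= available and not products <= available:
                available |= products
                changed = True
    return available


def can_grow(knocked_out=frozenset()):
    """True if all biomass precursors can still be produced."""
    return BIOMASS <= producible(set(knocked_out))


ALL_GENES = sorted(set().union(*(genes for _, _, genes in REACTIONS.values())))

# Single knockouts: a gene is essential if its loss blocks growth.
essential = [g for g in ALL_GENES if not can_grow({g})]

# Double knockouts among non-essential genes: pairs that are lethal
# only in combination are candidates for wet-lab testing.
dispensable = [g for g in ALL_GENES if g not in essential]
lethal_pairs = [
    pair for pair in combinations(dispensable, 2) if not can_grow(set(pair))
]

if __name__ == "__main__":
    print("essential genes:", essential)
    print("lethal pairs:", lethal_pairs)
```

In this toy network the two genes behind the alternative routes to B are each dispensable alone but lethal together, which is the kind of knockout combination a wet-lab test could use to check a model's predictions. A flux-based model additionally predicts growth rates, which a reachability check cannot.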