Team:Alberta/Project/Bioinformatics
From 2009.igem.org
Why build a minimal genome?
Genomes are complex! Determining how far a genome can be simplified enriches our understanding of the function and interactions of cellular components. Simplified cells can be used as a well-characterized chassis for synthetic biology. Moreover, a simplified cell can be used to study cellular processes in a controlled, characterized genetic background. Finally, developing a minimal genome requires us to develop and optimize molecular methods of genome assembly. These methods can then be applied to other areas of high-throughput biology.
Why We Need Bioinformatics
The size and complexity of the genome make bioinformatics analysis essential. We used bioinformatics to accomplish the following:
Literature Review
Four essential gene lists from the literature were analyzed to construct a preliminary essential gene list.
These literature lists vary greatly in size and have minimal overlap. All analyses referred to genes by their Blattner numbers to standardize gene names. The largest number of genes shared by any two literature lists is 205, between Baba et al. 2006 and Gerdes et al. 2003. The varying levels of overlap between the four essential gene lists can be shown in a Venn diagram, in which the numbers indicate how many genes the lists have in common. Only 48 genes were present in all four lists.
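The overlap counts described above amount to set intersections over Blattner numbers. The following is a minimal sketch of that comparison, not the team's actual script: the list names and Blattner numbers are hypothetical toy data, and the real lists hold hundreds of genes each.

```python
from itertools import combinations

# Hypothetical toy data, NOT the published gene lists.
# Keys stand in for the four literature sources; values are Blattner numbers.
gene_lists = {
    "list_A": {"b0001", "b0002", "b0003", "b0004"},
    "list_B": {"b0002", "b0003", "b0004", "b0005"},
    "list_C": {"b0003", "b0005", "b0006"},
    "list_D": {"b0003", "b0007"},
}


def pairwise_overlaps(lists):
    """Return {(name1, name2): number of shared genes} for every pair of lists."""
    return {
        (a, b): len(lists[a] & lists[b])
        for a, b in combinations(sorted(lists), 2)
    }


def common_to_all(lists):
    """Return the genes that appear in every list."""
    return set.intersection(*lists.values())


overlaps = pairwise_overlaps(gene_lists)
best_pair = max(overlaps, key=overlaps.get)  # pair of lists sharing the most genes
core = common_to_all(gene_lists)             # genes present in all lists

print(best_pair, overlaps[best_pair])
print(sorted(core))
```

Using sets keyed by Blattner number means that a gene reported under different names in different papers is still counted once, which is the reason for standardizing identifiers before comparing.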
Constructing a Preliminary Essential Gene List from the Literature
Criteria for Gene Selection:
Genes for the following processes were included:
RNAs: The rrnC operon supplies the rRNAs and three of the tRNAs. This operon was selected because it includes the greatest number of tRNAs. To select the remaining tRNAs, all tRNAs listed as essential in PEC were first included. One tRNA was then selected for each anticodon that differed in one of the last two bases. At least one tRNA was included for each amino acid. The complete list of essential RNAs can be found here.
Statistics on BioBytes Gene List Based on Literature Review
Total genes in E. coli: 4762
Total protein-coding genes in BioBytes preliminary essentials list: 332
Total RNA genes in BioBytes preliminary essentials list: 29
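As a quick check on the scale of the reduction, the figures above can be combined as follows. This sketch assumes that the genome total of 4762 counts both protein-coding and RNA genes, which the source does not state; the variable names are illustrative only.

```python
# Figures taken from the statistics above.
total_genes = 4762    # total genes in E. coli (assumed to include RNA genes)
protein_coding = 332  # protein-coding genes in the preliminary essentials list
rna_genes = 29        # RNA genes in the preliminary essentials list

preliminary_total = protein_coding + rna_genes
percent_of_genome = 100 * preliminary_total / total_genes

print(f"{preliminary_total} genes, {percent_of_genome:.1f}% of the genome")
```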
Metabolic Modeling
Unfortunately, because the literature lists showed so little consensus, another way of producing our essential gene list was needed. Metabolic modeling was used to help identify essential genes. We used the E. coli MG1655 metabolic model built by the Palsson group at the University of California, San Diego for our modeling experiments, and the COBRA Toolbox, developed by the Systems Biology Research Group, to interface the model with MATLAB. A series of multiple gene deletions was performed on the model to determine the essential metabolic genes. These were compared with the genes selected from the literature, and each gene was either added to or removed from our master essential gene list based on its function and its involvement in metabolic pathways. Media conditions were also varied in the model, which allowed us to predict the conditions that should be applied to the minimal cell once it is developed. Once completed, our MATLAB model contributed 116 genes to our master gene list. Metabolic modeling allows computational analysis of entire genomes that would be impractical to carry out any other way. Combining these sources and methods has produced a unique gene list with the best possible chance of producing a minimal genome. The MATLAB protocols demonstrated in the modeling section can be used to identify any organism's essential genes, provided a model is available.
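The deletion experiments described above can be illustrated with a small, self-contained sketch. This is not the COBRA Toolbox workflow and does not perform flux balance analysis: it substitutes a simple reachability check ("can the biomass metabolite be produced from the medium?") on a hypothetical six-gene toy network. All reaction, metabolite, and gene names are invented for illustration.

```python
from itertools import combinations

# Hypothetical toy network, NOT the E. coli MG1655 model.
# Each reaction: (substrates, products, gene rule).
# A gene rule is a list of alternative gene sets (isozymes); every gene
# inside one set is required together (enzyme complex).
REACTIONS = {
    "glc_uptake": ({"glc_ext"}, {"glc"}, [{"g1"}]),
    "route_a": ({"glc"}, {"pyr"}, [{"g2"}]),
    "route_b": ({"glc"}, {"pyr"}, [{"g3"}]),
    "pyr_uptake": ({"pyr_ext"}, {"pyr"}, [{"g6"}]),
    "biomass": ({"pyr"}, {"biomass"}, [{"g4", "g5"}]),
}
GLUCOSE_MEDIUM = {"glc_ext"}
PYRUVATE_MEDIUM = {"pyr_ext"}
ALL_GENES = sorted({g for _, _, rule in REACTIONS.values() for gs in rule for g in gs})


def reaction_active(rule, deleted):
    """A reaction works if at least one alternative gene set is fully intact."""
    return any(not (gene_set & deleted) for gene_set in rule)


def can_grow(deleted, medium):
    """Reachability stand-in for growth: expand producible metabolites to a fixed point."""
    available = set(medium)
    changed = True
    while changed:
        changed = False
        for substrates, products, rule in REACTIONS.values():
            usable = reaction_active(rule, deleted) and substrates <= available
            if usable and not products <= available:
                available |= products
                changed = True
    return "biomass" in available


def essential_genes(medium):
    """Genes whose single deletion abolishes growth on the given medium."""
    return [g for g in ALL_GENES if not can_grow({g}, medium)]


def synthetic_lethal_pairs(medium):
    """Pairs of individually dispensable genes whose joint deletion abolishes growth."""
    single = set(essential_genes(medium))
    candidates = [g for g in ALL_GENES if g not in single]
    return [pair for pair in combinations(candidates, 2) if not can_grow(set(pair), medium)]


print(essential_genes(GLUCOSE_MEDIUM))
print(essential_genes(PYRUVATE_MEDIUM))
print(synthetic_lethal_pairs(GLUCOSE_MEDIUM))
```

The toy network shows two points made in the text. Multiple deletions reveal redundant genes (here the two alternative routes) that single deletions miss, and the essential set changes with the medium, which is why media conditions need to be fixed before a gene list is finalized.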