Team:MoWestern Davidson/project design




One of the advantages of synthetic biology is its multiple levels of modularity. Our basic design reflects this modularity in multiple levels of bacterial automation. At one end of the scale, humans put less effort into molecular construction and more into interpreting results; at the opposite end, humans put more effort into molecular construction and less into interpreting results.

All of the designs on this scale solve the SAT problem as well as the MAX SAT problem, because solving MAX SAT also solves SAT. The SAT problem asks whether all of the constraints can be satisfied at once; the MAX SAT problem asks how many of the constraints can be satisfied at once. We constructed reporter genes that express a specific phenotype when E. coli satisfies different constraints. To design them, we laid out a continuum that models different levels of bacterial automation for computation, selected the level of automation we wanted, and designed reporter genes for that level.

Engineering a reporter gene involved adding a 5 base pair insertion that shifts the reading frame during translation, so that a nonsense protein is produced. We then provided the cells with different suppressor tRNAs that may or may not suppress this insertion. If the insertion is suppressed, the reading frame is restored and the protein is expressed. The suppressor tRNAs represent the inputs of the SAT problem, and the 5 base pair insertions represent the clauses of the problem. In most designs the frameshift leader (FSL), which includes a start codon and the 5 base pair insertion, occurs at the beginning of the protein-coding sequence. At every point on the bacterial automation scale there are three inputs, one for each variable evaluated in the particular SAT problem.
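The relationship between SAT and MAX SAT can be illustrated with a short brute-force sketch in Python. This is not part of the project's wet-lab design or its original code; the data layout and function names are hypothetical, and the formula (a OR b) AND (b OR c') is the example that appears later on this page.

```python
from itertools import product

# A literal is (variable, is_positive); a clause is a list of literals joined by OR;
# a problem is a list of clauses joined by AND.
# Example formula from this page: (a OR b) AND (b OR c')
PROBLEM = [
    [("a", True), ("b", True)],
    [("b", True), ("c", False)],
]
VARIABLES = ["a", "b", "c"]  # three inputs, as in every design on the scale


def satisfied_clauses(problem, assignment):
    """Count the clauses that contain at least one true literal."""
    return sum(
        any(assignment[var] == positive for var, positive in clause)
        for clause in problem
    )


def max_sat(problem, variables):
    """Try every True/False assignment and keep the first one that satisfies the most clauses."""
    best_count, best_assignment = -1, None
    for values in product([False, True], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        count = satisfied_clauses(problem, assignment)
        if count > best_count:
            best_count, best_assignment = count, assignment
    return best_count, best_assignment


best_count, best_assignment = max_sat(PROBLEM, VARIABLES)
# The SAT answer falls out of the MAX SAT answer:
# the problem is satisfiable exactly when every clause can be satisfied at once.
is_satisfiable = best_count == len(PROBLEM)
print(best_count, best_assignment, is_satisfiable)
```

The last two lines show why answering MAX SAT also answers SAT: once the maximum number of satisfiable clauses is known, comparing it with the total number of clauses gives the yes/no answer.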

Automation Scale and FSL Design

1. Single Literal: In this approach, the reporter gene has an FSL of one literal, that is, one 5 base pair insertion. The bacteria are given a set of tRNA variables as inputs to evaluate that literal, and if the bacteria receive that literal's tRNA complement, they will express the gene. To evaluate the logical clauses and MAX SAT, we must look at plates to see whether the gene was expressed. The design of the FSL in the reporter genes in this example is relatively simple. Immediately after the start codon, ATG, we insert a single 5 base pair insertion. Then every possible 5mer is exposed to a common set of tRNAs.

2. Single Clause: In this approach, the reporter gene has an FSL of one logical clause (a OR b), consisting of two 5 base pair insertions. Bacteria are given a set of tRNA variables as inputs to evaluate that clause. If the clause is satisfied and suppression occurs, then the gene will be expressed. To compute the problem, we must then determine how many colonies expressed the gene and thus satisfied the clause. In this design, the FSL has two 5 base pair insertions. If one tRNA binds to either insertion, the reading frame is restored. However, these insertions are designed in such a manner that if one tRNA binds, another tRNA cannot bind to the second 5 base pair insertion.

3. Automated Population with One Reporter: This approach uses 4 different FSLs ranging from 1 to 4 clauses (a OR b) in the same reporter gene. These reporter genes are then divided into colonies representing the number of clauses (1, 2, 3, or 4) in their reporter gene. Bacteria are given a set of tRNA variables as inputs and evaluate the logical clauses and MAX SAT by reporting the gene expression. This setup allows us to determine the maximum number of clauses the tRNAs are able to satisfy. The FSL length and design vary according to the number of clauses inserted. However, the clauses are designed in the same way as in the single clause design above. These single clauses are then strung together so that, for the proper reading frame to be restored, exactly one 5 base pair insertion in each clause must be suppressed.

4. Automated Population with Different Reporters: This approach is basically the same as the previous one, except that each number of clauses has its own reporter gene. For example, the first clone tests for 1 clause satisfied and, if satisfied, will produce GFP. The second clone tests for 2 clauses satisfied and will produce RFP if satisfied. The third clone tests for 3 clauses satisfied and will produce Chloramphenicol resistance. The last clone tests for 4 clauses satisfied and will produce Tetracycline resistance.

5. Automated Population with Different Reporters in a Single Clone: In this approach, 4 different FSLs that test for at least 1, 2, 3, and 4 clauses satisfied are inserted at the beginning of different reporter genes used in a single clone. The first reporter gene tests for at least 1 clause satisfied, and if satisfied, GFP will be expressed. The second reporter gene tests for at least 2 clauses satisfied and will express RFP if satisfied. The third reporter gene tests for at least 3 clauses satisfied and will produce Chloramphenicol resistance. The last reporter gene tests for all 4 clauses satisfied and will produce Tetracycline resistance. Each FSL design has the same SAT problem, (a OR b) AND (b OR c’) for example, encoded in it. Bacteria are given a set of tRNA variables as inputs and evaluate the logical clauses and MAX SAT.
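The reporter scheme in designs 4 and 5 can be read as a set of thresholds on the number of satisfied clauses. The Python sketch below illustrates that reading under an idealized assumption: each reporter is treated as a perfect on/off switch, which real suppression efficiencies of a few percent would not guarantee. The names REPORTER_THRESHOLDS, expressed_reporters and max_sat_from_phenotype are hypothetical.

```python
# Minimum number of satisfied clauses each reporter tests for, as listed in the text.
REPORTER_THRESHOLDS = {
    "GFP": 1,
    "RFP": 2,
    "Chloramphenicol resistance": 3,
    "Tetracycline resistance": 4,
}


def expressed_reporters(num_satisfied):
    """Reporters expected to be expressed for a given number of satisfied clauses.

    Assumption: idealized, all-or-nothing reporter behavior.
    """
    return [
        name
        for name, threshold in REPORTER_THRESHOLDS.items()
        if num_satisfied >= threshold
    ]


def max_sat_from_phenotype(observed):
    """Read the MAX SAT answer back from the set of observed reporter phenotypes."""
    return max((REPORTER_THRESHOLDS[name] for name in observed), default=0)


print(expressed_reporters(2))
print(max_sat_from_phenotype(["GFP", "RFP"]))
```

Under this reading, the highest-threshold reporter that is observed gives the MAX SAT answer directly, which is what shifts the interpreting effort from the human to the bacteria.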

Predicting Effects of Suppressor tRNAs

One of the most critical factors in choosing which suppressor tRNAs to use is the prevalence in the normal E. coli genome of the mRNA codon that the tRNA targets. In larger SAT problems, we do not want to use suppressor tRNAs whose mRNA codons occur relatively frequently in the reading frame of coding sequences, especially if the coding sequence is a gene for an essential protein in the cell. We wanted to obtain bioinformatic data to predict how much effect introducing suppressor tRNAs may have on E. coli. We also chose to first construct a 1-SAT problem in E. coli, with a single 5-bp codon added before our reporter gene and its appropriate suppressor tRNA supplied. We tested the effect of each suppressor tRNA separately to obtain biological data on the effects of these tRNAs.

Anderson and colleagues examined the suppression efficiency of the suppressor tRNAs listed in the table below. They noted, however, that the suppressor tRNA for the codon AGGAU (*) wobbles on the last base, causing non-specific base-pairing. Because of this wobble effect, we chose not to use the AGGAU tRNA, to avoid inappropriate suppression. See Anderson et al. 2002 for the complete examination of the codons and tRNA anticodons.

Suppression Efficiency of 5-base Codons

Codon % Suppression
AGGAC 5.0%
AGGAU* 11.3%
CGGUC 4.5%
CUACC 7.4%
CUACU 11.2%
CUAGC 8.5%
CUAGU 12.0%
CCAAU 4.4%
CCACC 1.6%
CCACU 7.4%
CCAUC (9-bp anticodon) 8.0%
CCAUC (10-bp anticodon) 5.6%
CCCUC 3.8%

We created a Perl program to search all coding sequences of the E. coli genome (accessed using ASAP) and count the total number of times the suppressor tRNA codons occur in the correct reading frame. Considered together with the percent suppression of each tRNA, these counts give us a general indication of how likely our suppressor tRNAs are to disrupt the host cell's normal gene expression and cell functions.

A Perl script searched the coding sequences of E. coli K-12 MG1655 in the 3-nt reading frame for the number of occurrences of each 5-bp DNA codon. Each codon above is named by the mRNA sequence that is suppressed by the corresponding suppressor tRNA.
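The in-frame search described above could look roughly like the following Python sketch. The original program was written in Perl and is not reproduced here, so this is only one plausible reading of it: a 5-base codon is counted when it starts on a codon boundary of the coding sequence. The function names and the example sequences are hypothetical; the two codons are taken from the table.

```python
def count_in_frame(cds, codon5_dna):
    """Count occurrences of a 5-base DNA codon that start on a codon boundary.

    Assumption: cds starts at its start codon, so positions 0, 3, 6, ... are in frame.
    """
    return sum(
        1
        for i in range(0, len(cds) - 4, 3)
        if cds[i:i + 5] == codon5_dna
    )


def count_all(coding_sequences, codons_rna):
    """Total in-frame occurrences of each codon across all coding sequences."""
    totals = {}
    for rna in codons_rna:
        dna = rna.replace("U", "T")  # codons are listed as mRNA; the genome is DNA
        totals[rna] = sum(count_in_frame(cds, dna) for cds in coding_sequences)
    return totals


# Hypothetical toy sequences, not real E. coli genes.
toy_genome = ["ATGCTAGTAAGGACTAA", "ACTAGTAAA"]
totals = count_all(toy_genome, ["CUAGU", "AGGAC"])
print(totals)
```

In the second toy sequence CTAGT occurs only out of frame (starting at position 1), so it is not counted. This mirrors the reason for restricting the search to the reading frame: only in-frame occurrences are presented to the ribosome as codons.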

We also used a Perl script to find the distribution of codons throughout the coding sequences, that is, how many of the suppressor codons occurred in each individual sequence. We found, as shown below, that the majority of sequences in the genome had 0 or 1 occurrences of any codon. However, some coding sequences had up to 11 occurrences.

Distribution of the number of occurrences of any suppressor codon per individual coding sequence in the E. coli genome.
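A distribution of this kind can be tallied with a histogram over per-sequence counts. The Python sketch below is an illustration under the same assumption as the in-frame search (a codon counts only when it starts on a codon boundary); it is not the original Perl script, and the function names and toy sequences are hypothetical.

```python
from collections import Counter


def occurrences_per_cds(cds, codons_dna):
    """Number of in-frame 5-base windows in one coding sequence that match any codon."""
    return sum(
        1
        for i in range(0, len(cds) - 4, 3)
        if cds[i:i + 5] in codons_dna
    )


def occurrence_distribution(coding_sequences, codons_dna):
    """Map 'occurrences per sequence' to 'number of sequences with that many'."""
    return Counter(
        occurrences_per_cds(cds, codons_dna) for cds in coding_sequences
    )


# Hypothetical toy sequences, not real E. coli genes.
toy_genome = ["ATGAAATAA", "ATGCTAGTAAGGACTAA", "ATGCTAGTATAA"]
distribution = occurrence_distribution(toy_genome, {"CTAGT", "AGGAC"})
for n_occurrences in sorted(distribution):
    print(n_occurrences, distribution[n_occurrences])
```

Sequences at the high end of such a distribution would be the first candidates to check for essential genes.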

These data served as a precautionary reference for which suppressor tRNAs may be more likely to make E. coli sick, and which coding sequences these tRNAs may affect. For our project design, we decided to begin by testing a single-codon frameshift mutation leader upstream of a reporter gene. From this, we would learn which tRNAs cause enough suppression to produce a visible phenotype from our reporters, and which tRNAs to avoid because the cells become sick.

Designing the Frameshift Leaders and Suppressor tRNAs

We employed two separate approaches to design and construct our suppressor tRNAs and the frameshifted reporter proteins. The sequences of the novel five-nucleotide anticodon tRNAs were obtained from papers by Thomas J. Magliery, J. Christopher Anderson and Peter G. Schultz (Magliery et al. and Anderson et al.).

We used synthesized single-stranded oligonucleotides to assemble the twelve suppressor tRNAs with flanking BioBrick prefix and suffix sticky ends.

The reporter genes were designed and constructed by PCR-directed mutagenesis in order to add a synthetic frameshift leader (FSL) to the beginning of the reporter gene sequence.
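To illustrate how a frameshift leader at the start of a reporter gene interacts with a suppressor tRNA, the Python sketch below reads an mRNA codon by codon. It is a deliberately simplified model, not the team's design tool: suppression is treated as all-or-nothing (real efficiencies are a few percent, as in the table above), the function name read_codons is hypothetical, and the downstream sequence GGCAAAUAA is a made-up stand-in for a reporter. The 5-base codon CUAGU is taken from the table.

```python
def read_codons(mrna, suppressed_codons):
    """Split an mRNA into codons, starting from its first base.

    Assumption: a 5-base codon is read as one unit whenever a matching
    suppressor tRNA is present; otherwise ordinary triplets are read.
    """
    codons = []
    i = 0
    while i + 3 <= len(mrna):
        five = mrna[i:i + 5]
        if len(five) == 5 and five in suppressed_codons:
            codons.append(five)  # suppressor tRNA decodes 5 bases at once
            i += 5
        else:
            codons.append(mrna[i:i + 3])
            i += 3
    return codons


fsl = "AUG" + "CUAGU"       # start codon plus one 5-base insertion
reporter = "GGCAAAUAA"      # hypothetical downstream codons: GGC AAA UAA
mrna = fsl + reporter

with_trna = read_codons(mrna, {"CUAGU"})
without_trna = read_codons(mrna, set())
print(with_trna)     # downstream codons are read in the intended frame
print(without_trna)  # downstream codons are read shifted by two bases
```

Because 5 is two more than a multiple of 3, an unsuppressed insertion shifts every downstream codon by two bases, while reading the insertion as a single codon puts the ribosome back in the reporter's frame.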