Team:MoWestern Davidson/project

From 2009.igem.org

(Difference between revisions)

Latest revision as of 05:03, 21 October 2009

Understanding SAT: a Lock and Door Analogy

A member of the NP-complete family (the hardest problems in NP, the class of non-deterministic polynomial time problems), the satisfiability (SAT) problem can be understood through an analogy of locks and keys. Imagine a hallway of 4 doors, each with one lock.

The lock on each door can be opened by either of two different keys, represented by the upper case and lower case letters on each door - so the first lock can be opened by either the "G" (first green key) or the "B" (first blue key). As long as you have at least one of the keys for a particular lock, you can open the door. A person with three sets of keys, blue (B, b), green (G, g), and red (R, r), wants to find the combination of one blue key, one green key, and one red key that will open all four of the doors. The graphic below illustrates three of the person's eight possible three-key combinations, and out of this set, only the first key ring can open all four doors. The first key ring is said to "satisfy" the door problem.

The Real Math

Descriptions in parentheses refer to the "lock and door" analogy.

Literal=(key hole)= half of a variable; example (a)

Variable=(color)= both literals from a single variable; example (a,a')

Clause=(door)=a group of literals joined by OR; example (a OR b)

SAT Problem=(hallway of doors)= clauses joined together by AND; example (a OR b) AND (a' OR b')

SAT

A SAT problem is made by joining a number of clauses (doors) with the logical operator "AND", which means that all of the clauses (doors) must be satisfied (opened) to solve the problem. Each clause contains literals (key holes) joined by the logical operator "OR", which means that at least one literal (key hole) must be satisfied for the clause (door) to be satisfied (opened). The literals (key holes) are drawn from the variables (the different key colors) being used. Translating the key holder's problem from above into mathematical terms yields:

(a OR b) AND (a OR b') AND (a OR c') AND (a' OR c)

Here the variable (a, a') replaces the blue keys, (b, b') replaces the green keys, and (c, c') replaces the red keys.

In mathematical terms, this is a 3-variable (3 colors), 2-SAT (2 key holes per door), 4-clause (4 doors) SAT problem. Keep in mind that you can change any of these parameters: add more variables (colors) or more clauses (doors), or make the problem 3-SAT, 4-SAT, etc. (3, 4, etc. key holes per door).
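
The four-door problem above is small enough to check by hand, but a brute-force search over all eight key rings makes the structure explicit. The Python sketch below is illustrative only: the literal encoding (a variable name paired with the truth value it wants) and the function names are assumptions made for this example, not part of the project's method.

```python
from itertools import product

# Clauses from the text: (a OR b) AND (a OR b') AND (a OR c') AND (a' OR c).
# Each literal is (variable, wanted_value); a primed literal wants False.
# This encoding is an assumption chosen for the example.
CLAUSES = [
    [("a", True), ("b", True)],
    [("a", True), ("b", False)],
    [("a", True), ("c", False)],
    [("a", False), ("c", True)],
]
VARIABLES = ["a", "b", "c"]


def clause_satisfied(clause, assignment):
    """A clause (door) opens if at least one of its literals matches."""
    return any(assignment[var] == wanted for var, wanted in clause)


def satisfying_assignments(clauses, variables):
    """Try every truth assignment (key ring) and keep those that open all doors."""
    solutions = []
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        if all(clause_satisfied(clause, assignment) for clause in clauses):
            solutions.append(assignment)
    return solutions


if __name__ == "__main__":
    for solution in satisfying_assignments(CLAUSES, VARIABLES):
        print(solution)
```

For this formula the search finds two satisfying assignments, both with a and c true; b may take either value. The search visits 2^n assignments for n variables, which is why this approach stops being practical as problems grow.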

MAX SAT

A related problem, MAX SAT, uses the same terminology, but instead of requiring that all of the clauses (doors) be satisfied (opened) in an all-or-none fashion, MAX SAT asks for the greatest number of clauses (doors) that can be satisfied (opened). We studied a modified MAX SAT problem because we were interested in all patterns of satisfaction: we recorded when one clause (door) was satisfied, when 2 clauses were satisfied, 3 clauses, etc., not just the maximum number satisfied (opened).

For this project we examined methods for solving both our modified MAX SAT problem and regular SAT.
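
The difference between SAT, MAX SAT, and a tally of all satisfaction patterns can be sketched in a few lines of Python. This is a hedged illustration using the four-clause formula from the SAT section; the encoding and the function names are assumptions for this example and do not describe the team's biological procedure.

```python
from collections import Counter
from itertools import product

# (a OR b) AND (a OR b') AND (a OR c') AND (a' OR c), encoded as
# (variable, wanted_value) pairs. The encoding is an assumption for this sketch.
CLAUSES = [
    [("a", True), ("b", True)],
    [("a", True), ("b", False)],
    [("a", True), ("c", False)],
    [("a", False), ("c", True)],
]
VARIABLES = ["a", "b", "c"]


def satisfied_count(clauses, assignment):
    """Number of clauses (doors) this assignment (key ring) satisfies."""
    return sum(
        any(assignment[var] == wanted for var, wanted in clause)
        for clause in clauses
    )


def tally_patterns(clauses, variables):
    """Map 'number of clauses satisfied' to 'number of assignments doing so'."""
    tally = Counter()
    for values in product([True, False], repeat=len(variables)):
        assignment = dict(zip(variables, values))
        tally[satisfied_count(clauses, assignment)] += 1
    return tally


if __name__ == "__main__":
    tally = tally_patterns(CLAUSES, VARIABLES)
    print("MAX SAT answer:", max(tally))  # greatest number of clauses satisfied
    for count in sorted(tally, reverse=True):
        print(count, "clauses satisfied by", tally[count], "assignments")
```

For this formula, 2 assignments satisfy all four clauses, 4 satisfy three, and 2 satisfy two. Plain SAT only asks whether the top row is non-empty, MAX SAT asks for the largest count, and the modified problem keeps the whole tally.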

The Math in E. coli

Our team translated the MAX SAT NP-complete problem into a modified biological process. In cells, transcription copies DNA into mRNA, and the mRNA is then translated into a protein that carries out a function. Just as an English sentence requires specific groupings of letters to form words, mRNA must be read in 3-nucleotide groupings called codons for protein production to proceed successfully. Transfer RNAs (tRNAs) are RNA molecules with 3-nucleotide anticodon loops that match specific codon sequences and supply amino acids according to those sequences. Amino acids are required to construct all proteins. If nucleotides are inserted into or deleted from an mRNA, the reading frame is shifted, causing a frameshift mutation. Frameshift mutations cause tRNAs to supply the wrong amino acids, so the resulting protein has a different function from the original.

With this biological background, the connection can be made between the “door and key” analogy and the biological interpretation. Our “doors” are engineered reporter gene DNA sequences that include the 5-nucleotide frameshift mutation “keyhole”, resulting in our frameshift leaders (FSL) (see figure below). The “keys” that may or may not open the door are engineered suppressor tRNAs with 5-nucleotide anticodons. Opening the door, thus solving the MAX SAT problem, would allow the cell to communicate its outcome through the expression of the reporter gene (fluorescence/antibiotic resistance). The idea of using suppressor tRNAs to bypass designed mutations to solve a SAT problem is an extension of earlier work using suppressor logic from papers by Thomas J. Magliery, J. Christopher Anderson and Peter G. Schultz (Magliery et al. & Anderson et al.). These two publications used 2-, 4-, 5-, and 6-nucleotide frameshift mutations. Their research demonstrated that 4- and 5-base codon suppression was biologically feasible in their search to expand the genetic code. The difference between the original sequence and the modified 5mer sequence is the addition of 2 nucleotides and the insertion of the amino acid serine.
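
The reading-frame argument can be illustrated with a toy model in Python. This is a simplified sketch, not a model of real ribosome behaviour: the mRNA sequence and the 5mer `CUAGU` are made up for the example, and the function simply reads a 5-nucleotide group as one codon whenever a matching suppressor tRNA is assumed to be present.

```python
def read_codons(mrna, suppressible_5mers=frozenset()):
    """Split an mRNA into the groups a ribosome would read, left to right.

    A 5-nucleotide group is read as a single codon when it is listed in
    suppressible_5mers (a stand-in for "the matching suppressor tRNA is
    present"); otherwise ordinary 3-nucleotide codons are read.
    Trailing nucleotides that do not fill a codon are dropped.
    """
    groups = []
    position = 0
    while position + 3 <= len(mrna):
        five = mrna[position:position + 5]
        if five in suppressible_5mers:
            groups.append(five)
            position += 5
        else:
            groups.append(mrna[position:position + 3])
            position += 3
    return groups


if __name__ == "__main__":
    # Hypothetical message: start codon, a 5mer "keyhole", then two codons.
    mrna = "AUG" + "CUAGU" + "GAAUUC"

    # No suppressor tRNA: the two extra nucleotides shift the frame,
    # so the downstream codons are misread.
    print(read_codons(mrna))             # ['AUG', 'CUA', 'GUG', 'AAU']

    # Matching suppressor tRNA present: the 5mer is read as one codon
    # and the downstream codons return to the intended frame.
    print(read_codons(mrna, {"CUAGU"}))  # ['AUG', 'CUAGU', 'GAA', 'UUC']
```

In the second call the downstream codons `GAA` and `UUC` are read as intended, which corresponds to the "door" opening and the reporter being translated in frame.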

Diagram of a 2-SAT problem: the mRNA encoding RFP contains two clauses, each with two 5mer frameshift mutations.

Only one of the two 5mers can be suppressed and permit translation of RFP.

Bacterial "locks" will open (continue translating in the correct reading frame) if the correct suppressor tRNA (key) binds to either of the 5mers (keyholes) within them

These tRNA "keys" base pair with a corresponding 5 nucleotide "keyhole" as described in the "lock" shown on the left

History of SAT

In 1971, Stephen Cook proved that SAT is NP-complete, making it the first problem shown to be so, and with this proof came the concept of NP-completeness. The Soviet computer scientist Leonid Levin independently arrived at a similar proof at around the same time. In 1972, Richard Karp published a list of 21 NP-complete problems, all linked to SAT. (3, 4)

Significance of SAT

Computer aided design applications
Computer circuit design
- Logic synthesis
- Automatic test pattern generation
Internet search algorithms

@@ Line 1: / Line 1: @@
 {{Template:MoWestern_Davidson2009}}
+{| align="right"
+  | __TOC__
+  |}
+<font face="verdana">
+== Understanding SAT: a Lock and Door Analogy ==
-=Overview=
+A member of the NP-complete family (the hardest problems in NP, the class of non-deterministic polynomial time problems), the satisfiability (SAT) problem can be understood through an analogy of locks and keys. Imagine a hallway of 4 doors, each with one lock.
-The Satisfiability (Sat) Problem:
-A member of the NP Complete family (the most challenging of the non-deterministic polynomial time problems, can be compared to an analogy of locks and keys. Imagine a set of doors, each with one lock.
-[[Image:Doors.png|doors|300x300px]]
+[[Image:Doors.png|Figure.1 Hallway of doors (Clauses)|300x300px|left]]
-each lock on each door can be opened by two different keys, (represented by the letters on each door- so the first lock can be opened by either the "G" (first green key) or the "B"(first <span style="color:#0000FF"> blue </span>  key).) As long as you have at least one of the keys you can open the door.
-A janitor then, with three sets of keys [[Image:keysagain.png|Figure1.keys|150x150px]]wants to find the combination of one <span style="color:#0000FF"> blue </span>  key, one <span style="color:#	32 CD 32"> green </span>  key, and one Red key that will open all four of the doors.  The graphic below illustrates 3 of the janitor's possible 8 key combinations, and out of this set, only the first key ring can open all four doors. This key ring is said to "satisfy" the door problem.
+The lock on each door can be opened by either of two different keys, represented by the upper case and lower case letters on each door - so the first lock can be opened by either the "G" (first <span style="color:#228B22"> green </span> key) or the "B" (first <span style="color:#0000FF"> blue </span>  key). As long as you have at least one of the keys for a particular lock, you can open the door.
+A person with three sets of keys, <span style="color:#0000FF"> blue </span>(B, b), <span style="color:#228B22"> green </span> (G, g), and <span style="color:#FF0000"> red </span> (R, r), wants to find the combination of one <span style="color:#0000FF"> blue </span>  key, one <span style="color:#228B22"> green </span>  key, and one <span style="color:#FF0000"> red </span> key that will open all four of the doors.  The graphic below illustrates three of the person's eight possible three-key combinations, and out of this set, only the first key ring can open all four doors. The first key ring is said to "satisfy" the door problem.
 <center>
-[[Image:picture2doorsno.png|caption|700x700px]]
+[[Image:picture2doorsno.png|500x500px|Testing different keys]]
 </center>
-Translating into Math
+== The Real Math ==
+<Font Size="1">
+'''Descriptions in parentheses refer to the "lock and door" analogy.'''
+</Font>
-Each sat problem is made by joining (logical operator "'''AND'''") a number of clauses (doors), which each contain a number of literals(key holes)joined to each other by the logical operator "'''OR'''" (here they are joined by being in the same lock.)  so the same problem the janitor faced would be translated like this
+'''Literal'''=(key hole)= half of a variable; example (a)
-'''(a OR b) AND (a OR b') AND (a OR c') AND (a'or c)'''
+'''Variable'''=(color)= both literals from a single variable; example (a,a')
-where (a,a') replaces the <span style="color:#0000FF"> blue </span>  set of keys, (b,b') replaces green set of keys and (c,c') replaces the red set of keys.
+'''Clause'''=(door)=a group of literals joined by OR; example (a OR b)
+'''SAT Problem'''=(hallway of doors)= clauses joined together by AND; example (a OR b) AND (a' OR b')
+<u>SAT</u>
+A SAT problem is made by joining a number of clauses (doors) with the logical operator "'''AND'''", which means that ''all'' of the clauses (doors) must be satisfied (opened) to solve the problem. Each clause contains literals (key holes) joined by the logical operator "'''OR'''", which means that at least one literal (key hole) must be satisfied for the clause (door) to be satisfied (opened). The literals (key holes) are drawn from the variables (the different key colors) being used. Translating the key holder's problem from above into mathematical terms yields:
+<blockquote>  '''(a OR b) AND (a OR b') AND (a OR c') AND (a' OR c)'''</blockquote>
+Here the variable (a, a') replaces the <span style="color:#0000FF"> blue </span> keys, (b, b') replaces the <span style="color:#228B22"> green </span> keys, and (c, c') replaces the <span style="color:#FF0000"> red </span> keys.
+In mathematical terms, this is a 3-variable (3 colors), 2-SAT (2 key holes per door), 4-clause (4 doors) SAT problem.  Keep in mind that you can change any of these parameters: add more variables (colors) or more clauses (doors), or make the problem 3-SAT, 4-SAT, etc. (3, 4, etc. key holes per door).
-Our team translated this np-complete problem into a modified biological process. In cells, the process of transcription codes DNA into RNA, and then the RNA is translated into a protein that is expressed once completed. Just as an English sentence requires a specific grouping of letters to form words, DNA must be read in 3 nucleotide groupings called codons to continue the process successfully. Transfer RNAs (tRNAs) are RNA molecules with 3 nucleotide anticodon loops that match specifically to codon sequences and supply the amino acids according to those sequences. These amino acids are required to construct the protein. If nucleotides are inserted or deleted, the reading frame is shifted, causing a frameshift mutation. That frameshift mutation causes the tRNA to supply the wrong amino acid in the peptide chain that then gives an unanticipated protein, or no protein.
-With this biological background, the connection can be made between the “door and key” analogy and the biological interpretation. Our “doors” are engineered reporter gene DNA sequences including the 5 nucleotide frameshift mutation “keyhole”, resulting in our frameshift leaders (FSL). The “keys” that may or may not open the door are engineered 5 nucleotide anticodon suppressor tRNAs. Opening the door, thus solving the MAX SAT problem, would allow the cell to convey this outcome in the expression of the reporter gene (fluorescence/antibiotic resistance).  The idea of using suppressor tRNAs to bypass designed mutation to solve a SAT problem is an extension using suppressor logic from papers by Thomas J. Magliery, J. Christopher Anderson and Peter G. Schultz {[https://static.igem.org/mediawiki/2009/b/bc/Magliery%2CAnderson%2CSchultz2001JMB.pdf| Magliery et. al.] & [https://static.igem.org/mediawiki/2009/9/90/Anderson%2CMagliery%2CSchultz2002ChemBiol.pdf| Anderson et. al.]}. This team worked with 2,3,4,5, and 6 nucleotide frameshift mutations. Their research found that 4 and 5 base codon suppression was most efficient in their mission of expanding the genetic code. The difference in the original amino acid and the modified 5mer amino acid is the addition of the 2 nucleotides.
+<u>MAX SAT</u>
+A related problem, MAX SAT, uses the same terminology, but instead of requiring that all of the clauses (doors) be satisfied (opened) in an all-or-none fashion, MAX SAT asks for the greatest number of clauses (doors) that can be satisfied (opened).  We studied a modified MAX SAT problem because we were interested in all patterns of satisfaction: we recorded when one clause (door) was satisfied, when 2 clauses were satisfied, 3 clauses, etc., not just the maximum number satisfied (opened).
+For this project we examined methods for solving both our modified MAX SAT problem and regular SAT.
+== The Math in ''E. coli'' ==
+Our team translated the MAX SAT NP-complete problem into a modified biological process. In cells, transcription copies DNA into mRNA, and the mRNA is then translated into a protein that carries out a function. Just as an English sentence requires specific groupings of letters to form words, mRNA must be read in 3-nucleotide groupings called codons for protein production to proceed successfully. Transfer RNAs (tRNAs) are RNA molecules with 3-nucleotide anticodon loops that match specific codon sequences and supply amino acids according to those sequences. Amino acids are required to construct all proteins. If nucleotides are inserted into or deleted from an mRNA, the reading frame is shifted, causing a frameshift mutation. Frameshift mutations cause tRNAs to supply the wrong amino acids, so the resulting protein has a different function from the original.
+With this biological background, the connection can be made between the “door and key” analogy and the biological interpretation. Our “doors” are engineered reporter gene DNA sequences that include the 5-nucleotide frameshift mutation “keyhole”, resulting in our frameshift leaders (FSL) (see figure below).  The “keys” that may or may not open the door are engineered suppressor tRNAs with 5-nucleotide anticodons. Opening the door, thus solving the MAX SAT problem, would allow the cell to communicate its outcome through the expression of the reporter gene (fluorescence/antibiotic resistance).  The idea of using suppressor tRNAs to bypass designed mutations to solve a SAT problem is an extension of earlier work using suppressor logic from papers by Thomas J. Magliery, J. Christopher Anderson and Peter G. Schultz ([https://static.igem.org/mediawiki/2009/b/bc/Magliery%2CAnderson%2CSchultz2001JMB.pdf Magliery et al.] & [https://static.igem.org/mediawiki/2009/9/90/Anderson%2CMagliery%2CSchultz2002ChemBiol.pdf Anderson et al.]). These two publications used 2-, 4-, 5-, and 6-nucleotide frameshift mutations. Their research demonstrated that 4- and 5-base codon suppression was biologically feasible in their search to expand the genetic code. The difference between the original sequence and the modified 5mer sequence is the addition of 2 nucleotides and the insertion of the amino acid serine. <br>
+<center>'''Diagram of a 2-SAT problem: the mRNA encoding RFP contains two clauses, each with two 5mer frameshift mutations.'''<br>
+'''Only one of the two 5mers can be suppressed and permit translation of RFP.'''
+[[Image:Biology_doors_and_locks.png|center|Doors in mRNA|750px]]
+<gallery widths="320px" heights="210px" perrow="5">
+Image:lockdef.png|Bacterial "locks" will open (continue translating in the correct reading frame) if the correct suppressor tRNA (key) binds to either of the 5mers (keyholes) within them
+Image:keys1.png|These tRNA "keys" base pair with a corresponding 5 nucleotide "keyhole" as described in the "lock" shown on the left
+</gallery>
+</center>
+==History of SAT==
+In 1971, Stephen Cook proved that SAT is NP-complete, making it the first problem shown to be so, and with this proof came the concept of NP-completeness. The Soviet computer scientist Leonid Levin independently arrived at a similar proof at around the same time. In 1972, Richard Karp published a list of 21 NP-complete problems, all linked to SAT. (3, 4)
+== Significance of SAT ==
+[[Image:Circuit.png|left|100px]]
+*Computer aided design applications
+*Computer circuit design
+**Logic synthesis
+**Automatic test pattern generation
+*Internet search algorithms
+<br><br><br>
 {{Template:MoWestern_Davidson2009_end}}