Revision as of 16:44, 18 October 2009 by Apmasell (Talk | contribs)





The mathematical modeling component of this year's project consisted of a computational simulation of DNA recombination as mediated by the ΦC31 integrase enzyme. The necessity for this simulation arose directly from challenges faced by the design team in its attempts to create a recursively repeatable technique for inserting sequences of interest onto chromosomes. Specifically, it was noted that manually examining the possible results of interactions between DNA strands quickly becomes infeasible due to the number of potential reaction pathways.

The first stage of the modeling project was therefore to formally codify the reaction rules employed by the design team with the aim of applying computational power to the problem. The predominant challenge faced at this stage was to abstract the concept of DNA strands into a computationally workable form along with developing mathematically rigorous definitions of the behaviours of reaction sites.

Formally, the grand object of the modeling project was the determination of a finite deterministic sequence of att sites and their enclosed operators that would allow one to predictably insert into a chromosome all of the desired sequences.

The form that the solution could take, we postulated would be a sequence of two or three plasmids such that each would contain at least one matching set of att sites with the addition of several incomplete att sites.

There were two general approaches used in modelling. Software development toward a top-down (inductive, brute force) solver has finished. Characterization of the algorithm underpinning the solver however revealed that the problem is NP-hard at least, and NP-complete at worse. An auxiliary approach was attempted whereupon we tried to map the sequence problem onto a mathematical problem with known solutions.


In order to run the solver, we had to make a few assumptions. First, as we do not know what combination of sequences with attN sites is part of the solution, we assume that any product in our search space is fair game for the next generation of reactions. We further assume that any plasmid with valid attN sites and complementary operators is capable of self reacting and also of reacting with any other plasmid in the history of the modelled cell. Second, because of the exponential behaviour of the search space, we assume that the smallest solution that exists can be found within the search space generated after reacting 10E7 plasmids. This second assumption is made in order to have sane parameters for termination.


An ancillary branch morphed out of the necessity to tend the exponential behaviour of the problem. There may exist some math that inherently facilitates the modeling and solving of this problem. We explored maths that mainly dealt with topology (knot theory) and functional reasoning (lambda theory, combinatory calculus) but finally could not identify a good candidate as a scaffold to our solution.