Team:BIOTEC Dresden/Modeling v2

From 2009.igem.org

Revision as of 20:19, 21 October 2009

Looping probability simulation

The interpolation formula [Stewart et al. 1999] gives the probability that two sites on a DNA molecule meet by looping.

Looping probability form.png

Fig. 1. Normalized looping probability of DNA vs. number of base pairs between the sites of interest

The key parameters of the model are the distance between the sites (in base pairs), the persistence length of DNA (in nm) and the proximity of the two DNA ends required for looping to occur (in nm). The probability is expressed as the local molar concentration of one site with respect to the other. Fig. 1 shows the results of a simulation based on this formula. The persistence length is taken to be 50 nm [Hagerman, 1988]; the required proximity of the two sites is 10 nm (black line), compared with 0 nm (red line). The probability is normalized to the maximum value (0.1194 × 10^-6 M), reached at ~400 bp. Both curves are consistent with experiment and show no looping below the persistence length (~150 bp). Our simulation (performed in MATLAB) indicates that the optimal distance for looping is ~400 bp and that the probability of the two sites meeting decreases as the distance between them grows beyond this optimum. As the figure shows, allowing a non-zero distance between the two sites when the loop forms gives a higher looping probability.
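As a rough check on the numbers quoted above, the following sketch converts the 50 nm persistence length into base pairs and illustrates the normalization to the curve maximum. It assumes a rise of 0.34 nm per base pair (the usual B-DNA value, which the text does not state). The function names and the sample curve values are hypothetical and are not output of the MATLAB simulation.

```python
NM_PER_BP = 0.34  # assumed B-DNA rise per base pair (not given in the text)


def nm_to_bp(length_nm):
    """Convert a contour length in nm to a number of base pairs."""
    return length_nm / NM_PER_BP


def normalize_to_max(values):
    """Scale a curve so that its largest value becomes 1."""
    peak = max(values)
    return [v / peak for v in values]


# Persistence length of 50 nm expressed in base pairs (about 150 bp).
persistence_bp = nm_to_bp(50.0)
print(round(persistence_bp))

# Made-up local concentrations in mol/L, only to show the normalization step.
sample_curve_molar = [0.0, 0.05e-6, 0.1194e-6, 0.08e-6]
normalized = normalize_to_max(sample_curve_molar)
print(normalized)
```

With this assumed rise, 50 nm corresponds to roughly 147 bp, in line with the ~150 bp quoted for the persistence length.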


Theory behind FLP-FRT recombination

In genetics, FLP-FRT recombination is a site-directed recombination technology used to manipulate an organism's DNA under controlled conditions in vivo. It is analogous to Cre-Lox recombination. It involves the recombination of sequences between short Flippase Recognition Target (FRT) sites by the Flippase recombination enzyme (FLP or Flp) derived from the 2µ plasmid of the baker's yeast Saccharomyces cerevisiae.

The 34-bp FRT site sequence is 5'-GAAGTTCCTATTCtctagaaaGTATAGGAACTTC-3'. Flippase (Flp) binds to the 13-bp arm 5'-GAAGTTCCTATTC-3' and to the reverse complement of 5'-GTATAGGAACTTC-3' (5'-GAAGTTCCTATAC-3'). The FRT site is cleaved just before 5'-tctagaaa-3', the 8-bp asymmetric core region, on the top strand and just after this sequence on the bottom strand.[1]
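The anatomy of the FRT site described above can be checked with a few lines of Python. This is only an illustrative sketch: the variable and function names are hypothetical, and the slicing positions simply follow the 13 + 8 + 13 bp layout given in the text.

```python
FRT = "GAAGTTCCTATTCtctagaaaGTATAGGAACTTC"

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence in upper case."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


left_arm = FRT[:13]     # first 13-bp Flp binding element
core = FRT[13:21]       # 8-bp core (lower case in the sequence)
right_arm = FRT[21:]    # second 13-bp element, in inverted orientation

right_arm_rc = reverse_complement(right_arm)

# Number of positions at which the two binding elements differ.
mismatches = sum(a != b for a, b in zip(left_arm, right_arm_rc))

# The core is asymmetric if it differs from its own reverse complement.
core_is_asymmetric = reverse_complement(core) != core.upper()

print(len(FRT), left_arm, core, right_arm_rc, mismatches, core_is_asymmetric)
```

The two 13-bp elements are inverted repeats that differ at a single position, and the core is not its own reverse complement, which is what gives the site its orientation.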

Several variant FRT sites exist. Recombination can occur between two identical FRT sites but generally not between non-identical FRT sites.

Many available constructs include the sequence 5'-GAAGTTCCTATTCC-3' immediately upstream of the FRT site (resulting in 5'-GAAGTTCCTATTCCGAAGTTCCTATTCtctagaaaGTATAGGAACTTC-3'), but this sequence is dispensable for recombination.

Because the recombination activity can be targeted to only one target organ, or a low level of recombination activity can be used to consistently alter the DNA of only a subset of cells, FLP-FRT can be used to construct genetic mosaics in multicellular organisms. Using this technology, the loss or alteration of a gene can be studied in one target organ of interest, even if experimental animals could not survive the loss of the gene in other organs.

The effect of altering a gene can also be studied over time, by using an inducible promoter to trigger the recombination activity late in development - this prevents the alteration from affecting overall development of an organ, and allows single cells lacking the gene to be compared to normal neighboring cells in the same environment.


A very similar study using eukaryotic DNA: http://www.ncbi.nlm.nih.gov/pubmed/10581237

Kinetic analysis of Flp activity (DNA binding and recombination models): http://www.ncbi.nlm.nih.gov/pubmed/9813124

Thermostability of Flp recombinase (We are using the F70L variant because it is sufficiently slow to give a time course): http://www.ncbi.nlm.nih.gov/pubmed/8932381

A202_01.gif

pCAGGS-FLPe-IRESpuro expression vector.


Unlike transcriptional regulation, this method gives true all-or-none induction due to covalent modification of DNA by Flp recombinase. Determining the transfer curve of inter-FRT site distance versus average recombination time allows the onset of gene expression to be predicted. We then apply this Flp reporter system as a powerful PoPS measurement device.


Recombination pathway of FLP: design of mathematical model

The recombination pathway upon which the mathematical model for recombination was based is shown in Figure 1. The model describes an excision reaction on a linear DNA substrate. The steps of DNA binding are well characterised for FLP. The enzyme binds the inverted repeat target site first as a monomer, which is then joined by a second monomer to form a dimer (Andrews et al., 1987; Hoess et al., 1984; Mack et al., 1992). In the model, we have assumed that the protein is monomeric in solution, based on the behaviour of FLP and Cre in sucrose density gradients under buffer conditions similar to those used in this study (Abremski & Hoess, 1984; Qian et al., 1990).

Kinetic FLP 1.png


Figure 1. Steps in the FLP and Cre excision recombination reaction. Inverted repeat target sites are shown as open arrows. Recombinase monomers are shown as filled circles. Each step in the reaction is reversible and has a forward and a backward rate, indicated by small arrows. The forward and backward rate constants for each step are indicated on the Figure beside the arrows (small kn or k-n). The equilibrium constant for the conversion of one complex to another is given by the quotient of the backward and forward rate constants. Equilibrium constants are indicated (Kn). Names of species used for mathematical modelling are shown beside each complex (Ringrose et al., 1998, reproduced with permission).


In our scheme, the second-order association rate constants for the binding of the first and second monomers are named k1 and k2, respectively. This process must occur twice to occupy both target sites of the excision substrate. In the model, all DNA binding and dissociation steps are described by k1 and k2, and their corresponding first-order dissociation rate constants, k-1 and k-2 (Figure 1). A reduced model, describing the sequential binding of recombinase monomers to a single full site target comprising two half sites, in terms of the four parameters k1, k2, k-1 and k-2 is described by equations (7) to (10) (see below).
After DNA binding, the next step in the pathway is synapsis. FLP synapsis occurs by random collision (Beatty et al., 1986). In Figure 1, synapsis of the fully occupied excision substrate is described by a single first-order rate constant, k3. The model only allows for intramolecular synapsis, although in reality intermolecular recombination can also occur. However, for recombination assays, standard experimental conditions have been used, under which the frequency of intermolecular recombination is negligible. The multiple steps of catalysis are well characterised for FLP (for reviews, see Stark et al., 1992; Jayaram, 1994; Sadowski, 1995), and are described in Figure 1 by a single pair of rate constants, k4 and k-4, the forward and back rates of catalysis, respectively.
At present, we do not have a means of accurately measuring the formation of the synaptic complex. For this reason, we have further simplified the model by combining the constants k3 and k4 to give an apparent constant, k34, describing the rate of conversion of the fully bound substrate (SM4) into the excised synaptic complex (IEP) (Figure 1). The back rate, k-34, describes the reverse process. The relationship between k34, k-34 and their components k3, k-3, k4 and k-4 is given by:

Eq6 FLP.png

In this scheme, the dissociation of the synapse is represented as a reversal of its assembly: dissociation gives rise first to two products, each of which is bound by a recombinase dimer (Figure 1). This process is described by the first-order rate constant, k5. The dissociation of recombinase from DNA is assumed to occur in a stepwise manner, and is described in the model by the rate constants k-1 and k-2 (Figure 1). This assumption is based on the simplest and most logical pathway. The dissociation mechanism of FLP has not been studied extensively, and there is disagreement in the literature regarding the steps involved. In experiments with synthetic Holliday junctions, it has been reported that the resolution of such structures requires two (Dixon & Sadowski, 1993), three (Qian & Cox, 1995) or four molecules of FLP (Lee et al., 1996). Based on experiments using full recombination substrates, Waite & Cox (1995) proposed a mechanism for FLP dissociation in which one or more monomers leave the synapse after recombination whilst others remain bound for longer. In the absence of a consensus on the mechanism of dissociation, Ringrose et al. (1998) proposed the mechanism shown in Figure 1, and pointed out that other mechanisms could easily be incorporated into the model by modification of equations (11) to (24) (see below).

The DNA binding rate constants k1, k-1, k2, k-2, and their corresponding equilibrium constants K1 and K2, have been directly measured using the gel mobility shift assay. If the DNA binding constants are known, this leaves two pairs of unknown rate constants: k34 and k-34, and k5 and k-5 (Figure 1). The mathematical model describes a general excision recombination reaction in which four protein monomers are required to reversibly recombine a single substrate, giving two products. The values of all rate constants, and of protein and substrate input can be varied to represent specific cases.
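As a small worked example of the relationship stated in the legend of Figure 1, the sketch below computes an equilibrium constant as the quotient of the backward and forward rate constants. The numerical values are hypothetical placeholders and are not the measured FLP constants; the function name is likewise an assumption made for illustration.

```python
def equilibrium_constant(k_forward, k_backward):
    """K_n = k_-n / k_n, the quotient of backward and forward rate constants."""
    if k_forward <= 0:
        raise ValueError("forward rate constant must be positive")
    return k_backward / k_forward


# Hypothetical values, chosen only to show the units involved.
k1 = 1.0e8        # second-order association rate constant, M^-1 s^-1
k_minus1 = 1.0    # first-order dissociation rate constant, s^-1

K1 = equilibrium_constant(k1, k_minus1)  # has units of M
print(K1)
```

Because the association step is second order and the dissociation step is first order, the resulting constant for a binding step carries units of concentration.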


Simulation of FLP recombination

Kinetic FLP 2.png


Figure 8. Simulation of FLP recombination. a) Substrate titration (see below). The input values for protein, substrate, k1, k-1, k2 and k-2 are taken from Ringrose et al. (1998). Simulated curves for total excision product (nM) at three minutes and 60 minutes were fitted to the experimental data for FLP by simultaneous optimisation of the rate constants k34, k-34, k5 and k-5. Continuous lines, simulated recombination at three and 60 minutes; 25.6 nM FLP, 0.05 to 10 nM substrate. Open circles, recombination observed at three minutes. Filled circles, recombination observed at 60 minutes. b) Time course. The optimised parameters in Ringrose et al. (1998) were used to simulate time course curves for FLP. Continuous line, simulated recombination time course at 25.6 nM FLP, 0.4 nM substrate; filled circles, data from time course experiment at 25.6 nM FLP, 0.4 nM substrate. c) Protein titration. Recombination time course curves were simulated for FLP at various protein inputs, with 0.4 nM substrate input. The simulated recombination after 60 minutes is plotted against protein concentration. Continuous line, simulated recombination, parameters as in Ringrose et al. (1998). Open circles, data points from protein titration experiment.


Mathematical modelling I: DNA binding

Using the nomenclature of Figure 1, the binding of recombinase monomers (M) to a full site substrate (S) is described by the kinetic equations:

Eq1 FLP.png

k1 and k-1 are the association and dissociation rate constants, respectively, for the species SM. k2 and k-2 are the association and dissociation rate constants for the species SM2. For the DNA binding model the assumption is made that the on and off rates for each half site, a and b, are identical. The evolution of species M, S, SM and SM2 over time is described by the differential equations:


Eq2 FLP.png

Eq3 FLP.png

dt is the time interval in seconds. The terms in square brackets refer to molar concentrations at time, t. Equations (7) to (10) were implemented in a Fortran 77 code, and solved by finite difference numerical integration. The rate constants k1, k-1, k2 and k-2 were determined by fitting the computed time courses of the SM and SM2 concentrations to the experimentally observed values simultaneously. Starting values of k1, k-1, k2 and k-2 were determined by choosing values which gave an approximate agreement with the experimental data. Using these starting values, the model of equations (7) to (10) was integrated in time using a semi-implicit Euler scheme (Press et al., 1992). The time courses were fitted to the experimental data by minimising an error function using the method of Powell (1965), a derivative-free version of the conjugate gradient method. The error function was constructed by summing the squared deviations of the experimental and computed SM and SM2 concentrations. As each set of rate constants corresponds to a unique solution of equations (7) to (10), and hence a given value of the error function, optimal values of the rate constants correspond to optimal agreement between experimental and modelled time courses.
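The procedure described above can be sketched in a few lines of Python. This is a minimal illustration and not the original Fortran 77 code: it assumes plain mass-action kinetics for S + M <-> SM and SM + M <-> SM2 with the two half sites lumped into a single SM species, it uses an explicit Euler step rather than the semi-implicit scheme cited in the text, and all rate constants and inputs are hypothetical values in nM and seconds. The minimisation step (Powell's method) is not shown; only the error function it would minimise is.

```python
def binding_rates(state, k1, k_1, k2, k_2):
    """Mass-action rates of change for (M, S, SM, SM2)."""
    m, s, sm, sm2 = state
    first = k1 * s * m - k_1 * sm      # net flux S + M -> SM
    second = k2 * sm * m - k_2 * sm2   # net flux SM + M -> SM2
    return (-first - second, -first, first - second, second)


def integrate(state, params, dt, n_steps):
    """Explicit Euler integration; returns the state at every step."""
    trajectory = [state]
    for _ in range(n_steps):
        rates = binding_rates(state, *params)
        state = tuple(x + dt * r for x, r in zip(state, rates))
        trajectory.append(state)
    return trajectory


def squared_error(trajectory, observations):
    """Sum of squared deviations of computed SM and SM2 from observed values.

    observations is a list of (step_index, sm_observed, sm2_observed).
    """
    total = 0.0
    for step, sm_obs, sm2_obs in observations:
        _, _, sm, sm2 = trajectory[step]
        total += (sm - sm_obs) ** 2 + (sm2 - sm2_obs) ** 2
    return total


# Hypothetical inputs: concentrations in nM, time in seconds.
initial = (25.6, 0.4, 0.0, 0.0)        # M, S, SM, SM2
params = (0.01, 0.01, 0.02, 0.005)     # k1, k-1, k2, k-2 (made-up values)
trajectory = integrate(initial, params, dt=0.01, n_steps=6000)

m_end, s_end, sm_end, sm2_end = trajectory[-1]
print(s_end + sm_end + sm2_end)        # total DNA, should stay at 0.4 nM
print(m_end + sm_end + 2 * sm2_end)    # total protein, should stay at 25.6 nM
```

A fitting routine would call `integrate` and `squared_error` repeatedly with trial rate constants and keep the set that gives the smallest error. The two conservation sums printed at the end are a convenient sanity check on any implementation of the binding equations.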


Mathematical modelling II: the recombination reaction

The proposed pathway of recombination is shown in Figure 1. The names of most species and rate constants used in the model are shown in the Figure. Additional species described by the model and not shown in Figure 1 are as follows. The distinction is made in the model between SM2A (equation (14)), which has a single monomer bound at each target site, and SM2B (equation (15)), which has two monomers bound at one site, leaving the other site unoccupied. Only SM2B is indicated in Figure 1. Either SM2A or SM2B can give rise to SM3 (equation (16)), in which one site is bound by two monomers while the other is bound by only one monomer. Later species which are also not shown in Figure 1 describe the stepwise dissociation of monomers from DNA. EPM2 and LPM2 (Figure 1), both with two monomers bound, can each release a single monomer to give rise to EPM and LPM respectively (equations (21) and (22)). The release of the last bound monomer from EPM and LPM gives rise to EP and LP (equations (23) and (24); Figure 1). Based on classical kinetic equations, the differential equations describing the rate of change of each species over time (in seconds) are as follows:


Eq4 FLP.png

Eq5 FLP.png


Equations (11) to (24) were implemented and solved as described above for the DNA binding model. To find values for the unknown rate constants k34, k-34, k5 and k-5, computed time courses were fitted to experimental data by varying the input values of unknown rate constants. The quantity taken to represent the total excision product, EPtot, is given by IEP + EPM2 + EPM + EP. This quantity is considered to describe the experimentally observed excision product because only the linear excision product, and not the circle, is labelled and quantified. In addition, because recombination reactions are terminated with proteinase K, the DNA seen as excision product includes the DNA within the excised synaptic complex (IEP in the model) and all subsequent bound and unbound forms (EPM2, EPM and EP).
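The definition EPtot = IEP + EPM2 + EPM + EP can be written as a short helper. This is an illustrative sketch: the species names follow the text, but the dictionary layout, the function name and the concentrations are hypothetical and do not come from the fitted model.

```python
# Species that count towards the observed linear excision product.
EXCISION_SPECIES = ("IEP", "EPM2", "EPM", "EP")


def total_excision_product(concentrations):
    """Return EPtot = IEP + EPM2 + EPM + EP from a species -> nM mapping."""
    return sum(concentrations[name] for name in EXCISION_SPECIES)


# Made-up concentrations in nM at one time point of a simulated time course.
snapshot = {
    "IEP": 0.05,
    "EPM2": 0.02,
    "EPM": 0.01,
    "EP": 0.12,
    "LPM": 0.01,   # other product forms are not counted in EPtot
    "LP": 0.12,
}

ep_tot = total_excision_product(snapshot)
print(ep_tot)
```

Species such as LPM and LP are present in the mapping but deliberately left out of the sum, mirroring the statement that only the labelled excision product is quantified.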