Team:Heidelberg/Project Synthetic promoters



{|valign="top" border="0" style="margin-left: 2px;"
 * width="650px" style="padding: 0 15px 15px 20px; background-color:#ede8e2"|

= Synthetic Promoters =

The central question of the synthetic promoter project is: Are we able to make specific promoters for use in mammalian cells synthetically?

Or, going even further: Are we able to develop a standard method for creating promoters of
 * Defined strength
 * Defined response
 * Defined pathway integration

How we got this to work... Read on!

Abstract
Promoters are the key regulators of gene expression. Promoters that are active under a desired condition, at a desired strength and in a specified tissue are of great value for plant biotechnology, gene therapy and fundamental bioscience research. It is therefore desirable to construct promoters synthetically that respond to a variety of pathways. We explore two routes to promoter synthesis. On the one hand, we have developed a bioinformatical model and database (HEARTBEAT) describing the structure of promoters responsive to user-defined inputs. On the other hand, we have developed a biochemical method for the synthesis of randomized promoter libraries. Using this method, we have created a library of constitutive promoters of varying strength, as well as libraries of promoters putatively responsive to a variety of pathways. We have screened these libraries for functional, pathway-responsive promoters and present a detailed characterization of an NF-κB responsive promoter of our making. Finally, we discuss ways to combine randomized biochemical synthesis and bioinformatical modeling, and propose a method for generating promoters with complex regulation (i.e. regulation by multiple pathways).


Introduction
Controlling gene expression in a condition- and tissue-specific manner is a desirable tool for a wide variety of applications ranging from fundamental bioscience research to gene therapy. With this idea in mind, one inevitably has to consider the starting point of gene expression: the promoter. Most efforts to obtain promoters focus on cloning them from nature. In eukaryotes, this approach leads to several problems. The major issue is the complexity of eukaryotic systems, which results in cross-talk between pathways. In particular, promoters tend to respond to several transcription factors and are thus activated by more than one pathway. As a result, one cannot determine which signal originally induced the promoter. Besides this specificity, induction under a desired combination of conditions and at defined strengths would be valuable features.

Therefore, efforts have emerged to construct promoters synthetically. Two concepts of synthetic promoters in mammalian cells co-exist independently of each other. One is the concept of "genetic switches" (see [2] for a recent review) - promoters which can be specifically induced by a stimulus to which mammalian cells are usually insensitive, e.g. tetracycline [3]. Far less effort has been put into developing promoters sensitive to endogenous signals (referred to as "synthetic promoters" in the rest of this article). Such promoters are of very high value for a broad variety of applications. Three examples should demonstrate this. First, in virotherapy for cancer and other diseases, it has become desirable to express toxic genes only in affected cells (reviewed in [4]). For example, breast cancer cells are characterized by high levels of estrogen receptor (ER). Constructing a promoter which is active only at high estrogen receptor levels (preferably only in cells which are irradiated, as ER can be very active in other tissues of the female reproductive tract [17]) might therefore help in developing novel breast cancer therapies. Second, biologists studying pathway interactions are in need of transcriptional assays, that is, promoters which are specifically activated by a single transcription factor. Third, the concept can be transferred to plants, where synthetic promoters can be very valuable, as plant biotechnology is always in need of novel tissue- or development-specific promoters.

Three approaches exist to construct synthetic promoters responsive to endogenous factors. First, the structure of promoters is modeled by generating large data sets describing the relative spacing and coincidence of transcription factors (reviewed in [5]). To our knowledge, such predictions have not been tested in vivo. Second, promoters are generated by randomly or repeatedly cloning response elements upstream of a core promoter. In our understanding, repeated cloning of response elements works well [6] and is frequently applied, but no suggestions exist on how to apply this strategy to the generation of more complexly regulated promoters. The random creation of promoters works well to generate constitutive promoters [7] and was even applied to broadly identify activating elements [8], but no promoters of specific regulation have been described for this approach. A third approach is the randomization of spacer elements between transcription factor binding sites, which is applied to generate libraries of promoters of varying strength [9], [10].

In order to be able to design synthetic promoters, an understanding of natural promoters is required. Mammalian promoters can be subdivided into several "domains". The core promoter is the binding site of the basal transcription machinery, i.e. RNA polymerase and associated factors. Core promoters differ in composition, but are more or less similar for most genes (reviewed in [11]). The main regulatory domain is the proximal promoter, which is where regulatory elements bind. It can be very large (4kb), meaning that some transcription factors regulate transcription despite being very far away from the RNA polymerase. This is mainly possible because of the three-dimensional structure the DNA adopts. In addition to this, there are even more distal elements that are referred to as "enhancers" and "silencers". A further challenge is that some transcription factors are not able to initiate transcription on their own, but rather they require other transcription factors for their activity [5].


RA-PCR, a method for the generation of randomized promoter libraries


We have developed a standard method (termed "Random Assembly PCR / RA-PCR") for the construction of randomized promoter libraries. We modified Assembly PCR [12] to create promoters with randomized spacing and frequency of transcription factor binding sites. The method relies on using different oligos, each containing a transcription factor binding site (or random DNA) plus two annealing sequences (see Fig. 1 for a comprehensive explanation of the method). We use two sets of oligos, one for the top strand and one for the bottom strand. The oligos for each strand share the same annealing sequences, which are complementary to the annealing sequences of the other strand. If these oligos are pooled, they anneal to each other randomly, thus generating randomized repeats of the transcription factor binding sites of interest at varying spacing. In order to be able to clone the construct, we also add two stop oligos (termed stop 5' and stop 3') which contain only one annealing sequence plus a cutsite (SpeI 5', HindIII 3'). Double-stranded DNA is created by running a seven-cycle PCR and amplified by a 25-cycle PCR. The resulting (proximal) promoter is then cloned 5' of a core promoter (we used the core promoter of JeT [10]) by inserting it into pSMB_MEASURE (Part:BBa_K203100), the promoter measurement plasmid we developed (from there, it can be excised like any standard biological part in a submission plasmid). Thus, a mixture of different promoters in the same plasmid backbone is generated. This mixture can then be transformed into bacteria. Each colony represents a single putative promoter, which can be transfected into mammalian cells. 24 hours after transfection, cells are exposed to the conditions of interest and to control conditions, which serve as a reference. Promoters active under the desired conditions, but not under control conditions, are selected for further characterization.
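To make the idea of random assembly more tangible, the following minimal Python sketch simulates a library in silico. It is a toy model under stated assumptions, not a description of the annealing chemistry: elements are drawn from a weighted pool until a stop oligo is drawn, and the pooled volumes are used directly as drawing weights. The element names, the weights and the cap of 20 elements per product are illustrative choices that do not come from our experiments.

```python
import random

def assemble_promoter(pool, stop_weight, rng, max_elements=20):
    """Chain randomly drawn elements until a stop oligo terminates the product."""
    names = list(pool) + ["STOP"]
    weights = list(pool.values()) + [stop_weight]
    chain = []
    while len(chain) < max_elements:
        pick = rng.choices(names, weights=weights, k=1)[0]
        if pick == "STOP":
            break
        chain.append(pick)
    return chain

def build_library(pool, stop_weight, n_clones, seed=0):
    """Simulate n_clones independent assembly products with a fixed seed."""
    rng = random.Random(seed)
    return [assemble_promoter(pool, stop_weight, rng) for _ in range(n_clones)]

# Hypothetical pool: relative amounts of each oligo type (assumed weights).
pool = {"NFkB": 8.0, "spacer": 3.0, "general_activator": 0.5}
library = build_library(pool, stop_weight=1.6, n_clones=24, seed=1)

for number, clone in enumerate(library[:5], start=1):
    print(number, "-".join(clone) if clone else "(no insert)")
```

Raising `stop_weight` relative to the other weights shortens the simulated products on average, which mirrors the role of the stop oligo concentration as a handle on insert length.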

Please see a detailed protocol for RA-PCR below.


Generation of a library of constitutive promoters
As a first application of RA-PCR, we have created a library of constitutive promoters. We performed RA-PCR on oligos containing binding sites for some well-known, generally activating transcription factors (Sp1, Ap1, CREB, NF-Y), which we identified by literature search [7], [8], [10]. We also added NF-κB responsive oligos, as NF-κB has non-specific activity and is therefore used by a variety of viral constitutive promoters, e.g. the HIV promoter [13]. We picked 24 colonies, two of which we dismissed after a test digest (not shown). Fig. 3 shows the sequence analysis of some randomly selected clones and demonstrates that RA-PCR is able to generate randomized repeats of oligos. We then measured the activity of the clones we picked by applying the concept of Relative Expression Units (REU) we developed. Fig. 2 shows that we have been able to create a library of promoters of varying strength, some of which have an expression strength higher than JeT (which was attempted but not accomplished by JeT's developers [10]). Such a library is of great value for fine-tuning gene expression levels.


Generation and screening of a library of promoters putatively responsive to NF-κB
RA-PCR was conducted with oligos containing an NF-κB binding site, plus a small number of "general activators" (NF-Y, Sp1, Ap1, CREB). Box 1 demonstrates how the oligos were designed from a frequency matrix. 33 clones were picked, miniprepped and transfected. NF-κB was then induced by the addition of TNF-α (2.5µM) for 10 hours; uninduced cells served as a control. The plate was then scanned with a TECAN automated fluorescence plate reader. The TECAN reader is very imprecise on eukaryotic cells, and the arbitrary fluorescence we measured is not proportional to REU or another precise measure of mammalian promoter activity, but it can serve as a rough indicator of promoter induction. The result (Fig. 4) shows that most clones appear not to be induced by NF-κB / TNF-α, whereas others are induced to varying degrees. Considering the sequence analysis of some randomly selected clones (Fig. 5), this result is not intuitive, as most sequences contain an NF-κB binding site, but it demonstrates that simply cloning repeats of a transcription factor binding site in front of a core promoter will not necessarily work.
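A screen of this kind boils down to comparing induced and uninduced readings per clone. The sketch below shows one simple way to rank clones by fold induction. The clone names, the readings and the threshold of 1.5 are invented for illustration and are not our measurements; given the imprecision of plate-reader data noted above, such a ranking should only be treated as a rough pre-selection.

```python
from statistics import mean

def fold_induction(induced, uninduced):
    """Ratio of mean induced to mean uninduced fluorescence."""
    baseline = mean(uninduced)
    if baseline <= 0:
        raise ValueError("uninduced signal must be positive")
    return mean(induced) / baseline

def rank_clones(reads, threshold=1.5):
    """Return (clone, fold) pairs at or above the threshold, strongest first."""
    scored = [
        (clone, fold_induction(r["induced"], r["uninduced"]))
        for clone, r in reads.items()
    ]
    hits = [(clone, fold) for clone, fold in scored if fold >= threshold]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)

# Hypothetical plate-reader values in arbitrary fluorescence units.
reads = {
    "clone_A": {"induced": [400.0, 420.0], "uninduced": [200.0, 210.0]},
    "clone_B": {"induced": [150.0, 160.0], "uninduced": [155.0, 150.0]},
    "clone_C": {"induced": [900.0, 870.0], "uninduced": [300.0, 290.0]},
}
hits = rank_clones(reads)
for clone, fold in hits:
    print(f"{clone}: {fold:.2f}-fold")
```

Clones passing such a filter would still need characterization by a more precise method, as done for clone 31 by flow cytometry.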

We picked clone 31 for further characterization in REU.


Characterization of an NF-κB responsive promoter
We characterized clone 31 by flow cytometry (3 independent experiments, each with 3 replicates) and found it to be upregulated by approx. 100% (Fig. 6). We also performed a time-course measurement of induction (Fig. 7). To verify our assumption that promoters created by RA-PCR are specific to the pathway of interest, we tried to induce clone 31 under a wide variety of conditions (full medium, full medium lacking growth factors / serum, minimal medium / starvation, and PPARγ induction by thiazolidinedione) and found no change in induction levels (Fig. 8). We also show that the promoter is active only if p65, a component of NF-κB, localizes to the nucleus (Fig. 9 and 10).


RA-PCR can generate promoters responsive to a variety of pathways
We performed RA-PCR to construct promoters putatively responsive to transcription factors as diverse as p53 (DNA damage sensor), PPARγ (metabolism and diabetes), SREBP (sterol nutrition), HIF (hypoxia) and estrogen receptor. While screening these promoters we found the following:
 * For PPARγ, we identified two clones which appear to be responsive to the anti-diabetes drug 2,4-thiazolidinedione. We roughly characterized these promoters by a triple TECAN read relative to JeT (Fig. 11).
 * For p53, induction of the pathway by the topoisomerase inhibitor camptothecin (an anti-cancer drug) turned out to be difficult, as it severely harms the cells and makes promoter induction levels difficult to assess. We therefore attempted to normalize screening conditions to the number of living cells by Hoechst staining. We found that some promoters appeared to be strongly downregulated by camptothecin and therefore experimented with a variety of conditions inducing p53 via different pathways and at different phosphorylation sites, but were unable to obtain a conclusive picture.
 * For HIF, we failed to induce the conditions sufficiently to achieve promoter activation. We discuss below how screening can be improved.
 * For SREBP and Estrogen, we encountered technical problems during promoter synthesis (probably damaged HindIII enzyme) and therefore were unable to produce enough clones for sufficient screening. For SREBP, we therefore cloned two natural, SREBP-upregulated promoters we had at hand and submitted them to the registry (Main Article: Natural promoter subproject).


HEARTBEAT, a model describing promoter structure
Main article: HEARTBEAT

Based on the assumption that transcription factors (TFs) have a spatial preference for where they bind in natural promoter sequences relative to the transcriptional start site (TSS) [14], we developed HEARTBEAT (Heidelberg Artificial Transcription Factor Binding Site Engineering and Assembly Tool). In a first step, 4395 human promoter sequences covering 1000 bp upstream of the TSS, obtained from the UCSC genome browser, were analyzed by the program “Promotersweep” [15]. Promotersweep is able to assign transcription factor binding sites (TFBS) to a given sequence by retrieving and combining information from three homology databases (EnsEMBL Compara, NCBI HomoloGene, DoOP database), five promoter databases (e.g. EPD, DBTSS), six sequence motif identification tools (e.g. Meme, Gibbs Motif Sampler) and two matrix profile databases (Jaspar Core Library, Transfac Professional Library). Each TFBS motif is further classified as weak, conserved or reliable according to the quality of the assignment. The final result of Promotersweep can be divided into general spatial information about the TFBS and the consensus sequence on the one hand, and further detailed facts about the associated gene on the other. In Fig. 12, the spatial distribution of VDR (Vitamin D receptor) binding sites within 140 natural promoter sequences is shown as an example. The size of each bin equals the number of VDR TFBS within a range of 20 bp. The solid line represents the probability density function (pdf). Here, the maximum of the pdf is located 54 bp upstream of the TSS, as indicated by the vertical line. Natural promoter sequences usually exhibit multiple TFBS, which implies dependencies between different TFs in their binding behavior to the DNA. Fig. 13 shows the frequency distribution of TFBS that co-occur when VDR is present. The highest peak represents VDR itself. The next three highest peaks are Kid3 (inhibitory), WT1 and AP-2 (stimulatory). In total, over 300 different TFBS co-occur with VDR. Both plots represent data deduced from the HEARTBEAT database, which enables a well-defined synthesis of promoter sequences.
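The kind of analysis behind a plot like Fig. 12 can be sketched in a few lines of Python: bin the TFBS positions in 20 bp windows and locate the maximum of a smoothed density. This is a minimal illustration only. The positions are invented, and the Gaussian kernel density estimate with a bandwidth of 15 bp is an assumption made here, not necessarily how the pdf in HEARTBEAT was computed.

```python
import math
from collections import Counter

def histogram(positions, bin_width=20):
    """Count sites per bin; keys are the lower bin edge in bp upstream of the TSS."""
    return Counter((p // bin_width) * bin_width for p in positions)

def gaussian_kde(positions, bandwidth):
    """Return a function that estimates the probability density at position x."""
    norm = 1.0 / (len(positions) * bandwidth * math.sqrt(2 * math.pi))

    def pdf(x):
        return norm * sum(
            math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions
        )

    return pdf

def density_peak(positions, bandwidth=15.0, lo=0, hi=1000):
    """Scan integer positions from lo to hi and return the one with highest density."""
    pdf = gaussian_kde(positions, bandwidth)
    return max(range(lo, hi + 1), key=pdf)

# Hypothetical TFBS positions (bp upstream of the TSS), invented for illustration.
positions = [40, 48, 52, 55, 58, 61, 70, 310, 640, 905]
bins = histogram(positions)
peak = density_peak(positions)

print("sites per 20 bp bin:", dict(sorted(bins.items())))
print("density maximum at", peak, "bp upstream of the TSS")
```

The bandwidth controls how strongly neighbouring sites are merged into one peak, so the reported maximum should be read together with the histogram rather than on its own.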


An in vivo test of predicted promoters
Main article: HEARTBEAT database

We tested SREBP responsive promoters predicted by HEARTBEAT and found that only the sequences matching the predictions most closely were upregulated by SREBP (see figure).


Discussion
The results shown above demonstrate the potential of RA-PCR for the synthesis of promoters. Even by analyzing modest numbers of clones for each individual pathway, we were able to obtain promoters covering a wide range of strength and inducibility. We were also able to obtain constitutive promoters of greater strength than JeT, which has not been possible until now [10].

Many insights into promoter regulation are possible by analyzing different promoters created by RA-PCR. For example, NF-κB clone 3 and clone 11 (see Fig. 4 and 5) differ only in the positioning of the single response element (RE), yet induction strength differs threefold. This gives hints about NF-κB's binding preference. A systematic study of promoters generated by RA-PCR and their strength can therefore be used to develop a comprehensive model of transcriptional regulation. This approach was addressed with the HEARTBEAT fuzzy network (FN) modeling. HEARTBEAT-FN is used to predict the outcome of the designed sequences and, in a reverse process, to determine the quality of the synthetic construct.


Screening conditions and induction strength
As noted above, we experienced difficulties inducing some of the pathways (namely, HIF and p53). From our cell culture work, we learned that finding the ideal timepoint and the ideal conditions for inducing a certain pathway is very difficult, even with literature at hand. Also, one would expect a much higher induction than the one observed for the NF-κB responsive clone we describe. Our induction levels might be low because NF-κB has a high constant activity [16], especially if the cells encounter "rough" cell culture conditions. This also explains the (small) activity of uninduced NF-κB responsive promoters we observe in Fig. 7. We therefore suggest that for future screening, a library of siRNAs for the transcription factors of interest should be compiled. A library of transcription factors mutated to be constantly active is also required. With these libraries available, individual transcription factors can be knocked down and activated specifically at 100% efficiency. This will greatly facilitate screening and parts characterization.


Generation of down-regulated promoters
As shown, we were able to generate a set of promoters upregulated by certain factors. Several applications, however, require promoters of high constant strength which become downregulated by a signal. We think it might be possible to construct such promoters by performing RA-PCR with oligos containing weak binding sites for generally activating transcription factors (that is, binding sites which deviate from the consensus sequence) and adding some oligos containing very strong binding sites for the transcription factor of interest (say, NF-κB). If this factor is not active, the general activators will be able to bind to the DNA and activate transcription. Upon activation of the factor, the general activators will be displaced. If the binding site is then in a position where it does not initiate transcription (as for some of the clones (32 etc.) shown in Fig. 4 and 5), the promoter will be downregulated instead of upregulated. This hypothesis remains to be tested.


M-RA-PCR, a model-guided biochemical method for synthesis of complex promoters
RA-PCR can be modified to reflect modeled probability density curves. If a promoter regulated by multiple pathways, for example VDR (Vitamin D receptor) and SREBP (sterol regulatory element-binding protein), is to be constructed, the density curves obtained from the model (Fig. 12) can give clues about its construction. A working VDR/SREBP promoter requires VDR and/or SREBP response elements (REs) in close vicinity to the TSS (at approx. 850). It might require SREBP REs between 300 and 700, and VDR REs between 0 and 300. This distribution can be reflected by setting up 3 RA-PCRs with varying concentrations of VDR-responsive, SREBP-responsive and spacer oligos (compare Fig. 15). If a 3' stop oligo containing an NheI cutsite and a 5' stop oligo containing a SpeI cutsite (or any combination of cutsites yielding compatible ends) are used, any number of RA-PCR products can be assembled and cloned in front of a core promoter (having a SpeI cutsite 5'). We believe that this technique, termed Model-guided Random Assembly PCR, or M-RA-PCR, is the way forward to constructing the promoters of complex regulation described in the Introduction.
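One conceivable way to translate density curves into reaction setups is sketched below in Python: for each promoter segment, the oligo volume is split between the factors in proportion to their modeled density in that segment, with a fixed share reserved for spacer oligos. The segment boundaries follow the example above, but the density values, the total volume of 10 µL and the spacer share of 20% are hypothetical placeholders, and the proportional rule itself is an assumption that has not been tested.

```python
def pool_volumes(segment_density, total_ul=10.0, spacer_fraction=0.2):
    """Split a reaction's oligo volume between factors in proportion to density."""
    total_density = sum(segment_density.values())
    if total_density <= 0:
        # No factor is expected in this segment: use spacer oligos only.
        return {"spacer": total_ul}
    factor_ul = total_ul * (1.0 - spacer_fraction)
    volumes = {
        factor: factor_ul * density / total_density
        for factor, density in segment_density.items()
    }
    volumes["spacer"] = total_ul * spacer_fraction
    return volumes

# Hypothetical relative densities per segment (not taken from HEARTBEAT).
segments = {
    "0-300": {"VDR": 0.8, "SREBP": 0.1},
    "300-700": {"VDR": 0.1, "SREBP": 0.7},
    "700-1000": {"VDR": 0.5, "SREBP": 0.5},
}
plan = {name: pool_volumes(density) for name, density in segments.items()}

for name, volumes in plan.items():
    summary = ", ".join(f"{oligo}: {ul:.2f} µL" for oligo, ul in volumes.items())
    print(f"RA-PCR for segment {name}: {summary}")
```

Each entry of `plan` corresponds to one of the three RA-PCRs, whose products would then be joined via compatible cutsites.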


Final remarks
We have developed two independent methods for the generation of truly synthetic promoters for use in mammalian cells and discussed possibilities for their combination and improvement. Our findings show that knowledge of promoter structure obtained by bioinformatical analysis (HEARTBEAT) greatly improves promoter function. We are looking forward to continuing this work and generating promoters which can be used in medical or biotechnological applications, such as transcriptional targeting in virotherapy or a reporter cell line.


RA-PCR protocol

 * All oligos we used can be found in Material and Methods.
 1. Obtain density curves for the distribution of your TF of interest from our model HEARTBEAT GUI. If this density curve shows a decisive peak at a distance >250 bp from the transcriptional start site (TSS), consider performing M-RA-PCR. If a peak is present close to the TSS, or if data is insufficient, continue here. How our model was developed is described in detail on the model page.
 2. Check HEARTBEAT GUI for transcription factors coinciding with your transcription factor of interest.
 3. Design two annealing sites, each 15-18 base pairs long. Annealing sites should be void of transcription factor binding sites. Calculate the reverse complement of both sequences. The following sequences MAY be used:
 4. Design a 5' stop oligo containing a cutsite (SpeI) and AS1_F.
 5. Design a 3' stop oligo containing a cutsite (HindIII) and AS1_RC.
 6. Design forward and reverse oligos for each transcription factor of interest. Forward oligos contain AS2_F, the transcription factor binding site (TFBS) and AS1_F. Reverse oligos contain AS2_RC, the TFBS and AS1_RC. The TFBS should be designed to represent the matrix describing the factor's binding preferences (Box 1).
 7. Design forward and/or reverse oligos for the coinciding transcription factors identified in step 2, in the same way as described in step 6.
 8. Design forward and/or reverse oligos for general activators.
 9. Design forward and reverse spacer oligos, which contain 10-15 random nucleotides instead of a TFBS.
 10. Order oligos at 100 µM. Pool the oligos. As a general rule, use 0.8 µL each of Stop 5' and Stop 3', ~4 µL of the transcription factor (forward), ~4 µL of the transcription factor (reverse), 1-2 µL each of the forward and reverse spacer oligo, ~1 µL of coinciding transcription factors and a total of 0.5 µL of general activators. For the oligo concentrations we used, please refer to Material & Methods. Define Vo as the total volume of oligos pooled.
 11. Dilute the pooled oligos 1:100 in two steps: add the pool (volume Vo) to water of volume Vw = (10 * Vo - Vo), then add a volume Vo of this 1:10 dilution to water of volume Vf = (10 * Vo - Vo). Add a volume Vo of the final dilution to a PCR reaction. Total reaction volume SHOULD be 50 µL. We used Phusion MasterMix 2x (Finnzymes) as PCR reagent. Carry out this PCR reaction in two replicates in order to achieve greater heterogeneity.
 12. Run the assembly PCR, 7-10 cycles, with the following setup:
    * 1 cycle initial denaturation: 5 minutes at 95°C
    * 7-10 cycles assembly: 30 seconds at 95°C, 45 seconds at 58°C, 45 seconds at 72°C
    * Terminal hold: 4°C, indefinitely
 13. Remove oligonucleotides by performing a PCR purification using a PCR purification kit (QIAGEN) or a gel extraction using a gel extraction kit (QIAGEN).
 14. Add PCR reagent (Phusion MasterMix 2x) again. Add 5' stop oligo and 3' stop oligo, 25 pmol (1 µL of 1:4 diluted stock).
 15. Run the amplification PCR, 25 cycles, with the following setup:
    * 1 cycle initial denaturation: 5 minutes at 95°C
    * 25 cycles amplification: 30 seconds at 95°C, 45 seconds at 68°C, 60 seconds at 72°C
    * Terminal hold: 4°C, indefinitely
 16. Gel purify the PCR products to exclude everything <200 bp. Use a 1% agarose gel at 50 V for at least 2 h to achieve good resolution.
 17. Digest with HindIII and SpeI (or whatever cutsites were included in steps 4 and 5). Digest a reporter plasmid containing a core promoter and a reporter gene with the same enzymes. We used BBa_K203100 for this task. Make sure to perform a thorough digest; in addition, treat the plasmid with shrimp alkaline phosphatase or calf intestine phosphatase afterwards. Gel purify the plasmid backbone and PCR purify the digested PCR products.
 18. Ligate. Perform a thorough ligation to increase transformation efficiency. We used Fermentas T4 DNA Ligase for 5 h at 19°C or overnight at 16°C.
 19. Transform into competent E. coli cells and plate out. Pick no more than 20 colonies per individual PCR reaction. If more putative promoters are desired, set up several PCR reactions.
 20. Isolate plasmid DNA from the selected colonies. We used a QIAGEN Miniprep kit for this task.
 21. Recommended step: Test-digest miniprep DNA with the same enzymes used in step 17 to make sure you get plasmids with synthesized promoters of varying length. The length of the inserts (that is, the synthetic promoters) should be between 100 and 600 base pairs. If this is not the case, vary the stop oligo concentration in step 10, improve the gel purification setup in step 16 or alter the PCR conditions in steps 12 and 15.
 22. Perform a screening to select functioning clones. For example, transfect clones in triplicate into eukaryotic cells on a 96-well plate using transfection agents such as EFFECTENE or Lipofectamine. Then, induce the conditions of interest in one replicate, shut them off in a second replicate, and leave control medium on the third replicate. When the pathway is fully active, read fluorescence (or luminescence, if a luciferase reporter is used) with a plate reader (TECAN) or other automated methods. For the conditions we used for pathway induction, please refer to our Methods.
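The pooling and dilution arithmetic of the protocol (define Vo, dilute twice 1:10, add Vo to a 50 µL reaction) can be checked with a short Python sketch. The oligo names and the volumes in the example pool are illustrative values picked from the ranges of the general pooling rule, and the check that Vo must not exceed half the reaction volume is an assumption based on the use of a 2x master mix.

```python
def dilution_plan(oligo_volumes_ul, stock_um=100.0, reaction_ul=50.0):
    """Compute Vo, the water needed per 1:10 step and final oligo concentrations."""
    vo = sum(oligo_volumes_ul.values())
    if vo > reaction_ul / 2:
        # Assumption: a 2x master mix takes up half of the reaction volume.
        raise ValueError("pooled volume exceeds the space left by a 2x master mix")
    water_per_step_ul = 10 * vo - vo  # 9 x Vo of water gives a 1:10 dilution
    final_nm = {}
    for name, volume in oligo_volumes_ul.items():
        pooled_um = stock_um * volume / vo  # concentration in the undiluted pool
        diluted_um = pooled_um / 100        # after two successive 1:10 steps
        final_nm[name] = diluted_um * 1000 * vo / reaction_ul  # nM in the PCR
    return {"vo_ul": vo, "water_per_step_ul": water_per_step_ul, "final_nm": final_nm}

# Hypothetical pool in µL, chosen within the ranges of the pooling rule.
example_pool = {
    "stop_5prime": 0.8,
    "stop_3prime": 0.8,
    "tf_forward": 4.0,
    "tf_reverse": 4.0,
    "spacer_forward": 1.5,
    "spacer_reverse": 1.5,
    "coinciding_tf": 1.0,
    "general_activators": 0.5,
}
result = dilution_plan(example_pool)

print(f"Vo = {result['vo_ul']:.1f} µL, water per dilution step = "
      f"{result['water_per_step_ul']:.1f} µL")
for name, conc in result["final_nm"].items():
    print(f"{name}: {conc:.1f} nM in the assembly PCR")
```

Because a volume Vo of the 1:100 dilution goes into the reaction, the final concentration of each oligo depends only on its own pooled volume and not on Vo, which makes it straightforward to tune individual oligos.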
