Team:Heidelberg/Notebook modeling

From 2009.igem.org

(10-19-2009)
 
__NOTOC__
__NOTOC__
-
 
{{Template_HD_3}}
{{Template_HD_3}}
-
<html><body id="Notebook_modeling"></body></html>
+
<html><body id="notebook"></body></html>
 +
 
{|
{|
-
|-valign="top" border="0"
+
|-valign="top" border="0" style="margin-left: 2px;"
-
|width="650px" style="padding: 0 20px 0 0; background-color:#ede8e2"|
+
|width="650px" style="padding: 0 15px 15px 20px; background-color:#ede8e2"|
-
__NOTOC__
+
= Notebook HEARTBEAT =
 +
Welcome to the notebook of the HEARTBEAT (Heidelberg Artificial Transcription Factor Binding Site Engineering and Assembly Tool) project. This notebook covers the work on three sublanes: HEARTBEAT database (DB), HEARTBEAT graphical user interface (GUI) and HEARTBEAT fuzzy modeling (FN), plus some additional work on logo and wiki design. Have fun!
 +
 
 +
=='''Contents'''==
 +
* <span style="font-size:5mm;">[[Team:Heidelberg/Notebook_modeling#July| July]]</span>
 +
 
 +
* <span style="font-size:5mm;">[[Team:Heidelberg/Notebook_modeling#August| August]]</span>
 +
 
 +
* <span style="font-size:5mm;">[[Team:Heidelberg/Notebook_modeling#September| September]]</span>
 +
 
 +
* <span style="font-size:5mm;">[[Team:Heidelberg/Notebook_modeling#October| October]]</span>
 +
 
 +
== July ==
 +
=== 7-27-2009 ===
 +
 
 +
* Meeting with Oliver Pelz
 +
** Discuss general ideas of our Database Structure and Content
 +
** An introduction to PromoterSweep (LINK). PromoterSweep screens a given sequence for conserved regions, which gives us consensus sequences, and then screens them for TFBS by database search (TRANSFAC, JASPAR) (LINK)
 +
** Our new database should contain the following information: promoter sequence, TFs, TFBS, position of TFBS, number of binding TFBS, "host organism"
 +
** We decide to use MySQL as an appropriate solution for this challenge; it also allows a graphical representation of the database on the web later.
 +
** GUI on wiki: which language? php? javascript?
 +
** Problems: access to PromoterSweep (Husar Bioinformatics Group, DKFZ), choice of Promoter Database (DoOP, UCSC, EnsEMBL) (LINK)
 +
 
 +
* aim: create database until end of August
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
== August ==
 +
{| class="wikitable centered" border="2" rules="rows" width="650px" style="border-color:white;"
 +
|- 
 +
! Week !! colspan="7"  |Days
 +
|-
 +
|style="text-align:center"|
 +
|style="text-align:center"| Mon
 +
|style="text-align:center"| Tue
 +
|style="text-align:center"| Wed
 +
|style="text-align:center"| Thu
 +
|style="text-align:center"| Fri
 +
|style="text-align:center"| Sat
 +
|style="text-align:center"| Sun
 +
|-
 +
|style="text-align:center"| 31
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-1-2009|1]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-2-2009|2]]
 +
|-
 +
|style="text-align:center"| 32
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-3-2009|3]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-4-2009|4]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-5-2009|5]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-6-2009|6]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-7-2009|7]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-8-2009|8]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-9-2009|9]]
 +
|-
 +
|style="text-align:center"| 33
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-10-2009|10]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-11-2009|11]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-12-2009|12]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-13-2009|13]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-14-2009|14]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-15-2009|15]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-16-2009|16]]
 +
|-
 +
|style="text-align:center"| 34
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-17-2009|17]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-18-2009|18]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-19-2009|19]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-20-2009|20]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-21-2009|21]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-22-2009|22]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-23-2009|23]]
 +
|-
 +
|style="text-align:center"| 35
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-24-2009|24]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-25-2009|25]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-26-2009|26]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-27-2009|27]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-28-2009|28]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-29-2009|29]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-30-2009|30]]
 +
|-
 +
|style="text-align:center"| 36
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#8-31-2009|31]]
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|}
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 8-3-2009 ===
 +
 
 +
* First contact with MySQL
 +
* Start making an overview of other teams' projects
 +
* Configuring our Virtual Server
 +
 
 +
=== 8-4-2009 ===
 +
 
 +
* Official Team Meeting (LINK) @ BQ seminar room 43: preparing presentation & writing meeting report
 +
* Start installing developing environment on our internal server
 +
** GNOME
 +
** Mediawiki
 +
 
 +
=== 8-5-2009 ===
 +
 
 +
* Meeting with Tobias Bauer & Anna-Lena Kranz (Theoretical Bioinformatics, DKFZ) @ TP3, DKFZ
 +
** Integrating ideas of PromoterSweep, Transfac as well as DoOP/CisRED
 +
** select "interesting" TFs (e.g. HIF, NFkB, c-myc, p53) for Wetlab
 +
** select "interesting" pathways (e.g. cell cycle, inflammation, metabolism etc)
 +
** future experimental validation: ChIP-on-Chip
 +
*** for this we need a TFBS-free sequence
 +
** idea: plot histogram of TFBS relative to TSS
 +
*** problem: choice of sequence: upstream only? include downstream?
 +
** new programming language: '''R''' and '''perl'''
 +
** next meeting: Friday after team meeting
 +
 
 +
* Meeting with Karl-Heinz Glatting (HUSAR, DKFZ) @ TP3, DKFZ
 +
** An introduction into PromoterSweep
 +
** Structure and analysis principles of PromoterSweep
 +
** Output is stored in an XML file. This means we have to parse the xml code.
 +
** Oliver Pelz will help us with the programming
 +
 
 +
* The protocol of the meeting can be downloaded [https://www.bioquant.uni-heidelberg.de/fileadmin/igem/wiki_docs/Notebook-Modelling/husar_treffen.pdf from here].
 +
 
 +
* Start working with MySQL
 +
* request UNIX/HUSAR/HPC access at DKFZ (Nao)
 +
* first contact with several databases: EnsEMBL, Compara, cisRED, DoOP, TiProD, contra
 +
 
 +
=== 8-6-2009 ===
 +
 
 +
* Meeting with Oliver Pelz
 +
** defining workflow with PromoterSweep, Matrix Profile Search and introduction into different Motif Discovery Algorithms
 +
 
 +
* installation of NX server for access onto internal server from Windows
 +
* configure developing environment (printing from Linux, configure Mediawiki)
 +
* defining basic concept of database construction
 +
** we select annotated promoter sequences in DoOP
 +
** we make a selection of pathway of interest using KEGG
 +
** narrow down the number of target promoter sequences to <10000.
 +
 
 +
=== 8-7-2009 ===
 +
 
 +
* Official Team Meeting on Scheduling
 +
* Meeting with Anna-Lena and Tobias
 +
** Introduction into R
 +
** Tobias will give us access to their computing cluster (Group Roland Eils)
 +
** Promoter Selection: DoOP, EnsEMBL, or UCSC?
 +
 
 +
* HUSAR account arrived
 +
* installation of R, R editor and perl editor
 +
* further configuration of our internal server / mediawiki
 +
* writing first perl program - "Hi there"
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 8-10-2009 ===
 +
 
 +
* first contact with R and perl
 +
* playing around with R and perl
 +
* playing around with R library: Biobase
 +
* check working on DKFZ cluster
 +
 
 +
=== 8-11-2009 ===
 +
* defining programming languages: perl, R, MySQL
 +
* retrieving first Promotersweep output files
 +
 
 +
* Meeting with Marti
 +
** ideas for modeling
 +
*** we will have at least three colors which overlap in their spectra.
 +
*** a very nice approach will be Fuzzy Logic Modeling.
 +
*** '''idea 1''': ''error checking'' of affinity: compare expectation to experimental results and figure out where the error is hiding
 +
*** '''idea 2''': ''create & visualize'' fancy and fuzzy data from ''in silico'' simulation
 +
** combine: promoter, output and graphic representation
 +
** next meeting with Marti: end of next week.
 +
 
 +
* extract NCBI Entrez Gene IDs with R and perl
 +
* MAC addresses registered for bioquant network
 +
 
 +
=== 8-12-2009 ===
 +
 
 +
* configure perl working environment
 +
* study structure of DoOP database
 +
* download DoOP and load DoOP database into MySQL
 +
 
 +
=== 8-13-2009 ===
 +
 
 +
* trying out some DoOP queries
 +
* download fasta sequences from UCSC gene browser
 +
* mapping of NCBI Entrez Gene IDs with RefSeq IDs
 +
* configure perl working environment on Windows XP
 +
* contact Endre Sebestyen concerning the perl module Bio-DoOP-DoOP
 +
 
 +
=== 8-14-2009 ===
 +
 
 +
* parse UCSC fasta sequences according to our selection
 +
* write parsed sequences into multifasta format
 +
* start PromoterSweep Analysis over Weekend
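
The selection and multi-FASTA steps above can be sketched as follows. This is only an illustration in Python (the notebook's own scripts were written in perl and are not shown here); the function names, the header convention (first word after <code>></code> is the ID) and the 60-character line width are assumptions.

```python
def parse_fasta(text):
    """Parse FASTA text into {id: sequence}; the id is the first word of the header."""
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()[0]
            records[header] = []
        elif header is not None:
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def write_multifasta(records, wanted, width=60):
    """Return a multi-FASTA string holding only the wanted ids, in the given order."""
    out = []
    for name in wanted:
        if name in records:  # silently skip ids that are not in the download
            seq = records[name]
            out.append(">" + name)
            out.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(out) + "\n"


# Hypothetical toy input; real headers from a genome browser export look different.
example = ">seqA some description\nACGT\nAC\n>seqB\nGG\n"
selected = write_multifasta(parse_fasta(example), ["seqB", "missing_id"])
```

IDs that are requested but absent are skipped here; a real run would probably log them so that missing promoters can be traced.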
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 8-18-2009 ===
 +
 
 +
Tim, Stephen, from here on you have to enter your own items yourselves!
 +
 
 +
* study the output file of PromoterSweep: check out its general structure and pick up useful information.
 +
* result is grouped in: ''General Info'', ''Best Genomic Mapping'', ''Promoter DB Search Result'', ''Graphical Overview'', ''Combined Binding Sites'', ''TSS and Exon Info'', ''Profile Matrices'' and ''Generated Output Files''.
 +
* upon selection, sections of interest will be collected and made ready for entry into MySQL DB
 +
* discuss table structure of our database
 +
 
 +
* What should our database be called? - Brainstorming -
 +
** SHOULD contain: iGEM, Transcription Factor, Binding Site, Promoter, synthetic biology, Heidelberg
 +
** MAY contain: position, heartbeat, prediction, assembly, eukaryotes
 +
** and still more keywords to come
 +
*establishing local@host access to mysql
 +
 
 +
=== 8-19-2009 ===
 +
 
 +
* parse Promotersweep xml file into tab-separated text file
 +
** the text file should contain: RefSeq ID, TF name, TFBS position, TF motif sequence, TFBS Quality, TSS, Entrez ID, EnsEMBL ID, further gene description.
 +
** this confronted us with several programming problems in working with multiple arrays, hashes and their combinations (arrays of hashes, hashes of hashes, etc.), which is why we studied hashes next
 +
* studying structure and basic concepts of hash & key
 +
*including parsed data into mysql database
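
A minimal sketch of the XML-to-tab-separated step described above, written in Python rather than the perl used by the team. The PromoterSweep XML schema is not reproduced in this notebook, so every tag and attribute name in the sample (<code>result</code>, <code>site</code>, <code>refseq</code>, ...) is a made-up placeholder; only the idea of flattening one row per binding site is taken from the entry.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical miniature of a PromoterSweep-like result file.
# All element and attribute names are assumptions for illustration.
SAMPLE_XML = """
<result refseq="NM_PLACEHOLDER" entrez="0" tss="1000">
  <site tf="Sp1" start="940" end="949" motif="GGGGCGGGGC" quality="good"/>
  <site tf="HIF" start="700" end="707" motif="TACGTGCG" quality="weak"/>
</result>
"""

FIELDS = ["refseq_id", "entrez_id", "tss", "tf", "start", "end", "motif", "quality"]


def xml_to_rows(xml_text):
    """Flatten the XML into one dict per predicted binding site."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for site in root.iter("site"):
        rows.append({
            "refseq_id": root.get("refseq"),
            "entrez_id": root.get("entrez"),
            "tss": root.get("tss"),
            "tf": site.get("tf"),
            "start": site.get("start"),
            "end": site.get("end"),
            "motif": site.get("motif"),
            "quality": site.get("quality"),
        })
    return rows


def rows_to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=FIELDS, delimiter="\t", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


tsv_text = rows_to_tsv(xml_to_rows(SAMPLE_XML))
```

A list of dicts plays the role of the perl "array of hashes" mentioned above; a tab-separated file of this shape could then be bulk-loaded into the database.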
 +
 
 +
=== 8-20-2009 ===
 +
* pre-decision for our table-structure
 +
** Table: Main_Info
 +
*** RefSeq ID, TF, TF motif start & end position, TFBS motif score, TFBS quality, TSS database info
 +
** Table: Gene_Info
 +
*** Ensembl_ID, Gene Symbol, Gene Description.
 +
** we go for the RefSeq ID to be the '''key''' connecting these two tables.
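
The two-table layout above could look roughly like this. The sketch uses Python's built-in <code>sqlite3</code> module as a stand-in for MySQL so that it runs without a server; the column names and types are assumptions derived from the bullet points, not the actual HEARTBEAT schema, and the inserted rows are placeholders rather than real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # in-memory stand-in for the MySQL server

# Assumed column names; the RefSeq ID links the two tables as described above.
conn.executescript("""
CREATE TABLE Gene_Info (
    refseq_id        TEXT PRIMARY KEY,
    ensembl_id       TEXT,
    gene_symbol      TEXT,
    gene_description TEXT
);
CREATE TABLE Main_Info (
    id           INTEGER PRIMARY KEY,
    refseq_id    TEXT REFERENCES Gene_Info(refseq_id),
    tf           TEXT,
    motif_start  INTEGER,
    motif_end    INTEGER,
    motif_score  REAL,
    tfbs_quality TEXT,
    tss_db       TEXT
);
""")

# Placeholder rows, not real PromoterSweep results.
conn.execute(
    "INSERT INTO Gene_Info VALUES (?, ?, ?, ?)",
    ("NM_PLACEHOLDER", "ENSG_PLACEHOLDER", "GENE1", "placeholder gene"),
)
conn.executemany(
    "INSERT INTO Main_Info (refseq_id, tf, motif_start, motif_end, motif_score,"
    " tfbs_quality, tss_db) VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        ("NM_PLACEHOLDER", "Sp1", -60, -51, 0.95, "good", "DoOP"),
        ("NM_PLACEHOLDER", "HIF", -700, -693, 0.80, "weak", "DoOP"),
    ],
)

# Join both tables on the RefSeq ID key.
joined = conn.execute("""
    SELECT m.tf, m.motif_start, g.gene_symbol
    FROM Main_Info AS m
    JOIN Gene_Info AS g ON g.refseq_id = m.refseq_id
    ORDER BY m.motif_start
""").fetchall()
```

Keeping the gene description in its own table avoids repeating it for every binding site of the same promoter.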
 +
 
 +
=== 8-21-2009 ===
 +
 
 +
* update script for parsing the Promotersweep output files due to unexpected errors
 +
* we forgot to include "weak" as a category for the TFBS quality - added!
 +
* PromoterSweep result contains information about TSS derived from different promoter databases. On which should we rely, if they differ from each other?
 +
** We give the highest priority to the DoOP database, since it shows good agreement with the RefSeq ID results when compared to other databases (e.g. DBTSS).
 +
 
 +
* order [http://www.mathworks.com/ Matlab] iGEM licence
 +
 
 +
* search for a tool to use MySQL in R programming environment
 +
* wiki: write a short article about the German Cancer Research Center ([[Team:Heidelberg/Team_Scientific_Environment|DKFZ]])
 +
 
 +
* Meeting with Anna-Lena: once we established our database... then
 +
** two strategies:
 +
*** manually select interesting transcription factors and analyse them using database queries
 +
*** plot histograms of TFBS occurrence within the target promoter sequence (TSS - 1000bp upstream) for each TF and make a systematic analysis
 +
** we go for both!
 +
** idea for the future: we can analyze combinatorial appearance of distinct TF pairs
 +
 
 +
* We have a name for our database - we call it -
 +
 
 +
 
 +
- wait for it -
 +
 
 +
 
 +
'''HEARTBEAT''' database ('''''He'''idelberg '''Ar'''tificial '''T'''ranscription Factor '''B'''inding Site '''E'''ngineering and '''A'''ssembly '''T'''ool'')
 +
<br><br><br>
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 8-24-2009 ===
 +
 
 +
* Meeting with Marti: defining output modeling strategies
 +
** "exclusive promoters"
 +
*** a model for predicting the behaviour of activation of one, two, three... promoters at the same time.
 +
*** the potential of this model lies in the possibility to model single as well as many pathways in combination and even check for synergistic effects
 +
*** modeling logic: quantitative '''ODE''' VS. quantitative & qualitative '''fuzzy logic'''
 +
** "error checking"
 +
*** what to capture/measure: affinity of transcription factor binding to DNA
 +
**** calculate score / reliability
 +
**** phenotypic measurement
 +
*** if we have time in the end: model/experiment optimization by wetlab-drylab-rounds (GRAFIK)
 +
*** if we do not have much time: figure out where the catch is
 +
** modeling layers & final visualization
 +
*** (i) capture affinity - (ii) model gene expression - (iii) pathway activity - (iv) fancy visualization (Mathworks Simulink?)
 +
*** plot: time course, dynamic affinity
 +
*** keep in mind the possible high amount of False Positives using promoter search/analysis
 +
 
 +
=== 8-25-2009 ===
 +
* official Team Meeting also with Mr. Kai Ludwig (LANGE + PFLANZ) as guest for Logo / Title Claim discussion
 +
 
 +
* so far we have 1753 promoter sequences analyzed by PromoterSweep!
 +
 
 +
* Meeting with Daniela (Nao): ''Cell Profiler'' for capturing biological images & data analysis based on MATLAB
 +
 
 +
* working with the R module '''RMySQL''' as the interface between R and MySQL
 +
* create a list of useful RMySQL commands
 +
 
 +
=== 8-26-2009 ===
 +
* '''Workflow for plotting histogram''' - workflow (SOURCE CODE/S?)
 +
** make MySQL query using R
 +
** make list of TFs, avoid duplicates using perl
 +
** pick up each TF (perl/R) and plot histogram (R)
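
The histogram workflow above, sketched in plain Python instead of the R/perl/MySQL combination the team used. The input rows stand for the result of a database query; positions are assumed to be given relative to the TSS, with negative values upstream. Function names and the default bin width are illustrative only.

```python
from collections import defaultdict


def positions_by_tf(rows):
    """Group (tf, position) rows by TF name, which also removes duplicate TF names."""
    grouped = defaultdict(list)
    for tf, pos in rows:
        grouped[tf].append(pos)
    return dict(grouped)


def histogram(positions, bin_width=50, upstream=1000):
    """Count hits per bin over [-upstream, 0); upstream must be a multiple of bin_width."""
    if upstream % bin_width != 0:
        raise ValueError("upstream must be a multiple of bin_width")
    counts = [0] * (upstream // bin_width)
    for pos in positions:
        if -upstream <= pos < 0:
            counts[(pos + upstream) // bin_width] += 1
    return counts


# Placeholder rows, standing in for a query such as "SELECT tf, position FROM ...".
rows = [("Sp1", -60), ("Sp1", -45), ("HIF", -700), ("Sp1", -980)]
grouped = positions_by_tf(rows)
sp1_hist = histogram(grouped["Sp1"], bin_width=100)
```

The first bin covers the region furthest from the TSS, so the last entry of <code>sp1_hist</code> is the bin directly upstream of the TSS.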
 +
 
 +
* create MySQL command list including combinatorial queries
 +
 
 +
=== 8-27-2009 ===
 +
 
 +
* check HEARTBEAT DB for duplicate entries
 +
* how should we plot the histogram?
 +
** (a) histogram - how "wide" should each bin be? 100bp? 50bp? 20bp?
 +
** (b) plot probability density
 +
* study Transfac PWM (position weight matrices) for
 +
** difference in consensus sequences (also ask Anna-Lena)
 +
** different PWM types (vertebrates, plant, insect, fungi, bacteria, nematodes...)
 +
** positive control: when histograms are generated and plotted, check distribution of Sp1
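
For option (b), a probability density can be estimated without committing to a bin width. The following is a minimal Gaussian kernel density sketch in Python; it is not the R routine the team used, and the bandwidth of 25 bp and the example positions are arbitrary assumptions.

```python
import math


def gaussian_density(positions, grid, bandwidth):
    """Gaussian kernel density estimate of TFBS positions, evaluated on grid."""
    if not positions or bandwidth <= 0:
        raise ValueError("need at least one position and a positive bandwidth")
    norm = len(positions) * bandwidth * math.sqrt(2.0 * math.pi)
    return [
        sum(math.exp(-0.5 * ((x - p) / bandwidth) ** 2) for p in positions) / norm
        for x in grid
    ]


# Placeholder positions relative to the TSS (negative = upstream).
positions = [-500, -480, -60]
grid = list(range(-1000, 1))
density = gaussian_density(positions, grid, bandwidth=25.0)
peak_position = grid[density.index(max(density))]
```

A smaller bandwidth resolves closely spaced peaks but makes the curve noisier, which is the same trade-off as choosing between 100bp, 50bp and 20bp bins.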
 +
 
 +
* so far we have 3640 promoter sequences "sweeped"!
 +
 
 +
*access from R to mysql at the local@host server established
 +
 
 +
=== 8-28-2009 ===
 +
* dealing with perl - passing variables between perl and R
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 8-31-2009 ===
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
== September ==
 +
{| class="wikitable centered" border="2" rules="rows" width="650px" style="border-color:white;"
 +
|- 
 +
! Week !! colspan="7"  |Days
 +
|-
 +
|style="text-align:center"|
 +
|style="text-align:center"| Mon
 +
|style="text-align:center"| Tue
 +
|style="text-align:center"| Wed
 +
|style="text-align:center"| Thu
 +
|style="text-align:center"| Fri
 +
|style="text-align:center"| Sat
 +
|style="text-align:center"| Sun
 +
|-
 +
|style="text-align:center"| 36
 +
|style="text-align:center"| -
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-1-2009|1]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-2-2009|2]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-3-2009|3]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-4-2009|4]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-5-2009|5]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-6-2009|6]]
 +
|-
 +
|style="text-align:center"| 37
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-7-2009|7]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-8-2009|8]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-9-2009|9]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-10-2009|10]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-11-2009|11]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-12-2009|12]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-13-2009|13]]
 +
|-
 +
|style="text-align:center"| 38
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-14-2009|14]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-15-2009|15]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-16-2009|16]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-17-2009|17]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-18-2009|18]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-19-2009|19]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-20-2009|20]]
 +
|-
 +
|style="text-align:center"| 39
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-21-2009|21]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-22-2009|22]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-23-2009|23]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-24-2009|24]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-25-2009|25]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-26-2009|26]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-27-2009|27]]
 +
|-
 +
|style="text-align:center"| 40
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-28-2009|28]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-29-2009|29]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#9-30-2009|30]]
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|}
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 9-1-2009 ===
 +
* derive transcription factor data using R and MySQL
 +
* plot HEARTBEAT TF hit distribution as histograms & density functions for different PWM subsets (all, vertebrates only, single matrices and joined TFs)
 +
*further completion of the database
 +
 
 +
=== 9-2-2009 ===
 +
* discussion on how to analyse the distributions we obtained statistically
 +
** ideas: define maximum and variance -> Nao
 +
* look for motif sequences -> Tim
 +
 
 +
* we have 4476 sequences analysed by Promotersweep so far!
 +
** but we are expecting 4700 sequences - check missing ones!
 +
 
 +
=== 9-3-2009 ===
 +
* internal team meeting: Tim, Lars, Stephen, Nao
 +
** select especially interesting TFs
 +
*** criteria: (a) good hits in our distributions; (b) easy experimental handling
 +
*** we go for '''HIF''', '''SREBP''' and '''VDR''' for analysis and synthetic promoter design
 +
* Transfac PWM: there are some annotation inconveniences with some matrices
 +
* which "spacer" sequences should we use in order to generate TFBS-free sequence parts?
 +
 
 +
* rational design of synthetic promoters
 +
** Tim: SREBP, Nao: VDR
 +
** both go for a total number of 10 sequences
 +
** strategies:
 +
*** single TFs: search for density maxima
 +
*** check combinatorial appearance and design promoter sequences with multiple binding TFs
 +
** use spacer sequences generated by Lars and check for TFBS using Transfac
 +
** sequence length: max. 1000bp
 +
 
 +
* back-up idea: if synthesis does not work for a long (~1000bp) sequence then try to work out a protocol for a two-step promoter synthesis combining one empty (TFBS free) sequence with another which consists of many TF and activator binding sites.
 +
 
 +
=== 9-4-2009 ===
 +
* work with Transfac PWM: structure, description, and using consensus sequence
 +
* write script to get the IDs and frequencies for all co-occurring TFBS of VDR and SREBP
 +
* write script for generating consensus sequence based on Transfac PWM and replacing ambiguity code with A, C, G or T <pre>Getconsensus.pl, MakeConsensus.pl</pre>
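
The two perl scripts named above are not reproduced in this notebook. As a rough Python illustration of the same idea, the sketch below derives a consensus from a count matrix and resolves IUPAC ambiguity codes to a concrete base. The example matrix is invented, and the tie-breaking rule (first base in A, C, G, T order) is an assumption, not necessarily what the original scripts do.

```python
# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}


def consensus_from_counts(matrix):
    """Pick the most frequent base per column; ties go to the first of A, C, G, T."""
    return "".join(max("ACGT", key=lambda base: column[base]) for column in matrix)


def resolve_ambiguity(sequence):
    """Replace every IUPAC ambiguity code with the first base it allows."""
    return "".join(IUPAC[code][0] for code in sequence.upper())


# Invented three-column count matrix (one dict per motif position).
example_matrix = [
    {"A": 8, "C": 0, "G": 1, "T": 1},
    {"A": 0, "C": 5, "G": 5, "T": 0},
    {"A": 1, "C": 1, "G": 1, "T": 7},
]
consensus = consensus_from_counts(example_matrix)
concrete = resolve_ambiguity("RCGTGN")
```

Choosing the base with the highest count from the matrix, where a matrix is available, keeps the designed site closer to the strongest predicted binder than an arbitrary pick from the ambiguity set.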
 +
 
 +
* Wiki Meeting (Nao)
 +
** Logo choice & modification
 +
** choose header pics
 +
** navigation layout
 +
** develop a catchy, cool homepage
 +
 
 +
=== 9-5-2009 ===
 +
* Meeting with Tim, design synthetic promoter sequences
 +
* check spacer sequence (200bp) for TFBS: one TFBS found; remove it by cutting and shortening the sequence to 190bp
 +
* Kid3 is a repressor!
 +
 
 +
=== 9-6-2009 ===
 +
* design more synthetic promoter sequences by manual iteration process which consists of (i) TFBS check and (ii) TFBS removal & filling up random sequence
 +
 
 +
* aim: creation of an automatic design tool for synthetic promoters which includes sequence design, Transfac search as well as filling the sequence up with spacer sequences.
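
The manual iteration above (check for TFBS, remove hits, fill up with random sequence) can be automated along these lines. This Python sketch only looks for exact forward-strand matches of a short list of motif strings, which is far cruder than a Transfac matrix search; the motif list, function names and round limit are assumptions for illustration.

```python
import random


def find_motif(sequence, motifs):
    """Return (start, motif) for the first motif found in sequence, or None."""
    for motif in motifs:
        start = sequence.find(motif)
        if start != -1:
            return start, motif
    return None


def make_motif_free(length, motifs, rng, max_rounds=1000):
    """Draw a random sequence, then overwrite every motif hit with new random bases."""
    sequence = "".join(rng.choice("ACGT") for _ in range(length))
    for _ in range(max_rounds):
        hit = find_motif(sequence, motifs)
        if hit is None:
            return sequence
        start, motif = hit
        filler = "".join(rng.choice("ACGT") for _ in motif)
        sequence = sequence[:start] + filler + sequence[start + len(motif):]
    raise RuntimeError("no motif-free sequence found within max_rounds")


# Placeholder motif strings; a real run would use matrix-based predictions instead.
motifs = ["GGGCGG", "TACGTG"]
spacer = make_motif_free(200, motifs, random.Random(0))
```

Overwriting a hit keeps the sequence length constant, unlike the cut-and-shorten fix used for the 200bp spacer on 9-5-2009.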
 +
 
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 9-7-2009 ===
 +
* check designed sequences for restriction sites <pre>CheckRestrictionsites.pl</pre>
 +
* finish creating sequences
 +
* include the CMV core promoter in the calculation of the position of TFBS relative to the TSS
 +
* create sequences for negative control
 +
** pure TFBS free sequence
 +
** sequences with TFBS at minima of the density function
 +
* checking all sequences for further binding sites with the Transfac match tool
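
The restriction-site check named above (<code>CheckRestrictionsites.pl</code>) is not shown in this notebook. A rough Python equivalent is sketched below; it scans both the given strand and its reverse complement. The enzyme set (the four sites commonly reserved in BioBrick assembly) is an assumption about which sites had to be avoided.

```python
# Recognition sequences; which enzymes the team actually checked is an assumption.
SITES = {
    "EcoRI": "GAATTC",
    "XbaI": "TCTAGA",
    "SpeI": "ACTAGT",
    "PstI": "CTGCAG",
}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the reverse complement of an uppercase DNA sequence."""
    return sequence.translate(COMPLEMENT)[::-1]


def find_sites(sequence, sites=SITES):
    """List (enzyme, strand, start) for every site on the forward and reverse strand."""
    sequence = sequence.upper()
    hits = []
    for strand, text in (("+", sequence), ("-", reverse_complement(sequence))):
        for enzyme, motif in sites.items():
            start = text.find(motif)
            while start != -1:
                hits.append((enzyme, strand, start))
                start = text.find(motif, start + 1)
    return hits


hits = find_sites("AAGAATTCAA")
```

All four sites listed here are palindromic, so each occurrence is reported once per strand; the second strand only adds information for non-palindromic recognition sequences.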
 +
 
 +
=== 9-8-2009 ===
 +
* check restriction sites for reverse complementary strand
 +
* add flanking sites with restriction sites and spacer nucleotides to our designed sequences
 +
* is there any possibility to automate Transfac queries?
 +
* work with combined / joined MySQL query structures
 +
* or solve this process by simply writing new temporary tables?
 +
 
 +
* workflow summary (short) for manual designing of a synthetic promoter:
 +
** (A) use random sequence
 +
** (B) check TF-matrices
 +
** (C) validate TFs (mouse? human? repressor?)
 +
** (D) check Transfac and restriction sites
 +
 
 +
* Phone conference with Kai Ludwig, Logo & Web Design (Nao)
 +
 
 +
* official Team Meeting
 +
 
 +
* wiki closure on Oct 21st!
 +
 
 +
=== 9-9-2009 ===
 +
 
 +
* modify synthetic promoter sequences to be ready for ordering
 +
* Sweep more promoter sequences using Promotersweep
 +
* start Modeling
 +
* revise and improve HEARTBEAT
 +
* discuss differences between PWMs
 +
 
 +
=== 9-10-2009 ===
 +
* still modifying synthetic sequences to be ready for shipping
 +
* we have altogether 25 designed promoter sequences!
 +
 
 +
=== 9-11-2009 ===
 +
* Software Meeting (Stephen, Tim, Nao)
 +
** compatibility with mediawiki: HTML, perl, php, R, java?
 +
** GUI design
 +
*** simple interface: single TF, auxiliary TFs, #TFBS, sequence length
 +
*** "interactive": multiple TF, choosing auxiliary TFs, additional information (see [[Team:Heidelberg/Eukaryopedia|Eukaryopedia]]), density function plot & histogram
 +
*** "hyper-interactive" step-by-step design & creation
 +
 
 +
* Modeling Meeting with Marti and Anna-Lena (Tim, Nao)
 +
** aim: fancy visualization to show expectation & prediction providing pathway insights
 +
** TODO/QUESTIONS
 +
*** what is the stimulus? collect possible inputs!
 +
*** measurable outcome: experiments & pathways
 +
*** quality of synthetic sequence: error checking
 +
**** we need to define the '''quality''' of our sequences
 +
** LEVELS of modeling
 +
*** (1) DNA (2) expression/transcriptional activity (3) output
 +
*** each with corresponding measurement
 +
 
 +
* general modeling scheme: input - "What we are affecting" - possible outcomes
 +
* how? We use fuzzy logic
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 9-14-2009 ===
 +
* collect input for inducing the system (e.g. p53: CPT, Pifithrin-alpha; NFkB: TNF-alpha etc.)
 +
* phone conference with Kai Ludwig
 +
* learn how to include Perl code into html code
 +
** learn how to use embperl
 +
** configure apache2 server such that embperl can be interpreted
 +
** try to make offline use of embperl working
 +
*try to find nice html editor for ubuntu - (seamonkey, Amaya)
 +
 
 +
=== 9-15-2009 ===
 +
* create network picture for meeting tomorrow
 +
* Logo discussion
 +
* Read paper: ''Fuzzy Logic Modeling of Signaling Networks'' (Aldridge 2009)
 +
* learn data management of virtual server
 +
* get an overview about the apache2 file and security system
 +
 
 +
=== 9-16-2009 ===
 +
* Modeling Meeting with Marti (Douaa, Tim, Nao)
 +
** update on available drugs/sequences
 +
** decide what to model: (A) error checking, and (B) differential expression?
 +
** use natural promoters to build up model for prediction of activity of synthetic promoters
 +
** Discussion of TF score
 +
*** Transfac sequence alignment score
 +
*** promotersweep binding site quality
 +
*** relative position to TSS: How?
 +
**** (A) peak width & amplitude, (B) distance to maximal peak & position, (C) number of peaks, (D) "sliding window" and calculate area under curve, (E) #TFBS (also for comparison of different synthetic promoters)
 +
*** biophysical affinity using TRAP
 +
** first model: build up either on CMV or on JeT
 +
** potential: integrate many stimuli -> find out crosstalks of pathways?
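
Option (D) above, the "sliding window" with area under the curve, can be illustrated on a binned distribution. In this Python sketch the area of a window is simply the sum of its bin counts (multiplying by the constant bin width would only rescale it); the counts and window size are invented.

```python
def sliding_window_area(counts, window):
    """Sum of counts in every window of the given size, moved one bin at a time."""
    if window < 1 or window > len(counts):
        raise ValueError("window must be between 1 and len(counts)")
    area = sum(counts[:window])
    areas = [area]
    for i in range(window, len(counts)):
        area += counts[i] - counts[i - window]  # add the new bin, drop the oldest
        areas.append(area)
    return areas


def best_window(counts, window):
    """Return (start_bin, area) of the window with the largest area."""
    areas = sliding_window_area(counts, window)
    start = max(range(len(areas)), key=areas.__getitem__)
    return start, areas[start]


# Invented histogram counts for one TF.
counts = [0, 1, 3, 5, 2, 0, 1]
areas = sliding_window_area(counts, 3)
best = best_window(counts, 3)
```

The window with the largest area gives a position score that is less sensitive to a single tall bin than peak height alone.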
 +
 
 +
* TODO (meeting)
 +
** collect data
 +
** define WHAT we want to model
 +
** summarize available sequences
 +
** try to formulate IF ... THEN "sentences"
 +
** check MATLAB & MATLAB Fuzzy Logic Toolbox availability
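
To make the IF ... THEN "sentences" concrete, here is a very small fuzzy inference sketch in Python (the team planned to use the MATLAB Fuzzy Logic Toolbox, which is not used here). It uses triangular membership functions, the minimum for AND, and a weighted average of constant rule outputs (a zero-order Sugeno-style scheme). The variables, the three rules and the output levels are invented for illustration and are not the team's model.

```python
def triangle(x, left, peak, right):
    """Triangular membership: 0 at or outside left/right, 1 at peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def low(x):
    return triangle(x, -1.0, 0.0, 1.0)


def high(x):
    return triangle(x, 0.0, 1.0, 2.0)


def expression_level(stimulus, affinity):
    """Inputs in [0, 1]; returns a predicted relative expression in [0, 1]."""
    rules = [
        # IF stimulus is high AND affinity is high THEN expression is 1.0
        (min(high(stimulus), high(affinity)), 1.0),
        # IF stimulus is high AND affinity is low THEN expression is 0.3
        (min(high(stimulus), low(affinity)), 0.3),
        # IF stimulus is low THEN expression is 0.0
        (low(stimulus), 0.0),
    ]
    total = sum(weight for weight, _ in rules)
    if total == 0.0:
        raise ValueError("no rule fires for these inputs")
    return sum(weight * level for weight, level in rules) / total


example = expression_level(0.5, 0.5)
```

Between the extremes the rules blend smoothly, which is what makes this style attractive when only qualitative knowledge about a pathway is available.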
 +
 
 +
=== 9-17-2009 ===
 +
* internal Team Meeting
 +
* find error.log files on the server and learn how to use them
 +
 
 +
=== 9-18-2009 ===
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 9-20-2009 ===
 +
* learn how to use tag language of embperl
 +
** learn how to write loops with embperl
 +
** access of input variables in embperl -- using the %fdat hash
 +
 
 +
=== 9-21-2009 ===
 +
* struggling with how to use R from embperl
 +
 
 +
=== 9-22-2009 ===
 +
* Wiki Meeting (Dani, Cori, Nao)
 +
** install image processing tool
 +
** design wiki, brainstorming for possible navigation bars
 +
* Wiki Phone Meeting with Kai Ludwig (Nao)
 +
** design header & presentation-master as well as team shirts
 +
 
 +
* Seminar: '''Martijn Luijsterburg''' (Karolinska Institute) - ''Heterochromatin Protein 1 is involved in the DNA damage response''. Host: Thomas Höfer, Bioquant
 +
 
 +
=== 9-23-2009 ===
 +
* Modeling Meeting with Marti, Anna-Lena (Tim, Nao)
 +
** contact database group (TP3)
 +
** statistics: characterizing peaks
 +
*** we go for area under the curve and affinity. optionally we can choose Transfac sequence score and peak height & width
 +
** strategy to convince the wetlab people of the importance of modeling during the meeting on the upcoming Friday.
 +
** MATLAB license?
 +
** logical gates: try to start creating model topology after Friday
 +
 
 +
* Presentation: '''Marti Bernado Faura''' (Bioquant, University of Heidelberg): ''Data-driven Fuzzy Logic modeling of Programmed Cell Death''
 +
** intro into fuzzy logic
 +
** system development & work flow of fuzzy logic
 +
** fuzzy inference & model prediction
 +
** model types: MISO / MIMO
 +
 
 +
* Wrap-up meeting: Team HEARTBEAT (Tim, Nao)
 +
** split up computational work into three tracks: HEARTBEAT DB, HEARTBEAT GUI and modeling
 +
*** database: documentation (until Oct 18), peak characterization, calculate absolute density function
 +
*** GUI: based on ''embperl'', design according to our new wiki
 +
*** modeling: MATLAB license, collect sequences & input data, develop network model, include pathways
 +
 
 +
* literature work
 +
 
 +
=== 9-24-2009 ===
 +
* prepare slides for meeting tomorrow
 +
* pathway search: TNF-alpha/NFkB, VDR, SREBP and crosstalks. NFkB has a lot of pathway crosstalks, while SREBP and VDR show an interesting connection. Upon induction, SREBP activates VDR.
 +
 
 +
=== 9-25-2009 ===
 +
* Team Meeting (Wetlab, Nao)
 +
** short progress report of all of us
 +
** modeling: discussing scheme, modeling elements and strategies
 +
 
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 9-28-2009 ===
 +
* Wiki Phone Meeting with Kai Ludwig (Nao)
 +
 
 +
=== 9-29-2009 ===
 +
* designed synthetic promoters (HB_0001 - HB_0025) will be joined to the CMV core promoter, since the JeT core promoter contains an Sp1 site. All other sequences (e.g. randomly synthesized ones) are coupled with the JeT core promoter.
 +
* literature studies on combinatorial ''cis''-regulation as well as on modeling of the lambda-switch
 +
* prepare slides for the next modeling meeting
 +
 
 +
=== 9-30-2009 ===
 +
* Wiki Meeting (Dani, Nao)
 +
* MATLAB license order (Jens)
 +
* postpone Yara meeting (Wetlab, Tim)
 +
 
 +
* got sequences from Lars
 +
* got qRT-PCR setup from Chenchen
 +
 
 +
* Modeling Meeting with Marti & Anna-Lena (Tim, Nao)
 +
** still need to collect FACS and microscopy results
 +
** discuss our network prediction model using TNF-alpha as an example
 +
** maybe we can use the lambda switch paper as a good starting point for our modeling
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
== October ==
 +
{| class="wikitable centered" border="2" rules="rows" width="650px" style="border-color:white;"
 +
|- 
 +
! Week !! colspan="7"  |Days
 +
|-
 +
|style="text-align:center"|
 +
|style="text-align:center"| Mon
 +
|style="text-align:center"| Tue
 +
|style="text-align:center"| Wed
 +
|style="text-align:center"| Thu
 +
|style="text-align:center"| Fri
 +
|style="text-align:center"| Sat
 +
|style="text-align:center"| Sun
 +
|-
 +
|style="text-align:center"| 40
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| -
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-1-2009|1]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-2-2009|2]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-3-2009|3]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-4-2009|4]]
 +
|-
 +
|style="text-align:center"| 41
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-5-2009|5]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-6-2009|6]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-7-2009|7]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-8-2009|8]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-9-2009|9]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-10-2009|10]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-11-2009|11]]
 +
|-
 +
|style="text-align:center"| 42
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-12-2009|12]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-13-2009|13]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-14-2009|14]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-15-2009|15]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-16-2009|16]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-17-2009|17]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-18-2009|18]]
 +
|-
 +
|style="text-align:center"| 43
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-19-2009|19]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-20-2009|20]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-21-2009|21]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-22-2009|22]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-23-2009|23]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-24-2009|24]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-25-2009|25]]
 +
|-
 +
|style="text-align:center"| 44
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-26-2009|26]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-27-2009|27]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-28-2009|28]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-29-2009|29]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-30-2009|30]]
 +
|style="text-align:center"| [[Team:Heidelberg/Notebook_modeling#10-31-2009|31]]
 +
|style="text-align:center"| -
 +
|}
 +
 
 +
=== 10-1-2009 ===
 +
* Wiki & Presentation Meeting with Dani (Nao)
 +
 
 +
=== 10-2-2009 ===
 +
* some wiki work
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 10-5-2009 ===
 +
 
 +
=== 10-6-2009 ===
 +
* Internal Team Meeting
 +
** check out number and measurement plans of randomly assembled synthetic promoters (5x NFkB, 5x p53, 2x pPARg, 2x SREBP)
 +
 
 +
* Wiki Meeting (Corinna, Daniela, Nao)
 +
** discuss design of the top page and possible features
 +
** try out CSS design
 +
 
 +
=== 10-7-2009 ===
 +
* Wiki Design (Nao)
 +
* Wiki Phone Meeting with Kai Ludwig (Nao)
 +
 
 +
* MATLAB has arrived!
 +
* literature work
 +
 
 +
* Wetlab Meeting: progress report on measurement of randomly assembled synthetic promoters
 +
* think through the whole storyboard of our presentation at the Jamboree
 +
 
 +
=== 10-8-2009 ===
 +
* Short Meeting with Roland
 +
* image processing work for wiki
 +
 
 +
=== 10-9-2009 ===
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 10-12-2009 ===
 +
 
 +
=== 10-13-2009 ===
 +
* Measurement discussion with Lars: REU/RMPU, defining equations for mammalian systems
 +
* literature work on PoPS paper (Kelly JR et al.) and apply their equations
 +
 
 +
* Marti Modeling Meeting (Anna-Lena, Tim, Nao)
 +
** Journal Club (Tim, Nao)
 +
** summary of last Thursday's Team Meeting
 +
 
 +
* Marti: start modeling using MATLAB and Fuzzy Logic Toolbox (FLT), playing around with FLT and tutorial
 +
 
 +
=== 10-14-2009 ===
 +
Nao
 +
* develop a first fuzzy inference system (FIS) for testing
 +
* Marti Modeling Meeting, specify model topology
 +
* collect data: FACS (Cori), Microscopy (Hannah), Sequence & TECAN (Lars)
 +
* start calculating position score using R
 +
* translating project abstract
 +
 
 +
=== 10-15-2009 ===
 +
Nao
 +
* calculate affinity score using TRAP (Anna-Lena)
 +
* collect ideas for integrating TF-wise scores into a final position/affinity score per sequence: median, mean, maximum, weighted mean?
 +
* all data analysis is stored in three sheets (SequenceAnalysis, ResultSummary and CalculateTRAP)
 +
* from now on we concentrate on FACS measurements because they are the most reliable ones (TECAN used only for scanning)
 +
* fill up TRAP data with missing transcription factors
 +
 
 +
=== 10-16-2009 ===
 +
Nao
 +
* Anna-Lena Meeting: discuss how to integrate sequence scores
 +
* get & check p53, pPARg and random SREBP sequences
 +
* go through FACS results
 +
* add HEARTBEAT sequences for data analysis
 +
* modeling documentation
 +
* parsing experimental setups for modeling use
 +
* Chenchen qRT-PCR results
 +
 
 +
* define possible modeling layers
 +
** first layer: input
 +
*** drug type, pathway, drug mode of action, drug concentration, targeted cells, incubation time
 +
*** sequence type, position score, affinity score
 +
*** we choose position & affinity score, sequence type and the presence of stimulation. Time and drug concentration can be added in the future (unfortunately no data are available yet)
 +
** second layer: promoters
 +
*** 6 constitutives, 3 standards, 6 inducible available
 +
*** data analysis narrows this to 5 constitutives, 3 standards and 4 inducible
 +
*** HEARTBEAT sequences have to be measured a.s.a.p.
 +
 
 +
* Marti Modeling Meeting
 +
** try to define some fuzzy rules
 +
** we assume better binding -> better expression
 +
** define membership functions
 +
** start modeling with NFkB results
 +
 
 +
All
 +
* Internal Team Meeting
 +
** reminder: wiki task, wiki to do
 +
* Official Team Meeting
 +
 
 +
=== 10-17-2009 ===
 +
* final decision: we go for maximum of position and affinity score
 +
* added HB sequences for data analysis table; as soon as results are there we can model designed synthetic promoters
 +
* define shape of membership functions
 +
* literature search for missing activity values?
 +
* still TODO: check out p53 results since the p53-NFkB crosstalk is really interesting!
 +
 
 +
=== 10-18-2009 ===
 +
Nao
 +
* SREBP/VDR paper arrived
 +
* finish data analysis
 +
* studying and playing around with MATLAB FLT, programming from both the FLT GUI and the MATLAB command line
 +
* define our work to be (i) error checking and (ii) exclusive pathway modeling
 +
* the high potential of this model lies in its plug'n'play structure, with a high capacity for integrating more inputs, outputs and also middle-layer elements (promoter diversity)
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
 +
 
 +
=== 10-19-2009 ===
 +
Nao
 +
* define final network structure
 +
* wiki work
 +
* reading RFC documentation and correction
 +
* we call this project HEARTBEAT fuzzy network (FN)
 +
* HB FN documentation and first results!
 +
** creating two fuzzy controllers: inducible NFkB and constitutive
 +
* how do we integrate the data? combine via Simulink!
 +
 
 +
=== 10-20-2009 ===
 +
* Creating, developing, integrating and combining fuzzy network modeling (MATLAB, Simulink)
 +
* first analysis of HB sequences
 +
* HEARTBEAT FN documentation
 +
 
 +
=== 10-21-2009 ===
 +
 
 +
=== 10-22-2009 ===
 +
* FROZEN WIKI!!!!
 +
 
 +
[[https://2009.igem.org/Team:Heidelberg/Notebook_modeling TOP]]
-
|width="250px" style="background-color:#d8d5d0"|
+
|width="250px" style="padding: 0 20px 15px 15px; background-color:#d8d5d0"|
|}
|}

Latest revision as of 01:43, 22 October 2009

Notebook HEARTBEAT

Welcome to the notebook of the HEARTBEAT (Heidelberg Artificial Transcription Factor Binding Sites Assembly and Engineering Tool) project. This notebook comprises the work on three sublanes: HEARTBEAT database (DB), HEARTBEAT graphical user interface (GUI) and HEARTBEAT fuzzy modeling (FN) as well as some additional work on logo as well as wiki design. Have fun!

Contents

July

7-27-2009

  • Meeting with Oliver Pelz
    • Discuss general ideas of our Database Structure and Content
    • An introduction into PromoterSweep (LINK). PromoterSweep screens a given sequence for conserved regions, giving us consensus sequences, and then screens these for TFBS by database search (TRANSFAC, JASPAR) (LINK)
    • Our new database should contain the following information: promoter sequence, TFs, TFBS, position of TFBS, number of binding TFBS, "host organism"
    • We decided on MySQL as an appropriate database system for this task; it will also allow a graphical representation of the database on the web later.
    • GUI on wiki: which language? php? javascript?
    • Problems: access to PromoterSweep (Husar Bioinformatics Group, DKFZ), choice of Promoter Database (DoOP, UCSC, EnsEMBL) (LINK)
  • aim: create the database by the end of August

[TOP]

August

Week Days
Mon Tue Wed Thu Fri Sat Sun
31 - - - - - 1 2
32 3 4 5 6 7 8 9
33 10 11 12 13 14 15 16
34 17 18 19 20 21 22 23
35 24 25 26 27 28 29 30
36 31 - - - - - -

[TOP]

8-3-2009

  • First contact with MySQL
  • Start making an overview of other teams' projects
  • Configuring our Virtual Server

8-4-2009

  • Official Team Meeting (LINK) @ BQ seminar room 43: preparing presentation & writing meeting report
  • Start installing development environment on our internal server
    • GNOME
    • Mediawiki

8-5-2009

  • Meeting with Tobias Bauer & Anna-Lena Kranz (Theoretical Bioinformatics, DKFZ) @ TP3, DKFZ
    • Integrating ideas of PromoterSweep, Transfac as well as DoOP/CisRED
    • select "interesting" TFs (e.g. HIF, NFkB, c-myc, p53) for Wetlab
    • select "interesting" pathways (e.g. cell cycle, inflammation, metabolism etc)
    • future experimental validation: ChIP-on-Chip
      • for this we need a TFBS-free sequence
    • idea: plot histogram of TFBS relative to TSS
      • problem: choice of sequence: upstream only? include downstream?
    • new programming languages: R and perl
    • next meeting: Friday after team meeting
  • Meeting with Karl-Heinz Glatting (HUSAR, DKFZ) @ TP3, DKFZ
    • An introduction into PromoterSweep
    • Structure and analysis principles of PromoterSweep
    • Output is stored in an XML file. This means we have to parse the xml code.
    • Oliver Pelz will help us with programming
  • Protocol of the meeting can be downloaded from here.
  • Start working with MySQL
  • request UNIX/HUSAR/HPC access at DKFZ (Nao)
  • first contact with several databases: EnsEMBL, Compara, cisRED, DoOP, TiProD, contra

8-6-2009

  • Meeting with Oliver Pelz
    • defining workflow with PromoterSweep, Matrix Profile Search and introduction into different Motif Discovery Algorithms
  • installation of NX server for access onto internal server from Windows
  • configure development environment (printing from Linux, configure Mediawiki)
  • defining basic concept of database construction
    • we select annotated promoter sequences in DoOP
    • we make a selection of pathway of interest using KEGG
    • narrow down the number of target promoter sequences to fewer than 10000.

8-7-2009

  • Official Team Meeting on Scheduling
  • Meeting with Anna-Lena and Tobias
    • Introduction into R
    • Tobias will give us access to their computing cluster (Group Roland Eils)
    • Promoter Selection: DoOP, EnsEMBL, or UCSC?
  • HUSAR account arrived
  • installation of R, R editor and perl editor
  • further configuration of our internal server / mediawiki
  • writing first perl program - "Hi there"

[TOP]

8-10-2009

  • first contact with R and perl
  • playing around with R and perl
  • playing around with R library: Biobase
  • check working on DKFZ cluster

8-11-2009

  • defining programming languages: perl, R, MySQL
  • retrieving first Promotersweep output files
  • Meeting with Marti
    • ideas for modeling
      • we will have at least three colors which overlap in their spectra.
      • a very nice approach will be Fuzzy Logic Modeling.
      • idea 1: error checking of affinity: compare expectation to experimental results and figure out where the error is hiding
      • idea 2: create & visualize fancy and fuzzy data from in silico simulation
    • combine: promoter, output and graphic representation
    • next meeting with Marti: end of next week.
  • extract NCBI Entrez Gene IDs with R and perl
  • MAC addresses registered for bioquant network

8-12-2009

  • configure perl working environment
  • study structure of DoOP database
  • download DoOP and load DoOP database into MySQL

8-13-2009

  • trying out some DoOP queries
  • download fasta sequences from UCSC gene browser
  • mapping of NCBI Entrez Gene IDs with RefSeq IDs
  • configure perl working environment on Windows XP
  • contact Endre Sebestyen concerning the perl module Bio-DoOP-DoOP

8-14-2009

  • parse UCSC fasta sequences according to our selection
  • write parsed sequences into multifasta format
  • start PromoterSweep Analysis over Weekend
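
The two parsing steps above (select records, write multi-FASTA) can be sketched as follows. This is only an illustration in Python, not the team's perl script; the function names and the demo records are made up, and real UCSC headers look different.

```python
from io import StringIO

def read_fasta(handle):
    """Yield (header, sequence) pairs from a FASTA-formatted text stream."""
    header, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        else:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)

def select_records(handle, wanted_ids):
    """Keep only records whose first header token is in wanted_ids."""
    for header, seq in read_fasta(handle):
        if header.split()[0] in wanted_ids:
            yield header, seq

def write_multifasta(records, width=60):
    """Return the records as one multi-FASTA string, wrapped at `width`."""
    lines = []
    for header, seq in records:
        lines.append(">" + header)
        lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"

# Hypothetical input with two records; only the second one is selected.
example = StringIO(">NM_000001 demo\nACGT\nACGT\n>NM_000002 demo\nTTTT\n")
selected = list(select_records(example, {"NM_000002"}))
print(write_multifasta(selected), end="")
```

Matching on the first header token assumes that the ID comes first in the header line; that has to be adapted to the actual download format.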

[TOP]

8-18-2009

Tim, Stephen, from here on you have to enter your own items yourselves!

  • study outputfile of PromoterSweep. check out general structure and pick up useful information.
  • result is grouped in: General Info, Best Genomic Mapping, Promoter DB Search Result, Graphical Overview, Combined Binding Sites, TSS and Exon Info, Profile Matrices and Generated Output Files.
  • upon selection, sections of interest will be collected and made ready for entry into MySQL DB
  • discuss table structure of our database
  • How should our database be called? - Brainstorming -
    • SHOULD contain: iGEM, Transcription Factor, Binding Site, Promoter, synthetic biology, Heidelberg
    • MAY contain: position, heartbeat, prediction, assembly, eukaryotes
    • and still more keywords to come
  • establishing local@host access to mysql

8-19-2009

  • parse Promotersweep xml file into tab-separated text file
    • the text file should contain: RefSeq ID, TF name, TFBS position, TF motif sequence, TFBS Quality, TSS, Entrez ID, EnsEMBL ID, further gene description.
    • this raised several programming problems with multiple arrays, hashes and their combinations (arrays of hashes, hashes of hashes, etc.)
  • therefore: studying the structure and basic concepts of hashes & keys
  • including parsed data into mysql database
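
A rough sketch of the XML-to-text step, written in Python for illustration only. The XML layout below (`results`, `promoter`, `site` and their attributes) is entirely hypothetical, since the real PromoterSweep output format is not reproduced in this notebook; only the idea of flattening nested records into one row per binding site is taken from the entry above.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical XML layout, NOT the real PromoterSweep schema.
EXAMPLE_XML = """
<results>
  <promoter refseq="NM_000001" tss="1000">
    <site tf="Sp1" start="940" end="949" quality="good"/>
    <site tf="NFkB" start="700" end="710" quality="weak"/>
  </promoter>
</results>
"""

def xml_to_rows(xml_text):
    """Flatten to one row per binding site: RefSeq ID, TF, start, end, quality, TSS."""
    root = ET.fromstring(xml_text)
    rows = []
    for promoter in root.iter("promoter"):
        for site in promoter.iter("site"):
            rows.append([
                promoter.get("refseq"),
                site.get("tf"),
                int(site.get("start")),
                int(site.get("end")),
                site.get("quality"),
                int(promoter.get("tss")),
            ])
    return rows

def rows_to_tsv(rows):
    """Serialize the rows as tab-separated text with a header line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["refseq_id", "tf", "start", "end", "quality", "tss"])
    writer.writerows(rows)
    return buffer.getvalue()

rows = xml_to_rows(EXAMPLE_XML)
print(rows_to_tsv(rows), end="")
```

A tab-separated file of this shape can then be bulk-loaded into a MySQL table.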

8-20-2009

  • pre-decision for our table-structure
    • Table: Main_Info
      • RefSeq ID, TF, TF motif start & end position, TFBS motif score, TFBS quality, TSS database info
    • Table: Gene_Info
      • Ensembl_ID, Gene Symbol, Gene Description.
    • we go for the RefSeq ID to be the key connecting these two tables.
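
The two-table layout above can be tried out quickly with a sketch like the following. It uses Python's built-in SQLite as a stand-in for MySQL; the column names and the inserted rows are assumptions for illustration, only the table names and the RefSeq ID as connecting key come from the entry above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.executescript("""
CREATE TABLE Gene_Info (
    refseq_id   TEXT PRIMARY KEY,
    ensembl_id  TEXT,
    gene_symbol TEXT,
    description TEXT
);
CREATE TABLE Main_Info (
    refseq_id   TEXT NOT NULL REFERENCES Gene_Info(refseq_id),
    tf          TEXT NOT NULL,
    motif_start INTEGER,
    motif_end   INTEGER,
    motif_score REAL,
    quality     TEXT,
    tss_source  TEXT
);
""")

# Made-up demo rows.
conn.execute("INSERT INTO Gene_Info VALUES (?, ?, ?, ?)",
             ("NM_000001", "ENSG_DEMO", "DEMO1", "made-up gene"))
conn.execute("INSERT INTO Main_Info VALUES (?, ?, ?, ?, ?, ?, ?)",
             ("NM_000001", "Sp1", 940, 949, 0.93, "good", "DoOP"))

# Join both tables on the RefSeq ID and filter by transcription factor.
query = """
SELECT g.gene_symbol, m.tf, m.motif_start, m.quality
FROM Main_Info AS m
JOIN Gene_Info AS g ON g.refseq_id = m.refseq_id
WHERE m.tf = ?
"""
hits = conn.execute(query, ("Sp1",)).fetchall()
print(hits)
```

Keeping gene annotation in its own table means each gene description is stored once, however many binding sites its promoter has.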

8-21-2009

  • update script for parsing the Promotersweep output files due to unexpected errors
  • we forgot to include "weak" as a category for the TFBS quality - added!
  • PromoterSweep result contains information about TSS derived from different promoter databases. On which should we rely, if they differ from each other?
    • We set our highest priority to DoOP database since they show a good accordance within the RefseqID results when compared to other databases (e.g. DBTSS).
  • order MATLAB iGEM licence (http://www.mathworks.com/)
  • search for a tool to use MySQL in R programming environment
  • wiki: write a short article about the German Cancer Research Center (DKFZ)
  • Meeting with Anna-Lena: once we established our database... then
    • two strategies:
      • manually select interesting transcription factors and analyse them using database queries
      • plot histograms of TFBS occurrence within the target promoter sequence (TSS - 1000bp upstream) for each TF and make systematic analysis
    • we go for both!
    • idea for the future: we can analyze combinatorial appearance of distinct TF pairs
  • We have a name for our database - we call it -


- wait for it -


HEARTBEAT database (Heidelberg Artificial Transcription Factor Binding Site Engineering and Assembly Tool)


[TOP]

8-24-2009

  • Meeting with Marti: defining output modeling strategies
    • "exclusive promoters"
      • a model for predicting the behaviour of activation of one, two, three... promoters at the same time.
      • the potential of this model lies in the possibility to model single as well as many pathways in combination and even check for synergistic effects
      • modeling logic: quantitative ODE VS. quantitative & qualitative fuzzy logic
    • "error checking"
      • what to capture/measure: affinity of transcription factor binding to DNA
        • calculate score / reliability
        • phenotypic measurement
      • if we have time in the end: model/experiment optimization by wetlab-drylab rounds (GRAPHIC)
      • if we do not have much time: figure out where the catch is
    • modeling layers & final visualization
      • (i) capture affinity - (ii) model gene expression - (iii) pathway activity - (iv) fancy visualization (Mathworks Simulink?)
      • plot: time course, dynamic affinity
      • keep in mind the possible high amount of False Positives using promoter search/analysis

8-25-2009

  • official Team Meeting also with Mr. Kai Ludwig (LANGE + PFLANZ) as guest for Logo / Title Claim discussion
  • so far we have 1753 promoter sequences analyzed by PromoterSweep!
  • Meeting with Daniela (Nao): Cell Profiler for capturing biological images & data analysis based on MATLAB
  • working with R module RMySQL for using the pipeline between R and MySQL
  • create a list of useful RMySQL commands

8-26-2009

  • Workflow for plotting histograms (SOURCE CODE/S?)
    • make MySQL query using R
    • make list of TFs, avoid duplicates using perl
    • pick up each TF (perl/R) and plot histogram (R)
  • create MySQL command list including combinatorial queries
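
The binning step of the histogram workflow can be sketched as below (Python for illustration; the team used perl and R). Positions are assumed to be given in bp relative to the TSS with negative values upstream, and the demo positions are made up.

```python
from collections import Counter

def bin_positions(positions, bin_width=50, upstream=1000):
    """Count TFBS positions (bp relative to the TSS, negative = upstream) per bin."""
    counts = Counter()
    for pos in positions:
        if -upstream <= pos < 0:
            # Floor division puts -1..-50 into the bin starting at -50, and so on.
            bin_start = (pos // bin_width) * bin_width
            counts[bin_start] += 1
    return dict(sorted(counts.items()))

# Made-up positions for one TF; the downstream hit (+15) is ignored.
demo_positions = [-30, -45, -60, -120, -980, -1000, 15]
width = 50
histogram = bin_positions(demo_positions, bin_width=width)
for start, n in histogram.items():
    print(f"{start:6d} .. {start + width - 1:6d}  {'#' * n}")
```

Changing `bin_width` to 20 or 100 is a cheap way to compare the bin sizes under discussion before settling on one.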

8-27-2009

  • check HEARTBEAT DB for duplicate entries
  • how should we plot the histogram?
    • (a) histogram - how "wide" should be each bin? 100bp? 50bp? 20bp?
    • (b) plot probability density
  • study Transfac PWM (position weight matrices) for
    • difference in consensus sequences (also ask Anna-Lena)
    • different PWM types (vertebrates, plant, insect, fungi, bacteria, nematodes...)
    • positive control: when histograms are generated and plotted, check distribution of Sp1
  • so far we have 3640 promoter sequences "sweeped"!
  • access from R to mysql at the local@host server established

8-28-2009

  • dealing with perl - introduce passing of variables between perl and R

[TOP]

8-31-2009

[TOP]

September

Week Days
Mon Tue Wed Thu Fri Sat Sun
36 - 1 2 3 4 5 6
37 7 8 9 10 11 12 13
38 14 15 16 17 18 19 20
39 21 22 23 24 25 26 27
40 28 29 30 - - - -

[TOP]

9-1-2009

  • derive transcription factor data using R and MySQL
  • plot HEARTBEAT TF hit distribution as histograms & density functions for different PWM subsets (all, vertebrates only, single matrices and joined TFs)
  • further completion of the database

9-2-2009

  • discussion on how to make statistical studies on our gained distributions
    • ideas: define maximum and variance -> Nao
  • look for motif sequences -> Tim
  • we have 4476 sequences analysed by Promotersweep so far!
    • but we are expecting 4700 sequences - check missing ones!

9-3-2009

  • internal team meeting: Tim, Lars, Stephen, Nao
    • select especially interesting TFs
      • criteria: (a) good hits in our distributions; (b) easy experimental handling
      • we go for HIF, SREBP and VDR to analyse and make synthetic promoter design
  • Transfac PWM: some matrices have annotation problems
  • which "spacer" sequences should we use in order to generate TFBS-free sequence parts?
  • rational design of synthetic promoters
    • Tim: SREBP, Nao: VDR
    • both go for a total number of 10 sequences
    • strategies:
      • single TFs: search for density maxima
      • check combinatorial appearance and design promoter sequences with multiple binding TFs
    • use spacer sequences generated by Lars and check for TFBS using Transfac
    • sequence length: max. 1000bp
  • back-up idea: if synthesis does not work for a long (~1000bp) sequence then try to work out a protocol for a two-step promoter synthesis combining one empty (TFBS free) sequence with another which consists of many TF and activator binding sites.

9-4-2009

  • work with Transfac PWM: structure, description, and using consensus sequence
  • write script to get the IDs and frequencies for all co-occurring TFBS of VDR and SREBP
  • write script for generating consensus sequence based on Transfac PWM and replacing ambiguity code with A, C, G or T
    Getconsensus.pl, MakeConsensus.pl
  • Wiki Meeting (Nao)
    • Logo choice & modification
    • choose header pics
    • navigation layout
    • develop a catchy, cool homepage
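
For the consensus scripts mentioned above (Getconsensus.pl, MakeConsensus.pl), a minimal Python sketch of the two ideas: take the most frequent base per matrix column, and replace IUPAC ambiguity codes by a concrete base. The tie-breaking rule and the choice of the first allowed base are assumptions, not necessarily what the perl scripts do; the count matrix is made up.

```python
# IUPAC nucleotide ambiguity codes and the bases each one stands for.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def consensus_from_counts(matrix):
    """Most frequent base per column; ties go to the alphabetically first base."""
    consensus = []
    for column in matrix:
        # max() returns the first maximal element, so sorting fixes the tie-break.
        best = max(sorted(column), key=lambda base: column[base])
        consensus.append(best)
    return "".join(consensus)

def resolve_ambiguity(sequence):
    """Replace each IUPAC ambiguity code with the first base it allows."""
    return "".join(IUPAC[symbol][0] for symbol in sequence.upper())

# Made-up count matrix with three positions.
demo_matrix = [
    {"A": 10, "C": 0, "G": 0, "T": 0},
    {"A": 2, "C": 2, "G": 5, "T": 1},
    {"A": 3, "C": 3, "G": 0, "T": 4},
]
print(consensus_from_counts(demo_matrix))
print(resolve_ambiguity("GGGRNNYYCC"))  # illustrative degenerate motif
```

Always picking the first allowed base biases the result towards A and C; drawing a random allowed base would be an alternative.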

9-5-2009

  • Meeting with Tim, design synthetic promoter sequences
  • check spacer sequence (200bp) for TFBS: one TFBS found; remove it by cutting and shortening the sequence to 190bp
  • Kid3 is a repressor!

9-6-2009

  • design more synthetic promoter sequences by manual iteration process which consists of (i) TFBS check and (ii) TFBS removal & filling up random sequence
  • aim: creation of an automatic design tool for synthetic promoters which includes sequence design, Transfac search and filling up the sequence with spacer sequences.


[TOP]

9-7-2009

  • check designed sequences for restriction sites
    CheckRestrictionsites.pl
  • finish creating sequences
  • take the CMV core promoter into account when calculating the position of TFBS relative to the TSS
  • create sequences for negative control
    • pure TFBS free sequence
    • sequences with TFBS at minima of the density function
  • checking all sequences for further binding sites with the Transfac match tool

9-8-2009

  • check restriction sites for reverse complementary strand
  • add flanking sites with restriction sites and spacer nucleotides to our designed sequences
  • is there any possibility to automate Transfac queries?
  • work with combined / joined MySQL query structures
  • or solve this process by simply writing new temporary tables?
  • workflow summary (short) for manual designing of a synthetic promoter:
    • (A) use random sequence
    • (B) check TF-matrices
    • (C) validate TFs (mouse? human? repressor?)
    • (D) check Transfac and restriction sites
  • Phone conference with Kai Ludwig, Logo & Web Design (Nao)
  • official Team Meeting
  • wiki closure on Oct 21st!
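
The restriction-site check on both strands can be sketched like this (Python for illustration, not the CheckRestrictionsites.pl script itself). The enzyme list is an assumption; which enzymes were actually screened is not stated in the notebook.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

# Recognition sites of four common cloning enzymes (assumed selection).
SITES = {"EcoRI": "GAATTC", "XbaI": "TCTAGA", "SpeI": "ACTAGT", "PstI": "CTGCAG"}

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def find_sites(seq, sites=SITES):
    """Return {enzyme: [(strand, 0-based start), ...]} for hits on either strand."""
    seq = seq.upper()
    rc = reverse_complement(seq)
    hits = {}
    for enzyme, site in sites.items():
        found = []
        for strand, text in (("+", seq), ("-", rc)):
            start = text.find(site)
            while start != -1:
                found.append((strand, start))
                start = text.find(site, start + 1)
        if found:
            hits[enzyme] = found
    return hits

demo_hits = find_sites("AAGAATTCAA")
print(demo_hits)
```

The four sites listed are palindromic, so each hit shows up on both strands. The reverse-strand scan only adds information for non-palindromic recognition sites.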

9-9-2009

  • modify synthetic promoter sequences to be ready for ordering
  • Sweep more promoter sequences using Promotersweep
  • start Modeling
  • revise and improve HEARTBEAT
  • discuss differences between PWMs

9-10-2009

  • still modifying synthetic sequences to be ready for shipping
  • we have altogether 25 designed promoter sequences!

9-11-2009

  • Software Meeting (Stephen, Tim, Nao)
    • compatibility with mediawiki: HTML, perl, php, R, java?
    • GUI design
      • simple interface: single TF, auxiliary TFs, #TFBS, sequence length
      • "interactive": multiple TF, choosing auxiliary TFs, additional information (see Eukaryopedia), density function plot & histogram
      • "hyper-interactive" step-by-step design & creation
  • Modeling Meeting with Marti and Anna-Lena (Tim, Nao)
    • aim: fancy visualization to show expectation & prediction providing pathway insights
    • TODO/QUESTIONS
      • what is the stimulus? collect possible inputs!
      • measurable outcome: experiments & pathways
      • quality of synthetic sequence: error checking
        • we need to define the quality of our sequences
    • LEVELS of modeling
      • (1) DNA (2) expression/transcriptional activity (3) output
      • each with corresponding measurement
  • general modeling scheme: input - "What we are affecting" - possible outcomes
  • how? We use fuzzy logic

[TOP]

9-14-2009

  • collect input for inducing the system (e.g. p53: CPT, Pifithrin-alpha; NFkB: TNF-alpha etc.)
  • phone conference with Kai Ludwig
  • learn how to include Perl code into html code
    • learn how to use embperl
    • configure apache2 server such that embperl can be interpreted
    • try to make offline use of embperl working
  • try to find nice html editor for ubuntu - (seamonkey, Amaya)

9-15-2009

  • create network picture for meeting tomorrow
  • Logo discussion
  • Read paper: Fuzzy Logic Modeling of Signaling Networks (Aldridge 2009)
  • learn data management of virtual server
  • get an overview about the apache2 file and security system

9-16-2009

  • Modeling Meeting with Marti (Douaa, Tim, Nao)
    • update on available drugs/sequences
    • decide what to model: (A) error checking, and (B) differential expression?
    • use natural promoters to build up model for prediction of activity of synthetic promoters
    • Discussion of TF score
      • Transfac sequence alignment score
      • promotersweep binding site quality
      • relative position to TSS: How?
        • (A) peak width & amplitude, (B) distance to maximal peak & position, (C) number of peaks, (D) "sliding window" and calculate area under curve, (E) #TFBS (also for comparison of different synthetic promoters)
      • biophysical affinity using TRAP
    • first model: build up either on CMV or on JeT
    • potential: integrate many stimuli -> find out crosstalks of pathways?
  • TODO (meeting)
    • collect data
    • define WHAT we want to model
    • summarize available sequences
    • try to formulate IF ... THEN "sentences"
    • check MATLAB & MATLAB Fuzzy Logic Toolbox availability

9-17-2009

  • internal Team Meeting
  • find error.log files on the server and learn how to use it

9-18-2009

[TOP]

9-20-2009

  • learn how to use tag language of embperl
    • learn how to write loops with embperl
    • access of input variables in embperl -- using the %fdat hash

9-21-2009

  • struggling with how to use R from embperl

9-22-2009

  • Wiki Meeting (Dani, Cori, Nao)
    • install image processing tool
    • design wiki, brainstorming for possible navigation bars
  • Wiki Phone Meeting with Kai Ludwig (Nao)
    • design header & presentation-master as well as team shirts
  • Seminar: Martijn Luijsterburg (Karolinska Institute) - Heterochromatin Protein 1 is involved in the DNA damage response. Host: Thomas Höfer, Bioquant

9-23-2009

  • Modeling Meeting with Marti, Anna-Lena (Tim, Nao)
    • contact database group (TP3)
    • statistics: characterizing peaks
      • we go for area under the curve and affinity. optionally we can choose Transfac sequence score and peak height & width
    • strategy to convince the wetlab people of the importance of modeling at the meeting this coming Friday.
    • MATLAB license?
    • logical gates: try to start creating model topology after Friday
  • Presentation: Marti Bernado Faura (Bioquant, University of Heidelberg): Data-driven Fuzzy Logic modeling of Programmed Cell Death
    • intro into fuzzy logic
    • system development & work flow of fuzzy logic
    • fuzzy inference & model prediction
    • model types: MISO / MIMO
  • Wrap-up meeting: Team HEARTBEAT (Tim, Nao)
    • split up computational work into three tracks: HEARTBEAT DB, HEARTBEAT GUI and modeling
      • database: documentation (until Oct 18), peak characterization, calculate absolute density function
      • GUI: based on embperl, design according to our new wiki
      • modeling: MATLAB license, collect sequences & input data, develop network model, include pathways
  • literature work

9-24-2009

  • prepare slides for meeting tomorrow
  • pathway search: TNF-alpha/NFkB, VDR, SREBP and crosstalks. NFkB has a lot of pathway crosstalks, while SREBP and VDR show an interesting connection. Upon induction, SREBP activates VDR.

9-25-2009

  • Team Meeting (Wetlab, Nao)
    • short progress report of all of us
    • modeling: discussing scheme, modeling elements and strategies


[TOP]

9-28-2009

  • Wiki Phone Meeting with Kai Ludwig (Nao)

9-29-2009

  • designed synthetic promoters (HB_0001 - HB_0025) will be joined to the CMV core promoter, since the JeT core promoter contains an Sp1 site. All other sequences (e.g. randomly synthesized ones) are coupled to the JeT core promoter.
  • literature studies on combinatorial cis-regulation as well as on modeling of the lambda switch
  • prepare slides for the next modeling meeting

9-30-2009

  • Wiki Meeting (Dani, Nao)
  • MATLAB license order (Jens)
  • postpone Yara meeting (Wetlab, Tim)
  • got sequences from Lars
  • got qRT-PCR setup from Chenchen
  • Modeling Meeting with Marti & Anna-Lena (Tim, Nao)
    • still need to collect FACS and microscopy results
    • discuss our network prediction model using TNF-alpha as an example
    • maybe we can use the lambda switch paper as a good starting point for our modeling

[TOP]

October

Week Days
Mon Tue Wed Thu Fri Sat Sun
40 - - - 1 2 3 4
41 5 6 7 8 9 10 11
42 12 13 14 15 16 17 18
43 19 20 21 22 23 24 25
44 26 27 28 29 30 31 -

10-1-2009

  • Wiki & Presentation Meeting with Dani (Nao)

10-2-2009

  • some wiki work

[TOP]

10-5-2009

10-6-2009

  • Internal Team Meeting
    • check out number and measurement plans of randomly assembled synthetic promoters (5x NFkB, 5x p53, 2x pPARg, 2x SREBP)
  • Wiki Meeting (Corinna, Daniela, Nao)
    • discuss design of the top page and possible features
    • try out CSS design

10-7-2009

  • Wiki Design (Nao)
  • Wiki Phone Meeting with Kai Ludwig (Nao)
  • MATLAB has arrived!
  • literature work
  • Wetlab Meeting: progress report on measurement of randomly assembled synthetic promoters
  • think through the whole storyboard of our presentation at the Jamboree

10-8-2009

  • Short Meeting with Roland
  • image processing work for wiki

10-9-2009

[TOP]

10-12-2009

10-13-2009

  • Measurement discussion with Lars: REU/RMPU, defining equations for mammalian systems
  • literature work on PoPS paper (Kelly JR et al.) and apply their equations
  • Marti Modeling Meeting (Anna-Lena, Tim, Nao)
    • Journal Club (Tim, Nao)
    • summary of last Thursday's Team Meeting
  • Marti: start modeling using MATLAB and Fuzzy Logic Toolbox (FLT), playing around with FLT and tutorial

10-14-2009

Nao

  • develop a first fuzzy inference system (FIS) for testing
  • Marti Modeling Meeting, specify model topology
  • collect data: FACS (Cori), Microscopy (Hannah), Sequence & TECAN (Lars)
  • start calculating position score using R
  • translating project abstract

10-15-2009

Nao

  • calculate affinity score using TRAP (Anna-Lena)
  • collect ideas for integrating TF-wise scores into a final position/affinity score per sequence: median, mean, maximum, weighted mean?
  • all data analysis is stored in three sheets (SequenceAnalysis, ResultSummary and CalculateTRAP)
  • from now on we concentrate on FACS measurements because they are the most reliable ones (TECAN used only for scanning)
  • fill up TRAP data with missing transcription factors

10-16-2009

Nao

  • Anna-Lena Meeting: discuss how to integrate sequence scores
  • get & check p53, pPARg and random SREBP sequences
  • go through FACS results
  • add HEARTBEAT sequences for data analysis
  • modeling documentation
  • parsing experimental setups for modeling use
  • Chenchen qRT-PCR results
  • define possible modeling layers
    • first layer: input
      • drug type, pathway, drug mode of action, drug concentration, targeted cells, incubation time
      • sequence type, position score, affinity score
      • we choose position & affinity score, sequence type and the presence of stimulation. Time and drug concentration can be added in the future (unfortunately no data are available yet)
    • second layer: promoters
      • 6 constitutives, 3 standards, 6 inducible available
      • data analysis narrows this to 5 constitutives, 3 standards and 4 inducible
      • HEARTBEAT sequences have to be measured a.s.a.p.
  • Marti Modeling Meeting
    • try to define some fuzzy rules
    • we assume better binding -> better expression
    • define membership functions
    • start modeling with NFkB results
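
The assumption "better binding -> better expression" can be written as three IF ... THEN rules. The sketch below is plain Python, not the MATLAB Fuzzy Logic Toolbox; the triangular membership functions, their breakpoints, the output levels and the weighted-average (zero-order Sugeno style) defuzzification are all assumptions chosen for illustration.

```python
def triangle(x, left, peak, right):
    """Triangular membership function; returns a degree in [0, 1]."""
    if x <= left or x >= right:
        # Shoulder sets have their peak on the boundary.
        return 1.0 if x == peak else 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)

# Fuzzy sets for a binding score scaled to [0, 1] (made-up breakpoints).
BINDING_SETS = {
    "low": (0.0, 0.0, 0.5),
    "medium": (0.0, 0.5, 1.0),
    "high": (0.5, 1.0, 1.0),
}
# Made-up crisp output level for each rule consequent.
EXPRESSION_LEVEL = {"low": 0.1, "medium": 0.5, "high": 0.9}

def predict_expression(binding_score):
    """Fire 'IF binding is X THEN expression is X' for each set, then average."""
    score = min(1.0, max(0.0, binding_score))  # clamp to the defined range
    weights = {name: triangle(score, *abc) for name, abc in BINDING_SETS.items()}
    total = sum(weights.values())
    return sum(weights[name] * EXPRESSION_LEVEL[name] for name in weights) / total

for s in (0.0, 0.5, 0.75, 1.0):
    print(s, round(predict_expression(s), 3))
```

With these sets the memberships always sum to 1 inside [0, 1], so the prediction rises smoothly with the binding score.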

All

  • Internal Team Meeting
    • reminder: wiki task, wiki to do
  • Official Team Meeting

10-17-2009

  • final decision: we go for maximum of position and affinity score
  • added HB sequences for data analysis table; as soon as results are there we can model designed synthetic promoters
  • define shape of membership functions
  • literature search for missing activity values?
  • still TODO: check out p53 results since the p53-NFkB crosstalk is really interesting!
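
The aggregation choices weighed on 10-15 (median, mean, maximum) and the decision for the maximum can be compared with a few lines of Python. The decision is read here as taking, per score type, the maximum over the TF-wise values of one sequence; that reading and the demo numbers are assumptions.

```python
from statistics import mean, median

AGGREGATORS = {"max": max, "mean": mean, "median": median}

def aggregate(tf_scores, how="max"):
    """Collapse the per-TF scores of one promoter sequence into a single number."""
    if not tf_scores:
        raise ValueError("need at least one TF score")
    return AGGREGATORS[how](tf_scores.values())

# Made-up TF-wise scores for one sequence.
demo = {
    "position": {"NFkB": 0.8, "Sp1": 0.3, "p53": 0.5},
    "affinity": {"NFkB": 0.6, "Sp1": 0.9, "p53": 0.2},
}
final = {kind: aggregate(scores, "max") for kind, scores in demo.items()}
print(final)
```

The maximum keeps only the best-scoring TF per sequence, so weak additional sites do not pull the score down as they would with the mean.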

10-18-2009

Nao

  • SREBP/VDR paper arrived
  • finish data analysis
  • studying and playing around with MATLAB FLT, programming from both the FLT GUI and the MATLAB command line
  • define our work to be (i) error checking and (ii) exclusive pathway modeling
  • the high potential of this model lies in its plug'n'play structure, with a high capacity for integrating more inputs, outputs and also middle-layer elements (promoter diversity)

[TOP]

10-19-2009

Nao

  • define final network structure
  • wiki work
  • reading RFC documentation and correction
  • we call this project HEARTBEAT fuzzy network (FN)
  • HB FN documentation and first results!
    • creating two fuzzy controllers: inducible NFkB and constitutive
  • how do we integrate the data? combine via Simulink!

10-20-2009

  • Creating, developing, integrating and combining fuzzy network modeling (MATLAB, Simulink)
  • first analysis of HB sequences
  • HEARTBEAT FN documentation

10-21-2009

10-22-2009

  • FROZEN WIKI!!!!

[TOP]