Team:Berkeley Software/AdamNotebook

From 2009.igem.org

Revision as of 04:45, 20 October 2009 by Adam z liu (Talk | contribs)
(diff) ← Older revision | Latest revision (diff) | Newer revision → (diff)

Adam's iGEM 2009 Notebook

Back to notebook index

Contents


ANTLR Tutorials

http://w3.msi.vxu.se/users/tgumsi/antlr/ (http://www.antlr.org/pipermail/antlr-interest/2009-February/032930.html)

http://java.ociweb.com/mark/programming/ANTLR3.html

http://supportweb.cs.bham.ac.uk/documentation/tutorials/docsystem/build/tutorials/antlr/antlr.html

http://www.alittlemadness.com/2006/06/05/antlr-by-example-part-1-the-language/


Adam z liu 06:26, 4 June 2009 (UTC)

Adam in Shanghai, China

Introduction:

My name is Adam Liu, and I am going to be a 4th year EECS student in the fall. This summer I will be working as a member of Berkeley's computational team. With the success of Clotho last iGEM, we have a great foundation to build off of this year. Although there are still a few bugs to fix, we plan to pursue many new exciting projects to move Clotho forward. The team is much bigger than last year, so hopefully we will not be too overwhelmed. My main focus this summer will be language development.


Short-term project:

I will be making a plug-in that will take a textual description of biological parts in an input file, tentatively named Adam's Language For Synthetic (ALFS), and generate an image with BOGL symbols.


Long-term project:

  • Familiarize myself with Proto, Antimony, and SBML.
  • Get an existing language to work with Clotho.
  • Design and implement our own language (structural netlist first).


Adam z liu 00:25, 11 June 2009 (UTC)

I finished a simple plugin called Pictoparts. The user inputs a *.p file describing BOGL parts, and my plugin parses the file, stitches together images, and outputs and displays an image of the composite part. I also met my partner Lesia, and we had our first project meeting with Doug. We came up with the following outline of tasks to complete this week:

  1. Think of a name for our language
  2. Give Lesia the Proto paper
  3. Look for other languages
  4. Write up a draft language for the BOGL RFC
    • List of primitive datatypes
    • List of properties for primitives
    • Rule syntax

Once we have a first draft for syntax, we will test the expressive power of our language on a set of diagrams by a member of the Stanford BOGL effort. We will then submit everything to the BOGL/BOL community for critique. Next week, we plan on giving our language to the wetlab to test, as well. After we settle on acceptable syntax, we will specify the grammar to GNU Bison to parse our language.


Adam z liu 19:22, 19 June 2009 (UTC)

Doug, Lesia, and I got feedback from various different sources about the sample syntax:

  • Cesar Rodriguez - contributor to the Stanford BOGL project
    • Rename Part to Component and Composite to Device
    • Make the language object-oriented with everything inheriting from Sequence
  • Paul Hilfinger - compilers professor at Berkeley
    • Need a way to define modules
    • Use a parser generator
    • CS community not really interested in domain-specific languages unless new processing is being done
  • J. Chris Anderson - Berkeley iGEM wetlab instructor
    • The language cannot be just a tweaked version of others already out there
    • Be able to elaborate on the ultimate purpose of the language

Lesia and I are developing a grammar for ANTLR. We also gave a PowerPoint presentation about our project to the group.


Adam z liu 22:17, 24 June 2009 (UTC)

Data model for Eugene back-end
/*
The following example syntax demonstrates how our parser populates our 
back-end data model, as specified in DataModel.png. Example1.png show 
each global HashMap in more depth.
*/
Property Sequence(txt);
Property Orientation(txt);
Property ID(txt);
Component Promoter(Sequence, Orientation, ID);
Component RBS(Sequence, Orientation, ID);
Component ORF(Sequence, Orientation, ID);
Image (Promoter, "C:/My Images/Promoter.jpg");
Image (RBS, "C:/My Images/RibosomeBindingSite.jpg");
Promoter p1("ATCG", "Forward", "BBa_K123456");
Promoter p2("GCTA", "Reverse", "BBa_K654321");
RBS rbs1(.Sequence("CGAT"), .Orientation("Reverse"));
num ten = 10;
num one = ten - 9;
Rule r1(p2 BEFORE p1);
Rule r2(rbs WITH p2);
Rule r3(p1 NEXTTO rbs1);
Assert(r1 OR r2);
Assert(NOT r3);
Note(NOT(r1 AND r3));
Device c1(p1, rbs1);
Device c2(p2, c1);
Back-end value population for example syntax


Adam z liu 00:36, 3 July 2009 (UTC)

Lesia and I worked on fixing as many bugs as possible for our initial release. I packaged our compiler into an executable that can run from the command line or from a GUI. It comes with an extensive README with instructions on how to code and compile with Eugene. Lesia is moving on to work on rules, and I will now be working on integrating with Spectacles, Richard and Joanna's visual piece.

Sourceforge wiki: http://eugene.wiki.sourceforge.net/About+Eugene

Sourceforge download: http://sourceforge.net/projects/eugene/files/eugene/eugene.tar.gz/download


Adam z liu 21:18, 11 July 2009 (UTC)

Lesia finished the pass on rules, and I wrote a Eugene exporter than converts the data model back to Eugene code. I also wrote a syntax highlighting file for Notepad++ to include with the next release. We had a meeting with Cesar and Akshay on Friday, and we agreed that the details about properties and components should be hidden from the user in the form of header files. Next week, we are going to comment and clean the grammar file, and create a Eugene editor for Spectacles. I will also create a thorough test case to find more bugs. The next release will be for Cesar's labmates at Stanford.


Adam z liu 04:11, 19 July 2009 (UTC)

We added header file functionality and integrated Eugene into Spectacles, which now has a code editor, as well. We brought the finished project, which is still a little buggy, to Stanford to show Drew, Cesar, and Akshay on Friday. Drew shared with us his vision of abstraction barriers between sequences, parts, and devices. He explained that tools simply do not exist that help refine sequences and particate them. Though Eugene and Spectacles provide the ability to move downward from the part level to the sequence level, until tools are developed to move upward, the problem is only half solved. We decided to adopt the rapid application development paradigm with our future efforts, and try to release working pieces as we finish them. Next week we will begin developing XML translators to move between our data model and Cesar's XML schema.


Adam z liu 011:33, 24 July 2009 (UTC)

I wrote an XML translator that takes XML in Cesar's below format and converts it into a Eugene header file.

<?xml version="1.0" encoding="UTF-8" ?> 
<registry>
  <part>
    <identifier>BBa_K106019</identifier> 
    <type>ORF</type> 
    <name>Receptor, tar-envZ</name> 
    <description type="string">Coding sequence for Tar-EnvZ fusion protein. Tar is the bacterial chemotaxis 
    receptor for aspartate. EnvZ is an osmosensor kinase, which, through phosphorylation of OmpR, regulates 
    transcription of porin genes.</description> 
    <size>1491</size> 
    <sequence>ccaggcatcaaataaaacgaaaggctcagtcgaaagactgggcctttcgttttatctgttgtttgtcggtgaacgctctc...</sequence> 
    <orientation>+</orientation> 
  </part>
</registry>

I also made a few diagrams for Lesia's paper.

Drew Endy's vision of abstraction layers


Adam z liu 20:57, 31 July 2009 (UTC)

Spectacles and Eugene refactored Component back to Part. I cleaned up the indenting in the grammar file and edited images for Lesia's paper as needed. Doug, Lesia, and I went through the paper and tried to get polished up for SUPERB. On Thursday, we went to the iGEM barbecue. I posted about it on the Berkeley Engineering blog. On Friday, everyone went to Lesia's poster session and looked at all the other SUPERB projects. The month of August is going to be all about migrating everything to Clotho II. On Monday, I will be packaging the Enzyme Library into a tool.


Adam z liu 19:12, 7 August 2009 (UTC)

Doug, Bing, and I started porting over the old tools to Clotho II. I worked primarily on the Enzyme Library, making sure highlighting is still compatible with the Sequence View. We discussed how communication should work between tools, and decided that tools will generally call the getData and sendData methods in the core, passing a tool class and a keyword. Tools will all subscribe to a universal set of keywords and not implement their own custom enumeration. This way, the vocabulary will be more controlled and and data requests will be synchronized with database mappings. Keywords can be relied on for fetching universally relevant information, like sequence. For more tool-specific information, like whether the methylated check-box is checked in the Sequence View, tools will have to get the specific tool from the core and call custom getter methods. The necessity for this kind of communication will be limited mostly to the old tools that are highly interdependent. Tool writers in the future hopefully will not need to do this very much. The sendData method will be called to ask a tool to perform a function on the sent piece of data. Functions will vary from tool to tool and will be specified in each tool's operation enumeration.

I also wrote up a translator that converts Eugene to XML. There are still some details that need to be hammered out regarding the format of the XML, but generation basically works.

Next week, Doug and I are trying to meet with Sherine from the wet lab to demo Clotho II and to get interesting constructs for the Eugene paper.

For the month of August I was given the following tasks:

  • Enzyme Library
    • Convert processData method to new ClothoTool interface
    • Preferences
    • Help file
    • Test
  • Eugene
    • Learn about 'if' statements
    • Look into 'while' loops
    • Make rules more expressive
    • Get rules working with Spectacles
    • Find more data for paper (around 20 static parts and 2 design scenarios)
    • Convert paper to JBE format


Adam z liu 19:33, 14 August 2009 (UTC)

This week, I met with Sherine from the wet lab with the following agenda:

  • Find around 10 new parts, homogeneous or heterogeneous in device composition, to add to the paper results
  • Describe 3 design scenarios
    • Incremental device design
    • Data-driven device design
    • Rule-driven device design

A lot of the work in the wet lab this year involved making many parts by reusing a large portion and swapping out a piece, a good example of incremental design. We could not come up with concrete examples for data-driven and rule-driven design, so we may need to consult Chris Anderson in the future. To capture the incremental design scenario in Eugene, I am working on a function that will take a device and generate all the variations from swapping parts. For instance, if the device has a promoter, new devices will be generated picking a promoter from the set of all promoters already defined.

I also fixed some bugs in Eugene regarding rule correctness and line numbers in code with comments. I updated the README to include documentation on 'if' statements, as well.


Adam z liu 19:56, 21 August 2009 (UTC)

I added the ability to export XML from the compiler in two different formats. The default format, option -xml-default, below, loops through all the properties of each part, while the standardized format, option -xml-sbdtp, will look for specific keywords to generate tags. There is still a lot of discussion of what should be included in the standardized format, so it is currently just a place-holder.

<?xml version="1.0" encoding="UTF-8" ?> 
<design>
  <parts>
    <part name="p1" type="Promoter">
      <sequence>ATCG</sequence>
      <id>BBa_K12345</id>
      <orientation>Forward</orientation>
    </part>
    <part name="p2" type="Promoter">
      <sequence>TCGA</sequence>
      <id>BBa_K23456</id>
      <orientation>Reverse</orientation>
    </part>
    <part name="rbs1" type="RBS">
      <sequence>CGAT</sequence>
      <id>BBa_K34567</id>
      <orientation>Bidirectional</orientation>
    </part>
  </parts>
  <devices>
    <device name="d1">
      <components>
        <component type="Promoter">p1</component>
        <component type="RBS">rbs1</component>
      </components>
    </device>
    <device name="d2">
      <components>
        <component type="Promoter">p2</component>
        <component type="Device">d1</component>
      </components>
    </device>
  </devices>
</design>

The generateVariations function is debugged and working, and documentation will be added to the README soon. It required an additional house-keeping data structure, a third ArrayList<String> in partDefinitions that keeps a list of names of all instances of each part. This made looking up the available pool of interchangeable parts more efficient.

To allow rule-checking to Spectacles, I added two new global data structures, ruleAssertViolations and ruleNoteViolations, that map device names to array lists of rule statements that are violated. Since Eugene currently throws an exception immediately when a rule is violated, ruleAssertViolations only holds one device and one statement right now.

This week, the majority of us went to the wet lab to get mentored by Sherine in the assembly process. We did Eco/Bam transfers, picked colonies, ran gels, and assembled composite parts. The iGEM experience would not have been complete without some wet lab work. After all, we are designing tools to aid in this process.


Back to notebook index