Team:Berkeley Software/LesiaNotebook/Notes
From 2009.igem.org
Week of June 8, 2009
Monday was the first day of the SUPERB program. I was introduced to the iGEM team. Doug gave a tutorial on Clotho and on plug-ins. I got the Clotho source code from the repository.
Tuesday we had a meeting. I will be working with Adam on the language project. BOL will be a language based on an abstract model in which parts form the lowest level of the hierarchy and serve as the basic data types. Parts can be composed of other parts, but cannot themselves be decomposed any further into lower-level data types. Devices sit at the second level of the hierarchy and can consist of other devices as well as parts. BOL will differ from other languages like Antimony, [http://sbml.org/Main_Page SBML], or [http://bionetgen.org/index.php/Main_Page BNGL] in that it will be a structural, human-readable language not geared towards modeling. BOL maps directly onto the [http://openwetware.org/wiki/Endy:Notebook/BioBrick_Open_Graphical_Language BOGL] symbols, its graphical representation, and can be used to convert textual designs to graphical representations and vice versa.
Wednesday brainstormed with Adam on the syntax of BOL. It will have predefined data structures based on the [http://openwetware.org/wiki/Endy:Notebook/BioBrick_Open_Graphical_Language BOGL] symbols, like promoter and RBS. Supporting data types will be string, int and composite. An important feature that we would like BOL to have is the ability to create user-defined data types. I created a preliminary table of properties, data types and operators. Also started familiarizing myself with [http://www.gnu.org/software/bison/ GNU Bison], the parser generator we plan to use for the language.
Researched other languages further. It seems that BOL will indeed be different from previous languages. Rule-based description has been implemented in [http://bionetgen.org/index.php/Main_Page BNGL], which is a modeling language and concentrates on species and molecules rather than standard parts. We still have to go over the syntax for rules and how we will implement them in BOL. The logical "AND", "OR" and complement operators, as well as <, >, seem appropriate for now.
Week of June 15, 2009
Monday had the presentation on Plug-Ins. Got feedback from the group and Professor Anderson on the Plug-In XML file generator:
- Add mouse-over help to each field that needs to be filled in
- Add an option to save the XML file under a user-specified name
- Restrict the choices for the interface and package name, since they have to correspond to each other
Started familiarizing myself with [http://www.antlr.org/ ANTLR], a tool for constructing compilers and interpreters from grammar rules. The grammar rules check the syntax of a program, and one can attach actions to each rule, which are responsible for the semantics. A grammar consists of parser rules and lexer rules; from it ANTLR constructs the Parser and Lexer files and, after debugging, the Test file and a Test input file. So far ANTLR seems more user-friendly than GNU Bison, but this may simply be because more tutorials are available for it.
Tuesday had a meeting with Doug and Adam on the language development:
- Need to specify probabilities with each rule somehow, maybe each rule can have a probability property and when enforcing rules this can be taken into consideration
- Need to provide functions, for example could have:
- isDownstream()
- isUpstream()
- Translate()
- ReverseComplement()
- Need to include input/output capabilities
- Control and iteration statements
- Scope of the variables for right now will be global
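As an illustration of what one of the proposed functions could look like, here is a minimal Java sketch of ReverseComplement. The class name SequenceFunctions and the choice to reject non-ACGT characters are my assumptions; the meeting only fixed the function names, not their behavior.

```java
// Hypothetical helper class; only the function name ReverseComplement comes from the notes.
class SequenceFunctions {

    // Returns the reverse complement of a DNA sequence, e.g. GCTA -> TAGC.
    static String reverseComplement(String dna) {
        StringBuilder out = new StringBuilder(dna.length());
        // Walk the sequence backwards and append the complement of each base.
        for (int i = dna.length() - 1; i >= 0; i--) {
            out.append(complement(dna.charAt(i)));
        }
        return out.toString();
    }

    // Assumption: only the four unambiguous bases are accepted; output is upper case.
    private static char complement(char base) {
        switch (Character.toUpperCase(base)) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:
                throw new IllegalArgumentException("not a DNA base: " + base);
        }
    }
}
```

Calling SequenceFunctions.reverseComplement("GCTA") returns "TAGC". Functions like isUpstream() and isDownstream() would additionally need a device to look positions up in, so they are not sketched here.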
Started writing the context-free grammar and thinking about the structure of the intermediate language, that is, how we are going to implement the semantics. Right now there are two classes (they are more like structs):
- Property, which has a type, a variable name and the fields that will store the values of the property. Type says whether the value to be stored is text, an integer or a list. Variable name says which property we are talking about, e.g. Sequence, ID, etc.; it should help with accessing each property later.
- Part, which stores an ArrayList of properties, the part type (e.g. Promoter, RBS, etc.) and a HashMap of the objects that each Part will have. For example:
Part Promoter(Sequence, BioBrickID);
Promoter p1, p2, p3;
Here the Part for Promoter stores the part type (Promoter), the properties that a Promoter can have (Sequence and BioBrickID), and an objects map holding the instances p1, p2, p3, where each instance has its own list of property values.
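A minimal Java sketch of how these two struct-like classes could fit together. The names PropertyDef and PartDef, the method names, and the idea of matching values to properties by position are my assumptions for illustration, not the actual compiler code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of the Property struct: a name plus a type tag.
class PropertyDef {
    final String name; // which property this is, e.g. "Sequence"
    final String type; // assumed tags: "txt", "num" or "list"

    PropertyDef(String name, String type) {
        this.name = name;
        this.type = type;
    }
}

// Hypothetical sketch of the Part struct.
class PartDef {
    final String partType;                                   // e.g. "Promoter"
    final List<PropertyDef> properties = new ArrayList<>();  // allowed properties, in declaration order
    // instance name -> (property name -> value)
    final Map<String, Map<String, Object>> objects = new HashMap<>();

    PartDef(String partType) {
        this.partType = partType;
    }

    void addProperty(PropertyDef property) {
        properties.add(property);
    }

    // Like "Promoter p1;": register an instance that has no values yet.
    void declare(String instanceName) {
        objects.put(instanceName, new HashMap<String, Object>());
    }

    // Like "Promoter p(GCTA, BBa_435);": values are matched to properties by position (assumption).
    void instantiate(String instanceName, Object... values) {
        if (values.length > properties.size()) {
            throw new IllegalArgumentException("too many values for " + partType);
        }
        Map<String, Object> instanceValues = new HashMap<>();
        for (int i = 0; i < values.length; i++) {
            instanceValues.put(properties.get(i).name, values[i]);
        }
        objects.put(instanceName, instanceValues);
    }

    // Returns null if the instance or the property value is unknown.
    Object get(String instanceName, String propertyName) {
        Map<String, Object> instanceValues = objects.get(instanceName);
        return instanceValues == null ? null : instanceValues.get(propertyName);
    }
}
```

With this layout, looking up a property value is two hash lookups (instance name, then property name), which is what the variable name in Property is meant to make easy.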
Wednesday continued working on the grammar, tried to generate the parser and lexer code and ran into lots of errors. The error messages that ANTLR gives did not seem specific enough, so I decided to debug my grammar by creating a new one and slowly copying the old one over, part by part. The problem turned out to be in how recursive grammar rules are declared: ANTLR generates top-down (LL) parsers, so a rule has to be written in a specific way that avoids recursion conflicts such as left recursion. For example:
expr: declareObj expr | instantObje expr | instantObje | declareObj;
caused recursion-related errors from ANTLR (the rule calls itself in two alternatives, and those alternatives start the same way as the two non-recursive ones). One way to solve the problem is the following:
program: expr*;
expr: declareObj | instantObje;
The * sign means that any number of object declaration and instantiation statements will be accepted by the compiler.
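To show what the repetition form buys us, here is a hand-written Java sketch of the same two rules: program becomes a loop over statements and expr picks one alternative from the first token. This is not ANTLR output. The class name, the crude tokenizer, and the assumption that a statement starting with Part or Property counts as declareObj are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy parser mirroring "program: expr*;" and "expr: declareObj | instantObje;".
class ProgramParser {
    private final List<String> tokens = new ArrayList<>();
    private int pos = 0;

    ProgramParser(String source) {
        // Crude lexer: make ';' its own token, then split on whitespace.
        for (String token : source.replace(";", " ; ").trim().split("\\s+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
    }

    // program: expr* ;  -> repetition is a loop, so no rule has to call itself.
    List<String> program() {
        List<String> kinds = new ArrayList<>();
        while (pos < tokens.size()) {
            kinds.add(expr());
        }
        return kinds;
    }

    // expr: declareObj | instantObje ;  -> one token of lookahead selects the alternative.
    private String expr() {
        String first = tokens.get(pos);
        // Assumption: these two keywords mark a declaration; anything else is an instantiation.
        String kind = (first.equals("Part") || first.equals("Property")) ? "declareObj" : "instantObje";
        // Consume the rest of the statement up to and including ';'.
        while (pos < tokens.size() && !tokens.get(pos).equals(";")) {
            pos++;
        }
        if (pos == tokens.size()) {
            throw new IllegalStateException("missing ';' at end of statement");
        }
        pos++;
        return kind;
    }
}
```

Because each alternative of expr is chosen from its first token and the repetition lives in the loop, the parser never has to look past a whole statement to decide whether another one follows.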
Thursday met with Professor Hilfinger to get some feedback on the language design, possible useful considerations:
- Creation of modules -> certain areas of code that do the same thing and only the parameters differ can be included in modules
- Ability to import modules into larger structures
- Rule scope -> need to know to which part rules apply
- Conceptual language behind the syntax, what structures, data types should we use?
- Decide on the interpreter/parser, right now it seems we are going to stick with ANTLR
Friday discussed the Plug-In model with Doug:
Right now we have five interfaces. In order to create a new Plug-In one needs to implement one of those interfaces. It would be easier to have one main interface which contains the methods, and an abstract class that implements that interface. Then a new Plug-In class just needs to extend the abstract class and override the methods it needs.
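A minimal Java sketch of the proposed model. All names here (PlugIn, AbstractPlugIn, ReverseTextPlugIn and the three methods) are hypothetical; the real Clotho interfaces and their methods are not listed in these notes.

```java
// Hypothetical single main interface holding all plug-in methods.
interface PlugIn {
    String getName();
    void init();
    String run(String input);
}

// Abstract class that implements the interface with harmless defaults.
abstract class AbstractPlugIn implements PlugIn {
    @Override
    public void init() {
        // Default: nothing to set up.
    }

    @Override
    public String run(String input) {
        // Default: pass the input through unchanged.
        return input;
    }
}

// A new plug-in only extends the abstract class and overrides what it needs.
class ReverseTextPlugIn extends AbstractPlugIn {
    @Override
    public String getName() {
        return "ReverseText";
    }

    @Override
    public String run(String input) {
        return new StringBuilder(input).reverse().toString();
    }
}
```

Here ReverseTextPlugIn inherits init() and only supplies its name and its own run(); getName() is deliberately left abstract so that every plug-in has to identify itself.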
Continued to work on the grammar and the semantic actions. Added print functionality. The following statements work so far: PROGRAM
Property someprop(txt);
Property someprop2(txt);
Property RelativeStrength(num);
Part customP(Sequence, BioBrickID);
customP.addProperties(RelativeStrength);
Promoter p(GCTA, BBa_435);
Week of June 22, 2009
Decided with Adam and Doug on the data structures to store and retrieve information for the compiler, see Adam's Notebook [1]. Continued to work on the Eugene compiler. Here is some sample syntax and output that works: Eugene Test Different Inputs
Week of June 29, 2009
New link for documentation, source code and jar file for [http://eugene.wiki.sourceforge.net/About+Eugene Eugene]
Wednesday & Thursday:
- added entries to the wiki at SourceForge
- added more functionality to print statements
- added more functionality to declarations of primitives
- found and fixed some bugs, e.g. we didn't check all hash tables of instances to see whether an object had already been defined for components or devices
- reorganized some of the code and grammar rules
- enabled access to individual elements in arrays and multidimensional arrays
- enabled declarations of rules
Most important event on Thursday -> release of Eugene 0.01
To Do:
- still need to implement Assert
Friday worked on Assert statement
- created grammar rules to recognize all kinds of Assert Statements
- Assert Statements are stored in an assertList, which consists of the names of the rules to be asserted
- the assertList stores the expressions in postfix notation so as to observe order of precedence
- every time somebody creates a new Assert statement, it overwrites the previous one, so that there is no confusion about which Assert is in effect
- every time a device is created and an Assert statement exists, the method AssertRule is called:
AssertRule algorithm:
if assertList is not empty:
    for every member in assertList:
        if (member is an operand)
            push to stack
        else
            pop from stack
            evaluate result (another algorithm)
            push back result
        end if
    end for
- current restrictions
- Rule declarations currently cannot handle comparison of properties
- only one Assert can be in effect at a time; need to decide on its scope
- it is currently not clear what to do with statements like p1 BEFORE p2 when one or both of the parts are not in the device being considered. If neither is in the device, should the result be true? If only one is contained, should it be false or true? Again, it depends on the biological context.
- after the end of evaluation, if the overall result of the statement is false, the compiler will issue the warning: "Warning, the current Assert statement has been violated"
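The AssertRule stack evaluation can be sketched in Java roughly as follows. This is a simplified illustration, not the compiler code: the class name AssertEvaluator is made up, the operators are assumed to be AND, OR and NOT, and the per-rule results are assumed to have been computed already by the separate rule-evaluation step and passed in as a map.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of evaluating a postfix assertList with a stack.
class AssertEvaluator {

    // assertList: rule names and operators in postfix order, e.g. [r1, r2, r3, OR, AND].
    // ruleResults: rule name -> result of checking that rule against the device (assumed given).
    static boolean assertRule(List<String> assertList, Map<String, Boolean> ruleResults) {
        if (assertList.isEmpty()) {
            return true; // nothing asserted, nothing violated
        }
        Deque<Boolean> stack = new ArrayDeque<>();
        for (String member : assertList) {
            switch (member) {
                case "AND": {
                    boolean right = stack.pop();
                    boolean left = stack.pop();
                    stack.push(left && right);
                    break;
                }
                case "OR": {
                    boolean right = stack.pop();
                    boolean left = stack.pop();
                    stack.push(left || right);
                    break;
                }
                case "NOT":
                    stack.push(!stack.pop());
                    break;
                default: {
                    // Operand: look up the already evaluated rule and push its result.
                    Boolean value = ruleResults.get(member);
                    if (value == null) {
                        throw new IllegalArgumentException("undeclared rule: " + member);
                    }
                    stack.push(value);
                }
            }
        }
        boolean result = stack.pop();
        if (!stack.isEmpty()) {
            throw new IllegalStateException("malformed postfix expression");
        }
        if (!result) {
            System.out.println("Warning, the current Assert statement has been violated");
        }
        return result;
    }
}
```

For example, r1 AND (r2 OR r3) is stored as r1 r2 r3 OR AND, so the parentheses and operator precedence are already encoded in the order of the list. The open question about BEFORE with missing parts is not touched here; it belongs to the rule-evaluation step that fills ruleResults.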
To Do:
- actually run the Assert code and debug it (I am sure there will be some kind of bug!)
- add checks to the grammar: whether rules have been declared, and whether the components or device instances inside the rule declarations are valid and have been declared
- think of better messages for the case where some rule has been violated but the overall result of the Assert statement is true
- add to Rule declarations the ability to compare properties using the operators ==, !=, <=, >=, <, >
- add this functionality to the AssertRule method, or maybe write a similar method for comparing properties