Team:Berkeley Software/Data Model
From 2009.igem.org
- Eugene
- Spectacles
- Kepler
- Data Model
Clotho Infrastructure and Reconfigurable Data Model
Contents |
Introduction
With the growing number of parts available to synthetic biologists, the amount of data needed to identify and describe these parts grows as well. However, there is a lack of a unified data model to organize this data. As a result, each institution's database of parts looks slightly different from the databases at other institutions, making tool development for the wider community difficult. Clotho Classic had a database connection manager that could connect to many different types of databases. However, this manager was limited and could only connect to databases with structures similar to its internal data model. Clotho builds upon this idea, and now includes a much more robust database connection module that can deal with a larger range of database organizational patterns. This is accomplished by allowing the user to extend the core set of keywords that the tools linked into Clotho may use. This functionality will allow Clotho to be useful during while the standards in the synthetic biology community continue to be developed.
Feature Overview
The new Clotho infrastructure provides three useful services to all Clotho Tools. First, all Clotho Tools implement a ClothoTool interface, which allows them to communicate with other ClothoTools through the Tool API. Second, the Clotho infrastructure provides a robust database connection where database tuples are returned as Java objects (using [http://en.wikipedia.org/wiki/Object_Relational_Mapping Object Relational Mapping]). Finally, Clotho's data model is reconfigurable at runtime, which means that Clotho Tools don't need to be tied down to a specific data model.
Technical Overview
The new Clotho infrastructure has been greatly simplified since last year. It is now far easier to develop a new tool that can operate within Clotho, and these tools have more powerful API functions at their disposal. The above picture shows the relationships between the three services Clotho provides to tools. Tools can speak to each other through the Clotho Tool Core shown on the right. Tools can also access the Data Core through the Data API to talk to external databases. On the left hand
Tool API
The Tool API provides a standardized way for tools to interact, since all Clotho Tools must implement our ClothoTool interface in order to be loaded by the Tool Core. For example, take our sequence view tool. This tool is just a simple sequence editor based on [http://www.biology.utah.edu/jorgensen/wayned/ape/ ApE]. We would like this tool to be able to import sequence data from any other tool easily. Using the Tool API, this operation is very simple - we merely call the getData() method with the argument "sequence", since all Clotho Tools must have a getData() method. So in this example, the sequence view can query the Tool Core for a list of all active tools, display the list, and then import a sequence from a user-selected tool.
We realize that our Tool API doesn't solve all tool-to-tool communication problems - if two tools need to work very closely together, their code will still be more interdependent. However, our Tool API does provide a way to make tools that integrate easily with new tools.
Clotho Keywords
Clotho keywords provide the flexibility in both tool-to-tool interaction and tool-to-database communication. A keyword is a name given to some type of a data synthetic biologists would like to keep track of. We separate keywords into two categories: object keywords and field keywords. An object keyword is a name for a collection of data that describes some entity. For example, biobrick and person could be two object keywords. A field keyword, on the other hand, is a name given to a primitive piece of data. For example, sequence or name are two field keywords. An object keyword will generally have one or more field keywords associated with it, such as sequence for biobrick or name for person.
These keywords and the relationship between them are described in keyword files, which are XML files in a format that Clotho can understand. These files specify what keywords tools may use. Clotho comes with a default core keyword file that all the tools written by the Berkeley iGEM team use, but a developer can easily write their own keyword file to allow their own tools to work with data not described in the core Clotho data model.
Data API
The Data API provides tools with a simple and robust interface to external data sources. Using the same Clotho keywords as described earlier, tools can ask for data packaged up into Java objects. These objects will have types as specified by mapped object keywords. For example, the parts manager needs to work with biobrick and collections, so it will use Data API to ask for biobrick and collection objects. If these keywords were not mapped during the database connection process, then the parts manager will notify the user of an error. Otherwise, all the data that describes the biobricks and collections the parts manager needs use are returned as easy to use Java objects that can be queried for the data in their fields.