Team:Freiburg software/Project

From 2009.igem.org

Revision as of 20:57, 20 October 2009 by JonasOnlyJonas (Talk | contribs)

[http://www.youtube.com/watch?v=hM44QasLyfw Watch a high-resolution version here]


Motivation

Today the web offers a wide range of tools to communicate, collaborate and share personal data. 80% of the web's content is created by its users. People share their pictures with Picasa, students interact in social networks like Facebook, and programmers contribute to software projects using platforms like SourceForge. In contrast, science communication is still rather outdated and makes little use of the collaborative World Wide Web. Synthetic biology in particular needs many scientists to work together in order to build things as complex as artificial proteins, chromosomes or, later, even genomes, so a process of linking up results in synthetic biology research has started. We want to push this process and go one step further: Why only store and share results? Why not make use of the web to collaborate? Scientists could make the whole process of creating data transparent; they could even create data together. Today, the web offers all the technologies needed. Make use of it!

Highlights


The transmission and manipulation of common data like graphics, rich text documents, videos and other media is based on widely adopted standards. This is not the case for scientific data. You cannot easily embed protein data into a website. You can send it by e-mail to other scientists, but their mail clients have no idea how to handle this data. For manipulating and even displaying synthetic biological information, users still depend on their individual software environment.

State of the art

Modern ways of communicating and collaborating on the web are characterized by the following:

  • Usage of rich internet applications (RIA). They can be accessed by a browser via the internet but their look-and-feel reminds the user of desktop applications.
  • Standardized handling of data such as rich text documents and other media such as pictures, video and music.
  • Semantically annotated and linked data so that data can be understood in its context.
  • A strong tendency to multi-user conversation and collaboration. Everybody is welcome to contribute content.
  • Minimal effort needed to participate.

On the biology side, the state of the art looks different:

  • Biological data is stored in many different, specialized formats in many different locations, both as files and in databases.
  • Sequence analysis and manipulation are only possible in expensive, purpose-built desktop applications or in low-level software suites like embl, which contain huge numbers of small command-line programs that require advanced knowledge of bioinformatics scripting.
  • Connections between different databases are emerging, but only in the form of metadatabases and of new standards that are not yet widely supported.
  • Only isolated user-to-user communication.
  • Much effort is needed to communicate and collaborate with biological data. There is barely more than e-mails, business trips and the good old phone call.

Our work

When we first heard of Google Wave, we instantly noticed that it could be the solid basis we were looking for to build such a next-level biological software suite. When we later got access to its developer preview version, this approach was confirmed, but we soon noticed that we would need a lot of features not yet built into Wave, some of them not even planned.

As far as we know, SynBioWave is the only attempt to write collaborative web-based biological software, as well as the only larger project based on Google Wave. Entering this unknown territory gave us lots of problems to rack our brains over:

  • We spent weeks digging down to the fundamentals of Wave in order to implement basic things like file upload and download. SynBioWave is the only Wave development that has managed to do this so far!
  • We are the first (and maybe only) developers creating a project consisting of multiple Wave robots and therefore had to think about a way to make these robots communicate with each other. Google did not foresee this need for robot-robot communication and has not even decided to add it to its Robot API.
  • We took time exploring Google Wave to get a feeling for Wave and its typical, conversation-like workflow that feels very natural. We worked closely with the biologists of the Freiburg Bioware Team and different labs at Freiburg University to create a new wave-y way of doing bioinformatics tasks. Easy and clear usability is of high importance to us!
  • In order to create some of this usability, we created a highly dynamic menu system, which allows users to add the functions they need (in the form of robots) to a wave. Such an approach is absolutely unique in Google Wave.
  • To lay the ground for a successful open source project, we wrapped all these efforts into an easy-to-implement framework, allowing developers to contribute to SynBioWave. One key feature of this framework is an abstract template class which can be used to implement additional functionality without worrying about the issues mentioned above.

Additionally, we created a basic stack of Robots to prove and demonstrate the concept of SynBioWave.

The resulting software

Look-and-feel of SynBioWave

The result of four months of development is enjoyable! We have extended Wave to handle synthetic biological data. Biologists can not only document research results, but also record and share the process of creating them. Moreover, scientists can collaboratively perform basic biosynthetic tasks using SynBioWave.

For example, users can start conversations, invite participants, import sequences from several resources, comment on data, perform some tasks and display or export the results. Each participant experiences the others' actions in real time. New participants can trace back the whole process using the playback function. And all you need is a browser connected to the internet. Could it be easier to invite colleagues to your research?

Key features of SynBioWave

  • open source, free web application accessible from every computer connected to the internet
  • strong communication and collaboration functionality
  • embedding of common data like rich text documents, pictures, videos and many more (using other extensions of wave)
  • embedding of synthetic biological data possible
  • basic synthetic biology functionality
  • new participants can trace back the history using the playback function
  • easily extendable framework; custom/additional biosynthetic functions are easy to implement

Existing technologies

Google Wave

Google Wave is "a personal communication and collaboration tool" developed by Google. It is a web-based service, computing platform, and communication protocol designed to merge e-mail, instant messaging, wikis, and social networking. It has a strong collaborative and real-time focus and provides several ways to extend its functionality.[1]

Wave basically consists of an open communication protocol similar to e-mail, as well as client and server software. Like e-mail, the protocol aims to be open, decentralized and easy to adapt, but it also includes modern achievements like multi-user, real-time communication and rich formatted text with embedded data. At the moment, the only working server and client software for Wave is written by Google, but with the protocol already being open source, other servers and clients independent of Google are expected to become available soon.

Freiburg software Wave-example.png

In practice, a typical Wave conversation, called a wavelet, works like this: User Alice creates a new wavelet. She then invites her friend Bob to join the conversation. Bob accepts and can now write messages to the wave. Each message creates a so-called blip. What differentiates Wave from normal instant messaging is that if Alice and Bob decide to write a document together, they can edit the same blip together. Each change they make to the text is shown to the other one in real time. If they want, they can also use a built-in playback feature, similar to the version history in wikis, to review the changes made to the wavelet.
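The conversation model described above can be sketched as a small data structure. This is only an illustration in Python, not Google's actual data model: the class names Wavelet and Blip follow the terms used here, while the history list and the playback method are assumptions made for this sketch.

```python
from dataclasses import dataclass, field


@dataclass
class Blip:
    """One message in a conversation; several participants may edit it."""
    author: str
    text: str = ""


@dataclass
class Wavelet:
    """A conversation: participants, blips and a log of every change."""
    creator: str
    participants: list = field(default_factory=list)
    blips: list = field(default_factory=list)
    history: list = field(default_factory=list)  # hypothetical change log

    def __post_init__(self):
        self.participants.append(self.creator)

    def invite(self, user):
        self.participants.append(user)
        self.history.append(("invite", user))

    def add_blip(self, author, text):
        blip = Blip(author, text)
        self.blips.append(blip)
        self.history.append(("add", len(self.blips) - 1, author, text))
        return blip

    def edit_blip(self, index, editor, new_text):
        # Any participant may edit any blip; the change is logged for playback.
        self.blips[index].text = new_text
        self.history.append(("edit", index, editor, new_text))

    def playback(self, steps):
        """Rebuild the blip texts as they were after the first `steps` changes."""
        texts = []
        for entry in self.history[:steps]:
            if entry[0] == "add":
                texts.append(entry[3])
            elif entry[0] == "edit":
                texts[entry[1]] = entry[3]
        return texts
```

Keeping every change in a log, rather than only the latest text, is what makes a playback view possible: any earlier state can be rebuilt by replaying a prefix of the log.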

Additionally, Google has published an API for writing so-called robots and gadgets for Wave. Robots are small programs written in Python or Java that can participate in a wave much like normal users, whereas gadgets are small web pages that can be embedded into a wave conversation.

That makes Google Wave quite interesting for synthetic biology: Synthetic biologists work with text most of the time, and Google Wave is made for collaborative text creation. Synthetic biologists need to work together in order to create artificial sequences, proteins and later maybe whole genomes, and Google Wave is made for working together with any embedded data. Synthetic biologists need automated features for their work, and Wave offers them via robots.

BioJava

[http://www.biojava.org BioJava] is an open source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines, parsers for common file formats and allows the manipulation of sequences and 3D structures. The goal of the BioJava project is to facilitate rapid application development for bioinformatics.

Qooxdoo

[http://www.qooxdoo.org Qooxdoo] is one of the leading frameworks for creating rich internet applications (RIAs). RIAs are web applications which are accessed from the browser and are commonly made for multi-user tasks. In contrast to ordinary web sites, RIAs look and feel like desktop applications. Building an RIA from scratch is nearly impossible. Therefore qooxdoo provides a platform-independent development tool chain, a state-of-the-art GUI toolkit and an advanced client-server communication layer.

Because Wave's user interface is only weakly customizable, qooxdoo is a perfect candidate to extend Wave's user interface with custom toolbars, buttons, forms, context-menus and much more.

Have a look at the qooxWave protocol section to learn more about how we used qooxdoo in our project.

Software

Architecture

SynBioWave is a Google Wave extension, turning Wave into a biosynthetic Software Suite. There are currently two available methods of extending Google Wave (See Google Wave for an introduction):

  1. Robots
  2. Gadgets

We are making use of both!

Robots

Freiburg software SynBioWave-Conceptual-Architecture-flowchart.png

Robots are automated participants, able to modify the contents of the wavelets they have joined and to interact with real participants. Using Google's Java client library for robots gives us the possibility to implement biosynthetic functionality provided by the BioJava library. The SynBioWave Robot can be regarded as the core of our extension. Adding this robot to a wavelet activates SynBioWave. There are additional robots for specific functions, which can be added to a wavelet to activate the respective function. The SynBioWave Robot is responsible for:

  • Organization of all SynBioWave functions represented by additional robots
  • Extending Wave's user interface with SynBioWave-specific elements using the qooxWave protocol
  • Storing the important information written to the wave in an external database
  • Providing basic functionality like
    • Import/export of sequences
    • Rendering and displaying of sequences
    • Circular view
    • BioBrick communication (searching and importing biological parts)

Additional robots extending SynBioWave functionality

To extend SynBioWave with new biosynthetic functions, one can use additional robots. Adding such a robot to an active SynBioWave wavelet (a wavelet that contains the SynBioWave Robot) makes its function available to all users participating in this wave. For SynBioWave users it is therefore very easy to customize and extend SynBioWave: a user searches for the robot representing a certain function and adds it to the wavelet he or she is working with. But this is not the only benefit! This modular architecture also makes it very easy for other developers to create additional robots. To facilitate contribution, we provide an abstract SynBioWave Template class. Using this class simplifies the creation of additional robots a lot. The developer needs to worry neither about Wave development nor about SynBioWave integration (a minimal understanding of both is still needed). Therefore, one can concentrate on biosynthetic development.
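The idea of a template class can be sketched as follows. The actual SynBioWave Template class is written in Java and its API is not reproduced here; the Python names SequenceRobot, menu_label, process and on_menu_click are hypothetical and only illustrate how shared plumbing can live in an abstract base class while a contributor supplies the biological function.

```python
from abc import ABC, abstractmethod


class SequenceRobot(ABC):
    """Hypothetical template: subclasses supply only the biological function."""

    @property
    @abstractmethod
    def menu_label(self):
        """Text of the menu entry this robot contributes."""

    @abstractmethod
    def process(self, sequence):
        """The biosynthetic function itself."""

    def on_menu_click(self, sequence):
        # Shared plumbing lives in the template: normalise the input, call
        # the subclass, and wrap the result in a reply for the wavelet.
        cleaned = "".join(sequence.split()).upper()
        return f"[{self.menu_label}] {self.process(cleaned)}"


class ReverseComplementRobot(SequenceRobot):
    """Example contribution: only the label and the function are written."""

    menu_label = "Reverse complement"

    def process(self, sequence):
        pairs = {"A": "T", "T": "A", "G": "C", "C": "G"}
        return "".join(pairs[base] for base in reversed(sequence))
```

With this split, a call such as ReverseComplementRobot().on_menu_click("atg c") runs the template's input handling first and the contributed function second.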

Organizing Robots Work

When we tested this concept of multiple robots in one and the same conversation, it quickly became clear that we needed to organize the way the robots work: with different robots all listening to different (or maybe even the same) events, editing text, writing messages and so on, we ended up in chaos most of the time. So we decided to extend our framework to provide a standard input and output for robots.

For the input part, we created a menu which dynamically displays all the functions of the robots in the current wave. As Google did not foresee this need for a standardized and easy-to-use input method for Wave robots, we had to create it from scratch. We implemented it by creating a gadget that displays the menu, a Java class that robots can use to create menus, and the so-called qooxWave protocol to communicate the menu between the robots and the gadgets.

To organize the output of the robots, we created a standardized sequence display. At the moment the user can choose to have the sequences either directly in the wave, which feels more "natural", or inside a gadget, which is currently much more usable because of the strict limitations of text formatting inside Google Wave. Most likely all sequences will be written directly in the wave in the future, as Google has announced that it will improve the layout options of Wave dramatically.
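The dynamic menu can be pictured as a registry that collects the functions offered by every robot in the wave. The following Python sketch is an assumption about how such a registry could look; the class MenuRegistry and its methods are hypothetical and do not mirror the Java class used in SynBioWave.

```python
class MenuRegistry:
    """Hypothetical registry: collects menu entries from the robots in a wave."""

    def __init__(self):
        self._entries = {}  # robot name -> list of function labels

    def register(self, robot, labels):
        """Called when a robot joins the wave and announces its functions."""
        self._entries[robot] = list(labels)

    def unregister(self, robot):
        """Called when a robot leaves; its entries disappear from the menu."""
        self._entries.pop(robot, None)

    def build_menu(self):
        """Return the menu as (robot, label) pairs, sorted for stable display."""
        return sorted(
            (robot, label)
            for robot, labels in self._entries.items()
            for label in labels
        )
```

Because the menu is rebuilt from the registry each time, adding or removing a robot changes the visible functions without touching any other robot.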

Gadgets

Freiburg software SynBioWave-flowchart.png

The SynBioWave robots can be thought of as a kind of program logic. They run somewhere in the cloud and process and organize data. Gadgets come into play when data like sequences is displayed inside a wave. Moreover, gadgets offer the possibility to insert custom user interfaces, which are essential for manipulating and managing the biological data. Gadgets are small web pages included in a wavelet that communicate with it via Google's Wave API. So you could think of gadgets as visible objects that are essential for the integration of biological data into Wave.

SynBioWave's user interface

The Wave API provides some customization of the user interface, but unfortunately it is very limited. To provide some basic features like an upload form for sequences or a toolbar that bundles SynBioWave's functionality, we use gadgets. The SynBioWave robot dynamically creates a gadget containing a toolbar. It adds all user interface elements that correspond to functions provided by the SynBioWave Robot itself and by additional robots. The gadget sends events and input data from the user back to the robots. For this purpose we introduced the qooxWave protocol.

Bring it together: The qooxWave protocol

Freiburg software qooxWave-flowchart.png

We invented the qooxWave protocol to provide an easy-to-use interface for creating graphical user interfaces (GUIs) inside a wave from a robot. The general goal is to provide one abstract robot class for implementing new functions in SynBioWave (each function is provided by a robot; TODO: link). The programmer who uses this class does not need to worry about the client-side GUI implementation. The protocol provides some easy-to-use server-side functions that automatically build the client-side GUI.

The protocol defines communication in both directions:

  • a server to client communication for building client-side GUI elements from server side (so a server side robot can create a client side button for example)
  • a client to server communication for reporting events fired in the client side GUI and to transfer user input data to the robots (so the server knows when the user clicks a button for example)

The communication is realised via JSON strings that are stored inside a gadget's state object. Both server (robot) and client (gadget) can access the state object, and both can react to changes of this state object (this is provided by the Google Wave API).
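As a rough illustration of this exchange, the following Python sketch models the state object as a plain dictionary that only holds strings. The key names "ui" and "event" and the message layout are assumptions made for this example; the actual qooxWave message format is defined on the qooxWave details page and is not reproduced here.

```python
import json


def robot_publish_ui(state, elements):
    """Robot side: store a GUI description in the shared state as a JSON string."""
    state["ui"] = json.dumps(elements)


def gadget_read_ui(state):
    """Gadget side: decode the GUI description, or an empty list if none is set."""
    return json.loads(state.get("ui", "[]"))


def gadget_report_event(state, element_id, event, value=None):
    """Gadget side: report a user action back to the robot."""
    state["event"] = json.dumps({"id": element_id, "event": event, "value": value})


def robot_read_event(state):
    """Robot side: decode and consume the last reported event, if any."""
    raw = state.pop("event", None)
    return json.loads(raw) if raw is not None else None
```

In this sketch a robot would call robot_publish_ui with a list such as [{"type": "button", "id": "import", "label": "Import sequence"}], and the gadget would answer a click with gadget_report_event(state, "import", "click").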

For a detailed description/definition of the qooxWave protocol, have a look at Team:Freiburg_software/Project/qooxWave-details.

Client side implementation

The Client-side implementation of qooxWave is based on a qooxdoo application loaded into a gadget inside the Wave (have a look at the screenshot on the right TODO). This qooxdoo application - written entirely in JavaScript - communicates with the SynBioWave Robot via JSON-Strings using a state object provided by the Wave API. The state-object is actually designed for gadget-robot communication. But it only supports String-Data. This is why we use JSON to transmit more complex objects.

The qooxdoo application receives a JSON-object from the robot containing a GUI-structure. It parses this object and dynamically adds the ui-elements. On the other hand, the application sends user input back to the robot, also in form of JSON-Strings.

An MVC concept for a proper implementation

The qooxWave protocol allows a complex structure of UI elements containing each other. Moreover, there are complex events causing complex user input to be sent to the robots. To ensure a proper client-side implementation, we created a model-view-controller concept inside the qooxdoo application. We created a store object that converts the application data into a model; this model is converted into a JSON string, and the other way around. The store receives and sends this JSON string on any server-side or client-side event. This is an additional useful abstraction of the application layout.
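The role of the store can be sketched in a few lines. The real implementation is a qooxdoo application written in JavaScript; the Python classes MenuModel and Store and the render function below are hypothetical names that only illustrate the division of labour between model, store and view.

```python
import json


class MenuModel:
    """Model: nested UI elements plus listeners that are notified on change."""

    def __init__(self):
        self.items = []
        self._listeners = []

    def subscribe(self, listener):
        self._listeners.append(listener)

    def replace(self, items):
        self.items = items
        for listener in self._listeners:
            listener(self.items)


class Store:
    """Converts between the JSON string in the state object and the model."""

    def __init__(self, model):
        self.model = model

    def load(self, json_string):
        self.model.replace(json.loads(json_string))

    def dump(self):
        return json.dumps(self.model.items, sort_keys=True)


def render(items, depth=0):
    """View: turn the (possibly nested) items into indented text lines."""
    lines = []
    for item in items:
        lines.append("  " * depth + item["label"])
        lines.extend(render(item.get("children", []), depth + 1))
    return lines
```

The view never parses JSON itself; it only subscribes to the model, so the wire format can change without touching the rendering code.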

TODO: MVC-flowchart

Server side implementation

MenuItem.java

Extending Google Wave I/O

Over the past few decades, rapid developments in genomic and other molecular research technologies and in information technology have combined to produce a tremendous amount of information related to molecular biology. This huge amount of data lays the ground for the work of synthetic biologists. By analyzing, modifying and extending these sequences, synthetic biologists are able to build new functionalities and potentially whole genomes. So one basic requirement for work in synthetic biology is full access to this pool of sequence data, provided in different formats. As only file sharing was intended for Google Wave, we were forced to extend the built-in servlet functionality of the Google Wave Java API to meet the needs of file import/export, and to add methods for database access.

Sequence file import/export

Google Wave robots written in the Java programming language are specialized forms of Java HttpServlets. So we extended the Google Wave robot servlet class to serve not only robot events but file uploads and downloads as well. To ensure a thread-safe and robust upload functionality, we based the file upload on the well-known Apache Commons FileUpload project. To avoid any abuse of the file upload, it is directly connected to sequence creation and is only used if the file contains sequence information in a supported format. Using the newest BioJava classes for parsing the uploaded file provides an easy way to extend the number of supported file formats in the near future.
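The rule that an upload is only accepted if it parses as sequence data can be illustrated with a minimal FASTA check. SynBioWave itself relies on BioJava parsers in Java; the Python functions parse_fasta and accept_upload below, and the allowed alphabet, are assumptions made for this sketch.

```python
def parse_fasta(text, alphabet="ACGTUN"):
    """Parse FASTA text into {name: sequence}.

    Raises ValueError if the text is not valid nucleotide FASTA. The allowed
    alphabet is an assumption made for this sketch.
    """
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:].strip()
            if not name or name in records:
                raise ValueError("missing or duplicate record name")
            records[name] = ""
        elif name is None:
            raise ValueError("sequence data before first header")
        else:
            residues = line.upper()
            if set(residues) - set(alphabet):
                raise ValueError(f"unexpected characters in record {name!r}")
            records[name] += residues
    if not records or any(not seq for seq in records.values()):
        raise ValueError("no sequence data found")
    return records


def accept_upload(text):
    """Return the parsed records, or None when the upload should be rejected."""
    try:
        return parse_fasta(text)
    except ValueError:
        return None
```

Tying acceptance to successful parsing means arbitrary files never reach storage; only content that becomes a sequence object is kept.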

Database access

SynBioWave currently supports DAS access to the iGEM related BioBrick database. SynBioWave implements a simple name browser for the biological parts and allows the user to directly import the fully annotated sequence into the working process.

Internal Database

SynBioWave uses an internal database to keep track of all sequences produced or imported in the workflow. With the datastore browser, SynBioWave provides an intuitive user interface to search and reload these sequences. In this early state, Google Wave only supports robots hosted on Google's own AppEngine application server, so we are using Google's Datastore to store the sequences. Google has announced that this restriction will be lifted in the future and robots can then be hosted anywhere. To ensure compatibility with any other database provided by the chosen host, we make use of Sun's JDO specification, or more precisely the Apache JDO2 library.

If you want to know more, have a look at the I/O details.
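The benefit of coding against a storage specification rather than a concrete database can be sketched as follows. This is not JDO and not the SynBioWave code; the Python classes SequenceStore and InMemoryStore are hypothetical and only show how robot code can stay unchanged when the backend is swapped.

```python
from abc import ABC, abstractmethod


class SequenceStore(ABC):
    """Backend-neutral interface; robots only ever talk to this class."""

    @abstractmethod
    def save(self, name, sequence):
        """Persist a sequence under the given name."""

    @abstractmethod
    def search(self, term):
        """Return the names of stored sequences matching the term."""


class InMemoryStore(SequenceStore):
    """Stand-in backend for illustration; a real one would wrap a database."""

    def __init__(self):
        self._rows = {}

    def save(self, name, sequence):
        self._rows[name] = sequence

    def search(self, term):
        # Case-insensitive substring match on the sequence name.
        term = term.lower()
        return sorted(name for name in self._rows if term in name.lower())
```

A second backend would implement the same two methods, so moving from one host's database to another would not require changes in the robots.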

How to display sequences

Nowadays every application knows how to display text and pictures, and lots of applications play audio and video data. But there is no native support for displaying sequence data. SynBioWave currently supports three different ways to display your sequences. There is a simple view for short sequences typed or copied directly into the wave, and an embedded gadget view for longer sequences and sequence comparisons, for example multiple sequence alignments. Both views provide a clearly laid out scale and increase readability by automatically colorizing the sequences according to the sequence type. Finally, there is a circular view for displaying fully featured circular DNA, as needed for example when displaying vectors and plasmids.

If you are interested in more detailed information on the coloring mechanisms or the display of sequence features, have a look at the viewing details.
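A scale and a coloring step of this kind can be sketched in a few lines of Python. The palette, the row width and the function names colorize and with_scale are assumptions for illustration; the coloring rules actually used by SynBioWave are described in the viewing details and are not reproduced here.

```python
# Assumed palette for nucleotides; the real SynBioWave colors may differ.
NUCLEOTIDE_COLORS = {"A": "green", "C": "blue", "G": "black", "T": "red"}


def colorize(sequence, colors=NUCLEOTIDE_COLORS):
    """Pair each residue with a color name; unknown residues become grey."""
    return [(base, colors.get(base, "grey")) for base in sequence.upper()]


def with_scale(sequence, width=10):
    """Split a sequence into rows of `width` residues.

    Each row is prefixed with the 1-based position of its first residue,
    right-aligned in a five-character column.
    """
    rows = []
    for start in range(0, len(sequence), width):
        rows.append(f"{start + 1:>5} {sequence[start:start + width]}")
    return rows
```

For a 13-residue sequence and the default width, with_scale produces two rows, the first starting at position 1 and the second at position 11.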

Conclusion

Perspective

Future releases

Currently we are planning to keep focusing on improving the framework while Google Wave is still a preview version, and to adapt it to the new functions Wave will hopefully get in the future.

Version 0.2

By the time Wave enters a real public and final beta state, we will release version 0.2 of SynBioWave, which will have a mostly complete, stable and well documented framework and enable developers to create SynBioWave robots even more easily than the iGEM release.

Improvements planned:

Framework

  • Improved Robot-Robot-Communication, both via Wave and direct URL-connections (1)
  • Bidirectional Robot-Gadget-Communication (1)
  • Simpler menu creation in robots, i.e. changing the MenuItem class to an interface and creating a class implementation for every available menu item.
  • Integrate callback functions for menu items
  • Improved sequence display and manipulation, either in a properly styled(1) inline blip(1) with scrollbar(1) or in a "wave-y" gadget.
  • Improving the model-view-controller concept of the menu gadget
  • Heavily improving the usability with lots of testing and feedback from biologists(1)
  • Improving the Documentation

BioBrick-Robot

  • Integration of Assembly-Algorithms
  • Support for different BioBrick standards
  • Direct Parts-Upload to the iGEM-Server

Blast-Robot

  • Make it do something useful with the received Blast-hits

Other

  • Some more robots as proof of concept and examples for developers


Version 0.3

Version 0.3 will make SynBioWave attractive to even more developers by offering them the possibility to write SynBioWave robots in Python, and will further simplify the creation of robots with a SynBioWave Eclipse plugin. With hopefully many more robots available at that point, this could be the first release that can be used in labs.

  • A Python-implementation of the SynBioWave-Framework
  • An Eclipse-Plugin for SynBioWave-Developing
  • Many many more Robots


Later Versions

  • Support for own Wave- and Robot-Servers (1)
  • All functions typically needed in synthetic biology (1)


(1) : currently not supported by Google Wave and/or Google AppEngine, but announced for the future.

........... old version ................

Concept

Our concept is to create a collaborative software suite called SynBioWave for synthetic biology purpose. SynBioWave is a Google Wave extension using BioJava to add synthetic biology functionality, giving synthetic biological research access to the collaborative and interactive web 2.0. Using SynBioWave, scientists can share their results in Waves or even conduct research together from different places around the world. Users can add and modify sequences within conversations while others observe the progress or even interact. Participants can be invited to a conversation any time and track back the collaboration process using the playback function, which fully supports all biosynthetic contents.

SynBioWave makes use of Wave's powerful communication and collaboration functionality and is designed to be easily extended with new synthetic biology functionality. Mashing up the reinvention of e-mail with a major library for processing synthetic biology data raises science collaboration to a new level.

Our small team of three developers will not be able to create a full-fledged synthetic biology software by the iGEM Jamboree 2009. Our goal is to lay the foundation for a robust software suite and to demonstrate the benefits of this Wave approach for synthetic biological research. Moreover, we implement some basic biological functionality to demonstrate this concept.

SynBioWaves' key features

  • open source, free web application accessible from every computer connected to the internet
  • strong communication and collaboration functionality
  • basic synthetic biology functionality
  • easy to extend with additional synthetic biology functionality

The road to success

TODO: graphic

To address a wide audience of users and contributors, SynBioWave is published under a free license. This will attract other developers to create new functions or modify the software for their own purposes.

One of the key goals of SynBioWave is easy extensibility. We want to create a framework that allows other developers to contribute new biosynthetic functionality with a minimum knowledge of Wave development. For this purpose SynBioWave offers an abstract robot class which can be regarded as a template for biosynthetic functions. This concept has a very nice side effect: SynBioWave can be easily customized by adding and removing robots which represent certain functions.

SynBioWave will not only be a simple mashup, a synthetic biological software running inside a wavelet. It will be a perfect symbiosis of Wave and BioJava. The look and feel of SynBioWave will fit perfectly into the Wave concept. Wave's real-time editing and multi-user editing functions as well as the playback function must work in harmony with SynBioWave. This sounds quite trivial, but looking at the current Wave extensions gives reason to be concerned about this.

Benefits of the symbiosis

Building SynBioWave as mashup of Wave and BioJava brings up certain benefits:

  • no need for building the whole application from scratch
  • get the communication and collaboration features for nothing (wave)
  • get multi-user editing for nothing (wave)
  • get real-time editing for nothing (wave)
  • as a web application, SynBioWave can be accessed from any computer connected to the internet.
  • easy to setup. No local installation is needed
  • get many key bio features for nothing (biojava)
  • robust and high quality software basis
  • As Google's latest child, Wave is going to be talked about a lot. As one of the first applications using Google Wave, SynBioWave probably has a huge audience

Challenges and difficulties

  • At the start of SynBioWave's development, Wave is available only in a very unstable alpha version. Even the API is still weak and might change. There is not much documentation and discussion yet, and many bugs and "not-yet-implemented features" often nearly drove us crazy.
  • At the moment, Google forces developers to host robots through its AppEngine project. AppEngine has very strict built-in limitations, making it incredibly hard for us to implement even simple features like file upload and download. Nearly no existing bioinformatics Java class works on AppEngine without modifications. When Google opens Wave to independent robot servers, these problems will instantly vanish. They have announced that they will do so in the foreseeable future.
  • Developing a collaborative web application presents programmers with special challenges and difficulties. For example, multi-user editing and real-time editing raise the challenge of synchronising user input. What happens, for example, if two users submit contradictory input at the same time?
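The Wave protocol answers the last question with operational transformation: concurrent edits are adjusted against each other so that all participants end up with the same text. The following Python sketch shows only the simplest case, two concurrent insertions into one string. The operation format (position, string, user) and the tie-breaking rule by user name are assumptions for illustration and do not reproduce Wave's actual algorithm.

```python
def apply_insert(text, op):
    """Apply an insert operation (position, string, user) to a text."""
    pos, chars, _user = op
    return text[:pos] + chars + text[pos:]


def transform(op, against):
    """Shift `op` so it can be applied after the concurrent insert `against`."""
    pos, chars, user = op
    other_pos, other_chars, other_user = against
    # Ties at the same position are broken by user name (an assumed rule).
    if other_pos < pos or (other_pos == pos and other_user < user):
        pos += len(other_chars)
    return (pos, chars, user)
```

If Alice inserts "CCC" at position 3 of "ATGTAA" while Bob inserts "GG" at position 0, each side applies its own edit first and then the transformed edit of the other, and both arrive at the same result regardless of the order.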

References

[1] Inspired by Wikipedia. [http://en.wikipedia.org/w/index.php?title=Google_Wave&oldid=320807113 Link].