Team:Freiburg software/Project

From 2009.igem.org

Revision as of 18:48, 17 October 2009 by JonasOnlyJonas (Talk | contribs)


Introduction

Motivation

Google Wave

Google Wave is "a personal communication and collaboration tool" developed by Google. It is a web-based service, computing platform, and communications protocol designed to merge e-mail, instant messaging, wikis, and social networking. It has a strong collaborative and real-time focus and provides several ways to extend its functionality.


In Google Wave terminology, a tree-like conversation is called a Wavelet, and each message in the conversation is a Blip.
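The relationship between a Wavelet and its Blips can be pictured as a small tree structure. The following is only a minimal sketch in plain Java: the classes `Blip` and `Wavelet` and their methods are hypothetical names chosen for illustration and are not part of the Google Wave API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model, NOT the Google Wave API: a blip is one message
// and may have replies, so the conversation forms a tree.
class Blip {
    final String author;
    final String text;
    final List<Blip> replies = new ArrayList<>();

    Blip(String author, String text) {
        this.author = author;
        this.text = text;
    }

    // Adds a reply below this blip and returns the new reply.
    Blip reply(String author, String text) {
        Blip child = new Blip(author, text);
        replies.add(child);
        return child;
    }
}

// Hypothetical model of a wavelet: the tree of blips below a root blip.
class Wavelet {
    final Blip root;

    Wavelet(Blip root) {
        this.root = root;
    }

    // Counts all blips in the conversation by walking the tree depth-first.
    int countBlips() {
        return count(root);
    }

    private static int count(Blip blip) {
        int total = 1;
        for (Blip reply : blip.replies) {
            total += count(reply);
        }
        return total;
    }

    // Renders the conversation as indented lines, one line per blip.
    String render() {
        StringBuilder out = new StringBuilder();
        render(root, 0, out);
        return out.toString();
    }

    private static void render(Blip blip, int depth, StringBuilder out) {
        for (int i = 0; i < depth; i++) {
            out.append("  ");
        }
        out.append(blip.author).append(": ").append(blip.text).append('\n');
        for (Blip reply : blip.replies) {
            render(reply, depth + 1, out);
        }
    }
}
```

In this sketch, a reply to a reply simply becomes a deeper node in the tree, which is what makes the conversation tree-like rather than a flat list of messages.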

Additionally, Google has published an API for writing so-called Robots and Gadgets for Wave. Robots are small programs written in Python or Java that can participate in a wave much like normal users, while Gadgets are small web pages that can be embedded in a wave conversation.
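To illustrate what "participating like a normal user" could mean for a Robot, here is a rough sketch in plain Java. It does not use the real Google Wave Robots API: the interface `WaveRobot`, the class `LengthRobot`, and the `!length` command are all assumptions made up for this example.

```java
import java.util.Optional;

// Hypothetical interface, NOT the Google Wave Robots API: the robot is
// told about each submitted blip and may answer with text of its own.
interface WaveRobot {
    Optional<String> onBlipSubmitted(String author, String blipText);
}

// A toy robot that reacts to blips starting with the made-up command "!length".
class LengthRobot implements WaveRobot {
    private static final String COMMAND = "!length ";

    @Override
    public Optional<String> onBlipSubmitted(String author, String blipText) {
        if (!blipText.startsWith(COMMAND)) {
            // Stay silent, like a participant with nothing to add.
            return Optional.empty();
        }
        String sequence = blipText.substring(COMMAND.length()).trim();
        return Optional.of("@" + author + ": sequence length is "
                + sequence.length() + " bp");
    }
}
```

The idea shown here is only the interaction pattern: a robot watches the conversation and contributes a reply when a message concerns it.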

The BioJava project

BioJava is an open-source project dedicated to providing a Java framework for processing biological data. It provides analytical and statistical routines and parsers for common file formats, and it allows the manipulation of sequences and 3D structures. The goal of the BioJava project is to facilitate rapid application development for bioinformatics.

Using the scalable, cross-platform, network-aware power of Java technology, researchers at Great Britain's famed Sanger Institute for genetic study have spawned BioJava--an open-source project dedicated to providing genomic researchers with a Java technology-based developer's toolkit. BioJava offers bioinformatics developers over 1200 classes and interfaces for manipulating genomic sequences, file parsing, CORBA interoperability, and more. The facility is already being used at major research and pharmaceutical centers, and in over 85 countries around the world.
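As a simple illustration of what manipulating sequences involves, the following sketch computes a reverse complement and a GC content in plain Java. It deliberately does not use BioJava's own classes; the class `DnaSketch` and its method names are hypothetical and only show the kind of routine a framework like BioJava provides in a more general and robust form.

```java
// Hypothetical helper class for illustration only; this is NOT BioJava's API.
class DnaSketch {

    // Returns the reverse complement of a DNA string (A<->T, C<->G), upper-cased.
    static String reverseComplement(String dna) {
        StringBuilder out = new StringBuilder(dna.length());
        for (int i = dna.length() - 1; i >= 0; i--) {
            out.append(complement(Character.toUpperCase(dna.charAt(i))));
        }
        return out.toString();
    }

    // Maps a single upper-case base to its complement.
    private static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:
                throw new IllegalArgumentException("Not a DNA base: " + base);
        }
    }

    // Returns the fraction of G and C bases (0.0 to 1.0); 0.0 for an empty string.
    static double gcContent(String dna) {
        if (dna.isEmpty()) {
            return 0.0;
        }
        int gc = 0;
        for (int i = 0; i < dna.length(); i++) {
            char base = Character.toUpperCase(dna.charAt(i));
            if (base == 'G' || base == 'C') {
                gc++;
            }
        }
        return (double) gc / dna.length();
    }
}
```

A dedicated framework adds what this sketch leaves out, such as ambiguity codes, other alphabets, and parsers for common file formats.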

Why BioJava?

There are three major frameworks for processing biological data: BioPerl, BioJava and Biopython. As Google Wave provides developer APIs for both the Java and the Python programming languages, the choice was between BioJava and Biopython.

Why we chose the BioJava framework:

BioJava has grown tremendously since its beginnings. The most recent site statistics show 1,264 public classes and interfaces, with over 200,000 lines of code, and over 14 people regularly contributing to the code. "The total number of classes sounds a bit scary when you count it," explains Pocock, "but there are really only about 15 interfaces. And pretty much everything you ever write is to those 15 interfaces. So there's a frightening amount of complexity that you never see, and are happy not to see!"

It includes objects for manipulating biological sequences, file parsers, DAS client and server support, access to BioSQL and Ensembl databases, tools for making sequence analysis GUIs and powerful analysis and statistical routines including a dynamic programming toolkit.


Pocock arrived at Sanger with C++ and Perl coding experience already under his belt, but soon found the languages lacking for his tasks. "With Perl, I just couldn't get the performance I needed," says Pocock. "When you're working with Genomic data sets, you're often dealing with Gigabytes of data. And Perl didn't handle that very well. C++ could handle that amount of data, but the language really didn't help you to write portable, robust code."


Both Pocock and Down keep an active hand in maintaining and enhancing the BioJava code, but it is a truly collaborative open source effort. "Someone like myself, or Thomas, or Mark Schreiber, who is now a major contributor to the site, would approve anything that touched the core object model. And we would also discuss that on the mailing list or the IRC. But the project is actually quite modular. There are two people who are involved with the sequence-searching algorithm code. And they would be in charge of making sure that anything committed to that was safe and sane. In the current era, there is no one person who knows the entire library, or who has responsibility for it."

The most recent monthly site statistics (for April of 2004) show a hit rate of over 170,000, with greater than 400 downloads of the BioJava package, comprising a total of over 130,000 files. At peak times, the site receives over 10,000 hits an hour.

The following projects make use of BioJava.

  • Dazzle: A BioJava based DAS server.
  • Bioclipse: A free, open-source workbench for chemo- and bioinformatics with powerful editing and visualization capabilities for molecules, sequences, proteins, spectra, etc.
  • Geneious: A molecular biology toolkit.
  • SPICE: A browser for the annotations for protein sequences and structures that is based on the DAS protocol.

Concept

Our concept is to create a collaborative software suite for synthetic biology called SynBioWave. SynBioWave is a Google Wave extension that uses BioJava to add synthetic biology functionality, giving synthetic biology research access to the collaborative and interactive web 2.0.

SynBioWave makes use of Wave's powerful communication and collaboration functionality and is designed to be easily extended with new synthetic biology functionality. Combining this reinvention of email with a major library for processing biological data raises scientific collaboration to a new level.

Our small team of three developers will not be able to create a full-featured synthetic biology application by the iGEM Jamboree 2009. Our goal is to lay the foundation for a robust software suite and to implement some basic synthetic biology functionality.

SynBioWave's key features

  • open source, free web application accessible from every computer connected to the internet
  • strong communication and collaboration functionality
  • basic synthetic biology functionality
  • easy to extend with additional synthetic biology functionality

The Software

General

iGEM Release

Road-map

Guides

User-guide

Developer-guide