Team:Freiburg software/Project

From 2009.igem.org

Revision as of 17:51, 17 October 2009 by Davidn (Talk | contribs)

Introduction

Motivation

Google Wave

Google Wave is "a personal communication and collaboration tool" announced by Google at the Google I/O conference on May 27, 2009. It is a web-based service, computing platform, and communications protocol designed to merge e-mail, instant messaging, wikis, and social networking. It has a strong collaborative and real-time focus and provides several ways to extend its functionality.

At the moment, Wave consists of a communication protocol, a server, and a web client, all created by Google. While the protocol is already open source, Google has announced that it will publish the code of both the server and the client in the future in order to create a completely free system.

The BioJava project

BioJava is an open-source project dedicated to providing a [http://www.java.sun.com Java] framework for processing biological data. It provides analytical and statistical routines and parsers for common file formats, and it allows the manipulation of sequences and 3D structures. The goal of the BioJava project is to facilitate rapid application development for bioinformatics.

The BioJava library is useful for automating many routine, everyday bioinformatics tasks. As it matures, the library will provide a foundation upon which both free software and commercial packages can be developed.

It includes objects for manipulating biological sequences, file parsers, [http://biodas.org/ DAS] client and server support, access to BioSQL and [http://www.ensembl.org Ensembl] databases, tools for making sequence analysis GUIs and powerful analysis and statistical routines including a dynamic programming toolkit.
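To make "manipulating biological sequences" concrete, the following is a minimal plain-Java sketch of two typical operations: the reverse complement and the GC content of a DNA string. It deliberately does not use BioJava's own API; the class name `SequenceSketch` and its methods are hypothetical and only illustrate the kind of routine the library packages up.

```java
import java.util.Locale;

/** Plain-Java illustration; NOT BioJava's API. All names here are hypothetical. */
final class SequenceSketch {

    /** Returns the complement of a single DNA base (A<->T, C<->G). */
    static char complement(char base) {
        switch (base) {
            case 'A': return 'T';
            case 'T': return 'A';
            case 'C': return 'G';
            case 'G': return 'C';
            default:
                throw new IllegalArgumentException("Not a DNA base: " + base);
        }
    }

    /** Reverse complement of a DNA string; the input is upper-cased first. */
    static String reverseComplement(String dna) {
        String upper = dna.toUpperCase(Locale.ROOT);
        StringBuilder out = new StringBuilder(upper.length());
        // Walk the strand backwards and complement each base.
        for (int i = upper.length() - 1; i >= 0; i--) {
            out.append(complement(upper.charAt(i)));
        }
        return out.toString();
    }

    /** Fraction of G and C bases in a DNA string (0.0 for an empty string). */
    static double gcContent(String dna) {
        String upper = dna.toUpperCase(Locale.ROOT);
        if (upper.isEmpty()) {
            return 0.0;
        }
        int gc = 0;
        for (int i = 0; i < upper.length(); i++) {
            char b = upper.charAt(i);
            if (b == 'G' || b == 'C') {
                gc++;
            }
        }
        return (double) gc / upper.length();
    }

    public static void main(String[] args) {
        System.out.println(reverseComplement("ATGCCG"));
        System.out.println(gcContent("ATGCCG"));
    }
}
```

A library such as BioJava adds what this sketch leaves out: typed alphabets, ambiguity codes, annotations, and parsers for common file formats.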

Using the scalable, cross-platform, network-aware power of Java technology, researchers at Great Britain's famed Sanger Institute for genetic study have spawned BioJava, an open-source project dedicated to providing genomic researchers with a Java technology-based developer's toolkit. BioJava offers bioinformatics developers over 1200 classes and interfaces for manipulating genomic sequences, file parsing, CORBA interoperability, and more. The toolkit is already being used at major research and pharmaceutical centers, and in over 85 countries around the world.

Matthew Pocock, one of BioJava's founders, arrived at Sanger with C++ and Perl coding experience already under his belt, but soon found the languages lacking for his tasks. "With Perl, I just couldn't get the performance I needed," says Pocock. "When you're working with genomic data sets, you're often dealing with gigabytes of data. And Perl didn't handle that very well. C++ could handle that amount of data, but the language really didn't help you to write portable, robust code."

BioJava has grown tremendously since its beginnings. The most recent site statistics show 1,264 public classes and interfaces, with over 200,000 lines of code, and over 14 people regularly contributing to the code. "The total number of classes sounds a bit scary when you count it," explains Pocock, "but there are really only about 15 interfaces. And pretty much everything you ever write is to those 15 interfaces. So there's a frightening amount of complexity that you never see, and are happy not to see!"

Both Pocock and Thomas Down keep an active hand in maintaining and enhancing the BioJava code, but it is a truly collaborative open source effort. "Someone like myself, or Thomas, or Mark Schreiber, who is now a major contributor to the site, would approve anything that touched the core object model. And we would also discuss that on the mailing list or the IRC. But the project is actually quite modular. There are two people who are involved with the sequence-searching algorithm code. And they would be in charge of making sure that anything committed to that was safe and sane. In the current era, there is no one person who knows the entire library, or who has responsibility for it."

The most recent monthly site statistics (for April of 2004) show a hit rate of over 170,000, with greater than 400 downloads of the BioJava package, comprising a total of over 130,000 files. At peak times, the site receives over 10,000 hits an hour.

Projects

The following projects make use of BioJava. If you know of other projects please add them to the list.

  • [http://www.dengueinfo.org/ DengueInfo]: A Dengue genome information portal that uses BioJava in the middleware and talks to a BioSQL database.
  • [http://www.derkholm.net/thomas/dazzle Dazzle]: A BioJava-based DAS server.
  • [http://www.inforsense.com/biosense.html Biosense]: A commercial informatics offering from [http://www.inforsense.com/ Inforsense] that uses BioJava under the hood.
  • [http://www.bioclipse.net Bioclipse]: A free, open source, workbench for chemo- and bioinformatics with powerful editing and visualization capabilities for molecules, sequences, proteins, spectra etc.
  • [http://webclu.bio.wzw.tum.de/prompt PROMPT]: A free, open source framework and application for the comparison and mapping of protein sets. Uses BioJava for handling most input data formats.
  • [http://www.cytoscape.org Cytoscape]: An open source bioinformatics software platform for visualizing molecular interaction networks.
  • [http://www.bioweka.org BioWeka]: An open source biological data mining application.
  • [http://www.biomatters.com Geneious]: A molecular biology toolkit.
  • [http://www.proteomecommons.org/dev/masssieve/index.html MassSieve]: An open source application to analyze mass spec proteomics data.
  • [http://www.charite.de/bioinf/strap/ Strap]: A tool for multiple sequence alignment and sequence based structure alignment.
  • [http://www.jstacs.de Jstacs]: A Java framework for statistical analysis and classification of biological sequences.
  • [http://www.bioinf.jku.at/software/LSTM_protein/ jLSTM]: "Long Short-Term Memory" for protein classification.
  • [http://lajolla.sourceforge.net LaJolla]: Structural alignment of RNA and proteins using an index structure for fast alignment of thousands of structures. Includes an easy-to-use command line interface. Open source at Sourceforge.

Publications

BioJava has been used in the following publications. If you know of other publications please add them.

<biblio>

  1. hidalgo1998 pmid=9564045
  2. jacobs2000 pmid=10592251
  3. xie2000 pmid=12761070
  4. schrieber2002 pmid=12016048
  5. bussow2002 pmid=12493080
  6. aerts2003 pmid=12626717
  7. bernado2003 pmid=12967955
  8. brown2003 pmid=15130816
  9. carbone2003 pmid=14594704
  10. gurvich2003 pmid=14592990
  11. huang2003 pmid=14668218
  12. sugawara2003 pmid=12824432
  13. zuyderduyn pmid=14583100
  14. aerts2004 pmid=15044242
  15. dong2004 pmid=15215471
  16. down2004 pmid=15369604
  17. hajarnavis2004 pmid=15247332
  18. hertz-folwer2004 pmid=14681429
  19. an2005 pmid=15610565
  20. carbone2005 pmid=15537809
  21. down2005 pmid=15760844
  22. finack2005 pmid=15572471
  23. gorban2005 pmid=15984937
  24. gouret2005 pmid=16083500
  25. kersey2005 pmid=15608201
  26. pain2005 pmid=15640145
  27. prlic2005 pmid=16204122
  28. pudimat2005 pmid=15905283
  29. spindel2005 pmid=16288651
  30. bindewald2006 pmid=16845037
  31. down2006 pmid=17002805
  32. carter2006 pmid=16925840
  33. gille2006 pmid=16469097
  34. hasan2006 pmid=16789813
  35. hasan2006 pmid=16990246
  36. lee2006 pmid=16402215
  37. liang2006 pmid=17054788
  38. lu2006 pmid=16260186
  39. mcdonald2006 pmid=17000643
  40. powel2006 pmid=16423288
  41. ross2006 pmid=16845480
  42. schmidt2006 pmid=16817977
  43. vernicos2006 pmid=16837528
  44. vizcaino2006 pmid=16872539
  45. andreeva2007 pmid=17068077
  46. bui2007 pmid=17288609
  47. down2007 pmid=17238282
  48. gewehr2007 pmid=17237069
  49. hanekamp2007 pmid=17332025
  50. makias2007 pmid=17400476
  51. nikolajewa2007 pmid=17537825
  52. spjuth2007 pmid=17316423
  53. zajac2008 pmid=18061398
  54. vernikos2008 pmid=18071028
  55. liang2008 pmid=17054788
  56. chalk2008 pmid=18397893
  57. gront2008 pmid=18227118

</biblio>

Bauer, R.; Rother, K.; Moor, P.; Reinert, K.; Steinke, T.; Bujnicki, J. M.; Preissner, R. Fast Structural Alignment of Biomolecules Using a Hash Table, N-Grams and String Descriptors. Algorithms 2009, 2, 692-709. [http://www.mdpi.com/1999-4893/2/2/692 open access full text]

Concept

Our concept is to create SynBioWave, a collaborative software suite for synthetic biology. SynBioWave is a Google Wave extension that uses BioJava to add synthetic biology functionality, giving synthetic biologists access to the collaborative and interactive web 2.0.

SynBioWave's key features

  • open source, free software
  • strong communication and collaboration functionality
  • basic synthetic biology functionality
  • easy to extend with additional synthetic biology functionality

The Software

General

iGEM Release

Road-map

Guides

User-guide

Developer-guide