I/O-details
From 2009.igem.org
SynBioWave sequence I/O
Sequence file import and export
Although databases are now the most important source of biological sequences and related information, locally stored sequence files will not disappear. Local storage remains necessary in the biological workflow for several reasons: protective data backups, short-term storage, and the fact that network access is not available at all times.
Google Wave Servlet improvement
As mentioned above, there is an urgent need to handle files in as many file formats as possible. Because Google Wave is intended only for low-level file sharing, we took up the challenge and extended the limited servlet capabilities with methods for serving file uploads and downloads.
The use of Apache commons-FileUpload
To provide a stable and modern upload capability, we chose the well-known Apache commons-FileUpload library. Its API offers extensive classes with intuitive methods. As shown in the code sample on the right, we filter the incoming POST request primarily by its content type. This is a simple but effective way to separate uploaded data from the event data used for communication between the Google Wave server and the robot. Once the incoming data has been decoded, security and possible abuse of the file upload have to be considered. To rule out misuse of the file upload, we decided to pass the incoming data stream directly to the BioJava file parsers without saving it locally. This way, any file format not supported by these parsers is simply ignored.
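The two steps described above, routing the request and parsing the stream without local storage, can be sketched with the JDK alone. This is a minimal illustration, not the SynBioWave servlet code: the class `UploadRoutingSketch` and its methods are hypothetical, the routing check assumes the filter amounts to testing for a multipart POST (commons-FileUpload offers `ServletFileUpload.isMultipartContent` for this purpose), and the tiny FASTA reader merely stands in for the BioJava parsers.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper for illustration; not part of SynBioWave,
// commons-FileUpload, or BioJava.
final class UploadRoutingSketch {

    // Assumption: a file upload arrives as a multipart POST, while robot
    // events arrive with some other content type.
    static boolean isFileUpload(String method, String contentType) {
        return "POST".equalsIgnoreCase(method)
                && contentType != null
                && contentType.toLowerCase(Locale.ROOT).startsWith("multipart/");
    }

    // Reads FASTA records (name -> sequence) directly from the stream.
    // Nothing is written to disk. Input that does not start with a FASTA
    // header yields an empty map, so unsupported uploads are ignored.
    static Map<String, String> readFasta(InputStream in) {
        Map<String, String> records = new LinkedHashMap<>();
        String name = null;
        StringBuilder sequence = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                line = line.trim();
                if (line.isEmpty()) {
                    continue;
                }
                if (line.startsWith(">")) {
                    // A new header closes the previous record, if any.
                    if (name != null) {
                        records.put(name, sequence.toString());
                    }
                    name = line.substring(1).trim();
                    sequence.setLength(0);
                } else if (name == null) {
                    // Data before any header: not FASTA, ignore the upload.
                    return records;
                } else {
                    sequence.append(line);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        if (name != null) {
            records.put(name, sequence.toString());
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] upload = ">demo\nACGT\nTTAA\n".getBytes(StandardCharsets.UTF_8);
        if (isFileUpload("POST", "multipart/form-data; boundary=x")) {
            System.out.println(readFasta(new ByteArrayInputStream(upload)));
        }
    }
}
```

In a real servlet, the stream would come from the upload library's streaming API rather than from a byte array, and the parser would be one of the BioJava readers.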
The use of BioJava
BioJava is a well-maintained, low-level bioinformatics API that provides a wide range of basic biological functions. Because it is a fast-growing, cleanly object-oriented, community-based open-source project, as our own project is planned to be, we can take part in its development.
Supported file formats
As shown in the code snippet alongside, BioJava provides an intuitive interface for adding new file formats.
Currently supported are FASTA, GenBank/GenPept, EMBL and INSDseq formats.
With the provided format factories, support for further formats can be added rapidly in the near future.
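The idea of pluggable formats can be illustrated with a small registry in which each format is one registration. This is a sketch under stated assumptions, not BioJava's interface: the class `FormatRegistrySketch` is hypothetical, and the detection rules are simplified first-line heuristics (a FASTA header starts with `>`, a GenBank record with `LOCUS`, an EMBL entry with `ID`, and INSDseq is XML with an `INSD...` root element).

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical registry for illustration; not BioJava's API.
final class FormatRegistrySketch {

    // Format name -> rule that recognizes the first non-blank line.
    private final Map<String, Predicate<String>> detectors = new LinkedHashMap<>();

    // Adding a new format takes a single registration.
    void register(String name, Predicate<String> recognizesFirstLine) {
        detectors.put(name, recognizesFirstLine);
    }

    // Returns the first registered format whose rule matches, if any.
    Optional<String> detect(String content) {
        String first = null;
        for (String line : content.split("\\R")) {
            if (!line.trim().isEmpty()) {
                first = line;
                break;
            }
        }
        if (first == null) {
            return Optional.empty();
        }
        for (Map.Entry<String, Predicate<String>> entry : detectors.entrySet()) {
            if (entry.getValue().test(first)) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    // Simplified heuristics for the formats listed above (assumptions).
    static FormatRegistrySketch withDefaults() {
        FormatRegistrySketch registry = new FormatRegistrySketch();
        registry.register("FASTA", line -> line.startsWith(">"));
        registry.register("GenBank", line -> line.startsWith("LOCUS"));
        registry.register("EMBL", line -> line.startsWith("ID "));
        registry.register("INSDseq", line -> line.startsWith("<INSD"));
        return registry;
    }

    public static void main(String[] args) {
        FormatRegistrySketch registry = withDefaults();
        System.out.println(registry.detect(">demo\nACGT"));
    }
}
```

Keeping detection rules in a map means that unsupported input simply matches nothing, which mirrors the behaviour described for unsupported uploads.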
Integration of the BioBrick Parts Registry database
The DAS protocol
The use of Dasobert
The internal database
Securing database flexibility
Providing usage flexibility
The use of the Google App Engine datastore