Goal
The ultimate goal of our program is to help experimentalists design a plasmid that behaves as required. For example, if the input to the software is an oscillating behavior, the output we envision is a DNA sequence that works as an oscillator in E. coli or another specific organism. This is still only a vision, and we have a long way to go. As a first goal, therefore, the software outputs a network that can stably reproduce the required behavior. In general, the input is the desired phenotype and, optionally, restrictions derived from other experiments or from the experimental conditions. The output is a list of networks whose phenotypes are similar to the requirement, together with their parameter values and sensitivities.
Work Flow
The flow chart is shown in Figure 1. The design process involves three layers of optimization: optimizing the parameters of a fixed mathematical model, selecting the interaction forms for a fixed topology, and comparing and screening different topologies. During parameter optimization, two score functions are considered. One is the RMSD (root mean square deviation) between the phenotype of the designed network and the requirement; the other is the sensitivity of each parameter. Because the cellular system is noisy, a network is hard to realize experimentally if some of its parameters are too sensitive. The parameter sensitivities therefore act as a filter that removes networks that do not work stably enough. After the three layers of optimization and comparison, a list of the best networks is output as the final result.
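The two score functions can be sketched as follows. This is a minimal illustration, not the software's actual C# code: the function names, the finite-difference definition of sensitivity (change in RMSD per relative change in a parameter), the threshold and the toy exponential-decay model are all assumptions made for the example.

```python
import math

def rmsd(simulated, target):
    """Root mean square deviation between two equally sampled curves."""
    if len(simulated) != len(target):
        raise ValueError("curves must have the same number of samples")
    total = sum((s - t) ** 2 for s, t in zip(simulated, target))
    return math.sqrt(total / len(target))

def sensitivities(score, params, rel_step=0.01):
    """Assumed definition: change in score per relative change of each parameter,
    estimated by a one-sided finite difference."""
    base = score(params)
    result = []
    for i, p in enumerate(params):
        bumped = list(params)
        bumped[i] = p * (1.0 + rel_step)  # perturb one parameter by 1 %
        result.append(abs(score(bumped) - base) / rel_step)
    return result

def is_robust(score, params, threshold):
    """Filter: keep a network only if no parameter is too sensitive."""
    return max(sensitivities(score, params)) <= threshold

# Toy usage: the "requirement" is an exponential decay sampled at 20 time points.
times = [0.5 * i for i in range(20)]
target = [2.0 * math.exp(-0.5 * t) for t in times]

def score(params):
    a, k = params  # hypothetical model parameters
    return rmsd([a * math.exp(-k * t) for t in times], target)
```

A call such as `is_robust(score, [2.0, 0.5], threshold=5.0)` would then decide whether the candidate passes the filter; a suitable threshold depends on the system and is not specified here.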
Platform
A user-friendly network-design platform for experimentalists is implemented in our software in C#. The interface is shown in Figure 2. Users can input the requirement curves by uploading a data file; picture files of curves are also supported. A network can also be designed manually by drawing the species and the interactions. The phenotypes of the designed networks are shown as graphs, so users can directly see how the results perform and how far they deviate from the requirement.
Algorithm
Particle Swarm Optimization algorithm
The particle swarm optimization (PSO) algorithm is employed to optimize the parameters of a fixed mathematical model. Over the past several years, PSO has been successfully applied to parameter optimization for non-linear systems, and it has been reported to reach good results faster and at lower computational cost than other methods. The most important reason we chose to implement this algorithm in our software is that it is easy to parallelize. The most time-consuming part of our scheme is the optimization of parameters for a given topological structure, and without an efficient optimizer it is impossible to deal with systems that contain more than five or six nodes. Parallelization of the optimization process will be implemented in our next version.
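For readers unfamiliar with PSO, a minimal global-best variant is sketched below. It is a textbook form written for illustration, not the implementation in our software; the function name `pso`, the inertia and acceleration coefficients, and the quadratic test function are assumptions. The scores of all particles are evaluated in one step per iteration, which is the step that could be distributed over several processors.

```python
import random

def pso(score, bounds, n_particles=20, n_iter=200, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Minimize score(params) inside box bounds with a basic global-best PSO."""
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    best_pos = [p[:] for p in pos]          # best position seen by each particle
    best_val = [score(p) for p in pos]
    g = min(range(n_particles), key=lambda i: best_val[i])
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iter):
        # Move every particle toward its own best and the swarm's best position.
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[i][d] = (w * vel[i][d]
                             + c1 * r1 * (best_pos[i][d] - pos[i][d])
                             + c2 * r2 * (gbest_pos[d] - pos[i][d]))
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))  # stay in bounds
        # Independent evaluations: this list could be computed in parallel.
        vals = [score(p) for p in pos]
        for i, val in enumerate(vals):
            if val < best_val[i]:
                best_pos[i], best_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest_pos, gbest_val = pos[i][:], val
    return gbest_pos, gbest_val

# Toy usage: recover the minimum of a quadratic bowl centred at (1, -2).
def bowl(p):
    return (p[0] - 1.0) ** 2 + (p[1] + 2.0) ** 2

found, value = pso(bowl, bounds=[(-5.0, 5.0), (-5.0, 5.0)])
```

In the design process described here, `score` would be the RMSD between the simulated phenotype and the requirement instead of the toy function.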
Genetic Algorithm
The genetic algorithm (GA) is employed to search for the best topologies and the best interaction forms. It is a powerful method for complex optimization problems that, in essence, carries out an evolutionary process in a computer. Under a fitness function, the members of the population improve from generation to generation, and through intraspecific competition the population adapts better and better to the selection pressure. It suits our problem because it converges within a moderate number of generations and yields a population of good topologies, rather than the single solution returned by other algorithms.
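The idea can be illustrated with a small sketch. Everything specific in it is an assumption made for the example: the encoding of a topology as a tuple of edge types (-1 for repression, 0 for no interaction, 1 for activation), the tournament selection, uniform crossover and mutation rate, and the toy fitness function. In the real design process the fitness of a topology would come from the parameter optimization, not from a comparison with a known answer.

```python
import random

EDGE_TYPES = (-1, 0, 1)  # assumed encoding: repression, no interaction, activation

def evolve(fitness, n_edges, pop_size=30, n_gen=60, mut_rate=0.1, n_elite=2, seed=0):
    """Evolve edge-type tuples; lower fitness is better. Returns the final
    population sorted from best to worst, so several good topologies survive."""
    rng = random.Random(seed)
    pop = [tuple(rng.choice(EDGE_TYPES) for _ in range(n_edges))
           for _ in range(pop_size)]
    for _ in range(n_gen):
        ranked = sorted(pop, key=fitness)
        nxt = ranked[:n_elite]                     # elitism: keep the best as they are
        while len(nxt) < pop_size:
            # Tournament selection: the fitter of three random members wins.
            a = min(rng.sample(ranked, 3), key=fitness)
            b = min(rng.sample(ranked, 3), key=fitness)
            # Uniform crossover followed by per-edge mutation.
            child = [x if rng.random() < 0.5 else y for x, y in zip(a, b)]
            child = [rng.choice(EDGE_TYPES) if rng.random() < mut_rate else e
                     for e in child]
            nxt.append(tuple(child))
        pop = nxt
    return sorted(pop, key=fitness)

# Toy usage: a 3-node network has 3 x 3 = 9 possible directed interactions.
# The hidden target below is a stand-in for "the topology that fits best".
hidden = (1, -1, 0, 0, 1, -1, -1, 0, 1)

def mismatches(topology):
    return sum(1 for e, h in zip(topology, hidden) if e != h)

final_population = evolve(mismatches, n_edges=9)
```

The first few entries of `final_population` play the role of the list of best topologies that is passed on for comparison.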
Future
This is just the first step, and we still have a lot to do to reach the final goal. First, links should be established between the interaction forms and real biological components, such as promoters, proteins, ligands and so on. We are trying to build a database to construct these links, but the experimental data available now are far from enough, and there are still some problems in the measurement of the parameters. Second, the optimization space is too large to search quickly, so our program has to run for a long time to finish the whole job. Parallel computation is well suited to this task, and we will use it for the optimization in the next version. Third, an online version is also needed, as it will be more convenient for users.