Team:USTC Software/Algorithm


Genetic Algorithm (GA)

Introduction to GA

What is GA?

Genetic algorithms are powerful methods for complex optimization problems; they are, in essence, evolution in a computer. A population of random candidate solutions is generated and scored. These solutions are allowed to mate with each other, generating offspring that are hybrids of both parental solutions. The mating frequencies are governed by a Boltzmann-like probability term; as in simulated annealing, the simulation temperature used to calculate these probabilities is high early in the run and cools down over time. Since the mating probability of a given candidate solution is related to its score, better solutions mate more frequently, making their components more numerous in the offspring. This new population of candidate solutions is then scored, ranked, and allowed to mate in turn.
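
As an illustration, such a Boltzmann-like mating probability might be computed as in the minimal C sketch below; the function and variable names are our own, and lower scores are assumed to be better.

#include <math.h>

/* Sketch of a Boltzmann-like mating probability: score[i] plays the
 * role of an energy (lower is better), and the temperature T is high
 * early in the run and cools down, first flattening and then
 * sharpening the distribution. All names here are illustrative. */
void mating_probabilities(const double *score, double *prob, int pop_size, double T)
{
	double sum = 0.0;
	for (int i = 0; i < pop_size; i++) {
		prob[i] = exp(-score[i] / T); /* Boltzmann weight */
		sum += prob[i];
	}
	for (int i = 0; i < pop_size; i++)
		prob[i] /= sum;               /* normalize so the probabilities sum to 1 */
}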

Why GA?

A genetic algorithm (GA) is used to identify the topology of the network. A rate term can be decomposed into two parts: a signal and a function type. The former indicates the positive or negative interaction from reactants to products, which corresponds closely to the SBGN entity relationship diagram; the latter reflects the type of biochemical reaction. A GA searches for an optimized solution by mimicking the process of evolution: members of an initial population compete for the privilege to mate and produce offspring for successive generations, and the probability that an individual mates is proportional to its fitness. Generations of progressively better fitness are thus obtained.

Since the topology, the interaction forms, and the parameters of each component function are all needed in the final result, we employ a multi-level optimization. The lowest level finds suitable parameters once the topology and the interaction forms are known. The next level searches for the interaction forms of the component functions, given a topology, with a genetic algorithm. Finally, at the top level, candidate topologies are selected, again with a genetic algorithm. The fitness function at each level is the similarity between the dynamic behavior and the target, measured by the standard error.

Optimization Process of GA

  • Step 1: Randomly generate a population of topologies.
  • Step 2: For each topology, generate a population of networks with a series of random interaction forms.
  • Step 3: For each network, obtain a fitness score by optimizing the parameters.
  • Step 4: Obtain the fitness score of a topology by searching for its most favorable network.
  • Step 5: Evolve the topologies according to their fitness scores (see the sketch below).
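
The five steps correspond to three nested loops. The following is a structural sketch in C; every function name is a hypothetical placeholder, with the real implementation plugging in the model simulation and the GA operators.

/* Stub for the lowest level: fit parameters for one (topology, form)
 * pair and return the standard error against the target behavior. */
static double optimize_parameters(int topology, int form)
{
	return 1.0; /* placeholder score */
}

/* Step 4: a topology's fitness is that of its most favorable network. */
static double topology_fitness(int topology, int n_forms)
{
	double best = 1e300; /* smaller error = better fit */
	for (int form = 0; form < n_forms; form++) {
		double err = optimize_parameters(topology, form); /* Steps 2-3 */
		if (err < best)
			best = err;
	}
	return best;
}

/* Steps 1 and 5: score each topology in the population, then evolve. */
static void evolve_topologies(int n_topologies, int n_forms, int max_cycle)
{
	for (int gen = 0; gen < max_cycle; gen++) {
		for (int t = 0; t < n_topologies; t++) {
			double fitness = topology_fitness(t, n_forms);
			(void)fitness; /* Step 5: select/mate/mutate by fitness */
		}
	}
}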

Parameters for GA

  • Population: the number of members in one generation
  • Mutation ratio: the probability that each gene alters its state in a generation
  • Recombination ratio: the probability that a gene is exchanged with its allele during mating
  • Max_cycle: the maximum number of generations considered for the evolution
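
These settings map naturally onto a small struct, in the same style as the data structures further below; the field names here are our own.

/* Hypothetical container for the GA settings listed above. */
struct ga_params {
	int    population;      /* members in one generation                     */
	double mutation_ratio;  /* chance each gene alters its state per gen.    */
	double recomb_ratio;    /* chance a gene swaps with its allele in mating */
	int    max_cycle;       /* maximum number of generations                 */
};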



Particle Swarm Optimization Algorithm (PSO)

Introduction to PSO

What is PSO?

Particle Swarm Optimization (PSO) is an optimization algorithm based on swarm intelligence theory. Motivated by evolution in nature, a series of evolutionary computation techniques, such as evolutionary programming, genetic algorithms, and evolution strategies, have been proposed, in which a best solution is evolved through the generations. In contrast to these evolutionary computation techniques, Eberhart and Kennedy developed a different algorithm by simulating the social behavior of bird flocking and fish schooling. In a particle swarm optimizer, instead of using genetic operators, individuals in the swarm evolve through cooperation and competition. Each particle adjusts its flight according to its own flying experience and that of its companions. The position of each particle represents a potential solution to the problem.

The following excerpt from "http://www.swarmintelligence.org/" explains clearly how PSO works: "Each particle keeps track of its coordinates in the problem space which are associated with the best solution (fitness) it has achieved so far. (The fitness value is also stored.) This value is called pbest. Another "best" value that is tracked by the particle swarm optimizer is the best value obtained so far by any particle in the neighbors of the particle. This location is called lbest. When a particle takes all the population as its topological neighbors, the best value is a global best and is called gbest.

The particle swarm optimization concept consists of, at each time step, changing the velocity of (accelerating) each particle toward its pbest and lbest locations (local version of PSO). Acceleration is weighted by a random term, with separate random numbers being generated for acceleration toward pbest and lbest locations."

Why PSO?

In the past several years, PSO has been successfully applied in many research and application areas, and it has been demonstrated that PSO obtains better results faster and more cheaply than many other methods. Compared to a GA, PSO is easier to implement and has fewer parameters to adjust. The most important reason we chose to implement this algorithm in our software is that it is easy to parallelize: the most time-consuming part of our scheme is the optimization of parameters for a given topological structure, and without an efficient optimizer it would be impossible to deal with systems containing more than five or six nodes. Parallelization of the optimization process will be implemented in our next version.

Basic Formulas of PSO

First we must be clear with several concepts:

  • particle: has a position and a velocity, and flies through parameter space
  • velocity: the velocity of a particle in the parameter space
  • pbest: the best solution a particle has achieved so far
  • lbest: the best solution obtained so far by any particle in the neighborhood of the particle
  • gbest: the best solution achieved so far by any particle in the whole population
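
One possible memory layout for a particle, matching the concepts above; DIM and the field names are assumptions, not our actual code.

/* Hypothetical layout of one particle; DIM is the number of
 * parameters being optimized. */
#define DIM 10

struct particle {
	double position[DIM]; /* current candidate solution            */
	double velocity[DIM]; /* current velocity on each dimension    */
	double pbest[DIM];    /* best position this particle has found */
	double pbest_score;   /* fitness value stored at pbest         */
};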

Parameters used in the PSO algorithm:

  • number of particles
  • learning factors <math>c_{1}</math> and <math>c_{2}</math>
  • inertia factor <math>w</math>
  • range of the parameters and of the velocity on each dimension
  • optimization threshold

The particles update their velocities and positions with the following equations:

<math>v[] = w*v[] + c_{1}*rand()*(pbest[] - position[]) + c_{2}*rand()*(gbest[] - position[])</math>   (1)

<math>position[] = position[] + v[]</math>   (2)

where <math>c_{1}</math> and <math>c_{2}</math> are the two learning factors; we choose <math>c_{1}=c_{2}=2</math>. Particles' velocities on each dimension are clamped to a maximum velocity <math>V_{\max }</math>: if the sum of accelerations would cause the velocity on a dimension to exceed <math>V_{\max }</math>, the velocity on that dimension is limited to <math>V_{\max }</math>. If a particle's position goes beyond the parameter range, we set the parameter on that dimension to the boundary value and flip the sign of the velocity on that dimension, simulating a reflection.
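
A minimal C sketch of one update step implementing equations (1) and (2), together with the velocity clamping and boundary reflection just described; DIM, rand01, and the bound arrays are assumptions, not our actual code.

#include <stdlib.h>

#define DIM 10 /* number of optimized parameters (assumed, as in the particle sketch above) */

static double rand01(void) /* uniform random number in [0, 1] */
{
	return (double)rand() / RAND_MAX;
}

/* One PSO step for a single particle, following equations (1) and (2);
 * c1 = c2 = 2 as in the text, and vmax/lo/hi are per-dimension bounds. */
void pso_update(double *pos, double *vel,
                const double *pbest, const double *gbest,
                double w, double c1, double c2,
                const double *vmax, const double *lo, const double *hi)
{
	for (int d = 0; d < DIM; d++) {
		/* Equation (1): inertia + cognitive + social terms */
		vel[d] = w * vel[d]
		       + c1 * rand01() * (pbest[d] - pos[d])
		       + c2 * rand01() * (gbest[d] - pos[d]);

		/* Clamp the velocity to +/- vmax on this dimension */
		if (vel[d] >  vmax[d]) vel[d] =  vmax[d];
		if (vel[d] < -vmax[d]) vel[d] = -vmax[d];

		/* Equation (2): move the particle */
		pos[d] += vel[d];

		/* Reflect at the parameter boundary: clip the position
		 * and flip the velocity sign */
		if (pos[d] < lo[d]) { pos[d] = lo[d]; vel[d] = -vel[d]; }
		if (pos[d] > hi[d]) { pos[d] = hi[d]; vel[d] = -vel[d]; }
	}
}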

Performance of PSO

In contrast with simulated annealing, the other method we used to optimize parameters, PSO performs much better in both efficiency and accuracy. In particular, PSO copes well with very large parameter spaces because it is a global optimization algorithm.

Reference

  • Eberhart, R. C. and Kennedy, J. (1995). A New Optimizer Using Particle Swarm Theory. Proc. Sixth International Symposium on Micro Machine and Human Science.

Data Structure and Organization

#define MAXLEN 256 /* maximum length of the note field (value assumed) */

struct reactor{
	int rnum;//number of reactants
	char **ract;//names of reactants
};

struct product{
	int pnum;//number of products
	char **prdt;//names of products
};

struct reaction{
	int num;//reaction number
	struct reactor RACT;//info of reactants
	struct product PRDT;//info of products
	int type;//reaction type as listed in keneticlaw.pdf
	double *para;//all parameters
	char note[MAXLEN];//self-defined notes
};
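
As a usage illustration, one reaction A + B -> C could be filled in by hand as below; the species names, type code, and parameter values are invented, and the structs (and MAXLEN) above are assumed to be in scope.

#include <string.h>

/* Illustrative only: build one toy reaction A + B -> C. */
static char *example_reactants[] = { "A", "B" };
static char *example_products[]  = { "C" };
static double example_params[]   = { 0.5, 1.2 }; /* invented rate constants */

struct reaction make_example_reaction(void)
{
	struct reaction r;
	r.num       = 1;                 /* reaction number        */
	r.RACT.rnum = 2;                 /* two reactants          */
	r.RACT.ract = example_reactants;
	r.PRDT.pnum = 1;                 /* one product            */
	r.PRDT.prdt = example_products;
	r.type      = 1;                 /* hypothetical type code */
	r.para      = example_params;
	strncpy(r.note, "A + B -> C (toy example)", MAXLEN - 1);
	r.note[MAXLEN - 1] = '\0';
	return r;
}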


Global Sensitivity

Local Sensitivity