Team:USTC Software/hoWAlgorithm

{|-
 * valign = "top"|

=Genetic Algorithm (GA)=
 * valign = "top" align = "justify" border = "0" width = "600px" bordercolor = "#F3F3F3"|

Introduction to GA
''' What's GA

Genetic algorithms are powerful methods for complex optimization problems. They are essentially evolution in a computer. A population of GA population random candidate solutions is generated and scored. These solutions are allowed to mate with each other, generating offspring that are hybrids of both parental solutions. The mating frequencies are related to a boltzmann-like probability term; akin to simulated annealing, the simulation temperature required for calculating these is high early in the run, and cools down. Since the mating probability for a given candidate solution is related to its score, better solutions mate more frequently, making their components more numerous in the offspring. This new population of candidate solutions is scored, ranked, and allowed to mate.

''' Why GA?

Genetic algorithm (GA) is utilized to identify the topology of the network. A rate term can be described as two parts: signal and function type. The former usually indicates the positive and negative interaction from reactants to products which strongly correlate with the SBGN entity relationship diagram. The later reflects the biochemical reaction type. GA searches an optimized solution by mimicking the process of evolution. Members of an initial population compete for the privilege to mate and produce offspring for successive generations. The probability of an individual mating is proportional to the fitness of the environment. The generation with better fitness will be obtained generation by generation. As the topology, the interaction forms and the parameters for each component function are all necessary in the final result, we employed a multi-level optimization. The lowest level optimization is to give the suitable parameters with the topology and the interaction forms known. For the next level, we are trying to find the interactions forms of component functions with the known topology with genetic algorithm. At last, for the upper level, the potential topologies are selected also with genetic algorithm. And the fitness functions are the similarity between the dynamic behaviors and the target, which are measured by the standard error.

Optimization Processes of GA

 * Step1: Randomly generate a population of topologies.


 * Step2: For each topology, generate a population of networks with a serial of random interaction forms.


 * Step3: For each network, obtain the fitness score by optimize the parameters.


 * Step4: Obtain the fitness score of a topology by searching the most favorable network.


 * Step5: Evolutes the topologies with each fitness score.

Parameter For GA

 * Population: the number of members in one generation


 * Mutation ratio: the ratio of each gene alter its status in a generation


 * Recombine ratio: the ratio that whether a gene change with its allel in mating


 * Max_cycle: the maximum number of generations considered for the evolution

=Particle Swarm Optimization Algorithm (PSO)=

Introduction to PSO
''' What's PSO

Particle Swarm Optimization(PSO) is an optimization algorithm based on swarm intelligence theory. Motivated by the evolution of nature, a series of evolutionary computation techniques, such as evolutionary programming, genetic algorithms, evolutionary strategies, are proposed, in which a best solution is evolved through the generations. In contrast to evolutionary computation techniques, Eberhart and Kennedy developed a different algorithm through simulating social behavior of bird flocking or fish schooling. In a particle swarm optimizer, instead of using genetic operators, individual swarms are evolved through cooperation and competition. Each particle adjusts its flying according to its own flying experience and its companions' flying experience. The position of each particle represents a potential solution to a problem.

The following is excerpt from"http://www.swarmintelligence.org/", it shows clearly how PSO works: "Each particle keeps track of its coordinates in the problem space which are associated with the best solution (fitness) it has achieved so far. (The fitness value is also stored.) This value is called pbest. Another "best" value that is tracked by the particle swarm optimizer is the best value, obtained so far by any particle in the neighbors of the particle. This location is called lbest. when a particle takes all the population as its topological neighbors, the best value is a global best and is called gbest.

The particle swarm optimization concept consists of, at each time step, changing the velocity of (accelerating) each particle toward its pbest and lbest locations (local version of PSO). Acceleration is weighted by a random term, with separate random numbers being generated for acceleration toward pbest and lbest locations." (cite from the website)

''' Why PSO?

In past several years, PSO has been successfully applied in many research and application areas. It is demonstrated that PSO gets better results in a faster, cheaper way compared with other methods. Compared to GA, the advantages of PSO are that PSO is easy to implement and there are few parameters to adjust. The most important reason we choose to implement this algorithm into our software is that this algorithm is easy to realize parallelization. Since the most time-consuming part in our scheme is the optimization of parameters for a given topological structure. If we cannot find a efficient optimizer, it is impossible to deal with systems contains more than five or six nodes. parallelization of the optimization process will be implemented in our next version.

Basic Formulas of PSO
First we must be clear with several concepts:


 * particle:have a position and a velocity, fly through parameter space


 * velocity: velocity of a particle in the parameter space


 * pbest: The best solution a particle has achieved so far


 * lbest: Best solution obtained so far by any particle in the neighbors of the particle


 * gbest: The best solution achieved by all particles so far

Parameters used in PSO algorithm:


 * Number of Particles:


 * learning factors $$c_{1}$$, $$c_{2}$$


 * inertial factor $$w$$


 * range of parameter and velocity on each dimension


 * optimization threshold

The particles update their positions and velocities with following equation:

$$v[] = w*v[] + c_{1}*rand*(pbest[]-postion[]) + c_{2}*rand*(gbest[] - position[])$$

$$position[] = position[] + v[]$$

where $$c_{1}$$ and $$c_{2}$$ are two learning factors, we choose $$c_{1}=c_{2}=2$$. Particles' velocities on each dimension are clamped to a maximum velocity $$V_{\max }$$. If the sum of accelerations would cause the velocity on that dimension to exceed $$V_{\max }$$. Then the velocity on that dimension is limited to $$V_{\max }$$. If a particle's position go beyond the range of parameter, we will set the parameter on that dimension to the boundary condition and let the sign of velocity on that dimension change meanwhile in order to simulate reflection.

Performance of PSO
In contrast with simulated annealing, another method we used to optimize parameters, PSO performs much better both on the efficiency and accuracy. Specifically, PSO is good at dealing with very large parameters because it is a global optimization algorithm.

Reference

 * http://www.swarmintelligence.org/


 * Eberhart, R.C. and Kennedy, J (1995). A new optimizer Using Particle Swarm Theory, Proc. Sixth International Symposium on Micro Machine and Human Science

=Global Sensitivity Analysis (GSA)=

A general objective in our modeling process is an exploration of the high-dimensional input variable space as thoroughly as possible for its impact on observable system behavior, often with either optimization in mind or simply for achieving a better understanding of the phenomena involved. Since the system input->output behavior is typically a nonlinear relationship, simple logic suggests that the number of runs could grow exponentially with the number of input variables. However, an emerging family of high dimensional model representation concepts and techniques capable of dealing with such input->output problems in a practical fashion. RS-HDMR is a global sensitivity analysis technique that can decompose the high-dimensional, nonlinear contributions of each parameter to the network properties(represented by their total sensitivity) into a hierarchy of low-dimensional terms. Calculations by RS-HDMR require only the ODE model, an estimate of the initial conditions and an estimate of the dynamic range to explore for each parameter.

Introduction to GSA
''' What's Sensitivity Analysis? The sensitivity of a system to its parameters may be determined using either global or local methods. We implement both of them into our code. Specifically, the global sensitivity analysis employed Random Sampling-High Dimensional Model Representation(RS-HDMR) algorithm to estimate the sensitivities of the circuit properties with respect to the circuit model parameters, such as rate constants, without knowing the precise parameter values.

''' Why Sensitivity Analysis? The models we use to describe a gene network are systems of ordinary differential equations. In the case of ODE models, robustness is generally studied in the context of parametric uncertainty via properties such as robust stability and response to perturbation. Although there is no standard quantitative measure of robustness, a common thread in the literature relates robustness to parametric sensitivity. Parameters whose perturbation causes undesirable changes (either qualitative or quantitative) in some output identify processes with fragility The sensitivity information can then guide the selection of the optimal mutation targets and thereby reduce the laboratory effort.

Theoretical Foundations of GSA
High dimensional model representation(HDMR) is a general set of quantitative model assessment and analysis tools for capturing high dimensional IO system behavior. As the impact of the multiple input variables on the output can be independent and cooperative, it is natural to express the model output $$f(x)$$ as a finite hierarchical correlated function expansion in terms of the input variables:

$$f(x)=f_{0}+\sum_{i=1}^{n}f_{i}(x_{i})+\sum_{1\leq i<j\leq n} f_{ij}(x_{i}, x_{j})+ \sum_{1\leq i<j<k\leq n}f_{ijk}(x_{i},x_{j},x_{k})+...+\sum_{1\leq i_{1}<...<i_{l}\leq n}f_{i_{1}i_{2}...i_{l}}(x_{i1},x_{i2},...,x_{il}) +...+f_{12...n}(x_{1},x_{2},...,x_{n})$$

Where the zeroth-order component function $$f_{0}$$ is a constant representing the mean response to $$f(x)$$, and the first order component function $$f_{i}(x_{i})$$  gives the independent contribution to $$f(x)$$ by the ith input variable acting alone, the second order component function $$f_{ij}(x_{i},x_{j})$$ gives the pair correlated contribution to $$f(x)$$ by the input variables $$x_{i}$$ and $$x_{j}$$, etc. The last term contains any residual nth order correlated contribution of all input variables. The basic conjecture underlying HDMR is that the component functions arising in typical real problems are likely to exhibit only low order l cooperativity among the input variables such that the significant terms in the HDMR expansion are expected to satisfy the relation: $$l<>l$$ . An HDMR expansion to second order

$$f(x)=f_{0}+\sum_{i=1}^{n}f_{i}(x_{i})+\sum_{1\leq i<j\leq n} f_{ij}(x_{i}, x_{j})$$

often provides a satisfactory description of $$f(x)$$ for many high dimensional systems when the input variables are properly chosen. This is also the formula we use to achieve HDMR expansion. How to determine HDMR component functions? Practical formulations of the HDMR component functions determine whether we can carry out this algorithm. Theoretically, a component function

$$f_{i_{1}i_{2}...i_{l}}(x_{i1},x_{i2},...,x_{il})$$

is obtained by minimizing the functional,

$$\min_{f_{i_{1}i_{2}...i_{l}}(u_{i1},u_{i2},...,u_{il}}\int_{\Omega}\omega_{i_{1}i_{2}...i_{l}}(\hat{x},u) [f(u)-f_{0}-\sum_{i=1}^{n}f_{i}(u_{i})-\sum_{1\leq i<j\leq n} f_{ij}(u_{i}, u_{j})-...- \sum_{1\leq i_{1}<...<i_{l}\leq n}f_{i_{1}i_{2}...i_{l}}(u_{i1},u_{i2},...,u_{il})]^{2}du$$

under a suitable specified orhogonality condition which guarantees that all the component functions are determined step by step. Here,$$\hat{x}=(x_{i_{1}},x_{i_{2}},...,x_{i_{n}}$$, $$\omega_{i_{1}i_{2}...i_{l}}(\hat{x},u)$$ is a weight function. Different weight functions will produce distinct, but formally equivalent HDMR expansions. There are two commonly used HDMR expansions: Cut- and RS(Random Sampling)-HDMR. Cut-HDMR expresses $$f(x)$$  in reference to a specified cut point, while RS-HDMR depends on the average value of $$f(x)$$ over the whole domain $$\Omega$$ . We adopt RS-HDMR to do global sensitivity analysis. For RS-HDMR, we first rescale variables $$x_{i}$$ such that $$0<x_{i}<1$$ for all i. The output function $$f(x)$$  is then defined in the unit hypercube

$$K^{n}={(x_{1},x_{2},...x_{n})|0<\leq x_{i}\leq 1, i=1,2,...,n}$$

by suitable transformations. The component functions of RS-HDMR possess the following forms:

$$f_{0}=\int_{K^{n}}f(u)du$$

$$f_{i}(x_{i})=\int_{K^{n-1}}f(x_{i},u^{i})du^{i}-f_{0}$$

$$f_{ij}(x_{i},x_{j})=\int_{K^{n-2}}f(x_{i},x_{j},u^{ij})du^{ij}-f_{i}(x_{i})-f_{j}(x_{j})-f_{0}$$

Where $$du^{i}$$ and $$du^{ij}$$  are just the product $$du_{1}du_{2}...du_{n}$$ without $$du_{i}$$ and $$du_{ij}$$ respectively. Evaluation of the high dimensional integrals in the RS-HDMR expansion is carried out by the Monte Carlo random sampling. Orthogonality conditions are used to obtain the above formulas. The integral of a component function of RS-HDMR with respect to any of its own variables is zero, i.e.,

$$\int_{0}^{1}f_{i_{1}i_{2}...i_{n}}(x_{i_{1}},x_{i_{2}},...,x_{i_{l}})dx_{s}=0, s\in{i_{1},i_{2},...,i_{l}}$$

Using the orthobonality property of the RS-HDMR component functions, it can be proven that the total variance of $$f(x)$$ caused by all input variables sampled uniformly over their full range may be decomposed in to distinct input contributions in the following manner.

$$\begin{align} \sigma_{\bar{f}}^{2} &= \int_{K^{n}}[f(x)-\bar{f}]^{2}dx \\ &= \sum_{i=1}^{n}\int_{0}^{1}f_{i}^{2}(x_{i})dx_{i}+\sum_{1\leq i< j \leq n} \int_{0}^{1}\int_{0}^{1}f_{ij}^{2}(x_{i},x_{j})dx_{i}dx_{j}+... \\ &= \sum_{i=1}^{n}\sigma_{i}^{2}+\sum_{1\leq i< j \leq n}\sigma_{ij}^{2}+... \\ \end{align}\,\!$$

Thus, the total variance $$\sigma_{\bar{f}}^{2}$$ is the sum of first-order variances $$\sigma_{i}^{2}$$ , second-order covariances $$\sigma_{ij}^{2}$$, etc. The magnitudes of the indices $$\sigma_{i}^{2}$$, $$\sigma_{ij}^{2}$$ reveal how the output uncertainty is influenced by the input uncertainties and the nature of the cooperativities that exist. The direct determination of the component functions of RS-HDMR at different values in the unit hypercube by Monte Carlo integration can require a large number of random samples. To reduce the sampling effort, the RS-HDMR component functions may be approximated to any desired level of accuracy by either analytical approximation or numerical approximation. In our code, we implement the former approximation, which approximate the RS-HDMR component function by expansion in terms of a suitable set of functions. Specifically, orthogonal polynomials \cite{paper7} are used as a basis to approximate $$f_{i}(x_{i})$$ , $$f_{ij}(x_{i},x_{j})$$ as follows:

$$f_{i}(x_{i})\sum_{k=1}^{\infty}\alpha_{k}^{i}\phi_{k}(x_{i})$$

$$f_{ij}(x_{i},x_{j})=\sum_{k,l=1}^{\infty}\beta_{kl}^{ij}\phi_{k}(x_{i})\phi_{l}(x_{j})$$

$$...$$

In most cases, to achieve a desired accuracy using $$\phi_{1}(x)$$ ,$$\phi_{2}(x)$$ and $$\phi_{3}(x)$$ is sufficient. Description of the use of orthonormal polynomials in details can be found in ref\cite{paper7}, we only give some results here.

$$\begin{align} \sigma_{i}^{2} &= \int_{0}^{1}f_{i}^{2}(x_{i})dx_{i}\approx \int_{0}^{1}[\sum_{k=1}^{s_{1}}\alpha_{k}^{i}\phi_{k}(x_{i})]dx_{i} \\ &= \sum_{k=1}^{s_{l}}(\alpha_{k}^{i})^{2} \\ \end{align}\,\!$$

$$\begin{align} \sigma_{ij}^{2} &= \int_{0}^{1}\int_{0}^{1}f_{ij}^{2}(x_{i},x_{j})dx_{i}dx_{j} \\ &\approx \int_{0}^{1}\int_{0}^{1}[\sum_{k=1}^{s_{1}}\sum_{k=1}^{s_{1}'}\beta_{kl}^{ij}\phi_{k}(x_{i})\phi_{l}(x_{j})]^{2}dx_{i}dx_{j} \\ &= \sum_{k=1}^{s_{1}}\sum_{k=1}^{s_{1}'}(\beta_{kl}^{ij})^{2}\\ \end{align}\,\!$$

Only one random sampling set of $$f(x)$$ is needed to determine $$f_{0}$$  and all the expansion coefficients and consequently,$$\sigma_{\bar{f}}^{2}$$, $$\sigma_{i}^{2}$$, $$\sigma_{ij}^{2}$$. This dramatically reduces the sampling effort and makes global sensitivity analysis of a model very efficient.

Reference

 * 1 http://en.wikipedia.org/wiki/Sensitivity_analysis
 * 2 J.Kim IET Syst. Biol. Vol.153, no.3 pp.96-104 2006
 * 3 Science vol.101 no.36, pp.13210-5
 * 4 H. El-Samad et al, Proc. IEEE, vol.94, no.4, pp.832-853m 2006
 * 5 J.Stelling et al. Cell, vol.118, no.6, pp.675-85 2004
 * 6 H. Kitano, ”Biological robustness” Nat. Rev. Genet. Vol.5, no.11 pp.826-37,2004
 * 7 M. Morohashi et al. J. Theor. Biol. Vol.216, pp 19-30 2002

=Local Sensitivity Analysis (LSA)=

Besides global sensitivity analysis, we also implement additional local sensitivity analysis technique, which can measure time-varying sensitivities along arbitrary trajectories. For local methods, the basic measures of sensitivity are the partial derivatives.

$$\frac{\partial(Model Responses)}{\partial (Model Parameters)}$$,

called the sensitivity coefficients of the model\cite{paper8}. We adopt the method proposed by \cite{paper9}, which can determine the sensitivity to the perturbation throughout the time evolution of the system, regardless of the nature of the trajectory. The network is modeled by the ordinary differential equation

$$\frac{d}{dt}\mathbf{s}(t)=N\mathbf{v}(\mathbf{s}(t),\mathbf{p},t)$$

where the n-vector s is composed of the concentrations of each species, the constant r-vector p is composed of the parameters of interest in the model, the m-vector valued function

$$\mathbf{v}=\mathbf{v}(s, p, t)$$

describes the rate of each reaction as function of species concentrations and parameter values.

Introduction to LSA
''' What's Sensitivity Analysis? The sensitivity of a system to its parameters may be determined using either global or local methods. We implement both of them into our code. Specifically, the global sensitivity analysis employed Random Sampling-High Dimensional Model Representation(RS-HDMR) algorithm to estimate the sensitivities of the circuit properties with respect to the circuit model parameters, such as rate constants, without knowing the precise parameter values.

''' Why Sensitivity Analysis? The models we use to describe a gene network are systems of ordinary differential equations. In the case of ODE models, robustness is generally studied in the context of parametric uncertainty via properties such as robust stability and response to perturbation. Although there is no standard quantitative measure of robustness a common thread in the literature relates robustness to parametric sensitivity. Parameters whose perturbation causes undesirable changes (either qualitative or quantitative) in some output identify processes with fragility The sensitivity information can then guide the selection of the optimal mutation targets and thereby reduce the laboratory effort.

Basic Definitions
Definition 1 Given an initial condition $$s(0)=s_{0}$$ and a set of parameter values $$p_{0}$$, which together for a vector $$q_{0}$$ ,the time-varying concentration sensitivity coefficients(or concentration response coefficients) are define as the elements of the $$n\times(n+r)$$ matrix function $$R_{q}^{s}(t)$$ given by

$$R_{q}^{s}(t):=\frac{\partial s(t,q)}{\partial q}|_{q=q_{0}}=\lim_{\Delta q\rightarrow0} \frac{s(t,q_{0}+\Delta q)-s(t,q_{0})}{\Delta q}$$

for all $$t\geq0$$ The response coefficient defined above provides a measure of the difference between this "perturbed trajectory" and the "nominal" trajectory at each time t. Definition 2 Given an initial condition $$s(0)=s_{0}$$ and a set of parameter values $$p_{0}$$, which together for a vector $$q_{0}$$ ,the time-varying rate sensitivity coefficients(or rate response coefficients) as the elements of the $$m\times r$$ matrix $$R_{q}^{v}(t)$$ function given by,

$$R_{q}^{v}(t)=\frac{\partial v(s(t,q),p,t)}{\partial q}|_{q=q_{0}}$$

for all, where the derivatives are evaluated at and The rate response coefficients give the response in the rates at time t to a perturbation at time zero.

Computation
The sensitivity coefficients are defined by a first-order linear ordinary differential equation,

$$\frac{d}{dt}\frac{\partial s(t,q)}{\partial q}=N[\frac{\partial v(t)}{s}\frac{\partial s(t)}{\partial q}+ \frac{\partial v(t)}{\partial q}]$$

for all $$t \geq 0$$. The choice of $$s(0)=s_{0}$$ and $$p_{0}$$ fix a trajectory $$s(t)=s(t,s_{0},p_{0}$$, and hence determines the function $$v(t)=v(s(t),p,t)$$ as well. The equation above can be solved numerically for the concentration sensitivities. In performing such a calculation, we must note that the right-hand side depends on the time-varying value of the species concentration vector s(t), so a simple computational strategy is to determine $$s_{i}(.)$$ and $$\partial s_{i}(.)/\partial q$$ simultaneously.

Reference

 * 1 H. El-Samad et al, Proc. IEEE, vol.94, no.4, pp.832-853m 2006
 * 2 J.Stelling et al. Cell, vol.118, no.6, pp.675-85 2004; H. Kitano
 * 3 "Biological robustness" Nat. Rev. Genet. Vol.5, no.11 pp.826-37,2004
 * 4 M. Morohashi et al. J. Theor. Biol. Vol.216, pp 19-30 2002
 * 5 J.Kim IET Syst. Biol. Vol.153, no.3 pp.96-104 2006
 * 6 Science vol.101 no.36, pp.13210-5
 * 7 G. Li et al./Chemical Engineering Science 57 (2002) 4445-4460
 * 8 Jorge R. Leis et al. ACM Transactions on Mathematical Software, vol.14 No.1, March 1988., Pages 45-60
 * 9 B. P. Ingalls and Herbert M. Sauro[J. Theor. Biol. Vol.222, pp 23-36 2003
 * 10 J.Kim IET Syst. Biol. Vol.153, no.3 pp.96-104 2006
 * valign = "top"|


 * }
 * }