Team:Warsaw/Models evaluation

From 2009.igem.org

Methods for evaluating model correctness

Ramachandran plot

A Ramachandran plot is a convenient way to depict the backbone dihedral angles phi against psi for the amino acid residues of a protein structure. Since these angles are mainly responsible for the protein conformation, the plot indirectly reveals the local geometry of the polypeptide chain. If the analysed structure has a large number of residues whose phi and psi angles take unexpected values, this suggests that the structure might be incorrect.
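As an illustration of how the plotted values can be obtained, the sketch below computes a signed dihedral angle from four atom positions and applies it to the backbone atoms. It assumes the usual definitions: phi is the torsion C(i-1), N(i), CA(i), C(i) and psi is the torsion N(i), CA(i), C(i), N(i+1). The function names `dihedral` and `phi_psi` are hypothetical helpers introduced here and are not part of any tool used on this page.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees (between -180 and 180) for four points."""
    b0 = _sub(p1, p0)  # first bond vector
    b1 = _sub(p2, p1)  # central bond vector (the rotation axis)
    b2 = _sub(p3, p2)  # last bond vector
    n1 = _cross(b0, b1)  # normal of the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal of the plane through p1, p2, p3
    # atan2 form of the torsion angle: robust near 0 and 180 degrees
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def phi_psi(c_prev, n, ca, c, n_next):
    """Hypothetical helper: (phi, psi) of one residue from backbone atom coordinates."""
    phi = dihedral(c_prev, n, ca, c)
    psi = dihedral(n, ca, c, n_next)
    return phi, psi
```

Collecting the (phi, psi) pair of every residue and drawing a scatter plot of psi against phi gives the Ramachandran plot. The first and last residue of a chain lack one of the two angles, because the neighbouring atom is missing.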

RMSD

One of the most widely accepted measures of the difference between two conformations of a molecule is the least root mean square deviation (RMSD), where "least" refers to the value obtained after optimal superposition of the two structures. To calculate the RMSD of a pair of structures, each structure must be represented as a 3N-length vector of coordinates, where N is the number of atoms. The RMSD is the square root of the average of the squared distances between corresponding atoms of the two compared structures. It is a measure of the average atomic displacement between the two conformations.
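The definition can be written down directly. The sketch below is a minimal example that assumes the two structures are already superimposed and that atoms are listed in corresponding order; the search for the optimal superposition (for instance with the Kabsch algorithm) is deliberately left out. The function name `rmsd` is a hypothetical helper introduced here.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD between two equally long lists of (x, y, z) atom coordinates.

    Assumption: the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("both structures must have the same number of atoms")
    if not coords_a:
        raise ValueError("at least one atom is required")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # squared distance between corresponding atoms
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

For two atoms where only the second one is displaced by 2 units, the squared distances are 0 and 4, so the RMSD is the square root of 2.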

TM-score

TM-score is a recently proposed scale for measuring the structural similarity between two structures. It was proposed to solve a problem of RMSD, namely its sensitivity to local errors. Because RMSD is an average over the distances of all residue pairs in the two structures, a local error (e.g. a misorientation of the tail) gives rise to a large RMSD value even though the global topology is correct. In TM-score, however, small distances are weighted more strongly than large distances, which makes the score insensitive to local modeling errors. The value of TM-score lies in the range [0,1]; a TM-score > 0.5 indicates a model of correct topology and a TM-score < 0.17 indicates only random similarity. These cutoffs do not depend on the protein length.

C-score

C-score is a confidence score for estimating the quality of models predicted by the I-TASSER server. It is calculated from the significance of the threading template alignments and the convergence parameters of the structure assembly simulations. C-score is typically in the range [-5,2], where a higher C-score signifies a model with higher confidence and vice versa. In a benchmark test set of 500 non-homologous proteins, C-score was found to be highly correlated with TM-score and RMSD.