Team:EPF-Lausanne/Analysis

From 2009.igem.org

Revision as of 14:56, 30 July 2009 by Guittet (Talk | contribs)


Analysis of Equilibrium




RMSD for individual residues

We aim at finding the average RMSD over time for each residue in the protein using VMD.

1. We load the .psf and .dcd files obtained after the first round of simulation.


2. In the TkCon window type:

source residue_rmsd.tcl
set sel_resid [[atomselect top "protein and alpha"] get resid]

This will get all the residues number of all alpha-carbons in the protein. Since there is just one and only one α-carbon per residue, it is a good option.


3. Now we will calculate the RMSD values of all atoms in the newly created selection:

rmsd residue over time top $sel resid

At the end of the calculation, we have a list of the avergae RMSD per residue (in the file residue_rmsd.dat)


4. In VMD, in Graphics → Representations, do the following actions:

  • create two replicates: protein and resname FMN in selected atoms
  • for the FMN, choose CPK as a drawing method
  • for the protein, choose tube as drawing method, and user as coloring method. Now click on the Trajectory tab, and in the Color Scale Data Range, type 0.40 and 1.00.
Vmd color.png

The protein is colored according to its average RMSD values. The residues displayed in blue are more mobile while the ones in red move less.

Here is a movie with the protein colored according to average RMSD values.

RMSD moving residues


5. Now we can plot the RMSD value per residue by typing in a Terminal window :

xmgrace residue rmsd.dat

We obtain the following picture:

RMSD CA per res.jpg

Back to top



Maxwell-Boltzmann Energy Distribution

Here we will confirm that the kinetic energy distribution of the atoms in a system corresponds to the Maxwell distribution for a given temperature.

1. In VMD, load the .psf file. Browse for the restart velocity file (in our case, it was 2v0w_hydr_wb_i_eq.restart.vel), the type of file need this time to be selected. In the Determine file type: pull-down menu, choose NAMD Binary Coordinates, and load again.

The molecule looks terrible! That is because VMD is reading the velocities as if they were coordinates, but how it looks doesn't matter: we just need VMD to make a file containing the masses and velocities for every atom.


2. We create a selection with all the atoms in the system, by typing in the TkCon window:

set all [atomselect top all]

Open a file energy.dat for writing:

set fil [open energy.dat w]


3. For each atom in the system, calculate the kinetic energy, and write it to a file

foreach m [$all get mass] v [$all get {x y z}] {
puts $fil [expr 0.5 * $m * [vecdot $v $v] ]
}

Close the file:

close $fil


All these steps can be avoided by typing in the Terminal Window:

vmd -dispdev text -e get energy.tcl


We now have a file of raw data that can be used to fit the obtained energy distribution to the Maxwell-Boltzmann distribution. The temperature of the distribution can be obtained from that fit.


1. In a Terminal window, type xmgrace. In the xmgrace window, choose the Data → Import → ASCI I menu item. In the Files box, select the file energy.dat. Click on the OK button. You should see a black trace which correspond to the raw data.


2. To look at the distribution of points, we will make a histogram of this data. Choose the Data → Transformations → Histograms menu item. In the Source → Set window, click on the first line, in order to make a histogram of the data just loaded.

Click on the Normalize option.

Fill in Start: 0, Stop at: 10, and # of bins: 100, and Apply.

Non linear curve fitting.jpg


3. The plot has been created, but cannot be seen yet. Use the right button of the mouse to click on the first set (in the Source → Set window). Click on Hide. Now, go to the main Grace window and click on the button labeled AS: it will resize the plot to fit the existing values. This is the distribution of energies.

Now, we will fit a Maxwell-Boltzmann distribution to this plot and find out the temperature that the distribution corresponds to.


4. Choose the Data → Transformations → Non-linear curve fitting menu item. In the Source → Set window, click on the last line, which corresponds to the histogram you created.

In the Main tab, type in the Formula window:

y = (2/ sqrt(Pi * a0 ∧3 )) * sqrt(x) * exp (-x / a0)

This will fit the curve and get a fit for the parameter a0, which corresponds to kB T in units of kcal/mol.

Boltzmann.png


5. In the Parameters drop-down menu, choose 1. You can give an initial value to A0, so that the iterations will look for a value in the vicinity of your initial guess. In the A0 window, type 0.6, which in kcal/mol corresponds to a temperature of ∼ 300K.

A0.jpg

Click on the Apply button. This will open a window with the new value for a0, as well as some statistical measures, including the Correlation Coefficient, which is a measure of the fit.

Correlation coefficient.jpg

The value of a0 obtained corresponds to kB T . Obtain the temperature T for this distribution with kB = 0.00198657 kcal /mol /K.


We obtwain the following histogramm!

Maxwell-Boltzmann Energy Distribution.jpg

Back to top


Energies

Here we will calculate the averages of the various energies (kinetic and the different internal energies: bonded (bonds, angles and dihedrals) and non-bonded (electrostatic, van der Waals)) over the course of a simulation.


1. We start with a file obtained from NAMD: http://www.ks.uiuc.edu/Research/namd/utilities/ and download namdstat.tcl


2. In the VMD TkCon window, type

source namdstats.tcl
data avg ../2v0w_hydr_wb_eq.log 101 last

The second line will call a procedure which will calculate the average for all output variables in the log file from the first logged timestep after 101 to the end of the simulation.


Temperature distribution

Objective: Simulate our protein in an NVE ensemble, and analyze the temperature distribution.

In order to obtain the data for the temperature from the log file we will again use the script namdstats.tcl, which was already sourced. Type in a terminal window:

data_time TEMP namd_log

It will store each timestep and its corresponding temperature in the file TEMP.dat.


Using EXCEL, we obtain the following graph, which represents the evolution of the temperature in function of time:

Temp(t).png

The first part corresponds the the heating, then we let the system reach an equilibrium (NPT state), a NVT portion, and finally a NPT portion again.


Pressure as a function of simulation time

In order to obtain the data for the pressure from the log file we will again use the script namdstats.tcl, which has been updated for the occasion. The file can be found here. Please rename to namdstats.tcl after download.

Here are the steps to use this script:

source namdstats.tcl
data_time DATA_NAME LOG_FILE

It will extract data from LOG_FILE, creating a DATA_NAME.dat containing values and time informations.

Available DATA_NAME can be: BOND, ANGLE, DIHED, IMPRP, ELECT, VDW, BOUNDARY, MISC, KINETIC, TOTAL, TEMP, TOTAL2, TOTAL3, TEMPAVG, PRESSURE, GPRESSURE, VOLUME, PRESSAVG, GPRESSAVG.

So, to extract pressure from our first simulation, the command is: data_time PRESSURE namd_log

Here is a small plot of pressure and temperature in function of time

1st run.jpg


RMSD of selected atoms compared to initial position along time

We made a small TCL script to calculate RMSD from selected atoms compared to their initial position along timestep. The file can be found here. Please rename to Residue_rmsd_igem09.tcl after download.

Example to run the script:

load .dcd + .psf on VMD
source residue_rmsd_igem09.tcl
set sel_resid [[atomselect top "protein and alpha"] get resid]
rmsd_residue_over_time top $sel_resid

Here is how the script process:

  • calcuate how many timesteps are in .dcd
  • for each timestep, the script aligns (best fit) the protein + FMN to the initial position to minimize RMDS. We could remove hydrogen here (TEST!)
  • for each residue (selected by sel_resid), RMSD is computed and the sum of all RMSD (one for each residue) is stored for current timestep
  • script's output is data_rmsd.dat


Remark: we should try to select only heavy atoms from protein + FMN → change set sel_resid [[atomselect top "protein and alpha"] get resid] ( I tried to keep this parameter out of the script)


Here is a fast graph of the output of the average RMSD of the atoms in function of time. It seems normal.

Rmsd.jpg

Back to top