Team:EPF-Lausanne/Modeling/Simulation

From 2009.igem.org


Creating input files for NAMD

We start with the PDB file of the protein, obtained through the Protein Data Bank.

We open VMD and load our protein.


Nomenclature

Here are the conventions we use to name files:

PROT.pdb : pdb of the protein downloaded from the internet (in capital letters)
prot.pdb : pdb after removing unwanted atoms (step 3 on this page)
protp.pdb : pdb after processing by psfgen
protp.psf : psf after processing by psfgen
protp_wb.pdb : pdb after solvating
protp_wb.psf : psf after solvating
protp_wb_i.pdb : pdb after adding ions (final version of the file)
protp_wb_i.psf : psf after adding ions (final version of the file)


Creation of the pdb file

.pdb from the web → .pdb file ready for vmd processing (removing some atoms)


After loading the .pdb downloaded from the internet as a new molecule, we open the VMD TkCon window (Extensions → Tk Console) and type the following commands:

set our_protein [atomselect top "not water and not resname GOL"]
$our_protein writepdb 2v0u.pdb

This selects everything except the water and the glycerol (resname GOL), so the protein is kept together with its cofactor. If the structure contains several chains, we can select each of them separately and write them to different .pdb files.

We have created the file 2v0u.pdb, which contains the coordinates of the protein and its cofactor, without hydrogens. Quit VMD.


Creation of the psf file

.pdb file + topology file (*.rtf) → .psf file + matching .pdb file
The .psf file contains the atom parameters derived from the topology. It is plain text, but it is not meant to be read or edited by hand.

There are 2 ways to create a psf:

  • in VMD, with Extensions → Modeling → Automatic PSF Builder (psfgen GUI) (WARNING: a bug prevents it from dealing with multiple chains!)
  • by hand: we first write a .pgn file, which is a script that drives psfgen. This is the approach described below.

Single segment:

In a Terminal window, open a text editor and create a file that you'll call 2v0w.pgn:

package require psfgen
topology top_all27_prot_lipid.inp
pdbalias residue HIS HSE
pdbalias atom ILE CD1 CD
segment U {pdb 2v0w.pdb}
coordpdb 2v0w.pdb U
guesscoord
writepdb 2v0wp.pdb
writepsf 2v0wp.psf

You'll need to place the topology file in the same directory as the .pgn file. After downloading the current topology file, please rename it to Top_all27_prot_lipid-fmc.rtf. The file name given to the topology command in the script must match the name of the file on disk. The topology file is specific to the .pdb we have; ask if you need one for dark/light FMN.

In a Terminal window, type the following command:

> vmd -dispdev text -e 2v0w.pgn

This runs psfgen on the script 2v0w.pgn and generates two files for the protein with hydrogens: 2v0wp.pdb, a new pdb file with the complete coordinates of all atoms, including H, and 2v0wp.psf, a psf file with the complete structural information of the protein.


Case of multiple segments:

This section is based on the psfgen manual.

Psfgen is not able to deal with multiple sequences within a single .pdb (even using TER and different segment names). We have to separate them, either by hand or using grep. So you should have as many .pdb files as chains (2v0u_prot.pdb, 2v0u_fmn.pdb).
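
As an alternative to grep, the separation can be sketched in a few lines of Python. This is only a minimal illustration, not the method used on this page: the function names are hypothetical, and it assumes that the cofactor is the only residue named FMN and that the file follows the standard fixed-column PDB layout (residue name in columns 18-20).

```python
def split_by_resname(pdb_lines, cofactor_resnames=("FMN",)):
    """Separate ATOM/HETATM records into (protein, cofactor) lists.

    The residue name sits in columns 18-20 of a PDB record,
    which is the Python slice [17:20].
    """
    protein, cofactor = [], []
    for line in pdb_lines:
        # Skip everything that is not a coordinate record (REMARK, TER, END...)
        if not line.startswith(("ATOM", "HETATM")):
            continue
        resname = line[17:20].strip()
        if resname in cofactor_resnames:
            cofactor.append(line)
        else:
            protein.append(line)
    return protein, cofactor


def write_split(in_path, protein_path, cofactor_path):
    """Read one .pdb file and write the two per-segment .pdb files."""
    with open(in_path) as handle:
        protein, cofactor = split_by_resname(handle.read().splitlines())
    for path, records in ((protein_path, protein), (cofactor_path, cofactor)):
        with open(path, "w") as handle:
            handle.write("\n".join(records + ["END"]) + "\n")
```

With the file names used on this page, the call would be write_split("2v0u.pdb", "2v0u_prot.pdb", "2v0u_fmn.pdb").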

The .pgn file then looks like this. The text after # or ;# is a Tcl comment that explains each command:

package require psfgen ;# loads the psfgen plugin
topology top_all27_prot_lipid-fmn_dark.inp ;# loads topology file
pdbalias residue HIS HSE ;# some aliases
pdbalias atom ILE CD1 CD ;# some aliases
# creates a chain that will contain the protein
segment 2v0u {
pdb 2v0u_prot.pdb ;# loads list of atoms
}
coordpdb 2v0u_prot.pdb 2v0u ;# loads coord. of atoms
# creates the second chain
segment fmn {
pdb 2v0u_fmn.pdb ;# loads atoms from FMN
first none ;# prevents psfgen from "capping" the first residue
last none ;# prevents psfgen from "capping" the last residue
}
coordpdb 2v0u_fmn.pdb fmn ;# loads coord. of atoms
guesscoord ;# completes missing coord.
writepdb 2v0up.pdb ;# writes .pdb containing all previously loaded chains
writepsf 2v0up.psf ;# writes .psf containing all previously loaded chains

The structure of the script is quite important, as psfgen applies terminal patches (NTER, CTER) to the ends of each loaded chain. In the case of FMN without a link to the cysteine, we have to prevent psfgen from adding atoms to cap the molecule as if it were a protein. That's the purpose of first none and last none.
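
When several segments are involved, writing the script by hand gets repetitive. The sketch below generates the text of a .pgn file in Python; the function build_pgn and its arguments are hypothetical, and it only emits the psfgen commands already used on this page (segment, pdb, first none, last none, coordpdb, guesscoord, writepdb, writepsf).

```python
def build_pgn(topology, segments, out_prefix, aliases=()):
    """Return the text of a psfgen script.

    segments: list of (segment_name, pdb_file, capped) tuples.
    capped=False adds "first none" / "last none", so that psfgen
    applies no terminal patch to that segment (e.g. for a cofactor).
    """
    lines = ["package require psfgen", f"topology {topology}"]
    lines += [f"pdbalias {alias}" for alias in aliases]
    for name, pdb_file, capped in segments:
        # Each segment is built first, then its coordinates are read
        lines.append(f"segment {name} {{")
        lines.append(f"  pdb {pdb_file}")
        if not capped:
            lines += ["  first none", "  last none"]
        lines.append("}")
        lines.append(f"coordpdb {pdb_file} {name}")
    lines += [
        "guesscoord",
        f"writepdb {out_prefix}.pdb",
        f"writepsf {out_prefix}.psf",
    ]
    return "\n".join(lines) + "\n"


script = build_pgn(
    "top_all27_prot_lipid-fmn_dark.inp",
    [("2v0u", "2v0u_prot.pdb", True), ("fmn", "2v0u_fmn.pdb", False)],
    "2v0up",
    aliases=("residue HIS HSE", "atom ILE CD1 CD"),
)
print(script)
```

The printed text can be saved as a .pgn file and run with vmd -dispdev text -e, as for the single-segment case.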


Solvating the protein

.pdb + .psf containing the protein → .pdb + .psf containing the protein and water molecules


Now, the protein needs to be solvated, i.e., put inside water, to more closely resemble the cellular environment. This will be done by placing the protein in a water box, in preparation for minimization and equilibration with periodic boundary conditions.

In the VMD Main window, open the Tk Console, and type:

package require solvate
solvate 2v0up.psf 2v0up.pdb -t 12 -o 2v0up_wb

The solvate package puts the protein in a box of water. The -t option sets the water box dimensions such that there is a layer of water of 12 Å in each direction beyond the atom with the largest coordinate in that direction. We have to take special care that the protein does not interact with its own image in the next periodic box: minimization usually shrinks the water box, and the minimal distance between the closest atoms of two images should stay above ~12 Å. As we have a small protein, we can increase the size of the box without ending up with too many atoms. The -o option sets the prefix of the output files, here 2v0up_wb.pdb and 2v0up_wb.psf.
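
The arithmetic behind the padding can be sketched as follows. This is a rough estimate only: the function name and the protein extent in the example are hypothetical, and the sketch ignores the shrinking of the box during minimization.

```python
def water_box(protein_min, protein_max, padding):
    """Box edge lengths and gap between periodic images of the protein.

    protein_min / protein_max: (x, y, z) extremes of the protein, in Å.
    padding: the value passed to solvate -t, in Å.
    """
    edges = tuple(
        high - low + 2 * padding
        for low, high in zip(protein_min, protein_max)
    )
    # A protein atom at one face and the periodic image of the opposite
    # face are separated by the water layers on both sides of the box.
    image_gap = 2 * padding
    return edges, image_gap


# Hypothetical protein spanning 30 x 40 x 50 Å, with 12 Å of padding
edges, gap = water_box((0.0, 0.0, 0.0), (30.0, 40.0, 50.0), 12.0)
print(edges, gap)
```

Along each axis the gap between the protein and its image is at least twice the padding, which leaves some margin if the box shrinks.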


Add ions

solvated .pdb + .psf → .pdb + .psf with ions added


In VMD, we load the solvated psf and pdb (2v0up_wb.psf and 2v0up_wb.pdb). Under Extensions → Modeling → Add Ions, and knowing the charge of the protein (for example -7), we add 7 Na+ ions: instead of giving a concentration, we choose the user-defined option, enter 7 Na, and neutralize. The automatic mode works quite well and is an efficient way to calculate the total charge of the system, but it might add both positive and negative ions to reach neutrality. So we prefer to add a given number of ions by hand.
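
The total charge can also be checked outside VMD by summing the charge column of the .psf file. The sketch below is an assumption-laden illustration: the function names are hypothetical, and it assumes the whitespace-separated layout written by psfgen, where the charge is the seventh field of each line in the !NATOM section.

```python
def psf_net_charge(psf_text):
    """Sum the partial charges listed in the !NATOM section of a PSF."""
    lines = psf_text.splitlines()
    for index, line in enumerate(lines):
        if "!NATOM" in line:
            # The header line starts with the number of atoms
            n_atoms = int(line.split()[0])
            atom_lines = lines[index + 1 : index + 1 + n_atoms]
            # Fields: id, segment, resid, resname, atom name, type, charge, mass
            return sum(float(row.split()[6]) for row in atom_lines)
    raise ValueError("no !NATOM section found")


def counter_ions(net_charge):
    """Return (n_sodium, n_chloride) needed to neutralise the system."""
    rounded = round(net_charge)
    if rounded < 0:
        return -rounded, 0
    return 0, rounded


print(counter_ions(-7.0))
```

For a system with a charge of -7 this gives 7 sodium ions and no chloride, which is the number entered by hand in the Add Ions window.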



Measurement of the water box coordinates

We need the origin and the three dimension vectors of the system as input for NAMD.


In VMD, load 2v0up_wb_i.psf and 2v0up_wb_i.pdb. This will display our protein in a water box. In the VMD TkCon window type:

set everyone [atomselect top all]
measure minmax $everyone

This gives the minimum and maximum values of x, y and z coordinates of the entire protein-water system, relative to the origin of the coordinate system.

The center of the water box may be determined by typing:

measure center $everyone

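These values have to be recorded: NAMD needs them to define the periodic cell (the cellBasisVector1, cellBasisVector2, cellBasisVector3 and cellOrigin entries of its configuration file).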
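
The conversion from the two measurements to the periodic-cell entries of a NAMD configuration file can be sketched as below. The function name and the numbers in the example are hypothetical, and the sketch assumes an orthogonal box aligned with the x, y and z axes.

```python
def namd_cell(minmax, center):
    """Format the periodic-cell lines of a NAMD configuration file.

    minmax: ((xmin, ymin, zmin), (xmax, ymax, zmax)) from "measure minmax".
    center: (x, y, z) from "measure center".
    """
    (xmin, ymin, zmin), (xmax, ymax, zmax) = minmax
    # Each basis vector is the box length along one axis
    size = (xmax - xmin, ymax - ymin, zmax - zmin)
    return [
        f"cellBasisVector1 {size[0]:.1f} 0.0 0.0",
        f"cellBasisVector2 0.0 {size[1]:.1f} 0.0",
        f"cellBasisVector3 0.0 0.0 {size[2]:.1f}",
        "cellOrigin " + " ".join(f"{value:.1f}" for value in center),
    ]


# Hypothetical output of the two VMD commands above
cell = namd_cell(((-20.0, -25.5, -30.0), (22.0, 24.5, 31.0)), (1.0, -0.5, 0.5))
print("\n".join(cell))
```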

You now have all the input files for namd!


More information from the tutorial

NOT NEEDED! Simulation with Periodic Boundary Conditions

.psf + .pdb + .prm + .conf → namd → anything you want

This part of the tutorial no longer belongs here, as the process of launching NAMD has to be described in much more detail. Please go back to the root page for more information.

The use of periodic boundary conditions is effective in eliminating surface interactions of the water molecules and gives a more faithful representation of the in vivo environment than a water sphere surrounded by vacuum.

We first create a configuration file:
From the tutorial of VMD, we can download the configuration files for the minimization and equilibration of the protein in a water box. The process of editing .conf will be explained on a separate page. Look at the tutorial for a more detailed explanation of the different parameters listed.

The simulation can be run by typing in a Terminal window:

namd2 our_prot_configuration_file.conf > output_file.log

The water sphere minimization-equilibration simulation yields eleven output files. See the tutorial for a more detailed explanation of each file.


VMD - Make a movie from the .dcd

This step takes place after the NAMD calculation. Open VMD, load first the file 2v0up_wb_i.psf, and then the .dcd file created automatically by the run. The protein will appear surrounded by molecules of water. To see it more clearly, go to Graphics → Representations, and create 3 representations.

In the first representation, type "protein" in the selected atoms field to select all residues of the protein, and choose for example NewCartoon as the drawing method. This shows the helices, beta sheets and coils present in the protein. In the second representation, type "resname FMN" to select only the cofactor, and choose for example CPK to display its atoms. Finally, type "water" in the last representation; you can hide all the water molecules for a clearer view by double-clicking on it.

Now, to make a movie, go to the VMD Main menu, click on the triangle at the bottom right of the menu, and go to Extensions → Visualization → Movie Maker. There, just change the name of the movie and the working directory, and make the movie!


