Msi 0112 p

practical training for molecular simulations


MSI February 2012 Practical examples

Contents

1 Introduction
2 System preparation
3 Practical Session I: Molecular dynamics
    3.1 MD with NAMD
        3.1.1 Creating a PSF file for PDB 1i45
        3.1.2 Solvating the structure
        3.1.3 Running the simulations
    3.2 MD with MOLARIS
        3.2.1 Preparing the PDB
        3.2.2 Running an interactive MOLARIS session
    3.3 MD with ADUN
        3.3.1 Preparing the PDB
    3.4 Running simulations with ADUN
        3.4.1 Analyzing the results
4 Practical Session II: Solvation, pKa, FEP
    4.1 Running solvation and pKa simulations using PDLD/S-LRA
    4.2 LIE runs with ADUN
5 Practical Session III: enzymatic reactivity with EVB
    5.1 EVB for enzymatic reactivity analysis
A Running in the luke cluster
B Extra material for NAMD
    B.1 Set up
    B.2 VMD: wat_sphere.tcl
    B.3 VMD: sod2pot.tcl
    B.4 VMD: 1i45_ws_eq.conf
    B.5 VMD: 1i45_wb_eq.conf
C Extra material for ADUN
    C.1 Set up
D Additional tools

© 2008-2012 Jordi Villà-Freixa, 2010 Nils Drechsel

Molecular Simulations: a Practical Approach

1 Introduction

In this session we will practice Molecular Dynamics simulations using three different programs, with the help of VMD, Chimera and R to visualize and analyze the data obtained. We will demonstrate the methods on triosephosphate isomerase (TIM), an enzyme catalyzing the reversible interconversion of the triose phosphate isomers dihydroxyacetone phosphate (DHAP) and D-glyceraldehyde 3-phosphate (GAP). Apart from its intrinsic interest, the protein has some characteristics that make it a good example for this class (it does not contain disulfide bonds, it is an enzyme with multiple free energy barriers, it is a dimer...). The programs we will learn to use are:

  • NAMD [Phillips et al., 2005], a popular high performance computing MD program that is tightly linked to the VMD [Humphrey et al., 1996] visualization program. Useful for running standard MD runs with periodic boundary conditions and popular force fields like AMBER or CHARMM (http://www.ks.uiuc.edu/Research/namd/).
  • MOLARIS [Lee et al., 1993], a program containing advanced algorithms for spherical boundary conditions and the PDLD/S-LRA model for semimacroscopic solvation calculations (http://futura.usc.edu/programs/index.html#molaris).
  • ADUN [Johnston et al., 2005], a high-performance, framework-based program for MD simulations, including a plugin system for adding complex algorithms. We will use it in Section 4 for LIE calculations (adun.imim.es). Subscribing to the adun-users mailing list (https://mail.gna.org/listinfo/adun-users) is highly recommended, to be aware of new improvements and to report problems.

We will be running these programs on two different platforms. On the one hand, the use of NAMD will be demonstrated assuming a Mac OS X based computer, although closely analogous commands can be used on a Unix machine. On the other hand, ADUN and MOLARIS will be run remotely on a Linux cluster using Fedora 8. Additionally, interested participants can download an experimental live CD for ADUN, available at http://susegallery.com/a/hvXWpn/adun-user (free registration is required).

Throughout this document, code listings are either bash scripting code or Tcl scripting for VMD (the original handout distinguishes the two by color).

2 System preparation

First, we need to obtain the PDB files we will be using. TIM is known to explore two conformations that influence its ability to bind the substrate. We will be using the following PDB codes: 1I45 for the open form [Rozovsky et al., 2001] and 1NEY for the closed form [Jogl et al., 2003]. We will first check that these two files correspond precisely to the same protein sequence. From the PDB we can get the sequences of the A chains (the B chain is identical in this dimeric protein), by running:
    $ wget --output-document=1NEY.fasta "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=fastachain&structureId=1NEY&chainId=A"
    $ wget --output-document=1I45.fasta "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=fastachain&structureId=1I45&chainId=A"
    $ wget --output-document=1YPI.fasta "http://www.pdb.org/pdb/download/downloadFile.do?fileFormat=fastachain&structureId=1YPI&chainId=A"

where $ refers to the system prompt. From now on, we will not use the $ sign, to simplify the writing. The URLs are quoted so that the shell does not interpret the & characters. Notice we have also downloaded the sequence for the PDB code 1YPI [Lolis et al., 1990], which corresponds to the wild type protein, for comparison. We can then put the three sequences in the same file

    cat 1NEY.fasta 1I45.fasta 1YPI.fasta > seqs.fasta

and run clustalw to obtain

    CLUSTAL W (1.83) multiple sequence alignment

    1NEY_A|PDBID|CHAIN|SEQUENCE -ARTFFVGGNFKLNGSKQSIKEIVERLNTASIPENVEVVICPPATYLDYS
    1I45_A|PDBID|CHAIN|SEQUENCE MARTFFVGGNFKLNGSKQSIKEIVERLNTASIPENVEVVICPPATYLDYS
    1YPI_A|PDBID|CHAIN|SEQUENCE -ARTFFVGGNFKLNGSKQSIKEIVERLNTASIPENVEVVICPPATYLDYS
                                 *************************************************

    1NEY_A|PDBID|CHAIN|SEQUENCE VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKYVILGHSERRS
    1I45_A|PDBID|CHAIN|SEQUENCE VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKYVILGHSERRS
    1YPI_A|PDBID|CHAIN|SEQUENCE VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKWVILGHSERRS
                                ***************************************:**********

    1NEY_A|PDBID|CHAIN|SEQUENCE YFHEDDKFIADKTKFALGQGVGVILCIGETLEEKKAGKTLDVVERQLNAV
    1I45_A|PDBID|CHAIN|SEQUENCE YFHEDDKFIADKTKFALGQGVGVILCIGETLEEKKAGKTLDVVERQLNAV
    1YPI_A|PDBID|CHAIN|SEQUENCE YFHEDDKFIADKTKFALGQGVGVILCIGETLEEKKAGKTLDVVERQLNAV
                                **************************************************

    1NEY_A|PDBID|CHAIN|SEQUENCE LEEVKDFTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA
    1I45_A|PDBID|CHAIN|SEQUENCE LEEVKDFTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA
    1YPI_A|PDBID|CHAIN|SEQUENCE LEEVKDWTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA
                                ******:*******************************************

Notice here that the two PDBs we will work with are indeed mutated structures with respect to the wild type 1YPI. This is acceptable, as they have been seen to have the same activity as the wt protein [Rozovsky et al., 2001, Jogl et al., 2003].

We can obtain the PDB files themselves in several ways. We can see the contents of the files by accessing the PDB (try http://www.rcsb.org/pdb/explore/explore.do?structureId=1I45 and http://www.rcsb.org/pdb/explore/explore.do?structureId=1NEY) and obtain the files from those pages or, more conveniently, by typing:

    wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1ney.ent.gz
    wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1i45.ent.gz
    wget ftp://ftp.wwpdb.org/pub/pdb/data/structures/all/pdb/pdb1ypi.ent.gz

The downloaded files are gzip-compressed and must be decompressed (for example with gunzip) before use. Upon inspection of the two PDB files, we realize that 1NEY, which is in the closed form, contains the substrate DHAP, while 1I45 contains no substrate. In addition, we realize that in both cases the actual mutations include a fluorinated variant of Trp 168: Trp90Tyr and Trp157Phe, with 5'-fluorotryptophan at Trp168 (see Figure 1). We could keep working with the fluorinated version, but it is better to replace it with the original Trp, as the fluorinated residue was only used for experimental monitoring of loop 6 and we do not need it here.
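The two substitutions visible in the alignment above can also be listed programmatically. The following is a minimal sketch in Python, not part of the original handout: it hard-codes the two alignment blocks that contain differences (columns 51-100 and 151-200, copied from the CLUSTAL W output) and assumes that alignment columns coincide with residue numbers, which is consistent with the Trp90Tyr and Trp157Phe naming used in the text. The function name `substitutions` is our own.

```python
# Aligned blocks copied from the CLUSTAL W output above; the dictionary key is
# the alignment column of the first residue in each block.
BLOCKS = {
    51: {
        "1NEY": "VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKYVILGHSERRS",
        "1I45": "VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKYVILGHSERRS",
        "1YPI": "VSLVKKPQVTVGAQNAYLKASGAFTGENSVDQIKDVGAKWVILGHSERRS",
    },
    151: {
        "1NEY": "LEEVKDFTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA",
        "1I45": "LEEVKDFTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA",
        "1YPI": "LEEVKDWTNVVVAYEPVWAIGTGLAATPEDAQDIHASIRKFLASKLGDKA",
    },
}


def substitutions(reference, variant):
    """Return (column, reference residue, variant residue) for every mismatch."""
    found = []
    for start, block in sorted(BLOCKS.items()):
        pairs = zip(block[reference], block[variant])
        for offset, (ref, var) in enumerate(pairs):
            # Gap characters are alignment padding, not substitutions.
            if ref != var and "-" not in (ref, var):
                found.append((start + offset, ref, var))
    return found


if __name__ == "__main__":
    for column, ref, var in substitutions("1YPI", "1I45"):
        print(f"{ref}{column}{var}")
```

Comparing 1NEY against 1I45 with the same function returns an empty list for these blocks, in line with the statement that both structures carry the same mutations.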
Figure 1: 5'-fluorotryptophan

3 Practical Session I: Molecular dynamics

3.1 MD with NAMD

The first task is to run regular MD simulations with NAMD with periodic boundary conditions, using the CHARMM force field with CMAP, an energy correction recently added to the CHARMM force field. The NAMD team has produced a series of excellent tutorials that can be found at http://www.ks.uiuc.edu/Training/Tutorials/. Here we will adapt the general NAMD tutorial to the simulation of our two proteins, 1I45 and 1NEY, to analyze their behavior. To run NAMD we need:

  • a PDB file
  • a protein structure file (PSF), which stores the information about the topology of the protein structure
  • a force field parameter file (for example the file toppar_c35b2_c36a2.tgz obtained from http://mackerell.umaryland.edu/CHARMM_ff_params.html)
  • a configuration or input file, specifying what we want the program to do

Figure 2 shows the way we will proceed. More details are given below and in the original NAMD tutorial.

3.1.1 Creating a PSF file for PDB 1i45

The first task is to split the PDB files into their two subunits, as this is needed by the psfgen program.

    grep ' A ' pdb1i45.ent | grep -v 'HOH' > 1i45A.pdb
    grep ' B ' pdb1i45.ent | grep -v 'HOH' > 1i45B.pdb

We need to do this so that the two monomers become separate segments of the PSF file we will generate. In principle, psfgen should do the rest for us. Thus, we simply run
    alias vmdrun='/Applications/VMD\ 1.8.7beta3.app/Contents/MacOS/startup.command'
    vmdrun -dispdev text -eofexit < 1i45_pgn.tcl

or its equivalent in Windows or Unix, where the 1i45_pgn.tcl file contains:

    package require psfgen
    resetpsf
    # loading the topology
    topology top_all27_prot_lipid.rtf
    # creating the segment for the first monomer
    if {1} {
      # aliasing some names
      pdbalias residue HIS HSE
      pdbalias residue FTR TRP
      pdbalias atom ILE CD1 CD
      segment A {pdb 1i45A.pdb}
      coordpdb 1i45A.pdb A
    }
    # creating the segment for the second monomer
    if {1} {
      # aliasing some names
      pdbalias residue HIS HSE
      pdbalias residue FTR TRP
      pdbalias atom ILE CD1 CD
      segment B {pdb 1i45B.pdb}
      coordpdb 1i45B.pdb B
    }
    guesscoord
    writepdb 1i45.pdb
    writepsf 1i45.psf

Figure 2: Flowchart of the process of running a NAMD run, specifying the different tools to be used to generate the output. Extracted from the NAMD tutorial at http://www.ks.uiuc.edu/Training/Tutorials/.
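The chain-splitting step of this section relies on grep matching the text ' A ' or ' B ' anywhere in a line, so it also lets through non-coordinate records that happen to contain that pattern. As an alternative, here is a minimal sketch in Python (our own illustration, not part of the handout or of psfgen) that selects coordinate records by the chain identifier in column 22 of the PDB fixed-column format. The function name `split_by_chain` is hypothetical, and the chain B and water records in the sample data are invented for the example.

```python
def split_by_chain(pdb_lines, chains=("A", "B"), skip_residues=("HOH",)):
    """Group ATOM/HETATM records by chain ID (column 22), dropping waters."""
    per_chain = {chain: [] for chain in chains}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")) or len(line) < 22:
            continue  # keep coordinate records only
        res_name = line[17:20]   # columns 18-20: residue name
        chain_id = line[21]      # column 22: chain identifier
        if chain_id in per_chain and res_name not in skip_residues:
            per_chain[chain_id].append(line)
    return per_chain


# First record taken from the 1I45 listing in this handout; the other three
# lines are invented for illustration.
SAMPLE = [
    "ATOM   1283  CA  TRP A 168      37.687  51.997  43.087  1.00 16.01           C",
    "ATOM   9001  CA  ALA B   2      10.000  10.000  10.000  1.00 20.00           C",
    "HETATM 9002  O   HOH A 301      12.000  12.000  12.000  1.00 30.00           O",
    "REMARK   1 THIS LINE MENTIONS CHAIN A BUT IS NOT A COORDINATE RECORD",
]

if __name__ == "__main__":
    for chain, records in split_by_chain(SAMPLE).items():
        print(chain, len(records))
```

Writing each list to 1i45A.pdb and 1i45B.pdb would reproduce the two files that psfgen reads.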
Notice the generation of the two segments. Equivalently, we can source the 1i45_pgn.tcl file from within VMD, by opening Extensions > Tk Console and typing there:

    cd <where the pgn file is>
    source 1i45_pgn.tcl

Inspection of the generated 1i45.pdb and 1i45.psf files shows that psfgen did a good job of assigning the patches that cap the two chains:

    PSF CMAP

           9 !NTITLE
     REMARKS original generated structure x-plor psf file
     REMARKS 4 patches were applied to the molecule.
     REMARKS topology ./toppar/top_all27_prot_lipid.rtf
     REMARKS segment A { first NTER; last CTER; auto angles dihedrals }
     REMARKS segment B { first NTER; last CTER; auto angles dihedrals }
     REMARKS defaultpatch NTER A:2
     REMARKS defaultpatch CTER A:248
     REMARKS defaultpatch NTER B:2
     REMARKS defaultpatch CTER B:248

        7542 !NATOM
           1 A    2    ALA  N    NH3   -0.300000       14.0070           0
           2 A    2    ALA  HT1  HC     0.330000        1.0080           0
           3 A    2    ALA  HT2  HC     0.330000        1.0080           0
    (...)

Exercise 1
Prepare the PSF files for the 1NEY and 1YPI structures.

3.1.2 Solvating the structure

NAMD offers two alternatives for the solvation of the structure prior to the MD runs. One can choose a sphere to solvate the proteins and treat the solvent using spherical boundary conditions (SBC), or one can use periodic boundary conditions (PBC) with, e.g., a cube or a rectangular prism. We will demonstrate the SBC with the SCAAS method [Warshel and King, 1985] later in MOLARIS, but for the sake of completeness we will show here how to build both a sphere and a rectangular prism of waters around our system, and run MD simulations in both.

We start by creating a sphere of waters around the system to run SBC. We will use the wat_sphere.tcl file in Appendix B.2:

    vmdrun -dispdev text -eofexit < wat_sphere.tcl

This generates the files 1i45_ws.pdb and 1i45_ws.psf, which can be displayed with VMD (see Figure 3a).

Afterwards, we use the file wat_box.tcl below in an analogous manner and obtain the solvated system in Figure 3b.

    package require solvate
    solvate 1i45.psf 1i45.pdb -t 5 -o 1i45_wb
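To get a feeling for the size of the resulting water box, the following minimal Python sketch computes the bounding box of a set of coordinates and enlarges it by a padding on every side. It assumes that the -t 5 option of solvate pads the solute by 5 Å in all directions; the function name `padded_box` and the sample coordinates are our own, and the code is an illustration rather than a description of what the solvate plugin does internally.

```python
def padded_box(coords, padding=5.0):
    """Return (min corner, max corner, edge lengths) of the padded bounding box."""
    mins = [min(point[axis] for point in coords) - padding for axis in range(3)]
    maxs = [max(point[axis] for point in coords) + padding for axis in range(3)]
    lengths = [high - low for low, high in zip(mins, maxs)]
    return mins, maxs, lengths


if __name__ == "__main__":
    # Two invented points spanning 10 x 20 x 30 Å; a real run would use all
    # atom coordinates read from 1i45.pdb.
    corners = [(0.0, 0.0, 0.0), (10.0, 20.0, 30.0)]
    mins, maxs, lengths = padded_box(corners, padding=5.0)
    print("box edges (Å):", lengths)
```

The edge lengths obtained this way are also a reasonable starting point for the periodic cell dimensions needed in a PBC configuration file.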
Figure 3: Solvated systems for NAMD runs with spherical (a) and rectangular prism (b) representations

Some proteins may be sensitive to the ionic strength of the surrounding solvent. Even when that is not the case, in molecular dynamics (MD) simulations with periodic boundary conditions the energy of the electrostatic interactions is often computed using the particle-mesh Ewald (PME) summation, which requires the system to be electrically neutral. The VMD autoionize plugin provides a quick way to make the net charge of the system zero by adding sodium and chloride ions to the solvent at random positions (subject to some minimum distances between ions). In our case, for the PBC-based simulations with 0.05 M NaCl, we can run VMD in text mode again with this Tcl script:

    package require autoionize
    autoionize -psf 1i45_wb.psf -pdb 1i45_wb.pdb -is 0.05 -o 1i45_wb_NaCl
    source sod2pot.tcl

where the sod2pot.tcl script, used to substitute the Na+ ions by K+ ions, can be found in Appendix B.3.

3.1.3 Running the simulations

Once the files needed have been built, we simply need to run the simulation by typing

    namd2 1i45_ws_eq.conf > 1i45_ws_eq.log &

where an example of the configuration file 1i45_ws_eq.conf is given in Appendix B.4. In a similar manner, an example of a configuration file for PBC is given in Appendix B.5. More details on running NAMD simulations can be found in the official NAMD tutorial at http://www.ks.uiuc.edu/Training/Tutorials/. See also Appendix D for links to extra examples and useful resources.

Exercise 2
Run SBC and PBC relaxations and heating for 1NEY and 1YPI. Produce plots for RMSD and total energy in each case.
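For the energy plots requested in Exercise 2, the total energy can be read from the NAMD log file written by the command above. The sketch below, in Python, is one possible way to do it: it relies on the ETITLE: header line that NAMD prints before its ENERGY: lines to locate the TS (time step) and TOTAL columns. The function name `read_namd_energies` is our own, and the sample lines use an abbreviated header and invented numbers for illustration only.

```python
def read_namd_energies(log_lines, column="TOTAL"):
    """Return a list of (timestep, value) pairs for one energy column."""
    titles = None
    series = []
    for line in log_lines:
        fields = line.split()
        if not fields:
            continue
        if fields[0] == "ETITLE:":
            titles = fields[1:]          # column names, e.g. TS, BOND, ...
        elif fields[0] == "ENERGY:" and titles is not None:
            values = fields[1:]
            step = int(values[titles.index("TS")])
            series.append((step, float(values[titles.index(column)])))
    return series


# Abbreviated header and made-up numbers, for illustration only.
SAMPLE_LOG = [
    "ETITLE:      TS           BOND          ANGLE          TOTAL           TEMP",
    "ENERGY:       0       500.0000      1200.0000    -25000.0000       0.0000",
    "ENERGY:     100       480.0000      1150.0000    -25500.5000      25.0000",
]

if __name__ == "__main__":
    for step, total in read_namd_energies(SAMPLE_LOG):
        print(step, total)
```

With a real log one would pass `open("1i45_ws_eq.log")` as `log_lines` and feed the resulting pairs to R or any plotting tool.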
3.2 MD with MOLARIS

In this quick start guide we will show how to run molecular simulations using MOLARIS [Chu et al., 2003]. In particular, we will be setting up a system and running molecular dynamics in a given region with an explicit representation of the solvent.

3.2.1 Preparing the PDB

Using our preferred editor, we edit the *.ent files and replace the FTR entries with TRP entries. For example, we change the pdb1i45.ent file into a file we will call 1i45mod.pdb by making the following substitution:

    HETATM 1282  N   FTR A 168      38.662  51.541  42.102  1.00 15.78           N
    HETATM 1283  CA  FTR A 168      37.687  51.997  43.087  1.00 16.01           C
    HETATM 1284  CB  FTR A 168      37.016  53.291  42.612  1.00 16.50           C
    HETATM 1285  CG  FTR A 168      36.457  53.215  41.211  1.00 18.36           C
    HETATM 1286  CD2 FTR A 168      35.103  52.917  40.831  1.00 18.75           C
    HETATM 1287  CE2 FTR A 168      35.046  52.962  39.419  1.00 18.66           C
    HETATM 1288  CE3 FTR A 168      33.932  52.616  41.545  1.00 20.74           C
    HETATM 1289  CD1 FTR A 168      37.142  53.419  40.045  1.00 18.66           C
    HETATM 1290  NE1 FTR A 168      36.302  53.270  38.967  1.00 17.24           N
    HETATM 1291  CZ2 FTR A 168      33.864  52.718  38.705  1.00 19.86           C
    HETATM 1292  CZ3 FTR A 168      32.754  52.372  40.827  1.00 21.32           C
    HETATM 1293  F   FTR A 168      31.644  52.083  41.514  0.57 24.59           F
    HETATM 1294  CH2 FTR A 168      32.735  52.427  39.425  1.00 20.56           C
    HETATM 1295  C   FTR A 168      36.600  50.963  43.385  1.00 16.95           C
    HETATM 1296  O   FTR A 168      35.850  51.115  44.348  1.00 17.00           O

into

    ATOM   1282  N   TRP A 168      38.662  51.541  42.102  1.00 15.78           N
    ATOM   1283  CA  TRP A 168      37.687  51.997  43.087  1.00 16.01           C
    ATOM   1284  CB  TRP A 168      37.016  53.291  42.612  1.00 16.50           C
    ATOM   1285  CG  TRP A 168      36.457  53.215  41.211  1.00 18.36           C
    ATOM   1286  CD2 TRP A 168      35.103  52.917  40.831  1.00 18.75           C
    ATOM   1287  CE2 TRP A 168      35.046  52.962  39.419  1.00 18.66           C
    ATOM   1288  CE3 TRP A 168      33.932  52.616  41.545  1.00 20.74           C
    ATOM   1289  CD1 TRP A 168      37.142  53.419  40.045  1.00 18.66           C
    ATOM   1290  NE1 TRP A 168      36.302  53.270  38.967  1.00 17.24           N
    ATOM   1291  CZ2 TRP A 168      33.864  52.718  38.705  1.00 19.86           C
    ATOM   1292  CZ3 TRP A 168      32.754  52.372  40.827  1.00 21.32           C
    ATOM   1294  CH2 TRP A 168      32.735  52.427  39.425  1.00 20.56           C
    ATOM   1295  C   TRP A 168      36.600  50.963  43.385  1.00 16.95           C
    ATOM   1296  O   TRP A 168      35.850  51.115  44.348  1.00 17.00           O

as well as the same change for chain B, of course. Note that the manual edit also changes the record type from HETATM to ATOM and removes the fluorine atom (1293). Alternatively, we can do something like the following, which renames the residue only and leaves the HETATM record type and the fluorine atom in place:

    sed 's/FTR/TRP/' pdb1i45.ent > temp.pdb
    mv temp.pdb 1i45mod.pdb
    sed 's/FTR/TRP/' pdb1ney.ent > temp.pdb
    mv temp.pdb 1neymod.pdb
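The manual substitution shown above does three things at once: it renames FTR to TRP, turns the HETATM records into ATOM records, and drops the fluorine atom. A minimal Python sketch that performs the same three steps on fixed-column PDB records could look as follows. This is our own illustration (the function name `ftr_to_trp` is hypothetical), and it leaves atom serial numbers untouched, as the manual edit does.

```python
def ftr_to_trp(pdb_lines):
    """Rewrite FTR HETATM records as TRP ATOM records, dropping the fluorine."""
    converted = []
    for line in pdb_lines:
        if line.startswith("HETATM") and line[17:20] == "FTR":
            if line[12:16].strip() == "F":
                continue  # tryptophan has no fluorine atom
            # Columns 1-6 hold the record name, columns 18-20 the residue name.
            line = "ATOM  " + line[6:17] + "TRP" + line[20:]
        converted.append(line)
    return converted


# Three records copied from the 1I45 listing above.
SAMPLE = [
    "HETATM 1292  CZ3 FTR A 168      32.754  52.372  40.827  1.00 21.32           C",
    "HETATM 1293  F   FTR A 168      31.644  52.083  41.514  0.57 24.59           F",
    "HETATM 1294  CH2 FTR A 168      32.735  52.427  39.425  1.00 20.56           C",
]

if __name__ == "__main__":
    for record in ftr_to_trp(SAMPLE):
        print(record)
```

Because the replacement strings have the same width as the originals, the remaining columns stay aligned.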
3.2.2 Running an interactive MOLARIS session

At the prompt, we type

    molaris

After this, the program will display:

    Sourced /cbbl/soft/molaris/bin/.molaris_rc
    Sourced /home/jvilla/.molaris_rc
    Usage:
      For interactive run, please press the Enter key.
      For using input file on command line, please press the Enter key,
      type quit, then type on the command line:
          molaris < input_file_name
      or: molaris input_file_name
      you may use command line options to read in alternative libraries:
          molaris [-a amino_lib_name] [-p parm_lib_name] [-e evb_lib_name]
                  [-s solvent_opt_name] [-o output_directory_name]

This message informs the user about the different ways of running MOLARIS. The user can run the program interactively, as we will do now, or prepare an input file with all the appropriate commands for running the calculation in the background. We press <Enter> and, after some information, we are prompted for a PDB name. At this point we write the name of the coordinates file of the system we are interested in. MOLARIS accepts both PDB and Mol2 formats, or a combination of them. In this case we type 1i45mod.pdb.

Initially the program checks the coordinates file and looks for possible errors in it. If the file is OK, the program proceeds by comparing the residues of the coordinates file with the residues in the topology library, which is provided with the program and called amino98.lib in the current version. If the file contains a residue that is not in the library, a new entry is automatically added to this library. After checking and writing the topology to a special file called $OUT_DIR/1i45mod.top, the user is asked which task to perform. In this case we will choose ENZYMIX and the program displays the following table:

    Table of the Keywords for the Enzymix Level
    ...........................................
    keyword      modifier  example
    -------      --------  -------
    pre_enz      no        pre_enz
    relax        no        relax
    ac           no        ac
    evb          no        evb
    evb2         no        evb2
    evb_ab       no        evb_ab
    adiab_pot    no        adiab_pot
    adiab_tem    no        adiab_tem
    end          no        end
    help         yes       help <keyword1> <keyword2> ...
    help         yes       help all
    help         no        help
    exit/quit    no        exit
    ------------------------------------------------------------------------

Here you start to see that the MOLARIS package works as nested tasks, where every keyword follows a hierarchy of execution. In this way, every time we finish a particular task we must write
an end statement if we want to save the changes made, or exit if we made a mistake and want to quit without saving. In this particular case we want to perform a relaxation of the protein, so we select relax. The following table appears:

    Table of the Keywords RELAX Level
    .................................
    keyword      modifier  example
    -------      --------  -------
    md_parm      no        md_parm
    rest_in      yes       rest_in rest.in
    rest_out     yes       rest_out rest.1
    energy_out   yes       energy_out gap.out
    end          no        end
    help         yes       help <keyword1> <keyword2> ...
    help         yes       help all
    help         no        help
    exit/quit    no        exit

Here we have several choices to make. If we just quit the level with end, the program will perform a relaxation taking the default parameters for the MD calculation. Let us change those parameters before quitting the relax level. When typing md_parm we enter the next hierarchy level, where all the possible choices are listed in the following table:

    Table of the Keywords MD_PARM Level
    ...................................
    keyword           modifier  example
    -------           --------  -------
    nsteps            yes       nsteps 500
    temperature(K)    yes       temperature 300.0
    tolerance_temp    yes       tolerance_temp 3000.0
    stepsize (ps)     yes       stepsize 0.002
    nbupdate          yes       nbupdate 30
    gas_phase         yes       gas_phase 0
    region2a_r        yes       region2a_r 18.0
    water_r           yes       water_r 18.0
    langevin_r        yes       langevin_r 20.0
    ex_w_center       yes       ex_w_center 3.0 4.5 2.34
    solvent           yes       solvent water
    induce            yes       induce 0
    indforce          yes       indforce 0
    constraint_1      yes       constraint_1 0.03
    constraint_2      yes       constraint_2 0.03
    constraint_w      yes       constraint_w 30.0
    constraint_pair   yes       constraint_pair 5 9 10.0 1.3
    constraint_post   yes       constraint_post 10 10. 10. 10. 3.4 -4.6 4.7
    constraint_r      yes       constraint_r 5 10.0 50.0 2.0 4.6 7.3
    constraint_ang    yes       constraint_ang 10 34 35 10.0 120.
    constraint_tor    yes       constraint_tor 5 10 34 35 10.0 120.0 1.0 1
    h_constraint      yes       h_constraint 0
    movie_co          yes       movie_co rg1
    movie_fq          yes       movie_fq 10
    pmf               no        pmf
    ub_sampling       no        ub_sampling
    fix_region        yes       fix_region 1
    fix_atom          yes       fix_atom 8
    dist_atoms        yes       dist_atoms 2 5
    dist_write_fq     yes       dist_write_fq 10
    log_write_fq      yes       log_write_fq 10
    opt_his           yes       opt_his 1
    steep_mini        yes       steep_mini 1
    df_mini           yes       df_mini 1 0.0001
    log_detail        yes       log_detail 1
    help              yes       help <keyword1> <keyword2> ...
    help              yes       help all
    help              no        help
    end               no        end
    exit/quit         no        exit

All the parameters have their default values, but let's say that we want a shorter run and we want to change the temperature and the stepsize. To do so we type:

    md_parm> nsteps 300
    md_parm> temperature 200.
    md_parm> stepsize 0.0002
    md_parm> end

Closing the relax level with an additional end keyword will start the run. At the beginning the relevant information is printed (different radii, coordinates for the center, number of solvent molecules generated...) and then the actual MD calculation starts, giving the values of the energies at intervals of 10 steps:

    (...)
     In dynamics: Istep=    49 Temp=  205.16 Target=  200.00
     In dynamics: Istep=    50 Temp=  205.22 Target=  200.00
     rms of all protein heavy atoms for (x_average-x0) = 0.02
     rms of all protein heavy atoms for (x_current-x0) = 0.05
     Energies for the system at step 50:
    ------------------------------------------------------------------------
     protein - ebond    :   496.38  ethet    :  1699.07  ephi     :  3569.11
               eitor    :    41.70  evdw     :  3130.24  emumu    : -1934.36
               ehb_pp   :  -568.27
     water   - ebond    :    57.04  ethet    :    19.09  evdw     :    -4.90
               emumu    :   -81.70  ehb_ww   :     0.00
     pro-wat - evdw     :    28.55  emumu    :   -83.32  ehb_pw   :     0.00
     long    - elong    :   -78.27
     ac      - evd_acp  :     0.00  emumuacp :     0.00  evd_acw  :     0.00
               emumuacw :     0.00  ehb_acp  :     0.00  ehb_acw  :     0.00
     evb     - ebond    :     0.00  ethet    :     0.00  ephi     :     0.00
               evdw     :     0.00  emumu    :     0.00  eoff     :     0.00
               egpshift :     0.00  eindq    :     0.00  ebulk    :     0.00
     induce  - eindp    :     0.00  eindw    :     0.00
     const.  - ewatc    :   209.95  eproc    :    35.30  edist    :     0.00
     langevin- elgvn    :     0.00  evdw_lgv :     0.00  eborn    :   -22.38
     system  - epot     :  6513.23  ekin     :  1574.62  etot     :  8087.85
    ________________________________________________________________________
     Constraint energy on region I: 0.00
     In dynamics: Istep=    51 Temp=  205.25 Target=  200.00
    (...)

If the decrease in temperature and stepsize is not enough to obtain a stable run, we can use a simple steepest descent minimization by choosing steep_mini 1 in the md_parm table. Once
finished, we end the enzymix level and enter the analyze level in order to get the coordinates in PDB format after the relaxation.

    Table of the Keywords for the Analysis Level
    ............................................
    keyword       modifier  example
    -------       --------  -------
    rest_in       yes       rest_in rest.in
    rest_to_pdb   yes       rest_to_pdb rest.pdb
    allres        no        allres
    restype       yes       restype ASP
    resatom       yes       resatom 1
    resbond       yes       resbond 1
    resang        yes       resang 1
    restor        yes       restor 1
    resitor       yes       resitor 1
    distatoms     yes       distatoms 2 5
    distatompnt   yes       distatompnt 2 1.0 2.0 2.3
    chkbond       yes       chkbond 50.0
    chkdisulfide  yes       chkdisulfide
    electro       yes       electro 1 18.0 4
    center_s      no        center_s
    center_r      yes       center_r 5 12
    center_c      yes       center_c 1
    sphereion     yes       sphereion 12.50 3.64 -6.28 10.63
    sphereion_r   yes       sphereion_r 12.50 4
    sphereres     yes       sphereres 12.50 3.64 -6.28 10.63
    sphereres_r   yes       sphereres_r 12.50 4
    sphereatm     yes       sphereatm 12.50 3.64 -6.28 10.63
    addbond       yes       addbond 2 5 9 10 18
    mutate_res    yes       mutate_res 2 SER
    rotate_h      yes       rotate_h 5 12
    rotate_axis   yes       rotate_axis 5 12
                            rotate_axis 2.0 3.5 6.7 12.0 23.1 -2.3
    prot_prot     no        prot_prot
    viewmovie     no        viewmovie
    viewpot       no        viewpot
    vdwsurf       no        vdwsurf
    makepdb       no        makepdb
    makelib1      no        makelib1
    dock          no        dock
    add_memgrid   yes       add_memgrid 1.0 3.2 0.5 Y 10.0 20.0 10.0 1 1.0
    end           no        end
    help          yes       help <keyword1> <keyword2> ...
    help          yes       help all
    help          no        help
    exit/quit     no        exit

We choose makepdb:

    Table of the Keywords makepdb Level
    ...................................
    keyword       modifier  example
    -------       --------  -------
    residue       yes       residue 2
                            residue 2 to 10
                            residue all
                            residue all+w
    file_nm       yes       file_nm file.pdb
    end           no        end
    help          yes       help <keyword1> <keyword2> ...
    help          no        help
    exit/quit     no        exit

Then we select the right options and quit the makepdb level:
    makepdb> file_nm 1i45mod_wat.pdb
    makepdb> residue all+w

The program executes the requested commands and is ready to be quit by typing end twice.

At this point it is important to note the non-interactive way of running the program, which allows one to redirect the output. Try, for example

    molaris 1i45mod_relax.inp 1i45mod_relax.out

which puts the output in a file 1i45mod_relax.out in $OUT_DIR/1i45mod_relax. Obviously, several runs can be concatenated in the input file when, for example, one needs to heat the system in several stages. For example, one can create the configuration file:

    1i45mod.pdb
    enzymix
      relax
        md_parm
          steep_mini 1
          stepsize 0.001
          water_r 30
          nsteps 30
        end
        rest_out 1i45mod_rx.rest
      end
    end
    analyze
      makepdb
        file_nm 1i45mod_rx.pdb
        residue all+wat
      end
    end
    end

whose execution can be followed by

    1i45mod_rx.pdb
    enzymix
      relax
        md_parm
          temperature 100
          stepsize 0.002
          nsteps 100
        end
        rest_out 1i45mod_md100.rest
      end
      relax
        md_parm
          temperature 300
          stepsize 0.002
          nsteps 1000
        end
        rest_in $OUT_DIR/1i45mod_md100.rest
      end
    end
    analyze
      makepdb
        file_nm 1i45mod_md300.pdb
        residue all
      end
    end
    end
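To follow the total energy along runs such as the ones defined above, the etot value can be pulled out of the MOLARIS output. The Python sketch below is a minimal illustration that assumes the log has the layout shown in the energy listing of this section (a line "Energies for the system at step N:" followed, some lines later, by a "system" line containing "etot : value"). The function name `read_etot` is our own, and other MOLARIS versions may print a different layout.

```python
import re

STEP_RE = re.compile(r"Energies for the system at step\s+(\d+)")
ETOT_RE = re.compile(r"\betot\s*:\s*(-?\d+(?:\.\d+)?)")


def read_etot(log_lines):
    """Return (step, etot) pairs found in a MOLARIS-style energy listing."""
    series = []
    step = None
    for line in log_lines:
        step_match = STEP_RE.search(line)
        if step_match:
            step = int(step_match.group(1))
            continue
        etot_match = ETOT_RE.search(line)
        if etot_match and step is not None:
            series.append((step, float(etot_match.group(1))))
            step = None  # wait for the next energy block
    return series


# Lines copied from the step-50 listing shown earlier in this section.
SAMPLE_LOG = [
    " Energies for the system at step 50:",
    " const.  - ewatc : 209.95 eproc : 35.30 edist : 0.00",
    " system  - epot : 6513.23 ekin : 1574.62 etot : 8087.85",
]

if __name__ == "__main__":
    print(read_etot(SAMPLE_LOG))
```

The resulting pairs can be written to a two-column file and plotted with R.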
Exercise 3
Check the PDB created by the above scripts. What do you need to do to run a relaxation calculation including all the residues in the TIM dimer interface? Run such a calculation for the three systems 1I45, 1NEY and 1YPI. Plot the behavior of the total energy and the RMSD.

3.3 MD with ADUN

ADUN is a program based on the Cocoa/NextStep frameworks, which provide excellent tools for a graphical user interface. You can download the latest version of the program from the ADUN GNA site: https://gna.org/projects/adun/. In this session we will use the command line version of ADUN, as some of the calculations to be done are still experimental (in particular the LIE implementation in Section 4.2).

3.3.1 Preparing the PDB

Again, we need to clean the PDB file before using it with ADUN. Analogously to what was done before:

    # download pdbs
    wget http://www.pdb.org/pdb/files/1I45.pdb
    # transform FTR to TRP and delete what is not needed
    sed 's/FTR/TRP/' 1I45.pdb | grep -v HOH > temp.pdb
    sed 's/HETATM/ATOM  /' temp.pdb > 1i45mod.pdb

File 1i45mod.pdb has multiple models. Delete all of them except the one you want to use. Strip water as well. Clean the PDBs, renumber them, and add hydrogen atoms with reduce [Word et al., 1999]. Then clean again, fixing hydrogens, capping the protein and taking care of histidine naming.

    /cbbl/soft/adun/sharedapps/repairPDB.py 1i45mod.pdb numb clean
    reduce -BUILD 1i45mod_fixed.pdb > 1i45mod_reduced.pdb
    /cbbl/soft/adun/sharedapps/repairPDB.py 1i45mod_reduced.pdb clean hyd cap his

Now we can build the ADUN datasources for each of the systems. Datasources, or systems, are the main objects in ADUN; check http://lavandula.imim.es/adun-new/?page_id=294 for more details. The build may complain about two atoms (the two fluorines); this can be safely ignored. There is a known issue with the builder script: in around 10% of cases it mysteriously crashes with a segmentation fault. Just rerun it and it will work.

    /cbbl/soft/adun/chile/script Builder.st 1i45mod_reduced_fixed.pdb Amber96

3.4 Running simulations with ADUN

Make a separate directory for the simulation and put the datasource file there:

    mkdir 1i45
    cp 1i45mod_dimer.datasource 1i45/
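Since the advice for the occasional builder crash is simply to rerun it, the retry can be automated. The following Python sketch wraps an arbitrary command in a retry loop; it is a generic illustration and not part of ADUN. The function name `run_with_retries` and the limit of five attempts are our own choices, and the demonstration at the bottom runs a trivial Python command instead of the builder script.

```python
import subprocess
import sys


def run_with_retries(command, attempts=5):
    """Run a command until it exits with status 0; return the attempt number."""
    for attempt in range(1, attempts + 1):
        result = subprocess.run(command)
        if result.returncode == 0:
            return attempt
        # A crash such as a segmentation fault gives a non-zero return code.
        print(f"attempt {attempt} failed with code {result.returncode}",
              file=sys.stderr)
    raise RuntimeError(f"command failed {attempts} times: {command}")


if __name__ == "__main__":
    # Stand-in command for the demonstration; on the cluster one would pass the
    # builder invocation shown above as a list of arguments instead.
    print(run_with_retries([sys.executable, "-c", "pass"]))
```

Passing the command as a list avoids any shell quoting issues with file names.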
Prepare a template file by editing copies of /cbbl/soft/adun/resources/template.temp, placing one copy in each directory. The original template.temp already has sensible settings. However, <DATASOURCE> must be replaced by 1i45mod_reduced_fixed.datasource, and <NUMBER_OF_STEPS> needs to be set to a sensible value. The unit is femtoseconds.

To run the simulations in the CBBL cluster we provide a script that does most of the job for you:

    # prepare a cluster file
    cp /cbbl/soft/adun/resources/cluster.ini 1i45/
    # go into all directories and edit the cluster.ini
    # put in a nice name and a queue (e.g. cbbl)
    # start simulations
    /cbbl/soft/adun/chile/cluster /cbbl/users/scratch/chile/1i45

3.4.1 Analyzing the results

One of the powerful characteristics of ADUN is its ability to extract results from the simulations so that they can be analyzed using diverse algorithms.

RMSD analysis. We can run the RMSD plugin using the alpha carbons only with:

    /cbbl/soft/adun/chile/script RMSD.st /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation 1i45mod_reduced_fixed @CA

Extract energies and trajectories. As ADUN is oriented towards free energy calculations, the analysis of the energetics of the system along the MD trajectory is critical. The following command extracts energies, starting at frame 0 and continuing until frame 1000 is reached, taking every second frame:

    /cbbl/soft/adun/chile/resultsConverter -Mode Energy -Simulation /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation -Start 0 -Length 1000 -StepSize 2

The resultsConverter tool can also extract a series of PDB files, so that the trajectory can be viewed in, e.g., VMD:

    /cbbl/soft/adun/chile/resultsConverter -Mode Configuration -Simulation /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation -Start 0 -Length 1000 -StepSize 2

Essential dynamics. To obtain the essential modes of the system for the alpha carbons only, we use the corresponding ED.st script:

    /cbbl/soft/adun/chile/script ED.st /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation 1i45mod_dimer @CA
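As a reminder of what the RMSD analysis above measures, the following is a minimal Python sketch of an RMSD between two sets of alpha-carbon coordinates. It is not the ADUN RMSD.st plugin: the function names and the toy coordinates are hypothetical, and the sketch only removes the overall translation, without the rotational superposition that a production tool would normally apply.

```python
import math


def centroid(coords):
    """Geometric centre of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def rmsd(ref, frame, remove_translation=True):
    """RMSD between two equally sized coordinate lists.

    Assumption: no rotational fit is performed; only the overall
    translation is optionally removed by centring both sets.
    """
    if not ref or len(ref) != len(frame):
        raise ValueError("coordinate sets must be non-empty and of equal size")
    if remove_translation:
        c_ref, c_frame = centroid(ref), centroid(frame)
        ref = [tuple(a - b for a, b in zip(p, c_ref)) for p in ref]
        frame = [tuple(a - b for a, b in zip(p, c_frame)) for p in frame]
    squared = sum((a - b) ** 2 for p, q in zip(ref, frame) for a, b in zip(p, q))
    return math.sqrt(squared / len(ref))


# Hypothetical toy data: three CA atoms on a line, and a rigidly shifted copy.
ref = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
shifted = [(x + 3.0, y + 4.0, z) for x, y, z in ref]

print(rmsd(ref, shifted, remove_translation=False))
print(rmsd(ref, shifted))
```

A rigid shift gives a non-zero RMSD unless the translation is removed first, which is why trajectory frames are usually superimposed on the reference before the RMSD is plotted.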
Exercise 4
Follow the same procedure to run an MD simulation for 1NEY. 1NEY has a 13P ligand; strip it for now, as we will come back later to how to incorporate the ligand. Create RMSD plots and a movie of the MD run for 1NEY.

4 Practical Session II: Solvation, pKa, FEP

4.1 Running solvation and pKa simulations using PDLD/S-LRA

The next task is to calculate the pKa shift for the residues in the TIM interface. To do so, we will choose the polaris task, and the program will prompt us with a table of the options for polaris:

    Table of the Keywords for the Polaris Level
    ...........................................
    keyword         modifier   example
    -------         --------   -------
    pre_pol         no         pre_pol
    solv_pdld       no         solv_pdld
    solv_pdld_evb   no         solv_pdld_evb
    solv_fep        no         solv_fep
    ai_pdld         no         ai_pdld
    bind_pdld       no         bind_pdld
    bind_pdld_evb   no         bind_pdld_evb
    bind_fep        no         bind_fep
    pka_pdld        no         pka_pdld
    pka_fep         no         pka_fep
    redox_pdld      no         redox_pdld
    redox_fep       no         redox_fep
    logp            no         logp
    titra_ph_0      no         titra_ph_0
    titra_ph        no         titra_ph
    pka_multi       no         pka_multi
    evb_pdld        no         evb_pdld
    prot_prot       no         prot_prot
    end             no         end
    help            yes        help <keyword1> <keyword2> ...
    help            yes        help all
    help            no         help
    exit/quit       no         exit

We will choose pka_pdld, and the program will prompt:

    Table of the Keywords for the pKa_pdld Level
    ............................................
    keyword         modifier   example
    -------         --------   -------
    reg1_res        yes        reg1_res 2
    pka_w           yes        pka_w 3.0
    pdld_fn         yes        pdld_fn asp.pdld
    reg1_atm        yes        reg1_atm 10 to 20
    ab_crg          yes        ab_crg 10 0.50 0.0
    regII_r         yes        regII_r 16.0
    config          yes        config 0 5
    use_restart     no         use_restart
    md_parm_r       no         md_parm_r
    md_parm_w       no         md_parm_w
    md_parm_p       no         md_parm_p
    help            yes        help <keyword1> <keyword2> ...
    help            no         help
    end             no         end
    exit/quit       no         exit

Next we will choose residue 137 as our region I. We will also set the number of configurations to run and the characteristics of the dynamics. Thus, we will tell the program to run the calculation on the initial protein structure and on 2 more conformations, which will be generated automatically by MD runs.

    pka_pdld> reg1_res 137
    atoms added to region I:
    atom#    charg_a   charg_b
    -----    -------   -------
    2108      0.000     0.000
    2109      0.000     0.000
    2110      0.000     0.000
    2111     -0.080     0.000
    2112      0.360     0.000
    2113      0.360     0.000
    2114      0.360     0.000
    pka_pdld> config 1 2

To run the program we just end the level and the calculation will proceed. The final result of the pKa calculation for this very simple test (which is of course unreliable because of the short run) is:

    PDLD SEMI-MACROSCOPIC ESTIMATE FOR pKa
    ......................................
    effective dielectric
    epsilon_p(e_p)              2      4      6      8     20     40     80
    pKa_intr for str. 1      9.93  10.17  10.25  10.28  10.29  10.35  10.37
    pKa_intr for str. 2      9.91  10.16  10.24  10.28  10.29  10.35  10.37
    ------------------------------------------------------------------------
    aver pKa_int             9.92  10.16  10.24  10.28  10.29  10.35  10.37
    estimated apparent pKa  13.01

where pKa_int is the intrinsic pKa, the one due to the self energy of the system, while the estimated apparent pKa includes the charge-charge contribution (see the course slides for details).

Exercise 5
Find the pKa shifts for all residues in the interface of the TIM dimer. Use the prot_prot keyword at the analyze level as a guide.

Exercise 6
Recall that the solv_pdld keyword at the polaris level is in fact a simplified version of the thermodynamic cycle for the pKa shift calculations. Based on this fact, check the stability of the loop 6 residues in the three structures 1I45, 1NEY and 1YPI and discuss the results.
See, e.g., [Bonet et al., 2006, Scheper et al., 2009].
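To make the pKa output of Section 4.1 easier to interpret, here is a small Python sketch that averages the intrinsic pKa values over the generated structures and converts a pKa into a protonated fraction with the Henderson-Hasselbalch relation. The numbers are copied from the output table in Section 4.1; the function names and the choice of pH 7 are illustrative assumptions and are not part of MOLARIS.

```python
# Intrinsic pKa values copied from the PDLD/S output table in Section 4.1,
# one list per structure, one entry per effective dielectric constant.
dielectrics = [2, 4, 6, 8, 20, 40, 80]
pka_intr = {
    1: [9.93, 10.17, 10.25, 10.28, 10.29, 10.35, 10.37],
    2: [9.91, 10.16, 10.24, 10.28, 10.29, 10.35, 10.37],
}


def average_over_structures(table):
    """Column-wise mean of the intrinsic pKa over all structures."""
    rows = list(table.values())
    return [sum(column) / len(column) for column in zip(*rows)]


def protonated_fraction(pka, ph):
    """Henderson-Hasselbalch fraction of the protonated form at a given pH."""
    return 1.0 / (1.0 + 10.0 ** (ph - pka))


aver = average_over_structures(pka_intr)
for eps, value in zip(dielectrics, aver):
    print(f"eps_p = {eps:2d}   aver pKa_int = {value:.3f}")

# Apparent pKa reported by the program for this test; pH 7 is an
# assumption chosen only for illustration.
print(protonated_fraction(13.01, 7.0))
```

With an apparent pKa this far above neutral pH the residue is essentially fully protonated, so in this example the shift matters mostly when comparing the bound and unbound environments rather than for the charge state at pH 7.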
4.2 LIE runs with ADUN

Next we will evaluate the absolute free energy of binding of a ligand to the two structures by means of the linear interaction energy (LIE) method of Åqvist and coworkers [Hansson et al., 1998]. We are going to run a LIE calculation using PGH as a ligand for the TIM structures (see footnote 6).

Footnote 6: This is an experimental implementation in ADUN, so we have tried to minimize the sources of error, although some may still exist.

First, we will build PDB files from XXXXmod_reduced_fixed containing the protein plus the ligand (after docking with autodock, for example). We will call these files XXXXmod_complex.pdb. Then, we will build datasources for TIM+PGH and for PGH alone, as in Section 3.3.1. Make sure that in all PDB files the PGH moiety bears the same chain label (C, here).

    /cbbl/soft/adun/chile/script Builder.st 1neymod_dimere.pdb Amber96
    /cbbl/soft/adun/chile/script Builder.st 1i45mod_dimere.pdb Amber96
    /cbbl/soft/adun/chile/script Builder.st PGH.pdb Amber96

Finally, the LIE run is done by:

    /cbbl/soft/adun/chile/script LIE.st /cbbl/users/scratch/chile/1ney/simulation1/CHILE.simulation /cbbl/users/scratch/chile/pgh/simulation1/CHILE.simulation C
    /cbbl/soft/adun/chile/script LIE.st /cbbl/users/scratch/chile/1i45/simulation1/CHILE.simulation /cbbl/users/scratch/chile/pgh/simulation1/CHILE.simulation C

Exercise 7
Calculate the absolute binding free energy for DHAP in both proteins (the 13P HETATM structure in 1NEY).

5 Practical Session III: enzymatic reactivity with EVB

5.1 EVB for enzymatic reactivity analysis

In order to study an enzymatic reaction we should compare the reaction mechanism in the protein with the corresponding reaction in water. Thus, most of the time we are interested in comparing the free energy profiles in both environments. We will see here how to run a simulation in protein. To run it in water you should modify the PDB file and the input files below.

First, we need to define the resonance states we are going to explore, following Åqvist [Åqvist and Fothergill, 1996].

    ./cbbl/users/jparetas/molaris/md1wyi.pdb
    enzymix
      evb
        evb_state 5 1.00 0.00 0.0 0.0 0.0 1
        evb_atm 3749  -0.68859 O-  -0.68295 O-  -0.68661 O-  -0.71403 O-  -0.70662 O-
        evb_atm 3750   0.19013 C0   0.19017 C0   0.19699 C0   0.20108 C0   0.21502 C0
        evb_atm 3751   0.58918 C+   0.58214 C+   0.33797 C+   0.27155 C+   0.33581 C+
        evb_atm 3752  -0.81382 O-  -0.80576 O-  -0.47856 O-  -0.51389 O-  -0.65842 O-
        evb_atm 3753   0.18825 C0   0.15741 C0   0.03782 C0   0.07854 C0   0.47497 C0
        evb_atm 3754  -0.63437 O-  -0.61647 O-  -0.61681 O-  -0.62313 O-  -0.72027 O-
        evb_atm 3755   0.07703 H0   0.10567 H0   0.04920 H0   0.04987 H0   0.03755 H0
        evb_atm 3756   0.06757 H0   0.07493 H0   0.05844 H0   0.05610 H0   0.06462 H0
        evb_atm 3757   0.07803 H0   0.05535 H0   0.03800 H0   0.02770 H0   0.01087 H0
        evb_atm 3758   0.08548 H0   0.03909 H0   0.22519 H0   0.26049 H0   0.12992 H0
        evb_atm 3759   0.24922 H0   0.25966 H0   0.21598 H0   0.12190 H0   0.14398 H0
        evb_atm 1416   0.17993 C0   0.17114 C0   0.16329 C0   0.17634 C0   0.17737 C0
        evb_atm 1417  -0.53201 N-  -0.46055 N-  -0.47382 N-  -0.51056 N-  -0.50804 N-
        evb_atm 1418   0.15798 H0   0.12208 H0   0.23880 H0   0.23902 H0   0.23974 H0
        evb_atm 1419   0.20793 C0   0.18771 C0   0.19239 C0   0.21360 C0   0.20854 C0
        evb_atm 1420   0.00456 H0   0.02048 H0   0.00419 H0   0.00979 H0   0.00686 H0
        evb_atm 1421   0.05774 N0   0.12747 N0  -0.36805 N0   0.12963 N0   0.08870 N0
        evb_atm 1422  -0.12399 C0  -0.11718 C0  -0.14108 C0  -0.11732 C0  -0.12039 C0
        evb_atm 1423   0.01450 H0   0.01485 H0   0.00967 H0   0.01515 H0   0.01468 H0
        evb_atm 2505  -0.03807 C0  -0.02721 C0  -0.00095 C0  -0.02629 C0  -0.03614 C0
        evb_atm 2506   0.07814 H0   0.06715 H0   0.05948 H0   0.06925 H0   0.07107 H0
        evb_atm 2507   0.06048 H0   0.08105 H0   0.09790 H0   0.08797 H0   0.07871 H0
        evb_atm 2508   0.72937 C+   0.73513 C+   0.74744 C+   0.73159 C+   0.73425 C+
        evb_atm 2509  -0.80168 O-  -0.75704 O-  -0.78554 O-  -0.88197 O-  -0.88894 O-
        evb_atm 2510  -0.84230 O0  -0.88402 O0  -0.46001 O0  -0.75316 O-  -0.80857 O-
        evb_bnd 0 3749 3750
        evb_bnd 0 3750 3755
        evb_bnd 0 3750 3756
        evb_bnd 0 3750 3751
        evb_bnd 0 3751 3752
        evb_bnd 0 3751 3753
        evb_bnd 0 3753 3757
        evb_bnd 1 3753 3758
        evb_bnd 2 3758 2510
        evb_bnd 3 3758 2510
        evb_bnd 4 3758 2510
        evb_bnd 0 3753 3754
        evb_bnd 1 3754 3759
        evb_bnd 2 3754 3759
        evb_bnd 3 3754 3759
        evb_bnd 0 2505 2506
        evb_bnd 0 2505 2507
        evb_bnd 0 2505 2508
        evb_bnd 0 2508 2509
        evb_bnd 0 2508 2510
        evb_bnd 0 1416 1422
        evb_bnd 0 1422 1423
        evb_bnd 0 1422 1421
        evb_bnd 1 1421 1418
        evb_bnd 2 1421 1418
        evb_bnd 3 3752 1418
        evb_bnd 4 3752 1418
        evb_bnd 5 3752 1418
        evb_bnd 0 1421 1419
        evb_bnd 4 1421 3759
        evb_bnd 5 1421 3759
        evb_bnd 5 3758 3751
        evb_bnd 0 1419 1420
        evb_bnd 0 1419 1417
        evb_bnd 0 1417 1416

and then we can continue with a regular MD run:

        gas_dg 1 0.0
        gas_dg 2 115.0
        gas_dg 3 50.0
        evb_parm
          iflag_r4 0
        end
        rest_out evb_tim12.res
        md_parm
          temperature 300.0
          nsteps 50000
          ss 0.001
          region2a_r 18
          water_r 18
          langevin_r 18.5
          movie_co 1 2 3
          movie_fq 1000
          constraint_1 0.30
        end
      end
    end
    # now we check the numbering of the atoms for defining region I
    analyze
      resatom 1
      resatom 2
      resatom 3
    end

After the relaxation we can proceed with the actual free energy perturbation calculations:

    enzymix
      evb
        evb_state 5 1.00 0.00 0.0 0.0 0.0 1
        ap_pf 51 1 2
        evb_atm 3749  -0.68859 O-  -0.68295 O-  -0.68661 O-  -0.71403 O-  -0.70662 O-
        (...)
        # repeat here the description of the EVB region
        (...)
        evb_bnd 0 1417 1416
        # gas_dg 1 0.0
        # gas_dg 2 115.0
        # gas_dg 3 50.0
        hij 1 2 3753 2510 10. 2.5
        evb_parm
          iflag_r4 0
        end
        rest_out evb_tim12.res
        md_parm
          temperature 300.0
          nsteps 50000
          ss 0.001
          region2a_r 18
          water_r 18
          langevin_r 18.5
          movie_co 1 2 3
          movie_fq 1000
          constraint_1 0.30
        end
      end
    end
    # now we check the numbering of the atoms for defining region I
    analyze
      resatom 1
      resatom 2
      resatom 3
    end

After the end command, the program will start the FEP protocol according to the settings above. The final result is a set of *.map files, one for each frame.
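The FEP protocol above drives the system from one resonance state to another through a series of mapping frames. The Python sketch below illustrates the standard EVB/FEP bookkeeping under two stated assumptions: that the line "ap_pf 51 1 2" requests 51 evenly spaced mapping frames between states 1 and 2 (the handout does not spell this out), and that the off-diagonal coupling is a constant H12 in a two-state picture. The function names and energy values are hypothetical and this is not the MOLARIS implementation.

```python
import math


def lambda_schedule(n_frames):
    """Evenly spaced mapping parameters from 0 to 1 (assumed frame layout)."""
    if n_frames < 2:
        raise ValueError("need at least two frames")
    return [m / (n_frames - 1) for m in range(n_frames)]


def mapping_potential(e1, e2, lam):
    """Linear FEP mapping potential: (1 - lambda) * e1 + lambda * e2."""
    return (1.0 - lam) * e1 + lam * e2


def evb_ground_state(e1, e2, h12):
    """Lowest eigenvalue of the 2x2 EVB Hamiltonian [[e1, h12], [h12, e2]]."""
    return 0.5 * (e1 + e2) - 0.5 * math.sqrt((e1 - e2) ** 2 + 4.0 * h12 ** 2)


lams = lambda_schedule(51)

# Hypothetical diabatic energies (kcal/mol) for a single configuration.
e1, e2, h12 = 3.0, 7.0, 10.0
for lam in (lams[0], lams[25], lams[-1]):
    print(lam, mapping_potential(e1, e2, lam), evb_ground_state(e1, e2, h12))
```

Sampling on the mapping potential and then evaluating the ground-state energy on the sampled configurations is what allows the free energy profile to be reconstructed from the per-frame output files.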
Exercise 8
Design the EVB run for the water system and execute both runs in the CBBL cluster.

A Running in the luke cluster

The luke cluster is going to be used for all expensive runs with NAMD as well as for all runs in MOLARIS and ADUN. To connect, use

    ssh <username>@arbutus.imim.es

This will bring you to your /home/<username> directory. To submit calculations, first change to the /homes/users/<username> directory, which is the one shared by all the nodes in the cluster. Otherwise your calculations will not succeed.

The luke cluster uses the Sun Grid Engine (SGE) queuing system. In order to use it, first add this line to your $HOME/.bashrc file:

    source /cbbl/soft/sge6.2u2_1/default/common/settings.sh

Some important commands are qsub, qstat and qdel. Use the Unix man command to obtain help on each SGE command. You can also find information on several web sites (see, e.g., http://www.ats.ucla.edu/clusters/common/computing/batch/sge.htm).

Running ADUN in the luke cluster is done by using special scripts, as described in the corresponding sections.

B Extra material for NAMD

B.1 Set up

In this hands-on class you are supposed to use NAMD in a local installation. If this is not possible, or you need to run in the cluster, add the following line to your $HOME/.bashrc file:

    export PATH=/cbbl/soft/NAMD_2.6_Linux-amd64/:$PATH

B.2 VMD: wat_sphere.tcl

    ### Script to immerse TIM in a sphere of water just large enough
    ### to cover it. $max is the radius of the protein
    ### Adapted from the NAMD tutorial

    set molname 1i45

    mol new ${molname}.psf
    mol addfile ${molname}.pdb

    ### Determine the center of mass of the molecule and store the coordinates
    set cen [measure center [atomselect top all] weight mass]
    set x1 [lindex $cen 0]
    set y1 [lindex $cen 1]
    set z1 [lindex $cen 2]
    set max 0

    ### Determine the distance of the farthest atom from the center of mass
    foreach atom [[atomselect top all] get index] {
        set pos [lindex [[atomselect top "index $atom"] get {x y z}] 0]
        set x2 [lindex $pos 0]
        set y2 [lindex $pos 1]
        set z2 [lindex $pos 2]
        set dist [expr pow(($x2-$x1)*($x2-$x1) + ($y2-$y1)*($y2-$y1) + ($z2-$z1)*($z2-$z1),0.5)]
        if {$dist > $max} {set max $dist}
    }

    mol delete top

    ### Solvate the molecule in a water box with enough padding (15 A).
    ### One could alternatively align the molecule such that the vector
    ### from the center of mass to the farthest atom is aligned with an axis,
    ### and then use no padding
    package require solvate
    solvate ${molname}.psf ${molname}.pdb -t 15 -o del_water

    resetpsf
    package require psfgen
    mol new del_water.psf
    mol addfile del_water.pdb
    readpsf del_water.psf
    coordpdb del_water.pdb

    ### Determine which water molecules need to be deleted and use a for loop
    ### to delete them
    set wat [atomselect top "same residue as {water and ((x-$x1)*(x-$x1) + (y-$y1)*(y-$y1) + (z-$z1)*(z-$z1)) < ($ma
    set del [atomselect top "water and not same residue as {water and ((x-$x1)*(x-$x1) + (y-$y1)*(y-$y1) + (z-$z1)
    set seg [$del get segid]
    set res [$del get resid]
    set name [$del get name]
    for {set i 0} {$i < [llength $seg]} {incr i} {
        delatom [lindex $seg $i] [lindex $res $i] [lindex $name $i]
    }
    writepsf ${molname}_ws.psf
    writepdb ${molname}_ws.pdb

    mol delete top

    mol new ${molname}_ws.psf
    mol addfile ${molname}_ws.pdb
    puts "CENTER OF MASS OF SPHERE IS: [measure center [atomselect top all] weight mass]"
    puts "RADIUS OF SPHERE IS: $max"
    mol delete top

B.3 VMD: sod2pot.tcl

    #!/usr/local/bin/vmd -dispdev text
    # replacing Na+ with K+ (or anything else with anything else)
    # adapted from the original file from
    # Ilya Balabin (ilya@ks.uiuc.edu), 2002-2003

    # define input files here
    set psffile "1i45_wb_NaCl.psf"
    set pdbfile "1i45_wb_NaCl.pdb"
    set prefix "1i45_wb_KCl"

    # define what ions to replace with what ions
    set ionfrom "SOD"
    set ionto "POT"

    # do not change anything below this line
    package require psfgen
    topology top_all27_prot_lipid.rtf

    puts "\nSod2pot) Reading ${psffile}/${pdbfile}..."
    resetpsf
    readpsf $psffile
    coordpdb $pdbfile
    mol load psf $psffile pdb $pdbfile
    set sel [atomselect top "name $ionfrom"]
    set poslist [$sel get {x y z}]
    set seglist [$sel get segid]
    set reslist [$sel get resid]
    set num [llength $reslist]
    puts "Sod2pot) Found ${num} ${ionfrom} ions to replace..."

    set num 0
    foreach segid $seglist resid $reslist {
        delatom $segid $resid
        incr num
    }
    puts "Sod2pot) Deleted ${num} ${ionfrom} ions"

    segment $ionto {
        first NONE
        last NONE
        foreach res $reslist {
            residue $res $ionto
        }
    }
    set num [llength $reslist]
    puts "Sod2pot) Created ${num} topology entries for ${ionto} ions"

    set num 0
    foreach xyz $poslist res $reslist {
        coord $ionto $res $ionto $xyz
        incr num
    }
    puts "Sod2pot) Set coordinates for ${num} ${ionto} ions"

    writepsf "${prefix}.psf"
    writepdb "${prefix}.pdb"
    puts "Sod2pot) Wrote ${prefix}.psf/${prefix}.pdb"
    puts "Sod2pot) All done."
    quit

B.4 VMD: 1i45_ws_eq.conf

    # Minimization and Equilibration of TIM in a water sphere

    #############################################################
    ## ADJUSTABLE PARAMETERS                                   ##
    #############################################################

    structure          1i45_ws.psf
    coordinates        1i45_ws.pdb
    set temperature    310
    set outputname     1i45_ws_eq
    firsttimestep      0

    #############################################################
    ## SIMULATION PARAMETERS                                   ##
    #############################################################

    # Input
    paraTypeCharmm     on
    parameters         par_all27_prot_lipid.prm
    temperature        $temperature

    # Force-Field Parameters
    exclude            scaled1-4
    1-4scaling         1.0
    cutoff             12.0
    switching          on
    switchdist         10.0
    pairlistdist       13.5

    # Integrator Parameters
    timestep           2.0   ;# 2fs/step
    rigidBonds         all   ;# needed for 2fs steps
    nonbondedFreq      1
    fullElectFrequency 2
    stepspercycle      10

    # Constant Temperature Control
    langevin           on    ;# do langevin dynamics
    langevinDamping    5     ;# damping coefficient (gamma) of 5/ps
    langevinTemp       $temperature
    langevinHydrogen   off   ;# don't couple langevin bath to hydrogens

    # Output
    outputName         $outputname
    restartfreq        500   ;# 500steps = every 1ps
    dcdfreq            250
    outputEnergies     100
    outputPressure     100

    #############################################################
    ## EXTRA PARAMETERS                                        ##
    #############################################################

    # Spherical boundary conditions
    sphericalBC        on
    sphericalBCcenter  30.3081743413, 28.8049907121, 15.353994423
    sphericalBCr1      26.0
    sphericalBCk1      10
    sphericalBCexp1    2

    #############################################################
    ## EXECUTION SCRIPT                                        ##
    #############################################################

    minimize           1000
    reinitvels         $temperature
    run                2500  ;# 5ps

B.5 VMD: 1i45_wb_eq.conf

    # Minimization and Equilibration of
    # TIM in a Water Box

    #############################################################
    ## ADJUSTABLE PARAMETERS                                   ##
    #############################################################

    structure          1i45_wb.psf
    coordinates        1i45_wb.pdb
    set temperature    310
    set outputname     1i45_wb_eq
    firsttimestep      0

    #############################################################
    ## SIMULATION PARAMETERS                                   ##
    #############################################################

    # Input
    paraTypeCharmm     on
    parameters         par_all27_prot_lipid.prm
    temperature        $temperature

    # Force-Field Parameters
    exclude            scaled1-4
    1-4scaling         1.0
    cutoff             12.0
    switching          on
    switchdist         10.0
    pairlistdist       13.5

    # Integrator Parameters
    timestep           2.0   ;# 2fs/step
    rigidBonds         all   ;# needed for 2fs steps
    nonbondedFreq      1
    fullElectFrequency 2
    stepspercycle      10

    # Constant Temperature Control
    langevin           on    ;# do langevin dynamics
    langevinDamping    5     ;# damping coefficient (gamma) of 5/ps
    langevinTemp       $temperature
    langevinHydrogen   off   ;# don't couple langevin bath to hydrogens

    # Periodic Boundary Conditions
    cellBasisVector1   42.0  0.   0.
    cellBasisVector2   0.    44.0 0.
    cellBasisVector3   0.    0    47.0
    cellOrigin         31.0  29.0 17.5
    wrapAll            on

    # PME (for full-system periodic electrostatics)
    PME                yes
    PMEGridSpacing     1.0

    #manual grid definition
    #PMEGridSizeX      45
    #PMEGridSizeY      45
    #PMEGridSizeZ      48

    # Constant Pressure Control (variable volume)
    useGroupPressure      yes ;# needed for rigidBonds
    useFlexibleCell       no
    useConstantArea       no
    langevinPiston        on
    langevinPistonTarget  1.01325 ;# in bar -> 1 atm
    langevinPistonPeriod  100.0
    langevinPistonDecay   50.0
    langevinPistonTemp    $temperature

    # Output
    outputName         $outputname
    restartfreq        500   ;# 500steps = every 1ps
    dcdfreq            250
    xstFreq            250
    outputEnergies     100
    outputPressure     100

    #############################################################
    ## EXECUTION SCRIPT                                        ##
    #############################################################

    # Minimization
    minimize           100
    reinitvels         $temperature
    run                2500  ;# 5ps

C Extra material for ADUN

C.1 Set up

Add these lines to your $HOME/.bashrc file:

    export LD_LIBRARY_PATH=/cbbl/soft/adun/chile/GNUstep/Library/Libraries:/soft/lib:/cbbl/soft/GNUstep/System/Library/Libraries:/cbbl/soft/adun/lib/lib:/cbbl/soft/OMPI/lib
    export HOMEPATH=$HOME
    export PATH=/cbbl/soft/reduce-3.13/:$PATH

D Additional tools

NAMD
  • A nice interface to NAMD runs can be found at http://mmtsb.org/workshops/mmtsb-ctbp_workshop_2009/Tutorials/MMTSB_NAMDSimulation/MMTSB_NAMDSimulation.html
  • An extensive example of a standard protocol for minimizing, heating and producing simulations with NAMD is provided here: http://faculty.uml.edu/vbarsegov/teaching/bioinformatics/lectures/MDSimulationsModified.pdf
  • Running replica exchange simulations with NAMD: http://www.ks.uiuc.edu/Research/namd/2.6/ug/node40.html
  • NAMD case studies: http://www.ks.uiuc.edu/Training/CaseStudies/

MOLARIS
  • The complete MOLARIS tutorials: http://cbbl.imim.es/?page_id=143#molaris

ADUN
  • The Adun site: http://adun.imim.es
  • The Adun install guide can be found here: http://lavandula.imim.es/adun-new/?page_id=103
  • What to check if something goes wrong with Adun: http://lavandula.imim.es/adun-new/?page_id=308
  • Experimental Live CD including an ADUN distribution: http://susegallery.com/a/hvXWpn/adun-user

References

[Åqvist and Fothergill, 1996] Åqvist, J. and Fothergill, M. (1996). Computer Simulation of the Triosephosphate Isomerase Catalyzed Reaction. 271(17):10010–10016.
[Bonet et al., 2006] Bonet, J., Caltabiano, G., Khan, A., Johnstons, M., Corbí, C., Gómez, A., Rovira, X., Teyra, J., and Villà-Freixa, J. (2006). The Role of Residue Stability in Transient Protein-Protein Interactions Involved in Enzymatic Phosphate Hydrolysis. A Computational Study. 63:65–77.

[Chu et al., 2003] Chu, Z. T., Villà-Freixa, J., Štrajbl, M., Schutz, C. N., Shurki, A., and Warshel, A. (2003). MOLARIS version alpha9.06.01.

[Hansson et al., 1998] Hansson, T., Marelius, J., and Åqvist, J. (1998). Ligand-binding affinity prediction by linear interaction energy methods. J. of Comput-Aided Mol. Design, 12(1):27–35.

[Humphrey et al., 1996] Humphrey, W., Dalke, A., and Schulten, K. (1996). VMD: visual molecular dynamics. J Mol Graph, 14(1):33–8, 27–8.

[Jogl et al., 2003] Jogl, G., Rozovsky, S., McDermott, A. E., and Tong, L. (2003). Optimal alignment for enzymatic proton transfer: Structure of the Michaelis complex of triosephosphate isomerase at 1.2-A resolution. Proceedings of the National Academy of Sciences of the United States of America, 100(1):50–55.

[Johnston et al., 2005] Johnston, M. A., Galvan, I. F., and Villà-Freixa, J. (2005). Framework-based design of a new all-purpose molecular simulation application: the Adun simulator. J Comput Chem, 26(15):1647–1659.

[Lee et al., 1993] Lee, F., Chu, Z., and Warshel, A. (1993). Microscopic and semimicroscopic calculations of electrostatic energies in proteins by the POLARIS and ENZYMIX programs. Journal of Computational Chemistry, 14(2):161–185.

[Lolis et al., 1990] Lolis, E., Alber, T., Davenport, R. C., Rose, D., Hartman, F. C., and Petsko, G. A. (1990). Structure of yeast triosephosphate isomerase at 1.9A resolution. Biochemistry, 29(28):6609–6618.

[Phillips et al., 2005] Phillips, J. C., Braun, R., Wang, W., Gumbart, J., Tajkhorshid, E., Villa, E., Chipot, C., Skeel, R. D., Kalé, L., and Schulten, K. (2005). Scalable molecular dynamics with NAMD. J Comput Chem, 26(16):1781–1802.

[Rozovsky et al., 2001] Rozovsky, S., Jogl, G., Tong, L., and McDermott, A. E. (2001). Solution-state NMR investigations of triosephosphate isomerase active site loop motion: ligand release in relation to active site loop dynamics. Journal of Molecular Biology, 310(1):271–280.

[Scheper et al., 2009] Scheper, J., Oliva, B., Villà-Freixa, J., and Thomson, T. M. (2009). Analysis of electrostatic contributions to the selectivity of interactions between ring-finger domains and ubiquitin-conjugating enzymes. Proteins, 74(1):92–103.

[Warshel and King, 1985] Warshel, A. and King, G. (1985). Polarization Constraints in Molecular Dynamics Simulation of Aqueous Solutions: The Surface Constraint All Atom Solvent (SCAAS) Model. 121:124–9.

[Word et al., 1999] Word, J., Lovell, S., Richardson, J., and Richardson, D. (1999). Asparagine and glutamine: using hydrogen atom contacts in the choice of side-chain amide orientation. Journal of Molecular Biology, 285(4):1735–1747.