How To Recoord


Running RECOORD Scripts using CNS 1.1
Christiane Riedinger, Feb 2007

The RECOORD scripts are bash scripts that generate CNS input scripts (*.inp) tailored to NMR structure calculation. RECOORD comes with its own set of force fields and lets the user carry out a structure calculation using a standardised protocol.

Aart J. Nederveen, Jurgen F. Doreleijers, Wim Vranken, Zachary Miller, Chris A.E.M. Spronk, Sander B. Nabuurs, Peter Guentert, Miron Livny, John L. Markley, Michael Nilges, Eldon L. Ulrich, Robert Kaptein and Alexandre M.J.J. Bonvin (2005). RECOORD: a REcalculated COORdinates Database of 500+ proteins from the PDB using restraints from the BioMagResBank. Proteins 59, 662-672.

1. Installing the Scripts

• Make a project directory for your structure calculation in your home directory, e.g. <doc1>
• Put the RECOORD script directory into <doc1>
• Make all RECOORD scripts executable and add the directory to your path (keep $path in the list, otherwise the existing search path is overwritten):
    # vi .cshrc
    add: set path = ($path ~/doc1/RECOORD)
• Edit and execute the script “”:
    #!/bin/bash
    #
    # script to change scripts directory, to be run in scripts directory
    # script will change itself as well
    #
    # fill in directory here
    newDir=/users1/riedinger/doc1/RECOORD

2. Creating the topology file (.mtf)

The topology file contains atom information, bond information (lengths and angles), torsion angles and disulphide bonds, but it does not contain coordinates.

2.1. Generating the topology file from a primary sequence file

If you only have a primary sequence file available for your protein, you need to generate the topology file using a script from the CNS webpage, called generate_seq.inp:
• go to the CNS webpage and edit this script
• enter the name of your primary sequence file, the desired name of the output file….
• set hydrogens to true, select false for B-factor and occupancy (we are not doing crystallography!)….
• RECOORD comes with its own set of parameter and topology files, which need to be specified in generate_seq.inp:
    {================== protein topology and parameter files ===================}
    {* protein topology file *}
    {===>} prot_topology_infile="SCRIPTS:/toppar/";
    {* protein linkage file *}
    {===>} prot_link_infile="SCRIPTS:/toppar/topallhdg5.3.pep";
    {* protein parameter file *}
    {===>} prot_parameter_infile="SCRIPTS:/toppar/";
    {================nucleic acid topology and parameter files =================}
    {* nucleic acid topology file *}
    {===>} nucl_topology_infile="SCRIPTS:/toppar/";
    {* nucleic acid linkage file *}
    {===>} nucl_link_infile="SCRIPTS:/toppar/";
    {* nucleic acid parameter file *}
    {===>} nucl_parameter_infile="SCRIPTS:/toppar/dna-rna-allatom.param";
    {=================== water topology and parameter files ====================}
    {* water topology file *}
    {===>} water_topology_infile="SCRIPTS:/toppar/topallhdg5.3.sol";
    {* water parameter file *}
    {===>} water_parameter_infile="SCRIPTS:/toppar/parallhdg5.3.sol";
    {================= carbohydrate topology and parameter files ===============}
    {* carbohydrate topology file *}
    {===>} carbo_topology_infile="SCRIPTS:/toppar/";
    {* carbohydrate parameter file *}
    {===>} carbo_parameter_infile="SCRIPTS:/toppar/carbohydrate.param";
    {============= prosthetic group topology and parameter files ===============}
    {* prosthetic group topology file *}
    {===>} prost_topology_infile="";
    {* prosthetic group parameter file *}
    {===>} prost_parameter_infile="";
    {===================== ion topology and parameter files ====================}
    {* ion topology file *}
    {===>} ion_topology_infile="SCRIPTS:/toppar/";
    {* ion parameter file *}
    {===>} ion_parameter_infile="SCRIPTS:/toppar/ion.param";
• Save the script in your project directory, and put your primary sequence file there as well.
• Now you need to set an environment variable:
    # setenv SCRIPTS ~/doc1/RECOORD
• Run the script:
    # cns < generate_seq.inp > generate_seq.out
• The file generated is doc1.mtf
• The output file will give you information in case things have gone wrong.

2.2. Generating the topology file from a pdb file

If you already have a pdb file of your protein, you generate the topology file with (similar to generate_easy.inp from the CNS webpage)

3. Generate an extended structure

The extended structure is the starting point for simulated annealing and is generated using the script
• # <your mtf file>
• creates <your protein>_extended.pdb
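
Since generate_seq.inp refers to its topology and parameter files through the SCRIPTS variable (section 2.1), a missing or mistyped file name only shows up once CNS is running. The following is a minimal sketch of a check that can be run beforehand. It is not part of RECOORD or CNS: the function name check_cns_inputs is hypothetical, and the sketch assumes that every reference in the input script has the form "SCRIPTS:/..." and that file names contain no spaces.

```shell
#!/bin/bash
# Sketch (not part of RECOORD): report files that a CNS input script
# references as "SCRIPTS:/..." but that do not exist under the given root.
# check_cns_inputs is a hypothetical helper name.
check_cns_inputs() {
    local inp root missing ref refs path
    inp="$1"     # CNS input script, e.g. generate_seq.inp
    root="$2"    # directory that SCRIPTS points to
    missing=0
    # Collect every SCRIPTS:/... value up to the closing double quote.
    refs=$(grep -o 'SCRIPTS:/[^"]*' "$inp" || true)
    # Word splitting is intended here; assumes file names without spaces.
    for ref in $refs; do
        path="${root}${ref#SCRIPTS:}"   # swap the SCRIPTS: prefix for the real directory
        if [ ! -f "$path" ]; then
            echo "missing: $path"
            missing=1
        fi
    done
    return "$missing"
}

# Possible use in bash (export replaces the csh setenv shown in section 2.1):
#   export SCRIPTS=~/doc1/RECOORD
#   check_cns_inputs generate_seq.inp "$SCRIPTS" && cns < generate_seq.inp > generate_seq.out
```

The function returns a non-zero status if any referenced file is absent, so it can gate the CNS run as shown in the commented usage lines.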

4. Start the simulated annealing

Place your restraint files in your project directory. The restraint files need to be named as follows:
• unambig.tbl
• ambig.tbl
• dihedrals.tbl
• hbonds.tbl
• methyls.tbl
• …

An example of an unambiguous restraint file (an exclamation mark indicates a comment):
    !Q 60
    assign (resid 60 and name hn) (resid 60 and name hb1) 1.8 0.0 1.1 ! 3dhnnoe_new.259 1.80378 strong
    assign (resid 60 and name hn) (resid 60 and name hb2) 1.8 0.0 1.7 ! 3dhnnoe_new.240 0.93875 medium

An example of a dihedral restraint file (coming from TALOS):
    ! 1. Q 57 Phi -103.79 +/- 70.98 (-174.77 to -32.81)
    assign (resid 56 and name C) (resid 57 and name N) (resid 57 and name CA) (resid 57 and name C) 1.0 -103.79 70.98 2

• the simulated annealing is run with the script
• the file generates the CNS input file annealing.inp and run.cns (containing the restraints). It then generates refineLong.inp files, each one calculating one pdb file.
• go thoroughly through the script to specify parameters:
    #!/bin/bash
    # run as: <entries>
    #
    # script for calculating an NMR ensemble with MDSA
    # per model only one job is generated
    #
    # Aart Nederveen 2003, Utrecht University

    ############ settings for CNS calculation #############

    # The following files should be present in project directory:
    # project_cns.pdb
    # project_cns.mtf
    # project_cns_extended.pdb
    # simply rename your mtf and extended pdb files to follow this pattern, e.g.
    # <your protein>_cns_extended.pdb. I don't know why it needs a second pdb file, so I just
    # copied my doc1_cns_extended.pdb file to doc1_cns.pdb. That seemed to work, but no
    # guarantee!

    # directory for models that are calculated
    dirRefined='str'

    # directory for scripts
    dirScripts='/users1/riedinger/doc1/RECOORD'

    # directory for CNS output
    dirCalc='cnsRef'

    # submit command for cluster
    # if 'csh', then your own computer is used
    # submit='ssub linux_cluster'
    # submit='csh'
    # either choose 'csh' for your own computer, or if you are using synapse, specify:
    submit='qsub -V cwd'

    # number of models that are generated
    # all models are generated with the same protocol; only the seed number differs
    number=2 # select as you wish

    # if deletePrevious is 0 then no calculation is performed if coordinatefile is already present
    deletePrevious=0

    # sleep time between successive jobs to make cluster happy
    sleepTime='3s'

    # cns executable
    # cnsExec='/software/cns_1.1/cns_solve_1.1/intel-i686-linux_g77/bin/cns'
    cnsExec='/packages/cns/cns_solve_1.1/intel-i686-linux/bin/cns' # enter your correct path

    # settings for symmetric dimer (ncs + symmetry restraints)
    # mind that segid names and residuenumbers are grepped from $entry.pdb
    symDimerOn=0 # 0 = off, if you have a monomer

    # choose longer protocol; double number of steps, default 0
    doubleSteps=0
    …
• finally, to run the script, you need to be one directory above your project directory
• to run the script:
    # <your project directory>

5. Analysing Violations

• First, use the script, which will generate and run calcViol_all.inp
• Before you run it for the first time, make sure you have entered the correct CNS executable for your installation
• You have to run this script from within your project directory:
    # doc1_cns str violations 0.3 doc1_cns.mtf > calcViol.out
• There won’t be an output file if there are no errors
• The exact input variables are explained in the script itself; the above is just an example
• When you’re done, run the second script, called analys

Useful tips:
• If running a script again after an error, remove every file that it has created.
• Check that the CNS executable is stated correctly
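
Section 4 lists the files that must be present in the project directory before the annealing script is started. A minimal sketch of a pre-flight check follows. It is not part of RECOORD: the function name check_project is hypothetical, and the sketch assumes that the three project files are required while the restraint tables are only reported, because the source does not state which of the tables are mandatory.

```shell
#!/bin/bash
# Sketch (not part of RECOORD): check a project directory for the files
# listed in section 4. check_project is a hypothetical helper name.
check_project() {
    local dir project f status
    dir="$1"        # project directory, e.g. ~/doc1
    project="$2"    # project name used as file prefix, e.g. doc1
    status=0
    # Files the annealing script expects, named <project>_cns*.
    for f in "${project}_cns.pdb" "${project}_cns.mtf" "${project}_cns_extended.pdb"; do
        if [ ! -f "$dir/$f" ]; then
            echo "missing required file: $f"
            status=1
        fi
    done
    # Restraint tables are only reported (assumption: not all are mandatory).
    for f in unambig.tbl ambig.tbl dihedrals.tbl hbonds.tbl methyls.tbl; do
        if [ -f "$dir/$f" ]; then
            echo "found restraint file: $f"
        fi
    done
    return "$status"
}

# Possible use:
#   check_project ~/doc1 doc1
```

A non-zero return status means at least one of the three project files is missing, in which case the annealing script should not be started yet.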