Molecular dynamics (MD) simulations allow atoms and molecules to interact over time, representing a virtual experiment. MD was used to give dynamics to SUMO proteins in solution. The SUMO protein was divided into fragments which were given random conformations using CYANA. These conformations were then converted to GROMACS format and molecular dynamics simulations were performed using GROMACS. The simulations involved energy minimization to relieve strain, followed by production runs. Various analysis tools were then used to analyze the results.
I had prepared this presentation for an internal project during my masters degree course.
2. Molecular dynamics and Simulations
— Molecular dynamics (MD) is a form of computer simulation
in which atoms and molecules are allowed to interact for a
period of time.
— Because molecular systems generally consist of a vast number of
particles, it is impossible to find the properties of such complex
systems analytically; MD simulation circumvents this problem
by using numerical methods.
— It represents an interface between laboratory experiments and
theory.
— It can be understood as a "virtual experiment."
3. Purpose Of MD for SUMO
• Proteins in solution are considered to be dynamic.
• It is difficult to study their motions, behavior and structural flexibility in
solution.
• The structure of small proteins can be solved and studied by the
conventional technique of X-ray crystallography.
• X-ray techniques require a strictly periodic crystal lattice, which
is very difficult to obtain for non-crystalline structures.
• Molecular dynamics simulations can predict the state of a protein in
solution and save these states in the form of a trajectory.
• MD can predict the movement of large proteins in solution, which
is not possible with X-ray crystallography.
• MD can simulate conditions close to those in which a protein exists.
• Structures obtained after MD simulation can be regarded as
energy-minimized and geometrically optimized structures, which
allows them to be used in further work: NMR,
docking, protein-ligand interactions.
4. Brief Methodology
1. Use physics to find the potential energy between all pairs of atoms.
2. Move atoms to the next state.
3. Repeat.
Energy Function
— Describes the interaction energies of all atoms and molecules in the
system.
— Always an approximation.
Closer to real physics --> more realistic, more computation time (i.e.
smaller time steps and more interactions increase accuracy)
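The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the algorithm of any particular MD package: two particles in one dimension joined by a harmonic bond, reduced units, and velocity Verlet as the integrator. The function names and all parameter values are hypothetical.

```python
def bond_force(x1, x2, k=100.0, r0=1.0):
    """Forces on particles 1 and 2 from a harmonic bond E = 0.5*k*(r - r0)**2."""
    r = x2 - x1
    f = k * (r - r0)          # positive when the bond is stretched
    return f, -f              # particle 1 is pulled forward, particle 2 back


def bond_energy(x1, x2, k=100.0, r0=1.0):
    return 0.5 * k * ((x2 - x1) - r0) ** 2


def run_md(n_steps=1000, dt=0.001, mass=1.0):
    x = [0.0, 1.2]            # start with a stretched bond (assumed values)
    v = [0.0, 0.0]
    f = list(bond_force(x[0], x[1]))          # step 1: forces from the energy function
    trajectory = []
    for _ in range(n_steps):                  # step 3: repeat
        for i in range(2):                    # step 2: move atoms (velocity Verlet)
            v[i] += 0.5 * dt * f[i] / mass    # half-step velocity update
            x[i] += dt * v[i]                 # full-step position update
        f = list(bond_force(x[0], x[1]))      # forces at the new positions
        for i in range(2):
            v[i] += 0.5 * dt * f[i] / mass    # second half-step velocity update
        kinetic = sum(0.5 * mass * vi * vi for vi in v)
        trajectory.append((x[0], x[1], kinetic + bond_energy(x[0], x[1])))
    return trajectory


trajectory = run_md()
```

Each entry of `trajectory` holds both positions and the total energy of that frame. With a small enough time step the total energy stays close to its starting value, which is a common sanity check on an integrator.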
5. Scale in Simulations
[Figure: simulation domains arranged by length scale (10^-10 m to 10^-4 m) and time scale (10^-12 s to 10^-6 s): quantum chemistry (Hψ = Eψ), molecular dynamics (F = ma), Monte Carlo (exp(-ΔE/kT)), mesoscale, continuum.]
6. Molecular dynamics on proteins
— Although normally represented as static structures, proteins
are in fact dynamic.
— Most experimental properties, for example, measure a time
average or an ensemble average over the range of possible
configurations the molecule can adopt.
— One way to investigate the range of accessible configurations
is to simulate the motions or dynamics of a molecule
numerically. This can be done by computing a trajectory, a
series of molecular configurations as a function of time, by the
simultaneous integration of Newton's equations of motion.
7. So what exactly is Molecular Dynamics?
• Here it is the process of setting the atoms of the protein in motion
by raising the temperature of the system and then cooling it
rapidly over a very short time scale (simulated annealing).
• Under these conditions, steric clashes and strained bonds
between the amino acid residues are
removed or modified.
• It generates stable, energy-minimized
conformations of the protein.
• While doing so it records many different frames of the same
protein, which together form a trajectory.
10. SUMO proteins
— Small Ubiquitin-like Modifier or SUMO proteins are a family of
small proteins that are covalently attached to and detached from
other proteins in cells to modify their function.
— The attachment of SUMO to a target protein is known as
SUMOylation.
— SUMOylation is a post-translational modification involved in various
cellular processes such as transcriptional regulation, apoptosis, protein
stability etc.
— SUMO is similar to ubiquitin, and SUMOylation is directed by an enzymatic
cascade analogous to that involved in ubiquitination. In contrast
to ubiquitin, SUMO is not used to tag proteins for degradation.
12. Function of SUMO
— SUMO modification of proteins has many functions. Among the
most frequent and best studied are protein stability,
nuclear-cytosolic transport, and transcriptional regulation.
— Typically, only a small fraction of a given protein is SUMOylated
and this modification is rapidly reversed by the action of
deSUMOylating enzymes. The SUMO-1 modification of RanGAP1
(the first identified SUMO substrate) leads to its trafficking from
cytosol to nuclear pore complex.
— SUMO modification of a protein can also lead to its movement from
the centrosome to the nucleus.
13. Structure
— There are 3 confirmed SUMO isoforms in humans: SUMO-1, SUMO-2 and
SUMO-3. SUMO-2/3 show a high degree of similarity to each other and
are distinct from SUMO-1.
— SUMO proteins are small; most are around 88 to 100 amino acids in length
and 12 kDa in mass. The exact length and mass vary between SUMO family
members and depend on which organism the protein comes from.
— Although SUMO has very little homology with ubiquitin at the amino acid
level, it has a nearly identical structural fold.
— SUMO1 is a globular protein with both ends of the amino acid chain sticking
out of the protein's centre. The spherical core consists of an alpha helix and
a beta sheet.
The SUMO protein taken for this work was from Drosophila
melanogaster.
14. Giving Dynamics to the protein
Step 1: Generation of structures.
Step 2: Performing molecular dynamics on each of the topologies.
Step 3: Recording the potential energy changes in the protein during
dynamics.
Step 4: Clustering of the best minimized structures.
Programs/software used:
• Cyana
• NAMD/VMD
• VEGA ZZ
• GROMACS
15. Generation of structures using Cyana
— For the sake of convenience and ease of dynamics, the SUMO protein
was divided into five fragments.
— These fragments were chosen based on their propensity to form
secondary structures.
Fragment      Residue numbers
Fragment 1    1-12
Fragment 2    11-32
Fragment 3    31-53
Fragment 4    52-72
Fragment 5    71-88
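The residue ranges in the table overlap by two residues at each boundary. A minimal Python sketch of cutting a sequence into these fragments is given below; the ranges are taken from the table, while the 88-residue sequence is a made-up placeholder (not the real SUMO sequence) and `split_sequence` is a hypothetical helper name.

```python
# 1-based, inclusive residue ranges from the fragment table
FRAGMENTS = {1: (1, 12), 2: (11, 32), 3: (31, 53), 4: (52, 72), 5: (71, 88)}


def split_sequence(sequence, fragments=FRAGMENTS):
    """Return {fragment number: sub-sequence} for 1-based inclusive ranges."""
    return {
        number: sequence[start - 1:end]   # convert to 0-based slicing
        for number, (start, end) in fragments.items()
    }


# Placeholder sequence of length 88, NOT the real protein sequence
alphabet = "ACDEFGHIKLMNPQRSTVWY"
sequence = (alphabet * 5)[:88]

pieces = split_sequence(sequence)
for number, piece in pieces.items():
    print(f"Fragment {number}: {len(piece)} residues")
```

Because neighbouring fragments share two residues, the last two residues of one piece are the first two of the next, which is what would allow the pieces to be matched up again later.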
16. — First it was necessary to make 1000 random conformers or topologies
from each of the five different fragments.
— The program used for this structure generation was CYANA, which is a
Linux-based program.
— For this, the sequence of each of the five fragments of the SUMO protein
was given to the program, which was told to generate 1000 random
topologies.
— Dynamics and annealing conditions were applied to give the energy of
these 1000 random structures.
— The program was told to select the 20 best energy-minimized structures.
— These 20 different topologies could be viewed using software such as
PyMOL, MOLMOL, VMD etc.
— The dynamics of the protein was carried out in vacuum without applying
any constraints, and the dynamics of the protein could be
played back and saved using the above-mentioned software.
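The "generate 1000, keep the 20 best" selection can be illustrated in Python. This sketch does not call CYANA: the energies are random stand-in numbers, and the names `select_best` and `conformers` are hypothetical. It only shows the ranking step.

```python
import heapq
import random


def select_best(conformers, n_best=20):
    """Return the n_best (id, energy) pairs with the lowest energy, sorted."""
    return heapq.nsmallest(n_best, conformers, key=lambda c: c[1])


# Stand-in data: 1000 conformer ids with made-up energies
rng = random.Random(0)
conformers = [(i, rng.gauss(0.0, 10.0)) for i in range(1000)]

best = select_best(conformers)
print(f"kept {len(best)} of {len(conformers)} conformers")
```

`heapq.nsmallest` avoids sorting the whole list, although for 1000 structures a plain `sorted(...)[:20]` would work just as well.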
17. Files used in Cyana
— First, a .CCO file has to be created for each of the five fragments.
.CCO file:
1 MET H HA 6.7277 3.20E+00
2 SER H HA 6.9968 1.20E+00
3 ASP H HA 6.4720 1.20E+00
4 GLU H HA 6.8444 1.20E+00
5 LYS H HA 6.9625 1.20E+00
...
53 THR H HA 7.5359 2.20E+00
18. • Init.cya file
.cya is a batch file which contains a set of commands.
Rmsd range := 31.....53
Cyana.lib
Read seq third.seq
Swap = 0
19. Batch file
— 'Seed' asks the program to generate 1000 topologies.
— The last two commands create the 20 best topologies.
21. HIGHLIGHTS
— Generally 3 to 10 times faster than other Molecular Dynamics programs
— Very user-friendly: issues clear error messages, no scripting language is
required to run the programs, prints out the progress of the program that
is running, etc.
— Allows the trajectory data to be stored in a compact way.
— Gromacs provides a basic trajectory data viewer; xmgr or Grace may also
be used to analyze the results.
— Files from earlier versions of Gromacs may be used in the latest Gromacs,
version 3.1.
To run a simulation several things are needed:
1. a file containing the coordinates for all atoms.
2. information on the interactions (bond angles, charges, Van der
Waals).
3. parameters to control the simulation.
22. The exercise falls into four sections,
corresponding to the actual steps in an MD
simulation.
1. Conversion of the pdb structure file to a Gromacs structure file,
with the simultaneous generation of a descriptive topology file.
2. Energy minimization of the structure to release strain.
3. Running a full simulation.
4. Analyzing results.
23. File Formats
— PDB file
------ format used by the Brookhaven Protein Data Bank.
Fields: atom, residue, residue number, X, Y, Z coordinates
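The fields listed above sit in fixed columns of each ATOM record. A hedged Python sketch of reading them follows; the column positions are those of the standard PDB ATOM record, while the record itself is built from made-up coordinates and `parse_atom_line` is a hypothetical helper name.

```python
def parse_atom_line(line):
    """Extract the fields named on the slide from one fixed-width ATOM record."""
    return {
        "atom": line[12:16].strip(),      # columns 13-16: atom name
        "residue": line[17:20].strip(),   # columns 18-20: residue name
        "res_no": int(line[22:26]),       # columns 23-26: residue number
        "x": float(line[30:38]),          # columns 31-38: X in Angstrom
        "y": float(line[38:46]),          # columns 39-46: Y in Angstrom
        "z": float(line[46:54]),          # columns 47-54: Z in Angstrom
    }


# Build an illustrative record column by column (coordinates are made up)
record = "ATOM  {:5d} {:<4s} {:>3s} {}{:4d}    {:8.3f}{:8.3f}{:8.3f}".format(
    1, " N", "MET", "A", 1, 27.340, 24.430, 2.614
)

atom = parse_atom_line(record)
print(atom)
```

Slicing by column rather than splitting on whitespace matters because neighbouring fields in a PDB file can touch without a separating space.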
29. *.xvg file:
file format that is read by Grace (formerly called Xmgr), which is a
plotting tool for the X window system.
Plot of X vs Y
30. *.mdp file:
allows the user to set up specific parameters for all the
calculations that Gromacs performs.
Recording every 0.002 ps
Number of steps for MD = 500000
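The two numbers above fix the length of the run. The Python sketch below works this out, assuming that 0.002 ps is the integration time step (`dt`) and 500000 the step count (`nsteps`). `integrator`, `dt` and `nsteps` are standard .mdp parameter names, but the helper functions and the rendered layout are only illustrative.

```python
# Values from the slide; the "integrator" setting is an assumption
params = {"integrator": "md", "dt": 0.002, "nsteps": 500000}


def total_time_ps(p):
    """Simulated time in picoseconds = time step * number of steps."""
    return p["dt"] * p["nsteps"]


def render_mdp(p):
    """Render the parameters as 'key = value' lines, one per parameter."""
    return "\n".join(f"{key:<12}= {value}" for key, value in p.items())


print(render_mdp(params))
print(f"total simulated time: {total_time_ps(params):.0f} ps")
```

Under these assumptions the run covers 1000 ps, i.e. 1 ns of simulated time.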
36. Step 1: Conversion of the PDB File
— Each of the topologies saved in the PDB format was used as an input
file for MD simulations with Gromacs.
— It is first necessary to convert it to the Gromos file type (*.gro).
Original data in the pdb file is often incomplete; carbon-bound
hydrogens are generally omitted.
— The conversion program pdb2gmx will check every residue in the
structure file against a database and add all hydrogens. In the
conversion process it also creates a topology file, with all the
connections between the atoms listed.
— pdb2gmx can be used by simply typing it at the prompt.
Example: to get a list of the available options that this conversion
program can execute:
pdb2gmx -h
37. pdb2gmx -h
— This program reads a pdb file, reads some database files, adds
hydrogens to the molecules and generates coordinates in Gromacs
(Gromos) format and a topology in Gromacs format.
— This conversion program contains many built-in force fields. We have
to select the required force field.
Options in pdb2gmx:
Option   Description
-f       Input
-o       Output
-p       Output for topology file
-i       Output
-n       Output
-q       Output
-ff      To assign force field
38. — The program will ask to select a force field:
— Select the Force Field:
0: GROMOS96 43a1 force field.
1: GROMOS96 43b1 vacuum force field.
2: GROMOS96 43a2 force field (improved alkane dihedrals).
3: OPLS-AA/L all-atom force field (for amino acid dihedrals).
4: Gromacs force field (gmx) with hydrogens for NMR.
5: Encad all-atom force field, using scaled-down vacuum charges.
6: Encad all-atom force field, using full solvent charges.
The Gromacs (gmx) force field with hydrogens for NMR (option 4) was used.
39. Outputs produced by this command
Once the selection of the force field is done, three kinds of output files are produced:
1. PDB files
2. the generated topology (.top) file
3. gromos (.gro) file
• dsmt3.gro:
It looks a lot like the original pdb file, containing the same information regarding
the positions of the atoms, but the layout is different, hydrogens have been added
and units have been converted to nm.
• dsmt3.top:
This file contains the information on the atom names, types, masses and charges, as
well as a description of bonds, angles, dihedrals, etc.
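The change of layout and units between the pdb and .gro files can be sketched in Python. The fixed-width layout below follows the documented .gro atom line (residue number, residue name, atom name, atom number, then x, y, z in nm with three decimals). The coordinates are made up and the helper names are hypothetical.

```python
def angstrom_to_nm(value):
    """PDB coordinates are in Angstrom; .gro coordinates are in nm."""
    return value / 10.0


def gro_atom_line(resnr, resname, atomname, atomnr, x, y, z):
    """Format one atom in the fixed-width .gro layout (positions in nm)."""
    return "%5d%-5s%5s%5d%8.3f%8.3f%8.3f" % (
        resnr, resname, atomname, atomnr, x, y, z
    )


# Made-up PDB coordinates in Angstrom
pdb_xyz = (27.340, 24.430, 2.614)

line = gro_atom_line(1, "MET", "N", 1, *(angstrom_to_nm(c) for c in pdb_xyz))
print(line)
```

Note that the .gro layout keeps only three decimals in nm, so the third decimal of the original Angstrom value is lost unless a wider format is used.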
40. Force fields used during molecular dynamics
— A force field (also called a forcefield) refers to the functional form and
parameter sets used to describe the potential energy of a system of
particles (in this case the atoms and the residues).
— As protein models consist of hundreds or thousands of atoms, the only
feasible methods of computing systems of such size are molecular
mechanics calculations.
— Force-field parameters are assigned to each atom in the protein. This figure is a
schematic representation of the four key contributions to a molecular
mechanics force field: bond stretching, angle bending, torsional terms
and non-bonded interactions.
45. Energy = Stretching Energy + Bending Energy + Torsion Energy + Non-Bonded Interaction Energy
Types of force fields
1. All-atom force fields - provide parameters for every atom in a system, including
hydrogen.
2. United-atom force fields - treat the hydrogen and carbon atoms in methyl and
methylene groups as a single interaction center.
3. Coarse-grained force fields - which are frequently used in long-time simulations of
proteins.
These equations, together with the data (parameters) required to describe the behavior
of different kinds of atoms and bonds, are called a force field.
47. Editconf
• editconf puts the .gro file into a box.
• The box can be modified with the options -box, -d and -angles. Both -box and -d will center
the system in the box.
• Option -bt determines the box type: cubic is a rectangular box with all sides equal,
dodecahedron represents a rhombic dodecahedron and octahedron is a truncated
octahedron.
• With -d and cubic, dodecahedron or octahedron boxes, the dimensions are set to the
diameter of the system (the largest distance between atoms) plus twice the specified distance.
Options in editconf
Option   Description
-f       Input
-n       Input (index file)
-o       Output
-bt      For box type
-d       Distance between the solute and the box
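For a cubic box, the effect of -d can be worked out by hand. The Python sketch below assumes the rule that the edge equals the system diameter (largest distance between any two atoms) plus twice the solute-box distance. The coordinates are made up and the helper names are hypothetical.

```python
import itertools
import math


def system_diameter(coords):
    """Largest distance between any two atoms."""
    return max(math.dist(a, b) for a, b in itertools.combinations(coords, 2))


def cubic_box_edge(coords, d):
    """Edge of a cubic box that leaves a distance d on each side of the solute."""
    return system_diameter(coords) + 2.0 * d


# Three made-up atom positions in nm
coords = [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0), (0.0, 4.0, 0.0)]

edge = cubic_box_edge(coords, d=1.0)
print(f"diameter = {system_diameter(coords):.1f} nm, box edge = {edge:.1f} nm")
```

The pairwise search grows quadratically with the number of atoms, which is fine for a sketch but would need a smarter approach for very large systems.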
50. Genbox
Genbox can do one of 2 things:
1) Generate a box of solvent.
2) Solvate a solute configuration, e.g. a protein, in a bath of solvent molecules. Specify -cp
(solute) and -cs (solvent). The box specified in the solute coordinate file (-cp) is used.
• Here the solvent was 8 M urea (acting as a denaturant), with the protein
as the solute (protein dissolved in 8 M urea).
• The solvent file for urea was of the form urea+water.gro.
Options in Genbox
Option   Description
-cp      Input (solute)
-cs      Input (solvent)
-o       Output
-p       Output
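As a sanity check on a solvent box prepared at 8 M, the expected number of urea molecules can be estimated from the box volume. The Python sketch below is plain arithmetic and does not reproduce what genbox does internally. The 5 nm box size is an assumed example, and the estimate ignores the volume taken up by the protein.

```python
AVOGADRO = 6.02214076e23      # molecules per mole
LITRES_PER_NM3 = 1e-24        # 1 nm^3 = 1e-27 m^3 = 1e-24 L


def molecules_for_molarity(molarity, box_nm):
    """Number of solute molecules needed for a given molarity in a box (nm)."""
    volume_litres = box_nm[0] * box_nm[1] * box_nm[2] * LITRES_PER_NM3
    return round(molarity * AVOGADRO * volume_litres)


n_urea = molecules_for_molarity(8.0, (5.0, 5.0, 5.0))
print(f"8 M urea in a 5 nm cubic box: about {n_urea} molecules")
```

Comparing such an estimate with the number of urea residues in the solvated .gro file is a quick way to confirm that the solvent file really corresponds to the intended concentration.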
53. Step 2: Energy Minimization
• The structure is now complete (hydrogens have been added) and a topology
file has been created.
• However, there may be local strain in the protein, due to the generation of
the hydrogens, and bad Van der Waals contacts may exist, caused by
particles that are too close.
• The strain has to be removed by energy minimization of the structure. This
can be done with the program 'mdrun', which is the MD program. mdrun
uses a single .tpr file as input, which is generated by combining the topology
(aki.top), structure (aki.gro) and parameter (minim.mdp) files.
• To generate the .tpr file, the program grompp has to be used.
• grompp also reads the parameters for mdrun (e.g. number of MD steps, time
step, cut-off).
— A description of grompp can be obtained by giving the command:
grompp -h
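How minimization relieves a bad Van der Waals contact can be shown on a toy case. The Python sketch below is not the mdrun implementation: it applies a steepest-descent scheme with an adaptive step (grow after an accepted move, halve after a rejected one) to a single pair of Lennard-Jones particles that start too close. All parameter values and function names are illustrative assumptions.

```python
def lj_energy(r, sigma=0.34, epsilon=1.0):
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def lj_gradient(r, sigma=0.34, epsilon=1.0):
    """dE/dr of the Lennard-Jones energy."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (-12.0 * sr6 * sr6 + 6.0 * sr6) / r


def steepest_descent(r, step=0.01, tol=1e-4, max_steps=1000):
    """Move downhill along -gradient until the gradient is below tol."""
    energy = lj_energy(r)
    for _ in range(max_steps):
        gradient = lj_gradient(r)
        if abs(gradient) < tol:
            break
        trial = r - step * gradient / abs(gradient)   # fixed-length downhill move
        trial_energy = lj_energy(trial)
        if trial_energy < energy:
            r, energy = trial, trial_energy            # accept and be bolder
            step *= 1.2
        else:
            step *= 0.5                                # reject and be more careful
    return r, energy


start = 0.30                                           # closer than sigma: a clash
r_min, e_min = steepest_descent(start)
print(f"start E = {lj_energy(start):.2f}, minimized E = {e_min:.3f} at r = {r_min:.4f}")
```

The potential energy drops from a large positive value (the clash) to the bottom of the well, which mirrors the decrease in potential energy that is monitored during a real minimization.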
54. Options in grompp
Option   Description
-f       Input: grompp input file with MD parameters
-po      Output: grompp input file with MD parameters
-c       Input
-r       Input
-n       Input
-p       Input: the topology file
-pp      Output: the preprocessed topology file
-o       Output
-t       Input: the trajectory file
-e       Input: the energy file
-np      Number of nodes for which the status file is generated
56. Mdrun
• The mdrun program is the main computational chemistry engine within GROMACS.
• It performs Molecular Dynamics simulations, Brownian Dynamics and Langevin
Dynamics as well as Conjugate Gradient or Steepest Descents energy minimization.
Principle
The mdrun program reads the run input file (-s)
Distributes the topology over nodes.
The coordinates are passed around, so that computations can begin.
A neighborlist is made, then the forces are computed.
The forces are globally summed, and positions are updated.
If necessary, SHAKE is performed to constrain bond lengths and/or bond
angles.
• Temperature and Pressure can be controlled using weak coupling to a bath.
57. • Options in mdrun
Option   Description
-np      Number of nodes used
-s       Input
-o       Output
-c       Output
-e       Output
-g       Output
-x       Output
58. — The energy minimization may take some time, depending on the CPU and the load
of the computer.
— The trajectory file is not very important in energy minimizations, but the generated
structure file (minimized.gro) will serve as input for the simulation.
— During the minimization the potential energy decreases. A plot of the energy over
time can be made from the minim_ener.edr file using the g_energy command.
— Simply make a plot from the .edr file by executing:
g_energy -f dsmt3-em_ener.edr -o dsmt3-em_ener.xvg
— This will display a list of the energy terms that can be selected.
59. — Select the property you want by typing its name, e.g. Potential for the
potential energy, then press return, and press return once more to
quit.
— The program g_energy produces an .xvg graph, which can be viewed and
edited with xmgrace (a plotting program commonly used with Gromacs output):
xmgrace -nxy dsmt3-em_ener.xvg
— [Graph: plot of dsmt3-em_ener.xvg]
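An .xvg file can also be read without xmgrace. The Python sketch below assumes the usual layout, in which lines starting with `#` are comments, lines starting with `@` are Grace plot directives, and the rest are whitespace-separated numeric columns. The sample content is made up for illustration and `parse_xvg` is a hypothetical helper name.

```python
def parse_xvg(text):
    """Return the numeric rows of an .xvg file, skipping '#' and '@' lines."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line[0] in "#@":
            continue
        rows.append([float(value) for value in line.split()])
    return rows


# Made-up sample in the style of an energy-minimization output
sample = """# illustrative sample, not real output
@    title "Energies"
@    xaxis  label "Time (ps)"
0.000000  -1200.5
1.000000  -1850.2
2.000000  -2100.9
"""

rows = parse_xvg(sample)
times = [row[0] for row in rows]
potential = [row[1] for row in rows]
print(f"{len(rows)} points, final potential energy {potential[-1]}")
```

To read a real file, the same function can be applied to the result of `open(path).read()`.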
61. Position Restrained MD
• Molecular dynamics of the water molecules is performed while the position of the peptide
is kept fixed. This is called position restrained (PR) MD.
• Position restrained MD keeps the peptide fixed and lets all water molecules equilibrate around the
peptide in order to fill holes, etc., which were not filled by the genbox program.
• It is first necessary to pre-process the input files to generate the binary topology. The input files are:
the topology file, the structure file (output of the EM) and a parameter file.
• By default, the system is split into two groups, Protein and SOL(vent), to put position restraints on
all the atoms of the peptide.
• The parameter file (.mdp extension) contains information about the PR-MD such as step size,
number of steps, temperature, etc. This parameter file also tells GROMACS what kind of simulation
should be performed.
[Figure: annotated pre-processing command. Inputs: position restrained MD parameter file, topology file of the protein, energy-minimized file from the previous step. Output: position restrained MD file with the energy-minimized protein in it.]
65. • These two commands are used to run full molecular dynamics, where no part of
the system is fixed.
• Both the protein and the solvent are subjected to motion until they
become stable at a particular point, which is retained as the final output.
• g_filter frequency-filters trajectories, which is useful for making smooth movies. Many
of the trajectories are filtered and in all only 10 trajectories are kept. These can
be read by PyMOL (protein visualization software).
67. — PyMOL cannot read Gromacs xtc trajectories, and it is better to remove the solvent from
the trajectory to concentrate on the protein.
— This is easy to fix by using another Gromacs program to convert the trajectory to PDB
format and select only one group:
— The program asks for an output group, and Protein is selected.
— The trajectory of the protein after MD is visualized in PyMOL by giving the command:
pymol dsmt3-finaltraj.pdb