Homology Modelling & Modeller
28/06/20 HM and modeller 2
What to expect !!
➢Basics of molecular modelling and the different approaches to it
➢Homology modelling: its steps and some pipelines that help
➢Working with Modeller !!
The Basics !!
What is molecular modelling ?
What is a protein ?
Primary structure
Secondary structure
Tertiary structure
Quaternary structure
Source: Welcometrust.org
The Basics !! Molecular modelling
What is molecular modelling ?
Deriving, representing
Mechanisms, manipulating for benefit
Applying
It satisfies the need for methods to elucidate protein structure
Approaches towards molecular modelling
Sources of structural information ? Role in Activity & Function!!
X-ray Crystallography
NMR
Cryo-EM
Analogy of Hand and Shadow
Limitations in protein structure acquisition
X-ray crystallography
● Needs crystals, which are very hard to obtain for proteins
● Resolution problems along loops, for certain side chains, and in assigning tautomeric states
● Ambiguities in bound ligands
Nuclear magnetic resonance
● Mass restriction (up to 64 kDa)
● Poor resolving power
● Sensitivity losses and increased spectral complexity
Davis, A.M., Teague, S.J. and Kleywegt, G.J. (2003), Angewandte Chemie International Edition, 42: 2718-2736.
Dominique et al., Curr Opin Struct Biol. 2013 Oct; 23(5): 734–739.
Homology models, sequence identity and applicability
Source: A. Sali et al.
Approaches towards molecular modelling
Modelling approaches: Threading (30% < similarity < 50%)
Given the protein sequence, which fold of a known structure resembles the unknown ?
Source: researchgate.net
Fold classification databases: CATH, SCOP
Software: Raptor, HHpred, Phyre
https://en.wikipedia.org/wiki/Threading_(protein_sequence)
Approaches towards molecular modelling
Modelling approaches: Ab initio method (similarity < 30%)
Source: researchgate.net
Uses knowledge from physics, protein conformation parameters and the structures of small protein fragments to construct the actual model
Principle: the “thermodynamic hypothesis”
Image sources: Jamesmccaffrey.wordpress.com, Www.biology-online.com
Software: Rosetta, TOUCHSTONE-2, I-TASSER
https://bioinformaticsreview.com/20171210/ab-initio-prediction-of-protein-structure-an-introduction/
Approaches towards molecular modelling
Modelling approaches: Homology modelling (similarity > 50%)
Source: researchgate.net
“Evolutionarily, structures are more conserved than sequences”
The degree of structural divergence depends on sequence similarity/homology
Some terminology: homologs, paralogs, orthologs
Similar Seq. = Similar Str.
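The similarity ranges quoted for the three approaches can be written as a small decision helper. This is only a sketch of the rule of thumb from these slides: the function name is made up for illustration, and assigning the exact boundary values (30% and 50%) to threading is an assumption, since the slides leave them open.

```python
def suggest_approach(similarity_percent):
    """Map a target-template similarity (in percent) to a modelling approach.

    Thresholds follow the rule of thumb on the slides; how the exact
    boundary values 30 and 50 are treated is an assumption.
    """
    if not 0 <= similarity_percent <= 100:
        raise ValueError("similarity must be between 0 and 100")
    if similarity_percent > 50:
        return "homology modelling"
    if similarity_percent >= 30:
        return "threading"
    return "ab initio"


for value in (72, 41, 18):
    print(value, "->", suggest_approach(value))
```

In practice the boundaries are soft, so a helper like this is best treated as a first hint rather than a decision.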
Errors in a model may be due to:
● Errors in the template
● Target-template alignment errors
● Side-chain packing errors
● Distortion/shift in aligned regions
https://pdb101.rcsb.org
Modeller 9.24
➢Modeller is a homology-based protein structure prediction tool
➢Models are built by the “satisfaction of spatial restraints” technique
➢It is scripted in Python, and incorporates libraries and functions that help manage structural data
➢Robust methods to assess a model and evaluate its energy are also among its key features
➢It is free for academic use, but it needs a licence key obtained by registration (it is not GPL software)
https://salilab.org/modeller/
Installation
● If a conda environment is already set up, Modeller is available through the conda package manager
$ conda config --add channels salilab ; conda install modeller
Insert the licence key in place of “XXXX” in the /home/../anaconda2/lib/modeller-9.4/modlib/modeller/config.py file
● For Linux and Windows installation, please follow the guide
https://salilab.org/modeller/download_installation.html
$ sudo apt install ./package.deb
Run in a terminal: $ mod9.24 (to check whether the installation has succeeded)
B. Webb, A. Sali. Comparative Protein Structure Modeling Using Modeller. Current Protocols in Bioinformatics 54, John Wiley & Sons, Inc., 5.6.1-5.6.37, 2016.
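A quick way to check the installation from Python, without running any modelling, is to ask whether the package and the launcher can be found. This is a minimal sketch using only the standard library; the helper name is made up for illustration, and the launcher name `mod9.24` is taken from the slide, so it will differ for other versions.

```python
import importlib.util
import shutil


def modeller_status(launcher="mod9.24"):
    """Report whether the 'modeller' package and its launcher script are visible."""
    return {
        # True if 'import modeller' would find the package
        "python_package": importlib.util.find_spec("modeller") is not None,
        # True if the launcher script is on the PATH
        "launcher_on_path": shutil.which(launcher) is not None,
    }


print(modeller_status())
```

Both values being False usually means the wrong conda environment is active or the install did not complete. A missing licence key only shows up once Modeller is actually imported and run.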
Homology modelling workflow
1. Template selection
Template identification: E-value [close to 0] > coverage > identity > similarity
Read the PDB paper and check the quality parameters {resolution, R-value, R-free value}; read the structure's PDB file, check the REMARK records and the wwPDB validation report
2. Seq. alignment
Sequence-based and structure-based
3. Model construction
Rigid-body assembly, segment matching, and satisfaction of spatial restraints
4. Evaluate model
SWISS-MODEL validation, SAVES, PROSESS
Srinivas Ramachandran, Pradeep Kota, Feng Ding, Nikolay V. Dokholyan et al., Proteins. 2011 Jan; 79(1): 261–270.
https://blast.ncbi.nlm.nih.gov/Blast.cgi?CMD=Web&PAGE_TYPE=BlastDocs&DOC_TYPE=BlastHelp#get_subsequence
Davis, A.M., Teague, S.J. and Kleywegt, G.J. (2003), Angewandte Chemie International Edition, 42: 2718-2736.
https://pdb101.rcsb.org/learn/guide-to-understanding-pdb-data/r-value-and-r-free
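The priority order for template identification (E-value first, then coverage, identity and similarity) can be expressed as a sort key. The sketch below is illustrative only: the `Hit` record, its field names and the three hits are hypothetical, not output from a real search.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    template: str
    evalue: float      # lower is better
    coverage: float    # percent of the target covered, higher is better
    identity: float    # percent identity, higher is better
    similarity: float  # percent similarity, higher is better


def rank_hits(hits):
    """Sort by E-value, then coverage, identity and similarity, in that priority."""
    return sorted(
        hits,
        key=lambda h: (h.evalue, -h.coverage, -h.identity, -h.similarity),
    )


# Made-up hits, for illustration only
hits = [
    Hit("tmplA", 1e-30, 95.0, 40.0, 60.0),
    Hit("tmplB", 0.0, 80.0, 45.0, 62.0),
    Hit("tmplC", 0.0, 98.0, 44.0, 61.0),
]
print([h.template for h in rank_hits(hits)])  # ['tmplC', 'tmplB', 'tmplA']
```

A strict ordering like this is a simplification. The structure quality checks listed above (resolution, R-value, R-free) still have to be weighed by hand.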
Satisfying spatial restraints !!
Homology-derived restraints
Statistical restraints from experimental structures
Force-field-based restraints
The template acts as a guide: the new model is built by satisfying the restraints imposed
B. Webb, A. Sali. Comparative Protein Structure Modeling Using Modeller. Current Protocols in Bioinformatics 54, John Wiley & Sons, Inc., 5.6.1-5.6.37, 2016.
Satisfying spatial restraints !!
Homology-derived restraints
Statistical restraints from experimental structures
Force-field-based restraints
An optimization function runs to minimize the restraint violations, finally giving the model
B. Webb, A. Sali. Comparative Protein Structure Modeling Using Modeller. Current Protocols in Bioinformatics 54, John Wiley & Sons, Inc., 5.6.1-5.6.37, 2016.
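To make "minimizing restraint violations" concrete, here is a toy sketch: three points in 2D, three target distances, and plain gradient descent on the sum of squared deviations. It is not Modeller's objective function or optimizer; the coordinates, target distances, step size and function names are all made up for illustration.

```python
import math


def violation(coords, restraints):
    """Sum of squared deviations from the target distances."""
    return sum(
        (math.dist(coords[i], coords[j]) - target) ** 2
        for i, j, target in restraints
    )


def minimize(coords, restraints, step=0.05, n_steps=2000):
    """Plain gradient descent on the violation (toy optimizer, 2D points)."""
    coords = [list(p) for p in coords]
    for _ in range(n_steps):
        grads = [[0.0, 0.0] for _ in coords]
        for i, j, target in restraints:
            d = math.dist(coords[i], coords[j])
            if d == 0.0:
                continue  # direction undefined for coincident points
            factor = 2.0 * (d - target) / d
            for k in range(2):
                g = factor * (coords[i][k] - coords[j][k])
                grads[i][k] += g
                grads[j][k] -= g
        for point, grad in zip(coords, grads):
            for k in range(2):
                point[k] -= step * grad[k]
    return coords


# Made-up starting positions and (i, j, target distance) restraints
start = [(0.0, 0.0), (1.0, 0.2), (0.3, 1.5)]
restraints = [(0, 1, 3.8), (1, 2, 3.8), (0, 2, 5.0)]
final = minimize(start, restraints)
print(violation(start, restraints), "->", violation(final, restraints))
```

The real objective combines many restraint types as probability density functions, but the idea is the same: start from a rough structure and move the atoms until the restraints are violated as little as possible.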
Session on modeller9.24
Template Selection
Seq. Alignment
Model Construction
Evaluate model
A basic modelling example on TvLDH is used here !!
https://salilab.org/modeller/tutorial/
What you need to start
● The TvLDH.ali file
● All the Python scripts
● pdb_95.pir: a non-redundant collection of sequences from high-quality PDB structures
$ mod9.24 pythonscript.py (runs the script)
Session on modeller9.24
1. Making a Modeller-compatible input file
The PIR format, saved with the .ali extension
FASTA ---> PIR
>P1;TvLDH
sequence:TvLDH:::::::0.00: 0.00
MSEAAHVLITGAAGQIGYILSHWIASGELYGDRQVYLHLLDIPPAMNRLTALTMELEDCAFPHLAGFVATTDPKA
AFKDIDCAFLVASMPLKPGQVRADLISSNSVIFKNTGEYLSKWAKPSVKVLVIGNPDNTNCEIAMLHAKNLKPEN
FSSLSMLDQNRAYYEVASKLGVDVKDVHDIIVWGNHGESMVADLTQATFTKEGKTQKVVDVLDHDYVFDTFFKKI
GHRAWDILEHRGFTSAASPTKAAIQHMKAWLFGTAPGEVLSMGIPVPEGNPYGIKPGVVFSFPCNVDKEGKIHVV
EGFKVNDWLREKLDFTEKDLFHEKEIALNHLAQGG*
>sp|Q6UXH0|ANGL8_HUMAN Angiopoietin-like protein 8 OS=Homo sapiens OX=9606 GN=ANGPTL8 PE=1 SV=1
MPVPALCLLWALAMVTRPASAAPMGGPELAQHEELTLLFHGTLQLGQALNGVYRTTEGRL
TKARNSLGLYGRTIELLGQEVSRGRDAAQELRASLLETQMEEDILQLQAEATAEVLGEVA
QAQKVLRDSVQRLEVQLRSAWLGPAYREFEVLKAHADKQSHILWALTGHVQRQRREMVAQ
QHRLRQIQERLHTAALPA
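The FASTA to PIR conversion can be scripted. The sketch below assumes a single-record FASTA input whose header sits on one line. It copies the minimal `sequence:` header layout of the TvLDH example above, and the function name and the short demo record are made up for illustration.

```python
def fasta_to_pir(fasta_text, code, width=75):
    """Convert a single-record FASTA string to a minimal PIR (.ali) entry."""
    # Keep only sequence lines; the FASTA header (starting with '>') is dropped
    seq = "".join(
        line.strip()
        for line in fasta_text.splitlines()
        if line.strip() and not line.startswith(">")
    )
    if not seq:
        raise ValueError("no sequence found in FASTA input")
    chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
    chunks[-1] += "*"  # PIR sequences end with an asterisk
    header = [">P1;" + code, "sequence:" + code + ":::::::0.00: 0.00"]
    return "\n".join(header + chunks) + "\n"


# Hypothetical short record, for illustration only
fasta = ">demo hypothetical record\nMKVLAAGIT\nGQLSH\n"
print(fasta_to_pir(fasta, "demo"))
```

The code passed as `code` must match the `align_codes` and `sequence` names used later in the modelling scripts.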
Session on modeller9.24
2. Executing the build_profile.py script
from modeller import *

log.verbose()
env = environ()

#-- Prepare the input files

#-- Read in the sequence database
sdb = sequence_db(env)
sdb.read(seq_database_file='pdb_95.pir', seq_database_format='PIR',
         chains_list='ALL', minmax_db_seq_len=(30, 4000), clean_sequences=True)

#-- Write the sequence database in binary form
sdb.write(seq_database_file='pdb_95.bin', seq_database_format='BINARY',
          chains_list='ALL')

#-- Now, read in the binary database
sdb.read(seq_database_file='pdb_95.bin', seq_database_format='BINARY',
         chains_list='ALL')

#-- Read in the target sequence/alignment
aln = alignment(env)
aln.append(file='TvLDH.ali', alignment_format='PIR', align_codes='ALL')

#-- Convert the input sequence/alignment into profile format
prf = aln.to_profile()

#-- Scan sequence database to pick up homologous sequences
prf.build(sdb, matrix_offset=-450, rr_file='${LIB}/blosum62.sim.mat',
          gap_penalties_1d=(-500, -50), n_prof_iterations=1,
          check_profile=False, max_aln_evalue=0.01)

#-- Write out the profile in text format
prf.write(file='build_profile.prf', profile_format='TEXT')

#-- Convert the profile back to alignment format
aln = prf.to_alignment()

#-- Write out the alignment file
aln.write(file='build_profile.ali', alignment_format='PIR')
Session on modeller9.24
“build_profile.prf” file (screenshot): the highlighted columns are the template PDB code, length, identity and E-value
Session on modeller9.24
compare.py
from modeller import *

env = environ()
aln = alignment(env)
# Read chain A of each candidate template and add it to the alignment
for (pdb, chain) in (('1b8p', 'A'), ('1bdm', 'A'), ('1civ', 'A'),
                     ('5mdh', 'A'), ('7mdh', 'A'), ('1smk', 'A')):
    m = model(env, file=pdb, model_segment=('FIRST:'+chain, 'LAST:'+chain))
    aln.append_model(m, atom_files=pdb, align_codes=pdb+chain)
aln.malign()
aln.malign3d()
aln.compare_structures()
aln.id_table(matrix_file='family.mat')
env.dendrogram(matrix_file='family.mat', cluster_cut=-1.0)
Assess which template is best by analysing structural and sequence similarity
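The sequence half of that comparison boils down to a table of pairwise identities. The sketch below computes one from already-aligned sequences in plain Python. It counts only columns where both sequences have a residue, which may differ from the normalisation Modeller uses in `id_table`, and the three aligned sequences are made up.

```python
def percent_identity(a, b):
    """Percent identity over columns where both aligned sequences have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def identity_table(aligned):
    """All-against-all identity table, keyed by (name, name)."""
    names = list(aligned)
    return {
        (p, q): percent_identity(aligned[p], aligned[q])
        for p in names
        for q in names
    }


# Made-up aligned sequences, for illustration only
aligned = {"tmplA": "MKV-LSA", "tmplB": "MKVALSA", "tmplC": "MRV-LTA"}
table = identity_table(aligned)
for (p, q), value in table.items():
    if p < q:
        print(p, q, round(value, 1))
```

Templates that cluster together in such a table carry redundant information, which is one reason to weigh crystallographic quality as well when choosing among them.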
Session on modeller9.24
● Template identification and selection using BLAST and manual search
Session on modeller9.24
align2d.py
from modeller import *

env = environ()
aln = alignment(env)
# Read chain A of the template structure
mdl = model(env, file='1bdm', model_segment=('FIRST:A','LAST:A'))
aln.append_model(mdl, align_codes='1bdmA', atom_files='1bdm.pdb')
# Add the target sequence
aln.append(file='TvLDH.ali', align_codes='TvLDH')
# Align the target sequence to the template structure
aln.align2d()
aln.write(file='TvLDH-1bdmA.ali', alignment_format='PIR')
aln.write(file='TvLDH-1bdmA.pap', alignment_format='PAP')
Based on dynamic programming, but more specific than a plain sequence alignment because it also factors in structural information from the template !!
Variable gap penalties !!
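The idea of a variable gap penalty can be shown with a toy global alignment in which the cost of a gap depends on the template position (expensive inside a helix or strand, cheap in a loop). This is not the align2d algorithm itself, only a sketch of the principle; the scoring values, sequences and per-position costs are made up.

```python
def align_variable_gap(target, template, gap_cost, match=2, mismatch=-1):
    """Global alignment where gap_cost[j] is the gap penalty at template position j."""
    n, m = len(target), len(template)

    def ins_cost(j):
        # Cost of inserting a target residue after template position j (1-based)
        return gap_cost[max(j - 1, 0)]

    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = score[i - 1][0] - ins_cost(0)
    for j in range(1, m + 1):
        score[0][j] = score[0][j - 1] - gap_cost[j - 1]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            s = match if target[i - 1] == template[j - 1] else mismatch
            score[i][j] = max(
                score[i - 1][j - 1] + s,          # align the two residues
                score[i][j - 1] - gap_cost[j - 1],  # gap in the target
                score[i - 1][j] - ins_cost(j),      # gap in the template
            )

    # Trace back from the bottom-right corner to recover the alignment
    a, b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0:
            s = match if target[i - 1] == template[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + s:
                a.append(target[i - 1]); b.append(template[j - 1])
                i -= 1; j -= 1
                continue
        if j > 0 and score[i][j] == score[i][j - 1] - gap_cost[j - 1]:
            a.append("-"); b.append(template[j - 1])
            j -= 1
        else:
            a.append(target[i - 1]); b.append("-")
            i -= 1
    return score[n][m], "".join(reversed(a)), "".join(reversed(b))


# Made-up example: template positions 0-3 sit in a helix (gaps cost 5),
# positions 4-6 in a loop (gaps cost 1)
template = "ACDEFGH"
gap_cost = [5.0, 5.0, 5.0, 5.0, 1.0, 1.0, 1.0]
total, aligned_target, aligned_template = align_variable_gap("ACDEGH", template, gap_cost)
print(total)
print(aligned_target)
print(aligned_template)
```

Here the missing residue falls in the cheap loop region, so the gap is placed there at a cost of 1 rather than forcing mismatches into the helix.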
Session on modeller9.24
The .pap alignment file written by align2d.py (screenshot)
Session on modeller9.24
from modeller import *
from modeller.automodel import *
#from modeller import soap_protein_od

env = environ()
# Build models of TvLDH from the 1bdmA template and score them with DOPE and GA341
a = automodel(env, alnfile='TvLDH-1bdmA.ali',
              knowns='1bdmA', sequence='TvLDH',
              assess_methods=(assess.DOPE,
                              #soap_protein_od.Scorer(),
                              assess.GA341))
# Build five models (numbers 1 to 5)
a.starting_model = 1
a.ending_model = 5
a.make()
Session on modeller9.24
model-single.log
>> Summary of successfully produced models:
Filename                     molpdf      DOPE score     GA341 score
----------------------------------------------------------------------
TvLDH.B99990001.pdb      1763.56104    -38079.76172         1.00000
TvLDH.B99990002.pdb      1560.93396    -38515.98047         1.00000
TvLDH.B99990003.pdb      1712.44104    -37984.30859         1.00000
TvLDH.B99990004.pdb      1720.70801    -37869.91406         1.00000
TvLDH.B99990005.pdb      1840.91772    -38052.00781         1.00000
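Picking the model to carry forward from this summary can be scripted. The sketch below parses the five rows shown above and selects the model with the lowest (most negative) DOPE score. The parsing assumes whitespace-separated columns in exactly this order, and the function name is made up.

```python
summary = """\
TvLDH.B99990001.pdb 1763.56104 -38079.76172 1.00000
TvLDH.B99990002.pdb 1560.93396 -38515.98047 1.00000
TvLDH.B99990003.pdb 1712.44104 -37984.30859 1.00000
TvLDH.B99990004.pdb 1720.70801 -37869.91406 1.00000
TvLDH.B99990005.pdb 1840.91772 -38052.00781 1.00000
"""


def parse_summary(text):
    """Return (filename, molpdf, dope, ga341) tuples from the summary rows."""
    models = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 4 and parts[0].endswith(".pdb"):
            name, molpdf, dope, ga341 = parts
            models.append((name, float(molpdf), float(dope), float(ga341)))
    return models


models = parse_summary(summary)
best = min(models, key=lambda m: m[2])  # lowest DOPE score
print(best[0], best[2])  # TvLDH.B99990002.pdb -38515.98047
```

With these numbers the second model also has the lowest molpdf, so both criteria point to the same choice.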
Session on modeller9.24
from modeller import *
from modeller.scripts import complete_pdb

log.verbose()    # request verbose output
env = environ()
env.libs.topology.read(file='$(LIB)/top_heav.lib')   # read topology
env.libs.parameters.read(file='$(LIB)/par.lib')      # read parameters

# read model file
mdl = complete_pdb(env, 'TvLDH.B99990002.pdb')

# Assess with DOPE:
s = selection(mdl)   # all atom selection
s.assess_dope(output='ENERGY_PROFILE NO_REPORT', file='TvLDH.profile',
              normalize_profile=True, smoothing_window=15)
Generates an energy profile
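The `smoothing_window` argument smooths the per-residue energies so that problem regions stand out instead of single noisy residues. The sketch below shows the general idea with a plain centred moving average. Modeller's own smoothing may weight residues differently, and the energy values here are made up.

```python
def smooth(values, window=15):
    """Centred moving average; the window shrinks at the chain ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


# Made-up per-residue values with a single spike
raw = [0.0, 0.0, 9.0, 0.0, 0.0]
print(smooth(raw, window=3))  # [0.0, 3.0, 3.0, 3.0, 0.0]
```

A wider window gives a smoother curve but blurs short problem regions such as small loops.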
Ways to increase the accuracy - MSA
salign.py creates an MSA between the templates !!
align2d_mult.py
from modeller import *

log.verbose()
env = environ()
env.libs.topology.read(file='$(LIB)/top_heav.lib')

# Read aligned structure(s):
aln = alignment(env)
aln.append(file='fm00495.ali', align_codes='all')
aln_block = len(aln)

# Read aligned sequence(s):
aln.append(file='TvLDH.ali', align_codes='TvLDH')

# Structure sensitive variable gap penalty sequence-sequence alignment:
aln.salign(output='', max_gap_length=20,
           gap_function=True,   # to use structure-dependent gap penalty
           alignment_type='PAIRWISE', align_block=aln_block,
           feature_weights=(1., 0., 0., 0., 0., 0.), overhang=0,
           gap_penalties_1d=(-450, 0),
           gap_penalties_2d=(0.35, 1.2, 0.9, 1.2, 0.6, 8.6, 1.2, 0., 0.),
           similarity_flag=True)

aln.write(file='TvLDH-mult.ali', alignment_format='PIR')
aln.write(file='TvLDH-mult.pap', alignment_format='PAP')
This aligns the query sequence to the template MSA without disturbing it !!
Structure-dependent gap penalty !!
Get the model: use model_mult.py
Ah !! there is more
● For more tutorials, check out this link: https://salilab.org/modeller/tutorial/
● For documentation on Modeller: https://salilab.org/modeller/9.24/manual.pdf
● For starters, check out this paper: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5031415/
Happy modelling !!
Refining using MD
● It is preferable to run an MD-based energy minimization to reduce structural clashes
● GROMACS, NAMD/VMD and AMBER are MD simulation packages useful for this purpose
● Ideally a 5 ns run for removing clashes, and perhaps 20 ns or more for loop refinement
A web interface for Modeller !! ModWeb
https://modbase.compbio.ucsf.edu/modweb/
https://modbase.compbio.ucsf.edu/modweb/help.cgi?type=help
Automate using ModPipe !!
Structure validation - “Swiss structure assessment”
● Access the server using the link below
● Ramachandran validation, MolProbity, QMEAN !!
https://swissmodel.expasy.org/assess
https://swissmodel.expasy.org/assess/help
Structure validation - “SAVES”
A conglomerate of many validation programs
Did the model get the fold right ?
Verify3D: checks how compatible a model's 3D structure is with its own amino acid sequence, using a 3D profile. Better structures exhibit higher scores. Segments lacking in quality can also be identified from the per-residue plot
Lüthy et al., 1992
Structure validation - “SAVES”
How good is the structural stereochemistry ?
WHAT_CHECK: derived from WHAT IF; it bundles multiple programs specialised in checking the stereochemical quality of a model !!
Z-score and RMS-Z values
https://swift.cmbi.umcn.nl/gv/whatcheck/
PROCHECK: assesses how normal, or how unusual, the geometry of the residues in a given protein structure is, compared with stereochemical parameters derived from well-refined, high-resolution structures
http://www.csb.yale.edu/userguides/datamanip/procheck/manual/man1.html
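The Z-score and RMS-Z values reported by such checks follow a simple recipe: each observed value is expressed in standard deviations from a reference mean, and RMS-Z summarises the whole set. A minimal sketch, with made-up bond lengths and a made-up reference mean and standard deviation:

```python
import math


def z_scores(values, ref_mean, ref_sd):
    """Deviation of each value from the reference mean, in reference standard deviations."""
    return [(v - ref_mean) / ref_sd for v in values]


def rms_z(zs):
    """Root-mean-square of a set of Z-scores."""
    return math.sqrt(sum(z * z for z in zs) / len(zs))


# Made-up bond lengths (in angstrom) and made-up reference statistics
bond_lengths = [1.50, 1.53, 1.56]
zs = z_scores(bond_lengths, ref_mean=1.53, ref_sd=0.02)
print([round(z, 2) for z in zs], round(rms_z(zs), 2))
```

An RMS-Z close to 1 means the spread matches the reference set. Values well above 1 point to distorted geometry, and values well below 1 to geometry that is tighter than the reference set.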
Structure validation - “SAVES”
● How well does the model fare with respect to atomic interactions ?
ERRAT: non-bonded atomic interactions tend to follow characteristic patterns, and modelling errors randomize them. Comparing the query model against a statistical description of standard non-bonded interactions can therefore flag the erroneous regions
https://www.ncbi.nlm.nih.gov/pubmed/8401235?dopt=Abstract
Take-home message
● Homology modelling depends heavily on the quality of the sequence alignment and the template structure
● Modeller is scripted in Python and uses ‘satisfaction of spatial restraints’ to construct the model
● A minimal knowledge of Python scripting, especially of the Modeller libraries and operators, will help you use it much more effectively
● The best model is the result of many iterations of the “basic steps”, improving the quality at each step
● SAVES, Swiss structure assessment and PROSESS are quite good tools for model validation
● The model need not be perfect in every sense of quality. The key is to bring the quality of the model as close as possible to that of the template
● Strategies for quality improvement give a better result
Thank You and happy modelling !!
