Biophysics 101:
Genomics & Computational Biology
Section 8: Protein Structure
Faisal Reza
Nov. 11th, 2003
B101.pdb from PS5 shown at left with:
• animated ball and stick model, colored CPK
• H-bonds on, colored green
• van der Waals radii on, also colored CPK
Based on the backbone and H-bond configuration shown,
what secondary structure might this be?
Outline
• Course Projects
• Biology/Chemistry of Protein Structure
– Protein Assembly, Folding, Packing and Interaction
– Primary, Secondary, Tertiary and Quaternary
structures
– Class, Fold, Topology
• CS/Math/Physics of Protein Structure
– Experimental Determination and Analysis
– Computational Determination and Analysis
• Proteomics
• Mass Spectrometry
Course Projects
• Videotaping authorization form
• Submission Parameters (via email)
– when: December 2, 2003, 12 noon EST
(9 AM EST if presenting on December 2, 2003)
– where: bphys101@fas.harvard.edu
– what: (1) written project (.doc, ~1000-3000 words)
(2) presentation slides (.ppt, 1-2 MB)
• Presentation Parameters (in person)
– when: December {2, 9, 16}, 2003 {12-2PM, 5:30-7:30PM} EST
– where: HMS Cannon Seminar Room for 12-2PM;
Science Ctr. Lecture Hall A for 5:30-7:30PM
– what: (1) oral presentations (6 min/person + 2 min/person Q/A)
(2) grading rubric and further information:
http://www.courses.fas.harvard.edu/~bphys101/projects/index.html
Biology/Chemistry of Protein Structure
PROCESS → STRUCTURE
Assembly → Primary
Folding → Secondary
Packing → Tertiary
Interaction → Quaternary
Protein Assembly
• occurs at the ribosome
• involves dehydration
synthesis and
polymerization of amino
acids attached to tRNA:
NH₃⁺ - {A + B → A-B + H₂O}ₙ - COO⁻
• thermodynamically
unfavorable, with ΔE =
+10 kJ/mol, thus coupled
to reactions that act as
sources of free energy
• yields primary structure
Primary Structure
• linear
• ordered
• 1 dimensional
• sequence of amino
acid polymer
• by convention, written
from amino end to
carboxyl end
• a perfectly linear
amino acid polymer is
neither functional nor
energetically
favorable → folding!
primary structure of human insulin
CHAIN 1: GIVEQ CCTSI CSLYQ LENYC N
CHAIN 2: FVNQH LCGSH LVEAL YLVCG ERGFF YTPKT
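As a small illustration of primary structure as an ordered, one-dimensional sequence, the two insulin chains above can be held as plain strings written from the amino end to the carboxyl end. This is only a sketch: the helper name `summarize_chain` is an assumption made for illustration and is not part of any course code.

```python
# Primary structure as an ordered string, N-terminus first.
# Sequences are copied from the slide; summarize_chain is a hypothetical helper.
INSULIN_CHAIN_1 = "GIVEQ CCTSI CSLYQ LENYC N".replace(" ", "")
INSULIN_CHAIN_2 = "FVNQH LCGSH LVEAL YLVCG ERGFF YTPKT".replace(" ", "")


def summarize_chain(sequence):
    """Return residue, peptide-bond and cysteine counts for one chain."""
    return {
        "residues": len(sequence),
        # Each peptide bond joins two neighbouring residues and releases one water.
        "peptide_bonds": max(len(sequence) - 1, 0),
        # Cysteines are the residues that can form disulfide bonds.
        "cysteines": sequence.count("C"),
    }


chain_1 = summarize_chain(INSULIN_CHAIN_1)
chain_2 = summarize_chain(INSULIN_CHAIN_2)
print(chain_1)
print(chain_2)
```

The peptide-bond count is simply one less than the number of residues, which is also the number of water molecules released during assembly of that chain.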
Protein Folding
• tumbles towards
conformations that reduce
E (this process is
thermodynamically favorable)
• yields secondary structure
• occurs in the cytosol
• involves localized spatial
interaction among primary
structure elements, i.e. the
amino acids
• may or may not involve
chaperone proteins
Secondary Structure
• non-linear
• 3 dimensional
• localized to regions of an
amino acid chain
• formed and stabilized by
hydrogen bonding,
electrostatic and van der
Waals interactions
Ramachandran Plot
• Pauling built models based on the following
principles, codified by Ramachandran:
(1) bond lengths and angles – should be
similar to those found in individual
amino acids and small peptides
(2) peptide bond – should be planar
(3) overlaps – not permitted, pairs of nonbonded atoms
no closer than sum of their van der Waals radii
(4) stabilization – have sterics that permit
hydrogen bonding
• Two degrees of freedom:
(1) φ (phi) angle = rotation about N – Cα
(2) ψ (psi) angle = rotation about Cα – C
• A linear amino acid polymer with some folds
is better, but still neither functional nor
completely energetically favorable →
packing!
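The two backbone degrees of freedom can be computed from atomic coordinates as torsion (dihedral) angles. The following is a minimal sketch, assuming coordinates are given as (x, y, z) tuples; the function names `dihedral`, `phi` and `psi` are illustrative choices, and the atom ordering follows the usual convention (φ from C of the previous residue, N, Cα, C; ψ from N, Cα, C, N of the next residue).

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dihedral(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project the two outer bonds onto the plane perpendicular to the central bond.
    v = _sub(b0, tuple(_dot(b0, b1) * c for c in b1))
    w = _sub(b2, tuple(_dot(b2, b1) * c for c in b1))
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


def phi(c_prev, n, ca, c):
    """Rotation about the N-CA bond."""
    return dihedral(c_prev, n, ca, c)


def psi(n, ca, c, n_next):
    """Rotation about the CA-C bond."""
    return dihedral(n, ca, c, n_next)


# Toy geometry: four points forming a cis (0 degree) arrangement.
print(dihedral((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1)))
```

Plotting the resulting (φ, ψ) pair for every residue of a structure gives its Ramachandran plot.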
Protein Packing
• occurs in the cytosol (~60% bulk
water, ~40% water of hydration)
• involves interaction between
secondary structure elements
and solvent
• may be promoted by
chaperones, membrane proteins
• tumbles into molten globule
states
• overall entropy loss is small
enough so enthalpy determines
sign of ΔE, which decreases
(loss in entropy from packing
counteracted by gain from
desolvation and reorganization
of water, i.e. hydrophobic effect)
• yields tertiary structure
Tertiary Structure
• non-linear
• 3 dimensional
• global but restricted to the
amino acid polymer
• formed and stabilized by
hydrogen bonding, covalent
(e.g. disulfide) bonding,
hydrophobic packing toward
core and hydrophilic
exposure to solvent
• A globular amino acid
polymer folded and
compacted is somewhat
functional (catalytic) and
energetically favorable →
interaction!
Protein Interaction
• occurs in the cytosol, in close proximity to other
folded and packed proteins
• involves interaction among tertiary structure
elements of separate polymer chains
• may be promoted by chaperones, membrane
proteins, cytosolic and extracellular elements as
well as the proteins’ own propensities
• E decreases further due to further
desolvation and reduction of surface area
• globular proteins, e.g. hemoglobin,
largely involved in functional roles
such as transport and catalysis
• fibrous proteins, e.g. collagen,
largely involved in structural roles
• yields quaternary structure
Quaternary Structure
• non-linear
• 3 dimensional
• global, and across
distinct amino acid
polymers
• formed by hydrogen
bonding, covalent
bonding, hydrophobic
packing and hydrophilic
exposure
• favorable, functional
structures occur
frequently and have been
categorized
Class/Motif
• class = secondary structure
composition,
e.g. all α, all β, segregated α+β,
mixed α/β
• motif = small, specific
combinations of secondary
structure elements,
e.g. β-α-β loop
• both subset of
fold/architecture/domains
Fold/Architecture/Domains
• fold = architecture = the
overall shape and
orientation of the secondary
structures, ignoring
connectivity between the
structures,
e.g. α/β barrel, TIM barrel
• domain = the
functional property
of such a fold or
architecture,
e.g. binding, cleaving,
spanning sites
• subset of topology/fold
families/superfamilies
Topology/Fold families/Superfamilies
• topology = the overall shape
and connectivity of the folds
and domains
• fold families = categorization
that takes into account
topology and previous subsets
as well as empirical/biological
properties, e.g. flavodoxin
• superfamilies = in addition to
fold families, includes
evolutionary/ancestral
properties
CLASS: α+β
FOLD: sandwich
FOLD FAMILY: flavodoxin
CS/Math/Physics of Protein Structure
• Experimental Determination and Analysis
• Computational Determination and Analysis
Experimental Determination and Analysis
• Repositories
– Protein Data Bank
– Molecular Modeling DataBase
• Resolution
– X-Ray Crystallography
– NMR Spectroscopy
– Mass Spectrometry (next week)
– Fluorescence Resonance Energy Transfer
Protein Data Bank
• Coordinates database
RCSB Protein Data Bank (PDB)
– has many structures, partly
due to minor differences in
structure resolution and
annotation
– has far fewer fold
families, partly due to
evolved pathways and
mechanisms
– .pdb = data from experiment,
with missing parameters
and multiple conformations
[Figures: cumulative increase in the number of domains; cumulative increase in the number of folds and superfamilies]
Molecular Modeling DataBase
• Comparative database
NCBI Molecular Modeling DataBase (MMDB)
– subset of PDB, excludes theoretical structures, with
native .asn format
– .asn = single-coordinate per-atom molecules, explicit
bonding and SS remarks
– suited for computation, such as homology modeling
and structure comparison
X-Ray Crystallography
• crystallize and
immobilize single,
perfect protein
• bombard with X-rays,
record scattering
diffraction patterns
• determine electron
density map from
scattering and phase
via Fourier transform
• use electron density
and biochemical
knowledge of the
protein to refine and
determine a model
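The Fourier step can be sketched in one dimension. This is a toy under stated assumptions, not a crystallographic workflow: atoms are treated as points at fractional coordinates, the names `structure_factors` and `electron_density` are invented for the example, and the phases are computed directly from the known atom positions. In a real experiment only the amplitudes are measured, and the phases have to be recovered separately.

```python
import cmath
import math


def structure_factors(atoms, h_max):
    """F(h) = sum_j f_j * exp(2*pi*i*h*x_j) for h in [-h_max, h_max]."""
    return {
        h: sum(f * cmath.exp(2j * math.pi * h * x) for x, f in atoms)
        for h in range(-h_max, h_max + 1)
    }


def electron_density(factors, x, cell_length=1.0):
    """rho(x) = (1/L) * sum_h F(h) * exp(-2*pi*i*h*x)."""
    total = sum(F * cmath.exp(-2j * math.pi * h * x) for h, F in factors.items())
    return total.real / cell_length


# Hypothetical 1D "crystal": (fractional coordinate, scattering weight).
atoms = [(0.20, 6.0), (0.65, 8.0)]
factors = structure_factors(atoms, h_max=15)

# Synthesize the density on a grid and locate its highest peak.
grid = [i / 200 for i in range(200)]
density = [electron_density(factors, x) for x in grid]
peak_x = grid[max(range(len(grid)), key=density.__getitem__)]
print(peak_x)
```

The highest density peak falls at the position of the heavier toy atom. Truncating the sum at a smaller `h_max` broadens the peaks, which is a rough analogue of lower resolution.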
"All crystallographic models are not equal. ... The brightly colored stereo views
of a protein model, which are in fact more akin to cartoons than to
molecules, endow the model with a concreteness that exceeds the
intentions of the thoughtful crystallographer. It is impossible for the
crystallographer, with vivid recall of the massive labor that produced the
model, to forget its shortcomings. It is all too easy for users of the model to
be unaware of them. It is also all too easy for the user to be unaware that,
through temperature factors, occupancies, undetected parts of the protein,
and unexplained density, crystallography reveals more than a single
molecular model shows."
- Rhodes, “Crystallography Made Crystal Clear,” p. 183.
NMR Spectroscopy
• protein in aqueous
solution, motile and
tumbles/vibrates with
thermal motion
• NMR detects chemical
shifts of atomic nuclei
with non-zero spin, shifts
due to electronic
environment nearby
• determine distances
between specific pairs of
atoms based on shifts,
“constraints”
• use constraints and
biochemical knowledge of
the protein to determine
an ensemble of models
[Figures: determining constraints; using constraints to determine secondary structure]
Fluorescence Resonance Energy Transfer
• FRET described as a “molecular ruler”
• segments of a protein are tagged with fluorophores
• energy transfer occurs when donor and acceptor
interact, falls off as 1/d⁶ where d is separation between
donor and acceptor
• donor and acceptor must be within 50 Å,
acceptor emission sensitive to distance change
• can determine pairs of side chains that are separated
when unfolded and close when folded
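The 1/d⁶ distance dependence is usually written as the Förster relation, E = 1 / (1 + (d/R₀)⁶), where R₀ is the separation at which transfer is 50% efficient. A minimal sketch follows. The default R₀ of 50 Å is an assumption borrowed from the distance scale quoted above; the actual value depends on the donor and acceptor pair.

```python
def fret_efficiency(distance, forster_radius=50.0):
    """Transfer efficiency for a donor-acceptor separation (same units as R0).

    forster_radius defaults to an illustrative 50 angstroms (an assumption).
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# Efficiency changes steeply around R0, which is what makes FRET a "ruler".
for d in (25.0, 50.0, 75.0, 100.0):
    print(d, round(fret_efficiency(d), 3))
```

Because of the sixth-power dependence, efficiency drops from nearly 1 to nearly 0 over roughly a factor of four in distance, so the method is most sensitive near R₀.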
Computational Determination and Analysis
• Databases
– CATH (Class, Architecture, Topology, Homologous
superfamily)
– SCOP (Structural Classification Of Proteins)
– FSSP (Fold classification based on Structure-Structure
alignment of Proteins)
• Prediction
– Ab-initio, theoretical modeling, and conformation space
search
– Homology modeling and threading
– Energy minimization, simulation and Monte Carlo
• Proteomics (next week)
CATH
• a combination of manual and
automated hierarchical classification
• four major levels:
– Class (C) – based on secondary
structure content
– Architecture (A) – based on gross
orientation of secondary
structures
– Topology (T) – based on
connections and numbers of
secondary structures
– Homologous superfamily (H) –
based on structure/function
evolutionary commonalities
• provides useful geometric
information (e.g. architecture)
• partial automation may result in
examples near fixed thresholds
being assigned inaccurately
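The four-level hierarchy maps naturally onto a dotted four-number code, one number per level. Below is a small sketch of such a record. The class `CathCode`, the function `deepest_shared_level`, and the example codes are illustrative assumptions rather than part of any CATH software, and the codes are used only as sample input.

```python
from dataclasses import dataclass

# Class-level names; the numbering is assumed to follow CATH's convention.
CATH_CLASSES = {
    1: "Mainly Alpha",
    2: "Mainly Beta",
    3: "Alpha Beta",
    4: "Few Secondary Structures",
}

LEVELS = ("Class", "Architecture", "Topology", "Homologous superfamily")


@dataclass(frozen=True)
class CathCode:
    c: int
    a: int
    t: int
    h: int

    @classmethod
    def parse(cls, text):
        parts = text.split(".")
        if len(parts) != 4:
            raise ValueError("expected four dot-separated numbers: C.A.T.H")
        return cls(*(int(p) for p in parts))

    def as_tuple(self):
        return (self.c, self.a, self.t, self.h)


def deepest_shared_level(first, second):
    """Name of the deepest level at which two codes still agree, or None."""
    shared = None
    for name, x, y in zip(LEVELS, first.as_tuple(), second.as_tuple()):
        if x != y:
            break
        shared = name
    return shared


# Illustrative codes only.
domain_1 = CathCode.parse("3.40.50.360")
domain_2 = CathCode.parse("3.40.50.720")
print(CATH_CLASSES[domain_1.c], deepest_shared_level(domain_1, domain_2))
```

Comparing codes level by level in this way reflects the hierarchy: two domains can share a class and architecture while differing in topology.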
SCOP
• a purely manual hierarchical
classification
• three major levels:
– Family – based on clear
evolutionary relationship
(pairwise residue identities
between proteins are >30%)
– Superfamily – based on
probable evolutionary origin
(low sequence identity but
common structure/function
features)
– Fold – based on major
structural similarity (major
secondary structures in
same arrangement and
topology)
• provides detailed evolutionary
information
• manual process limits
update frequency and makes equally
exhaustive examination of every entry difficult
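The >30% pairwise residue identity criterion can be made concrete with a short calculation over two already-aligned sequences. This is a sketch under assumptions: the function name `percent_identity` is invented here, the input must be pre-aligned with "-" for gaps, and identity is taken over gap-free aligned positions, which is only one of several definitions in use. SCOP itself is curated manually, so this shows the sequence criterion alone.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned positions where the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip positions where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical aligned fragments, for illustration only.
identity = percent_identity("GIVEQCCTSI", "GIVDQCCASV")
print(identity, identity > 30.0)
```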
FSSP
• a purely automated hierarchical classification
• three major levels:
– representative set – 330
protein chains (less than 30%
sequence identity)
– clustering – based on
structural alignment into fold
families
– convergence – cutting at a
high statistical significance
level increases the number of
distinct families, gradually
approaching one family per
protein chain
• continually updated, presents
data and lets user assess
• Without sufficient knowledge,
user may not assess data
appropriately
[Figures: list of representative set; clustering dendrogram]
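The convergence behaviour described above, where a stricter cutoff yields more families until each chain stands alone, can be reproduced on toy data. The sketch below assumes single-linkage grouping with a union-find structure, which is a stand-in chosen for brevity and not necessarily the procedure FSSP uses. The chain names and Z-scores are hypothetical.

```python
def count_families(chains, pair_scores, z_cutoff):
    """Number of groups when pairs scoring >= z_cutoff are merged."""
    parent = {chain: chain for chain in chains}

    def find(chain):
        # Follow parent links to the group representative, compressing the path.
        while parent[chain] != chain:
            parent[chain] = parent[parent[chain]]
            chain = parent[chain]
        return chain

    for (a, b), z in pair_scores.items():
        if z >= z_cutoff:
            parent[find(a)] = find(b)
    return len({find(chain) for chain in chains})


# Hypothetical representative chains and pairwise structural Z-scores.
CHAINS = ["A", "B", "C", "D", "E"]
Z_SCORES = {("A", "B"): 12.0, ("B", "C"): 7.0, ("C", "D"): 4.5, ("D", "E"): 2.5}

family_counts = [count_families(CHAINS, Z_SCORES, z) for z in (2, 4, 6, 8, 13)]
print(family_counts)
```

Raising the cutoff removes the weakest links first, so the count grows step by step from one family to one family per chain.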
CATH vs. SCOP vs. FSSP
• approximately two-thirds of the protein chains in each
database are common to all three databases
FSSP pairwise matches (Z-score ≥ 4.0) compared to CATH and
SCOP matches at the fold level (a), homology level (b)
FSSP pairwise matches (Z-score ≥ 6.0) compared to CATH and
SCOP matches at the fold level (c), homology level (d)
FSSP pairwise matches (Z-score ≥ 8.0) compared to CATH and
SCOP matches at the fold level (e), homology level (f)
Ab-initio, theoretical modeling,
and conformation space search
• Ab-initio = given amino acid primary structure, i.e. sequence,
derive structure from first principles (e.g. treat amino acids as
beads and derive possible structures by rotating through all
possible φ, ψ angles using a “reliable” energy function, then
optimize globally)
• Theoretical modeling = subset of ab-initio, given amino acid
primary structure and knowledge about characteristic features,
derive a structure consistent with that sequence and those features
(e.g. protein has an iron binding site →
possible heme substructure)
• Conformation space search = subset of ab-initio, but a
stochastic search in which the sample space is reduced by
initial conditions/assumptions (e.g. reduce sample space to
conform to Ramachandran plot)
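A quick count shows why reducing the sample space matters. The sketch below assumes each residue's φ and ψ angles are sampled on a regular grid. The function name `grid_conformations` and the `allowed_fraction` value of 0.25 are hypothetical, standing in for whatever share of the grid a Ramachandran filter would keep.

```python
def grid_conformations(n_residues, step_degrees, allowed_fraction=1.0):
    """Number of backbone conformations on a regular (phi, psi) grid.

    allowed_fraction is a hypothetical share of grid points kept per residue,
    e.g. after discarding sterically disallowed (phi, psi) pairs.
    """
    if 360 % step_degrees != 0:
        raise ValueError("step_degrees must divide 360 evenly")
    per_angle = 360 // step_degrees
    per_residue = round(per_angle * per_angle * allowed_fraction)
    return per_residue ** n_residues


full = grid_conformations(10, 10)
restricted = grid_conformations(10, 10, allowed_fraction=0.25)
print(full)
print(restricted)
print(full // restricted)
```

Even a modest per-residue restriction compounds exponentially with chain length, which is why the search is constrained by initial assumptions rather than enumerated exhaustively.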
Homology modeling and threading
• Homology modeling = knowledge-based approach, given a
sequence database, use multiple sequence alignment on this
database to identify structurally conserved regions and
construct structure backbone and loops based on these
regions, restore side-chains and refine through energy
minimization (apply to proteins that have high sequence
similarity to those in the database)
• Threading = knowledge-based approach, given a structure
database of interest (e.g. one that provides a limited set of
possible structures per given sequence for fold recognition,
one that provides one structure per given limited set of
possible sequences for inverse folding) use scoring
functions and correlations from this database to derive
structure that is in agreement (apply to proteins with
moderate sequence similarity to those in the database)
Energy minimization, simulation
and Monte Carlo
• Energy minimization = select an appropriate energy function
and derive conformations that yield minimal energies based
on this function
• Simulation = select appropriate molecular conditions and
derive conformations that are suited to these molecular
conditions
• Monte Carlo = subset of molecular simulation, but it is an
iterated search through a Markov chain of conformations
(many iterations → canonical distribution, P(particular
conformation) ~ exp(-E/kT)) proposed by N. Metropolis, in which
a new conformation is generated from the current one by a
small “move” and is accepted with a probability Pacc = min(1,
exp(-ΔE/kT)), which depends on the corresponding change in
energy, ΔE, and on an external adjustable parameter, kT
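The acceptance rule can be sketched on a toy problem. The example below is a minimal illustration and not a protein simulation: the "conformation" is a single coordinate x, the energy function `toy_energy` is an assumed quadratic well, and the names `acceptance_probability` and `metropolis` are chosen for this sketch.

```python
import math
import random


def acceptance_probability(delta_e, kt):
    """Metropolis rule: min(1, exp(-delta_e / kt))."""
    return 1.0 if delta_e <= 0 else math.exp(-delta_e / kt)


def metropolis(energy, x0, kt, n_steps, step_size=1.0, seed=0):
    """Sample a Markov chain of states; returns the visited states."""
    rng = random.Random(seed)
    x, e = x0, energy(x0)
    samples = []
    for _ in range(n_steps):
        # Propose a small "move" from the current state.
        candidate = x + rng.uniform(-step_size, step_size)
        e_new = energy(candidate)
        if rng.random() < acceptance_probability(e_new - e, kt):
            x, e = candidate, e_new
        # A rejected move repeats the current state in the chain.
        samples.append(x)
    return samples


def toy_energy(x):
    """Hypothetical one-dimensional energy well."""
    return x * x


samples = metropolis(toy_energy, x0=0.0, kt=1.0, n_steps=50000)
mean_energy = sum(toy_energy(x) for x in samples) / len(samples)
print(mean_energy)
```

For this quadratic well the canonical distribution gives an average energy of kT/2, so the printed value should come out close to 0.5 when kT = 1. Lowering kT makes uphill moves rarer and concentrates the chain near the minimum.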
Next Week
• Proteomics
• Mass Spectrometry
References
C. Branden, J. Tooze. “Introduction to Protein Structure.” Garland Science Publishing, 1999.
C. Chothia, T. Hubbard, S. Brenner, H. Barns, A. Murzin. “Protein Folds in the All-β and All-α Classes.”
Annu. Rev. Biophys. Biomol. Struct., 1997, 26:597-627.
G.M. Church. “Proteins 1: Structure and Interactions.” Biophysics 101: Computational Biology and
Genomics, October 28, 2003.
C. Hadley, D.T. Jones. “A systematic comparison of protein structure classifications: SCOP, CATH and
FSSP.” Structure, August 27, 1999, 7:1099-1112.
S. Komili. “Section 8: Protein Structure.” Biophysics 101: Computational Biology and Genomics,
November 12, 2002.
D.L. Nelson, A.L. Lehninger, M.M. Cox. “Principles of Biochemistry, Third Edition.” Worth Publishing,
May 2002.
.pdb animation created with PDB to MultiGif, http://www.dkfz-heidelberg.de/spec/pdb2mgif/expert.html

protein structure from genomic and computational biology

  • 1. Biophysics 101: Genomics & Computational Biology Section 8: Protein Structure Faisal Reza Nov. 11th, 2003 B101.pdb from PS5 shown at left with: • animated ball and stick model, colored CPK • H-bonds on, colored green • van der Waals radii on, also colored CPK Based on the backbone and H-bond configuration shown, what secondary structure might this be?
  • 2. Outline • Course Projects • Biology/Chemistry of Protein Structure – Protein Assembly, Folding, Packing and Interaction – Primary, Secondary, Tertiary and Quaternary structures – Class, Fold, Topology • CS/Math/Physics of Protein Structure – Experimental Determination and Analysis – Computational Determination and Analysis • Proteomics • Mass Spectrometry
  • 3. • Videotaping authorization form • Submission Parameters (via email) – when: December 2, 2003 12noon EST. (9AM EST if presenting on December 2, 2003) – where: bphys101@fas.harvard.edu – what: (1) written project (.doc, ~1000-3000 words) (2) presentation slides (.ppt, 1-2 MB) • Presentation Parameters (in person) – when: December {2, 9, 16}, 2003 {12-2PM, 5:30-7:30PM} EST. – where: HMS Cannon Seminar Room for 12-2PM Science Ctr. Lecture Hall A for 5:30-7:30PM – what: (1) oral presentations (6 min/person + 2 min/person Q/A) (2) grading rubric and further information: http://www.courses.fas.harvard.edu/~bphys101/projects/index.html Course Projects
  • 4. Biology/Chemistry of Protein Structure Primary Secondary Tertiary Quaternary Assembly Folding Packing Interaction S T R U C T U R E P R O C E S S
  • 5. Protein Assembly • occurs at the ribosome • involves dehydration synthesis and polymerization of amino acids attached to tRNA: NH - {A + B  A-B + H O} -COO • thermodynamically unfavorable, with E = +10kJ/mol, thus coupled to reactions that act as sources of free energy • yields primary structure 2 n 3 + -
  • 6. Primary Structure • linear • ordered • 1 dimensional • sequence of amino acid polymer • by convention, written from amino end to carboxyl end • a perfectly linear amino acid polymer is neither functional nor energetically favorable  folding! primary structure of human insulin CHAIN 1: GIVEQ CCTSI CSLYQ LENYC N CHAIN 2: FVNQH LCGSH LVEAL YLVCG ERGFF YTPKT
  • 7. Protein Folding • tumbles towards conformations that reduce E (this process is thermo- dynamically favorable) • yields secondary structure • occurs in the cytosol • involves localized spatial interaction among primary structure elements, i.e. the amino acids • may or may not involve chaperone proteins
  • 8. Secondary Structure • non-linear • 3 dimensional • localized to regions of an amino acid chain • formed and stabilized by hydrogen bonding, electrostatic and van der Waals interactions
  • 9. Ramachandran Plot • Pauling built models based on the following principles, codified by Ramachandran: (1) bond lengths and angles – should be similar to those found in individual amino acids and small peptides (2) peptide bond – should be planer (3) overlaps – not permitted, pairs of atoms no closer than sum of their covalent radii (4) stabilization – have sterics that permit hydrogen bonding • Two degrees of freedom: (1)  (phi) angle = rotation about N – C (2)  (psi) angle = rotation about C – C • A linear amino acid polymer with some folds is better but still not functional nor completely energetically favorable  packing!
  • 10. Protein Packing • occurs in the cytosol (~60% bulk water, ~40% water of hydration) • involves interaction between secondary structure elements and solvent • may be promoted by chaperones, membrane proteins • tumbles into molten globule states • overall entropy loss is small enough that enthalpy determines the sign of ΔE, which decreases (loss in entropy from packing is counteracted by the gain from desolvation and reorganization of water, i.e. the hydrophobic effect) • yields tertiary structure
  • 11. Tertiary Structure • non-linear • 3 dimensional • global but restricted to the amino acid polymer • formed and stabilized by hydrogen bonding, covalent (e.g. disulfide) bonding, hydrophobic packing toward core and hydrophilic exposure to solvent • A globular amino acid polymer folded and compacted is somewhat functional (catalytic) and energetically favorable → interaction!
  • 12. Protein Interaction • occurs in the cytosol, in close proximity to other folded and packed proteins • involves interaction among tertiary structure elements of separate polymer chains • may be promoted by chaperones, membrane proteins, cytosolic and extracellular elements as well as the proteins' own propensities • ΔE decreases further due to further desolvation and reduction of surface area • globular proteins, e.g. hemoglobin, largely involved in catalytic roles • fibrous proteins, e.g. collagen, largely involved in structural roles • yields quaternary structure
  • 13. Quaternary Structure • non-linear • 3 dimensional • global, and across distinct amino acid polymers • formed by hydrogen bonding, covalent bonding, hydrophobic packing and hydrophilic exposure • favorable, functional structures occur frequently and have been categorized
  • 14. Class/Motif • class = secondary structure composition, e.g. all α, all β, segregated α+β, mixed α/β • motif = small, specific combinations of secondary structure elements, e.g. β-α-β loop • both are subsets of fold/architecture/domains
  • 15. Fold/Architecture/Domains • fold = architecture = the overall shape and orientation of the secondary structures, ignoring connectivity between the structures, e.g. α/β barrel, TIM barrel • domain = the functional property of such a fold or architecture, e.g. binding, cleaving, spanning sites • subset of topology/fold families/superfamilies
  • 16. Topology/Fold families/Superfamilies • topology = the overall shape and connectivity of the folds and domains • fold families = categorization that takes into account topology and previous subsets as well as empirical/biological properties, e.g. flavodoxin • superfamilies = in addition to fold families, includes evolutionary/ancestral properties • Example: CLASS: α+β; FOLD: sandwich; FOLD FAMILY: flavodoxin
  • 17. CS/Math/Physics of Protein Structure • Experimental Determination and Analysis • Computational Determination and Analysis
  • 18. Experimental Determination and Analysis • Repositories – Protein Data Bank – Molecular Modeling DataBase • Resolution – X-Ray Crystallography – NMR Spectroscopy – Mass Spectrometry (next week) – Fluorescence Resonance Energy Transfer
  • 19. Protein Data Bank • Coordinates database RCSB Protein Data Bank (PDB) – has many structures, partly due to minor differences in structure resolution and annotation – has far fewer fold families, partly due to evolved pathways and mechanisms – .pdb = data from experiment, with missing parameters and multiple conformations [Figures: cumulative increase in the number of domains; cumulative increase in the number of folds and superfamilies]
  • 20. Molecular Modeling DataBase • Comparative database NCBI Molecular Modeling DataBase (MMDB) – subset of PDB, excludes theoretical structures, with native .asn format – .asn = single-coordinate per-atom molecules, explicit bonding and SS remarks – suited for computation, such as homology modeling and structure comparison
  • 21. X-Ray Crystallography • crystallize and immobilize single, perfect protein • bombard with X-rays, record scattering diffraction patterns • determine electron density map from scattering and phase via Fourier transform • use electron density and biochemical knowledge of the protein to refine and determine a model "All crystallographic models are not equal. ... The brightly colored stereo views of a protein model, which are in fact more akin to cartoons than to molecules, endow the model with a concreteness that exceeds the intentions of the thoughtful crystallographer. It is impossible for the crystallographer, with vivid recall of the massive labor that produced the model, to forget its shortcomings. It is all too easy for users of the model to be unaware of them. It is also all too easy for the user to be unaware that, through temperature factors, occupancies, undetected parts of the protein, and unexplained density, crystallography reveals more than a single molecular model shows." - Rhodes, “Crystallography Made Crystal Clear” p. 183.
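The Fourier-transform equation shown on the X-ray crystallography slide did not survive in this transcript. For orientation only, the standard textbook form of the electron density synthesis is given below; its notation may differ from what the slide displayed. Here V is the unit-cell volume, |F(hkl)| is the structure-factor amplitude obtained from the measured intensity of reflection (h, k, l), α(hkl) is its phase (which is not measured directly), and (x, y, z) are fractional coordinates in the unit cell.

```latex
\rho(x, y, z) = \frac{1}{V} \sum_{h} \sum_{k} \sum_{l}
  \left| F(hkl) \right| \, e^{i \alpha(hkl)} \, e^{-2 \pi i (hx + ky + lz)}
```

The amplitudes come from the "scattering" mentioned in the bullet and the α(hkl) terms are the "phase"; both are needed before the sum can be evaluated.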
  • 22. NMR Spectroscopy • protein in aqueous solution, motile and tumbles/vibrates with thermal motion • NMR detects chemical shifts of atomic nuclei with non-zero spin, shifts due to electronic environment nearby • determine distances between specific pairs of atoms based on shifts, “constraints” • use constraints and biochemical knowledge of the protein to determine an ensemble of models [Figures: determining constraints; using constraints to determine secondary structure]
  • 23. Fluorescence Resonance Energy Transfer • FRET described as a “molecular ruler” • segments of a protein are tagged with fluorophores • energy transfer occurs when donor and acceptor interact, falls off as 1/d^6 where d is separation between donor and acceptor • donor and acceptor must be within 50 Å, acceptor emission sensitive to distance change • can determine pairs of side chains that are separated when unfolded and close when folded
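The 1/d^6 fall-off on the FRET slide is what makes it a "molecular ruler". A common way to express it is the transfer efficiency E = 1 / (1 + (d/R0)^6), where R0 (the Förster radius) is the separation at which half of the donor's excitation energy is transferred. The sketch below assumes R0 = 50 Å purely for illustration, echoing the 50 Å figure on the slide; real R0 values depend on the fluorophore pair. The function name is hypothetical.

```python
def fret_efficiency(distance, forster_radius=50.0):
    """FRET efficiency for a donor-acceptor separation (same units as R0).

    forster_radius is an ASSUMED illustrative value of 50 angstroms,
    not a measured property of any particular fluorophore pair.
    """
    if distance < 0 or forster_radius <= 0:
        raise ValueError("distance must be >= 0 and forster_radius > 0")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# Efficiency changes steeply around R0 and is nearly flat far from it.
for d in (25.0, 50.0, 100.0):
    print(f"d = {d:5.1f} A  ->  E = {fret_efficiency(d):.3f}")
```

The steep change near R0 is why acceptor emission is sensitive to small distance changes in that range, as the slide notes.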
  • 24. Computational Determination and Analysis • Databases – CATH (Class, Architecture, Topology, Homologous superfamily) – SCOP (Structural Classification Of Proteins) – FSSP (Fold classification based on Structure-Structure alignment of Proteins) • Prediction – Ab-initio, theoretical modeling, and conformation space search – Homology modeling and threading – Energy minimization, simulation and Monte Carlo • Proteomics (next week)
  • 25. CATH • a combination of manual and automated hierarchical classification • four major levels: – Class (C) – based on secondary structure content – Architecture (A) – based on gross orientation of secondary structures – Topology (T) – based on connections and numbers of secondary structures – Homologous superfamily (H) – based on structure/function evolutionary commonalities • provides useful geometric information (e.g. architecture) • partial automation may result in examples near fixed thresholds being assigned inaccurately
  • 26. SCOP • a purely manual hierarchical classification • three major levels: – Family – based on clear evolutionary relationship (pairwise residue identities between proteins are >30%) – Superfamily – based on probable evolutionary origin (low sequence identity but common structure/function features) – Fold – based on major structural similarity (major secondary structures in same arrangement and topology) • provides detailed evolutionary information • manual curation limits update frequency and makes it hard to examine every entry equally exhaustively
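The SCOP family criterion above is stated in terms of pairwise residue identity (>30%). As a rough sketch, assuming the two sequences have already been aligned and padded with "-" for gaps, percent identity can be computed as below. Definitions of identity vary (for example, in which columns count toward the denominator); this version ignores any column that contains a gap, which is a choice made for this example and not necessarily the one SCOP uses.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of gap-free aligned columns in which the residues match."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Toy aligned fragments (not real family members).
identity = percent_identity("GIVEQCC-TSI", "GIVDQCCATSV")
print(f"{identity:.1f}% identical; above 30% threshold: {identity > 30.0}")
```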
  • 27. FSSP • a purely automated hierarchical classification • three major levels: – representative set – 330 protein chains (less than 30% sequence identity) – clustering – based on structural alignment into fold families – convergence – cutting at a high statistical significance level increases the number of distinct families, gradually approaching one family per protein chain • continually updated, presents data and lets user assess • without sufficient knowledge, the user may not assess the data appropriately [Figures: list of representative set; clustering dendrogram]
  • 28. CATH vs. SCOP vs. FSSP • approximately two-thirds of the protein chains in each database are common to all three databases [Figure panels: FSSP pairwise matches (Z-score ≥ 4.0) compared to CATH and SCOP matches at the fold level (a), homology level (b); FSSP pairwise matches (Z-score ≥ 6.0) compared to CATH and SCOP matches at the fold level (c), homology level (d); FSSP pairwise matches (Z-score ≥ 8.0) compared to CATH and SCOP matches at the fold level (e), homology level (f)]
  • 29. Ab-initio, theoretical modeling, and conformation space search • Ab-initio = given amino acid primary structure, i.e. sequence, derive structure from first principles (e.g. treat amino acids as beads and derive possible structures by rotating through all possible φ, ψ angles using a “reliable” energy function, then optimize globally) • Theoretical modeling = subset of ab-initio, given amino acid primary structure and knowledge about characteristic features, derive structure that has that structure and features (e.g. protein has an iron binding site → possible heme substructure) • Conformation space search = subset of ab-initio, but a stochastic search in which the sample space is reduced by initial conditions/assumptions (e.g. reduce sample space to conform to Ramachandran plot)
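To see why "rotating through all possible φ, ψ angles" needs the sample space reduced, it helps to count grid points. The sketch below discretizes each angle on a regular grid and counts backbone conformations for a chain, optionally keeping only (φ, ψ) pairs that pass a filter. The filter `crude_allowed` is a HYPOTHETICAL stand-in that simply keeps negative φ; it is not a real Ramachandran map, and all names here are assumptions for illustration.

```python
from itertools import product


def angle_grid(step_deg):
    """Angles from -180 (inclusive) to 180 (exclusive) in step_deg steps."""
    return list(range(-180, 180, step_deg))


def count_conformations(n_residues, step_deg, allowed=None):
    """Count backbone conformations on a (phi, psi) grid.

    Each residue independently takes any grid pair accepted by `allowed`
    (all pairs if `allowed` is None), so the total is per_residue ** n.
    """
    grid = angle_grid(step_deg)
    per_residue = sum(
        1 for phi, psi in product(grid, grid)
        if allowed is None or allowed(phi, psi)
    )
    return per_residue ** n_residues


def crude_allowed(phi, psi):
    """HYPOTHETICAL filter: keep only negative phi (not a real Ramachandran map)."""
    return phi < 0


full = count_conformations(10, 10)
reduced = count_conformations(10, 10, crude_allowed)
print(f"10 residues, 10-degree grid: {full:.3e} unrestricted, {reduced:.3e} filtered")
```

Even this crude halving of φ shrinks a 10-residue search by a factor of 2^10, which is the kind of reduction the conformation space search bullet describes.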
  • 30. Homology modeling and threading • Homology modeling = knowledge-based approach: given a sequence database, use multiple sequence alignment on this database to identify structurally conserved regions, construct the structure backbone and loops based on these regions, then restore side-chains and refine through energy minimization (apply to proteins that have high sequence similarity to those in the database) • Threading = knowledge-based approach: given a structure database of interest (e.g. one that provides a limited set of possible structures per given sequence for fold recognition, or one that provides one structure per given limited set of possible sequences for inverse folding), use scoring functions and correlations from this database to derive a structure that is in agreement (apply to proteins with moderate sequence similarity to those in the database)
  • 31. Energy minimization, simulation and Monte Carlo • Energy minimization = select an appropriate energy function and derive conformations that yield minimal energies based on this function • Simulation = select appropriate molecular conditions and derive conformations that are suited to these molecular conditions • Monte Carlo = subset of molecular simulation, but it is an iterated search through a Markov chain of conformations (many iterations → canonical distribution, P(particular conformation) ~ exp(-E/T)) proposed by N. Metropolis, in which a new conformation is generated from the current one by a small “move” and is accepted with a probability Pacc = min(1, exp(-ΔE/kT)), which depends on the corresponding change in energy, ΔE, and on an external adjustable parameter, kT
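The Metropolis acceptance rule on the Monte Carlo slide, Pacc = min(1, exp(-ΔE/kT)), can be sketched in a few lines. The example below is a toy under stated assumptions, not a protein simulation: the "conformation" is a single number x, the energy function is a hypothetical quadratic well, and a "move" is a small uniform random displacement. All function and parameter names are invented for illustration.

```python
import math
import random


def acceptance_probability(delta_e, kT):
    """Metropolis rule: always accept downhill moves, else exp(-dE/kT)."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / kT)


def metropolis(energy, x0, kT, n_steps, step_size=0.5, seed=0):
    """Run a Markov chain of states x under the Metropolis rule.

    Returns the list of visited states (a rejected move repeats the
    current state, as required for correct sampling).
    """
    rng = random.Random(seed)
    x = x0
    e = energy(x)
    samples = []
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)  # small "move"
        e_new = energy(x_new)
        if rng.random() < acceptance_probability(e_new - e, kT):
            x, e = x_new, e_new
        samples.append(x)
    return samples


def toy_energy(x):
    """HYPOTHETICAL one-dimensional energy well with its minimum at x = 0."""
    return x * x


chain = metropolis(toy_energy, x0=3.0, kT=0.5, n_steps=5000)
print("final state:", round(chain[-1], 3))
```

Lowering kT makes uphill moves rarer, so the chain behaves more like energy minimization; raising it lets the chain escape local minima at the cost of sampling higher-energy states.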
  • 32. Next Week • Proteomics • Mass Spectrometry
  • 33. References • C. Branden, J. Tooze. “Introduction to Protein Structure.” Garland Science Publishing, 1999. • C. Chothia, T. Hubbard, S. Brenner, H. Barns, A. Murzin. “Protein Folds in the All-β and All-α Classes.” Annu. Rev. Biophys. Biomol. Struct., 1997, 26:597-627. • G.M. Church. “Proteins 1: Structure and Interactions.” Biophysics 101: Computational Biology and Genomics, October 28, 2003. • C. Hadley, D.T. Jones. “A systematic comparison of protein structure classifications: SCOP, CATH and FSSP.” Structure, August 27, 1999, 7:1099-1112. • S. Komili. “Section 8: Protein Structure.” Biophysics 101: Computational Biology and Genomics, November 12, 2002. • D.L. Nelson, A.L. Lehninger, M.M. Cox. “Principles of Biochemistry, Third Edition.” Worth Publishing, May 2002. • .pdb animation created with PDB to MultiGif, http://www.dkfz-heidelberg.de/spec/pdb2mgif/expert.html