Jia-Ming Chang 0509
Graph Algorithms and Their Applications to Bioinformatics
1/40
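Determine Protein Structure
• X-ray
  ○ Wavelength of about 1 Å, close to the distance between atoms
  ○ Used to study the behavior of molecules in the crystalline state
  ○ Determines crystal structures, including protein structures
• X-ray and structural biology
  ○ X-ray diffraction is used to locate every group and atom in space in a highly purified, crystallized protein.
• Nuclear magnetic resonance (NMR)
  ○ NMR is based on absorption by atomic nuclei. Some nuclei have spin and a magnetic moment, so in a strong magnetic field they absorb electromagnetic radiation; this results from the energy-level splitting induced by the field. The molecular environment also affects how nuclei in a magnetic field absorb radio waves, and this property is used to analyze molecular structure.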
AVANCE 800 AV IBMS, Sinica 2/40
NMR – Nuclear Spin (1/5)
3/40
NMR – Nuclear Spin (2/5)
4/40
NMR - Magnetic Field (3/5)
5/40
NMR – Resonance (4/5)
6/40
NMR – Chemical Shift (5/5)
7/40
Chemical Shift Assignment (1/2)
Find out Chemical Shift for Each Atom
• Backbone: Ca, Cb, C', N, NH
• Experiments: HSQC, CBCANH, CBCA(CO)NH
[Figure: one amino acid, with atoms labeled N, H, Cα, Cβ, Cγ, Cδ, C, O and the side-chain hydrogens]
8/40
Chemical Shift Assignment (2/2)
[Figure: a four-residue peptide backbone (-N-C-C-) with side chains; carbon atoms are annotated with chemical shift ranges in ppm: 18-23, 19-24, 16-20, 17-23, 31-34 and 55-60, plus CH3 at 30-35]
9/40
HSQC Spectra
• HSQC peaks (1 chemical shift pair for one amino acid)

H       N        Intensity
8.109   118.60   65920032
10/40
CBCA(CO)NH Spectra
• CBCA(CO)NH peaks (2 chemical shifts for one amino acid)

H       N        C       Intensity
8.116   118.25   16.37   79238811
8.109   118.60   36.52   65920032
11/40
CBCANH Spectra
• CBCANH peaks (4 chemical shifts for one amino acid)
• Sign of the intensity: Ca (+), Cb (-)

H       N        C       Intensity
8.116   118.25   16.37    79238811
8.109   118.60   36.52   -65920032
8.117   118.90   61.58   -51223894
8.119   117.25   57.42   109928374
12/40
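The sign convention on the CBCANH slide (Ca peaks positive, Cb peaks negative) can be applied mechanically. The following is a minimal Python sketch using the four peaks listed on that slide; the `Peak` type and the `carbon_type` function are illustrative names introduced here, not part of any NMR toolkit.

```python
from typing import NamedTuple


class Peak(NamedTuple):
    """One CBCANH peak: H, N and C chemical shifts (ppm) plus intensity."""
    h: float
    n: float
    c: float
    intensity: float


def carbon_type(peak: Peak) -> str:
    """Label a CBCANH peak by the sign of its intensity: Ca (+), Cb (-)."""
    return "Ca" if peak.intensity > 0 else "Cb"


# The four CBCANH peaks from the slide.
cbcanh_peaks = [
    Peak(8.116, 118.25, 16.37, 79238811),
    Peak(8.109, 118.60, 36.52, -65920032),
    Peak(8.117, 118.90, 61.58, -51223894),
    Peak(8.119, 117.25, 57.42, 109928374),
]

labels = [carbon_type(p) for p in cbcanh_peaks]
print(labels)
```

The sign alone says whether a peak is a Ca or a Cb; it does not say whether the carbon belongs to residue i or to residue i-1. That distinction needs the CBCA(CO)NH peaks.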
A Dataset Example
[Figure: peaks from the HSQC, HNCACB and CBCA(CO)NH spectra plotted on the N and H axes]
13/40
A Perfect Spin System Group
CBCA(CO)NH peaks (Ca and Cb of residue i-1):

N         H       C        Intensity
113.293   7.897   56.294   1.64325e+008
113.293   7.897   27.853   1.08099e+008

CBCANH peaks (Ca and Cb of residues i-1 and i):

N         H      C        Intensity
113.293   7.92   62.544    8.52851e+007
113.293   7.92   56.294    4.71331e+007
113.293   7.92   68.483   -8.54121e+007
113.293   7.92   28.165   -3.49346e+007

Resulting spin system:

Ca(i-1)   Cb(i-1)   Ca(i)    Cb(i)
56.294    28.165    62.544   68.483
14/40
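The perfect spin system group on this slide can be assembled in two steps: the sign of a CBCANH peak separates Ca from Cb, and agreement with a CBCA(CO)NH peak marks a carbon as belonging to residue i-1. The sketch below is only an illustration of that reasoning. The function name `build_spin_system`, the dictionary keys and the 0.5 ppm tolerance `tol` are assumptions made here, not values taken from the slides.

```python
def build_spin_system(cbcaconh, cbcanh, tol=0.5):
    """Combine peaks of one group into (Ca, Cb) shifts for residues i-1 and i.

    cbcaconh: list of (carbon_shift, intensity) from CBCA(CO)NH (residue i-1 only)
    cbcanh:   list of (carbon_shift, intensity) from CBCANH (residues i-1 and i)
    tol:      assumed matching tolerance in ppm (hypothetical value)
    Returns a dict, or None when the group is not a perfect one.
    """
    prev_shifts = [c for c, _ in cbcaconh]

    def belongs_to_prev(c):
        # A CBCANH carbon is from residue i-1 if CBCA(CO)NH saw it too.
        return any(abs(c - p) <= tol for p in prev_shifts)

    ca = [c for c, inten in cbcanh if inten > 0]   # Ca peaks are positive
    cb = [c for c, inten in cbcanh if inten < 0]   # Cb peaks are negative

    ca_prev = [c for c in ca if belongs_to_prev(c)]
    ca_own = [c for c in ca if not belongs_to_prev(c)]
    cb_prev = [c for c in cb if belongs_to_prev(c)]
    cb_own = [c for c in cb if not belongs_to_prev(c)]

    # A perfect group has exactly one candidate for each of the four carbons.
    if not all(len(x) == 1 for x in (ca_prev, ca_own, cb_prev, cb_own)):
        return None
    return {"Ca_prev": ca_prev[0], "Cb_prev": cb_prev[0],
            "Ca_own": ca_own[0], "Cb_own": cb_own[0]}


# Peaks from the slide.
cbcaconh = [(56.294, 1.64325e+008), (27.853, 1.08099e+008)]
cbcanh = [(62.544, 8.52851e+007), (56.294, 4.71331e+007),
          (68.483, -8.54121e+007), (28.165, -3.49346e+007)]

spin_system = build_spin_system(cbcaconh, cbcanh)
print(spin_system)
```

Groups with missing or extra peaks return `None` here; those are the false negative and false positive cases discussed on later slides.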
Coding
• Translate the target protein sequence and spin systems into coding sequences based on the following table.
Atreya, H.S., K.V.R. Chary, and G. Govil, Automated NMR assignments of proteins for high
throughput structure determination: TATAPRO II. Current Science, 2002. 83(11): p. 1372-1376.
15/40
Backbone Assignment
• Goal
  ○ Assign chemical shifts to N, NH, Ca (and Cb) along the protein backbone.
• General approaches
  ○ Generate spin systems. A spin system is an amino acid with known chemical shifts on its N, NH, Ca (and Cb).
  ○ Link spin systems.
16/40
17 /40
Backbone Assignment
DGRIGEIKGRKTLATPAVRRLAMENNIKLS
18 /40
Blind Men's Elephant
• We cannot directly "see" the positions of these atoms (the 3D structure).
• But we can measure a set of parameters (with constraints) on these atoms, which can help us infer their coordinates.
  ○ Each experiment can determine only a subset of the parameters (with noise).
  ○ To combine the parameters from different experiments, we need to stitch them together.
A Peculiar Parking Lot (valet parking)
Information you have: The make of your car, the car parked in
front of you (approximately). Together with others, try to identify
as many cars as possible (maximizing the overall satisfaction).
19 /43
Ambiguities
• All 4-point experiments are mixed together
• All 2-point experiments are mixed together
• Each spin system can be mapped to several amino acids in the protein sequence
• False positives, false negatives
20/40
Multiple Candidates
• One spin system may be assigned to many places in a protein sequence.
• Spin system (SS):

N       H      Ca(i-1)   Cb(i-1)   Ca(i)   Cb(i)
119.7   8.84   58.4      32.7      56.3    40.8

• Protein sequence: AKFERQHMDSSTSRNLTKDR
[Figure: four possible places for the spin system (SS) marked on the sequence]
21/40
False Positives and False Negatives
• False positives
  ○ Noise with high intensity
  ○ Produce fake spin systems
• False negatives
  ○ Peaks with low intensity
  ○ Missing peaks
• In real wet-lab data, nearly 50% is noise (false positives).
22/40
False Positive & False Negative
[Figure: HSQC, HNCACB and CBCA(CO)NH peaks on the N and H axes, showing a perfect group, a false negative and a false positive]
23/40
Ambiguous Spin System
First peak list (two peaks):

N       H      C       Intensity
106.9   8.87   54.92   423879
106.9   8.87   40.35   524522

Second peak list (five peaks):

N        H      C       Intensity
106.91   8.85   59.7     235673
106.92   8.86   54.93    346234
106.91   8.86   61.5     432432
106.91   8.85   40.31   -335759
106.92   8.86   30.5    -483759

Two possible spin systems:

N       H      Ca(i-1)   Cb(i-1)   Ca(i)   Cb(i)
106.1   8.85   54.93     40.31     59.7    30.5
106.1   8.85   61.5      40.31     59.7    30.5
24/40
Spin System Group
• Nearest neighboring (TATAPRO, RIBRA, GASA)
[Figure: HNCACB and CBCA(CO)NH peaks grouped around HSQC peaks on the N and H axes]
25/40
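Nearest-neighbor grouping can be pictured as attaching every three-dimensional peak to the HSQC peak closest to it in the (H, N) plane. The sketch below is one way to do that, not the procedure of TATAPRO, RIBRA or GASA. The function name `group_peaks` and the scaling factor `n_scale` (used because N shifts span a much wider ppm range than H shifts) are assumptions introduced for illustration.

```python
import math


def group_peaks(hsqc, peaks, n_scale=0.1):
    """Attach each 3D peak to its nearest HSQC peak in the (H, N) plane.

    hsqc:    list of (h, n) HSQC peak positions
    peaks:   list of (h, n, c) peaks from a 3D experiment
    n_scale: assumed weight applied to N differences (hypothetical value)
    Returns a dict mapping the HSQC peak index to the list of attached peaks.
    """
    groups = {i: [] for i in range(len(hsqc))}
    for peak in peaks:
        h, n = peak[0], peak[1]
        nearest = min(
            range(len(hsqc)),
            key=lambda i: math.hypot(h - hsqc[i][0], (n - hsqc[i][1]) * n_scale),
        )
        groups[nearest].append(peak)
    return groups


# Two illustrative HSQC roots and the CBCANH peak positions from the earlier slide.
hsqc = [(8.109, 118.60), (8.119, 117.25)]
peaks = [(8.116, 118.25, 16.37), (8.109, 118.60, 36.52),
         (8.117, 118.90, 61.58), (8.119, 117.25, 57.42)]

groups = group_peaks(hsqc, peaks)
print(groups)
```

With noisy data a peak can sit almost equally close to two HSQC peaks, which is one source of the ambiguous spin systems shown earlier.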
Spin System Linking
• Goal
  ○ Link spin systems into segments that are as long as possible.
• Constraints
  ○ Each spin system is uniquely assigned to a position of the target protein sequence.
  ○ Two spin systems are linked only if the differences between the intra-residue chemical shifts of one and the inter-residue chemical shifts of the other are less than the predefined thresholds.
26/40
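The linking constraint compares the own (intra-residue) Ca and Cb of one spin system with the preceding-residue (inter-residue) Ca and Cb of the next. A minimal sketch follows, using three spin systems from the Spin System Positioning slide. It assumes the four numbers of each spin system are ordered Ca(i-1), Cb(i-1), Ca(i), Cb(i); the thresholds of 0.5 ppm and the names `SpinSystem` and `can_link` are hypothetical.

```python
from typing import NamedTuple


class SpinSystem(NamedTuple):
    ca_prev: float  # Ca of residue i-1 (inter-residue)
    cb_prev: float  # Cb of residue i-1 (inter-residue)
    ca_own: float   # Ca of residue i (intra-residue)
    cb_own: float   # Cb of residue i (intra-residue)


def can_link(first, second, ca_tol=0.5, cb_tol=0.5):
    """True if `second` may directly follow `first` along the backbone.

    The own carbons of `first` must agree with the previous-residue
    carbons of `second` within the (assumed) thresholds in ppm.
    """
    return (abs(first.ca_own - second.ca_prev) <= ca_tol
            and abs(first.cb_own - second.cb_prev) <= cb_tol)


s1 = SpinSystem(55.266, 38.675, 44.555, 0)
s2 = SpinSystem(44.417, 0, 55.043, 30.04)
s3 = SpinSystem(44.417, 0, 30.665, 28.72)

print(can_link(s1, s2), can_link(s1, s3), can_link(s2, s1))
```

Here `s1` can be followed by either `s2` or `s3`, which is exactly the kind of ambiguous link the constraint alone cannot resolve.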
Previous Approaches
• Constrained bipartite matching problem*
  ○ Cannot deal with ambiguous links
[Figure: a legal matching and an illegal matching under constraints]
*Xu Y, Xu D, Kim D, Olman V, Razumovskaya J, Jiang T. Automated assignment of backbone NMR peaks using constrained
bipartite matching. Computing in Science & Engineering 2002;4(1):50-62.
27/40
Natural Language Processing: Noise or Ambiguity?
• Speech recognition: homophone selection
  ○ Spoken sentence: 台北市一位小孩走失了 ("A child has gone missing in Taipei City")
  ○ Candidate words for the same sounds: 台北市, 台北, 小孩, 走失, 適宜, 事宜, 一位, 一味, 移位
28/40
An Error-Tolerant Algorithm
29/40
Phrase, Sentence Combination
30/40
Spin System Positioning
• We assign spin system groups to a protein sequence according to their codes.

Sequence codes: D 50, G 10, R 40, I 50|51

Spin system                       Codes
55.266  38.675  44.555  0      => 50 10
44.417  0       55.043  30.04  => 10 40
44.417  0       30.665  28.72  => 10 40
55356   29.782  60.044  37.541 => 40 50
31/40
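Positioning by code amounts to sliding each coded spin system along the coded sequence and keeping the places where both codes agree. The sketch below uses only the codes visible on this slide (D 50, G 10, R 40, I 50|51); the full coding table is not reproduced in the text, so the dictionary `residue_codes` is deliberately limited to these four residues, and the function name `candidate_positions` is an illustrative choice.

```python
# Codes taken from the slide; "I 50|51" means isoleucine accepts either code.
residue_codes = {"D": {50}, "G": {10}, "R": {40}, "I": {50, 51}}


def candidate_positions(sequence, code_pair):
    """Return 0-based indices i where a spin system with codes
    (code of residue i-1, code of residue i) fits the sequence."""
    prev_code, own_code = code_pair
    positions = []
    for i in range(1, len(sequence)):
        if (prev_code in residue_codes[sequence[i - 1]]
                and own_code in residue_codes[sequence[i]]):
            positions.append(i)
    return positions


sequence = "DGRI"
print(candidate_positions(sequence, (50, 10)))  # spin system coded 50 10
print(candidate_positions(sequence, (10, 40)))  # spin systems coded 10 40
print(candidate_positions(sequence, (40, 50)))  # spin system coded 40 50
```

Two spin systems on the slide share the code pair 10 40, so both are candidates for the same position and only one of them can be kept in a final assignment.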
Link Spin System groups
[Figure: the four spin systems, positioned on residues D, G, R and I, are linked into Segment 1, Segment 2 and Segment 3]
32/40
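Iterative Concatenation
[Figure: spin systems 1 to 56 are concatenated step by step along the sequence DGRI….FKJJREKL. Step 1 starts from the individual spin systems; Step 2 produces Segment 1, Segment 2, …; Step n-1 produces Segment 78, Segment 79, …; Step n produces Segment 99.]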
33/40
Conflict Segments
[Figure: Segments 71, 78, 79, 97, 98 and 99 placed on the sequence DGRIGEIKGRKTLATPAVRRLAMENNIKLS]
• Two kinds of conflict between segments
  ○ Overlap (e.g. segment 71 and segment 99)
  ○ Use of the same spin system (e.g. segment 78 and segment 79 both contain spin system 1)
34/40
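Both kinds of conflict can be tested with a small predicate. In the sketch below a segment is described by its start position on the sequence and the spin systems it contains, one per residue. The `Segment` class, its field names and the example positions are hypothetical; only the segment numbers and the shared spin system 1 come from the slide.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Segment:
    name: str
    start: int            # 0-based position of the first residue covered
    spin_systems: tuple   # spin system ids, one per consecutive residue

    @property
    def end(self):
        """Position of the last residue covered by the segment."""
        return self.start + len(self.spin_systems) - 1


def in_conflict(a, b):
    """Two segments conflict if they overlap on the sequence
    or if they use the same spin system."""
    overlap = a.start <= b.end and b.start <= a.end
    shared = bool(set(a.spin_systems) & set(b.spin_systems))
    return overlap or shared


# Illustrative positions and spin system ids (assumed, not from the slide).
seg71 = Segment("71", start=2, spin_systems=(5, 6, 7))
seg99 = Segment("99", start=4, spin_systems=(8, 9))
seg78 = Segment("78", start=10, spin_systems=(1, 2))
seg79 = Segment("79", start=20, spin_systems=(1, 47))

print(in_conflict(seg71, seg99))  # overlap on the sequence
print(in_conflict(seg78, seg79))  # both contain spin system 1
print(in_conflict(seg71, seg79))  # neither kind of conflict
```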
Independent Set
Subset S of vertices such that no two vertices in S are connected
www.cs.rochester.edu/~stefanko/Teaching/06CS282/06-CSC282-17.ppt
35/40
A Graph Model for Spin System Linking
• G(V, E)
  ○ V: a set of nodes (segments).
  ○ E: edges (u, v) with u, v ∈ V such that u and v are in conflict.
• Goal
  ○ Assign as many non-conflicting segments as possible => find the maximum independent set of G.
37/40
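For a handful of segments the maximum independent set of the conflict graph can be found by exhaustive search, which makes the goal concrete even though it does not scale. The sketch below is a brute-force illustration only, not the method used in the slides; the function name and the four-vertex example graph are assumptions.

```python
from itertools import combinations


def maximum_independent_set(vertices, edges):
    """Exhaustively search for a largest set of vertices with no edge
    between any two of them. Exponential time: small graphs only."""
    edge_set = {frozenset(e) for e in edges}
    for size in range(len(vertices), 0, -1):
        for subset in combinations(vertices, size):
            if all(frozenset(pair) not in edge_set
                   for pair in combinations(subset, 2)):
                return set(subset)
    return set()


# A hypothetical conflict graph: four segments whose conflicts form a cycle.
vertices = ["A", "B", "C", "D"]
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")]

best = maximum_independent_set(vertices, edges)
print(best)
```

Finding a maximum independent set is NP-hard in general, which is why an approximation algorithm is used for realistic numbers of segments.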
An Example of G
• Seq.: GEIKGRKTLATPAVRRLAMENNIKLSE
Segment1: SP12->SP13->SP14
Segment2: SP9->SP13->SP20->SP4
Segment3: SP8->SP15->SP21
Segment4: SP7->SP1->SP15->SP3
[Figure: the four segments placed on the sequence, and the conflict graph on Seg1, Seg2, Seg3 and Seg4 with edges labeled SP13, SP15, Overlap and Overlap]
38/40
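The two edges labeled SP13 and SP15 in this example follow directly from the segment definitions: they join segments that share a spin system. A short sketch that recovers them is given below. The two Overlap edges depend on segment positions that are only shown in the figure, so they are not computed here; the function name `shared_spin_system_edges` is illustrative.

```python
from itertools import combinations

# Segment contents as listed on the slide.
segments = {
    "Seg1": ["SP12", "SP13", "SP14"],
    "Seg2": ["SP9", "SP13", "SP20", "SP4"],
    "Seg3": ["SP8", "SP15", "SP21"],
    "Seg4": ["SP7", "SP1", "SP15", "SP3"],
}


def shared_spin_system_edges(segments):
    """Return {(segment_a, segment_b): shared spin systems} for every
    pair of segments that use at least one common spin system."""
    edges = {}
    for a, b in combinations(sorted(segments), 2):
        shared = set(segments[a]) & set(segments[b])
        if shared:
            edges[(a, b)] = shared
    return edges


edges = shared_spin_system_edges(segments)
print(edges)
```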
Segment weight
• The longer a segment is, the higher its weight.
• The lower the frequency of a segment is, the higher its weight.
39/40
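The slide states only the two monotonic properties of the weight, not a formula. Purely as an illustration, the sketch below uses length divided by frequency, which is one simple function with both properties. The formula, the function name `segment_weight` and the reading of "frequency" as a positive count are assumptions made here, not the weighting used by the author.

```python
def segment_weight(length, frequency):
    """Hypothetical weight: grows with segment length and
    shrinks as the segment's frequency grows."""
    if length <= 0 or frequency <= 0:
        raise ValueError("length and frequency must be positive")
    return length / frequency


print(segment_weight(4, 1), segment_weight(3, 1), segment_weight(3, 2))
```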
Find Maximum Weight Independent Set of G (1/2)
• Boppana, R. and M.M. Halldórsson, Approximating Maximum Independent Sets by Excluding Subgraphs. BIT, 1992. 32(2).
[Figure labels: V, N(v), Head_N(v)]
40/40
Find Maximum Weight Independent Set of G (2/2)
• Boppana, R. and M.M. Halldórsson, Approximating Maximum Independent Sets by Excluding Subgraphs. BIT, 1992. 32(2).
[Figure label: V]
41/40
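The cited paper by Boppana and Halldórsson builds on a recursive subroutine usually called Ramsey, which splits the graph around a pivot vertex v into its neighbors N(v) and its non-neighbors. The sketch below shows that subroutine in its unweighted form only. How the slides adapt it to segment weights is not stated in the text, so no weighting is attempted, and the pivot rule (smallest vertex name) is an arbitrary choice made here for reproducibility.

```python
def ramsey(vertices, adjacency):
    """Unweighted Ramsey subroutine: returns (clique, independent_set).

    vertices:  set of vertices of the current subgraph
    adjacency: dict mapping each vertex to the set of its neighbors
    """
    if not vertices:
        return set(), set()
    v = min(vertices)  # arbitrary pivot; min() keeps the run deterministic
    neighbors = vertices & adjacency[v]
    non_neighbors = vertices - adjacency[v] - {v}

    c1, i1 = ramsey(neighbors, adjacency)      # v extends cliques found here
    c2, i2 = ramsey(non_neighbors, adjacency)  # v extends independent sets here

    clique = max(c1 | {v}, c2, key=len)
    independent = max(i1, i2 | {v}, key=len)
    return clique, independent


# A hypothetical conflict graph: four segments whose conflicts form a cycle.
adjacency = {
    "A": {"B", "D"},
    "B": {"A", "C"},
    "C": {"B", "D"},
    "D": {"A", "C"},
}

clique, independent = ramsey(set(adjacency), adjacency)
print(clique, independent)
```

In the paper this subroutine is called repeatedly while the cliques it finds are removed from the graph; only the single recursive step is shown here.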
An Iterative Approach
• We perform spin system generation and linking iteratively.
• Three stages:
  ○ Perfect spin systems
  ○ Weak false negative spin systems
  ○ Severe false negative spin systems
42/40
Segment Extension
[Figure: segment extension on the target sequence, starting from the segments in MaxIndSet (77, 99', 97'); other segment labels shown include 71, 78, 97 and 99]
43/40

RIBRA–an error-tolerant algorithm for the NMR backbone assignment problem

