DaisukeKihara

Human and Server CAPRI Protein Docking Prediction Using LZerD with Combined Scoring Functions
Daisuke Kihara
Department of Biological Sciences
Department of Computer Science
Purdue University, Indiana, USA
http://kiharalab.org

CAPRI Round 30 Results
(Lensink et al., CAPRI30 group paper, 2016)

Overview of Protein Docking Prediction Using LZerD in CAPRI
Single-chain modeling (HHPred, SparksX, MUFold, TASSER, Phyre2, TASSERlite, MultiCom) → PRESCO → sub-unit models
→ LZerD → ~50,000 docking models
→ clustering (RMSD < 5 Å) → re-ranking with scoring functions → 10 models
→ MD relaxation → submit
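
The clustering step in this overview (grouping docking models at an RMSD cutoff of 5 Å) can be illustrated with a minimal sketch. The slides do not say which clustering algorithm is used, so the greedy leader scheme below is an assumption; the function name `greedy_cluster`, the precomputed pairwise RMSD matrix, and the convention that a lower score is better are all hypothetical choices made for illustration.

```python
from typing import List, Sequence


def greedy_cluster(scores: Sequence[float],
                   rmsd: Sequence[Sequence[float]],
                   cutoff: float = 5.0) -> List[List[int]]:
    """Greedy leader clustering (an assumed scheme, not confirmed by the slides).

    The best-scoring unassigned model becomes a cluster center and absorbs
    every unassigned model whose RMSD to that center is below `cutoff` (in Å).
    """
    # Assumption: lower score means a better model.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    assigned = set()
    clusters: List[List[int]] = []
    for center in order:
        if center in assigned:
            continue
        members = [center]
        assigned.add(center)
        for j in order:
            if j not in assigned and rmsd[center][j] < cutoff:
                members.append(j)
                assigned.add(j)
        clusters.append(members)
    return clusters
```

Each returned cluster lists model indices with the cluster center first, so taking the first index of each cluster gives one representative per group for later re-ranking.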

LZerD (Local 3D Zernike descriptor-based Docking program)
(Lizard)
Figure labels: normal vector; 3D Zernike descriptor; 6 Å; interface area
(Venkatraman, Yang, Sael, & Kihara, BMC Bioinformatics, 2009)

3D Zernike Descriptors (3DZD)
 An extension of spherical harmonics-based descriptors
 A 3D object can be represented by a series of orthogonal functions, and thus in practice by a series of coefficients as a feature vector
 Compact
 Rotation invariant
A surface representation of 1ew0A (A) is reconstructed from its 3D Zernike
invariants of the order 5, 10, 15, 20, and 25 (B-F). (Sael & Kihara, 2009)
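3D Zernike functions:
$$Z_{nl}^{m}(r, \vartheta, \varphi) = R_{nl}(r)\, Y_{l}^{m}(\vartheta, \varphi)$$
$Y_{l}^{m}(\vartheta, \varphi)$: spherical harmonics; $R_{nl}(r)$: radial functions
$Z_{nl}^{m}(r, \vartheta, \varphi)$: polynomials in Cartesian coordinates
Zernike moments:
$$\Omega_{nl}^{m} = \frac{3}{4\pi} \int_{|\mathbf{x}| \le 1} f(\mathbf{x})\, \overline{Z_{nl}^{m}(\mathbf{x})}\, d\mathbf{x}$$
Zernike Descriptor:
$$F_{nl} = \sqrt{\sum_{m=-l}^{m=l} \left(\Omega_{nl}^{m}\right)^{2}}$$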
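
The descriptor definition on this slide (collapsing the Zernike moments over m into one rotation-invariant value per (n, l) pair) can be sketched in a few lines. This is only an illustration: it assumes the moments have already been computed and are supplied as a dictionary keyed by (n, l, m), and it assumes the usual index constraints for 3D Zernike functions (l ≤ n with n − l even), which the slide does not state. The function names are hypothetical.

```python
import math
from typing import Dict, List, Tuple


def valid_nl_pairs(max_order: int) -> List[Tuple[int, int]]:
    """Enumerate (n, l) with 0 <= l <= n <= max_order and (n - l) even.

    These index constraints are an assumption taken from the standard
    definition of 3D Zernike functions, not from the slide.
    """
    return [(n, l)
            for n in range(max_order + 1)
            for l in range(n + 1)
            if (n - l) % 2 == 0]


def zernike_descriptor(moments: Dict[Tuple[int, int, int], complex],
                       max_order: int) -> List[float]:
    """Return F_nl = sqrt(sum over m of |Omega_nl^m|^2) for each valid (n, l).

    Missing moments are treated as zero.
    """
    descriptor = []
    for n, l in valid_nl_pairs(max_order):
        total = sum(abs(moments.get((n, l, m), 0j)) ** 2
                    for m in range(-l, l + 1))
        descriptor.append(math.sqrt(total))
    return descriptor
```

Under these index constraints, a maximum order of 20 yields 121 invariants, which is what makes the descriptor a compact feature vector.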

Protein Residue Environment SCOre (PRESCO)
within a sphere of 6 or 8 Å
along the main-chain
Center
(Kim & Kihara, Proteins 2014)

Finding Similar Side-Chain Depth Environment (SDE) from a database
 Query SDE
 Structure database: 2536 proteins
 500 lowest-RMSD fragments of 9 side-chain centroids, superimposed with the query fragment
 Select SDEs with the same number of side-chain centroids in the sphere of 8.0 Å
 Compute RMSD of residue depth for corresponding side-chain centroids
 Sort by depth RMSD to the query
(Figure label: surface)
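
The last three steps of this search (keep environments with the same number of side-chain centroids, compute the residue-depth RMSD, sort by it) can be sketched as follows. This is a simplified illustration, not the PRESCO implementation: it assumes each environment is already reduced to a list of residue depths whose order encodes the centroid correspondence after superposition, and the names `depth_rmsd` and `rank_similar_sdes` are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def depth_rmsd(a: Sequence[float], b: Sequence[float]) -> float:
    """Root-mean-square difference between two equal-length depth lists."""
    if not a or len(a) != len(b):
        raise ValueError("depth lists must be non-empty and of equal length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def rank_similar_sdes(query_depths: Sequence[float],
                      candidates: Sequence[Tuple[str, Sequence[float]]]
                      ) -> List[Tuple[float, str]]:
    """Return (depth RMSD, identifier) pairs sorted from most to least similar.

    Candidates with a different number of side-chain centroids than the
    query are discarded before scoring.
    """
    same_size = [(cid, depths) for cid, depths in candidates
                 if len(depths) == len(query_depths)]
    return sorted((depth_rmsd(query_depths, depths), cid)
                  for cid, depths in same_size)
```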

CASP11 Free Modeling Category Ranking (Model 1)
(http://www.predictioncenter.org/casp11/zscores_final.cgi?formula=assessors)
(Kim & Kihara, Proteins 2015)

DFIRE, GOAP, ITScore Scoring Functions
 DFIRE (Yaoqi Zhou): statistical distance-dependent atom contact potential using the finite ideal-gas reference state
 GOAP (Jeff Skolnick): DFIRE combined with an orientation-dependent term
 ITScore (Xiaoqin Zou): iteratively refined statistical distance-dependent atom contact potential
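
To make "statistical distance-dependent potential with a finite ideal-gas reference state" concrete, here is a minimal sketch of a DFIRE-style pair energy. It is an illustration under assumptions, not the DFIRE code: the expected count is modeled as the observed count in the cutoff bin scaled by (r / r_cut)^alpha and by the bin-width ratio, the default alpha = 1.61 is the exponent commonly quoted for DFIRE and should be checked against the original publication, and `scale` stands in for the temperature-like prefactor. The function name is hypothetical.

```python
import math


def dfire_like_energy(n_obs_r: float,
                      n_obs_cut: float,
                      r: float,
                      r_cut: float,
                      dr: float,
                      dr_cut: float,
                      alpha: float = 1.61,
                      scale: float = 1.0) -> float:
    """Pair energy -scale * ln(N_obs(r) / N_exp(r)) for one atom-type pair.

    N_exp(r) = (r / r_cut)**alpha * (dr / dr_cut) * N_obs(r_cut) is the assumed
    finite ideal-gas reference. Bins with no observations return 0.0, which is
    a simplification chosen for this sketch.
    """
    if n_obs_r <= 0 or n_obs_cut <= 0:
        return 0.0
    n_exp = (r / r_cut) ** alpha * (dr / dr_cut) * n_obs_cut
    return -scale * math.log(n_obs_r / n_exp)
```

A distance bin observed more often than the reference predicts gets a negative (favorable) energy, and one observed less often gets a positive energy.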

The BindML Algorithm
(La D, & Kihara D, Proteins 2012)

Generating Substitution Models
iPfam (505 families)

iPfam Dataset Benchmark
ROC based on 449 protein complexes

BindML Webserver
http://kiharalab.org/bindml
(Wei Q, La D, & Kihara D, Methods in Mol. Biol., in press, 2016)

T79 (Round 30)

(Interface 2) Kihara: 3 hits; LZerD: 1 hit

Homodimer

LZerD runs:
 No-interface prediction
 With BindML-consPPISP prediction

LZerD selection strategy:
 Consensus of ITScore and GOAP
 5 from no-interface, 5 from BindML-consPPISP

Kihara selection strategy:
 Manual combination of ITScore, GOAP, DFIRE, and PRESCO
 10 from no-interface
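
The slide does not specify how the "consensus of ITScore and GOAP" is computed. One simple possibility, shown here purely as an assumed illustration, is a rank-sum consensus: rank the models under each scoring function and pick those with the lowest total rank. The function name, the lower-is-better score convention, and the tie-break by model name are all hypothetical.

```python
from typing import Dict, List


def consensus_select(score_tables: Dict[str, Dict[str, float]],
                     n_select: int = 10) -> List[str]:
    """Pick models with the lowest sum of ranks across scoring functions.

    `score_tables` maps a scoring-function name to {model name: score}.
    Assumptions: lower score is better, and every table scores the same models.
    """
    rank_sum: Dict[str, int] = {}
    for scores in score_tables.values():
        ordered = sorted(scores, key=lambda model: scores[model])
        for rank, model in enumerate(ordered, start=1):
            rank_sum[model] = rank_sum.get(model, 0) + rank
    # Ties in rank sum are broken alphabetically to keep the output stable.
    return sorted(rank_sum, key=lambda model: (rank_sum[model], model))[:n_select]
```

Selecting "5 from no-interface, 5 from BindML-consPPISP" would then amount to calling the function once per docking run with `n_select=5`.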

T79 Subunit Model Quality
Chain A RMSD: 4.0 Å
Chain B RMSD: 4.0 Å
(Figure legend: native, model)

T79 Human Selected Model
fnat 0.16, L-RMSD 14.1 Å, i-RMSD 3.8 Å
(Figure legend: native, model)
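
The fnat, L-RMSD, and i-RMSD values reported for this model can be mapped onto CAPRI quality classes. The sketch below uses the commonly cited CAPRI thresholds in a simple cascade form; the thresholds are not given in the slides, so treat them as an assumption to verify against the official CAPRI assessment criteria. The function name is hypothetical.

```python
def capri_class(fnat: float, l_rmsd: float, i_rmsd: float) -> str:
    """Classify a docking model using commonly cited CAPRI thresholds.

    fnat is the fraction of native contacts; l_rmsd and i_rmsd are the
    ligand and interface RMSD in Å. Checks run from strictest to loosest.
    """
    if fnat >= 0.5 and (l_rmsd <= 1.0 or i_rmsd <= 1.0):
        return "high"
    if fnat >= 0.3 and (l_rmsd <= 5.0 or i_rmsd <= 2.0):
        return "medium"
    if fnat >= 0.1 and (l_rmsd <= 10.0 or i_rmsd <= 4.0):
        return "acceptable"
    return "incorrect"
```

With these thresholds the model on this slide (fnat 0.16, L-RMSD 14.1 Å, i-RMSD 3.8 Å) falls in the acceptable class, because the i-RMSD criterion is met even though the L-RMSD exceeds 10 Å.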

T79 Interface Prediction
Method      Precision  Recall  F-Score
BindML      0          0       NA
Cons-PPISP  0.10       0.18    0.12

T79 Scores (no-interface prediction)
[Figure: ITScore, GOAP, and DFIRE versus L-RMSD, fnat, and i-RMSD]

T79 Score Comparison
[Figure: ITScore, GOAP, and DFIRE compared against one another]

T79 PRESCO scores
[Figure: PRESCO score versus L-RMSD; panels: With Interface Prediction, Without Interface Prediction]

T79 Score Performance Summary
Run               Score    RFH      Hits in top 10
nointerface       ITScore  1 (62)   3
nointerface       GOAP     1 (72)   3
nointerface       DFIRE    1 (111)  5
BindML-consPPISP  all      -        -
RFH: rank of first acceptable (medium) hit

T91 (Round 30)

Kihara: 8 hits; LZerD: 2 hits

Homodimer

LZerD runs:
 No-interface prediction (with our monomer model)
 With BindML+consPPISP interface prediction
 Zhang1 CASP server model, no-interface prediction

Server selection strategy:
 10 from no-interface

Human selection strategy:
 Consensus of ITScore, GOAP, PRESCO, and visual inspection
 5 from no-interface, 5 from Zhang1

T91 Subunit Models
Chain C: our model RMSD 6.0 Å; Zhang1 RMSD 4.9 Å
Chain D: our model RMSD 6.5 Å; Zhang1 RMSD 5.7 Å
(Figure legend: native, our model, Zhang1)

T91 Human Selected Model
fnat 0.33, L-RMSD 9.0 Å, I-RMSD 4.2 Å
(Figure legend: native, model)

T91 Interface Prediction
Method      Precision  Recall  F-Score
BindML      0.64       0.20    0.30
Cons-PPISP  0.50       0.28    0.36
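
The precision, recall, and F-score columns in the interface prediction tables follow the usual definitions over sets of residues. A minimal sketch, assuming predicted and true interface residues are given as sets of residue numbers (the function names are hypothetical), with the F-score reported as undefined (NA in the tables, `None` here) when precision and recall are both zero:

```python
from typing import Optional, Set, Tuple


def f_score(precision: float, recall: float) -> Optional[float]:
    """Harmonic mean of precision and recall; None when both are zero."""
    if precision + recall == 0:
        return None
    return 2 * precision * recall / (precision + recall)


def interface_prediction_metrics(predicted: Set[int],
                                 actual: Set[int]
                                 ) -> Tuple[float, float, Optional[float]]:
    """Return (precision, recall, F-score) for predicted interface residues."""
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    return precision, recall, f_score(precision, recall)
```

As a check against the table, a precision of 0.64 with a recall of 0.20 gives an F-score of about 0.30, and 0.50 with 0.28 gives about 0.36.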

T91 Score (no interface prediction)
[Figure: ITScore, GOAP, and DFIRE versus L-RMSD, fnat, and i-RMSD]

T91 Scores (With Interface prediction)
[Figure: ITScore, GOAP, and DFIRE versus L-RMSD, fnat, and i-RMSD]

T91 Scores (Zhang models)
[Figure: ITScore, GOAP, and DFIRE versus L-RMSD, fnat, and i-RMSD]

T91 Zhang1 Score Comparison
[Figure: ITScore, GOAP, and DFIRE compared against one another]

T91 PRESCO Scores
[Figure: PRESCO score versus L-RMSD; panels: Without Interface Prediction, Docking with Zhang models]
Top 5 models selected from each

T91 Score Performance Summary
Run          Score    RFH     Hits in top 10
nointerface  ITScore  2       2
nointerface  GOAP     2       1
nointerface  DFIRE    1       2
interface    ITScore  1042    0
interface    GOAP     165     0
interface    DFIRE    116     0
zhang1       ITScore  1 (4)   5
zhang1       GOAP     2 (16)  5
zhang1       DFIRE    1 (6)   6
RFH: rank of first acceptable (medium) hit
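
The two summary columns, RFH and hits in top 10, can be computed from a list of model quality labels ordered by score. A minimal sketch, assuming the labels are strings such as "incorrect", "acceptable", "medium", and "high" and that the list is already sorted from best to worst score (function names are hypothetical):

```python
from typing import Optional, Sequence, Tuple

ACCEPTABLE_OR_BETTER: Tuple[str, ...] = ("acceptable", "medium", "high")
MEDIUM_OR_BETTER: Tuple[str, ...] = ("medium", "high")


def rank_of_first_hit(ranked_quality: Sequence[str],
                      hit_classes: Sequence[str] = ACCEPTABLE_OR_BETTER
                      ) -> Optional[int]:
    """1-based rank of the first model whose class is in `hit_classes`."""
    for rank, quality in enumerate(ranked_quality, start=1):
        if quality in hit_classes:
            return rank
    return None  # no hit anywhere in the ranking


def hits_in_top(ranked_quality: Sequence[str],
                n: int = 10,
                hit_classes: Sequence[str] = ACCEPTABLE_OR_BETTER) -> int:
    """Number of hits among the top `n` ranked models."""
    return sum(1 for quality in ranked_quality[:n] if quality in hit_classes)
```

Calling `rank_of_first_hit` a second time with `MEDIUM_OR_BETTER` gives the kind of value shown in parentheses in the RFH column.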

T96 (Round 31)

Heterodimer

Predictor hits: 0 (5 by other groups)

Scorer hits: human 1, server 0 (1 by another group)
 Human: 6 selected by PRESCO; 4 selected using the predicted interface, ITScore, GOAP, and DFIRE

No PDB file for the native structure was available: metrics were computed using the two scorer hits (average L-RMSD/I-RMSD, max fnat)

T96 scorer hits
S39.M03 (Haliloglu): fnat 0.22, L-RMSD 5.68 Å, I-RMSD 2.44 Å
S31.M06 (Kihara): fnat 0.32, L-RMSD 7.99 Å, I-RMSD 2.67 Å
(Figure labels: Chain A, Chain B)

T96 Interface Prediction
Chain  Method       Precision  Recall  F-score
A      BindML       0.15       0.2     0.17
A      Cons-PPISP   0          0       NA
B      BindML       0.12       0.11    0.12
B      Cons-PPISP*  NA         NA      NA
*Cons-PPISP predictions were only for the N-terminal tail; visual inspection suggests that the N-terminal tail is not likely a binding site, so these predictions were not used.

T96 Scorer-Models Scores
[Figure: ITScore, GOAP, and DFIRE versus L-RMSD, fnat, and i-RMSD]

T96 Score Performance Summary
Score    RFH  Hits in top 10
ITScore  529  0
GOAP     6    1
DFIRE    125  0
RFH: rank of first acceptable hit
• The hit for GOAP/DFIRE is the same model picked by PRESCO

Summary
 Our docking prediction procedure runs LZerD and selects decoys by combining DFIRE, ITScore, GOAP, and PRESCO. Binding sites are predicted by BindML and cons-PPISP.
 On the examples shown, PRESCO's performance was not as spectacular as we expected from its performance on single-chain structure prediction.
 DFIRE, ITScore, and GOAP showed similar, reasonably good performance.
 Scoring function performance depends on subunit model quality.
 The way BindML predictions are used needs to be improved.

Lab Members
@kiharalab
Lenna Peterson
Hyung-Rae Kim
