Next Generation Data and Opportunities for Clinical Pharmacologists

Published on

Presentation at the Pre-meeting Workshop Next-Generation Clinical Pharmacology: Integrating Systems Pharmacology, Data-Driven Therapeutics, and Personalized Medicine. American Society for Clinical Pharmacology and Therapeutics Annual Meeting Atlanta GA March 18, 2014.

Published in: Education
Notes
  • Tuberculosis, which is caused by the bacterial pathogen Mycobacterium tuberculosis, is a leading cause of mortality among the infectious diseases. It has been estimated by the World Health Organization (WHO) that almost one-third of the world's population, around 2 billion people, is infected with the disease.
    Every year, more than 8 million people develop an active form of the disease, which claims the lives of nearly 2 million. This translates to over 4,900 deaths per day, and more than 95% of these are in developing countries.
    Despite the current global situation, antitubercular drugs have remained largely unchanged over the last four decades. The widespread use of these agents has provided a strong selective pressure for M. tuberculosis, thus encouraging the emergence of resistant strains.
    Multidrug resistant (MDR) tuberculosis is defined as resistance to the first-line drugs isoniazid and rifampin. The effective treatment of MDR tuberculosis necessitates long-term use of second-line drug combinations, an unfortunate consequence of which is the emergence of further drug resistance.
    Enter extensively drug resistant (XDR) tuberculosis: M. tuberculosis strains that are resistant to both isoniazid and rifampin, as well as key second-line drugs. Since the only remaining drug classes exhibit such low potency and high toxicity, XDR tuberculosis is extremely difficult to treat.
    The rise of XDR tuberculosis around the world poses a great threat to human health, making the development of new antitubercular agents an urgent priority.
    Very few Mtb proteins have been explored as drug targets.
  • 3,996 proteins in TB proteome
    749 solved structures in the PDB, representing a total of 284 proteins (284 / 3,996 = 7.1% coverage)
    ModBase contains homology models for entire TB proteome
    1,446 ‘high quality’ homology models were added to the data set
    Structural coverage increased to 43.3% ((284 + 1,446) / 3,996)
    Retained only those models with a model score of > 0.7 and a ModPipe quality score of > 1.1 (2,818 models).
    There were multiple models per protein. For each TB protein, chose the model with the best model score, and if they were equal, chose the model with the best ModPipe quality score (1,703 models).
    However, 251 (+6) models were removed since they correspond to TB proteins that already have solved structures (1,446 models remained).
    The model score is a score for the reliability of a model, derived from statistical potentials (F. Melo, R. Sanchez, A. Sali, 2001). A model is predicted to be good when the model score is higher than a pre-specified cutoff (0.7). A reliable model has a probability of the correct fold that is larger than 95%. A fold is correct when at least 30% of its Cα atoms superpose within 3.5 Å of their correct positions.
    The ModPipe Protein Quality Score (MPQS) is a composite score comprising sequence identity to the template, coverage, and the three individual scores E-value, z-DOPE and GA341. We consider an MPQS of > 1.1 as reliable.
  • (nutraceuticals excluded)
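The homology-model selection described in the notes above (keep models with a model score above 0.7 and a ModPipe quality score above 1.1, take the best model per protein, then drop proteins that already have solved structures) can be sketched in Python. This is a minimal illustration, not the TB-drugome code: the class, field, and function names are hypothetical, and only the cutoffs and the protein counts come from the notes.

```python
from dataclasses import dataclass

# Cutoffs quoted in the notes; everything else in this sketch is an assumption.
MODEL_SCORE_CUTOFF = 0.7
MPQS_CUTOFF = 1.1


@dataclass(frozen=True)
class HomologyModel:
    protein_id: str     # hypothetical identifier for a TB protein
    model_score: float  # reliability score derived from statistical potentials
    mpqs: float         # ModPipe Protein Quality Score


def select_models(models, solved_ids):
    """Return {protein_id: best model} following the three steps in the notes."""
    # Step 1: keep only models that pass both quality cutoffs.
    reliable = [
        m for m in models
        if m.model_score > MODEL_SCORE_CUTOFF and m.mpqs > MPQS_CUTOFF
    ]
    # Step 2: one model per protein, best model score first, MPQS as tie-breaker.
    best = {}
    for m in reliable:
        current = best.get(m.protein_id)
        if current is None or (m.model_score, m.mpqs) > (current.model_score, current.mpqs):
            best[m.protein_id] = m
    # Step 3: drop proteins that already have a solved structure.
    return {pid: m for pid, m in best.items() if pid not in solved_ids}


def coverage_percent(n_covered, proteome_size):
    """Structural coverage as a percentage, rounded to one decimal place."""
    return round(100.0 * n_covered / proteome_size, 1)


if __name__ == "__main__":
    proteome_size = 3996  # proteins in the TB proteome
    solved = 284          # proteins with solved structures
    modelled = 1446       # proteins covered only by retained homology models
    print(coverage_percent(solved, proteome_size))             # 7.1
    print(coverage_percent(solved + modelled, proteome_size))  # 43.3
```

With the counts from the notes, the two coverage figures come out as 7.1% and 43.3%, which matches the values on slide 11.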
  • Next Generation Data and Opportunities for Clinical Pharmacologists

    1. 1. NEXT GENERATION DATA AND OPPORTUNITIES FOR CLINICAL PHARMACOLOGISTS Philip E. Bourne Ph.D. Associate Director for Data Science National Institutes of Health
    2. 2. As of March 3, 2014
    3. 3. Agenda  Research that Informs my NIH Agenda – The TB drugome – towards reproducibility – Systems pharmacology – towards interoperability  Some Challenges – We have the why, but we lack the how – The how involves: • Representation • Sustainability • Discoverability • Training
    4. 4. Reconstruction of Genome-Scale 3D Drug-Target Interaction Models Integrating chemical genomics and structural systems biology [Figure: workflow with labels: query chemical, 3D model of annotated target, 3D model of novel target, interaction model, SMAP, protein-ligand docking, MD simulation, network modeling, experimental support] L. Xie and P.E. Bourne 2008 PNAS, 105(14) 5441-5446 http://funsite.sdsc.edu
    5. 5. • Geometric and topological constraints • Evolutionary constraints • Dynamic constraints • Physicochemical constraints Detecting Protein Binding Promiscuity in a Given Proteome SMAP v2.0 Approach
    6. 6. Geometric Potential – A Geometric Constraint  Challenge: inherent flexibility and uncertainty in homology models  Representation of the protein structure - Cα atoms only - Delaunay tessellation - Graph representation  Geometric Potential (GP): $GP = P + \sum_{\text{neighbors}} \frac{P_i}{D_i + 1.0} \times \frac{\cos(\alpha_i) + 1.0}{2.0}$ L. Xie & P. E. Bourne, BMC Bioinformatics, 8(2007):S9 [Figure: geometric potential scale (0 to 100); plot of geometric potential for binding site vs. non-binding site] Approach
    7. 7. Sequence-order Independent Profile-Profile Alignment (SOIPPA) [Figure: residues L, E, R, V, K, D, L shown in Structure A and Structure B; alignment scores S = 8 and S = 4] Xie & Bourne, PNAS, 105(2008):5441 Approach
    8. 8. Similarity Matrix of Alignment – Chemical & Evolutionary Constraints? Constraint - Chemical Similarity • Amino acid grouping: (LVIMC), (AGSTP), (FYW), and (EDNQKRH) • Amino acid chemical similarity matrix Constraint - Evolutionary Correlation • Amino acid substitution matrix such as BLOSUM45 • Similarity score between two sequence profiles: $d = \sum_i f_a^i S_b^i + \sum_i f_b^i S_a^i$, where $f_a$, $f_b$ are the 20 amino acid target frequencies of profiles a and b, respectively, and $S_a$, $S_b$ are the PSSMs of profiles a and b, respectively Xie and Bourne 2008 PNAS, 105(14) 5441
    9. 9. The Problem with Tuberculosis  One third of global population infected  1.7 million deaths per year  95% of deaths in developing countries  Anti-TB drugs hardly changed in 40 years  MDR-TB and XDR-TB pose a threat to human health worldwide  Development of novel, effective and inexpensive drugs is an urgent priority
    10. 10. The TB-Drugome 1. Determine the TB structural proteome 2. Determine all known drug binding sites from the PDB 3. Determine which of the sites found in 2 exist in 1 4. Call the result the TB-drugome Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976
    11. 11. 1. Determine the TB Structural Proteome [Figure: TB proteome of 3,996 proteins: 284 with solved structures, 1,446 with homology models, 2,266 without structural coverage]  High quality homology models from ModBase (http://modbase.compbio.ucsf.edu) increase structural coverage from 7.1% to 43.3% Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976
    12. 12. 2. Determine all Known Drug Binding Sites in the PDB  Searched the PDB for protein crystal structures bound with FDA-approved drugs  268 drugs bound in a total of 931 binding sites [Figure: number of drug binding sites per drug; labeled drugs: methotrexate, chenodiol, alitretinoin, conjugated estrogens, darunavir, acarbose] Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976
    13. 13. 3. Map 2 onto 1 – The TB-Drugome http://funsite.sdsc.edu/drugome/TB/ Similarities between the binding sites of M.tb proteins (blue), and binding sites containing approved drugs (red).
    14. 14. From a Drug Repositioning Perspective  Similarities between drug binding sites and TB proteins are found for 61/268 drugs  41 of these drugs could potentially inhibit more than one TB protein [Figure: number of potential TB targets per drug; labeled drugs: raloxifene, alitretinoin, conjugated estrogens & methotrexate, ritonavir, testosterone, levothyroxine, chenodiol] Kinnings et al 2010 PLoS Comp Biol 6(11): e1000976
    15. 15. Agenda  Research that Informs my NIH Agenda – The TB drugome – towards reproducibility – Systems pharmacology – towards interoperability  Some Challenges – We have the why, but we lack the how – The how involves: • Representation • Sustainability • Discoverability • Training
    16. 16. Agenda  Research that Informs my NIH Agenda – The TB drugome – towards reproducibility – Systems pharmacology – towards interoperability  Some Challenges – We have the why, but we lack the how – The how involves: • Representation • Sustainability • Discoverability • Training
    17. 17. Characteristics of the Original and Current Experiment  Original and Current: – Purely in silico – Uses a combination of public databases and open source software by us and others  Original: – http://funsite.sdsc.edu/drugome/TB/  Current: – Recast in the Wings workflow system
    18. 18. Considered the Ability to Reproduce by Four Classes of User  REP-AUTHOR – original author of the work  REP-EXPERT – domain expert – can reproduce even with incomplete methods described  REP-NOVICE – basic domain (bioinformatics) expertise  REP-MINIMAL – researcher with no domain expertise Garijo et al 2013 PLOS ONE 8(11): e80278
    19. 19. A Conceptual Overview of the Method Should Be Mandatory Garijo et al 2013 PLOS ONE 8(11): e80278
    20. 20. Time to Reproduce the Method Garijo et al 2013 PLOS ONE 8(11): e80278
    21. 21. It's not that we could not reproduce the work, but the effort involved was substantial. Any graduate student could tell you this, and little has changed in 40 years. Perhaps it is time we did better?
    22. 22. Agenda  Research that Informs my NIH Agenda – The TB drugome – towards reproducibility – Systems pharmacology – towards interoperability  Some Challenges – We have the why, but we lack the how – The how involves: • Representation • Sustainability • Discoverability • Training
    23. 23. Human Kidney Modeling Pipeline [Figure: pipeline with labels: Recon1 metabolic network; constrain exchange fluxes; preliminary model; refine based on capabilities; literature; set flux constraints; normalize & set threshold; renal objectives; set minimum objective flux; GIMME; metabolic influx; metabolic efflux; kidney model; healthy kidney gene expression data; metabolomic blood/urine & kidney localization data] Approach R.L. Chang et al. 2010 PLOS Comp. Biol. 6(9): e1000938
    24. 24. Agenda  Research that Informs my NIH Agenda – The TB drugome – towards reproducibility – Systems pharmacology – towards interoperability  Some Challenges – We have the why, but we lack the how – The how involves: • Representation • Sustainability • Discoverability • Training
    25. 25. Agenda  Research that Informs my NIH Agenda – The TB drugome – towards reproducibility – Systems pharmacology – towards interoperability  Some Challenges – We have the why, but we lack the how – The how involves: • Representation • Sustainability • Discoverability • Training
    26. 26. Representation  Requires community engagement: – RDA – GA4GH – FORCE11 – ……  Policies – Genomic data sharing plan – Machine readable data sharing plans  Particular needs surrounding phenotypic data
    27. 27. Sustainability The How of Data Sharing  More credit to the data scientists  Change to funding models – become less IC based  Public/Private partnerships  Interagency cooperation  International cooperation  Better evaluation and more informed decisions about existing and proposed resources – How are current data being used?  Role of institutional repositories – reward institutions rather than PIs
    28. 28. Discoverability  Calls for data and software registries (e.g., DDI)  Data commons (NIH drive?)  More clinical trial data in the public domain  Facilitate authentication and hence access to clinical data
    29. 29. Training  Calls out for training grants – new and as supplements to existing training efforts  Regional training centers (cf Cold Spring Harbor)?
    30. 30. NIH… Turning Discovery Into Health
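
As a small worked example for the profile similarity score on slide 8, the sketch below assumes the score is the sum of two cross terms: the target frequencies of each profile weighted by the PSSM scores of the other, i.e. d = Σ f_a·S_b + Σ f_b·S_a over the amino acid alphabet. That reading is inferred from the symbol definitions on the slide, and the function name and toy three-letter alphabet are hypothetical; a real profile column would have 20 entries.

```python
def profile_similarity(f_a, s_a, f_b, s_b):
    """Symmetric similarity between two profile columns.

    f_a, f_b: target frequencies per amino acid for profiles a and b.
    s_a, s_b: PSSM scores per amino acid for profiles a and b.
    Assumed form: d = sum_i f_a[i] * s_b[i] + sum_i f_b[i] * s_a[i].
    """
    if not (len(f_a) == len(s_a) == len(f_b) == len(s_b)):
        raise ValueError("all four vectors must cover the same alphabet")
    # Cross terms: each profile's frequencies weight the other profile's scores.
    a_on_b = sum(fa * sb for fa, sb in zip(f_a, s_b))
    b_on_a = sum(fb * sa for fb, sa in zip(f_b, s_a))
    return a_on_b + b_on_a


if __name__ == "__main__":
    # Toy three-letter alphabet, for illustration only.
    f_a, s_a = [0.5, 0.5, 0.0], [2, 1, -1]
    f_b, s_b = [0.0, 0.5, 0.5], [-2, 1, 3]
    print(profile_similarity(f_a, s_a, f_b, s_b))  # -0.5
```

Because both cross terms are included, swapping the two profiles gives the same score, which is the property a similarity matrix for alignment needs.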
