Bradley Opal 2011

Jean-Claude Bradley presents "Pre-competitive Collaboration: Sharing Data to Increase Predictability" at the Opal Events 3rd Annual Drug Discovery Partnership: Filling the Pipeline.


Transcript

  • 1. Pre-competitive Collaboration: Sharing Data to Increase Predictability
    3rd Annual Drug Discovery Partnership: Filling the Pipeline
    Jean-Claude Bradley
    Associate Professor of Chemistry
    Drexel University
    October 17, 2011
  • 2. Opportunities for Competitive Collaboration
  • 3. Industry is Sharing More
  • 4. Solubility and Melting Points are critical properties in the drug discovery process
  • 5. Data quality is essential for both measurements and predictions based on measurements
  • 6. Openness is proving to be a powerful tool for assessing the reliability of data
  • 7. Solubility prediction for Taxol using Abraham descriptors
    (predicted vs. experimental values)
  • 8. Predicted temperature dependent solubility of Taxol in water based on melting point (M)
  • 9. The Trusted Source Model
    Before online databases (early 90s), searching for properties like melting points using ONE “trusted source” was practical and acceptable as part of the chemistry culture.
    • CRC Handbook
    • 10. Merck Index
    • 11. Chemical Vendor Catalogs (e.g. Sigma-Aldrich)
    • 12. Peer-Reviewed Journals
    Single values don’t tend to be contradicted
  • 13. Question Assumptions
    Using technology, we can begin to replace the “trusted source” model with one based on transparency and provenance
  • 14. The Chemical Information Validation Sheet
    567 curated and referenced measurements from
    Fall 2010 Chemical Information Retrieval course
  • 15. Discovering outliers for melting points (stdev/average)
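
The stdev/average screen named on slide 15 can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the Chemical Information Validation Sheet: the function names, the 0.05 threshold, and the use of degrees Celsius are all hypothetical choices. Note that a ratio against a Celsius average becomes unstable near 0 C; converting to Kelvin first is one possible alternative.

```python
from statistics import mean, stdev


def relative_spread(values_c):
    """Return stdev/|average| for melting points given in degrees C.

    Returns None when fewer than two values exist (a single value
    cannot be contradicted) or when the average is exactly zero.
    """
    if len(values_c) < 2:
        return None
    avg = mean(values_c)
    if avg == 0:
        return None
    return stdev(values_c) / abs(avg)


def flag_outliers(measurements, threshold=0.05):
    """Rank compounds whose relative spread exceeds the threshold.

    measurements: dict mapping a compound name to a list of reported
    melting points. The threshold is an assumed, illustrative cutoff.
    """
    flagged = []
    for compound, values in measurements.items():
        ratio = relative_spread(values)
        if ratio is not None and ratio > threshold:
            flagged.append((compound, ratio))
    # Largest disagreement first, so the worst cases get checked first.
    return sorted(flagged, key=lambda item: item[1], reverse=True)
```

Compounds at the top of the returned list are the ones whose sources disagree most and are the natural candidates for a closer look at provenance.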
  • 16. Investigating the m.p. inconsistencies of EGCG
  • 17. Investigating the m.p. inconsistencies of cyclohexanone
  • 18. Most popular data sources
  • 19. Alfa Aesar donates melting points to the public
  • 20. Open Melting Point Explorer
    (Andrew Lang)
  • 21. Outliers
    MDPI dataset
    PhysProp (EPA also donated all data to the public)
  • 22. Outliers for ethanol: Alfa Aesar and Oxford MSDS
  • 23. Inconsistencies and SMILES problems within MDPI dataset
  • 24. MDPI Dataset labeled with High Trust Level
  • 25. Open Melting Point Datasets
    Currently 27,000 mps for 20,000 compounds
  • 26. What is the melting point of 4-benzyltoluene?
    American Petroleum Institute: 5 C
    PHYSPROP: -30 C
    PHYSPROP: 125 C
    peer reviewed journal (2008): 97.5 C
    government database: -30 C
    government database: 4.58 C
  • 27. The quest to resolve the melting point of 4-benzyltoluene: liquid at room temp and can be frozen below -30 C (Evan Curtin)
  • 28. Open Lab Notebook page measuring the melting point of 4-benzyltoluene
  • 29. Motivation: Faster Science, Better Science
  • 30. Ruling out all melting points above -15C?
  • 31. Oops – 4-benzyltoluene freezes after 16 days at -15C!
  • 32. Measuring the melting point by slowly heating from -15 C gives 5 C
  • 33. There are NO FACTS,
    only measurements embedded within assumptions
    Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
  • 34. TRUST
    PROOF
  • 35. Common errors in datasets
    multiple melting points for the same compound in the same database
    stereochemistry issues
    sign inversion
    conversion errors (Kelvin/Celsius, Fahrenheit/Celsius)
    bad SMILES (non-rendering)
    salts associated with SMILES for free base
    using boiling point for melting point
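
Several of the error types on slide 35 leave a recognizable numeric signature when two reported values for the same compound are compared. The sketch below shows one way such a check might look; it is not the curation procedure used for the Open Melting Point datasets. The function names and the 1-degree tolerance are hypothetical, and both inputs are assumed to be nominally in degrees Celsius.

```python
def f_to_c(temp_f):
    """Convert degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def explain_pair(a, b, tol=1.0):
    """Suggest why two reported melting points (nominally in C) differ.

    Returns a list of candidate explanations; an empty list means the
    values agree within tol or no simple pattern matches.
    """
    reasons = []
    if abs(a - b) <= tol:
        return reasons  # values already agree
    if abs(a + b) <= tol:
        reasons.append("sign inversion")
    if abs(abs(a - b) - 273.15) <= tol:
        reasons.append("Kelvin/Celsius")
    # One value may be a Fahrenheit number recorded as Celsius.
    if abs(f_to_c(a) - b) <= tol or abs(f_to_c(b) - a) <= tol:
        reasons.append("Fahrenheit/Celsius")
    return reasons
```

A match is only a hint for a human curator: a pair can fit a pattern by coincidence, so the original sources still need to be consulted.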
  • 36. Open Random Forest modeling of Open Melting Point data using CDK descriptors
    (Andrew Lang)
    R² = 0.78; TPSA and nHdon most important
  • 37. Melting point prediction service
  • 38. Melting point predictions and measurements on iPhone/iPad (Andrew Lang and Alex Clark)
  • 39. Publication of double+ validated melting point dataset to Nature Precedings and LuLu
  • 40.
  • 41.
  • 42. Crowdsourcing Solubility Data
  • 43. ONS Challenge Judges
  • 44. ONS Challenge Award Winners
  • 45. Web services for summary data
    (Andrew Lang)
  • 46. Reaction Attempts Book
  • 47. Reaction Attempts Book: Reactants listed Alphabetically
  • 48.
  • 49. Interactive NMR spectra using JSpecView or ChemDoodle and the Open JCAMP-DX format
  • 50. Predicting Best Solvent for Imine Formation using solubility and melting point data
    (Evan Curtin)
  • 51. Predicting Yield of Imine Formation in Ethanol
    (Evan Curtin)
  • 52. Google Apps Scripts web services
  • 53. Google Apps Scripts for conveniently exploring melting point data
  • 54. Comparison of model with triple validated measurements
    Straight chain carboxylic acids from 1 to 10 carbons
    Straight chain alcohols from 1 to 10 carbons
  • 55. Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)
  • 56. Google Apps Scripts for planning reactions and creating schemes
  • 57. Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)
  • 58. All ONS web services
  • 59. Some Initiatives Promoting More Openness in Drug Discovery
  • 60. Open Primary Research in Drug Design using Web2.0 tools (malaria) (blogs, wikis, Second Life, mailing lists)
    Docking: Rajarshi Guha (Indiana U), Tsu-Soo Tan (Nanyang Inst.)
    Synthesis: JC Bradley (Drexel U)
    Testing: Phil Rosenthal (UCSF, malaria), Dan Zaharevitz (NCI, tumors)
  • 61. Outcome of Guha-Bradley-Rosenthal collaboration
  • 62. Conclusions
    • For science to progress quickly there is great benefit in moving away from a “trusted source” model to one based on transparency and data provenance
    • 63. Open Notebook Science can be a useful tool in this context