Bradley Opal 2011

Jean-Claude Bradley presents at the Opal Events 3rd Annual Drug Discovery Partnership: Filling the Pipeline on Pre-competitive Collaboration: Sharing Data to Increase Predictability

Published in: Education, Technology

1 Comment
  • It was my pleasure to co-present in this session with you JC. I hope you got some questions over lunch as we ran out of time in the session...I think it makes a good back to back pair. We should make a movie together!


  • 1. Pre-competitive Collaboration: Sharing Data to Increase Predictability
    3rd Annual Drug Discovery Partnership: Filling the Pipeline
    Jean-Claude Bradley
    Associate Professor of Chemistry
    Drexel University
    October 17, 2011
  • 2. Opportunities for Competitive Collaboration
  • 3. Industry is Sharing More
  • 4. Solubility and melting points are critical properties in the drug discovery process
  • 5. Data quality is essential for both measurements and predictions based on measurements
  • 6. Openness is proving to be a powerful tool for assessing the reliability of data
  • 7. Solubility prediction for Taxol using Abraham descriptors (predicted vs. experimental)
  • 8. Predicted temperature dependent solubility of Taxol in water based on melting point (M)
  • 9. The Trusted Source Model
    Before online databases (early 90s), searching for properties like melting points using ONE “trusted source” was practical and acceptable as part of the chemistry culture.
    • CRC Handbook
    • Merck Index
    • Chemical Vendor Catalogs (e.g. Sigma-Aldrich)
    • Peer-Reviewed Journals
    Single values don’t tend to be contradicted
  • 13. Question Assumptions
    Using technology, we can begin to replace the “trusted source” model with one based on transparency and provenance
  • 14. The Chemical Information Validation Sheet
    567 curated and referenced measurements from
    Fall 2010 Chemical Information Retrieval course
  • 15. Discovering outliers for melting points (stdev/average)
  • 16. Investigating the m.p. inconsistencies of EGCG
  • 17. Investigating the m.p. inconsistencies of cyclohexanone
  • 18. Most popular data sources
  • 19. Alfa Aesar donates melting points to the public
  • 20. Open Melting Point Explorer
    (Andrew Lang)
  • 21. Outliers
    PhysProp (EPA donated all data to public also)
  • 22. Outliers for ethanol: Alfa Aesar and Oxford MSDS
  • 23. Inconsistencies and SMILES problems within MDPI dataset
  • 24. MDPI Dataset labeled with High Trust Level
  • 25. Open Melting Point Datasets
    Currently 27,000 mps for 20,000 compounds
  • 26. What is the melting point of 4-benzyltoluene?
    American Petroleum Institute: 5 C
    PHYSPROP: 125 C
    peer reviewed journal (2008): 97.5 C
    government database: -30 C
    government database: 4.58 C
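
The five values reported for 4-benzyltoluene can be run through the stdev/average screen named on slide 15. This is a minimal sketch under stated assumptions, not the spreadsheet formula used in the talk: the helper name `spread_ratio` is hypothetical, the sample standard deviation is assumed, and the values are taken in Celsius as listed. The ratio is unstable when the average is close to zero, so working in Kelvin may be preferable in practice.

```python
from statistics import mean, stdev


def spread_ratio(values):
    """Sample standard deviation divided by the absolute mean (hypothetical helper)."""
    avg = mean(values)
    if avg == 0:
        raise ValueError("average is zero; stdev/average is undefined")
    return stdev(values) / abs(avg)


# Melting points (C) reported for 4-benzyltoluene on slide 26.
reported_c = [5, 125, 97.5, -30, 4.58]
ratio = spread_ratio(reported_c)
print(f"stdev/average = {ratio:.2f}")
```

A ratio of this size, larger than the average itself, marks the compound as one whose sources disagree and whose measurements need to be traced back to their origin.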
  • 27. The quest to resolve the melting point of 4-benzyltoluene: liquid at room temp and can be frozen <-30C (Evan Curtin)
  • 28. Open Lab Notebook page measuring the melting point of 4-benzyltoluene
  • 29. Motivation: Faster Science, Better Science
  • 30. Ruling out all melting points above -15C?
  • 31. Oops – 4-benzyltoluene freezes after 16 days at -15C!
  • 32. Measuring the melting point by slowly heating from -15 C gives 5 C
  • 33. There are NO FACTS,
    only measurements embedded within assumptions
    Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
  • 34. TRUST
  • 35. Common errors in datasets
    multiple melting points for the same compound in the same database
    stereochemistry issues
    sign inversion
    conversion errors (Kelvin/Celsius, Fahrenheit/Celsius)
    bad SMILES (non-rendering)
    salts associated with SMILES for free base
    using boiling point for melting point
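
Two of the errors listed on slide 35, sign inversion and unit conversion, can be screened for mechanically when two sources disagree. The sketch below is illustrative only: `diagnose_discrepancy` and its tolerance `tol` are hypothetical names, and it assumes both values were recorded as degrees Celsius.

```python
def diagnose_discrepancy(a_c, b_c, tol=0.5):
    """Return simple transcription errors that could explain why two
    values, both recorded as Celsius, disagree (hypothetical helper)."""
    if abs(a_c - b_c) <= tol:
        return []  # values already agree within tolerance
    findings = []
    # Sign inversion: one value is the negative of the other.
    if abs(a_c + b_c) <= tol:
        findings.append("sign inversion")
    # One value is actually in Kelvin (offset of 273.15).
    if abs(b_c - (a_c + 273.15)) <= tol or abs(a_c - (b_c + 273.15)) <= tol:
        findings.append("Kelvin/Celsius")
    # One value is actually in Fahrenheit.
    if (abs(b_c - (a_c * 9 / 5 + 32)) <= tol
            or abs(a_c - (b_c * 9 / 5 + 32)) <= tol):
        findings.append("Fahrenheit/Celsius")
    return findings


print(diagnose_discrepancy(-30, 30))
print(diagnose_discrepancy(5.0, 278.15))
```

An empty result means none of these simple explanations fits, so the disagreement has to be resolved by going back to the original measurement.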
  • 36. Open Random Forest modeling of Open Melting Point data using CDK descriptors
    (Andrew Lang)
    R² = 0.78, TPSA and nHdon most important
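
For readers unfamiliar with the R² figure quoted on slide 36, the sketch below shows one common way it is computed. The slide does not say which definition was used; the coefficient of determination is assumed here, and the function name `r_squared` is hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot (assumed definition)."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot
```

A model that reproduces every measurement gives 1, and a model that always predicts the average measured value gives 0.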
  • 37. Melting point prediction service
  • 38. Melting point predictions and measurements on iPhone/iPad (Andrew Lang and Alex Clark)
  • 39. Publication of double+ validated melting point dataset to Nature Precedings and LuLu
  • 40.
  • 41.
  • 42. Crowdsourcing Solubility Data
  • 43. ONS Challenge Judges
  • 44. ONS Challenge Award Winners
  • 45. Web services for summary data
    (Andrew Lang)
  • 46. Reaction Attempts Book
  • 47. Reaction Attempts Book: Reactants listed Alphabetically
  • 48.
  • 49. Interactive NMR spectra using JSpecView or ChemDoodle and the Open JCAMP-DX format
  • 50. Predicting Best Solvent for Imine Formation using solubility and melting point data
    (Evan Curtin)
  • 51. Predicting Yield of Imine Formation in Ethanol
    (Evan Curtin)
  • 52. Google Apps Scripts web services
  • 53. Google Apps Scripts for conveniently exploring melting point data
  • 54. Comparison of model with triple validated measurements
    Straight chain carboxylic acids from 1 to 10 carbons
    Straight chain alcohols from 1 to 10 carbons
  • 55. Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)
  • 56. Google Apps Scripts for planning reactions and creating schemes
  • 57. Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)
  • 58. All ONS web services
  • 59. Some Initiatives Promoting More Openness in Drug Discovery
  • 60. Open Primary Research in Drug Design using Web2.0 tools (malaria) (blogs, wikis, Second Life, mailing lists)
    Rajarshi Guha (Indiana U)
    Tsu-Soo Tan (Nanyang Inst.)
    JC Bradley (Drexel U)
    Phil Rosenthal
    Dan Zaharevitz
  • 61. Outcome of Guha-Bradley-Rosenthal collaboration
  • 62. Conclusions
    • For science to progress quickly, there is great benefit in moving away from a “trusted source” model to one based on transparency and data provenance
    • Open Notebook Science can be a useful tool in this context