Bradley Opal 2011

Jean-Claude Bradley presents at the Opal Events 3rd Annual Drug Discovery Partnership: Filling the Pipeline on Pre-competitive Collaboration: Sharing Data to Increase Predictability



Usage Rights

© All Rights Reserved

  • It was my pleasure to co-present in this session with you JC. I hope you got some questions over lunch as we ran out of time in the session...I think it makes a good back to back pair. We should make a movie together!

    Bradley Opal 2011 Presentation Transcript

    • Pre-competitive Collaboration: Sharing Data to Increase Predictability
      3rd Annual Drug Discovery Partnership: Filling the Pipeline
      Jean-Claude Bradley
      Associate Professor of Chemistry
      Drexel University
      October 17, 2011
    • Opportunities for Competitive Collaboration
    • Industry is Sharing More
    • Solubility and melting points are critical properties in the drug discovery process
    • Data quality is essential for both measurements and predictions based on measurements
    • Openness is proving to be a powerful tool for assessing the reliability of data
    • Solubility prediction for Taxol using Abraham descriptors (predicted vs. experimental)
    • Predicted temperature dependent solubility of Taxol in water based on melting point (M)
    • The Trusted Source Model
      Before online databases (early 90s) searching for properties like melting points using ONE “trusted source” was practical and acceptable as part of the chemistry culture.
      • CRC Handbook
      • Merck Index
      • Chemical Vendor Catalogs (e.g. Sigma-Aldrich)
      • Peer-Reviewed Journals
      Single values don’t tend to be contradicted
    • Question Assumptions
      Using technology, we can begin to replace the “trusted source” model with one based on transparency and provenance
    • The Chemical Information Validation Sheet
      567 curated and referenced measurements from
      Fall 2010 Chemical Information Retrieval course
    • Discovering outliers for melting points (stdev/average)
    • Investigating the m.p. inconsistencies of EGCG
    • Investigating the m.p. inconsistencies of cyclohexanone
    • Most popular data sources
    • Alfa Aesar donates melting points to the public
    • Open Melting Point Explorer
      (Andrew Lang)
    • Outliers
      PhysProp (EPA donated all data to public also)
    • Outliers for ethanol: Alfa Aesar and Oxford MSDS
    • Inconsistencies and SMILES problems within MDPI dataset
    • MDPI Dataset labeled with High Trust Level
    • Open Melting Point Datasets
      Currently 27,000 melting points for 20,000 compounds
    • What is the melting point of 4-benzyltoluene?
      American Petroleum Institute: 5 C
      PHYSPROP: -30 C
      PHYSPROP: 125 C
      peer reviewed journal (2008): 97.5 C
      government database: -30 C
      government database: 4.58 C
    • The quest to resolve the melting point of 4-benzyltoluene: liquid at room temp and can be frozen below -30 C (Evan Curtin)
    • Open Lab Notebook page measuring the melting point of 4-benzyltoluene
    • Motivation: Faster Science, Better Science
    • Ruling out all melting points above -15C?
    • Oops – 4-benzyltoluene freezes after 16 days at -15C!
    • Measuring the melting point by slowly heating from -15 C gives 5 C
    • There are NO FACTS,
      only measurements embedded within assumptions
      Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
    • TRUST
    • Common errors in datasets
      multiple melting points for the same compound in the same database
      stereochemistry issues
      sign inversion
      conversion errors (Kelvin/Celsius, Fahrenheit/Celsius)
      bad SMILES (non-rendering)
      salts associated with SMILES for free base
      using boiling point for melting point
    • Open Random Forest modeling of Open Melting Point data using CDK descriptors
      (Andrew Lang)
      R² = 0.78; TPSA and nHdon most important
    • Melting point prediction service
    • Melting point predictions and measurements on iPhone/iPad (Andrew Lang and Alex Clark)
    • Publication of double+ validated melting point dataset to Nature Precedings and LuLu
    • Crowdsourcing Solubility Data
    • ONS Challenge Judges
    • ONS Challenge Award Winners
    • Web services for summary data
      (Andrew Lang)
    • Reaction Attempts Book
    • Reaction Attempts Book: Reactants listed Alphabetically
    • Interactive NMR spectra using JSpecView or ChemDoodle and the Open JCAMP-DX format
    • Predicting Best Solvent for Imine Formation using solubility and melting point data
      (Evan Curtin)
    • Predicting Yield of Imine Formation in Ethanol
      (Evan Curtin)
    • Google Apps Scripts web services
    • Google Apps Scripts for conveniently exploring melting point data
    • Comparison of model with triple validated measurements
      Straight chain carboxylic acids from 1 to 10 carbons
      Straight chain alcohols from 1 to 10 carbons
    • Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)
    • Google Apps Scripts for planning reactions and creating schemes
    • Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)
    • All ONS web services
    • Some Initiatives Promoting More Openness in Drug Discovery
    • Open Primary Research in Drug Design using Web2.0 tools (malaria) (blogs, wikis, Second Life, mailing lists)
      Rajarshi Guha (Indiana U)
      Tsu-Soo Tan (Nanyang Inst.)
      JC Bradley (Drexel U)
      Phil Rosenthal
      Dan Zaharevitz
    • Outcome of Guha-Bradley-Rosenthal collaboration
    • Conclusions
      • For science to progress quickly there is great benefit in moving away from a “trusted source” model to one based on transparency and data provenance
      • Open Notebook Science can be a useful tool in this context
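
The slides mention screening melting points by stdev/average and list common dataset errors such as sign inversion and unit conversion mistakes. The following is a minimal Python sketch of how such checks might look, not the code used in the talk. The function names, the 1-degree tolerance, and the pairwise test are assumptions made for illustration; the six values are the 4-benzyltoluene melting points quoted in the slides.

```python
from statistics import mean, stdev

# Reported melting points (deg C) for 4-benzyltoluene, as quoted in the slides.
reported = [
    ("American Petroleum Institute", 5.0),
    ("PHYSPROP", -30.0),
    ("PHYSPROP", 125.0),
    ("peer reviewed journal (2008)", 97.5),
    ("government database", -30.0),
    ("government database", 4.58),
]


def relative_spread(values):
    """Sample standard deviation divided by |mean| (the stdev/average ratio)."""
    avg = mean(values)
    if avg == 0:
        return float("inf")
    return stdev(values) / abs(avg)


def explain_pair(a, b, tol=1.0):
    """Label known error patterns that could relate two reported values.

    Hypothetical helper: both values are nominally in deg C, and `tol`
    (an assumed tolerance, in degrees) decides what counts as a match.
    """
    reasons = []
    if abs(a) > tol and abs(a + b) <= tol:
        reasons.append("sign inversion")
    if abs((a + 273.15) - b) <= tol or abs((b + 273.15) - a) <= tol:
        reasons.append("Kelvin/Celsius mix-up")
    if abs((a * 9 / 5 + 32) - b) <= tol or abs((b * 9 / 5 + 32) - a) <= tol:
        reasons.append("Fahrenheit/Celsius mix-up")
    return reasons


if __name__ == "__main__":
    values = [mp for _, mp in reported]
    print("stdev/average:", round(relative_spread(values), 2))
    # Compare every pair of sources and report any recognizable error pattern.
    for i, (src_a, a) in enumerate(reported):
        for src_b, b in reported[i + 1:]:
            reasons = explain_pair(a, b)
            if reasons:
                print(src_a, a, "vs", src_b, b, "->", ", ".join(reasons))
```

A large stdev/average ratio only signals that the sources disagree. It does not say which value is right, which is why the slides go back to an open notebook measurement to settle the question.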