NBCC Open Notebook Science Talk

Jean-Claude Bradley presents "Accelerating Discovery by Sharing: a case for Open Notebook Science" at the National Breast Cancer Coalition Annual Advocacy Conference in Arlington, VA on May 1, 2011.

Published in: Education, Technology

  1. Accelerating Discovery by Sharing: a case for Open Notebook Science. Jean-Claude Bradley, Associate Professor of Chemistry, Drexel University. National Breast Cancer Coalition Annual Advocacy Conference, May 1, 2011
  2. Outline
       • Trends in sharing for drug discovery
       • ONS for malaria research
       • Crowdsourcing solubility with ONS
       • Leveraging the educational system to contribute new science
       • Open modeling and web services
       • Discovering connections
       • Moving forward: tools and practices
  3. Industry is Sharing More
  4. Opportunities for Competitive Collaboration
  5. Some Initiatives Promoting More Openness in Drug Discovery
  6. Motivation: Faster Science, Better Science
  7. There are NO FACTS, only measurements embedded within assumptions. Open Notebook Science maintains the integrity of data provenance by making assumptions explicit.
  8. TRUST PROOF
  9. Strategy for an Open Notebook: First record, then abstract structure. In order to be discoverable, use Google-friendly formats (simple HTML, no login). In order to be replicable, use free hosted tools (Wikispaces, Google Spreadsheets).
  10. UsefulChem Project: Open Primary Research in Drug Design using Web2.0 tools. Docking, Synthesis, Testing. Rajarshi Guha (Indiana U), JC Bradley (Drexel U), Phil Rosenthal (UCSF, malaria), Dan Zaharevitz (NCI, tumors), Tsu-Soo Tan (Nanyang Inst.)
  11. Malaria Target: falcipain-2, involved in hemoglobin metabolism (Dana.org)
  12. The Ugi Reaction
  13. Outcome of Guha-Bradley-Rosenthal collaboration
  14. References to papers, blog posts, lab notebook pages, raw data
  15. The Ugi reaction: can we predict precipitation? Can we predict solubility in organic solvents?
  16. Crowdsourcing Solubility Data
  17. ONS Challenge Judges
  18. ONS Challenge Award Winners
  19. Solubilities collected in a Google Spreadsheet
  20. Rajarshi Guha’s Live Web Query using Google Viz API
  21. Data provenance: From Wikipedia to…
  22. … the lab notebook and raw data
  23. Interactive NMR spectra using JSpecView and JCAMP-DX
  24. Open Data JCAMP spectra for education (Andy Lang, Tony Williams, Robert Lancashire)
  25. Raw Data As Images: Splatter? Some liquid
  26. YouTube for demonstrating experimental set-up
  27. The importance of raw data availability: Missed in a prior publication on solubility for this compound
  28. Case study: Chemical Information Retrieval course at Drexel (Fall 2009/2010). Leveraging the educational system to contribute new science
  29. The Chemical Information Validation Sheet: 567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course
  30. The Chemical Information Validation Explorer (Andrew Lang)
  31. Discovering outliers for melting points (stdev/average)
  32. Investigating the m.p. inconsistencies of EGCG
  33. Investigating the m.p. inconsistencies of cyclohexanone
  34. Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for melting points
  35. Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for boiling points
  36. Sigma-Aldrich, Acros and Wolfram Alpha apparently DO NOT use the same sources for flash points
  37. Most popular data sources
  38. Alfa Aesar donates melting points to the public
  39. Open Melting Point Explorer
  40. Outliers: MDPI dataset, EPI (via ChemSpider)
  41. Outliers: Alfa Aesar
  42. Inconsistencies and SMILES problems within MDPI dataset
  43. MDPI Dataset labeled with High Trust Level
  44. Open Melting Point Datasets
  45. Open Random Forest modeling of Open Melting Point data using CDK descriptors (Andrew Lang): R² = 0.78; TPSA and nHdon most important
  46. Melting point prediction service
  47. Other Web Services… (Andrew Lang): General Transparent Solubility Prediction
  48. Convenient web services for solubility measurement and prediction (Andrew Lang)
  49. Integration of Multiple Web Services to Recommend Solvents for Reactions (Andrew Lang)
  50. Using melting point for temperature dependent solubility prediction
  51. Semi-Automated Measurement of solubility via web service analysis of JCAMP-DX files (Andy Lang)
  54. Solubility Prediction (Andy Lang uses Abraham Model)
  55. Reaction Attempts Book
  56. Reaction Attempts Book: Reactants listed Alphabetically
  58. Dynamic links to private tagged Mendeley collections (Andrew Lang)
  59. All ONS web services
  60. For all Formats of ONS Projects
  61. ONS Challenge Solubility Book cited for nanotechnology application
  62. Visualizing molecule-researcher connection maps reveals link between 2 Open Notebooks (Todd and Bradley) (Don Pellegrino)
  63. The Intersection of Open Notebooks (Bradley/Todd) and IP implications: Open Notebook could have blocked patent if done earlier
  64. Decanoic acid, Water, NaCl
  65. Phrase searching for useful solubility applications
  66. Search for applications of solubility for breast cancer research
  67. Solubility prediction for Taxol using Abraham descriptors (Pred, Exp)
  68. Predicted temperature dependent solubility of Taxol in water (M)
  69. Current research questions for Taxol solubility
       • Does Taxol have a meaningful solubility in methanol or does it decompose too quickly?
       • Why is methanol reported to decompose Taxol but not ethanol?
       • Can the solubility of Taxol in solvent mixtures be predicted, especially for approved excipients?
       • Can the solubility of Taxol analogs be used to create reliable models for the solubility of this class of compounds?
  70. Moving Forward: Tools and Practices
       Use free hosted web tools and open data formats
       • Google Spreadsheets (numerical data)
       • Wikispaces (human readable format)
       • YouTube, SlideShare, LuLu, Nature Precedings, etc. (multiple data formats)
       • JCAMP-DX for spectral data
       Practices
       • Report all findings immediately, even if tentative
       • Participate in social media to share progress and find collaborators
       • Abstract experiments and findings to machine readable formats and make these easily discoverable
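
Slide 31 screens melting points for outliers using the ratio stdev/average across the values reported for each compound. The transcript does not show how that screen was implemented, so the following is only a minimal sketch of the idea. The function name, the 0.05 threshold, the compound labels, and all numeric values are hypothetical placeholders, not data from the talk.

```python
from statistics import mean, stdev


def flag_inconsistent(measurements, threshold=0.05):
    """Rank compounds whose reported values disagree, using stdev/average.

    `measurements` maps a compound label to a list of reported values.
    `threshold` is an assumed cutoff, not a value given in the slides.
    """
    flagged = []
    for compound, values in measurements.items():
        if len(values) < 2:
            continue  # a single report cannot be checked against others
        avg = mean(values)
        if avg == 0:
            continue  # ratio is undefined when the average is zero
        ratio = stdev(values) / abs(avg)
        if ratio > threshold:
            flagged.append((compound, ratio))
    # Largest disagreement first, so the worst cases are reviewed first
    return sorted(flagged, key=lambda item: item[1], reverse=True)


# Hypothetical melting points (deg C) collected from several sources
measurements = {
    "compound_A": [120.0, 121.0, 119.5],  # sources agree closely
    "compound_B": [80.0, 140.0, 82.0],    # one source disagrees strongly
    "compound_C": [55.0],                 # only one report, skipped
}

flagged = flag_inconsistent(measurements)
for compound, ratio in flagged:
    print(f"{compound}: stdev/average = {ratio:.3f}")
```

A flagged compound is only a candidate for review. As slides 32 and 33 suggest, the next step is to go back to the cited sources and raw data to see which value is wrong. Note that the ratio depends on the temperature scale, so all values should be in the same units before comparing.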
