
Bradley Open Notebook Science Georgia Tech OA week



Jean-Claude Bradley presents on Open Notebook Science: Transparency in Research on October 23, 2012 at Georgia Tech for Open Access Week. Topics include solubility, melting points, a recrystallization app, the Chemical Information Retrieval class at Drexel University and the Open Chemical Property Matrix (OCPM). YouTube recording here:

Published in: Education


  1. 1. Open Notebook Science: Transparency in Research. Georgia Tech Library Open Access Week. Jean-Claude Bradley, Associate Professor of Chemistry, Drexel University. October 23, 2012
  2. 2. Openness in Chemistry: WHY?
  3. 3. Dibenzalacetone derivatives docking against tubulin (paclitaxel site) (Andrew Lang)
  4. 4. “Simple” aldol condensation synthesis: top hit (no reports of synthesis); in top ten (a few reports of synthesis) (Andrew Lang)
  5. 5. What is the current standard for “sufficient information” in communicating organic chemistry? By definition, all peer-reviewed published documentation has been approved as sufficient by authors, editors and reviewers.
  6. 6. Searching for aldol condensations of acetone in the Reaction Attempts database (Andrew Lang)
  7. 7. Information from the literature on the target synthesis
  8. 8. Information from the literature on the target synthesis
  9. 9. Information from the literature on the target synthesis
  10. 10. A successful synthesis by avoiding water, dramatically increasing NaOH and using a long reaction time
  11. 11. An example of a failed experiment in an Open Notebook with useful information
  12. 12. A failed experiment reveals the importance of aldehyde solubility
  13. 13. Motivation: Faster Science, Better Science
  14. 14. An example of a successful experiment in an Open Notebook
  15. 15. Never having to leave the Google Spreadsheet dashboard for access to key info (Andrew Lang and Rich Apodaca)
  16. 16. A click away from an interactive NMR display (using JCAMP-DX format and ChemDoodle) (Andrew Lang)
  17. 17. Contributing to Science while Teaching it: Chemical Information Retrieval Class
  18. 18. The Chemical Information Validation Sheet: 567 curated and referenced measurements from the Fall 2010 Chemical Information Retrieval course
  19. 19. Discovering outliers for melting points (stdev/average)
  20. 20. Investigating the m.p. inconsistencies of EGCG
  21. 21. Investigating the m.p. inconsistencies of cyclohexanone
  22. 22. Most popular data sources
  23. 23. Alfa Aesar donates melting points to the public
  24. 24. Open Melting Point Explorer (Andrew Lang)
  25. 25. Outliers: MDPI dataset; EPI (donated all data to public also)
  26. 26. Outliers for ethanol: Alfa Aesar and Oxford MSDS
  27. 27. Inconsistencies and SMILES problems within MDPI dataset
  28. 28. MDPI Dataset labeled with High Trust Level
  29. 29. Open Melting Point Datasets: currently 20,000 compounds with Open MPs
  30. 30. What is the melting point of 4-benzyltoluene? American Petroleum Institute: 5 C; PHYSPROP: -30 C; PHYSPROP: 125 C; peer reviewed journal (2008): 97.5 C; government database: -30 C; government database: 4.58 C
  31. 31. The quest to resolve the melting point of 4-benzyltoluene: liquid at room temp and can be frozen below -30 C
  32. 32. Open Lab Notebook page measuring the melting point of 4-benzyltoluene
  33. 33. Ruling out all melting points above -15C?
  34. 34. Oops – 4-benzyltoluene freezes after 16 days at -15C!
  35. 35. Measuring the melting point by slowly heating from -15 C gives 5 C
  36. 36. There are NO FACTS, only measurements embedded within assumptions. Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
  37. 37. Open Random Forest modeling of Open Melting Point data using CDK descriptors (Andrew Lang): R² = 0.78; TPSA and nHdon most important
  38. 38. Melting point prediction service
  39. 39. Web services for summary data (Andrew Lang)
  40. 40. Using a Google Spreadsheet as a “dashboard interface” for reaction planning and analysis
  41. 41. Calling Google App Scripts
  42. 42. Calling Google App Scripts (Andrew Lang and Rich Apodaca)
  43. 43. Google Apps Scripts for conveniently exploring melting point data
  44. 44. Comparison of model with triple validated measurements: straight chain carboxylic acids from 1 to 10 carbons; straight chain alcohols from 1 to 10 carbons
  45. 45. Cyclic primary amines from 3 to 6 carbons (cyclobutylamine flagged for validation – only single source available)
  46. 46. Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)
  47. 47. Google Apps Scripts web services
  48. 48. Chemistry Google App Scripts description sheet (Andrew Lang and Rich Apodaca)
  49. 49. Integration of Multiple Web Services to Recommend Solvents for Reactions (Andrew Lang)
  50. 50. The Recrystallization App (Andrew Lang)
  51. 51. The importance of recrystallization: • Generally preferred if there is a known solvent that gives a good yield • Scales much more easily and cheaply than chromatography • However, for new compounds much trial and error may be needed
  52. 52. How does it work? (1) Look up the solvent boiling point. (2) Look up the room temperature solubility or predict it via Abraham descriptors predicted from a model using the CDK. (3) Look up the solute melting point or predict it via a model using the CDK. (4) Use the melting point and the solubility at room temperature to predict the solubility at boiling. (5) Calculate the predicted recrystallization yield.
  53. 53. The Recrystallization App produces and uses Open Data: • Open Solubility Collection and Models • Open Melting Point Collection and Models • Modeling depends mainly on CDK (Open Source Software with Open Descriptors) • Open Notebook Science
  54. 54. What are good solvents to recrystallize benzoic acid? (Andrew Lang)
  55. 55. Click on the solvent to see temp curve (Andrew Lang)
  56. 56. Deliver melting point data via App (Andrew Lang)
  57. 57. Chemical Information Retrieval 2012 property assignment
  58. 58. Melting Point Outlier List
  59. 59. Melting Point Outlier example
  60. 60. Solubility Outlier List
  61. 61. Solubility of benzoic acid in 1-octanol discrepancies
  62. 62. Using ChemSpider to ensure all stereocenters are defined before searching for properties
  63. 63. Using the InChIKey to find single isomers
  64. 64. Chemical Information Validation Sheet 2012
  65. 65. Each entry validated with an image
  66. 66. Avoiding redundant property data points with a single click within the validation sheet
  67. 67. Open Chemical Property Matrix (OCPM): boiling point, vapor pressure, flash point, Abraham descriptors, melting point, logP, aqueous solubility, octanol solubility
  68. 68. Open Chemical Property Matrix (OCPM)
  69. 69. OCPM relationships
  70. 70. OCPM melting point sheet
  71. 71. Conclusions: • More openness in chemistry can make science more efficient • Provide interfaces that make sense to the end users: Open Data, Open Models and Open Source Software to modelers; Apps (smartphones, Google App Scripts, etc.) for chemists at the bench. Acknowledgements: Andrew Lang (code, modeling); Bill Acree (modeling, solubility data contribution); Antony Williams (ChemSpider services, mp data curation); Matthew McBride and Rida Atif (recrystallization and synthesis); Kayla Gogarty (OCPM)
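
Slide 19 describes finding melting point outliers with a stdev/average ratio. The slides do not show the actual spreadsheet formula, so what follows is only a minimal sketch of that idea. The 4-benzyltoluene values are the six reported on slide 30; the second compound, the function names and the 0.05 threshold are hypothetical and chosen purely for illustration.

```python
from statistics import mean, stdev

def spread_ratio(values_c):
    """Return stdev/|average| for a list of melting points in degrees C."""
    avg = mean(values_c)
    if avg == 0:
        # The ratio is undefined when the average is zero; treat as maximally suspect.
        return float("inf")
    return stdev(values_c) / abs(avg)

def flag_outliers(measurements, threshold=0.05):
    """Return (compound, ratio) pairs whose ratio exceeds the threshold, worst first.

    The threshold is an assumed value, not one taken from the slides.
    """
    flagged = []
    for compound, values in measurements.items():
        if len(values) < 2:
            continue  # a single source cannot be compared against anything
        ratio = spread_ratio(values)
        if ratio > threshold:
            flagged.append((compound, ratio))
    return sorted(flagged, key=lambda pair: pair[1], reverse=True)

measurements = {
    # Six reported values from slide 30, in degrees C.
    "4-benzyltoluene": [5.0, -30.0, 125.0, 97.5, -30.0, 4.58],
    # Hypothetical compound with consistent sources, for contrast.
    "hypothetical compound A": [80.0, 80.5, 81.0],
}

flagged = flag_outliers(measurements)
for compound, ratio in flagged:
    print(f"{compound}: stdev/average = {ratio:.2f}")
```

Note that a ratio computed in degrees C depends on the temperature scale and becomes unstable for compounds melting near 0 C; working in kelvin, or using the plain standard deviation, are possible alternatives.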
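
Steps 4 and 5 on slide 52 (predicting the solubility at boiling from the melting point and the room temperature solubility, then the yield) can be sketched as below. The slides do not state which equation the Recrystallization App uses. This sketch assumes an ideal-solution temperature dependence, with the enthalpy of fusion estimated from the melting point via Walden's rule, and it treats the solubility units as proportional to mole fraction. The function names and the benzoic acid inputs are illustrative and not taken from the slides.

```python
import math

R = 8.314               # gas constant, J/(mol K)
WALDEN_ENTROPY = 56.5   # assumed entropy of fusion, J/(mol K) (Walden's rule)

def solubility_at_boiling(s_room, mp_c, bp_c, room_c=25.0):
    """Scale a room temperature solubility up to the solvent boiling point.

    Assumes ln(S_hot / S_room) = (dH_fus / R) * (1/T_room - 1/T_hot),
    with dH_fus estimated as WALDEN_ENTROPY * T_melt.
    """
    t_room = room_c + 273.15
    t_melt = mp_c + 273.15
    # The expression only applies below the solute melting point, so cap T_hot.
    t_hot = min(bp_c + 273.15, t_melt)
    dh_fus = WALDEN_ENTROPY * t_melt
    return s_room * math.exp(dh_fus / R * (1.0 / t_room - 1.0 / t_hot))

def recrystallization_yield(s_room, s_boil):
    """Fraction recovered when a solution saturated at boiling is cooled to room temperature."""
    if s_boil <= s_room:
        return 0.0  # no solubility gain on heating, so nothing crystallizes on cooling
    return 1.0 - s_room / s_boil

# Illustrative inputs (assumptions): benzoic acid in water,
# about 3.4 g/L at 25 C, melting point 122 C, solvent boiling point 100 C.
s_room = 3.4
s_boil = solubility_at_boiling(s_room, mp_c=122.0, bp_c=100.0)
predicted_yield = recrystallization_yield(s_room, s_boil)
print(f"predicted solubility at boiling: {s_boil:.1f} g/L")
print(f"predicted recrystallization yield: {predicted_yield:.0%}")
```

Under this assumed model the predicted yield depends only on the melting point and the two temperatures. A solvent ranking would then also need to check that the absolute solubility at boiling is high enough to be practical.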