Open Notebook Science HUBzero 2011

Jean-Claude Bradley presents "Open Notebook Science: Does Transparency Work?" at the HUBzero conference on April 6, 2011.

Published in: Education

  1. Open Notebook Science: Does Transparency Work? HUBzero Conference. Jean-Claude Bradley, Department of Chemistry, Drexel University. April 6, 2011
  2. The current state of transparency in scientific communication: Case study of melting point data
  3. The Chemical Information Validation Sheet: 567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course
  4. The Chemical Information Validation Explorer (Andrew Lang)
  5. Discovering outliers for melting points (stdev/average)
  6. Investigating the m.p. inconsistencies of EGCG
  7. Investigating the m.p. inconsistencies of cyclohexanone
  8. Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for melting points
  9. Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for boiling points
  10. Sigma-Aldrich, Acros and Wolfram Alpha apparently DO NOT use the same sources for flash points
  11. Most popular data sources
  12. Alfa Aesar donates melting points to the public
  13. Open Melting Point Explorer
  14. Outliers: MDPI dataset, EPI (via ChemSpider)
  15. Outliers: Alfa Aesar
  16. Inconsistencies and SMILES problems within MDPI dataset
  17. MDPI Dataset labeled with High Trust Level
  18. Open Melting Point Datasets
  19. Open Random Forest modeling of Open Melting Point data using CDK descriptors (Andrew Lang). R² = 0.78, TPSA and nHdon most important
  20. Melting point prediction service
  21. Using melting point for temperature dependent solubility prediction
  22. Motivation: Faster Science, Better Science
  23. There are NO FACTS, only measurements embedded within assumptions. Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
  24. TRUST / PROOF
  25. Strategy for an Open Notebook: First record then abstract structure. In order to be discoverable use Google friendly formats (simple HTML, no login). In order to be replicable use free hosted tools (Wikispaces, Google Spreadsheets)
  26. Crowdsourcing Solubility Data
  27. Data provenance: From Wikipedia to…
  28. …the lab notebook and raw data
  29. Calculations Made Public on Google Spreadsheets
  30. Interactive NMR spectra using JSpecView and JCAMP-DX
  31. Raw Data As Images: "Splatter?", "Some liquid"
  32. YouTube for demonstrating experimental set-up
  33. The importance of raw data availability: Missed in a prior publication on solubility for this compound
  34. Solubilities collected in a Google Spreadsheet
  35. Rajarshi Guha’s Live Web Query using Google Viz API
  36. Web services for summary data (Andrew Lang)
  37. Web service calls from within a Google Spreadsheet for solubility measurement and prediction (Andrew Lang)
  38. Integration of Multiple Web Services to Recommend Solvents for Reactions (Andrew Lang)
  39. [no text on slide]
  40. [no text on slide]
  41. [no text on slide]
  42. Reaction Attempts Book
  43. Reaction Attempts Book: Reactants listed Alphabetically
  44. ONS Challenge Solubility Book cited for nanotechnology application
  45. Lulu.com Data Disks
  46. Visualizing molecule-researcher connection maps reveals link between 2 Open Notebooks (Todd and Bradley) (Don Pellegrino)
  47. All ONS web services
  48. For all Formats of ONS Projects
  49. Conclusions: Our current system of publication is not as transparent as it could be
  50. Open Notebook Science offers an efficient way to make research transparent and discoverable
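
Slide 5 names stdev/average as the measure used to surface melting point outliers. The transcript does not show how it was computed, so the following is only a minimal sketch of one way such a ranking could work. The function name, the compound labels, and every number are hypothetical placeholders, not values from the Chemical Information Validation Sheet. The sketch also assumes temperatures in kelvin, since an average near 0 °C would inflate the ratio; the slides do not state which unit was used.

```python
from statistics import mean, stdev


def rank_by_relative_spread(measurements):
    """Return (name, stdev/|average|) pairs, largest ratio first.

    `measurements` maps a compound name to a list of reported values.
    Compounds with fewer than two values are skipped because a sample
    standard deviation is undefined for them.
    """
    ranked = []
    for name, values in measurements.items():
        if len(values) < 2:
            continue  # cannot compute a sample standard deviation
        avg = mean(values)
        if avg == 0:
            continue  # ratio is undefined when the average is zero
        ranked.append((name, stdev(values) / abs(avg)))
    return sorted(ranked, key=lambda pair: pair[1], reverse=True)


# Hypothetical placeholder data in kelvin (NOT real measurements).
reported_mp_kelvin = {
    "compound A": [350.0, 351.0, 349.5],  # sources agree closely
    "compound B": [410.0, 495.0, 412.0],  # one source disagrees strongly
    "compound C": [300.0],                # single value, skipped
}

ranking = rank_by_relative_spread(reported_mp_kelvin)
for name, ratio in ranking:
    print(f"{name}: stdev/average = {ratio:.3f}")
```

Compounds at the top of such a ranking are the ones whose sources disagree most, which makes them candidates for the kind of manual investigation shown for EGCG and cyclohexanone on slides 6 and 7.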
