Open Notebook Science HUBzero 2011

Jean-Claude Bradley presents "Open Notebook Science: Does Transparency Work?" at the HUBzero conference on April 6, 2011.

Published in: Education

  1. Open Notebook Science: Does Transparency Work? HUBzero Conference. Jean-Claude Bradley, Department of Chemistry, Drexel University. April 6, 2011
  2. The current state of transparency in scientific communication: Case study of melting point data
  3. The Chemical Information Validation Sheet: 567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course
  4. The Chemical Information Validation Explorer (Andrew Lang)
  5. Discovering outliers for melting points (stdev/average)
  6. Investigating the m.p. inconsistencies of EGCG
  7. Investigating the m.p. inconsistencies of cyclohexanone
  8. Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for melting points
  9. Sigma-Aldrich, Acros and Wolfram Alpha apparently use the same sources for boiling points
  10. Sigma-Aldrich, Acros and Wolfram Alpha apparently DO NOT use the same sources for flash points
  11. Most popular data sources
  12. Alfa Aesar donates melting points to the public
  13. Open Melting Point Explorer
  14. Outliers: MDPI dataset, EPI (via ChemSpider)
  15. Outliers: Alfa Aesar
  16. Inconsistencies and SMILES problems within MDPI dataset
  17. MDPI Dataset labeled with High Trust Level
  18. Open Melting Point Datasets
  19. Open Random Forest modeling of Open Melting Point data using CDK descriptors (Andrew Lang). R² = 0.78, TPSA and nHdon most important
  20. Melting point prediction service
  21. Using melting point for temperature dependent solubility prediction
  22. Motivation: Faster Science, Better Science
  23. There are NO FACTS, only measurements embedded within assumptions. Open Notebook Science maintains the integrity of data provenance by making assumptions explicit
  24. TRUST / PROOF
  25. Strategy for an Open Notebook: First record, then abstract structure. In order to be discoverable, use Google-friendly formats (simple HTML, no login). In order to be replicable, use free hosted tools (Wikispaces, Google Spreadsheets)
  26. Crowdsourcing Solubility Data
  27. Data provenance: From Wikipedia to…
  28. …the lab notebook and raw data
  29. Calculations Made Public on Google Spreadsheets
  30. Interactive NMR spectra using JSpecView and JCAMP-DX
  31. Raw Data As Images: "Splatter?", "Some liquid"
  32. YouTube for demonstrating experimental set-up
  33. The importance of raw data availability: Missed in a prior publication on solubility for this compound
  34. Solubilities collected in a Google Spreadsheet
  35. Rajarshi Guha’s Live Web Query using Google Viz API
  36. Web services for summary data (Andrew Lang)
  37. Web service calls from within a Google Spreadsheet for solubility measurement and prediction (Andrew Lang)
  38. Integration of Multiple Web Services to Recommend Solvents for Reactions (Andrew Lang)
  39.
  40.
  41.
  42. Reaction Attempts Book
  43. Reaction Attempts Book: Reactants listed Alphabetically
  44. ONS Challenge Solubility Book cited for nanotechnology application
  45. Lulu.com Data Disks
  46. Visualizing molecule-researcher connection maps reveals link between 2 Open Notebooks (Todd and Bradley) (Don Pellegrino)
  47. All ONS web services
  48. For all Formats of ONS Projects
  49. Conclusions: Our current system of publication is not as transparent as it could be
  50. Open Notebook Science offers an efficient way to make research transparent and discoverable
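
Slide 5 names a stdev/average ratio as the way outlying melting point measurements were discovered. The slides do not show the actual calculation, so the following is only a minimal sketch of how such a screen might look. The function names, the 0.05 threshold, the choice of kelvin, and the sample values are all hypothetical illustrations, not data or code from the presentation.

```python
from statistics import mean, stdev


def relative_spread(temps_kelvin):
    """Return sample standard deviation divided by the mean.

    Assumes temperatures are in kelvin so the mean is never near zero.
    """
    return stdev(temps_kelvin) / mean(temps_kelvin)


def flag_inconsistent(measurements, threshold=0.05):
    """Map each compound whose stdev/average exceeds threshold to its ratio.

    `measurements` maps a compound label to a list of reported melting
    points. Compounds with fewer than two values are skipped because a
    sample standard deviation is undefined for them.
    The threshold is an arbitrary illustrative cutoff.
    """
    flagged = {}
    for compound, temps in measurements.items():
        if len(temps) < 2:
            continue
        ratio = relative_spread(temps)
        if ratio > threshold:
            flagged[compound] = ratio
    return flagged


# Made-up placeholder values, not real melting points.
SAMPLE = {
    "compound_A": [299.0, 300.0, 301.0],  # sources agree closely
    "compound_B": [300.0, 330.0, 360.0],  # sources disagree widely
    "compound_C": [250.0],                # single value, cannot be screened
}

if __name__ == "__main__":
    for name, ratio in flag_inconsistent(SAMPLE).items():
        print(f"{name}: stdev/average = {ratio:.3f}")
```

A compound flagged this way is only a candidate for follow-up; as slides 6 and 7 suggest, each case still needs to be traced back to its sources.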
