IJCAI09 Open Notebook Science talk

Jean-Claude Bradley presents "The Role of Openness in Scientific Automation: a case for Open Notebook Science" at the IJCAI'09 Workshop on Abductive and Inductive Knowledge Development in Pasadena, CA, on July 12, 2009.

Published in: Education, Technology


  1. 1. The Role of Openness in Scientific Automation: a case for Open Notebook Science. IJCAI'09 Workshop on Abductive and Inductive Knowledge Development. Jean-Claude Bradley, Associate Professor of Chemistry, Drexel University. July 12, 2009
  2. 2. What is science? FINDING OUT WHAT IS: GOOD, BAD. AND MAKING CHANGES: BETTER, CURE
  3. 3. Why are humans good at science? GOSSIP CURIOSITY NEED TO ART TO KNOW CREATE UNDERSTANDING CONTROL PREDICTIVE MODELS CONCEPTUAL (docking) EMPIRICAL (QSAR)
  4. 4. The standard model of how science happens: Propose hypothesis; Design experiments; Perform experiments; Analyze results
  5. 5. The actual way much of science happens: Think up something cool/useful to do; Find collaborators who also think what you want to do is cool; Convince somebody to fund your cool idea; Find existing data or models to minimize the amount of trial and error; Do the work (experiments); Share what you found with the world. Usually last (and only a small part)
  6. 6. How can the scientific process become more automated? WE ARE HERE
  7. 7. The Robot Scientist
  8. 8. How radical openness can assist in the automation of science • Agents can participate with zero or near-zero cost (free hosted services, e.g. Google) • Self-organizing redundant processes
  9. 9. How can machines know what is important? Ask the humans
  10. 10. UsefulChem Blog
  11. 11. What chemists think is important in 2005
  12. 12. Find-A-Drug
  13. 13. Open and Closed Science. RESEARCH (CLOSED to OPEN): Traditional Lab Notebook (unpublished); Traditional Journal Article; Open Access Journal Article; Open Notebook Science (full transparency). TEACHING: Lectures Archived Traditional Assigned Notes Lectures Paper problems public Public and Textbook public free online F2F lectures textbooks
  14. 14. Motivation: Faster Science, Better Science
  15. 15. Open Notebook Science Logos (Andy Lang, Shirley Wu) Sharing: how much and when
  16. 16. There are NO FACTS, only measurements embedded within assumptions. Open Notebook Science maintains the integrity of data provenance by making assumptions explicit.
  17. 17. TRUST PROOF
  18. 18. The solubility of 4-chlorobenzaldehyde
  19. 19. The Log makes Assumptions Explicit
  20. 20. The Rationale of Findings Explicit
  21. 21. Raw Data Made Public Splatter? Some liquid
  22. 22. YouTube for demonstrating experimental set-up
  23. 23. Calculations Made Public on Google Spreadsheets
  24. 24. Revision History on Google Spreadsheets
  25. 25. Wiki Page History
  26. 26. Comparing Wiki Page Versions
  27. 27. Proof of Purity with interactive NMR spectrum using JSpecView and JCAMP-DX
  28. 28. Linking to Molecules in Chemistry Databases
  29. 29. Experimental Spectra and User-Deposited Data on ChemSpider
  30. 30. Open Data JCAMP spectra for education (Andy Lang, Tony Williams) (Andy Lang, Tony Williams, Robert Lancashire)
  31. 31. Database Curation via Game Playing
  32. 32. Over 50,000 spectrum views so far - worldwide
  33. 33. Link Spectral Game to Open Educational Content
  34. 34. NMR game in Second Life (Andy Lang)
  35. 35. Crowdsourcing Solubility Data
  36. 36. ONS Challenge Students
  37. 37. ONS Challenge Judges
  38. 38. Teaching Lab: Brent Friesen (Dominican University)
  39. 39. Solubility Experiment List
  40. 40. Solubilities collected in a Google Spreadsheet
  41. 41. Rajarshi Guha’s Live Web Query using Google Viz API
  42. 42. Rajarshi Guha and Andy Lang: Chemical Space Explorer
  43. 43. Semi-Automated Measurement of solubility via web service analysis of JCAMP-DX files (Andy Lang)
  44. 44. Solubility Measurement Requests: DoSol sheet • Outlier Bot: flags measurements with high standard-deviation-to-mean ratios • Google Analytics queries: new solvent/solute searches • Solubility request form: researcher in Israel requesting pyrene in acetonitrile solubility for environmental soil contamination study • Application-based models: high priority Ugi reactants
  45. 45. Solvent mixture and temperature: multidimensional solubility data. Actual Data (4-nitrobenzaldehyde). From quadratic regression of 5D space. Feeds DoSol Sheet the next points to measure to best cover the space
  46. 46. Crowdsourcing Reaction Requests: DoUgi sheet
  47. 47. Understanding in addition to empirical modeling. Missed in a prior publication on solubility for this compound
  48. 48. Data provenance: From Wikipedia to…
  49. 49. …the lab notebook and raw data
  50. 50. Including links to the literature
  51. 51. Pierre Lindenbaum’s Solubility Data as RDF Triples
  52. 52. Results and Workflows in Machine-Friendly Format
  53. 53. Experiments in Chemical Markup Language
  54. 54. UsefulChem Project: Open Primary Research in Drug Design using Web2.0 tools. Docking: Rajarshi Guha (Indiana U), Tsu-Soo Tan (Nanyang Inst.). Synthesis: JC Bradley (Drexel U). Testing: Phil Rosenthal (UCSF, malaria), Dan Zaharevitz (NCI, tumors)
  55. 55. Malaria Target: falcipain-2 involved in hemoglobin metabolism Dana.org
  56. 56. Where’s the Beef?
  57. 57. Link to Lab Notebook Page in Wiki
  58. 58. Link to Docking Procedure (Rajarshi Guha)
  59. 59. Link to Docking Results: Lists of SMILES in GoogleDocs (Rajarshi Guha)
  60. 60. Outcome of Guha-Bradley-Rosenthal collaboration
  61. 61. Collaborative Drug Discovery (CDD) Database
  62. 62. How does Open Notebook Science fit with traditional publication? • Concentration (0.4, 0.2, 0.07 M) • Solvent (methanol, ethanol, acetonitrile, THF) • Excess of some reagents (1.2 eq.)
  63. 63. Mettler-Toledo MiniMapper
  64. 64. Mettler-Toledo MiniBlock System
  65. 65. XML reports from MiniMapper robot
  66. 66. GoogleDoc to program and report
  67. 67. Paper written on Wiki
  68. 68. References to papers, blog posts, lab notebook pages, raw data
  69. 69. Paper on Journal of Visualized Experiments (JoVE)
  70. 70. Pre-print on Nature Precedings
  71. 71. ChemSpider Automated Mark-up of Chemical Names
  72. 72. Archiving and Curation of ONS and other new forms of scholarship?
  73. 73. Other Open Notebooks: Cameron Neylon’s Notebooks
  74. 74. Anthony Salvagno’s Notebook (Steve Koch group)
  75. 75. Acknowledgements • Khalid Mirza (Drexel) • Jenny Hale (Southampton U.) • David Bulger (Oral Roberts U.) • Tim Bohinsky (Drexel) • Kevin Owens (Drexel) • Tom Osborne (Mettler-Toledo) • Antony Williams (ChemSpider) • Andrew Lang (Oral Roberts U.) • Rajarshi Guha (Indiana U.) • Cameron Neylon (Southampton U.)
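
Slide 44 mentions an Outlier Bot that flags measurements with a high standard deviation relative to the mean. The slides do not show how the bot works, so the following is only a minimal Python sketch of that idea. The function name `flag_outliers`, the 0.5 threshold, the dictionary layout, and the sample values are all hypothetical and are not taken from the talk or from the actual DoSol sheet.

```python
from statistics import mean, stdev


def flag_outliers(measurements, threshold=0.5):
    """Flag solute/solvent pairs whose repeated measurements disagree.

    measurements: dict mapping a (solute, solvent) tuple to a list of
                  solubility values (assumed layout, for illustration).
    threshold:    hypothetical cutoff on sample standard deviation / mean.
    Returns a list of ((solute, solvent), ratio) for pairs above the cutoff.
    """
    flagged = []
    for key, values in measurements.items():
        # A sample standard deviation needs at least two values.
        if len(values) < 2:
            continue
        average = mean(values)
        # Skip a zero mean to avoid dividing by zero.
        if average == 0:
            continue
        ratio = stdev(values) / average
        if ratio > threshold:
            flagged.append((key, ratio))
    return flagged


if __name__ == "__main__":
    # Made-up placeholder values, not real solubility data.
    sample = {
        ("compound-A", "solvent-X"): [0.10, 0.11, 0.09],
        ("compound-B", "solvent-Y"): [0.1, 0.9, 0.5],
    }
    for (solute, solvent), ratio in flag_outliers(sample):
        print(f"Re-measure {solute} in {solvent}: stdev/mean = {ratio:.2f}")
```

In this sketch, flagged pairs could be written back to a request list for re-measurement, which is the role slide 44 gives to the DoSol sheet.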
