The Value of Openness in Research and Teaching

University of Delaware
Tuesday Tech Talks
Jean-Claude Bradley
Associate Professor of Chemistry
Drexel University

February 12, 2013
The story of the enigmatic solvent

Sophomore Organic Chemistry lab at Drexel University
Lab procedure required recrystallization from ethyl acetate and synthesis not reliable
Searching for a logical reason

• The synthesis of DBA is widespread in
  organic teaching labs
• Methods vary but ethyl acetate is often
  used to recrystallize DBA
• Its use traces back to a paper from 1903
• However, the solubility of DBA in ethyl
  acetate was apparently never reported
Developing a rational method to choose a synthesis and recrystallization solvent – in the open

Matthew McBride
An example of a failed experiment in an Open Notebook with useful information
A failed experiment reveals the importance of aldehyde solubility
An example of a successful experiment in an Open Notebook

This synthesis will now be used in the revised organic teaching lab manual at Drexel
Incomplete information from the literature can be very problematic
Motivation: Faster Science, Better Science
The Recrystallization App




                       (Andrew Lang)
What are good solvents to recrystallize benzoic acid?




                                       (Andrew Lang)
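A rough sketch of the reasoning behind a question like this: a good recrystallization solvent dissolves the compound well when hot and poorly when cold, so the theoretical maximum recovery on cooling a saturated hot solution is 1 - S_cold / S_hot. The snippet below ranks solvents by that fraction. The function names, the solvent labels and the solubility numbers are placeholders invented for illustration, not values from the Open Solubility Collection, and this is not necessarily how the app itself scores solvents.

```python
def max_recovery(s_cold, s_hot):
    """Theoretical maximum fraction of solute recovered when a solution
    saturated at the hot temperature is cooled to the cold temperature.
    Both solubilities must be in the same units (e.g. g per 100 mL)."""
    if s_hot <= 0 or s_cold < 0:
        raise ValueError("solubilities must be positive (hot) and non-negative (cold)")
    # If the compound is more soluble cold than hot, nothing crystallizes out.
    return max(0.0, 1.0 - s_cold / s_hot)


def rank_solvents(solubilities):
    """Sort solvents from best to worst theoretical recovery.
    `solubilities` maps a solvent name to a (s_cold, s_hot) pair."""
    scored = [(name, max_recovery(cold, hot))
              for name, (cold, hot) in solubilities.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Placeholder numbers for three hypothetical solvents (NOT measured data).
example = {
    "solvent A": (0.5, 10.0),
    "solvent B": (4.0, 8.0),
    "solvent C": (2.0, 2.5),
}

for name, fraction in rank_solvents(example):
    print(f"{name}: {fraction:.0%} maximum recovery")
```

A steep solubility-versus-temperature curve gives a fraction close to 1; a flat curve gives a fraction close to 0, which is why the temperature curve for each solvent matters.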
Click on the solvent to see temp curve




                             (Andrew Lang)
Deliver melting point data via App




                           (Andrew Lang)
The Recrystallization App produces and uses
Open Data:
• Open Solubility Collection and Models
• Open Melting Point Collection and Models
• Modeling depends mainly on CDK (Open
  Source Software with Open Descriptors)
• Open Notebook Science
Open Melting Point Datasets
Currently 20,000 compounds with Open MPs
Contributing to Science while Teaching it: Chemical Information Retrieval Class
The Chemical Information Validation Sheet

567 curated and referenced measurements from Fall 2010 Chemical Information Retrieval course
Each entry validated with an image
Alfa Aesar donates melting points to the public
Outliers for ethanol: Alfa Aesar and Oxford MSDS
Solubility Outlier List
Open Melting Point Explorer




                         (Andrew Lang)
Searching for aldol condensations of acetone
     in the Reaction Attempts database




                                (Andrew Lang)
Open Random Forest modeling of Open Melting Point data using CDK descriptors (Andrew Lang)
R² = 0.78, TPSA and nHdon most important
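For readers unfamiliar with the statistic, here is a minimal sketch of how an R² value can be computed from observed and predicted melting points, assuming the usual coefficient of determination (1 minus the residual sum of squares over the total sum of squares). The slides do not say which variant or validation scheme was used, and the numbers below are made up for illustration; they are not from the Open Melting Point dataset.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty sequences of equal length")
    mean_obs = sum(observed) / len(observed)
    # Squared error of the model's predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Squared error of always predicting the mean of the observations.
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values are all identical; R^2 is undefined")
    return 1.0 - ss_res / ss_tot


# Illustrative melting points in degrees Celsius (hypothetical values).
observed = [10.0, 50.0, 90.0, 130.0]
predicted = [20.0, 45.0, 85.0, 120.0]

print(round(r_squared(observed, predicted), 3))
```

A value of 1 means the predictions match the observations exactly; a value of 0 means the model does no better than predicting the mean melting point for every compound.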
Melting point prediction service
Web services for summary data




                      (Andrew Lang)
Using a Google Spreadsheet as a “dashboard interface” for reaction planning and analysis
Calling Google Apps Scripts
Calling Google Apps Scripts

(Andrew Lang and Rich Apodaca)
Google Apps Scripts for conveniently exploring melting point data
A click away from an interactive NMR display (using JCAMP-DX format and ChemDoodle)




                                    (Andrew Lang)
Open Melting Points in Supplementary Data Pages of Wikipedia (Martin Walker)
Google Apps Scripts web services
Chemistry Google Apps Scripts description sheet

(Andrew Lang and Rich Apodaca)
Open Chemical Property Matrix (OCPM)
• Boiling point
• Vapor pressure
• Flash point
• Abraham descriptors
• Melting point
• logP
• Aqueous solubility
• Octanol solubility
Conclusions

More openness in chemistry can make science more efficient

Provide interfaces that make sense to the end users:
• Open Data, Open Models and Open Source Software for modelers
• Apps (smartphones, Google Apps Scripts, etc.) for chemists at the bench



Acknowledgements
Andrew Lang (code, modeling)
Bill Acree (modeling, solubility data contribution)
Antony Williams (ChemSpider services, mp data curation)
Matthew McBride and Rida Atif (recrystallization and synthesis)
Kayla Gogarty (OCPM)
