Prof. Geoffrey R. Hutchison, Department of Chemistry, University of Pittsburgh
From Robert Boyle’s The Sceptical Chymist to Modern Data-Driven Chemistry
(And where do we go from here?)
https://hutchison.chem.pitt.edu/
Scientific Prestige:
Public Disputes
How do you decide the best?
1535 - Fontana solves cubic
Wins prestige (and job)
Writes a coded poem
Niccolò Fontana “Tartaglia”
Image: Wikipedia

(Rijksmuseum NL)
Gerolamo Cardano
Image: Wikipedia

(Unknown origin)
The Sceptical Chymist: pub. 1661
Transition between alchemy and more modern chemistry
Also, beginnings of scientific publishing…
Image Credit: Wikipedia (U. Penn Library)
1660s

Chemistry = Alchemy
Boyle believed in transmutation
So did Newton
But they published observations and discoveries (sometimes reluctantly)
Philosophical Transactions - 1665
Image: Wikipedia (Royal Society)
Scientific Publishing in 1665
Philosophical Transactions and The Sceptical Chymist
• Words and static figures / drawings
• Four figures
- chimney, tools for mining
- unusual calf’s head (no nose!)
• Not that different from most modern scientific articles
Philosophical Transactions - 1665
Royal Society
https://royalsociety.org/blog/2017/02/images-from-the-archive/
Modern Chemistry
Data-intensive
.. a lot goes into the figures & tables
[Figure: chemical structures of theobromine and caffeine]
• Analytical Data (1H, 13C NMR, MS, IR, UV/Vis…)
• Crystallography
• Calculations (DFT, etc.)
• Applications (AFM, devices, …)
• Reactions, Chemical Diagrams..
Modern Chemistry
Submission
Peer Review / Reproducible?
Editing
Published Final Form
[Figure: chemical structures of theobromine and caffeine]
• Analytical Data (1H, 13C NMR, MS, IR, UV/Vis…)
• Crystallography
• Calculations (DFT, etc.)
• Applications (AFM, devices, …)
• Reactions, Chemical Diagrams..
Preprint Generate PDF(s)
Extract it from the images…
• Copy and paste (?)


• WebPlotDigitizer


• OSRA (Igor Filippov)


• ChemDataExtractor (M. Swain)


• Pay an undergraduate
You Want the Data?
[Figure: number of conformers vs. number of rotatable bonds (Confab - MMFF94), scaling as 3^n; legend: E (kcal/mol) = 500, 100, 10, 8, 5, 3]
[Figure: median R^2 vs. time (s, msec to days) for Force Field, ML, GFN, DFT-D, and MP2 methods, with the “Holy Grail” region marked]
[Figure: NMR spectrum (ppm axis) with peak labels, integrals, and acquisition parameters]
[Figure: Mo2-Tributyl phosphate, US Patent 8,198,437]
Open Data, Open Standards
Open Source
• Peter Murray-Rust, Henry Rzepa, Rajarshi Guha, Christoph Steinbeck, Jörg Wegner, Rich Apodaca, Egon L. Willighagen…

• ACS San Diego 2005

• DOI: 10.1021/ci050400b
The Blue Obelisk
Blue Obelisk, Horton Plaza, San Diego
“I can plug my iPod into any computer and it will recognize my music and give me all sorts of metadata: artist, title, type of music... Why can’t I read the chemical data off my files?”
Prof. Henry S. Rzepa (Imperial College), Spring 2005 ACS Meeting, San Diego, CA
But why do I care?
.. why chemistry needs to share open data
• It’s efficient. Student creates PDF of data, extract data from another PDF??
• It enables reuse. Philosophical Transactions worked. Reuse is science
• Data is king. There’s even a journal, Scientific Data
• Crowdsourcing. Imagine every chemistry student taking melting points, solubility, measuring spectra, calculations …
Some shared chemical data
• Cambridge Crystallographic Data Centre (CCDC)
• Inorganic Crystal Structure Database (ICSD)
• Crystallography Open Database (COD)

• Protein Data Bank (PDB)

• Ligand Expo

• American Mineralogist Database
Crystallography
Materials Horizons 2020, 7, 135-142

via https://crystallography.net/
Drug Discovery / Catalogues
• PubChem

• ZINC

• ChemSpider

• eMolecules

• ChEMBL

• DrugBank
Not only…
Sharing data customary
• Enabling ML methods:

• QM7

• QM9

• ANI-1

• PubChemQC

• …
Computational Chemistry
DOI: 10.1038/s41597-021-00833-x
Often small…
• AIST Spectral Database SDBS (34k)

• NMRShiftDB (44k)

• (Commercial data)

• IR, NMR, MS databases

• Often hard to share
Spectroscopy
.. at least in chemistry
• Interactive figures:
- Not just 2D static images
• Supporting information as repository:
- Documents / text / README
- Raw data, spectra, etc.
- Code
- Jupyter notebooks (analysis)
Future of Publishing
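One way to picture "supporting information as repository" is a raw data file stored next to a machine-readable metadata file. The sketch below is only an illustration under stated assumptions: the function names, the column names, the metadata keys, and the peak values are all hypothetical and do not come from the talk or from any community standard.

```python
import csv
import io
import json


def package_spectrum(peaks, metadata):
    """Serialize a peak list to CSV text and its metadata to JSON text."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["shift_ppm", "integral"])  # hypothetical column names
    writer.writerows(peaks)
    return buf.getvalue(), json.dumps(metadata, indent=2, sort_keys=True)


def load_spectrum(csv_text, json_text):
    """Read the peak list and metadata back without touching a PDF."""
    rows = csv.DictReader(io.StringIO(csv_text))
    peaks = [(float(r["shift_ppm"]), float(r["integral"])) for r in rows]
    return peaks, json.loads(json_text)


# Hypothetical values for illustration only, not measured data.
peaks = [(7.26, 1.0), (2.15, 3.0)]
meta = {"technique": "1H NMR", "solvent": "CDCl3", "units": "ppm"}

csv_text, json_text = package_spectrum(peaks, meta)
restored_peaks, restored_meta = load_spectrum(csv_text, json_text)
```

The two text blobs could be written to files and deposited alongside a README in a repository such as Zenodo or Figshare; anyone reusing the data then reads numbers and metadata directly instead of digitizing a figure.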
Take-Home
Modern chemistry is data - publishing should be too
• Post your raw data and open it - you can create a DOI: Zenodo, Figshare …

• Enable shared repositories - imagine searching all the chemical data 

• Cheminformatics exists - exchange data and metadata
It’s 2021, not 1665
Publish 21st century science
