Evolution of open chemical information
Valery Tkachenko
Royal Society of Chemistry
ACS Fall 2016
Philadelphia, PA
The Short History of Time
Image credit: Rhys Taylor, Cardiff University
~1992
Chemical database
PubChem
• 57 million chemicals and growing
• Data sourced from >500 different sources
• Crowdsourced curation and annotation
• Ongoing deposition of data from our journals and our collaborators
• A structure-centric hub for web-searching
ChemSpider
ChemSpider
ChemSpider real-time curation
Article X-ray
Compounds
Reaction
Analytical Data
Text and References
Reaction 1: NextMove reaction text-mined from RSC archive – original article
Reaction 1: NextMove reaction text-mined from RSC archive – CML output
<?xml version="1.0" encoding="UTF-8"?>
<reactionList xmlns="http://www.xml-cml.org/schema" xmlns:cmlDict="http://www.xml-cml.org/dictionary/cml/" xmlns:nameDict="http://www.xml-cml.org/dictionary/cml/name/"
xmlns:unit="http://www.xml-cml.org/unit/" xmlns:cml="http://www.xml-cml.org/schema" xmlns:dl="http://bitbucket.org/dan2097">
<reaction>
<dl:source>
<dl:documentId>c3ra45871g</dl:documentId>
<dl:paragraphText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL)
at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL), concentrated. The residue
was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product
was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. [α]D20 −24.2 (c 1.1, CHCl3); 1H NMR
(CDCl3, 300 MHz) δ 0.04 (s, 3H), 0.07 (s, 3H), 0.85 (s, 9H), 1.34 (s, 3H), 1.44 (s, 3H), 2.16 (br, 1H), 3.68–3.81 (m, 3H), 4.16 (t, J = 13.8 Hz, J = 13.8 Hz, 1H), 4.59 (t, J = 6.6 Hz, J = 6.6
Hz, 1H), 5.22 (d, J = 10.7 Hz, 1H), 5.34 (d, J = 17.1 Hz, 1H), 5.90 (ddd, J = 7.2 Hz, J = 10.2 Hz, J = 17.2 Hz, 1H); 13C NMR (CDCl3, 75 MHz) δ 134.1, 118.4, 108.5, 79.5, 78.8, 70.8,
65.0, 27.8, 25.9, 25.4, 18.1, −3.7, −4.4. HRMS (ESI) calcd for [M + Na]+ (C15H30O4SiNa) 325.1811, found 325.1807.</dl:paragraphText>
</dl:source>
<dl:reactionSmiles>[H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C&gt;ClCCl&gt;[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|</dl:reactionSmiles>
<productList>
<product role="product">
<molecule id="m0">
<name dictRef="nameDict:unknown">10</name>
<dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethanol</dl:nameResolved>
</molecule>
<amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
<amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
<amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
<amount dl:propertyType="CALCULATEDPERCENTYIELD" dl:normalizedValue="79.1" units="unit:percentYield">79.1</amount>
<identifier dictRef="cml:smiles" value="C(C)(C)(C)[Si](O[C@H](CO)[C@H]1OC(O[C@H]1C=C)(C)C)(C)C"/>
<identifier dictRef="cml:inchi" value="InChI=1S/C15H30O4Si/c1-9-11-13(18-15(5,6)17-11)12(10-16)19-20(7,8)14(2,3)4/h9,11-13,16H,1,10H2,2-8H3/t11-,12+,13-/m0/s1"/>
<dl:entityType>definiteReference</dl:entityType>
<dl:appearance>colourless</dl:appearance>
<dl:state>liquid</dl:state>
</product>
</productList>
<reactantList>
<reactant role="reactant">
<molecule id="m1">
<name dictRef="nameDict:unknown">Diisobutylaluminium hydride</name>
</molecule>
<amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00323">3.23 mmol</amount>
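The CML listing above is cut off by the slide, so it cannot be parsed as shown. As a minimal sketch, assuming a trimmed, well-formed copy of the product entry, the normalized amounts could be read with Python's standard library. The `CML_SNIPPET` string and the `read_amounts` helper are illustrative names, not part of the NextMove output.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the reactionList element on the slide.
NS = {
    "cml": "http://www.xml-cml.org/schema",
    "dl": "http://bitbucket.org/dan2097",
}

# Trimmed, well-formed copy of the product entry (illustrative only).
CML_SNIPPET = """<reactionList xmlns="http://www.xml-cml.org/schema"
              xmlns:dl="http://bitbucket.org/dan2097">
  <reaction>
    <productList>
      <product role="product">
        <molecule id="m0"><name>10</name></molecule>
        <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
        <amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
        <amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
      </product>
    </productList>
  </reaction>
</reactionList>"""


def read_amounts(xml_text):
    """Return {propertyType: normalizedValue as float} for each product amount."""
    root = ET.fromstring(xml_text)
    dl = "{%s}" % NS["dl"]  # namespaced attributes use the {uri}name form
    amounts = {}
    for amount in root.iterfind(".//cml:product/cml:amount", NS):
        key = amount.get(dl + "propertyType")
        amounts[key] = float(amount.get(dl + "normalizedValue"))
    return amounts


if __name__ == "__main__":
    print(read_amounts(CML_SNIPPET))
```

Reading `dl:normalizedValue` rather than the element text avoids re-parsing strings such as "1.02 mmol" or "79%".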
Reaction 1: procedure steps
Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid.
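The CML output records a CALCULATEDPERCENTYIELD of 79.1 next to the reported 79%. That figure can be checked from the amounts quoted in this paragraph. A minimal sketch, assuming a 1:1 stoichiometry between 9 and 10 and that 9 is the limiting reagent (the function name is illustrative):

```python
def percent_yield(product_mmol, limiting_reactant_mmol):
    """Percent yield for a 1:1 reaction, with both amounts in mmol."""
    if limiting_reactant_mmol <= 0:
        raise ValueError("limiting reactant amount must be positive")
    return 100.0 * product_mmol / limiting_reactant_mmol


# Amounts quoted in the procedure: 10 (1.02 mmol) obtained from 9 (1.29 mmol).
calculated = percent_yield(1.02, 1.29)
print(round(calculated, 1))  # 79.1
```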
Text mining breaks down procedure summary into steps:
<dl:reactionActionList/dl:reactionActions> dl:phraseTexts
• action="Add": Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C
• action="Stir": The reaction mixture was stirred at −78 °C for another 2 h
• action="Heat": warmed up to rt
• action="Quench": quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)
• action="Concentrate": concentrated
• action="Add": The residue was added with water (10 mL)
• action="Extract": extracted with dichloromethane (12 mL × 3)
• action="Dry": dried over Na2SO4
• action="Filter": filtered
• action="Concentrate": concentrated
• action="Purify": The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)
• action="Yield": to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid
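To illustrate how phrases can map to action labels, here is a minimal keyword-lookup sketch. It is not the NextMove text-mining method, which the slides do not describe. The `TRIGGERS` table and `tag_phrase` function are hypothetical, and the phrases are assumed to be already split as in the list above.

```python
import re

# Hypothetical trigger patterns; the real text-mining rules are not shown in the slides.
TRIGGERS = [
    ("Add", r"\bwas added\b"),
    ("Stir", r"\bstirred\b"),
    ("Heat", r"\bwarmed\b"),
    ("Quench", r"\bquenched\b"),
    ("Concentrate", r"\bconcentrated\b"),
    ("Extract", r"\bextracted\b"),
    ("Dry", r"\bdried\b"),
    ("Filter", r"\bfiltered\b"),
    ("Purify", r"\bpurified\b"),
    ("Yield", r"\bto give\b"),
]

# Phrases copied from the procedure, already split into steps.
PHRASES = [
    "Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) "
    "was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and "
    "dichloromethane (20 mL) at −78 °C",
    "The reaction mixture was stirred at −78 °C for another 2 h",
    "warmed up to rt",
    "quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)",
    "concentrated",
    "The residue was added with water (10 mL)",
    "extracted with dichloromethane (12 mL × 3)",
    "dried over Na2SO4",
    "filtered",
    "concentrated",
    "The crude product was further purified by column chromatography "
    "(SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)",
    "to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid",
]


def tag_phrase(phrase):
    """Return the first action whose trigger pattern occurs in the phrase."""
    for action, pattern in TRIGGERS:
        if re.search(pattern, phrase):
            return action
    return "Unknown"


ACTIONS = [tag_phrase(p) for p in PHRASES]

if __name__ == "__main__":
    for action, phrase in zip(ACTIONS, PHRASES):
        print(f'action="{action}": {phrase}')
```

A lookup like this only labels phrases that are already segmented. Splitting the paragraph into phrases and resolving references such as "9" and "10" is the harder part, and it is not attempted here.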
http://www.wired.com/2014/04/google-project-ara/
http://www.wsj.com/articles/googles-modular-phones-to-go-on-sale-next-year-1463783371
The World we are heading into
http://www.gartner.com/newsroom/id/3143521
Our World is hyperconnected
Standards?
Data quality issues
Robochemistry
Proliferation of errors in public and private databases
Automated quality control system
CVSP
CVSP – submission details
CVSP – issues review
J. Brecher, IUPAC
Graphical Representation of Stereochemical Configuration
Section: ST-1.1.10
DB06287
CVSP - mapping
CVSP – rules
Dimensions and complexity of science
D2I2K2W
info@openphactsfoundation.org @Open_PHACTS
Open PHACTS Practical Semantics
OpenPHACTS
GlaxoSmithKline – Coordinator
Universität Wien – Managing entity
Technical University of Denmark
University of Hamburg, Center for Bioinformatics
BioSolveIT GmbH
Consorci Mar Parc de Salut de Barcelona
Leiden University Medical Centre
Royal Society of Chemistry
Vrije Universiteit Amsterdam
Novartis
Merck Serono
H. Lundbeck A/S
Eli Lilly
Netherlands Bioinformatics Centre
Swiss Institute of Bioinformatics
ConnectedDiscovery
EMBL-European Bioinformatics Institute
Janssen
Esteve
Almirall
OpenLink
Scibite
The Open PHACTS Foundation
Spanish National Cancer Research Centre
University of Manchester
Maastricht University
Aqnowledge
University of Santiago de Compostela
Rheinische Friedrich-Wilhelms-Universität Bonn
AstraZeneca
Pfizer
Why is it so hard to…
Competitors?
What’s the structure?
Are they in our file?
What’s similar?
What’s the target?
Pharmacology data?
Known Pathways?
Working On Now?
Connections to disease?
Expressed in right cell type?
IP?
@gray_alasdair Big Data Integration
Knowledge is federated
Publishing – then…
…and now?
http://ec.europa.eu/research/press/2016/pdf/opendata-infographic_072016.pdf
Data Market
Publishers - the guardians of knowledge
This is a poster for Guardians of the Galaxy. The poster art copyright is believed to belong to the distributor of the Film, Walt Disney Studios Motion Pictures, the publisher, Marvel Studios, or the graphic artist.
Data Publishing
Original artist: Joseph Ferdinand Keppler (1838-1894) Restoration: Adam Cuerden - http://www.loc.gov/pictures/item/2011661385/ by way of http://adamcuerden.deviantart.com/gallery/#/d5onmxh
The World we live in
Moore’s Law
"Internet host count history". Internet Systems Consortium. Retrieved May 16, 2012.
We are on the verge of a new technical revolution
and it feels great to anticipate it and be ready to ride!
Image from surfline.com by Mike Cianciulli
Data Science @ RSC
The team. From left to right: Valery Tkachenko and Alexey Pshenichnov, based in the United States; Aileen Day, based in Southampton; John Boyle, Peter Corbett, Colin Batchelor, Jeff White, Nicholas Bailey and Val the plant, based at TGH
Thank you
Email: tkachenkov@rsc.org
Slides:
http://www.slideshare.net/valerytkachenko16

More Related Content

Viewers also liked

2016 laura's resume
2016 laura's resume2016 laura's resume
2016 laura's resume
Laura Estudillo
 
Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.
José María
 
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)aalvarez1410
 
Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010
Karoliina Luoto
 
Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...
Jan Korsanke
 
Nociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptualNociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptual
ylerin
 
Serverless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic StackServerless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic Stack
Edoardo Paolo Scalafiotti
 
Érzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésÉrzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzés
Zoltan Varju
 
Nociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptualNociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptualMaría Herrera
 
OpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and LearningsOpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and Learnings
Valery Tkachenko
 
Postcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts AheadPostcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts Ahead
Mafel Gorne
 
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DCHow to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
Richard Harbridge
 
Medicina Humana
Medicina HumanaMedicina Humana
Historic Gay Rights Decision
Historic Gay Rights DecisionHistoric Gay Rights Decision
Historic Gay Rights Decisionmaditabalnco
 
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
ALEXANDRE damien - Sculpteur de transformations
 
Gradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, FasterGradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, Faster
Andres Almiray
 

Viewers also liked (17)

2016 laura's resume
2016 laura's resume2016 laura's resume
2016 laura's resume
 
Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.Empleo y esclerosis múltiple.
Empleo y esclerosis múltiple.
 
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
MAPA CONCEPTUAL SOCIOLOGIA (unidades IV Y V)
 
Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010Verkko ja tyohyvinvointi, TTL 26052010
Verkko ja tyohyvinvointi, TTL 26052010
 
Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...Processes should serve creativity - Which processes help creatives to work be...
Processes should serve creativity - Which processes help creatives to work be...
 
Nociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptualNociones de derecho civil mapa conceptual
Nociones de derecho civil mapa conceptual
 
Serverless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic StackServerless Logging with AWS Lambda and the Elastic Stack
Serverless Logging with AWS Lambda and the Elastic Stack
 
Érzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzésÉrzelmek hálójában – hálózat- és tartalomelemzés
Érzelmek hálójában – hálózat- és tartalomelemzés
 
Nociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptualNociones del derecho civil mapa conceptual
Nociones del derecho civil mapa conceptual
 
OpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and LearningsOpenPHACTS - Chemistry Platform Update and Learnings
OpenPHACTS - Chemistry Platform Update and Learnings
 
Postcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts AheadPostcron: Automate and Plan Posts Ahead
Postcron: Automate and Plan Posts Ahead
 
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DCHow to Decide: When to Use What In Office 365 - SharePoint Fest DC
How to Decide: When to Use What In Office 365 - SharePoint Fest DC
 
Medicina Humana
Medicina HumanaMedicina Humana
Medicina Humana
 
DEBILIDAD INTELECTUAL
DEBILIDAD INTELECTUALDEBILIDAD INTELECTUAL
DEBILIDAD INTELECTUAL
 
Historic Gay Rights Decision
Historic Gay Rights DecisionHistoric Gay Rights Decision
Historic Gay Rights Decision
 
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
#Agilité Transformation #Disruption #User #Centricity #damien #ALEXANDRE
 
Gradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, FasterGradle: Harder, Stronger, Better, Faster
Gradle: Harder, Stronger, Better, Faster
 

Similar to Evolution of open chemical information

Current initiatives in developing research data repositories at the Royal Soc...
Current initiatives in developing research data repositories at the Royal Soc...Current initiatives in developing research data repositories at the Royal Soc...
Current initiatives in developing research data repositories at the Royal Soc...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Not just another reaction database
Not just another reaction databaseNot just another reaction database
Not just another reaction database
Valery Tkachenko
 
Metabolomics.ppt
Metabolomics.pptMetabolomics.ppt
Metabolomics.ppt
Robinakhan13
 
The importance of standards for data exchange and interchange on the Royal So...
The importance of standards for data exchange and interchange on the Royal So...The importance of standards for data exchange and interchange on the Royal So...
The importance of standards for data exchange and interchange on the Royal So...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
IRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomicsIRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomics
Panagiotis Arapitsas
 
Experiences in Hosting Big Chemistry Data Collections for the Community
Experiences in Hosting Big Chemistry Data Collections for the CommunityExperiences in Hosting Big Chemistry Data Collections for the Community
Experiences in Hosting Big Chemistry Data Collections for the Community
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Microwave-Assisted_Extraction_and_HPLC-DAD_Determi.pdf
Microwave-Assisted_Extraction_and_HPLC-DAD_Determi.pdfMicrowave-Assisted_Extraction_and_HPLC-DAD_Determi.pdf
Microwave-Assisted_Extraction_and_HPLC-DAD_Determi.pdf
Dr.Venkata Suresh Ponnuru
 
NOMAD
NOMADNOMAD
NOMAD
Jisc RDM
 
Text mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community accessText mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community access
Valery Tkachenko
 
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
ARUNNT2
 
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case StudyHemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
InsideScientific
 
A chemistry data repository to serve them all
A chemistry data repository to serve them allA chemistry data repository to serve them all
151 performance of a localized fiber optic
151 performance of a localized fiber optic151 performance of a localized fiber optic
151 performance of a localized fiber optic
SHAPE Society
 
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acidSynthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
SSR Institute of International Journal of Life Sciences
 
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
IJERA Editor
 
Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...
US Environmental Protection Agency (EPA), Center for Computational Toxicology and Exposure
 
Chemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachableChemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachable
ChemAxon
 
A041030106
A041030106A041030106
A041030106
IOSR-JEN
 

Similar to Evolution of open chemical information (20)

Current initiatives in developing research data repositories at the Royal Soc...
Current initiatives in developing research data repositories at the Royal Soc...Current initiatives in developing research data repositories at the Royal Soc...
Current initiatives in developing research data repositories at the Royal Soc...
 
Not just another reaction database
Not just another reaction databaseNot just another reaction database
Not just another reaction database
 
Metabolomics.ppt
Metabolomics.pptMetabolomics.ppt
Metabolomics.ppt
 
The importance of standards for data exchange and interchange on the Royal So...
The importance of standards for data exchange and interchange on the Royal So...The importance of standards for data exchange and interchange on the Royal So...
The importance of standards for data exchange and interchange on the Royal So...
 
IRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomicsIRSAE aquatic ecology 28 June 2018 metabolomics
IRSAE aquatic ecology 28 June 2018 metabolomics
 
Experiences in Hosting Big Chemistry Data Collections for the Community
Experiences in Hosting Big Chemistry Data Collections for the CommunityExperiences in Hosting Big Chemistry Data Collections for the Community
Experiences in Hosting Big Chemistry Data Collections for the Community
 
Microwave-Assisted_Extraction_and_HPLC-DAD_Determi.pdf
Microwave-Assisted_Extraction_and_HPLC-DAD_Determi.pdfMicrowave-Assisted_Extraction_and_HPLC-DAD_Determi.pdf
Microwave-Assisted_Extraction_and_HPLC-DAD_Determi.pdf
 
NOMAD
NOMADNOMAD
NOMAD
 
Text mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community accessText mining to produce large chemistry datasets for community access
Text mining to produce large chemistry datasets for community access
 
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
3Slide_Presentation_for_Lee_Ferguson,_Ph.D.,_Duke_University.ppt
 
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case StudyHemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
Hemodynamic Assessment Series by Transonic -- Part 1: PV Loop Case Study
 
A chemistry data repository to serve them all
A chemistry data repository to serve them allA chemistry data repository to serve them all
A chemistry data repository to serve them all
 
151 performance of a localized fiber optic
151 performance of a localized fiber optic151 performance of a localized fiber optic
151 performance of a localized fiber optic
 
Nirs
NirsNirs
Nirs
 
151 performance of a localized fiber optic
151 performance of a localized fiber optic151 performance of a localized fiber optic
151 performance of a localized fiber optic
 
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acidSynthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
Synthesis of 2,3 o,o-dibenzyl-6-o-tosyl-l-ascorbic acid
 
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...N-alkylation methods, Characterization and Evaluation of antibacterial activi...
N-alkylation methods, Characterization and Evaluation of antibacterial activi...
 
Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...
 
Chemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachableChemical intelligence that makes hidden knowledge effortlessly reachable
Chemical intelligence that makes hidden knowledge effortlessly reachable
 
A041030106
A041030106A041030106
A041030106
 

More from Valery Tkachenko

Evolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the futureEvolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the future
Valery Tkachenko
 
In silico design of new functional materials
In silico design of new functional materialsIn silico design of new functional materials
In silico design of new functional materials
Valery Tkachenko
 
Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...
Valery Tkachenko
 
Abstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representationsAbstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representations
Valery Tkachenko
 
Machine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpointsMachine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpoints
Valery Tkachenko
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collection
Valery Tkachenko
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
Valery Tkachenko
 
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsDeep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Valery Tkachenko
 
Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...
Valery Tkachenko
 
Need and benefits for structure standardization to facilitate integration and...
Need and benefits for structure standardization to facilitate integration and...Need and benefits for structure standardization to facilitate integration and...
Need and benefits for structure standardization to facilitate integration and...
Valery Tkachenko
 
Development and comparison of deep learning toolkit with other machine learni...
Development and comparison of deep learning toolkit with other machine learni...Development and comparison of deep learning toolkit with other machine learni...
Development and comparison of deep learning toolkit with other machine learni...
Valery Tkachenko
 
Living in a world of federated knowledge challenges, principles, tools and ...
Living in a world of federated knowledge   challenges, principles, tools and ...Living in a world of federated knowledge   challenges, principles, tools and ...
Living in a world of federated knowledge challenges, principles, tools and ...
Valery Tkachenko
 
Open chemistry registry and mapping platform based on open source cheminforma...
Open chemistry registry and mapping platform based on open source cheminforma...Open chemistry registry and mapping platform based on open source cheminforma...
Open chemistry registry and mapping platform based on open source cheminforma...
Valery Tkachenko
 
Using the structured product labeling format to index versatile chemical data
Using the structured product labeling format to index versatile chemical dataUsing the structured product labeling format to index versatile chemical data
Using the structured product labeling format to index versatile chemical data
Valery Tkachenko
 
Tools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databasesTools and approaches for data deposition into nanomaterial databases
Tools and approaches for data deposition into nanomaterial databases
Valery Tkachenko
 
Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0Chemistry Validation and Standardization Platform v2.0
Chemistry Validation and Standardization Platform v2.0
Valery Tkachenko
 
Open Science Data Repository - the platform for materials research
Open Science Data Repository - the platform for materials researchOpen Science Data Repository - the platform for materials research
Open Science Data Repository - the platform for materials research
Valery Tkachenko
 
Opportunities in chemical structure standardization
Opportunities in chemical structure standardizationOpportunities in chemical structure standardization
Opportunities in chemical structure standardization
Valery Tkachenko
 
OMPOL – visualisation of large chemical spaces
OMPOL – visualisation of large chemical spacesOMPOL – visualisation of large chemical spaces
OMPOL – visualisation of large chemical spaces
Valery Tkachenko
 
Implementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTSImplementing chemistry platform for OpenPHACTS
Implementing chemistry platform for OpenPHACTS
Valery Tkachenko
 

More from Valery Tkachenko (20)

Evolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the futureEvolution of public chemistry databases: past and the future
Evolution of public chemistry databases: past and the future
 
In silico design of new functional materials
In silico design of new functional materialsIn silico design of new functional materials
In silico design of new functional materials
 
Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...Metal-organic frameworks: from database to supramolecular effects in complexa...
Metal-organic frameworks: from database to supramolecular effects in complexa...
 
Abstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representationsAbstract recommendation system: beyond word-level representations
Abstract recommendation system: beyond word-level representations
 
Machine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpointsMachine learning methods for chemical properties and toxicity based endpoints
Machine learning methods for chemical properties and toxicity based endpoints
 
Chemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collectionChemical workflows supporting automated research data collection
Chemical workflows supporting automated research data collection
 
Deep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpointsDeep learning methods applied to physicochemical and toxicological endpoints
Deep learning methods applied to physicochemical and toxicological endpoints
 
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictionsDeep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
Deep Learning on nVidia GPUs for QSAR, QSPR and QNAR predictions
 
Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...Using publicly available resources to build a comprehensive knowledgebase of ...
Using publicly available resources to build a comprehensive knowledgebase of ...
 
Need and benefits for structure standardization to facilitate integration and...
 
Development and comparison of deep learning toolkit with other machine learni...
 
Living in a world of federated knowledge challenges, principles, tools and ...
 
Open chemistry registry and mapping platform based on open source cheminforma...
 
Using the structured product labeling format to index versatile chemical data
 
Tools and approaches for data deposition into nanomaterial databases
 
Chemistry Validation and Standardization Platform v2.0
 
Open Science Data Repository - the platform for materials research
 
Opportunities in chemical structure standardization
 
OMPOL – visualisation of large chemical spaces
 
Implementing chemistry platform for OpenPHACTS
 

Evolution of open chemical information

  • 1. Evolution of open chemical information. Valery Tkachenko, Royal Society of Chemistry. ACS Fall 2016, Philadelphia, PA
  • 2. The Short History of Time. Image credit: Rhys Taylor, Cardiff University. ~1992
  • 7.
      • 57 million chemicals and growing
      • Data sourced from >500 different sources
      • Crowdsourced curation and annotation
      • Ongoing deposition of data from our journals and our collaborators
      • A structure centric hub for web-searching
  • 12. Reaction 1: NextMove reaction text-mined from RSC archive – original article
  • 13. Reaction 1: NextMove reaction text-mined from RSC archive – CML output
      <?xml version="1.0" encoding="UTF-8"?>
      <reactionList xmlns="http://www.xml-cml.org/schema" xmlns:cmlDict="http://www.xml-cml.org/dictionary/cml/" xmlns:nameDict="http://www.xml-cml.org/dictionary/cml/name/" xmlns:unit="http://www.xml-cml.org/unit/" xmlns:cml="http://www.xml-cml.org/schema" xmlns:dl="http://bitbucket.org/dan2097">
        <reaction>
          <dl:source>
            <dl:documentId>c3ra45871g</dl:documentId>
            <dl:paragraphText>Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid. [α]D20 −24.2 (c 1.1, CHCl3); 1H NMR (CDCl3, 300 MHz) δ 0.04 (s, 3H), 0.07 (s, 3H), 0.85 (s, 9H), 1.34 (s, 3H), 1.44 (s, 3H), 2.16 (br, 1H), 3.68–3.81 (m, 3H), 4.16 (t, J = 13.8 Hz, J = 13.8 Hz, 1H), 4.59 (t, J = 6.6 Hz, J = 6.6 Hz, 1H), 5.22 (d, J = 10.7 Hz, 1H), 5.34 (d, J = 17.1 Hz, 1H), 5.90 (ddd, J = 7.2 Hz, J = 10.2 Hz, J = 17.2 Hz, 1H); 13C NMR (CDCl3, 75 MHz) δ 134.1, 118.4, 108.5, 79.5, 78.8, 70.8, 65.0, 27.8, 25.9, 25.4, 18.1, −3.7, −4.4. HRMS (ESI) calcd for [M + Na]+ (C15H30O4SiNa) 325.1811, found 325.1807.</dl:paragraphText>
          </dl:source>
          <dl:reactionSmiles>[H-].C([Al+]CC(C)C)C(C)C.C([O:17][CH2:18][C@@H:19]([O:29][Si:30]([C:33]([CH3:36])([CH3:35])[CH3:34])([CH3:32])[CH3:31])[C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)(=O)C(C)(C)C&gt;ClCCl&gt;[C:33]([Si:30]([CH3:32])([CH3:31])[O:29][C@@H:19]([C@@H:20]1[C@H:24]([CH:25]=[CH2:26])[O:23][C:22]([CH3:28])([CH3:27])[O:21]1)[CH2:18][OH:17])([CH3:36])([CH3:35])[CH3:34] |f:0.1|</dl:reactionSmiles>
          <productList>
            <product role="product">
              <molecule id="m0">
                <name dictRef="nameDict:unknown">10</name>
                <dl:nameResolved>(R)-2-((tert-Butyldimethylsilyl)oxy)-2-((4S,5S)-2,2-dimethyl-5-vinyl-1,3-dioxolan-4-yl)ethanol</dl:nameResolved>
              </molecule>
              <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
              <amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
              <amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
              <amount dl:propertyType="CALCULATEDPERCENTYIELD" dl:normalizedValue="79.1" units="unit:percentYield">79.1</amount>
              <identifier dictRef="cml:smiles" value="C(C)(C)(C)[Si](O[C@H](CO)[C@H]1OC(O[C@H]1C=C)(C)C)(C)C"/>
              <identifier dictRef="cml:inchi" value="InChI=1S/C15H30O4Si/c1-9-11-13(18-15(5,6)17-11)12(10-16)19-20(7,8)14(2,3)4/h9,11-13,16H,1,10H2,2-8H3/t11-,12+,13-/m0/s1"/>
              <dl:entityType>definiteReference</dl:entityType>
              <dl:appearance>colourless</dl:appearance>
              <dl:state>liquid</dl:state>
            </product>
          </productList>
          <reactantList>
            <reactant role="reactant">
              <molecule id="m1">
                <name dictRef="nameDict:unknown">Diisobutylaluminium hydride</name>
              </molecule>
              <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00323">3.23 mmol</amount>
  • 14. Reaction 1: procedure steps
      Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C. The reaction mixture was stirred at −78 °C for another 2 h, warmed up to rt, quenched with methanol (3 mL) and citric acid (aq) (w/w, 10%, 5 mL), concentrated. The residue was added with water (10 mL) and extracted with dichloromethane (12 mL × 3). The organic layers were combined, dried over Na2SO4, filtered and concentrated. The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33) to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid.
      Text mining breaks down procedure summary into steps: <dl:reactionActionList/dl:reactionActions> dl:phraseTexts
      • action="Add": Diisobutylaluminium hydride (1.1 M in cyclohexane, 2.93 mL, 3.23 mmol) was added dropwise to the solution of 9 (500 mg, 1.29 mmol) and dichloromethane (20 mL) at −78 °C
      • action="Stir": The reaction mixture was stirred at −78 °C for another 2 h
      • action="Heat": warmed up to rt
      • action="Quench": quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)
      • action="Concentrate": concentrated
      • action="Add": The residue was added with water (10 mL)
      • action="Extract": extracted with dichloromethane (12 mL × 3)
      • action="Dry": dried over Na2SO4
      • action="Filter": filtered
      • action="Concentrate": concentrated
      • action="Purify": The crude product was further purified by column chromatography (SiO2, EtOAc–hexanes, 1 : 7; Rf 0.33)
      • action="Yield": to give 10 (308 mg, 1.02 mmol, 79%) as a colourless liquid
  • 16. The World we are heading into http://www.gartner.com/newsroom/id/3143521
  • 17. Our World is hyperconnected
  • 19. Data quality issues: Robochemistry; Proliferation of errors in public and private databases; Automated quality control system
  • 20. CVSP
  • 22. CVSP – issues review
  • 23. J. Brecher, IUPAC Graphical Representation of stereochem. configurations, Section: ST-1.1.10; DB06287
  • 28. info@openphactsfoundation.org @Open_PHACTS Open PHACTS Practical Semantics. OpenPHACTS: GlaxoSmithKline – Coordinator; Universität Wien – Managing entity; Technical University of Denmark; University of Hamburg, Center for Bioinformatics; BioSolveIT GmbH; Consorci Mar Parc de Salut de Barcelona; Leiden University Medical Centre; Royal Society of Chemistry; Vrije Universiteit Amsterdam; Novartis; Merck Serono; H. Lundbeck A/S; Eli Lilly; Netherlands Bioinformatics Centre; Swiss Institute of Bioinformatics; ConnectedDiscovery; EMBL-European Bioinformatics Institute; Janssen; Esteve; Almirall; OpenLink; Scibite; The Open PHACTS Foundation; Spanish National Cancer Research Centre; University of Manchester; Maastricht University; Aqnowledge; University of Santiago de Compostela; Rheinische Friedrich-Wilhelms-Universität Bonn; AstraZeneca; Pfizer
  • 29. Why is it so hard to…. Competitors? What’s the structure? Are they in our file? What’s similar? What’s the target? Pharmacology data? Known Pathways? Working On Now? Connections to disease? Expressed in right cell type? IP?
  • 30. @gray_alasdair Big Data Integration. Knowledge is federated
  • 35. Publishers - the guardians of knowledge This is a poster for Guardians of the Galaxy. The poster art copyright is believed to belong to the distributor of the Film, Walt Disney Studios Motion Pictures, the publisher, Marvel Studios, or the graphic artist.
  • 36. Data Publishing Original artist: Joseph Ferdinand Keppler (1838-1894) Restoration: Adam Cuerden - http://www.loc.gov/pictures/item/2011661385/ by way of http://adamcuerden.deviantart.com/gallery/#/d5onmxh
  • 37. The World we live in
  • 40. "Internet host count history". Internet Systems Consortium. Retrieved May 16, 2012.
  • 41. We are on the verge of a new technical revolution and it feels great to anticipate it and be ready to ride! Image from surfline.com by Mike Cianciulli
  • 43. Data Science @ RSC The team. From left to right: Valery Tkachenko and Alexey Pshenichnov, based in the United States; Aileen Day, based in Southampton; John Boyle, Peter Corbett, Colin Batchelor, Jeff White, Nicholas Bailey and Val the plant, based at TGH
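
Slides 13 and 14 describe a machine-readable reaction record: normalized amounts on the product and a list of procedure actions. As a minimal sketch of how such a record might be read, the Python below parses a hand-abridged fragment with the standard library's xml.etree.ElementTree. The product amounts and namespace URIs are copied from slide 13. The nesting of dl:reactionAction and dl:phraseText elements and the unqualified action attribute are assumptions inferred from slide 14, and treating compound 9 (1.29 mmol) as the limiting reagent is likewise an assumption.

```python
import xml.etree.ElementTree as ET

CML_NS = "http://www.xml-cml.org/schema"
DL_NS = "http://bitbucket.org/dan2097"

# Hand-abridged record. The <amount> elements are copied from slide 13;
# the layout of <dl:reactionActionList> is an ASSUMPTION based on slide 14.
RECORD = """
<reaction xmlns="http://www.xml-cml.org/schema"
          xmlns:dl="http://bitbucket.org/dan2097">
  <productList>
    <product role="product">
      <amount dl:propertyType="AMOUNT" dl:normalizedValue="0.00102">1.02 mmol</amount>
      <amount dl:propertyType="MASS" dl:normalizedValue="0.308">308 mg</amount>
      <amount dl:propertyType="PERCENTYIELD" dl:normalizedValue="79">79%</amount>
    </product>
  </productList>
  <dl:reactionActionList>
    <dl:reactionAction action="Stir">
      <dl:phraseText>The reaction mixture was stirred at −78 °C for another 2 h</dl:phraseText>
    </dl:reactionAction>
    <dl:reactionAction action="Heat">
      <dl:phraseText>warmed up to rt</dl:phraseText>
    </dl:reactionAction>
    <dl:reactionAction action="Quench">
      <dl:phraseText>quenched with methanol (3 mL) and citric acid(aq) (w/w, 10%, 5 mL)</dl:phraseText>
    </dl:reactionAction>
  </dl:reactionActionList>
</reaction>
"""


def product_amounts(reaction):
    """Map each dl:propertyType to its dl:normalizedValue as a float."""
    amounts = {}
    for amount in reaction.iter(f"{{{CML_NS}}}amount"):
        kind = amount.get(f"{{{DL_NS}}}propertyType")
        amounts[kind] = float(amount.get(f"{{{DL_NS}}}normalizedValue"))
    return amounts


def action_steps(reaction):
    """Return (action name, phrase text) pairs in document order."""
    steps = []
    for action in reaction.iter(f"{{{DL_NS}}}reactionAction"):
        phrase = action.find(f"{{{DL_NS}}}phraseText")
        text = phrase.text if phrase is not None else ""
        steps.append((action.get("action"), text))
    return steps


reaction = ET.fromstring(RECORD.strip())
amounts = product_amounts(reaction)

# 1.29 mmol of compound 9 is quoted in the paragraph text; using it as the
# limiting reagent is an assumption of this sketch.
LIMITING_MOL = 0.00129
calculated_yield = round(100 * amounts["AMOUNT"] / LIMITING_MOL, 1)

steps = action_steps(reaction)
print(f"calculated yield: {calculated_yield}%")
for name, text in steps:
    print(f"{name}: {text}")
```

Under those assumptions the recomputed yield agrees with the CALCULATEDPERCENTYIELD value of 79.1 shown on slide 13, which suggests the normalized AMOUNT values are expressed in mol.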

Editor's Notes

  1. What about science and chemistry in particular?
  2. Remember this: some of these questions are easier to answer than others
  3. Open PHACTS was developed to support the key questions of drug discovery
      • Business questions have been at the heart of Open PHACTS and have driven the development of the platform
      • Mx/psa, how calculated who did it? Mash up. With your data too, - top layer join together but need them all commercial
      • Data provided by many publishers
      • Originally in many formats: relational, SD files and RDF
      • Worked closely with publishers
      • Data licensing was a major issue
      • Over 5 billion triples – 14 datasets & growing
      • Hosted on beefy hardware; data in memory (aim)
      • Extensive memcaching
      • Pose complex queries to extract data