ChemSpider Reactions:
Delivering a free community resource of chemical syntheses

Valery Tkachenko, Colin Batchelor, Daniel Lowe, Ken Karapetyan, David Sharpe and Antony Williams
ACS New Orleans April 2013
Overview
•   Motivation
•   The RSC and chemical reaction data
•   New sources of chemical reaction data
•   ChemSpider Reactions: bringing it all together
•   Experiments with reaction classification
•   The National Chemical Database Service
Who needs another reaction database?
• Those who cannot afford to license access…
• Those who would like to access data that is
  not abstracted
• Those who might like to contribute data to a
  database
• Anybody wanting to integrate their own systems
  with the database and pull data out
RSC and chemical reaction data 1

Graphical abstracting journals:
Methods in Organic Synthesis (monthly, 1990 to present)
Catalysts and Catalysed Reactions (monthly, 2005 to present)

These constitute a backfile of over 50,000 novel reactions
RSC and chemical reaction data 2
RSC and chemical reaction data 3
New sources of reaction data



Daniel Lowe’s PhD thesis (Cantab, 2012) was on
extracting reactions from US patent data.
We can apply this technology to the RSC Journal
archive.
ChemSpider Reactions
  bringing it all together
http://csr.dev.rsc-us.org/


WORK IN PROGRESS
Reaction classification 1
Project Prospect has text-mined RSC journal
articles for named reactions and molecular
processes, annotated according to Creative
Commons-licensed ontologies:

See http://rxno.googlecode.com/
Reaction classification 2

Classification of Daniel’s US patent data
Reaction InChI
To do for reactions what InChI has done for
structures
• Think online searching
• Deduplication and linking

http://www-rinchi.ch.cam.ac.uk/help.html
Reaction InChI
Early work: RInChIs layered on to a few hundred thousand reactions
• RInChIs could not be generated for a few tens of thousands of
  reactions
• Reaction deduplication results differ depending on the
  algorithm used (GGA software versus RInChI)
• Under investigation
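
To make the deduplication idea concrete, here is a minimal sketch of identifier-based grouping. It is not the RInChI algorithm and not the GGA software: it only assumes that every reactant and product already carries a canonical identifier string (standard InChIs here, treated as opaque text), and it builds an order-independent key from them. The function names and the record identifiers such as "patent-1" are hypothetical.

```python
from collections import defaultdict

# Canonical identifiers, used purely as opaque strings in this sketch.
BUTADIENE = "InChI=1S/C4H6/c1-3-4-2/h3-4H,1-2H2"
ETHENE = "InChI=1S/C2H4/c1-2/h1-2H2"
CYCLOHEXENE = "InChI=1S/C6H10/c1-2-4-6-5-3-1/h1-2H,3-6H2"


def reaction_key(reactants, products):
    """Build a key that ignores the order and repetition of components."""
    return (tuple(sorted(set(reactants))), tuple(sorted(set(products))))


def group_duplicates(reactions):
    """Map each reaction key to the ids of the records that share it."""
    groups = defaultdict(list)
    for record_id, reactants, products in reactions:
        groups[reaction_key(reactants, products)].append(record_id)
    return dict(groups)


# Hypothetical records: the same reaction from two sources, written with
# the reactants in a different order, plus one unrelated record.
reactions = [
    ("patent-1", [BUTADIENE, ETHENE], [CYCLOHEXENE]),
    ("journal-7", [ETHENE, BUTADIENE], [CYCLOHEXENE]),
    ("patent-2", [ETHENE], [CYCLOHEXENE]),
]

groups = group_duplicates(reactions)
for key, record_ids in groups.items():
    print(len(record_ids), record_ids)
```

Two records collapse into one group only when their component identifiers match exactly, so any difference in how identifiers are generated (for example, in the treatment of agents or stoichiometry) changes the grouping. That is one way two deduplication methods can disagree on the same data.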
Other sources
ChemSpider SyntheticPages

•Electronic Lab Notebooks
•University repositories

Please send theses
What will ChemSpider Reactions serve?
• Chemical Database Service
• Linking back to original
  publications/supplementary data
• Underpinning other tools e.g. retrosynthetic
  analysis (depends on data quality and
  mapping)
Chemical Database Service
• National Chemical Database Service for UK academics
• Integrates commercial databases and services
• Chemicals, analytical data, prediction algorithms
• Development of data repository
ARChem from SimBioSys 1
Synthesis planning tool which performs rule- and precedent-based
retrosynthetic analysis back to commercially available starting
materials.
ARChem from SimBioSys 2
ARChem from SimBioSys 3
But what about data quality?
• Data validation and curation
  required
• Encouraging participation with
  Rewards and RECOGNITION
Manual curation
• Integrated commenting, curating and validation
  platform across ALL eScience and Publishing
  platforms
• All integrated to a central RSC profile and
  feeding the alt-metrics tools
The other kind of RDF (made-up example)
Chemical reactions are unusually well suited to representation as
events with participants (cf. Donald Davidson’s event semantics).

_:r1 a obo:RXNO_0000004 ;                          # Diels–Alder
    obo:has_participant_ceasing_to_exist _:m1 ;    # a diene
    obo:has_participant_ceasing_to_exist _:m2 ;    # an olefin
    obo:has_participant_starting_to_exist _:m3 .   # a substituted cyclohexene
_:m1 a <http://rdf.chemspider.com/233000> .
_:m2 a <http://rdf.chemspider.com/233001> .
_:m3 a <http://rdf.chemspider.com/233002> .
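
The event-style description above can be explored with very little machinery. The sketch below holds the same made-up statements as plain (subject, predicate, object) tuples and looks up what a reaction consumes and forms. It is an illustration only: it uses no RDF library, the helper names are hypothetical, and the property and class names are copied from the made-up example rather than checked against any published ontology.

```python
# Predicate names copied from the made-up example; "rdf:type" stands in
# for the Turtle keyword "a".
TYPE = "rdf:type"
CEASING = "obo:has_participant_ceasing_to_exist"
STARTING = "obo:has_participant_starting_to_exist"

# The made-up example as (subject, predicate, object) tuples.
triples = [
    ("_:r1", TYPE, "obo:RXNO_0000004"),
    ("_:r1", CEASING, "_:m1"),
    ("_:r1", CEASING, "_:m2"),
    ("_:r1", STARTING, "_:m3"),
    ("_:m1", TYPE, "<http://rdf.chemspider.com/233000>"),
    ("_:m2", TYPE, "<http://rdf.chemspider.com/233001>"),
    ("_:m3", TYPE, "<http://rdf.chemspider.com/233002>"),
]


def objects(graph, subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def describe_reaction(graph, reaction):
    """List the compound classes a reaction event consumes and forms."""
    def classes_of(nodes):
        return [c for node in nodes for c in objects(graph, node, TYPE)]

    return {
        "reaction_class": objects(graph, reaction, TYPE),
        "consumed": classes_of(objects(graph, reaction, CEASING)),
        "formed": classes_of(objects(graph, reaction, STARTING)),
    }


summary = describe_reaction(triples, "_:r1")
for role, values in summary.items():
    print(role, values)
```

Treating the reaction as an event node with typed participants means the same lookup works whether a reaction has one product or several, without changing the data layout.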
Questions?

E-mail: tkachenkov@rsc.org, batchelorc@rsc.org

Dev Dives: Streamline document processing with UiPath Studio Web
 
Streamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project SetupStreamlining Python Development: A Guide to a Modern Project Setup
Streamlining Python Development: A Guide to a Modern Project Setup
 
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptxVulnerability_Management_GRC_by Sohang Sengupta.pptx
Vulnerability_Management_GRC_by Sohang Sengupta.pptx
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Pigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food ManufacturingPigging Solutions in Pet Food Manufacturing
Pigging Solutions in Pet Food Manufacturing
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 
Vertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering TipsVertex AI Gemini Prompt Engineering Tips
Vertex AI Gemini Prompt Engineering Tips
 

ChemSpider reactions – delivering a free community resource of chemical syntheses

  • 1. ChemSpider Reactions: Delivering a free community resource of chemical syntheses. Valery Tkachenko, Colin Batchelor, Daniel Lowe, Ken Karapetyan, David Sharpe and Antony Williams. ACS New Orleans, April 2013.
  • 2. Overview
      • Motivation
      • The RSC and chemical reaction data
      • New sources of chemical reaction data
      • ChemSpider Reactions: bringing it all together
      • Experiments with reaction classification
      • The National Chemical Database Service
  • 3. Who needs another reaction database?
      • Those who cannot afford to license access…
      • Those who would like to access data that is not abstracted
      • Those who might like to contribute data to a database
      • Anybody wanting to integrate their systems and pull data out
  • 4. RSC and chemical reaction data 1
      Graphical abstracting journals:
      • Methods in Organic Synthesis (monthly, 1990 to present)
      • Catalysts and Catalysed Reactions (monthly, 2005 to present)
      These constitute a backfile of over 50,000 novel reactions.
  • 5. RSC and chemical reaction data 2
  • 6. RSC and chemical reaction data 3
  • 7. New sources of reaction data: Daniel Lowe’s PhD thesis (Cantab, 2012) was on extracting reactions from US patent data. We can apply this technology to the RSC Journal archive.
  • 8. ChemSpider Reactions: bringing it all together. http://csr.dev.rsc-us.org/ (WORK IN PROGRESS)
  • 9. Reaction classification 1: Project Prospect has text-mined RSC journal articles for named reactions and molecular processes, annotated according to Creative Commons-licensed ontologies. See http://rxno.googlecode.com/
  • 10. Reaction classification 2: Classification of Daniel Lowe’s US patent data
  • 11. Reaction InChI: to do for reactions what InChI has done for structures
      • Think online searching
      • Deduplication and linking
      http://www-rinchi.ch.cam.ac.uk/help.html
  • 12. Reaction InChI: early work, with RInChIs layered on to a few hundred thousand reactions
      • Not generated for a few tens of thousands of reactions
      • Reaction deduplication results differ based on algorithm (GGA software versus RInChI)
      • Under investigation
  • 13. Other sources
      • ChemSpider SyntheticPages
      • Electronic Lab Notebooks
      • University repositories
      Please send theses
  • 14. What will ChemSpider Reactions serve?
      • Chemical Database Service
      • Linking back to original publications/supplementary data
      • Underpinning other tools, e.g. retrosynthetic analysis (depends on data quality and mapping)
  • 15. Chemical Database Service
      • National Chemical Database Service for UK academics
      • Integrates commercial databases and services
      • Chemicals, analytical data, prediction algorithms
      • Development of data repository
  • 16. ARChem from SimBioSys 1: Synthesis planning tool which performs rule- and precedent-based retrosynthetic analysis back to commercially available starting materials.
  • 19. But what about data quality?
      • Data validation and curation required
      • Encouraging participation with Rewards and RECOGNITION
  • 20. Manual curation
      • Integrated commenting, curating and validation platform across ALL eScience and Publishing platforms
      • All integrated to a central RSC profile and feeding the alt-metrics tools
  • 21. The other kind of RDF (made-up example)
      Chemical reactions are unusually well-suited to representation. (Donald Davidson’s event semantics)

      _:r1 a obo:RXNO_0000004 ;                          # Diels–Alder
           obo:has_participant_ceasing_to_exist _:m1 ;   # a diene
           obo:has_participant_ceasing_to_exist _:m2 ;   # an olefin
           obo:has_participant_starting_to_exist _:m3 .  # a substituted cyclohexene
      _:m1 a <http://rdf.chemspider.com/233000> .
      _:m2 a <http://rdf.chemspider.com/233001> .
      _:m3 a <http://rdf.chemspider.com/233002> .
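
The made-up RDF example on slide 21 treats a reaction as an event whose participants either cease to exist (reactants) or start to exist (products). The following is a minimal sketch of how such triples might be generated, using only the Python standard library. It is not the ChemSpider Reactions implementation: the `Reaction` class and `to_turtle` function are hypothetical names introduced here, the `obo:` prefix expansion is an assumption (the slide does not declare its prefixes), and the class and property identifiers are copied from the slide, which itself labels them as made up.

```python
from dataclasses import dataclass
from typing import List

# Assumption: the usual OBO PURL base; the slide does not declare its prefixes.
OBO_PREFIX = "http://purl.obolibrary.org/obo/"


@dataclass
class Reaction:
    """Hypothetical container: a reaction class plus molecules consumed and produced."""
    reaction_class: str   # e.g. "obo:RXNO_0000004", as written on the slide
    consumed: List[str]   # IRIs of molecule classes that cease to exist
    produced: List[str]   # IRIs of molecule classes that start to exist


def to_turtle(reaction: Reaction, node: str = "r1") -> str:
    """Serialise one reaction as Turtle, using blank nodes as on the slide."""
    predicates = []
    types = []
    counter = 0
    roles = (("ceasing", reaction.consumed), ("starting", reaction.produced))
    for role, iris in roles:
        for iri in iris:
            counter += 1
            label = f"_:m{counter}"
            # Property names follow the slide's made-up example.
            predicates.append(f"obo:has_participant_{role}_to_exist {label}")
            types.append(f"{label} a <{iri}> .")
    body = [f"_:{node} a {reaction.reaction_class}"] + ["    " + p for p in predicates]
    lines = [f"@prefix obo: <{OBO_PREFIX}> .", ""]
    lines.append(" ;\n".join(body) + " .")
    lines.extend(types)
    return "\n".join(lines)


diels_alder = Reaction(
    reaction_class="obo:RXNO_0000004",
    consumed=[
        "http://rdf.chemspider.com/233000",  # a diene
        "http://rdf.chemspider.com/233001",  # an olefin
    ],
    produced=["http://rdf.chemspider.com/233002"],  # a substituted cyclohexene
)
turtle = to_turtle(diels_alder)
print(turtle)
```

Keeping reactants and products as separate lists mirrors the event-semantics framing on the slide: the reaction node carries one predicate per participant, and each participant is typed by a compound IRI.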