SlideShare a Scribd company logo
1 of 53
Download to read offline
UK National Chemical Database
Service: An integration of
commercial and public chemistry
services to support chemists in
the United Kingdom
Antony Williams, Valery Tkachenko
and Richard Kidd
ACS Dallas
March 2014
UK Chemical Database Service
• The National Chemical Database Service is for
UK academics – see later for Rest of World
Vision for the Service PART 1
• Provide access to databases and services of
interest to the academic community to serve
their needs. Access to services to include:
• Crystallography data – Organic and inorganic
materials
• Thermophysical data
• Reactions Data including retrosynthetic analysis
• Prediction technologies – name generation,
physicochemical parameters, NMR prediction
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
Service Rollout
• Many services are hosted in the cloud
• Access through login/password, IP
authentication or Shibboleth authentication
• Lots of hard work in a very short time – so
much thanks to all of the service providers
• More providers stepped up to help –
ChemAxon
• Crystallography concern (understatement!)
Feedback from Community
• Converted initial public negativity spike on
Twitter pre-release to very positive feedback
post-release
• Training required – onsite training sessions
organized
• Available Chemicals Directory is big plus!
• Concerns with Retrosynthetic Analysis tool
Usage
• Majority of usage is for crystallography data –
previous provider had same bias
• Usage is increasing month-by-month
• Still way-under used and in many cases low
awareness
Vision for the Service PART 2
• Response to the call for proposals included
our vision for a 21st Century data repository
• At a time of Open Access, Open Data and
funding agency requirement to make data
public – build a data repository
• Funding is split for licensing content and
services (VAST MAJORITY) and some
funding for research and development
An Initial “Vague” Vision Set
• Manage “all” of the chemistry data associated
with chemical substances
• Data to be downloadable, reusable, interactive
• Build a platform that enables the scientist
• Data storage, validation, standardization and
curation
• Collaborative data sharing
• Provide data platform that can enable and
enhance publishing of scientific papers
Data Repository
• Registration of chemical compounds
• Deposition of chemical syntheses
• Addition of analytical data
• Integration to electronic notebooks
• Rewards and recognition for data sharing
• Document processing
• Hosting of data as private, embargoed or
public
What we will deliver for all data
• Simple interfaces for uploading of data
• Embeddable widgets and programming
interfaces to utilize in in-house systems, ELNs
• Automated harvesting approaches – sweeping
directories for data
• Data validation where possible
Input data pipeline
Deposition Gateway
Staging
databases
Compounds
Reactions
Spectra
Materials
Articles / CSSP
Compounds
Module
Spectra
Module
Reactions
Module
Materials
Module
Textmining
Module Module
Web UI for unified depositions
DropBox, Google Drive,
SkyDrive, etc
LabTroveand other templated
data
Documents
API, FTP, etc
Raw data Validated data
Staging
databases
Alldatabases are
sliced by data
sources/data
collections and
havesimple
security model
where each data
slice/sourceis
private, public or
embargoed
Compounds upload
• Draw chemicals in the interface (Javascript
editors – PC, Mac, Tablets, Phones)
• Drag and drop of compounds
• Automated generate of properties – Formulae,
Mw, Mi, physchem properties
• Metadata input forms
• Bulk upload
Depositions Gateway User
Interface
Depositions Gateway User
Interface
Chemical Validation and
Standardization
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
Reactions
• Hosting of reaction data – standard “document
formats” – full flexibility but limiting – extraction
of data from embedded objects
• Encourage template formats – using ELNs for
example, community agreed templates
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
Electronic Notebook Data
• Development work integrating chemistry into
the Southampton Labtrove notebook
• Stoichiometry table development
• Analytical data integration
• “ChemTrove” rolled out to a small test group
in January
Micropublishing Syntheses
ChemSpider SyntheticPages
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
Requirements
• Community agreement on acceptable
templates for CSSP/Reactions deposition
• Data Model deposition based on mappings
between template and CSSP model
• Adoption of Labtrove interface for deposition
What we will deliver
• Micropublishing platform for submission of
• Protocols and Procedures
• Reactions
• Safety and Hazard data (LATER)
• Template-based submissions of procedures
• Matched to ELN submissions
• Full details for user submission versus
mapped submission into database
Reaction Deposition/Validation
Reaction Deposition/Validation
Spectral Data
• Support for “structure identification” is a must
– “greatest value” for reference and lookup
• Support for data standards primarily – JCAMP,
mzML, SPC
• Want to support ASSIGNED data formats
• Hold binary files but prefer standards – WHY?
Raw Spectral Data
10 years from now…
• Binary file formats generally need original
data processing software to deal with them –
from Bruker, Agilent, Jeol, Thermo, Waters,
blah, blah, blah, blah,…
• While we can store the original raw data files
for posterity should we? This has been one
focus for data repositories
This is way more useful
Processed data…
Spectral searching is made possible
Spectral matching is possible
This is what we really want…
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
Addition of Analytical Data
• Spectral Container is in development using
componentized widgets for display
• NIST spectra converted into standardized
JCAMP format for deposition - 296,103
spectra deposited
• 10% of remaining NIST spectra need to be
curated as there are obvious structure issues
Javascript viewer NMR, MS, IR
Depositions Gateway User
Interface
Document processing
Depositions Gateway User
Interface
User Interface Approach
Compounds Reactions Spectra Materials Documents
Compounds
API
Reactions
API
Spectra
API
Materials
API
Documents
API
Compounds
Widgets
Reactions
Widgets
Spectra
Widgets
Materials
Widgets
Documents
Widgets
Data tier
Data access
tier
User
interface
components
tier
Analytical Laboratory application
User
interface tier
(examples) Electronic Laboratory Notebook
Paid 3rd
party integrations (various platforms – SharePoint, Google, etc)
Chemical Inventory application
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom
User Interface Approach
Compounds Reactions Spectra Materials Documents
Compounds
API
Reactions
API
Spectra
API
Materials
API
Documents
API
Compounds
Widgets
Reactions
Widgets
Spectra
Widgets
Materials
Widgets
Documents
Widgets
Data tier
Data access
tier
User
interface
components
tier
Analytical Laboratory application
User
interface tier
(examples) Electronic Laboratory Notebook
Paid 3rd
party integrations (various platforms – SharePoint, Google, etc)
Chemical Inventory application
Analytical Chemist
Characterize
Measure
Search
Store
<<include>>
<<include>>
<<include>>
Synthetic Chemist
Search
(synthetic procedure)
Document
(publish synthetic procedure)
Retrosynthetic analysis
Medicinal Chemist
Search
(against database of properties)
Source
(find vendor)
Analyse
(cluster, dock, screen)
Computational Chemist
Search or Develop algorithm
Store results
Run calculations
Synthesize
Measure activity
Present activities for ACS Fall
• Deposition process development of compounds,
reactions and spectral data by end of Spring
• FTP, DropBox, Web-upload, ELN integration
• Compounds, Reactions, Spectral data search,
display, download
• Data sharing – private, public, collaborative
• Metadata, metadata, metadata standards!
• Open Sourcing Chemical Registry System
including CVSP
UK Chemical Database Service
• The National Chemical Database Service is for
UK academics
• What would be necessary to make this
available for “Rest of World”, a single
institution, an organization?
• It’s not really technology…that’s scale out and
can be handled
• It’s negotiation with database providers,
pricing, login/authentication, localization?
Acknowledgments
• Jeremy Frey and Simon Coles, University of
Southampton
• Will Dichtel and Leah McEwan, Cornell
University
• Stuart Chalk, University of North Florida
• Bob Hanson and Bob Lancashire, Jmol and
JSpecView
Thank you
Email: williamsa@rsc.org
ORCID: 0000-0002-2668-4821
Twitter: @ChemConnector
Personal Blog: www.chemconnector.com
SLIDES: www.slideshare.net/AntonyWilliams

More Related Content

What's hot

ACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectStuart Chalk
 
Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...Sean Ekins
 
Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Valery Tkachenko
 
ACS 248th Paper 136 JSmol/JSpecView Eureka Integration
ACS 248th Paper 136 JSmol/JSpecView Eureka IntegrationACS 248th Paper 136 JSmol/JSpecView Eureka Integration
ACS 248th Paper 136 JSmol/JSpecView Eureka IntegrationStuart Chalk
 
Building a semantic chemistry platform with the royal society of chemistry
Building a semantic chemistry platform with the royal society of chemistryBuilding a semantic chemistry platform with the royal society of chemistry
Building a semantic chemistry platform with the royal society of chemistryValery Tkachenko
 
Building a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP ProjectBuilding a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP ProjectStuart Chalk
 
Publishing data and code openly
Publishing data and code openlyPublishing data and code openly
Publishing data and code openlyFAIRDOM
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)Carole Goble
 
The application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsThe application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsValery Tkachenko
 

What's hot (20)

Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
 
Hosting a compound centric community resource for chemistry data
Hosting a compound centric community resource for chemistry dataHosting a compound centric community resource for chemistry data
Hosting a compound centric community resource for chemistry data
 
Cheminformatics and the Structure Elucidation of Natural Products
Cheminformatics and the Structure Elucidation of Natural ProductsCheminformatics and the Structure Elucidation of Natural Products
Cheminformatics and the Structure Elucidation of Natural Products
 
ACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP ProjectACS 248th Paper 71 ChAMP Project
ACS 248th Paper 71 ChAMP Project
 
Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...Acs collaborative computational technologies for biomedical research an enabl...
Acs collaborative computational technologies for biomedical research an enabl...
 
Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...Dealing with the complex challenge of managing diverse analytical chemistry d...
Dealing with the complex challenge of managing diverse analytical chemistry d...
 
eScience Resources for the Chemistry Community from the Royal Society of Chem...
eScience Resources for the Chemistry Community from the Royal Society of Chem...eScience Resources for the Chemistry Community from the Royal Society of Chem...
eScience Resources for the Chemistry Community from the Royal Society of Chem...
 
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...Activities at the Royal Society of Chemistry to gather, extract and analyze b...
Activities at the Royal Society of Chemistry to gather, extract and analyze b...
 
Open innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts projectOpen innovation contributions from RSC resulting from the Open Phacts project
Open innovation contributions from RSC resulting from the Open Phacts project
 
Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...Clustering the royal society of chemistry chemical repository to enable enhan...
Clustering the royal society of chemistry chemical repository to enable enhan...
 
Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...Serving the medicinal chemistry community with Royal Society of Chemistry che...
Serving the medicinal chemistry community with Royal Society of Chemistry che...
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
ACS 248th Paper 136 JSmol/JSpecView Eureka Integration
ACS 248th Paper 136 JSmol/JSpecView Eureka IntegrationACS 248th Paper 136 JSmol/JSpecView Eureka Integration
ACS 248th Paper 136 JSmol/JSpecView Eureka Integration
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
 
Building a semantic chemistry platform with the royal society of chemistry
Building a semantic chemistry platform with the royal society of chemistryBuilding a semantic chemistry platform with the royal society of chemistry
Building a semantic chemistry platform with the royal society of chemistry
 
Building a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP ProjectBuilding a Standard for Standards: The ChAMP Project
Building a Standard for Standards: The ChAMP Project
 
Publishing data and code openly
Publishing data and code openlyPublishing data and code openly
Publishing data and code openly
 
Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...Incorporating new technologies and High Throughput Screening in the design an...
Incorporating new technologies and High Throughput Screening in the design an...
 
FAIR Data and Model Management for Systems Biology (and SOPs too!)
FAIR Data and Model Management for Systems Biology(and SOPs too!)FAIR Data and Model Management for Systems Biology(and SOPs too!)
FAIR Data and Model Management for Systems Biology (and SOPs too!)
 
The application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platformsThe application of cloud computing to royal society of chemistry data platforms
The application of cloud computing to royal society of chemistry data platforms
 

Similar to The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom

Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Ken Karapetyan
 
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platformsChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platformsKen Karapetyan
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineKen Karapetyan
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryDr. Haxel Consult
 
Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Ann-Marie Roche
 
How to expose research data in EOSC
How to expose research data in EOSCHow to expose research data in EOSC
How to expose research data in EOSCEUDAT
 
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...OSTHUS
 
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...Yongyao Jiang
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Vivien Bonazzi
 
Data munging and analysis
Data munging and analysisData munging and analysis
Data munging and analysisRaminder Singh
 
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesCrossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesIsuru Suriarachchi
 
BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...
BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...
BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...Lisette Giepmans
 
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...OSTHUS
 

Similar to The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom (20)

Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...Royal society of chemistry activities to develop a data repository for chemis...
Royal society of chemistry activities to develop a data repository for chemis...
 
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platformsChemSpider – disseminating data and enabling an abundance of chemistry platforms
ChemSpider – disseminating data and enabling an abundance of chemistry platforms
 
Dealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data onlineDealing with the complex challenge of managing diverse chemistry data online
Dealing with the complex challenge of managing diverse chemistry data online
 
Marrying ACDLabs technologies to eScience Projects at the Royal Society of C...
Marrying ACDLabs technologies to eScience Projects at the  Royal Society of C...Marrying ACDLabs technologies to eScience Projects at the  Royal Society of C...
Marrying ACDLabs technologies to eScience Projects at the Royal Society of C...
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of ChemistryICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
ICIC 2013 Conference Proceedings Antony Williams Royal Society of Chemistry
 
Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_Reaxys rmc unified platform_ webinar_
Reaxys rmc unified platform_ webinar_
 
How to expose research data in EOSC
How to expose research data in EOSCHow to expose research data in EOSC
How to expose research data in EOSC
 
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...
Allotrope Foundation & OSTHUS at SmartLab Exchange 2015: Update on the Allotr...
 
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
MUDROD - Mining and Utilizing Dataset Relevancy from Oceanographic Dataset Me...
 
Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2Bonazzi commons bd2 k ahm 2016 v2
Bonazzi commons bd2 k ahm 2016 v2
 
Psicquic
PsicquicPsicquic
Psicquic
 
iMicrobe_ASLO_2015
iMicrobe_ASLO_2015iMicrobe_ASLO_2015
iMicrobe_ASLO_2015
 
Data munging and analysis
Data munging and analysisData munging and analysis
Data munging and analysis
 
JOSA TechTalk: Metadata Management
in Big Data
JOSA TechTalk: Metadata Management
in Big DataJOSA TechTalk: Metadata Management
in Big Data
JOSA TechTalk: Metadata Management
in Big Data
 
eScience at the Royal Society of Chemistry and our current initiatives
eScience at the Royal Society of Chemistry and our current initiativeseScience at the Royal Society of Chemistry and our current initiatives
eScience at the Royal Society of Chemistry and our current initiatives
 
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data LakesCrossing Analytics Systems: Case for Integrated Provenance in Data Lakes
Crossing Analytics Systems: Case for Integrated Provenance in Data Lakes
 
BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...
BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...
BioSHaRE: Opal and Mica: a software suite for data harmonization and federati...
 
Delivering chemical-associated data via EPA web applications
Delivering chemical-associated data via EPA web applicationsDelivering chemical-associated data via EPA web applications
Delivering chemical-associated data via EPA web applications
 
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...Revolutionizing Laboratory  Instrument Data for the  Pharmaceutical Industry:...
Revolutionizing Laboratory Instrument Data for the Pharmaceutical Industry:...
 

Recently uploaded

Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.IPLOOK Networks
 
EMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarEMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarThousandEyes
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNeo4j
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Muhammad Tiham Siddiqui
 
Flow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameFlow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameKapil Thakar
 
The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)codyslingerland1
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)IES VE
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch TuesdayIvanti
 
Scenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosScenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosErol GIRAUDY
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024Brian Pichman
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTopCSSGallery
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxNeo4j
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveIES VE
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxSatishbabu Gunukula
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and businessFrancesco Corti
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfCheryl Hung
 
.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptxHansamali Gamage
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc
 
Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...DianaGray10
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox
 

Recently uploaded (20)

Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.Introduction - IPLOOK NETWORKS CO., LTD.
Introduction - IPLOOK NETWORKS CO., LTD.
 
EMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? WebinarEMEA What is ThousandEyes? Webinar
EMEA What is ThousandEyes? Webinar
 
Novo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4jNovo Nordisk's journey in developing an open-source application on Neo4j
Novo Nordisk's journey in developing an open-source application on Neo4j
 
Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)Trailblazer Community - Flows Workshop (Session 2)
Trailblazer Community - Flows Workshop (Session 2)
 
Flow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First FrameFlow Control | Block Size | ST Min | First Frame
Flow Control | Block Size | ST Min | First Frame
 
The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)The New Cloud World Order Is FinOps (Slideshow)
The New Cloud World Order Is FinOps (Slideshow)
 
The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)The Importance of Indoor Air Quality (English)
The Importance of Indoor Air Quality (English)
 
March Patch Tuesday
March Patch TuesdayMarch Patch Tuesday
March Patch Tuesday
 
Scenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenariosScenario Library et REX Discover industry- and role- based scenarios
Scenario Library et REX Discover industry- and role- based scenarios
 
AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024AI Workshops at Computers In Libraries 2024
AI Workshops at Computers In Libraries 2024
 
Top 10 Squarespace Development Companies
Top 10 Squarespace Development CompaniesTop 10 Squarespace Development Companies
Top 10 Squarespace Development Companies
 
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptxGraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
GraphSummit Copenhagen 2024 - Neo4j Vision and Roadmap.pptx
 
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES LiveKeep Your Finger on the Pulse of Your Building's Performance with IES Live
Keep Your Finger on the Pulse of Your Building's Performance with IES Live
 
Oracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptxOracle Database 23c Security New Features.pptx
Oracle Database 23c Security New Features.pptx
 
From the origin to the future of Open Source model and business
From the origin to the future of  Open Source model and businessFrom the origin to the future of  Open Source model and business
From the origin to the future of Open Source model and business
 
Key Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdfKey Trends Shaping the Future of Infrastructure.pdf
Key Trends Shaping the Future of Infrastructure.pdf
 
.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx.NET 8 ChatBot with Azure OpenAI Services.pptx
.NET 8 ChatBot with Azure OpenAI Services.pptx
 
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie WorldTrustArc Webinar - How to Live in a Post Third-Party Cookie World
TrustArc Webinar - How to Live in a Post Third-Party Cookie World
 
Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...Explore the UiPath Community and ways you can benefit on your journey to auto...
Explore the UiPath Community and ways you can benefit on your journey to auto...
 
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through TokenizationStobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
Stobox 4: Revolutionizing Investment in Real-World Assets Through Tokenization
 

The UK National Chemical Database Service – an integration of commercial and public chemistry services to support chemists in the United Kingdom

  • 1. UK National Chemical Database Service: An integration of commercial and public chemistry services to support chemists in the United Kingdom Antony Williams, Valery Tkachenko and Richard Kidd ACS Dallas March 2014
  • 2. UK Chemical Database Service • The National Chemical Database Service is for UK academics – see later for Rest of World
  • 3. Vision for the Service PART 1 • Provide access to databases and services of interest to the academic community to serve their needs. Access to services to include: • Crystallography data – Organic and inorganic materials • Thermophysical data • Reactions Data including retrosynthetic analysis • Prediction technologies – name generation, physicochemical parameters, NMR prediction
  • 5. Service Rollout • Many services are hosted in the cloud • Access through login/password, IP authentication or Shibboleth authentication • Lots of hard work in a very short time – so much thanks to all of the service providers • More providers stepped up to help – ChemAxon • Crystallography concern (understatement!)
  • 6. Feedback from Community • Converted initial public negativity spike on Twitter pre-release to very positive feedback post-release • Training required – onsite training sessions organized • Available Chemicals Directory is big plus! • Concerns with Retrosynthetic Analysis tool
  • 7. Usage • Majority of usage is for crystallography data – previous provider had same bias • Usage is increasing month-by-month • Still way-under used and in many cases low awareness
  • 8. Vision for the Service PART 2 • Response to the call for proposals included our vision for a 21st Century data repository • At a time of Open Access, Open Data and funding agency requirement to make data public – build a data repository • Funding is split for licensing content and services (VAST MAJORITY) and some funding for research and development
  • 9. An Initial “Vague” Vision Set • Manage “all” of the chemistry data associated with chemical substances • Data to be downloadable, reusable, interactive • Build a platform that enables the scientist • Data storage, validation, standardization and curation • Collaborative data sharing • Provide data platform that can enable and enhance publishing of scientific papers
  • 10. Data Repository • Registration of chemical compounds • Deposition of chemical syntheses • Addition of analytical data • Integration to electronic notebooks • Rewards and recognition for data sharing • Document processing • Hosting of data as private, embargoed or public
  • 11. What we will deliver for all data • Simple interfaces for uploading of data • Embeddable widgets and programming interfaces to utilize in in-house systems, ELNs • Automated harvesting approaches – sweeping directories for data • Data validation where possible
  • 12. Input data pipeline Deposition Gateway Staging databases Compounds Reactions Spectra Materials Articles / CSSP Compounds Module Spectra Module Reactions Module Materials Module Textmining Module Module Web UI for unified depositions DropBox, Google Drive, SkyDrive, etc LabTroveand other templated data Documents API, FTP, etc Raw data Validated data Staging databases Alldatabases are sliced by data sources/data collections and havesimple security model where each data slice/sourceis private, public or embargoed
  • 13. Compounds upload • Draw chemicals in the interface (Javascript editors – PC, Mac, Tablets, Phones) • Drag and drop of compounds • Automated generate of properties – Formulae, Mw, Mi, physchem properties • Metadata input forms • Bulk upload
  • 18. Reactions • Hosting of reaction data – standard “document formats” – full flexibility but limiting – extraction of data from embedded objects • Encourage template formats – using ELNs for example, community agreed templates
  • 22. Electronic Notebook Data • Development work integrating chemistry into the Southampton Labtrove notebook • Stoichiometry table development • Analytical data integration • “ChemTrove” rolled out to a small test group in January
  • 26. Requirements • Community agreement on acceptable templates for CSSP/Reactions deposition • Data Model deposition based on mappings between template and CSSP model • Adoption of Labtrove interface for deposition
  • 27. What we will deliver • Micropublishing platform for submission of • Protocols and Procedures • Reactions • Safety and Hazard data (LATER) • Template-based submissions of procedures • Matched to ELN submissions • Full details for user submission versus mapped submission into database
  • 30. Spectral Data • Support for “structure identification” is a must – “greatest value” for reference and lookup • Support for data standards primarily – JCAMP, mzML, SPC • Want to support ASSIGNED data formats • Hold binary files but prefer standards – WHY?
  • 32. 10 years from now… • Binary file formats generally need original data processing software to deal with them – from Bruker, Agilent, Jeol, Thermo, Waters, blah, blah, blah, blah,… • While we can store the original raw data files for posterity should we? This has been one focus for data repositories
  • 33. This is way more useful
  • 34. Processed data… Spectral searching is made possible Spectral matching is possible
  • 35. This is what we really want…
  • 38. Addition of Analytical Data • Spectral Container is in development using componentized widgets for display • NIST spectra converted into standardized JCAMP format for deposition - 296,103 spectra deposited • 10% of remaining NIST spectra need to be curated as there are obvious structure issues
  • 43. User Interface Approach Compounds Reactions Spectra Materials Documents Compounds API Reactions API Spectra API Materials API Documents API Compounds Widgets Reactions Widgets Spectra Widgets Materials Widgets Documents Widgets Data tier Data access tier User interface components tier Analytical Laboratory application User interface tier (examples) Electronic Laboratory Notebook Paid 3rd party integrations (various platforms – SharePoint, Google, etc) Chemical Inventory application
  • 47. User Interface Approach Compounds Reactions Spectra Materials Documents Compounds API Reactions API Spectra API Materials API Documents API Compounds Widgets Reactions Widgets Spectra Widgets Materials Widgets Documents Widgets Data tier Data access tier User interface components tier Analytical Laboratory application User interface tier (examples) Electronic Laboratory Notebook Paid 3rd party integrations (various platforms – SharePoint, Google, etc) Chemical Inventory application
  • 49. Medicinal Chemist Search (against database of properties) Source (find vendor) Analyse (cluster, dock, screen) Computational Chemist Search or Develop algorithm Store results Run calculations Synthesize Measure activity
  • 50. Present activities for ACS Fall • Deposition process development of compounds, reactions and spectral data by end of Spring • FTP, DropBox, Web-upload, ELN integration • Compounds, Reactions, Spectral data search, display, download • Data sharing – private, public, collaborative • Metadata, metadata, metadata standards! • Open Sourcing Chemical Registry System including CVSP
  • 51. UK Chemical Database Service • The National Chemical Database Service is for UK academics • What would be necessary to make this available for “Rest of World”, a single institution, an organization? • It’s not really technology…that’s scale out and can be handled • It’s negotiation with database providers, pricing, login/authentication, localization?
  • 52. Acknowledgments • Jeremy Frey and Simon Coles, University of Southampton • Will Dichtel and Leah McEwan, Cornell University • Stuart Chalk, University of North Florida • Bob Hanson and Bob Lancashire, Jmol and JSpecView
  • 53. Thank you Email: williamsa@rsc.org ORCID: 0000-0002-2668-4821 Twitter: @ChemConnector Personal Blog: www.chemconnector.com SLIDES: www.slideshare.net/AntonyWilliams