NINA JELIAZKOVA
IdeaConsult Ltd.
Sofia, Bulgaria
www.ideaconsult.net
On chemical structures,
substances,
nanomaterials
and measurements
Sharing experience
about:
OpenTox API and beyond
Chemical structures
Substance identity
Experimental data
challenges
Protocols
Nanomaterials
Final thoughts
I D E A C O N S U L T L T D . 2
CONTENT
• EC FP7 2008-2011 OpenTox
• Distributed framework for predictive
toxicology
• Building blocks: data, chemical
structures, algorithms and models.
• Build models, apply models, validate
models, access and query data in various
ways;
• Tech: REST API, RDF
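A minimal sketch of what "REST API, RDF" can mean for a client: resources such as datasets and models are addressed by URL, and the representation (for example RDF/XML) is chosen through the HTTP Accept header. The service root, the resource paths, the `dataset_uri` form parameter and the identifiers below are assumptions made for illustration, not details taken from the slides. The requests are only built, never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical service root; a real deployment has its own base URL.
BASE = "https://example.org/opentox"


def dataset_request(dataset_id: int) -> Request:
    """Build (but do not send) a GET request asking for a dataset as RDF/XML."""
    return Request(
        f"{BASE}/dataset/{dataset_id}",
        headers={"Accept": "application/rdf+xml"},
        method="GET",
    )


def prediction_request(model_id: int, dataset_uri: str) -> Request:
    """Build (but do not send) a POST that applies a model to a dataset URI."""
    # The parameter name "dataset_uri" is an assumption for this sketch.
    body = urlencode({"dataset_uri": dataset_uri}).encode("ascii")
    return Request(
        f"{BASE}/model/{model_id}",
        data=body,
        headers={"Accept": "text/uri-list"},
        method="POST",
    )


if __name__ == "__main__":
    get = dataset_request(33)  # illustrative identifier
    post = prediction_request(1, get.full_url)
    print(get.get_method(), get.full_url)
    print(post.get_method(), post.full_url, post.data.decode("ascii"))
```

The point of the sketch is that data and models are both plain web resources, so "apply a model to a dataset" reduces to passing one URL to another.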
DATASETS, MODELS
Open Melting Point Dataset #33
PREDICTIONS
31 May 2013: the REACH deadline for registering substances
[100 to 1000 tonnes per year]
http://ToxPredict.net access statistics
• AMBIT REST web services
  - OpenTox Application Programming Interface (API)
  - Dataset web services
  - Chemical search, data pooling, structure QA
  - Computational web services
  - Descriptor calculation, machine learning, structure optimisation, tautomers
  - Web applications using AMBIT REST web services
New 2014: Embeddable
JS widgets
AMBIT http://ambit.sf.net
DATA CURATION EXAMPLE (DIISONONYL PHTHALATE)
DATA CURATION EXAMPLE (RN 25155-25-3)
DATA CURATION EXAMPLE (RN 25155-25-3)
European Chemicals Agency (ECHA) registration dossier
SUBSTANCE IDENTITY IN REACH
• Guidance for identification and naming of
substances under REACH and CLP (118
pages)
• Substance characterization
“During the first 5 months of 2009, around 450 enquiries were received by
ECHA, 23% of which were rejected on the grounds that the dossiers were
incomplete (e.g. missing spectral data) or the substance identity had not been
sufficiently described.”
http://echa.europa.eu/documents/10162/13643/substance_id_en.pdf
“Only a limited number of tools are capable to provide easily accessible
data on substance identity, composition together with chemical structures
and high quality and detailed endpoint data”
SUBSTANCE IDENTITY/COMPOSITION
SUBSTANCE ENDPOINT DATA
OECD Harmonized templates
Well defined XML schema for > 100
endpoints
Experimental protocols:
OECD Guidelines
BioPortal ontologies coverage
of OECD guidelines: None
PROTOCOLS, SOP,
INVESTIGATIONS, STUDY, ASSAYS
SEP
COACH
Towards the replacement of in vivo
repeated dose systemic toxicity testing
SEURAT-1 ~ 70 research groups from European Universities, Public
Research Institutes and Companies (more than 30% SMEs)
http://www.seurat-1.eu/
http://toxbank.net/
FP7 Projects
G O A L S
Prediction of repeated dose toxicity
Shared repository of know-how and
experimental results
  - from SEURAT-1 research activities and relevant public sources
Examples include:
  - Protocol describing a method for long term maintenance of
    functional hepatocytes
  - Results from a repeated dose 14 day transcriptomics study using
    acetaminophen and iPS-derived hepatocytes
T E C H N O L O G I C A L
S O L U T I O N S
• REST Web services API
• Protocol service
• Investigation service
• RDF data model
• ISA-TAB & ontologies
• ISA-TAB converted to RDF
• Stored in a triple store
• Chemical search (AMBIT)
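The "ISA-TAB converted to RDF" step can be pictured with a small sketch: each row of a tab-delimited table becomes a set of subject/predicate/object statements that a triple store could load. This is a simplified illustration, not the ToxBank converter. The namespace `http://example.org/isa#`, the function name and the naive column-to-predicate mapping are assumptions, and the sample table contains made-up values.

```python
import csv
import io

# Hypothetical namespace; a real service would map columns to ontology terms.
NS = "http://example.org/isa#"


def rows_to_ntriples(tsv_text: str, subject_column: str) -> list[str]:
    """Turn each row of a tab-delimited table into N-Triples lines.

    The value in `subject_column` names the subject; every other non-empty
    cell becomes one triple with a plain string literal as the object.
    Spaces in names are replaced by underscores; no further IRI escaping
    is attempted in this sketch.
    """
    triples = []
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    for row in reader:
        subject = f"<{NS}{row[subject_column].replace(' ', '_')}>"
        for column, value in row.items():
            # Skip the subject column, empty cells and overflow fields.
            if column is None or column == subject_column or not value:
                continue
            predicate = f"<{NS}{column.replace(' ', '_')}>"
            literal = value.replace("\\", "\\\\").replace('"', '\\"')
            triples.append(f'{subject} {predicate} "{literal}" .')
    return triples


if __name__ == "__main__":
    # Made-up two-line table in the spirit of an ISA-TAB study file.
    table = "Sample Name\tProtocol REF\tFactor Value\nS1\tcell culture\t14 days\n"
    for line in rows_to_ntriples(table, "Sample Name"):
        print(line)
```

Once the rows are expressed as triples, the same store can answer questions across investigations without knowing the original spreadsheet layout.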
TOXBANK DATA WAREHOUSE
Challenges:
• Diverse data types
• Changing research protocols
• Data formatting is time consuming
• Data sharing - little incentive
FP7 ENANOMAPPER PROJECT
• Develop an ontology and database unifying
information about nanomaterial safety (in humans
and the environment)
• Cover the full lifecycle from manufacturing to
environmental decay or accumulation
• Pan-European project, 7 partners
• Ontology growth through community and re-use
NANOINFORMATICS CHALLENGES
• nanoSMILES
• nanoInChI
• Nanomaterial identity - only through characterisation
with multitude of experimental methods
• Experiments reproducibility; standards
• Experiments description (protocols, experimental
details)
• Models: structure based cheminformatics doesn’t
really work
• Common database? NO!
But Yes! for an integrated search across databases! (requirement analysis
feedback)
Nanomaterial “unique” challenge of identification?
NANOMATERIAL ENDPOINT DATA
• Same data model as for substances
(ISA-TAB inspired)
• NM specific measurement protocols
• Ontology support – under
development eNanoMapper WP2
(Janna Hastings, Egon Willighagen)
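One way to picture "same data model as for substances" is a single record type that holds either a chemical substance or a nanomaterial, with measurements grouped under the protocol that produced them. The sketch below is only an illustration under that assumption. The class and field names are hypothetical and are not the AMBIT or eNanoMapper schema, and the values in the usage example are placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class Measurement:
    endpoint: str  # what was measured, e.g. "particle size"
    value: float
    unit: str


@dataclass
class ProtocolApplication:
    protocol: str  # name of the guideline or SOP that was followed
    measurements: list[Measurement] = field(default_factory=list)


@dataclass
class Material:
    name: str
    kind: str  # "substance" or "nanomaterial"; same structure for both
    studies: list[ProtocolApplication] = field(default_factory=list)

    def endpoints(self) -> set[str]:
        """Return the distinct endpoints measured for this material."""
        return {m.endpoint for s in self.studies for m in s.measurements}


if __name__ == "__main__":
    # Placeholder values, for illustration only.
    nm = Material(
        name="example nanomaterial",
        kind="nanomaterial",
        studies=[
            ProtocolApplication(
                protocol="example sizing protocol",
                measurements=[Measurement("particle size", 1.0, "nm")],
            )
        ],
    )
    print(sorted(nm.endpoints()))
```

Keeping the protocol next to each measurement reflects the point made on the slides that a nanomaterial is identified only through how it was characterised.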
NANOMATERIAL SEARCH
LESSONS LEARNED
What is more difficult:
1. Succeed in implementing a “moving target” API
by a distributed team of developers.
2. Succeed in bringing together several wet lab
teams to use a common tool/ format for
preparing and sharing experimental data.
1. OpenTox: partners succeeded in creating 5 independent
implementations of the OpenTox API through “rough consensus and
running code”; most services are online and still in use 3 years after
the OpenTox project ended; the API is being used and extended in related
projects.
2. In ToxBank we’ve resorted to taking the role of “data managers” in the
SEURAT-1 cluster, a setup typical of most EU data projects.
WHY ARE DATA FORMATTING AND SHARING SO
DIFFICULT?
Thoughts about the technology aspects; not about the
incentives to share
• Data format – the more flexible the format is, the more
difficult is the data preparation;
• Tools typically need to understand both data modelling
and the experimental setup;
• Preparing and sharing data requires additional effort,
which is typically not within the scope of the research
projects;
• Typical setup is “data managers” or “Excel templates”
Compare with the ease of sharing, liking and tagging pictures on
social networks; liking and tagging essentially create semantic
knowledge!
GUESS THE AUTHOR
“This proposal concerns the management of
general information about experiments at ???.
It discusses the problems of loss of
information about complex evolving systems
and derives a solution based on a ???"
TIM BERNERS-LEE , 1989
“This proposal concerns the management of
general information about accelerators and
experiments at CERN.
It discusses the problems of loss of
information about complex evolving systems
and derives a solution based on a distributed
hypertext system."
http://www.w3.org/History/1989/proposal.html
Non-Centralisation
Information systems start small and grow. They also start
isolated and then merge. A new system must allow existing
systems to be linked together without requiring any
central control or coordination.
FINAL THOUGHTS
• Help researchers organize their own data locally;
• The cost of entering /recording data should be low;
• Easy to use tools;
• Formats – understandable or hidden behind user friendly
tools;
• Non-centralisation;
• Added value:
“The data-sharing environment must invite collaboration as well
as facilitate it. Stakeholders have broad interests that go beyond
retrieving existing data — they want to discover materials and
forecast enhanced products”
http://www.nature.com/news/technology-sharing-data-in-materials-science-1.14224
 
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
一比一原版兰加拉学院毕业证(Langara毕业证书)学历如何办理
 
Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024Open Source Contributions to Postgres: The Basics POSETTE 2024
Open Source Contributions to Postgres: The Basics POSETTE 2024
 
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...
 
Intelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicineIntelligence supported media monitoring in veterinary medicine
Intelligence supported media monitoring in veterinary medicine
 
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens""Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
"Financial Odyssey: Navigating Past Performance Through Diverse Analytical Lens"
 
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
一比一原版(UCSF文凭证书)旧金山分校毕业证如何办理
 

On chemical structures, substances, nanomaterials and measurements

  • 1. NINA JELIAZKOVA IdeaConsult Ltd. Sofia, Bulgaria www.ideaconsult.net On chemical structures, substances, nanomaterials and measurements
  • 2. CONTENT
      Sharing experience about:
      • OpenTox API and beyond
      • Chemical structures
      • Substance identity
      • Experimental data challenges
      • Protocols
      • Nanomaterials
      • Final thoughts
      OpenTox:
      • EC FP7 2008-2011 OpenTox
      • Distributed framework for predictive toxicology
      • Building blocks: data, chemical structures, algorithms and models
      • Build models, apply models, validate models, access and query data in various ways
      • Tech: REST API, RDF
  • 3. DATASETS, MODELS: Open Melting Point Dataset #33
  • 4. PREDICTIONS: http://ToxPredict.net access statistics. 31 May 2013: the REACH deadline for registering substances [100 to 1000 tonnes per year]
  • 5. AMBIT http://ambit.sf.net
      • AMBIT REST web services
          • OpenTox Application Programming Interface (API)
          • Dataset web services
          • Chemical search, data pooling, structure QA
          • Computational web services
          • Descriptor calculation, machine learning, structure optimisation, tautomers
      • Web applications using AMBIT REST web services
      • New 2014: embeddable JS widgets
  • 6. DATA CURATION EXAMPLE (DIISONONYL PHTHALATE)
  • 7. DATA CURATION EXAMPLE (RN 25155-25-3)
  • 8. DATA CURATION EXAMPLE (RN 25155-25-3): European Chemicals Agency registration dossier
  • 9. SUBSTANCE IDENTITY IN REACH
      • Guidance for identification and naming of substances under REACH and CLP (118 pages)
      • Substance characterization
      “During the first 5 months of 2009, around 450 enquiries were received by ECHA, 23% of which were rejected on the grounds that the dossiers were incomplete (e.g. missing spectral data) or the substance identity had not been sufficiently described.”
      http://echa.europa.eu/documents/10162/13643/substance_id_en.pdf
  • 10. SUBSTANCE IDENTITY/COMPOSITION
      “Only a limited number of tools are capable to provide easily accessible data on substance identity, composition together with chemical structures and high quality and detailed endpoint data”
  • 11. SUBSTANCE ENDPOINT DATA
      • OECD Harmonized templates
      • Well defined XML schema for > 100 endpoints
      • Experimental protocols: OECD Guidelines
      • BioPortal ontologies coverage of OECD guidelines: None
  • 12. PROTOCOLS, SOP, INVESTIGATIONS, STUDY, ASSAYS
      FP7 Projects: SEP COACH
      SEURAT-1: Towards the replacement of in vivo repeated dose systemic toxicity testing
      ~ 70 research groups from European universities, public research institutes and companies (more than 30% SMEs)
      http://www.seurat-1.eu/
      http://toxbank.net/
  • 13. TOXBANK DATA WAREHOUSE
      GOALS
      • Prediction of repeated dose toxicity
      • Shared repository of know-how and experimental results from SEURAT-1 research activities and relevant public sources
      • Examples include:
          • Protocol describing a method for long term maintenance of functional hepatocytes
          • Results from a repeated dose 14 day transcriptomics study using acetaminophen and iPS-derived hepatocytes
      TECHNOLOGICAL SOLUTIONS
      • REST Web services API
      • Protocol service
      • Investigation service
      • RDF data model
      • ISA-TAB & ontologies
      • ISA-TAB converted to RDF
      • Stored in a triple store
      • Chemical search (AMBIT)
      CHALLENGES
      • Diverse data types
      • Changing research protocols
      • Data formatting is time consuming
      • Data sharing: little incentive
  • 14. FP7 ENANOMAPPER PROJECT
      • Develop an ontology and database unifying information about nanomaterial safety (in humans and the environment)
      • Cover the full lifecycle from manufacturing to environmental decay or accumulation
      • Pan-European project, 7 partners
      • Ontology growth through community and re-use
  • 15. NANOINFORMATICS CHALLENGES
      Nanomaterial “unique” challenge of identification?
      • nanoSMILES
      • nanoInChI
      • Nanomaterial identity: only through characterisation with a multitude of experimental methods
      • Experiments reproducibility; standards
      • Experiments description (protocols, experimental details)
      • Models: structure based cheminformatics doesn’t really work
      • Common database? NO! But Yes! for an integrated search across databases! (requirement analysis feedback)
  • 16. NANOMATERIAL ENDPOINT DATA
      • Same data model as for substances (ISA-TAB inspired)
      • NM specific measurement protocols
      • Ontology support: under development in eNanoMapper WP2 (Janna Hastings, Egon Willighagen)
  • 17. NANOMATERIAL SEARCH
  • 18. LESSONS LEARNED
      What is more difficult:
      1. Succeeding in implementing a “moving target” API with a distributed team of developers.
      2. Succeeding in bringing together several wet lab teams to use a common tool/format for preparing and sharing experimental data.
      What happened:
      1. OpenTox: partners succeeded in creating 5 independent implementations of the OpenTox API through “rough consensus and running code”; most services are online and still in use 3 years after the OpenTox project was completed; the API is being used and extended in related projects.
      2. In ToxBank we resorted to taking the role of “data managers” in the SEURAT-1 cluster, a setup typical of most EU data projects.
  • 19. WHY IS DATA FORMATTING AND SHARING SO DIFFICULT?
      Thoughts about the technology aspects, not about the incentives to share:
      • Data format: the more flexible the format, the more difficult the data preparation
      • Tools typically need to understand both data modelling and the experimental setup
      • Preparing and sharing data requires additional effort, which is typically not within the scope of the research projects
      • Typical setup is “data managers” or “Excel templates”
      Compare with the ease of sharing, liking and tagging pictures on social networks; liking and tagging essentially creates semantic knowledge!
  • 20. GUESS THE AUTHOR
      “This proposal concerns the management of general information about experiments at ???. It discusses the problems of loss of information about complex evolving systems and derives a solution based on a ???"
  • 21. TIM BERNERS-LEE, 1989
      “This proposal concerns the management of general information about accelerators and experiments at CERN. It discusses the problems of loss of information about complex evolving systems and derives a solution based on a distributed hypertext system."
      http://www.w3.org/History/1989/proposal.html
      Non-Centralisation: Information systems start small and grow. They also start isolated and then merge. A new system must allow existing systems to be linked together without requiring any central control or coordination.
  • 22. FINAL THOUGHTS
      • Help researchers organize their own data locally
      • The cost of entering/recording data should be low
      • Easy to use tools
      • Formats: understandable, or hidden behind user friendly tools
      • Non-centralisation
      • Added value: “The data-sharing environment must invite collaboration as well as facilitate it. Stakeholders have broad interests that go beyond retrieving existing data – they want to discover materials and forecast enhanced products”
      http://www.nature.com/news/technology-sharing-data-in-materials-science-1.14224
  • 23. I D E A C O N S U L T L T D . 23
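
Slide 13 lists "ISA-TAB converted to RDF" and "Stored in a triple store" as part of the ToxBank setup. The slides do not show how that conversion is done, so the following is only a minimal sketch of the general idea, not the ToxBank implementation: it reads a tab-delimited, ISA-TAB style study table and emits N-Triples lines. The base URI, the property names (`derivedFrom`, `usesProtocol`) and the sample rows are hypothetical and chosen purely for illustration; only the column headers follow the usual ISA-TAB naming.

```python
import csv
import io
from urllib.parse import quote

# Hypothetical namespace, used only for this illustration.
BASE = "http://example.org/isatab-sketch/"

# Made-up study table with ISA-TAB style column headers.
SAMPLE_STUDY = "\n".join([
    "\t".join(["Source Name", "Protocol REF", "Sample Name", "Factor Value[dose]"]),
    "\t".join(["hepatocyte-1", "P-001", "sample-1", "10"]),
    "\t".join(["hepatocyte-1", "P-001", "sample-2", "50"]),
]) + "\n"


def uri(kind, name):
    """Build a URI reference in N-Triples syntax under the hypothetical BASE."""
    return f"<{BASE}{kind}/{quote(name, safe='')}>"


def literal(value):
    """Quote a plain string literal, escaping characters N-Triples reserves."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def isatab_rows_to_ntriples(text):
    """Turn each row of a tab-delimited study table into N-Triples lines."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    triples = []
    for row in reader:
        sample = uri("sample", row["Sample Name"])
        # Link the sample to its source material and to the protocol applied.
        triples.append(
            f"{sample} {uri('prop', 'derivedFrom')} {uri('source', row['Source Name'])} ."
        )
        triples.append(
            f"{sample} {uri('prop', 'usesProtocol')} {uri('protocol', row['Protocol REF'])} ."
        )
        # Every "Factor Value[x]" column becomes a literal-valued property.
        for column, value in row.items():
            if column.startswith("Factor Value[") and column.endswith("]"):
                factor = column[len("Factor Value["):-1]
                triples.append(f"{sample} {uri('factor', factor)} {literal(value)} .")
    return triples


if __name__ == "__main__":
    for line in isatab_rows_to_ntriples(SAMPLE_STUDY):
        print(line)
```

The resulting lines could be loaded into any triple store that accepts N-Triples. A real conversion would map columns to terms from shared ontologies rather than to ad hoc URIs, which is where the "ISA-TAB & ontologies" point on the same slide comes in.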

Editor's Notes

  1. http://www.thereachcentre.com/uploaded/whitepapers/Substance_Characterisation_An_Overview.pdf