Representing and integrating engineered nanomaterial (ENM) data is challenging, mainly because of data complexity and provenance. We have recently described the eNanoMapper database [doi:10.1109/BIBM.2014.699936] as part of the computational infrastructure for toxicological data management of ENM, developed within the EU FP7 eNanoMapper project. The ontology-supported data model is based on an exhaustive review of existing nano-related data models, databases, and nanomaterial-related entries in chemical and toxicogenomic databases. We demonstrate how this approach provides common ground for integrating data represented in diverse formats (ISA-TAB, OECD HT, custom RDF and a set of spreadsheet templates used by the EU NanoSafety Cluster projects) and enables a uniform approach to importing, storing and searching ENM physicochemical measurements and biological assay results. A configurable parser imports data stored in spreadsheet templates and accommodates differences in how the data are organised. The configuration metadata is defined in a separate file, which maps the spreadsheet onto the internal data model. The demonstration data provided by eNanoMapper partners ((i) NanoWiki, (ii) a literature dataset on protein coronas and (iii) the ModNanoTox project dataset consisting of 86 assays and 100 different endpoints) illustrate the capability of the associated REST API to support a variety of tests and endpoints recommended by the OECD Working Party on Manufactured Nanomaterials. The API is tightly integrated with a chemical structure search, which makes it possible to highlight whether a structure functions as a core, coating or functionalisation. Through programmatic interaction, the REST API enables graphical summaries of the data and integration into applications such as NanoQSAR modelling.
The eNanoMapper database for nanomaterial safety information: storage and query
1. The eNanoMapper database for nanomaterial safety information: storage and query
Nina Jeliazkova1, Nikolay Kochev2, David Vorgrimmler3, Janna Hastings4, Vedrin Jeliazkov1
1 Ideaconsult Ltd, 4 Angel Kanchev Str., 1000 Sofia, Bulgaria
2 University of Plovdiv, Dep. of Analytical and Computer Chemistry, 24 Tsar Assen Str., Plovdiv, Bulgaria
3 in silico toxicology GmbH, 41 Rastatterstr, 4057 Basel, Switzerland
4 EMBL-EBI, Hinxton, United Kingdom
NINA JELIAZKOVA
IdeaConsult Ltd.
Sofia, Bulgaria
www.ideaconsult.net
9 September 2015
2. EU FP7 eNanoMapper project
eNanoMapper - A Database and Ontology Framework for Nanomaterials Design and Safety Assessment
• Duration: 36 months (1 Feb 2014 – 31 Jan 2017)
• Pan-European project, 8 partners
• Objective: Safety by design
• Develop an ontology and database unifying information about nanomaterial safety (in humans and the environment)
• Cover the full lifecycle from manufacturing to environmental decay or accumulation
• Ontology growth through community and re-use
Grant Agreement: 604134
4. NanoSafety Cluster Database Survey
http://www.nanosafetycluster.eu/
• Nanomaterial Biological Interactions Knowledgebase http://nbi.oregonstate.edu/
• DaNa www.nanoobjects.info
• Crystallography Open Database http://www.crystallography.net
• ANSI NSP Nanotechnology Standards Database http://nanostandards.ansi.org/
• ArrayExpress Functional Genomics Data http://www.ebi.ac.uk/arrayexpress/
• cancer Nanotechnology Laboratory https://cananolab.nci.nih.gov/caNanoLab/
• Consumer Products Inventory http://www.nanotechproject.org/cpi/
• Nanomaterials Database http://www.nanowerk.com
• Nano Material Registry www.nanomaterialregistry.org
• Nanoparticle Information Library (NIL) nanoparticlelibrary.net
• PubChem https://pubchem.ncbi.nlm.nih.gov/
5. eNanoMapper DB review (Q1 2014)
• 104 potential data sources.
• A subset of 34 were publicly available online.
• Most of these sources don’t provide machine-readable data
• Simple web pages: 18
• PDF documents: 10
• Excel tables: 3
• Database dumps: 3
• ISA-Tab-Nano format: 1
• IUCLID5 format: 1
• Semantic MediaWiki: 1
• Programmatic access through a publicly available API: 4
• Only one source distinguishes between raw and processed data and provides access to both types of data.
Contributed ~30 entries to the
Spring 2014 NSC Database Survey
7. Existing data models
• BioAssay Ontology
• CODATA Uniform Description System for Materials at the Nanoscale
• ISA-TAB & ISA-TAB Nano
• OECD Harmonized Templates (http://iuclid.eu)
8. Chemical/Toxicogenomics DB (no explicit NM support)
NM: Carbon nanotube assays
>200 fullerenes; metal oxides; silver nanoparticles; colloidal gold nanoparticles, etc.
NM: Fullerenes, metal oxides
Gene expression data
NM: carbon nanotubes, quantum dots, graphene oxide, zinc oxide, silver and gold nanoparticles.
Comparative Toxicogenomics Database
Includes nanomaterial related data.
The ECHA Dissemination site. Registered chemical substances under REACH, including NM.
11. Nanomaterial database challenges
• Elucidating the data model is difficult
• Making the data model universal is difficult
• Reasons:
– Material
• Uniqueness
– Experimental data
• Complexity
– Modelling
• Different requirements
Analogy: Chemical structures DB
– Chemical structure and properties
• Inappropriate data model.
Instead:
– Substances - measured properties
– Structures - calculated properties.
• Substances composition
– Constituents, impurities, additives
• Nanomaterials
– Core, coating(s), linkage, etc.
– Functionalisation, nanocomposites
– Also impurities
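The distinction drawn on this slide (substances carry measured properties and have a composition whose components play roles such as core or coating) can be sketched as a small data structure. This is only an illustration under stated assumptions: the class names, the role vocabulary and the example values are hypothetical and are not taken from the eNanoMapper schema.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical role vocabulary, assembled from the terms on the slide.
ROLES = {
    "constituent", "impurity", "additive",           # generic substances
    "core", "coating", "linkage", "functionalisation"  # nanomaterials
}


@dataclass
class Component:
    """One chemical entity in a substance's composition, with its role."""
    name: str
    role: str

    def __post_init__(self):
        if self.role not in ROLES:
            raise ValueError(f"unknown role: {self.role}")


@dataclass
class Measurement:
    """A measured property, attached to the substance (not to a structure)."""
    endpoint: str
    value: float
    unit: str


@dataclass
class Substance:
    name: str
    composition: List[Component] = field(default_factory=list)
    measurements: List[Measurement] = field(default_factory=list)

    def components_with_role(self, role: str) -> List[str]:
        """Return the names of all components that play the given role."""
        return [c.name for c in self.composition if c.role == role]


# Illustrative record; the material and the numbers are made up.
particle = Substance(
    name="example coated nanoparticle",
    composition=[
        Component("Ag", "core"),
        Component("polyvinylpyrrolidone", "coating"),
        Component("Na", "impurity"),
    ],
    measurements=[Measurement("particle size", 20.0, "nm")],
)
```

Keeping measurements on the substance and roles on the components is what allows a query such as `particle.components_with_role("coating")` without conflating a chemical structure with the material it is part of.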
16. Data upload (has to be easy!)
W3C RDF (linked data)
IUCLID5 *.i5z (OECD HT)
Spreadsheets (Excel, CSV)
ISA-TAB-Nano (under development)
17. Spreadsheet data templates
(mostly provided by NanoSafety Cluster projects)
Configurable parser for spreadsheet data templates
JSON (JavaScript Object Notation) is a lightweight data-interchange format.
https://github.com/enanomapper/nmdataparser
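The idea of a configurable parser driven by a separate JSON file can be sketched as follows. This is a minimal illustration, not the nmdataparser implementation: the configuration keys (`header_row`, `columns`), the function `parse_template` and the sample rows are all hypothetical, and CSV text stands in for an Excel sheet so that the example needs only the standard library.

```python
import csv
import io
import json

# Hypothetical JSON configurations. Each maps internal field names to the
# column headers of one template. This is NOT the nmdataparser schema.
CONFIG_A = json.loads("""
{
  "header_row": 0,
  "columns": {"material": "Material name", "endpoint": "Endpoint",
              "value": "Result", "unit": "Unit"}
}
""")

CONFIG_B = json.loads("""
{
  "header_row": 1,
  "columns": {"material": "Sample", "endpoint": "Assay endpoint",
              "value": "Value", "unit": "Units"}
}
""")

# Two made-up templates holding the same record in different layouts.
TEMPLATE_A = "Material name,Endpoint,Result,Unit\nENM-1,particle size,25,nm\n"
TEMPLATE_B = (
    "Project template v2,,,\n"
    "Units,Value,Assay endpoint,Sample\n"
    "nm,25,particle size,ENM-1\n"
)


def parse_template(csv_text, config):
    """Map spreadsheet rows onto internal field names using the config."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    header = rows[config["header_row"]]
    # Resolve each internal field to the position of its column header.
    index = {name: header.index(col) for name, col in config["columns"].items()}
    records = []
    for row in rows[config["header_row"] + 1:]:
        if not any(cell.strip() for cell in row):
            continue  # skip blank rows
        records.append({name: row[i].strip() for name, i in index.items()})
    return records


records_a = parse_template(TEMPLATE_A, CONFIG_A)
records_b = parse_template(TEMPLATE_B, CONFIG_B)
```

Both templates yield the same internal record, which is the point of keeping the mapping in configuration rather than in parser code: a new template needs a new JSON file, not a new parser.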
23. Free text search supported by ontology-annotated database
Hastings, J., Jeliazkova, N., Owen, G., Tsiliki, G., Munteanu, C. R., Steinbeck, C., &
Willighagen, E. eNanoMapper: harnessing ontologies to enable data integration
for nanomaterial risk assessment. Journal of Biomedical Semantics, 2015, 6(10).
24. Application Programming Interface
• A way computer programs talk to one another. Can be understood in terms of how a programmer sends instructions between programs.
• Access the database via
– Any programming language
– Workflow systems
– Data analysis tools
• Implement
– Your database with different technology but the same API (interoperability)
• Run your own instances of the database
– eNanoMapper database is based on open source project http://ambit.sf.net
http://enanomapper.github.io/API/
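Programmatic access of the kind described on this slide might look like the sketch below. It uses only the Python standard library and makes several assumptions that should be checked against the API documentation linked above: the base URL is a placeholder, and the resource name, the query parameters and the shape of the JSON response (a `substance` list with `name` entries) are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Placeholder; replace with the address of an actual database instance.
BASE_URL = "https://example.org/enm-api"


def build_search_url(base_url, resource, **params):
    """Compose a GET URL for a resource with optional query parameters."""
    url = f"{base_url.rstrip('/')}/{resource}"
    query = urlencode(sorted(params.items()))
    return f"{url}?{query}" if query else url


def fetch_json(url, timeout=30):
    """Request a JSON representation of the resource and decode it."""
    request = Request(url, headers={"Accept": "application/json"})
    with urlopen(request, timeout=timeout) as response:
        return json.load(response)


def substance_names(payload):
    """Extract names from a response of the assumed (hypothetical) shape."""
    return [item.get("name") for item in payload.get("substance", [])]


# Building the URL does not touch the network.
url = build_search_url(BASE_URL, "substance", search="silver", page=0)
# url == "https://example.org/enm-api/substance?page=0&search=silver"

# Against a real instance one would then call:
#   names = substance_names(fetch_json(url))
```

Because the client depends only on URLs and a JSON media type, the same code would work against any database that implements the same API, which is the interoperability point made in the bullets above.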
25. eNanoMapper database
• How to store ENM safety data:
• Prepare the ENM characterisation and assay data using your own templates;
• The Excel parser converts the input templates into the eNanoMapper data model and uploads the data into the DB;
• Search and explore the eNanoMapper DB;
• Format converters export the data model into different formats (under development)
• Ontology annotation (under development)
26. Using eNanoMapper DB for modelling
NN15 Fri 15:45-15:00 A web application for deriving descriptors of nanomaterials from the analysis of TEM images, M. Kotsiandris [1], P. Doganis [1], H. Chomenidis [1], G. Drakakis [1], P. Sopasakis [1,2], H. Sarimveis [1]
[1] School of Chemical Engineering, NTUA
[2] IMT Institute for Advanced Studies Lucca, Italy
Jeliazkova N., Jeliazkov V., Willighagen E., Smeets S., Munteanu C., Fadeel
B., Grafström R., Kohonen P., Sarimveis H., Tsiliki G., Doganis P.,
Vorgrimmler D., and Hastings J., The first eNanoMapper
prototype: a substance database to support safe-by-
design, IEEE BIBM'14 2014 Workshop on Nanoinformatics for
Environmental Health and Biomedicine
Jeliazkova N. et al., The eNanoMapper database for
nanomaterial safety information, Beilstein Journal of
Nanotechnology, accepted 2015