Linked ATC and WHO Drug in a
Semantic Web enabled world
Kerstin Forsberg (@kerfors on Twitter, SlideShare etc.)
Informatics Analyst and Lifetime Learner
AZ IT | R&D Information
Representing and linking data, schemas, models,
data standards and terminologies
Web 1.0 (25 years) and Web 2.0 (the last 10 years)
Web of (Linked) Data
Web of Documents
An Intro To The Semantic Web: Why You Need To
Know About It Sooner Than Later, by Samantha Wong
Image Source: Frederic Martin
Kerstin Forsberg | WHO UMC, Jan 21 2015 AZIT | R&D Information
Web 3.0 (RDF, foundation standard, 15 years)
Web of Data
Web of Documents
subject predicate object
Common Model (“Triples”)
Resource Description Framework
Web of Data
Fine granularity
Web of Data as RDF Triples for “Things”
80+ RDF Triples (here are 4 of them)
describing Ticagrelor
Web of Data view of structured data
in Wikipedia pages
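A minimal sketch of what such triples look like, written as plain Python tuples and serialized as N-Triples lines. The subject URI assumes DBpedia's usual resource naming pattern and the class URI is illustrative; neither is copied from the original slide image.

```python
from typing import NamedTuple

class Triple(NamedTuple):
    subject: str                  # URI of the thing being described
    predicate: str                # URI of the property
    obj: str                      # URI, or literal text if obj_is_literal is True
    obj_is_literal: bool = False

def to_ntriples(t: Triple) -> str:
    """Serialize one triple as a single N-Triples line."""
    if t.obj_is_literal:
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    else:
        obj = f"<{t.obj}>"
    return f"<{t.subject}> <{t.predicate}> {obj} ."

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"

# Assumed subject URI (DBpedia naming pattern) and illustrative class URI.
drug = "http://dbpedia.org/resource/Ticagrelor"
triples = [
    Triple(drug, RDF_TYPE, "http://dbpedia.org/ontology/Drug"),
    Triple(drug, RDFS_LABEL, "Ticagrelor", obj_is_literal=True),
]
lines = [to_ntriples(t) for t in triples]
```

Each line is one fine-grained statement about the same "thing", so statements from different sources can be merged simply by sharing the subject URI.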
Linked Data, Semantic Web and Graphs
BBC http://www.bbc.co.uk/things/
Linked Data, Semantic Web and Graphs
Nobel Prize http://www.nobelprize.org/nobel_organizations/nobelmedia/nobelprize_org/developer/
Linked Data, Semantic Web and Graphs
Google Knowledge Graph
http://searchengineland.com/demystifying-knowledge-graph-201976
RDF @ NCBI
Example http://id.nlm.nih.gov/mesh/D015242
for Ofloxacin in MeSH
AstraZeneca engagements
• Public
- IMI Open PHACTS (Innovative Medicines Initiative, Open
Pharmacological Space)
- FDA/PhUSE and CDISC, Semantic Technology project
- W3C Health Care and Life Science (HCLS)
• Internal
- iSIM
- i2 Semantic Framework
Semantic Web and Linked Data
The Innovative Medicines
Initiative
• EC-funded public-private
partnership for
pharmaceutical research
• Focus on key problems
– Efficacy, Safety,
Education & Training,
Knowledge
Management
The Open PHACTS Project
• Create a semantic integration hub (“Open
Pharmacological Space”)…
• Delivering services to support on-going drug
discovery programs in pharma and public domain
• Not just another project: leading academics in
semantics, pharmacology and informatics, driven by
solid industry business requirements
• 23 academic partners, 8 pharmaceutical companies,
3 biotechs
• Work split into clusters:
– Technical Build
– Scientific Drive
– Community & Sustainability
• CDISC2RDF started Oct 2012 as a pre-competitive
project with AZ, Roche, W3C et al. to showcase
Semantic Web standards and Linked Data principles.
• FDA meeting Nov 2012: Solutions for Study Data
Exchange Standards Meeting – W3C Semantic Web
presentation.
• June 2013: the Semantic Technology project,
an FDA/PhUSE working group for Emerging
Technologies, with 25+ representatives from FDA,
CDISC, pharma companies, CROs and software vendors.
• Oct 2013 press release: Representing
existing standards (SDTM, CDASH,
SEND, ADaM) in RDF.
• Dec 2014: public review of the CDISC in RDF Guide.
Clinical standards in the Semantic Web
Community building and knowledge sharing
CDISC Interchange Europe 2011 and 2012
presentations from Roche and AstraZeneca
Clinical data standards in the Semantic Web
Example from CDISC SDTM, Adverse Event domain (AE)
RDF triples describing one variable/data element
and linking to related standard parts
“Pushing back” – Use standards for standards
AZ Vocabulary Management team shared this with MedDRA MSSO
Courtland Yockey, Informatics Analyst
AstraZeneca R&D Information, USA
A very simple SKOS-rendering
of MedDRA
• term → skos:Concept
• hierarchy level → skos:ConceptScheme
• SMQ → skos:Collection
The approach should be augmented with a
VoID representation of MedDRA
versions, and with term properties
distinguishing active from inactive
terms.
skos:Collection is likely not sufficient
to support SMQ versioning or the
context of terms in an SMQ (e.g.
weight).
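The mapping above could be sketched roughly as follows. This is an illustration only, not the AZ team's actual rendering: the `http://example.org/meddra/` namespace, the term codes, the labels and the SMQ name are all made-up placeholders, and literals are kept as plain strings for brevity.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
BASE = "http://example.org/meddra/"  # hypothetical namespace

def meddra_to_skos(terms, smqs):
    """Build (subject, predicate, object) triples.

    terms: list of (code, label, hierarchy_level) tuples
    smqs:  dict mapping an SMQ name to a list of member term codes
    """
    triples = []
    # hierarchy level -> skos:ConceptScheme
    for level in sorted({level for _, _, level in terms}):
        triples.append((BASE + "level/" + level, RDF_TYPE, SKOS + "ConceptScheme"))
    # term -> skos:Concept, placed in the scheme for its level
    for code, label, level in terms:
        term_uri = BASE + "term/" + code
        triples.append((term_uri, RDF_TYPE, SKOS + "Concept"))
        triples.append((term_uri, SKOS + "prefLabel", label))
        triples.append((term_uri, SKOS + "inScheme", BASE + "level/" + level))
    # SMQ -> skos:Collection with its member terms
    for name, codes in smqs.items():
        smq_uri = BASE + "smq/" + name
        triples.append((smq_uri, RDF_TYPE, SKOS + "Collection"))
        for code in codes:
            triples.append((smq_uri, SKOS + "member", BASE + "term/" + code))
    return triples

# Placeholder data, not real MedDRA content.
terms = [
    ("T0001", "Example preferred term", "PT"),
    ("T0002", "Example lowest level term", "LLT"),
]
smqs = {"ExampleSMQ": ["T0001"]}
triples = meddra_to_skos(terms, smqs)
```

Note that `skos:member` carries no extra information about the membership itself, which is why a plain collection cannot express per-term context such as weight.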
“Pushing back” – Use standards for standards
AZ Vocabulary Management team created an RDF representation of
ATC codes using the SKOS schema
Courtland Yockey, Informatics Analyst
AstraZeneca R&D Information, USA
4 example RDF Triples
representing part of an ATC code
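As a hedged sketch of how four such triples might be produced: ATC codes have five levels with cumulative lengths of 1, 3, 4, 5 and 7 characters, so the parent of a code can be derived from its prefix and expressed as `skos:broader`. The `http://example.org/atc/` namespace and the choice of SKOS properties are assumptions for illustration, not the representation shown on the original slide.

```python
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
ATC_BASE = "http://example.org/atc/"  # hypothetical namespace

# Cumulative code lengths of the five ATC levels.
LEVEL_LENGTHS = (1, 3, 4, 5, 7)

def atc_parent(code):
    """Return the code one level up, or None for a first-level code."""
    if len(code) not in LEVEL_LENGTHS:
        raise ValueError(f"not a valid ATC code length: {code!r}")
    idx = LEVEL_LENGTHS.index(len(code))
    return code[:LEVEL_LENGTHS[idx - 1]] if idx > 0 else None

def atc_triples(code, label):
    """Describe one ATC code as a SKOS concept linked to its parent."""
    uri = ATC_BASE + code
    triples = [
        (uri, RDF_TYPE, SKOS + "Concept"),
        (uri, SKOS + "notation", code),
        (uri, SKOS + "prefLabel", label),
    ]
    parent = atc_parent(code)
    if parent is not None:
        triples.append((uri, SKOS + "broader", ATC_BASE + parent))
    return triples

example = atc_triples("B01AC24", "ticagrelor")
```

Applying `atc_parent` repeatedly walks from the substance level up to the anatomical main group.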
Semantic Web
Standards
• A stack of standards to
represent data and semantics
based on the Resource
Description Framework
(RDF). RDF is a framework
for creating statements in the
form of so-called triples
• OWL and SKOS: RDF-based
standards to represent
vocabularies of terms
representing identified entities
and concepts
• SPARQL: query language for
RDF triples
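To give a feel for what a SPARQL query does, here is a toy triple-pattern matcher in plain Python, where `None` plays the role of a SPARQL variable. It is a sketch of the idea only, not a SPARQL engine; the `ex:` identifiers are placeholder names.

```python
def match(triples, pattern):
    """Return the triples matching an (s, p, o) pattern; None matches anything."""
    return [
        t for t in triples
        if all(want is None or want == got for want, got in zip(pattern, t))
    ]

data = [
    ("ex:B01AC24", "skos:broader", "ex:B01AC"),
    ("ex:B01AC", "skos:broader", "ex:B01A"),
    ("ex:B01AC24", "skos:prefLabel", "ticagrelor"),
]

# Similar in spirit to:
#   SELECT ?child WHERE { ?child skos:broader ex:B01AC }
children = [s for s, _, _ in match(data, (None, "skos:broader", "ex:B01AC"))]
```

A real query language adds joins across several patterns, filters and federated sources on top of this basic matching step.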
Building Linked Data
Applications
• Semantic Web standards and
Linked Data principles enable
us to ask questions and solve
business problems across a
heterogeneous information
landscape spanning open and
closed sources
Capture Business Questions and Sources
Domain Expert Concept Map
Build Formal Ontology
Challenge with Linked Open Data
Model Business Questions (SPARQL)
Interact with RDF answer in a Faceted Browser
Web of Data
Open and Closed
• Open data sources applying
the Linked Data principles
and Semantic Web standards
form a Web of Data
• Central is Wikipedia’s
structured content, exposed via
DBpedia and used by e.g.
Google’s Knowledge Graph
and IBM’s Watson.
• Closed data sources now
also form internal Webs of
Data
Linked Data
Principles
• Use URIs (Uniform Resource
Identifiers) as names for
things.
• Use HTTP URIs so that
people can look up
(dereference) those names.
• When someone looks up a
URI, provide useful
information.
• Include links to other URIs so
that they can discover more
things.
Linked Data in One slide
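A small sketch of the "look up (dereference)" step, using the MeSH URI from the earlier slide. It only prepares an HTTP request that asks for an RDF serialization via the `Accept` header and does not send it; whether a given server honours that header is an assumption to verify, and `build_lookup` is a hypothetical helper name.

```python
from urllib.request import Request

def build_lookup(uri, media_type="text/turtle"):
    """Prepare (but do not send) an HTTP GET asking for RDF about a URI."""
    return Request(uri, headers={"Accept": media_type})

req = build_lookup("http://id.nlm.nih.gov/mesh/D015242")

# To actually dereference the name (requires network access):
#   from urllib.request import urlopen
#   body = urlopen(req).read()
```

The same URI thus serves as both the name of the thing and the address where more data about it, including links to other URIs, may be found.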
