W3C HCLS


  1. Elsevier Health Sciences: Smart Content Drives Smart Applications
     Linked Data in HCLS for Commercial Applications
     Semantic Web for Health Care and Life Sciences Summer School, W3C@MIT, August 29, 2012
     Iker Huerga, Sr. Semantic Software Engineer, i.huerga@elsevier.com, @ihuerga
     Elsevier Proprietary and Confidential
  2. The Challenge
  3. The Challenge
     Providing doctors and researchers with the right information at the right moment to make the best decisions
  4. How to solve it?
  5. How to Solve It
     • Step 1: Making Elsevier’s authoritative Health Care and Life Sciences content “smarter”
     • Step 2: Enriching Elsevier’s content by integration with third-party data
     • Step 3: Creating interfaces to provide fast discoverability of the most relevant answers and more intuitive searching
     We need the Semantic Web for all this
  6. Introducing Smart Content
  7. Taxonomy-Powered Content = Smart Content
     [Figure, panel labels: "Content today with structured XML"; "Content with applied taxonomy"]
     Copyright 2011 Outsell Gilbane Services, Inc. http://www.outsellinc.com
     http://gilbane.com/xml/2009/11/what-is-smart-content.html#ixzz0hnuRhaBc
  8. Smart Content At Elsevier
     [Diagram, input labels: Elsevier content (Text, Tables, Images); Entities, concepts and relationships; Elsevier knowledge organization systems; Linked data from partners and the Web]
     Smart Content Applications
     • Better discovery through semantic search & navigation
         – Faceted search & browse
         – Ontology-driven navigation
         – Task-specific results
         – Personalized/localized results
         – Question answering
         – Link to evidence-based content
     • Better understanding through analysis and visualization
         – Tag clouds
         – Heatmaps
         – Streamgraphs
         – Scatterplots
         – Time series
         – Animations
     • New knowledge through aggregation and synthesis
         – Topic pages
         – Social network maps
         – Geolocation maps
         – Data mashups
         – Text mining reports
  9. Introducing EMMeT (Elsevier Merged Medical Taxonomy)
     • Medical Name: Malignant Neoplasm of the Breast
     • Consumer Friendly Name: Breast Cancer
     • Synonyms: Malignant Tumor of Breast, Malignant Breast Neoplasm, Breast Ca
     • Codes: ICD9 – 174.9, MeSH – D001943, SNOMED-CT – 190121004
     • Semantic Type/Group: Neoplastic Process/Disease
     • Parent Terms: Breast Disorders, Cancer of the Thorax, Mammary Neoplasms, More…
     • Children Terms: Breast Sarcoma, Familial Breast Cancer, Malignant lymphoma of the Breast, Malignant Neoplasm of the breast outer quadrant, More…
     • Semantic Relationships
         – Symptoms: Breast Lump, Nipple Retraction, …
         – Diagnostic Procedures: Mammography, Breast Biopsy, …
         – Treatment Procedures: Chemotherapy, Mastectomy, …
         – Medications: Tamoxifen, Doxorubicin, …
         – Risk Factors: Family History, Genetics, Predisposition, …
         – Prevention: Screening, Preemptive Mastectomy, …
         – Complications: Metastatic Cancer, …
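As a rough illustration of how one concept on the EMMeT slide bundles names, synonyms, codes and hierarchy, the sketch below models it as a small Python record and resolves a free-text term or code to it. The class, its field names and the `resolve` helper are assumptions made for this example; they are not EMMeT's actual schema or API. Only the values come from the slide.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Concept:
    """One taxonomy concept. Field names are illustrative, not EMMeT's schema."""
    preferred_name: str
    consumer_name: str
    synonyms: List[str] = field(default_factory=list)
    codes: Dict[str, str] = field(default_factory=dict)
    parents: List[str] = field(default_factory=list)
    children: List[str] = field(default_factory=list)

    def labels(self) -> Set[str]:
        """All names that should resolve to this concept, lower-cased."""
        names = [self.preferred_name, self.consumer_name, *self.synonyms]
        return {name.lower() for name in names}


# Values copied from the slide; the "More…" entries are left out.
breast_cancer = Concept(
    preferred_name="Malignant Neoplasm of the Breast",
    consumer_name="Breast Cancer",
    synonyms=["Malignant Tumor of Breast", "Malignant Breast Neoplasm", "Breast Ca"],
    codes={"ICD9": "174.9", "MeSH": "D001943", "SNOMED-CT": "190121004"},
    parents=["Breast Disorders", "Cancer of the Thorax", "Mammary Neoplasms"],
    children=["Breast Sarcoma", "Familial Breast Cancer"],
)


def resolve(term: str, concepts: List[Concept]) -> Optional[Concept]:
    """Return the first concept matching the term by label or by code value."""
    cleaned = term.strip()
    for concept in concepts:
        if cleaned.lower() in concept.labels() or cleaned in concept.codes.values():
            return concept
    return None
```

With this structure, `resolve("breast ca", [breast_cancer])` and `resolve("D001943", [breast_cancer])` both return the same record, which is the practical point of merging synonyms and external codes under one concept.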
  10. Automated Indexing: Weighted Tags for Better Search
      • Article-level SMART Content tags help confirm relevance and provide a topical overview about a piece of content.
      • Paragraph-level SMART Content tags uncover highly-relevant information not necessarily evident from the title or abstract alone.
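One way to picture how weighted tags at two levels could feed a search ranking is sketched below. The slide does not describe the scoring used in production, so the max-over-levels rule, the weight range and all document ids and weights are invented for illustration.

```python
from collections import defaultdict

# Hypothetical tag weights in [0, 1]; ids and values are invented.
article_tags = {
    "doc-1": {"Breast Cancer": 0.9, "Tamoxifen": 0.4},
    "doc-2": {"Breast Cancer": 0.2, "Mammography": 0.8},
}
paragraph_tags = {
    "doc-2": [{"Tamoxifen": 0.7}, {"Mammography": 0.6}],
}


def score(concept):
    """Rank documents by their best article- or paragraph-level weight."""
    scores = defaultdict(float)
    for doc, tags in article_tags.items():
        scores[doc] = max(scores[doc], tags.get(concept, 0.0))
    for doc, paragraphs in paragraph_tags.items():
        for tags in paragraphs:
            scores[doc] = max(scores[doc], tags.get(concept, 0.0))
    hits = [(doc, weight) for doc, weight in scores.items() if weight > 0]
    return sorted(hits, key=lambda pair: pair[1], reverse=True)


print(score("Tamoxifen"))  # [('doc-2', 0.7), ('doc-1', 0.4)]
```

In this toy data, doc-2 carries no article-level Tamoxifen tag at all, yet it ranks first because one of its paragraphs is strongly tagged, which mirrors the slide's claim about paragraph-level tags.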
  11. Standards: The Key Piece
  12. The Satellite: a Linked Data Compliant Data Format
      • Motivations:
          – Help answer research questions
          – Direct material to interested readers
          – Extract disparate facts from the literature to create knowledge bases
      • Technical Requirements:
          – Use of open-standards-based metadata frameworks: SKOS, DCMI and SWAN
          – Need for a common model to represent ontological annotations
          – Data will be transferred from suppliers to Elsevier and back
          – QA of tags (aka Provenance)
          – Some people have RDF knowledge, but they are a small proportion
      • Satellite Specification, First Version:
          – Use RDF/XML serialization
          – Use XML Schemas to validate the syntax, so that documents that validate will produce correct RDF
          – Use the extensive XML-capable infrastructure, QA tools, etc.
  13. The Satellite Format: a Linked Data Compliant Data
  14. The Satellite Format: a Linked Data Compliant Data
      • What we have learned so far
          – RDF/XML has some limitations
              • Not all RDF graphs can be serialized in XML (QNames, Unicode characters)
              • There is no support for RDF Graphs in RDF/XML; at the moment one satellite is one RDF Graph in the LDR
              • Complexity of RDF/XML abbreviation rules
              • Can’t put attributes on the predicates
          – An XML-capable infrastructure does not necessarily entail an RDF/XML-capable infrastructure
              • Many XML tools can’t be used with RDF/XML
              • Multiple different serializations exist for the same RDF Graph
              • XML Schema validation makes the specification less flexible
      It’s time to move towards a more “RDF friendly” serialization
  15. The Satellite Format: a Linked Data Compliant Data
      • Turtle as the RDF serialization format
          – It is becoming the de facto serialization for RDF
          – It makes RDF much more ‘human friendly’
          – It gives us the flexibility we need for the next satellite generation
          – All the libraries we are currently using support Turtle
          – It follows the triple pattern syntax of SPARQL, which is more convenient for querying
      • Steps to the transition
          – Both serializations will coexist for a period of time
          – Internal tools, validation, QA, etc., need to be adapted to ‘understand’ Turtle
          – Tools for transforming RDF/XML into Turtle need to be provided to the suppliers
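To give a feel for why Turtle reads as more "human friendly", here is a minimal Python sketch that writes one subject and its properties in Turtle's predicate-list form. It is not the Satellite specification: the `to_turtle` helper and the example.org URIs are assumptions, and only the `dct` and `skos` namespace IRIs are the real published ones. It handles plain string literals and IRIs only.

```python
PREFIXES = {
    "dct": "http://purl.org/dc/terms/",
    "skos": "http://www.w3.org/2004/02/skos/core#",
}


def to_turtle(subject, properties):
    """Serialize one subject and its (prefixed predicate, object) pairs.

    Objects starting with http:// or https:// are written as IRIs;
    everything else becomes a plain, escaped string literal.
    """
    if not properties:
        raise ValueError("at least one property is required")
    lines = [f"@prefix {name}: <{iri}> ." for name, iri in PREFIXES.items()]
    lines.append("")
    lines.append(f"<{subject}>")
    rendered = []
    for predicate, obj in properties:
        if obj.startswith(("http://", "https://")):
            term = f"<{obj}>"
        else:
            escaped = (
                obj.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")
            )
            term = f'"{escaped}"'
        rendered.append(f"    {predicate} {term}")
    # Turtle separates predicates of the same subject with ';' and ends with '.'
    lines.append(" ;\n".join(rendered) + " .")
    return "\n".join(lines)


# Hypothetical document and concept URIs.
print(to_turtle(
    "http://example.org/doc/1",
    [
        ("dct:title", "Tamoxifen in early breast cancer"),
        ("dct:subject", "http://example.org/concept/breast-cancer"),
    ],
))
```

The `subject predicate object` shape of each printed line is the same shape a SPARQL triple pattern takes, which is the querying convenience the slide refers to.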
  16. How is all this transformed into commercial applications?
  17. The Linked Data Repository
      • The LDR stores metadata describing Non Information Resources [httpRange14]
      • The LDR provides a rich semantic layer on top of IR and enables search and discovery of metadata
      • Extends Elsevier's extracted knowledge by interlinking data with other related sources of content from partners and the Web, using the Web as its API
      • Optimized for a high volume of RDF I/O operations
      • Provides service layer APIs for ease of integration
      • Opens up discovery and utility of content beyond searchable documents
  18. Represent Enhancements and Vocabularies In RDF
      [Diagram labels: Satellites; LDR]
      • Creation of Satellite Standards
          – Linked data compliant RDF representing metadata objects
          – Leverage common namespaces from dct, pav, rdf, skos
          – Taxonomies in SKOS to enhance portability in the linked data world
          – Subject tagging against a vocabulary representing extracted knowledge
          – Concept URIs that can be equated to URIs in linked data
      • Example RDF Statements
          – Tags from a taxonomy for a given document
          – Document sections relevant to a given concept
          – Document sections providing answers to a given question
          – Genes mentioned in a given document
          – Documents supporting or disputing conclusions of a given document
          – Concepts in the areas of expertise for a given author
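The example statements listed on this slide can be pictured as subject-predicate-object triples queried by pattern. The sketch below keeps a few such triples in a Python list and matches them with wildcards. Every URI, the `ex:` predicates and the gene are invented placeholders, and a real deployment would query a triple store rather than a list.

```python
EX = "http://example.org/"  # hypothetical namespace for this sketch

# Invented statements mirroring the kinds of facts named on the slide.
triples = [
    (EX + "doc/1", "dct:subject", EX + "concept/breast-cancer"),
    (EX + "doc/1", "ex:mentionsGene", EX + "gene/BRCA1"),
    (EX + "doc/2", "dct:subject", EX + "concept/breast-cancer"),
    (EX + "doc/2", "ex:supports", EX + "doc/1"),
]


def match(s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    pattern = (s, p, o)
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# "Tags from a taxonomy for a given document"
tags_for_doc1 = match(s=EX + "doc/1", p="dct:subject")

# "Genes mentioned in a given document"
genes_in_doc1 = match(s=EX + "doc/1", p="ex:mentionsGene")

# Reverse lookup: every document tagged with a given concept
docs_about_concept = match(p="dct:subject", o=EX + "concept/breast-cancer")
```

Leaving a position as `None` plays the role a variable plays in a SPARQL triple pattern.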
  19. LDR Semantic Infrastructure
      [Architecture diagram; layout not recoverable from the transcript. Component labels:]
      • Linked Data Loader (REST); Linked Data Data Space Services
      • Vocab & Annotation Satellites; 3rd Party Vocab; Asset RDF Data Satellites
      • Smart Content Indexing Pipeline; Linked Data Pipeline Services (Hadoop); AWS Cloud Management
      • EMMeT Vocabulary; SKOS Generation; Semantic Interlinking; RDF Reasoning; Validation; Transform; Ontology Svcs Network; N-Quads; Extract; JSON
      • Tagging and Indexing Services (Concepts, Chapters, Articles, Guidelines, etc.); RDF Generation; Discovery Services (Semantic Knowledgebase)
      • Elsevier Content; 3rd Party Content; Instit. Content
      • Amazon S3; MongoDB NoSQL; SOLR/SIRE; Virtuoso Triplestore
      • Product-specific Search Index; Smart Content Discovery Svc; Access & Entitlements; Admin & Monitoring; Atom Feed; Analytics; Ontology Service; SPARQL; Alerts; API (REST)
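The diagram lists N-Quads among its transform outputs, and an earlier slide notes that one satellite is one RDF graph in the LDR. A minimal sketch of that idea follows: each triple is written as an N-Quads line whose fourth term names the satellite's graph. The helper names and all example.org URIs are assumptions, only plain string literals are handled, and this is not the pipeline's actual transform.

```python
def nq_term(value):
    """Write IRIs in angle brackets and anything else as an escaped literal."""
    if value.startswith(("http://", "https://")):
        return f"<{value}>"
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return f'"{escaped}"'


def to_nquads(triples, graph_iri):
    """Serialize triples as N-Quads, all placed in one named graph.

    Subjects and predicates must be absolute IRIs, since N-Quads has no prefixes.
    """
    lines = [
        f"<{s}> <{p}> {nq_term(o)} <{graph_iri}> ."
        for s, p, o in triples
    ]
    return "\n".join(lines) + "\n"


# Hypothetical satellite: two statements about one document.
satellite = [
    ("http://example.org/doc/1",
     "http://purl.org/dc/terms/subject",
     "http://example.org/concept/breast-cancer"),
    ("http://example.org/doc/1",
     "http://purl.org/dc/terms/title",
     "Tamoxifen in early breast cancer"),
]
print(to_nquads(satellite, "http://example.org/satellite/1"), end="")
```

Because the graph name travels on every line, statements from many satellites can be concatenated into one file and still be loaded or deleted per satellite.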
  20. Clinical Key - the most clinically relevant answers
  21. Clinical Key - the most clinically relevant answers
  22. Comprehensive Drug Research
      • Moving world-class content online to Point of Care.
      • Extracted knowledge is linked for further enrichment.
      • Information is condensed, immediate and actionable.
  23. Linking Patient Data To Evidence-Based Research
      - Discover knowledge from research relevant to a patient profile
      - Alerts on FDA Announcements.
  24. SciVerse Widgets Powered by Smart Content
      Article search on ScienceDirect results in related specialty content recommendations available from The Lancet Journal.
  25. Questions
      Iker Huerga, i.huerga@elsevier.com
