Smart Content applications at Elsevier 
NISO/NFAIS Virtual Conference: 
Connecting the Library to the Wider World - Successful Applications of Linked Data 
Michael Lauruhn 
December 3, 2014
| 2 
Introduction & Agenda 
Smart Content & Linked Data at Elsevier 
Background 
Key Components of Smart Content 
Current Examples 
Project Planning Considerations
| 3 
Introduction: Smart Content & Linked Data 
Elsevier Content 
Componentized text 
Data 
Multimedia 
3rd Party Linked data 
Web Open data 
Vocabulary
| 4 
Smart Content infrastructure in practice 
Trial: NCT00623103 
Serious Adverse events: 
Atrial fibrillation 
Elsevier 
med:drugs Rivastigmine 
Delirium treatment: An unmet challenge 
Rivastigmine, a cholinesterase inhibitor, has been used to 
treat delirium in elderly patients with stroke. 1 A biologically 
plausible premise—that impaired cholinergic transmission 
might either cause or worsen delirium—led to a 
randomised, placebo-controlled, double-blind trial by 
Maarten van Eijk and colleagues 2 in The Lancet in which 
they added rivastigmine or placebo to usual treatment of 
patients in intensive care. The trial was halted at 104 
patients by the drug safety and monitoring board (DSMB) 
because of increased mortality (12/54 in the rivastigmine 
group, 4/50 in the placebo group; p=0·07) and a worse 
outcome. The rivastigmine group … 
foaf:page 
owl:sameAs 
owl:sameAs
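The links on this slide can be read as subject, predicate, object statements. The following is a minimal Python sketch of that idea, not a description of Elsevier's implementation. The `ex:` and `ext:` identifiers and the `ex:studiesDrug` property are hypothetical; only `foaf:page` and `owl:sameAs` are real properties, and the direction of each link is an assumption rather than something the slide states.

```python
# Real property URIs from the FOAF and OWL vocabularies.
FOAF_PAGE = "http://xmlns.com/foaf/0.1/page"
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Hypothetical triples loosely modeled on the slide (all ex:/ext: names are invented).
triples = [
    ("ex:drug/rivastigmine", OWL_SAME_AS, "ext:drugdata/rivastigmine"),
    ("ext:drugdata/rivastigmine", OWL_SAME_AS, "ext:opendata/Rivastigmine"),
    ("ex:drug/rivastigmine", FOAF_PAGE, "ex:article/delirium-treatment"),
    ("ex:trial/NCT00623103", "ex:studiesDrug", "ex:drug/rivastigmine"),
]


def same_as_closure(start, triples):
    """Return every identifier reachable from `start` via owl:sameAs, in either direction."""
    seen = {start}
    frontier = [start]
    while frontier:
        node = frontier.pop()
        for s, p, o in triples:
            if p != OWL_SAME_AS:
                continue
            # owl:sameAs is symmetric, so follow the link both ways.
            for a, b in ((s, o), (o, s)):
                if a == node and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


def pages_for(entity, triples):
    """Collect foaf:page targets attached to the entity or any of its sameAs aliases."""
    aliases = same_as_closure(entity, triples)
    return sorted(o for s, p, o in triples if p == FOAF_PAGE and s in aliases)


print(pages_for("ext:opendata/Rivastigmine", triples))
```

Starting from an external identifier, the lookup still reaches the article because the aliases are merged first; that is the practical value of the `owl:sameAs` links.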
| 5 
Smart Content as Infrastructure 
Product Development & Enhancement 
• More accurate search results 
• Faceted navigation 
• Improved content discoverability 
Content Analytics 
• New insights and abilities to take inventory 
about what we publish 
• Identification of co-occurring terms 
• Link to related external content & data 
Personalization 
• Individual content recommendations 
• Targeted individual marketing 
Editorial Productivity 
• Flexible product types – new collections, 
image banks, etc. 
• Increased speed to market
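As an illustration of the "co-occurring terms" bullet above, here is a small Python sketch that counts how often pairs of tagged concepts appear in the same article. The article identifiers and annotations are hypothetical sample data, and this is only one simple way to approach the analysis.

```python
from collections import Counter
from itertools import combinations

# Hypothetical annotations: article id -> set of concepts tagged in that article.
annotations = {
    "article-1": {"Rivastigmine", "Delirium", "Stroke"},
    "article-2": {"Rivastigmine", "Delirium"},
    "article-3": {"Delirium", "Atrial fibrillation"},
}


def co_occurrences(annotations):
    """Count, for every unordered pair of concepts, the articles containing both."""
    counts = Counter()
    for concepts in annotations.values():
        # Sorting makes each pair a stable (a, b) key with a < b.
        for pair in combinations(sorted(concepts), 2):
            counts[pair] += 1
    return counts


counts = co_occurrences(annotations)
for pair, n in counts.most_common(3):
    print(pair, n)
```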
Key Components 
of Smart Content
| 7 
Vocabulary Example: EMMeT 
EMMeT 
UMLS 
SNOMED 
ICD9 
ICD10 
MeSH 
LOINC 
Elsevier Custom Resources 
Gold Standard (Drugs) 
Multi-language taxonomy: 
>1 million concepts 
>3 million synonyms 
Classes include: 
Anatomy 
Diseases 
Drugs 
Symptoms 
Procedures 
Sourced from several 
standardized vocabularies
| 8 
• Breast Disorders 
• Cancer of the Thorax 
• Mammary Neoplasms 
• More…. 
Medical Name 
Malignant Neoplasm of the Breast 
Consumer Friendly Name 
Breast Cancer 
Synonyms 
Malignant Tumor of Breast 
Malignant Breast Neoplasm 
Breast Ca 
Codes 
ICD9 – 174.9 
MeSH – D001943 
SNOMED-CT – 190121004 
Semantic Type/Group 
Neoplastic Process/Disease 
• Breast Sarcoma 
• Familial Breast Cancer 
• Malignant lymphoma of the Breast 
• Malignant Neoplasm of the breast outer 
quadrant 
• More… 
Semantic Relationships 
Symptoms: Breast Lump, Nipple Retraction, … 
Diagnostic Procedures: Mammography, Breast Biopsy, … 
Treatment Procedures: Chemotherapy, Mastectomy, … 
Medications: Tamoxifen, Doxorubicin, … 
Risk Factors: Family History, Genetics, Predisposition, … 
Prevention: Screening, Preemptive Mastectomy, … 
Complications: Metastatic Cancer, …
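The concept shown on this slide can be pictured as a single record with names, synonyms, codes, and typed relationships. The sketch below uses the values from the slide; the `Concept` class, its field names, and the `find_concept` helper are hypothetical and do not reflect EMMeT's actual data model.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical record layout for one vocabulary concept."""
    medical_name: str
    consumer_name: str
    synonyms: list = field(default_factory=list)
    codes: dict = field(default_factory=dict)
    semantic_type: str = ""
    relations: dict = field(default_factory=dict)

    def labels(self):
        """All names a user or indexer might use for this concept, lowercased."""
        return {name.lower() for name in [self.medical_name, self.consumer_name, *self.synonyms]}


breast_cancer = Concept(
    medical_name="Malignant Neoplasm of the Breast",
    consumer_name="Breast Cancer",
    synonyms=["Malignant Tumor of Breast", "Malignant Breast Neoplasm", "Breast Ca"],
    codes={"ICD9": "174.9", "MeSH": "D001943", "SNOMED-CT": "190121004"},
    semantic_type="Neoplastic Process/Disease",
    relations={
        "Symptoms": ["Breast Lump", "Nipple Retraction"],
        "Medications": ["Tamoxifen", "Doxorubicin"],
    },
)


def find_concept(label, concepts):
    """Return the first concept whose names or synonyms match `label`, else None."""
    wanted = label.strip().lower()
    return next((c for c in concepts if wanted in c.labels()), None)


match = find_concept("breast ca", [breast_cancer])
print(match.medical_name, match.codes["MeSH"])
```

Resolving a synonym to one record is what lets a search for "Breast Ca" retrieve content indexed under the medical name or any of the mapped codes.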
| 9 
Vocabulary Example: EMMeT 
EMMeT 
UMLS 
SNOMED 
ICD9 
ICD10 
MeSH 
LOINC 
Elsevier Custom Resources 
Gold Standard (Drugs) 
FrEMMeT 
SpEMMeT
| 10 
Linked Data Repository
| 11 
Linked Data Repository 
• Knowledgebase of semantic data 
• Large scale integration of related 
sources of medical and scientific 
content and data 
• High performance service layer 
APIs for integration into end-user 
products and internal platforms 
Classic subject metadata 
Editorial & 
Author 
Keywords 
Full-text 
Indexing 
Linked Data Environment 
Componentized text 
Robust Data models 
Entity extraction 
Semantic Annotations
Current Examples
| 13 
ClinicalKey search
| 14 
SciVal Funders Vocabulary 
• Supports the FundRef initiative, facilitated by CrossRef, to 
provide a standard way of reporting funding sources for published 
scholarly research. 
• SciVal Funding is an online solution that provides targeted 
recommendations on grants, making it easier for researchers to 
discover funding opportunities related to their area of research.
| 15 
Similar Methods for Neuroscience 
System to extract and index the Methods sections of articles from 
100 Elsevier neuroscience journals 
Built a comparison and recommendation system so readers can 
find and evaluate articles with “Similar Methods” to the ones 
presented in the current article
| 16 
Similar Methods for Neuroscience 
Search process targets factors: 
what brain regions are being studied 
what organism is being used 
what methodologies are being employed 
what disease model is the focus of the study
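One simple way to compare articles on the four factors above is to measure the overlap of the extracted terms for each factor and average the results. The sketch below does this with Jaccard overlap. It is an illustrative assumption, not the method used in the actual system; the factor keys, article identifiers, and extracted terms are hypothetical.

```python
# Hypothetical keys for the four factors named on the slide.
FACTORS = ("brain_regions", "organisms", "methodologies", "disease_models")


def jaccard(a, b):
    """Overlap of two sets: size of intersection divided by size of union."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def methods_similarity(x, y):
    """Average the per-factor Jaccard overlap, weighting every factor equally."""
    return sum(jaccard(x.get(f, set()), y.get(f, set())) for f in FACTORS) / len(FACTORS)


# Hypothetical terms extracted from the Methods sections of three articles.
current = {
    "brain_regions": {"hippocampus"},
    "organisms": {"rat"},
    "methodologies": {"patch clamp", "immunohistochemistry"},
    "disease_models": {"epilepsy"},
}
candidates = {
    "article-A": {
        "brain_regions": {"hippocampus"},
        "organisms": {"rat"},
        "methodologies": {"patch clamp"},
        "disease_models": {"epilepsy"},
    },
    "article-B": {
        "brain_regions": {"cerebellum"},
        "organisms": {"mouse"},
        "methodologies": {"immunohistochemistry"},
        "disease_models": set(),
    },
}

# Rank candidate articles from most to least similar to the current one.
ranked = sorted(candidates, key=lambda k: methods_similarity(current, candidates[k]), reverse=True)
print(ranked)
```

Equal weighting is a design choice made here for brevity; a real recommender would likely tune the weight of each factor against reader feedback.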
| 17 
Leveraging Wikipedia for Neuroscience 
• Pilot project that identifies concepts from a 
Neuroscience topics vocabulary 
• Provides Wikipedia definitions to add context 
around the article’s significant concepts
| 18 
Additional context for Energy terms 
• A ‘dictionary app’ using portions of the Encyclopedia of 
Energy (1818 terms) 
• Available for articles from Applied Energy and Energy 
Conversion and Management; additional pilots planned. 
Example: http://www.sciencedirect.com/science/article/pii/S0306261913001888 
Terms from the dictionary 
are highlighted in the article. 
When the reader clicks 
on a term, its definition 
from the dictionary is 
shown in the feature 
(right-hand pane).
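The highlight-and-lookup behavior described above can be sketched in a few lines of Python. This is a simplified illustration, not the ScienceDirect implementation: the dictionary entries are placeholders rather than text from the Encyclopedia of Energy, and the `[[...]]` markers stand in for whatever markup the real feature uses.

```python
import re

# Placeholder entries; the real app draws on 1818 terms from the encyclopedia.
dictionary = {
    "heat pump": "Placeholder definition of heat pump.",
    "exergy": "Placeholder definition of exergy.",
}


def build_pattern(terms):
    """Compile one case-insensitive pattern matching any dictionary term as a whole word."""
    # Longest terms first, so multi-word terms win over shorter overlapping ones.
    ordered = sorted(terms, key=len, reverse=True)
    return re.compile(r"\b(" + "|".join(re.escape(t) for t in ordered) + r")\b", re.IGNORECASE)


def highlight(text, dictionary):
    """Wrap each dictionary term found in the text in [[...]] markers."""
    pattern = build_pattern(dictionary)
    return pattern.sub(lambda m: "[[" + m.group(0) + "]]", text)


def definition_for(clicked_term, dictionary):
    """Look up the definition to show when the reader clicks a highlighted term."""
    return dictionary.get(clicked_term.lower())


print(highlight("The heat pump improves exergy efficiency.", dictionary))
print(definition_for("Exergy", dictionary))
```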
Project Planning 
Considerations
| 20 
Get to a Use Case early 
• Get stakeholders invested 
• Think about what users currently do… and what they can do better 
• Focus on Use Cases to stay centered and to identify priorities 
when making decisions. 
Particularly helpful when introducing a 
new infrastructure to an organization
| 21 
Quality & Reliability of Resources 
• Integration with third-party content, data models and vocabularies 
requires a vetting process: 
▪ Are they accurate? 
▪ Are they trustworthy? 
▪ Are they current? 
▪ Are they sustainable? 
Warning: Some of the more attractive 
resources on the web are one-off 
projects that are no longer maintained
| 22 
Ongoing Maintenance 
• As knowledge models and vocabularies grow, resources are needed 
to keep them current 
• Governance policy should account for sources for new concepts, 
terminology and relations: 
▪ New content types 
▪ Search logs 
▪ New trends & discoveries 
These require resources (people’s 
time) that need to be factored into the 
total cost of ownership
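As one example of using search logs as a source for new concepts, the sketch below lists frequent queries that match no existing vocabulary label, so that an editor can review them. The function name, the threshold, and the sample log are hypothetical, and a human still decides what enters the vocabulary.

```python
from collections import Counter


def candidate_terms(search_log, vocabulary_labels, min_count=2):
    """Return (query, count) pairs for frequent queries not yet covered by the vocabulary."""
    known = {label.lower() for label in vocabulary_labels}
    # Normalize case and whitespace so variants of the same query are counted together.
    counts = Counter(q.strip().lower() for q in search_log if q.strip())
    return [(q, n) for q, n in counts.most_common() if n >= min_count and q not in known]


# Hypothetical sample data.
search_log = [
    "breast ca", "Breast Cancer", "crispr", "CRISPR ", "crispr",
    "optogenetics", "optogenetics", "delirium",
]
vocabulary_labels = ["Breast Cancer", "Breast Ca", "Delirium"]

print(candidate_terms(search_log, vocabulary_labels))
```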
| 23 
Quality & Testing 
• Applying semantic web technologies to applications is not 
exclusively an IT solution: 
▪ Sponsors, stakeholders and subject experts need to contribute to and 
shape the vocabularies and the application functionality 
▪ The fine tuning for some of these applications can be surprisingly 
manual 
▪ It’s important not to get distracted by the outliers and corner cases 
Installing and implementing these 
technologies out of the box (OOTB) is getting 
easier… Quality is where it gets hard
| 24 
Quality & Testing 
• Test sets are essential 
▪ Real content 
▪ Real use cases 
▪ Scores that show accuracy and measure improvement 
Our SMEs: Our Application:
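Scores of the kind mentioned above are commonly computed by comparing the application's annotations with a gold set prepared by subject matter experts. The sketch below uses precision, recall, and F1 as the scores. This choice of metrics is an assumption, since the slide does not name them, and the annotation data is hypothetical.

```python
def score(gold, predicted):
    """Compare predicted annotations with expert (gold) annotations."""
    true_positives = len(gold & predicted)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical test set: (document, concept) pairs chosen by subject matter experts.
gold = {
    ("doc1", "Delirium"), ("doc1", "Rivastigmine"),
    ("doc2", "Stroke"), ("doc2", "Atrial fibrillation"),
}
# Hypothetical output of the application before and after tuning.
baseline = {("doc1", "Delirium"), ("doc2", "Stroke"), ("doc2", "Delirium")}
tuned = {("doc1", "Delirium"), ("doc1", "Rivastigmine"), ("doc2", "Stroke")}

baseline_scores = score(gold, baseline)
tuned_scores = score(gold, tuned)
print("baseline:", baseline_scores)
print("tuned:   ", tuned_scores)
```

Re-running the same test set after each change to the vocabulary or the application is what turns the scores into a measure of improvement.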
| 25 
Other lessons & observations 
• Don’t forget to look at opportunities for internal applications 
▪ Consider internal workflows 
▪ Look for efficiency enhancements 
▪ Look for discovery opportunities 
• Start small 
▪ Get some early proofs of concept that you can share with stakeholders 
before tackling bigger challenges
Thank You 
Michael Lauruhn 
m.lauruhn@elsevier.com 
@MikeLauruhn 
@ElsevierLabs

