
SCOPE Summit - Applying the OMOP data model & OHDSI software to national European health data registries: the IMI EMIF project

Talk from Kees van Bochove, The Hyve at SCOPE Summit, Real World Data track, Jan 26, 2017, Miami
OHDSI (Observational Health Data Sciences and Informatics) is a large open source initiative for the standardisation and epidemiological analysis of real world data. OHDSI leverages the OMOP common data model for observational data and provides data analysis tools for a broad range of use cases. This talk explains OMOP and OHDSI using the IMI EMIF project as a case study, in which health data from over 50 million patients from 13 national and regional European registries is brought together.

Published in: Health & Medicine

  1. Open source community for “Real World Data” Analysis. January 26, 2017, SCOPE Summit, Miami. Kees van Bochove, CEO & Founder, The Hyve (@keesvanbochove). With thanks to Patrick Ryan, Nigel Hughes & Bart Vannieuwenhuyse from Janssen for slides & feedback!
  2. Agenda
     1. Introduction: The Hyve & Open Source
     2. What’s OHDSI & what can it do for you?
     3. Under the hood: OMOP Data Model & Mapping Process
     4. Showcase: OHDSI data analytics tools
     5. The application of OMOP and OHDSI in IMI EMIF
  3. Section 1: Introduction, The Hyve
  4. The Hyve: professional support for open source software for bioinformatics & medical informatics, such as tranSMART, cBioPortal, i2b2, Galaxy, CKAN and OHDSI
     • Mission: enable pre-competitive collaboration in life science R&D by leveraging open source software
     • Core values: share, reuse, specialize
     • Office locations: Utrecht, Netherlands; Cambridge, MA, United States
     • Services: software development, data science services, consultancy, hosting / SLAs
     • Fast-growing: started in 2012, 40 people by now
  5. Interdisciplinary team: software engineers, data scientists, project managers & staff; expertise in bioinformatics, medical informatics, software engineering, biostatistics etc.
  6. Open Source
     • Source code openly accessible and reusable for everyone
     • Enables pre-competitive collaboration: both academics and industry can use and enhance it
     • Transparency: verification (scientific as well as IT security) can be done by anyone, no ‘black box’
  7. Three health data areas The Hyve is active in
     • Translational Research Data (‘Clinical & bioinformatics data’)
     • Population Health Data (‘Real world data’)
     • Personal Health Data (‘Mobile & sensors data’)
     Example (RWD) projects: (not captured in this transcript)
  8. Section 2: What is OHDSI? Observational Health Data Sciences and Informatics
  9. (no text on this slide)
  10. What is OHDSI to you?
     • OHDSI is a scientific community to develop best practices for observational research studies
     • OHDSI is a data network bringing together data from over 650 million patients worldwide to execute studies
     • OMOP is an open data model and OHDSI is a suite of open source software tools for analysis (epidemiology, but also e.g. inclusion/exclusion criteria feasibility)
  11. Questions OHDSI can answer, given a set of patient journeys
  12. Questions OHDSI can answer
     • Clinical characterization: Which treatment did patients choose after diagnosis? Which patients chose which treatments? How many patients experienced the outcome after treatment?
     • Population-level effect estimation: Does one treatment cause the outcome more than an alternative? Does treatment cause outcome?
     • Patient-level prediction: What is the probability I will develop the disease? What is the probability I will experience the outcome?
  13. Questions OHDSI can answer
  14. How are patients with major depressive disorder treated in real world data (250M)? http://bit.ly/2jYCGkI
  15. Informing Clinical Trial Design
     • Designing and testing inclusion/exclusion criteria for trials
     • Performing observational studies as a basis for choosing effective randomized clinical trial designs and targets
     • Elucidating real world use of medicines and treatments for safety purposes
  16. Section 3: Under the hood, the OMOP data model & mapping process
  17. OMOP & OHDSI Tools - Overview (www.omop.org, www.ohdsi.org)
     • OMOP: Common Data Model for observational healthcare data: persons, drugs, procedures, devices, conditions etc.
     • OHDSI: large-scale analytics tools for observational data. An open source community developing, among others:
     • Tools to support the ETL / mapping process into OMOP (White Rabbit etc.)
     • Tools to perform analytics: e.g. Achilles for data profiling and Calypso for feasibility assessment, now being integrated into ATLAS
  18. OMOP Common Data Model v5.0
     • OMOP = Observational Medical Outcomes Partnership
     • CDM = Common Data Model
     • SQL tables
  19. OMOP-CDM: Person data table
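
The Person table slide shows only a screenshot, so as a rough illustration here is a minimal sketch of what "CDM = SQL tables" means in practice. It assumes a reduced subset of the PERSON columns and made-up rows in an in-memory SQLite database; it is not the official CDM DDL. The gender concept IDs 8507 and 8532 are the ones listed on slide 24, and 0 is used here as the "no matching concept" placeholder.

```python
import sqlite3

# Reduced PERSON table: a subset of columns chosen for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        person_id            INTEGER PRIMARY KEY,
        gender_concept_id    INTEGER NOT NULL,
        year_of_birth        INTEGER NOT NULL,
        race_concept_id      INTEGER NOT NULL,
        ethnicity_concept_id INTEGER NOT NULL
    )
""")

# Made-up example rows (person_id, gender, birth year, race, ethnicity).
rows = [
    (1, 8507, 1950, 0, 0),
    (2, 8532, 1972, 0, 0),
    (3, 8532, 1985, 0, 0),
]
conn.executemany("INSERT INTO person VALUES (?, ?, ?, ?, ?)", rows)

# Because every site uses the same table and concept IDs,
# the same query can run unchanged on any mapped database.
gender_counts = dict(conn.execute(
    "SELECT gender_concept_id, COUNT(*) FROM person GROUP BY gender_concept_id"
))
print(gender_counts)
```

The point of the sketch is the last query: once data is in a common structure with standard concept IDs, analysis code no longer needs to know anything about the source system.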
  20. (no text on this slide)
  21. Mapping the source data to OMOP CDM. Phases: ETL design, ETL implementation, ETL verification. Tools used:
     • White Rabbit: source data inventory
     • Rabbit in a Hat: map source tables to CDM structure (syntactic mapping)
     • Usagi: map source terms to CDM ontologies (vocabularies) (semantic mapping)
     • Achilles (ETL verification): review database profiles; review data quality assessment (Achilles Heel)
  22. Output from White Rabbit
     • Tab “Overview”: fields for each table
     • Tab “Medication”: per table, the values in each field and their frequencies (screenshot annotation: “= Medication name”)
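
To make the kind of output described on this slide concrete, the following is a minimal sketch of a per-field value-frequency scan. It is not White Rabbit's implementation; the function name `scan_table`, the field names, and the rows are all hypothetical.

```python
from collections import Counter


def scan_table(rows):
    """Return {field: Counter(value -> frequency)} for a list of dict rows."""
    report = {}
    for row in rows:
        for field, value in row.items():
            report.setdefault(field, Counter())[value] += 1
    return report


# Hypothetical source table "Medication" with made-up rows.
medication = [
    {"patient_id": 1, "medication_name": "Risperdal"},
    {"patient_id": 2, "medication_name": "Risperdal"},
    {"patient_id": 2, "medication_name": "Fortzaar"},
]

report = scan_table(medication)
fields = sorted(report)                              # analogue of the "Overview" tab
top = report["medication_name"].most_common(1)       # analogue of a per-table tab
print(fields, top)
```

A frequency listing like this is what lets an ETL designer see which source values actually occur before deciding how to map them.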
  23. Mapping of tables to CDM
  24. Map terms to target vocabularies
     • All coded items (gender, race etc.) need to be mapped
     • Mapping of medication, diagnosis and procedure values to the appropriate ontology (RxNorm, ICD-9 etc.)
     NHANES gender code | NHANES gender description | Equivalent OMOP SOURCE_CODE | OMOP SOURCE_CODE_DESCRIPTION | SOURCE_TO_CONCEPT_MAP_ID
     .                  | missing                   | U                           | UNKNOWN                      | 8551
     1                  | Male                      | M                           | MALE                         | 8507
     2                  | Female                    | F                           | FEMALE                       | 8532
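
The gender table on this slide can be expressed as a simple lookup. The sketch below uses only the three rows shown on the slide; the function name `map_gender` and the choice to fall back to "unknown" for any unlisted code are assumptions made for illustration.

```python
# NHANES gender code -> (OMOP source code, description, concept ID), from slide 24.
GENDER_MAP = {
    ".": ("U", "UNKNOWN", 8551),
    "1": ("M", "MALE", 8507),
    "2": ("F", "FEMALE", 8532),
}


def map_gender(nhanes_code):
    """Map a raw NHANES gender code to (source_code, concept_id).

    Assumption: codes not in the table are treated like the missing code ".".
    """
    key = str(nhanes_code).strip()
    source_code, _description, concept_id = GENDER_MAP.get(key, GENDER_MAP["."])
    return source_code, concept_id


print(map_gender(1), map_gender("2"), map_gender("."))
```

In a real ETL the fallback policy would be a design decision to document, since silently mapping unexpected codes to "unknown" can hide data quality problems.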
  25. Overview of ontologies used in OMOP: over 80 healthcare vocabularies mapped
  26. Section 4: OHDSI analytics tools
  27. Tools on GitHub
  28. Work with the community
  29. Ask the community
  30. What can I do with OHDSI tools?
     • Explore & QC the mapped data
     • Build cohort definitions using concept sets
     • Look at patient profiles
     • Run and evaluate queries for clinical study feasibility assessment
  31. ACHILLES: Database overview
  32. ACHILLES: Achilles Heel Report
  33. ACHILLES: Conditions Overview
  34. ATLAS: Vocabulary Search
  35. ATLAS: Concept Set Definition
  36. ATLAS: Cohort Definition
  37. ATLAS: Individual Patient Profile
  38. Inclusion/Exclusion Query Results (slide from P. Ryan, Janssen)
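
The inclusion/exclusion results slide is a screenshot, so here is a minimal sketch of the underlying idea: apply each criterion in turn and record how many patients remain. This is not how ATLAS implements it; the patient records, criteria, and the `attrition` function are all hypothetical.

```python
def attrition(patients, criteria):
    """Apply criteria in order; return [(step name, patients remaining), ...]."""
    remaining = list(patients)
    steps = [("initial population", len(remaining))]
    for name, keep in criteria:
        remaining = [p for p in remaining if keep(p)]
        steps.append((name, len(remaining)))
    return steps


# Made-up patients and trial criteria.
patients = [
    {"age": 45, "has_mdd": True, "prior_antidepressant": False},
    {"age": 17, "has_mdd": True, "prior_antidepressant": False},
    {"age": 60, "has_mdd": False, "prior_antidepressant": False},
    {"age": 52, "has_mdd": True, "prior_antidepressant": True},
]
criteria = [
    ("inclusion: age >= 18", lambda p: p["age"] >= 18),
    ("inclusion: MDD diagnosis", lambda p: p["has_mdd"]),
    ("exclusion: prior antidepressant", lambda p: not p["prior_antidepressant"]),
]

steps = attrition(patients, criteria)
for name, count in steps:
    print(f"{name}: {count}")
```

An attrition table like this shows which criterion removes the most candidates, which is the information a trial designer needs when deciding whether a protocol is feasible.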
  39. Section 5: IMI European Medical Information Framework
  40. EMIF vision: to become the trusted European hub for health care data intelligence, enabling new insights into diseases and treatments. Discover, assess, reuse.
  41. The real story of the treatments in clinical practice. The value of healthcare data for secondary uses in clinical research and development (Gary K. Mallow, Merck, HIMSS 2012)
     [Figure: The “burning platform” for life sciences. x-axis: years (1 to 9); y-axis: number of patient experiences/records (1,000 to 1 million); phase labels: R&D, Product Launch, Phase IV; region labels: “Pharma-owned highly controlled clinical trials data” and “Clinical practice, patients, payers and providers own the data”.]
     Challenge: today, pharma doesn’t have ready access to this data, yet insights for safety, CER and other areas are within this clinical domain, which includes medical records, pharmacy, labs, claims, radiology etc.
  42. Data available through EMIF consortium
     • Large variety in “types” of data
     • Data is available from more than 53 million subjects from seven EU countries, including: primary care data sets; hospital data; administrative data; regional record-linkage systems; registries and cohorts (broad and disease specific); biobanks
     • >25,000 subjects in AD cohorts
     • >90,000 subjects in metabolic cohorts
  43. EMIF Platform Design
     [Figure: architecture diagram. Labelled components: data sources (1° care, 2° care, hospital, admin, regional, registries & cohorts, biobanks, paediatric); data access modules and extracts per site (Site Y, Site Z); common ontology / de-identification; EMIF platform solution; governance; data owners; researchers; user admin; remote users 1 and 2.]
  44. Available data sources in EMIF (EMIF-Platform, available data sources; examples; status Jan 2016)
     [Figure: bar chart on a log scale from 100 to 100,000,000; axis: approximate total (cumulative) number of subjects; headline label: “>40 million”. Sources: MAAS, SDR, EGCUT, PEDIANET, SCTS, IMASIS, HSD, AUH, IPCI, ARS, SIDIAP, PHARMO, THIN. Bar labels as extracted: 1K, 2K, 52K, 400K, 475K, 2.8M, 2.3M, 10M, 3.6M, 1.6M, 1M, 12M, 6M.]
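
As a quick arithmetic check on this slide, the sketch below sums the thirteen bar labels as they appear in the transcript. Which label belongs to which data source cannot be recovered from the transcript, so no pairing is assumed; the helper `parse_count` is hypothetical.

```python
def parse_count(label):
    """Convert a label such as '52K' or '2.8M' to an integer count."""
    multipliers = {"K": 1_000, "M": 1_000_000}
    return round(float(label[:-1]) * multipliers[label[-1]])


# Bar labels in the order they appear in the transcript of slide 44.
labels = ["1K", "2K", "52K", "400K", "475K", "2.8M", "2.3M",
          "10M", "3.6M", "1.6M", "1M", "12M", "6M"]

total = sum(parse_count(label) for label in labels)
print(f"{total:,}")
```

The sum comes to 40,230,000, which is consistent with the “>40 million” headline on the slide, assuming the labels are per-source subject counts.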
  45. Catalogue with available data sources: https://emif-catalogue.eu. Just released last week! See www.emif.eu
  46. Catalogue with available data sources: https://emif-catalogue.eu
  47. Automatic Mapping of Drug Concepts to the RxNorm Vocabulary (poster)
     Maxim Moinat* [1], Lars Pedersen [2], Jolanda Strubel [1], Marinel Cavelaars [1], Kees van Bochove [1], Peter Rijnbeek [3], Michel van Speybroeck [4], Martijn Schuemie [4]
     [1] The Hyve, Utrecht, The Netherlands; The Hyve, Cambridge, United States. [2] Aarhus University Hospital, Aarhus, Denmark. [3] Erasmus MC, Rotterdam, The Netherlands. [4] Janssen Pharmaceuticals, Inc. *E-mail: maxim@thehyve.nl.
     Tagline: We empower scientists by building on open source software.
     Background. Mapping source concepts to the standard concepts in the OMOP vocabularies is one of the most time-consuming tasks during the transformation to the OMOP Common Data Model. Drug mapping is particularly challenging, because different components have to be mapped: ingredient, dose form and strength. As part of the European Medical Information Framework (EMIF) project, Danish population health data are mapped to the OMOP CDM, including the local drug codes. The Hyve assists in creating a script to automatically map a set of 4754 drugs to the RxNorm vocabulary. The input data contains ATC codes, dosage forms, numerical strengths and strength units. Two examples are shown in Figure 1. The mapping procedure presented here is based on the drug mapping for the Japan Medical Data Center Claims Database [I].
     Mapping procedure. The mapping uses the RxNorm hierarchy and consists of four steps (see Figure 2).
     • Step 1: Drugs are mapped to RxNorm Ingredient via the 5th level ATC code. The OMOP relationship ‘ATC - RxNorm’ is used for this purpose.
     • Step 2: Dose form is added to the ingredient level, to map to Clinical Drug Form level.
     • Step 3: The information on drug strength (including unit) is added to map to Clinical [transcript ends here]
     Figure 1: Examples of input data. Example 1 is successfully mapped automatically. Example 2 consists of two ingredients and has an ATC concept that could not be mapped to an RxNorm concept.
     • Example 1: Risperdal; N05AX08; Filmovertrukne tabletter; 0.5; MG. Mapped to: Risperidone 0.5 MG Oral Tablet (RxNorm Clinical Drug)
     • Example 2: Fortzaar; C09DA06; depottabletter; 100 + 25 mg. Mapped to: Candesartan and diuretics (ATC code)
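
The first three steps of the poster's mapping procedure can be sketched as a chain of lookups that stops at the most specific level it can reach. This is a minimal illustration, not the script described on the poster: the two lookup tables are tiny hypothetical stand-ins for OMOP vocabulary tables, the function `map_drug` is invented, and it returns drug names rather than concept IDs. The fourth step is cut off in the transcript and is therefore not modelled; the "Clinical Drug" level name is taken from the Example 1 result in Figure 1.

```python
# Hypothetical stand-in for the 'ATC - RxNorm' relationship (step 1).
ATC_TO_INGREDIENT = {"N05AX08": "Risperidone"}

# Hypothetical stand-in for a local dose form -> RxNorm dose form table (step 2).
DOSE_FORM = {"filmovertrukne tabletter": "Oral Tablet"}


def map_drug(atc_code, dose_form, strength, unit):
    """Return (mapped name, level reached) for one source drug record."""
    ingredient = ATC_TO_INGREDIENT.get(atc_code)
    if ingredient is None:
        return None, "unmapped"                      # step 1 failed
    form = DOSE_FORM.get(dose_form.lower())
    if form is None:
        return ingredient, "Ingredient"              # stop after step 1
    if strength is None or unit is None:
        return f"{ingredient} {form}", "Clinical Drug Form"  # stop after step 2
    # Step 3: add strength and unit to reach the most specific level.
    return f"{ingredient} {strength} {unit.upper()} {form}", "Clinical Drug"


example_1 = map_drug("N05AX08", "Filmovertrukne tabletter", "0.5", "MG")
example_2 = map_drug("C09DA06", "depottabletter", "100 + 25", "mg")
print(example_1)
print(example_2)
```

With these toy tables, Example 1 reaches the Clinical Drug level while Example 2 stays unmapped because its ATC code has no ingredient entry, mirroring the two outcomes described in the Figure 1 caption.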
  48. Use of OMOP/OHDSI provides EMIF with:
     • A uniform way to perform suitability and feasibility queries across multiple diverse European data sources
     • An entry point to quickly initiate and perform observational studies within one or more data sources
     • Direct insight & dashboarding of data for data owners (e.g. national registries, hospitals)
  49. The goal is patient benefit. “We need to learn from experience and find ways to unite the large volumes of data in Europe. At the end of the day, we are in this for better health care.” Prof. Johan van der Lei, Erasmus MC University Medical Center, Co-coordinator EMIF-Platform
