SlideShare a Scribd company logo
1 of 14
Download to read offline
PEDSnet:	
  A	
  Na-onal	
  Pediatric	
  Learning	
  
Health	
  System	
  
18-­‐month	
  Summary	
  on	
  Data	
  Integra3on	
  and	
  Data	
  Quality	
  
	
  
Ritu	
  Khare	
  
	
  
	
  
	
  
	
  
	
  
October	
  23	
  2015	
  
	
  
1	
  
PEDSnet	
  	
  
A	
  Na-onal	
  Pediatric	
  Learning	
  Health	
  System	
  (Forrest	
  et	
  al.	
  2014)	
  
•  What	
  is	
  PEDSnet?	
  
–  Clinical	
  Data	
  Research	
  Network	
  (CDRN)	
  	
  
•  Facilitate	
  large-­‐scale	
  research	
  
–  Aggregates	
  pediatric	
  EHR	
  data	
  from	
  mulNple	
  
insNtuNons	
  	
  
•  PCORI	
  Support	
  
–  One	
  of	
  the	
  11	
  CDRNs	
  supported	
  by	
  PaNent-­‐
Centered	
  Outcomes	
  Research	
  InsNtute	
  (PCORI)	
  
–  Part	
  of	
  a	
  larger	
  network,	
  PCORnet,	
  which	
  
combines	
  data	
  from	
  all	
  CDRNs	
  under	
  a	
  common	
  
data	
  model	
  	
  
2	
  
PEDSnet	
  Organiza3onal	
  Architecture	
  
3	
  
	
  
•  Establish	
  an	
  interoperable	
  digital	
  infrastructure	
  to	
  support	
  learning	
  
(research,	
  disseminaNon,	
  and	
  implementaNon),	
  recruitment,	
  and	
  
engagement	
  of	
  stakeholders.	
  	
  
•  Validate	
  that	
  the	
  aggregated	
  data	
  is	
  high-­‐quality	
  and	
  ready	
  to	
  be	
  
integrated	
  into	
  the	
  PCORnet	
  research	
  network	
  
Leadership	
  
Core	
  
Network	
  
Core	
  
Science	
  Core	
  
Engagement	
  
Core	
  
InformaNcs	
  Core	
  
(>50%	
  of	
  team	
  
members)	
  
Steering	
  CommiWee	
  
NaNonal	
  Advisory	
  CommiWee	
  
Building	
  PEDSnet	
  
Challenges	
  and	
  OpportuniNes	
  
4	
  
S1	
  
S2	
  
S3	
  
S4	
  
S5	
  
S6	
  
S7	
  
S8	
  
PCORnet	
  
Common	
  
Data	
  Model	
  
PEDSnet	
  sites	
  	
  
(at	
  various	
  loca-ons)	
  
PEDSnet	
  Data	
  CoordinaNng	
  Center	
  
•  lack	
  of	
  EHR	
  data’s	
  
orientaNon	
  toward	
  
research	
  analyNcs	
  
•  diversity	
  in	
  EHR	
  
configuraNons	
  
•  semanNc	
  heterogeneity	
  
	
  	
  
•  data	
  peculiariNes	
  in	
  
pediatrics	
  
•  large-­‐scale	
  
•  Interdisciplinary	
  
informaNcs	
  team	
  
•  ObservaNonal	
  
medical	
  outcomes	
  
partnership	
  (OMOP)	
  
•  common	
  data	
  
model	
  	
  
•  unified	
  vocabulary	
  	
  
•  Advances	
  in	
  data	
  
science	
  (big	
  data	
  and	
  
text	
  mining	
  tools)	
  
	
   	
  	
  
Challenges	
   Opportuni3es	
  
Building	
  PEDSnet	
  	
  
Overall	
  Approach	
  
5	
  
S1	
  
S2	
  
S3	
  
S4	
  
S5	
  
S6	
  
S7	
  
S8	
  
PCORnet	
  
Common	
  
Data	
  Model	
  
PEDSnet	
  sites	
  	
  
(at	
  various	
  loca-ons)	
  
	
  
OMOP	
  
Common	
  Data	
  
Model	
  
	
  
Post	
  hoc	
  
Data	
  Quality	
  
Assessment	
  
OMOP	
  to	
  
PCORnet	
  
transformaNon	
  
scripts	
  
Extract-­‐transform-­‐
load	
  (ETL)?	
  
Fitness	
  for	
  
research?	
  
•  More	
  granular	
  	
  
•  Vocabulary	
  support	
  
•  OMOP	
  analyNc	
  tools	
  
•  Pediatric	
  specific	
  edits	
  
Specs	
  
Post	
  hoc	
  Data	
  Quality	
  Assessment	
  
One	
  Data	
  Cycle	
  
	
  	
  
6	
  
Data	
  Quality	
  
Checks	
  
OMOP	
  
common	
  data	
  
model	
  
	
  
Communicate	
  
data	
  quality	
  
issues	
  to	
  sites	
  
	
  
Two	
  Expert	
  
Reviewers	
  
Site	
  ETL	
  
Team	
  
•  Formal	
  ViolaNons	
  of	
  SpecificaNons	
  
–  Illegal	
  values	
  in	
  coded	
  fields	
  (e.g.	
  paNent’s	
  race)	
  
–  Any	
  missing	
  data	
  or	
  informaNon	
  	
  (discharge	
  flags	
  missing	
  for	
  outpaNent	
  
visits)	
  
•  Implausible	
  Events	
  	
  
–  Death	
  date	
  in	
  2017	
  	
  
–  Visit	
  date	
  in	
  1930	
  
–  Height	
  =	
  6	
  cm	
  	
  
•  Outliers	
  (by	
  Comparison)	
  
–  “diabetes”	
  	
  -­‐	
  	
  most	
  frequent	
  condiNon	
  
–  Over	
  50	
  procedures/paNent	
  at	
  a	
  given	
  site	
  (other	
  sites	
  having	
  15-­‐20)	
  
•  Others	
  (catch	
  our	
  aWenNon!)	
  
–  a	
  provider	
  responsible	
  for	
  recording	
  over	
  2M	
  measurements	
  
–  An	
  outpaNent	
  visit	
  having	
  30k	
  measurements	
  	
  
–  sudden	
  increase	
  in	
  number	
  of	
  facts	
  aier	
  a	
  certain	
  date/year	
   7	
  
Post	
  hoc	
  Data	
  Quality	
  Assessment	
  
Types	
  of	
  Checks	
  and	
  Example	
  Issues	
  	
  
Post	
  hoc	
  Data	
  Quality	
  Assessment	
  
Communica-on	
  with	
  Sites	
  	
  
	
  
•  Prepare	
  data	
  quality	
  reports	
  
•  IdenNfy	
  causes	
  of	
  issues	
  and	
  propose	
  soluNons	
  
8	
  
Visual	
  Report	
   GitHub	
  issue	
  report	
  
Ranking	
  
18-­‐month	
  Experience	
  Summary	
  
9	
  
Data	
  Content	
  of	
  the	
  Network	
  	
  
(8	
  sites)	
  
~4.7M	
  children	
  
~98M	
  clinical	
  encounters	
  	
  	
  
	
  
~120M	
  condiNons	
  	
  
~68M	
  procedures	
  	
  
~100M	
  medicaNons	
  
~42M	
  observaNons	
  
~428M	
  labs	
  and	
  vital	
  signs	
  
Prospec3ve	
  Methods	
  
Communica-on	
   >	
  90	
  ETL	
  MeeNngs	
  
ETL	
  Code	
   	
  95	
  GitHub	
  commits/site	
  
ETL	
  
Documenta-on	
  
~150	
  GitHub	
  commits	
  
	
  
Post	
  hoc	
  
Data	
  Cycle	
  
OMOP	
  Model	
  
version	
  
Total	
  Issues	
  
Jan	
  ‘15	
   OMOP	
  v4	
   117	
  
Mar	
  ‘15	
   OMOP	
  v4	
   313	
  
May	
  ‘15	
   OMOP	
  v5	
   440	
  
Jul	
  ‘15	
   OMOP	
  v5	
   591	
  
Sep	
  ’15	
   OMOP	
  v5	
   505	
  
18-­‐month	
  Experience	
  Summary	
  
Data	
  Quality	
  in	
  the	
  Most	
  Recent	
  Data	
  Cycle	
  
10	
  
Most	
  Frequent	
  Issues	
  
Missing	
  Data	
   31%	
  
EnNty	
  Outliers	
   15%	
  
Unexpected	
  difference	
  
from	
  previous	
  ETL	
  run	
  
9%	
  
Implausible	
  Dates	
   9%	
  
Temporal	
  Outliers	
   6%	
  
Unexpected	
  Facts	
   4%	
  
Domains	
  with	
  Most	
  Issues	
  
MedicaNon	
   21%	
  
Labs	
  and	
  Vital	
  Signs	
   13%	
  
ObservaNon	
   11%	
  
CondiNon	
   10%	
  
Procedure	
   9%	
  
Demographic	
   7%	
  
Encounter	
   7%	
  
18-­‐month	
  Experience	
  Summary	
  
Causes	
  of	
  Data	
  Quality	
  Issues	
  
•  ETL	
  code	
  	
  
–  Programming	
  bug/mistake	
  	
  
–  Ambiguous	
  or	
  incomplete	
  ETL	
  specificaNons	
  	
  
–  AdministraNve	
  reasons	
  	
  
–  Can	
  be	
  fixed	
  
•  Source	
  Data	
  CharacterisNcs	
  
–  Data	
  Entry	
  Error	
  	
  
–  Clinical	
  or	
  administraNve	
  workflow	
  	
  
–  True	
  anomaly	
  	
  
–  Cannot	
  be	
  fixed	
  	
  
11	
  
18-­‐month	
  Experience	
  Summary	
  
Causes	
  of	
  Issues	
  and	
  Lessons	
  Learnt	
  
12	
  
u  ETL	
  	
  
u  Data	
  CharacterisNcs	
  
u  I2b2	
  transform	
  
u  Non-­‐issue	
  0	
  
20	
  
40	
  
60	
  
Jan	
  '15	
  
(OMOP	
  v4)	
  
Mar	
  '15	
  
(OMOP	
  v4)	
  
May	
  '15	
  
(OMOP	
  v5)	
  
Jul	
  '15	
  
(OMOP	
  v5)	
  
Percentage	
  of	
  
Issues	
  
Distribu3on	
  of	
  Issue	
  Causes	
  across	
  Data	
  Cycles	
  
•  A	
  strong	
  post	
  hoc	
  program	
  is	
  important	
  in	
  assessment	
  of	
  data	
  
validity	
  in	
  CDRNs	
  
•  The	
  percentage	
  of	
  “ETL”	
  issues	
  decreased	
  across	
  successive	
  data	
  cycles	
  
•  Avg.	
  22	
  ETL	
  issues	
  /	
  site.	
  	
  
•  This	
  study	
  serves	
  as	
  a	
  data	
  educaNon	
  plarorm	
  	
  
•  The	
  total	
  number	
  of	
  “data	
  characterisNc”	
  issues	
  increased	
  across	
  cycles	
  
Future	
  Work	
  
•  Advanced	
  data	
  quality	
  checks	
  	
  
–  Current	
  work	
  largely	
  based	
  on	
  aWribute-­‐level	
  checks	
  
–  Comparison	
  with	
  external	
  datasets	
  and	
  published	
  
findings	
  	
  
•  DeterminaNon	
  of	
  research	
  fitness	
  
–  based	
  on	
  the	
  source	
  data	
  characterisNcs	
  idenNfied	
  
through	
  post	
  hoc	
  assessments	
  
•  AutomaNon	
  	
  
–  The	
  data	
  quality	
  check	
  scripts	
  applied	
  to	
  other	
  OMOP-­‐
based	
  network.	
  	
  
13	
  
Acknowledgments	
  
14	
  
•  PEDSnet	
  Data	
  
CoordinaNng	
  Center	
  
–  Aaron	
  Browne	
  
–  Byron	
  Ruth	
  
–  EvaneWe	
  Burrows	
  
–  Jeff	
  Pennington	
  	
  
–  Kevin	
  Murphy	
  
–  Lemar	
  Davidson	
  
–  Lena	
  Kushleyeva	
  
–  Levon	
  UNdjian	
  	
  	
  
–  Toan	
  Ong	
  
•  PEDSnet	
  ETL	
  teams	
  at	
  
various	
  sites	
  
•  PEDSnet	
  InformaNcs	
  
Leadership	
  
–  Charles	
  Bailey	
  	
  
–  Michael	
  Kahn	
  	
  
–  Keith	
  Marsolo	
  	
  
–  Christopher	
  Forrest	
  
	
  
•  OHDSI	
  ConsorNum	
  	
  
	
  
•  PaNents	
  and	
  Families	
  	
  
	
  
•  This	
  work	
  was	
  supported	
  
by	
  PCORI	
  Contract	
  
CDRN-­‐1306-­‐01556.	
  

More Related Content

What's hot

Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...
Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...
Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...Charleston Conference
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...GigaScience, BGI Hong Kong
 
Large Scale Studies: Malware Needles in a Haystack
Large Scale Studies: Malware Needles in a HaystackLarge Scale Studies: Malware Needles in a Haystack
Large Scale Studies: Malware Needles in a HaystackMarcus Botacin
 
Ryan_Hupe_Resume
Ryan_Hupe_ResumeRyan_Hupe_Resume
Ryan_Hupe_ResumeRyan Hupe
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentMaribel Acosta Deibe
 
IRIDA: Canada’s federated platform for genomic epidemiology
IRIDA: Canada’s federated platform for genomic epidemiology IRIDA: Canada’s federated platform for genomic epidemiology
IRIDA: Canada’s federated platform for genomic epidemiology William Hsiao
 
New Testing Standards Are on the Horizon: What Will Be Their Impact?
New Testing Standards Are on the Horizon: What Will Be Their Impact?New Testing Standards Are on the Horizon: What Will Be Their Impact?
New Testing Standards Are on the Horizon: What Will Be Their Impact?TechWell
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015Juan Antonio Vizcaino
 
IRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiao
IRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiaoIRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiao
IRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiaoIRIDA_community
 
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality?  - William HsiaoHow Can We Make Genomic Epidemiology a Widespread Reality?  - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality? - William HsiaoWilliam Hsiao
 
Dynamic Social Network Analysis (and more!) with eResearch Tools
Dynamic Social Network Analysis (and more!) with eResearch ToolsDynamic Social Network Analysis (and more!) with eResearch Tools
Dynamic Social Network Analysis (and more!) with eResearch ToolsAndrea Wiggins
 

What's hot (17)

Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...
Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...
Holdings Verification, Monitoring and Collecting Statistics – How Can Librari...
 
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
 
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
The US-EPA CompTox Chemicals Dashboard – a key player in the domain of Open S...
 
Large Scale Studies: Malware Needles in a Haystack
Large Scale Studies: Malware Needles in a HaystackLarge Scale Studies: Malware Needles in a Haystack
Large Scale Studies: Malware Needles in a Haystack
 
Overview of open resources to support automated structure verification and e...
Overview of open resources to support automated structure verification  and e...Overview of open resources to support automated structure verification  and e...
Overview of open resources to support automated structure verification and e...
 
Ryan_Hupe_Resume
Ryan_Hupe_ResumeRyan_Hupe_Resume
Ryan_Hupe_Resume
 
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
Introduction to Cheminformatics: Accessing data through the CompTox Chemicals...
 
10th Annual Utah's Health Services Research Conference - Data Quality in Mult...
10th Annual Utah's Health Services Research Conference - Data Quality in Mult...10th Annual Utah's Health Services Research Conference - Data Quality in Mult...
10th Annual Utah's Health Services Research Conference - Data Quality in Mult...
 
Crowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality AssessmentCrowdsourcing Linked Data Quality Assessment
Crowdsourcing Linked Data Quality Assessment
 
IRIDA: Canada’s federated platform for genomic epidemiology
IRIDA: Canada’s federated platform for genomic epidemiology IRIDA: Canada’s federated platform for genomic epidemiology
IRIDA: Canada’s federated platform for genomic epidemiology
 
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental scienceUS-EPA Chemicals Dashboard – an integrated data hub for environmental science
US-EPA Chemicals Dashboard – an integrated data hub for environmental science
 
New Testing Standards Are on the Horizon: What Will Be Their Impact?
New Testing Standards Are on the Horizon: What Will Be Their Impact?New Testing Standards Are on the Horizon: What Will Be Their Impact?
New Testing Standards Are on the Horizon: What Will Be Their Impact?
 
ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015ProteomeXchange_and_PRIDE_Semmeting_2015
ProteomeXchange_and_PRIDE_Semmeting_2015
 
IRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiao
IRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiaoIRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiao
IRIDA: Canada’s federated platform for genomic epidemiology, ABPHM 2015 WHsiao
 
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality?  - William HsiaoHow Can We Make Genomic Epidemiology a Widespread Reality?  - William Hsiao
How Can We Make Genomic Epidemiology a Widespread Reality? - William Hsiao
 
Dynamic Social Network Analysis (and more!) with eResearch Tools
Dynamic Social Network Analysis (and more!) with eResearch ToolsDynamic Social Network Analysis (and more!) with eResearch Tools
Dynamic Social Network Analysis (and more!) with eResearch Tools
 
An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...An examination of data quality on QSAR Modeling in regards to the environment...
An examination of data quality on QSAR Modeling in regards to the environment...
 

Similar to PEDSnet : 18 month summary on data integration and data quality

Ifla203 archambault
Ifla203 archambaultIfla203 archambault
Ifla203 archambaultsusangar
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsAravind Sesagiri Raamkumar
 
Peng Privette SMM_AMS2014_P695
Peng Privette SMM_AMS2014_P695Peng Privette SMM_AMS2014_P695
Peng Privette SMM_AMS2014_P695Ge Peng
 
An introduction to system-oriented evaluation in Information Retrieval
An introduction to system-oriented evaluation in Information RetrievalAn introduction to system-oriented evaluation in Information Retrieval
An introduction to system-oriented evaluation in Information RetrievalMounia Lalmas-Roelleke
 
Hands-on Machine Learning Using Healthcare
Hands-on Machine Learning Using HealthcareHands-on Machine Learning Using Healthcare
Hands-on Machine Learning Using HealthcareHealth Catalyst
 
Quartesian capabilities-2013
Quartesian capabilities-2013Quartesian capabilities-2013
Quartesian capabilities-2013Benjamin Jackson
 
Importance of data standards and system validation of software for clinical r...
Importance of data standards and system validation of software for clinical r...Importance of data standards and system validation of software for clinical r...
Importance of data standards and system validation of software for clinical r...Wolfgang Kuchinke
 
Overview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search EditionOverview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search Editionkrisztianbalog
 
CINECA webinar slides: Modular and reproducible workflows for federated molec...
CINECA webinar slides: Modular and reproducible workflows for federated molec...CINECA webinar slides: Modular and reproducible workflows for federated molec...
CINECA webinar slides: Modular and reproducible workflows for federated molec...CINECAProject
 
eSource, DIA EuroMeeting, Lisbon, March 2005
eSource, DIA EuroMeeting, Lisbon, March 2005eSource, DIA EuroMeeting, Lisbon, March 2005
eSource, DIA EuroMeeting, Lisbon, March 2005AsseroLtd
 
eSource: Data Capture Simplified - Uncover Time and Cost Saving Possibilities
eSource: Data Capture Simplified - Uncover Time and Cost Saving PossibilitieseSource: Data Capture Simplified - Uncover Time and Cost Saving Possibilities
eSource: Data Capture Simplified - Uncover Time and Cost Saving Possibilitieswww.datatrak.com
 
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Oscar Peña del Rio
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer SchoolCarole Goble
 
BDE Technical Webinar 1 : Requirements elicitation
BDE Technical Webinar 1 : Requirements elicitationBDE Technical Webinar 1 : Requirements elicitation
BDE Technical Webinar 1 : Requirements elicitationBigData_Europe
 
Multivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysisMultivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysisDmitry Grapov
 
PT-CRIS: EUROCRIS: 2013: Eco-system of research information systems
PT-CRIS: EUROCRIS: 2013: Eco-system of research information systemsPT-CRIS: EUROCRIS: 2013: Eco-system of research information systems
PT-CRIS: EUROCRIS: 2013: Eco-system of research information systemsJoão Mendes Moreira
 
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnairesItem 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnairesSoils FAO-GSP
 

Similar to PEDSnet : 18 month summary on data integration and data quality (20)

Ifla203 archambault
Ifla203 archambaultIfla203 archambault
Ifla203 archambault
 
Multi-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender SystemsMulti-method Evaluation in Scientific Paper Recommender Systems
Multi-method Evaluation in Scientific Paper Recommender Systems
 
Peng Privette SMM_AMS2014_P695
Peng Privette SMM_AMS2014_P695Peng Privette SMM_AMS2014_P695
Peng Privette SMM_AMS2014_P695
 
Fundamental of Quality Data - Anthony Ndungu
Fundamental of Quality Data - Anthony NdunguFundamental of Quality Data - Anthony Ndungu
Fundamental of Quality Data - Anthony Ndungu
 
An introduction to system-oriented evaluation in Information Retrieval
An introduction to system-oriented evaluation in Information RetrievalAn introduction to system-oriented evaluation in Information Retrieval
An introduction to system-oriented evaluation in Information Retrieval
 
Hands-on Machine Learning Using Healthcare
Hands-on Machine Learning Using HealthcareHands-on Machine Learning Using Healthcare
Hands-on Machine Learning Using Healthcare
 
Quartesian capabilities-2013
Quartesian capabilities-2013Quartesian capabilities-2013
Quartesian capabilities-2013
 
Importance of data standards and system validation of software for clinical r...
Importance of data standards and system validation of software for clinical r...Importance of data standards and system validation of software for clinical r...
Importance of data standards and system validation of software for clinical r...
 
Overview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search EditionOverview of the TREC 2016 Open Search track: Academic Search Edition
Overview of the TREC 2016 Open Search track: Academic Search Edition
 
CINECA webinar slides: Modular and reproducible workflows for federated molec...
CINECA webinar slides: Modular and reproducible workflows for federated molec...CINECA webinar slides: Modular and reproducible workflows for federated molec...
CINECA webinar slides: Modular and reproducible workflows for federated molec...
 
eSource, DIA EuroMeeting, Lisbon, March 2005
eSource, DIA EuroMeeting, Lisbon, March 2005eSource, DIA EuroMeeting, Lisbon, March 2005
eSource, DIA EuroMeeting, Lisbon, March 2005
 
eSource: Data Capture Simplified - Uncover Time and Cost Saving Possibilities
eSource: Data Capture Simplified - Uncover Time and Cost Saving PossibilitieseSource: Data Capture Simplified - Uncover Time and Cost Saving Possibilities
eSource: Data Capture Simplified - Uncover Time and Cost Saving Possibilities
 
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
Resource Classification as the Basis for a Visualization Pipeline in LOD Scen...
 
Data analytics, a (short) tour
Data analytics, a (short) tourData analytics, a (short) tour
Data analytics, a (short) tour
 
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
Being FAIR:  FAIR data and model management SSBSS 2017 Summer SchoolBeing FAIR:  FAIR data and model management SSBSS 2017 Summer School
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
 
The UK National Chemical Database Service – an integration of commercial and ...
The UK National Chemical Database Service – an integration of commercial and ...The UK National Chemical Database Service – an integration of commercial and ...
The UK National Chemical Database Service – an integration of commercial and ...
 
BDE Technical Webinar 1 : Requirements elicitation
BDE Technical Webinar 1 : Requirements elicitationBDE Technical Webinar 1 : Requirements elicitation
BDE Technical Webinar 1 : Requirements elicitation
 
Multivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysisMultivarite and network tools for biological data analysis
Multivarite and network tools for biological data analysis
 
PT-CRIS: EUROCRIS: 2013: Eco-system of research information systems
PT-CRIS: EUROCRIS: 2013: Eco-system of research information systemsPT-CRIS: EUROCRIS: 2013: Eco-system of research information systems
PT-CRIS: EUROCRIS: 2013: Eco-system of research information systems
 
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnairesItem 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
Item 2 : Results of the Spectral Soil Data - Needs and capacities questionnaires
 

Recently uploaded

Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )aarthirajkumar25
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bSérgio Sacani
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.Nitya salvi
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxAArockiyaNisha
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡anilsa9823
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000Sapana Sha
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRDelhi Call girls
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PPRINCE C P
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPirithiRaju
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsAArockiyaNisha
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Sérgio Sacani
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisDiwakar Mishra
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsSumit Kumar yadav
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTSérgio Sacani
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...Sérgio Sacani
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bSérgio Sacani
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfmuntazimhurra
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSarthak Sekhar Mondal
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSSLeenakshiTyagi
 

Recently uploaded (20)

Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )Recombination DNA Technology (Nucleic Acid Hybridization )
Recombination DNA Technology (Nucleic Acid Hybridization )
 
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 bAsymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
Asymmetry in the atmosphere of the ultra-hot Jupiter WASP-76 b
 
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
❤Jammu Kashmir Call Girls 8617697112 Personal Whatsapp Number 💦✅.
 
CELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdfCELL -Structural and Functional unit of life.pdf
CELL -Structural and Functional unit of life.pdf
 
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptxPhysiochemical properties of nanomaterials and its nanotoxicity.pptx
Physiochemical properties of nanomaterials and its nanotoxicity.pptx
 
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service  🪡
CALL ON ➥8923113531 🔝Call Girls Kesar Bagh Lucknow best Night Fun service 🪡
 
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 60009654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
9654467111 Call Girls In Raj Nagar Delhi Short 1500 Night 6000
 
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCRStunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
Stunning ➥8448380779▻ Call Girls In Panchshil Enclave Delhi NCR
 
VIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C PVIRUSES structure and classification ppt by Dr.Prince C P
VIRUSES structure and classification ppt by Dr.Prince C P
 
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdfPests of cotton_Sucking_Pests_Dr.UPR.pdf
Pests of cotton_Sucking_Pests_Dr.UPR.pdf
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
Discovery of an Accretion Streamer and a Slow Wide-angle Outflow around FUOri...
 
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral AnalysisRaman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
Raman spectroscopy.pptx M Pharm, M Sc, Advanced Spectral Analysis
 
Botany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questionsBotany krishna series 2nd semester Only Mcq type questions
Botany krishna series 2nd semester Only Mcq type questions
 
Disentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOSTDisentangling the origin of chemical differences using GHOST
Disentangling the origin of chemical differences using GHOST
 
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
All-domain Anomaly Resolution Office U.S. Department of Defense (U) Case: “Eg...
 
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43bNightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
Nightside clouds and disequilibrium chemistry on the hot Jupiter WASP-43b
 
Biological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdfBiological Classification BioHack (3).pdf
Biological Classification BioHack (3).pdf
 
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatidSpermiogenesis or Spermateleosis or metamorphosis of spermatid
Spermiogenesis or Spermateleosis or metamorphosis of spermatid
 
DIFFERENCE IN BACK CROSS AND TEST CROSS
DIFFERENCE IN  BACK CROSS AND TEST CROSSDIFFERENCE IN  BACK CROSS AND TEST CROSS
DIFFERENCE IN BACK CROSS AND TEST CROSS
 

PEDSnet : 18 month summary on data integration and data quality

  • 1. PEDSnet:  A  Na-onal  Pediatric  Learning   Health  System   18-­‐month  Summary  on  Data  Integra3on  and  Data  Quality     Ritu  Khare             October  23  2015     1  
  • 2. PEDSnet     A  Na-onal  Pediatric  Learning  Health  System  (Forrest  et  al.  2014)   •  What  is  PEDSnet?   –  Clinical  Data  Research  Network  (CDRN)     •  Facilitate  large-­‐scale  research   –  Aggregates  pediatric  EHR  data  from  mulNple   insNtuNons     •  PCORI  Support   –  One  of  the  11  CDRNs  supported  by  PaNent-­‐ Centered  Outcomes  Research  InsNtute  (PCORI)   –  Part  of  a  larger  network,  PCORnet,  which   combines  data  from  all  CDRNs  under  a  common   data  model     2  
  • 3. PEDSnet  Organiza3onal  Architecture   3     •  Establish  an  interoperable  digital  infrastructure  to  support  learning   (research,  disseminaNon,  and  implementaNon),  recruitment,  and   engagement  of  stakeholders.     •  Validate  that  the  aggregated  data  is  high-­‐quality  and  ready  to  be   integrated  into  the  PCORnet  research  network   Leadership   Core   Network   Core   Science  Core   Engagement   Core   InformaNcs  Core   (>50%  of  team   members)   Steering  CommiWee   NaNonal  Advisory  CommiWee  
  • 4. Building  PEDSnet   Challenges  and  OpportuniNes   4   S1   S2   S3   S4   S5   S6   S7   S8   PCORnet   Common   Data  Model   PEDSnet  sites     (at  various  loca-ons)   PEDSnet  Data  CoordinaNng  Center   •  lack  of  EHR  data’s   orientaNon  toward   research  analyNcs   •  diversity  in  EHR   configuraNons   •  semanNc  heterogeneity       •  data  peculiariNes  in   pediatrics   •  large-­‐scale   •  Interdisciplinary   informaNcs  team   •  ObservaNonal   medical  outcomes   partnership  (OMOP)   •  common  data   model     •  unified  vocabulary     •  Advances  in  data   science  (big  data  and   text  mining  tools)         Challenges   Opportuni3es  
  • 5. Building  PEDSnet     Overall  Approach   5   S1   S2   S3   S4   S5   S6   S7   S8   PCORnet   Common   Data  Model   PEDSnet  sites     (at  various  loca-ons)     OMOP   Common  Data   Model     Post  hoc   Data  Quality   Assessment   OMOP  to   PCORnet   transformaNon   scripts   Extract-­‐transform-­‐ load  (ETL)?   Fitness  for   research?   •  More  granular     •  Vocabulary  support   •  OMOP  analyNc  tools   •  Pediatric  specific  edits   Specs  
  • 6. Post  hoc  Data  Quality  Assessment   One  Data  Cycle       6   Data  Quality   Checks   OMOP   common  data   model     Communicate   data  quality   issues  to  sites     Two  Expert   Reviewers   Site  ETL   Team  
  • 7. •  Formal  ViolaNons  of  SpecificaNons   –  Illegal  values  in  coded  fields  (e.g.  paNent’s  race)   –  Any  missing  data  or  informaNon    (discharge  flags  missing  for  outpaNent   visits)   •  Implausible  Events     –  Death  date  in  2017     –  Visit  date  in  1930   –  Height  =  6  cm     •  Outliers  (by  Comparison)   –  “diabetes”    -­‐    most  frequent  condiNon   –  Over  50  procedures/paNent  at  a  given  site  (other  sites  having  15-­‐20)   •  Others  (catch  our  aWenNon!)   –  a  provider  responsible  for  recording  over  2M  measurements   –  An  outpaNent  visit  having  30k  measurements     –  sudden  increase  in  number  of  facts  aier  a  certain  date/year   7   Post  hoc  Data  Quality  Assessment   Types  of  Checks  and  Example  Issues    
  • 8. Post  hoc  Data  Quality  Assessment   Communica-on  with  Sites       •  Prepare  data  quality  reports   •  IdenNfy  causes  of  issues  and  propose  soluNons   8   Visual  Report   GitHub  issue  report   Ranking  
  • 9. 18-­‐month  Experience  Summary   9   Data  Content  of  the  Network     (8  sites)   ~4.7M  children   ~98M  clinical  encounters         ~120M  condiNons     ~68M  procedures     ~100M  medicaNons   ~42M  observaNons   ~428M  labs  and  vital  signs   Prospec3ve  Methods   Communica-on   >  90  ETL  MeeNngs   ETL  Code    95  GitHub  commits/site   ETL   Documenta-on   ~150  GitHub  commits     Post  hoc   Data  Cycle   OMOP  Model   version   Total  Issues   Jan  ‘15   OMOP  v4   117   Mar  ‘15   OMOP  v4   313   May  ‘15   OMOP  v5   440   Jul  ‘15   OMOP  v5   591   Sep  ’15   OMOP  v5   505  
  • 10. 18-­‐month  Experience  Summary   Data  Quality  in  the  Most  Recent  Data  Cycle   10   Most  Frequent  Issues   Missing  Data   31%   EnNty  Outliers   15%   Unexpected  difference   from  previous  ETL  run   9%   Implausible  Dates   9%   Temporal  Outliers   6%   Unexpected  Facts   4%   Domains  with  Most  Issues   MedicaNon   21%   Labs  and  Vital  Signs   13%   ObservaNon   11%   CondiNon   10%   Procedure   9%   Demographic   7%   Encounter   7%  
  • 11. 18-­‐month  Experience  Summary   Causes  of  Data  Quality  Issues   •  ETL  code     –  Programming  bug/mistake     –  Ambiguous  or  incomplete  ETL  specificaNons     –  AdministraNve  reasons     –  Can  be  fixed   •  Source  Data  CharacterisNcs   –  Data  Entry  Error     –  Clinical  or  administraNve  workflow     –  True  anomaly     –  Cannot  be  fixed     11  
  • 12. 18-­‐month  Experience  Summary   Causes  of  Issues  and  Lessons  Learnt   12   u  ETL     u  Data  CharacterisNcs   u  I2b2  transform   u  Non-­‐issue  0   20   40   60   Jan  '15   (OMOP  v4)   Mar  '15   (OMOP  v4)   May  '15   (OMOP  v5)   Jul  '15   (OMOP  v5)   Percentage  of   Issues   Distribu3on  of  Issue  Causes  across  Data  Cycles   •  A  strong  post  hoc  program  is  important  in  assessment  of  data   validity  in  CDRNs   •  The  percentage  of  “ETL”  issues  decreased  across  successive  data  cycles   •  Avg.  22  ETL  issues  /  site.     •  This  study  serves  as  a  data  educaNon  plarorm     •  The  total  number  of  “data  characterisNc”  issues  increased  across  cycles  
  • 13. Future  Work   •  Advanced  data  quality  checks     –  Current  work  largely  based  on  aWribute-­‐level  checks   –  Comparison  with  external  datasets  and  published   findings     •  DeterminaNon  of  research  fitness   –  based  on  the  source  data  characterisNcs  idenNfied   through  post  hoc  assessments   •  AutomaNon     –  The  data  quality  check  scripts  applied  to  other  OMOP-­‐ based  network.     13  
  • 14. Acknowledgments   14   •  PEDSnet  Data   CoordinaNng  Center   –  Aaron  Browne   –  Byron  Ruth   –  EvaneWe  Burrows   –  Jeff  Pennington     –  Kevin  Murphy   –  Lemar  Davidson   –  Lena  Kushleyeva   –  Levon  UNdjian       –  Toan  Ong   •  PEDSnet  ETL  teams  at   various  sites   •  PEDSnet  InformaNcs   Leadership   –  Charles  Bailey     –  Michael  Kahn     –  Keith  Marsolo     –  Christopher  Forrest     •  OHDSI  ConsorNum       •  PaNents  and  Families       •  This  work  was  supported   by  PCORI  Contract   CDRN-­‐1306-­‐01556.