
Awakening Clinical Data: Semantics for Scalable Medical Research Informatics

Health care data is growing at an explosive rate, driven by highly detailed recordings of physiological processes, high-resolution scanning techniques (e.g. MRI), wireless health monitoring systems, and the migration of traditional patient information to Electronic Medical Record (EMR) systems. The challenges in leveraging these huge data resources and transforming them into knowledge for improving patient care include the size of the datasets, multi-modality, and traditional forms of heterogeneity (syntactic, structural, and semantic). In addition, the US NIH is emphasizing multi-center clinical studies, which increases the complexity of data access, sharing, and integration. In this talk, I explore potential solutions to these challenges that use the semantics of clinical data, both implicit and explicit, together with Semantic Web technologies. I specifically discuss the ontology-driven Physio-MIMI platform for clinical data management in multi-center research studies.

Further Details: http://cci.case.edu/cci/index.php/Satya_Sahoo
Presentation at: Dagstuhl Seminar: Semantic Data Management 2012
Author: Satya S. Sahoo

Published in: Health & Medicine, Business

  1. Awakening Clinical Data: Semantics for Scalable Medical Research Informatics. Satya S. Sahoo, Division of Medical Informatics, Electrical Engineering and Computer Science Department, Case Western Reserve University, Cleveland, OH, USA
  2. Big Picture of Data in Clinical Research
      • 143,961 patients per year (e.g. Emory)
      • MRI: 50-100 MB; PET: 60-100 MB
      • National Sleep Research Resource: 500 TB
      • Case Western EMU: 250 TB; 500-600 MB per patient per stay in the Epilepsy Monitoring Unit (EMU)
      • Wireless health data: ~5.6 billion wireless connections and growing
      • Polysomnograms: 1-20 GB each
      • Figure panels: MRI, PET scans (source: PRISM project, BME dept, CWRU); Patient Reports (source: PRISM project, CWRU); Epilepsy Monitoring Unit (EMU) Data; Wireless Health Data (source: CWRU School of Engineering); Polysomnograms (source: Physio-MIMI, PRISM, CWRU); Pathology Reports, Tissue Bank (source: NLM and Wikipedia)
  3. Big Picture of Data in Clinical Research
      • Ultra large volume of data and growing rapidly
      • Data is multi-modal, heterogeneous
      • Heterogeneity: syntactic, structural, semantic
  4. Scalability in Medical Informatics: Beyond Volume
      • Exemplar: Sleep Medicine Research
      • Figure panels: MRI, PET scans; Patient Reports; Epilepsy Monitoring Unit (EMU) Data; Wireless Health Data; Polysomnograms; Pathology Reports, Tissue Bank
  5. Scalability in Medical Informatics: Beyond Volume
      • Exemplar: Sleep Medicine Research
      • Multi-center studies with differing administrative requirements (business logic)
      • Dynamic data that grows over the project duration
      • Data semantics as the foundation to support a wide spectrum of users: clinicians, nurse practitioners, research fellows
  6. A Wish List for Scalable Clinical Data Management
      • Reconcile data heterogeneity: most critical to successful translational research
          o Syntactic heterogeneity: less of a problem, data dictionaries help
          o Structural heterogeneity: problematic, XML somewhat helpful
          o Semantic heterogeneity: a huge problem, ontologies to the rescue?
      • Provenance: essential for data quality, compliance, insight
          o Blood Oxygen Baseline: oxygen saturation during the first 15 or 30 seconds of sleep
          o A patient's blood report from last month as the cause of a change in medication: domain provenance (not just tuple provenance)
      • Intuitive access to information: clinical trials eligibility, cohort identification
      • Scalable: data sources and research partners added or removed dynamically
  7. A “not to do” list for Clinical Data Management (figure: Linking Open Data cloud diagram, by Richard Cyganiak and Anja Jentzsch)
      • No Linked Open Patient Data: HIPAA, HITECH Act (US), Data Protection Act (UK)
          o De-identified data: IRB approval
      • Ontology as global schema, but no RDF
          o Vast majority of data is in relational databases (RDB)
          o Practical issues with RDF: URIs cannot be institution-specific (privacy)
  8. Physio-MIMI: Multi-Modality, Multi-Resource Environment for Physiological and Clinical Research
      • Diagram labels: Clinical Researcher; Sleep Domain Ontology; SNOMED-CT, FMA, OGMS, …; any number of new centers
  9. Physio-MIMI: Enabling Scalable Medical Research
      • NCRR-funded, multi-CTSA site project: sleep medicine as exemplar
      • Federated data management: scalable, adapts to changing data access policies
      • Ontology-driven:
          o Data mappings: ontology class to data dictionary terms (manually curated)
          o Drive query interface
          o Manage provenance
      • Privacy aware, IRB-compliant
      • Collaboration among Case Western, U. of Michigan, Marshfield Clinic and U. of Wisconsin, Madison
          o Now Harvard Medical School
  10. Key Resource: Sleep Domain Ontology (SDO), https://mimi.case.edu/concepts
  11. Data Mappings: SDO to Data Dictionary
      • Physio-Map Module
          o Visual interface
          o Stores mappings in XML, moving towards rules
          o Dynamically executed in response to user query
      • Screenshot label: User Voting
  12. Provenance: Contextual Metadata for Clinical Research (slide courtesy: Remo Mueller)
  13. Provenance: To Trace Variations in Data and Results (slide courtesy: Remo Mueller)
  14. Modified from slide courtesy: Remo Mueller
  15. Provenance: Source Information for Patient Data (slide courtesy: Remo Mueller)
  16. Intuitive Query Interface: Ontology (SDO)-driven Visual Aggregator and Explorer (VisAgE)
      • Screenshot labels: DataSets; Ontology Concept – Type of Query Widget
  17. Physio-MIMI in National Sleep Research Resource
      • National Sleep Research Resource (NSRR): scored and awaiting funding review
      • Collaboration between Harvard Medical School (domain experts) and Case Western (CS) with 15 projects
          o 50,000 sleep research studies, total size of 500 TB
      • Semantic data integration: SDO and Sleep Provenance Ontology (extending the W3C PROV Ontology, PROV-O)
      • Signal processing tools: using a common format called European Data Format (EDF), XML-based
      • Domain analysis, cross-linking: secure Web access
  18. Challenges: Semantics in Large Scale Clinical Data
      • Incentives for adopting RDF in clinical data management: what is not already possible in RDB?
      • OWL2, RDFS reasoning: privacy-aware reasoning, semantics-aware access control (Nguyen et al. 2012)
      • Missing semantics?
          o Variable, missing provenance in original study: re-create provenance with (limited) provenance?
          o Fine-level granularity for semantic annotation of signal data: currently not scalable
      • A little semantics does not go too far in clinical data
          o Need for greater involvement of the Semantic Web community in development of EHR systems
  19. Acknowledgements
      • Guo-Qiang Zhang, Remo Mueller, Samden Lhatoo, Susan Redline, Alireza Bozorgi
      • Division of Medical Informatics: Lingyun Luo, Joe Teagno, Meng Zhao, Jake Luo, Licong Cui, Chien-Hung Chen, Catherine Jayapandian
      • Physio-MIMI Team: http://physiomimi.case.edu/
      • Contact Information: satya.sahoo@case.edu, http://cci.case.edu/cci/index.php/Satya_Sahoo
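
Slides 9 and 11 describe manually curated mappings from ontology classes to each site's data dictionary terms, executed dynamically in response to a user query. The Python sketch below illustrates that idea only; it is not the Physio-Map implementation. The concept names, site identifiers, table and column names, and the `translate` function are all hypothetical, and the slides say the real mappings are stored in XML rather than in code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Mapping:
    """Links one ontology concept to one site's data dictionary term."""
    concept: str  # ontology class label (hypothetical)
    site: str     # participating center (hypothetical identifier)
    table: str    # local table name (hypothetical)
    column: str   # local data dictionary term (hypothetical)


# Hypothetical, manually curated mappings standing in for the stored ones.
MAPPINGS = [
    Mapping("ApneaHypopneaIndex", "site_a", "psg_summary", "ahi"),
    Mapping("ApneaHypopneaIndex", "site_b", "sleep_study", "apnea_hyp_idx"),
    Mapping("BodyMassIndex", "site_a", "demographics", "bmi"),
]

ALLOWED_OPS = {"<", "<=", "=", ">=", ">"}


def translate(concept: str, op: str, value: float) -> dict[str, str]:
    """Rewrite one concept-level constraint into one count query per site.

    Sites with no mapping for the concept are simply left out, so a
    center can be added or removed by editing MAPPINGS alone.
    """
    if op not in ALLOWED_OPS:
        raise ValueError(f"unsupported operator: {op}")
    threshold = float(value)  # rejects non-numeric input before building SQL
    queries = {}
    for m in MAPPINGS:
        if m.concept == concept:
            # String building is for illustration; a real system would use
            # parameterized queries against each site's database.
            queries[m.site] = (
                f"SELECT COUNT(*) FROM {m.table} WHERE {m.column} {op} {threshold}"
            )
    return queries
```

Calling `translate("ApneaHypopneaIndex", ">=", 15)` yields one query per mapped site, each phrased in that site's own vocabulary, while the researcher only ever names the ontology concept. Returning counts rather than rows is a choice made here to keep the example close to cohort identification; the slides do not specify what the federated queries return.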
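
Slide 6 notes that a "Blood Oxygen Baseline" may mean oxygen saturation during the first 15 or the first 30 seconds of sleep, which is why provenance matters when values from different studies are compared. The following Python sketch shows one way such contextual metadata could travel with a derived value. It rests on several assumptions not stated in the slides: the signal is taken to start at sleep onset, the baseline is taken to be the mean over the window, and the `DerivedValue` record and its field names are hypothetical rather than drawn from the Sleep Provenance Ontology.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class DerivedValue:
    """A computed variable together with the context that produced it."""
    name: str
    value: float
    provenance: dict  # hypothetical keys: site, window_s, method, samples


def oxygen_baseline(spo2, sample_rate_hz, window_s, site):
    """Average SpO2 over the first `window_s` seconds of the signal.

    Assumes `spo2` begins at sleep onset and that "baseline" means the mean.
    """
    n = int(sample_rate_hz * window_s)
    if n <= 0 or n > len(spo2):
        raise ValueError("window does not fit the recording")
    return DerivedValue(
        name="BloodOxygenBaseline",
        value=mean(spo2[:n]),
        provenance={
            "site": site,
            "window_s": window_s,
            "method": "mean",
            "samples": n,
        },
    )


def comparable(a, b):
    """Two values are comparable only if they were derived the same way."""
    keys = ("window_s", "method")
    return a.name == b.name and all(
        a.provenance[k] == b.provenance[k] for k in keys
    )
```

With a 1 Hz signal whose saturation drops after the first 15 seconds, the 15-second and 30-second baselines differ even though both carry the same variable name; `comparable` flags that mismatch, whereas two sites that both used a 15-second window pass the check.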
