Presented at Epic's Research Advisory Council, April 3, 2014, Verona, WI
See a novel approach to query expansion based on pre-existing structured information within the EHR. Presenters adopted over-representation analysis to find statistically significant associations among the clinical terms extracted from Clarity reports. The study population consisted of over 7,000 patients and their 12 million observations - including labs, medications, phenotypes, diseases, and procedures. See the detailed findings and discuss computational and terminology challenges.
3. Conflict of interest disclosure
Tomasz Adamusiak has no real or apparent conflicts of interest to report
4. Learning objectives
• Recognize the value of structured clinical information
• Identify computational and terminology challenges in big data analytics
• Evaluate how this approach applies to different use cases
5. What is a grouper?
Lists of specific values derived from standard vocabularies, used to define clinical concepts, e.g. patients with diabetes
• SNOMED CT concepts
• ICD-9/10 codes
• EDG terms
• CQM Value Sets
6. Diabetes: Eye Exam
CMS eMeasure: CMS131v2
Value Set Name: Diabetes
Type: Grouping
Steward: National Committee for Quality Assurance
Program: CMS, MU2 EP Update 2013-06-14
… … …
SNOMEDCT 190330002: Diabetes mellitus, juvenile type, with hyperosmolar coma (disorder)
ICD9CM 250: Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled
ICD10CM E10.10: Type 1 diabetes mellitus with ketoacidosis without coma
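To make the grouper idea concrete, here is a minimal Python sketch that treats a grouper as a set of (code system, code) pairs and checks whether any of a patient's coded entries falls inside it. The three members are transcribed from the slide. The `Code` type, the `matches_grouper` helper, and both patient records are hypothetical names and data introduced for illustration only.

```python
from typing import NamedTuple


class Code(NamedTuple):
    system: str  # code system label as printed on the slide, e.g. "ICD9CM"
    value: str   # code within that system


# Three members transcribed from the slide; the real value set has more
# entries (the slide elides them with "…").
DIABETES_GROUPER = frozenset({
    Code("SNOMEDCT", "190330002"),
    Code("ICD9CM", "250"),
    Code("ICD10CM", "E10.10"),
})


def matches_grouper(patient_codes, grouper):
    """Return True if any of the patient's coded entries is in the grouper."""
    return any(code in grouper for code in patient_codes)


# Hypothetical patient records, for illustration only.
patient_a = [Code("ICD10CM", "E10.10"), Code("ICD10CM", "I10")]
patient_b = [Code("ICD10CM", "I10")]

print(matches_grouper(patient_a, DIABETES_GROUPER))  # True
print(matches_grouper(patient_b, DIABETES_GROUPER))  # False
```

Keying on the pair rather than the bare code avoids collisions between vocabularies that happen to reuse the same code string.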
7. Mining associations in EHR data
                                   Diabetes mellitus
                                   Yes     No
Glucohemoglobin measurement   Yes  1509    5442
                              No    881      99
Callouts: Positive association; Background reference
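A 2×2 table like the one on this slide can be tallied directly from per-patient sets of observed concepts. The sketch below shows one way to do that in Python. The function name `contingency_table`, the patient identifiers, and the toy observations are all assumptions made for illustration; they are not the study data and do not reproduce the slide's counts.

```python
def contingency_table(patients, row_concept, col_concept):
    """Count patients by presence or absence of two concepts.

    `patients` maps a patient id to the set of concepts observed for
    that patient. Returns (both, row_only, col_only, neither).
    """
    both = row_only = col_only = neither = 0
    for concepts in patients.values():
        has_row = row_concept in concepts
        has_col = col_concept in concepts
        if has_row and has_col:
            both += 1
        elif has_row:
            row_only += 1
        elif has_col:
            col_only += 1
        else:
            neither += 1
    return both, row_only, col_only, neither


# Made-up observations, for illustration only.
patients = {
    "p1": {"glucohemoglobin measurement", "diabetes mellitus"},
    "p2": {"glucohemoglobin measurement"},
    "p3": {"diabetes mellitus"},
    "p4": {"lamotrigine"},
    "p5": {"glucohemoglobin measurement", "diabetes mellitus", "metformin"},
}

table = contingency_table(
    patients, "glucohemoglobin measurement", "diabetes mellitus"
)
print(table)  # (2, 1, 1, 1)
```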
18. UMLS is ideal for integration of heterogeneous clinical data
• Single entry point to MU terminologies
• Cross-walk between MU terms
• Terminology-agnostic
• Text-mining
21. 6° of terminological Kevin Bacon
Acute myocardial infarction
Myocardial ischemia
Vascular Diseases
Disorder of soft tissue
Collagen Diseases
Connective Tissue Diseases
Epidermal and dermal conditions
Skin and subcutaneous tissue disorders
Dermatologic disorders
22. Expansion limited to MU terminologies and by semantic type
Figure labels: Finding; Disease or Syndrome; Ignore
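Restricting expansion by source terminology and semantic type amounts to a simple filter over candidate concepts, sketched below in Python. The slide does not make clear which semantic types were kept and which were ignored, so `ALLOWED_TYPES` is only a placeholder. The `Concept` class, the `restrict` helper, the source labels, and the candidate entries are likewise assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    name: str
    source: str         # terminology the concept comes from
    semantic_type: str  # semantic type label


def restrict(concepts, allowed_sources, allowed_types):
    """Keep only concepts from an allowed source with an allowed semantic type."""
    return [
        c for c in concepts
        if c.source in allowed_sources and c.semantic_type in allowed_types
    ]


# Source labels as they appear elsewhere in the deck.
ALLOWED_SOURCES = {"ICD-9", "ICD-10", "SNOMED CT", "NDF-RT"}
# Placeholder: which types were kept is not recoverable from the slide.
ALLOWED_TYPES = {"Disease or Syndrome"}

# Hypothetical candidates, for illustration only.
candidates = [
    Concept("Diabetes mellitus", "SNOMED CT", "Disease or Syndrome"),
    Concept("Example finding", "SNOMED CT", "Finding"),
    Concept("Example term", "OTHER", "Disease or Syndrome"),
]

kept = restrict(candidates, ALLOWED_SOURCES, ALLOWED_TYPES)
print([c.name for c in kept])  # ['Diabetes mellitus']
```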
23. Open issue: cycles due to subtle differences in meaning
Figure labels: Immune System; Endocrine System
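Cycles matter because a naive walk up the hierarchy never terminates once it enters one. A common safeguard, shown here as a sketch and not necessarily the presenters' approach, is a breadth-first traversal that remembers which concepts it has already visited. The `ancestors` function and the toy parent map, including its deliberate back-edge, are hypothetical.

```python
from collections import deque


def ancestors(concept, parents):
    """Collect all ancestors of `concept`, tolerating cycles.

    `parents` maps a concept to an iterable of its direct parents.
    A concept may have several parents (multiple inheritance).
    """
    seen = set()
    queue = deque([concept])
    while queue:
        current = queue.popleft()
        for parent in parents.get(current, ()):
            # Skip anything already visited, and never report the
            # starting concept as its own ancestor.
            if parent not in seen and parent != concept:
                seen.add(parent)
                queue.append(parent)
    return seen


# Toy hierarchy with made-up names; "top" -> "mid_2" is a deliberate
# back-edge that closes a cycle.
parents = {
    "leaf": ["mid_1", "mid_2"],
    "mid_1": ["top"],
    "mid_2": ["top", "mid_1"],
    "top": ["mid_2"],
}

print(sorted(ancestors("leaf", parents)))  # ['mid_1', 'mid_2', 'top']
```

The visited set guarantees termination, but it does not decide which edge of a cycle is the wrong one; that remains the open terminology question the slide raises.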
24. Expansion in UMLS across MU sources
Concept: Diabetes mellitus without mention of complication, type II or unspecified type, not stated as uncontrolled
Sources: ICD-9, ICD-10, SNOMED CT, NDF-RT
Roots: Situation with explicit context; Metabolic diseases
25. Statistical methods for establishing over/under-representation
• Serial contingency tables
• Chi-squared test with Bonferroni correction
• Relative risk (RR) estimate of effect size
• Test diabetes in all 18,764 concept pairs
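The methods listed above can be sketched for a single 2×2 table in standard-library Python. This is an illustration under assumptions, not the presenters' code: it uses the Pearson chi-squared statistic without continuity correction (the slide does not say which variant was used), obtains the one-degree-of-freedom p-value from `math.erfc`, and takes the slide's 18,764 as the number of tests for the Bonferroni threshold. The function names and the example counts are made up.

```python
import math

N_TESTS = 18_764  # number of concept pairs tested, from the slide
ALPHA = 0.05      # conventional family-wise error rate (assumed)


def chi_squared_2x2(a, b, c, d):
    """Pearson chi-squared (no continuity correction) for the table
    [[a, b], [c, d]]. Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    statistic = n * (a * d - b * c) ** 2 / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )
    # With 1 degree of freedom, P(X > x) = erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


def relative_risk(a, b, c, d):
    """Risk of the outcome in the exposed row (a, b) divided by the
    risk in the unexposed row (c, d)."""
    return (a / (a + b)) / (c / (c + d))


# Made-up counts: 30 of 100 exposed and 10 of 100 unexposed have the outcome.
stat, p = chi_squared_2x2(30, 70, 10, 90)
rr = relative_risk(30, 70, 10, 90)
threshold = ALPHA / N_TESTS  # Bonferroni-corrected significance level
significant = p < threshold

print(round(stat, 2), round(rr, 2), significant)  # 12.5 3.0 False
```

In this made-up example the association would pass an uncorrected 0.05 threshold but not the corrected one, which is the reason for applying the correction when thousands of pairs are tested.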
26. EHR-based association rule mining
                                              Diabetes mellitus (C0011849)
                                              Yes     No
Glucohemoglobin measurement (C0202054)   Yes  1509    5442
                                         No    881      99
Callouts: Positive association; Background reference
27. Other positive associations
• C0785704 Blood glucose monitoring equipment
• C0935929 Antidiabetics
• C0304870 Insulin, Long-Acting
• C0770893 Metformin hydrochloride
• C0011882 Diabetic Neuropathies
• C0011880 Diabetic Ketoacidosis
• C0011884 Diabetic Retinopathy
Callout: Expansion generalization on class or system level
28. A non-representative control background can bias the findings
Diabetes inversely associated with
• C1314183 Special EEG tests
• C0242953 Barbiturate hypnotic
• C0064636 lamotrigine
• C1719410 Epilepsy and recurrent seizures
29. Open issue: reconciling lab orders with results
Clinical Laboratory
LOINC:17856-6 Hemoglobin A1c/Hemoglobin.total in Blood by HPLC
CPT-4:83036 Hemoglobin; glycosylated (A1C)
30. Challenges
• Availability of correctly and exhaustively coded data
• Expansion with multiple inheritance is memory intensive
• Testing all possible (180M) combinations is computationally expensive
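The slide does not say how the 180M figure was derived. One reading that is roughly consistent with it, offered here only as an assumption, is every unordered pair among about 18,764 concepts (the number on the statistical-methods slide). The Python sketch below computes that count and enumerates pairs lazily so that the full list never has to sit in memory; `concept_pairs` is a hypothetical helper.

```python
import math
from itertools import combinations, islice

N_CONCEPTS = 18_764  # assumed to be the number of distinct concepts

# Number of unordered pairs: n * (n - 1) / 2
n_pairs = math.comb(N_CONCEPTS, 2)
print(n_pairs)  # 176034466


def concept_pairs(concepts):
    """Yield unordered concept pairs one at a time instead of
    materializing all of them."""
    return combinations(concepts, 2)


# Integer ids stand in for concept identifiers in this illustration.
first_three = list(islice(concept_pairs(range(N_CONCEPTS)), 3))
print(first_three)  # [(0, 1), (0, 2), (0, 3)]
```

At roughly 176 million tests, any per-pair cost is multiplied accordingly, and a Bonferroni threshold computed over all pairs would be far stricter than one computed over a single concept's 18,764 pairs.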
32. Thank You!
Tomasz Adamusiak MD PhD
Human and Molecular Genetics Center
Medical College of Wisconsin
tomasz@mcw.edu
@7omasz
For more information
• Next-generation phenotyping using the Unified Medical Language System (UMLS). Adamusiak T, Shimoyama N, Shimoyama M, JMIR Med Inform. doi:10.2196/medinform.3172
• EHR-based phenome wide association study in pancreatic cancer. Adamusiak T, Shimoyama M, AMIA Summits Transl Sci Proc. 2014 (in press)