2015 Annual Conference
HDWA 2015 – Grand Rapids, Michigan October 13 – 15
UNLOCKING THE POWER
of DATA to TRANSFORM HEALTHCARE
Sponsored by Spectrum Health
Photos courtesy of ExperienceGR.com and Pure Michigan
Structuring EMR Data For Analytics:
Engineering Features from Repeated Clinical Measurements
Brandon Stange
Data Scientist, Trinity Health
HDWA 2015
Grand Rapids, Michigan
Sponsored by Spectrum Health Photos courtesy of ExperienceGR.com
Agenda
• General Data Formatting for Analytics
• Transforming Repeated Clinical Measurements
– Standardize length of time-series
– Cluster common trends together
• Scaling and other options
• Questions
Institution Profile
Data Governance and Research
• 13 members (4 managers)
• Data scientists (3), data governance analysts (4), clinical/business intelligence analysts (6)
• Dozens of data sources
• 30k daily reports from Unified Data Warehouse
Traditional BI vs. “Data Science”
OpenMRS.org
Relational Database
Flat, Tabular Data
EncounterID  DRG  LOS  BloodGlucoseResult  CreatinineResult
1            870  8    80                  2.3
2            281  3    170                 0.8
3            313  5    100                 0.6
How can we flatten repeated measurements?
Transactional
EncounterID  ResultTime      ResultValue
1            11/18/14 8:00   1.7
1            11/19/14 10:30  2.8
1            11/20/14 8:25   1.1
2            2/26/15 19:15   0.8
3            9/14/15 11:27   0.56
3            9/15/15 7:40    0.51
3            9/16/15 7:27    0.52
3            9/17/15 9:38    0.54
3            9/18/15 8:15    0.59
3            9/19/15 9:20    0.51
Flat, Tabular Data
• Avg(x)
• Max(x)
• Last(x)
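A minimal sketch of the single-value summaries listed above (Avg, Max, Last), using the rows from the transactional table. The function name `flatten` and the output keys are illustrative and not from the talk; plain Python is used so that nothing beyond the standard library is assumed.

```python
from collections import defaultdict
from datetime import datetime

# Rows copied from the transactional table: (EncounterID, ResultTime, ResultValue)
rows = [
    (1, "11/18/14 8:00", 1.7),
    (1, "11/19/14 10:30", 2.8),
    (1, "11/20/14 8:25", 1.1),
    (2, "2/26/15 19:15", 0.8),
    (3, "9/14/15 11:27", 0.56),
    (3, "9/15/15 7:40", 0.51),
    (3, "9/16/15 7:27", 0.52),
    (3, "9/17/15 9:38", 0.54),
    (3, "9/18/15 8:15", 0.59),
    (3, "9/19/15 9:20", 0.51),
]


def flatten(rows):
    """Collapse repeated results into one summary row per encounter."""
    by_encounter = defaultdict(list)
    for encounter_id, result_time, value in rows:
        ts = datetime.strptime(result_time, "%m/%d/%y %H:%M")
        by_encounter[encounter_id].append((ts, value))

    flat = {}
    for encounter_id, results in by_encounter.items():
        results.sort(key=lambda r: r[0])  # chronological, so "last" is well defined
        values = [v for _, v in results]
        flat[encounter_id] = {
            "avg": sum(values) / len(values),
            "max": max(values),
            "last": values[-1],
        }
    return flat


flat = flatten(rows)
for encounter_id, summary in sorted(flat.items()):
    print(encounter_id, summary)
```

Each summary keeps one number per encounter and discards the shape of the series, which is the gap the next slides address.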
Another Approach
EncounterID  R1    R2    R3    R4
1            0.8   0.7   0.7
2            0.67
3            0.5   0.61  0.62  0.62
4            0.6   0.8   0.7
EncounterID  S1    S2    S3    S4
1            0.8   0.73  0.7   0.7
2            0.67  0.67  0.67  0.67
3            0.5   0.61  0.62  0.62
4            0.6   0.73  0.77  0.7
Standardize to length m
• Choose a value to meet needs (3-5 seems to work well)
• For long series, smooth to m
• For short series, impute to m
Standardize the Length of Jagged Time-Series
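The slides do not say which smoothing or imputation method was used. As one possible sketch, linear interpolation over measurement position reproduces the S1..S4 values shown for these four encounters. The function name `standardize_length` is hypothetical, and result timestamps are ignored here, which is an assumption.

```python
def standardize_length(values, m=4):
    """Resample a series to exactly m points by linear interpolation over position."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one measurement")
    if m < 2:
        raise ValueError("m must be at least 2")
    if n == 1:
        return [values[0]] * m  # a single result is carried across all m slots

    out = []
    for j in range(m):
        pos = j * (n - 1) / (m - 1)  # fractional index into the original series
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[hi] * frac)
    return out


# Raw series R1..R4 from the slide (missing trailing cells simply omitted)
raw = {
    1: [0.8, 0.7, 0.7],
    2: [0.67],
    3: [0.5, 0.61, 0.62, 0.62],
    4: [0.6, 0.8, 0.7],
}

standardized = {
    enc: [round(v, 2) for v in standardize_length(series, m=4)]
    for enc, series in raw.items()
}
for enc, series in standardized.items():
    print(enc, series)
```

For series much longer than m, this samples along the interpolated line rather than truly smoothing; averaging within m equal bins would be one alternative.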
Cluster Patients with Similar Trends
Clustering Method
• K-Means (efficient)
• K-Medoids
• Choosing an appropriate k
Scale each encounter to its mean
• Log difference
• Allows clusters to represent trends
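A rough sketch of the scaling and clustering steps. It assumes "log difference" means log(value) minus log(encounter mean), which requires positive values; the slides do not spell this out. The k-means below is plain Lloyd's algorithm with a simple farthest-first seeding, chosen only to keep the example deterministic. The input series are made up for illustration.

```python
import math


def log_scale(series):
    """Log difference of each value from the encounter's own mean (assumed definition)."""
    mean = sum(series) / len(series)
    return [math.log(v) - math.log(mean) for v in series]


def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iters=50):
    """Lloyd's algorithm with farthest-first seeding; returns (centers, labels)."""
    centers = [list(points[0])]
    while len(centers) < k:
        far = max(points, key=lambda p: min(sq_dist(p, c) for c in centers))
        centers.append(list(far))

    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c])) for p in points]
        # Move each center to the mean of its members
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                new_centers.append([sum(col) / len(members) for col in zip(*members)])
            else:
                new_centers.append(centers[c])  # leave an empty cluster where it was
        if new_centers == centers:
            break
        centers = new_centers
    return centers, labels


# Hypothetical standardized series: two rising, two falling, at very different levels
series = [
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [4, 3, 2, 1],
    [40, 30, 20, 10],
]
points = [log_scale(s) for s in series]
centers, labels = kmeans(points, k=2)
print(labels)
```

After scaling, the series at level 1 and level 10 become nearly identical, so the clusters reflect the direction of the trend rather than the magnitude of the measurement.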
Specific Clusters
Additional Considerations
• Store cluster centers for easy reference
• If scaled well, many types of measurements can be clustered together
• Allows for standard “Common Trends”
• The trends can be named more descriptively
• Can be used in other areas
• Tremendous importance in Population Health
• Clinical text (rounding observations, disease progression)
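To illustrate storing cluster centers as named "common trends", the sketch below keeps a small lookup of centers and assigns any scaled series to the nearest one. The trend names and center values are invented for the example, and the log-difference scaling is the same assumed definition (log of value minus log of encounter mean).

```python
import math

# Hypothetical stored centers on the log-difference scale, length m = 4
COMMON_TRENDS = {
    "stable": [0.0, 0.0, 0.0, 0.0],
    "rising": [-0.3, -0.1, 0.1, 0.3],
    "falling": [0.3, 0.1, -0.1, -0.3],
}


def log_scale(series):
    """Log difference of each value from the encounter's own mean (assumed definition)."""
    mean = sum(series) / len(series)
    return [math.log(v) - math.log(mean) for v in series]


def nearest_trend(series, centers=COMMON_TRENDS):
    """Return the name of the stored center closest to the scaled series."""
    scaled = log_scale(series)
    return min(
        centers,
        key=lambda name: sum((s - c) ** 2 for s, c in zip(scaled, centers[name])),
    )


# Illustrative values only: a glucose-like and a creatinine-like series
print(nearest_trend([80, 100, 130, 160]))    # glucose-like magnitudes
print(nearest_trend([0.8, 1.0, 1.3, 1.6]))   # creatinine-like magnitudes
print(nearest_trend([0.67, 0.67, 0.67, 0.67]))
```

Because both series are scaled to their own means, measurements with very different units can share one set of stored trends.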
New Data Structure
Storing as a flat table is an option
EncounterID  CreatMean  CreatClus  GlucMean  GlucClus
1            0.55       C3         93        C1
2            0.91       C1         156       C4
3            0.76       C5         72        C2
Storing in a “long” format is ideal for some applications
EncounterID  Measure    Value
1            CreatMean  0.55
1            CreatClus  C3
1            GlucMean   93
1            GlucClus   C1
2            CreatMean  0.91
2            CreatClus  C1
2            GlucMean   156
2            GlucClus   C4
3            CreatMean  0.76
3            CreatClus  C5
3            GlucMean   72
3            GlucClus   C2
• Easily pivot in R, Python
• Flexible model
• Converts easily to sparse matrix
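A small sketch of the two conversions mentioned above, using the long-format rows from the slide. The helper names `pivot_wide` and `to_sparse` are hypothetical, and the sparse form shown (a dictionary keyed by row and column index, with cluster labels turned into indicator columns) is just one possible encoding.

```python
# Long-format rows from the slide: (EncounterID, Measure, Value)
long_rows = [
    (1, "CreatMean", 0.55), (1, "CreatClus", "C3"),
    (1, "GlucMean", 93), (1, "GlucClus", "C1"),
    (2, "CreatMean", 0.91), (2, "CreatClus", "C1"),
    (2, "GlucMean", 156), (2, "GlucClus", "C4"),
    (3, "CreatMean", 0.76), (3, "CreatClus", "C5"),
    (3, "GlucMean", 72), (3, "GlucClus", "C2"),
]


def pivot_wide(rows):
    """Pivot long rows into one dictionary of measures per encounter."""
    wide = {}
    for encounter_id, measure, value in rows:
        wide.setdefault(encounter_id, {})[measure] = value
    return wide


def to_sparse(rows):
    """Encode long rows as {(row_index, column_index): value}."""
    row_index, col_index, cells = {}, {}, {}
    for encounter_id, measure, value in rows:
        if isinstance(value, str):
            # Cluster labels become indicator columns such as "CreatClus=C3"
            column, cell = f"{measure}={value}", 1.0
        else:
            column, cell = measure, float(value)
        i = row_index.setdefault(encounter_id, len(row_index))
        j = col_index.setdefault(column, len(col_index))
        cells[(i, j)] = cell
    return cells, row_index, col_index


wide = pivot_wide(long_rows)
cells, row_index, col_index = to_sparse(long_rows)
print(wide[1])
print(len(row_index), "rows x", len(col_index), "columns,", len(cells), "stored cells")
```

New measures only add rows to the long table, which is why the long format stays flexible as features are added.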
Alternative Approaches
• Single Value (mean, max, last, etc.)
• Generative models
• Constant + linear trend + quadratic
• Principal Components or other Matrix Decomposition
• Other Kernel Methods
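One reading of the "Constant + linear trend + quadratic" bullet is to fit a quadratic to each encounter's series and use the three coefficients as features. The sketch below does this with ordinary least squares via the normal equations; placing the time axis evenly on [-1, 1] and the function names are assumptions made for the example.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    n = 3
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def quadratic_features(values):
    """Least-squares fit of y = c0 + c1*t + c2*t**2, with t spread evenly over [-1, 1]."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three measurements for a quadratic fit")
    ts = [-1 + 2 * i / (n - 1) for i in range(n)]
    basis = [[1.0, t, t * t] for t in ts]
    # Normal equations: (A^T A) c = A^T y
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)] for i in range(3)]
    aty = [sum(row[i] * y for row, y in zip(basis, values)) for i in range(3)]
    return solve3(ata, aty)


# Synthetic check: values generated from y = 1 + 2t + 3t^2 at t = -1, -0.5, 0, 0.5, 1
coeffs = quadratic_features([2.0, 0.75, 1.0, 2.75, 6.0])
print([round(c, 6) for c in coeffs])
```

The three coefficients give every encounter the same number of features regardless of how many results it has, as long as there are at least three.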
Summary
• The needs of advanced analytics are different
than those of traditional BI
• A variety of methods exist for flattening
complex health care data
• Applying simpler methods can allow for rapid model generation while maintaining interpretability
Presenter Contact Information
Brandon Stange
Brandon.Stange@trinity-health.org
