Chicago Health Atlas: The Promise, Process, and Problems of using electronic health record data for population health
 

    Presentation Transcript

    • Chicago Health Atlas: The Promise, Process, and Problems of using electronic health record data for population health
        April 4, 2013
        Abel Kho, MD, Northwestern University
        Roderick (Eric) Jones, MPH, Chicago Dept. Public Health
    • Session Preview
        • What is the Chicago Health Atlas?
        • The Promise: Contextual factors that play a role in the collaboration
        • The Process: Getting started, developing matching algorithms, minimizing reidentification risk
        • The Problems: Deriving meaning and delivering it to people who can use it
    • Chicago Health Atlas is a . . . collaboration
        • Informatics researchers from multiple healthcare institutions
        • Chicago Regional Extension Center (CHITREC)
        • Chicago Community Trust
        • Chicago Department of Public Health
    • Chicago Health Atlas is a . . . website
    • Chicago Health Atlas is a . . . database
        • De-identified electronic health record data for ~1 million Chicagoans
        • In-patient and out-patient visits spanning 2006-2011
        • Individual patient records matched across institutions
    • Chicago Context: Person, Place, Time
    • Chicago: Person, Place, Time
        Group | Percent change, 2000-2010 | Percent of total in 2010
        Chicago | 7 | [2.7 million]
        Non-Hispanic black | 17 | 32
        Non-Hispanic white | 6 | 32
        Hispanic | 3 | 29
        Non-Hispanic Asian | 14 | 5
    • Chicago: Person, Place, Time
        • 229 square miles
        • 77 neighborhood “Community areas” with population median of 31,000 (range, 3,000 – 99,000)
        • All but two community areas have larger populations than the least-populated Illinois county
        [Map labels: Lake Michigan, O’Hare, Loop, Midway, Suburban Cook County]
        [Stem-and-leaf plot and boxplot of community area populations; multiply Stem.Leaf by 10**+4]
    • Chicago Context:
    • Healthy Chicago sets goals for . . .
        • Public policy and legislation (n=56)
        • Health education and awareness (n=45)
        • Interventions and programs (n=92)
    • HEALTHY CHICAGO
        Chicago Department of Public Health
        Infrastructure
    • Highlights Infrastructure
        • Establish an Office of Epidemiology and Public Health Informatics
        • Expand epidemiology capacity through an increase in staff and the development of strategic partnerships with other entities that use or collect public health data
    • NYC Macroscope Scientific Advisory Group
        • New York City has embarked on a study to validate population health estimates from its Primary Care Information Project
        • CDPH involvement has led to collaboration on developing a vision and methodology for more widespread use of EHR data for public health
    • Highlights Infrastructure
        • Increase the availability of public health data through the City of Chicago website
    • Chicago Context: Health Information Exchange
    • Illinois Regional Health Information Exchanges
    • Even if we don’t have a mature HIE or a Regenstrief Institute, is it possible to . . .
        • Leverage existing EHR data
        • Weave together data from multiple institutions with publicly available data
        • Measure disease burden and care delivered?
    • Design Considerations
        • Limit sharing of any protected health information
        • Yet account for care of the same patient at multiple institutions
        • Protect anonymity of patients/providers/institutions
        • Enable linkage to new information and sources as they become available
            – Patient level
            – Geographic location
    • Process – getting started
        • Coordinated IRB approval across multiple institutions.
            – Constrained to adults aged 18-89
            – Limited to structured data, no free text
            – Focus on 606xx zip codes, with known overlapping care institutions and high population density
        • Instead of an EMPI, create a lightweight software application to pass identifiers through a standard set of preprocessing steps, and then “hash” the data
    • Process: Hashing and Matching Methods
    • How we “Hashed” our Data
        - Hash algorithms accept variable-size input messages and produce a small fixed-size output called a hash value or message digest
        - The hash is effectively non-degenerate; in practice only 1 input message per final hash value (collisions are possible in principle but very hard to find)
        - The hash is 1-way; easy to go from message to hash value, very hard to go from hash value to message.
        - We initially used an early hash, Secure Hash Algorithm-1 (SHA-1).
        http://csrc.nist.gov/publications/nistbul/b-May-2008.pdf
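The hash properties listed on this slide can be illustrated with a minimal Python sketch. It is not the project's code (the slides say the production tool was a Java program); the function name `hash_identifiers` and the sample record are hypothetical.

```python
import hashlib

def hash_identifiers(fields, algorithm="sha1"):
    """Concatenate identifier fields and return an upper-case hex digest.

    Hypothetical helper: the exact concatenation rules used by the
    Atlas team are not given in the slides.
    """
    message = "".join(fields).encode("utf-8")
    return hashlib.new(algorithm, message).hexdigest().upper()

# Illustrative record only; the field formats are assumptions.
record = ["William", "Galanter", "19620331", "M", "123456789"]
digest = hash_identifiers(record)

# Fixed-size output: a SHA-1 hex digest is always 40 characters,
# however long or short the input message is.
print(len(digest), len(hash_identifiers(["a"])))

# Deterministic: the same input always yields the same hash value,
# which is what makes cross-institution matching possible.
print(digest == hash_identifiers(record))
```

One design point worth noting: joining fields with no delimiter means that different field splits of the same characters would hash identically, so a real implementation might insert a separator between fields.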
    • Preliminary SHA-1 Single Institution Validation
        5-Variable Hash: William | Galanter | 3/31/1962 | M | SSN → Concatenate → WilliamGalanter22732M123456789 → SHA1 → 20802322ED366A1EFD562A6219C4D7AF993BADAD
        4-Variable Hash: William | Galanter | 3/31/1962 | M → Concatenate & SHA1 → 12345678901234567890123456789012345
    • Updated Hash Method
        • SHA-1 was found to have a potential security issue, so we moved to a second-generation hash, SHA-512* (512 bit)
        • Significant focus on data pre-processing / normalization
        • Trimming spaces and non A-Z characters, lower case: “_Jimmy__ O’Brien Jr.” → “jimmy”, “obrien”
        • Remove “-” from SSN and remove all invalid combinations
        • Only allow birth year > 1921
        • Use “F” and “M” for sex
        • Replace missing elements with missing data indicators
        *http://csrc.nist.gov/groups/ST/toolkit/secure_hashing.html
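The normalization rules on this slide might look roughly like the following Python sketch. Several details are assumptions and not taken from the slides: the `MISSING` indicator value, the list of name suffixes to drop, the SSN validity check (9 digits, not one repeated digit), and the accepted date format.

```python
import re
from datetime import datetime

MISSING = "MISSING"                          # assumed missing-data indicator
SUFFIXES = {"jr", "sr", "ii", "iii", "iv"}   # assumed suffix list

def normalize_name(raw):
    """Lower-case, keep only a-z, drop assumed suffixes, join the rest."""
    cleaned = re.sub(r"[^a-z ]", "", (raw or "").lower())
    tokens = [t for t in cleaned.split() if t not in SUFFIXES]
    return "".join(tokens) or MISSING

def normalize_ssn(raw):
    """Remove dashes; the validity rule here is only a placeholder."""
    digits = (raw or "").replace("-", "").strip()
    if re.fullmatch(r"\d{9}", digits) and len(set(digits)) > 1:
        return digits
    return MISSING

def normalize_dob(raw):
    """Accept m/d/yyyy and emit MMDDYYYY; birth year must be > 1921."""
    try:
        parsed = datetime.strptime((raw or "").strip(), "%m/%d/%Y")
    except ValueError:
        return MISSING
    return parsed.strftime("%m%d%Y") if parsed.year > 1921 else MISSING

def normalize_sex(raw):
    """Map inputs such as 'male' or 'f' to 'M' or 'F'."""
    first = (raw or "").strip()[:1].upper()
    return first if first in {"F", "M"} else MISSING

print(normalize_name("_Jimmy__"), normalize_name("O’Brien Jr."))
print(normalize_ssn("987-65-4329"), normalize_dob("6/12/1970"), normalize_sex("male"))
```

Applying the same deterministic steps at every site matters more than the specific rules chosen, because any difference in preprocessing changes the hash and silently breaks a match.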
    • Updated Hash Method (cont.)
        • Creates 5 hash IDs (with probability weights) depending on availability of last name, first name, date of birth (DOB), gender, SSN.
            – All data available (1.0)
            – All fields except one: no DOB, or no first and last name, or no SSN (0.3)
            – All fields, but only first three letters of names available (0.1)
            – SOUNDEX codes (phonetic equivalents) of the first and last name plus date of birth and gender (0.1)
        • Wrapped up into a standalone Java program
        • Can readily consume other data sources (e.g. Social Security Death Index Tables)
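A rough Python sketch of weighted hash IDs built from different field subsets follows. The slides do not spell out the exact composition of each of the 5 IDs, so the variant labels, the `|` separator, and the decision to leave out the SOUNDEX variant are all assumptions made for illustration.

```python
import hashlib

def sha512_hex(*parts):
    """SHA-512 hex digest of the parts joined with an assumed '|' separator."""
    return hashlib.sha512("|".join(parts).encode("utf-8")).hexdigest()

def build_hash_ids(first, last, dob, sex, ssn):
    """Return {variant: (weight, hash)} for already-normalized fields.

    Weights follow the slide (1.0, 0.3, 0.1); the variant definitions are
    guesses, and the SOUNDEX-based ID is omitted for brevity.
    """
    variants = [
        ("all_fields", 1.0, (first, last, dob, sex, ssn)),
        ("no_dob", 0.3, (first, last, sex, ssn)),
        ("no_name", 0.3, (dob, sex, ssn)),
        ("no_ssn", 0.3, (first, last, dob, sex)),
        ("name_prefix3", 0.1, (first[:3], last[:3], dob, sex, ssn)),
    ]
    # The label is hashed too, so IDs from different variants never collide.
    return {label: (weight, sha512_hex(label, *parts))
            for label, weight, parts in variants}

a = build_hash_ids("john", "odwyer", "06121970", "M", "987654329")
b = build_hash_ids("john", "odwyer", "06121970", "M", "MISSING")

# Records that differ only in SSN still agree on the lower-weight "no_ssn" ID.
print(a["all_fields"][1] == b["all_fields"][1], a["no_ssn"][1] == b["no_ssn"][1])
```

The weights let a downstream matcher prefer agreement on the full-field ID and treat agreement on a reduced ID as weaker evidence.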
    • [Diagram: matching one patient across institutions]
        Institution A, Diabetes (250.xx): John | O’Dwyer | 6/12/1970 | 987-65-4329 | M → Pre-Process → john | odwyer | 06121970 | 987654329 | m → Hash Fxn → Hash ID-1 through Hash ID-5
        Institution B, HTN (401.xx): John | O dwyer | 6/12/70 | male → Pre-Process → john | odwyer | 06121970 | m → Hash Fxn → Hash ID-1 through Hash ID-5
        Institution C / Honest Broker: Matched HashIDs → Replace HashIDs with Unique StudyID → StudyID with 250.xx and 401.xx
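The honest-broker step in this diagram can be sketched as follows, under the assumption that two records belong to the same patient when they share at least one hash ID. The function name, the study ID format, and the greedy first-match rule are hypothetical; the sketch does not use the probability weights and does not merge groups that only become linked by a later record.

```python
from itertools import count

def assign_study_ids(submissions):
    """Map (institution, local_id) to a study ID.

    submissions: list of (institution, local_id, [hash_ids]) tuples.
    Greedy sketch: a record reuses the study ID of the first hash ID
    already seen; otherwise it gets a new one.
    """
    hash_to_study = {}
    assignments = {}
    counter = count(1)
    for institution, local_id, hash_ids in submissions:
        study_id = next(
            (hash_to_study[h] for h in hash_ids if h in hash_to_study), None
        )
        if study_id is None:
            study_id = f"S{next(counter):06d}"
        for h in hash_ids:
            hash_to_study.setdefault(h, study_id)
        assignments[(institution, local_id)] = study_id
    return assignments

submissions = [
    ("A", "a-17", ["h1", "h2", "h3"]),   # diabetes record at Institution A
    ("B", "b-42", ["h9", "h3"]),         # hypertension record, shares h3
    ("B", "b-43", ["h7", "h8"]),         # unrelated patient
]
ids = assign_study_ids(submissions)
print(ids[("A", "a-17")] == ids[("B", "b-42")], ids[("B", "b-43")])
```

Because only hash IDs and study IDs pass through the broker, diagnoses from both institutions can be attached to one study ID without any site sharing names, birth dates, or SSNs.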
    • Data Dictionary
        • Standardized data specifications for data extractions from participating sites
            – Demographics
            – Vital signs
            – Diagnoses
                • Study ID | Month/Year | Encounter type | Encounter number | Diagnosis code
            – Medications
            – Laboratory tests
                • Study ID | Month/Year | Lab test name | Result | Units | Normal Range | Specimen type
    • Process: Privacy and Re-Identification Considerations
        Courtesy of Brad Malin, Vanderbilt University
    • De-Identified Health Information
        De-identified health information neither identifies nor provides a reasonable basis to identify an individual. There are two ways to de-identify information; either:
        (1) a formal determination by a qualified statistician;
        (2) the removal of specified identifiers of the individual and of the individual’s relatives, household members, and employers is required, and is adequate only if the covered entity has no actual knowledge that the remaining information could be used to identify the individual.
    • HIPAA Expert Determination (abridged)
        Certify via “generally accepted statistical and scientific principles & methods, that the risk is very small that the information could be used, alone or in combination with other reasonably available information, by the anticipated recipient to identify the subject of the information.”
    • Uniqueness Analysis
        Model | Uniques (%) | Uniques (People)
        Safe Harbor | 0.000064% | 13
        Chicago Health Atlas | 0.3% | 8,050
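A uniqueness analysis of this kind counts how many people are the only ones with their particular combination of quasi-identifiers. The sketch below shows the counting step only; the field names and toy records are invented, and it is not the method used to produce the figures in the table.

```python
from collections import Counter

def count_uniques(records, quasi_identifiers):
    """Return (number, percent) of records unique on the given fields."""
    keys = [tuple(r[q] for q in quasi_identifiers) for r in records]
    frequency = Counter(keys)
    uniques = sum(1 for k in keys if frequency[k] == 1)
    return uniques, 100.0 * uniques / len(records)

# Toy population; fields are assumptions for illustration.
population = [
    {"zip": "60601", "age_group": "30-39", "sex": "F", "ethnicity": "Hispanic"},
    {"zip": "60601", "age_group": "30-39", "sex": "F", "ethnicity": "Hispanic"},
    {"zip": "60601", "age_group": "80-89", "sex": "M", "ethnicity": "Asian"},
    {"zip": "60637", "age_group": "30-39", "sex": "F", "ethnicity": "Hispanic"},
    {"zip": "60637", "age_group": "50-59", "sex": "M", "ethnicity": "Black"},
]

fields = ["zip", "age_group", "sex", "ethnicity"]
n_unique, pct_unique = count_uniques(population, fields)
print(n_unique, pct_unique)
```

The more fields a release policy keeps, and the finer their categories, the more combinations occur only once, which is why a model that retains more detail than Safe Harbor shows more uniques.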
    • Completing the Re-identification Requires Resources
        • Could link to registries
            – Birth
            – Marriage
            – Death
            – Divorce
        • What’s in vogue? Voter registration DBs
        [Diagram labels: Safe Harbored Records; Chicago Health Atlas Model; Identified Clinical Records; Identified Population Records; Identified Resource]
        Benitez & Malin. JAMIA. 2010.
    • Risk will Vary Across Regions: Voter Registration Databases
        State | IL | MN | TN | WA | WI
        WHO | Registered Political Committees (ANYONE – In Person) | MN Voters | Anyone | Anyone | Anyone
        Format | Disk | Disk | Disk | Disk | Disk
        Cost | $500 | $46; “use ONLY for elections, political activities, or law enforcement” | $2500 | $30 | $12,500
        Fields available: Name (5 of 5 states), Address (5 of 5), Date of Birth (4 of 5), Sex (3 of 5), Race (1 of 5)
        Benitez & Malin. JAMIA. 2010.
    • Uniqueness Analysis
        Model | Uniques (%) | Uniques (People)
        Safe Harbor | 0.000064% | 13
        Chicago Health Atlas | 0.3% | 8,050
        Safe Harbor, Linked to Voter Registration | Really small | 0
        Chicago Health Atlas, Linked to Voter Reg | 0.004% | 80
    • Next Steps
        • Consider re-identification risk options
            – Coarsen ZIP codes
            – Coarsen Ethnicities
            – Coarsen Age groups
        • Search* for tradeoffs between information utility (e.g., epidemiologic findings) and privacy (i.e., re-identification risk)
        *Benitez & Malin. JAMIA. 2011.
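The effect of coarsening on re-identification risk can be shown with a small Python sketch: generalize ZIP codes and ages, then recount the records that are unique. The 3-digit ZIP prefix and 10-year age bands are illustrative choices, not the project's, and the records are made up.

```python
from collections import Counter

def unique_count(keys):
    """Number of keys that occur exactly once."""
    frequency = Counter(keys)
    return sum(1 for k in keys if frequency[k] == 1)

def coarsen(record):
    """Generalize a record: 3-digit ZIP prefix and a 10-year age band."""
    low = record["age"] // 10 * 10
    return (record["zip"][:3], f"{low}-{low + 9}")

# Toy records for illustration only.
records = [
    {"zip": "60601", "age": 34},
    {"zip": "60602", "age": 37},
    {"zip": "60611", "age": 52},
    {"zip": "60614", "age": 58},
    {"zip": "60637", "age": 71},
]

before = unique_count([(r["zip"], r["age"]) for r in records])
after = unique_count([coarsen(r) for r in records])
print(before, after)
```

The tradeoff mentioned on the slide is visible even here: after coarsening, fewer records stand alone, but every ZIP-level difference has been erased, so any map by ZIP code would no longer be possible.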
    • Findings: A promising source of prevalence estimates
    • Data contribution summary, April 2013
        Data Type | Institution 1 | 2 | 3 | 4 | 5 | 6
        Demographics | C | C | C | C | C | PC
        Diagnoses | C | C | C | C | C | PC
        Visit type | C | C | C | C | C | PC
        BMI, BP | C | PP | N | N | N | PC
        Glucose, HbA1c | C | C | C | N | N | PC
        Medications | C | C | C | N | N | PC
        C: complete; N: not yet incorporated; PP: partial time period; PC: partial cohort
    • How many patients receive care at more than one institution?
        No. of institutions | Number | % | Cumulative %
        4 or 5 | 393 | 0.0 | 0.0
        3 | 8,409 | 0.9 | 0.9
        2 | 74,372 | 7.6 | 8.5
        1 | 892,468 | 91.4 | 100.0
        Includes the 5 institutions with all patient visits 2006-2010 submitted (as of April 2013).
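Once records share a study ID, a table like this one can be derived by counting distinct institutions per patient. The Python sketch below assumes a simple list of (study ID, institution) pairs, which is a hypothetical input layout, and uses toy data.

```python
from collections import Counter, defaultdict

def institution_overlap(visits):
    """Return rows of (n_institutions, patients, percent, cumulative percent).

    visits: iterable of (study_id, institution) pairs; repeated visits to
    the same institution are counted once per patient.
    """
    sites = defaultdict(set)
    for study_id, institution in visits:
        sites[study_id].add(institution)
    by_count = Counter(len(s) for s in sites.values())
    total = len(sites)
    rows, running = [], 0
    for n in sorted(by_count, reverse=True):
        running += by_count[n]
        rows.append((n, by_count[n],
                     100.0 * by_count[n] / total,
                     100.0 * running / total))
    return rows

visits = [
    ("S1", "A"), ("S1", "B"), ("S1", "C"),
    ("S2", "A"), ("S2", "B"),
    ("S3", "A"),
    ("S4", "B"), ("S4", "B"),
]
for row in institution_overlap(visits):
    print(row)
```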
    • Sample size/cohort comparison, by residential ZIP code, BRFSS* vs. Chicago Health Atlas
        Source | Min | Median | Mean | Max
        IL BRFSS, Chicago 2011 respondents | 4 | 15 | 16 | 33
        Chicago Health Atlas, patients with 2010 visit | 1,339 | 10,031 | 9,270 | 21,289
        *CDC Behavioral Risk Factor Surveillance System survey, Chicago sub-sample from Illinois dataset.
    • Diabetes prevalence estimate by residential ZIP
        Percent = (# of patients with > 1 diabetes mellitus diagnosis code) / (# of patients with visit in 2006-2010)
    • Finding type 2 diabetes in the health record
        • Diagnosis codes
        • Labs
        • Medications
        • Number of visits
        [Diagram scale: from “No, patient does not have type 2 diabetes” to “Yes, patient has type 2 diabetes”]
    • Diabetes prevalence estimate by residential ZIP
        Percent = (# of patients with > 1 diabetes mellitus diagnosis code or lab criteria met) / (# of patients with visit in 2006-2010)
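The prevalence formula above can be sketched in Python as follows. The slides do not define the lab criteria, so the HbA1c threshold of 6.5% used here is an assumption (it is the common diagnostic cutoff, not necessarily the Atlas rule), as are the record layout and the use of the ICD-9 prefix "250" to identify diabetes codes. The diagnosis rule follows the transcript's "> 1" wording, expressed as at least 2 codes.

```python
from collections import defaultdict

def prevalence_by_zip(patients, min_dx_codes=2, hba1c_threshold=6.5):
    """Percent of patients per ZIP meeting the diagnosis or lab criterion.

    min_dx_codes=2 mirrors "> 1 diagnosis code"; hba1c_threshold is an
    assumed lab criterion. Every patient passed in counts in the
    denominator, i.e. is assumed to have a visit in the study period.
    """
    cases = defaultdict(int)
    totals = defaultdict(int)
    for p in patients:
        n_dx = sum(1 for code in p["dx_codes"] if code.startswith("250"))
        lab_met = any(v >= hba1c_threshold for v in p["hba1c"])
        totals[p["zip"]] += 1
        if n_dx >= min_dx_codes or lab_met:
            cases[p["zip"]] += 1
    return {z: 100.0 * cases[z] / totals[z] for z in totals}

# Toy patients for illustration only.
patients = [
    {"zip": "60601", "dx_codes": ["250.00", "250.02"], "hba1c": []},
    {"zip": "60601", "dx_codes": ["250.00"], "hba1c": [7.1]},
    {"zip": "60601", "dx_codes": ["401.9"], "hba1c": [5.4]},
    {"zip": "60601", "dx_codes": [], "hba1c": []},
    {"zip": "60637", "dx_codes": ["250.00"], "hba1c": []},
    {"zip": "60637", "dx_codes": ["250.00", "250.00", "401.9"], "hba1c": []},
    {"zip": "60637", "dx_codes": [], "hba1c": [5.0]},
]
print(prevalence_by_zip(patients))
```

Note that the denominator is patients with a visit, not residents, which is the gap the later "Problem" slides address.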
    • Percent of Atlas patients with diabetes diagnosis in 2006-2010
        [Chart: percent, by minimum number of visits recorded]
        Illinois BRFSS estimates the prevalence of diabetes in Chicago at 9-11%.
    • Hypertension prevalence estimate by residential ZIP
        Percent = (# of patients with > 1 hypertension diagnosis code) / (# of patients with visit in 2006-2010)
    • Coronary heart disease prevalence estimate by residential ZIP
        Percent = (# of patients with > 1 CHD diagnosis code) / (# of patients with visit in 2006-2010)
    • Gunshot wound prevalence estimate by residential ZIP
        Percent = (# of patients with > 1 gunshot wound diagnosis code) / (# of patients with visit in 2006-2010)
    • Problem: Applying estimates to Chicago – rather than patient – populations
    • Age distribution comparison, 2010
        [Chart: percent, by age groups]
    • Race-ethnicity comparison, percent of total
        Group | Atlas | 2010 Census
        Non-Hispanic black | 31 | 32
        Non-Hispanic white | 20 | 32
        Hispanic | 14 | 29
        Non-Hispanic Asian | 4 | 5
        Not given/Unknown | 31 | 0
    • Geographic coverage by residential ZIP
        Percent = (# of patients with visit in 2010) / (2010 Census population)
    • Problem: ZIP Codes aren’t meaningful geographic units
    • Imputation of ZIP code rates to community area
        Diabetes hospitalization, 2010
        [Map panels: Rates by ZIP; Imputed using age & sex; Imputed using age, sex, & race-ethnicity]
    • Maps courtesy of Chieko Maene, University of Chicago, as part of CDPH-UC Diabetes Translational Research Collaboration.
    • Dasymetric areal interpolation
        1. Calculate for each ZIP code
            Male & female x 19 age groups = 28 rates, or
            Male & female x 19 age groups x 4 race-ethnicity groups = 84 rates
        2. Apply rates to corresponding population group in each census block to get counts
        3. Sum counts to Community area
        4. Calculate rates based on community area population denominators
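The four steps above can be sketched in Python under simplifying assumptions: each census block falls in exactly one ZIP code and one community area, and strata are (sex, age group) pairs. The data structures, names, and numbers are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from collections import defaultdict

def interpolate_to_community_areas(zip_rates, blocks):
    """Apply ZIP-level stratum rates to block populations, then aggregate.

    zip_rates: {(zip, stratum): rate}          (step 1, assumed precomputed)
    blocks: list of dicts with "zip", "community_area", and
            "population": {stratum: count}
    """
    expected = defaultdict(float)
    population = defaultdict(int)
    for block in blocks:
        area = block["community_area"]
        for stratum, pop in block["population"].items():
            # Step 2: rate x block population gives an expected count.
            expected[area] += zip_rates[(block["zip"], stratum)] * pop
            # Step 3: counts and denominators are summed by community area.
            population[area] += pop
    # Step 4: rate per community area from its own population denominator.
    return {a: expected[a] / population[a] for a in population if population[a]}

F, M = ("F", "40-44"), ("M", "40-44")
zip_rates = {("60601", F): 0.10, ("60601", M): 0.20,
             ("60602", F): 0.05, ("60602", M): 0.05}
blocks = [
    {"zip": "60601", "community_area": "Loop", "population": {F: 100, M: 100}},
    {"zip": "60602", "community_area": "Loop", "population": {F: 200, M: 0}},
    {"zip": "60602", "community_area": "Near North", "population": {F: 50, M: 50}},
]
print(interpolate_to_community_areas(zip_rates, blocks))
```

In the toy data, the Loop draws on blocks from two ZIP codes, so its rate blends both sets of stratum rates in proportion to who actually lives in each block.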
    • Dataset description elements
        • Description (who, what, where, when)
        • Definitions
        • Calculations and formulas
        • Limitations, disclaimers, sources of error
        • Benchmarks and references
    • Chicago Health Atlas Funders
        • Otho S.A. Sprague Institute
        • Northwestern Memorial Hospital Community Engagement
    • Health Atlas Team
        • Northwestern University: John Cashy, Anna Roberts, Sara Lake
        • Univ. of Illinois-Chicago: Bill Galanter, John Lazaro
        • Cook County Hospital System: Bala Hota, Amanda Grasso
        • Univ. of Chicago Medical Center: Chris Lyttle, Ben Vekhter, David Meltzer
        • Alliance of Chicago: Erin Kaleba, Fred Rachman, Jermaine Dellahousaye
        • Rush University Medical Center: Shannon Sims, Aaron Tabor
        • Vanderbilt University: Brad Malin
        • UIC Intern team: Ariadna Garcia, Pravin Babu Karuppaiah, Shazia Sathar, Ulas Keles (Sid Battacharya, Faculty mentor)
    • facebook.com/ChicagoPublicHealth | @ChiPublicHealth | HealthyChicago@CityofChicago.org | 312.747.9884 | CityofChicago.org/Health