Data of the Dead: An HMORN-wide Comparison of Death Data (Eastman)


Virtual Data Warehouse



  1. 1. David Eastman, KP CHR Southeast, Atlanta, GA; Don Bachman, MS, KP CHR, Portland, OR; Daniel Ng, BSE, MBA, KP DOR, Oakland, CA; Wei Tao, MS, KP DOR, Oakland, CA
  2. 2. Topics
     • Background
     • Survey of death data sources
       • The SOURCE variable
     • Methods of weaving death data together
       • The CONFIDENCE variable
     • Inter-source agreement analysis at KPGA
     • VDW death data QA program preliminary findings
       • Sprinkled throughout
  3. 3. Background
     • VDW death files contain:
       • Dates of death
       • Qualifiers: data source, confidence, date imputation flag
       • Causes of death
     • Typically, VDW sites have access to multiple sources of death data
     • How data are woven together varies considerably
  4. 4. Death Data Sources
     • HMO Membership
       • Clarity Patient table
       • Common membership
     • Hospital Discharges
     • State Death Certificates
     • Social Security Administration
     • National Death Index
     • Tumor Data
     • Clarity “Death Notes”
  5. 5. HMO Death Data: Pros and Cons
     • Pros
       • No probabilistic matching; unlikely to be the wrong person
       • Gold standard at some HMOs
     • Cons
       • No cause of death information
       • No inactive (prior) member deaths; death after disenrollment will probably be missed
       • At some HMOs, family/employer must notify HMO; less rigorously reported and dates may be inaccurate.
       • At other sites, hospital, home health and hospice care are well integrated in the EMR and provide very reliable death dates.
       • At some sites, this method is more prone to false negatives than Gov’t data
  6. 6. Gov’t Death Data: Pros and Cons
     • Pros
       • HMO enrollment status at time of death is irrelevant; death after disenrollment is more likely to be captured if it is part of the matching algorithm
       • Some gov’t sources contain cause of death information
     • Cons
       • Probabilistic matching on names/dates/SSN/etc.; the wrong person may get matched. Some sites cannot match on SSN, which makes the method less reliable.
       • Some sites do the matching themselves; some only get matches from the gov’t
       • May be more far-reaching than HMO data, but may not include deaths outside of the HMO’s state(s)
       • At some sites, this method is more prone to false positives than HMO data
  7. 7. The SOURCE Variable
     • Spec definition: Source of death data?
     • Spec values:
       • S = State Death files
       • N = National Death Index
       • T = Tumor data
       • Others are locally defined
     • Based on preliminary QA results from 7 sites:
       • 5 sites use the State Death files (S)
       • 1 site uses National Death Index data (N)
       • 2 sites use the Tumor data (T)
       • 7 sites include “other” local codes
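A minimal sketch of how a site might tally SOURCE codes against the spec values on this slide. Only S, N, and T come from the spec; the record layout, the function name `tally_sources`, and the local code `M` are hypothetical.

```python
from collections import Counter

# Spec-defined SOURCE codes from the slide; any other code is treated as local.
SPEC_SOURCES = {
    "S": "State Death files",
    "N": "National Death Index",
    "T": "Tumor data",
}

def tally_sources(records):
    """Count death records by SOURCE, separating spec codes from local codes."""
    spec_counts = Counter()
    local_counts = Counter()
    for rec in records:
        code = rec["source"]
        if code in SPEC_SOURCES:
            spec_counts[code] += 1
        else:
            local_counts[code] += 1
    return spec_counts, local_counts

# Hypothetical sample: "M" stands in for a locally defined membership code.
sample = [
    {"mrn": 1, "source": "S"},
    {"mrn": 2, "source": "M"},
    {"mrn": 3, "source": "S"},
    {"mrn": 4, "source": "T"},
]
spec_counts, local_counts = tally_sources(sample)
```

Keeping local codes in a separate counter makes it easy to see how much of a site's death file falls outside the shared spec values.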
  8. 8. Methods of Weaving Death Data Together
     • Descriptions of methods used at:
       • KPGA
       • KPNC
       • KPNW
  9. 9. KPGA Method - Step 1
     • Merge all possible death data into a research data warehouse table
  10. 10. KPGA Method - Step 2
      • Select the “best quality” data to populate the VDW
      • HMO sources favored (vs. Gov’t sources)
      • Confidence variable: source agreement and postmortem activity
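The selection step might look roughly like the sketch below. It is not the KPGA implementation: the slide says only that HMO sources are favored and that confidence reflects source agreement and postmortem activity. The source labels, the priority order among government sources, the 30-day agreement tolerance, and the mapping to E/F/P are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

# Assumed preference order: HMO first, then government sources (hypothetical).
SOURCE_PRIORITY = ["HMO", "STATE", "SSA"]

@dataclass
class DeathRecord:
    source: str
    death_date: date

def select_best(records, last_activity: Optional[date] = None, tolerance_days: int = 30):
    """Pick one record for a person and assign a confidence of E, F, or P."""
    known = [r for r in records if r.source in SOURCE_PRIORITY]
    if not known:
        return None
    # Favor the record whose source ranks highest in the priority list.
    best = min(known, key=lambda r: SOURCE_PRIORITY.index(r.source))
    # Source agreement: another record reports a date within the tolerance.
    corroborated = any(
        r is not best and abs((r.death_date - best.death_date).days) <= tolerance_days
        for r in known
    )
    # Postmortem activity: utilization recorded after the chosen death date.
    postmortem = last_activity is not None and last_activity > best.death_date
    if postmortem:
        confidence = "P"
    elif corroborated:
        confidence = "E"
    else:
        confidence = "F"
    return best, confidence

records = [DeathRecord("SSA", date(2010, 1, 5)), DeathRecord("HMO", date(2010, 1, 3))]
best, confidence = select_best(records)
```

Here the HMO record is chosen and, because the SSA date falls within the assumed tolerance, the death is rated E. Passing a `last_activity` date after the death date would downgrade it to P.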
  11. 11. KPNC Method
      1. Input Pre-Processing
         • Combine member records containing demographic variables, contact dates, and membership dates
      2. QualityStage matching
         • Probabilistic matching of KPNC members to CA state and SSA death records
      3. Initial Filtering
         • Filter large number of match output records down to manageable size
         • Resulting files (KPNC-CA and KPNC-SSA matches) have multiple matches per MRN
      4. Ranking & Selection
         • Select the single, best match per MRN based on weighted comparison of match linkweights, demographic vars, and contact and membership dates
      5. Assign Final Variables
         • Select best death date
         • Assign scores for overall confidence and confidence of CA and SSA matches
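Step 4 (Ranking & Selection) can be illustrated with a small sketch. The slide names the inputs to the weighted comparison but not the weights or scales, so the weights below, the assumption that each component is normalized to 0-1, and all field names are hypothetical.

```python
# Hypothetical weights; the slide lists the inputs but not their values.
WEIGHTS = {"linkweight": 0.6, "demographic_score": 0.3, "date_score": 0.1}

def match_score(match):
    """Weighted sum of the match components (each assumed to be in 0-1)."""
    return sum(weight * match[field] for field, weight in WEIGHTS.items())

def best_match_per_mrn(matches):
    """Keep only the highest-scoring candidate match for each MRN."""
    best = {}
    for m in matches:
        current = best.get(m["mrn"])
        if current is None or match_score(m) > match_score(current):
            best[m["mrn"]] = m
    return best

# Two candidate death records for member "A", one for member "B".
candidates = [
    {"mrn": "A", "death_id": "ca-1", "linkweight": 0.9, "demographic_score": 0.5, "date_score": 0.5},
    {"mrn": "A", "death_id": "ssa-2", "linkweight": 0.7, "demographic_score": 1.0, "date_score": 1.0},
    {"mrn": "B", "death_id": "ca-3", "linkweight": 0.8, "demographic_score": 0.8, "date_score": 0.8},
]
selected = best_match_per_mrn(candidates)
```

With these illustrative weights, the second candidate for member "A" wins despite its lower linkweight, because its demographic and date components agree better.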
  12. 12. KPNW Method - Part 1
      • Internal KP data: only use reliable sources
        1. Patient table from Clarity. Most reliable and best source of death dates based on internal validation and subsequent CESR QA.
        2. Common Membership, including a specific death table (older sources don’t include death dates, but do correctly identify dead patients)
        3. KPNW tumor registry
        4. Probabilistic match of KP members to OR and WA state data by CHR staff (unlike many other sites).
           • OR and WA state don’t do the matching and won’t share SSNs.
           • CHR staff match members from the past 2 years to the state data. Only current source of cause of death. 18-36 month lag.
  13. 13. KPNW Method - Part 2
      • Been creating death files for several years
      • Death files only include those who we believe have truly died
      • Death dates from KP internal data appear very reliable based on CESR QA
      • Death dates from the Tumor Registry and state data are also excellent but not as good as internal KP data
      • Death more than 2 years after disenrollment will probably be missed with current system
      • Would benefit from switching to a common HMORN confidence variable algorithm
  14. 14. The CONFIDENCE Variable
      • Spec definition: “How you rate the accuracy of the observation based on source, match, # of reporting sources, discrepancies, etc.”
      • Spec values: E = Excellent, F = Fair, P = Poor
      • Based on preliminary QA results from 7 sites, by site:
        • % E ranges from 20% to 100%
        • % F ranges from 0% to 55%
        • % P ranges from 0% to 50%
        • % E + % F ranges from 50% to 100%
      • The CONFIDENCE variable is inconsistently implemented!
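The per-site percentages reported on this slide could be computed along the following lines. This is only a sketch: the function name and the sample values are hypothetical, and it assumes each site supplies a flat list of CONFIDENCE codes.

```python
from collections import Counter

LEVELS = ("E", "F", "P")  # spec values: Excellent, Fair, Poor

def confidence_distribution(values):
    """Return the percentage of records at each CONFIDENCE level."""
    counts = Counter(values)
    total = sum(counts.values())
    if total == 0:
        return {level: 0.0 for level in LEVELS}
    # Codes outside E/F/P still count toward the total, so they lower the percentages.
    return {level: 100.0 * counts.get(level, 0) / total for level in LEVELS}

# Hypothetical site with four death records.
dist = confidence_distribution(["E", "E", "F", "P"])
```

Running the same function over each site's file and comparing the results is one way to surface the inconsistency the slide describes.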
  15. 15. The CONFIDENCE Variable
      • What does the confidence variable measure?
        • Likelihood of death?
        • Accuracy of the death date?
        • Likelihood that the cause of death information is linked to the correct person?
  16. 16. Inter-source Agreement Analysis at KPGA
      • Where do data come from?
      • Corroborated deaths
      • Inter-source death date agreement
      • Postmortem activity
      • Confidence distribution
  17. 17. Where Do Data Come From? (KPGA)
  18. 18. Corroborated Deaths (KPGA)
  19. 19. Inter-source Death Date Agreement (KPGA)
  20. 20. Postmortem Activity (KPGA)
  21. 21. Confidence Distribution (KPGA)
  22. 22. Recommendations
      • Create new confidence variables
        • Confidence that the patient is really dead
        • Confidence in the death date
        • Confidence in the linkage to external source data
        • KPNC has implemented these as local variables
      • Develop a common algorithm to determine the values of these confidence variables to give them a common meaning.
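One way to picture the three proposed variables is as separate fields on a record, as in the sketch below. The class and field names are hypothetical, and reusing the existing E/F/P scale for each field is an assumption; the slides do not say what values the new variables would take.

```python
from dataclasses import dataclass

LEVELS = ("E", "F", "P")  # assumed: same scale as the current CONFIDENCE variable

@dataclass(frozen=True)
class DeathConfidence:
    death_occurred: str    # confidence that the patient is really dead
    death_date: str        # confidence in the death date
    external_linkage: str  # confidence in the linkage to external source data

    def __post_init__(self):
        # Reject any value outside the assumed E/F/P scale.
        for name in ("death_occurred", "death_date", "external_linkage"):
            value = getattr(self, name)
            if value not in LEVELS:
                raise ValueError(f"{name} must be one of {LEVELS}, got {value!r}")

# A death that is well established but has an uncertain date and weak linkage.
example = DeathConfidence(death_occurred="E", death_date="F", external_linkage="P")
```

Splitting the single variable this way lets a record say, for instance, that the death is certain while the date is only approximate, which the current single code cannot express.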
  23. 23. Any Questions?