Investigating the Health of Adults: Leveraging Large Data Sets For Your Study, Report or Program


Overview of UCSF-CTSI Comparative Effectiveness Large Dataset Analysis Core with emphasis on large, public data sets for studying the health of adults and the care they receive.


  1. UCSF’s Comparative Effectiveness Large Dataset Analytic Core
     Janet Coffman, PhD
     Philip R. Lee Institute for Health Policy Studies
     University of California, San Francisco
     September 21, 2011
  2. CELDAC Partners
     CELDAC is a partnership at UCSF among the
       – Philip R. Lee Institute for Health Policy Studies
       – Academic Research Systems
       – Department of Orthopedic Surgery
       – Clinical and Translational Science Institute
     Funding
       – Administrative supplement to the NCRR grant for UCSF’s Clinical & Translational Science Institute
       – California HealthCare Foundation
  3. CELDAC Personnel
     Faculty
       • Jim G. Kahn
       • Janet Coffman
       • Claire Brindis
       • Steve Takemoto
       • Adams Dudley
       • Kirsten Johansen
     IHPS Staff
       • Leon Traister
       • Claire Will
     ARS Staff
       • Rob Wynden
       • Ketty Mobed
       • Hari Rekapalli
       • Prakash Lakshminarayanan
  4. CELDAC Mission
     The mission of CELDAC is to enhance UCSF’s capacity for analysis of large local, state, and national health datasets to conduct comparative effectiveness research and other types of health services and health policy research.
  5. Major Types of Large Datasets Used in Health Services Research
     • Survey
       – Description: Collects information from individuals, families, or organizations
       – Examples: Medical Expenditure Panel Survey; National Health and Nutrition Examination Survey
     • Administrative claims
       – Description: Information from records of health professionals and health care facilities, usually from billing records
       – Examples: Medicare Research Identifiable Files; HCUP National Inpatient Sample
     • Registries
       – Description: Information from datasets that incorporate all persons with a particular condition(s)
       – Examples: California Cancer Registry; San Francisco Mammography Registry
  6. Major Types of Units of Observation
     • Individual
       – Behavioral Risk Factor Surveillance System
       – National Health and Nutrition Examination Survey
     • Household
       – Medical Expenditure Panel Survey
       – National Health Interview Survey
     • Visit or discharge
       – National Ambulatory Medical Care Survey
       – HCUP National Inpatient Sample
     • Physician
       – American Medical Association Masterfile
       – HSC Health Tracking Physician Survey
     • Facility (e.g., hospital, clinic)
       – American Hospital Association Annual Survey
       – California OSHPD Hospital Annual Financial Data
     • Geographic area (e.g., county, state)
       – US Census
       – Area Resource File
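To make the idea of a unit of observation concrete, the sketch below collapses a few discharge-level records into facility-level rows. The records, the field names (`hospital_id`, `length_of_stay`), and the function `to_facility_level` are hypothetical and are not drawn from any of the data sets named on this slide.

```python
from collections import Counter

# Hypothetical discharge-level records: one row per hospital discharge.
discharges = [
    {"hospital_id": "H1", "length_of_stay": 3},
    {"hospital_id": "H1", "length_of_stay": 5},
    {"hospital_id": "H2", "length_of_stay": 2},
]


def to_facility_level(rows):
    """Collapse discharge-level rows into one summary row per hospital."""
    counts = Counter()
    total_days = Counter()
    for row in rows:
        counts[row["hospital_id"]] += 1
        total_days[row["hospital_id"]] += row["length_of_stay"]
    return {
        hospital: {
            "discharges": counts[hospital],
            "mean_length_of_stay": total_days[hospital] / counts[hospital],
        }
        for hospital in counts
    }


facility = to_facility_level(discharges)
print(facility)
```

The same underlying events can therefore support analyses at the discharge level or the facility level, depending on how the rows are grouped.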
  7. Major Types of Designs for Surveys
     • Cross-sectional
       – Description: Data collected from a single sample at a single point in time
       – Examples: National Health Interview Survey; National Health and Nutrition Examination Survey; California Health Interview Survey
     • Panel
       – Description: Data collected from a single sample at multiple points in time
       – Examples: Medical Expenditure Panel Survey; Health and Retirement Study; National Longitudinal Survey of Youth
  8. Medical Expenditure Panel Survey
     • Nationally representative sample of 22,000 to 37,000 persons
     • Overlapping panel design
     • 2 years of data collected through 5 rounds of interviews
     • Three major components
       – Household survey
       – Data on cost and utilization from providers caring for household survey participants
       – Survey of employers regarding employer-sponsored health insurance benefits
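As a rough illustration of the overlapping panel design mentioned on this slide, the sketch below assumes that one new panel starts each calendar year and that each panel is followed for two consecutive years, so a typical year draws on two panels. The function `panels_covering` and the start year 2000 are hypothetical, and the five interview rounds are not modeled.

```python
def panels_covering(year, first_panel_year):
    """Return the start years of the panels that contribute data to `year`.

    Assumes one new panel starts each calendar year and each panel is
    followed for two consecutive years (its start year and the next).
    """
    # The panel in its second year, then the panel in its first year.
    candidates = [year - 1, year]
    return [start for start in candidates if start >= first_panel_year]


# Hypothetical survey whose first panel starts in 2000.
for y in (2000, 2001, 2002):
    print(y, panels_covering(y, first_panel_year=2000))
```

Under these assumptions only the first year is covered by a single panel; every later year combines a panel in its second year with a panel in its first.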
  9. Examples of UCSF Faculty Publications Using MEPS
     • Newacheck P, Kim S. A national profile of health care utilization and expenditures for children with special health care needs. Archives of Pediatrics and Adolescent Medicine. 2005 Jan;159(1):10-7.
     • Yelin E., et al. Medical care expenditures and earnings losses among persons with arthritis and other rheumatic conditions in 2003, and comparisons with 1997. Arthritis and Rheumatism. 2007 May;56(5):1397-407.
  10. National Health and Nutrition Examination Survey
      • Nationally representative sample of 5,000 persons per year
      • Data collected in 15 counties per year
      • Two major components
        – Interviews: demographic characteristics, socioeconomic status, diet, health behaviors
        – Physical examinations: medical, dental, physiological, lab tests
  11. Examples of UCSF Faculty Publications Using NHANES
      • Seligman H.K. Food insecurity is associated with diabetes mellitus: results from the National Health Examination and Nutrition Examination Survey (NHANES) 1999-2002. Journal of General Internal Medicine. 2007 Jul;22(7):1018-23.
      • Woodruff T, Zota A, Schwartz J. Environmental chemicals in pregnant women in the United States: NHANES 2003-2004. Environmental Health Perspectives. 2011 Jun;119(6):878-85.
  12. CELDAC Goals
      • Accelerate access to and use of local, state, and national health datasets, as a model for other CTSAs and health research organizations.
      • Enhance UCSF researchers’ ability to compete for funding to use large data sets to conduct CER.
      • Develop procedures and infrastructure by conducting pilot studies.
      • Support additional studies on the comparative effectiveness of clinical interventions.
      • Provide consultation to researchers currently working with or interested in working with large data sets.
  13. Find Large Datasets
      Guided search tool to find the best datasets for a project. Builds on previous efforts by Andy Bindman, Nancy Adler, Claire Brindis, Charlie Irwin and others.
  14. Search Results
      Search for administrative data on infants’ use of health care services
  15. Analyze Large Data Sets
      • CELDAC has created a repository of select large, public data sets that are available to UCSF faculty at no cost.
      • These data sets include
        – HCUP National Emergency Department Sample
        – HCUP National Inpatient Sample
        – HCUP Kids Inpatient Databases
        – HCUP State Emergency Department and Inpatient Databases (select states)
        – American Hospital Association Annual Survey
        – Area Resource File
  16. Provide Consultation
      • Study design/conceptualization
      • Identification of relevant datasets
      • Assistance with data set acquisition
      • Cohort selection
      • Data cleaning
      • Linking data sets
      • Strategies to deal with common methodological issues in analysis of observational data
      • Programming support for preliminary analyses
  17. Test New Methods for Working with Large Data Sets
      • Conventional methods for managing large data sets have important limitations, especially for studies that draw data from multiple data sets
        – They require programmers with expertise in managing and querying large data sets
        – Source data tables continue as individual entities
        – Manipulations and linkages between tables require awareness of each table’s architecture and customized “one-off” programming
  18. Test New Methods for Working with Large Data Sets
      • An Integrated Data Repository (IDR) with an i2b2 infrastructure offers an alternative
        – Supports integration of diverse sources of data
        – Can translate diverse coding of the same content into standard coding
        – Flexibility in data exploration
        – Intuitive drag-and-drop query interface
        – Query result sets can be exported for analysis and reporting using SAS, STATA, or other software
        – Reliable: backed up every 2 hours
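The idea of translating diverse coding of the same content into standard coding can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the i2b2 interface or the IDR's actual implementation; the source names, the codes, and the `standardize` helper are all hypothetical.

```python
# Hypothetical per-source code maps: each source encodes sex differently.
SOURCE_TO_STANDARD = {
    "source_a": {"1": "male", "2": "female"},
    "source_b": {"M": "male", "F": "female"},
}


def standardize(source, raw_code):
    """Map a source-specific code to the shared vocabulary, or None if unmapped."""
    return SOURCE_TO_STANDARD.get(source, {}).get(raw_code)


# Records from two sources, including one code with no mapping ("U").
records = [("source_a", "2"), ("source_b", "F"), ("source_b", "U")]
standardized = [standardize(source, code) for source, code in records]
print(standardized)
```

Returning `None` for unmapped codes keeps them visible for review rather than silently assigning them to a category.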
  19. Test New Methods for Working with Large Data Sets
      • Pilot Projects
        – Integrated repository of data on spine surgery procedures and outcomes from five data sources
        – Graphical user interface for browsing California Office of Statewide Health Planning and Development data on hospital discharges
  20. Questions for Discussion
      • What services relating to large data set analysis would be most useful to you?
      • What data sets are of greatest interest to you?
      • How could CELDAC partner effectively with researchers in your school/department/division?
  21. Contact CELDAC
      • Jim G. Kahn:
      • Janet Coffman:
      • Claire Will: 6009