Data Mining Presentation


  1. Data Mining Applications in Healthcare. TEPR 2004, May 21, 2004. V. “Juggy” Jagannathan, VP of Research, [email_address]
  2. Introduction
     Goals of today’s presentation:
     • Provide an overview of the technologies that are relevant to the development and deployment of data mining solutions in healthcare
     • Allow participants to evaluate where the technology is useful
  3. What is data mining? Divining knowledge from data
  4. Topic Outline
     • Data mining
     • Uses
     • Algorithms
     • Technology
     • Applications in healthcare
  5. Data Mining Uses
     • Descriptive (understand and characterize)
       • Clustering
       • Summarization
       • Association Rules
       • Sequence Discovery
     • Predictive (extrapolate and forecast)
       • Classification
       • Regression
       • Time-Series
  6. Data Mining Algorithms
     • Classification
       • Statistical
       • K-nearest neighbors
       • Decision trees
         • ID3
         • C4.5
       • Neural Networks (Self Organizing Maps)
     • Clustering
       • Hierarchical
       • Partitioned
       • Genetic
     • Association
       • Apriori Algorithm
       • If…Then rules
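As an illustration of the association branch above, here is a minimal sketch of Apriori-style frequent itemset mining together with the confidence of an If…Then rule. The transactions, item names, and support threshold are hypothetical and chosen only for illustration; they do not come from the presentation.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for every itemset whose support >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)
    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    result = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate itemset.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        result.update(frequent)
        # Join step: combine frequent k-itemsets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return result


def confidence(itemsets, antecedent, consequent):
    """Confidence of the rule 'If antecedent Then consequent'."""
    a = frozenset(antecedent)
    return itemsets[a | frozenset(consequent)] / itemsets[a]


# Hypothetical discharge-medication "baskets", one set per patient.
transactions = [
    {"aspirin", "beta_blocker", "ace_inhibitor"},
    {"aspirin", "beta_blocker"},
    {"aspirin", "ace_inhibitor"},
    {"beta_blocker", "ace_inhibitor"},
    {"aspirin", "beta_blocker", "statin"},
]
itemsets = frequent_itemsets(transactions, min_support=0.6)
rule_conf = confidence(itemsets, {"aspirin"}, {"beta_blocker"})
```

With these made-up baskets, only one pair survives the support threshold, so the only rules that can be scored are between aspirin and beta_blocker. Real clinical data would need much lower thresholds and far more careful interpretation.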
  7. Technology
     Data Mining Infrastructure Technologies
     • Database Technologies
     • On-Line Analytical Processing (OLAP)
     • Visualization Technologies
     • Data scrubbing technologies
     • Natural Language Processing (NLP)
     Technology solutions
  8. Database Technologies
     • Data warehouse vs. data mart
     • Relational technologies
       • Oracle
       • Microsoft
     • XML-databases
       • Raining Data
  9. On-Line Analytical Processing
     • Analyze multi-dimensional data
     • N-dimensional data cubes
     • Operations
       • Roll-up
       • Drill-down
       • Slice and dice
       • Pivot
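The cube operations listed above can be sketched on a tiny in-memory fact table. This is only an illustration in plain Python: the dimensions (year, quarter, dept, dx), the admissions measure, and all values are hypothetical, and a real OLAP engine would precompute and index these aggregates.

```python
from collections import defaultdict

# Hypothetical fact table: one row per cell of a 4-dimensional cube.
facts = [
    {"year": 2003, "quarter": "Q1", "dept": "Cardiology", "dx": "AMI", "admissions": 12},
    {"year": 2003, "quarter": "Q1", "dept": "Cardiology", "dx": "HF", "admissions": 7},
    {"year": 2003, "quarter": "Q2", "dept": "Cardiology", "dx": "AMI", "admissions": 9},
    {"year": 2003, "quarter": "Q2", "dept": "Pulmonology", "dx": "Pneumonia", "admissions": 15},
    {"year": 2004, "quarter": "Q1", "dept": "Cardiology", "dx": "AMI", "admissions": 11},
    {"year": 2004, "quarter": "Q1", "dept": "Pulmonology", "dx": "Pneumonia", "admissions": 20},
]


def aggregate(rows, dims, measure="admissions"):
    """Sum the measure grouped by the given dimensions.

    Fewer dimensions is a roll-up (coarser view); more is a drill-down.
    """
    totals = defaultdict(int)
    for r in rows:
        totals[tuple(r[d] for d in dims)] += r[measure]
    return dict(totals)


def slice_rows(rows, **fixed):
    """Slice (one fixed dimension) or dice (several): keep only matching rows."""
    return [r for r in rows if all(r[d] == v for d, v in fixed.items())]


def pivot(rows, row_dim, col_dim, measure="admissions"):
    """Cross-tabulate two dimensions as {row_value: {col_value: total}}."""
    table = defaultdict(lambda: defaultdict(int))
    for r in rows:
        table[r[row_dim]][r[col_dim]] += r[measure]
    return {k: dict(v) for k, v in table.items()}


by_year = aggregate(facts, ["year"])                  # roll-up
by_quarter = aggregate(facts, ["year", "quarter"])    # drill-down
cardiology = slice_rows(facts, dept="Cardiology")     # slice
dept_by_year = pivot(facts, "dept", "year")           # pivot
```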
  10. Visualization
      • 2D/3D charts
      • Topographic displays
      • Cluster displays
      • Histograms
      • Scatter plots
      • Advanced visualization (genomic data patterns)
      • http://www.ncbi.nlm.nih.gov/Tools/
  11. Data Scrubbing
      • Data cleansing
      • Filling in missing data
      • In healthcare, there is a strong need for de-identification to protect privacy
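One common way to fill in missing data is mean imputation, sketched below. The slide does not name a specific method, so this choice, the field name systolic_bp, and the sample records are all assumptions made for illustration.

```python
from statistics import mean


def impute_mean(records, field):
    """Return copies of the records with missing (None) values of `field`
    replaced by the mean of the observed values."""
    observed = [r[field] for r in records if r[field] is not None]
    fill = mean(observed)
    return [dict(r, **{field: fill}) if r[field] is None else dict(r) for r in records]


# Hypothetical vitals with one missing reading.
records = [
    {"patient": "A", "systolic_bp": 120},
    {"patient": "B", "systolic_bp": None},
    {"patient": "C", "systolic_bp": 140},
    {"patient": "D", "systolic_bp": 130},
]
cleaned = impute_mean(records, "systolic_bp")
```

Mean imputation is simple but shrinks the variance of the field. Whether that is acceptable depends on the downstream analysis.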
  12. De-Identification of Medical Records*
      • names;
      • all elements of a street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code for areas that contain over 20,000 people;
      • all elements of dates (except year) for dates directly related to the individual (e.g., birth date, admission/discharge dates, date of death); and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
      • telephone numbers;
      • fax numbers;
      • e-mail addresses;
      • social security numbers;
      • medical record numbers;
      • health plan beneficiary numbers;
      • account numbers;
      • certificate/license numbers;
      • license plate numbers, vehicle identifiers and serial numbers;
      • device identifiers and serial numbers;
      • URL addresses;
      • Internet Protocol (IP) address numbers;
      • biometric identifiers, including finger and voice prints;
      • full face photographic images and comparable images;
      • any other unique identifying number except as created by IHS to re-identify information.
      * Source: Policy and Procedures for De-Identification of Protected Health Information and Subsequent Re-Identification, 45 CFR 164.514(a)-(c), posted by IHS (Indian Health Service)
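A few of the rules above (dropping direct identifiers, truncating zip codes, reducing dates to the year, and aggregating ages over 89) can be sketched as follows. This is a partial illustration, not a complete or compliant de-identification routine: the field names and the set of populous zip prefixes are hypothetical, and the "000" replacement for small areas follows the wording of 45 CFR 164.514(b) rather than the slide.

```python
from datetime import date

# Hypothetical lookup: 3-digit zip prefixes covering more than 20,000 people.
POPULOUS_ZIP3 = {"265", "100"}

# Hypothetical field names for identifiers that are simply dropped.
DIRECT_IDENTIFIERS = {"name", "phone", "fax", "email", "ssn", "mrn"}


def deidentify(record):
    """Return a copy of the record with a subset of the listed rules applied."""
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    # Zip code: keep only the first three digits, and only for populous areas.
    zip3 = str(out.pop("zip", ""))[:3]
    out["zip3"] = zip3 if zip3 in POPULOUS_ZIP3 else "000"
    # Dates related to the individual: keep the year only.
    if "admit_date" in out:
        out["admit_year"] = out.pop("admit_date").year
    # Ages over 89 are aggregated into a single "90 or older" category.
    if "age" in out and out["age"] > 89:
        out["age"] = "90+"
    return out


patient = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "26505",
    "admit_date": date(2004, 5, 21),
    "age": 93,
    "dx": "pneumonia",
}
safe = deidentify(patient)
```

A fuller version would also have to suppress years that reveal an age over 89 and handle free-text fields, which is where the NLP techniques mentioned elsewhere in the deck come in.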
  13. Natural Language Processing
      • NLP uses
        • Translation, summarization, information extraction, document retrieval or categorization
      • NLP approaches
        • Clustering, classification, linguistic analysis, knowledge-based analysis
      • NLP companies in health care
        • A-Life
        • Language and Computing
  14. Applications in Healthcare
      • Safety and quality
      • Clinical Research
      • Financial
      • Public Health
  15. “To Err Is Human” IOM Report
      • Characterization
        • JCAHO Core Measures
        • CMS Quality measures starter set
        • Improves patient care – reactive response
      • Prediction
        • Identifying cases that can result in bad clinical outcomes and raising appropriate alarms
        • Impacts patient care – proactive response
  16. Quality Measures – Initial Set*
      Starter Set of 10 Hospital Quality Measures (condition, then its measures)
      • Acute Myocardial Infarction (AMI)/Heart attack
        • Aspirin at arrival
        • Aspirin at discharge
        • Beta-Blocker at arrival
        • Beta-Blocker at discharge
        • ACE inhibitor for left ventricular systolic dysfunction
      • Heart Failure
        • Left ventricular function assessment
        • ACE inhibitor for left ventricular systolic dysfunction
      • Pneumonia
        • Initial antibiotic timing
        • Pneumococcal vaccination
        • Oxygenation assessment
      * Source: http://www.cms.hhs.gov/quality/hospital/overview.pdf
  17. Safety and Quality
      • University of Mississippi Medical Center
        • Data warehouse technologies to understand medication errors – funded by AHRQ
        • Anonymous report data collection
        • Data mining technologies
        • Use of neural networks and associative rule inference
  18. Clinical Research & Clinical Trials
      • Pharmacy and medical claims data
      • Drug efficacy and clinical trials – for example, how effective is a particular drug regimen?
      • Protein structure analysis
      • Genomic data mining
      • Diagnostic imaging data research
  19. The bottom line on cost
      • General utilization review – does the care provided meet accepted clinical and cost guidelines?
      • Drug utilization review
      • Outlier analysis – exceptions to treatment: analyzing treatments that cost more or less than normal
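Outlier analysis of treatment costs can be sketched with the interquartile-range (IQR) rule, which flags values far below or far above the middle half of the data. The slide does not specify a method, so the IQR rule, the multiplier k=1.5, and the cost figures below are assumptions made for illustration.

```python
from statistics import quantiles


def cost_outliers(costs, k=1.5):
    """Split out costs below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = quantiles(costs, n=4)
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    low = [c for c in costs if c < low_fence]
    high = [c for c in costs if c > high_fence]
    return low, high


# Hypothetical per-episode costs for one treatment, in dollars.
costs = [4200, 4500, 4700, 4800, 5000, 5100, 5300, 5500, 900, 15000]
unusually_cheap, unusually_expensive = cost_outliers(costs)
```

Both tails matter here: an unusually expensive episode may indicate overuse, while an unusually cheap one may indicate care that fell short of guidelines.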
  20. Data mining in public health
      • Syndromic surveillance
      • Bio-terrorism detection
      • Communicable disease reporting (Centers for Disease Control and Prevention (CDC))
      • DAWN (Drug Abuse Warning Network)
      • Food and Drug Administration (FDA) – reporting of adverse drug events
      Example effort: AEGIS
  21. Conclusion
      • Data mining
      • Uses
        • Descriptive
        • Predictive
      • Algorithms
        • Classification
        • Clustering
        • Association rules
      • Technology
        • Database
        • OLAP
        • Visualization
        • Scrubbing
        • NLP
      • Applications in healthcare
        • Safety and Quality
        • Clinical Research
        • Financial
        • Public Health
  22. Conclusion
      • Questions?
      Technology solutions [email_address]
