Data Mining Presentation

Transcript

  • 1. Data Mining Applications in Healthcare. TEPR 2004, May 21, 2004. V. “Juggy” Jagannathan, VP of Research, [email_address]
  • 2. Introduction
      • Goals of today’s presentation:
          • Provide an overview of the technologies that are relevant to the development and deployment of data mining solutions in healthcare
          • Allow participants to evaluate where the technology is useful
  • 3. What is Data mining? Divining knowledge from data
  • 4. Topic Outline
      • Data mining
      • Uses
      • Algorithms
      • Technology
      • Applications in healthcare
  • 5. Data Mining Uses
      • Descriptive (understand and characterize)
          • Clustering
          • Summarization
          • Association Rules
          • Sequence Discovery
      • Predictive (extrapolate and forecast)
          • Classification
          • Regression
          • Time-Series
  • 6. Data Mining Algorithms
      • Classification
          • Statistical
          • K-nearest neighbors
          • Decision trees (ID3, C4.5)
          • Neural Networks (Self Organizing Maps)
      • Clustering
          • Hierarchical
          • Partitioned
          • Genetic
      • Association
          • Apriori Algorithm
          • If...Then rules
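
To make the Association entry on slide 6 concrete, here is a minimal sketch of Apriori-style frequent itemset mining followed by If...Then rule generation. It is an illustration, not the presenter's implementation: the function names, the thresholds, and the sample patient "transactions" are all hypothetical.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(1 for t in transactions if itemset <= t) / n

    # Start with all single items as candidates.
    current = {frozenset([i]) for t in transactions for i in t}
    frequent = {}
    k = 1
    while current:
        level = {c: s for c in current if (s := support(c)) >= min_support}
        frequent.update(level)
        k += 1
        # Join step: merge frequent itemsets into candidates of size k.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: keep a candidate only if all its (k-1)-subsets are frequent.
        current = {c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent

def rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) for rules meeting min_confidence."""
    out = []
    for itemset, sup in frequent.items():
        for r in range(1, len(itemset)):
            for ante in combinations(sorted(itemset), r):
                ante = frozenset(ante)
                conf = sup / frequent[ante]  # subsets of a frequent set are frequent
                if conf >= min_confidence:
                    out.append((ante, itemset - ante, conf))
    return out

# Hypothetical data: each set lists conditions or drugs recorded for one patient.
patients = [
    {"diabetes", "hypertension", "statin"},
    {"diabetes", "hypertension"},
    {"diabetes", "statin"},
    {"hypertension", "statin"},
    {"diabetes", "hypertension", "statin"},
]
frequent_sets = apriori(patients, min_support=0.6)
for ante, cons, conf in rules(frequent_sets, min_confidence=0.7):
    print(f"IF {sorted(ante)} THEN {sorted(cons)} (confidence {conf:.2f})")
```

The prune step is what distinguishes Apriori from brute-force counting: a candidate is only counted against the data if every smaller subset has already passed the support threshold.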
  • 7. Technology: Data Mining Infrastructure Technologies
      • Database Technologies
      • On-Line Analytical Processing (OLAP)
      • Visualization Technologies
      • Data scrubbing technologies
      • Natural Language Processing (NLP)
      Technology solutions
  • 8. Database Technologies
      • Data warehouse vs. Data mart
      • Relational technologies
          • Oracle
          • Microsoft
      • XML-databases
          • Raining Data
  • 9. On-Line Analytical Processing
      • Analyze multi-dimensional data
      • N-dimensional data cubes
      • Operations
          • Roll-up
          • Drill-down
          • Slice and dice
          • Pivot
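
The four OLAP operations on slide 9 can be sketched over a tiny in-memory fact table. This is a plain-Python illustration under stated assumptions, not how a real OLAP engine is built: the dimension names (`year`, `quarter`, `dept`, `payer`), the `cost` measure, and all values are hypothetical.

```python
from collections import defaultdict

# Hypothetical fact table: one row per cell of a 4-dimensional cube.
FACTS = [
    {"year": 2003, "quarter": "Q1", "dept": "Cardiology", "payer": "Medicare", "cost": 120},
    {"year": 2003, "quarter": "Q1", "dept": "Cardiology", "payer": "Private", "cost": 80},
    {"year": 2003, "quarter": "Q2", "dept": "Cardiology", "payer": "Medicare", "cost": 100},
    {"year": 2003, "quarter": "Q1", "dept": "Oncology", "payer": "Medicare", "cost": 200},
    {"year": 2003, "quarter": "Q2", "dept": "Oncology", "payer": "Private", "cost": 150},
    {"year": 2004, "quarter": "Q1", "dept": "Cardiology", "payer": "Medicare", "cost": 130},
]

def aggregate(facts, dims, measure="cost"):
    """Sum the measure grouped by the given dimensions.

    Fewer dimensions is a roll-up; more dimensions is a drill-down.
    """
    totals = defaultdict(int)
    for row in facts:
        totals[tuple(row[d] for d in dims)] += row[measure]
    return dict(totals)

def slice_(facts, **fixed):
    """Slice (one fixed dimension) or dice (several): filter rows by value."""
    return [r for r in facts if all(r[k] == v for k, v in fixed.items())]

def pivot(facts, row_dim, col_dim, measure="cost"):
    """Cross-tabulate two dimensions as {row_value: {col_value: total}}."""
    table = defaultdict(lambda: defaultdict(int))
    for r in facts:
        table[r[row_dim]][r[col_dim]] += r[measure]
    return {k: dict(v) for k, v in table.items()}

by_year = aggregate(FACTS, ("year",))                    # roll-up
by_year_quarter = aggregate(FACTS, ("year", "quarter"))  # drill-down
cardiology = slice_(FACTS, dept="Cardiology")            # slice
dept_by_payer = pivot(FACTS, "dept", "payer")            # pivot
```

Roll-up and drill-down share one function here because they differ only in how many dimensions the totals are grouped by.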
  • 10. Visualization
      • 2D/3D Charts
      • Topographic displays
      • Cluster displays
      • Histograms
      • Scatter plots
      • Advanced visualization (genomic data patterns)
      • http://www.ncbi.nlm.nih.gov/Tools/
  • 11. Data Scrubbing Technologies
      • Data cleansing
      • Filling in missing data
      • In healthcare, there is a strong need for de-identification to protect privacy
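
One simple way to fill in missing data, as mentioned on slide 11, is mean imputation. The sketch below is one possible approach and not necessarily what the presenter had in mind; the function name, the `sbp` (systolic blood pressure) field, and the `_imputed` flag are hypothetical.

```python
from statistics import mean

def impute_mean(records, field):
    """Return new records with missing `field` values replaced by the observed mean.

    A companion `<field>_imputed` flag records which values were filled in,
    so later analysis can tell measured values from estimates.
    """
    observed = [r[field] for r in records if r.get(field) is not None]
    fill = mean(observed) if observed else None
    cleaned = []
    for r in records:
        missing = r.get(field) is None
        row = dict(r)  # copy, so the input records are left untouched
        row[field] = fill if missing else r[field]
        row[field + "_imputed"] = missing
        cleaned.append(row)
    return cleaned

# Hypothetical vitals with one missing reading.
vitals = [
    {"id": 1, "sbp": 120},
    {"id": 2, "sbp": None},
    {"id": 3, "sbp": 140},
]
cleaned_vitals = impute_mean(vitals, "sbp")
```

Keeping the flag matters in a clinical setting: an imputed value should not be mistaken for a measurement.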
  • 12. De-Identification of Medical Records*
      • Names;
      • all elements of a street address, city, county, precinct, zip code, and their equivalent geocodes, except for the initial three digits of a zip code for areas that contain over 20,000 people;
      • all elements of dates (except year) for dates directly related to the individual (e.g., birth date, admission/discharge dates, date of death); and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;
      • telephone numbers;
      • fax numbers;
      • e-mail addresses;
      • social security numbers;
      • medical record numbers;
      • health plan beneficiary numbers;
      • account numbers;
      • certificate/license numbers;
      • license plate numbers, vehicle identifiers and serial numbers;
      • device identifiers and serial numbers;
      • URL addresses;
      • Internet Protocol (IP) address numbers;
      • biometric identifiers, including finger and voice prints;
      • full face photographic images and comparable images;
      • any other unique identifying number except as created by IHS to re-identify information.
      * Source: Policy and Procedures for De-Identification of Protected Health Information and Subsequent Re-Identification, 45 CFR 164.514(a)-(c), posted by IHS (Indian Health Service)
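
A few of the rules on slide 12 (dropping direct identifiers, truncating zip codes, reducing dates to the year, and aggregating ages over 89) can be sketched as a record transformation. This is an illustrative sketch only and not a compliance tool: the field names, the `populous_zip3` lookup set, and the choice to blank out zip codes for small areas are assumptions, and the remaining identifier categories are not handled.

```python
from datetime import date

# Hypothetical field names for identifiers that are removed outright.
DIRECT_IDENTIFIERS = ("name", "phone", "fax", "email", "ssn", "mrn")
DATE_FIELDS = ("birth_date", "admit_date", "discharge_date")

def deidentify(record, populous_zip3):
    """Return a de-identified copy of one patient record (a dict).

    populous_zip3: set of 3-digit zip prefixes whose areas contain over
    20,000 people (assumed to be supplied by the caller).
    """
    out = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}

    # Zip code: keep the initial three digits only for populous areas.
    if "zip" in out:
        zip3 = str(out["zip"])[:3]
        out["zip"] = zip3 if zip3 in populous_zip3 else None

    # Dates: keep the year only.
    for field in DATE_FIELDS:
        if out.get(field) is not None:
            out[field] = out[field].year

    # Ages over 89: aggregate into one category and drop the birth year,
    # since the year itself would indicate such an age.
    if out.get("age") is not None and out["age"] > 89:
        out["age"] = "90+"
        if "birth_date" in out:
            out["birth_date"] = None
    return out

# Hypothetical record.
patient = {
    "name": "Jane Doe", "zip": "26505", "age": 94,
    "birth_date": date(1910, 3, 2), "admit_date": date(2004, 5, 21),
    "diagnosis": "pneumonia",
}
safe = deidentify(patient, populous_zip3={"265"})
```

Free-text fields such as clinical notes are not covered here; identifiers embedded in narrative text need separate handling.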
  • 13. Natural Language Processing
      • NLP Uses
          • Translation, summarization, information extraction, document retrieval or categorization
      • NLP Approaches
          • Clustering, classification, linguistic analysis, knowledge-based analysis
      • NLP Companies in health care
          • A-Life
          • Language and Computing
  • 14. Applications in Healthcare
      • Safety and quality
      • Clinical Research
      • Financial
      • Public Health
  • 15. “To Err Is Human” IOM Report
      • Characterization
          • JCAHO Core Measures
          • CMS Quality measures starter set
          • Improves patient care – reactive response
      • Prediction
          • Identifying cases that can result in bad clinical outcomes and raising appropriate alarms
          • Impacts patient care – proactive response
  • 16. Quality Measures – Initial Set*: Starter Set of 10 Hospital Quality Measures (Condition, then Measures)
      • Acute Myocardial Infarction (AMI)/Heart attack
          • Aspirin at arrival
          • Aspirin at discharge
          • Beta-Blocker at arrival
          • Beta-Blocker at discharge
          • ACE Inhibitor for left ventricular systolic dysfunction
      • Heart Failure
          • Left ventricular function assessment
          • ACE inhibitor for left ventricular systolic dysfunction
      • Pneumonia
          • Initial antibiotic timing
          • Pneumococcal vaccination
          • Oxygenation assessment
      * Source: http://www.cms.hhs.gov/quality/hospital/overview.pdf
  • 17. Safety and Quality
      • University of Mississippi Medical Center
          • Data Warehouse Technologies to understand Medication Errors – Funded by AHRQ
          • Anonymous report data collection
          • Data mining technologies
          • Use of Neural networks and associative rule inference
  • 18. Clinical Research & Clinical Trials
      • Pharmacy and medical claims data
      • Drug efficacy and clinical trials: for example, how effective is a particular drug regimen?
      • Protein structure analysis
      • Genomic data mining
      • Diagnostic Imaging data research
  • 19. The bottom line on cost
      • General Utilization review: does the care provided meet accepted clinical and cost guidelines?
      • Drug Utilization review
      • Outlier analysis: exceptions to treatment, that is, treatments that cost more or less than normal
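
The outlier analysis on slide 19 could be approximated with a standard interquartile-range (IQR) rule that flags treatments costing unusually more or less than the rest. The slide does not say which method is meant, so the IQR rule, the 1.5 multiplier, and the sample costs below are assumptions for illustration.

```python
from statistics import quantiles

def cost_outliers(costs, k=1.5):
    """Flag cases whose cost lies outside [Q1 - k*IQR, Q3 + k*IQR].

    costs: {case_id: cost}. Returns {case_id: "high" or "low"}.
    """
    q1, _, q3 = quantiles(costs.values(), n=4)  # quartile cut points
    iqr = q3 - q1
    low_fence, high_fence = q1 - k * iqr, q3 + k * iqr
    flagged = {}
    for case_id, cost in costs.items():
        if cost > high_fence:
            flagged[case_id] = "high"
        elif cost < low_fence:
            flagged[case_id] = "low"
    return flagged

# Hypothetical treatment costs for one diagnosis group.
treatment_costs = {"A": 100, "B": 110, "C": 105, "D": 95, "E": 102, "F": 400, "G": 20}
exceptions = cost_outliers(treatment_costs)
```

Flagging both directions follows the slide: unusually cheap treatments can be as informative for review as unusually expensive ones.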
  • 20. Data mining in public health
      • Syndromic surveillance
      • Bio-terrorism detection
      • Communicable disease reporting (Centers for Disease Control (CDC))
      • DAWN (Drug Abuse Warning Network)
      • Food and Drug Administration (FDA): reporting of adverse drug events
      • Example effort: AEGIS
  • 21. Conclusion
      • Data mining
      • Uses
          • Descriptive
          • Predictive
      • Algorithms
          • Classification
          • Clustering
          • Association rules
      • Technology
          • Database
          • OLAP
          • Visualization
          • Scrubbing
          • NLP
      • Applications in healthcare
          • Safety and Quality
          • Clinical Research
          • Financial
          • Public Health
  • 22. Conclusion
      • Questions?
      Technology solutions. [email_address]
