Ama ieee-rpg
 

    Presentation Transcript

    • LOW/HIGH-RISK DIABETES GROUP SEGMENTATION USING α-TREES
      AMA-IEEE Medical Technology Conference 2011
      Anurekha Ramakrishnan¹, Yubin Park², Joydeep Ghosh²
      ¹ Dept. of Statistics and Scientific Computation
      ² Dept. of Electrical and Computer Engineering
      The University of Texas at Austin
    • Barriers to Machine Learning Adoption in Healthcare
      Class-imbalance
      Target ratios are often extremely skewed.
      Mismatch with Performance Metrics
      ‘Misclassification rates’ may not be relevant.
      Asymmetric costs involved.
      ‘Sensitivity/Specificity’ or ‘Lift’ should be a part of learning goals.
      Interpretation of Results
      Simple AND/OR Rules (in Natural Language) are desirable.
      We suggest a possible solution for these problems using:
      Modified α-Trees,
      Disjunctive Combination of Rules.
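      The metrics named on this slide can be illustrated with a minimal sketch. The function name `segment_metrics` and the toy labels are hypothetical, not from the presentation; a record counts as "predicted positive" when it falls inside an extracted segment.

```python
def segment_metrics(in_segment, has_condition):
    """Sensitivity, specificity and lift of a segment used as a positive prediction.

    in_segment / has_condition are equal-length sequences of 0/1 flags
    (hypothetical inputs for illustration).
    """
    pairs = list(zip(in_segment, has_condition))
    tp = sum(1 for s, y in pairs if s and y)
    fp = sum(1 for s, y in pairs if s and not y)
    fn = sum(1 for s, y in pairs if not s and y)
    tn = sum(1 for s, y in pairs if not s and not y)

    base_rate = (tp + fn) / len(pairs)                    # condition rate overall
    segment_rate = tp / (tp + fp) if tp + fp else 0.0     # condition rate inside segment
    return {
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,
        "specificity": tn / (tn + fp) if tn + fp else 0.0,
        "lift": segment_rate / base_rate if base_rate else 0.0,
    }


# Toy data: 10 people, 2 with the condition (made-up values).
labels = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
segment = [1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
metrics = segment_metrics(segment, labels)
```

      On this toy data a classifier that always predicts "no condition" is 80% accurate yet finds no one, which is why sensitivity, specificity, and lift are more informative than the misclassification rate under class imbalance.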
    • Objectives
      Other Requirements:
      Interpretable segmentation - AND, OR Rules in Natural language
      Extensive coverage using Simple rules.
      Note: These objectives differ from traditional machine learning objectives. They are based on observations of many failed medical decision support systems.
    • BRFSS Dataset
      Behavioral Risk Factor Surveillance System
      URL: http://www.cdc.gov/brfss/
      The largest telephone health survey, conducted since 1984.
      Tracks health conditions and risk behaviors in the United States.
      Contains information on a variety of diseases
      e.g. diabetes, hypertension, cancer, asthma, HIV, etc.
      More than 400,000 records per year.
      Many states use BRFSS data to support health-related legislative efforts.
    • α-Tree¹
      A Decision Tree Algorithm (e.g. CART, C4.5)
      Decision criterion: α-Divergence.
      Generalizes C4.5.
      Robust performance in class-imbalance settings.
      Growth stops when a low/high-risk group is obtained (modified α-Tree).
      Different ‘α’ values result in different decision rules.
      Decision trees provide greedy solutions (sub-optimal solutions).
      By disjunctively combining the solutions from different α-Trees, we can approach a better solution.
      Python Code available (http://www.ideal.ece.utexas.edu/~yubin/)
      1. Y. Park and J. Ghosh, “Compact Ensemble Trees for Imbalanced Data,” in 10th International Workshop on Multiple Classifier Systems, Italy, June 2011.
    • 3-Phase Diagram
      Example: the high-risk group is defined as any group with a diabetes rate above 24%.
      - Twice the rate of the normal population
      Rule 1: RFHYPE5 = 1 & AGE_G >= 5.0 & RFHLTH = 2 & BMI4CAT >= 2.0 (from α=0.1)
      OR Rule 2: RFHYPE5 ≠ 1 & RFHLTH = 1 & BMI4CAT >= 2.9 & PNEUVAC3 = 1 (from α=1.0)
      OR Rule 3: RFHYPE5 = 2 & RFHLTH ≠ 1 (from α=1.5)
      OR …
      Combined, these rules extract the high-risk diabetes segments (>24%).
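      The disjunctive (OR) combination of rules can be sketched as a list of predicates, where a record belongs to the high-risk segment if any rule matches. The three rules are copied from the slide; the helper names (`matches_any`, `coverage`) and the toy records are hypothetical and are not BRFSS data.

```python
# Each rule is a conjunction (AND) of conditions on one survey record.
RULES = [
    # Rule 1 (from α=0.1)
    lambda r: r["RFHYPE5"] == 1 and r["AGE_G"] >= 5.0
    and r["RFHLTH"] == 2 and r["BMI4CAT"] >= 2.0,
    # Rule 2 (from α=1.0)
    lambda r: r["RFHYPE5"] != 1 and r["RFHLTH"] == 1
    and r["BMI4CAT"] >= 2.9 and r["PNEUVAC3"] == 1,
    # Rule 3 (from α=1.5)
    lambda r: r["RFHYPE5"] == 2 and r["RFHLTH"] != 1,
]


def matches_any(record, rules=RULES):
    """Disjunctive combination: True if at least one rule fires."""
    return any(rule(record) for rule in rules)


def coverage(records, rules=RULES):
    """Fraction of records captured by the combined rules."""
    return sum(matches_any(r, rules) for r in records) / len(records)


# Made-up records for illustration only.
toy_records = [
    {"RFHYPE5": 2, "RFHLTH": 2, "AGE_G": 3, "BMI4CAT": 1, "PNEUVAC3": 2},  # Rule 3
    {"RFHYPE5": 1, "RFHLTH": 2, "AGE_G": 6, "BMI4CAT": 3, "PNEUVAC3": 2},  # Rule 1
    {"RFHYPE5": 1, "RFHLTH": 1, "AGE_G": 2, "BMI4CAT": 1, "PNEUVAC3": 2},  # no rule
]
flags = [matches_any(r) for r in toy_records]
```

      Because each rule comes from a different α-Tree, adding a rule can only widen coverage; whether the combined segment still exceeds the 24% threshold has to be checked on the data.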
    • Example Tree Structure
      When α=2.0, a total of five high-risk segmentation rules are extracted.
      Different α values result in different tree structures.
      [Figure: example tree structure; branches are labeled Yes / No]
    • Results for the Group with Twice the Normal Diabetes Rate (High-risk)
      Resultant rules from α-Trees:
      RFHYPE5 = 2 & RFHLTH ≠ 1
      RFHYPE5 ≠ 2 & RFHLTH = 2 & RFCHOL = 2

      English Translation
      Segment 1: They have high blood pressure and consider themselves unhealthy (including those who did not respond to this question).
      Segment 2: They have high cholesterol and consider themselves unhealthy, but they do not have high blood pressure.

    • Results for the Group with One-fourth the Normal Diabetes Rate (Low-risk)
      Resultant rules from α-Trees:
      RFHYPE5 ≠ 2 & RFHLTH ≠ 2 & PNEUVAC3 ≠ 1
      RFHYPE5 = 1 & RFHLTH ≠ 2 & AGE_G < 5.0

      English Translation
      Segment 1: They do not have high blood pressure and consider themselves healthy. They have had a pneumonia shot at least once in their lifetime.
      Segment 2: They have high blood pressure, but consider themselves healthy and are under 50 years of age.

    • Appendix A
      α-Divergence
      Special cases
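      The slide's formulas are not reproduced in this transcript. As a hedged illustration only, the sketch below assumes one common (Amari-style) form of the α-divergence for discrete distributions, D_α(p‖q) = (1 − Σ pᵢ^α qᵢ^(1−α)) / (α(1−α)); the presentation's exact definition and normalization may differ. The function name `alpha_divergence` is hypothetical.

```python
import math


def alpha_divergence(p, q, alpha):
    """Assumed form: (1 - sum(p_i^alpha * q_i^(1-alpha))) / (alpha * (1 - alpha)).

    p and q are discrete distributions that each sum to 1, with all entries
    strictly positive (an assumption that avoids division by zero).
    Special cases under this definition:
      alpha -> 1 : KL(p || q)
      alpha -> 0 : KL(q || p)
      alpha = 2  : Pearson chi-square divergence / 2
      alpha = 0.5: 4 * (1 - sum(sqrt(p_i * q_i))), i.e. 4x squared Hellinger
    """
    if alpha == 1.0:
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    if alpha == 0.0:
        return sum(qi * math.log(qi / pi) for pi, qi in zip(p, q))
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (1.0 - s) / (alpha * (1.0 - alpha))


# Toy distributions (made-up values).
p = [0.5, 0.5]
q = [0.9, 0.1]
chi_square = sum((pi - qi) ** 2 / qi for pi, qi in zip(p, q))
```

      Under this assumed form, the α → 1 limit is the KL divergence, which is the quantity behind the information-gain criterion used by C4.5.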
    • Appendix B
      Modified α-Tree Algorithm
      Input: BRFSS (input data), α (parameter)
      Output: Low-risk group extraction rules
      Select the best feature, i.e. the one that maximizes the α-divergence criterion.
      If (no such feature exists)
         or (number of data points < cut-off size)
         or (this group is a low/high-risk group)
      then stop growing this branch.
      Else
         Segment the input data based on the best feature.
         Recursively run Modified α-Tree Algorithm(segmented data, α) on each segment.
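      The recursion above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' released code: features are categorical, the label is 0/1, the split criterion is assumed to be the α-divergence (α > 0, Amari-style form) between the joint feature/label distribution and the product of its marginals, and risk groups are defined by hypothetical multipliers `high` and `low` of the overall rate. All function names and the toy rows are made up.

```python
import math
from collections import Counter


def alpha_divergence(p, q, alpha):
    """Assumed Amari-style form; requires alpha > 0 and all q_i > 0."""
    if alpha == 1.0:  # limiting case: KL(p || q)
        return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in zip(p, q))
    return (1.0 - s) / (alpha * (1.0 - alpha))


def split_score(rows, feature, alpha):
    """Divergence between the joint (feature, label) distribution and the
    product of its marginals (assumed form of the criterion)."""
    n = len(rows)
    joint = Counter((r[feature], r["label"]) for r in rows)
    px = Counter(r[feature] for r in rows)
    py = Counter(r["label"] for r in rows)
    keys = [(x, y) for x in px for y in py]
    p = [joint[k] / n for k in keys]
    q = [px[x] * py[y] / n ** 2 for x, y in keys]
    return alpha_divergence(p, q, alpha)


def grow(rows, features, alpha, base_rate, cutoff, high, low, path=()):
    rate = sum(r["label"] for r in rows) / len(rows)
    if rate >= high * base_rate:        # high-risk group: emit rule, stop
        return [(path, "high", rate)]
    if rate <= low * base_rate:         # low-risk group: emit rule, stop
        return [(path, "low", rate)]
    if len(rows) < cutoff:              # too few data points: stop
        return []
    score, best = max(((split_score(rows, f, alpha), f) for f in features),
                      default=(0.0, None))
    if best is None or score <= 1e-12:  # no useful feature: stop
        return []
    rules = []
    remaining = [f for f in features if f != best]
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        rules += grow(subset, remaining, alpha, base_rate, cutoff, high, low,
                      path + ((best, value),))
    return rules


def extract_rules(rows, features, alpha, cutoff=3, high=2.0, low=0.25):
    """Return (path, 'high'|'low', rate) tuples; path is a tuple of (feature, value)."""
    base_rate = sum(r["label"] for r in rows) / len(rows)
    if base_rate == 0:
        return []
    return grow(rows, features, alpha, base_rate, cutoff, high, low)


def make_rows(bp, health, n, positives):
    """Build n made-up records, the first `positives` of which have label 1."""
    return [{"bp": bp, "health": health, "label": int(i < positives)} for i in range(n)]


toy_rows = (make_rows(2, 2, 5, 4) + make_rows(2, 1, 7, 1)
            + make_rows(1, 2, 4, 0) + make_rows(1, 1, 4, 0))
rules = extract_rules(toy_rows, ["bp", "health"], alpha=2.0)
```

      Each returned path reads as an AND rule, for example `(("bp", 2), ("health", 2))` means bp = 2 & health = 2. Running the sketch with several α values and taking the union of the returned rules gives the disjunctive combination described earlier.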