Ama ieee-rpg

Ama ieee-rpg Presentation Transcript

    AMA-IEEE Medical Technology Conference 2011
    Anurekha Ramakrishnan¹, Yubin Park², Joydeep Ghosh²
    ¹Dept. of Statistics and Scientific Computation
    ²Dept. of Electrical and Computer Engineering
    The University of Texas at Austin
  • Barriers to Machine Learning Adoption in Healthcare
    Target ratios are often extremely skewed.
    Mismatch with performance metrics:
    Misclassification rates may not be relevant.
    Asymmetric costs are involved.
    ‘Sensitivity/Specificity’ or ‘Lift’ should be part of the learning goals.
    Interpretation of results:
    Simple AND/OR rules (in natural language) are desirable.
    We suggest a possible solution for these problems using:
    Modified α-Trees,
    Disjunctive Combination of Rules.
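
To make the metric mismatch on this slide concrete, here is a minimal sketch that compares misclassification rate with sensitivity, specificity, and lift on a skewed target. The data and the names `all_negative` and `screen` are hypothetical and chosen only for illustration; they are not taken from the presentation.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels where 1 marks the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def _ratio(num, den):
    """Safe division that returns 0.0 when the denominator is zero."""
    return num / den if den else 0.0


def misclassification_rate(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return _ratio(fp + fn, tp + fp + tn + fn)


def sensitivity(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return _ratio(tp, tp + fn)


def specificity(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    return _ratio(tn, tn + fp)


def lift(y_true, y_pred):
    """Positive rate inside the flagged group divided by the overall positive rate."""
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    flagged_rate = _ratio(tp, tp + fp)
    base_rate = _ratio(tp + fn, tp + fp + tn + fn)
    return _ratio(flagged_rate, base_rate)


# Hypothetical skewed population: 10 positives among 100 records.
y_true = [1] * 10 + [0] * 90
all_negative = [0] * 100                               # never flags anyone
screen = [1] * 8 + [0] * 2 + [1] * 12 + [0] * 78       # flags 20, catches 8

for name, pred in [("all_negative", all_negative), ("screen", screen)]:
    print(name,
          misclassification_rate(y_true, pred),
          sensitivity(y_true, pred),
          specificity(y_true, pred),
          lift(y_true, pred))
```

In this toy setting the classifier that flags nobody has the lower misclassification rate (10% versus 14%) but zero sensitivity, while the screening rule reaches 80% sensitivity and a lift of about 4.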
  • Objectives
    Other Requirements:
    Interpretable segmentation - AND, OR Rules in Natural language
    Extensive coverage using Simple rules.
    Note: These objectives differ from traditional machine learning objectives. They are based on observations of many failed medical decision support systems.
  • BRFSS Dataset
    Behavioral Risk Factor Surveillance System
    The largest ongoing telephone health survey, running since 1984.
    Tracks health conditions and risk behaviors in the United States.
    Contains information on a variety of diseases
    e.g. diabetes, hypertension, cancer, asthma, HIV, etc.
    More than 400,000 records per year.
    Many states use BRFSS data to support health-related legislative efforts.
  • α-Tree¹
    A Decision Tree Algorithm (e.g. CART, C4.5)
    Decision criterion: α-Divergence.
    Generalizes C4.5.
    Robust performance in class-imbalance settings.
    The modified α-Tree stops growing a branch once a low- or high-risk group is obtained.
    Different ‘α’ values result in different decision rules.
    Decision trees provide greedy solutions (sub-optimal solutions).
    By disjunctively combining the solutions from different α-Trees, we can approach a better solution.
    Python Code available (
    1. Y. Park and J. Ghosh, “Compact Ensemble Trees for Imbalanced Data,” in 10th International Workshop on Multiple Classifier Systems, Italy, June 2011.
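
The slide names α-divergence as the decision criterion without stating its form. The sketch below is one plausible reading, not the authors' code: it assumes Amari's α-divergence between the joint distribution of a feature and the target and the product of their marginals, which reduces to information gain as α approaches 1 (consistent with "Generalizes C4.5"). The function names are hypothetical, and the sketch only handles α > 0.

```python
import math
from collections import Counter


def alpha_divergence(p, q, alpha):
    """Assumed form: (1 - sum p^alpha * q^(1-alpha)) / (alpha * (1 - alpha)).

    p and q are aligned lists of probabilities. At alpha == 1 the KL limit
    sum p * log(p / q) is used instead.
    """
    if alpha <= 0:
        raise ValueError("this sketch only handles alpha > 0")
    pairs = [(pi, qi) for pi, qi in zip(p, q) if pi > 0 and qi > 0]
    if abs(alpha - 1.0) < 1e-9:
        return sum(pi * math.log(pi / qi) for pi, qi in pairs)
    s = sum(pi ** alpha * qi ** (1.0 - alpha) for pi, qi in pairs)
    return (1.0 - s) / (alpha * (1.0 - alpha))


def split_score(xs, ys, alpha):
    """Score a candidate feature: divergence between p(x, y) and p(x) * p(y)."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    p, q = [], []
    for x in px:
        for y in py:
            p.append(joint[(x, y)] / n)
            q.append(px[x] * py[y] / (n * n))
    return alpha_divergence(p, q, alpha)


# Hypothetical toy data: one feature that determines the label, one that is independent of it.
labels = [0, 0, 1, 1]
informative = [0, 0, 1, 1]
uninformative = [0, 1, 0, 1]

for a in (0.5, 1.0, 2.0):
    print(a, split_score(informative, labels, a), split_score(uninformative, labels, a))
```

Under this assumed form, a feature that is independent of the target scores zero for every α, while different α values weight the same informative split differently, which is what lets different α-Trees pick different rules.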
  • 3-Phase Diagram
    Example: the high-risk group is defined as a segment with a diabetes rate above 24%,
    i.e. twice the rate of the normal population.
    Rule 1: RFHYPE5 = 1 & AGE_G >= 5.0 & RFHLTH = 2 & BMI4CAT >= 2.0 (from α=0.1)
    OR Rule 2: RFHYPE5 ≠ 1 & RFHLTH = 1 & BMI4CAT >= 2.9 & PNEUVAC3 = 1 (from α=1.0)
    OR Rule 3: RFHYPE5 = 2 & RFHLTH ≠ 1 (from α=1.5)
    OR …
    These combined rules extract high-risk diabetes segments (>24%).
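
The disjunctive combination above can be sketched as a list of predicates joined with OR. The three rules are copied from the slide; the record layout (a dict keyed by BRFSS variable name) and the two sample records are assumptions made for illustration.

```python
# Each entry: (source tree, predicate over one survey record).
RULES = [
    ("alpha=0.1", lambda r: r["RFHYPE5"] == 1 and r["AGE_G"] >= 5.0
                            and r["RFHLTH"] == 2 and r["BMI4CAT"] >= 2.0),
    ("alpha=1.0", lambda r: r["RFHYPE5"] != 1 and r["RFHLTH"] == 1
                            and r["BMI4CAT"] >= 2.9 and r["PNEUVAC3"] == 1),
    ("alpha=1.5", lambda r: r["RFHYPE5"] == 2 and r["RFHLTH"] != 1),
]


def matching_rules(record):
    """Names of the rules that fire for this record."""
    return [name for name, rule in RULES if rule(record)]


def is_high_risk(record):
    """A record is in the high-risk segment if ANY rule fires (disjunction)."""
    return any(rule(record) for _, rule in RULES)


# Hypothetical records, not taken from BRFSS.
record_a = {"RFHYPE5": 2, "RFHLTH": 2, "AGE_G": 3, "BMI4CAT": 1, "PNEUVAC3": 2}
record_b = {"RFHYPE5": 1, "RFHLTH": 1, "AGE_G": 2, "BMI4CAT": 1, "PNEUVAC3": 2}

print(is_high_risk(record_a), matching_rules(record_a))
print(is_high_risk(record_b), matching_rules(record_b))
```

Keeping the source tree next to each predicate makes it easy to report which rule placed a record in the high-risk segment.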
  • Example Tree Structure
    When α = 2.0, a total of five high-risk segmentation rules are extracted.
    Different α values result in different tree structures.
  • Results for the Twice-higher Diabetes Rate Group (High-risk)
    Resultant Rules from α-Trees.
    RFHYPE5 = 2 & RFHLTH ≠ 1
    RFHYPE5 ≠ 2 & RFHLTH = 2 & RFCHOL = 2

    English Translation
    Segment 1: They have high blood pressure and consider themselves unhealthy (including those who did not answer this question).
    Segment 2: They have high cholesterol and consider themselves unhealthy, but do not have high blood pressure.

  • Results for the Four-times Lower Diabetes Rate Group (Low-risk)
    Resultant Rules from α-Trees.
    RFHYPE5 ≠ 2 & RFHLTH ≠ 2 & PNEUVAC3 ≠ 1
    RFHYPE5 = 1 & RFHLTH ≠ 2 & AGE_G < 5.0

    English Translation
    Segment 1: They do not have high blood pressure and consider themselves healthy. They have had a pneumonia shot at least once in their lifetime.
    Segment 2: They have high blood pressure, but consider themselves healthy and are under 50 years of age.

  • Appendix A
    Special cases
  • Appendix B
    Modified α-Tree Algorithm
    Input: BRFSS (input data), α (parameter)
    Output: Low/high-risk group extraction rules
    1. Select the best feature, i.e. the one that maximizes the α-divergence criterion.
    2. If (no such feature exists)
       or (number of data points < cut-off size)
       or (this group is a low/high-risk group)
       then stop growing this branch.
    3. Segment the input data based on the best feature.
    4. Recursively run Modified α-Tree Algorithm(segmented data, α) on each segment.
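
The steps above can be sketched in Python as follows. This is an illustrative reading of the pseudocode, not the authors' released code. Several things are assumed: Amari's form of the α-divergence, multiway equality splits on categorical features, and thresholds given as absolute rates (`high_rate`, `low_rate`, `min_size` are hypothetical parameter names with illustrative defaults). The toy data is synthetic.

```python
import math
from collections import Counter


def alpha_score(xs, ys, alpha):
    """Assumed criterion: alpha-divergence between p(x, y) and p(x) * p(y), alpha > 0."""
    n = len(xs)
    joint, px, py = Counter(zip(xs, ys)), Counter(xs), Counter(ys)
    is_kl = abs(alpha - 1.0) < 1e-9
    total = 0.0
    for (x, y), count in joint.items():
        p, q = count / n, px[x] * py[y] / (n * n)
        total += p * math.log(p / q) if is_kl else p ** alpha * q ** (1.0 - alpha)
    return total if is_kl else (1.0 - total) / (alpha * (1.0 - alpha))


def modified_alpha_tree(rows, labels, features, alpha,
                        high_rate=0.24, low_rate=0.03, min_size=5, path=()):
    """Return a list of (rule path, 'high' or 'low', positive rate) tuples."""
    n = len(rows)
    if n == 0:
        return []
    rate = sum(labels) / n
    # Stop: this group already qualifies as a high- or low-risk segment.
    if rate > high_rate:
        return [(path, "high", rate)]
    if rate < low_rate:
        return [(path, "low", rate)]
    # Stop: too few data points, or no features left to split on.
    if n < min_size or not features:
        return []
    scores = {f: alpha_score([r[f] for r in rows], labels, alpha) for f in features}
    best = max(features, key=scores.get)
    if scores[best] <= 1e-12:  # no feature separates the classes
        return []
    rest = [f for f in features if f != best]
    rules = []
    for value in sorted({r[best] for r in rows}):
        idx = [i for i, r in enumerate(rows) if r[best] == value]
        rules += modified_alpha_tree([rows[i] for i in idx], [labels[i] for i in idx],
                                     rest, alpha, high_rate, low_rate, min_size,
                                     path + ((best, value),))
    return rules


def block(bp, health, n, positives):
    """Synthetic helper: n identical records, the first `positives` of them diabetic."""
    return [{"bp": bp, "health": health}] * n, [1] * positives + [0] * (n - positives)


rows, labels = [], []
for args in [(2, 2, 10, 8), (2, 1, 10, 6), (1, 2, 10, 4), (1, 1, 90, 0)]:
    r, l = block(*args)
    rows += r
    labels += l

rules = modified_alpha_tree(rows, labels, ["bp", "health"], alpha=1.0)
for path, kind, rate in rules:
    print(" & ".join(f"{f} = {v}" for f, v in path), "->", kind, round(rate, 3))
```

Running the function with several α values and taking the union of the returned paths would give the disjunctive rule set described in the earlier slides.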