ID3 Algorithm


  1. ID3 Algorithm
     CS 157B: Spring 2010
     Meg Genoar
  2. Iterative Dichotomiser 3
     Ross Quinlan – 1986
     C4.5 Precursor
     Decision Tree Generation
  3. Ross Quinlan
     Computer Scientist – UW 1968
     Data Mining & Decision Theory
     AI: Data Mining
     ID3, C4.5, & C5.0
     RuleQuest Research
  4. ID3 & Entropy
     Max-Gain Split
     Most Useful Attribute
     Highest Information → Best Attribute
     Measure of Uncertainty
     Randomness
     Efficient Separation of Decision Tree Elements
  5. Entropy
     Entropy(S) = – P_positive Log2(P_positive) – P_negative Log2(P_negative)
     P_positive: proportion of positive data
     P_negative: proportion of negative data
  6. Example…
     A collection S consists of 20 data examples: 13 Yes, 7 No
     Entropy(S) = – (13/20) Log2(13/20) – (7/20) Log2(7/20)
     Entropy(S) = 0.934
  7. Entropy → Gain Value
     Gain: Place to Split the Tree
     High Gain > Low Gain
     High Gain: Top of the Tree
     Gain(A) = E(Current Set) – ∑ (|Child Set| / |Current Set|) × E(Child Set)
  8. Movie Example
  9. Entropy of Table
     Is the Film a Success?
     Entropy(5 Yes, 5 No) = – (5/10) Log2(5/10) – (5/10) Log2(5/10)
     Entropy(Success) = 1
  10. Split – Country of Origin
  11. Gain – Country of Origin
      Where is the film from?
      Entropy(USA) = – (3/4) Log2(3/4) – (1/4) Log2(1/4)
      Entropy(USA) = 0.811
      Entropy(Europe) = – (2/4) Log2(2/4) – (2/4) Log2(2/4)
      Entropy(Europe) = 1
      Entropy(Rest of World) = – (0/2) Log2(0/2) – (2/2) Log2(2/2)
      Entropy(Rest of World) = 0 (taking 0 × Log2(0) = 0 by convention)
      Gain(Origin) = 1 – (4/10 × 0.811 + 4/10 × 1 + 2/10 × 0) = 0.276
  12. Split – Big Star
  13. Gain – Big Star
      Is there a Big Star in the film?
      Entropy(Yes) = – (4/7) Log2(4/7) – (3/7) Log2(3/7)
      Entropy(Yes) = 0.985
      Entropy(No) = – (1/3) Log2(1/3) – (2/3) Log2(2/3)
      Entropy(No) = 0.918
      Gain(Star) = 1 – (7/10 × 0.985 + 3/10 × 0.918) = 0.0351
  14. Split – Genre
  15. Gain – Genre
      What genre is the film?
      Entropy(SciFi) = – (1/3) Log2(1/3) – (2/3) Log2(2/3)
      Entropy(SciFi) = 0.918
      Entropy(Com) = – (4/6) Log2(4/6) – (2/6) Log2(2/6)
      Entropy(Com) = 0.918
      Entropy(Rom) = – (0/1) Log2(0/1) – (1/1) Log2(1/1)
      Entropy(Rom) = 0
      Gain(Genre) = 1 – (3/10 × 0.918 + 6/10 × 0.918 + 1/10 × 0) = 0.1738
  17. Compare Gains…
      Gain(Origin) = 0.276
      Gain(Star) = 0.0351
      Gain(Genre) = 0.1738
      First Split: Origin
  18. [Tree diagram] All Movies splits into United States, Europe, and Rest of World; each branch gets a New Table
  20. New Table – United States
      Entropy(3 Yes, 1 No) = – (3/4) Log2(3/4) – (1/4) Log2(1/4)
      Entropy(Success) = 0.811
  21. Split – Big Star
  22. Gain – Big Star
      Is there a Big Star in the film?
      Entropy(Yes) = – (3/3) Log2(3/3) – (0/3) Log2(0/3)
      Entropy(Yes) = 0
      Entropy(No) = – (0/1) Log2(0/1) – (1/1) Log2(1/1)
      Entropy(No) = 0
      Gain(Star) = 0.811 – (3/4 × 0 + 1/4 × 0) = 0.811
  23. Split – Genre
  24. Gain – Genre
      What genre is the film?
      Entropy(SciFi) = – (1/1) Log2(1/1) – (0/1) Log2(0/1)
      Entropy(SciFi) = 0
      Entropy(Com) = – (2/3) Log2(2/3) – (1/3) Log2(1/3)
      Entropy(Com) = 0.918
      Gain(Genre) = 0.811 – (1/4 × 0 + 3/4 × 0.918) = 0.1225
  26. Compare Gains…
      Gain(Star) = 0.811
      Gain(Genre) = 0.1225
      Split: Star
  27. [Tree diagram] All Movies splits into United States, Europe, and Rest of World (each with a New Table); the United States branch splits into Star and No Star (each with a New Table)
  28. [Tree diagram] Same tree, with the United States branch expanded further. Labels shown under Star / No Star: New Table, Failure, Sci-Fi, Comedy, Success, Success
  29. [Tree diagram] Complete tree rooted at All Movies with branches Rest of World, United States, and Europe. Labels shown: Star / No Star (twice), Sci-Fi / Comedy (twice), and leaves Success (four) and Failure (two)
  30. [Tree diagram] Same tree as slide 29, tracing the example: Comedy from the US, with a big star…
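
The entropy and gain arithmetic on the slides can be checked with a short Python sketch. This is only an illustration, not the presenter's code: the function names `entropy` and `gain` are made up here, and the [Yes, No] counts are read off the entropy lines of the slides because the movie table itself is not part of this transcript. The gain is computed with each child set weighted by its share of the parent set, as in the worked examples. Results are unrounded, so they can differ from the slide figures in the last digit (the slides round each entropy to three places before combining).

```python
from math import log2


def entropy(counts):
    """Entropy in bits of a list of class counts, e.g. [yes, no]."""
    total = sum(counts)
    # Empty classes are skipped: 0 * log2(0) is treated as 0 by convention.
    return -sum((c / total) * log2(c / total) for c in counts if c > 0)


def gain(parent, children):
    """Parent entropy minus the size-weighted entropies of the child sets."""
    total = sum(parent)
    remainder = sum(sum(child) / total * entropy(child) for child in children)
    return entropy(parent) - remainder


# [yes, no] counts taken from the slides (assumed order: success, failure).
all_movies = [5, 5]
splits = {
    "Origin": [[3, 1], [2, 2], [0, 2]],   # USA, Europe, Rest of World
    "Star": [[4, 3], [1, 2]],             # big star, no big star
    "Genre": [[1, 2], [4, 2], [0, 1]],    # SciFi, Comedy, Romance
}
gains = {name: gain(all_movies, kids) for name, kids in splits.items()}
first_split = max(gains, key=gains.get)

# Second level: the United States table (3 Yes, 1 No).
usa = [3, 1]
usa_splits = {
    "Star": [[3, 0], [0, 1]],
    "Genre": [[1, 0], [2, 1]],            # SciFi, Comedy
}
usa_gains = {name: gain(usa, kids) for name, kids in usa_splits.items()}
usa_split = max(usa_gains, key=usa_gains.get)

for name, value in gains.items():
    print(f"Gain({name}) = {value:.4f}")
print("First split:", first_split)
for name, value in usa_gains.items():
    print(f"USA Gain({name}) = {value:.4f}")
print("USA split:", usa_split)
```

Picking the attribute with the largest gain at each level and repeating on every child table is the whole selection step of ID3; the sketch stops after two levels because only those counts appear on the slides.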
