Causal Data Mining (1 of 2)


  1. Causal Data Mining
     Richard Scheines
     Dept. of Philosophy, Machine Learning, & Human-Computer Interaction
     Carnegie Mellon
  2. 1. Predictive Data Mining
     • Finding predictive relationships in data
         • What feature of student behavior predicts learning
         • Who will default on credit cards
         • Who will get an “A” in your course
         • Which HS students will do well at CMU
         • Do students cluster by “learning style”
  3. Causal Data Mining
     • Finding causal relationships in data
         • What feature of student behavior causes learning
         • What will happen when we make everyone take a reading quiz before each class
         • What will happen when we program our tutor to intervene to give hints after an error
  4. Predictive Data Mining
     Data Mining Search
     Predictive Model: Y = f(X1, X2, …, Xk)

          X1    X2   X3   …   Xk    Y
     1    1.7   28   M    …   2.4   1
     2    1.9   17   F    …   1.1   1
     3    2.8   12   M    …   1.8   0
     …    …     …    …    …   …     …
     N    2.0   11   F    …   1.1   0
  5. Predictive Data Mining
     Data Mining Search
     Predictive Model: Y = f(X1, X2, …, Xk)
     • Model Classes
         • Simple Regression
         • Locally Weighted Regression
         • Logistic Regression
         • Neural Nets
         • Support Vector Machines
         • Decision Trees
         • Bayes Nets
         • Naïve Bayes Classifier
         • Independent Components
         • Clustering
         • Etc.
  6. Predictive Data Mining
     Data Mining Search
     Predictive Model under Constraints: Y = f(X1, X2, …, Xk), e.g., f ∈ Additive functions
  7. Predictive Data Mining
     Data Mining Search
     Predictive Model under Constraints: Y = f(X1, X2, …, Xk),
     Or Probability Model under Constraints: P(Y | X1, X2, …, Xk), where P ∈ Gaussian, with mean 0
  8. Predictive Data Mining
     Decision Tree Search
  9. Predictive Data Mining ≠ Causal Data Mining
     P(Y | X1, X2, …, Xk) ≠ P(Y | X1 set, X2, …, Xk)
     Conditioning is not the same as intervening
     (Teeth Slides)
  10. Causal Discovery
      Statistical Data → Causal Structure
      Background Knowledge
        - X2 before X3
        - no unmeasured common causes
      Statistical Inference
  11. Causal Discovery Software
      TETRAD IV
      www.phil.cmu.edu/projects/tetrad
  12. Full Semester Online Course in Causal & Statistical Reasoning
  13. Full Semester Online Course in Causal & Statistical Reasoning
      • Course is tooled to record certain events:
          • Logins, page requests, print requests, quiz attempts, quiz scores, voluntary exercises attempted, etc.
      • Each event was associated with attributes:
          • Time
          • Student-id
          • Session-id
  14. Printing and Voluntary Comprehension Checks: 2002 --> 2003
      [Figure panels: 2002, 2003]
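
The predictive setup on the early slides, a search that returns a model Y = f(X1, X2, …, Xk), can be illustrated with the simplest model class listed, simple regression. The following is a minimal sketch, not the method used in the talk: the data are synthetic, and the function name `fit_simple_regression` and all numeric values are assumptions made for illustration.

```python
import random


def fit_simple_regression(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


# Hypothetical data: Y depends linearly on X1, plus Gaussian noise with mean 0.
rng = random.Random(0)
x1 = [rng.uniform(0, 10) for _ in range(1000)]
y = [2.0 + 0.5 * x + rng.gauss(0, 1) for x in x1]

intercept, slope = fit_simple_regression(x1, y)


def predict(x):
    """Predicted Y for a new value of X1 under the fitted model."""
    return intercept + slope * x
```

A model fitted this way answers a predictive question (what Y to expect when X1 is observed at some value). By itself it says nothing about what happens to Y when X1 is set by intervention.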
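
The claim that conditioning is not the same as intervening can be checked in a small simulation. The sketch below assumes a toy model that is not from the slides: a hidden common cause Z influences both X and Y, and X has no effect on Y at all. All variable names and probabilities are hypothetical.

```python
import random


def simulate(n, rng, set_x=None):
    """Draw n (x, y) samples from a toy model Z -> X, Z -> Y.

    X has no causal effect on Y. If set_x is given, X is forced to that
    value regardless of Z, which models an intervention on X.
    """
    samples = []
    for _ in range(n):
        z = rng.random() < 0.5                       # hidden common cause
        if set_x is None:
            x = rng.random() < (0.8 if z else 0.2)   # X depends on Z
        else:
            x = set_x                                # X set from outside
        y = rng.random() < (0.9 if z else 0.1)       # Y depends only on Z
        samples.append((x, y))
    return samples


def prob_y_given_x(samples, x_value):
    """Relative frequency of Y = True among samples with X == x_value."""
    ys = [y for x, y in samples if x == x_value]
    return sum(ys) / len(ys)


rng = random.Random(1)

# Conditioning: observe X as it naturally occurs.
observed = simulate(20000, rng)
p_cond_1 = prob_y_given_x(observed, True)
p_cond_0 = prob_y_given_x(observed, False)

# Intervening: set X for every unit.
p_set_1 = prob_y_given_x(simulate(20000, rng, set_x=True), True)
p_set_0 = prob_y_given_x(simulate(20000, rng, set_x=False), False)
```

Under these assumed probabilities, P(Y | X = 1) is about 0.74 and P(Y | X = 0) is about 0.26, so X predicts Y well. P(Y | X set = 1) and P(Y | X set = 0) are both about 0.5, so setting X changes nothing.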
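
The event log described for the online course (an event type plus time, student-id, and session-id) could be represented roughly as follows. This is a sketch under assumptions: the class name, field names, event-type strings, and sample records are all hypothetical and do not reflect the course's actual logging schema.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class CourseEvent:
    """One logged event; field names are illustrative, not the real schema."""
    event_type: str      # e.g. "login", "page_request", "print_request"
    time: datetime
    student_id: str
    session_id: str


def count_events_per_student(events, event_type):
    """Count how many events of the given type each student generated."""
    return Counter(e.student_id for e in events if e.event_type == event_type)


# Made-up sample records for illustration only.
events = [
    CourseEvent("login", datetime(2003, 1, 15, 9, 0), "s01", "a1"),
    CourseEvent("print_request", datetime(2003, 1, 15, 9, 5), "s01", "a1"),
    CourseEvent("print_request", datetime(2003, 1, 15, 9, 20), "s01", "a1"),
    CourseEvent("login", datetime(2003, 1, 15, 10, 0), "s02", "b7"),
    CourseEvent("quiz_attempt", datetime(2003, 1, 15, 10, 30), "s02", "b7"),
]

prints_per_student = count_events_per_student(events, "print_request")
```

Per-student counts like these are the kind of behavioral feature that could then be fed into either a predictive or a causal search.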