# Forest Learning from Data


1. Forest Learning from Data. Joe Suzuki, July 17, 2017
2. Road Map
   - PART-II: July 24, 2017 (based on PART-I)
     1. Estimating Mutual Information (15 mins)
     2. Learning Forests from Data (25 mins)
     3. Learning Bayesian Networks from Data (5 mins)
     4. Exercise (45 mins)
   - PART-I: July 17, 2017. A Bayesian Approach to Data Compression
3. Entropy
4. Mutual Information (MI)
5. Correlation may not detect independence!
6. ML Estimator of MI
7. Bayesian Testing of Independence
8. Bayesian Estimation of MI: from Stirling's formula, for large n
9. Experiments: 500 trials for binary sequences of length n=200
10. BNSL: a CRAN package (J. Suzuki and J. Kawahara, 2017). Bayesian Network Structure Learning, https://cran.r-project.org/web/packages/BNSL/index.html, collects research results by Joe Suzuki.
    - `install.packages("BNSL")`
    - `library(BNSL)`
    - `n=200; p=0.5; x=rbinom(n,1,p); y=rbinom(n,1,p) # sequences are generated`
    - `mi(x,y, proc=9) # I_n`
    - `mi(x,y) # J_n`
11. Tree Approximation
12. Factorization w.r.t. a Tree
13. Find E s.t. D(P||P') is minimized
14. Kruskal's Algorithm
15. Chow-Liu Algorithm
16. Experiments using the Asia data set
    - `library(BNSL)`
    - `library(igraph) # provides graph_from_edgelist`
    - `mm=mi_matrix(asia, proc=9) # I_n is used`
    - `edge.list=kruskal(mm)`
    - `g=graph_from_edgelist(edge.list, directed=FALSE)`
    - `plot(g)`
    - `mm=mi_matrix(asia) # J_n is used`
    - `edge.list=kruskal(mm)`
    - `g=graph_from_edgelist(edge.list, directed=FALSE)`
    - `plot(g)`
17. Asia (8 variables). S. Lauritzen and D. Spiegelhalter. Local Computation with Probabilities on Graphical Structures and their Application to Expert Systems (with discussion). Journal of the Royal Statistical Society: Series B (Statistical Methodology), 50(2):157-224, 1988.
18. Asia Data Set
19. Alarm (37 variables). I. A. Beinlich, H. J. Suermondt, R. M. Chavez, and G. F. Cooper. The ALARM Monitoring System: A Case Study with Two Probabilistic Inference Techniques for Belief Networks. In Proceedings of the 2nd European Conference on Artificial Intelligence in Medicine, pages 247-256. Springer-Verlag, 1989.
20. Alarm Data Set
21. Learning Bayesian Networks from Data. The number of candidate structures with p nodes grows faster than exponentially in p.
22. 25 DAGs exist for p=3, but only 11 BNs are considered
23. 7 local scores and 11 global scores
24. Summary
    - Estimating Mutual Information
    - Learning Forests from Data
    - Learning Bayesian Networks from Data
25. Problem Set #2
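
The ML estimator of MI (slide 6) and the Kruskal / Chow-Liu steps (slides 14 and 15) can be sketched in base R without BNSL. This is a minimal illustration, not the package's implementation: the names `empirical_mi` and `chow_liu_edges` and the toy data are hypothetical, MI is computed from relative frequencies (the plug-in estimator, which appears to correspond to I_n), and the Bayesian estimator J_n is not reproduced.

```r
# Plug-in (maximum-likelihood) mutual information of two discrete vectors, in nats.
empirical_mi <- function(x, y) {
  joint <- unclass(table(x, y)) / length(x)        # joint relative frequencies
  indep <- outer(rowSums(joint), colSums(joint))   # product of the marginals
  nz <- joint > 0                                  # treat 0 * log(0) as 0
  sum(joint[nz] * log(joint[nz] / indep[nz]))
}

# Maximum-weight spanning tree over pairwise MI (Kruskal's algorithm).
# `data` is a data frame of discrete columns; returns a two-column matrix
# of column indices, one row per selected edge.
chow_liu_edges <- function(data) {
  p <- ncol(data)
  pairs <- t(combn(p, 2))                          # every unordered pair, one per row
  weight <- apply(pairs, 1, function(e) empirical_mi(data[[e[1]]], data[[e[2]]]))
  pairs <- pairs[order(weight, decreasing = TRUE), , drop = FALSE]
  comp <- seq_len(p)                               # component label of each vertex
  edges <- matrix(integer(0), ncol = 2)
  for (i in seq_len(nrow(pairs))) {
    u <- pairs[i, 1]
    v <- pairs[i, 2]
    if (comp[u] != comp[v]) {                      # the edge joins two components: no cycle
      edges <- rbind(edges, c(u, v))
      comp[comp == comp[v]] <- comp[u]             # merge the two components
    }
  }
  edges
}

# Toy data (hypothetical): v2 is a noisy copy of v1, v3 a noisy copy of v2.
toy <- data.frame(
  v1 = c(0, 0, 0, 0, 1, 1, 1, 1),
  v2 = c(0, 0, 0, 1, 1, 1, 1, 1),
  v3 = c(0, 0, 1, 1, 1, 1, 1, 1)
)
tree <- chow_liu_edges(toy)
print(tree)
```

The returned matrix has the same shape as an edge list, so it could be passed to igraph's `graph_from_edgelist(tree, directed=FALSE)` for plotting, as on slide 16. Because the plug-in MI is never negative, this sketch always returns a spanning tree; getting a forest instead would need a rule for dropping weak edges, which is not shown here.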
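
The count of 25 DAGs for p=3 on slide 22, and the faster-than-exponential growth mentioned on slide 21, can be checked with Robinson's recurrence for the number of labeled DAGs. The slides do not name this recurrence; it is brought in here as a standard result, and the function name `count_dags` is hypothetical.

```r
# Number of labeled DAGs on p nodes via Robinson's recurrence:
#   a(0) = 1
#   a(n) = sum_{k=1}^{n} (-1)^(k+1) * choose(n, k) * 2^(k * (n - k)) * a(n - k)
count_dags <- function(p) {
  a <- numeric(p + 1)          # a[n + 1] stores a(n), since R vectors are 1-indexed
  a[1] <- 1
  for (n in seq_len(p)) {
    k <- seq_len(n)
    a[n + 1] <- sum((-1)^(k + 1) * choose(n, k) * 2^(k * (n - k)) * a[n - k + 1])
  }
  a[p + 1]
}

counts <- sapply(1:5, count_dags)   # DAG counts for p = 1, ..., 5
print(counts)
```

The values are held as doubles, so the counts are exact only while they stay below 2^53; for larger p an arbitrary-precision type would be needed.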