# Ben Gal


1. “To Explain or To Predict”, “To Know or To Act” (Pure Science vs. Engineering, 2004). Using Target-Based Bayesian Nets for Suspects Monitoring (joint work with A. Gruber and S. Yanovski). Irad Ben-Gal, Tel Aviv University
2. DOE: Vs-optimal designs, Ginsburg & Ben-Gal (2004). Diagram: x (control) → f(x) → Y (output).
    - f(x) known: $\partial f(x)/\partial x = 0 \Rightarrow x^*$
    - f(x) unknown: estimate g(x) (meta-model: DOE, RSM, …), then $\partial g(x)/\partial x = 0 \Rightarrow x^*$ (a random variable)
    - ‘Scientists’ (to know): best estimation of f(x) $\Rightarrow \min V(\beta)$ (e.g., D-optimal experiments)
    - ‘Practitioner’ (to act): best estimation of x* $\Rightarrow \min V(x^*)$ (new DOE optimality criterion)

    Tel Aviv University, Department of Industrial Engineering
3. The Bias-Variance Tradeoff
4. Presentation Layout
    - Bayesian networks and classifiers
    - Targeted Bayesian Network Learning (TBNL) (with Gruber)
    - TBNL application on suspects monitoring
    - Summary
5. Bayesian Networks (Pearl, 1985)
6. What is a Bayesian Network? B(G, Θ) encodes the domain’s joint probability distribution (JPD), where G = ⟨V, E⟩ is a directed acyclic graph.

    Joint probability distribution:

    | X1 | X2 | X3 | X4 | Prob. |
    |----|----|----|----|-------|
    | 1 | 1 | 1 | 2 | 0.083 |
    | 1 | 1 | 2 | 2 | 0.167 |
    | 1 | 2 | 2 | 3 | 0.25 |
    | 2 | 2 | 1 | 1 | 0.25 |
    | 2 | 2 | 2 | 1 | 0.25 |

    Θ(X3), conditioned on X2:

    | X3 | X2 = 1 | X2 = 2 |
    |----|--------|--------|
    | 1 | 0.33 | 0.33 |
    | 2 | 0.67 | 0.67 |

    A complete-factorization Bayesian network: $P(X) = P(X_2)\,P(X_3 \mid X_2)\,P(X_4 \mid X_3, X_2)\,P(X_1 \mid X_4, X_3, X_2)$
7. Explain or Predict (classify)

    | | Tree / GBN: Chow & Liu (1968), Williamson (2000) | TBNL: Gruber & Ben-Gal (2010) |
    |---|---|---|
    | True distribution | $p(X)$ | $p(X)$ |
    | Modeled distribution | $q(X)$ | $q(X)$ |
    | Objective | $p(X)$ | $p(X_i) = \sum_{x' \in X \setminus X_i} p(X_i \mid x')\,p(x')$ |
    | Principle | Minimize $D_{KL}(p(X) \parallel q(X))$ | Minimize $D_{KL}(p(X_i) \parallel q(X_i))$ |
    | Consequence | Maximize $\sum_i I(X_i; Z_i)$ | Maximize $I(X_i; Z_i)$ and maximize $\sum_{X_j \in Z_i} I(X_j; Z_j)$ |
8. Unconstrained Learning. Assume $X_3$ is the target variable. [Figure: GBN (adding-arrows) vs. Target-Oriented (TBNL) structures; step labels i=1, i=4, i=3, i=4, i=1.] Equivalent Encoding!!!
9. Constrained Learning. Assume $X_3$ is the target variable. [Figure: GBN (adding-arrows) vs. Target-Oriented (TBNL) structures; step labels i=1, i=4, i=3, i=4, i=1.]
10. Differential Complexity: Explain ($\eta_r$) vs. Predict/Classify ($\eta_t$)
    - $\eta_t$ = maximum percentage of relative information exploitation about the target
    - $\eta_r$ = maximum percentage of relative information exploitation about the rest of the attributes
11. Results (1/2): Data Sets Properties and Testing Methods

    | Dataset | # Attributes | # Classes | # Instances | Test | Instances/Attributes Ratio |
    |---|---|---|---|---|---|
    | australian | 14 | 2 | 690 | CV5 | ~49 |
    | breast | 9 | 2 | 683 | CV5 | ~76 |
    | chess | 36 | 2 | 3196 | holdout | ~89 |
    | cleve | 11 | 2 | 196 | CV5 | ~18 |
    | corral | 6 | 2 | 128 | CV5 | ~21 |
    | crx | 15 | 2 | 653 | CV5 | ~44 |
    | german | 20 | 2 | 1000 | CV5 | ~50 |
    | glass | 9 | 7 | 214 | CV5 | ~24 |
    | Iris | 5 | 3 | 150 | CV5 | ~30 |
    | lymphography | 18 | 4 | 148 | CV5 | ~8 |
    | mofn-3-7-10 | 10 | 2 | 1324 | holdout | ~132 |
    | vote | 16 | 3 | 435 | CV5 | ~27 |
12. Naïve Bayes: Predict (Corral dataset). [Figure: network over the nodes Class, A0, A1, B0, B1, Correlated, Irrelevant.]
13. Tree Augmented Network (TAN). [Figure: six TAN structures over the nodes Class, A0, A1, B0, B1, Correlated, Irrelevant.]
14. Managing the Trade-off. [Figure: panels labeled CV5, CV5, and Holdout 2/3:1/3.]
15. Results (2/2): Accuracy

    | Dataset | TBNL | BNC-2P | NB | TAN | C4.5 | HGC |
    |---|---|---|---|---|---|---|
    | australian | 83.3 | 87.0 | 85.1 | 82.5 | 84.9 | 85.6 |
    | breast | 95.9 | 95.8 | 97.6 | 96.5 | 93.9 | 97.6 |
    | chess | 96.9 | 95.8 | 87.3 | 92.4 | 99.5 | 95.3 |
    | cleve | 81.4 | 80.0 | 82.1 | 78.4 | 79.4 | 78.7 |
    | corral | 100.0 | 98.8 | 87.2 | 98.6 | 98.5 | 100.0 |
    | crx | 86.4 | 84.2 | 85.0 | 83.7 | 86.1 | 86.9 |
    | german | 69.7 | 73.6 | 75.4 | 73.9 | 72.9 | 72.5 |
    | glass | 60.0 | 58.3 | 55.9 | 54.2 | 59.3 | 31.2 |
    | Iris | 97.0 | 95.8 | 93.0 | 92.4 | 96.0 | 95.7 |
    | lymphography | 81.8 | 83.7 | 83.4 | 82.2 | 78.4 | 63.8 |
    | mofn-3-7-10 | 100.0 | 91.4 | 86.7 | 91.5 | 84.0 | 86.7 |
    | vote | 96.0 | 95.8 | 90.1 | 94.9 | 94.7 | 95.4 |
    | Average | 87.4 | 86.7 | 84.1 | 85.1 | 85.6 | 82.4 |
    | StdE | 4% | 3% | 3% | 4% | 3% | 6% |

    Best and worst methods (incl. 5% runner-up) are in bold and italic, respectively, on the original slide. Paired t-tests show significance.
16. Presentation Layout
    - Bayesian networks and classifiers
    - Targeted Bayesian Network Learning (TBNL)
    - TBNL application on suspects monitoring (with Gruber & Yanovski)
    - Summary
17. Domain Description
    - Motivation
        - Simplicity: complexity-error tradeoff
        - Information extraction: utilization of meta-data
        - Support: help the expert understand
    - Available Data
        - CDR
        - Privatized
        - Laundered
    - Requirements
        - 50% recall with at most 1% false alarms
18. Data Description of the Domain: Call Detail Record (CDR)

    | Field | Description |
    |---|---|
    | Main party | Unique identifier of the monitored object |
    | Other party | Unique identifier of the other party |
    | year | Year of call start |
    | month | Month of call start |
    | day | Day of call start |
    | hour | Hour of call start |
    | minute | Minute of call start |
    | second | Second of call start |
    | duration | Call duration in seconds |
    | caller | Call initiator {1/0}: 1 = main party initiated the call; 0 = other party initiated the call |
    | type_id | Type of interaction {1/0}: 1 = phone call; 0 = SMS (text message) |
    | tag | Type (group) of monitored object {1/0}: 0 = main party is a non-target; 1 = main party is a target |
19. ROC curve. [Figure annotations: “40 suspects to no avail”; “1900 missed targets”.]
20. Feature Extraction. [Figure: call activity during the day for two distinct groups.] Inter_prc_q1, Inter_prc_q2, Inter_prc_q3, Inter_prc_q4: percentage of activity in the 1st, 2nd, 3rd, and 4th quarter of the day.
21. Learning & Mining Mobility Patterns (PIs: Ben-Gal, Toch and Lerner, 2012)
22. Conclusions
    - “To Explain or to Predict”, “To Know or to Act” (constraint modeling)
    - Managing the error-complexity tradeoff!
    - An “engineering approach” to modeling
    - Target-based BN Learning (2006), Gruber and Ben-Gal (2010)…
    - Vs-optimality criterion $\Rightarrow \min V(x^*)$, Ginsburg and Ben-Gal (2006)
    - VOBN, Ben-Gal et al. (2005): scenario dependent
    - More…
23. Prediction can help…
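
Slide 2 notes that when f(x) is unknown, the optimum x* obtained from a fitted meta-model g(x) is itself a random variable, and that a practitioner may prefer a design that minimizes V(x*). The following is a minimal sketch of that idea, not the Vs-optimal procedure of Ginsburg & Ben-Gal. The quadratic response, the noise level, and the two three-level designs are all hypothetical choices made for illustration. With three distinct levels, the least-squares quadratic passes through the level means, so the sketch fits the parabola through those means.

```python
import random
import statistics


def true_response(x):
    """Hypothetical response (an assumption for this sketch); true optimum at x* = 2."""
    return 3.0 + 4.0 * x - x * x


def vertex_of_parabola(points):
    """Return x* where dg/dx = 0 for the parabola g through three (x, y) points."""
    (x1, y1), (x2, y2), (x3, y3) = points
    # Newton form: g(x) = y1 + d1*(x - x1) + d2*(x - x1)*(x - x2)
    d1 = (y2 - y1) / (x2 - x1)
    d2 = ((y3 - y2) / (x3 - x2) - d1) / (x3 - x1)
    # dg/dx = d1 + d2*(2x - x1 - x2) = 0
    return (x1 + x2) / 2.0 - d1 / (2.0 * d2)


def simulate_x_star(levels, reps_per_level=2, n_rep=2000, noise_sd=0.05, seed=0):
    """Repeat a noisy three-level experiment and collect the estimates of x*."""
    rng = random.Random(seed)
    estimates = []
    for _ in range(n_rep):
        level_means = []
        for x in levels:
            ys = [true_response(x) + rng.gauss(0.0, noise_sd)
                  for _ in range(reps_per_level)]
            level_means.append((x, statistics.fmean(ys)))
        estimates.append(vertex_of_parabola(level_means))
    return estimates


# Compare V(x*) for two hypothetical designs with the same number of runs.
for name, levels in [("spread", (0.0, 2.0, 4.0)), ("narrow", (1.5, 2.0, 2.5))]:
    xs = simulate_x_star(levels)
    print(f"{name}: mean x* = {statistics.fmean(xs):.3f}, "
          f"V(x*) = {statistics.variance(xs):.6f}")
```

Comparing the two printed variances shows how the choice of design points changes the spread of the estimated optimum, which is the quantity the ‘practitioner’ criterion on the slide targets.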
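
Slide 6 gives a joint probability table, a conditional table Θ(X3), and a complete chain-rule factorization. As a rough check of how those pieces relate, the sketch below derives P(X3 | X2) from the slide's joint table and evaluates the factorization P(X2) P(X3 | X2) P(X4 | X3, X2) P(X1 | X4, X3, X2) for each listed assignment. The helper names (`marginal`, `conditional`, `chain_rule`) are illustrative and do not come from the slides.

```python
from collections import defaultdict

# Joint distribution from slide 6: (X1, X2, X3, X4) -> probability.
JPD = {
    (1, 1, 1, 2): 0.083,
    (1, 1, 2, 2): 0.167,
    (1, 2, 2, 3): 0.25,
    (2, 2, 1, 1): 0.25,
    (2, 2, 2, 1): 0.25,
}


def marginal(jpd, idx):
    """Sum the joint over every variable except those at positions idx."""
    out = defaultdict(float)
    for assignment, p in jpd.items():
        out[tuple(assignment[i] for i in idx)] += p
    return dict(out)


def conditional(jpd, child, parents):
    """Return P(child | parents) as {(child_value, parent_values): prob}."""
    joint = marginal(jpd, [child] + list(parents))
    parent_marg = marginal(jpd, list(parents))
    return {(k[0], k[1:]): p / parent_marg[k[1:]] for k, p in joint.items()}


def chain_rule(jpd, assignment, order=(1, 2, 3, 0)):
    """Evaluate P(X2) P(X3|X2) P(X4|X3,X2) P(X1|X4,X3,X2) for one assignment.

    Positions are 0-based, so the default order is X2, X3, X4, X1.
    """
    prob = 1.0
    for pos, child in enumerate(order):
        parents = list(order[:pos])
        cpt = conditional(jpd, child, parents)
        prob *= cpt[(assignment[child], tuple(assignment[i] for i in parents))]
    return prob


# Theta(X3): P(X3 | X2), with X2 at position 1 and X3 at position 2.
theta_x3 = conditional(JPD, child=2, parents=[1])
for (x3, (x2,)), p in sorted(theta_x3.items()):
    print(f"P(X3={x3} | X2={x2}) = {p:.2f}")

# The complete factorization should reproduce every joint entry.
for assignment, p in JPD.items():
    print(assignment, round(chain_rule(JPD, assignment), 3), p)
```

Because the factorization conditions each variable on all of its predecessors, it involves no independence assumption and recovers the joint table exactly, up to floating-point error.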
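
Slide 7 states that minimizing the KL divergence leads to maximizing mutual-information terms of the form I(X_i; Z_i). The sketch below reads Z_i as the parent set of X_i, which is an assumption because the slide does not define it. It computes mutual information from a joint table and ranks single attributes by how informative they are about a target. It reuses the slide 6 joint table as toy data and takes X3 as the target, as slides 8 and 9 do. This is only the scoring ingredient, not the TBNL learning algorithm.

```python
import math
from collections import defaultdict

# Toy joint distribution (slide 6): (X1, X2, X3, X4) -> probability.
JPD = {
    (1, 1, 1, 2): 0.083,
    (1, 1, 2, 2): 0.167,
    (1, 2, 2, 3): 0.25,
    (2, 2, 1, 1): 0.25,
    (2, 2, 2, 1): 0.25,
}


def marginal(jpd, idx):
    """Sum the joint over every variable except those at positions idx."""
    out = defaultdict(float)
    for assignment, p in jpd.items():
        out[tuple(assignment[i] for i in idx)] += p
    return dict(out)


def mutual_information(jpd, a_idx, b_idx):
    """I(A; B) in bits for the variable groups at positions a_idx and b_idx."""
    a_idx, b_idx = list(a_idx), list(b_idx)
    p_ab = marginal(jpd, a_idx + b_idx)
    p_a = marginal(jpd, a_idx)
    p_b = marginal(jpd, b_idx)
    n_a = len(a_idx)
    total = 0.0
    for key, p in p_ab.items():
        if p > 0:
            total += p * math.log2(p / (p_a[key[:n_a]] * p_b[key[n_a:]]))
    return total


TARGET = 2  # 0-based position of X3
candidates = [i for i in range(4) if i != TARGET]
scores = {j: mutual_information(JPD, [TARGET], [j]) for j in candidates}
for j, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"I(X3; X{j + 1}) = {score:.4f} bits")

best_single_parent = max(scores, key=scores.get)
```

A target-oriented learner would spend its complexity budget on attributes that score well here, whereas a general learner would add up such terms over all variables.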
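
Slide 17 sets the requirement of 50% recall with at most 1% false alarms. One way to check a scoring model against such a requirement is to sweep the decision threshold and keep the best recall whose false-alarm rate stays within the limit. The sketch below assumes a classifier that outputs a suspicion score per monitored object; the synthetic scores and the function name are hypothetical and are not taken from the slides.

```python
import random


def recall_at_false_alarm(scores, labels, max_far=0.01):
    """Best recall achievable while the false-alarm rate stays <= max_far.

    scores: higher means more suspicious; labels: 1 = target, 0 = non-target.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one target and one non-target")
    # Sweep thresholds from the highest score down; tied scores move together.
    ranked = sorted(zip(scores, labels), key=lambda t: -t[0])
    best_recall, tp, fp, i = 0.0, 0, 0, 0
    while i < len(ranked):
        threshold = ranked[i][0]
        while i < len(ranked) and ranked[i][0] == threshold:
            tp += ranked[i][1]
            fp += 1 - ranked[i][1]
            i += 1
        if fp / n_neg > max_far:
            break
        best_recall = tp / n_pos
    return best_recall


# Synthetic example: 50 targets and 1000 non-targets with overlapping scores.
rng = random.Random(1)
labels = [1] * 50 + [0] * 1000
scores = ([rng.gauss(2.0, 1.0) for _ in range(50)]
          + [rng.gauss(0.0, 1.0) for _ in range(1000)])
recall = recall_at_false_alarm(scores, labels, max_far=0.01)
print(f"recall at <=1% false alarms: {recall:.2f}; requirement met: {recall >= 0.5}")
```

Fixing the false-alarm budget first and then reading off recall mirrors how the requirement is phrased, and corresponds to reading a single point off the ROC curve on slide 19.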
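
Slide 20 defines Inter_prc_q1 to Inter_prc_q4 as the percentage of activity in each quarter of the day, and slide 18 lists an `hour` field in each call detail record. A minimal sketch of that feature extraction follows. It assumes the quarters are the six-hour blocks 0-5, 6-11, 12-17, and 18-23, which the slides do not specify, and it uses a hypothetical `main_party` dictionary key for the “Main party” field.

```python
from collections import Counter, defaultdict


def quarter_of_day_features(call_hours):
    """Map call start hours (0-23) to Inter_prc_q1..Inter_prc_q4 percentages."""
    counts = Counter()
    for hour in call_hours:
        if not 0 <= hour <= 23:
            raise ValueError(f"hour out of range: {hour}")
        counts[hour // 6] += 1  # assumed quarters: 0-5, 6-11, 12-17, 18-23
    total = sum(counts.values())
    return {
        f"Inter_prc_q{q + 1}": (100.0 * counts[q] / total if total else 0.0)
        for q in range(4)
    }


def features_by_party(records):
    """Group CDR rows by main party and compute the four features for each."""
    hours = defaultdict(list)
    for rec in records:
        hours[rec["main_party"]].append(rec["hour"])
    return {party: quarter_of_day_features(h) for party, h in hours.items()}


# Hypothetical records carrying only the fields this sketch needs.
records = [
    {"main_party": "A", "hour": 1},
    {"main_party": "A", "hour": 7},
    {"main_party": "A", "hour": 8},
    {"main_party": "A", "hour": 23},
    {"main_party": "B", "hour": 13},
    {"main_party": "B", "hour": 14},
]
for party, feats in features_by_party(records).items():
    print(party, feats)
```

Each monitored object ends up with four numbers that sum to 100, which can then serve as attributes for a classifier alongside other CDR-derived features.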