

- 1. Life and Work of Judea Pearl. March 9, 2013. www.persistentsys.com. © 2012 Persistent Systems Ltd
- 2. ACM A. M. Turing Award: Judea Pearl, United States, 2011. Citation: “For fundamental contributions to artificial intelligence through the development of a calculus for probabilistic and causal reasoning.” 35th Turing Award Recipient, in the Turing Centennial Year.
- 3. Quotes about Judea Pearl’s work:
  - “Judea Pearl’s highly influential 1988 book (Probabilistic Reasoning in Intelligent Systems) brought probability and decision theory into AI.” AI becomes science (1987 – present), AIMA, Stuart Russell & Peter Norvig
  - “His accomplishments over the last 30 years have provided the theoretical basis for progress in artificial intelligence and led to extraordinary achievements in machine learning, and they have redefined the term ‘thinking machine’.” Vint Cerf, Turing Award Recipient & President of ACM
  - “Before Pearl, most AI systems reasoned with Boolean logic – they understood true or false, but had a hard time with ‘maybe’.” Alfred Spector, VP Research at Google
- 4. Turing Test: the defining problem for AI. “The computer passes the test if a human interrogator, after posing some written questions, cannot tell whether the written responses come from a person or not.” Alan Turing, Computing Machinery & Intelligence (1950). The computer needs to possess the following capabilities: Natural Language Processing, Knowledge Representation, Automated Reasoning, Machine Learning.
- 5. About Judea Pearl
- 6. Background
  - Born: 1936, Tel Aviv, Israel
  - Education: BS, Technion, Israel, 1960; MS, Electronics, Newark College of Engineering, 1961; MS, Physics, Rutgers, 1965; PhD, Electrical Engineering, Polytechnic Institute of Brooklyn, 1965
  - Professional career: Member of Technical Staff, RCA Research Laboratories, 1961–1965; Director, Electronic Memories, Inc., Hawthorne, 1966–1969; Faculty, University of California, Los Angeles, Computer Science Department, 1969 to date (Emeritus Faculty since 1994); Director of Cognitive Systems Laboratory, UCLA (1978–)
- 7. Research Interests
  - Early research: magnetic and superconducting memories
  - Combinatorial search (A* search). Book: Heuristics: Intelligent Search Strategies for Computer Problem Solving
  - Probability & decision theory. Book: Probabilistic Reasoning in Intelligent Systems
  - Causality & its applications in different domains. Book: Causality: Models, Reasoning, and Inference
- 8. Daniel Pearl (1963 – 2002). Journalist and musician; Wall Street Journal South Asia Bureau Chief in 2002. Kidnapped and murdered in Karachi, 2002. The Daniel Pearl Foundation, formed by Ruth & Judea Pearl in 2002, promotes cross-cultural understanding through journalism and music.
- 9. CAUSALITY: Understanding Cause & Effect
- 10. Thank you for smoking!
- 11. Thank You for Smoking!
  - Nick Naylor: works for the Academy of Tobacco Studies, a firm that promotes the benefits of cigarettes. Evangelist for tobacco products.
  - Ortolan K. Finistirre: Senator from Vermont, anti-tobacco. Seeks to pass a resolution to put a skull and crossbones symbol on cigarette packs.
- 12. Does cigarette smoking cause lung cancer? Cause: smoking cigarettes. Effect: lung cancer. Does eating cheese lead to heart disease? Cause: eating cheese. Effect: heart disease.
- 13. Which of these are actually causal?
  - Eating high-protein food leads to weight loss.
  - Taking aspirin reduces the risk of heart attack.
  - Women’s empowerment reduces the population birth rate.
  - A bigger search button on a web page increases click-through.
  - Drinking milk with additives increases the height of kids.
  - Smaller class sizes improve learning.
  - Carbon emissions cause global warming.
  - Reducing taxes increases job creation.
  - Lower interest rates lead to an improved economy.
  - Higher pay leads to reduced attrition.
- 14. Pearl’s Riddles of Causation. What patterns of experience convince people that a connection is causal? What difference would it make if I told you that a certain connection is, or is not, causal?
- 15. Why study Cause & Effect (Causality)?
- 16. Why should Computer Scientists study Causality?
- 17. From a Pulitzer Prize–winning investigative reporter at The New York Times comes the explosive story of the rise of the processed food industry and its link to the emerging obesity epidemic. Michael Moss reveals how companies use salt, sugar, and fat to addict us and, more important, how we can fight back.
- 18. The Art and Science of Cause and Effect, Judea Pearl. Transcript of lecture given Thursday, October 29, 1996, UCLA 81st Faculty Research Lecture Series.
- 19. Causality: A historical perspective
- 20. David Hume (1711 – 1776), philosopher. From “Treatise of Human Nature”: “Thus we remember to have seen that species of object we call FLAME, and to have felt that species of sensation we call HEAT. We likewise call to mind their constant conjunction in all past instances. Without any farther ceremony, we call the one CAUSE and the other EFFECT, and infer the existence of the one from that of the other.”
- 21. Correlation
- 22. Francis Galton (1822 – 1911) & Karl Pearson (1857 – 1936). Study of inheritance of intelligence. Study of fore-arm & height measurements. “Co-relation must be the consequence of the variations of the two organs being partly due to common causes.”
- 23. Correlation & Dependence. Correlation is a measure of the relationship between two mathematical variables or measured data values. It is quantified by a correlation coefficient, most commonly Pearson’s correlation coefficient.
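
As an illustration of the coefficient named on slide 23, here is a minimal Python sketch of Pearson's correlation coefficient (sample covariance divided by the product of the standard deviations). The function name `pearson_r` and the sample data are assumptions made for this example; they do not come from the slides.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical measurements: y rises with x, so r is close to +1.
forearm = [24.0, 25.5, 26.1, 27.3, 28.0]
height = [160.0, 168.0, 171.0, 177.0, 182.0]
print(pearson_r(forearm, height))
```

A value near +1 or -1 only says the two series move together linearly; it says nothing about which one, if either, causes the other.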
- 24. Correlation is NOT Causation! Be careful when inferring causation from correlation. Correlation indicates the possibility of a predictive relationship, but it is not a sufficient condition for causation. Correlation or causation? Did Avas cause the housing bubble? Is murder rate related to the height of a mountain range? M. Night Shyamalan’s lack of … Source: http://www.businessweek.com/magazine/correlation-or-causation-12012011-gfx.html
- 25. RANDOMIZED TRIAL
- 26. Randomized Controlled Trials. The only accepted way of proving causality. First proposed by Charles Sanders Peirce in education. Promoted and formalized by Sir Ronald Fisher in Design of Experiments (1935).
- 27. Randomized Controlled Trials (diagram). Four phases of an RCT in clinical trials: Enrollment, Intervention Allocation, Follow-up, Data Analysis.
- 28. Randomized Controlled Trials on the Web. Current search widget versus proposed search widget: which one is better? Source: Ronny Kohavi, http://www.exp-platform.com/Pages/default.aspx
- 29. Run the experiment (RCT) and decide. Data-driven decision making, also known as A/B testing. Google ran approximately 12,000 randomized experiments in 2009; 10% resulted in a change. The web is ideal for running experiments and improving through them, because the cost of running an experiment on the web is very low. Source: Ronny Kohavi, http://www.exp-platform.com/Pages/default.aspx
- 30. Things to keep in mind. Randomization before allocation is critical. Apart from the feature under test, the Control and Treatment groups should be exposed to identical conditions. Use statistical significance tests on the results. Use a large enough sample so that the result of a trial is unlikely to be due to random chance.
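
The checklist on slide 30 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the method of any platform cited in the slides: the helper names `assign_groups` and `two_proportion_z_test`, the seed, and the click counts are made up for the example, and the significance test shown is a standard pooled two-proportion z-test.

```python
import math
import random


def assign_groups(user_ids, seed=0):
    """Randomly allocate each user to control or treatment (hypothetical helper)."""
    rng = random.Random(seed)
    groups = {"control": [], "treatment": []}
    for uid in user_ids:
        groups[rng.choice(("control", "treatment"))].append(uid)
    return groups


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("no variation in outcomes; the test is undefined")
    z = (p_b - p_a) / se
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


groups = assign_groups(range(2000))
# Assumed outcome counts: 100 of 1000 clicks in control, 150 of 1000 in treatment.
z, p = two_proportion_z_test(100, 1000, 150, 1000)
print(len(groups["control"]), len(groups["treatment"]), z, p)
```

A small p-value here suggests the difference between the two widgets is unlikely to be chance alone; with much smaller samples the same difference in rates would not reach significance, which is the point of the "large enough sample" item.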
- 31. Challenges of Randomized Controlled Trials. In most cases running an RCT is infeasible (economics, anthropology, politics); in some cases it might be illegal! Lack of deeper understanding: understanding = how things work when taken apart! Judea Pearl: the lack of a language to express causal concepts explicitly is responsible for the poor scientific activity around causality.
- 32. BEYOND RANDOMIZED TRIALS
- 33. Judea Pearl’s Contribution to Causality
  1. A representation for capturing relationships between different pieces of information and their causal links: Bayesian Networks.
  2. An algebra of intervention: the do operator to capture explicit actions, and its relationship with probability.

  Judea Pearl’s work gave language and notation to causality and brought it under the mathematical sciences.
- 34. Cigarette Smoking & Lung Cancer. A 1964 study finds that cigarette smokers have a higher chance of getting lung cancer. The cigarette lobby points to the possible presence of an unknown gene that causes both the urge for nicotine and cancer. A study finds that people who visit bars have a higher chance of getting lung cancer. Doctors find a relationship between tar deposits in the lungs and lung cancer.
- 35. Factors Affecting Pneumonia (an example). From: Aronsky, D. and Haug, P.J., Diagnosing community-acquired pneumonia with a Bayesian network, In: Proceedings of the Fall Symposium of the American Medical Informatics Association, (1998) 632-636.
- 36. Challenge! How can I establish a causal relationship between smoking and lung cancer using this data? How does P(Cancer | smoking) compare with P(Cancer): is P(Cancer | smoking) > P(Cancer), or is P(Cancer | smoking) = P(Cancer)?
- 37. Primer on Probability. From: A Tutorial on Bayesian Networks, Weng-Keen Wong, Oregon State University.
- 38. Probability Primer: Random Variables. A random variable is the basic element of probability. It refers to an event, and there is some degree of uncertainty as to the outcome of the event. For example, the random variable A could be the event of getting a heads on a coin flip.
- 39. Boolean Random Variables. We deal with the simplest type of random variables, Boolean ones, which take the values true or false. Think of the event as occurring or not occurring. Examples (let A be a Boolean random variable): A = Getting heads on a coin flip; A = It will rain today; A = There is a typo in these slides.
- 40. Probabilities. We will write P(A = true) to mean the probability that A = true. What is probability? It is the relative frequency with which an outcome would be obtained if the process were repeated a large number of times under similar conditions.* In the figure, the areas for P(A = true) and P(A = false) (red and blue) sum to 1. *Ahem… there’s also the Bayesian definition, which says probability is your degree of belief in an outcome.
- 41. Conditional Probability. P(A = true | B = true) = out of all the outcomes in which B is true, the fraction that also have A equal to true. Read this as “probability of A conditioned on B” or “probability of A given B”. Example: H = “Have a headache”, F = “Coming down with flu”; P(H = true) = 1/10, P(F = true) = 1/40, P(H = true | F = true) = 1/2. “Headaches are rare and flu is rarer, but if you’re coming down with flu there’s a 50-50 chance you’ll have a headache.”
- 42. The Joint Probability Distribution. We will write P(A = true, B = true) to mean “the probability of A = true and B = true”. Notice that $P(H=\text{true} \mid F=\text{true}) = \frac{\text{area of ``H and F'' region}}{\text{area of ``F'' region}} = \frac{P(H=\text{true},\, F=\text{true})}{P(F=\text{true})}$. In general, $P(X \mid Y) = P(X, Y) / P(Y)$.
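
The headache and flu numbers from slides 41 and 42 can be worked through with exact fractions. The sketch below uses only the three probabilities given on the slide; the last step, inverting the conditional to get P(F | H), applies P(X|Y) = P(X,Y)/P(Y) a second time and goes one step beyond what the slide computes.

```python
from fractions import Fraction

# Probabilities stated on the slide.
p_h = Fraction(1, 10)          # P(H = true): have a headache
p_f = Fraction(1, 40)          # P(F = true): coming down with flu
p_h_given_f = Fraction(1, 2)   # P(H = true | F = true)

# Rearranging P(H | F) = P(H, F) / P(F) gives the joint probability.
p_h_and_f = p_h_given_f * p_f

# Applying the same definition the other way round gives P(F | H).
p_f_given_h = p_h_and_f / p_h

print(p_h_and_f)    # 1/80
print(p_f_given_h)  # 1/8
```

So a headache raises the probability of flu from 1/40 to 1/8, even though most headaches still have nothing to do with flu.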
- 43. The Joint Probability Distribution. Joint probabilities can be between any number of variables, e.g. P(A = true, B = true, C = true). For each combination of variables, we need to say how probable that combination is. The probabilities of these combinations need to sum to 1.

  | A | B | C | P(A,B,C) |
  |---|---|---|---|
  | false | false | false | 0.1 |
  | false | false | true | 0.2 |
  | false | true | false | 0.05 |
  | false | true | true | 0.05 |
  | true | false | false | 0.3 |
  | true | false | true | 0.1 |
  | true | true | false | 0.05 |
  | true | true | true | 0.15 |

- 44. The Joint Probability Distribution. Once you have the joint probability distribution (the table on slide 43), you can calculate any probability involving A, B, and C. Note: you may need to use marginalization and Bayes rule (both of which are not discussed in these slides). Examples of things you can compute:
  - P(A = true) = sum of P(A,B,C) in rows with A = true
  - P(A = true, B = true | C = true) = P(A = true, B = true, C = true) / P(C = true)
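
The two example queries on slide 44 can be computed directly from the joint table. This is a minimal sketch: the table values are copied from the slide, while the dictionary layout and the helper name `prob` are assumptions made for the illustration.

```python
# Joint distribution P(A, B, C), keyed by (a, b, c), copied from the slide.
JOINT = {
    (False, False, False): 0.10,
    (False, False, True): 0.20,
    (False, True, False): 0.05,
    (False, True, True): 0.05,
    (True, False, False): 0.30,
    (True, False, True): 0.10,
    (True, True, False): 0.05,
    (True, True, True): 0.15,
}


def prob(event):
    """Marginalize: add up the rows of the joint table where event(a, b, c) holds."""
    return sum(p for (a, b, c), p in JOINT.items() if event(a, b, c))


# P(A = true): sum of the rows with A = true.
p_a = prob(lambda a, b, c: a)

# P(A = true, B = true | C = true) = P(A, B, C all true) / P(C = true).
p_c = prob(lambda a, b, c: c)
p_ab_given_c = prob(lambda a, b, c: a and b and c) / p_c

print(p_a, p_c, p_ab_given_c)
```

Up to floating-point rounding, the three values are 0.6, 0.5 and 0.3.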
- 45. The Problem with the Joint Distribution. Lots of entries in the table to fill up (see the table on slide 43)! For $k$ Boolean random variables, you need a table of size $2^k$. How do we use fewer numbers? We need the concept of independence.
- 46. Independence. Variables A and B are independent if any of the following hold: P(A,B) = P(A) P(B); P(A | B) = P(A); P(B | A) = P(B). This says that knowing the outcome of A does not tell me anything new about the outcome of B.
- 47. Independence. How is independence useful? Suppose you have $n$ coin flips and you want to calculate the joint distribution $P(C_1, \ldots, C_n)$. If the coin flips are not independent, you need $2^n$ values in the table. If the coin flips are independent, then $P(C_1, \ldots, C_n) = \prod_{i=1}^{n} P(C_i)$. Each $P(C_i)$ table has 2 entries and there are $n$ of them, for a total of $2n$ values.
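
The saving described on slide 47 can be made concrete with a short sketch: under independence, the full joint table over n coin flips is recoverable from the n per-coin marginals. The function name `joint_from_marginals` and the three head probabilities are assumptions chosen for the example.

```python
import math
from itertools import product


def joint_from_marginals(p_heads):
    """Build the full joint table over independent coin flips.

    p_heads[i] is P(C_i = heads). Because the flips are independent,
    each joint entry is the product of the per-coin probabilities.
    """
    joint = {}
    for outcome in product([True, False], repeat=len(p_heads)):
        joint[outcome] = math.prod(
            p if is_heads else 1 - p for p, is_heads in zip(p_heads, outcome)
        )
    return joint


marginals = [0.5, 0.6, 0.9]  # hypothetical biased coins
table = joint_from_marginals(marginals)

# 2n numbers (a two-entry table per coin) determine all 2^n joint entries.
print(2 * len(marginals), "stored values ->", len(table), "joint entries")
```

For three coins the difference is 6 against 8 values; for thirty coins it is 60 against more than a billion.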
- 48. Conditional Independence. Variables A and B are conditionally independent given C if any of the following hold: P(A, B | C) = P(A | C) P(B | C); P(A | B, C) = P(A | C); P(B | A, C) = P(B | C). Knowing C tells me everything about B. I don’t gain anything by knowing A (either because A doesn’t influence B or because knowing C provides all the information knowing A would give).
- 49. Bayesian Networks
- 50. A Bayesian Network. A Bayesian network is made up of two things:
  1. A Directed Acyclic Graph (here: A → B, B → C, B → D)
  2. A set of tables for each node in the graph

  | A | P(A) |
  |---|---|
  | false | 0.6 |
  | true | 0.4 |

  | A | B | P(B \| A) |
  |---|---|---|
  | false | false | 0.01 |
  | false | true | 0.99 |
  | true | false | 0.7 |
  | true | true | 0.3 |

  | B | C | P(C \| B) |
  |---|---|---|
  | false | false | 0.4 |
  | false | true | 0.6 |
  | true | false | 0.9 |
  | true | true | 0.1 |

  | B | D | P(D \| B) |
  |---|---|---|
  | false | false | 0.02 |
  | false | true | 0.98 |
  | true | false | 0.05 |
  | true | true | 0.95 |

- 51. A Directed Acyclic Graph. Each node in the graph is a random variable. A node X is a parent of another node Y if there is an arrow from node X to node Y, e.g. A is a parent of B. Informally, an arrow from node X to node Y means X has a direct influence on Y. (Figure: A → B, B → C, B → D.)
- 52. A Set of Tables for Each Node. Each node Xi has a conditional probability distribution P(Xi | Parents(Xi)) that quantifies the effect of the parents on the node. The parameters are the probabilities in these conditional probability tables (CPTs). The slide repeats the graph and the four tables P(A), P(B | A), P(C | B) and P(D | B) from slide 50.
- 53. Bayesian Network for Cigarette Smoking. (Figure: network over the nodes Mystery Gene, Cancer in Family, Smoking, Late Night Partying, Tar Deposits and Lung Cancer, annotated with the tables P(Smoking | Mystery Gene) and P(Tar | Smoking).)
- 54. Bayesian Networks. Two important properties:
  1. Encodes the conditional independence relationships between the variables in the graph structure
  2. Is a compact representation of the joint probability distribution over the variables
- 55. Conditional Independence. The Markov condition: given its parents (P1, P2), a node (X) is conditionally independent of its non-descendants (ND1, ND2). (Figure: X with parents P1, P2, non-descendants ND1, ND2 and children C1, C2.)
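
The Markov condition can be checked numerically on the four-node example network from slide 50. In that graph D has parent B and non-descendant A, so P(D | B, A) should equal P(D | B). The sketch below is an illustration only: the CPT values come from the slides, while the dictionary encoding and the names `joint`, `prob` and `markov_holds_for_d` are assumptions.

```python
import math
from itertools import product

# CPTs from the example network (A -> B, B -> C, B -> D).
P_A = {False: 0.6, True: 0.4}
P_B_GIVEN_A = {(False, False): 0.01, (False, True): 0.99,
               (True, False): 0.7, (True, True): 0.3}    # key: (a, b)
P_C_GIVEN_B = {(False, False): 0.4, (False, True): 0.6,
               (True, False): 0.9, (True, True): 0.1}    # key: (b, c)
P_D_GIVEN_B = {(False, False): 0.02, (False, True): 0.98,
               (True, False): 0.05, (True, True): 0.95}  # key: (b, d)


def joint(a, b, c, d):
    """Joint probability as the product of each node's CPT entry."""
    return P_A[a] * P_B_GIVEN_A[(a, b)] * P_C_GIVEN_B[(b, c)] * P_D_GIVEN_B[(b, d)]


def prob(event):
    """Sum the joint over all assignments where event(a, b, c, d) holds."""
    return sum(joint(*w) for w in product([True, False], repeat=4) if event(*w))


def markov_holds_for_d():
    """Check P(D | B, A) == P(D | B) for every combination of values."""
    for a, b, d in product([True, False], repeat=3):
        with_a = (prob(lambda A, B, C, D: (A, B, D) == (a, b, d))
                  / prob(lambda A, B, C, D: (A, B) == (a, b)))
        without_a = (prob(lambda A, B, C, D: (B, D) == (b, d))
                     / prob(lambda A, B, C, D: B == b))
        if not math.isclose(with_a, without_a):
            return False
    return True


print(markov_holds_for_d())
```

Once B is known, learning A does not change the probability of D, which is exactly what the graph structure encodes.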
- 56. The Joint Probability Distribution. Due to the Markov condition, we can compute the joint probability distribution over all the variables $X_1, \ldots, X_n$ in the Bayesian net using the formula $P(X_1 = x_1, \ldots, X_n = x_n) = \prod_{i=1}^{n} P(X_i = x_i \mid \mathrm{Parents}(X_i))$, where $\mathrm{Parents}(X_i)$ means the values of the parents of the node $X_i$ with respect to the graph.
- 57. Using a Bayesian Network: Example. Using the network in the example (A → B, B → C, B → D), suppose you want to calculate: P(A = true, B = true, C = true, D = true) = P(A = true) * P(B = true | A = true) * P(C = true | B = true) * P(D = true | B = true) = (0.4)*(0.3)*(0.1)*(0.95)
- 58. Using a Bayesian Network: Example. The same calculation, annotated: the factorization P(A = true) * P(B = true | A = true) * P(C = true | B = true) * P(D = true | B = true) comes from the graph structure, and the numbers (0.4)*(0.3)*(0.1)*(0.95) come from the conditional probability tables.
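
The calculation on slides 57 and 58 can be reproduced in a few lines. This is a minimal sketch: the CPT values are taken from the slides, and the dictionary encoding and the function name `joint` are assumptions made for the illustration.

```python
from itertools import product

# CPTs from the example network (A -> B, B -> C, B -> D).
P_A = {False: 0.6, True: 0.4}
P_B_GIVEN_A = {(False, False): 0.01, (False, True): 0.99,
               (True, False): 0.7, (True, True): 0.3}    # key: (a, b)
P_C_GIVEN_B = {(False, False): 0.4, (False, True): 0.6,
               (True, False): 0.9, (True, True): 0.1}    # key: (b, c)
P_D_GIVEN_B = {(False, False): 0.02, (False, True): 0.98,
               (True, False): 0.05, (True, True): 0.95}  # key: (b, d)


def joint(a, b, c, d):
    """P(A=a, B=b, C=c, D=d) as the product of each node's CPT entry."""
    return P_A[a] * P_B_GIVEN_A[(a, b)] * P_C_GIVEN_B[(b, c)] * P_D_GIVEN_B[(b, d)]


# The query from the slide: all four variables true.
p_all_true = joint(True, True, True, True)
print(p_all_true)  # approximately 0.0114 = (0.4)*(0.3)*(0.1)*(0.95)

# Sanity check: the 16 joint entries implied by the CPTs sum to 1.
total = sum(joint(*world) for world in product([True, False], repeat=4))
print(total)
```

The network stores 2 + 4 + 4 + 4 = 14 table entries, of which only 7 are free parameters, yet it determines all 16 entries of the joint distribution.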
- 59. Inference. Using a Bayesian network to compute probabilities is called inference. In general, inference involves queries of the form P(X | E), where X = the query variable(s) and E = the evidence variable(s).
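
A query of the form P(X | E) can be answered on the small example network by brute-force enumeration: sum the joint over every assignment consistent with the evidence, then normalize. This is a sketch under stated assumptions rather than the approach the slides prescribe. The CPTs are those of the example network, the names `joint` and `query` are hypothetical, and enumeration is exponential in the number of variables, so it only suits tiny networks.

```python
from itertools import product

VARS = ("A", "B", "C", "D")

# CPTs from the example network (A -> B, B -> C, B -> D).
P_A = {False: 0.6, True: 0.4}
P_B_GIVEN_A = {(False, False): 0.01, (False, True): 0.99,
               (True, False): 0.7, (True, True): 0.3}    # key: (a, b)
P_C_GIVEN_B = {(False, False): 0.4, (False, True): 0.6,
               (True, False): 0.9, (True, True): 0.1}    # key: (b, c)
P_D_GIVEN_B = {(False, False): 0.02, (False, True): 0.98,
               (True, False): 0.05, (True, True): 0.95}  # key: (b, d)


def joint(w):
    """Joint probability of a full assignment w, a dict from variable name to bool."""
    return (P_A[w["A"]] * P_B_GIVEN_A[(w["A"], w["B"])]
            * P_C_GIVEN_B[(w["B"], w["C"])] * P_D_GIVEN_B[(w["B"], w["D"])])


def query(var, evidence):
    """Return the distribution P(var | evidence) as {True: p, False: 1 - p}."""
    mass = {True: 0.0, False: 0.0}
    for values in product([True, False], repeat=len(VARS)):
        world = dict(zip(VARS, values))
        # Keep only the worlds that agree with the observed evidence.
        if all(world[name] == value for name, value in evidence.items()):
            mass[world[var]] += joint(world)
    total = mass[True] + mass[False]
    return {value: p / total for value, p in mass.items()}


# Example query: how likely is A, given that D was observed to be true?
posterior = query("A", {"D": True})
print(posterior)
```

With no evidence, `query("A", {})` returns the prior P(A) from the first table.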
- 60. Key Questions on Bayesian Networks
  - How do you build a Bayesian network?
  - How do you compute conditional probabilities based on data?
  - What about continuous variables?
  - Without data, how do you build a Bayesian network? Can you capture data from your experience in the network?
  - Can you learn the structure from data?
  - How do you draw inferences using Bayesian networks?
  - How do you manage the computational complexity of the network for exact inference?
- 61. Algebra of Doing
- 62. Algebra of Doing
  - Available: the algebra of seeing. What is the chance it rained if we see the grass is wet? P(rain | wet) = P(wet | rain) P(rain) / P(wet)
  - Algebra of doing. What is the chance that it rained if we make the grass wet? P(rain | do(wet)) = P(rain)
  - Simplify the Bayesian network by explicitly capturing an intervention.
  - Causal conditional probabilities: P(x | do(y))
  - A calculus for moving from causal conditional probability to conditional probability.
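
The contrast between seeing and doing on slide 62 can be illustrated with a two-node network Rain → Wet. The sketch below is an illustration only: the numbers P(rain) = 0.2, P(wet | rain) = 0.9 and P(wet | no rain) = 0.1 are assumptions, as are the function names. An intervention do(wet) is modelled by replacing the table for Wet with a fixed value, which removes Rain's influence on it.

```python
P_RAIN = 0.2                                  # assumed prior P(rain)
P_WET_GIVEN_RAIN = {True: 0.9, False: 0.1}    # assumed P(wet = true | rain)


def joint(rain, wet, do_wet=None):
    """P(rain, wet), optionally under the intervention do(wet = do_wet)."""
    p_rain = P_RAIN if rain else 1 - P_RAIN
    if do_wet is None:
        # Observational world: Wet depends on Rain through its CPT.
        p_wet_true = P_WET_GIVEN_RAIN[rain]
        p_wet = p_wet_true if wet else 1 - p_wet_true
    else:
        # Intervened world: Wet is forced to do_wet, whatever Rain is.
        p_wet = 1.0 if wet == do_wet else 0.0
    return p_rain * p_wet


def p_rain_given_wet(do_wet=None):
    """P(rain | wet = true), in the observational or the intervened world."""
    numerator = joint(True, True, do_wet)
    denominator = numerator + joint(False, True, do_wet)
    return numerator / denominator


seeing = p_rain_given_wet()             # P(rain | wet) by Bayes rule
doing = p_rain_given_wet(do_wet=True)   # P(rain | do(wet))
print(seeing, doing)
```

Seeing wet grass raises the probability of rain well above the prior, while making the grass wet leaves it at P(rain), matching the identity on the slide.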
- 63. Causal Conditional Probabilities. Borrowing ideas from Randomized Controlled Trials: a hypothetical world. Can we compute P(cancer | do(smoking))? The do operator allows us to override the causal influences on that variable. (Figure: nodes Gene (marked “?”), Smoking, Tar and Cancer.)
- 64. Using Causal Conditional Probabilities. Set up an intervention in the Bayesian network. Override all other causal influences in the presence of the intervention, e.g. P(Tar | do(smoking)). Convert from do calculus to observational calculus. (Figure: nodes Intervention, Smoking, Tar and Cancer.)
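
One way the conversion from do expressions to observational quantities can look for this example is Pearl's front-door adjustment. The derivation below is a sketch under assumptions the slides do not state explicitly: the graph is Gene → Smoking, Gene → Cancer, Smoking → Tar → Cancer, the gene is unobserved, and smoking affects cancer only through tar. Here $s$, $t$ and $c$ stand for values of Smoking, Tar and Cancer, and $s'$ ranges over the values of Smoking.

$$
\begin{aligned}
P(t \mid do(s)) &= P(t \mid s) \\
P(c \mid do(t)) &= \sum_{s'} P(c \mid t, s')\, P(s') \\
P(c \mid do(s)) &= \sum_{t} P(t \mid s) \sum_{s'} P(c \mid t, s')\, P(s')
\end{aligned}
$$

Every term on the right-hand side is an ordinary conditional or marginal probability over Smoking, Tar and Cancer, so under these assumptions the causal effect could be estimated from observational data without ever measuring the gene.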
- 65. Summary
- 66. To summarize: Correlation is NOT causation. Randomized Controlled Trials (RCTs) can establish causation. Want more? Study causality!
- 67. References
  - Causality: Models, Reasoning, and Inference, Judea Pearl, Second Edition
  - A Tutorial on Learning With Bayesian Networks, David Heckerman, Technical Report, Microsoft Research
  - Bayesian Networks without Tears, Eugene Charniak, AI Magazine, 1991
  - If Correlation Doesn’t Imply Causation, Then What Does?, Michael Nielsen, blog
- 68. Thank You. Persistent Systems Limited
