
# Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewness, Kurtosis & Data Distributions - Professor Daniel Martin Katz


Published in: Education, Technology
• But kurtosis does not measure anything about the peak. It measures the rare, extreme observations only. Please see https://en.wikipedia.org/wiki/Talk:Kurtosis#Why_kurtosis_should_not_be_interpreted_as_.22peakedness.22


### Quantitative Methods for Lawyers - Class #9 - Bayes Theorem (Part 2), Skewness, Kurtosis & Data Distributions - Professor Daniel Martin Katz

1. 1. Quantitative Methods for Lawyers: Bayes Theorem (Part 2), Skewness, Kurtosis & Data Distributions. Class #9. Professor Daniel Martin Katz. @computational, computationallegalstudies.com, danielmartinkatz.com, lexpredict.com, slideshare.net/DanielKatz
2. 2. Bayes Rule. Example: Marie is getting married tomorrow at an outdoor ceremony in the desert. In recent years, it has rained only 5 days each year. Unfortunately, the weatherman has predicted rain for tomorrow. When it actually rains, the weatherman correctly forecasts rain 90% of the time. When it doesn't rain, he incorrectly forecasts rain 10% of the time. What is the probability that it will rain on the day of Marie's wedding?
3. 3. Bayes Rule. Solution: The sample space is defined by two mutually exclusive events: it rains or it does not rain. Additionally, a third event occurs when the weatherman predicts rain. Notation for these events appears below. • Event A1: It rains on Marie's wedding. • Event A2: It does not rain on Marie's wedding. • Event B: The weatherman predicts rain.
4. 4. Bayes Rule. • Event A1: It rains on Marie's wedding. • Event A2: It does not rain on Marie's wedding. • Event B: The weatherman predicts rain. In terms of probabilities, we know the following: • P(A1) = 5/365 ≈ 0.014 [it rains 5 days per year] • P(A2) = 360/365 ≈ 0.986 [it does not rain 360 days per year] • P(B | A1) = 0.9 [when it rains, the weatherman predicts rain 90% of the time] • P(B | A2) = 0.1 [when it does not rain, the weatherman predicts rain 10% of the time]
5. 5. Let's Think About This Using a Diagram. [Tree diagram: the A1 branch has P(A1) = 5/365 = .014 and P(B|A1) = .9, giving a joint probability of .0126; the A2 branch has P(A2) = 360/365 = .986 and P(B|A2) = .1, giving a joint probability of .0986.]
6. 6. Bayes Rule. We want to know P(A1 | B), the probability it will rain on the day of Marie's wedding, given a forecast for rain by the weatherman. The answer can be determined from Bayes' theorem: $P(A_1 \mid B) = \dfrac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)} = \dfrac{(0.014)(0.9)}{(0.014)(0.9) + (0.986)(0.1)} \approx 0.1133$. Note the somewhat unintuitive result: even when the weatherman predicts rain, it rains only about 11% of the time.
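The arithmetic on this slide can be checked in a few lines of Python. This is only a sketch: the function name `posterior` and its parameter names are illustrative choices, not something taken from the slides. It also shows that the 0.1133 figure comes from rounding the priors to three decimals; with exact fractions the answer is 1/9, about 0.111.

```python
from fractions import Fraction

def posterior(prior, p_signal_given_event, p_signal_given_no_event):
    """Bayes' theorem for an event and its complement.

    prior: P(A1); the complement A2 has probability 1 - prior.
    p_signal_given_event: P(B | A1).
    p_signal_given_no_event: P(B | A2).
    """
    numerator = prior * p_signal_given_event
    denominator = numerator + (1 - prior) * p_signal_given_no_event
    return numerator / denominator

# Rounded inputs, as used on the slide (P(A1) = 0.014).
rounded = posterior(0.014, 0.9, 0.1)

# Exact inputs (P(A1) = 5/365), kept as fractions to avoid rounding.
exact = posterior(Fraction(5, 365), Fraction(9, 10), Fraction(1, 10))

print(round(rounded, 4))
print(exact, float(exact))
```

Either way, the conclusion is the same: a rain forecast leaves the chance of rain at roughly 11%.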
7. 7. Bayes Rule. What Can We Say About The Weatherman? The likelihood of rain increased from ~1.4% to ~11%. That is roughly an 8-fold increase in the likelihood. However, it is still pretty unlikely to rain.
8. 8. Bayes Rule. How Much Signal / Information? The signal was of limited value because the ratio of Type I to Type II error was not favorable. Compound events: we could consider a more complex version of the problem, e.g., the weatherman predicts rain and it is the monsoon season.
9. 9. Let's Try Another Bayes Rule Problem ...
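One way to see how much information the forecast carries is to hold the weatherman's accuracy fixed and vary the base rate of rain. The sketch below does that. The 30% "monsoon season" base rate is a made-up number chosen purely for illustration; the slides give no figure for it, and the function name `posterior_rain` is likewise an assumption.

```python
def posterior_rain(prior_rain, hit_rate=0.9, false_alarm_rate=0.1):
    """P(rain | forecast of rain) for a given base rate of rain."""
    numerator = prior_rain * hit_rate
    return numerator / (numerator + (1 - prior_rain) * false_alarm_rate)

# Desert base rate from the example: 5 rainy days per year.
desert = posterior_rain(5 / 365)

# Hypothetical monsoon-season base rate (an assumption, not from the slides).
monsoon = posterior_rain(0.30)

print(f"desert:  {desert:.3f}")
print(f"monsoon: {monsoon:.3f}")
```

In both cases the forecast multiplies the odds of rain by the same factor, 0.9 / 0.1 = 9. The posterior stays low in the desert because the prior odds are so small to begin with.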
10. 10. Bayes Rule. Imagine a particular test that correctly identifies those with a certain disease 94% of the time and correctly diagnoses those without the disease 98% of the time. A friend has just informed you that he has received a positive result and asks for your advice about how to interpret these probabilities. Before attempting to address your friend's concern, you research the illness and discover that 4% of men have this disease. What is the probability your friend actually has the disease?
11. 11. Bayes Rule. Define the events: $A_1$ = a man has this disease; $A_2$ = a man does not have this disease; $B$ = positive test result; $B^C$ = negative test result. Express the given information and question in probability notation: • “test correctly identifies those with a certain serious disease 94% of the time” ⇒ $P(B \mid A_1) = 0.94$ • “test correctly diagnoses those without the disease 98% of the time” ⇒ $P(B^C \mid A_2) = 0.98$ • “you discover that 4% of men have this disease” ⇒ $P(A_1) = 0.04$ • this statement also tells us that 96% of men do not have the disease ⇒ $P(A_2) = 0.96$
12. 12. Key Question: “Given a positive result, what is the probability your friend actually has the disease?” ⇒ $P(A_1 \mid B) = \,?$
13. 13. Bayes Rule. A tree diagram: [Tree diagram over the events $A_1$ = a man has this disease, $A_2$ = a man does not have this disease, $B$ = positive test result, $B^C$ = negative test result.]
14. 14. Bayes Rule. Use Bayes' Theorem and your tree diagram to answer the question. Since the test is correct for 98% of men without the disease, $P(B \mid A_2) = 1 - P(B^C \mid A_2) = 0.02$, so $P(A_1 \mid B) = \dfrac{P(A_1)\,P(B \mid A_1)}{P(A_1)\,P(B \mid A_1) + P(A_2)\,P(B \mid A_2)} = \dfrac{0.0376}{0.0376 + 0.0192} \approx 0.662$. There is a 66.2% probability that he actually has the disease. The probability is high, but considerably lower than your friend feared.
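The same answer can be reached by counting people instead of multiplying probabilities, which many readers find easier to follow. The sketch below imagines a group of 10,000 men; that group size is an arbitrary assumption made for illustration, and any size gives the same ratio.

```python
# Imagined group size (an assumption; any size gives the same ratio).
population = 10_000

prevalence = 0.04    # P(A1): share of men with the disease
sensitivity = 0.94   # P(B | A1): positive result given disease
specificity = 0.98   # P(B^C | A2): negative result given no disease

diseased = population * prevalence                # 400 men
healthy = population - diseased                   # 9,600 men

true_positives = diseased * sensitivity           # about 376 men
false_positives = healthy * (1 - specificity)     # about 192 men

# Among all men who test positive, the share who really have the disease.
p_disease_given_positive = true_positives / (true_positives + false_positives)

print(round(p_disease_given_positive, 3))
```

Of the roughly 568 men who test positive, about 376 have the disease, which matches the 66.2% on the slide.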
15. 15. Review This One on Your Own: http://www.agenarisk.com/resources/probability_puzzles/event_tree.shtml
16. 16. Sampling, Take 2: Use the Sample to Infer Characteristics of the Full Population
17. 17. Sampling. Why Sample? Sampling is concerned with the selection of a subset of individuals from within a population to estimate characteristics of the whole population. • It might be impossible to get the full population • Cost of getting the full population • Focus upon improving precision v. size
18. 18. Sampling Stages: (1) Defining the population of concern (2) Specifying a sampling frame, a set of items or events possible to measure (3) Specifying a sampling method for selecting items or events from the frame (4) Determining the sample size (5) Implementing the sampling plan (6) Sampling and data collecting
19. 19. Determining the Sample Size. Conceptually, we understand that in order to obtain a representative sample we need a sample size somewhere between a single observation and the full population: 1 < ? < Full Population. But exactly how many observations do we need?
20. 20. Random Sampling Error. Imagine a political poll where the population of interest is actual voters. Pollsters take smaller samples that are intended to be representative, that is, a random sample of the population. When you sample at random, it is possible to have a skewed set of observations in the sample. It is possible that pollsters sample 1,013 voters who all happen to vote for Candidate 1 when in fact the population is evenly split between Candidate 1 and Candidate 2, but this is extremely unlikely ($p = 2^{-1013} \approx 1.1 \times 10^{-305}$) given that the sample is random.
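The probability quoted on this slide is small enough that it is worth checking directly. The following sketch assumes each sampled voter independently supports the first candidate with probability 0.5, which is what an evenly split population and a random sample imply; the variable names are illustrative.

```python
import math

n_voters = 1013

# Probability that every one of the sampled voters backs the same given candidate.
p_all_same = 0.5 ** n_voters

# The same quantity on a log scale, which is easier to read for tiny numbers.
log10_p = -n_voters * math.log10(2)

print(p_all_same)
print(log10_p)
```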
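22. 22. Random Sampling Error. For a simple random sample from a large population, the maximum margin of error at a 95% confidence level is approximately $0.98/\sqrt{n}$, i.e., proportional to the inverse of the square root of the sample size $n$. This is the figure very typically reported.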
25. 25. Random Sampling Error. For a simple random sample from a large population, the margin of error is proportional to the inverse of the square root of the sample size. Example: A random sample of size 400 will give a margin of error, at a 95% confidence level, of 0.98/20 or 0.049, just under 5%. A random sample of size 1600 will give a margin of error of 0.98/40, or 0.0245, just under 2.5%. Notice: doubling the precision requires four times the sample size!
26. 26. The top portion of this graphic depicts the relative likelihood that the "true" percentage is in a particular area given a reported percentage of 50%. In other words, for each sample size, one is 95% confident that the "true" percentage is in the region indicated by the corresponding segment. The larger the sample is, the smaller the margin of error. The bottom portion shows 95% confidence intervals (horizontal line segments), the corresponding margins of error (on the left), and sample sizes (on the right).
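The 0.98 in these examples is 1.96 × 0.5: the 95% critical value of the normal distribution times the largest possible standard deviation of a proportion, which occurs at p = 0.5. The sketch below reproduces the two examples and inverts the formula to get a sample size for a target margin. The function names and the 3% target are illustrative assumptions, not taken from the slides.

```python
import math

def margin_of_error(n, z=1.96, p=0.5):
    """Margin of error for a proportion from a simple random sample of size n.

    z=1.96 corresponds to a 95% confidence level; p=0.5 is the worst case.
    """
    return z * math.sqrt(p * (1 - p) / n)

def sample_size_for(margin, z=1.96, p=0.5):
    """Smallest sample size whose margin of error does not exceed `margin`."""
    return math.ceil((z / margin) ** 2 * p * (1 - p))

for n in (400, 1600):
    print(n, round(margin_of_error(n), 4))

# Hypothetical target: a 3% margin of error at 95% confidence.
print(sample_size_for(0.03))
```

Because the sample size enters under a square root, halving the margin of error always means quadrupling n.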
27. 27. Central Limit Theorem Try this yourself: “Netlogo Central Limit Theorem” http://ccl.northwestern.edu/netlogo/models/run.cgi?CentralLimitTheorem.715.627
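For readers who prefer code to the NetLogo applet, here is a rough Python sketch of the same idea. It is not a port of that model. It draws many samples from a strongly right-skewed population (an exponential distribution with mean 1, chosen arbitrarily for illustration) and looks at how the sample means behave; the helper name `sample_means`, the sample size of 50, and the seed are all assumptions.

```python
import random
import statistics

def sample_means(draw, sample_size, n_samples, seed=0):
    """Draw n_samples samples of the given size and return each sample's mean."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean([draw(rng) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]

# Population: exponential with rate 1 (mean 1, standard deviation 1).
means = sample_means(lambda rng: rng.expovariate(1.0),
                     sample_size=50, n_samples=2000)

print("mean of sample means:", round(statistics.fmean(means), 3))
print("sd of sample means:  ", round(statistics.stdev(means), 3))
print("theory predicts sd ≈ ", round(1 / 50 ** 0.5, 3))
```

The sample means cluster around the population mean with a spread close to the population standard deviation divided by the square root of the sample size, even though the underlying data are far from normal.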
28. 28. Thinking of Data as a Distribution: Histogram. A histogram is a graphical representation showing a visual impression of the distribution of data. (1) It consists of tabular frequencies, shown as adjacent rectangles, erected over discrete intervals (bins). (2) The height of each rectangle is equal to the frequency density of the interval, i.e., the frequency divided by the width of the interval. (3) The total area of the histogram is equal to the number of data points.
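The three properties listed on this slide can be seen in a small sketch with unequal bin widths. The data values and bin edges below are invented for illustration, and the helper name `histogram` is an assumption. Each bin's height is its count divided by its width, so the areas add back up to the number of observations.

```python
def histogram(data, edges):
    """Return (low, high, count, density) for each bin.

    Bins are half-open [low, high), except the last, which includes its
    upper edge. Density is count divided by bin width.
    """
    rows = []
    n_bins = len(edges) - 1
    for i, (low, high) in enumerate(zip(edges, edges[1:])):
        is_last = i == n_bins - 1
        count = sum(
            1 for x in data
            if low <= x < high or (is_last and x == high)
        )
        rows.append((low, high, count, count / (high - low)))
    return rows

# Made-up travel times in minutes (illustrative only).
travel_times = [3, 7, 8, 12, 14, 15, 18, 22, 25, 31, 40, 55]
rows = histogram(travel_times, edges=[0, 10, 20, 30, 60])

for low, high, count, density in rows:
    print(f"[{low:>2}, {high:>2}): count={count}, density={density:.2f}")

# Sum of rectangle areas (height times width) equals the number of observations.
total_area = sum(density * (high - low) for low, high, _, density in rows)
print(total_area)
```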
29. 29. Thinking of Data as a Distribution: Histogram Histogram of travel time, US 2000 census. Area under the curve equals the total number of cases. This diagram uses Q/width from the table.
30. 30. Ordinary v. Cumulative Histogram
31. 31. http://www.socr.ucla.edu/htmls/SOCR_Charts.html http://www.socr.ucla.edu/ An Extra Online Resource
32. 32. Data as a Distribution. Try to start thinking of any data set as a distribution. This allows you to take a broader perspective about the observations contained therein. When you get a new dataset, you should generate some summary statistics such as (1) measures of central tendency and (2) measures of variation (including the first four moments of the distribution).
33. 33. Thinking of Data as a Distribution Moment 1 = Mean Moment 2 = Variance Moment 3 = Skewness Moment 4 = Kurtosis
34. 34. Describing the Shape of the Data
35. 35. Skewness skewness is a measure of the asymmetry of a distribution
36. 36. Skewness Skewness in the Context of the Measures of Central Tendency
37. 37. Skewness. A negative skew indicates that the tail on the left side of the probability density function is longer than the right side and the bulk of the values (possibly including the median) lie to the right of the mean.
38. 38. Skewness A positive skew indicates that the tail on the right side is longer than the left side and the bulk of the values lie to the left of the mean.
40. 40. Calculating Skewness. 1. Subtract the mean from each raw score (aka deviations from the mean). 2. Raise each of these deviations from the mean to the third power and sum (aka sum of third moment deviations). 3. Calculate skewness, which is the sum of the cubed deviations from the mean divided by the product of the number of cases minus 1 and the standard deviation raised to the third power: $\text{skewness} = \dfrac{\sum_i (x_i - \bar{x})^3}{(n - 1)\,s^3}$. Try This Problem: http://www.indiana.edu/~educy520/sec5982/week_12/skewness_demo.pdf
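The three steps translate almost line for line into Python. The sketch below reads step 3 as dividing by (n − 1) times the cubed sample standard deviation, which is an interpretation of the slide's wording; other textbooks and software packages use slightly different normalizations, so results may not match them exactly. The two data sets are invented for illustration.

```python
import statistics

def skewness(scores):
    """Skewness following the three steps on the slide."""
    mean = statistics.fmean(scores)

    # Step 1: deviations from the mean.
    deviations = [x - mean for x in scores]

    # Step 2: cube each deviation and sum.
    sum_cubed = sum(d ** 3 for d in deviations)

    # Step 3: divide by (n - 1) times the cubed sample standard deviation.
    s = statistics.stdev(scores)
    return sum_cubed / ((len(scores) - 1) * s ** 3)

symmetric = [1, 2, 3, 4, 5]        # made-up data, no skew
right_skewed = [1, 2, 2, 3, 12]    # made-up data, long right tail

print(skewness(symmetric))
print(skewness(right_skewed))
```

The symmetric data give a skewness of zero, and the data with one large value on the right give a positive skewness, in line with the definitions of negative and positive skew.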
41. 41. Calculating Skewness Try This Problem: http://www.indiana.edu/~educy520/sec5982/week_12/skewness_demo.pdf
42. 42. Kurtosis. Kurtosis is traditionally described as a measure of the "peakedness" of a distribution, although what it actually captures is the weight of the tails, i.e., how prone the distribution is to rare, extreme observations. In the traditional description, a high kurtosis distribution has a sharper peak and longer, fatter tails, while a low kurtosis distribution has a more rounded peak and shorter, thinner tails.
43. 43. Kurtosis. Distributions with zero excess kurtosis are called mesokurtic, or mesokurtotic. The most prominent example of a mesokurtic distribution is the normal distribution. A distribution with positive excess kurtosis is called leptokurtic, or leptokurtotic. "Lepto-" means "slender". In terms of shape, a leptokurtic distribution has a more acute peak around the mean and fatter tails. A distribution with negative excess kurtosis is called platykurtic, or platykurtotic. "Platy-" means "broad". In terms of shape, a platykurtic distribution has a lower, wider peak around the mean and thinner tails.
44. 44. Calculating Kurtosis. The moment coefficient of kurtosis of a data set is computed almost the same way as the coefficient of skewness: $\text{kurtosis} = m_4 / m_2^2$, where $m_4 = \sum_i (x_i - \bar{x})^4 / n$ and $m_2 = \sum_i (x_i - \bar{x})^2 / n$; and the "excess" kurtosis = kurtosis − 3. Note: the excess kurtosis is generally used because the excess kurtosis of a normal distribution is 0.
46. 46. Calculating Kurtosis. Example: n = 100, x̄ = 67.45, variance m2 = 8.5275. The kurtosis is 199.3760 / (8.5275)² = 2.7418 and the excess kurtosis is 2.7418 − 3 = −0.2582.
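The example's arithmetic can be reproduced in Python, along with a small function that computes kurtosis from raw data. The sketch assumes that 199.3760 is the fourth central moment m4 and that both moments divide by n, which is what the slide's numbers imply. The raw data behind the example are not given in the slides, so the function is demonstrated on a tiny made-up data set instead; the function names are illustrative.

```python
def central_moment(data, k):
    """k-th central moment, dividing by n (not n - 1)."""
    mean = sum(data) / len(data)
    return sum((x - mean) ** k for x in data) / len(data)

def kurtosis(data):
    """Moment coefficient of kurtosis: m4 / m2 squared."""
    return central_moment(data, 4) / central_moment(data, 2) ** 2

def excess_kurtosis(data):
    """Kurtosis minus 3, so that a normal distribution scores 0."""
    return kurtosis(data) - 3

# Reproduce the slide's arithmetic from its summary numbers.
m2, m4 = 8.5275, 199.3760
slide_kurtosis = m4 / m2 ** 2
print(round(slide_kurtosis, 4), round(slide_kurtosis - 3, 4))

# Made-up data set (illustrative only).
toy = [1, 2, 3, 4, 5]
print(kurtosis(toy), excess_kurtosis(toy))
```

The toy data set is flat with no extreme values, so its excess kurtosis is negative (platykurtic), like the example on the slide.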
47. 47. An Extra Online Resource
48. 48. Calculating Skew & Kurtosis http://www.youtube.com/watch?v=eKwJUWkD2FQ
49. 49. Daniel Martin Katz. @computational, computationallegalstudies.com, lexpredict.com, danielmartinkatz.com. Illinois Tech - Chicago-Kent College of Law