Quantitative Methods for Lawyers - Class #8 - Bayes Rule and Conditional Probability - Professor Daniel Martin Katz


Published in: Law, Technology, News & Politics

  1. Quantitative Methods for Lawyers: Conditional Probability & Bayes Theorem. Class #8. Professor Daniel Martin Katz. @computational, computationallegalstudies.com, danielmartinkatz.com, lexpredict.com, slideshare.net/DanielKatz
  2. Conditional probability is an important concept and a precursor to discussing Bayes Rule. Conditional probability relies on a little bit of set theory: the probability of “A given B” is the probability of A intersect B divided by the probability of B, that is, $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$.
  3. In a conditional probability problem, the sample space is “reduced” to the “space” of the given outcome (e.g., if given B, we now care only about the probability of A occurring “inside” of B). Given B, what’s the probability of A? [Figure: A Visual Depiction of Conditional Probability.] Intuitively we are asking: what share of B contains the overlap with A?
  4. A Dice-Based Example: What is the probability of getting a “2” if we know that the number thrown is less than 5?
  5. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5?
  6. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5? Again, here is our formula: $P(A \mid B) = \frac{P(A \cap B)}{P(B)}$
  7. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5? Again, here is our formula, applied to this problem: $P(\text{“2”} \mid \text{less than 5}) = \frac{P(\text{“2”} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})}$
  8. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5? $P(\text{“2”} \mid \text{less than 5}) = \frac{P(\text{“2”} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})}$
  9. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5? $P(\text{“2”} \mid \text{less than 5}) = \frac{P(\text{“2”} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})}$. Okay, what is $P(\text{“2”} \cap \{1,2,3,4\})$? The only element that intersects is “2”, so it is the probability of “2”, which is 1/6.
  10. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5? $P(\text{“2”} \mid \text{less than 5}) = \frac{P(\text{“2”} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})}$. Okay, what is $P(\text{“2”} \cap \{1,2,3,4\})$? The only element that intersects is “2”, so it is the probability of “2”, which is 1/6. Now what is $P(\{1,2,3,4\})$? 1/6 + 1/6 + 1/6 + 1/6 = 4/6.
  11. A Dice-Based Example: What is the probability of getting a “2” given that the number thrown is less than 5? $P(\text{“2”} \mid \text{less than 5}) = \frac{P(\text{“2”} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})}$. Okay, what is $P(\text{“2”} \cap \{1,2,3,4\})$? The only element that intersects is “2”, so it is the probability of “2”, which is 1/6. Now what is $P(\{1,2,3,4\})$? 1/6 + 1/6 + 1/6 + 1/6 = 4/6. Okay, let’s put it all together: $\frac{P(\text{“2”} \cap \{1,2,3,4\})}{P(\{1,2,3,4\})} = \frac{1/6}{4/6} = \frac{1}{4}$
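The dice calculation above can be checked by enumerating the sample space. The following is a minimal Python sketch, assuming a fair six-sided die; the helper names `prob` and `conditional_prob` are illustrative and do not come from the slides.

```python
from fractions import Fraction

# Sample space for one fair six-sided die; each face has probability 1/6.
sample_space = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Probability of an event (a set of faces) under a fair die."""
    return Fraction(len(event & sample_space), len(sample_space))

def conditional_prob(a, b):
    """P(A | B) = P(A intersect B) / P(B); undefined when P(B) is zero."""
    if prob(b) == 0:
        raise ValueError("P(B) is zero, so P(A | B) is undefined")
    return prob(a & b) / prob(b)

rolled_two = {2}
less_than_five = {1, 2, 3, 4}

p_joint = prob(rolled_two & less_than_five)  # only "2" is in both sets: 1/6
p_given = prob(less_than_five)               # 4/6, which Fraction stores as 2/3
answer = conditional_prob(rolled_two, less_than_five)
print(answer)  # 1/4
```

Using `Fraction` rather than floating point keeps the arithmetic exact, so the result matches the hand calculation term by term.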
  12. Monty Hall Problem
  13. Monty Hall Problem: In “Let’s Make a Deal” you are given the opportunity to select one closed door of three, behind one of which there is a prize. The other two doors hide “goats” (or some other such “non-prize”), or nothing at all. Once you have made your selection, Monty Hall will open one of the remaining doors, revealing that it does not contain the prize.
  14. Monty Hall Problem: Assume you picked Door #1. Now assume Monty has removed Door #2. Here is the problem: should you stay or should you switch?
  15. Monty Hall Problem: The answer is that you should switch. This is counterintuitive. Key fact: the host always opens a door to reveal a goat (if not, the properties of the problem would change).
  16. Monty Hall Problem: [Figure: tree showing the probability of every possible outcome if the player initially picks Door 1.]
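The outcome tree for a player who initially picks Door 1 can be rebuilt by enumeration. This is a sketch under two assumptions: the prize is equally likely to be behind each door, and when Monty has two goat doors to choose from he picks between them uniformly (the slides do not state his tie-breaking rule). The variable names are illustrative.

```python
from fractions import Fraction

DOORS = (1, 2, 3)
PICK = 1  # the player's initial choice, as on the slide

joint = {}  # (prize door, door Monty opens) -> probability of that branch
stay_wins = Fraction(0)
switch_wins = Fraction(0)

for prize in DOORS:
    # Monty may only open a door that is neither the pick nor the prize.
    options = [d for d in DOORS if d != PICK and d != prize]
    for opened in options:
        # Assumption: uniform choice when two doors are available to open.
        p_branch = Fraction(1, 3) * Fraction(1, len(options))
        joint[(prize, opened)] = p_branch
        switched_to = next(d for d in DOORS if d not in (PICK, opened))
        if prize == PICK:
            stay_wins += p_branch
        if prize == switched_to:
            switch_wins += p_branch

# Conditional probability for the specific case "Monty opened Door 2".
p_opened_2 = sum(p for (_, opened), p in joint.items() if opened == 2)
p_prize_3_given_opened_2 = joint[(3, 2)] / p_opened_2

print(stay_wins, switch_wins, p_prize_3_given_opened_2)  # 1/3 2/3 2/3
```

The last line applies the conditional probability formula from the dice example: the branch where the prize is behind Door 3 and Monty opens Door 2, divided by the total probability that Monty opens Door 2.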
  17. Monty Hall Problem: There are 100 doors to pick from in the beginning. You pick one door. Monty looks at the 99 others, finds the goats, and opens all but 1. Do you stick with your original door (1/100), or switch to the other door, which was filtered from 99? It’s a bit clearer now: Monty is taking a set of 99 choices and improving them by removing 98 goats. When he’s done, he has the top door out of 99 for you to pick. Your decision: do you want a random door out of 100 (your initial guess) or the best door out of 99? Said another way, do you want 1 random chance or the best of 99 random chances? We’re starting to see why Monty’s actions help us. He’s letting us choose between a generic, random choice and a curated, filtered choice. Filtered is better.
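The 100-door version lends itself to a quick simulation. The sketch below assumes Monty always knows where the prize is and never reveals it; the function names, the trial count, and the fixed seed are illustrative choices, not part of the slides.

```python
import random

def play(n_doors, switch, rng):
    """Play one game; return True if the player ends up with the prize."""
    prize = rng.randrange(n_doors)
    pick = rng.randrange(n_doors)
    # Monty opens every other door except one, never revealing the prize.
    if pick == prize:
        # The player already holds the prize, so the door left closed is a random goat.
        others = [d for d in range(n_doors) if d != pick]
        left_closed = rng.choice(others)
    else:
        # Otherwise the one door Monty leaves closed must be the prize.
        left_closed = prize
    final = left_closed if switch else pick
    return final == prize

def win_rate(n_doors, switch, trials=20_000, seed=0):
    """Estimate the win probability over many simulated games."""
    rng = random.Random(seed)
    wins = sum(play(n_doors, switch, rng) for _ in range(trials))
    return wins / trials

stay_rate = win_rate(100, switch=False)
switch_rate = win_rate(100, switch=True)
print(f"stay: {stay_rate:.3f}  switch: {switch_rate:.3f}")
```

With 100 doors the estimates should land near 1/100 for staying and 99/100 for switching; calling `win_rate(3, switch=True)` gives an estimate near 2/3 for the original three-door game.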
  18. Bayes Rule
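The transcript preserves only the title of this slide. For reference, the standard statement of Bayes Rule follows directly from the definition of conditional probability given on slide 2; the derivation below is the textbook one and is not taken from the slide itself. The symbol $\neg A$ (the event "not A") is notation introduced here.

```latex
\begin{align*}
P(A \mid B) &= \frac{P(A \cap B)}{P(B)}, \qquad
P(B \mid A) = \frac{P(A \cap B)}{P(A)} \\
\Rightarrow \quad P(A \mid B) &= \frac{P(B \mid A)\, P(A)}{P(B)} \\
\text{where} \quad P(B) &= P(B \mid A)\, P(A) + P(B \mid \neg A)\, P(\neg A)
\end{align*}
```

The last line is the law of total probability, which is what makes the denominator computable in practice.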
  19. Bayes Rule in Spam Filtering
  20. Spam Filtering: Fighting spam is a constant exercise. As the junk filters become more intelligent, the spam senders come up with innovative means to ensure their emails reach your inbox. The automatic identification of spam and phishing scams is usually coupled with a “human” element. This human element has to be weighted / blended.
  21. Spam Filtering: When many people mark an email message as spam, the filter will eventually “update” (using an updating rule). The properties of a spam message are constantly in flux. Thus, spam filters need to be taught constantly.
  22. Spam Filtering: The key insight is that when developing a filter we are trying to mimic the information that allows you (as a human reasoner) to rapidly detect that a message is spam: (1) the message is from another country (in particular China, Nigeria, India, etc.); (2) the message is from a new email address; (3) ... what else?
  23. Bayes Rule in Spam Filtering: Some of the same properties at work in spam filtering are those at work in E-Discovery. Bayesian spam filters calculate the probability of a message being spam based on its contents. Unlike simple content-based filters, Bayesian spam filtering learns from spam and from good mail, resulting in a very robust, adaptive, and efficient anti-spam approach that, best of all, returns hardly any false positives.
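To make "calculate the probability of a message being spam based on its contents" concrete, here is a minimal single-word sketch of Bayes Rule in Python. All numbers are hypothetical, chosen only for illustration, and the function name `p_spam_given_word` is not from the slides; real Bayesian filters combine evidence from many words.

```python
from fractions import Fraction

def p_spam_given_word(p_word_given_spam, p_word_given_ham, p_spam):
    """Bayes Rule: P(spam | word) = P(word | spam) P(spam) / P(word)."""
    p_ham = 1 - p_spam
    numerator = p_word_given_spam * p_spam
    # Law of total probability: the word appears in either spam or good mail.
    p_word = numerator + p_word_given_ham * p_ham
    return numerator / p_word

# Hypothetical inputs: 40% of all mail is spam; a given word appears in
# half of spam messages but in only 1 of 20 legitimate messages.
posterior = p_spam_given_word(
    p_word_given_spam=Fraction(1, 2),
    p_word_given_ham=Fraction(1, 20),
    p_spam=Fraction(2, 5),
)
print(posterior, float(posterior))  # 20/23, roughly 0.87
```

Seeing the word moves the probability of spam from the 40% prior to about 87%, which is the "updating" the slides describe.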
  24. Bayes Rule in Spam Filtering: Type 1 vs. Type 2 error trade-off. Type 1 = false positive (convict someone or something that is innocent). Type 2 = false negative (fail to convict someone or something that is guilty). Which would we rather have in this context?
  25. Bayes Rule in Spam Filtering: Type 1 vs. Type 2 error trade-off. Type 1 = false positive (convict someone or something that is innocent). Type 2 = false negative (fail to convict someone or something that is guilty). Which would we rather have in this context? Answer: allow some spam messages to go to your inbox, that is, accept some false negatives rather than block legitimate mail.
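The trade-off can be seen by moving the cutoff at which a message is flagged. The sketch below uses a small, made-up set of scored messages (the scores and labels are hypothetical) and counts both error types at three thresholds; `error_counts` is an illustrative helper name.

```python
# Each entry is (spam score between 0 and 1, whether the message really is spam).
# These eight messages are invented purely for illustration.
scored_messages = [
    (0.95, True), (0.80, True), (0.60, True), (0.40, True),
    (0.70, False), (0.30, False), (0.20, False), (0.05, False),
]

def error_counts(scored, threshold):
    """Return (false positives, false negatives) when flagging scores >= threshold."""
    # Type 1: legitimate mail that gets flagged as spam.
    false_pos = sum(1 for score, is_spam in scored if score >= threshold and not is_spam)
    # Type 2: spam that slips through to the inbox.
    false_neg = sum(1 for score, is_spam in scored if score < threshold and is_spam)
    return false_pos, false_neg

for threshold in (0.25, 0.50, 0.75):
    fp, fn = error_counts(scored_messages, threshold)
    print(f"threshold {threshold:.2f}: false positives={fp}, false negatives={fn}")
# threshold 0.25: false positives=2, false negatives=0
# threshold 0.50: false positives=1, false negatives=1
# threshold 0.75: false positives=0, false negatives=2
```

Raising the threshold trades Type 1 errors for Type 2 errors: fewer legitimate messages are blocked, at the cost of letting more spam through.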
  26. Bayes Rule in Spam Filtering: Basic scoring. A content-based spam filter looks for words and other characteristics typical of spam. Every characteristic element is assigned a score, and a spam score for the whole message is computed from the individual scores. Some scoring filters also look for characteristics of legitimate mail, which lower the overall score.
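A basic scoring filter of this kind can be sketched in a few lines of Python. The word list, the weights, and the threshold below are all hypothetical values picked for the example; a production filter would use many more characteristics than individual words.

```python
import re

# Hypothetical per-word scores: positive values are typical of spam,
# negative values are typical of legitimate mail and lower the total.
WEIGHTS = {
    "free": 2.0,
    "winner": 3.0,
    "prize": 1.5,
    "meeting": -2.0,
    "attached": -1.5,
}
SPAM_THRESHOLD = 4.0  # assumed cutoff; messages scoring at or above it are flagged

def spam_score(text):
    """Sum the scores of every known word in the message."""
    words = re.findall(r"[a-z]+", text.lower())
    return sum(WEIGHTS.get(word, 0.0) for word in words)

def is_spam(text):
    """Flag the message when its total score reaches the threshold."""
    return spam_score(text) >= SPAM_THRESHOLD

print(spam_score("You are a WINNER! Claim your free prize"))  # 6.5
print(spam_score("Agenda for the meeting is attached"))       # -3.5
print(is_spam("Free lunch at the meeting"))                   # False
```

In the last message the spam-typical word "free" is offset by the legitimate-mail word "meeting", which is the lowering effect the slide mentions.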
  27. Train the Filter: In light of what you have now identified as spam, update the scoring methods or properties that the spam filter uses. Wisdom of crowds: leverage a large data set to see what the crowd thinks is spam.
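One way to picture "training" is a filter that keeps word counts and recomputes its probabilities each time users label a message. The class below is a minimal sketch, not the method of any particular product; the class name, the add-one smoothing, and the example messages are all assumptions made for illustration.

```python
from collections import Counter

class TrainableFilter:
    """Keeps per-word counts for spam and good mail, updated from user labels."""

    def __init__(self):
        self.word_counts = {True: Counter(), False: Counter()}
        self.message_counts = {True: 0, False: 0}

    def train(self, text, is_spam):
        """Record one labeled message (each distinct word counted once)."""
        self.message_counts[is_spam] += 1
        self.word_counts[is_spam].update(set(text.lower().split()))

    def _p_word_given(self, word, is_spam):
        # Add-one smoothing (an assumption) avoids zero probabilities.
        return (self.word_counts[is_spam][word] + 1) / (self.message_counts[is_spam] + 2)

    def p_spam_given_word(self, word):
        """Bayes Rule using the counts gathered so far."""
        total = sum(self.message_counts.values())
        p_spam = (self.message_counts[True] + 1) / (total + 2)
        numerator = self._p_word_given(word, True) * p_spam
        denominator = numerator + self._p_word_given(word, False) * (1 - p_spam)
        return numerator / denominator

spam_filter = TrainableFilter()
spam_filter.train("project offer letter attached", is_spam=False)
spam_filter.train("cheap pills", is_spam=True)
before = spam_filter.p_spam_given_word("offer")

# The crowd marks three new messages containing "offer" as spam.
for text in ("limited offer today", "special offer inside", "offer expires soon"):
    spam_filter.train(text, is_spam=True)
after = spam_filter.p_spam_given_word("offer")

print(f"before: {before:.3f}  after: {after:.3f}")
```

After the crowd's labels arrive, the filter's estimate for "offer" rises, which is the update step the slide describes.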
  28. Example from an Info Tech Company: http://www.bluewatermedia.com/support/spam-filter.html
  29. Bayes Rule in Spam Filtering: http://www.bluewatermedia.com/support/spam-filter.html
  30. Keep thinking about the relationship between spam filters and E-Discovery / automated document review.
  31. Daniel Martin Katz, @computational, computationallegalstudies.com, lexpredict.com, danielmartinkatz.com, Illinois Tech - Chicago-Kent College of Law
