Class 6
Overfitting, Underfitting, & Cross-validation
Legal Analytics
Professor Daniel Martin Katz
Professor Michael J Bommarito II
legalanalyticscourse.com
Model Fit
access more at legalanalyticscourse.com
We are interested in how well
a given model performs
both on existing data
Underfitting occurs when a
statistical model or algorithm
cannot capture the
underlying trend of the data
an underfit
model
has
low variance,
high bias
Overfitting occurs when a
statistical model or algorithm
captures the noise of the data
(as opposed to the signal)
an overfit
model
has
low bias,
high variance
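For reference, the bias and variance named on these slides are the terms of the standard decomposition of expected squared error. The notation is introduced here and is not from the slides: f is the true function, \hat f is the fitted model, and the data follow y = f(x) + e with E[e] = 0 and Var(e) = \sigma^2. Expectations are taken over training samples and noise.

```latex
\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[\big(\hat f(x) - \mathbb{E}[\hat f(x)]\big)^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

An underfit model is dominated by the first term, an overfit model by the second.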
Model Fit
is all about
generalization
Underfitting/Overfitting
The challenge of generalization
Why is generalization hard?
Learning, machine or otherwise, looks something like this:
•  We are presented with a view of objects in the world.
•  We encode aspects of these objects, e.g., colors, into “features.”
•  We generalize from patterns in these features to statements about objects.
Example:
•  We spend a summer on Michigan lakes and see many animals. All swans that we
see are white. We generalize from this sample to the statement that all swans are
white.
What went wrong? Mathematically speaking, we did not observe enough
variance in our observed sample; in fact, our observed variance for the color
feature was zero!
Underfitting
Zero variance in our observed sample led to a model with a constant
predicted value; this model underfits the true variance of swans.
Underfitting is, in essence, model simplification or ignorance of signal.
Underfit models may perform well on modal data, but they typically struggle
with lower-frequency or more complex cases.
Underfitting can occur for a number of reasons:
•  The model is too simple for the actual system. Technically speaking, either the
model does not contain enough parameters or the functional forms are not capable of
spanning the true functions.
•  The number of records or variance of the records does not provide the learning
process with enough information.
Underfitting
Let’s look at a simple example – fitting a quadratic equation with a linear
function.
Quadratic functions look like this:
y = a x^2 + b x + c
A function is therefore defined by supplying three parameters: a, b, and c.
To make this realistic, let’s add some simple N(0,1) random errors, giving us
the form:
y = a x^2 + b x + c + e
where e is distributed N(0,1).
Underfitting
Example:
y = x^2 + 2 x + 1 + e
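A minimal sketch of how data like this could be simulated, assuming NumPy. The helper name `make_quadratic_data`, the sample size, the seed, and the uniform draw of x on (-4, 4) are illustrative choices, not details taken from the slides.

```python
import numpy as np

def make_quadratic_data(n=100, a=1.0, b=2.0, c=1.0, x_min=-4.0, x_max=4.0, seed=0):
    """Sample n points from y = a*x^2 + b*x + c + e, with e ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(x_min, x_max, size=n)        # assumed sampling scheme for x
    e = rng.normal(loc=0.0, scale=1.0, size=n)   # N(0, 1) errors
    y = a * x**2 + b * x + c + e
    return x, y

# Defaults correspond to the example y = x^2 + 2 x + 1 + e.
x, y = make_quadratic_data()
print(x[:3], y[:3])
```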
Underfitting
What happens if we try to fit a model to this data? First, let’s start with a
simple linear function, i.e., linear regression.
Our linear form looks like this:
y = a x + b + e
A model is therefore defined by supplying two parameters: a and b.
Underfitting
Example:
y = 1.94 x + 6.62
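A fit of this kind can be sketched with `numpy.polyfit`. The simulated sample below (200 points, x uniform on (-4, 4), fixed seed) is an assumption, so the estimated coefficients will differ from the values on the slide. What carries over is the pattern: a slope near 2, an intercept pushed well above the true value of 1 because the line has to absorb the average of x^2, and a large residual error.

```python
import numpy as np

# Simulated data from y = x^2 + 2x + 1 + e (sample size and seed are assumptions).
rng = np.random.default_rng(0)
x = rng.uniform(-4.0, 4.0, size=200)
y = x**2 + 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=200)

# Least-squares line; polyfit returns coefficients from highest degree to lowest.
slope, intercept = np.polyfit(x, y, deg=1)

# Mean squared error of the line on the data it was fit to.
residuals = y - (slope * x + intercept)
mse_linear = float(np.mean(residuals**2))

print(f"y = {slope:.2f} x + {intercept:.2f}, training MSE = {mse_linear:.2f}")
```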
Underfitting
This linear model clearly does not capture the non-linear relationship
between x and y.
In fact, no combination of a and b will successfully match the data across all x,
since a linear function is just too simple to represent a non-linear relationship.
Linear models have too few parameters to fit non-linear relationships! Thus, they
will typically underfit non-linear data.
(fit quadratic model below)
Overfitting
Overfitting is the opposite of underfitting, and it occurs when a model
codifies noise into the structure.
Overfitting may occur for a number of reasons:
•  Models that are much more complex than the underlying data, either in terms of
functional form or number of parameters.
•  Learning that is too focused on minimizing the loss function for a single training
sample.
Overfitting
Let’s return to our quadratic example from before. As we discussed, our
quadratic data was generated by a model with three parameters: a, b, and c.
When we tried to explain the data with just two parameters, the resulting
model underfit the data and did a poor job.
When we tried to explain the data with three parameters, the resulting
model did an excellent job of fitting the data.
What happens if we try to explain the data with eight parameters?
Overfitting
First, let’s focus on the portion of data that we saw in our training set before
– the range where x lies between -4 and 4.
At first blush, it looks like we’ve done an excellent job. Compared to our
three-parameter quadratic fit, we have done an even better job of reducing the
sum of our squared residuals. Why not always use more parameters?
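The comparison can be sketched as follows, assuming the larger model is a degree-seven polynomial (eight coefficients) fit by least squares with `numpy.polyfit`. The sample size and seed are illustrative. Because each lower-degree polynomial is a special case of the higher-degree one, the training sum of squared residuals cannot increase as parameters are added.

```python
import numpy as np

# Training data drawn only from the range -4 < x < 4 (size and seed are assumptions).
rng = np.random.default_rng(1)
x_train = rng.uniform(-4.0, 4.0, size=30)
y_train = x_train**2 + 2.0 * x_train + 1.0 + rng.normal(0.0, 1.0, size=30)

def training_ssr(degree):
    """Sum of squared residuals of a least-squares polynomial on the training data."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    fitted = np.polyval(coeffs, x_train)
    return float(np.sum((y_train - fitted) ** 2))

# Degree d has d + 1 parameters: 2 (linear), 3 (quadratic), 8 (degree seven).
ssr = {degree: training_ssr(degree) for degree in (1, 2, 7)}
for degree, value in ssr.items():
    print(f"degree {degree} ({degree + 1} parameters): training SSR = {value:.2f}")
```

A lower training error here says nothing yet about how the model behaves on data it has not seen.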
Overfitting
But what happens if we look outside of this (-4, 4) range? It turns out that
we’ve committed two common overfitting mistakes:
•  Our model is much more complex than the underlying data. Quadratic relationships
are built on three parameters, whereas our model uses eight. When we minimized
our loss function, the extra five parameters were used to fit to noise, not signal!
•  Our model was trained on a very narrow sample of the world. While we do an
excellent job of predicting values between -4 and 4, we do a very poor job outside of
this range.
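Both mistakes show up when the fitted models are scored on points outside the training range. The sketch below assumes a degree-seven polynomial as the eight-parameter model. The evaluation points `x_out` between 4.5 and 8 in absolute value, the sample sizes, and the seed are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def true_curve(x):
    """Signal used in the running example: y = x^2 + 2x + 1."""
    return x**2 + 2.0 * x + 1.0

# Training data only covers -4 < x < 4.
x_train = rng.uniform(-4.0, 4.0, size=30)
y_train = true_curve(x_train) + rng.normal(0.0, 1.0, size=30)

# Unseen points outside the training range (an assumed evaluation grid).
x_out = np.concatenate([np.linspace(-8.0, -4.5, 20), np.linspace(4.5, 8.0, 20)])
y_out = true_curve(x_out) + rng.normal(0.0, 1.0, size=x_out.size)

def mse(coeffs, x, y):
    """Mean squared error of a fitted polynomial on the points (x, y)."""
    return float(np.mean((y - np.polyval(coeffs, x)) ** 2))

results = {}
for degree in (2, 7):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = (mse(coeffs, x_train, y_train), mse(coeffs, x_out, y_out))
    print(f"degree {degree}: in-range MSE = {results[degree][0]:.2f}, "
          f"out-of-range MSE = {results[degree][1]:.2f}")
```

The higher-degree terms that were fit to noise are multiplied by large powers of x outside the training range, which is why the out-of-range error of the larger model tends to blow up.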
Generalizing safely
So what can we do to safely generalize? Two of the most common approaches
are regularization and cross-validation.
Regularization is …
Cross-validation
Cross-validation, like regularization, is meant to prevent the learning
process from codifying sample-specific noise as structure.
However, unlike regularization, cross-validation does not impose any
geometric constraints on the shape or “feel” of our learning solution, i.e.,
model.
Instead, it focuses on repeating the learning task on multiple samples of
training data, then evaluating the performance of these models on the
“held-out” or unseen data.
Cross-validation: K-fold
The most common approach to cross-validation is to divide the training set of
data into K distinct partitions of equal size. K-1 of these partitions are then
used to learn a model. The resulting model is then used to predict the Kth,
held-out partition. This process is repeated K times, so that each partition is
held out once, and the K held-out errors are averaged to estimate how well the
model generalizes.
http://genome.tugraz.at/proclassify/help/pages/XV.html
http://stats.stackexchange.com/questions/1826/cross-validation-in-plain-english
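A minimal sketch of K-fold cross-validation on the quadratic example, written directly in NumPy rather than with a machine learning library. It scores each candidate polynomial degree by its held-out mean squared error averaged over the K folds, which is the usual way to use the procedure for model comparison. The function name `k_fold_mse`, K = 5, the sample size, and the seeds are illustrative assumptions.

```python
import numpy as np

def k_fold_mse(x, y, degree, k=5, seed=0):
    """Average held-out MSE of a degree-`degree` polynomial over k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))     # shuffle before partitioning
    folds = np.array_split(indices, k)    # k disjoint partitions
    fold_errors = []
    for i in range(k):
        test_idx = folds[i]               # the held-out partition
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
        predictions = np.polyval(coeffs, x[test_idx])
        fold_errors.append(np.mean((y[test_idx] - predictions) ** 2))
    return float(np.mean(fold_errors))

# Simulated data from y = x^2 + 2x + 1 + e (size and seed are assumptions).
rng = np.random.default_rng(3)
x = rng.uniform(-4.0, 4.0, size=60)
y = x**2 + 2.0 * x + 1.0 + rng.normal(0.0, 1.0, size=60)

cv_scores = {degree: k_fold_mse(x, y, degree) for degree in (1, 2, 7)}
for degree, score in cv_scores.items():
    print(f"degree {degree}: cross-validated MSE = {score:.2f}")
```

Unlike the training error, the cross-validated error does not automatically improve as parameters are added, so it can be used to choose between the linear, quadratic, and higher-degree fits.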
“Cross-validation is widely used to check model error by
testing on data not part of the training set. Multiple rounds
with randomly selected test sets are averaged together to
reduce variability of the cross-validation; high variability of
the model will produce high average errors on the test set.
One way of resolving the trade-off is to use mixture models
and ensemble learning. For example, boosting combines many
‘weak’ (high bias) models in an ensemble that has greater
variance than the individual models, while bagging combines
‘strong’ learners in a way that reduces their variance.”
http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29
Legal Analytics
Class 6 - Overfitting, Underfitting, & Cross-Validation
daniel martin katz
twitter | @computational
blog | ComputationalLegalStudies
corp | LexPredict
site | danielmartinkatz.com
michael j bommarito
twitter | @mjbommar
blog | ComputationalLegalStudies
corp | LexPredict
site | bommaritollc.com
more content available at legalanalyticscourse.com

Legal Analytics Course - Class 6 - Overfitting, Underfitting, & Cross-Validation - Professor Daniel Martin Katz + Professor Michael J Bommarito

  • 1. Class 6 Overfitting, Underfitting, & Cross-validation Legal Analytics Professor Daniel Martin Katz Professor Michael J Bommarito II legalanalyticscourse.com
  • 2. Model Fit access more at legalanalyticscourse.com
  • 3. We are interested in how well a given model performs
  • 4. both on existing data
  • 5. Underfitting occurs when a statistical model or algorithm cannot capture the underlying trend of the data
  • 6. an underfit model has low variance, high bias
  • 7. Overfitting occurs when a statistical model or algorithm captures the noise of the data (as opposed to the signal)
  • 8. an overfit model has low bias, high variance
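
The bias and variance named in these two slides can be made concrete with the standard decomposition of expected squared prediction error. This is a sketch under stated assumptions: the symbols are not from the slides. Here f is the true function, \hat{f} is the model learned from a random training sample, and y = f(x) + e with E[e] = 0 and Var(e) = \sigma^2.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

An underfit model is dominated by the first term, an overfit model by the second.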
  • 9. Model Fit is all about generalization
  • 10.
  • 12. Why is generalization hard? Learning, machine or otherwise, looks something like this:
      - We are presented with a view of objects in the world.
      - We encode aspects of these objects, e.g., colors, into “features.”
      - We generalize from patterns in these features to statements about objects.
    Example:
      - We spend a summer on Michigan lakes and see many animals. All swans that we see are white. We generalize from this sample to the statement that all swans are white.
    What went wrong? Mathematically speaking, we did not observe enough variance in our observed sample; in fact, our observed variance for the color feature was zero!
  • 13. Underfitting Zero variance in our observed sample led to a model with a constant predicted value; this model underfits the true variance of swans. Underfitting is, in essence, model simplification or ignorance of signal. Underfit models may perform well on modal data, but they typically struggle with lower-frequency or more complex cases. Underfitting can occur for a number of reasons:
      - The model is too simple for the actual system. Technically speaking, either the model does not contain enough parameters or the functional forms are not capable of spanning the true functions.
      - The number of records or variance of the records does not provide the learning process with enough information.
  • 14. Underfitting Let’s look at a simple example – fitting a quadratic equation with a linear function. Quadratic functions look like this:
      y = a x^2 + b x + c
    A function is therefore defined by supplying three parameters: a, b, and c. To make this realistic, let’s add some simple N(0,1) random errors, giving us the form:
      y = a x^2 + b x + c + e
    where e is distributed N(0,1).
  • 15. Underfitting Example: y = x^2 + 2 x + 1 + e
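
A minimal sketch of how data of this form could be simulated. The coefficients a = 1, b = 2, c = 1 and the N(0,1) error come from the slide; the sample size, the x range of -4 to 4, the seed, and the function name `make_quadratic_data` are assumptions made for illustration.

```python
import numpy as np

def make_quadratic_data(n=50, a=1.0, b=2.0, c=1.0, x_min=-4.0, x_max=4.0, seed=0):
    """Sample n points from y = a x^2 + b x + c + e, with e ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(x_min, x_max, size=n)   # assumed sampling range for x
    e = rng.normal(0.0, 1.0, size=n)        # N(0, 1) random errors
    y = a * x**2 + b * x + c + e
    return x, y

x, y = make_quadratic_data()
print(x[:3], y[:3])
```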
  • 16. Underfitting What happens if we try to fit a model to this data? First, let’s start with a simple linear function, i.e., linear regression. Our linear form looks like this:
      y = a x + b + e
    A model is therefore defined by supplying two parameters: a and b.
  • 17. Underfitting Example: y = 1.94 x + 6.62
  • 18. Underfitting This linear model clearly does not capture the non-linear relationship between x and y. In fact, no combination of a and b will match the data across all x, since a linear model is simply too simple to represent a non-linear relationship. Linear models have too few parameters to fit non-linear models! Thus, they will typically underfit non-linear models. (fit quadratic model below)
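
The comparison between the two-parameter and three-parameter fits can be sketched with NumPy's least-squares polynomial fitting. The simulated sample (size, range, seed) is an assumption, so the fitted coefficients will not match the slide's y = 1.94 x + 6.62 exactly.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-4.0, 4.0, size=50)                   # assumed x range
y = x**2 + 2 * x + 1 + rng.normal(0.0, 1.0, size=50)  # y = x^2 + 2x + 1 + e

def sum_squared_residuals(degree):
    """Least-squares polynomial fit of the given degree; returns the SSR."""
    coeffs = np.polyfit(x, y, deg=degree)
    return float(np.sum((y - np.polyval(coeffs, x)) ** 2))

ssr_linear = sum_squared_residuals(1)      # two parameters: a, b
ssr_quadratic = sum_squared_residuals(2)   # three parameters: a, b, c

print(f"linear SSR:    {ssr_linear:.1f}")
print(f"quadratic SSR: {ssr_quadratic:.1f}")
```

The linear fit leaves a much larger sum of squared residuals because no straight line can follow the curvature in the data.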
  • 19. Overfitting Overfitting is the opposite of underfitting, and it occurs when a model codifies noise into the structure. Overfitting may occur for a number of reasons:
      - Models that are much more complex than the underlying data, either in terms of functional form or number of parameters.
      - Learning that is too focused on minimizing the loss function for a single training sample.
  • 20. Overfitting Let’s return to our quadratic example from before. As we discussed, our quadratic data was generated by a model with three parameters: a, b, and c. When we tried to explain the data with just two parameters, the resulting model underfit the data and did a poor job. When we tried to explain the data with three parameters, the resulting model did an excellent job of fitting the data. What happens if we try to explain the data with eight parameters?
  • 21. Overfitting First, let’s focus on the portion of data that we saw in our training set before – the range where x lies between -4 and 4. At first blush, it looks like we’ve done an excellent job. Compared to our three-parameter quadratic fit, we have done an even better job of reducing the sum of our squared residuals. Why not always use more parameters?
  • 22. Overfitting But what happens if we look outside of this (-4, 4) range? It turns out that we’ve committed two common overfitting mistakes:
      - Our model is much more complex than the underlying data. Quadratic relationships are built on three parameters, whereas our model uses eight. When we minimized our loss function, the extra five parameters were used to fit to noise, not signal!
      - Our model was trained on a very narrow sample of the world. While we do an excellent job of predicting values between -4 and 4, we do a very poor job outside of this range.
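
Both mistakes can be reproduced in a small experiment. This sketch assumes the eight-parameter model is a degree-7 polynomial, and the training sample size, the seed, and the out-of-range test interval of 6 to 8 are likewise illustrative choices rather than the slides' own setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def true_fn(x):
    """Noise-free signal behind the example data: y = x^2 + 2x + 1."""
    return x**2 + 2 * x + 1

# Training sample drawn only from the narrow (-4, 4) range.
x_train = rng.uniform(-4.0, 4.0, size=30)
y_train = true_fn(x_train) + rng.normal(0.0, 1.0, size=30)

# Points outside the training range (assumed interval).
x_outside = np.linspace(6.0, 8.0, 50)

def fit_and_score(degree):
    """Return (training SSR, mean squared error against the signal outside the range)."""
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    train_ssr = float(np.sum((y_train - np.polyval(coeffs, x_train)) ** 2))
    outside_mse = float(np.mean((true_fn(x_outside) - np.polyval(coeffs, x_outside)) ** 2))
    return train_ssr, outside_mse

quad_train, quad_outside = fit_and_score(2)   # three parameters
big_train, big_outside = fit_and_score(7)     # eight parameters

print(f"degree 2: train SSR {quad_train:.1f}, outside MSE {quad_outside:.1f}")
print(f"degree 7: train SSR {big_train:.1f}, outside MSE {big_outside:.1f}")
```

The larger model can never do worse on the training residuals, since the quadratic is a special case of it; the cost only shows up on data it has not seen.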
  • 23. Generalizing safely So what can we do to safely generalize? Two of the most common approaches are regularization and cross-validation. Regularization is …
  • 24. Cross-validation Cross-validation, like regularization, is meant to prevent the learning process from codifying sample-specific noise as structure. However, unlike regularization, cross-validation does not impose any geometric constraints on the shape or “feel” of our learning solution, i.e., model. Instead, it focuses on repeating the learning task on multiple samples of training data, then evaluating the performance of these models on the “held-out” or unseen data.
  • 25. Cross-validation: K-fold The most common approach to cross-validation is to divide the training set of data into K distinct partitions of equal size. K-1 of these partitions are then used to learn a model. The resulting model is then used to predict the Kth partition. This process is repeated K times, and the best-performing model is kept as the trained model. http://genome.tugraz.at/proclassify/help/pages/XV.html http://stats.stackexchange.com/questions/1826/cross-validation-in-plain-english
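
A minimal K-fold sketch written directly in NumPy. Note one assumption that differs from the slide's wording: instead of keeping a single fold's model, this version averages the held-out error over the K folds and uses that average to compare model complexities, which is a common way the procedure is applied. The function name `k_fold_mse`, K = 5, the candidate degrees, and the simulated data are all illustrative.

```python
import numpy as np

def k_fold_mse(x, y, degree, k=5, seed=0):
    """Mean held-out squared error of a degree-`degree` polynomial over k folds."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(len(x))        # shuffle before partitioning
    folds = np.array_split(indices, k)       # k near-equal partitions
    fold_errors = []
    for i in range(k):
        test_idx = folds[i]                  # the held-out partition
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        coeffs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
        pred = np.polyval(coeffs, x[test_idx])
        fold_errors.append(np.mean((y[test_idx] - pred) ** 2))
    return float(np.mean(fold_errors))

# Simulated quadratic data (assumed sample size, range, and seed).
rng = np.random.default_rng(1)
x = rng.uniform(-4.0, 4.0, size=60)
y = x**2 + 2 * x + 1 + rng.normal(0.0, 1.0, size=60)

cv_error = {degree: k_fold_mse(x, y, degree) for degree in (1, 2, 7)}
best_degree = min(cv_error, key=cv_error.get)
print(cv_error, "lowest held-out error at degree", best_degree)
```

Because every point is held out exactly once, the averaged error reflects performance on unseen data, so the underfit linear model is penalized here even though all three models were fit by minimizing the same training loss.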
  • 26. “Cross-validation is widely used to check model error by testing on data not part of the training set. Multiple rounds with randomly selected test sets are averaged together to reduce variability of the cross-validation; high variability of the model will produce high average errors on the test set. One way of resolving the trade-off is to use mixture models and ensemble learning. For example, boosting combines many ‘weak’ (high bias) models in an ensemble that has greater variance than the individual models, while bagging combines ‘strong’ learners in a way that reduces their variance.” http://en.wikipedia.org/wiki/Cross-validation_%28statistics%29
  • 27. Legal Analytics Class 6 - Overfitting, Underfitting, & Cross-Validation
      daniel martin katz: blog | ComputationalLegalStudies, corp | LexPredict, twitter | @computational, site | danielmartinkatz.com
      michael j bommarito: blog | ComputationalLegalStudies, corp | LexPredict, twitter | @mjbommar, site | bommaritollc.com
      more content available at legalanalyticscourse.com