PAC Learning
a discussion on the original paper by Valiant
Adrian Florea
13th Meetup of Papers We Love (Bucharest Chapter),
27 January 2017
Induction in an urn I
The urn
Consider an urn containing a very large number (millions) of marbles,
possibly of different types. You are allowed to draw 100 marbles and asked
what kinds of marbles the urn contains.
no assumptions - impossible task!
assumption: all the marbles are of different types - impossible task!
assumption: all the marbles are identical - a single draw gives
complete knowledge about all the marbles. :-)
assumption: 50% of the marbles are of one type - the probability of missing that type in 100 draws is (1/2)^100 ≈ 7.89 × 10^−31
Induction in an urn II
assumption: there are at most 5 different marble types - if any of the
5 types occurs with frequency > 5%, the probability of missing that type
is < (1 − 0.05)^100 < 0.6%, so the probability of missing any one of these
frequent types is < 5 × 0.6% = 3%. There can be at most 4 types with
frequency < 5%, so the rare types account for < 20% of the marbles.
Remark
Even if the distribution of marble types is unknown, we can predict with
97% confidence that after 100 picks (a small sample) you will have seen
representatives of at least 80% of the urn's content (the arithmetic is checked in the sketch below).
We needed only two assumptions:
The Invariance Assumption: the urn does not change.
The Learnable Regularity Assumption: there are a fixed number of
marble types represented in the urn.
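As a sanity check (my own sketch, not part of the original slides), the two probability claims above in a few lines of Python:

```python
# Minimal sketch: verifies the urn probabilities quoted above (assumes 100 draws).

def miss_probability(freq: float, draws: int = 100) -> float:
    """Probability that `draws` independent picks all miss a marble type
    occurring with frequency `freq`."""
    return (1 - freq) ** draws

# One type covering 50% of the urn: chance of never seeing it in 100 draws.
print(miss_probability(0.5))                 # ~7.89e-31

# At most 5 types: each type with frequency > 5% is missed with probability < 0.6%,
# so by the union bound the chance of missing any frequent type is < 3%.
p_one = miss_probability(0.05)
print(p_one, 5 * p_one)                      # ~0.0059, ~0.0296
```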
Definition
Let X be a set called the instance space.
Definition
A concept c over X is a subset c ⊆ X of the instance space.
A subset c ⊆ X can equivalently be represented as an element of 2^X, i.e. as its
indicator function c : X → {0, 1}, with the concept being the inverse image of 1:
c(x) = 1 if x is a positive example of c and c(x) = 0 if x is a negative example of c.
Definition
A concept class C over X is a set of concepts over X, typically with an
associated representation.
Definition
A target concept may be any concept c∗ ∈ C.
Definition
An assignment is a function that maps each variable of a formula to a truth value.
Definition
A satisfying assignment is one under which the underlying formula evaluates to true.
For the sake of simplicity we can consider concepts c over {0, 1}^n whose
positive examples are the satisfying assignments of Boolean formulae fc
over {0, 1}^n. We can then define a concept class C by considering only fc
fulfilling certain syntactic constraints (its representation).
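As an illustration (my own toy encoding, not from Valiant's paper), here is a concept over {0, 1}^n whose positive examples are the satisfying assignments of a small conjunction of literals:

```python
from itertools import product

n = 3
# A literal is (index, sign): sign=True means x_i, sign=False means NOT x_i.
formula = [(0, True), (2, False)]            # f(x) = x1 AND (NOT x3), 0-based indices

def c(x):
    """Indicator function of the concept: 1 exactly on satisfying assignments of f."""
    return int(all(x[i] == sign for i, sign in formula))

positive_examples = [x for x in product((0, 1), repeat=n) if c(x) == 1]
print(positive_examples)                     # [(1, 0, 0), (1, 1, 0)]
```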
Definition
A learning protocol specifies the manner in which information is obtained
from the outside.
Valiant considered two routines as part of a learning protocol:
EXAMPLES routine: takes no input and returns a positive
example x (c∗(x) = 1) drawn according to a fixed, and perhaps unknown,
probability distribution determined arbitrarily by nature;
ORACLE() routine: on input x it returns 1 if c∗(x) = 1, or 0 if
c∗(x) = 0.
In a real system the ORACLE() may be a human expert, a data set of past
observations, etc.
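A rough sketch of the two protocol routines; the uniform distribution D and the target concept below are illustrative placeholders, not part of the paper:

```python
import random

random.seed(0)
n = 4
target = [(0, True), (1, False)]          # hypothetical target: x1 AND (NOT x2)

def c_star(x):
    """Indicator function of the (hidden) target concept."""
    return int(all(x[i] == sign for i, sign in target))

def draw_from_D():
    """Nature's fixed distribution over {0,1}^n (here: uniform, purely illustrative)."""
    return tuple(random.randint(0, 1) for _ in range(n))

def EXAMPLES():
    """No input; returns a positive example of the target, drawn according to D."""
    while True:
        x = draw_from_D()
        if c_star(x) == 1:
            return x

def ORACLE(x):
    """On input x, returns 1 if c*(x) = 1 and 0 otherwise."""
    return c_star(x)

print(EXAMPLES(), ORACLE((1, 0, 0, 0)))   # some positive example, and 1
```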
Definition
A learning algorithm (also called a learner) tries to infer an unknown target
concept by outputting a hypothesis chosen from a known concept class.
What does it mean for a learner to be successful?
e.g. the learner must output a hypothesis identical to the target
concept, or
e.g. the hypothesis agrees with the target concept most of the time.
The learner can call the EXAMPLES and ORACLE() routines.
The learner calls ORACLE() on instances received from the external information
supply, drawn according to a distribution D.
Definition
A set of random variables is independent and identically distributed (i.i.d.)
if each random variable has the same probability distribution as the others
and all are mutually independent.
The instances the learner receives from D are independent and
identically distributed (i.i.d.).
Remark
The assumption of a fixed distribution lets us hope that what the
learner learned from the training data will carry over to new, yet unseen
test data.
Definition
A learning machine consists of a learning protocol together with a learning
algorithm.
After observing the sequence S of i.i.d. training examples of the target
concept c∗, the learner L outputs the hypothesis h (its estimate of c∗)
from the set H of possible hypotheses:
D --(x1, ..., xm)--> ORACLE() --S ≡ ((x1, c∗(x1)), ..., (xm, c∗(xm)))--> L --h ∈ H--> H
The success of L is determined by the performance of h over new i.i.d.
instances drawn from X according to D.
Definition
The true error of h with respect to c∗ and D is the probability that h will
misclassify an instance drawn randomly according to D:
errorD(h) ≡ Pr_{x∈D}[c∗(x) ≠ h(x)]
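For intuition (my own illustration, with a made-up target and hypothesis), the true error can be estimated by Monte Carlo from the analyst's point of view - the learner itself cannot do this, as noted below:

```python
import random

random.seed(1)
n = 4

def c_star(x):                     # hypothetical target concept
    return int(x[0] == 1 and x[1] == 0)

def h(x):                          # hypothetical learned hypothesis
    return int(x[0] == 1)

def draw_from_D():                 # the (here: uniform) instance distribution D
    return tuple(random.randint(0, 1) for _ in range(n))

trials = 100_000
disagreements = sum(c_star(x) != h(x) for x in (draw_from_D() for _ in range(trials)))
print(disagreements / trials)      # ≈ 0.25 under this uniform D
```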
The true error is defined over D and not over S because it’s about
the error of using the learned hypothesis h on subsequent instances
drawn from D.
The true error depends strongly on D.
The true error cannot be observed by L. The learner can only observe the
performance of h over S.
Definition
The training error is the fraction of training examples misclassified by h:
errorS(h) ≡ Pr_{x∈S}[c∗(x) ≠ h(x)] = (1/m) Σ_{i=1}^{m} I[c∗(xi) ≠ h(xi)]
As the true error depends on D, the training error depends on S.
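A matching sketch of the training error on a toy sample S (the hypothesis h and the sample below are purely illustrative):

```python
def h(x):                          # hypothetical learned hypothesis
    return int(x[0] == 1)

# S pairs each instance with its label c*(x); the sample is purely illustrative.
S = [((1, 0, 1, 0), 1), ((1, 1, 0, 0), 0), ((0, 1, 1, 0), 0), ((1, 1, 1, 1), 0)]

def training_error(h, S):
    """Fraction of the training sample that h misclassifies."""
    return sum(h(x) != y for x, y in S) / len(S)

print(training_error(h, S))        # 0.5: h misclassifies the two examples with x1 = x2 = 1
```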
How many training examples does a learner need in order to output a
hypothesis h?
To guarantee errorD(h) = 0, the learner needs |S| = |X| training examples - that
means no learning!
Remark
We can only require that the learner probably learn a hypothesis that is
approximately correct!
Definition
A concept class C is PAC-learnable by L using H if:
for all c∗ ∈ C,
for any distribution D over X,
for any 0 < ε < 1/2 and 0 < δ < 1/2, arbitrarily small,
the learner L will, with probability at least (1 − δ), output a hypothesis h ∈ H
such that errorD(h) ≤ ε, in time polynomial in 1/ε, 1/δ, size(x ∈ X), and size(c).
Implicit assumption: ∀c∗ ∈ C, ∃h ∈ H s.t. errorD(h) is arbitrarily small
Definition
VSH,D = {h ∈ H | ∀x ∈ D, c∗(x) = h(x)} is called a version space
(here D denotes the sequence of training examples)
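A brute-force sketch of this definition (toy sizes and my own encoding only): enumerate a small finite H and keep the hypotheses that agree with the target on every training example.

```python
from itertools import product

n = 3
# H: all conjunctions of literals over n variables, encoded as {index: required value}.
H = [{i: s for i, s in enumerate(signs) if s is not None}
     for signs in product((0, 1, None), repeat=n)]

def predict(hyp, x):
    """1 iff x satisfies every literal of the conjunction hyp."""
    return int(all(x[i] == v for i, v in hyp.items()))

target = {0: 1, 2: 0}                                   # x1 AND (NOT x3)
S = [(x, predict(target, x)) for x in [(1, 0, 0), (1, 1, 0), (0, 1, 1)]]

version_space = [h for h in H if all(predict(h, x) == y for x, y in S)]
print(len(H), len(version_space))                       # 27 hypotheses in H; 3 remain in the version space
```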
Definition
A version space VSH,D is called ε-exhausted with respect to c∗ and D, if:
∀h ∈ VSH,D, errorD(h) < ε
Definition
A consistent hypothesis is a concept that perfectly fits the training
examples.
The Theorem of ε-exhausting the version space (Haussler, 1988)
If the hypothesis space H is finite, and D is a sequence of m i.i.d. drawn
examples of the target concept c∗, then for any 0 ≤ ε ≤ 1, the probability
that VSH,D is not ε-exhausted with respect to c∗ is at most |H|e^(−εm).
Proof: Let h1, h2, ..., hk be all hypotheses in H with
errorD(hi) ≥ ε, i = 1, ..., k. The probability that any single hypothesis hi with
errorD(hi) ≥ ε is consistent with a randomly drawn example is at most
(1 − ε), so the probability for hi to be consistent with all m i.i.d. examples
is (1 − ε)^m. We fail to ε-exhaust the version space iff there is such a
hypothesis consistent with all m i.i.d. examples. Since
P(A ∪ B) ≤ P(A) + P(B), the probability that at least one of the k hypotheses
is consistent with all m examples is at most k(1 − ε)^m. But k ≤ |H| and
1 − x ≤ e^(−x) for 0 ≤ x ≤ 1, so the probability is at most |H|e^(−εm).
Corollary
m ≥ (1/ε)(ln|H| + ln(1/δ))
(obtained by requiring |H|e^(−εm) ≤ δ and solving for m)
The number of i.i.d. examples needed to ε-exhaust a version space is
logarithmic in the size of the underlying hypothesis space, independent of
the target concept or the distribution over the instance space.
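The corollary translates directly into code; a small sketch, assuming natural logarithms and rounding m up to the next integer:

```python
from math import ceil, log

def sample_complexity(H_size: int, eps: float, delta: float) -> int:
    """Smallest integer m with m >= (1/eps) * (ln|H| + ln(1/delta))."""
    return ceil((log(H_size) + log(1 / delta)) / eps)

# e.g. |H| = 3^10, matching the conjunction example below:
print(sample_complexity(3 ** 10, 0.1, 0.05))    # 140
```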
Let’s consider the concept class C of target concepts described by
conjunctions of Boolean literals (Boolean variables or their negation).
Is C PAC-learnable?
If we have an algorithm that uses polynomial time per training
example, the answer is yes if we can show that a consistent learner
requires only a polynomial number of training examples.
We have |H| = 3^n because each of the n Boolean variables can appear in
the concept formula in 3 ways: as itself, as its negation, or not at all. So:
m ≥ (1/ε)(n · ln 3 + ln(1/δ))
Example: A consistent learner trying to learn, with error at most 0.1
and with probability at least 95%, a target concept described by a conjunction
of up to 10 Boolean literals requires
(1/0.1)(10 · ln 3 + ln(1/0.05)) = 139.8 ≈ 140
training samples.
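The consistent learner assumed here can be the standard elimination algorithm for conjunctions; a sketch in my own notation: start with all 2n literals and let each positive example delete the literals it falsifies.

```python
def learn_conjunction(positive_examples, n):
    """Start with all 2n literals; each positive example deletes the literals it
    falsifies. Returns the surviving literals as (index, sign) pairs."""
    hypothesis = {(i, s) for i in range(n) for s in (True, False)}
    for x in positive_examples:
        hypothesis = {(i, s) for (i, s) in hypothesis if x[i] == s}
    return hypothesis

# Target x1 AND (NOT x3) over n = 3, recovered from its two positive examples:
print(sorted(learn_conjunction([(1, 0, 0), (1, 1, 0)], 3)))
# [(0, True), (2, False)]
```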
Let’s consider now the concept class C of every possible concept over
X, where X is defined by n Boolean features. We have:
|C| = 2^|X|
|X| = 2^n, so |C| = 2^(2^n)
To learn such a concept class, the learner must use the hypothesis
space H = C:
m ≥ (1/ε)(2^n · ln 2 + ln(1/δ))
which is exponential in n.
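Plugging |C| = 2^(2^n) into the earlier sample-complexity sketch (so ln|C| = 2^n · ln 2) makes the blow-up concrete; the ε = 0.1 and δ = 0.05 below are only illustrative:

```python
from math import ceil, log

# ln|C| = 2^n * ln 2 for |C| = 2^(2^n); illustrative eps = 0.1, delta = 0.05.
for n in (5, 10, 20):
    ln_H = (2 ** n) * log(2)
    m = ceil((ln_H + log(1 / 0.05)) / 0.1)
    print(n, m)          # the bound grows like 2^n / eps
```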
Bibliography I
M.J. Kearns
The Computational Complexity of Machine Learning
MIT Press, 1990
M.J. Kearns, U.V. Vazirani
An Introduction to Computational Learning Theory
MIT Press, 1994
T.M. Mitchell
Machine Learning
McGraw-Hill, 1997
B.K. Natarajan
Machine Learning. A Theoretical Approach
Morgan Kaufmann Publishers, 1991
Bibliography II
L.G. Valiant
Probably Approximately Correct. Nature’s Algorithms for Learning and
Prospering in a Complex World
Basic Books, 2013
J. Amsterdam
Some Philosophical Problems with Formal Learning Theory
AAAI-88 Proceedings, 580-584, 1988
D. Angluin
Learning From Noisy Examples
Machine Learning, 2:343–370, 1988
C. Le Goues
A Theory of the Learnable (L.G. Valiant)
Theory Lunch Presentation, 20 May 2010
Bibliography III
D. Haussler
Quantifying Inductive Bias: AI Learning Algorithms and Valiant’s
Learning Framework
Artificial Intelligence, 36(2):177–221, 1988
L.G. Valiant
A Theory of the Learnable
Comm. ACM, 27(11):1134–1142, 1984
L.G. Valiant
Deductive Learning
Phil. Trans. R. Soc. Lond. A, 312:441–446, 1984