Concentration Inequality in ML
Subject- Machine Learning
Dr. Varun Kumar
Subject- Machine Learning Dr. Varun Kumar 1 / 12
Outline
1 Meaning of Concentration in Probability Context
2 Markov Inequality
3 Chebyshev Inequality
4 Moment Generating Function (MGF)
5 Chernoff's Inequality
6 References
Introduction to Concentration Inequalities
Key features
⇒ Concentration inequalities are widely used in non-asymptotic analyses in mathematical statistics, across a wide range of settings.
⇒ They bound how far a random quantity can deviate from a reference value such as its mean, with bounds that range from distribution-free to distribution-dependent.
⇒ They allow differently distributed random variables, such as exponential, Gamma, and Weibull, to be handled alongside Gaussian ones.
⇒ They are most useful when the probability mass is concentrated around the mean.
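$$f_X(x) = \underbrace{\frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}}_{\text{Gaussian}}, \qquad f_X(x) = \underbrace{\frac{1}{\beta}\, e^{-\frac{x}{\beta}}}_{\text{Exponential}},$$
$$f_X(x) = \underbrace{\frac{x^{\alpha-1}\, e^{-\frac{x}{\beta}}}{\beta^{\alpha}\,\Gamma(\alpha)}}_{\text{Gamma}}, \qquad f_X(x) = \underbrace{\frac{k}{\lambda}\left(\frac{x}{\lambda}\right)^{k-1} e^{-\left(\frac{x}{\lambda}\right)^{k}}}_{\text{Weibull}}$$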
Usage of Inequalities in Machine Learning
⇒ Decision making plays an important role in machine learning (especially in classification problems).
⇒ An inequality relation helps decide whether an outcome is favorable or unfavorable.
⇒ Chebyshev's inequality requires the mean and variance of the data sequence. It does not depend on the type of distribution.
⇒ Markov's inequality requires only the mean to bound a probability. It is likewise independent of the density function.
Mathematical description for a given random variable
Mathematical description
General mathematics for a continuous random variable:
$$\text{Mean} = E(X) = \mu = \int_{-\infty}^{\infty} x\, f_X(x)\,dx \tag{1}$$
$$\text{Variance} = \sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2\, f_X(x)\,dx \tag{2}$$
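The two integrals can be checked numerically. The sketch below is only an illustration: it assumes an exponential density with scale β = 2 (a value chosen here, not taken from the slides), truncates the integral at x = 100, and uses a simple midpoint rule. The helper names `exponential_pdf` and `integrate` are hypothetical.

```python
import math

def exponential_pdf(x, beta):
    """Exponential density f_X(x) = (1/beta) * exp(-x/beta) for x >= 0."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0

def integrate(g, lower, upper, steps=200_000):
    """Midpoint-rule approximation of the integral of g over [lower, upper]."""
    width = (upper - lower) / steps
    return width * sum(g(lower + (i + 0.5) * width) for i in range(steps))

beta = 2.0     # assumed scale parameter, chosen only for illustration
upper = 100.0  # truncation point; the density is zero below 0 and negligible above 100

# Equation (1): mean as the integral of x * f_X(x)
mean = integrate(lambda x: x * exponential_pdf(x, beta), 0.0, upper)
# Equation (2): variance as the integral of (x - mean)^2 * f_X(x)
variance = integrate(lambda x: (x - mean) ** 2 * exponential_pdf(x, beta), 0.0, upper)

print(f"mean ~ {mean:.4f}, variance ~ {variance:.4f}")
```

For an exponential density the exact values are µ = β and σ² = β², so the printed numbers should be close to 2 and 4.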
Markov Inequality
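Statement: Let X be a non-negative random variable, i.e. X ≥ 0, with probability density function f_X(x), and let a be an arbitrary positive constant. Then
$$P(X \ge a) \le \frac{E(X)}{a} \tag{3}$$
Proof: Since the integrand x f_X(x) is non-negative,
$$E(X) = \int_{0}^{\infty} x\, f_X(x)\,dx \ge \int_{a}^{\infty} x\, f_X(x)\,dx \tag{4}$$
Over the range of the last integral x ≥ a, so replacing x by a can only make the integral smaller:
$$E(X) = \int_{0}^{\infty} x\, f_X(x)\,dx \ge a \int_{a}^{\infty} f_X(x)\,dx = a\,P(X \ge a) \tag{5}$$
or
$$P(X \ge a) \le \frac{E(X)}{a}$$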
Example–
Q The number of customers visiting a shop is a random variable with mean 40. Find an upper bound on the probability that the number of customers reaches or exceeds 60.
Ans Let X be this random variable; we need P(X ≥ 60). From the Markov inequality,
$$P(X \ge 60) \le \frac{E(X)}{60} = \frac{40}{60} = \frac{2}{3}$$
The probability is at most 2/3.
Question framing with a training and testing data set:
Day               D1  D2  D3  D4  D5  ...  ...  Dn
No. of customers  34  25  38  66  64  ...  ...  43
Table: Training data set
Let the mean be E(X) = µ = 40. For an unlabeled input, the probability that the number of customers is at least 60 is bounded by
$$P(X \ge 60) \le \frac{\mu}{60} = \frac{2}{3}$$
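The customer example can be reproduced in a few lines. This is a minimal sketch: `markov_bound` is a hypothetical helper name, and the comparison with an exponential model is an assumption added here purely to show how loose a distribution-free bound can be; the slides do not specify any distribution.

```python
import math

def markov_bound(mean, a):
    """Upper bound on P(X >= a) for a non-negative X with the given mean."""
    if mean < 0 or a <= 0:
        raise ValueError("Markov's inequality needs mean >= 0 and a > 0")
    # A probability never exceeds 1, so cap the bound there.
    return min(1.0, mean / a)

mu = 40.0         # mean number of customers from the example
threshold = 60.0  # level of interest from the example
bound = markov_bound(mu, threshold)

# Assumption for comparison only: if X were exponential with mean 40,
# the exact tail would be P(X >= 60) = exp(-60/40).
exact_exponential_tail = math.exp(-threshold / mu)

print(f"Markov bound: {bound:.4f}")
print(f"Exact tail under the assumed exponential model: {exact_exponential_tail:.4f}")
```

The bound uses only the mean, which is why it holds for any non-negative distribution and why it is usually far from tight.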
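Chebyshev Inequality
Statement: Let X be a random variable with mean µ, variance σ², and probability density function f_X(x), and let ε be an arbitrary positive constant. Then
$$P(|X - \mu| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2} \tag{6}$$
Proof:
$$\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2\, f_X(x)\,dx \ge \int_{|x-\mu| \ge \epsilon} (x-\mu)^2\, f_X(x)\,dx \tag{7}$$
On the region of integration |x − µ| ≥ ε, so (x − µ)² ≥ ε² and
$$\sigma^2 \ge \int_{|x-\mu| \ge \epsilon} (x-\mu)^2\, f_X(x)\,dx \ge \epsilon^2 \int_{|x-\mu| \ge \epsilon} f_X(x)\,dx = \epsilon^2\, P(|X - \mu| \ge \epsilon) \tag{8}$$
Hence
$$P(|X - \mu| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2}$$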
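Example–
$$P(|X - \mu| \le \epsilon) \ge 1 - \frac{\sigma^2}{\epsilon^2}$$
Q A manufacturer produces X cars in a week, where X is a random variable with mean 40 and variance 100. Bound the probability that production reaches 60 units, and the probability that it falls to 25 units.
Ans According to the question, µ = 40 and σ² = 100, and we need
P(X ≥ 60) = P(X − 40 ≥ 20) = ?
P(X ≤ 25) = P(X − 40 ≤ −15) = ?
Each event is contained in a two-sided deviation event, so from Chebyshev's inequality
$$P(X - 40 \ge 20) \le P(|X - 40| \ge 20) \le \frac{\sigma^2}{\epsilon^2} = \frac{100}{20^2} = 0.25$$
Similarly
$$P(X - 40 \le -15) \le P(|X - 40| \ge 15) \le \frac{\sigma^2}{\epsilon^2} = \frac{100}{15^2} \approx 0.44$$
so production stays within 15 units of the mean with probability
$$P(|X - 40| \le 15) \ge 1 - \frac{\sigma^2}{\epsilon^2} = 1 - \frac{100}{15^2} \approx 0.56$$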
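The arithmetic of the car-production example can be sketched as follows. `chebyshev_bound` is a hypothetical helper name; it returns the upper bound on the probability of deviating from the mean by at least ε, and the complement gives the lower bound on staying within ε of the mean.

```python
def chebyshev_bound(variance, eps):
    """Upper bound on P(|X - mu| >= eps) for a variable with the given variance."""
    if variance < 0 or eps <= 0:
        raise ValueError("Chebyshev's inequality needs variance >= 0 and eps > 0")
    # A probability never exceeds 1, so cap the bound there.
    return min(1.0, variance / eps ** 2)

mu, variance = 40.0, 100.0  # values from the example

# Deviation of at least 20 units from the mean (covers X >= 60).
deviation_20 = chebyshev_bound(variance, 60.0 - mu)
# Deviation of at least 15 units from the mean (covers X <= 25).
deviation_15 = chebyshev_bound(variance, mu - 25.0)
# Lower bound on staying within 15 units of the mean.
within_15 = 1.0 - deviation_15

print(f"P(|X - 40| >= 20) <= {deviation_20:.4f}")
print(f"P(|X - 40| >= 15) <= {deviation_15:.4f}")
print(f"P(|X - 40| <= 15) >= {within_15:.4f}")
```

Only the mean and variance enter the computation, so the same bounds apply whatever the underlying distribution of weekly production is.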
Moment generating function (MGF)
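Let X be a random variable. Its MGF is defined as
$$M_X(t) = E\left(e^{tX}\right) = E\left[1 + tX + \frac{t^2 X^2}{2!} + \frac{t^3 X^3}{3!} + \cdots\right]$$
where t is a real constant. Taking the expectation term by term, differentiating n times with respect to t, and setting t = 0 gives the n-th moment:
$$\left.\frac{d^n M_X(t)}{dt^n}\right|_{t=0} = E[X^n]$$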
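The moment property can be illustrated numerically. The sketch below assumes an exponential random variable with mean β = 2, whose MGF is 1/(1 − βt) for t < 1/β, and approximates the first two derivatives at t = 0 with central finite differences. The function names and the step size h are choices made here for illustration.

```python
def exponential_mgf(t, beta):
    """MGF of an exponential variable with mean beta, valid for t < 1/beta."""
    if t >= 1.0 / beta:
        raise ValueError("the MGF is only defined for t < 1/beta")
    return 1.0 / (1.0 - beta * t)

def first_moment(mgf, h=1e-4):
    """Central-difference estimate of M'(0) = E[X]."""
    return (mgf(h) - mgf(-h)) / (2.0 * h)

def second_moment(mgf, h=1e-4):
    """Central-difference estimate of M''(0) = E[X^2]."""
    return (mgf(h) - 2.0 * mgf(0.0) + mgf(-h)) / h ** 2

beta = 2.0  # assumed mean, chosen only for illustration

def mgf(t):
    return exponential_mgf(t, beta)

m1 = first_moment(mgf)    # approximates E[X] = beta
m2 = second_moment(mgf)   # approximates E[X^2] = 2 * beta^2
variance = m2 - m1 ** 2   # approximates beta^2

print(f"E[X] ~ {m1:.4f}, E[X^2] ~ {m2:.4f}, variance ~ {variance:.4f}")
```

Finite differences are used only to keep the example dependency-free; a symbolic derivative would give the moments exactly.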
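Chernoff's inequality
Let X be a random variable; then e^{tX} is also a non-negative random variable for any constant t. For t > 0 the map x ↦ e^{tx} is increasing, so applying Markov's inequality to e^{tX} gives
$$P(X \ge a) = P\left(e^{tX} \ge e^{ta}\right) \le \frac{E\left(e^{tX}\right)}{e^{ta}} \tag{9}$$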
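Because the bound holds for every t > 0, it is usually tightened by minimizing the right-hand side over t. The sketch below assumes a standard Gaussian X (MGF exp(t²/2)) and a = 2, both chosen here for illustration, and uses a plain grid search over t. For this case the minimum is known in closed form, exp(−a²/2), attained at t = a.

```python
import math

def gaussian_mgf(t, mu=0.0, sigma=1.0):
    """MGF of a Gaussian variable: exp(mu*t + sigma^2 * t^2 / 2)."""
    return math.exp(mu * t + 0.5 * sigma ** 2 * t ** 2)

def chernoff_bound(mgf, a, t_values):
    """Smallest value of E[e^{tX}] / e^{ta} over the supplied positive t values."""
    return min(mgf(t) * math.exp(-t * a) for t in t_values)

a = 2.0                                          # assumed threshold
t_grid = [i * 0.001 for i in range(1, 10_001)]   # t in (0, 10]

bound = chernoff_bound(gaussian_mgf, a, t_grid)
closed_form = math.exp(-a ** 2 / 2.0)            # known optimum for the standard Gaussian
exact_tail = 0.5 * math.erfc(a / math.sqrt(2.0)) # exact P(X >= a) for the standard Gaussian

print(f"Chernoff bound (grid search): {bound:.6f}")
print(f"Closed-form optimum:          {closed_form:.6f}")
print(f"Exact Gaussian tail:          {exact_tail:.6f}")
```

The grid search stands in for an analytic minimization; any positive t gives a valid bound, so a coarse grid can only make the result looser, never invalid.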
Jensen's inequality
For a real convex function ϕ, numbers x1, x2, . . . , xn in its domain, and positive weights a_i, Jensen's inequality can be stated as:
$$\varphi\left(\frac{\sum a_i x_i}{\sum a_i}\right) \le \frac{\sum a_i\, \varphi(x_i)}{\sum a_i} \tag{10}$$
and the inequality is reversed if ϕ is concave:
$$\varphi\left(\frac{\sum a_i x_i}{\sum a_i}\right) \ge \frac{\sum a_i\, \varphi(x_i)}{\sum a_i} \tag{11}$$
Equality holds if and only if x1 = x2 = · · · = xn or ϕ is linear on a domain containing x1, x2, · · · , xn.
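Ex- Let ϕ(x) = log x, which is a concave function. Taking equal weights a_i = 1 in (11),
$$\log\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \ge \frac{\log x_1 + \log x_2 + \cdots + \log x_n}{n} = \log\left(x_1 x_2 \cdots x_n\right)^{\frac{1}{n}} \tag{12}$$
Since log is increasing, the arithmetic mean is at least the geometric mean:
$$\frac{x_1 + x_2 + \cdots + x_n}{n} \ge \left(x_1 x_2 \cdots x_n\right)^{\frac{1}{n}}$$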
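A quick numerical check of the arithmetic mean versus geometric mean result is sketched below. The sample values are hypothetical and must be positive so that the logarithm is defined; the geometric mean is computed through the average of the logarithms, mirroring the Jensen argument with ϕ(x) = log x.

```python
import math

def arithmetic_mean(values):
    """Plain average of the values."""
    return sum(values) / len(values)

def geometric_mean(values):
    """exp of the average log, i.e. (x1 * x2 * ... * xn)^(1/n) for positive values."""
    return math.exp(sum(math.log(v) for v in values) / len(values))

samples = [1.0, 4.0, 9.0, 16.0]  # hypothetical positive numbers

am = arithmetic_mean(samples)
gm = geometric_mean(samples)

# Jensen with the concave log: log(mean) >= mean of the logs.
log_of_mean = math.log(am)
mean_of_logs = sum(math.log(v) for v in samples) / len(samples)

print(f"arithmetic mean = {am:.4f}, geometric mean = {gm:.4f}")
print(f"log(mean) = {log_of_mean:.4f} >= mean(log) = {mean_of_logs:.4f}")

# Equality case: all values identical.
equal_values = [3.0, 3.0, 3.0]
print(arithmetic_mean(equal_values), geometric_mean(equal_values))
```

When all values coincide the two means agree up to floating-point rounding, which matches the equality condition of Jensen's inequality.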
References
E. Alpaydin, Introduction to Machine Learning. MIT Press, 2020.
T. M. Mitchell, The Discipline of Machine Learning. Carnegie Mellon University, School of Computer Science, Machine Learning, 2006, vol. 9.
J. Grus, Data Science from Scratch: First Principles with Python. O'Reilly Media, 2019.

More Related Content

What's hot

Repeated-Measures and Two-Factor Analysis of Variance
Repeated-Measures and Two-Factor Analysis of VarianceRepeated-Measures and Two-Factor Analysis of Variance
Repeated-Measures and Two-Factor Analysis of Variance
jasondroesch
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
Sanjay Basukala
 
Hypothesis
HypothesisHypothesis
Hypothesis
Nilanjan Bhaumik
 
Application of Chebyshev and Markov Inequality in Machine Learning
Application of Chebyshev and Markov Inequality in Machine LearningApplication of Chebyshev and Markov Inequality in Machine Learning
Application of Chebyshev and Markov Inequality in Machine Learning
VARUN KUMAR
 
Comparing means
Comparing meansComparing means
Comparing means
University of Jaffna
 
Small sample test
Small sample testSmall sample test
Small sample test
Parag Shah
 
Hypothesis testing part vi single variance
Hypothesis testing part vi single varianceHypothesis testing part vi single variance
Hypothesis testing part vi single variance
Nadeem Uddin
 
t-TEst. :D
t-TEst. :Dt-TEst. :D
t-TEst. :D
patatas
 
Kruskal Wallis test, Friedman test, Spearman Correlation
Kruskal Wallis test, Friedman test, Spearman CorrelationKruskal Wallis test, Friedman test, Spearman Correlation
Kruskal Wallis test, Friedman test, Spearman Correlation
Rizwan S A
 
t test using spss
t test using spsst test using spss
t test using spss
Parag Shah
 
T test
T testT test
T test
ashishjaswal
 
7 anova chi square test
 7 anova chi square test 7 anova chi square test
7 anova chi square test
Penny Jiang
 
Hypothesis tests for one and two population variances ppt @ bec doms
Hypothesis tests for one and two population variances ppt @ bec domsHypothesis tests for one and two population variances ppt @ bec doms
Hypothesis tests for one and two population variances ppt @ bec doms
Babasab Patil
 
Regression
RegressionRegression
Hypothesis testing
Hypothesis testingHypothesis testing
Hypothesis testing
RAVI PRASAD K.J.
 
Two Way ANOVA
Two Way ANOVATwo Way ANOVA
Hypothesis testing and parametric test
Hypothesis testing and parametric testHypothesis testing and parametric test
Hypothesis testing and parametric test
Dr. Keerti Jain
 
Analysis of variance ppt @ bec doms
Analysis of variance ppt @ bec domsAnalysis of variance ppt @ bec doms
Analysis of variance ppt @ bec doms
Babasab Patil
 
Seminar 10 BIOSTATISTICS
Seminar 10 BIOSTATISTICSSeminar 10 BIOSTATISTICS
Seminar 10 BIOSTATISTICS
Anusha Divvi
 
Estimation and confidence interval
Estimation and confidence intervalEstimation and confidence interval
Estimation and confidence interval
Homework Guru
 

What's hot (20)

Repeated-Measures and Two-Factor Analysis of Variance
Repeated-Measures and Two-Factor Analysis of VarianceRepeated-Measures and Two-Factor Analysis of Variance
Repeated-Measures and Two-Factor Analysis of Variance
 
Sampling distribution
Sampling distributionSampling distribution
Sampling distribution
 
Hypothesis
HypothesisHypothesis
Hypothesis
 
Application of Chebyshev and Markov Inequality in Machine Learning
Application of Chebyshev and Markov Inequality in Machine LearningApplication of Chebyshev and Markov Inequality in Machine Learning
Application of Chebyshev and Markov Inequality in Machine Learning
 
Comparing means
Comparing meansComparing means
Comparing means
 
Small sample test
Small sample testSmall sample test
Small sample test
 
Hypothesis testing part vi single variance
Hypothesis testing part vi single varianceHypothesis testing part vi single variance
Hypothesis testing part vi single variance
 
t-TEst. :D
t-TEst. :Dt-TEst. :D
t-TEst. :D
 
Kruskal Wallis test, Friedman test, Spearman Correlation
Kruskal Wallis test, Friedman test, Spearman CorrelationKruskal Wallis test, Friedman test, Spearman Correlation
Kruskal Wallis test, Friedman test, Spearman Correlation
 
t test using spss
t test using spsst test using spss
t test using spss
 
T test
T testT test
T test
 
7 anova chi square test
 7 anova chi square test 7 anova chi square test
7 anova chi square test
 
Hypothesis tests for one and two population variances ppt @ bec doms
Hypothesis tests for one and two population variances ppt @ bec domsHypothesis tests for one and two population variances ppt @ bec doms
Hypothesis tests for one and two population variances ppt @ bec doms
 
Regression
RegressionRegression
Regression
 
Hypothesis testing
Hypothesis testingHypothesis testing
Hypothesis testing
 
Two Way ANOVA
Two Way ANOVATwo Way ANOVA
Two Way ANOVA
 
Hypothesis testing and parametric test
Hypothesis testing and parametric testHypothesis testing and parametric test
Hypothesis testing and parametric test
 
Analysis of variance ppt @ bec doms
Analysis of variance ppt @ bec domsAnalysis of variance ppt @ bec doms
Analysis of variance ppt @ bec doms
 
Seminar 10 BIOSTATISTICS
Seminar 10 BIOSTATISTICSSeminar 10 BIOSTATISTICS
Seminar 10 BIOSTATISTICS
 
Estimation and confidence interval
Estimation and confidence intervalEstimation and confidence interval
Estimation and confidence interval
 

Similar to Concentration inequality in Machine Learning

Gaussian process in machine learning
Gaussian process in machine learningGaussian process in machine learning
Gaussian process in machine learning
VARUN KUMAR
 
Probability and Statistics
Probability and StatisticsProbability and Statistics
Probability and Statistics
Malik Sb
 
random variables-descriptive and contincuous
random variables-descriptive and contincuousrandom variables-descriptive and contincuous
random variables-descriptive and contincuous
ar9530
 
Chapter 4 2022.pdf
Chapter 4 2022.pdfChapter 4 2022.pdf
Chapter 4 2022.pdf
Mohamed Ali
 
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
The Statistical and Applied Mathematical Sciences Institute
 
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
The Statistical and Applied Mathematical Sciences Institute
 
2 random variables notes 2p3
2 random variables notes 2p32 random variables notes 2p3
2 random variables notes 2p3
MuhannadSaleh
 
CPP.pptx
CPP.pptxCPP.pptx
Discussion about random variable ad its characterization
Discussion about random variable ad its characterizationDiscussion about random variable ad its characterization
Discussion about random variable ad its characterization
Geeta Arora
 
Cramer row inequality
Cramer row inequality Cramer row inequality
Cramer row inequality
VashuGupta8
 
QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...
QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...
QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...
The Statistical and Applied Mathematical Sciences Institute
 
Finance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdfFinance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdf
CarlosLazo45
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
Valentin De Bortoli
 
Probability distribution
Probability distributionProbability distribution
Probability distribution
Manoj Bhambu
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
Edgar Marca
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
Charles Deledalle
 
A new implementation of k-MLE for mixture modelling of Wishart distributions
A new implementation of k-MLE for mixture modelling of Wishart distributionsA new implementation of k-MLE for mixture modelling of Wishart distributions
A new implementation of k-MLE for mixture modelling of Wishart distributions
Frank Nielsen
 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...
BhojRajAdhikari5
 
02 basics i-handout
02 basics i-handout02 basics i-handout
02 basics i-handout
sheetslibrary
 
Quantitative Techniques random variables
Quantitative Techniques random variablesQuantitative Techniques random variables
Quantitative Techniques random variables
Rohan Bhatkar
 

Similar to Concentration inequality in Machine Learning (20)

Gaussian process in machine learning
Gaussian process in machine learningGaussian process in machine learning
Gaussian process in machine learning
 
Probability and Statistics
Probability and StatisticsProbability and Statistics
Probability and Statistics
 
random variables-descriptive and contincuous
random variables-descriptive and contincuousrandom variables-descriptive and contincuous
random variables-descriptive and contincuous
 
Chapter 4 2022.pdf
Chapter 4 2022.pdfChapter 4 2022.pdf
Chapter 4 2022.pdf
 
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
Deep Learning Opening Workshop - ProxSARAH Algorithms for Stochastic Composit...
 
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
QMC: Operator Splitting Workshop, Using Sequences of Iterates in Inertial Met...
 
2 random variables notes 2p3
2 random variables notes 2p32 random variables notes 2p3
2 random variables notes 2p3
 
CPP.pptx
CPP.pptxCPP.pptx
CPP.pptx
 
Discussion about random variable ad its characterization
Discussion about random variable ad its characterizationDiscussion about random variable ad its characterization
Discussion about random variable ad its characterization
 
Cramer row inequality
Cramer row inequality Cramer row inequality
Cramer row inequality
 
QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...
QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...
QMC: Operator Splitting Workshop, Proximal Algorithms in Probability Spaces -...
 
Finance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdfFinance Enginering from Columbia.pdf
Finance Enginering from Columbia.pdf
 
Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...Maximum likelihood estimation of regularisation parameters in inverse problem...
Maximum likelihood estimation of regularisation parameters in inverse problem...
 
Probability distribution
Probability distributionProbability distribution
Probability distribution
 
Kernels and Support Vector Machines
Kernels and Support Vector  MachinesKernels and Support Vector  Machines
Kernels and Support Vector Machines
 
IVR - Chapter 1 - Introduction
IVR - Chapter 1 - IntroductionIVR - Chapter 1 - Introduction
IVR - Chapter 1 - Introduction
 
A new implementation of k-MLE for mixture modelling of Wishart distributions
A new implementation of k-MLE for mixture modelling of Wishart distributionsA new implementation of k-MLE for mixture modelling of Wishart distributions
A new implementation of k-MLE for mixture modelling of Wishart distributions
 
this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...this materials is useful for the students who studying masters level in elect...
this materials is useful for the students who studying masters level in elect...
 
02 basics i-handout
02 basics i-handout02 basics i-handout
02 basics i-handout
 
Quantitative Techniques random variables
Quantitative Techniques random variablesQuantitative Techniques random variables
Quantitative Techniques random variables
 

More from VARUN KUMAR

Distributed rc Model
Distributed rc ModelDistributed rc Model
Distributed rc Model
VARUN KUMAR
 
Electrical Wire Model
Electrical Wire ModelElectrical Wire Model
Electrical Wire Model
VARUN KUMAR
 
Interconnect Parameter in Digital VLSI Design
Interconnect Parameter in Digital VLSI DesignInterconnect Parameter in Digital VLSI Design
Interconnect Parameter in Digital VLSI Design
VARUN KUMAR
 
Introduction to Digital VLSI Design
Introduction to Digital VLSI DesignIntroduction to Digital VLSI Design
Introduction to Digital VLSI Design
VARUN KUMAR
 
Challenges of Massive MIMO System
Challenges of Massive MIMO SystemChallenges of Massive MIMO System
Challenges of Massive MIMO System
VARUN KUMAR
 
E-democracy or Digital Democracy
E-democracy or Digital DemocracyE-democracy or Digital Democracy
E-democracy or Digital Democracy
VARUN KUMAR
 
Ethics of Parasitic Computing
Ethics of Parasitic ComputingEthics of Parasitic Computing
Ethics of Parasitic Computing
VARUN KUMAR
 
Action Lines of Geneva Plan of Action
Action Lines of Geneva Plan of ActionAction Lines of Geneva Plan of Action
Action Lines of Geneva Plan of Action
VARUN KUMAR
 
Geneva Plan of Action
Geneva Plan of ActionGeneva Plan of Action
Geneva Plan of Action
VARUN KUMAR
 
Fair Use in the Electronic Age
Fair Use in the Electronic AgeFair Use in the Electronic Age
Fair Use in the Electronic Age
VARUN KUMAR
 
Software as a Property
Software as a PropertySoftware as a Property
Software as a Property
VARUN KUMAR
 
Orthogonal Polynomial
Orthogonal PolynomialOrthogonal Polynomial
Orthogonal Polynomial
VARUN KUMAR
 
Patent Protection
Patent ProtectionPatent Protection
Patent Protection
VARUN KUMAR
 
Copyright Vs Patent and Trade Secrecy Law
Copyright Vs Patent and Trade Secrecy LawCopyright Vs Patent and Trade Secrecy Law
Copyright Vs Patent and Trade Secrecy Law
VARUN KUMAR
 
Property Right and Software
Property Right and SoftwareProperty Right and Software
Property Right and Software
VARUN KUMAR
 
Investigating Data Trials
Investigating Data TrialsInvestigating Data Trials
Investigating Data Trials
VARUN KUMAR
 
Gaussian Numerical Integration
Gaussian Numerical IntegrationGaussian Numerical Integration
Gaussian Numerical Integration
VARUN KUMAR
 
Censorship and Controversy
Censorship and ControversyCensorship and Controversy
Censorship and Controversy
VARUN KUMAR
 
Romberg's Integration
Romberg's IntegrationRomberg's Integration
Romberg's Integration
VARUN KUMAR
 
Introduction to Censorship
Introduction to Censorship Introduction to Censorship
Introduction to Censorship
VARUN KUMAR
 

More from VARUN KUMAR (20)

Distributed rc Model
Distributed rc ModelDistributed rc Model
Distributed rc Model
 
Electrical Wire Model
Electrical Wire ModelElectrical Wire Model
Electrical Wire Model
 
Interconnect Parameter in Digital VLSI Design
Interconnect Parameter in Digital VLSI DesignInterconnect Parameter in Digital VLSI Design
Interconnect Parameter in Digital VLSI Design
 
Introduction to Digital VLSI Design
Introduction to Digital VLSI DesignIntroduction to Digital VLSI Design
Introduction to Digital VLSI Design
 
Challenges of Massive MIMO System
Challenges of Massive MIMO SystemChallenges of Massive MIMO System
Challenges of Massive MIMO System
 
E-democracy or Digital Democracy
E-democracy or Digital DemocracyE-democracy or Digital Democracy
E-democracy or Digital Democracy
 
Ethics of Parasitic Computing
Ethics of Parasitic ComputingEthics of Parasitic Computing
Ethics of Parasitic Computing
 
Action Lines of Geneva Plan of Action
Action Lines of Geneva Plan of ActionAction Lines of Geneva Plan of Action
Action Lines of Geneva Plan of Action
 
Geneva Plan of Action
Geneva Plan of ActionGeneva Plan of Action
Geneva Plan of Action
 
Fair Use in the Electronic Age
Fair Use in the Electronic AgeFair Use in the Electronic Age
Fair Use in the Electronic Age
 
Software as a Property
Software as a PropertySoftware as a Property
Software as a Property
 
Orthogonal Polynomial
Orthogonal PolynomialOrthogonal Polynomial
Orthogonal Polynomial
 
Patent Protection
Patent ProtectionPatent Protection
Patent Protection
 
Copyright Vs Patent and Trade Secrecy Law
Copyright Vs Patent and Trade Secrecy LawCopyright Vs Patent and Trade Secrecy Law
Copyright Vs Patent and Trade Secrecy Law
 
Property Right and Software
Property Right and SoftwareProperty Right and Software
Property Right and Software
 
Investigating Data Trials
Investigating Data TrialsInvestigating Data Trials
Investigating Data Trials
 
Gaussian Numerical Integration
Gaussian Numerical IntegrationGaussian Numerical Integration
Gaussian Numerical Integration
 
Censorship and Controversy
Censorship and ControversyCensorship and Controversy
Censorship and Controversy
 
Romberg's Integration
Romberg's IntegrationRomberg's Integration
Romberg's Integration
 
Introduction to Censorship
Introduction to Censorship Introduction to Censorship
Introduction to Censorship
 

Recently uploaded

Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
SUTEJAS
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
Victor Morales
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
jpsjournal1
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
drwaing
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
NidhalKahouli2
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
anoopmanoharan2
 
Wearable antenna for antenna applications
Wearable antenna for antenna applicationsWearable antenna for antenna applications
Wearable antenna for antenna applications
Madhumitha Jayaram
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
Aditya Rajan Patra
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
MIGUELANGEL966976
 
CSM Cloud Service Management Presentarion
CSM Cloud Service Management PresentarionCSM Cloud Service Management Presentarion
CSM Cloud Service Management Presentarion
rpskprasana
 
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
Electric vehicle and photovoltaic advanced roles in enhancing the financial p...
IJECEIAES
 
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Hori...
thanhdowork
 
22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt22CYT12-Unit-V-E Waste and its Management.ppt
22CYT12-Unit-V-E Waste and its Management.ppt
KrishnaveniKrishnara1
 
spirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptxspirit beverages ppt without graphics.pptx
spirit beverages ppt without graphics.pptx
Madan Karki
 
132/33KV substation case study Presentation
132/33KV substation case study Presentation132/33KV substation case study Presentation
132/33KV substation case study Presentation
kandramariana6
 
Literature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptxLiterature Review Basics and Understanding Reference Management.pptx
Literature Review Basics and Understanding Reference Management.pptx
Dr Ramhari Poudyal
 
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
ACRP 4-09 Risk Assessment Method to Support Modification of Airfield Separat...
Mukeshwaran Balu
 
DfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributionsDfMAy 2024 - key insights and contributions
DfMAy 2024 - key insights and contributions
gestioneergodomus
 
ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024ACEP Magazine edition 4th launched on 05.06.2024
ACEP Magazine edition 4th launched on 05.06.2024
Rahul
 
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
Presentation of IEEE Slovenia CIS (Computational Intelligence Society) Chapte...
University of Maribor
 

Recently uploaded (20)

Understanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine LearningUnderstanding Inductive Bias in Machine Learning
Understanding Inductive Bias in Machine Learning
 
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressionsKuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
KuberTENes Birthday Bash Guadalajara - K8sGPT first impressions
 
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECTCHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
CHINA’S GEO-ECONOMIC OUTREACH IN CENTRAL ASIAN COUNTRIES AND FUTURE PROSPECT
 
digital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdfdigital fundamental by Thomas L.floydl.pdf
digital fundamental by Thomas L.floydl.pdf
 
basic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdfbasic-wireline-operations-course-mahmoud-f-radwan.pdf
basic-wireline-operations-course-mahmoud-f-radwan.pdf
 
PPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testingPPT on GRP pipes manufacturing and testing
PPT on GRP pipes manufacturing and testing
 
Wearable antenna for antenna applications
Wearable antenna for antenna applicationsWearable antenna for antenna applications
Wearable antenna for antenna applications
 
Recycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part IIIRecycled Concrete Aggregate in Construction Part III
Recycled Concrete Aggregate in Construction Part III
 
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf
BPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdfBPV-GUI-01-Guide-for-ASME-Review-Teams-(General)-10-10-2023.pdf

Concentration inequality in Machine Learning

  • 1. Concentration Inequality in ML. Subject- Machine Learning, Dr. Varun Kumar
  • 2. Outline
      1. Meaning of Concentration in Probability Context
      2. Markov Inequality
      3. Chebyshev Inequality
      4. Moment Generating Function (MGF)
      5. Chernoff's Inequality
      6. References
  • 3. Introduction to concentration inequalities. Key features:
      ⇒ Concentration inequalities are widely used in non-asymptotic analyses in mathematical statistics, across a wide range of settings.
      ⇒ They bound the probability that a random quantity deviates from a reference value such as its mean, under assumptions that range from distribution-free to distribution-dependent.
      ⇒ The same bounds apply to non-Gaussian random variables, such as exponential, Gamma, and Weibull, as well as to Gaussian ones.
      ⇒ They are most informative when the probability mass is concentrated around the mean.
      Densities of the distributions mentioned:
      Gaussian: $f_X(x) = \frac{1}{\sqrt{2\pi}\,\sigma}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$
      Exponential: $f_X(x) = \frac{1}{\beta}\, e^{-\frac{x}{\beta}}$
      Gamma: $f_X(x) = \frac{x^{\alpha-1} e^{-\frac{x}{\beta}}}{\beta^{\alpha}\,\Gamma(\alpha)}$
      Weibull: $f_X(x) = \frac{k}{\lambda} \left(\frac{x}{\lambda}\right)^{k-1} e^{-\left(\frac{x}{\lambda}\right)^{k}}$
  • 4. Usage of inequalities in machine learning
      ⇒ Decision making plays an important role in machine learning, especially in classification problems.
      ⇒ An inequality relation helps decide whether an outcome is favorable or unfavorable.
      ⇒ The Chebyshev inequality requires the variance of the data sequence (together with its mean). It is independent of the type of distribution.
      ⇒ The Markov inequality requires only the mean to bound a probability. It is also independent of the density function.
  • 5. Mathematical description for a given random variable. For a continuous random variable:
      Mean: $E(X) = \mu = \int_{-\infty}^{\infty} x\, f_X(x)\, dx$  (1)
      Variance: $\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f_X(x)\, dx$  (2)
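Equations (1) and (2) can be checked numerically. The sketch below is only an illustration: it assumes an exponential density with scale β = 2 (a choice made here, not taken from the slides), truncates the integrals at an arbitrary upper limit, and uses a hand-written trapezoidal rule. The helper names are hypothetical.

```python
import math

def exponential_pdf(x, beta):
    """Exponential density f_X(x) = (1/beta) * exp(-x/beta) for x >= 0."""
    return math.exp(-x / beta) / beta if x >= 0 else 0.0

def integrate(func, lower, upper, steps=200_000):
    """Composite trapezoidal rule on [lower, upper]."""
    h = (upper - lower) / steps
    total = 0.5 * (func(lower) + func(upper))
    for i in range(1, steps):
        total += func(lower + i * h)
    return total * h

beta = 2.0     # assumed scale parameter (illustrative choice)
upper = 100.0  # truncation point; the tail mass beyond it is negligible here

# Equation (1): mean; the density is zero for x < 0, so integrate from 0.
mean = integrate(lambda x: x * exponential_pdf(x, beta), 0.0, upper)
# Equation (2): variance around that mean.
variance = integrate(lambda x: (x - mean) ** 2 * exponential_pdf(x, beta), 0.0, upper)

print(mean, variance)  # close to beta and beta**2 for the exponential density
```

For the exponential density the results should land near β and β², which gives a quick sanity check on both integrals.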
  • 6. Markov inequality
      Statement: If X is a non-negative random variable, i.e. $X \ge 0$, with probability density function $f_X(x)$, and $a$ is an arbitrary positive constant, then
      $P(X \ge a) \le \frac{E(X)}{a}$  (3)
      Proof: Because the integrand is non-negative,
      $E(X) = \int_{0}^{\infty} x\, f_X(x)\, dx \ge \int_{a}^{\infty} x\, f_X(x)\, dx$  (4)
      Since $x \ge a$ over the range of the last integral, replacing $x$ by $a$ can only make it smaller:
      $E(X) \ge \int_{a}^{\infty} x\, f_X(x)\, dx \ge a \int_{a}^{\infty} f_X(x)\, dx = a\, P(X \ge a)$  (5)
      or $P(X \ge a) \le \frac{E(X)}{a}$.
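The Markov bound can be checked against simulated data. This is a minimal sketch under assumptions made here: samples are drawn from an exponential distribution with mean 40 (any non-negative distribution would do), the seed and thresholds are arbitrary, and `empirical_tail` is a hypothetical helper.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

mean_true = 40.0  # assumed mean of the simulated non-negative variable
samples = [random.expovariate(1.0 / mean_true) for _ in range(100_000)]
sample_mean = sum(samples) / len(samples)

def empirical_tail(data, a):
    """Fraction of observations with value >= a, an estimate of P(X >= a)."""
    return sum(1 for x in data if x >= a) / len(data)

for a in (20.0, 60.0, 120.0):
    tail = empirical_tail(samples, a)
    bound = sample_mean / a  # right-hand side of inequality (3)
    print(f"a={a:6.1f}  empirical tail={tail:.4f}  Markov bound={bound:.4f}")
```

The empirical tail never exceeds the bound, but the gap is often large: Markov uses only the mean, so it is loose for light-tailed data.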
  • 7. Example
      Q: The number of customers visiting a shop is a random variable with mean 40. Bound the probability that the number of customers is 60 or more.
      Ans: Let X be the RV; the quantity of interest is $P(X \ge 60)$. From the Markov inequality,
      $P(X \ge 60) \le \frac{E(X)}{60} = \frac{40}{60}$
      so the maximum probability is 2/3.
      Question framing in a training and testing data set:
        Day               D1   D2   D3   D4   D5   ...  ...  Dn
        No. of customers  34   25   38   66   64   ...  ...  43
        Table: Training data set
      With mean $E(X) = \mu = 40$ and an unlabeled input for the number of customers, $P(X \ge 60) \le \frac{\mu}{60} = \frac{2}{3}$.
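The arithmetic of this example can be written as a small helper. The function name `markov_bound` is hypothetical. The second call uses only the six customer counts visible in the training table; the elided days are not filled in, so its sample mean differs from the stated mean of 40.

```python
def markov_bound(mean, a):
    """Upper bound on P(X >= a) for a non-negative X with the given mean (a > 0)."""
    if mean < 0 or a <= 0:
        raise ValueError("Markov's inequality needs mean >= 0 and a > 0")
    return min(1.0, mean / a)  # a probability bound above 1 carries no information

# Stated mean from the example.
print(markov_bound(40, 60))

# Mean estimated from the visible table entries only (the "..." days are unknown).
visible_counts = [34, 25, 38, 66, 64, 43]
sample_mean = sum(visible_counts) / len(visible_counts)
print(sample_mean, markov_bound(sample_mean, 60))
```

In a train/test setting the mean would be estimated from the full training set, and the bound then applied to unseen days.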
  • 8. Chebyshev inequality
      Statement: If X is a random variable with mean $\mu$, finite variance $\sigma^2$, and probability density function $f_X(x)$, and $\epsilon$ is an arbitrary positive constant, then
      $P(|X - \mu| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2}$  (6)
      Proof:
      $\sigma^2 = \int_{-\infty}^{\infty} (x-\mu)^2 f_X(x)\, dx \ge \int_{|x-\mu| \ge \epsilon} (x-\mu)^2 f_X(x)\, dx$  (7)
      Since $(x-\mu)^2 \ge \epsilon^2$ over the range of the last integral,
      $\sigma^2 \ge \int_{|x-\mu| \ge \epsilon} (x-\mu)^2 f_X(x)\, dx \ge \epsilon^2 \int_{|x-\mu| \ge \epsilon} f_X(x)\, dx = \epsilon^2\, P(|X-\mu| \ge \epsilon)$  (8)
      Hence $P(|X - \mu| \ge \epsilon) \le \frac{\sigma^2}{\epsilon^2}$.
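As with Markov, the Chebyshev bound can be compared with simulated data. The sketch assumes Gaussian samples with mean 40 and standard deviation 10 (chosen here for illustration); the seed, the ε values, and the helper `empirical_deviation` are likewise assumptions.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

samples = [random.gauss(40.0, 10.0) for _ in range(100_000)]
n = len(samples)
mu = sum(samples) / n
var = sum((x - mu) ** 2 for x in samples) / n  # variance as in equation (2)

def empirical_deviation(data, center, eps):
    """Fraction of observations with |x - center| >= eps."""
    return sum(1 for x in data if abs(x - center) >= eps) / len(data)

for eps in (15.0, 20.0, 30.0):
    observed = empirical_deviation(samples, mu, eps)
    bound = var / eps ** 2  # right-hand side of inequality (6)
    print(f"eps={eps:5.1f}  observed={observed:.4f}  Chebyshev bound={bound:.4f}")
```

For Gaussian data the observed fractions sit well below the bound, which is expected because Chebyshev makes no assumption about the shape of the distribution.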
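  • 9. Example
      Complement form: $P(|X - \mu| \le \epsilon) \ge 1 - \frac{\sigma^2}{\epsilon^2}$
      Q: A manufacturer produces X cars in a week, where X is a RV with mean 40 and variance 100. Bound the probability of producing 60 or more units, and of producing 25 or fewer units.
      Ans: According to the question, $\mu = 40$ and $\sigma^2 = 100$. The two events are $P(X \ge 60) = P(X - 40 \ge 20)$ and $P(X \le 25) = P(X - 40 \le -15)$.
      Each one-sided event is contained in the corresponding two-sided event, so from Chebyshev's inequality
      $P(X - 40 \ge 20) \le P(|X - 40| \ge 20) \le \frac{\sigma^2}{\epsilon^2} = \frac{100}{20^2} = 0.25$
      Similarly, with $\epsilon = 15$,
      $P(X - 40 \le -15) \le P(|X - 40| \ge 15) \le \frac{100}{15^2} \approx 0.44$
      and, in complement form, production stays between 25 and 55 units with probability
      $P(|X - 40| \le 15) \ge 1 - \frac{\sigma^2}{\epsilon^2} = 1 - \frac{100}{15^2} \approx 0.56$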
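The numbers in this example follow directly from inequality (6). A short sketch, with hypothetical helper names, reproduces them:

```python
def chebyshev_tail_bound(variance, eps):
    """Upper bound on P(|X - mu| >= eps) from Chebyshev's inequality (eps > 0)."""
    if variance < 0 or eps <= 0:
        raise ValueError("need variance >= 0 and eps > 0")
    return min(1.0, variance / eps ** 2)

def chebyshev_inside_bound(variance, eps):
    """Lower bound on P(|X - mu| <= eps), the complement form."""
    return 1.0 - chebyshev_tail_bound(variance, eps)

variance = 100.0  # sigma^2 from the example; the mean is 40

print(chebyshev_tail_bound(variance, 20.0))    # bounds P(X >= 60)
print(chebyshev_inside_bound(variance, 15.0))  # bounds P(25 <= X <= 55)
```

Capping the tail bound at 1 keeps the result a valid probability when ε is smaller than the standard deviation.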
  • 10. Moment generating function (MGF)
      Let X be a RV. Its MGF is defined as
      $M_X(t) = E(e^{tX}) = E\left[1 + tX + \frac{t^2 X^2}{2!} + \frac{t^3 X^3}{3!} + \cdots\right]$
      where $t$ is a constant. Differentiating $n$ times with respect to $t$ and evaluating at $t = 0$ gives the $n$-th moment:
      $\left.\frac{d^n M_X(t)}{dt^n}\right|_{t=0} = E[X^n]$
      Chernoff's inequality
      Let X be a RV; then $e^{tX}$ is also a RV for constant $t$, and it is non-negative. For $t > 0$ the map $x \mapsto e^{tx}$ is increasing, so applying Markov's inequality to $e^{tX}$ gives
      $P(X \ge a) = P(e^{tX} \ge e^{ta}) \le \frac{E(e^{tX})}{e^{ta}}$  (9)
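Because inequality (9) holds for every t > 0, the tightest bound comes from minimizing the right-hand side over t. The sketch below illustrates this for a standard normal variable, whose MGF is exp(t²/2); this distribution, the threshold a = 2, the grid of t values, and the function names are all assumptions made for illustration and do not come from the slides.

```python
import math

def gaussian_mgf(t):
    """MGF of a standard normal variable: E[e^{tX}] = exp(t^2 / 2)."""
    return math.exp(0.5 * t * t)

def chernoff_bound(mgf, a, t_values):
    """Smallest value of mgf(t) / e^{t a} over the supplied t > 0."""
    return min(mgf(t) / math.exp(t * a) for t in t_values)

a = 2.0
t_grid = [0.01 * k for k in range(1, 1001)]  # t in (0, 10], simple grid search

bound = chernoff_bound(gaussian_mgf, a, t_grid)
closed_form = math.exp(-a * a / 2.0)               # minimum is attained at t = a
exact_tail = 0.5 * math.erfc(a / math.sqrt(2.0))   # exact P(X >= a) for N(0, 1)

print(bound, closed_form, exact_tail)
```

A grid search is crude but keeps the example dependency-free; when the MGF is known in closed form, the minimizing t can usually be found analytically.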
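  • 11. Jensen's inequality
      For a real convex function $\varphi$, numbers $x_1, x_2, \ldots, x_n$ in its domain, and positive weights $a_i$, Jensen's inequality can be stated as
      $\varphi\left(\frac{\sum a_i x_i}{\sum a_i}\right) \le \frac{\sum a_i \varphi(x_i)}{\sum a_i}$  (10)
      and the inequality is reversed if $\varphi$ is concave:
      $\varphi\left(\frac{\sum a_i x_i}{\sum a_i}\right) \ge \frac{\sum a_i \varphi(x_i)}{\sum a_i}$  (11)
      Equality holds if and only if $x_1 = x_2 = \cdots = x_n$ or $\varphi$ is linear on a domain containing $x_1, x_2, \ldots, x_n$.
      Ex: $\varphi(x) = \log x$ is a concave function, so for positive $x_i$ and equal weights (11) gives
      $\log\left(\frac{x_1 + x_2 + \cdots + x_n}{n}\right) \ge \frac{\log x_1 + \log x_2 + \cdots + \log x_n}{n} = \log (x_1 x_2 \cdots x_n)^{\frac{1}{n}}$  (12)
      Since $\log$ is increasing, this is the arithmetic mean-geometric mean inequality:
      $\frac{x_1 + x_2 + \cdots + x_n}{n} \ge (x_1 x_2 \cdots x_n)^{\frac{1}{n}}$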
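Inequalities (10) to (12) can be verified on a small set of numbers. The values, the equal weights, and the helper `weighted_mean` are illustrative assumptions; φ(x) = x² is used as a convex example and φ(x) = log x as the concave one from the slide.

```python
import math

def weighted_mean(values, weights):
    """Weighted average: sum(a_i * v_i) / sum(a_i)."""
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)

xs = [1.0, 4.0, 9.0, 16.0]       # arbitrary positive numbers
weights = [1.0, 1.0, 1.0, 1.0]   # equal weights, as in equation (12)

# Convex case (10): phi(x) = x^2.
convex_lhs = weighted_mean(xs, weights) ** 2
convex_rhs = weighted_mean([x ** 2 for x in xs], weights)

# Concave case (11): phi(x) = log x.
concave_lhs = math.log(weighted_mean(xs, weights))
concave_rhs = weighted_mean([math.log(x) for x in xs], weights)

# Exponentiating both sides of (12) gives the AM-GM inequality.
arithmetic_mean = weighted_mean(xs, weights)
geometric_mean = math.exp(concave_rhs)

print(convex_lhs, convex_rhs)
print(concave_lhs, concave_rhs)
print(arithmetic_mean, geometric_mean)
```

Changing `weights` to unequal positive values exercises the general weighted form of (10) and (11).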
  • 12. References
      E. Alpaydin, Introduction to Machine Learning. MIT Press, 2020.
      T. M. Mitchell, The Discipline of Machine Learning. Carnegie Mellon University, School of Computer Science, Machine Learning, 2006, vol. 9.
      J. Grus, Data Science from Scratch: First Principles with Python. O'Reilly Media, 2019.