Learning bounds for Risk-sensitive learning
… or, “Robust and Fair ML with Vapnik & Chervonenkis”

Jaeho Lee, Sejun Park, Jinwoo Shin
Korea Advanced Institute of Science and Technology (KAIST)

Contact: jaeho-lee@kaist.ac.kr
Code: https://github.com/jaeho-lee/oce
Motivation: Robust and fair learning

Truth. Empirical risk minimization (ERM) is a theoretical foundation for ML.

    \hat{f}_{\mathsf{erm}} \triangleq \operatorname{argmin}_{f \in \mathcal{F}} \; \sum_{i=1}^{n} \frac{1}{n} \cdot f(Z_i)
Also Truth. Modern-day ML is more than just ERM: we weigh samples differently, based on their loss values.

    \hat{f} \triangleq \operatorname{argmin}_{f \in \mathcal{F}} \; \sum_{i=1}^{n} w_i \cdot f(Z_i)

Here each weight w_i depends on f(Z_i), relative to f(Z_1), f(Z_2), …, f(Z_n).
Examples.
- Robust learning with outliers / noisy labels (high-loss samples are ignored) [1]
- Curriculum learning (low-loss samples are prioritized) [2]
- Fair ML, with individual fairness criteria (low-loss samples are ignored) [3]

[1] e.g., Han et al., “Co-teaching: Robust training of deep neural networks with extremely noisy labels,” NeurIPS 2018.
[2] e.g., Pawan Kumar et al., “Self-paced learning for latent variable models,” NeurIPS 2010.
[3] e.g., Williamson et al., “Fairness risk measures,” ICML 2019.
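The weighting schemes in the examples above can be sketched concretely. The following is a minimal illustration of our own (not from the slides; the function `loss_dependent_weights`, its `scheme` names, and the parameter `k` are all hypothetical) showing how per-sample weights might be derived from loss values:

```python
import numpy as np

def loss_dependent_weights(losses, scheme="trim", k=2):
    """Hypothetical per-sample weights based on loss values.

    'trim'  : ignore the k highest-loss samples (robust learning).
    'focus' : keep only the k lowest-loss samples (curriculum-style).
    """
    losses = np.asarray(losses, dtype=float)
    order = np.argsort(losses)          # indices from lowest to highest loss
    w = np.zeros_like(losses)
    if scheme == "trim":
        keep = order[:-k] if k > 0 else order
    elif scheme == "focus":
        keep = order[:k]
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    w[keep] = 1.0 / len(keep)           # uniform weight over the kept samples
    return w

losses = [0.2, 0.5, 0.1, 3.0, 0.4, 8.0]
w = loss_dependent_weights(losses, "trim", k=2)  # zeroes out the two outliers
```

Note that the weights depend on each sample's loss only through its rank among all sample losses, matching the "relative to f(Z_1), …, f(Z_n)" remark above.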
Question. Can we give convergence guarantees for algorithms with loss-dependent weights?
Challenge. What theoretical framework should we use?
Framework: Optimized Certainty Equivalents (OCE)
History. Invented by Ben-Tal and Teboulle (1986) to characterize risk-aversion.

- extends the utility-theoretic perspective of von Neumann and Morgenstern.
[Figure: a concave utility curve mapping objective income to subjective utility, illustrating diminishing marginal utility: successive income increments Δ1, Δ2, Δ3 yield shrinking utility gains.]
Definition. Capture the risk-averse behavior using a convex disutility function ϕ (i.e., negative utility):

    \mathsf{oce}(f, P) \triangleq \inf_{\lambda \in \mathbb{R}} \left\{ \lambda + \mathbb{E}_P[\phi(f(Z) - \lambda)] \right\}

Here λ is the certain present loss, and \mathbb{E}_P[\phi(f(Z) - \lambda)] is the uncertain future disutility.
ML view. We are penalizing the average loss plus a deviation penalty:

    \mathsf{oce}(f, P) = \mathbb{E}_P[f(Z)] + \inf_{\lambda \in \mathbb{R}} \mathbb{E}_P[\varphi(f(Z) - \lambda)]

… for some convex φ(t) = ϕ(t) − t. The second term acts as a “deviation penalty” measuring how far the losses f(Z) spread from the optimized anchor λ*.
Examples. This framework covers a wide range of “risk-averse” measures of loss:
- Average + variance penalty [1]
- Conditional value-at-risk (i.e., ignore low-loss samples) [2]
- Entropic risk measure (i.e., exponentially tilted loss) [3]

Note: OCE is complementary to rank-based approaches (come to our poster session for details!).

[1] e.g., Maurer and Pontil, “Empirical Bernstein bounds and sample variance penalization,” COLT 2009.
[2] e.g., Curi et al., “Adaptive sampling for stochastic risk-averse learning,” NeurIPS 2020.
[3] e.g., Li et al., “Tilted empirical risk minimization,” arXiv 2020.
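To make the definition concrete, here is a small numerical sketch of our own (not from the paper): the empirical OCE is a one-dimensional minimization over λ, and with the CVaR disutility ϕ(t) = max(t, 0)/α it returns the average of the worst α-fraction of losses. For piecewise-linear disutilities the infimum is attained at one of the sample losses, so a scan over the samples suffices:

```python
import numpy as np

def oce(losses, phi):
    """Empirical OCE: inf over lambda of { lambda + mean(phi(loss - lambda)) }.

    For piecewise-linear disutilities (such as CVaR's) the infimum is
    attained at one of the sample losses, so scanning them is exact.
    """
    losses = np.asarray(losses, dtype=float)
    return min(lam + phi(losses - lam).mean() for lam in np.unique(losses))

alpha = 0.25
cvar_phi = lambda t: np.maximum(t, 0.0) / alpha  # CVaR disutility

losses = [0.1, 0.2, 0.3, 0.4, 1.0, 1.2, 2.0, 4.0]
cvar = oce(losses, cvar_phi)  # = 3.0, the mean of the worst 25% (2.0 and 4.0)
```

With the identity disutility ϕ(t) = t, the objective λ + mean(loss − λ) no longer depends on λ, and the OCE reduces to the plain average loss, consistent with the “average loss + deviation penalty” decomposition above.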
Inverted OCE. A new notion to address “risk-seeking” algorithms (e.g., ignore high-loss samples):

    \overline{\mathsf{oce}}(f, P) \triangleq \mathbb{E}_P[f(Z)] - \inf_{\lambda \in \mathbb{R}} \mathbb{E}_P[\varphi(\lambda - f(Z))]
Results: Two learning bounds.

What we do. We analyze the empirical OCE minimization procedure, just as Vapnik & Chervonenkis studied empirical risk minimization (we also give an inverted-OCE version):

    \hat{f}_{\mathsf{eom}} \triangleq \operatorname{argmin}_{f \in \mathcal{F}} \; \mathsf{oce}(f, P_n)
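The procedure can be sketched on a toy problem (our own illustration; the one-parameter class f_θ(z) = (z − θ)² and the grid search over θ are hypothetical choices, not from the paper):

```python
import numpy as np

def empirical_oce(losses, phi):
    # inf over lambda, scanned at the sample losses (exact for CVaR's
    # piecewise-linear disutility).
    losses = np.asarray(losses, dtype=float)
    return min(lam + phi(losses - lam).mean() for lam in np.unique(losses))

rng = np.random.default_rng(0)
z = rng.normal(0.0, 1.0, size=200)

alpha = 0.5
phi = lambda t: np.maximum(t, 0.0) / alpha  # CVaR disutility: focus on worst half

# Empirical OCE minimization over the toy class f_theta(z) = (z - theta)^2.
thetas = np.linspace(-1.0, 1.0, 201)
objective = [empirical_oce((z - th) ** 2, phi) for th in thetas]
theta_hat = thetas[int(np.argmin(objective))]
```

Whereas ERM would weigh all squared losses equally, the CVaR objective concentrates on the hardest samples, so the minimizer θ̂ generally differs from the ERM solution.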
In a nutshell. We give learning bounds of two different types.

Theorem 3 (excess OCE bound):

    \mathsf{oce}(\hat{f}_{\mathsf{eom}}, P) - \inf_{f \in \mathcal{F}} \mathsf{oce}(f, P) \approx \mathcal{O}\!\left( \frac{\mathsf{Lip}(\phi) \cdot \mathsf{comp}(\mathcal{F})}{\sqrt{n}} \right)

Theorem 6 (excess expected loss bound):

    \mathbb{E}_P[\hat{f}_{\mathsf{eom}}(Z)] - \inf_{f \in \mathcal{F}} \mathbb{E}_P[f(Z)] \approx \mathcal{O}\!\left( \frac{\mathsf{comp}(\mathcal{F})}{\sqrt{n}} \right)

(Come to our poster session for details!)
Also… We discover a relationship to the sample variance penalization (SVP) procedure, and find that SVP is a nice baseline strategy for batch-based OCE minimization (come to our poster session for details!).
TL;DR.
- We give an OCE-based theoretical framework to address robust/fair ML.
- We give excess risk bounds for empirical OCE minimizers.

Come to our Zoom session for interesting details, including:
- Further implications of our theoretical results
- Proof ideas
- Experiment details
- Comparisons with alternative frameworks
