A Statistical Perspective on Retrieval-Based Models
ICML, 2023
Soumya Basu, Ankit Singh Rawat, Manzil Zaheer
Speaker: Po-Chuan Chen
Oct 12, 2023
Table of contents
1 Abstract
2 Introduction
3 Problem setup
4 Local empirical risk minimization
5 Classification in extended feature space
6 Experiments
7 Conclusion and future direction
Abstract
This paper gives a formal treatment of retrieval-based models and characterizes their performance from a novel statistical perspective.
The authors study two complementary approaches:
Analyzing an explicit local learning framework
Learning a global model using kernel methods
Introduction
A popular way to increase the expressiveness of an ML model is to homogeneously scale the size of a parametric model.
Such large models, however, have their own limitations:
High computation cost
Catastrophic forgetting
Lack of provenance
Poor explainability
Figure 1: An illustration of a retrieval-based classification model.
Contributions
1 Setting up a formal framework for classification via retrieval-based models under local structure
2 Finite-sample analysis of an explicit local learning framework
3 Extending the analysis to a globally learnt model
4 Providing the first rigorous treatment of an end-to-end retrieval-based model, studying its generalization via kernel-based learning
Table of contents: Problem setup
Multiclass classification
Classification with local structure
Retrieval-based classification model
Multiclass classification
Here, we have access to n training examples S = {(x_i, y_i)}_{i∈[n]} ⊂ X × Y, sampled i.i.d. from the data distribution D := D_{X,Y}.
For a scorer f, the classifier takes the form
h_f(x) = arg max_{y∈Y} f_y(x)
Given a set of scorers F_global ⊆ {f : X → ℝ^{|Y|}}, learning a model amounts to finding a scorer in F_global that minimizes the misclassification error, i.e., the expected 0/1 loss:
f∗_{0/1} = arg min_{f∈F_global} ℙ_D(h_f(X) ≠ Y)
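As a concrete sketch of the decision rule h_f(x) = arg max_y f_y(x), here is a minimal Python example with a toy linear scorer. All names here are illustrative and not from the paper.

```python
import numpy as np

def classify(scorer, x):
    """Return h_f(x) = argmax_y f_y(x), where scorer maps x to per-class scores."""
    scores = scorer(x)  # vector of length |Y|
    return int(np.argmax(scores))

# A toy linear scorer f_y(x) = w_y . x over 3 classes in 2 dimensions.
W = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [-1.0, -1.0]])
scorer = lambda x: W @ x

print(classify(scorer, np.array([2.0, 0.5])))  # scores (2.0, 0.5, -2.5): class 0
```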
Multiclass classification
They use a surrogate loss [1] 𝓁 for the misclassification error and aim to minimize the associated population risk:
R_𝓁(f) = 𝔼_{(X,Y)∼D}[𝓁(f(X), Y)]
By minimizing the (global) empirical risk over the function class F_global, we can learn a good scorer:
f̂ = arg min_{f∈F_global} (1/n) ∑_{i∈[n]} 𝓁(f(x_i), y_i)
where R̂_𝓁(f) := (1/n) ∑_{i∈[n]} 𝓁(f(x_i), y_i) denotes the empirical risk.
Classification with local structure
They define the data in each local neighborhood as B_{x,r} := {x′ ∈ X : 𝕕(x, x′) ≤ r}, where x ∈ X and r > 0.
D_{x,r} is the data distribution restricted to B_{x,r}:
D_{x,r}(A) = D(A) / D(B_{x,r} × Y), for A ⊆ B_{x,r} × Y
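The neighborhood B_{x,r} is what a retrieval step materializes from the sample S. A minimal sketch, assuming Euclidean distance for 𝕕 (names are illustrative):

```python
import numpy as np

def retrieve(S_x, S_y, x, r):
    """Return the training pairs falling in the ball B_{x,r} = {x' : ||x - x'||_2 <= r}."""
    dists = np.linalg.norm(S_x - x, axis=1)
    mask = dists <= r
    return S_x[mask], S_y[mask]

S_x = np.array([[0.0, 0.0], [0.1, 0.0], [2.0, 2.0]])
S_y = np.array([0, 0, 1])
Rx_x, Rx_y = retrieve(S_x, S_y, np.array([0.0, 0.0]), r=0.5)
print(len(Rx_x))  # 2 neighbors fall within radius 0.5
```

In practice the ball would be found with an approximate nearest-neighbor index rather than a linear scan; the set returned is the same.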
Classification with local structure
The local structure condition states that the local function class approximates the Bayes optimal scorer for the local classification problem: for a given 𝜀_X > 0 and all x ∈ X,
min_{f∈F_x} R^x_𝓁(f) ≤ min_{f∈F_global} R^x_𝓁(f) + 𝜀_X
where the local population risk is defined as
R^x_𝓁(f) = 𝔼_{(X′,Y′)∼D_{x,r}}[𝓁(f(X′), Y′)]
Retrieval-based classification model
This paper focuses on retrieval-based methods.
In local empirical risk minimization (ERM), given an instance x, the approach first retrieves a neighboring set R_x = {(x′_j, y′_j)} ⊆ S.
It then identifies a scorer f̂_x from a function class F_loc ⊂ {f : X → ℝ^{|Y|}}:
f̂_x = arg min_{f∈F_loc} (1/|R_x|) ∑_{(x′,y′)∈R_x} 𝓁(f(x′), y′)
If |R_x| = 0, f̂_x ∈ F_loc is chosen arbitrarily.
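The local ERM procedure can be sketched as follows. For simplicity, this toy F_loc is the class of constant scorers, for which the ERM solution reduces to a majority vote over the retrieved labels; the paper itself considers richer classes such as linear models and FC-DNNs.

```python
import numpy as np

def local_erm_predict(S_x, S_y, x, r, num_classes):
    """Local ERM sketch: retrieve R_x = B_{x,r} ∩ S, then minimize the empirical
    risk over F_loc.  With F_loc taken to be constant scorers, the ERM solution
    is the majority label of the retrieved set."""
    mask = np.linalg.norm(S_x - x, axis=1) <= r
    labels = S_y[mask]
    if labels.size == 0:  # |R_x| = 0: the scorer is chosen arbitrarily
        return 0
    counts = np.bincount(labels, minlength=num_classes)
    return int(np.argmax(counts))

S_x = np.array([[0.0, 0.0], [0.2, 0.0], [3.0, 3.0]])
S_y = np.array([1, 1, 0])
print(local_erm_predict(S_x, S_y, np.array([0.0, 0.0]), r=0.5, num_classes=2))  # 1
```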
Retrieval-based classification model
Another approach is classification with an extended feature space, where the scorer directly maps the augmented input (x, R_x) ∈ X × (X × Y)∗ to per-class scores.
A scorer over the extended feature space X × (X × Y)∗ can be learned as follows:
f̂_ex = arg min_{f∈F_ex} R̂^ex_𝓁(f)
where R̂^ex_𝓁(f) := (1/n) ∑_{i∈[n]} 𝓁(f(x_i, R_{x_i}), y_i), and the function class of interest over the extended space is denoted F_ex ⊂ {f : X × (X × Y)∗ → ℝ^{|Y|}}.
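As an illustration of a scorer on the extended feature space, the sketch below scores each class by kernel-weighted votes from the retrieved set. This is one simple hypothetical member of F_ex, not the paper's learned model.

```python
import numpy as np

def extended_scorer(x, Rx_x, Rx_y, num_classes, bandwidth=1.0):
    """A simple member of F_ex: f_y(x, R_x) = sum_{(x',y') in R_x} k(x, x') 1[y' = y],
    with a Gaussian kernel k."""
    if len(Rx_x) == 0:
        return np.zeros(num_classes)  # no retrieved evidence: uninformative scores
    w = np.exp(-np.linalg.norm(Rx_x - x, axis=1) ** 2 / (2 * bandwidth ** 2))
    scores = np.zeros(num_classes)
    for wi, yi in zip(w, Rx_y):
        scores[yi] += wi
    return scores

Rx_x = np.array([[0.0, 0.0], [5.0, 5.0]])
Rx_y = np.array([0, 1])
print(np.argmax(extended_scorer(np.array([0.0, 0.0]), Rx_x, Rx_y, 2)))  # 0
```

An end-to-end retrieval-based model would learn such a scorer jointly; the fixed kernel here only illustrates the input/output interface.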
Table of contents: Local empirical risk minimization
Assumptions
Excess risk bound for local ERM
Illustrative examples
Endowing local ERM with global representations
Local empirical risk minimization
The goal is to characterize the excess risk of local ERM, i.e., to bound
𝔼_{(X,Y)∼D}[𝓁(f̂_X(X), Y) − 𝓁(f∗(X), Y)]
Note that f̂_X(X) above is a function of the retrieved set R_X.
Assumptions
First, they define the margin of a scorer f at a given label y ∈ Y as
𝛾_f(x, y) = f_y(x) − max_{y′≠y} f_{y′}(x)
To ensure that the margin of the scorer f deviates smoothly as x varies, a scorer f is called L-coordinate Lipschitz iff for all y ∈ Y and x, x′ ∈ X,
|f_y(x) − f_y(x′)| ≤ L∥x − x′∥_2
They also define a weak margin condition: given a distribution D, a scorer f satisfies the (𝛼, c)-weak margin condition iff, for all t ≥ 0,
ℙ_{(X,Y)∼D}(|𝛾_f(X, Y)| ≤ t) ≤ c t^𝛼
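The margin and the weak margin condition can be checked empirically. A minimal sketch (function names are illustrative):

```python
import numpy as np

def margin(scores, y):
    """gamma_f(x, y) = f_y(x) - max_{y' != y} f_{y'}(x), from a score vector."""
    rest = np.delete(scores, y)
    return scores[y] - rest.max()

def weak_margin_fraction(all_scores, labels, t):
    """Empirical analogue of P(|gamma_f(X, Y)| <= t) over a sample."""
    gammas = np.array([margin(s, y) for s, y in zip(all_scores, labels)])
    return float(np.mean(np.abs(gammas) <= t))

print(margin(np.array([2.0, 0.5, -1.0]), 0))  # 1.5
```

Plotting `weak_margin_fraction` against t would reveal the exponent 𝛼 governing how much probability mass sits near the decision boundary.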
Assumption 3.1 (True scorer function)
The scorer f_true generates the true label for all (x, y) ∈ X × Y, i.e., 𝛾_{f_true}(x, y) > 0; moreover, f_true is L_true-coordinate Lipschitz and satisfies the (𝛼_true, c_true)-weak margin condition.
Assumption 3.2 (Margin-based Lipschitz loss)
For any example (x, y) and any scorer f, we have 𝓁(f(x), y) = 𝓁(𝛾_f(x, y)), where 𝓁 is a decreasing function of the margin.
Moreover, the loss function 𝓁 is an L_𝓁-Lipschitz function, i.e., |𝓁(𝛾) − 𝓁(𝛾′)| ≤ L_𝓁|𝛾 − 𝛾′| for all 𝛾 ≥ 𝛾′.
Assumption 3.3 (Data regularity condition)
Weak density condition. There exist constants c_wdc > 0 and 𝛿_wdc > 0 such that for all x ∈ X and r with 𝜌_D(x) r^d ≤ 𝛿^d_wdc,
ℙ_{X′∼D}[𝕕(X′, x) ≤ r] ≥ c^d_wdc 𝜌_D(x) r^d
Density level-set. There exists a function f_𝜌(𝛿) with f_𝜌(𝛿) → 0 as 𝛿 → 0, such that for any 𝛿 > 0,
ℙ_{X∼D}[𝜌_D(X) ≤ f_𝜌(𝛿)] ≤ 𝛿
Assumption 3.4 (Weak + density condition)
There exist constants c_wdc+ ≥ 0 and 𝛼_wdc+ > 0 such that for all x ∈ X and r ∈ [0, r_max],
|ℙ_{X′∼D}[𝕕(X′, x) ≤ r] / (𝜌_D(x) vol_d(r)) − 1| ≤ c_wdc+ r^{𝛼_wdc+}
Under this assumption, the local ERM error bounds can be tightened further.
Excess risk bound for local ERM
They proceed to their main results on the excess risk bound of local ERM.
At x ∈ X, f_{x,∗} denotes the minimizer of the population version of the local loss, and f∗ that of the global loss:
f_{x,∗} = arg min_{f∈F_loc} R^x_𝓁(f);  f∗ = arg min_{f∈F_global} R_𝓁(f)
The next slide shows how the expected excess risk of the local ERM solution f̂_X is bounded, via a risk decomposition.
Risk decomposition
𝔼_{(X,Y)∼D}[𝓁(f̂_X(X), Y) − 𝓁(f∗(X), Y)]
≤ 𝔼_{(X,Y)∼D}[R^X_𝓁(f_{X,∗}) − R^X_𝓁(f∗)]  (Local vs Global Population Optimal Risk)
+ ∑_{F∈{F_global, F_loc}} 𝔼_{(X,Y)∼D}[sup_{f∈F} R^X_𝓁(f) − 𝓁(f(X), Y)]  (Global and Local: Sample vs Retrieved Set Risk)
+ 𝔼_{(X,Y)∼D}[sup_{f∈F_loc} R^X_𝓁(f) − R̂^X_𝓁(f)]  (Generalization of Local ERM)
+ 𝔼_{(X,Y)∼D}[R^X_𝓁(f_{X,∗}) − R̂^X_𝓁(f_{X,∗})]  (Central Absolute Moment of f_{X,∗})
Excess risk bound for local ERM
A tighter bound can be obtained by utilizing the local structure of the distribution D_{X,r}. For any L > 0, define
M_r(L; 𝓁, f_true, F) := 2L_𝓁 (Lr + (2∥F∥_∞ − Lr) c_true (2L_true r)^{𝛼_true})
For any x ∈ X, the weak density condition provides a high-probability lower bound on the size of the retrieved set R_x.
Proposition 3.6
Under Assumption 3.3, for any x ∈ X, r > 0, and 𝛿 > 0,
ℙ_D[|R_x| < N(r, 𝛿)] ≤ 𝛿
for N(r, 𝛿) = n (c^d_wdc min{f_𝜌(𝛿/2) r^d, 𝛿^d_wdc} − √(log(2/𝛿) / (2n)))
The next slide states the resulting excess risk bound on the local ERM solution f̂_X.
Theorem 3.7 (Excess risk bound)
𝔼_{(X,Y)∼D}[𝓁(f̂_X(X), Y) − 𝓁(f∗(X), Y)]
≤ (𝜀_X + 𝜀_loc)  (Local vs Global Optimal Loss (I))
+ M_r(L_loc; 𝓁, f_true, F_loc) + M_r(L_global; 𝓁, f_true, F_global)  (Global and Local: Sample vs Retrieved Set Risk (II))
+ 𝔼_{(X,Y)∼D}[ℜ_{R_X}(G(X, Y)) | |R_X| ≥ N(r, 𝛿)] + 5 M_r(L_loc; 𝓁, f_true, F_loc) √(2 ln(4/𝛿) / N(r, 𝛿)) + 8𝛿 L_𝓁 ∥F_loc∥_∞  (Generalization of Local ERM and Central Absolute Moment of f_{X,∗} (III))
Excess risk bound for local ERM
The result shows a trade-off between approximation and generalization error as the retrieval radius r varies.
Approximation error. It comprises two components, terms (I) and (II) in Theorem 3.7.
Generalization error. Term (III) depends on the size of the retrieved set R_X and the Rademacher complexity of G(X, Y), which is determined by F_loc.
Under the local ERM setting, for a fixed F_loc, the total approximation error increases with the radius r, while the generalization error decreases.
Illustrative examples
Local linear models. In this setting, F_loc is the class of linear classifiers in d dimensions.
Excess Risk ≤ O(r²)  (I)
+ O(r^{min{𝛼_true, 1}})  (II)
+ O(d / (n^{(2d−1)/2d} r^{d/2}) + r^{min{𝛼_true, 1}} / (n^{(2d−1)/4d} r^{d/2}) + 1 / n^{1/2d})  (III)
Illustrative examples
Feed-forward classifiers. As another example, they study the setting where F_loc is the class of fully connected deep neural networks (FC-DNN).
Excess Risk ≤ O(r^{q_max+1})  (I)
+ O(r^{min{𝛼_true, 1}})  (II)
+ O(q_max^{3/4} ln(d q_max / r)^{3/4} ln(n)^{3/2} / (n^{(2d−1)/2d} r^{d/2}) + r^{min{𝛼_true, 1}} / (n^{(2d−1)/4d} r^{d/2}) + 1 / n^{1/2d})  (III)
Endowing local ERM with global representations
The local ERM method takes a myopic view and does not aim to learn a global hypothesis that explains the entire data distribution.
This approach may result in poor performance in regions of the input domain that are not well represented in the training set.
A two-stage approach enables local learning to benefit from good-quality global representations, especially in sparse data regions.
Endowing local ERM with global representations
They discuss a two-stage approach to address this potential shortcoming of local empirical risk minimization (ERM) in retrieval-based models:
In the first stage, a global representation is learned using the entire dataset.
In the second stage, the learned global representation is utilized at test time while solving the local ERM as previously defined.
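The two-stage procedure can be sketched as below, with the learned global representation abstracted as an `embed` function (all names are illustrative; the local learner is again a simple majority vote for brevity):

```python
import numpy as np

def two_stage_predict(S_x, S_y, x, r, num_classes, embed):
    """Two-stage sketch: (1) a global representation `embed`, learned offline on the
    whole dataset, is passed in as a function; (2) at test time, retrieval and local
    ERM (majority vote over constant scorers, for simplicity) run in the embedding
    space rather than the raw input space."""
    Z = np.array([embed(xi) for xi in S_x])  # stage-1 embeddings, precomputable
    z = embed(x)
    mask = np.linalg.norm(Z - z, axis=1) <= r
    labels = S_y[mask]
    if labels.size == 0:
        return 0
    return int(np.argmax(np.bincount(labels, minlength=num_classes)))

S_x = np.array([[0.0, 0.0], [0.1, 0.0], [4.0, 4.0]])
S_y = np.array([1, 1, 0])
print(two_stage_predict(S_x, S_y, np.array([0.0, 0.0]), 1.0, 2, lambda v: v))  # 1
```

A good `embed` places semantically similar points close together, so retrieval in embedding space remains informative even where raw-input neighborhoods are sparse.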
Classification in extended feature space
The scorer function can implicitly solve the local empirical risk minimization (ERM) using retrieved neighboring labeled instances to make the classification prediction.
The objective is to learn a function f : X × (X × Y)∗ → ℝ^{|Y|}.
They also discuss a kernel-based approach to classification in the extended feature space, where the scorer function is represented as a linear combination of kernel functions evaluated on the extended feature space.
Experiments
The paper reports experiments on both synthetic and real datasets to demonstrate the benefits of retrieval-based models in classification tasks:
Synthetic: binary classification
CIFAR-10: binary classification
ImageNet: 1000-way classification
The experiments show that retrieval-based models can achieve good performance with much simpler function classes than traditional parametric and nonparametric models.
Figure 2: Performance of local ERM with the size of the retrieved set across models of different complexity.
Conclusion and future direction
The main contributions of the paper include:
A formal framework for retrieval-based models
Analysis of local and global learning frameworks
Empirical results that support the theoretical findings
As future work, one could explore the use of retrieval-based models in other machine learning tasks beyond classification.
References
[1] Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe. "Convexity, Classification, and Risk Bounds". In: Journal of the American Statistical Association 101.473 (2006), pp. 138–156. ISSN: 0162-1459. URL: http://www.jstor.org/stable/30047445.

 
Is Reinforcement Learning (Not) for Natural Language Processing.pdf
Is Reinforcement Learning (Not) for Natural
Language Processing.pdfIs Reinforcement Learning (Not) for Natural
Language Processing.pdf
Is Reinforcement Learning (Not) for Natural Language Processing.pdfPo-Chuan Chen
 
HyperPrompt:Prompt-based Task-Conditioning of Transformerspdf
HyperPrompt:Prompt-based Task-Conditioning of TransformerspdfHyperPrompt:Prompt-based Task-Conditioning of Transformerspdf
HyperPrompt:Prompt-based Task-Conditioning of TransformerspdfPo-Chuan Chen
 
Training language models to follow instructions with human feedback.pdf
Training language models to follow instructions
with human feedback.pdfTraining language models to follow instructions
with human feedback.pdf
Training language models to follow instructions with human feedback.pdfPo-Chuan Chen
 

More from Po-Chuan Chen (20)

E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation.pdf
E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation.pdfE-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation.pdf
E-CORE: Emotion Correlation Enhanced Empathetic Dialogue Generation.pdf
 
Effective Structured Prompting by Meta-Learning and Representative Verbalizer...
Effective Structured Prompting by Meta-Learning and Representative Verbalizer...Effective Structured Prompting by Meta-Learning and Representative Verbalizer...
Effective Structured Prompting by Meta-Learning and Representative Verbalizer...
 
Quark: Controllable Text Generation with Reinforced [Un]learning.pdf
Quark: Controllable Text Generation with Reinforced [Un]learning.pdfQuark: Controllable Text Generation with Reinforced [Un]learning.pdf
Quark: Controllable Text Generation with Reinforced [Un]learning.pdf
 
Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible...
Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible...Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible...
Empathetic Dialogue Generation via Sensitive Emotion Recognition and Sensible...
 
On the Effectiveness of Offline RL for Dialogue Response Generation.pdf
On the Effectiveness of Offline RL for Dialogue Response Generation.pdfOn the Effectiveness of Offline RL for Dialogue Response Generation.pdf
On the Effectiveness of Offline RL for Dialogue Response Generation.pdf
 
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transfor...
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transfor...GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transfor...
GPTQ: Accurate Post-Training Quantization for Generative Pre-trained Transfor...
 
A Neural Corpus Indexer for Document Retrieval.pdf
A Neural Corpus Indexer for Document Retrieval.pdfA Neural Corpus Indexer for Document Retrieval.pdf
A Neural Corpus Indexer for Document Retrieval.pdf
 
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning.pdf
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning.pdfAdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning.pdf
AdaMix: Mixture-of-Adaptations for Parameter-efficient Model Tuning.pdf
 
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attent...
 
Active Retrieval Augmented Generation.pdf
Active Retrieval Augmented Generation.pdfActive Retrieval Augmented Generation.pdf
Active Retrieval Augmented Generation.pdf
 
Offline Reinforcement Learning for Informal Summarization in Online Domains.pdf
Offline Reinforcement Learning for Informal Summarization in Online Domains.pdfOffline Reinforcement Learning for Informal Summarization in Online Domains.pdf
Offline Reinforcement Learning for Informal Summarization in Online Domains.pdf
 
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdfCold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
Cold_Start_Reinforcement_Learning_with_Softmax_Policy_Gradient.pdf
 
Image_to_Prompts.pdf
Image_to_Prompts.pdfImage_to_Prompts.pdf
Image_to_Prompts.pdf
 
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdfRetrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks.pdf
 
Evaluating Parameter Efficient Learning for Generation.pdf
Evaluating Parameter Efficient Learning for Generation.pdfEvaluating Parameter Efficient Learning for Generation.pdf
Evaluating Parameter Efficient Learning for Generation.pdf
 
Off-Policy Deep Reinforcement Learning without Exploration.pdf
Off-Policy Deep Reinforcement Learning without Exploration.pdfOff-Policy Deep Reinforcement Learning without Exploration.pdf
Off-Policy Deep Reinforcement Learning without Exploration.pdf
 
A Mixture-of-Expert Approach to RL-based Dialogue Management.pdf
A Mixture-of-Expert Approach to RL-based Dialogue Management.pdfA Mixture-of-Expert Approach to RL-based Dialogue Management.pdf
A Mixture-of-Expert Approach to RL-based Dialogue Management.pdf
 
Is Reinforcement Learning (Not) for Natural Language Processing.pdf
Is Reinforcement Learning (Not) for Natural
Language Processing.pdfIs Reinforcement Learning (Not) for Natural
Language Processing.pdf
Is Reinforcement Learning (Not) for Natural Language Processing.pdf
 
HyperPrompt:Prompt-based Task-Conditioning of Transformerspdf
HyperPrompt:Prompt-based Task-Conditioning of TransformerspdfHyperPrompt:Prompt-based Task-Conditioning of Transformerspdf
HyperPrompt:Prompt-based Task-Conditioning of Transformerspdf
 
Training language models to follow instructions with human feedback.pdf
Training language models to follow instructions
with human feedback.pdfTraining language models to follow instructions
with human feedback.pdf
Training language models to follow instructions with human feedback.pdf
 

Recently uploaded

Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Christo Ananth
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130Suhani Kapoor
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxpurnimasatapathy1234
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVRajaP95
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxupamatechverse
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...ranjana rawat
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxpranjaldaimarysona
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSKurinjimalarL3
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls in Nagpur High Profile
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSSIVASHANKAR N
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINESIVASHANKAR N
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )Tsuyoshi Horigome
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)Suman Mia
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations120cr0395
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Dr.Costas Sachpazis
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).pptssuser5c9d4b1
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur High Profile
 

Recently uploaded (20)

DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINEDJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
DJARUM4D - SLOT GACOR ONLINE | SLOT DEMO ONLINE
 
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
Call for Papers - Educational Administration: Theory and Practice, E-ISSN: 21...
 
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
VIP Call Girls Service Kondapur Hyderabad Call +91-8250192130
 
Microscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptxMicroscopic Analysis of Ceramic Materials.pptx
Microscopic Analysis of Ceramic Materials.pptx
 
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IVHARMONY IN THE NATURE AND EXISTENCE - Unit-IV
HARMONY IN THE NATURE AND EXISTENCE - Unit-IV
 
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(ANJALI) Dange Chowk Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Introduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptxIntroduction to IEEE STANDARDS and its different types.pptx
Introduction to IEEE STANDARDS and its different types.pptx
 
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
(PRIYA) Rajgurunagar Call Girls Just Call 7001035870 [ Cash on Delivery ] Pun...
 
Processing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptxProcessing & Properties of Floor and Wall Tiles.pptx
Processing & Properties of Floor and Wall Tiles.pptx
 
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICSAPPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
APPLICATIONS-AC/DC DRIVES-OPERATING CHARACTERISTICS
 
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service NashikCall Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
Call Girls Service Nashik Vaishnavi 7001305949 Independent Escort Service Nashik
 
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLSMANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
MANUFACTURING PROCESS-II UNIT-5 NC MACHINE TOOLS
 
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINEMANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
MANUFACTURING PROCESS-II UNIT-2 LATHE MACHINE
 
SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )SPICE PARK APR2024 ( 6,793 SPICE Models )
SPICE PARK APR2024 ( 6,793 SPICE Models )
 
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
★ CALL US 9953330565 ( HOT Young Call Girls In Badarpur delhi NCR
 
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)Software Development Life Cycle By  Team Orange (Dept. of Pharmacy)
Software Development Life Cycle By Team Orange (Dept. of Pharmacy)
 
Extrusion Processes and Their Limitations
Extrusion Processes and Their LimitationsExtrusion Processes and Their Limitations
Extrusion Processes and Their Limitations
 
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
Sheet Pile Wall Design and Construction: A Practical Guide for Civil Engineer...
 
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
247267395-1-Symmetric-and-distributed-shared-memory-architectures-ppt (1).ppt
 
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur EscortsCall Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
Call Girls in Nagpur Suman Call 7001035870 Meet With Nagpur Escorts
 

A Statistical Perspective on Retrieval-Based Models.pdf

2 Finite sample analysis of an explicit local learning framework
3 Extending the analysis to a globally learnt model
4 Providing the first rigorous treatment of an end-to-end retrieval-based model to study its generalization by using kernel-based learning
8 / 41
Problem setup
Table of contents I
1 Abstract
2 Introduction
3 Problem setup
    Multiclass classification
    Classification with local structure
    Retrieval-based classification model
4 Local empirical risk minimization
5 Classification in extended feature space
9 / 41
Table of contents II
6 Experiments
7 Conclusion and future direction
10 / 41
Problem setup
Multiclass classification

The learner has access to n training examples
S = {(x_i, y_i)}_{i∈[n]} ⊂ X × Y, sampled i.i.d. from the data
distribution D := D_{X,Y}.

For a scorer f, the classifier takes the form

    h_f(x) = arg max_{y∈Y} f_y(x)

Given a scorer class F_global ⊆ {f : X → ℝ^{|Y|}}, learning a model
amounts to finding a scorer in F_global that minimizes the
misclassification error, i.e., the expected 0/1 loss:

    f*_{0/1} = arg min_{f∈F_global} ℙ_D(h_f(X) ≠ Y)

11 / 41
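The scorer-to-classifier rule above can be sketched in a few lines. This is a minimal illustration, not the paper's construction: the linear scorer, its weight vectors, and the toy data are hypothetical stand-ins for an arbitrary f : X → ℝ^{|Y|}.

```python
def predict(scores):
    """h_f(x): return the label with the highest score (ties -> lowest index)."""
    return max(range(len(scores)), key=lambda y: scores[y])


def zero_one_risk(scorer, examples):
    """Empirical 0/1 loss of the classifier induced by `scorer`."""
    errors = sum(1 for x, y in examples if predict(scorer(x)) != y)
    return errors / len(examples)


# Hypothetical scorer over X = R^2 with |Y| = 2: f_y(x) = <w_y, x>.
weights = [(1.0, 0.0), (0.0, 1.0)]  # one weight vector per class
scorer = lambda x: [w[0] * x[0] + w[1] * x[1] for w in weights]

data = [((2.0, 0.5), 0), ((0.1, 3.0), 1), ((1.0, 2.0), 1)]
```

On this toy data the induced classifier h_f makes no mistakes, so the empirical 0/1 risk is zero.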
Multiclass classification

A surrogate loss [1] ℓ is used for the misclassification error, and
the aim is to minimize the associated population risk:

    R_ℓ(f) = 𝔼_{(X,Y)∼D}[ℓ(f(X), Y)]

Minimizing the (global) empirical risk over the function class
F_global yields a good scorer:

    f̂ = arg min_{f∈F_global} (1/n) Σ_{i∈[n]} ℓ(f(x_i), y_i)

with R̂_ℓ(f) := (1/n) Σ_{i∈[n]} ℓ(f(x_i), y_i).

12 / 41
Problem setup
Classification with local structure

The data in each local neighborhood is defined via
B_{x,r} := {x′ ∈ X : 𝕕(x, x′) ≤ r}, where x ∈ X and r > 0.

D_{x,r} is the data distribution restricted to B_{x,r}:

    D_{x,r}(A) = D(A) / D(B_{x,r} × Y),   for A ⊆ B_{x,r} × Y

13 / 41
Classification with local structure

This yields a local structure condition under which the local class
approximates the Bayes optimal scorer for the local classification
problem: for a given ε_X > 0 and all x ∈ X,

    min_{f∈F_x} R^x_ℓ(f) ≤ min_{f∈F_global} R^x_ℓ(f) + ε_X

where the local population risk is defined as

    R^x_ℓ(f) = 𝔼_{(X′,Y′)∼D_{x,r}}[ℓ(f(X′), Y′)]

14 / 41
Problem setup
Retrieval-based classification model

This paper focuses on retrieval-based methods. Given an instance x,
the local empirical risk minimization (ERM) approach first retrieves a
neighboring set R_x = {(x′_j, y′_j)} ⊆ S, and then identifies a scorer
f̂_x from a function class F_loc ⊂ {f : X → ℝ^{|Y|}}:

    f̂_x = arg min_{f∈F_loc} (1/|R_x|) Σ_{(x′,y′)∈R_x} ℓ(f(x′), y′)

If |R_x| = 0, f̂_x ∈ F_loc is chosen arbitrarily.

15 / 41
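The retrieve-then-fit procedure can be sketched as follows. This is a minimal sketch under two simplifying assumptions: the metric 𝕕 is Euclidean, and F_loc is the simplest possible class (constant scorers), for which local ERM under 0/1 loss reduces to a majority vote over the retrieved set R_x.

```python
import math


def retrieve(x, train, r):
    """R_x: all training pairs within distance r of the test point x."""
    dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    return [(xp, yp) for xp, yp in train if dist(x, xp) <= r]


def local_erm_predict(x, train, r, fallback=0):
    """Fit the local ERM solution on R_x; arbitrary choice when R_x is empty."""
    neighbors = retrieve(x, train, r)
    if not neighbors:
        return fallback  # |R_x| = 0: scorer chosen arbitrarily
    labels = [yp for _, yp in neighbors]
    return max(set(labels), key=labels.count)  # majority vote = constant-scorer ERM


train = [((0.0, 0.0), 0), ((0.2, 0.1), 0), ((1.0, 1.0), 1), ((0.9, 1.2), 1)]
```

The radius r plays the role of the retrieval radius in the analysis: a larger r retrieves more neighbors but mixes in points whose conditional distribution differs more from that at x.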
Retrieval-based classification model

Another approach is classification with an extended feature space: the
scorer directly maps the augmented input x × R_x ∈ X × (X × Y)* to
per-class scores. A scorer over the extended feature space
X × (X × Y)* is learned as

    f̂_ex = arg min_{f∈F_ex} R̂^ex_ℓ(f)

where R̂^ex_ℓ(f) := (1/n) Σ_{i∈[n]} ℓ(f(x_i, R_{x_i}), y_i), and the
function class of interest over the extended space is
F_ex ⊂ {f : X × (X × Y)* → ℝ^{|Y|}}.

16 / 41
Local empirical risk minimization
Table of contents I
1 Abstract
2 Introduction
3 Problem setup
4 Local empirical risk minimization
    Assumptions
    Excess risk bound for local ERM
    Illustrative examples
    Endowing local ERM with global representations
17 / 41
Table of contents II
5 Classification in extended feature space
6 Experiments
7 Conclusion and future direction
18 / 41
Local empirical risk minimization

The goal is to characterize the excess risk of local ERM, i.e., to bound

    𝔼_{(X,Y)∼D}[ℓ(f̂_X(X), Y) − ℓ(f*(X), Y)]

Here f̂_X(X) is a function of the retrieved set R_X.

19 / 41
Assumptions

First, the margin of a scorer f at a given label y ∈ Y is defined as

    γ_f(x, y) = f_y(x) − max_{y′≠y} f_{y′}(x)

To ensure the margin of f deviates smoothly as x varies, a scorer f is
said to be L-coordinate Lipschitz iff for all y ∈ Y and x, x′ ∈ X,

    |f_y(x) − f_y(x′)| ≤ L ∥x − x′∥₂

They also define a weak margin condition: given a distribution D, a
scorer f satisfies the (α, c)-weak margin condition iff, for all t ≥ 0,

    ℙ_{(X,Y)∼D}(|γ_f(X, Y)| ≤ t) ≤ c t^α

20 / 41
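A small worked example of the margin γ_f(x, y) = f_y(x) − max_{y′≠y} f_{y′}(x). The score vector is a hypothetical output of some scorer f at a single input x.

```python
def margin(scores, y):
    """Margin of the scorer at label y; positive iff y beats every other label."""
    rival = max(s for yp, s in enumerate(scores) if yp != y)
    return scores[y] - rival


# Hypothetical per-class scores f_0(x), f_1(x), f_2(x) at one input x.
scores = [2.0, 0.5, 1.0]
```

Here the margin at the top-scoring label 0 is positive (2.0 − 1.0 = 1.0), while the margin at label 2 is negative (1.0 − 2.0 = −1.0), matching the convention that a correct, confident prediction has a large positive margin.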
Assumption 3.1 (True scorer function)

There exists a scorer f_true that generates the true label for all
(x, y) ∈ X × Y, i.e., γ_{f_true}(x, y) > 0; moreover, f_true is
L_true-coordinate Lipschitz and satisfies the (α_true, c_true)-weak
margin condition.

21 / 41
Assumption 3.2 (Margin-based Lipschitz loss)

For any example (x, y) and any scorer f, ℓ(f(x), y) = ℓ(γ_f(x, y)),
where ℓ is a decreasing function of the margin. Moreover, ℓ is an
L_ℓ-Lipschitz function, i.e.,

    |ℓ(γ) − ℓ(γ′)| ≤ L_ℓ |γ − γ′|,   ∀γ ≥ γ′

22 / 41
Assumption 3.3 (Data regularity condition)

Weak density condition: there exist constants c_wdc > 0 and δ_wdc > 0
such that for all x ∈ X and ρ_D(x) r^d ≤ δ^d_wdc,

    ℙ_{X′∼D}[𝕕(X′, x) ≤ r] ≥ c^d_wdc ρ_D(x) r^d

Density level-set: there exists a function f_ρ(δ) with f_ρ(δ) → 0 as
δ → 0, such that for any δ > 0,

    ℙ_{X∼D}[ρ_D(X) ≤ f_ρ(δ)] ≤ δ

23 / 41
Assumption 3.4 (Weak+ density condition)

There exist constants c_wdc+ ≥ 0 and α_wdc+ > 0 such that for all
x ∈ X and r ∈ [0, r_max],

    | ℙ_{X′∼D}[𝕕(X′, x) ≤ r] / (ρ_D(x) vol_d(r)) − 1 | ≤ c_wdc+ r^{α_wdc+}

Under this assumption the local ERM error bounds can be tightened further.

24 / 41
Excess risk bound for local ERM

We now proceed to the main results on the excess risk bound of local
ERM. At x ∈ X, f_{x,*} denotes the minimizer of the population version
of the local loss, and f* that of the global loss:

    f_{x,*} = arg min_{f∈F_loc} R^x_ℓ(f);   f* = arg min_{f∈F_global} R_ℓ(f)

The next slide shows how the expected excess risk of the local ERM
solution f̂_X is bounded, via a risk decomposition.

25 / 41
Excess risk bound for local ERM

    𝔼_{(X,Y)∼D}[ℓ(f̂_X(X), Y) − ℓ(f*(X), Y)]
      ≤ 𝔼_{(X,Y)∼D}[R^X_ℓ(f_{X,*}) − R^X_ℓ(f*)]
            (local vs. global population optimal risk)
      + Σ_{F∈{F_global, F_loc}} 𝔼_{(X,Y)∼D}[sup_{f∈F} R^X_ℓ(f) − ℓ(f(X), Y)]
            (global and local: sample vs. retrieved set risk)
      + 𝔼_{(X,Y)∼D}[sup_{f∈F_loc} R^X_ℓ(f) − R̂^X_ℓ(f)]
            (generalization of local ERM)
      + 𝔼_{(X,Y)∼D}[R^X_ℓ(f_{X,*}) − R̂^X_ℓ(f_{X,*})]
            (central absolute moment of f_{X,*})

26 / 41
Excess risk bound for local ERM

A tighter bound can be obtained by utilizing the local structure of
the distribution D_{X,r}. For any L > 0, define

    M_r(L; ℓ, f_true, F) := 2 L_ℓ (L r + (2∥F∥_∞ − L r) c_true (2 L_true r)^{α_true})

For any x ∈ X, the weak density condition provides a high-probability
lower bound on the size of the retrieved set R_x.

27 / 41
Proposition 3.6

Under Assumption 3.3, for any x ∈ X, r > 0, and δ > 0,
ℙ_D[|R_x| < N(r, δ)] ≤ δ, for

    N(r, δ) = n (c^d_wdc min{f_ρ(δ/2) r^d, δ^d_wdc} − √(log(2/δ) / (2n)))

The next slide shows the resulting bound on the expected excess risk
of the local ERM solution f̂_X.

28 / 41
Theorem 3.7 (Excess risk bound)

    𝔼_{(X,Y)∼D}[ℓ(f̂_X(X), Y) − ℓ(f*(X), Y)]
      ≤ (ε_X + ε_loc)
            (I: local vs. global optimal loss)
      + M_r(L_loc; ℓ, f_true, F_loc) + M_r(L_global; ℓ, f_true, F_global)
            (II: global and local: sample vs. retrieved set risk)
      + 𝔼_{(X,Y)∼D}[ℜ_{R_X}(G(X, Y)) | |R_X| ≥ N(r, δ)]
        + 5 M_r(L_loc; ℓ, f_true, F_loc) √(2 ln(4/δ) / N(r, δ))
        + 8 δ L_ℓ ∥F_loc∥_∞
            (III: generalization of local ERM and central absolute moment of f_{X,*})

29 / 41
Excess risk bound for local ERM

The result shows a trade-off between approximation and generalization
error as the retrieval radius r varies.

Approximation error: comprises the two components defined by (I) and
(II) in Theorem 3.7.

Generalization error: (III) depends on the size of the retrieved set
R_X and on the Rademacher complexity of G(X, Y), which is induced by
F_loc.

Under the local ERM setting, the total approximation error increases
with the radius r for a fixed F_loc, while the generalization error
decreases.

30 / 41
Illustrative examples

Local linear models: the setting where F_loc is the class of linear
classifiers in d dimensions.

    Excess Risk ≤ O(r²)                                    (I)
      + O(r^{min{α_true, 1}})                              (II)
      + O( d / (n^{(2d−1)/2d} r^{d/2})
           + r^{min{α_true, 1}} / (n^{(2d−1)/4d} r^{d/2})
           + 1 / n^{1/2d} )                                (III)

31 / 41
Illustrative examples

Feed-forward classifiers: as another example, they study the setting
where F_loc is the class of fully connected deep neural networks
(FC-DNN).

    Excess Risk ≤ O(r^{q_max + 1})                         (I)
      + O(r^{min{α_true, 1}})                              (II)
      + O( q_max^{3/4} ln(d q_max / r)^{3/4} ln(n)^{3/2} / (n^{(2d−1)/2d} r^{d/2})
           + r^{min{α_true, 1}} / (n^{(2d−1)/4d} r^{d/2})
           + 1 / n^{1/2d} )                                (III)

32 / 41
Endowing local ERM with global representations

The local ERM method takes a myopic view and does not aim to learn a
global hypothesis that explains the entire data distribution. This
approach may result in poor performance in regions of the input domain
that are not well represented in the training set.

A two-stage approach enables local learning to benefit from
good-quality global representations, especially in sparse data
regions.

33 / 41
Endowing local ERM with global representations

The two-stage approach addresses this potential shortcoming of local
empirical risk minimization (ERM) in retrieval-based models:

1 In the first stage, a global representation is learned using the
entire dataset.
2 In the second stage, the learned global representation is utilized
at test time while solving the local ERM as previously defined.

34 / 41
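The two stages can be sketched as follows. This is a minimal sketch under loud assumptions: the "global representation" is a fixed hypothetical feature map `phi` standing in for a learned embedding, and the second-stage local ERM is again the simplest instance (a majority vote over neighbors, retrieved in embedding space rather than input space).

```python
import math


def phi(x):
    """Hypothetical global representation learned in stage one."""
    return (x[0] + x[1], x[0] - x[1])


def two_stage_predict(x, train, r):
    """Stage two: retrieve in phi-space at test time, then solve local ERM."""
    dist = lambda a, b: math.sqrt(sum((u - v) ** 2 for u, v in zip(a, b)))
    z = phi(x)
    labels = [yp for xp, yp in train if dist(z, phi(xp)) <= r]
    if not labels:
        return 0  # arbitrary choice when the retrieved set is empty
    return max(set(labels), key=labels.count)


train = [((0.0, 0.0), 0), ((1.0, 1.0), 1)]
```

The only change from plain local ERM is where the distance is measured: retrieval in the learned embedding space is what lets sparse input regions borrow structure from the globally trained representation.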
A Statistical Perspective on Retrieval-Based Models Classification in extended feature space Table of contents 1 Abstract 2 Introduction 3 Problem setup 4 Local empirical risk minimization 5 Classification in extended feature space 6 Experiments 7 Conclusion and future direction 35 / 41
A Statistical Perspective on Retrieval-Based Models Classification in extended feature space Classification in extended feature space The scorer function can implicitly solve the local empirical risk minimization (ERM) over the retrieved neighboring labeled instances to make the classification prediction. The objective is to learn a function f : X × (X × Y)* → ℝ^{|Y|}. They also discuss a kernel-based approach to classification in this extended feature space, where the scorer function is represented as a linear combination of kernel functions evaluated on the extended features. 36 / 41
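A very simple member of this family is a scorer that takes the test point together with its retrieved (x_i, y_i) pairs and outputs a kernel-weighted vote. This is a minimal sketch of one such scorer under an RBF-kernel assumption, not the general RKHS construction the paper analyzes.

```python
import numpy as np

def rbf(u, v, gamma=2.0):
    """RBF kernel on the input space (illustrative kernel choice)."""
    return np.exp(-gamma * np.sum((u - v) ** 2))

def extended_scorer(x, retrieved, gamma=2.0):
    """Scorer on the extended feature space X x (X x Y)*.
    retrieved: list of (x_i, y_i) pairs with y_i in {0, 1}.
    Returns a real-valued score; its sign gives the predicted class."""
    return sum(rbf(x, xi, gamma) * (2 * yi - 1) for xi, yi in retrieved)

retrieved = [(np.array([0.9, 0.1]), 1),
             (np.array([1.1, -0.1]), 1),
             (np.array([-1.0, 0.0]), 0)]
x = np.array([1.0, 0.0])
print(int(extended_scorer(x, retrieved) > 0))  # sides with the nearby y=1 pairs
```

The point of the extended-feature view is that a single global function f is learned, yet its prediction still adapts to whatever labeled neighbors are retrieved at test time.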
A Statistical Perspective on Retrieval-Based Models Experiments Table of contents 1 Abstract 2 Introduction 3 Problem setup 4 Local empirical risk minimization 5 Classification in extended feature space 6 Experiments 7 Conclusion and future direction 37 / 41
A Statistical Perspective on Retrieval-Based Models Experiments Experiments This paper performs experiments on both synthetic and real datasets to demonstrate the benefits of retrieval-based models in classification tasks: a synthetic binary classification task; binary classification on CIFAR-10; and 1000-way classification on ImageNet. The experiments show that retrieval-based models can achieve good performance with much simpler function classes than traditional parametric and nonparametric models. 38 / 41
A Statistical Perspective on Retrieval-Based Models Experiments Experiments Figure 2: Performance of local ERM with size of retrieved set across models of different complexity. 39 / 41
A Statistical Perspective on Retrieval-Based Models Conclusion and future direction Conclusion and future direction The main contributions of the paper include: a formal framework for retrieval-based models; analyses of the local and global learning frameworks; and empirical results that support the theoretical findings. Future work could explore the use of retrieval-based models in other machine learning tasks beyond classification. 40 / 41
A Statistical Perspective on Retrieval-Based Models Conclusion and future direction References I [1] Peter L. Bartlett, Michael I. Jordan, and Jon D. McAuliffe. "Convexity, Classification, and Risk Bounds". In: Journal of the American Statistical Association 101.473 (2006), pp. 138–156. ISSN: 0162-1459. URL: http://www.jstor.org/stable/30047445. 41 / 41