PAC-Bayesian Bound for Deep Learning
Mark Chang
2019/11/26
Outline
• Learning Theory
• Generalization in Deep Learning
• PAC-Bayesian Bound
• PAC-Bayesian Bound for Deep Learning
• How to Overcome Over-fitting?
Learning Theory
[Figure: data is sampled into training data and testing data; a training algorithm updates the hypothesis h to minimize the training error ε̂(h), and the testing error ε(h) is measured on the testing data.]
• Learning is feasible when ε̂(h) is small -> ε(h) is small
Learning Theory
• Over-fitting: ε̂(h) is small, but ε(h) is large
• What causes over-fitting?
[Figure: the same training data and testing data fit by three hypotheses h: under fitting (too few parameters), appropriate fitting (adequate parameters), and over fitting (too many parameters).]
Learning Theory
• What causes over-fitting?
• Too many parameters -> over-fitting?
• Too many parameters -> high VC Dimension -> over-fitting?
• … ?
Learning Theory
• With probability 1-δ, the following inequality (VC Bound) is satisfied:

$\epsilon(h) \le \hat{\epsilon}(h) + \sqrt{\frac{8}{n}\log\frac{4(2n)^d}{\delta}}$

where n is the number of training instances and d is the VC Dimension (model complexity).
VC Dimension
• Model Complexity
• Categorize the hypothesis set into finitely many groups
• ex: 1D linear model
[Figure: an infinite hypothesis set collapses into finite groups of hypotheses; VC dim = log₂(4) = 2]
• Growth Function
• VC Dimension
VC Dimension
• Growth Function:

$\tau_H(n) = \max_{(x_1,...,x_n)} \tau(x_1,...,x_n)$, where
$\tau(x_1,...,x_n) = \left|\{(h(x_1),...,h(x_n)) : h \in H\}\right|$
($x_1,...,x_n$: data samples; $H$: hypothesis set)

• VC Dimension:

$d(H) = \max\{n : \tau_H(n) = 2^n\}$
Growth Function
• Considering the 2D linear model with n=2:

$\tau(x_1, x_2) = \left|\{(h(x_1), h(x_2)) : h \in H\}\right| = 4$

[Figure: two points x1, x2 and hypotheses h realizing all four labelings (h(x1), h(x2)) = (1,1), (0,1), (1,0), (0,0).]
Growth Function
• Considering the 2D linear model with n=3:

$\tau_H(3) = \max_{(x_1,...,x_3)} \tau(x_1,...,x_3) = 8$

[Figure: for points (x11, x12, x13) in general position all 8 labelings are realizable, so τ(x11, x12, x13) = 8; for points (x21, x22, x23) on a line only 6 labelings are realizable, so τ(x21, x22, x23) = 6.]
VC Dimension
• VC Dimension of the 2D linear model = 3:

$\tau_H(3) = 8 = 2^3$ (shatter)
$\tau_H(4) = 14 < 2^4$ (not shatter)

$d(H) = \max\{n : \tau_H(n) = 2^n\} = 3$
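These counts are easy to check numerically. Below is a minimal Python sketch (not from the slides) that lower-bounds τ(x1, ..., xn) for the 2D linear model by sampling random separators; the helper `dichotomies` is hypothetical, and with enough trials it recovers the 8, 6, and 14 above.

```python
import numpy as np

def dichotomies(points, trials=100_000, seed=0):
    """Lower-bound tau(x_1,...,x_n) for the 2D linear model by sampling
    random separators h(x) = 1[w.x + b > 0] and collecting the distinct
    labelings they realize on the given points."""
    rng = np.random.default_rng(seed)
    labelings = set()
    for _ in range(trials):
        w = rng.normal(size=2)
        b = rng.normal()
        labelings.add(tuple((points @ w + b > 0).astype(int)))
    return labelings

# Three points in general position: all 2^3 = 8 labelings -> shattered.
print(len(dichotomies(np.array([[0., 0.], [1., 0.], [0., 1.]]))))  # expect 8
# Three collinear points: only 6 labelings -> not shattered.
print(len(dichotomies(np.array([[0., 0.], [1., 0.], [2., 0.]]))))  # expect 6
# Four points: at most 14 < 2^4 labelings, hence d(H) = 3.
print(len(dichotomies(np.array([[0., 0.], [1., 0.], [0., 1.], [1., 1.]]))))  # expect 14
```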
VC Dimension
• VC Dimension is independent of the data distribution:

$\tau_H(n) = \max_{(x_1,...,x_n)} \tau(x_1,...,x_n)$
-> maximize over all possible configurations of the n data samples
(e.g., τ(x11, x12, x13) = 8 and τ(x21, x22, x23) = 6, so τ_H(3) = 8)

$d(H) = \max\{n : \tau_H(n) = 2^n\}$
-> maximize over all possible numbers of data samples
(e.g., τ_H(3) = 8 = 2³ and τ_H(4) = 14 < 2⁴, so d(H) = 3)
VC Dimension
• VC Dimension of a linear model:
• O(W)
• W = number of parameters
• VC Dimension of fully-connected neural networks:
• O(LW log W)
• L = number of layers
• W = number of parameters
VC Bound
• For a given dataset (n is constant), search for the best VC Dimension:

$\epsilon(h) \le \hat{\epsilon}(h) + \sqrt{\frac{8}{n}\log\frac{4(2n)^d}{\delta}}$

[Figure: error versus d (VC Dimension); the bound blows up as d grows toward d=n (shatter), and the best VC Dimension sits before the over-fitting region.]
VC Bound
• For a given dataset (n is constant), search for the best VC Dimension:
[Figure: a model with high VC Dimension over-fits the training data; reducing the VC Dimension yields a model with moderate VC Dimension that also fits the testing data.]
Generalization in Deep Learning
• Considering a toy example:
• neural network: input 780, hidden 600, d ≈ 26M
• dataset: n = 50,000
• d >> n, but testing error < 0.1
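Plugging these numbers into the VC Bound shows just how vacuous it is at this scale; a quick numpy sketch (assuming δ = 0.05):

```python
import numpy as np

def vc_bound_gap(n, d, delta=0.05):
    """Complexity term of the VC Bound:
    sqrt((8/n) * log(4 * (2n)^d / delta))."""
    # Work in log space: log(4 (2n)^d / delta) = log 4 + d log(2n) - log delta.
    log_term = np.log(4.0) + d * np.log(2.0 * n) - np.log(delta)
    return np.sqrt(8.0 / n * log_term)

# Toy example from the slide: d ~ 26M parameters, n = 50,000 samples.
print(vc_bound_gap(n=50_000, d=26_000_000))  # ~ 219, hopelessly vacuous
```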
Generalization in Deep Learning
• However, when you are solving your own problem …
• your dataset: n = 50,000, 10 classes
• neural network (input: 780, hidden: 600) -> testing error = 0.6: over-fitting!!
• reduce the VC Dimension (hidden: 200) -> testing error = 0.6: still over-fitting!!
• reduce the VC Dimension again …
Generalization in Deep Learning
• Reconciling modern machine learning and the bias-variance trade-off (arXiv 2018)
• Understanding deep learning requires rethinking generalization (ICLR 2017)
• On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima (ICLR 2017)
Generalization in Deep Learning
Reconciling modern machine learning and the bias-variance trade-off
[Figure: error versus d (VC Dimension). The VC Bound
$\epsilon(h) \le \hat{\epsilon}(h) + \sqrt{\frac{8}{n}\log\frac{4(2n)^d}{\delta}}$
predicts over-fitting as d grows toward d=n, yet in the over-parameterization regime, for models with extremely high VC Dimension, the error descends again.]
Reconciling modern machine learning and the bias-variance trade-off
[Figure: the same double-descent behavior observed empirically for two-layer neural networks and for random forests.]
Generalization in Deep Learning
Understanding deep learning requires rethinking generalization
• A deep neural network (Inception) reaches ε̂(h) ≈ 0 on all three training sets: the original dataset (CIFAR), the same images with random labels, and random noise features (shatter!)
Understanding deep learning requires rethinking generalization
• The training error is ≈ 0 in all three settings, but the testing error differs:

                 original dataset (CIFAR)  random label  random noise features
ε̂(h) (training)  ≈ 0                       ≈ 0           ≈ 0
ε(h) (testing)   ≈ 0.14                    ≈ 0.9         ≈ 0.9
Understanding deep learning requires rethinking generalization
• Testing error depends on the data distribution: ε(h) ≈ 0.14 on the original dataset vs. ε(h) ≈ 0.9 on random labels
• However, the VC Bound does not depend on the data distribution
Understanding deep learning requires rethinking generalization
• Regularization of deep neural networks yields only a small improvement in testing error:
• no regularization: ε(h) ≈ 0.1424
• weight decay: ε(h) ≈ 0.1397
• data augmentation: ε(h) ≈ 0.1069
• weight decay + data augmentation: ε(h) ≈ 0.1095
Generalization in Deep Learning
On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
• high sharpness -> high testing error
• large batch size -> high sharpness
PAC-Bayesian Bound
• Original PAC-Bayesian Bound:
• PAC-Bayesian Model Averaging (COLT 1999)
• PAC-Bayesian Bound for Deep Learning (Stochastic):
• Computing Nonvacuous Generalization Bounds for Deep (Stochastic) Neural Networks with Many More Parameters than Training Data (UAI 2017)
• PAC-Bayesian Bound for Deep Learning (Deterministic):
• A PAC-Bayesian Approach to Spectrally-Normalized Margin Bounds for Neural Networks (ICLR 2018)
PAC-Bayesian Bound
PAC-Bayesian Model Averaging (COLT 1999)
PAC-Bayesian Bound
• Deterministic model: a single hypothesis h is evaluated on the data, giving the error ε(h)
• Stochastic model (Gibbs classifier): a hypothesis h is sampled from a distribution of hypotheses Q, giving the Gibbs error

$\epsilon(Q) = E_{h\sim Q}[\epsilon(h)]$
PAC-Bayesian Bound
• Considering the sharpness of local minima: replace the single hypothesis h with a distribution of hypotheses Q around it
• flat minimum: low ε̂(h), low ε̂(Q)
• sharp minimum: low ε̂(h), high ε̂(Q)
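As a rough illustration (not from the slides), ε̂(Q) can be probed by averaging the loss over Gaussian weight perturbations, i.e., a Monte Carlo estimate of ε̂(Q) for Q = N(w, σ²I). The two 1D loss functions below are hypothetical stand-ins for a flat and a sharp minimum.

```python
import numpy as np

def gibbs_train_error(loss_fn, w, sigma=0.05, m=100, seed=0):
    """Monte Carlo estimate of the Gibbs training error
    eps_hat(Q) = E_{h~Q}[eps_hat(h)] for Q = N(w, sigma^2 I):
    average the loss over m perturbed copies of the weights."""
    rng = np.random.default_rng(seed)
    return np.mean([loss_fn(w + sigma * rng.normal(size=w.shape))
                    for _ in range(m)])

# Hypothetical 1D losses with minima of different sharpness at w = 0.
flat  = lambda w: float(np.mean(0.1 * w**2))
sharp = lambda w: float(np.mean(100.0 * w**2))

w_star = np.zeros(10)  # both losses have eps_hat(h) = 0 here
print(gibbs_train_error(flat,  w_star))  # stays low   -> flat minimum
print(gibbs_train_error(sharp, w_star))  # much higher -> sharp minimum
```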
PAC-Bayesian Bound
• Training a Gibbs classifier:
• P (prior): distribution of hypotheses before training
• Q (posterior): distribution of hypotheses after training
• the training algorithm maps the prior P to the posterior Q using the sampled training data
PAC-Bayesian Bound
• Training: sample training data from the data distribution, then update the distribution of hypotheses Q to minimize the training error

$\hat{\epsilon}(Q) = E_{h\sim Q}[\hat{\epsilon}(h)]$
PAC-Bayesian Bound
• Testing: sample a hypothesis h from the trained distribution Q and measure its error on testing data sampled from the data distribution

$\epsilon(Q) = E_{h\sim Q}[\epsilon(h)]$
PAC-Bayesian Bound
• Learning is feasible when ε̂(Q) is small -> ε(Q) is small
• With probability 1-δ, the following inequality (PAC-Bayesian Bound) is satisfied:

$\epsilon(Q) \le \hat{\epsilon}(Q) + \sqrt{\frac{KL(Q\|P) + \log\frac{n}{\delta} + 2}{2n-1}}$

where n is the number of training instances and KL(Q‖P) is the KL divergence between Q and P.
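Read numerically, the bound behaves as follows (a sketch, with KL(Q‖P) treated as a given number rather than computed from an actual P and Q):

```python
import numpy as np

def pac_bayes_bound(train_err, kl, n, delta=0.05):
    """McAllester-style PAC-Bayesian bound on the Gibbs test error:
    eps(Q) <= eps_hat(Q) + sqrt((KL(Q||P) + log(n/delta) + 2) / (2n - 1))."""
    return train_err + np.sqrt((kl + np.log(n / delta) + 2.0) / (2.0 * n - 1.0))

# The bound tightens as KL(Q||P) shrinks or n grows.
print(pac_bayes_bound(train_err=0.03, kl=5_000.0, n=50_000))   # ~ 0.25
print(pac_bayes_bound(train_err=0.03, kl=50_000.0, n=50_000))  # ~ 0.74
```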
PAC-Bayesian Bound

$\epsilon(Q) \le \hat{\epsilon}(Q) + \sqrt{\frac{KL(Q\|P) + \log\frac{n}{\delta} + 2}{2n-1}}$

• under fitting: low KL(Q‖P), low |ε(Q) - ε̂(Q)|, but high ε̂(Q) -> high ε(Q)
• appropriate fitting: moderate KL(Q‖P), moderate |ε(Q) - ε̂(Q)|, moderate ε̂(Q)
• over fitting: high KL(Q‖P), high |ε(Q) - ε̂(Q)|, low ε̂(Q) -> high ε(Q)
[Figure: training/testing data fit in the three regimes; the posterior Q drifts progressively further from the prior P as the model fits the training data more tightly.]
PAC-Bayesian Bound
• High VC Dimension, but clean data -> low KL(Q‖P)
• High VC Dimension, but noisy data -> high KL(Q‖P)
• The PAC-Bayesian Bound is data-dependent

$\epsilon(Q) \le \hat{\epsilon}(Q) + \sqrt{\frac{KL(Q\|P) + \log\frac{n}{\delta} + 2}{2n-1}}$

[Figure: for clean data the posterior Q stays close to the prior P; for noisy data Q moves far from P.]
PAC-Bayesian Bound for Deep Learning (Stochastic)
UAI 2017

PAC-Bayesian Bound for Deep Learning (Stochastic)
• deterministic neural network: y = wx + b with fixed w, b
• stochastic neural network (SNN): y = wx + b, with w, b sampled from a distribution N(W, S)
• N: normal distribution
• W: mean of the weights
• S: covariance of the weights
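A minimal numpy sketch of such a stochastic layer (hypothetical, not the paper's code): every forward pass samples fresh weights from N(W, S) with diagonal S.

```python
import numpy as np

class StochasticLinear:
    """Linear layer y = w x + b whose parameters are resampled from a
    diagonal Gaussian N(mean, diag(var)) on every forward pass."""
    def __init__(self, d_in, d_out, seed=0):
        self.rng = np.random.default_rng(seed)
        self.w_mean = self.rng.normal(scale=0.1, size=(d_out, d_in))
        self.b_mean = np.zeros(d_out)
        self.w_var = np.full((d_out, d_in), 1e-2)  # diagonal covariance S
        self.b_var = np.full(d_out, 1e-2)

    def __call__(self, x):
        w = self.w_mean + np.sqrt(self.w_var) * self.rng.normal(size=self.w_mean.shape)
        b = self.b_mean + np.sqrt(self.b_var) * self.rng.normal(size=self.b_mean.shape)
        return x @ w.T + b

layer = StochasticLinear(d_in=780, d_out=600)
x = np.random.default_rng(1).normal(size=(32, 780))
y1, y2 = layer(x), layer(x)  # two different weight samples h ~ Q
print(y1.shape, np.allclose(y1, y2))  # (32, 600) False
```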
PAC-Bayesian Bound for Deep Learning (Stochastic)
• With probability 1-δ, the following inequality is satisfied:

$KL(\hat{\epsilon}(Q)\,\|\,\epsilon(Q)) \le \frac{KL(Q\|P) + \log\frac{n}{\delta}}{n-1}$

where Q is the SNN after training and P is the SNN before training.
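This form bounds the binary KL divergence between training and testing Gibbs error, so an upper bound on ε(Q) is obtained by inverting the KL. A sketch of the standard bisection inversion (the kl_qp value below is a placeholder):

```python
import numpy as np

def kl_bernoulli(q, p):
    """KL divergence between Bernoulli(q) and Bernoulli(p)."""
    eps = 1e-12
    q, p = np.clip(q, eps, 1 - eps), np.clip(p, eps, 1 - eps)
    return q * np.log(q / p) + (1 - q) * np.log((1 - q) / (1 - p))

def invert_kl(train_err, budget, tol=1e-9):
    """Largest p with KL(train_err || p) <= budget, found by bisection;
    the bound KL(eps_hat(Q) || eps(Q)) <= budget then gives eps(Q) <= p."""
    lo, hi = train_err, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        lo, hi = (mid, hi) if kl_bernoulli(train_err, mid) <= budget else (lo, mid)
    return lo

n, delta, kl_qp = 50_000, 0.05, 5_000.0
budget = (kl_qp + np.log(n / delta)) / (n - 1)
print(invert_kl(0.03, budget))  # ~ 0.17, an upper bound on eps(Q)
```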
Estimating the PAC-Bayesian Bound
• For $Q = N(w, \mathrm{diag}(s))$ and $P = N(w_0, \lambda I_d)$:

$KL(Q\|P) = \frac{1}{2}\left[\frac{\|s\|_1}{\lambda} - d + \frac{\|w - w_0\|_2^2}{\lambda} + d\log\lambda - \mathbf{1}_d\cdot\log(s)\right]$

• N: normal distribution
• w: mean of the weights (d-dimensional vector)
• s: variances of the weights (diagonal of the d × d covariance matrix)
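This closed form is straightforward to evaluate; a sketch (the w, w0, s, and λ values below are placeholders):

```python
import numpy as np

def kl_gaussian(w, s, w0, lam):
    """Closed-form KL(Q||P) for Q = N(w, diag(s)) and P = N(w0, lam * I):
    0.5 * [ ||s||_1/lam - d + ||w - w0||^2/lam + d*log(lam) - sum(log s) ]."""
    d = w.size
    return 0.5 * (np.sum(s) / lam - d
                  + np.sum((w - w0) ** 2) / lam
                  + d * np.log(lam) - np.sum(np.log(s)))

rng = np.random.default_rng(0)
d = 1000
w0 = rng.normal(scale=0.1, size=d)          # prior mean (initial weights)
w = w0 + rng.normal(scale=0.05, size=d)     # posterior mean (trained weights)
s = np.full(d, 0.02)                        # posterior variances
print(kl_gaussian(w, s, w0, lam=0.02))      # here equals 0.5*||w - w0||^2 / lam
```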
Estimating the PAC-Bayesian Bound
• $\hat{\epsilon}(Q) = E_{h\sim Q}[\hat{\epsilon}(h)]$ is intractable
• Sampling m hypotheses from Q:

$\hat{\epsilon}(\hat{Q}) = \frac{1}{m}\sum_{i=1}^{m}\hat{\epsilon}(h_i), \quad h_i \sim Q$

• With probability 1-δ', the following inequality is satisfied:

$KL(\hat{\epsilon}(\hat{Q})\,\|\,\hat{\epsilon}(Q)) \le \frac{\log\frac{2}{\delta'}}{m}$
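A sketch of this Monte Carlo estimate (the `sample_weights` sampler and the 0-1 `train_error` evaluator below are hypothetical stand-ins):

```python
import numpy as np

def mc_gibbs_error(sample_weights, train_error, m=1000, delta_prime=0.01, seed=0):
    """Estimate eps_hat(Q) by averaging the training error of m hypotheses
    h_i ~ Q, and report the KL slack log(2/delta')/m that holds with
    probability 1 - delta'."""
    rng = np.random.default_rng(seed)
    est = np.mean([train_error(sample_weights(rng)) for _ in range(m)])
    slack = np.log(2.0 / delta_prime) / m
    return est, slack

# Hypothetical stand-ins: Q = N(0, 0.1^2 I), error = fraction of "bad" coords.
sample_weights = lambda rng: rng.normal(scale=0.1, size=100)
train_error = lambda w: float(np.mean(np.abs(w) > 0.2))
print(mc_gibbs_error(sample_weights, train_error))  # (~0.046, ~0.0053)
```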
Experiments
• Data: MNIST (binary classification, class 0: digits 0~4, class 1: digits 5~9)
• Model: 2-layer or 3-layer NN
[Figure: sample digits with original MNIST labels (0 0 0 1 1) and with random labels (1 0 1 0 1).]
Experiments
• Varying the width of the hidden layer (2-layer NN):

                     width 600   width 1200
Training               0.028       0.027
Testing                0.034       0.035
VC Bound               26m         56m
PAC-Bayesian Bound     0.161       0.179

• Varying the number of hidden layers:

                     2 layers    3 layers    4 layers
Training               0.028       0.028       0.027
Testing                0.034       0.033       0.032
VC Bound               26m         66m         121m
PAC-Bayesian Bound     0.161       0.186       0.201
Experiments
• Original MNIST vs. random label:

                     original MNIST   random label
Training               0.028            0.112
Testing                0.034            0.503
VC Bound               26m              26m
PAC-Bayesian Bound     0.161            1.352
PAC-Bayesian Bound for Deep Learning
• The PAC-Bayesian Bound is data-dependent:
• clean data (original MNIST) -> small KL(Q‖P) -> small ε(Q)
• noisy data (random label) -> large KL(Q‖P) -> large ε(Q)

$KL(\hat{\epsilon}(Q)\,\|\,\epsilon(Q)) \le \frac{KL(Q\|P) + \log\frac{n}{\delta}}{n-1}$
PAC-Bayesian Bound for Deep Learning (Deterministic)
• Constraint on sharpness: given a margin γ > 0, suppose Q satisfies

$P_{h'\sim Q}\left[\sup_{x\in X}\|h'(x) - h(x)\|_\infty \le \frac{\gamma}{4}\right] \ge \frac{1}{2}$
PAC-Bayesian Bound for Deep Learning (Deterministic)
• Constraint on sharpness: given a margin γ > 0, suppose Q satisfies

$P_{h'\sim Q}\left[\sup_{x\in X}\|h'(x) - h(x)\|_\infty \le \frac{\gamma}{4}\right] \ge \frac{1}{2}$

• Then with probability 1-δ, the following inequality is satisfied:

$L_0(h) \le \hat{L}_\gamma(h) + 4\sqrt{\frac{KL(Q\|P) + \log\frac{6n}{\delta}}{n-1}}$

• margin loss $L_\gamma(h)$:

$L_\gamma(h) = P_{(x,y)\sim D}\left[h(x)[y] \le \gamma + \max_{j\neq y} h(x)[j]\right]$
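The empirical version of this margin loss is simple to compute from logits; a sketch with hypothetical numbers:

```python
import numpy as np

def margin_loss(logits, labels, gamma):
    """Empirical margin loss L_gamma(h): fraction of examples where the
    correct-class score beats the best other class by less than gamma,
    i.e. h(x)[y] <= gamma + max_{j != y} h(x)[j]."""
    n = logits.shape[0]
    correct = logits[np.arange(n), labels]
    masked = logits.copy()
    masked[np.arange(n), labels] = -np.inf      # drop the true class
    runner_up = masked.max(axis=1)              # max_{j != y} h(x)[j]
    return float(np.mean(correct <= gamma + runner_up))

# Hypothetical logits for 4 examples, 3 classes.
logits = np.array([[2.0, 0.5, 0.1],
                   [0.2, 1.0, 0.9],
                   [0.0, 0.1, 3.0],
                   [1.0, 1.2, 0.8]])
labels = np.array([0, 1, 2, 0])
print(margin_loss(logits, labels, gamma=0.0))  # 0.25 (one misclassified)
print(margin_loss(logits, labels, gamma=0.5))  # 0.5  (one more within margin)
```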
How to Overcome Over-fitting?
Traditional Machine Learning:
• reduce the number of parameters
• weight decay
• early stop
• data augmentation

$\epsilon(h) \le \hat{\epsilon}(h) + \sqrt{\frac{8}{n}\log\frac{4(2n)^d}{\delta}}$

Modern Deep Learning:
• reduce the number of parameters?
• weight decay?
• early stop?
• data augmentation?
• improving data quality
• starting from a good P (transfer learning)

$KL(\hat{\epsilon}(Q)\,\|\,\epsilon(Q)) \le \frac{KL(Q\|P) + \log\frac{n}{\delta}}{n-1}$