SlideShare a Scribd company logo
1 of 25
Download to read offline
STATISTICAL PHYSICS MODELLING
OF MACHINE LEARNING
Lenka Zdeborová
(IPhT; CEA Saclay & CNRS)
WiMLDS meetup, November 29, 2018
Long history of physics influencing machine learning.
Examples:
Gibbs-Bogoluibov-Feynman’60s - physics behind variational
inference.
Hopfield model’82. Spin glass models of neural networks Amit,
Gutfreund, Sompolinsky’85.
Boltzmann machine Hinton, Sejnowski’86 - named after the
Boltzmann distribution.
Gardner’87 - Maximum storage capacity in neural networks
(related to VC dimension).
SVMs by Boser, Guyon, Vapnik’92 inspired by Krauth, Mézard’87
Many papers on neural networks in physics in 80s and 90s.
PHYSICS IN MACHINE
LEARNING
PHYSICS IN MACHINE
LEARNING
Les Houches, 1985
PHYSICS IN MACHINE
LEARNING
Les Houches, 1985
THE PUZZLE OF GENERALIZATION
According to PAC bounds (via VC dimension, Rademacher
complexity) neural networks that generalize well should not be
able to fit random labels.
ICLR’16
THEORETICAL QUESTIONS
IN DEEP LEARNING
Why the lack of overfitting?
“More parameters = more overfitting”
Does not seem to hold in deep learning.
SAMPLE COMPLEXITY
Cifar10 - 50000 samples.
How many samples are
really needed?
How low is the optimal sample complexity? Are we achieving it?
If not, is it because of architectures or algorithms?
AND THE PHYSICS HERE?
THEORETICAL-PHYSICS
ROADMAP
1. Experimental observation or fundamental hypothesis.
2. Unreasonably simple model for which toughest questions
can be understood mathematically.
3. Generalize to more realistic models, relies on universality
(= important laws of nature rarely depend on many details).
MODELS
H = J
X
(ij)2E
SiSj
P({Si}i=1,...,N ) =
e H
Z
magnetism of materials
In data science, models are used to fit the data. (e.g. linear
regression: What is the best straight line that captures the
dependence of y on x?)
In physics, models are the main tool for understanding.
MODELS
In data science, models are used to fit the data. (e.g. linear
regression: What is the best straight line that captures the
dependence of y on x?)
In physics, models are the main tool for understanding.
P({Si}i=1,...,N ) =
e H
Z
H =
X
(ijk)2E
JijkSiSjSk
glass transitionp-spin model
Jijk ⇠ N(0, 1)
IS THIS USEFUL IN MACHINE LEARNING?
Example: Single layer neural network = generalized linear regression.
Given (X,y) find w such that
μ = 1,…, n
i = 1,…, pyμ = φ(
p
∑
i=1
Xμiwi)
data
X
y
labels
w
weights
data
weights
(noisy) activation function
Take random iid Gaussian and random iid from
Create
Goal: Compute the best possible generalisation error achievable
with n samples of dimension p.
High-dimensional regime:
TEACHER-STUDENT MODEL
Xμi w*i
yμ = φ(
p
∑
i=1
Xμiw*i )
Pw
p → ∞
n → ∞
n/p = Ω(1)
data
X
y
labels
w
weights
data
weights
Gardner, Derrida’89, Gyorgyi’90
Take random iid Gaussian and random iid from
Create
Goal: Compute the best possible generalisation error achievable
with n samples of dimension p.
High-dimensional regime:
Xμi w*i
yμ = φ(
p
∑
i=1
Xμiw*i )
Pw
p → ∞n → ∞ n/p = Ω(1)
What did we win? Posterior is tractable with replica
and cavity method, developed in the theory of spin glasses.
P(w|X, y)
TEACHER-STUDENT MODEL
Gardner, Derrida’89, Gyorgyi’90
Optimal generalisation error for any non-linearity
and prior on weights.
Proof of the replica formula for the optimal
generalisation error.
Approximate message passing provably reaching the
optimal generalization error (out of the hard region).
Barbier, Krzakala, Macris, Miolane, LZ, COLT’18, arXiv:1708.03395
NEW W.R.T. 1990
LEARNING CURVESgeneralisationerror
φ(z) = sign(z) p → ∞
n → ∞ n/p = Ω(1)
optimal
AMP algorithm
logistic regression
Pw = 𝒩(0,1)
# of samples per dimension n/p
generalisationerror
# of samples per dimension
optimal, achievable
optimal
AMP algorithm
logistic regression
wi 2 { 1, +1}φ(z) = sign(z)
n/p
p → ∞
n → ∞ n/p = Ω(1)
PHASE TRANSITIONS
generalisationerror
# of samples per dimension
optimal, achievable
optimal
AMP algorithm
logistic regression
wi 2 { 1, +1}φ(z) = sign(z)
n/p
p → ∞
n → ∞ n/p = Ω(1)
hard
PHASE TRANSITIONS
HARD REGIME
INCLUDING HIDDEN VARIABLES
data
X
y
labels
w
v1
v2
weights
p input units
K hidden units
output unit
L=3 layers
n training samples
w learned, v1 & v2 fixed
Limit:
Committee machine
Model from Schwarze’92.
Proof of the replica formula, and approximate message passing Aubin, Maillard,
Barbier, Macris, Krzakala, LZ’19, spotlight at NeurIPS’18.
K = O(1)<latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit><latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit><latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit><latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit>
p → ∞
n → ∞ α = n/p = Ω(1)
PHASE TRANSITONS
Specialization phase transition
= hidden units specialise to
correlate with specific features.
K=2
sign(0) = 0<latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit><latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit><latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit><latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit>
0 1 2 3 4
α
0.00
0.05
0.10
0.15
0.20
0.25
Generalizationerrorϵg(α)
0.0
0.2
0.4
0.6
0.8
1.0
Overlapq
AMP q00
AMP q01
SE q00
SE q01
SE ϵg(α)
AMP ϵg(α)
Specialization
yμ = sign[sign(∑
i
Xμ,iwi,1) + sign
∑
i
(Xμ,iwi,2)]
Large algorithmic gap:
IT threshold:
Algorithmic threshold
K 1<latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit><latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit><latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit><latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit>
yμ = sign[
K
∑
a=1
sign(∑
i
Xμ,iwi,a)]
n > 7.65Kp
n > const . K2
p
PHASE TRANSITONS
0 2 4 6 8 10 12 14
α = (# of samples)/(#hidden units × input size)
0.0
0.1
0.2
0.3
0.4
0.5
Generalizationerrorϵg(α)
Non-specialized
hidden units
Specialized
hidden units
Computational gap
Bayes optimal ϵg(α)
AMP ϵg(α)
Discontinuous specialization
impossible hard doable todaydoable
# of samples
Good generalisation error
Our goal: Quantify this in more realistic models.
Design algorithms working in the doable region.
LZ, F. Krzakala, Statistical Physics of Algorithm: Threshold and Algorithms,
Advances of Physics (2016), arXiv:1511.02476.
J. Barbier, N. Macris, L. Miolane, F. Krzakala, LZ, Phase Transitions, Optimal
Errors and Optimality of Message-Passing in Generalized Linear Models,
arXiv:1708.03395, COLT’18.
B. Aubin, A. Maillard, J. Barbier, F. Krzakala N. Macris,, LZ, The committee
machine: Computational to statistical gaps in learning a two-layers neural
network, arXiv:1806.05451, NeurIPS’18.
REFERENCES
Thank you for your attention!
LZ, F. Krzakala, Statistical Physics of Algorithm: Threshold and Algorithms,
Advances of Physics (2016), arXiv:1511.02476.
J. Barbier, N. Macris, L. Miolane, F. Krzakala, LZ, Phase Transitions, Optimal
Errors and Optimality of Message-Passing in Generalized Linear Models,
arXiv:1708.03395, COLT’18.
B. Aubin, A. Maillard, J. Barbier, F. Krzakala N. Macris,, LZ, The committee
machine: Computational to statistical gaps in learning a two-layers neural
network, arXiv:1806.05451, NeurIPS’18.
REFERENCES

More Related Content

What's hot

Mcmc & lkd free I
Mcmc & lkd free IMcmc & lkd free I
Mcmc & lkd free IDeb Roy
 
Introduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsIntroduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsChristian Robert
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsPierre Jacob
 
Supervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal TransportSupervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal TransportSina Nakhostin
 
Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation Pierre Jacob
 
Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods Pierre Jacob
 
Unbiased MCMC with couplings
Unbiased MCMC with couplingsUnbiased MCMC with couplings
Unbiased MCMC with couplingsPierre Jacob
 
Bayesian Neural Networks
Bayesian Neural NetworksBayesian Neural Networks
Bayesian Neural Networksm.a.kirn
 
Winter school-pq2016v2
Winter school-pq2016v2Winter school-pq2016v2
Winter school-pq2016v2Ludovic Perret
 
Numerical approach for Hamilton-Jacobi equations on a network: application to...
Numerical approach for Hamilton-Jacobi equations on a network: application to...Numerical approach for Hamilton-Jacobi equations on a network: application to...
Numerical approach for Hamilton-Jacobi equations on a network: application to...Guillaume Costeseque
 
Thesis defense improved
Thesis defense improvedThesis defense improved
Thesis defense improvedZheng Mengdi
 
A quantum-inspired optimization heuristic for the multiple sequence alignment...
A quantum-inspired optimization heuristic for the multiple sequence alignment...A quantum-inspired optimization heuristic for the multiple sequence alignment...
A quantum-inspired optimization heuristic for the multiple sequence alignment...Konstantinos Giannakis
 
Habilitation à diriger des recherches
Habilitation à diriger des recherchesHabilitation à diriger des recherches
Habilitation à diriger des recherchesPierre Pudlo
 
WSC 2011, advanced tutorial on simulation in Statistics
WSC 2011, advanced tutorial on simulation in StatisticsWSC 2011, advanced tutorial on simulation in Statistics
WSC 2011, advanced tutorial on simulation in StatisticsChristian Robert
 
Analyzing Meme Propagation in Multimemetic Algorithms
Analyzing Meme Propagation in Multimemetic AlgorithmsAnalyzing Meme Propagation in Multimemetic Algorithms
Analyzing Meme Propagation in Multimemetic AlgorithmsRafael Nogueras
 
Redundancy and synergy in dynamical systems
Redundancy and synergy in dynamical systemsRedundancy and synergy in dynamical systems
Redundancy and synergy in dynamical systemsdanielemarinazzo
 
Puneet Singla
Puneet SinglaPuneet Singla
Puneet Singlapsingla
 
An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...
An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...
An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...Carlos Cotta
 
Bayesian inversion of deterministic dynamic causal models
Bayesian inversion of deterministic dynamic causal modelsBayesian inversion of deterministic dynamic causal models
Bayesian inversion of deterministic dynamic causal modelskhbrodersen
 

What's hot (20)

Mcmc & lkd free I
Mcmc & lkd free IMcmc & lkd free I
Mcmc & lkd free I
 
Introduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methodsIntroduction to advanced Monte Carlo methods
Introduction to advanced Monte Carlo methods
 
Monte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problemsMonte Carlo methods for some not-quite-but-almost Bayesian problems
Monte Carlo methods for some not-quite-but-almost Bayesian problems
 
Supervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal TransportSupervised Planetary Unmixing with Optimal Transport
Supervised Planetary Unmixing with Optimal Transport
 
Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation Couplings of Markov chains and the Poisson equation
Couplings of Markov chains and the Poisson equation
 
Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods Unbiased Markov chain Monte Carlo methods
Unbiased Markov chain Monte Carlo methods
 
Unbiased MCMC with couplings
Unbiased MCMC with couplingsUnbiased MCMC with couplings
Unbiased MCMC with couplings
 
Bayesian Neural Networks
Bayesian Neural NetworksBayesian Neural Networks
Bayesian Neural Networks
 
Winter school-pq2016v2
Winter school-pq2016v2Winter school-pq2016v2
Winter school-pq2016v2
 
Numerical approach for Hamilton-Jacobi equations on a network: application to...
Numerical approach for Hamilton-Jacobi equations on a network: application to...Numerical approach for Hamilton-Jacobi equations on a network: application to...
Numerical approach for Hamilton-Jacobi equations on a network: application to...
 
Thesis defense improved
Thesis defense improvedThesis defense improved
Thesis defense improved
 
A quantum-inspired optimization heuristic for the multiple sequence alignment...
A quantum-inspired optimization heuristic for the multiple sequence alignment...A quantum-inspired optimization heuristic for the multiple sequence alignment...
A quantum-inspired optimization heuristic for the multiple sequence alignment...
 
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
Program on Mathematical and Statistical Methods for Climate and the Earth Sys...
 
Habilitation à diriger des recherches
Habilitation à diriger des recherchesHabilitation à diriger des recherches
Habilitation à diriger des recherches
 
WSC 2011, advanced tutorial on simulation in Statistics
WSC 2011, advanced tutorial on simulation in StatisticsWSC 2011, advanced tutorial on simulation in Statistics
WSC 2011, advanced tutorial on simulation in Statistics
 
Analyzing Meme Propagation in Multimemetic Algorithms
Analyzing Meme Propagation in Multimemetic AlgorithmsAnalyzing Meme Propagation in Multimemetic Algorithms
Analyzing Meme Propagation in Multimemetic Algorithms
 
Redundancy and synergy in dynamical systems
Redundancy and synergy in dynamical systemsRedundancy and synergy in dynamical systems
Redundancy and synergy in dynamical systems
 
Puneet Singla
Puneet SinglaPuneet Singla
Puneet Singla
 
An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...
An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...
An Analysis of a Selecto-Lamarckian Model of Multimemetic Algorithms with Dyn...
 
Bayesian inversion of deterministic dynamic causal models
Bayesian inversion of deterministic dynamic causal modelsBayesian inversion of deterministic dynamic causal models
Bayesian inversion of deterministic dynamic causal models
 

Similar to Statistical Physics Studies of Machine Learning Problems by Lenka Zdeborova, Researcher @CNRS

. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...butest
 
nonlinear_rmt.pdf
nonlinear_rmt.pdfnonlinear_rmt.pdf
nonlinear_rmt.pdfGieTe
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer PerceptronsESCOM
 
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.Anirbit Mukherjee
 
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...Anirbit Mukherjee
 
Functional specialization in human cognition: a large-scale neuroimaging init...
Functional specialization in human cognition: a large-scale neuroimaging init...Functional specialization in human cognition: a large-scale neuroimaging init...
Functional specialization in human cognition: a large-scale neuroimaging init...Ana Luísa Pinho
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsUniversity of Glasgow
 
diffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdfdiffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdf수철 박
 
abstrakty přijatých příspěvků.doc
abstrakty přijatých příspěvků.docabstrakty přijatých příspěvků.doc
abstrakty přijatých příspěvků.docbutest
 
Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...
Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...
Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...Amro Elfeki
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Frank Nielsen
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...JuanPabloCarbajal3
 
20130928 automated theorem_proving_harrison
20130928 automated theorem_proving_harrison20130928 automated theorem_proving_harrison
20130928 automated theorem_proving_harrisonComputer Science Club
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IRamez Abdalla, M.Sc
 

Similar to Statistical Physics Studies of Machine Learning Problems by Lenka Zdeborova, Researcher @CNRS (20)

. An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic .... An introduction to machine learning and probabilistic ...
. An introduction to machine learning and probabilistic ...
 
nonlinear_rmt.pdf
nonlinear_rmt.pdfnonlinear_rmt.pdf
nonlinear_rmt.pdf
 
prototypes-AMALEA.pdf
prototypes-AMALEA.pdfprototypes-AMALEA.pdf
prototypes-AMALEA.pdf
 
Multi-Layer Perceptrons
Multi-Layer PerceptronsMulti-Layer Perceptrons
Multi-Layer Perceptrons
 
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.
My 2hr+ survey talk at the Vector Institute, on our deep learning theorems.
 
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
My invited talk at the 2018 Annual Meeting of SIAM (Society of Industrial and...
 
Functional specialization in human cognition: a large-scale neuroimaging init...
Functional specialization in human cognition: a large-scale neuroimaging init...Functional specialization in human cognition: a large-scale neuroimaging init...
Functional specialization in human cognition: a large-scale neuroimaging init...
 
Perceptron
PerceptronPerceptron
Perceptron
 
stat-phys-AMALEA.pdf
stat-phys-AMALEA.pdfstat-phys-AMALEA.pdf
stat-phys-AMALEA.pdf
 
Projection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamicsProjection methods for stochastic structural dynamics
Projection methods for stochastic structural dynamics
 
diffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdfdiffusion 모델부터 DALLE2까지.pdf
diffusion 모델부터 DALLE2까지.pdf
 
Hmm and neural networks
Hmm and neural networksHmm and neural networks
Hmm and neural networks
 
abstrakty přijatých příspěvků.doc
abstrakty přijatých příspěvků.docabstrakty přijatých příspěvků.doc
abstrakty přijatých příspěvků.doc
 
Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...
Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...
Characterization of Subsurface Heterogeneity: Integration of Soft and Hard In...
 
Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...Pattern learning and recognition on statistical manifolds: An information-geo...
Pattern learning and recognition on statistical manifolds: An information-geo...
 
kape_science
kape_sciencekape_science
kape_science
 
Poster(3)-1
Poster(3)-1Poster(3)-1
Poster(3)-1
 
A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...A walk through the intersection between machine learning and mechanistic mode...
A walk through the intersection between machine learning and mechanistic mode...
 
20130928 automated theorem_proving_harrison
20130928 automated theorem_proving_harrison20130928 automated theorem_proving_harrison
20130928 automated theorem_proving_harrison
 
Artificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part IArtificial Intelligence Applications in Petroleum Engineering - Part I
Artificial Intelligence Applications in Petroleum Engineering - Part I
 

More from Paris Women in Machine Learning and Data Science

More from Paris Women in Machine Learning and Data Science (20)

Managing international tech teams, by Natasha Dimban
Managing international tech teams, by Natasha DimbanManaging international tech teams, by Natasha Dimban
Managing international tech teams, by Natasha Dimban
 
Optimizing GenAI apps, by N. El Mawass and Maria Knorps
Optimizing GenAI apps, by N. El Mawass and Maria KnorpsOptimizing GenAI apps, by N. El Mawass and Maria Knorps
Optimizing GenAI apps, by N. El Mawass and Maria Knorps
 
Perspectives, by M. Pannegeon
Perspectives, by M. PannegeonPerspectives, by M. Pannegeon
Perspectives, by M. Pannegeon
 
Evaluation strategies for dealing with partially labelled or unlabelled data
Evaluation strategies for dealing with partially labelled or unlabelled dataEvaluation strategies for dealing with partially labelled or unlabelled data
Evaluation strategies for dealing with partially labelled or unlabelled data
 
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
Combinatorial Optimisation with Policy Adaptation using latent Space Search, ...
 
An age-old question, by Caroline Jean-Pierre
An age-old question, by Caroline Jean-PierreAn age-old question, by Caroline Jean-Pierre
An age-old question, by Caroline Jean-Pierre
 
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle LautréApplying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
Applying Churn Prediction Approaches to the Telecom Industry, by Joëlle Lautré
 
How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
How to supervise a thesis in NLP in the ChatGPT era? By Laure SoulierHow to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
How to supervise a thesis in NLP in the ChatGPT era? By Laure Soulier
 
Global Ambitions Local Realities, by Anna Abreu
Global Ambitions Local Realities, by Anna AbreuGlobal Ambitions Local Realities, by Anna Abreu
Global Ambitions Local Realities, by Anna Abreu
 
Plug-and-Play methods for inverse problems in imagine, by Julie Delon
Plug-and-Play methods for inverse problems in imagine, by Julie DelonPlug-and-Play methods for inverse problems in imagine, by Julie Delon
Plug-and-Play methods for inverse problems in imagine, by Julie Delon
 
Sales Forecasting as a Data Product by Francesca Iannuzzi
Sales Forecasting as a Data Product by Francesca IannuzziSales Forecasting as a Data Product by Francesca Iannuzzi
Sales Forecasting as a Data Product by Francesca Iannuzzi
 
Identifying and mitigating bias in machine learning, by Ruta Binkyte
Identifying and mitigating bias in machine learning, by Ruta BinkyteIdentifying and mitigating bias in machine learning, by Ruta Binkyte
Identifying and mitigating bias in machine learning, by Ruta Binkyte
 
“Turning your ML algorithms into full web apps in no time with Python" by Mar...
“Turning your ML algorithms into full web apps in no time with Python" by Mar...“Turning your ML algorithms into full web apps in no time with Python" by Mar...
“Turning your ML algorithms into full web apps in no time with Python" by Mar...
 
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
Nature Language Processing for proteins by Amélie Héliou, Software Engineer @...
 
Sandrine Henry presents the BechdelAI project
Sandrine Henry presents the BechdelAI projectSandrine Henry presents the BechdelAI project
Sandrine Henry presents the BechdelAI project
 
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
Anastasiia Tryputen_War in Ukraine or how extraordinary courage reshapes geop...
 
Khrystyna Grynko WiMLDS - From marketing to Tech.pdf
Khrystyna Grynko WiMLDS - From marketing to Tech.pdfKhrystyna Grynko WiMLDS - From marketing to Tech.pdf
Khrystyna Grynko WiMLDS - From marketing to Tech.pdf
 
Iana Iatsun_ML in production_20Dec2022.pdf
Iana Iatsun_ML in production_20Dec2022.pdfIana Iatsun_ML in production_20Dec2022.pdf
Iana Iatsun_ML in production_20Dec2022.pdf
 
41 WiMLDS Kyiv Paris Poznan.pdf
41 WiMLDS Kyiv Paris Poznan.pdf41 WiMLDS Kyiv Paris Poznan.pdf
41 WiMLDS Kyiv Paris Poznan.pdf
 
Emergency plan to secure winter: what are the measures set up by RTE?
Emergency plan to secure winter: what are the measures set up by RTE?Emergency plan to secure winter: what are the measures set up by RTE?
Emergency plan to secure winter: what are the measures set up by RTE?
 

Recently uploaded

Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxnull - The Open Security Community
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesSinan KOZAK
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubKalema Edgar
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersThousandEyes
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Patryk Bandurski
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Alan Dix
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 3652toLead Limited
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonetsnaman860154
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxOnBoard
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationSafe Software
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationSlibray Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptxLBM Solutions
 

Recently uploaded (20)

Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptxMaking_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
Making_way_through_DLL_hollowing_inspite_of_CFG_by_Debjeet Banerjee.pptx
 
Unblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen FramesUnblocking The Main Thread Solving ANRs and Frozen Frames
Unblocking The Main Thread Solving ANRs and Frozen Frames
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for PartnersEnhancing Worker Digital Experience: A Hands-on Workshop for Partners
Enhancing Worker Digital Experience: A Hands-on Workshop for Partners
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
Integration and Automation in Practice: CI/CD in Mule Integration and Automat...
 
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...Swan(sea) Song – personal research during my six years at Swansea ... and bey...
Swan(sea) Song – personal research during my six years at Swansea ... and bey...
 
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
Tech-Forward - Achieving Business Readiness For Copilot in Microsoft 365
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
How to convert PDF to text with Nanonets
How to convert PDF to text with NanonetsHow to convert PDF to text with Nanonets
How to convert PDF to text with Nanonets
 
Maximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptxMaximizing Board Effectiveness 2024 Webinar.pptx
Maximizing Board Effectiveness 2024 Webinar.pptx
 
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry InnovationBeyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
Beyond Boundaries: Leveraging No-Code Solutions for Industry Innovation
 
Connect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck PresentationConnect Wave/ connectwave Pitch Deck Presentation
Connect Wave/ connectwave Pitch Deck Presentation
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
The transition to renewables in India.pdf
The transition to renewables in India.pdfThe transition to renewables in India.pdf
The transition to renewables in India.pdf
 
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptxE-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
E-Vehicle_Hacking_by_Parul Sharma_null_owasp.pptx
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Key Features Of Token Development (1).pptx
Key  Features Of Token  Development (1).pptxKey  Features Of Token  Development (1).pptx
Key Features Of Token Development (1).pptx
 

Statistical Physics Studies of Machine Learning Problems by Lenka Zdeborova, Researcher @CNRS

  • 1. STATISTICAL PHYSICS MODELLING OF MACHINE LEARNING Lenka Zdeborová (IPhT; CEA Saclay & CNRS) WiMLDS meetup, November 29, 2018
  • 2. Long history of physics influencing machine learning. Examples: Gibbs-Bogoluibov-Feynman’60s - physics behind variational inference. Hopfield model’82. Spin glass models of neural networks Amit, Gutfreund, Sompolinsky’85. Boltzmann machine Hinton, Sejnowski’86 - named after the Boltzmann distribution. Gardner’87 - Maximum storage capacity in neural networks (related to VC dimension). SVMs by Boser, Guyon, Vapnik’92 inspired by Krauth, Mézard’87 Many papers on neural networks in physics in 80s and 90s. PHYSICS IN MACHINE LEARNING
  • 5. THE PUZZLE OF GENERALIZATION According to PAC bounds (via VC dimension, Rademacher complexity) neural networks that generalize well should not be able to fit random labels. ICLR’16
  • 6. THEORETICAL QUESTIONS IN DEEP LEARNING Why the lack of overfitting? “More parameters = more overfitting” Does not seem to hold in deep learning.
  • 7. SAMPLE COMPLEXITY Cifar10 - 50000 samples. How many samples are really needed? How low is the optimal sample complexity? Are we achieving it? If not, is it because of architectures or algorithms?
  • 9. THEORETICAL-PHYSICS ROADMAP 1. Experimental observation or fundamental hypothesis. 2. Unreasonably simple model for which toughest questions can be understood mathematically. 3. Generalize to more realistic models, relies on universality (= important laws of nature rarely depend on many details).
  • 10. MODELS H = J X (ij)2E SiSj P({Si}i=1,...,N ) = e H Z magnetism of materials In data science, models are used to fit the data. (e.g. linear regression: What is the best straight line that captures the dependence of y on x?) In physics, models are the main tool for understanding.
  • 11. MODELS In data science, models are used to fit the data. (e.g. linear regression: What is the best straight line that captures the dependence of y on x?) In physics, models are the main tool for understanding. P({Si}i=1,...,N ) = e H Z H = X (ijk)2E JijkSiSjSk glass transitionp-spin model Jijk ⇠ N(0, 1)
  • 12. IS THIS USEFUL IN MACHINE LEARNING? Example: Single layer neural network = generalized linear regression. Given (X,y) find w such that μ = 1,…, n i = 1,…, pyμ = φ( p ∑ i=1 Xμiwi) data X y labels w weights data weights (noisy) activation function
  • 13. Take random iid Gaussian and random iid from Create Goal: Compute the best possible generalisation error achievable with n samples of dimension p. High-dimensional regime: TEACHER-STUDENT MODEL Xμi w*i yμ = φ( p ∑ i=1 Xμiw*i ) Pw p → ∞ n → ∞ n/p = Ω(1) data X y labels w weights data weights Gardner, Derrida’89, Gyorgyi’90
  • 14. Take random iid Gaussian and random iid from Create Goal: Compute the best possible generalisation error achievable with n samples of dimension p. High-dimensional regime: Xμi w*i yμ = φ( p ∑ i=1 Xμiw*i ) Pw p → ∞n → ∞ n/p = Ω(1) What did we win? Posterior is tractable with replica and cavity method, developed in the theory of spin glasses. P(w|X, y) TEACHER-STUDENT MODEL Gardner, Derrida’89, Gyorgyi’90
  • 15. Optimal generalisation error for any non-linearity and prior on weights. Proof of the replica formula for the optimal generalisation error. Approximate message passing provably reaching the optimal generalization error (out of the hard region). Barbier, Krzakala, Macris, Miolane, LZ, COLT’18, arXiv:1708.03395 NEW W.R.T. 1990
  • 16. LEARNING CURVESgeneralisationerror φ(z) = sign(z) p → ∞ n → ∞ n/p = Ω(1) optimal AMP algorithm logistic regression Pw = 𝒩(0,1) # of samples per dimension n/p
  • 17. generalisationerror # of samples per dimension optimal, achievable optimal AMP algorithm logistic regression wi 2 { 1, +1}φ(z) = sign(z) n/p p → ∞ n → ∞ n/p = Ω(1) PHASE TRANSITIONS
  • 18. generalisationerror # of samples per dimension optimal, achievable optimal AMP algorithm logistic regression wi 2 { 1, +1}φ(z) = sign(z) n/p p → ∞ n → ∞ n/p = Ω(1) hard PHASE TRANSITIONS
  • 20. INCLUDING HIDDEN VARIABLES data X y labels w v1 v2 weights p input units K hidden units output unit L=3 layers n training samples w learned, v1 & v2 fixed Limit: Committee machine Model from Schwarze’92. Proof of the replica formula, and approximate message passing Aubin, Maillard, Barbier, Macris, Krzakala, LZ’19, spotlight at NeurIPS’18. K = O(1)<latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit><latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit><latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit><latexit sha1_base64="pnb2kdx6DB2WkAndPWumY4tr6mw=">AAAB7XicbVBNSwMxEJ2tX7V+VT16CRahXsquFPQiFL0IHqxgP6BdSjbNtrHZZEmyQln6H7x4UMSr/8eb/8a03YO2Phh4vDfDzLwg5kwb1/12ciura+sb+c3C1vbO7l5x/6CpZaIIbRDJpWoHWFPOBG0YZjhtx4riKOC0FYyup37riSrNpHgw45j6ER4IFjKCjZWat5d3Ze+0Vyy5FXcGtEy8jJQgQ71X/Or2JUkiKgzhWOuO58bGT7EyjHA6KXQTTWNMRnhAO5YKHFHtp7NrJ+jEKn0USmVLGDRTf0+kONJ6HAW2M8JmqBe9qfif10lMeOGnTMSJoYLMF4UJR0ai6euozxQlho8twUQxeysiQ6wwMTaggg3BW3x5mTTPKp5b8e6rpdpVFkcejuAYyuDBOdTgBurQAAKP8Ayv8OZI58V5dz7mrTknmzmEP3A+fwD3vI4P</latexit> p → ∞ n → ∞ α = n/p = Ω(1)
  • 21. PHASE TRANSITONS Specialization phase transition = hidden units specialise to correlate with specific features. K=2 sign(0) = 0<latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit><latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit><latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit><latexit sha1_base64="Dc5utTXextgZij3T2/A7p36jzo8=">AAAB+nicbVBNSwMxEJ31s9avrR69BItQLyUrgl6EohePFewHtEvJptk2NJtdkqxS1v4ULx4U8eov8ea/MW33oK0PBh7vzTAzL0gE1wbjb2dldW19Y7OwVdze2d3bd0sHTR2nirIGjUWs2gHRTHDJGoYbwdqJYiQKBGsFo5up33pgSvNY3ptxwvyIDCQPOSXGSj23lHVVhDQfyAo+naArhHtuGVfxDGiZeDkpQ456z/3q9mOaRkwaKojWHQ8nxs+IMpwKNil2U80SQkdkwDqWShIx7Wez0yfoxCp9FMbKljRopv6eyEik9TgKbGdEzFAvelPxP6+TmvDSz7hMUsMknS8KU4FMjKY5oD5XjBoxtoRQxe2tiA6JItTYtIo2BG/x5WXSPKt6uOrdnZdr13kcBTiCY6iABxdQg1uoQwMoPMIzvMKb8+S8OO/Ox7x1xclnDuEPnM8fCE+Shw==</latexit> 0 1 2 3 4 α 0.00 0.05 0.10 0.15 0.20 0.25 Generalizationerrorϵg(α) 0.0 0.2 0.4 0.6 0.8 1.0 Overlapq AMP q00 AMP q01 SE q00 SE q01 SE ϵg(α) AMP ϵg(α) Specialization yμ = sign[sign(∑ i Xμ,iwi,1) + sign ∑ i (Xμ,iwi,2)]
  • 22. Large algorithmic gap: IT threshold: Algorithmic threshold K 1<latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit><latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit><latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit><latexit sha1_base64="tIKGLXfugTsoLV203AoKJohXlvk=">AAAB7nicbVBNS8NAEJ3Ur1q/qh69LBbBU0lE0GPRi+Clgv2ANpTNdpMu3WzC7kQooT/CiwdFvPp7vPlv3LY5aOuDgcd7M8zMC1IpDLrut1NaW9/Y3CpvV3Z29/YPqodHbZNkmvEWS2SiuwE1XArFWyhQ8m6qOY0DyTvB+Hbmd564NiJRjzhJuR/TSIlQMIpW6tyTfhQRb1CtuXV3DrJKvILUoEBzUP3qDxOWxVwhk9SYnuem6OdUo2CSTyv9zPCUsjGNeM9SRWNu/Hx+7pScWWVIwkTbUkjm6u+JnMbGTOLAdsYUR2bZm4n/eb0Mw2s/FyrNkCu2WBRmkmBCZr+TodCcoZxYQpkW9lbCRlRThjahig3BW355lbQv6p5b9x4ua42bIo4ynMApnIMHV9CAO2hCCxiM4Rle4c1JnRfn3flYtJacYuYY/sD5/AH0iY6m</latexit> yμ = sign[ K ∑ a=1 sign(∑ i Xμ,iwi,a)] n > 7.65Kp n > const . K2 p PHASE TRANSITONS 0 2 4 6 8 10 12 14 α = (# of samples)/(#hidden units × input size) 0.0 0.1 0.2 0.3 0.4 0.5 Generalizationerrorϵg(α) Non-specialized hidden units Specialized hidden units Computational gap Bayes optimal ϵg(α) AMP ϵg(α) Discontinuous specialization
  • 23. impossible hard doable todaydoable # of samples Good generalisation error Our goal: Quantify this in more realistic models. Design algorithms working in the doable region.
  • 24. LZ, F. Krzakala, Statistical Physics of Algorithm: Threshold and Algorithms, Advances of Physics (2016), arXiv:1511.02476. J. Barbier, N. Macris, L. Miolane, F. Krzakala, LZ, Phase Transitions, Optimal Errors and Optimality of Message-Passing in Generalized Linear Models, arXiv:1708.03395, COLT’18. B. Aubin, A. Maillard, J. Barbier, F. Krzakala N. Macris,, LZ, The committee machine: Computational to statistical gaps in learning a two-layers neural network, arXiv:1806.05451, NeurIPS’18. REFERENCES
  • 25. Thank you for your attention! LZ, F. Krzakala, Statistical Physics of Algorithm: Threshold and Algorithms, Advances of Physics (2016), arXiv:1511.02476. J. Barbier, N. Macris, L. Miolane, F. Krzakala, LZ, Phase Transitions, Optimal Errors and Optimality of Message-Passing in Generalized Linear Models, arXiv:1708.03395, COLT’18. B. Aubin, A. Maillard, J. Barbier, F. Krzakala N. Macris,, LZ, The committee machine: Computational to statistical gaps in learning a two-layers neural network, arXiv:1806.05451, NeurIPS’18. REFERENCES