An Introduction to
Hamiltonian Neural Networks
Presented by Miles Cranmer, Princeton University
@MilesCranmer
(advised by Shirley Ho/David Spergel)
This is based on none of my own research.
The work is by:
Sam Greydanus, Misko Dzamba, and Jason Yosinski
(+ Tom Bertalan, Felix Dietrich, Igor Mesić, and Ioannis G.
Kevrekidis, whose related work was posted at around the same time)
Outline:
1. Classical Mechanics Review
2. Neural Networks
3. Hamiltonian Neural Networks
4. Bonus: Neural ODEs
5. Code Demo
Forces
• Objects and fields induce
forces on other objects
• The vector sum of all forces gives the net force
• Divide the net force by the mass of the body to get its
acceleration
• Common forces:
• Normal force (desk holding something)
• Friction
• Tension (string)
• Gravity
[1]
Lagrangian Mechanics
• For a coordinate system, q(t)
• (Focus on object coordinates for today)
• Write down kinetic energy T = ½ m q̇²
• Potential energy V = V(q)
• Lagrangian is a function of coordinates and (usually) their first-
order derivatives: L(q, q̇) = T − V
• Action is: S = ∫ L(q, q̇) dt
• Apply the principle of stationary action: δS = 0
Lagrangian Mechanics 2
• By extremizing the action, we get the Euler-Lagrange equations:
d/dt (∂L/∂q̇) − ∂L/∂q = 0
• Example: falling ball (worked out just below)
• Numerically integrate these to get the dynamics of the system
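For concreteness, here is the falling-ball case sketched out (a standard textbook derivation, with y the height and m the mass):

```latex
L = \tfrac{1}{2} m \dot{y}^{2} - m g y
\quad\Longrightarrow\quad
\frac{d}{dt}\frac{\partial L}{\partial \dot{y}} - \frac{\partial L}{\partial y}
  = m \ddot{y} + m g = 0
\quad\Longrightarrow\quad
\ddot{y} = -g
```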
Hamiltonian Mechanics
• Canonical momenta for a system: p = ∂L/∂q̇
• Legendre transformation of L is the Hamiltonian: H(q, p) = p q̇ − L
• This is usually the energy, conserved in a dynamical system.
• What path preserves H?
• Move perpendicular to its gradient!
• This direction is called the symplectic gradient
• Falling ball: H = p²/(2m) + m g y
[2]
Hamiltonian Mechanics 2
• H-preserving path = symplectic gradient:
q̇ = ∂H/∂p, ṗ = −∂H/∂q
• Also known as Hamilton’s equations!
• Can use these first-order, explicit ODEs to integrate physical
dynamics (falling ball worked below)
• Problems with L:
• Second-order, implicit ODEs
• L isn’t meaningful by itself (H, by contrast, is the energy)
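For the same falling ball, Hamilton's equations read off directly (again a standard result):

```latex
H(y, p) = \frac{p^{2}}{2m} + m g y,
\qquad
\dot{y} = \frac{\partial H}{\partial p} = \frac{p}{m},
\qquad
\dot{p} = -\frac{\partial H}{\partial y} = -m g
```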
Things to worry about with L, H
• Dissipation/friction
• Need to add a dissipative force term to the Euler-Lagrange equations
• Can also use a multiplicative factor (e.g., multiply L by e^{γt} for linear damping):
• Energy pools/boundaries
• Constraints
• E.g., normal forces
• Sol’n: Use better coordinates (sometimes tricky)
• Or, use constraint function that equals 0
• (Lagrange multiplier method)
• *After reading the presentation: if you think of a way to add these
techniques to a Hamiltonian NN, come talk to me!
Integrators
• Presented with an explicit differential equation,
we can use several methods to numerically integrate it.
• Recall that: ẋ = f(x, t)
• This is an Euler integrator (sketch below): x(t + Δt) ≈ x(t) + Δt · f(x(t), t)
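A minimal sketch of an Euler integrator in Python (function and variable names are my own, not from the talk's demo):

```python
import numpy as np

def euler_step(f, x, t, dt):
    """One explicit Euler step: x(t + dt) ~ x(t) + dt * f(x(t), t)."""
    return x + dt * f(x, t)

# Falling ball with unit mass, state x = (y, p), using Hamilton's equations:
g = 9.81
f = lambda x, t: np.array([x[1], -g])  # (dy/dt, dp/dt) = (p/m, -m*g)

x, t, dt = np.array([10.0, 0.0]), 0.0, 0.01
for _ in range(100):
    x = euler_step(f, x, t, dt)
    t += dt
```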
Accurate Integrators
• Advanced integrators do several
intermediate steps to improve accuracy
• Runge-Kutta integrators target accuracy
• Can be very accurate, but may not preserve
known invariants!
• Symplectic integrators target energy
conservation
• Can preserve energy very well, but make no
guarantee of pointwise accuracy!
• (All integrators drift from the truth over long
integration times)
[3]
Integrator Examples
• Runge-Kutta 4th order
(most common)
• High accuracy, low cost
• Does not necessarily
preserve energy
[3]
• Symplectic 4th order (Yoshida)
• These nearly conserve energy (the error stays bounded for all time)! See the leapfrog sketch below
• Do drift (update x) and kick (update p) steps separately
• (c, d) are ugly constants, some negative, which add to 1
[4]
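As a sketch of the drift/kick idea, here is the simpler second-order leapfrog scheme from reference [4]; Yoshida's fourth-order method chains these substeps using those (c, d) coefficients. Names are my own:

```python
def leapfrog_step(x, p, dt, dV_dx, m=1.0):
    """One kick-drift-kick leapfrog step for H = p^2/(2m) + V(x)."""
    p = p - 0.5 * dt * dV_dx(x)  # half kick: update momentum
    x = x + dt * p / m           # full drift: update position
    p = p - 0.5 * dt * dV_dx(x)  # half kick again
    return x, p

# Falling ball: V(y) = m*g*y, so dV/dy = m*g (with m = 1 here)
y, p = 10.0, 0.0
for _ in range(100):
    y, p = leapfrog_step(y, p, 0.01, lambda y: 9.81)
```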
Pivot to Machine Learning
• Recall (or not?): Machine Learning is parameter estimation where
the parameters lack explicit physical meaning!
• Many types of ML:
• Supervised (common):
• Regression
• Classification
• Unsupervised
• E.g., clustering, density estimation
• Semi-supervised – a mix
• Linear Regression – this counts as ML!
[5]
Neural Networks
• Repeat after me:
Neural Networks are piecewise Linear Regression!
• Mathematically (we’ll only talk Multi-Layer Perceptrons), e.g., with one hidden layer:
f(x) = W₂ σ(W₁ x + b₁) + b₂, where σ(z) = ReLU(z) = max(z, 0)
• (You do a linear regression -> zero the negatives -> repeat; sketch below)
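A minimal sketch of that "linear regression -> zero the negatives -> repeat" loop (my own toy code, in plain NumPy):

```python
import numpy as np

def mlp(x, weights, biases):
    """Forward pass of a ReLU multi-layer perceptron."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(W @ h + b, 0.0)   # linear regression, then zero the negatives
    return weights[-1] @ h + biases[-1]  # last layer: plain linear regression

# Example: a 2 -> 50 -> 1 network with random parameters
rng = np.random.default_rng(0)
weights = [rng.normal(size=(50, 2)), rng.normal(size=(1, 50))]
biases = [np.zeros(50), np.zeros(1)]
y = mlp(np.array([0.3, -1.2]), weights, biases)
```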
Neural Networks 2
• Repeat after me:
Neural Networks are piecewise Linear Regression!
• 0-hidden-layer Neural Network: linear regression!
• 1-hidden-layer NN with ReLU: piecewise linear regression
• Whatever combination of “neurons” is on = a different “region” for linear
regression
• Up to 2^(layers × hidden size) different linear regression solutions
• Continuously connected across region boundaries
• Don’t expect good extrapolation! Only nearby interpolation
• Neural Net parameters determine both the slopes and the regions.
I don’t believe you!
• Randomly-initialized 2-hidden layer 50-node NN:
Why?
• ReLU on = linear regression
• ReLU off = 0
• Remaining nodes simplify to
linear regression!
[6]
Neural Network Aside
• Other activation functions, like tanh and softplus, smear out this linearity
• Neural Networks are universal function approximators. In the
limit of infinitely wide layers, even with a single hidden one, they can
express any continuous mapping.
• They happen to be efficient at doing this too!
• All Neural Network techniques are about getting them to cheat
less. They are very good at cheating.
• Data Augmentation (hugely important)
• Regularization
• Structure (Convolutional NN, Graph Net, etc)
Differentiability
• The derivative is well-defined: just a product of (sparse) matrices!
• Interested in:
• Derivative w.r.t. weights, used for optimization (SGD or Adam)
• Derivative w.r.t. inputs, which Hamiltonian NNs will need later
• Auto-diff frameworks like TensorFlow and PyTorch make both easy (sketch below).
• Demo: https://playground.tensorflow.org
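A sketch of taking the input derivative in PyTorch, which is exactly the piece HNNs will need (the network shape is an arbitrary choice of mine):

```python
import torch

net = torch.nn.Sequential(
    torch.nn.Linear(2, 50), torch.nn.Softplus(),
    torch.nn.Linear(50, 1))

x = torch.randn(5, 2, requires_grad=True)  # a batch of (q, p) inputs
(dy_dx,) = torch.autograd.grad(net(x).sum(), x, create_graph=True)
# dy_dx has shape (5, 2): the network's gradient at each input.
# create_graph=True lets us backprop through this gradient during training.
```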
Neural Nets for Physical Dynamics
• Here we will focus on physical systems over time.
• Many other things like sequences can be reframed as dynamics
problems.
• We are interested in problems where we have:
• trajectories (xᵢ(t), vᵢ(t)) of positions and velocities
• for i particles over time
• In addition to other fixed properties...
• How do we use Neural Nets to simulate systems?
Example - Pendulum
• How to learn to estimate the future position and velocity of a
pendulum?
• Neural Net: f : ℝⁿ⁺ˡ → ℝⁿ, mapping (state, fixed parameters) to a state update
• n is the number of particles × dynamical parameters
• l is the number of fixed parameters
• Pendulum:
• n = 2 (theta, theta velocity)
• l = 2 (gravity, length of pendulum)
• Want to predict only the change in the state – an easier regression problem
• So, here we are learning a function that approximates a velocity update
and a force law (see the sketch below)
[7]
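A sketch of such a network in PyTorch (layer sizes and names are my own choices, not the talk's demo):

```python
import torch
import torch.nn as nn

# Inputs: state s = (theta, theta_dot) plus fixed c = (g, length); output: ds/dt.
net = nn.Sequential(nn.Linear(2 + 2, 128), nn.Softplus(), nn.Linear(128, 2))

def predict_next(s, c, dt):
    """Predict the change in state, then apply it (an Euler-style update)."""
    ds_dt = net(torch.cat([s, c], dim=-1))
    return s + dt * ds_dt

s = torch.tensor([[0.5, 0.0]])    # theta = 0.5 rad, at rest
c = torch.tensor([[9.81, 1.0]])   # gravity, pendulum length
s_next = predict_next(s, c, dt=0.01)
```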
Real World Applications (of NNs for
simulation)
• Neural Networks learn "effective" forces in simulations
• They only look at the most relevant degrees of freedom!
• Can be more accurate at reduced computational cost
• Some examples:
• Shirley Ho's U-Net can do cosmological simulations much faster and more
accurately than standard simulators
• Peter Battaglia's Interaction Network used in many applications
• Drug discovery/molecular+protein modelling – getting very popular
• E.g., Cecilia Clementi, Frank Noe, Mark Waller, many others
• DeepMind's AlphaFold protein-folding algorithm – far outperforms baseline algorithms at
finding structure from genetic sequences
• See IPAM's recent workshop for a good list!
• Some say intelligent reasoning is based on learning to simulate potential
outcomes => path to general intelligence?
Hamiltonian Neural Networks
• Learn a mapping from coordinates and momenta to a single
number: H_θ(q, p)
• The derivatives of this describe your dynamics via Hamilton's
equations: q̇ = ∂H_θ/∂p, ṗ = −∂H_θ/∂q
• Comparing the true and predicted dynamical updates gives a
minimization objective (sketch below):
L = ‖∂H_θ/∂p − q̇‖² + ‖∂H_θ/∂q + ṗ‖²
(Sam’s blog)
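A minimal PyTorch sketch of this objective, under my own naming (the authors' real code is linked from Sam's blog, resource 4 below):

```python
import torch

def hnn_loss(H_net, q, p, dq_dt, dp_dt):
    """Hamilton's-equations matching loss; H_net maps concat(q, p) -> scalar H."""
    x = torch.cat([q, p], dim=-1)
    if not x.requires_grad:
        x.requires_grad_(True)
    (dH,) = torch.autograd.grad(H_net(x).sum(), x, create_graph=True)
    dH_dq, dH_dp = dH.chunk(2, dim=-1)
    # Hamilton's equations say dq/dt = dH/dp and dp/dt = -dH/dq:
    return ((dH_dp - dq_dt) ** 2 + (dH_dq + dp_dt) ** 2).mean()
```

At training time you would fit H_net (any small MLP with a 1-D output) by minimizing this over observed (q, p, q̇, ṗ) tuples.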
Why?
• It works better; it’s more interpretable. Not only
do we have a simulator, we know the energy!
(Sam’s blog)
Why does it work?
• It uses symplectic gradients: by prescribing that we can only move
along the level set of H, it learns the proper H.
(Figure: the learned H at the start vs. the end of training – Sam’s blog)
Graph Network extension: (Sanchez-Gonzalez et al.)
Integrators
• So far we have only talked about Euler integrators. But since Hamilton's
equations are just ODEs, we can use any integrator: RK4 and symplectic
included.
• If H has learned the true energy, we can exactly preserve it with
symplectic integrators.
• In practice, RK4 is still more accurate. Maybe some combination is best?
This model is less than 6 months old! We don't know what is best yet.
• Can train + evaluate with RK4 or symplectic methods!
• Each step makes multiple queries to, and takes multiple derivatives of, your network’s H
• This works very well in practice (sketch below).
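A sketch of RK4 on the learned vector field, reusing the H_net convention from the loss sketch above (names mine):

```python
import torch

def hnn_field(H_net, x):
    """(dq/dt, dp/dt) from the learned H via autograd; x = concat(q, p)."""
    x = x.detach().requires_grad_(True)
    (dH,) = torch.autograd.grad(H_net(x).sum(), x)
    dH_dq, dH_dp = dH.chunk(2, dim=-1)
    return torch.cat([dH_dp, -dH_dq], dim=-1)

def rk4_step(H_net, x, dt):
    """One RK4 step: four queries to the network, four input gradients."""
    k1 = hnn_field(H_net, x)
    k2 = hnn_field(H_net, x + 0.5 * dt * k1)
    k3 = hnn_field(H_net, x + 0.5 * dt * k2)
    k4 = hnn_field(H_net, x + dt * k3)
    return x + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
```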
I don’t know the canonical coordinates!
• Pair two Neural Networks:
• g, an autoencoder to latent variables
• H, a Hamiltonian that treats those
latent variables as (q, p).
• Training this setup jointly
will learn the
canonical coords
+ the Hamiltonian! (A rough sketch follows below.)
(Sam’s blog)
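A rough sketch of the pairing. The sizes, names, and the finite-difference latent velocity are my assumptions, and this reuses hnn_loss from the earlier sketch; the paper's actual setup is more careful:

```python
import torch
import torch.nn as nn

obs_dim = 28 * 28  # e.g., flattened pixel observations (assumed size)
enc = nn.Sequential(nn.Linear(obs_dim, 200), nn.Tanh(), nn.Linear(200, 2))
dec = nn.Sequential(nn.Linear(2, 200), nn.Tanh(), nn.Linear(200, obs_dim))
H_net = nn.Sequential(nn.Linear(2, 200), nn.Tanh(), nn.Linear(200, 1))

def combined_loss(obs, obs_next, dt):
    z, z_next = enc(obs), enc(obs_next)   # latent (q, p) at t and t + dt
    recon = ((dec(z) - obs) ** 2).mean()  # autoencoder reconstruction term
    q_dot, p_dot = ((z_next - z) / dt).chunk(2, dim=-1)  # finite-difference velocities
    q, p = z.chunk(2, dim=-1)
    return recon + hnn_loss(H_net, q, p, q_dot, p_dot)   # hnn_loss from earlier sketch
```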
Tips
• Activations:
• Recall: Neural Networks are piecewise linear regression.
• Taking derivatives of a ReLU network means we are literally learning a lookup
table – not good!
• Use Softplus or Tanh to make H have a smoother derivative
• Use more hidden nodes than for regular NNs, as H needs to be very
smooth
• Stability:
• According to some (Stephan Hoyer), it is better to learn multiple timesteps at
once.
• Use RK4 integrators
Bonus: Neural ODEs
• Famous 2018 paper:
Neural Ordinary Differential
Equations.
• Hamiltonian Neural
Networks -ARE- a Neural
ODE.
• Paper connects ResNets
with Euler integrators
(schematic below)
• Paper: “Why not just learn a
derivative and integrate it?”
• Smoother output!
(Chen et al)
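The ResNet/Euler correspondence in one line (schematic, my own phrasing):

```python
# ResNet block:       x = x + f(x)            # step size hard-wired to 1
# Euler integrator:   x = x + dt * f(x, t)    # explicit step size dt
# Neural ODE: let dt -> 0 and hand the learned f to an off-the-shelf ODE solver.
```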
PyTorch Tutorial – Falling Ball
• Short: https://bit.ly/2JiTEJE
• (Copy to new notebook in your drive)
Figure + other references
1. http://ffden-2.phys.uaf.edu/211_fall2004.web.dir/Jeff_Levison/Freebody%20diagram.htm
2. https://physics.stackexchange.com/questions/384990/why-will-a-dropped-object-land-at-the-same-time-as-a-sideways-thrown-one
3. https://en.wikipedia.org/wiki/Runge%E2%80%93Kutta_methods
4. https://en.wikipedia.org/wiki/Leapfrog_integration
5. https://en.wikipedia.org/wiki/Linear_regression#/media/File:Linear_regression.svg
6. https://medium.com/@amarbudhiraja/https-medium-com-amarbudhiraja-learning-less-to-learn-better-dropout-in-deep-machine-learning-74334da4bfc5
7. https://medium.com/@kriswilliams/how-life-is-like-a-pendulum-8811c4177685

Other resources used:
1. https://arxiv.org/abs/1906.01563
2. https://arxiv.org/abs/1907.12715
3. https://arxiv.org/abs/1909.12790
4. https://greydanus.github.io/2019/05/15/hamiltonian-nns/
5. https://arxiv.org/abs/1806.07366