Feedback
■Activation function
 Why use it?
• Large hidden layer : can represent a complex function
• Small hidden layer : can represent only a simple function
• Input node : 13, output node : 1
■ hidden layer 1 (Node 4 ) : 13 * 4 + 5 = 57
■ Hidden layer 2 (Node 2) : 13 * 2 + 2 * 2 + 3 = 35
Ch4_Feedback
Interaction Lab., Kumoh National Institute of Technology
■Sigmoid function
 $h(x) = \frac{1}{1 + e^{-x}}$
 Smooth curve, continuous variation
 Returns a real value
 $h(1) \approx 0.731$
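As a quick check of the value above, here is a minimal sketch of the sigmoid in plain Python. The function name `sigmoid` is an illustrative choice, not something taken from the slides.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic sigmoid h(x) = 1 / (1 + e^(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

print(round(sigmoid(1.0), 3))  # 0.731
```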
■ReLU function
 $h(x) = \begin{cases} x & (x \ge 0) \\ 0 & (x < 0) \end{cases}$
 Variants: Leaky ReLU, PReLU
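A small sketch of ReLU and Leaky ReLU for a single value. The names `relu` and `leaky_relu` and the default slope of 0.01 are assumptions made for illustration. PReLU has the same form, but treats the slope as a learned parameter.

```python
def relu(x: float) -> float:
    """ReLU: pass x through when x >= 0, otherwise output 0."""
    return x if x >= 0 else 0.0

def leaky_relu(x: float, slope: float = 0.01) -> float:
    """Leaky ReLU: a small fixed slope for negative inputs (0.01 is an assumed default)."""
    return x if x >= 0 else slope * x

print(relu(2.0), relu(-3.0))   # 2.0 0.0
print(leaky_relu(-3.0))        # a small negative value instead of 0
```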
Ch4_Feedback
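■Sigmoid function
 Vanishing gradient
• During backpropagation the sigmoid's derivative is at most 0.25, so the gradient shrinks at every layer it passes back through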
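The effect can be illustrated with a rough sketch that multiplies sigmoid derivatives together, one per layer. Evaluating every layer at the same input and ignoring the weights is a simplification, and `chained_gradient` is a hypothetical helper name.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def sigmoid_grad(x: float) -> float:
    """Derivative of the sigmoid: h'(x) = h(x) * (1 - h(x)), at most 0.25."""
    y = sigmoid(x)
    return y * (1.0 - y)

def chained_gradient(x: float, depth: int) -> float:
    """Product of `depth` sigmoid derivatives, all evaluated at x (a simplification)."""
    grad = 1.0
    for _ in range(depth):
        grad *= sigmoid_grad(x)
    return grad

print(sigmoid_grad(0.0))           # 0.25
print(chained_gradient(0.0, 10))   # about 9.5e-07
```

Even in the best case for the sigmoid (input 0), ten layers reduce the gradient to less than one millionth of its original size.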
Ch4_Feedback
■HOG (Histogram of Oriented Gradients)
 Uses an image's local gradients as features of the image
Ch5_Feedback
■GD vs SGD
 Gradient Descent
• Computes the gradient over all the data => 1 h per step
• Takes the best step forward
• 6 steps = 6 h
• Reliable, but too slow
 Stochastic Gradient Descent
• Computes the gradient over only some of the data => 5 min per step
• Takes quick steps forward
• 10 steps = 50 min
• Wanders a little, but makes progress fast
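The contrast can be sketched on a toy problem. Everything here is an assumption made for illustration: the data (`y = 3x`), the learning rate, the batch size, and the function names. The only difference between the two loops is how many samples each step looks at.

```python
import random

# Toy data for y = 3x (noise-free), so the best weight is w = 3.
xs = [i / 10 for i in range(1, 21)]
ys = [3.0 * x for x in xs]

def grad(w, indices):
    """Gradient of the mean squared error over the selected samples."""
    return sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in indices) / len(indices)

def gradient_descent(steps=100, lr=0.1):
    w = 0.0
    for _ in range(steps):
        w -= lr * grad(w, range(len(xs)))   # every step uses all the data
    return w

def stochastic_gradient_descent(steps=200, lr=0.1, batch_size=4, seed=0):
    rng = random.Random(seed)
    w = 0.0
    for _ in range(steps):
        batch = rng.sample(range(len(xs)), batch_size)   # only some of the data
        w -= lr * grad(w, batch)
    return w

print(gradient_descent(), stochastic_gradient_descent())
```

Each SGD step here touches 4 samples instead of 20, so it is cheaper, and both runs end up close to `w = 3`.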
Ch5_Feedback
■Optimizer
Ch5_Feedback
Interaction Lab. Kumoh National Institute of Technology
Deep Learning from Scratch
chapter 6. back propagation
JaeYeop Jeong
■Intro
■Computational graph
■Chain rule
■Back propagation
■Implementation of simple layer
■Implementation of activation function layer
■Implementation of Affine/softmax layer
Agenda
■Numerical differentiation is simple and easy to implement
 But it takes a long time to compute
■Back propagation
 Computes the gradients of the weights efficiently
 Can be understood through formulas or a computational graph
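To see why numerical differentiation is slow, consider a minimal central-difference sketch. It evaluates the loss twice per parameter, so the cost grows with the number of weights. The `loss` function and the call counter are hypothetical stand-ins for a real network's loss.

```python
def numerical_gradient(f, x, h=1e-4):
    """Central-difference gradient: two evaluations of f per parameter."""
    grad = []
    for i in range(len(x)):
        plus = list(x)
        minus = list(x)
        plus[i] += h
        minus[i] -= h
        grad.append((f(plus) - f(minus)) / (2 * h))
    return grad

call_count = {"n": 0}

def loss(w):
    """Hypothetical loss: sum of squares, whose exact gradient is 2 * w."""
    call_count["n"] += 1
    return sum(v * v for v in w)

g = numerical_gradient(loss, [1.0, -2.0, 3.0])
print(g, call_count["n"])   # gradient close to [2, -4, 6], using 6 loss evaluations
```

With millions of weights this means millions of full forward passes per update, whereas backpropagation obtains all the gradients from one forward and one backward pass.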
Intro
■A graph of the calculation process
 Node, edge
■Q1
 Hyunbin bought 2 apples at a supermarket for 100 won each. Find the amount he pays, given that a 10% consumption tax is charged.
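The forward calculation for Q1 can be written as two multiply nodes evaluated from left to right. This is a minimal sketch, and the variable names are illustrative.

```python
apple_price = 100      # won per apple
apple_count = 2
tax_rate = 1.1         # 10% consumption tax as a multiplier

subtotal = apple_price * apple_count   # first multiply node: 200
payment = subtotal * tax_rate          # second multiply node: about 220

print(subtotal, round(payment, 2))     # 200 220.0
```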
Computational graph(1/5)
Computational graph(2/5)
■Q2
 Hyunbin bought 2 apples and 3 tangerines at a supermarket. Apples cost 100 won each and tangerines cost 150 won each. Find the amount he pays when the consumption tax is 10%.
 Construct the computational graph
 Carry out the calculation from left to right
Computational graph(3/5)
■Local computation
 Each node computes its result using only the values directly connected to it
Computational graph(4/5)
4000 + 200 = 4200
■Why computational graph
 Local computation
 Keep all intermediate calculation results
 Calculate differentials efficiently
• Apple price : $x$, Payment : $L$ => $\frac{\partial L}{\partial x}$
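What this derivative means can be checked numerically on the apple example (2 apples, 10% tax). This is a sketch under those assumptions, and `payment` and `derivative_wrt_price` are hypothetical names.

```python
def payment(apple_price, apple_count=2, tax_rate=1.1):
    """Forward pass of the apple example: price * count * tax."""
    return apple_price * apple_count * tax_rate

def derivative_wrt_price(x, h=1e-4):
    """Central-difference estimate of dL/dx at apple price x."""
    return (payment(x + h) - payment(x - h)) / (2 * h)

print(round(derivative_wrt_price(100.0), 4))   # 2.2
```

A value of 2.2 says that raising the apple price by 1 won raises the payment by 2.2 won.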
Computational graph(5/5)
■Back propagation of computational graph
 Multiply the incoming signal by the node's local derivative and pass it on in the direction opposite to forward propagation
• $y = f(x) = x^2$, $\frac{\partial y}{\partial x} = 2x$
Chain rule(1/3)
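[Figure: node $f$ with input $x$ and output $y$; the signal $E$ arriving from the output side is passed back as $E \frac{\partial y}{\partial x}$]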
■$z = t^2$, $t = x + y$
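For this composite function the chain rule gives dz/dx = dz/dt * dt/dx = 2t * 1 = 2(x + y). A short sketch that compares this with a numerical estimate follows. The function names and the sample point (1, 2) are illustrative choices.

```python
def forward(x, y):
    t = x + y        # t = x + y
    z = t ** 2       # z = t^2
    return t, z

def dz_dx(x, y):
    """Chain rule: dz/dx = dz/dt * dt/dx = 2t * 1."""
    t, _ = forward(x, y)
    dz_dt = 2 * t
    dt_dx = 1
    return dz_dt * dt_dx

x, y, h = 1.0, 2.0, 1e-4
numeric = (forward(x + h, y)[1] - forward(x - h, y)[1]) / (2 * h)
print(dz_dx(x, y), round(numeric, 6))   # 6.0 6.0
```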
Chain rule(2/3)
Chain rule(3/3)
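$\frac{\partial z}{\partial z} \frac{\partial z}{\partial t} \frac{\partial t}{\partial x} = \frac{\partial z}{\partial t} \frac{\partial t}{\partial x} = \frac{\partial z}{\partial x}$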
■Back propagation of add node
 $z = x + y$, $\frac{\partial z}{\partial x} = 1$, $\frac{\partial z}{\partial y} = 1$
Back propagation(1/5)
■Back propagation of add node
 Add node : passes the upstream gradient on unchanged
Back propagation(2/5)
■Back propagation of multiply node
 $z = xy$, $\frac{\partial z}{\partial x} = y$, $\frac{\partial z}{\partial y} = x$
Back propagation(3/5)
■Back propagation of multiply node
 Multiplies the upstream gradient by the swapped forward-pass inputs
• The inputs of forward propagation must therefore be stored
Back propagation(4/5)
■Example
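As one worked case, the backward pass for a purchase of 2 apples at 100 won and 3 tangerines at 150 won with 10% tax can be traced by hand using the add-node and multiply-node rules. The variable names are illustrative, and this is a sketch rather than a reproduction of the slide's own figure.

```python
# Forward pass of the apple-and-tangerine purchase.
apple_price, apple_count = 100, 2
tangerine_price, tangerine_count = 150, 3
tax_rate = 1.1

apple_total = apple_price * apple_count               # 200
tangerine_total = tangerine_price * tangerine_count   # 450
subtotal = apple_total + tangerine_total              # 650
payment = subtotal * tax_rate                         # about 715

# Backward pass, starting from dL/dL = 1 and moving right to left.
d_payment = 1.0
d_subtotal = d_payment * tax_rate     # multiply node: swap in the other input
d_tax_rate = d_payment * subtotal
d_apple_total = d_subtotal            # add node: pass the gradient through
d_tangerine_total = d_subtotal
d_apple_price = d_apple_total * apple_count
d_apple_count = d_apple_total * apple_price
d_tangerine_price = d_tangerine_total * tangerine_count
d_tangerine_count = d_tangerine_total * tangerine_price

print(round(payment, 2), round(d_apple_price, 2), round(d_tangerine_price, 2))  # 715.0 2.2 3.3
```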
Back propagation(5/5)
■Multiply layer
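A minimal sketch of a multiply layer with `forward` and `backward` methods. The class name and interface are assumptions in the style commonly used for such layers, not a copy of the slide's code.

```python
class MulLayer:
    """Multiply node: stores its forward inputs for use in the backward pass."""

    def __init__(self):
        self.x = None
        self.y = None

    def forward(self, x, y):
        self.x = x
        self.y = y
        return x * y

    def backward(self, dout):
        dx = dout * self.y   # gradient for x uses the other input, y
        dy = dout * self.x   # gradient for y uses the other input, x
        return dx, dy

layer = MulLayer()
out = layer.forward(3.0, 4.0)
dx, dy = layer.backward(1.0)
print(out, dx, dy)   # 12.0 4.0 3.0
```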
Implementation of simple layer(1/3)
■Add layer
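A matching sketch of an add layer. Because both local derivatives are 1, it does not need to remember its inputs. The class name and interface are again assumptions made for illustration.

```python
class AddLayer:
    """Add node: needs no stored state because its local derivatives are both 1."""

    def forward(self, x, y):
        return x + y

    def backward(self, dout):
        # The upstream gradient goes to both inputs unchanged.
        return dout * 1, dout * 1

layer = AddLayer()
print(layer.forward(200, 450))   # 650
print(layer.backward(1.1))       # (1.1, 1.1)
```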
Implementation of simple layer(2/3)
Implementation of simple layer(3/3)
■ReLU layer
 $y = \begin{cases} x & (x > 0) \\ 0 & (x \le 0) \end{cases}$, $\frac{\partial y}{\partial x} = \begin{cases} 1 & (x > 0) \\ 0 & (x \le 0) \end{cases}$
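A sketch of a ReLU layer that follows the two cases above. It works on plain Python lists to stay dependency-free (an array library would normally be used), and the `mask` attribute is an illustrative name for the remembered positions where the input was not positive.

```python
class Relu:
    """ReLU layer operating on a list of floats."""

    def __init__(self):
        self.mask = None   # True where the forward input was <= 0

    def forward(self, x):
        self.mask = [v <= 0 for v in x]
        return [0.0 if blocked else v for v, blocked in zip(x, self.mask)]

    def backward(self, dout):
        # Gradient passes through where x > 0 and is zeroed where x <= 0.
        return [0.0 if blocked else d for d, blocked in zip(dout, self.mask)]

relu = Relu()
print(relu.forward([1.0, -0.5, 0.0, 3.0]))    # [1.0, 0.0, 0.0, 3.0]
print(relu.backward([0.1, 0.2, 0.3, 0.4]))    # [0.1, 0.0, 0.0, 0.4]
```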
Implementation of activation function layer
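[Figure: ReLU node with input $x$ and output $y$. For $x > 0$ the upstream gradient $\frac{\partial L}{\partial y}$ is passed back unchanged; for $x \le 0$ the value passed back is 0.]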
■Sigmoid layer
 $y = \frac{1}{1 + \exp(-x)}$
 exp node → $y = \exp(x)$
 / node → $y = \frac{1}{x}$
Implementation of activation function layer
■Sigmoid layer
 $y = \frac{1}{1 + \exp(-x)}$; writing $u = 1 + \exp(-x)$ for the input of the / node gives $y = \frac{1}{u}$
Implementation of activation function layer
■Sigmoid layer
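The backward pass of the sigmoid can be derived by chaining the local derivatives of its nodes. A sketch of that derivation follows, where $u = 1 + \exp(-x)$ is a helper symbol introduced here for the input of the / node:

```latex
\begin{aligned}
y &= \frac{1}{u}, \quad u = 1 + \exp(-x) \\
\frac{\partial L}{\partial x}
  &= \frac{\partial L}{\partial y}\cdot\frac{\partial y}{\partial u}\cdot\frac{\partial u}{\partial x}
   = \frac{\partial L}{\partial y}\cdot\left(-\frac{1}{u^{2}}\right)\cdot\left(-\exp(-x)\right) \\
  &= \frac{\partial L}{\partial y}\, y^{2}\exp(-x)
   = \frac{\partial L}{\partial y}\, y\,(1 - y)
\end{aligned}
```

The last step uses $1 - y = \frac{\exp(-x)}{1 + \exp(-x)}$, so the gradient can be computed from the forward output $y$ alone.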
Implementation of activation function layer
■Sigmoid layer
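A minimal sketch of a sigmoid layer for a single value, using the standard result that the local derivative is y(1 - y). The class name and the cached `out` attribute are illustrative assumptions.

```python
import math

class Sigmoid:
    """Sigmoid layer for a single float; caches the output y for backward."""

    def __init__(self):
        self.out = None

    def forward(self, x):
        self.out = 1.0 / (1.0 + math.exp(-x))
        return self.out

    def backward(self, dout):
        # dL/dx = dL/dy * y * (1 - y), computed from the cached output alone.
        return dout * self.out * (1.0 - self.out)

layer = Sigmoid()
y = layer.forward(0.0)
print(y, layer.backward(1.0))   # 0.5 0.25
```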
Implementation of activation function layer
■Affine layer
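For reference, the standard forward and backward relations of an affine layer are sketched below. The row-vector convention ($X$ multiplied on the left of $W$) and the symbols $X$, $W$, $B$, $Y$ are assumptions made here, since the slide's figure is not available in the text.

```latex
\begin{aligned}
Y &= XW + B \\
\frac{\partial L}{\partial X} &= \frac{\partial L}{\partial Y}\, W^{\top} \\
\frac{\partial L}{\partial W} &= X^{\top}\, \frac{\partial L}{\partial Y}
\end{aligned}
```

Each gradient has the same shape as the quantity it belongs to, which is a convenient check when deciding where the transpose goes.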
Implementation of Affine/softmax layer
■Batch affine layer
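A dependency-free sketch of a batch affine layer using nested lists. A real implementation would normally use an array library. The helper functions `matmul` and `transpose` and the example weights are hypothetical. The bias is added to every row in the forward pass, so its gradient is the sum of the upstream gradient over the batch axis.

```python
def matmul(a, b):
    """Matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(m):
    return [list(row) for row in zip(*m)]

class Affine:
    """Affine layer Y = X W + b for a batch X of shape (N, in), using nested lists."""

    def __init__(self, W, b):
        self.W, self.b = W, b
        self.x = None
        self.dW = None
        self.db = None

    def forward(self, x):
        self.x = x
        out = matmul(x, self.W)
        # b is added to every row of the batch.
        return [[v + bj for v, bj in zip(row, self.b)] for row in out]

    def backward(self, dout):
        dx = matmul(dout, transpose(self.W))          # (N, out) x (out, in)
        self.dW = matmul(transpose(self.x), dout)     # (in, N) x (N, out)
        self.db = [sum(col) for col in zip(*dout)]    # sum over the batch axis
        return dx

layer = Affine(W=[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]], b=[0.5, 0.5, 0.5])
print(layer.forward([[1.0, 0.0], [0.0, 1.0]]))              # [[1.5, 2.5, 3.5], [4.5, 5.5, 6.5]]
print(layer.backward([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]))   # [[6.0, 15.0], [6.0, 15.0]]
print(layer.db)                                             # [2.0, 2.0, 2.0]
```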
Implementation of Affine/softmax layer
■Softmax-with-Loss layer
 Softmax layer
 Cross entropy error
Implementation of Affine/softmax layer
■Softmax-with-Loss layer
Implementation of Affine/softmax layer
■Softmax-with-Loss layer
 t : (0, 1, 0)
 y : (0.3, 0.2, 0.5) => y – t : (0.3, -0.8, 0.5)
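The numbers above can be reproduced with a small single-sample sketch of a softmax-with-loss layer. The class and function names are illustrative, the input scores are chosen so that the softmax output is (0.3, 0.2, 0.5), and a batched version would additionally divide the gradient by the batch size.

```python
import math

def softmax(a):
    """Softmax with the max subtracted first for numerical stability."""
    m = max(a)
    exps = [math.exp(v - m) for v in a]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy_error(y, t):
    """Cross entropy for one sample with a one-hot target t."""
    return -sum(ti * math.log(yi + 1e-7) for yi, ti in zip(y, t))

class SoftmaxWithLoss:
    def __init__(self):
        self.y = None
        self.t = None

    def forward(self, a, t):
        self.t = t
        self.y = softmax(a)
        return cross_entropy_error(self.y, t)

    def backward(self, dout=1.0):
        # Gradient with respect to the scores a is simply y - t.
        return [dout * (yi - ti) for yi, ti in zip(self.y, self.t)]

layer = SoftmaxWithLoss()
scores = [math.log(0.3), math.log(0.2), math.log(0.5)]   # softmax of these is (0.3, 0.2, 0.5)
loss = layer.forward(scores, [0, 1, 0])
print([round(g, 3) for g in layer.backward()])   # [0.3, -0.8, 0.5]
```

The large negative entry belongs to the correct class, whose score the update will push up, while the two wrong classes are pushed down in proportion to the probability they received.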
Implementation of Affine/softmax layer
Q&A
