28. Assumption
If we assume f^1, f^2, f^3 are all linear functions, then a^1 = f^1(w^1 p + b^1) is also a linear function, since w^1 p + b^1 is linear. After applying a linear f^1, each data point just moves within a linear space.
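The claim above can be checked directly: a sketch with made-up scalar weights, showing that composing two linear (affine) functions yields another function of the same form A*p + C.

```python
# Hypothetical scalar "layers": each is f(p) = w*p + b, i.e. linear (affine).
def linear(w, b):
    return lambda p: w * p + b

f1 = linear(2.0, 1.0)    # f1(p) = 2p + 1
f2 = linear(3.0, -1.0)   # f2(p) = 3p - 1

# Composition: f2(f1(p)) = 3*(2p + 1) - 1 = 6p + 2, again of the form A*p + C.
composed = lambda p: f2(f1(p))
flat = linear(6.0, 2.0)

for p in [0.0, 1.5, -4.0]:
    assert composed(p) == flat(p)
```

Stacking linear layers therefore adds no expressive power: the stack collapses into one linear function.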
29. Proof
To make the calculation simpler, let f^1, f^2, f^3 each be the identity function, i.e. f^i(p) = p.
a^3 = f^3(w^3 f^2(w^2 f^1(w^1 p + b^1) + b^2) + b^3)
a^3 = f^3(w^3 f^2(w^2 (w^1 p + b^1) + b^2) + b^3)        since f^1(p) = p
a^3 = f^3(w^3 f^2(w^2 w^1 p + w^2 b^1 + b^2) + b^3)
a^3 = f^3(w^3 (w^2 w^1 p + w^2 b^1 + b^2) + b^3)         since f^2(p) = p
a^3 = f^3(w^3 w^2 w^1 p + w^3 w^2 b^1 + w^3 b^2 + b^3)
a^3 = w^3 w^2 w^1 p + w^3 w^2 b^1 + w^3 b^2 + b^3        since f^3(p) = p
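The collapse above can be verified numerically. This is a minimal sketch with made-up scalar weights (w1..w3, b1..b3 are illustrative values, not from the slides): the layer-by-layer forward pass with identity activations matches the collapsed form A*p + C at every input.

```python
# Hypothetical scalar weights and biases for the three layers.
w1, b1 = 2.0, 0.5
w2, b2 = -1.0, 1.0
w3, b3 = 0.5, -2.0

def forward(p):
    a1 = w1 * p + b1      # f^1 = identity
    a2 = w2 * a1 + b2     # f^2 = identity
    return w3 * a2 + b3   # f^3 = identity

# Collapsed coefficients from the derivation:
A = w3 * w2 * w1
C = w3 * w2 * b1 + w3 * b2 + b3

for p in [0.0, 1.0, -3.0, 7.5]:
    assert abs(forward(p) - (A * p + C)) < 1e-12
```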
30. We can say
a^3 = Ap + C, where A = w^3 w^2 w^1 and C = w^3 w^2 b^1 + w^3 b^2 + b^3
As a result, the final activation a^3 is just a single linear neuron. It can only produce a linear decision boundary (a hyperplane), so it cannot solve classification problems that require a non-linear boundary.
So we should use a non-linear activation function in an MLP.
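The classic example of a problem a purely linear model cannot solve is XOR. The sketch below (a brute-force search over a hypothetical grid of weights, not anything from the slides) checks that no single hyperplane w1*x + w2*y + b separates XOR's two classes:

```python
import itertools

# XOR truth table: inputs -> label.
xor = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

# Coarse grid of candidate weights/bias (illustrative, not exhaustive;
# XOR is provably not linearly separable, so no grid will succeed).
grid = [i / 4 for i in range(-8, 9)]

def separates(w1, w2, b):
    # Label-1 points must land strictly on the positive side,
    # label-0 points on the non-positive side.
    return all((w1 * x + w2 * y + b > 0) == bool(label)
               for (x, y), label in xor.items())

found = any(separates(w1, w2, b)
            for w1, w2, b in itertools.product(grid, repeat=3))
```

Here `found` stays False: one linear neuron, and hence any stack of them, cannot draw the non-linear boundary XOR needs, while an MLP with a non-linear activation can.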