“Deep” Learning
CS771: Introduction to Machine Learning
Nisheeth
CS771: Intro to ML
Limitations of Linear Models
 Linear models: Output produced by taking a linear combination of input features
 This basic architecture is classically also known as the “Perceptron” (not to be confused with the Perceptron
“algorithm”, which learns a linear classification model)
 However, this architecture can’t learn nonlinear functions or nonlinear decision boundaries
Some monotonic function
(e.g., sigmoid)
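The basic "Perceptron" architecture above (linear combination of features, then a monotonic function) can be sketched in a few lines. This is an illustrative NumPy snippet, not from the slides; the names and weight values are made up:

```python
import numpy as np

def sigmoid(a):
    """A monotonic squashing function, as in the figure."""
    return 1.0 / (1.0 + np.exp(-a))

def perceptron_predict(w, b, x):
    """Linear combination of input features, then a monotonic function."""
    return sigmoid(np.dot(w, x) + b)

w = np.array([1.0, -1.0])   # the decision boundary is the line w.x + b = 0
x = np.array([2.0, 2.0])    # this point lies exactly on that boundary
print(perceptron_predict(w, 0.0, x))   # 0.5: right on the boundary
```

Whatever monotonic function is applied, the boundary `w.x + b = 0` stays linear, which is the limitation being described.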
Limitations of Classic Non-Linear Models
 Non-linear models: kNN, kernel methods, generative classification, decision trees etc.
 All have their own disadvantages
 kNN and kernel methods are expensive to generate predictions from
 Kernel-based and generative models restrict the decision boundary to a particular class of functions, e.g., quadratic polynomials, Gaussian functions, etc.
 Decision trees require optimization over many arbitrary hyperparameters to generate good results, and are
(somewhat) expensive to generate predictions from
 Not a deal-breaker: on large datasets, the most common competitor to deep learning tends to be some decision-tree derivative
 In general, non-linear ML models are complicated beasts
Neural Networks: Multi-layer Perceptron (MLP)
 An MLP consists of an input layer, an output layer, and one or more hidden layers
Input Layer
(with D=3 visible units)
Hidden Layer
(with K=2 hidden units)
Output Layer
(with a scalar-valued output)
Hidden layer units/nodes act as
new features
Can think of this model as a
combination of two predictions
ℎ𝑛1 and ℎ𝑛2 of two simpler
models
Learnable
weights
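The "combination of two simpler predictions" view of the MLP in the figure can be sketched directly. A minimal NumPy illustration; the function names and weight values are my own, chosen to match the figure's D = 3 inputs and K = 2 hidden units:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(W, v, x):
    """One hidden layer: rows of W drive K hidden units whose outputs h
    act as new features; v linearly combines these K simpler predictions."""
    h = sigmoid(W @ x)    # hidden units h_n1, h_n2 (post-activations)
    s = v @ h             # linear combination of the hidden units
    return sigmoid(s)     # scalar-valued output

# D = 3 visible units, K = 2 hidden units, as in the figure (weights made up)
W = np.array([[0.5, -0.2, 0.1],
              [0.3, 0.8, -0.5]])
v = np.array([1.0, -1.0])
x = np.array([1.0, 2.0, 3.0])
y = mlp_forward(W, v, x)
```

Both `W` and `v` are the learnable weights; each row of `W` is itself a small linear model.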
Illustration: Neural Net with One Hidden Layer
 Each input 𝒙𝑛 is transformed into several pre-activations using linear models
 Nonlinear activation applied on each pre-act.
 Linear model learned on the new “features” 𝒉𝑛
 Finally, the output is produced as 𝑦𝑛 = 𝑜(𝑠𝑛)
 Unknowns (𝒘1, 𝒘2, … , 𝒘𝐾, 𝒗) are learned by minimizing some loss function, for example ℒ(𝑾, 𝒗) = Σ_{n=1}^{N} ℓ(𝑦𝑛, 𝑜(𝑠𝑛)) (squared, logistic, softmax, etc.)
Can even be identity
(e.g., for regression yn = sn )
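The loss ℒ(𝑾, 𝒗) summed over all N examples can be computed vectorized. A toy sketch assuming squared loss with identity output 𝑜 (the regression case mentioned above); all data values here are synthetic:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_loss(W, v, X, y):
    """Sum over n of l(y_n, o(s_n)) for a one-hidden-layer MLP,
    with squared loss and identity output o (regression: y_n = s_n)."""
    H = sigmoid(X @ W.T)   # N x K matrix of hidden features h_n
    s = H @ v              # N pre-outputs s_n; o is the identity here
    return np.sum((y - s) ** 2)

# Synthetic data: N = 4 examples, D = 3 features, K = 2 hidden units
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
y = rng.normal(size=4)
W = rng.normal(size=(2, 3))
v = rng.normal(size=2)
loss = mlp_loss(W, v, X, y)
```

Swapping the last line of `mlp_loss` for a logistic or softmax loss covers the classification cases listed on the slide.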
Neural Nets: A Compact Illustration
 Note: Hidden layer pre-act 𝑎𝑛𝑘 and post-act ℎ𝑛𝑘 will be shown together for brevity
 Different layers may use different non-linear activations. Output layer may have none.
Single Hidden Layer
More succinctly..
Will combine pre-act and post-act and directly show
only ℎ𝑛𝑘 to denote the value computed by a
hidden node
Will directly show the
final output
Will denote a linear combination
of inputs followed by a nonlinear
operation on the result
Activation Functions: Some Common Choices
sigmoid tanh
ReLU
Leaky ReLU
(each plot shows the activation output h as a function of the pre-activation a)
ReLU and Leaky ReLU are
among the most popular ones
Preferred over sigmoid: helps keep the mean of the next layer’s inputs close to zero (with sigmoid, it is close to 0.5)
For sigmoid as well as tanh, gradients
saturate (become close to zero as the
function tends to its extreme values)
Helps fix the dead neuron
problem of ReLU when 𝑎 is a
negative number
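The four activations above are one-liners. A small NumPy sketch (the 0.01 leak slope is a common default choice, not specified on the slide):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))        # output in (0, 1); saturates at both ends

def tanh(a):
    return np.tanh(a)                      # zero-centered output, but also saturates

def relu(a):
    return np.maximum(0.0, a)              # gradient is exactly 0 for a < 0 ("dead" units)

def leaky_relu(a, slope=0.01):
    return np.where(a > 0, a, slope * a)   # small negative slope avoids dead units

a = np.array([-2.0, 0.0, 2.0])
print(relu(a))         # [0.  0.  2.]
print(leaky_relu(a))   # [-0.02  0.    2.  ]
```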
MLPs Can Learn Nonlinear Functions: A Brief Justification
 An MLP can be seen as a composition of multiple linear models combined nonlinearly
Standard Single “Perceptron” Classifier (no hidden units)
score
Score monotonically increases. One-sided increase (not ideal for learning nonlinear decision boundaries)
A Multi-layer Perceptron Classifier
(one hidden layer with 2 units)
Capable of learning nonlinear boundaries
score score
score
Obtained by composing the two one-sided increasing score functions (using 𝑣1 = 1 and 𝑣2 = −1 to “flip” the second one before adding). This can now learn a nonlinear decision boundary
High score in the middle and low score on either of the two sides of it. Exactly what we want for the given classification problem
A nonlinear classification problem
A single hidden layer MLP with sufficiently large number of
hidden units can approximate any function (Hornik, 1991)
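The "flip and add" construction (𝑣1 = 1, 𝑣2 = −1) can be checked numerically. A toy one-dimensional sketch; the two boundary locations are made up for illustration:

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def bump_score(x):
    """Add two one-sided increasing scores, the second flipped via v2 = -1.
    The sum is high in the middle and low on both sides: exactly the
    nonlinear score shape the slide describes."""
    h1 = sigmoid(x - 1.0)            # rises past the first boundary (x = 1)
    h2 = sigmoid(x - 3.0)            # rises past the second boundary (x = 3)
    return 1.0 * h1 + (-1.0) * h2    # v1 = 1, v2 = -1

# High score between the boundaries, low score on either side
print(bump_score(2.0) > bump_score(-2.0))   # True
print(bump_score(2.0) > bump_score(6.0))    # True
```

Neither `h1` nor `h2` alone can produce this bump; the nonlinearity comes from composing them.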
Examples of Some Basic NN/MLP Architectures
Single Hidden Layer and Single Output
 One hidden layer with 𝐾 nodes and a single output (e.g., scalar-valued regression or binary classification)
Single Hidden Layer and Multiple Outputs
 One hidden layer with 𝐾 nodes and a vector of 𝐶 outputs (e.g., vector-valued regression, multi-class classification, or multi-label classification)
Multiple Hidden Layers (One/Multiple Outputs)
 Most general case: multiple hidden layers (with the same or different number of hidden nodes in each) and a scalar or vector-valued output
Neural Nets are Feature Learners
 Hidden layers can be seen as learning a feature representation 𝜙(𝒙𝑛) for each input 𝒙𝑛
The last hidden layer’s
values
Kernel Methods vs Neural Nets
 Recall the prediction rule for a kernel method (e.g., kernel SVM)
 This is analogous to a single-hidden-layer NN with fixed/pre-defined hidden nodes {𝑘(𝒙𝑛, 𝒙)}_{n=1}^{N} and output weights {𝛼𝑛}_{n=1}^{N}
 The prediction rule for a deep neural network
 Here, the ℎ𝑘’s are learned from data (possibly after multiple layers of nonlinear transformations)
 Both kernel methods and deep NNs can be seen as using nonlinear basis functions for making predictions. Kernel methods use fixed basis functions (defined by the kernel) whereas NNs learn the basis functions adaptively from data
Also note that neural nets are typically faster than kernel methods at test time, since kernel methods need to store (and compare against) the training examples at test time whereas neural nets do not
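The two prediction rules can be put side by side in code. An illustrative sketch; the RBF kernel choice and all numbers are my own, not from the slide:

```python
import numpy as np

def rbf_kernel(x1, x2, gamma=1.0):
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def kernel_predict(alphas, X_train, x, gamma=1.0):
    """Kernel method: N fixed basis functions k(x_n, .), one per training
    example, so the training set must be kept around at test time."""
    return sum(a * rbf_kernel(xn, x, gamma) for a, xn in zip(alphas, X_train))

def nn_predict(W, v, x):
    """NN analogue: K *learned* basis functions h_k(x); test-time cost is
    independent of the number of training examples."""
    h = np.tanh(W @ x)   # learned basis functions (one hidden layer here)
    return v @ h

X_train = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
alphas = np.array([0.5, -1.0, 0.3])
W = np.array([[0.2, -0.4], [0.7, 0.1]])
v = np.array([1.0, -1.0])
x_new = np.array([0.5, 0.5])
```

`kernel_predict` needs `X_train` at prediction time; `nn_predict` only needs the learned weights `W` and `v`, which is the speed/memory point made in the note above.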
Features Learned by a Neural Network
 Node values in each hidden layer tell us how much a “learned” feature is active in 𝒙𝑛
 Hidden layer weights act like patterns/feature-detectors/filters
All the incoming weights (a vector) on this hidden node can be seen as
representing a template/pattern/feature-detector
Here, 𝒘100^(1) denotes a D-dim pattern/feature-detector/filter
All the incoming weights (a vector) on this hidden node can be seen as
representing a pattern/feature-detector/filter
Here, 𝒘32^(2) denotes a 𝐾1-dim template/pattern/feature-detector
Why Neural Networks Work Better: Another View
 Linear models tend to only learn the “average” pattern
 Deep models can learn multiple patterns (each hidden node can learn one pattern)
 Thus deep models can capture more subtle variations than a simpler linear model