5. Training CNN with gradient descent
• A CNN as a composition of functions
  f_w(x) = f_L(… f_2(f_1(x; w_1); w_2) …; w_L)
• Parameters
  w = (w_1, w_2, …, w_L)
• Empirical loss function
  L(w) = (1/n) Σ_i l(z_i, f_w(x_i))
• Gradient descent
  w_{t+1} = w_t − η_t (∂L/∂w)(w_t)
  (new weight = old weight − learning rate × gradient of the loss)
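The update rule above can be sketched for a toy least-squares problem, where the gradient of the empirical loss is available in closed form (function and variable names here are illustrative, not from the slides):

```python
import numpy as np

def gradient_descent(X, z, eta=0.1, steps=100):
    """Minimize L(w) = (1/n) * sum_i (w @ x_i - z_i)**2 with the
    update w_{t+1} = w_t - eta * dL/dw evaluated at w_t."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        residual = X @ w - z               # f_w(x_i) - z_i for all i
        grad = (2.0 / n) * X.T @ residual  # dL/dw at the current w_t
        w = w - eta * grad                 # gradient-descent step
    return w

# Fit z = 2*x; the learned weight converges toward 2.
X = np.array([[1.0], [2.0], [3.0]])
z = np.array([2.0, 4.0, 6.0])
w = gradient_descent(X, z)
```

In a real CNN, the gradient ∂L/∂w is not written by hand but computed layer by layer with backpropagation.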
6. Dropout
Dropout: A simple way to prevent neural networks from overfitting [Srivastava JMLR 2014]
Intuition: successful conspiracies
• 50 people planning a conspiracy
• Strategy A: plan a big conspiracy involving 50 people
• Likely to fail: all 50 people must play their parts correctly.
• Strategy B: plan 10 conspiracies each involving 5 people
• Likely to succeed!
7. Dropout
Main idea: efficiently approximate combining exponentially many different neural network architectures
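A minimal sketch of one common formulation, "inverted" dropout (the Srivastava et al. paper instead scales weights at test time; the two are equivalent in expectation):

```python
import numpy as np

def dropout(a, p=0.5, train=True, rng=None):
    """Inverted dropout: at train time, zero each activation with
    probability p and rescale the survivors by 1/(1-p), so the
    expected activation is unchanged; at test time, the full
    (implicitly averaged) network is used unmodified."""
    if not train:
        return a
    rng = rng or np.random.default_rng(0)
    mask = rng.random(a.shape) >= p      # keep each unit with prob. 1-p
    return a * mask / (1.0 - p)

a = np.ones(10000)
out = dropout(a, p=0.5)   # roughly half zeros, survivors doubled
```

Each training step thus samples one thinned sub-network; sharing weights across all sampled sub-networks is what makes the exponential combination tractable.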
8. Data Augmentation (Jittering)
• Create virtual training samples
• Horizontal flip
• Random crop
• Color casting
• Geometric distortion
Deep Image [Wu et al. 2015]
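Two of the jittering operations above (horizontal flip and random crop) can be sketched on a raw array; the crop margin and helper name are illustrative choices, not from the slides:

```python
import numpy as np

def augment(img, rng=None):
    """Create a virtual training sample: random horizontal flip
    followed by a random crop 4 pixels smaller per dimension."""
    rng = rng or np.random.default_rng(0)
    if rng.random() < 0.5:
        img = img[:, ::-1]               # horizontal flip
    h, w = img.shape[:2]
    ch, cw = h - 4, w - 4                # crop size (illustrative margin)
    top = rng.integers(0, h - ch + 1)
    left = rng.integers(0, w - cw + 1)
    return img[top:top + ch, left:left + cw]

img = np.arange(32 * 32).reshape(32, 32).astype(float)
crop = augment(img)                      # 28x28 virtual sample
```

Applying a fresh random transform every epoch means the network rarely sees exactly the same pixels twice, which acts as a regularizer.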
9. Parametric Rectified Linear Unit
Delving Deep into Rectifiers: Surpassing Human-Level Performance on
ImageNet Classification [He et al. 2015]
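The PReLU activation from He et al. keeps a small, learned slope `a` on the negative side instead of ReLU's hard zero; a minimal element-wise sketch (with `a` as a fixed scalar rather than a learned per-channel parameter):

```python
import numpy as np

def prelu(x, a=0.25):
    """PReLU: f(x) = x for x > 0, and a * x otherwise.
    In the paper, `a` is learned jointly with the other weights;
    with `a` fixed this reduces to Leaky ReLU, and a = 0 gives ReLU."""
    return np.where(x > 0, x, a * x)

x = np.array([-2.0, 0.0, 3.0])
y = prelu(x)   # negative inputs are scaled by a, positives pass through
```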
11. Transfer Learning
• Improvement of learning in a new task through the
transfer of knowledge from a related task that has
already been learned.
• Weight initialization for CNN
• Two major strategies
• ConvNet as fixed feature extractor
• Fine-tuning the ConvNet
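The first strategy, ConvNet as fixed feature extractor, can be sketched with a toy stand-in for the pretrained network: the frozen weights are never updated, and only a new linear classifier is trained on the extracted "CNN codes" (all names and the toy data here are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for pretrained ConvNet layers: W_frozen is never updated.
# A real setup would load weights pretrained on e.g. ImageNet.
W_frozen = rng.standard_normal((8, 4))

def cnn_codes(X):
    """Fixed feature extractor: forward pass only, no weight updates."""
    return np.maximum(X @ W_frozen, 0.0)   # ReLU features = "CNN codes"

def train_linear(codes, y, eta=0.1, steps=500):
    """Train only the new linear classifier (logistic regression)."""
    w = np.zeros(codes.shape[1])
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-codes @ w))
        w -= eta * codes.T @ (p - y) / len(y)
    return w

# Toy transfer task: labels are linearly separable in code space.
X = rng.standard_normal((100, 8))
codes = cnn_codes(X)
y = (codes @ rng.standard_normal(4) > 0).astype(float)
w = train_linear(codes, y)
acc = (((codes @ w) > 0) == (y > 0.5)).mean()
```

Fine-tuning, the second strategy, would instead continue backpropagation into W_frozen itself, usually with a small learning rate.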
12. When to fine-tune your model?
• New dataset is small and similar to the original dataset
  • train a linear classifier on the CNN codes
• New dataset is large and similar to the original dataset
  • fine-tune through the full network
• New dataset is small but very different from the original dataset
  • train an SVM classifier on activations from somewhere earlier in the network
• New dataset is large and very different from the original dataset
  • fine-tune through the entire network
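The four cases above form a simple two-axis decision rule, which can be written down directly (the function name and return strings are illustrative):

```python
def transfer_strategy(large_dataset, similar_to_original):
    """Pick a transfer-learning strategy from the two axes above:
    dataset size and similarity to the original (pretraining) data."""
    if similar_to_original:
        return ("fine-tune the full network" if large_dataset
                else "linear classifier on CNN codes")
    return ("fine-tune the entire network" if large_dataset
            else "classifier on activations from an earlier layer")
```

The intuition: small datasets cannot support updating many weights without overfitting, and dissimilar datasets benefit less from the later, more task-specific layers of the pretrained network.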
21. Things to remember
• Training CNN
• Dropout
• Data augmentation
• Activation
• Batch normalization
• Transfer learning
• Two strategies
• CNN code
• Fine-tuning
• When and how to transfer
• Characteristics of transfer learning