SlideShare a Scribd company logo
1 of 32
Multi-Task Learning in
Deep Neural Networks
An Overview
• By Sebastian Ruder, Insight Centre for Data Analytics, Dublin.
• published in Jun 2017
• Cited by 1849
• Multitask learning (1997) By Rich Caruana
• His thesis on Multi-Task Learning helped create interest in a new subfield of
machine learning called Transfer Learning.
Introduction
Traditional Machine Learning single task:
typically care about optimizing for a particular metric, whether this is a score on a
certain benchmark or a business KPI. In order to do this, we generally train a single
model or an ensemble of models to perform our desired task. We then fine-tune
and tweak these models until their performance no longer increases.
Multi-task learning (MTL):
is a machine learning approach in which we try to learn multiple tasks
simultaneously, optimizing multiple loss functions at once. Rather than training
independent models for each task, we allow a single model to learn to complete all
of the tasks at once.
Real world implementation
Tesla Auto Pilot
Andrej Karpathy: Tesla Autopilot and Multi-Task Learning for Perception and
Prediction
MTL methods for Deep Learning
• Hard parameter sharing:
sharing the hidden layers between all tasks, while keeping several task-specific
output layers.
• Soft parameter sharing:
each task has its own model with its own parameters. The distance between the
parameters of the model is then regularized in order to encourage the parameters
to be similar.
Why MTL work?
• Implicit data augmentation
• Attention focusing
• Eavesdropping
• Representation bias
• Regularization
• Implicit data augmentation
MTL effectively increases the sample size that we are using for training our model.
As all tasks are at least somewhat noisy, when training a model on some task A
aim is to learn a good representation for task A that ideally ignores the data-
dependent noise and generalizes well. As different tasks have different noise
patterns, a model that learns two tasks simultaneously is able to learn a more
general representation. Learning just task A bears the risk of overfitting to task A,
while learning A and B jointly enables the model to obtain a better representation
through averaging the noise patterns.
• Attention focusing
If a task is very noisy or data is limited and high-dimensional, it can be difficult for a
model to differentiate between relevant and irrelevant features. MTL help the model
focus its attention on those features that actually matter.
• Eavesdropping
Some features G are easy to learn for some task B, while being difficult to learn for
another task A. This might either be because A interacts with the features in a more
complex way or because other features are impeding the model's ability to learn G.
Through MTL, we can allow the model to eavesdrop, i.e. learn G through task B. The
easiest way to do this is through hints, i.e. directly training the model to predict the
most important features.
• Representation bias
MTL biases the model to prefer representations that other tasks also prefer. This
also help the model to generalize to new tasks in the future as a hypothesis space
that performs well for a sufficiently large number of training tasks will also perform
well for learning novel tasks as long as they are from the same environment.
• Regularization
Finally, MTL acts as a regularizer by introducing an inductive bias. As such, it
reduces the risk of overfitting as well as the Rademacher complexity of the model,
i.e. its ability to fit random noise.
Experiment
• Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and
Semantics
By:
• Alex Kendall
• Yarin Gal
• Roberto Cipolla
Conclusion
• From natural language processing and speech recognition to computer vision and
drug discovery, multi-task learning (MTL) has led to success in a variety of machine
learning applications.
• This work aims to assist ML practitioners in implementing MTL by explaining how it
works and offering advice for selecting relevant auxiliary activities.
• Our understanding of tasks – their similarity, relationship, hierarchy, and benefit for
MTL – is still restricted, and we need to study them more them more to acquire a
deeper grasp of MTL's deep neural network generalization capabilities.
References
• Ruder, S. (2017). An overview of multi-task learning in deep neural networks. arXiv
preprint arXiv:1706.05098.
• Caruana, R. (1997). Multitask learning. Machine learning, 28(1), 41-75.
• Andrej Karpathy. (2019, 3 December). Tesla Autopilot and Multi-Task Learning for
Perception and Prediction. [Video]. YouTube.
https://www.youtube.com/watch?v=IHH47nZ7FZU
• Abu-Mostafa, Y. S. (1990). Learning from hints in neural networks. Journal of
Complexity, 6(2), 192–198. https://doi.org/10.1016/0885-064X(90)90006-Y
• Kendall, A., Gal, Y., & Cipolla, R. (2018). Multi-task learning using uncertainty to weigh
losses for scene geometry and semantics. In Proceedings of the IEEE conference on
computer vision and pattern recognition (pp. 7482-7491).

More Related Content

What's hot

Kaggle presentation
Kaggle presentationKaggle presentation
Kaggle presentationHJ van Veen
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationYan Xu
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine LearningShao-Chuan Wang
 
Artifical Neural Network and its applications
Artifical Neural Network and its applicationsArtifical Neural Network and its applications
Artifical Neural Network and its applicationsSangeeta Tiwari
 
Machine learning seminar ppt
Machine learning seminar pptMachine learning seminar ppt
Machine learning seminar pptRAHUL DANGWAL
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronMostafa G. M. Mostafa
 
07 regularization
07 regularization07 regularization
07 regularizationRonald Teo
 
Application of machine learning in industrial applications
Application of machine learning in industrial applicationsApplication of machine learning in industrial applications
Application of machine learning in industrial applicationsAnish Das
 
Introduction Of Artificial neural network
Introduction Of Artificial neural networkIntroduction Of Artificial neural network
Introduction Of Artificial neural networkNagarajan
 
Deep learning: Overfitting , underfitting, and regularization
Deep learning: Overfitting , underfitting, and regularizationDeep learning: Overfitting , underfitting, and regularization
Deep learning: Overfitting , underfitting, and regularizationAly Abdelkareem
 
Feed forward ,back propagation,gradient descent
Feed forward ,back propagation,gradient descentFeed forward ,back propagation,gradient descent
Feed forward ,back propagation,gradient descentMuhammad Rasel
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep LearningOswald Campesato
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural NetworksDean Wyatte
 
Feature Engineering in Machine Learning
Feature Engineering in Machine LearningFeature Engineering in Machine Learning
Feature Engineering in Machine LearningKnoldus Inc.
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksChristian Perone
 
Building a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to ZBuilding a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to ZCharles Vestur
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesAshraf Uddin
 

What's hot (20)

Kaggle presentation
Kaggle presentationKaggle presentation
Kaggle presentation
 
Deep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and RegularizationDeep Feed Forward Neural Networks and Regularization
Deep Feed Forward Neural Networks and Regularization
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
Artifical Neural Network and its applications
Artifical Neural Network and its applicationsArtifical Neural Network and its applications
Artifical Neural Network and its applications
 
Machine learning seminar ppt
Machine learning seminar pptMachine learning seminar ppt
Machine learning seminar ppt
 
Neural Networks: Multilayer Perceptron
Neural Networks: Multilayer PerceptronNeural Networks: Multilayer Perceptron
Neural Networks: Multilayer Perceptron
 
07 regularization
07 regularization07 regularization
07 regularization
 
Application of machine learning in industrial applications
Application of machine learning in industrial applicationsApplication of machine learning in industrial applications
Application of machine learning in industrial applications
 
Introduction Of Artificial neural network
Introduction Of Artificial neural networkIntroduction Of Artificial neural network
Introduction Of Artificial neural network
 
Deep learning: Overfitting , underfitting, and regularization
Deep learning: Overfitting , underfitting, and regularizationDeep learning: Overfitting , underfitting, and regularization
Deep learning: Overfitting , underfitting, and regularization
 
Feed forward ,back propagation,gradient descent
Feed forward ,back propagation,gradient descentFeed forward ,back propagation,gradient descent
Feed forward ,back propagation,gradient descent
 
Machine learning
Machine learningMachine learning
Machine learning
 
Introduction to Deep Learning
Introduction to Deep LearningIntroduction to Deep Learning
Introduction to Deep Learning
 
Deep learning
Deep learning Deep learning
Deep learning
 
Intro to Neural Networks
Intro to Neural NetworksIntro to Neural Networks
Intro to Neural Networks
 
Feature Engineering in Machine Learning
Feature Engineering in Machine LearningFeature Engineering in Machine Learning
Feature Engineering in Machine Learning
 
Deep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural NetworksDeep Learning - Convolutional Neural Networks
Deep Learning - Convolutional Neural Networks
 
Building a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to ZBuilding a performing Machine Learning model from A to Z
Building a performing Machine Learning model from A to Z
 
Machine learning
Machine learningMachine learning
Machine learning
 
Big Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture CapabilitiesBig Data: Its Characteristics And Architecture Capabilities
Big Data: Its Characteristics And Architecture Capabilities
 

Similar to Multi-Task Learning in Deep Neural Networks.pptx

Multi-Task Learning in Deep Neural Networks.pptx
Multi-Task Learning in Deep Neural Networks.pptxMulti-Task Learning in Deep Neural Networks.pptx
Multi-Task Learning in Deep Neural Networks.pptxibrahimalshareef3
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningBigDataCloud
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basicsNeeleEilers
 
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupDealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupYves Peirsman
 
Chapter01 introductory handbook
Chapter01 introductory handbookChapter01 introductory handbook
Chapter01 introductory handbookRaman Kannan
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflowCharmi Chokshi
 
Review On In-Context Leaning.pptx
Review On In-Context Leaning.pptxReview On In-Context Leaning.pptx
Review On In-Context Leaning.pptxwesleyshih4
 
Overcoming the limits of Machine Learning in business
Overcoming the limits of Machine Learning in businessOvercoming the limits of Machine Learning in business
Overcoming the limits of Machine Learning in businessAhmed Fattah
 
Deep learning short introduction
Deep learning short introductionDeep learning short introduction
Deep learning short introductionAdwait Bhave
 
Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018
Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018
Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018Universitat Politècnica de Catalunya
 
Preliminary Exam Slides
Preliminary Exam SlidesPreliminary Exam Slides
Preliminary Exam SlidesDebasmit Das
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsXavier Amatriain
 
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...Universitat Politècnica de Catalunya
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentationSurya Sg
 
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"Fwdays
 
Deep learning Introduction and Basics
Deep learning  Introduction and BasicsDeep learning  Introduction and Basics
Deep learning Introduction and BasicsNitin Mishra
 

Similar to Multi-Task Learning in Deep Neural Networks.pptx (20)

Multi-Task Learning in Deep Neural Networks.pptx
Multi-Task Learning in Deep Neural Networks.pptxMulti-Task Learning in Deep Neural Networks.pptx
Multi-Task Learning in Deep Neural Networks.pptx
 
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher ManningDeep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
Deep Learning for NLP (without Magic) - Richard Socher and Christopher Manning
 
Machine Learning basics
Machine Learning basicsMachine Learning basics
Machine Learning basics
 
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP MeetupDealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
Dealing with Data Scarcity in Natural Language Processing - Belgium NLP Meetup
 
Multi Task Learning and Meta Learning
Multi Task Learning and Meta LearningMulti Task Learning and Meta Learning
Multi Task Learning and Meta Learning
 
Chapter01 introductory handbook
Chapter01 introductory handbookChapter01 introductory handbook
Chapter01 introductory handbook
 
Deep learning with tensorflow
Deep learning with tensorflowDeep learning with tensorflow
Deep learning with tensorflow
 
Review On In-Context Leaning.pptx
Review On In-Context Leaning.pptxReview On In-Context Leaning.pptx
Review On In-Context Leaning.pptx
 
Overcoming the limits of Machine Learning in business
Overcoming the limits of Machine Learning in businessOvercoming the limits of Machine Learning in business
Overcoming the limits of Machine Learning in business
 
Deep learning short introduction
Deep learning short introductionDeep learning short introduction
Deep learning short introduction
 
Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018
Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018
Lifelong / Incremental Deep Learning - Ramon Morros - UPC Barcelona 2018
 
Preliminary Exam Slides
Preliminary Exam SlidesPreliminary Exam Slides
Preliminary Exam Slides
 
Lessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systemsLessons learned from building practical deep learning systems
Lessons learned from building practical deep learning systems
 
Generative Pretraining.pptx
Generative Pretraining.pptxGenerative Pretraining.pptx
Generative Pretraining.pptx
 
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
Life-long / Incremental Learning (DLAI D6L1 2017 UPC Deep Learning for Artifi...
 
Meta-Learning Presentation
Meta-Learning PresentationMeta-Learning Presentation
Meta-Learning Presentation
 
Nlp research presentation
Nlp research presentationNlp research presentation
Nlp research presentation
 
tensorflow.pptx
tensorflow.pptxtensorflow.pptx
tensorflow.pptx
 
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
Thomas Wolf "An Introduction to Transfer Learning and Hugging Face"
 
Deep learning Introduction and Basics
Deep learning  Introduction and BasicsDeep learning  Introduction and Basics
Deep learning Introduction and Basics
 

Recently uploaded

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...MyIntelliSource, Inc.
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about usDynamic Netsoft
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataBradBedford3
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxTier1 app
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number SystemsJheuzeDellosa
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝soniya singh
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackVICTOR MAESTRE RAMIREZ
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsMehedi Hasan Shohan
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideChristina Lin
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxbodapatigopi8531
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptkotipi9215
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningVitsRangannavar
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...kellynguyen01
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEOrtus Solutions, Corp
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsAlberto González Trastoy
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software DevelopersVinodh Ram
 

Recently uploaded (20)

Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
Steps To Getting Up And Running Quickly With MyTimeClock Employee Scheduling ...
 
DNT_Corporate presentation know about us
DNT_Corporate presentation know about usDNT_Corporate presentation know about us
DNT_Corporate presentation know about us
 
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer DataAdobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
Adobe Marketo Engage Deep Dives: Using Webhooks to Transfer Data
 
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptxKnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
KnowAPIs-UnknownPerf-jaxMainz-2024 (1).pptx
 
What is Binary Language? Computer Number Systems
What is Binary Language?  Computer Number SystemsWhat is Binary Language?  Computer Number Systems
What is Binary Language? Computer Number Systems
 
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
Call Girls in Naraina Delhi 💯Call Us 🔝8264348440🔝
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanation
 
Cloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStackCloud Management Software Platforms: OpenStack
Cloud Management Software Platforms: OpenStack
 
XpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software SolutionsXpertSolvers: Your Partner in Building Innovative Software Solutions
XpertSolvers: Your Partner in Building Innovative Software Solutions
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
 
Hand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptxHand gesture recognition PROJECT PPT.pptx
Hand gesture recognition PROJECT PPT.pptx
 
chapter--4-software-project-planning.ppt
chapter--4-software-project-planning.pptchapter--4-software-project-planning.ppt
chapter--4-software-project-planning.ppt
 
cybersecurity notes for mca students for learning
cybersecurity notes for mca students for learningcybersecurity notes for mca students for learning
cybersecurity notes for mca students for learning
 
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
Short Story: Unveiling the Reasoning Abilities of Large Language Models by Ke...
 
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...Call Girls In Mukherjee Nagar 📱  9999965857  🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
Call Girls In Mukherjee Nagar 📱 9999965857 🤩 Delhi 🫦 HOT AND SEXY VVIP 🍎 SE...
 
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASEBATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
BATTLEFIELD ORM: TIPS, TACTICS AND STRATEGIES FOR CONQUERING YOUR DATABASE
 
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time ApplicationsUnveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
Unveiling the Tech Salsa of LAMs with Janus in Real-Time Applications
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Professional Resume Template for Software Developers
Professional Resume Template for Software DevelopersProfessional Resume Template for Software Developers
Professional Resume Template for Software Developers
 

Multi-Task Learning in Deep Neural Networks.pptx

  • 1. Multi-Task Learning in Deep Neural Networks An Overview
  • 2. • By Sebastian Ruder, Insight Centre for Data Analytics, Dublin. • published in Jun 2017 • Cited by 1849 • Multitask learning (1997) By Rich Caruana • His thesis on Multi-Task Learning helped create interest in a new subfield of machine learning called Transfer Learning.
  • 3. Introduction Traditional Machine Learning single task: typically care about optimizing for a particular metric, whether this is a score on a certain benchmark or a business KPI. In order to do this, we generally train a single model or an ensemble of models to perform our desired task. We then fine-tune and tweak these models until their performance no longer increases.
  • 4. Multi-task learning (MTL): is a machine learning approach in which we try to learn multiple tasks simultaneously, optimizing multiple loss functions at once. Rather than training independent models for each task, we allow a single model to learn to complete all of the tasks at once.
  • 5.
  • 6. Real world implementation Tesla Auto Pilot Andrej Karpathy: Tesla Autopilot and Multi-Task Learning for Perception and Prediction
  • 7.
  • 8.
  • 9.
  • 10.
  • 11.
  • 12.
  • 13.
  • 14.
  • 15.
  • 16.
  • 17.
  • 18.
  • 19.
  • 20.
  • 21. MTL methods for Deep Learning • Hard parameter sharing: sharing the hidden layers between all tasks, while keeping several task-specific output layers. • Soft parameter sharing: each task has its own model with its own parameters. The distance between the parameters of the model is then regularized in order to encourage the parameters to be similar.
  • 22.
  • 23.
  • 24. Why MTL work? • Implicit data augmentation • Attention focusing • Eavesdropping • Representation bias • Regularization
  • 25. • Implicit data augmentation MTL effectively increases the sample size that we are using for training our model. As all tasks are at least somewhat noisy, when training a model on some task A aim is to learn a good representation for task A that ideally ignores the data- dependent noise and generalizes well. As different tasks have different noise patterns, a model that learns two tasks simultaneously is able to learn a more general representation. Learning just task A bears the risk of overfitting to task A, while learning A and B jointly enables the model to obtain a better representation through averaging the noise patterns.
  • 26. • Attention focusing If a task is very noisy or data is limited and high-dimensional, it can be difficult for a model to differentiate between relevant and irrelevant features. MTL help the model focus its attention on those features that actually matter. • Eavesdropping Some features G are easy to learn for some task B, while being difficult to learn for another task A. This might either be because A interacts with the features in a more complex way or because other features are impeding the model's ability to learn G. Through MTL, we can allow the model to eavesdrop, i.e. learn G through task B. The easiest way to do this is through hints, i.e. directly training the model to predict the most important features.
  • 27. • Representation bias MTL biases the model to prefer representations that other tasks also prefer. This also help the model to generalize to new tasks in the future as a hypothesis space that performs well for a sufficiently large number of training tasks will also perform well for learning novel tasks as long as they are from the same environment. • Regularization Finally, MTL acts as a regularizer by introducing an inductive bias. As such, it reduces the risk of overfitting as well as the Rademacher complexity of the model, i.e. its ability to fit random noise.
  • 28. Experiment • Multi-Task Learning Using Uncertainty to Weigh Losses for Scene Geometry and Semantics By: • Alex Kendall • Yarin Gal • Roberto Cipolla
  • 29.
  • 30.
  • 31. Conclusion • From natural language processing and speech recognition to computer vision and drug discovery, multi-task learning (MTL) has led to success in a variety of machine learning applications. • This work aims to assist ML practitioners in implementing MTL by explaining how it works and offering advice for selecting relevant auxiliary activities. • Our understanding of tasks – their similarity, relationship, hierarchy, and benefit for MTL – is still restricted, and we need to study them more them more to acquire a deeper grasp of MTL's deep neural network generalization capabilities.
  • 32. References • Ruder, S. (2017). An overview of multi-task learning in deep neural networks. arXiv preprint arXiv:1706.05098. • Caruana, R. (1997). Multitask learning. Machine learning, 28(1), 41-75. • Andrej Karpathy. (2019, 3 December). Tesla Autopilot and Multi-Task Learning for Perception and Prediction. [Video]. YouTube. https://www.youtube.com/watch?v=IHH47nZ7FZU • Abu-Mostafa, Y. S. (1990). Learning from hints in neural networks. Journal of Complexity, 6(2), 192–198. https://doi.org/10.1016/0885-064X(90)90006-Y • Kendall, A., Gal, Y., & Cipolla, R. (2018). Multi-task learning using uncertainty to weigh losses for scene geometry and semantics. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 7482-7491).