A Survey on Transfer Learning
Sinno Jialin Pan
Department of Computer Science and Engineering
The Hong Kong University of Science and Technology
Joint work with Prof. Qiang Yang
Transfer Learning? (DARPA 05)
Transfer Learning (TL):
    The ability of a system to recognize and apply knowledge and
    skills learned in previous tasks to novel tasks (in new domains)

It is motivated by human learning. People can often transfer knowledge
learnt previously to novel situations:
  Chess → Checkers
  Mathematics → Computer Science
  Table Tennis → Tennis
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Traditional ML vs. TL
                                          (P. Langley 06)
[Diagram: "Traditional ML in multiple domains" trains and tests on items
within each domain separately; "Transfer of learning across domains" reuses
training items from one domain when testing in another.]

Humans can learn in many domains. Humans can also transfer from one
domain to other domains.
Traditional ML vs. TL
[Diagram: in the learning process of traditional ML, each learning system is
trained from scratch on its own training items; in the learning process of
transfer learning, knowledge extracted from source learning systems is passed
to the target learning system.]
Notation
Domain:
A domain consists of two components: a feature space and a marginal
probability distribution over it.

In general, if two domains are different, then they may have different feature
spaces or different marginal distributions.

Task:
Given a specific domain and a label space, a task is to predict the
corresponding label for each instance in the domain.

In general, if two tasks are different, then they may have different label spaces
or different conditional distributions.
Notation
For simplicity, we only consider at most two domains and two tasks.

Source domain: D_S
Task in the source domain: T_S
Target domain: D_T
Task in the target domain: T_T
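
For reference, the definitions can be written out compactly in LaTeX; the symbols below follow the usual survey notation, and n_S and n_T (the source and target sample sizes) are supplied here where the slide text lost them:

```latex
% Domain: a feature space plus a marginal distribution over it
\mathcal{D} = \{\mathcal{X},\, P(X)\}, \qquad X = \{x_1, \dots, x_n\} \subset \mathcal{X}
% Task: a label space plus a predictive function learnt from data,
% which can be interpreted as a conditional distribution
\mathcal{T} = \{\mathcal{Y},\, f(\cdot)\}, \qquad f(x) \approx P(y \mid x)
% Source and target domains with their observed data
\mathcal{D}_S = \{(x_{S_i}, y_{S_i})\}_{i=1}^{n_S}, \qquad
\mathcal{D}_T = \{(x_{T_i}, y_{T_i})\}_{i=1}^{n_T}
```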
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Why Transfer Learning?
 In some domains, labeled data are in short supply.
 In some domains, the calibration effort is very expensive.
 In some domains, the learning process is time consuming.

  How can we extract knowledge learnt from related domains to help
  learning in a target domain with only a few labeled data?
  How can we extract knowledge learnt from related domains to speed up
  learning in a target domain?

          Transfer learning techniques may help!
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Settings of Transfer Learning

  Transfer learning settings     | Labeled data in | Labeled data in | Tasks
                                 | a source domain | a target domain |
  Inductive Transfer Learning    | × or √          | √               | Classification, Regression, …
  Transductive Transfer Learning | √               | ×               | Classification, Regression, …
  Unsupervised Transfer Learning | ×               | ×               | Clustering, …
An overview of various settings of transfer learning:

 Inductive Transfer Learning (labeled data are available in the target domain)
   Case 1: no labeled data in a source domain → Self-taught Learning
   Case 2: labeled data are available in a source domain; source and target
    tasks are learnt simultaneously → Multi-task Learning
 Transductive Transfer Learning (labeled data are available only in a source
  domain)
   Assumption: different domains but a single task → Domain Adaptation
   Assumption: single domain and single task → Sample Selection Bias /
    Covariate Shift
 Unsupervised Transfer Learning (no labeled data in either the source or the
  target domain)
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Approaches to Transfer Learning

  Transfer learning approaches    | Description
  Instance-transfer               | Re-weight some labeled data in a source
                                  | domain for use in the target domain.
  Feature-representation-transfer | Find a “good” feature representation that
                                  | reduces the difference between a source and
                                  | a target domain or minimizes model error.
  Model-transfer                  | Discover shared parameters or priors of
                                  | models between a source domain and a
                                  | target domain.
  Relational-knowledge-transfer   | Build a mapping of relational knowledge
                                  | between a source domain and a target domain.
Approaches to Transfer Learning

                                  | Inductive TL | Transductive TL | Unsupervised TL
  Instance-transfer               |      √       |        √        |
  Feature-representation-transfer |      √       |        √        |        √
  Model-transfer                  |      √       |                 |
  Relational-knowledge-transfer   |      √       |                 |
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning
   Inductive Transfer Learning
   Transductive Transfer Learning
   Unsupervised Transfer Learning
Inductive Transfer Learning
               Instance-transfer Approaches
• Assumption: the source domain and target domain data use
  exactly the same features and labels.

• Motivation: Although the source domain data cannot be
  reused directly, some parts of the data can still be
  reused by re-weighting.

• Main Idea: Discriminatively adjust the weights of data in the
  source domain for use in the target domain.
Inductive Transfer Learning
                        --- Instance-transfer Approaches
                             Non-standard SVMs
                                 [Wu and Dietterich ICML-04]

Start from uniform weights, then correct the decision boundary by re-weighting.
The objective combines a loss function on the target domain data, a loss
function on the source domain data, and a regularization term.

   Differentiate the cost for misclassification of the target and source data.
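
As a rough illustration of the idea (not Wu and Dietterich's exact formulation), scikit-learn's sample_weight argument can differentiate the two misclassification costs; the data arrays and weight values below are placeholders:

```python
import numpy as np
from sklearn.svm import SVC

# Toy stand-ins for real data: X_src/y_src from the source domain,
# X_tgt/y_tgt the (small) labeled target-domain set.
rng = np.random.RandomState(0)
X_src, y_src = rng.randn(200, 5), rng.randint(0, 2, 200)
X_tgt, y_tgt = rng.randn(20, 5) + 0.5, rng.randint(0, 2, 20)

X = np.vstack([X_src, X_tgt])
y = np.concatenate([y_src, y_tgt])

# Differentiate the misclassification cost: target errors are penalized
# more heavily than source errors (the weight values are illustrative).
weights = np.concatenate([np.full(len(y_src), 0.2),
                          np.full(len(y_tgt), 1.0)])

clf = SVC(kernel="rbf", C=1.0)
clf.fit(X, y, sample_weight=weights)
```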
Inductive Transfer Learning
    --- Instance-transfer Approaches
                         TrAdaBoost
                              [Dai et al. ICML-07]

The whole training data set combines source domain labeled data with target
domain labeled data. At each boosting round, a classifier is trained on the
re-weighted labeled data:
  Hedge(β) [Freund et al. 1997] decreases the weights of the misclassified
  source domain data.
  AdaBoost [Freund et al. 1997] increases the weights of the misclassified
  target domain data.
The learnt ensemble is then applied to the target domain unlabeled data.
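
A minimal sketch of the TrAdaBoost re-weighting loop, assuming binary labels in {0, 1} and decision stumps as the base learner; details such as voting with only the last half of the rounds are simplified away:

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def tradaboost(X_src, y_src, X_tgt, y_tgt, n_rounds=20):
    """Sketch of TrAdaBoost's re-weighting loop; labels must be in {0, 1}."""
    n_s, n_t = len(X_src), len(X_tgt)
    X = np.vstack([X_src, X_tgt])
    y = np.concatenate([y_src, y_tgt])
    w = np.ones(n_s + n_t)
    # Fixed Hedge(beta) rate for down-weighting misclassified source data.
    beta_src = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_s) / n_rounds))
    learners, alphas = [], []
    for _ in range(n_rounds):
        p = w / w.sum()
        h = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=p)
        err = np.abs(h.predict(X) - y)                     # per-instance 0/1 error
        eps = (p[n_s:] * err[n_s:]).sum() / p[n_s:].sum()  # target-domain error
        eps = np.clip(eps, 1e-10, 0.499)
        beta_t = eps / (1.0 - eps)
        w[:n_s] *= beta_src ** err[:n_s]   # Hedge: decrease misclassified source
        w[n_s:] *= beta_t ** -err[n_s:]    # AdaBoost: increase misclassified target
        learners.append(h)
        alphas.append(np.log(1.0 / beta_t))
    # The original algorithm votes using only the last half of the learners.
    return learners, alphas
```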
Inductive Transfer Learning
     Feature-representation-transfer Approaches
         Supervised Feature Construction
                  [Argyriou et al. NIPS-06, NIPS-07]

Assumption: If t tasks are related to each other, then they may
share some common features which can benefit all tasks.

Input: t tasks, each with its own training data.

Output: Common features learnt across the t tasks, and t models, one per
task.
Supervised Feature Construction
                    [Argyriou et al. NIPS-06, NIPS-07]

$$\min_{A,\,U} \; \sum_{t=1}^{T} \sum_{i=1}^{m} L\big(y_{ti}, \langle a_t, U^{\top} x_{ti} \rangle\big) \; + \; \gamma \,\|A\|_{2,1}^{2} \qquad \text{s.t.} \quad U^{\top} U = I$$

The first term is the average of the empirical error across the t tasks; the
(2,1)-norm regularization makes the shared representation sparse; the
orthogonal constraint is on U, where A = [a_1, …, a_T] collects the
task-specific parameters in the learnt feature space.
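
The alternating algorithm of Argyriou et al. is more involved; the sketch below only captures the coupling effect, via a proximal-gradient step on a squared loss plus an l2,1 row-sparsity penalty over the stacked task weights (a simplification under that assumption, not their exact method):

```python
import numpy as np

def l21_prox(W, thresh):
    """Row-wise soft-thresholding: the proximal operator of the l2,1 norm."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    scale = np.maximum(0.0, 1.0 - thresh / np.maximum(norms, 1e-12))
    return W * scale

def mtl_feature_learning(tasks, dim, gamma=0.1, lr=0.01, n_iter=500):
    """tasks: list of (X_t, y_t) regression pairs sharing a d-dim feature space.
    Returns a (d, T) matrix whose nonzero rows are the features shared by tasks."""
    T = len(tasks)
    W = np.zeros((dim, T))
    for _ in range(n_iter):
        G = np.zeros_like(W)
        for t, (X, y) in enumerate(tasks):
            G[:, t] = X.T @ (X @ W[:, t] - y) / len(y)  # squared-loss gradient
        W = l21_prox(W - lr * G, lr * gamma)            # couple tasks via row sparsity
    return W
```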
Inductive Transfer Learning
     Feature-representation-transfer Approaches
       Unsupervised Feature Construction
                        [Raina et al. ICML-07]

Three steps:
•   Apply a sparse coding algorithm [Lee et al. NIPS-07] to learn a
    higher-level representation from unlabeled data in the source
    domain.

•   Transform the target data to the new representation using the bases
    learnt in the first step.

•   Apply traditional discriminative models to the new
    representations of the target data with the corresponding labels.
Unsupervised Feature Construction
                         [Raina et al. ICML-07]

Step 1:

$$\min_{a,\,b} \; \sum_{i} \Big\| x_{S_i} - \sum_{j} a_{S_i}^{j} b_j \Big\|_2^2 + \beta \,\|a_{S_i}\|_1 \qquad \text{s.t.} \quad \|b_j\|_2 \le 1$$

Input: source domain data X_S = {x_{S_i}} and coefficient β.
Output: new representations of the source domain data A_S = {a_{S_i}}
and new bases B = {b_j}.

Step 2:

$$a_{T_i}^{*} = \arg\min_{a} \; \Big\| x_{T_i} - \sum_{j} a^{j} b_j \Big\|_2^2 + \beta \,\|a\|_1$$

Input: target domain data X_T = {x_{T_i}}, coefficient β, and bases B = {b_j}.
Output: new representations of the target domain data A_T = {a_{T_i}^{*}}.
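
A sketch of the three steps with scikit-learn's dictionary-learning utilities; Raina et al. use their own sparse-coding solver, and the random arrays below merely stand in for real data (alpha plays the role of the sparsity coefficient β):

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder
from sklearn.linear_model import LogisticRegression

rng = np.random.RandomState(0)
X_src_unlabeled = rng.randn(500, 64)                     # unlabeled source data
X_tgt, y_tgt = rng.randn(50, 64), rng.randint(0, 2, 50)  # labeled target data

# Step 1: learn higher-level bases B from the unlabeled source domain.
dico = DictionaryLearning(n_components=32, alpha=1.0, max_iter=200,
                          random_state=0).fit(X_src_unlabeled)

# Step 2: re-express target data as sparse activations over the learnt bases.
coder = SparseCoder(dictionary=dico.components_,
                    transform_algorithm="lasso_lars", transform_alpha=1.0)
A_tgt = coder.transform(X_tgt)

# Step 3: train an ordinary discriminative model on the new representation.
clf = LogisticRegression(max_iter=1000).fit(A_tgt, y_tgt)
```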
Inductive Transfer Learning
                     Model-transfer Approaches
               Regularization-based Method
                          [Evgeniou and Pontil, KDD-04]

Assumption: If t tasks are related to each other, then they may share some
parameters among the individual models.

Assume f_t(x) = w_t · x is a hyperplane for task t, where t ∈ {S, T} and
w_t = w_0 + v_t: a common part w_0 shared by all tasks, plus a specific part
v_t for each individual task.

Encoding them into SVMs, with regularization terms for the multiple tasks:

$$\min_{w_0,\,v_t,\,\xi_{ti}} \; \sum_{t \in \{S,T\}} \sum_{i} \xi_{ti} \; + \; \frac{\lambda_1}{2} \sum_{t \in \{S,T\}} \|v_t\|^2 \; + \; \frac{\lambda_2}{2} \|w_0\|^2$$
$$\text{s.t.} \quad y_{ti}\,(w_0 + v_t) \cdot x_{ti} \ge 1 - \xi_{ti}, \qquad \xi_{ti} \ge 0$$
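
A numpy sketch of the shared-plus-specific decomposition, trained by subgradient descent on the hinge loss; the λ values, learning rate, and plain gradient loop are illustrative choices, not the QP solver of the original paper:

```python
import numpy as np

def shared_svm(tasks, dim, lam1=1.0, lam2=0.1, lr=0.01, n_iter=500):
    """Subgradient sketch of w_t = w0 + v_t with hinge loss.
    tasks: list of (X, y) pairs with labels in {-1, +1}."""
    T = len(tasks)
    w0 = np.zeros(dim)          # common part, shared by all tasks
    V = np.zeros((T, dim))      # specific part for each individual task
    for _ in range(n_iter):
        g0 = lam2 * w0
        gV = lam1 * V.copy()
        for t, (X, y) in enumerate(tasks):
            margin = y * (X @ (w0 + V[t]))
            viol = margin < 1                    # hinge-loss violators
            g = -(y[viol, None] * X[viol]).sum(axis=0) / len(y)
            g0 += g                              # loss gradient hits the shared part
            gV[t] += g                           # ... and the task-specific part
        w0 -= lr * g0
        V -= lr * gV
    return w0, V
```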
Inductive Transfer Learning
       Relational-knowledge-transfer Approaches
                               TAMAR
                        [Mihalkova et al. AAAI-07]
Assumption: If the target domain and source domain are related, then there
may be relationships between the domains that are similar, which can be used
for transfer learning.

Input:
1. Relational data in the source domain and a statistical relational model,
     a Markov Logic Network (MLN), which has been learnt in the source
     domain.
2. Relational data in the target domain.

Output: A new statistical relational model, an MLN, in the target domain.

Goal: To learn an MLN in the target domain more efficiently and effectively.
TAMAR [Mihalkova et al. AAAI-07]
Two Stages:
1. Predicate Mapping
   – Establish the mapping between predicates in the source
     and target domain. Once a mapping is established, clauses
     from the source domain can be translated into the target
     domain.
2. Revising the Mapped Structure
   – The clauses mapped directly from the source domain
     may not be completely accurate and may need to be
     revised, augmented, and re-weighted in order to properly
     model the target data.
TAMAR [Mihalkova et al. AAAI-07]
[Example: the academic source domain is mapped onto the movie target domain.
AdvisedBy(Student(B), Professor(A)) maps to WorkedFor(Actor(A), Director(B)),
and Publication(Professor/Student, Paper(T)) maps to
MovieMember(Director/Actor, Movie(M)). The mapped clauses are then revised to
fit the target data.]
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning
   Inductive Transfer Learning
   Transductive Transfer Learning
   Unsupervised Transfer Learning
Transductive Transfer Learning
                    Instance-transfer Approaches
       Sample Selection Bias / Covariate Shift
                  [Zadrozny ICML-04, Schwaighofer JSPI-00]
Input: A lot of labeled data in the source domain and no labeled data in the
target domain.

Output: Models for use on the target domain data.

Assumption: The source domain and target domain are the same. In addition,
P(Y_S | X_S) and P(Y_T | X_T) are the same, while P(X_S) and P(X_T) may be
different, caused by the different sampling processes (training data vs. test data).

Main Idea: Re-weight (importance sampling) the source domain data.
Sample Selection Bias / Covariate Shift
 To correct sample selection bias, weight each source domain instance by the
 density ratio

$$\beta(x) = \frac{P(X_T = x)}{P(X_S = x)}$$

How can we estimate β(x)?
One straightforward solution is to estimate P(X_S) and P(X_T)
separately. However, estimating a density function is itself a hard
problem.
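
The slides leave the estimation question open at this point; one widely used workaround (not shown on the slide) avoids explicit density estimation by training a probabilistic classifier to distinguish source from target samples and reading the ratio off its odds:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def density_ratio_weights(X_src, X_tgt):
    """Estimate P_T(x)/P_S(x) for each source point via a domain classifier.
    If d(x) = P(domain=target | x), then P_T/P_S = (n_S/n_T) * d/(1-d)."""
    X = np.vstack([X_src, X_tgt])
    d = np.concatenate([np.zeros(len(X_src)), np.ones(len(X_tgt))])
    clf = LogisticRegression(max_iter=1000).fit(X, d)
    p = clf.predict_proba(X_src)[:, 1]          # P(target | x) on source points
    ratio = p / np.clip(1.0 - p, 1e-12, None)
    return ratio * (len(X_src) / len(X_tgt))    # correct for sample-size imbalance
```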
Sample Selection Bias / Covariate Shift
              Kernel Mean Matching (KMM)
                         [Huang et al. NIPS 2006]

Main Idea: KMM estimates the ratio β(x) directly instead of estimating
density functions.

It can be proved that β can be estimated by solving the following quadratic
programming (QP) optimization problem, which matches the means of the
training and test data in a reproducing kernel Hilbert space (RKHS):

$$\min_{\beta} \; \frac{1}{2}\,\beta^{\top} K \beta - \kappa^{\top} \beta \qquad \text{s.t.} \quad \beta_i \in [0, B], \; \Big|\sum_i \beta_i - n_S\Big| \le n_S \epsilon$$

where K_{ij} = k(x_{S_i}, x_{S_j}) and κ_i = (n_S / n_T) Σ_j k(x_{S_i}, x_{T_j}).

Theoretical Support: Maximum Mean Discrepancy (MMD) [Borgwardt et al.
BIOINFORMATICS-06]. The distance between distributions can be measured
by the Euclidean distance of their mean vectors in an RKHS.
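
A sketch of the KMM quadratic program with an RBF kernel; it assumes the cvxpy package for the QP (any QP solver would do) and uses illustrative values for B, ε, and the kernel width:

```python
import numpy as np
import cvxpy as cp
from sklearn.metrics.pairwise import rbf_kernel

def kmm_weights(X_src, X_tgt, gamma=1.0, B=1000.0, eps=None):
    """Estimate source-instance weights by matching kernel means in an RKHS."""
    n_s, n_t = len(X_src), len(X_tgt)
    eps = eps if eps is not None else B / np.sqrt(n_s)
    # K compares source points to each other; a tiny ridge keeps it numerically PSD.
    K = rbf_kernel(X_src, X_src, gamma=gamma) + 1e-8 * np.eye(n_s)
    kappa = (n_s / n_t) * rbf_kernel(X_src, X_tgt, gamma=gamma).sum(axis=1)
    beta = cp.Variable(n_s)
    objective = cp.Minimize(0.5 * cp.quad_form(beta, cp.psd_wrap(K))
                            - kappa @ beta)
    constraints = [beta >= 0, beta <= B,
                   cp.abs(cp.sum(beta) - n_s) <= n_s * eps]
    cp.Problem(objective, constraints).solve()
    return beta.value
```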
Transductive Transfer Learning
      Feature-representation-transfer Approaches
                       Domain Adaptation
 [Blitzer et al. EMNLP-06, Ben-David et al. NIPS-07, Daume III ACL-07]
Assumption: A single task across domains, which means P(Y_S | X_S) and
P(Y_T | X_T) are the same, while P(X_S) and P(X_T) may be different, caused by
the different feature representations across domains.

Main Idea: Find a “good” feature representation that reduces the “distance”
between domains.

Input: A lot of labeled data in the source domain and only unlabeled data in the
target domain.

Output: A common representation for the source domain data and target
domain data, and a model on the new representation for use in the target
domain.
Domain Adaptation
 Structural Correspondence Learning (SCL)
   [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05]

Motivation: If two domains are related to each other, then there may exist
some “pivot” features across both domains. Pivot features are features that
behave in the same way for discriminative learning in both domains.

Main Idea: Identify correspondences among features from different
domains by modeling their correlations with pivot features. Non-pivot features
from different domains that are correlated with many of the same pivot
features are assumed to correspond, and they are treated similarly by a
discriminative learner.
SCL
[Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05]

a) Heuristically choose m pivot features, which is task specific.
b) Transform each vector of pivot features into a vector of binary
   values and create the corresponding prediction problems.
c) Learn the parameters of each prediction problem.
d) Do an eigen-decomposition on the matrix of parameters and learn
   the linear mapping function.
e) Use the learnt mapping function to construct new features and train
   classifiers on the new representations.
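
A compact numpy/scikit-learn sketch of steps a) through e); the pivot-selection heuristic, the binarization rule, and h (the dimensionality of the learnt space) are placeholder choices rather than the settings of the original papers:

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

def scl_projection(X_unlabeled, pivot_idx, h=25):
    """Learn the SCL linear map theta from unlabeled data of both domains.
    X_unlabeled: (n, d) feature matrix; pivot_idx: indices of pivot features."""
    nonpivot_idx = np.setdiff1d(np.arange(X_unlabeled.shape[1]), pivot_idx)
    X_np = X_unlabeled[:, nonpivot_idx]
    W = []
    for p in pivot_idx:                               # one prediction problem per pivot
        target = (X_unlabeled[:, p] > 0).astype(int)  # binarized pivot value
        clf = SGDClassifier(loss="modified_huber", max_iter=20, tol=None)
        clf.fit(X_np, target)
        W.append(clf.coef_.ravel())
    W = np.array(W).T                  # (n_nonpivot, m) matrix of learnt parameters
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    theta = U[:, :h].T                 # top singular directions = mapping function
    return nonpivot_idx, theta

# New representation: append theta @ x[nonpivot_idx] to the original features,
# train on the labeled source data, and apply to the target domain.
```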
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning
   Inductive Transfer Learning
   Transductive Transfer Learning
   Unsupervised Transfer Learning
Unsupervised Transfer Learning
      Feature-representation-transfer Approaches
              Self-taught Clustering (STC)
                            [Dai et al. ICML-08]

Input: A lot of unlabeled data in a source domain and a small amount of
unlabeled data in a target domain.

Goal: Cluster the target domain data.

Assumption: The source domain and target domain data share some common
features, which can help clustering in the target domain.

Main Idea: Extend the information-theoretic co-clustering algorithm
[Dhillon et al. KDD-03] for transfer learning.
Self-taught Clustering (STC)
                                 [Dai et al. ICML-08]

The source domain data and the target domain data are co-clustered against a
shared set of common features Z. The objective function to be minimized is

$$J = \big[ I(X_T, Z) - I(\tilde{X}_T, \tilde{Z}) \big] + \lambda \big[ I(X_S, Z) - I(\tilde{X}_S, \tilde{Z}) \big]$$

where the first bracket is the co-clustering loss in the target domain, the
second is the co-clustering loss in the source domain, and the tilde terms
denote the cluster functions. The output is the clustering of the target
domain data.
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Negative Transfer
 Most approaches to transfer learning assume that transferring knowledge
  across domains is always positive.

 However, in some cases, when two tasks are too dissimilar, brute-force
  transfer may even hurt the performance on the target task; this is called
  negative transfer [Rosenstein et al. NIPS-05 Workshop].

 Some researchers have studied how to measure relatedness among tasks
  [Ben-David and Schuller NIPS-03, Bakker and Heskes JMLR-03].

 How to design a mechanism to avoid negative transfer still needs to be
  studied theoretically.
Outline
 Traditional Machine Learning vs. Transfer Learning

 Why Transfer Learning?

 Settings of Transfer Learning

 Approaches to Transfer Learning

 Negative Transfer

 Conclusion
Conclusion

                                  | Inductive TL | Transductive TL | Unsupervised TL
  Instance-transfer               |      √       |        √        |
  Feature-representation-transfer |      √       |        √        |        √
  Model-transfer                  |      √       |                 |
  Relational-knowledge-transfer   |      √       |                 |

How to avoid negative transfer needs to attract more attention!

More Related Content

What's hot

An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learning
Subrat Panda, PhD
 
Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017
mooopan
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement Learning
Khaled Saleh
 
Lect12 graph mining
Lect12 graph miningLect12 graph mining
Lect12 graph mining
Houw Liong The
 
Semi-Supervised Learning
Semi-Supervised LearningSemi-Supervised Learning
Semi-Supervised Learning
Lukas Tencer
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching Techniques
Dr. C.V. Suresh Babu
 
Transfer learning-presentation
Transfer learning-presentationTransfer learning-presentation
Transfer learning-presentation
Bushra Jbawi
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
Kuppusamy P
 
An introduction to reinforcement learning
An introduction to  reinforcement learningAn introduction to  reinforcement learning
An introduction to reinforcement learning
Jie-Han Chen
 
Essential concepts for machine learning
Essential concepts for machine learning Essential concepts for machine learning
Essential concepts for machine learning
pyingkodi maran
 
I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...
I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...
I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...
vikas dhakane
 
Multi-armed Bandits
Multi-armed BanditsMulti-armed Bandits
Multi-armed Bandits
Dongmin Lee
 
AI local search
AI local searchAI local search
AI local search
Renas Rekany
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
BICA Labs
 
Feature selection
Feature selectionFeature selection
Feature selection
Dong Guo
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
Mark Chang
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learning
Big Data Colombia
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
butest
 
Deep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsDeep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applications
Buhwan Jeong
 
Stuart russell and peter norvig artificial intelligence - a modern approach...
Stuart russell and peter norvig   artificial intelligence - a modern approach...Stuart russell and peter norvig   artificial intelligence - a modern approach...
Stuart russell and peter norvig artificial intelligence - a modern approach...
Lê Anh Đạt
 

What's hot (20)

An introduction to reinforcement learning
An introduction to reinforcement learningAn introduction to reinforcement learning
An introduction to reinforcement learning
 
Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017Model-Based Reinforcement Learning @NIPS2017
Model-Based Reinforcement Learning @NIPS2017
 
Intro to Deep Reinforcement Learning
Intro to Deep Reinforcement LearningIntro to Deep Reinforcement Learning
Intro to Deep Reinforcement Learning
 
Lect12 graph mining
Lect12 graph miningLect12 graph mining
Lect12 graph mining
 
Semi-Supervised Learning
Semi-Supervised LearningSemi-Supervised Learning
Semi-Supervised Learning
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching Techniques
 
Transfer learning-presentation
Transfer learning-presentationTransfer learning-presentation
Transfer learning-presentation
 
Recurrent neural networks rnn
Recurrent neural networks   rnnRecurrent neural networks   rnn
Recurrent neural networks rnn
 
An introduction to reinforcement learning
An introduction to  reinforcement learningAn introduction to  reinforcement learning
An introduction to reinforcement learning
 
Essential concepts for machine learning
Essential concepts for machine learning Essential concepts for machine learning
Essential concepts for machine learning
 
I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...
I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...
I.ITERATIVE DEEPENING DEPTH FIRST SEARCH(ID-DFS) II.INFORMED SEARCH IN ARTIFI...
 
Multi-armed Bandits
Multi-armed BanditsMulti-armed Bandits
Multi-armed Bandits
 
AI local search
AI local searchAI local search
AI local search
 
Data Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural NetworksData Science, Machine Learning and Neural Networks
Data Science, Machine Learning and Neural Networks
 
Feature selection
Feature selectionFeature selection
Feature selection
 
Generative Adversarial Networks
Generative Adversarial NetworksGenerative Adversarial Networks
Generative Adversarial Networks
 
An introduction to deep reinforcement learning
An introduction to deep reinforcement learningAn introduction to deep reinforcement learning
An introduction to deep reinforcement learning
 
Machine Learning presentation.
Machine Learning presentation.Machine Learning presentation.
Machine Learning presentation.
 
Deep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applicationsDeep learning - Conceptual understanding and applications
Deep learning - Conceptual understanding and applications
 
Stuart russell and peter norvig artificial intelligence - a modern approach...
Stuart russell and peter norvig   artificial intelligence - a modern approach...Stuart russell and peter norvig   artificial intelligence - a modern approach...
Stuart russell and peter norvig artificial intelligence - a modern approach...
 

Viewers also liked

Flavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approachFlavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approach
Alexander Rakhlin
 
Self taught clustering
Self taught clusteringSelf taught clustering
Self taught clustering
SOYEON KIM
 
Video concept detection by learning from web images
Video concept detection by learning from web imagesVideo concept detection by learning from web images
Video concept detection by learning from web images
MediaMixerCommunity
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
Sung Kim
 
Best Blue Brain ppt ever.
Best Blue Brain ppt ever.Best Blue Brain ppt ever.
Best Blue Brain ppt ever.
Suhail Shaikh
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
Sung Kim
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Sujit Pal
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
PyData
 
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Universitat Politècnica de Catalunya
 
Artificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainArtificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google Brain
Rawan Al-Omari
 
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Universitat Politècnica de Catalunya
 
TRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine AnoranTRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine Anoran
Lorraine Mae Anoran
 
Amith blue brain
Amith blue brainAmith blue brain
Amith blue brain
Amith Kp
 
Blue brain
Blue brainBlue brain
Blue brain
Leelakh Sachdeva
 
Brain fingerprinting
Brain fingerprintingBrain fingerprinting
Brain fingerprinting
Priyodarshini Dhar
 
BLUE BRAIN
BLUE BRAINBLUE BRAIN
BLUE BRAIN
amitsaraf02
 
Brain Fingerprinting PPT
Brain Fingerprinting PPTBrain Fingerprinting PPT
Brain Fingerprinting PPT
Vishnu Mysterio
 

Viewers also liked (17)

Flavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approachFlavours of Physics Challenge: Transfer Learning approach
Flavours of Physics Challenge: Transfer Learning approach
 
Self taught clustering
Self taught clusteringSelf taught clustering
Self taught clustering
 
Video concept detection by learning from web images
Video concept detection by learning from web imagesVideo concept detection by learning from web images
Video concept detection by learning from web images
 
Transfer defect learning
Transfer defect learningTransfer defect learning
Transfer defect learning
 
Best Blue Brain ppt ever.
Best Blue Brain ppt ever.Best Blue Brain ppt ever.
Best Blue Brain ppt ever.
 
Survey on Software Defect Prediction
Survey on Software Defect PredictionSurvey on Software Defect Prediction
Survey on Software Defect Prediction
 
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
Transfer Learning and Fine Tuning for Cross Domain Image Classification with ...
 
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr TeterwakLearn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
Learn to Build an App to Find Similar Images using Deep Learning- Piotr Teterwak
 
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
Deep Learning for Computer Vision: Transfer Learning and Domain Adaptation (U...
 
Artificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google BrainArtificial Neural Network Seminar - Google Brain
Artificial Neural Network Seminar - Google Brain
 
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
Deep Learning for Computer Vision: ImageNet Challenge (UPC 2016)
 
TRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine AnoranTRANSFER OF LEARNING by Lorraine Anoran
TRANSFER OF LEARNING by Lorraine Anoran
 
Amith blue brain
Amith blue brainAmith blue brain
Amith blue brain
 
Blue brain
Blue brainBlue brain
Blue brain
 
Brain fingerprinting
Brain fingerprintingBrain fingerprinting
Brain fingerprinting
 
BLUE BRAIN
BLUE BRAINBLUE BRAIN
BLUE BRAIN
 
Brain Fingerprinting PPT
Brain Fingerprinting PPTBrain Fingerprinting PPT
Brain Fingerprinting PPT
 

Similar to A survey on transfer learning

[ppt]
[ppt][ppt]
[ppt]
butest
 
Transfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyTransfer Learning in NLP: A Survey
Transfer Learning in NLP: A Survey
NUPUR YADAV
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
zukun
 
Presentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob MolenaarPresentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob Molenaar
Jacob Molenaar
 
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
広樹 本間
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrell
zukun
 
Seminar dm
Seminar dmSeminar dm
Seminar dm
MHDAmmarALkelany
 
Kma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_typeKma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_type
gharawi
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
yang947066
 
Introduction to Deep Learning presentation
Introduction to Deep Learning presentationIntroduction to Deep Learning presentation
Introduction to Deep Learning presentation
johanericka2
 
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
GeeksLab Odessa
 
Analysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papersAnalysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papers
harshavardhan814108
 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithms
butest
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
MLAI2
 
On Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and BeyondOn Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and Beyond
Eunjeong (Lucy) Park
 

Similar to A survey on transfer learning (15)

[ppt]
[ppt][ppt]
[ppt]
 
Transfer Learning in NLP: A Survey
Transfer Learning in NLP: A SurveyTransfer Learning in NLP: A Survey
Transfer Learning in NLP: A Survey
 
Icml2012 tutorial representation_learning
Icml2012 tutorial representation_learningIcml2012 tutorial representation_learning
Icml2012 tutorial representation_learning
 
Presentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob MolenaarPresentation Online Educa 2011 Jacob Molenaar
Presentation Online Educa 2011 Jacob Molenaar
 
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
2019 dynamically composing_domain-data_selection_with_clean-data_selection_by...
 
Fcv rep darrell
Fcv rep darrellFcv rep darrell
Fcv rep darrell
 
Seminar dm
Seminar dmSeminar dm
Seminar dm
 
Kma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_typeKma week 6_knowledge_transfer_type
Kma week 6_knowledge_transfer_type
 
deepnet-lourentzou.ppt
deepnet-lourentzou.pptdeepnet-lourentzou.ppt
deepnet-lourentzou.ppt
 
Introduction to Deep Learning presentation
Introduction to Deep Learning presentationIntroduction to Deep Learning presentation
Introduction to Deep Learning presentation
 
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
AI&BigData Lab 2016. Александр Баев: Transfer learning - зачем, как и где.
 
Analysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papersAnalysis on Domain Adaptation based on different papers
Analysis on Domain Adaptation based on different papers
 
Main single agent machine learning algorithms
Main single agent machine learning algorithmsMain single agent machine learning algorithms
Main single agent machine learning algorithms
 
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
Learning to Balance: Bayesian Meta-Learning for Imbalanced and Out-of-distrib...
 
On Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and BeyondOn Semi-Supervised Learning and Beyond
On Semi-Supervised Learning and Beyond
 

Recently uploaded

How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
Chart Kalyan
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
Intelisync
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
DanBrown980551
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
innovationoecd
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Precisely
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
Hiike
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
Zilliz
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
akankshawande
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
ScyllaDB
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
Tatiana Kojar
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
shyamraj55
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Tosin Akinosho
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
LucaBarbaro3
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
Edge AI and Vision Alliance
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
Alex Pruden
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
Zilliz
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
MichaelKnudsen27
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Tatiana Kojar
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
Jason Packer
 

Recently uploaded (20)

How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdfHow to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
How to Interpret Trends in the Kalyan Rajdhani Mix Chart.pdf
 
TrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy SurveyTrustArc Webinar - 2024 Global Privacy Survey
TrustArc Webinar - 2024 Global Privacy Survey
 
A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024A Comprehensive Guide to DeFi Development Services in 2024
A Comprehensive Guide to DeFi Development Services in 2024
 
5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides5th LF Energy Power Grid Model Meet-up Slides
5th LF Energy Power Grid Model Meet-up Slides
 
Presentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of GermanyPresentation of the OECD Artificial Intelligence Review of Germany
Presentation of the OECD Artificial Intelligence Review of Germany
 
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their MainframeDigital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
Digital Banking in the Cloud: How Citizens Bank Unlocked Their Mainframe
 
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - HiikeSystem Design Case Study: Building a Scalable E-Commerce Platform - Hiike
System Design Case Study: Building a Scalable E-Commerce Platform - Hiike
 
Programming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup SlidesProgramming Foundation Models with DSPy - Meetup Slides
Programming Foundation Models with DSPy - Meetup Slides
 
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development ProvidersYour One-Stop Shop for Python Success: Top 10 US Python Development Providers
Your One-Stop Shop for Python Success: Top 10 US Python Development Providers
 
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-EfficiencyFreshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
Freshworks Rethinks NoSQL for Rapid Scaling & Cost-Efficiency
 
Skybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoptionSkybuffer SAM4U tool for SAP license adoption
Skybuffer SAM4U tool for SAP license adoption
 
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with SlackLet's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack
 
Monitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdfMonitoring and Managing Anomaly Detection on OpenShift.pdf
Monitoring and Managing Anomaly Detection on OpenShift.pdf
 
Trusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process MiningTrusted Execution Environment for Decentralized Process Mining
Trusted Execution Environment for Decentralized Process Mining
 
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
“Temporal Event Neural Networks: A More Efficient Alternative to the Transfor...
 
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
zkStudyClub - LatticeFold: A Lattice-based Folding Scheme and its Application...
 
Building Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and MilvusBuilding Production Ready Search Pipelines with Spark and Milvus
Building Production Ready Search Pipelines with Spark and Milvus
 
Nordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptxNordic Marketo Engage User Group_June 13_ 2024.pptx
Nordic Marketo Engage User Group_June 13_ 2024.pptx
 
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
Skybuffer AI: Advanced Conversational and Generative AI Solution on SAP Busin...
 
Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024Columbus Data & Analytics Wednesdays - June 2024
Columbus Data & Analytics Wednesdays - June 2024
 

A survey on transfer learning

  • 1. A Survey on Transfer Learning Sinno Jialin Pan Department of Computer Science and Engineering The Hong Kong University of Science and Technology Joint work with Prof. Qiang Yang
  • 2. Transfer Learning? (DARPA 05) Transfer Learning (TL): The ability of a system to recognize and apply knowledge and skills learned in previous tasks to novel tasks (in new domains) It is motivated by human learning. People can often transfer knowledge learnt previously to novel situations  Chess  Checkers  Mathematics  Computer Science  Table Tennis  Tennis
  • 3. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 4. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 5. Traditional ML vs. TL (P. Langley 06) Traditional ML in Transfer of learning multiple domains across domains training items training items test items test items Humans can learn in many domains. Humans can also transfer from one domain to other domains.
  • 6. Traditional ML vs. TL Learning Process of Learning Process of Traditional ML Transfer Learning training items training items Learning System Learning System Learning System Knowledge Learning System
  • 7. Notation Domain: It consists of two components: A feature space , a marginal distribution In general, if two domains are different, then they may have different feature spaces or different marginal distributions. Task: Given a specific domain and label space , for each in the domain, to predict its corresponding label In general, if two tasks are different, then they may have different label spaces or different conditional distributions
  • 8. Notation For simplicity, we only consider at most two domains and two tasks. Source domain: Task in the source domain: Target domain: Task in the target domain
  • 9. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 10. Why Transfer Learning?  In some domains, labeled data are in short supply.  In some domains, the calibration effort is very expensive.  In some domains, the learning process is time consuming.  How to extract knowledge learnt from related domains to help learning in a target domain with a few labeled data?  How to extract knowledge learnt from related domains to speed up learning in a target domain?  Transfer learning techniques may help!
  • 11. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 12. Settings of Transfer Learning Transfer learning settings Labeled data in Labeled data in Tasks a source domain a target domain Inductive Transfer Learning Classification × √ Regression √ √ … Transductive Transfer Learning Classification √ × Regression … Unsupervised Transfer Learning Clustering × × …
  • 13. An overview of various settings of Self-taught Case 1 transfer learning Learning No labeled data in a source domain Inductive Transfer Learning Labeled data are available in a source domain Labeled data are available in Source and a target domain target tasks are Multi-task Case 2 learnt Learning simultaneously Transfer Learning Labeled data are available only in a Assumption: source domain Transductive different Domain domains but Transfer Learning single task Adaptation No labeled data in both source and target domain Assumption: single domain and single task Unsupervised Sample Selection Bias / Transfer Learning Covariance Shift
  • 14. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
  • 15. Approaches to Transfer Learning Transfer learning approaches Description Instance-transfer To re-weight some labeled data in a source domain for use in the target domain Feature-representation-transfer Find a “good” feature representation that reduces difference between a source and a target domain or minimizes error of models Model-transfer Discover shared parameters or priors of models between a source domain and a target domain Relational-knowledge-transfer Build mapping of relational knowledge between a source domain and a target domain.
  • 16. Approaches to Transfer Learning Inductive Transductive Unsupervised Transfer Learning Transfer Learning Transfer Learning Instance-transfer √ √ Feature-representation- √ √ √ transfer Model-transfer √ Relational-knowledge- √ transfer
  • 17. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Inductive Transfer Learning  Transductive Transfer Learning  Unsupervised Transfer Learning
  • 18. Inductive Transfer Learning Instance-transfer Approaches • Assumption: the source domain and target domain data use exactly the same features and labels. • Motivation: Although the source domain data can not be reused directly, there are some parts of the data that can still be reused by re-weighting. • Main Idea: Discriminatively adjust weighs of data in the source domain for use in the target domain.
  • 19. Inductive Transfer Learning --- Instance-transfer Approaches Non-standard SVMs [Wu and Dietterich ICML-04] Uniform weights Correct the decision boundary by re-weighting Loss function on the Loss function on the target domain data source domain data Regularization term  Differentiate the cost for misclassification of the target and source data
  • 20. Inductive Transfer Learning --- Instance-transfer Approaches TrAdaBoost [Dai et al. ICML-07] Hedge ( β ) AdaBoost [Freund et al. 1997] [Freund et al. 1997] To decrease the weights To increase the weights of of the misclassified data the misclassified data The whole Source domain training data set target domain labeled data labeled data Classifiers trained on re-weighted labeled data Target domain unlabeled data
• 21. Inductive Transfer Learning Feature-representation-transfer Approaches Supervised Feature Construction [Argyriou et al. NIPS-06, NIPS-07] Assumption: if t tasks are related to each other, then they may share some common features which can benefit all tasks. Input: t tasks, each with its own training data. Output: common features learnt across the t tasks and t models, one per task.
• 22. Supervised Feature Construction [Argyriou et al. NIPS-06, NIPS-07] The objective is the average empirical error across the t tasks plus a regularization term that makes the shared representation sparse, subject to an orthogonality constraint on the learned transformation. Written out (reconstructed from the papers):

  $$\min_{A,\,U}\; \sum_{t=1}^{T} \sum_{i=1}^{m} \ell\big(y_{ti},\, \langle a_t,\, U^{\top} x_{ti} \rangle\big) \;+\; \gamma\, \lVert A \rVert_{2,1}^{2} \qquad \text{s.t. } U \in \mathbb{O}^{d},$$

  where $U$ is an orthogonal matrix of learned features, $A = [a_1, \dots, a_T]$ stacks the task-specific weight vectors, and the $(2,1)$-norm couples the tasks so that only a few common features receive non-zero weight.
• 23. Inductive Transfer Learning Feature-representation-transfer Approaches Unsupervised Feature Construction [Raina et al. ICML-07] Three steps: • Apply a sparse coding algorithm [Lee et al. NIPS-07] to learn a higher-level representation from unlabeled data in the source domain. • Transform the target data into the new representation using the bases learnt in the first step. • Apply traditional discriminative models to the new representations of the target data with their corresponding labels.
• 24. Unsupervised Feature Construction [Raina et al. ICML-07] Step 1 (learn bases from source data): given source domain data $X_S = \{x_S^{(i)}\}$ and a sparsity coefficient $\beta$, solve the sparse coding problem

  $$\min_{a,\,b}\; \sum_i \Big\lVert x_S^{(i)} - \sum_j a_j^{(i)} b_j \Big\rVert_2^2 + \beta \big\lVert a^{(i)} \big\rVert_1 \quad \text{s.t. } \lVert b_j \rVert_2 \le 1 \;\; \forall j,$$

  which outputs new representations $A_S = \{a_S^{(i)}\}$ of the source domain data and new bases $B = \{b_j\}$. Step 2 (encode target data): given target domain data $X_T = \{x_T^{(i)}\}$, the coefficient $\beta$, and the learned bases $B$, solve

  $$a_T^{(i)} = \arg\min_{a}\; \Big\lVert x_T^{(i)} - \sum_j a_j b_j \Big\rVert_2^2 + \beta \lVert a \rVert_1$$

  to obtain new representations $A_T = \{a_T^{(i)}\}$ of the target domain data.
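The whole pipeline fits in a few lines with off-the-shelf sparse coding. A minimal sketch, assuming scikit-learn is available and using random stand-in arrays (Xs, Xt, yt and all hyper-parameters are hypothetical, not from the paper):

```python
import numpy as np
from sklearn.decomposition import DictionaryLearning, SparseCoder
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
Xs = rng.normal(size=(500, 64))       # stand-in for unlabeled source data
Xt = rng.normal(size=(100, 64))       # stand-in for labeled target data
yt = rng.integers(0, 2, size=100)

# Step 1: learn higher-level bases from unlabeled source data (sparse coding).
dico = DictionaryLearning(n_components=32, alpha=1.0, random_state=0).fit(Xs)

# Step 2: re-express target data in the learned bases (L1-regularized coding).
coder = SparseCoder(dictionary=dico.components_,
                    transform_algorithm='lasso_lars', transform_alpha=1.0)
At = coder.transform(Xt)

# Step 3: train an ordinary discriminative model on the new representation.
clf = LogisticRegression(max_iter=1000).fit(At, yt)
```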
• 25. Inductive Transfer Learning Model-transfer Approaches Regularization-based Method [Evgeniou and Pontil, KDD-04] Assumption: if t tasks are related to each other, then they may share some parameters among their individual models. Assume $f_t(x) = w_t \cdot x$ is a hyper-plane for task $t \in \{S, T\}$, and decompose $w_t = w_0 + v_t$, where $w_0$ is the common part shared by all tasks and $v_t$ is the specific part for the individual task. Encoding this into SVMs with regularization terms for the multiple tasks gives (reconstructed from the paper):

  $$\min_{w_0,\, v_t,\, \xi}\; \sum_{t \in \{S,T\}} \sum_{i=1}^{n_t} \xi_{ti} \;+\; \lambda_1 \sum_{t} \lVert v_t \rVert^2 \;+\; \lambda_2 \lVert w_0 \rVert^2 \quad \text{s.t. } y_{ti}\,(w_0 + v_t)\cdot x_{ti} \ge 1 - \xi_{ti},\;\; \xi_{ti} \ge 0 .$$
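One practical way to fit this shared-plus-specific decomposition is a feature-augmentation trick: a standard SVM on block-augmented inputs recovers $w_0$ in the shared block and $v_t$ in the per-task block. This is my own construction, not the authors' code; the function name, the $\mu$ trade-off parameter, and the random stand-in data are all assumptions:

```python
import numpy as np
from sklearn.svm import LinearSVC

def augment(X, task_id, n_tasks=2, mu=1.0):
    """Map x from task t to [mu*x, 0, ..., x, ..., 0]: the first block is
    shared across tasks (plays the role of w_0), the t-th block is
    task-specific (plays the role of v_t); mu trades off the two norms."""
    d = X.shape[1]
    Z = np.zeros((len(X), d * (n_tasks + 1)))
    Z[:, :d] = mu * X                                  # shared block
    Z[:, d * (1 + task_id): d * (2 + task_id)] = X     # task-specific block
    return Z

# Hypothetical data: (Xs, ys) from source task 0, (Xt, yt) from target task 1.
rng = np.random.default_rng(0)
Xs, ys = rng.normal(size=(200, 10)), rng.integers(0, 2, 200)
Xt, yt = rng.normal(size=(20, 10)), rng.integers(0, 2, 20)

Z = np.vstack([augment(Xs, 0), augment(Xt, 1)])
y = np.concatenate([ys, yt])
clf = LinearSVC().fit(Z, y)    # one SVM jointly learns w_0 and both v_t
```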
• 26. Inductive Transfer Learning Relational-knowledge-transfer Approaches TAMAR [Mihalkova et al. AAAI-07] Assumption: if the target domain and source domain are related, then there may be similar relationships between the domains, which can be used for transfer learning. Input: 1. Relational data in the source domain and a statistical relational model, a Markov Logic Network (MLN), which has been learnt in the source domain. 2. Relational data in the target domain. Output: a new statistical relational model, an MLN, in the target domain. Goal: to learn an MLN in the target domain more efficiently and effectively.
• 27. TAMAR [Mihalkova et al. AAAI-07] Two Stages: 1. Predicate Mapping – establish the mapping between predicates in the source and target domains. Once a mapping is established, clauses from the source domain can be translated into the target domain. 2. Revising the Mapped Structure – the clauses mapped directly from the source domain may not be completely accurate; they may need to be revised, augmented, and re-weighted in order to properly model the target data.
• 28. TAMAR [Mihalkova et al. AAAI-07] Example mapping from the source domain (academic) to the target domain (movie): Student -> Actor, Professor -> Director, AdvisedBy -> WorkedFor, Publication -> MovieMember, Paper -> Movie. After mapping, the translated clauses are revised to fit the target data.
  • 29. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Inductive Transfer Learning  Transductive Transfer Learning  Unsupervised Transfer Learning
• 30. Transductive Transfer Learning Instance-transfer Approaches Sample Selection Bias / Covariate Shift [Zadrozny ICML-04, Shimodaira JSPI-00] Input: a lot of labeled data in the source domain and no labeled data in the target domain. Output: models for use on the target domain data. Assumption: the source domain and target domain are the same. In addition, $P(Y_S \mid X_S)$ and $P(Y_T \mid X_T)$ are the same, while $P(X_S)$ and $P(X_T)$ may differ because of different sampling processes (training data vs. test data). Main Idea: re-weight (importance sampling) the source domain data.
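Concretely, re-weighting corrects the expected target loss via density ratios, which is valid here precisely because $P(Y \mid X)$ is shared (a standard identity; the notation is mine):

$$\mathbb{E}_{(x,y)\sim P_T}\big[\ell(x,y,\theta)\big] = \mathbb{E}_{(x,y)\sim P_S}\!\left[\frac{P(X_T = x)}{P(X_S = x)}\,\ell(x,y,\theta)\right] \approx \frac{1}{n_S}\sum_{i=1}^{n_S} \beta(x_i^S)\,\ell\big(x_i^S, y_i^S, \theta\big).$$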
• 31. Sample Selection Bias / Covariate Shift To correct sample selection bias, estimate the weights $\beta(x) = P(X_T = x)\,/\,P(X_S = x)$ for the source domain data. How to estimate $\beta$? One straightforward solution is to estimate $P(X_S)$ and $P(X_T)$ separately. However, estimating a density function is a hard problem.
• 32. Sample Selection Bias / Covariate Shift Kernel Mean Matching (KMM) [Huang et al. NIPS 2006] Main Idea: KMM estimates $\beta$ directly instead of estimating density functions, by matching the means of the training and test data in a reproducing kernel Hilbert space (RKHS). It can be shown that $\beta$ can be estimated by solving the following quadratic programming (QP) problem (reconstructed from the paper):

  $$\min_{\beta}\; \Big\lVert \frac{1}{n_S} \sum_{i=1}^{n_S} \beta_i\, \Phi(x_i^S) - \frac{1}{n_T} \sum_{j=1}^{n_T} \Phi(x_j^T) \Big\rVert_{\mathcal{H}}^2 \quad \text{s.t. } \beta_i \in [0, B],\;\; \Big| \sum_{i=1}^{n_S} \beta_i - n_S \Big| \le n_S\,\epsilon .$$

  Theoretical Support: Maximum Mean Discrepancy (MMD) [Borgwardt et al. Bioinformatics-06] — the distance between two distributions can be measured by the Euclidean distance between their mean vectors in an RKHS.
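Expanding the RKHS norm turns this into a standard QP in $\beta$. A minimal sketch using cvxopt and an RBF kernel (my reconstruction; the kernel choice, B, eps defaults, and all names are assumptions):

```python
import numpy as np
from cvxopt import matrix, solvers
from sklearn.metrics.pairwise import rbf_kernel

def kmm_weights(Xs, Xt, B=1000.0, eps=None, gamma=1.0):
    """Kernel Mean Matching: re-weight source points so that their kernel
    mean matches the target kernel mean in the RKHS of an RBF kernel."""
    ns, nt = len(Xs), len(Xt)
    if eps is None:                                  # default scale from the paper
        eps = (np.sqrt(ns) - 1.0) / np.sqrt(ns)
    K = rbf_kernel(Xs, Xs, gamma=gamma) + 1e-8 * np.eye(ns)  # PSD quadratic term
    kappa = (ns / nt) * rbf_kernel(Xs, Xt, gamma=gamma).sum(axis=1)
    # QP: min 1/2 b^T K b - kappa^T b, s.t. 0 <= b <= B, |sum(b) - ns| <= ns*eps
    G = np.vstack([np.ones((1, ns)), -np.ones((1, ns)), np.eye(ns), -np.eye(ns)])
    h = np.concatenate([[ns * (1 + eps)], [ns * (eps - 1)],
                        B * np.ones(ns), np.zeros(ns)])
    sol = solvers.qp(matrix(K), matrix(-kappa), matrix(G), matrix(h))
    return np.asarray(sol['x']).ravel()              # weights beta_i for Xs
```

The returned weights can be passed straight to any learner that accepts per-sample weights (e.g. sample_weight in scikit-learn).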
• 33. Transductive Transfer Learning Feature-representation-transfer Approaches Domain Adaptation [Blitzer et al. EMNLP-06, Ben-David et al. NIPS-07, Daume III ACL-07] Assumption: a single task across domains, which means $P(Y_S \mid X_S)$ and $P(Y_T \mid X_T)$ are the same, while $P(X_S)$ and $P(X_T)$ may differ because of different feature representations across domains. Main Idea: find a "good" feature representation that reduces the "distance" between domains. Input: a lot of labeled data in the source domain and only unlabeled data in the target domain. Output: a common representation for source and target domain data, and a model on the new representation for use in the target domain.
• 34. Domain Adaptation Structural Correspondence Learning (SCL) [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05] Motivation: if two domains are related to each other, then there may exist some "pivot" features that occur across both domains. Pivot features are features that behave in the same way for discriminative learning in both domains. Main Idea: identify correspondences among features from different domains by modeling their correlations with the pivot features. Non-pivot features from different domains that are correlated with many of the same pivot features are assumed to correspond, and they are treated similarly by a discriminative learner.
• 35. SCL [Blitzer et al. EMNLP-06, Blitzer et al. ACL-07, Ando and Zhang JMLR-05] 1. Heuristically choose m pivot features (task specific). 2. For each pivot feature, create a binary prediction problem ("does this pivot occur?") and learn the parameters of each predictor from unlabeled data. 3. Do an eigen-decomposition of the matrix of predictor parameters to obtain a linear mapping into a shared low-dimensional space. 4. Use the learnt mapping to construct new features and train classifiers on the new representations.
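A minimal sketch of these steps (my reconstruction: the paper trains one modified-Huber linear predictor per pivot, which this sketch replaces with ridge regression for brevity, and all names and defaults are assumptions):

```python
import numpy as np

def scl_mapping(X, pivot_idx, k=50, ridge=1e-2):
    """Learn the SCL projection theta from unlabeled source + target data.
    X: (n, d) binary feature matrix; pivot_idx: indices of the m pivots."""
    n, d = X.shape
    nonpivot = np.setdiff1d(np.arange(d), pivot_idx)
    A = X[:, nonpivot]                               # predict pivots from non-pivots
    AtA = A.T @ A + ridge * np.eye(len(nonpivot))    # shared Gram matrix
    W = np.zeros((d, len(pivot_idx)))                # one column per pivot predictor
    for j, p in enumerate(pivot_idx):
        y = 2.0 * (X[:, p] > 0) - 1.0                # pivot occurrence as +/-1 target
        W[nonpivot, j] = np.linalg.solve(AtA, A.T @ y)
    U, _, _ = np.linalg.svd(W, full_matrices=False)  # shared low-rank structure
    return U[:, :k]                                  # theta: d x k mapping

# Usage: augment each example with its projection, then train on source labels:
#   X_aug = np.hstack([X, X @ theta])
```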
  • 36. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Inductive Transfer Learning  Transductive Transfer Learning  Unsupervised Transfer Learning
• 37. Unsupervised Transfer Learning Feature-representation-transfer Approaches Self-taught Clustering (STC) [Dai et al. ICML-08] Input: a lot of unlabeled data in a source domain and a few unlabeled data in a target domain. Goal: cluster the target domain data. Assumption: the source domain and target domain data share some common features, which can help clustering in the target domain. Main Idea: extend the information-theoretic co-clustering algorithm [Dhillon et al. KDD-03] for transfer learning.
• 38. Self-taught Clustering (STC) [Dai et al. ICML-08] The target domain data and the source domain data are each co-clustered against the common features: co-clustering in the target domain is the output we care about, while co-clustering in the source domain acts as a regularizer that shares the feature clustering. The objective to be minimized (written informally, following the paper) is

  $$\mathcal{J} = \big[ I(X_T, Z) - I(\tilde{X}_T, \tilde{Z}) \big] + \lambda \big[ I(X_S, Z) - I(\tilde{X}_S, \tilde{Z}) \big],$$

  where $Z$ denotes the common features, $\tilde{X}_T$, $\tilde{X}_S$, and $\tilde{Z}$ are the cluster functions for the target data, the source data, and the shared features, and the output is the clustering $\tilde{X}_T$ of the target domain data.
  • 39. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
• 40. Negative Transfer  Most approaches to transfer learning assume that transferring knowledge across domains is always positive.  However, in some cases, when two tasks are too dissimilar, brute-force transfer may even hurt the performance on the target task; this is called negative transfer [Rosenstein et al. NIPS-05 Workshop].  Some researchers have studied how to measure relatedness among tasks [Ben-David and Schuller NIPS-03, Bakker and Heskes JMLR-03].  How to design a mechanism to avoid negative transfer still needs to be studied theoretically.
  • 41. Outline  Traditional Machine Learning vs. Transfer Learning  Why Transfer Learning?  Settings of Transfer Learning  Approaches to Transfer Learning  Negative Transfer  Conclusion
• 42. Conclusion

  Approach                        | Inductive TL | Transductive TL | Unsupervised TL
  --------------------------------|--------------|-----------------|----------------
  Instance-transfer               | √            | √               |
  Feature-representation-transfer | √            | √               | √
  Model-transfer                  | √            |                 |
  Relational-knowledge-transfer   | √            |                 |

  How to avoid negative transfer deserves more attention!
