CENTER FOR COGNITIVE UBIQUITOUS COMPUTING
CUbiC
ARIZONA STATE UNIVERSITY
A Study of Boosting based Transfer Learning for
Activity and Gesture Recognition
Ashok Venkatesan
Committee Members
Sethuraman Panchanathan, Professor (Chair)
Jieping Ye, Associate Professor
Baoxin Li, Associate Professor
Master’s Thesis Defense
Outline
• Motivation
• Transfer Learning
• Problem and Related Work
• Cost-Sensitive Boosting
• Results and Discussions
• Conclusion
Outline
• Motivation : Real World Data, Dataset Shifts, Traditional Learning
• Transfer Learning
• Problem and Related Work
• Cost-Sensitive Boosting
• Results and Discussions
• Conclusion
Real-World Data
Difficult to learn as it is Non-Stationary and Continuously Evolving
Example: Spam Filtering
A spam filter is trained on random emails collected from a group of users, under the assumption that new users would classify spam identically.
1. What if the training data is no longer relevant?
2. What if the user preferences are not identical?
Motivational Example: Accelerometer Based 3D Gesture Recognition
A gesture recognition model is trained on mock data collected in a controlled environment, under the assumption that real-life data would be identical.
1. What if the user has peculiar traits?
2. What if environmental factors and the objects interacted with vary, impacting the properties of the gesture?
(Example gestures: Scoop, Stir)
Dataset Shift[1]
• Simple Covariate Shift: change in 𝑃(𝑥) due to the change in a known covariate.
• Prior Probability Shift: change in 𝑃(𝑦) when 𝑃(𝑦|𝑥) is modeled as 𝑃(𝑥|𝑦)𝑃(𝑦).
• Sample Selection Bias: 𝑃(𝑥𝑖) ≠ 𝑃(𝑥).
• Imbalanced Data: change in 𝑃(𝑦) by design.
• Domain Shift: change in the measurement system of 𝑥𝑖.
• Source Component Shift: changes in the strength of contributing components.
• Concept Drift: change in 𝑃(𝑦|𝑥) in continuous, real-time data streams.
[1] Quionero-Candela, J., et al., Dataset Shift in Machine Learning. The MIT Press, 2009
Traditional Learning
• Training and test examples are assumed to be independently drawn and identically distributed.
NOT SUITED FOR HANDLING DATASET SHIFTS
[Figure: Traditional Learning over Multiple Domains — each task is learned by a separate run of the algorithm, yielding a separate model]
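The failure mode can be made concrete with a few lines of NumPy (an illustrative sketch, not from the thesis): a threshold fit on source data collapses on a target domain whose readings are merely offset, as with a re-calibrated sensor.

```python
import numpy as np

rng = np.random.default_rng(0)

# Source domain: one feature, class means at -1 and +1.
x_src = np.concatenate([rng.normal(-1, 0.3, 200), rng.normal(1, 0.3, 200)])
y_src = np.array([-1] * 200 + [1] * 200)

# Target domain: the same task, but every reading offset by +1.5
# (a simple covariate/domain shift).
x_tgt, y_tgt = x_src + 1.5, y_src

# "Train": a threshold halfway between the source class means.
thr = (x_src[y_src == -1].mean() + x_src[y_src == 1].mean()) / 2

acc_src = np.mean(np.where(x_src > thr, 1, -1) == y_src)
acc_tgt = np.mean(np.where(x_tgt > thr, 1, -1) == y_tgt)
# acc_src is near 1.0 while acc_tgt collapses toward chance.
```

The model is not wrong in any interesting way; the i.i.d. assumption simply no longer holds between training and test distributions.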
Outline
• Motivation
• Transfer Learning : Definition, Learning Settings, Notation,
Problem, What to Transfer?
• Instance-Weighting using Boosting
• Cost-Sensitive Boosting
• Results and Discussions
• Conclusion
Definition[2][3]
“Transfer Learning is a methodology that uses prior acquired knowledge to effectively develop a new hypothesis. It emphasizes knowledge transfer across domains, tasks and distributions that are similar but not the same.”
[2] NIPS Inductive Transfer Workshop, 2005
[3] Pan, S.J. and Yang, Q., "A Survey on Transfer Learning", TKDE 2009
• It is motivated by human learning: people can often transfer previously learnt knowledge to novel situations.
• e.g. knowing how to ride a bicycle might help when learning to ride a motorbike.
• Outdated data representing prior knowledge is referred to as the Source.
• Newer data representing the new knowledge is referred to as the Target.
A domain is defined as 𝐷 = {𝒳, 𝑃(𝑋)} and a task as 𝑇 = {𝒴, 𝑓(·)}; transfer can occur across 𝑃(𝑋), 𝑃(𝑌) and 𝑃(𝑌|𝑋).
Transfer Learning - Illustration
[Figure: knowledge extracted by an algorithm from abundant source training data supplements the insufficient target training data, producing the target task model]
Transfer Learning lessens the labeling cost associated with re-training a model from scratch and makes classification rapidly adaptable in real time.
Transfer Settings[3]
• Inductive Transfer: a few labeled target-domain examples are available for obtaining a weak inductive bias; source data is used as auxiliary data.
• Transductive Transfer: lots of labeled source data and lots of unlabeled target-domain data; capitalize on the difference between the domains.
• Unsupervised Transfer: both source and target data are unlabeled; apply techniques such as clustering and density estimation.
In general, the goal of Transfer Learning is to learn a classifier that performs well over the target data alone; classification performance over the source tasks is ignored.
[3] Pan, S.J. and Yang, Q., "A Survey on Transfer Learning", TKDE 2009
Notation
• Two sets of tasks, source and target, represented by instances 𝑋𝑠𝑜𝑢𝑟𝑐𝑒, 𝑋𝑡𝑎𝑟𝑔𝑒𝑡 ∈ 𝒳 and labels 𝑌𝑠𝑜𝑢𝑟𝑐𝑒, 𝑌𝑡𝑎𝑟𝑔𝑒𝑡 ∈ 𝒴 such that 𝑃(𝑋𝑠𝑜𝑢𝑟𝑐𝑒, 𝑌𝑠𝑜𝑢𝑟𝑐𝑒) ≠ 𝑃(𝑋𝑡𝑎𝑟𝑔𝑒𝑡, 𝑌𝑡𝑎𝑟𝑔𝑒𝑡)
• Training examples are grouped and named based on their task distributions:
– Same task distribution as the target: T_s = {(x_i^s, y_i^s)}_{i=1}^m
– Different task distribution from that of the target: T_d = {(x_j^d, y_j^d)}_{j=1}^n
• Unlabeled test examples representing the target tasks: S
Problem Statement
• Abundant source data T_d = {(x_j^d, y_j^d)}_{j=1}^n → model trained on T_d
• Little labeled target data T_s = {(x_i^s, y_i^s)}_{i=1}^m → model trained on T_s
• Unseen target data S → target model
Objective: given |T_s| ≪ |T_d| and that T_s is insufficient to learn the target tasks, learn a model using T_d ∪ T_s that classifies target task examples S with minimum error.
What to Transfer?
Instance-based
• Reuse instances observed in the source domain that are similar to the target domain.
• e.g. instance reweighting, importance sampling
Feature-based
• Find an alternate feature space for learning the target domain, projecting the source domain into the new space.
• e.g. feature subset selection, feature space transformation
Model/Parameter-based
• Use model components such as parameters and hyper-parameters to influence learning of the target task.
• e.g. parameter-space partitioning, superimposing shape constraints
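As a toy illustration of the instance-based idea (my own construction, with the density ratio assumed known rather than estimated, as it would be in practice), reweighting source samples by p_target/p_source recovers target-domain statistics from source-domain data alone:

```python
import numpy as np

rng = np.random.default_rng(1)
x_src = rng.normal(0.0, 1.0, 50_000)   # source domain ~ N(0, 1)

# Importance weights for a target domain N(1, 1): the exact density
# ratio p_target(x)/p_source(x) = exp(x - 0.5).  Known here by design;
# a real system must estimate it (e.g. with KLIEP, discussed later).
w = np.exp(x_src - 0.5)

weighted_mean = np.average(x_src, weights=w)
# weighted_mean approaches the target mean of 1.0, even though
# every sample was drawn from the source distribution.
```

The unweighted source mean stays near 0; the importance-weighted mean tracks the target domain, which is exactly what instance reweighting exploits.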
Outline
• Motivation
• Transfer Learning
• Instance-Weighting using Boosting : Instance
Weighting, AdaBoost, TrAdaBoost, TransferBoost, Limitations
• Cost-Sensitive Boosting
• Results and Discussions
• Conclusion
Boosting
• AdaBoost[4] boosts a weak learning algorithm into a strong learner by linearly combining an ensemble of weak hypotheses.
• Why boosting-based instance-weighting?
– Provides theoretical guarantees on generalization error bounds.
– Incremental instance boosting aids in the systematic selection of important examples.
– Well-defined components that can be modified for knowledge transfer:
• the weak-hypothesis loss function
• the weight update scheme
• the linear combination of the weak hypotheses
[4] Freund, Y., Schapire, R. and Abe, N., "A Short Introduction to Boosting", JSAI, 1999
Instance-Weighting and Boosting
[Figure: two-stage pipeline. Stage 1: similarity measure. Stage 2 (AdaBoost): weak-hypothesis loss function, weight update scheme, linear combination of weak hypotheses]
• Two recent instance-weighting algorithms adapt AdaBoost for knowledge transfer: TrAdaBoost[5] and TransferBoost[6].
[5] Dai, W., et al., "Boosting for Transfer Learning", ICML, 2007
[6] Eaton, E. and desJardins, M., "Set-Based Boosting for Instance-level Transfer", IEEE ICDM-W, 2009
AdaBoost
Main idea: increase the weights of misclassified training samples (𝑇𝑑 ∪ 𝑇𝑠)
• Loss function: ε_t = Σ_{i=1}^{N} p_i^t |h_t(x_i) − y_i|, with α_t = (1/2) log((1 − ε_t)/ε_t)
• Weight update: w_i^{t+1} = w_i^t exp(−α_t y_i h_t(x_i))
• Linear combination: H(x) = sign(Σ_{t=1}^{T} α_t h_t(x))
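A single round of this scheme can be condensed into a few lines; the sketch below is a minimal illustration (not the thesis code) assuming labels in {-1, +1} and a precomputed vector of weak-hypothesis predictions.

```python
import numpy as np

def adaboost_round(w, y, h_pred):
    """One AdaBoost round: weighted error, step size alpha, and the
    updated, renormalized weights.  Labels and predictions in {-1, +1}."""
    p = w / w.sum()
    eps = p[h_pred != y].sum()                 # weighted training error
    alpha = 0.5 * np.log((1 - eps) / eps)      # hypothesis weight
    w_new = w * np.exp(-alpha * y * h_pred)    # up-weight the mistakes
    return alpha, w_new / w_new.sum()
```

A well-known consequence of this update is that the misclassified examples carry exactly half of the total weight in the next round, which forces the next weak learner to attend to them.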
TrAdaBoost
• Loss function: ε_t = Σ_{i=1}^{|T_s|} p_i^t |h_t(x_i) − y_i| (computed over T_s alone)
• Weight update:
w_i^{t+1} = w_i^t exp(α_t^s |h_t(x_i^s) − y_i^s|) for (x_i^s, y_i^s) ∈ T_s
w_i^{t+1} = w_i^t exp(−α_d |h_t(x_i^d) − y_i^d|) for (x_i^d, y_i^d) ∈ T_d
• Linear combination: H(x) = sign(Σ_{t=T/2}^{T} α_t^s h_t(x))
• Increases weights of misclassified T_s samples
• Decreases weights of misclassified T_d samples
TransferBoost
• Similarity measure (transferability of source task k): γ_t^k = ε_t(T_s) − ε_t(T_d^k ∪ T_s)
• Loss function: ε_t = Σ_{i=1}^{|T_s|} p_i^t |h_t(x_i) − y_i|
• Weight update:
w_i^{t+1} = w_i^t exp(−α_t y_i^k h_t(x_i^k) + γ_t^k) for (x_i^k, y_i^k) ∈ T_d^k
w_i^{t+1} = w_i^t exp(−α_t y_i^s h_t(x_i^s)) for (x_i^s, y_i^s) ∈ T_s
• Increases weights of misclassified T_s samples
• Weights the K source tasks using a measure called transferability
Limitations
• TrAdaBoost
– decreases the weights of supporting source-domain instances, making knowledge transfer inefficient
– converging on the target error makes it prone to overfitting
• TransferBoost
– positive transferability is hard to come by due to the small size of T_s
– requires external information about the structure of the data to be of any use
Outline
• Motivation
• Transfer Learning
• Instance-Weighting using Boosting
• Cost-Sensitive Boosting : General Idea, Weight update
schemes, Algorithm , Cost Estimation, Dynamic Cost
• Results and Discussions
• Conclusion
General Idea
[Figure: two-stage pipeline. Stage 1: similarity measure. Stage 2 (AdaBoost): weak-hypothesis loss function, weight update scheme, linear combination of weak hypotheses]
• Compute instance-weights for T_d and T_s separately.
• Augment 𝑇𝑑 instances with computed cost factors 𝐶.
• Learn a strong classifier that minimizes the training error over 𝑇𝑠 and reduces the net misclassification cost over 𝑇𝑑.
Weight Update Schemes[7]
[7] Sun, Y. et al., "Cost-sensitive boosting for classification of imbalanced data." 2007
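The schemes themselves are not preserved in the slide text, so the following is my reconstruction of the three cost-sensitive updates from [7]; they differ only in where the cost c_i enters the exponential weight update:

```latex
% AdaC1: cost inside the exponent
w_i^{t+1} = w_i^t \exp\!\left(-\alpha_t\, c_i\, y_i\, h_t(x_i)\right)

% AdaC2: cost outside the exponent
w_i^{t+1} = c_i\, w_i^t \exp\!\left(-\alpha_t\, y_i\, h_t(x_i)\right)

% AdaC3: cost both inside and outside
w_i^{t+1} = c_i\, w_i^t \exp\!\left(-\alpha_t\, c_i\, y_i\, h_t(x_i)\right)

% e.g. for AdaC2, the step size becomes the cost-weighted analogue of
% AdaBoost's alpha:
\alpha_t = \frac{1}{2}\,
  \ln\frac{\sum_{i:\,h_t(x_i)=y_i} c_i\, w_i^t}
          {\sum_{i:\,h_t(x_i)\neq y_i} c_i\, w_i^t}
```

With all costs equal to 1, each variant collapses back to the standard AdaBoost update.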
Algorithm
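The algorithm listing itself is an image in the original slides. As a stand-in, here is a minimal sketch of an AdaC2-style training loop with decision stumps as weak learners; the function names, the exhaustive stump search, and the early-stopping rule are my own choices, not necessarily the thesis procedure.

```python
import numpy as np

def best_stump(X, y, p):
    """Exhaustively pick the decision stump (feature, threshold, polarity)
    with minimum error under the distribution p.  Labels in {-1, +1}."""
    best = (0, 0.0, 1, np.inf)
    for f in range(X.shape[1]):
        for thr in np.unique(X[:, f]):
            for pol in (1, -1):
                pred = np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
                err = p[pred != y].sum()
                if err < best[3]:
                    best = (f, thr, pol, err)
    return best

def adac2_fit(X, y, cost, rounds=10):
    """AdaC2-style cost-sensitive boosting: the per-instance cost c_i
    multiplies the weight OUTSIDE the exponent, so low-cost (less
    relevant) instances steadily lose influence.  cost lies in [0, 1]:
    1 for T_s examples, estimated costs for T_d examples."""
    w = np.ones(len(y)) / len(y)
    ensemble = []
    for _ in range(rounds):
        f, thr, pol, _ = best_stump(X, y, w / w.sum())
        h = np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
        ok = h == y
        num, den = (cost * w)[ok].sum(), (cost * w)[~ok].sum()
        if den == 0:                          # perfect weak hypothesis
            ensemble.append((1.0, f, thr, pol))
            break
        alpha = 0.5 * np.log(num / den)       # cost-weighted step size
        ensemble.append((alpha, f, thr, pol))
        w = cost * w * np.exp(-alpha * y * h)  # the AdaC2 update
        w /= w.sum()
    return ensemble

def adac2_predict(ensemble, X):
    score = np.zeros(len(X))
    for alpha, f, thr, pol in ensemble:
        score += alpha * np.where(pol * (X[:, f] - thr) >= 0, 1, -1)
    return np.where(score >= 0, 1, -1)
```

With cost set to all ones, the loop reduces exactly to AdaBoost; with low-cost conflicting source instances, the ensemble is driven by the target-consistent examples.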
Cost Properties
• Represents the similarity of instance distributions and classification functions between 𝑇𝑠 and 𝑇𝑑.
• Lies in the interval [0, 1].
• Relevant examples have cost values lying closer to 1.
• 𝑇𝑑 examples that have a cost 𝑐𝑖 = 0 are not used for training.
Cost Estimation
• Instance Pruning[8]: the probability of correct classification of an instance by a model trained on 𝑇𝑠.
• Relevance Measure: the ratio of an instance's summed distances to differently-labeled and identically-labeled 𝑇𝑠 examples,
relevance(x_i^d) = Σ_{j: y_j^s ≠ y_i^d} dist(x_i^d, x_j^s) / Σ_{j: y_j^s = y_i^d} dist(x_i^d, x_j^s)
[8] Jiang, J. and Zhai, C.X., "Instance Weighting for Domain Adaptation in NLP", Association for Computational Linguistics, 2007
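As an illustration of the instance-pruning idea, the sketch below scores each T_d example by how strongly a simple model trained on T_s supports that example's own label. The k-nearest-neighbour vote is my stand-in for the thesis's classifier; any model that outputs class probabilities would serve.

```python
import numpy as np

def instance_pruning_costs(Xs, ys, Xd, yd, k=2):
    """Cost of each T_d instance = estimated probability that a model
    trained on the small target set T_s assigns to the instance's own
    label (here: the fraction of its k nearest T_s neighbours agreeing
    with that label).  Costs land in [0, 1], as required."""
    costs = np.empty(len(Xd))
    for i, x in enumerate(Xd):
        d = np.linalg.norm(Xs - x, axis=1)       # distances to all of T_s
        neighbours = ys[np.argsort(d)[:k]]
        costs[i] = np.mean(neighbours == yd[i])
    return costs
```

Source instances whose labels contradict the target concept receive cost near 0 and are effectively pruned from boosting; consistent ones keep cost near 1.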
Cost Estimation
• KL Importance Estimation Procedure[9]: transductively estimates the density ratio P_{T_s}(x_i)/P_{T_d}(x_i) by minimizing the KL divergence between the distributions of 𝑇𝑑 and 𝑇𝑠.
• Concept Feature Vector Distance[10]: measures the distance between the Concept Feature Vectors that represent the different class labels in 𝑇𝑑 and 𝑇𝑠.
[9] Sugiyama, M., et al., "Direct Importance Estimation with Model Selection and its Application to Covariate Shift Adaptation", NIPS, 2008
[10] Katakis, I., et al., "An Ensemble of Classifiers for Coping with Recurring Contexts in Data Streams", ECAI, 2008
Dynamic Cost-Sensitive Boosting
Step 11 is added to the algorithm: update the cost vector C each iteration by calling the cost estimation procedure along with the weights of 𝑇𝑠.
Outline
• Motivation
• Transfer Learning
• Instance-Weighting using Boosting
• Cost-Sensitive Boosting
• Results and Discussions : Datasets, Classification
Accuracies, Dominance of AdaC2, vs. % of Training Data, Effect of Cost,
Dynamic Cost, Multisource Transfer
• Conclusion
Datasets
Act_gest: Accelerometer Based 3D Gesture Recognition (4 datasets)
• Source and target datasets:
– Multi-class mock laboratory data (20 action samples from 5 users)
– Multi-class real-life data (4 users each made 4 glasses of Gatorade and drank them)
– 44 features and 500 source instances
• Factors that induce dataset shift:
– Environmental factors, including the size, shape and weight of real-world objects
– User traits
• Average cross-validation accuracy was obtained over 5 trials.
The Activity Gesture dataset shows clear signs of a domain shift upon performing PCA on the feature points and projecting its instances onto the first three principal components.
Datasets
WSU Smart Home Activity Recognition (7 datasets)
• Multi-source datasets:
– Multi-class activity data captured from 7 different smart-home test beds
– Modeled into single source and target datasets using one vs. all
– 19 features and a maximum of 5468 instances per source
• Factors that introduce dataset shift:
– Different apartment layouts
– Different residents
• Average cross-validation accuracy was obtained over 5 trials.
The WSU Activity Recognition datasets show signs of a shift in 𝑃(𝑋). Of particular interest is how the dataset shift varies with the actual task in question.
Datasets
20Newsgroups 1 (6 datasets)
• Source and target datasets:
– 65K features were reduced to 45K using document-frequency thresholding
– All features were encoded as binary
– Modeled into binary classification datasets with class labels of one subcategory vs. another
• Factors that introduce concept drift in the dataset:
– Different term frequencies
– Synthetically generated from different subcategories
• Average cross-validation accuracy was obtained over 5 trials.
20Newsgroups 2 (7 datasets)
• A multi-source variation containing one subcategory vs. noisy subcategories.
Classification Accuracies
Act-gest dataset
Dataset Svm𝑇𝑠 Svm𝑇𝑑 Svm𝑇𝑑𝑠 Ada Trada Adac1 Adac2 Adac3
User 1 0.77 0.56 0.79 0.85 0.82 0.85 0.88 0.85
User 2 0.84 0.64 0.98 0.93 0.98 0.97 0.98 0.98
User 3 0.54 0.33 0.71 0.67 0.65 0.70 0.75 0.74
User 4 0.44 0.61 0.77 0.73 0.75 0.76 0.79 0.80
Act-rec dataset
Dataset Svm𝑇𝑠 Svm𝑇𝑑 Svm𝑇𝑑𝑠 Ada Trada Adac1 Adac2 Adac3
Apt-A 0.71 0.67 0.71 0.78 0.63 0.80 0.82 0.75
Apt-B 0.67 0.62 0.68 0.72 0.57 0.79 0.80 0.76
Apt-C 0.79 0.37 0.81 0.76 0.49 0.79 0.83 0.78
Apt-D 0.76 0.34 0.77 0.82 0.52 0.83 0.81 0.81
Apt-E 0.29 0.04 0.45 0.46 0.70 0.46 0.48 0.49
Apt-F 0.58 0.20 0.60 0.62 0.40 0.67 0.68 0.67
Apt-G 0.52 0.44 0.55 0.53 0.46 0.59 0.59 0.58
Classification Accuracies
20Newsgroups 1
Dataset Svm𝑇𝑠 Svm𝑇𝑑 Svm𝑇𝑑𝑠 Ada Trada Adac1 Adac2 Adac3
Rec vs. Talk 0.68 0.72 0.75 0.72 0.73 0.71 0.83 0.72
Rec vs. Sci 0.63 0.70 0.69 0.69 0.69 0.70 0.77 0.69
Sci vs. Talk 0.60 0.64 0.67 0.64 0.70 0.67 0.74 0.68
Comp vs. Rec 0.80 0.73 0.85 0.83 0.72 0.82 0.86 0.84
Comp vs. Sci 0.62 0.64 0.67 0.68 0.58 0.69 0.76 0.69
Comp vs. Talk 0.86 0.68 0.87 0.87 0.73 0.88 0.89 0.88
AdaC2 vs. AdaC1, AdaC3
AdaC2 vs. % of Target Training Data
The above plot corresponds to the Apartment-A dataset of act_rec
Effect of Cost
Dynamic Cost-Sensitive Boosting
Act-gest dataset
Dataset AdaC2 DAdaC2
User1 0.88 0.87
User2 0.98 0.98
User3 0.75 0.71
User4 0.79 0.80
Act-rec dataset
Dataset AdaC2 DAdaC2
Apt - A 0.82 0.82
Apt - B 0.80 0.74
Apt - C 0.83 0.80
Apt - D 0.81 0.77
Apt - E 0.48 0.48
Apt - F 0.68 0.69
Apt - G 0.59 0.60
20Newsgroups 1
Dataset AdaC2 DAdaC2
Rec vs Talk 0.83 0.84
Rec vs Sci 0.77 0.77
Sci vs Talk 0.74 0.74
Rec vs Comp 0.86 0.89
Comp vs Sci 0.76 0.75
Comp vs Talk 0.89 0.90
AdaC2 vs. Multisource Transfer
Dataset TrAdaBoost TransferBoost AdaC2
Apt-A 0.63 0.71 0.82
Apt-B 0.57 0.69 0.80
Apt-C 0.49 0.79 0.83
Apt-D 0.52 0.78 0.81
Apt-E 0.70 0.37 0.48
Apt-F 0.40 0.61 0.68
Apt-G 0.46 0.56 0.59
baseball 0.46 0.54 0.78
electronics 0.65 0.54 0.64
med 0.52 0.51 0.67
mideast 0.39 0.48 0.54
misc 0.47 0.53 0.51
pchardware 0.63 0.53 0.69
windowsx 0.64 0.57 0.66
Outline
• Motivation
• Transfer Learning
• Instance-Weighting using Boosting
• Cost-Sensitive Boosting
• Results and Discussions
• Conclusion : Conclusion, Thesis Summary, Future Directions,
Dissemination
Conclusion
Pros
• An extension of AdaBoost for Transfer Learning
• Performs better than existing instance-transfer techniques on real-world datasets
• Provides flexibility in using different relatedness measures and base classifiers
• Has a good theoretical basis
Cons
• May be prone to overfitting
• Performance depends on the effectiveness of the estimated cost
• Relies on a bottom-up weighting approach; does not utilize a given structure of the data
Summary
• Cost-sensitive boosting schemes were evaluated over real-world datasets and compared against well-known algorithms.
• 3 variants of cost-sensitive boosting algorithms were investigated; AdaC2 was found to be the best of the lot.
• 4 different relatedness measures were evaluated; instance pruning was found to give the best results.
• The effect of maintaining a dynamic cost scheme was studied.
• The equivalence of AdaC2 with respect to multisource transfer learning was analyzed.
Future Directions
• Estimating relatedness: does a better a priori relatedness measure exist?
• Target-domain instance selection: how to optimally select instances from the target domain?
• Discovering structure in datasets: how can an existing structure in the data be capitalized on?
• System integration: how best to integrate these methodologies into an application framework?
Dissemination
• A. Venkatesan, N.C. Krishnan, and S. Panchanathan, "Cost-sensitive Boosting for Concept Drift", ECML Workshop on Handling Concept Drift in Adaptive Information Systems (HaCDAIS), Barcelona, Spain, 2010.
• N.C. Krishnan, A. Venkatesan, S. Panchanathan, and D. Cook, "Cost-sensitive Boosting for Transfer Learning", in preparation for submission to IEEE Transactions on Knowledge and Data Engineering.
CENTER FOR COGNITIVE UBIQUITOUS COMPUTING
Thank you. Questions?

More Related Content

Similar to Boosting based Transfer Learning

LETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptx
LETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptxLETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptx
LETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptx
shamsul2010
 
GIS_presentation .pptx
GIS_presentation                    .pptxGIS_presentation                    .pptx
GIS_presentation .pptx
lahelex741
 

Similar to Boosting based Transfer Learning (20)

Online Tuning of Large Scale Recommendation Systems
Online Tuning of Large Scale Recommendation SystemsOnline Tuning of Large Scale Recommendation Systems
Online Tuning of Large Scale Recommendation Systems
 
crossvalidation.pptx
crossvalidation.pptxcrossvalidation.pptx
crossvalidation.pptx
 
Statistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptxStatistical Learning and Model Selection (1).pptx
Statistical Learning and Model Selection (1).pptx
 
Winning Kaggle 101: Introduction to Stacking
Winning Kaggle 101: Introduction to StackingWinning Kaggle 101: Introduction to Stacking
Winning Kaggle 101: Introduction to Stacking
 
AI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptxAI_Unit-4_Learning.pptx
AI_Unit-4_Learning.pptx
 
MACHINE LEARNING YEAR DL SECOND PART.pptx
MACHINE LEARNING YEAR DL SECOND PART.pptxMACHINE LEARNING YEAR DL SECOND PART.pptx
MACHINE LEARNING YEAR DL SECOND PART.pptx
 
LETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptx
LETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptxLETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptx
LETS PUBLISH WITH MORE RELIABLE & PRESENTABLE MODELLING.pptx
 
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
Review : Adaptive Consistency Regularization for Semi-Supervised Transfer Lea...
 
Paper id 312201523
Paper id 312201523Paper id 312201523
Paper id 312201523
 
Fast AutoAugment
Fast AutoAugmentFast AutoAugment
Fast AutoAugment
 
Research-DrRakhiSawlani.pptx
Research-DrRakhiSawlani.pptxResearch-DrRakhiSawlani.pptx
Research-DrRakhiSawlani.pptx
 
Presentation1.pptx
Presentation1.pptxPresentation1.pptx
Presentation1.pptx
 
ResNeSt: Split-Attention Networks
ResNeSt: Split-Attention NetworksResNeSt: Split-Attention Networks
ResNeSt: Split-Attention Networks
 
Data preprocessing in Machine learning
Data preprocessing in Machine learning Data preprocessing in Machine learning
Data preprocessing in Machine learning
 
GIS_presentation .pptx
GIS_presentation                    .pptxGIS_presentation                    .pptx
GIS_presentation .pptx
 
introduction to Statistical Theory.pptx
 introduction to Statistical Theory.pptx introduction to Statistical Theory.pptx
introduction to Statistical Theory.pptx
 
Endsem AI merged.pdf
Endsem AI merged.pdfEndsem AI merged.pdf
Endsem AI merged.pdf
 
Presentation of master thesis
Presentation of master thesisPresentation of master thesis
Presentation of master thesis
 
ML SFCSE.pptx
ML SFCSE.pptxML SFCSE.pptx
ML SFCSE.pptx
 
Hyperparameter Tuning
Hyperparameter TuningHyperparameter Tuning
Hyperparameter Tuning
 

Recently uploaded

Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Abortion pills in Riyadh +966572737505 get cytotec
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
nirzagarg
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
gajnagarg
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
nirzagarg
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
gajnagarg
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
vexqp
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
cnajjemba
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
gajnagarg
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
wsppdmt
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
vexqp
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
vexqp
 

Recently uploaded (20)

Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit RiyadhCytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
Cytotec in Jeddah+966572737505) get unwanted pregnancy kit Riyadh
 
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
Top profile Call Girls In Purnia [ 7014168258 ] Call Me For Genuine Models We...
 
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
Top profile Call Girls In bhavnagar [ 7014168258 ] Call Me For Genuine Models...
 
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
Top profile Call Girls In Hapur [ 7014168258 ] Call Me For Genuine Models We ...
 
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
Top profile Call Girls In Vadodara [ 7014168258 ] Call Me For Genuine Models ...
 
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
怎样办理纽约州立大学宾汉姆顿分校毕业证(SUNY-Bin毕业证书)成绩单学校原版复制
 
PLE-statistics document for primary schs
PLE-statistics document for primary schsPLE-statistics document for primary schs
PLE-statistics document for primary schs
 
7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt7. Epi of Chronic respiratory diseases.ppt
7. Epi of Chronic respiratory diseases.ppt
 
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptxThe-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
The-boAt-Story-Navigating-the-Waves-of-Innovation.pptx
 
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
Top profile Call Girls In dimapur [ 7014168258 ] Call Me For Genuine Models W...
 
Aspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - AlmoraAspirational Block Program Block Syaldey District - Almora
Aspirational Block Program Block Syaldey District - Almora
 
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book nowVadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
Vadodara 💋 Call Girl 7737669865 Call Girls in Vadodara Escort service book now
 
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
如何办理英国诺森比亚大学毕业证(NU毕业证书)成绩单原件一模一样
 
Digital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham WareDigital Transformation Playbook by Graham Ware
Digital Transformation Playbook by Graham Ware
 
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
Digital Advertising Lecture for Advanced Digital & Social Media Strategy at U...
 
Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...Sequential and reinforcement learning for demand side management by Margaux B...
Sequential and reinforcement learning for demand side management by Margaux B...
 
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
怎样办理旧金山城市学院毕业证(CCSF毕业证书)成绩单学校原版复制
 
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
怎样办理圣地亚哥州立大学毕业证(SDSU毕业证书)成绩单学校原版复制
 
Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........Switzerland Constitution 2002.pdf.........
Switzerland Constitution 2002.pdf.........
 
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATIONCapstone in Interprofessional Informatic  // IMPACT OF COVID 19 ON EDUCATION
Capstone in Interprofessional Informatic // IMPACT OF COVID 19 ON EDUCATION
 

Boosting based Transfer Learning

  • 1. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING CUbiC ARIZONA STATE UNIVERSITY A Study of Boosting based Transfer Learning for Activity and Gesture Recognition Ashok Venkatesan Committee Members Sethuraman Panchanathan, Professor (Chair) Jieping Ye, Associate Professor Baoxin Li, Associate Professor Master’s Thesis Defense
  • 2. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation • Transfer Learning • Problem and Related Work • Cost-Sensitive Boosting • Results and Discussions • Conclusion
  • 3. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation : Real World Data, Dataset Shifts, Traditional Learning • Transfer Learning • Problem and Related Work • Cost-Sensitive Boosting • Results and Discussions • Conclusion
  • 4. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Real-World Data Difficult to learn as it is Non-Stationary and Continuously Evolving
  • 5. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Example : Spam Filtering A spam filter is trained on random emails tracked from a group of users under the assumption that new users would classify spam identically. 1. What if the training data is no longer relevant? 2. What if the user preferences are not identical?
  • 6. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Motivational Example : Accelerometer Based 3D Gesture Recognition A gesture recognition model is trained on mock data obtained in a control environment under the assumption that real life data would be identical 1. What if the user has peculiar traits? 2. What if environmental factors and the objects interacted with vary and impact the property of the gesture? Scoop Stir
  • 7. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Simple Covariate Shift • Change in 𝑃(𝑥) due to the change in a known covariate. Prior Probability Shift • Change in 𝑃(𝑦) when 𝑃(𝑦|𝑥) is modeled as 𝑃(𝑥|𝑦) 𝑃(𝑦) Sample Selection Bias • 𝑃 𝑥𝑖 ≠ 𝑃(𝑥) Imbalanced Data • Change in 𝑃(𝑦) by design Domain Shift • Change in measurement system of 𝑥𝑖 Source Component Shift • Involves changes in strength of contributing components Concept Drift • Change in 𝑃(𝑦|𝑥) in continuous and real-time data streams Dataset Shift[1] [1] Quionero-Candela, J., et al., Dataset shift in Machine Learning. s.l. : The MIT Press, 2009
  • 8. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING • Training and test examples are assumed to be independently drawn and identically distributed Traditional Learning NOT SUITED FOR HANDLING DATASET SHIFTS Algorithm Algorithm Tasks Models Traditional Learning over Multiple Domains
  • 9. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation • Transfer Learning : Definition, Learning Settings, Notation, Problem, What to Transfer? • Instance-Weighting using Boosting • Cost-Sensitive Boosting • Results and Discussions • Conclusion
  • 10. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Definition[2][3] "Transfer Learning is a methodology that uses prior acquired knowledge to effectively develop a new hypothesis. It emphasizes knowledge transfer across domains, tasks and distributions that are similar but not the same." [2] NIPS Inductive Transfer Workshop 2005 [3] Pan, S.J. and Yang, Q., "A Survey on Transfer Learning", TKDE 2009 • It is motivated by human learning: people can often transfer previously learnt knowledge to novel situations. • e.g. Knowing how to ride a bicycle might help improve learning to ride a motorbike. • Outdated data representing prior knowledge is referred to as the Source; newer data representing the newer knowledge is referred to as the Target. • A domain is D = {𝒳, P(X)} and a task is T = {𝒴, f(·)}; source and target may differ in P(X), P(Y) and P(Y|X).
  • 11. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Transfer Learning - Illustration • Knowledge learnt by an algorithm from abundant source training data is transferred to the algorithm learning the target task model from insufficient target training data. • Transfer Learning is beneficial for lessening the labeling costs associated with retraining a model from scratch and for making classification rapidly adaptable in real time.
  • 12. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Transfer Settings[3] • Inductive Transfer: a few labeled target-domain examples are available for obtaining a weak inductive bias; source data is used as auxiliary data. • Transductive Transfer: lots of labeled source data and lots of unlabeled target-domain data; capitalize on the difference in the domains. • Unsupervised Transfer: both source and target data are unlabeled; apply techniques such as clustering and density estimation. • The scope of Transfer Learning in general is to learn a classifier that performs well over target data samples alone; classification performance over source tasks is ignored. [3] Pan, S.J. and Yang, Q., "A Survey on Transfer Learning", TKDE 2009
  • 13. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Notation • Two sets of tasks, source and target, represented by instances X_source, X_target ∈ 𝒳 and labels Y_source, Y_target ∈ 𝒴 such that P(X_source, Y_source) ≠ P(X_target, Y_target). • Training examples are grouped and named based on their task distributions: same task distribution as the target, T_s = {(x_i^s, y_i^s)}_{i=1}^{m}; different task distribution from that of the target, T_d = {(x_j^d, y_j^d)}_{j=1}^{n}. • Unlabeled test examples representing the target tasks: S.
  • 14. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Problem Statement • Abundant source data T_d = {(x_i^d, y_i^d)}_{i=1}^{n} • Little labeled target data T_s = {(x_j^s, y_j^s)}_{j=1}^{m} • Unseen target data S • Objective: Given |T_s| ≪ |T_d| and that T_s alone is insufficient to learn the target tasks, learn a model using T_d ∪ T_s that classifies the target task examples S with minimum error.
  • 15. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING What to Transfer? • Instance-based: reuse instances observed in the source domain that are similar to the target domain. E.g. instance reweighting, importance sampling. • Feature-based: find an alternate feature space for learning the target domain while projecting the source domain into the new space. E.g. feature subset selection, feature space transformation. • Model/Parameter-based: use model components such as parameters and hyper-parameters to influence learning of the target task. E.g. parameter-space partitioning, superimposing shape constraints.
  • 16. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation • Transfer Learning • Instance-Weighting using Boosting : Instance Weighting, AdaBoost, TrAdaBoost, TransferBoost, Limitations • Cost-Sensitive Boosting • Results and Discussions • Conclusion
  • 17. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Boosting • AdaBoost[4] boosts a weak learning algorithm into a strong learner by linearly combining an ensemble of weak hypotheses. • Why boosting-based instance-weighting? – Provides theoretical guarantees on generalization error bounds. – Incremental instance boosting aids in the systematic selection of important examples. – Well-defined focus areas to be modified for knowledge transfer: • Weak hypothesis loss function • Weight update scheme • Linear combination of the weak hypotheses [4] Freund, Y., Schapire, R. and Abe, N., "A Short Introduction to Boosting", JSAI, 1999
  • 18. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Instance-Weighting and Boosting • Stage 1: Similarity Measure • Stage 2 (AdaBoost): Weak Hypotheses Loss Function, Weight Update Scheme, Linear Combination of Weak Hypotheses • Two recent instance-weighting algorithms adapt AdaBoost for knowledge transfer: TrAdaBoost[5] and TransferBoost[6]. [5] Dai, W., et al., "Boosting for Transfer Learning", ICML, 2007 [6] Eaton, E. and desJardins, M., "Set-Based Boosting for Instance-level Transfer", IEEE ICDM Workshops, 2009
  • 19. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING AdaBoost • Loss Function: ε_t = Σ_{i=1}^{N} p_i^t |h_t(x_i) − y_i|, α_t = (1/2) log((1 − ε_t)/ε_t) • Weight Update: w_i^{t+1} = w_i^t exp(−α_t y_i h_t(x_i)) • Linear Combination: H(x) = sign(Σ_{t=1}^{T} α_t h_t(x)) • Main Idea: increase the weights of misclassified training samples (T_d ∪ T_s)
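The update rules above can be sketched with 1-D threshold stumps (a minimal illustration; the stump learner and toy data are hypothetical, not the thesis implementation):

```python
import numpy as np

def stump(x, thr, pol):
    """Threshold weak learner returning labels in {-1, +1}."""
    h = pol * np.sign(x - thr)
    return np.where(h == 0, pol, h)

def adaboost(X, y, T=10):
    """Minimal AdaBoost with 1-D threshold stumps; labels y in {-1, +1}."""
    n = len(y)
    w = np.ones(n) / n                       # uniform initial weights
    ensemble = []                            # (thr, pol, alpha)
    for _ in range(T):
        best = None
        for thr in X:                        # candidate thresholds
            for pol in (+1, -1):
                h = stump(X, thr, pol)
                eps = np.sum(w * (h != y))   # weighted 0/1 error
                if best is None or eps < best[0]:
                    best = (eps, thr, pol, h)
        eps, thr, pol, h = best
        eps = min(max(eps, 1e-10), 1 - 1e-10)
        alpha = 0.5 * np.log((1 - eps) / eps)
        w = w * np.exp(-alpha * y * h)       # boost misclassified samples
        w = w / w.sum()
        ensemble.append((thr, pol, alpha))
    return lambda x: np.sign(sum(a * stump(x, t, p) for t, p, a in ensemble))

X = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])
y = np.array([-1, -1, -1, 1, 1, 1])
H = adaboost(X, y, T=5)
print(H(X))
```

The weighted 0/1 error used here is proportional to the slide's Σ p_i |h(x_i) − y_i| for ±1 labels.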
  • 20. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING TrAdaBoost • Loss Function: ε_t = Σ_{i=1}^{|T_s|} p_i^t |h_t(x_i) − y_i| • Weight Update: w_i^{t+1} = w_i^t exp(α_t^s |h_t(x_i^s) − y_i^s|) for (x_i^s, y_i^s) ∈ T_s; w_i^{t+1} = w_i^t exp(−α_d |h_t(x_i^d) − y_i^d|) for (x_i^d, y_i^d) ∈ T_d • Linear Combination: H(x) = sign(Σ_{t=T/2}^{T} α_t^s h_t(x)) • Increases the weights of misclassified T_s samples • Decreases the weights of misclassified T_d samples
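One TrAdaBoost round can be sketched as follows, assuming the fixed source discount β = 1/(1 + √(2 ln n / T)) from Dai et al.; the variable names and toy error indicators are hypothetical:

```python
import numpy as np

def tradaboost_round(w_d, w_s, err_d, err_s, eps_t, n_d, T):
    """One TrAdaBoost weight update (sketch).
    err_* are 0/1 misclassification indicators; eps_t is the weighted
    error measured on the target set T_s only."""
    eps_t = min(max(eps_t, 1e-10), 0.5 - 1e-10)
    alpha_s = 0.5 * np.log((1 - eps_t) / eps_t)          # target step size
    beta = 1.0 / (1.0 + np.sqrt(2.0 * np.log(n_d) / T))  # fixed source discount
    w_s_new = w_s * np.exp(alpha_s * err_s)  # raise misclassified target weights
    w_d_new = w_d * beta ** err_d            # lower misclassified source weights
    z = w_s_new.sum() + w_d_new.sum()        # renormalize jointly
    return w_d_new / z, w_s_new / z

w_d = np.ones(4) / 8
w_s = np.ones(4) / 8
err_d = np.array([1, 0, 0, 0])   # first source instance misclassified
err_s = np.array([0, 1, 0, 0])   # second target instance misclassified
w_d2, w_s2 = tradaboost_round(w_d, w_s, err_d, err_s, eps_t=0.25, n_d=4, T=10)
print(w_d2, w_s2)
```

The contrast with plain AdaBoost is visible in the two multiplicative factors: misclassified target weight grows, misclassified source weight shrinks.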
  • 21. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING TransferBoost • Similarity Measure: γ_t^k = ε_{T_s} − ε_{T_d^k ∪ T_s} • Loss Function: ε_t = Σ_{i=1}^{|T_s|} p_i^t |h_t(x_i) − y_i| • Weight Update: w_i^{t+1} = w_i^t exp(−α_t y_i^k h_t(x_i^k) + γ_t^k) for (x_i^k, y_i^k) ∈ T_d^k; w_i^{t+1} = w_i^t exp(−α_t y_i^s h_t(x_i^s)) for (x_i^s, y_i^s) ∈ T_s • Increases the weights of misclassified T_s samples • Weights the K source tasks using a measure called transferability
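The transferability measure above can be sketched as the difference of two weighted errors (the toy 0/1 error indicators and uniform weights are hypothetical):

```python
import numpy as np

def transferability(err_s, err_ds, w_s, w_ds):
    """TransferBoost-style transferability of source task k (sketch):
    gamma = weighted error on T_s alone minus weighted error on T_d^k union T_s.
    Positive gamma suggests adding source task k helps the target."""
    eps_s = np.sum(w_s * err_s) / w_s.sum()
    eps_ds = np.sum(w_ds * err_ds) / w_ds.sum()
    return eps_s - eps_ds

# Hypothetical 0/1 error indicators for hypotheses trained without / with task k
err_s = np.array([1, 0, 1, 0])          # errors on T_s alone
err_ds = np.array([0, 0, 1, 0, 0, 0])   # errors on T_d^k union T_s
g = transferability(err_s, err_ds, np.ones(4), np.ones(6))
print(g)  # positive: task k transfers positively
```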
  • 22. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Limitations • TrAdaBoost – decreases the weights of supporting source-domain instances, making knowledge transfer inefficient. – converging over the target error makes it prone to overfitting. • TransferBoost – positive transferability is hard to come by due to the small size of T_s. – requires external information about the structure of the data to be of any use.
  • 23. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation • Transfer Learning • Instance-Weighting using Boosting • Cost-Sensitive Boosting : General Idea, Weight update schemes, Algorithm , Cost Estimation, Dynamic Cost • Results and Discussions • Conclusion
  • 24. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Stage 1 Stage 2 (AdaBoost) Similarity Measure Weak Hypotheses Loss Function Weight Update Scheme Linear combination of Weak Hypotheses General Idea • Compute instance-weights for Td and Ts separately. • Augment 𝑇𝑑 instances with computed cost factors 𝐶. • Learn a strong classifier to minimize training error over 𝑇𝑠 and reduce net misclassification cost over 𝑇𝑑.
  • 25. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Weight Update Schemes[7] [7] Sun, Y. et al., "Cost-sensitive boosting for classification of imbalanced data." 2007
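The referenced table (a figure in the original deck) covers Sun et al.'s three schemes, which place the cost factor inside the exponent (AdaC1), outside it (AdaC2), or both (AdaC3). A sketch with a fixed α for illustration; in the paper each scheme also has its own α formula and a normalization step, both omitted here:

```python
import numpy as np

def cost_update(scheme, w, c, y, h, alpha):
    """Sun et al.'s three cost-sensitive weight updates (sketch).
    c is the per-instance cost in [0, 1]; y, h are labels in {-1, +1}."""
    if scheme == "AdaC1":   # cost inside the exponent
        return w * np.exp(-alpha * c * y * h)
    if scheme == "AdaC2":   # cost outside the exponent
        return c * w * np.exp(-alpha * y * h)
    if scheme == "AdaC3":   # cost both inside and outside
        return c * w * np.exp(-alpha * c * y * h)
    raise ValueError(scheme)

w = np.ones(3) / 3
c = np.array([1.0, 0.5, 0.1])   # high cost = more relevant T_d instance
y = np.array([1, 1, -1])
h = np.array([1, -1, -1])       # second instance misclassified
for s in ("AdaC1", "AdaC2", "AdaC3"):
    print(s, cost_update(s, w, c, y, h, alpha=0.5))
```

Note how AdaC2 multiplies the cost outside the exponent, so low-cost (irrelevant) source instances are damped every round regardless of correctness.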
  • 26. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Algorithm
  • 27. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Cost Properties • Represents the similarity of instance distributions and classification functions between T_s and T_d. • Lies in the interval [0, 1]. • Relevant examples have cost values lying closer to 1. • T_d examples that have a cost c_i = 0 are not used for training.
  • 28. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Cost Estimation • Instance Pruning[8]: the probability of correct classification of an instance by a model trained on T_s. • Relevance Measure: the ratio Σ_{j: y_i^d ≠ y_j^s} dist(x_i^d, x_j^s) / Σ_{j: y_i^d = y_j^s} dist(x_i^d, x_j^s). [8] Jiang, J. and Zhai, C.X., "Instance Weighting for Domain Adaptation in NLP", Association for Computational Linguistics, 2007
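The relevance measure above can be sketched as a cost in [0, 1]; the squashing ratio/(1 + ratio) and the toy 2-D data are assumptions for illustration, not the thesis code:

```python
import numpy as np

def relevance_cost(Xd, yd, Xs, ys):
    """Cost for each T_d instance: summed distance to differently labeled
    T_s examples divided by summed distance to same-labeled ones, then
    squashed into [0, 1]. Larger values mean the T_d instance sits close
    to same-labeled target data, i.e. it is more relevant."""
    costs = []
    for x, label in zip(Xd, yd):
        d = np.linalg.norm(Xs - x, axis=1)      # distances to all T_s points
        diff = d[ys != label].sum()
        same = d[ys == label].sum() + 1e-12     # avoid division by zero
        ratio = diff / same
        costs.append(ratio / (1.0 + ratio))     # map (0, inf) into (0, 1)
    return np.array(costs)

Xs = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])  # target T_s
ys = np.array([0, 0, 1, 1])
Xd = np.array([[0.05, 0.0], [5.0, 5.1], [2.5, 2.5]])             # source T_d
yd = np.array([0, 1, 0])
print(relevance_cost(Xd, yd, Xs, ys))
```

The first two source instances land inside the matching target clusters and get costs near 1; the ambiguous midpoint gets a cost near 0.5.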
  • 29. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Cost Estimation • KL Importance Estimation Procedure[9]: transductively estimates the density ratio P(x_i^s)/P(x_i^t) by minimizing the KL divergence between the distributions of T_d and T_s. • Concept Feature Vector Distance[10]: measures the distance between the Concept Feature Vectors that represent different class labels in T_d and T_s. [9] Sugiyama, M., et al., "Direct Importance Estimation with Model Selection and its Application to Covariate Shift Adaptation", NIPS, 2008 [10] Katakis, I., et al., "An Ensemble of Classifiers for Coping with Recurring Contexts in Data Streams", ECAI, 2008
  • 30. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Dynamic Cost-Sensitive Boosting 11. Update the cost vector C by calling the Cost Estimation Procedure along with the weights of 𝑇𝑠
  • 31. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation • Transfer Learning • Instance-Weighting using Boosting • Cost-Sensitive Boosting • Results and Discussions : Datasets, Classification Accuracies, Dominance of AdaC2, vs. % of Training Data, Effect of Cost, Dynamic Cost, Multisource Transfer • Conclusion
  • 32. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Datasets Act_gest: Accelerometer-Based 3D Gesture Recognition (4 datasets) • Source and Target Datasets – Multi-class mock laboratory data (20 action samples from 5 users) – Multi-class real-life data (4 users each prepared and drank 4 glasses of Gatorade) – 44 features and 500 source instances. • Factors that induce dataset shift: – Environmental factors, including the size, shape and weight of real-world objects – User traits • Avg. cross-validation accuracy was obtained over 5 trials.
  • 33. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING The Activity Gesture dataset shows clear signs of a domain shift when PCA is performed on the feature points and its instances are projected onto the first three principal components.
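The projection described above can be sketched generically with an SVD-based PCA; the synthetic 44-feature source/target arrays below are hypothetical stand-ins for the gesture features:

```python
import numpy as np

def top3_projection(X):
    """Project instances onto the first three principal components
    (a generic PCA sketch, not the thesis feature pipeline)."""
    Xc = X - X.mean(axis=0)                  # center the data
    # SVD of the centered data; rows of Vt are the principal directions
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:3].T

rng = np.random.default_rng(1)
source = rng.normal(0.0, 1.0, size=(100, 44))
target = rng.normal(0.5, 1.0, size=(100, 44))   # shifted domain
P = top3_projection(np.vstack([source, target]))
print(P.shape)  # (200, 3)
```

Plotting the two halves of `P` in 3-D would separate the domains when a shift in P(X) is present, which is what the slide's figure illustrates.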
  • 35. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Datasets WSU Smart Home Activity Recognition (7 datasets) • Multi-Source Datasets – Multi-class activity data captured from 7 different smart-home test beds. – Modeled into single source and target datasets using one vs. all. – 19 features and a maximum of 5468 instances per source. • Factors that introduce dataset shift: – Different apartment layouts – Different residents • Avg. cross-validation accuracy was obtained over 5 trials.
  • 36. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING The WSU Activity Recognition datasets show signs of a shift in P(X). Of particular interest is how the dataset shift varies with the actual task in question.
  • 37. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Datasets 20Newsgroups 1 (6 datasets) • Source and Target Datasets – 65K features were reduced to 45K using document-frequency thresholding. – All features were encoded as binary. – Modeled into a binary classification dataset with class labels as one subcategory vs. another. • Factors that introduce concept drift in the dataset: – Different term frequencies – Synthetically generated from different subcategories. • Avg. cross-validation accuracy was obtained over 5 trials. 20Newsgroups 2 (7 datasets) • A multi-source variation containing one subcategory vs. noisy subcategories.
  • 38. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Classification Accuracies

Act-gest dataset
Dataset  Svm_Ts  Svm_Td  Svm_Tds  Ada   Trada  AdaC1  AdaC2  AdaC3
User 1   0.77    0.56    0.79     0.85  0.82   0.85   0.88   0.85
User 2   0.84    0.64    0.98     0.93  0.98   0.97   0.98   0.98
User 3   0.54    0.33    0.71     0.67  0.65   0.70   0.75   0.74
User 4   0.44    0.61    0.77     0.73  0.75   0.76   0.79   0.80

Act-rec dataset
Dataset  Svm_Ts  Svm_Td  Svm_Tds  Ada   Trada  AdaC1  AdaC2  AdaC3
Apt-A    0.71    0.67    0.71     0.78  0.63   0.80   0.82   0.75
Apt-B    0.67    0.62    0.68     0.72  0.57   0.79   0.80   0.76
Apt-C    0.79    0.37    0.81     0.76  0.49   0.79   0.83   0.78
Apt-D    0.76    0.34    0.77     0.82  0.52   0.83   0.81   0.81
Apt-E    0.29    0.04    0.45     0.46  0.70   0.46   0.48   0.49
Apt-F    0.58    0.20    0.60     0.62  0.40   0.67   0.68   0.67
Apt-G    0.52    0.44    0.55     0.53  0.46   0.59   0.59   0.58
  • 39. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Classification Accuracies

20Newsgroups 1
Dataset        Svm_Ts  Svm_Td  Svm_Tds  Ada   Trada  AdaC1  AdaC2  AdaC3
Rec vs. Talk   0.68    0.72    0.75     0.72  0.73   0.71   0.83   0.72
Rec vs. Sci    0.63    0.70    0.69     0.69  0.69   0.70   0.77   0.69
Sci vs. Talk   0.60    0.64    0.67     0.64  0.70   0.67   0.74   0.68
Comp vs. Rec   0.80    0.73    0.85     0.83  0.72   0.82   0.86   0.84
Comp vs. Sci   0.62    0.64    0.67     0.68  0.58   0.69   0.76   0.69
Comp vs. Talk  0.86    0.68    0.87     0.87  0.73   0.88   0.89   0.88
  • 40. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING AdaC2 vs. AdaC1, AdaC3
  • 41. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING AdaC2 vs. % of Target Training Data The above plot corresponds to the Apartment-A dataset of act_rec.
  • 42. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Effect of Cost
  • 43. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Dynamic Cost-Sensitive Boosting

Act-gest
Dataset  AdaC2  DAdaC2
User 1   0.88   0.87
User 2   0.98   0.98
User 3   0.75   0.71
User 4   0.79   0.80

Act-rec
Dataset  AdaC2  DAdaC2
Apt-A    0.82   0.82
Apt-B    0.80   0.74
Apt-C    0.83   0.80
Apt-D    0.81   0.77
Apt-E    0.48   0.48
Apt-F    0.68   0.69
Apt-G    0.59   0.60

20Newsgroups
Dataset        AdaC2  DAdaC2
Rec vs. Talk   0.83   0.84
Rec vs. Sci    0.77   0.77
Sci vs. Talk   0.74   0.74
Rec vs. Comp   0.86   0.89
Comp vs. Sci   0.76   0.75
Comp vs. Talk  0.89   0.90
  • 44. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING AdaC2 vs. Multisource Transfer

Dataset      TrAdaBoost  TransferBoost  AdaC2
Apt-A        0.63        0.71           0.82
Apt-B        0.57        0.69           0.80
Apt-C        0.49        0.79           0.83
Apt-D        0.52        0.78           0.81
Apt-E        0.70        0.37           0.48
Apt-F        0.40        0.61           0.68
Apt-G        0.46        0.56           0.59
baseball     0.46        0.54           0.78
electronics  0.65        0.54           0.64
med          0.52        0.51           0.67
mideast      0.39        0.48           0.54
misc         0.47        0.53           0.51
pchardware   0.63        0.53           0.69
windowsx     0.64        0.57           0.66
  • 45. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Outline • Motivation • Transfer Learning • Instance-Weighting using Boosting • Cost-Sensitive Boosting • Results and Discussions • Conclusion : Conclusion, Thesis Summary, Future Directions, Dissemination
  • 46. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Conclusion Pros: • An extension of AdaBoost for Transfer Learning • Performs better than existing instance-transfer techniques on real-world datasets • Provides flexibility in using different relatedness measures and base classifiers • Has a good theoretical basis Cons: • May be prone to overfitting • Performance is dependent on the effectiveness of the estimated cost • Relies on a bottom-up weighting approach; does not utilize a given structure of the data
  • 47. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Summary • Cost-sensitive boosting schemes were evaluated over real-world datasets and compared against well-known algorithms. • 3 variants of cost-sensitive boosting algorithms were investigated; AdaC2 was found to be the best among the lot. • 4 different relatedness measures were evaluated; instance pruning was found to give the best results. • The effect of maintaining a dynamic cost scheme was studied. • The equivalence of AdaC2 with respect to multisource transfer learning was analyzed.
  • 48. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Future Directions • Estimating Relatedness – Does a better a priori relatedness measure exist? • Target Domain Instance Selection – How to optimally select instances from the target domain? • Discovering Structure in Datasets – How can an existing structure in the data be capitalized on? • System Integration – How best to integrate these methodologies into an application framework?
  • 49. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Dissemination • A. Venkatesan, N.C. Krishnan, and S. Panchanathan, "Cost-sensitive Boosting for Concept Drift", ECML Workshop on Handling Concept Drift in Adaptive Information Systems (HaCDAIS), Barcelona, Spain, 2010. • N.C. Krishnan, A. Venkatesan, S. Panchanathan, D. Cook, "Cost-sensitive Boosting for Transfer Learning", in preparation for submission to IEEE Transactions on Knowledge and Data Engineering.
  • 50. CENTER FOR COGNITIVE UBIQUITOUS COMPUTING Thank you. Questions?