Learning Better while Sending Less
Communication-Efficient Online Semi-Supervised Learning in Client-Server Settings
Han Xiao Technical University of Munich
Shou-De Lin National Taiwan University
Mi-Yen Yeh Academia Sinica
Phillip B. Gibbons Intel Labs
Claudia Eckert Technical University of Munich
1
Project solves online semi-supervised learning in
client-server settings
Framework overview: Client, Communication, Server
• Client. Task: generate data and send it to the server. Challenge: large volume of data, most of it unlabeled.
• Communication. Task: transmit client and server data to each other. Challenge: network bandwidth is limited.
• Server. Task: learn a classification model from the client's uploads. Challenge: the incoming data stream is only partially labeled.
Framework Goal
Design a modular framework to provide high
classification accuracy with reduced communication
and labeling costs.
2
An intelligent traffic management system involves
distributed learning
[Figure: a surveillance camera captures real-time traffic images hourly (9 am, 10 am, 11 am, ...) and sends them over the network to a server-side classifier for automatic road-condition recognition.]
Distributed learning in the client-server setting
3
An intelligent wearable device involves distributed learning
[Figure: a wearable device streams real-time sensory data from daily activities over Bluetooth to a laptop-side classifier for human activity recognition.]
Distributed learning in the client-server setting
4
Outline
• Related work
• Gap & Question
• Method
• Result
• Summary
5
Project shares characteristics of online, semi-supervised, and active learning
Online learning
• Passive aggressive [JMLR03]
• Confidence weighted [ICML06, NIPS07]
• Adaptive regularization of weights [NIPS09]
• Exact soft confidence weight [ICML12]
Semi-supervised learning
• Semi-supervised support vector machines
• Harmonic function solution [ICML03]
• SSL with max-margin graph cuts [AISTATS10]
Active learning
• Submodular functions [AAAI07, ICRA10]
Online semi-supervised learning
• Harmonic function on a quantized graph [UAI09]
• Bootstrap AROW [ACML12]
Online active learning
• Unbiased online active learning [KDD11]
Active semi-supervised learning
• Graph risk on the harmonic function solution [ICML04]
[Venn diagram of the online, active, and semi-supervised settings: regions 1-6 mark the previous work listed above; the three-way intersection (7) marks this research.]
6
Project considers three settings jointly for
communication-efficient learning
[Venn diagram repeated: the three-way intersection of the online, active, and semi-supervised settings is the gap this project targets.]
Research gap
• Data is only partially labeled
• Data comes in sequentially
• No oracle is available for providing feedback
• Need to take the bandwidth limit into account
• Need to deal with unlabeled data
• Need to learn incrementally
The project addresses all of these requirements jointly.
7
Project develops algorithms for both client and
server
[Framework diagram: the client buffers unlabeled data in a candidate pool and applies a selection policy to choose uploads; the server feeds the uploads to a two-learner model and sends an updated selection policy back to the client.]
Key steps
1 Client fills a candidate pool with unlabeled data
2 Once the pool is full, the client selects high-priority instances and uploads them to the server
3 The server receives the unlabeled data and feeds it to a two-learner model
4 The server updates the model and sends the new selection policy to the client
5 The client receives the new selection policy, clears the candidate pool, and goes back to step 1 (a sketch of this loop follows below)
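The five steps form a simple loop between client and server. Below is a minimal single-process sketch of that loop; the pool size, upload budget, toy margin-based selection rule, and toy server update are illustrative stand-ins (the actual selection policy and two-learner model are described on the next two slides), not the authors' implementation.

```python
import random

# Minimal single-process simulation of the five key steps above.
# POOL_SIZE, UPLOAD_BUDGET, the margin-based selection rule, and the
# perceptron-style server update are hypothetical stand-ins.

POOL_SIZE = 50
UPLOAD_BUDGET = 10   # 20% of the pool, matching the sampling rate used in the experiments


def select(pool, policy_weights, budget):
    """Step 2: upload the instances the current policy ranks highest
    (here: smallest |margin|, i.e. the most uncertain ones)."""
    return sorted(pool, key=lambda x: abs(sum(w * xi for w, xi in zip(policy_weights, x))))[:budget]


def server_update(weights, batch):
    """Steps 3-4: incrementally update the server-side model on the upload.
    A self-training perceptron-style update stands in for the two-learner model."""
    for x in batch:
        pred = 1.0 if sum(w * xi for w, xi in zip(weights, x)) >= 0 else -1.0
        weights = [w + 0.01 * pred * xi for w, xi in zip(weights, x)]
    return weights


def run(stream, dim):
    policy = [0.0] * dim          # selection policy held by the client
    model = [0.0] * dim           # model held by the server
    pool = []
    for x in stream:              # step 1: buffer incoming unlabeled data
        pool.append(x)
        if len(pool) == POOL_SIZE:
            upload = select(pool, policy, UPLOAD_BUDGET)   # step 2
            model = server_update(model, upload)           # steps 3-4 (server side)
            policy = list(model)                           # step 4: new policy sent back
            pool = []                                      # step 5: clear pool, repeat
    return model


if __name__ == "__main__":
    random.seed(0)
    stream = [[random.gauss(0, 1) for _ in range(5)] for _ in range(500)]
    run(stream, dim=5)
```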
8
Server employs a two-learner model to learn from the client's unlabeled data
[Framework diagram repeated from slide 8.]
Purpose
• Incrementally learn a binary classifier from unlabeled data
Requirements
• Leverage neighbor information to exploit unlabeled data
• Learn in an online fashion
• Be efficient enough to handle large volumes of data
• Be easily parameterized as a selection policy
Method
• Two-learner structure: a harmonic solution (HS) learner and a soft confidence-weighted (SCW) learner
• HS teaches its most certain instances to SCW (see the sketch below)
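The sketch below illustrates how the two learners interact: the harmonic solution pseudo-labels an uploaded batch on a similarity graph, and only the pseudo-labels it is most certain about are fed to a confidence-weighted linear learner. The Gaussian-kernel graph, the certainty threshold, and the simplified confidence-weighted update are illustrative assumptions, not the exact HS and SCW formulations of the cited papers.

```python
import numpy as np

# Sketch: harmonic solution (HS) on a similarity graph pseudo-labels the upload,
# and only the most certain pseudo-labels "teach" a confidence-weighted learner.
# Kernel bandwidth, certainty threshold, and the update rule are illustrative choices.

def harmonic_solution(X_l, y_l, X_u, gamma=1.0):
    """Harmonic function solution f_u = -L_uu^{-1} L_ul y_l on a
    Gaussian-kernel graph over labeled and unlabeled points."""
    X = np.vstack([X_l, X_u])
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    W = np.exp(-gamma * d2)
    np.fill_diagonal(W, 0.0)
    L = np.diag(W.sum(axis=1)) - W                     # graph Laplacian
    n_l = len(X_l)
    L_uu, L_ul = L[n_l:, n_l:], L[n_l:, :n_l]
    reg = 1e-8 * np.eye(len(X_u))                      # numerical safeguard
    return np.linalg.solve(L_uu + reg, -L_ul @ y_l)    # soft labels, roughly in [-1, 1]


def cw_style_update(mu, Sigma, x, y, eta=0.9):
    """Simplified confidence-weighted update: if the margin is not confident
    enough, shift the mean and shrink the variance along x (not closed-form SCW)."""
    margin = y * (mu @ x)
    v = x @ Sigma @ x
    if margin < eta:
        alpha = (eta - margin) / (v + 1e-12)
        mu = mu + alpha * y * (Sigma @ x)
        Sigma = Sigma - np.outer(Sigma @ x, Sigma @ x) / (v + 1.0)
    return mu, Sigma


def two_learner_step(mu, Sigma, X_l, y_l, X_u, certainty=0.8):
    """HS teaches SCW: only pseudo-labels with |f| above the threshold are used."""
    f_u = harmonic_solution(X_l, y_l, X_u)
    for x, f in zip(X_u, f_u):
        if abs(f) >= certainty:                        # most certain instances only
            mu, Sigma = cw_style_update(mu, Sigma, x, float(np.sign(f)))
    return mu, Sigma


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X_l = np.vstack([rng.normal(-2, 0.4, (5, 2)), rng.normal(2, 0.4, (5, 2))])
    y_l = np.array([-1.0] * 5 + [1.0] * 5)
    X_u = rng.normal(0, 2.0, (30, 2))                  # the uploaded unlabeled batch
    mu, Sigma = two_learner_step(np.zeros(2), np.eye(2), X_l, y_l, X_u)
```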
9
Client uploads only crucial data according to the
selection policy
[Framework diagram repeated from slide 8.]
Purpose
• Select a small set of data from the candidate pool for uploading
Requirements
• Uploaded data should improve the classification performance on the server
• The selection procedure should be lightweight for the client
• The selection policy should be lightweight for the network
Method
• Use the current weights of SCW to construct the selection policy
• Optimize a submodular function consisting of two criteria: uncertainty w.r.t. SCW and redundancy w.r.t. the candidate pool (a greedy sketch follows below)
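One common way to make these two criteria concrete is a monotone submodular objective maximized greedily under the upload budget: a modular uncertainty term derived from the current SCW weights plus a facility-location coverage term that penalizes redundant picks. The specific objective below (margin-based uncertainty, Gaussian similarity, trade-off weight lam) is an illustrative instantiation, not necessarily the objective used in the paper.

```python
import numpy as np

# Greedy maximization of  F(S) = sum_{x in S} uncertainty(x)
#                               + lam * sum_{v in pool} max_{x in S} sim(v, x).
# The first term prefers instances the current SCW is uncertain about; the second
# (facility-location) term rewards covering the pool, i.e. penalizes redundancy.
# Margin-based uncertainty, Gaussian similarity, lam, and gamma are illustrative.

def uncertainty(x, mu):
    """Smaller |margin| under the current SCW mean weights = more uncertain."""
    return 1.0 / (1.0 + abs(mu @ x))


def greedy_select(pool, mu, budget, lam=1.0, gamma=1.0):
    pool = [np.asarray(x, dtype=float) for x in pool]
    n = len(pool)
    sims = np.array([[np.exp(-gamma * np.sum((a - b) ** 2)) for b in pool] for a in pool])
    best_cover = np.zeros(n)          # max similarity of each pool point to the selected set
    selected = []
    for _ in range(min(budget, n)):
        gains = np.full(n, -np.inf)
        for i in range(n):
            if i in selected:
                continue
            cover_gain = (np.maximum(best_cover, sims[i]) - best_cover).sum()
            gains[i] = uncertainty(pool[i], mu) + lam * cover_gain
        i_star = int(np.argmax(gains))
        selected.append(i_star)
        best_cover = np.maximum(best_cover, sims[i_star])
    return selected                   # indices of instances to upload


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pool = rng.normal(size=(50, 5))                    # a full candidate pool
    mu = rng.normal(size=5)                            # current SCW mean sent as the policy
    upload_idx = greedy_select(pool, mu, budget=10)    # roughly a 20% sampling rate
```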
10
Experiments validated algorithms on both server
and client
Goal
• Explore a good combination of techniques for communication-efficient online semi-supervised learning
Data sets
• 10 data sets downloaded from the UCI and LibSVM websites
Evaluation
• Offline accuracy on a test set
Sessions
1 Benchmark the model on the server. Fix the labeling rate to 2%, the sampling rate to 20%, and the selection policy on the client to "rand".
2 Benchmark the selection strategy on the client. Fix the labeling rate to 2%, the sampling rate to 20%, and the server's model to the best obtained in session 1.
3 Explore how the labeling rate and sampling rate affect the overall performance. Fix the server's model to the best obtained in session 1 and the client's policy to the best obtained in session 2 (the sessions are restated as configurations below).
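For reference, the three sessions can be written down as configurations. The entries below only restate the fixed values given on this slide; the data set list and the "best of session 1/2" choices are placeholders to be filled in from the earlier sessions.

```python
# The three experiment sessions restated as configurations. Values marked
# "varies" are the quantity being benchmarked in that session; the 10 UCI /
# LibSVM data sets are not named on the slide, so they stay as a placeholder.

DATASETS = ["<10 data sets from UCI and LibSVM>"]

SESSIONS = {
    1: {  # benchmark the server-side model
        "labeling_rate": 0.02,
        "sampling_rate": 0.20,
        "client_policy": "rand",
        "server_model": "varies",            # HS+SCW+CUT, HS+SCW, SCW, KNN+SCW, KNN, ...
    },
    2: {  # benchmark the client-side selection policy
        "labeling_rate": 0.02,
        "sampling_rate": 0.20,
        "client_policy": "varies",           # submod, uncertain, rand, certain, all, ...
        "server_model": "best of session 1",
    },
    3: {  # study the effect of labeling and sampling rates
        "labeling_rate": "varies",
        "sampling_rate": "varies",
        "client_policy": "best of session 2",
        "server_model": "best of session 1",
    },
}
```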
Two-learner model effectively learns from
unlabeled data
12
Model on server, description, and accuracy averaged over 10 data sets:
• Full (92.16%): All uploaded instances are labeled by an oracle. This approach should give the best result due to the availability of full information.
• HS+SCW+CUT (86.71%): Proposed two-learner model with cutoff averaging for predicting test data.
• HS+SCW (86.38%): Proposed two-learner model on the server.
• SCW (83.73%): The server consists of an SCW model only, which "learns" each unlabeled instance using its own prediction.
• None (84.55%): No unlabeled instances are uploaded to the server. The server stops learning right after the labeled instances. This approach should give the worst performance.
• KNN+SCW (84.31%): The server uses a two-learner model with KNN followed by SCW; the prediction of KNN is used for training SCW.
• KNN (82.89%): The server employs the 5-nearest-neighbors algorithm. The training set is built by first including all labeled instances and then adding unlabeled instances with their predicted labels.
[Plot: result on one representative data set.]
Better selection policy achieves higher accuracy with the same communication budget
13
Selection policy on client, description, and accuracy averaged over 10 data sets:
• Full (92.16%): All uploaded instances are labeled by an oracle. This approach should give the best result due to the availability of full information.
• Submod (87.08%): Selection is done by optimizing a submodular function, which considers both uncertainty and redundancy.
• Uncertain (87.12%): The most uncertain instances are uploaded.
• Rand (86.38%): Instances are selected randomly for uploading.
• All (86.32%): All unlabeled instances are uploaded without selection. This incurs 5x the communication cost of the other approaches.
• Certain (82.39%): The most certain instances according to the current SCW on the server are uploaded.
• None (82.89%): The server employs the 5-nearest-neighbors algorithm; the training set is built from all labeled instances plus unlabeled instances with their predicted labels.
[Plot: result on one representative data set.]
Best combination of techniques reduces
communication cost while maintaining accuracy
14
Selection policy on client | Labeling rate (amount of human effort) | Sampling rate (amount of communication cost) | Accuracy averaged over 10 data sets
Full | 100% | 20% | 92.16%
All | 2% | 100% | 86.32%
Rand | 2% | 20% | 86.38%
Best comb. (submod) | 2% | 20% | 87.08%
[Framework diagram repeated from slide 8.]
Project establishes a framework that enables
communication-efficient learning in client-server settings
15
• Introduce a novel learning setting motivated by many big-data applications.
• Propose a framework that is modular in design, flexible, and can be practically incorporated into a variety of useful systems.
• Present novel techniques at the clients and the server that are well suited to providing high classification accuracy with reduced communication and labeling costs.
• Show that a particular combination of techniques outperforms the other approaches, and often outperforms (communication-expensive) approaches that send all the data to the server.