Active Learning
Ragib Ahsan
Committee
Prof. Xinhua Zhang (Chair)
Prof. Brian Ziebart
Prof. Jon A Solworth
Overview
● What is active learning?
● Does active learning make any difference?
● Active learning from multiple oracles
● Active learning with weak and strong oracles
● Multiple oracles with varying expertise
2
What is Active Learning?
● Introduced in education by the 1990s
● Let students participate actively
● Doing things rather than just listening
● Inspired machine learning
● Also known as Query Learning
3
Contrast to passive learning
Passive Learning Active Learning
4
Applications
● Fewer labeled data
● Speech Recognition
○ Word-level annotation can take ten times longer than the audio itself (Zhu, 2005)
● Medical Diagnosis
○ Expert doctors
● Document Classification
5
Active Learning Examples
Pool-based active learning (Settles, 2009)
6
Active Learning Examples
(a) Toy dataset of two Gaussians; (b) logistic regression trained on randomly sampled labeled instances reaches 70% accuracy; (c) logistic regression with active querying reaches 90% accuracy (Settles, 2009)
7
Human Active Learning
[Source: JSLHR]
8
Does AL make any difference?
“Learners do benefit from the opportunity to actively select examples during learning. But it is very difficult to assess the magnitude of difference that active learning makes compared to passive learning”
Laughlin (1973)
There were conflicting claims
throughout the literature on
the effectiveness of active
learning
9
Does AL make any difference?
“People make inappropriate queries to assess simple logical hypotheses such as if p then q (frequently examining q instances to see if they are p, and failing to explore not-q instances)”
Wason et al. (1972)
“If the learning task is properly construed, humans actually do a great job in asking questions”
Gigerenzer et al.(2002)
Oaksford et al. (2007)
10
Does AL make any difference?
Castro et al. (2008) addressed these questions:
[Q1] Do humans perform better when they can select their own examples for labeling,
compared to passive observation of labeled examples?
[Q2] If so, do they achieve the full benefit of active learning suggested by statistical
learning theory?
[Q3] If they do not, can machine learning be used to enhance human performance?
[Q4] Do the answers to these questions vary depending upon the difficulty of the
learning problem?
11
Task Formulation
● Binary classification on the interval [0, 1]
● Unknown decision boundary
● Classes 0 and 1
● n samples
● X_i ∈ [0, 1], Y_i ∈ {0, 1}
● Y_i is correct with probability 1 − ε
● 0 ≤ ε < 1/2
12
[Source: Castro et al. (2008)]
Error bound (ε = 0)
● Passive Learning
○ Random sampling
○ Error: O(1/n)
● Active Learning
○ Binary search
○ Error: O(2^{-n})
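The gap between the two rates can be seen in a minimal sketch of the noiseless threshold task. The function names, the boundary value `theta`, and the convention that class 1 lies at or to the right of the boundary are assumptions made for illustration, not part of the original slides.

```python
import random

def label(x, theta):
    """Noiseless oracle (assumed convention): class 1 at or right of theta."""
    return 1 if x >= theta else 0

def passive_estimate(theta, n, rng):
    """Label n uniformly random points, then return the midpoint of the gap
    between the largest 0-labeled point and the smallest 1-labeled point."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()
        if label(x, theta) == 1:
            hi = min(hi, x)
        else:
            lo = max(lo, x)
    return (lo + hi) / 2

def active_estimate(theta, n):
    """Binary search: every query halves the interval that contains theta."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        mid = (lo + hi) / 2
        if label(mid, theta) == 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2
```

After n queries the binary search has narrowed the boundary to an interval of width 2^{-n}, whereas the random sampler leaves a gap whose expected width shrinks only on the order of 1/n.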
13
[Source: Castro et al. (2008)]
Error bound (ε > 0)
● Passive learning
● Active learning
[ Maximum Likelihood Estimate ]
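As a rough illustration of the maximum likelihood estimate under label noise: if each label is flipped independently with probability ε < 1/2, the likelihood of a candidate threshold is (1 − ε)^agreements · ε^disagreements, so maximizing it amounts to minimizing the number of disagreements. The sketch below does this for passively sampled data; the function names and the class-1-on-the-right convention are assumptions, and it is not taken from Castro et al. (2008).

```python
import random

def noisy_label(x, theta, eps, rng):
    """Return the clean label of x, flipped with probability eps."""
    clean = 1 if x >= theta else 0
    return clean if rng.random() >= eps else 1 - clean

def mle_threshold(xs, ys):
    """Return a threshold with the fewest label disagreements.
    For eps < 1/2 this maximizes (1 - eps)^agree * eps^disagree."""
    pairs = sorted(zip(xs, ys))
    # Threshold at 0: every point is predicted 1, so each 0 label is an error.
    errors = sum(1 for _, y in pairs if y == 0)
    best_errors, best_t = errors, 0.0
    for i, (x, y) in enumerate(pairs):
        # Move the threshold just past x, so x is now predicted 0.
        errors += 1 if y == 1 else -1
        if errors < best_errors:
            nxt = pairs[i + 1][0] if i + 1 < len(pairs) else 1.0
            best_errors, best_t = errors, (x + nxt) / 2
    return best_t
```

Ties between equally good thresholds are broken in favor of the leftmost one, which is an arbitrary choice of this sketch.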
14
[Source: Castro et al. (2008)]
Experiment
A few 3D visual stimuli and their X values used in the experiment.
Participants were asked to guess the decision boundary after every three iterations
15
Experiment
● Random
○ No queries
● Human Active
○ Active queries
● Machine Yoked
○ Machine makes query
○ Human observes
16
Results
[Figure: results plotted against iteration n]
17
[Source: Castro et al. (2008)]
Answers
[Q1] Do humans perform better when they can select their own examples for labeling, compared to passive observation of labeled examples? Yes, at low noise levels
[Q2] If so, do they achieve the full benefit of active learning suggested by statistical learning theory? No, slower decay constants
[Q3] If they do not, can machine learning be used to enhance human performance? Inconclusive
[Q4] Do the answers to these questions vary depending upon the difficulty of the learning problem? Yes, with noise levels
18
Conclusion
● Simple learning task
● Machine Yoked Learning
● Impact on:
○ Fields of psychology and cognitive sciences
○ Intelligent tutoring systems
19
AL from multiple oracles
20
Why multiple oracles?
21
Multiple Oracle: Challenges
● How to select the most informative query?
● How to select the best oracle to ask questions?
● How to deal with disagreement among the
oracles?
● How to deal with a noisy or weak oracle?
22
Weak and strong labeler
● Zhang et al. (2015) considered exactly two oracles
● One standard oracle
○ Accurate but costly
● One weak oracle
○ Noisy but cheap
● Goal
○ Reduce number of queries to standard oracle
○ No impact on accuracy
23
Observations
● Difference Classifier to predict disagreement between
strong and weak labeler
○ Might not be statistically consistent
○ Can use cost-sensitive difference classifier
● Active learning queries a localized region of space
○ Train difference classifier on that localized region
24
Disagreement Based Active Learning (DBAL)
[Figure: version space Vt containing hypotheses h1, ..., h7 and h*; input space X containing points x1, ..., x8]
h1(x1) = h2(x1) = h3(x1) = h4(x1) = h5(x1)
h1(x3) ≠ h2(x3) = h3(x3) = h4(x3) = h5(x3)
h1(x4) = h2(x4) = h3(x4) = h4(x4) = h5(x4)
Query x3 to oracle O, then update
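The query rule in the figure can be sketched for a finite hypothesis class. This is a minimal CAL-style loop, assuming threshold classifiers on [0, 1] and a noiseless oracle whose true boundary belongs to the class; the names `predict` and `cal` are illustrative only.

```python
def predict(t, x):
    """Threshold hypothesis: class 1 at or to the right of t."""
    return 1 if x >= t else 0

def cal(points, oracle, thresholds):
    """Query the oracle only where surviving hypotheses disagree."""
    version_space = list(thresholds)
    queried = []
    for x in points:
        if len({predict(t, x) for t in version_space}) > 1:
            # x lies in the disagreement region: ask for its label
            y = oracle(x)
            queried.append(x)
            # keep only the hypotheses consistent with the answer
            version_space = [t for t in version_space if predict(t, x) == y]
        # otherwise every surviving hypothesis agrees and no query is spent
    return version_space, queried
```

Points on which all hypotheses agree, like x1 and x4 in the figure, cost no query; a point like x3 triggers a query and shrinks the version space.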
25
Problem Formulation
● Unlabeled Distribution, U
● Input space, X
● Label space, Y
● Hypothesis class, H
● Data distribution, D
● Excess error
● Goal: reach a target excess error with as few queries to O as possible
● Strong oracle O and weak oracle W
26
Algorithm
● Three key ideas
○ Difference classifier
○ Disagreement region DIS(V)
■ Region of the input space
where two member
classifiers disagree
○ Epoch based agnostic CAL
■ Train fresh difference
classifier in each epoch
27
[Source: Theory of Active Learning
(Steve Hanneke, 2014)]
Algorithm
● Initialize error ε_0, total number of epochs k_0, and draw n_0 examples to form labeled dataset S_0
● In each iteration, up to k’ iterations:
○ Set target error
○ Draw n_k unlabeled samples
○ Identify disagreement region A_k
○ Train difference classifier h_df with A_k, O, W
○ Active learning using h_df
■ Draw m_k examples, use h_df, and query either O or W. Add the labeled data to S_k
● Return a classifier learned from the labeled dataset S_k’
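The routing step can be illustrated with a heavily simplified sketch. It is not the algorithm of Zhang et al. (2015): it drops the epochs and the disagreement region, works in one dimension, and uses two hypothetical oracles `strong` and `weak` that differ only on [0.4, 0.6). The difference classifier here is just the smallest interval covering every observed disagreement.

```python
def strong(x):
    """Hypothetical strong oracle O with true boundary 0.5."""
    return 1 if x >= 0.5 else 0

def weak(x):
    """Hypothetical weak oracle W: wrong only on [0.4, 0.6)."""
    return 1 - strong(x) if 0.4 <= x < 0.6 else strong(x)

def train_difference_classifier(sample):
    """Fit an interval covering every sample point where O and W disagree.
    Training itself queries both oracles on the sample."""
    diffs = [x for x in sample if strong(x) != weak(x)]
    if not diffs:
        return lambda x: False
    lo, hi = min(diffs), max(diffs)
    return lambda x: lo <= x <= hi

def label_with_routing(points, h_df):
    """Ask O where h_df predicts disagreement, otherwise ask the cheaper W."""
    labeled, strong_queries = [], 0
    for x in points:
        if h_df(x):
            labeled.append((x, strong(x)))
            strong_queries += 1
        else:
            labeled.append((x, weak(x)))
    return labeled, strong_queries
```

A difference classifier that misses part of the true disagreement region would let wrong weak labels through, which is one reason a cost-sensitive difference classifier is attractive.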
28
Performance Guarantee
● First term for learning, second for training difference classifier
● Second term is lower order term when d ≈ d’
● Fitting the difference classifier does not incur a high overhead
29
AL from crowds
30
AL from crowds
● Multiple experts in supervised learning (Raykar et al.,
2009 and Yan et al., 2010)
● NLP tasks from AMT data (Snow et al., 2008)
● Yan et al. (2011) proposed a novel method for active learning
● Focus:
○ Most informative query
○ Most useful annotator
31
Proposed Model
32
[Source: Yan et al. (2011)]
Proposed Model
(3.3)
33
Algorithm
● Two key steps
○ Select a sample to label next
○ Select the best annotator to label
● Select sample
○ Uncertainty sampling
■ Select the sample about which the classifier is least certain
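Uncertainty sampling for a linear classifier can be sketched as picking the pool point closest to the separating hyperplane. The hyperplane form w·x + b = 0 and the names `margin` and `most_uncertain` are assumptions for illustration; with a logistic model this is the same as picking the point whose predicted probability is closest to 0.5.

```python
def margin(x, weights, bias):
    """Signed score w.x + b; it is zero on the separating hyperplane."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias

def most_uncertain(pool, weights, bias):
    """Return the index of the pool point with the smallest absolute score."""
    return min(range(len(pool)),
               key=lambda i: abs(margin(pool[i], weights, bias)))
```

Because the weights are shared by all pool points, ranking by the absolute score gives the same order as ranking by geometric distance to the hyperplane.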
34
Algorithm: Select Sample
Where, and (ᾶ > 0)
Separating hyperplane:
35
Algorithm: Select Annotator
(3.6)
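One way to picture annotator selection is to give each annotator an estimated noise level that depends on the sample, and to ask whoever is expected to be least noisy there. The logistic noise model, its parameters `w` and `gamma`, and both function names below are hypothetical and stand in for the selection rule of Yan et al. (2011), which is not reproduced here.

```python
import math

def estimated_noise(x, w, gamma):
    """Hypothetical per-annotator noise model: logistic in the features."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + gamma
    return 1.0 / (1.0 + math.exp(-z))

def select_annotator(x, annotators):
    """annotators maps a name to (w, gamma); return the least noisy at x."""
    return min(annotators,
               key=lambda name: estimated_noise(x, *annotators[name]))
```

With annotators whose noise grows in opposite directions of feature space, the selected annotator changes with the location of the queried sample, which is the sense in which expertise varies across the input space.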
36
Algorithm
37
Experiment
(left) Labels, (center) Areas of Labeler expertise and (right) annotator selection information for the
simplified two dimensional Galaxy Dim Data (Yan et al., 2011)
38
Experiment: Baselines
● active learning+majority vote
○ Active query based on majority vote of all annotators
● random sample+multi-labeler
○ Multi labeler algorithm on randomly sampled
examples
● random sample+majority vote
○ Random sampling with majority vote
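The majority-vote baselines aggregate the annotators' answers without modeling who is reliable where. A minimal sketch follows, with the helper names assumed for illustration; ties go to the label that appears first.

```python
from collections import Counter

def majority_vote(labels):
    """Return the most common label among the annotators' answers."""
    return Counter(labels).most_common(1)[0][0]

def aggregate(annotations):
    """annotations holds one list of annotator labels per sample."""
    return [majority_vote(labels) for labels in annotations]
```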
39
Experimental Result
Accuracy comparisons on text data for the polarity, focus and the evidence labelings (Yan et al., 2011)
40
More Analyses
● Decision boundary intersects all regions of expertise
● Comparison with single oracle
AL
● Specialized vs General
expertise
41
[Source: Yan et al. (2011)]
Future Direction
● More Applications
○ Real world problems
● Optimal number of oracles
○ Do multiple oracles always perform better than a single oracle?
○ Is there an optimal number of oracles that works best?
● Cost function associated with labeling
○ Choose single vs multiple oracles
● General expertise
○ Each of the multiple oracles has general expertise
42
References
● Castro, Rui M. et al. (2008). “Human Active Learning”. In: NIPS.
● Gigerenzer, Gerd and Reinhard Selten (2002). Bounded rationality: The
adaptive toolbox. MIT press.
● Laughlin, Patrick R. (1973). “Focusing strategy in concept attainment as a
function of instructions and task”. In: Journal of Experimental Psychology.
● Oaksford, Mike and Nick Chater (2007). Bayesian rationality: The
probabilistic approach to human reasoning. Oxford University Press.
● Raykar, Vikas C. et al. (2009). “Supervised learning from multiple experts:
whom to trust when everyone lies a bit”. In: ICML.
● Settles, Burr (2009). Active Learning Literature Survey. Computer Sciences
Technical Report 1648. University of Wisconsin–Madison.
43
References
● Snow, Rion et al. (2008). “Cheap and Fast - But is it Good? Evaluating
Non-Expert Annotations for Natural Language Tasks”. In: EMNLP.
● Wason, Peter Cathcart and Philip N Johnson-Laird (1972). Psychology of
reasoning: Structure and content. Vol. 86. Harvard University Press.
● Yan, Yan et al. (2010). “Modeling annotator expertise: Learning when
everybody knows a bit of something”. In: AISTATS.
● Yan, Yan et al. (2011). “Active Learning from Crowds”. In: ICML.
● Zhang, Chicheng and Kamalika Chaudhuri (2015). “Active Learning from
Weak and Strong Labelers”. In: NIPS.
● Zhu, Xiaojin (2005). “Semi-supervised Learning with Graphs”. AAI3179046.
PhD thesis. Pittsburgh, PA, USA
● Hanneke, Steve (2014). “Theory of Active Learning”
44
Questions?
45
Appendix: HAL Results
46
Appendix: WeakStrong Algorithm
47
Appendix: WeakStrong Algorithm
48
Appendix: WeakStrong Performance Guarantee
49
Appendix: Agnostic CAL
50

Guidance_and_Counselling.pdf B.Ed. 4th SemesterGuidance_and_Counselling.pdf B.Ed. 4th Semester
Guidance_and_Counselling.pdf B.Ed. 4th Semester
 
Unit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdfUnit 2- Research Aptitude (UGC NET Paper I).pdf
Unit 2- Research Aptitude (UGC NET Paper I).pdf
 
Acetabularia Information For Class 9 .docx
Acetabularia Information For Class 9  .docxAcetabularia Information For Class 9  .docx
Acetabularia Information For Class 9 .docx
 
Palestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptxPalestine last event orientationfvgnh .pptx
Palestine last event orientationfvgnh .pptx
 
A Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in EducationA Strategic Approach: GenAI in Education
A Strategic Approach: GenAI in Education
 
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCECLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
CLASS 11 CBSE B.St Project AIDS TO TRADE - INSURANCE
 

Active learning

  • 1. Active Learning Ragib Ahsan Committee Prof. Xinhua Zhang (Chair) Prof. Brian Ziebart Prof. Jon A Solworth
  • 2. Overview ● What is active learning? ● Does active learning make any difference? ● Active learning from multiple oracles ● Active learning with weak and strong oracles ● Multiple oracles with varying expertise
  • 3. What is Active Learning? ● Introduced in education by the 1990s ● Let students participate actively ● Doing things rather than just listening ● Inspired machine learning ● Also known as Query Learning
  • 4. Contrast to passive learning Passive Learning Active Learning
  • 5. Applications ● Less labeled data ● Speech Recognition ○ Word-level annotation can take ten times longer than the actual audio (Zhu, 2005) ● Medical Diagnosis ○ Expert doctors ● Document Classification
  • 6. Active Learning Examples Pool-based active learning (Settles, 2009)
  • 7. Active Learning Examples a) Toy dataset, two Gaussians b) logistic regression model produces 70% accuracy c) logistic regression with active querying produces 90% accuracy (Settles, 2009)
  • 9. Does AL make any difference? “Learners do benefit from the opportunity to actively select examples during learning. But it is very difficult to assess the magnitude of difference that active learning makes compared to passive learning” Laughlin (1973) There were conflicting claims throughout the literature on the effectiveness of active learning
  • 10. Does AL make any difference? “People make inappropriate queries to assess simple logical hypotheses such as if p then q (frequently examining q instances to see if they are p, and failing to explore not-q instances)” Wason et al. (1972) “If the learning task is properly construed, humans actually do a great job in asking questions” Gigerenzer et al. (2002), Oaksford et al. (2007)
  • 11. Does AL make any difference? Castro et al. (2008) addressed these questions: [Q1] Do humans perform better when they can select their own examples for labeling, compared to passive observation of labeled examples? [Q2] If so, do they achieve the full benefit of active learning suggested by statistical learning theory? [Q3] If they do not, can machine learning be used to enhance human performance? [Q4] Do the answers to these questions vary depending upon the difficulty of the learning problem?
  • 12. Task Formulation ● Binary classification on the interval [0, 1] ● Unknown decision boundary ● Classes 0 and 1 ● n samples ● X_i ∈ [0, 1], Y_i ∈ {0, 1} ● Y_i is correct with probability 1 − ε ● 0 ≤ ε < 1/2 [Source: Castro et al. (2008)]
  • 13. Error bound (ε = 0) ● Passive Learning ○ Random sampling ○ Error: O(1/n) ● Active Learning ○ Binary search ○ Error: O(2^(−n)) [Source: Castro et al. (2008)]
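
The gap between the two noiseless rates on slide 13 can be checked with a small simulation. The following is a minimal Python sketch, not code from Castro et al. (2008): it assumes a threshold classifier on [0, 1] that outputs class 1 at or above the boundary, and the function names (`passive_estimate`, `active_estimate`, `mean_errors`) are illustrative.

```python
import random


def label(x, theta):
    """Noiseless oracle (epsilon = 0): class 1 at or above the boundary, else class 0."""
    return 1 if x >= theta else 0


def passive_estimate(theta, n, rng):
    """Label n uniformly random points and return the midpoint of the tightest bracket."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        x = rng.random()
        if label(x, theta) == 1:
            hi = min(hi, x)  # boundary is at or below x
        else:
            lo = max(lo, x)  # boundary is above x
    return (lo + hi) / 2


def active_estimate(theta, n):
    """Binary search: each query halves the interval that must contain the boundary."""
    lo, hi = 0.0, 1.0
    for _ in range(n):
        mid = (lo + hi) / 2
        if label(mid, theta) == 1:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


def mean_errors(n, trials=2000, seed=0):
    """Average absolute estimation error of both strategies over random boundaries."""
    rng = random.Random(seed)
    passive = active = 0.0
    for _ in range(trials):
        theta = rng.random()
        passive += abs(passive_estimate(theta, n, rng) - theta)
        active += abs(active_estimate(theta, n) - theta)
    return passive / trials, active / trials


if __name__ == "__main__":
    for n in (5, 10, 20):
        p, a = mean_errors(n)
        print(f"n={n:2d}  passive={p:.5f}  active={a:.7f}")
```

After n binary-search queries the bracket has width 2^(−n), so the midpoint is within 2^(−(n+1)) of the boundary, while the passive bracket shrinks only on the order of 1/n.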
  • 14. Error bound (ε > 0) ● Passive learning ● Active learning [ Maximum Likelihood Estimate ] [Source: Castro et al. (2008)]
  • 15. Experiment A few 3D visual stimuli and their X values used in the experiment. Participants were asked to guess the decision boundary after every three iterations
  • 16. Experiment ● Random ○ No queries ● Human Active ○ Active queries ● Machine Yoked ○ Machine makes query ○ Human observes
  • 18. Answers [Q1] Do humans perform better when they can select their own examples for labeling, compared to passive observation of labeled examples? - Yes, at low noise levels [Q2] If so, do they achieve the full benefit of active learning suggested by statistical learning theory? - No, slower decay constants [Q3] If they do not, can machine learning be used to enhance human performance? - Inconclusive [Q4] Do the answers to these questions vary depending upon the difficulty of the learning problem? - Yes, with noise levels
  • 19. Conclusion ● Simple learning task ● Machine Yoked Learning ● Impact on: ○ Fields of psychology and cognitive sciences ○ Intelligent tutoring systems
  • 20. AL from multiple oracles
  • 22. Multiple Oracle: Challenges ● How to select the most informative query? ● How to select the best oracle to ask questions? ● How to deal with disagreement among the oracles? ● How to deal with a noisy or weak oracle?
  • 23. Weak and strong labeler ● Zhang et al. (2015) considered exactly two oracles ● One standard oracle ○ Accurate but costly ● One weak oracle ○ Noisy but cheap ● Goal ○ Reduce number of queries to standard oracle ○ No impact on accuracy
  • 24. Observations ● Difference Classifier to predict disagreement between strong and weak labeler ○ Might not be statistically consistent ○ Can use cost-sensitive difference classifier ● Active learning queries a localized region of space ○ Train difference classifier on that localized region
  • 25. Disagreement Based Active Learning (DBAL) [Figure: version space V_t containing hypotheses h1 to h7 and the target h*, over input space X with points x1 to x8. The listed hypotheses agree on x1 and on x4, while h1(x3) ≠ h2(x3) = h3(x3) = h4(x3) = h5(x3), so x3 is queried to the oracle O and the version space is updated.]
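
The query rule illustrated on slide 25 can be sketched in a few lines of Python. This is a simplified, noiseless illustration under stated assumptions, not the algorithm from any of the cited papers: the hypothesis class is a finite grid of thresholds on [0, 1], the oracle is consistent with one of them, and all names (`predict`, `in_disagreement_region`, `cal`, `run_cal_demo`) are hypothetical.

```python
import random


def predict(threshold, x):
    """Threshold hypothesis: class 1 at or above the threshold, else class 0."""
    return 1 if x >= threshold else 0


def in_disagreement_region(version_space, x):
    """True if at least two surviving hypotheses label x differently."""
    return len({predict(h, x) for h in version_space}) > 1


def cal(stream, oracle, hypotheses):
    """Query only points in the disagreement region and prune inconsistent hypotheses."""
    version_space = list(hypotheses)
    queries = 0
    for x in stream:
        if not in_disagreement_region(version_space, x):
            continue  # all surviving hypotheses agree, so no query is spent
        y = oracle(x)
        queries += 1
        version_space = [h for h in version_space if predict(h, x) == y]
    return version_space, queries


def run_cal_demo(seed=0, n_points=500):
    """Run the sketch on random points with a target taken from the hypothesis grid."""
    rng = random.Random(seed)
    hypotheses = [i / 100 for i in range(101)]
    h_star = hypotheses[37]  # assumed target hypothesis
    stream = [rng.random() for _ in range(n_points)]
    version_space, queries = cal(stream, lambda x: predict(h_star, x), hypotheses)
    return h_star, version_space, queries


if __name__ == "__main__":
    h_star, version_space, queries = run_cal_demo()
    print(f"queries used: {queries}, surviving hypotheses: {len(version_space)}")
```

Because the oracle is noiseless and the target is in the class, the target is never pruned; points outside the shrinking disagreement region are skipped, which is where the label savings come from.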
  • 26. Problem Formulation ● Unlabeled distribution, U ● Input space, X ● Label space, Y ● Hypothesis class, H ● Data distribution, D ● Excess error ● Goal: reach the target excess error with as few queries to O as possible ● Oracles: strong oracle O, weak oracle W
  • 27. Algorithm ● Three key ideas ○ Difference classifier ○ Disagreement region DIS(V) ■ Region of the input space where at least two classifiers in V disagree ○ Epoch-based agnostic CAL ■ Train a fresh difference classifier in each epoch [Source: Theory of Active Learning (Steve Hanneke, 2014)]
  • 28. Algorithm ● Initialize error ε_0, total number of epochs k_0, and draw n_0 examples to form the labeled dataset S_0 ● In each iteration, up to k’ iterations: ○ Set target error ○ Draw n_k unlabeled samples ○ Identify disagreement region A_k ○ Train difference classifier h_df with A_k, O, W ○ Active learning using h_df ■ Draw m_k examples, use h_df to decide whether to query O or W, and add the labeled data to S_k ● Return a classifier learned from the labeled dataset S_k’
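
The routing step on slide 28 (use h_df to decide whether to ask O or W) can be illustrated with a toy example. The sketch below is a heavily simplified, single-epoch illustration and not the algorithm of Zhang and Chaudhuri (2015): the one-dimensional oracles, the band in which W errs, the interval-shaped difference classifier, and all function names are assumptions made for the example.

```python
import random


def strong_oracle(x):
    """O: accurate but costly. Assumed ground truth: threshold at 0.5."""
    return 1 if x >= 0.5 else 0


def weak_oracle(x):
    """W: cheap, but by assumption wrong inside the band [0.4, 0.6]."""
    y = strong_oracle(x)
    return 1 - y if 0.4 <= x <= 0.6 else y


def train_difference_classifier(sample):
    """Fit an interval covering every sampled point where W and O disagree.

    Note: this step queries both oracles on the whole sample, which has a cost.
    """
    disagree = [x for x in sample if strong_oracle(x) != weak_oracle(x)]
    if not disagree:
        return lambda x: False
    lo, hi = min(disagree), max(disagree)
    return lambda x: lo <= x <= hi


def label_with_routing(points, h_df):
    """Ask O where h_df predicts disagreement, otherwise accept the label from W."""
    labeled, strong_queries = [], 0
    for x in points:
        if h_df(x):
            labeled.append((x, strong_oracle(x)))
            strong_queries += 1
        else:
            labeled.append((x, weak_oracle(x)))
    return labeled, strong_queries


def run_routing_demo(seed=0):
    """Return (queries to O, number of points, fraction of wrong labels)."""
    rng = random.Random(seed)
    h_df = train_difference_classifier([rng.random() for _ in range(200)])
    points = [rng.random() for _ in range(1000)]
    labeled, strong_queries = label_with_routing(points, h_df)
    errors = sum(1 for x, y in labeled if y != strong_oracle(x))
    return strong_queries, len(points), errors / len(points)


if __name__ == "__main__":
    q, n, err = run_routing_demo()
    print(f"queries to O: {q} of {n}, label error rate: {err:.3f}")
```

In this toy setting only the points that fall in the predicted disagreement interval are sent to O; the residual label errors come from the edges of the band that the fitted interval fails to cover.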
  • 29. Performance Guarantee ● First term for learning, second for training the difference classifier ● Second term is a lower-order term when d ≈ d’ ● Fitting the difference classifier does not incur a high overhead
  • 31. AL from crowds ● Multiple experts in supervised learning (Raykar et al., 2009; Yan et al., 2010) ● NLP tasks from AMT data (Snow et al., 2008) ● Yan et al. (2011) proposed a novel active learning method ● Focus: ○ Most informative query ○ Most useful annotator
  • 34. Algorithm ● Two key steps ○ Select a sample to label next ○ Select the best annotator to label it ● Select sample ○ Uncertainty sampling ■ Select the sample about which the classifier is least certain
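
Uncertainty sampling as described on slide 34 can be sketched as follows. This is a generic illustration that assumes a logistic model with given weights; it is not necessarily the exact selection criterion of Yan et al. (2011), and the names `predict_proba` and `most_uncertain` are hypothetical.

```python
import math


def predict_proba(weights, bias, x):
    """Probability of class 1 under a logistic model with the given parameters."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1.0 / (1.0 + math.exp(-z))


def most_uncertain(pool, weights, bias):
    """Index of the pool point whose predicted probability is closest to 0.5."""
    return min(
        range(len(pool)),
        key=lambda i: abs(predict_proba(weights, bias, pool[i]) - 0.5),
    )


if __name__ == "__main__":
    pool = [(2.0, 1.0), (0.1, -0.1), (-3.0, 0.5)]
    idx = most_uncertain(pool, weights=(1.0, 1.0), bias=0.0)
    print("query next:", pool[idx])
```

For a logistic model, a probability closest to 0.5 is the same as the point lying closest to the separating hyperplane, so the rule selects the sample nearest the current decision boundary.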
  • 35. Algorithm: Select Sample Where, and (ᾶ > 0) Separating hyperplane:
  • 38. Experiment (left) Labels, (center) areas of labeler expertise, and (right) annotator selection information for the simplified two-dimensional Galaxy Dim data (Yan et al., 2011)
  • 39. Experiment: Baselines ● active learning+majority vote ○ Active query based on majority vote of all annotators ● random sample+multi-labeler ○ Multi-labeler algorithm on randomly sampled examples ● random sample+majority vote ○ Random sampling with majority vote
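
The majority-vote aggregation used by two of the baselines on slide 39 can be sketched as below. The tie-breaking rule (prefer the smallest label) is an assumption made for the example; the slides do not say how ties were handled, and the function name is hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the smallest label (an assumption)."""
    counts = Counter(labels)
    top = max(counts.values())
    return min(label for label, count in counts.items() if count == top)


if __name__ == "__main__":
    # Three annotators label the same sample; two of them say class 1.
    print(majority_vote([1, 0, 1]))
```

Unlike the multi-labeler model, this aggregation weights every annotator equally regardless of where their expertise lies.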
  • 40. Experimental Result Accuracy comparisons on text data for the polarity, focus, and evidence labelings (Yan et al., 2011)
  • 41. More Analyses ● Decision boundary intersects all regions of expertise ● Comparison with single-oracle AL ● Specialized vs general expertise [Source: Yan et al. (2011)]
  • 42. Future Direction ● More applications ○ Real-world problems ● Optimal number of oracles ○ Do multiple oracles always perform better than a single oracle? ○ Is there an optimal number of oracles that works best? ● Cost function associated with labeling ○ Choose single vs multiple oracles ● General expertise ○ Each of the multiple oracles has general expertise
  • 43. References ● Castro, Rui M. et al. (2008). “Human Active Learning”. In: NIPS. ● Gigerenzer, Gerd and Reinhard Selten (2002). Bounded rationality: The adaptive toolbox. MIT Press. ● Laughlin, Patrick R. (1973). “Focusing strategy in concept attainment as a function of instructions and task”. In: Journal of Experimental Psychology. ● Oaksford, Mike and Nick Chater (2007). Bayesian rationality: The probabilistic approach to human reasoning. Oxford University Press. ● Raykar, Vikas C. et al. (2009). “Supervised learning from multiple experts: whom to trust when everyone lies a bit”. In: ICML. ● Settles, Burr (2009). Active Learning Literature Survey. Computer Sciences Technical Report 1648. University of Wisconsin–Madison.
  • 44. References ● Snow, Rion et al. (2008). “Cheap and Fast - But is it Good? Evaluating Non-Expert Annotations for Natural Language Tasks”. In: EMNLP. ● Wason, Peter Cathcart and Philip N. Johnson-Laird (1972). Psychology of reasoning: Structure and content. Vol. 86. Harvard University Press. ● Yan, Yan et al. (2010). “Modeling annotator expertise: Learning when everybody knows a bit of something”. In: AISTATS. ● Yan, Yan et al. (2011). “Active Learning from Crowds”. In: ICML. ● Zhang, Chicheng and Kamalika Chaudhuri (2015). “Active Learning from Weak and Strong Labelers”. In: NIPS. ● Zhu, Xiaojin (2005). “Semi-supervised Learning with Graphs”. AAI3179046. PhD thesis. Pittsburgh, PA, USA. ● Hanneke, Steve (2014). “Theory of Active Learning”.