Searn Algorithm for
Structured Prediction
Presented by Supun Abeysinghe
Outline
● What is Structured Prediction
● Approaches to Structured Prediction
● Idea of Search-Based Structured Prediction
● Background Information for SEARN
● SEARN Algorithm
● Comparison with other approaches
Structured Prediction
What is Structured Prediction?
● Informally, structured prediction is the process of capturing the structure
inside a given input.
● The main difference from other machine learning problems is that structured
prediction problems usually have a complex output (a sequence, tree or graph
rather than a single label).
POS Tagging
She sells seashells on the seashore
PRP VBZ NNS IN DT NN
Chunking
Constituency Parsing
Dependency Parsing
And a lot more...
Approaches for Structured
Prediction
Different Approaches to SP
● Structured Perceptron
○ A direct extension of the averaged perceptron from binary classification to SP
● Incremental Perceptron
○ Also a search-based approach.
● Maximum Entropy Markov Models
○ Similar to logistic regression in binary classification
● Conditional Random Fields
○ Solve the label bias problem of Maximum Entropy Markov Models
● Maximum Margin Markov Networks
● SVM for Interdependent and Structured Outputs (SVMstruct)
Search-based Structured
Prediction
Traditional Approach to SP
[Diagram: input Xn → model (W: parameters, F: features) → all possible outputs]
● There is a model that can generate all the possible outputs for a given
input.
● Based on the input features, the model parameters assign a score to each of
those outputs.
Training
[Diagram: training pair (Xn, Yn) → model (W: parameters) → all possible outputs; Yn receives the highest score]
● Model parameters are trained such that the correct output for each training
example will have the highest score.
Decoding
[Diagram: input Xn → model (W: parameters) → all possible outputs → search for the output with the highest score]
● In the decoding phase, the input is run through the model and then all the
outputs are searched to find the one with the highest score.
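The decoding step can be sketched as a brute-force argmax over every possible output. This is only a minimal illustration: the three-tag tagset, the hand-made weights `EMIT` and `TRANS`, and the functions `score` and `decode` are all assumptions introduced here, not part of the slides.

```python
from itertools import product

TAGS = ["PRP", "VBZ", "NNS"]  # toy tagset (assumption)

# Hypothetical parameters W: emission (word, tag) and transition (prev_tag, tag).
EMIT = {("she", "PRP"): 2.0, ("sells", "VBZ"): 2.0, ("seashells", "NNS"): 2.0}
TRANS = {("PRP", "VBZ"): 1.0, ("VBZ", "NNS"): 1.0}


def score(words, tags):
    """Linear score: sum of the weights of all features F that fire."""
    total = sum(EMIT.get((w, t), 0.0) for w, t in zip(words, tags))
    total += sum(TRANS.get(pair, 0.0) for pair in zip(tags, tags[1:]))
    return total


def decode(words):
    """Enumerate every tag sequence and return the highest-scoring one."""
    candidates = product(TAGS, repeat=len(words))  # len(TAGS) ** len(words) outputs
    return max(candidates, key=lambda tags: score(words, tags))


best = decode(["she", "sells", "seashells"])
```

The number of candidates grows exponentially with the sentence length, which is why exhaustive decoding is only feasible for toy inputs.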
Role of Search
● Search retrieves the output with the highest score from the search space.
● Almost all SP approaches need a search component.
● In most cases, searching through the whole space is intractable.
○ Assumptions about the output are made so that dynamic programming can be applied
○ Approximate methods such as beam search, greedy search or other heuristic
search methods are used
● Search can be seen as a sequence of decisions taken to reach the best
output.
Search-based SP
● The search phase and the model are combined.
● Rather than searching after the model, learn how to search.
● Each decision made during the search is treated as a large
classification problem.
● Each search decision builds the output incrementally.
● The goal is to train these classifiers to build an optimal output.
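The idea of building the output one decision at a time can be sketched as a greedy loop around a classifier. The function names `greedy_decode` and `toy_policy` are hypothetical, and the hand-written rules in `toy_policy` merely stand in for a learned classifier.

```python
def greedy_decode(words, policy):
    """Build the output left to right; each step is one classification."""
    state = []  # partial output, i.e. the current search state
    for _ in words:
        action = policy(words, state)  # the classifier picks the next label
        state.append(action)
    return state


def toy_policy(words, state):
    """Hypothetical hand-written classifier standing in for a learned one."""
    word = words[len(state)]
    if not state:
        return "PRP"
    if state[-1] == "PRP":
        return "VBZ"
    return "NNS" if word.endswith("s") else "NN"


tags = greedy_decode(["she", "sells", "seashells"], toy_policy)
```

No separate search over complete outputs takes place here: the sequence of classifier decisions is the search.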
Background information before
SEARN
Learning Reductions
● Relating a hard and complex prediction problem to a simpler prediction
problem.
● Maps a harder problem to a simpler problem, then obtains a solution for
the simple problem and maps that solution to the harder problem.
● A reduction has three components
○ Sample mapping - Mapping the dataset of the complex problem to the simpler problem
○ Hypothesis mapping - Mapping the solution of the easier problem back to the hard problem
○ Bounds - How well the reduction solves the larger problem
Importance Weighted Binary Classification
● A simple extension to binary classification.
● Each example (data item) has an associated weight that reflects the
importance of that data item: (x_i, y_i, c_i).
● The solution should be a binary classifier that minimizes the expected
weighted loss.
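One common way to write this objective is shown below. The symbols are assumptions for illustration: D is the distribution over weighted examples (x, y, c) and h is the binary classifier.

```latex
\ell(h) = \mathbb{E}_{(x, y, c) \sim D}\big[\, c \cdot \mathbf{1}[h(x) \neq y] \,\big]
```

With every weight c equal to 1, this reduces to the ordinary binary classification error.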
Importance Weighted Binary Classification
● Solved by reducing the problem to C parallel binary classifiers.
● C datasets are generated by sampling from the original dataset, with each
example's sampling probability proportional to its importance weight.
● Using those different datasets, C binary classifiers will be trained.
● Prediction is made based on majority prediction of those C parallel binary
classifiers.
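The resample-then-vote scheme above can be sketched as follows. It resembles the rejection-sampling approach usually called costing, but the code is only an illustration: `rejection_sample`, `train_majority` and `costing` are hypothetical names, and the majority-label learner is a stand-in for a real binary classifier.

```python
import random
from collections import Counter


def rejection_sample(data, rng):
    """Keep each (x, y, c) with probability c / max_c, then drop the weight."""
    max_c = max(c for _, _, c in data)
    return [(x, y) for x, y, c in data if rng.random() < c / max_c]


def train_majority(sample):
    """Stand-in binary learner (assumption): always predicts the majority label."""
    counts = Counter(y for _, y in sample)
    label = counts.most_common(1)[0][0] if counts else 0
    return lambda x: label


def costing(data, n_classifiers, seed=0):
    """Train one classifier per resampled dataset and predict by majority vote."""
    rng = random.Random(seed)
    classifiers = [train_majority(rejection_sample(data, rng))
                   for _ in range(n_classifiers)]

    def predict(x):
        votes = Counter(clf(x) for clf in classifiers)
        return votes.most_common(1)[0][0]

    return predict


data = [("a", 1, 10.0), ("b", 0, 0.1), ("c", 0, 0.1), ("d", 1, 10.0)]
predict = costing(data, n_classifiers=11)
```

Because the two heavily weighted examples carry label 1, they survive every resampling round and dominate the vote.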
Cost Sensitive Classification
● This is a natural extension of Importance Weighted Binary Classification to
a multi class scenario.
● For a k-class task, we have to find a hypothesis h that minimizes
the expected cost of its predictions.
● c is a vector of size k containing the cost of predicting each class.
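Written out, the objective could look as follows. The notation is an assumption for illustration: D is a distribution over pairs of an input x and a cost vector c, and c_{h(x)} is the entry of c for the predicted class.

```latex
h^{*} = \arg\min_{h} \; \mathbb{E}_{(x, c) \sim D}\big[\, c_{h(x)} \,\big],
\qquad c \in \mathbb{R}_{\ge 0}^{k}
```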
Cost Sensitive Classification
● This is reduced to the Importance Weighted Binary Classification problem
using the Weighted All Pairs (WAP) reduction (Beygelzimer et al., 2005).
● WAP generates k(k-1)/2 importance weighted binary classification problems,
one for each pair of classes.
● The importance weights are calculated using a special formula so that
classification is done correctly.
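The pairwise structure of such a reduction can be sketched as below. Note the deliberate simplification: the weight used here is the absolute cost difference, which is an assumption chosen for readability and not the WAP weighting formula. The function name `all_pairs_examples` is hypothetical.

```python
from itertools import combinations


def all_pairs_examples(x, costs):
    """Turn one cost-sensitive example into one binary example per class pair.

    Label 1 means "the first class of the pair is cheaper".
    The weight |c_i - c_j| is a simplification (assumption), not the WAP formula.
    """
    examples = []
    for i, j in combinations(range(len(costs)), 2):
        label = 1 if costs[i] < costs[j] else 0
        weight = abs(costs[i] - costs[j])
        examples.append(((x, i, j), label, weight))
    return examples


pairs = all_pairs_examples("x0", [0.0, 1.0, 3.0])
```

For k = 3 classes this yields 3 binary examples, one per pair of classes.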
SEARN Algorithm
SEARN Algorithm
● Searn is developed by casting structured prediction in the language of
reductions.
● In particular, it reduces structured prediction to cost-sensitive
classification.
● The cost-sensitive classification problem can in turn be reduced to
binary classification by applying the weighted all pairs method.
● So structured prediction can be solved using binary classification.
SEARN Algorithm
● Removes the “search” from the prediction process by learning a classifier
to make incremental decisions.
Definition of Structured Prediction
We can define the structured prediction problem as a cost-sensitive
classification problem as follows.
Definition of Structured Prediction
The goal of structured prediction is to find a hypothesis h : X → Y that
minimizes the given loss.
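One way to write this loss, reusing the cost-vector notation from the cost-sensitive classification slides, is sketched below. It is an assumption about the intended formula: D is a distribution over inputs x and cost vectors c with one entry per possible output, and each output y decomposes into a variable-length vector.

```latex
L(\mathcal{D}, h) = \mathbb{E}_{(x, c) \sim \mathcal{D}}\big[\, c_{h(x)} \,\big],
\qquad h : \mathcal{X} \to \mathcal{Y},
\quad y = (y_1, \dots, y_T) \in \mathcal{Y}
```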
Policy
● We need to find an h such that, given a state s and the input x, h(x, s)
gives the next action.
● We can consider the policy h as a classifier, so the whole problem becomes
a classification problem.
● Now we need to train this classifier.
Training
● Training is an iterative process
● Initialize with a known policy
● Using that policy create cost-sensitive examples
● Create a new policy using the cost-sensitive examples
● Interpolate the previous policy and the new policy
Cost Sensitive Examples
● A policy generates one path per training example. (A path is a sequence
of states; a state is a partial structure.)
● SEARN creates a single cost-sensitive example for each state in each path.
● The classes of each example are the available actions (next states), each
with an associated cost.
● The difficulty lies in specifying these cost values.
Cost
● The cost of each action can be considered as regret. It is defined as follows. (π
is the policy)
● The complexity of computing this cost is problem dependent.
● There are multiple ways to compute it (Monte Carlo sampling, single
Monte Carlo sampling, etc.).
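Read as regret, the cost of an action is the loss obtained by taking that action and then following the policy to the end, minus the loss of the best available action under the same procedure. A minimal sketch for sequence labeling with Hamming loss follows. The names `hamming`, `rollout`, `action_costs` and `oracle` are hypothetical, and a single rollout per action is used because the stand-in policy is deterministic; a stochastic policy would need the losses averaged over several rollouts.

```python
def hamming(pred, gold):
    """Number of positions where the prediction differs from the gold output."""
    return sum(p != g for p, g in zip(pred, gold))


def rollout(words, gold, state, policy):
    """Complete a partial output by following the policy to the end."""
    state = list(state)
    while len(state) < len(words):
        state.append(policy(words, gold, state))
    return state


def action_costs(words, gold, state, actions, policy, loss=hamming):
    """Regret of each action: its rollout loss minus the best rollout loss."""
    losses = {a: loss(rollout(words, gold, state + [a], policy), gold)
              for a in actions}
    best = min(losses.values())
    return {a: value - best for a, value in losses.items()}


def oracle(words, gold, state):
    """Deterministic stand-in policy (assumption): copies the gold label."""
    return gold[len(state)]


costs = action_costs(["she", "sells"], ["PRP", "VBZ"], [], ["PRP", "VBZ"], oracle)
```

Here choosing the correct first tag has zero regret, while the wrong tag costs one Hamming error.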
Optimal Policy
● The optimal policy is a policy that, for a given state, input and
output (structured prediction cost vector), always predicts the best action to
take.
Optimal Policy
● Searn uses the optimal policy to initialize the iterative process, and
attempts to migrate toward a completely learned policy that will generalize
well.
● SEARN assumes the existence of an optimal policy for the problem.
Algorithm
● π* is the optimal policy.
● Learn is a multiclass learner.
● The policy is initialized using the optimal policy. (Line 1)
● The algorithm then iterates for a number of iterations.
● Each iteration makes cost-sensitive examples using the current policy.
● The previous policy is then interpolated with the newly learned policy.
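The loop described above can be sketched as follows. This is a toy illustration under stated assumptions, not the reference implementation: interpolation is modelled as a stochastic mixture that picks the new policy with probability `beta`, the costs in `make_examples` are a simple 0/1 stand-in for rollout-based regret, `learn` is a lookup-table learner, and all function names are hypothetical.

```python
import random
from collections import defaultdict


def searn(examples, optimal_policy, learn, make_examples,
          beta=0.3, n_iters=3, seed=0):
    rng = random.Random(seed)
    mixture = [(1.0, optimal_policy)]  # start from the optimal policy

    def current(x, y, state):
        """Sample one component of the mixture and let it choose the action."""
        r, acc = rng.random(), 0.0
        for weight, policy in mixture:
            acc += weight
            if r < acc:
                return policy(x, y, state)
        return mixture[-1][1](x, y, state)

    for _ in range(n_iters):
        data = []
        for x, y in examples:
            data.extend(make_examples(x, y, current))  # cost-sensitive examples
        new_policy = learn(data)
        # Interpolate: old components shrink by (1 - beta), new one gets beta.
        mixture = [(w * (1 - beta), p) for w, p in mixture] + [(beta, new_policy)]
    return mixture


def optimal(x, y, state):
    return y[len(state)]  # gold label at the current position


def make_examples(x, y, policy):
    """One example per state on the path the policy follows (toy 0/1 costs)."""
    state, out = [], []
    labels = sorted(set(y))
    for t in range(len(x)):
        out.append((x[t], {a: float(a != y[t]) for a in labels}))
        state.append(policy(x, y, state))
    return out


def learn(data):
    """Lookup-table learner (assumption): cheapest action per word."""
    totals = defaultdict(lambda: defaultdict(float))
    for feat, costs in data:
        for action, cost in costs.items():
            totals[feat][action] += cost
    table = {f: min(cs, key=cs.get) for f, cs in totals.items()}
    return lambda x, y, state: table.get(x[len(state)])


mixture = searn([(["she", "sells"], ["PRP", "VBZ"])], optimal, learn, make_examples)
learned = mixture[-1][1]
```

The learned policies ignore the gold output `y`, so they can be applied at test time, while the optimal policy component loses weight with every iteration.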
Comparison with other
approaches
Vs. Independent Classifiers
● Output structure is assumed to be decomposable and each part is
classified (predicted) individually.
● Cannot define features that span across output structure.
● Even if the previous results are taken into consideration, it can be
suboptimal.
● Limited to Hamming loss.
Vs. Perceptron algorithms
● Assume a tractable argmax operation.
● Generalize poorly. (This can be mitigated by averaging the weights.)
● Limited to only one loss function.
Vs. Global prediction algorithms
● Highly dependent on assumptions about output structure. (Markov
assumption)
● In comparison SEARN is more general, limited neither to linear chains nor
to Markov style features.
● SEARN requires far weaker assumptions.
The SEARN algorithm can solve structured
prediction problems under any model,
any feature functions and any loss.
References
● Search-based Structured Prediction. Hal Daumé III, John Langford and
Daniel Marcu. Submitted to Machine Learning Journal, 2006.
● Practical Structured Learning Techniques for Natural Language
Processing. Hal Daumé III. PhD Thesis, 2006 (USC)
Thank you!
supun.14@cse.mrt.ac.lk

More Related Content

What's hot

Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learningDongHyun Kwak
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement LearningDongHyun Kwak
 
Actor critic algorithm
Actor critic algorithmActor critic algorithm
Actor critic algorithmJie-Han Chen
 
Deep reinforcement learning from scratch
Deep reinforcement learning from scratchDeep reinforcement learning from scratch
Deep reinforcement learning from scratchJie-Han Chen
 
Machine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demoMachine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demoHridyesh Bisht
 
An Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGIAn Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGIAnirban Santara
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesDr. C.V. Suresh Babu
 
Derivative Free Optimization and Robust Optimization
Derivative Free Optimization and Robust OptimizationDerivative Free Optimization and Robust Optimization
Derivative Free Optimization and Robust OptimizationSSA KPI
 
Algorithms and Programming
Algorithms and ProgrammingAlgorithms and Programming
Algorithms and ProgrammingMelanie Knight
 
Machine Learning Lecture 2 Basics
Machine Learning Lecture 2 BasicsMachine Learning Lecture 2 Basics
Machine Learning Lecture 2 Basicsananth
 
L06 stemmer and edit distance
L06 stemmer and edit distanceL06 stemmer and edit distance
L06 stemmer and edit distanceananth
 
L05 language model_part2
L05 language model_part2L05 language model_part2
L05 language model_part2ananth
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models ananth
 
Multi armed bandit
Multi armed banditMulti armed bandit
Multi armed banditJie-Han Chen
 
Temporal difference learning
Temporal difference learningTemporal difference learning
Temporal difference learningJie-Han Chen
 
Model Based Episodic Memory
Model Based Episodic MemoryModel Based Episodic Memory
Model Based Episodic MemoryHung Le
 
Reinforcement learning 7313
Reinforcement learning 7313Reinforcement learning 7313
Reinforcement learning 7313Slideshare
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning conceptsJoe li
 
An introduction to reinforcement learning
An introduction to  reinforcement learningAn introduction to  reinforcement learning
An introduction to reinforcement learningJie-Han Chen
 

What's hot (20)

Reinforcement learning
Reinforcement learningReinforcement learning
Reinforcement learning
 
Reinforcement Learning
Reinforcement LearningReinforcement Learning
Reinforcement Learning
 
Actor critic algorithm
Actor critic algorithmActor critic algorithm
Actor critic algorithm
 
Deep reinforcement learning from scratch
Deep reinforcement learning from scratchDeep reinforcement learning from scratch
Deep reinforcement learning from scratch
 
Machine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demoMachine learning Algorithms with a Sagemaker demo
Machine learning Algorithms with a Sagemaker demo
 
An Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGIAn Introduction to Reinforcement Learning - The Doors to AGI
An Introduction to Reinforcement Learning - The Doors to AGI
 
Artificial Intelligence Searching Techniques
Artificial Intelligence Searching TechniquesArtificial Intelligence Searching Techniques
Artificial Intelligence Searching Techniques
 
Derivative Free Optimization and Robust Optimization
Derivative Free Optimization and Robust OptimizationDerivative Free Optimization and Robust Optimization
Derivative Free Optimization and Robust Optimization
 
Algorithms and Programming
Algorithms and ProgrammingAlgorithms and Programming
Algorithms and Programming
 
Generalized Reinforcement Learning
Generalized Reinforcement LearningGeneralized Reinforcement Learning
Generalized Reinforcement Learning
 
Machine Learning Lecture 2 Basics
Machine Learning Lecture 2 BasicsMachine Learning Lecture 2 Basics
Machine Learning Lecture 2 Basics
 
L06 stemmer and edit distance
L06 stemmer and edit distanceL06 stemmer and edit distance
L06 stemmer and edit distance
 
L05 language model_part2
L05 language model_part2L05 language model_part2
L05 language model_part2
 
Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models Artificial Intelligence Course: Linear models
Artificial Intelligence Course: Linear models
 
Multi armed bandit
Multi armed banditMulti armed bandit
Multi armed bandit
 
Temporal difference learning
Temporal difference learningTemporal difference learning
Temporal difference learning
 
Model Based Episodic Memory
Model Based Episodic MemoryModel Based Episodic Memory
Model Based Episodic Memory
 
Reinforcement learning 7313
Reinforcement learning 7313Reinforcement learning 7313
Reinforcement learning 7313
 
Deep learning concepts
Deep learning conceptsDeep learning concepts
Deep learning concepts
 
An introduction to reinforcement learning
An introduction to  reinforcement learningAn introduction to  reinforcement learning
An introduction to reinforcement learning
 

Similar to A brief introduction to Searn Algorithm

ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptxHadrian7
 
Reinforcement learning:policy gradient (part 1)
Reinforcement learning:policy gradient (part 1)Reinforcement learning:policy gradient (part 1)
Reinforcement learning:policy gradient (part 1)Bean Yen
 
Machine learning - session 3
Machine learning - session 3Machine learning - session 3
Machine learning - session 3Luis Borbon
 
Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...
Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...
Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...Dhivyaa C.R
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Universitat Politècnica de Catalunya
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningNAVER Engineering
 
Multiobjective optimization and trade offs using pareto optimality
Multiobjective optimization and trade offs using pareto optimalityMultiobjective optimization and trade offs using pareto optimality
Multiobjective optimization and trade offs using pareto optimalityAmogh Mundhekar
 
Reinforcement Learning 8: Planning and Learning with Tabular Methods
Reinforcement Learning 8: Planning and Learning with Tabular MethodsReinforcement Learning 8: Planning and Learning with Tabular Methods
Reinforcement Learning 8: Planning and Learning with Tabular MethodsSeung Jae Lee
 
IFTA2020 Kei Nakagawa
IFTA2020 Kei NakagawaIFTA2020 Kei Nakagawa
IFTA2020 Kei NakagawaKei Nakagawa
 
How to formulate reinforcement learning in illustrative ways
How to formulate reinforcement learning in illustrative waysHow to formulate reinforcement learning in illustrative ways
How to formulate reinforcement learning in illustrative waysYasutoTamura1
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningShubhmay Potdar
 
Structured prediction with reinforcement learning
Structured prediction with reinforcement learningStructured prediction with reinforcement learning
Structured prediction with reinforcement learningguruprasad110
 
GDG Cloud Community Day 2022 - Managing data quality in Machine Learning
GDG Cloud Community Day 2022 -  Managing data quality in Machine LearningGDG Cloud Community Day 2022 -  Managing data quality in Machine Learning
GDG Cloud Community Day 2022 - Managing data quality in Machine LearningSARADINDU SENGUPTA
 
Advanced regression and model selection
Advanced regression and model selectionAdvanced regression and model selection
Advanced regression and model selectionAnkit Jain
 

Similar to A brief introduction to Searn Algorithm (20)

ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
ngboost.pptx
ngboost.pptxngboost.pptx
ngboost.pptx
 
Reinforcement learning:policy gradient (part 1)
Reinforcement learning:policy gradient (part 1)Reinforcement learning:policy gradient (part 1)
Reinforcement learning:policy gradient (part 1)
 
Parallel Processing Concepts
Parallel Processing Concepts Parallel Processing Concepts
Parallel Processing Concepts
 
Machine learning - session 3
Machine learning - session 3Machine learning - session 3
Machine learning - session 3
 
UNIT IV (4).pptx
UNIT IV (4).pptxUNIT IV (4).pptx
UNIT IV (4).pptx
 
Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...
Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...
Instance Learning and Genetic Algorithm by Dr.C.R.Dhivyaa Kongu Engineering C...
 
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
Training Deep Networks with Backprop (D1L4 Insight@DCU Machine Learning Works...
 
AI Algorithms
AI AlgorithmsAI Algorithms
AI Algorithms
 
Ensemble methods
Ensemble methodsEnsemble methods
Ensemble methods
 
Introduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement LearningIntroduction of Deep Reinforcement Learning
Introduction of Deep Reinforcement Learning
 
Multiobjective optimization and trade offs using pareto optimality
Multiobjective optimization and trade offs using pareto optimalityMultiobjective optimization and trade offs using pareto optimality
Multiobjective optimization and trade offs using pareto optimality
 
XgBoost.pptx
XgBoost.pptxXgBoost.pptx
XgBoost.pptx
 
Reinforcement Learning 8: Planning and Learning with Tabular Methods
Reinforcement Learning 8: Planning and Learning with Tabular MethodsReinforcement Learning 8: Planning and Learning with Tabular Methods
Reinforcement Learning 8: Planning and Learning with Tabular Methods
 
IFTA2020 Kei Nakagawa
IFTA2020 Kei NakagawaIFTA2020 Kei Nakagawa
IFTA2020 Kei Nakagawa
 
How to formulate reinforcement learning in illustrative ways
How to formulate reinforcement learning in illustrative waysHow to formulate reinforcement learning in illustrative ways
How to formulate reinforcement learning in illustrative ways
 
Deep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter TuningDeep Dive into Hyperparameter Tuning
Deep Dive into Hyperparameter Tuning
 
Structured prediction with reinforcement learning
Structured prediction with reinforcement learningStructured prediction with reinforcement learning
Structured prediction with reinforcement learning
 
GDG Cloud Community Day 2022 - Managing data quality in Machine Learning
GDG Cloud Community Day 2022 -  Managing data quality in Machine LearningGDG Cloud Community Day 2022 -  Managing data quality in Machine Learning
GDG Cloud Community Day 2022 - Managing data quality in Machine Learning
 
Advanced regression and model selection
Advanced regression and model selectionAdvanced regression and model selection
Advanced regression and model selection
 

Recently uploaded

{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...Pooja Nehwal
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130Suhani Kapoor
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一ffjhghh
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...soniya singh
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptxAnupama Kate
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptSonatrach
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Callshivangimorya083
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationBoston Institute of Analytics
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxTanveerAhmed817946
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubaihf8803863
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptxthyngster
 

Recently uploaded (20)

{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...{Pooja:  9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
{Pooja: 9892124323 } Call Girl in Mumbai | Jas Kaur Rate 4500 Free Hotel Del...
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
E-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptxE-Commerce Order PredictionShraddha Kamble.pptx
E-Commerce Order PredictionShraddha Kamble.pptx
 
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
VIP Call Girls Service Miyapur Hyderabad Call +91-8250192130
 
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一定制英国白金汉大学毕业证(UCB毕业证书)																			成绩单原版一比一
定制英国白金汉大学毕业证(UCB毕业证书) 成绩单原版一比一
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
High Class Call Girls Noida Sector 39 Aarushi 🔝8264348440🔝 Independent Escort...
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
꧁❤ Aerocity Call Girls Service Aerocity Delhi ❤꧂ 9999965857 ☎️ Hard And Sexy ...
 
100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx100-Concepts-of-AI by Anupama Kate .pptx
100-Concepts-of-AI by Anupama Kate .pptx
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.pptdokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
dokumen.tips_chapter-4-transient-heat-conduction-mehmet-kanoglu.ppt
 
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
꧁❤ Greater Noida Call Girls Delhi ❤꧂ 9711199171 ☎️ Hard And Sexy Vip Call
 
Predicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project PresentationPredicting Employee Churn: A Data-Driven Approach Project Presentation
Predicting Employee Churn: A Data-Driven Approach Project Presentation
 
Digi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptxDigi Khata Problem along complete plan.pptx
Digi Khata Problem along complete plan.pptx
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
Decoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in ActionDecoding Loan Approval: Predictive Modeling in Action
Decoding Loan Approval: Predictive Modeling in Action
 
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls DubaiDubai Call Girls Wifey O52&786472 Call Girls Dubai
Dubai Call Girls Wifey O52&786472 Call Girls Dubai
 
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptxEMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM  TRACKING WITH GOOGLE ANALYTICS.pptx
EMERCE - 2024 - AMSTERDAM - CROSS-PLATFORM TRACKING WITH GOOGLE ANALYTICS.pptx
 

A brief introduction to Searn Algorithm

  • 1. Searn Algorithm for Structured Prediction Presented by Supun Abeysinghe
  • 2. Outline ● What is Structured Prediction ● Approaches to Structured Prediction ● Idea of Search-Based Structured Prediction ● Background Information for SEARN ● SEARN Algorithm ● Comparison with other approaches
  • 4. What is Structured Prediction? ● If we define informally structured prediction is a process by which a structure inside a given input is captured. ● Main difference with other machine learning problems is Structured Prediction problems usually have a complex output.
  • 5. POS Tagging She sells seashells on the seashore PRP VBZ NNS IN DT NN
  • 9. And a lot more...
  • 11. Different Approaches to SP ● Structured Perceptron ○ Direct implementation of Averaged Perceptron in binary classification to use in SP ● Incremental Perceptron ○ Also a search-based approach. ● Maximum Entropy Markov Models ○ Similar to logistic regression in binary classification ● Conditional Random Fields ○ Solves label bias problem of Maximum Entropy models ● Maximum Margin Markov Networks ● SVM for Independent and Structured Outputs (SVMstruct )
  • 13. Traditional Approach to SP ModelInput All Possible Outputs Xn W - Parameters F - features ● There is a model that can generate all the possible outputs for a given input ● Based on the input features, model parameters assign scores for each of those outputs
  • 14. Training ModelInput All Possible Outputs Xn (Xn , Yn ) W - Parameters ● Model parameters are trained such that the correct output for each training example will have the highest score. Yn Highest Score
  • 15. Decoding [Diagram: Input Xn → Model (W: parameters) → All Possible Outputs → search for the output with the highest score] ● In the decoding phase, the input is run through the model and then all the outputs are searched to find the one with the highest score.
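
The decoding step described above can be sketched as a brute-force argmax. This is only an illustration under stated assumptions: the tag set, the `weights` table, and the `score` and `decode` functions are hypothetical names that do not come from the slides, and the model is reduced to independent per-word scores so the example stays small.

```python
from itertools import product

# Hypothetical tag set for the illustration.
TAGS = ["PRP", "VBZ", "NNS"]

def score(words, tags, weights):
    """Sum hypothetical (word, tag) weights; unseen pairs score 0."""
    return sum(weights.get((w, t), 0.0) for w, t in zip(words, tags))

def decode(words, weights):
    """Exhaustive argmax over all len(TAGS) ** len(words) tag sequences."""
    candidates = product(TAGS, repeat=len(words))
    return max(candidates, key=lambda tags: score(words, tags, weights))

# Toy parameters standing in for trained weights W (an assumption).
weights = {("She", "PRP"): 1.0, ("sells", "VBZ"): 1.0, ("seashells", "NNS"): 1.0}
best = decode(["She", "sells", "seashells"], weights)
```

The candidate set grows exponentially with the sentence length, which is why exhaustive search of this kind is rarely practical.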
  • 16. Role of Search ● Search gets the output with the highest score from the search space. ● Almost all SP approaches need a search component. ● In most cases, searching through the whole space is intractable. ○ Assumptions about the output are made so that dynamic programming can be applied ○ Approximate methods such as beam search, greedy search and other heuristic-based search methods are used ● Search can be seen as a sequence of decisions taken to get the best output.
  • 17. Search-based SP ● The search phase and the model are combined. ● Rather than searching after the model, learn how to search. ● Each decision made during the search is treated as a large classification problem. ● Each search decision builds the output incrementally. ● The goal is to train these classifiers to build an optimal output.
  • 19. Learning Reductions ● Relating a hard and complex prediction problem to a simpler prediction problem. ● Maps a harder problem to a simpler problem, then obtains a solution for the simple problem and maps that solution to the harder problem. ● A reduction has three components ○ Sample mapping - Mapping complex problem dataset to the simpler problem ○ Hypothesis mapping - Mapping the solution to the easier problem to the hard problem ○ Bounds - How well the reduction solves the larger problem
  • 20. Importance Weighted Binary Classification ● A simple extension to binary classification. ● Each example (data item) has an associated weight which reflects the importance of that data item: (x_i, y_i, c_i). ● The solution should be a binary classifier that minimizes the expected importance-weighted loss.
  • 21. Importance Weighted Binary Classification ● Solved by reducing the problem to C parallel binary classifiers. ● C datasets are generated by sampling from the original dataset, with each example's sampling probability proportional to its importance weight. ● Using those different datasets, C binary classifiers are trained. ● The prediction is the majority vote of those C parallel binary classifiers.
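
The resampling reduction above can be sketched roughly as follows. This is a minimal illustration, not the presenter's implementation: the threshold base learner, the function names, and the toy data are all assumptions made for the example, and sampling is done with replacement via `random.Random.choices`.

```python
import random
from collections import Counter

def resample(data, rng):
    """Draw len(data) examples with probability proportional to importance weight."""
    weights = [c for _, _, c in data]
    drawn = rng.choices(data, weights=weights, k=len(data))
    return [(x, y) for x, y, _ in drawn]

def train_threshold(samples):
    """Hypothetical base learner: threshold t minimizing errors of 'x >= t -> 1'."""
    candidates = sorted({x for x, _ in samples}) + [float("inf")]
    def errors(t):
        return sum((x >= t) != bool(y) for x, y in samples)
    return min(candidates, key=errors)

def train_ensemble(data, n_classifiers, seed=0):
    """Train one unweighted binary classifier per resampled dataset."""
    rng = random.Random(seed)
    return [train_threshold(resample(data, rng)) for _ in range(n_classifiers)]

def predict(thresholds, x):
    """Majority vote over the parallel classifiers."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]

# Toy (x_i, y_i, c_i) triples; the positive examples carry more weight.
data = [(0.0, 0, 1.0), (1.0, 0, 1.0), (2.0, 1, 5.0), (3.0, 1, 5.0)]
ensemble = train_ensemble(data, n_classifiers=11)
```

An odd number of classifiers is used here so that the majority vote cannot tie.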
  • 22. Cost Sensitive Classification ● This is a natural extension of Importance Weighted Binary Classification to a multiclass scenario. ● For a K-class task, we have to find a hypothesis (h) that minimizes the expected cost of its predictions. ● C is a vector of size K containing the cost of predicting each class.
  • 23. Cost Sensitive Classification ● This is reduced to the Importance Weighted Binary Classification problem using the Weighted All Pairs (WAP) reduction (Beygelzimer et al., 2005). ● WAP generates K(K−1)/2 importance weighted binary classification problems, one for each pair of classes. ● The importance weights are calculated using a special formula so that classification is done correctly.
  • 25. SEARN Algorithm ● Searn is developed by casting structured prediction in the language of reductions. ● In particular, it reduces structured prediction to cost-sensitive classification. ● In turn, the cost-sensitive classification problem can be reduced to binary classification by applying the weighted all pairs method. ● So structured prediction can be solved using binary classification.
  • 26. SEARN Algorithm ● Removes the “search” from the prediction process by learning a classifier to make incremental decisions.
  • 27. Definition of Structured Prediction We can define the structured prediction problem as a cost-sensitive classification problem as follows.
  • 28. Definition of Structured Prediction The goal of the structured prediction is to find a hypothesis h : X→Y that minimizes the given loss.
  • 29. Policy ● We need to find an h such that, given a state s and the input x, h(x,s) gives the next action. ● We can consider the policy h as a classifier, so the whole problem becomes a classification problem. ● Now we need to train this classifier.
  • 30. Training ● Training is an iterative process ● Initialize with a known policy ● Using that policy create cost-sensitive examples ● Create a new policy using the cost-sensitive examples ● Interpolate the previous policy and the new policy
  • 31. Cost Sensitive Examples ● A policy generates one path per training example. (A path is a sequence of states; a state is a partial structure.) ● SEARN creates a single cost-sensitive example for each state in each path. ● The classes of each example are the available actions (next states), each with an associated cost. ● The difficulty lies in specifying these cost values.
  • 32. Cost ● Cost of each action can be considered as regret. It is defined as follows. (π is the policy) ● The complexity of the above equation is problem dependent. ● There are multiple ways to compute it (Monte Carlo sampling, single Monte Carlo sampling, etc.).
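
One way to read the regret-style cost is: the loss of taking an action and then following the policy to the end, minus the loss of the best action available in that state. The sketch below illustrates that reading under several assumptions that are not in the slides: the loss is Hamming loss, the policy is deterministic (so a single rollout is enough), and all function names are hypothetical.

```python
def rollout_loss(prefix, action, policy, gold):
    """Take `action`, complete the sequence with `policy`, return Hamming loss."""
    seq = list(prefix) + [action]
    while len(seq) < len(gold):
        seq.append(policy(seq, gold))
    return sum(p != g for p, g in zip(seq, gold))

def action_costs(prefix, actions, policy, gold):
    """Regret of each action: its rollout loss minus the best rollout loss."""
    losses = {a: rollout_loss(prefix, a, policy, gold) for a in actions}
    best = min(losses.values())
    return {a: loss - best for a, loss in losses.items()}

def optimal_policy(seq, gold):
    """Assumed optimal policy for Hamming loss: copy the gold label."""
    return gold[len(seq)]

gold = ["PRP", "VBZ", "NNS"]
costs = action_costs(["PRP"], ["PRP", "VBZ", "NNS"], optimal_policy, gold)
# costs == {"PRP": 1, "VBZ": 0, "NNS": 1}
```

With a stochastic policy, each loss would instead be estimated by averaging several sampled rollouts (Monte Carlo sampling) or by a single sampled rollout.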
  • 33. Optimal Policy ● The optimal policy is a policy that, for a given state, input and output (structured prediction cost vector), always predicts the best action to take.
  • 34. Optimal Policy ● Searn uses the optimal policy to initialize the iterative process, and attempts to migrate toward a completely learned policy that will generalize well. ● SEARN assumes existence of an optimal policy to the problem.
  • 35. Algorithm ● π* is the optimal policy. ● Learn is a multiclass learner. ● The policy is initialized using the optimal policy. (Line 1) ● The algorithm then iterates for a number of iterations. ● Each iteration makes cost-sensitive examples using the current policy and learns a new policy from them. ● The previous policy is then interpolated with the newly learned policy.
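
The loop described above can be sketched on a toy tagging task. This is a simplified illustration rather than the published algorithm: the table-based `learn` function, the Hamming loss, the single-sample rollouts, the fixed interpolation constant `beta`, and returning only the last learned classifier are all assumptions made to keep the example short.

```python
import random
from collections import defaultdict

ACTIONS = ["D", "N", "V"]  # hypothetical tag set

def optimal(words, prefix, gold):
    """Assumed optimal policy for Hamming loss: copy the gold tag."""
    return gold[len(prefix)]

def learn(examples):
    """Hypothetical cost-sensitive learner: per-word cost table, predict the argmin."""
    table = defaultdict(lambda: defaultdict(float))
    for word, costs in examples:
        for action, cost in costs.items():
            table[word][action] += cost
    def h(words, prefix, gold=None):
        costs = table.get(words[len(prefix)])
        return min(ACTIONS, key=lambda a: costs[a]) if costs else ACTIONS[0]
    return h

def interpolate(new, old, beta, rng):
    """Stochastic mixture: follow `new` with probability beta, else `old`."""
    def pi(words, prefix, gold=None):
        chosen = new if rng.random() < beta else old
        return chosen(words, prefix, gold)
    return pi

def decode(policy, words, prefix=(), gold=None):
    """Complete `prefix` into a full tag sequence by following `policy`."""
    seq = list(prefix)
    while len(seq) < len(words):
        seq.append(policy(words, seq, gold))
    return seq

def hamming(pred, gold):
    return sum(p != g for p, g in zip(pred, gold))

def searn(data, iterations=5, beta=0.5, seed=0):
    rng = random.Random(seed)
    policy, h = optimal, None          # initialize with the optimal policy
    for _ in range(iterations):
        examples = []
        for words, gold in data:
            prefix = []
            for t in range(len(words)):
                # One cost-sensitive example per state on the policy's path.
                losses = {a: hamming(decode(policy, words, prefix + [a], gold), gold)
                          for a in ACTIONS}
                best = min(losses.values())
                examples.append((words[t], {a: l - best for a, l in losses.items()}))
                prefix.append(policy(words, prefix, gold))
        h = learn(examples)
        policy = interpolate(h, policy, beta, rng)
    return h  # simplification: only the last learned classifier is returned

data = [(["the", "dog", "barks"], ["D", "N", "V"]),
        (["the", "cat", "sleeps"], ["D", "N", "V"])]
tagger = searn(data)
tags = decode(tagger, ["the", "dog", "barks"])
```

The returned tagger no longer needs the gold output, which is the point of migrating away from the optimal policy during training.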
  • 37. Vs. Independent Classifiers ● Output structure is assumed to be decomposable and each part is classified (predicted) individually. ● Cannot define features that span across the output structure. ● Even if the previous results are taken into consideration, the result can be suboptimal. ● Limited to Hamming loss.
  • 38. Vs. Perceptron algorithms ● Assume a tractable argmax operation. ● Generalize poorly. (This can be addressed by averaging the weights.) ● Limited to only one loss function.
  • 39. Vs. Global prediction algorithms ● Highly dependent on assumptions about output structure. (Markov assumption) ● In comparison, SEARN is more general, limited neither to linear chains nor to Markov-style features. ● SEARN requires far weaker assumptions.
  • 41. The SEARN algorithm can solve structured prediction problems under any model, any feature functions and any loss.
  • 42. References ● Search-based Structured Prediction. Hal Daumé III, John Langford and Daniel Marcu. Submitted to Machine Learning Journal, 2006. ● Practical Structured Learning Techniques for Natural Language Processing. Hal Daumé III. PhD Thesis, 2006 (USC)