MaxEnt Model
Overview
Anantharaman Narayana Iyer
30 Jan 2015
MaxEnt Classifier
• MaxEnt is a powerful model that is equivalent to (multinomial) logistic regression
• Many NLP problems can be reformulated as classification problems
• E.g. Language Modelling, Tagging Problems
• MaxEnt is widely used for various text processing tasks.
• Task is to estimate the probability of a class given the context
• The term context may refer to a single word or group of words
• A large text corpus contains information on the co-occurrence of
classes and specific contexts
Problem: MaxEnt (refer to the paper by Ratnaparkhi)
• Let p(a, b) be the probability of class a occurring with context b
• Given the sparsity of words in b and also limited training data, it is not
possible to completely specify p(a, b)
• Given the sparse evidence about a’s and b’s our goal is to estimate
the probability model p(a, b)
MaxEnt principle
Representing Evidence
• One way to represent evidence is to encode useful facts as features
and to impose constraints on the values of those feature expectations
• A feature is a binary valued function (indicator function):
• $f_j : \varepsilon \to \{0, 1\}$
• Given k features the constraints have the form:
• Expectation value of the model for the feature fj = Observed Expectation
value for the feature fj
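• $\sum_{x \in \varepsilon} p(x) f_j(x) = \sum_{x \in \varepsilon} \tilde{p}(x) f_j(x)$, where $p$ is the model distribution and $\tilde{p}$ is the observed (empirical) distribution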
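• The principle of maximum entropy requires choosing, among all models $p$ that satisfy these constraints, the one with maximum entropy: $p^* = \arg\max_{p} H(p)$, where $H(p) = -\sum_{x \in \varepsilon} p(x) \log p(x)$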
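A minimal Python sketch of the two ideas on this slide: the expectation of a binary feature under a distribution, and the entropy used to choose among models that satisfy the constraint. The toy sample, the feature `f_noun_after_the`, and both candidate models are hypothetical illustrations and are not taken from Ratnaparkhi's paper.

```python
import math
from collections import Counter


def expectation(p, feature):
    """Expected value of a feature under p (a dict mapping event -> probability)."""
    return sum(prob * feature(x) for x, prob in p.items())


def entropy(p):
    """Shannon entropy H(p) = -sum p(x) log p(x); zero-probability events are skipped."""
    return -sum(prob * math.log(prob) for prob in p.values() if prob > 0)


# Hypothetical training sample of (class a, context b) events.
sample = [("NOUN", "the"), ("NOUN", "the"), ("VERB", "to"), ("NOUN", "to")]
p_tilde = {x: c / len(sample) for x, c in Counter(sample).items()}


def f_noun_after_the(x):
    """Indicator feature: fires when class NOUN co-occurs with context 'the'."""
    a, b = x
    return 1 if a == "NOUN" and b == "the" else 0


# Observed expectation: 2 of the 4 sample events fire the feature.
observed = expectation(p_tilde, f_noun_after_the)

# Two hypothetical candidate models over the full event space.
# Both reproduce the observed expectation of the feature.
model_spread = {
    ("NOUN", "the"): 0.5,
    ("NOUN", "to"): 1 / 6,
    ("VERB", "the"): 1 / 6,
    ("VERB", "to"): 1 / 6,
}
model_peaked = {
    ("NOUN", "the"): 0.5,
    ("NOUN", "to"): 0.5,
    ("VERB", "the"): 0.0,
    ("VERB", "to"): 0.0,
}

for name, model in [("spread", model_spread), ("peaked", model_peaked)]:
    print(name, expectation(model, f_noun_after_the), entropy(model))
```

Both candidates match the observed expectation, but `model_spread` leaves the unconstrained events uniform and so has the higher entropy, which is the model the maximum entropy principle prefers.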
Motivating Problems for Log-linear Models
• Language Model: Given the context (that is, words w1, w2, …, wi-1 ) predict the word wi
• Consider the examples:
• A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it has exactly two positive divisors, 1
and the number itself. Natural numbers greater than 1 that are not prime are called composite.
• Asked about the speculation that he may be inducted into the Cabinet, Parrikar said, “I can comment on it only after
meeting the Prime Minister. Let the Prime Minister who has invited me comment”
• Prime Minister Narendra Modi is likely to expand his Cabinet on Sunday, according to Times Now
• “The prime focus of this release of our product is to simplify the user interface”
• N-gram models
• Uses the context of previous (n-1) words to predict the nth word
• A trigram model approach uses 2 previous words
• Sometimes the accuracy can be improved if other features of the input are taken into consideration, as opposed to
using only a very limited context
• The n-gram LM techniques are not flexible enough to include additional features, such as the total length of the sentence,
the presence of certain specific words, the identity of the author, etc. Note: One might include extra features such as the author’s
name and compute conditional probabilities, but such extensions to the conventional trigram approach quickly become
unwieldy
• Log-linear models can be used to include the additional features and improve the performance
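As a point of comparison, here is a minimal sketch of a count-based trigram estimate, assuming the usual maximum-likelihood form count(u, v, w) / count(u, v) with no smoothing. The toy corpus and the helper names `train_trigram` and `trigram_prob` are hypothetical.

```python
from collections import Counter


def train_trigram(sentences):
    """Count trigrams and their two-word contexts, padding each sentence with <s> <s> ... </s>."""
    trigrams, contexts = Counter(), Counter()
    for sentence in sentences:
        tokens = ["<s>", "<s>"] + sentence.lower().split() + ["</s>"]
        for i in range(2, len(tokens)):
            trigrams[(tokens[i - 2], tokens[i - 1], tokens[i])] += 1
            contexts[(tokens[i - 2], tokens[i - 1])] += 1
    return trigrams, contexts


def trigram_prob(trigrams, contexts, u, v, w):
    """Unsmoothed estimate of P(w | u, v); returns 0.0 for an unseen context."""
    seen = contexts[(u, v)]
    return trigrams[(u, v, w)] / seen if seen else 0.0


# Hypothetical toy corpus echoing the "prime" examples.
corpus = [
    "the prime minister spoke",
    "the prime focus is simplicity",
    "the prime minister left",
]
trigrams, contexts = train_trigram(corpus)
print(trigram_prob(trigrams, contexts, "the", "prime", "minister"))
print(trigram_prob(trigrams, contexts, "the", "prime", "focus"))
```

The estimate depends only on the two preceding words, so there is no natural place to add evidence such as sentence length or the presence of a distant word like "Cabinet".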
The general problem
• We have an input domain X
• For example: A sequence of words
• There is a finite label set Y
• For example: The space of all possible words – that is the vocabulary
• Our goal is to determine P(y|x) for any x, y where x is in the input
space and y is in the space of labels
• For example: Given an input sentence prefix (that is x, a sequence of words),
determine the next word in the sequence - that is P(wi | w1, …, wi-1)
Feature Vector
• A feature is a function fk(x, y) ∈ ℝ
• Often the features used in Log-linear models for typical NLP
applications are binary functions that are also called indicator
functions: fk(x, y) ∈ {0, 1}
• If we have m features then we have a feature vector f(x, y) ∈ ℝ^m
• The number and choice of features for a given input are arbitrary. The
system developer can design these using intuition about the problem
space being addressed.
Features in Log-Linear Models
• Features are elementary pieces of evidence that link aspects
of what we observe, x, with a label y that we want to predict (Ref: C
Manning)
• A feature is a function with a bounded real value
$f : X \times Y \to \mathbb{R}$
• Example:
• Consider a sentence: “Gandhi was born on 2 October 1869 in Porbandar”
• f1(x, y) = 1 if y = PERSON and wi is capitalized and wi+1 ∈ {“was”, “is”} and tag(wi+2) = VERB; 0 otherwise
• f2(x, y) = 1 if y = LOCATION and wi is capitalized and wi+1 ∈ {“was”, “is”} and tag(wi+2) = VERB; 0 otherwise
• f3(x, y) = 1 if y = DATE and tag(wi) = CD and wi-1 = “on” and tag(wi-2) = VERB; 0 otherwise
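The three example features can be sketched as Python indicator functions. This sketch assumes that x is a triple of (tokens, part-of-speech tags, position i) and that the tags are supplied by hand; the coarse tag names other than VERB and CD are hypothetical, as is the helper `feature_vector`.

```python
def is_capitalized(word):
    """True when the first character of the word is upper case."""
    return word[:1].isupper()


def f1(x, y):
    """PERSON: capitalized word followed by 'was'/'is' and then a verb."""
    tokens, tags, i = x
    return int(
        y == "PERSON"
        and i + 2 < len(tokens)
        and is_capitalized(tokens[i])
        and tokens[i + 1] in ("was", "is")
        and tags[i + 2] == "VERB"
    )


def f2(x, y):
    """LOCATION: same context pattern as f1, paired with a different label."""
    tokens, tags, i = x
    return int(
        y == "LOCATION"
        and i + 2 < len(tokens)
        and is_capitalized(tokens[i])
        and tokens[i + 1] in ("was", "is")
        and tags[i + 2] == "VERB"
    )


def f3(x, y):
    """DATE: a cardinal number preceded by 'on', which is preceded by a verb."""
    tokens, tags, i = x
    return int(
        y == "DATE"
        and i - 2 >= 0
        and tags[i] == "CD"
        and tokens[i - 1] == "on"
        and tags[i - 2] == "VERB"
    )


def feature_vector(x, y):
    """Stack the indicator features into one vector f(x, y)."""
    return [f(x, y) for f in (f1, f2, f3)]


tokens = "Gandhi was born on 2 October 1869 in Porbandar".split()
# Hand-assigned, hypothetical coarse tags for the sentence above.
tags = ["NOUN", "VERB", "VERB", "ADP", "CD", "NOUN", "CD", "ADP", "NOUN"]

print(feature_vector((tokens, tags, 0), "PERSON"))
print(feature_vector((tokens, tags, 4), "DATE"))
```

Note that f1 and f2 fire on the same context and differ only in the label they are paired with; it is the learned weights, not the features themselves, that decide which label wins.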
Feature Vector Representations
• Consider the examples:
• A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called
a prime number (or a prime) if it has exactly two
positive divisors, 1 and the number itself.
Natural numbers greater than 1 that are
not prime are called composite.
• Asked about the speculation that he may be inducted
into the Cabinet, Parrikar said, “I can comment on it
only after meeting the Prime Minister. Let the Prime
Minister who has invited me comment”
• Prime Minister Narendra Modi is likely to expand his
Cabinet on Sunday, according to Times Now
• “The prime focus of this release of our product is to
simplify the user interface”
• Exercise:
• What are the possible features we may consider for
representing the Trigram LM problem?
• How do we extend this set of trigram features into a
more powerful set of features?
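One possible starting point for the exercise, offered as a sketch rather than as the intended answer: treat x as the tuple of words seen so far and y as a candidate next word, and build indicator features with small factory functions. The factories `make_trigram_feature`, `make_bigram_feature`, and `make_trigger_feature`, and the example history, are hypothetical.

```python
def make_trigram_feature(u, v, w):
    """Fires when the last two words are (u, v) and the candidate next word is w."""
    return lambda x, y: int(tuple(x[-2:]) == (u, v) and y == w)


def make_bigram_feature(v, w):
    """Fires when the last word is v and the candidate next word is w."""
    return lambda x, y: int(tuple(x[-1:]) == (v,) and y == w)


def make_trigger_feature(trigger, w):
    """Fires when `trigger` appears anywhere in the history and the candidate is w."""
    return lambda x, y: int(trigger in x and y == w)


features = [
    make_trigram_feature("the", "prime", "minister"),
    make_bigram_feature("prime", "minister"),
    make_trigger_feature("cabinet", "minister"),
    make_trigram_feature("the", "prime", "focus"),
]

# Hypothetical history x and candidate next word y.
history = ("asked", "about", "the", "cabinet", "he", "met", "the", "prime")
vector = [f(history, "minister") for f in features]
print(vector)
```

The first feature reproduces what a trigram model sees; the bigram and trigger features add evidence of different granularity, including a long-range cue that a fixed two-word context cannot capture.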
Parameter Vector
• Given the feature vector f(x, y) ∈ ℝ^m we can define the parameter
vector v ∈ ℝ^m
• Each (x, y) is mapped to a score which is the dot product of the
parameter vector and the feature vector:
$v \cdot f(x, y) = \sum_{k=1}^{m} v_k f_k(x, y)$
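The score is an ordinary dot product, which can be sketched in a few lines of Python. The weights and feature values below are made-up numbers for illustration only.

```python
def score(v, f):
    """Dot product of the parameter vector v and the feature vector f(x, y)."""
    if len(v) != len(f):
        raise ValueError("parameter and feature vectors must have the same length m")
    return sum(v_k * f_k for v_k, f_k in zip(v, f))


# Hypothetical parameter vector and feature vector with m = 3.
v = [1.5, -0.5, 2.0]
f_xy = [1, 0, 1]
print(score(v, f_xy))
```

With binary features the score is simply the sum of the weights of the features that fire for the pair (x, y).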
Log-linear model - definition
• Let X be the input domain and Y the label space
• Our goal is to determine P(y|x)
• A feature is a function: 𝑓: 𝑋 × 𝑌 → ℝ
• We have m features that constitute a feature vector: f(x, y) ∈ ℝ^m
• We also have the parameter vector: v ∈ ℝ^m
• We define the log-linear model as:
$p(y \mid x; v) = \dfrac{e^{v \cdot f(x, y)}}{\sum_{y' \in Y} e^{v \cdot f(x, y')}}$
Refer: Coursera Notes Prof Michael Collins
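The definition can be sketched directly in Python: score every label, exponentiate, and normalize over the label set. The label set, the three toy features, and the weights are hypothetical; subtracting the maximum score before exponentiating is a standard numerical-stability step that does not change the resulting probabilities.

```python
import math


def log_linear_probs(x, labels, features, v):
    """Return p(y | x; v) for every y in labels under a log-linear model."""
    scores = {
        y: sum(v_k * f_k(x, y) for v_k, f_k in zip(v, features)) for y in labels
    }
    # Shift by the maximum score so that exp() cannot overflow.
    top = max(scores.values())
    exps = {y: math.exp(s - top) for y, s in scores.items()}
    z = sum(exps.values())  # normalization term: the sum over y' in Y
    return {y: e / z for y, e in exps.items()}


# Hypothetical label set, indicator features, and parameter vector.
labels = ["PERSON", "LOCATION", "DATE"]
features = [
    lambda x, y: int(y == "PERSON" and x[:1].isupper()),
    lambda x, y: int(y == "LOCATION" and x[:1].isupper()),
    lambda x, y: int(y == "DATE" and x.isdigit()),
]
v = [2.0, 1.0, 3.0]

probs = log_linear_probs("Gandhi", labels, features, v)
print(probs)
```

For the input "Gandhi" the scores are 2, 1, and 0 for PERSON, LOCATION, and DATE, so the probabilities are proportional to e^2, e^1, and e^0.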

More Related Content

What's hot

Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variants
ananth
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
Sujit Pal
 
An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier
ananth
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
ananth
 
Mathematical Background for Artificial Intelligence
Mathematical Background for Artificial IntelligenceMathematical Background for Artificial Intelligence
Mathematical Background for Artificial Intelligence
ananth
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
ananth
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptbutest
 
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskDeep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Saurabh Saxena
 
NLP Bootcamp
NLP BootcampNLP Bootcamp
NLP Bootcamp
Anuj Gupta
 
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
hyunsung lee
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas Mikolov
Bhaskar Mitra
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
Shuai Zhang
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
Alexandros Karatzoglou
 
H transformer-1d paper review!!
H transformer-1d paper review!!H transformer-1d paper review!!
H transformer-1d paper review!!
taeseon ryu
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Roelof Pieters
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
Girish Khanzode
 
Test for AI model
Test for AI modelTest for AI model
Test for AI model
Arithmer Inc.
 
Generating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural NetworksGenerating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural Networks
Jonathan Mugan
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
Vijay Ganti
 
Lecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyLecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language Technology
Marina Santini
 

What's hot (20)

Generative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variantsGenerative Adversarial Networks : Basic architecture and variants
Generative Adversarial Networks : Basic architecture and variants
 
Artificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep LearningArtificial Intelligence, Machine Learning and Deep Learning
Artificial Intelligence, Machine Learning and Deep Learning
 
An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier An Overview of Naïve Bayes Classifier
An Overview of Naïve Bayes Classifier
 
Machine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision TreesMachine Learning Lecture 3 Decision Trees
Machine Learning Lecture 3 Decision Trees
 
Mathematical Background for Artificial Intelligence
Mathematical Background for Artificial IntelligenceMathematical Background for Artificial Intelligence
Mathematical Background for Artificial Intelligence
 
Recurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRURecurrent Neural Networks, LSTM and GRU
Recurrent Neural Networks, LSTM and GRU
 
Machine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.pptMachine Learning Applications in NLP.ppt
Machine Learning Applications in NLP.ppt
 
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate HelpdeskDeep Learning Enabled Question Answering System to Automate Corporate Helpdesk
Deep Learning Enabled Question Answering System to Automate Corporate Helpdesk
 
NLP Bootcamp
NLP BootcampNLP Bootcamp
NLP Bootcamp
 
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...Session-Based Recommendations with Recurrent Neural Networks(Balazs Hidasi, ...
Session-Based Recommendations with Recurrent Neural Networks (Balazs Hidasi, ...
 
Recurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas MikolovRecurrent networks and beyond by Tomas Mikolov
Recurrent networks and beyond by Tomas Mikolov
 
Talk@rmit 09112017
Talk@rmit 09112017Talk@rmit 09112017
Talk@rmit 09112017
 
Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial Deep Learning for Recommender Systems RecSys2017 Tutorial
Deep Learning for Recommender Systems RecSys2017 Tutorial
 
H transformer-1d paper review!!
H transformer-1d paper review!!H transformer-1d paper review!!
H transformer-1d paper review!!
 
Deep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word EmbeddingsDeep Learning for NLP: An Introduction to Neural Word Embeddings
Deep Learning for NLP: An Introduction to Neural Word Embeddings
 
Machine Learning
Machine LearningMachine Learning
Machine Learning
 
Test for AI model
Test for AI modelTest for AI model
Test for AI model
 
Generating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural NetworksGenerating Natural-Language Text with Neural Networks
Generating Natural-Language Text with Neural Networks
 
Machine Learning in NLP
Machine Learning in NLPMachine Learning in NLP
Machine Learning in NLP
 
Lecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language TechnologyLecture 2 Basic Concepts in Machine Learning for Language Technology
Lecture 2 Basic Concepts in Machine Learning for Language Technology
 

Viewers also liked

Principle of Maximum Entropy
Principle of Maximum EntropyPrinciple of Maximum Entropy
Principle of Maximum Entropy
Jiawang Liu
 
Max Entropy
Max EntropyMax Entropy
Max Entropyjianingy
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
ananth
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
ananth
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
Rahul Jain
 
MaxEnt 2009 talk
MaxEnt 2009 talkMaxEnt 2009 talk
MaxEnt 2009 talk
Christian Robert
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysisharit66
 
Stanford - Statistical Learning
Stanford - Statistical LearningStanford - Statistical Learning
Stanford - Statistical LearningRavi Sankar Varma
 
Hierarchichal species distributions model and Maxent
Hierarchichal species distributions model and MaxentHierarchichal species distributions model and Maxent
Hierarchichal species distributions model and Maxent
richardchandler
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSO
tuxette
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
Ankur Tyagi
 
Stanford Statistical Learning
Stanford Statistical LearningStanford Statistical Learning
Stanford Statistical LearningKurt Holst
 
Sentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine LearningSentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine Learning
Nihar Suryawanshi
 
L05 word representation
L05 word representationL05 word representation
L05 word representation
ananth
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
tuxette
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introduction
ananth
 
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Decision and Policy Analysis Program
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learning
mahutte
 
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
Toshiki Sakai
 

Viewers also liked (20)

Principle of Maximum Entropy
Principle of Maximum EntropyPrinciple of Maximum Entropy
Principle of Maximum Entropy
 
Max Entropy
Max EntropyMax Entropy
Max Entropy
 
Overview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language ProcessingOverview of TensorFlow For Natural Language Processing
Overview of TensorFlow For Natural Language Processing
 
Word representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2VecWord representation: SVD, LSA, Word2Vec
Word representation: SVD, LSA, Word2Vec
 
Introduction to Machine Learning
Introduction to Machine LearningIntroduction to Machine Learning
Introduction to Machine Learning
 
MaxEnt 2009 talk
MaxEnt 2009 talkMaxEnt 2009 talk
MaxEnt 2009 talk
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Stanford - Statistical Learning
Stanford - Statistical LearningStanford - Statistical Learning
Stanford - Statistical Learning
 
Hierarchichal species distributions model and Maxent
Hierarchichal species distributions model and MaxentHierarchichal species distributions model and Maxent
Hierarchichal species distributions model and Maxent
 
Sentiment analysis
Sentiment analysisSentiment analysis
Sentiment analysis
 
Inferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSOInferring networks from multiple samples with consensus LASSO
Inferring networks from multiple samples with consensus LASSO
 
Sentiment Analysis
Sentiment AnalysisSentiment Analysis
Sentiment Analysis
 
Stanford Statistical Learning
Stanford Statistical LearningStanford Statistical Learning
Stanford Statistical Learning
 
Sentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine LearningSentiment Analysis Using Machine Learning
Sentiment Analysis Using Machine Learning
 
L05 word representation
L05 word representationL05 word representation
L05 word representation
 
A short introduction to statistical learning
A short introduction to statistical learningA short introduction to statistical learning
A short introduction to statistical learning
 
Deep Learning Primer - a brief introduction
Deep Learning Primer - a brief introductionDeep Learning Primer - a brief introduction
Deep Learning Primer - a brief introduction
 
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
Brief introduction to Ecocrop as a tool for crop suitability analysis to clim...
 
Introduction to Statistical Machine Learning
Introduction to Statistical Machine LearningIntroduction to Statistical Machine Learning
Introduction to Statistical Machine Learning
 
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
CNN-RNN: A Unified Framework for Multi-label Image Classification@CV勉強会35回CVP...
 

Similar to MaxEnt (Loglinear) Models - Overview

Text Processing Framework for Hindi
Text Processing Framework for HindiText Processing Framework for Hindi
Text Processing Framework for Hindi
Utsav Chokshi
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesValentina Paunovic
 
Scaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine LearningScaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine Learning
Vo Viet Anh
 
Introduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptxIntroduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptx
MMCOE, Karvenagar, Pune
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
Satyam Saxena
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
Anuj Gupta
 
Discovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search InterfaceDiscovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search Interface
Keita (Del Valle) Wangari
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
Anuj Gupta
 
PL Lecture 03 - Types
PL Lecture 03 - TypesPL Lecture 03 - Types
PL Lecture 03 - Types
Schwannden Kuo
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
RuchitaMaaran
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural Network
Saurav Jha
 
LLM.pdf
LLM.pdfLLM.pdf
LLM.pdf
MedBelatrach
 
Introduction to programming with python
Introduction to programming with pythonIntroduction to programming with python
Introduction to programming with python
Porimol Chandro
 
Lec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdfLec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdf
MAJDABDALLAH3
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
lucenerevolution
 
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
DrkhanchanaR
 
Module 3,4.pptx
Module 3,4.pptxModule 3,4.pptx
Module 3,4.pptx
SandeepR95
 
Practical Natural language processing
Practical Natural language processing Practical Natural language processing
Practical Natural language processing
Kim Ming Teh
 
Tokenization and how to use it from scratch
Tokenization and how to use it from scratchTokenization and how to use it from scratch
Tokenization and how to use it from scratch
Mahmoud Yasser
 

Similar to MaxEnt (Loglinear) Models - Overview (20)

Text Processing Framework for Hindi
Text Processing Framework for HindiText Processing Framework for Hindi
Text Processing Framework for Hindi
 
Towards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositoriesTowards advanced data retrieval from learning objects repositories
Towards advanced data retrieval from learning objects repositories
 
Scaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine LearningScaling Quality on Quora Using Machine Learning
Scaling Quality on Quora Using Machine Learning
 
NLP from scratch
NLP from scratch NLP from scratch
NLP from scratch
 
Introduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptxIntroduction to Artificial Intelligence...pptx
Introduction to Artificial Intelligence...pptx
 
Anthiil Inside workshop on NLP
Anthiil Inside workshop on NLPAnthiil Inside workshop on NLP
Anthiil Inside workshop on NLP
 
Representation Learning of Text for NLP
Representation Learning of Text for NLPRepresentation Learning of Text for NLP
Representation Learning of Text for NLP
 
Discovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search InterfaceDiscovering Real-World Usage for a Multimodal Math Search Interface
Discovering Real-World Usage for a Multimodal Math Search Interface
 
NLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLPNLP Bootcamp 2018 : Representation Learning of text for NLP
NLP Bootcamp 2018 : Representation Learning of text for NLP
 
PL Lecture 03 - Types
PL Lecture 03 - TypesPL Lecture 03 - Types
PL Lecture 03 - Types
 
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptxARTIFICIAL INTELLIGENCE---UNIT 4.pptx
ARTIFICIAL INTELLIGENCE---UNIT 4.pptx
 
Reference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural NetworkReference Scope Identification of Citances Using Convolutional Neural Network
Reference Scope Identification of Citances Using Convolutional Neural Network
 
LLM.pdf
LLM.pdfLLM.pdf
LLM.pdf
 
Introduction to programming with python
Introduction to programming with pythonIntroduction to programming with python
Introduction to programming with python
 
Lec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdfLec01-Algorithems - Introduction and Overview.pdf
Lec01-Algorithems - Introduction and Overview.pdf
 
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVMText Classification with Lucene/Solr, Apache Hadoop and LibSVM
Text Classification with Lucene/Solr, Apache Hadoop and LibSVM
 
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
Unit I- Data structures Introduction, Evaluation of Algorithms, Arrays, Spars...
 
Module 3,4.pptx
Module 3,4.pptxModule 3,4.pptx
Module 3,4.pptx
 
Practical Natural language processing
Practical Natural language processing Practical Natural language processing
Practical Natural language processing
 
Tokenization and how to use it from scratch
Tokenization and how to use it from scratchTokenization and how to use it from scratch
Tokenization and how to use it from scratch
 

More from ananth

Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
ananth
 
Overview of Convolutional Neural Networks
Overview of Convolutional Neural NetworksOverview of Convolutional Neural Networks
Overview of Convolutional Neural Networks
ananth
 
Search problems in Artificial Intelligence
Search problems in Artificial IntelligenceSearch problems in Artificial Intelligence
Search problems in Artificial Intelligence
ananth
 
Introduction to Artificial Intelligence
Introduction to Artificial IntelligenceIntroduction to Artificial Intelligence
Introduction to Artificial Intelligence
ananth
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognition
ananth
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1
ananth
 
An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)
ananth
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlp
ananth
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
ananth
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
ananth
 

More from ananth (10)

Convolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular ArchitecturesConvolutional Neural Networks : Popular Architectures
Convolutional Neural Networks : Popular Architectures
 
Overview of Convolutional Neural Networks
Overview of Convolutional Neural NetworksOverview of Convolutional Neural Networks
Overview of Convolutional Neural Networks
 
Search problems in Artificial Intelligence
Search problems in Artificial IntelligenceSearch problems in Artificial Intelligence
Search problems in Artificial Intelligence
 
Introduction to Artificial Intelligence
Introduction to Artificial IntelligenceIntroduction to Artificial Intelligence
Introduction to Artificial Intelligence
 
Deep Learning For Speech Recognition
Deep Learning For Speech RecognitionDeep Learning For Speech Recognition
Deep Learning For Speech Recognition
 
Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1Convolutional Neural Networks: Part 1
Convolutional Neural Networks: Part 1
 
An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)An overview of Hidden Markov Models (HMM)
An overview of Hidden Markov Models (HMM)
 
Natural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlpNatural Language Processing: L03 maths fornlp
Natural Language Processing: L03 maths fornlp
 
Natural Language Processing: L02 words
Natural Language Processing: L02 wordsNatural Language Processing: L02 words
Natural Language Processing: L02 words
 
Natural Language Processing: L01 introduction
Natural Language Processing: L01 introductionNatural Language Processing: L01 introduction
Natural Language Processing: L01 introduction
 

MaxEnt (Loglinear) Models - Overview

  • 2. MaxEnt Classifier
    • This is a powerful model that is equivalent to logistic regression
    • Many NLP problems can be reformulated as classification problems
      • E.g. Language Modelling, Tagging Problems
    • MaxEnt is widely used for various text processing tasks
    • The task is to estimate the probability of a class given the context
    • The term context may refer to a single word or a group of words
    • A large text corpus contains information on the co-occurrence of classes and specific contexts
  • 3. Problem: MaxEnt (Refer paper by Ratnaparkhi)
    • Let p(a, b) be the probability of class a occurring with context b
    • Given the sparsity of words in b and the limited training data, it is not possible to completely specify p(a, b)
    • Given the sparse evidence about a’s and b’s, our goal is to estimate the probability model p(a, b)
  • 5. Representing Evidence
    • One way to represent evidence is to encode useful facts as features and to impose constraints on the values of those feature expectations
    • A feature is a binary valued function (indicator function):
      • $f_i : \varepsilon \to \{0, 1\}$
    • Given k features, the constraints have the form:
      • Expectation value of the model for the feature $f_j$ = Observed expectation value for the feature $f_j$
      • $\sum_{x \in \varepsilon} p(x) f_j(x) = \sum_{x \in \varepsilon} \tilde{p}(x) f_j(x)$, where $p$ is the model distribution and $\tilde{p}$ is the observed distribution
    • The principle of maximum entropy requires:
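The statement below is a sketch of how the principle is usually written in Ratnaparkhi's formulation, using the same event space $\varepsilon$ and features $f_j$ as the constraints on this slide. The symbols $H$ (entropy), $P$ (the set of distributions that satisfy the constraints), $p^*$ (the selected model) and $\tilde{p}$ (the observed distribution) are notation assumed here, not taken from the slide.

```latex
% Entropy of a candidate distribution p over the event space
H(p) = -\sum_{x \in \varepsilon} p(x) \log p(x)

% Distributions whose feature expectations match the observed ones
P = \{\, p \mid \sum_{x \in \varepsilon} p(x) f_j(x)
        = \sum_{x \in \varepsilon} \tilde{p}(x) f_j(x),\ j = 1, \ldots, k \,\}

% Maximum entropy principle: among the consistent distributions,
% choose the one with the highest entropy
p^* = \arg\max_{p \in P} H(p)
```

Read this way, the model commits only to what the features constrain and stays as uniform as possible everywhere else.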
  • 6. Motivating Problems for Log-linear Models
    • Language Model: Given the context (that is, the words $w_1, w_2, \ldots, w_{i-1}$), predict the word $w_i$
    • Consider the examples:
      • A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it has exactly two positive divisors, 1 and the number itself. Natural numbers greater than 1 that are not prime are called composite.
      • Asked about the speculation that he may be inducted into the Cabinet, Parrikar said, “I can comment on it only after meeting the Prime Minister. Let the Prime Minister who has invited me comment”
      • Prime Minister Narendra Modi is likely to expand his Cabinet on Sunday, according to Times Now
      • “The prime focus of this release of our product is to simplify the user interface”
    • N-gram models
      • Use the context of the previous (n-1) words to predict the nth word
      • A trigram model uses the 2 previous words
    • Sometimes the accuracy can be improved if other features of the input are taken into consideration, as opposed to using only a very limited context
    • The n-gram LM techniques are not flexible enough to include additional features, such as the total length of the sentence, the presence of certain specific words, the identity of the author, etc. Note: One might include extra features such as the author’s name and compute conditional probabilities, but such extensions to the conventional trigram approach quickly become unwieldy
    • Log-linear models can be used to include the additional features and improve the performance
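The trigram approach described on this slide can be sketched as a maximum-likelihood estimate from counts. This is a minimal illustration, not the deck's implementation: the function names `train_trigram` and `trigram_prob` and the toy sentence are assumptions, and no smoothing is applied.

```python
from collections import Counter

def train_trigram(tokens):
    """Count each trigram and the two-word history that precedes its last word."""
    trigrams = Counter(zip(tokens, tokens[1:], tokens[2:]))
    histories = Counter((u, v) for (u, v, _w) in trigrams.elements())
    return trigrams, histories

def trigram_prob(trigrams, histories, u, v, w):
    """Unsmoothed estimate of P(w | u, v); 0.0 if the history was never seen."""
    if histories[(u, v)] == 0:
        return 0.0
    return trigrams[(u, v, w)] / histories[(u, v)]

# Toy corpus (hypothetical): "prime" is followed by "minister" twice, "number" once.
tokens = "the prime minister met the prime minister of the prime number club".split()
trigrams, histories = train_trigram(tokens)
p_minister = trigram_prob(trigrams, histories, "the", "prime", "minister")
p_number = trigram_prob(trigrams, histories, "the", "prime", "number")
```

The estimate depends only on the two preceding words, so there is no natural place to add evidence such as sentence length or author identity, which is the limitation the slide points out.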
  • 7. The general problem
    • We have an input domain X
      • For example: A sequence of words
    • There is a finite label set Y
      • For example: The space of all possible words, that is, the vocabulary
    • Our goal is to determine $P(y \mid x)$ for any x, y where x is in the input space and y is in the space of labels
      • For example: Given an input sentence (that is x, a sequence of words), determine the next word in the sequence, that is, $P(w_i \mid w_1, \ldots, w_{i-1})$
  • 8. Feature Vector
    • A feature is a function $f_k(x, y) \in \mathbb{R}$
    • Often the features used in log-linear models for typical NLP applications are binary functions, also called indicator functions: $f_k(x, y) \in \{0, 1\}$
    • If we have m features, then the feature vector is $f(x, y) \in \mathbb{R}^m$
    • The number and choice of features for a given input is arbitrary. The system developer can design these using intuition about the problem space being addressed.
  • 9. Features in Log-Linear Models
    • Features are elementary pieces of evidence that link aspects of what we observe, x, with a label y that we want to predict (Ref: C Manning)
    • A feature is a function with a bounded real value: $f : X \times Y \to \mathbb{R}$
    • Example:
      • Consider a sentence: “Gandhi was born on 2 October 1869 in Porbandar”
      • $f_1(x, y)$ = [y = PERSON and $w_i$ is capitalized and $w_{i+1}$ ∈ {“was”, “is”} and $w_{i+2}$ is a VERB]
      • $f_2(x, y)$ = [y = LOCATION and $w_i$ is capitalized and $w_{i+1}$ ∈ {“was”, “is”} and $w_{i+2}$ is a VERB]
      • $f_3(x, y)$ = [y = DATE and $w_i$ is tagged CD and $w_{i-1}$ = “on” and $w_{i-2}$ is a VERB]
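The three example features on this slide can be written as indicator functions. The sketch below makes several assumptions that are not in the slides: the input x is represented as a tuple `(tokens, tags, i)`, the part-of-speech tags are hand-assigned Penn Treebank style tags, and `VERB_TAGS` and `feature_vector` are hypothetical helper names.

```python
# Tags treated as "VERB" (an assumption; the slide does not name a tag set).
VERB_TAGS = {"VB", "VBD", "VBG", "VBN", "VBP", "VBZ"}

def _capitalized_before_copula_and_verb(tokens, tags, i):
    """Shared test: w_i capitalized, w_{i+1} in {was, is}, w_{i+2} a verb."""
    return (
        i + 2 < len(tokens)
        and tokens[i][:1].isupper()
        and tokens[i + 1] in {"was", "is"}
        and tags[i + 2] in VERB_TAGS
    )

def f1(x, y):
    tokens, tags, i = x
    return int(y == "PERSON" and _capitalized_before_copula_and_verb(tokens, tags, i))

def f2(x, y):
    tokens, tags, i = x
    return int(y == "LOCATION" and _capitalized_before_copula_and_verb(tokens, tags, i))

def f3(x, y):
    tokens, tags, i = x
    return int(
        y == "DATE"
        and i >= 2
        and tags[i] == "CD"
        and tokens[i - 1] == "on"
        and tags[i - 2] in VERB_TAGS
    )

def feature_vector(x, y):
    """Stack the indicator features into f(x, y)."""
    return [f1(x, y), f2(x, y), f3(x, y)]

tokens = "Gandhi was born on 2 October 1869 in Porbandar".split()
tags = ["NNP", "VBD", "VBN", "IN", "CD", "NNP", "CD", "IN", "NNP"]  # hand-assigned

person_at_gandhi = feature_vector((tokens, tags, 0), "PERSON")
location_at_gandhi = feature_vector((tokens, tags, 0), "LOCATION")
date_at_2 = feature_vector((tokens, tags, 4), "DATE")
```

Note that f1 and f2 fire on the same evidence for different labels. The features alone do not decide between PERSON and LOCATION for "Gandhi"; that preference has to come from the weights learned for each feature.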
  • 10. Feature Vector Representations
    • Consider the examples:
      • A natural number (i.e. 1, 2, 3, 4, 5, 6, etc.) is called a prime number (or a prime) if it has exactly two positive divisors, 1 and the number itself. Natural numbers greater than 1 that are not prime are called composite.
      • Asked about the speculation that he may be inducted into the Cabinet, Parrikar said, “I can comment on it only after meeting the Prime Minister. Let the Prime Minister who has invited me comment”
      • Prime Minister Narendra Modi is likely to expand his Cabinet on Sunday, according to Times Now
      • “The prime focus of this release of our product is to simplify the user interface”
    • Exercise:
      • What are the possible features we may consider for representing the trigram LM problem?
      • How do we extend this set of trigram features into a more powerful set of features?
  • 11. Parameter Vector
    • Given the feature vector $f(x, y) \in \mathbb{R}^m$, we can define the parameter vector $v \in \mathbb{R}^m$
    • Each (x, y) is mapped to a score, which is the dot product of the parameter vector and the feature vector: $v \cdot f(x, y) = \sum_{k=1}^{m} v_k f_k(x, y)$
  • 12. Log-linear model - definition
    • Let the input domain be X and the label space be Y
    • Our goal is to determine $P(y \mid x)$
    • A feature is a function: $f : X \times Y \to \mathbb{R}$
    • We have m features that constitute a feature vector: $f(x, y) \in \mathbb{R}^m$
    • We also have the parameter vector: $v \in \mathbb{R}^m$
    • We define the log-linear model as: $p(y \mid x; v) = \dfrac{e^{v \cdot f(x, y)}}{\sum_{y' \in Y} e^{v \cdot f(x, y')}}$
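The definition above can be sketched directly in code: score every label with the dot product $v \cdot f(x, y)$, exponentiate, and normalize over the label set. The function names, the toy feature function `toy_features`, the label set and the weights below are all assumptions made for illustration. Subtracting the maximum score before exponentiating is a standard numerical safeguard and does not change the resulting probabilities.

```python
import math

def score(v, f, x, y):
    """Dot product v . f(x, y) of the parameter and feature vectors."""
    return sum(vk * fk for vk, fk in zip(v, f(x, y)))

def log_linear_prob(v, f, x, y, labels):
    """p(y | x; v) = exp(v.f(x,y)) / sum over y' in labels of exp(v.f(x,y'))."""
    scores = {label: score(v, f, x, label) for label in labels}
    max_score = max(scores.values())  # shift for numerical stability
    exp_scores = {label: math.exp(s - max_score) for label, s in scores.items()}
    return exp_scores[y] / sum(exp_scores.values())

def toy_features(x, y):
    """Hypothetical 2-dimensional feature vector for a single word x."""
    return [
        1.0 if y == "PERSON" and x[:1].isupper() else 0.0,
        1.0 if y == "DATE" and x.isdigit() else 0.0,
    ]

labels = ["PERSON", "DATE", "OTHER"]
v = [2.0, 3.0]  # illustrative weights, not learned from data

p_person = log_linear_prob(v, toy_features, "Gandhi", "PERSON", labels)
distribution = {y: log_linear_prob(v, toy_features, "Gandhi", y, labels) for y in labels}
```

For the word "Gandhi" only the first feature fires, and only for PERSON, so that label scores 2.0 while the other two score 0; the probabilities over the three labels still sum to one.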
  • 13. Reference: Coursera notes by Prof. Michael Collins