Presentation1

Ritikesh Bhaskarwar
Guided By,
Mrs. Gauri M. Dhopavkar
Presented By,
Ritikesh Bhaskarwar, Vimal Shah,
Ashwin Borkar, Shashil Pohankar
Department of Computer Technology
YESHWANTRAO CHAVAN COLLEGE OF ENGINEERING,
Nagpur
(An Autonomous Institution Affiliated to Rashtrasant Tukadoji Maharaj Nagpur University)
Natural language processing
• Natural language processing (NLP) is a field of computer science, artificial
intelligence, and linguistics concerned with the interactions between computers
and human (natural) languages.
• NLP is the computerized approach to analysing text that is based on both a set
of theories and a set of technologies.
POS Tagging:
• Part-of-Speech (POS) tagging is the process of assigning a part of speech,
such as noun, verb, or pronoun, or another lexical class marker, to each word
in a sentence.
• After POS tags are identified, the next step is chunking, which involves
dividing sentences into non-overlapping, non-recursive phrases.
The POS Tagging Example
Input: ते फूल खूप सुगंधी आहे ("That flower is very fragrant.")
Marathi POS Tagger output:
ते - unidentified
फूल - noun
खूप - adjective
सुगंधी - adjective
आहे - verb
Need for Marathi POS Tagging:
• Lack of significant tools for Indian languages
• Dependence of other NLP activities on POS tagging
• Failure of existing techniques on Indian languages
Overview of
POS tagging
Methods for POS Tagging
1. Rule Based
• Rule-based POS tagging models apply a set of handwritten rules and use
contextual information to assign POS tags to words.
2. Stochastic
• A stochastic approach uses frequency, probability, or statistics. The simplest
stochastic approach finds the most frequently used tag for a specific word in
the annotated training data and uses this information to tag that word in
unannotated text.
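The simplest stochastic approach described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the presenters' implementation: the toy corpus, the function names, and the "unidentified" fallback for unseen words are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def train_most_frequent_tag(tagged_sentences):
    """Count how often each tag occurs for each word in the annotated data."""
    counts = defaultdict(Counter)
    for sentence in tagged_sentences:
        for word, tag in sentence:
            counts[word][tag] += 1
    # Keep only the most frequently seen tag for every word.
    return {word: c.most_common(1)[0][0] for word, c in counts.items()}

def tag_sentence(words, model, default_tag="unidentified"):
    """Tag each word with its most frequent training tag, or a fallback."""
    return [(word, model.get(word, default_tag)) for word in words]

# Hypothetical toy corpus, for illustration only.
TOY_CORPUS = [
    [("फूल", "noun"), ("सुगंधी", "adjective"), ("आहे", "verb")],
    [("फूल", "noun"), ("खूप", "adjective"), ("सुगंधी", "adjective"), ("आहे", "verb")],
]

model = train_most_frequent_tag(TOY_CORPUS)
tagged = tag_sentence(["ते", "फूल", "खूप", "सुगंधी", "आहे"], model)
for word, pos in tagged:
    print(f"{word}-{pos}")
```

A word that never appears in the training data (here the first word of the test sentence) falls back to the default tag, which is one reason a larger tagged corpus helps this kind of tagger.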
Methods for POS Tagging (contd.)
3. Hidden Markov Model
• The HMM trains on annotated corpora to estimate the transition and emission
probabilities.
4. Maximum Entropy Model
• The Maximum Entropy Model (MEM) is based on the principle of maximum entropy,
which states that when choosing between a number of different probabilistic
models for a set of data, the most valid model is the one that makes the fewest
arbitrary assumptions about the nature of the data.
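As a rough illustration of what "finding the transition and emission probabilities" means for an HMM tagger, the sketch below estimates both by relative frequency from a tiny tagged corpus. The corpus, the `<s>` start symbol, and the function name `estimate_hmm` are assumptions for the example; a practical tagger would also need smoothing and a decoding step such as Viterbi, which are omitted here.

```python
from collections import Counter, defaultdict

def estimate_hmm(tagged_sentences, start="<s>"):
    """Relative-frequency estimates of transition and emission probabilities."""
    transition_counts = defaultdict(Counter)  # previous tag -> next-tag counts
    emission_counts = defaultdict(Counter)    # tag -> word counts
    for sentence in tagged_sentences:
        previous = start
        for word, tag in sentence:
            transition_counts[previous][tag] += 1
            emission_counts[tag][word] += 1
            previous = tag

    def normalise(table):
        # Turn each row of counts into probabilities that sum to 1.
        return {
            key: {item: n / sum(row.values()) for item, n in row.items()}
            for key, row in table.items()
        }

    return normalise(transition_counts), normalise(emission_counts)

# Hypothetical toy corpus, for illustration only.
HMM_TOY_CORPUS = [
    [("फूल", "noun"), ("सुगंधी", "adjective"), ("आहे", "verb")],
    [("फूल", "noun"), ("आहे", "verb")],
]

transitions, emissions = estimate_hmm(HMM_TOY_CORPUS)
print(transitions["noun"])  # P(next tag | previous tag is noun)
print(emissions["verb"])    # P(word | tag is verb)
```

In this toy corpus a noun is followed once by an adjective and once by a verb, so each transition out of "noun" gets probability 0.5.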
Architecture and Design:
• A Marathi sentence is taken as input; tokens are then created, followed by
tagging and finding ambiguity.
INPUT → TOKENIZING → TAGGING → FINDING AMBIGUOUS WORDS → FINDING PROBABILITY →
ASSIGN TAGS ACCORDING TO PROBABILITY → VIEW THE RESULT
Details of Identified Modules:
• Tokenizer: This module gets the tokens of the input sentence. It also calls
the other modules when required.
• Tagging: This module assigns tags to tokens. It also searches for ambiguous
words, finds their types, and assigns special symbols to them.
Details of Identified Modules (contd.)
• Root word: This module finds the root word of each token by looking it up in
the Marathi WordNet.
• Probability: This module calculates the probabilities and assigns each word
the tag with the higher probability.
• Showing the results: This module shows the result. The words are shown with
their tags.
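The way these modules could fit together is sketched below. It is a minimal sketch under stated assumptions, not the presenters' code: the `ROOTS`, `LEXICON`, and `TAG_PROBABILITY` dictionaries are hypothetical stand-ins for the Marathi WordNet lookup and the trained probabilities, their contents are invented for illustration, and whitespace splitting is assumed to be adequate for tokenizing a single sentence.

```python
# Hypothetical stand-ins for the Marathi WordNet and the trained probabilities.
ROOTS = {"फुले": "फूल"}  # inflected form -> root word
LEXICON = {"फूल": ["noun"], "खूप": ["adjective", "adverb"], "आहे": ["verb"]}
TAG_PROBABILITY = {("खूप", "adjective"): 0.7, ("खूप", "adverb"): 0.3}

def tokenize(sentence):
    """Tokenizer module: split the input sentence on whitespace."""
    return sentence.split()

def root_word(token):
    """Root word module: look the token up, falling back to the token itself."""
    return ROOTS.get(token, token)

def candidate_tags(token):
    """Tagging module: list every tag the lexicon allows for the root word."""
    return LEXICON.get(root_word(token), [])

def resolve(token, tags):
    """Probability module: pick the most probable tag for an ambiguous word."""
    if not tags:
        return "unidentified"
    if len(tags) == 1:
        return tags[0]
    root = root_word(token)
    return max(tags, key=lambda t: TAG_PROBABILITY.get((root, t), 0.0))

def run_pipeline(sentence):
    """Run tokenizing, tagging, and disambiguation for one sentence."""
    return [(tok, resolve(tok, candidate_tags(tok))) for tok in tokenize(sentence)]

result = run_pipeline("ते फूल खूप आहे")
for token, pos in result:  # Showing the results: word-tag pairs
    print(f"{token}-{pos}")
```

Keeping each module as a small function mirrors the slide's module list and makes it easy to swap the dictionary lookups for a real lexical resource.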
Experimentation and Results:
1.
• 1000: If the first bit is 1, we tag the word as a noun.
• 1100: Here both the first and second bits are set, so the word can be used as
both (noun and adjective) and is marked as unidentified.
2.
• 0100: If the second bit is 1, we tag the word as an adjective.
• 0110: Here the second and third bits are set, so the word can also be used as
other parts of speech (adjective and adverb).
3.
• 0010: If the third bit is 1, we tag the word as an adverb.
• 0001: If the fourth bit is 1, we tag the word as a verb.
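The four-bit codes above can be decoded mechanically. The following sketch assumes the bit order stated on the slide (first bit noun, second adjective, third adverb, fourth verb) and treats any code with more than one bit set as ambiguous; the function names are illustrative, not taken from the original system.

```python
BIT_TAGS = ("noun", "adjective", "adverb", "verb")  # bit positions 1 to 4, left to right

def decode(code):
    """Return the tags whose bit is set in a 4-character code such as '1100'."""
    if len(code) != len(BIT_TAGS) or set(code) - {"0", "1"}:
        raise ValueError(f"expected a 4-bit string, got {code!r}")
    return [tag for bit, tag in zip(code, BIT_TAGS) if bit == "1"]

def is_ambiguous(code):
    """A word is ambiguous when more than one part-of-speech bit is set."""
    return len(decode(code)) > 1

for code in ("1000", "1100", "0100", "0110", "0010", "0001"):
    print(code, decode(code), "ambiguous" if is_ambiguous(code) else "unambiguous")
```

Codes flagged as ambiguous are the ones that would be passed on to the probability step to choose a single tag.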
Advantages:
• A POS tagger can be seen as a first step towards tightening the integration
between speech recognition and natural language processing.
• A POS tagger in the language model aids in the identification of boundary
tones and speech repairs, redefining the speech recognition problem.
Advantages (contd.):
• A typical NLP system consists of tokenization, sentence delimitation,
part-of-speech (POS) tagging, phrase chunking, parsing, and concept mapping. As
one of the initial steps, POS tagging determines the part of speech for each
token in a sentence.
• Managers, educators, trainers, and salespeople are able to accurately assess
the needs of a group and improve their questioning techniques, thus improving
their skills to achieve more consistent results.
Limitations:
• The user cannot enter more than one sentence, i.e. cannot enter a paragraph.
• The system is not able to detect and report the gender of a word, i.e.
morphological analysis is not done.
• When an ambiguous word is encountered, the system searches for its POS. If the
data searched contains few or no words with the correct POS and a larger number
of words with other POS, the system shows an incorrect POS for the ambiguous
word.
Applications:
• Information Retrieval
• Speech Synthesis
• Word Sense Disambiguation (WSD)
• Machine Translation (MT)
  - Text to Text
  - Speech to Speech
Snapshots
Conclusion and Future Scope:
• The POS tagger described here is very simple and efficient for automatic
tagging, although the morphological complexity of Marathi makes tagging hard.
The performance of the current system is good, and the results achieved by this
method are excellent. In future we wish to improve the accuracy of our system by
adding more tagged sentences to our training corpus.