
Morphological Analysis

morphological analysis
paradigm based approach

Published in: Engineering

  1. Language Processing (HUL455)
     MORPHOLOGICAL ANALYSIS
     Jinia Rao & Ashish Kashyap
  2. CONTENTS
     • Morphology and its types
     • Approaches to morphology
     • Morpheme-based morphology
     • Morphological analysis and its need
     • Morphological generation and analysis using paradigms
     • Problems in morphological analysis
     • Bibliography
  3. MORPHOLOGY
     • The study of word formation: how words are built up from smaller pieces.
     • Identification, analysis, and description of the structure of a given language's morphemes and other linguistic units, such as root words, affixes, parts of speech, intonations and stresses, or implied context.
  4. Examples
     • Washing = wash + ing
     • Browser = browse + er
     • Rats = rat + s
  5. Types of Morphology
     • Inflectional morphology: modification of a word to express different grammatical categories. Examples: cats, men.
     • Derivational morphology: creation of a new word from an existing word, often changing its grammatical category. Examples: happiness, brotherhood.
  6. APPROACHES TO MORPHOLOGY
     There are three principal approaches to morphology:
     • Morpheme-based morphology
     • Lexeme-based morphology
     • Word-based morphology
  7. Morpheme-based Morphology
     • Word forms are analyzed as arrangements of morphemes.
     • Morpheme: the smallest linguistic unit with a grammatical function.
  8. Lexeme-based Morphology
     • Lexeme-based morphology usually takes what is called an "item-and-process" approach.
     • Instead of analyzing a word form as a set of morphemes arranged in sequence, a word form is said to be the result of applying rules that alter a word form or stem in order to produce a new one.
  9. Word-based Morphology
     • Word-based morphology is (usually) a word-and-paradigm approach.
     • Instead of stating rules to combine morphemes into word forms, or to generate word forms from stems, word-based morphology states generalizations that hold between the forms of inflectional paradigms.
  10. MORPHOLOGICAL ANALYSIS
     • Analyzing words into their linguistic components (morphemes).
     • Ambiguity: a word form may have more than one analysis. For example, "flies" can be analyzed as
         fly + VERB + 3SG
         fly + NOUN + PLU
  11. Expected Output
     Input      Morphologically analyzed output
     Cats       Cat + N + PL
     Cat        Cat + N + SG
     Cities     City + N + PL
     Geese      Goose + N + PL
     Goose      Goose + N + SG or Goose + V
     Gooses     Goose + V + 3SG
     Merging    Merge + V + PresPart
     Caught     Catch + V + PastPart
     Caught     Catch + V + Past
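
The expected output above can be mimicked with a plain lookup table. The sketch below is only an illustration, not the method the slides propose: the dictionary `ANALYSES` and the function `analyze` are hypothetical names, and the entries are copied from the table. It also shows how an ambiguous form such as "caught" yields more than one analysis.

```python
# Exhaustive-lexicon sketch: every word form is listed with all of its analyses.
# ANALYSES and analyze() are illustrative names, not part of the original slides.
ANALYSES = {
    "cats": ["cat + N + PL"],
    "cat": ["cat + N + SG"],
    "cities": ["city + N + PL"],
    "geese": ["goose + N + PL"],
    "goose": ["goose + N + SG", "goose + V"],
    "gooses": ["goose + V + 3SG"],
    "merging": ["merge + V + PresPart"],
    "caught": ["catch + V + PastPart", "catch + V + Past"],
}


def analyze(word):
    """Return every stored analysis of a word form (empty list if unknown)."""
    return ANALYSES.get(word.lower(), [])


for form in ["Cities", "Caught", "Dogs"]:
    print(form, "->", analyze(form))
```

A table like this needs one entry per word form and returns nothing for a form it has never seen, such as "Dogs" here.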
  12. NEED FOR MORPHOLOGICAL ANALYSIS
     • An exhaustive lexicon that lists every word form wastes memory.
     • An exhaustive lexicon fails to capture linguistic generalizations, which are necessary to understand an unknown word.
     • Morphologically rich and productive languages are especially problematic for an exhaustive lexicon.
  13. MORPHOLOGICAL ANALYSIS USING PARADIGMS
     • Most NLP systems use simple linguistic theories for morphological analysis.
     • The paradigm-based approach is widely used in NLP systems.
  14. • Words are related to each other by analogical rules.
     • Words can be categorized based on the pattern they fit into.
     • Patterns apply both to existing words and to new ones.
     • Applying a pattern different from the one that has been used so far gives rise to a new word.
     • Example: "older" replacing "elder".
  15. Procedure and Algorithm
     • A language expert provides different tables of word forms covering the words in the entire language.
     • The roots follow the pattern (or paradigm) implicit in the table for generating their word forms.
     • Example: the word forms table on slide 16.
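  16. Continued
     Word forms table for the root LADAKAA:
                     CASE
         Number      Direct     Oblique
         Singular    LADAKAA    LADAKE
         Plural      LADAKE     LADAKON
     Paradigm table. Each entry shows the number of characters to be deleted from the end of the root and the suffix to be added (the final long vowel AA counts as one character):
                     CASE
         Number      Direct     Oblique
         Singular    (0, Ø)     (1, E)
         Plural      (1, E)     (1, ON)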
  17. Continued
     The table can be expressed in terms of an algorithm, as follows.
     ALGORITHM 1: Forming a paradigm table
     PURPOSE: To form a paradigm table from the word forms table for a root
     INPUT: Root r, word forms table WFT (with labels for rows and columns)
     OUTPUT: Paradigm table PT
     ALGORITHM:
     1. Create an empty table PT of the same dimensionality, size and labels as the word forms table WFT
  18. Continued
     2. For every entry w in WFT, do
          if w = r
          then store "(0, Ø)" in the corresponding position in PT
          else begin
            let i be the position of the first character at which w and r differ
            store (size(r)-i+1, suffix(i,w)) at the corresponding position in PT
          end
     3. Return PT
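
Algorithm 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the word forms table is represented as a dictionary keyed by (number, case), the names `first_difference` and `form_paradigm_table` are hypothetical, and the slides' example is written in a one-symbol-per-sound transliteration (ladakA, ladake, ladakoM) so that a long vowel counts as a single character. Positions are 1-based, as in the algorithm.

```python
def first_difference(w, r):
    """1-based position of the first character at which w and r differ."""
    for position, (a, b) in enumerate(zip(w, r), start=1):
        if a != b:
            return position
    # One string is a prefix of the other: they differ just past the shorter one.
    return min(len(w), len(r)) + 1


def form_paradigm_table(root, wft):
    """Build PT from WFT: each entry is (characters to delete, suffix to add)."""
    pt = {}
    for features, w in wft.items():
        if w == root:
            pt[features] = (0, "")  # "(0, Ø)": nothing deleted, nothing added
        else:
            i = first_difference(w, root)
            # size(r) - i + 1 characters are dropped; suffix(i, w) is w from position i
            pt[features] = (len(root) - i + 1, w[i - 1:])
    return pt


# Word forms table for the root, keyed by (number, case); spelling is an assumption.
WFT = {
    ("singular", "direct"): "ladakA",
    ("singular", "oblique"): "ladake",
    ("plural", "direct"): "ladake",
    ("plural", "oblique"): "ladakoM",
}

PT = form_paradigm_table("ladakA", WFT)
for features, entry in PT.items():
    print(features, entry)
```

The resulting entries correspond to the paradigm table on slide 16: one character deleted and a suffix added for every form except the singular direct form, which equals the root.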
  19. Generation of a Word Form
     ALGORITHM 2: Generating a word form
     PURPOSE: To generate a word form given a root and desired feature values
     INPUT: Root r, feature values FV
     USES: Paradigm tables, dictionary of roots DR, dictionary of indeclinable words DI
     OUTPUT: Word w
     ALGORITHM:
     1. If root r belongs to DI then return the word stored in DI for r, irrespective of FV
  20. Continued
     2. let p = paradigm type of r as obtained from DR
     3. let PT = paradigm table for p
     4. let (n, s) = entry in PT for feature values FV
     5. w := r minus n characters at the end
     6. w := w plus suffix s
     END ALGORITHM
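
Algorithm 2 can be sketched in the same style. Everything below is illustrative rather than taken from the slides: the dictionaries `PARADIGM_TABLES`, `ROOT_DICTIONARY` and `INDECLINABLES` stand in for the paradigm tables, DR and DI, the sample roots (ladakA, kapadA, aur) and their one-symbol-per-sound spelling are assumptions, and `generate_word_form` is a hypothetical name.

```python
# Paradigm tables, keyed by paradigm type, then by (number, case) feature values.
PARADIGM_TABLES = {
    "ladakA": {
        ("singular", "direct"): (0, ""),
        ("singular", "oblique"): (1, "e"),
        ("plural", "direct"): (1, "e"),
        ("plural", "oblique"): (1, "oM"),
    },
}

# DR: maps each root to its paradigm type (sample entries are assumptions).
ROOT_DICTIONARY = {"ladakA": "ladakA", "kapadA": "ladakA"}

# DI: indeclinable words, returned unchanged whatever the feature values are.
INDECLINABLES = {"aur": "aur"}


def generate_word_form(root, feature_values):
    """Generate the word form of root for the requested feature values."""
    if root in INDECLINABLES:                       # step 1
        return INDECLINABLES[root]
    paradigm = ROOT_DICTIONARY[root]                # step 2
    table = PARADIGM_TABLES[paradigm]               # step 3
    n, suffix = table[feature_values]               # step 4
    stem = root[:len(root) - n]                     # step 5: drop n final characters
    return stem + suffix                            # step 6


print(generate_word_form("kapadA", ("plural", "oblique")))
print(generate_word_form("ladakA", ("singular", "direct")))
print(generate_word_form("aur", ("plural", "direct")))
```

Slicing with `len(root) - n` rather than `root[:-n]` keeps the case n = 0 correct, since `root[:-0]` would return an empty string.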
  21. PROBLEMS IN MORPHOLOGICAL ANALYSIS
     • False analysis
     • Productivity
     • Bound base morphemes
  22. False Analysis
     Words such as hospitable and sizeable:
     • They do not have the meaning "to be able".
     • They cannot take the suffix -ity to form a noun.
     • Analyzing them as words containing the suffix -able leads to a false analysis.
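
The problem can be reproduced with a naive suffix stripper. The sketch below is an illustration only: `naive_able_analysis`, `guarded_able_analysis` and the exception set `NOT_ABLE_SUFFIX` are hypothetical, and the exception list is just one possible workaround, not something the slides prescribe.

```python
def naive_able_analysis(word):
    """Split off a final -able without checking whether the split is meaningful."""
    if word.endswith("able"):
        return (word[:-4], "able")
    return None


# Hypothetical exception list: words that end in -able but should not be split.
NOT_ABLE_SUFFIX = {"hospitable", "sizeable"}


def guarded_able_analysis(word):
    """Same as the naive version, but refuse to split listed exceptions."""
    if word in NOT_ABLE_SUFFIX:
        return None
    return naive_able_analysis(word)


for w in ["washable", "hospitable", "sizeable"]:
    print(w, naive_able_analysis(w), guarded_able_analysis(w))
```

The naive function happily splits "hospitable" into "hospit" + "able", which is exactly the false analysis described above; the guarded version leaves such words unanalyzed.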
  23. PRODUCTIVITY
     • The property of a morphological process to give rise to new formations on a systematic basis.
     • Exceptions, where -able attaches to a noun base rather than a verb:
         Peaceable
         Actionable
         Companionable
  24. Bound Base Morphemes
     • Occur only in a particular complex word.
     • Do not have independent existence.
     • Words such as feasible, malleable:
         -able has the regular meaning "be able"
         the -ity form is possible
         the base words do not exist independently
     • Structure: base (nonexistent on its own) + morpheme (known) = compound
  25. REFERENCES
     • Adrian Akmajian, Richard A. Demers, Ann K. Farmer and Robert M. Harnish, "Linguistics: An Introduction to Language and Communication" (5th Edition).
     • Daniel Jurafsky and James H. Martin, "Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition" (Second Edition).
     • Akshar Bharati, Vineet Chaitanya and Rajeev Sangal, "Natural Language Processing: A Paninian Perspective".