Understanding Language Syntax
and Structure
 For any language, syntax and structure usually go hand
in hand: a set of specific rules, conventions, and
principles governs the way words are combined into
phrases, phrases into clauses, and clauses into
sentences.
 Constituents include words, phrases, clauses, and
sentences.
Syntactic processing
 “The brown fox is quick and he is jumping over the
lazy dog”
A bunch of unordered words doesn’t convey much information.
Types of Parsing techniques
 Typical parsing techniques for understanding text
syntax are mentioned below.
 Parts of Speech (POS) Tagging
 Shallow Parsing or Chunking
 Constituency Parsing
 Dependency Parsing
1. PoS tagging
sentence → clauses → phrases → words
Tagging Parts of Speech
 N(oun): This usually denotes words that depict some object or
entity, which may be living or nonliving. Some examples would
be fox, dog, book, and so on. The POS tag symbol for nouns
is N.
 V(erb): Verbs are words that are used to describe certain actions,
states, or occurrences. There are a wide variety of further
subcategories, such as auxiliary, reflexive, and transitive verbs
(and many more). Some typical examples of verbs would be
running, jumping, read, and write. The POS tag symbol for
verbs is V.
 Adj(ective): Adjectives are words used to describe or qualify
other words, typically nouns and noun phrases. The phrase
beautiful flower has the noun (N) flower, which is described or
qualified using the adjective (ADJ) beautiful. The POS tag
symbol for adjectives is ADJ.
 Adv(erb): Adverbs usually act as modifiers for other words
including nouns, adjectives, verbs, or other adverbs. The
phrase very beautiful flower has the adverb (ADV) very,
which modifies the adjective (ADJ) beautiful, indicating
the degree to which the flower is beautiful. The POS tag
symbol for adverbs is ADV.
 Each POS tag, like the noun (N), can be further subdivided
into categories like singular nouns (NN), singular proper
nouns (NNP), and plural nouns (NNS).
 The process of classifying and labeling words with POS tags
is called parts of speech tagging or POS tagging.
POS tags are used to annotate words and depict their
POS, which is really helpful for specific
analyses, such as narrowing down on nouns and
seeing which ones are the most prominent, word sense
disambiguation, and grammar analysis.
 Both nltk and spacy usually use the Penn
Treebank notation for POS tagging.
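As a minimal sketch of how fine-grained Penn Treebank tags roll up into the coarse categories above, and of how POS tags help narrow down on nouns, the snippet below works on hand-assigned tags for the example sentence. The tags, the prefix table, and the helpers `coarse_pos` and `words_with` are illustrative assumptions, not output or API from nltk or spacy.

```python
from collections import Counter

# Hand-assigned Penn Treebank tags for the example sentence (an assumption
# for illustration; a trained tagger may label some tokens differently).
TAGGED = [
    ("The", "DT"), ("brown", "JJ"), ("fox", "NN"), ("is", "VBZ"),
    ("quick", "JJ"), ("and", "CC"), ("he", "PRP"), ("is", "VBZ"),
    ("jumping", "VBG"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
    ("dog", "NN"),
]

# Penn Treebank tag prefixes mapped to the coarse categories from the slides.
COARSE_PREFIXES = (("NN", "N"), ("VB", "V"), ("JJ", "ADJ"), ("RB", "ADV"))


def coarse_pos(tag):
    """Map a fine-grained tag such as NNS or VBZ to N, V, ADJ, or ADV."""
    for prefix, coarse in COARSE_PREFIXES:
        if tag.startswith(prefix):
            return coarse
    return "OTHER"  # determiners, conjunctions, pronouns, prepositions, ...


def words_with(coarse, tagged):
    """Count lowercased words whose tag falls under the given coarse category."""
    return Counter(w.lower() for w, t in tagged if coarse_pos(t) == coarse)


nouns = words_with("N", TAGGED)
print(nouns.most_common())
```

On a real corpus the same counting step would run over tagger output instead of a hand-tagged list.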
2. Shallow Parsing or Chunking
 Based on the hierarchy we depicted earlier, groups of words make up phrases.
There are five major categories of phrases:
 Noun phrase (NP): These are phrases where a noun acts as the head word.
Noun phrases act as a subject or object to a verb.
 Verb phrase (VP): These phrases are lexical units that have a verb acting as the
head word. Usually, there are two forms of verb phrases. One form has the verb
components as well as other entities such as nouns, adjectives, or adverbs as
parts of the object; the other consists strictly of verb components.
 Adjective phrase (ADJP): These are phrases with an adjective as the head
word. Their main role is to describe or qualify nouns and pronouns in a
sentence, and they will be either placed before or after the noun or pronoun.
 Adverb phrase (ADVP): These phrases act like adverbs since the adverb acts as
the head word in the phrase. Adverb phrases are used as modifiers for nouns,
verbs, or adverbs themselves by providing further details that describe or
qualify them.
 Prepositional phrase (PP): These phrases usually contain a preposition as the
head word and other lexical components like nouns, pronouns, and so on.
These act like an adjective or adverb describing other words or phrases.
 We use the conll2000 corpus to train our shallow parser model. This corpus is available in nltk with chunk
annotations, and we will use around 10K records for training.
 Training and test split sizes (in sentences): 10900 48
 A sample chunked sentence from the training data:
(S
Chancellor/NNP
(PP of/IN)
(NP the/DT Exchequer/NNP)
(NP Nigel/NNP Lawson/NNP)
(NP 's/POS restated/VBN commitment/NN)
(PP to/TO)
(NP a/DT firm/NN monetary/JJ policy/NN)
(VP has/VBZ helped/VBN to/TO prevent/VB)
(NP a/DT freefall/NN)
(PP in/IN)
(NP sterling/NN)
(PP over/IN)
(NP the/DT past/JJ week/NN)
./.)
Chunk tags IOB format
 [('Chancellor', 'NNP', 'O'),
('of', 'IN', 'B-PP'),
('the', 'DT', 'B-NP'),
('Exchequer', 'NNP', 'I-NP'),
('Nigel', 'NNP', 'B-NP'),
('Lawson', 'NNP', 'I-NP'),
("'s", 'POS', 'B-NP'),
('restated', 'VBN', 'I-NP'),
('commitment', 'NN', 'I-NP'),
('to', 'TO', 'B-PP'),
('a', 'DT', 'B-NP'),
('firm', 'NN', 'I-NP'),
('monetary', 'JJ', 'I-NP'),
('policy', 'NN', 'I-NP'),
('has', 'VBZ', 'B-VP'),
('helped', 'VBN', 'I-VP'),
('to', 'TO', 'I-VP'),
('prevent', 'VB', 'I-VP'),
('a', 'DT', 'B-NP'),
('freefall', 'NN', 'I-NP'),
('in', 'IN', 'B-PP'),
('sterling', 'NN', 'B-NP'),
('over', 'IN', 'B-PP'),
('the', 'DT', 'B-NP'),
('past', 'JJ', 'I-NP'),
('week', 'NN', 'I-NP'),
('.', '.', 'O')]
 The chunk tags use the IOB format. IOB stands for
Inside, Outside, and Beginning. The B-
prefix before a tag indicates that the token is the beginning of a
chunk, and the I- prefix indicates that it is inside a chunk.
The O tag indicates that the token does not belong to
any chunk.
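The IOB convention can be made concrete with a small decoder that groups tagged tokens back into chunks. This is a minimal sketch: `iob_to_chunks` is a hypothetical helper name, and the input is a shortened version of the sample sentence above.

```python
def iob_to_chunks(triples):
    """Group (word, POS tag, chunk tag) triples into (chunk type, words) pairs."""
    chunks = []
    current_type, current_words = None, []
    for word, _pos, tag in triples:
        if tag == "O":
            # Outside any chunk: close the chunk in progress, if there is one.
            if current_words:
                chunks.append((current_type, current_words))
            current_type, current_words = None, []
            continue
        prefix, chunk_type = tag.split("-", 1)
        if prefix == "B" or chunk_type != current_type:
            # A B- tag (or a change of chunk type) starts a new chunk.
            if current_words:
                chunks.append((current_type, current_words))
            current_type, current_words = chunk_type, [word]
        else:
            # An I- tag of the same type continues the current chunk.
            current_words.append(word)
    if current_words:
        chunks.append((current_type, current_words))
    return chunks


sample = [
    ("Chancellor", "NNP", "O"), ("of", "IN", "B-PP"),
    ("the", "DT", "B-NP"), ("Exchequer", "NNP", "I-NP"),
    ("Nigel", "NNP", "B-NP"), ("Lawson", "NNP", "I-NP"),
    (".", ".", "O"),
]
chunks = iob_to_chunks(sample)
for chunk_type, words in chunks:
    print(chunk_type, " ".join(words))
```

The B- prefix is what keeps "the Exchequer" and "Nigel Lawson" apart even though both are adjacent NP chunks.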
NGramTagChunker format
 We define an NGramTagChunker class that takes tagged
sentences as training input, extracts their (word, POS tag,
chunk tag) WTC triples, and trains
a BigramTagger with a UnigramTagger as the
backoff tagger.
 The UnigramTagger, BigramTagger,
and TrigramTagger are classes that inherit from the
base class NgramTagger, which itself inherits from
the ContextTagger class, which inherits from
the SequentialBackoffTagger class.
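To illustrate the backoff idea without depending on nltk, here is a standard-library sketch of a bigram chunk tagger that falls back to a unigram tagger. It is not nltk's implementation: `train_backoff_chunk_tagger` is a hypothetical name, the model is trained on the single sample sentence only, and the bigram context is assumed to be the previous chunk tag plus the current POS tag.

```python
from collections import Counter, defaultdict

# (POS tag, chunk tag) pairs taken from the sample conll2000 sentence.
TRAIN = [[
    ("NNP", "O"), ("IN", "B-PP"), ("DT", "B-NP"), ("NNP", "I-NP"),
    ("NNP", "B-NP"), ("NNP", "I-NP"), ("POS", "B-NP"), ("VBN", "I-NP"),
    ("NN", "I-NP"), ("TO", "B-PP"), ("DT", "B-NP"), ("NN", "I-NP"),
    ("JJ", "I-NP"), ("NN", "I-NP"), ("VBZ", "B-VP"), ("VBN", "I-VP"),
    ("TO", "I-VP"), ("VB", "I-VP"), ("DT", "B-NP"), ("NN", "I-NP"),
    ("IN", "B-PP"), ("NN", "B-NP"), ("IN", "B-PP"), ("DT", "B-NP"),
    ("JJ", "I-NP"), ("NN", "I-NP"), (".", "O"),
]]


def train_backoff_chunk_tagger(sentences):
    """Return a function mapping a list of POS tags to a list of chunk tags."""
    unigram = defaultdict(Counter)  # POS tag -> chunk tag counts
    bigram = defaultdict(Counter)   # (previous chunk tag, POS tag) -> counts
    for sentence in sentences:
        prev = None
        for pos, chunk in sentence:
            unigram[pos][chunk] += 1
            bigram[(prev, pos)][chunk] += 1
            prev = chunk
    best_uni = {k: c.most_common(1)[0][0] for k, c in unigram.items()}
    best_bi = {k: c.most_common(1)[0][0] for k, c in bigram.items()}

    def tag(pos_tags):
        result, prev = [], None
        for pos in pos_tags:
            # Try the bigram context first, then back off to the unigram
            # model, and finally to "O" for POS tags never seen in training.
            chunk = best_bi.get((prev, pos), best_uni.get(pos, "O"))
            result.append(chunk)
            prev = chunk
        return result

    return tag


tagger = train_backoff_chunk_tagger(TRAIN)
print(tagger(["DT", "JJ", "NN", "VBZ", "VBN"]))
```

In this run the sentence-initial DT has no bigram context in the training data, so it is tagged by the unigram fallback, while VBN directly after B-VP is resolved by the bigram context.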
Example
3. Constituency Parsing
 Constituent-based grammars are used to analyze and
determine the constituents of a sentence.
 Phrase structure rules cater to two things:
 They determine what words are used to construct the
phrases or constituents.
 They determine how we need to order these
constituents together.
 A constituency parser can be built based on such
grammars/rules, which are usually collectively
available as a context-free grammar (CFG) or
phrase-structured grammar. The parser processes input
sentences according to these rules and helps
build a parse tree.
 The generic representation of a phrase structure rule
is S → AB , which depicts that the structure S consists
of constituents A and B , and the ordering
is A followed by B .
 The phrase structure rule S → NP VP denotes a binary division of
a sentence or a clause, where S is the
sentence or clause, divided into the subject,
denoted by the noun phrase (NP), and the predicate,
denoted by the verb phrase (VP).
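The rule S → NP VP can be tried out with a tiny top-down parser. This is a minimal sketch under stated assumptions: the toy grammar, the lexicon, and the function names are invented for illustration, and the approach only works for grammars without left recursion.

```python
# Toy phrase structure rules: each left-hand side maps to its alternatives.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["DT", "JJ", "NN"], ["DT", "NN"]],
    "VP": [["VBZ", "NP"], ["VBZ", "JJ"]],
}
# Toy lexicon mapping each word to a single POS tag.
LEXICON = {
    "the": "DT", "brown": "JJ", "lazy": "JJ", "quick": "JJ",
    "fox": "NN", "dog": "NN", "is": "VBZ", "sees": "VBZ",
}


def parse(symbol, tokens, i):
    """Yield (tree, next index) for each way symbol can start at tokens[i]."""
    if symbol not in GRAMMAR:
        # POS tag: it must match the tag of the current word.
        if i < len(tokens) and LEXICON.get(tokens[i]) == symbol:
            yield (symbol, tokens[i]), i + 1
        return
    for rhs in GRAMMAR[symbol]:
        for children, j in expand(rhs, tokens, i):
            yield (symbol, children), j


def expand(rhs, tokens, i):
    """Yield (children, next index) for the right-hand side symbols in order."""
    if not rhs:
        yield [], i
        return
    for child, j in parse(rhs[0], tokens, i):
        for rest, k in expand(rhs[1:], tokens, j):
            yield [child] + rest, k


def parse_sentence(text):
    """Return all parse trees that cover the whole sentence."""
    tokens = text.lower().split()
    return [tree for tree, end in parse("S", tokens, 0) if end == len(tokens)]


print(parse_sentence("The brown fox is quick"))
print(parse_sentence("fox the quick is brown"))  # unordered words: no parse
```

The second call returns an empty list, since the same words in a different order no longer satisfy the ordering that the rules impose.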
 We use nltk and the StanfordParser here to generate
parse trees.
 The Stanford parser generally uses a PCFG
(probabilistic context-free grammar) parser. A
PCFG is a context-free grammar that associates a
probability with each of its production rules. The
probability of a parse tree generated from a PCFG is
simply the product of the individual probabilities
of the productions used to generate it.
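The product rule can be checked on a small example. The sketch below scores one parse tree under a toy PCFG; the rule probabilities are made-up numbers chosen so that the alternatives for each left-hand side sum to 1, and `tree_probability` is a hypothetical helper, not part of the Stanford parser or nltk.

```python
import math

# Toy PCFG: (left-hand side, right-hand side) -> probability (assumed values).
RULE_PROBS = {
    ("S", ("NP", "VP")): 1.0,
    ("NP", ("DT", "JJ", "NN")): 0.4,
    ("NP", ("DT", "NN")): 0.6,
    ("VP", ("VBZ", "JJ")): 0.3,
    ("VP", ("VBZ", "NP")): 0.7,
    ("DT", ("the",)): 1.0,
    ("JJ", ("brown",)): 0.5,
    ("JJ", ("quick",)): 0.5,
    ("NN", ("fox",)): 0.5,
    ("NN", ("dog",)): 0.5,
    ("VBZ", ("is",)): 1.0,
}


def tree_probability(tree):
    """Multiply the probabilities of every production used in the tree."""
    label, children = tree
    if isinstance(children, str):
        # Lexical production such as NN -> fox.
        return RULE_PROBS[(label, (children,))]
    rhs = tuple(child[0] for child in children)
    return RULE_PROBS[(label, rhs)] * math.prod(
        tree_probability(child) for child in children
    )


TREE = ("S", [
    ("NP", [("DT", "the"), ("JJ", "brown"), ("NN", "fox")]),
    ("VP", [("VBZ", "is"), ("JJ", "quick")]),
])
probability = tree_probability(TREE)
print(probability)
```

Here the productions contribute 1.0 × 0.4 × 1.0 × 0.5 × 0.5 × 0.3 × 1.0 × 0.5, which is 0.015 up to floating-point rounding.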
4. Dependency Parsing
 In dependency parsing, we try to use dependency-
based grammars to analyze and infer both structure
and semantic dependencies and relationships between
tokens in a sentence.
Dependency Tag    Description
acl               clausal modifier of a noun (adnominal clause)
acl:relcl         relative clause modifier
advcl             adverbial clause modifier
advmod            adverbial modifier
advmod:emph       emphasizing word, intensifier
advmod:lmod       locative adverbial modifier
amod              adjectival modifier
appos             appositional modifier
aux               auxiliary
aux:pass          passive auxiliary
case              case-marking
cc                coordinating conjunction
cc:preconj        preconjunct
ccomp             clausal complement
clf               classifier
compound          compound
compound:lvc      light verb construction
compound:prt      phrasal verb particle
compound:redup    reduplicated compounds
compound:svc      serial verb compounds
conj              conjunct
cop               copula
csubj             clausal subject
csubj:pass        clausal passive subject
dep               unspecified dependency
det               determiner
det:numgov        pronominal quantifier governing the case of the noun
 The dependency tag det is pretty intuitive: it denotes the
determiner relationship between a nominal head and the
determiner. Usually, the word with POS tag DET will also have
the det dependency tag relation. Examples include fox →
the and dog → the.
 The dependency tag amod stands for adjectival modifier and
denotes any adjective that modifies the meaning of a noun.
Examples include fox → brown and dog → lazy.
 The dependency tag nsubj denotes an entity that acts as a
subject or agent in a clause. Examples include is →
fox and jumping → he.
 The dependencies cc and conj describe linkages
between words connected by coordinating conjunctions.
Examples include is → and and is → jumping.
 The dependency tag aux indicates the auxiliary or secondary
verb in the clause. Example: jumping → is.
 The dependency tag acomp stands for adjectival complement
and acts as the complement or object to a verb in the sentence.
Example: is → quick.
 The dependency tag prep denotes a prepositional modifier,
which usually modifies the meaning of a noun, verb, adjective, or
preposition. Usually, this representation is used for prepositions
having a noun or noun phrase complement. Example: jumping
→ over.
 The dependency tag pobj is used to denote the object of a
preposition. This is usually the head of a noun phrase following
a preposition in the sentence. Example: over → dog.
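The head → dependent examples above can be collected into a small data structure and queried. This is a minimal sketch: the token indices (needed because "is" and "the" each occur twice) reflect one reading of the examples, and `find_root` and `dependents_of` are hypothetical helper names rather than spacy or nltk API.

```python
TOKENS = ["The", "brown", "fox", "is", "quick", "and", "he", "is",
          "jumping", "over", "the", "lazy", "dog"]

# (head index, relation, dependent index), one arc per example above.
ARCS = [
    (2, "det", 0), (2, "amod", 1), (3, "nsubj", 2), (3, "acomp", 4),
    (3, "cc", 5), (3, "conj", 8), (8, "nsubj", 6), (8, "aux", 7),
    (8, "prep", 9), (9, "pobj", 12), (12, "det", 10), (12, "amod", 11),
]


def find_root(tokens, arcs):
    """Return the index of the only token that is nobody's dependent."""
    dependents = {dep for _head, _rel, dep in arcs}
    roots = [i for i in range(len(tokens)) if i not in dependents]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {len(roots)}")
    return roots[0]


def dependents_of(head, arcs):
    """Return (relation, dependent index) pairs for the given head index."""
    return [(rel, dep) for h, rel, dep in arcs if h == head]


root = find_root(TOKENS, ARCS)
print("root:", TOKENS[root], "at index", root)
for rel, dep in dependents_of(8, ARCS):
    print(f"jumping --{rel}--> {TOKENS[dep]}")
```

Every token except the root appears exactly once as a dependent, which is the property that makes the set of arcs a tree.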
solve
 “I prefer the morning flight through Denver.”