Phrase Structure Grammar
Natural Language Processing
Emory University

Jinho D. Choi
Phrase Structure Grammar
• Phrase structure grammar
- Constituency grammar (e.g., context-free, context-sensitive).
• G = (N, Σ, S, P)
- N : a finite set of non-terminals (POS/phrase/clause tags).
- Σ : a finite set of terminals (word tokens).
- S : a start symbol representing the whole sentence, where S ∈ N.
- P : a finite set of phrase structure rules.
- A ∈ N, α, β ∈ (N ∪ Σ)*, γ ∈ (N ∪ Σ)+
- Context-free : A → α
- Context-sensitive : αAβ → αγβ
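The 4-tuple above can be written down directly as a data structure. The following is a minimal sketch, assuming the usual convention that terminals are word tokens and non-terminals are tags; the class name `Grammar`, its field names, and the extra context-sensitive rule are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Grammar:
    nonterminals: frozenset  # N
    terminals: frozenset     # Σ
    start: str               # S, must be a member of N
    rules: tuple             # P, as (lhs, rhs) pairs of symbol tuples

    def is_context_free(self) -> bool:
        # Context-free: every left-hand side is exactly one non-terminal.
        return all(len(lhs) == 1 and lhs[0] in self.nonterminals
                   for lhs, _ in self.rules)


toy = Grammar(
    nonterminals=frozenset({"S", "NP", "VP", "PRP", "VBD", "DT", "NN"}),
    terminals=frozenset({"I", "bought", "you", "a", "car"}),
    start="S",
    rules=(
        (("S",), ("NP", "VP")),
        (("VP",), ("VBD", "NP", "NP")),
        (("NP",), ("PRP",)),
        (("NP",), ("DT", "NN")),
        (("PRP",), ("I",)),
        (("PRP",), ("you",)),
        (("VBD",), ("bought",)),
        (("DT",), ("a",)),
        (("NN",), ("car",)),
    ),
)

# Hypothetical context-sensitive rule of the form αAβ → αγβ with
# α = VBD, A = NP, β empty, γ = NP NP: NP expands only after a VBD.
sensitive = Grammar(toy.nonterminals, toy.terminals, "S",
                    toy.rules + ((("VBD", "NP"), ("VBD", "NP", "NP")),))

print(toy.is_context_free(), sensitive.is_context_free())
```

Storing the left-hand side as a tuple, even when it holds a single symbol, lets the same structure hold both rule types.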
Parsing complexity?
Chomsky Hierarchy
• Nested language classes, from innermost to outermost
- Regular Languages
- Context-Free Languages
- Mildly Context-Sensitive Languages
- Context-Sensitive Languages
- Recursively Enumerable Languages
Chomsky Hierarchy
• Language vs. Automata vs. Parsing Complexity

Type  Language                  Automata           Parsing Complexity
3     Regular                   Finite state       Linear
2     Context-free              Pushdown           Cubic
1.5   Mildly context-sensitive  Extended pushdown  Polynomial
1     Context-sensitive         Linear bounded     Exponential
0     Recursively enumerable    Turing machine     Who cares!
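The step from type 3 to type 2 can be illustrated with two toy languages. This is only a sketch: the language (ab)* is regular, so a finite-state matcher is enough, whereas aⁿbⁿ is context-free but not regular, so the recognizer needs a stack. Both function names are hypothetical.

```python
import re


def is_ab_star(s: str) -> bool:
    # (ab)* is regular: a finite-state pattern matcher suffices.
    return re.fullmatch(r"(ab)*", s) is not None


def is_an_bn(s: str) -> bool:
    # a^n b^n needs memory of how many a's are still unmatched,
    # which is what the stack of a pushdown automaton provides.
    stack = []
    i = 0
    while i < len(s) and s[i] == "a":
        stack.append("a")
        i += 1
    while i < len(s) and s[i] == "b" and stack:
        stack.pop()
        i += 1
    return i == len(s) and not stack


print(is_ab_star("abab"), is_an_bn("aabb"), is_an_bn("aab"))
```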
Phrase Structure Rules
• Rule of thumb
- All siblings on the right-hand side should convey meaningful relations.
• “I bought you a car”
(S (NP (PRP I))
   (VP (VBD bought)
       (NP (PRP you)
           (NP (DT a)
               (NN car)))))
S → NP VP     ← NP is the subject of (the head of) VP
VP → VBD NP   ← NP is the object of VBD?
NP → PRP NP   ← ?
Phrase Structure Rules
• Rule of thumb
- All siblings on the right-hand side should convey meaningful relations.
• “I bought you a car”
(S (NP (PRP I))
   (VP (VBD bought)
       (NP (PRP you))
       (NP (DT a)
           (NN car))))
S → NP VP        ← NP is the subject of (the head of) VP
VP → VBD NP NP   ← 1st NP is the indirect object of VBD,
                   2nd NP is the direct object of VBD
NP → DT NN       ← DT is the determiner of NN
Dependent vs. Head
Relation?
Chomsky Normal Form
• Chomsky normal form
- All production rules are A → BC or A → α (A, B, C ∈ N, α ∈ Σ).
- How is this useful?
- Should we consider CNF when we design a grammar?
CYK parsing
“I will buy a big red car soon later”
(S (NP (PRP I))
   (VP (MD will)
       (VP (VBG buy)
           (NP (DT a) (JJ big) (JJ red) (NN car))
           (ADVP (RB soon)))))
Chomsky normal form?
Unary rules?
Computationally consistent?
Linguistically sound?
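A flat rule such as NP → DT JJ JJ NN (the noun phrase “a big red car”) has four symbols on its right-hand side, so it is not in Chomsky normal form. One possible way to binarize it is sketched below; the function name `binarize` is hypothetical, and reusing the parent label for the intermediate nodes is just one design choice among several.

```python
def binarize(lhs, rhs):
    """Right-binarize lhs -> rhs, reusing lhs as the label of each new node."""
    rhs = list(rhs)
    rules = []
    while len(rhs) > 2:
        # Peel off the first symbol and push the remainder under a new lhs node.
        rules.append((lhs, (rhs[0], lhs)))
        rhs = rhs[1:]
    rules.append((lhs, tuple(rhs)))
    return rules


for rule in binarize("NP", ["DT", "JJ", "JJ", "NN"]):
    print(rule)
# ('NP', ('DT', 'NP'))
# ('NP', ('JJ', 'NP'))
# ('NP', ('JJ', 'NN'))
```

Reusing NP as the intermediate label keeps the tag set small but makes the rules recursive, so the resulting grammar would also accept sequences such as DT DT JJ NN. Introducing fresh intermediate labels avoids that at the cost of a larger non-terminal set.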
Chomsky Normal Form
“I will buy a big red car soon later”
(S (NP (PRP I))
   (VP (MD will)
       (VP (VP (VBG buy)
               (NP (DT a)
                   (NP (JJ big)
                       (NP (JJ red) (NN car)))))
           (ADVP (RB soon)))))
S → NP VP
VP → MD VP
VP → VP ADVP
VP → VBG NP
NP → DT NP | JJ NP | JJ NN
NP → PRP → I
MD → will
VBG → buy
DT → a
JJ → big | red
NN → car
ADVP → RB → soon
Unary rules
Recursive rules?
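With every rule in the form A → BC or A → word, a sentence can be recognized by filling a chart bottom-up, which is the idea behind CYK parsing. The sketch below is a recognizer only (it returns yes/no, not a tree). It assumes the unary chains are collapsed (NP → I, ADVP → soon) and that the modal rule is VP → MD VP, as the tree suggests; the names `LEXICAL`, `BINARY`, and `cyk_recognize` are hypothetical.

```python
from itertools import product

# Lexical rules A -> word, with unary chains collapsed (an assumption).
LEXICAL = {"I": {"NP"}, "will": {"MD"}, "buy": {"VBG"}, "a": {"DT"},
           "big": {"JJ"}, "red": {"JJ"}, "car": {"NN"}, "soon": {"ADVP"}}

# Binary rules A -> B C, indexed by the right-hand side (B, C).
BINARY = {("NP", "VP"): {"S"}, ("MD", "VP"): {"VP"}, ("VP", "ADVP"): {"VP"},
          ("VBG", "NP"): {"VP"}, ("DT", "NP"): {"NP"}, ("JJ", "NP"): {"NP"},
          ("JJ", "NN"): {"NP"}}


def cyk_recognize(words, start="S"):
    n = len(words)
    if n == 0:
        return False
    # chart[i][j] holds every label that can span words[i:j].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICAL.get(word, ()))
    for width in range(2, n + 1):          # span length
        for i in range(n - width + 1):     # span start
            j = i + width
            for k in range(i + 1, j):      # split point
                for b, c in product(chart[i][k], chart[k][j]):
                    chart[i][j] |= BINARY.get((b, c), set())
    return start in chart[0][n]


print(cyk_recognize("I will buy a big red car soon".split()))
```

The three nested loops over span length, start position, and split point are where the cubic parsing complexity of context-free grammars comes from.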
Greibach Normal Form
• Greibach normal form
- The right-hand sides of all production rules start with a terminal.
- How is this useful?
Lexicalization
S → I AUXP
AUXP → will VP
VP → buy DP ADVP
DP → a ADJP
ADJP → big ADJP
ADJP → red NP
NP → car
ADVP → soon
“I will buy a big red car soon later”
(S I
   (AUXP will
         (VP buy
             (DP a
                 (ADJP big
                       (ADJP red
                             (NP car))))
             (ADVP soon))))
Useful?
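One practical consequence of Greibach normal form is that every rule application consumes exactly one word, so a top-down recognizer never loops on left recursion and finishes an n-word sentence in n steps along any successful path. The following is a minimal sketch using the rules above; the table `RULES` and the function `recognize` are hypothetical names.

```python
# Each key is (non-terminal, first word); each value lists the possible
# remainders of the right-hand side after that word.
RULES = {
    ("S", "I"): [("AUXP",)],
    ("AUXP", "will"): [("VP",)],
    ("VP", "buy"): [("DP", "ADVP")],
    ("DP", "a"): [("ADJP",)],
    ("ADJP", "big"): [("ADJP",)],
    ("ADJP", "red"): [("NP",)],
    ("NP", "car"): [()],
    ("ADVP", "soon"): [()],
}


def recognize(words, stack=("S",)):
    if not words:
        return not stack            # accept only if nothing is left to expand
    if not stack:
        return False                # words remain but no symbol to expand
    top, rest = stack[0], stack[1:]
    # Expand the top symbol with a rule that starts with the next word.
    return any(recognize(words[1:], tail + rest)
               for tail in RULES.get((top, words[0]), []))


print(recognize("I will buy a big red car soon".split()))
```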
Penn Treebank
• Penn Treebank
- A corpus containing about 1M words from Wall Street Journal articles.
- Each sentence is parsed as phrase structure trees.
- Each tree is annotated in parenthetical notation.
((S (NP (PRP I))
    (VP (VBD bought)
        (NP (PRP you))
        (NP (DT a)
            (NN car)))))
Penn Treebank Tagset
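The parenthetical notation is easy to read programmatically. Below is a small sketch, using only the standard library, that turns a bracketed string into nested `(label, children)` tuples and lists the phrase structure rules used in the tree. The function names `read_tree` and `productions` are hypothetical, and real treebank files may need more careful handling than this.

```python
import re


def read_tree(text):
    """Parse one bracketed tree into nested (label, children) tuples."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse():
        nonlocal pos
        pos += 1                                   # skip "("
        label = ""
        if tokens[pos] not in ("(", ")"):
            label = tokens[pos]
            pos += 1
        children = []
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                children.append(parse())
            else:
                children.append(tokens[pos])       # a word token
                pos += 1
        pos += 1                                   # skip ")"
        return (label, children)

    tree = parse()
    # Drop the unlabeled outer bracket that wraps each sentence.
    if tree[0] == "" and len(tree[1]) == 1:
        tree = tree[1][0]
    return tree


def productions(tree):
    """Yield (lhs, rhs) for every node, top-down and left to right."""
    label, children = tree
    yield (label, tuple(c if isinstance(c, str) else c[0] for c in children))
    for child in children:
        if not isinstance(child, str):
            yield from productions(child)


tree = read_tree("((S (NP (PRP I)) (VP (VBD bought) (NP (PRP you)) "
                 "(NP (DT a) (NN car)))))")
for lhs, rhs in productions(tree):
    print(lhs, "→", " ".join(rhs))
```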
Passive Construction
“I bought a car for you”
(S (NP (PRP I))
   (VP (VBD bought)
       (NP (DT a) (NN car))
       (PP (IN for)
           (NP (PRP you)))))
NP Movement
Did NP really move?
What about automatic parsing?
Is this necessary?
(S (NP-1 (DT A) (NN car))
   (VP (VBD was)
       (VP (VBN bought)
           (NP (-NONE- *-1))
           (PP (IN by)
               (NP (PRP me)))
           (PP (IN for)
               (NP (PRP you))))))
“A car was bought by me for you”
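For automatic parsing, a common preprocessing step is to remove empty elements (-NONE-) and co-indices, so that the tree covers only the words actually present in the sentence. The sketch below assumes trees are stored as nested `(label, children)` tuples; the function names `strip_empty` and `leaves` are hypothetical.

```python
import re

passive = ("S", [
    ("NP-1", [("DT", ["A"]), ("NN", ["car"])]),
    ("VP", [
        ("VBD", ["was"]),
        ("VP", [
            ("VBN", ["bought"]),
            ("NP", [("-NONE-", ["*-1"])]),
            ("PP", [("IN", ["by"]), ("NP", [("PRP", ["me"])])]),
            ("PP", [("IN", ["for"]), ("NP", [("PRP", ["you"])])]),
        ]),
    ]),
])


def strip_empty(tree):
    """Remove -NONE- nodes, phrases left empty, and -n / =n indices."""
    label, children = tree
    if label == "-NONE-":
        return None
    kept = []
    for child in children:
        if isinstance(child, str):
            kept.append(child)
        else:
            stripped = strip_empty(child)
            if stripped is not None:
                kept.append(stripped)
    if not kept:
        return None                     # phrase contained only a trace
    return (re.sub(r"[-=]\d+$", "", label), kept)


def leaves(tree):
    words = []
    for child in tree[1]:
        words += [child] if isinstance(child, str) else leaves(child)
    return words


print(" ".join(leaves(strip_empty(passive))))
```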
Exercises
• Wh-questions
- Who bought a car for you?
- What did I buy for you?
- Whom did I buy a car for?
• Relative clause
- I bought the car that she wanted.
- She wanted the car bought by me.
• Coordination
- She wanted the car so I bought it.
- I bought a car and gave it to her.
• Gapping (bonus)
- I bought a car for her, a van for you, and a truck for myself.
Wh-Questions
“Who bought a car for you?”
(SBARQ (WHNP-1 (WP Who))
       (SQ (NP (-NONE- *T*-1))
           (VP (VBD bought)
               (NP a car)
               (PP for you))))
“What did I buy for you?”
(SBARQ (WHNP-1 (WP What))
       (SQ (VBD did)
           (NP (PRP I))
           (VP (VB buy)
               (NP (-NONE- *T*-1))
               (PP for you))))
“Whom did I buy a car for?”
(SBARQ (WHNP-1 (WP Whom))
       (SQ (VBD did)
           (NP (PRP I))
           (VP (VB buy)
               (NP a car)
               (PP for *T*-1))))
Is this necessary?
What about this?
Challenging for parsing?
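The co-index on a trace (*T*-1) points back to the phrase carrying the same index (WHNP-1). A short sketch of how that link could be recovered from a tree stored as nested `(label, children)` tuples follows; the function name `link_traces` and the tuple encoding are assumptions made for illustration.

```python
import re

question = ("SBARQ", [
    ("WHNP-1", [("WP", ["What"])]),
    ("SQ", [
        ("VBD", ["did"]),
        ("NP", [("PRP", ["I"])]),
        ("VP", [
            ("VB", ["buy"]),
            ("NP", [("-NONE-", ["*T*-1"])]),
            ("PP", [("IN", ["for"]), ("NP", [("PRP", ["you"])])]),
        ]),
    ]),
])


def leaves(tree):
    words = []
    for child in tree[1]:
        words += [child] if isinstance(child, str) else leaves(child)
    return words


def link_traces(tree):
    """Map each *T*-n trace index to the label and words of its filler."""
    fillers, traces = {}, set()

    def walk(node):
        label, children = node
        if label == "-NONE-":
            match = re.fullmatch(r"\*T\*-(\d+)", children[0])
            if match:
                traces.add(match.group(1))
            return
        match = re.search(r"-(\d+)$", label)
        if match:
            fillers[match.group(1)] = (label, leaves(node))
        for child in children:
            if not isinstance(child, str):
                walk(child)

    walk(tree)
    return {index: fillers[index] for index in traces if index in fillers}


print(link_traces(question))
```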
Relative Clause
“I bought the car that she wanted”
(S (NP (PRP I))
   (VP (VBD bought)
       (NP (NP (DT the) (NN car))
           (SBAR (WHNP-1 (WDT that))
                 (S (NP (PRP she))
                    (VP (VBD wanted)
                        (NP (-NONE- *T*-1))))))))
“She wanted the car bought by me”
(S (NP (PRP She))
   (VP (VBD wanted)
       (NP (NP-1 (DT the) (NN car))
           (VP (VBN bought)
               (NP (-NONE- *-1))
               (PP (IN by)
                   (NP (PRP me)))))))
Coordination
“She wanted the car so I bought it.”
(S (S (NP (PRP She))
      (VP (VBD wanted)
          (NP the car)))
   (CC so)
   (S (NP (PRP I))
      (VP (VBD bought)
          (NP (PRP it)))))
“I bought a car and gave it to her.”
(S (NP (PRP I))
   (VP (VP (VBD bought)
           (NP a car))
       (CC and)
       (VP (VBD gave)
           (NP (PRP it))
           (PP to her))))
Chomsky normal form?
Gapping Relation
“I bought a car for her, a van for you, and a truck for myself”
(S (NP (PRP I))
   (VP (VP (VBD bought)
           (NP-1 a car)
           (PP-2 for her))
       (VP (NP=1 a van)
           (PP=2 for you))
       (VP (NP=1 a truck)
           (PP=2 for myself))))
Challenging for parsing?

CS571: Phrase Structure Grammar
