# Dependency Parsing

This report shows what a dependency structure is, why a dependency structure is useful, and how to parse natural-language sentences into dependency structures. The report describes two state-of-the-art dependency parsers, MaltParser and MSTParser, compares the parsers, and discusses ways to integrate them. Finally, it suggests a new parsing algorithm and possible applications of dependency structures.

### Dependency Parsing

1. Dependency Parsing. Jinho D. Choi, University of Colorado. Preliminary Exam, March 4, 2009.
2. Contents
   • Dependency Structure
     - What is a dependency structure?
     - Phrase structure vs. dependency structure
     - Dependency graph
   • Dependency Parsers
     - MaltParser: Nivre's algorithm
     - MSTParser: Edmonds's algorithm
     - MaltParser vs. MSTParser
     - Choi's algorithm
   • Applications
3. Dependency Structure
   • What is dependency?
     - A syntactic or semantic relation between lexical items
     - Syntactic: NMOD, AMOD; Semantic: LOC, MNR
   • Phrase Structure (PS) vs. Dependency Structure (DS)
     - Constituents vs. dependencies
     - There are no phrasal nodes in DS; each node in DS represents a word token.
     - In DS, every node except the root is dependent on exactly one other node.
4. Phrase vs. Dependency
   [Figure: the sentence "She bought a car" as a phrase-structure tree (S → NP VP, etc.) and as a dependency tree (bought →SBJ→ she, bought →OBJ→ car, car →DET→ a).]
   • Limitations of phrase structure:
     - Not flexible with word orders
     - Language dependent
     - No semantic information
5. Dependency Graph
   • For a sentence x = w1..wn, a dependency graph Gx = (Vx, Ex)
     - Vx = {w0 = root, w1, ..., wn}
     - Ex = {(wi, r, wj) : wi ≠ wj, wi ∈ Vx, wj ∈ Vx − {w0}, r ∈ Rx}
     - Rx = the set of all possible dependency relations in x
   • Well-formed dependency graph:
     - Unique root
     - Single head
     - Connected
     - Acyclic
   [Figure: a well-formed tree for "She bought a car": root → bought, bought →SBJ→ She, bought →OBJ→ car, car →NMOD→ a.]
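The four well-formedness conditions can be checked mechanically. Below is a minimal sketch (my own illustration, not from the slides) that encodes a graph as (head, dependent) position pairs over tokens 1..n, with 0 as the artificial root:

```python
def is_well_formed(n, arcs):
    """Check well-formedness of a dependency graph over tokens 1..n.

    arcs is a set of (head, dep) position pairs; 0 is the artificial root,
    which by construction has no head (the 'unique root' condition).
    """
    heads = {}
    for h, d in arcs:
        if d in heads:                        # 'single head' violated
            return False
        heads[d] = h
    if set(heads) != set(range(1, n + 1)):    # some token has no head
        return False
    for d in range(1, n + 1):                 # 'connected' and 'acyclic':
        seen, node = set(), d                 # every token must reach the root
        while node != 0:
            if node in seen:                  # cycle detected
                return False
            seen.add(node)
            node = heads[node]
    return True

# "She bought a car": root -> bought, bought -> She, bought -> car, car -> a
print(is_well_formed(4, {(0, 2), (2, 1), (2, 4), (4, 3)}))  # True
```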
6. Projectivity vs. Non-projectivity
   • Projectivity means no crossing edges.
   [Figure: projective arcs over "She bought a car" vs. a crossing (non-projective) arc over "She bought a car yesterday that was blue".]
   • Why projectivity?
     - The original sentence can be regenerated with the same word order.
     - Parsing is less expensive (O(n) vs. O(n²)).
     - There are not many non-projective relations.
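Crossing edges can be detected directly from arc spans. A small sketch (my own illustration, not from the slides), treating arcs as (head, dependent) token positions:

```python
def is_projective(arcs):
    """Return True iff no two arcs cross when drawn above the sentence.

    arcs: iterable of (head, dep) token positions; 0 may be the root.
    Two arcs cross iff their spans strictly interleave (overlap
    without one nesting inside the other).
    """
    spans = [(min(h, d), max(h, d)) for h, d in arcs]
    for a, b in spans:
        for c, e in spans:
            if a < c < b < e:              # strict interleaving = crossing
                return False
    return True

# "She bought a car" (projective)
print(is_projective({(0, 2), (2, 1), (2, 4), (4, 3)}))  # True
# "She bought a car yesterday that was blue": an arc from 'car' (4) into the
# relative clause (e.g. position 7) crosses 'bought' (2) -> 'yesterday' (5)
print(is_projective({(0, 2), (2, 1), (2, 4), (4, 3), (2, 5), (4, 7)}))  # False
```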
7. Dependency Parsers
   • Two state-of-the-art dependency parsers
     - MaltParser: performed the best in the CoNLL 2007 shared task
     - MSTParser: performed the best in the CoNLL 2006 shared task
   • MaltParser
     - Developed by Johan Hall, Jens Nilsson, and Joakim Nivre
     - Nivre's algorithm (projective, O(n)); Covington's algorithm (non-projective, O(n²))
   • MSTParser
     - Developed by Ryan McDonald
     - Eisner's algorithm (projective, O(k log k)); Edmonds's algorithm (non-projective, O(kn²))
8. Nivre's Algorithm
   • Based on the shift-reduce algorithm
   • S = a stack
   • I = a list of remaining input tokens
   • Example sentence: "she bought a car"
9.-27. Nivre's Algorithm (worked example, one transition per slide; S = stack, I = remaining input, A = arcs built so far)
   • Initialize: S = [], I = [she, bought, a, car], A = {}
   • Shift: 'she'
   • Left-Arc: 'she ← bought'
   • Shift: 'bought'
   • Shift: 'a'
   • Left-Arc: 'a ← car'
   • Right-Arc: 'bought → car'
   • Terminate (no need to reduce 'car' or 'bought')
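The transition sequence above can be replayed with a minimal arc-eager interpreter. This is a sketch of the transitions only; the oracle that chooses them is assumed, not shown (in MaltParser that choice comes from a trained classifier):

```python
def nivre_parse(tokens, transitions):
    """Replay a given transition sequence; returns (head, dep) arcs.

    S = stack, I = remaining input; the transition choices themselves
    are assumed to come from an oracle or classifier.
    """
    stack, inputs, arcs = [], list(tokens), []
    for t in transitions:
        if t == 'shift':                    # move next input token onto the stack
            stack.append(inputs.pop(0))
        elif t == 'left-arc':               # next input token heads the stack top
            arcs.append((inputs[0], stack.pop()))
        elif t == 'right-arc':              # stack top heads the next input token,
            arcs.append((stack[-1], inputs[0]))
            stack.append(inputs.pop(0))     # which is also pushed (arc-eager)
        elif t == 'reduce':                 # pop a token that already has a head
            stack.pop()
    return arcs

seq = ['shift', 'left-arc', 'shift', 'shift', 'left-arc', 'right-arc']
print(nivre_parse(['she', 'bought', 'a', 'car'], seq))
# [('bought', 'she'), ('car', 'a'), ('bought', 'car')]
```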
28. Edmonds's Algorithm
   • Based on the maximum spanning tree algorithm
   • Algorithm
     1. Build a complete graph.
     2. Keep only the incoming edge with the maximum score for each node.
     3. If there is no cycle, go to step 5.
     4. If there is a cycle, contract the cycle into one vertex and update the scores of all edges entering the cycle; go to step 2.
     5. Break all cycles by removing the appropriate edges within each cycle (the edges that cause multiple heads).
29.-34. Edmonds's Algorithm (worked example, one step per slide)
   [Figure: a complete graph over root, 'John', 'saw', and 'Mary' with edge scores (e.g. root → saw = 10, John → saw = 20, saw → John = 30, saw → Mary = 30). Greedily keeping each node's best incoming edge creates the cycle John ↔ saw; the cycle is contracted and the entering scores are updated (e.g. to 40, 29, 31), after which the final tree root → saw, saw → John, saw → Mary is recovered.]
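The contract-and-recurse loop of steps 2-4 can be written compactly. This is a sketch of the Chu-Liu/Edmonds procedure (my own, not MSTParser's implementation), run on edge scores like those in the 'John saw Mary' example above:

```python
def find_cycle(head):
    """Return the nodes of a cycle in the map {dep: head}, or None."""
    for start in head:
        path, node = [], start
        while node in head and node not in path:
            path.append(node)
            node = head[node]
        if node in path:
            return path[path.index(node):]
    return None

def chu_liu_edmonds(score, root):
    """score: {(head, dep): weight}. Returns {dep: head} of a max spanning tree."""
    deps = {d for (_, d) in score if d != root}
    # step 2: keep only the best incoming edge for every node
    best = {d: max((h for (h, dd) in score if dd == d),
                   key=lambda h: score[(h, d)]) for d in deps}
    cycle = find_cycle(best)
    if cycle is None:                        # step 3: no cycle -> done
        return best
    # step 4: contract the cycle into one vertex and rescore entering edges
    cyc, c = set(cycle), '<cycle>'           # '<cycle>' assumed unused as a word
    new_score, orig = {}, {}
    for (h, d), w in score.items():
        if h in cyc and d in cyc:
            continue                         # cycle-internal edge: drop
        if d in cyc:                         # entering edge: score = gain of
            key, w = (h, c), w - score[(best[d], d)]   # displacing best[d]
        elif h in cyc:
            key = (c, d)                     # leaving edge
        else:
            key = (h, d)
        if key not in new_score or w > new_score[key]:
            new_score[key], orig[key] = w, (h, d)
    sub = chu_liu_edmonds(new_score, root)   # solve the contracted graph
    # expand: map contracted edges back and break the cycle (step 5)
    tree = {}
    for d, h in sub.items():
        oh, od = orig[(h, d)]
        tree[od] = oh
    entered = orig[(sub[c], c)][1]           # cycle node whose head changed
    for d in cyc:
        if d != entered:
            tree[d] = best[d]
    return tree

# scores as reconstructed from the slides' figure (an assumption on my part)
scores = {('root', 'John'): 9, ('root', 'saw'): 10, ('root', 'Mary'): 9,
          ('John', 'saw'): 20, ('saw', 'John'): 30, ('saw', 'Mary'): 30,
          ('John', 'Mary'): 3, ('Mary', 'John'): 11, ('Mary', 'saw'): 0}
print(chu_liu_edmonds(scores, 'root') ==
      {'saw': 'root', 'John': 'saw', 'Mary': 'saw'})  # True (total score 70)
```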
35. MaltParser vs. MSTParser
   • Advantages
     - MaltParser: low complexity; more accurate for short-distance dependencies
     - MSTParser: high accuracy; more accurate for long-distance dependencies
   • Merge MaltParser and MSTParser at the learning stage
36. Choi's Algorithm
   • A projective dependency parsing algorithm
     - Motivation: do more exhaustive searches than MaltParser but keep the complexity lower than that of MSTParser
     - Intuition: in a projective dependency graph, every word can find its head in an adjacent phrase (e.g. "She bought a car yesterday that was blue")
     - Searching: start with the edge node, then jump to its head
     - Complexity: O(k·n), where k is the number of words in each phrase
37.-42. Choi's Algorithm (worked example, one step per slide)
   [Figure: nodes A-E with a candidate word X; each slide scores candidate head edges (scores such as 0.5-0.9) from the edge node outward, jumping from each candidate to its head until the best-scoring head for X is found.]
43. Applications
   • Semantic role labeling
     - CoNLL 2008-2009 shared tasks
   • Sentence compression
     - Relation extraction
   • Sentence alignment
     - Paraphrase detection, machine translation
   • Sentiment analysis