- 1. Dependency Parsing Jinho D. Choi University of Colorado Preliminary Exam March 4, 2009
- 2. Contents • Dependency Structure - What is dependency structure? - Phrase structure vs. Dependency structure - Dependency Graph • Dependency Parsers - MaltParser: Nivre’s algorithm - MSTParser: Edmonds’s algorithm - MaltParser vs. MSTParser - Choi’s algorithm • Applications
- 3. Dependency Structure • What is dependency? - Syntactic or semantic relation between lexical items - Syntactic: NMOD, AMOD; Semantic: LOC, MNR • Phrase Structure (PS) vs. Dependency Structure (DS) - Constituents vs. Dependencies - There are no phrasal nodes in DS. → Each node in DS represents a word-token. - In DS, every node except the root is dependent on exactly one other node.
- 4. Phrase vs. Dependency • Example: She bought a car [Figure: phrase-structure tree with nodes S, NP, VP, Pro, V, Det, N, shown beside the dependency tree bought → she (SBJ), bought → car (OBJ), car → a (DET)] • Not flexible with word-orders • Language dependent • No semantic information
- 5. Dependency Graph • For a sentence $x = w_1 \dots w_n$, a dependency graph is $G_x = (V_x, E_x)$ - $V_x = \{w_0 = \text{root}, w_1, \dots, w_n\}$ - $E_x = \{(w_i, r, w_j) : w_i \neq w_j,\ w_i \in V_x,\ w_j \in V_x - \{w_0\},\ r \in R_x\}$ - $R_x$ = the set of all possible dependency relations in $x$ • Well-formed Dependency Graph - Unique root - Single head - Connected - Acyclic [Figure: Root → bought; bought → She (SBJ); bought → car (OBJ); car → a (NMOD)]
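The four well-formedness conditions on slide 5 can be checked mechanically. The following is a minimal sketch, not part of the original slides: the function name `is_well_formed`, the integer encoding of words (0 for the artificial root, 1..n for the words), and the `ROOT` label are all assumptions made for illustration. "Unique root" is read here as "node 0 is the only node without a head".

```python
def is_well_formed(n, arcs):
    """Check a dependency graph over words 1..n with artificial root 0.

    arcs is a list of (head, relation, dependent) triples, mirroring
    the (w_i, r, w_j) notation on the slide. Hypothetical helper.
    """
    heads = {}
    for head, _rel, dep in arcs:
        if not (0 <= head <= n and 1 <= dep <= n):
            return False  # the root may not be a dependent; indices must exist
        if dep in heads:
            return False  # single head: a word may have only one head
        heads[dep] = head
    if len(heads) != n:
        return False      # connected: every word needs a head
    for dep in heads:
        seen, node = set(), dep
        while node != 0:  # follow heads upward until the root is reached
            if node in seen:
                return False  # acyclic: the same node was visited twice
            seen.add(node)
            node = heads[node]
    return True


# "She bought a car" with She=1, bought=2, a=3, car=4
good = [(0, "ROOT", 2), (2, "SBJ", 1), (2, "OBJ", 4), (4, "NMOD", 3)]
bad = [(0, "ROOT", 2), (2, "SBJ", 1), (3, "OBJ", 4), (4, "NMOD", 3)]  # 3 <-> 4 cycle

print(is_well_formed(4, good))  # True
print(is_well_formed(4, bad))   # False
```

If every word has exactly one head and following heads always ends at the root, the graph is a tree rooted at node 0, which covers the connected and acyclic conditions together.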
- 6. Projectivity vs Non-projectivity • Projectivity means no crossing edges. [Figure: dependency arcs drawn over "root She bought a car" and over "root She bought a car yesterday that was blue"] • Why projectivity? - Regenerate the original sentence with the same word order - Parsing is less expensive ($O(n)$ vs. $O(n^2)$) - There are not many non-projective relations
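The "no crossing edges" definition on slide 6 translates directly into a pairwise test on arc spans. This is a sketch under assumptions: the function name `is_projective`, the list encoding `heads[i]` = index of the head of word i (root = 0), and the head assignments for the two example sentences are illustrative guesses, since the transcript does not preserve the arcs drawn on the slide.

```python
def is_projective(heads):
    """heads[i] is the head index of word i (1-based); heads[0] is unused.

    Two arcs cross when exactly one endpoint of the second arc lies
    strictly inside the span of the first.
    """
    spans = [(min(h, d), max(h, d)) for d, h in enumerate(heads) if d > 0]
    for a, b in spans:
        for c, d in spans:
            if a < c < b < d:
                return False
    return True


# She(1) bought(2) a(3) car(4)
short = [None, 2, 0, 4, 2]
# She(1) bought(2) a(3) car(4) yesterday(5) that(6) was(7) blue(8)
# Assumed analysis: "yesterday" attaches to "bought" and "was" attaches to "car",
# so the arcs bought-yesterday and car-was cross.
longer = [None, 2, 0, 4, 2, 2, 7, 4, 7]

print(is_projective(short))   # True
print(is_projective(longer))  # False
```

The double loop is quadratic in sentence length, which is fine for a check like this; it is not meant to reflect the parsing complexities quoted on the slide.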
- 7. Dependency Parsers • Two state-of-the-art dependency parsers - MaltParser: performed the best in CoNLL 2007 shared task - MSTParser: performed the best in CoNLL 2006 shared task • MaltParser - Developed by Johan Hall, Jens Nilsson, and Joakim Nivre - Nivre’s algorithm (p, $O(n)$), Covington’s algorithm (n, $O(n^2)$) • MSTParser - Developed by Ryan McDonald - Eisner’s algorithm (p, $O(k \log k)$), Edmonds’s algorithm (n, $O(kn^2)$) • Here p = projective, n = non-projective
- 8. Nivre’s Algorithm • Based on the shift-reduce algorithm • S = a stack • I = a list of remaining input tokens • Example sentence: she bought a car
- 9–27. Nivre’s Algorithm (animation builds for "she bought a car"; S = stack, I = remaining input, A = arcs found so far)
  • Initialize: S = [ ], I = [she, bought, a, car], A = { }
  • Shift ‘she’: S = [she], I = [bought, a, car]
  • Left-Arc ‘she ← bought’: S = [ ], I = [bought, a, car], A = {she ← bought}
  • Shift ‘bought’: S = [bought], I = [a, car]
  • Shift ‘a’: S = [bought, a], I = [car]
  • Left-Arc ‘a ← car’: S = [bought], I = [car], A = {she ← bought, a ← car}
  • Right-Arc ‘bought → car’: S = [bought, car], I = [ ], A = {she ← bought, a ← car, bought → car}
  • Terminate (no need to reduce ‘car’ or ‘bought’)
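The walkthrough on slides 9 to 27 can be replayed in a few lines. This sketch is not MaltParser's implementation: the function name `run_transitions` and the transition labels are assumptions, and the transition sequence is hard-coded from the slides, whereas a real parser would choose each transition with a trained classifier.

```python
def run_transitions(words, transitions):
    """Replay shift-reduce transitions; return (stack, input, arcs).

    Arcs are stored as (head, dependent) pairs.
    """
    stack, remaining, arcs = [], list(words), []
    for t in transitions:
        if t == "SHIFT":          # move the next input token onto the stack
            stack.append(remaining.pop(0))
        elif t == "LEFT-ARC":     # next input token heads the stack top; pop it
            dependent = stack.pop()
            arcs.append((remaining[0], dependent))
        elif t == "RIGHT-ARC":    # stack top heads the next input token; push it
            dependent = remaining.pop(0)
            arcs.append((stack[-1], dependent))
            stack.append(dependent)
        elif t == "REDUCE":       # pop a token that already has its head
            stack.pop()
        else:
            raise ValueError(f"unknown transition: {t}")
    return stack, remaining, arcs


sequence = ["SHIFT", "LEFT-ARC", "SHIFT", "SHIFT", "LEFT-ARC", "RIGHT-ARC"]
stack, remaining, arcs = run_transitions(["she", "bought", "a", "car"], sequence)

print(arcs)       # [('bought', 'she'), ('car', 'a'), ('bought', 'car')]
print(stack)      # ['bought', 'car']
print(remaining)  # []
```

Each token is pushed and popped at most once, which is where the linear running time quoted for Nivre's algorithm comes from.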
- 28. Edmonds’s Algorithm • Based on the maximum spanning tree algorithm • Algorithm 1. Build a complete graph 2. For each vertex, keep only the incoming edge with the maximum score 3. If there is no cycle, go to #5 4. If there is a cycle, treat the cycle as one vertex and update the scores of all incoming edges to the cycle; go to #2 5. Break all cycles by removing the appropriate edges in the cycle (edges that cause multiple heads)
- 29–34. Edmonds’s Algorithm [Figure, built up over six slides: a complete weighted graph over root, saw, John, and Mary with edge scores 9, 10, 9, 20, 0, 30, 30, 11, 3; the greedy step keeps the edges scored 20, 30, 30, which contain a cycle between saw and John; after contracting the cycle the updated incoming scores are 40, 29, 31; the final tree is root → saw (10), saw → John (30), saw → Mary (30)]
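The five steps on slide 28 can be sketched as a short recursive function. This is an illustrative version, not MSTParser's code: the names `find_cycle` and `max_spanning_tree` are hypothetical, and the assignment of the figure's scores to specific directed edges is an assumption, since the transcript preserves the numbers but not the arrows.

```python
def find_cycle(head):
    """Return the nodes of one cycle in a {dependent: head} map, or None."""
    for start in head:
        path, node = [], start
        while node in head and node not in path:
            path.append(node)
            node = head[node]
        if node in path:
            return path[path.index(node):]
    return None


def max_spanning_tree(scores, root):
    """scores maps (head, dependent) to a score; returns {dependent: head}."""
    nodes = {h for h, _ in scores} | {d for _, d in scores}
    # Step 2: keep the best incoming edge of every non-root node.
    head = {}
    for dep in nodes - {root}:
        incoming = [(s, h) for (h, d), s in scores.items() if d == dep]
        head[dep] = max(incoming, key=lambda pair: pair[0])[1]
    cycle = find_cycle(head)
    if cycle is None:            # Step 3: no cycle, the greedy graph is a tree
        return head
    # Step 4: contract the cycle into one vertex and rescore entering edges.
    merged = tuple(cycle)
    cycle_score = sum(scores[(head[v], v)] for v in cycle)
    new_scores, origin = {}, {}
    for (h, d), s in scores.items():
        if h in cycle and d in cycle:
            continue
        if d in cycle:           # entering edge replaces the cycle edge into d
            key, s = (h, merged), s + cycle_score - scores[(head[d], d)]
        elif h in cycle:
            key = (merged, d)
        else:
            key = (h, d)
        if key not in new_scores or s > new_scores[key]:
            new_scores[key], origin[key] = s, (h, d)
    contracted = max_spanning_tree(new_scores, root)
    # Step 5: expand; the entering edge breaks the cycle at one node.
    tree = {}
    for d, h in contracted.items():
        orig_head, orig_dep = origin[(h, d)]
        tree[orig_dep] = orig_head
    for v in cycle:
        tree.setdefault(v, head[v])
    return tree


# Assumed edge directions for the scores shown in the figure.
scores = {
    ("root", "saw"): 10, ("root", "John"): 9, ("root", "Mary"): 9,
    ("saw", "John"): 30, ("saw", "Mary"): 30, ("John", "saw"): 20,
    ("Mary", "saw"): 0, ("John", "Mary"): 3, ("Mary", "John"): 11,
}
tree = max_spanning_tree(scores, "root")
print(tree == {"saw": "root", "John": "saw", "Mary": "saw"})  # True
```

Under these assumed directions, the rescored edges into the contracted saw/John vertex come out as 40 (from root via saw), 29 (from root via John), and 31 (from Mary via John), which agrees with the numbers that survive from the figure.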
- 35. MaltParser vs. MSTParser • Advantages - MaltParser: low complexity, more accurate for short-distance dependencies - MSTParser: high accuracy, more accurate for long-distance dependencies • Merge MaltParser and MSTParser in learning stages
- 36. Choi’s Algorithm • Projective dependency parsing algorithm - Motivation: do more exhaustive searches than MaltParser but keep the complexity lower than that of MSTParser - Intuition: in a projective dependency graph, every word can find its head among the words in adjacent phrases (example: She bought a car yesterday that was blue) - Searching: start at the edge node, then jump to its head - Complexity: $O(k \cdot n)$, where k is the number of words in each phrase
- 37–42. Choi’s Algorithm [Figure, built up over six slides: candidate arcs over the word sequence A B C D E with scores 0.9, 0.6, 0.5, 0.7, 0.8; some candidates are marked X]
- 43. Applications • Semantic Role Labeling - CoNLL 2008 and 2009 shared tasks • Sentence Compression - Relation extraction • Sentence Alignment - Paraphrase detection, machine translation • Sentiment Analysis
