Dependency Parsing


This report shows what a dependency structure is, why it is useful, and how to parse natural language sentences into dependency structures. It describes two state-of-the-art dependency parsers, MaltParser and MSTParser, compares them, and discusses ways to integrate them. Finally, it suggests a new parsing algorithm and possible applications of dependency structures.


  1. Dependency Parsing. Jinho D. Choi, University of Colorado. Preliminary Exam, March 4, 2009.
  2. Contents
  • Dependency Structure
    - What is dependency structure?
    - Phrase structure vs. dependency structure
    - Dependency graph
  • Dependency Parsers
    - MaltParser: Nivre's algorithm
    - MSTParser: Edmonds's algorithm
    - MaltParser vs. MSTParser
    - Choi's algorithm
  • Applications
  3. Dependency Structure
  • What is dependency?
    - A syntactic or semantic relation between lexical items
    - Syntactic: NMOD, AMOD; semantic: LOC, MNR
  • Phrase Structure (PS) vs. Dependency Structure (DS)
    - Constituents vs. dependencies
    - There are no phrasal nodes in DS; each node in DS represents a word token.
    - In DS, every node except the root is dependent on exactly one other node.
  4. Phrase vs. Dependency
  [Figure: the sentence 'She bought a car' as a phrase-structure tree (S → NP VP; NP → Pro 'she'; VP → V 'bought' NP; NP → Det 'a' N 'car') and as a dependency tree ('bought' heads 'she' (SBJ) and 'car' (OBJ); 'car' heads 'a' (DET)).]
  • Limitations of phrase structure:
    - Not flexible with word orders
    - Language dependent
    - No semantic information
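To make the contrast concrete, here is a minimal sketch (in Python, chosen only for illustration) of how the dependency structure above can be encoded: no phrasal nodes, just one head and one relation per token. The 'ROOT' label for the top node is an assumed convention, not from the slide.

    # A dependency structure has no phrasal nodes: every node is a word
    # token, and each token except the root has exactly one head.
    tokens = ["She", "bought", "a", "car"]

    # one (head_index, relation) entry per token; index 0 is the
    # artificial root, word indices are 1-based
    heads = {
        1: (2, "SBJ"),   # She  <- bought
        2: (0, "ROOT"),  # bought is attached to the artificial root
        3: (4, "DET"),   # a    <- car
        4: (2, "OBJ"),   # car  <- bought
    }

    for i, (h, rel) in heads.items():
        head_word = "root" if h == 0 else tokens[h - 1]
        print(f"{tokens[i - 1]:>6} --{rel}--> {head_word}")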
  5. Dependency Graph
  • For a sentence x = w_1 ... w_n, a dependency graph G_x = (V_x, E_x):
    - V_x = {w_0 = root, w_1, ..., w_n}
    - E_x = {(w_i, r, w_j) : w_i ≠ w_j, w_i ∈ V_x, w_j ∈ V_x − {w_0}, r ∈ R_x}
    - R_x = the set of all possible dependency relations in x
  • Well-formed dependency graph:
    - Unique root
    - Single head
    - Connected
    - Acyclic
  [Figure: a well-formed graph for 'She bought a car': root → bought; bought → she (SBJ); bought → car (OBJ); car → a (NMOD).]
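The four well-formedness conditions can be verified mechanically. The following is a small sketch, assuming arcs are given as (head, dependent) index pairs with w_0 = 0 as the root:

    def is_well_formed(n, arcs):
        # arcs: (head, dependent) index pairs over tokens w_1..w_n,
        # with the artificial root as index 0
        heads = {}
        for h, d in arcs:
            if d in heads:          # single head: at most one head per token
                return False
            heads[d] = h
        # unique root: exactly one token attaches directly to w_0
        if sum(1 for h in heads.values() if h == 0) != 1:
            return False
        # connected and acyclic: every token must reach w_0 by
        # following its chain of heads without revisiting a node
        for w in range(1, n + 1):
            seen, cur = set(), w
            while cur != 0:
                if cur in seen or cur not in heads:
                    return False    # cycle, or a token with no head
                seen.add(cur)
                cur = heads[cur]
        return True

    # the graph from the figure: root -> bought -> {she, car}, car -> a
    print(is_well_formed(4, [(0, 2), (2, 1), (2, 4), (4, 3)]))  # True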
  6. Projectivity vs. Non-projectivity
  • Projectivity means no crossing edges.
  [Figure: 'She bought a car' drawn with no crossing arcs (projective) vs. 'She bought a car yesterday that was blue', where the arc from 'car' to the relative clause crosses the arc from 'bought' to 'yesterday' (non-projective).]
  • Why projectivity?
    - The original sentence can be regenerated with the same word order.
    - Parsing is less expensive (O(n) vs. O(n²)).
    - There are not many non-projective relations in practice.
  A small crossing-edge check is sketched below.
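The sketch assumes arcs are (head, dependent) index pairs with the artificial root at index 0, and treats projectivity as the no-crossing-arcs condition stated on the slide:

    def is_projective(arcs):
        # Return True if no two arcs cross when drawn above the sentence.
        spans = [(min(h, d), max(h, d)) for h, d in arcs]
        for l1, r1 in spans:
            for l2, r2 in spans:
                # two arcs cross when exactly one endpoint of the second
                # span lies strictly inside the first
                if l1 < l2 < r1 < r2:
                    return False
        return True

    # 'She(1) bought(2) a(3) car(4) yesterday(5) that(6) was(7) blue(8)':
    # the arc car -> was (4, 7) crosses bought -> yesterday (2, 5)
    arcs = [(0, 2), (2, 1), (2, 4), (4, 3), (2, 5), (4, 7), (7, 6), (7, 8)]
    print(is_projective(arcs))   # False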
  7. Dependency Parsers
  • Two state-of-the-art dependency parsers:
    - MaltParser: performed best in the CoNLL 2007 shared task
    - MSTParser: performed best in the CoNLL 2006 shared task
  • MaltParser
    - Developed by Johan Hall, Jens Nilsson, and Joakim Nivre
    - Nivre's algorithm (projective, O(n)); Covington's algorithm (non-projective, O(n²))
  • MSTParser
    - Developed by Ryan McDonald
    - Eisner's algorithm (projective, O(k log k)); Edmonds's algorithm (non-projective, O(kn²))
  8–27. Nivre's Algorithm
  • Based on the shift-reduce algorithm
  • S = a stack; I = the list of remaining input tokens; A = the set of dependency arcs built so far
  • Trace for parsing 'she bought a car' (an arrow points from a dependent to its head, e.g., 'she ← bought'):
    - Initialize: S = [ ], I = [she, bought, a, car], A = { }
    - Shift: push 'she' onto S
    - Left-Arc: add 'she ← bought' to A and pop 'she' from S
    - Shift: push 'bought'
    - Shift: push 'a'
    - Left-Arc: add 'a ← car' to A and pop 'a'
    - Right-Arc: add 'bought → car' to A and push 'car'
    - Terminate (no need to reduce 'car' or 'bought')
  A minimal code sketch of this transition sequence appears below.
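The trace follows Nivre's arc-eager transition system (Shift, Reduce, Left-Arc, Right-Arc). The sketch below replays it with the oracle transition sequence hard-coded for this one sentence; a trained MaltParser would instead predict each transition with a classifier:

    def parse(tokens, transitions):
        # Run a sequence of arc-eager transitions over token indices;
        # returns the arc set A as (head, dependent) pairs.
        stack, buf, arcs = [], list(range(len(tokens))), []
        for t in transitions:
            if t == "SHIFT":          # push the next input token onto S
                stack.append(buf.pop(0))
            elif t == "REDUCE":       # pop a token that already has a head
                stack.pop()
            elif t == "LEFT-ARC":     # next input token becomes head of S's top
                arcs.append((buf[0], stack.pop()))
            elif t == "RIGHT-ARC":    # S's top becomes head of next input token
                arcs.append((stack[-1], buf[0]))
                stack.append(buf.pop(0))
        return arcs

    tokens = ["she", "bought", "a", "car"]
    seq = ["SHIFT", "LEFT-ARC", "SHIFT", "SHIFT", "LEFT-ARC", "RIGHT-ARC"]
    for head, dep in parse(tokens, seq):
        print(f"{tokens[dep]} <- {tokens[head]}")
    # she <- bought, a <- car, car <- bought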
  28. Edmonds's Algorithm
  • Based on the maximum spanning tree algorithm
  • Algorithm:
    1. Build a complete graph.
    2. For each node, keep only the incoming edge with the maximum score.
    3. If there is no cycle, go to step 5.
    4. If there is a cycle, treat the cycle as a single vertex and update the scores of all edges entering the cycle; go to step 2.
    5. Break all cycles by removing the appropriate edge in each cycle (the edge that causes multiple heads).
  A code sketch follows the worked example below.
  29–34. Edmonds's Algorithm (worked example)
  [Figure: the algorithm run on 'John saw Mary' with nodes {root, John, saw, Mary} and arc scores root→John 9, root→saw 10, root→Mary 9, John→saw 20, John→Mary 11, saw→John 30, saw→Mary 30, Mary→John 3, Mary→saw 0. Keeping the best incoming arcs (John→saw, saw→John, saw→Mary) creates the cycle John↔saw; contracting it and rescoring gives the arc from root into the contracted vertex a score of 40, so the final tree is root→saw, saw→John, saw→Mary.]
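The worked example can be reproduced with a compact recursive sketch of the Chu-Liu-Edmonds procedure; this is an illustration, not MSTParser's actual code. Scores are keyed by (head, dependent) pairs and node 0 is the root:

    def find_cycle(heads):
        # Return the set of nodes on a cycle in the head map, or None.
        for start in heads:
            path, cur = [], start
            while cur in heads and cur not in path:
                path.append(cur)
                cur = heads[cur]
            if cur in path:                 # walked back into the path
                return set(path[path.index(cur):])
        return None

    def chu_liu_edmonds(scores, nodes):
        # keep only the best incoming arc for every non-root node
        heads = {d: max((h for h in [0] + nodes if h != d),
                        key=lambda h: scores[(h, d)]) for d in nodes}
        cycle = find_cycle(heads)
        if cycle is None:
            return heads                    # a tree: {dependent: head}
        cyc_score = sum(scores[(heads[d], d)] for d in cycle)
        c = max(nodes) + 1                  # fresh vertex for the cycle
        rest = [n for n in nodes if n not in cycle]
        new_scores, origin = {}, {}         # origin maps arcs back out
        for h in [0] + rest + [c]:
            for d in rest + [c]:
                if h == d:
                    continue
                if h == c:                  # arc leaving the cycle
                    best = max(cycle, key=lambda x: scores[(x, d)])
                    new_scores[(h, d)] = scores[(best, d)]
                    origin[(h, d)] = (best, d)
                elif d == c:                # arc entering the cycle: add the
                    # cycle score minus the cycle arc it would replace
                    best = max(cycle, key=lambda x: scores[(h, x)]
                                                    - scores[(heads[x], x)])
                    new_scores[(h, d)] = (scores[(h, best)] + cyc_score
                                          - scores[(heads[best], best)])
                    origin[(h, d)] = (h, best)
                else:                       # arc untouched by contraction
                    new_scores[(h, d)] = scores[(h, d)]
                    origin[(h, d)] = (h, d)
        sub = chu_liu_edmonds(new_scores, rest + [c])
        # expand the contracted vertex: restore the cycle's arcs except
        # the one broken by the new incoming edge
        result, broken = {}, None
        for d, h in sub.items():
            oh, od = origin[(h, d)]
            result[od] = oh
            if d == c:
                broken = od
        for d in cycle:
            if d != broken:
                result[d] = heads[d]
        return result

    # "John saw Mary": 0 = root, 1 = John, 2 = saw, 3 = Mary
    scores = {(0, 1): 9, (0, 2): 10, (0, 3): 9,
              (1, 2): 20, (1, 3): 11,
              (2, 1): 30, (2, 3): 30,
              (3, 1): 3,  (3, 2): 0}
    print(chu_liu_edmonds(scores, [1, 2, 3]))
    # saw <- root, John <- saw, Mary <- saw

Tracing the example: the greedy pass picks John→saw, saw→John, saw→Mary, which contains the John↔saw cycle; after contraction, root's arc into the contracted vertex scores 10 + 50 − 20 = 40, it wins, and expanding the vertex keeps saw→John while breaking John→saw, giving the final tree from the figure.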
  35. MaltParser vs. MSTParser
  • Advantages
    - MaltParser: low complexity; more accurate on short-distance dependencies
    - MSTParser: high accuracy; more accurate on long-distance dependencies
  • Idea: merge MaltParser and MSTParser at the learning stage.
  36. Choi's Algorithm
  • A projective dependency parsing algorithm
    - Motivation: search more exhaustively than MaltParser while keeping the complexity lower than that of MSTParser
    - Intuition: in a projective dependency graph, every word can find its head among the words in adjacent phrases (e.g., 'She bought a car yesterday that was blue')
    - Searching: start with the adjacent edge node, then jump to its head
    - Complexity: O(k·n), where k is the number of words in each phrase
  37–42. Choi's Algorithm (worked example)
  [Figure: attaching a new word X to a partial parse over the words A–E. The search starts at the word adjacent to X and scores it as a candidate head (e.g., 0.9), then jumps to that candidate's head and scores again (0.7, 0.8, ...), keeping the highest-scoring attachment.]
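The slides give only the search intuition, so the following is a speculative sketch of that intuition, not the author's implementation; score is a hypothetical stand-in for a learned arc-scoring model:

    def find_head(new_word, adjacent, heads, score):
        # Pick a head for new_word by starting at the adjacent word
        # and repeatedly jumping to the current candidate's head.
        # heads: {word: its head}; score(head, dep): assumed model.
        best_head, best_score = None, float("-inf")
        cand = adjacent
        while cand is not None:        # at most k jumps per word
            s = score(cand, new_word)
            if s > best_score:
                best_head, best_score = cand, s
            cand = heads.get(cand)     # jump to the candidate's head
        return best_head

Each of the n words triggers at most k such head-to-head jumps, which is where the O(k·n) bound on slide 36 comes from.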
  43. Applications
  • Semantic Role Labeling
    - CoNLL 2008–2009 shared tasks
  • Sentence Compression
    - Relation extraction
  • Sentence Alignment
    - Paraphrase detection, machine translation
  • Sentiment Analysis
