
Ets Jan2005



  1. Unsupervised Word Sense Discrimination by Clustering Similar Contexts
     Ted Pedersen, University of Minnesota, Duluth
     http://www.d.umn.edu/~tpederse
     Research supported by a National Science Foundation Faculty Early Career Development Award (#0092784)
  2. Univ. of Minnesota, Duluth
     - Computer Science Dept.
       - 11 tenure/tenure-track faculty
       - 250 undergraduate majors
       - 30 MS students
         - 5 currently in the NLP @ UMD group
           - Anagha Kulkarni (SenseClusters)
           - Jason Michelizzi (WordNet::Similarity)
           - Pratheepan Ravendranathan (Google-Hack)
           - Apurva Padhye (semantic similarity in UMLS)
           - Mahesh Joshi (WSD for biomedical text)
  3. NLP @ UMD, Fall 2004
  4. Alumni
     - Amruta Purandare (MS 2004) -> Pitt/ISP (MS)
       - SenseClusters, Ngram Statistics Package, Senseval-3
     - Bridget McInnes (MS 2004) -> Univ of Minn/TC (PhD)
       - Collocation discovery
     - Siddharth Patwardhan (MS 2003) -> Univ of Utah (PhD)
       - WordNet::Similarity
     - Saif Mohammad (MS 2003) -> Univ of Toronto (PhD)
       - Supervised word sense disambiguation, sense-tagged data
     - Satanjeev Banerjee (MS 2002) -> CMU (PhD)
       - Ngram Statistics Package, WordNet::Similarity
  5. Alumni
  6. At UMD…
     - I do research… more about that soon…
     - I teach…
       - Natural Language Processing
         - The graduate NLP class worked on essay grading systems in Fall 2004
         - More on that later…
       - Operating Systems Practicum
         - Linux stuff
  7. Overall Research Objectives
     - Assign meanings to words
       - Bank means Financial Institution
     - Group words according to meaning
       - Line, Cord, Cable are synonyms
     - Organize texts according to content
       - Records of patients with similar ailments
     - Organize concepts by relationships
       - Rachel is a friend of Ross
  8. Making Free Software (Mostly Perl, All CopyLeft)
     - SenseClusters
       - Identify similar contexts
     - Ngram Statistics Package
       - Identify interesting sequences of words
     - WordNet::Similarity
       - Measure similarity among concepts
     - WordNet::SenseRelate
       - All-words sense disambiguation
     - Google-Hack
       - Find sets of related words
     - SyntaLex and Duluth systems
       - Supervised WSD
     - http://www.d.umn.edu/~tpederse/code.html
  9. Unsupervised Word Sense Discrimination by Clustering Similar Contexts
     With considerable assistance from Anagha Kulkarni (M.S. 2006) and Amruta Purandare (M.S. 2004)
  10. Overview
     - shells exploded in a US diplomatic complex in Liberia
     - shell scripts are user interactive
     - artillery guns were used to fire highly explosive shells
     - the biggest shop on the shore for serious shell collectors
     - shell script is a series of commands written into a file that Unix executes
     - she sells sea shells by the sea shore
     - sherry enjoys walking along the beach and collecting shells
     - firework shells exploded onto usually dark screens in a variety of colors
     - shells automate system administrative tasks
     - we specialize in low priced corals, starfish and shells
     - we help people in identifying wonderful sea shells along the coastlines
     - shop at the biggest shell store by the shore
     - shell script is much like the ms dos batch file
  11. The same contexts, sorted into clusters:

     - sherry enjoys walking along the beach and collecting shells
     - we specialize in low priced corals, starfish and shells
     - we help people in identifying wonderful sea shells along the coastlines
     - shop at the biggest shell store by the shore
     - she sells sea shells by the sea shore
     - the biggest shop on the shore for serious shell collectors

     - shell script is much like the ms dos batch file
     - shell script is a series of commands written into a file that Unix executes
     - shell scripts are user interactive
     - shells automate system administrative tasks

     - shells exploded in a US diplomatic complex in Liberia
     - firework shells exploded onto usually dark screens in a variety of colors
     - artillery guns were used to fire highly explosive shells
  12. Unsupervised discrimination?
     - Dictionaries are fixed and static, relative to the world at least
     - Sense distinctions made in dictionaries are not always the right ones for NLP applications
       - 29 senses of line?
     - Dictionaries don't agree
       - So which one do you use?
  13. Our goal?
     - Identify contexts that use a word in a similar way
       - I drove my car to the house.
       - My car doesn't drive very well any more.
     - Assume the word has similar or related meanings in those contexts
     - Automatically create a descriptive label that serves as a definition of that word in those contexts
     - … make it possible to automatically discover meanings and categorize words relative to them, without the use of resources that are difficult to create and maintain …
  14. Our Approach
     - Strong Contextual Hypothesis
       - Sea Shells => (sea, beach, ocean, water, corals)
       - Bomb Shells => (kill, attack, fire, guns, explode)
       - Unix Shells => (machine, OS, computer, system)
     - Corpus-based machine learning
     - Knowledge-lean
       - Portable – other languages, domains
       - Scalable – large raw text
       - Adaptable – fluid word meanings
  15. Methodology
     - Feature Selection
     - Context Representation
     - Measuring Similarities
     - Clustering
     - Evaluation
  16. Feature Selection
     - What data?
     - What features?
     - How to select?
  17. What Data?
     - Separate training and test?
       - Training => features
       - Test => cluster
     - Training = Test
       - Identify features from the data to be clustered
  18. Local Training
     - Pectens or Scallops are one of the few bivalve shells that actually swim. This is accomplished by rapidly opening & closing their valves, sending the shell backward.
     - Fire marshals hauled out something that looked like a rifle with tubes attached to it, along with several bags of bullets and shells.
     - If you hear a snapping sound when you're in the water, chances are it is the sound of the valves hitting together as it opens and shuts its shell.
     - Teenagers tried to make a bomb or some kind of homemade fireworks by taking the bullets and shotgun shells apart and collecting the black powder.
     - Bivalve shells are mollusks with two valves joined by a hinge. Most of the 20,000 species are marine, including clams, mussels, oysters and scallops.
     - There was an explosion in one of the shells, it flamed over the top of the other shells and sealed in the fireworks, so when they ignited, it made it react like a pipe bomb.
     - These edible oysters are the most commonly known throughout the world as a popular source of seafood. The shell is porcelaneous and the pearls produced from these edible oysters have little value.
  19. Surface Lexical Features
     - Unigrams
     - Bigrams
     - Co-occurrences
  20. Unigrams
     - in today's world the scallop is a popular design in architecture and is well known as the shell gasoline logo
     - if you hear a snapping sound when you're in the water chances are it is the sound of the valves hitting together as it opens and shuts its shell
  21. Bigrams
     - she sells sea shells on the sea shore

       Selected: she<>sells, sells<>sea, sea<>shells, sea<>shore
       Rejected: shells<>on, on<>the, the<>sea
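The split above can be sketched in a few lines. This is a minimal illustration, not the SenseClusters implementation; the two-word stoplist is an assumption chosen to reproduce the slide's selected/rejected split.

```python
# Minimal sketch of bigram selection: adjacent word pairs are kept unless
# either word is on a stoplist. The stoplist below is an assumption.
STOPLIST = {"on", "the"}

def bigrams(tokens):
    """Adjacent word pairs, written w1<>w2."""
    return [f"{a}<>{b}" for a, b in zip(tokens, tokens[1:])]

def select_bigrams(tokens, stoplist=STOPLIST):
    """Split bigrams into (selected, rejected) by the stoplist."""
    selected, rejected = [], []
    for bg in bigrams(tokens):
        w1, w2 = bg.split("<>")
        (rejected if w1 in stoplist or w2 in stoplist else selected).append(bg)
    return selected, rejected

sel, rej = select_bigrams("she sells sea shells on the sea shore".split())
```

In practice SenseClusters selects bigrams with a statistical test of association rather than a stoplist alone; the stoplist stands in for that step here.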
  22. Bigrams in Window
     - she sells sea shells on the sea shore

       Window 3: sells<>shells
       Window 4: shells<>sea
       Window 5: sea<>sea, shells<>shore
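One way to read the windowed bigrams above: an ordered pair counts for window w when both words fit inside a span of at most w tokens. A hedged sketch of that interpretation:

```python
def window_bigrams(tokens, window):
    """Ordered pairs w1<>w2 whose span (inclusive of both endpoints)
    is at most `window` tokens."""
    pairs = []
    for i in range(len(tokens)):
        # span j - i + 1 <= window  =>  j <= i + window - 1
        for j in range(i + 1, min(i + window, len(tokens))):
            pairs.append(f"{tokens[i]}<>{tokens[j]}")
    return pairs

tokens = "she sells sea shells on the sea shore".split()
```

With this definition, widening the window from 3 to 5 adds exactly the pairs the slide lists for the example sentence.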
  23. Co-occurrences
     - Scallops are bivalve shells that actually swim
     - Teenagers tried to make a bomb or some kind of homemade fireworks by taking the bullets and shotgun shells apart
     - bivalve shells are mollusks with two valves joined by a hinge
     - shells can decorate an aquarium
  24. Feature Matching
     - Exact, no stemming
     - Unigram matching
       - sells doesn't match sell or sold
     - Bigram matching
       - No window
         - sea shells doesn't match sea shore sells or shells sea
       - Window
         - sea shells matches sea creatures live in shells
     - Co-occurrence matching
  25. 1st Order Context Vectors
     - C1: if she sells shells by the sea shore, then the shells she sells must be sea shore shells and not firework shells
     - C2: store the system commands in a unix shell and invoke csh to execute these commands

            sea  shore  system  execute  firework  unix  commands
       C1    2     2      0       0        1        0      0
       C2    0     0      1       1        0        1      2
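A first-order context vector is just a count of each feature word in the context. A minimal sketch, using the feature words from the slide:

```python
# Sketch: first-order context vectors count how often each feature word
# occurs directly in the context. The feature list mirrors the slide.
FEATURES = ["sea", "shore", "system", "execute", "firework", "unix", "commands"]

def first_order_vector(context, features=FEATURES):
    tokens = context.lower().split()
    return [tokens.count(f) for f in features]

c1 = ("if she sells shells by the sea shore , then the shells she sells "
      "must be sea shore shells and not firework shells")
c2 = ("store the system commands in a unix shell and invoke csh to "
      "execute these commands")
v1 = first_order_vector(c1)
v2 = first_order_vector(c2)
```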
  26. 2nd Order Context Vectors
     - The largest shell store by the sea shore

                   Sells      Water     North-West  Sandy     Bombs    Sales      Artillery
       Sea          18.5533   3324.98     30.520     51.7812  8.7399       0         0
       Shore         0           0        29.576    136.0441  0            0         0
       Store       134.5102    205.5469    0          0       0        18818.55      0
       O2 context   51.021    1176.84     20.032     62.6084  2.9133    6272.85      0
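The O2 context row above is the dimension-wise average of the word vectors for sea, shore, and store. A sketch of that averaging (the word vectors are copied from the slide; the column order is an assumption):

```python
# Sketch: a second-order context vector averages the co-occurrence vectors
# of the content words in the context. Column order is an assumption:
#          sells    water     n-west   sandy     bombs   sales     artillery
WORD_VECTORS = {
    "sea":   [18.5533, 3324.98,  30.520,  51.7812, 8.7399,     0.0, 0.0],
    "shore": [0.0,        0.0,   29.576, 136.0441, 0.0,        0.0, 0.0],
    "store": [134.5102, 205.5469, 0.0,     0.0,    0.0,   18818.55, 0.0],
}

def second_order_vector(context_words, vectors=WORD_VECTORS):
    """Average the available word vectors of the context, dimension-wise."""
    rows = [vectors[w] for w in context_words if w in vectors]
    return [sum(col) / len(rows) for col in zip(*rows)]

o2 = second_order_vector(["sea", "shore", "store"])
```

Averaging recovers the O2 context row of the table (e.g. 51.021 for Sells, 1176.84 for Water).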
  27. 2nd Order Context Vectors
     [figure: the context vector built from the word vectors for sea, shore, and store]
  28. Measuring Similarities
     - c1: {file, unix, commands, system, store}
     - c2: {machine, os, unix, system, computer, dos, store}
     - Matching = |X ∩ Y|
       - {unix, system, store} = 3
     - Cosine = |X ∩ Y| / (√|X| * √|Y|)
       - 3/(√5 * √7) = 3/(2.2361 * 2.6458) = 0.5070
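The two set-based measures above can be sketched directly:

```python
# Sketch of the matching and cosine coefficients from the slide.
import math

def match(x, y):
    """Matching coefficient: size of the intersection."""
    return len(set(x) & set(y))

def cosine(x, y):
    """Set cosine: |X n Y| / (sqrt(|X|) * sqrt(|Y|))."""
    x, y = set(x), set(y)
    return len(x & y) / math.sqrt(len(x) * len(y))

c1 = {"file", "unix", "commands", "system", "store"}
c2 = {"machine", "os", "unix", "system", "computer", "dos", "store"}
```

Cosine normalizes by the context sizes, so a large context does not look similar to everything merely because it contains many words.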
  29. Limitations
     - With exact matching, two vectors about related topics can share almost no nonzero dimensions:

             Kill   Murder   Destroy   Fire    Shoot   Missile   Weapon
              2.53    0        1.28     0       3.24     0        28.72
              0       4.21     0        0.92    0       52.27      0

             Burn    CD       Fire     Pipe    Bomb    Command   Execute
              2.56    1.28     0       72.7     0       2.36      19.23
             34.2     0       22.1     46.2    14.6     0         17.77
  30. Latent Semantic Analysis
     - Singular Value Decomposition
     - Resolves polysemy and synonymy
     - Conceptual, fuzzy feature matching
     - Word space to semantic space
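The SVD step can be sketched as a truncated decomposition that projects rows of a co-occurrence matrix into a low-dimensional semantic space. The toy matrix below is an assumption for illustration, not data from the talk:

```python
# Sketch of truncated SVD as used in LSA: keep the top k singular
# dimensions and represent each row in that reduced space.
import numpy as np

def truncated_svd(matrix, k):
    """Reduced row representation: first k left singular vectors, scaled
    by their singular values."""
    u, s, _vt = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]

# toy matrix: rows = words, columns = co-occurrence features,
# with two obvious "topics" in the column space
m = np.array([[2.0, 1.0, 0.0, 0.0],
              [1.0, 2.0, 0.0, 0.0],
              [0.0, 0.0, 1.0, 0.5],
              [0.0, 0.0, 0.5, 1.0]])
reduced = truncated_svd(m, 2)
```

In the reduced space the first two rows collapse onto one latent dimension and the last two onto the other, so rows with similar but not identical feature profiles become directly comparable.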
  31. Clustering
     - UPGMA
       - Hierarchical: agglomerative
     - Repeated Bisections
       - Hybrid: divisive + partitional
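UPGMA (average-link agglomerative clustering) repeatedly merges the two clusters with the smallest average pairwise distance. A self-contained sketch on toy 2-D points (the points and metric are assumptions for illustration; production runs would use CLUTO):

```python
# Sketch of UPGMA: merge the closest pair of clusters by average pairwise
# distance until k clusters remain.
def upgma(points, k, dist):
    clusters = [[p] for p in points]
    while len(clusters) > k:
        best = None  # (average distance, i, j)
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                d = sum(dist(a, b) for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] += clusters[j]   # merge j into i
        del clusters[j]
    return clusters

points = [(0, 0), (0, 1), (10, 10), (10, 11)]
euclid = lambda a, b: ((a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2) ** 0.5
groups = upgma(points, 2, euclid)
```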
  32. Evaluation (before mapping)
     Rows are discovered clusters, columns are gold-standard senses (order arbitrary):

            s1   s2   s3   s4
       C1   10    0    3    2
       C2    1    1    7    1
       C3    2    1    1    6
       C4    2   15    1    2
  33. Evaluation (after mapping)
     Columns are reordered so each cluster's best-matching sense lies on the diagonal:

             c1   c2   c3   c4  | total
       C1    10    3    2    0  |  15
       C2     1    7    1    1  |  10
       C3     2    1    6    1  |  10
       C4     2    1    2   15  |  20
     total   15   12   11   17  |  55

     Accuracy = 38/55 = 0.69
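The mapping step picks the cluster-to-sense assignment that maximizes the confusion-matrix diagonal. For a handful of clusters a brute-force search over permutations suffices (for many clusters one would use the Hungarian algorithm instead); the matrix below is the before-mapping table:

```python
# Sketch: choose the column permutation that maximizes the diagonal of the
# cluster-by-sense confusion matrix, then report accuracy.
from itertools import permutations

CONFUSION = [  # rows = clusters C1..C4, columns = senses (arbitrary order)
    [10,  0, 3, 2],
    [ 1,  1, 7, 1],
    [ 2,  1, 1, 6],
    [ 2, 15, 1, 2],
]

def best_mapping(matrix):
    """Return (perm, score): sense column perm[i] is assigned to row i."""
    n = len(matrix)
    best_perm, best_score = None, -1
    for perm in permutations(range(n)):
        score = sum(matrix[i][perm[i]] for i in range(n))
        if score > best_score:
            best_perm, best_score = perm, score
    return best_perm, best_score

perm, correct = best_mapping(CONFUSION)
total = sum(map(sum, CONFUSION))
accuracy = correct / total   # 38/55
```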
  34. Majority Sense Classifier
     Maj. = 17/55 = 0.31 (the most frequent sense covers 17 of the 55 instances)
  35. Data
     - Line, Hard, Serve
       - 4000+ instances / word
       - 60:40 split
       - 3-5 senses / word
     - SENSEVAL-2
       - 73 words = 28 V + 29 N + 15 A
       - Approx. 50-100 test, 100-200 train
       - 8-12 senses / word
  36. Experiment 1: Features and Measures
     - Features
       - Unigrams
       - Bigrams
       - Second-order co-occurrences
     - 1st order contexts
     - Similarity measures
       - Match
       - Cosine
     - Agglomerative clustering with UPGMA
     - Senseval-2 data
  37. Experiment 1: Results by POS
     Number of words of each POS for which the experiment beat the majority classifier:

                   Nouns (29)   Verbs (28)   Adjectives (15)
       UNI  MAT         9            8              0
       UNI  COS        13            7              1
       BI   MAT         5            3              0
       BI   COS         5            5              0
       SOC  MAT         6            7              1
       SOC  COS        11            6              1
  38. Experiment 1: Results by Feature
     The same counts, grouped by feature type (totals across POS and measure):

       UNI (38):  MAT: N 9, ADJ 0, V 8;   COS: N 13, ADJ 1, V 7
       SOC (32):  MAT: N 6, ADJ 1, V 7;   COS: N 11, ADJ 1, V 6
       BI  (18):  MAT: N 5, ADJ 0, V 3;   COS: N 5,  ADJ 0, V 5
  39. Experiment 1: Results by Measure
     The same counts, grouped by similarity measure (totals across POS and feature):

       COS (49):  UNI: N 13, ADJ 1, V 7;   BI: N 5, ADJ 0, V 5;   SOC: N 11, ADJ 1, V 6
       MAT (39):  UNI: N 9,  ADJ 0, V 8;   BI: N 5, ADJ 0, V 3;   SOC: N 6,  ADJ 1, V 7
  40. Experiment 1: Conclusions
     - The scaling done by Cosine helps
     - 1st order contexts are very sparse
     - Similarity space is even more sparse
  41. Experiment 2: 2nd Order Contexts and RBR
     Pedersen & Bruce (1st order contexts):
       - PB1: co-occurrences, UPGMA, similarity space
       - PB2: PB1 except RB, vector space
       - PB3: PB1 with bigram features
     Schütze (2nd order contexts):
       - SC1: co-occurrence matrix, SVD, RB, vector space
       - SC2: SC1 except UPGMA, similarity space
       - SC3: SC1 with bigram matrix
  42. Experiment 2: Sval2 Results, Bigrams vs. Co-occurrences

                          N   A   V
       SC1 vs SC3
         Bigram > COC     7   1   2
         Bigram < COC     6   4   2
         Bigram = COC     1   1   0
       PB1 vs PB3
         Bigram > COC     9   3   3
         Bigram < COC     4   1   1
         Bigram = COC     1   2   0
  43. Experiment 2: Sval2 Results, RB vs. UPGMA

                          N   A   V
       SC1 vs SC2
         RB > UPGMA       9   4   1
         RB < UPGMA       4   0   2
         RB = UPGMA       1   2   1
       PB1 vs PB2
         RB > UPGMA       8   1   3
         RB < UPGMA       2   5   0
         RB = UPGMA       4   0   1
  44. Experiment 2: Sval2 Results, Comparing with MAJ

                     N   A   V   Total
       SC3 > MAJ     8   3   1    12
       SC1 > MAJ     6   2   2    10
       PB2 > MAJ     7   2   0     9
       SC2 > MAJ     6   1   2     9
       PB1 > MAJ     4   1   1     6
       PB3 > MAJ     3   0   2     5
  45. Experiment 2: Results on Line, Hard, Serve (top 3)

                  1st   2nd   3rd
       Line.n     PB1   PB3   PB2
       Hard.a     PB3   PB1   SC2
       Serve.v    PB3   PB1   PB2
  46. Experiment 2: Conclusions

       Nature of Data                                  Recommendation
       Large, homogeneous (like Line, Hard, Serve)     1st order, UPGMA
       Smaller data (like SENSEVAL-2)                  2nd order, RB
  47. Ongoing Work
     - Sense labeling
       - Treat the contexts in a cluster as a mini corpus
       - Identify the most significant collocations
         - Ngram Statistics Package
       - Treat as text to be summarized
       - Treat as a headline generation problem
  48. What's this really all about?
     - Search Google for Ted Pedersen
  49. Mangled Web Search Results
     - Organize the Ted Pedersens
     - Label them
       - Professor of Computer Science who does natural language processing research
       - Author of children's books about computers and science fiction
       - Lighthouse keeper from long ago
  50. Software
     - SenseClusters: http://senseclusters.sourceforge.net/
     - Ngram Statistics Package: http://www.d.umn.edu/~tpederse/nsp.html
     - Cluto: http://www-users.cs.umn.edu/~karypis/cluto/
     - SVDPack: http://netlib.org/svdpack/
  51. CS 8761 – Fall 2004
     - Essay grading project
     - 5 students per team
       - Randomly assigned
     - Use Perl
     - Create a CGI interface
     - 8 weeks to produce alpha, beta, and final versions
     - Distribute code and make the interface available
  52. Each system had to have…
     - Gibberish detection
       - Syntactic (POS sequences, link grammar)
       - Semantic (semantic relatedness)
     - Relevance measure
       - Mostly LSA-like
       - Measure semantic similarity
  53. Each system had to have…
     - Fact identification
       - Lists of words that indicate opinions or subjectivity
       - Filter out everything but facts
     - Fact checking
       - Google: count the hits
       - Wikipedia: find the facts
  54. Class web page, with links…
     - http://www.d.umn.edu/~tpederse/Courses/CS8761-FALL04/class.html
  55. Hi, from Duluth!
