Karlgren Presentation Transcript

  • 1. constructional syntactic analysis for information access tasks. jussi karlgren, yandex, february 16, 2011
  • 2. jussi karlgren: ph d in (computational) linguistics from stockholm; senior researcher in information access at sics, stockholm; docent in language technology at univ of helsinki; founding partner, gavagai ab, stockholm
  • 3. • independent non-profit research institute• about 100-200 researchers• ... networks, distributed systems, programming tools, collaborative environments, information access, design, digital art...
  • 4. • recent startup company• about 7-8 employees• extracts actionable intelligence from very large text streams
  • 5. why use computational methods and machinery for information access? two reasons: 1 amount of data is overwhelming → reduce data complexity (let’s call these “simple” tasks); 2 signal is weak and complex → peer closer into data (let’s call these “difficult” tasks)
  • 9. for the simple tasks the sensible thing to do is to pound the text into small bits and count the various types of bit.
  • 11. this works well up to a point. for search engines. but not for e.g. authorship attribution.
  • 14. what is the next sensible thing to do? try to organise the bits into piles first, generalising them, or try to see if the bits have relations to each other, building more complex structures. which involves non-trivial complex decisions and results in a brittle, error-prone procedure.
  • 17. why is parsing impractical? 1 new text; 2 categories unfounded in data; 3 dependencies not based on necessity or efficiency
  • 21. what is in the signal? “It is this, I think, that commentators mean when they say glibly that the ‘world changed’ after Sept 11.” words? or something more?
  • 23. linguistics has an answer. but that answer doesn’t help much in practical applications.
  • 25. what is in the signal to begin with? just words? and a pattern?
  • 28. “ah, but the words are in the clause, the pattern is only expressed by the words that participate in it.”
  • 29. so patterns do not exist when not in use? do words exist outside their usage? “To ask where a verbal operant is when a response is not in the course of being emitted is like asking where one’s knee-jerk is when the physician is not tapping the patellar tendon.” B.F. Skinner, Verbal Behavior
  • 32. we claim that patterns are part of the signal, not incidental to it, nor secondary to the terms in it. this appears to be a contentious statement.
  • 34. radical construction grammar (cxg): 1 syntax-lexicon continuum; 2 form and function specified in unified model; 3 structurally cohesive (william croft, 2005)
  • 35. 1. syntax-lexicon continuum

        Construction type                   Examples
        Complex and abstract   syntax       sbj be-tense verb-en by agent
        Complex and concrete   idiom        up-tense the ante
        Complex and bound      morphology   noun-s
        Atomic and abstract    category     adj, clause
        Atomic and concrete    lexicon      this, green

    all are equal: constructions are the primitive elements → no parts of speech, no syntactic categories necessary
  • 36. 2. form and function specified in unified model → no separate syntactic (or semantic) component necessary
  • 37. 3. structurally cohesive → everything is constructions and nothing else; everything is specific and nothing is universal
  • 38. practically: the pattern of an utterance is a feature with the same ontological status as the terms that occur in the utterance. constructions and lexemes both have conceptual meaning. constructions or patterns are present even without recourse to the words in them.
  • 41. our claim: to study pattern occurrences, no coupling between the features and the words carrying them needs to be done. this is quite convenient. which is good.
  • 43. patterns, in various forms, have been used in language technology for some time: the linguistic string project (1965-1998), Naomi Sager et al, leading to information extraction: a large number of ad-hoc pattern descriptions, closely based on data as observed in use
  • 45. now turn to one example task: identification and analysis of attitude
  • 46. attitude analysis can be done on any text source. blogs: unfettered discourse, word of mouth, low publication threshold, no editorial control. but it’s new text: new processing practice necessary
  • 48. a prototypical attitudinal expression

        Expression          WHO       FEELS WHAT       ABOUT WHAT
        I like sauerkraut   I         like             sauerkraut
        Kissing is nice     ?         nice             kiss
                            someone   sentiment term   topic

    is this picture true?
  • 50. it is this, i think, that commentators mean when they say glibly that the ‘world changed’ after sept 11.
    president hafez al-assad has said that peace was a pressing need for the region and the world at large and syria, considering peace a strategic option, would take steps towards peace.
    mr cohen, beginning an eight-day european tour including a nato defence ministers’ meeting in brussels today and tomorrow, said he expected further international action soon, though not necessarily military intervention.
    the designers from house on fire do not like random play.
    sauerkraut is damn good but kimchi is even better.
    bertram powerboats have a deep v hull and handle well in choppy sea.
    m.a.k. halliday thought it natural to view syntax from a functional perspective.
  • 51. our claim: attitude is not only lexical, or: lexicon is not only words & terms. “He blew me off” vs “He blew off”; “He has the best result, we cannot fail him” vs “This is the best coffee, we cannot fail with it”; “Fifth Avenue”, “9/11”
  • 52. an experiment
  • 53. we’ll hand code a number of sample constructions to test our claim that they might be useful to identify attitudinal expressions. remember: to study patterns we do not need to encode explicit linkage to words!
  • 55. we represent each sentence using three separate sets of features: I content words, F form words, K constructions
  • 56. I features: content words – nouns, adjectives, verbs (including verbal uses of participles), adverbs, abbreviations, numerals, interjections, and negation
  • 57. F features: function words – prepositions, determiners, conjunctions, pronouns
  • 58. K: sentence structure – transitivity, predicate, relative, and object clauses, tense shift within sentence
  • 59. K: various adverbials – adverbials of location, time, manner, condition, quantity, clause adverbial, clause initial adverbials
  • 60. K: morphology of sentence constituents – present or past tense, adjectives in base, comparative, or superlative form
  • 61. K: word dependencies and categories – subordinate conjunctions, negations, prepositional post modifiers, verb chains, quantifiers, particle verbs, prepositional phrases, adjective modifiers
  • 62. “It is this, I think, that commentators mean when they say glibly that the ‘world changed’ after Sept 11.”
    I: be think commentator mean when say glibly world change sept 11
    F: it this i that they that the after
    K: AdvlTim, AdvlMan, ObjCls, PredCls, TRIn, TRtr, TRmix, TnsPres, TnsPast, TnsShift
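The I/F/K split for this worked example can be sketched in a few lines; a toy illustration only, where the form-word list, the tense lists, and the single K detector are illustrative stand-ins for a real morphological and syntactic analysis, and no lemmatisation is done (the slide shows lemmas such as "be" and "commentator"):

```python
# Toy I/F/K feature split: form words vs content words, plus one
# construction feature (tense shift) detected from hand-made word lists.
FORM_WORDS = {"it", "this", "i", "that", "they", "the", "after"}
PAST = {"said", "thought", "changed", "believed", "was"}
PRESENT = {"is", "are", "think", "mean", "say", "makes"}

def features(sentence):
    tokens = [t.strip(".,'\u2018\u2019\u201c\u201d").lower()
              for t in sentence.split()]
    tokens = [t for t in tokens if t]
    I = [t for t in tokens if t not in FORM_WORDS]   # content words
    F = [t for t in tokens if t in FORM_WORDS]       # form words
    K = []                                           # construction features
    tenses = {("past" if t in PAST else "present")
              for t in tokens if t in PAST | PRESENT}
    if "past" in tenses:
        K.append("TnsPast")
    if "present" in tenses:
        K.append("TnsPres")
    if len(tenses) > 1:    # both tenses in one sentence: tense shift
        K.append("TnsShift")
    return I, F, K

I, F, K = features("It is this, I think, that commentators mean when "
                   "they say glibly that the 'world changed' after Sept 11.")
```

On the example sentence this recovers the slide's F list exactly and fires TnsPast, TnsPres, and TnsShift.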
  • 63. in preliminary experiments with SVM feature selection we find that several of the K features have high rank for categorisation, notably TnsShift, TnsPast, TRmix, PredCls.
  • 64. TnsShift: “Noam Chomsky said (past) that what makes human language unique is (present) recursive centre embedding”; “M.A.K. Halliday believed (past) that grammar, viewed functionally, is (present) natural” → saves us from acquiring and maintaining lists of verbs of utterance, pronouncement, and cognition.
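A tense-shift detector of this kind can rely on morphological heuristics rather than verb lists; a rough sketch, where the "-ed" suffix rule and the small irregular-verb table are assumptions, not the morphology used in the experiments:

```python
import re

# Tense guessed from form alone: no list of utterance/cognition verbs needed.
IRREGULAR_PAST = {"said", "thought", "was", "were", "had", "made"}
PRESENT_FORMS = {"is", "are", "has", "makes", "believes"}

def tense(token):
    t = token.lower().strip(".,\u201c\u201d")
    if t in IRREGULAR_PAST or re.fullmatch(r"[a-z]+ed", t):
        return "past"
    if t in PRESENT_FORMS:
        return "present"
    return None

def has_tense_shift(sentence):
    tenses = {tense(t) for t in sentence.split()} - {None}
    return len(tenses) > 1   # both past and present verbs in one sentence

print(has_tense_shift(
    "Noam Chomsky said that what makes human language unique "
    "is recursive centre embedding"))   # → True ('said' past, 'is' present)
```

The suffix rule overgenerates (e.g. "need" would be tagged past), which is the usual trade-off of such morphological shortcuts.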
  • 66. [figure: factor plot, Factor 2 (13.8 %), locating K features (KmpAdj, AdjMod, TnsShift, RelCls, TnsPast, VChain, ObjCls, TRmix, PredCls, SupAdj, AdvlMan, Neg) relative to attitude classes Pos, Neg, Neu, NoAtt]
  • 67. our experiment:
    1 represent sentences using three sets of features: I, F, K.
    2 build a language representation using one year of newsprint: test for differences between KT, MD, GH.
    3 test sets of attitudinal sentences: SEMEVAL, NTCIR 6 & 7, MPQA.
    4 put test sentences in word space (random indexing, 2000 dims) with added feature indicating attitude.
    5 extract feature vector from word space and run through SVM.
    6 test with five-fold crossvalidation.
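The random-indexing step (putting sentences in word space) can be sketched in pure Python; a minimal illustration under stated assumptions: 2000 dimensions as in the experiment, but the sparsity value and the feature names are placeholders, and the SVM classification step is not shown:

```python
import random

DIM, NONZERO = 2000, 8   # 2000 dims as in the experiment; sparsity is a guess

def index_vector(feature, dim=DIM, nonzero=NONZERO):
    """Deterministic sparse random ternary index vector for one feature."""
    rng = random.Random(feature)            # seed from the feature name
    vec = [0] * dim
    for pos in rng.sample(range(dim), nonzero):
        vec[pos] = rng.choice((-1, 1))
    return vec

def sentence_vector(features):
    """Sentence vector = sum of the index vectors of its features."""
    vec = [0] * DIM
    for f in features:
        for i, x in enumerate(index_vector(f)):
            vec[i] += x
    return vec

# Sentences sharing features get overlapping vectors; these fixed-width
# vectors are what would then be fed to the SVM.
v1 = sentence_vector(["i:like", "i:sauerkraut", "k:TnsPres"])
v2 = sentence_vector(["i:like", "i:kimchi", "k:TnsPres"])
dot = sum(a * b for a, b in zip(v1, v2))   # positive: two shared features
```

Seeding the generator with the feature name makes the mapping reproducible without storing a lookup table, which is one of the practical attractions of random indexing.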
  • 73. experimental data:

                          NTCIR 6   NTCIR 7   SEMEVAL     MPQA
        Attitudinal         1 392     1 075        76    6 021
        Non-attitudinal     4 416     3 201       174    4 982
        Total               5 808     4 276       250   11 003
  • 74. F1 scores:

        F1     NTCIR 6   NTCIR 7   MPQA   SEMEVAL
        I         46.1      45.2   63.4      42.4
        F         44.9      47.5   65.4      40.4
        K         42.3      43.6   63.7      33.8
        IF        45.9      47.4   67.3      41.4
        IK        45.9      48.6   67.0      38.6
        FK        46.1      47.9   68.0      37.5
        IFK       47.5      48.6   69.2      41.8

        Precision range approx 40, approx 70, approx 30; recall range approx 55-65

    K features often help and never really hurt. (karlgren et al, ECIR 2010)
  • 75. SEMEVAL is different: “Discovered Boys Bring Shock, Joy” (+45); “Iraq Car Bombings Kill 22 People, Wound more than 60” (−98)
  • 76. 1 results tie with reported NTCIR and SEMEVAL results, not far from best MPQA results.
    2 combinations with K generally better than those without.
    3 SEMEVAL data: much lower results, no surprise given terseness.
    4 background language model has some effect: Glasgow Herald better precision; Korea Times better for recall for NTCIR data.
  • 80. but this was all hand coded.
  • 81. next (preliminary) experiment: skeletons.
  • 82. put sentences in word space (random indexing, 2000 dims) with added feature for each trigram of structural terms
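One reading of "a trigram of structural terms" is trigrams over a sentence skeleton in which content words are collapsed to a placeholder; a hypothetical sketch, where the placeholder symbol and the form-word list are assumptions rather than the slide's definitions:

```python
# Skeletons: keep form words, mask content words, take trigrams.
FORM_WORDS = {"it", "is", "this", "i", "that", "they", "the", "after",
              "a", "an", "of", "in", "on", "by"}

def skeleton_trigrams(sentence):
    """Replace content words with '*' and return trigrams of the skeleton."""
    tokens = [t.strip(".,").lower() for t in sentence.split()]
    skel = [t if t in FORM_WORDS else "*" for t in tokens]
    return [tuple(skel[i:i + 3]) for i in range(len(skel) - 2)]

trigrams = skeleton_trigrams("It is this that they say after Sept 11.")
# e.g. ('it', 'is', 'this'), ('that', 'they', '*'), ('after', '*', '*')
```

Each such trigram could then be treated as one more feature when building the word space, alongside the I, F, and K features.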
  • 83. prove utility by better choice of task? • sentiment and opinion identification • quote identification • novelty detection • authorship attribution • summarisation • terminology mining. suggestions?
  • 86. take home:
    1 constructional models have suitable granularity for simple and difficult tasks
    2 constructional models provide simple methodology to test the effect of language structure
    3 constructional features are not subsidiary to word occurrence features
    4 constructional analysis has a long history in language technology
    5 constructional analysis has an opportunity to influence linguistics
  • 91. to discuss:
    1 what constitutes a construction?
    2 what is not a construction?
    3 what sort of tasks can use constructions profitably?
    4 what sort of abstractions do we want to use for describing constructions productively?
    5 how can we learn constructions automatically?