Hierarchical Taxonomy Extraction

    Hierarchical Taxonomy Extraction - Presentation Transcript

    1. Hierarchical taxonomy extraction by mining topical query sessions. KDIR09, International Conference on Knowledge Discovery and Information Retrieval 2009. Miguel Fernández Fernández and Daniel Gayo Avello. Monday, 19 October 2009
    4. Example queries: brittany spears; www.wikipedia.org; horse jumping; auto restoration; auto repair; classic car repair; car supplies; classic car batteries; vintageparts.com; low cost airlines; cheap flights; easyjet.com. “... a series of interactions by the user toward addressing a single information need...” (Jansen et al. 2007)
    7. Unfortunately, not all queries are equally effective (Wang and Zhai 2008. Mining term association patterns from search logs for effective query reformulation). Misspecification: different people use different words to describe the same thing! Underspecification: the user has shallow knowledge about what he is looking for.
    8. How can misspecification and underspecification be mitigated?
    12. Misspecification (typo); query suggestion; query expansion based on clustering
    14. Wuh! Pretty cool, but... jargon, slang and vague domain knowledge ...are still in the game
    16. Not semantic query suggestion & expansion
    17. How?
    24. hyponym |ˈhīpəˌnim|, hyponymy |hīˈpänəmē|: a hyponym is a word of more specific meaning than a general or superordinate term applicable to it. Transitivity ➞ deductive power (Socrates is mortal). Hyponym semantic equivalence, synsets (Ferrari and Lamborghini are luxury cars).
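The deductive power of transitivity mentioned on the hyponymy slides can be illustrated with a small sketch. The function name `hypernyms_of` and the edge list are hypothetical: the middle term `man` is the classical completion of the "Socrates is mortal" example and does not appear on the slide, while the `speedo`, `briefs`, `underwear` chain is taken from the results slides.

```python
from collections import defaultdict


def hypernyms_of(term, edges):
    """Return every direct or inherited hypernym of `term`.

    `edges` is an iterable of (hyponym, hypernym) pairs.
    """
    parents = defaultdict(set)
    for hypo, hyper in edges:
        parents[hypo].add(hyper)

    found, stack = set(), [term]
    while stack:
        for parent in parents[stack.pop()]:
            if parent not in found:  # also guards against cycles
                found.add(parent)
                stack.append(parent)
    return found


edges = [
    ("socrates", "man"),       # assumed middle term, not on the slide
    ("man", "mortal"),
    ("speedo", "briefs"),      # chain shown on the results slides
    ("briefs", "underwear"),
]

print(hypernyms_of("socrates", edges))
print(hypernyms_of("speedo", edges))
```

Only direct relations are stored; everything else is derived by walking upwards, which is why a hyponym taxonomy can answer more questions than the number of relations it holds.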
    28. Semantic data sources, ordered by complexity and semantic richness: taxonomies (hyponymy); thesauri (hyponymy, synonymy); wordnets (hyponymy, synonymy, meronymy, troponymy, entailment, [...]); ontologies (any relation).
    32. Miller and Fellbaum 1990. WordNet, an online lexical database. Hard to maintain (despite Hearst ’92); language specific; absence of proper names, jargon, slang (Mandala et al. ’99, Gabrilovich & Markovitch ’07).
    33. Our proposal for KDIR’09
    34. Automatically build hyponym taxonomies that capture not only formal lexicon semantics, but also relations between the terms actually used by search engine users. Do it without needing any source of information other than the query log itself.
    35. Marti A. Hearst. 1992. Automatic acquisition of hyponyms from large text corpora. Caraballo. 1999. Automatic construction of a hypernym-labeled noun hierarchy from text. Girju, Badulescu and Moldovan. 2003. Learning semantic constraints for the automatic discovery of part-whole relations. [...]
    36. Baeza-Yates and Tiberi. 2007. Extracting semantic relations from query logs. Shen et al. 2008. Mining web query hierarchies from clickthrough data. Paşca ’07; Sekine and Suzuki ’07; Mika ’07; Schmitz ’06; Komachi and Suzuki ’08.
    37. (Same references as slide 36, with overlaid annotations that are not recoverable from the transcript.)
    38. What we did
    43. (1) Reveal topical sessions; (2) filter noisy information; (3) identify generalization / specialization patterns; (4) extract hyponymy relations from patterns.
    45. Log sessionization. AOL 2006, > 30M queries
    46. Daniel Gayo-Avello. 2009. “A survey on session detection methods in query logs and a proposal for future evaluation”
    47. Queries in chronological order: summer collection briefs (17:46:48); speedo summer collection (17:48:33); madonna get into the groove (17:55:47); madonna get into the groove (17:57:29); videogames cheats and codes (18:02:56); cheatsandcodes.com (18:10:27); madonna get into the groove (18:11:40); getintothegroovelyrics (18:12:27)
    48. The same queries regrouped by topic: summer collection briefs (17:46:48), speedo summer collection (17:48:33); madonna get into the groove (17:55:47), madonna get into the groove (17:57:29), madonna get into the groove (18:11:40), getintothegroovelyrics (18:12:27); videogames cheats and codes (18:02:56), cheatsandcodes.com (18:10:27)
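The regrouping shown on slides 47 and 48 can be imitated with a simple stand-in heuristic. This is not the session detection method of Gayo-Avello 2009, which the slides cite but do not describe; it is only a minimal sketch that groups queries by character-trigram overlap. The function names and the 0.3 threshold are assumptions chosen for this example.

```python
import re


def trigrams(query):
    """Character trigrams of the query with non-alphanumerics removed."""
    s = re.sub(r"[^a-z0-9]", "", query.lower())
    return {s[i:i + 3] for i in range(len(s) - 2)}


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_topic(queries, threshold=0.3):
    """Greedy single-link grouping: a query joins the first session that
    contains a sufficiently similar query, otherwise it opens a new one."""
    sessions = []
    for q in queries:
        grams = trigrams(q)
        for session in sessions:
            if any(jaccard(grams, trigrams(other)) >= threshold for other in session):
                session.append(q)
                break
        else:
            sessions.append([q])
    return sessions


log = [
    "summer collection briefs",
    "speedo summer collection",
    "madonna get into the groove",
    "madonna get into the groove",
    "videogames cheats and codes",
    "cheatsandcodes.com",
    "madonna get into the groove",
    "getintothegroovelyrics",
]

for session in group_by_topic(log):
    print(session)
```

Stripping spaces before taking trigrams is what lets `getintothegroovelyrics` and `cheatsandcodes.com` land next to their spaced counterparts; a word-level comparison would miss both.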
    49. Noise filtering
    52. Jim Jansen and Amanda Spink. 2008. Determining the informational, navigational and transactional intent of queries.
    56. Queries left after filtering: summer collection briefs (17:46:48); speedo summer collection (17:48:33)
    57. Specialization identification
    61. fish food → tropical fish food: terms added (trivial). formula one pilots → Fernando Alonso: queries don’t share any term (out of scope). summer collection briefs → speedo summer collection: some terms added, others removed.
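The three cases on the specialization identification slides can be told apart with plain set operations on query terms. The sketch below is one possible reading, not the authors' implementation: the function name and the returned labels are hypothetical, and the term overlap alone does not establish which query is the more general one.

```python
def classify_reformulation(previous, current):
    """Label the change from `previous` to `current` by comparing term sets."""
    prev_terms = set(previous.lower().split())
    curr_terms = set(current.lower().split())
    added = curr_terms - prev_terms
    removed = prev_terms - curr_terms

    if not prev_terms & curr_terms:
        return "out of scope"                       # no shared term
    if added and not removed:
        return "trivial specialization"             # terms only added
    if added and removed:
        return "specialization with reformulation"  # some added, others removed
    return "other"                                  # identical, or terms only removed


pairs = [
    ("fish food", "tropical fish food"),
    ("formula one pilots", "Fernando Alonso"),
    ("summer collection briefs", "speedo summer collection"),
]
for previous, current in pairs:
    print(previous, "->", current, ":", classify_reformulation(previous, current))
```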
    62. Relation extraction
    65. Relation extraction: Specialization w/reformulation. summer collection briefs: 35,000,000; speedo summer collection: 163,000
    66. Relation extraction: Specialization w/reformulation. summer collection briefs ⊇ speedo summer collection
    67. Relation extraction: Specialization w/reformulation. briefs ← speedo ✓
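Slides 65 to 67 can be read as the following heuristic, sketched here under two assumptions the slides do not state: that the figures 35,000,000 and 163,000 are search result counts, and that the query with more results is the more general one. The function name and signature are hypothetical.

```python
def relation_from_reformulation(query_a, count_a, query_b, count_b):
    """Return (hypernym, hyponym) built from the terms the queries do not
    share, or None when the pattern does not apply."""
    terms_a = query_a.lower().split()
    terms_b = query_b.lower().split()
    only_a = [t for t in terms_a if t not in terms_b]
    only_b = [t for t in terms_b if t not in terms_a]

    shares_terms = len(only_a) < len(terms_a)
    if not (shares_terms and only_a and only_b) or count_a == count_b:
        return None

    # Assumption: the larger count marks the more general query.
    if count_a > count_b:
        return " ".join(only_a), " ".join(only_b)
    return " ".join(only_b), " ".join(only_a)


print(relation_from_reformulation(
    "summer collection briefs", 35_000_000,
    "speedo summer collection", 163_000,
))
```

Dropping the shared terms `summer collection` leaves `briefs` on the general side and `speedo` on the specific side, which is the relation ticked on slide 67.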
    69. Relation extraction: Trivial specialization. fish food → tropical fish food ✓
    76. Relation extraction: Trivial specialization. Candidate relations: fish ← tropical fish ✓; fish ← fish food ✗; fish ← tropical fish food ✗; food ← fish food ✓; food ← tropical fish food ✓; fish food ← tropical fish food ✓
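The six candidates and verdicts on slide 76 are consistent with a simple rule: pair every word n-gram of the general query with every longer n-gram of the specific query that contains it, and accept the pair only when the shorter n-gram is the tail (head noun phrase) of the longer one. The slides do not state this rule, so the sketch below is an inferred reading with hypothetical function names.

```python
def ngrams(tokens):
    """All contiguous word n-grams of a token list, as tuples."""
    return [tuple(tokens[i:j])
            for i in range(len(tokens))
            for j in range(i + 1, len(tokens) + 1)]


def contains(big, small):
    """True if `small` occurs as a contiguous run inside `big`."""
    n = len(small)
    return any(big[i:i + n] == small for i in range(len(big) - n + 1))


def candidate_relations(general, specific):
    """Return (hypernym, hyponym, accepted) triples for a trivial specialization."""
    results = []
    for a in ngrams(general.split()):
        for b in ngrams(specific.split()):
            if len(a) < len(b) and contains(b, a):
                accepted = b[-len(a):] == a  # assumed rule: hypernym is the tail
                results.append((" ".join(a), " ".join(b), accepted))
    return results


for hyper, hypo, ok in candidate_relations("fish food", "tropical fish food"):
    print(f"{hyper} <- {hypo}", "accepted" if ok else "rejected")
```

On the slide's example this yields the same six candidates, rejecting `fish ← fish food` and `fish ← tropical fish food` because `fish` is a modifier there rather than the head.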
    83. Preliminary results (3000 instances). Overall: correct / wrong, 62.67% / 37.33%. Correct relations: present in WordNet / not present in WordNet, 70.22% / 29.78%. Wrong relations: co-hyponyms / unrelated terms, 53.75% / 46.25%. Correct examples: eventing ← jumping; underwear ← briefs ← speedo; celtic ← irish. Co-hyponym errors: yellow ← white; honda ← kawasaki. Unrelated-term errors: fish food ← fish; scandal ← election.
    88. Work in progress. Machine learning specialization detection (Paolo Boldi et al. 2009. From 'dango' to 'japanese cakes'), e.g. qi: Formula one pilots, qj: Fernando Alonso. Multi-word term identification (Rosie Jones et al. 2006. Generating query substitutions), e.g. golden globe awards, new york maps.
    93. Future work: finish ongoing work; evaluation framework; relevance ranking; suggestions?
    96. research@miguelfernandez.info
