INFORMATION EXTRACTION FROM QUERIES Ed Snelson, Joaquin Quiñonero Candela, Ralf Herbrich, Thore Graepel
  1. INFORMATION EXTRACTION FROM QUERIES Ed Snelson, Joaquin Quiñonero Candela, Ralf Herbrich, Thore Graepel
  2. Information extraction from queries
  3. Templates
  4. Probabilistic query modelling
  5. Key details
       • EP message passing for inference within single query model
       • ADF single pass through queries
       • Sparse messages within query
       • Bootstrap from initial seed sets of instances/attributes
       • Directed processing of queries based on current top beliefs
  6. Data
       • 10 months, Live Search query logs
       • 100 million unique queries, with associated counts
       • Preliminary experiments on small specific subsets
       • e.g. 50,000 unique queries related to actors, cars and national parks
  7. Seed lists
  8. Actors
       Instances             Attributes
       tom cruise            movies
       brad pitt             pictures
       johnny depp           dealer.com
       matt damon            photos
       george clooney        angelina jolie
       cameron diaz          nude
       scarlett johansson    biography
       mel gibson            news
       grand canyon          height
       sharon stone          wedding
  9. Cars
       Instances             Attributes
       dealer                {Year}
       honda civic           parts
       honda accord          hybrid
       ford mustang          dealer
       dodge charger         used
       toyota camry          world
       ford explorer         accessories
       toyota corolla        ford
       ford focus            cleveland plain
       dodge durango         wachovia
  10. National Parks
       Instances             Attributes
       grand canyon          national park
       yellowstone           park
       yosemite              tours
       redwood               lodging
       denali                hotels
       everglades            lodge
       algonquin             west
       joshua tree           skywalk
       west yellowstone      gmc
       shenandoah            college
  11. Templates
       • [Inst] [Attr]
       • [Attr] [Inst]
       • {Year} [Inst] [Attr]
       • [Attr] of [Inst]
       • [Inst] and [Attr]
       • [Attr] and [Inst]
       • [Attr] in [Inst]
       • the [Attr] [Inst]
       • how [Attr] is [Inst]
       • [Attr] [Inst] coupe
       • [Attr] [Inst] parts
       • the [Inst] [Attr]
       • [Inst] 's [Attr]
       • [Inst] in [Attr]
  12. Future improvements
       • Class/Attribute dependent templates
       • A garbage class to deal with “noise”
       • Reducing sensitivity to order of processing initial queries
       • Disambiguation, synonyms, etc.
       • Use of part-of-speech tagger
       • Combination with standard hand-crafted entity extraction techniques
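
The slides name templates with [Inst] and [Attr] slots and a bootstrap from seed instances, but give no implementation. The following is a minimal Python sketch of that idea only, not the authors' method: it replaces the probabilistic EP/ADF beliefs (expectation propagation, assumed density filtering) with plain weighted counts, uses a small subset of the templates from slide 11, and restricts attributes to a single token for simplicity. The function names `match` and `extract_attributes`, the example queries, and their counts are all hypothetical.

```python
import re
from collections import Counter

# A subset of the slot templates listed on slide 11.
TEMPLATES = ["[Inst] [Attr]", "[Attr] [Inst]", "[Attr] of [Inst]", "[Inst] 's [Attr]"]


def match(query, template, instances):
    """Yield (instance, attribute) pairs where the whole query fits the
    template with [Inst] bound to a known seed instance.

    Simplification (an assumption, not from the slides): [Attr] may only
    bind a single whitespace-free token.
    """
    for inst in instances:
        pattern = re.escape(template)
        pattern = pattern.replace(re.escape("[Inst]"), re.escape(inst))
        pattern = pattern.replace(re.escape("[Attr]"), r"(?P<attr>\S+)")
        m = re.fullmatch(pattern, query)
        if m:
            yield inst, m.group("attr")


def extract_attributes(query_counts, instances, templates=TEMPLATES):
    """Single pass over the queries: score each candidate attribute by the
    summed counts of the queries in which it co-occurs with a seed instance.

    Hard counts stand in for the probabilistic beliefs of the real model.
    """
    scores = Counter()
    for query, count in query_counts.items():
        for template in templates:
            for _inst, attr in match(query, template, instances):
                scores[attr] += count
    return scores


# Hypothetical query log: query string -> number of occurrences.
query_counts = {
    "tom cruise movies": 5,
    "brad pitt movies": 3,
    "pictures of brad pitt": 2,
    "honda civic parts": 4,  # no seed instance, so it contributes nothing
}
seed_instances = {"tom cruise", "brad pitt"}

scores = extract_attributes(query_counts, seed_instances)
print(scores.most_common())  # [('movies', 8), ('pictures', 2)]
```

Swapping the roles (seed attributes in, candidate instances out) gives the other half of a bootstrap loop. Noisy entries such as "grand canyon" under Actors on slide 8 suggest why the slides propose a garbage class and reduced sensitivity to processing order.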