
Query Expansion with Locally-Trained Word Embeddings (Neu-IR 2016)

Slides presented at the Neu-IR workshop in SIGIR 2016.



  1. Query Expansion with Locally-Trained Word Embeddings. Fernando Diaz, Bhaskar Mitra, Nick Craswell (Microsoft). July 21, 2016.
  2. Word embedding: a discriminatively trained vector representation of terms.
  3. $L = \sum_{t=1}^{T} \underbrace{\omega_{x_t}}_{\text{term weight}} \Big[ \underbrace{\sum_{y \in V_t^c} \log \sigma\big(\phi(x_t) \cdot \phi(y)\big)}_{\text{observed context}} + \underbrace{\sum_{y \in V_t^n} \log \sigma\big(-\phi(x_t) \cdot \phi(y)\big)}_{\text{negative context}} \Big]$
  4. $\omega_{x_t}$ needs to reflect the importance of the term at evaluation time.
  5. $\sum_{t=1}^{T} \omega_{x_t = w} \propto p(w|C)$
  6. What terms are important at query time?
  7. $p(w|R)$: the probability of the term in the relevant documents.
  8. How different is $p(w|R)$ from $p(w|C)$?
  9. $\mathrm{KL}(R, C)_w = p(w|R) \log \frac{p(w|R)}{p(w|C)}$
  10. [Figure: per-term KL(R, C) plotted against term rank; KL axis ranges from 0.00 to 0.35.]
  11. How much better can we do if we train with $\sum_{t=1}^{T} \omega_{x_t} \propto p(w|R)$?
  12. Language Model Scoring: $\mathrm{score}(d, q) = \mathrm{KL}(\theta_q, \theta_d)$, where $\theta_q$ is the maximum-likelihood query language model and $\theta_d$ is the document language model.
  13. Query Expansion with Word Embeddings: $\tilde{\theta}_q = U U^{\mathsf{T}} \theta_q$, where $U$ is the $|V| \times k$ term embedding matrix (a code sketch follows the slide list).
  14. Query Expansion with Word Embeddings: $U_{\text{global}}$ is an embedding trained with $p(w|C)$; $U_{\text{local}}$ is an embedding trained with $p(w|R)$.
  15. Getting $p(w|R)$: $p(d) = \frac{\exp(-\mathrm{KL}(\theta_q, \theta_d))}{\sum_{d'} \exp(-\mathrm{KL}(\theta_q, \theta_{d'}))}$
  16. Getting $p(w|R)$ (continued): $\tilde{p}(w|R) = \sum_{d} p(w|\theta_d)\, p(d)$ (sketched in code after the slide list).
  17. Experiments
  18. Data:
      corpus    docs          words         queries
      trec12    469,949       438,338       150
      robust    528,155       665,128       250
      web       50,220,423    90,411,624    200
      giga      9,875,524     2,645,367     -
      wiki      3,225,743     4,726,862     -
  19. Embeddings:
      • global
        • public embeddings (GloVe, word2vec)
        • word2vec trained on the target corpus
      • local: word2vec trained on documents sampled by p(d) (see the sampling sketch after the slide list)
  20. Evaluation:
      • ten-fold cross-validation
      • metric: NDCG@10
  21. Results (NDCG@10):
      run                  dim    trec12    robust    web
      QL (no expansion)    -      0.514     0.467     0.216
      global: wiki+giga    50     0.518     0.470     0.227
      global: wiki+giga    100    0.518     0.463     0.229
      global: wiki+giga    200    0.530     0.469     0.230
      global: wiki+giga    300    0.531     0.468     0.232
      global: gnews        300    0.530     0.472     0.218
      global: target       400    0.545     0.465     0.216
      local: target        400    0.535     0.475     0.234
      local: giga          400    0.563*    0.517*    0.236
      local: wiki          400    0.523     0.476     0.258*
  22. [Figure: global vs. local embeddings of the top words by $\tilde{p}(w|R)$; blue: query terms, red: top words by $p(w|R)$.]
  23. Summary
      • local embedding provides a stronger representation than global embedding
      • potential impact for other topic-specific natural language processing tasks
      • future work
        • effectiveness improvements
        • efficiency improvements
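
The sketches below illustrate the main steps from the slides in Python; they are reconstructions under stated assumptions, not the authors' released code.

First, slides 12-13: scoring a document by the KL divergence between language models, and expanding the query model as $\tilde{\theta}_q = UU^{\mathsf{T}}\theta_q$. Here `theta_q` and `theta_d` are assumed to be dense NumPy vectors over a fixed vocabulary, `U` is the $|V| \times k$ embedding matrix, and the clipping/renormalization of the expanded query is an assumption; the slides only give the equations.

```python
import numpy as np

def expand_query(theta_q, U):
    """theta_q_tilde = U U^T theta_q (slide 13), renormalized to a distribution."""
    expanded = U @ (U.T @ theta_q)
    expanded = np.clip(expanded, 0.0, None)  # assumption: drop negative mass before renormalizing
    return expanded / expanded.sum()

def kl_score(theta_q, theta_d, epsilon=1e-10):
    """score(d, q) = KL(theta_q, theta_d) (slide 12); smaller means a closer match."""
    mask = theta_q > 0
    return float(np.sum(theta_q[mask] * np.log(theta_q[mask] / (theta_d[mask] + epsilon))))

# Usage: rank documents by ascending kl_score(expand_query(theta_q, U_local), theta_d).
```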
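Next, slides 15-16: turning retrieval scores for the top-ranked documents into an estimate of $p(w|R)$. The document language models are assumed to be dicts mapping terms to probabilities; names such as `relevance_model` and `doc_models` are illustrative, not from the slides.

```python
import math
from collections import defaultdict

def kl_divergence(theta_q, theta_d, epsilon=1e-10):
    """KL(theta_q || theta_d) computed over the query model's support."""
    return sum(p_q * math.log(p_q / max(theta_d.get(w, 0.0), epsilon))
               for w, p_q in theta_q.items() if p_q > 0)

def relevance_model(query_model, doc_models):
    """Estimate p(w|R) as a score-weighted mixture of document language models."""
    # p(d) = softmax over -KL(theta_q, theta_d)   (slide 15)
    scores = [-kl_divergence(query_model, theta_d) for theta_d in doc_models]
    m = max(scores)
    weights = [math.exp(s - m) for s in scores]  # subtract max for numerical stability
    z = sum(weights)
    p_d = [w / z for w in weights]

    # p~(w|R) = sum_d p(w|theta_d) * p(d)        (slide 16)
    p_w_R = defaultdict(float)
    for theta_d, pd in zip(doc_models, p_d):
        for w, p in theta_d.items():
            p_w_R[w] += p * pd
    return dict(p_w_R)
```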
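Finally, the "local" embedding from slide 19: sample documents in proportion to $p(d)$ and train word2vec on the sample, so that the implicit term weights track $p(w|R)$ rather than $p(w|C)$ (slides 5 and 11). gensim's Word2Vec (4.x API) is used purely as a stand-in trainer; the slides do not name a toolkit, and the sample size and hyperparameters here are placeholders apart from the 400-dimensional setting used for the local runs.

```python
import random
from gensim.models import Word2Vec  # assumption: gensim 4.x API

def train_local_embedding(docs, p_d, n_samples=1000, dim=400, seed=0):
    """docs: list of tokenized documents; p_d: matching list of selection probabilities."""
    rng = random.Random(seed)
    # Sample documents with replacement according to p(d) from slide 15.
    sampled = rng.choices(docs, weights=p_d, k=n_samples)
    # Skip-gram with negative sampling on the sampled documents.
    model = Word2Vec(sentences=sampled, vector_size=dim, sg=1,
                     negative=10, window=5, min_count=1, epochs=5)
    return model.wv  # term -> vector lookup used as U_local
```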
