# lda2vec Text by the Bay 2016

lda2vec, word2vec, and LDA

With notes here:
http://www.slideshare.net/ChristopherMoody3/lda2vec-text-by-the-bay-2016-notes

Published in: Technology

### lda2vec Text by the Bay 2016

1. 1. lda2vec (word2vec, and lda) Christopher Moody @ Stitch Fix
2. 2. About @chrisemoody: Caltech Physics; PhD in astrostats supercomputing; sklearn t-SNE contributor; Data Labs at Stitch Fix; github.com/cemoody. Gaussian processes, t-SNE, chainer, deep learning, tensor decomposition.
3. 3. 1. word2vec 2. lda 3. lda2vec
4. 4. word2vec: 1. king - man + woman = queen 2. Huge splash in NLP world 3. Learns from raw text 4. Pretty simple algorithm 5. Comes pretrained
5. 5. word2vec: 1. Set up an objective function 2. Randomly initialize vectors 3. Do gradient descent
6. 6. word2vec: learn word vector w from its surrounding context
7. 7. word2vec: “The fox jumped over the lazy dog”. Maximize the likelihood of seeing the words given the word “over”: P(the|over) P(fox|over) P(jumped|over) P(the|over) P(lazy|over) P(dog|over) …instead of maximizing the likelihood of co-occurrence counts.
8. 8. word2vec: P(fox|over). What should this be?
9. 9. word2vec: P(fox|over) should depend on the word vectors: P(v_fox|v_over).
10. 10-23. word2vec: “The fox jumped over the lazy dog”. P(w|c): extract pairs from the context window around every input word. [Animation frames: the w and c markers move across the sentence.]
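
The pair extraction described on these slides can be sketched in a few lines. This is a minimal illustration, not code from the talk: the function name `skipgram_pairs` and the `window=3` setting are assumptions, with the window chosen so that the neighbours of "over" match the six probabilities listed on slide 7.

```python
def skipgram_pairs(tokens, window=3):
    """Yield (input_word, nearby_word) pairs from a symmetric window.

    Hypothetical helper for illustration; `window` counts the words
    taken on each side of the input word.
    """
    for i, word in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:  # skip the input word itself
                yield word, tokens[j]


sentence = "the fox jumped over the lazy dog".split()
pairs = list(skipgram_pairs(sentence, window=3))

# Words that get paired with the input word "over"
neighbours_of_over = [nearby for word, nearby in pairs if word == "over"]
print(neighbours_of_over)
```

Each pair then becomes one training example for the objective introduced on the following slides.
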
24. 24. objective: How should we define P(w|c)? Measure loss between w and c?
25. 25. objective: How should we define P(w|c)? Measure loss between w and c? w · c
26. 26. word2vec objective: w · c ~ 1, e.g. v_canada · v_snow ~ 1
27. 27. word2vec objective: w · c ~ 0, e.g. v_canada · v_desert ~ 0
28. 28. word2vec objective: w · c ~ -1
29. 29. word2vec objective: w · c ∈ [-1, 1]
30. 30. word2vec objective: w · c ∈ [-1, 1]. But we’d like to measure a probability.
31. 31. word2vec objective: But we’d like to measure a probability: σ(c·w) ∈ [0, 1]
32. 32. word2vec objective: But we’d like to measure a probability: σ(c·w) ∈ [0, 1], from dissimilar (near 0) to similar (near 1).
33. 33. word2vec objective: Loss function: L = σ(c·w). Logistic (binary) choice: is the (context, word) combination from our dataset?
34. 34. word2vec objective: The skip-gram negative-sampling model. L = σ(c·w). Trivial solution is that context = word for all vectors.
35. 35. word2vec objective: The skip-gram negative-sampling model. L = σ(c·w) + σ(-c·w_neg). Draw random words in vocabulary.
36. 36. word2vec objective: The skip-gram negative-sampling model. Multiple negatives: L = σ(c·w) + σ(-c·w_neg) + … + σ(-c·w_neg). Discriminate positive from negative samples.
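
A small numerical sketch of this objective for a single (context, word) pair may help. Two things here are assumptions rather than slide content: the slides write the terms as plain σ(·), while this sketch uses the usual log σ(·) form and negates it so that lower is better; and the vectors are random placeholders, with `sgns_loss` a hypothetical helper name.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def sgns_loss(c, w, w_negs):
    """Negative log-likelihood for one positive pair and k negative samples.

    c      : context vector, shape (dim,)
    w      : observed word vector, shape (dim,)
    w_negs : randomly drawn word vectors, shape (k, dim)
    """
    positive = np.log(sigmoid(np.dot(c, w)))           # pair seen in the data
    negative = np.sum(np.log(sigmoid(-(w_negs @ c))))  # pairs drawn at random
    return -(positive + negative)


rng = np.random.default_rng(0)
dim, k = 5, 3                      # illustrative sizes only
c = rng.normal(size=dim)
w = rng.normal(size=dim)
w_negs = rng.normal(size=(k, dim))
loss = sgns_loss(c, w, w_negs)

# A word aligned with c and negatives opposed to c should score better
# than the reverse arrangement.
good = sgns_loss(c, c, -np.tile(c, (k, 1)))
bad = sgns_loss(c, -c, np.tile(c, (k, 1)))
print(loss, good, bad)
```

Gradient descent on this quantity pushes c·w up for observed pairs and c·w_neg down for sampled ones, which is the "discriminate positive from negative samples" step on slide 36.
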
37. 37. word2vec PMI: The SGNS model, L = σ(c·w) + σ(-c·w_neg), is extremely similar to matrix factorization! $c_i \cdot w_j = \mathrm{PMI}(M_{ij}) - \log k$ (Levy & Goldberg 2014)
38. 38. word2vec PMI: The SGNS model, L = σ(c·w) + σ(-c·w_neg), is extremely similar to matrix factorization! $c_i \cdot w_j = \mathrm{PMI}(M_{ij}) - \log k$ (Levy & Goldberg 2014; ‘traditional’ NLP)
39. 39. word2vec PMI: The SGNS model, L = σ(c·w) + Σ σ(-c·w). $c_i \cdot w_j = \log \frac{\#(c_i, w_j)/n}{k \, (\#(w_j)/n) \, (\#(c_i)/n)}$ (Levy & Goldberg 2014; ‘traditional’ NLP)
40. 40. word2vec PMI: The SGNS model, L = σ(c·w) + Σ σ(-c·w). $c_i \cdot w_j = \log \frac{\text{popularity of } c, w}{k \, (\text{popularity of } c) \, (\text{popularity of } w)}$ (Levy & Goldberg 2014; ‘traditional’ NLP)
41. 41. word2vec PMI: 99% of word2vec is counting. And you can count words in SQL.
42. 42. word2vec PMI: Count how many times you saw the pair (c, w). Count how many times you saw c. Count how many times you saw w.
43. 43. word2vec PMI: …and this takes ~5 minutes to compute on a single core. Computing the SVD is a completely standard math-library routine.
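
The counting recipe on slides 39 to 43 can be sketched with a `Counter` and NumPy. Several choices below are assumptions, not slide content: unseen or negative entries are clipped to zero (the shifted positive PMI variant), the singular values are split evenly between the two factors, and the toy pair list and the name `shifted_ppmi` are made up for illustration.

```python
from collections import Counter

import numpy as np


def shifted_ppmi(pairs, k=1):
    """Build a shifted positive PMI matrix from (word, context) pairs."""
    n = len(pairs)
    pair_counts = Counter(pairs)                  # #(c, w)
    word_counts = Counter(w for w, _ in pairs)    # #(w)
    ctx_counts = Counter(c for _, c in pairs)     # #(c)
    word_index = {w: i for i, w in enumerate(sorted(word_counts))}
    ctx_index = {c: j for j, c in enumerate(sorted(ctx_counts))}

    M = np.zeros((len(word_index), len(ctx_index)))
    for (w, c), n_wc in pair_counts.items():
        pmi = np.log((n_wc / n) / ((word_counts[w] / n) * (ctx_counts[c] / n)))
        # Shift by log k, then clip at zero (assumption: positive PMI variant)
        M[word_index[w], ctx_index[c]] = max(pmi - np.log(k), 0.0)
    return word_index, ctx_index, M


# Toy data: two word/context pairs, each seen twice
pairs = [("fox", "jumped"), ("fox", "jumped"), ("dog", "lazy"), ("dog", "lazy")]
word_index, ctx_index, M = shifted_ppmi(pairs, k=1)

# Factorize with a standard SVD and keep the top `dim` components
U, S, Vt = np.linalg.svd(M, full_matrices=False)
dim = 2
word_vectors = U[:, :dim] * np.sqrt(S[:dim])
print(M)
print(word_vectors)
```

On a real corpus the matrix would be sparse and a truncated sparse SVD would be the more practical choice; the dense version here only keeps the arithmetic visible.
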
44. 44. word2vec
45. 45. ITEM_3469 + ‘Pregnant’
46. 46. + ‘Pregnant’
47. 47. = ITEM_701333 = ITEM_901004 = ITEM_800456
48. 48. What about LDA?
49. 49. LDA on Client Item Descriptions
50. 50-52. LDA on Item Descriptions (with Jay)
53. 53. lda vs word2vec
54. 54. Bayesian graphical model (LDA) vs. ML neural model (word2vec)
55. 55. word2vec is local: one word predicts a nearby word. “I love finding new designer brands for jeans”
56. 56-57. “I love finding new designer brands for jeans” But text is usually organized.
58. 58. “I love finding new designer brands for jeans” (doc 7681). In LDA, documents globally predict words.
59. 59. Typical word2vec vector: [ -0.75, -1.25, -0.55, -0.12, +2.2] (all real values). Typical LDA document vector: [ 0%, 9%, 78%, 11%] (all sum to 100%).
60. 60. 5D word2vec vector: [ -0.75, -1.25, -0.55, -0.12, +2.2]. Dense; all real values; dimensions relative. 5D LDA document vector: [ 0%, 9%, 78%, 11%]. Sparse; all sum to 100%; dimensions are absolute.
61. 61. 100D word2vec vector: [ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 … -0.12, +2.2]. Dense; all real values; dimensions relative. 100D LDA document vector: [ 0%, 0%, 0%, 0%, 0% … 0%, 9%, 78%, 11%]. Sparse; all sum to 100%; dimensions are absolute.
62. 62. 100D word2vec vector: [ -0.75, -1.25, -0.55, -0.27, -0.94, 0.44, 0.05, 0.31 … -0.12, +2.2]. Similar in 100D ways (very flexible). 100D LDA document vector: [ 0%, 0%, 0%, 0%, 0% … 0%, 9%, 78%, 11%]. Similar in fewer ways (more interpretable); +mixture +sparse.
63. 63. can we do both? lda2vec
64. 64. word2vec predicts locally: one word predicts a nearby word. [Diagram: skip grams from the sentence “Lufthansa is a German airline and when”; the word vector (#hidden units) for “German” feeds the negative sampling loss.]
65. 65. Document vector predicts a word from a global context. [Diagram: document weight (#topics) → document proportion (#topics) × topic matrix (#topics × #hidden units) → document vector (#hidden units); word vector + document vector = context vector → negative sampling loss. Example values: document weight 0.34, -0.1, 0.17; document proportion 41%, 26%, 34%.]
66. 66-67. We’re missing mixtures & sparsity! [Same diagram.]
68. 68. Now it’s a mixture. [Same diagram.]
69. 69. [Same diagram; document weight (#topics): 0.34, -0.1, 0.17.] Trinitarian, baptismal, Pentecostals, Bede, schismatics, excommunication
70. 70. topic 1 = “religion”: Trinitarian, baptismal, Pentecostals, Bede, schismatics, excommunication
71. 71. [Same diagram.] Milosevic, absentee, Indonesia, Lebanese, Israelis, Karadzic
72. 72. topic 2 = “politics”: Milosevic, absentee, Indonesia, Lebanese, Israelis, Karadzic
73. 73-75. [Same diagram, no new text.]
76. 76. Sparsity! Document proportions over time: t=0: 34%, 32%, 34%; t=10: 41%, 26%, 34%; t=∞: 99%, 1%, 0%.
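
The data flow in this diagram can be sketched with NumPy. The document weights below are the ones shown on the slides; using a softmax to turn them into proportions is an assumption, though it reproduces the slide's 41% / 26% / 34% to within rounding. The topic matrix and word vector are random placeholders, since the values in the diagram could not be recovered reliably.

```python
import numpy as np


def softmax(x):
    e = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return e / e.sum()


rng = np.random.default_rng(0)
n_topics, n_hidden = 3, 5

doc_weight = np.array([0.34, -0.1, 0.17])              # from the slides
topic_matrix = rng.normal(size=(n_topics, n_hidden))   # placeholder values
word_vector = rng.normal(size=n_hidden)                # placeholder values

# Document weight -> document proportion (a mixture that sums to 1)
proportions = softmax(doc_weight)

# Document vector = proportion-weighted sum of topic vectors
doc_vector = proportions @ topic_matrix

# Context vector = word vector + document vector
context_vector = word_vector + doc_vector

print(np.round(proportions, 2))
print(context_vector)
```

The slides do not say how the proportions are driven toward the sparse t=∞ state, so that step is left out of this sketch.
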
77. 77. @chrisemoody lda2vec.com
78. 78. + API docs + Examples + GPU + Tests @chrisemoody lda2vec.com
79. 79. @chrisemoody Example: Hacker News comments. Topics: http://nbviewer.jupyter.org/github/cemoody/lda2vec/blob/master/examples/hacker_news/lda2vec/lda2vec.ipynb Word vectors: https://github.com/cemoody/lda2vec/blob/master/examples/hacker_news/lda2vec/word_vectors.ipynb
80. 80. @chrisemoody lda2vec.com If you want… human-interpretable doc topics, use LDA. Machine-useable word-level features, use word2vec. If you like to experiment a lot, and have topics over user / doc / region / etc. features (and you have a GPU), use lda2vec.
81. 81. ? @chrisemoody Multithreaded Stitch Fix
82. 82. @chrisemoody lda2vec.com
83. 83. Credit Large swathes of this talk are from previous presentations by: • Tomas Mikolov • David Blei • Christopher Olah • Radim Rehurek • Omer Levy & Yoav Goldberg • Richard Socher • Xin Rong • Tim Hopper
84. 84. Can we model topics to sentences? lda2lstm. “PS! Thank you for such an awesome idea” (doc_id=1846) @chrisemoody
85. 85. Can we model topics to sentences? lda2lstm. “PS! Thank you for such an awesome idea” (doc_id=1846). Can we model topics to images? lda2ae (TJ Torres) @chrisemoody
86. 86. 4. Fun Stuff: and now for something completely crazy
87. 87. translation (using just a rotation matrix). Mikolov 2013. English to Spanish by matrix rotation.
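
The slide only says that a rotation matrix maps one language's vectors onto the other's. One standard way to fit such a map from paired vectors is the orthogonal Procrustes solution via SVD, sketched below. The fitting procedure is an assumption and is not taken from the talk; the "English" and "Spanish" vectors are synthetic, and `fit_rotation` is a hypothetical helper name. Note that the result is orthogonal, so it may include a reflection as well as a rotation.

```python
import numpy as np


def fit_rotation(X, Y):
    """Orthogonal R minimizing ||X @ R - Y|| (orthogonal Procrustes)."""
    U, _, Vt = np.linalg.svd(X.T @ Y)
    return U @ Vt


rng = np.random.default_rng(0)
n_words, dim = 50, 4

X = rng.normal(size=(n_words, dim))               # synthetic 'English' vectors
Q, _ = np.linalg.qr(rng.normal(size=(dim, dim)))  # hidden orthogonal map
Y = X @ Q                                         # synthetic 'Spanish' vectors

R = fit_rotation(X, Y)

# Translate one 'English' vector; its nearest 'Spanish' row is the guess
translated = X[0] @ R
nearest = int(np.argmin(np.linalg.norm(Y - translated, axis=1)))
print(nearest)
```

With real embeddings the rows of X and Y would come from a seed dictionary of known translation pairs, and the fit would only be approximate.
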
88. 88. deepwalk (Perozzi et al. 2014). word2vec: learn word vectors from sentences, e.g. “The fox jumped over the lazy dog”. deepwalk: ‘words’ are graph vertices, ‘sentences’ are random walks on the graph.
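
Generating the "sentences" for this setup is straightforward to sketch. This is only an illustration of uniform random walks on a toy graph, not the DeepWalk reference implementation; the adjacency list, the walk length, and the name `random_walks` are assumptions.

```python
import random


def random_walks(graph, walks_per_node=2, walk_length=5, seed=0):
    """Return random walks over an adjacency-list graph.

    Each walk is a list of vertices and plays the role of a sentence.
    """
    rng = random.Random(seed)
    walks = []
    for start in graph:
        for _ in range(walks_per_node):
            walk = [start]
            while len(walk) < walk_length:
                neighbours = graph[walk[-1]]
                if not neighbours:  # dead end: stop this walk early
                    break
                walk.append(rng.choice(neighbours))
            walks.append(walk)
    return walks


# Toy undirected graph as an adjacency list
graph = {
    "a": ["b", "c"],
    "b": ["a", "c"],
    "c": ["a", "b", "d"],
    "d": ["c"],
}
walks = random_walks(graph)
for walk in walks:
    print(" ".join(walk))
```

The resulting walks can be fed to any skip-gram trainer exactly as tokenized sentences would be.
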
89. 89. Playlists at Spotify (context; sequence learning): ‘words’ are song indices, ‘sentences’ are playlists
90. 90. Playlists at Spotify (context; Erik Bernhardsson): great performance on ‘related artists’
91. 91. Fixes at Stitch Fix (sequence learning). Let’s try: ‘words’ are items, ‘sentences’ are fixes
92. 92. Fixes at Stitch Fix (context; sequence learning): learn similarity between styles because they co-occur; learn ‘coherent’ styles
93. 93. Fixes at Stitch Fix? (context; sequence learning) Got lots of structure!
94. 94. Fixes at Stitch Fix? (context; sequence learning)
95. 95. Fixes at Stitch Fix? (context; sequence learning) Nearby regions are consistent ‘closets’
96. 96. ? @chrisemoody Multithreaded Stitch Fix
97. 97. context dependent (Levy & Goldberg 2014): “Australian scientist discovers star with telescope”, context +/- 2 words
98. 98-99. context dependent (Levy & Goldberg 2014): “Australian scientist discovers star with telescope”, context
100. 100. context dependent (Levy & Goldberg 2014): BoW vs. DEPS contexts, topically-similar vs ‘functionally’ similar
101. 101. ? @chrisemoody Multithreaded Stitch Fix
102. 102. Crazy Approaches: Paragraph Vectors (just extend the context window); Content dependency (change the window grammatically); Social word2vec (deepwalk) (sentence is a walk on the graph); Spotify (sentence is a playlist of song_ids); Stitch Fix (sentence is a shipment of five items)
103. 103. CBOW: “The fox jumped over the lazy dog”. Guess the word given the context (v_IN for the context words, v_OUT for the word). ~20x faster. (This is the alternative.) SkipGram: “The fox jumped over the lazy dog”. Guess the context given the word (v_IN for the word, v_OUT for the context words). Better at syntax. (This is the one we went over.)
104. 104. lda2vec: v_DOC = a v_topic1 + b v_topic2 + … Let’s make v_DOC sparse.
105. 105. lda2vec: Let’s say that v_DOC adds: v_DOC = topic0 + topic1. This works! 😀 But v_DOC isn’t as interpretable as the topic vectors. 😔
106. 106. lda2vec: softmax(v_OUT * (v_IN + v_DOC))
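
Read literally, the expression on slide 106 scores every vocabulary word against the sum of the input word vector and the document vector, then normalizes with a softmax. A minimal sketch under that reading follows; the shapes, the random values, and the name `predict_word_probs` are assumptions for illustration.

```python
import numpy as np


def predict_word_probs(v_out, v_in, v_doc):
    """softmax(v_out @ (v_in + v_doc)) over the vocabulary.

    v_out : output vectors for all words, shape (vocab, hidden)
    v_in  : input word vector, shape (hidden,)
    v_doc : document vector, shape (hidden,)
    """
    scores = v_out @ (v_in + v_doc)
    scores = scores - scores.max()  # numerical stability
    e = np.exp(scores)
    return e / e.sum()


rng = np.random.default_rng(0)
vocab, hidden = 7, 5                      # illustrative sizes
v_out = rng.normal(size=(vocab, hidden))
v_in = rng.normal(size=hidden)
v_doc = rng.normal(size=hidden)

probs = predict_word_probs(v_out, v_in, v_doc)
print(np.round(probs, 3))
```

A full softmax over the vocabulary is expensive for large vocabularies, which is why the earlier slides train with negative sampling instead.
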
107. 107. theory of lda2vec lda2vec
108. 108. pyLDAvis of lda2vec lda2vec
109. 109. LDA Results (context: History). Great Stylist; Perfect: “I loved every choice in this fix!! Great job!”
110. 110. LDA Results (context: History). Body Fit: “My measurements are 36-28-32. If that helps. I like wearing some clothing that is fitted. Very hard for me to find pants that fit right.”
111. 111. LDA Results (context: History). Sizing; Excited for next: “Really enjoyed the experience and the pieces, sizing for tops was too big. Looking forward to my next box!”
112. 112. LDA Results (context: History). Almost Bought; Perfect: “It was a great fix. Loved the two items I kept and the three I sent back were close!”
113. 113. All of the following ideas will change what ‘words’ and ‘context’ represent.
114. 114. paragraph vector: What about summarizing documents? “On the day he took office, President Obama reached out to America’s enemies, offering in his first inaugural address to extend a hand if you are willing to unclench your fist. More than six years later, he has arrived at a moment of truth in testing that”
115. 115. paragraph vector: Normal skipgram extends C words before, and C words after (IN, OUT, OUT). [Passage from the previous slide, followed by:] “The framework nuclear agreement he reached with Iran on Thursday did not provide the definitive answer to whether Mr. Obama’s audacious gamble will pay off. The fist Iran has shaken at the so-called Great Satan since 1979 has not completely relaxed.”
116. 116. paragraph vector: A document vector simply extends the context to the whole document (IN, OUT, OUT, OUT, OUT; doc_1347). [Same passage as the previous slide.]
117. 117. sentence search

    from gensim.models import Doc2Vec

    # Load the saved model (file name as given on the slide)
    fn = "item_document_vectors"
    model = Doc2Vec.load(fn)

    # Nearest neighbours of the word 'pregnant'. The slide calls most_similar
    # directly on the model; newer gensim releases may expose this elsewhere.
    matches = model.most_similar('pregnant')
    # Keep only sentence labels (those containing 'SENT_')
    matches = list(filter(lambda x: 'SENT_' in x[0], matches))
    # ['...I am currently 23 weeks pregnant...',
    #  "...I'm now 10 weeks pregnant...",
    #  '...not showing too much yet...',
    #  '...15 weeks now. Baby bump...',
    #  '...6 weeks post partum!...',
    #  '...12 weeks postpartum and am nursing...',
    #  '...I have my baby shower that...',
    #  '...am still breastfeeding...',
    #  '...I would love an outfit for a baby shower...']