
CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task of Product Recommendation


Talk given by Thomas Nedelec, Criteo, during the RecsysFR meetup on February 1st 2017.



  1. 1. Thomas Nedelec 01/02/2017 RecSys Meetup CONTENT2VEC: a Joint Architecture to use Product Image and Text for the task of Product Recommendation
  2. 2. Copyright © 2016 Criteo Talk outline I. Presentation of our architecture: goals and main modules II. Details on the TextCNN module III. Our experimental results IV. Future applications and directions of research
  3. 3. Copyright © 2016 Criteo Motivation Goal for Content2Vec: build the best product representation, meaning one that: 1. Takes into account all product signals in order to help overall recommendation performance, and especially performance on new products (cold start) 2. Defines the product-2-product similarity as a function of P(co-event of the product pair), in order to optimize for the scenario where recommended products are retrieved by their similarity with a query product* *assuming that optimizing the AUC of link prediction is a good proxy for online performance
  4. 4. Copyright © 2016 Criteo 1. Takes into account all product signal • Represents Product Sequences: • Product co-occurrences Representation: Prod2Vec • Represents Product Information: • Category Representation: Meta-Prod2Vec • Image Representation: AlexNet • Text Representation: Word2Vec, TextCNN
  5. 5. Copyright © 2016 Criteo 2. Merge the different signals Adapt the initial product representations to the final task of predicting P(co-event): • Find the representation that optimizes P(co-event): Metric learning (Logistic Siamese Nets) • Merge the representations from the different signals: Ensemble learning
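The metric-learning objective above can be sketched as follows. This is a minimal NumPy illustration (not the talk's actual code), assuming the common Siamese logistic setup: the probability of a co-event is modeled as a sigmoid of the inner product of the two product embeddings, trained with a logistic loss against sampled negative pairs.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pair_logprob(emb_a, emb_b, label):
    """Logistic (Siamese) pairwise objective: P(co-event) is modeled
    as sigmoid(<emb_a, emb_b>); label is 1 for an observed co-event,
    0 for a sampled negative (spurious) pair."""
    p = sigmoid(np.dot(emb_a, emb_b))
    return np.log(p) if label == 1 else np.log(1.0 - p)

# Toy example: a well-aligned embedding pair gets P(co-event) > 0.5.
a = np.array([0.5, 1.0, -0.2])
b = np.array([0.4, 0.9, -0.1])
p_coevent = sigmoid(np.dot(a, b))
```

Maximizing this log-probability over observed and sampled pairs pulls co-occurring products together in the embedding space and pushes unrelated products apart.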
  6. 6. General Architecture
  7. 7. Integrating product embeddings in recommendation engines Content2vec
  8. 8. I. Product Text Representation
  9. 9. Copyright © 2016 Criteo I. Product Text Representation Goal: to be able to estimate the similarity of products based on their text descriptions.
  10. 10. Copyright © 2016 Criteo I.1 Words Representation. Embedding Solution: • Word2Vec on the product description corpus • Concatenate all product descriptions from the Amazon dataset • Run Word2Vec on top of this big file • Get a representation for each word of the corpus
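Given the learned word vectors, the "similar word" lists on the next slide can be retrieved by nearest-neighbor search under cosine similarity. A small sketch (the vectors below are made up for illustration, not real Word2Vec output):

```python
import numpy as np

def most_similar(query, vectors, topn=3):
    """Return the topn words closest to `query` by cosine similarity.
    `vectors` maps word -> embedding (e.g. learned by Word2Vec)."""
    q = vectors[query] / np.linalg.norm(vectors[query])
    scores = {}
    for word, v in vectors.items():
        if word == query:
            continue
        scores[word] = float(np.dot(q, v / np.linalg.norm(v)))
    return sorted(scores, key=scores.get, reverse=True)[:topn]

# Toy vectors (illustrative only): business words cluster together,
# documentation words cluster together.
vecs = {
    "startup":      np.array([0.9, 0.1, 0.0]),
    "entrepreneur": np.array([0.8, 0.2, 0.1]),
    "manual":       np.array([0.0, 0.1, 0.9]),
    "handbook":     np.array([0.1, 0.0, 0.8]),
}
```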
  11. 11. Copyright © 2016 Criteo I.1 Words Representation. Similar Word Examples: • Startup: ['startups', 'ecommerce', 'company', 'entrepreneurial', 'b2b', 'businesses', 'entrepreneurs', 'homebased', 'entrepreneur', 'cfo'] • Owner: ['proprietor', 'owners', 'franchisee', 'manager', 'coo', 'partner', 'breeder', 'founder', 'realtor', 'franchisor'] • Manual: ['handbook', 'workbook', 'guide', 'manuals', 'manualis', 'sourcebook', 'kit', 'labsim', 'guidebook', 'essentials']
  12. 12. Copyright © 2016 Criteo I.2 Product Text Representation. From word embeddings to full product description embedding 3 implemented architectures: - sum of embeddings - cross similarities - TextCNN
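The simplest of the three architectures, the sum of embeddings, can be sketched in a few lines (a minimal NumPy illustration, not the talk's implementation):

```python
import numpy as np

def sum_of_embeddings(description, word_vectors):
    """Simplest description embedding: sum the word vectors of the
    description's words; out-of-vocabulary words are skipped."""
    dim = len(next(iter(word_vectors.values())))
    total = np.zeros(dim)
    for word in description.lower().split():
        if word in word_vectors:
            total += word_vectors[word]
    return total

# Toy 2-d vocabulary; "a" is out of vocabulary and is skipped.
toy_vecs = {"great": np.array([1.0, 0.0]), "book": np.array([0.0, 1.0])}
v = sum_of_embeddings("A great book", toy_vecs)
```

This loses word order, which is what the cross-similarity and TextCNN variants try to recover.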
  13. 13. I.3 Product Text Representation: TextCNN Convolutional Neural Networks for Sentence Classification (Kim, 2014):
  14. 14. Copyright © 2016 Criteo I.4 Examples of filters
  15. 15. I.5 TextCNN implementation in TensorFlow TextCNN TF Siamese Architecture
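The forward pass of a Kim-style TextCNN can be sketched as follows. This is a NumPy illustration of the idea rather than the talk's TensorFlow code: each filter slides over the word-embedding matrix of the description, a ReLU is applied, and max-over-time pooling keeps the strongest activation per filter.

```python
import numpy as np

def textcnn_forward(emb, filters):
    """Minimal TextCNN forward pass (NumPy sketch).

    emb:     (seq_len, emb_dim) word embeddings of one description
    filters: list of (width, emb_dim) convolutional filter matrices
    Returns one pooled feature per filter (the description vector).
    """
    features = []
    for f in filters:
        width = f.shape[0]
        # Convolve: elementwise product of each window with the
        # filter, summed, then ReLU.
        acts = [np.maximum(0.0, np.sum(emb[i:i + width] * f))
                for i in range(emb.shape[0] - width + 1)]
        features.append(max(acts))  # max-over-time pooling
    return np.array(features)

# Toy example: 4 words with 2-d embeddings, one width-2 filter.
emb = np.array([[1., 0.], [0., 1.], [1., 1.], [0., 0.]])
f = np.ones((2, 2))
feats = textcnn_forward(emb, [f])
```

In the Siamese setup, the same filters are applied to both products of a pair and the two pooled vectors are compared.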
  16. 16. Copyright © 2016 Criteo I.6 Other implemented architectures • Prod2vec:
  17. 17. I.7 Other implemented architectures • Image CNN:
  18. 18. Copyright © 2016 Criteo II. Merge the different scores: A monster model is born!
  19. 19. Copyright © 2016 Criteo II Merge the representations from the different signals Baseline: Linear combination of the modality-specific similarities (C2V-linear)
  20. 20. Copyright © 2016 Criteo II Other types of ensemble methods Other implemented models: • Cross features (C2V-crossfeat) • A fully connected layer to compress the features • Learn a residual layer to keep using the strong signal from the different modalities and learn some dependencies between signals (C2V-res)
  21. 21. III. Experimental Results
  22. 22. Copyright © 2016 Criteo III. Experimental Results Task: Link Prediction – predict a held-out set of product co-events based on a training set of product co-events and their content features (catalog) Dataset: Amazon book dataset with info on title, description, image URL and related products (co-view, co-sale) Hard Cold Start Setting: the products in test have not been seen at training time, i.e. no CF signal is available Metrics: AUC loss (classification loss on true co-events vs. spurious ones)
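The AUC metric used here has a simple pairwise reading: it is the fraction of (true co-event, spurious pair) score pairs that the model orders correctly, with ties counted as half. A self-contained sketch:

```python
def pairwise_auc(pos_scores, neg_scores):
    """AUC for link prediction: fraction of (true co-event, spurious
    pair) score pairs ranked correctly; ties count as 0.5."""
    wins = sum((p > n) + 0.5 * (p == n)
               for p in pos_scores for n in neg_scores)
    return wins / (len(pos_scores) * len(neg_scores))
```

For example, a model that scores every true co-event above every spurious pair reaches an AUC of 1.0, while random scoring sits around 0.5.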
  23. 23. III. Experimental Results Task1: Hard Cold Start
  24. 24. Copyright © 2016 Criteo IV. Scalability and putting it in production
  25. 25. Copyright © 2016 Criteo IV. Scalability and putting it in production • A lot of CPUs are great for evaluation • Multi-modular architecture: easier to debug • Make the model work better for cross-category pairs • Next: Experiments on building a category classifier (see "Is a picture worth a thousand words?", work by a team at Walmart) • Link to our paper (in preparation for KDD 2017)
  26. 26. Thank you!