Recommendation @ Scale


Recommendation is one of the most traditional and widespread use cases of machine learning. This talk shows how an advanced recommendation engine is served at scale at Glance. Glance is an AI-powered, content-driven, personalised Screen Zero (lock screen) platform for mobile, used by over 26 million daily active users in India. The talk walks through each component of a recommendation engine and closes with the lessons we learned from our experiments in making it work at scale.

Published in: Technology


  1. 1. Recommendation @ Scale - ADITYA PATEL
  3. 3. Recommendation Problem Formulation
     A recommendation problem can be framed in two ways:
       1. Missing value problem / prediction problem
       2. Ranking the top k items
     Factors to keep in mind for any recommendation:
       1. Relevance
       2. Novelty
       3. Serendipity
       4. Diversity
       5. Recency
  4. 4. Basic Terminology
       1. User signal
          a. Explicit, e.g. rating, like, dislike
          b. Implicit, e.g. clicks, time spent, sharing, etc.
       2. Utility matrix: the [User, Item] matrix
       3. Types of ratings
          a. Binary
          b. Numerical
          c. Unary: only positive preferences are recorded. For example, a purchase matrix holds a 1 where a user bought an item and nothing for a dislike or a non-purchase.
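A minimal sketch of how implicit signals could be aggregated into a sparse utility matrix. The event tuples, the `SIGNAL_WEIGHTS` values and the function name are illustrative assumptions, not something taken from the talk.

```python
from collections import defaultdict

# Hypothetical weights for implicit signals; real weights would need tuning.
SIGNAL_WEIGHTS = {"click": 1.0, "like": 2.0, "share": 3.0}


def build_utility_matrix(events):
    """Aggregate (user, item, signal) events into a sparse {user: {item: score}} matrix."""
    matrix = defaultdict(dict)
    for user, item, signal in events:
        weight = SIGNAL_WEIGHTS.get(signal, 0.0)  # unknown signals contribute nothing
        matrix[user][item] = matrix[user].get(item, 0.0) + weight
    return {user: dict(items) for user, items in matrix.items()}


events = [
    ("u1", "g1", "click"),
    ("u1", "g1", "share"),
    ("u1", "g2", "click"),
    ("u2", "g2", "like"),
]
utility = build_utility_matrix(events)
```

Missing entries (for example `u2` and `g1`) are simply absent, which is what makes the matrix sparse and the data effectively unary.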
  5. 5. Types of Recommendation Systems
     Broadly, four types of recommendation systems are prevalent:
       1. Collaborative filtering
          a. User based
          b. Item based
       2. Content based filtering
       3. Knowledge based
       4. Hybrid systems
  6. 6. Collaborative Filtering: User Based
     Advantages:
       1. No domain expertise is needed.
       2. Exploration happens through users with similar tastes.
     Disadvantages:
       1. Sparsity in signals
       2. Scalability
       3. Cold start
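User based collaborative filtering can be sketched as follows, assuming a toy ratings dictionary and cosine similarity between users. The data, function names and the choice of `k` are hypothetical; the talk does not specify an implementation.

```python
import math


def cosine(a, b):
    """Cosine similarity between two sparse rating dicts {item: rating}."""
    common = set(a) & set(b)
    dot = sum(a[i] * b[i] for i in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend_user_based(ratings, target, k=2, top_n=3):
    """Score unseen items by the similarity-weighted ratings of the k nearest users."""
    sims = [(cosine(ratings[target], r), u) for u, r in ratings.items() if u != target]
    neighbours = sorted(sims, reverse=True)[:k]
    scores = {}
    for sim, user in neighbours:
        if sim <= 0:
            continue  # users with no overlap carry no information
        for item, rating in ratings[user].items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * rating
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


ratings = {
    "alice": {"cricket": 5, "bollywood": 4},
    "bob": {"cricket": 5, "bollywood": 5, "tech": 4},
    "carol": {"fashion": 5, "food": 4},
}
recs = recommend_user_based(ratings, "alice")
```

Here `alice` receives `tech` only because `bob` has similar tastes. A brand-new user with no ratings would get nothing back, which is the cold start problem listed above.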
  7. 7. Collaborative Filtering: Item Based
     Advantages:
       1. No domain knowledge is required.
       2. More accurate.
       3. High interpretability.
       4. More stable than user based CF.
     Disadvantages:
       1. Scalability.
       2. Exploration is minimal.
       3. Cold item problem.
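A minimal sketch of item based collaborative filtering on unary data, assuming Jaccard similarity between the sets of users who consumed each item. The similarity measure, names and data are illustrative choices, not the talk's method.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of user ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_item_based(consumed, target, top_n=3):
    """Recommend unseen items whose audiences overlap with the target user's items."""
    # Invert {user: items} into {item: users}.
    users_by_item = {}
    for user, items in consumed.items():
        for item in items:
            users_by_item.setdefault(item, set()).add(user)

    seen = consumed[target]
    scores = {}
    for candidate, users in users_by_item.items():
        if candidate in seen:
            continue
        score = sum(jaccard(users, users_by_item[s]) for s in seen)
        if score > 0:
            scores[candidate] = score
    return sorted(scores, key=lambda item: (-scores[item], item))[:top_n]


consumed = {
    "u1": {"g1", "g2"},
    "u2": {"g1", "g2", "g3"},
    "u3": {"g2", "g3"},
    "u4": {"g4"},
}
recs = recommend_item_based(consumed, "u1")
```

Item `g4` is never recommended to `u1` because it shares no audience with anything `u1` has seen, which illustrates both the limited exploration and the cold item problem.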
  8. 8. Content Based Filtering
       1. Learns from the user's past interactions.
       2. Requires domain knowledge.
     Broad methodology:
       1. Learn the user's preferences from their interactions with combinations of item attributes.
       2. Preferred methods:
          a. Similarity based methods, e.g. cosine similarity.
          b. ML models: regression, SVM, etc.
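The similarity based variant can be sketched as follows: a user profile is the average of the attribute vectors of items the user has interacted with, and candidates are ranked by cosine similarity to that profile. The attribute layout and item names are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def user_profile(item_vectors, history):
    """Average the attribute vectors of the items in the user's history."""
    vectors = [item_vectors[i] for i in history]
    return [sum(col) / len(vectors) for col in zip(*vectors)]


# Hypothetical attribute order: [sports, entertainment, video]
item_vectors = {
    "match_report": [1.0, 0.0, 0.0],
    "match_highlights": [1.0, 0.0, 1.0],
    "film_review": [0.0, 1.0, 0.0],
    "transfer_news": [1.0, 0.0, 0.0],
}
profile = user_profile(item_vectors, ["match_report", "match_highlights"])
candidates = ["film_review", "transfer_news"]
ranked = sorted(candidates, key=lambda i: cosine(profile, item_vectors[i]), reverse=True)
```

Because ranking depends only on item attributes, a brand-new item can be scored immediately, but the result is only as good as the attribute definitions.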
  9. 9. Content Based Recommendation
     Advantages:
       1. No cold start problem for items.
       2. Results are interpretable.
     Disadvantages:
       1. No learning from neighbors.
       2. Cold start problem for users.
       3. Can only be as good as the definition of the item attributes.
          a. Netflix: “They paid people to watch films and tag them with all kinds of metadata. This process is so sophisticated and precise that taggers receive a 36-page training document that teaches them how to rate movies on their sexually suggestive content, goriness, romance levels, and even narrative elements like plot conclusiveness”
  10. 10. WHAT IS GLANCE?
  11. 11. Personalized screen zero Glance is an AI-powered, content driven, personalised Screen Zero (lockscreen) platform for mobile. Glance enables discovery of content for users every time they unlock their mobile.
  12. 12. Multiple formats and vernacular languages (slide graphics: languages used, formats used)
  13. 13. Scale of the problem:
     - 81 billion Glances a month.
     - 65% of all smartphones in India have Glance pre-installed.
     - Average time spent is 22 min.
     - 26 million daily active users.
  14. 14. Need for recommendation:
     - There are currently 17 categories for Glance articles.
     - Glance content is in four languages: English, Telugu, Hindi and Tamil.
     - 100,000 Glances are consumed daily by users.
     - The lock screen is really meaningful to a user.
  15. 15. Glance Recommendation: Neural Network Based Recommendation Approach
  16. 16. Requirements:
     - Should work with multiple data sources: image, video and text.
     - Should scale to serve millions of users.
     - Latency should be low enough to serve in real time.
     - Should be cost-effective at scale.
  18. 18. Breaking down a Glance
     A Glance is broken into four parts: META, TEXT, IMAGE and PLACEMENT.
     Attributes shown on the slide: Category; News / Features; Video / Article; Title; Summary; Lock-Screen; In app; Wallpaper; Article Image.
  19. 19. User/Glance Profiles: Text, Image, Meta, Placement
  20. 20. Image Features (Object Tags)
     Sources: Azure CV API; in-house InMobi model (ResNet 15); YOLO.
     Example tags: {Hot Air Balloon, Grass, Nature, Mountains}
  21. 21. BERT: Textual Embeddings
     Glances are served in regional languages, so the vernacular plays an important role in building collaborative profiles.
  22. 22. Cross Neural Embedding: vectors in L3 space
     Diagram labels: Approach 1 (Experimental); Approach 2; BERT. Example word pair in both diagrams: Home / घर.
  23. 23. Text Features: Summary / Title (Hindi/English…), Entity/Aspect Extraction, Cross Embedding, Universal Latent Space
  24. 24. Embedding space
     Diagram labels: TEXT, IMAGE, META, FORMAT; Average; User Features; Glance Features; User Embedding Vector; User Embedding; Item Embedding; Softmax.
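One possible reading of the embedding-space slide, offered as a hedged sketch only: per-modality vectors are averaged into a Glance embedding, a user embedding is the average of consumed Glance embeddings, and a softmax over dot-product scores gives a preference distribution. The actual network is not described in the transcript; the dimensions, data and function names below are assumptions.

```python
import math


def average(vectors):
    """Element-wise mean of equally sized vectors."""
    return [sum(col) / len(vectors) for col in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def glance_embedding(features):
    """Average the per-modality vectors (text, image, meta, format) of one Glance."""
    return average(list(features.values()))


# Toy 2-dimensional embeddings; real ones would come from BERT, image taggers, etc.
glances = {
    "g1": {"text": [1.0, 0.0], "image": [0.8, 0.2], "meta": [1.0, 0.0], "format": [0.6, 0.4]},
    "g2": {"text": [0.0, 1.0], "image": [0.2, 0.8], "meta": [0.0, 1.0], "format": [0.4, 0.6]},
}
item_vecs = {gid: glance_embedding(f) for gid, f in glances.items()}
user_vec = average([item_vecs["g1"]])  # a user who has only consumed g1
scores = [sum(u * v for u, v in zip(user_vec, item_vecs[g])) for g in glances]
probs = dict(zip(glances, softmax(scores)))
```

Averaging keeps every user and Glance in the same vector space regardless of how many interactions or modalities are available.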
  25. 25. Result:
     - Model: Neural Network Based Content Filtering
     - Mean Average Precision (MAP): 93%
     - Serving time (per call): 16 ms
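For readers unfamiliar with the metric, this is a minimal sketch of Mean Average Precision under one common definition (average precision normalised by the number of relevant items, then averaged over users). The transcript does not say which variant or cutoff was used; the data is made up.

```python
def average_precision(ranked, relevant):
    """Average of precision@rank taken at each rank where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, item in enumerate(ranked, start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / len(relevant) if relevant else 0.0


def mean_average_precision(rankings, relevance):
    """Mean of per-user average precision."""
    users = list(rankings)
    return sum(average_precision(rankings[u], relevance[u]) for u in users) / len(users)


rankings = {"u1": ["a", "b", "c"], "u2": ["x", "y", "z"]}
relevance = {"u1": {"a", "c"}, "u2": {"y"}}
map_score = mean_average_precision(rankings, relevance)
```

For `u1` the relevant items sit at ranks 1 and 3, giving (1/1 + 2/3) / 2; for `u2` the single relevant item sits at rank 2, giving 1/2.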
  26. 26. Key Takeaways:
  27. 27. Sadly cannot be done in production!
  28. 28. Lesson 1: Architecture Decisions
       1. Respect your heterogeneous and unique system.
          a. Our learning:
             i. We explored four different ways to serve the user a personalised feed based on the model.
             ii. The deciding factors in our case were latency, refresh rate and change in profile.
       2. "Move fast and break things" is not always fun.
          a. Our learning:
             i. Consider the scale of your application even in your MVP, as it can be a deciding factor in deployment.
  29. 29. Data Ingestion Architecture: overall data flow
     ● Because of the heterogeneous sources and the scale of the data, pre-processing is done in real time.
     ● Pre-processing is done using Azure Stream Analytics and Azure Logic App.
     Diagram labels: Event Hub, Analytics, Event bus, Databricks.
  30. 30. Lesson 2: Serving Architecture
     Batch predictions: make offline predictions for the users and store them.
       - Advantages: fast; the model can be complex.
       - Disadvantages: wasteful; 3X more expensive.
     Real-time prediction: make predictions as and when required.
       - Advantages: reactive; cost.
       - Disadvantage: model complexity needs to be controlled.
     Our learning: take the best of both worlds. Keep complexity offline.
  31. 31. Serving Architecture
     ● The scoring service does 'just-in-time' ranking of all glances for each incoming user.
     ● The advantage of this architecture is that it integrates tightly with the existing Glance engineering architecture.
     ● The architecture is scalable and compute-efficient: it cut compute cost 3X compared with batch predictions.
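The "keep complexity offline" idea can be sketched as below: an offline job stores precomputed embeddings, and the online path only performs cheap dot products at request time. `EmbeddingStore`, the function names and the toy vectors are hypothetical stand-ins, not the Glance implementation.

```python
import heapq


class EmbeddingStore:
    """In-memory stand-in for a key-value store filled by an offline batch job."""

    def __init__(self):
        self._vectors = {}

    def put(self, key, vector):
        self._vectors[key] = vector

    def get(self, key):
        return self._vectors[key]


def offline_batch_job(store, user_vectors, glance_vectors):
    """Offline step: persist embeddings that the (complex) model has already produced."""
    for uid, vec in user_vectors.items():
        store.put(("user", uid), vec)
    for gid, vec in glance_vectors.items():
        store.put(("glance", gid), vec)


def rank_just_in_time(store, user_id, live_glance_ids, k=2):
    """Online step: rank the currently live glances with dot products only."""
    user_vec = store.get(("user", user_id))

    def score(gid):
        return sum(u * g for u, g in zip(user_vec, store.get(("glance", gid))))

    return heapq.nlargest(k, live_glance_ids, key=score)


store = EmbeddingStore()
offline_batch_job(
    store,
    {"u1": [0.9, 0.1]},
    {"g1": [1.0, 0.0], "g2": [0.0, 1.0], "g3": [0.7, 0.3]},
)
top = rank_just_in_time(store, "u1", ["g1", "g2", "g3"])
```

Nothing is scored for users who never show up, which is where the saving over full batch prediction would come from in a design like this.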
  32. 32. Lesson 3: Containerize and Code Hygiene
       1. Code hygiene is extremely important.
          a. Our learning:
             i. Triple-check coding standards and code documentation so that handover and debugging are easy.
       2. Train and serve environments should match.
          a. Our learning:
             i. Kubernetes gave us high replicability, scalability and easy container management.
  33. 33. Lesson 4: Log and Monitor (slide labels: PROPER, FAST, CLEAR)
     - A/B testing framework
     - Logging of all results in the online world
     - Predefined and agreed data science and business metrics to track
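A minimal sketch of the A/B testing and logging points, assuming deterministic hash-based bucketing and click-through rate as the agreed metric. The experiment name, log schema and metric are illustrative assumptions.

```python
import hashlib


def assign_variant(user_id, experiment, variants=("control", "treatment")):
    """Deterministically map a user to a variant so assignment is stable across requests."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def log_impression(log, user_id, experiment, glance_id, clicked):
    """Append one online result, tagged with the variant that produced it."""
    log.append({
        "user": user_id,
        "experiment": experiment,
        "variant": assign_variant(user_id, experiment),
        "glance": glance_id,
        "clicked": clicked,
    })


def click_through_rate(log, variant):
    """Pre-agreed metric: share of logged impressions in a variant that were clicked."""
    rows = [r for r in log if r["variant"] == variant]
    return sum(r["clicked"] for r in rows) / len(rows) if rows else 0.0


log = []
for uid in ("u1", "u2", "u3", "u4"):
    log_impression(log, uid, "ranker_v2", "g1", clicked=(uid in {"u1", "u3"}))
```

Hashing on the experiment name as well as the user id keeps bucket assignments independent across experiments.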
  34. 34. References:
     - Devlin, Jacob, et al. "BERT: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018).
     - Redmon, Joseph, and Ali Farhadi. "YOLOv3: An incremental improvement." arXiv preprint arXiv:1804.02767 (2018).
     - He, Kaiming, et al. "Deep residual learning for image recognition." Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016.
  35. 35. Thank You