
[DSC Europe 22] On building a video recommendation system and other use-cases - Vladimir Ageev


I will talk about building a recommendation system for our internal video hosting platform, which is used by thousands of employees. I will describe how our project is organized, the ML approach we use and the issues we are facing, what our business goal is, what kinds of features we use, how quality is measured, and how the model is monitored. In addition, I will cover some of the other recommender system use cases we have in the company.


  1. Building a Recommendation System. November 2022. Vladimir Ageev
  2. Meet the team: Vladimir Ageev, Sr. Data Scientist; Vera Kochetkova, Data Scientist; Vadim Radchenko, Data Scientist
  3. Plan of the talk: what we are working on; why we are building it; how we are implementing it; how we measure it
  4. What we are building: Our product is an enterprise video management system for an international company with thousands of employees, dozens of specialisations, and different seniority levels. These users generate, view, and listen to various types of video content: townhall/all-hands recordings; corporate onboarding content; technical trainings or L&D materials; soft-skills courses; meetup recordings; well-being trainings; audio podcasts; ...
  5. Why we are building it: All this generated content should be exposed to the right people at the right moment and in the right context in order to: onboard newcomers to the company; highlight important company or local updates; encourage and help with professional growth. A recommendation system is one way to distribute content in such a personalised way.
  6. Why we are building it: events; news posts; video hosting
  7. How we are building it: target. What is the definition of relevant content? It depends on the feedback type! Implicit feedback: impressions ("how many times has the user seen a video but not clicked it?"); watch history and view ratio ("user watched 90% of an hour-long video 3 months ago"). Explicit feedback: like of the video; like/dislike of a suggestion; addition to the watch-later playlist.
  8. How we are building it: target. What is the definition of relevant content? It depends on the feedback type! Implicit feedback: impressions ("how many times has the user seen a video but not clicked it?"); watch history and view ratio ("user watched 90% of an hour-long video 3 months ago"). Explicit feedback: like of the video; like/dislike of a suggestion; addition to the watch-later playlist. The view ratio target is not trivial: it depends on video duration and view date. Notation: a is the max duration; b is the 25% quantile of the durations distribution; N is the number of thresholds; p is the percent of the duration watched. The final target is scaled.
  9. How we are building it: models. Multi-stage ranking (aka hybrid recommendation system): learning-to-rank with WARP loss; ranking with lambdarank objective
  10. How we are building it: models. Cold start users: trending content (videos receiving views regularly this week); popular content for their role, level, location ("most popular content among Junior Business Analysts for the past 3 months"). Cold start content & context suggestions: videos similar to their role name (ML Engineer), the last video they watched, a text post, a meetup recording. Helps with both user and content cold starts.
  11. How we are building it: models. Cold start users: trending content (videos receiving views regularly this week); popular content for their role, level, location ("most popular content among Junior Business Analysts for the past 3 months"). Cold start content & context suggestions: videos similar to their role name (ML Engineer), the last video they watched, a text post, a meetup recording. Helps with both user and content cold starts. SBERT.
  12. How we are building it: pipelines. Models are not enough: there is a lot of infrastructure behind them. GitLab for versioning and CI; Postgres for data and feature storage; Faiss for vector storage; Kafka for communication; Airflow for orchestration and scheduling. Some pipelines are scheduled, others are triggered by the appearance of new data.
  13. How do we measure it: How do we measure system performance? Product level: ratio of watch time (how important recsys is for content distribution); ratio of content consumed from recsys; retention (do users continue to watch recommended content?). Funnel: CTR (conversion of impressions into clicks); long CTR (mCTR) (conversion of impressions into views with an 80% view ratio). Model level: MAP@K (how well we are ordering recommendations); Hits@K (general relevancy of recommendations); calculated both online and offline. Tools: Amplitude for event tracking; Metabase & PowerBI for dashboarding of stored metrics.
  14. This is how you can build a recommender. Business goal: corporate updates, trainings and events delivered to the right people at the right moment. Target definition: use explicit and implicit feedback to define and measure relevancy. Models: a complex model for the most active users; explainable lightweight models for treating cold start and contextual suggestions. Orchestration: infrastructure behind the operation of the models; based on activity, select the right tool and way of model deployment and inference. Quality: measure quality offline and online; select metrics at different levels: product, funnel, model.
  15. Other projects: "We've built a Content Recommender System (news, ads) for a telecom company. The customer already had a mobile app, but the personalization was not there. Our team leveraged the Big Data stack (Apache Spark, Apache Ignite) to build a backend capable of serving recommendations in real time with low latency under high load. It was a hybrid model built on users’ behavior, profile, and news textual content. CTR skyrocketed. Users got personalized content." "Recommendation system built for an international health and beauty retail group with thousands of stores and millions of customers. The team created multiple customer scoring models, product recommenders and promotion recommenders. The solution was built using Apache Spark. Horovod was used for distributed training of DL models."
  16. Feel free to reach out and connect! Thank you! Email: vladimir_ageev@epam.com. Opportunities in our company: epa.ms/Jobs-Serbia
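
Slide 9 describes multi-stage ranking. A minimal sketch of that idea in plain Python follows: a cheap first stage retrieves candidates by embedding dot product, and a second stage re-orders them with a richer score. Everything in it is an assumption for illustration. The toy embeddings, the `freshness` feature, and the hand-written `second_stage_score` are hypothetical stand-ins for a model trained with WARP loss and a ranker trained with a lambdarank objective, and are not the team's implementation.

```python
from typing import Callable, Dict, List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Dot product of two equally sized vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve_candidates(user_vec: Vector, item_vecs: Dict[str, Vector], n: int) -> List[str]:
    """Stage 1: keep the n items whose embeddings score highest against the user."""
    ranked = sorted(item_vecs, key=lambda vid: dot(user_vec, item_vecs[vid]), reverse=True)
    return ranked[:n]


def rerank(candidates: List[str], score_fn: Callable[[str], float], k: int) -> List[str]:
    """Stage 2: re-order the candidates with a richer score and keep the top k."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


# Hypothetical toy data: 2-dimensional embeddings and one extra feature.
user_vec = [1.0, 0.0]
item_vecs = {
    "townhall": [0.9, 0.1],
    "ml_training": [0.8, 0.3],
    "podcast": [0.1, 0.9],
    "onboarding": [0.5, 0.5],
}
freshness = {"townhall": 0.2, "ml_training": 0.9, "podcast": 0.5, "onboarding": 0.1}


def second_stage_score(video_id: str) -> float:
    """Hand-written stand-in for a trained second-stage ranker."""
    return 0.5 * dot(user_vec, item_vecs[video_id]) + 0.5 * freshness[video_id]


candidates = retrieve_candidates(user_vec, item_vecs, n=3)
recommendations = rerank(candidates, second_stage_score, k=2)
print(candidates)
print(recommendations)
```

The split keeps the expensive scoring limited to a short candidate list, which is the usual motivation for a multi-stage design.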
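
Slides 10 and 11 list two cold-start strategies: popular content per role, level, or location, and videos similar to a piece of context such as a role name. The sketch below illustrates both under stated assumptions. The function names, the fallback to global popularity for an unseen segment, and the 3-dimensional vectors are hypothetical; in practice the vectors would presumably come from a text encoder such as the SBERT model named on slide 11.

```python
import math
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def popular_for_segment(views: List[Tuple[str, str]], segment: str, k: int) -> List[str]:
    """Top-k videos by view count inside a segment.

    `views` holds (segment, video_id) pairs. Falling back to global
    popularity for an unseen segment is an assumption of this sketch.
    """
    in_segment = Counter(video for seg, video in views if seg == segment)
    counts = in_segment if in_segment else Counter(video for _, video in views)
    return [video for video, _ in counts.most_common(k)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def similar_videos(query_vec: Sequence[float], video_vecs: Dict[str, Sequence[float]], k: int) -> List[str]:
    """Top-k videos whose embeddings are closest to the query embedding."""
    ranked = sorted(video_vecs, key=lambda vid: cosine(query_vec, video_vecs[vid]), reverse=True)
    return ranked[:k]


# Hypothetical view log: (segment, video_id).
views = (
    [("Junior Business Analyst", "sql_basics")] * 3
    + [("Junior Business Analyst", "townhall")] * 2
    + [("ML Engineer", "transformers_intro")] * 3
    + [("ML Engineer", "townhall")] * 2
)

# Hypothetical embeddings of video descriptions and of the text "ML Engineer".
video_vecs = {
    "transformers_intro": [0.9, 0.1, 0.0],
    "sql_basics": [0.1, 0.9, 0.0],
    "wellbeing": [0.0, 0.1, 0.9],
}
role_vec = [0.8, 0.2, 0.1]

print(popular_for_segment(views, "Junior Business Analyst", k=1))
print(popular_for_segment(views, "Designer", k=1))
print(similar_videos(role_vec, video_vecs, k=2))
```

Because the similarity lookup only needs an embedding of some text, the same function covers a new user (embed the role name) and a new video (embed its description), which matches the slide's note that the approach helps with both cold starts.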
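
Slide 13 names CTR, long CTR, MAP@K, and Hits@K. The sketch below shows one common way to compute them. The exact definitions vary between teams, so these are assumptions: long CTR counts views with a view ratio of at least 0.8, Hits@K is 1 when any relevant video appears in the top K, and AP@K is normalised by `min(len(relevant), k)`. All function names are hypothetical.

```python
from typing import List, Sequence, Set


def ctr(impressions: int, clicks: int) -> float:
    """Share of impressions that turned into clicks."""
    return clicks / impressions if impressions else 0.0


def long_ctr(impressions: int, view_ratios: Sequence[float], threshold: float = 0.8) -> float:
    """Share of impressions that turned into views reaching the threshold."""
    long_views = sum(1 for ratio in view_ratios if ratio >= threshold)
    return long_views / impressions if impressions else 0.0


def hits_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """1.0 if at least one relevant item is in the top k, else 0.0."""
    return 1.0 if any(video in relevant for video in recommended[:k]) else 0.0


def average_precision_at_k(recommended: Sequence[str], relevant: Set[str], k: int) -> float:
    """Average of precision values at each rank where a relevant item appears."""
    hits = 0
    precision_sum = 0.0
    for rank, video in enumerate(recommended[:k], start=1):
        if video in relevant:
            hits += 1
            precision_sum += hits / rank
    denominator = min(len(relevant), k)
    return precision_sum / denominator if denominator else 0.0


def map_at_k(all_recommended: List[Sequence[str]], all_relevant: List[Set[str]], k: int) -> float:
    """Mean of AP@K over users."""
    scores = [
        average_precision_at_k(recommended, relevant, k)
        for recommended, relevant in zip(all_recommended, all_relevant)
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical example: two users, top-3 evaluation.
recs = [["a", "b", "c", "d"], ["x", "y"]]
relevant = [{"a", "c"}, {"z"}]

print(ctr(impressions=200, clicks=30))
print(long_ctr(impressions=10, view_ratios=[0.9, 0.5, 0.8, 0.1]))
print(hits_at_k(recs[0], relevant[0], k=3))
print(map_at_k(recs, relevant, k=3))
```

The same functions can be applied to logged impressions for online numbers and to a held-out interaction set for offline numbers, which is how the slide's "calculated both online and offline" could be read.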
