Building A Product Assortment Recommendation Engine


Amid the increasingly competitive brewing industry, the ability of retailers and brewers to provide optimal product assortments for their consumers has become a key goal for business stakeholders. Consumer trends, regional heterogeneities and massive product portfolios combine to scale the complexity of assortment selection. At AB InBev, we approach this selection problem through a two-step method rooted in statistical learning techniques. First, regression models and collaborative filtering are used to predict product demand in partnering retailers. The second step involves robust optimization techniques to recommend a set of products that enhance business-specified performance indicators, including retailer revenue and product market share.



With the ultimate goal of scaling our approach to over 100k brick-and-mortar retailers across the United States and online platforms, we have implemented our algorithms in custom-built Python libraries using Apache Spark. We package and deploy production versions of Python wheels to a hosted repository for installation on production infrastructure.



To orchestrate the execution of these processes at scale, we use a combination of the Databricks API, Azure App Configuration, Azure Functions, Azure Event Grid and some custom-built utilities to deploy the production wheels to on-demand and interactive Databricks clusters. From there, we monitor execution with Azure Application Insights and log evaluation metrics to Databricks Delta tables on ADLS. To create a full-fledged product and deliver value to customers, we built a custom web application using React and GraphQL which allows users to request assortment recommendations in a self-service, ad-hoc fashion.


  1. Building a Product Assortment Recommendation Engine for Brick-and-Mortar Retailers
     Justin Morse, Staff Data Scientist, AB InBev; Ethan DuBois, Senior Software Engineer, AB InBev
  2. Agenda
     • Introductions and overview
     • The problem: product assortment selection
     • The algorithmic solution
     • Deploying the solution
     • Lessons learned
  3. AB InBev in North America: 2k+ products, 150k+ retailers, 30k+ employees
  4. Pivoting towards a tech-oriented approach (2018–2021)
     • LOLA team launched (5 employees)
     • Incorporate Databricks into workflows
     • Begin R&D partnership with Bud Lab & MIT
     • National launch of first microservice
     • Begin development of SKU-level recommendation engine
     • BeerTech organization launched (73 employees)
     • Launch recommendation engine pilot
  5. Which products should a retailer carry? An average retailer has >10^100 ways to select their product assortment.
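The >10^100 figure on this slide follows from subset counting: a catalog of n products admits 2^n possible assortments, so even a few hundred eligible products put the selection space beyond 10^100. A quick sanity check (the 350-product catalog size here is illustrative, not a figure from the talk):

```python
import math

# A catalog of n products admits 2**n possible assortments (every subset
# is a candidate). n = 350 eligible products is an illustrative figure.
n = 350
num_assortments = 2 ** n
print(math.log10(num_assortments))  # ≈ 105.36, i.e. well beyond 10^100
```

Exhaustive enumeration is therefore hopeless, which is what motivates the demand-model-plus-optimization pipeline on the following slides.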
  6. How can we develop a quantitative approach to assortment planning that accounts for customer preferences, business priorities, and computational complexity?
  7. Assortment Recommendation Pipeline
     • Data Model: transform datasets into the format required for our pipeline
     • Product Demand Prediction: make quantitative estimates of product demand for each partnering retailer
     • Assortment Optimization: select the best product assortment given business requirements and estimated product demand
     • Causal Analysis: measure the effects of our modeling interventions
  8. Predicting demand for products in partnering retailers
     • Custom-built library for a family of discrete choice models using PyTorch
     • Executed on Databricks clusters with Azure Functions
     • Next steps: scale training with Petastorm and Horovod
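As a toy illustration of the discrete choice family this slide refers to, a multinomial logit model maps per-product utilities to purchase probabilities. The sketch below is framework-free and deliberately omits PyTorch and the learned utility functions of the production library; the utility values are made up:

```python
import math

def mnl_choice_probabilities(utilities):
    """Multinomial logit: P(choose i) = exp(u_i) / sum_j exp(u_j).

    A framework-free sketch of the discrete choice model family; the
    production version is a custom PyTorch library with utilities
    learned per retailer and product.
    """
    m = max(utilities)  # subtract the max for numerical stability
    exps = [math.exp(u - m) for u in utilities]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical per-product utilities for one retailer, including an
# outside "no purchase" option with utility 0.
utilities = [1.2, 0.4, -0.3, 0.0]
probs = mnl_choice_probabilities(utilities)
print([round(p, 3) for p in probs])  # → [0.507, 0.228, 0.113, 0.153]
```

The resulting probabilities, scaled by store traffic, give the per-product demand estimates the optimization step consumes.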
  9. Optimizing retailer performance
     • Use traditional numerical techniques to optimize the revenue objective function
     • Include filters related to allowable business outcomes:
       - Size restrictions
       - Inventory restrictions
       - License restrictions
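As a toy stand-in for this step, the sketch below greedily keeps the highest-revenue products that pass license and size filters. The production system uses robust optimization rather than a greedy pass, and the field names and catalog are hypothetical; the point is only how the business filters constrain the feasible set before the objective is maximized:

```python
def recommend_assortment(products, max_size, licensed_categories):
    """Toy version of the optimization step: maximize total estimated
    revenue subject to business filters. `products` is a list of dicts
    with illustrative fields 'sku', 'est_revenue', 'category'."""
    # License restriction: drop products the retailer cannot legally carry.
    feasible = [p for p in products if p["category"] in licensed_categories]
    # Size restriction: keep the top `max_size` products by estimated revenue.
    feasible.sort(key=lambda p: p["est_revenue"], reverse=True)
    return [p["sku"] for p in feasible[:max_size]]

catalog = [
    {"sku": "A", "est_revenue": 120.0, "category": "beer"},
    {"sku": "B", "est_revenue": 200.0, "category": "spirits"},
    {"sku": "C", "est_revenue": 90.0,  "category": "beer"},
    {"sku": "D", "est_revenue": 150.0, "category": "beer"},
]
print(recommend_assortment(catalog, max_size=2, licensed_categories={"beer"}))
# → ['D', 'A'] — 'B' is filtered out despite the highest revenue
```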
  10. Demonstrating value through small-scale pilots
      • Recommendation engine launched in partnering retailers in the Ontario region
      • Currently working with the software engineering team to scale the solution for North American and global launch
  11. Pilot implementation: chained notebooks in Databricks
  12. Scaling and deploying the solution
      After a number of successful pilots, we needed to build a more robust solution that at minimum included:
      • Production-quality code standards
      • Best-practice code distribution
      • Repository-based, version-controlled, automated CI/CD
      • Flexible and lightweight configuration approach
      • Decoupled communication between components
      • Infrastructure-as-code
      • Ability to scale infra up and down as necessary to meet demand
      • API for integration with other applications
  13. Scaling and deploying the solution
      • Code
        - Production-quality code standards
        - Best-practice code distribution
        - Repository-based, version-controlled, automated CI/CD
      • Configuration
        - Flexible/lightweight
        - Decoupled from code
        - Infrastructure-as-code
      • Orchestration
        - Decoupled communication between components
        - Ability to automatically and programmatically scale infra up or down to meet demand
        - API for integration with other applications
  14. Scaling and deploying the solution: Technologies (configuration, code, orchestration): Azure App Configuration, Azure Key Vault, Azure App Insights, Azure Event Grid, Azure Function Apps
  15. Code: Refactoring ML Processes
      Moving from chained notebooks to end-to-end pipelines in Python
      • Chained notebooks
        - Didn't provide the ease of maintenance and visibility that we wanted
        - Easy to get lost; added complexity
        - Difficult to standardize, scan, and control quality across workstreams
      • Process-controlled Python pipelines
        - Object-oriented approach
        - Make use of shared tools and utilities
        - Ability to package and distribute more easily
        - CI/CD integration with GitHub workflows (code scanning, unit/integration tests, etc.)
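The shift from chained notebooks to process-controlled pipelines can be sketched with a minimal object-oriented pattern. The class and step names below are illustrative, not the internal AB InBev library; the point is that explicit steps sharing a typed context replace the implicit state that chained notebooks pass through globals and widget parameters:

```python
class PipelineStep:
    """One unit of work; concrete steps override `run`."""
    name = "step"

    def run(self, context: dict) -> dict:
        raise NotImplementedError

class Pipeline:
    def __init__(self, steps):
        self.steps = steps

    def run(self, context=None):
        context = dict(context or {})
        for step in self.steps:
            # Each step reads from and writes to a shared context,
            # making data flow explicit, testable, and packageable.
            context = step.run(context)
        return context

class LoadData(PipelineStep):
    name = "load"
    def run(self, context):
        context["rows"] = [1, 2, 3]  # stand-in for a real data read
        return context

class Score(PipelineStep):
    name = "score"
    def run(self, context):
        context["total"] = sum(context["rows"])  # stand-in for model scoring
        return context

result = Pipeline([LoadData(), Score()]).run()
print(result["total"])  # → 6
```

Because each step is a plain class, it can be unit-tested and shipped in a wheel, which is what enables the CI/CD integration the slide mentions.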
  16. Code: Refactoring ML Processes
  17. Code: Refactoring ML Processes
  18. Code: Refactoring ML Processes
  19. Code: Packaging and Deployment
      Packaging and deploying code to an easily accessible repository for installation on production resources
      • Custom Python wheels
        - Object-oriented, following a best-practices approach
        - Built and deployed in GitHub workflows as part of CI/CD
      • Distribution: JFrog Artifactory repository
        - Organizational PyPI repo
        - Available for installation on all clusters or machines
        - Authentication set up with cluster init scripts stored in DBFS
      • Roadmap: move to GitHub Packages once PyPI is supported :'(
  20. Code: CI/CD
  21. Code: Package Distribution
  22. Configuration
      Creating a generic, highly customizable configuration solution, decoupled from code
      • Azure App Configuration: decoupled, customizable approach
        - Service-level configs: algorithmic constants and other ML settings; validation and consistency checks
        - Execution-level configs: cluster configuration; storage locations and file names; logging settings
      • Azure Key Vault: secret storage
        - Keys and connection strings for the data lake, Event Grid, Application Insights, and Azure App Configuration
        - Backs the Databricks secret scope
        - Allows easy access within init scripts and Spark environment variable configuration
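A lightweight sketch of what a validated execution-level config might look like once pulled out of code. The field names, defaults, and checks below are assumptions for illustration, not the team's actual schema, and a plain dict stands in for the Azure App Configuration key-value store:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ExecutionConfig:
    """Illustrative shape of an execution-level config (field names
    are assumptions, not the production schema)."""
    cluster_workers: int
    storage_path: str
    log_level: str

    def validate(self):
        # Consistency checks of the kind the slide mentions for
        # service-level configs.
        if self.cluster_workers < 1:
            raise ValueError("cluster_workers must be >= 1")
        if self.log_level not in {"DEBUG", "INFO", "WARNING", "ERROR"}:
            raise ValueError(f"unknown log level: {self.log_level}")
        return self

def load_execution_config(raw: dict) -> ExecutionConfig:
    # In production these values would come from the App Configuration
    # client; here a plain dict stands in for the key-value store.
    return ExecutionConfig(
        cluster_workers=int(raw["cluster_workers"]),
        storage_path=raw["storage_path"],
        log_level=raw.get("log_level", "INFO"),
    ).validate()

cfg = load_execution_config(
    {"cluster_workers": "8", "storage_path": "abfss://demand/outputs"}
)
print(cfg.log_level)  # → INFO
```

Keeping the config typed and validated at load time means a bad value fails fast at kickoff rather than mid-run on a cluster.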
  23. Configuration: Logging and Storage
  24. Configuration: Databricks
  25. Configuration: Code Integration
  26. Configuration: Azure App Config
  27. Orchestration
      • Kickoff
        - Azure Functions: REST/HTTP, Event Grid
        - Internal utilities: wrappers for the Databricks Runs API; internal config management; read/write/storage management; logging management; wrapper modules for Azure SDKs
      • Compute
        - Azure Databricks: interactive or job clusters; programmatic configuration of dependencies, environment variables, and init scripts for pip-conf
      • Monitoring
        - Azure Application Insights: custom logging written from Python processes
        - Azure Event Grid: custom events published for status updates
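A sketch of the kind of thin wrapper such internal utilities might provide around the Databricks Runs API: assembling the JSON body for a runs-submit call that creates an on-demand cluster, installs the production wheel, and wires in environment variables and an init script. The cluster sizing, node type, paths, and entry point below are illustrative placeholders, not the team's actual values:

```python
def build_runs_submit_payload(run_name, wheel_uri, entry_file, env_vars):
    """Assemble a request body for the Databricks Runs API
    (POST /api/2.1/jobs/runs/submit). All concrete values here are
    illustrative defaults."""
    return {
        "run_name": run_name,
        "new_cluster": {
            "spark_version": "9.1.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 4,
            # Environment variables and init scripts configured
            # programmatically (e.g. pip config for the private repo).
            "spark_env_vars": env_vars,
            "init_scripts": [
                {"dbfs": {"destination": "dbfs:/init/pip-conf.sh"}}
            ],
        },
        # Install the production wheel published to the package repository.
        "libraries": [{"whl": wheel_uri}],
        "spark_python_task": {"python_file": entry_file},
    }

payload = build_runs_submit_payload(
    run_name="assortment-pipeline",
    wheel_uri="dbfs:/wheels/assortment-1.0.0-py3-none-any.whl",
    entry_file="dbfs:/jobs/run_pipeline.py",
    env_vars={"APP_CONFIG_CONNECTION": "<resolved-from-key-vault>"},
)
print(sorted(payload))  # → ['libraries', 'new_cluster', 'run_name', 'spark_python_task']
```

In the architecture described, an Azure Function would POST this payload, then status events flow back through Event Grid while Application Insights collects the logs.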
  28. Orchestration: Azure Functions
  29. Orchestration: Databricks Runs
  30. Orchestration: App Insights Logging
  31. Putting It All Together
  32. Conclusion
      • MVP released, in production; collecting initial user feedback in preparation for future releases
      • Lessons learned
        - Development process: db-connect vs. notebooks, pros and cons
        - Configuration: moving configs out of code wherever possible
        - pandas vs. PySpark: understanding the distinction and its implications
      • Future roadmap
        - Increased parallelization/distribution for both model training and the optimization process
        - Added intelligence throughout the service: job progress and ETAs, different demand estimate universes
        - Enhanced DevOps approach to cloud resource deployment and environment management
  33. The team: Emmanuel Doro, Justin Morse, Phillip Theron, Gui Neubern, Zi Wang, Senthil Murugappan, Ethan DuBois, Ravi Kolla, Sarosh Ahmad, Griffin Ansel, Ashish Baiju, Chris Stone, Nelson Kandeya, Emily Shapiro, Jessica Zou, Vivek Farias, Nikos Trichakis, Tianyi Peng, Patricio Foncea, Lucas Diffey
  34. Q&A
  35. Feedback: Your feedback is important to us. Don't forget to rate and review the sessions.
