Scaling Personalization via Machine-Learned Assortment Optimization

From DataEngConf NYC 2018
https://www.datacouncil.ai/speaker/scaling-personalization-via-machine-learned-assortment-optimization
--
Machine learning has revolutionized the capability of businesses to create personalized experiences via real-time, individual predictions and recommendations. But what happens when one must make thousands of decisions for thousands of individuals at the same time?

At Dia&Co, a plus-size women’s styling service, we recently faced such an obstacle when building out a brand new product line for the business. This talk will explore how we combined modern machine learning with classical operations research techniques to scale personalization in the face of constraints inherent to a retail business.
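As a small illustration of why retail constraints make this a joint problem, the sketch below compares serving customers one at a time with deciding for everyone at once. The users, items, and point scores are invented for the example (they are not Dia&Co data), and the brute-force search only stands in for a real optimizer on this tiny case.

```python
from itertools import permutations

# Hypothetical predicted scores: scores[user][item], in arbitrary points.
scores = {
    "ann": {"dress": 90, "jeans": 80},
    "bea": {"dress": 85, "jeans": 10},
}
# Retail constraint (assumed): exactly one unit of each item is in stock.
items = ["dress", "jeans"]


def greedy(order):
    """Serve users in queue order; each takes her best remaining item."""
    remaining = set(items)
    assignment = {}
    for user in order:
        best = max(remaining, key=lambda item: scores[user][item])
        assignment[user] = best
        remaining.remove(best)
    return assignment


def batch():
    """Consider every one-item-per-user assignment and keep the best total."""
    users = list(scores)
    best_perm = max(
        permutations(items, len(users)),
        key=lambda perm: sum(scores[u][i] for u, i in zip(users, perm)),
    )
    return dict(zip(users, best_perm))


def total(assignment):
    """Sum of scores over all assigned (user, item) pairs."""
    return sum(scores[u][i] for u, i in assignment.items())


greedy_total = total(greedy(["ann", "bea"]))  # ann takes the dress first
batch_total = total(batch())                  # ann gets jeans, bea the dress
print(greedy_total, batch_total)
```

Here the queue-order result totals 100 points while the joint assignment totals 165, because the first customer's best item is worth far more to the second customer than her alternative.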


I will introduce the basics of operations research and then demonstrate how to solve a simple version of our real-world problem using only open-source libraries. I will then reveal the gory details of productionizing this work, from testing to gracefully handling failures of convergence. Finally, I will cover the journey from the coldest of starts, with zero data, to synthesizing machine learning with the operations research problem.
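A minimal end-to-end sketch of such a simple version, assuming PuLP and its bundled CBC solver are installed. The two users, three items, scores, stock levels, and box size of 2 are invented toy values. The status check with an empty-result fallback is only one possible way to handle a solve that does not reach optimality; none of this is the production system.

```python
from pulp import LpMaximize, LpProblem, LpStatus, LpVariable, PULP_CBC_CMD, lpSum

# Hypothetical toy data (assumptions, not real inventory or model output).
users = range(2)
items = range(3)
box_size = 2
scores = {(0, 0): 9, (0, 1): 8, (0, 2): 1,
          (1, 0): 7, (1, 1): 2, (1, 2): 6}
quantity_available = {0: 1, 1: 1, 2: 2}

problem = LpProblem("BoxMaker", LpMaximize)

# Binary decisions: does user u get a box, and is item i in user u's box?
user_indicators = {u: LpVariable(f"user_{u}", cat="Binary") for u in users}
user_item_indicators = {
    (u, i): LpVariable(f"user_{u}_item_{i}", cat="Binary")
    for u in users
    for i in items
}

# Objective: total score of all items placed in boxes.
problem += lpSum(
    scores[u, i] * user_item_indicators[u, i] for u in users for i in items
)

# A user receives either a full box or nothing.
for u in users:
    problem += (
        lpSum(user_item_indicators[u, i] for i in items)
        == box_size * user_indicators[u]
    )

# Never allocate more units of an item than are in stock.
for i in items:
    problem += (
        lpSum(user_item_indicators[u, i] for u in users)
        <= quantity_available[i]
    )

problem.solve(PULP_CBC_CMD(msg=False))
status = LpStatus[problem.status]

if status == "Optimal":
    boxes = {
        u: [i for i in items if user_item_indicators[u, i].value() > 0.5]
        for u in users
        if user_indicators[u].value() > 0.5
    }
else:
    # Assumed fallback: return no boxes so a caller can retry or hand off.
    boxes = {}

print(status, boxes)
```

In this toy instance the solver gives user 0 items 1 and 2 and user 1 items 0 and 2: user 0 gives up her top-scoring item because it is worth more to the overall total in user 1's box.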

Published in: Data & Analytics

  1. Scaling Personalization via Machine-Learned Assortment Optimization. Ethan Rosenthal, Dia&Co. DataEngConf NYC 2018, 11/8/2018. eprosenthal, EthanRosenthal, www.ethanrosenthal.com
  2. Nearly 70% of women in the U.S. are plus-size, but they represent only 16% of apparel spend
  3. Algorithmically style all boxes for a day at once
  4. Greedy Humans-in-the-loop
  5. Queue
  6. Queue
  7. Queue
  8. Queue
  9. Queue
  10. Queue
  11. Queue
  12. Queue
  13. Queue
  14. Pre-distribution of Wealth
  15. Batch
  16. Batch
  17. ML + OR
  18. Analytics | Machine Learning | Operations Research | Machine Learning + Operations Research
  19. Analytics | Machine Learning | Operations Research | Machine Learning + Operations Research. Send emails at 9 AM on Tuesdays because they have the highest open rates
  20. Analytics | Machine Learning | Operations Research | Machine Learning + Operations Research. Each customer receives email at time that they are most likely to open
  21. Analytics | Machine Learning | Operations Research | Machine Learning + Operations Research. Given deliverability and throughput, distribute email sends across the week
  22. Analytics | Machine Learning | Operations Research | Machine Learning + Operations Research. Given deliverability, throughput, and customer’s behavior, maximize opens
  23. Integer Programming 101
  24. <math><code>
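  25. Decision Variables
      from pulp import LpVariable
      # N (number of users) and M (number of items) must be defined beforehand.
      users = range(N)
      items = range(M)
      # One binary variable per user: 1 if the user receives a box.
      user_indicators = {}
      for u in users:
          user_indicators[u] = LpVariable(f'user_{u}', cat='Binary')
      # One binary variable per (user, item) pair: 1 if the item is in that user's box.
      user_item_indicators = {}
      for u in users:
          for i in items:
              user_item_indicators[u, i] = LpVariable(
                  f'user_{u}_item_{i}', cat='Binary'
              )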
  26. Objective Function
      from pulp import lpSum, LpProblem, LpMaximize
      problem = LpProblem('BoxMaker', LpMaximize)
      # scores[u, i] is labeled "Machine Learning" on the slide.
      objective_function = []
      for u in users:
          for i in items:
              value = scores[u, i] * user_item_indicators[u, i]
              objective_function.append(value)
      # Maximize the total score of every item placed in a box.
      problem += lpSum(objective_function)
  27. Constraints
      # A user gets exactly 5 items if she receives a box, otherwise none.
      for u in users:
          users_item_count = lpSum(
              user_item_indicators[u, i] for i in items
          )
          problem += users_item_count == 5 * user_indicators[u]
      # Never allocate more units of an item than are available.
      for i in items:
          quantity_used = lpSum(
              user_item_indicators[u, i] for u in users
          )
          problem += quantity_used <= quantity_available[i]
  28. Solve: problem.solve() ¯\_(ツ)_/¯
  29. </math></code>
  30. Productionizing
  31. Productionalizing
  32. Making this thing automatically and reliably run in the cloud
  33. ● Software Architecture ● Testing ● Interfaces ● Failure Planning ● Iteration
  34. ● Software Architecture ● Testing ● Interfaces ● Failure Planning ● Iteration
  35. Objective Function
  36. Deterministic Scorers. Problem: “If item is in-budget, +50 points”. Solution: Pre-calculate and aggregate.
  37. Stochastic Scorers. Problem: “If `outfit` in box, +100 points”. Solution: Create mini-optimization problem with new decision variables and constraints.
  38. Deterministic Constraints. Problem: “User can only receive items that fit”. Solution: Don’t make these decision variables.
  39. Stochastic Constraints. Problem: “Customers must receive between X and Y shirts”. Solution: Mental gymnastics to define constraints.
  40. ● Software Architecture ● Testing ● Interfaces ● Failure Planning ● Iteration
  41. Unit Tests are Hard
  42. Test for Behavior
  43. Invest in Test Setup Functions
  44. ● Software Architecture ● Testing ● Interfaces ● Failure Planning ● Iteration
  45. Microservices
  46. ● Software Architecture ● Testing ● Interfaces ● Failure Planning ● Iteration
  47. Perfect is the enemy of good
  48. Minimum Viable Solution
  49. ● Software Architecture ● Testing ● Interfaces ● Failure Planning ● Iteration
  50. Frequent Deployments
  51. Model Improvements
  52. Tests
  53. Points -> Coefficients. Each scorer becomes an indicator “feature”. Each observation/sample is a box. Learn to predict some box-level metric.
  54. $f(X) = X w \approx y$, where each row of $X$ is a box (box 1, box 2, ..., box n) with sample entries such as 0, 4, 51, and 31, and $w = (w_0, w_1, \dots, w_n)$ is the coefficient vector. (Matrix illustration; unknown values are marked "?" on the slide.)
  55. Thank You! We’re hiring! www.dia.com/careers. eprosenthal, EthanRosenthal, www.ethanrosenthal.com
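The "Points -> Coefficients" idea in the transcript (fit weights w so that X * w approximates a box-level metric y) can be sketched with ordinary least squares. This is only an illustration under stated assumptions: the three scorer columns, the feature values, and the "true" weights used to generate y are all made up, and the talk does not say which model or library was actually used.

```python
import numpy as np

# Rows are boxes; columns are hypothetical scorers, e.g. how often
# "in budget", "outfit", and "fits style" fired for the items in a box.
X = np.array(
    [
        [5, 1, 2],
        [3, 0, 4],
        [4, 1, 0],
        [1, 0, 1],
        [2, 1, 3],
    ],
    dtype=float,
)

# Made-up weights, used only to generate a synthetic box-level metric y.
true_w = np.array([0.5, 1.0, 0.25])
y = X @ true_w

# Least-squares fit of w in X @ w ~ y. The learned coefficients would
# replace hand-tuned point values such as "+50 points".
w, residuals, rank, singular_values = np.linalg.lstsq(X, y, rcond=None)

print(np.round(w, 3))
```

Because y is generated without noise here, the fit recovers the made-up weights; with real box outcomes the coefficients would only approximate whatever relationship exists.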
