How Lazada ranks products to improve customer experience and conversion
Strata Hadoop Singapore 2016
8,347 views

Slides from my talk at Strata + Hadoop Singapore 2016 (http://conferences.oreilly.com/strata/hadoop-big-data-sg/public/schedule/detail/54542)

Ecommerce has enabled retailers to make all of their products available to consumers and consumers to access niche products not found in brick-and-mortar stores. This growth provides consumers with unparalleled choice. Nonetheless, the sheer number of products brings with it the challenge of helping users find relevant products with ease.

Lazada has tens of millions of products on its platform, and this number grows by approximately one million monthly. Lazada’s challenge: How can we help users easily discover good quality products they will like? How can we ensure product selection remains fresh and constantly updated?

One way to do this is through the ranking of products. Through ranking, Lazada helps customers easily find products that will delight them by ensuring these products appear in the first few pages. I’ll share how Lazada ranks products on our website. (Note: search for “how Amazon ranks products” for some industry background.)

Topics include how we:

* Develop methodology (and tricks) to solve not-so-well-defined problems
* Collect and store user-behavior data from our website and app
* Clean and prepare the data (e.g., handling outliers)
* Discover and create useful features
* Build models to improve customer experience and meet business objectives
* Measure and test outcomes on our website
* Build this end-to-end on our Hadoop infrastructure, with tools including Kafka and Spark
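
The data cleaning and feature creation topics above can be illustrated with a minimal sketch. It assumes hypothetical event records of the form (product_id, event_type) and defines conversion rate as checkouts divided by clicks; the talk does not specify either, so the names `EVENTS`, `engagement_features`, and `exclude_outliers` are illustrative only. The sketch aggregates events into per-product engagement metrics and then drops products whose rates exceed 1.00, the outlier rule mentioned in the slides.

```python
from collections import defaultdict

# Hypothetical event records: (product_id, event_type). The field layout and
# event type names are assumptions for illustration, not Lazada's schema.
EVENTS = [
    ("p1", "impression"), ("p1", "impression"), ("p1", "click"),
    ("p1", "impression"), ("p1", "impression"), ("p1", "checkout"),
    ("p2", "impression"), ("p2", "click"), ("p2", "checkout"),
    ("p2", "checkout"),  # more checkouts than clicks: likely a tracking error
    ("p3", "impression"), ("p3", "impression"),
]


def engagement_features(events):
    """Aggregate raw events into per-product engagement metrics."""
    counts = defaultdict(lambda: defaultdict(int))
    for product_id, event_type in events:
        counts[product_id][event_type] += 1

    features = {}
    for product_id, c in counts.items():
        impressions, clicks, checkouts = c["impression"], c["click"], c["checkout"]
        features[product_id] = {
            "impressions": impressions,
            "clicks": clicks,
            "checkouts": checkouts,
            # Guard against division by zero for products with no traffic.
            "click_thru_rate": clicks / impressions if impressions else 0.0,
            # Assumed definition: checkouts per click.
            "conversion_rate": checkouts / clicks if clicks else 0.0,
        }
    return features


def exclude_outliers(features, max_rate=1.0):
    """Drop products whose rates exceed max_rate (e.g., conversion rate > 1.00)."""
    return {
        pid: f for pid, f in features.items()
        if f["click_thru_rate"] <= max_rate and f["conversion_rate"] <= max_rate
    }


features = exclude_outliers(engagement_features(EVENTS))
```

In this toy data, the product with more checkouts than clicks is treated as an outlier and removed before any modelling step would see it.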

Published in: Data & Analytics

  1-2. How Lazada ranks products to improve customer experience and conversion (Strata Hadoop Singapore 2016)
  3. Leading e-commerce platform in South-East Asia
  4. Lazada Data Science: Data App Devs (expose, integrate, platform-ize); Data Scientists (explore, prepare, model); Data Engineers (collect, store, maintain). Start from bottom up.
  5. Ranking affects what appears on top
  6. Ranking is different from recommendation
  7-14. “How can I rank well on an e-commerce platform?”
  15. Ranking products for catalog and search; Introducing new products; Emphasizing product quality
  16. [Architecture diagram] Labels shown: Web Tracker (JavaScript); Mobile Tracker (Adjust); 3rd Party (e.g., ZenDesk, SurveyGizmo); Kafka Queues; Bulk Loaders (Spark); Hadoop; Data Exploration + Data Preparation + Feature Engineering + Modelling (Spark); Manual Boosting (Django); Local Validation; A/B Testing; Product; Seller; Transaction; Product rankings; Split traffic and measure outcomes; (Category Managers); (User devices)
  17. Overall results: Better ranking improved conversion and revenue per session. Introducing new products improved new product engagement. Emphasizing product quality had neutral to positive outcomes.
  18. Ranking products for catalog and search
  19. Intent: Provide shoppers quick access to best products in catalog/search results, making shopping easy.
  20. Problem: Lazada has millions of products; not easy to navigate. How to identify products that interest users in the future? How do we measure interest?
  21. Methodology: Measure shoppers’ interest through product engagement as a proxy (clicks, add-to-cart, checkouts, etc.). Predict future interest.
  22. Collecting behavioral data: Track and collect events on web (JavaScript) and app (Adjust). Stream and process via Kafka. Store in Hive tables.
  23. Data preparation: Filter and categorize online behavioral events (e.g., impressions, clicks, etc.). Merge various views of product data (e.g., price, stock, etc.). Exclude outliers and potentially fraudulent events.
  24. Feature engineering: Calculate product engagement metrics (e.g., average clicks, conversion rate, etc.). Derive product attributes (e.g., age, discount, etc.). Exclude outliers (e.g., conversion rate > 1.00).
  25. Modelling (i.e., machine learning): Predict future (tomorrow’s) product clicks/checkouts. Examine results against a benchmark model. Pandas + XGBoost is faster and more effective than Spark + MLlib; assessing XGBoost4J-Spark.
  26. Boosting products (manually): Manually increase rank of certain products (e.g., highly anticipated products, campaign tie-ups). User-friendly interface to drag-and-drop products. Limits on how many products can be boosted.
  27. Validation and A/B testing: Local validation is easy, but difficult to ensure similar results via A/B testing. A/B test all updates before production.
  28. Results: Increased conversion rate by 3 – 8%. Increased revenue per session by 5 – 20%.
  29. Introducing new products
  30. Intent: Provide potentially good new products with exposure. Provide shoppers with new products they like. Keep catalog fresh.
  31. Problem: Products with strong engagement stay on top. Products without engagement don’t get traffic. How can we identify new products that are likely to interest users?
  32. Methodology (demand): Find what people need. Measure needs through internal/external data. Rank new products in terms of demand.
  33. Methodology (supply): Find products similar to top products. Measure similarity with top products. Rank new products based on similarity and top product volume.
  34. Data preparation and feature engineering: Parse (log) data to identify shoppers’ needs. Measure potential product demand. Model product similarity (Spark GraphX / ElasticSearch).
  35. Validation and A/B testing: Limited capability on existing A/B testing platforms to track specific products. Measure performance of new products across experimental groups using in-house tracker.
  36. Results: Increased new product click-thru rate by 30 – 80%. Increased new product add-to-cart by 20 – 90%. Expected overall conversion to decrease; increased instead (though not statistically significant).
  37. Emphasizing product quality
  38. Intent: Improve customer experience throughout purchase journey, from online browsing to receiving of product. Product quality identified as key driver.
  39. Problem: How do we measure product “quality”?
  40. Methodology (online): Content (e.g., title quality, richness of content). Reviews (e.g., average rating, negative reviews). Performance (e.g., click-thru rate, browsing time).
  41. Methodology (offline): Perfect order rate (i.e., not cancelled, not returned, etc.). Negative feedback (e.g., counterfeit, complaints, etc.). Seller metrics (e.g., timely shipped-rate, return rate, etc.).
  42. Data preparation and feature engineering: Derive product features (e.g., title quality, image quality, etc.). Measure content richness (e.g., attributes available, grouping, etc.). Measure delivery performance and customer feedback.
  43. Results: Improved quality of products displayed. Increased conversion by 3 – 5% for some countries. Small conversion change in other countries (non-significant).
  44. Key takeaways: Data science is (i) team sport, (ii) partly R&D, (iii) iterative. How you use data to solve problems (methodology), data preparation, and feature engineering > machine learning.
  45. Thank you! eugene.yan@lazada.com
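
Slide 33 describes ranking new products by their similarity to top products and by top product volume. The following is a minimal sketch of that idea, not the method used in the talk (which mentions Spark GraphX / ElasticSearch for similarity). It assumes Jaccard similarity over hypothetical attribute tokens and an assumed score of similarity multiplied by volume, taking the best match across top products. All product names, tokens, and volumes are made up for illustration.

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of attribute tokens."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical catalog data; tokens and volumes are invented for illustration.
TOP_PRODUCTS = {
    "top_phone": {"tokens": {"phone", "android", "64gb"}, "volume": 1000},
    "top_case": {"tokens": {"phone", "case", "leather"}, "volume": 200},
}
NEW_PRODUCTS = {
    "new_phone": {"phone", "android", "128gb"},
    "new_case": {"phone", "case", "silicone"},
    "new_lamp": {"lamp", "led", "desk"},
}


def score_new_product(tokens, top_products):
    """Assumed score: best (similarity x volume) over all top products."""
    return max(
        jaccard(tokens, top["tokens"]) * top["volume"]
        for top in top_products.values()
    )


def rank_new_products(new_products, top_products):
    """Return new product ids sorted by descending score, plus the scores."""
    scores = {
        pid: score_new_product(tokens, top_products)
        for pid, tokens in new_products.items()
    }
    return sorted(scores, key=scores.get, reverse=True), scores


ranking, scores = rank_new_products(NEW_PRODUCTS, TOP_PRODUCTS)
```

Multiplying by volume means a weak match to a very popular product can outrank a strong match to a niche one; whether that trade-off is desirable would need to be checked through A/B testing, as the slides emphasize.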
