Slides from my talk at Strata + Hadoop Singapore 2016 (http://conferences.oreilly.com/strata/hadoop-big-data-sg/public/schedule/detail/54542)
Ecommerce has enabled retailers to make all of their products available to consumers, and consumers to access niche products not found in brick-and-mortar stores. This growth gives consumers unparalleled choice. However, the sheer number of products makes it harder for users to find relevant products with ease.
Lazada has tens of millions of products on its platform, and this number grows by approximately one million monthly. Lazada’s challenge: How can we help users easily discover good-quality products they will like? How can we ensure that the product selection remains fresh and constantly updated?
One way to do this is through the ranking of products. Via ranking, Lazada helps customers easily find products that will delight them by ensuring these products appear in the first few pages. I’ll share how Lazada ranks products on our website. (Note: Google “how amazon ranks products” for some industry background)
Topics include how we:
* Develop methodology (and tricks) to solve not-so-well-defined problems
* Collect and store user-behavior data from our website and app
* Clean and prepare the data (e.g., handling outliers)
* Discover and create useful features
* Build models to improve customer experience and meet business objectives
* Measure and test outcomes on our website
* Build this end-to-end on our Hadoop infrastructure, with tools including Kafka and Spark
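
To make the data-cleaning topic above a little more concrete, here is a minimal sketch of one common way to handle outliers: capping values at chosen percentiles (winsorizing). This is an illustration only, not necessarily the approach used at Lazada. The function names, the `daily_views` sample data, and the 10th/90th percentile thresholds are all assumptions made for the example.

```python
def percentile(sorted_values, q):
    """Linear-interpolation percentile of an already sorted list, q in [0, 100]."""
    if not sorted_values:
        raise ValueError("need at least one value")
    pos = (len(sorted_values) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] + (sorted_values[upper] - sorted_values[lower]) * frac


def winsorize(values, lower_q=5.0, upper_q=95.0):
    """Cap each value to the [lower_q, upper_q] percentile range of the data."""
    ordered = sorted(values)
    lo = percentile(ordered, lower_q)
    hi = percentile(ordered, upper_q)
    return [min(max(v, lo), hi) for v in values]


# Hypothetical daily page views for one product; 5000 could be a bot spike
# and 0 a tracking gap.
daily_views = [12, 15, 11, 14, 13, 5000, 16, 12, 0, 14]
capped = winsorize(daily_views, lower_q=10, upper_q=90)
print(capped)
```

Capping keeps every observation, so downstream counts stay intact, while limiting how much a single extreme value can distort features such as average views per day. Whether to cap, drop, or investigate such values depends on what is causing them.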