Anatomy of an eCommerce Search Engine by Mayur Datar

Naresh Jain, Tech Startup Founder at ConfEngine

In this talk, the Chief Data Scientist of Flipkart will uncover the various challenges in running an e-commerce search platform, such as scale, recency, update rates, and business shaping. He will also explain the overall system architecture of the search platform and go into the details of some of its sub-systems, including the query understanding and rewriting sub-system.

● Search is one of the most important discovery tools in e-commerce.
● Powers other features like merchandising (promotions), recommendations, etc.
● Accounts for a big fraction of the units sold and GMV.
● Important signals that affect search: price, offers, popularity, availability, serviceability, etc.
● Used in ranking of products.
● Exposed as filters and sorts to end users.
● These signals are very dynamic, particularly during sales.
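The two uses of these signals named above (filters and sorts exposed to users, and ranking) can be sketched as follows. This is only an illustration, not the platform's implementation: the `Product` fields, the `search` function, the pincode-based serviceability check, and the sample values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Product:
    # All fields are hypothetical stand-ins for the signals on the slide.
    title: str
    price: float
    popularity: float  # assumed to be a normalized score in [0, 1]
    in_stock: bool  # availability
    serviceable_pincodes: frozenset  # serviceability, modeled as delivery pincodes


def search(products, max_price, pincode, sort_by="popularity"):
    """Filter on availability, price and serviceability, then sort by one signal."""
    hits = [
        p
        for p in products
        if p.in_stock and p.price <= max_price and pincode in p.serviceable_pincodes
    ]
    if sort_by == "price":
        return sorted(hits, key=lambda p: p.price)  # cheapest first
    return sorted(hits, key=lambda p: p.popularity, reverse=True)  # most popular first
```

Because the signals change often, a real system would have to keep fields like `price` and `in_stock` fresh in the index; this sketch simply reads them from in-memory objects.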
● E-commerce search != web search.
● Documents have a structure to them.
● Queries have an implicit structure.
● Challenges:
○ Large document collection with a long heavy tail
○ Extremely high rate of changes/updates (thousands per second)
○ Geo-specific ranking
○ Multi-objective optimization (GMV, units, ads revenue, long-term value)
● Opportunities:
○ Broad queries: personalization can play a huge role
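One common way to handle several objectives at once is a weighted sum of per-objective estimates. The slides do not say which method is actually used, so the sketch below is an assumption throughout: the weights, the `blended_score` and `rank` helpers, and the idea that each estimate is already normalized to [0, 1] are all hypothetical.

```python
# Hypothetical weights for the four objectives named on the slide.
WEIGHTS = {"gmv": 0.4, "units": 0.3, "ads_revenue": 0.2, "long_term_value": 0.1}


def blended_score(estimates, weights=WEIGHTS):
    """Weighted sum of per-objective estimates (assumed normalized to [0, 1])."""
    return sum(weight * estimates[name] for name, weight in weights.items())


def rank(candidates, weights=WEIGHTS):
    """Order candidates by blended score, highest first."""
    return sorted(
        candidates,
        key=lambda c: blended_score(c["estimates"], weights),
        reverse=True,
    )
```

Changing the weights shifts the trade-off between objectives, for example toward ads revenue or toward long-term value, without changing the ranking code.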
● Queries per day: XXX Millions / week
● Latencies:
○ Average: ~ 100 ms
○ Median: ~ 50 ms
○ 90th percentile: ~ 500 ms
● Documents retrieved and scored from index:
○ Median: 1K to 10K
○ 95th percentile: 200K to 500K
○ 99th percentile: 500K to 3M+
● Search CTR: Around 50%
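The latency figures above show an average (~100 ms) well above the median (~50 ms), which is what a long tail of slow queries would produce. The sketch below computes the same three statistics on made-up samples. The sample values and the `percentile` and `summarize` helpers are illustrative assumptions, not measurements from the talk.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: smallest sample with at least pct% of samples at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def summarize(latencies_ms):
    """Return the average, median and 90th percentile of a list of latencies."""
    return {
        "average": statistics.fmean(latencies_ms),
        "median": statistics.median(latencies_ms),
        "p90": percentile(latencies_ms, 90),
    }


# Synthetic samples: mostly fast queries plus one very slow one.
samples_ms = [40, 45, 50, 50, 50, 55, 60, 70, 80, 500]
summary = summarize(samples_ms)
```

Here a single 500 ms query is enough to pull the average to about twice the median, which is why percentiles are usually reported alongside the average.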
unit I lecture 5 - Software Development Life Cycle.pdf
 
Manual de la Mezcladora SoundCraft Notepad -12Fx
Manual de la Mezcladora SoundCraft Notepad -12FxManual de la Mezcladora SoundCraft Notepad -12Fx
Manual de la Mezcladora SoundCraft Notepad -12Fx
 
India's_Generative_AI_Startup_Landscape_Report_2023_Inc42 (1).pdf
India's_Generative_AI_Startup_Landscape_Report_2023_Inc42 (1).pdfIndia's_Generative_AI_Startup_Landscape_Report_2023_Inc42 (1).pdf
India's_Generative_AI_Startup_Landscape_Report_2023_Inc42 (1).pdf
 

Anatomy of an eCommerce Search Engine by Mayur Datar

  • 3.
    ● Search is one of the most important discovery tools in E-commerce.
    ● Powers other features like merchandising (promotions), recommendations, etc.
    ● Accounts for a big fraction of the units sold and GMV.
  • 4.
    ● Important signals that affect search: Price, offers, popularity, availability, serviceability etc.
    ● Used in ranking of products.
    ● Exposed as filters and sorts to end users.
    ● These signals are very dynamic, particularly during sales.
  • 5.
    ● E-commerce search != web search.
    ● Documents have a structure to them.
    ● Queries have an implicit structure.
    ● Challenges:
      ○ Large document collection with a long heavy tail
      ○ Extremely high rate of changes/updates (thousands per sec)
      ○ Geo-specific ranking
      ○ Multi-objective optimization (GMV, Units, Ads revenue, Long Term Value)
    ● Opportunities:
      ○ Broad queries: personalization can play a huge role
  • 6.
    ● Queries per day: XXX Millions / week
    ● Latencies:
      ○ Average: ~ 100 ms
      ○ Median: ~ 50 ms
      ○ 90th percentile: ~ 500 ms
    ● Documents retrieved and scored from index:
      ○ Median: 1K to 10K
      ○ 95th percentile: 200K to 500K
      ○ 99th percentile: 500K to 3M+
    ● Search CTR: Around 50%
  • 7.
    ● Architectural overview of the search platform
      ○ Serving and Ingestion
      ○ Serving functional view
      ○ Serving architectural view
      ○ Ingestion architectural view
      ○ Example ingestion topology
    ● Search quality
      ○ Challenges
      ○ Life of a query: Typical flow for query understanding
      ○ Illustrative problems
  • 8.
    ● 1,000,000 Compute Cores
    ● 2.56 Petabytes RAM
    ● 120 Petabytes Disk Storage
    ● 1 Petabyte NVMe SSD
    ● 128 Tbps bisection bandwidth Clos network
  • 10.
    ● Query Rewriter (Spell Check, Concept, NLP, Intent, Augmentation, Retrieval/Scoring query formulation)
    ● Reverse Proxy (Geo Coding, User Context, Caching, Isolation, Rate Limit, Tee-off test framework)
    ● Search Broker (Distributed Search across shards, Blending of Results from shards)
    ● Searcher (Matching, Scoring, Faceting, Top-K Retrieval (pass-1 ranking))
      ○ Text index
      ○ NRT index
      ○ Metadata
    ● Re-ranking (Pass-2 Ranking) - ML Model
    ● Pluggable Ranking Models
    ● Pluggable Rewriter Modules
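
The Search Broker and Searcher roles on this slide (distributed search across shards, top-K retrieval, blending of results) can be illustrated with a small scatter-gather sketch. This is only an illustration under stated assumptions, not the platform's implementation: shards are modelled as in-memory dictionaries of `product_id -> score`, and the names `search_shard`, `broker_search`, and `matches` are hypothetical.

```python
import heapq
from typing import Callable

Hit = tuple[float, str]  # (score, product_id); hypothetical result shape


def search_shard(shard: dict[str, float],
                 matches: Callable[[str], bool],
                 k: int) -> list[Hit]:
    """Searcher role: match and score within one shard, keep the local top-k."""
    hits = ((score, pid) for pid, score in shard.items() if matches(pid))
    return heapq.nlargest(k, hits)


def broker_search(shards: list[dict[str, float]],
                  matches: Callable[[str], bool],
                  k: int) -> list[Hit]:
    """Broker role: query every shard, then blend the per-shard top-k lists."""
    per_shard = [search_shard(shard, matches, k) for shard in shards]
    return heapq.nlargest(k, (hit for hits in per_shard for hit in hits))


if __name__ == "__main__":
    shards = [{"a": 0.9, "b": 0.2}, {"c": 0.7, "d": 0.95}]
    print(broker_search(shards, lambda pid: True, k=2))
```

Because each shard returns only its own top-k, the broker merges at most `k * number_of_shards` candidates regardless of how many documents matched. A pass-2 re-ranker would then operate on the blended list.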
  • 14.
    ● Architectural overview of the search platform
      ○ Serving and Ingestion
      ○ Serving functional view
      ○ Serving architectural view
      ○ Ingestion architectural view
      ○ Example ingestion topology
    ● Search quality
      ○ Challenges
      ○ Life of a query: Typical flow for query understanding
      ○ Illustrative problems
  • 15.
    ● Marketplace
      ○ Catalog entries vary in quality from seller to seller. Spam is rampant.
    ● Diversity of users
    ● Mobile heavy users: Real estate on UI
    ● Poor internet connectivity
  • 16.
    ● Literacy/Internet awareness
    ● Language
    ● Economic power
    ● Regional preferences
    Abstraction: City-tier
    Diagram labels: Query/Intent Solicitation, Result Presentation, Product Ranking
  • 17. 40% increase in proportion of tier-3 customers vis-a-vis metro
  • 18.
    ● Query: samsang. Relative ratio of query, Tier-3 vs Metro: 1.8
    ● Query: jins. Relative ratio of query, Tier-3 vs Metro: 2.2
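
The slide does not define "relative ratio". One plausible reading, assumed here, is the query's share of all tier-3 queries divided by its share of all metro queries. The sketch below uses that assumption; the function name `relative_ratio` and the toy logs are hypothetical.

```python
from collections import Counter


def relative_ratio(query: str, tier3_log: list[str], metro_log: list[str]) -> float:
    """Share of `query` among tier-3 queries divided by its share among metro queries.

    Assumed definition; the slide only reports the resulting numbers.
    """
    tier3_share = Counter(tier3_log)[query] / len(tier3_log)
    metro_share = Counter(metro_log)[query] / len(metro_log)
    if metro_share == 0:
        raise ValueError(f"query {query!r} never appears in the metro log")
    return tier3_share / metro_share


if __name__ == "__main__":
    # Toy logs: the misspelling is 9% of tier-3 traffic and 5% of metro traffic.
    tier3 = ["samsang"] * 9 + ["samsung"] * 91
    metro = ["samsang"] * 5 + ["samsung"] * 95
    print(round(relative_ratio("samsang", tier3, metro), 2))
```

A ratio above 1 means the query form is over-represented among tier-3 users relative to metro users.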
  • 20. (labels recovered from the query understanding flow diagram)
    ● Normalisation (index time as well)
      ○ String clean-up
      ○ lower
    ● Spell Correction
      ○ Resource-based
      ○ term->term
      ○ Query->query
      ○ Online
    ● Init Context
    ● Phrasing (index time as well)
      ○ Frequent bi/tri grams
    ● Stemming (index time as well)
      ○ Core e-commerce stemmer
      ○ plurals
    ● Common MetaData Store (Query Level)
      ○ Raw Data: metrics (CTR, Impression, NDCG…)
      ○ Derived Data: Store, LM score, Features
    ● Synonyms
      ○ Resource-based
    ● Intent
      ○ Deductions
      ○ Tagging (CRF)
    ● Query Rewrite
      ○ Best query selection
      ○ Partial match
    ● SOLR interface
    ● Other diagram labels: Query, Scoring, Query Understanding, Output Generator, Retrieval ranking logic, Store Classifier, Query LM, Feature Store Classification
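
A few of the stages named on this slide (normalisation, phrasing with frequent bigrams, plural stemming) can be chained as in the minimal sketch below. Everything here is an assumption made for illustration: the stage order follows the order in which the slide lists them, the `PHRASES` set stands in for mined frequent bigrams, and the plural stemmer is deliberately naive rather than the "core e-commerce stemmer" the slide mentions.

```python
import re

# Hypothetical stand-in for frequent bigrams mined from query logs.
PHRASES = {("red", "tape"), ("pack", "of")}


def normalise(query: str) -> list[str]:
    """String clean-up and lowercasing, then whitespace tokenisation."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", query.lower())
    return cleaned.split()


def phrase(tokens: list[str]) -> list[str]:
    """Join adjacent tokens that form a known frequent bigram."""
    out, i = [], 0
    while i < len(tokens):
        if i + 1 < len(tokens) and (tokens[i], tokens[i + 1]) in PHRASES:
            out.append(tokens[i] + "_" + tokens[i + 1])
            i += 2
        else:
            out.append(tokens[i])
            i += 1
    return out


def stem_plural(token: str) -> str:
    """Naive plural stripping; a production stemmer would need exception lists."""
    if "_" in token:  # leave joined phrases untouched
        return token
    if token.endswith("ies") and len(token) > 4:
        return token[:-3] + "y"
    if token.endswith("s") and not token.endswith("ss") and len(token) > 3:
        return token[:-1]
    return token


def understand(query: str) -> list[str]:
    return [stem_plural(t) for t in phrase(normalise(query))]


if __name__ == "__main__":
    print(understand("Red Tape Shoes!!"))
```

The slide marks these stages "index time as well"; applying the same functions to product text at index time is what keeps query tokens and indexed tokens comparable.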
  • 21.
    • Special patterns:
      – Segmented words: lgnexus5
    • Counting: “samsang” & no-click followed by “samsung” & click a million times
      – Context aware counting
    • Language modeling and edit distance
    • Term to vector models in deep learning.
    Diagram axis labels: Specific, General
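
The counting idea on this slide (a query with no click, followed by a similar query with a click, seen many times) can be sketched as below, combined with an edit-distance filter. This is an illustrative sketch, not the speaker's system: the session format (a list of `(query, clicked)` pairs), the thresholds `min_count` and `max_dist`, and the function names are all assumptions.

```python
from collections import Counter


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]


def mine_corrections(sessions: list[list[tuple[str, bool]]],
                     min_count: int = 2,
                     max_dist: int = 2) -> dict[str, str]:
    """Count (no-click query -> clicked next query) pairs and keep close, frequent ones."""
    pair_counts: Counter = Counter()
    for session in sessions:
        for (q1, clicked1), (q2, clicked2) in zip(session, session[1:]):
            if not clicked1 and clicked2 and q1 != q2:
                pair_counts[(q1, q2)] += 1
    corrections: dict[str, str] = {}
    for (q1, q2), n in pair_counts.most_common():
        if n >= min_count and edit_distance(q1, q2) <= max_dist:
            corrections.setdefault(q1, q2)  # keep the most frequent target
    return corrections


if __name__ == "__main__":
    sessions = [[("samsang", False), ("samsung", True)]] * 3
    sessions.append([("samsang", False), ("shoes", True)])
    print(mine_corrections(sessions))
```

The edit-distance filter removes reformulations that are topic changes rather than spelling fixes, and the count threshold guards against one-off noise.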
  • 22.
    ● Intent: From query tokens to (implicit) attributes that are represented by those tokens
    ● Examples:
      ○ “red tape shoes” -> (brand) “red tape” (store) “shoes”
      ○ “kids party dress 4-5 years pack of 2” -> (ideal_for) “kids” (occasion) “party” (store) “dress” (size) “4-5 years” (pack_of) “pack of 2”
      ○ “samsung e6 cases” -> (“compatible_with”) “samsung e6” (store) “cases”
    ● Memorization, Language modeling, CRF
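
Of the three techniques listed, memorization is the simplest to illustrate: look up known token spans in a lexicon and prefer the longest match. The sketch below assumes a small hand-written `LEXICON` (hypothetical; a real one would presumably be derived from catalog and log data) and does not attempt the language-model or CRF approaches.

```python
# Hypothetical memorised lexicon: token span -> attribute it represents.
LEXICON = {
    ("red", "tape"): "brand",
    ("red",): "color",
    ("shoes",): "store",
    ("samsung", "e6"): "compatible_with",
    ("cases",): "store",
}


def tag_intent(query: str, lexicon: dict = LEXICON) -> list[tuple[str, str]]:
    """Greedy longest-match tagging of query tokens against a lexicon."""
    tokens = query.lower().split()
    max_len = max(len(span) for span in lexicon)
    tags, i = [], 0
    while i < len(tokens):
        # Try the longest candidate span first so "red tape" beats "red".
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = tuple(tokens[i:i + n])
            if span in lexicon:
                tags.append((lexicon[span], " ".join(span)))
                i += n
                break
        else:
            tags.append(("unknown", tokens[i]))
            i += 1
    return tags


if __name__ == "__main__":
    print(tag_intent("red tape shoes"))
    print(tag_intent("red shoes"))
```

Pure lookup cannot resolve context-dependent cases, which is where a sequence tagger such as the CRF named on the slide would come in.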
  • 23. Diagram labels: Past orders; Product Views; Users’ activity on the platform; Customised Search Ranking for User-segment
  • 24. Price Personalization
    ● 5 price ranges defined for each vertical (1 to 5, economical to expensive; example verticals: shoes, watches).
    ● User-Segments based on price affinities, derived from users’ past activity on the platform (Past orders, Product Views).
    ● Customised Search Ranking for each User-segment.
    Chart axis label: # of users
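
The price personalization idea on this slide can be sketched as follows: map each price to one of five ranges per vertical, assign a user to the range they interact with most, and nudge ranking toward that range. The bucket edges in `PRICE_EDGES`, the additive `boost`, and all function names are hypothetical; the slide does not say how ranges are defined or how the ranking is customised.

```python
from bisect import bisect_right
from collections import Counter
from typing import Optional

# Hypothetical upper edges separating the 5 price ranges of each vertical.
PRICE_EDGES = {
    "shoes": [500, 1000, 2000, 4000],
    "watches": [1000, 2500, 5000, 15000],
}


def price_bucket(vertical: str, price: float) -> int:
    """Return the price range (1 = economical ... 5 = expensive) for a price."""
    return bisect_right(PRICE_EDGES[vertical], price) + 1


def user_segment(vertical: str, past_prices: list[float]) -> Optional[int]:
    """Most frequent price range among the user's past orders and views."""
    if not past_prices:
        return None
    buckets = Counter(price_bucket(vertical, p) for p in past_prices)
    return buckets.most_common(1)[0][0]


def rerank(vertical: str, results: list[tuple[str, float, float]],
           segment: Optional[int], boost: float = 0.2) -> list[str]:
    """Order (product_id, price, base_score) results, boosting the user's range."""
    def score(result: tuple[str, float, float]) -> float:
        _, price, base = result
        return base + (boost if price_bucket(vertical, price) == segment else 0.0)
    return [pid for pid, _, _ in sorted(results, key=score, reverse=True)]


if __name__ == "__main__":
    segment = user_segment("shoes", [300, 450, 2500])
    print(segment, rerank("shoes", [("a", 300, 0.5), ("b", 2500, 0.6)], segment))
```

An additive boost is only one option; the segment could equally be passed as a feature to a learned pass-2 ranking model.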