Towards an End-to-End Product Search System
Zalando Research
Han Xiao
Sept 6, 2017
Animations involved; best viewed in presentation mode
NLP Team @ Zalando Research
Alan Akbik,
Duncan Blythe,
Leonidas Lefakis,
Han Xiao
About Me
Han Xiao
Senior Research Scientist @ Zalando Research
2.5y engineering experience in Reco and Search teams @ Zalando
Ph.D. & M.Sc. in Computer Science @ TU Munich
Blog: https://hanxiao.github.io
LinkedIn: https://www.linkedin.com/in/hxiao87/
Agenda
1. How do most product search systems work?
2. Why do we need an end-to-end product search system?
3. What data do we need to build end-to-end search?
4. Query2Attribute model: character-based LSTM + multi-task learning
5. Discussion
Classical information retrieval framework
Query → Parsing → String/Symbolic representation
Product → Indexing (offline) → String/Symbolic representation
Matching between the two representations
Classic product search system with filter query
Indexing: product record → Message Queue → Structured string index
{
"brand": "Miss Selfridge",
"category": "Umhängetasche",
"color": "red",
...
}
Filter query: brand="nike" AND color="orange"
*Animation
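A filter query of this kind can be evaluated as a set intersection over a structured index. The following is a minimal sketch, not the system from the talk: the product records, the SKU names A1 to A3, and the helpers build_index and run_filter are all hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical product records, shaped like the indexed document on the slide.
PRODUCTS: List[Dict[str, str]] = [
    {"sku": "A1", "brand": "nike", "category": "shoe", "color": "orange"},
    {"sku": "A2", "brand": "nike", "category": "shirt", "color": "black"},
    {"sku": "A3", "brand": "Miss Selfridge", "category": "Umhängetasche", "color": "red"},
]


def build_index(products: List[Dict[str, str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Map (field, value) to the set of SKUs carrying that exact string."""
    index: Dict[Tuple[str, str], Set[str]] = {}
    for product in products:
        for field, value in product.items():
            if field == "sku":
                continue
            index.setdefault((field, value), set()).add(product["sku"])
    return index


def run_filter(index: Dict[Tuple[str, str], Set[str]], conditions: Dict[str, str]) -> List[str]:
    """Evaluate field=value conditions joined by AND via set intersection."""
    result = None
    for field, value in conditions.items():
        matches = index.get((field, value), set())
        result = matches if result is None else result & matches
    return sorted(result or [])


INDEX = build_index(PRODUCTS)
print(run_filter(INDEX, {"brand": "nike", "color": "orange"}))  # ['A1']
```

Matching here is exact string equality, which is why a misspelled or unnormalized query term returns nothing unless the query is parsed first.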
Parsing a full-text query to a filter query
Indexing: product record → Message Queue → Structured string index
{
"brand": "Miss Selfridge",
"category": "Umhängetasche",
"color": "red",
...
}
Parsing: full-text query → Filter query
Query understanding as a pipeline (ideal)
Query parsing: Full text query → Filter query
Stages: normalize, tokenize, lemmatize, spell-correct, recognize named-entity, disambiguate, recognize synonym & acronym, query-builder
Example: "nikke sport whiteschoe" →
brand="Nike" AND category=("sportshoe" OR "sport" OR "shoe") AND color="white"
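To make the pipeline idea concrete, here is a toy sketch of a few of the stages chained together. Everything in it is an assumption for illustration: the VOCAB table, the 0.75 similarity cutoff, and the use of difflib for spell correction. It expects whitespace-separated tokens, so the compound "whiteschoe" from the slide would need an extra splitting stage that is left out here.

```python
import difflib
from typing import Dict, List, Tuple

# Hypothetical vocabulary: surface form -> (attribute, canonical value).
VOCAB: Dict[str, Tuple[str, str]] = {
    "nike": ("brand", "Nike"),
    "sport": ("category", "sport"),
    "shoe": ("category", "shoe"),
    "white": ("color", "white"),
}


def normalize(query: str) -> str:
    return query.strip().lower()


def tokenize(query: str) -> List[str]:
    return query.split()


def spell_correct(tokens: List[str]) -> List[str]:
    """Replace each token by its closest vocabulary entry, if one is close enough."""
    corrected = []
    for token in tokens:
        close = difflib.get_close_matches(token, list(VOCAB), n=1, cutoff=0.75)
        corrected.append(close[0] if close else token)
    return corrected


def recognize_entities(tokens: List[str]) -> Dict[str, List[str]]:
    """Group recognized tokens by attribute; unknown tokens are dropped."""
    entities: Dict[str, List[str]] = {}
    for token in tokens:
        if token in VOCAB:
            attribute, value = VOCAB[token]
            entities.setdefault(attribute, []).append(value)
    return entities


def build_filter(entities: Dict[str, List[str]]) -> str:
    """Join values of one attribute with OR and attributes with AND."""
    parts = []
    for attribute, values in entities.items():
        if len(values) == 1:
            parts.append(f'{attribute}="{values[0]}"')
        else:
            joined = " OR ".join(f'"{v}"' for v in values)
            parts.append(f"{attribute}=({joined})")
    return " AND ".join(parts)


def parse(query: str) -> str:
    return build_filter(recognize_entities(spell_correct(tokenize(normalize(query)))))


print(parse("Nikke sport white schoe"))
# brand="Nike" AND category=("sport" OR "shoe") AND color="white"
```

Each stage depends on the output of the previous one, so an error early in the chain (for example a wrong spell correction) propagates to the final filter query.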
Query understanding as a DAG (in practice)
Query parsing: Full text query → Filter query
Stages (connected as a DAG): normalize, tokenize, lemmatize, spell-correct, recognize named-entity, disambiguate, recognize synonym & acronym, query-builder
Example: "nikke sport whiteschoe" →
brand="Nike" AND category=("sportshoe" OR "sport" OR "shoe") AND color="white"
Pros & cons of a pipeline system
Upside: intuitive, modular, many off-the-shelf packages, easy to collaborate
Downside:
● Fragile
● Complicated dependencies
● Not straightforward to improve the overall search experience
● Difficult to scale out to other languages
Question 1:
If finding the right article is the final goal,
then why should we even care about spell-checking?
Problems of a symbolic-based system
● Limited interpretability;
● Hard-coded rules for acronyms, synonyms, etc. can't scale to different app domains;
● No matter how good the intentions are, the overall system turns into a set of heuristics.
Upside: easy to implement, efficient, very well-studied
Question 2:
How can we associate “fur mamas” with
“Schwangerschaftsmode”
without hard-coding for each language?
Motivation for building end-to-end product search
Question 1:
If finding the right article is the final goal,
then why should we even care about spell-checking?
Question 2:
How can we associate “fur mamas” with “Schwangerschaftsmode”
without hard-coding for each domain?
eliminate components in the pipeline
find a better representation for query and product
An end-to-end product search system with deep learning:
more robust, easier to maintain, more scalable, simpler architecture, smarter
Classical system vs end-to-end product search system
Classical: Query → ② parsing → Symbolic representation; Product → ① indexing (offline) → Symbolic representation; ③ matching
End-to-end: Query → deep learning → Latent representation; Product → deep learning (offline) → Latent representation; matching
*Animation
Text ↔ SKU data sources
Three types of data sources
● Query2SKU
● Crowdsourcing annotations
● Customer reviews
User-generated
content
Product
Extracting Query↔SKU mapping from message queue
User: type in search-box → see search result page → click a product → see PDP → click on reco → see PDP
Pages: search-result → PDP → PDP
Message Queue (over time): receive-query: "denim shirt" → retrieval-search-result → click-through: SKU00000-001 → retrieve-reco-result → click-through: SKU00000-002
Label in diagram: "denimshirt"
Extracted mapping:
{
query: "denim shirt",
skus: ["SKU00000-001", "SKU00000-002"]
}
*Animation
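The extraction step can be sketched as a single pass over time-ordered events. This is an illustration under assumptions, not the production job: the tuple layout, the session identifiers s1 to s3, and the rule "attribute each click-through to the latest query in the same session" are hypothetical; only the event names and the "denim shirt" example come from the slide.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical event layout: (session_id, event_type, payload), ordered by time.
Event = Tuple[str, str, str]

EVENTS: List[Event] = [
    ("s1", "receive-query", "denim shirt"),
    ("s1", "click-through", "SKU00000-001"),
    ("s1", "click-through", "SKU00000-002"),  # click on a recommendation
    ("s2", "receive-query", "ananas"),
    ("s2", "click-through", "SKU00000-003"),
    ("s3", "click-through", "SKU00000-004"),  # no preceding query: ignored
]


def extract_query_sku(events: Iterable[Event]) -> Dict[str, Counter]:
    """Attribute each click-through to the latest query seen in the same session."""
    last_query: Dict[str, str] = {}
    mapping: Dict[str, Counter] = defaultdict(Counter)
    for session, event_type, payload in events:
        if event_type == "receive-query":
            last_query[session] = payload
        elif event_type == "click-through" and session in last_query:
            mapping[last_query[session]][payload] += 1
    return dict(mapping)


for query, skus in extract_query_sku(EVENTS).items():
    print({"query": query, "skus": sorted(skus)})
```

Counting clicks per (query, SKU) pair yields the frequency field that a Query → SKU map needs.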
Example of Query → SKU map
{"query":"ananas",
"skus":[
{"id":"CE321D0HP-A11","freq":371},
{"id":"RL651E02D-F11","freq":273},
{"id":"EV411AA0K-T11","freq":243},
{"id":"L1211E001-A11","freq":208},
{"id":"ES121D0ON-C11","freq":180},
...
{"id":"TO226K009-I11","freq":2},
{"id":"BH523F01J-A11","freq":2},
{"id":"MOC83C00C-J11","freq":1},
{"id":"MOC83C001-J11","freq":1},
{"id":"HG223F04A-A11","freq":1}]}
Example of SKU → Query map
{"sku":"CZ621C04O-G11",
"queries":[
{"text":"chi+chi+london","freq":998},
{"text":"abendkleid","freq":403},
{"text":"ballkleid","freq":394},
{"text":"cocktailkleid","freq":134},
{"text":"kleid","freq":125},
{"text":"kleider","freq":118},
{"text":"abendkleider","freq":79},
{"text":"abendkleid+lang","freq":58},
{"text":"kleid+lang","freq":46},
{"text":"abiballkleid","freq":46},
{"text":"chi+chi","freq":43},
{"text":"lange+kleider","freq":40},
{"text":"ballkleider","freq":36},
...
{"text":"ballkkeid+lang","freq":1},
{"text":"ball+kleid","freq":1},
{"text":"abschlusskleid+leng","freq":1},
{"text":"abschlussballkleider","freq":1},
{"text":"abschluss+kleider+rot","freq":1},
{"text":"abenkleid","freq":1},
{"text":"abendskleid","freq":1},
{"text":"abendkleider+in+lang","freq":1},
{"text":"abendkleider+abendkleider","freq":1},
{"text":"abendkleid+damen","freq":1},
{"text":"abendkleid+chi+chi+london","freq":1},
{"text":"abendkleid+/ballkleid","freq":1},
{"text":"abend+kleid","freq":1}]}
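The two maps hold the same click data viewed from opposite sides, so one can be derived from the other. A minimal sketch, assuming records shaped like the examples above: the helper invert and the SKU "HYPO-0001" with its frequency are hypothetical, while the other values are copied from the slide.

```python
from collections import defaultdict
from typing import Dict, List

# Records in the Query → SKU shape; "HYPO-0001" is a made-up SKU for illustration.
QUERY_TO_SKU: List[dict] = [
    {"query": "ballkleid", "skus": [{"id": "CZ621C04O-G11", "freq": 394},
                                    {"id": "HYPO-0001", "freq": 7}]},
    {"query": "abendkleid", "skus": [{"id": "CZ621C04O-G11", "freq": 403}]},
]


def invert(query_to_sku: List[dict]) -> List[dict]:
    """Turn Query → SKU records into SKU → Query records, most frequent query first."""
    by_sku: Dict[str, List[dict]] = defaultdict(list)
    for entry in query_to_sku:
        for sku in entry["skus"]:
            by_sku[sku["id"]].append({"text": entry["query"], "freq": sku["freq"]})
    return [
        {"sku": sku_id, "queries": sorted(queries, key=lambda q: -q["freq"])}
        for sku_id, queries in sorted(by_sku.items())
    ]


for record in invert(QUERY_TO_SKU):
    print(record)
```

The long tail of frequency-1 queries in the example (misspellings such as "abenkleid") is exactly the kind of supervision a character-level model can learn from.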
Query2Attribute model
Query2Attribute Model
Query Attributes SKU
Query Attributes
Classical system vs end-to-end product search system
Classic: Query → ② parsing → Symbolic representation; SKU → ① indexing (offline) → Symbolic representation; ③ matching
Query2Attribute: Query → deep learning → Symbolic representation; Product → Symbolic representation (offline); matching
➢ Leverage current indexing and matching system
➢ Good interpretability
Query2SKU: Query → deep learning → Latent representation; Product → deep learning (offline) → Latent representation; matching
➢ Completely end-to-end
➢ Require efficient matching algorithm
*Animation
Translating a full-text query to a filter query
1. input: "sports"
2. output: brand, color, gender, category distributions
3. translating results to a filter query:
"brand=Nike Performance AND color=schwarz AND category=(Sport OR Sportbekleidung)"
brand color gender category
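Step 3 can be sketched as thresholding each predicted distribution and joining the surviving values. This is one possible rule, not necessarily the one used in the talk: the function to_filter_query, the 0.3 threshold, and all probabilities below are hypothetical, chosen so that the output reproduces the filter query on the slide.

```python
from typing import Dict


def to_filter_query(distributions: Dict[str, Dict[str, float]], threshold: float = 0.3) -> str:
    """Keep values whose probability reaches the threshold; OR within an attribute, AND across."""
    clauses = []
    for attribute, probs in distributions.items():
        ranked = sorted(probs.items(), key=lambda item: -item[1])
        kept = [value for value, p in ranked if p >= threshold]
        if not kept:
            continue  # distribution too flat: do not constrain this attribute
        if len(kept) == 1:
            clauses.append(f"{attribute}={kept[0]}")
        else:
            clauses.append(f"{attribute}=({' OR '.join(kept)})")
    return " AND ".join(clauses)


# Hypothetical model output for the input "sports".
PREDICTED = {
    "brand": {"Nike Performance": 0.8, "other": 0.2},
    "color": {"schwarz": 0.6, "weiß": 0.25, "rot": 0.15},
    "gender": {"damen": 0.25, "herren": 0.25, "kinder": 0.25, "unisex": 0.25},
    "category": {"Sport": 0.45, "Sportbekleidung": 0.35, "Schuhe": 0.2},
}

print(to_filter_query(PREDICTED))
# brand=Nike Performance AND color=schwarz AND category=(Sport OR Sportbekleidung)
```

Because the output is still a filter query, it can be sent to the existing structured index unchanged.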
Character-based RNN with multi-task learning
Input characters: q, u, ..., y
character-embedding → LSTM, LSTM, ..., LSTM (encoder)
encoder output → Fully-connected → Brand class
encoder output → Fully-connected → color class
*Animation
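The architecture in the diagram can be sketched as a dependency-free forward pass: one shared character-level LSTM encoder followed by one fully-connected head per attribute. This is an illustration only. The weights are random and untrained, biases are omitted, the sizes and label sets are hypothetical, and summing the per-task cross-entropies as the multi-task loss is an assumption rather than something stated in the slides.

```python
import math
import random

random.seed(0)

CHARS = "abcdefghijklmnopqrstuvwxyz "
EMBED, HIDDEN = 8, 16                        # hypothetical sizes
BRANDS = ["Nike", "Miss Selfridge", "other"]  # hypothetical label sets
COLORS = ["white", "red", "olive"]


def matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(w * x for w, x in zip(row, v)) for row in m]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def softmax(z):
    top = max(z)
    exps = [math.exp(v - top) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


EMBEDDING = {ch: [random.uniform(-0.1, 0.1) for _ in range(EMBED)] for ch in CHARS}
# One weight matrix per LSTM gate, applied to the concatenation [x_t ; h_{t-1}].
GATES = {g: matrix(HIDDEN, EMBED + HIDDEN) for g in ("input", "forget", "output", "cell")}
BRAND_HEAD = matrix(len(BRANDS), HIDDEN)  # fully-connected layer for the brand task
COLOR_HEAD = matrix(len(COLORS), HIDDEN)  # fully-connected layer for the color task


def encode(query):
    """Run the LSTM over the characters and return the final hidden state."""
    h = [0.0] * HIDDEN
    c = [0.0] * HIDDEN
    for ch in query.lower():
        if ch not in EMBEDDING:
            continue  # skip characters outside the toy alphabet
        xh = EMBEDDING[ch] + h
        i = [sigmoid(v) for v in matvec(GATES["input"], xh)]
        f = [sigmoid(v) for v in matvec(GATES["forget"], xh)]
        o = [sigmoid(v) for v in matvec(GATES["output"], xh)]
        g = [math.tanh(v) for v in matvec(GATES["cell"], xh)]
        c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
    return h


def predict(query):
    """Both heads read the same encoding, which is what makes this multi-task."""
    h = encode(query)
    return {"brand": softmax(matvec(BRAND_HEAD, h)),
            "color": softmax(matvec(COLOR_HEAD, h))}


def multitask_loss(query, brand, color):
    """Assumed objective: sum of the per-task cross-entropy losses."""
    p = predict(query)
    return (-math.log(p["brand"][BRANDS.index(brand)])
            - math.log(p["color"][COLORS.index(color)]))


print(predict("nikke white schoe"))
print(multitask_loss("nikke white schoe", "Nike", "white"))
```

Working on characters rather than words means a misspelling such as "nikke" changes only a few input steps, so no separate spell-correction stage is needed. In practice the same structure would be built and trained with a deep learning framework.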
Character-based RNN with multi-task learning
Inputs: (q, ), (u, ), ..., (y, )
character-embedding → LSTM, LSTM, ..., LSTM (encoder)
encoder output → Fully-connected → Brand class
encoder output → Fully-connected → color class
*Animation
Demo
Encoder-Matcher Architecture
Inputs: (q, ), (u, ), ..., (y, ) → character-embedding → LSTM, LSTM, ..., LSTM (query-encoder)
{brand: "Nike", color: "olive"} → attribute-encoder
image-encoder
matcher → YES/NO
*Animation
Discussion: pros/cons of an end-to-end product search
Classic: Query → ② parsing → Symbolic representation; Product → ① indexing (offline) → Symbolic representation; ③ matching
Deep learning based End2End: Query → deep learning → Latent representation; Product → deep learning (offline) → Latent representation; matching
Pros: scalable, maintainable, data-driven
Cons: needs a lot of data and computational resources
Thanks for your attention!

06.09.2017 Computer Science, Machine Learning & Statistiks Meetup - TOWARDS AN END-TO-END PRODUCT SEARCH SYSTEM
