Introducing "Challenges and research opportunities in
eCommerce search and recommendations"
Speaker: @hurutoriya

Date: 2021-07-07

What & Why this paper?
This paper is a SIGIR Forum article. The authors are the organizers of SIGIR eCom. It
summarizes the research history of the eCommerce search & recommendation domain well.
SIGIR eCom will bring together practitioners and researchers from academia and
industry to discuss the challenges and approaches to product search and
recommendation in eCommerce.
Paper link in Amazon Science

Aspects of eCommerce search and discovery
1. Customer goal
2. Business goal
3. Data logistics
Three eCommerce research areas
1. Matching and ranking
2. Conversational search
3. Fairness, confidentiality and transparency
We focus on matching and ranking in this talk.

Unique points of product search
Product search has two main stakeholders whose interests cooperate but also compete.
1. Customers
Cooperation: Need what businesses offer
Competition: Want to find the best quality at the cheapest price
2. Business owners
Cooperation: Need customer purchases to survive
Competition: Want to maximize profit

Customers
Customers visit eCommerce sites to accomplish a goal.
Goals
1. Simple: e.g. buying a coffee machine
2. Complex: e.g. fixing a hole in the wall
Search queries and interactions → Customer intents → Customer journeys

Query intent
Web search query intents
navigational, informational, transactional
eCommerce search query intents
User Intent, Behaviour, and Perceived Satisfaction in Product Search, WSDM 2018
target finding, decision making, and exploration.
A Taxonomy of Queries for E-commerce Search, SIGIR 2018 by Walmart
shallow exploration, targeted purchase, major-item shopping, minor-item
shopping, and hard-choice shopping

On-site customer journey
Customers journey through a funnel
i. broad queries
ii. refinement queries
iii. examining multiple products before decision making
Returns to Consumer Search: Evidence from eBay, Economics and Computation
2016
A large portion of eCommerce customer journeys are initially exploratory, so
recommendations are valuable.
Search becomes more important once the customer has shaped their view of what
they want.

Global customer journey
The customer journey can span multiple sites and offline interactions
Propose a substitute product system to avoid a zero-hit result
i. Access to knowledge outside of what is available in the catalog
ii. Access to the global state of the customer journey
Leading Conversational Search by Suggesting Useful Questions, WWW 2020

Business
Customer satisfaction is important for business, but it is only one of the many criteria that a
business needs to track towards the goal of optimizing profit.

Sales strategies and short- and long-term effects
cross-selling: Enticing customers to buy additional products
up-selling: Tempting customers to buy a more profitable version of a product
down-selling: Encouraging customers to buy by matching their budget
e.g. A business pushes down-selling to sell items that are lower-quality and cheaper.
short term: earns a profit
long term: may cause customers not to return in the future.
cross-sell approach
backfill the SERP with recommendation results that are related to the search results.

Brand image and inventory.
e.g. Amazon
An interesting challenge in the Fashion Store is the discrepancy between what the
majority of customers actually buy and what they want to see on top of the page. The
item most commonly bought for the query "diamond ring" might be a cheap
zirconium ring. However, if we show the zirconium ring as a first result, our search will
be perceived as broken. Besides, our Fashion Store would look like a flea market,
instead of a classic department store where the latest collections meet you at the
entrance.
To approach this problem, we identify strategic categories of fashionable customers
(customers who bought or added to cart fashion brand products) and
significantly amplify their influence while designing the training set.
Amazon Search: The Joy of Ranking Products, SIGIR 2016

Online marketing and ranking
eCommerce search engines include business logic that reflects marketing decisions
Offline marketing and ranking
eCommerce businesses having both online and physical presences creates a unique
blend of organizational and infrastructure challenges.

Regulatory and business restrictions
Regulatory and business constraints govern which products can be shown to which
customers.
Most eCommerce sites have business logic at the time of checkout
to determine whether a product can be purchased and shipped to a given customer
e.g.
only adults can view or buy certain products.

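The restriction logic above can be sketched as a filter applied before results are shown. This is only an illustration: the Product and Customer fields (min_age, ships_to, country) and the sample catalog are assumptions made for the example, not something the paper specifies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    title: str
    min_age: int = 0  # hypothetical age restriction (0 = unrestricted)
    ships_to: frozenset = frozenset({"JP", "US"})  # hypothetical shipping regions


@dataclass(frozen=True)
class Customer:
    age: int
    country: str


def is_eligible(product, customer):
    """True if the customer may both view and receive the product."""
    return customer.age >= product.min_age and customer.country in product.ships_to


def filter_results(products, customer):
    """Drop products the customer is not allowed to see or cannot be shipped."""
    return [p for p in products if is_eligible(p, customer)]


catalog = [
    Product("coffee machine"),
    Product("red wine", min_age=20),
    Product("desk lamp", ships_to=frozenset({"US"})),
]
visible = filter_results(catalog, Customer(age=18, country="JP"))
```

Here an 18-year-old customer in JP would only see the coffee machine; the same check could also run again at checkout.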
Data logistics
Data plays a key role in product search and recommendations. Services where the
eCommerce website has multiple vendors bring in dynamics with regard to quality and
consistency of the content, fraud detection, and pricing.

Third party content.
Some eCommerce sites such as Amazon, Taobao, and eBay serve as a place for other
companies to sell products.
The data for the third party products may need to be reformatted or supplemented
before indexing.
e.g.
if the brand of the product is not provided as structured data by the vendor, it may be
possible to extract it from the product title

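A minimal sketch of the brand example above, assuming a dictionary of known brands is available. KNOWN_BRANDS and the product dict layout are hypothetical; real systems would likely use a learned extractor rather than dictionary lookup.

```python
import re

KNOWN_BRANDS = ["Acme", "Globex Home", "Initech"]  # hypothetical brand dictionary


def extract_brand(title, brands=KNOWN_BRANDS):
    """Return the longest known brand found in the title as whole words, else None."""
    for brand in sorted(brands, key=len, reverse=True):
        pattern = r"(?<!\w)" + re.escape(brand) + r"(?!\w)"
        if re.search(pattern, title, flags=re.IGNORECASE):
            return brand
    return None


def fill_missing_brand(product):
    """Supplement a vendor record whose structured brand field is empty."""
    if not product.get("brand"):
        return {**product, "brand": extract_brand(product["title"])}
    return product


record = fill_missing_brand({"title": "ACME stainless steel kettle 1.2L", "brand": None})
```

Trying longer brand names first avoids a short brand shadowing a longer one that contains it.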
Volatile inventory
One of the biggest challenges of eCommerce search and recommendation is that the
inventory is constantly changing.
e.g. eBay
new items need to be added quickly to the index.
offline store inventory and online inventory must be synced in real time
Query suggestions are also affected by volatile inventory as they may suggest queries
that no longer return results, creating a frustrating user experience.
The Architecture of eBay Search, SIGIR eCom 2017.

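One way to picture the query suggestion problem above is to re-check each suggestion against the current inventory before showing it. The inventory records and the naive all-terms match below are assumptions for illustration only.

```python
def in_stock_count(query, inventory):
    """Count in-stock items whose title contains every query term."""
    terms = query.lower().split()
    return sum(
        1
        for item in inventory
        if item["stock"] > 0
        and all(t in item["title"].lower().split() for t in terms)
    )


def live_suggestions(suggestions, inventory):
    """Keep only suggestions that would still return at least one result."""
    return [q for q in suggestions if in_stock_count(q, inventory) > 0]


inventory = [  # hypothetical snapshot of the catalog
    {"title": "red running shoes", "stock": 3},
    {"title": "blue running shoes", "stock": 0},
]
suggestions = ["red shoes", "blue shoes", "running shoes"]
shown = live_suggestions(suggestions, inventory)
```

In this snapshot "blue shoes" is dropped because its only match is out of stock.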
Multi-modal documents
In eCommerce search, the indexed documents, i.e., the products customers are looking for,
are combinations of images,
unstructured text such as titles, descriptions, and reviews, and
structured data such as price, brand, ratings, and seller location

eCommerce research area deep dives
1. Design of matching and ranking for eCommerce search.
2. Deep dive into conversational eCommerce, which has the promise to enable the smooth
shopping experience provided by expert shop assistants.
3. Issues of fairness, confidentiality and transparency, which are at the heart
of maintaining customer trust while providing personalized eCommerce experiences.

1. Relevance: Matching and ranking

Matching
Navigational queries (e.g. a serial number)
Need exact matches to product serial numbers, product titles or category names
Long informational queries (e.g. "are batteries included with this watch")
Need semantic parsing and more elaborate indexing before they can be
answered
Some queries may require a different user interface; for example, a tabular layout is
better for answering comparison queries

Relevance
Origin of the definition: Relevance was considered a universal, dimensionless quantity
Now: Not universal but instead user dependent
eCommerce relevance is context-dependent and has four dimensions
1. customer
2. time
3. query
4. context (e.g. category)

Matching queries and products
eCommerce search is as much about exploration as it is about finding the best exact
match
may need careful crafting of synonyms to match a customer's vocabulary to that of
the business
A Taxonomy of Queries for E-commerce Search, SIGIR 2018 by Walmart /
Why Do People Buy Seemingly Irrelevant Items in Voice Product Search?,
WSDM 2020 by Amazon
As in all types of search, tokenization (including word breaking, decompounding, and
punctuation handling), lemmatization or stemming, and stop word identification are
important for identifying relevant products
Removing the vocabulary gap is a challenging research topic.
Remedies against the Vocabulary Gap in Information Retrieval

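The preprocessing steps named above (tokenization, stemming, stop word removal, synonym mapping) can be sketched with the standard library alone. The stop word list, the synonym table, and the crude plural stripper are all simplifying assumptions; a production system would use a proper analyzer.

```python
import re

STOPWORDS = {"a", "an", "the", "for", "with"}  # tiny illustrative list
SYNONYMS = {"sneakers": "shoes", "tee": "t-shirt", "tees": "t-shirt"}  # hypothetical


def tokenize(text):
    """Lowercase and keep alphanumeric runs; hyphens inside a word are preserved."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower())


def stem(token):
    """Crude plural stripping, standing in for a real stemmer or lemmatizer."""
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def normalize(text):
    """Tokenize, drop stop words, map synonyms to catalog vocabulary, then stem."""
    out = []
    for tok in tokenize(text):
        if tok in STOPWORDS:
            continue
        out.append(stem(SYNONYMS.get(tok, tok)))
    return out


query_terms = normalize("Sneakers for the kids")
title_terms = normalize("Kids' shoes")
```

Applying the same normalize function to both queries and product titles is what lets "sneakers" in a query meet "shoes" in the catalog.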
Query understanding
Pseudo-relevance feedback
The Impact of Query Suggestion in E-Commerce Websites
Query click graphs
Context Aware Query Suggestion by Mining Click-through and Session Data
Exploiting query reformulations for web search result diversification, WWW 2010
Mining E-Commerce Query Relations using Customer Interaction Networks,
WWW 2018

Query understanding
Word embeddings
Query Expansion Using Word Embeddings, CIKM 2016
Multi-modal methods that combine text and visual cues
ViTOR: Learning to Rank Webpages Based on Visual Features, WWW 2019
Improving Outfit Recommendation with Co-supervision of Fashion Generation,
WWW 2019

Query intent engines
parse the query to extract catalog-specific attributes
Learning Query Intent from Regularized Click Graphs, SIGIR 2008
JointMap: Joint Query Intent Understanding For Modeling Intent Hierarchies in E-
commerce Search, WWW 2019
e.g.
the query "red sneakers" is converted to...
{"color":"red", "shoe type":"running", "category":"shoes"}

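The simplest form of such an engine, matching query terms against predefined attribute values, might look like the sketch below. ATTRIBUTE_VALUES and CATEGORY_ALIASES are hypothetical vocabularies; this dictionary lookup only recovers color and category and does not infer attributes such as the shoe type shown in the example above.

```python
ATTRIBUTE_VALUES = {  # hypothetical catalog vocabulary
    "color": {"red", "blue", "black"},
    "category": {"shoes", "shirts"},
}
CATEGORY_ALIASES = {"sneakers": "shoes", "t-shirts": "shirts"}  # hypothetical


def parse_query(query):
    """Map each query token to a catalog attribute; keep the rest as keywords."""
    intent, leftover = {}, []
    for token in query.lower().split():
        normalized = CATEGORY_ALIASES.get(token, token)
        for attribute, values in ATTRIBUTE_VALUES.items():
            if normalized in values:
                intent[attribute] = normalized
                break
        else:
            # No attribute matched this token, so fall back to free text.
            leftover.append(token)
    if leftover:
        intent["keywords"] = leftover
    return intent


structured = parse_query("red sneakers")
```

Unmatched tokens are kept as keywords so the structured query can still fall back to ordinary text matching.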
Query intent engines
Approaches range from simple matching of query terms to a predefined set of product
attributes to more elaborate semantic methods
Query Understanding through Knowledge-Based Conceptualization
Semantic Query Understanding
Deeper Text Understanding for IR with Contextual Neural Language Modeling
The ultimate goal of a query intent engine is to return structured, personalized queries for all
customer queries.

Ranking
How to rank the results shown to customers is one of the most complex issues in
eCommerce.
Practitioners have put effort into deriving a single ranking function that
mixes boolean or tf.idf-based ranking algorithms with other signals, such as recency
or popularity
e.g.
Query: "striped t-shirts"
May rank striped products other than t-shirts highly,
since the IDF score of "striped" is higher than that of "t-shirts"
The number of signals is increasing to improve the ranking but...

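The "striped t-shirts" effect above can be reproduced on a toy catalog. The five titles are made up, and the scorer sums IDF over matched query terms (term presence only, no term frequency or length normalization), which is a simplification of tf.idf.

```python
import math

titles = [  # hypothetical mini catalog
    "striped t-shirts cotton",
    "plain t-shirts cotton",
    "graphic t-shirts",
    "striped socks",
    "plain t-shirts pack",
]
docs = [t.split() for t in titles]


def idf(term):
    """Inverse document frequency; assumes the term occurs in at least one title."""
    df = sum(1 for d in docs if term in d)
    return math.log(len(docs) / df)


def score(query, doc):
    """Sum the IDF of every query term present in the document."""
    return sum(idf(t) for t in query.split() if t in doc)


query = "striped t-shirts"
ranked = sorted(titles, key=lambda t: score(query, t.split()), reverse=True)
```

Because "striped" is rarer than "t-shirts" in this catalog, "striped socks" lands directly below the exact match and above every plain t-shirt.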
Extending the product representation.
Documents have many features beyond how closely they match the query terms
how many times they have been purchased
how many times they have been clicked
the ratio of clicks versus purchases

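These behavioural features could be derived from interaction logs roughly as follows. ProductStats and the additive smoothing constant are assumptions; the smoothing is only there so that products with no clicks yet still get a defined ratio.

```python
from dataclasses import dataclass


@dataclass
class ProductStats:
    clicks: int
    purchases: int


def behavioural_features(stats, smoothing=1.0):
    """Return purchase count, click count, and a smoothed purchases-per-click ratio."""
    ratio = (stats.purchases + smoothing) / (stats.clicks + 2 * smoothing)
    return {
        "purchases": stats.purchases,
        "clicks": stats.clicks,
        "purchase_per_click": ratio,
    }


features = behavioural_features(ProductStats(clicks=98, purchases=9))
new_item = behavioural_features(ProductStats(clicks=0, purchases=0))
```

A brand-new product falls back to a neutral ratio instead of dividing by zero.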
Ranking signals and optimization criteria
eCommerce search and recommendation systems must optimize for multiple criteria
Multi-objective ranking optimization for product search using stochastic label
aggregation, WWW 2020 by Amazon
e.g. one criterion encoding customer preferences and one encoding business preferences.

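As a rough illustration of combining two criteria, the sketch below draws one label per training example at random according to fixed weights, so that a single-label ranker can then be trained on the result. This is one reading of "stochastic label aggregation" and is not taken from the cited paper's code; the example layout, label names, and weights are all assumptions.

```python
import random


def aggregate_labels(examples, weights, seed=0):
    """For each example, sample which label to keep according to `weights`."""
    rng = random.Random(seed)
    names = list(weights)
    probs = [weights[n] for n in names]
    aggregated = []
    for ex in examples:
        chosen = rng.choices(names, weights=probs, k=1)[0]
        aggregated.append(
            {"features": ex["features"], "label": ex["labels"][chosen], "source": chosen}
        )
    return aggregated


examples = [  # hypothetical training examples with two labels each
    {"features": [0.2, 0.7], "labels": {"customer": 1, "business": 0}},
    {"features": [0.9, 0.1], "labels": {"customer": 0, "business": 1}},
]
training_set = aggregate_labels(examples, {"customer": 0.8, "business": 0.2})
```

Changing the weights shifts the balance between customer and business preferences without changing the downstream learner.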
Ranking signals and optimization criteria
Customer satisfaction is measured over multiple signals.
Tutorial on Online User Engagement: Metrics and Optimization, WWW 2019
click-through rate, hover and dwell time, satisfied clicks, query reformulations,
session length, number of queries before checkout, add-to-baskets, purchases,
time-to-next-visit, product returns, and calls to customer service
Business success is measured over several KPIs
inventory-oriented measures, revenue-oriented measures, profit-oriented
measures, visitor-oriented measures, basket-oriented measures

Not all signals are equal
Objective functions over multiple signals can bias towards more abundant signals.
e.g. a purchase is a more explicit preference indicator than a click, but it is much less
frequent. A purchase that was not returned is a stronger signal than a purchase, but again is
less frequent
Objective functions should take into account this difference in signal strength
versus signal abundance
Tip: normalizing each signal's volume is one solution
News Comments: Exploring, Modeling, and Online Prediction, ECIR 2010

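A small sketch of the normalization tip above: each signal count is divided by that signal's site-wide total before weighting, so abundant clicks do not drown out rare purchases. The totals, strength weights, and product counts are invented for the example.

```python
def raw_score(counts):
    """Naive objective: add up all signal counts regardless of type."""
    return sum(counts.values())


def normalized_score(counts, totals, strength):
    """Weight each signal by its strength after dividing by its overall volume."""
    return sum(strength[s] * counts.get(s, 0) / totals[s] for s in strength)


totals = {"click": 10000, "purchase": 500, "kept_purchase": 400}  # hypothetical volumes
strength = {"click": 1.0, "purchase": 2.0, "kept_purchase": 3.0}  # hypothetical weights

product_a = {"click": 200, "purchase": 2, "kept_purchase": 0}
product_b = {"click": 50, "purchase": 10, "kept_purchase": 9}

a_norm = normalized_score(product_a, totals, strength)
b_norm = normalized_score(product_b, totals, strength)
```

On raw counts product A wins on clicks alone, while after normalization product B, with more kept purchases, scores higher.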
Positive, negative, and delayed feedback loops
Creating a feedback loop is a reinforcement learning paradigm
Reinforcement Learning to Rank in E-Commerce Search Engine: Formalization,
Analysis, and Application, KDD 2018 by Alibaba
Longer feedback loops, where the feedback occurs well after the system
has shown results to the user, lead to the
delayed feedback and cold-start problems.
Learning Latent Vector Spaces for Product Search, CIKM 2016
On Application of Learning to Rank for E-Commerce Search, SIGIR 2017
A Comparison of Counterfactual and Online Learning to Rank from User Interactions,
SIGIR 2019

Practical limitations of Learning to Rank
Most eCommerce search engines based on LtR work in two steps.
1. recall-oriented step
2. precision-oriented step
This implementation of LtR has proven to be effective in terms of IR and business metrics
Promoting Relevant Results in Time-Ranked Mail Search, WWW 2017
Learning to Rank for Freshness and Relevance, SIGIR 2011
On Application of Learning to Rank for E-Commerce Search, SIGIR 2017

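The two steps above can be sketched as a cheap candidate retrieval followed by a more careful re-ranking. In this illustration the second step uses hand-set weights as a stand-in for a learned LtR model; the catalog, the popularity field, and the weights are assumptions.

```python
def recall_step(query, catalog, limit=100):
    """Recall-oriented: keep any product sharing at least one term with the query."""
    terms = set(query.lower().split())
    hits = [p for p in catalog if terms & set(p["title"].lower().split())]
    return hits[:limit]


def precision_step(query, candidates, top_k=10):
    """Precision-oriented: re-rank the candidates with a richer scoring function."""
    terms = set(query.lower().split())

    def score(product):
        overlap = len(terms & set(product["title"].lower().split())) / len(terms)
        # Hypothetical hand-set weights standing in for a learned ranking model.
        return 0.7 * overlap + 0.3 * product["popularity"]

    return sorted(candidates, key=score, reverse=True)[:top_k]


def search(query, catalog):
    return precision_step(query, recall_step(query, catalog))


catalog = [  # hypothetical products with a popularity signal in [0, 1]
    {"title": "striped socks", "popularity": 0.9},
    {"title": "striped t-shirts", "popularity": 0.4},
    {"title": "plain t-shirts", "popularity": 0.8},
    {"title": "coffee machine", "popularity": 1.0},
]
results = search("striped t-shirts", catalog)
```

Only the small candidate set reaches the expensive scorer, which is the practical reason for splitting the work in two.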
Practical limitations of Learning to Rank
Challenges in LtR
1. broad exploratory queries
2. LtR's issue with the discontinuity in the usefulness of the SERP
