We want to talk about the art and science of food discovery at Swiggy: how we use advanced machine learning/AI on terabytes of data (implicit and explicit feedback) every day to bring you the recommendations that power restaurant feeds, filter widgets, and personalized collections.
We will also talk about our journey, learnings, and challenges in building a food recommendation system.
Today I am going to talk about our experiments with food discovery and recommendation at Swiggy.
I will talk about the major challenges and share some interesting insights and lessons learned in the process.
Now let’s take a minute to understand what Swiggy is. Can we please have a quick show of hands: how many of you actually use Swiggy?
So people have some idea of what Swiggy does.
Swiggy is a three-way marketplace with millions of customers, thousands of restaurant partners, and delivery partners.
These are the pillars of the ecosystem, and any problem you solve is going to impact all of them.
When you open the Swiggy app, we need to come up with the best recommendations from all the serviceable and available options, so there are problems related to relevance, personalization, search, and discovery. The moment you have figured out what you want to eat and placed the order, the assignment of that order, batching, and ETA prediction are all equally difficult problems.
Formally, a recommendation system problem is to come up with a utility function that predicts a customer’s likelihood of purchasing each item in a list; the items could be movies in the case of Netflix, music in the case of Spotify, and so on.
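Written out, this could look as follows. The symbols are illustrative and not from the talk: C is the set of customers, I the set of items, and the utility is assumed to be a purchase probability.

```latex
\begin{aligned}
u &: C \times I \to [0, 1], \qquad u(c, i) \approx P(\text{purchase} \mid c, i) \\
\mathrm{rec}_k(c) &= \text{the } k \text{ items } i \in I \text{ with the largest } u(c, i)
\end{aligned}
```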
We currently have two views of the world.
One is a restaurant-centric view, where we have the restaurant feed plus some personalized collections and smart filters. This contributes ~60% of the orders.
At the same time, we are venturing into a dish/item-first view; we recently released Dish Discovery in Bangalore. The idea is to predict dish themes and show you personalized collections based on those themes.
If you show a story like this to a customer who loves pizza, you have removed a lot of friction from the journey. They can tap on the story and get all the best pizzas catering to their taste. This is a great experience.
We also have a few recommendations on the menu and a few on the cart, like cross-sell and meal completion. Cross-sell is smart enough to understand what you have in the cart: if you have a main course, it will recommend a dessert or a drink.
Secret ingredient
Now, if you abstract out the idea of any recommendation system, it comes down to matchmaking between the customer and the items you sell.
The secret recipe of any good recommendation system is to understand your customer well. In our case, we want to know what kind of food, dish, or cuisine they like, whether they are vegetarian, and whether they try different restaurants or generally order from the same ones.
The same goes for understanding the catalog. Our catalog is unstructured, so we need a way to understand more about the restaurants and the food taxonomy.
We want to understand the taste profile of a restaurant: what cuisines does it serve? Is it costly? Is it healthy?
The big question could still be: what is so niche about this problem? The answer lies here: recommendations can’t be looked at without these factors.
Swiggy is a hyperlocal business, which adds a lot more complexity to the problem. We are constrained by our supply and availability, and we need to balance demand and supply.
Most world-class recommendation systems don’t suffer from these problems as much as we do. Think about the best recommenders, like Amazon or Netflix: they have a more or less static supply and do not deal with all these dimensions of a hyperlocal business such as Swiggy.
If you are looking for pizza options and we are constrained by serviceability, we still want to show the possible alternatives.
In food delivery people do care about speed, so we need to give some weight to the faster options.
You also want to consider the stress at the restaurant (a real-time model) when you are recommending.
If we do not consider these factors, our recommendation systems will be suboptimal.
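To make the hyperlocal factors above concrete, here is a minimal sketch of one way they could be folded into ranking. Everything in it is an assumption for illustration, not Swiggy’s actual scoring: the `Candidate` fields, the linear blend, and the weights `w_speed` and `w_stress` are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    relevance: float    # model's predicted likelihood of purchase, 0..1
    eta_minutes: float  # predicted delivery time
    stress: float       # real-time restaurant load, 0 (idle) .. 1 (overloaded)
    serviceable: bool   # can we deliver from here to this customer right now?

def hyperlocal_score(c, w_speed=0.2, w_stress=0.2, max_eta=60.0):
    # Hypothetical linear blend: reward faster options, penalize stressed restaurants.
    speed = max(0.0, 1.0 - c.eta_minutes / max_eta)
    return c.relevance + w_speed * speed - w_stress * c.stress

def rank(candidates):
    # Serviceability is a hard constraint; the other factors only reorder.
    live = [c for c in candidates if c.serviceable]
    return sorted(live, key=hyperlocal_score, reverse=True)

feed = rank([
    Candidate("pizza_far", 0.90, 55, 0.1, True),
    Candidate("pizza_near", 0.90, 20, 0.1, True),
    Candidate("pizza_unserviceable", 0.95, 15, 0.0, False),
    Candidate("pasta_busy", 0.90, 20, 0.9, True),
])
print([c.name for c in feed])
```

With equal relevance, the nearer restaurant moves up, the overloaded one moves down, and the unserviceable one never appears, however relevant it is.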
These are some other factors to consider when designing a recsys.
Diversity
Consider this: I know you are a biryani person. Shall we show only biryani restaurants in your feed?
That’s just too much exploitation. We need to provide some diversity to give you different choices, and it’s proven that this adds value.
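One simple way to get this kind of diversity is a greedy re-rank that discounts a restaurant each time its cuisine has already been shown. This is only a sketch under assumed inputs: the scores, the cuisine labels, and the `penalty` value are made up for illustration, and the talk does not say which diversification method is used.

```python
def diversify(scores, cuisine_of, penalty=0.3):
    """Greedy re-rank: an item loses `penalty` for every earlier item of the same cuisine."""
    remaining = dict(scores)  # restaurant -> relevance score
    shown = {}                # cuisine -> how many times it is already in the feed
    out = []
    while remaining:
        best = max(
            remaining,
            key=lambda r: remaining[r] - penalty * shown.get(cuisine_of[r], 0),
        )
        out.append(best)
        shown[cuisine_of[best]] = shown.get(cuisine_of[best], 0) + 1
        del remaining[best]
    return out

scores = {"biryani_a": 0.90, "biryani_b": 0.85, "biryani_c": 0.80,
          "pizza_a": 0.70, "dosa_a": 0.60}
cuisine_of = {"biryani_a": "biryani", "biryani_b": "biryani", "biryani_c": "biryani",
              "pizza_a": "pizza", "dosa_a": "south_indian"}
feed = diversify(scores, cuisine_of)
print(feed)
```

Sorted purely by score, the top three slots would all be biryani; with the penalty, the best biryani place still leads but other cuisines get a look-in before the second one.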
Repeat vs. discovery is a very similar kind of choice.
We know the top 10 restaurants that you generally order from. Is it a good recommendation to show just these?
Oh yes!
In general, recsys (especially content engines) tend to ignore repeats. You want people to watch new movies, read new articles, listen to new songs.
In Swiggy’s case, though, people tend to order from the same restaurants over and over again,
so we need a balance between repeat and explore.
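A minimal sketch of such a balance: reserve a share of the feed for repeat restaurants and fill the rest with discovery candidates. The function name, the `repeat_share` parameter, and the slotting rule are assumptions for illustration only.

```python
def blend_feed(repeat, discovery, size=10, repeat_share=0.4):
    """Interleave repeat and discovery lists, keeping roughly `repeat_share` repeats."""
    repeat = list(repeat)
    repeat_set = set(repeat)
    # A restaurant the customer already orders from is not a discovery.
    discovery = [r for r in discovery if r not in repeat_set]
    feed, n_repeat = [], 0
    while len(feed) < size and (repeat or discovery):
        # Place a repeat whenever the feed is below the target share.
        want_repeat = n_repeat < repeat_share * (len(feed) + 1)
        if repeat and (want_repeat or not discovery):
            feed.append(repeat.pop(0))
            n_repeat += 1
        else:
            feed.append(discovery.pop(0))
    return feed

feed = blend_feed(["r1", "r2", "r3"], ["d1", "d2", "r1", "d3", "d4"], size=5)
print(feed)
```

Setting `repeat_share` to 1.0 gives the "just show my top 10" feed, and 0.0 gives the pure content-engine behaviour that ignores repeats.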
Now this is a crucial choice, and kind of the definition of discovery:
Search vs. discovery
Search is an explicit query. Given your historic profile, I can show you the items you may like.
But what about a serendipitous item, one you didn’t even know existed, finding you?
That is the real discovery!
It’s easy to show you all the desserts you liked in the past; a unique Indian dessert delicacy finding you is the real discovery.
Let’s quickly talk about the datasets available to build all these recommendation systems.
There are two major philosophies in recommendation systems:
people similar to you (some form of collaborative filtering, or CF, method), and
people generally liking similar items (content-based methods).
We also started with a CF-based approach. For these methods you need to build a customer-item (restaurant) matrix. You can build it from explicit data, but it has been seen that you are better off using implicit data, which is available in large volumes.
We are using matrix-factorization-based methods like ALS and SVD, and L2R (WARP).
Generally these methods are defined for explicit rating prediction; in a classic paper, Steffen Rendle et al. defined some of these methods for implicit feedback.
Now, these methods are a good start because you do not need to understand your items.
On the other hand, this approach
is kind of biased towards popularity, and
suffers from cold start.
Think about a new customer: they will not get a meaningful recommendation out of this. The same goes for a new restaurant, which will start at a very low position and stay there.
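To illustrate matrix factorization on implicit feedback, here is a small pure-Python sketch of Bayesian Personalized Ranking (BPR), a pairwise method from Rendle et al. It is an assumption that this is the kind of method the talk refers to; the toy data, the hyperparameters, and the names `train_bpr` and `score` are hypothetical, and a production system would use a library rather than this loop.

```python
import math
import random

def train_bpr(orders, n_users, n_items, dim=8, lr=0.05, reg=0.01,
              steps=20000, seed=0):
    """Learn user/item factors so that ordered items score above unordered ones."""
    rng = random.Random(seed)
    U = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_users)]
    V = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(n_items)]
    seen = {}
    for u, i in orders:
        seen.setdefault(u, set()).add(i)
    # Only users with at least one unordered item can supply a training pair.
    pos = {u: sorted(s) for u, s in seen.items() if len(s) < n_items}
    users = sorted(pos)
    for _ in range(steps):
        u = rng.choice(users)
        i = rng.choice(pos[u])        # an item the user ordered
        j = rng.randrange(n_items)    # sample an item the user did not order
        while j in seen[u]:
            j = rng.randrange(n_items)
        x = sum(a * (b - c) for a, b, c in zip(U[u], V[i], V[j]))
        g = 1.0 / (1.0 + math.exp(min(x, 35.0)))  # sigmoid(-x), clamped for safety
        for f in range(dim):
            uf, vi, vj = U[u][f], V[i][f], V[j][f]
            U[u][f] += lr * (g * (vi - vj) - reg * uf)
            V[i][f] += lr * (g * uf - reg * vi)
            V[j][f] += lr * (-g * uf - reg * vj)
    return U, V

def score(U, V, u, i):
    return sum(a * b for a, b in zip(U[u], V[i]))

# Toy implicit data: (customer, restaurant) pairs taken from order history.
orders = [(0, 0), (0, 1), (1, 0), (1, 1), (2, 2), (2, 3), (3, 2), (3, 3)]
U, V = train_bpr(orders, n_users=4, n_items=4)
print(score(U, V, 0, 0) > score(U, V, 0, 2))
```

Note that a brand-new customer or restaurant has no pairs to learn from and keeps its random initial vector, which is exactly the cold-start problem described above.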
To solve some of these, people go for content-based methods, which address cold start and the long tail.
What is content, and what is similarity? Generally similarity is computed on the metadata provided for items: for movies it will be genres, for news articles it will be the content, title, and so on.
You build a vector out of these and calculate some similarity.
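As a minimal sketch of that last step, the snippet below turns metadata tags into bag-of-tags vectors and compares them with cosine similarity. The restaurants and tags are invented, and cosine is just one common choice of similarity.

```python
import math
from collections import Counter

def tag_vector(tags):
    # Bag-of-tags vector: tag -> count.
    return Counter(tags)

def cosine(a, b):
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

pizza_place_a = tag_vector(["pizza", "italian", "fast food"])
pizza_place_b = tag_vector(["pizza", "fast food"])
dosa_place = tag_vector(["south indian", "veg"])

print(cosine(pizza_place_a, pizza_place_b))  # high: two shared tags
print(cosine(pizza_place_a, dosa_place))     # zero: no shared tags
```

Because the vectors come from metadata rather than order history, a new restaurant gets a usable vector on day one.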
Similarly, to understand our restaurants we could have used the data they provide, like menu, item categorization, and primary tags, but we realized:
1. This is not a standard taxonomy.
2. In many cases, the order patterns in online food delivery differ from the general understanding of the restaurant.
Some call themselves a cafe but are actually a sandwich place, and a few restaurants call themselves multi-cuisine.
We took a different path and built this understanding of restaurants using ordered items as a proxy for metadata: we treated the items ordered from a restaurant as a dummy document, ran LDA on these documents, and got something like this.
Menus are not consistent at the item level either: someone calls their pizza “pizza”, someone else lists it under “main course”. Some call themselves a pizza place but are actually an Italian place, and some, like A2B, call themselves multi-cuisine; it is, but really?
In this visualization we are showing all the topic themes, and it came out surprisingly well.
These are the major topics, or food themes, that exist on Swiggy. You can see how nicely it separates veg vs. non-veg, and these desserts. If you zoom in to a topic, it shows what kind of words it represents and hence what food theme it is.
Any restaurant can be projected into these dimensions, and you get a taste profile vector.
Some examples of similarity based on this are here. We clearly see how similar McD is to Burger King. So if you have already ordered from Truffle, there is a good chance that you will see bundar in your feed list.
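The idea of treating each restaurant’s ordered items as a document and reading off a taste profile can be sketched as follows. The talk does not say which LDA implementation was used; this is a tiny collapsed Gibbs sampler written for illustration, and the restaurants, items, topic count, and `alpha`/`beta` priors are all made up. In practice a library implementation would normally be used.

```python
import random

def lda_gibbs(docs, n_topics, n_iters=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA. `docs` is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)
    doc_topic = [[0] * n_topics for _ in docs]             # tokens per (doc, topic)
    topic_word = [[0] * n_words for _ in range(n_topics)]  # tokens per (topic, word)
    topic_total = [0] * n_topics
    assign = []
    for d, doc in enumerate(docs):                         # random initial topics
        row = []
        for w in doc:
            k = rng.randrange(n_topics)
            row.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assign.append(row)
    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for pos, w in enumerate(doc):
                wid, k = word_id[w], assign[d][pos]
                # Remove the token, then resample its topic from the conditional.
                doc_topic[d][k] -= 1
                topic_word[k][wid] -= 1
                topic_total[k] -= 1
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][wid] + beta)
                    / (topic_total[t] + n_words * beta)
                    for t in range(n_topics)
                ]
                k = rng.choices(range(n_topics), weights=weights)[0]
                assign[d][pos] = k
                doc_topic[d][k] += 1
                topic_word[k][wid] += 1
                topic_total[k] += 1
    # Smoothed per-document topic mix: this is the "taste profile" vector.
    profiles = [
        [(c + alpha) / (len(doc) + n_topics * alpha) for c in counts]
        for counts, doc in zip(doc_topic, docs)
    ]
    return profiles, topic_word, vocab

# Each "document" is the list of items ordered from one restaurant.
restaurants = {
    "veg_a": ["paneer tikka", "dal makhni", "naan", "paneer tikka", "naan"],
    "veg_b": ["dal makhni", "naan", "paneer tikka", "dal makhni"],
    "burger_a": ["chicken burger", "fries", "coke", "chicken burger"],
    "burger_b": ["fries", "coke", "chicken burger", "fries"],
}
profiles, topic_word, vocab = lda_gibbs(list(restaurants.values()), n_topics=2)
for name, profile in zip(restaurants, profiles):
    print(name, [round(p, 2) for p in profile])
```

Each printed row is a taste profile that sums to 1 across topics; two restaurants can then be compared with any vector similarity, such as cosine.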
This kind of solves restaurant-first recommendation, but to get an item-first view (Dish Discovery) we need a standard categorization (enrichments). Our item taxonomy is still non-standard, so we standardize it with the help of a few food experts.
We have some metadata on the food items, like name, description, and recipe, and when we did basic text/image classification we saw some nice results.
We also did some early experiments with word embeddings. We took orders as a proxy for the basic unit and ran Word2vec on them.
Again, quite interestingly, it lists all the pastas when you look for “pasta”; it captures conceptually very similar items, spelling variations, and so on.
There is also an example of how dal makhni, dal Bukhara, and kali dal are conceptually very similar.
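The intuition behind this can be shown without Word2vec itself. The sketch below is a simpler count-based stand-in, not the model from the talk: it assumes each order is a "sentence" of item names, represents every item by what it co-occurs with, and compares items by cosine similarity. The orders are invented.

```python
import math
from itertools import combinations

def cooccurrence_vectors(orders):
    """Represent each item by counts of the other items it is ordered with."""
    vecs = {}
    for order in orders:
        for a, b in combinations(sorted(set(order)), 2):
            vecs.setdefault(a, {})
            vecs.setdefault(b, {})
            vecs[a][b] = vecs[a].get(b, 0) + 1
            vecs[b][a] = vecs[b].get(a, 0) + 1
    return vecs

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def most_similar(item, vecs, topn=2):
    target = vecs.get(item, {})
    scored = [(other, cosine(target, vec)) for other, vec in vecs.items() if other != item]
    return sorted(scored, key=lambda pair: -pair[1])[:topn]

orders = [
    ["dal makhni", "naan", "jeera rice"],
    ["dal bukhara", "naan", "jeera rice"],
    ["kali dal", "naan", "jeera rice"],
    ["penne pasta", "garlic bread", "coke"],
    ["alfredo pasta", "garlic bread", "coke"],
]
vecs = cooccurrence_vectors(orders)
print(most_similar("dal makhni", vecs))
```

The three dals never appear in the same order, yet they come out as nearest neighbours because they are eaten with the same things. Word2vec learns dense vectors from the same kind of context signal.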
So this is a summary of all the work done and in the pipeline.
We started with matrix factorization methods, we have content-based methods, and we have done a few experiments with hybrids of CF and content.
We also tried learning to rank, adding content as features and using multiple objectives.
We also have a few experiments with deep-learning-based methods, like item embeddings.
Similarly, Swiggy recommendations are moving from just the home feed to
non-linear dish discovery, to
generating an entire app filled with different collections, and to
adding real-time context and changing the recommendations accordingly.