How does iFood use recommendation to
improve the food delivery experience?!
Luiz Mendes & Renan Oliveira
Luiz Mendes
Lead Data Scientist
Food Graph Project
From Belo Horizonte
@lfmendes
https://www.linkedin.com/in/lfomendes/
Renan Oliveira
Principal Scientist
Recommendation & Search
From Rio
@renan_oliveira
iFood Brain
AI + DA + DE
150 people
40 squads
5 countries
IFOOD IN NUMBERS
40 million orders
200k restaurants
150k drivers
1000 cities
*these numbers are only from Brazil
A UNIQUE MOMENT - YOU CAN'T FAST-FORWARD OR SKIP
Time and Quality
Decision process
Special occasion
We are not a streaming service.
A food recommendation has to be right
the first time, because fixing a mistake
is not like skipping a song or a video:
placing an order involves more money
and a logistics operation.
FOOD IS A VERY PERSONAL CHOICE
Taste Profile
Speed
Brand
Price Affinity
Offer Affinity
etc…
Taste Profile
Dish
Cuisine
Offers
Rating
etc…
Match
CHALLENGES
Locality: geographical constraints for model training
Speed: if you are hungry, you want to eat soon
Serviceability: production capacity and restaurant quality
Feedback: implicit (engagement) vs explicit (ratings)
Growth: cold start problem
EVERYWHERE CAN BE PERSONALIZED
Push Notification
List of Restaurants
List of Dishes
Search Results
UI Components
PILLARS OF IFOOD 1:1
User Profile: Which cuisine does she like most?
Context: How would the offer change on weekends?
Journey: If a user ordered "n" yesterday, what are the best "y" to offer today?
Dish Profile: Healthy? Low-calorie? Allergy risk?
Candunga
Collaborative Filtering
for Lists
[Figure: rating matrix with missing entries (?), labelled ASH, GOKU and SEIYA]
5 2 2
? 3 4
5 3 ?
ALS WITH MULTIPLE MATRICES
.9 .7
.5 .7
.5 .3
U1
U2
U3
Main Matrix
Customized
Discovery
The classic ALS approach (user x item
with implicit feedback) worked very
well for recurring users. Conversion
was weaker for users without a long
purchase history and for new items.
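The classic approach mentioned on this slide can be sketched as follows. This is a minimal NumPy sketch of ALS on implicit feedback using the usual confidence-weighted formulation, not iFood's production code. The toy order counts and the values of `alpha`, `reg`, `factors` and `iters` are assumptions chosen only for illustration.

```python
import numpy as np

def implicit_als(R, factors=2, alpha=40.0, reg=0.1, iters=10, seed=0):
    """Alternating least squares on implicit feedback (e.g. order counts).

    Returns user factors X, item factors Y and the loss after each iteration.
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = (R > 0).astype(float)      # preference: did the user order the item?
    C = 1.0 + alpha * R            # confidence grows with the number of orders
    X = rng.normal(scale=0.1, size=(n_users, factors))
    Y = rng.normal(scale=0.1, size=(n_items, factors))
    ridge = reg * np.eye(factors)
    losses = []
    for _ in range(iters):
        # Fix the item factors and solve a small ridge regression per user.
        for u in range(n_users):
            Cu = np.diag(C[u])
            X[u] = np.linalg.solve(Y.T @ Cu @ Y + ridge, Y.T @ Cu @ P[u])
        # Fix the user factors and do the same per item.
        for i in range(n_items):
            Ci = np.diag(C[:, i])
            Y[i] = np.linalg.solve(X.T @ Ci @ X + ridge, X.T @ Ci @ P[:, i])
        error = np.sum(C * (P - X @ Y.T) ** 2)
        losses.append(float(error + reg * (np.sum(X ** 2) + np.sum(Y ** 2))))
    return X, Y, losses

# Hypothetical order counts: 3 users x 3 restaurants.
R = np.array([[3, 0, 1],
              [0, 2, 0],
              [4, 0, 0]], dtype=float)
X, Y, losses = implicit_als(R)
scores = X @ Y.T   # higher score = stronger predicted preference
```

A user with no orders has an all-zero row in `R`, so the solve returns a zero vector and every score is zero. That is one way to see why this setup struggles with new users and new items.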
ALS WITH MULTIPLE MATRICES
.35 .50 .28
.29 .61 .32
.42 .18
C1
C2
C3
"Clusters" Matrix
Group of users
Implicit and explicit feedback
We can use any known feature of the
user or the item to build a hybrid of
content-based and collaborative
filtering. Because the matrix is indexed
by a group of users rather than by
individual users, it becomes less sparse.
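The sparsity argument can be illustrated with a small sketch. It assumes each user already has a cluster id (how the clusters are built is not described on the slide), and the toy matrix and the helper names `cluster_item_matrix` and `density` are hypothetical.

```python
import numpy as np

# Hypothetical toy data: 6 users x 4 items, 1 = the user ordered the item.
user_item = np.array([
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
])
# Assumed cluster id per user (e.g. derived from any known user feature).
user_cluster = np.array([0, 0, 1, 1, 2, 2])

def cluster_item_matrix(user_item, user_cluster):
    """Sum the rows of all users that belong to the same cluster."""
    n_clusters = int(user_cluster.max()) + 1
    out = np.zeros((n_clusters, user_item.shape[1]))
    # np.add.at accumulates correctly when a cluster index repeats.
    np.add.at(out, user_cluster, user_item)
    return out

def density(matrix):
    """Fraction of non-zero cells."""
    return np.count_nonzero(matrix) / matrix.size

cluster_item = cluster_item_matrix(user_item, user_cluster)
print(density(user_item), density(cluster_item))
```

The cluster matrix has fewer rows and each row pools the feedback of several users, so a larger share of its cells is filled.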
CANDUNGA
.35 .32 .28
.29 .61 .32
.42 .18
C1
C2
C3
.9 .7
.5 .7
.5 .3
U1
U2
U3
features
- No limit on the number of matrices
- Features from user, item or both
- Added a feature layer
- Items can be stores or products
- Bayesian optimization to define the weight of each matrix
- NDCG for offline evaluation
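The last two points can be sketched together: scores from several matrices are blended with one weight per matrix, and the resulting ranking is evaluated offline with NDCG. This is only an illustration. The weights below are fixed by hand (the slide says they are found with Bayesian optimization, which is not shown here), and the scores, relevance labels and function names are hypothetical.

```python
import numpy as np

def blend_scores(score_arrays, weights):
    """Weighted sum of score arrays that share the same shape."""
    return sum(w * s for w, s in zip(weights, score_arrays))

def dcg_at_k(relevances, k):
    """Discounted cumulative gain of the first k items of a ranked list."""
    rel = np.asarray(relevances, dtype=float)[:k]
    discounts = np.log2(np.arange(2, rel.size + 2))
    return float(np.sum(rel / discounts))

def ndcg_at_k(relevances_in_ranked_order, k):
    """DCG divided by the DCG of the ideal ordering (0.0 if nothing is relevant)."""
    ideal = sorted(relevances_in_ranked_order, reverse=True)
    ideal_dcg = dcg_at_k(ideal, k)
    if ideal_dcg == 0:
        return 0.0
    return dcg_at_k(relevances_in_ranked_order, k) / ideal_dcg

# Hypothetical scores for one user over four items, from two matrices.
main_scores = np.array([0.9, 0.1, 0.4, 0.2])
cluster_scores = np.array([0.2, 0.8, 0.6, 0.1])
weights = [0.7, 0.3]                      # assumed values, set by hand

final = blend_scores([main_scores, cluster_scores], weights)
ranking = np.argsort(-final)              # best item first
relevance = np.array([1, 0, 1, 0])        # hypothetical held-out orders
score = ndcg_at_k(relevance[ranking], k=3)
```

In an optimization loop, `score` averaged over many users would be the objective that the weight search tries to maximize.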
before after
LUNCH LIST FOR LUIZ MENDES
Japanese
Healthy
In the old version of the list,
the top items were desserts
and junk food. Luiz prefers
Japanese and healthy food.
DEPLOY MODEL PIPELINE
RESULTS IN THE FEED OF ALL RESTAURANTS
+38% conversion uplift
+29% coverage uplift
Recommendation
Control
Embeddings
EMBEDDINGS
An embedding is a relatively low-dimensional space into which you can translate high-dimensional vectors.
The best-known embeddings are word embeddings, which are built from the position of a word in a text and
the words that come before and after it.
to learn more: https://developers.google.com/machine-learning/crash-course/embeddings/video-lecture
EMBEDDINGS IN IFOOD
food2vec: text based (fastText), built with iFood text data
search2vec*: text based (word2vec), created from search queries and the chosen restaurant
foodgraph: content and graph based, created from connections and features
iFood Word Vectors
food2vec
Create word embeddings
Built with iFood text data
Use fastText to get better out-of-vocabulary embeddings
Text data is very food-specific and contains misspellings
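The reason fastText copes with misspellings and unseen words is that it represents a word through its character n-grams. The sketch below only shows that decomposition, not the training. The n-gram sizes 3 to 6 and the two spellings compared are assumptions for illustration.

```python
def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of a word wrapped in boundary markers < and >."""
    token = f"<{word}>"
    return [
        token[i:i + n]
        for n in range(n_min, n_max + 1)
        for i in range(len(token) - n + 1)
    ]

def shared_ngrams(a, b):
    """N-grams that two spellings have in common."""
    return set(char_ngrams(a)) & set(char_ngrams(b))

# Two hypothetical spellings of the same dish name.
common = shared_ngrams("strogonoff", "estrogonofe")
print(sorted(common))
```

Because the two spellings share many n-grams, a vector built from n-gram vectors places them close together even if one of the spellings never appeared in the training data.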
food2vec
pasta related words
vegetarian related words
pizza related words
japanese cuisine
related words
Edamame is an ingredient common to both Japanese and
vegetarian dishes, and it sits between the two groups
food2vec
Create dish embeddings
Use the weighted mean of the word vectors of the dish name and description
Weights are based on IDF (inverse document frequency)
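A minimal sketch of this IDF-weighted mean is shown below. The exact IDF formula used at iFood is not given on the slide, so the smoothed variant here (`log(N / df) + 1`) is an assumption, and the tokenised dishes and the two-dimensional word vectors are hypothetical.

```python
import math
import numpy as np

def idf_weights(documents):
    """Smoothed IDF per word; documents is a list of token lists."""
    n_docs = len(documents)
    doc_freq = {}
    for doc in documents:
        for word in set(doc):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    # The +1 keeps a non-zero weight for words that occur in every document.
    return {w: math.log(n_docs / c) + 1.0 for w, c in doc_freq.items()}

def dish_embedding(tokens, word_vectors, idf):
    """IDF-weighted mean of the vectors of the known words of a dish."""
    known = [t for t in tokens if t in word_vectors]
    if not known:
        return None
    weights = np.array([idf.get(t, 1.0) for t in known])
    vectors = np.stack([word_vectors[t] for t in known])
    return weights @ vectors / weights.sum()

# Hypothetical tokenised dish names and descriptions.
dishes = [
    ["grilled", "salmon", "salad"],
    ["grilled", "chicken", "salad"],
    ["salmon", "sushi"],
]
# Hypothetical 2-d word vectors standing in for food2vec.
word_vectors = {
    "grilled": np.array([0.5, 0.5]),
    "salmon": np.array([1.0, 0.0]),
    "salad": np.array([0.2, 0.9]),
    "chicken": np.array([0.1, 0.6]),
    "sushi": np.array([0.9, 0.1]),
}
idf = idf_weights(dishes)
embeddings = [dish_embedding(d, word_vectors, idf) for d in dishes]
```

Rare words such as "sushi" get a larger weight than frequent ones such as "grilled", so they pull the dish vector more strongly towards their own position.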
Japanese Food
Healthy Food
food2vec
Can be used to augment dish lists
Grilled chicken. Served with Moroccan
couscous with leek, peas and carrot.
Grilled Saint Peter fillet. Served with pumpkin
purée and steamed vegetables (broccoli and
green beans).
Caesar salad with smoked raw salmon
Vegan burger + side dish + mini
salad
Whole-grain pea risotto with shimeji and tofu.
1. Take an example list
2. Find similar dishes for every example
3. Merge all neighbours and sort by similarity
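The three steps on this slide can be sketched with a brute-force cosine similarity search. The dish names, the two-dimensional embeddings and the function name `augment_list` are hypothetical, and a real catalogue would likely need an approximate nearest-neighbour index instead of a full matrix product.

```python
import numpy as np

def augment_list(seed_ids, dish_ids, embeddings, k=2):
    """Return new dishes similar to the seeds, most similar first."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    index = {dish: i for i, dish in enumerate(dish_ids)}
    best = {}
    for seed in seed_ids:
        # Cosine similarity of every dish to this seed.
        sims = unit @ unit[index[seed]]
        picked = 0
        for j in np.argsort(-sims):
            candidate = dish_ids[j]
            if candidate in seed_ids:
                continue
            # Merge step: keep the best similarity seen for each candidate.
            best[candidate] = max(best.get(candidate, -1.0), float(sims[j]))
            picked += 1
            if picked == k:
                break
    return sorted(best, key=best.get, reverse=True)

# Hypothetical dishes and embeddings.
dish_ids = ["grilled chicken", "grilled fish", "salmon salad",
            "cheeseburger", "chocolate cake"]
embeddings = np.array([
    [1.0, 0.1],
    [0.9, 0.2],
    [0.8, 0.3],
    [0.1, 1.0],
    [0.0, 0.9],
])
suggestions = augment_list(["grilled chicken"], dish_ids, embeddings, k=2)
```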
Food Graph
Ontology
Knowledge Graph
graph-structured
nodes relationships
Knowledge Graph
Movie
Rotten link: www.rotten...
Imdb link: www.imdb…
Liked by: 90%
Description: An orpha..
The Land Before Time
Don Bluth
Directed by
Lucasfilm
Produced by
Knowledge Graph
Python? Which Python are you looking for?
Knowledge Graph
Food Delivery contains many entities: Users, Restaurants, Dishes, Ingredients, Delivery, Vendors...
Food Graph
CONTAINS
Subject
Object
<Apple pie> <CONTAINS> <apple>
User A Restaurant R
Tomatoes Chickpea Lettuce
ordered from
contains
Vegan
is
to from
Driver D
Food Graph
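The subject, relationship and object structure shown on these two slides can be sketched as a list of triples with simple lookups. The dish name "Dish X" and the relation names are hypothetical, because the slides do not name the dish node or give the exact labels used in the Food Graph.

```python
# Each fact is a (subject, relationship, object) triple.
triples = [
    ("Apple pie", "CONTAINS", "apple"),
    ("User A", "ORDERED_FROM", "Restaurant R"),
    ("Dish X", "CONTAINS", "Tomatoes"),
    ("Dish X", "CONTAINS", "Chickpea"),
    ("Dish X", "CONTAINS", "Lettuce"),
    ("Dish X", "IS", "Vegan"),
]

def objects(subject, relationship):
    """All objects linked to a subject by a relationship, in insertion order."""
    return [o for s, r, o in triples if s == subject and r == relationship]

def subjects(relationship, obj):
    """All subjects linked to an object by a relationship, in insertion order."""
    return [s for s, r, o in triples if r == relationship and o == obj]

print(objects("Dish X", "CONTAINS"))
print(subjects("CONTAINS", "apple"))
```

A production knowledge graph would use a graph store and typed nodes, but the triple is the same basic unit.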
Data
Enrichment
Stream of “Enriched” data
Taxonomy
Normalized Dish Names
Serving Size
Meal Type
Ingredients
Weights, Volumes
Tags
...
FKG
“Better” Entity
Representations
RecSys
Marketing
Search
...
Fill in Missing
information by
Graph completion
* 33.7% of dishes have no description.
Food Graph
Dish Embedding
with Food Graph
“You are the average of the five people you
spend the most time with.”
Jim Rohn
Food Graph Embeddings
PinSage - Pinterest
Must
● use node features (excludes node2vec, for example)
● be able to create embeddings for unseen nodes
● be scalable (millions of nodes and possibly billions of edges)
● be tested in production somewhere
https://medium.com/pinterest-engineering/pinsage-a-new-graph-convolutional-neural-network-for-web-scale-recommender-systems-88795a107f48
Food Graph Embeddings
A
B
C
F
E
D
Food Graph Embeddings
Localized convolution based on Random Walks
● Starting from A make random walks and get the most visited
nodes as neighbours
● B, C and D - First Hop
● For each neighbour repeat this
○ B -> A and C
○ C -> A, B, E, and F
○ D -> A
How many hops?
Usually 2. If you go too far you might find Kevin Bacon*, or end up using almost
the whole graph for one node
A
B
C
F
E
D
* https://en.wikipedia.org/wiki/Six_Degrees_of_Kevin_Bacon
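The neighbour selection described above can be sketched with short random walks and visit counts. The adjacency list is inferred from the neighbour lists on the slide (B -> A and C, C -> A, B, E and F, D -> A) and is therefore an assumption, as are the number of walks, the walk length and the function name.

```python
import random
from collections import Counter

# Undirected toy graph, assumed from the neighbour lists on the slide.
graph = {
    "A": ["B", "C", "D"],
    "B": ["A", "C"],
    "C": ["A", "B", "E", "F"],
    "D": ["A"],
    "E": ["C"],
    "F": ["C"],
}

def random_walk_neighbours(graph, start, top_k=3, n_walks=1000,
                           walk_length=2, seed=0):
    """Return the top_k nodes most visited by short random walks from start."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(n_walks):
        node = start
        for _ in range(walk_length):
            node = rng.choice(graph[node])
            if node != start:
                visits[node] += 1
    return [node for node, _ in visits.most_common(top_k)]

neighbours_of_a = random_walk_neighbours(graph, "A")
print(neighbours_of_a)
```

The visit counts can also serve as importance weights when the neighbours are pooled, so frequently visited nodes contribute more to the new representation.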
Food Graph Embeddings
[Figure: computational graph for the new representation of A, built from its neighbours B, C and D and, one hop further, from their own neighbours (A and C for B; A, B, E and F for C; A for D)]
Food Graph Embeddings
Convolve
1. Get the first representation of A and C
2. Pass them through a convolution layer to create the second representation of B
a. send the neighbours through a dense neural network and then apply an
aggregator/pooling function (e.g., an element-wise mean or
weighted sum, denoted as γ)
[Figure: computational graph for A; the Convolve block pools A and C through γ to produce the new B]
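The convolve step can be sketched in NumPy as follows. The dense layer and the pooling function γ come from the slide. The concatenation with the node's own vector, the second dense layer and the final normalisation follow the published PinSage description rather than the slide, so treat them as assumptions. The weights are random and untrained, and all dimensions and names are hypothetical.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def convolve(self_vec, neighbour_vecs, neighbour_weights, Q, q, W, w):
    """One convolution: dense layer on neighbours, pool, combine with self."""
    # Dense layer applied to every neighbour representation.
    hidden = relu(neighbour_vecs @ Q.T + q)
    # gamma: weighted element-wise mean over the neighbours.
    alpha = neighbour_weights / neighbour_weights.sum()
    pooled = alpha @ hidden
    # Combine the pooled neighbourhood with the node's own representation.
    combined = np.concatenate([self_vec, pooled])
    out = relu(W @ combined + w)
    norm = np.linalg.norm(out)
    return out / norm if norm > 0 else out

rng = np.random.default_rng(0)
d_in, d_hidden, d_out = 4, 3, 2
# Hypothetical first representations (e.g. input features) of A, B and C.
h = {name: rng.normal(size=d_in) for name in "ABC"}
# One set of weights, reused by every convolution of the same step.
Q, q = rng.normal(size=(d_hidden, d_in)), np.zeros(d_hidden)
W, w = rng.normal(size=(d_out, d_in + d_hidden)), np.zeros(d_out)

# Second representation of B, computed from its neighbours A and C.
new_b = convolve(h["B"], np.stack([h["A"], h["C"]]), np.array([1.0, 1.0]),
                 Q, q, W, w)
```

Because the function only needs a node's features and its sampled neighbours, the same weights can produce an embedding for a node that was not in the graph at training time.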
Food Graph Embeddings
Convolve
1. The weights are shared between “same step” convolutions
[Figure: computational graph for A; the first-step Convolve blocks share weights w1 and the second-step Convolve block uses w2]
Food Graph Embeddings in iFood
The PinSage algorithm uses a homogeneous graph, that is, all nodes are of the same type.
To adapt this to iFood, we first created a bipartite graph of users and dishes.
Food Graph Embeddings in iFood
We use food2vec embeddings, category, cuisine and many other features as input.
The result is an embedding that captures content from both the node itself and its neighbours.
Embeddings in
Recommendation
Embeddings in Recommendation
These embeddings can and will be used in many projects at iFood:
● Similarity-based recommendation
● Improving search results
● As features for classification/regression models
● As features for recommendation algorithms
● Improving Learning to Rank models
Questions?
THANK YOU!
If you want to revolutionize the food delivery market,
we're hiring!
https://materiais.ifood.com.br/recsys

RecSys 2020 - iFood recommendation
