I teamed up with 3 of my classmates to build a recipe recommendation engine that takes ingredients and cuisine preferences as input and returns the recipe best suited to you. This was the final project for our Data Science in the Wild class at Cornell Tech in Spring 2020. Shoutout to my team Infinite Players: Prashant, Saloni & Dale!
The document describes a case study using logistic regression to predict lead conversion. The model was built on data from an online education company to assign each lead a score indicating how likely it is to convert. Exploratory data analysis found that leads from Google searches, SMS messages, and marketing/HR specializations had higher conversion rates. A logistic regression model was built and evaluated on train and test sets, achieving 80.1-80.9% accuracy. The model effectively identified promising leads that could be targeted to raise the conversion rate from about 30% to around 80% or more.
Worked on a real-life business problem: due to Covid-19, Airbnb saw a major decline in revenue. To help determine Airbnb's next best steps as a business, an analysis was performed on a dataset of Airbnb listings in New York.
This analysis served as the basis for a presentation created for the Lead Data Analyst and Data Analysis Managers.
The document presents a case study for a lead scoring model built to predict potential customer conversions for an education company. Data on past leads was analyzed to identify key variables impacting conversion rate. A logistic regression model was developed and evaluated on train and test data, achieving 78% accuracy. The model can assign a lead score between 0-100 to help the company prioritize hot leads most likely to convert.
This document presents a case study on lead scoring for an education company called XEducation. The company sells online courses but has a low lead conversion rate of 30%. The goal is to build a logistic regression model that assigns each lead a score between 0-100 to help target potential leads and raise the conversion rate to 80%. Key factors such as lead source, last activity, and tags are analyzed. The model balances recall and precision to set the probability threshold at 38%: leads scoring above 38 have a predicted conversion probability above 38% and should be targeted. This approach is estimated to achieve the desired 80% conversion rate.
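A lead-scoring pipeline of this shape can be sketched in a few lines. This is a minimal illustration on synthetic data, not the case study's actual code: the features, scikit-learn usage, and data are assumptions; only the 0-100 score and the 38 cut-off come from the summary.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic stand-in for lead data: two illustrative features
# (e.g. total site visits and time on site, already scaled).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
# Labels correlated with the features so the model has signal to learn.
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200) > 0).astype(int)

model = LogisticRegression().fit(X, y)

# Lead score = predicted conversion probability scaled to 0-100.
lead_scores = (model.predict_proba(X)[:, 1] * 100).round().astype(int)

# Threshold of 38 from the case study: leads scoring above it are "hot".
hot_leads = np.where(lead_scores > 38)[0]
print(f"{len(hot_leads)} of {len(lead_scores)} leads flagged as hot")
```

Targeting only the hot subset is what lifts the conversion rate among contacted leads above the raw 30% base rate.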
The document discusses credit risk analysis for loan approvals. It outlines the steps in the analysis, which include data understanding, checking for data quality issues, identifying data imbalances, and conducting univariate, bivariate, and correlation analyses. The analyses found that the chances of default decrease with increased applicant age but increase with higher credit amounts. Low income groups had higher default rates than high or medium income groups. Certain applicant attributes like being a state servant, older, higher income, or having a previous approved loan were associated with lower risk of default.
Dataset Preparation
Abstract: This PDSG workshop introduces basic concepts on preparing a dataset for training a model. Concepts covered are data wrangling, replacing missing values, categorical variable conversion, and feature scaling.
Level: Fundamental
Requirements: No prior programming or statistics knowledge required.
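The workshop's three preparation steps (replacing missing values, converting categorical variables, and feature scaling) can be sketched with pandas. The toy columns here are invented for illustration, not taken from the workshop:

```python
import pandas as pd

# Toy dataset with a missing value and a categorical column.
df = pd.DataFrame({
    "age":    [25, 32, None, 41],
    "city":   ["NY", "SF", "NY", "LA"],
    "income": [50_000, 80_000, 62_000, 75_000],
})

# 1. Replace missing values with the column median.
df["age"] = df["age"].fillna(df["age"].median())

# 2. Convert the categorical variable to one-hot (dummy) columns.
df = pd.get_dummies(df, columns=["city"])

# 3. Feature scaling: standardize numeric columns to zero mean, unit variance.
for col in ["age", "income"]:
    df[col] = (df[col] - df[col].mean()) / df[col].std()

print(df.round(2))
```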
The document discusses cross-validation, which is used to estimate how well a machine learning model will generalize to unseen data. The basic idea is to split a dataset into training and test sets, train the model on the training set, and evaluate it on the held-out test set. Common variants discussed are k-fold cross-validation, which splits the data into k folds and rotates which fold is held out, and repeated holdout validation, which randomly samples training and test subsets over multiple repetitions.
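The k-fold rotation can be written out by hand in a few lines. This is an illustrative sketch (in practice scikit-learn's `KFold` does the same job):

```python
import numpy as np

def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    folds = np.array_split(indices, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx

# Across the k iterations, every sample is held out exactly once.
n, k = 20, 5
held_out = []
for train_idx, test_idx in k_fold_indices(n, k):
    assert len(train_idx) + len(test_idx) == n
    held_out.extend(test_idx.tolist())

print(sorted(held_out))  # every index 0..19 appears exactly once
```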
What really are recommendation engines nowadays?
This presentation introduces the foundations of recommendation algorithms, covering common approaches as well as some of the most advanced techniques. Although more focused on efficiency than on theoretical properties, it uses basics of matrix algebra and optimization-based machine learning throughout.
Table of Contents:
1. Collaborative Filtering
1.1 User-User
1.2 Item-Item
1.3 User-Item
* Matrix Factorization
* Stochastic Gradient Descent (SGD)
* Truncated Singular Value Decomposition (SVD)
* Alternating Least Square (ALS)
* Deep Learning
2. Content Extraction
* Item-Item Similarities
* Deep Content Extraction: NLP, CNN, LSTM
3. Hybrid Models
4. In Production
4.1 Problematics
4.2 Solutions
4.3 Tools
Abstract: This PDSG workshop introduces basic concepts of splitting a dataset for training a model in machine learning. Concepts covered are training, test and validation data, serial and random splitting, data imbalance and k-fold cross validation.
Level: Fundamental
Requirements: No prior programming or statistics knowledge required.
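The serial vs. random splitting distinction the workshop covers can be illustrated as follows (synthetic data; the 80/20 ratio is an assumption for the sketch):

```python
import numpy as np

data = np.arange(100)  # stand-in for 100 ordered samples

# Serial split: first 80% for training, last 20% for testing.
# Risky if the data is ordered, e.g. by time or by class.
train_serial, test_serial = data[:80], data[80:]

# Random split: shuffle first so both sets reflect the full distribution.
rng = np.random.default_rng(42)
shuffled = rng.permutation(data)
train_rand, test_rand = shuffled[:80], shuffled[80:]

print(len(train_rand), len(test_rand))
```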
Collaborative filtering is a technique used in recommender systems to predict a user's preferences based on other similar users' preferences. It involves collecting ratings or preference data from users, calculating similarities between users or items, and generating predictions for a user's unknown ratings based on weighted averages of the ratings from similar users or items. There are two main types: user-based which computes similarities between users, and item-based which computes similarities between items. Challenges include cold start problems, sparsity of data, scalability issues for large datasets, and reducing user bias.
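The similarity-weighted average described above can be sketched directly. The ratings matrix here is invented for illustration, and cosine similarity over co-rated items is one common choice among several:

```python
import numpy as np

# Tiny ratings matrix: rows = users, columns = items, 0 = not yet rated.
R = np.array([
    [5, 4, 0, 1],
    [4, 5, 4, 1],
    [1, 1, 2, 5],
], dtype=float)

def predict(user, item, R):
    """Predict R[user, item] as a similarity-weighted average of
    other users' ratings for that item (user-based CF)."""
    sims, ratings = [], []
    for other in range(R.shape[0]):
        if other == user or R[other, item] == 0:
            continue
        # Cosine similarity over items both users have rated.
        mask = (R[user] > 0) & (R[other] > 0)
        a, b = R[user, mask], R[other, mask]
        sims.append(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
        ratings.append(R[other, item])
    return np.average(ratings, weights=sims)

pred = predict(user=0, item=2, R=R)
print(round(pred, 2))
```

User 0's prediction for item 2 is pulled toward user 1's rating of 4, since user 1's rating profile is much more similar to user 0's than user 2's is.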
The ID3 algorithm generates a decision tree from training data using a top-down, greedy search. It calculates the entropy of attributes in the training data to determine which attribute best splits the data into pure subsets with maximum information gain. It then recursively builds the decision tree, using the selected attributes to split the data at each node until reaching leaf nodes containing only one class. The resulting decision tree can then classify new samples not in the training data.
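The entropy and information-gain computations at the heart of ID3 can be sketched as follows (the toy "windy" dataset is invented for illustration):

```python
from math import log2
from collections import Counter

def entropy(labels):
    """Shannon entropy of a list of class labels, in bits."""
    n = len(labels)
    return -sum((c / n) * log2(c / n) for c in Counter(labels).values())

def information_gain(samples, labels, attribute):
    """Entropy reduction from splitting `samples` on `attribute`."""
    n = len(samples)
    gain = entropy(labels)
    for value in {s[attribute] for s in samples}:
        subset = [lab for s, lab in zip(samples, labels) if s[attribute] == value]
        gain -= (len(subset) / n) * entropy(subset)
    return gain

# Toy training data: does the attribute "windy" predict playing outside?
samples = [{"windy": "yes"}, {"windy": "yes"}, {"windy": "no"}, {"windy": "no"}]
labels  = ["stay", "stay", "play", "play"]

print(entropy(labels))                             # 1.0 (perfectly mixed)
print(information_gain(samples, labels, "windy"))  # 1.0 (split yields pure subsets)
```

ID3 picks the attribute with the highest gain at each node and recurses on the resulting subsets.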
This document discusses using regression models to predict California housing prices from census data. It explores linear regression, decision tree regression, random forest regression and support vector regression. The random forest model performed best with the lowest RMSE of 49261.28 after hyperparameter tuning. The dataset contained 20,640 instances with 10 attributes describing California properties for which housing values needed to be estimated. Feature engineering steps like one-hot encoding and standardization were applied before randomly splitting the data into training, validation and test sets.
The document discusses Bayesian belief networks (BBNs), which represent probabilistic relationships between variables. BBNs consist of a directed acyclic graph showing the dependencies between nodes/variables, and conditional probability tables quantifying the effects. They allow representing conditional independence between non-descendant variables given parents. The document provides an example BBN modeling a home alarm system and neighbors calling police. It then shows calculations to find the probability of a burglary given one neighbor called police using the network. Advantages are handling incomplete data, learning causation, and using prior knowledge, while a disadvantage is more complex graph construction.
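The burglary query can be answered by enumerating the hidden variables. The CPT values below are the standard textbook numbers for this network (an assumption; the slide deck's actual figures may differ):

```python
# Burglary-alarm network with standard textbook CPTs (assumed values).
P_B = 0.001          # P(burglary)
P_E = 0.002          # P(earthquake)
P_A = {              # P(alarm | burglary, earthquake)
    (True, True): 0.95, (True, False): 0.94,
    (False, True): 0.29, (False, False): 0.001,
}
P_J = {True: 0.90, False: 0.05}   # P(neighbor calls | alarm)

def joint(b, e, a, j):
    """Joint probability factorized along the DAG: P(b)P(e)P(a|b,e)P(j|a)."""
    p = (P_B if b else 1 - P_B) * (P_E if e else 1 - P_E)
    pa = P_A[(b, e)]
    p *= pa if a else 1 - pa
    pj = P_J[a]
    return p * (pj if j else 1 - pj)

# P(burglary | neighbor called), summing out earthquake and alarm.
num = sum(joint(True, e, a, True) for e in (True, False) for a in (True, False))
den = sum(joint(b, e, a, True) for b in (True, False)
          for e in (True, False) for a in (True, False))
print(round(num / den, 4))  # ~0.0163: one call only weakly implies burglary
```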
The document discusses IBM's Big Data and analytics solutions, including Watson Explorer which provides a single interface to access both structured and unstructured data. It also outlines several common use cases for big data such as customer analytics, security intelligence, and operations analysis. The final section provides contact information for an IBM sales manager to discuss these big data solutions.
An Analysis of New York City AirBnB Listings, by Brendan Sigale
This document analyzes factors to consider when renting an Airbnb in New York City, using over 6.5 million NYC crime records from 2006-2018 and 2019 NYC Airbnb listing data. The analysis finds that Manhattan and Brooklyn dominate in both listings and crime, while Queens has fewer listings despite a population similar to Brooklyn's. Per-capita crime rates are broadly comparable across boroughs, though somewhat higher in Manhattan and Brooklyn, where crime is also concentrated in the borough centers. Availability and crime do not appear to affect listing price.
This presentation introduces naive Bayesian classification. It begins with an overview of Bayes' theorem and defines a naive Bayes classifier as one that assumes conditional independence between predictor variables given the class. The document provides examples of text classification using naive Bayes and discusses its advantages of simplicity and accuracy, as well as its limitation of assuming independence. It concludes that naive Bayes is a commonly used and effective classification technique.
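A text-classification example of the kind the presentation describes can be sketched with scikit-learn. The tiny corpus is made up for this sketch:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny illustrative corpus (invented for this sketch).
docs = [
    "win free prize now", "free money win",           # spam
    "meeting agenda attached", "lunch meeting today", # ham
]
labels = ["spam", "spam", "ham", "ham"]

# Bag-of-words counts feed a multinomial naive Bayes classifier,
# which assumes word occurrences are independent given the class.
clf = make_pipeline(CountVectorizer(), MultinomialNB())
clf.fit(docs, labels)

print(clf.predict(["free prize money"]))  # all three words seen only in spam
```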
Recommendation systems provide users with information they may be interested in based on their preferences and interests. They help address the problem of information overload by retrieving desired information for the user based on their preferences or those of similar users. The two main types of recommendation systems are personalized and non-personalized systems. Common techniques used include collaborative filtering, which finds users with similar tastes, and content-based filtering, which recommends items similar to those a user has liked based on item attributes.
This is a case study I worked on as a first-year MIM student at the University of Maryland (College Park), while studying INFM612 (Management of Information Programs and Services), taught by Dr. Ping Wang, a wonderful professor.
We were given two unfortunate incidents that had occurred with a guest and a host of Airbnb, and had to analyze the issues and suggest solutions that could help make Airbnb an even safer option for its guests and hosts.
This course is all about data mining: how to obtain optimized results, the types of data mining, and how to use these techniques.
I spoke at the first Kaizen Data Science Conference, San Francisco, Sep 2016 on one of Instacart's recommendation systems. Also covers innovative ways of using data science to solve interdisciplinary problems. - Sharath Rao
The document discusses various clustering approaches including partitioning, hierarchical, density-based, grid-based, model-based, frequent pattern-based, and constraint-based methods. It focuses on partitioning methods such as k-means and k-medoids clustering. K-means clustering aims to partition objects into k clusters by minimizing total intra-cluster variance, representing each cluster by its centroid. K-medoids clustering is a more robust variant that represents each cluster by its medoid or most centrally located object. The document also covers algorithms for implementing k-means and k-medoids clustering.
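The k-means (Lloyd's) algorithm summarized above can be sketched in plain NumPy. The blob data and deterministic initialization are assumptions made for this illustration:

```python
import numpy as np

def k_means(X, k, n_iter=20):
    """Plain Lloyd's algorithm: alternate assignment and centroid update."""
    # Deterministic init for this sketch: one point from each end of X.
    centroids = X[[0, len(X) - 1]].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None] - centroids[None, :], axis=2)
        labels = dists.argmin(axis=1)
        # Move each centroid to the mean of its assigned points,
        # minimizing total intra-cluster variance.
        for j in range(k):
            if (labels == j).any():
                centroids[j] = X[labels == j].mean(axis=0)
    return centroids, labels

# Two well-separated blobs around (0, 0) and (10, 10).
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.5, (20, 2)), rng.normal(10, 0.5, (20, 2))])
centroids, labels = k_means(X, k=2)
print(centroids.round(1))
```

K-medoids differs only in the update step: instead of the mean, each cluster is represented by its most centrally located member, which makes it more robust to outliers.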
This document summarizes a machine learning project for Homesite to predict customer quote conversions. The team members are Jack, Harry, and Abhishek. Homesite wants to predict the likelihood that a customer purchases an insurance contract based on their quote. The training data has 261k rows and 298 predictors; the test data has 200k rows with the same 298 columns. Key steps included data cleaning, gradient boosting and random forests, and using AUC (area under the ROC curve) to evaluate model performance. The team's model achieved an AUC of 0.95, which they take as evidence of strong performance without overfitting.
The document discusses various data reduction strategies including attribute subset selection, numerosity reduction, and dimensionality reduction. Attribute subset selection aims to select a minimal set of important attributes. Numerosity reduction techniques like regression, log-linear models, histograms, clustering, and sampling can reduce data volume by finding alternative representations like model parameters or cluster centroids. Dimensionality reduction techniques include discrete wavelet transformation and principal component analysis, which transform high-dimensional data into a lower-dimensional representation.
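The principal component analysis step mentioned above can be sketched via eigendecomposition of the covariance matrix (synthetic data; keeping two components is an arbitrary choice for the sketch):

```python
import numpy as np

rng = np.random.default_rng(0)
# 200 samples in 5 dimensions, with correlated directions of variance.
X = rng.normal(size=(200, 5)) @ rng.normal(size=(5, 5))

# PCA by eigendecomposition of the covariance matrix.
Xc = X - X.mean(axis=0)                 # center the data
cov = np.cov(Xc, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(cov)  # eigh returns ascending eigenvalues
order = eigvals.argsort()[::-1]         # sort components by variance explained

k = 2                                   # keep the top-2 principal components
components = eigvecs[:, order[:k]]
X_reduced = Xc @ components             # project to the lower-dimensional space

print(X_reduced.shape)  # (200, 2)
```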
This document describes a system for detecting brain tumors in MRI images using image segmentation. It discusses how existing manual detection of tumors is difficult due to noise and requires many days. The proposed system applies preprocessing like filtering and grayscale conversion. It then uses image segmentation techniques to detect tumor edges and boundaries. Features are extracted and classification is used to differentiate between normal and tumor images, helping doctors detect tumors earlier. The system is implemented in MATLAB and aims to overcome difficulties in early tumor detection.
Machine Learning - Accuracy and Confusion Matrix, by Andrew Ferlitsch
Abstract: This PDSG workshop introduces basic concepts on measuring accuracy of your trained model. Concepts covered are loss functions and confusion matrices.
Level: Fundamental
Requirements: No prior programming or statistics knowledge required.
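The confusion-matrix bookkeeping the workshop covers can be sketched by hand (the labels below are invented for illustration):

```python
# True and predicted labels for a small binary classifier (illustrative).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 1, 0]

# The four cells of the binary confusion matrix.
tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

accuracy  = (tp + tn) / len(y_true)
precision = tp / (tp + fp)
recall    = tp / (tp + fn)

print(f"confusion matrix: [[{tn}, {fp}], [{fn}, {tp}]]")
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```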
In this presentation, two datasets are used to apply the machine learning classification techniques introduced in the Introduction to Data Mining and Machine Learning coursework. Both datasets were chosen by analyzing their outputs and the team members' interests: a simulated electrical grid stability dataset and the Olivetti face recognition dataset.
Predicting Bank Customer Churn Using Classification, by Vishva Abeyrathne
This document describes a study that used classification models to predict customer churn for a bank. The authors collected a dataset of 10,000 bank customers from Kaggle and preprocessed the data. They then explored relationships between features and the target variable of whether a customer churned. Two classification models were tested - KNN and Decision Tree. After hyperparameter tuning, Decision Tree achieved the best accuracy of 84.25%, outperforming KNN. However, both models struggled to accurately predict customers who would churn. The authors concluded Decision Tree was the best model but recommend collecting more data on churning customers.
This document provides an overview of recommender systems for e-commerce. It discusses various recommender approaches including collaborative filtering algorithms like nearest neighbor methods, item-based collaborative filtering, and matrix factorization. It also covers content-based recommendation, classification techniques, addressing challenges like data sparsity and scalability, and hybrid recommendation approaches.
Ovenbot is an app that helps users find recipes based on the ingredients they already have. It manages a virtual pantry and shopping lists. Recipes can be searched, filtered, and modified. The app generates recipe recommendations based on the user's profile and allows easy browsing, saving, and sharing of recipes. Ovenbot aims to be the "kitchen of the future" by integrating users' physical kitchens and enabling new features like meal planning and sensor integration with appliances. It is a freemium app with some premium upgrades available. The founders created it to solve the problem of not knowing what to cook with the ingredients on hand.
Using Data Science to Transform OpenTable Into Your Local Dining Expert, by Pablo Delgado
Presentation for Spark Summit 2015, San Francisco
https://spark-summit.org/2015/events/using-data-science-to-transform-opentable-into-your-local-dining-expert/
Decyphering Recipes: Mapping ontologies for personalizationNeo4j
This document discusses how Gousto, an online recipe box service, uses Neo4j and a recipe ontology graph to power personalized recipe recommendations. It outlines Gousto's data challenges with siloed databases and lack of customer journey tracking. The ontology graph captures recipe attributes and ingredient relationships to calculate similarity scores beyond basic ingredients. This allows personalizing recommendations based on cuisines, dish types, and customer goals. Benchmarking with human ratings helps validate the model, while future uses could include substitution for diets and AI-generated recipes.
The document discusses issues with the cafeteria food provided to employees, which is described as unhygienic, low quality, oily and fatty. It notes that junk food options like pizza and burgers are also unhealthy. The document then introduces a company called Unjunk that provides healthier, tasty food alternatives made with quality ingredients at affordable prices. Unjunk has partnered with several corporate clients to provide food kiosks and catering for employees.
The document describes an app called iCook that helps users plan and prepare meals. The app allows users to view their pantry ingredients, choose recipes, generate a customized shopping list for missing items, and provide feedback on recipes after cooking. It aims to make meal planning easier by selecting recipes based on the user's available ingredients and preferences.
New self cooking center, rational 5 sense - KitchenramaKitchen Rama
The SelfCookingCenter® 5 Senses is the only cooking system in the world with 5 senses. Because it senses, recognises, thinks ahead, learns from you and even communicates with you For further information visit @ http://www.kitchenrama.com/
Our product will be mainly chocolate fudge brownies prepared with only mixing the ingredients and the convenience of not having to use the oven. We will focus on selling our no-bake brownies to students at Philippine Women's University, with a starting production of 40 pieces and pricing them affordably at 25% above cost. We aim to gain customers through promotions and bulk order discounts while complying with necessary business permits.
This document discusses reducing food waste through various strategies. It notes that 4-10% of food purchased by U.S. foodservice operations is thrown out before reaching customers, representing $9-23 billion in annual pre-consumer waste. Successful prevention requires changing behaviors through measurement, automation, and interventions like production adjustments, optimized ordering/menus, and influencing consumer behaviors with signage and portion sizes. Long-term tracking of waste metrics is key to driving continuous improvement.
Marketing plan for "Foodpanion",a cookery mobile appRaghu Kumar Reddy
This document provides an executive summary and overview of the Foodpation app. The app aims to provide users with a massive library of recipes that can be searched and filtered in various ways. It will offer both free and premium versions, with the free version allowing basic searches and saving of some recipes while generating revenue through ads, and the premium version providing additional features for a monthly or annual fee. The goal is to attract customers through the large recipe library and filters, gain 50,000 downloads in the first year, and generate $20 million in ad revenue to create a strong brand and customer satisfaction.
Using Data Science to Transform OpenTable Into Your Local Dining Expert-(Pabl...Spark Summit
Using data science techniques, OpenTable analyzes large amounts of data from over 32,000 restaurants and 190 million diners to provide personalized dining recommendations. Some key techniques used include collaborative filtering, matrix factorization, topic modeling of large volumes of reviews to understand restaurant attributes, and word embedding models to find similar restaurants across locations. The goal is to understand individual diner and restaurant preferences and trends to connect diners with the best possible dining experiences.
This document summarizes a project to recommend Thai foods to foreigners based on their preferences. It involved collecting recipe and ingredient data, cleaning the data, analyzing relationships between cuisines and ingredients, building similarity models combining Jaccard and cosine similarities of ingredients, flavors and categories, and evaluating the models. The best model achieved 41-42% accuracy in recommending similar recipes. Future work could involve more Thai food data, customer ratings, cooking methods, and developing an interface for image searching and recommendations.
Quester: QUANT + QUAL in a single hybrid design, leveraging AI technology
Intengo: Prediction markets to test concepts - FAST
Partnering to break through the “not sures” and “idks” for quantitative clarity and real qualitative insight
The main screen of the RecipesOnDemand software allows users to access and manage recipes. It includes an extensive ingredient and prepared foods database with nutritional information. Recipes can be selected from over 5,000 pre-loaded recipes or created from scratch, and are sorted into categories. Recipes can be printed individually or in groups, sorted by menu category or alphabetically. Multi-yield recipes automatically scale ingredients and instructions for different quantities. Nutrition facts labels and analysis can be generated for any recipe.
This document proposes an Android app called iCookbook that allows users to search over hundreds of recipes online by inputting ingredients. The app will search for recipes matching the ingredients and allow filtering by ratings, cooking time, and saving recipes offline. It aims to provide a one-stop solution for finding recipes through a simple user interface with step-by-step cooking instructions. The business hopes to generate revenue through in-app purchases and advertisements in the free version while offering additional features like calorie tracking in the premium version.
Galleta del Valle is a cookie company founded in Silicon Valley that donates profits to peace initiatives. They present sample recipes and cost analysis for their White Chocolate Chip & Macadamia Nut cookies. Production tasks are split between two chefs who develop measurements and analyze costs together. They aim to improve quality, lower costs, and increase sales volume to further their goals of donations and world peace.
Every day CIOs are asked to choose between things that are seemingly incomparable, but now you have objective analysis and standard metrics to compare Apples and Oranges with CAST HIGHLIGHT.
CAST HIGHLIGHT is a scalable, cloud-based Application Portfolio Analysis service that requires no licenses, hardware, or outside consultant. Learn more at www.casthighlight.com
This document discusses the development of a recipe search mobile application. It outlines features of the app such as searching recipes by ingredients, calendar functionality to plan meals, and providing nutrition information. It also discusses monetization strategies like free basic features and premium subscriptions. Plans for marketing the app through social media, blogs, and partnerships are mentioned. Finally, it outlines the development process including gathering recipes, designing the app interface, and obtaining user feedback.
This document discusses standard recipes and scaling recipes for restaurants. It defines a standard recipe as a tested formula that consistently provides quality and yield, serving as a guide for food preparation, training staff, and food costing. A standard recipe should include the name, yield, equipment, ingredients, procedures, timing, portioning/plating instructions, storage directions, and substitution notes. Scaling recipes involves changing ingredient amounts to produce a different yield based on required portions, calculated by a conversion factor. Scaling prevents waste from over or underproduction.
This slide was used at CHI2012 Conference (http://dl.acm.org/citation.cfm?id=2207695). Paper "panavi: recipe medium with a sensors-embedded pan for domestic users to master professional culinary arts"is here http://panavi.jp/panavi_CHI2012.pdf. http://panavi.jp
This document provides an overview of a project to create a bagel recommender system using Amazon Lex for an online bagel shop called Bagel & Deli. The team gathered data on the shop's 87 bagel varieties and used Amazon Sagemaker to build a K-nearest neighbors model to provide bagel recommendations to customers. Amazon Lambda was used to integrate the model with Amazon Lex. The demo showed how a customer could interact with the Lex bot to get a bagel recommendation. The project aims to help customers discover new bagel options, reduce wait times, and increase sales and customer satisfaction for Bagel & Deli.
Similar to Ingredients based - Recipe recommendation engine (20)
Learn SQL from basic queries to Advance queriesmanishkhaire30
Dive into the world of data analysis with our comprehensive guide on mastering SQL! This presentation offers a practical approach to learning SQL, focusing on real-world applications and hands-on practice. Whether you're a beginner or looking to sharpen your skills, this guide provides the tools you need to extract, analyze, and interpret data effectively.
Key Highlights:
Foundations of SQL: Understand the basics of SQL, including data retrieval, filtering, and aggregation.
Advanced Queries: Learn to craft complex queries to uncover deep insights from your data.
Data Trends and Patterns: Discover how to identify and interpret trends and patterns in your datasets.
Practical Examples: Follow step-by-step examples to apply SQL techniques in real-world scenarios.
Actionable Insights: Gain the skills to derive actionable insights that drive informed decision-making.
Join us on this journey to enhance your data analysis capabilities and unlock the full potential of SQL. Perfect for data enthusiasts, analysts, and anyone eager to harness the power of data!
#DataAnalysis #SQL #LearningSQL #DataInsights #DataScience #Analytics
Global Situational Awareness of A.I. and where its headedvikram sood
You can see the future first in San Francisco.
Over the past year, the talk of the town has shifted from $10 billion compute clusters to $100 billion clusters to trillion-dollar clusters. Every six months another zero is added to the boardroom plans. Behind the scenes, there’s a fierce scramble to secure every power contract still available for the rest of the decade, every voltage transformer that can possibly be procured. American big business is gearing up to pour trillions of dollars into a long-unseen mobilization of American industrial might. By the end of the decade, American electricity production will have grown tens of percent; from the shale fields of Pennsylvania to the solar farms of Nevada, hundreds of millions of GPUs will hum.
The AGI race has begun. We are building machines that can think and reason. By 2025/26, these machines will outpace college graduates. By the end of the decade, they will be smarter than you or I; we will have superintelligence, in the true sense of the word. Along the way, national security forces not seen in half a century will be un-leashed, and before long, The Project will be on. If we’re lucky, we’ll be in an all-out race with the CCP; if we’re unlucky, an all-out war.
Everyone is now talking about AI, but few have the faintest glimmer of what is about to hit them. Nvidia analysts still think 2024 might be close to the peak. Mainstream pundits are stuck on the wilful blindness of “it’s just predicting the next word”. They see only hype and business-as-usual; at most they entertain another internet-scale technological change.
Before long, the world will wake up. But right now, there are perhaps a few hundred people, most of them in San Francisco and the AI labs, that have situational awareness. Through whatever peculiar forces of fate, I have found myself amongst them. A few years ago, these people were derided as crazy—but they trusted the trendlines, which allowed them to correctly predict the AI advances of the past few years. Whether these people are also right about the next few years remains to be seen. But these are very smart people—the smartest people I have ever met—and they are the ones building this technology. Perhaps they will be an odd footnote in history, or perhaps they will go down in history like Szilard and Oppenheimer and Teller. If they are seeing the future even close to correctly, we are in for a wild ride.
Let me tell you what we see.
06-04-2024 - NYC Tech Week - Discussion on Vector Databases, Unstructured Data and AI
Discussion on Vector Databases, Unstructured Data and AI
https://www.meetup.com/unstructured-data-meetup-new-york/
This meetup is for people working in unstructured data. Speakers will come present about related topics such as vector databases, LLMs, and managing data at scale. The intended audience of this group includes roles like machine learning engineers, data scientists, data engineers, software engineers, and PMs.This meetup was formerly Milvus Meetup, and is sponsored by Zilliz maintainers of Milvus.
STATATHON: Unleashing the Power of Statistics in a 48-Hour Knowledge Extravag...sameer shah
"Join us for STATATHON, a dynamic 2-day event dedicated to exploring statistical knowledge and its real-world applications. From theory to practice, participants engage in intensive learning sessions, workshops, and challenges, fostering a deeper understanding of statistical methodologies and their significance in various fields."
4th Modern Marketing Reckoner by MMA Global India & Group M: 60+ experts on W...Social Samosa
The Modern Marketing Reckoner (MMR) is a comprehensive resource packed with POVs from 60+ industry leaders on how AI is transforming the 4 key pillars of marketing – product, place, price and promotions.
2. HOW DID WE ARRIVE HERE?
01 FIND CALLING
We foodies found that no one had collated multiple datasets and built a good recipe recommendation engine.
02 FIND DATA
We found more than 20 usable data repositories and analyzed them.
03 FIND RECIPE
After cleaning, we tried various models to get the best possible results.
04 COOK & SERVE!
We collated the best results around an intuitive workflow.
Infinite Players
3. BACKGROUND
The coronavirus pandemic has revealed an interesting fact about young working professionals and university students: they relied on take-aways and dine-ins, skipping cooking entirely. Most people do not know what to cook despite the many options available in grocery stores in this globalized world. Therefore, we want to answer:
What do I cook, given I have these ingredients available?
Food is the gateway to a new culture, and so many cultures can be explored through what is in your fridge. Our engine enables this cross-cultural exchange by telling you what is possible!
4. DATA - FROM THE WILD
DATA SOURCES:
RECIPE_INGR_REVIEW (12K)
YUMMLY CLEAN (6K)
FOOD.COM DATA (231K)
EPICURIOUS (20K)
RECIPE_INGR (56K)

Dataset Name    | SOURCE | FIELDS TAKEN
FOOD_COM        | LINK   | Ingredients, Recipe Name
EPICURIOUS      | LINK   | Ingredients, Recipe Name, Ratings, Description
YUMMLY CLEAN    | LINK   | Ingredients, Recipe Name, Cuisines
RECIPE_INGR     | LINK   | Ingredients, Cuisines
RECIPE_INGR_REV | LINK   | Ingredients, Recipe Name, Ratings

Total: ~270K recipes
5. DATA - FROM THE WILD
FOR INGREDIENTS
We had ingredients ranging from the ubiquitous, such as wheat flour, to the most exotic, such as saffron. In total, we had more than 100K ingredients across our datasets.
FOR CUISINES
We started with more than 35 unique cuisines, studied the differences and commonalities among them, and finally mapped them to a superset of 7.
FOR USER RATINGS
Certain datasets had user reviews for the recipes. We utilised these reviews by defining a rating scale from 1-5 as the basis for our item-item collaborative filtering model.
FOR RECIPE NAMES
All datasets have recipe names except RECIPE_INGR, which has only cuisine names and ingredients. Recipe names are our desired output.
6. DATA CLEANING - OVERVIEW
01 Basic: common text preprocessing techniques
02 No "quantities": remove "OZ", "KG", "POUNDS", "TSP", "LITTLE", "PINCH"
03 Extract nouns: POS tagging to extract ingredients from recipe instructions
04 Remove rare words: average term frequency is 600; remove words occurring < 30 times
05 Iterate! Continue cleaning as we see results
7. FEATURE ENGINEERING
WHY?
● Multiple datasets - different data formats
● Cleaning to 100% is hard and doesn't scale to new data
● Ingredient-related tokens > 2.5 million across 270K recipes
8. CUISINES - IN THE WILD
PROBLEM: Which cuisine does a recipe belong to? Which cuisines should we narrow it down to?
GROCERY INSPIRATION: We tried to narrow down the cuisines from the full list in our datasets [figure: original cuisine list].
9. CUISINES DEMYSTIFIED
[Figure: confusion matrix - using a neural network]
Upon refining further, we combined many cuisines to achieve the highest accuracy for our cuisine classifier, while maintaining distinctive flavors and favoring well-represented classes.
Final list of cuisines (7): American, Italian, European, Asian, Mexican, French, Indian
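The narrowing step can be sketched as a simple label mapping. Apart from the Spanish/British/German → European grouping stated on a later slide, the groupings below are illustrative guesses, and only a handful of the ~35 source labels are shown:

```python
# Hypothetical mapping from raw dataset cuisine labels to the final
# 7-cuisine superset. Only a few of the ~35 source labels are shown;
# apart from Spanish/British/German -> European, these are guesses.
CUISINE_MAP = {
    "spanish": "European",
    "british": "European",
    "german": "European",
    "thai": "Asian",
    "chinese": "Asian",
    "japanese": "Asian",
    "southern_us": "American",
}

FINAL_CUISINES = {"American", "Italian", "European", "Asian",
                  "Mexican", "French", "Indian"}

def to_superset(label: str) -> str:
    """Map a raw cuisine label into the 7-cuisine superset."""
    mapped = CUISINE_MAP.get(label.lower(), label.title())
    return mapped if mapped in FINAL_CUISINES else "Other"

print(to_superset("british"))  # European
```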
10. CUISINES - AMERICA!
[Figure: confusion matrix - using ensembling methods]
Some patterns can clearly be noticed:
French cuisine is very similar to America's Cajun & Creole (Louisiana).
Mexico influenced Texan food.
Italy has a great influence on Northeast food, with pizza etc.
European cuisine (Spanish, British & German) has a great influence too.
Asian & Indian cuisines have minimal collision.
This mirrors the ethnic makeup of immigrants in the US.
*Indian & Mexican cuisines also share a lot of flavors.
11. HOW DOES IT WORK?
INPUT: ingredient feature vectors
ENSEMBLE TECHNIQUES
Logistic Regression
K Neighbors Classifier
Decision Tree Classifier
Random Forest Classifier
NEURAL NETWORK
Layer 1: Linear + LeakyReLU
Layer 2: Linear + LeakyReLU + Dropout
Layer 3: Linear + LeakyReLU + Dropout
Layer 4: Linear + Softmax
OUTPUT: cuisine type for a list of ingredients
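A minimal PyTorch sketch of the four-layer classifier described above. The input width (200, matching the word2vec dimension mentioned later in the deck), the hidden sizes, and the dropout rate are assumptions; the slide specifies only the layer types and activations:

```python
import torch
import torch.nn as nn

class CuisineNet(nn.Module):
    """4-layer cuisine classifier: Linear + LeakyReLU blocks, dropout
    on layers 2-3, softmax over the 7 cuisine classes. Hidden sizes
    (256/128/64) and dropout rate (0.3) are illustrative assumptions."""
    def __init__(self, n_features=200, n_cuisines=7):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(n_features, 256), nn.LeakyReLU(),            # Layer 1
            nn.Linear(256, 128), nn.LeakyReLU(), nn.Dropout(0.3),  # Layer 2
            nn.Linear(128, 64), nn.LeakyReLU(), nn.Dropout(0.3),   # Layer 3
            nn.Linear(64, n_cuisines), nn.Softmax(dim=1),          # Layer 4
        )

    def forward(self, x):
        return self.net(x)

model = CuisineNet()
probs = model(torch.randn(4, 200))  # 4 recipes' ingredient feature vectors
print(probs.shape)                  # one probability row per recipe
```

Each output row is a probability distribution over the seven cuisines; the predicted cuisine is its argmax.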
12. HOW TO GET A RECOMMENDATION
INPUT: ingredient(s) and a choice of cuisine (if any)
ALTERNATIVE INPUT: the name of a recipe (its ingredients are taken as the input)
COLLAB FILTER (item-item based): KNN with Means for recipe ratings; cosine similarity for calculating distance
CONTENT-BASED RECOMMENDATION: ingredients to features using a word2vec model; cosine similarity for calculating distance
OUTPUT: ONE list of recipes according to user preferences, and ANOTHER list of recipes closest to the ingredients mentioned
13. MODEL - COLLABORATIVE FILTERING
INPUT
We build a recommender system in which the user inputs the ingredients they have on hand. Based on these inputs, we generate a short list of recipes that fit the user's preferences.
MODEL
KNN with Means has been chosen for the recommender: a basic collaborative filtering algorithm that takes into account the mean ratings of each user. We compute the cosine similarity.
DATA CLEANING
We use only one rating per user. Further, we define a rating scale for each recipe, determined by the lowest and highest ratings given by the users.
EVALUATE
We use the Surprise library to test our recsys. Using cross-validation, we evaluate the model with metrics such as MSE and RMSE.
OUTPUT
Finally, we get a recommendation based on an input string of ingredients.
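The deck uses the Surprise library's KNNWithMeans; as a from-scratch illustration of the same idea, here is a minimal item-item variant in plain Python that centres on item means (Surprise can centre on user or item means depending on configuration). All users, recipes, and ratings below are toy stand-ins:

```python
from collections import defaultdict
from math import sqrt

# Toy ratings: user -> {recipe: rating on a 1-5 scale}. Made up for illustration.
ratings = {
    "u1": {"lasagna": 5, "meatballs": 4, "piccata": 4},
    "u2": {"lasagna": 4, "meatballs": 5, "casserole": 2},
    "u3": {"lasagna": 2, "casserole": 5, "piccata": 1},
    "u4": {"meatballs": 4, "casserole": 4, "piccata": 5},
}

def item_vectors(ratings):
    """Invert user->item ratings into item -> {user: rating}."""
    items = defaultdict(dict)
    for user, ur in ratings.items():
        for item, r in ur.items():
            items[item][user] = r
    return items

def cosine(a, b):
    """Cosine similarity over co-rated users only."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    num = sum(a[u] * b[u] for u in common)
    den = sqrt(sum(a[u] ** 2 for u in common)) * sqrt(sum(b[u] ** 2 for u in common))
    return num / den if den else 0.0

def predict(user, item, ratings, k=2):
    """KNN-with-means: item mean plus similarity-weighted, mean-centred
    deviations from the k most similar items the user has rated."""
    items = item_vectors(ratings)
    mean_i = sum(items[item].values()) / len(items[item])
    sims = []
    for other, vec in items.items():
        if other == item or other not in ratings[user]:
            continue
        s = cosine(items[item], vec)
        if s > 0:
            mean_o = sum(vec.values()) / len(vec)
            sims.append((s, ratings[user][other] - mean_o))
    top = sorted(sims, reverse=True)[:k]
    if not top:
        return mean_i
    return mean_i + sum(s * dev for s, dev in top) / sum(s for s, _ in top)

print(round(predict("u2", "piccata", ratings), 2))
```

Recipes are then ranked for a user by their predicted rating, keeping only those containing the input ingredients.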
14. COLLABORATIVE FILTERING - RESULTS
Input: User_ID, Ingredients

User_id: 2043209, Ingredients: 'chicken, egg, milk'

RECIPE | INGREDIENTS | INSTRUCTIONS
Chicken Lasagna with White Sauce Recipe | mozzarella, mushroom, milk, spinach, egg, ricotta, n… | Preheat oven to 350 degrees F (175 degrees C)....
Swedish Meatballs | egg, milk, ground beef, cereal, onion, chicken, mush… | Preheat oven to 350 degrees F (175 degrees C)....
Mushroom Chicken Piccata Recipe | flour, salt, paprika, egg, milk, chicken, butter, mus… | In a shallow dish or bowl, mix together flour,...

User_id: 700, Ingredients: 'Cheese, onion'

RECIPE | INGREDIENTS | INSTRUCTIONS
Tuna Noodle Casserole II Recipe | noodle, mushroom, milk, tuna, cheese, onion, potato,... | In a large pot with boiling salted water cook ...
Hamburger Cheese Bake Recipe | pasta, ground beef, onion, tomato sauce, white sauce.. | In a large pot cook with boiling salted water..
15. MODEL - CONTENT-BASED RECOMMENDER SYSTEM
RAW INPUT
The ingredient list for every recipe. All ingredients are kept through the pre-processing pipeline.
MODEL: WORD2VEC
Ingredients to features: 200 dimensions (PCA made a negligible difference in cuisine results, hence unused), a context window of 12 (based on experiments), and a downsampling threshold of 1e-3.
RECOMMENDATION
Take the input ingredients and apply word2vec to them. Use cosine similarity to compare the distance to recipes in the dataset.
EVALUATION
Based on the performance of a downstream task (cuisine classification), plus eyeballing the recipes recommended.
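The recommendation step can be sketched as follows, assuming ingredient vectors have already been trained (the deck uses 200-dimensional word2vec vectors; the 3-d vectors, recipe names, and ingredients below are toy stand-ins): average the input ingredients' vectors and rank recipes by cosine similarity.

```python
from math import sqrt

# Toy 3-d "embeddings" standing in for trained 200-d word2vec vectors.
vec = {
    "chicken":    [0.9, 0.1, 0.0],
    "egg":        [0.7, 0.3, 0.1],
    "milk":       [0.2, 0.9, 0.1],
    "tomato":     [0.1, 0.2, 0.9],
    "basil":      [0.0, 0.3, 0.8],
    "mozzarella": [0.3, 0.8, 0.2],
}

# Toy recipe catalogue: name -> ingredient list.
recipes = {
    "chicken piccata": ["chicken", "egg", "milk"],
    "margherita":      ["tomato", "basil", "mozzarella"],
    "white lasagna":   ["chicken", "milk", "mozzarella"],
}

def embed(ingredients):
    """Average the vectors of known ingredients to get one recipe vector."""
    known = [vec[i] for i in ingredients if i in vec]
    n = len(known)
    return [sum(col) / n for col in zip(*known)]

def cos(a, b):
    num = sum(x * y for x, y in zip(a, b))
    return num / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def recommend(user_ingredients, top=2):
    """Rank recipes by cosine similarity to the user's ingredient vector."""
    q = embed(user_ingredients)
    ranked = sorted(recipes, key=lambda r: cos(q, embed(recipes[r])), reverse=True)
    return ranked[:top]

print(recommend(["chicken", "egg"]))
```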
17. POST RECOMMENDATION / FUTURE SCOPE
PERSONALIZATION
To improve the recommendations, users can be prompted to rate the recipes they were recommended.
INTEGRATION INTO SMART DEVICES
The tool can be integrated into smart devices such as refrigerators.
LEARN REGULARLY
Users can evaluate recommendation quality to improve the models.
23. References and Relevant Work
1. https://www.kaggle.com/c/whats-cooking/data (P)
2. https://www.kaggle.com/shuyangli94/food-com-recipes-and-user-interactions (D)
3. https://www.kaggle.com/hugodarwood/epirecipes (B)
4. https://www.kaggle.com/kaggle/recipe-ingredients-dataset (S)
5. https://www.kaggle.com/kanaryayi/recipe-ingredients-and-reviews (P)
6. https://data.world/datafiniti/food-ingredient-lists (D)
7. https://link.springer.com/article/10.1007/s10844-017-0469-0
8. http://foodb.ca/ (B)
9. https://github.com/lingcheng99/Flavor-Network (S)
10. https://www.nature.com/articles/srep00196
11. https://www.foodpairing.com/
12. https://www.wired.com/2013/11/a-new-kind-of-food-science/
13. https://www.prescouter.com/2019/05/flavor-discovery-big-data-ai/
14. https://waterfootprint.org/media/downloads/Mekonnen-Hoekstra-2011-WaterFootprintCrops.pdf
15. https://www.footprintnetwork.org/licenses/public-data-package-free/

1. A New Kind of Food Science: How IBM Is Using Big Data to Invent Creative Recipes
● The study develops an algorithm that generates a list of recipes ranked using three categories: surprise, pleasantness of odor, and flavor pairings.
2. Flavor network and the principles of food pairing
● The study introduces a flavor network that captures the flavor compounds shared by culinary ingredients. Given the increasing availability of information on food preparation, this data-driven investigation also opens new avenues towards a systematic understanding of culinary practice.
3. How healthy is the meal: an analysis of recipe data
● The study looks into the interconnections between ratings, nutrients, ingredients, meals, seasons, holidays, and cooking techniques.
24. CUISINE
GROCERY INSPIRATION
We narrowed the cuisines down to these classifications: we looked at our own grocery store experiences and saw that we could all identify items in the supermarket from these cuisines. Therefore, people could recognize most of these cuisines.
On the other hand, some cuisines, such as Jamaican and Moroccan, had very distinctive flavors and classifications. Therefore we tried keeping a small sample of them and building a model around that.
25. Fonts & colors used
This presentation has been made using the following fonts:
Staatliches (https://fonts.google.com/specimen/Staatliches)
Roboto Condensed (https://fonts.google.com/specimen/Roboto+Condensed)
Colors: #4c1130 #ff5864 #df183d #20124d #76a5af #134f5c #ffd966
OTHER RESOURCES:
Inspiration from across SlidesGo
Editor's Notes
We were looking for inspiration, and as student foodies we found that there is no good recipe recommendation engine that suggests recipes using the ingredients available at hand while also satisfying our specific tastes. Most existing solutions suggest very common recipes.
Data
Try best models
We collated the best results through an intuitive workflow.
COVID-19 has had an adverse impact on the population's health and finances. We want to provide people with a way to make food at home easily and quickly by recommending recipes depending on their preferences and what ingredients they have available on hand.
Our proposed solution bodes well in the current ongoing pandemic, wherein access to restaurant food is becoming increasingly difficult. Anyone interested in saving money on eating out while simultaneously becoming more independent and healthier will benefit from this.
People using the service can save money while simultaneously honing a skill everyone should have: the ability to cook food for oneself. They will also be able to make an informed choice about eating healthy food.
We identified the fields in every dataset that would provide value to our recommendation engine.
Fussiness
Ratings
INGREDIENT CLEANING:
We wanted to transform all our datasets to use the "ingredient only" format. We used the following techniques across our datasets:
● Removal of punctuation, numeric quantities, and extra spaces
● Removal of quantity strings such as "ounces", "pounds", etc. and their variations
● Splitting ingredients on "and" and "with" into individual ingredients. For example, "tomato sauce with basil and garlic" becomes "tomato sauce", "basil", "garlic"
● Use of POS tagging to identify noun phrases. For example, "Whisk some eggs" -> "eggs"
● Removal of words ending in "ed"
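The cleaning steps above (minus the POS-tagging step, which needs a tagger such as NLTK's) can be sketched in a few lines. The quantity-word list below is illustrative, not the team's full list:

```python
import re

# Illustrative subset of the quantity strings the slides mention.
QUANTITY_WORDS = {"oz", "ounce", "ounces", "kg", "pound", "pounds",
                  "tsp", "tbsp", "cup", "cups", "little", "pinch"}

def clean_ingredient(raw):
    """Reduce a raw ingredient string to a list of bare ingredients."""
    text = raw.lower()
    text = re.sub(r"[^\w\s]", " ", text)      # strip punctuation
    text = re.sub(r"\d+", " ", text)          # strip numeric quantities
    # Split compound ingredients on "and" / "with".
    parts = re.split(r"\b(?:and|with)\b", text)
    cleaned = []
    for part in parts:
        # Drop quantity words and (crudely) past-participle modifiers.
        words = [w for w in part.split()
                 if w not in QUANTITY_WORDS and not w.endswith("ed")]
        if words:
            cleaned.append(" ".join(words))
    return cleaned

print(clean_ingredient("2 oz Tomato Sauce with Basil and minced Garlic"))
# -> ['tomato sauce', 'basil', 'garlic']
```

Dropping every word ending in "ed" is as crude as the slide suggests (it would also delete "seed"), which is why step 05, iterating on the cleaning as results come in, matters.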
We applied the cuisine prediction to the recipe datasets (>270K recipes).
Key difference: the neural network redistributes recipes from the ensemble's "American" class into Mexican and other cuisines.