Thai food recommendation
1. THAI FOOD RECOMMENDATION
Based on your preference!!
1. Russamee Nakaphun
2. Yada Limsuwan
3. Vanathip Gijruenthong
4. Chanokporn Youngpakool
5. Korrapin Pimapansri
Business Analytics and Data Science, NIDA
8. Data Collection (Ingredients)
Kaggle
- 2 JSON files with id and ingredients
FooDB
- Provides a dataset with information about compounds, proteins, contents, nutrients, etc.
- We use only Food.csv
Features:
Name, Food group, Food subgroup, etc.
9. Data Preparation
28K JSON files
Remove recipes:
- No flavor information
- Duplicated by ID
Remove specific words from ingredients
Scraped ingredient list
Import ingredient list
Combine, add plural words, and remove specific words such as low fat, fat free, non fat, etc.
Clean ingredients
Import ingredient category
Add ingredient category to recipes
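The cleaning steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the sample records, the `MODIFIERS` list and the helper names `clean_ingredient` and `dedupe_by_id` are all assumptions, and the flavor filtering and plural handling are left out.

```python
import re

# Hypothetical sample records in the shape the slides describe (id + ingredients).
recipes = [
    {"id": 1, "ingredients": ["low fat coconut milk", "fish sauce", "limes"]},
    {"id": 1, "ingredients": ["low fat coconut milk", "fish sauce", "limes"]},  # duplicate id
    {"id": 2, "ingredients": ["fat free yogurt", "garlic cloves"]},
]

# Modifier phrases to strip. The slides name these three and imply more ("etc.").
MODIFIERS = ["low fat", "fat free", "non fat"]
MODIFIER_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(m) for m in MODIFIERS) + r")\b",
    re.IGNORECASE,
)


def clean_ingredient(name):
    """Remove modifier phrases, lowercase, and collapse extra whitespace."""
    stripped = MODIFIER_RE.sub(" ", name)
    return " ".join(stripped.lower().split())


def dedupe_by_id(records):
    """Keep only the first record seen for each recipe id."""
    seen = set()
    unique = []
    for record in records:
        if record["id"] not in seen:
            seen.add(record["id"])
            unique.append(record)
    return unique


cleaned = [
    {"id": r["id"], "ingredients": [clean_ingredient(i) for i in r["ingredients"]]}
    for r in dedupe_by_id(recipes)
]
```

Stripping modifiers before matching lets "low fat coconut milk" and "coconut milk" count as the same ingredient in later similarity steps.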
10. Data Preparation
Data Cleaning (Ingredients)
Clean ingredients
Create nested dictionary {Recipe: {Ingr: w}}
- Unweighted: w = 1
- Weighted: w = x grams per serving
Convert the dictionary to a dataframe (a matrix of recipes by ingredients).
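As a rough sketch of the dictionary-to-dataframe step, assuming pandas is the dataframe library (the slides do not name it). The recipe names and gram values are made up for illustration.

```python
import pandas as pd

# Hypothetical recipes; weights are grams per serving (the "weighted" variant).
weighted = {
    "tom_yum": {"fish sauce": 15.0, "lime juice": 20.0, "garlic": 5.0},
    "green_curry": {"coconut milk": 100.0, "fish sauce": 10.0},
}

# Unweighted variant: every ingredient that is present gets weight 1.
unweighted = {
    recipe: {ingr: 1 for ingr in ingrs} for recipe, ingrs in weighted.items()
}

# orient="index" turns outer keys (recipes) into rows and inner keys
# (ingredients) into columns; ingredients a recipe lacks become NaN, then 0.
weighted_df = pd.DataFrame.from_dict(weighted, orient="index").fillna(0.0)
unweighted_df = pd.DataFrame.from_dict(unweighted, orient="index").fillna(0).astype(int)
```

Each row is then a recipe vector over the full ingredient vocabulary, which is the form that row-wise Jaccard and cosine computations expect.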
19. Modeling
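\[
\text{Similarity}(i, j) = A \cdot \text{Sim}_{\text{Jaccard}}(\text{Ingr}_i, \text{Ingr}_j) + B \cdot \text{Sim}_{\text{Cosine}}(\text{Ingr}_i, \text{Ingr}_j) + C \cdot \text{Sim}_{\text{Cosine}}(\text{Flavor}_i, \text{Flavor}_j) + D \cdot \text{Sim}_{\text{Jaccard}}(\text{Categories}_i, \text{Categories}_j)
\]
All approaches are combined into one similarity score.
*The weights A, B, C, D are adjustable.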
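A minimal sketch of the weighted combination, using only the standard library. The record layout (`ingredients` and `flavors` as weight dictionaries, `categories` as a set), the flavor feature names, and the function names are assumptions made for illustration; the slides do not show the project's own implementation.

```python
import math


def jaccard(a, b):
    """Jaccard similarity of two collections: size of intersection over size of union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cosine(u, v):
    """Cosine similarity of two sparse vectors stored as {key: weight} dicts."""
    dot = sum(w * v.get(k, 0.0) for k, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def combined_similarity(r1, r2, a, b, c, d):
    """Weighted sum of the four similarities; a, b, c, d are the adjustable weights."""
    return (
        a * jaccard(r1["ingredients"], r2["ingredients"])    # ingredient names only
        + b * cosine(r1["ingredients"], r2["ingredients"])   # ingredient weights
        + c * cosine(r1["flavors"], r2["flavors"])
        + d * jaccard(r1["categories"], r2["categories"])
    )


# Hypothetical recipes for illustration.
recipe_x = {
    "ingredients": {"fish sauce": 15.0, "garlic": 5.0},
    "flavors": {"salty": 0.8, "umami": 0.6},
    "categories": {"sauces", "vegetables"},
}
recipe_y = {
    "ingredients": {"fish sauce": 10.0, "coconut milk": 100.0},
    "flavors": {"salty": 0.5, "sweet": 0.4},
    "categories": {"sauces", "fruits"},
}

score = combined_similarity(recipe_x, recipe_y, a=0.2, b=0.0, c=0.5, d=0.3)
```

When the weights sum to 1 the score stays between 0 and 1, so scores from different weightings remain comparable.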
21. Modeling
Problem of Limited memory
[Figure: computation time (s) per model, normal array/dataframe vs. Dask (chunk)]
Model | Computation time, normal (s) | Computation time, chunk (s)
case1 = ((jac_sim_ingredient)+(cos_sim_weighted_flavor)+(jac_sim_categories))/3 | 14.6 | 0.4
case2 = ((cos_sim_weighted_ingredient)+(cos_sim_weighted_flavor)+(jac_sim_categories))/3 | 78 | 4.69
case3 = (0.2*jac_sim_ingredient)+(0.5*cos_sim_weighted_flavor)+(0.3*jac_sim_categories) | 15.4 | 0
case4 = (0.3*cos_sim_weighted_ingredient)+(0.5*cos_sim_weighted_flavor)+(0.2*jac_sim_categories) | 15.9 | 0
case5 = (0.3*cos_sim_weighted_ingredient)+(0.2*cos_sim_weighted_flavor)+(0.5*jac_sim_categories) | 15.7 | 0
case6 = (0.5*cos_sim_weighted_ingredient)+(0.2*cos_sim_weighted_flavor)+(0.3*jac_sim_categories) | 15.8 | 1.56
case7 = (0.2*cos_sim_weighted_ingredient)+(0.3*cos_sim_weighted_flavor)+(0.5*jac_sim_categories) | 16.1 | 1.56
case8 = (0.5*cos_sim_weighted_ingredient)+(0.3*cos_sim_weighted_flavor)+(0.2*jac_sim_categories) | 15.7 | 0
case9 = (cos_sim_weighted_flavor+cos_sim_weighted_ingredient+jac_sim_categories+jac_sim_ingredient)/4 | 12.9 | 0.263
Computing in “chunks” is much faster than using a normal array,
and it fixed the “Memory Error” problem.
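To illustrate the chunk idea, here is a sketch in plain NumPy rather than Dask, so it is not the project's code. It computes pairwise Jaccard similarity one block of rows at a time, so the intermediate arrays are bounded by the chunk size; Dask arrays apply the same blockwise pattern and schedule the blocks automatically. The function name and `chunk_size` parameter are assumptions.

```python
import numpy as np


def jaccard_rows_chunked(X, chunk_size=2):
    """Pairwise Jaccard similarity between rows of a 0/1 matrix, one row block at a time."""
    X = np.asarray(X, dtype=np.int64)
    n = X.shape[0]
    row_sums = X.sum(axis=1)
    sim = np.zeros((n, n), dtype=np.float64)
    for start in range(0, n, chunk_size):
        stop = min(start + chunk_size, n)
        # Intersection sizes between this block of rows and all rows.
        inter = X[start:stop] @ X.T
        # |A or B| = |A| + |B| - |A and B|
        union = row_sums[start:stop, None] + row_sums[None, :] - inter
        # Rows with no ingredients give union == 0; their similarity is set to 0.
        sim[start:stop] = np.divide(
            inter, union, out=np.zeros(inter.shape, dtype=np.float64), where=union > 0
        )
    return sim


# Hypothetical recipe-by-ingredient matrix (1 = ingredient present).
X = np.array([[1, 1, 0], [0, 1, 1], [1, 1, 0], [0, 0, 0]])
sim = jaccard_rows_chunked(X, chunk_size=2)
```

This version still keeps the full result matrix in memory; for very large recipe sets each block could be written to disk instead of being stored in `sim`.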
22. Evaluation
Result (sample size = 628 tests):
Case | Test | Correct | Accuracy (%)
1 | 100 | 41 | 41.00
2 | 97 | 40 | 41.24
3 | 96 | 41 | 42.71
4 | 112 | 44 | 39.29
5 | 111 | 46 | 41.44
6 | 111 | 46 | 41.44
Final Similarity = (0.2*jaccard(Ingredients)) + (0.5*cosine(flavors)) + (0.3*jaccard(categories))
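The accuracy column in the results table above is the share of correct recommendations per case, which can be reproduced with a few lines. The tuples are copied from the table; the names `results` and `accuracy_pct` are illustrative.

```python
# (case, tests, correct) rows copied from the results table.
results = [
    (1, 100, 41),
    (2, 97, 40),
    (3, 96, 41),
    (4, 112, 44),
    (5, 111, 46),
    (6, 111, 46),
]


def accuracy_pct(correct, tests):
    """Accuracy as a percentage, rounded to two decimals as in the table."""
    return round(100.0 * correct / tests, 2)


accuracies = {case: accuracy_pct(correct, tests) for case, tests, correct in results}
best_case = max(accuracies, key=accuracies.get)
```

Here `best_case` is the case with the highest accuracy in the table.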
23. Conclusion
● Community Detection
○ Thai food falls into the same clusters as Moroccan, Mexican, Cuban, Southwestern and Indian food
○ Important ingredients are shared across countries; for example, Thai dishes commonly use fish sauce,
coconut milk, lemon juice, coriander and garlic
● Similarity Computation
○ The main problem is “Memory Error”, especially for Jaccard similarity
○ “Parallel Computation” was needed to solve this problem. This project used the Dask library, which is
built on the “chunk” concept and also runs very fast.
● Combination Models
○ This project selected 6 combination models built from the similarity matrices of ingredients, flavors and food categories
○ The 6 models differ in which similarities they combine and how they are weighted; their accuracies turned out to be close to one another
24. Future Works
● Collect more Thai food data with greater variety
● More analysis with
○ Key ingredients
○ Customer data, for example ratings
○ Cooking methods, for example puff, boil, fry, grill
● Test the system with more people, especially foreigners
● Develop an interface with image search that then recommends using text, images and restaurant location