deep_coders (Sourav, Nitin)
1. DEEP CODERS
ECKOVATION MACHINE LEARNING
members
Nitin Khatkar :01711503116
Sourav Tiwari :03011503116
Gulshan :01211503116
Shrey Achreja :41311503116
2. What cuisine is this recipe?
Picture yourself strolling through
your local, open-air market... What
do you see? What do you smell?
What will you make for dinner
tonight?
We want to thank Yummly for providing this unique dataset.
3. Data Description
▫ The dataset includes the recipe id, the type of cuisine, and the list of ingredients of each recipe (of variable length). The data is stored in JSON format.
▫ An example of a recipe node in train.json is shown alongside:
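A recipe node of the shape described can be parsed with Python's standard json module. The values below are illustrative stand-ins, not actual entries from train.json:

```python
import json

# Hypothetical recipe node matching the fields described:
# an id, a cuisine label, and a variable-length ingredient list.
sample = '''
[
  {
    "id": 101,
    "cuisine": "indian",
    "ingredients": ["turmeric", "vegetable stock", "tomatoes", "garam masala"]
  }
]
'''

recipes = json.loads(sample)
print(recipes[0]["cuisine"])           # indian
print(len(recipes[0]["ingredients"]))  # 4
```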
5. STEPS FOLLOWED TO SOLVE THE GIVEN PROBLEM
STEP 1
First, we perform EDA and remove all redundant data from the given dataset.
STEP 2
Then we build our feature matrix as well as the target matrix.
STEP 3
Finally, we apply the candidate algorithms and find the one best suited to the problem.
8. ALGORITHM FOR FINDING THE TOP 10 CUISINES
▫ First, build a dictionary whose keys are the different cuisines and whose values are the ingredients present in each (dic).
▫ Then, using that dictionary, build a second dictionary holding the occurrence counts of the ingredients (count_dictionary).
▫ Finally, draw a pie chart of the top 10 ingredients using the two dictionaries (code given on the next slide).
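The three bullets above can be sketched as follows. The toy recipe list and the commented matplotlib snippet are illustrative stand-ins, not the code from the referenced slide:

```python
from collections import Counter

# Toy recipe list standing in for train.json (illustrative values only).
recipes = [
    {"cuisine": "italian", "ingredients": ["salt", "olive oil", "basil"]},
    {"cuisine": "indian",  "ingredients": ["salt", "garam masala"]},
    {"cuisine": "italian", "ingredients": ["olive oil", "garlic"]},
]

# dic: cuisine -> every ingredient seen in that cuisine
dic = {}
for r in recipes:
    dic.setdefault(r["cuisine"], []).extend(r["ingredients"])

# count_dictionary: ingredient -> total occurrence count
count_dictionary = Counter(ing for ings in dic.values() for ing in ings)

top10 = count_dictionary.most_common(10)
print(top10)

# Pie chart of the top 10 ingredients (assuming matplotlib is installed):
# import matplotlib.pyplot as plt
# labels, counts = zip(*top10)
# plt.pie(counts, labels=labels)
# plt.show()
```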
11. GENERATING X AND Y
▫ Create empty lists y and total_ingredients.
▫ Append every unique ingredient to total_ingredients, and each recipe's cuisine label to y.
▫ Create a zero matrix using NumPy and name it x (number of rows equal to the length of y, number of columns equal to the length of total_ingredients).
▫ For every ingredient present in a recipe, set the corresponding entry of x to 1.
▫ Our feature matrix x and target y are ready.
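A minimal sketch of this one-hot construction, assuming (as the slide leaves implicit) that y holds each recipe's cuisine label, and using toy recipes in place of the real dataset:

```python
import numpy as np

# Toy recipes standing in for the parsed train.json data.
recipes = [
    {"cuisine": "italian", "ingredients": ["salt", "basil"]},
    {"cuisine": "indian",  "ingredients": ["salt", "cumin"]},
]

# Target list: one cuisine label per recipe.
y = [r["cuisine"] for r in recipes]

# Ordered list of unique ingredients across all recipes.
total_ingredients = []
for r in recipes:
    for ing in r["ingredients"]:
        if ing not in total_ingredients:
            total_ingredients.append(ing)

# Zero matrix: one row per recipe, one column per unique ingredient;
# set the entry to 1 where the recipe contains that ingredient.
x = np.zeros((len(y), len(total_ingredients)))
for i, r in enumerate(recipes):
    for ing in r["ingredients"]:
        x[i, total_ingredients.index(ing)] = 1

print(x)  # each row is a one-hot encoding of one recipe's ingredients
```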
12. DIFFERENT ALGORITHMS USED
FIRST
We started with a decision tree. The outcome was not fruitful at all, as it achieved a score of only 0.30.
SECOND
Then we tested a random forest and got a satisfactory score of 0.72.
THIRD
Out of curiosity, we also applied naïve Bayes, but again got a very low score of 0.36.
FOURTH
Finally we went for a final test with deep learning, but could not run it on the full data due to low-end machine specifications.
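A hedged sketch of such a model comparison with scikit-learn. The data here is a synthetic stand-in for the one-hot ingredient matrix, so the printed scores will not match the 0.30 / 0.72 / 0.36 reported above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the one-hot ingredient matrix and cuisine labels.
X, y = make_classification(n_samples=400, n_features=30, n_informative=10,
                           n_classes=3, n_clusters_per_class=1, random_state=0)
X = (X > 0).astype(int)  # binarize so MultinomialNB's count assumption holds

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

scores = {}
for name, clf in [("decision tree", DecisionTreeClassifier(random_state=0)),
                  ("random forest", RandomForestClassifier(random_state=0)),
                  ("naive bayes", MultinomialNB())]:
    scores[name] = clf.fit(X_tr, y_tr).score(X_te, y_te)
    print(f"{name}: {scores[name]:.2f}")
```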
17. SHORTCOMINGS
Deep learning could not be applied to the whole dataset because of low-end machine specifications, and SVM could not be applied either because of a memory error.
RESULT
Conclusion: Random Forest is the best algorithm here, but if deep learning were run on the full dataset the conclusion might differ.
20.
Forest Cover Type Prediction
▫ The study area includes four wilderness areas located in the
Roosevelt National Forest of northern Colorado. Each observation
is a 30m x 30m patch. We are asked to predict an integer
classification for the forest cover type.
21.
Data Description
▫ The seven types are:
▫ 1 - Spruce/Fir
2 - Lodgepole Pine
3 - Ponderosa Pine
4 - Cottonwood/Willow
5 - Aspen
6 - Douglas-fir
7 - Krummholz
▫ The training set (15120 observations)
contains both features and the Cover_Type.
The test set contains only the features. You
must predict the Cover_Type for every
row in the test set (565892 observations).
22. “We will predict the forest-cover type based on the given parameter values.”
23. STEPS FOLLOWED TO SOLVE THE GIVEN PROBLEM
STEP 1
First, we perform EDA and remove all redundant data from the given dataset.
STEP 2
Then we build our feature matrix as well as the target matrix.
STEP 3
Finally, we apply the candidate algorithms and find the one best suited to the problem.
28. GENERATING X AND Y
▫ Assign the values in the Cover_Type column to the target matrix (y).
▫ Then, after removing Cover_Type from the data frame, assign the remaining values to x.
▫ Remove the redundant columns.
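In pandas this step might look like the following. The toy frame is illustrative; the feature column names (beyond Cover_Type) are assumed from the competition's data description, and treating Id as the redundant column is an assumption:

```python
import pandas as pd

# Toy frame standing in for the training CSV (illustrative values only).
df = pd.DataFrame({
    "Id":         [1, 2, 3],
    "Elevation":  [2596, 2590, 2804],
    "Slope":      [3, 2, 9],
    "Cover_Type": [5, 5, 2],
})

# Target matrix (y): the Cover_Type column.
y = df["Cover_Type"].values

# Feature matrix (x): everything else, with the redundant Id column dropped.
x = df.drop(columns=["Cover_Type", "Id"])

print(x.columns.tolist())  # remaining feature columns
```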
29. DIFFERENT ALGORITHMS USED
FIRST
We started with a decision tree. The outcome was not fruitful at all, as it achieved a score of only 0.60.
SECOND
Then we tested a random forest and got a satisfactory score of 0.84.
THIRD
Out of curiosity, we also applied naïve Bayes and SVM, but got very low scores of 0.58 and 0.14 respectively.
FOURTH
Finally we went ahead with deep learning and scored 0.84, the same as the random forest.
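A sketch of the random forest vs. deep learning comparison, using scikit-learn's MLPClassifier as a stand-in for the deep learning model (the deck does not say which framework was used). The data is synthetic, so the scores will not reproduce the 0.84 above:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic 7-class stand-in for the cover-type features (7 cover types).
X, y = make_classification(n_samples=700, n_features=20, n_informative=12,
                           n_classes=7, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

rf = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
mlp = MLPClassifier(max_iter=500, random_state=0).fit(X_tr, y_tr)

rf_score = rf.score(X_te, y_te)
mlp_score = mlp.score(X_te, y_te)
print(f"random forest: {rf_score:.2f}")
print(f"neural net:    {mlp_score:.2f}")
```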