Machine Learning Project

Problem 1: What Cuisine Is This? Problem 2: Will you get a free Pizza? Data analysis on the above problems using machine learning.

Problem 1: WHAT CUISINE IS THIS RECIPE?
Eckovation Machine Learning
Team Bits N’ Bytes :-
-Gaurav(00711503016)
-Kartik(01411503016)
-Pooja (41211503016)
-Kishu(01911504916)
-Govind (01411504916)

MACHINE LEARNING
  SUPERVISED
    REGRESSION: LINEAR, POLYNOMIAL
    CLASSIFICATION: DECISION TREE, RANDOM FOREST, LOGISTIC, NAIVE BAYES, SVM, NEURAL NETWORKS
  UNSUPERVISED
    K-MEANS CLUSTERING

Data Shared : Training and Testing Data
Dataset Format : JSON file
Language Used : Python
Type of Machine Learning used : Supervised
Dataset includes : for each recipe, the recipe id, the type of cuisine, and the list of ingredients (of variable length), among others.
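As a minimal sketch of how such records could be prepared for a classifier, the snippet below parses JSON in the shape described above and turns each variable-length ingredient list into a fixed-length 0/1 vector. The two sample recipes and the helper name `encode` are made up for illustration; this is not the team's actual preprocessing code.

```python
import json

# Hypothetical sample in the described shape: id, cuisine, ingredient list.
raw = '''[
  {"id": 1, "cuisine": "italian", "ingredients": ["tomato", "basil", "olive oil"]},
  {"id": 2, "cuisine": "indian", "ingredients": ["cumin", "tomato", "garam masala", "onion"]}
]'''

recipes = json.loads(raw)

# Vocabulary: every distinct ingredient, sorted for a stable column order.
vocabulary = sorted({ing for r in recipes for ing in r["ingredients"]})
index = {ing: i for i, ing in enumerate(vocabulary)}


def encode(ingredients):
    """Map a variable-length ingredient list to a fixed-length 0/1 vector."""
    vec = [0] * len(vocabulary)
    for ing in ingredients:
        if ing in index:  # ingredients not in the vocabulary are ignored
            vec[index[ing]] = 1
    return vec


X = [encode(r["ingredients"]) for r in recipes]  # feature vectors
y = [r["cuisine"] for r in recipes]              # target labels
```

Each column of `X` corresponds to one ingredient in `vocabulary`, so recipes of any length end up with the same number of features.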

ALGORITHMS USED:-
LOGISTIC REGRESSION:- The go-to method for binary classification. It outputs a probability between 0 and 1, which is thresholded to give a discrete outcome: either one class or the other. For a multi-class target such as cuisine, it is extended with a one-vs-rest or multinomial (softmax) formulation.
RANDOM FOREST CLASSIFIER:- It builds multiple decision trees and merges their predictions to get a more accurate and stable prediction. One big advantage of random forest is that it can be used for both classification and regression problems, which form the majority of current machine learning systems.
NAIVE BAYES:- A classification algorithm for binary (two-class) and multi-class classification problems.
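To illustrate how logistic regression arrives at a class, here is a small sketch of the final prediction step only: a sigmoid turns one score into a probability for the binary case, and a softmax turns one score per class into probabilities for the multi-class case. The scores and labels are hypothetical; in practice they would come from a fitted model, and the function names are chosen for this example.

```python
import math


def sigmoid(z):
    """Binary case: squash a real-valued score into a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def softmax(scores):
    """Multi-class case: turn one score per class into probabilities summing to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def predict_binary(z, threshold=0.5):
    """Threshold the probability to get a discrete 0/1 outcome."""
    return 1 if sigmoid(z) >= threshold else 0


def predict_class(scores, labels):
    """Pick the label whose class probability is highest."""
    probs = softmax(scores)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best]


# Hypothetical per-class scores for a single recipe.
labels = ["italian", "indian", "mexican"]
scores = [2.0, 0.5, -1.0]
predicted = predict_class(scores, labels)
```

The model itself produces probabilities; the discrete outcome only appears once a threshold (binary) or an argmax (multi-class) is applied.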

CONCLUSION
ALGORITHM USED         ACCURACY
LOGISTIC REGRESSION    78.59%
RANDOM FOREST          66.64%
NAIVE BAYES            36.89%
HENCE, LOGISTIC REGRESSION ACHIEVED THE HIGHEST ACCURACY OF THE THREE ALGORITHMS ON THIS DATASET.
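For reference, accuracy is the fraction of predictions that match the true labels. The sketch below shows that calculation on a made-up set of labels, and then picks the top entry from the accuracies reported in the table above. The function name and the toy labels are assumptions for this example.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that equal the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Toy example: 3 of 4 predictions are correct.
toy_accuracy = accuracy(
    ["italian", "indian", "mexican", "italian"],
    ["italian", "indian", "italian", "italian"],
)

# Accuracies as reported in the slides (percent).
reported = {
    "Logistic Regression": 78.59,
    "Random Forest": 66.64,
    "Naive Bayes": 36.89,
}
best = max(reported, key=reported.get)
```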

Problem 2: Will You Get A Free Pizza?
This problem is based on sentiment analysis, in which we identify positive, negative and neutral opinions in natural-language text.
Here, if someone buys a pizza for the requester, the request is considered successful; if not, it is considered unsuccessful.
INPUT :- Dataset of textual requests for pizza from the Random Acts Of Pizza community on Reddit.
GOAL :- Given a request (post), predict whether it will be successful or unsuccessful.
We aim to convert textual features into numeric features that contain sentiment information, suitable as input to a machine learning algorithm.
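One simple way to picture the text-to-numbers step is a lexicon count: tally positive and negative words and derive a polarity score. The sketch below uses a tiny, made-up word list purely as a stand-in; the slides mention NLTK for the actual polarity scores, and this is not that API.

```python
import re

# Tiny hypothetical lexicon, for illustration only. A real system would use a
# full sentiment resource rather than these hand-picked words.
POSITIVE = {"thank", "thanks", "grateful", "appreciate", "love", "happy"}
NEGATIVE = {"broke", "hungry", "lost", "sad", "unemployed", "sick"}


def sentiment_features(text):
    """Return [positive count, negative count, polarity, token count]."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(t in POSITIVE for t in tokens)
    neg = sum(t in NEGATIVE for t in tokens)
    n = len(tokens)
    # Polarity in [-1, 1]: net sentiment words per token; 0.0 for empty text.
    polarity = (pos - neg) / n if n else 0.0
    return [pos, neg, polarity, n]


request = "Lost my job and I am hungry, would really appreciate a pizza. Thanks!"
features = sentiment_features(request)
```

The resulting list is a fixed-length numeric vector, which is the form a classifier expects regardless of how long the original request was.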

DATASET SHARED: TRAINING AND TEST
Dataset Format: JSON file
Language Used : Python
Type of Machine Learning used : Supervised
Dataset includes :
-5671 requests collected from the Reddit community Random Acts Of Pizza between December 8, 2010 and September 29, 2013.
-Outcome of each request (whether the author gets the pizza or not) : Known
Metadata includes : time of the request, activity of the requester, community age of the requester, etc.
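The metadata can be turned into model inputs alongside the text. The sketch below derives a few numeric features from one record. The field names (`unix_timestamp_of_request`, `requester_account_age_in_days`, `requester_received_pizza`, `request_text`) and the sample values are assumptions made for this example and may not match the real file.

```python
import json
from datetime import datetime, timezone

# Hypothetical record; field names and values are assumed, not from the slides.
raw = '''{
  "request_text": "Would love a pizza tonight.",
  "unix_timestamp_of_request": 1291833000,
  "requester_account_age_in_days": 42.5,
  "requester_received_pizza": true
}'''
record = json.loads(raw)


def metadata_features(rec):
    """Derive simple numeric features from the request metadata."""
    when = datetime.fromtimestamp(rec["unix_timestamp_of_request"], tz=timezone.utc)
    return {
        "hour_utc": when.hour,        # time of the request
        "weekday": when.weekday(),    # Monday = 0
        "account_age_days": rec["requester_account_age_in_days"],
        "text_length": len(rec["request_text"].split()),  # words in the request
    }


features = metadata_features(record)
label = 1 if record["requester_received_pizza"] else 0  # 1 = successful request
```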

ALGORITHMS USED
Logistic Regression : A statistical model usually applied to a binary dependent variable. The two values of the dependent variable are often labelled “0” and “1”, which in our problem are “requester does not get the pizza” and “requester gets the pizza” respectively.
Naive Bayes : A family of algorithms based on the assumption that the value of a particular feature is independent of the value of any other feature, given the class variable. Its advantage is that it requires only a small amount of training data to estimate the parameters necessary for classification.
Support Vector Machine : An extension of the support vector classifier (SVC) that accommodates non-linear boundaries through kernels. Although the definitions are distinct, people often call all of them SVM for simplicity.
Random Forest : An ensemble of a multitude of decision trees. It can be interpreted as a weighted-neighbourhood method and can also be used in an unsupervised setting.
We used NLTK’s API to get the polarity of the request text, as an input feature for predicting whether a request is successful or unsuccessful.
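The Naive Bayes independence assumption can be made concrete with a small from-scratch sketch: the class score is the prior multiplied by one probability per feature, computed here in log space. This is a Bernoulli variant with Laplace smoothing on made-up binary features; the feature meanings and function names are hypothetical, and it is not the implementation used for the reported results.

```python
import math
from collections import defaultdict


def train_bernoulli_nb(X, y, alpha=1.0):
    """Estimate log priors and P(x_i = 1 | class) with Laplace smoothing."""
    n_features = len(X[0])
    counts = defaultdict(int)                     # class -> number of examples
    ones = defaultdict(lambda: [0] * n_features)  # class -> count of x_i == 1
    for row, label in zip(X, y):
        counts[label] += 1
        for i, v in enumerate(row):
            ones[label][i] += v
    model = {}
    for label, n in counts.items():
        prior = math.log(n / len(y))
        probs = [(ones[label][i] + alpha) / (n + 2 * alpha) for i in range(n_features)]
        model[label] = (prior, probs)
    return model


def predict(model, row):
    """Return the class with the highest log posterior."""
    def log_posterior(label):
        prior, probs = model[label]
        # Independence assumption: the joint likelihood is a product over
        # features, i.e. a sum of per-feature log probabilities.
        return prior + sum(math.log(p if v else 1.0 - p) for v, p in zip(row, probs))
    return max(model, key=log_posterior)


# Hypothetical binary features: [mentions_job_loss, says_thanks]
X = [[1, 1], [1, 0], [0, 1], [0, 0], [0, 0], [1, 1]]
y = [1, 0, 1, 0, 0, 1]  # 1 = request successful, 0 = unsuccessful
model = train_bernoulli_nb(X, y)
```

Because each feature contributes its own factor, only per-class, per-feature counts need to be estimated, which is why the method gets by with relatively little training data.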

CONCLUSION
Hence, SVM and Random Forest achieved the highest accuracy on this dataset, although all four models are within about 0.7 percentage points of each other.
ALGORITHM                       ACCURACY
SUPPORT VECTOR MACHINE (SVM)    0.7649
RANDOM FOREST                   0.7605
NAIVE BAYES                     0.7580
LOGISTIC REGRESSION             0.7580

Slide 4 (Problem 1): plot of the most common ingredients used.