Datamining

Neha Agrawal

Fruit Classification using different Machine Learning Algorithms

AGENDA
Introduction
Motivation
Scope
Dataset description
Features
Technologies used
Confusion Matrix

INTRODUCTION
Problem Statement
A system that trains classification models using
data mining techniques to predict a fruit's name
from its features.

MOTIVATION
The product can be used in jam manufacturing
industries or similar factories where fruit
segregation must be done automatically,
without human intervention.

SCOPE
The system assumes that the measurements
used for prediction are available beforehand.
The current scope is to classify fruits by type,
not by subtype.

DATASET USED
• The fruits dataset was created by Dr. Iain Murray of the University of Edinburgh.
• Professors at the University of Michigan then reformatted the fruits data slightly.

FEATURES
• The system uses three classification algorithms:
1. KNN
2. Naïve Bayes
3. Decision trees
• The accuracy of each model is displayed.
• The fruit label can be predicted for a new,
unlabelled dataset.
• The system also has an interactive GUI.
• LINK FOR SRS:

PRE-PROCESSING
Scaling: The features in the dataset are scaled so
that all attributes carry equal weight, because the
height and weight values are much larger than the
colour score.
Attribute Selection: Attributes that are not required,
such as the fruit subtype, are removed. Where several
attributes are redundant, only one of them is kept.
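
The scaling step can be sketched as follows. The slides do not say which scaling method was used, so min-max scaling to the range [0, 1] is an assumption here, and the function name and sample values are hypothetical.

```python
def min_max_scale(rows):
    """Rescale every column of `rows` to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            # A constant column has no spread, so map it to 0.0.
            (value - low) / (high - low) if high > low else 0.0
            for value, low, high in zip(row, lows, highs)
        ])
    return scaled


# Hypothetical measurements: [weight in grams, height in cm, colour score]
samples = [
    [192.0, 7.3, 0.55],
    [86.0, 4.6, 0.79],
    [160.0, 7.5, 0.81],
]
scaled_samples = min_max_scale(samples)
for row in scaled_samples:
    print([round(value, 3) for value in row])
```

After scaling, weight no longer dominates a distance calculation simply because it is measured in larger numbers than the colour score.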

CROSS VALIDATION STRATEGY
We train each model on only 70% of the dataset;
the remaining 30% is reserved for testing the validity
of the model.
The split is made by randomly selecting records
for training. The remaining records are then used
for prediction with the trained model.
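
A minimal sketch of such a random 70/30 split, using only the Python standard library. The function name, the fixed seed, and the placeholder records are assumptions for illustration, not taken from the slides.

```python
import random


def holdout_split(records, train_fraction=0.7, seed=0):
    """Shuffle a copy of `records` and cut it into train and test parts."""
    rng = random.Random(seed)  # fixed seed keeps the split repeatable
    shuffled = list(records)
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


# Hypothetical records: 20 row indices standing in for fruit measurements
records = list(range(20))
train, test = holdout_split(records)
print(len(train), len(test))
```

Every record lands in exactly one of the two parts, so the test records are never seen during training.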

ALGORITHMS USED
1. K nearest neighbour: The distance from a new data point to every
training data point is calculated, and the k nearest training points are
selected. The new point is then assigned to the class to which the
majority of those k points belong.
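
The steps above can be sketched in plain Python. Euclidean distance and k = 3 are assumptions (the slides name neither), and the feature values are hypothetical.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Return the majority label among the k training points nearest to `query`."""
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    distances.sort(key=lambda pair: pair[0])
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical scaled features: [weight, height, colour score]
train_points = [
    [0.90, 0.85, 0.20], [0.95, 0.80, 0.25],  # apple
    [0.10, 0.15, 0.90], [0.15, 0.10, 0.85],  # mandarin
    [0.50, 0.95, 0.60],                      # lemon
]
train_labels = ["apple", "apple", "mandarin", "mandarin", "lemon"]
predicted = knn_predict(train_points, train_labels, [0.88, 0.82, 0.22], k=3)
print(predicted)
```

No model is built in advance: all of the work happens at prediction time, against the stored training points.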

2. Naïve Bayes: A probabilistic model based on
Bayes’ theorem. It assumes that every pair of features
is conditionally independent given the class.
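
A minimal sketch of the idea, assuming the Gaussian variant of Naïve Bayes (the slides do not say which variant was used). Because the features are treated as independent given the class, the log posterior is simply a sum of one term per feature. All names and values are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(points, labels):
    """Estimate a prior plus per-feature mean and variance for each class."""
    grouped = defaultdict(list)
    for point, label in zip(points, labels):
        grouped[label].append(point)
    model = {}
    for label, rows in grouped.items():
        prior = len(rows) / len(points)
        stats = []
        for column in zip(*rows):
            mean = sum(column) / len(column)
            var = sum((v - mean) ** 2 for v in column) / len(column)
            stats.append((mean, var + 1e-9))  # small floor avoids division by zero
        model[label] = (prior, stats)
    return model


def predict_gaussian_nb(model, query):
    """Pick the class with the highest log posterior for `query`."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(query, stats):
            # Independence assumption: add one Gaussian log-likelihood per feature.
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical scaled features: [weight, height, colour score]
points = [
    [0.90, 0.85, 0.20], [0.95, 0.80, 0.25], [0.85, 0.90, 0.30],
    [0.10, 0.15, 0.90], [0.15, 0.10, 0.85], [0.20, 0.20, 0.80],
]
labels = ["apple", "apple", "apple", "mandarin", "mandarin", "mandarin"]
nb_model = fit_gaussian_nb(points, labels)
nb_prediction = predict_gaussian_nb(nb_model, [0.88, 0.84, 0.24])
print(nb_prediction)
```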

3. Decision trees: The problem is solved using a
tree representation, with the most informative attributes
chosen for the upper levels of the tree.
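
One common way to choose the attribute for a node is to try every candidate split and keep the one that leaves the purest groups. The sketch below assumes Gini impurity as the criterion, which the slides do not specify; the data and names are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())


def best_split(points, labels):
    """Return (feature index, threshold, weighted impurity) of the best binary split."""
    best = (None, None, float("inf"))
    for feature in range(len(points[0])):
        for threshold in sorted({point[feature] for point in points}):
            left = [lab for p, lab in zip(points, labels) if p[feature] <= threshold]
            right = [lab for p, lab in zip(points, labels) if p[feature] > threshold]
            if not left or not right:
                continue  # a split must put something on both sides
            weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
            if weighted < best[2]:
                best = (feature, threshold, weighted)
    return best


# Hypothetical scaled features: [weight, height, colour score]
tree_points = [
    [0.90, 0.50, 0.20],
    [0.30, 0.60, 0.25],
    [0.80, 0.40, 0.90],
    [0.20, 0.55, 0.85],
]
tree_labels = ["apple", "apple", "mandarin", "mandarin"]
feature, threshold, impurity = best_split(tree_points, tree_labels)
print(feature, threshold, impurity)
```

In this made-up data only the colour score separates the two classes cleanly, so it is the attribute selected for the root; a full tree would repeat the same search on each resulting group.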

COMPARISON
          KNN                     Naïve Bayes                           Decision Tree
Accuracy  0.9                     0.65                                  0.8
Detail    Memory-based technique  Considers all attributes independent  Follows SOP (sum-of-products) representation
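
For reference, accuracy and a confusion matrix (listed in the agenda) can be computed from held-out labels as sketched below. The labels are made up for illustration and do not reproduce the figures in the comparison above.

```python
from collections import Counter


def accuracy(actual, predicted):
    """Fraction of predictions that match the actual labels."""
    matches = sum(1 for a, p in zip(actual, predicted) if a == p)
    return matches / len(actual)


def confusion_matrix(actual, predicted):
    """Count how often each (actual, predicted) label pair occurs."""
    return Counter(zip(actual, predicted))


# Hypothetical test-set labels and model predictions
actual = ["apple", "lemon", "orange", "mandarin"]
predicted_labels = ["apple", "lemon", "lemon", "mandarin"]

score = accuracy(actual, predicted_labels)
matrix = confusion_matrix(actual, predicted_labels)
print(score)
for (actual_label, predicted_label), count in sorted(matrix.items()):
    print(actual_label, "->", predicted_label, count)
```

The off-diagonal entries, such as an orange predicted as a lemon, show which classes a model confuses, which a single accuracy figure hides.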

TOOLS/TECHNIQUES USED
• Tools:
1. Jupyter notebook
2. Python
3. Libraries used: Tkinter, matplotlib, pandas
• Techniques:
1. KNN (k nearest neighbour algorithm)
2. Naïve Bayes algorithm
3. Decision tree
