Classification of Apple diseases through machine learning

An Automated System for Early Identification
of Diseases in Apple through Machine
Learning
Muqaddas Bin Tahir
MS(Computer Science)
HITEC University

Presentation Outline
 Introduction
 Motivation
 Literature Review
 Problem Statement
 Objective
 Proposed Methodology
 Results and Analysis
 Comparison
 Conclusion and Future work
 References
Department of Computer Science HITEC University

Introduction
Important agricultural crops are threatened by a variety of plant diseases and pests.
About 42% of the world’s agricultural crops are destroyed yearly [8].
In this dissertation, a hybrid method is developed to classify apple leaf images into four classes (three diseases and healthy).
The apple leaf is used as a test case.

Introduction
Proposed Method:
Data Acquisition
Data Normalization
Transfer Learning
Convolution
Feature Extraction
Feature Fusion
Variance Control
Classification
Dataset Description:
Apple Scab: 630 Images
Black Rot: 621 Images
Apple Cedar: 275 Images
Healthy: 1645 Images

Introduction
Types of pathogens that affect crops:
Bacteria
Virus
Fungus

Introduction
Approaches used to control different diseases
(1) Regular surveys by experts
(2) Spraying after a fixed time interval
(3) Naked-eye survey by the farmer
(4) Use of different image processing techniques
(5) Use of AI and machine learning in agriculture

Motivation
 Identification of plant diseases is a critical research problem at the intersection of computer vision and agriculture.
 Farmers mostly rely on the naked eye to identify diseases, but some diseases are not easily recognizable this way.
 Most farmers cannot afford expensive plant protection systems or regular monitoring by experts.
 Over the last decade, image processing techniques have played a crucial role in identifying plant diseases, but most of them are time-consuming, complex, and lengthy.
 There is therefore strong motivation for a system that can be trained quickly and easily to work under different lighting conditions, from different angles, and across all types of diseases.
 In this dissertation, the apple is used as a test case. Total world apple production is 50 lakh (5 million) tons, of which 13 lakh 35 thousand (1.335 million) tons are grown in Pakistan. Different machine learning concepts are used to classify diverse apple diseases quickly and accurately.

Literature Review
SR 1: Yun Zhang (2017), accuracy 97.63%
Main goal: Identification of apple diseases with the help of a neural network.
Methodology: PCA jittering; convolutional neural network (AlexNet, VGGNet); NAG algorithm; GoogLeNet Inception.
Issues: Scalability issues; high computational cost; expensive (not easily affordable for individuals); not applicable to real-time environments.
SR 2: Shiv Ram Dubey (2017), accuracy 93%
Main goal: Identification of apple diseases with the help of a neural network.
Methodology: K-means clustering; global color histogram, color coherence vector, local binary pattern, complete local binary pattern; SVM (support vector machine), multiclass SVM.
Issues: Time-consuming process; not good for large databases; fusion of more than one feature would improve the output; with SVM, it is difficult to choose a kernel function and parameters that map the original data into a higher dimension; a major limitation of SVM is that it natively handles only two classes at a time.

SR 3: Srdjan Sladojevic (2016), accuracy 96.3%
Main goal: Identification of apple diseases with the help of a deep neural network.
Methodology: Convolutional neural network; ReLU; softmax; OpenCV.
Issues: Time-consuming (slow) process.
SR 4: Prabira Kumar Sethy (2017), accuracy 97.63%
Main goal: Detection of diseases in rice crop leaves.
Methodology: K-means clustering; 3-means clustering.
Issues: The algorithm does not work if a leaf is affected by two or more diseases simultaneously.

SR 5: Shiv Ram Dubey (2014), accuracy 95%
Main goal: Detection of fruit diseases.
Methodology: K-means clustering; global color histogram, color coherence vector, local binary pattern, complete local binary pattern; SVM (support vector machine), multiclass SVM.
Issues: Time-consuming process; without fusion of features the method seems impractical; a color histogram does not cover all aspects, because images with very different appearance can have similar histograms; binary data is sensitive to noise.
SR 6: Guan Wang (2017), accuracy 90.4%
Main goal: Identification of different plant diseases.
Methodology: Convolutional neural network (AlexNet, VGGNet, ResNet); ReLU; softmax.
Issues: With a softmax classifier, much computation is required for complex training data with many labels; more versatile sensors, such as infrared, in image acquisition could improve accuracy.

SR 7: Misigo Ronald, Miriti Evan (2016), accuracy 80%
Main goal: Classification of different apple types.
Methodology: Naive Bayes; Otsu algorithm.
Gaps: Naive Bayes can learn individual features easily but cannot determine the relationships among features.
SR 8: Savita N. Ghaiwat et al. (2014)
Main goal: Review of methodologies used in the detection and classification of plant leaf diseases.
Methodology: Artificial neural network; support vector machine; self-organizing maps; fuzzy logic.
Gaps: In SVM, computational complexity is reduced to a quadratic optimization problem, and it is easy to control the complexity of the decision rule and the frequency of error; in a neural network, it is difficult to understand the structure of the algorithm and to determine optimal parameters when the training data is not linearly separable.

SR 9: Mrunalini R. et al. (2011)
Main goal: Pattern recognition for crop diseases.
Methodology: K-means clustering and artificial intelligence.
Gaps: Artificial neural networks and fuzzy logic, with some other soft computing techniques, can be used to classify crop diseases.
SR 10: Sabah Bashir et al. (2012)
Main goal: Remote area plant disease detection using image processing.
Methodology: Co-occurrence matrix method and K-means clustering technique for texture segmentation.
Gaps: Bayes classifier, K-means clustering, and principal component classifier can be used to classify various plant diseases.

SR 11: Smita Naikwadi et al. (2013)
Main goal: Advances in image processing for detection of plant diseases.
Methodology: Spatial gray-level dependence matrices; color co-occurrence texture analysis method.
Gaps: Better detection results can be obtained with a large database and advanced color feature extraction.
SR 12: Di Cui, Qin Zhang (2009)
Main goal: Detection of soybean rust using a multispectral image sensor.
Methodology: Multispectral image sensor.
Gaps: Research needs to be carried out to verify the correlation between DVI and rust severity; comprehensive studies are needed to verify the sensible range and accuracy of this method in different environments.

Problem Statement
 Many techniques have been implemented in agriculture for the detection and classification of fruit diseases. Most of them rely on hand-crafted features, which do not yield promising results because of factors such as changes in illumination, translation, occlusion, and rotation.

Problem Statement
 The issues in disease detection addressed in this research work are:
Choice of color spaces.
Selection of distinct features.
Selection of useful features.
Weak contrast and boundaries.
Inhomogeneous object regions during image segmentation.
Irregular shapes during feature extraction.
Similarities between diseased and normal apples.
Different symptoms with similar features and similar shapes.
Scalability issues (no fixed parameter for scale adjustment across different diseases).

Objectives
Develop an efficient methodology using computer vision and machine learning techniques for the detection and classification of apple diseases, incorporating the following:
 Perform pre-processing to improve the visual contrast of disease spots in the input image.
 Fine-tune an end-to-end convolutional neural network for disease identification that performs better than other disease identification techniques.
 Extract deep CNN features by segmenting mapped RGB images, and use SVM for classification.
 Evaluate the proposed system on a publicly available dataset.
 Make the system less affected by illumination, translation, and occlusion problems, so that it offers a more general solution.
 Compare the proposed system with existing methods.

Proposed Methodology
Design Methodology
 The proposed method for apple disease identification comprises six primary steps.
[Figure: design methodology diagram]

[Figure: Detailed block diagram of the proposed method]

Dataset Detail (Sample Images)
Class          Images
Apple Scab     630
Black Rot      621
Apple Cedar    275
Healthy        1645
Total          3171

Proposed Method
 In the proposed model, four classes of apple leaf images (three diseases and healthy) are classified using transfer learning with Inception-v3, a neural network model.

Proposed Model
Transfer Learning
 Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task.
 Main advantages of transfer learning:
 Reduces the computational resources required
 Reduces computation time
 Well suited to small datasets
 Reduces learning time
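The definition above can be sketched without any deep learning library. In the hypothetical example below, `pretrained_features` is a made-up fixed function standing in for a frozen pretrained network such as Inception V3 (it is not the real model), and only a small nearest-centroid classifier is fitted for the second task. All names and the toy data are assumptions for illustration.

```python
from collections import defaultdict


def pretrained_features(x):
    """Hypothetical frozen feature extractor (stand-in for a pretrained CNN).

    It is reused as-is; nothing inside it is updated for the second task.
    """
    a, b = x
    return (a + b, a - b)


def fit_centroids(samples, labels):
    """Train only the new classifier head: one mean feature vector per class."""
    grouped = defaultdict(list)
    for x, y in zip(samples, labels):
        grouped[y].append(pretrained_features(x))
    return {
        y: tuple(sum(col) / len(col) for col in zip(*feats))
        for y, feats in grouped.items()
    }


def predict(centroids, x):
    """Assign the class whose centroid is nearest in feature space."""
    f = pretrained_features(x)
    return min(
        centroids,
        key=lambda y: sum((fi - ci) ** 2 for fi, ci in zip(f, centroids[y])),
    )


# Toy data for the second task (made-up numbers, not taken from the dataset).
train_x = [(1.0, 0.9), (1.2, 1.1), (3.0, 0.2), (3.2, 0.1)]
train_y = ["healthy", "healthy", "scab", "scab"]

centroids = fit_centroids(train_x, train_y)
print(predict(centroids, (1.1, 1.0)))
print(predict(centroids, (3.1, 0.0)))
```

Because only the small classifier is fitted, training is cheap and works with few samples, which is the advantage the slide lists.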

Proposed Model
Transfer Learning
 Building a model from scratch demands extensive computational resources, substantial training time, and large amounts of training data.
 Transfer learning therefore seems the most plausible method.
 The source task is ImageNet and its Large Visual Recognition Challenge.
 Inception V3 is the model the Google Brain team built for that purpose; it is used in the proposed technique.
Proposed Model
Why Transfer Learning
CHALLENGE
Models try to classify a huge collection of images into 1000 classes, like “Zebra”, “Orange”, and “Dishwasher”.

Proposed Model
Inception V3
 The Inception network was an important milestone in the development of CNN classifiers. The Inception deep convolutional architecture was introduced as GoogLeNet (Szegedy et al. 2015a).
 The Inception network is one of the most popular CNNs; it stacks convolution layers deeper and deeper to get better performance.
 It uses many techniques to boost performance in terms of both speed and accuracy.
 The Inception network has four versions (Inception V1, V2, V3, V4).

Proposed Model
Inception V3
Inception CNN network versions:
Inception V1: introduced as GoogLeNet (2015)
Inception V2: batch normalization
Inception V3: factorization ideas

Proposed Model
Inception V3 Architecture
 In the proposed method, Inception V3 is used.
 The Inception V3 network used in the proposed method has 316 layers in total.
 Division of layers by category in the proposed approach:
Layer Name                     Number of Layers
Image Input Layer              1
Scaling Layer                  1
Batch Normalization Layer      95
Convolution Layer              94
ReLU                           94
Max Pooling                    4
Average Pooling                9
Depth Concatenation Layer      15
Fully Connected Layer          1
Softmax                        1
Classification Output Layer    1
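As a quick arithmetic check, the per-category counts listed on this slide can be summed to confirm the stated total of 316 layers. The dictionary below only restates the slide's numbers; the variable names are illustrative.

```python
# Layer counts as listed on the slide for the Inception V3 network used.
layer_counts = {
    "Image Input Layer": 1,
    "Scaling Layer": 1,
    "Batch Normalization Layer": 95,
    "Convolution Layer": 94,
    "ReLU": 94,
    "Max Pooling": 4,
    "Average Pooling": 9,
    "Depth Concatenation Layer": 15,
    "Fully Connected Layer": 1,
    "Softmax": 1,
    "Classification Output Layer": 1,
}

total_layers = sum(layer_counts.values())
print(total_layers)  # 316
```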

Results and Analysis

Experimental Setup
 A publicly available dataset (PlantVillage) is used.
 For classification, different classifiers are tested:
 Fine Tree, Medium Tree ...
 Linear Discriminant Analysis (LDA)
 Support Vector Machine (SVM)
 K-Nearest Neighbour (KNN)
 Ensemble trees, etc.
 Eight statistical measures are used for performance analysis, such as:
 Sensitivity
 Precision
 Accuracy
 False Negative Rate (FNR)
 False Positive Rate (FPR)
 Area Under the Curve (AUC)
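As a minimal sketch of how several of the listed measures are conventionally derived, the example below computes one-vs-rest sensitivity, precision, FNR, and FPR, plus overall accuracy, from a multi-class confusion matrix. The matrix values are made up for illustration and are not results from this work; AUC is omitted because it needs classifier scores rather than counts.

```python
def per_class_metrics(cm, k):
    """One-vs-rest rates for class index k.

    cm[i][j] is the number of samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    tp = cm[k][k]
    fn = sum(cm[k]) - tp                        # true class k, predicted otherwise
    fp = sum(cm[i][k] for i in range(n)) - tp   # predicted k, true class differs
    tn = total - tp - fn - fp
    return {
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
        "fnr": fn / (fn + tp),
        "fpr": fp / (fp + tn),
    }


def accuracy(cm):
    """Fraction of all samples on the diagonal of the confusion matrix."""
    total = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / total


# Made-up 4-class confusion matrix (rows: true class, columns: predicted class).
cm = [
    [45, 3, 1, 1],
    [2, 46, 1, 1],
    [0, 1, 48, 1],
    [4, 2, 0, 44],
]

print(accuracy(cm))
print(per_class_metrics(cm, 0))
```

Note that sensitivity and FNR for a class always sum to 1, since both are computed over the same set of true samples.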

Experimental Setup
 Hardware Setup
Processor        Core i5
RAM              6 GB
Graphics Card    Nvidia Quadro
 Software Setup
Software Name/Type         Description    Version
Operating System           Windows 7      64-bit
Scripting Language         MATLAB         2018
Development Environment    MATLAB         2018

Dataset Description (PlantVillage)
Class          Images
Apple Scab     630
Black Rot      621
Apple Cedar    275
Healthy        1645
Total          3171

Statistical Analysis

Ratios for Training and Testing images
The same ratio is applied to each class (Apple Scab, Black Rot, Apple Cedar, Healthy).
Phase        Training (%)    Testing (%)
Phase I      80              20
Phase II     70              30
Phase III    50              50
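To make the three phases concrete, the sketch below turns the per-class image counts from the dataset description into training and testing counts for each ratio. Rounding the training share down to a whole image is an assumption; the slides do not say how fractional counts were handled, and the function name is illustrative.

```python
# Per-class image counts from the dataset description.
class_counts = {
    "Apple Scab": 630,
    "Black Rot": 621,
    "Apple Cedar": 275,
    "Healthy": 1645,
}

# Training percentage for each phase; the remainder is used for testing.
phases = {"Phase I": 80, "Phase II": 70, "Phase III": 50}


def split_counts(n_images, train_percent):
    """Return (train, test) counts; the training share is rounded down (assumed)."""
    n_train = n_images * train_percent // 100
    return n_train, n_images - n_train


splits = {
    phase: {name: split_counts(n, pct) for name, n in class_counts.items()}
    for phase, pct in phases.items()
}

for phase, per_class in splits.items():
    print(phase, per_class)
```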

FULLY CONNECTED LAYER (Training 80, Testing 20)
CLASSIFIER ACCURACY FNR RECALL SPECIFICITY AUC
Fine Tree 49.1% 50.9 49.5 49.5 0.68
Medium Tree 49.1% 50.9 49.5 49.5 0.68
Coarse Tree 45.0% 55 45 45 0.65
Linear Discriminant 87.3% 12.7 87.5 87.5 0.91
Linear SVM 95.9% 4.09 96 96 0.99
Quadratic SVM 96.4% 3.59 96.5 96.5 0.99
Cubic SVM 95.9% 4.09 96 96 0.99
Fine Gaussian SVM 27.3% 72.7 27 27 0.58
Medium Gaussian SVM 93.2% 6.8 93.25 93.25 0.99
Coarse Gaussian SVM 87.7% 12.3 87.75 87.75 0.98
Fine KNN 81.4% 18.6 81.25 81.25 0.87
Medium KNN 80.5% 19.5 80.5 80.5 0.96
Coarse KNN 68.2% 31.8 68.25 68.25 0.93
Cosine KNN 88.6% 11.4 88.5 88.5 0.98
Cubic KNN 80.5% 19.5 80.5 80.5 0.96
Weighted KNN 83.2% 16.8 83 83 0.97
Boosted Trees 73.2% 26.8 73.25 73.25 0.90
Bagged Trees 85.0% 15 85 85 0.97
Subspace Discriminant 91.4% 8.59 91 91 0.98
Subspace KNN 80.9% 19.1 80.75 80.75 0.92
RUSBoosted Tree 55.9% 44.1 55.75 55.75 0.79

[Figure: Confusion matrix, FULLY CONNECTED LAYER (Training 80, Testing 20)]

[Figure: ROC curve, FULLY CONNECTED LAYER (Training 80, Testing 20)]

CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL SPECIFICITY AUC
Fine Tree 59.1% 55 62 60 60 0.70 0.81 0.77 0.76 40.9 59.25 59.25 0.76
Medium Tree 59.1% 56 61 60 60 0.74 0.82 0.76 0.76 40.9 59.25 59.25 0.77
Coarse Tree 59.5% 70 56 50 62 0.75 0.76 0.73 0.75 40.5 59.5 59.5 0.74
Linear Discriminant 84.5% 80 88 90 78 0.86 0.92 0.93 0.87 15.9 84 84 0.89
Linear SVM 91.8% 93 91 95 88 0.99 0.99 1.00 0.98 8.2 91.75 91.75 0.99
Quadratic SVM 91.5% 91 94 98 83 0.99 0.99 1.00 0.98 8.5 91.5 91.5 0.99
Cubic SVM 91.2% 93 94 96 82 0.98 0.99 1.00 0.98 8.8 91.25 91.25 0.98
Fine Gaussian SVM 30.8% 28 9 15 72 0.55 0.62 0.63 0.60 69.2 31 31 0.6
Medium Gaussian SVM 90.9% 94 93 94 83 0.98 0.99 1.00 0.98 9.09 91 91 0.98
Coarse Gaussian SVM 85.1% 90 93 94 63 0.98 0.98 1.00 0.97 14.9 85 85 0.98
Fine KNN 81.4% 73 91 96 65 0.84 0.92 0.95 0.80 18.6 81.25 81.25 0.87
Medium KNN 82.3% 90 93 94 52 0.97 0.99 0.99 0.93 17.7 82.25 82.25 0.97
Coarse KNN 76.8% 76 93 99 40 0.94 0.97 0.99 0.94 23.2 77 77 0.96
Cosine KNN 86.0% 93 93 87 72 0.96 0.99 0.98 0.94 14 86.25 86.25 0.96
Cubic KNN 82.0% 89 93 95 51 0.96 0.98 0.99 0.93 18 82 82 0.96
Weighted KNN 82.6% 85 91 96 57 0.97 0.99 0.99 0.94 17.4 82.25 82.25 0.97
Boosted Trees 79.3% 76 78 87 77 0.92 0.94 0.96 0.91 20.7 79.5 79.5 0.93
Bagged Trees 84.8% 85 88 89 77 0.96 0.98 0.99 0.95 15.2 84.75 84.75 0.97
Subspace Discriminant 92.4% 93 96 94 87 0.99 1.00 0.99 0.98 7.59 92.5 92.5 0.99
Subspace KNN 80.5% 72 89 96 65 0.89 0.94 0.98 0.84 19.5 80.5 80.5 0.91
RUSBoosted Tree 61.0% 59 65 61 60 0.79 0.87 0.84 0.76 39 61.25 61.25 0.81
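The summary columns in this table appear consistent with simple averages of the per-class columns: RECALL matches the mean of the four true-positive rates, AUC matches the mean of the four per-class AUC values, and FNR matches 100 minus the accuracy. The sketch below checks this for two rows copied from the table; that this is how the author computed the columns is an inference, not something the slides state.

```python
# Two rows copied from the table: accuracy (%), per-class TP rates (%), per-class AUC.
rows = {
    "Fine Tree": {"acc": 59.1, "tp": [55, 62, 60, 60], "auc": [0.70, 0.81, 0.77, 0.76]},
    "Linear SVM": {"acc": 91.8, "tp": [93, 91, 95, 88], "auc": [0.99, 0.99, 1.00, 0.98]},
}


def mean(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


summary = {
    name: {
        "fnr": round(100 - r["acc"], 1),    # compare with the FNR column
        "recall": round(mean(r["tp"]), 2),  # compare with the RECALL column
        "auc": round(mean(r["auc"]), 2),    # compare with the AUC column
    }
    for name, r in rows.items()
}

for name, s in summary.items():
    print(name, s)
```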

[Figure: ROC curve, FULLY CONNECTED LAYER]

CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL SPECIFICITY AUC
Fine Tree 62.6% 58 69 65 58 0.76 0.82 0.82 0.77 37.4 62.5 62.5 0.792
Medium Tree 64.2% 61 69 68 58 0.78 0.81 0.84 0.81 35.8 64 64 0.81
Coarse Tree 58.9% 45 55 66 69 0.68 0.83 0.80 0.76 41.1 58.75 58.75 0.767
Linear Discriminant 83.4% 77 85 93 80 0.85 0.91 0.94 0.86 16.6 83.75 83.75 0.89
Linear SVM 94.0% 92 95 96 93 0.99 1.00 1.00 0.98 6 94 94 0.992
Quadratic SVM 96.4% 95 96 99 96 0.99 1.00 1.00 0.99 3.6 96.5 96.5 0.995
Cubic SVM 95.3% 93 95 99 93 0.99 1.00 1.00 0.99 4.7 95 95 0.995
Fine Gaussian SVM 24.5% 11 28 30 28 0.57 0.57 0.58 0.56 75.5 24.25 24.25 0.57
Medium Gaussian SVM 94.0% 91 96 96 94 0.99 0.99 1.00 0.99 6 94.25 94.25 0.992
Coarse Gaussian SVM 89.8% 88 93 92 86 0.98 0.99 1.00 0.97 10.2 89.75 89.75 0.985
Fine KNN 84.7% 82 87 93 77 0.88 0.91 0.94 0.86 15.3 84.75 84.75 0.897
Medium KNN 84.7% 85 87 96 72 0.96 0.98 0.99 0.96 15.3 85 85 0.972
Coarse KNN 80.1% 78 84 98 61 0.94 0.98 0.99 0.96 19.9 80.25 80.25 0.967
Cosine KNN 86.5% 80 92 93 81 0.96 0.98 0.98 0.96 13.5 86.5 86.5 0.97
Cubic KNN 85.6% 84 87 96 76 0.96 0.98 0.99 0.96 14.4 85.75 85.75 0.972
Weighted KNN 86.1% 83 88 96 77 0.97 0.98 0.99 0.96 13.9 86 86 0.975
Boosted Trees 85.4% 80 89 90 82 0.95 0.97 0.98 0.96 14.6 85.25 85.25 0.965
Bagged Trees 84.9% 80 89 91 80 0.96 0.98 0.99 0.96 15.1 85 85 0.972
Subspace KNN 82.7% 82 84 91 73 0.94 0.95 0.97 0.91 17.3 82.5 82.5 0.942
RUSBoosted Tree 64.8% 66 67 70 56 0.81 0.84 0.87 0.82 35.2 64.75 64.75 0.835

[Figure: ROC curve, FULLY CONNECTED LAYER]

AVERAGE POOLING (Training 80, Testing 20)
CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL SPECIFICITY AUC
Fine Tree 63.2% 53 84 62 55 0.67 0.91 0.75 0.76 36.8 63.5 63.5 0.77
Medium Tree 63.2% 53 84 62 55 0.67 0.91 0.75 0.76 36.8 63.5 63.5 0.772
Coarse Tree 65.5% 60 82 64 56 0.76 0.91 0.82 0.77 34.5 65.5 65.5 0.815
Linear Discriminant 95.5% 95 98 96 93 0.96 0.99 0.98 0.95 4.5 95.5 95.5 0.97
Linear SVM 95.0% 95 96 95 95 0.99 1.00 1.00 1.00 5 95.25 95.25 0.997
Quadratic SVM 95.5% 95 96 96 95 1.00 1.00 1.00 1.00 4.5 95.5 95.5 1
Cubic SVM 95.9% 96 96 96 95 1.00 1.00 1.00 1.00 4.1 95.75 95.75 1
Coarse Gaussian SVM 91.8% 89 93 93 93 0.98 0.99 1.00 0.98 8.2 92 92 0.987
Fine KNN 85.0% 80 89 95 76 0.86 0.92 0.95 0.87 15 85 85 0.9
Medium KNN 85.9% 78 95 96 78 0.95 0.99 0.99 0.98 14.1 86.75 86.75 0.977
Coarse KNN 60.0% 25 56 100 58 0.92 0.93 0.97 0.92 40 59.75 59.75 0.935
Cosine KNN 90.9% 87 95 93 89 0.98 0.98 0.98 0.98 9.1 91 91 0.98
Cubic KNN 84.5% 76 93 96 73 0.94 0.99 0.99 0.97 15.5 84.5 84.5 0.972
Weighted KNN 85.9% 75 93 98 78 0.96 0.99 0.99 0.98 14.1 86 86 0.98
Boosted Trees 60.5% 62 76 53 51 0.81 0.93 0.85 0.75 39.5 60.5 60.5 0.835
Bagged Trees 88.2% 82 93 93 85 0.96 0.98 1.00 0.99 11.8 88.25 88.25 0.982
Subspace KNN 84.1% 73 95 93 76 0.92 0.99 0.99 0.91 15.9 84.25 84.25 0.952
RUSBoosted Tree 67.7% 60 82 64 65 0.78 0.91 0.83 0.79 32.3 67.75 67.75 0.827

[Figure: Confusion matrix, AVERAGE POOLING (Training 80, Testing 20)]

[Figure: ROC curve, AVERAGE POOLING]

AVERAGE POOLING (Training 70, Testing 30)
CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL SPECIFICITY AUC
Fine Tree 69.5% 61 77 71 70 0.76 0.86 0.84 0.82 30.5 69.75 69.75 0.82
Medium Tree 69.2% 61 76 71 70 0.76 0.87 0.85 0.82 30.8 69.5 69.5 0.825
Coarse Tree 64.9% 71 66 62 61 0.79 0.84 0.74 0.78 35.1 65 65 0.787
Linear Discriminant 97.0% 94 99 100 95 0.96 0.99 1.00 0.97 3 97 97 0.98
Linear SVM 95.4% 93 98 99 93 0.99 1.00 1.00 0.99 4.6 95.75 95.75 0.995
Quadratic SVM 94.8% 93 98 99 90 0.99 1.00 1.00 0.99 5.2 95 95 0.995
Cubic SVM 94.2% 90 98 99 90 0.99 1.00 1.00 0.99 5.8 94.25 94.25 0.995
Coarse Gaussian SVM 92.7% 89 96 96 80 0.98 1.00 1.00 0.99 7.3 90.25 90.25 0.992
Fine KNN 88.1% 84 94 95 79 0.89 0.96 0.96 0.88 11.9 88 88 0.922
Medium KNN 88.4% 88 95 93 78 0.97 0.99 0.99 0.99 11.6 88.5 88.5 0.985
Coarse KNN 76.2% 79 60 100 66 0.93 0.96 0.99 0.97 23.8 76.25 76.25 0.962
Cosine KNN 89.6% 85 98 87 89 0.98 1.00 1.00 0.97 10.4 89.75 89.75 0.987
Cubic KNN 92.4% 89 98 98 85 0.98 0.99 1.00 0.98 7.6 92.5 92.5 0.987
Weighted KNN 91.8% 88 95 96 88 0.98 0.99 0.99 0.99 8.2 91.75 91.75 0.987
Boosted Trees 85.4% 79 94 84 84 0.96 0.98 0.99 0.97 14.6 85.25 85.25 0.975
Bagged Trees 88.1% 83 94 91 84 0.96 0.99 1.00 0.98 11.9 88 88 0.982
Subspace KNN 89.6% 85 95 94 84 0.96 1.00 0.99 0.95 10.4 89.5 89.5 0.975
RUSBoosted Tree 70.4% 66 72 72 72 0.84 0.90 0.88 0.87 29.6 70.5 70.5 0.872

[Figure: Confusion matrix, AVERAGE POOLING (Training 70, Testing 30)]

[Figure: ROC curve, AVERAGE POOLING]

AVERAGE POOLING LAYER (Training 50, Testing 50)
CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL SPECIFICITY AUC
Fine Tree 66.4% 64 70 64 68 0.79 0.83 0.83 0.79 33.6 66.5 66.5 0.81
Medium Tree 66.4% 64 72 64 66 0.81 0.84 0.82 0.79 33.6 66.5 66.5 0.815
Coarse Tree 63.0% 71 61 61 58 0.79 0.81 0.78 0.77 37 62.75 62.75 0.787
Linear Discriminant 94.9% 91 98 98 93 0.94 0.89 0.98 0.95 5.1 95 95 0.94
Linear SVM 96.7% 96 98 99 94 1.0 1.0 1.0 1.0 3.3 96.75 96.75 1
Quadratic SVM 96.7% 96 98 99 95 1.0 1.0 1.0 1.0 3.3 97 97 1
Cubic SVM 96.7% 95 98 99 96 1.0 1.0 1.0 1.0 3.3 97 97 1
Medium Gaussian SVM 94.7% 93 97 95 94 0.99 1.00 1.00 0.99 5.3 94.75 94.75 0.995
Coarse Gaussian SVM 92.0% 93 93 95 86 0.99 0.99 1.0 0.99 8 91.75 91.75 0.992
Fine KNN 89.2% 89 91 96 82 0.92 0.94 0.96 0.89 10.8 89.5 89.5 0.927
Medium KNN 92.5% 93 92 98 88 0.99 0.99 1.0 0.99 7.5 92.75 92.75 0.992
Coarse KNN 83.6% 85 78 98 74 0.96 0.98 0.99 0.98 16.4 83.75 83.75 0.977
Cosine KNN 92.3% 95 94 94 86 0.99 0.99 1.0 0.99 7.7 92.25 92.25 0.992
Cubic KNN 91.8% 90 92 98 88 0.98 0.98 1.0 0.99 8.2 92 92 0.987
Weighted KNN 93.4% 93 93 99 89 0.99 0.99 1.0 0.99 6.6 93.5 93.5 0.992
Boosted Trees 88.5% 87 91 91 86 0.98 0.98 0.99 0.98 11.5 88.75 88.75 0.982
Bagged Trees 88.7% 88 92 90 85 0.98 0.99 0.99 0.98 11.3 88.75 88.75 0.985
Subspace KNN 86.9% 87 87 95 79 0.95 0.97 0.98 0.93 13.1 87 87 0.957
RUSBoosted Tree 69.5% 69 73 66 70 0.86 0.87 0.86 0.85 30.5 69.5 69.5 0.86

[Figure: Confusion matrix, AVERAGE POOLING LAYER (Training 50, Testing 50)]

[Figure: ROC curve, AVERAGE POOLING LAYER]

VARIANCE CONTROLLED (AVERAGE POOLING 1000) (Training 50, Testing 50)
CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL AUC
Fine Tree 64.1 58 55 75 69 0.77 0.79 0.85 0.84 35.9 64.25 0.81
Medium Tree 64.1 58 55 75 69 0.77 0.79 0.85 0.84 35.9 64.25 0.81
Coarse Tree 67.3 73 51 71 75 0.77 0.74 0.80 0.84 32.7 67.5 0.78
Linear Discriminant 90.5 89 93 96 84 0.94 0.94 0.97 0.90 9.5 90.5 0.93
Linear SVM 91.4 89 91 95 91 0.99 0.99 1.0 0.98 8.6 91.5 0.99
Quadratic SVM 93.6 93 95 96 91 0.99 0.99 1.0 0.98 6.4 93.75 0.99
Cubic SVM 93.2 95 93 96 89 0.98 0.99 1.0 0.98 6.8 93.25 0.98
Fine Gaussian SVM 30.9 27 24 35 38 0.56 0.57 0.59 0.58 69.1 31 0.57
Medium Gaussian SVM 90.5 82 95 95 91 0.98 0.99 1.0 0.98 9.5 90.75 0.98
Coarse Gaussian SVM 85.0 78 85 93 84 0.96 0.98 1.0 0.96 15 85 0.97
Fine KNN 80.9 78 80 98 67 0.85 0.88 0.96 0.81 19.1 80.75 0.87
Medium KNN 80.0 78 82 96 64 0.93 0.96 0.99 0.95 20 80 0.95
Coarse KNN 72.3 76 67 95 51 0.92 0.95 0.97 0.95 27.7 72.25 0.94
Cosine KNN 85.0 76 87 95 82 0.94 0.98 1.0 0.95 15 85 0.96
Cubic KNN 80.5 76 84 96 65 0.93 0.96 0.99 0.96 19.5 80.25 0.96
Weighted KNN 82.3 76 84 98 71 0.94 0.97 1.0 0.95 17.7 82.25 0.96
Boosted Trees 76.4 75 71 85 75 0.90 0.92 0.94 0.89 23.6 76.5 0.91
Bagged Trees 85.9 80 89 93 82 0.93 0.98 0.99 0.95 14.1 86 0.96
Subspace Discriminant 93.2 89 95 96 93 0.98 0.99 1.0 0.98 6.8 93.25 0.98
Subspace KNN 77.7 76 80 98 56 0.90 0.93 1.0 0.89 22.3 77.5 0.93
RUSBoosted Tree 68.2 67 64 71 71 0.84 0.86 0.85 0.84 31.8 68.25 0.84
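The slides do not define the variance control step. One possible reading, stated here purely as an assumption, is that a fixed number of features (for example 1000 or 1500) is kept from the average pooling output by ranking features on their variance across images. The sketch below illustrates that reading on a tiny made-up feature matrix; the function names and data are hypothetical and are not the author's implementation.

```python
def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)


def top_variance_features(X, k):
    """Indices of the k columns of X with the largest variance, in column order."""
    columns = list(zip(*X))
    ranked = sorted(
        range(len(columns)), key=lambda j: variance(columns[j]), reverse=True
    )
    return sorted(ranked[:k])


# Made-up feature matrix: 3 images (rows) by 4 features (columns).
X = [
    [0.1, 5.0, 1.0, 0.30],
    [0.1, 1.0, 1.1, 0.90],
    [0.1, 9.0, 0.9, 0.10],
]

keep = top_variance_features(X, 2)
reduced = [[row[j] for j in keep] for row in X]
print(keep)
print(reduced)
```

Under this reading, constant or nearly constant features (such as the first column above) are dropped because they carry little information for separating classes.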

[Figure: Confusion matrix, VARIANCE CONTROL (Avg Pooling 1000) (Training 50, Testing 50), Quadratic SVM 93.6%]

VARIANCE CONTROLLED (AVERAGE POOLING 1500) (Training 50, Testing 50)
CLASSIFIER ACC True Positive (classes 1 2 3 4) ROC Curve (classes 1 2 3 4) FNR RECALL AUC
Fine Tree 64.1% 60 71 71 55 0.76 0.87 0.82 0.72 35.9 64.25 0.79
Medium Tree 64.1% 60 71 71 55 0.76 0.87 0.82 0.72 35.9 64.25 0.79
Coarse Tree 63.6% 64 73 73 45 0.73 0.88 0.79 0.64 36.4 63.75 0.76
Linear Discriminant 94.1% 95 95 93 95 0.95 0.96 0.96 0.97 5.9 94.5 0.96
Linear SVM 91.8% 89 91 95 93 0.98 0.99 1.0 0.99 8.2 92 0.99
Quadratic SVM 91.4% 91 91 95 89 0.99 0.99 1.0 0.99 8.6 91.5 0.99
Cubic SVM 92.3% 95 93 95 87 0.99 0.99 1.0 0.99 7.7 92.5 0.99
Fine Gaussian SVM 28.3% 13 31 38 31 0.55 0.55 0.59 0.55 71.7 28.25 0.56
Medium Gaussian SVM 91.8% 91 95 91 91 0.98 0.99 1.0 0.98 8.2 92 0.98
Coarse Gaussian SVM 88.2% 89 85 91 87 0.97 0.97 1.0 0.95 11.8 88 0.97
Fine KNN 84.5% 87 84 91 76 0.89 0.90 0.94 0.86 15.5 84.5 0.89
Medium KNN 86.4% 89 82 93 82 0.97 0.97 0.99 0.95 13.6 86.5 0.97
Coarse KNN 75.9% 78 73 98 55 0.93 0.95 0.98 0.92 24.1 76 0.94
Cosine KNN 86.4% 89 89 89 78 0.97 0.98 0.99 0.95 13.6 86.25 0.97
Cubic KNN 85.5% 87 85 93 76 0.97 0.97 0.98 0.95 14.5 85.25 0.96
Weighted KNN 85.5% 85 80 93 84 0.97 0.97 0.99 0.96 14.5 85.5 0.97
Boosted Trees 72.3% 75 73 80 62 0.86 0.89 0.90 0.83 27.7 72.5 0.87
Bagged Trees 86.8% 84 89 93 82 0.95 0.96 0.99 0.93 13.2 87 0.95
Subspace Discriminant 92.3% 85 95 96 93 0.98 0.99 1.0 0.98 7.7 92.25 0.98
Subspace KNN 82.3% 78 85 91 75 0.92 0.97 0.97 0.89 17.7 82.25 0.93
RUSBoosted Tree 63.2% 58 71 71 53 0.77 0.87 0.86 0.77 36.8 63.25 0.81

[Figure: Confusion matrix, VARIANCE CONTROL (Avg Pooling 1500) (Training 50, Testing 50), Linear Discriminant 94.1%]

COMPARISON OF CLASSIFIER (Fully Connected Layer)
[Figure: bar chart of accuracy per classifier; series: (Training: 80, Testing: 20) and (Training: 70, Testing: 30)]
Highest accuracy: Quadratic SVM, 96.4%

COMPARISON OF CLASSIFIER (Average Pooling Layer)
[Figure: bar chart of accuracy per classifier; series: (Training: 80, Testing: 20) and (Training: 70, Testing: 30)]
Highest accuracy: Linear Discriminant, 97.0%

COMPARISON OF CLASSIFIER After Variance Control (Average Pooling Layer)
[Figure: bar chart of accuracy per classifier; series: Training: 50, Testing: 50 (AVG Pooling: 1000) and Training: 50, Testing: 50 (AVG Pooling: 1500)]
Highest accuracy: Linear Discriminant, 94.1%

Conclusion and Future Work

Conclusion
 Comparison with existing techniques shows that the proposed approach is competitive in terms of disease classification accuracy.
No.  Researcher                Year  Method                                Accuracy
1    Yun Zhang                 2017  CNN                                   97.62%
2    Qin et al.                2016  ReliefF method, SVM                   94.74%
3    McDonald, Stewart         2014  Automated image analysis              92%
4    Barbedo                   2014  B/W image segmentation                89%
5    Rothe et al.              2015  Active contour model                  85%
6    Tan et al.                2016  CNN                                   96.08%
7    Islam et al.              2017  SVM                                   95%
8    Shiv Ram Dubey            2014  K-means, SVM                          95%
9    Lucas G. Nachtigall       2016  CNN                                   97%
10   Misigo Ronald             2016  Naive Bayes, Otsu algorithm           80%
11   Mostafa Mehdipour Ghazi   2017  CNN, AlexNet, VGGNet, GoogLeNet       80%
12   Guan Wang                 2017  CNN, VGG16, VGG19, ResNet             90.4%
13   Proposed Work             2019  Transfer Learning, CNN, Inception V3  97.0%

Future Work
 The dataset used in the proposed method can be extended to cover all apple diseases.
 Other models can be used for transfer learning.
 This work focuses on the front view of the apple leaf but can be extended to side and back views.
 Fusion of other advanced features may improve the output.
 Real-time detection of diseases would help farmers protect crops more quickly.
 The model should also be trained on in-the-wild images of infected leaves.

References
 [1] Bin Liu; Yun Zhang; Dong Jian He. “Identification of apple leaf diseases based on deep convolutional neural network”. Symmetry 2018, 10(1), 11.
 [2] Qin, F.; Liu, D.X.; Sun, B.D.; Ruan, L.; Ma, Z.; Wang, H. “Identification of alfalfa leaf diseases using image recognition technology”. PLoS ONE 2016, 11, e0168274.
 [3] E. L. Stewart and B. A. McDonald, “Measuring quantitative virulence in the wheat pathogen Zymoseptoria tritici using high-throughput automated image analysis”, Phytopathology, vol. 104, no. 9, pp. 985–992, 2014.
 [4] J. G. A. Barbedo, “An automatic method to detect and measure leaf disease symptoms using digital image processing”, Plant Disease, vol. 98, no. 12, pp. 1709–1716, 2014.
 [5] Rothe, P.R.; Kshirsagar, R.V. “Cotton leaf disease identification using pattern recognition techniques”. In Proceedings of the 2015 International Conference on Pervasive Computing, Pune, India, 8–10 January 2015; pp. 1–6.
 [6] Tan, W.X.; Zhao, C.J.; Wu, H.R. “CNN intelligent early warning for apple skin lesion image acquired by infrared video sensors”. High Technol. Lett. 2016, 22, 67–74.
 [7] Islam, M.; Dinh, A.; Wahid, K.; Bhowmik, P. “Detection of potato diseases using image segmentation and multiclass support vector machine”. In Proceedings of the 30th IEEE Canadian Conference on Electrical and Computer Engineering, Windsor, ON, Canada, 30 April–3 May 2017; pp. 1–4.
 [8] Apple Production in Pakistan. http://www.dostpakistan.pk/pakistan-worlds-10th-largest-apple-producer/

Thank You

Glossary (Some Basic Terminologies)
 Artificial Neural Network
 An ANN is inspired by the neural structure of animal brains. The fundamental processing unit of a neural network is a single neuron. A biological neuron receives inputs from other neurons, combines them in some manner, performs an operation on the result, and then produces an output.
 Neuron
 A neuron is the basic unit of a neural network; it receives input values and a bias value. Each input value has an associated weight. When a value arrives at the neuron, it is multiplied by the associated weight. These weights can be updated during training.
 Connections
 Neurons in the same layer or in different layers are linked by connections, each of which has an associated weight. These weights represent the relationship between neurons: the stronger the relationship, the higher the weight. During training, the weights are adjusted to reduce the error.
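The neuron described above can be sketched in a few lines: each input is multiplied by its weight, the products and the bias are summed, and an activation is applied to the result. The choice of ReLU as the activation and all numeric values are assumptions made for illustration.

```python
def neuron(inputs, weights, bias):
    """Weighted sum of the inputs plus a bias, followed by a ReLU activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return max(0.0, z)


inputs = [1.0, 2.0, 3.0]

# Positive pre-activation: 0.5 + 0.5 + 0.75 + 0.5 = 2.25
print(neuron(inputs, [0.5, 0.25, 0.25], 0.5))

# Negative pre-activation (-0.25) is clipped to 0.0 by the activation.
print(neuron(inputs, [0.5, -1.0, 0.25], 0.5))
```

Training changes only the `weights` and `bias` arguments; the computation itself stays the same.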

 Layers used in Neural Network
 Input Layer
 The first layer of a neural network is the input layer, which takes the input values and passes them to the next layer. No operation is performed on the input values in this layer, and it has no weights or biases associated with it.
 Hidden Layer
 The input layer usually passes its values to hidden layers, which apply different transformations to these inputs so that the output layer can use them. A neural network may have one or more hidden layers; the number of hidden layers depends on the complexity of the problem. Usually an activation function is applied in each hidden layer. The output of the final hidden layer is forwarded to the output layer.
 Output Layer
 The output layer is the final layer of the network and receives its input from the last hidden layer. It acts as a classifier and predicts the output class in a classification task.

 Deep Learning
 Deep learning is a subdomain of machine learning. The term “deep” refers to neural networks with many layers. Training a deep learning model requires a large set of labeled data and a neural network with many layers. Accuracy usually improves as the amount of data increases.
 Deep learning models can achieve very high accuracy, sometimes exceeding human-level performance.
 Activation functions
 Activation functions, also called transfer functions, introduce non-linearity into neural networks. An activation function takes values in the form of a vector or matrix and typically squashes them into a smaller range. This range depends on the activation function being used.
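To illustrate how the output range depends on the activation function, the sketch below applies two common choices to the same inputs: the sigmoid maps every value into (0, 1), while tanh maps into (-1, 1). These two functions are picked as examples; the slides do not name specific activations at this point.

```python
import math


def sigmoid(x):
    """Logistic sigmoid: squashes any real value into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Hyperbolic tangent: squashes any real value into the open interval (-1, 1)."""
    return math.tanh(x)


inputs = [-10.0, -1.0, 0.0, 1.0, 10.0]
sig = [sigmoid(x) for x in inputs]
th = [tanh(x) for x in inputs]

print(sig)
print(th)
```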

 Convolution Layer
 Convolution layers extract features from raw data. They have learnable parameters (kernels, also called filters) that are adjusted automatically during training to extract the most useful features.
 A CNN extracts features through an operation called convolution.
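A minimal sketch of the convolution operation follows: a small kernel slides over the image, and each output value is the sum of element-wise products under the kernel. As in most deep learning libraries, the kernel is not flipped. The kernel here is set by hand to respond to vertical edges; in a CNN its values would be learned. The image and names are illustrative assumptions.

```python
def conv2d_valid(image, kernel):
    """2D convolution (no padding, stride 1) of a 2D list with a 2D kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[i + a][j + b] * kernel[a][b]
                for a in range(kh)
                for b in range(kw)
            )
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Toy image with a dark left half and a bright right half.
image = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]

# Hand-set kernel that responds where intensity rises from left to right.
kernel = [
    [-1, 1],
    [-1, 1],
]

feature_map = conv2d_valid(image, kernel)
print(feature_map)  # [[0, 2, 0], [0, 2, 0]]
```

The feature map is non-zero only in the column where the edge lies, which is the sense in which a kernel "extracts" a feature.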

 Pooling
 A pooling layer shrinks the spatial size of its input. The data is reduced in such a way that the image content needed for classification is preserved in the reduced output. Two types of pooling are most commonly used: max pooling and average pooling.
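The two pooling types can be sketched with one helper that splits a feature map into non-overlapping windows and reduces each window to a single value. The 2x2 window size and the feature map values are assumptions chosen for illustration.

```python
def pool2d(fmap, size, reduce_fn):
    """Reduce each non-overlapping size x size window of fmap with reduce_fn."""
    return [
        [
            reduce_fn(
                [fmap[i + a][j + b] for a in range(size) for b in range(size)]
            )
            for j in range(0, len(fmap[0]) - size + 1, size)
        ]
        for i in range(0, len(fmap) - size + 1, size)
    ]


def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)


fmap = [
    [1, 3, 2, 0],
    [5, 7, 4, 2],
    [0, 2, 8, 6],
    [4, 6, 2, 4],
]

max_pooled = pool2d(fmap, 2, max)
avg_pooled = pool2d(fmap, 2, average)
print(max_pooled)  # [[7, 4], [6, 8]]
print(avg_pooled)  # [[4.0, 2.0], [3.0, 5.0]]
```

Both variants turn the 4x4 input into a 2x2 output; max pooling keeps the strongest response in each window, while average pooling keeps the mean response.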

 ReLU Layer
 ReLU stands for rectified linear unit. In this layer, a negative value is changed to zero, and a value greater than zero is passed through unchanged.
 The ReLU function can be written as f(x) = max(0, x).
 It introduces non-linearity into an otherwise linear mapping.

 Softmax
 The softmax function is used in neural networks for multi-class classification output. It converts a vector of arbitrary real values into posterior probabilities.
 Dropout Layer
 A deep neural network usually contains a large number of parameters and multiple hidden layers, and learns complicated relationships between inputs and outputs. When training on little data, many of these relationships may be the result of noise, which leads to over-fitting. Dropout counters this by randomly dropping units during training.
 Batch Normalization
 Batch normalization accelerates deep network training by reducing internal covariate shift. Today it is used in almost all CNN architectures. It allows higher learning rates and reduces the need for dropout.
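As a small illustration of the softmax entry above, the sketch below converts a vector of arbitrary real scores into probabilities that are positive and sum to one. Subtracting the maximum score before exponentiating is a standard numerical-stability step that does not change the result. The example scores are made up.

```python
import math


def softmax(scores):
    """Convert a list of real-valued scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for four classes.
probs = softmax([2.0, 1.0, 0.1, -1.0])
print(probs)
print(sum(probs))
```

The class with the largest score receives the largest probability, so taking the most probable class gives the predicted label.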

 Fully Connected Layer
 The fully connected layer performs classification using the features obtained from the convolution and pooling layers. A softmax activation on the last layer converts the values into probabilities.

 Factorizing Convolution
 The aim of factorizing convolutions is to reduce the number of connections/parameters without decreasing the network's efficiency.
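A worked parameter count shows the effect. Replacing one 5x5 convolution with two stacked 3x3 convolutions covers the same 5x5 receptive field with 18 instead of 25 weights per input-output channel pair, a 28% reduction. The channel count of 64 below is a hypothetical value for illustration, and biases are ignored.

```python
def conv_weights(kh, kw, c_in, c_out):
    """Number of weights (ignoring biases) in a convolution layer."""
    return kh * kw * c_in * c_out


c = 64  # hypothetical number of input and output channels

single_5x5 = conv_weights(5, 5, c, c)
two_3x3 = 2 * conv_weights(3, 3, c, c)
saving = 1 - two_3x3 / single_5x5

print(single_5x5)  # 102400
print(two_3x3)     # 73728
print(saving)
```

The same idea applies to asymmetric factorization: a 3x3 convolution replaced by a 1x3 followed by a 3x1 uses 6 instead of 9 weights per channel pair.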
