1. Statistical Texture Features based Automatic Detection and
Classification of Diabetic Retinopathy
Presented By: Md. Rahat Khan, ID: 2015105050006, Dept. of CSE, KYAU
Supervised By: A. S. M. Shafi, Lecturer, Dept. of CSE, KYAU
Co-Supervised By: Dr Mir Mohammad Azad, Professor and Head, Dept. of CSE, KYAU
3. Motivation
Diabetic Retinopathy (DR) is the fifth leading cause of global blindness.
The total number of people with diabetes is projected to rise from 171 million in 2000 to
366 million in 2030.
Detection of DR is a time-consuming and manual process that requires digital color
fundus photographs of the retina.
The number of people with vision-threatening DR is projected to increase from 37.3 million
to 56.3 million if no proper action is taken.
4. Research Questions
The specific problem statement for this thesis is:
“Can machine learning be used for automatic detection and classification of
diabetic retinopathy?”
The following research questions support this thesis:
Is this approach able to detect and classify diabetic retinopathy?
Is the system’s accuracy acceptable?
Can the method reduce the cognitive burden on a qualified doctor?
5. Objectives
“Effectively use machine learning to detect and classify diabetic retinopathy
from retinal images.”
Other intentions of the current study include the following:
Study the terms and features related to DR.
Develop a novel approach that utilizes machine learning features.
Drastically reduce the cognitive burden on a qualified physician.
Show that the proposed approach is effective and achieves the best accuracy among
the compared classifiers.
6. Clinical Background
Diabetic retinopathy (DR) is a medical condition where the retina is damaged because
fluid leaks from blood vessels into the retina.
01 Non-Proliferative Diabetic Retinopathy: The earliest stage of diabetic retinopathy,
where damaged blood vessels in the retina begin to leak extra fluid and small amounts
of blood into the eye.
02 Proliferative Diabetic Retinopathy: Mainly occurs when most of the blood vessels in
the retina close, preventing enough blood flow. The retina responds by growing new blood
vessels, but these new vessels are abnormal and do not supply the retina with proper
blood flow.
7. System Architecture
Input Images: Retinal image
Preprocessing: Resize image
Segmentation: Dark and bright region detection using blood vessel extraction
Feature Extraction: GLCM, GLRLM
Classification: SVM, KNN, RF
Assessment: Accuracy, Sensitivity, Precision, F1-score
Figure 1: A framework of the proposed approach.
SVM: Support Vector Machine
KNN: K-Nearest Neighbors
RF: Random Forest
GLCM: Gray Level Co-occurrence Matrix
GLRLM: Gray Level Run Length Matrix
8. Proposed Algorithm
Algorithmic Steps:
1. Capture the digital fundus image as the input image.
2. Pre-process all input images to a uniform size.
3. Apply Kirsch’s template technique to extract blood vessels from the preprocessed
image.
4. Run the second-order (GLCM) and higher-order (GLRLM) statistical texture feature
algorithms on the segmented image from step 3.
5. Generate a Feature Vector (FV) for each training image.
6. Apply three classifiers (SVM, KNN, and RF) to the feature vectors to classify the
images, and evaluate the performance of the results.
9. Dataset
The proposed system utilizes more than 600 different retinal images.
The main data sources are:
Indian Diabetic Retinopathy Image Dataset (https://idrid.grand-challenge.org/Data/).
Kaggle Dataset (https://www.kaggle.com/c/diabetic-retinopathy-detection).
Table 1: Our dataset: The distribution of retinal images used in our proposed system.
Type | Short Name | Number of Images | Size
Non-Proliferative Diabetic Retinopathy | NPDR | 167 | 246 MB
Proliferative Diabetic Retinopathy | PDR | 340 | 367 MB
Normal Image | - | 136 | 150 MB
10. Preprocessing
Each original image is resized to a standard size of 565 × 375 pixels to allow faster
computation.
Figure 2: Preprocessing: (a) input image, and (b) preprocessed image.
11. Image Segmentation
To extract the blood vessels from the preprocessed image, Kirsch’s template is used as
the image segmentation technique. It detects blood vessel edges using eight directional
templates, each rotated by 45° from the previous one. Every template is applied to the
image, and the largest of the eight responses at each pixel is taken as the output.
Figure 3 shows the arrays of Kirsch’s templates.
The eight 3 × 3 templates cover the orientations 0°, 45°, 90°, 135°, 180°, 225°, 270°
and 315°. They are listed here in order of successive 45° rotations, with matrix rows
separated by semicolons:
[ 5  5  5; -3  0 -3; -3 -3 -3]
[-3  5  5; -3  0  5; -3 -3 -3]
[-3 -3  5; -3  0  5; -3 -3  5]
[-3 -3 -3; -3  0  5; -3  5  5]
[-3 -3 -3; -3  0 -3;  5  5  5]
[-3 -3 -3;  5  0 -3;  5  5 -3]
[ 5 -3 -3;  5  0 -3;  5 -3 -3]
[ 5  5 -3;  5  0 -3; -3 -3 -3]
Figure 3: Example of Kirsch’s templates.
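The Kirsch step can be sketched in a few lines of plain Python. This is a minimal illustration rather than the implementation used in the study: the function names (`rotate45`, `kirsch_templates`, `kirsch_response`) are hypothetical, border pixels are simply left at zero, and no thresholding is applied because the slides do not say how the response is binarised.

```python
# Base template with the three 5s on the top row; the other seven
# templates are obtained by rotating the outer ring in 45-degree steps.
BASE = [[5, 5, 5], [-3, 0, -3], [-3, -3, -3]]

# Clockwise order of the eight border cells of a 3x3 kernel.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def rotate45(kernel):
    """Shift the outer ring of a 3x3 kernel one cell clockwise."""
    out = [row[:] for row in kernel]
    for k, (r, c) in enumerate(RING):
        nr, nc = RING[(k + 1) % 8]
        out[nr][nc] = kernel[r][c]
    return out


def kirsch_templates():
    """Return the eight Kirsch templates in rotation order."""
    templates = [BASE]
    for _ in range(7):
        templates.append(rotate45(templates[-1]))
    return templates


def kirsch_response(image):
    """Maximum response over the eight templates at each interior pixel."""
    height, width = len(image), len(image[0])
    templates = kirsch_templates()
    out = [[0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            out[y][x] = max(
                sum(t[dy][dx] * image[y + dy - 1][x + dx - 1]
                    for dy in range(3) for dx in range(3))
                for t in templates
            )
    return out


# A vertical step edge: dark on the left, bright on the right.
edge = [[0, 0, 10], [0, 0, 10], [0, 0, 10]]
response = kirsch_response(edge)
```

Each template sums to zero, so flat regions give a zero response and only intensity changes such as vessel boundaries stand out.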
13. Feature Extraction
Feature extraction represents a raw image in a reduced form to facilitate decision
making such as pattern detection, classification, or recognition.
Feature extraction techniques:
Second Order Statistical Texture Features
Higher Order Statistical Texture Features
14. Second Order Statistical Texture Feature
The Gray Level Co-occurrence Matrix (GLCM) method is a way of extracting second
order statistical texture features. The GLCM functions are used for finding texture
properties of an image by calculating the frequency of occurrence of pixel pairs with
specific values and in a specific spatial relationship.
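Figure 5: Example of the creation of a GLCM matrix. Entry (i, j) counts how often a pixel
with gray level i has a pixel with gray level j immediately to its right.
4 × 4 image:
1 2 1 3
1 3 2 2
4 2 1 1
1 1 3 4
GLCM matrix:
2 1 3 0
2 1 0 0
0 1 0 1
0 1 0 0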
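The construction in Figure 5 can be reproduced with a short sketch. The function name `glcm` and the `offset` parameter are illustrative choices, not taken from the slides; the offset (0, 1) pairs each pixel with its right-hand neighbour, which is the relationship that yields the matrix shown in the figure.

```python
def glcm(image, levels, offset=(0, 1)):
    """Count co-occurring gray-level pairs for one spatial offset.

    Gray levels are assumed to run from 1 to `levels`, as in Figure 5.
    """
    d_row, d_col = offset
    rows, cols = len(image), len(image[0])
    matrix = [[0] * levels for _ in range(levels)]
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + d_row, c + d_col
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c] - 1][image[r2][c2] - 1] += 1
    return matrix


image = [
    [1, 2, 1, 3],
    [1, 3, 2, 2],
    [4, 2, 1, 1],
    [1, 1, 3, 4],
]
counts = glcm(image, levels=4)
# counts == [[2, 1, 3, 0], [2, 1, 0, 0], [0, 1, 0, 1], [0, 1, 0, 0]]
```

Dividing every entry by the total number of pairs (12 here) turns the counts into the probabilities P(i, j) used by the texture features.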
15. Second Order Statistical Texture Feature (cont’d)
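Contrast: $\sum_{n=0}^{G-1} n^2 \left\{ \sum_{i=1}^{G} \sum_{j=1}^{G} P(i,j) \right\}, \quad |i-j| = n$
Correlation: $\dfrac{\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} i \, j \, P(i,j) - \mu_x \mu_y}{\sigma_x \sigma_y}$
Energy (ASM): $\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} P(i,j)^2$
Entropy: $-\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} P(i,j) \log P(i,j)$
Inverse Difference Moment (Homogeneity): $\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} \dfrac{1}{1 + (i-j)^2} P(i,j)$
Sum of Squares: $\sum_{i=0}^{G-1} \sum_{j=0}^{G-1} (i - \mu)^2 P(i,j)$
Sum Average: $\sum_{i=0}^{2G-2} i \, P_{x+y}(i)$
Sum Entropy: $-\sum_{i=0}^{2G-2} P_{x+y}(i) \log P_{x+y}(i)$
Difference Entropy: $-\sum_{i=0}^{G-1} P_{x-y}(i) \log P_{x-y}(i)$
G is the number of gray levels used.
μ is the mean value of P.
μx, μy, σx and σy are the means and standard deviations of Px and Py.
Px(i) is the ith entry in the marginal matrix obtained by summing rows of P(i, j).
$P_{x+y}(k)$ is the sum of P(i, j) over all pairs with i + j = k, and $P_{x-y}(k)$ is the sum
of P(i, j) over all pairs with |i − j| = k.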
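As a worked illustration, four of these features can be computed from the Figure 5 co-occurrence counts. This is a sketch under stated assumptions: the helper names are hypothetical, the natural logarithm is used for entropy (the slides do not state the base), and zero-probability cells are skipped in the entropy sum.

```python
import math

# Co-occurrence counts from the Figure 5 example (horizontal neighbour pairs).
counts = [
    [2, 1, 3, 0],
    [2, 1, 0, 0],
    [0, 1, 0, 1],
    [0, 1, 0, 0],
]


def normalise(matrix):
    """Convert co-occurrence counts into probabilities P(i, j)."""
    total = sum(sum(row) for row in matrix)
    return [[value / total for value in row] for row in matrix]


def glcm_features(p):
    """Contrast, energy, entropy and inverse difference moment of P."""
    size = len(p)
    cells = [(i, j, p[i][j]) for i in range(size) for j in range(size)]
    return {
        # Summing (i - j)^2 * P(i, j) is the same as grouping by n = |i - j|.
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v ** 2 for _, _, v in cells),
        "entropy": -sum(v * math.log(v) for _, _, v in cells if v > 0),
        "idm": sum(v / (1 + (i - j) ** 2) for i, j, v in cells),
    }


features = glcm_features(normalise(counts))
# contrast = 21/12 = 1.75, energy = 22/144, idm = 6.3/12 = 0.525
```

Only differences i − j enter the contrast and inverse difference moment, so it does not matter whether gray levels are indexed from 0 or from 1.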
16. Second Order Statistical Texture Feature (cont’d)
Table 2: GLCM features computed from retinal images.
Feature Non-Proliferative image Proliferative image Normal Image
Contrast 0.049079996 0.080073074 0.1245
Correlation 7.37215018 11.84084557 14.485
Energy 0.481351097 0.261918084 0.3242
Entropy 1.114384043 1.624658084 1.4874
Homogeneity 0.978856348 0.963758729 0.9593
Sum of Square 7.322934412 11.78826862 14.4349
Sum Average 5.186458425 6.183003155 7.0898
Sum Entropy 1.075093217 1.570173178 1.4239
Difference Entropy 0.181596513 0.268884514 0.3070
17. Higher-Order Statistical Texture Feature
The Gray Level Run Length Matrix (GLRLM) method is a way of extracting higher order
statistical texture features. A run is a sequence of consecutive pixels with the same
gray level in a given direction. The run length is the number of pixels in the run, and
the run length value is the number of times such a run occurs in an image.
Figure 6: Design of the GLRLM matrix from a 4 × 4 image with 4 gray levels. Rows are gray
levels 1 to 4, columns are run lengths 1 to 4, and runs are counted horizontally.
4 × 4 image:
1 2 2 3
1 2 3 3
4 2 4 1
4 1 2 3
GLRLM matrix:
4 0 0 0
3 1 0 0
2 1 0 0
3 0 0 0
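The matrix in Figure 6 can be reproduced by scanning each row for horizontal runs. The sketch below is illustrative only: the function name `glrlm` is hypothetical, gray levels are assumed to start at 1, and only the horizontal direction is handled.

```python
def glrlm(image, levels):
    """Horizontal gray-level run-length matrix.

    Row index = gray level - 1, column index = run length - 1.
    """
    max_run = len(image[0])
    matrix = [[0] * max_run for _ in range(levels)]
    for row in image:
        run_value, run_length = row[0], 1
        for pixel in row[1:]:
            if pixel == run_value:
                run_length += 1
            else:
                # The current run has ended; record it and start a new one.
                matrix[run_value - 1][run_length - 1] += 1
                run_value, run_length = pixel, 1
        # Record the run that reaches the end of the row.
        matrix[run_value - 1][run_length - 1] += 1
    return matrix


image = [
    [1, 2, 2, 3],
    [1, 2, 3, 3],
    [4, 2, 4, 1],
    [4, 1, 2, 3],
]
runs = glrlm(image, levels=4)
# runs == [[4, 0, 0, 0], [3, 1, 0, 0], [2, 1, 0, 0], [3, 0, 0, 0]]
```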
18. Higher-Order Statistical Texture Feature (cont’d)
A run-length matrix Q(i, j) for a given image is defined by specifying a direction and then
counting the occurrences of runs for each gray level i and run length j in that direction.
Here, M is the number of gray levels, N is the maximum run length, $n_p$ is the number of
pixels, and $n_r$ denotes the total number of runs.
Short Run Emphasis (SRE): $\dfrac{1}{n_r} \sum_{i=1}^{M} \sum_{j=1}^{N} \dfrac{Q(i,j)}{j^2}$
Long Run Emphasis (LRE): $\dfrac{1}{n_r} \sum_{i=1}^{M} \sum_{j=1}^{N} Q(i,j) \, j^2$
Low-Gray-Level Run Emphasis (LGRE): $\dfrac{1}{n_r} \sum_{i=1}^{M} \sum_{j=1}^{N} \dfrac{Q(i,j)}{i^2}$
High-Gray-Level Run Emphasis (HGRE): $\dfrac{1}{n_r} \sum_{i=1}^{M} \sum_{j=1}^{N} Q(i,j) \, i^2$
Gray-Level Non-uniformity (GLN): $\dfrac{1}{n_r} \sum_{i=1}^{M} \left( \sum_{j=1}^{N} Q(i,j) \right)^2$
Run-Length Non-uniformity (RLN): $\dfrac{1}{n_r} \sum_{j=1}^{N} \left( \sum_{i=1}^{M} Q(i,j) \right)^2$
Run Percentage (RP): $\dfrac{n_r}{n_p}$
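The run-length features can be evaluated on the Figure 6 matrix as a small worked example. The function name `glrlm_features` is hypothetical, and Run Percentage is computed here as n_r / n_p (runs per pixel), which is the usual definition and is an assumption about the intended formula.

```python
# Run-length matrix from the Figure 6 example:
# rows = gray levels 1..4, columns = run lengths 1..4.
q = [
    [4, 0, 0, 0],
    [3, 1, 0, 0],
    [2, 1, 0, 0],
    [3, 0, 0, 0],
]


def glrlm_features(q):
    """Run-length texture features of a run-length matrix Q(i, j)."""
    n_r = sum(sum(row) for row in q)  # total number of runs
    # Each run of length j covers j pixels, so this recovers the pixel count.
    n_p = sum(count * (j + 1) for row in q for j, count in enumerate(row))
    cells = [(i + 1, j + 1, q[i][j])
             for i in range(len(q)) for j in range(len(q[0]))]
    return {
        "n_r": n_r,
        "n_p": n_p,
        "SRE": sum(c / j ** 2 for _, j, c in cells) / n_r,
        "LRE": sum(c * j ** 2 for _, j, c in cells) / n_r,
        "LGRE": sum(c / i ** 2 for i, _, c in cells) / n_r,
        "HGRE": sum(c * i ** 2 for i, _, c in cells) / n_r,
        "GLN": sum(sum(row) ** 2 for row in q) / n_r,
        "RLN": sum(sum(col) ** 2 for col in zip(*q)) / n_r,
        "RP": n_r / n_p,
    }


run_features = glrlm_features(q)
# 14 runs over 16 pixels: SRE = 12.5/14, LRE = 20/14, RP = 14/16
```

An image dominated by short runs, as here, gives an SRE close to 1 and an RP close to 1; long homogeneous runs push LRE up and RP down.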
20. Classification
Support Vector Machine (SVM):
SVM is a binary classifier based on
the concept of a hyperplane that
defines decision boundaries.
Figure 8: A linear support vector machine.
Random Forest (RF):
Random forest is a classifier that operates by
constructing a multitude of decision trees at
training time and outputting the class that is the
mode of the classes predicted by the individual
trees (or, for regression, their mean prediction).
Figure 9: Random forest classifier.
21. Classification (cont’d)
K-Nearest Neighbor (KNN):
A simple classifier that stores all available cases and classifies new cases based on a
similarity measure (e.g., distance functions).
Euclidean distance: $d(x, y) = \sqrt{\sum_{i=1}^{k} (x_i - y_i)^2}$, where k in this formula is
the number of features, not the number of neighbors K.
Figure 10: KNN classifier
Choosing the optimal value for K is best done by first inspecting the data. In general, a
larger K value is more precise because it reduces the overall noise, but there is no
guarantee. Historically, the optimal K for most datasets has been between 3 and 10, which
produces much better results than 1-NN.
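A minimal KNN classifier with the Euclidean distance can be written as follows. The function names and the toy two-dimensional data are made up for illustration; in the proposed system each sample would instead be a texture feature vector, and ties between classes are broken here by whichever label `Counter` reports first.

```python
import math
from collections import Counter


def euclidean(x, y):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def knn_predict(train_x, train_y, query, k=3):
    """Label of `query` by majority vote among its k nearest neighbours."""
    ranked = sorted(zip(train_x, train_y),
                    key=lambda pair: euclidean(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: two well-separated clusters with hypothetical labels.
train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["a", "a", "a", "b", "b", "b"]

near_origin = knn_predict(train_x, train_y, [0.2, 0.1], k=3)
near_five = knn_predict(train_x, train_y, [5.5, 5.5], k=3)
```

With K = 1 a single noisy training sample can flip the prediction, which is why the slides evaluate K = 3 and K = 5.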
22. Cross Validation
K-Fold Cross Validation:
I. Shuffle the dataset randomly.
II. Split the dataset into k groups.
III. For each unique group:
Take the group as a hold-out or test data set.
Take the remaining groups as a training data set.
Fit a model on the training set and evaluate it on the test set.
Retain the evaluation score and discard the model.
IV. Summarize the skill of the model using the sample of model evaluation scores.
Figure 11: K-fold cross-validation process.
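The procedure above can be sketched with index lists only. The names `k_fold_indices` and `cross_validate`, the fixed random seed, and the `evaluate` callback are all hypothetical; the slides do not state the number of folds or how the study implemented the split.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)           # step I: shuffle
    folds = [indices[i::k] for i in range(k)]      # step II: split into k groups
    for i in range(k):                             # step III: rotate the hold-out
        test = folds[i]
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


def cross_validate(evaluate, n_samples, k):
    """Average the score returned by evaluate(train, test) over all folds."""
    scores = [evaluate(train, test)
              for train, test in k_fold_indices(n_samples, k)]
    return sum(scores) / len(scores)               # step IV: summarize


splits = list(k_fold_indices(10, 5))

# Dummy evaluation that reports the hold-out fraction, to show the call shape.
mean_score = cross_validate(
    lambda train, test: len(test) / (len(train) + len(test)), 10, 5)
```

In a real run, `evaluate` would fit a fresh classifier on the training indices and return its score on the test indices, so that no model is reused across folds.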
23. Experimental Results
To evaluate the performance of the proposed method, we have examined four metrics per
class: Sensitivity (Sen), Precision, F1-score and Accuracy (Acc).
Table 5: Confusion matrix for a two-class classification problem.
Actual Class | Predicted Positive | Predicted Negative
Positive | TP | FN
Negative | FP | TN
Table 6: Formulas for the evaluation scheme.
TP = True Positives; TN = True Negatives; FP = False Positives; FN = False Negatives
Sensitivity (Recall) = TP / (TP + FN)
Precision = TP / (TP + FP)
F1-score = 2 × Recall × Precision / (Recall + Precision)
Accuracy = (TP + TN) / (TP + TN + FP + FN)
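The four metrics follow directly from the confusion-matrix counts. The sketch below uses made-up counts purely to show the arithmetic; the function name `binary_metrics` is hypothetical.

```python
def binary_metrics(tp, fn, fp, tn):
    """Sensitivity, precision, F1-score and accuracy from the four counts."""
    sensitivity = tp / (tp + fn)
    precision = tp / (tp + fp)
    f1 = 2 * sensitivity * precision / (sensitivity + precision)
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    return {
        "sensitivity": sensitivity,
        "precision": precision,
        "f1": f1,
        "accuracy": accuracy,
    }


# Hypothetical counts, not taken from the experiments.
metrics = binary_metrics(tp=40, fn=10, fp=5, tn=45)
# sensitivity = 40/50, precision = 40/45, f1 = 80/95, accuracy = 85/100
```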
24. Experimental Results (cont’d)
Table 7: Confusion matrix of SVM (rows: actual class; columns: predicted class).
Actual \ Predicted | PDR | NPDR | Normal
PDR | 326 | 11 | 4
NPDR | 19 | 136 | 12
Normal | 4 | 3 | 129
Figure 12: Performance of the prediction models with SVM classifier.
Class | Sensitivity | Precision | F1-score | Accuracy
PDR | 93% | 96% | 94% | 94.1%
NPDR | 91% | 81% | 86% | 93.01%
Normal | 89% | 95% | 92% | 96.43%
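The slides do not spell out how the three-class confusion matrix is reduced to per-class counts. A common choice, assumed here, is a one-vs-rest reduction: for each class, everything on the diagonal cell is TP, the rest of its row and column are the errors, and all remaining cells are TN. Under that assumption the per-class accuracy computed from Table 7 agrees with the accuracy values reported for the SVM classifier. The function name is hypothetical.

```python
classes = ["PDR", "NPDR", "Normal"]

# Table 7: confusion matrix of the SVM classifier.
confusion = [
    [326, 11, 4],
    [19, 136, 12],
    [4, 3, 129],
]


def one_vs_rest_accuracy(matrix, k):
    """Accuracy for class k when all other classes are merged into 'rest'."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]
    row_others = sum(matrix[k]) - tp                  # rest of row k
    col_others = sum(row[k] for row in matrix) - tp   # rest of column k
    tn = total - tp - row_others - col_others
    return (tp + tn) / total


accuracy = {
    name: round(100 * one_vs_rest_accuracy(confusion, k), 2)
    for k, name in enumerate(classes)
}
# accuracy == {"PDR": 94.1, "NPDR": 93.01, "Normal": 96.43}
```

Accuracy is unaffected by whether rows are read as actual or predicted classes, because row and column errors enter the formula symmetrically.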
25. Experimental Results (cont’d)
Table 8: Confusion matrix of KNN (K=3) (rows: actual class; columns: predicted class).
Actual \ Predicted | PDR | NPDR | Normal
PDR | 302 | 23 | 16
NPDR | 21 | 127 | 19
Normal | 18 | 14 | 104
Figure 13: Performance of the prediction models with KNN (K=3) classifier.
Class | Sensitivity | Precision | F1-score | Accuracy
PDR | 89% | 89% | 89% | 87.89%
NPDR | 77% | 76% | 77% | 88.04%
Normal | 75% | 76% | 76% | 89.6%
26. Experimental Results (cont’d)
Table 9: Confusion matrix of KNN (K=5) (rows: actual class; columns: predicted class).
Actual \ Predicted | PDR | NPDR | Normal
PDR | 308 | 17 | 16
NPDR | 21 | 136 | 10
Normal | 18 | 19 | 99
Figure 14: Performance of the prediction models with KNN (K=5) classifier.
Class | Sensitivity | Precision | F1-score | Accuracy
PDR | 89% | 90% | 90% | 88.82%
NPDR | 79% | 81% | 80% | 89.6%
Normal | 79% | 73% | 76% | 90.22%
27. Experimental Results (cont’d)
Table 10: Confusion matrix of RF (rows: actual class; columns: predicted class).
Actual \ Predicted | PDR | NPDR | Normal
PDR | 333 | 5 | 3
NPDR | 6 | 149 | 12
Normal | 2 | 3 | 131
Figure 15: Performance of the prediction models with RF classifier.
Class | Sensitivity | Precision | F1-score | Accuracy
PDR | 98% | 98% | 98% | 97.52%
NPDR | 95% | 89% | 92% | 95.92%
Normal | 90% | 96% | 93% | 96.89%
28. Experimental Results (cont’d)
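Table 11: Sensitivity (Sn), precision (Pre), F1-score (F1) and accuracy (Acc) of each classifier, in percent.
Classifier | Image Type | Sn | Pre | F1 | Acc
SVM | PDR | 93 | 96 | 94 | 94.1
SVM | NPDR | 91 | 81 | 86 | 93.01
SVM | Normal | 89 | 95 | 92 | 96.43
KNN (K=3) | PDR | 89 | 89 | 89 | 87.89
KNN (K=3) | NPDR | 77 | 76 | 77 | 88.04
KNN (K=3) | Normal | 75 | 76 | 76 | 89.6
KNN (K=5) | PDR | 89 | 90 | 90 | 88.82
KNN (K=5) | NPDR | 79 | 81 | 80 | 89.6
KNN (K=5) | Normal | 79 | 73 | 76 | 90.22
RF | PDR | 98 | 98 | 98 | 97.52
RF | NPDR | 95 | 89 | 92 | 95.92
RF | Normal | 90 | 96 | 93 | 96.89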
29. Experimental Results (cont’d)
Table 12: Weighted measure of each classifier, in percent.
Name of the classifier | Sn | Pre | F1-score | Acc
SVM | 91.63 | 91.89 | 91.35 | 94.30
KNN (K=3) | 82.93 | 82.88 | 83.14 | 88.29
KNN (K=5) | 83.66 | 84.07 | 84.45 | 89.31
RF | 95.53 | 96.45 | 95.38 | 95.19
30. Comparison with Existing Methods
Table 13: Comparison among different methods.
Authors | Methodology | Dataset (number of images) | Result (Overall Accuracy)
Khademi et al. | Shift-invariant Discrete Wavelet Transform | 86 | 82.2%
S. Manker et al. | Morphological operations | 107 | 89.50%
Qureshi et al. | Convolutional Neural Network (CNN) + RF | 125 | 97.5%
Neto et al. | Unsupervised coarse-to-fine algorithm | 60 | 87%
Sarathi et al. | Ellipse fitting | 63 | 92%
Garcia et al. | CNN | 35,126 | 83.68%
Proposed Method | Statistical texture features + RF | 644 | 95.19%
31. Discussion
We incorporate both second-order and higher-order statistical texture features
with linear SVM, KNN, and RF classifiers.
We consider texture features because they give more detail about specific
regions in an image.
The proposed system for diabetic retinopathy classification showed that the use of
statistical features achieves high weighted sensitivity (95.53%) and,
equally importantly, high weighted precision (96.45%) and weighted F1-score
(95.38%) with the RF classifier.
32. Conclusion
The results of the extensive experimental study have led us to the following
clear conclusions:
The use of statistical texture features on the retinal image results in higher
classification accuracy in terms of sensitivity, precision, and F1-score.
GLCM provides information about how often pairs of gray levels co-occur in a
definite direction, and GLRLM provides information about the connected length of
a particular gray level in a definite direction.
The use of RF positively affects the discrimination of normal and abnormal
samples.
33. Limitation
Lack of publicly available datasets.
Large variation in image quality across the available images.
High interclass similarity and intraclass variation.
The code was developed in a top-down manner at each step, which made processing
slow.
We believe that with a larger and more representative training set, better results in the
classification stage could have been obtained.
34. Future Work
In the future, the method should be tested on larger datasets to correctly
evaluate the algorithm.
Future perspectives of this work include the improvement of retinal images by
applying super-resolution.