EXCEL ENGINEERING COLLEGE
(AUTONOMOUS)
A
Project Report
On
“Skin Disease Classification from Image”
SUBMITTED BY
Sura Vishnu Vardhan Reddy
(Regd.No:730921106108)
LinkedIn Id: https://www.linkedin.com/in/sura-vishnu-vardhan-reddy-b68729269
GitHub Id: https://github.com/vishnu6643
SUBMITTED TO
Pegasus Aerospace System
Erode, Tamil Nadu, Pin – 638002
Skin Disease Classification from Image
Abstract
Skin diseases are among the most common health ailments and have affected people for ages. Their identification relies largely on the expertise of doctors and on skin biopsy results, which is a time-consuming process. An automated, computer-based system for skin disease identification and classification from images is needed both to improve diagnostic accuracy and to compensate for the scarcity of human experts. Classifying a skin disease from an image is a challenging task that depends heavily on the features chosen to represent the disease. Many skin diseases share highly similar visual characteristics, which makes the selection of useful image features even harder. Accurate analysis of such diseases from images would improve diagnosis, shorten diagnostic time, and lead to better and more cost-effective treatment for patients. This paper presents a survey of the different methods and techniques for skin disease classification, namely traditional (handcrafted-feature-based) techniques and deep learning-based techniques.
Keywords— Skin diseases, lesions, classification, deep learning, CNN, SVM.
I – Introduction
The skin is the largest organ of the human body: an adult carries around 3.6 kg and 2 square metres of it. Skin acts as a waterproof, insulating shield, guarding the body against extremes of temperature, damaging UV light, and harmful chemicals. Growing at a rate of 10-12%, the population affected by skin disease across India was estimated at nearly 15.1 crore in 2013 and increased to 18.8 crore by 2015 [38]. According to statistics provided by the World Health Organization [39], around 13 million cases of melanoma skin cancer occur globally each year, which shows that skin diseases are growing very rapidly. Many factors can trigger a disease, such as UV light, pollution, poor immunity, and an unhealthy lifestyle. Skin lesions (spots) are classified into two major categories: benign and malignant. Most skin lesions are benign in nature, i.e. gentle and non-dangerous, whereas those that endanger the patient's health, such as melanoma skin cancer, are malignant. Diagnosing a skin disease from an image is a challenging problem because so many skin diseases exist. Researchers report the following problems during skin disease classification: 1) a disease may present many lesion types; 2) many diseases share similar visual characteristics, which often confuses even dermatologists inspecting them visually; 3) varying skin colors and skin types (age) introduce further difficulty for computer-based diagnosis. Therefore, selecting relevant features for such diseases is very important if a computer-based system is to identify them correctly. The success of an automatic system relies on how accurately it performs, and it requires both image processing and machine learning tasks.
Medical science offers many technologies for diagnosing skin diseases, but computer-based automatic diagnosis is particularly useful for medical decision support and makes the entire process faster. For example, if such an automated system were deployed in healthcare centres, patients would not have to suffer unnecessarily because of the unavailability of experts. Furthermore, it is a non-invasive and therefore painless method of diagnosis. As per 2015 statistics for India [38], about 6,000 dermatologists serve a population of approximately 121 crore. This means that only 0.49 dermatologists are available per 100,000 people in India, compared to 3.2 in many states of the US [38].
Due to recent advances in technology, a large amount of medical data is produced daily, and these data contain valuable and crucial information about patients. Image-based artificial intelligence is becoming increasingly popular for certain diseases, especially skin diseases. The diagnostic accuracy of a computer-based system relies heavily on the selection of relevant features, the classifier used, and the availability of a dataset as well as the number of images on which the model has been trained. Nowadays, Convolutional Neural Networks (CNNs) are widely used for pattern recognition and classification tasks. To better understand the various works done by researchers, we carried out a survey of the different approaches used for the classification of skin diseases.
The remainder of this paper is divided into four sections. Section II presents the background knowledge: types of images, and the use of traditional and deep learning-based approaches for skin disease classification. Section III presents a survey of traditional (feature-extraction-based) methods as well as CNN-based approaches for skin disease identification and classification. Section IV presents the analysis and findings for both families of methods, and finally, Section V presents the conclusion.
II - BACKGROUND KNOWLEDGE
This section is divided into three parts: skin disease image types, the general process of skin disease classification using traditional techniques, and the corresponding process using deep learning-based techniques.
A. Clinical and Dermoscopic Images
A clinical image is an image of the patient's affected body part, such as an injury or skin lesion, or it can be a diagnostic image. It is captured with an ordinary or digital camera, so its lighting, resolution, and viewing angle vary with the camera used. For computer-aided diagnosis, dermoscopic images are more useful. These images are produced using a dermoscope [16], an instrument used by dermatologists to analyse skin lesions. The dermoscope usually provides uniform illumination and higher contrast; because of its bright illumination, lesions are clear enough for visualization and recognition. Furthermore, dermoscopic images are easier to process because they contain less noise. Fig. 1 (a) illustrates how a dermoscopic image is captured, (b) presents a dermoscopic image, and (c) shows a clinical image.
B. Skin Disease Classification using Traditional Approach
In the traditional approach, handcrafted features are fed into a conventional classifier. Fig. 2 shows the general process of skin disease classification using the traditional approach.
1) Input Image
Skin disease image databases for many diseases are freely available; however, some are only fully or partially open source while others are commercial. The input image can be dermoscopic or clinical, depending on the dataset used. Table I lists the availability and details of the most widely used datasets.
2) Image Pre-processing
Image pre-processing is an important step because an image may contain various artifacts such as dermoscopic gel, air bubbles, and hairs. Clinical images require more pre-processing than dermoscopic ones, because parameters such as resolution, lighting conditions, illumination, capture angle, and the size of the skin area covered vary with the person capturing the image; such variation can create problems in the subsequent stages.
Skin hairs can be removed using filters such as the median, average, or Gaussian filter, morphological operations such as erosion and dilation, binary thresholding, or software such as DullRazor. For low-contrast images, lesion or contrast enhancement algorithms are useful: contrast enhancement with histogram equalization, one of the most used techniques in the literature, provides better visualization by distributing pixel intensities uniformly across the image. For salt-and-pepper noise, a median or mean filter gives better noise removal results.
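The snippet below is a minimal pre-processing sketch of these steps with OpenCV; the input path is a placeholder, and the kernel and threshold values are illustrative assumptions, not a fixed pipeline.

# A minimal pre-processing sketch (OpenCV assumed); 'lesion.jpg' is a placeholder path.
import cv2
import numpy as np

img = cv2.imread('lesion.jpg')                      # hypothetical input image (BGR)
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)

# DullRazor-style hair removal: blackhat highlights thin dark hairs,
# thresholding gives a hair mask, inpainting fills the masked pixels.
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 9))
blackhat = cv2.morphologyEx(gray, cv2.MORPH_BLACKHAT, kernel)
_, hair_mask = cv2.threshold(blackhat, 10, 255, cv2.THRESH_BINARY)
clean = cv2.inpaint(img, hair_mask, inpaintRadius=3, flags=cv2.INPAINT_TELEA)

# Median filter suppresses salt-and-pepper noise.
denoised = cv2.medianBlur(clean, 3)

# Histogram equalization on the luminance channel improves low contrast.
ycrcb = cv2.cvtColor(denoised, cv2.COLOR_BGR2YCrCb)
ycrcb[:, :, 0] = cv2.equalizeHist(ycrcb[:, :, 0])
enhanced = cv2.cvtColor(ycrcb, cv2.COLOR_YCrCb2BGR)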
TABLE I. WIDELY USED SKIN DISEASE IMAGE DATASETS

| Dataset | Image type | No. of images | Classes | Open source |
| DermNet NZ image library | Clinical | 20000+ | - | Partially |
| Dermofit Image Library | Dermoscopic | 1300 | 10 | Yes |
| ISBI-2016 | Dermoscopic | 1279 | 2 | Yes |
| ISBI-2017 | Dermoscopic | 2750 | 2 | Yes |
| HAM10000 | Dermoscopic | 10015 | 7 | Yes |
| Stanford Hospital | Clinical | - | - | No |
| Peking Union Medical College clinical database | Dermoscopic | 28000 | - | No |
| IRMA Dataset | Dermoscopic | 747 | 2 | Not available |
| PH2 | Dermoscopic | 200 | 2 | Yes |
| MED-NODE | Clinical | 170 | 2 | Yes |
| DermQuest | Clinical | 22500 | - | Yes |
| Hospital Pedro Hispano, Matosinhos | Dermoscopic | 200 | 3 | No |
| SD-198 | Clinical and Dermoscopic | 6584 | 198 | Yes |
3) Image Segmentation
Image segmentation extracts the disease-affected area from the normal skin and can play a very important role in skin disease detection [16]. It can be carried out in three ways: 1) pixel-based, 2) edge-based, and 3) region-based segmentation. In pixel-based segmentation, each pixel of the image is assigned to a homogeneous region or to an object, which can be done with binary thresholding or a variant of it. Edge-based methods detect and link edge pixels to form the bounding shape of the skin lesion; examples include the Roberts, Prewitt, Sobel, and Canny operators, adaptive snakes, and gradient vector flow. Region-based methods rely on similar intensity patterns within neighbouring pixels and are based on continuity; examples are region growing, merging and splitting, and the watershed algorithm.
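Continuing from the pre-processing sketch above, here is a minimal illustration of the pixel-based and edge-based routes (OpenCV 4 return conventions and the earlier `enhanced` image are assumed):

# Two of the segmentation routes described above, as a sketch.
import cv2

gray = cv2.cvtColor(enhanced, cv2.COLOR_BGR2GRAY)   # 'enhanced' from the pre-processing sketch

# 1) Pixel-based: Otsu's threshold separates (darker) lesion pixels from skin.
_, lesion_mask = cv2.threshold(gray, 0, 255,
                               cv2.THRESH_BINARY_INV + cv2.THRESH_OTSU)

# 2) Edge-based: Canny edges outline the lesion boundary; contours link them.
edges = cv2.Canny(gray, 50, 150)
contours, _ = cv2.findContours(lesion_mask, cv2.RETR_EXTERNAL,
                               cv2.CHAIN_APPROX_SIMPLE)
largest = max(contours, key=cv2.contourArea)        # keep the largest region as the lesion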
4) Feature Extraction
The most prominent features used to describe and identify skin diseases visually are color and texture. Color information plays an important role in distinguishing one disease from another and can be extracted with techniques such as color histograms, color correlograms, and color descriptors. Texture information conveys the complex visual patterns of the skin lesions and spatially organized properties such as brightness, color, shape, and size; image texture is essentially a function of the variation in pixel intensity. GLCM, local binary patterns, and SIFT are some techniques researchers use to obtain texture information from the image. In addition to color and texture, each lesion may have a different shape and size depending on the type of disease and its severity.
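A small sketch of such color and texture descriptors, assuming scikit-image is installed and reusing `img`, `gray`, and `lesion_mask` from the sketches above (bin counts and GLCM settings are illustrative):

# Illustrative color and texture descriptors.
import numpy as np
import cv2
from skimage.feature import graycomatrix, graycoprops, local_binary_pattern

# Color: a 3-channel histogram restricted to the lesion region.
color_hist = cv2.calcHist([img], [0, 1, 2], lesion_mask, [8, 8, 8],
                          [0, 256, 0, 256, 0, 256]).flatten()

# Texture: GLCM statistics and a local binary pattern histogram.
glcm = graycomatrix(gray, distances=[1], angles=[0], levels=256,
                    symmetric=True, normed=True)
glcm_feats = [graycoprops(glcm, p)[0, 0] for p in ('contrast', 'homogeneity', 'energy')]

lbp = local_binary_pattern(gray, P=8, R=1, method='uniform')
lbp_hist, _ = np.histogram(lbp, bins=10, range=(0, 10))

feature_vector = np.concatenate([color_hist, glcm_feats, lbp_hist])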
5) Classification
Classification is a supervised machine learning task: it requires a labelled dataset in order to map the data into specific groups or classes. Various classification algorithms have been used to classify skin disease images, such as support vector machines, feed-forward neural networks, back-propagation neural networks, k-nearest neighbours, and decision trees.
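A minimal classification sketch, assuming a feature matrix X (one row per image, built as above) and an integer label vector y have been assembled for a labelled dataset; the SVM hyperparameters are illustrative:

# Feeding handcrafted feature vectors to an SVM classifier (sketch).
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.pipeline import make_pipeline

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = make_pipeline(StandardScaler(), SVC(kernel='rbf', C=10, gamma='scale'))
clf.fit(X_tr, y_tr)
print('accuracy:', clf.score(X_te, y_te))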
C. Skin Disease Classification using Deep Learning-based Approach
Deep learning is a branch of machine learning inspired by the structure and function of the human brain, commonly known as neural networks. Convolutional Neural Networks (CNNs) are a class of deep learning algorithms used mostly for analysing visual content such as images and videos. With the development of CNNs, dramatic improvements have been observed in many classification problems in medical image analysis. The basic process of CNN-based skin disease image classification is presented below.
The process starts with data acquisition. The input to the CNN can be a dermoscopic or clinical image, which is pre-processed if needed; the next step is data augmentation, which yields enough training samples to train the model. Finally, the data are fed into the CNN, which performs feature extraction and classification on its own. A CNN typically consists of convolution layers, in which a number of filters perform convolution operations on the image and generate feature maps. These feature maps are down-sampled by pooling layers. Finally, the fully connected layer receives all the connections from the previous layer and performs the classification.
Many researchers have used CNNs for skin disease classification via transfer learning or fine-tuning of pre-trained models such as Inception V3, ResNet, and the VGG architectures. In transfer learning, only the weights of the newly added classification layers are optimized; the weights of the original model remain as they are. In fine-tuning, the parameters of a trained model are altered very carefully while validating the model on a smaller dataset that was not part of the training set. Moreover, the hyperparameters of the CNN must be tracked, otherwise the model may over-fit. Over-fitting means the model has learned too well: it has also learned irrelevant information and noise, which may yield good training accuracy but poor testing accuracy.
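A minimal Keras sketch of this transfer-learning setup (Inception V3 as the frozen base; the 7-class head and all hyperparameters are illustrative assumptions):

# Transfer learning: freeze the pre-trained base, train only the new head.
import tensorflow as tf

base = tf.keras.applications.InceptionV3(include_top=False, weights='imagenet',
                                         input_shape=(299, 299, 3), pooling='avg')
base.trainable = False                    # original weights stay fixed

head = tf.keras.layers.Dense(7, activation='softmax')(base.output)
model = tf.keras.Model(base.input, head)
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# For fine-tuning instead, unfreeze the base and use a small learning rate:
# base.trainable = True
# model.compile(optimizer=tf.keras.optimizers.Adam(1e-5), loss='categorical_crossentropy')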
III. SURVEY OF LITERATURE
This section presents a survey of both traditional and deep learning-based skin disease identification and classification approaches. Tables II and III analyse the major works in the two aforementioned families: traditional/handcrafted-feature-based techniques and deep learning techniques for classifying skin disease from images.
A. Survey on Traditional Techniques for Skin Disease Image Classification
Amarathunga et al. developed an expert system limited to classifying three diseases. The system consists of two separate units: an image processing unit and a data processing unit. The image processing unit is responsible for image acquisition, pre-processing for noise removal, segmentation, and feature extraction from the skin disease images, whereas the data processing unit performs the data mining or classification task. The authors tested five classification algorithms, namely AdaBoost, BayesNet, J48, MLP, and NaiveBayes; of these, the MLP classifier gave the best results. However, the data source of the images and the attributes considered for disease classification are not mentioned.
Chakraborty et al. [3] proposed a hybrid model using the multi-objective optimization algorithm NSGA-II and an ANN to diagnose whether a skin lesion is benign or malignant. A bag-of-features approach, generated using SIFT, is applied to classify the skin lesions: the SIFT algorithm identifies and locates key points in the input image and generates the feature vector. To handle the large number of keypoints, k-means clustering was used to obtain representative keypoints, with each cluster contributing some representatives; these form the generated bag of features. These features are then fed to the hybrid classifier, where NSGA-II is used to train the ANN. The authors [3] also compared the model's accuracy with ANN-PSO (ANN trained with particle swarm optimization) and ANN-CS (ANN trained with cuckoo search).
A spatial and frequency domain-based technique was used by Chatterjee et al. [10] to identify whether a skin lesion is benign or malignant; malignant lesions are further classified into the subcategories melanocytic and epidermal. A cross-correlation technique is used to extract regional features that are invariant to light intensity and illumination changes, and cross-spectrum-based frequency domain analysis retrieves more detailed features of the skin lesions. For classification, an SVM was used with three non-linear kernels [10], of which the RBF kernel gave promising accuracy compared to the others.
TABLE II. SURVEY OF TRADITIONAL TECHNIQUES FOR SKIN DISEASE CLASSIFICATION

| References | Disease | Image type | No. of images | Pre-processing | Segmentation | Feature extraction | Classifier | Performance measure |
| Amarathunga | Eczema, impetigo, melanoma | Clinical | - | Y | Thresholding | - | MLP | Accuracy: 90% |
| Chakraborty | BCC, SA | Dermoscopic | - | - | Thresholding | SIFT | NN-NSGA-II | Accuracy: 90.56%; Precision: 88.26%; Recall: 93.64%; F-measure: 90.87% |
| Manerkar | Warts, benign & malignant skin cancer | Clinical | 45 | Y | C-means clustering and watershed algorithm | GLCM and IQA | SVM | Accuracy: 96-98% |
| Zaqout | Benign, malignant or suspicious lesions | Dermoscopic | 200 | Y | Thresholding | ABCD rule implementation using entropy, bifold, color and diameter | TDS | Accuracy: 90%; Sensitivity: 85%; Specificity: 92.22% |
| Chatterjee | Melanoma, nevus, BCC, SK | Dermoscopic | 6,838 | - | - | Cross-correlation, cross-spectrum | SVM | Accuracy: 98.79%; Sensitivity: 99.01%; Specificity: 95.35% |
| Arifin | Acne, eczema, psoriasis, tinea, vitiligo, scabies | Clinical | 704 | Y | Thresholding | GLCM | Feed-forward back-propagation ANN | Accuracy: 94.04% |
| Monisha | BCC, SA, lentigo simplex | Dermoscopic | - | Y | GMM | GLCM, DRLBP & GRLTP | NSGA-II-PNN | - |

a. Disease: SK - seborrheic keratosis, BCC - basal cell carcinoma, SA - skin angioma. Classifier: TDS (Total Dermoscopic Score = Asymmetry*1.3 + Border irregularity*0.1 + Color*0.5 + Diameter*0.5), NSGA-II - Non-dominated Sorting Genetic Algorithm. Feature extraction: GMM - Gaussian Mixture Model, GLCM - grey-level co-occurrence matrix, IQA - image quality assessment, PNN - probabilistic neural network.
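As a worked illustration of the TDS formula (scores hypothetical): a lesion with Asymmetry = 2, Border = 5, Color = 4, Diameter = 3 gives TDS = 2(1.3) + 5(0.1) + 4(0.5) + 3(0.5) = 6.6; under the standard ABCD rule, a TDS above about 5.45 is read as suspicious of melanoma, while values below about 4.75 are usually considered benign.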
B. Survey on Deep Learning-based Techniques for Skin Disease Image Classification
Esteva et al. were the first to report that a convolutional neural network (CNN) image classifier can match the performance of 21 board-certified dermatologists in identifying malignant lesions. A 3-way disease partition algorithm was designed to classify a given skin lesion as malignant, benign, or non-neoplastic, and a 9-way partition classified a lesion into one of nine categories. The state-of-the-art Inception V3 CNN architecture was used for skin lesion classification, and the authors concluded that a CNN can outperform human experts if it is trained with enough data.
Zhang et al. also used the Inception V3 architecture, with a modified final layer, to classify four diseases. The model was trained on two nearly similar datasets of dermoscopic images. The authors concluded that misclassification can occur when multiple disease lesions are present in a single skin image.
Sun et al. proposed both handcrafted-feature-based and CNN-based approaches for classifying clinical images. They trained four CNN configurations, namely CaffeNet, fine-tuned CaffeNet, VGGNet, and fine-tuned VGGNet, of which the fine-tuned VGGNet gave the best accuracy. Its accuracy was similar to that of handcrafted features generated by several methods (SIFT, HOG, LBP, and color histograms) combined with an SVM classifier. The architecture and the use of a benchmark dataset play an important role in achieving good accuracy for skin disease image classification.
Gessert et al. introduced a patch-based method to capture fine-grained differences between skin lesions in high-resolution images. Each high-resolution image is divided into 5, 9, or 16 crops (patches), and these patches are fed to standard CNN architectures. The authors used three architectures, Inception V3, DenseNet, and SE-ResNeXt50, to predict the disease from the high-resolution image patches.
Rehman et al. proposed a CNN architecture with 16 filters of kernel size 7x7 and pooling layers for down-sampling. The model was trained on malignant and benign disease categories, namely melanoma, seborrheic keratosis, and nevus. The RGB channels of the segmented image are normalized to zero mean and unit variance, and this normalized matrix is fed to the CNN for feature extraction; the fully connected part consists of a 3-layer ANN classifier that labels the skin lesion as benign or malignant.
Kulhalli et al. proposed 5-stage, 3-stage, and 2-stage hierarchical approaches to classify seven diseases using the Inception V3 CNN architecture. The authors addressed the class imbalance problem with image augmentation to balance the classes. The 5-stage classifier gave better results than the 2- and 3-stage hierarchical classifiers. The authors further suggested that fine-tuning the model and using ensemble-based methods might improve classification performance.
IV. ANALYSIS & FINDINGS
Both traditional and CNN-based approaches are useful for classifying skin diseases. Traditional methods require an appropriate feature extraction and segmentation method for the diseases at hand. It is important to identify the relevant features and discard irrelevant ones, since classification depends on the selected features; if irrelevant features are selected, misclassification may follow. Unlike CNNs, however, the traditional approach does not require a large dataset.
A CNN learns the features of skin diseases automatically: it selects its filters intelligently, in contrast to the manual filter selection of the traditional approach, so no separate feature extraction method is needed. Pre-trained models can be used to classify skin diseases, but they are heavy in terms of 1) the number of parameters, 2) the number of layers, 3) the selection and fine-tuning of the appropriate pre-trained model, and 4) the need for further training, since they were not originally trained on skin disease images.
A CNN can also be designed from scratch. The following criteria are important whenever a CNN architecture is designed to classify skin diseases:
1) Dataset: The availability of a large dataset is very important, since a CNN learns much more efficiently when trained with enough data. Large datasets of clinical images are available at [31], [32]; for dermoscopic images, large datasets are published by ISIC [27].
2) Hyperparameters of the CNN: The network structure is determined by hyperparameters, which must be set before training. They include the number of hidden layers, dropout, kernel size, number of kernels, batch size, number of epochs, activation function, learning rate, etc.
3) Computational power: The main challenge in training a CNN is the availability of computational resources. A CNN has thousands of trainable parameters, so it is computationally costlier than the traditional way of classifying skin diseases. GPU availability is a must for training, and the training time grows with the size of the dataset used to train the model.
TABLE III. SURVEY OF DEEP LEARNING BASED SKIN DISEASE CLASSIFICATION

| Reference | Disease classes | Image type | No. of images | Dataset | Additional (pre-processing / segmentation) | CNN architecture | Performance measures |
| Sun et al. [24] | Wide variety | Clinical | 6,584 / 5,619 | SD-198 [34] / SD-128 [24] | - | Fine-tuned VGG19 | Accuracy: 50.27% |
| Esteva et al. [4] | Malignant and benign skin lesions | Clinical / Dermoscopic | 129,450 / 3,374 | ISIC [27], Edinburgh Dermofit Library [33], Stanford Hospital [4] | - | Inception V3 with PA (partition algorithm) | Accuracy: 72.1 ± 0.9% |
| Zhang et al. [5] | Melanocytic nevus, SK, BCC, psoriasis | Dermoscopic | 1,067 / 522 | Dataset A [5] / Dataset B [5] | - | Inception V3 | Dataset A accuracy: 87.25 ± 2.24%; Dataset B accuracy: 86.63 ± 5.78% |
| Rehman et al. [22] | Malignant and benign skin lesions | Dermoscopic | 379 | ISBI-2016 [27] | Segmentation using generalized Gaussian distribution | CNN with conv: 16 filters of 7x7; pooling layer: 16; FC: 100x50x5 | Accuracy: 98.32%; Sensitivity: 98.15%; Specificity: 98.41% |
| Brinker et al. [6] | Melanoma and nevi | Clinical / Dermoscopic | - / 12,378 | HAM10000 [27] | - | ResNet50 | Mean sensitivity: 89.4%; Mean specificity: 64.4%; ROC: 0.769 |
| Kulhalli et al. [7] | Melanoma, nevi, SK, akiec, BCC, DF, BKL | Dermoscopic | 10,015 | HAM10000 [27] | - | Inception V3 | Normalized F1 score: 0.93 |
| Khan et al. [8] | Melanoma vs other | Dermoscopic | 1,279 / 2,790 / 10,000 | ISBI-16 [27] / ISBI-17 [27] / HAM10000 [27] | Lesion enhancement | ResNet50 and ResNet101 | Accuracy - ISBI 2016: 90.20%; ISBI 2017: 95.60%; HAM10000: 89.8% |
SKIN DISEASE CLASSIFICATION
This project applies deep learning to predict various skin diseases. The main objective is to achieve maximum accuracy in skin disease prediction. Deep learning techniques help detect skin disease at an early stage, and feature extraction plays a key role in the classification of skin diseases. Using deep learning algorithms reduces the need for human labour, such as manual feature extraction and data reconstruction for classification. Moreover, Explainable AI is used to interpret the decisions made by our model.
ABOUT THE DATASET
HAM10000 ("Human Against Machine with 10000 training images") dataset - a large
collection of multi-sources dermatoscopic images of pigmented lesions. The dermatoscopic
images are collected from different populations, acquired and stored by different modalities.
The final dataset consists of 10015 dermatoscopic images.
It covers 7 different classes of pigmented skin lesions, listed below:
• Melanocytic nevi
• Melanoma
• Benign keratosis-like lesions
• Basal cell carcinoma
• Actinic keratoses
• Vascular lesions
• Dermatofibroma
Importing Libraries
#Importing required libraries
import matplotlib.pyplot as plt
from PIL import Image
import seaborn as sns
import numpy as np
import pandas as pd
import os
from tensorflow.keras.utils import to_categorical
from glob import glob
✓ HAM10000_metadata.csv is the main CSV file containing the metadata for all training images; its features are:
1. lesion_id
2. image_id
3. dx
4. dx_type
5. age
6. sex
7. localization
Reading the Data from the Dataset
# Reading the data from HAM10000_metadata.csv
df = pd.read_csv('../input/skin-cancer-mnist-ham10000/HAM10000_metadata.csv')
df.head()
df.dtypes
lesion_id object
image_id object
dx object
dx_type object
age float64
sex object
localization object
dtype: object
lesion_id image_id dx dx_type age sex localization
0 HAM_0000118 ISIC_0027419 bkl histo 80.0 male scalp
1 HAM_0000118 ISIC_0025030 bkl histo 80.0 male scalp
2 HAM_0002730 ISIC_0026769 bkl histo 80.0 male scalp
3 HAM_0002730 ISIC_0025661 bkl histo 80.0 male scalp
4 HAM_0001466 ISIC_0031633 bkl histo 75.0 male ear
df.describe()
Data Cleaning
Removing NULL values and performing visualizations to gain insights into the dataset: univariate and bivariate analysis.
df.isnull().sum()
lesion_id 0
image_id 0
dx 0
dx_type 0
age 57
sex 0
localization 0
dtype: int64
The feature 'age' contains 57 null records. We replace them with the mean of 'age', since dropping 57 records would lose data.
age
count 9958.000000
mean 51.863828
std 16.968614
min 0.000000
25% 40.000000
50% 50.000000
75% 65.000000
max 85.000000
df['age'].fillna(int(df['age'].mean()),inplace=True)
df.isnull().sum()
lesion_id 0
image_id 0
dx 0
dx_type 0
age 0
sex 0
localization 0
dtype: int64
lesion_type_dict = {
'nv': 'Melanocytic nevi',
'mel': 'Melanoma',
'bkl': 'Benign keratosis-like lesions ',
'bcc': 'Basal cell carcinoma',
'akiec': 'Actinic keratoses',
'vasc': 'Vascular lesions',
'df': 'Dermatofibroma'
}
base_skin_dir = '../input/skin-cancer-mnist-ham10000'
# Merge images from both folders into one dictionary
imageid_path_dict = {os.path.splitext(os.path.basename(x))[0]: x
for x in glob(os.path.join(base_skin_dir, '*', '*.jpg'))}
df['path'] = df['image_id'].map(imageid_path_dict.get)
df['cell_type'] = df['dx'].map(lesion_type_dict.get)
df['cell_type_idx'] = pd.Categorical(df['cell_type']).codes
df.head()
   lesion_id    image_id      dx   dx_type  age   sex   localization  path                                                cell_type                      cell_type_idx
0  HAM_0000118  ISIC_0027419  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  2
1  HAM_0000118  ISIC_0025030  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  2
2  HAM_0002730  ISIC_0026769  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  2
3  HAM_0002730  ISIC_0025661  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  2
4  HAM_0001466  ISIC_0031633  bkl  histo    75.0  male  ear           ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  2
Image Preprocessing
df['image'] = df['path'].map(lambda x: np.asarray(Image.open(x).resize((125,100))))
n_samples = 5
fig, m_axs = plt.subplots(7, n_samples, figsize = (4*n_samples, 3*7))
for n_axs, (type_name, type_rows) in zip(m_axs,
df.sort_values(['cell_type']).groupby('cell_type')):
n_axs[0].set_title(type_name)
for c_ax, (_, c_row) in zip(n_axs, type_rows.sample(n_samples,
random_state=2018).iterrows()):
c_ax.imshow(c_row['image'])
c_ax.axis('off')
fig.savefig('category_samples.png', dpi=300)
The images are resized because the original dimensions of 450 x 600 x 3 take a long time to process in a neural network.
# See the image size distribution - should just return one row (all images are uniform)
df['image'].map(lambda x: x.shape).value_counts()
(100, 125, 3) 10015
Name: image, dtype: int64
Exploratory Data Analysis
Exploratory data analysis can help detect obvious errors, identify outliers in datasets,
understand relationships, unearth important factors, find patterns within data, and provide
new insights.
df= df[df['age'] != 0]
df= df[df['sex'] != 'unknown']
plt.figure(figsize=(20,10))
plt.subplots_adjust(left=0.125, bottom=1, right=0.9, top=2, hspace=0.2)
plt.subplot(2,4,1)
plt.title("AGE",fontsize=15)
plt.ylabel("Count")
df['age'].value_counts().plot.bar()
plt.subplot(2,4,2)
plt.title("GENDER",fontsize=15)
plt.ylabel("Count")
df['sex'].value_counts().plot.bar()
plt.subplot(2,4,3)
plt.title("localization",fontsize=15)
plt.ylabel("Count")
plt.xticks(rotation=45)
df['localization'].value_counts().plot.bar()
plt.subplot(2,4,4)
plt.title("CELL TYPE",fontsize=15)
plt.ylabel("Count")
df['cell_type'].value_counts().plot.bar()
<AxesSubplot:title={'center':'CELL TYPE'}, ylabel='Count'>
1. Skin diseases peak in people aged around 45 and are least common at ages 10 and below; the probability of having a skin disease increases with age.
2. Skin diseases are more prominent in men than in women and other genders.
3. Skin diseases appear most often on the back of the body and least on the acral surfaces (such as limbs, fingers, or ears).
4. The most common disease is melanocytic nevi, while the least common is dermatofibroma.
plt.figure(figsize=(15,10))
plt.subplot(1,2,1)
df['dx'].value_counts().plot.pie(autopct="%1.1f%%")
plt.subplot(1,2,2)
df['dx_type'].value_counts().plot.pie(autopct="%1.1f%%")
plt.show()
1. Type of skin disease:
• nv: Melanocytic nevi - 69.9%
• mel: Melanoma - 11.1 %
• bkl: Benign keratosis-like lesions - 11.0%
• bcc: Basal cell carcinoma - 5.1%
• akiec: Actinic keratoses- 3.3%
• vasc: Vascular lesions-1.4%
• df: Dermatofibroma - 1.1%
2. How the skin disease was discovered:
• histo - histopathology - 53.3%
• follow_up - follow up examination - 37.0%
• consensus - expert consensus - 9.0%
• confocal - confirmation by in-vivo confocal microscopy - 0.7%
BIVARIATE ANALYSIS
plt.figure(figsize=(25,10))
plt.title('LOCALIZATION VS GENDER',fontsize = 15)
sns.countplot(y='localization', hue='sex',data=df)
<AxesSubplot:title={'center':'LOCALIZATION VS GENDER'}, xlabel='count', ylabel='localization'>
• The back area is the most affected region, more prominently in men.
• Infection of the lower extremity of the body is more visible in women.
• Some unknown regions also show infections, visible in men, women, and other genders.
• The acral surfaces show the fewest infection cases, and only in men; other gender groups do not show this kind of infection.
plt.figure(figsize=(25,10))
plt.title('LOCALIZATION VS CELL TYPE',fontsize = 15)
sns.countplot(y='localization', hue='cell_type',data=df)
<AxesSubplot:title={'center':'LOCALIZATION VS CELL TYPE'}, xlabel='count', ylabel='localization'>
• The face is infected the most by Benign keratosis-like lesions.
• Body parts(except face) are infected the most by Melanocytic nevi.
plt.figure(figsize=(25,10))
plt.subplot(131)
plt.title('AGE VS CELL TYPE',fontsize = 15)
sns.countplot(y='age', hue='cell_type',data=df)
plt.subplot(132)
plt.title('GENDER VS CELL TYPE',fontsize = 15)
sns.countplot(y='sex', hue='cell_type',data=df)
<AxesSubplot:title={'center':'GENDER VS CELL TYPE'}, xlabel='count', ylabel='sex'>
1. The age group between 0 and 75 years is affected most by melanocytic nevi, whereas people aged 80-90 are affected more by benign keratosis-like lesions.
2. All gender groups are affected most by melanocytic nevi.
from sklearn.model_selection import train_test_split
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
import tensorflow as tf
from sklearn.preprocessing import StandardScaler
ANN
features=df.drop(columns=['cell_type_idx'],axis=1)
target=df['cell_type_idx']
features.head()
   lesion_id    image_id      dx   dx_type  age   sex   localization  path                                                cell_type                      image
0  HAM_0000118  ISIC_0027419  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  [[[189, 152, 194], [192, 156, 198], [191, 154,...
1  HAM_0000118  ISIC_0025030  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  [[[24, 13, 22], [24, 14, 22], [24, 14, 26], [2...
2  HAM_0002730  ISIC_0026769  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  [[[186, 127, 135], [189, 133, 145], [192, 135,...
3  HAM_0002730  ISIC_0025661  bkl  histo    80.0  male  scalp         ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions  [[[24, 11, 17], [24, 11, 20], [30, 15, 25], [4...
x_train_o, x_test_o, y_train_o, y_test_o = train_test_split(features, target,
test_size=0.25,random_state=666)
tf.unique(x_train_o.cell_type.values)
Unique(y=<tf.Tensor: shape=(7,), dtype=string, numpy=array([b'Melanocytic nevi', b'Basal cell carcinoma', b'Melanoma', b'Actinic keratoses', b'Vascular lesions', b'Benign keratosis-like lesions ', b'Dermatofibroma'], dtype=object)>, idx=<tf.Tensor: shape=(7440,), dtype=int32, numpy=array([0, 1, 2, ..., 1, 0, 0], dtype=int32)>)
x_train = np.asarray(x_train_o['image'].tolist())
x_test = np.asarray(x_test_o['image'].tolist())
x_train_mean = np.mean(x_train)
x_train_std = np.std(x_train)
x_test_mean = np.mean(x_test)
x_test_std = np.std(x_test)
x_train = (x_train - x_train_mean)/x_train_std
x_test = (x_test - x_test_mean)/x_test_std
# Perform one-hot encoding on the labels
y_train = to_categorical(y_train_o, num_classes = 7)
y_test = to_categorical(y_test_o, num_classes = 7)
y_test
array([[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 0., ..., 1., 0., 0.],
...,
[0., 0., 0., ..., 1., 0., 0.],
[0., 0., 0., ..., 0., 1., 0.],
[0., 0., 0., ..., 1., 0., 0.]], dtype=float32)
x_train, x_validate, y_train, y_validate = train_test_split(x_train, y_train, test_size = 0.1,
random_state = 999)
# Reshape image in 3 dimensions (height = 100, width = 125 , canal = 3)
x_train = x_train.reshape(x_train.shape[0], *(100, 125, 3))
x_test = x_test.reshape(x_test.shape[0], *(100, 125, 3))
x_validate = x_validate.reshape(x_validate.shape[0], *(100, 125, 3))
# Flatten each image for the fully connected ANN
x_train = x_train.reshape(x_train.shape[0], 125*100*3)
x_test = x_test.reshape(x_test.shape[0], 125*100*3)
print(x_train.shape)
print(x_test.shape)
(6696, 37500)
(2481, 37500)
# define the keras model
model = Sequential()
model.add(Dense(units= 64, kernel_initializer = 'uniform', activation = 'relu', input_dim =
37500))
model.add(Dense(units= 64, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(units= 64, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(units= 64, kernel_initializer = 'uniform', activation = 'relu'))
model.add(Dense(units = 7, kernel_initializer = 'uniform', activation = 'softmax'))
optimizer = tf.keras.optimizers.Adam(learning_rate = 0.00075,
beta_1 = 0.9,
beta_2 = 0.999,
epsilon = 1e-8)
# compile the keras model
model.compile(optimizer = optimizer, loss = 'categorical_crossentropy', metrics =
['accuracy'])
# fit the keras model on the dataset
history = model.fit(x_train, y_train, batch_size = 10, epochs = 50)
accuracy = model.evaluate(x_test, y_test, verbose=1)[1]
print("Test: accuracy = ",accuracy*100,"%")
(Per-epoch training output omitted.)
from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
CNN
CNN is well suited to image classification because of parameter sharing and dimensionality reduction. Since the same filter weights are reused across the whole image, the number of parameters, and with it the computation, is greatly reduced, as the short comparison below illustrates.
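As a rough illustration of parameter sharing (input sizes chosen to match this notebook's 100x125x3 images; the layer widths are otherwise hypothetical), compare a small convolution layer with a dense layer over the flattened image:

# Parameter sharing: a Conv2D layer reuses the same 3x3x3 kernels everywhere,
# while a Dense layer over the flattened image learns a weight per pixel.
from tensorflow.keras import layers, Input, Model

inp = Input(shape=(100, 125, 3))
conv = Model(inp, layers.Conv2D(32, 3, padding='same')(inp))
print(conv.count_params())    # 3*3*3*32 + 32 = 896 shared weights

flat = Input(shape=(100 * 125 * 3,))
dense = Model(flat, layers.Dense(32)(flat))
print(dense.count_params())   # 37500*32 + 32 = 1,200,032 weights

Note that 896 is exactly the parameter count of the first conv2d layer in the model summary further below.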
Since the data are limited, we apply data augmentation using ImageDataGenerator, which generates augmented images in real time while the model trains. Random transformations can be applied to each training image as it is passed to the model.
The CNN model is a repeated network of the following layers:
1. Convolutional
2. Pooling
3. Dropout
4. Flatten
5. Dense
from tensorflow.keras.layers import Flatten, Dense, Dropout, BatchNormalization, Conv2D, MaxPool2D
from keras.optimizers import Adam
from keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.preprocessing.image import ImageDataGenerator
# Set the CNN model
# The architecture is: In -> [[Conv2D->relu]*2 -> MaxPool2D -> Dropout]*3 -> Flatten -> Dense*2 -> Dropout -> Out
input_shape = (100, 125, 3)
num_classes = 7
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',padding = 'Same',input_shape=input_shape))
model.add(Conv2D(32,kernel_size=(3, 3), activation='relu',padding = 'Same',))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.16))
model.add(Conv2D(32, kernel_size=(3, 3),activation='relu',padding = 'Same'))
model.add(Conv2D(32,kernel_size=(3, 3), activation='relu',padding = 'Same',))
model.add(MaxPool2D(pool_size = (2, 2)))
model.add(Dropout(0.20))
model.add(Conv2D(64, (3, 3), activation='relu',padding = 'same'))
model.add(Conv2D(64, (3, 3), activation='relu',padding = 'Same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(num_classes, activation='softmax'))
model.summary()
Data augmentation using ImageDataGenerator is applied before model training (see below).
Model: "sequential_1"
_________________________________________________________________
Layer (type) Output Shape Param #
=================================================================
conv2d (Conv2D) (None, 100, 125, 32) 896
_________________________________________________________________
conv2d_1 (Conv2D) (None, 100, 125, 32) 9248
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 50, 62, 32) 0
_________________________________________________________________
dropout (Dropout) (None, 50, 62, 32) 0
_________________________________________________________________
conv2d_2 (Conv2D) (None, 50, 62, 32) 9248
_________________________________________________________________
conv2d_3 (Conv2D) (None, 50, 62, 32) 9248
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 25, 31, 32) 0
_________________________________________________________________
dropout_1 (Dropout) (None, 25, 31, 32) 0
_________________________________________________________________
conv2d_4 (Conv2D) (None, 25, 31, 64) 18496
_________________________________________________________________
conv2d_5 (Conv2D) (None, 25, 31, 64) 36928
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 12, 15, 64) 0
_________________________________________________________________
dropout_2 (Dropout) (None, 12, 15, 64) 0
_________________________________________________________________
flatten (Flatten) (None, 11520) 0
_________________________________________________________________
dense_5 (Dense) (None, 256) 2949376
_________________________________________________________________
dense_6 (Dense) (None, 128) 32896
_________________________________________________________________
dropout_3 (Dropout) (None, 128) 0
_________________________________________________________________
dense_7 (Dense) (None, 7) 903
=================================================================
Total params: 3,067,239
Trainable params: 3,067,239
Non-trainable params: 0
# Define the optimizer
optimizer = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0,
amsgrad=False)
# Compile the model
model.compile(optimizer = optimizer , loss = "categorical_crossentropy",
metrics=["accuracy"])
# Set a learning rate annealer
learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy',
patience=4,
verbose=1,
factor=0.5,
min_lr=0.00001)
x_train, x_validate, y_train, y_validate = train_test_split(x_train, y_train, test_size = 0.1,
random_state = 999)
# Reshape image in 3 dimensions (height = 100, width = 125 , canal = 3)
x_train = x_train.reshape(x_train.shape[0], *(100, 125, 3))
x_test = x_test.reshape(x_test.shape[0], *(100, 125, 3))
x_validate = x_validate.reshape(x_validate.shape[0], *(100, 125, 3))
# With data augmentation to prevent overfitting
datagen = ImageDataGenerator(
featurewise_center=False, # set input mean to 0 over the dataset
samplewise_center=False, # set each sample mean to 0
featurewise_std_normalization=False, # divide inputs by std of the dataset
samplewise_std_normalization=False, # divide each input by its std
zca_whitening=False, # apply ZCA whitening
rotation_range=10, # randomly rotate images by up to 10 degrees
zoom_range = 0.1, # Randomly zoom image
width_shift_range=0.12, # randomly shift images horizontally (fraction of total width)
height_shift_range=0.12, # randomly shift images vertically (fraction of total height)
horizontal_flip=True, # randomly flip images
vertical_flip=True) # randomly flip images
datagen.fit(x_train)
# Fit the model
epochs = 60
batch_size = 16
history = model.fit_generator(datagen.flow(x_train,y_train, batch_size=batch_size),
epochs = epochs, validation_data = (x_validate,y_validate),
verbose = 1, steps_per_epoch=x_train.shape[0] // batch_size
, callbacks=[learning_rate_reduction])
from tensorflow.keras.metrics import Recall
from sklearn.metrics import classification_report,confusion_matrix
from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
loss, accuracy = model.evaluate(x_test, y_test, verbose=1)
loss_v, accuracy_v = model.evaluate(x_validate, y_validate, verbose=1)
print("Validation: accuracy = %f ; loss_v = %f" % (accuracy_v, loss_v))
print("Test: accuracy = %f ; loss = %f" % (accuracy, loss))
model.save("model.h5")
78/78 [==============================] - 1s 8ms/step - loss: 0.6185 - accuracy: 0.7686
21/21 [==============================] - 0s 7ms/step - loss: 0.6881 - accuracy: 0.7433
Validation: accuracy = 0.743284 ; loss_v = 0.688070
Test: accuracy = 0.768642 ; loss = 0.618472
import itertools
# Function to plot confusion matrix
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')
# Predict the values from the validation dataset
Y_pred = model.predict(x_validate)
# Convert predictions classes to one hot vectors
Y_pred_classes = np.argmax(Y_pred,axis = 1)
# Convert validation observations to one hot vectors
Y_true = np.argmax(y_validate,axis = 1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# Predict the values from the validation dataset
Y_pred = model.predict(x_test)
# Convert predictions classes to one hot vectors
Y_pred_classes = np.argmax(Y_pred,axis = 1)
# Convert validation observations to one hot vectors
Y_true = np.argmax(y_test,axis = 1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes = range(7))
label_frac_error = 1 - np.diag(confusion_mtx) / np.sum(confusion_mtx, axis=1)
plt.bar(np.arange(7),label_frac_error)
plt.xlabel('True Label')
plt.ylabel('Fraction classified incorrectly')
Text(0, 0.5, 'Fraction classified incorrectly')
# Function to plot the model's training/validation loss and accuracy
# def plot_model_history(model_history):
#     fig, axs = plt.subplots(1, 2, figsize=(15, 5))
#     # summarize history for accuracy
#     axs[0].plot(range(1, len(model_history.history['accuracy']) + 1), model_history.history['accuracy'])
#     axs[0].plot(range(1, len(model_history.history['val_accuracy']) + 1), model_history.history['val_accuracy'])
#     axs[0].set_title('Model Accuracy')
#     axs[0].set_ylabel('Accuracy')
#     axs[0].set_xlabel('Epoch')
#     axs[0].set_xticks(np.arange(1, len(model_history.history['accuracy']) + 1), len(model_history.history['accuracy']) / 10)
#     axs[0].legend(['train', 'val'], loc='best')
#     # summarize history for loss
#     axs[1].plot(range(1, len(model_history.history['loss']) + 1), model_history.history['loss'])
#     axs[1].plot(range(1, len(model_history.history['val_loss']) + 1), model_history.history['val_loss'])
#     axs[1].set_title('Model Loss')
#     axs[1].set_ylabel('Loss')
#     axs[1].set_xlabel('Epoch')
#     axs[1].set_xticks(np.arange(1, len(model_history.history['loss']) + 1), len(model_history.history['loss']) / 10)
#     axs[1].legend(['train', 'val'], loc='best')
#     plt.show()
# plot_model_history(history)
Transfer Learning
Why MobileNet?
MobileNet significantly reduces the number of parameters compared to a network of regular convolutions with the same depth, resulting in lightweight deep neural networks. The two layer types used here in addition to those of the CNN above are batch normalization and zero padding. A rough parameter comparison is sketched below.
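As a hedged sketch of why MobileNet's depthwise separable convolutions are cheaper than regular ones (the layer sizes below are illustrative, not MobileNet's actual configuration):

# Regular vs depthwise separable convolution: same in/out shapes, far fewer weights.
from tensorflow.keras import layers, Input, Model

inp = Input(shape=(100, 125, 32))
regular = Model(inp, layers.Conv2D(64, 3, padding='same')(inp))
separable = Model(inp, layers.SeparableConv2D(64, 3, padding='same')(inp))
print(regular.count_params())    # 3*3*32*64 + 64 = 18,496
print(separable.count_params())  # 3*3*32 + 32*64 + 64 = 2,400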
import tensorflow
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
# Resize to width 125 x height 100 so the arrays match the modified MobileNet input below
df['image'] = df['path'].map(lambda x: np.asarray(Image.open(x).resize((125,100))))
features=df.drop(columns=['cell_type_idx'],axis=1)
target=df['cell_type_idx']
x_train_o, x_test_o, y_train_o, y_test_o = train_test_split(features, target,
test_size=0.25,random_state=666)
tf.unique(x_train_o.cell_type.values)
Unique(y=<tf.Tensor: shape=(7,), dtype=string, numpy=array([b'Melanocytic nevi', b'Basal cell carcinoma', b'Melanoma', b'Vascular lesions', b'Benign keratosis-like lesions ', b'Actinic keratoses', b'Dermatofibroma'], dtype=object)>, idx=<tf.Tensor: shape=(7511,), dtype=int32, numpy=array([0, 1, 0, ..., 1, 0, 0], dtype=int32)>)
x_train = np.asarray(x_train_o['image'].tolist())
x_test = np.asarray(x_test_o['image'].tolist())
x_train_mean = np.mean(x_train)
x_train_std = np.std(x_train)
x_test_mean = np.mean(x_test)
x_test_std = np.std(x_test)
x_train = (x_train - x_train_mean)/x_train_std
x_test = (x_test - x_test_mean)/x_test_std
# Perform one-hot encoding on the labels
y_train = to_categorical(y_train_o, num_classes = 7)
y_test = to_categorical(y_test_o, num_classes = 7)
y_test
Due to the limited dataset, a pretrained MobileNet model is used.
x_train, x_validate, y_train, y_validate = train_test_split(x_train, y_train, test_size = 0.1,
random_state = 999)
# Reshape images in 3 dimensions (height = 100, width = 125, channels = 3)
# to match the modified MobileNet input shape set below
x_train = x_train.reshape(x_train.shape[0], *(100, 125, 3))
x_test = x_test.reshape(x_test.shape[0], *(100, 125, 3))
x_validate = x_validate.reshape(x_validate.shape[0], *(100, 125, 3))
print(x_train.shape)
# create a copy of a mobilenet model
mobile = tensorflow.keras.applications.mobilenet.MobileNet()
mobile.summary()
def change_model(model, new_input_shape=(None, 40, 40, 3), custom_objects=None):
    # replace input shape of first layer
    config = model.layers[0].get_config()
    config['batch_input_shape'] = new_input_shape
    model._layers[0] = model.layers[0].from_config(config)

    # rebuild model architecture by exporting and importing via json
    new_model = tensorflow.keras.models.model_from_json(model.to_json(), custom_objects=custom_objects)

    # copy weights from old model to new one
    for layer in new_model._layers:
        try:
            layer.set_weights(model.get_layer(name=layer.name).get_weights())
            print("Loaded layer {}".format(layer.name))
        except Exception:
            print("Could not transfer weights for layer {}".format(layer.name))
    return new_model
new_model = change_model(mobile, new_input_shape=[None] + [100,125,3])
new_model.summary()
# CREATE THE MODEL ARCHITECTURE
# Exclude the last 5 layers of the above model.
# This will include all layers up to and including global_average_pooling2d_1
x = new_model.layers[-6].output
# Create a new dense layer for predictions
# 7 corresponds to the number of classes
x = Dropout(0.25)(x)
predictions = Dense(7, activation='softmax')(x)
# inputs=new_model.input selects the input layer, outputs=predictions refers to the
# dense layer we created above.
model = Model(inputs=new_model.input, outputs=predictions)
# We need to choose how many layers we actually want to be trained.
# Here we are freezing the weights of all layers except the
# last 23 layers in the new model.
# The last 23 layers of the model will be trained.
for layer in model.layers[:-23]:
layer.trainable = False
# Define Top2 and Top3 Accuracy
from tensorflow.keras.metrics import categorical_accuracy, top_k_categorical_accuracy
def top_3_accuracy(y_true, y_pred):
return top_k_categorical_accuracy(y_true, y_pred, k=3)
def top_2_accuracy(y_true, y_pred):
return top_k_categorical_accuracy(y_true, y_pred, k=2)
model.compile(Adam(lr=0.01), loss='categorical_crossentropy',
metrics=[categorical_accuracy, top_2_accuracy, top_3_accuracy])
# Add class weights to try to make the model more sensitive to melanoma.
# cell_type_idx comes from pd.Categorical, so the indices follow the
# alphabetical order of the full class names.
class_weights = {
    0: 1.0,  # Actinic keratoses (akiec)
    1: 1.0,  # Basal cell carcinoma (bcc)
    2: 1.0,  # Benign keratosis-like lesions (bkl)
    3: 1.0,  # Dermatofibroma (df)
    4: 1.0,  # Melanocytic nevi (nv)
    5: 3.0,  # Melanoma (mel) - make the model more sensitive to melanoma
    6: 1.0,  # Vascular lesions (vasc)
}
filepath = "model.h5"
checkpoint = ModelCheckpoint(filepath, monitor='val_top_3_accuracy', verbose=1,
save_best_only=True, mode='max')
reduce_lr = ReduceLROnPlateau(monitor='val_top_3_accuracy', factor=0.5, patience=2,
verbose=1, mode='max', min_lr=0.00001)
callbacks_list = [checkpoint, reduce_lr]
history = model.fit_generator(datagen.flow(x_train,y_train, batch_size=batch_size),
class_weight=class_weights,
validation_data=(x_validate,y_validate),steps_per_epoch=x_train.shape[0] //
batch_size,
epochs=10, verbose=1,
callbacks=callbacks_list)
from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
# get the metric names so we can use evaluate_generator
model.metrics_names
# Here the last epoch will be used.
val_loss, val_cat_acc, val_top_2_acc, val_top_3_acc = model.evaluate(datagen.flow(x_test, y_test, batch_size=16))
print('val_loss:', val_loss)
print('val_cat_acc:', val_cat_acc)
print('val_top_2_acc:', val_top_2_acc)
print('val_top_3_acc:', val_top_3_acc)
# Here the best epoch will be used.
model.load_weights('model.h5')
val_loss, val_cat_acc, val_top_2_acc, val_top_3_acc = model.evaluate_generator(datagen.flow(x_test, y_test, batch_size=16))
print('val_loss:', val_loss)
print('val_cat_acc:', val_cat_acc)
print('val_top_2_acc:', val_top_2_acc)
print('val_top_3_acc:', val_top_3_acc)
Plot the Training Curves
# display the loss and accuracy curves
import matplotlib.pyplot as plt
acc = history.history['categorical_accuracy']
val_acc = history.history['val_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']
train_top2_acc = history.history['top_2_accuracy']
val_top2_acc = history.history['val_top_2_accuracy']
train_top3_acc = history.history['top_3_accuracy']
val_top3_acc = history.history['val_top_3_accuracy']
epochs = range(1, len(acc) + 1)
plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.figure()
plt.plot(epochs, acc, 'bo', label='Training cat acc')
plt.plot(epochs, val_acc, 'b', label='Validation cat acc')
plt.title('Training and validation cat accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, train_top2_acc, 'bo', label='Training top2 acc')
plt.plot(epochs, val_top2_acc, 'b', label='Validation top2 acc')
plt.title('Training and validation top2 accuracy')
plt.legend()
plt.figure()
plt.plot(epochs, train_top3_acc, 'bo', label='Training top3 acc')
plt.plot(epochs, val_top3_acc, 'b', label='Validation top3 acc')
plt.title('Training and validation top3 accuracy')
plt.legend()
plt.show()
accuracy = model.evaluate(x_test, y_test,verbose=1)[1]
accuracy_v = model.evaluate(x_validate, y_validate)[1]
print("Validation: accuracy = ", accuracy_v)
print("Test: accuracy = ",accuracy)
model.save("model.h5")
# make a prediction
predictions = model.predict_generator(datagen.flow(x_test,y_test, batch_size=16),
verbose=1)
predictions.shape
test_batches = datagen.flow(x_test,y_test, batch_size=16)
test_batches
Create a Confusion Matrix
# Source: scikit-learn website
# http://scikit-learn.org/stable/auto_examples/model_selection/
# plot_confusion_matrix.html#sphx-glr-auto-examples-model-selection-plot-confusion-matrix-py
def plot_confusion_matrix(cm, classes,
                          normalize=False,
                          title='Confusion matrix',
                          cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()
# Predict the values from the validation dataset
Y_pred = model.predict(x_validate)
# Convert predictions classes to one hot vectors
Y_pred_classes = np.argmax(Y_pred,axis = 1)
# Convert validation observations to one hot vectors
Y_true = np.argmax(y_validate,axis = 1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes = range(7))
# Predict the values from the test dataset
Y_pred = model.predict(x_test)
# Convert prediction probabilities to class indices
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot test labels to class indices
Y_true = np.argmax(y_test, axis=1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes = range(7))
y_pred = model.predict(x_test)
y_pred = y_pred > 0.5  # threshold probabilities into a binary indicator matrix for the report
cm_plot_labels = ['akiec', 'bcc', 'bkl', 'df', 'mel','nv', 'vasc']
from sklearn.metrics import classification_report
# Generate a classification report
report = classification_report(y_test, y_pred, target_names=cm_plot_labels)
print(report)
model.save("mobilenet_model.h5")
Generate the Classification Report
tile_df = df.copy()
tile_df.drop('lesion_id', inplace=True, axis=1)
tile_df.drop('image_id', inplace=True, axis=1)
tile_df.drop('cell_type', inplace=True, axis=1)
tile_df.drop('path', inplace=True, axis=1)
tile_df.drop('dx', inplace=True, axis=1)
tile_df.head()
X = tile_df.drop(['cell_type_idx'],axis=1).values
y = tile_df['cell_type_idx'].values
X_train,X_test,y_train,y_test = train_test_split(X,y,random_state=0)
pip install alibi
pip install shap
import shap
shap.initjs()
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from alibi.explainers import KernelShap
from scipy.special import logit
Techniques applied: LIME, PDP, SHAP, etc.
  dx_type   age   sex   localization  cell_type_idx
0 histo     80.0  male  scalp         2
1 histo     80.0  male  scalp         2
2 histo     80.0  male  scalp         2
3 histo     80.0  male  scalp         2
4 histo     75.0  male  ear           2
from sklearn.metrics import confusion_matrix, plot_confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
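# Note: the .map() calls below assign integer codes (label encoding); despite
# the "_onehot" suffix in the column names, these are not one-hot vectors.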
tile_df['localization_onehot'] = tile_df.localization.map({'scalp':0, 'ear':1, 'face':2,
'neck':3,'back':4, 'trunk':5, 'chest':6,
'upper extremity':7, 'abdomen':8, 'lower extremity':9,
'genital':10, 'hand':11, 'foot':12, 'acral':13, 'unknown':14})
tile_df.head()
  dx_type   age   sex   localization  cell_type_idx  localization_onehot
0 histo     80.0  male  scalp         2              0
1 histo     80.0  male  scalp         2              0
2 histo     80.0  male  scalp         2              0
3 histo     80.0  male  scalp         2              0
4 histo     75.0  male  ear           2              1
tile_df['dx_type_onehot'] = tile_df.dx_type.map({'confocal': 0, 'consensus': 1,
                                                 'follow_up': 2, 'histo': 3})
tile_df.head()
  dx_type   age   sex   localization  cell_type_idx  localization_onehot  dx_type_onehot
0 histo     80.0  male  scalp         2              0                    3
1 histo     80.0  male  scalp         2              0                    3
2 histo     80.0  male  scalp         2              0                    3
3 histo     80.0  male  scalp         2              0                    3
4 histo     75.0  male  ear           2              1                    3
tile_df['gender_male'] = tile_df.sex.map({'female':0, 'male':1, 'unknown':2})
tile_df.head()
  dx_type   age   sex   localization  cell_type_idx  localization_onehot  dx_type_onehot  gender_male
0 histo     80.0  male  scalp         2              0                    3               1
1 histo     80.0  male  scalp         2              0                    3               1
2 histo     80.0  male  scalp         2              0                    3               1
3 histo     80.0  male  scalp         2              0                    3               1
4 histo     75.0  male  ear           2              1                    3               1
tile_df.columns
Index(['dx_type', 'age', 'sex', 'localization', 'cell_type_idx',
'localization_onehot', 'dx_type_onehot', 'gender_male'],
dtype='object')
features = ['age', 'localization_onehot', 'dx_type_onehot','gender_male']
X = tile_df[features]
y = tile_df['cell_type_idx'].values
X_train,X_test,y_train,y_test = train_test_split(X,y,random_state=0)
from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier
model = XGBClassifier(random_state=1)
model = model.fit(X_train, y_train)
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
from sklearn.metrics import accuracy_score
accuracy = accuracy_score(y_test, predictions)
print("Accuracy: %.2f%%" % (accuracy * 100.0))
Accuracy: 72.16%
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)
print('Expected Value: ', explainer.expected_value)
Expected Value: [-0.6287137, -0.21934628, 0.4661603, -1.7456617, 2.6632032, 0.5190712, -1.2845858]
shap.summary_plot(shap_values, X_test, plot_type="bar")
shap.summary_plot(shap_values[0], X_test)
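SHAP values are additive: for every sample, the expected value for a class plus the sum of that class's SHAP values reproduces the model's raw (pre-softmax) margin for that sample. A quick sanity check of this property, as a sketch assuming the explainer and shap_values computed above:
i = 0  # any test-set row index
# For tree ensembles, expected_value[c] + shap_values[c][i].sum()
# equals the raw margin of class c for sample i.
reconstructed_margin = explainer.expected_value[0] + shap_values[0][i].sum()
print("Reconstructed class-0 margin for sample", i, ":", reconstructed_margin)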
from sklearn.preprocessing import LabelEncoder
## Preprocess training and test target (y) after having performed train-test split
le = LabelEncoder()
y_multi_train = pd.Series(le.fit_transform(y_train))
y_multi_test = pd.Series(le.transform(y_test))
## Check classes
le.classes_
array([0, 1, 2, 3, 4, 5, 6], dtype=int8)
shap.initjs()
shap.dependence_plot('dx_type_onehot', interaction_index='age',
shap_values=shap_values[0],
features=X_test,
display_features=X_test)
shap.initjs()
shap.force_plot(explainer.expected_value[0], shap_values[0][:100,:], X_test.iloc[:100,:])
shap.initjs()
shap.force_plot(explainer.expected_value[0], shap_values[0][15,:], X_test.iloc[15,:])
Feature Importance:
Feature importance measures the increase in the model's prediction error after we
permute a feature's values.
A feature is "important" if shuffling its values increases the model error, because in that case
the model relied on the feature for its predictions.
A feature is "unimportant" if shuffling its values leaves the model error unchanged, because
in that case the model ignored the feature when predicting.
Now install eli5:
pip install eli5
import eli5
from eli5.sklearn import PermutationImportance
eli5.show_weights(model.get_booster(), top=15)
Weight   Feature
0.8239   dx_type_onehot
0.0748   age
0.0667   localization_onehot
0.0346   gender_male
tgt = 6
print('Reference:', y_test[tgt])
print('Predicted:', predictions[tgt])
eli5.show_prediction(model.get_booster(), X_test.iloc[tgt],
                     feature_names=features, show_feature_values=True)
Reference: 4
Predicted: 4
Per-class explanation from eli5.show_prediction (probability, raw score, and top feature contributions for each of the 7 classes):

Class  Probability  Score
y=0    0.000        -7.697
y=1    0.000        -5.861
y=2    0.000        -1.815
y=3    0.000        -2.777
y=4    1.000         7.013
y=5    0.000        -5.009
y=6    0.000        -4.376

For each class, eli5 lists the signed contribution of every feature (age, localization_onehot, dx_type_onehot, gender_male and the <BIAS> term) toward that class's score. For the predicted class y=4 the breakdown is:

Contribution  Feature              Value
+4.751        dx_type_onehot       2.000
+2.163        <BIAS>               1.000
+0.064        age                  50.000
+0.041        gender_male          0.000
-0.006        localization_onehot  9.000
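The feature importance text above describes permutation importance, but eli5.show_weights(model.get_booster()) reports XGBoost's built-in importances, and the imported PermutationImportance wrapper is never applied. A minimal sketch of permutation importance itself, assuming the fitted model and the X_test/y_test split created earlier:
# Shuffle each feature column in X_test repeatedly and measure the score drop.
perm = PermutationImportance(model, random_state=1).fit(X_test, y_test)
eli5.show_weights(perm, feature_names=features)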
PDP:
The partial dependence plot (PDP) shows the marginal effect that one or two features have
on the predicted outcome of a machine learning model.
A partial dependence plot can show whether the relationship between the target and a
feature is linear, monotonic or more complex.
For each of the categories, we get a PDP estimate by forcing all data instances to take the
same category.
pip install pdpbox
from pdpbox import pdp, get_dataset, info_plots
pdp_feat_67_rf = pdp.pdp_isolate(model=model,
                                 dataset=X_train,
                                 model_features=features,
                                 feature='dx_type_onehot')
fig, axes = pdp.pdp_plot(pdp_isolate_out=pdp_feat_67_rf,
                         feature_name='type of diagnosis',
                         center=True,
                         x_quantile=True,
                         ncols=3,
                         plot_lines=True,
                         frac_to_plot=100)
The PDP (Partial Dependence Plot) shows the relationship between an increase or decrease in one feature and the
model's prediction.
For example: in figure 1 (class 0), we observe that the chance of the skin disease belonging to class 0 increases when the
value of dx_type_onehot changes from 2 (follow-up) to 3 (histopathology).
Similarly, in figure 5 (class 4), we observe that the probability of the skin disease belonging to class 4 is extremely high when
the value of dx_type_onehot lies between 0 and 2, and decreases comparatively when it lies between 2 and 3.
Likewise, the probability of the skin disease belonging to class 6 is extremely low when the value of dx_type_onehot lies
between 0 and 2 (confocal, consensus and follow-up), and increases comparatively when it changes from 2 to 3.
LIME
Step 1: Generate random perturbations of the input image
Step 2: Predict the class for each perturbation
Step 3: Compute weights (importance) for the perturbations
Step 4: Fit an interpretable linear model using the perturbations, predictions and weights
import skimage.io
import skimage.segmentation
np.random.seed(222)
Xi = x_test[3]
preds = model.predict(Xi[np.newaxis,:,:,:])
top_pred_classes = preds[0].argsort()[-5:][::-1] # Save ids of top 5 classes
top_pred_classes
print(y_test[3])
skimage.io.imshow(Xi)
#Generate segmentation for image
superpixels = skimage.segmentation.quickshift(Xi, kernel_size=4,max_dist=200, ratio=0.2)
num_superpixels = np.unique(superpixels).shape[0]
skimage.io.imshow(skimage.segmentation.mark_boundaries(Xi, superpixels))
print("The number of super pixels generated")
num_superpixels
LIME is a technique that explains how the input features of a machine learning model
affect its predictions. For instance, for image classification tasks, LIME finds the region of
an image (a set of superpixels) with the strongest association with a prediction label.
LIME creates explanations by generating a new dataset of random perturbations (with
their respective predictions) around the instance being explained and then fitting a
weighted local surrogate model, i.e., a simple model that explains the individual prediction.
#Generate perturbations
num_perturb = 150
perturbations = np.random.binomial(1, 0.5, size=(num_perturb, num_superpixels))
#Create function to apply perturbations to images
import copy
def perturb_image(img, perturbation, segments):
    # keep only the superpixels switched "on" in this perturbation
    active_pixels = np.where(perturbation == 1)[0]
    mask = np.zeros(segments.shape)
    for active in active_pixels:
        mask[segments == active] = 1
    perturbed_image = copy.deepcopy(img)
    perturbed_image = perturbed_image * mask[:, :, np.newaxis]
    return perturbed_image
#Show example of perturbations
print(perturbations[0])
predictions = []
for pert in perturbations:
    perturbed_img = perturb_image(Xi, pert, superpixels)
    pred = model.predict(perturbed_img[np.newaxis, :, :, :])
    predictions.append(pred)
predictions = np.array(predictions)
print(predictions.shape)
skimage.io.imshow(perturb_image(Xi,perturbations[0],superpixels))
skimage.io.imshow(perturb_image(Xi,perturbations[11],superpixels))
skimage.io.imshow(perturb_image(Xi,perturbations[2],superpixels))
#Compute distances to original image
import sklearn.metrics
original_image = np.ones(num_superpixels)[np.newaxis,:] #Perturbation with all superpixels enabled
distances = sklearn.metrics.pairwise_distances(perturbations,original_image, metric='cosine').ravel()
print(distances.shape)
#Transform distances to a value between 0 and 1 (weights) using a kernel function
kernel_width = 0.25
weights = np.sqrt(np.exp(-(distances**2)/kernel_width**2)) #Kernel function
print(weights.shape)
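As a worked check of this kernel (assuming kernel_width = 0.25 as above): a perturbation at cosine distance 0.2 from the original image receives weight sqrt(exp(-0.2**2 / 0.25**2)) ~ 0.73, while one at distance 0.5 receives ~ 0.14, so perturbations closer to the original image dominate the surrogate fit.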
#Estimate linear model
from sklearn.linear_model import LinearRegression
class_to_explain = 4
simpler_model = LinearRegression()
simpler_model.fit(X=perturbations, y=predictions[:,:,class_to_explain],
sample_weight=weights)
coeff = simpler_model.coef_[0]
#Use coefficients from linear model to extract top features
num_top_features = 4
top_features = np.argsort(coeff)[-num_top_features:]
#Show only the superpixels corresponding to the top features
mask = np.zeros(num_superpixels)
mask[top_features]= True #Activate top superpixels
skimage.io.imshow(perturb_image(Xi,mask,superpixels))
Conclusion
This paper focused on various techniques for the classification of skin diseases. Automating
the process of skin disease identification and classification can be very helpful and also shortens the
time needed for diagnosis. This paper presented a survey of traditional (feature extraction based) and
CNN based approaches for skin disease classification. From the study it is concluded that, for the
traditional approach, the feature selection process is time consuming and the selection of relevant
features is very important. The deep learning algorithm CNN, by contrast, learns the features
automatically and efficiently; for feature extraction, CNN selects its filters intelligently as compared
with manually designed ones. Pre-trained models such as Inception v3, ResNet, VGG16, VGG19 and
AlexNet are trained on very large datasets with millions of general images and can be used with
transfer learning or fine-tuning. However, such a pre-trained model still has to be adapted to skin
disease images, since its original training data typically does not include them. Also, a CNN needs a
fairly large dataset for training so that it can learn effectively, compared with the traditional way of
skin disease classification.
References
[1] D.A. Okuboyejo, O.O. Olugbara and S.A. Odunaike, “Automating skin disease diagnosis using image classification,” In proceedings of
the world congress on engineering and computer science 2013 Oct 23, Vol. 2, pp. 850-854.
[2] A.A. Amarathunga, E.P. Ellawala, G.N. Abeysekar and C.R Amalraj, “Expert system for diagnosis of skin diseases,” International Journal
of Scientific & Technology Research, 2015 Jan 4;4(01):174-8.
[3] S. Chakraborty, K. Mali, S. Chatterjee, S. Anand, A. Basu, S. Banerjee, M. Das and A. Bhattacharya, “Image based skin disease detection
using hybrid neural network coupled bag-of-features,” In 2017 IEEE 8th Annual Ubiquitous Computing, Electronics and Mobile
Communication Conference (UEMCON), 2017 Oct 19, pp. 242-246. IEEE.
[4] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau and S. Thrun, “Dermatologist-level classification of skin cancer with deep
neural networks,” Nature, 2017 Feb;542(7639):115-8.
[5] X. Zhang, S. Wang, J. Liu and C. Tao, “Towards improving diagnosis of skin diseases by combining deep neural network and human
knowledge,” BMC medical informatics and decision making, 2018 Jul;18(2):59.
[6] T.J. Brinker, A. Hekler, J.S. Utikal, N. Grabe, D. Schadendorf, J. Klode, C. Berking, T. Steeb, A.H. Enk and C. von Kalle, “Skin cancer
classification using convolutional neural networks: systematic review,” Journal of medical Internet research, 2018;20(10):e11936.
[7] R. Kulhalli, C. Savadikar and B. Garware, “A Hierarchical Approach to Skin Lesion Classification,” In Proceedings of the ACM India Joint
International Conference on Data Science and Management of Data 2019 Jan 3 (pp. 245-250).
[8] M.A. Khan, M.Y. Javed, M. Sharif, T. Saba and A. Rehman, “Multimodel deep neural network based features extraction and optimal
selection approach for skin lesion classification,” In 2019 international conference on computer and information sciences (ICCIS) 2019
Apr 3 (pp. 1-7) IEEE.
[9] J. Premaladha, S. Sujitha, M.L. Priya and K.S. Ravichandran, “A survey on melanoma diagnosis using image processing and soft
computing techniques,” Research Journal of Information Technology, 2014 May;6(2):65-80.
[10] S. Chatterjee, D. Dey, S. Munshi and S. Gorai, “Extraction of features from cross correlation in space and frequency domains for
classification of skin lesions,” Biomedical Signal Processing and Control, 2019 Aug 1,53:101581.
[11] M.S. Manerkar, U. Snekhalatha, S. Harsh, J. Saxena, S.P. Sarma and M. Anburajan, “Automated skin disease segmentation and
classification using multi-class SVM classifier”. 2016.
[12] N. Codella, J. Cai, M. Abedini, R. Garnavi, A. Halpern and J.R. Smith, “Deep learning, sparse coding, and SVM for melanoma recognition
in dermoscopy images,” In International workshop on machine learning in medical imaging, 2015 Oct 5 (pp. 118-126), Springer, Cham.
[13] P.M. Burlina, N.J. Joshi, E. Ng, S.D. Billings, A.W. Rebman and J.N. Aucott, “Automated detection of erythema migrans and other
confounding skin lesions via deep learning,” Computers in biology and medicine, 2019 Feb 1, 105:151-6.
[14] I. Zaqout, “Diagnosis of skin lesions based on dermoscopic images using image processing techniques,” In Pattern
Recognition - Selected Methods and Applications, 2019 Jul 15, IntechOpen.
[15] V.B. Kumar, S.S. Kumar and V. Saboo, “Dermatological disease detection using image processing and machine learning,” In 2016 Third
International Conference on Artificial Intelligence and Pattern Recognition, (AIPR) 2016 Sep 19 (pp. 1-6). IEEE.
[16] E. Jana, R. Subban and S. Saraswathi, “Research on Skin Cancer Cell Detection using Image Processing,” In 2017 IEEE International
Conference on Computational Intelligence and Computing Research (ICCIC), 2017 Dec 14, (pp. 1-8), IEEE.
[17] M. Monisha, A. Suresh and M.R. Rashmi, “Artificial intelligence based skin classification using GMM,” Journal of medical systems, 2019
Jan 1, 43(1):3.
[18] N.C. Codella, D. Gutman, M.E. Celebi, B. Helba, M.A. Marchetti, S.W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler and A. Halpern,
“Skin lesion analysis toward melanoma detection: A challenge at the 2017 international symposium on biomedical imaging (isbi),
hosted by the international skin imaging collaboration (isic),” In 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI
2018), 2018 Apr 4, (pp. 168-172), IEEE.
[…] https://www.cancercenter.com/cancer-types/melanoma/symptoms
[…] https://www.mayoclinic.org/diseases-conditions
[…] https://www.isic-archive.com
[…] https://sites.google.com/site/robustmelanomascreening/dataset
[…] https://www.dropbox.com/s/k88qukc20ljnbuo/PH2Dataset.rar
[30] http://www.cs.rug.nl/~imaging/databases/melanoma_naevi/
[31] https://www.derm101.com/image-library/?match=IN
[…] N. Yadav, V.K. Narang and U. Shrivastava, “Skin diseases detection models using image processing: A survey,” International Journal of
Computer Applications, 2016 Mar, 137(12):34-9.
[…] N. Gessert, T. Sentker, F. Madesta, R. Schmitz, H. Kniep, I. Baltruschat, R. Werner and A. Schlaefer, “Skin Lesion Classification Using
CNNs with Patch-Based Attention and Diagnosis-Guided Loss Weighting,” IEEE Transactions on Biomedical Engineering, 2019 May 9.
[38] https://www.biospectrumindia.com/news/73/8437/skin-diseases-togrow-in-india-by-2015-report.html
[39] https://www.who.int/uv/faq/skincancer/en/index1.html
[…] https://towardsdatascience.com
[…] www.analyticsvidhya.com
More Related Content

Similar to Vishnu Vardhan Project.pdf

Smartphone app for Skin Cancer Diagnostics
Smartphone app for Skin Cancer Diagnostics Smartphone app for Skin Cancer Diagnostics
Smartphone app for Skin Cancer Diagnostics Iowa State University
 
Common Skin Disease Diagnosis and Prediction: A Review
Common Skin Disease Diagnosis and Prediction: A ReviewCommon Skin Disease Diagnosis and Prediction: A Review
Common Skin Disease Diagnosis and Prediction: A ReviewIRJET Journal
 
Computer Vision in Health Care (1).pptx
Computer Vision in Health Care (1).pptxComputer Vision in Health Care (1).pptx
Computer Vision in Health Care (1).pptxAishwaryaKulal1
 
A review of human skin detection applications based on image processing
A review of human skin detection applications based on image processingA review of human skin detection applications based on image processing
A review of human skin detection applications based on image processingjournalBEEI
 
Skin Cancer Detection Application
Skin Cancer Detection ApplicationSkin Cancer Detection Application
Skin Cancer Detection ApplicationIRJET Journal
 
Detection of Skin Cancer Based on Skin Lesion Images UsingDeep Learning
Detection of Skin Cancer Based on Skin Lesion Images UsingDeep LearningDetection of Skin Cancer Based on Skin Lesion Images UsingDeep Learning
Detection of Skin Cancer Based on Skin Lesion Images UsingDeep LearningIRJET Journal
 
Skin Cancer Detection using Digital Image Processing and Implementation using...
Skin Cancer Detection using Digital Image Processing and Implementation using...Skin Cancer Detection using Digital Image Processing and Implementation using...
Skin Cancer Detection using Digital Image Processing and Implementation using...ijtsrd
 
SKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNING
SKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNINGSKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNING
SKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNINGIRJET Journal
 
Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...
Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...
Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...CSCJournals
 
SKin lesion detection using ml approach.pptx
SKin lesion detection using ml approach.pptxSKin lesion detection using ml approach.pptx
SKin lesion detection using ml approach.pptxPrachiPancholi5
 
IRJET- Skin Cancer Detection using Digital Image Processing
IRJET- Skin Cancer Detection using Digital Image ProcessingIRJET- Skin Cancer Detection using Digital Image Processing
IRJET- Skin Cancer Detection using Digital Image ProcessingIRJET Journal
 
DETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNING
DETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNINGDETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNING
DETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNINGIRJET Journal
 
IRJET- Cancer Detection Techniques - A Review
IRJET- Cancer Detection Techniques - A ReviewIRJET- Cancer Detection Techniques - A Review
IRJET- Cancer Detection Techniques - A ReviewIRJET Journal
 
A Review of Super Resolution and Tumor Detection Techniques in Medical Imaging
A Review of Super Resolution and Tumor Detection Techniques in Medical ImagingA Review of Super Resolution and Tumor Detection Techniques in Medical Imaging
A Review of Super Resolution and Tumor Detection Techniques in Medical Imagingijtsrd
 
Skin Cancer Detection and Classification
Skin Cancer Detection and ClassificationSkin Cancer Detection and Classification
Skin Cancer Detection and ClassificationDr. Amarjeet Singh
 
An Innovative Approach for Automated Skin Disease Identification through Adva...
An Innovative Approach for Automated Skin Disease Identification through Adva...An Innovative Approach for Automated Skin Disease Identification through Adva...
An Innovative Approach for Automated Skin Disease Identification through Adva...IRJET Journal
 
Melanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep LearningMelanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep LearningIRJET Journal
 
A Survey On Melanoma: Skin Cancer Through Computerized Diagnosis
A Survey On Melanoma: Skin Cancer Through Computerized DiagnosisA Survey On Melanoma: Skin Cancer Through Computerized Diagnosis
A Survey On Melanoma: Skin Cancer Through Computerized DiagnosisChristo Ananth
 
Comparing the performance of linear regression versus deep learning on detect...
Comparing the performance of linear regression versus deep learning on detect...Comparing the performance of linear regression versus deep learning on detect...
Comparing the performance of linear regression versus deep learning on detect...journalBEEI
 

Similar to Vishnu Vardhan Project.pdf (20)

Smartphone app for Skin Cancer Diagnostics
Smartphone app for Skin Cancer Diagnostics Smartphone app for Skin Cancer Diagnostics
Smartphone app for Skin Cancer Diagnostics
 
Skin Cancer Diagnostics.pdf
Skin Cancer Diagnostics.pdfSkin Cancer Diagnostics.pdf
Skin Cancer Diagnostics.pdf
 
Common Skin Disease Diagnosis and Prediction: A Review
Common Skin Disease Diagnosis and Prediction: A ReviewCommon Skin Disease Diagnosis and Prediction: A Review
Common Skin Disease Diagnosis and Prediction: A Review
 
Computer Vision in Health Care (1).pptx
Computer Vision in Health Care (1).pptxComputer Vision in Health Care (1).pptx
Computer Vision in Health Care (1).pptx
 
A review of human skin detection applications based on image processing
A review of human skin detection applications based on image processingA review of human skin detection applications based on image processing
A review of human skin detection applications based on image processing
 
Skin Cancer Detection Application
Skin Cancer Detection ApplicationSkin Cancer Detection Application
Skin Cancer Detection Application
 
Detection of Skin Cancer Based on Skin Lesion Images UsingDeep Learning
Detection of Skin Cancer Based on Skin Lesion Images UsingDeep LearningDetection of Skin Cancer Based on Skin Lesion Images UsingDeep Learning
Detection of Skin Cancer Based on Skin Lesion Images UsingDeep Learning
 
Skin Cancer Detection using Digital Image Processing and Implementation using...
Skin Cancer Detection using Digital Image Processing and Implementation using...Skin Cancer Detection using Digital Image Processing and Implementation using...
Skin Cancer Detection using Digital Image Processing and Implementation using...
 
SKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNING
SKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNINGSKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNING
SKIN CANCER DETECTION AND SEVERITY PREDICTION USING DEEP LEARNING
 
Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...
Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...
Diagnosis of Burn Images using Template Matching, k-Nearest Neighbor and Arti...
 
SKin lesion detection using ml approach.pptx
SKin lesion detection using ml approach.pptxSKin lesion detection using ml approach.pptx
SKin lesion detection using ml approach.pptx
 
IRJET- Skin Cancer Detection using Digital Image Processing
IRJET- Skin Cancer Detection using Digital Image ProcessingIRJET- Skin Cancer Detection using Digital Image Processing
IRJET- Skin Cancer Detection using Digital Image Processing
 
DETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNING
DETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNINGDETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNING
DETECTION AND CLASSIFICATION OF SKIN DISEASE USING DEEP LEARNING
 
IRJET- Cancer Detection Techniques - A Review
IRJET- Cancer Detection Techniques - A ReviewIRJET- Cancer Detection Techniques - A Review
IRJET- Cancer Detection Techniques - A Review
 
A Review of Super Resolution and Tumor Detection Techniques in Medical Imaging
A Review of Super Resolution and Tumor Detection Techniques in Medical ImagingA Review of Super Resolution and Tumor Detection Techniques in Medical Imaging
A Review of Super Resolution and Tumor Detection Techniques in Medical Imaging
 
Skin Cancer Detection and Classification
Skin Cancer Detection and ClassificationSkin Cancer Detection and Classification
Skin Cancer Detection and Classification
 
An Innovative Approach for Automated Skin Disease Identification through Adva...
An Innovative Approach for Automated Skin Disease Identification through Adva...An Innovative Approach for Automated Skin Disease Identification through Adva...
An Innovative Approach for Automated Skin Disease Identification through Adva...
 
Melanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep LearningMelanoma Skin Cancer Detection using Deep Learning
Melanoma Skin Cancer Detection using Deep Learning
 
A Survey On Melanoma: Skin Cancer Through Computerized Diagnosis
A Survey On Melanoma: Skin Cancer Through Computerized DiagnosisA Survey On Melanoma: Skin Cancer Through Computerized Diagnosis
A Survey On Melanoma: Skin Cancer Through Computerized Diagnosis
 
Comparing the performance of linear regression versus deep learning on detect...
Comparing the performance of linear regression versus deep learning on detect...Comparing the performance of linear regression versus deep learning on detect...
Comparing the performance of linear regression versus deep learning on detect...
 

Recently uploaded

Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510Vipesco
 
Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...
Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...
Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...Anamika Rawat
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In AhmedabadGENUINE ESCORT AGENCY
 
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...khalifaescort01
 
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...tanya dube
 
Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...
Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...
Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...Anamika Rawat
 
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Availableperfect solution
 
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...chandars293
 
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋TANUJA PANDEY
 
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls ServiceGENUINE ESCORT AGENCY
 
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Ishani Gupta
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...parulsinha
 
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls Jaipur Just Call 9521753030 Top Class Call Girl Service Available
Call Girls Jaipur Just Call 9521753030 Top Class Call Girl Service AvailableCall Girls Jaipur Just Call 9521753030 Top Class Call Girl Service Available
Call Girls Jaipur Just Call 9521753030 Top Class Call Girl Service AvailableJanvi Singh
 
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappMost Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappInaaya Sharma
 
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...khalifaescort01
 

Recently uploaded (20)

Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510
 
Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...
Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...
Andheri East ^ (Genuine) Escort Service Mumbai ₹7.5k Pick Up & Drop With Cash...
 
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
 
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
Call Girls Service Jaipur {9521753030 } ❤️VVIP BHAWNA Call Girl in Jaipur Raj...
 
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
Premium Bangalore Call Girls Jigani Dail 6378878445 Escort Service For Hot Ma...
 
Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...
Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...
Jogeshwari ! Call Girls Service Mumbai - 450+ Call Girl Cash Payment 90042684...
 
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service AvailableCall Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
Call Girls Rishikesh Just Call 9667172968 Top Class Call Girl Service Available
 
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
 
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
 
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
 
🌹Attapur⬅️ Vip Call Girls Hyderabad 📱9352852248 Book Well Trand Call Girls In...
🌹Attapur⬅️ Vip Call Girls Hyderabad 📱9352852248 Book Well Trand Call Girls In...🌹Attapur⬅️ Vip Call Girls Hyderabad 📱9352852248 Book Well Trand Call Girls In...
🌹Attapur⬅️ Vip Call Girls Hyderabad 📱9352852248 Book Well Trand Call Girls In...
 
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
 
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
Mumbai ] (Call Girls) in Mumbai 10k @ I'm VIP Independent Escorts Girls 98333...
 
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...Russian Call Girls Service  Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
Russian Call Girls Service Jaipur {8445551418} ❤️PALLAVI VIP Jaipur Call Gir...
 
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Jaipur Just Call 9521753030 Top Class Call Girl Service Available
Call Girls Jaipur Just Call 9521753030 Top Class Call Girl Service AvailableCall Girls Jaipur Just Call 9521753030 Top Class Call Girl Service Available
Call Girls Jaipur Just Call 9521753030 Top Class Call Girl Service Available
 
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappMost Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
 
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
💕SONAM KUMAR💕Premium Call Girls Jaipur ↘️9257276172 ↙️One Night Stand With Lo...
 

Vishnu Vardhan Project.pdf

  • 1. EXCEL ENGINEERING COLLEGE (AUTONOMOUS) A Project Report On “Skin Disease Classification from Image” SUBMITTED BY Sura Vishnu Vardhan Reddy (Regd.No:730921106108) LinkedIn Id: https://www.linkedin.com/in/sura-vishnu-vardhan-reddy-b68729269 GitHub Id: https://github.com/vishnu6643 SUBMITTED TO Pegasus Aerospace System Erode, Tamil Nadu, Pin – 638002
  • 2. All content following this page was uploaded by Sura Vishnu Vardhan Reddy on 25 April 2023. The user has requested enhancement of the downloaded file.
  • 3. Skin Disease Classification from Image Abstract Skin diseases are one of the most common types of health illnesses faced by the people for ages. The identification of skin disease mostly relies on the expertise of the doctors and skin biopsy results, which is a time-consuming process. An automated computer-based system for skin disease identification and classification through images is needed to improve the diagnostic accuracy as well as to handle the scarcity of human experts. Classification of skin disease from an image is a crucial task and highly depends on the features of the diseases considered in order to classify it correctly. Many skin diseases have highly similar visual characteristics, which add more challenges to the selection of useful features from the image. The accurate analysis of such diseases from the image would improve the diagnosis, accelerates the diagnostic time and leads to better and cost-effective treatment for patients. This paper presents the survey of different methods and techniques for skin disease classification namely; traditional or handcrafted feature-based as well as deep learning-based techniques. Keywords— Skin diseases, lesions, classification, deep learning, CNN, SVM. I – Introduction The largest organ of human body is “Skin”, an adult carry around 3.6 kg and 2 square meters of it. Skin acts as a waterproof, insulating shield, guarding the body against extremes of temperature, damaging UV lights, and harmful chemicals. With the rate of 10-12%, the population affected across India from skin disease is estimated at nearly 15.1 Crore in 2013 and which increases to 18.8 crores by 2015[38]. According to statistics provided by the World Health Organization [39] around 13 million melanoma skin cancer occurs globally each year, which shows skin diseases are growing very rapidly. There are many factors responsible for a disease to occur such as UV lights, pollution, poor immunity, and an unhealthy lifestyle. There are two major categories in which the lesions (spot) of skin disease are classified; benign and malignant skin lesions. Most of the skin lesions are benign in nature which is gentle and non- dangerous, whereas those which are dangerous for patient’s health and evil in nature are malignant skin lesions such as melanoma skin cancer. Diagnosis of skin disease from an image is a challenging problem as there exist many skin diseases. Researchers reported following problems during skin disease classification: 1) A disease may have many lesion types. 2) Many diseases may have a similar visual characteristic, which is often confusing for the dermatologist as well to identify the disease by visual inspection. 3) The varying skin colors and skin type (age) introduce more difficulty in computer-based diagnosis. Therefore, relevant feature selection for such diseases is very important in computer-based diagnosis in order to identify it correctly. The success of an automatic system rely on how accurately the system performs and does need image processing as well as machine learning tasks.
  • 4. There are many technologies available in the medical science for diagnosis of skin diseases. But the computer based automatic diagnosis is quite more useful for medical decision support and makes the entire process fast. For example, if such automated system is implemented in the healthcare centres, then patient does not have to suffer unnecessarily due to unavailability of experts. Further, it is non-invasive method of diagnosis therefore it is not painful. As per 2015 statistics of India [38], for approximately 121 crore of population there are about 6,000 dermatologists providing services in India. This means that for every 100,000 people, only 0.49 dermatologists are available in India as compared to 3.2 in many states of the US [38]. Due to recent advances in the technology large amount of medical data is produced daily and these data contains valuable and crucial information about the patients. The image based artificial intelligence is becoming more popular for certain diseases especially skin diseases. The diagnostic accuracy for computer-based system highly relies on the selection of relevant feature, classifier used and the availability of dataset as well as number of images on which the model has been trained. Now a day’s for pattern recognition and classification tasks the Convolution Neural Networks (CNN) are highly used. For better understanding of various works done by the researchers, we carry out a survey on different approaches used for the classification of the skin diseases. This paper is divided into four sections. Section II presents the background knowledge; type of images, and usage of traditional and deep learning based approaches for skin disease classification. Section III presents a survey on traditional or feature extraction based methods as well as CNN based approaches for skin disease identification and classification. Section IV presents the analysis and findings of traditional and CNN based methods and finally, Section V presents the conclusion
  • 5. II - BACKGROUND KNOWLEDGE This section is divided into three parts: Skin disease Image type, general process for skin disease classification using traditional techniques and using deep learning-based techniques. A. Clinical and Dermoscopic Images A clinical image is said to be the image of the patient's affected body part- such as an injury, skin lesion or it can be diagnostic image. The image is captured with normal or digital camera. This type of image may have different lightening, resolution and different angle depend on the type of camera used for capturing the image. For computer aided diagnosis, dermoscopic images are more useful. These images are produced using dermoscope [16], which is an instrument used by dermatologist to analyse the skin lesions. The dermoscope usually has uniform illumination and more contrast. As the device has bright illumination, the lesions are clear enough for visualization and recognition. Furthermore, processing of dermoscopic images become easy because the images have less noise. Fig.1 (a) illustrates the way to capture dermoscopic image, (b) presents the dermoscopic image and (c) shows the clinical image. (a) (d) (c) B. Skin Disease Classification using Traditional Approach; In the traditional approach, the handcrafted features are fed into the conventional classifier. Fig. 2 shows the general process of skin disease classification using the traditional approach. 1) Input Image Input Image Skin disease image databases for many diseases are available freely. However, some are fully or partially open source and others are commercially available. The input image can be of type dermoscopic or clinical based on the dataset used. Table I contains the information about the availability and details of various datasets. The widely used datasets are mentioned in the table. 2) Image pre-processing Image pre-processing is an important step and it is required because an image may contain many noises such as dermoscopic gel, air bubbles, and hairs.
  • 6. However, clinical images require more pre-processing as compared to dermoscopic because of parameters such as resolution, lightening condition, illumination, angle of image captured, size of skin area covered may vary and depends on the person who is capturing the image. These captured images could create problems in the subsequent stages. The skin hairs can be removed using different filters such as; median, average or Gaussian filter, morphological operations such as erosion and dilation, binary thresholding and software such as Dull Razor. For low contrast images; lesion or contrast enhancement algorithms are useful. The contrast enhancement with histogram equalization provides better visualization by uniform distribution of pixel intensity across the image and it is one of the most used techniques in literature. For salt and pepper kind of noise; a median or mean filter can give better noise removal results. Dataset Image No. Images Classes Open source Derm Net NZ image library. Clinical 20000+ - Partially Dermofit Image Library. Dermoscopic 1300 10 Yes ISBI-2016. Dermoscopic 1279 2 Yes ISBI – 2017 Dermoscopic 2750 2 Yes Ham10000 Dermoscopic 10015 7 Yes Stanford Hospital Clinical - - No Pecking Union medical college clinical database Dermoscopic 28000 - No IRMA Dataset Dermoscopic 747 2 Not Available PH2 Dermoscopic 200 2 Yes MED-NODE Clinical 170 2 Yes DermQuest Clinical 22500 - Yes Hospital Pedro Hispano, Matosinhos Dermoscopic 200 3 No SD-198 Clinical and Dermoscopic 6584 198 Yes
  • 7. 3) Image Segmentation Image segmentation extracts the disease affected area from the normal skin and can play very important role in skin disease detection [16]. Image segmentation can be carried out by three ways: 1) pixel-based, 2) edge-based, and 3) region-based segmentation. In pixel-based segmentation, each pixel of an image is identified to be the part of a homogeneous region or to an object. This can be done using binary thresholding or variant of it. The edge-based method detects and links edge pixels to form the bounding shape of the skin lesions. For example, Robert, Prewitt, Sobel and Canny operators, adaptive snake or gradient vector flow can be used. The Region-based methods rely on similar patterns in the intensity values within the neighborhood pixels and are based on continuity. The examples are region growing, merging and splitting, and Watershed algorithm. 4) Feature Extraction The most prominent features which are used to describe and identify skin diseases visually are its color and texture information. The color information plays an important role to distinguish one disease from another. These color features can be extracted using various techniques such as color histograms, color correlograms, color descriptors, GLCM. The texture information conveys the complex visual patterns of the skin lesions and spatially organized entities such as brightness, color, shape, and size. Image texture is basically a function of variation in pixel intensity. GLCM, local binary pattern, SIFT are some techniques used by researchers to get the texture information from the image. In addition to color and texture, each lesion may have different shapes and sizes based on the type of the disease and its severity. 5) Classification Classification is a supervised learning approach for machine learning task. It requires labelled dataset to map the data into specific groups or classes. There are various classification algorithms used to classify the skin disease images such as support vector machine, feed forward neural network, back propagation neural network, k-nearest neighbour, decision trees, etc.
  • 8. Deep Learning is a part of machine learning algorithm inspired by the structure and function of human brain commonly known as neural networks. Convolution Neural Networks (CNN) is a class of deep learning algorithm which is mostly used for analysing the visual contents such as images and videos. With the development of CNN, there has been dramatic improvement observed to solve many classification-based problems in medical image analysis... The basic process for CNN based skin disease image classification is presented. CNN based approach of skin disease Classification The process starts with data acquisition. Input to the CNN can be dermoscopic or clinical image, which can be pre-processed if needed; the next step is data augmentation. This results in enough training samples to train the model. Finally, the data is fed into the CNN which performs feature extraction and classification by its own. A CNN typically consists of convolution layer in which numbers of filters perform convolution operation on the image and generates feature maps. These feature maps are further down sampled by pooling layers. Finally, the fully connected layer has all the connection from previous layer and does the classification accordingly. Many researchers have used CNN for skin disease classification via transfer learning or fine-tuning of pre-trained models like Inception v3, ResNet, VGG architecture and many more. In transfer learning only weights are optimized if new classification layers have to be added. However, the weights of the original model remain as it is. In fine-tuning the parameters of a trained model must be altered very carefully while trying to validate that model for a dataset with a smaller number of images which does not belong to the train set. Moreover, we need to keep track of the hyper parameters of CNN otherwise the model may have problem of over-fitting. Over-fitting means model learned too well, i.e., it also learns irrelevant information and noise as well which may result in good training accuracy but poor testing accuracy. III. SURVEY OF LITERATURE This section presents a survey on both traditional and deep learning-based skin disease identification and classification approaches. Table II and III analyses all major works for both the aforementioned techniques; traditional/handcrafted feature-based techniques and deep learning techniques for classification of skin disease from images. C. Skin Disease Classification using Deep Learning based Approach
  • 9. Amarathunga et al. It have come up with expert system limited to classify three diseases. The system consists of two separate units namely; data processing and Image processing unit. The data processing unit was responsible for image acquisition, pre-processing for noise removal, segmentation and feature extraction from the skin disease images whereas data processing unit was employed for data mining task or classification. Five classification algorithms were tested by the authors namely; AdaBoost, BayesNet, J48, MLP and NaiveBayes. Out of these five the MLP classifier gave better results as compared to other classifiers. However, the data source of images and attributes considered for disease classification is not mentioned. Chakraborty et al. [3] have proposed a hybrid model using multi objective optimization algorithm NSGA-II and ANN for diagnosis of skin lesion being benign or malignant. The bagof-features approach is applied to classify the skin lesions and are generated using SIFT. SIFT algorithm identifies and locates the key points from the input image and generates the feature vector. Also, to handle large number of keypoints kmeans clustering algorithm was used to get representative keypoints where each cluster contains some representative keypoints and these are the generated bag-of- features. These features are then fed to the hybrid classifier where NSGA-II is used to train the ANN. Authors [3] also compared the model’s accuracy with ANN-PSO (ANN trained with particle swarm optimization) and ANN-CS (ANN trained with cukoo search.) The spatial and frequency domain-based technique is used by Chatterjee et al. It is for identification of skin lesion being benign or malignant. The malignant lesions are further classified into subcategories namely; melanocytic or epidermal skin lesions. The cross-correlation technique is used to extract regional features which are invariant to light intensity and illumination changes. Also, the cross spectrum-based frequency domain analysis has been used for retrieving more detailed features of skin lesions. For classification the SVM classifier was used with three non-linear kernels [10] out of which SVM with RBF kernel gave promising accuracy as compared to other kernels. A. Survey on Traditional Techniques for Skin Disease Image Classification
  • 10. TABLE II. SURVEY OF TRADITIONAL TECHNIQUES FOR SKIN DISEASE CLASSIFICATION
Amarathunga [2]. Diseases: eczema, impetigo, melanoma. Image type: clinical. No. of images: not reported. Pre-processing: yes. Segmentation: thresholding. Feature extraction: not reported. Classifier: MLP. Performance: accuracy 90%.
Chakraborty [3]. Diseases: BCC, SA. Image type: dermoscopic. No. of images: not reported. Pre-processing: no. Segmentation: thresholding. Feature extraction: SIFT. Classifier: NN-NSGA-II. Performance: accuracy 90.56%, precision 88.26%, recall 93.64%, F-measure 90.87%.
Manerkar [11]. Diseases: warts, benign & malignant skin cancer. Image type: clinical. No. of images: 45. Pre-processing: yes. Segmentation: C-means clustering and watershed algorithm. Feature extraction: GLCM and IQA. Classifier: SVM. Performance: accuracy 96-98%.
Zaqout [14]. Diseases: benign, malignant or suspicious lesions. Image type: dermoscopic. No. of images: 200. Pre-processing: yes. Segmentation: thresholding. Feature extraction: ABCD rule implementation using entropy, bifold, color and diameter. Classifier: TDS. Performance: accuracy 90%, sensitivity 85%, specificity 92.22%.
Chatterjee [10]. Diseases: melanoma, nevus, BCC, SK. Image type: dermoscopic. No. of images: 6,838. Pre-processing: no. Segmentation: none. Feature extraction: cross-correlation, cross spectrum. Classifier: SVM. Performance: accuracy 98.79%, sensitivity 99.01%, specificity 95.35%.
Arifin. Diseases: acne, eczema, psoriasis, tinea, vitiligo, scabies. Image type: clinical. No. of images: 704. Pre-processing: yes. Segmentation: thresholding. Feature extraction: GLCM. Classifier: feedforward backpropagation ANN. Performance: accuracy 94.04%.
Monisha [17]. Diseases: BCC, SA, lentigo simplex. Image type: dermoscopic. No. of images: not reported. Pre-processing: yes. Segmentation: GMM. Feature extraction: GLCM, DRLBP & GRLTP. Classifier: NSGA-II-PNN. Performance: not reported.
a. Disease: SK = seborrheic keratosis, BCC = basal cell carcinoma, SA = skin angioma. Classifier: TDS (Total Dermoscopic Score = asymmetry x 1.3 + border irregularity x 0.1 + color x 0.5 + diameter x 0.5), NSGA-II = Non-dominated Sorting Genetic Algorithm, PNN = probabilistic neural network. Feature extraction: GMM = Gaussian mixture model, GLCM = grey-level co-occurrence matrix, IQA = image quality assessment.
  • 11. B. Survey on Deep Learning-based Approaches for Skin Disease Image Classification
Esteva et al. [4] were the first to report that a convolutional neural network (CNN) image classifier can achieve performance similar to that of 21 board-certified dermatologists for identification of malignant lesions. A 3-way disease partition algorithm was designed to classify a given skin lesion as malignant, benign or non-neoplastic. A 9-way disease partition was also performed to classify a given lesion into one of nine categories. The state-of-the-art Inception v3 CNN architecture was used for skin lesion classification, and the authors concluded that a CNN can outperform human experts if it is trained with enough data. Zhang et al. [5] also used the Inception v3 architecture with a modified final layer to classify four diseases. The model was trained on two nearly similar datasets of dermoscopic images. The authors concluded that misclassification can occur due to the presence of multiple disease lesions in a single skin image. Sun et al. [24] proposed both handcrafted feature-based and CNN-based approaches for classification of clinical images. They trained four CNN architectures, namely CaffeNet, fine-tuned CaffeNet, VGGNet and fine-tuned VGGNet. Out of these four, the fine-tuned VGGNet gave quite good accuracy. The accuracy of VGGNet was similar to that of the handcrafted features, which were generated by seven different methods including SIFT, HOG, LBP and color histograms with an SVM classifier. However, the architecture and the use of a benchmark dataset play an important role in achieving good accuracy for skin disease image classification. Gessert et al. introduced a patch-based method (sketched at the end of this subsection) to obtain fine-grained differences between various skin lesions from high-resolution images. The high-resolution images are divided into 5, 9 or 16 crops or patches, and these image patches are fed to standard CNN architectures. Three architectures were used by the authors, namely Inception v3, DenseNet and SE-ResNeXt50, for prediction of disease from high-resolution image patches. Rehman et al. [22] proposed a CNN architecture with 16 different filters of 7x7 kernel size and pooling layers for down-sampling. The proposed model was trained on malignant and benign categories of diseases, namely melanoma, seborrheic keratosis and nevus. The RGB channels of the segmented image are normalized to zero mean and unit variance. This normalized matrix is fed to the CNN for feature extraction; the fully connected part consists of a 3-layer ANN classifier which classifies the skin lesion as benign or malignant. Kulhalli et al. [7] proposed 5-stage, 3-stage and 2-stage hierarchical approaches to classify seven diseases using the Inception v3 CNN architecture. The authors addressed the class imbalance problem by using image augmentation to balance the category classes. The 5-stage classifier gave better results than the 2- and 3-stage hierarchical classifiers. Further, the authors suggested that the model could be fine-tuned further and that ensemble-based methods might help to improve the classification performance.
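To make the patch-based idea concrete, a small sketch that tiles a high-resolution image into an n x n grid of crops, each of which could then be fed to a standard CNN; the grid tiling and the idea of averaging per-patch predictions are general assumptions, not the exact procedure of Gessert et al.:

# Sketch: tile a high-resolution image into an n x n grid of equal crops,
# e.g. n=3 gives the "9 crops" setting mentioned above. Averaging the
# per-patch predictions afterwards is one simple way to combine them.
import numpy as np

def grid_crops(image, n):
    """Split an (H, W, C) image into n*n non-overlapping crops."""
    h, w = image.shape[0] // n, image.shape[1] // n
    return np.stack([image[i*h:(i+1)*h, j*w:(j+1)*w]
                     for i in range(n) for j in range(n)])

image = np.zeros((600, 450, 3), dtype=np.uint8)  # HAM10000-sized dummy image
patches = grid_crops(image, 3)
print(patches.shape)  # (9, 200, 150, 3) -- nine patches for the CNN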
  • 12. IV. ANALYSIS & FINDINGS
Both traditional and CNN-based approaches are useful for the classification of skin diseases. The traditional methods require appropriate feature extraction and segmentation methods for skin diseases. Further, it is important to identify the relevant features and discard irrelevant ones, as the classification often depends on the features selected; if irrelevant features are selected, misclassification may result. However, contrary to CNNs, the traditional approach does not require a large dataset. A CNN can learn the features of the skin diseases automatically. It selects its filters intelligently, in contrast to the manual selection of filters in the traditional approach for extracting the relevant features from the images. Therefore, no separate feature extraction method is needed in a CNN-based approach. Pre-trained models can be used to classify skin diseases, but these models are heavy in terms of: 1) the number of parameters, 2) the number of layers, 3) the selection and fine-tuning of an appropriate pre-trained model, and 4) the retraining required, since such models have not been trained on skin disease images. Alternatively, a CNN can be designed from scratch. The following criteria are important whenever a CNN architecture is designed to classify skin diseases:
1) Dataset: The availability of a large dataset is very important, as a CNN learns much more efficiently when it is trained with enough data. Large datasets of clinical images are available at [31], [32]; for dermoscopic images, large datasets are published by ISIC [27].
2) Hyperparameters of the CNN: The network structure is determined by the hyperparameters, which must be set before training the CNN. The parameters which define the network structure and training regime include the number of hidden layers, dropout, kernel size, number of kernels, batch size, number of epochs, activation function, learning rate, etc. (see the sketch after this list).
3) Computational power: The main challenge of training a CNN is the availability of computational resources. There are thousands of trainable parameters in a CNN; therefore, it is computationally costly compared to the traditional way of classifying skin disease. A GPU is practically a must for training a CNN. The training time is also longer and depends on the size of the dataset used to train the model.
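As an illustration of point 2, a hedged sketch of where those hyperparameters typically surface in a Keras workflow; every value below is a placeholder to be tuned, not a recommendation:

# Sketch: where the hyperparameters listed above appear in a Keras workflow.
# Every value here is a placeholder, not a recommendation.
import tensorflow as tf

hp = dict(kernel_size=(3, 3), n_kernels=32, dropout=0.25,
          activation='relu', learning_rate=1e-4, batch_size=16, epochs=50)

model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(hp['n_kernels'], hp['kernel_size'],
                           activation=hp['activation'],
                           input_shape=(100, 125, 3)),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Dropout(hp['dropout']),        # regularisation
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(7, activation='softmax'),
])
model.compile(tf.keras.optimizers.Adam(hp['learning_rate']),
              loss='categorical_crossentropy', metrics=['accuracy'])
# model.fit(..., batch_size=hp['batch_size'], epochs=hp['epochs'])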
  • 13. TABLE III. SURVEY OF DEEP LEARNING BASED SKIN DISEASE CLASSIFICATION
Sun et al. [24]. Disease classes: wide variety. Image type: clinical. No. of images: 6,584 (SD-198 [34]) and 5,619 (SD-128 [24]). CNN architecture: fine-tuned VGG19. Performance: accuracy 50.27%.
Esteva et al. [4]. Disease classes: malignant and benign skin lesions. Image type: clinical (129,450) and dermoscopic (3,374). Datasets: ISIC [27], Edinburgh Dermofit Library [33], Stanford Hospital [4]. CNN architecture: Inception V3 with PA (partition algorithm). Performance: accuracy 72.1 ± 0.9%.
Zhang et al. [5]. Disease classes: melanocytic nevus, SK, BCC, psoriasis. Image type: dermoscopic. No. of images: 1,067 (Dataset A [5]) and 522 (Dataset B [5]). CNN architecture: Inception v3. Performance: accuracy 87.25 ± 2.24% (Dataset A), 86.63 ± 5.78% (Dataset B).
Rehman et al. [22]. Disease classes: malignant and benign skin lesions. Image type: dermoscopic. No. of images: 379 (ISIC-2016 [27]). Additional: segmentation using generalized Gaussian distribution. CNN architecture: CNN with conv: 16 filters of 7x7; pooling layer: 16; FC: 100x50x5. Performance: accuracy 98.32%, sensitivity 98.15%, specificity 98.41%.
Brinker et al. [6]. Disease classes: melanoma and nevi. Image type: clinical and dermoscopic. No. of images: 12,378 (HAM10000 [27]). CNN architecture: ResNet50. Performance: mean sensitivity 89.4%, mean specificity 64.4%, ROC 0.769.
Kulhalli et al. [7]. Disease classes: melanoma, nevi, SK, akiec, BCC, DF, BKL. Image type: dermoscopic. No. of images: 10,015 (HAM10000 [27]). CNN architecture: InceptionV3. Performance: normalized F1 score 0.93.
Khan et al. [8]. Disease classes: melanoma vs. others. Image type: dermoscopic. No. of images: 1,279 (ISBI-16 [27]), 2,790 (ISBI-17 [27]), 10,000 (HAM10000 [27]). Additional: lesion enhancement. CNN architecture: ResNet50 and ResNet101. Performance: accuracy 90.20% (ISBI 2016), 95.60% (ISBI 2017), 89.8% (HAM10000).
  • 14. SKIN DISEASE CLASSIFICATION
This project applies deep learning to predict various skin diseases. The main objective is to achieve maximum accuracy in skin disease prediction. Deep learning techniques help detect skin disease at an initial stage, and feature extraction plays a key role in the classification of skin diseases. Using deep learning algorithms reduces the need for human labor such as manual feature extraction and data reconstruction for classification. Moreover, Explainable AI is used to interpret the decisions made by our model.

ABOUT THE DATASET
The HAM10000 ("Human Against Machine with 10000 training images") dataset is a large collection of multi-source dermatoscopic images of pigmented lesions. The dermatoscopic images were collected from different populations and acquired and stored by different modalities. The final dataset consists of 10,015 dermatoscopic images. It has 7 different classes of skin lesions, listed below:
• Melanocytic nevi
• Melanoma
• Benign keratosis-like lesions
• Basal cell carcinoma
• Actinic keratoses
• Vascular lesions
• Dermatofibroma

Importing Libraries

#Importing required libraries
import matplotlib.pyplot as plt
from PIL import Image
import seaborn as sns
import numpy as np
import pandas as pd
import os
from tensorflow.keras.utils import to_categorical
from glob import glob

✓ HAM10000_metadata.csv is the main csv file that includes the data for all training images; its features are:
1. lesion_id
2. image_id
3. dx
4. dx_type
5. age
6. sex
7. localization
  • 15. Reading the Data from the Dataset

# Reading the metadata from HAM10000_metadata.csv
df = pd.read_csv('../input/skin-cancer-mnist-ham10000/HAM10000_metadata.csv')
df.head()
df.dtypes

lesion_id        object
image_id         object
dx               object
dx_type          object
age             float64
sex              object
localization     object
dtype: object

     lesion_id      image_id   dx dx_type   age   sex localization
0  HAM_0000118  ISIC_0027419  bkl   histo  80.0  male        scalp
1  HAM_0000118  ISIC_0025030  bkl   histo  80.0  male        scalp
2  HAM_0002730  ISIC_0026769  bkl   histo  80.0  male        scalp
3  HAM_0002730  ISIC_0025661  bkl   histo  80.0  male        scalp
4  HAM_0001466  ISIC_0031633  bkl   histo  75.0  male          ear
  • 16. df.describe()

              age
count  9958.000000
mean     51.863828
std      16.968614
min       0.000000
25%      40.000000
50%      50.000000
75%      65.000000
max      85.000000

Data Cleaning
Removing NULL values and performing visualizations to gain insights into the dataset: univariate and bivariate analysis.

df.isnull().sum()

lesion_id        0
image_id         0
dx               0
dx_type          0
age             57
sex              0
localization     0
dtype: int64

The feature 'age' contains 57 null records, so we replace them with the mean of 'age', since dropping 57 records would lead to loss of data.
  • 17. df['age'].fillna(int(df['age'].mean()),inplace=True) df.isnull().sum() lesion_id 0 image_id 0 dx 0 dx_type 0 age 0 sex 0 localization 0 dtype: int64 lesion_type_dict = { 'nv': 'Melanocytic nevi', 'mel': 'Melanoma', 'bkl': 'Benign keratosis-like lesions ', 'bcc': 'Basal cell carcinoma', 'akiec': 'Actinic keratoses', 'vasc': 'Vascular lesions', 'df': 'Dermatofibroma' } base_skin_dir = '../input/skin-cancer-mnist-ham10000' # Merge images from both folders into one dictionary imageid_path_dict = {os.path.splitext(os.path.basename(x))[0]: x for x in glob(os.path.join(base_skin_dir, '*', '*.jpg'))} df['path'] = df['image_id'].map(imageid_path_dict.get) df['cell_type'] = df['dx'].map(lesion_type_dict.get) df['cell_type_idx'] = pd.Categorical(df['cell_type']).codes df.head()
  • 18. df.head() now also contains the image path, the readable cell type and its categorical code:

     lesion_id      image_id   dx dx_type   age   sex localization  path                                                cell_type                      cell_type_idx
0  HAM_0000118  ISIC_0027419  bkl   histo  80.0  male        scalp  ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions              2
1  HAM_0000118  ISIC_0025030  bkl   histo  80.0  male        scalp  ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions              2
2  HAM_0002730  ISIC_0026769  bkl   histo  80.0  male        scalp  ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions              2
3  HAM_0002730  ISIC_0025661  bkl   histo  80.0  male        scalp  ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions              2
4  HAM_0001466  ISIC_0031633  bkl   histo  75.0  male          ear  ../input/skin-cancer-mnist-ham10000/ham10000_i...  Benign keratosis-like lesions              2

Image Preprocessing

df['image'] = df['path'].map(lambda x: np.asarray(Image.open(x).resize((125,100))))

n_samples = 5
fig, m_axs = plt.subplots(7, n_samples, figsize=(4*n_samples, 3*7))
for n_axs, (type_name, type_rows) in zip(m_axs, df.sort_values(['cell_type']).groupby('cell_type')):
    n_axs[0].set_title(type_name)
    for c_ax, (_, c_row) in zip(n_axs, type_rows.sample(n_samples, random_state=2018).iterrows()):
        c_ax.imshow(c_row['image'])
        c_ax.axis('off')
fig.savefig('category_samples.png', dpi=300)

The images are resized because the original dimensions of 450 x 600 x 3 take a long time to process in neural networks.
  • 19. # See the image size distribution - should just return one row (all images are uniform) df['image'].map(lambda x: x.shape).value_counts() (100, 125, 3) 10015 Name: image, dtype: int64 Exploratory Data Analysis Exploratory data analysis can help detect obvious errors, identify outliers in datasets, understand relationships, unearth important factors, find patterns within data, and provide new insights.
  • 20.
df = df[df['age'] != 0]
df = df[df['sex'] != 'unknown']

plt.figure(figsize=(20,10))
plt.subplots_adjust(left=0.125, bottom=1, right=0.9, top=2, hspace=0.2)

plt.subplot(2,4,1)
plt.title("AGE", fontsize=15)
plt.ylabel("Count")
df['age'].value_counts().plot.bar()

plt.subplot(2,4,2)
plt.title("GENDER", fontsize=15)
plt.ylabel("Count")
df['sex'].value_counts().plot.bar()

plt.subplot(2,4,3)
plt.title("localization", fontsize=15)
plt.ylabel("Count")
plt.xticks(rotation=45)
df['localization'].value_counts().plot.bar()

plt.subplot(2,4,4)
plt.title("CELL TYPE", fontsize=15)
plt.ylabel("Count")
df['cell_type'].value_counts().plot.bar()

<AxesSubplot:title={'center':'CELL TYPE'}, ylabel='Count'>

1. Skin diseases are found to peak in people aged around 45 and are least common at ages 10 and below. We also observe that the probability of having a skin disease increases with age.
2. Skin diseases are more prominent in men than in women and other genders.
3. Skin diseases are most visible on the "back" of the body and least on the "acral surfaces" (such as limbs, fingers, or ears).
4. The most frequently found disease is Melanocytic nevi, while the least frequent is Dermatofibroma.
  • 21.
plt.figure(figsize=(15,10))
plt.subplot(1,2,1)
df['dx'].value_counts().plot.pie(autopct="%1.1f%%")
plt.subplot(1,2,2)
df['dx_type'].value_counts().plot.pie(autopct="%1.1f%%")
plt.show()

1. Type of skin disease:
• nv: Melanocytic nevi - 69.9%
• mel: Melanoma - 11.1%
• bkl: Benign keratosis-like lesions - 11.0%
• bcc: Basal cell carcinoma - 5.1%
• akiec: Actinic keratoses - 3.3%
• vasc: Vascular lesions - 1.4%
• df: Dermatofibroma - 1.1%
2. How the skin disease was discovered:
• histo - histopathology - 53.3%
• follow_up - follow-up examination - 37.0%
• consensus - expert consensus - 9.0%
• confocal - confirmation by in-vivo confocal microscopy - 0.7%
  • 22. BIVARIATE ANALYSIS

plt.figure(figsize=(25,10))
plt.title('LOCALIZATION VS GENDER', fontsize=15)
sns.countplot(y='localization', hue='sex', data=df)

<AxesSubplot:title={'center':'LOCALIZATION VS GENDER'}, xlabel='count', ylabel='localization'>

• The back is the most affected area, and infections there are more prominent in men.
• Infections on the lower extremity of the body are more visible in women.
• Some unknown regions also show infections, visible in men, women and other genders.
• The acral surfaces show the fewest infection cases, and only in men; other gender groups do not show this kind of infection.

plt.figure(figsize=(25,10))
plt.title('LOCALIZATION VS CELL TYPE', fontsize=15)
sns.countplot(y='localization', hue='cell_type', data=df)

<AxesSubplot:title={'center':'LOCALIZATION VS CELL TYPE'}, xlabel='count', ylabel='localization'>
  • 23.
• The face is affected the most by Benign keratosis-like lesions.
• Body parts other than the face are affected the most by Melanocytic nevi.

plt.figure(figsize=(25,10))
plt.subplot(131)
plt.title('AGE VS CELL TYPE', fontsize=15)
sns.countplot(y='age', hue='cell_type', data=df)
plt.subplot(132)
plt.title('GENDER VS CELL TYPE', fontsize=15)
sns.countplot(y='sex', hue='cell_type', data=df)

<AxesSubplot:title={'center':'GENDER VS CELL TYPE'}, xlabel='count', ylabel='sex'>

1. The age group between 0 and 75 years is affected the most by Melanocytic nevi, whereas people aged 80-90 are affected more by Benign keratosis-like lesions.
2. All gender groups are affected the most by Melanocytic nevi.

from sklearn.model_selection import train_test_split
import keras
from keras.models import Sequential
from keras.layers import Dense, Dropout
import tensorflow as tf
from sklearn.preprocessing import StandardScaler
  • 24. ANN

features = df.drop(columns=['cell_type_idx'], axis=1)
target = df['cell_type_idx']
features.head()

features.head() shows the metadata columns (lesion_id, image_id, dx, dx_type, age, sex, localization, path, cell_type) together with the 'image' column holding the resized pixel arrays, e.g. row 0: HAM_0000118, ISIC_0027419, bkl, histo, 80.0, male, scalp, ../input/skin-cancer-mnist-ham10000/ham10000_i..., Benign keratosis-like lesions, [[[189, 152, 194], [192, 156, 198], [191, 154, ...
  • 25.
x_train_o, x_test_o, y_train_o, y_test_o = train_test_split(features, target, test_size=0.25, random_state=666)
tf.unique(x_train_o.cell_type.values)

Unique(y=<tf.Tensor: shape=(7,), dtype=string, numpy=array([b'Melanocytic nevi', b'Basal cell carcinoma', b'Melanoma', b'Actinic keratoses', b'Vascular lesions', b'Benign keratosis-like lesions ', b'Dermatofibroma'], dtype=object)>, idx=<tf.Tensor: shape=(7440,), dtype=int32, numpy=array([0, 1, 2, ..., 1, 0, 0], dtype=int32)>)

x_train = np.asarray(x_train_o['image'].tolist())
x_test = np.asarray(x_test_o['image'].tolist())

x_train_mean = np.mean(x_train)
x_train_std = np.std(x_train)
x_test_mean = np.mean(x_test)
x_test_std = np.std(x_test)

x_train = (x_train - x_train_mean)/x_train_std
x_test = (x_test - x_test_mean)/x_test_std

# Perform one-hot encoding on the labels
y_train = to_categorical(y_train_o, num_classes=7)
y_test = to_categorical(y_test_o, num_classes=7)
y_test

array([[0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 1., 0., 0.],
       ...,
       [0., 0., 0., ..., 1., 0., 0.],
       [0., 0., 0., ..., 0., 1., 0.],
       [0., 0., 0., ..., 1., 0., 0.]], dtype=float32)

x_train, x_validate, y_train, y_validate = train_test_split(x_train, y_train, test_size=0.1, random_state=999)

# Reshape images in 3 dimensions (height = 100, width = 125, channels = 3)
x_train = x_train.reshape(x_train.shape[0], *(100, 125, 3))
x_test = x_test.reshape(x_test.shape[0], *(100, 125, 3))
x_validate = x_validate.reshape(x_validate.shape[0], *(100, 125, 3))
  • 26.
# Flatten each 100 x 125 x 3 image into a single 37,500-dimensional vector for the dense network
x_train = x_train.reshape(6696, 125*100*3)
x_test = x_test.reshape(2481, 125*100*3)
print(x_train.shape)
print(x_test.shape)

(6696, 37500)
(2481, 37500)

# define the keras model
model = Sequential()
model.add(Dense(units=64, kernel_initializer='uniform', activation='relu', input_dim=37500))
model.add(Dense(units=64, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=64, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=64, kernel_initializer='uniform', activation='relu'))
model.add(Dense(units=7, kernel_initializer='uniform', activation='softmax'))

optimizer = tf.keras.optimizers.Adam(learning_rate=0.00075, beta_1=0.9, beta_2=0.999, epsilon=1e-8)

# compile the keras model
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

# fit the keras model on the dataset
history = model.fit(x_train, y_train, batch_size=10, epochs=50)

accuracy = model.evaluate(x_test, y_test, verbose=1)[1]
print("Test: accuracy = ", accuracy*100, "%")

(Per-epoch training output omitted.)

from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
  • 27. CNN
A CNN is well suited to image classification. It improves on the dense network above thanks to two properties: parameter sharing and dimensionality reduction. Because of parameter sharing in a CNN, the number of parameters is reduced and thus the computation decreases, as the comparison below shows.
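A quick way to see the parameter-sharing claim is to compare a single convolutional layer with a dense layer on the same input; the sizes below match this project's 100 x 125 x 3 images, and the 64-unit dense layer is an arbitrary comparison point:

# Sketch: parameter sharing in a convolution vs a fully connected layer
# on the same 100 x 125 x 3 input. The dense width of 64 is illustrative.
import tensorflow as tf

conv = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), input_shape=(100, 125, 3))])
dense = tf.keras.Sequential([
    tf.keras.layers.Flatten(input_shape=(100, 125, 3)),
    tf.keras.layers.Dense(64)])

print(conv.count_params())   # 896     (3*3*3 weights per filter, shared over the image)
print(dense.count_params())  # 2400064 (37500 inputs * 64 units + 64 biases)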
  • 28. Since the data is limited, we apply data augmentation using ImageDataGenerator, which generates augmented images in real time while the model is training; random transformations can be applied to each training image as it is passed to the model (the augmentation itself is set up just before training, below). The CNN model is a repeated stack of the following layer types:
1. Convolutional
2. Pooling
3. Dropout
4. Flatten
5. Dense

from tensorflow.keras.layers import Flatten, Dense, Dropout, BatchNormalization, Conv2D, MaxPool2D
from keras.optimizers import Adam
from keras.callbacks import ReduceLROnPlateau
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Set the CNN model
# The architecture is In -> [[Conv2D->relu]*2 -> MaxPool2D -> Dropout]*3 -> Flatten -> Dense*2 -> Dropout -> Out
input_shape = (100, 125, 3)
num_classes = 7

model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='Same', input_shape=input_shape))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='Same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.16))

model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='Same'))
model.add(Conv2D(32, kernel_size=(3, 3), activation='relu', padding='Same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.20))

model.add(Conv2D(64, (3, 3), activation='relu', padding='Same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='Same'))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())
model.add(Dense(256, activation='relu'))
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.4))
model.add(Dense(num_classes, activation='softmax'))

model.summary()
  • 29. Model: "sequential_1" _________________________________________________________________ Layer (type) Output Shape Param # ================================================================= conv2d (Conv2D) (None, 100, 125, 32) 896 _________________________________________________________________ conv2d_1 (Conv2D) (None, 100, 125, 32) 9248 _________________________________________________________________ max_pooling2d (MaxPooling2D) (None, 50, 62, 32) 0 _________________________________________________________________ dropout (Dropout) (None, 50, 62, 32) 0 _________________________________________________________________ conv2d_2 (Conv2D) (None, 50, 62, 32) 9248 _________________________________________________________________ conv2d_3 (Conv2D) (None, 50, 62, 32) 9248 _________________________________________________________________ max_pooling2d_1 (MaxPooling2 (None, 25, 31, 32) 0 _________________________________________________________________ dropout_1 (Dropout) (None, 25, 31, 32) 0 _________________________________________________________________ conv2d_4 (Conv2D) (None, 25, 31, 64) 18496 _________________________________________________________________ conv2d_5 (Conv2D) (None, 25, 31, 64) 36928 _________________________________________________________________ max_pooling2d_2 (MaxPooling2 (None, 12, 15, 64) 0 _________________________________________________________________ dropout_2 (Dropout) (None, 12, 15, 64) 0 _________________________________________________________________ flatten (Flatten) (None, 11520) 0 _________________________________________________________________ dense_5 (Dense) (None, 256) 2949376 _________________________________________________________________ dense_6 (Dense) (None, 128) 32896 _________________________________________________________________ dropout_3 (Dropout) (None, 128) 0 _________________________________________________________________ dense_7 (Dense) (None, 7) 903 ================================================================= Total params: 3,067,239 Trainable params: 3,067,239 Non-trainable params: 0 # Define the optimizer optimizer = Adam(lr=0.0001, beta_1=0.9, beta_2=0.999, epsilon=None, decay=0.0, amsgrad=False) # Compile the model model.compile(optimizer = optimizer , loss = "categorical_crossentropy", metrics=["accuracy"]) # Set a learning rate annealer learning_rate_reduction = ReduceLROnPlateau(monitor='val_accuracy', patience=4, verbose=1, factor=0.5, min_lr=0.00001)
  • 30.
x_train, x_validate, y_train, y_validate = train_test_split(x_train, y_train, test_size=0.1, random_state=999)

# Reshape images in 3 dimensions (height = 100, width = 125, channels = 3)
x_train = x_train.reshape(x_train.shape[0], *(100, 125, 3))
x_test = x_test.reshape(x_test.shape[0], *(100, 125, 3))
x_validate = x_validate.reshape(x_validate.shape[0], *(100, 125, 3))

# With data augmentation to prevent overfitting
datagen = ImageDataGenerator(
    featurewise_center=False,             # set input mean to 0 over the dataset
    samplewise_center=False,              # set each sample mean to 0
    featurewise_std_normalization=False,  # divide inputs by std of the dataset
    samplewise_std_normalization=False,   # divide each input by its std
    zca_whitening=False,                  # apply ZCA whitening
    rotation_range=10,                    # randomly rotate images by up to 10 degrees
    zoom_range=0.1,                       # randomly zoom images
    width_shift_range=0.12,               # randomly shift images horizontally (fraction of total width)
    height_shift_range=0.12,              # randomly shift images vertically (fraction of total height)
    horizontal_flip=True,                 # randomly flip images horizontally
    vertical_flip=True)                   # randomly flip images vertically

datagen.fit(x_train)

# Fit the model
epochs = 60
batch_size = 16
history = model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                              epochs=epochs,
                              validation_data=(x_validate, y_validate),
                              verbose=1,
                              steps_per_epoch=x_train.shape[0] // batch_size,
                              callbacks=[learning_rate_reduction])

from tensorflow.keras.metrics import Recall
from sklearn.metrics import classification_report, confusion_matrix
from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)
  • 31. (Output: CNN architecture diagram rendered by plot_model.)
  • 32.
loss, accuracy = model.evaluate(x_test, y_test, verbose=1)
loss_v, accuracy_v = model.evaluate(x_validate, y_validate, verbose=1)
print("Validation: accuracy = %f ; loss_v = %f" % (accuracy_v, loss_v))
print("Test: accuracy = %f ; loss = %f" % (accuracy, loss))
model.save("model.h5")

78/78 [==============================] - 1s 8ms/step - loss: 0.6185 - accuracy: 0.7686
21/21 [==============================] - 0s 7ms/step - loss: 0.6881 - accuracy: 0.7433
Validation: accuracy = 0.743284 ; loss_v = 0.688070
Test: accuracy = 0.768642 ; loss = 0.618472

import itertools

# Function to plot confusion matrix
def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]

    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, cm[i, j],
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.tight_layout()
    plt.ylabel('True label')
    plt.xlabel('Predicted label')

# Predict the values from the validation dataset
Y_pred = model.predict(x_validate)
# Convert prediction probabilities to class labels
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot validation labels back to class labels
Y_true = np.argmax(y_validate, axis=1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
  • 33.
# Predict the values from the test dataset
Y_pred = model.predict(x_test)
# Convert prediction probabilities to class labels
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot test labels back to class labels
Y_true = np.argmax(y_test, axis=1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes=range(7))
  • 34.
label_frac_error = 1 - np.diag(confusion_mtx) / np.sum(confusion_mtx, axis=1)
plt.bar(np.arange(7), label_frac_error)
plt.xlabel('True Label')
plt.ylabel('Fraction classified incorrectly')

Text(0, 0.5, 'Fraction classified incorrectly')

# # Function to plot the model's training/validation loss and accuracy
# def plot_model_history(model_history):
#     fig, axs = plt.subplots(1, 2, figsize=(15, 5))
#     # summarize history for accuracy
#     axs[0].plot(range(1, len(model_history.history['accuracy']) + 1), model_history.history['accuracy'])
#     axs[0].plot(range(1, len(model_history.history['val_accuracy']) + 1), model_history.history['val_accuracy'])
#     axs[0].set_title('Model Accuracy')
#     axs[0].set_ylabel('Accuracy')
#     axs[0].set_xlabel('Epoch')
#     axs[0].set_xticks(np.arange(1, len(model_history.history['accuracy']) + 1), len(model_history.history['accuracy']) / 10)
#     axs[0].legend(['train', 'val'], loc='best')
#     # summarize history for loss
#     axs[1].plot(range(1, len(model_history.history['loss']) + 1), model_history.history['loss'])
#     axs[1].plot(range(1, len(model_history.history['val_loss']) + 1), model_history.history['val_loss'])
#     axs[1].set_title('Model Loss')
#     axs[1].set_ylabel('Loss')
#     axs[1].set_xlabel('Epoch')
#     axs[1].set_xticks(np.arange(1, len(model_history.history['loss']) + 1), len(model_history.history['loss']) / 10)
#     axs[1].legend(['train', 'val'], loc='best')
#     plt.show()
# plot_model_history(history)
  • 35. Transfer Learning
Why MobileNet? MobileNet significantly reduces the number of parameters compared to a network with regular convolutions of the same depth, resulting in lightweight deep neural networks (see the parameter comparison after this slide's code). The two layer types used here in addition to those already used in the CNN are:
1. Batch Normalization
2. Zero Padding

import tensorflow
from tensorflow.keras.models import Model
from tensorflow.keras.callbacks import EarlyStopping, ReduceLROnPlateau, ModelCheckpoint

# Reload the images at 125 x 100 so the arrays match the (100, 125, 3)
# input shape the MobileNet is converted to below
df['image'] = df['path'].map(lambda x: np.asarray(Image.open(x).resize((125,100))))

features = df.drop(columns=['cell_type_idx'], axis=1)
target = df['cell_type_idx']

x_train_o, x_test_o, y_train_o, y_test_o = train_test_split(features, target, test_size=0.25, random_state=666)
tf.unique(x_train_o.cell_type.values)

Unique(y=<tf.Tensor: shape=(7,), dtype=string, numpy=array([b'Melanocytic nevi', b'Basal cell carcinoma', b'Melanoma', b'Vascular lesions', b'Benign keratosis-like lesions ', b'Actinic keratoses', b'Dermatofibroma'], dtype=object)>, idx=<tf.Tensor: shape=(7511,), dtype=int32, numpy=array([0, 1, 0, ..., 1, 0, 0], dtype=int32)>)

x_train = np.asarray(x_train_o['image'].tolist())
x_test = np.asarray(x_test_o['image'].tolist())

x_train_mean = np.mean(x_train)
x_train_std = np.std(x_train)
x_test_mean = np.mean(x_test)
x_test_std = np.std(x_test)

x_train = (x_train - x_train_mean)/x_train_std
x_test = (x_test - x_test_mean)/x_test_std

# Perform one-hot encoding on the labels
y_train = to_categorical(y_train_o, num_classes=7)
y_test = to_categorical(y_test_o, num_classes=7)
y_test

Due to the limited dataset, a pretrained MobileNet model is used.
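The parameter saving mentioned above comes from MobileNet's depthwise separable convolutions; a minimal sketch comparing a standard convolution with its separable counterpart (the layer sizes are illustrative, chosen only to make the counts easy to check):

# Sketch: why MobileNet is light -- a depthwise separable convolution uses
# far fewer parameters than a standard convolution of the same shape.
import tensorflow as tf

standard = tf.keras.Sequential([
    tf.keras.layers.Conv2D(64, (3, 3), input_shape=(100, 125, 32))])
separable = tf.keras.Sequential([
    tf.keras.layers.SeparableConv2D(64, (3, 3), input_shape=(100, 125, 32))])

print(standard.count_params())   # 18496 = 3*3*32*64 + 64 biases
print(separable.count_params())  # 2400  = 3*3*32 (depthwise) + 32*64 (pointwise) + 64 biases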
  • 36.
x_train, x_validate, y_train, y_validate = train_test_split(x_train, y_train, test_size=0.1, random_state=999)

# Reshape images in 3 dimensions (height = 100, width = 125, channels = 3),
# matching the input shape given to change_model below
x_train = x_train.reshape(x_train.shape[0], *(100, 125, 3))
x_test = x_test.reshape(x_test.shape[0], *(100, 125, 3))
x_validate = x_validate.reshape(x_validate.shape[0], *(100, 125, 3))
print(x_train.shape)

# create a copy of a mobilenet model
mobile = tensorflow.keras.applications.mobilenet.MobileNet()
mobile.summary()

def change_model(model, new_input_shape=(None, 40, 40, 3), custom_objects=None):
    # replace input shape of first layer (relies on the private _layers attribute)
    config = model.layers[0].get_config()
    config['batch_input_shape'] = new_input_shape
    model._layers[0] = model.layers[0].from_config(config)

    # rebuild model architecture by exporting and importing via json
    new_model = tensorflow.keras.models.model_from_json(model.to_json(), custom_objects=custom_objects)

    # copy weights from old model to new one
    for layer in new_model._layers:
        try:
            layer.set_weights(model.get_layer(name=layer.name).get_weights())
            print("Loaded layer {}".format(layer.name))
        except Exception:
            print("Could not transfer weights for layer {}".format(layer.name))
    return new_model

new_model = change_model(mobile, new_input_shape=[None] + [100, 125, 3])
new_model.summary()
  • 37.
# CREATE THE MODEL ARCHITECTURE
# Exclude the last 5 layers of the above model.
# This will include all layers up to and including global_average_pooling2d_1
x = new_model.layers[-6].output

# Create a new dense layer for predictions
# 7 corresponds to the number of classes
x = Dropout(0.25)(x)
predictions = Dense(7, activation='softmax')(x)

# inputs=new_model.input selects the input layer, outputs=predictions refers to the
# dense layer we created above.
model = Model(inputs=new_model.input, outputs=predictions)

# We need to choose how many layers we actually want to be trained.
# Here we are freezing the weights of all layers except the
# last 23 layers in the new model.
# The last 23 layers of the model will be trained.
for layer in model.layers[:-23]:
    layer.trainable = False

# Define Top2 and Top3 Accuracy
from tensorflow.keras.metrics import categorical_accuracy, top_k_categorical_accuracy

def top_3_accuracy(y_true, y_pred):
    return top_k_categorical_accuracy(y_true, y_pred, k=3)

def top_2_accuracy(y_true, y_pred):
    return top_k_categorical_accuracy(y_true, y_pred, k=2)

model.compile(Adam(lr=0.01), loss='categorical_crossentropy',
              metrics=[categorical_accuracy, top_2_accuracy, top_3_accuracy])

# Add weights to try to make the model more sensitive to melanoma.
# Note: cell_type_idx comes from pd.Categorical over the full class names,
# so the indices run alphabetically: 4 is Melanocytic nevi and 5 is Melanoma.
class_weights = {
    0: 1.0,  # Actinic keratoses (akiec)
    1: 1.0,  # Basal cell carcinoma (bcc)
    2: 1.0,  # Benign keratosis-like lesions (bkl)
    3: 1.0,  # Dermatofibroma (df)
    4: 1.0,  # Melanocytic nevi (nv)
    5: 3.0,  # Melanoma (mel) -- up-weighted for sensitivity
    6: 1.0,  # Vascular lesions (vasc)
}
  • 38.
filepath = "model.h5"
checkpoint = ModelCheckpoint(filepath, monitor='val_top_3_accuracy', verbose=1, save_best_only=True, mode='max')
reduce_lr = ReduceLROnPlateau(monitor='val_top_3_accuracy', factor=0.5, patience=2, verbose=1, mode='max', min_lr=0.00001)
callbacks_list = [checkpoint, reduce_lr]

history = model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size),
                              class_weight=class_weights,
                              validation_data=(x_validate, y_validate),
                              steps_per_epoch=x_train.shape[0] // batch_size,
                              epochs=10, verbose=1,
                              callbacks=callbacks_list)

from keras.utils.vis_utils import plot_model
plot_model(model, to_file='model_plot.png', show_shapes=True, show_layer_names=True)

# get the metric names so we can use evaluate_generator
model.metrics_names

# Here the last epoch will be used.
val_loss, val_cat_acc, val_top_2_acc, val_top_3_acc = model.evaluate(datagen.flow(x_test, y_test, batch_size=16))
print('val_loss:', val_loss)
print('val_cat_acc:', val_cat_acc)
print('val_top_2_acc:', val_top_2_acc)
print('val_top_3_acc:', val_top_3_acc)

# Here the best epoch will be used.
model.load_weights('model.h5')
val_loss, val_cat_acc, val_top_2_acc, val_top_3_acc = model.evaluate_generator(datagen.flow(x_test, y_test, batch_size=16))
print('val_loss:', val_loss)
print('val_cat_acc:', val_cat_acc)
print('val_top_2_acc:', val_top_2_acc)
print('val_top_3_acc:', val_top_3_acc)
  • 39. Plot the Training Curves

# display the loss and accuracy curves
import matplotlib.pyplot as plt

acc = history.history['categorical_accuracy']
val_acc = history.history['val_categorical_accuracy']
loss = history.history['loss']
val_loss = history.history['val_loss']

train_top2_acc = history.history['top_2_accuracy']
val_top2_acc = history.history['val_top_2_accuracy']
train_top3_acc = history.history['top_3_accuracy']
val_top3_acc = history.history['val_top_3_accuracy']

epochs = range(1, len(acc) + 1)

plt.plot(epochs, loss, 'bo', label='Training loss')
plt.plot(epochs, val_loss, 'b', label='Validation loss')
plt.title('Training and validation loss')
plt.legend()
plt.figure()

plt.plot(epochs, acc, 'bo', label='Training cat acc')
plt.plot(epochs, val_acc, 'b', label='Validation cat acc')
plt.title('Training and validation cat accuracy')
plt.legend()
plt.figure()

plt.plot(epochs, train_top2_acc, 'bo', label='Training top2 acc')
plt.plot(epochs, val_top2_acc, 'b', label='Validation top2 acc')
plt.title('Training and validation top2 accuracy')
plt.legend()
plt.figure()

plt.plot(epochs, train_top3_acc, 'bo', label='Training top3 acc')
plt.plot(epochs, val_top3_acc, 'b', label='Validation top3 acc')
plt.title('Training and validation top3 accuracy')
plt.legend()
plt.show()
  • 40. Create a Confusion Matrix

accuracy = model.evaluate(x_test, y_test, verbose=1)[1]
accuracy_v = model.evaluate(x_validate, y_validate)[1]
print("Validation: accuracy = ", accuracy_v)
print("Test: accuracy = ", accuracy)
model.save("model.h5")

# make a prediction
predictions = model.predict_generator(datagen.flow(x_test, y_test, batch_size=16), verbose=1)
predictions.shape

test_batches = datagen.flow(x_test, y_test, batch_size=16)
test_batches

# Source: Scikit Learn website
# http://scikit-learn.org/stable/auto_examples/
# model_selection/plot_confusion_matrix.html#sphx-glr-auto-examples-model-
# selection-plot-confusion-matrix-py
def plot_confusion_matrix(cm, classes, normalize=False, title='Confusion matrix', cmap=plt.cm.Blues):
    """
    This function prints and plots the confusion matrix.
    Normalization can be applied by setting `normalize=True`.
    """
    if normalize:
        cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
        print("Normalized confusion matrix")
    else:
        print('Confusion matrix, without normalization')
    print(cm)

    plt.imshow(cm, interpolation='nearest', cmap=cmap)
    plt.title(title)
    plt.colorbar()
    tick_marks = np.arange(len(classes))
    plt.xticks(tick_marks, classes, rotation=45)
    plt.yticks(tick_marks, classes)

    fmt = '.2f' if normalize else 'd'
    thresh = cm.max() / 2.
    for i, j in itertools.product(range(cm.shape[0]), range(cm.shape[1])):
        plt.text(j, i, format(cm[i, j], fmt),
                 horizontalalignment="center",
                 color="white" if cm[i, j] > thresh else "black")

    plt.ylabel('True label')
    plt.xlabel('Predicted label')
    plt.tight_layout()

# Predict the values from the validation dataset
Y_pred = model.predict(x_validate)
# Convert prediction probabilities to class labels
  • 42.
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot validation labels back to class labels
Y_true = np.argmax(y_validate, axis=1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes=range(7))

# Predict the values from the test dataset
Y_pred = model.predict(x_test)
# Convert prediction probabilities to class labels
Y_pred_classes = np.argmax(Y_pred, axis=1)
# Convert one-hot test labels back to class labels
Y_true = np.argmax(y_test, axis=1)
# compute the confusion matrix
confusion_mtx = confusion_matrix(Y_true, Y_pred_classes)
# plot the confusion matrix
plot_confusion_matrix(confusion_mtx, classes=range(7))

Generate the Classification Report

y_pred = model.predict(x_test)
y_pred = y_pred > 0.5

# index order follows pd.Categorical over the full class names (alphabetical)
cm_plot_labels = ['akiec', 'bcc', 'bkl', 'df', 'nv', 'mel', 'vasc']

from sklearn.metrics import classification_report
# Generate a classification report
report = classification_report(y_test, y_pred, target_names=cm_plot_labels)
print(report)

model.save("mobilenet_model.h5")
  • 43. Techniques applied: LIME, PDP, SHAP, etc.

tile_df = df.copy()
tile_df.drop('lesion_id', inplace=True, axis=1)
tile_df.drop('image_id', inplace=True, axis=1)
tile_df.drop('cell_type', inplace=True, axis=1)
tile_df.drop('path', inplace=True, axis=1)
tile_df.drop('dx', inplace=True, axis=1)
tile_df.head()

  dx_type   age   sex localization  cell_type_idx
0   histo  80.0  male        scalp              2
1   histo  80.0  male        scalp              2
2   histo  80.0  male        scalp              2
3   histo  80.0  male        scalp              2
4   histo  75.0  male          ear              2

X = tile_df.drop(['cell_type_idx'], axis=1).values
y = tile_df['cell_type_idx'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pip install alibi
pip install shap

import shap
shap.initjs()

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from alibi.explainers import KernelShap
from scipy.special import logit
  • 44.
from sklearn.metrics import confusion_matrix, plot_confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression

# (integer label encoding, despite the _onehot name)
tile_df['localization_onehot'] = tile_df.localization.map({'scalp':0, 'ear':1, 'face':2, 'neck':3, 'back':4, 'trunk':5, 'chest':6, 'upper extremity':7, 'abdomen':8, 'lower extremity':9, 'genital':10, 'hand':11, 'foot':12, 'acral':13, 'unknown':14})
tile_df.head()

  dx_type   age   sex localization  cell_type_idx  localization_onehot
0   histo  80.0  male        scalp              2                    0
1   histo  80.0  male        scalp              2                    0
2   histo  80.0  male        scalp              2                    0
3   histo  80.0  male        scalp              2                    0
4   histo  75.0  male          ear              2                    1

tile_df['dx_type_onehot'] = tile_df.dx_type.map({'confocal':0, 'consensus':1, 'follow_up':2, 'histo':3})
tile_df.head()

  dx_type   age   sex localization  cell_type_idx  localization_onehot  dx_type_onehot
0   histo  80.0  male        scalp              2                    0               3
1   histo  80.0  male        scalp              2                    0               3
2   histo  80.0  male        scalp              2                    0               3
3   histo  80.0  male        scalp              2                    0               3
4   histo  75.0  male          ear              2                    1               3
  • 45.
tile_df['gender_male'] = tile_df.sex.map({'female':0, 'male':1, 'unknown':2})
tile_df.head()

  dx_type   age   sex localization  cell_type_idx  localization_onehot  dx_type_onehot  gender_male
0   histo  80.0  male        scalp              2                    0               3            1
1   histo  80.0  male        scalp              2                    0               3            1
2   histo  80.0  male        scalp              2                    0               3            1
3   histo  80.0  male        scalp              2                    0               3            1
4   histo  75.0  male          ear              2                    1               3            1

tile_df.columns

Index(['dx_type', 'age', 'sex', 'localization', 'cell_type_idx',
       'localization_onehot', 'dx_type_onehot', 'gender_male'],
      dtype='object')

features = ['age', 'localization_onehot', 'dx_type_onehot', 'gender_male']
X = tile_df[features]
y = tile_df['cell_type_idx'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

from xgboost import XGBClassifier
from sklearn.ensemble import RandomForestClassifier

model = XGBClassifier(random_state=1)
model = model.fit(X_train, y_train)
y_pred = model.predict(X_test)
predictions = [round(value) for value in y_pred]
  • 46. from sklearn.metrics import accuracy_score accuracy = accuracy_score(y_test, predictions) print("Accuracy: %.2f%%" % (accuracy * 100.0)) Accuracy: 72.16% explainer = shap.TreeExplainer(model) shap_values = explainer.shap_values(X_test) print('Expected Value: ', explainer.expected_value) Expected Value: [-0.6287137, -0.21934628, 0.4661603, -1.7456617, 2.6632032, 0.5190712, -1.2845858] shap.summary_plot(shap_values, X_test, plot_type="bar") shap.summary_plot(shap_values[0], X_test) from sklearn.preprocessing import LabelEncoder
  • 47. ## Preprocess training and test target (y) after having performed train-test split le = LabelEncoder() y_multi_train = pd.Series(le.fit_transform(y_train)) y_multi_test = pd.Series(le.transform(y_test)) ## Check classes le.classes_ array([0, 1, 2, 3, 4, 5, 6], dtype=int8) shap.initjs() shap.dependence_plot('dx_type_onehot', interaction_index='age', shap_values=shap_values[0], features=X_test, display_features=X_test) shap.initjs() shap.force_plot(explainer.expected_value[0], shap_values[0][:100,:], X_test.iloc[:100,:])
  • 48.
shap.initjs()
shap.force_plot(explainer.expected_value[0], shap_values[0][15,:], X_test.iloc[15,:])

Feature Importance: Feature importance measures the increase in the prediction error of the model after permuting a feature's values. A feature is "important" if shuffling its values increases the model error, because in that case the model relied on the feature for the prediction. A feature is "unimportant" if shuffling its values leaves the model error unchanged, because in that case the model ignored the feature for the prediction. (A sketch using eli5's PermutationImportance follows after the output below.)

Now install eli5:

pip install eli5

import eli5
from eli5.sklearn import PermutationImportance

eli5.show_weights(model.get_booster(), top=15)

Weight   Feature
0.8239   dx_type_onehot
0.0748   age
0.0667   localization_onehot
0.0346   gender_male

tgt = 6
print('Reference:', y_test[tgt])
print('Predicted:', predictions[tgt])
eli5.show_prediction(model.get_booster(), X_test.iloc[tgt], feature_names=features, show_feature_values=True)
  • 49.
Reference: 4
Predicted: 4

(Condensed eli5.show_prediction output: per-class scores were y=0: -7.697, y=1: -5.861, y=2: -1.815, y=3: -2.777, y=4: 7.013 with probability 1.000, y=5: -5.009, y=6: -4.376; for every class the dominant contribution comes from dx_type_onehot, with smaller contributions from age, localization_onehot and gender_male.)
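The weights shown above come from XGBoost's built-in importances; to compute the permutation importance described in the Feature Importance paragraph, eli5's PermutationImportance wrapper can be used. A minimal sketch, reusing the model, X_test, y_test and features defined earlier:

# Sketch: permutation importance as described above -- shuffle one feature
# at a time and measure the drop in score on held-out data.
from eli5.sklearn import PermutationImportance
import eli5

perm = PermutationImportance(model, random_state=1).fit(X_test, y_test)
eli5.show_weights(perm, feature_names=features)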
  • 50. PDP: The partial dependence plot shows the marginal effect that one or two features have on the predicted outcome of a machine learning model. A partial dependence plot can show whether the relationship between the target and a feature is linear, monotonic or more complex. For each of the categories, we get a PDP estimate by forcing all data instances to have the same category.

pip install pdpbox

from pdpbox import pdp, get_dataset, info_plots

pdp_feat_67_rf = pdp.pdp_isolate(model=model, dataset=X_train, model_features=features, feature='dx_type_onehot')
fig, axes = pdp.pdp_plot(pdp_isolate_out=pdp_feat_67_rf, feature_name='type of diagnosis', center=True,
                         x_quantile=True, ncols=3, plot_lines=True, frac_to_plot=100)

The PDP (partial dependence plot) shows the relation between an increase or decrease of one feature and the prediction of the model. For example, in figure 1 (class 0) we observe that the chance of the skin disease belonging to class 0 increases when the value of dx_type_onehot changes from 2 (follow-up) to 3 (histopathology). Similarly, in figure 5 (class 4), the probability of the skin disease belonging to class 4 is very high when dx_type_onehot lies between 0 and 2, and decreases comparatively when it lies between 2 and 3. Likewise, the probability of the skin disease belonging to class 6 is very low when dx_type_onehot lies between 0 and 2 (confocal, consensus and follow-up), and increases comparatively when it changes from 2 to 3.
  • 51. LIME
LIME is a technique that explains how the input features of a machine learning model affect its predictions. For image classification tasks, LIME finds the region of an image (a set of superpixels) with the strongest association with a prediction label. LIME creates explanations by generating a new dataset of random perturbations (with their respective predictions) around the instance being explained, and then fitting a weighted local surrogate model, i.e. a simple model that explains individual predictions.

Step 1: Generate random perturbations for the input image
Step 2: Predict the class for each perturbation
Step 3: Compute weights (importance) for the perturbations
Step 4: Fit an interpretable linear model using the perturbations, predictions and weights

import skimage.io
import skimage.segmentation

np.random.seed(222)
Xi = x_test[3]
preds = model.predict(Xi[np.newaxis,:,:,:])
top_pred_classes = preds[0].argsort()[-5:][::-1]  # Save ids of top 5 classes
top_pred_classes
print(y_test[3])
skimage.io.imshow(Xi)

# Generate superpixel segmentation for the image
superpixels = skimage.segmentation.quickshift(Xi, kernel_size=4, max_dist=200, ratio=0.2)
num_superpixels = np.unique(superpixels).shape[0]
skimage.io.imshow(skimage.segmentation.mark_boundaries(Xi, superpixels))
print("The number of superpixels generated")
num_superpixels
  • 52. #Generate perturbations num_perturb = 150 perturbations = np.random.binomial(1, 0.5, size=(num_perturb, num_superpixels)) #Create function to apply perturbations to images import copy def perturb_image(img,perturbation,segments): active_pixels = np.where(perturbation == 1)[0] mask = np.zeros(segments.shape) for active in active_pixels: mask[segments == active] = 1 perturbed_image = copy.deepcopy(img) perturbed_image = perturbed_image*mask[:,:,np.newaxis] return perturbed_image #Show example of perturbations print(perturbations[0]) predictions = [] for pert in perturbations: perturbed_img = perturb_image(Xi,pert,superpixels) pred = model.predict(perturbed_img[np.newaxis,:,:,:]) predictions.append(pred) predictions = np.array(predictions) print(predictions.shape) skimage.io.imshow(perturb_image(Xi,perturbations[0],superpixels)) skimage.io.imshow(perturb_image(Xi,perturbations[11],superpixels)) skimage.io.imshow(perturb_image(Xi,perturbations[2],superpixels)) #Compute distances to original image import sklearn.metrics original_image = np.ones(num_superpixels)[np.newaxis,:] #Perturbation with all superpixels enabled distances = sklearn.metrics.pairwise_distances(perturbations,original_image, metric='cosine').ravel() print(distances.shape) #Transform distances to a value between 0 an 1 (weights) using a kernel function kernel_width = 0.25 weights = np.sqrt(np.exp(-(distances**2)/kernel_width**2)) #Kernel function print(weights.shape)
  • 53.
# Estimate linear model
from sklearn.linear_model import LinearRegression

class_to_explain = 4
simpler_model = LinearRegression()
simpler_model.fit(X=perturbations, y=predictions[:,:,class_to_explain], sample_weight=weights)
coeff = simpler_model.coef_[0]

# Use coefficients from the linear model to extract top features
num_top_features = 4
top_features = np.argsort(coeff)[-num_top_features:]

# Show only the superpixels corresponding to the top features
mask = np.zeros(num_superpixels)
mask[top_features] = True  # Activate top superpixels
skimage.io.imshow(perturb_image(Xi, mask, superpixels))

Conclusion
This paper focused on various techniques for the classification of skin diseases. Automating the process of skin disease identification and classification can be very helpful and reduces the time needed for diagnosis. The paper presented a survey of traditional, feature extraction-based approaches and CNN-based approaches for skin disease classification. From the study it is concluded that, for the traditional approach, the feature selection process is time consuming and the selection of relevant features is very important. The deep learning algorithm CNN, in contrast, learns features automatically and efficiently; for feature extraction, a CNN selects its filters intelligently compared with manually designed ones. Pre-trained models like Inception v3, ResNet, VGG16, VGG19, AlexNet, etc. are trained on very large datasets of millions of general images and can be used with transfer learning or fine-tuning. However, a pre-trained model has to be retrained if it was not previously trained on skin disease images. Also, a CNN needs a fairly large dataset for training so that it can learn effectively, compared to the traditional way of skin disease classification.
  • 54. References
[1] D.A. Okuboyejo, O.O. Olugbara and S.A. Odunaike, "Automating skin disease diagnosis using image classification," in Proceedings of the World Congress on Engineering and Computer Science, Oct. 2013, vol. 2, pp. 850-854.
[2] A.A. Amarathunga, E.P. Ellawala, G.N. Abeysekara and C.R. Amalraj, "Expert system for diagnosis of skin diseases," International Journal of Scientific & Technology Research, 4(1):174-178, Jan. 2015.
[3] S. Chakraborty, K. Mali, S. Chatterjee, S. Anand, A. Basu, S. Banerjee, M. Das and A. Bhattacharya, "Image based skin disease detection using hybrid neural network coupled bag-of-features," in 2017 IEEE 8th Annual Ubiquitous Computing, Electronics and Mobile Communication Conference (UEMCON), Oct. 2017, pp. 242-246, IEEE.
[4] A. Esteva, B. Kuprel, R.A. Novoa, J. Ko, S.M. Swetter, H.M. Blau and S. Thrun, "Dermatologist-level classification of skin cancer with deep neural networks," Nature, 542(7639):115-118, Feb. 2017.
[5] X. Zhang, S. Wang, J. Liu and C. Tao, "Towards improving diagnosis of skin diseases by combining deep neural network and human knowledge," BMC Medical Informatics and Decision Making, 18(2):59, Jul. 2018.
[6] T.J. Brinker, A. Hekler, J.S. Utikal, N. Grabe, D. Schadendorf, J. Klode, C. Berking, T. Steeb, A.H. Enk and C. von Kalle, "Skin cancer classification using convolutional neural networks: systematic review," Journal of Medical Internet Research, 20(10):e11936, 2018.
[7] R. Kulhalli, C. Savadikar and B. Garware, "A hierarchical approach to skin lesion classification," in Proceedings of the ACM India Joint International Conference on Data Science and Management of Data, Jan. 2019, pp. 245-250.
[8] M.A. Khan, M.Y. Javed, M. Sharif, T. Saba and A. Rehman, "Multi-model deep neural network based features extraction and optimal selection approach for skin lesion classification," in 2019 International Conference on Computer and Information Sciences (ICCIS), Apr. 2019, pp. 1-7, IEEE.
[9] J. Premaladha, S. Sujitha, M.L. Priya and K.S. Ravichandran, "A survey on melanoma diagnosis using image processing and soft computing techniques," Research Journal of Information Technology, 6(2):65-80, May 2014.
[10] S. Chatterjee, D. Dey, S. Munshi and S. Gorai, "Extraction of features from cross correlation in space and frequency domains for classification of skin lesions," Biomedical Signal Processing and Control, 53:101581, Aug. 2019.
[11] M.S. Manerkar, U. Snekhalatha, S. Harsh, J. Saxena, S.P. Sarma and M. Anburajan, "Automated skin disease segmentation and classification using multi-class SVM classifier," 2016.
[12] N. Codella, J. Cai, M. Abedini, R. Garnavi, A. Halpern and J.R. Smith, "Deep learning, sparse coding, and SVM for melanoma recognition in dermoscopy images," in International Workshop on Machine Learning in Medical Imaging, Oct. 2015, pp. 118-126, Springer, Cham.
[13] P.M. Burlina, N.J. Joshi, E. Ng, S.D. Billings, A.W. Rebman and J.N. Aucott, "Automated detection of erythema migrans and other confounding skin lesions via deep learning," Computers in Biology and Medicine, 105:151-156, Feb. 2019.
[14] I. Zaqout, "Diagnosis of skin lesions based on dermoscopic images using image processing techniques," in Pattern Recognition: Selected Methods and Applications, Jul. 2019, IntechOpen.
[15] V.B. Kumar, S.S. Kumar and V. Saboo, "Dermatological disease detection using image processing and machine learning," in 2016 Third International Conference on Artificial Intelligence and Pattern Recognition (AIPR), Sep. 2016, pp. 1-6, IEEE.
[16] E. Jana, R. Subban and S. Saraswathi, "Research on skin cancer cell detection using image processing," in 2017 IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Dec. 2017, pp. 1-8, IEEE.
[17] M. Monisha, A. Suresh and M.R. Rashmi, "Artificial intelligence based skin classification using GMM," Journal of Medical Systems, 43(1):3, Jan. 2019.
[18] N.C. Codella, D. Gutman, M.E. Celebi, B. Helba, M.A. Marchetti, S.W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler and A. Halpern, "Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC)," in 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), Apr. 2018, pp. 168-172, IEEE.
[25] https://www.cancercenter.com/cancer-types/melanoma/symptoms
[26] https://www.mayoclinic.org/diseases-conditions
[27] https://www.isic-archive.com
[28] https://sites.google.com/site/robustmelanomascreening/dataset
[29] https://www.dropbox.com/s/k88qukc20ljnbuo/PH2Dataset.rar
[30] http://www.cs.rug.nl/~imaging/databases/melanoma_naevi/
[31] https://www.derm101.com/image-library/?match=IN
[36] N. Yadav, V.K. Narang and U. Shrivastava, "Skin diseases detection models using image processing: A survey," International Journal of Computer Applications, 137(12):34-39, Mar. 2016.
[37] N. Gessert, T. Sentker, F. Madesta, R. Schmitz, H. Kniep, I. Baltruschat, R. Werner and A. Schlaefer, "Skin lesion classification using CNNs with patch-based attention and diagnosis-guided loss weighting," IEEE Transactions on Biomedical Engineering, May 2019.
[38] https://www.biospectrumindia.com/news/73/8437/skin-diseases-togrow-in-india-by-2015-report.html
[39] https://www.who.int/uv/faq/skincancer/en/index1.html
[40] https://towardsdatascience.com
[41] www.analyticsvidhya.com