Automated analysis of blood smear images for leukemia detection: a
comprehensive review
AJAY MITTAL, SABRINA DHALLA, and SAVITA GUPTA, UIET, Panjab University, INDIA
AASTHA GUPTA∗, Department of Mathematics, Panjab University, INDIA
Leukemia, the cancer of blood-forming tissues, becomes fatal if not detected in the early stages. It is detected through a
blood smear test that involves the morphological analysis of the stained blood slide. The manual microscopic examination of
slides is tedious, time-consuming, error-prone, and subject to inter-observer and intra-observer bias. Several computerized
methods to automate this task have been developed to alleviate these issues during the past few years. However, no exclusive
comprehensive review of these methods has been presented to date. Such a review shall be highly beneficial for novice
readers interested in pursuing research in this domain. This paper fills the void by presenting a comprehensive review of
149 papers detailing the methods used to analyze blood smear images and detect leukemia. The primary focus of the review
is on presenting the underlying techniques used, their reported performance, along with their merits and demerits. It also
enumerates the research issues that have been satisfactorily solved and open challenges still existing in the domain.
CCS Concepts: • Medical Image Analysis → Computer-aided diagnosis; Early disease detection; Segmentation; Classification.
Additional Key Words and Phrases: Leukemia detection, Deep learning, Leukocyte Segmentation, Classification, Blood Smear
Images
1 INTRODUCTION
Leukemia, the malignancy of blood-forming tissues, is a potentially deadly disease that is known to cause
thousands of deaths every year worldwide [1]. This fatal disease exhibits generalized signs and symptoms such
as persistent fatigue, unexplained weight loss, frequent infections, recurrent nosebleeds, swollen lymph nodes,
night sweats, bone pain, and anemia. In clinical practice, it is hard for a general physician to correlate these
symptoms with such a deadly disease and determine that something is out of the ordinary. Moreover, the disease
does not exhibit symptoms until the leukemia cells accumulate to a certain level, making its early detection
highly challenging.
The disease has a poor prognosis, and any delay in correct diagnosis has a severe negative impact on the
patient’s survivability. Thus, it is crucial to spread awareness among the physicians, healthcare workers, and
general public, especially in the vulnerable cohort with family history, genetic disorders such as Down syndrome,
chemotherapy, or radiotherapy treatment, to be vigilant about such symptoms. Simultaneously, it is essential to
develop highly effective methods for screening and early detection of the disease.
At present, unlike other malignancies, no special screening methods are available for detecting leukemia
before the onset of its symptoms. Doctors recommend that vulnerable people undergo regular medical checkups,
including physical examination and routine blood testing. An increased white blood cell (WBC) count in a routine
blood test indicates an infection, stress, inflammation, or a bone marrow disease (possibly leukemia) and is followed by a
blood smear test for morphological analysis of the blood cells. In a blood smear test, a trained laboratorian
evaluates the stained slide under a microscope. This manual examination is labor-intensive, time-consuming,
and error-prone with large inter-observer and intra-observer bias. Therefore, many computer-aided methods
have been developed for automated morphological analysis of the blood smear images during the past few years.
However, no detailed survey consolidating the advancements in these methods has been presented to date,
which motivates the present review.
∗Corresponding Author: Aastha Gupta
Authors’ addresses: Ajay Mittal, ajaymittal@pu.ac.in; Sabrina Dhalla; Savita Gupta, UIET, Panjab University, Chandigarh, INDIA, 160 014;
Aastha Gupta, Department of Mathematics, Panjab University, Chandigarh, INDIA.
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that
copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first
page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy
otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from
permissions@acm.org.
© 2022 Association for Computing Machinery.
0360-0300/2022/2-ART $15.00
https://doi.org/10.1145/3514495
ACM Comput. Surv.
This paper presents a comprehensive review of computer-aided methods to analyze blood smear images and
detect leukemia from them. The review is conducted as per Preferred Reporting Items for Systematic Review and
Meta-Analyses (PRISMA) guidelines [2]. Seven online databases are searched using a properly formulated query
to identify papers included in the review. The identified papers are screened against inclusion and exclusion
criteria to select the papers for inclusion in the survey. Using the defined selection protocol, a total of 149 research
papers are included in the review. The primary focus of the review is on presenting the underlying approach of
these methods, the dataset used by them, their reported performance, along with their pros and cons. The review
refrains from comparing their performance by bringing them on a common platform since their datasets vary in
quality and complexity.
The rest of the paper is organized as follows. Section 2 describes the protocol adopted for conducting the
review. Section 3 presents the basic information about the cellular components of blood and the tests commonly
prescribed to detect hematologic diseases like leukemia. Section 4 presents the information about leukemia,
and its various sub-types. Section 5 briefly presents the methods used for automated leukocyte analysis. These
sections form the basis for novice readers interested in pursuing research in this area. Section 6 lists the datasets
used in the development of systems for automated analysis of blood smear images. Section 7 presents a detailed
review of these systems. Finally, conclusions are drawn in Section 8.
2 REVIEW PROTOCOL
A review involves planning, formulation of a review protocol, and adherence to it throughout. The review protocol
describes the planned methods for the review. It involves i) formulation of research questions, ii) data collection
strategy, iii) inclusion/exclusion criteria, and iv) methodology for data extraction, analysis, summarization, and
synthesis of results. The details of the protocol formulated for the present review are as follows.
2.1 Research Questions
The research questions define the objectives of the review clearly, unambiguously, and in a structured form. This
review aims to find answers to the following questions.
RQ1 What is the prerequisite domain knowledge required by a novice user interested in developing new
computational methods for leukemia detection?
RQ2 What are the blood tests used to diagnose leukemia? How are these tests performed in clinical practice?
Are there any automated methods available for blood testing?
RQ3 What are the various segmentation and classification techniques used to detect leukemia from blood smear
images? What are their pros and cons? Which technique gives better results? What are the challenges
involved in the development of such automated techniques?
RQ4 What are the publicly available datasets used for the development and testing of computational methods
for leukemia detection?
Fig. 1. Number of research papers identified, screened and included in the review
2.2 Data Collection
To exhaustively identify the relevant papers, primary searches are carried out on seven different electronic
databases, namely IEEEXplore, ScienceDirect, PubMed, Springer, ACM digital library, SPIE digital library, and
Wiley online library, during 5-8 June 2020. In addition to the primary searches, secondary searches are carried
out to scrutinize and consider the papers mentioned in the bibliography of the research papers obtained from the
primary searches. Searches are again carried out during 20-22 September 2021 to update the review before its
publication. Primary searches are carried out using the query "(((leukemia) OR (leukemia detection) OR (leukemia
classification) OR ((segmentation) AND (leukemia) or (((WBC segmentation) or (leukocyte segmentation)) AND
(leukemia)))) AND ((blood smear images) OR (medical images)))" to identify the papers published during or after
2010.
2.3 Inclusion and Exclusion Criteria
The identified papers, n0 = 2259, are screened through three stages using the inclusion and exclusion criteria
listed in Table 1. In the first stage, the papers are screened based on their title. The papers that pass
the first stage, n1 = 426, are screened based on their abstract. The papers that pass the second stage,
n2 = 188, are screened based on the whole text of the paper. The papers that pass the third stage, n3 = 166,
are checked for uniqueness. Finally, 149 papers are included in the review after removing 17 duplicate papers.
The number of papers passed and rejected during various screening stages is shown in the flowchart in Figure 1.
Once a paper is included, a thorough analysis of the same is conducted and the result is reported in the paper.
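The screening counts above can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers reported in this section; the stage labels and the helper name `excluded_per_stage` are illustrative choices made here, not part of the review protocol.

```python
# Counts reported in Section 2.3; the labels are descriptive names chosen for this sketch.
stages = [
    ("identified", 2259),
    ("passed title screening", 426),
    ("passed abstract screening", 188),
    ("passed full-text screening", 166),
    ("included after duplicate removal", 149),
]


def excluded_per_stage(stages):
    """Return (label, removed_count) for each transition between consecutive stages."""
    return [
        (later_label, earlier - later)
        for (_, earlier), (later_label, later) in zip(stages, stages[1:])
    ]


exclusions = excluded_per_stage(stages)
for label, removed in exclusions:
    print(f"{label}: {removed} removed")
```

The last transition reproduces the 17 duplicates mentioned in the text, and the removed counts sum to the difference between the identified and the included papers.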
Table 1. Inclusion and exclusion criteria used for short-listing papers
SNo. | Parameters | Inclusion Criteria | Exclusion Criteria
--- | --- | --- | ---
1. | Time Period | Papers published from the year 2010 | Papers published before 2010
2. | Interventions/Investigations | Papers related to leukemia detection from microscopic images | Papers related to leukemia detection using flow cytometry, molecular methods, gene analysis and drug effect
3. | Imaging Modality | Papers including blood smear images | Papers not on blood smear images
4. | Study Design | Papers involving computational methods for leukemia detection with their experimental results | Papers including diagnosis and treatment of leukemia; case studies, patents; papers in language other than English
2.4 Relevance and Quality Assessment
To maintain the integrity of the screening process, all the papers are independently assessed by two authors
and then discussed with the third author. Any difference in opinion is resolved by joint critical analysis. All the
included 149 papers present a computational technique for leukocyte segmentation and/or classification for
leukemia detection from blood smear images. The included papers are rigorously analyzed, and the results are
presented in the following sections as per the formulated research questions.
3 HEMATOLOGIC INFORMATION
Peripheral blood is a specialized body fluid that circulates through the entire body and is responsible for carrying
oxygen and other nutrients to the living cells, carrying cellular waste from cells to the excretory system, and
removing pathogens from different parts of the body. It consists of four major components: red blood cells (RBCs,
also called erythrocytes), white blood cells (WBCs, also called leukocytes), platelets (also called thrombocytes), and
plasma.
3.1 Cellular components of blood
As stated, blood consists of plasma and different cellular components. Plasma is the yellowish liquid that amounts
to 55% of the blood volume. It is mostly water containing dissolved salts, proteins, glucose, electrolytes, hormones,
enzymes, and antibodies. Its role is to carry these dissolved substances to different parts of the body. The cellular
components of the blood are as follows:
(1) Red blood cells or erythrocytes are the most abundant cells in the blood amounting to approximately 44% of
its volume. Morphologically, an RBC is a biconcave disk with a flattened center, as depicted in Figure
2(a). An RBC starts as an immature cell in bone marrow, where it matures for approximately seven days
before being released into the bloodstream. It contains a special protein known as hemoglobin, which helps
in carrying oxygen from the lungs to the living cells in the body and carbon dioxide back from the cells to
the lungs.
(2) White blood cells or leukocytes are a heterogeneous group of nucleated cells that account for 1% of the blood
volume. They are developed from the precursor cells produced in the bone marrow and protect the body
from infection. The morphology of a WBC depends upon its type. As classified in Figure 2(b), WBCs have
the following types:
(a) Granulocytes are the WBCs that have a bi-lobed nucleus and granules in their cytoplasm. The granules
contain chemicals that are released during innate and adaptive immune responses against bacterial, viral,
and parasitic infections. Granulocytes are of three types:
(i) Neutrophils are the granulocytes that fight against infections caused by bacteria or fungi.
(ii) Eosinophils are the granulocytes that respond to infections caused by parasites and play a varied role
in controlling the immune response.
(iii) Basophils are the largest and least common granulocytes. They are responsible for inflammatory
reactions during the immune response.
(b) Lymphocytes are the WBCs that exist in the bloodstream and lymphatic system. They have a gigantic
nucleus and granule-free cytoplasm. They are further classified as:
(i) B lymphocytes generate antibodies that are required to mount an immune response against infections
from bacteria, viruses, and other foreign antigens.
(ii) T lymphocytes are the killer cells that destroy cancer cells, foreign cells, or cells infected with antigens.
(c) Monocytes are the WBCs that have a sizeable indented nucleus and a cytoplasm with fine granules. These
cells are responsible for the immune response to chronic infections resulting from bacteria.
Fig. 2. Blood cells (a) Types of blood cells, (b) Types of white blood cells.
(3) Platelets or thrombocytes do not have a cell nucleus and are fragments of cytoplasm that adhere to damaged
blood vessels resulting in clotting.
The number, proportion, and morphological characteristics of these cells provide a vital cue about a person’s
general health, disease, or condition and are determined through the following blood tests.
3.2 Blood tests
Hematologic diseases such as leukemia are generally detected using the following blood tests:
(1) Complete blood count (CBC) is the most common blood test that is performed to get information about the
number of cells in the blood. It is performed using an automated instrument, known as a hemocytometer,
that counts the number of RBCs, WBCs, and platelets in the blood. The numbers outside the established
reference intervals indicate a disease or condition and require further tests.
(2) White blood cell differential is generally performed after an abnormal CBC report. It is generally performed
using an automated hematology analyzer to determine the absolute number or percentage of different
types of WBCs. It can also detect the presence of abnormal or immature cells. An increase or decrease in a
particular type of WBC helps doctors narrow down the conditions that affect the specific WBCs.
Table 2. Morphological characteristics of stained cellular components of blood
Cell | Sub-type | Type | Nucleus | Cytoplasm | Size (µm)
--- | --- | --- | --- | --- | ---
RBCs | - | - | Lacks cell nucleus | Orange pink to rose, large, contains 270 million haemoglobin molecules | 6.2-8.2
WBCs | Granulocytes | Neutrophils | Deep blue to violet, several lobes (2-5) | Pale pink to tan, filled with fine purple granules | 12-16
WBCs | Granulocytes | Eosinophils | Purple, bi-lobed | Pale pink to tan, filled with large orange to bright red granules | 14-16
WBCs | Granulocytes | Basophils | Bi-lobed, purple, obscured by granules | Pale pink to tan cytoplasm filled with large purple to blue-black granules | 14-16
WBCs | Lymphocytes | B- and T-type | Spherical, large, dark stained | Light blue cytoplasm with no granules | 8-15
WBCs | Monocytes | - | Deep bluish, large, indented | Pale gray to blue cytoplasm with fine granules | 14-20
Platelets | - | - | No cell nucleus | Red purple surrounded by light blue | 1.5-3
Fig. 3. Morphology of stained blood cells (a) Erythrocytes, (b) Neutrophils, (c) Eosinophils, (d) Basophils, (e) Lymphocytes, (f)
Monocytes, (g) Thrombocytes
(3) Blood smear test is done to discover morphological abnormalities in blood cells. In a blood smear test, a
thin layer of blood is smeared on glass and is then stained to highlight different structures or the cellular
components in the blood cells. The stained slide is then examined and evaluated by a trained laboratorian
using a microscope. Of all the available stains, the Romanowsky group of stains are universally used for
staining blood cells. Romanowsky group of stains includes Giemsa [3], Wright [4] and Leishman [5] stains.
Almost all of the Romanowsky group of stains consist of two components: Methylene blue and Eosin
Y or Azure B dyes. Methylene blue is a basic dye with a positive charge and thus binds to a negatively
charged cell structure, i.e., nucleus. Eosin Y or azure B are acidic dyes with a negative charge and affinity for
positively charged cellular components such as granules and cytoplasm. The morphological characteristics
of stained cellular components of blood are depicted in Figure 3 and are summarized in Table 2.
Among all of the tests, the relevance of the blood smear test in the morphologic diagnosis of various hematologic
diseases is enormous. Any abnormality in the number or morphology of the blood cells found in the blood smear
test suggests the diagnosis. Although it is generally followed by bone marrow biopsy and molecular testing
to identify chromosomal abnormalities and DNA markers to determine the type and stage of the disease, the
diagnostic relevance of the blood smear test has not been lessened.
4 LEUKEMIA AND ITS TYPES
As stated, blood mainly consists of three types of blood cells: RBCs, WBCs, and platelets. Of all the blood cells,
WBCs are more prone to become cancerous than the RBCs and platelets. In leukemia, the blood-forming tissues
become malignant and start producing abnormal WBCs. Leukemia has different types, which are determined on
the following basis:
(1) Depending upon the type of WBCs affected: On the basis of the type of WBCs affected, leukemia is classified
as lymphocytic (also known as lymphoblastic) or myelogenous (also known as myeloid). In lymphocytic
leukemia, abnormal lymphocytes are produced, whereas in myelogenous leukemia abnormalities are found
in granulocytes and monocytes, collectively known as myeloid cells.
(2) Depending upon the maturity of abnormal cells: On the basis of whether most of the abnormal cells are
immature or mature, leukemia is classified as acute or chronic. In acute leukemia, most abnormal cells
are immature, whereas, in chronic leukemia, most abnormal cells are mature. Acute leukemia progresses
rapidly since it affects immature cells, which multiply very quickly. On the other hand, chronic leukemia is
less severe and progresses more slowly as compared to acute leukemia.
Thus, there are four possible combinations of lymphocytic or myelogenous leukemia and acute or chronic
leukemia, resulting in leukemia being classified as acute lymphocytic leukemia (ALL), chronic lymphocytic
leukemia (CLL), acute myelogenous leukemia (AML), and chronic myelogenous leukemia (CML). The classification
of leukemia is summarized in Table 3.
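The two criteria above combine into the four types by a simple cross product. A minimal sketch of that enumeration follows; the function name `leukemia_types` and the way the abbreviation is assembled from initials are illustrative assumptions made here.

```python
from itertools import product

# The two classification criteria described above.
courses = ["acute", "chronic"]               # maturity of the abnormal cells
lineages = ["lymphocytic", "myelogenous"]    # type of WBCs affected


def leukemia_types():
    """Map each abbreviation (e.g. 'ALL') to its full name by crossing both criteria."""
    types = {}
    for course, lineage in product(courses, lineages):
        # Initial of the course, initial of the lineage, then 'L' for leukemia.
        abbreviation = course[0].upper() + lineage[0].upper() + "L"
        types[abbreviation] = f"{course} {lineage} leukemia"
    return types


types = leukemia_types()
for abbreviation, name in types.items():
    print(abbreviation, "=", name)
```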
5 AUTOMATED ANALYSIS OF BLOOD SMEAR SLIDES
The manual examination of a blood smear slide is labor-intensive, tedious, time-consuming, done on a small
number of cells (about 100), and subject to inter-observer bias [6]. A lot of effort has been made since the 1960s to
automate the counting and morphological analysis of WBCs to improve the turnaround time and efficacy of the
examination. On the basis of the technique used for automated leukocyte analysis, analyzers are classified
as flow-through systems and digital image processing systems [7].
5.1 Flow-through systems
These systems require an anticoagulated blood sample injected into a stream of sheath fluid within the flow
chamber. By hydrodynamic focusing, each cell is made to pass through a flow cell (see footnote 1) in a single line. A laser
device is focused on the flow cell. As the laser beam strikes a cell, it scatters in different directions (see Figure
4 for basic illustration). Photo-detectors are used to detect scattered light. The light scattered in the forward
direction determines the cell size, whereas scattering in the side directions determines the nuclear complexity
and granularity of the cytoplasm.
The flow-through systems provide significant speed and efficiency in handling blood samples with high
precision and minimal human intervention. However, they cannot perform morphological analysis and thus
cannot be used for the morphology-based diagnosis of hematologic diseases. Digital image processing systems
alleviate this limitation of flow-through systems for automatic leukocyte analysis.
5.2 Digital image processing systems
These systems identify leukocytes in the high-resolution images of the stained blood smear slides using digital
image processing (DIP), pattern recognition, and machine learning algorithms. These systems inspect slides based
on features such as geometry, size, color, and texture of cells exactly in a way that mimics the manual inspection.
These systems are not as fast as flow-through systems, but they can analyze the morphology of blood cells in
addition to their counting.
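To make the notion of geometric and size features concrete, the sketch below computes three simple descriptors from a binary cell mask. It is an illustration only: the function name `shape_features`, the nested-list mask representation, and the edge-counting perimeter estimate are assumptions made here, and the reviewed systems use far richer feature sets. The circularity measure 4πA/P² is a standard shape descriptor.

```python
import math


def shape_features(mask):
    """Compute area, perimeter, and circularity of a binary mask (rows of 0/1).

    The perimeter is estimated as the number of foreground pixel edges that
    touch background or the image border (4-connectivity).
    """
    rows, cols = len(mask), len(mask[0])
    area = 0
    perimeter = 0
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c]:
                continue
            area += 1
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                nr, nc = r + dr, c + dc
                outside = not (0 <= nr < rows and 0 <= nc < cols)
                if outside or not mask[nr][nc]:
                    perimeter += 1
    circularity = 4 * math.pi * area / perimeter ** 2 if perimeter else 0.0
    return {"area": area, "perimeter": perimeter, "circularity": circularity}


# Toy mask: a 3x3 square of foreground pixels inside a 5x5 image.
square = [[1 if 1 <= r <= 3 and 1 <= c <= 3 else 0 for c in range(5)] for r in range(5)]
features = shape_features(square)
print(features)
```

With this pixel-edge perimeter, a square gives a circularity of π/4, so rounder regions score closer to 1 and elongated or irregular regions score lower.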
Fully automated digital image processing systems for hematologic inspection have three components: an automatic
slide preparer, an image acquisition device, and an observation algorithm. Of these three components, the most
important is the observation, or image processing, algorithm, as it has to handle the artifacts
that arise from the other two components and maintain the efficacy of the analyzer. The image processing
algorithm should be versatile enough to handle specimen preparation problems such as variation in optical density in the
slide, overlapping cells, disrupted cells, stain debris, and stain variations [7], and image acquisition problems such as
variable magnification (i.e., scale), noise, and compression.
1 A glass slide containing a small fluidic channel.
Table 3. Classification of leukemia
Criterion | ALL | CLL | AML | CML
--- | --- | --- | --- | ---
Cells affected | Lymphocytes | Lymphocytes | Myeloid cells | Myeloid cells
Maturity of affected cells | Immature | Mature | Immature | Mature
Rate of progression | Fast | Slow | Fast | Slow
Disease course | Rapidly fatal (<6 months w/o treatment) | Indolent disease course (2-6 years w/o treatment) | Rapidly fatal (<6 months w/o treatment) | Indolent disease course (2-6 years w/o treatment)
Group affected | Mostly children | Old aged people | Adults | Adults and rarely in children
Survival rate | Age <50: 75%; others: 25% | Age <50: 94%; others: 83% | Age <50: 55%; others: 14% | Age <50: 84%; others: 48%
Sub-types | L1 - L3 | - | M0 - M7 | -
Fig. 4. Flow-through system (a) Basic illustration, (b) Forward and side scatter of light.
The accuracy of these systems has always been a cause of concern for researchers. The early DIP-based
WBC analyzers, such as the leukocyte automatic recognition computer (LARC) and Hematrak, developed in the
mid-1960s, disappointed the community with their performance. LARC misclassified 68.6% of abnormal
cells as normal [8, 9], whereas Hematrak [10] had a false-positive rate of 5.5% and a false-negative rate of 12%
[11–13]. The community formed the impression that the performance of these systems could never match or exceed
human performance. Consequently, there was more reliance on flow-through systems for the automation of
hematology laboratories during the 1980s. Due to the advancements in pattern recognition through artificial
neural networks (ANNs), the focus again shifted to the development of sophisticated and robust DIP techniques
for automated WBC analysis during the first decade of the twenty-first century. In 2001, CellaVision® launched
its first hematology analyzer in Europe. Since then, many commercial products such as CellaVision® DM8,
CellaVision® DC-1, CellaVision® DM6, CellaVision® DM1200, CellaVision® DM9600, Sysmex DI-60, Nextslide,
EasyCell, Vision Hema, and Cobas m511 have been launched. These systems have classification accuracy in the range
82% to 95.4% [14] and are useful in significantly reducing the workload and manpower requirements in
hematology laboratories. Although the performance of these systems is good and close to that of a
human expert, researchers continue to pursue algorithms with advanced pattern recognition and machine
learning techniques that outperform humans in WBC analysis. After the breakthrough success of Krizhevsky et
al.’s deep convolutional neural network (CNN) model, AlexNet [15], in the ImageNet classification challenge in
2012, deep learning has become a popular machine learning paradigm. Recently, deep learning has been
applied for automatic classification of leukocytes in blood smear images [16–19] and for automatic diagnosis of
leukemia [20–22].
6 DATASETS
The following public datasets are used in studies related to the detection and diagnosis of leukemia from the
blood smear images.
Fig. 5. Workflow pipeline of DIP systems for morphologic analysis of leukocytes in blood smear images
(1) Acute Lymphoblastic Leukemia Image Database for Image Processing (ALL-IDB): This dataset [23] is designed
to segment and classify ALL blast (i.e., immature or precursor) cells. It consists of images shot using a Canon
PowerShot G5 camera. They are available in .jpg format with a resolution of 2592 × 1944 and 24-bit color
depth. Two versions of this dataset are available, ALL-IDB1 and ALL-IDB2. The ALL-IDB1 dataset contains
108 labeled blood slide images. Each .jpg file has an associated .xyc file, which contains the x,y coordinates
of the centers of the leukemic cells. The second version of the dataset, ALL-IDB2, contains regions of interest, i.e.,
segmented blast cells, from the images of ALL-IDB1.
(2) American Society of Hematology (ASH) Dataset: This dataset [24] is an online library that contains images of
blood slides of various diseases. It has more than 4,879 images of ALL, various types of lymphoma, myeloid
disorder, anemia, etc.
(3) C-NMC Leukemia Dataset: This dataset [25] is made available as a part of the ALL challenge in ISBI-2019 (see footnote 2). The
dataset consists of approximately 15,000 images of resolution 450 × 450 available in .bmp format with the
labels available in .csv format.
(4) The Munich Acute Myeloid Leukemia (M-AML) Morphology Dataset: This dataset [26] contains around
18,000 annotated images of single cells for AML. They have been segmented from the blood slides under
the guidance of experts. Each image is available in .tif format and has a dimension of 400 × 400. Various
morphological classes have been explained and summarised in .txt format whereas the labels are available
in .dat format.
(5) SN-AM Dataset: This dataset [27] contains bone marrow slides of acute lymphoid leukemia (B-ALL) and
multiple myeloma (MM). The images are captured using a Nikon Eclipse-200 camera and are available in
.bmp format with a resolution of 2560 × 1920. The dataset contains 90 images of B-ALL and 100 images of
MM.
(6) Clinical Proteomic Tumor Analysis Consortium Acute Myeloid Leukemia (CPTAC-AML) Dataset: This dataset
[28] is used to determine the cause of cancer on a molecular basis. Radiologic images and pathology reports
are collected to investigate the phenotype of cancer. The tissue slide images are available in .svs format,
whereas the clinical data is in .json files.
(7) Hematology Atlas (HA): This dataset [29] contains images related to leukemia, anemia, parasite, and fungal
diseases. It contains 88 images covering various types of leukemia such as ALL, AML, acute plasmacytic
leukemia, hairy cell leukemia.
The available datasets used for automated detection and diagnosis of leukemia are consolidated in Table 4.
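As an illustration of how the centroid annotations of ALL-IDB1 might be consumed, the sketch below parses the contents of a .xyc file into a list of (x, y) tuples. The exact file layout is an assumption made here (one whitespace-separated "x y" pair per line); the function name `parse_xyc` and the sample coordinates are hypothetical and should be checked against the dataset's own documentation.

```python
def parse_xyc(text):
    """Parse centroid annotations, assuming one whitespace-separated 'x y' pair per line."""
    centroids = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2:
            continue  # skip blank or malformed lines
        x, y = int(float(fields[0])), int(float(fields[1]))
        centroids.append((x, y))
    return centroids


# Hypothetical file contents; these coordinates are made up for the example.
sample = "1160\t684\n540\t1208\n\n"
centroids = parse_xyc(sample)
print(centroids)
```

In practice the text would be read from the annotation file that accompanies each image, and each centroid could be checked against the 2592 × 1944 image bounds stated above.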
7 COMPUTER-AIDED DIAGNOSIS OF LEUKEMIA
Computer-aided diagnosis (CADx) systems for leukemia detection take a blood smear image as an input, perform
morphologic analysis on the leukocytes present in it, and classify it as normal or leukemia positive. They further
classify the leukemia-positive image according to leukemia type: ALL, CLL, AML, or CML. Some systems also
classify ALL- and AML-positive images according to their subtypes, L1-L3 and M0-M7, respectively, and generate
the count for normal and abnormal, mature and immature leukocytes. The general workflow pipeline followed
by these CADx systems is shown in Figure 5. As illustrated, it consists of four stages: pre-processing, segmentation,
feature engineering, and classification.
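The four-stage structure can be sketched as a chain of functions. The skeleton below is only a structural illustration under loud assumptions: every stage body is a toy placeholder invented here (intensity rescaling, a fixed threshold, one feature, a cutoff rule), the image is a nested list of grayscale values, and the cutoff has no clinical meaning. Real systems replace each placeholder with the techniques reviewed in the following subsections.

```python
def preprocess(image):
    """Placeholder pre-processing: rescale intensities to the range 0..1."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1
    return [[(p - lo) / span for p in row] for row in image]


def segment(image, threshold=0.5):
    """Placeholder segmentation: mark pixels darker than the threshold as foreground."""
    return [[1 if p < threshold else 0 for p in row] for row in image]


def extract_features(mask):
    """Placeholder feature engineering: the fraction of foreground pixels."""
    total = sum(len(row) for row in mask)
    return {"foreground_fraction": sum(map(sum, mask)) / total}


def classify(features, cutoff=0.3):
    """Placeholder rule standing in for a trained classifier (cutoff is arbitrary)."""
    return "abnormal" if features["foreground_fraction"] > cutoff else "normal"


def run_pipeline(image):
    """Chain the four stages: pre-processing, segmentation, features, classification."""
    return classify(extract_features(segment(preprocess(image))))


sparse = [[200, 200, 200], [200, 50, 200], [200, 200, 200]]  # one dark pixel
dense = [[50, 50], [50, 200]]                                 # mostly dark pixels
print(run_pipeline(sparse), run_pipeline(dense))
```

Keeping each stage behind its own function makes it straightforward to swap in a different filter, segmentation method, or classifier without touching the rest of the chain.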
2 International Symposium on Biomedical Imaging (ISBI) challenge for Classification of Normal versus Malignant Cells (C-NMC)
7.1 Pre-processing
It is the first stage of the workflow pipeline, in which the following operations are selectively applied to the input
blood smear image so that it is suitable for the subsequent stages.
(1) Noise filtering: The blood smear images are susceptible to both spatially uncorrelated random noise and
spatially correlated fixed-pattern noise (FPN) [30]. Random noise is removed by applying various spatial
filters such as average [30, 31], median [32–34], Gaussian [35–37], and Wiener [38–40] filters. Although
these filters alleviate noise, they simultaneously smooth out the input image’s sharp details, such as cell
boundaries, on which the following segmentation stage highly depends. Therefore, in conjunction with
noise filtering, edge-preserving techniques such as the extremum operator [32], unsharp masking [33], high
boost filtering [35], and the Kuwahara filter [41] are also applied. After the removal of random noise, the FPN
is removed using re-calibration and background subtraction techniques [30].
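As a small illustration of the median filter mentioned in item (1), the sketch below applies a 3 × 3 median to a grayscale image stored as nested lists. The window size, the index-clamping border rule, and the function name `median_filter3` are choices made here for the example, not settings taken from the cited works.

```python
from statistics import median


def median_filter3(image):
    """Apply a 3x3 median filter; border pixels are handled by clamping indices."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows):
        out_row = []
        for c in range(cols):
            window = [
                image[min(max(r + dr, 0), rows - 1)][min(max(c + dc, 0), cols - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out_row.append(median(window))
        out.append(out_row)
    return out


# A flat region with one impulse ("salt") pixel, and a vertical step edge.
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
edge = [[0, 0, 100, 100] for _ in range(3)]

denoised = median_filter3(noisy)
filtered_edge = median_filter3(edge)
print(denoised[2][2], filtered_edge[1])
```

The impulse is removed while the step edge keeps its position, which is why median filtering blurs boundaries less than averaging on this kind of noise.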
(2) Contrast enhancement: The digital image of a well-prepared stained blood smear slide has a high contrast
difference between the blood cells and plasma, and between the nucleus and cytoplasm of cells. Various techniques
use this contrast difference for leukocyte segmentation. However, use of an incorrect concentration of the ink
solution [42], unstable staining, and underexposure or overexposure [43] during slide preparation may
result in a stained blood smear image with poor contrast. Therefore, contrast enhancement needs to
be done first for effective contrast-based segmentation of WBCs, their nucleus, and cytoplasm.
The global contrast of the blood smear image is enhanced through histogram equalization [44, 45],
linear contrast stretching [46, 47], or a combined application of both techniques [48–52]. Histogram
equalization uses a non-linear transformation function that is simple and efficient to compute. However,
it sometimes results in severe effects such as contrast loss in the background and small regions [53, 54]. This
limitation of histogram equalization is avoided by linear contrast stretching, which uses a linear scaling
function and results in contrast enhancement with less severe effects. The global contrast enhancement
techniques use the intensity information of the whole image and increase the overall contrast. However,
they do not adapt to the local brightness in the input image. Therefore, some systems use local contrast
enhancement techniques such as contrast-limited adaptive histogram equalization (CLAHE) [34, 55, 56] for
increasing the contrast of selective regions (such as cells) in the blood smear image.
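The two global techniques named in item (2) can be contrasted on a toy image. The sketch below is a minimal illustration assuming an 8-bit grayscale image stored as nested lists of integers; the function names are chosen here, and CLAHE is not covered because it additionally requires tiling and clip limits.

```python
def contrast_stretch(image, out_max=255):
    """Linear contrast stretching: map [min, max] of the image linearly onto [0, out_max]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        return [row[:] for row in image]  # flat image: nothing to stretch
    return [[round((p - lo) * out_max / (hi - lo)) for p in row] for row in image]


def equalize_histogram(image, levels=256):
    """Global histogram equalization using the cumulative distribution as a lookup table."""
    flat = [p for row in image for p in row]
    hist = [0] * levels
    for p in flat:
        hist[p] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)
    cdf_min = next(v for v in cdf if v > 0)
    n = len(flat)
    if n == cdf_min:
        return [row[:] for row in image]  # single gray level: nothing to equalize
    lut = [max(0, round((cdf[v] - cdf_min) * (levels - 1) / (n - cdf_min))) for v in range(levels)]
    return [[lut[p] for p in row] for row in image]


low_contrast = [[100, 110], [120, 130]]
skewed = [[100, 100], [100, 130]]
stretched = contrast_stretch(low_contrast)
equalized = equalize_histogram(skewed)
print(stretched, equalized)
```

Stretching preserves the relative spacing of gray levels, whereas equalization redistributes them according to how often each level occurs, which is the non-linear behavior described above.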
Table 4. Publicly available datasets used for leukemia detection

| Dataset | Year | Source/Creator | Size | Resolution | Format | Type | Image Type |
|---|---|---|---|---|---|---|---|
| ALL-IDB | 2005 | University of Milan | 200 MB | 2592 × 1944 | .jpg | ALL | Blood slides |
| ASH | 2016 | ASH Image Bank | - | - | - | Mixed | Blood slides |
| C-NMC | 2019 | IIIT, Delhi and AIIMS, Delhi | 10.44 GB | 450 × 450 | .bmp | ALL | Single cell |
| M-AML | 2019 | Munich University Hospital | 11 GB | 400 × 400 | .tif | AML | Single cell |
| SN-AM | 2019 | The Cancer Imaging Archive (TCIA) | 2.9 GB | 2560 × 1920 | .bmp, .tif | B-ALL and MM | Blood slides |
| CPTAC-AML | 2020 | National Cancer Institute | 378 GB | - | .svs, .json | AML | Tissue slides |
| HA | - | Dr. Nivaldo Medeiros | - | - | - | Mixed | Blood slides |

Fig. 6. Sequence of steps generally followed to segment leukocytes and their components from the blood smear images

(3) Color space conversion: The digital image of a blood smear slide acquired using a color camera attached
to a microscope has an RGB color model by default. The RGB color model is apt for color creation and
manipulation. However, it has a few limitations that make it unsuitable for blood smear image processing. The
color space produced by the RGB model is non-linear and discontinuous. The discontinuities in the color
space make changes in color hue hard to follow. Moreover, the color hue in the RGB model is affected by
illumination changes that make its tracking and analysis nontrivial [57].
Many researchers have, therefore, changed the color model of the input image during the pre-processing
stage. It has been changed to the HSV color model in [47, 58–60], the HSL color model in [61, 62], the HSI color
model in [46, 63, 64], and the perceptually uniform CIE color space with the CIELab color model in [65, 66]. A few
researchers have tried using different color models for the segmentation of leukocytes, nucleus, and background.
Moshavash et al. [67] used the CIELab and CMYK color models for the segmentation of leukocytes and
background, respectively. Putzu et al. used CMYK for leukocyte segmentation and the CIELab color model for
nucleus identification [49, 50]. In contrast, Mohapatra et al. used CIELab for leukocyte segmentation and the
HSV color model for nucleus segmentation [68–70].
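The motivation for leaving RGB can be illustrated with a small sketch using Python's standard `colorsys` module. It assumes, as a simplification, that an illumination change acts as a uniform gain on the R, G, and B values; the helper name `hsv_under_illumination` is hypothetical. Under that assumption all three RGB values change, whereas hue and saturation in HSV stay the same and only the value channel moves.

```python
import colorsys

def hsv_under_illumination(rgb, gain):
    """Return the HSV triple of an RGB pixel after a uniform brightness gain.

    rgb  : (r, g, b) with components in [0, 1]
    gain : multiplicative illumination factor (assumed model, see lead-in)
    Components are clipped at 1.0; once clipping occurs (overexposure), hue
    and saturation are no longer preserved.
    """
    r, g, b = (min(1.0, c * gain) for c in rgb)
    return colorsys.rgb_to_hsv(r, g, b)

# A purple, nucleus-like pixel seen at full and at half illumination.
pixel = (0.4, 0.2, 0.6)
h_full, s_full, v_full = colorsys.rgb_to_hsv(*pixel)
h_dim, s_dim, v_dim = hsv_under_illumination(pixel, 0.5)
# h_dim equals h_full and s_dim equals s_full (up to rounding); only v changes.
```

This decoupling of chromaticity from intensity is what the thresholding methods on the S channel rely on.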
(4) Data augmentation: Deep learning algorithms require a large number of diverse training samples to tune
their trainable parameters properly. This requirement is generally not satisfied in the medical domain,
where data are scarce and the costs of data collection and labeling are high. It is thus one
of the major problems restricting the seamless application of deep learning for medical image analysis.
Data augmentation alleviates this problem by significantly increasing the number of training samples
without actually collecting new data. It includes operations such as horizontal and vertical flip, rotation,
random crop, and padding. On leukemia datasets, Kassani et al. [21] performed data augmentation using
horizontal flip, vertical flip, and contrast adjustment to increase the size of the dataset nearly eightfold.
Shafique [22] and Ghosh et al. [71] performed augmentation using image rotation and mirroring. Mourya
et al. [72] applied affine data augmentation techniques such as shearing and Gaussian blur to improve
the performance of their LeukoNet classifier. The pros and cons of various preprocessing techniques are
mentioned in Table 5.
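As a rough illustration of the geometric augmentations mentioned in item (4), the following NumPy sketch generates the eight flip and 90-degree rotation variants of an image. It is not the augmentation pipeline of any cited work; the function name is hypothetical and the image is assumed to be an array of shape (height, width, channels).

```python
import numpy as np

def augment_flips_rotations(image):
    """Return 8 variants: rotations by 0/90/180/270 degrees, each with and
    without a horizontal mirror. The first variant is the input itself."""
    variants = []
    for k in range(4):
        rotated = np.rot90(image, k=k, axes=(0, 1))   # rotate in the image plane
        variants.append(rotated)
        variants.append(np.flip(rotated, axis=1))     # mirror left-right
    return variants
```

Because stained cells have no preferred orientation, such label-preserving transforms are a natural fit for blood smear images; photometric changes such as contrast adjustment would be added as separate steps.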
Table 5. Preprocessing operations applied on input blood smear images

| Op. | Method | Pros (▲) | Cons (▼) | References |
|---|---|---|---|---|
| Noise filtering | Average filtering | ▲ Simple, intuitive, easy to implement; ▲ Eliminates variation among a pixel and its neighborhood; ▲ Degree of smoothing can be controlled by selecting appropriate kernel size | ▼ Edge blurring and loss of sharp image details; ▼ High dependence of results on the size of the kernel used | [30, 31] |
| Noise filtering | Median filtering | ▲ Preserves sharp features; ▲ Less sensitive to outliers | ▼ Mild signal-to-noise ratio (SNR) gain; ▼ Fails to suppress medium-tailed Gaussian noise | [32–34] |
| Noise filtering | Gaussian filtering | ▲ Computationally efficient; ▲ Higher significance for pixels near edge; ▲ Rotationally symmetric; ▲ Degree of smoothing can be controlled by σ | ▼ Loss of fine image details and contrast; ▼ Not suited for salt-and-pepper noise | [35–37] |
| Noise filtering | Wiener filtering | ▲ Uses the statistical properties of image to remove noise; ▲ Removes additive noise and inverts blurring | ▼ Prior information about the power spectral density of the original must be known; ▼ Relatively slow; ▼ Used metric, mean square error (MSE), is not always relevant | [38–40] |
| Contrast enhancement | Histogram equalization (HE) | ▲ Fairly simple and invertible operation; ▲ Computationally inexpensive; ▲ Effective for grayscale images | ▼ Sometimes results in artifacts such as contrast loss in background and small regions | [44, 45, 48–52] |
| Contrast enhancement | Linear contrast stretching | ▲ Linear mapping of input to output values; ▲ Less severe artifacts as compared to HE | ▼ Mild SNR gain; ▼ Sensitive to outliers | [46–52] |
| Contrast enhancement | CLAHE | ▲ Prevents over-amplification of noise; ▲ Improved contrast enhancement and better results | ▼ Operates on small regions; ▼ Computationally slow, requires large number of operations | [34, 55, 56] |
| Color space conversion | RGB→HSV, RGB→HSL, RGB→HSI | ▲ Based on human color perception; ▲ Chromaticity is decoupled from intensity; ▲ Robust to non-uniform illumination | ▼ Conversion requires a non-linear transformation which generates non-removable singularities | [46, 47, 58–64] |
| Color space conversion | RGB→CIELab | ▲ Perceptually uniform color space; ▲ Efficient in measuring small color difference; ▲ Chromaticity is decoupled from intensity | ▼ Conversion requires non-linear transformation; ▼ Problem of non-removable singularities | [49, 50, 65–70] |
| Data augmentation | Geometric / Color space / Feature space augmentations | ▲ Artificially increases the number of training samples; ▲ Avoids overfitting; ▲ Alleviates class imbalance; ▲ Improves model performance | ▼ Augmented data distribution can be quite different from original one; ▼ Data augmentation is based on approximations and decisions based on data augmentation should be treated accordingly | [21, 22, 71, 72] |

7.2 Segmentation
After the preprocessing stage, the preprocessed blood smear image is segmented to extract the blood cells and
their nucleus, or both nucleus and cytoplasm [52]. It is a challenging task because blood cells are of different
sizes and shapes and may overlap. Due to its complicated nature, it is generally performed as a sequence of steps,
as indicated in Figure 6. Based on the underlying approach used to segment the leukocytes and their components
from the blood smear image, the segmentation methods are classified as:
(1) Rule-based methods: These methods use heuristic rules formulated from prior knowledge about the
morphologic characteristics of leukocytes, their nucleus, and cytoplasm. Several methods have been proposed
to segment WBCs using the rules based on intensity, color, shape, and texture. As depicted in Figure 7, the
rule-based WBC segmentation methods apply several rules in sequence to obtain the desired result. The
advantage of these methods is that the constituent rules can be applied in different orders, giving notable
freedom. Based on the prominence of the applied rules, the rule-based methods are further classified as:
(a) Intensity-thresholding based methods: Intensity thresholding is one of the simplest methods that has
been used on its own or in conjunction with other approaches to segment leukocytes from the
blood smear image. Rawat et al. [73] used a global threshold and morphological operations to segment
leukocytes' nuclei. Leukocytes' cytoplasm is then extracted by subtracting the extracted nuclei. The
intensity threshold is estimated using prior knowledge about the blood smear images and is fine-tuned
by maximizing the number of edge points with large gradients. The segmented regions are smoothed
using morphological operations such as dilation and erosion to remove single points and lines caused
by noise. Abd Halim et al. [74] used a fixed global threshold, τ = 100, on the S component of ALL and
AML images converted to the HSI color space for leukocyte segmentation. The adaptive thresholding-based
methods, such as Otsu thresholding [75] and the Zack algorithm [76], are found to be better than local and
global thresholding-based methods. Otsu's method along with different postprocessing techniques has been
used in [31, 77] for segmentation of leukocytes' nuclei. The Zack algorithm has been used for segmentation
of microscopic cells in [67, 78–80]. Instead of grayscale thresholding, Toh et al. [81] used color-based
thresholding for leukocyte segmentation. They segmented the blood smear image based on the color
ratio of R, G, B values. The G to B color value ratio has been selected as 0.85 and 0.75 for segmenting the
nucleus and cytoplasm, respectively. The median filter is then applied to the resultant image to remove
noise.

Fig. 7. Work-flow pipeline of the rule-based WBC segmentation methods
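Otsu's method, mentioned above as an adaptive alternative to a fixed threshold, picks the gray level that maximizes the between-class variance of the image histogram. The sketch below is a plain NumPy version for 8-bit grayscale input; the function name is hypothetical, and the polarity of the resulting mask (bright pixels as foreground) is an assumption, since stained nuclei are often darker than the background and would then need the inverted comparison.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t in [0, 255] maximizing between-class variance.

    Class 0 holds pixels <= t and class 1 holds pixels > t.
    """
    hist = np.bincount(gray.ravel(), minlength=256).astype(np.float64)
    prob = hist / hist.sum()
    levels = np.arange(256)
    w0 = np.cumsum(prob)               # weight of class 0 for each candidate t
    mu = np.cumsum(prob * levels)      # cumulative first moment up to t
    mu_total = mu[-1]
    w1 = 1.0 - w0
    valid = (w0 > 0) & (w1 > 0)        # both classes must be non-empty
    between = np.zeros(256)
    between[valid] = (mu_total * w0[valid] - mu[valid]) ** 2 / (w0[valid] * w1[valid])
    return int(np.argmax(between))

# Usage: binary mask of pixels brighter than the Otsu threshold.
# mask = gray > otsu_threshold(gray)
```

The criterion depends only on the histogram, which is why the method works best when the histogram is bimodal and ignores spatial information entirely.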
(b) Edge detection-based methods: These methods leverage the good contrast difference between the cellular
components of blood and the plasma. They aim at detecting the intensity discontinuities in the
image to delineate the cellular boundaries. The edge detectors such as Sobel and Canny have been used
to determine nucleus boundary in many studies including [58, 68, 82] and [33, 70, 83], respectively. Many
researchers [37, 66] have used a mix of edge detectors for leukocyte segmentation. The major limitation
of these methods is that they provide low-level information in terms of edge points. The isolated edge
points must be grouped into structures to extract high-level information about the blood cells. Such
grouping is challenging and is mostly done using the circular Hough transform (CHT). CHT works by letting
each edge point (x,y) vote in a 3D accumulator cell (a,b,r). An accumulator cell with the maximum
number of votes defines a circle with (a,b) as center and radius r. Safuan et al. [84] applied CHT to
identify and count WBCs in blood smear images. A combination of the S channel from HSV color space and
the C channel from CMYK color space is used as input to the algorithm. They found that a circle radius of
10–12 pixels is best suited to detect WBCs even in the clump region. Sudha et al. [85] used Gradient CHT
(GCHT) in conjunction with the edge strength-based Grabcut method to segment and count overlapped
leukocytes in blood smear images. CHT, along with other methods, has been used to detect WBCs in
[86–88]. Although CHT is widely used for leukocyte segmentation, it faces issues with closely placed
cells and stains.
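The CHT voting scheme described above can be sketched as follows. Each edge point (x, y) votes for every centre (a, b) that lies at distance r from it, using a = x − r cos θ and b = y − r sin θ, and the accumulator cell with the most votes gives the detected circle. This is a deliberately unoptimized illustration; the function name, the angular sampling, and the rule of one vote per edge point per cell are assumptions of the sketch, not details taken from the cited works.

```python
import numpy as np

def circular_hough(edge_points, shape, radii, n_angles=360):
    """Vote in a 3D accumulator indexed as acc[radius_index, b, a].

    edge_points : iterable of (x, y) pixel coordinates of detected edges
    shape       : (height, width) of the image
    radii       : list of candidate radii in pixels
    Returns the best (a, b, r) and the full accumulator.
    """
    height, width = shape
    acc = np.zeros((len(radii), height, width), dtype=np.int32)
    thetas = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
    for r_idx, r in enumerate(radii):
        for x, y in edge_points:
            # Candidate centres lie on a circle of radius r around the edge point.
            a = np.rint(x - r * np.cos(thetas)).astype(int)
            b = np.rint(y - r * np.sin(thetas)).astype(int)
            ok = (a >= 0) & (a < width) & (b >= 0) & (b < height)
            votes = np.zeros((height, width), dtype=bool)
            votes[b[ok], a[ok]] = True          # one vote per cell per edge point
            acc[r_idx] += votes
    r_idx, b, a = np.unravel_index(np.argmax(acc), acc.shape)
    return int(a), int(b), radii[r_idx], acc
```

Restricting `radii` to a narrow range, such as the 10–12 pixels reported by Safuan et al. [84], keeps the accumulator small; closely packed cells produce overlapping vote patterns, which is the source of the difficulties noted for clumped regions.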
(c) Region-based methods: Region-based methods exploit the coherence properties of the image pixels to group
them into homogeneous regions. A class of region-based methods, known as seed-based region growing
(SBRG) methods, requires an initial seed and propagates outwards in different directions from the seed.
They examine the properties of the neighboring pixels of the seed recursively during the propagation
and group pixels with similar properties into coherent regions. Harun et al. [91] applied a modification
of the SBRG method, namely the seeded region growing area extraction (SRGAE) method, for leukocyte
segmentation. The method focuses on the properties of each pixel rather than that of the whole region
and is thus less prone to errors due to poor lighting or noisy boundaries. Another class of region-based
methods applies the watershed method [99] for leukocyte segmentation [31, 58, 67, 79, 80, 93, 95–97]. The
watershed method is generally tuned to avoid over-segmentation, and morphological operations are
applied after the segmentation to smooth the boundary between the nucleus and cytoplasm. Some
region-based segmentation methods use clustering to group adjacent pixels with similar characteristics
into the same cluster in each region [100]. Various researchers have used K-means clustering (KMC)
[66, 68, 89, 91, 92, 95, 96], Shadowed C-means clustering (SCM) [90, 101, 102], rough K-means clustering
(RKM) [103], fuzzy C-means (FCM) clustering [33, 91, 104], and K-medoids clustering [97] for segmenting
WBCs, their nucleus, and cytoplasm. Jabar et al. [105] compared the performance of three different types of
clustering, classical K-means (CKM), FCM, and adaptive K-means (AKM) clustering, for the segmentation
of acute leukemia blood cells. They concluded that the AKM-clustering technique, which uses the values of
local minima and maxima for initialization of the K centroids, gave the best results on the AML dataset.
Harun et al. [91] also compared the KMC, FCM, and moving K-means (MKM) clustering
methods for segmenting whole blood cells from the blood smear images. Both qualitative and quantitative
performances are analyzed for each of these algorithms. They concluded that the MKM-clustering method
outperforms the other two algorithms in segmenting ALL and AML blast cells.
The consolidated information about rule-based leukocyte segmentation methods is presented in Table 6.

Table 6. Rule-based leukocyte segmentation methods

| Cat. | Ref. | Year | Pre [P]/Post [PO]-processing stage† | Rules⋆ | Dataset(#) and Performance‡ |
|---|---|---|---|---|---|
| Intensity thresholding-based | [74] | 2011 | [P]-GCS, RGB→HSI; [PO]-MF, RG | Thresholding on S component of HSI (τ = 100) | CD (ALL and AML images) |
| Intensity thresholding-based | [77] | 2013 | [P]-GC; [PO]-CED, MO | Otsu thresholding | CD (140: 93-CLL, 47-Normal), SA- N: 99.92%, WC: 99.85%, C: 99.63% |
| Intensity thresholding-based | [31] | 2016 | [P]-RGB→L*a*b*; [PO]-MO, AF, WS | Otsu thresholding | CD (70), SA- NC: 90.2%, LC: 82.4% |
| Intensity thresholding-based | [78] | 2016 | [P]-RGB→CMYK; [PO]-MCW | Zack thresholding | ALL-IDB1 |
| Intensity thresholding-based | [73] | 2017 | [P]-GC, MF, HE; [PO]-MO | Global thresholding | ALL-IDB2 |
| Intensity thresholding-based | [81] | 2018 | [PO]-MF | Color thresholding | CD, SA- AML: 97.63%, ALL: 97.64% |
| Intensity thresholding-based | [79] | 2018 | [P]-RGB→HSV, HE; [PO]-MO, CB, WS, DT | Zack thresholding | ALL-IDB1 |
| Intensity thresholding-based | [67] | 2018 | [P]-RGB→L*a*b* (NS), RGB→CMYK (BS); [PO]-WS | Zack thresholding (BS), IFD-based thresholding (NS) | ALL-IDB and CD, SA-N: 76% |
| Intensity thresholding-based | [80] | 2019 | RGB→CMYK, HE; [PO]-WS | Zack algorithm | ALL-IDB1 |
| Edge detection-based | [33] | 2010 | [P]-RGB→L*a*b*, MF, UM; [PO]-CB | FCM and CED | CD (108) |
| Edge detection-based | [68] | 2010 | [P]-RGB→L*a*b*, MF, UM; [PO]-CB | KMC and SED | CD (108) |
| Edge detection-based | [70] | 2011 | [P]-MF, UM; [PO]-CB | SSC and CED | CD (108) |
| Edge detection-based | [58] | 2012 | [P]-RGB→HSV; [PO]-MO, WS | SED on S channel | CD (50), SA-94.5% (Overall), M2-94.58%, M5-95.06%, M6-95.65% |
| Edge detection-based | [66] | 2014 | [P]-RGB→L*a*b*; [PO]-MO | KMC, SED and CED | CD (80; 40-AML, 40-Non-AML) |
| Edge detection-based | [83] | 2016 | [P]-GC; [PO]-MO | CED | CD (260) |
| Edge detection-based | [82] | 2018 | [P]-GC, HE; [PO]-MO | Otsu thresholding and SED | ALL-IDB1 |
| Edge detection-based | [84] | 2018 | [P]-RGB→L*a*b*, NORM; [PO]-MO | Otsu thresholding and CHT | ALL-IDB, Accuracy-98.87% |
| Edge detection-based | [37] | 2019 | [P]-Scaling, RGB→HSV; [PO]-GNF | CED and SED | ALL-IDB |
| Edge detection-based | [85] | 2020 | [P]-RGB→HSV | Grabcut and GCHT | ALL-IDB, Precision-99.32%, Recall-98.05%, F-measure-98.67% |
| Region-based | [89] | 2014 | [P]-LCS, RGB→HSI; [PO]-MF | KMC | CD (50) |
| Region-based | [90] | 2014 | [P]-SF, LCS | SCM | CD (104; 54-AML, 50-Normal) |
| Region-based | [91] | 2015 | [P]-RGB→HSI; [PO]-MF, RG | KMC, FCM, MKMC | CD (100-ALL, 100-AML), SA-92.26% (KMC), 96.31% (FCM), 97.02% (MKMC) |
| Region-based | [92] | 2015 | [P]-MF, RGB→L*a*b*; [PO]-MO, RG | KMC | ALL-IDB |
| Region-based | [93] | 2015 | [P]-GC, MO | WS | ALL-IDB, SA-95.56% |
| Region-based | [94] | 2016 | [P]-LCS, RGB→HSI; [PO]-MF | Thresholding and SBRG | CD (M1-M5, M7 images) |
| Region-based | [95] | 2018 | [P]-MF, RGB→L*a*b*; [PO]-WS, DT | KMC | CD (M4, M5, M7), SA-87.72% |
| Region-based | [96] | 2018 | [PO]-MF, UM, WS, MO | KMC | CD (100) |
| Region-based | [97] | 2019 | [P]-RGB→L*a*b*; [PO]-Binarization, WS | K-medoids clustering [98] | CD (600, ALL-IDB1+HA+Internet images) |

Pros (▲) and Cons (▼) by category:
Intensity thresholding-based: ▲ Easy to implement; ▲ Fast and computationally efficient; ▲ No prior information required; ▲ Otsu's and Zack's methods enable dynamic thresholding (DT); ▼ Highly sensitive to noise; ▼ Neglects spatial information; ▼ Threshold selection is crucial; ▼ Poorly selected threshold may result in under/over-segmentation; ▼ DT works well only when histogram has bimodal distribution; ▼ DT fails when object is small compared to background, variances of object and background intensities are large, or image is corrupted with additive noise
Edge detection-based: ▲ Easy to implement; ▲ No prior information required; ▼ Use low-level operations; ▼ Sensitive to noise; ▼ Need to group edge points into structures using edge linking/CHT; ▼ Edge linking/CHT operations are computationally expensive; ▼ CHT may give fallacious results in clumped regions; ▼ Needs post-processing
Region-based: ▲ Simple and easy to implement; ▲ Exploit coherence properties; ▲ Less prone to errors due to noise; ▼ Relatively computationally slow; ▼ Results are highly dependent upon seed location; ▼ Need fine tuning to avoid under/over-segmentation; ▼ Issue in determining number of clusters; ▼ Sometimes result in creation of empty clusters; ▼ Need post-processing

†-AF-Average filtering, BS-Background segmentation, CB-Cropping using bounding box, CED-Canny edge detection, DT-Distance transform, GC-Grayscale conversion, GCS-Global contrast stretching, GNF-Gaussian noise filtering, HE-Histogram equalization, LCS-Local contrast stretching, MCW-Marker-controlled watershed, MF-Median filtering, MO-Morphological operations, NORM-Normalization, NS-Nucleus segmentation, RG-Region growing, SBRG-Seed-based region growing, SED-Sobel edge detection, SF-Selective filtering, UM-Unsharp masking, WS-Watershed segmentation
⋆-CHT-Circular Hough transform, FCM-Fuzzy C-means clustering, GCHT-Gradient circular Hough transform, IFD-Intuitionistic fuzzy divergence, KMC-K-means clustering, MKMC-Moving k-means clustering, SCM-Shadowed C-means clustering, SSC-Semi-supervised clustering
‡-C-Cytoplasm, CD-Custom dataset, LC-Leukemia cells, N-Nucleus, NC-Normal cells, SA-Segmentation accuracy, WC-Whole cell

Fig. 8. Work-flow pipeline of the deformable model-based WBC segmentation methods
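Since K-means clustering is the most frequently used region-based rule in the surveyed works, a minimal NumPy sketch of the classical algorithm applied to pixel colors is given here. It illustrates the general technique only; the function names, the optional `init` argument, and the handling of empty clusters (keeping the previous centroid) are choices made for this example and are not taken from any cited method.

```python
import numpy as np

def assign(pixels, centroids):
    """Index of the nearest centroid (Euclidean distance) for every pixel."""
    dists = np.linalg.norm(pixels[:, None, :] - centroids[None, :, :], axis=2)
    return np.argmin(dists, axis=1)

def kmeans(pixels, k, init=None, n_iter=50, seed=0):
    """Cluster an (N, C) array of pixel feature vectors into k groups.

    init : optional (k, C) array of starting centroids; if omitted, k pixels
           are drawn at random, so results then depend on the seed.
    """
    pixels = np.asarray(pixels, dtype=np.float64)
    if init is None:
        rng = np.random.default_rng(seed)
        centroids = pixels[rng.choice(len(pixels), size=k, replace=False)].copy()
    else:
        centroids = np.asarray(init, dtype=np.float64).copy()
    for _ in range(n_iter):
        labels = assign(pixels, centroids)
        updated = centroids.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members):               # an empty cluster keeps its old centroid
                updated[j] = members.mean(axis=0)
        if np.allclose(updated, centroids):
            break
        centroids = updated
    return assign(pixels, centroids), centroids

# Usage on an H x W x 3 image (hypothetical variable `image`):
# labels, centers = kmeans(image.reshape(-1, 3), k=3)
# label_map = labels.reshape(image.shape[:2])
```

The dependence on the initial centroids and the possibility of empty clusters are visible directly in the code; variants such as adaptive or moving K-means differ mainly in how the centroids are initialized and updated.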
(2) Deformable model-based methods: The deformable models use flexible 2D or 3D curves to represent the
shape of an object to be segmented. After initialization, these curves evolve under the influence of internal
forces, external forces, and user-defined constraints to fit the object boundary. The internal forces keep
the model smooth during the evolution process, whereas the external forces push the model towards
the object boundary. These models have been successful in segmenting objects of diverse shapes
even under high variability in image quality [106]. The general workflow pipeline of deformable
model-based methods is shown in Figure 8. Based on how the model is defined in the shape domain, the
deformable model-based methods are classified as:
(a) Parametric model-based methods: These models use a small set of parameters to define a model that
represents the geometrical shape and appearance of the object to be delineated. The model is iteratively
updated by adjusting its parameters according to the internal and external image forces. When both forces
balance each other, an equilibrium is attained that describes the segmented output. These models are
advantageous as they use compact representations and converge faster. However, their major limitation
is that they can segment only a single object, and merging or splitting during the deformation is impossible.
Active contour models (ACMs), also called snakes [107], are parametric curves that have been used to segment
leukocytes' nuclei from the blood smear images in many studies including [108–111]. The irregular shape
of the cell nucleus makes its segmentation a challenging task. Snakes can easily deform their shape and
thus can segment the cell's nucleus effectively. After nucleus segmentation, other components such as
cytoplasm and the background can be separated from each other by using a thresholding-based method.
Another active contour model, the color gradient vector flow (GVF) snake, is used for segmenting cells in the
blood smear image [112]. The GVF snake is advantageous over other gradient-based ACMs because of
its insensitivity to contour initialization and its large capture region. Moreover, the GVF snake converges
in one-ninth of the time taken by an ordinary snake model. Ko et al. [61] applied the GVF snake on the green
component of the RGB colored blood smear image to segment the cells' cytoplasm instead of nuclei.
(b) Geometric model-based methods: Geometric models are based on the theory of evolution of curves in which
a curve or a surface is represented as a level set of a higher-dimensional scalar function. A 2-D curve can
be represented as a level set of a 3-variable function [120, 121]. Compared to their parametric counterparts,
these models are advantageous as they easily control the topological changes, are numerically stable,
and do not yield self-intersections. The level set method has been used by various researchers for the
nucleus segmentation of WBCs [113]. Some of them have applied the geometric model in conjunction
with other algorithms. Chinnathambi et al. [114] and Al-Dulaimi [116] used a modified version of the
level-set algorithm that includes the convolution of active contour and level set based on a piecewise
smooth function. The modification led to the smooth and proper segmentation of blast cells. Wenhua et al.
[115] applied the level set method followed by the Canny edge detector to resolve the ambiguous boundaries
in the segmentation. This method is found useful only for simple boundaries and not for complex
ones. Gharipour and Liew [117] implemented a level set-based approach in which a touching cell splitting
approach is used to minimize Bayesian classification risk. Al-Dulaimi et al. [118] proposed an automated
approach in which level sets are used to estimate initial cell boundaries with the help of morphological
operations. Geometric active contours are used further to adapt the topology of cell boundaries. Moallem
et al. [119] used the Chan-Vese level-set algorithm after obtaining the binary masks for corresponding cells to
obtain better contours for WBC boundaries. The consolidated information about deformable model-based
leukocyte segmentation methods is presented in Table 7.
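The claim that geometric models handle topological changes naturally can be made concrete with a toy example. In the sketch below, two circular contours are stored implicitly as the zero level set of a single function φ (negative inside, positive outside). Lowering φ by a constant moves the zero contour outward, and the two regions merge into one without any explicit bookkeeping of curve points. This is only an illustration of the implicit representation, not a level set segmentation algorithm; all names are hypothetical.

```python
import numpy as np

def circle_level_set(shape, center, radius):
    """Signed distance to a circle: negative inside, zero on the contour."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    return np.hypot(xx - center[0], yy - center[1]) - radius

def count_regions(inside):
    """Count 4-connected regions of True pixels with an iterative flood fill."""
    seen = np.zeros(inside.shape, dtype=bool)
    regions = 0
    for y, x in zip(*np.nonzero(inside)):
        if seen[y, x]:
            continue
        regions += 1
        seen[y, x] = True
        stack = [(y, x)]
        while stack:
            cy, cx = stack.pop()
            for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                if (0 <= ny < inside.shape[0] and 0 <= nx < inside.shape[1]
                        and inside[ny, nx] and not seen[ny, nx]):
                    seen[ny, nx] = True
                    stack.append((ny, nx))
    return regions

# Two separate "cells" (centers given as (x, y)) in one implicit function.
shape = (40, 60)
phi = np.minimum(circle_level_set(shape, (20, 20), 8),
                 circle_level_set(shape, (40, 20), 8))
separate = count_regions(phi < 0)        # two disjoint regions
merged = count_regions(phi - 3.0 < 0)    # contours moved outward: one region
```

A parametric snake would need explicit logic to detect and reconnect the two curves at the moment they touch, which is the limitation noted for parametric models above.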
Table 7. Deformable model-based leukocyte segmentation methods

| Cat. | Ref. | Year | Pre [P]/Post [PO]-processing stage† | Rules⋆ | Dataset(#) and Performance‡ |
|---|---|---|---|---|---|
| Parametric model-based | [108] | 2010 | [P]-BLF, RGB→YCbCr, Otsu thresholding | ACM | CD (20) |
| Parametric model-based | [61] | 2011 | [P]-MO, CED; [PO]-MF, RG | MSC, GVF snake | CD (WBCs-60) |
| Parametric model-based | [112] | 2011 | [P]-RGB→HSL | CED and GVF | CD (100), SA-88.4% (N), 67.7% (C) |
| Parametric model-based | [109] | 2014 | [P]-Intensity map based on G and B channels; [PO]-CFA | Otsu thresholding, ACM, MCWS | CD (650-WBCs), SA- PR: 94.09%, RC: 98%, FS: 96.01% |
| Parametric model-based | [110] | 2015 | [PO]-Parametric thresholding | ACM | CD |
| Parametric model-based | [111] | 2018 | [P]-RGB normalization, gamma correction | ACM | CD, Accuracy-99.38%, Specificity-97.92% |
| Geometric model-based | [113] | 2012 | [P]-Otsu thresholding (τ = 90); [PO]-MO | Level set | CD (100) |
| Geometric model-based | [114] | 2014 | [PO]-MO | Modified level set | CD |
| Geometric model-based | [115] | 2014 | [PO]-CED | Level set | CD |
| Geometric model-based | [116] | 2016 | [P]-GC, MO; [PO]-Binarization | Level set using GAC | ALL-IDB and CD, BDE-12.5, RI-0.93 |
| Geometric model-based | [117] | 2016 | [PO]-Cell splitting algorithm | Local level set method based on Bayesian risk and weighted image patch | CD (2292), SA: Jaccard-91.6%, MAD-3.5, Hausdorff distance-12.7, Dice FP-4.7, Dice FN-3.9 |
| Geometric model-based | [118] | 2016 | [P]-CE, MO; [PO]-Binarization | Level set via GAC | CD (1001), SA: BDE-4.96, JDE-0.15 |
| Geometric model-based | [119] | 2017 | [P]-GC, Otsu thresholding, RF, MO | Chan-Vese level set | CD (1300) |

Pros (▲) and Cons (▼) by category:
Parametric model-based: ▲ Model uses compact representations; ▲ Fast in convergence; ▲ Results in smooth and closed contours; ▲ Attains subpixel accuracy; ▲ Suited to segment cells with weak boundaries; ▼ Model topology adaptation during deformation is not possible; ▼ Can capture only single cell; ▼ Multiple models needed for multiple cells
Geometric model-based: ▲ Numerically stable; ▲ Easily control topological changes; ▲ Does not yield self-intersections; ▲ Results in smooth and proper segmentation; ▲ Robust to noise and spurious edges; ▼ High computational cost; ▼ Edge stopping function depends on image gradient; ▼ Only cells with edges defined by gradient can be segmented; ▼ Fitted curve may pass the cell boundary a little bit

†-BLF-Bilateral filtering, CE-Contrast enhancement, CED-Canny edge detection, CFA-Circle fit algorithm, GC-Grayscale conversion, MF-Median filtering, MO-Morphological operations, RF-Range filtering, RG-Region growing
⋆-ACM-Active contour model, GAC-Geometric active contour, GVF-Gradient vector flow, MCWS-Marker-controlled watershed segmentation, MSC-Mean shift clustering
‡-BDE-Boundary distance error, C-Cytoplasm, CD-Custom dataset, FN-False negative, FP-False positive, FS-F-score, JDE-Jaccard distance error, N-Nucleus, MAD-Mean absolute contour distance, PR-Precision, RC-Recall, RI-Rand Index, SA-Segmentation accuracy

Fig. 9. Work-flow pipeline of the machine learning-based WBC segmentation methods

(3) Machine learning-based methods: Machine learning-based methods, specifically deep learning (DL)-based
methods, have been extensively used in the medical domain to segment various anatomical structures
such as the lungs, heart, and liver [122]. These methods build a binary or multi-class classifier, train it using an
annotated dataset, and exhaustively classify each pixel as either belonging to the anatomical structure or
the background. Since these methods classify each pixel, they are also known as pixel classification (PC)-based
methods. The general workflow pipeline of the machine learning-based or pixel classification-based
methods is shown in Figure 9.
The performance of these methods is highly dependent on the quality and quantity of annotated training
data. Thus, if the classifier is properly trained, unlike rule-based methods and deformable model-based
methods, these methods demonstrate a high level of performance even when the anatomical structure is
deformed. The PC-based methods generally use an encoder-decoder architecture for segmenting leukocytes
from the blood smear images. In the encoder network, the convolution and pooling operations are
performed to learn the feature maps at different scales. The decoder network performs upsampling so that
the result and the input images are of the same size. Tran et al. [123] applied SegNet [124] to segment RBCs
and WBCs from peripheral blood smear images. The weights for the SegNet network are initialized using
the VGG16 network. Although the SegNet architecture results in better delineation of boundaries, it has a
limitation that only the pooling indices from an encoder layer are passed to the corresponding decoder
layer rather than complete feature maps. Zhang et al. [125] used an adversarial residual network to segment
the cell and nucleus of leukocytes. They used segmentation and discriminative networks in their work.
The sole purpose of using the hybrid adversarial network instead of the U-shaped network is to solve the
problem of inter-class consistency. Cheng et al. [126] used ResNet-50 as an encoder to extract features from
images. These features are passed to the region proposal network (RPN), which outputs anchor boxes using
a series of intersection-over-union (IoU) operations. Mask region-based convolutional neural network
(Mask RCNN) [127] is another PC-based method that separates various cells in an image at once. Yuan et
al. [128] applied this method for nuclei detection and segmentation. The blob detection method is initially
used to generate binary masks for training the network. It is then followed by erosion to adjust the mask's
size. The method does not require a labeled dataset, and the obtained results are satisfactory. Lu et al. [129]
used a dual-path network (DPN), which has advantages of both ResNet and DenseNet networks, to extract
features from blood smear images. It is followed by channel attention, a type of soft attention, to weigh
individual channels. It does so by calculating relationships between the channels and by compressing spatial
dimensions of the image. Finally, a feature decoder based on UNet is applied for pixel-to-pixel classification.
The pros and cons of deep learning-based leukocyte segmentation methods are listed in Table 8.
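The SegNet-style link between encoder and decoder mentioned above, where only the pooling indices are passed on, can be illustrated without a deep learning framework. The NumPy sketch below performs 2×2 max pooling on a single feature map, records which position in each window held the maximum, and then upsamples by writing each pooled value back to its recorded position. It is a simplified illustration for one channel with even height and width; the function names are hypothetical and no trainable layers are involved.

```python
import numpy as np

def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2; also returns the argmax index (0..3)
    inside each window, in row-major order within the window."""
    h, w = feature_map.shape
    blocks = (feature_map.reshape(h // 2, 2, w // 2, 2)
              .transpose(0, 2, 1, 3)
              .reshape(h // 2, w // 2, 4))
    return blocks.max(axis=2), blocks.argmax(axis=2)

def max_unpool_2x2(pooled, indices):
    """Upsample by placing each pooled value at its recorded position and
    filling the remaining positions of every window with zeros."""
    ph, pw = pooled.shape
    blocks = np.zeros((ph, pw, 4), dtype=pooled.dtype)
    rows, cols = np.indices((ph, pw))
    blocks[rows, cols, indices] = pooled
    return (blocks.reshape(ph, pw, 2, 2)
            .transpose(0, 2, 1, 3)
            .reshape(ph * 2, pw * 2))
```

The unpooled map is sparse: everything except the window maxima is lost, which is the limitation noted above when compared with architectures that forward complete encoder feature maps to the decoder.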
(4) Hybrid methods: These methods, also known as fusion-based segmentation methods, use a combination of
more than one technique to perform the segmentation task. Khobragade et al. [130] used a combination
of Otsu thresholding and Sobel operator to perform segmentation. Jagadev et al. [60] used a fusion of
three algorithms, namely K-means, marker-controlled watershed, and HSV color-based method for image
segmentation. Jha et al. [131] proposed a method based on the active contour model and FCM algorithm.
Hedge et al. [132] used the TissueQuant algorithm, Gaussian weighing functions, and HSI model to detect
the nuclei in the images. The major limitation of hybrid methods is that they are ad hoc, customized to
give good performance on a specific set of images, and do not generalize well on other datasets.
Table 8. Deep learning-based leukocyte segmentation methods

| Cat. | Ref. | Year | Pre [P]/Post [PO]-processing stage† | Rules⋆ | Dataset(#) and Performance‡ |
|---|---|---|---|---|---|
| Pixel classification-based | [123] | 2018 | [P]-DA, RS | SegNet with VGG16 pre-trained network for initialization | ALL-IDB1, SA-94.92% (WBC), 75.04% (IoU) |
| Pixel classification-based | [128] | 2019 | [P]-Blob detection, Otsu thresholding | Mask RCNN | CD, SA-82% (PR), 73% (IoU) |
| Pixel classification-based | [126] | 2019 | [P]-DA | Encoder (ResNet50)-Decoder (FRCNN) | CD (1000), SA-98.16% (PR), 99.25% (RC), 98.7% (FS) |
| Pixel classification-based | [129] | 2020 | [P]-DA | Encoder (DPN)-CA-Decoder (UNet) | CD (462), SA-98.5% (MCA-1.5%), 97.49% (KI) |
| Pixel classification-based | [125] | 2020 | [P]-DA | Adversarial residual network (ResNet50 as discriminator) | CD (5000+Kaggle dataset), SA- Accuracy: 98.68%, Specificity: 98.59%, Sensitivity: 99.11% |

Pros (▲) and Cons (▼): ▲ Robust and flexible; ▲ Outperforms state-of-art rule-based and deformable model-based methods; ▲ DL-based methods have automated feature extraction; ▲ DL-based methods use parallel computations for speedy training; ▼ Require large annotated datasets; ▼ Require extensive computational power and time for training; ▼ Output cannot be easily comprehended

†-DA-Data augmentation, RS-Resizing
⋆-CA-Channel attention, DPN-Dual path network, RCNN-Region-based CNN, FRCNN-Fast region-based CNN
‡-CD-Custom dataset, FS-F-score, IoU-Intersection over union, KI-Kappa index, MCA-Misclassification error, PR-Precision, RC-Recall, SA-Segmentation accuracy, WBC-White blood cell

Overlapping and Aggregating cells: As enumerated, several methods exist for detecting and segmenting leukocytes
from the blood smear images. Although these methods perform well when the cells are sparsely distributed over
the background, segmenting clumped areas where the cells are adjacent to or overlap each other still remains
a challenging task [133–137]. The overlapped cells impact the morphological analysis as they have different
characteristics such as area and shape compared to single cells. They also impact the cell count as multiple cells
are counted as one, resulting in an incorrect diagnosis. The following methods for detecting overlapping cells
and splitting adjacent cells have been proposed.
(1) Geometric feature-based methods: The overlapping and aggregating cells have a larger area, an elliptical or
cloverleaf shape, and a higher intensity in the intersecting part. Geometric feature-based methods use
these features to detect overlapping cells and are further categorized as i) region-based methods and ii)
boundary-based methods [138]. The region-based methods detect overlapping cells based on the size or
number of pixels. The boundary-based methods identify overlapping cells based on the shape of boundary,
contour length, and edge strength [139]. Once identified, the clumped cluster can be decomposed into
single cells for accurate segmentation, mostly using watershed segmentation [50, 58, 140–142], marker-
controlled watershed segmentation [109, 143], or concavity analysis [144–146]. The watershed method based
on distance transform uses regional minima of the distance map as markers. The presence of multiple regional
minima results in over-segmentation. The problem is alleviated by using marker-controlled watershed
with predefined markers such as the h-minima transform, where h is a threshold value [109], and color-based
markers [143]. Since the manual selection of the threshold in [109] is a tedious task, Xie et al. [147] used
deep learning for automatic marker selection. The nucleus image-based marker selection has also been
used in two of the most widely used cell segmentation software tools, namely CellProfiler [148] and ilastik [149].
Concavity analysis has been used to split the overlapping or adjoining cells by forming a line segment
between two specific cut points where the boundary curvature changes abruptly. The performance of this
method is highly dependent on the degree of overlap between the cells and the sophistication with which the
boundary is processed.
(2) Machine learning-based methods: Recently, various machine learning-based methods have been proposed to
split aggregated cells. Duggal et al. [150] used a deep belief network (DBN) to identify pixels lying on
the ridge of overlapping cells. Once identified, the ridge pixels are removed to split the overlapping cells. The use
of contextual information and deep CNNs is expected to enhance the segmentation of overlapping cells.
Deep contour-aware networks (DCAN) that use contextual information in the form of multi-scale features
have been proposed to segment overlapping cells in [151, 152]. The machine learning-based methods suffer
from two problems: i) they exhibit under-segmentation bias while segmenting clumped areas, and ii) they
need a significant amount of manually labeled training data.
Although a lot of work has been done, the segmentation of overlapping or aggregated cells remains one of the
most challenging open issues in the field.
7.3 Feature engineering
Features are informative, non-redundant numerical values used to identify, measure the properties
of, or describe objects in an image. Feature engineering is a highly skilled task that requires extensive domain
expertise. In shallow machine learning-based methods, the features are intuitive and handcrafted. The engineering
process is manual and involves subtasks such as feature extraction, feature normalization, and feature selection. It is
challenging to extract only the necessary and sufficient features to perform segmentation or classification tasks
satisfactorily. Deep learning-based methods therefore rely on the automatic extraction of features. This section
describes the feature engineering process, comprising the feature extraction, normalization, and selection subtasks,
as generally used in shallow machine learning-based methods.
(1) Feature extraction: Mainly, the following three categories of features (consolidated in Table 9) have been
used to distinguish between blast and non-blast cells.
(a) Geometrical features: The geometrical shape of a cell’s nucleus is one of the most important features
in determining whether the cell is a blast or non-blast [68]. Both region and boundary descriptors
have been used to describe the shape features [153]. Huang et al. [154] used the perimeter and area shape
features in their work. In addition to the area feature, Putzu et al. [50] computed the area of the
convex hull of each leukocyte. This computation allows the solidity (see equation
1), which indicates how dense an object is, to be determined. A solidity value below 1, typically 0.90, indicates the
presence of blast cells.
\[ \text{Solidity} = \frac{\text{Area}}{\text{Convex Area}} \tag{1} \]
In addition to the solidity, Moshavash et al. [67] computed the roundness (as in equation
2) to classify the segmented objects as leukocytes. Objects with roundness values above the threshold
value of 0.8 are labeled as non-leukocytes, and those with solidity values above 0.93 are classified as leukocytes.
Similarly, Sahol et al. [155] used a combination of the roundness (threshold value = 0.52) and solidity
(value = 0.98) features to obtain the desired classification results.
\[ \text{Roundness} = \frac{4 \times \pi \times \text{Area}}{(\text{Convex Perimeter})^{2}} \tag{2} \]
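The two shape measures can be illustrated with a minimal sketch. It assumes the segmented object is available as an ordered list of polygon vertices (a simplification of a pixel contour), and it takes roundness in its usual dimensionless form, with the squared convex perimeter in the denominator. The function names and the toy shapes are illustrative and do not come from the reviewed papers.

```python
import math

def polygon_area(points):
    """Shoelace formula; points are (x, y) vertices listed in order."""
    n = len(points)
    s = 0.0
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0

def polygon_perimeter(points):
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))

def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def solidity(contour):
    # Area of the object divided by the area of its convex hull
    return polygon_area(contour) / polygon_area(convex_hull(contour))

def roundness(contour):
    # Assumed form: 4 * pi * Area / (convex perimeter)^2
    hull = convex_hull(contour)
    return 4.0 * math.pi * polygon_area(contour) / polygon_perimeter(hull) ** 2

square = [(0, 0), (4, 0), (4, 4), (0, 4)]            # convex toy shape
notched = [(0, 0), (4, 0), (4, 4), (2, 2), (0, 4)]   # same hull, concave notch

print(solidity(square), solidity(notched), roundness(square))
```

The notched shape keeps the convex hull of the square but loses a quarter of its area, so its solidity drops below 1, which is the behaviour the thresholds quoted above rely on.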
Irregularity and complexity of the nucleus boundary have also been used as significant features in ALL
diagnosis. Standard Euclidean geometry fails to represent the irregularity of the lymphocyte’s
nucleus boundary objectively. It is better quantified using fractal geometry and contour signatures.
Fractal geometry has been widely used in the medical image analysis field [156]. Although many fractal
dimensions are available, the Hausdorff dimension (HD) [68] is the most commonly used one
for cell boundary roughness analysis. It is easily implemented using the box-counting method [157], in
which a cell is covered with grids of variable size s and the number of boxes N covering the cell boundary
is counted. The Hausdorff dimension D can be computed using the relationship between s and N listed
in equation 3.
\[ D = \frac{\ln N}{\ln s} \tag{3} \]
The higher the value of D is, the higher is the irregularity in the cell’s boundary. This dimensionless
measure has been used to quantify the roughness of the blast cell’s nucleus boundary in [66, 90, 97].
The geometric features are susceptible to errors during the segmentation and are generally used in
conjunction with other features such as color features [80], and texture features [50, 67].
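The box-counting procedure can be sketched as follows. The sketch assumes the boundary is given as a list of pixel coordinates, and it estimates D as the least-squares slope of ln N against ln(1/s) over several box sizes s; reading equation 3 this way is an assumption, since the scale variable can also be defined as the number of grid divisions. The helper names and the two toy point sets are hypothetical.

```python
import math

def box_count(points, s):
    """Number of s-by-s grid boxes that contain at least one boundary point."""
    return len({(int(x // s), int(y // s)) for x, y in points})

def box_counting_dimension(points, sizes):
    """Least-squares slope of ln N(s) against ln(1/s)."""
    xs = [math.log(1.0 / s) for s in sizes]
    ys = [math.log(box_count(points, s)) for s in sizes]
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den

sizes = [1, 2, 4, 8, 16]
line = [(i, 0) for i in range(256)]                      # smooth boundary
block = [(i, j) for i in range(64) for j in range(64)]   # plane-filling set

d_line = box_counting_dimension(line, sizes)
d_block = box_counting_dimension(block, sizes)
print(d_line, d_block)
```

A smooth curve gives a value close to 1 and a plane-filling set a value close to 2, so a rough nucleus boundary is expected to fall between the two.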
(b) Textural features: Texture refers to the structure of the fundamental sub-segments of a digital image and the
interrelations between their spatial arrangements. Such arrangements can be observed in the form of
various intensity or frequency levels. The techniques used for describing textural features are as
follows.
(i) Fourier descriptors: Fourier descriptors encode the texture of an image with the help of the
Fourier transform. Generally, the 2-D discrete Fourier transform (DFT) [90] and the fast Fourier
transform (FFT) have been used as descriptors of the textural information. Features such as the mean,
standard deviation, kurtosis, and skewness of a cell image are calculated in the frequency domain with the
help of the DFT or FFT [158]. Reta et al. [159] used Wold’s decomposition model [160] for the
textural analysis of the image. This model uses the DFT to resolve the harmonic field in cells by locating the
valuable harmonic peaks (using an amplitude threshold of 10). The advantage of this model
is that it resembles the human vision system (HVS) and evaluates both periodic and random textures
in an image. Moreover, it is invariant to affine transformations of the image such as translation,
rotation, and scaling.
(ii) Wavelet texture features: These features are extracted by applying the discrete wavelet transform (DWT),
the Gabor wavelet transform (GWT), and the Sym4 and Db4 wavelet transforms. The wavelet transform is
advantageous compared to the Fourier transform in terms of temporal resolution. Parvaresh et al. [161]
used the DWT for textural feature extraction after converting the image into
grayscale. The best coefficients after the transform are selected to describe the edges in the image. As a
result, a total of 22 features are extracted, including 16 textural features. Nikitaev et al. [62] used Haar,
Daubechies, and Cohen-Daubechies-Feauveau wavelet functions to analyze the nucleus structure of
blood cells for leukemia detection.
(iii) Haralick texture features: These features are computed using the grey-level co-occurrence matrix (GLCM),
which counts the co-occurrences of neighboring grey levels in an image. The co-occurring values are
checked in a certain direction θ and at a distance D. Contrast, correlation, homogeneity, energy, and entropy
are statistical measures extracted from the GLCM using the offset values (-1,0), (0,-1), (1,0), and (0,1).
Huang et al. [154] extracted the texture features using four GLCMs, formed using the 0°,
45°, 90°, and 135° orientations and unit distance. ChinNeoh et al. [162] also used the GLCM to compute
descriptors from 54 textural features. Apart from GLCM features, other texture descriptors such as
kurtosis and skewness are also used. Textural features have been extensively used for ALL and AML
detection in [49, 73, 83, 96, 155, 163–166].
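A minimal sketch of the GLCM computation follows. It assumes an image already quantized to a small number of grey levels and a single, non-symmetric offset; the homogeneity weight 1/(1 + |i - j|) is one of several definitions in use, and correlation is omitted for brevity. The function names and the 4x4 toy image are illustrative only.

```python
import math

def glcm(image, dx, dy, levels):
    """Normalized co-occurrence matrix for pixel pairs (r, c) -> (r + dy, c + dx)."""
    counts = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[image[r][c]][image[r2][c2]] += 1
                total += 1
    return [[v / total for v in row] for row in counts]

def haralick(p):
    """A few Haralick-style statistics of a normalized GLCM p."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(v * (i - j) ** 2 for i, j, v in cells),
        "energy": sum(v ** 2 for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
        "entropy": -sum(v * math.log2(v) for _, _, v in cells if v > 0),
    }

image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 2, 2, 2],
         [2, 2, 3, 3]]

p = glcm(image, dx=1, dy=0, levels=4)   # 0 degree orientation, unit distance
features = haralick(p)
print(features)
```

Repeating the call with other offsets, for example (dx, dy) = (1, -1), (0, -1), and (-1, -1), would give the remaining three orientations mentioned above.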
(iv) Laws texture method: The Laws method [171] uses different kernels to detect texture in an image. Originally,
five one-dimensional masks were used, namely level, spots, waves, edges, and ripples. The values
corresponding to these one-dimensional masks are [+1 +4 +6 +4 +1], [-1 0 +2 0 -1], [-1 +2 0 -2 +1], [-1 -2 0
+2 +1], and [+1 -4 +6 -4 +1], respectively. Laws texture features have been used by Hedge et al. [172, 173]
for the classification of white blood cells in peripheral blood smear images.
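As a hedged illustration of how the one-dimensional masks are typically used, the sketch below forms a two-dimensional kernel as the outer product of two masks and summarizes the filter response as a mean absolute value (a simple texture energy). The summary statistic, the helper names, and the toy images are assumptions made for this example, not details reported by the cited studies.

```python
L5 = [1, 4, 6, 4, 1]       # level
S5 = [-1, 0, 2, 0, -1]     # spots
W5 = [-1, 2, 0, -2, 1]     # waves
E5 = [-1, -2, 0, 2, 1]     # edges
R5 = [1, -4, 6, -4, 1]     # ripples

def outer(u, v):
    """2-D kernel whose rows follow mask u and whose columns follow mask v."""
    return [[a * b for b in v] for a in u]

def filter_valid(image, kernel):
    """Correlate image with kernel at every position where the kernel fits entirely."""
    k = len(kernel)
    rows, cols = len(image), len(image[0])
    return [
        [sum(kernel[i][j] * image[r + i][c + j] for i in range(k) for j in range(k))
         for c in range(cols - k + 1)]
        for r in range(rows - k + 1)
    ]

def texture_energy(image, kernel):
    """Mean absolute filter response."""
    values = [abs(v) for row in filter_valid(image, kernel) for v in row]
    return sum(values) / len(values)

flat = [[7] * 8 for _ in range(8)]                              # no texture
stripes = [[(c % 2) * 10 for c in range(8)] for _ in range(8)]  # vertical stripes

L5R5 = outer(L5, R5)  # smooth along rows, detect ripples along columns
print(texture_energy(flat, L5R5), texture_energy(stripes, L5R5))
```

Every mask except the level mask sums to zero, so a uniform region gives no response, while the rapidly alternating stripes give a strong ripple response.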
(c) Chromatic features: Chromatic features such as intensity and color are extremely useful in various
segmentation and classification tasks. These features are highly discriminative, and their extraction is
computationally inexpensive compared to textural features. The color feature has been extensively
used to identify blood cell components (i.e., nucleus and cytoplasm). Han et al. [174] used color
features based on variations in the RGB channels of the peripheral blood smear image. These features
have also been used in [67, 145, 163]. Instead of using the intensity values directly, Nasir et al. [175] used
the mean and standard deviation of the color intensity in the RGB color space.
Along with the mean and standard deviation, Sahol et al. [155] also used the skewness, kurtosis, and energy
of each RGB channel as features for recognizing blood cells in the blood smear image. Although
frequently used, the RGB color space is perceptually non-uniform and highly sensitive to illumination
changes. Hence, Fatichah et al. [176] utilized color characteristics based on other color spaces such as
CIE L*a*b*, HSI, and HSL. Mohapatra et al. [90] developed a method for the early diagnosis of ALL in which the
mean intensity values of the RGB and HSV channels are used as color features.
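The per-channel statistics mentioned above can be sketched as follows, assuming the segmented cell region is available as a list of (R, G, B) tuples. Population (not sample) moments are used and the energy feature is left out because its definition varies between papers; both choices, the function names, and the pixel values are assumptions of this example.

```python
import math

def channel_stats(values):
    """Mean, standard deviation, skewness, and kurtosis of one color channel."""
    n = len(values)
    mean = sum(values) / n
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    if std == 0:
        # A constant channel has no shape information
        skew, kurt = 0.0, 0.0
    else:
        skew = sum((v - mean) ** 3 for v in values) / (n * std ** 3)
        kurt = sum((v - mean) ** 4 for v in values) / (n * std ** 4)
    return {"mean": mean, "std": std, "skewness": skew, "kurtosis": kurt}

def color_features(pixels):
    """pixels: list of (R, G, B) tuples taken from a segmented cell region."""
    return {name: channel_stats([p[i] for p in pixels]) for i, name in enumerate("RGB")}

pixels = [(10, 200, 50), (20, 200, 50), (30, 200, 50), (40, 200, 50)]  # toy region
feats = color_features(pixels)
print(feats["R"])
```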
(2) Feature normalization: The values of extracted features vary over different ranges. Data normalization is
commonly used to bring the feature values to a common scale without distorting the differences in ranges.
It improves the performance and training stability of the model. The commonly used feature normalization
techniques are consolidated in Table 10.
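The four normalization techniques of Table 10 can be written in a few lines each. The sketch below is illustrative: the function names and the feature values are made up, the clipping bounds are chosen by hand, and the z-score uses the population standard deviation.

```python
import math

def min_max(xs):
    """Range scaling to [0, 1]."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]

def clip(xs, lo, hi):
    """Feature clipping: values outside [lo, hi] are set to the nearest bound."""
    return [min(max(x, lo), hi) for x in xs]

def z_score(xs):
    """Number of standard deviations each value lies from the mean."""
    mu = sum(xs) / len(xs)
    sigma = math.sqrt(sum((x - mu) ** 2 for x in xs) / len(xs))
    return [(x - mu) / sigma for x in xs]

def log_scale(xs):
    """Log scaling; defined only for strictly positive values."""
    return [math.log(x) for x in xs]

areas = [120.0, 150.0, 180.0, 210.0, 2400.0]   # hypothetical nucleus areas, one outlier

scaled = min_max(areas)
clipped = clip(areas, 100.0, 300.0)
standardized = z_score(areas)
logged = log_scale(areas)
print(scaled)
print(clipped)
```

With the outlier present, range scaling squeezes the first four values into a narrow band near 0, which is the sensitivity to outliers noted in Table 10; clipping before scaling is one way to limit that effect.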
(3) Feature selection: The feature extraction process yields a large number of features. However, not all
the features are equally discriminative, i.e., significant in decision making. The presence of insignificant
features not only adds to the curse of dimensionality but may also negatively affect the performance
of a machine learning algorithm. Thus, feature selection, also known as dimensionality reduction, is an
important task that reduces the dimensionality of the feature space by selecting only the significant features.
The feature selection methods are broadly classified as:
(a) Filter methods: These methods use statistical measures to determine the correlation between each input
variable and the target variable. The input variables strongly correlated to the target variable are selected.
These methods are advantageous because the feature selection is computationally inexpensive, fast, and
independent of the machine learning algorithm being used. The statistical methods used
for feature selection in leukocyte detection studies are Linear Discriminant Analysis (LDA), Principal
Component Analysis (PCA), and Analysis of Variance (ANOVA). LDA determines the linear discriminants
that represent the axes maximizing the separation between the target classes. It is closely related to
the PCA technique, which also finds the axes of maximal variance but without considering the target classes.
Table 9. Features used by leukocyte classification methods
Geometric features
  Feature extraction methods: Boundary segments, chain codes, polygonal approximation, signature, skeletons
  Computation: Perimeter, area, convex area, solidity, roundness, form factor, elongation, contour signature,
    fractal dimension
  Research works: [50, 66, 67, 69, 73, 90, 96, 97, 154, 155, 163, 166–170]
  Pros (▲) and Cons (▼): ▲ Recognizes cell shapes; ▼ Susceptible to errors; ▼ Need other features in conjunction
Texture features
  Feature extraction methods: DFT, FFT, DWT, GWT, GLCM, local binary pattern (LBP)
  Computation: Uniformity, smoothness, skewness, kurtosis, standard deviation, mean, energy, entropy
  Research works: [50, 66, 67, 69, 90, 96, 97, 154, 155, 161, 163, 166–168, 170]
  Pros (▲) and Cons (▼): ▼ Computationally intensive; ▼ Not invariant to scale and rotation
Chromatic features
  Feature extraction methods: Color histogram
  Computation: Homogeneity, hue, saturation, mean, contrast, correlation, standard deviation, skewness
  Research works: [50, 66, 67, 69, 73, 90, 96, 155, 161, 167–170]
  Pros (▲) and Cons (▼): ▲ Easy to compute; ▲ Low-dimensional feature vector; ▼ Sensitive to noise
Feature selection method
  PCA ([73, 154, 163, 166, 167]); t-test ([90]); Meta-heuristic ([96, 155, 161, 168]); ANOVA ([169])
Number of selected features⋆
  [154]: 5S, 80T; [50]: 30S, 80T, 21C; [66]: 9S, 4T, 3C; [90]: 32TO; [170]: 80TO; [166]: 6S, 4T;
  [73]: 11S, 45T, 15C; [67]: 15S, 32T, 6C; [161]: 44TO; [96]: 15S, 5T, 3C; [155]: 30S, 45T, 84C;
  [167]: 35TO; [168]: 1000TO
⋆ S: Shape, T: Texture, C: Color, TO: Total features
Table 10. Feature normalization techniques
Range scaling
  What it does: Brings the values of features to a standard range, generally 0 to 1, through min-max normalization
  Formula: $x' = \dfrac{x - x_{\min}}{x_{\max} - x_{\min}}$
  When to use: When features are uniformly distributed across a fixed range
  Pros (▲) and Cons (▼): ▼ Sensitive to outliers; ▼ May not properly normalize test values
Feature clipping
  What it does: Feature values above or below a certain value are clipped
  Formula: $x' = \begin{cases} \max, & x > \max \\ \min, & x < \min \end{cases}$
  When to use: When features contain extreme outliers
  Pros (▲) and Cons (▼): ▼ Used in conjunction with other normalization techniques
Z-score
  What it does: Represents the number of standard deviations the feature is away from the mean
  Formula: $x' = \dfrac{x - \mu}{\sigma}$
  When to use: When features do not contain extreme outliers
  Pros (▲) and Cons (▼): ▲ Robust to new data; ▼ Effective if the feature has a Gaussian distribution
Log scaling
  What it does: Compresses the feature range using a log transformation
  Formula: $x' = \log(x)$
  When to use: When features over a wide range are to be represented over a narrow range
  Pros (▲) and Cons (▼): ▲ Wider range of data can be represented; ▼ Negative and positive values cannot be
    handled simultaneously
Huang et al. [154] used PCA to select the features that distinguish between the nuclei of cancerous
and non-cancerous cells. It has also been used in [73] for the refinement of features to detect immature
lymphoblast cells. LDA has been used in [170, 177] for lymphoblast classification. Selecting features
with LDA alone is quite challenging. Hence, PCA has been applied before LDA in [167] so that the feature selection
becomes trivial. ANOVA is a statistical tool that calculates the influence of individual features with the
help of a p-value. Features having lower p-values, typically less than 0.05, are considered efficient in
identifying differences among various groups. Ghane et al. [169] used ANOVA in conjunction with PCA
to select the best features for classifying CML.
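A filter-style ranking based on ANOVA can be sketched as below. The sketch computes only the one-way F-statistic of each feature across classes and ranks the features by it; converting F to a p-value needs the F-distribution, which would normally come from a statistics library and is not shown. The function names and the toy data are hypothetical.

```python
def anova_f(groups):
    """One-way ANOVA F-statistic for one feature; groups holds one value list per class.

    Assumes at least two classes and non-zero within-class variation.
    """
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n - k))

def rank_features(X, y):
    """Feature indices sorted by decreasing F-statistic."""
    classes = sorted(set(y))
    scores = []
    for j in range(len(X[0])):
        groups = [[row[j] for row, label in zip(X, y) if label == c] for c in classes]
        scores.append((anova_f(groups), j))
    return [j for _, j in sorted(scores, reverse=True)]

# Toy data: feature 0 separates the two classes, feature 1 does not
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 5.0], [3.2, 3.0], [2.8, 4.0]]
y = [0, 0, 0, 1, 1, 1]

ranking = rank_features(X, y)
print(ranking)
```

Because the score of each feature is computed independently of any classifier, the ranking is cheap to obtain, which is the main attraction of filter methods.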
(b) Wrapper methods: These methods combine feature selection and the learning process to select an optimal
subset of features. The selection process is iterative with nested cross-validation and hence computationally
very expensive. The wrapper methods use either forward selection or backward selection.
Tabu search [178] is one of the best forward selection algorithms for solving
the feature selection problem. Parvaresh et al. [161] used a modified version of the Tabu algorithm,
known as chain Tabu search, that iteratively searches for the optimal features without getting stuck in local
minima. The other wrapper methods used for automatic classification of leukemia from blood smear
images include the Social Spider Optimization algorithm (SSOA) [155], ant colony optimization (ACO) [179],
genetic algorithm (GA) [180], probabilistic incremental program evolution (PIPE), and particle swarm
optimization (PSO) [181]. The pros and cons of filter methods and wrapper methods for feature selection
are mentioned in Table 11.
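The wrapper idea can be illustrated with plain greedy forward selection. This is not Tabu search or any of the meta-heuristics cited above; it is a simpler stand-in that shows how the learning algorithm sits inside the selection loop. The nearest-centroid classifier, the leave-one-out scoring, and the toy data are all assumptions chosen to keep the example self-contained.

```python
def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of a nearest-centroid classifier on a feature subset."""
    correct = 0
    for i in range(len(X)):
        centroids = {}
        for c in sorted(set(y)):
            rows = [X[k] for k in range(len(X)) if k != i and y[k] == c]
            centroids[c] = [sum(r[f] for r in rows) / len(rows) for f in features]
        point = [X[i][f] for f in features]
        pred = min(centroids,
                   key=lambda c: sum((a - b) ** 2 for a, b in zip(point, centroids[c])))
        correct += pred == y[i]
    return correct / len(X)

def forward_selection(X, y, max_features):
    """Add the feature that most improves the score; stop when nothing improves."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(loo_accuracy(X, y, selected + [f]), f) for f in remaining]
        score, f = max(scored, key=lambda t: t[0])
        if score <= best_score:
            break
        selected.append(f)
        remaining.remove(f)
        best_score = score
    return selected, best_score

# Toy data: feature 0 is discriminative, feature 1 is noise
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0], [3.0, 5.0], [3.2, 3.0], [2.8, 4.0]]
y = [0, 0, 0, 1, 1, 1]

selected, score = forward_selection(X, y, max_features=2)
print(selected, score)
```

Each candidate subset requires retraining and re-evaluating the classifier, which is why wrapper methods are far more expensive than filter methods.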
7.4 Classification
Classification is the final stage of the workflow pipeline, in which a classifier assigns a class label such as
ALL positive or ALL negative. Sometimes, an additional regression module generates a cell count along with the
class labels. These assignments are based on what the classifier and regressor learn during the training phase.
In the training phase, the classifiers, except the instance-based classifiers, use the labeled training dataset and
keep on adjusting their model parameters until they classify a sufficiently large number of input images correctly.
Once a classifier is trained, it is deployed to classify input images that were not included in the
training set. This section enumerates the classifiers that have been used in various studies on automated analysis
of the blood smear images.
Table 11. Feature selection methods
Dependency on learning algorithm
  Filter methods: ▲ Generic set of methods independent of the specific machine learning algorithm
  Wrapper methods: ▼ Use the machine learning algorithm to find optimal features
Performance
  Filter methods: ▲ Faster as compared to wrapper methods
  Wrapper methods: ▼ Computationally very expensive
Over-fitting
  Filter methods: ▲ Less prone to over-fitting
  Wrapper methods: ▼ High chances of over-fitting
Examples
  Filter methods: Pearson’s coefficient, LDA, PCA, ANOVA, Chi-square
  Wrapper methods: Forward selection, backward selection
Used in leukemia detection studies
  Filter methods: [73], [154], [167], [169], [170], [177]
  Wrapper methods: [155], [161], [179], [180], [181]
(1) Instance-based classifiers: These classifiers perform classification by comparing the given input instance
with the training instances stored in memory. They are thus also known as memory-based
classifiers. Since the training instances are stored in memory, the complexity of the model (or the number
of parameters) grows with the number of training instances. The k-nearest neighbour (k-NN) classifier
[182] is one such classifier; it assigns to the given input instance the class that is most common among its
k nearest neighbors. Though the k-NN classifier is simple to implement, the value of k has a profound impact on
its performance. Supardi et al. [183] classified blasts in acute leukemia
blood samples using a k-NN classifier. In their empirical study, they determined that k = 4 gave the
best classification. Purwanti et al. [184] tried all the odd values of k from 1 to 15 and concluded that k = 7
gives the optimal results for the given dataset. Bhattacharjee et al. [93] used k = 1 to perform
the classification using low computational resources. Although the k-NN classifier has been used in several
studies related to blood smear image analysis [185, 186], it has many disadvantages owing to its non-parametric
nature. Many researchers have thus tried using parametric classifiers such as SVM and ANN to limit the
growth of the model’s complexity as the size of the training dataset increases.
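A minimal k-NN sketch shows both the voting rule and the sensitivity to k. The Euclidean distance, the leave-one-out evaluation, and the two-dimensional toy features are assumptions of the example and are not taken from the cited studies.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Majority class among the k training points closest to the query."""
    order = sorted(range(len(train_X)), key=lambda i: math.dist(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def loo_accuracy(X, y, k):
    """Leave-one-out accuracy, a simple way to compare candidate values of k."""
    hits = 0
    for i in range(len(X)):
        rest_X = X[:i] + X[i + 1:]
        rest_y = y[:i] + y[i + 1:]
        hits += knn_predict(rest_X, rest_y, X[i], k) == y[i]
    return hits / len(X)

# Hypothetical two-feature descriptors of eight cells
X = [(1.0, 1.0), (1.2, 0.9), (0.9, 1.1), (1.1, 1.3),
     (3.0, 3.0), (3.2, 2.9), (2.9, 3.1), (3.1, 3.3)]
y = ["non-blast"] * 4 + ["blast"] * 4

label = knn_predict(X, y, (1.0, 1.2), k=3)
scores = {k: loo_accuracy(X, y, k) for k in (1, 3, 5, 7)}
print(label, scores)
```

On this toy set, k = 7 uses every remaining point as a neighbor, so the held-out sample is always outvoted by the other class; small datasets make the choice of k especially delicate.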
(2) Support Vector Machine (SVM): SVM is one of the best classification algorithms in machine learning
[33, 68, 187]. It belongs to a class of supervised learning algorithms that can deal even with non-linear
relationships within the data. It aims to separate the input instances into different classes by fitting a
hyper-plane. The hyper-plane is fitted in such a way that its distance from the nearest data
points belonging to different classes is maximum. The given data points might be linearly inseparable. SVM
tackles this issue using kernel functions that transform the linearly inseparable data into a linearly separable
form in a higher-dimensional space. The most commonly used kernels are the linear kernel, polynomial
kernel, and radial basis function (RBF) kernel. Among these, the linear and polynomial kernels are easy to
apply and take less time, whereas RBF kernels are known for accurate results.
Amin et al. [47] used an SVM classifier to categorize input blood smear images as non-leukemic or
leukemic. For the latter category, a multi-class SVM is used to further categorize ALL images according to
their sub-types L1-L3. Kazemi et al. [188] used RBF kernels to recognize AML in the blood smear images
automatically. The AML positive images are further classified into their subtypes, M2-M5. Setiawan et
al. [95] used a linear kernel SVM to classify AML positive images into three subtypes, M4, M5, and M7.
However, the results for AML sub-type classification are not promising in either work [95, 188]. Rawat
et al. [73] worked on a similar hierarchical SVM classifier that grouped leukemic cells into L1,
L2, and L3. The proposed multi-class classification system is quite promising and is robust enough to
support the medical application. They also proposed a technique using varied kernels to separate images
into ALL and AML subtypes [189]. A multilayer perceptron kernel (MPK) is used to distinguish between
ALL cell types (L1/L2/L3), whereas a Gaussian radial basis kernel (RBF) is used for categorizing AML cell
types (M2/M3/M5). Putzu et al. [50] compared the performance of four kernels, namely the linear, quadratic,
polynomial, and Gaussian radial basis kernels. The parameter values for each type of kernel are optimized
to obtain the best accuracy. In all the cases, the attained accuracy is above 90%, and with
the Gaussian radial basis kernel it reached the 98% mark. Since the results for each kernel varied
significantly, Moradiamin et al. [190] used an ensemble of these kernels to categorize the images into L1, L2,
L3, and non-cancerous classes. The final decision is based on a majority vote over the results
obtained from all the kernels. In all these studies, the SVM classifier has been applied to small datasets, and
only a limited number of kernels have been used. There is a need to test the performance of SVM on larger
datasets and to evaluate the performance of other available kernels as well.
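The kernels compared in these studies differ only in how they score the similarity of two feature vectors. The sketch below writes out the linear, polynomial, and Gaussian RBF kernels and the resulting Gram matrix; it does not train an SVM, and the hyper-parameter values (degree, coef0, gamma) and sample vectors are arbitrary assumptions for illustration.

```python
import math

def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))

def polynomial_kernel(x, z, degree=2, coef0=1.0):
    # degree=2 corresponds to a quadratic kernel
    return (linear_kernel(x, z) + coef0) ** degree

def rbf_kernel(x, z, gamma=0.5):
    # Gaussian radial basis function: exp(-gamma * ||x - z||^2)
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))

def gram_matrix(X, kernel):
    """Pairwise kernel values; an SVM solver works on this matrix instead of raw features."""
    return [[kernel(x, z) for z in X] for x in X]

X = [(1.0, 2.0), (3.0, 4.0), (0.5, -1.0)]   # hypothetical feature vectors

K_lin = gram_matrix(X, linear_kernel)
K_poly = gram_matrix(X, polynomial_kernel)
K_rbf = gram_matrix(X, rbf_kernel)
print(K_poly[0][1], K_rbf[0][0])
```

Swapping the kernel changes only the Gram matrix, which is why kernels can be compared, or combined in an ensemble, without changing the rest of the classifier.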
(3) Artificial Neural Networks (ANN): ANNs, inspired by biological neural networks, have gained huge
popularity in recent years because of their generalization and parallel processing capabilities. ANNs
learn through a training process in which they update the weights associated with each neuron. The
training process is carried on until the ANN starts producing the desired output. As opposed to SVMs, ANNs
can be easily used for multi-category classification by changing neural connections.
Adjouadi et al. [191] used flow cytometry data and an ANN for ALL/AML detection from blood smear
images with 96% accuracy. Parvaresh et al. [161] used a feed-forward multi-layer perceptron (MLP) for
diagnosing leukemia using blood smear images. An MLP has also been used by Fatma et al. [89] to identify
and classify acute leukemia from blood smear images with 91% accuracy. Vincent et al. [166] used a two-step
neural network for ALL classification. Both the sub-networks are trained using the Levenberg-Marquardt (LM)
optimization algorithm [192]. They used the same pre-classification steps as Madhukar et al. [193]. However, their
approach differs in the classification step, where NNs are used instead of SVMs, due to which the classification
accuracy increased by 4%. The major limitation of ANNs with shallower depth (i.e., fewer layers) is the requirement
of good quality discriminative features. Extracting good quality features is a difficult task that requires
extensive domain expertise and specialized training. During the 1990s, researchers tried increasing the
number of layers or the depth of ANNs to alleviate the problem of shallower ANNs. However, training deep
ANNs is challenging because a large number of trainable parameters must be tuned. Such tuning requires
enormous computational power and a large training dataset. The solution to these problems has been found
during the last decade. The availability of graphics processing units (GPUs) and large training datasets has
made deep ANNs relatively easier to train. Nowadays, deep ANNs, particularly deep convolutional
neural networks (CNNs) as depicted in Figure 10, are used in almost every visual recognition task in
the medical domain, such as brain tumor segmentation [194, 195] and the detection of mitotic cells [196, 197].
Thanh et al. [198] used a CNN with 4 layers to extract features and classify blood smear images as
leukemic or non-leukemic with 96.3% accuracy. Rajpurohit et al. [82] used a slightly deeper network with 7
layers to perform the classification task. They also compared the performance of the CNN with a feed-forward
network. They found that the deep CNN has a classification accuracy of 98%, which is 3% more than the
feed-forward network. Sipes and Li [199] also used a CNN for image classification of ALL and compared its
performance with k-NN and other neural networks. They found that the CNN performs the best, k-NN the
worst, and the performance of other NNs varies between that of k-NN and the CNN. Another concept that
increased the popularity of CNNs is that of transfer learning. When customized training
data is scarce, the CNN model can first be trained on a natural image dataset like ImageNet from the Large Scale
Visual Recognition Challenge [200] and then on the available customized training data. The CNN learns general
low-level features like edges and blobs from natural image datasets and specific high-level features from the
customized dataset. Ghosh et al. [71] used a pre-trained AlexNet model [15] and transfer learning for ALL
classification from peripheral blood smear images. Using transfer learning, they attained a classification
accuracy of 97%. Nowadays, CNNs have become so popular and promising that in the ISBI-2019 C-NMC
challenge [201], all the selected entries used variants of CNNs to solve the problem.
Fig. 10. Architecture of Convolutional Neural Networks (CNNs)
(4) Ensemble classifiers: The word “ensemble” is derived from the Latin word “insimul”, which means together.
Thus, an ensemble classifier uses multiple classifiers that learn multiple hypotheses and join together to
solve the given problem. There are many different ways in which the outcomes of individual classifiers
can be combined, such as majority vote, average, probability product, and median [214]. Ensemble
classifiers have become a preferred choice during recent years [215–217].
Mohapatra et al. [90] created an ensemble of three classifiers, namely KNN, MLP, and SVM, for leukemia
detection. Vogado et al. [209] formed an ensemble of SVM, MLP, and random forest (RF) classifiers to
classify blood smear images. The individual classifiers are combined using the majority voting criterion. Verma
et al. [211] used MobileNetV2 [218] as a base classifier and its variants as other models to create an ensemble
classifier. Escalante et al. [163] used the particle swarm model selection (PSMS) method for selecting the constituent
classifiers of the ensemble. The PSMS method searches for highly accurate classifiers in the given search
space without requiring any user intervention or prior knowledge. The ensemble of classifiers found
using the PSMS method resulted in a classification accuracy of 97.68% for binary classification and 94.21% for
multi-class classification. For multi-class classification, the one-vs-all (OVA) [219] method is used. To develop an
n-class classifier, a set of n independent binary classifiers is used. To classify an input data point, it is
made to pass through all the n classifiers.
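The two combination rules mentioned above, majority voting and one-vs-all, can be sketched in a few lines. The binary scorers here are hypothetical stand-ins (negative squared distance to a made-up class prototype) rather than trained classifiers, and picking the class with the highest score is one common way to resolve the n binary outputs, assumed here.

```python
from collections import Counter

def majority_vote(predictions):
    """Label predicted by the largest number of ensemble members."""
    return Counter(predictions).most_common(1)[0][0]

def one_vs_all_predict(scorers, x):
    """scorers maps each class label to a function giving a confidence that x belongs to it.

    The input is passed through all n binary scorers and the most confident class wins.
    """
    return max(scorers, key=lambda label: scorers[label](x))

# Hypothetical class prototypes in a two-feature space
prototypes = {"L1": (1.0, 1.0), "L2": (4.0, 1.0), "L3": (2.5, 4.0)}
scorers = {
    label: (lambda x, p=p: -((x[0] - p[0]) ** 2 + (x[1] - p[1]) ** 2))
    for label, p in prototypes.items()
}

vote = majority_vote(["ALL", "non-ALL", "ALL"])   # e.g. outputs of SVM, MLP, RF
subtype = one_vs_all_predict(scorers, (3.8, 1.2))
print(vote, subtype)
```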
(5) Other methods: This category includes the classification methods that have not been included in any of
the categories mentioned above. Fatichah et al. [176] used a modified version of the traditional decision tree
algorithm called the fuzzy decision tree (FDT). It deals with data in the form of fuzzy sets and fuzzy classes. The
ambiguity in class labeling is resolved with high accuracy using fuzzy entropy [220]. Similarly, Viswanathan
et al. [51] used a fuzzy version of classic k-means clustering. The main advantage of fuzzy c-means
classification is that it allows a data point to be part of multiple clusters at a time with varying degrees
of membership. Hence, it has the upper hand over the k-means clustering algorithm, which produces clusters
with very crisp boundaries. Nasir et al. [175] used simplified fuzzy ARTMAP neural networks, a
simplified version of the original fuzzy ARTMAP [221], for leukocyte image classification. Jothi et al. [213]
proposed an ALL classification method that makes use of the Jaya algorithm [222] for optimizing the rules
generated by classifiers such as Naive Bayes, KNN, SVM, decision tree, and LDA.
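The difference between crisp and fuzzy cluster membership can be made concrete with a small fuzzy c-means sketch. It uses the standard membership and center updates with fuzzifier m = 2, a fixed number of iterations, and hand-picked initial centers; these settings and the one-dimensional toy data are assumptions for illustration, not the configuration of [51].

```python
import math

def memberships(points, centers, m):
    """u[k][i]: degree to which point k belongs to cluster i; each row sums to 1."""
    power = 2.0 / (m - 1.0)
    U = []
    for p in points:
        # A small floor avoids division by zero when a point coincides with a center
        d = [max(math.dist(p, c), 1e-12) for c in centers]
        U.append([1.0 / sum((di / dj) ** power for dj in d) for di in d])
    return U

def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    dims = range(len(points[0]))
    for _ in range(iterations):
        U = memberships(points, centers, m)
        centers = []
        for i in range(len(U[0])):
            w = [row[i] ** m for row in U]
            centers.append(tuple(
                sum(wk * p[d] for wk, p in zip(w, points)) / sum(w) for d in dims
            ))
    return centers, memberships(points, centers, m)

# Two tight groups and one point halfway between them
points = [(1.0,), (1.2,), (0.8,), (5.0,), (5.2,), (4.8,), (3.0,)]
centers, U = fuzzy_c_means(points, centers=[(0.0,), (6.0,)])
print(centers)
print(U[6])
```

The midpoint receives a membership of about 0.5 in each cluster, whereas k-means would be forced to assign it entirely to one of them.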
These classification methods, along with their important characteristics, are summarized in Table 12.
Table 12. Classification techniques used for leukemia detection from blood smear images
Cat. Ref. Year Method(†) Classiication(‡) Features[F]/ Parame-
ters[P]/ Architecture[A]
Dataset(#) and Performance⋆ Pros (▲) and Cons (▼)
Instance-based
Classiiers
[183] 2012 KNN ALL vs AML [F]-S, C and SH (12), [P]-k = 4 CD (1500), Acc: 86% ▲Training is very fast
▲Learn complex functions
▲Do not lose information
▼Slow at query time
▼Need lot of storage
▼High classiication cost
[185] 2013 KNN ALL vs Non-ALL [F]-AR, PR, CI ALL-IDB, Acc: 93%
[184] 2017 KNN ALL vs Non-ALL [F]-SH, [P]-k = 7 ALL-IDB2, Acc: 90%
[186] 2017 KNN ALL vs Non-ALL [F]-SH,TX, [P]-k = 7 CD (20), Acc: 91.66%
[202] 2018 C-KNN ALL vs Non-ALL [F]-TX, SH ALL-IDB2, Acc: 96%, SN: 95%, SP: 97%
[203] 2021 KNN ALL/ AML/ CLL/ CML [F]-SH, C, GE, TX ALL-IDB1, Acc: 80%, SN: 77%, SP: 81%
Support
Vector
Machine
(SVM)
[33, 68] 2010 SVM Blast cells [F]-SH, TX CD (108), Acc: 95% ▲Efective in
high-dimensional spaces
▲Memory eicient
▲Works well when margin of
separation between classes is
clear
▼High training cost for large
datasets
▼Does not perform well for
noisy data/overlapping classes
[50] 2014 SVM ALL vs Non-ALL [F]-SH, C, TX, [P]-RBF Kernel CD (33), Acc: 93%, SN: 98%
[47] 2015 SVM ALL vs Non-ALL [F]-TO (8),[P]-RBF Kernel CD (312), Acc: 96%, SN: 98%, SP: 95%
[188] 2016 SVM ALL vs Non-ALL [F]-SH, C, TX, [P]-RBF Kernel CD (330), Acc:96%, SN: 98%, SP: 95%
[190] 2016 SVM ALL vs Non-ALL
ALL→L1, L2, L3
[F]-SH, TX, [P]-Ensemble of
Kernels
CD (312), Acc: 98%(Normal), 96.76% (L1),
96.5% (L2), 98.85% (L3)
[73] 2017 SVM ALL→L1, L2, L3 [F]-SH, C, TX (71-TO) ALL-IDB, Acc: 94.6%
[189] 2017 SVM ALL vs AML [F]-SH, C, TX (331-TO) ASH, Acc: 87%
[95] 2018 SVM AML→M4, M5, M7 [F]-TO (6) [P]-Linear Kernel CD (1710), Acc: 87.98%-95.55%
[204] 2021 SVM ALL vs Non-ALL [P]-TMBO ALL-IDB, Acc: 98%, SN: 96%, SP: 95%
Artiicial
Neural
Networks
(ANNs)
[191] 2010 NN ALL vs AML [F]-FCBF CD (220), Acc: 96.67% ▲ANNs are lexible
▲Good at modeling NL
▲Prediction is fast
▲ANNs can adapt to varying
circumstances
▲Efective at processing
spatial and sequential data
▼Training is computationally
expensive
▼Feature extraction is diicult
▼Results are diicult to
comprehend
▼Results depend on quality
and quantity of data
[89] 2014 MLFFNN Blast cells [F]-C, TX CD (50), Acc: 91%
[166] 2015 NN ALL vs AML [P]-LMA CD (90), Acc: 97.7%, SN: 98%, SP: 97.6%
[205] 2017 MLFFNN ALL vs Non-ALL [P]-EHO ALL-IDB2, Acc: 91.8%
[198] 2017 CNN Blast cells [A]-2CONV, 1MP, 1FC ALL-IDB1, Acc: 96.43%
[71] 2017 CNN ALL vs Non-ALL [A]-Pretrained AlexNet ALL-IDB, Acc: 97%, SN: 100%, SP: 95%
[20] 2018 CNN ALL→L1, L2, L3 [A]-6CONV, 3MP, 2FC CD (330), Acc: 97.78%
[22] 2018 CNN ALL→L1, L2, L3 [A]-Pretrained AlexNet ALL-IDB, Acc: 96%, SN: 97%, SP: 99%
[161] 2018 MLP ALL vs Non-ALL [P]-Logistic AF ALL-IDB2, Acc: 98.88%
[82] 2018 CNN ALL vs Non-ALL [A]-7CONV, 7MP, 2FC ALL-IDB1, Acc: 98.33%
[199] 2018 CNN ALL vs Non-ALL [A]-3Layer Architecture ALL-IDB, PR: 88%, RC: 90%, FS: 89%
[206] 2019 CNN ALL vs Non-ALL [A]-2CONV, 2MP, 2FC ALL-IDB+ASH, Acc: 88.25%
[207] 2021 CNN Blast cells [A]-AlexNet CD (2820), Acc: 100%
[208] 2021 CNN All vs Non-ALL [A]-PCANet ALL-IDB2, Acc: 96.84%
Ensemble
Classiiers
(ECs)
[163] 2012 EC MCC [P]-PSMS CD (633), Binary Acc: 97.68%, Multi-
class Acc: 94.21%
▲ECs have higher predictive
accuracy
▲ECs provide an extra DOF in
bias/variance tradeof
▲ECs reduce the spread or
dispersion of predictions
▼ECs cost more to create, train
and deploy
▼Results of ECs are more
diicult to comprehend
[90] 2014 EC ALL vs Non-ALL [A]-NB, KNN, MLP, SVM CD (104), Acc: 95%, SN: 95%, SP: 95%
[209] 2017 EC ALL vs Non-ALL [A]-SVM, MLP, RF ALL-IDB, Acc: 100%
[210] 2019 EC ALL vs Non-ALL [A]-Variants of Inception-ResNet C-NMC, PR: 84%, RC: 85%, FS: 84%
[21] 2019 EC ALL vs Non-ALL [A]-VGG 16, MobileNet C-NMC, Acc: 96%, SN: 95%, SP: 99%
[211] 2019 EC ALL vs Non-ALL [A]-Variants of MobileNet C-NMC, Acc: 89.47%
[212] 2021 EC ALL vs Non-ALL [A]-VGG11, ResNet18, ShuffleNetv2 C-NMC, Acc: 87.52%, FS: 87.4%
Others
[175] 2013 MLP ALL vs Non-ALL [A]-SFAM, [F]-S, SH, C (42-TO) CD (500), Acc: 94.71%
▲Use various optimization techniques
▲Allow a data point to be part of multiple clusters
[176] 2015 FDT ALL vs Non-ALL [F]-TO (8) CD (120), Acc: 84%
[51] 2015 FCM Blast cells [F]-TX, SH, C ALL-IDB
[213] 2019 MA ALL vs Non-ALL [P]-Jaya optimization ALL-IDB, PR: 74%-99%, RC: 45%-98%
†-C-KNN-Customized KNN, CNN-Convolutional neural network, EC-Ensemble classifier, FDT-Fuzzy decision tree, KNN-K-nearest neighbors, MA-Multiple algorithms, MLFFNN-Multi-layer feed forward neural network, MLP-Multi-layer perceptron, NN-Neural network, SVM-Support vector machine,
‡-MCC-Multi-class classification, Architecture [A]-CONV-Convolutional layer, FC-Fully connected layer, MP-Max-pooling layer, RF-Random Forest, SFAM-Simplified fuzzy ARTMAP, Features [F]-AR-Area, C-Color, CI-Circularity, FCBF-Flow cytometry based features, GE-Geometrical, PR-Perimeter, S-Size, SH-Shape, TO-Total, TX-Texture, Parameters [P]-AF-Activation function, EHO-Elephant herd optimization, LMA-Levenberg-Marquardt algorithm, PSMS-Particle swarm model selection, RBF-Radial basis function, TMBO-Taylor-monarch butterfly optimization,
⋆-Acc-Accuracy, CD-Custom dataset, FS-F-score, PR-Precision, RC-Recall, SN-Sensitivity, SP-Specificity,
▲-DOF-Degree of freedom, NL-Non-linearities
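The performance measures abbreviated in the table (Acc, SN, SP, PR, RC, FS) all follow from the four cells of a binary confusion matrix. The following is a minimal sketch of the standard definitions, assuming a two-class task in which the leukemic class (e.g., ALL) is treated as positive. The names BinaryCounts and summarize and the counts in the usage line are hypothetical illustrations; they are not taken from any of the reviewed papers.

```python
from dataclasses import dataclass


@dataclass
class BinaryCounts:
    """Confusion-matrix cells; the positive class is assumed to be leukemic."""
    tp: int  # leukemic samples predicted leukemic
    fp: int  # non-leukemic samples predicted leukemic
    tn: int  # non-leukemic samples predicted non-leukemic
    fn: int  # leukemic samples predicted non-leukemic


def _ratio(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def summarize(c: BinaryCounts) -> dict:
    """Compute the measures using the abbreviations of the table footnote."""
    total = c.tp + c.fp + c.tn + c.fn
    sn = _ratio(c.tp, c.tp + c.fn)  # sensitivity, identical to recall (RC)
    sp = _ratio(c.tn, c.tn + c.fp)  # specificity
    pr = _ratio(c.tp, c.tp + c.fp)  # precision
    return {
        "Acc": _ratio(c.tp + c.tn, total),   # accuracy
        "SN": sn,
        "SP": sp,
        "PR": pr,
        "RC": sn,
        "FS": _ratio(2 * pr * sn, pr + sn),  # F-score (harmonic mean of PR, RC)
    }


if __name__ == "__main__":
    # Hypothetical counts, used only to show how the function is called.
    for name, value in summarize(BinaryCounts(tp=48, fp=5, tn=45, fn=2)).items():
        print(f"{name}: {value:.4f}")
```

Because accuracy alone can be misleading on imbalanced datasets, entries that report only Acc are not directly comparable with those that also report SN and SP, or PR, RC, and FS.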
8 CONCLUSION
Manual analysis of the blood smear images is a challenging, time-consuming, and error-prone task. Although
several image processing and machine learning-based systems have been developed for automated blood smear
ACM Comput. Surv.
