Review
Machine learning applications to non-destructive
defect detection in horticultural products
Jean Frederic Isingizwe Nturambirwe, Umezuruike Linus Opara*
Postharvest Technology Research Laboratory, South African Research Chair in Postharvest Technology, Department
of Horticultural Science, Stellenbosch University, Private Bag X1, Stellenbosch 7602, South Africa
Article info
Article history:
Received 18 June 2019
Received in revised form
22 October 2019
Accepted 10 November 2019
Published online 29 November 2019
Keywords:
Non-destructive
Machine learning
Internal damage
Early detection
Fruit defect classification
Deep learning
Machine learning (ML) methods have become useful tools that, in conjunction with sensing
devices for quality evaluation, allow for quick and effective evaluation of the quality of
food commodities based on empirical data. This review presents the recent advances in
machine learning methods and their use with various sensing devices to detect defects in
horticultural products. There are technical hurdles in tackling major issues around defect
detection in fruit and vegetables as well as various other food items, such as achieving fast,
early and quantitative assessments. The role that ML methods have played in
addressing such issues is reviewed, the present limitations are highlighted,
and future prospects are identified.
© 2019 IAgrE. Published by Elsevier Ltd. All rights reserved.
1. Introduction
For the past few decades, the horticultural sector has seen
significant technical advances aimed at reducing postharvest food
losses, and non-destructive (ND) technology has been increasingly
adopted for effective fruit quality evaluation and assurance. These
techniques range from optical and acoustic vibration methods to
nuclear magnetic resonance, computer vision techniques, computed
tomography, electronic noses (Gao, Zhu, & Cai, 2010), near infrared
spectroscopy, hyperspectral imaging and intelligent packaging
(Sousa-Gallagher, Tank, & Sousa, 2016). These evolving ND techniques
for quality monitoring and assessment, together with packaging and
storage solutions, are seen as the main players that, in future
implementations, will help sustain quality in fruit and vegetables
for longer. Future trends also favour the introduction of intelligent
packaging, which incorporates ND sensors for chemical, biological or
physical characteristics, and radio-frequency identification (Biji,
Ravishankar, Mohan, & Srinivasa Gopal, 2015), in packaging systems
that allow the overall stability of produce to be monitored during
transport or storage (Lee, Lee, Choi, & Hur, 2015). Intelligent
systems provide information that can be used to extend the shelf life
of food products. They can be made of biodegradable, and therefore
low-cost, films, which can minimise wastage and also contribute to
environmental sustainability (Sousa-Gallagher et al., 2016).
* Corresponding author.
E-mail address: opara@sun.ac.za (U.L. Opara).
Available online at www.sciencedirect.com
ScienceDirect
journal homepage: www.elsevier.com/locate/issn/15375110
Biosystems Engineering 189 (2020) 60-83
https://doi.org/10.1016/j.biosystemseng.2019.11.011
1537-5110/© 2019 IAgrE. Published by Elsevier Ltd. All rights reserved.
Machine learning (ML) methods are an integral part of the
development of many sensing technologies (Cui, Ling, Zhu, &
Keener, 2018), responsible for retrieval of information, signal
processing and analysis of data acquired by most sensors
(Cui et al., 2018; Markom et al., 2009; Xu et al., 2016). They
have been shown to overcome the limitations of the classical
computing paradigm in cases such as classification and
defect detection in various types of fruit using computer
vision (Gill, Sandhu, & Singh, 2014; Khoje & Bodhe, 2013). As
pointed out by Gill et al. (2014), soft computing models are the
enablers of the future use of computer vision based
non-destructive studies in fruit. Researchers have repeatedly
emphasised the need to improve modelling performance by using
advanced feature extraction techniques such as histogram-based
feature extraction, grey-level co-occurrence matrix (GLCM) and/or
wavelet-based features. ML methods that could address many
challenges pertaining to predictive modelling of biosystems have
also been proposed; they include neural networks (NNs) and least
squares support vector machines (LS-SVM), among others (Baiano,
Terracone, Peri, & Romaniello, 2012; Baietto & Wilson, 2015).
Deep learning, a sub-field of machine learning, is a powerful tool
widely used in research fields such as diagnosing medical
abnormalities (Esteva et al., 2017) and defect detection in civil
engineering (Cha, Choi, & Büyüköztürk, 2017), yet it is hardly used
in agricultural technologies and even less so in the horticultural
industry (Wang, Hu, & Zhai, 2018). However, there have been recent
agricultural applications of convolutional neural networks (CNNs)
in image classification (leaf picking) by robotic systems (Ahlin,
Joffe, Hu, McMurray, & Sadegh, 2016) and in fruit detection,
counting and segmentation (Bargoti & Underwood, 2017; Chen et al.,
2017; Sa, Ge, Dayoub, Upcroft, Perez, & McCool, 2016). Deep learning
has been reported to integrate feature extraction in a way that
results in superior performance over conventional image processing
methods in many vision tasks (Girshick, Donahue, Darrell, & Malik,
2014) and is therefore a potential candidate for enhancing
performance in defect detection systems. Previous reviews of machine
learning applications in the food industry have focussed on
specific sensors (Du & Sun, 2006), infield usage and sensor
fusion (Srivastava & Sadistap, 2018), a specific commodity
Nomenclature
ANN Artificial neural network
AUC Area under curve
BPNN Back propagation neural network
BSR Basal stem rot
CA Clustering analysis
CCD Charge-coupled device
CFS Correlation-based feature subset selection
ChiS Chi square
CNN Convolutional neural network
CT Computer tomography
CV Computer vision
DL Deep learning
DS Direct standardisation
DT Decision tree
ELM Extreme learning machine
FURIA Fuzzy unordered rule induction algorithm
GA Genetic algorithm
GIA Gini impurity algorithm
GLCM Grey-level co-occurrence matrix
GPU Graphical processing unit
HSI Hyperspectral imaging
IG Information gain
k-NN K - nearest neighbour
LDA Linear discriminant analysis
LINE A liblinear classifier
LOG Linear logistic regression
LR Linear regression
LS-SVM Least squares support vector machine
LVQN Learning vector quantization network
ML Machine learning
MLPNN Multilayer perceptron neural network
MNF Minimum noise fraction
MR Magnetic resonance
mRMR Minimum redundancy maximum relevance
MSI Multispectral imaging
NB Naïve Bayesian method
NBC Naïve Bayes classifier
ND Non-destructive
NIR Near infrared
NNs Neural networks
NNC Nearest-neighbour classifier
PCA Principal component analysis
PDS Piecewise direct standardisation
PLS Partial least squares
PLS-DA Partial least squares discriminant analysis
PLSR Partial least squares regression
ReCNN Region based convolutional neural network
RF Random forest
RMSEP Root mean square error of prediction
SIRI Structure illumination reflectance imaging
SFS Sequential forward selection
SLOG Simple logistic
SLR Sparse logistic regression
SMO Sequential minimal optimisation
SOM Self-organising maps
SPA Successive projection algorithm
SSAE Stacked sparse auto-encoder
SVM Support vector machine
SWIR Shortwave infrared
SWNIR Shortwave near-infrared
VGG Visual geometry group
Vis Visible spectrum
WSA Watershed segmentation algorithm
ZF Zeiler Fergus
ø Basis function for a neural network
(Lu, 2017) or on various quality features assessment
(Hameed, Chai, & Rassau, 2018; Ropodi, Panagou, & Nychas,
2016).
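As an illustration of the grey-level co-occurrence matrix (GLCM) features mentioned in this section, the following is a minimal sketch only, not the procedure of any cited study. The function names `glcm` and `contrast`, the 4x4 patch and its four grey levels are hypothetical choices made for the example; only one pixel offset (one step to the right) is counted, and the matrix is not made symmetric.

```python
def glcm(image, levels, dx=1, dy=0):
    """Count co-occurrences of grey levels (i, j) at pixel offset (dx, dy)."""
    matrix = [[0] * levels for _ in range(levels)]
    rows, cols = len(image), len(image[0])
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            # Only count pairs whose second pixel lies inside the image
            if 0 <= r2 < rows and 0 <= c2 < cols:
                matrix[image[r][c]][image[r2][c2]] += 1
    return matrix


def contrast(matrix):
    """Contrast feature: sum of p(i, j) * (i - j)^2 over the normalised GLCM."""
    total = sum(sum(row) for row in matrix)
    n = len(matrix)
    weighted = sum(matrix[i][j] * (i - j) ** 2 for i in range(n) for j in range(n))
    return weighted / total


# Hypothetical 4x4 image patch quantised to 4 grey levels (0..3)
patch = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 2, 2, 2],
    [2, 2, 3, 3],
]
m = glcm(patch, levels=4)
c = contrast(m)
```

A smooth surface region gives a GLCM concentrated on the diagonal and a low contrast value, whereas a textured or damaged region spreads counts away from the diagonal; features of this kind are then passed to a classifier.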
In this review, the focus is on the application of ML in
solving the existing issues in non-destructive detection of
defects in fruit and vegetables. We explore the role it has
played in enabling non-destructive techniques for horticul-
tural quality assessment, especially in defect detection, and
we pinpoint the hurdles that researchers are still trying to
overcome and discuss future directions for research and
applications.
2. Defect and detection
2.1. Types of defect
Fruits and vegetables are prone to defects arising from pre-harvest
practices, postharvest handling and storage conditions, which may
lead to various losses throughout the food chain. The diagram in
Fig. 1 depicts the common types of defects encountered in fruit and
vegetables.
Pathological disorders are associated with attacks by viruses,
fungi, bacteria or other microbial pathogens that in time can lead
to fruit spoilage or decay (Fourie, 2008). Many disorders of a
pathological nature exist, and their manifestations in agricultural
products may be visually similar regardless of the type of infection
or product (Barbedo, 2016). Thus, the ability to detect the infecting
agent and/or the associated chemical reactions helps identify the
cause and accurately determine the specific disorder (Ray et al.,
2017).
Excessive external forces in the form of compression or
impact cause mechanical damage to agricultural products.
This results in tissue failure, pigment deterioration and
metabolic changes in affected areas. It increases the vulner-
ability of the product to infections and reduces its shelf life.
Mechanical damage can occur during growth on the tree due to
environmental factors or during and after harvest due to
human or machine handling (Hussein, Fawole, & Opara, 2018;
Li & Thomas, 2014).
Physiological stresses related to nutrition, temperature,
respiration at various developmental stages and during
storage can lead to disorders such as bitter pit, watercore,
mealiness, sunburn, browning, superficial scald, granulation
and internal drying, among others (Herremans et al., 2013;
2014; Magwaza et al., 2012). Fruit with such physiological
disorders have a lower commercial value (van Dael et al.,
2016).
Morphological disorders manifest themselves as deformations that
give a product an ‘abnormal shape’. Though such deformations may not
affect the compositional properties of a product, they complicate
some object and defect detection tasks, especially those using
computer vision, whereby shadows due to irregular surface curvature
may be wrongly identified as defects of similar appearance (Anyasi,
Jideani, & Mchau, 2015; Moallem, Serajoddin, & Pourghassem, 2017).
Internal defects encompass all latent disorders and damage that may
be pathological, physiological or an early development of mechanical
damage. The ability to detect such latent defects is of high
importance along the food chain; it provides a way of sorting
high-quality, disease-free fresh produce for the market and of
preventing the spread of disease, possible food losses and consumer
dissatisfaction (van Dael et al., 2016; Van Dael, Verboven, Zanella,
Sijbers, & Nicolai, 2019; Moggia et al., 2015; Raghavendra & Rao,
2016).
2.2. Techniques for defect detection in plant material
Defects may be latent and internal or externally visible;
therefore, detection methods may differ from one case to
another, depending on the nature of the defect. The choice
of instrumentation may also depend on the context (e.g.
research, industrial) and the commodity investigated.
Defect detection has three overall outcomes: ensuring
consistently high-quality products for the consumer,
enhancing profitability for the industry and reducing food
losses (Lu, 2017).
Many non-destructive techniques are in use for objective
detection of defects in plant material. They include optical
detection (Tischler, Thiessen, & Hartung, 2018), thermal im-
aging (Kim, Kim, Park, Kim, & Cho, 2014), structured illumi-
nation (Lu, Li, & Lu, 2016; Lu & Lu, 2018), electrical
spectroscopy (Khaled, Abd Aziz, Bejo, Nawi, & Abu Seman,
2018), electronic nose (Cui et al., 2018), infrared spectros-
copy, hyperspectral imaging (Che et al., 2018), magnetic
resonance imaging, X-rays and various biological sensing
techniques (Ruiz-Altisent et al., 2010). Thermal imaging is
based on measuring infrared radiation emanating from an
object (Van Linden, Vereycken, Bravo, Ramon, & De
Baerdemaeker, 2003). It has potential for detecting bruises in
fruit since bruised areas hold a different temperature
compared to healthy tissue, resulting in a contrasted response
in the radiation detected by a thermal camera. Electrical
spectroscopy is based on the measurement of electrical
properties of material such as dissipation factor, impedance,
dielectric constant and capacitance (Khaled et al., 2018).
Rather than using uniform lighting, structured illumination uses
spatially patterned (e.g. sinusoidally modulated) lighting to image
food products, which makes it capable of depth-resolved and
topographic imaging (Lu, 2017). Table 1 summarises some of the most
recent non-destructive techniques that were used in tandem with ML
for detecting defects in plant material. Tischler et al. (2018) used
optical measurements (computer assisted fluorometer, ‘MultiDetExc’)
to detect ‘brown rust’ in wheat at an early stage of infection. The
method consisted of measuring, over time, the fluorescence of
optically excited chlorophyll (at discrete wavelengths) in wheat
plants that were artificially infected with this fungal disease.
The system was reported to be unaffected by daylight, relatively
rapid and less invasive than its competitors (Tischler et al.,
2018). Mehl, Chen, Kim, and Chan (2004) developed a hyperspectral
imaging system to detect surface defects such as bruises, side rots,
flyspecks, scabs and moulds, fungal diseases (such as black pox),
and soil contamination in apple fruits. The system consisted of a
sample illumination system and a charge-coupled device (CCD) camera
to record the image from reflected and filtered light from the fruit
samples (Mehl et al., 2004). Hyperspectral imaging, like other
vibrational spectroscopy techniques, exploits molecular vibrations
excited by interaction with electromagnetic radiation. It is an
attractive technique because it offers both spectroscopic and
imaging aspects and thus enables the simultaneous acquisition of
spectral and spatial information from an object for a comprehensive
analysis of the material. It has been increasingly applied to the
study of quality in food and agricultural products (Lu, Huang, & Lu,
2017).

Fig. 1 - Common types of defects encountered in horticultural
products.
2.3. Defect detection challenges
Pre-harvest practices, harvest quality, the genetic predispo-
sition of crops and postharvest storage conditions all play an
important role in determining various fruit properties and
quality conditions such as fruit physical features (shape, size,
deformations, disorders) and resistance to disease attack (De
Groote, 2012; Hussein et al., 2018; Ray et al., 2017).
Postharvest handling (harvesting methods, transport and
packaging) of fresh fruit is likely to inflict mechanical damage.
A recent review summarised methods for measuring and indexing the
potential for bruise damage to produce under mechanical loading,
and suggested ways to prevent bruise occurrence through pre- and
postharvest handling practices (Opara & Pathare, 2014). When such
practices are not enforced, which is common in developing
countries, damage and disorders may occur that increase
susceptibility to spoilage and may result in economic losses. These
losses could be reduced by grading damaged fruit based on accurate
determination of damage severity, both internal and external.
An objective method for this purpose is required, but has not yet
been developed and remains a challenge for research into food
safety (Li & Thomas, 2014).
Another challenge emanates from the nature of defects
and how well the link to their cause is understood. For
example, structural, cell and tissue damage in fruit (Jiménez,
Rallo, Rapoport, & Suárez, 2016) may lead to increased decay
and is common in inhomogeneous fruit such as tomato and
kiwifruit. It has direct implications for food safety and quality;
however, it has received little attention in research. Currently,
internal damage can only be visualised destructively. Li and
Thomas (2014) speculated that, based on a relationship between
internal and external damage, if any existed, one could
use absorbed energy or peak contact force as a representative
measure of internal damage. Validating predictions of internal
damage from associated surface damage (as the area of
damaged exocarp), using methods such as those in Idah, Ajisegiri,
& Yisa (2007) and Van Zeebroeck et al. (2007), would however be
required. According to Li and Thomas (2014), in order to fully
understand the dynamics between handling and associated
damage, logistic regression modelling could be complemented by
technologies such as X-rays, hyperspectral imaging, magnetic
resonance or ultrasonic techniques.

Table 1 - Various sensing techniques used with ML for defect detection in recent years.
Technique | Example of study | Parameters | Reference
Electrical spectroscopy | Classification of diseased oil palm leaves | Impedance, dielectric constant, capacitance, dissipation factor | Khaled et al. (2018)
Thermography | Mechanical damage detection and estimation | Infrared radiation emitted by a heated object | Kim et al. (2014)
Structured illumination | Bruise detection in apples | ‘’ | Lu and Lu (2018)
Hyperspectral imaging (HSI) | Diverse | Molecular vibrational frequencies | Wu and Sun (2013)
‘’ | Physical damage in pear | ‘’ | Lee et al. (2014)
‘’ | Early bruises in peaches | ‘’ | Li et al. (2018)
‘’ | Diverse | ‘’ | Lu et al. (2017)
Shortwave infrared (SWIR) HSI | Bruise detection in apples | ‘’ | Keresztes et al. (2016)
Machine vision | Hidden insect infestation | Electromagnetic emission (visible range) | Moradi (2011); Okamoto (2013); Lu and Ariana (2013)
Magnetic resonance imaging | ‘’ | Relaxation in spin resonance of atomic nuclei | Haishi, Koizumi, Arai, Koizumi, and Kano (2011)
X-ray imaging | ‘’ | Contrasted attenuation of transmitted X-rays | Chuang et al. (2011)
Acoustic | ‘’ | Change in sonic vibration recorded from the emitting source | Hetzroni, Soroker, and Cohen (2016); Mankin, Hagstrum, Smith, Roda, and Kairo (2011); Potamitis and Ganchev (2009)
Gas chromatography | ‘’ | Quantitation of chemical volatile components | Kendra et al. (2011)
‘’: same as above.
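The logistic regression modelling suggested by Li and Thomas (2014) can be illustrated with a minimal sketch. It is an assumption-laden example rather than their method: the data are invented, the single predictor stands in for a standardised peak contact force, and the helper names `fit_logistic` and `predict_proba`, the learning rate and the number of iterations are all hypothetical.

```python
import math


def predict_proba(x, w, b):
    """Probability of damage for predictor value x under the fitted model."""
    return 1.0 / (1.0 + math.exp(-(w * x + b)))


def fit_logistic(xs, ys, lr=0.1, epochs=5000):
    """Fit p(damage | x) = sigmoid(w * x + b) by batch gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in zip(xs, ys):
            error = predict_proba(x, w, b) - y  # gradient of the log-loss
            grad_w += error * x
            grad_b += error
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


# Hypothetical data: standardised peak contact force per fruit,
# with 1 meaning internal damage was observed after cutting the fruit open
force = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]
damaged = [0, 0, 0, 1, 0, 1, 1]

w, b = fit_logistic(force, damaged)
p_low = predict_proba(-1.5, w, b)   # low-force handling event
p_high = predict_proba(1.5, w, b)   # high-force handling event
```

In this sketch the fitted probability rises with the force value, which is the kind of relationship that would then have to be validated against destructive or imaging-based measurements of internal damage.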
Thus, the important aspects of damage detection in
horticultural products that are still problematic include:
- Detection and determination of the extent of internal
defects (Li & Thomas, 2014);
- Early detection (Lu, 2017);
- Objective, quantitative evaluation of mechanical damage
(Li & Thomas, 2014; Opara & Pathare, 2014); and
- Fast detection of defects for industrial applications such
as sorting and grading systems and portable infield tests
(Abasi, Minaei, Jamshidi, & Fathi, 2018).
Research in the area of postharvest non-destructive quality
assessment has aimed at finding solutions to achieve objectives
such as those mentioned above. Some techniques that
have been used to achieve these goals are shown in Fig. 2.
Recent reviews that dealt with defect detection methods,
highlighting the progress and exposing the gaps, are summarised
in Table 2.
A few approaches and technical solutions have been proposed
in the past, among which nuclear magnetic resonance is
considered the most prominent technique; it can quantitatively
assess internal and external damage (Zhao, Men, Liu,
Wu, & Yan, 2016). However, magnetic resonance (MR) systems
are costly, require high expertise to operate and have a low
speed of measurement, and their relatively low-cost, low-field
versions still lack the specialised customisation needed for
practical applications. The implementation of MR systems on
sorting lines is also problematic since these lines are
generally made of metals with high magnetic susceptibility, which
would disrupt the stability of measurement fields in MR systems;
MR is thus not yet fit for industrial application.
Another very promising technology is NIR-based spectroscopy
and imaging which, with adequate feature selection, has
been reported to be a convenient option for online sorting
(Stella et al., 2015). However, NIR use is restricted to a limited
number of attributes (Lakshmi et al., 2017); an idea worth
exploring is the fusion of data generated concomitantly by
different devices so that each compensates for the limitations
of the others.
3. ML methods used in ND techniques
ML is a branch of computer intelligence that aims to study and
build algorithms that can learn from and make predictions on
data. The goal is for computers to continuously improve their
performance on a specific task by making data-driven predictions
or decisions. The underlying assumption is that the data we
observe are generated by a process that is not completely random.
ML aims to find a rule that explains the data based on a data
sample of limited size (Hsieh, 2009). In the context of this review,
the term data refers to empirical data unless explicitly stated
otherwise. Empirical data are acquired experimentally through a
measurement process as part of scientific inquiry. In defect
detection of horticultural products using non-destructive
techniques, such data are acquired in the form of images (2- or
3-dimensional), continuous spectral information in the time or
frequency domain, or discrete values of numerical or character
type.
One sub-field of ML that is also extensively used in horti-
cultural quality assessment is that of ‘pattern recognition’. It
deals with the automatic discovery of regularities in data by
means of computer algorithms and the use of such regular-
ities in tasks like categorization (Bishop, 2006). The sub-
divisions of ML are given by the chart in Fig. 3.
Typical applications of ML in defect detection of horticultural
products encompass classification and regression.
Classification techniques predict discrete responses: the
models are built to classify data into categories. Regression
techniques predict continuous responses, such as forecasts of
temporal changes in a given time-dependent characteristic.
Many learning algorithms have been used for assessing properties
of horticultural products, including defect detection, some
being more popular than others depending on the learning task.
Different learning methods and their dedicated uses are
summarised in Table 3, and more details on some of
the popular algorithms are provided below.
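The difference between the two learning tasks can be shown with a minimal sketch, given here under stated assumptions: the feature values, class labels and firmness readings are invented, and the helper names `nearest_neighbour_label` and `fit_line` are hypothetical. A nearest-neighbour rule stands in for a classifier (discrete response) and an ordinary least-squares line stands in for a regression model (continuous response).

```python
def nearest_neighbour_label(sample, train_x, train_y):
    """Classification: return the label of the closest training sample (1-NN)."""
    def sq_dist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))

    best = min(range(len(train_x)), key=lambda i: sq_dist(sample, train_x[i]))
    return train_y[best]


def fit_line(xs, ys):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Hypothetical two-feature measurements (e.g. reflectance at two wavelengths)
train_x = [[0.9, 0.8], [0.85, 0.9], [0.4, 0.3], [0.35, 0.45]]
train_y = ["sound", "sound", "bruised", "bruised"]
label = nearest_neighbour_label([0.5, 0.4], train_x, train_y)  # discrete output

# Hypothetical firmness readings over storage days
days = [0.0, 1.0, 2.0, 3.0]
firmness = [10.0, 9.0, 8.0, 7.0]
slope, intercept = fit_line(days, firmness)
predicted_day4 = slope * 4.0 + intercept  # continuous output
```

The classifier returns one of a fixed set of categories, while the regression model returns a number on a continuous scale, here a forecast of a time-dependent characteristic.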
3.1. Artificial neural network
Artificial neural networks (ANNs) are designed to mimic the
function of a human brain based on models of biological
neurons (Jamshidi, 2003). An ANN consists of a number of
interconnected neurons (parallel processing units arranged in
input, hidden and output layers, see Fig. 4(A)), which in turn
comprise weights, thresholds and an activation function
(Khaled et al., 2018). With high-dimensional data, models for
regression and classification that are built on linear
combinations of fixed basis functions become ineffective, and the
basis functions therefore need to be adapted to the data. Neural
networks have been shown to be effective in such pattern
recognition situations, and the feed-forward neural network is
considered to be the most successful (Bishop, 2006).
Neural network models follow the form of Eq. (1), which is built
on a linear combination of nonlinear basis functions øj(x).
Fig. 2 - Main challenges of defect detection in horticultural
products.
Table 2 - Recent reviews on various techniques for detecting defects in fruits and vegetables.

Topic: Detecting apple defects by non-destructive spectroscopy and imaging
Summary: Overview of common defects in apples, current status and prospects of their detection techniques.
Knowledge gap:
- Further research is needed to improve existing techniques and explore new, emerging techniques for more effective detection of both external and internal defects in apples;
- More research on the development of rapid, low-cost X-ray imaging and MRI sensing systems;
- Further effort toward improving the hardware and software for hyperspectral imaging, for more efficient image acquisition and processing, to enable automated on-line sorting and grading.
Reference: Lu (2017)

Topic: Pre-harvest factors influencing damage
Summary: Understanding factors that influence bruise susceptibility.
Knowledge gap:
- Reduce the phenomenon of bruise occurrence;
- Manipulation of preharvest factors to influence bruise resistance?
- Some factors are not widely researched; more study can shed light on them.
Reference: Hussein et al. (2018)

Topic: Biosensors for sustainable food engineering
Summary: Five challenges for food sustainability: the role of biosensors in addressing them.
Knowledge gap:
- Production challenge about food safety and security;
- Quality challenge in food diversity and qualities;
- Economic challenge in governing the food system, including its packaging and supply chain;
- Environmental challenge including food waste processing;
- Explore RFID sensors in smart packaging;
- Graphene-based bio-sensing: superior optical, electric, thermal, mechanical and chemical properties.
Reference: Neethirajan, Ragavan, Weng, and Chand (2018)

Topic: Quantitative measurements of mechanical damage in fruits
Summary: Objective and quantitative assessment of damage to enable grading.
Knowledge gap:
- NMR potential for internal damage in fruit to achieve transfer to the supply chain;
- Detect internal from external damage? → proposed methods: absorbed energy or peak contact force as surrogate measure of damage;
- Objectively speaking, how does handling cause bruising? Possible solution: logistic regression + spectroscopy, NMR, HSI, X-rays, ultrasonic techniques;
- Multiscale FME modelling: check consistency of regression models linking bruising and mechanical parameters;
- Cell and tissue damage can also lead to food safety and quality issues: should be investigated → microscopy studies.
Reference: Li and Thomas (2014)

Topic: Techniques for measurement of bruise damage
Summary: Indexing for bruise potential, methods for bruise measurement; suggested ways to prevent bruise occurrence through pre- and postharvest handling practices.
Knowledge gap:
- Standardization of bruise assessment criteria, measurement and analytical techniques to improve the traceability and transferability of bruise measurement and to permit inter-laboratory comparisons;
- Bruise susceptibility studies are very helpful in preventing damage during handling operations; effective prevention is only possible when the factors responsible for bruise development are known;
- To reduce impact damage, fruit acceleration and deceleration must be carefully controlled;
- Need for “in-depth studies to investigate and predict the effects of bruising on nutritional and flavour quality.”
Reference: Opara and Pathare (2014)

Topic: Plant pest detection using an artificial nose system
Summary: Promising in quick and early non-invasive diagnosis of insect damage, bacterial, fungal and viral infection in plant tissue.
Knowledge gap: Challenges with sensor performance, environment suitability for sampling and detection, selectivity and scaling up.
Reference: Cui et al. (2018)

Topic: Non-destructive detection for fruit quality
Summary: Detectable defects:
1. Internal damage
2. Physical damage
3. Decay
4. Insect damage
5. Frost injury
Knowledge gap: Tested or prominent detection techniques (numbered as in the summary):
1. Sonic vibration, X-ray, MRI, laser inspection
2. Optical absorbance, electrical properties, HSI, NIRS
3. NMR, electrical properties, HSI
4. NMR
5. NMR, HSI
In all, improvements are needed to meet routine application.
Reference: Gao et al. (2010)
$$y(x, w) = f\left(\sum_{j=1}^{L} w_j \phi_j(x)\right) \tag{1}$$

where f(.) is a nonlinear activation function in classification or
the identity in regression, $w_j$ are coefficients, and the basis
functions $\phi_j(x)$ (ø in the Nomenclature) are made dependent on
parameters that, like the coefficients, are adjustable during
training. A basic neural network model is therefore described as a
series of functional transformations, starting with Eq. (2), whereby
each basis function is a nonlinear function of a linear combination
of the inputs, and the coefficients of that combination are adaptive
parameters.
For input variables $x_1, \ldots, x_D$ we build L linear combinations
such that the activations $a_j$ are given by

$$a_j = \sum_{i=1}^{D} w_{ji}^{(l1)} x_i + w_{j0}^{(l1)} \tag{2}$$

where l1 indicates the first layer of the network, the parameters
$w_{ji}^{(l1)}$ and $w_{j0}^{(l1)}$ are weights and biases,
respectively, and j = 1, …, L.
The outputs of the basis functions in Eq. (1), denoted by $z_j$ and
also referred to as hidden units, are obtained by transforming the
activations with a differentiable, nonlinear activation function
h(.) (generally a sigmoidal function), such that

$$z_j = h(a_j) \tag{3}$$

Similarly, the output unit activations are given by

$$a_k = \sum_{j=1}^{L} w_{kj}^{(l2)} z_j + w_{k0}^{(l2)} \tag{4}$$

for k = 1, …, K, where K is the total number of outputs, l2 indicates
the second layer, and the sum runs over the L hidden units. An
appropriate transform is then applied to the activations $a_k$ to
produce the outputs $y_k$.
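The chain of transformations in Eqs. (2) to (4) can be sketched as a forward pass through a network with one hidden layer. This is a minimal illustration under stated assumptions: the weights and input values are invented, the function names `sigmoid` and `forward` are hypothetical, and the logistic sigmoid is assumed for h(.) as one common sigmoidal choice.

```python
import math


def sigmoid(a):
    """Logistic sigmoid, one common differentiable choice for h(.)."""
    return 1.0 / (1.0 + math.exp(-a))


def forward(x, w1, b1, w2, b2, f=sigmoid):
    """Forward pass of a network with one hidden layer.

    x  : list of D inputs
    w1 : L rows of D first-layer weights, b1 : L first-layer biases
    w2 : K rows of L second-layer weights, b2 : K second-layer biases
    f  : output transform (sigmoid for classification, identity for regression)
    """
    # Eq. (2): hidden activations a_j as linear combinations of the inputs
    a_hidden = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(w1, b1)]
    # Eq. (3): hidden units z_j = h(a_j)
    z = [sigmoid(a) for a in a_hidden]
    # Eq. (4): output activations a_k as linear combinations of the hidden units
    a_out = [sum(w * zj for w, zj in zip(row, z)) + b for row, b in zip(w2, b2)]
    # Final transform producing the outputs y_k
    return [f(a) for a in a_out]


# Hypothetical weights: D = 2 inputs, L = 2 hidden units, K = 1 output
w1 = [[0.5, -0.5], [1.0, 1.0]]
b1 = [0.0, 0.0]
w2 = [[1.0, -1.0]]
b2 = [0.0]

y_class = forward([0.0, 0.0], w1, b1, w2, b2)                # sigmoid output
y_reg = forward([0.0, 0.0], w1, b1, w2, b2, f=lambda a: a)   # identity output
```

Training consists of adjusting `w1`, `b1`, `w2` and `b2` so that the outputs match the labelled examples; the sketch covers only the prediction step.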
ANNs are known to be adaptable in learning and good at
generalisation and noise tolerance. Like other supervised methods,
they require a large sample set for training, but they provide
more robust algorithms and higher accuracy than unsupervised
methods. Nonetheless, they tend to over-fit data, and a trained
classifier is difficult to interpret, which is inherent in the
empirical nature of the modelling; a trained neural network has
the characteristics of a ‘black box’ (Cui et al., 2018). ANNs have
been used with electronic nose systems for accurate quantitative
analysis, in detecting diseases (Markom et al., 2009), in
classification of hyperspectral images of damaged mushrooms
(Rojas-Moraleda, Valous, & Gowen, 2017), with dielectric
spectroscopy (Khaled
Table 2 - (continued)

Topic: ND methods for detection of insect infestation in fruit and vegetables
Summary: The methods have included fluorescence and visible-IR spectroscopy; hyperspectral, X-ray, thermal and MR imaging; and acoustic and chemical emission detection.
Knowledge gap: Future directions:
- Reduce the effect of background data in the resulting profile or data;
- Optimise techniques for a specific fruit and insect;
- Automation of techniques for continuous monitoring of insect infestation under real conditions;
- Integrated and simultaneous use of different methods to achieve higher detection accuracy and insect infection management.
Reference: Ekramirad, Adedeji, and Alimardani (2016)
Fig. 3 - Main categories of ML; adapted from Hsieh (2009).
et al., 2018) and some other food quality related applications
(Du & Sun, 2006; Gandhi & Armstrong, 2016).
A more recent formulation of neural networks that has been
especially successful in learning applications aimed at pattern
recognition in images is the convolutional neural network (CNN),
which implements so-called ‘deep learning’ (DL), a subset of ML.
The particularity of DL networks is that they have more complex
ways of interconnecting layers, have more nodes and are capable of
automatic feature extraction; however, training them requires
more computational power than conventional neural networks. In
addition to CNNs, the main architectures of DL networks include
recurrent neural networks, recursive neural
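Table 3 - Common ML functions.
Learning type | Algorithm | Learning task
Supervised learning | Decision tree | Classification, regression
Supervised learning | Bagged and boosted decision trees | Classification
Supervised learning | Generalised linear model | Regression
Supervised learning | Support vector machine | Classification, regression
Supervised learning | Gaussian kernel | Classification, regression
Supervised learning | Ensembles | Classification, regression
Supervised learning | Logistic regression | Classification
Supervised learning | K-nearest neighbour | Classification
Supervised learning | Discriminant analysis | Classification
Supervised learning | Neural network | Classification
Supervised learning | Naïve Bayes | Classification
Supervised learning | Gaussian process regression model | Regression
Supervised learning | Nonlinear regression | Regression
Supervised learning | Genetic linear regression | Regression
Unsupervised learning | k-Means | Hard clustering
Unsupervised learning | k-Medoids | Hard clustering
Unsupervised learning | Hierarchical clustering | Hard clustering
Unsupervised learning | Self-organising map | Hard clustering
Unsupervised learning | Fuzzy c-means | Soft clustering
Unsupervised learning | Gaussian mixture model | Soft clustering
Unsupervised learning | Principal component analysis | Dimensionality reduction
Unsupervised learning | Factor analysis | Dimensionality reduction
Unsupervised learning | Nonnegative matrix factorisation | Dimensionality reduction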
Fig. 4 - A schematic illustration of an ANN with two hidden layers (A); adapted from Acquarelli et al. (2017) and a CNN (B) for
a vision problem (object detection); adapted from Voulodimos, Doulamis, Doulamis, & Protopapadakis (2018).
networks and unsupervised pre-trained networks (Patrício &
Rieder, 2018). Though DL is increasingly being used in
vision-based implementations for autonomous vehicles and for
artificial intelligence, as well as in aspects of signal processing
(Marchi, Ferroni, Eyben, Gabrielli, & Squartini, 2014), character
recognition (Breuel, Ul-hasan, Al-azawi, & Shafait, 2013),
language identification (Sak, Senior, & Beaufays, 2014) and
translation (Sutskever, Vinyals, & Le, 2014), it has rarely been used in
imaging systems for agricultural applications (S. Naik & Patel,
2017). Nonetheless, there are a few cases where DL has been
successfully used for the quality evaluation of agricultural
products (Ferentinos, 2018; Fuentes, Yoon, Lee, & Park, 2018;
Grinblat, Uzal, Larese, & Granitto, 2016; Mohanty, Hughes, &
Salathé, 2016; Picon et al., 2019).
CNN algorithms draw inspiration from biological vision
processes in the visual cortex of animals, where vision cells are
sensitive to small sub-regions of the visual field (Acquarelli,
Laarhoven, Gerretzen, & Tran, 2017). CNNs exploit the property
that many natural signals can be decomposed hierarchically, such
that higher-level features are obtained by composing lower-level
ones. For example, in images, objects are composed of parts,
which in turn are made of motifs, which are themselves formed by
local combinations of edges. Similar hierarchies can be found in
speech and text (Lecun, Bengio, & Hinton, 2015).
Some of the CNN architectures found in the literature include
region-based CNN (R-CNN), Fast and Faster R-CNN (Sa et al.,
2016), ResNet (He, Zhang, Ren, & Sun, 2016), VGG Net
(Simonyan & Zisserman, 2015), ZF Net (Zeiler & Fergus, 2014),
GoogLeNet (Mohanty et al., 2016), AlexNet (Jiang et al., 2019) and
LeNet-5 (Kirk & Wen-Mei, 2016). The structure of a typical CNN is
a series of stages, the first of which consist of convolutional and
pooling layers: the former detect local conjunctions of features
from the previous layer, while the latter merge semantically
similar features into one (Lecun et al., 2015). Generally speaking, a
typical DL network is made up of an input layer where the input
is a feature set, a number of stacked stages of convolution,
non-linearity and pooling, further convolutional and fully connected
layers, and the output layer (see Fig. 4(B)).
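The stacked stage of convolution, non-linearity and pooling described above can be illustrated with a minimal pure-Python sketch. It is not taken from any of the cited works: the toy image, the hand-set edge kernel and the function names are assumptions chosen for illustration, and in a real CNN the kernel weights would be learned from data.

```python
def conv2d(image, kernel):
    """'Valid' 2-D convolution (cross-correlation) of a 2-D list with a small kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]


def relu(fmap):
    """Non-linearity: negative responses are set to zero."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool(fmap, size=2):
    """Non-overlapping max pooling; incomplete border windows are dropped."""
    return [[max(fmap[r + i][c + j] for i in range(size) for j in range(size))
             for c in range(0, len(fmap[0]) - size + 1, size)]
            for r in range(0, len(fmap) - size + 1, size)]


# Toy 5x5 "image": dark on the left, bright on the right (a vertical edge).
image = [[0, 0, 1, 1, 1] for _ in range(5)]
# Hypothetical hand-set kernel that responds to dark-to-bright vertical edges.
kernel = [[-1, 1],
          [-1, 1]]

features = max_pool(relu(conv2d(image, kernel)))
```

The pooled map keeps a strong response only where the edge lies, which is the sense in which pooling merges neighbouring, semantically similar responses into one.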
3.2. Fuzzy logic
Fuzzy logic simulates the human ability to make complex decisions
from uncertain and vague information.
It has proven to be a valuable tool in dealing with incomplete
and/or ambiguous information in classification problems,
including the grading of fruit using computer vision systems
(Shahin, Tollner, & McClendon, 2001). However, it requires
tuning to perform well, which can be problematic in
problems dealing with high-dimensional data (Du & Sun, 2006).
Fuzzy logic has proven instrumental in control systems for
managing complex production processes of food and beverages.
Instead of representing complex system behaviour by a
quantitative, mathematical expression of the system's transfer
behaviour, fuzzy systems offer the possibility of using simpler
linguistic variables
and algorithmic formulations (Birle, Hussein, & Becker, 2013).
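As a rough illustration of how linguistic variables replace crisp thresholds, the sketch below grades a fruit from a single measurement using overlapping membership functions. The variable (defect area as a percentage of the surface), the three grade names and every breakpoint are hypothetical values chosen for this example, not parameters reported in the cited studies.

```python
def falling(x, lo, hi):
    """Membership that is 1 below `lo`, 0 above `hi`, and linear in between."""
    if x <= lo:
        return 1.0
    if x >= hi:
        return 0.0
    return (hi - x) / (hi - lo)


def rising(x, lo, hi):
    """Membership that is 0 below `lo`, 1 above `hi`, and linear in between."""
    return 1.0 - falling(x, lo, hi)


def triangular(x, a, b, c):
    """Triangular membership: 0 at `a` and `c`, 1 at `b`."""
    return min(rising(x, a, b), falling(x, b, c))


def grade(defect_pct):
    """Return the grade with the highest membership, plus all memberships."""
    memberships = {
        "sound": falling(defect_pct, 2, 10),              # assumed breakpoints
        "mildly damaged": triangular(defect_pct, 5, 15, 25),
        "damaged": rising(defect_pct, 20, 40),
    }
    label = max(memberships, key=memberships.get)
    return label, memberships


label, memberships = grade(8)
```

A fruit with 8% defect area belongs partly to 'sound' and partly to 'mildly damaged'; the final grade is simply the set with the larger membership. Tuning the breakpoints is the step that becomes difficult as the number of input variables grows.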
3.3. Decision trees
Decision trees explain variation of a single response variable by
repeatedly splitting the data into more homogeneous groups,
using combinations of explanatory variables that may be
categorical and/or continuous numeric; the tree is a classification
tree when the response is categorical and a regression tree when
it is numeric (De'ath & Fabricius, 2000). A simple prediction model
is fitted within each data partition. For classification problems,
the quality of each splitting step is measured as the classification
gain, whereas for regression the squared error of prediction is
used. Algorithms for growing trees are widely available and are
summarised in Loh (2011). Decision trees are advantageous in
that they are easy to construct and interpret, they can
handle various response data types such as categorical,
numeric and ratings, and they are able to handle missing data in both
response and independent variables. Separate tree models can
be combined into what is known as a committee of experts in
order to enhance model performance, an approach also known
as ensemble learning. Popular methods of model combination
include bagging and boosting (Moisen, 2008); these have also
been applied to other learning methods such as linear
discriminant analysis (Ashour, Guo, Hawas, & Xu, 2018), neural
networks and partial least squares (Bian, Li, Shao, & Liu, 2016), to
name a few.
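A single splitting step of a classification tree can be sketched as follows. Information gain (reduction in entropy) is used here as one common way of scoring a split; this is an assumption for illustration, since the text does not specify the exact gain measure, and the reflectance values and labels are invented.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (bits) of a list of class labels."""
    n = len(labels)
    return -sum((k / n) * log2(k / n) for k in Counter(labels).values())


def best_split(values, labels):
    """Return (threshold, gain) maximising information gain for `value <= threshold`."""
    parent = entropy(labels)
    best = (None, 0.0)
    xs = sorted(set(values))
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2                      # candidate threshold between neighbours
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        children = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        gain = parent - children
        if gain > best[1]:
            best = (t, gain)
    return best


# Hypothetical single-band reflectance readings and their labels.
reflectance = [0.21, 0.25, 0.30, 0.52, 0.58, 0.61]
labels = ["bruised", "bruised", "bruised", "sound", "sound", "sound"]
threshold, gain = best_split(reflectance, labels)
```

Growing a full tree repeats this search recursively on each resulting partition until the groups are homogeneous enough or a stopping rule is met.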
Bagging is a method for generating multiple versions of a
predictor using bootstrap replicates of the learning data set and
combining them into one to improve accuracy. When predicting
a class, a plurality vote is conducted, whereas an average is
calculated over the predictor versions for a numerical outcome.
Bagging can improve the accuracy of the combined model if the
bootstrap-induced perturbation of the learning data set
introduces significant variability between the predictors within
the aggregation constructed (Breiman, 1996).
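The procedure can be sketched in a few lines: draw bootstrap replicates, fit one predictor per replicate, and combine by plurality vote. The base learner below is a one-feature threshold classifier chosen only to keep the example short, and the data are invented; neither is prescribed by Breiman (1996).

```python
import random
from collections import Counter


def fit_stump(xs, ys):
    """Fit a one-feature threshold classifier that minimises training errors."""
    best = None
    for t in sorted(set(xs)):
        for below, above in ((0, 1), (1, 0)):
            errors = sum((below if x <= t else above) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, below, above)
    _, t, below, above = best
    return lambda x: below if x <= t else above


def bagging(xs, ys, n_models=25, seed=0):
    """Train `n_models` stumps on bootstrap replicates and return a voting predictor."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]   # sample n points with replacement
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))

    def predict(x):
        votes = Counter(model(x) for model in models)
        return votes.most_common(1)[0][0]            # plurality vote

    return predict


# Hypothetical one-dimensional feature with two well-separated classes.
xs = [1.0, 1.5, 2.0, 2.5, 6.0, 6.5, 7.0, 7.5]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
predict = bagging(xs, ys)
```

For a numerical outcome, the vote in `predict` would be replaced by an average over the individual predictions.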
Boosting is a technique for combining multiple classifiers
into a model that performs better than any of the individual
classifiers alone. The base classifiers
are trained in sequence using a weighted form of the data set,
whereby the weighting coefficient for each data point depends
on the performance of the previous classifiers. Once all
classifiers have been trained, the final prediction is obtained by
weighted majority voting. Boosting can give good results even
when the base classifiers are weak learners (learners that perform
only slightly better than random). It can be interpreted as a
sequential optimisation of an additive model with an exponential
error function, which opens possibilities for a range of
boosting-like algorithms
such as extensions to multiclass and regression problems
(Friedman, Hastie, & Tibshirani, 2000). The most widely used
boosting algorithm is AdaBoost (adaptive boosting), which is
described in Freund & Schapire (1999).
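The sequential reweighting and weighted vote can be made concrete with a compact sketch of discrete AdaBoost on a one-dimensional toy problem. The data set, the use of threshold stumps as weak learners and the three boosting rounds are assumptions made for illustration only.

```python
from math import exp, log


def weighted_stump(xs, ys, w):
    """Best threshold stump under sample weights w; returns (error, threshold, polarity)."""
    best = None
    for t in sorted(set(xs)):
        for s in (1, -1):
            err = sum(wi for x, y, wi in zip(xs, ys, w)
                      if (s if x <= t else -s) != y)
            if best is None or err < best[0]:
                best = (err, t, s)
    return best


def adaboost(xs, ys, rounds=3):
    """Discrete AdaBoost with labels in {-1, +1}; returns the combined predictor."""
    n = len(xs)
    w = [1.0 / n] * n
    ensemble = []                                    # (alpha, threshold, polarity)
    for _ in range(rounds):
        err, t, s = weighted_stump(xs, ys, w)
        err = min(max(err, 1e-10), 1 - 1e-10)        # guard against log(0)
        alpha = 0.5 * log((1 - err) / err)           # vote weight of this stump
        ensemble.append((alpha, t, s))
        # Misclassified points gain weight, correctly classified ones lose weight.
        w = [wi * exp(-alpha * y * (s if x <= t else -s))
             for x, y, wi in zip(xs, ys, w)]
        total = sum(w)
        w = [wi / total for wi in w]

    def predict(x):
        score = sum(a * (s if x <= t else -s) for a, t, s in ensemble)
        return 1 if score >= 0 else -1               # weighted majority vote

    return predict


# No single threshold separates this pattern, but three boosted stumps do.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9]
ys = [1, 1, 1, -1, -1, -1, 1, 1, 1]
predict = adaboost(xs, ys)
```

Each stump alone misclassifies a third of the points, yet the weighted combination reproduces all nine labels, which is the behaviour the weak-learner argument describes.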
3.4. Random forest
Random forest (RF) is a supervised ensemble learning method
that is popular in classification and regression.
RF combines a multitude of decision trees at the training
stage, and the mode of the classes predicted by the individual
trees is selected as the output class (Cui et al., 2018). RF is
efficient for large databases and for variable importance
estimation, and the generated forests can be used on future
datasets. During prediction, a new object is classified by passing
its input vector down every tree of the forest and choosing
the class with the majority of votes over all trees of the forest.
Applying a strategy of sampling with replacement (out-of-bag)
ensures an unbiased estimation of classification error and
estimation of feature importance, whereas using randomly
selected inputs or combinations of inputs at each node to grow
each tree results in the most desirable performance
characteristics. The randomness of decision forests can address
multiclass problems with unbalanced datasets and overcomes the
tendency to overfit that is typical of decision trees (Breiman,
2001). Applications of RF in non-destructive studies of food
quality have included digital imaging (Pereira, Barbon, & Valous,
2018), vision systems for object detection and papaya ripeness
estimation (Goel & Sehgal, 2015), and others (Adam, Deng,
Odindi, Abdel-Rahman, & Mutanga, 2017; Knauer et al., 2017).
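The out-of-bag idea can be illustrated without growing any trees. Each tree is trained on a bootstrap sample of n objects drawn with replacement; the objects that were never drawn are 'out-of-bag' for that tree and can serve as its test set. The sketch below, with an arbitrary n and seed, shows that roughly a third of the data is left out of each sample, in line with the limit (1 - 1/n)^n -> 1/e.

```python
import random


def bootstrap_with_oob(n, rng):
    """Draw a bootstrap sample of indices and return it with the out-of-bag indices."""
    in_bag = [rng.randrange(n) for _ in range(n)]        # sampling with replacement
    oob = sorted(set(range(n)) - set(in_bag))            # never drawn for this tree
    return in_bag, oob


rng = random.Random(42)      # arbitrary seed, for repeatability only
n = 1000                     # hypothetical number of training objects
in_bag, oob = bootstrap_with_oob(n, rng)

oob_fraction = len(oob) / n
expected = (1 - 1 / n) ** n  # probability that a given object is never drawn
```

In a full random forest, each object is predicted only by the trees for which it was out-of-bag, and the resulting error rate is the out-of-bag estimate of classification error.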
3.5. Support vector machine
Support vector machine (SVM) is based on structural risk
minimisation from statistical learning theory and is used for
nonlinear regression and classification (Huang, Hung, Lee, Li,
& Jiang, 2014). In a classification problem, the SVM algorithm
aims to find the optimal hyperplane, the one that maximises the
margin between classes, as a decision function. The basic SVM
deals with two-class situations whereby
the hyperplane created for separating the data is defined by a
number of support vectors (the data points nearest to the
hyperplane) (Samanta, Al-Balushi, & Al-Araimi, 2003). SVM is
known for excellent performance in classification and
prediction due to its efficiency at avoiding the overfitting
that is common in modelling high-dimensional data
(Huang et al., 2014).
Training data classes are encoded by “1” and “-1”, or
mathematically represented as $\{(x_i, y_i)\}_{i=1}^{l}$,
$x_i \in \mathbb{R}^n$, $y_i \in \{-1, +1\}$, $i = 1, \ldots, l$,
and the hyperplane is given by:
$$w \cdot x + b = 0$$
where the parameters of the hyperplane are the weight vector
w and the bias, the constant b; x is an input vector. The decision
function f(.) can therefore be denoted as follows:
$$f(x) = \mathrm{sign}(w \cdot x + b)$$
Other kernel-based formulations of SVM can be found in
Huang et al. (2014).
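As a minimal sketch, the decision function and the margin it implies can be evaluated directly for a given hyperplane. The weight vector, bias and data points below are hypothetical and hand-picked; in practice w and b are obtained by solving the SVM optimisation problem, which is not shown here.

```python
def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def decision(w, b, x):
    """f(x) = sign(w . x + b); points on the hyperplane are assigned to +1 here."""
    return 1 if dot(w, x) + b >= 0 else -1


def geometric_margin(w, b, X, y):
    """Smallest signed distance of any training point to the hyperplane."""
    norm = sum(wi * wi for wi in w) ** 0.5
    return min(yi * (dot(w, xi) + b) / norm for xi, yi in zip(X, y))


# Hypothetical, linearly separable two-class data and a hand-picked hyperplane.
X = [[1, 1], [0, 2], [2, 2], [3, 1]]
y = [-1, -1, 1, 1]
w, b = [1.0, 1.0], -3.0

predictions = [decision(w, b, x) for x in X]
margin = geometric_margin(w, b, X, y)
```

In this toy case all four points lie at the same distance from the hyperplane, so all of them act as support vectors; the SVM training procedure searches for the w and b that make this smallest distance as large as possible.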
In multiclass problems, three main approaches combine
multiple two-class SVMs. The first
considers all possible pairs of one class against one other (one
versus one), which, for a given number c of classes,
results in c(c-1)/2 classifiers; the class of a sample is
determined by a voting strategy. In the second approach (one
versus rest), each single class encoded as “1” is classified
against all the remaining c-1 classes
encoded as “-1”, which results in c dual-class training
problems; a decision function is applied to each, and the
maximum value decides the class of a new
unknown sample. The third approach follows the c(c-1)/2
dual-class categorization problem and training is similar to
that in the ‘one versus one’ case. In testing, a two-class
directed acyclic graph is established whereby a sample of
unknown class is tested starting from the root node (Nasrabadi, 2007).
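The 'one versus one' voting scheme can be sketched as follows. The three class names and the threshold rules standing in for trained two-class SVMs are hypothetical; only the pairing and voting logic is the point of the example.

```python
from collections import Counter
from itertools import combinations


def one_vs_one_predict(classes, binary_classifiers, x):
    """Let every pairwise classifier vote and return the class with most votes."""
    votes = Counter(binary_classifiers[(a, b)](x)
                    for a, b in combinations(classes, 2))
    return votes.most_common(1)[0][0]


classes = ["sound", "mild", "severe"]

# Stand-ins for trained two-class SVMs: simple thresholds on one feature.
binary_classifiers = {
    ("sound", "mild"): lambda x: "sound" if x < 10 else "mild",
    ("sound", "severe"): lambda x: "sound" if x < 20 else "severe",
    ("mild", "severe"): lambda x: "mild" if x < 30 else "severe",
}

n_classifiers = len(list(combinations(classes, 2)))   # c(c-1)/2 pairs
prediction = one_vs_one_predict(classes, binary_classifiers, 15)
```

With c = 3 classes there are 3 pairwise classifiers; for ten classes the same scheme would need 45, which is why the 'one versus rest' approach, with only c classifiers, is sometimes preferred.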
3.6. Clustering analysis
Clustering analysis is an unsupervised method for revealing
data structures and associations that are not otherwise
evident. It yields results that are easy to understand; however,
the methods for determining the appropriate number of
clusters are not satisfactory. Results are presented in the form
of a dendrogram whereby the closer the points are in the clusters,
the more similar the samples (Belous, Malyarovskaya, &
Klemeshova, 2016). Clustering analysis has been used
successfully with electronic noses to detect defects in plants,
including plant diseases and artificially- or herbivore-induced
damage in cucumber, tomato and pepper plants (Markom
et al., 2009) and spider mite infestation in cucumber
(Laothawornkitkul et al., 2008).
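The merge sequence that a dendrogram displays can be produced by a short agglomerative procedure. The sketch below uses single linkage on one-dimensional readings; the linkage rule and the sensor values are assumptions for illustration and are not taken from the cited electronic nose studies.

```python
def single_linkage(points):
    """Agglomerative clustering of 1-D points.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where clusters are tuples of point indices.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []

    def dist(a, b):
        # Single linkage: distance between the two closest members.
        return min(abs(points[i] - points[j]) for i in a for j in b)

    while len(clusters) > 1:
        d, a, b = min(((dist(a, b), a, b)
                       for k, a in enumerate(clusters)
                       for b in clusters[k + 1:]),
                      key=lambda item: item[0])
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
        merges.append((a, b, d))
    return merges


# Hypothetical sensor readings for five samples.
readings = [1.0, 1.2, 5.0, 5.1, 9.0]
merges = single_linkage(readings)
```

Samples that merge at a small distance (here indices 2 and 3, then 0 and 1) sit close together in the dendrogram; cutting the tree at a chosen height fixes the number of clusters, which is the choice the text notes has no fully satisfactory rule.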
3.7. Linear discriminant analysis (LDA)
LDA is a linear classification technique that aims to maximise
between-class variance while minimising within-class
variance using ‘Fisher's metric’, with the assumption that the
variance-covariance matrices of the classes are equal (Naik
et al., 2017). If this assumption does not apply, a more
generalised formulation, the quadratic discriminant analysis
method, is used (Gewali, Monteiro, & Saber, 2018, pp. 1-46).
These classifiers are known as Gaussian generative models
and are widely used. They have the advantage of allowing the
determination of the marginal density of the data and they
perform well on a notably wide and diverse set of
classification problems (Maugis, Celeux, & Martin-magniette, 2011).
LDA is a common classification approach in chemometrics
and has been used to solve various detection problems including
effective classification of fly-infested olive fruit (Moscetti et al.,
2015), detecting early bruises in apples (Baranowski, Mazurek,
Wozniak, & Majewska, 2012), detecting damage due to fungal
decay, shrivel and mechanical load in blueberry
(Leiva-Valenzuela & Aguilera, 2013) and determining powdery
mildew disease severity in wine grapes (Knauer et al., 2017).
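For the two-class case, the criterion that LDA maximises can be written out explicitly. The notation below (class means \mu_k, class sets C_k, scatter matrices S_B and S_W) is the standard textbook form and is an assumption here, since the review itself does not give the formula.

```latex
J(w) = \frac{w^{\top} S_B \, w}{w^{\top} S_W \, w}, \qquad
S_B = (\mu_1 - \mu_2)(\mu_1 - \mu_2)^{\top}, \qquad
S_W = \sum_{k=1}^{2} \sum_{x_i \in C_k} (x_i - \mu_k)(x_i - \mu_k)^{\top}

w^{*} \propto S_W^{-1} (\mu_1 - \mu_2)
```

The numerator measures between-class separation along the projection direction w and the denominator the within-class spread; the maximising direction w* depends on a single pooled within-class scatter, which is where the equal-covariance assumption enters.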
3.8. Genetic algorithm
Genetic algorithms (GAs) are used as tools for optimisation of
a given response function and for feature selection (Alma & Bulut,
2012). Inspired by Darwin's theory of natural evolution, they
apply genetic operators such as mutation and crossover to
select the fittest solution over a certain number of
computational generations until a stop criterion is met (converged
solution or maximum number of generations) (Niazi & Leardi,
2012). GAs have been repeatedly used in association with PLS
regression to optimise prediction models (improving prediction
accuracy and model simplicity) of various properties of
foodstuff (Feng & Sun, 2013, pp. 74-83; Nturambirwe, Nieuwoudt,
Opara, & Perold, 2017), including defects in horticultural
products.
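A GA for feature selection can be sketched with binary chromosomes, where each bit switches one candidate feature on or off. Everything specific below is hypothetical: the fitness function is a stand-in that rewards three 'informative' features and penalises the rest, whereas in the cited work fitness would come from the cross-validated performance of a PLS model, and the population size, mutation rate and stop criterion are arbitrary.

```python
import random


def genetic_select(fitness, n_features, pop_size=20, generations=40,
                   mutation_rate=0.05, seed=0):
    """Evolve a 0/1 feature mask that maximises `fitness`."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_features)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):                      # stop criterion: max generations
        nxt = [best[:]]                               # elitism: keep the fittest so far
        while len(nxt) < pop_size:
            p1 = max(rng.sample(pop, 3), key=fitness)   # tournament selection
            p2 = max(rng.sample(pop, 3), key=fitness)
            cut = rng.randrange(1, n_features)          # one-point crossover
            child = p1[:cut] + p2[cut:]
            child = [1 - g if rng.random() < mutation_rate else g   # bit-flip mutation
                     for g in child]
            nxt.append(child)
        pop = nxt
        best = max(pop, key=fitness)
    return best


INFORMATIVE = {1, 4, 6}      # hypothetical indices of the useful features


def fitness(mask):
    """Stand-in fitness: +0.8 per informative feature kept, -0.2 per other feature."""
    return sum(0.8 if i in INFORMATIVE else -0.2
               for i, g in enumerate(mask) if g)


best = genetic_select(fitness, n_features=8)
```

The penalty on uninformative features plays the role of the 'model simplicity' objective mentioned above: a mask that keeps fewer features scores higher when predictive value is equal.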
3.9. Other learning methods
Various other learning methods have been applied to
non-destructive quality evaluation of agricultural products. They
include logistic regression (Hu, Dong, & Liu, 2016; Jarolmasjed,
Khot, & Sankaran, 2018), naïve Bayes (Sinha, Khot, Schroeder,
& Sankaran, 2018; van Dael et al., 2016), nearest neighbour
(Kuzy, Jiang, & Li, 2018; Moscetti, Haff, Monarca, Cecchini, &
Massantini, 2016), stochastic gradient descent (Mohanty et al.,
2016), gradient tree boosting (Che et al., 2018), etc.
Though various learning algorithms may provide acceptable
performance at a given learning task, choosing the
best-performing one is preferable. Characteristics such as
memory usage, predictive accuracy on test data, training
speed and interpretability of the inner workings of an
algorithm are typical trade-off criteria that can be used in a
trial-and-error process to make such a selection.
4. Feature extraction and selection
A feature in a horticultural image is, for example, an aspect of
interest that is useful in describing fruits; it might be related
to colour, shape, size, strength, composition, flavour or a
defect. Feature descriptors are commonly used for object
detection and image recognition; they represent an image or
part of it by retaining useful information while redundant
information is left out (Naik & Patel, 2017). In terms of spectral
data, features are extracted as spectral bands; however, in some
cases spectral features can be combined with spatial (i.e.
pixel-based) ones (Knauer et al., 2017). As a crucial step in
fruit defect detection, feature extraction is commonly done
in order to make data manageable, and feature selection aims
to reduce these features to those most significant without
loss of information (Leiva-Valenzuela & Aguilera, 2013). This
means selecting the smallest number of features that yields the
lowest error, i.e. the highest rate of correct classification.
Gabor features, Gabor filters, Hu moments, Flusser and Suk
moments, local binary patterns, discrete Fourier transform,
mean gradient (first-order derivative), mean Laplacian
(second-order derivative), mean, standard deviation, skewness and
kurtosis are typical methods for feature extraction
(Leiva-Valenzuela & Aguilera, 2013). Other methods such as deep
feature extraction, which is based on deep neural networks,
are useful when the data structure is complex and help limit
the network's risk of overfitting, which is typical when the
training set is of limited size (Chen, Jiang, Li, Jia, & Member,
2016). It is worth mentioning that, as a general approach
in computer vision systems, conventional machine
learning algorithms are applied after handcrafted algorithms
have extracted the features, whereas in the DL framework
feature extraction is incorporated as an essential part of the
very structure of the network (Rosebrock, 2017).
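The simplest of the handcrafted descriptors listed above, the first four statistical moments of pixel intensity, can be computed as in the sketch below. The pixel values are invented, and the population (divide-by-n) definitions with non-excess kurtosis are one convention among several; a region of perfectly uniform intensity (zero standard deviation) would need separate handling.

```python
def intensity_features(pixels):
    """Mean, standard deviation, skewness and kurtosis of a flat list of intensities."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n          # population variance
    std = var ** 0.5
    skew = sum((p - mean) ** 3 for p in pixels) / n / std ** 3
    kurt = sum((p - mean) ** 4 for p in pixels) / n / std ** 4   # 3.0 for a normal distribution
    return {"mean": mean, "std": std, "skewness": skew, "kurtosis": kurt}


# Hypothetical grey-level values from a small region of interest.
region = [2, 4, 4, 4, 5, 5, 7, 9]
features = intensity_features(region)
```

A feature vector of this kind, possibly concatenated with texture or shape descriptors, is what a conventional classifier receives as input; a deep network would instead learn its descriptors from the raw pixels.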
5. Major ML methods used to detect defects
in fruit and vegetables
There has been an increasing use of ML methods in various
fields of scientific research and technological development
including agriculture (Gandhi & Armstrong, 2016) and the
study of food quality (Ropodi et al., 2016). Their uses in
enabling the effective detection of damage and disorders in
horticultural products have also been reported and are
surveyed here with respect to the known detection challenges.
An overview of the recent uses of ML in defect detection is
given in Table 4.
5.1. Detecting internal defects
Subdermal or internal damage and disorders in fruit and
vegetables cannot be identified visually. Computer vision
systems operating in the visible range, despite their advanced
applicability, are also unable to detect such defects.
Alternatives such as NIR spectroscopy and imaging (Liu, Pu, &
Sun, 2017); thermal imaging (Ding, Dong, Jiao, & Zheng, 2017);
X-ray radiography and tomography (van Dael et al., 2016, 2019;
Herremans et al., 2014; Magwaza & Opara, 2014); magnetic
resonance imaging (Tao, Zhang, McCarthy, Beckles, & Saltveit,
2014; Zhang & McCarthy, 2012) and ultrasound imaging (Ahmed
et al., 2017)
have proven capable of testing the internal state of objects.
Imaging techniques have shown superior capabilities in
the study of internal structure and disorders in fruits and
vegetables and are therefore the most preferred. They do,
however, have challenges such as speed limitations, which can
be associated with the time cost of data acquisition (X-ray) or
processing (HSI), high cost for some devices, and technical
limitations (limited penetration depth for infrared-based devices).
Other limitations related to inspecting internal features in
fruit can emanate from the nature of the object under
investigation: soft fruit tissue results in low contrast in
X-ray radiographs (Mathanker, Weckler, & Bowser, 2013), and thick
rind and opaque fruit limit the penetration of infrared radiation
for vibrational spectrometry and imaging.
Although research is underway to improve hardware
capabilities, a parallel solution based on data
handling is also under development. ML tools have been
adopted for data mining and analysis in association with
non-destructive detection devices, to probe the internal structure
and disorders of fresh horticultural products.
van Dael et al. (2016) used naïve Bayes and k-nearest
neighbours (k-NN) classifiers to separate citrus fruits with
internal disorders from those with healthy tissue based on
X-ray radiographs. The classification algorithm correctly
identified 95.7% of oranges with granulation and 93.6% of
lemons affected by endoxerosis.
In an HSI-based study of internal damage and external
defects in cucumber, Cen, He, and Lu (2016) demonstrated the
ability of a deep learning framework to improve the accuracy of
detection models. By combining a CNN with a stacked sparse
auto-encoder (SSAE) to learn both spectral and spatial features,
they consistently obtained higher detection accuracy than with
spectral data alone at both scanning speeds
used (Cen et al., 2016).
Recently, Wang et al. (2018) applied two CNNs, namely residual
network (ResNet) and ResNeXt, to the classification of
hyperspectral transmittance data. The objective was to improve
the accuracy and reduce the detection time for internal
damage in blueberries. In comparison to other ML classifiers
such as RF, linear regression, SVM, bagging and multilayer
perceptron, ResNet and ResNeXt yielded superior
classification performance in terms of accuracy, precision, F1-score
and area under curve (AUC) (Wang et al., 2018).
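The evaluation measures named above follow directly from the counts in a two-class confusion matrix, as the sketch below shows. The counts are hypothetical and are not the data of Wang et al. (2018); the second calculation simply checks that the precision and recall values listed for that study in Table 4 are consistent with the F1-score listed there.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy, precision, recall and F1-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp)          # share of predicted positives that are correct
    recall = tp / (tp + fn)             # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)
    return accuracy, precision, recall, f1


# Hypothetical counts for 100 damaged (positive) and 100 sound (negative) berries.
accuracy, precision, recall, f1 = classification_metrics(tp=93, fp=15, fn=7, tn=85)

# Consistency check using the precision (0.86) and recall (0.93) listed in Table 4.
f1_from_table = 2 * 0.86 * 0.93 / (0.86 + 0.93)
```

Because F1 is the harmonic mean of precision and recall, it penalises a classifier that achieves high recall only by flagging many sound fruit as damaged, which accuracy alone can hide when classes are unbalanced.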
With the rapidly increasing developments in deep learning
applied in object recognition, the use of imaging systems to
detect internal defects in horticultural products can be
rendered more efficient by implementing pretrained learning
Table 4 - An overview of recent applications of ML methods to the detection of defects in horticultural products.
Instrument | ML method | Product | Study | Evaluation | Reference
HSI (transmission) | CNN (ResNet/ResNeXt) versus SMO, LR, RF, MLP, Bagging | Blueberry | Internal mechanical damage | Up to (CNN) Acc. = 0.88, Rec. = 0.93, Prec. = 0.86, F1-sc. = 0.89, AUC = 0.92 | Wang et al. (2018)
NIR HSI | k-NN, LDA, NBC, DT, ELM | Mango | Detecting mechanical damage | Correct classification rate: 97.95% | Vélez Rivera et al. (2014)
HSI | CNN-SSAE | Cucumber | Surface and internal defect | Class acc: 91.1% and 88.6% at speeds of 85 and 165 mm/s | Cen et al. (2016)
HSI | Successive projection algorithm | Peaches | Detect fungal disease based on chlorophyll content | Band ratio gave high (98.75%) classification accuracy for diseased peaches | Sun, Wang, et al. (2017)
VIS-HSI | RF | Apple | Bruising | Average accuracy of bruise extraction models reached 99.9% | Che et al. (2018)
HSI | ANN | Peaches | Cold injury | Overall class acc: 95.8%; predictive corr. coef.: 0.698-0.903 | Pan et al. (2016)
Electronic nose | ANN, CA, RF | Diverse | Bacterial, fungal, viral infections and insect damage | Diverse | Cui et al. (2018)
 | BPNN, LVQN, CA, LDA, PCA | Rice plant | Mechanical damage, herbivore attack | Correct class rate training: 100%, test set: 60-100% | Zhou and Wang (2011)
Colorimeter | SVM, LDA | Blueberries | Fungal decay, shrivel, mechanical damage | Classifier performance 97%, 93%, 86% | Leiva-Valenzuela and Aguilera (2013)
Machine vision | Fuzzy logic | Apple | Water core severity | Class acc 86-89% | Shahin et al. (2001)
 | ANN | | Bruise damage | | Shahin, Tollner, McClendon, and Arabnia (2002)
X-ray imaging | ANN | Sweet onion | Defective vs good | Overall class acc 90% | Shahin, Tollner, Gitaitis, Sumner, and Maw (2002)
E-nose, GC-MS | MLPNN, PCA | Strawberry | Pathogenic fungal disease | Class acc: 96.6% | Pan et al. (2014)
HSI | SVM, SLOG, SMO, BNN, FURIA, NNC, LINE, LOG, NB, RF | Apple | Bruise | Correct class rate 95% train, 90% valid | Siedliska, Baranowski, and Mazurek (2014)
HSI (transmission) | SOM, SVM, Active learning algorithm (EER) | Blueberry | Mechanical damage | Acc: 0.87, Prec: 0.93, Recall: 0.78, Training: 9 (EER) | Hu et al. (2018)
X-ray radiography | kNN, NB | Citrus fruits | Internal disorders | Class acc: oranges: 95.7%, lemons: 93.6% | van Dael et al. (2016)
Dielectric spectroscopy | SVM, ANN, SVM-FS, GA, RF | Oil palm | Basal stem rot infection | Overall acc: 88.64%, kappa: 0.8480, mean absolute error: 0.1652 | Khaled et al. (2018)
Wetting sensors network | RF | Apple | Scab | - | Wrzesien, Treder, Klamkowski, and Rudnicki (2019)
HSI | PCA clustering | Apple | Decay | Acc: decay 99%, sound 100% | Li, Luo, Wang, and Fan (2019)
CV | K-means, Fuzzy C-Means | Olive | Surface defects | Overall acc: 88-93% | Hussain and Ahmed (2019)
Acc., accuracy; Rec., recovery rate; Prec., precision; F1-sc., F1-score; AUC, area under curve; corr. coef., correlation coefficient.
platforms for automated detection processes and improving
detection specificity.
5.2. Objective and quantitative measurement of
mechanical damage
Quantitative measurement of defects entails establishing
defect indices based on the level of severity. Knowledge about
the degree of damage is useful in that it allows
produce to be graded into such groups as ‘sound’ - fit for long
distance transport or long shelf life, ‘mildly damaged’ - fit for
processing and ‘damaged’ - unsafe for consumption but fit for
animal feed or to be discarded. The ability to carry out such
grading would therefore lead to a reduction in food losses and
contribute to food security (Li & Thomas, 2014).
Che et al. (2018) used various classification algorithms to
detect bruise damage in apples at three temporal stages of
development (0, 12, and 18 h) by comparing pixel-based
classification and bruise segmentation methods applied to
hyperspectral images. The algorithms included SVM, DT
(classification and regression), stochastic gradient descent,
RF and gradient tree boosting. Their objective was to
improve on the ability of traditional image
processing methods to segment bruises by creating a
pixel-based bruise extraction method. The RF
method was rated best overall in precision and stability, and
best for pixel-based prediction of the bruised region owing to
its high classification accuracy and generalisation ability.
Classification accuracy was also found to increase with the
severity of damage, which depended on the time after bruising (Che
et al., 2018).
Recently, Sun, Gu, et al. (2017) achieved a high classifica-
tion accuracy (up to 96.87%) for both detecting chilling injury
and distinguishing between four categories of peaches ac-
cording to condition of chilling damage (sound, slight, mod-
erate and heavy injury) by using ANNs (Sun, Gu, et al., 2017).
In another study, infrared thermography with periodic thermal
energy input to pear made it possible to obtain quantitative
metrics of the size and depth of bruises from the phase
information of the thermal emission
of the samples (Kim et al., 2014).
ML algorithms have proven to outperform classical
image analysis methods at localising damaged areas in
fruit, and have shown that detection accuracy depends on
damage severity. High accuracies were also shown to be
achievable while distinguishing between degrees of
damage using ML methods. With such
capabilities, ML is a good candidate for enabling the
implementation of quantitative models for defect detection
suitable for practical scenarios. However, learning
platforms for real-life defect detection applications are required,
and extensive efforts should be deployed to achieve such a
development.
5.3. Early detection of defects
Detection of fruit and vegetable defects at their earliest stage
of development is crucial to prevent damage aggravation and
possible disease spread over entire containers, which could
result in catastrophic food losses. The definition of early
detection may refer to the detection before manifestation or
earliest stage of manifestation.
Khaled et al. (2018) proposed a method for early detection of
basal stem rot (BSR) disease in oil palm leaves. A series of
feature selection algorithms were used, namely genetic
algorithm, random forest and support vector machine. The latter
was also used, in addition to artificial neural networks, as a
classifier. The study clarified the effectiveness of the
feature selection methods used and identified the
electrical parameter (impedance) that was best suited for early
detection of BSR in oil palm leaves (Khaled et al., 2018). In a
study by Vélez Rivera et al. (2014), detection of mechanical
damage induced in mango fruit at an early stage of development
by an HSI system was assessed. Using various classification
learning methods and selection of the best spectral bands for
distinguishing between damaged and sound mangos, they
obtained increasing rates of correct classification over
seven days after damage induction. From day one, rates of
67.46, 84.63, 89.27, 89.76 and 94.87% were obtained for the
classifiers used, i.e. naïve Bayes, ELM, DT, LDA and k-NN,
respectively. The latter had the highest classification
performance overall, followed by LDA, and both were high
by day three (97.5% and 95.54%, respectively).
However, it is worth noting that feature selection led to lower
classification performance than using the full spectral bandwidth;
further research effort was therefore recommended to
achieve more efficient feature selection (Vélez Rivera et al., 2014).
In a study of common fungal disease detection in strawberry
using an E-nose, Pan, Zhang, Zhu, Mao, and Tu (2014) achieved
an overall discrimination accuracy of 96.6% and an
improvement of correct ratios, from 93.3 to 100% in testing samples for
individual treatment, as early as day two after inoculation,
using a multi-layer perceptron neural network (MLPNN)
classifier. The three types of disease used were also
well differentiated from one another throughout the 10-day
period of the study using PCA (Pan et al., 2014). The E-nose
combined with the MLPNN
classifier was therefore shown to be an acceptable method for early
detection of common fungal infection (early decay) during
postharvest storage.
There have been other recent reports on early detection of
defects in fruit with successful results. Li, Chen, and Huang
(2018) used an improved watershed segmentation technique
for hyperspectral images, based on morphological gradient
reconstruction, to detect bruises in peaches at their early stage
of development. Combined with PCA selection of effective
wavelengths, the new segmentation method led to detection
accuracies as high as 96.5% for samples with defects and
97.5% for sound ones. Though the detection accuracy was
high, the robustness of the method vis-à-vis biological
variability remains untested (Li et al., 2018). A study of bruise
detection in pears reported that applying the lock-in method to
infrared thermography was effective for early bruise detection
(Kim et al., 2014). This work, however, did not produce a basis
for decision support in real-life applications.
In a study attempting to detect early symptoms of decay in
navel orange fruits, a rather rarely used approach based on
‘image visualisation’ was adopted. An algorithm for image
segmentation, based on a combination of thresholding and
pseudo-colour image was used to locate decayed tissue,
resulting in 100% success rate of detection for decayed
samples with an error rate smaller than 1% in sound ones. PCA
was instrumental in data reduction which resulted in four
effective wavelengths and classification of categories
(decayed vs sound fruit) (Li et al., 2016).
From the various studies mentioned above, it can be
seen that, with the wide range of learning algorithms
available and their combinations with feature selection
techniques, there are generally ways of finding, for a specific
problem, a well-suited learning algorithm that can
supersede the mainstream chemometrics methods. However, there
is a need for transferring the many lessons learned into
concrete implementations for real-life applications. To that end, it
should be stressed that the term “early” detection still needs a
standard measure; to date it is experimentally established
differently from one study to another. For example, for defects
such as bruise damage the term is commonly equivalent
to “fresh” bruise, and for some disease infections it
practically means the “onset stage” or the “least severe” of the
chosen range of infection extent. These terms can be rather
vague; they need to be assigned a measurable value, and
predicting defects before they manifest would be ideal.
However, the prediction levels achieved in the reported
studies using ML give promise for successful practical
implementations within the current framework.
5.4. Fast detection of defects in fruit
Online detection of defects in horticultural products is one of
the most desirable aspects in industrial application of
non-destructive methods towards sorting and grading of fruit
and vegetables. Even though visual systems are already in use
for this purpose, they are still ineffective at detecting internal
quality and defects (Leiva-Valenzuela & Aguilera, 2013). Other
technologies like hyperspectral imaging that are capable of
probing internal defects still face the issue of slow processing
speed, and efforts are still being deployed in algorithm
development in order to match the typical industrial sorting
speeds (Calvini, Orlandi, Foca, & Ulrici, 2018). Also, studies
involving advanced learning algorithms have made progress
in reducing the image processing time, whereby feature
selection and pre-processing methods play an integral role.
Recently, Keresztes, Goodarzi, and Saeys (2016) developed
a system for ‘real-time’ detection of bruised Jonagold apples
based on shortwave infrared (SWIR) HSI. By combining the
best reflectance calibration and the best pre-processing
technique for glare correction, the detection accuracy and
processing time per apple reached 98% and 20 ms, respectively,
with the shorter processing times corresponding to slower
sample scanning speeds. To improve processing speed and reduce
glare-induced inaccuracies, further optimisation of the
system's hardware and illumination was recommended
(Keresztes et al., 2016). Wang, Hu, and Zhai (2018) also showed
that convolutional neural networks (CNNs) reduced the time cost
of detecting internal damage in blueberries using HSI. With the
classification time for each test sample reduced to 5.2 ms and
6.5 ms for the two types of CNN used, the potential of deep
CNNs to enable online fruit sorting based on internal damage
was demonstrated.
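To relate the per-sample times quoted above to sorting-line speeds, a rough conversion can be sketched in Python. This is only a back-of-the-envelope illustration: the function name `max_throughput_per_second` is hypothetical, and the calculation assumes that processing is the only bottleneck on a single lane (image acquisition, conveying and ejection times are ignored).

```python
def max_throughput_per_second(processing_time_ms: float) -> float:
    """Upper bound on samples handled per second on one lane.

    Assumes processing is the only bottleneck (an illustrative
    simplification; acquisition and handling times are ignored).
    """
    if processing_time_ms <= 0:
        raise ValueError("processing time must be positive")
    return 1000.0 / processing_time_ms


# Per-sample times reported in the studies cited above.
reported_times_ms = {
    "SWIR HSI, apples (Keresztes et al., 2016)": 20.0,
    "CNN, blueberries (Wang et al., 2018), first network": 5.2,
    "CNN, blueberries (Wang et al., 2018), second network": 6.5,
}

for label, t_ms in reported_times_ms.items():
    print(f"{label}: {max_throughput_per_second(t_ms):.1f} samples/s")
```

Under these assumptions, 20 ms per apple corresponds to at most 50 apples per second per lane; real lines would achieve less once acquisition and handling are included.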
A technical trend that has gained much attention and shows
great promise for fast detection of defects in fruits and
vegetables is hyperspectral imaging. Hyperspectral imaging
(HSI) systems are used to acquire spectral images that serve
to determine the optimal wavelengths usable in faster
multispectral imaging (MSI) systems. However, the latter,
even though faster, have shown lower detection performance
than the former (Huang, Li, Wang, & Chen, 2015). The
efficiency of multispectral systems in this scenario depends
largely on the transferability of classification algorithms
from the HSI to the MSI system. Recently, a comparative study
was conducted on detecting various defects in apples, with
the intent of determining the image recognition method with
the best portability and stability from an HSI to an MSI
system while accounting for uneven illumination. The study
concluded that the goal of minimising the effect of uneven
illumination and achieving model robustness to the physical
and biological variability that hinders the accurate
identification of surface defects was achievable (Zhang et al., 2018).
Recent applications of deep learning architectures have
opened a window of opportunity, whereby graphics processing
unit (GPU) oriented programming greatly reduces processing
time and has outperformed the classical CPU-based approach.
Shorter processing times (per sample) were achieved for
defect detection in cucumber by using a GPU implementation
of a CNN-SSAE framework that fuses spectral and spatial
features of HSI data (Cen et al., 2016). Though the
implementation of such GPU-based frameworks currently
requires specialists with high levels of coding skill,
developing such platforms for specific detection tasks could
benefit the horticultural industry in the near future.
Although computers with decent GPU capabilities are
relatively costly, ongoing developments in computer hardware
keep improving its affordability, which is likely to
alleviate the burden of high cost.
5.5. Other uses of ML in defect detection
Learning algorithms have been used to predict various defects
in contexts other than those already covered in the above
sections. One such application is the detection of cold
injury in peaches: using MLPNN, Pan et al. (2016)
successfully distinguished injured from sound peaches in cold
storage with high accuracy (92.9–100%) based on HSI data,
demonstrating the feasibility of HSI for detecting damage
resulting from cold storage.
In a study on detecting chilling injury, Sun, Gu, et al. (2017)
used six optimal wavelengths obtained by the successive
projections algorithm (SPA); classification models built with
Fisher linear discriminant analysis, SVM, ANN and PLS-DA
achieved high detection accuracy (92.96–97.28%)
(Sun, Gu, et al., 2017).
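The core idea of SPA, picking wavebands that are as little collinear as possible with those already chosen, can be illustrated with a short sketch. The code below is an assumption-laden illustration, not the procedure of the cited study: it implements only the projection phase of SPA for a fixed starting band, whereas a complete SPA also compares candidate subsets (different starting bands and subset sizes) against a validation criterion. The function name `spa_select` and the toy spectra are hypothetical.

```python
from math import sqrt


def spa_select(spectra, n_select, start=0):
    """Projection phase of the successive projections algorithm.

    spectra  : list of rows (samples), each a list of band values
    n_select : number of bands (columns) to pick
    start    : index of the initial band (an arbitrary choice here)
    Returns the selected column indices in order of selection.
    """
    n_bands = len(spectra[0])
    if not 1 <= n_select <= n_bands:
        raise ValueError("n_select must be between 1 and the number of bands")
    # Work on columns: one vector per wavelength band.
    cols = [[row[j] for row in spectra] for j in range(n_bands)]
    selected = [start]
    while len(selected) < n_select:
        ref = cols[selected[-1]]
        ref_sq = sum(v * v for v in ref)
        best_j, best_norm = None, -1.0
        for j in range(n_bands):
            if j in selected:
                continue
            if ref_sq > 0.0:
                # Remove the component of band j that lies along ref.
                coef = sum(a * b for a, b in zip(cols[j], ref)) / ref_sq
                cols[j] = [a - coef * b for a, b in zip(cols[j], ref)]
            norm = sqrt(sum(v * v for v in cols[j]))
            if norm > best_norm:
                best_j, best_norm = j, norm
        selected.append(best_j)
    return selected


# Toy data (made up): band 1 is an exact multiple of band 0.
toy_spectra = [
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 0.5],
]
print(spa_select(toy_spectra, n_select=3))  # -> [0, 2, 3]
```

In the toy example the redundant band 1 is never chosen, because after projection it carries no information beyond band 0.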
Technological developments have enabled machine vision to
gain ground in replacing traditional manual handling methods
for assessing fruit quality. However, machine vision does not
always recognise specific defects such as drying, fungal decay
and mechanical damage in blueberries (Leiva-Valenzuela &
Aguilera, 2013); the detection efficiency depends largely on
the objective and tolerance of the classification algorithm
used. In this sense, improving the detection performance of
computer vision systems and other techniques through learning
algorithms has been the subject of numerous studies.
Biosystems Engineering 189 (2020) 60–83
Leiva-Valenzuela and Aguilera (2013) devised a method of
detecting damage to blueberries due to fungal infection,
shrivel and mechanical stress, in different orientations,
using various classifiers. Support vector machine and linear
discriminant analysis were reported to outperform the other
classifiers used (quadratic discriminant analysis,
Mahalanobis distance, k-nearest neighbours and probabilistic
neural network). The authors reported their recognition
approach as promising for online sorting and grading of
blueberries based on various defects; however, they stressed
the need to incorporate non-visible and internal defects,
which requires complementary sensors.
A convolutional neural network was used to identify
diseases such as leaf mould, grey mould and plague in tomato
plants based on RGB images. A deconvolutional network for
deep visualisation was used to analyse the performance of
internal layers of a CNN (VGG-16) as influenced by the colour
and spectral information of disease images. Because images of
each disease present specific characteristics in terms of
colour, texture, patterns, location in the plant, shape,
etc., it was possible to relate colour sensitivity to a
specific disease and therefore determine parameters that
could help redesign the CNN and improve its recognition rate
(Fuentes, Im, Yoon, & Park, 2017).
6. Important features selection
The purpose of variable selection can be viewed in three
aspects: improving model prediction performance, providing
faster and more cost-effective predictors, and providing a
better understanding of the process generating the data
(Guyon & Elisseeff, 2003).
Considerable effort has been devoted to realising online
inspection, whereby the transition from slow hyperspectral
imaging to fast multispectral imaging requires the key step
of selecting the most efficient wavelengths for a specific
inspection task (Zhang et al., 2014). However, wavelength
selection carried out to enable the development of a
multispectral system has, in some cases, been shown to reduce
the classification performance of that system
(Huang et al., 2015).
Another trend is the combination of NIR hyperspectral
imaging with ML techniques. In a study of mechanically
induced damage detection in mango fruit, Vélez Rivera et al.
(2014) used five classification techniques in combination
with eleven feature selection techniques to determine the
most relevant features for their classification problem. The
feature selection methods included correlation-based feature
subset selection (CFS), chi square (ChiS), Fisher score, Gini
impurity algorithm (GIA), information gain (IG), minimum
redundancy maximum relevance (mRMR), ReliefF, sequential
forward selection (SFS), sparse logistic regression (SLR),
stepwise, and Student's t-test. The classifiers used were
linear discriminant analysis (LDA), k-nearest neighbours
(k-NN), naïve Bayes classifier (NBC) as a probabilistic
approach, decision trees (DT) and extreme learning machine
(ELM). More details on the variable selection methods
commonly applied in vibrational spectroscopy are given in the
review by Xiaobo, Jiewen, Povey, Holmes, and Hanpin (2010).
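As an illustration of how a filter-type selection method ranks features before classification, the sketch below computes the Fisher score, one of the methods listed above. It uses a generic textbook formulation (between-class scatter of the class means divided by the pooled within-class variance), which is not necessarily the exact variant applied in the cited study. The function names and the two-feature toy data set are hypothetical.

```python
def fisher_scores(samples, labels):
    """Fisher score of every feature (higher = more discriminative).

    score_j = sum_k n_k (mean_kj - mean_j)^2 / sum_k n_k var_kj
    where k runs over classes and var is the population variance.
    """
    n_features = len(samples[0])
    classes = sorted(set(labels))
    scores = []
    for j in range(n_features):
        column = [row[j] for row in samples]
        overall_mean = sum(column) / len(column)
        between = 0.0
        within = 0.0
        for c in classes:
            values = [v for v, y in zip(column, labels) if y == c]
            n_c = len(values)
            mean_c = sum(values) / n_c
            var_c = sum((v - mean_c) ** 2 for v in values) / n_c
            between += n_c * (mean_c - overall_mean) ** 2
            within += n_c * var_c
        scores.append(between / within if within > 0 else float("inf"))
    return scores


def top_k_features(samples, labels, k):
    """Indices of the k features with the highest Fisher score."""
    scores = fisher_scores(samples, labels)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return ranked[:k]


# Made-up data: feature 0 separates the classes, feature 1 does not.
X = [[1.0, 5.0], [1.2, 3.0], [0.8, 4.0],
     [3.0, 4.5], [3.2, 3.5], [2.8, 4.0]]
y = ["sound"] * 3 + ["bruised"] * 3
print(top_k_features(X, y, k=1))  # -> [0]
```

The retained feature indices would then be passed to any of the classifiers mentioned above; wrapper methods such as SFS differ in that they evaluate feature subsets through the classifier itself.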
It is noticeable from Table 5 that there are more
divergences than similarities in the optimal wavelengths
obtained in different studies on the same defect in the same
commodity. Therefore, there is a need for standardised
wavelength characteristics that would ensure optimal
performance of multispectral systems for specific tasks.
7. ML as an enabler of data fusion
The scope of sensor fusion entails the use of multiple sensing
techniques simultaneously in order to improve the assessment
of targeted material properties (Srivastava & Sadistap,
2018). There has been increasing interest in fusing data
from complementary sensors to study properties of food
items; this approach has proven to provide better insight into
a studied item than a single sensor. Various data fusion
methodologies exist and can be applied at different levels of
complexity (measurement level, feature level and decision
level), as reviewed for the application to food and
beverage authentication (Borràs et al., 2015). Various
successful cases of data fusion applied to the study of food
properties have been reported in recent years, in which ML
methods enabled data handling and analysis. These
applications include the fusion of spectral and spatial
data (feature level) from a Vis-NIR hyperspectral imaging
system to predict sensory quality index scores of fish fillet.
Calibration models were built using LS-SVM, textural features
were extracted using the grey-level gradient co-occurrence
matrix method, and the successive projections algorithm was
used for effective wavelength selection (Cheng & Sun, 2015).
Mendoza, Lu and Cen (2014) showed that fusing systems
(among visible and shortwave near infrared (Vis-SWNIR)
spectroscopy, acoustic firmness, spectral scattering and
bioyield firmness) provided more complete and complementary
information on firmness and soluble solids content and was a
more effective approach for predicting these attributes
than using individual sensors. Later, they argued that the
optical information provided by Vis-SWNIR spectroscopy and
scattering techniques on apple firmness and soluble solids
content was complementary, and thus that their fusion would
provide higher accuracy than either technique considered
separately (Mendoza et al., 2014). There have been more
reports on recent applications of data fusion to the
characterisation of food and beverages (Biancolillo, Bucci,
Magrì, Magrì, & Marini, 2014; Borràs et al., 2015). However,
few recent reports are found on fused techniques for defect
detection in fruits and vegetables. Li, Heinemann, and Sherry
(2007) successfully developed data fusion models for apple
defect detection by integrating two instruments for volatile
detection. Data fusion was carried out at both the feature
level (using probabilistic neural network-based sensor
fusion) and the decision level (using Bayesian network
fusion), yielding higher accuracy (lower classification
error) than that obtained from single-sensor data. It was
concluded that feature selection was a crucial step in
achieving the improved performance of the sensor fusion
framework, and that the framework was fit for reliable
detection of diseased or spoiled apples (Li et al., 2007). It
is evident that learning algorithms and soft computing
techniques are crucial for non-destructive multi-sensor
fusion, and this needs more exploration in the
characterisation of agricultural products.
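The difference between fusion levels can be made concrete with a small example of decision-level fusion, in which each sensor first produces its own class probabilities and only these outputs are combined. The sketch below uses the naive Bayes product rule as a deliberately simple stand-in; it is not the Bayesian network fusion of Li et al. (2007). It assumes that the sensors are conditionally independent given the class, and the function name, class labels and probability values are all hypothetical. Feature-level fusion, by contrast, would concatenate the feature vectors of all sensors and train a single classifier on the result.

```python
def fuse_posteriors(posteriors_per_sensor, priors):
    """Decision-level fusion by the naive Bayes product rule.

    posteriors_per_sensor : list of dicts, one per sensor, mapping
                            class label -> P(class | that sensor's data)
    priors                : dict mapping class label -> P(class)
    Assumes the sensors are conditionally independent given the class.
    """
    unnormalised = {}
    for label, prior in priors.items():
        value = prior
        for post in posteriors_per_sensor:
            # Each sensor contributes the ratio P(class | x_i) / P(class).
            value *= post[label] / prior
        unnormalised[label] = value
    total = sum(unnormalised.values())
    return {label: v / total for label, v in unnormalised.items()}


# Made-up outputs of two single-sensor classifiers for one apple.
priors = {"sound": 0.5, "defective": 0.5}
sensor_a = {"sound": 0.4, "defective": 0.6}
sensor_b = {"sound": 0.3, "defective": 0.7}

fused = fuse_posteriors([sensor_a, sensor_b], priors)
print({label: round(p, 3) for label, p in fused.items()})
```

In this toy case two weak indications of a defect reinforce each other, so the fused probability of "defective" is higher than that given by either sensor alone.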
8. ML and model transfer
Due to instrument-related variations that are typical of most
spectrometers, calibration models are bound to the
spectrometer that generated the data and therefore cannot be
applied to another spectrometer while retaining statistically
equivalent accuracy and precision. There has been progress in
developing ways to circumvent this problem, which are
referred to as calibration or model transfer. The objective
is to correct the difference in spectra between the master
and a slave instrument by transforming spectra from the
latter to appear as if they originated from the master
instrument. Once this is achieved, the original calibration
model can be used on the transformed spectra. This approach
is known as standardisation, and the most popular method,
piecewise direct standardisation (PDS), is used as a
benchmark for newly developed transfer methods (Luo et al.,
2017). Transfer can also be achieved by a different approach
whereby the aim is to correct the predicted values of new
samples for the bias and the slope of the regression
equation, under the assumption that the predicted values of
two different instruments are linearly dependent.
Alternatively, a third approach that tries to standardise the
model coefficients can be used. Various other methods have
been developed to achieve calibration transfer, and they
follow two main approaches in addition to standardisation,
namely reduction of the difference in data acquired under
different conditions (data correction) and model updating.
Data correction applies signal pre-processing methods
(Workman, 2018), whereas model updating keeps adding new data
acquired under new conditions and then rebuilding the model.
Details on many calibration transfer methods that have been
applied to infrared, near-infrared and Raman spectroscopies,
with their advantages and shortcomings, have been reviewed
extensively (Feudale et al., 2002; Workman Jr, 2018). A
recent study developed a transfer method based on affine
transformation which does not require standard samples and
was reportedly more effective than the most common
standardisation methods (Zhao et al., 2019). Aspects of
machine learning have also been applied to solve challenges
of standardisation such as possible non-linear relationships
between spectra from two instruments (Chen, Bin, Lu, Zhang,
& Liang, 2016), which cannot be addressed by direct
standardisation (DS) or PDS. Transfer learning was
successfully used to improve the implementation of
calibration transfer in E-noses (Yan & Zhang, 2016). In
horticultural applications, numerous transfer methods have
been developed, which are mostly aimed at studies of internal
quality attributes and are generally based on NIR
spectroscopy (Alamar, Bobelyn, Lammertyn, Nicolaï, & Moltó,
2007; Bergman, Brage, Josefson, Svensson, & Sparén, 2006;
Fan et al., 2019). Model transfer for defect detection
studies, on the other hand, has received little attention.
Given that defect detection has proven increasingly feasible
with imaging techniques, transfer methods developed for
spectroscopy are likely to be obsolete in this case. However,
ML and DL have proven effective in calibration for various
types of sensors and sensor networks (Chatzidakis & Botton,
2019; Wang et al., 2017), in obtaining low root mean square
error of prediction (RMSEP) values and in achieving stable
calibration transfer in spectroscopy cases; therefore, it is
an educated guess that they are the best option for
implementing model transfer aimed at defect detection studies.

Table 5 – Recent applications of feature selection methods to improve learning models for defect detection in fruit and vegetables.

| Instrument | Selection method | Type of defect | Product | Waveband | Reference |
|---|---|---|---|---|---|
| NIR-HSI | SLR, T test, IG, SFS, mRMR, GIA, ChiS, CFS | Mechanical damage | Mango | 700–780 nm, 890–900 nm, 1070–1080 nm | Vélez Rivera et al. (2014) |
| HSI | Successive projection algorithm | Fungal infection decay/chlorophyll content | Peach | 617 nm, 675 nm, and 818 nm | Sun, Wang, et al. (2017) |
| HSI | MLPANN | Cold injury | Peach | 487, 514, 629, 656, 774, 802, 920 and 948 nm | Pan et al. (2016) |
| HSI (400–1000 nm) | | Chilling injury | Peach | 580, 599, 650, 675, 710, and 970 nm | Sun, Gu, et al. (2017) |
| | | Cold injury | Nectarine | 670 and 780 nm | Lurie et al. (2011) |
| | | Cold injury | Banana | 660 nm | Hashim et al. (2013) |
| | | Internal defect | Cucumber | 745, 805, 965 and 985 nm | Ariana and Lu (2010) |
| HSI | | Cold injury | Apple | 717, 751, 875, 960 and 980 nm | ElMasry, Wang, and Vigneault (2009) |
| SW-LW-HSI | PCA-weighting coefficient | Bruise | Peach | 781, 816, 840, 945, 1000, 1065, 1260, 1460, 1917, 2500 nm | Li et al. (2018) |
| HSI (450–1000 nm) to MSI | PCA-weighting coefficient | Bruise | Apple | 780, 850 and 960 nm | Huang et al. (2015) |
| HSI (408–117 nm) | PCA loadings | Hidden bruise | Kiwifruit | 682, 723, 744, 810, and 852 nm | Lü, Tang, Cai, Zhao, and Vittayapadung (2011) |
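Of the transfer approaches described in this section, slope and bias correction is the simplest to illustrate. The sketch below assumes that a small set of transfer samples has been measured on both instruments, fits a straight line that maps predictions obtained on the secondary (slave) instrument onto those of the primary (master) instrument, and applies that line to new predictions. The function names and numerical values are hypothetical, and the linear relationship is the same assumption stated in the text.

```python
def fit_slope_bias(secondary_predictions, primary_predictions):
    """Least-squares line mapping secondary predictions onto primary ones."""
    n = len(secondary_predictions)
    if n < 2 or n != len(primary_predictions):
        raise ValueError("need at least two paired predictions")
    mean_x = sum(secondary_predictions) / n
    mean_y = sum(primary_predictions) / n
    sxx = sum((x - mean_x) ** 2 for x in secondary_predictions)
    if sxx == 0:
        raise ValueError("secondary predictions must not all be equal")
    sxy = sum((x - mean_x) * (y - mean_y)
              for x, y in zip(secondary_predictions, primary_predictions))
    slope = sxy / sxx
    bias = mean_y - slope * mean_x
    return slope, bias


def correct(prediction, slope, bias):
    """Apply the fitted correction to a new secondary-instrument prediction."""
    return slope * prediction + bias


# Made-up predictions for three transfer samples on both instruments.
secondary = [10.0, 12.0, 14.0]
primary = [11.0, 13.2, 15.4]

slope, bias = fit_slope_bias(secondary, primary)
print(round(slope, 3), round(bias, 3))
print(round(correct(20.0, slope, bias), 3))
```

Standardisation methods such as DS and PDS instead transform the spectra themselves before prediction, which requires matrix operations and is not shown here.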
9. Summary and future directions
Quality control in the horticultural industry is important to
ensure food safety and quality and to prevent unnecessary
food-related economic losses. Enabling non-destructive
detection of defects in horticultural products has two
prerequisites: one is the development of detection
instruments that are equipped to address the existing
challenges; the other lies in advancing data handling
techniques in a way that complements the former.
Technological advances have been made in both areas, and
ongoing research will help reach the goals being sought.
Such goals include user-friendliness of sensing devices (easy
to operate and maintain) and their suitability for industrial
applications (fast, reliable, portable and cost-effective).
HSI has become the preferred and predominant non-destructive
technique applied for defect detection in food and
agricultural products. Many studies have contributed to
continued improvements in reducing image processing time and
in optimising HSI hardware, which is expected to help match
the high sorting speeds required for industrial applications.
Many learning algorithms have been developed to improve
detection accuracy and reduce image processing time; they
include advanced segmentation techniques, deep learning
methods to automate feature extraction, other classification
learners for identification and detection of defects based on
pixel density (Che et al., 2018), and semi-supervised
learning methods such as active learning algorithms. The
latter have been shown to effectively reduce the labelling
cost while keeping a high classification performance
(Hu, Zhao, & Zhai, 2018). This is cost-effective for online
applications and similar application environments in which
continuous labelling updates and model transfer are common.
Future work should also focus on exploring such
semi-supervised learning techniques.
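The way active learning reduces labelling cost can be illustrated with a generic least-confidence query strategy: only the samples about which the current model is least certain are sent for manual labelling. The sketch below is an illustration of that general idea and is not the specific algorithm of the study cited above; the function name and the probability values are hypothetical.

```python
def least_confident(probabilities, batch_size):
    """Indices of the unlabelled samples the model is least sure about.

    probabilities : list of per-sample class-probability lists
    batch_size    : how many samples to send for manual labelling
    """
    # Confidence of a prediction = probability of its most likely class.
    confidence = [max(p) for p in probabilities]
    order = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return order[:batch_size]


# Made-up model outputs (P(sound), P(defective)) for four unlabelled fruit.
model_outputs = [
    [0.95, 0.05],
    [0.55, 0.45],
    [0.20, 0.80],
    [0.51, 0.49],
]
print(least_confident(model_outputs, batch_size=2))  # -> [3, 1]
```

After the selected samples are labelled, they are added to the training set, the model is retrained, and the query step is repeated until the labelling budget is used up.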
Nonetheless, the challenge remains that of standardisation
of techniques; each reported study is more or less limited
to a specific instrumental parameter, studies a single food
item and a particular defect, and uses a different learning
algorithm or a different validation process. Such specificity
limits the widespread use of the technologies beyond research
investigations; future work should endeavour to standardise
the methodologies that have already been proven successful
and make them available for practical use.
No one algorithm can solve all problems. Choosing the
appropriate learning algorithm for a specific problem is a
crucial step for model effectiveness. Algorithm selection
has mostly been a trial-and-error process; various studies
have adopted a comparative approach whereby many algorithms
are applied to a classification task and, through a
trade-off between the algorithms based on some performance
characteristics, the best algorithm is given preference.
Future work should seek to establish frameworks in which
algorithms trained and tested for a specific application
(e.g. specific defects in a given fruit) are recommended as
such for further practical use.
Early detection of defects has been predominantly
accomplished using e-noses for pathological defects and
disorders, where ML plays an integral role in data analysis
and acquisition. Detecting mechanical damage at an early
stage of development has also been successful using imaging
techniques, whereby deep learning methods enabled feature
extraction and enhanced detection accuracy. However, the
feasibility of applying these deep learning techniques to
other promising technologies such as thermography,
radiography, magnetic resonance, etc., remains unexplored and
should be taken into consideration in the future.
ML has been effectively used in assessing quantitative
measures of damage (severity/degree of damage) such as
mechanical damage and chilling injury in fruits. Generally,
such studies have relied on an experimentally established
index of severity level, obtained either by direct induction
through an experimental procedure or from the temporal
evolution of the defect. Learning methods were used to
elucidate the distinction among different degrees of damage
as captured by an objective non-destructive measurement
technique. This topic is also underexplored with advanced ML
algorithms; quantitation of damage and disorders remains a
challenge in the horticulture sector, which calls for more
attention to the use of learning algorithms in the future.
‘Fast detection’ is one of the most sought-after goals for the
use of ND detection methods in the fruit and vegetable
industry. The most successful case has been that of computer
vision, but its capabilities are limited to surface defects.
HSI and, most especially, MSI are, on the other hand, the most
prominent candidates for fast detection, where ML and
especially deep learning have been instrumental in the
successful implementation of ‘fast’ systems. Deep learning
has allowed for improvements in reducing the time cost of
image processing and in effective feature extraction for
direct defect identification. The use of deep learning to
this end has shown great promise, but little has been
investigated in this regard. Therefore, more studies are
needed to bring this goal to realisation.
ML has played an important role in enabling multi-sensor
data fusion (where the nature of the sensors allows it),
whereby the data can be fused at different levels to
achieve better model performance and robustness than
when a single sensor is used. Many studies have been
conducted on quality evaluation of food items, but little
has been reported on detection of damage and disorders
in fruit and vegetables. Future research should consider
this approach as an option to help overcome existing
challenges.
There has also recently been an issue of “cheating model”
raised by Li et al. (2019), which seems to have more to do
with the learning techniques than with the data. Although the
issue they raised was directed at analytics in agricultural
products, it is a situation that can be encountered in any
other field of data-driven research. Holding competitions to
check the models, as they suggest, could also produce the
same situation, especially when there are conflicts of
interest. There is, however, a possibility of enforcing data
and model sharing, especially in cases where the work is not
patented: the models can be checked on a monitored platform
under the umbrella of authenticating results, for the same
purpose that peer review is carried out. There are already
ongoing examples of code sharing by developers on GitHub and
other Git-based platforms; a similar system could be used to
share models and check data. It is the authors' opinion that
such a system, which is already working in other areas of
technical development, could stand a good chance of
operating fairly.
With the advances in DL that are already revolutionising
pattern recognition, DL platforms have become easily
accessible and are adaptable to other applications. There are
successful case studies of this kind in plant disease and
fruit detection, and they can be extended to defects in
fruits. A large-scale database of images that capture various
defects in fruits and vegetables would be useful in training
such DL platforms, which in turn would advance automation
in grading and sorting systems. The ability of HSI to
acquire both spatial and spectral data, together with
structured-illumination reflectance imaging (SIRI) and
thermal imaging, offers a chance to probe internal damage
(Lu & Lu, 2018). Deep learning offers an opportunity for
training models on a massive scale, which has already been
proven by cases such as the “ImageNet” competitions and
similar complex problems. The idea proposed here seems
feasible, given that it is already possible to populate such
a database by capturing images of fruit and vegetable
defects.
10. Conclusions
ML and DL methods have proven to hold promise for
overcoming the existing challenges around effective,
objective and fast detection of defects in horticultural
products. Research has shown ML methods to be effective
at enhancing detection accuracy by enabling data and sensor
fusion, data dimensionality reduction or feature extraction.
They allow for faster spectral and image processing than
traditional segmentation methods and for faster object
detection, and trained learning algorithms make automation
very feasible. The use of ML has been efficient in
data-driven problem solving in many areas of science and
technology. In the future, more effort should be devoted to
establishing focused, objective frameworks that provide
standardised solutions to the current problems around
detection of defects in fruit and vegetables. One such idea
would be to establish deep learning platforms trained and
dedicated to recognising various defects in fruit and
vegetables based on images acquired by non-destructive
imaging devices such as hyperspectral imaging systems.
Declaration of Competing Interest
The authors declare no conflict of interest.
Acknowledgement
This work is based on the research supported wholly by the
National Research Foundation of South Africa (Grant
Numbers: 64813). The opinions, findings and conclusions or
recommendations expressed are those of the author(s) alone,
and the NRF accepts no liability whatsoever in this regard.
References
Abasi, S., Minaei, S., Jamshidi, B., & Fathi, D. (2018). Dedicated
non-destructive devices for food quality measurement - a
review. Trends in Food Science & Technology, 78, 197–205.
https://doi.org/10.1016/j.tifs.2018.05.009.
Acquarelli, J., Van Laarhoven, T., Gerretzen, J., & Tran, T. N. (2017).
Convolutional neural networks for vibrational spectroscopic
data analysis. Analytica Chimica Acta, 954, 22–31.
https://doi.org/10.1016/j.aca.2016.12.010.
Adam, E., Deng, H., Odindi, J., Abdel-Rahman, E. M., & Mutanga, O.
(2017). Detecting the early stage of phaeosphaeria leaf spot
infestations in maize crop using in situ hyperspectral data and
guided regularized random forest algorithm. Journal of
Spectroscopy, 2017, 1–9. https://doi.org/10.1155/2017/6961387.
Ahlin, K., Joffe, B., Hu, A. P., McMurray, G., & Sadegh, N. (2016).
Autonomous leaf picking using deep learning and visual-
servoing. IFAC-PapersOnLine, 49(16), 177–183.
https://doi.org/10.1016/j.ifacol.2016.10.033.
Ahmed, M. R., Yasmin, J., Lee, W., Mo, C., et al. (2017). Imaging
technologies for nondestructive measurement of internal
properties of agricultural products: A review. Journal of
Biosystems Engineering, 42(3), 199–216.
https://doi.org/10.5307/JBE.2017.42.3.199.
Alamar, M. C., Bobelyn, E., Lammertyn, J., Nicolaï, B. M., &
Moltó, E. (2007). Calibration transfer between NIR diode array
and FT-NIR spectrophotometers for measuring the soluble
solids contents of apple. Postharvest Biology and Technology,
45(1), 38–45. https://doi.org/10.1016/j.postharvbio.2007.01.008.
Alma, O. G., & Bulut, E. (2012). Genetic algorithm based variable
selection for partial least squares regression using ICOMP
criterion. Asian Journal of Mathematics & Statistics, 5, 82–92.
https://doi.org/10.3923/ajms.2012.82.92.
Anyasi, T. A., Jideani, A. I. O., & Mchau, G. A. (2015).
Morphological, physicochemical, and antioxidant profile of
noncommercial banana cultivars. Food Sciences and Nutrition,
3(3), 221–232. https://doi.org/10.1002/fsn3.208.
Ariana, D. P., & Lu, R. (2010). Hyperspectral waveband selection
for internal defect detection of pickling cucumbers and whole
pickles. Computers and Electronics in Agriculture, 74(1), 137–144.
https://doi.org/10.1016/j.compag.2010.07.008.
Ashour, A. S., Guo, Y., Hawas, A. R., & Xu, G. (2018). Ensemble of
subspace discriminant classifiers for schistosomal liver
fibrosis staging in mice microscopic images. Health Information
Science and Systems, 6(1), 1–10.
https://doi.org/10.1007/s13755-018-0059-8.
Baiano, A., Terracone, C., Peri, G., & Romaniello, R. (2012).
Application of hyperspectral imaging for prediction of
physico-chemical and sensory characteristics of table grapes.
Computers and Electronics in Agriculture, 87, 142–151.
https://doi.org/10.1016/j.compag.2012.06.002.
Baietto, M., & Wilson, A. D. (2015). Electronic-nose applications
for fruit identification, ripeness and quality grading. Sensors,
15(1), 899–931. https://doi.org/10.3390/s150100899.
Baranowski, P., Mazurek, W., Wozniak, J., & Majewska, U. (2012).
Detection of early bruises in apples using hyperspectral data
and thermal imaging. Journal of Food Engineering, 110(3),
345–355. https://doi.org/10.1016/j.jfoodeng.2011.12.038.
Barbedo, J. G. A. (2016). A review on the main challenges in
automatic plant disease identification based on visible range
images. Biosystems Engineering, 144, 52–60.
https://doi.org/10.1016/j.biosystemseng.2016.01.017.
Bargoti, S., & Underwood, J. P. (2017). Image segmentation for fruit
detection and yield estimation in apple orchards. Journal of
Field Robotics, 34(6), 1039–1060. https://doi.org/10.1002/rob.
Belous, O., Malyarovskaya, V., & Klemeshova, K. (2016).
Diagnostics of subtropical plants functional state by cluster
analysis. Scientific Journal for Food Industry, 10(1), 237–242.
https://doi.org/10.5219/526.
Bergman, E.-L., Brage, H., Josefson, M., Svensson, O., & Sparén, A.
(2006). Transfer of NIR calibrations for pharmaceutical
formulations between different instruments. Journal of
Pharmaceutical and Biomedical Analysis, 41(1), 89–98.
Biancolillo, A., Bucci, R., Magrì, A. L., Magrì, A. D., & Marini, F.
(2014). Data-fusion for multiplatform characterization of an
Italian craft beer aimed at its authentication. Analytica Chimica
Acta, 820, 23–31. https://doi.org/10.1016/j.aca.2014.02.024.
Bian, X., Li, S., Shao, X., & Liu, P. (2016). Variable space boosting
partial least squares for multivariate calibration of near-
infrared spectroscopy. Chemometrics and Intelligent Laboratory
Systems, 158, 174–179.
https://doi.org/10.1016/j.chemolab.2016.08.005.
Biji, K. B., Ravishankar, C. N., Mohan, C. O., & Srinivasa
Gopal, T. K. (2015). Smart packaging systems for food
applications: A review. Journal of Food Science & Technology,
52(10), 6125–6135. https://doi.org/10.1007/s13197-015-1766-7.
Birle, S., Hussein, M. A., & Becker, T. (2013). Fuzzy logic control
and soft sensing applications in food and beverage processes.
Food Control, 29(1), 254–269.
https://doi.org/10.1016/j.foodcont.2012.06.011.
Bishop, C. M. (2006). Pattern recognition and machine learning.
Springer. https://doi.org/10.1117/1.2819119.
Borràs, E., Ferré, J., Boqué, R., Mestres, M., Aceña, L., & Busto, O.
(2015). Data fusion methodologies for food and beverage
authentication and quality assessment - a review. Analytica
Chimica Acta, 891, 1–14.
https://doi.org/10.1016/j.aca.2015.04.042.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24,
123–140.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32.
https://doi.org/10.1023/A:1010933404324.
Breuel, T. M., Ul-hasan, A., Al-azawi, M. A.,  Shafait, F. (2013).
High-performance OCR for printed English and fraktur using
LSTM networks. In 2013 12th international Conference on
document Analysis and recognition (pp. 683e687). https://doi.org/
10.1109/ICDAR.2013.140.
Calvini, R., Orlandi, G., Foca, G.,  Ulrici, A. (2018). In Development
of a classification algorithm for efficient handling of multiple classes
in sorting systems based on hyperspectral imaging (Vol. 1, pp.
1e15). https://doi.org/10.1255/jsi.2018.a13.
Cen, H., He, Y., & Lu, R. (2016). Hyperspectral imaging-based
surface and internal defects detection of cucumber via
stacked sparse auto-encoder and convolutional neural
network. In 2016 ASABE annual international meeting (p. 1).
American Society of Agricultural and Biological Engineers.
Cha, Y. J., Choi, W., & Büyüköztürk, O. (2017). Deep learning-based
crack damage detection using convolutional neural networks.
Computer-Aided Civil and Infrastructure Engineering, 32(5),
361–378. https://doi.org/10.1111/mice.12263.
Chatzidakis, M., & Botton, G. A. (2019). Towards calibration-
invariant spectroscopy using deep learning. Scientific Reports,
9(1), 2126. https://doi.org/10.1038/s41598-019-38482-1.
Chen, W. R., Bin, J., Lu, H. M., Zhang, Z. M., & Liang, Y. Z. (2016).
Calibration transfer via an extreme learning machine auto-
encoder. Analyst, 141(6), 1973–1980.
https://doi.org/10.1039/c5an02243f.
Cheng, J. H., & Sun, D. W. (2015). Data fusion and hyperspectral
imaging in tandem with least squares-support vector machine
for prediction of sensory quality index scores of fish fillet. LWT
- Food Science and Technology, 63(2), 892–898.
https://doi.org/10.1016/j.lwt.2015.04.039.
Chen, Y., Jiang, H., Li, C., Jia, X., & Member, S. (2016). Deep feature
extraction and classification of hyperspectral images based on
convolutional neural networks. IEEE Transactions on Geoscience
and Remote Sensing, 54(10), 6232–6251.
Chen, S. W., Shivakumar, S. S., Dcunha, S., Das, J., Okon, E., Qu, C.,
et al. (2017). Counting apples and oranges with deep learning:
A data-driven approach. IEEE Robotics and Automation Letters,
2(2), 781–788. https://doi.org/10.1109/LRA.2017.2651944.
Che, W., Sun, L., Zhang, Q., Tan, W., Ye, D., Zhang, D., et al.
(2018). Pixel based bruise region extraction of apple using
Vis-NIR hyperspectral imaging. Computers and Electronics in
Agriculture, 146, 12–21.
https://doi.org/10.1016/j.compag.2018.01.013.
Chuang, C., Ouyang, C., Lin, T., Yang, M., Yang, E., Huang, T., et al.
(2011). Automatic X-ray quarantine scanner and pest
infestation detector for agricultural products. Computers and
Electronics in Agriculture, 77(1), 41–59.
https://doi.org/10.1016/j.compag.2011.03.007.
Cui, S., Ling, P., Zhu, H., & Keener, H. M. (2018). Plant pest
detection using an artificial nose system: A review. Sensors,
18(2), 1–18. https://doi.org/10.3390/s18020378.
De Groote, H. (2012). Crop biotechnology in developing countries.
In Commercial, legal, sociological, and public aspects of agricultural
plant biotechnologies (1st ed., pp. 563–576). Elsevier Inc.
https://doi.org/10.1016/B978-0-12-381466-1.00036-5.
De’ath, G., & Fabricius, K. E. (2000). Classification and regression
trees: A powerful yet simple technique for ecological data
analysis. Ecology, 81(11), 3178–3192.
Ding, L., Dong, D., Jiao, L., & Zheng, W. (2017). Potential using of
infrared thermal imaging to detect volatile compounds
released from decayed grapes. PLoS One, 12(6), 1–11.
https://doi.org/10.1371/journal.pone.0180649.
Du, C. J., & Sun, D. W. (2006). Learning techniques used in
computer vision for food quality evaluation: A review. Journal
of Food Engineering, 72, 39–55.
https://doi.org/10.1016/j.jfoodeng.2004.11.017.
Ekramirad, N., Adedeji, A. A., & Alimardradni, R. (2016). A review
of non-destructive methods for detection of insect
infestation in fruits and vegetables. Innovations in Food
Research, 2, 6–12.
ElMasry, G., Wang, N., & Vigneault, C. (2009). Detecting chilling
injury in Red Delicious apple using hyperspectral imaging and
neural networks. Postharvest Biology and Technology, 52(1), 1–8.
https://doi.org/10.1016/j.postharvbio.2008.11.008.
Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M.,
Blau, H. M., et al. (2017). Dermatologist-level classification of
skin cancer with deep neural networks. Nature, 542(7639),
115–118. https://doi.org/10.1038/nature21056.
Fan, S., Li, J., Xia, Y., Tian, X., Guo, Z., & Huang, W. (2019). Long-
term evaluation of soluble solids content of apples with
biological variability by using near-infrared spectroscopy and
calibration transfer method. Postharvest Biology and
Technology, 151, 79–87.
https://doi.org/10.1016/j.postharvbio.2019.02.001.
Biosystems Engineering 189 (2020) 60–83
Review

Machine learning applications to non-destructive defect detection in horticultural products

Jean Frederic Isingizwe Nturambirwe, Umezuruike Linus Opara*

Postharvest Technology Research Laboratory, South African Research Chair in Postharvest Technology, Department of Horticultural Science, Stellenbosch University, Private Bag X1, Stellenbosch 7602, South Africa

Article info

Article history:
Received 18 June 2019
Received in revised form 22 October 2019
Accepted 10 November 2019
Published online 29 November 2019

Keywords:
Non-destructive
Machine learning
Internal damage
Early detection
Fruit defect classification
Deep learning

Machine learning (ML) methods have become useful tools that, in conjunction with sensing devices for quality evaluation, allow for quick and effective evaluation of the quality of food commodities based on empirical data. This review presents the recent advances in machine learning methods and their use with various sensing devices to detect defects in horticultural products. There are technical hurdles in tackling major issues around defect detection in fruit and vegetables as well as various other food items, such as achieving fast, early and quantitative assessments. The role that ML methods have played towards addressing such issues is reviewed, the present limitations highlighted, and future prospects identified.

© 2019 IAgrE. Published by Elsevier Ltd. All rights reserved.

1. Introduction

For the past few decades, the horticultural sector has seen significant technical advances aimed at reducing postharvest food losses, whereby non-destructive (ND) technology has been increasingly adopted for effective fruit quality evaluation and assurance.
These techniques range from optical and acoustic vibration methods to nuclear magnetic resonance, computer vision techniques, computed tomography, electronic noses (Gao, Zhu, & Cai, 2010), near infrared spectroscopy, hyperspectral imaging and intelligent packaging (Sousa-Gallagher, Tank, & Sousa, 2016). These evolving ND techniques for quality monitoring and assessment, together with packaging and storage solutions, are seen as the main players that in future implementations will help achieve longer sustenance of quality in fruit and vegetables. Future trends also favour the introduction of intelligent packaging, which incorporates ND sensors for chemical, biological or physical characteristics, and radio-frequency identification (Biji, Ravishankar, Mohan, & Srinivasa Gopal, 2015), in packaging systems that allow the overall stability of produce to be monitored during transport or storage (Lee, Lee, Choi, & Hur, 2015). Intelligent systems provide information that can be used to extend the shelf life of food products; they can be made of biodegradable films and are therefore low-cost, which can minimise wastage and also contribute to environmental sustainability (Sousa-Gallagher et al., 2016).

* Corresponding author. E-mail address: opara@sun.ac.za (U.L. Opara).
Available online at www.sciencedirect.com (ScienceDirect)
Journal homepage: www.elsevier.com/locate/issn/15375110
Biosystems Engineering 189 (2020) 60–83
https://doi.org/10.1016/j.biosystemseng.2019.11.011
1537-5110/© 2019 IAgrE. Published by Elsevier Ltd. All rights reserved.

Machine learning (ML) methods are an integral part of the development of many sensing technologies (Cui, Ling, Zhu, & Keener, 2018), responsible for retrieval of information, signal processing and analysis of data acquired by most sensors (Cui et al., 2018; Markom et al., 2009; Xu et al., 2016). They have been shown to overcome the limitations of the classical computing paradigm in cases such as classification and defect detection in various types of fruit using computer vision (Gill, Sandhu, & Singh, 2014; Khoje & Bodhe, 2013). As pointed out by Gill et al. (2014), soft computing models are the enablers of the future use of computer vision based non-destructive studies in fruit. Researchers have repeatedly emphasised the need to improve modelling performance by using advanced feature extraction techniques such as histogram-based feature extraction, grey-level co-occurrence matrix (GLCM) and/or wavelet-based features. ML methods that could address many challenges pertaining to predictive modelling of biosystems have also been proposed; they include neural networks (NNs) and least squares support vector machines (LS-SVM), among others (Baiano, Terracone, Peri, & Romaniello, 2012; Baietto & Wilson, 2015).

Though a widely used and powerful tool in many research fields, such as diagnosing medical abnormalities (Esteva et al., 2017) and defect detection in civil engineering (Cha, Choi, & Büyüköztürk, 2017), deep learning, a sub-field of machine learning, is hardly used in agricultural technologies and even less so in the horticultural industry (Wang, Hu, & Zhai, 2018).
However, there have been recent agricultural applications of convolutional neural networks (CNNs) in image classification (leaf picking) by robotic systems (Ahlin, Joffe, Hu, McMurray, & Sadegh, 2016) and in fruit detection, counting and segmentation (Bargoti & Underwood, 2017; Chen et al., 2017; Sa, Ge, Dayoub, Upcroft, Perez, & McCool, 2016). Deep learning has been reported to enable integration of feature extraction that results in superior performance over conventional image processing methods in many vision tasks (Girshick, Donahue, Darrell, & Malik, 2014), and is therefore a potential candidate for enhancing the performance of defect detection systems.

There have been reviews in which machine learning applications in the food industry have focussed on specific sensors (Du & Sun, 2006), infield usage and sensor fusion (Srivastava & Sadistap, 2018), a specific commodity (Lu, 2017) or on various quality features assessment (Hameed, Chai, & Rassau, 2018; Ropodi, Panagou, & Nychas, 2016). In this review, the focus is on the application of ML in solving the existing issues in non-destructive detection of defects in fruit and vegetables. We explore the role it has played in enabling non-destructive techniques for horticultural quality assessment, especially in defect detection, and we pinpoint the hurdles that researchers are still trying to overcome and discuss future directions for research and applications.

Nomenclature

ANN      Artificial neural network
AUC      Area under curve
BPNN     Back propagation neural network
BSR      Basal stem rot
CA       Clustering analysis
CCD      Charge-coupled device
CFS      Correlation-based feature subset selection
ChiS     Chi square
CNN      Convolutional neural network
CT       Computer tomography
CV       Computer vision
DL       Deep learning
DS       Direct standardisation
DT       Decision tree
ELM      Extreme learning machine
FURIA    Fuzzy unordered rule induction algorithm
GA       Genetic algorithm
GIA      Gini impurity algorithm
GLCM     Grey-level co-occurrence matrix
GPU      Graphical processing unit
HSI      Hyperspectral imaging
IG       Information gain
k-NN     k-nearest neighbour
LDA      Linear discriminant analysis
LINE     A liblinear classifier
LOG      Linear logistic regression
LR       Linear regression
LS-SVM   Least squares support vector machine
LVQN     Learning vector quantization network
ML       Machine learning
MLPNN    Multilayer perceptron neural network
MNF      Minimum noise fraction
MR       Magnetic resonance
mRMR     Minimum redundancy maximum relevance
MSI      Multispectral imaging
NB       Naïve Bayesian method
NBC      Naïve Bayes classifier
ND       Non-destructive
NIR      Near infrared
NNs      Neural networks
NNC      Nearest-neighbour classifier
PCA      Principal component analysis
PDS      Piecewise direct standardisation
PLS      Partial least squares
PLS-DA   Partial least squares discriminant analysis
PLSR     Partial least squares regression
ReCNN    Region based convolutional neural network
RF       Random forest
RMSEP    Root mean square error of prediction
SIRI     Structure illumination reflectance imaging
SFS      Sequential forward selection
SLOG     Simple logistic
SLR      Sparse logistic regression
SMO      Sequential minimal optimisation
SOM      Self-organising maps
SPA      Successive projection algorithm
SSAE     Stacked sparse auto-encoder
SVM      Support vector machine
SWIR     Shortwave infrared
SWNIR    Shortwave near-infrared
VGG      Visual geometry group
Vis      Visible spectrum
WSA      Watershed segmentation algorithm
ZF       Zeiler Fergus
ø        Basis function for a neural network

2. Defect and detection

2.1. Types of defect

Fruits and vegetables are prone to defects due to pre-harvest practices, postharvest handling and storage conditions, which may lead to various losses throughout the food chain. The diagram in Fig. 1 depicts the common types of defects encountered in fruit and vegetables.

Pathological disorders are associated with attacks by viruses, fungi, bacteria or microbial pathogens that in time can lead to fruit spoilage or decay (Fourie, 2008). Many disorders of a pathological nature exist, and their manifestations in agricultural products may be visually similar regardless of the type of infection or product (Barbedo, 2016). Thus, the ability to detect the infecting agent and/or the associated chemical reactions helps identify the cause and accurately determine the specific disorder (Ray et al., 2017).

Excessive external forces in the form of compression or impact cause mechanical damage to agricultural products. This results in tissue failure, pigment deterioration and metabolic changes in affected areas. It increases the vulnerability of the product to infections and reduces its shelf life. Mechanical damage can occur during growth on the tree due to environmental factors, or during and after harvest due to human or machine handling (Hussein, Fawole, & Opara, 2018; Li & Thomas, 2014).
Physiological stresses related to nutrition, temperature and respiration at various developmental stages and during storage can lead to disorders such as bitter pit, watercore, mealiness, sunburn, browning, superficial scald, granulation and internal drying, among others (Herremans et al., 2013, 2014; Magwaza et al., 2012). Fruit with such physiological disorders have lower commercial value (van Dael et al., 2016).
Morphological disorders manifest themselves as deformations that give a product an 'abnormal shape'. Though such deformations may not affect the compositional properties of a product, they complicate some object and defect detection tasks, especially in computer vision, where shadows caused by irregular surface curvature may be misclassified as defects of similar appearance (Anyasi, Jideani, & Mchau, 2015; Moallem, Serajoddin, & Pourghassem, 2017).
Internal defects encompass all latent disorders and damage, which may be pathological, physiological or the early development of mechanical damage. The ability to detect such latent defects is of high importance along the food chain; it provides a way of sorting quality, disease-free fresh produce for the market, and of preventing the spread of disease, possible food losses and consumer dissatisfaction (van Dael et al., 2016; Van Dael, Verboven, Zanella, Sijbers, & Nicolai, 2019; Moggia et al., 2015; Raghavendra & Rao, 2016).
2.2. Techniques for defect detection in plant material
Defects may be latent and internal or externally visible; therefore, detection methods may differ from one case to another, depending on the nature of the defect. The choice of instrumentation may also depend on the context (e.g. research, industrial) and the commodity investigated. Defect detection has three overall outcomes: ensuring consistently high product quality for the consumer, enhancing profitability for the industry and reducing food losses (Lu, 2017).
Many non-destructive techniques are in use for objective detection of defects in plant material. They include optical detection (Tischler, Thiessen, & Hartung, 2018), thermal imaging (Kim, Kim, Park, Kim, & Cho, 2014), structured illumination (Lu, Li, & Lu, 2016; Lu & Lu, 2018), electrical spectroscopy (Khaled, Abd Aziz, Bejo, Nawi, & Abu Seman, 2018), electronic nose (Cui et al., 2018), infrared spectroscopy, hyperspectral imaging (Che et al., 2018), magnetic resonance imaging, X-rays and various biological sensing techniques (Ruiz-Altisent et al., 2010).
Thermal imaging is based on measuring infrared radiation emanating from an object (Van Linden, Vereycken, Bravo, Ramon, & De Baerdemaeker, 2003). It has potential for detecting bruises in fruit because bruised areas have a different temperature from healthy tissue, which produces a contrast in the radiation detected by a thermal camera. Electrical spectroscopy is based on the measurement of electrical properties of a material such as dissipation factor, impedance, dielectric constant and capacitance (Khaled et al., 2018). Rather than using uniform lighting, structured illumination uses spatially patterned (e.g. sinusoidally modulated) lighting to image food products, which makes it capable of depth-resolved and topographic imaging (Lu, 2017). Table 1 summarises some of the most recent non-destructive techniques that were used in tandem with ML for detecting defects in plant material.
Fig. 1 – Common types of defects encountered in horticultural products.
Tischler et al. (2018) used optical measurements (a computer-assisted fluorometer, 'MultiDetExc') to detect 'brown rust' in wheat at an early stage of infection. The method consisted of measuring over time the fluorescence of optically excited chlorophyll (at discrete wavelengths) in wheat plants that had been artificially infected with this fungal disease. The system was reported to be unbiased by daylight, relatively rapid and less invasive than competing methods (Tischler et al., 2018).
Mehl, Chen, Kim, and Chan (2004) developed a hyperspectral imaging system to detect surface defects such as bruises, side rots, flyspecks, scabs and moulds, fungal diseases (such as black pox) and soil contamination in apple fruits. The system consisted of a sample illumination system and a charge-coupled device (CCD) camera that recorded the image from light reflected by the fruit samples and then filtered (Mehl et al., 2004). Like other vibrational spectroscopy techniques, hyperspectral imaging exploits molecular vibrations that arise when molecules interact with electromagnetic radiation. Hyperspectral imaging is an attractive technique because it offers both spectroscopic and imaging aspects and thus enables the simultaneous acquisition of spectral and spatial information from an object for a comprehensive analysis of the material. It has become a popular technique for studying quality in food and agricultural products (Lu, Huang, & Lu, 2017).
2.3. Defect detection challenges
Pre-harvest practices, harvest quality, the genetic predisposition of crops and postharvest storage conditions all play an important role in determining various fruit properties and quality conditions, such as physical features (shape, size, deformations, disorders) and resistance to disease attack (De Groote, 2012; Hussein et al., 2018; Ray et al., 2017). Postharvest handling (harvesting methods, transport and packaging) of fresh fruit is likely to inflict mechanical damage.
A recent review summarised methods for measuring and indexing the potential of bruise damage to produce under mechanical loading, and suggested ways to prevent bruise occurrence through pre- and postharvest handling practices (Opara & Pathare, 2014). When such practices are not enforced, which is common in developing countries, damage and disorders may occur that increase susceptibility to spoilage and may result in economic losses. These losses could be reduced by grading damaged fruit based on accurate determination of damage severity, both internal and external. An objective method for this purpose is required, but it has not yet been developed and remains a challenge for research into food safety (Li & Thomas, 2014).
Another challenge emanates from the nature of defects and how well the link to their cause is understood. For example, structural, cell and tissue damage in fruit (Jiménez, Rallo, Rapoport, & Suárez, 2016) may lead to increased decay and is common in inhomogeneous fruit such as tomato and kiwifruit. Such damage has direct implications for food safety and quality; however, it has received little attention in research. Currently, internal damage can only be visualised destructively. Li and Thomas (2014) speculated that, if a relationship between internal and external damage exists, one could use absorbed energy or peak contact force as a representative measure of internal damage. Predictions of internal damage from the associated surface damage (measured as the area of damaged exocarp) would, however, need to be validated using methods such as those in Idah, Ajisegiri, and Yisa (2007) and Van Zeebroeck et al. (2007). According to Li and Thomas (2014), in order to fully understand the dynamics between handling and associated damage, logistic regression modelling could be complemented with technologies such as X-rays, hyperspectral imaging, magnetic resonance or ultrasonic techniques.
Table 1 – Various sensing techniques used with ML for defect detection in recent years.
Technique | Example of study | Parameters | Reference
Electrical spectroscopy | Classification of diseased oil palm leaves | Impedance, dielectric constant, capacitance, dissipation factor | Khaled et al. (2018)
Thermography | Mechanical damage detection and estimation | Infrared radiation emitted by a heated object | Kim et al. (2014)
Structured illumination | Bruise detection in apples | ‘’ | Lu and Lu (2018)
Hyperspectral imaging (HSI) | Diverse | Molecular vibrational frequencies | Wu and Sun (2013)
‘’ | Physical damage in pear | ‘’ | Lee et al. (2014)
‘’ | Early bruises in peaches | ‘’ | Li et al. (2018)
‘’ | Diverse | ‘’ | Lu et al. (2017)
Shortwave infrared (SWIR) HSI | Bruise detection in apples | ‘’ | Keresztes et al. (2016)
Machine vision | Hidden insect infestation | Electromagnetic emission (visible range) | Moradi (2011); Okamoto (2013); Lu and Ariana (2013)
Magnetic resonance imaging | ‘’ | Relaxation in spin resonance of atomic nuclei | Haishi, Koizumi, Arai, Koizumi, and Kano (2011)
X-ray imaging | ‘’ | Contrasted attenuation of transmitted X-rays | Chuang et al. (2011)
Acoustic | ‘’ | Change in sonic vibration recorded from the emitting source | Hetzroni, Soroker, and Cohen (2016); Mankin, Hagstrum, Smith, Roda, and Kairo (2011); Potamitis and Ganchev (2009)
Gas chromatography | ‘’ | Quantitation of chemical volatile components | Kendra et al. (2011)
‘’: same as above.
Thus, the aspects of damage detection in horticultural products that are still problematic include:
- Detection and determination of the extent of internal defects (Li & Thomas, 2014);
- Early detection (Lu, 2017);
- Objective, quantitative evaluation of mechanical damage (Li & Thomas, 2014; Opara & Pathare, 2014); and
- Fast detection of defects for industrial applications such as sorting and grading systems and portable in-field tests (Abasi, Minaei, Jamshidi, & Fathi, 2018).
Research in postharvest non-destructive quality assessment has aimed at finding solutions to achieve objectives such as those mentioned above. Some techniques that have been used to achieve these goals are shown in Fig. 2. Recent reviews that dealt with defect detection methods, highlighting the progress and exposing the gaps, are summarised in Table 2.
A few approaches and technical solutions have been proposed in the past, of which nuclear magnetic resonance is considered the most prominent; it can quantitatively assess internal and external damage (Zhao, Men, Liu, Wu, & Yan, 2016). However, magnetic resonance (MR) systems are costly, require high expertise to operate and have a low speed of measurement, and their relatively low-cost, low-field versions still lack the specialised customisation needed for practical applications. The implementation of MR systems on sorting lines is also problematic, since these lines are generally made of metals with high magnetic susceptibility, which would disrupt the stability of the measurement fields in MR systems; MR is therefore not yet fit for industrial application. Another very promising technology is NIR-based spectroscopy and imaging which, with adequate feature selection, has been reported to be a convenient option for online sorting (Stella et al., 2015).
However, NIR use is restricted to a limited number of attributes (Lakshmi et al., 2017); an idea worth exploring is the fusion of data generated concomitantly by different devices, so that each compensates for the limitations of the others.
3. ML methods used in ND techniques
ML is a branch of computer intelligence that aims to study and build algorithms that can learn from data and make predictions on it. The goal is to enable computers to continuously improve their performance on a specific task by making data-driven predictions or decisions. The underlying assumption is that the data we observe are generated by a process that is not completely random. ML aims to find a rule that explains the data based on a data sample of limited size (Hsieh, 2009). In the context of this review, the term data refers to empirical data unless explicitly stated otherwise. Empirical data are acquired experimentally through a measurement process as part of scientific inquiry. In defect detection of horticultural products using non-destructive techniques, such data are acquired in the form of images (2- or 3-dimensional), continuous spectral information in the time or frequency domain, or discrete values of numerical or character type.
One sub-field of ML that is also extensively used in horticultural quality assessment is 'pattern recognition'. It deals with the automatic discovery of regularities in data by means of computer algorithms and the use of such regularities in tasks like categorization (Bishop, 2006). The sub-divisions of ML are given in the chart in Fig. 3.
Typical applications of ML in defect detection of horticultural products encompass classification and regression. Classification techniques predict discrete responses: the models are built to assign data to categories. Regression techniques predict continuous responses, such as forecasts of temporal changes in a given time-dependent characteristic.
Many learning algorithms have been used for assessing properties of horticultural products, including defect detection, some being more popular than others depending on the learning task. Different learning methods and their dedicated uses are summarised in Table 3, and more details on some of the popular algorithms are provided below.
3.1. Artificial neural network
Artificial neural networks (ANNs) are designed to mimic the function of a human brain based on models of biological neurons (Jamshidi, 2003). An ANN consists of a number of interconnected neurons (parallel processing units arranged in input, hidden and output layers, see Fig. 4(A)), each of which comprises weights, a threshold and an activation function (Khaled et al., 2018). For high-dimensional data, regression and classification models built on linear combinations of fixed basis functions become ineffective, so the basis functions need to be adapted to the data. Neural networks have been shown to be effective in such pattern recognition settings, and the feed-forward neural network is considered the most successful (Bishop, 2006).
Neural network models follow the form of Eq. (1), which is a linear combination of nonlinear basis functions φj(x).
Fig. 2 – Main challenges of defect detection in horticultural products.
Table 2 – Recent reviews on various techniques for detecting defects in fruits and vegetables.
Topic | Summary | Knowledge gap | Reference
Detecting apple defects by non-destructive spectroscopy and imaging | Overview of common defects in apples, current status and prospects of their detection techniques. | Further research is needed to improve existing techniques and explore new, emerging techniques for more effective detection of both external and internal defects in apples; more research on the development of rapid, low-cost X-ray imaging and MRI sensing systems; further effort toward improving the hardware and software for hyperspectral imaging, for more efficient image acquisition and processing, to enable automated on-line sorting and grading. | Lu (2017)
Pre-harvest factors influencing damage | Understanding factors that influence bruise susceptibility | Reduce the phenomenon of bruise occurrence; manipulation of preharvest factors to influence bruise resistance?; some factors are not widely researched, and more study can shed light on them. | Hussein et al. (2018)
Biosensors for sustainable food engineering | Five challenges for food sustainability and the role of biosensors in addressing them: production challenge about food safety and security; quality challenge in food diversity and qualities; economic challenge in governing the food system, including its packaging and supply chain; environmental challenge, including food waste processing | Explore RFID sensors in smart packaging; graphene-based bio-sensing: superior optical, electric, thermal, mechanical and chemical properties | Neethirajan, Ragavan, Weng, and Chand (2018)
Quantitative measurements of mechanical damage in fruits | Objective and quantitative assessment of damage to enable grading | NMR potential for internal damage in fruit, to achieve transfer to the supply chain; detect internal from external damage? Proposed methods: absorbed energy or peak contact force as surrogate measure of damage; objectively speaking, how does handling cause bruising? Possible solution: logistic regression + spectroscopy, NMR, HSI, X-rays, ultrasonic techniques; multiscale FME modelling: check consistency of regression models linking bruising and mechanical parameters; cell and tissue damage can also lead to food safety and quality issues and should be investigated in microscopy studies | Li and Thomas (2014)
Techniques for measurement of bruise damage | Indexing for bruise potential, methods for bruise measurement; suggested ways to prevent bruise occurrence through pre- and postharvest handling practices | Standardization of bruise assessment criteria, measurement and analytical techniques to improve the traceability and transferability of bruise measurement and to permit inter-laboratory comparisons; bruise susceptibility studies are very helpful in preventing damage during handling operations, but effective prevention is only possible when the factors responsible for bruise development are known; to reduce impact damage, fruit acceleration and deceleration must be carefully controlled; need for “in-depth studies to investigate and predict the effects of bruising on nutritional and flavour quality.” | Opara and Pathare (2014)
Plant pest detection using an artificial nose system | Promising in quick and early non-invasive diagnosis of insect damage and of bacterial, fungal and viral infection in plant tissue. | Challenges with sensor performance, environment suitability for sampling and detection, selectivity and scaling up | Cui et al. (2018)
Non-destructive detection for fruit quality | Detectable defects: 1. internal damage; 2. physical damage; 3. decay; 4. insect damage; 5. frost injury. Tested or prominent detection techniques: 1. sonic vibration, X-ray, MRI, laser inspection; 2. optical absorbance, electrical properties, HSI, NIRS; 3. NMR, electrical properties, HSI; 4. NMR; 5. NMR, HSI | In all, improvements are needed to meet routine application. | Gao et al. (2010)
$$y(\mathbf{x}, \mathbf{w}) = f\left(\sum_{j=1}^{L} w_j \phi_j(\mathbf{x})\right) \tag{1}$$
where $f(\cdot)$ is a nonlinear activation function in classification or the identity in regression, and $w_j$ are coefficients. The basis functions $\phi_j(\mathbf{x})$ are made dependent on parameters that are adjusted, along with the coefficients, during training. A basic neural network model is therefore a series of functional transformations in which each basis function is a nonlinear function of a linear combination of the inputs, and the coefficients of that combination are adaptive parameters. For input variables $x_1, \ldots, x_D$ we build $L$ linear combinations such that the activations $a_j$ are given by
$$a_j = \sum_{i=1}^{D} w_{ji}^{(l_1)} x_i + w_{j0}^{(l_1)} \tag{2}$$
where $l_1$ indicates the first layer of the network, the parameters $w_{ji}^{(l_1)}$ and $w_{j0}^{(l_1)}$ are weights and biases, respectively, and $j = 1, \ldots, L$. The outputs of the basis functions in Eq. (1), denoted by $z_j$ and also referred to as hidden units, are obtained by transforming the activations with a differentiable, nonlinear activation function $h(\cdot)$ (generally a sigmoidal function), such that
$$z_j = h(a_j) \tag{3}$$
Similarly, the output unit activations are given by
$$a_k = \sum_{j=1}^{L} w_{kj}^{(l_2)} z_j + w_{k0}^{(l_2)} \tag{4}$$
for $k = 1, \ldots, K$, where $K$ is the total number of outputs and $l_2$ indicates the second layer. An appropriate transform is then applied to the $a_k$ to produce the outputs $y_k$.
ANNs are known for adaptable learning, good generalisation and noise tolerance. Like other supervised methods, they require a large sample set for training, but they provide more robust algorithms and higher accuracy than unsupervised methods. Nonetheless, they tend to over-fit data, and interpreting the resulting classifier is difficult, a problem inherent in the empirical nature of the modelling; a trained neural network has the characteristics of a 'black box' (Cui et al., 2018).
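The chain of transformations in Eqs. (1) to (4) can be sketched in a few lines of Python. This is only an illustrative forward pass, not a trained model: the function names, the toy weights and the choice of a logistic sigmoid for both the hidden activation and the classification output are assumptions made for the example.

```python
import math

def sigmoid(a):
    """Logistic sigmoid, one common choice for the hidden activation h(.)."""
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, W1, b1, W2, b2):
    """Forward pass of a network with one hidden layer.

    x  : list of D inputs
    W1 : L rows of D first-layer weights, b1 : L first-layer biases
    W2 : K rows of L second-layer weights, b2 : K second-layer biases
    Returns the raw output activations and their sigmoid transform.
    """
    # Eq. (2): hidden activations a_j = sum_i w_ji * x_i + w_j0
    a_hidden = [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(W1, b1)]
    # Eq. (3): hidden units z_j = h(a_j)
    z = [sigmoid(a) for a in a_hidden]
    # Eq. (4): output activations a_k = sum_j w_kj * z_j + w_k0
    a_out = [sum(w * zj for w, zj in zip(row, z)) + b for row, b in zip(W2, b2)]
    # Output transform: identity for regression, sigmoid for two-class problems
    return a_out, [sigmoid(a) for a in a_out]

# Hypothetical toy weights: D = 2 inputs, L = 2 hidden units, K = 1 output
W1 = [[0.5, -0.5], [1.0, 1.0]]
b1 = [0.0, -1.0]
W2 = [[1.0, -1.0]]
b2 = [0.0]
regression_out, class_prob = forward([1.0, 1.0], W1, b1, W2, b2)
```

In practice the weights and biases would be fitted by minimising an error function on training data; the sketch only shows how a prediction is computed once they are known.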
ANNs have been used with electronic nose systems for accurate quantitative analysis, in detecting diseases (Markom et al., 2009), in classification of hyperspectral images of damaged mushrooms (Rojas-Moraleda, Valous, & Gowen, 2017), with dielectric spectroscopy (Khaled et al., 2018) and in some other food quality related applications (Du & Sun, 2006; Gandhi & Armstrong, 2016).
A more recent formulation of neural networks, which has been especially successful in learning applications aimed at pattern recognition in images, is the convolutional neural network (CNN), which implements so-called 'deep learning' (DL), a subset of ML. DL networks are distinguished by more complex layer interconnectivity, more nodes and the capability of automatic parameter extraction; however, training them requires more computational power than conventional neural networks. In addition to CNN, the main architectures of DL networks include recurrent neural networks, recursive neural networks and unsupervised pre-trained networks (Patrício & Rieder, 2018). Though DL is increasingly being used in vision-based implementations for autonomous vehicles and for artificial intelligence, as well as in aspects of signal processing (Marchi, Ferroni, Eyben, Gabrielli, & Squartini, 2014), character recognition (Breuel, Ul-Hasan, Al-Azawi, & Shafait, 2013), language identification (Sak, Senior, & Beaufays, 2014) and translation (Sutskever, Vinyals, & Le, 2014), it has rarely been used in imaging systems for agricultural applications (Naik & Patel, 2017). Nonetheless, there are a few cases where DL has been successfully used for the quality evaluation of agricultural products (Ferentinos, 2018; Fuentes, Yoon, Lee, & Park, 2018; Grinblat, Uzal, Larese, & Granitto, 2016; Mohanty, Hughes, & Salathé, 2016; Picon et al., 2019).
CNN algorithms draw inspiration from biological vision processes in the visual cortex of an animal, where vision cells are sensitive to minute sub-regions of the visual field (Acquarelli, Laarhoven, Gerretzen, & Tran, 2017). CNNs exploit the property that many natural signals can be decomposed hierarchically, so that higher-level features can be obtained by composing lower-level ones. For example, in images, objects are composed of parts, which are made of motifs, which in turn are formed by local combinations of edges. Similar hierarchies can be found in speech and text (LeCun, Bengio, & Hinton, 2015).
Table 2 – (continued)
Topic | Summary | Knowledge gap | Reference
ND methods for detection of insect infestation in fruit and vegetables | The methods have included fluorescence and visible-IR spectroscopy; hyperspectral, X-ray, thermal and MR imaging; and acoustic and chemical emission detection | Future directions: reduce the effect of background data in the resulting profile or data; optimise techniques for a specific fruit and insect; automate techniques for continuous monitoring of insect infestation under real conditions; integrated and simultaneous use of different methods to achieve higher detection accuracy and insect infection management. | Ekramirad, Adedeji, and Alimardani (2016)
Fig. 3 – Main categories of ML; adapted from Hsieh (2009).
Table 3 – Common ML functions.
Algorithm | Learning task
Supervised learning
Decision tree | classification, regression
Bagged and boosted decision trees | classification
Generalised linear model | regression
Support vector machine | classification, regression
Gaussian kernel | classification, regression
Ensembles | classification, regression
Logistic regression | classification
K-nearest neighbour | classification
Discriminant analysis | classification
Neural network | classification
Naïve Bayes | classification
Gaussian process regression model | regression
Nonlinear regression | regression
Genetic linear regression | regression
Unsupervised learning
k-Means | Hard clustering
k-Medoids | Hard clustering
Hierarchical clustering | Hard clustering
Self-organising map | Hard clustering
Fuzzy c-means | Soft clustering
Gaussian mixture model | Soft clustering
Principal component analysis | Dimensionality reduction
Factor analysis | Dimensionality reduction
Nonnegative matrix factorisation | Dimensionality reduction
Fig. 4 – A schematic illustration of an ANN with two hidden layers (A), adapted from Acquarelli et al. (2017), and a CNN (B) for a vision problem (object detection), adapted from Voulodimos, Doulamis, Doulamis, and Protopapadakis (2018).
Some of the CNN architectures found in the literature include region based CNN (ReCNN), fast and faster ReCNN (Sa et al., 2016), ResNet (He, Zhang, Ren, & Sun, 2016), VGG Net (Simonyan & Zisserman, 2015), ZF Net (Zeiler & Fergus, 2014), GoogLeNet (Mohanty et al., 2016), AlexNet (Jiang et al., 2019) and LeNet-5 (Kirk & Wen-Mei, 2016). The structure of a typical CNN is a series of stages starting with convolutional and pooling layers, where the former detect local conjunctions of features from the previous layer while the latter merge semantically similar features into one (LeCun et al., 2015). Generally speaking, a typical DL network is made up of an input layer, where the input is a feature set; a number of stacked stages of convolution, non-linearity and pooling; further convolutional and fully connected layers; and the output layer (see Fig. 4(B)).
3.2. Fuzzy logic
Fuzzy logic simulates the human ability to make complex decisions based on uncertain and vague information. It has proven to be a valuable tool in dealing with incomplete and/or ambiguous information in classification problems, including the grading of fruit using computer vision systems (Shahin, Tollner, & McClendon, 2001). However, it requires tuning for better performance, which can be problematic with high-dimensional data (Du & Sun, 2006). Fuzzy logic has proven instrumental in control systems for managing complex production processes of food and beverages. Instead of representing complex system behaviour by quantitative mathematical expressions of system transfer, fuzzy systems offer the possibility of using simpler linguistic variables and algorithmic formulations (Birle, Hussein, & Becker, 2013).
3.3. Decision trees
Decision trees explain the variation of a single response variable by repeatedly splitting the data into more homogeneous groups, using combinations of explanatory variables that may be categorical (classification) and/or continuous numeric (regression) (De'ath & Fabricius, 2000). A simple prediction model is fitted within each data partition. For classification problems, the accuracy is calculated as the classification gain after every splitting step, whereas for regression the squared error of prediction is used. Algorithms for growing trees are widely available and are summarised in Loh (2011). Decision trees are advantageous in that they are easy to construct and interpret, they can handle various response data types (categorical, numeric, ratings) and they are able to handle missing data in both response and independent variables. Separate tree models can be combined into what is known as a committee of experts in order to enhance model performance, an approach also known as ensemble learning. Popular methods of model combination include bagging and boosting (Moisen, 2008); these have also been applied to other learning methods such as linear discriminant analysis (Ashour, Guo, Hawas, & Xu, 2018), neural networks and partial least squares (Bian, Li, Shao, & Liu, 2016), to name a few.
Bagging is a method for generating multiple versions of a predictor using bootstrap replicates of the learning data set and combining them into one to improve accuracy. When predicting a class, a plurality vote is conducted, whereas for a numerical outcome an average is calculated over the predictor versions. Bagging can improve the accuracy of the combined model if perturbing the learning data set through bootstrap resampling introduces significant variability among the predictors constructed for the aggregate (Breiman, 1996).
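The bagging procedure described above (bootstrap replicates, one predictor per replicate, plurality vote) can be sketched as follows. This is a minimal illustration rather than Breiman's reference implementation: the base learner is a one-feature decision stump instead of a full tree, and the feature values, labels, function names and the number of models are all hypothetical choices made for the example.

```python
import random
from collections import Counter

def fit_stump(xs, ys):
    """Fit a one-feature threshold rule (decision stump) by exhaustive search."""
    best = None
    for t in sorted(set(xs)):
        for left, right in ((0, 1), (1, 0)):
            # Count training errors for the rule "x <= t -> left, else right"
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    _, t, left, right = best
    return lambda x: left if x <= t else right

def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train one stump per bootstrap replicate of the learning set."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        # Bootstrap replicate: n indices drawn with replacement
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models

def bagging_predict(models, x):
    """Plurality vote over the predictor versions."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# Hypothetical toy data: one feature, label 0 = sound, 1 = defective
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0, 0, 0, 0, 1, 1, 1, 1]
models = bagging_fit(xs, ys)
sound_label = bagging_predict(models, 0.05)
defect_label = bagging_predict(models, 0.95)
```

For a numerical response, the vote in `bagging_predict` would be replaced by the mean of the individual predictions.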
Boosting is a technique for combining multiple classifiers that results in a combined model with higher performance than any individual classifier alone. The base classifiers are trained in sequence using a weighted form of the data set, where the weighting coefficient for each data point depends on the performance of the previous classifiers. Once all classifiers have been trained, the final prediction is obtained by weighted majority voting. Boosting can give good results even when the base classifiers are weak learners (learners that perform only slightly better than random). It can be interpreted as a sequential optimisation of an additive model with an exponential error, which opens possibilities for a range of boosting-like algorithms, such as extensions to multiclass and regression problems (Friedman, Hastie, & Tibshirani, 2000). The most widely used boosting algorithm is AdaBoost (adaptive boosting), which is described in Freund and Schapire (1999).
3.4. Random forest
Random forest (RF) is a supervised method based on an ensemble learning algorithm and is popular in classification and regression. RF combines a multitude of decision trees at the training stage, and the mode of the classes predicted by the individual trees is selected as the output class (Cui et al., 2018). RF is efficient for large databases and for variable importance estimation, and the generated forests can be used on future datasets. During prediction, a new object is classified by passing its input vector down every tree of the forest and choosing the class that receives the majority of votes over all trees. Sampling with replacement, with evaluation on the left-out (out-of-bag) samples, ensures an unbiased estimation of classification error and an estimation of feature importance, whereas using randomly selected inputs or combinations of inputs at each node to grow each tree results in the most desirable performance characteristics. The randomness of decision forests can address multiclass problems with unbalanced datasets and overcomes the tendency to overfit that is typical of decision trees (Breiman, 2001). Applications of RF in non-destructive studies of food quality have included digital imaging (Pereira, Barbon, & Valous, 2018), vision systems for object detection and papaya ripeness estimation (Goel & Sehgal, 2015), and others (Adam, Deng, Odindi, Abdel-Rahman, & Mutanga, 2017; Knauer et al., 2017).
3.5. Support vector machine
Support vector machine (SVM) is based on structural risk minimisation from statistical learning theory and is used for nonlinear regression and classification (Huang, Hung, Lee, Li, & Jiang, 2014). In a classification problem, the SVM algorithm seeks the hyperplane that separates the classes with the maximum margin and uses it as a decision function. The basic SVM deals with two-class situations, where the separating hyperplane is defined by a number of support vectors (the data points nearest to the hyperplane, which determine the margin) (Samanta, Al-Balushi, & Al-Araimi, 2003). SVM is known for excellent performance in classification and prediction because it is effective at avoiding overfitting, which is common when modelling high-dimensional data (Huang et al., 2014).
Training data classes are encoded by “1” and “-1”, and the training set is represented mathematically as
$$\{(\mathbf{x}_i, y_i)\}_{i=1}^{l}, \quad \mathbf{x}_i \in \mathbb{R}^n, \quad y_i \in \{-1, +1\}, \quad i = 1, \ldots, l$$
and the hyperplane is given by
$$\mathbf{w} \cdot \mathbf{x} + b = 0$$
where the parameters of the hyperplane are the weight vector $\mathbf{w}$ and the bias, a constant $b$; $\mathbf{x}$ is an input vector. The decision function $f(\cdot)$ can therefore be written as
$$f(\mathbf{x}) = \operatorname{sign}(\mathbf{w} \cdot \mathbf{x} + b)$$
Other kernel-based formulations of SVM can be found in Huang et al. (2014).
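As a small illustration of the linear decision function f(x) = sign(w · x + b), the sketch below evaluates it for a hyperplane whose parameters are simply assumed rather than obtained by training. The function names, the toy values of w and b, and the convention of assigning a score of exactly zero to class +1 are assumptions made for the example. The margin width 2/||w|| holds under the usual canonical scaling in which the support vectors satisfy |w · x + b| = 1.

```python
import math

def svm_decision(w, b, x):
    """Return +1 or -1 according to sign(w . x + b); a zero score maps to +1 here."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else -1

def margin_width(w):
    """Distance between the hyperplanes w . x + b = +1 and w . x + b = -1."""
    return 2.0 / math.sqrt(sum(wi * wi for wi in w))

# Hypothetical hyperplane parameters (assumed, not trained)
w, b = [1.0, -1.0], 0.0
label_a = svm_decision(w, b, [2.0, 1.0])   # score = 2 - 1 = 1
label_b = svm_decision(w, b, [1.0, 3.0])   # score = 1 - 3 = -2
width = margin_width(w)
```

Training an SVM amounts to choosing w and b so that this margin is as wide as possible while the training points are classified correctly, which is the optimisation that library implementations carry out.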
In multiclass problems, three main approaches combine multiple two-class SVMs. The first considers all possible pairs of one class against one other (one versus one) which, for a given number c of classes, results in c(c-1)/2 classifiers; the class of a sample is determined by a voting strategy. In the second approach, each single class, encoded as “1”, is classified against all the remaining (c-1) classes, encoded as “-1”. This results in c two-class training problems, and the decision function with the maximum value determines the class of a new unknown sample. The third approach also uses the c(c-1)/2 two-class problems, and training is similar to that in the 'one versus one' case. In testing, a directed acyclic graph of two-class classifiers is established, and a sample of unknown class is tested starting from the root node (Nasrabadi, 2007).
3.6. Clustering analysis
Clustering analysis is an unsupervised method for uncovering structures and associations in data that are not otherwise evident. It yields results that are easy to understand; however, the methods for determining the appropriate number of clusters are not satisfactory. Results are presented in the form of a dendrogram, in which the closer the points are in the clusters, the more similar the samples (Belous, Malyarovskaya, & Klemeshova, 2016). Clustering analysis has been used successfully in electronic nose applications to detect defects in plants, including plant diseases and artificially or herbivore-induced damage in cucumber, tomato and pepper plants (Markom et al., 2009) and spider mite infestation in cucumber (Laothawornkitkul et al., 2008).
3.7. Linear discriminant analysis (LDA)
LDA is a linear classification technique that aims to maximise between-class variance while minimising within-class variance using 'Fisher's metric', with the assumption that the variance-covariance matrices of the classes are equal (Naik et al., 2017).
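The idea of maximising between-class variance relative to within-class variance can be made concrete for two classes and two features. In the sketch below, which is an illustration under stated assumptions rather than a general implementation, the projection direction is computed as w = Sw⁻¹(m1 − m0), where Sw is the pooled within-class scatter matrix and m0, m1 are the class means; the data points, function names and midpoint threshold are hypothetical.

```python
def mean(points):
    """Mean vector of a list of 2-D points."""
    n = len(points)
    return [sum(p[0] for p in points) / n, sum(p[1] for p in points) / n]

def scatter(points, m):
    """2 x 2 scatter matrix of the points around the mean m."""
    s = [[0.0, 0.0], [0.0, 0.0]]
    for p in points:
        d = [p[0] - m[0], p[1] - m[1]]
        for i in range(2):
            for j in range(2):
                s[i][j] += d[i] * d[j]
    return s

def fisher_direction(class0, class1):
    """Return w = Sw^-1 (m1 - m0) together with the two class means."""
    m0, m1 = mean(class0), mean(class1)
    s0, s1 = scatter(class0, m0), scatter(class1, m1)
    # Pooled within-class scatter; LDA assumes both classes share one covariance
    sw = [[s0[i][j] + s1[i][j] for j in range(2)] for i in range(2)]
    det = sw[0][0] * sw[1][1] - sw[0][1] * sw[1][0]
    inv = [[sw[1][1] / det, -sw[0][1] / det],
           [-sw[1][0] / det, sw[0][0] / det]]
    diff = [m1[0] - m0[0], m1[1] - m0[1]]
    w = [inv[0][0] * diff[0] + inv[0][1] * diff[1],
         inv[1][0] * diff[0] + inv[1][1] * diff[1]]
    return w, m0, m1

def project(w, p):
    """Project a 2-D point onto the discriminant direction."""
    return w[0] * p[0] + w[1] * p[1]

# Hypothetical two-feature measurements for two classes
class0 = [(1.0, 2.0), (2.0, 1.0), (2.0, 3.0), (3.0, 2.0)]
class1 = [(5.0, 6.0), (6.0, 5.0), (6.0, 7.0), (7.0, 6.0)]
w, m0, m1 = fisher_direction(class0, class1)
# Classify by comparing the projection with the midpoint of the projected means
threshold = (project(w, m0) + project(w, m1)) / 2.0
```

A new sample would be assigned to class 1 when its projection exceeds the threshold and to class 0 otherwise.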
If this assumption does not apply, a more generalised formulation, quadratic discriminant analysis, is used (Gewali, Monteiro, & Saber, 2018, pp. 1–46). These classifiers are known as Gaussian generative models and are widely used. They have the advantage of allowing the determination of the marginal density of the data and they perform well on a notably wide and diverse set of classification problems (Maugis, Celeux, & Martin-Magniette, 2011). LDA is a common classification approach in chemometrics and has been used to solve various detection problems including effective classification of fly-infested olive fruit (Moscetti et al., 2015), detecting early bruises in apples (Baranowski, Mazurek, Wozniak, & Majewska, 2012), detecting damage due to fungal decay, shrivel and mechanical load in blueberry (Leiva-Valenzuela & Aguilera, 2013) and determining powdery mildew disease severity in wine grapes (Knauer et al., 2017).
3.8. Genetic algorithm
Genetic algorithms (GAs) are used as tools for optimisation of a given response function and for feature selection (Alma & Bulut, 2012). Inspired by Darwin's theory of natural evolution, they apply genetic operators such as mutation and crossover to select the fittest solution over a number of computational generations until a stop criterion is met (converged solution or maximum number of generations) (Niazi & Leardi, 2012). GAs have been repeatedly used in association with PLS regression to optimise prediction models (improve prediction accuracy and model simplicity) of various properties of foodstuff (Feng & Sun, 2013, pp. 74–83; Nturambirwe, Nieuwoudt, Opara, & Perold, 2017), including defects in horticultural products.
3.9. Other learning methods
Various other learning methods have been applied to non-destructive quality evaluation of agricultural products.
They include logistic regression (Hu, Dong, & Liu, 2016; Jarolmasjed, Khot, & Sankaran, 2018), naïve Bayes (Sinha, Khot, Schroeder, & Sankaran, 2018; van Dael et al., 2016), nearest neighbour (Kuzy, Jiang, & Li, 2018; Moscetti, Haff, Monarca, Cecchini,
& Massantini, 2016), stochastic gradient descent (Mohanty et al., 2016), gradient tree boosting (Che et al., 2018), etc.
Though various learning algorithms may provide acceptable performance at a given learning task, choosing the best-performing one is always preferable. Characteristics such as memory usage, predictive accuracy on test data, training speed and interpretability of the inner workings of an algorithm are typical trade-off criteria that can be used in a trial and error process to make such a selection.
4. Feature extraction and selection
A feature in a horticultural image is, for example, an aspect of interest that is useful in describing fruits; it might be related to colour, shape, size, strength, composition, flavour or a defect. Feature descriptors are commonly used for object detection and image recognition; they represent an image or part of it by retaining useful information while discarding redundant information (Naik & Patel, 2017). In terms of spectral data, features are extracted as spectral bands; however, in some cases spectral features can be combined with spatial (i.e. pixel-based) ones (Knauer et al., 2017). As a crucial step in fruit defect detection, feature extraction is commonly done in order to make data manageable, and feature selection aims to reduce these features to those most significant without loss of information (Leiva-Valenzuela & Aguilera, 2013). This means selecting the lowest number of features that yields the lowest error with the highest number of correct classification hits. Gabor features, Gabor filter, Hu moments, Flusser and Suk moments, local binary patterns, discrete Fourier transform, mean gradient (first-order derivative), mean Laplacian (second-order derivative), mean, standard deviation, skewness and kurtosis are typical methods for feature extraction (Leiva-Valenzuela & Aguilera, 2013).
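To make the simplest of these descriptors concrete, the sketch below computes the mean, standard deviation, skewness and kurtosis of the pixel intensities in a region of interest. It is only an illustration: the function name `first_order_features`, the use of population (rather than sample) moments, and the example intensities are assumptions, and the cited studies may define these statistics differently.

```python
import math
from typing import Dict, Sequence


def first_order_features(pixels: Sequence[float]) -> Dict[str, float]:
    """First-order statistical features of a flat list of pixel intensities.

    Uses population moments; skewness and kurtosis are set to 0.0 for a
    perfectly uniform region, where they are otherwise undefined.
    """
    n = len(pixels)
    mean = sum(pixels) / n
    variance = sum((p - mean) ** 2 for p in pixels) / n
    std = math.sqrt(variance)
    if std == 0:
        skewness = kurtosis = 0.0
    else:
        skewness = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3)
        kurtosis = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4)
    return {"mean": mean, "std": std,
            "skewness": skewness, "kurtosis": kurtosis}


# Hypothetical grey-level intensities from a small fruit region.
region = [1.0, 2.0, 3.0, 4.0, 5.0]
print(first_order_features(region))
```

A feature vector built this way (one set of statistics per colour channel or waveband) is the kind of handcrafted input that a classifier such as LDA or SVM would then be trained on.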
Other methods such as deep feature extraction, which is based on deep neural networks, are useful when the data structure is complex and help limit the network's risk of overfitting, which is typical when the training set is of limited size (Chen, Jiang, Li, Jia, & Member, 2016). It is worth mentioning that, as a general approach followed in computer vision systems, machine learning algorithms are applied after handcrafted algorithms for feature extraction, whereas in the DL framework feature extraction is incorporated as an essential part of the very structure (Rosebrock, 2017).
5. Major ML methods used to detect defects in fruit and vegetables
There has been an increasing use of ML methods in various fields of scientific research and technological development including agriculture (Gandhi & Armstrong, 2016) and the study of food quality (Ropodi et al., 2016). Their uses in enabling the effective detection of damage and disorders in horticultural products have also been reported and are surveyed here with respect to the known detection challenges. An overview of the recent uses of ML in defect detection is given in Table 4.
5.1. Detecting internal defects
Subdermal or internal damage and disorders in fruit and vegetables cannot be identified visually. Computer vision systems operating in the visible range, despite their advanced applicability, are also unable to detect such defects. Alternatives such as NIR spectroscopy and imaging (Liu, Pu, & Sun, 2017); thermal imaging (Ding, Dong, Jiao, & Zheng, 2017); X-ray radiography and tomography (van Dael et al., 2016, 2019; Herremans et al., 2014; Magwaza & Opara, 2014); magnetic resonance imaging (Tao, Zhang, McCarthy, Beckles, & Saltveit, 2014; Zhang & McCarthy, 2012) and ultrasound imaging (Ahmed et al., 2017) have proven capable of testing the internal state of objects. Imaging techniques have shown superior capabilities in the study of internal structure and disorders in fruits and vegetables and are therefore the most preferred.
They do, however, have challenges such as speed limitations, which can be associated with the time cost of data acquisition (X-ray) or processing (HSI); high cost for some devices; or technical limitations (limited penetration depth for infrared-based devices). Other limitations related to inspecting internal features in fruit can emanate from the nature of the object under investigation: soft fruit tissue results in low contrast in X-ray radiographs (Mathanker, Weckler, & Bowser, 2013), and thick rind and opaque fruit limit penetration of infrared radiation for vibrational spectrometry and imaging.
Although research is underway to improve hardware capabilities, a parallel solution based on data handling is also under development. ML tools have been adopted for data mining and analysis in association with non-destructive detection devices, to probe internal structure and disorders of fresh horticultural products.
van Dael et al. (2016) used naïve Bayes and k-nearest neighbours (k-NN) classifiers to separate citrus fruits with internal disorders from those with healthy tissue based on X-ray radiographs. The classification algorithm correctly captured 95.7% of oranges with granulation and 93.6% of lemons affected by endoxerosis.
In an HSI-based study of internal damage and external defects in cucumber, Cen, He, and Lu (2016) demonstrated the ability of a deep learning framework to improve the accuracy of detection models. By combining a CNN with a stacked sparse auto-encoder (SSAE) to learn both spectral and spatial features, higher detection accuracy than that obtained with spectral data alone was consistently obtained at both scanning speeds used (Cen et al., 2016).
Recently, Wang et al. (2018) applied two CNNs, namely residual network (ResNet) and ResNeXt, to the classification of hyperspectral transmittance data. The objective was to improve the accuracy and reduce detection time costs for internal damage in blueberries.
In comparison to other ML classifiers such as RF, linear regression, SVM, bagging and multilayer perceptron, ResNet and ResNeXt yielded superior classification performance in terms of accuracy, precision, F1-score and area under curve (AUC) (Wang et al., 2018).
With the rapidly increasing developments in deep learning applied to object recognition, the use of imaging systems to detect internal defects in horticultural products can be rendered more efficient by implementing pretrained learning platforms for automated detection processes and improving detection specificity.

Table 4 – An overview of recent applications of ML methods to the detection of defects in horticultural products.
Instrument | ML method | Product | Study | Evaluation | Reference
HSI (transmission) | CNN (ResNet/ResNeXt) versus SMO, LR, RF, MLP, Bagging | Blueberry | Internal mechanical damage | Up to (CNN) Acc. = 0.88, Rec. = 0.93, Prec. = 0.86, F1-sc. = 0.89, AUC = 0.92 | Wang et al. (2018)
NIR HSI | k-NN, LDA, NBC, DT, ELM | Mango | Detecting mechanical damage | Correct classification rate: 97.95% | Vélez Rivera et al. (2014)
HSI | CNN-SSAE | Cucumber | Surface and internal defect | Class acc: 91.1% and 88.6% at speeds of 85 and 165 mm/s | Cen et al. (2016)
HSI | Successive projection algorithm | Peaches | Detect fungal disease based on chlorophyll content | Band ratio gave high (98.75%) classification accuracy for diseased peaches | Sun, Wang, et al. (2017)
VIS-HSI | RF | Apple | Bruising | Average accuracy of bruise extraction models reached 99.9% | Che et al. (2018)
HSI | ANN | Peaches | Cold injury | Overall class acc: 95.8%; predictive corr. coef.: 0.698–0.903 | Pan et al. (2016)
Electronic nose | ANN, CA, RF | Diverse | Bacterial, fungal, viral infections and insect damage | Diverse | Cui et al. (2018)
 | BPNN, LVQN, CA, LDA, PCA | Rice plant | Mechanical damage, herbivore attack | Correct class rate training: 100%, test set: 60–100% | Zhou and Wang (2011)
Colorimeter | SVM, LDA | Blueberries | Fungal decay, shrivel, mechanical damage | Classifier performance 97%, 93%, 86% | Leiva-Valenzuela and Aguilera (2013)
Machine vision | Fuzzy logic | Apple | Water core severity | Class acc 86–89% | Shahin et al. (2001)
 | ANN | | Bruise damage | | Shahin, Tollner, McClendon, and Arabnia (2002)
X-ray imaging | ANN | Sweet onion | Defective vs good | Overall class acc 90% | Shahin, Tollner, Gitaitis, Sumner, and Maw (2002)
E-nose, GC–MS | MLPNN, PCA | Strawberry | Pathogenic fungal disease | Class acc: 96.6% | Pan et al. (2014)
HSI | SVM, SLOG, SMO, BNN, FURIA, NNC, LINE, LOG, NB, RF | Apple | Bruise | Correct class rate 95% train, 90% valid | Siedliska, Baranowski, and Mazurek (2014)
HSI (transmission) | SOM, SVM, Active learning algorithm (EER) | Blueberry | Mechanical damage | Acc: 0.87, Prec: 0.93, Recall: 0.78, Training: 9 (EER) | Hu et al. (2018)
X-ray radiography | kNN, NB | Citrus fruits | Internal disorders | Class acc: oranges: 95.7%, lemons: 93.6% | van Dael et al. (2016)
Dielectric spectroscopy | SVM, ANN, SVM-FS, GA, RF | Oil palm | Basal stem rot infection | Overall acc: 88.64%, kappa: 0.8480, mean absolute error: 0.1652 | Khaled et al. (2018)
Wetting sensors network | RF | Apple | Scab | – | Wrzesien, Treder, Klamkowski, and Rudnicki (2019)
HSI | PCA clustering | Apple | Decay | Acc: decay 99%, sound 100% | Li, Luo, Wang, and Fan (2019)
CV | K-means, Fuzzy C-Means | Olive | Surface defects | Overall acc: 88–93% | Hussain and Ahmed (2019)
Acc., accuracy; Rec., recovery rate; Prec., precision; F1-sc., F1-score; AUC, area under curve; corr. coef., correlation coefficient.

5.2. Objective and quantitative measurement of mechanical damage
Quantitative measurement of defects entails establishing defect indices based on the level of severity. Knowledge about the degree of damage is useful in that it allows produce to be graded into such groups as ‘sound’ (fit for long distance transport or long shelf life), ‘mildly damaged’ (fit for processing) and ‘damaged’ (unsafe for consumption but fit for animal feed or to be discarded). The ability to carry out such grading would therefore lead to a reduction in food losses and contribute to food security (Li & Thomas, 2014).
Che et al. (2018) used various classification algorithms to detect bruise damage in apples at three temporal stages of development (0, 12, and 18 h) by comparing pixel-based classification and bruise segmentation methods applied to hyperspectral images. The algorithms included SVM, DT (classification and regression), stochastic gradient descent, RF and gradient tree boosting. Their objective was to improve on the ability of traditional image processing methods to segment bruises by creating a pixel-based bruise extraction method. The RF method was rated best in precision and stability overall, and best for pixel-based bruised region prediction owing to its high classification accuracy and generalisation ability. Classification accuracy was also found to increase with the severity of damage (which depends on time after bruising) (Che et al., 2018).
Recently, Sun, Gu, et al. (2017) achieved a high classification accuracy (up to 96.87%) for both detecting chilling injury and distinguishing between four categories of peaches according to the extent of chilling damage (sound, slight, moderate and heavy injury) by using ANNs (Sun, Gu, et al., 2017).
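As a rough illustration of how a pixel-based classification could feed a severity index, the sketch below turns a binary per-pixel prediction mask into a bruised-area fraction and then into one of the three grades named earlier in this section. The function names and, in particular, the two threshold values are hypothetical placeholders chosen for the example; none of the cited studies prescribes them.

```python
from typing import List


def bruised_fraction(mask: List[List[int]]) -> float:
    """Fraction of fruit pixels predicted as bruised.

    `mask` holds one value per pixel inside the fruit region:
    1 = predicted bruised, 0 = predicted sound.
    """
    total = sum(len(row) for row in mask)
    bruised = sum(sum(row) for row in mask)
    return bruised / total if total else 0.0


def grade(fraction: float,
          mild_threshold: float = 0.05,      # hypothetical cut-off
          damaged_threshold: float = 0.20    # hypothetical cut-off
          ) -> str:
    """Map a bruised-area fraction to a severity grade."""
    if fraction < mild_threshold:
        return "sound"
    if fraction < damaged_threshold:
        return "mildly damaged"
    return "damaged"


# Hypothetical 3 x 4 prediction mask with a single bruised pixel.
mask = [[0, 0, 0, 0],
        [0, 1, 0, 0],
        [0, 0, 0, 0]]
print(grade(bruised_fraction(mask)))
```

In practice the thresholds would have to be calibrated against the intended use of each grade (transport, processing, discard) rather than fixed in advance.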
In another study, when infrared thermography was used with periodic thermal energy input to pear, it was possible to obtain quantitative metrics of the size and depth of bruises from the phase information of thermal emission by the samples (Kim et al., 2014). ML algorithms have been shown to outperform classical image analysis methods at localising damaged areas in fruit, and detection accuracy has been shown to depend on damage severity. High accuracies were also shown to be achievable when distinguishing between degrees of damage using ML methods. With such capabilities, ML is a good candidate for enabling the implementation of quantitative models for defect detection suitable for practical scenarios. However, learning platforms for real-life defect detection applications are required and extensive efforts should be deployed to achieve such a development.
5.3. Early detection of defects
Detection of fruit and vegetable defects at their earliest stage of development is crucial to prevent damage aggravation and possible disease spread over entire containers, which could result in catastrophic food losses. Early detection may refer to detection before manifestation or at the earliest stage of manifestation.
Khaled et al. (2018) proposed a method for early detection of basal stem rot (BSR) disease in oil palm leaves. A series of feature selection algorithms were used, namely genetic algorithm, random forest and support vector machine. The latter was also used, in addition to artificial neural networks, as a classifier. The study clarified the effectiveness of the feature selection methods used and identified the electrical parameter (impedance) best suited to early detection of BSR in oil palm leaves (Khaled et al., 2018). In a study by Vélez Rivera et al. (2014), detection of mechanical damage induced in mango fruit at an early stage of development by an HSI system was assessed.
Using various classification learning methods and selection of the best spectral bands for distinguishing between damaged and sound mangos, they obtained increasing rates of correct classification over seven days after damage induction. From day one, rates of 67.46, 84.63, 89.27, 89.76 and 94.87% were obtained for the classifiers used, i.e. naïve Bayes, ELM, DT, LDA and k-NN, respectively. The latter had the highest classification performance overall, followed by LDA, and both were high enough by day three (97.5% and 95.54%, respectively). However, it is worth noting that feature selection led to lower classification performance than using the full spectral bandwidth; further research effort was therefore recommended to achieve more efficient feature selection (Vélez Rivera et al., 2014). In a study of common fungal disease detection in strawberry using an E-nose, Pan, Zhang, Zhu, Mao, and Tu (2014) achieved an overall discrimination accuracy of 96.6% and an improvement of correct ratios, from 93.3 to 100% in testing samples for individual treatment, as early as day two after inoculation, using an MLPNN classifier. The three types of disease used were also well differentiated from one another throughout the 10-day period of the study using PCA (Pan et al., 2014). E-nose combined with a multi-layer perceptron neural network (MLPNN) classifier was therefore proven an acceptable method for early detection of common fungal infection (early decay) during postharvest storage.
There have been other recent reports on early detection of defects in fruit with successful results. Li, Chen, and Huang (2018) used an improved watershed segmentation technique for hyperspectral images, based on morphological gradient reconstruction, to detect bruises in peaches at their early stage of development. Combined with PCA selection of effective wavelengths, the new segmentation method led to detection accuracies as high as 96.5% for samples with defects and 97.5% for sound ones.
Though the detection accuracy was high, the robustness of the method with respect to biological variability remains untested (Li et al., 2018). A study of bruise detection in pears reported that the use of the lock-in method in infrared thermography was effective for early bruise detection (Kim et al., 2014). This work, however, did not produce a basis for decision support in real-life applications.
In a study attempting to detect early symptoms of decay in navel orange fruits, a rarely used approach based on ‘image visualisation’ was adopted. An algorithm for image segmentation, based on a combination of thresholding and pseudo-colour imaging, was used to locate decayed tissue, resulting in a 100% detection success rate for decayed
samples with an error rate smaller than 1% in sound ones. PCA was instrumental in data reduction, which resulted in four effective wavelengths, and in the classification of categories (decayed vs sound fruit) (Li et al., 2016).
From the various studies mentioned above, it can be seen that, with the wide range of learning algorithms available and their combinations with feature selection techniques, there are generally ways of finding, for a specific problem, the best suited learning algorithm that can supersede the mainstream chemometrics methods. However, there is a need to transfer the many lessons learned into concrete implementations for real-life applications. To that end, it should be stressed that the term “early” detection still needs a standard measure; to date it is experimentally established differently from one study to another. For example, in defects such as bruise damage the term is commonly equivalent to “fresh” bruise, and in some disease infections it practically means the “onset stage” or the “least severe” of the chosen range of infection extent. These terms can be rather vague; they need to be assigned a measurable value, and predicting defects before they manifest would be ideal. Nevertheless, the prediction levels achieved in the reported studies using ML give promise for successful practical implementations within the current framework.
5.4. Fast detection of defects in fruit
Online detection of defects in horticultural products is one of the most desirable aspects of the industrial application of non-destructive methods for sorting and grading of fruit and vegetables. Even though visual systems are already in use for this purpose, they are still ineffective at detecting internal quality and defects (Leiva-Valenzuela & Aguilera, 2013).
Other technologies like hyperspectral imaging that are capable of probing internal defects still face the issue of slow processing speed, and efforts are still being deployed in algorithm development in order to match typical industrial sorting speeds (Calvini, Orlandi, Foca, & Ulrici, 2018). Studies involving advanced learning algorithms have also made progress in reducing image processing time, whereby feature selection and pre-processing methods play an integral role.
Recently, Keresztes, Goodarzi, and Saeys (2016) developed a system for ‘real-time’ detection of bruised Jonagold apples based on shortwave infrared (SWIR) HSI. By combining the best reflectance calibration and the best pre-processing technique for glare correction, the detection accuracy and processing time per apple reached 98% and 20 ms, respectively, whereby the shorter processing times corresponded to slower sample scanning speeds. In order to improve processing speed and reduce glare-induced inaccuracies, further optimisation of the system's hardware and illumination was recommended (Keresztes et al., 2016). Wang, Hu, and Zhai (2018) also showed that convolutional neural networks reduced the time cost of detecting internal damage in blueberries using HSI. With the classification time for each test sample reduced to 5.2 ms and 6.5 ms for the two types of CNN used, the potential of deep CNNs to enable online fruit sorting based on internal damage was demonstrated.
A technical trend that has gained much attention and shows great promise for fast detection of defects in fruits and vegetables is that of hyperspectral imaging. Hyperspectral imaging systems are used in the acquisition of spectral images that serve to determine the optimal wavelengths usable in faster multispectral imaging systems. However, the latter, even though faster, have shown lower detection performance than the former (Huang, Li, Wang, & Chen, 2015). The efficiency of multispectral systems in this scenario depends much upon the transferability of classification algorithms from the HSI to the MSI system. Recently, a comparative study was conducted on detecting various defects in apples with the intent of determining the image recognition method with better portability and stability from an HSI to an MSI system, considering the reduction of illumination evenness. The study concluded that the goal of minimising the effect of uneven illumination and achieving model robustness to the physical and biological variability that hinders the accurate identification of surface defects was achievable (Zhang et al., 2018).
The recent applications of deep learning architectures have opened a window of opportunities whereby graphics processing unit (GPU) oriented programming greatly reduces processing time and has outperformed the classical CPU-based approach. Shorter processing times (per sample) were achieved for defect detection in cucumber by using a GPU implementation of the CNN-SSAE framework that fuses spectral and spatial features of HSI data (Cen et al., 2016). Though the implementation of such GPU-based frameworks currently requires specialists with high levels of coding skill, developing such platforms for specific detection tasks could benefit the horticultural industry in the near future. Although computers with decent GPU capabilities are relatively costly, developments in computer hardware are continually improving affordability, which is likely to alleviate the burden of high cost.
5.5. Other uses of ML in defect detection
Learning algorithms have been used to predict various defects in contexts other than those already covered in the above sections.
Among other applications is the detection of cold injury in peaches, whereby using MLPNN, Pan et al. (2016) successfully distinguished injured from sound peaches in cold storage with high accuracy (92.9–100%) based on HSI data, proving the feasibility of HSI in detecting damage resulting from cold storage (Pan et al., 2016).
In a study of detecting chilling injury, Sun, Gu, et al. (2017), Sun, Wang, et al. (2017) used six optimal wavelengths obtained by the successive projections algorithm (SPA); classification models built with Fisher linear discriminant analysis, SVM, ANN and PLS-DA achieved high detection accuracy (92.96–97.28%) (Sun, Gu, et al., 2017).
Technological developments have enabled machine vision to gain ground in replacing traditional manual handling methods for assessing fruit quality. Machine vision does not always recognise specific defects such as drying, fungal decay and mechanical damage in blueberries (Leiva-Valenzuela & Aguilera, 2013); the detection efficiency depends much upon the objective and tolerance of the classification algorithm used. In this sense, improving the detection performance of computer vision systems and other techniques through learning algorithms has been the subject of numerous studies.
Leiva-Valenzuela and Aguilera (2013) devised a method of detecting damage to blueberries due to fungal infection, shrivel and mechanical stress, in different orientations, using various classifiers. Support vector machine and linear discriminant analysis were reported to have superior performance over the rest of the classifiers used (quadratic discriminant analysis, Mahalanobis distance, k-nearest neighbours and probabilistic neural network) (Leiva-Valenzuela & Aguilera, 2013). The authors reported their recognition approach as promising for online sorting and grading of blueberries based on various defects; however, they stressed the need to incorporate non-visible and internal defects, which requires complementary sensors.
A convolutional neural network was used to identify diseases such as leaf mould, grey mould and plague in tomato plants based on RGB images. A deconvolutional network for deep visualisation was used to analyse the performance of internal layers of a CNN (VGG-16) as influenced by the colour and spectral information of disease images. Because images of each disease present specific characteristics in terms of colour, texture, patterns, location in the plant, shape, etc., it was possible to relate colour sensitivity to a specific disease and therefore determine parameters that could help redesign the CNN and improve its recognition rate (Fuentes, Im, Yoon, & Park, 2017).
6. Important features selection
The purpose of variable selection can be perceived in three aspects: improving model prediction performance, providing faster and more cost-effective predictors, and providing a better understanding of the process generating the data (Guyon & Elisseeff, 2003).
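One simple way to see how variable selection yields "faster and more cost-effective predictors" is a filter method that ranks each waveband independently and keeps only the top few. The sketch below uses the Fisher score (ratio of between-class to within-class variance per feature) as the ranking criterion; this is only one of many possible criteria, and the function names, toy reflectance values and class labels are assumptions made for the example.

```python
from collections import defaultdict
from typing import Hashable, List, Sequence


def fisher_scores(X: Sequence[Sequence[float]],
                  y: Sequence[Hashable]) -> List[float]:
    """Fisher score of every feature (column) of X given class labels y."""
    groups = defaultdict(list)
    for row, label in zip(X, y):
        groups[label].append(row)

    scores = []
    for j in range(len(X[0])):
        column = [row[j] for row in X]
        overall_mean = sum(column) / len(column)
        between = within = 0.0
        for rows in groups.values():
            values = [r[j] for r in rows]
            mu = sum(values) / len(values)
            var = sum((v - mu) ** 2 for v in values) / len(values)
            between += len(values) * (mu - overall_mean) ** 2
            within += len(values) * var
        if within > 0:
            scores.append(between / within)
        else:
            # No within-class spread: perfectly separating or constant.
            scores.append(float("inf") if between > 0 else 0.0)
    return scores


def top_k(scores: Sequence[float], k: int) -> List[int]:
    """Indices of the k highest-scoring features, best first."""
    order = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return order[:k]


# Toy data: two hypothetical wavebands, two samples per class.
X = [[1.0, 5.0], [1.2, 3.0], [3.0, 4.0], [3.2, 6.0]]
y = ["sound", "sound", "bruised", "bruised"]
scores = fisher_scores(X, y)
print(scores, top_k(scores, 1))
```

Because each band is scored in isolation, a filter like this ignores redundancy between neighbouring wavelengths, which is one reason wrapper and embedded methods are also used.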
Considerable effort has been deployed towards the realisation of online inspection, whereby the transition from slow hyperspectral imaging to the fast application level of multispectral imaging requires the key step of selecting the most efficient wavelengths for a specific inspection task (Zhang et al., 2014). However, wavelength selection carried out to enable the development of a multispectral system has, in some cases, proven to reduce the classification performance of the latter (Huang et al., 2015).
Another trend is the combination of NIR hyperspectral imaging with ML techniques. In a study of mechanically induced damage detection in mango fruit, Vélez Rivera et al. (2014) used five classification techniques in combination with eleven feature selection techniques to determine the most relevant features for their classification problem (Vélez Rivera et al., 2014). The feature selection methods included correlation-based feature subset selection (CFS), chi square (ChiS), Fisher score, Gini impurity algorithm (GIA), information gain (IG), minimum redundancy maximum relevance (mRMR), ReliefF, sequential forward selection (SFS), sparse logistic regression (SLR), stepwise, and Student's t-test. The classifiers used were linear discriminant analysis (LDA), k-nearest neighbours (k-NN), naïve Bayes classifier (NBC) as a probabilistic approach, decision trees (DT) and extreme learning machine (ELM). More details on the variable selection methods commonly applied in vibrational spectroscopy were reviewed in Xiaobo, Jiewen, Povey, Holmes, and Hanpin (2010).
It is noticeable from Table 5 that there are more divergences than similarities in the optimal wavelengths obtained in different studies of the same defect in the same commodities. Therefore, there is a need for standardised wavelength characteristics that would ensure optimal performance of multispectral systems for specific tasks.
7. ML as an enabler of data fusion
The scope of sensor fusion entails the use of multiple sensing techniques simultaneously in order to improve the assessment of targeted material properties (Srivastava & Sadistap, 2018). There has been increasing interest in fusing data from complementary sensors to study properties of food items; this approach has proven to provide better insight into a studied item than a single sensor. Various data fusion methodologies exist and can be achieved at different levels of complexity (measurement level, feature level and decision level), as reviewed for the application to food and beverage authentication (Borràs et al., 2015). Various successful cases of data fusion applied to the study of food properties were reported in recent years, whereby ML methods were the enabler of data handling and analysis. These applications include the fusion of spectral and spatial data (feature level) from a Vis-NIR hyperspectral imaging system to predict sensory quality index scores of fish fillet. Calibration models were built using LS-SVM, textural features were extracted using the grey-level gradient co-occurrence matrix method, and the successive projections algorithm was used for effective wavelength selection (Cheng & Sun, 2015). Mendoza, Lu and Cen (2014) showed that fusing systems (among visible and shortwave near infrared (Vis-SWNIR) spectroscopy, acoustic firmness, spectral scattering and bio-yield firmness) provided more complete and complementary information on firmness and soluble solids content and was a more effective approach for predicting these attributes than using individual sensors. Later, they argued that the optical information provided by Vis-SWNIR spectroscopy and scattering techniques on apple firmness and soluble solids content was complementary and thus their fusion would provide higher accuracy than when considered separately (Mendoza et al., 2014). There have been more reports on recent applications of data fusion to the characterisation of food and beverages (Biancolillo, Bucci, Magrì, Magrì, & Marini, 2014; Borràs et al., 2015). However, in recent years, few reports have been found on fused techniques for defect detection in fruits and vegetables. Li, Heinemann, and Sherry (2007) successfully developed data fusion models for apple defect detection by integrating two instruments for volatile detection. Data fusion was carried out at both the feature level (using probabilistic neural network-based sensor fusion) and the decision level (using Bayesian network fusion), yielding superior accuracy (lower classification error) to that obtained from single-sensor data. It was concluded that feature selection was a crucial step in achieving such improved performance of the sensor fusion framework and that the latter was fit for reliable detection of diseased or spoiled apples (Li et al., 2007). It is evident that
  • 16. learning algorithms and soft computing techniques are crucial for non-destructive multi-sensor fusion and this needs more exploration in the characterisation of agricultural products. 8. ML and model transfer Due to instrument-related variations that are typical of most spectrometers, calibration models are bound to the spec- trometer that generated the data and therefore inapplicable on another spectrometer with statistically retained accuracy and precision. There has been progress in developing ways to circumvent this problem which are referred to as calibration or model transfer. The objective is to correct the difference of spectra between the master and a slave instrument by transforming spectra from the latter to appear as if originating from the master instrument. Once this achieved, the original calibration model can be used on the transformed spectra. This approach is known as standardisation and the most popular method, the piecewise direct standardisation (PDS) is used as a benchmark for the new developed transfer methods (Luo et al., 2017). This can also be achieved by a different approach whereby the aim is to correct the new samples predicted values for the bias and the slope of the regression equation, under the assumption that predicted values of two different instruments have linear dependence. Alternatively, a third approach that tries to standardise the model co- efficients can be used. Various other methods have been developed to achieve calibration transfer and they follow two main approaches additionally to standardisation, namely reduction of the difference in data acquired under different conditions and model updating. Data correction applies signal pre-processing methods (Workman, 2018), whereas model updating keeps adding new data acquired under new condi- tions and then rebuilding the model. 
Details on many calibration transfer methods that have been applied to infrared, near-infrared and Raman spectroscopies, together with their advantages and shortcomings, have been reviewed extensively (Feudale et al., 2002; Workman Jr, 2018). A recent study developed a transfer method based on affine transformation which does not require standard samples and was reportedly more effective than most common standardisation methods (Zhao et al., 2019). Aspects of machine learning have also been applied to solve challenges of standardisation such as the possible non-linear relationships between spectra from two instruments (Chen, Bin, Lu, Zhang, & Liang, 2016), which cannot be addressed by direct standardisation (DS) or PDS. Transfer learning was successfully used to improve the implementation of calibration transfer in e-noses (Yan & Zhang, 2016). In horticultural applications, numerous transfer methods have been developed which are mostly aimed at studies of internal quality attributes and most generally based on NIR spectroscopy (Alamar, Bobelyn, Lammertyn, Nicolaï, & Moltó, 2007; Bergman, Brage, Josefson, Svensson, & Sparén, 2006; Fan et al., 2019). Model transfers for defect detection studies, on the other hand, have had little attention. Given that defect detection has been proven more and more feasible with imaging techniques, transfer methods developed for spectroscopy are likely to be obsolete in this case. However, ML and DL have proven effective in calibration for various types of sensors and sensor networks (Chatzidakis & Botton, 2019; Wang et al., 2017), obtaining low root mean square error of prediction (RMSEP) values and stable calibration transfer in spectroscopy cases; therefore, it is an educated guess that they are the best option for implementing model transfer aimed at defect detection studies.

Table 5. Recent applications of feature selection methods to improve learning models for defect detection in fruit and vegetables.
Instrument | Selection method | Type of defect | Product | Waveband | Reference
NIR-HSI | SLR, T test, IG, SFS, mRMR, GIA, ChiS, CFS | Mechanical damage | Mango | 700–780 nm, 890–900 nm, 1070–1080 nm | Vélez Rivera et al. (2014)
HSI | Successive projection algorithm | Fungal infection decay/chlorophyll content | Peach | 617 nm, 675 nm, and 818 nm | Sun, Wang, et al. (2017)
HSI | MLPANN | Cold injury | Peach | 487, 514, 629, 656, 774, 802, 920 and 948 nm | Pan et al. (2016)
HSI (400–1000 nm) |  | Chilling injury | Peach | 580, 599, 650, 675, 710, and 970 nm | Sun, Gu, et al. (2017)
 |  | Cold injury | Nectarine | 670 and 780 nm | Lurie et al. (2011)
 |  | Cold injury | Banana | 660 nm | Hashim et al. (2013)
 |  | Internal defect | Cucumber | 745, 805, 965 and 985 nm | Ariana and Lu (2010)
HSI |  | Cold injury | Apple | 717, 751, 875, 960 and 980 nm | ElMasry, Wang, and Vigneault (2009)
SW-LW-HSI | PCA-weighting coefficient | Bruise | Peach | 781, 816, 840, 945, 1000, 1065, 1260, 1460, 1917, 2500 nm | Li et al. (2018)
HSI (450–1000 nm) to MSI | PCA-weighting coefficient | Bruise | Apple | 780, 850 and 960 nm | Huang et al. (2015)
HSI (408–117 nm) | PCA loadings | Hidden bruise | Kiwifruit | 682, 723, 744, 810, and 852 nm | Lü, Tang, Cai, Zhao, and Vittayapadung (2011)

9. Summary and future directions

Quality control in the horticultural industry is important to ensure food safety and quality and to prevent unnecessary food-related economic losses. Enabling non-destructive detection of defects in horticultural products has two prerequisites: one is the development of detection instruments equipped to address the existing challenges; the other is the advancement of data handling techniques in a way that complements those instruments. Technological advances have been made on both fronts, and ongoing research will help reach the goals being sought. Such goals include user-friendliness of sensing devices (easy to operate and maintain) and their suitability for industrial applications (fast, reliable, portable and cost effective).

HSI has become the most preferred and predominant non-destructive technique applied for defect detection in food and agri-products. Many studies have contributed to continued improvements in reducing image processing time and optimising HSI hardware, which is expected to help match the high sorting speeds required for industrial applications.

Many learning algorithms have been developed to improve detection accuracy and reduce image processing time; they include advanced segmentation techniques, deep learning methods to automate feature extraction, other classification learners for identification and detection of defects based on pixel density (Che et al., 2018), and semi-supervised learning methods such as active learning algorithms.
The latter have been proven to effectively reduce the labelling cost while keeping a high classification performance (Hu, Zhao, & Zhai, 2018). This is cost effective for online applications and similar application environments in which continuous label updating and model transfer are common. Future work should also focus on exploring such semi-supervised learning techniques.

Nonetheless, the challenge remains that of standardisation of techniques; each reported study is more or less limited to a specific instrumental parameter, a single food item and a particular defect, and uses a different learning algorithm or a different validation process. Such specificity limits the widespread use of the technologies beyond research investigations; future work should endeavour to standardise the methodologies that have already been proven successful and make them available for practical use.

No one algorithm can solve all problems. Choosing the appropriate learning algorithm for a specific problem is a crucial step for model effectiveness. Algorithm selection has mostly been a trial-and-error process; various studies have adopted a comparative approach whereby many algorithms are applied to a classification task and, through a trade-off among them based on some performance characteristics, the best algorithm is given preference. Future work should seek to establish frameworks where algorithms trained and tested for a specific application (e.g. specific defects in a given fruit) are recommended as such for further practical use.

Early detection of defects has been predominantly accomplished using e-nose for pathological defects and disorders, where ML plays an integral role in data analysis and acquisition. Detecting mechanical damage at an early stage of development has also been successful using imaging techniques, whereby deep learning methods enabled feature extraction and enhanced detection accuracy.
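The comparative approach to algorithm selection mentioned above can be illustrated with a small, self-contained Python sketch: two simple classifiers are scored by k-fold cross-validation on the same data and the one with the higher mean accuracy is preferred. The classifiers, the synthetic two-feature data (standing in for sound and bruised fruit) and all names are assumptions chosen for illustration; none of them is drawn from the studies reviewed here.

```python
import random
from statistics import mean


def nearest_centroid(train_X, train_y, x):
    """Predict the class whose mean feature vector is closest to x."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroid = [mean(col) for col in zip(*rows)]
        dist = sum((a - b) ** 2 for a, b in zip(x, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label


def one_nearest_neighbour(train_X, train_y, x):
    """Predict the label of the single closest training sample."""
    dists = [sum((a - b) ** 2 for a, b in zip(x, r)) for r in train_X]
    return train_y[dists.index(min(dists))]


def cross_val_accuracy(predict, X, y, k=5, seed=0):
    """Mean accuracy of `predict` over k folds of shuffled indices."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        tX, ty = [X[i] for i in train], [y[i] for i in train]
        hits = sum(predict(tX, ty, X[i]) == y[i] for i in fold)
        scores.append(hits / len(fold))
    return mean(scores)


# Synthetic data: class 0 centred at (0, 0), class 1 centred at (3, 3).
rng = random.Random(42)
sound = [[rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)] for _ in range(40)]
bruised = [[rng.gauss(3.0, 1.0), rng.gauss(3.0, 1.0)] for _ in range(40)]
X = sound + bruised
y = [0] * 40 + [1] * 40

candidates = {"nearest_centroid": nearest_centroid,
              "1-NN": one_nearest_neighbour}
results = {name: cross_val_accuracy(fn, X, y)
           for name, fn in candidates.items()}
best = max(results, key=results.get)
print(results, "preferred:", best)
```

Accuracy is the only criterion used here; a real comparison would typically also weigh characteristics such as processing time, which is where the trade-off arises.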
However, the feasibility of applying these deep learning techniques to other promising technologies such as thermography, radiography and magnetic resonance remains unexplored and should be taken into consideration in the future.

ML has been effectively used in assessing quantitative measures of damage (severity/degree of damage) such as mechanical damage and chilling injury in fruits. Most generally, such studies have relied on an experimentally established index of severity level, obtained either by direct induction through an experimental procedure or from the temporal evolution of the defect. Learning methods were used to elucidate the distinction among different degrees of damage as captured by an objective non-destructive measurement technique. This topic is also underexplored by advanced ML algorithms; quantitation of damage and disorders remains a challenge in the horticulture sector, which calls for more attention to the use of learning algorithms in the future.

‘Fast detection’ is one of the most sought-after goals for the use of ND detection methods in the fruit and vegetable industry. The most successful case has been that of computer vision, but its capabilities are limited to surface defects. HSI, and most especially MSI, on the other hand, is the most prominent candidate for fast detection, where ML and especially deep learning have been instrumental in the successful implementation of ‘fast’ systems. Deep learning has allowed for improvements in reducing the time cost of image processing and in effective feature extraction for direct defect identification. The use of deep learning to this end has shown great promise but little has been investigated in this regard. Therefore, more studies are needed to bring this goal to realisation.
ML has played an important role in enabling multi-sensor data fusion (where the nature of the sensors allows it), whereby the data can be fused at different levels to achieve better model performance and robustness than when a single sensor is used. Many studies have been conducted in quality evaluation of food items, but little has been reported on detection of damage and disorders in fruit and vegetables. Future research should consider this approach as an option to help overcome existing challenges.

There has also recently been an issue of the “cheating model” raised in Li et al. (2019), which seems to have more to do with the learning techniques than the data. Although the issue they raised was directed at analytics in agricultural products, it is a situation that can be encountered in any other field of data-driven research. Holding competitions to check the models, as they suggest, could also produce the same situation, especially when there are conflicts of interest. There is, however, a possibility of enforcing data and model sharing: especially in cases where the work is not patented, the models can be checked on a monitored platform for the purpose of authenticating results, the same purpose for which peer review is carried out. There are already ongoing examples of code sharing on “GitHub” and other “Git-like” platforms by developers; a similar system could be used to share models and check data. It is the authors' opinion that such a system, which is already working in other areas of technical development, could stand a good chance of operating fairly.

With the advances in DL that are already revolutionising pattern recognition, DL platforms have become easily accessible and they are adaptable to other applications. There are successful case studies of this kind in plant disease and fruit detection, and they can be extended to defects in fruits. A large-scale database of images that capture various defects in fruits and vegetables would be useful in training such DL platforms, which in turn would lead to advancing automation in grading and sorting systems. The predisposition of HSI to acquire both spatial and spectral data, structured-illumination reflectance imaging (SIRI) and thermal imaging offer a chance to probe internal damage (Lu & Lu, 2018). Deep learning offers an opportunity for training models on a massive scale, which has already been proven by cases such as the “ImageNet” competitions and similar complex problems.
The idea proposed here seems feasible, given that it is already possible to populate such a database by capturing images of fruit and vegetable defects.

10. Conclusions

ML and DL methods have proven to hold promise for overcoming the existing challenges around effective, objective and fast detection of defects in horticultural products. Research has proven ML methods to be effective at enhancing accuracy of detection by enabling data and sensor fusion, data dimensionality reduction or feature extraction. They allow for faster spectral and image processing than traditional segmentation methods and faster object detection, and automation becomes very feasible with trained learning algorithms. The use of ML has been efficient in data-driven problem solving in many areas of science and technology. In the future, more effort should be deployed to establish focused, objective frameworks that provide standardised solutions to the current problems around detection of defects in fruit and vegetables. One typical such idea would be to establish deep learning platforms trained and dedicated to recognising various defects in fruit and vegetables based on images acquired by non-destructive imaging devices such as hyperspectral imaging systems.

Declaration of Competing Interest

The authors declare no conflict of interest.

Acknowledgement

This work is based on the research supported wholly by the National Research Foundation of South Africa (Grant Numbers: 64813). The opinions, findings and conclusions or recommendations expressed are those of the author(s) alone, and the NRF accepts no liability whatsoever in this regard.

References

Abasi, S., Minaei, S., Jamshidi, B., & Fathi, D. (2018). Dedicated non-destructive devices for food quality measurement - a review. Trends in Food Science & Technology, 78, 197–205. https://doi.org/10.1016/j.tifs.2018.05.009.
Acquarelli, J., Van Laarhoven, T., Gerretzen, J., & Tran, T. N. (2017).
Convolutional neural networks for vibrational spectroscopic data analysis. Analytica Chimica Acta, 954, 22–31. https://doi.org/10.1016/j.aca.2016.12.010.
Adam, E., Deng, H., Odindi, J., Abdel-Rahman, E. M., & Mutanga, O. (2017). Detecting the early stage of phaeosphaeria leaf spot infestations in maize crop using in situ hyperspectral data and guided regularized random forest algorithm. Journal of Spectroscopy, 2017, 1–9. https://doi.org/10.1155/2017/6961387.
Ahlin, K., Joffe, B., Hu, A. P., McMurray, G., & Sadegh, N. (2016). Autonomous leaf picking using deep learning and visual-servoing. IFAC-PapersOnLine, 49(16), 177–183. https://doi.org/10.1016/j.ifacol.2016.10.033.
Ahmed, M. R., Yasmin, J., Ahmed, M. R., Yasmin, J., Lee, W., Mo, C., et al. (2017). Imaging technologies for nondestructive measurement of internal properties of agricultural products: A review. Journal of Biosystems Engineering, 42(3), 199–216. https://doi.org/10.5307/JBE.2017.42.3.199.
Alamar, M. C., Bobelyn, E., Lammertyn, J., Nicolaï, B. M., & Moltó, E. (2007). Calibration transfer between NIR diode array and FT-NIR spectrophotometers for measuring the soluble solids contents of apple. Postharvest Biology and Technology, 45(1), 38–45. https://doi.org/10.1016/j.postharvbio.2007.01.008.
Alma, O. G., & Bulut, E. (2012). Genetic algorithm based variable selection for partial least squares regression using ICOMP criterion. Asian Journal of Mathematics & Statistics, 5, 82–92. https://doi.org/10.3923/ajms.2012.82.92.
Anyasi, T. A., Jideani, A. I. O., & Mchau, G. A. (2015). Morphological, physicochemical, and antioxidant profile of noncommercial banana cultivars. Food Sciences and Nutrition, 3(3), 221–232. https://doi.org/10.1002/fsn3.208.
Ariana, D. P., & Lu, R. (2010). Hyperspectral waveband selection for internal defect detection of pickling cucumbers and whole pickles. Computers and Electronics in Agriculture, 74(1), 137–144. https://doi.org/10.1016/j.compag.2010.07.008.
Ashour, A. S., Guo, Y., Hawas, A.
R., & Xu, G. (2018). Ensemble of subspace discriminant classifiers for schistosomal liver fibrosis staging in mice microscopic images. Health Information Science and Systems, 6(1), 1–10. https://doi.org/10.1007/s13755-018-0059-8.
Baiano, A., Terracone, C., Peri, G., & Romaniello, R. (2012). Application of hyperspectral imaging for prediction of physico-chemical and sensory characteristics of table grapes.
Computers and Electronics in Agriculture, 87, 142–151. https://doi.org/10.1016/j.compag.2012.06.002.
Baietto, M., & Wilson, A. D. (2015). Electronic-nose applications for fruit identification, ripeness and quality grading. Sensors, 15(1), 899–931. https://doi.org/10.3390/s150100899.
Baranowski, P., Mazurek, W., Wozniak, J., & Majewska, U. (2012). Detection of early bruises in apples using hyperspectral data and thermal imaging. Journal of Food Engineering, 110(3), 345–355. https://doi.org/10.1016/j.jfoodeng.2011.12.038.
Barbedo, J. G. A. (2016). A review on the main challenges in automatic plant disease identification based on visible range images. Biosystems Engineering, 144, 52–60. https://doi.org/10.1016/j.biosystemseng.2016.01.017.
Bargoti, S., & Underwood, J. P. (2017). Image segmentation for fruit detection and yield estimation in apple orchards. Journal of Field Robotics, 34(6), 1039–1060. https://doi.org/10.1002/rob.
Belous, O., Malyarovskaya, V., & Klemeshova, K. (2016). Diagnostics of subtropical plants functional state by cluster analysis. Scientific Journal for Food Industry, 10(1), 237–242. https://doi.org/10.5219/526.
Bergman, E.-L., Brage, H., Josefson, M., Svensson, O., & Sparén, A. (2006). Transfer of NIR calibrations for pharmaceutical formulations between different instruments. Journal of Pharmaceutical and Biomedical Analysis, 41(1), 89–98.
Biancolillo, A., Bucci, R., Magrì, A. L., Magrì, A. D., & Marini, F. (2014). Data-fusion for multiplatform characterization of an Italian craft beer aimed at its authentication. Analytica Chimica Acta, 820, 23–31. https://doi.org/10.1016/j.aca.2014.02.024.
Bian, X., Li, S., Shao, X., & Liu, P. (2016). Variable space boosting partial least squares for multivariate calibration of near-infrared spectroscopy. Chemometrics and Intelligent Laboratory Systems, 158, 174–179. https://doi.org/10.1016/j.chemolab.2016.08.005.
Biji, K. B., Ravishankar, C. N., Mohan, C. O., & Srinivasa Gopal, T. K. (2015).
Smart packaging systems for food applications: A review. Journal of Food Science and Technology, 52(10), 6125–6135. https://doi.org/10.1007/s13197-015-1766-7.
Birle, S., Hussein, M. A., & Becker, T. (2013). Fuzzy logic control and soft sensing applications in food and beverage processes. Food Control, 29(1), 254–269. https://doi.org/10.1016/j.foodcont.2012.06.011.
Bishop, C. M. (2006). Pattern recognition and machine learning. Springer. https://doi.org/10.1117/1.2819119.
Borràs, E., Ferré, J., Boqué, R., Mestres, M., Aceña, L., & Busto, O. (2015). Data fusion methodologies for food and beverage authentication and quality assessment - a review. Analytica Chimica Acta, 891, 1–14. https://doi.org/10.1016/j.aca.2015.04.042.
Breiman, L. (1996). Bagging predictors. Machine Learning, 24, 123–140.
Breiman, L. (2001). Random forests. Machine Learning, 45(1), 5–32. https://doi.org/10.1023/A:1010933404324.
Breuel, T. M., Ul-hasan, A., Al-azawi, M. A., & Shafait, F. (2013). High-performance OCR for printed English and Fraktur using LSTM networks. In 2013 12th International Conference on Document Analysis and Recognition (pp. 683–687). https://doi.org/10.1109/ICDAR.2013.140.
Calvini, R., Orlandi, G., Foca, G., & Ulrici, A. (2018). In Development of a classification algorithm for efficient handling of multiple classes in sorting systems based on hyperspectral imaging (Vol. 1, pp. 1–15). https://doi.org/10.1255/jsi.2018.a13.
Cen, H., He, Y., & Lu, R. (2016). Hyperspectral imaging-based surface and internal defects detection of cucumber via stacked sparse auto-encoder and convolutional neural network. In 2016 ASABE annual international meeting (p. 1). American Society of Agricultural and Biological Engineers.
Cha, Y. J., Choi, W., & Büyüköztürk, O. (2017). Deep learning-based crack damage detection using convolutional neural networks. Computer-Aided Civil and Infrastructure Engineering, 32(5), 361–378. https://doi.org/10.1111/mice.12263.
Chatzidakis, M., & Botton, G. A. (2019).
Towards calibration-invariant spectroscopy using deep learning. Scientific Reports, 9(1), 2126. https://doi.org/10.1038/s41598-019-38482-1.
Chen, W. R., Bin, J., Lu, H. M., Zhang, Z. M., & Liang, Y. Z. (2016). Calibration transfer via an extreme learning machine auto-encoder. Analyst, 141(6), 1973–1980. https://doi.org/10.1039/c5an02243f.
Cheng, J. H., & Sun, D. W. (2015). Data fusion and hyperspectral imaging in tandem with least squares-support vector machine for prediction of sensory quality index scores of fish fillet. LWT - Food Science and Technology, 63(2), 892–898. https://doi.org/10.1016/j.lwt.2015.04.039.
Chen, Y., Jiang, H., Li, C., Jia, X., Member, S. (2016). Deep feature extraction and classification of hyperspectral images based on convolutional neural networks. IEEE Transactions on Geoscience and Remote Sensing, 54(10), 6232–6251.
Chen, S. W., Shivakumar, S. S., Dcunha, S., Das, J., Okon, E., Qu, C., et al. (2017). Counting apples and oranges with deep learning: A data-driven approach. IEEE Robotics and Automation Letters, 2(2), 781–788. https://doi.org/10.1109/LRA.2017.2651944.
Che, W., Sun, L., Zhang, Q., Tan, W., Ye, D., Zhang, D., et al. (2018). Pixel based bruise region extraction of apple using Vis-NIR hyperspectral imaging. Computers and Electronics in Agriculture, 146, 12–21. https://doi.org/10.1016/j.compag.2018.01.013.
Chuang, C., Ouyang, C., Lin, T., Yang, M., Yang, E., Huang, T., et al. (2011). Automatic X-ray quarantine scanner and pest infestation detector for agricultural products. Computers and Electronics in Agriculture, 77(1), 41–59. https://doi.org/10.1016/j.compag.2011.03.007.
Cui, S., Ling, P., Zhu, H., & Keener, H. M. (2018). Plant pest detection using an artificial nose system: A review. Sensors, 18(2), 1–18. https://doi.org/10.3390/s18020378.
De Groote, H. (2012). Crop biotechnology in developing countries. In Commercial, legal, sociological, and public aspects of agricultural plant biotechnologies (1st ed., pp. 563–576).
Elsevier Inc. https://doi.org/10.1016/B978-0-12-381466-1.00036-5.
De’ath, G., & Fabricius, K. E. (2000). Classification and regression trees: A powerful yet simple technique for ecological data analysis. Ecology, 81(11), 3178–3192.
Ding, L., Dong, D., Jiao, L., & Zheng, W. (2017). Potential using of infrared thermal imaging to detect volatile compounds released from decayed grapes. PLoS One, 12(6), 1–11. https://doi.org/10.1371/journal.pone.0180649.
Du, C. J., & Sun, D. W. (2006). Learning techniques used in computer vision for food quality evaluation: A review. Journal of Food Engineering, 72, 39–55. https://doi.org/10.1016/j.jfoodeng.2004.11.017.
Ekramirad, N., Adedeji, A. A., & Alimardradni, R. (2016). A review of non-destructive methods for detection of insect infestation in fruits and vegetables. Innovations in Food Research, 2, 6–12.
ElMasry, G., Wang, N., & Vigneault, C. (2009). Detecting chilling injury in Red Delicious apple using hyperspectral imaging and neural networks. Postharvest Biology and Technology, 52(1), 1–8. https://doi.org/10.1016/j.postharvbio.2008.11.008.
Esteva, A., Kuprel, B., Novoa, R. A., Ko, J., Swetter, S. M., Blau, H. M., et al. (2017). Dermatologist-level classification of skin cancer with deep neural networks. Nature, 542(7639), 115–118. https://doi.org/10.1038/nature21056.
Fan, S., Li, J., Xia, Y., Tian, X., Guo, Z., & Huang, W. (2019). Long-term evaluation of soluble solids content of apples with biological variability by using near-infrared spectroscopy and calibration transfer method. Postharvest Biology and Technology, 151, 79–87. https://doi.org/10.1016/j.postharvbio.2019.02.001.