SlideShare a Scribd company logo
1 of 23
Download to read offline
Research Article
Online Adaptive Bearings Condition Assessment Using
Continuous Hidden Markov Models
Francesco Cartella1
and Hichem Sahli1,2
1
Electronics and Informatics Department (ETRO), Vrije Universiteit Brussel (VUB),
1050 Brussels, Belgium
2
Interuniversity Microelectronics Center (IMEC), 3001 Leuven, Belgium
Correspondence should be addressed to Francesco Cartella; fcartell@etro.vub.ac.be
Received 2 July 2014; Accepted 9 October 2014
Academic Editor: Xiaotun Qiu
Copyright ยฉ F. Cartella and H. Sahli. This is an open access article distributed under the Creative Commons Attribution License,
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Condition assessment (CA) of bearings can be performed by using left-right hidden Markov models, because of the monotonically
increasing pattern of the bearings degradation. Classical CA approaches assume that all possible system states are fixed and known a
priori. The training of the system is performed offline at once with data from all of the system states. These assumptions significantly
impede condition assessment applications in case that all the possible states of the system are not known in advance, or changes in
environmental or operative conditions occur during the toolโ€™s usage. To overcome these limitations, we propose combining left-right
continuous HMMs (CHMM) with a change point detection algorithm for (i) estimating, from historical observations, the initial
number of the CHMM states and the initial guess of its model parameters and (ii) updating the state space as well as the model
parameters during monitoring. Moreover, to deal with multidimensional sensor measurements, we propose using kernel principal
component analysis for dimensionality reduction. Qualitative and quantitative evaluations of the proposed methodology have been
performed using both simulated and real data from the NASA benchmark repository. Compared to state of the art techniques, the
proposed methodology results in (i) an improvement of the HMM training phase in terms of iterations number; (ii) the detection
of unknown states at an early stage; and (iii) an effective change of the CHMMโ€™s structure to represent the degradation processes
more accurately in presence of unknown conditions.
1. Introduction
Machines and equipments breakdown in industrial environ-
ments can have significant impact in the profitability of a
business. Maintaining the engineering assets in operating
conditions is an industrial and economical requirement.
Nowadays optimization strategies for maintenance are an
important part of research for companies [1, 2], since, next
to energy costs, maintenance spending due to equipment or
machine failure can be the largest part of the operational
budget [3].
Corrective maintenance [1, 4], which is the system main-
tenance strategy subject of this work, is based on condition
monitoring and is performed after a degradation modality
emerges in a system, with the goal of preventing the system
from failing. A fundamental requirement of this strategy is
the detection of the abnormal behavior, that will lead to the
system failure, at an early stage in order to give enough time
to perform the repair operations.
Industrial equipments and their components are often
characterized by a very complex structure. For this reason
the usage of mathematical models for condition monitoring
can represent an arduous task. On the other hand, industrial
machines are usually equipped with monitoring devices (sen-
sors placed in critical parts of the system under study) which
are able to record and collect large amount of measurements.
Given the availability of these huge amounts of data, the
methodology proposed in this paper will focus on data-
driven models [5], which are based on machine learning
approaches.
The classical usage of data-driven techniques consists of
two phases: a first training phase in which a preprocessing
of the historical raw sensory data is performed in order to
extract relevant features, which will be then exploited to
Hindawi Publishing Corporation
Advances in Mechanical Engineering
Article ID 758785
2 Advances in Mechanical Engineering
learn the parameters of behavioral models of the considered
machine and a second application phase, during the runtime
of the system, in which the trained model is used along with
runtime measurement to evaluate the health condition of
the machine. The advantage of such approaches is that, for
a well monitored system, there is no need for knowledge
of the systemโ€™s internal physical mechanism or for a prior
mathematical model to represent the system degradation.
Several data-driven models for condition monitoring
have been proposed, such as clustering techniques [6],
data-mining approaches [7], support vector machines [8],
dynamic Bayesian networks [9], and hidden Markov models
[10, 11]. Hidden Markov models (HMMs) have been shown
to be effective pattern recognition tools in a large variety
of recognition tasks. They have been successfully applied to
condition monitoring of manufacturing machines [12, 13] as
well as in other application domains ranging from speech
recognition [14] to intrusion detection in computer systems
[15].
The common usage of hidden Markov hodels, especially
for condition assessment applications, consists of dividing
the degradation pattern of the historical data of the system
under study into different degradation levels, from normal
to deeper degradation (i.e., good, partially worn, seriously
worn, broken), and training either several HMMs, each one
corresponding to observations (features) from a particular
degradation level [16โ€“19] or training a single HMM using
historical observations corresponding to the normal behavior
of the system [19โ€“23]. Online condition assessment is then
performed by selecting the HMM that, on new incoming
data, gives the highest likelihood or by detecting an abnor-
mality if the likelihood is lower than a threshold defined
for the normal behavior [19]. However, the latter approach
is not well suited to condition assessment of industrial
systems, as it can be used only for anomaly detection. Indeed,
such approach is only able to detect a condition that is
different from the considered normal behavior in the training
phase.
Another approach [14] consists of training a single HMM
by using the sensor measurements corresponding to the
entire lifetime of the system, by setting the number of states
equal to the number of degradation stages contained in the
training history. By using this approach, the model maps each
degradation phase to a state of the HMM. The condition
assessment is performed online on new incoming data by
calculating the Viterbi path [24] which returns the most
probable state sequence.
The approaches described above assume that all possible
system condition states are known a priori and that training
data sets from the associated states are available. In addition,
the training procedure has to be performed offline and only
once at the beginning, with the available training set. For
this reason, the trained models are able to recognize only
the conditions contained in the training data. These assump-
tions significantly impede component diagnosis applications
when:
(i) it is difficult to identify and train all of the possible
states of the system in advance;
(ii) environmental factors or operative conditions change
during the toolโ€™s usage.
Since industrial machines are constantly subjected to
updates, changes of configurations or changes of working
conditions, a need for models that are able to automati-
cally adapt to new situations, by dynamically learning the
structure and the parameters, is required. To the best of our
knowledge, such approach has only been used by Lee et al.
[25], whom proposed a modified hidden Markov model with
variable state space to estimate the current state of system
degradation as well as to detect unknown faults at an early
stage. A multivariate statistical process control is used to
detect the occurrence of an unknown condition by measuring
the deviation of the current signal from a reference signal
representing prior known states; then the HMM is updated
by including a new state. The limitation of this approach is
that (i) one should collect reference signal representing prior
known states and (ii) define a parametric model (Gaussian in
the case of [25]) of the observation signals.
To overcome the abovementioned drawbacks of most
of the state of the art HMM-based condition assessment
approaches, we proposed in [26] a methodology based on
HMMs which is able to perform (i) an automatic estimation
of the initial parameters and number of states of the model
and (ii) an online adaptation of the model structure, together
with the online learning of the parameters, in presence of not
stable conditions. The proposed approach aims to diagnose
the degradation processes of industrial systems using a
variable state space left-right continuous HMM (CHMM), by
integrating a change point detection (CPD) algorithm [27].
The CPD algorithm allows detecting the dynamic changes
of machine condition and hence of the HMM parameters,
by discovering time points at which the properties of the
input time-series data change, corresponding to the (signal)
segmentation of the observed data stream. The detected
segments correspond to the different states of the machine. In
this work, as a further contribution, we evaluate the reliability
of the data driven identification of the number of (possible)
systemโ€™s states by means of clustering techniques and the
Dunnโ€™s validity indexes [28].
Concerning the online condition assessment, the com-
parison between the estimated most-likely state, by the
Viterbi algorithm [24], and the CPD detected segments
(states) allows fast failures detection. Moreover, since the
CPD algorithm is independent from the density model of the
observation, in comparison with the method of Lee et al. [25],
our proposal is more effective as it is independent of prior
data and allows faster detection of unknown conditions.
The used CPD algorithm is able to handle 1-dimensional
time series, while industrial machines are usually equipped
with several sensors for which multiple features can be
extracted, resulting in a multidimensional observation vector.
To have an accurate modeling for real world cases, it is
essential to consider as much information as possible related
to system changes, carried by multiple signals. In this work
we propose applying kernel principal component analysis
(KPCA) [29โ€“31] for dimensionality reduction. KPCA, being
able to effectively deal with nonlinear data, embeds the
Advances in Mechanical Engineering 3
information coming from all the measurements in to a 1-
dimensional time series and allows a faster detection of
condition changes by the CPD algorithm.
Although it is adaptable to a wide range of health
estimation problems, we applied the proposed methodology
on bearings condition monitoring. This choice derives from
the fact that, being the most common mechanical elements in
industry, bearings are the components which most frequently
fail in rotating machines [32]. Furthermore, given the nature
of the component, we can assume that the degradation pat-
tern of bearings is monotonically increasing and, therefore,
the use of left-right HMMs is well-motivated.
The rest of the paper is organized as follows. Section 2
gives a short overview of the used concepts and methods,
namely, continuous left-right hidden Markov models, the
preprocessing and features extraction techniques, and the
change point detection algorithm. In Section 3, we provide
a detailed description of the proposed approach. In Section 4
experimental results are discussed. The conclusion and future
research directions are given in Section 5.
2. Methods and Algorithms
Before going into a more detailed discussion of the proposed
approach, an overview of the techniques used in this paper is
given in this section.
2.1. Left-Right Continuous Hidden Markov Models. Left-
right vontinuous hidden Markov models (CHMMs) [14] are
dynamic models in which there are two types of variables:
a state variable describing the system state, which is never
observed/measured directly (hidden), and an observation
variable which is related to the hidden state.
A left-right CHMM ๐œ† = (๐ด, ๐ถ, ๐œ‡, ๐‘ˆ, ๐œ‹) is characterized by
the following:
(1) ๐‘, the number of states in the model, ๐‘† = {๐‘†1, . . . , ๐‘† ๐‘}
the set of states, and ๐‘ ๐‘ก the state at time ๐‘ก,
(2) the state transition probability distribution ๐ด = {๐‘Ž๐‘–๐‘—}
where
๐‘Ž๐‘–๐‘— = ๐‘ƒ (๐‘ ๐‘ก+1 = ๐‘†๐‘— | ๐‘ ๐‘ก = ๐‘†๐‘–) , 1 โ‰ค ๐‘–, ๐‘— โ‰ค ๐‘, (1)
(3) X๐‘ก, the continuous observation vector emitted from
the state at time ๐‘ก,
(4) the observation density, represented by a finite mix-
ture:
๐‘๐‘— (X) =
๐‘€
โˆ‘
๐‘š=1
๐‘๐‘—๐‘šโ„ต (X, ๐œ‡ ๐‘—๐‘š, U ๐‘—๐‘š) , 1 โ‰ค ๐‘— โ‰ค ๐‘, (2)
where X is the vector being modeled, ๐‘๐‘—๐‘š the mixture
coefficient for the mth mixture in state ๐‘—, and โ„ต is any
log-concave or elliptically symmetric density (usually
a Gaussian), with mean vector ๐œ‡๐‘—๐‘š and covariance
matrix U ๐‘—๐‘š for the mth mixture component in state ๐‘—.
The parameters ๐‘๐‘—๐‘š, ๐œ‡๐‘—๐‘š, and U ๐‘—๐‘š, are the coefficients
of ๐ถ, ๐œ‡ and ๐‘ˆ, respectively,
(5) the initial state distribution ๐œ‹ = {๐œ‹๐‘–} where
๐œ‹๐‘– = ๐‘ƒ (๐‘ 1 = ๐‘†๐‘–) , 1 โ‰ค ๐‘– โ‰ค ๐‘. (3)
It is proved that the probability density function of (2)
can be used to approximate, arbitrarily closely, any finite,
continuous density function [33]. Hence it can be applied to
a wide range of problems.
The left-right topology has the property that the state
index can either increase or stay the same, as time goes on.
In this case, the following holds for the coefficients of the
transition matrix:
๐‘Ž๐‘–๐‘— = 0, ๐‘— < ๐‘–, (4)
๐‘Ž๐‘–๐‘— = 0, ๐‘— > ๐‘– + ฮ”, (5)
๐‘Ž ๐‘๐‘ = 1, (6)
๐‘Ž ๐‘๐‘– = 0, ๐‘– < ๐‘, (7)
where (4) avoids transition to previous states and (5) avoids
jumps of more than ฮ” states, while the state ๐‘ is an absorbing
state. Moreover, since the model starts from state ๐‘†1, the initial
state probabilities have the following property:
๐œ‹๐‘– = {
0 ๐‘– ฬธ= 1
1 ๐‘– = 1.
(8)
For CHMMs applied to condition monitoring, learning
and inference procedures can be used.
(1) Learning: adjust the model parameters ๐œ† = (๐ด, ๐ถ, ๐œ‡,
๐‘ˆ, ๐œ‹) in order to maximize the probability ๐‘ƒ(X | ๐œ†);
that is, for the considered CHMM structure, find the
model parameters that better fit the historical sensor
measurements X. This problem is solved through the
Baum-Welch algorithm [34].
(2) Decoding: given a model ๐œ† = (๐ด, ๐ถ, ๐œ‡, ๐‘ˆ, ๐œ‹) and a se-
quence of sensor measurements X = X1, X2, . . . , X ๐‘‡,
find the hidden state sequence ๐‘† = ๐‘ 1, ๐‘ 2, . . . , ๐‘  ๐‘‡ that
has most likely produced the observation sequence.
The solution of this problem is given by the Viterbi
algorithm [24].
In the next section we will describe how we can fit
the multidimensional time-series input coming from the
machine sensors into a CHMM model.
2.2. Feature Extraction and Dimensionality Reduction. In
this paper, the inputs are vibration signals originating from
sensors on the machine under study. For this purpose, one
accelerometer or more accelerometers are placed in the
component housing and are used to translate the vibrations
emitted to an electric continuous signal. For each sensor,
the electric signal measured is then sampled at a predefined
sampling frequency. The monitoring system of the machine
acquires and stores such samples one at a time, under the
form of a digital signal, which represents the raw input data,
referred to as ๐‘Ÿ(๐‘ก) with 1 โ‰ค ๐‘ก โ‰ค ๐‘‡. It is assumed that a
4 Advances in Mechanical Engineering
r(t)
Window
1
Window
2
Window
3
Window
n โˆ’ 1
Window
n
X(1) X(2) X(3) X(n โˆ’ 1) X(n)ยท ยท ยท
ยท ยท ยท
Figure 1: Feature extraction from raw signals.
characteristic shape of the measured vibration is related to the
actual condition of the component.
Due to the high sampling frequency that is often used, a
high-dimensional signal is measured. Moreover, consecutive
values of a time series are usually not independent but
highly correlated; thus there is a lot of redundancy. Hence,
rather than looking at each point in time sequentially to
detect anomalies, it is common to use sliding windows
of length ๐ฟ. Within each window, ๐‘˜ and ๐‘‘ features are
calculated, resulting in a vector of ๐‘‘ components X(๐‘˜) =
[๐‘ฅ1(๐‘˜), ๐‘ฅ2(๐‘˜), . . . , ๐‘ฅ ๐‘‘(๐‘˜)] ๐‘‡
. Finally, the raw signal ๐‘Ÿ(๐‘ก) is trans-
formed in a multivariate time series as (10). The process is
illustrated in Figure 1. Consider
Raw signal: ๐‘Ÿ (๐‘ก) = ๐‘Ÿ (1) , . . . , ๐‘Ÿ (๐‘‡)
โ†“ features extraction,
X (๐‘ก) = [X (1) , . . . , X (๐‘›)] ,
(9)
X (๐‘ก) =
[
[
[
[
[
[
๐‘ฅ1 (1)
...
๐‘ฅ ๐‘‘ (1)
]
]
]
, . . . ,
[
[
[
๐‘ฅ1 (๐‘›)
...
๐‘ฅ ๐‘‘ (๐‘›)
]
]
]
]
]
]
. (10)
In this work, as also explained in Sections 4.2.1 and 4.2.3,
we considered windows of size ๐ฟ = 20480 (fixed by the used
data set) and ๐‘‘ = 3 time domain features, namely, the mean,
the root mean square, and the kurtosis, of the sensory signals.
It has to be noted that frequency [35] (i.e., Fourier transform)
or time-frequency [36, 37] (i.e., wavelet transform) domain
features could be used as well.
Since the considered CPD algorithm is able to process
1-dimensional signals, a further dimensionality reduction
technique has to be applied on the ๐‘‘-dimensional features
vectors. In this paper, kernel principal component analysis
(KPCA) [29โ€“31] is applied as a technique for dimensionality
reduction.
In the literature, principal component analysis (PCA) has
been extensively used as a features extraction technique for
condition assessment applications [18]. However, KPCA has
been shown to perform better than the standard PCA for
processes with particularly nonlinear characteristics [31]. The
basic idea of KPCA is that when the variations are nonlinear,
the data can always be mapped into a higher-dimensional
space, by means of nonlinear kernel functions, in which they
vary linearly. In this work Gaussian radial basis kernels have
been used. The reader is referred to [29, 31] for the details on
KPCA.
The main advantages of KPCA are the fact that (i) it does
not involve nonlinear optimization [29] since it requires only
linear algebra, making it as simple as standard PCA which
requires only the solution of an eigenvalue problem; (ii) due
to its ability to use different kernels, it can handle a wide
range of nonlinearities; (iii) it does not require the number of
components to be extracted be specified prior to modeling.
Applying KPCA on feature vectors X(๐‘ก) generates a 1-
dimensional time-series, ๐‘ฆ(๐‘ก), (๐‘ก = 1, . . . , ๐‘›), which can be
used as input to the CPD algorithm.
2.3. Change Point Detection Algorithm. The change point
detection approaches apply data mining techniques to iden-
tify the time points at which the changes, that is, events, occur.
This has been discussed in several applications such as fraud
detection [38], rare event discovery [39], event/trend change
detection [27], and activity monitoring [40].
In [27] a method has been proposed for the detection
of the appropriate set of number of points that minimizes
the error in fitting a predefined function using maximum
likelihood. There is no fixed number of change-points to be
detected. Moreover, no constraints are imposed on the class
of functions that will be fitted to the subsequences between
successive change-points.
Following the notation in [27], let ๐‘ฆ(๐‘ก), (๐‘ก = 1, . . . , ๐‘›), be
the time series to be segmented. It is assumed that the time
series can be modeled mathematically, where each model
is characterized by a set of parameters. The problem of
change-points detection is formulated as finding a piecewise
segmented model, given by the following:
๐‘Œ = ๐‘“1 (๐‘ก, V1) + ๐‘’1 (๐‘ก) , (1 < ๐‘ก โ‰ค ๐œƒ1) ,
= ๐‘“2 (๐‘ก, V2) + ๐‘’2 (๐‘ก) , (๐œƒ1 < ๐‘ก โ‰ค ๐œƒ2) ,
= โ‹… โ‹… โ‹…
= ๐‘“๐‘™ (๐‘ก, V๐‘™) + ๐‘’๐‘™ (๐‘ก) , (๐œƒ๐‘™โˆ’1 < ๐‘ก โ‰ค ๐‘›) ,
(11)
where ๐‘“๐‘–(๐‘ก, V๐‘–) is the function (with its vector of parameters V๐‘–)
that is fitted to the segment ๐‘–. The ๐œƒ๐‘–โ€™s are the change-points
between successive segments, and ๐‘’๐‘–(๐‘ก)โ€™s are error terms.
Advances in Mechanical Engineering 5
Several types of basis functions, ๐‘“๐‘–(๐‘ก, V๐‘–), can be consid-
ered, for example, algebraic polynomials, wavelet, Fourier,
and so forth [41]. In our current implementation polynomial
fitting functions have been selected, since they represent a
good compromise between expressiveness power and com-
putational effort.
In [27], two CPD approaches have been proposed, the
batch (offline) and the incremental (online). In the batch
version, the entire data set is available in advance, from
which the best set of change-points is determined. In the
incremental version, the algorithm receives new data points
one at a time and determines if the new observation causes
a new change-point to be discovered. Both approaches are
summarized here after. Details on the implementation can be
found in [27].
2.3.1. Batch Algorithm. The batch CPD algorithm assumes
that the entire time series data is available before the analysis
begins. Let ๐‘† be the residual sum of squares for the selected
fitting function that maintains time points ๐‘ก๐‘–, ๐‘ก๐‘–+1, . . . , ๐‘ก๐‘—. In
this paper we define the likelihood criterion as L(๐‘–, ๐‘—) = ๐‘†
since we used a homoscedastic error model [27].
The key idea behind the algorithm in [27] is that, at every
iteration, each segment is examined to see whether it can be
split into two significantly different segments. The splitting
procedure can be illustrated by a consideration of the first
stage, since all subsequent stages consist of equivalent scaled-
down problems.
Let the data set cover the time points ๐‘ก1, ๐‘ก2, . . . , ๐‘ก ๐‘›. The
change-point in the first stage is the ๐‘— minimizing L(1, ๐‘—) +
L(๐‘— + 1, ๐‘›), say ๐‘—โˆ—
. Here ๐‘—โˆ—
is defined as follows:
L (1, ๐‘—โˆ—
) + L (๐‘—โˆ—
+ 1, ๐‘›) = min
๐‘โ‰ค๐‘—โ‰ค๐‘›โˆ’๐‘
L (1, ๐‘—) + L (๐‘— + 1, ๐‘›) .
(12)
The range of ๐‘— depends on the fact that at least ๐‘ points are
needed for model fitting in each segment.
At the second stage, the best candidate change-points ๐‘1
and ๐‘2 of each of the two segments are located. The selection
of the better of these two candidates yields a division of the
original sequence into three segments. Assuming that point ๐‘1
is chosen, now the likelihood criteria of the model becomes
as follows:
L = (L (1, ๐‘1) + L (๐‘1 + 1, ๐‘—โˆ—
) + L (๐‘—โˆ—
+ 1, ๐‘›))
< (L (1, ๐‘—โˆ—
) + L (๐‘—โˆ—
+ 1, ๐‘2) + L (๐‘2 + 1, ๐‘›)) .
(13)
The procedure above is repeated until a stopping criterion
is reached. Once the algorithm has detected all โ€œrealโ€ change-
points, adding any more change-points the likelihood will
increase. Therefore, the algorithm should stop when the like-
lihood criteria becomes stable or starts to increase. Formally,
if in iterations ๐‘ง and ๐‘ง + 1 the respective likelihood criteria
values are L ๐‘ง and L ๐‘ง+1, the algorithm should stop if
(L ๐‘ง โˆ’ L ๐‘ง+1)
L ๐‘ง
< ๐‘ , (14)
where ๐‘  is a user-defined stability threshold. When stability
threshold ๐‘  is set to 0%, the algorithm stops only when the
likelihood criteria starts increasing.
2.3.2. Incremental Algorithm. In an online setting, the
change-point detection must proceed concurrently with data
collection. The key idea of the incremental algorithm is that
if the next data point collected by the sensor reflects a signif-
icant change in phenomenon, then the likelihood of being a
change-point is going to be smaller than the likelihood that
it is not. However, if the difference in likelihoods is small,
we cannot definitively conclude that a change did occur,
since it may be the artifact of a large amount of noise in the
data. Therefore a user-defined likelihood increase threshold
is introduced. Consider
(Lno change โˆ’ Lchange)
Lno change
> ๐›ฟ, (15)
where ๐›ฟ is a user-defined likelihood increase threshold.
Suppose that the last change-point was detected at time
๐‘ก๐‘™โˆ’1. At time ๐‘ก๐‘™ the algorithm starts by collecting enough data
to fit the regression model. Suppose at time ๐‘ก๐‘— a new data
point is collected. The candidate change-point is found by
determining ๐‘ก๐‘–, with likelihood criterion Lmin(๐‘™, ๐‘—), such that
Lmin (๐‘™, ๐‘—) = min
๐‘™<๐‘–โ‰ค๐‘—
L (๐‘™, ๐‘–) + L (๐‘– + 1, ๐‘—) . (16)
If this minimum is significantly smaller than L(๐‘™, ๐‘—), that
is, the likelihood criteria of no change-points from ๐‘ก๐‘™ to
๐‘ก๐‘—, then ๐‘ก๐‘– is a change-point. Otherwise, the process should
continue with the next point, that is, ๐‘ก๐‘—+1.
In the incremental algorithm, execution time is a sig-
nificant factor. If enough information is stored, some of
the calculations can be avoided. Thus, at time ๐‘ก๐‘—+1 to find
likelihood criteria
Lmin (๐‘™, ๐‘— + 1) = min
๐‘™<๐‘–โ‰ค๐‘—
L (๐‘™, ๐‘–) + L (๐‘– + 1, ๐‘— + 1) , (17)
it is only necessary to calculate L(๐‘– + 1, ๐‘— + 1), since (๐‘™, ๐‘–) was
calculated in the previous iteration.
It should be noted that if a change-point is not detected for
a long time, the successive computations become increasingly
expensive. A possible solution is to consider a sliding window
of only the last ๐‘ž points.
3. Framework
The framework of the proposed approach is illustrated in
Figure 2. As discussed in the introduction, the methodology
is applied in two phases, the training (phase 1) and the online
application (phase 2).
3.1. Phase 1: Estimation of the Initial Number of States and
Parameters Initialization. During the first phase, we assume
that enough historical raw data have been collected from the
monitoring system of the machine under study. Such data can
be measurements from multiple sensor types (temperature,
6 Advances in Mechanical Engineering
Phase 1-training
Raw signal
training set
Features
extraction
Kernel
PCA
Kernel
PCA
Batch CPD
algorithm
Segmented Number of states
estimation
Initial state number
Guess = Nโˆ—
X
X
y
y
Model
Parameters initialization
Initial parameters
Guess = ๐œ†0
Learning procedure (Baum Welch)
Learned parameters ๐œ†โˆ—
CHMM model
Phase 2-testing
Online testing
signal Last L raw
data points
Features
extraction
Online CPD
algorithm
New state
detected?
Yes
Add the testing
signal to the
training set
Add a new state
to the CHMM
(Nโˆ—
= Nโˆ—
+ 1)
X
X
y
Condition
assessment
(Viterbi path)
Figure 2: Framework of the proposed approach.
pressure, acceleration,. . .etc.). For each sensor measurement,
as described in Section 2.2, we apply the feature extraction,
yielding multidimensional feature vectors X(๐‘ก), on which
we apply KPCA dimensionality reduction. The resulting 1-
dimensional signal, ๐‘ฆ(๐‘ก), is then input to the CPD algorithm,
which in turn delivers (detects) ๐‘โˆ—
segments.
Each segment is mapped to a HMM state. The intuition
of this approach comes from the assumption that there is a
correlation between observations and states. Thus, a variation
in the condition of the system corresponds to a change in the
behavior of the sensorial measurements.
Once the number of states (๐‘โˆ—
) has been estimated, and
knowing the end-points of each segment, the initialization
of the HMM parameters ๐œ†0 = (๐œ‹0, ๐ด0, ๐ถ0, ๐œ‡0, U0) can
be performed. Since left-right HMMs are considered, ๐œ‹0 is
initialized by using (8). Using (4), (6), (7), and (5) with ฮ” = 1,
the transition matrix ๐ด0 is initialized as follows:
๐‘Ž๐‘–๐‘— =
# tr (๐‘–, ๐‘—)
#๐‘ ๐‘–
, ๐‘— = ๐‘– + 1 (18)
โˆ€๐‘–, ๐‘Ž๐‘–๐‘– = 1 โˆ’ โˆ‘
๐‘— ฬธ=๐‘–
๐‘Ž๐‘–๐‘—, (19)
where # tr(๐‘–, ๐‘—) is the number of transitions from segment ๐‘– to
๐‘— (which is equal to 1 if left-right HMMs) and #๐‘ ๐‘– the number
of data points in segment ๐‘–. The transition matrix is ensured
to satisfy the stochastic constraint, being โˆ€๐‘–, โˆ‘ ๐‘
๐‘—=1 ๐‘Ž๐‘–๐‘— = 1.
Concerning the coefficients ๐œ‡๐‘—๐‘š, U ๐‘—๐‘š, and ๐‘๐‘—๐‘š relative to
the observation density parameters (see equation (2)), they
are estimated from the multidimensional features signal X(๐‘ก)
for each segment (state) 1 โ‰ค ๐‘— โ‰ค ๐‘โˆ—
with the following
equations:
โˆ€๐‘š, ๐œ‡ ๐‘—๐‘š = X(๐‘ก), ๐‘ก โˆˆ ๐‘—; (20)
โˆ€๐‘š, U ๐‘—๐‘š = Cov (X (๐‘ก)) , ๐‘ก โˆˆ ๐‘—, (21)
where X(๐‘ก) and Cov(X(๐‘ก)) are, respectively, the mean vector
and the covariance matrix of the multidimensional features
signal. The mixture coefficients ๐‘๐‘—๐‘š are initialized by using the
Advances in Mechanical Engineering 7
uniform distribution, for each ๐‘— and ๐‘š : ๐‘๐‘—๐‘š = 1/๐‘€, ๐‘€ being
the number of mixtures.
Together with the features extracted from the training
histories, the estimated initial parameters ๐œ†0 are used as
inputs to the Baum-Welch procedure. The obtained optimal
model parameters, ๐œ†โˆ—
, of the ๐‘โˆ—
-states left-right CHMM, are
then used in the following phase 2 for the online adaptive
condition assessment.
3.2. Phase 2: Online Adaptive Condition Assessment. The
application phase is performed on new incoming data which
are referred to as testing signal in Figure 2. These raw data are
acquired online, during the normal runtime of the machine
under study and are collected one sample a time. When
a window of predefined length ๐ฟ of points are acquired,
the feature extraction procedure is executed to produce
multidimensional features. Two parallel process are then
executed: the trained CHMM is used to estimate the current
systemโ€™s state, while the online CPD algorithm keeps track
of the encountered changes of conditions to possibly detect
previously unknown situations. In particular:
(i) for condition assessment, the multidimensional fea-
tures signal is input to the inference algorithm, using
the current ๐‘โˆ—
-states left-right CHMM model and its
optimal parameters ๐œ†โˆ—
, to calculate, via the Viterbi
algorithm [14], the most probable state sequence,
given the observations up to that moment. The last
state in the inferred sequence is then selected as the
actual condition of the machine;
(ii) for possible HMM state and parameters update, we
map the multidimensional features to 1-dimensional
signal using the out-of-sample extension according to
the kernel PCA parameters calculated in the training
phase. The obtained 1-dimensional signal is input to
the online CPD algorithm to detect possible changes,
which could correspond to state changes. Here, we
assume that an unknown situation is experienced
by the system when the detected segments (change-
points) exceed ๐‘โˆ—
, being the number of states of the
optimal CHMM of the previous phase or iteration.
In such case, when enough samples coming from
the new condition are acquired, a new state is added
to the CHMM, to obtain a new model with ๐‘โˆ—
=
๐‘โˆ—
+ 1 states. The model parameters, ๐œ†โˆ—
, are also
adapted to the new situation, by performing the
training procedure as described above in phase 1,
using the current testing signal together with the
training histories.
4. Experimental Results
To verify the effectiveness of the proposed methodology, sev-
eral experiments have been performed both on simulated and
on real data. The simulated data were generated according
to the parameters described in the example reported in [25],
while condition monitoring data related to bearings taken
from the NASA benchmark repository [42] have been used
as real data. All the parameters concerning kernel PCA and
CPD algorithms have been carefully estimated through linear
searches.
4.1. Simulated Experiment. In this section we illustrate the
performances of the proposed approach by means of a
numerical example proposed in [25]. The goal of these
experiments is to study the case in which some of the hidden
states of the CHMM are not known in advance, but they
appear during the normal functioning of the system.
To better illustrate the effectiveness of the proposed
methodology we compare it to method of Lee et al. [25],
denoted here after as modified HMM approach. Moreover, we
evaluate the results according to two criteria:
(i) the classification accuracy, defined as the percentage
of states correctly classified by the Viterbi algorithm
with respect to the true known state sequence,
(ii) the detection time, defined as the moment in time in
which the unknown state is detected.
4.1.1. Data Generation. A training set composed of 5 histories
from a 4-state CHMM has been generated. Supposing that
two signals (๐‘‹1 and ๐‘‹2) are monitored, the observation
density follows a two-jointly Gaussian distribution. The
CHMM parameters used to generate the training data are as
follows:
๐œ‹ =
[
[
[
[
1
0
0
0
]
]
]
]
, ๐ด =
[
[
[
[
0.99 0.01 0 0
0 0.99 0.01 0
0 0 0.99 0.01
0 0 0 1
]
]
]
]
,
๐œ‡1 = [
20
20
] , ๐œ‡2 = [
20
35
] ,
๐œ‡3 = [
35
35
] , ๐œ‡4 = [
35
20
] ,
๐‘ˆ1 = [
20 0
0 20
] , ๐‘ˆ2 = [
15 0
0 15
] ,
๐‘ˆ3 = [
15 โˆ’2
โˆ’2 15
] , ๐‘ˆ4 = [
5 0
0 5
] .
(22)
For each history, 500 data points have been sampled.
One possible example of observable signals is illustrated in
Figure 3.
To evaluate the behavior of the proposed methodology
under the presence of an unknown situation, we sampled
the testing case using a CHMM with 5 states: for the first 4
states we used the same parameters as the training set, while
a fifth (previously unknown) state has been added, with the
following parameters:
๐œ‡5 = [
50
50
] , ๐‘ˆ5 = [
10 3
3 10
] . (23)
By knowing beforehand the true states sequence of the
testing case as well as the the switch time to the fifth state
(time 511), in the following we compare the classification
accuracy and the unknown state detection time obtained by
our method and by the modified HMM of Lee et al. [25].
8 Advances in Mechanical Engineering
5 10 15 20 25 30 35 40 45 50
5
10
15
20
25
30
35
40
45
50
Signal X1
SignalX2
State S1
State S2
State S3
State S4
Figure 3: Scatter plot of the first simulated case. For the artificial
experiment, 500 observation samples from a 4-state CHMM with
2 monitored signals (๐‘‹1 and ๐‘‹2, following a 2-joint Gaussian
observation distribution) have been sampled for 5 histories.
4.1.2. Proposed Approach. For each training history, we
applied kernel PCA, using a Gaussian kernel with variance
which is equal to 1, in order to reduce the 2-dimensional
observation to a 1-dimensional signal. An example of dimen-
sionality reduction is illustrated in Figure 4 for the first
training history.
The obtained 1-dimensional signals have been input to the
batch CPD algorithm to perform the initial states numbers
estimation. As shown in Figure 5, the 4 segments are correctly
detected. The selected parameters of the batch CPD algorithm
are as follows: a second degree polynomial fitting function
and a stability threshold ๐‘  = 0.01 (see equation (14)).
Once the initial number of segments is obtained, we
consider a ๐‘โˆ—
= 4-state left-right CHMM model for which
the parameter ๐œ† has to be estimated using the Baum-Welch
algorithm. To evaluate the efficiency of the proposed initial
parameters estimation approach, we performed 11 runs of the
Baum-Welch algorithm using as initial values the estimated
๐œ†0 as described in Section 3.1 and 10 randomly set ๐œ†0โ€™s.
For these experiments, we selected a single Gaussian as
observation density of the CHMM and a maximum number
of 15 iterations.
Figure 6 shows that the learning procedure converges in
4 iterations when the initial parameters are set using the pro-
posed approach, compared to the randomly set parameters.
Moreover, the final likelihood value is comparable with the
best random initialization.
The 4-state CHMM trained model is then used to perform
the online condition assessment as described in Section 3.2.
For these experiments we employed the following parameters
for the online CPD algorithm: a first order polynomial fitting
function and a likelihood threshold ๐›ฟ = 0.4 (see equation
(15)).
As illustrated in Figure 7(d), until time 522, the trained
4-state CHMM has been used, and the Viterbi path correctly
assesses the four states. At time 512 (denoted by the last cross
in Figure 7(c)), the online CPD detects a further state. After
10 more data have been acquired, the CHMM is enriched with
a new state, and the training procedure is executed again to
estimate the paramters of a 5-state CHMM. From the time
point 522 on, the new 5-state CHMM is used, for which the
Viterbi path correctly assesses the fifth state until the end of
the experiment. The accuracy achieved by the Viterbi path is
97.23% of correctly classified states in the sequence.
4.1.3. Modified HMM Approach [25]. The results above have
been compared to the ones using the approach of Lee et al.
[25] that, in contrast to the proposed methodology which
employs data mining and curve fitting techniques, is based
on multivariate statistical tests.
The detection of a new state is performed by evaluating
a statistically significant shift in the mean of the observation
signal using the hotelling multivariate control chart (referred
to as hotelling MCC as follows). Hotelling MCC considers
as measure of similarity the Mahalanobis distance, denoted
as ๐ท2
(O, ๐œ‡๐‘–), between the empirical mean O of the last
๐‘ observations (set to 10 in our experiments) and the ๐œ‡๐‘–
parameter relative to the state ๐‘–, which is the most probable
state given by the Viterbi path:
๐ท2
(O, ๐œ‡๐‘–) = ๐‘ (O โˆ’ ๐œ‡๐‘–)
๐‘‡
Uโˆ’1
๐‘– (O โˆ’ ๐œ‡๐‘–) , (24)
where U๐‘– is the covariance matrix relative to the most
probable state ๐‘–. If this statistic exceeds a certain threshold,
called upper control limit (UCL), for more than ๐‘… consecutive
times, then a significant change is assumed in the process
and a new state is added. ๐‘… can be seen as a measure of the
detection sensitivity of the algorithm, if ๐‘… is increased, the
detection algorithm may become more robust against false
detections caused by the process randomness itself; on the
other hand, the algorithm may respond more slowly to the
unknown state.
Concerning the UCL, since ๐œ‡๐‘– and U๐‘– are known, the ๐ท2
statistic follows the ๐œ’2
distribution with ๐‘ degrees of freedom,
where ๐‘ is the number of multivariates, that is, the number
of features considered as observations. Thus, the UCL can be
obtained as UCL = ๐œ’2
๐›ผ,๐‘, where ๐›ผ is the risk level.
An example of the Lee et al. [25] approach with ๐‘… =
3 is illustrated in Figure 8. A zoom of Figure 8(c), which
represents the Mahalanobis distance, is presented in Figure 9.
Here it is more clear that, after the three points exceeding the
UCL (denoted by out of control) have been detected and the
new state is added to the HMM, the Mahalanobis distance
comes back again under control.
Table 1 summarizes the classification accuracy (percent-
age of states correctly classified by the Viterbi algorithm) and
the detection time of the new state (knowing that the switch to
state 5 happened at time 511 in the ground truth data), for both
our proposed approach and the one of [25]. The classification
accuracy of the Lee et al. [25] method clearly depends on the
choice of the parameter ๐‘…: if it is increased, the detection
Advances in Mechanical Engineering 9
0 500
0
3
Sample
โˆ’3
Signal X1
(a) Signal ๐‘‹1 versus time
0 500
0
3
Sample
โˆ’3
Signal X2
(b) Signal ๐‘‹2 versus time
0 500
0
1
Sample
KPCA
โˆ’1
(c) Dimensionality reduction with KPCA
Figure 4: Dimensionality reduction with KPCA. The information available in the two-dimensional signal shown in plots (a) and (b) is
combined in a single signal shown in plot (c) using KPCA. This transformation keeps information related to changes in the original signals.
0 500
0
0.6
Samples
KPCA
1-dimensional signal
Change-points
โˆ’0.6
(a) Offline CPD on training history #1
0 500
0
0.6
Samples
KPCA
1-dimensional signal
Change-points
โˆ’0.6
(b) Offline CPD on training history #2
Figure 5: Initial states number estimation with batch CPD. The 4 segments associated with the 4 CHMM states are correctly detected.
algorithm may become more robust against false detections
caused by the process randomness itself; on the other hand,
the algorithm may respond more slowly to the unknown state.
4.1.4. Discussion. From the obtained results we can conclude
that the proposed method
(i) is effective in estimating correctly the initial number
of HMM states as well as the initial parameters;
(ii) improves the convergency speed of the learning algo-
rithm;
(iii) outperforms the Lee et al. [25] approach in terms
of classification accuracy and earlier detection of the
new condition.
4.2. Real Data. In this section we describe the application
of our approach to a real case study. With the aim of
investigating the case in which a failure condition is unknown
during the training phase, we consider bearing monitoring
data taken from the NASA benchmark repository [42], which
have been used in several other works [37, 43โ€“45]. As the
available data set does not contain any explicit information
about the bearing conditions, we use the detection time of the
Table 1: Comparison of the classification accuracy and detection
time (real new state switch time = 511).
๐‘… Accuracy Detection time
Proposed method 97.23% 512
Hotelling MCC 2 96.31% 531
Hotelling MCC 3 95.23% 541
Hotelling MCC 4 93.69% 551
failure state as a measure of model performance. In addition,
we evaluate the reliability of the proposed CPD methodology,
for the identification of the initial number of systemโ€™s states,
compared to clustering techniques and the Dunnโ€™s Cluster
Validity Indexes [28]. Moreover, similarly to the simulated
experiment, we compare our approach with the one of Lee
et al. [25].
4.2.1. Data Description. The NASA benchmark repository
[42] contains monitoring measurements collected during
several run-to-failure tests performed on bearings under
constant load conditions on a special designed test rig (see
Figure 10). Four (4) bearings (bearing-1,. . .,bearing-4) were
installed on a rotating shaft and the test has been performed
10 Advances in Mechanical Engineering
0 5 10 15
Iterations
Loglikelihood
Proposed initialization
โˆ’2
โˆ’1.9
โˆ’1.8
โˆ’1.7
โˆ’1.6
โˆ’1.5
โˆ’1.4
โˆ’1.3
ร—104
(a) Learning curves
1 2 3 4 5 6 7 8 9 10 11
0
Loglikelihood
Initializations
Random initializations
Proposed initialization
โˆ’15000
โˆ’10000
โˆ’5000
(b) Final loglikelihood ranking
1 2 3 4 5 6 7 8 9 10 11
0
5
10
15
Numberofiterations
Initializations
Random initializations
Proposed initialization
(c) Number of iterations
Figure 6: Training Evaluation, the proposed ๐œ†0 estimation versus 10 randomly set ๐œ†0. By using the proposed approach for parameters
initialization the Baum-Welch algorithm converges after only 4 iterations (the fastest in this comparison) and the final likelihood value is
comparable with the best random initialization.
by keeping a constant rotation speed of 2000 rpm, while a
radial load of 6000 lbs has been added to the shaft and bearing
by a spring mechanism.
All the bearings are forcedly lubricated and the failure of
a bearing is detected by means of a magnetic plug installed
in the oil feedback pipe, that collects debris from the oil
as evidence of bearing degradation. When the accumulated
debris adhered to the magnetic plug exceeds a certain level,
causing an electrical switch to close, a failure of at least one of
the four bearings is assumed and the test stops [36].
The dataset consists of vibration signals collected by
means of accelerometers. On each bearing housing, two
accelerometers have been installed (one on the vertical ๐‘Œ-
axis and one on the horizontal ๐‘‹-axis) for a total of 8
accelerometers (see Figure 10).
Three run-to-failure tests, test-1 to test-3, have been
performed on the test rig. One-second snapshot data were
collected at a sampling rate of 20 kHz (20480 samples for each
snapshot) every 20 minutes for test-1 and every 10 minutes
for test-2 and test-3. At the end of the first run-to-failure
experiment a crack was found near the shoulder of bearing-3,
while test-2 and test-3 ended up with an outer race defect in
bearing-1 and bearing-3, respectively (see Figure 11).
In our experiments we considered as training set 4
histories of bearings that did not experience failure (channels
๐‘‹ and ๐‘Œ of the first and the second bearing in test-1), and for
testing we used the history of a bearing that ended with failure
(the third in test-1).
The raw data of the training set have been preprocessed
by extracting three time-domain features: mean, root mean
square (RMS), and kurtosis within windows of length ๐ฟ =
20480, which is, for simplicity, the same length of the given
snapshots. Figure 12 shows the raw vibration data and the
resulting RMS signal of length 2156.
Advances in Mechanical Engineering 11
0 100 200 300 400 500 600 700
0
20
40
60
Sample
SignalX1
(a) Testing signal: observation 1
0 100 200 300 400 500 600 700
0
20
40
60
Sample
SignalX2
(b) Testing signal: observation 2
0 100 200 300 400 500 600 700
0
0.5
1
KPCA
Sample
โˆ’1
โˆ’0.5
1-dimensional signal
Change-points
Detection time
(c) Online change point detection
0 100 200 300 400 500 600 700
1
2
3
4
5
States
Sample
(d) Viterbi path
Figure 7: Online condition assessment using the online CPD algorithm on the testing case. At time 512 the online CPD algorithm detects a
further state, which is added to the original 4-state CHMM. From time 522 on, the fifth state, which was unknown during the training phase,
is correctly assessed by the Viterbi algorithm.
0 100 200 300 400 500 600 700
0
20
40
60
Sample
SignalX1
(a) Testing signal: observation 1
0 100 200 300 400 500 600 700
0
20
40
60
Sample
SignalX2
(b) Testing signal: observation 2
0 100 200 300 400 500 600 700
Mahalanobis
distance
Sample
Under control
100
Out of control
UCL = 13.81 (๐›ผ = 0.001)
(c) Hotelling multivariate control chart
0 100 200 300 400 500 600 700
1
2
4
3
5
States
Sample
(d) Viterbi path
Figure 8: Online condition assessment using the hotelling multivariate control chart. After ๐‘… = 3 consecutive times that the Mahalanobis
distance exceeds the upper control limit, an unknown situation is assumed and the fifth state is added to the CHMM.
4.2.2. Proposed Methodology Results. We applied KPCA, with
a Gaussian kernel of variance which is equal to 5, to the fea-
ture signals. The obtained 1-dimentional signal is illustrated
in Figure 13. As it can be seen, the obtained 1-dimensional
signal carries the relevant information contained in the three
extracted features (mean, RMS, and kurtosis).
We then applied the batch CPD algorithm (with a
polynomial degree of the fitting function which is equal to
1 and a stability threshold ๐‘  = 0.03) on the 1-dimensional
signal, resulting in the detection of 2 states, as illustrated in
Figure 14.
Since no explicit ground truth about the number of (bear-
ings) states has been made available in the benchmark data
[42], we propose using a clustering approach to estimate it. K-
means clustering, with varying number of clusters from 2 to
10, has been applied on the (historical) 3-dimensional features
observations. The generalized Dunnโ€™s validity indexes [28]
have been used to determine the optimal number of clusters
corresponding to the number of bearing states. Figure 15
illustrates 5 of the 18 validity indexes (๐‘‰11, . . . , ๐‘‰51) of [28]
versus the number of clusters. High index values indicate
the optimal number of clusters, in this case 2. These results
confirm that the proposed CPD algorithm applied to the
1-dimensional signal obtained after KPCA dimensionality
reduction produces the correct number of bearing states.
For the next steps of bearing condition assessment, a
2-state CHMM is considered, for which, as illustrated in
Figure 16, the proposed CHMM parameters initialization
improves the convergence speed with a final likelihood com-
pared to the value obtained by the best random initialization.
12 Advances in Mechanical Engineering
0 10 20 30 40 50 60 70
Mahalanobisdistance
Under control
Out of control State change
10โˆ’3
10โˆ’2
10โˆ’1
100
101
102
103
104
Number of sample (sample size = 10)
UCL = 13.81 (๐›ผ = 0.001)
Figure 9: The hotelling multivariate control chart for testing data. After the three points out of control have been detected and the new state
is added to the CHMM, the Mahalanobis distance comes back again under control.
Accelerometers
Radial load
Thermocouples
Motor
Bearing-1 Bearing-2 Bearing-3 Bearing-4
Figure 10: Testing rig and sensors positions [36]. The 4 bearings installed on a rotating shaft under a constant load and kept at a constant
rotation speed for 3 run-to-failure tests. The measurements available are collected by means of 2 accelerometers installed in each bearing case
(one on the vertical axis and one on the horizontal axis) for a total of 8 accelerometers.
(a) (b) (c)
Figure 11: Damaged bearings [36]. At the end of the first test the bearing-3 experienced a crack near the shoulder (a), while at the end of the
second and the third test an outer race defect occurred in bearing-1 (b) and bearing-3 (c), respectively.
Advances in Mechanical Engineering 13
0 0.5 1 1.5 2
0
0.2
0.4
0.6
Samples
(a)
Acceleration
Raw signal-snapshot 1
0 500 1000 1500 2000 2500
0.12
0.13
0.14
0.15
0.16
0.17
0.18
0.19
0.2
0.21
0.22
Time
(b)
RMS
Feature
extraction
โˆ’0.8
โˆ’0.6
โˆ’0.4
โˆ’0.2
ร—104
Figure 12: Raw vibration data (a) versus RMS features (b) for bearing-2 in test-1.
0 2156
0
2
4
Mean
Time (unit = 20min)
โˆ’2
(a) Mean
0 2156
0
5
RMS
Time (unit = 20min)
โˆ’5
(b) RMS
0 2156
0
5
Kurtosis
Time (unit = 20min)
โˆ’5
(c) Kurtosis
0 2156
0
0.5
KPCA
โˆ’1
โˆ’0.5
Time (unit = 20min)
(d) Dimensionality reduction with KPCA
Figure 13: Dimensionality reduction with KPCA on the bearing data.
0 2156
0
KPCA
1-dimensional signal
Change-points
โˆ’0.8
โˆ’0.6
โˆ’0.4
โˆ’0.2
(a) Offline CPD on training history #1
0 2156
โˆ’0.8
โˆ’0.6
โˆ’0.4
โˆ’0.2
0
0.2
KPCA
1-dimensional signal
Change-points
(b) Offline CPD on training history #2
Figure 14: Initial states number estimation with batch CPD for training histories #1 and #2. Two states are detected, also for the third and the
fourth training cases.
For the CHMM observation density, we used a mixture
of 3 Gaussians, allowing trade-off between precision and
computation time.
For simulating the online condition assessment using the
proposed phase 2 method of Section 3.2, we considered faulty
bearing vibration signals to which we applied the feature
estimation (mean, RMS, and kurtosis) as described earlier.
The resulted features have been successively input to (i) the
learned CHMM to calculate the Viterbi path and (ii) to the
KPCA dimensionality reduction algorithm for the further
14 Advances in Mechanical Engineering
1 2 3 4 5 6 7 8 9 10 11
0
0.2
0.4
0.6
0.8
1
1.2
1.4
1.6
Number of clusters
Indexesvalues
(a) Dunnโ€™s indexes for training history 1
1 2 3 4 5 6 7 8 9 10 11
0
0.5
1
1.5
Number of clusters
Indexesvalues
(b) Dunnโ€™s indexes for training history 2
1 2 3 4 5 6 7 8 9 10 11
0
0.2
0.4
0.6
0.8
1
1.2
1.4
Number of clusters
Indexesvalues
Max value
V11
V21
V31
V41
V51
(c) Dunnโ€™s indexes for training history 3
1 2 3 4 5 6 7 8 9 10 11
0
0.2
0.4
0.6
0.8
1
1.2
1.4
Number of clusters
Indexesvalues
Max value
V11
V21
V31
V41
V51
(d) Dunnโ€™s indexes for training history 4
Figure 15: Dunnโ€™s validity indexes applied on the 4 training histories.
application of the online CPD. For the latter, a second degree
polynomial fitting function and a threshold ๐›ฟ = 0.6 (equation
(15)) have been considered.
The described process above is illustrated in Figure 17. At
time 1848, the online CPD correctly detects the beginning of
a new degradation modality (Figure 17(d)), which will bring
the bearing to fail. A new state is then added to the HMM
and the classification of the third state is performed correctly
until the end.
From a visual analysis of the observations (see Figure 17),
it is clear that the starting of the severe degradation that
leads to the bearing failure is reflected by the kurtosis, which
at time around 1830 starts to have a discontinuous pattern.
This information is correctly captured by the kernel PC and
further detected by the online CPD.
4.2.3. Discussion on Features. In this section we discuss
different features with the aim of demonstrating that our
approach allows determining bearings condition right up
until failure and detects the unknown degradation stage at a
very early stage, also using different sets of features.
As feature selection is out of the scope of this work,
we evaluated the basic state of the art features: mean, RMS,
kurtosis, skewness, and wavelet features.
Kurtosis Only. By using only one feature, both the offline and
the online CPD algorithms are applied directly on the feature
signal itself.
With kurtosis alone, our method fails during the initial
state estimation. As can be noticed from Figure 18, the offline
CPD algorithm fails to detect the correct point at which a
switch between state 1 and state 2 happens for the third and
the fourth training histories.
Concerning the online detection of the new state, kurtosis
works fine in discovering the bearing degradation at an early
stage. In fact, our method detects the unknown condition
at sample 1829, 19 samples (6 hours) before the detection
using mean, RMS, and kurtosis (see Section 4.2.2). This fact
confirms that kurtosis describes well the beginning of the
bearing degradation.
However, if we base our method only on the kurtosis, we
loose the information about the switch that happens between
Advances in Mechanical Engineering 15
0 5 10 15
Iterations
Loglikelihood
Proposed initialization
ร—104
โˆ’5
โˆ’4.5
โˆ’4
โˆ’3.5
โˆ’3
โˆ’2.5
โˆ’2
โˆ’1.5
โˆ’1
โˆ’0.5
(a) Learning curves
1 2 3 4 5 6 7 8 9 10 11
0
Loglikelihood
Initializations
Random initializations
Proposed initialization
โˆ’9000
โˆ’8000
โˆ’7000
โˆ’6000
โˆ’5000
โˆ’4000
โˆ’3000
โˆ’2000
โˆ’1000
(b) Final loglikelihood ranking
1 2 3 4 5 6 7 8 9 10 11
0
5
10
15
Numberofiterations
Initializations
Random initializations
Proposed initialization
(c) Iterations number
Figure 16: Training Evaluation, the proposed ๐œ†0 estimation versus 10 randomly set ๐œ†0. The effectiveness of the proposed approach for
parameters initialization is confirmed in the real case, since the Baum-Welch algorithm converges after only 3 iterations (the fastest in this
comparison) and the final likelihood value is comparable with the best random initialization.
state 1 and state 2, which is contained in the mean and in
the RMS. As can be noticed from Figure 19, both the CHMM
and the online CPD agree on the change of condition right
at the beginning (sample 8), which is not the case. As a
consequence, by using only kurtosis, we fail to determine the
correct bearing condition from the beginning to the end.
Mean and Root Mean Square. By considering only dimen-
sional features like mean and RMS, we obtain the correct
results concerning the initial state estimation with offline
CPD, as shown in Figure 20.
However, dimensional features do not contain informa-
tion about the beginning of the bearing degradation. Hence,
the online CPD for the new states detection fails to recognize
when the bearing starts to degrade and gives an alert for a new
failure condition at sample 2135 (Figure 21). According to the
data, it is too late because the bearing is already broken. This
is further confirmed from the data description provided by
NASA (see Section 4.2.1).
We can then conclude that, although combined mean
and RMS can determine bearing condition right up until
failure, it leads to a late alert of the bearing failure, missing the
desirable characteristic of condition monitoring of advising
the operator before a catastrophic failure occurs.
Mean, RMS, and Skewness. By substituting skewness to
kurtosis and keeping mean and RMS, Figure 22 shows that
the offline CPD applied on the KPCA detects the two states
of the training bearings, in the same way as it did when using
kurtosis (see Section 4.2.2).
Concerning the online identification of the new third
state, the online CPD takes advantages from skewness in
16 Advances in Mechanical Engineering
0 2156
0
20
40
Mean
โˆ’20
(a) Testing signal: observation 1
0 2156
0
10
20
RMS
โˆ’10
(b) Testing signal: observation 2
0 2156
0
5
10
15
Kurtosis
โˆ’5
(c) Testing signal: observation 3
0 1848 2156
0
0.5
1
KPCA
1-dimensional signal
Change-points
Detection time
(d) Online change point detection
0 1848 2156
1
2
3
States
(e) Viterbi path
Figure 17: Online condition assessment using the proposed approach. The beginning of a new degradation modality is detected by the online
CPD algorithm at an early stage (time 1848), which allowed augmenting the CHMM with a new state. The augmented CHMM correctly
detects the degraded state of the bearing (Viterbi path).
0
0
2
4
Change-points
2156
Kurtosis
1-dimensional signal
โˆ’2
(a) Offline CPD on training history #3
0 2156
0
2
4
Kurtosis
1-dimensional signal
Change-points
โˆ’2
(b) Offline CPD on training history #4
Figure 18: Initial states number estimation with batch CPD for training histories #3 and #4 based on kurtosis only. The method fails to detect
the correct moment in which the state change occurs.
terms of earlier detection of the bearing degradation and
slightly outperforms the results obtained in Section 4.2.2. In
fact, it detects the third state at sample 1824, 8 hours before.
However, as shown in Figure 23, the Viterbi path of
the CHMM fails in detecting the correct state at the end.
Hence, we can conclude that, despite the earlier detection of
the bearing degradation, the configuration mean, RMS, and
skewness is not appropriate to determine bearing condition
right up until failure.
Wavelet Packet Transform. Finally, we tried our method on
time-frequency domain features extracted using the wavelet
packet transform (WPT). As described in the work of Pandya
et al. [46], we combined the energy and the kurtosis of wavelet
coefficients at the second level of the wavelet tree.
Figure 24 shows that offline CPD works fine in detecting
the two states of the training bearings, while Figure 25 shows
that our methodology detects the new bearing degradation
state at sample 1831 when applied on time-frequency features,
performing slightly better (around 5 hours earlier detection)
than when mean, RMS, and kurtosis are used.
From the feature study, one can conclude that the pro-
posed mean, RMS, and kurtosis features give good results
for condition monitoring of bearings. However, an extended
Advances in Mechanical Engineering 17
0 2156
0
5
10
15
Kurtosis
โˆ’5
(a) Testing signal: observation 1
0 1829 2156
0
5
10
15
Kurtosis
1-dimensional signal
Change-points
Detection time
โˆ’5
(b) Online change point detection
0 1829 2156
1
2
3
States
(c) Viterbi path
Figure 19: Online condition assessment using the proposed approach on kurtosis only. Although the beginning of the bearing degradation
is correctly detected at an early stage, kurtosis does not contain the information on the switch from state 1 to state 2 and, consequently, the
method fails in determining the correct bearing condition.
0 2156
0
0.2
KPCA
1-dimensional signal
Change-points
โˆ’0.8
โˆ’0.6
โˆ’0.4
โˆ’0.2
(a) Offline CPD on training history #1
0 2156
0
0.2
KPCA
1-dimensional signal
Change-points
โˆ’0.8
โˆ’0.6
โˆ’0.4
โˆ’0.2
(b) Offline CPD on training history #2
Figure 20: Initial states number estimation with batch CPD for training histories #1 and #2 based on mean and RMS. Two states are correctly
detected.
0 2156
0
20
40
Mean
โˆ’20
(a) Testing signal: observation 1
0 2156
0
10
20
RMS
โˆ’10
(b) Testing signal: observation 2
0 2135
0
0.5
1
KPCA
1-dimensional signal
Change-points
Detection time
โˆ’0.5
(c) Online change point detection
0 2135
1
2
3
States
(d) Viterbi path
Figure 21: Online condition assessment using the proposed approach on mean and RMS. The online CPD detects a new failure condition
too late, when the bearing is already broken.
18 Advances in Mechanical Engineering
0 2156
0
0.2
0.4
0.6
0.8
KPCA
1-dimensional signal
Change-points
โˆ’0.2
(a) Offline CPD on training history #1
0 2156
0
0.2
0.4
0.6
0.8
KPCA
1-dimensional signal
Change-points
โˆ’0.2
(b) Offline CPD on training history #2
Figure 22: Initial states number estimation with batch CPD for training histories #1 and #2 based on mean, RMS and skewness. Two states
are correctly detected.
0 2156
0
20
40
Mean
โˆ’20
(a) Testing signal: observation 1
0 2156
0
10
20
RMS
โˆ’10
(b) Testing signal: observation 2
0 2156
0
10
Skewness
โˆ’20
โˆ’10
(c) Testing signal: observation 3
0 1824 2156
0
0.5
1
KPCA
1-dimensional signal
Change-points
Detection time
โˆ’0.5
(d) Online change point detection
0 1824 2156
1
2
3
States
(e) Viterbi path
Figure 23: Online condition assessment using the proposed approach on mean, RMS, and skewness. Although The online CPD detects a
new failure condition at an early stage, the CHMM fails to assess the correct state sequence at the end.
0 2156
0
0.2
0.4
0.6
KPCA
1-dimensional signal
Change-points
โˆ’0.4
โˆ’0.2
(a) Offline CPD on training history #1
0 2156
0
0.2
0.4
0.6
KPCA
1-dimensional signal
Change-points
โˆ’0.4
โˆ’0.2
(b) Offline CPD on training history #2
Figure 24: Initial states number estimation with batch CPD for histories #1 and #2 using the energy and the kurtosis of the second level
wavelet packet transform coefficients. Two states are correctly detected.
Advances in Mechanical Engineering 19
0 2156
0
10
20
30
WPT-coefficient 1 energy
โˆ’10
(a) Testing signal: observation 1
0 2156
0
10
20
30
WPT-coefficient 2 energy
โˆ’10
(b) Testing signal: observation 2
0 2156
0
5
10
15
WPT-coefficient 1 kurtosis
โˆ’5
(c) Testing signal: observation 3
0 2156
0
10
20
30
WPT-coefficient 2 kurtosis
โˆ’10
(d) Testing signal: observation 4
0 1831 2156
0
0.2
0.4
0.6
KPCA
1-dimensional signal
Change-points
Detection time
โˆ’0.4
โˆ’0.2
(e) Online change point detection
0 1831 2156
1
2
3
States
(f) Viterbi path
Figure 25: Online condition assessment using the proposed approach on the energy and the kurtosis of WPT treeโ€™s second level coefficients.
The results obtained with the proposed time domain features are confirmed by using time-frequency domain features.
0 2156
0
20
40
Mean
โˆ’20
(a) Testing signal: observation 1
0 2156
0
10
20
RMS
โˆ’10
(b) Testing signal: observation 2
0 2156
0
5
10
15
Kurtosis
โˆ’5
(c) Testing signal: observation 3
0 2061
Mahalanobis distance
Under control
Out of control
10โˆ’10
10โˆ’5
100
105
UCL = 16.27 (๐›ผ = 0.001)
(d) Hotelling multivariate control chart
0 2061
1
2
3
States
(e) Viterbi path
Figure 26: Online condition assessment using the hotelling multivariate control chart of [25]. The detection of the degradation modality
occurs at time 2061 (71 hours later than the proposed approach).
20 Advances in Mechanical Engineering
0 206
Mahalanobis distance
Under control
Out of control State change
Number of sample (sample size = 10)
10โˆ’1
100
101
102
103
104
UCL = 16.26 (๐›ผ = 0.001)
Figure 27: The hotelling multivariate control chart for testing data.
The observations modeling through a joint Gaussian distribution is
highly sensitive to small differences between the mean of the testing
and the training observation signals. Due to this small difference,
several out of control points are observed at the beginning, that
would have wrongly brought the method to detect an unknown
situation.
analysis of feature extraction and selection should be made
depending on the problem in hand and on the used sensors.
4.2.4. Modified HMM Results. Figure 26 illustrates the
approach of Lee et al. [25], in which the detection sensitivity
parameter ๐‘… has been set as ๐‘… = 6. As it can be seen, the
detection of the unknown condition happens at time 2061.
Figure 27, which represents the Mahalanobis distance
versus number of sample, illustrates the limits of the approach
and its observations modeling, being a joint Gaussian distri-
bution and not a mixture of Gaussians. Hotelling MCC is in
fact highly sensitive to small differences between the mean
of the testing and the training observation signals. Hence, it
is applicable only when training and testing observations are
very similar. As demonstrated by the first part of Figure 27,
where there are several points out of control at the beginning
because the mean of the testing signals in this portion is
slightly different from the training set. Although such method
would have wrongly detected an unknown situation (a new
state would), it has been forced to avoid this behavior by
adjusting the parameter ๐‘….
4.2.5. Discussion. The above results highlight that the pro-
posed methodology
(i) overcomes the limitations imposed by the hotelling
MCC since a complex density can be used to model
the observations;
(ii) is parametric-free; the only parameters are the thresh-
olds of the CPD algorithms, which could be set
according to the sensor measurements during the
training phase;
(iii) is effective in detecting failure modalities at an early
stage.
5. Conclusion and Future Work
In this paper, a new approach to the online adaptive condition
monitoring has been introduced. The methodology is based
on continuous observation hidden Markov models that,
combined with the change point detection algorithm, are
enhanced to deal with variable state space and are used for
condition assessment and unknown state detection. More-
over, a new approach for the HMM initial states number
estimation has been provided, together with a method for the
parameters initialization. The obtained results illustrate that
the proposed methodology (1) correctly estimates the initial
number of states for the HMM; (2) improves the execution
speed of the training procedure in terms of iterations number
reduction with a final likelihood comparable with the best
random initialization; (3) classify the current state of the
system more effectively than the modified HMM coupled
with the hotelling multivariate control chart; (4) detects an
anomalous condition or an unknown state at an early stage;
and (5) changes the structure and the parameters of the HMM
to adapt the model to the new situations.
Our approach has been implemented in Matlab (version
8.3.0), under Windows 7 Enterprise 64-bit Operating System
on a desktop computer with an Intel Core i7 3.07 GHz CPU
and 16 GB RAM. In terms of computational cost, phase 1,
that is, the training phase, takes 296 seconds to learn the
CHMM parameters from historical data. It has to be noted
that this phase is applied only once for a given machine.
Phase 2, that is, the online adaptive condition assessment,
takes, on average, 0.2 seconds each time a new snapshot
of data is collected (20480 samples). As a consequence, the
algorithm can be used in real time condition assessment.
In terms of hardware cost, a 164 kBytes buffer is used to
store the incoming data, while for the CHMM the memory
consumption is proportional to the model structure. In
general terms, (๐‘ โˆ’ 1) + (๐‘ โˆ’ 1) โ‹… ๐‘ + ๐ท โ‹… ๐‘ โ‹… ๐‘€ + ๐ท โ‹… ๐ท โ‹… ๐‘ โ‹…
๐‘€ + (๐‘€ โˆ’ 1) โ‹… ๐‘ double-precision memory is required, with
๐‘ the number of states, ๐‘€ the number of Gaussian mixtures
in the observation model, and ๐ท the number of considered
features.
The present work highlights through experiments per-
formed on simulated and real data that the presented
methodology can be used in practice as a decision support
tool to corrective maintenance strategies. However, it has
some problems that must be considered. The dimensionality
reduction through kernel principal cmponent analysis causes
an inevitable loss of information in the signal used for the
change point detection algorithm. To avoid this problem,
future researches will consider the application of a modified
multidimensional CPD algorithm performed directly in the
features hyperplane. Moreover, by using left-right HMMs, it
has been assumed that the degradation pattern of the system
is characterized by a monotonically increasing evolution
which gradually crosses all the wearing phases until the
complete failure. Despite being reasonable for bearings, this
assumption does not hold for several real systems. In order
to render the methodology more generally applicable, it is
necessary to consider in future work a method which is able
to distinguish also intermediate states and to recognize if the
Advances in Mechanical Engineering 21
new detected state has been already encountered. Although
the present-time researches are oriented to clustering tech-
niques coupled with statistical similarity measures, other
methods will be taken into consideration, with the final goal
to use as degradation model the more general full ergodic
HMM.
Conflict of Interests
The authors declare that there is no conflict of interests
regarding the publication of this paper.
References
[1] T. Honkanen, Modelling industrial maintenance systems and the
effects of automatic condition monitoring [Ph.D. thesis], Helsinki
University of Technology, Information and Computer Systems
in Automation, Helsinki, Finland, 2004.
[2] R. Dekker, โ€œApplications of maintenance optimization models:
a review and analysis,โ€ Reliability Engineering & System Safety,
vol. 51, no. 3, pp. 229โ€“240, 1996.
[3] L. Solomon, โ€œEssential elements of maintenance improvement
programs,โ€ in Production Control in the Process Industry: Pro-
duction Control in the Process Industry No 8: Proceedings of
the IFAC Workshop, Osaka, 29-31 and Kariya, Japan, 1-2 Nov.
1989 (Hardback), E. Oshima and C. van Rijn, Eds., pp. 195โ€“198,
Pergamon Press, Oxford, UK, 1989.
[4] H. Wang, โ€œA survey of maintenance policies of deteriorating
systems,โ€ European Journal of Operational Research, vol. 139, no.
3, pp. 469โ€“489, 2002.
[5] Y. Peng, M. Dong, and M. J. Zuo, โ€œCurrent status of machine
prognostics in condition-based maintenance: a review,โ€ The
International Journal of Advanced Manufacturing Technology,
vol. 50, no. 14, pp. 297โ€“313, 2010.
[6] A. Sfetsos, โ€œShort-term load forecasting with a hybrid clustering
algorithm,โ€ IEE Proceedings: Communications, vol. 150, no. 3, pp.
257โ€“262, 2003.
[7] R. Vilalta and M. Sheng, โ€œPredicting rare events in temporal
domains,โ€ in Proceedings of the 2nd IEEE International Confer-
ence on Data Mining (ICDM โ€™02), pp. 474โ€“481, December 2002.
[8] C. Domeniconi, C.-S. Perng, R. Vilalta, and S. Ma, โ€œA classi-
fication approach for prediction of target events in temporal
sequences,โ€ in Proceedings of the 6th European Conference on
Principles of Data Mining and Knowledge Discovery (PKDD โ€™02),
pp. 125โ€“137, Springer, London, UK, 2002.
[9] K. Medjaher, J.-Y. Moya, and N. Zerhouni, โ€œFailure
prognostic by using dynamic Bayesian Networks,โ€ in
Dependable Control of Discrete Systems, M. P. Fanti
and M. Dotoli, Eds., vol. 1, IFAC, Bari, Italy, 2009,
https://hal.archives-ouvertes.fr/hal-00402938/en/.
[10] P. Baruah and R. B. Chinnam, โ€œHMMs for diagnostics and
prognostics in machining processes,โ€ International Journal of
Production Research, vol. 43, no. 6, pp. 1275โ€“1293, 2005.
[11] X. Zhang, R. Xu, C. Kwan, S. Y. Liang, Q. Xie, and L.
Haynes, โ€œAn integrated approach to bearing fault diagnostics
and prognostics,โ€ in Proceedings of the IEEE American Control
Conference (ACC โ€™05), vol. 4, pp. 2750โ€“2755, June 2005.
[12] Z. Li, Z. Wu, Y. He, and C. Fulei, โ€œHidden Markov model-
based fault diagnostics method in speed-up and speed-down
process for rotating machinery,โ€ Mechanical Systems and Signal
Processing, vol. 19, no. 2, pp. 329โ€“339, 2005.
[13] H. M. Ertunc, K. A. Loparo, and H. Ocak, โ€œTool wear condition
monitoring in drilling operations using hidden Markov models
(HMMs),โ€ International Journal of Machine Tools and Manufac-
ture, vol. 41, no. 9, pp. 1363โ€“1384, 2001.
[14] L. R. Rabiner, โ€œA tutorial on hidden Markov models and selected
applications in speech recognition,โ€ Proceedings of the IEEE, vol.
77, no. 2, pp. 257โ€“286, 1989.
[15] N. Ye, Y. Zhang, and C. M. Borror, โ€œRobustness of the Markov-
chain model for cyber-attack detection,โ€ IEEE Transactions on
Reliability, vol. 53, no. 1, pp. 116โ€“123, 2004.
[16] C. Bunks, D. McCarthy, and T. Al-Ani, โ€œCondition-based main-
tenance of machines using hidden Markov models,โ€ Mechanical
Systems and Signal Processing, vol. 14, no. 4, pp. 597โ€“612, 2000.
[17] M. Bjerkeseth, Using hidden markov models for fault diagnostics
and prognostics in condition based maintenance systems [Ph.D.
dissertation], University of Agder, 2010.
[18] T. Liu, J. Chen, and G. Dong, โ€œApplication of continuous hide
markov model to bearing performance degradation assess-
ment,โ€ in Proceedings of the 24th International Congress on
Condition Monitoring and Diagnostics Engineering Management
(COMADEM โ€™11), pp. 166โ€“172, 2011.
[19] H. Ocak and K. A. Loparo, โ€œA new bearing fault detection
and diagnosis scheme based on hidden Markov modeling
of vibration signals,โ€ in Proceedings of the IEEE Interntional
Conference on Acoustics, Speech, and Signal Processing, pp. 3141โ€“
3144, IEEE Computer Society, Washington, DC, USA, May
2001.
[20] G. Almeida and S. Park, โ€œProcess monitoring in chemical
industriesโ€”a hidden Markov model approach,โ€ in Proceedings
of the 18th European Symposium on Computer Aided Process
Engineering (ESCAPE โ€™18), 2008.
[21] I. Strachan and D. Clifton, A Hidden Markov Model for Condi-
tion Monitoring of a Manufacturing Drilling Process, vol. 1, (IET)
Condition Monitoring, Dublin, Ireland, 2009.
[22] G. Almeida and S. Park, โ€œFault detection in continuous indus-
trial chemical processes: a new approach using the hidden
Markov modeling. Case study: a boiler from a Brazilian cellu-
lose pulp mill,โ€ in Intelligent Data Engineering and Automated
Learningโ€”IDEAL 2012, vol. 7435, pp. 743โ€“752, Springer, Berlin,
Germany, 2012.
[23] T. Liu, J. Chen, X. N. Zhou, and W. B. Xiao, โ€œBearing per-
formance degradation assessment using linear discriminant
analysis and coupled HMM,โ€ Journal of Physics: Conference
Series, vol. 364, no. 1, Article ID 012028, 2012.
[24] G. D. Forney Jr., โ€œThe Viterbi algorithm,โ€ Proceedings of the
IEEE, vol. 61, pp. 268โ€“278, 1973.
[25] S. Lee, L. Li, and J. Ni, โ€œOnline degradation assessment and
adaptive fault detection using modified hidden markov model,โ€
Journal of Manufacturing Science and Engineering, vol. 132, no.
2, Article ID 021010, 11 pages, 2010.
[26] F. Cartella, T. Liu, S. Meganck, J. Lemeire, and H. Sahli, โ€œOnline
adaptive learning of left-right continuous HMM for bearings
condition assessment,โ€ Journal of Physics: Conference Series, vol.
364, no. 1, Article ID 012031, 2012.
[27] V. Guralnik and J. Srivastava, โ€œEvent detection from time
series data,โ€ Proceedings of the 5th ACM SIGKDD International
Conference on Knowledge Discovery and Data Mining (KDD
โ€™99), pp. 33โ€“42, 1999.
[28] J. C. Bezdek and N. R. Pal, โ€œSome new indexes of cluster
validity,โ€ IEEE Transactions on Systems, Man, and Cybernetics
Part B: Cybernetics, vol. 28, no. 3, pp. 301โ€“315, 1998.
22 Advances in Mechanical Engineering
[29] B. Schยจolkopf, A. Smola, and K.-R. Mยจuller, โ€œNonlinear compo-
nent analysis as a kernel eigenvalue problem,โ€ Neural Computa-
tion, vol. 10, no. 5, pp. 1299โ€“1319, 1998.
[30] S. Mika, B. Scholkopf, A. Smola, K. R. Muller, M. Scholz, and
G. Ratsch, โ€œKernel pca and de noising in feature spaces,โ€ in
Advances in Neural Information Processing Systems, vol. 11, pp.
536โ€“542, MIT Press, 1999.
[31] S. Romdhani, S. Gong, and A. Psarrou, โ€œA multi-view nonlinear
active shape model using kernel pca,โ€ in Proceedings of the
British Machine Vision Conference, pp. 483โ€“492, 1999.
[32] P. Oโ€™Donnell, โ€œReport of large motor reliability survey of indus-
trial and commercial installations, part II,โ€ IEEE Transactions
on Industry Applications, vol. 21, no. 4, pp. 865โ€“872, 1985.
[33] J. Q. Li and A. R. Barron, โ€œMixture density estimation,โ€ in
Advances in Neural Information Processing Systems 12, pp. 279โ€“
285, MIT Press, Cambridge, Mass, USA, 1999.
[34] A. P. Dempster, N. M. Laird, and D. B. Rubin, โ€œMaximum
likelihood from incomplete data via the EM algorithm,โ€ Journal
of the Royal Statistical Society B: Methodological, vol. 39, no. 1,
pp. 1โ€“38, 1977.
[35] R. B. Randall, Frequency Analysis, Bruel and Kjaer, 1987.
[36] H. Qiu, J. Lee, J. Lin, and G. Yu, โ€œWavelet filter-based weak
signature detection method and its application on rolling
element bearing prognostics,โ€ Journal of Sound and Vibration,
vol. 289, no. 4-5, pp. 1066โ€“1090, 2006.
[37] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot,
โ€œEstimation of the remaining useful life by using wavelet
packet decomposition and hmms,โ€ in Proceedings of the IEEE
Aerospace Conference AIAA, pp. 1โ€“10, IEEE Computer Society,
Los Alamitos, Calif, USA, 2011.
[38] P. Burge and J. Shaw-Taylor, โ€œDetecting cellular fraud using
adaptive prototypes,โ€ in Proceedings of the AI Approaches to
Fraud Detection and Risk Management, pp. 9โ€“13, 1997.
[39] K. Yamanishi, J. Takeuchi, G. Williams, and P. Milne, โ€œOn-
line unsupervised outlier detection using finite mixtures with
discounting learning algorithms,โ€ in Proceedings of the 6th ACM
SIGKDD International Conference on Knowledge Discovery and
Data Mining (KDD โ€™00), pp. 250โ€“254, ACM Press, 2000.
[40] T. Fawcett and F. Provost, โ€œActivity monitoring: noticing inter-
esting changes in behavior,โ€ in Proceedings of the 5th ACM
SIGKDD International Conference on Knowledge Discovery and
Data Mining (KDD โ€™99), pp. 53โ€“62, 1999.
[41] V. Cherkassky and F. Mulier, Learning from Data, Wiley-
Interscience, New York, NY, USA, 1998.
[42] NSF-I/UCRC, โ€œCenter for intelligent maintenance systems.
prognostic data repository: Bearing data set,โ€ http://ti.arc.nasa
.gov/tech/dash/pcoe/prognostic-data-repository/.
[43] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot,
โ€œA mix ture of gaussians hidden markov model for failure
diagnostic and prognostic,โ€ in Proceedings of the 6th Annual
IEEE International Conference on Automation Science and
Engineering (CASE โ€™10), pp. 338โ€“343, IEEE, August 2010.
[44] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot,
โ€œHidden Markov Models for failure diagnostic and prognostic,โ€
in Proceedings of the Prognostics and System Health Management
Conference (PHM-Shenzhen โ€™11), pp. 1โ€“8, Shenzhen, China, May
2011.
[45] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot,
โ€œA data-driven failure prognostics method based on mixture
of gaussians hidden markov models,โ€ IEEE Transactions on
Reliability, vol. 61, no. 2, pp. 491โ€“503, 2012.
[46] D. Pandya, S. Upadhyay, and S. P. Harsha, โ€œAnn based fault
diagnosis of rolling element bearing using time-frequency
domain feature,โ€ International Journal of Engineering Science
and Technology, vol. 4, pp. 2878โ€“2886, 2012.
Submit your manuscripts at
http://www.hindawi.com
VLSI Design
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
International Journal of
Rotating
Machinery
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com
Journal of
Engineering
Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Shock and Vibration
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Mechanical
Engineering
Advances in
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Civil Engineering
Advances in
Acoustics and Vibration
Advances in
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Electrical and Computer
Engineering
Journal of
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Distributed
Sensor Networks
International Journal of
The Scientific
World JournalHindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Sensors
Journal of
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Modelling &
Simulation
in Engineering
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Active and Passive
Electronic Components
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Chemical Engineering
International Journal of
Control Science
and Engineering
Journal of
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Antennas and
Propagation
International Journal of
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014
Navigation and
Observation
International Journal of
Advances in
OptoElectronics
Hindawi Publishing Corporation
http://www.hindawi.com
Volume 2014
Robotics
Journal of
Hindawi Publishing Corporation
http://www.hindawi.com Volume 2014

More Related Content

What's hot

Bus information live monitoring system
Bus information live monitoring systemBus information live monitoring system
Bus information live monitoring systemVenkat Projects
ย 
On designing automatic reaction strategy for critical infrastructure scada sy...
On designing automatic reaction strategy for critical infrastructure scada sy...On designing automatic reaction strategy for critical infrastructure scada sy...
On designing automatic reaction strategy for critical infrastructure scada sy...Luxembourg Institute of Science and Technology
ย 
A 7 e module decomposition structure
A 7 e module decomposition structureA 7 e module decomposition structure
A 7 e module decomposition structureahsan riaz
ย 
Slide Projek Sarjana Muda
Slide Projek Sarjana MudaSlide Projek Sarjana Muda
Slide Projek Sarjana MudaNurul Naemah Ariepin
ย 
Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...
Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...
Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...IJMERJOURNAL
ย 
A web application detecting dos attack using mca and tam
A web application detecting dos attack using mca and tamA web application detecting dos attack using mca and tam
A web application detecting dos attack using mca and tameSAT Journals
ย 
Bo24437446
Bo24437446Bo24437446
Bo24437446IJERA Editor
ย 
A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...IOSR Journals
ย 
System identification of a steam distillation pilot scale using arx and narx ...
System identification of a steam distillation pilot scale using arx and narx ...System identification of a steam distillation pilot scale using arx and narx ...
System identification of a steam distillation pilot scale using arx and narx ...eSAT Journals
ย 
System identification of a steam distillation pilotscale
System identification of a steam distillation pilotscaleSystem identification of a steam distillation pilotscale
System identification of a steam distillation pilotscaleeSAT Publishing House
ย 
Adequacy Analysis and Security Reliability Evaluation of Bulk Power System
Adequacy Analysis and Security Reliability Evaluation of Bulk Power SystemAdequacy Analysis and Security Reliability Evaluation of Bulk Power System
Adequacy Analysis and Security Reliability Evaluation of Bulk Power SystemIOSR Journals
ย 
2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE
2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE
2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILEZac Darcy
ย 
Smart production system
Smart production systemSmart production system
Smart production systemSiva Prasad
ย 
A model for run time software architecture adaptation
A model for run time software architecture adaptationA model for run time software architecture adaptation
A model for run time software architecture adaptationijseajournal
ย 
Presentation reliability and diagnosis in industrial systems
Presentation   reliability and diagnosis in industrial systemsPresentation   reliability and diagnosis in industrial systems
Presentation reliability and diagnosis in industrial systemsGlรกucio Bastos
ย 
Transformation of simulink models to iec 61499 function blocks for verificati...
Transformation of simulink models to iec 61499 function blocks for verificati...Transformation of simulink models to iec 61499 function blocks for verificati...
Transformation of simulink models to iec 61499 function blocks for verificati...Tiago Oliveira
ย 
Driving semiconductor-manufacturing-business-performance-through-analytics (1)
Driving semiconductor-manufacturing-business-performance-through-analytics (1)Driving semiconductor-manufacturing-business-performance-through-analytics (1)
Driving semiconductor-manufacturing-business-performance-through-analytics (1)Suneetha Mathukumalli
ย 

What's hot (17)

Bus information live monitoring system
Bus information live monitoring systemBus information live monitoring system
Bus information live monitoring system
ย 
On designing automatic reaction strategy for critical infrastructure scada sy...
On designing automatic reaction strategy for critical infrastructure scada sy...On designing automatic reaction strategy for critical infrastructure scada sy...
On designing automatic reaction strategy for critical infrastructure scada sy...
ย 
A 7 e module decomposition structure
A 7 e module decomposition structureA 7 e module decomposition structure
A 7 e module decomposition structure
ย 
Slide Projek Sarjana Muda
Slide Projek Sarjana MudaSlide Projek Sarjana Muda
Slide Projek Sarjana Muda
ย 
Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...
Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...
Statistical Model to Validate A Metaprocess-Oriented Methodology based on RAS...
ย 
A web application detecting dos attack using mca and tam
A web application detecting dos attack using mca and tamA web application detecting dos attack using mca and tam
A web application detecting dos attack using mca and tam
ย 
Bo24437446
Bo24437446Bo24437446
Bo24437446
ย 
A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...A Hierarchical Feature Set optimization for effective code change based Defec...
A Hierarchical Feature Set optimization for effective code change based Defec...
ย 
System identification of a steam distillation pilot scale using arx and narx ...
System identification of a steam distillation pilot scale using arx and narx ...System identification of a steam distillation pilot scale using arx and narx ...
System identification of a steam distillation pilot scale using arx and narx ...
ย 
System identification of a steam distillation pilotscale
System identification of a steam distillation pilotscaleSystem identification of a steam distillation pilotscale
System identification of a steam distillation pilotscale
ย 
Adequacy Analysis and Security Reliability Evaluation of Bulk Power System
Adequacy Analysis and Security Reliability Evaluation of Bulk Power SystemAdequacy Analysis and Security Reliability Evaluation of Bulk Power System
Adequacy Analysis and Security Reliability Evaluation of Bulk Power System
ย 
2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE
2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE
2-DOF BLOCK POLE PLACEMENT CONTROL APPLICATION TO:HAVE-DASH-IIBTT MISSILE
ย 
Smart production system
Smart production systemSmart production system
Smart production system
ย 
A model for run time software architecture adaptation
A model for run time software architecture adaptationA model for run time software architecture adaptation
A model for run time software architecture adaptation
ย 
Presentation reliability and diagnosis in industrial systems
Presentation   reliability and diagnosis in industrial systemsPresentation   reliability and diagnosis in industrial systems
Presentation reliability and diagnosis in industrial systems
ย 
Transformation of simulink models to iec 61499 function blocks for verificati...
Transformation of simulink models to iec 61499 function blocks for verificati...Transformation of simulink models to iec 61499 function blocks for verificati...
Transformation of simulink models to iec 61499 function blocks for verificati...
ย 
Driving semiconductor-manufacturing-business-performance-through-analytics (1)
Driving semiconductor-manufacturing-business-performance-through-analytics (1)Driving semiconductor-manufacturing-business-performance-through-analytics (1)
Driving semiconductor-manufacturing-business-performance-through-analytics (1)
ย 

Viewers also liked

AIA Solutions Brochure
AIA Solutions BrochureAIA Solutions Brochure
AIA Solutions BrochureDenise J. Dewberry
ย 
Republica bolivariana de venezuela
Republica bolivariana de venezuelaRepublica bolivariana de venezuela
Republica bolivariana de venezuelaozzycastro1972
ย 
Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016
Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016
Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016Izquierda Unida Cรณrdoba
ย 
Rotary International Certificate of Appreciation
Rotary International Certificate of AppreciationRotary International Certificate of Appreciation
Rotary International Certificate of AppreciationMarco Castro Sanchez
ย 
Secret Garden Residencial
Secret Garden ResidencialSecret Garden Residencial
Secret Garden ResidencialSuporteaoCorretor
ย 
Sports in Russia Contest
Sports in Russia ContestSports in Russia Contest
Sports in Russia Contestprosvsports
ย 
Adriana Vizental
Adriana VizentalAdriana Vizental
Adriana VizentalDia Diana
ย 
ACERCAMIENTO A LA FORMA DEL PROYECTO
ACERCAMIENTO A LA  FORMA DEL PROYECTOACERCAMIENTO A LA  FORMA DEL PROYECTO
ACERCAMIENTO A LA FORMA DEL PROYECTOrogelio01
ย 
La dinamica-de-las-culturas-de-america-latina-ppt
La dinamica-de-las-culturas-de-america-latina-pptLa dinamica-de-las-culturas-de-america-latina-ppt
La dinamica-de-las-culturas-de-america-latina-pptSandra Mayta Sheron
ย 
Kelan kuntoutus 2016
Kelan kuntoutus 2016Kelan kuntoutus 2016
Kelan kuntoutus 2016Kela
ย 
Successful Leadership
Successful LeadershipSuccessful Leadership
Successful LeadershipVaNessa Vollmer
ย 
Digijytky kunnossapidossa 2015 - Oliotalo
Digijytky kunnossapidossa 2015 - OliotaloDigijytky kunnossapidossa 2015 - Oliotalo
Digijytky kunnossapidossa 2015 - OliotaloVincitOy
ย 

Viewers also liked (20)

Victoria's Regional Statement 2015
Victoria's Regional Statement 2015Victoria's Regional Statement 2015
Victoria's Regional Statement 2015
ย 
AIA Solutions Brochure
AIA Solutions BrochureAIA Solutions Brochure
AIA Solutions Brochure
ย 
Republica bolivariana de venezuela
Republica bolivariana de venezuelaRepublica bolivariana de venezuela
Republica bolivariana de venezuela
ย 
Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016
Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016
Sรญntesis de las propuestas y actuaciones en el Casco Histรณrico de Cรณrdoba. 2016
ย 
Tema 1
Tema 1Tema 1
Tema 1
ย 
Rotary International Certificate of Appreciation
Rotary International Certificate of AppreciationRotary International Certificate of Appreciation
Rotary International Certificate of Appreciation
ย 
Priroda kazahstana
Priroda kazahstanaPriroda kazahstana
Priroda kazahstana
ย 
Carioca Offices
Carioca OfficesCarioca Offices
Carioca Offices
ย 
Conceptos arquitectรณnicos
Conceptos arquitectรณnicosConceptos arquitectรณnicos
Conceptos arquitectรณnicos
ย 
Secret Garden Residencial
Secret Garden ResidencialSecret Garden Residencial
Secret Garden Residencial
ย 
Sports in Russia Contest
Sports in Russia ContestSports in Russia Contest
Sports in Russia Contest
ย 
Moradas da Colina
Moradas da ColinaMoradas da Colina
Moradas da Colina
ย 
Adriana Vizental
Adriana VizentalAdriana Vizental
Adriana Vizental
ย 
Kuntoutussรครคtiรถn strategiakartta
Kuntoutussรครคtiรถn strategiakarttaKuntoutussรครคtiรถn strategiakartta
Kuntoutussรครคtiรถn strategiakartta
ย 
ACERCAMIENTO A LA FORMA DEL PROYECTO
ACERCAMIENTO A LA  FORMA DEL PROYECTOACERCAMIENTO A LA  FORMA DEL PROYECTO
ACERCAMIENTO A LA FORMA DEL PROYECTO
ย 
Liikunta ja osallisuus
Liikunta ja osallisuusLiikunta ja osallisuus
Liikunta ja osallisuus
ย 
La dinamica-de-las-culturas-de-america-latina-ppt
La dinamica-de-las-culturas-de-america-latina-pptLa dinamica-de-las-culturas-de-america-latina-ppt
La dinamica-de-las-culturas-de-america-latina-ppt
ย 
Kelan kuntoutus 2016
Kelan kuntoutus 2016Kelan kuntoutus 2016
Kelan kuntoutus 2016
ย 
Successful Leadership
Successful LeadershipSuccessful Leadership
Successful Leadership
ย 
Digijytky kunnossapidossa 2015 - Oliotalo
Digijytky kunnossapidossa 2015 - OliotaloDigijytky kunnossapidossa 2015 - Oliotalo
Digijytky kunnossapidossa 2015 - Oliotalo
ย 

Similar to FCAME2014

Structural Health Monitoring
Structural Health MonitoringStructural Health Monitoring
Structural Health MonitoringJNTU
ย 
A robust algorithm based on a failure sensitive matrix for fault diagnosis of...
A robust algorithm based on a failure sensitive matrix for fault diagnosis of...A robust algorithm based on a failure sensitive matrix for fault diagnosis of...
A robust algorithm based on a failure sensitive matrix for fault diagnosis of...IJMER
ย 
Actuator Fault Decoupled Residual Generation on Lateral Moving Aircraft
Actuator Fault Decoupled Residual Generation on Lateral Moving AircraftActuator Fault Decoupled Residual Generation on Lateral Moving Aircraft
Actuator Fault Decoupled Residual Generation on Lateral Moving AircraftTELKOMNIKA JOURNAL
ย 
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...IJNSA Journal
ย 
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...IJNSA Journal
ย 
MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...
MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...
MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...ssusere3c688
ย 
HVAC_CSIRO_Proof_2015
HVAC_CSIRO_Proof_2015HVAC_CSIRO_Proof_2015
HVAC_CSIRO_Proof_2015Mitchell Yuwono
ย 
INDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONS
INDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONSINDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONS
INDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONScsandit
ย 
Giddings
GiddingsGiddings
Giddingsanesah
ย 
A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...
A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...
A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...Steven Wallach
ย 
Introduction to System, Simulation and Model
Introduction to System, Simulation and ModelIntroduction to System, Simulation and Model
Introduction to System, Simulation and ModelMd. Hasan Imam Bijoy
ย 
Towards a good abs design for more Reliable vehicles on the roads
Towards a good abs design for more Reliable vehicles on the roadsTowards a good abs design for more Reliable vehicles on the roads
Towards a good abs design for more Reliable vehicles on the roadsijcsit
ย 
A resonable approach for manufacturing system based on supervisory control 2
A resonable approach for manufacturing system based on supervisory control 2A resonable approach for manufacturing system based on supervisory control 2
A resonable approach for manufacturing system based on supervisory control 2IAEME Publication
ย 
Data mining-for-prediction-of-aircraft-component-replacement
Data mining-for-prediction-of-aircraft-component-replacementData mining-for-prediction-of-aircraft-component-replacement
Data mining-for-prediction-of-aircraft-component-replacementSaurabh Gawande
ย 
K-10714 ABHISHEK(MATLAB )
K-10714 ABHISHEK(MATLAB )K-10714 ABHISHEK(MATLAB )
K-10714 ABHISHEK(MATLAB )shailesh yadav
ย 
Intelligence decision making of fault detection and fault tolerances method f...
Intelligence decision making of fault detection and fault tolerances method f...Intelligence decision making of fault detection and fault tolerances method f...
Intelligence decision making of fault detection and fault tolerances method f...Siva Samy
ย 
H04544759
H04544759H04544759
H04544759IOSR-JEN
ย 
Trajectory Control With MPC For A Robot Manipรผlatรถr Using ANN Model
Trajectory Control With MPC For A Robot Manipรผlatรถr Using  ANN ModelTrajectory Control With MPC For A Robot Manipรผlatรถr Using  ANN Model
Trajectory Control With MPC For A Robot Manipรผlatรถr Using ANN ModelIJMER
ย 

Similar to FCAME2014 (20)

FCMPE2015
FCMPE2015FCMPE2015
FCMPE2015
ย 
Structural Health Monitoring
Structural Health MonitoringStructural Health Monitoring
Structural Health Monitoring
ย 
A robust algorithm based on a failure sensitive matrix for fault diagnosis of...
A robust algorithm based on a failure sensitive matrix for fault diagnosis of...A robust algorithm based on a failure sensitive matrix for fault diagnosis of...
A robust algorithm based on a failure sensitive matrix for fault diagnosis of...
ย 
Actuator Fault Decoupled Residual Generation on Lateral Moving Aircraft
Actuator Fault Decoupled Residual Generation on Lateral Moving AircraftActuator Fault Decoupled Residual Generation on Lateral Moving Aircraft
Actuator Fault Decoupled Residual Generation on Lateral Moving Aircraft
ย 
593176
593176593176
593176
ย 
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
ย 
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
BEARINGS PROGNOSTIC USING MIXTURE OF GAUSSIANS HIDDEN MARKOV MODEL AND SUPPOR...
ย 
MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...
MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...
MultivariableProcessIdentificationforMPC-TheAsymptoticMethodanditsApplication...
ย 
HVAC_CSIRO_Proof_2015
HVAC_CSIRO_Proof_2015HVAC_CSIRO_Proof_2015
HVAC_CSIRO_Proof_2015
ย 
INDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONS
INDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONSINDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONS
INDUCTIVE LOGIC PROGRAMMING FOR INDUSTRIAL CONTROL APPLICATIONS
ย 
Giddings
GiddingsGiddings
Giddings
ย 
A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...
A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...
A Hybrid Genetic Algorithm-TOPSIS-Computer Simulation Approach For Optimum Op...
ย 
Introduction to System, Simulation and Model
Introduction to System, Simulation and ModelIntroduction to System, Simulation and Model
Introduction to System, Simulation and Model
ย 
Towards a good abs design for more Reliable vehicles on the roads
Towards a good abs design for more Reliable vehicles on the roadsTowards a good abs design for more Reliable vehicles on the roads
Towards a good abs design for more Reliable vehicles on the roads
ย 
A resonable approach for manufacturing system based on supervisory control 2
A resonable approach for manufacturing system based on supervisory control 2A resonable approach for manufacturing system based on supervisory control 2
A resonable approach for manufacturing system based on supervisory control 2
ย 
Data mining-for-prediction-of-aircraft-component-replacement
Data mining-for-prediction-of-aircraft-component-replacementData mining-for-prediction-of-aircraft-component-replacement
Data mining-for-prediction-of-aircraft-component-replacement
ย 
K-10714 ABHISHEK(MATLAB )
K-10714 ABHISHEK(MATLAB )K-10714 ABHISHEK(MATLAB )
K-10714 ABHISHEK(MATLAB )
ย 
Intelligence decision making of fault detection and fault tolerances method f...
Intelligence decision making of fault detection and fault tolerances method f...Intelligence decision making of fault detection and fault tolerances method f...
Intelligence decision making of fault detection and fault tolerances method f...
ย 
H04544759
H04544759H04544759
H04544759
ย 
Trajectory Control With MPC For A Robot Manipรผlatรถr Using ANN Model
Trajectory Control With MPC For A Robot Manipรผlatรถr Using  ANN ModelTrajectory Control With MPC For A Robot Manipรผlatรถr Using  ANN Model
Trajectory Control With MPC For A Robot Manipรผlatรถr Using ANN Model
ย 

FCAME2014

  • 1. Research Article Online Adaptive Bearings Condition Assessment Using Continuous Hidden Markov Models Francesco Cartella1 and Hichem Sahli1,2 1 Electronics and Informatics Department (ETRO), Vrije Universiteit Brussel (VUB), 1050 Brussels, Belgium 2 Interuniversity Microelectronics Center (IMEC), 3001 Leuven, Belgium Correspondence should be addressed to Francesco Cartella; fcartell@etro.vub.ac.be Received 2 July 2014; Accepted 9 October 2014 Academic Editor: Xiaotun Qiu Copyright ยฉ F. Cartella and H. Sahli. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Condition assessment (CA) of bearings can be performed by using left-right hidden Markov models, because of the monotonically increasing pattern of the bearings degradation. Classical CA approaches assume that all possible system states are fixed and known a priori. The training of the system is performed offline at once with data from all of the system states. These assumptions significantly impede condition assessment applications in case that all the possible states of the system are not known in advance, or changes in environmental or operative conditions occur during the toolโ€™s usage. To overcome these limitations, we propose combining left-right continuous HMMs (CHMM) with a change point detection algorithm for (i) estimating, from historical observations, the initial number of the CHMM states and the initial guess of its model parameters and (ii) updating the state space as well as the model parameters during monitoring. Moreover, to deal with multidimensional sensor measurements, we propose using kernel principal component analysis for dimensionality reduction. Qualitative and quantitative evaluations of the proposed methodology have been performed using both simulated and real data from the NASA benchmark repository. Compared to state of the art techniques, the proposed methodology results in (i) an improvement of the HMM training phase in terms of iterations number; (ii) the detection of unknown states at an early stage; and (iii) an effective change of the CHMMโ€™s structure to represent the degradation processes more accurately in presence of unknown conditions. 1. Introduction Machines and equipments breakdown in industrial environ- ments can have significant impact in the profitability of a business. Maintaining the engineering assets in operating conditions is an industrial and economical requirement. Nowadays optimization strategies for maintenance are an important part of research for companies [1, 2], since, next to energy costs, maintenance spending due to equipment or machine failure can be the largest part of the operational budget [3]. Corrective maintenance [1, 4], which is the system main- tenance strategy subject of this work, is based on condition monitoring and is performed after a degradation modality emerges in a system, with the goal of preventing the system from failing. A fundamental requirement of this strategy is the detection of the abnormal behavior, that will lead to the system failure, at an early stage in order to give enough time to perform the repair operations. Industrial equipments and their components are often characterized by a very complex structure. For this reason the usage of mathematical models for condition monitoring can represent an arduous task. On the other hand, industrial machines are usually equipped with monitoring devices (sen- sors placed in critical parts of the system under study) which are able to record and collect large amount of measurements. Given the availability of these huge amounts of data, the methodology proposed in this paper will focus on data- driven models [5], which are based on machine learning approaches. The classical usage of data-driven techniques consists of two phases: a first training phase in which a preprocessing of the historical raw sensory data is performed in order to extract relevant features, which will be then exploited to Hindawi Publishing Corporation Advances in Mechanical Engineering Article ID 758785
  • 2. 2 Advances in Mechanical Engineering learn the parameters of behavioral models of the considered machine and a second application phase, during the runtime of the system, in which the trained model is used along with runtime measurement to evaluate the health condition of the machine. The advantage of such approaches is that, for a well monitored system, there is no need for knowledge of the systemโ€™s internal physical mechanism or for a prior mathematical model to represent the system degradation. Several data-driven models for condition monitoring have been proposed, such as clustering techniques [6], data-mining approaches [7], support vector machines [8], dynamic Bayesian networks [9], and hidden Markov models [10, 11]. Hidden Markov models (HMMs) have been shown to be effective pattern recognition tools in a large variety of recognition tasks. They have been successfully applied to condition monitoring of manufacturing machines [12, 13] as well as in other application domains ranging from speech recognition [14] to intrusion detection in computer systems [15]. The common usage of hidden Markov hodels, especially for condition assessment applications, consists of dividing the degradation pattern of the historical data of the system under study into different degradation levels, from normal to deeper degradation (i.e., good, partially worn, seriously worn, broken), and training either several HMMs, each one corresponding to observations (features) from a particular degradation level [16โ€“19] or training a single HMM using historical observations corresponding to the normal behavior of the system [19โ€“23]. Online condition assessment is then performed by selecting the HMM that, on new incoming data, gives the highest likelihood or by detecting an abnor- mality if the likelihood is lower than a threshold defined for the normal behavior [19]. However, the latter approach is not well suited to condition assessment of industrial systems, as it can be used only for anomaly detection. Indeed, such approach is only able to detect a condition that is different from the considered normal behavior in the training phase. Another approach [14] consists of training a single HMM by using the sensor measurements corresponding to the entire lifetime of the system, by setting the number of states equal to the number of degradation stages contained in the training history. By using this approach, the model maps each degradation phase to a state of the HMM. The condition assessment is performed online on new incoming data by calculating the Viterbi path [24] which returns the most probable state sequence. The approaches described above assume that all possible system condition states are known a priori and that training data sets from the associated states are available. In addition, the training procedure has to be performed offline and only once at the beginning, with the available training set. For this reason, the trained models are able to recognize only the conditions contained in the training data. These assump- tions significantly impede component diagnosis applications when: (i) it is difficult to identify and train all of the possible states of the system in advance; (ii) environmental factors or operative conditions change during the toolโ€™s usage. Since industrial machines are constantly subjected to updates, changes of configurations or changes of working conditions, a need for models that are able to automati- cally adapt to new situations, by dynamically learning the structure and the parameters, is required. To the best of our knowledge, such approach has only been used by Lee et al. [25], whom proposed a modified hidden Markov model with variable state space to estimate the current state of system degradation as well as to detect unknown faults at an early stage. A multivariate statistical process control is used to detect the occurrence of an unknown condition by measuring the deviation of the current signal from a reference signal representing prior known states; then the HMM is updated by including a new state. The limitation of this approach is that (i) one should collect reference signal representing prior known states and (ii) define a parametric model (Gaussian in the case of [25]) of the observation signals. To overcome the abovementioned drawbacks of most of the state of the art HMM-based condition assessment approaches, we proposed in [26] a methodology based on HMMs which is able to perform (i) an automatic estimation of the initial parameters and number of states of the model and (ii) an online adaptation of the model structure, together with the online learning of the parameters, in presence of not stable conditions. The proposed approach aims to diagnose the degradation processes of industrial systems using a variable state space left-right continuous HMM (CHMM), by integrating a change point detection (CPD) algorithm [27]. The CPD algorithm allows detecting the dynamic changes of machine condition and hence of the HMM parameters, by discovering time points at which the properties of the input time-series data change, corresponding to the (signal) segmentation of the observed data stream. The detected segments correspond to the different states of the machine. In this work, as a further contribution, we evaluate the reliability of the data driven identification of the number of (possible) systemโ€™s states by means of clustering techniques and the Dunnโ€™s validity indexes [28]. Concerning the online condition assessment, the com- parison between the estimated most-likely state, by the Viterbi algorithm [24], and the CPD detected segments (states) allows fast failures detection. Moreover, since the CPD algorithm is independent from the density model of the observation, in comparison with the method of Lee et al. [25], our proposal is more effective as it is independent of prior data and allows faster detection of unknown conditions. The used CPD algorithm is able to handle 1-dimensional time series, while industrial machines are usually equipped with several sensors for which multiple features can be extracted, resulting in a multidimensional observation vector. To have an accurate modeling for real world cases, it is essential to consider as much information as possible related to system changes, carried by multiple signals. In this work we propose applying kernel principal component analysis (KPCA) [29โ€“31] for dimensionality reduction. KPCA, being able to effectively deal with nonlinear data, embeds the
  • 3. Advances in Mechanical Engineering 3 information coming from all the measurements in to a 1- dimensional time series and allows a faster detection of condition changes by the CPD algorithm. Although it is adaptable to a wide range of health estimation problems, we applied the proposed methodology on bearings condition monitoring. This choice derives from the fact that, being the most common mechanical elements in industry, bearings are the components which most frequently fail in rotating machines [32]. Furthermore, given the nature of the component, we can assume that the degradation pat- tern of bearings is monotonically increasing and, therefore, the use of left-right HMMs is well-motivated. The rest of the paper is organized as follows. Section 2 gives a short overview of the used concepts and methods, namely, continuous left-right hidden Markov models, the preprocessing and features extraction techniques, and the change point detection algorithm. In Section 3, we provide a detailed description of the proposed approach. In Section 4 experimental results are discussed. The conclusion and future research directions are given in Section 5. 2. Methods and Algorithms Before going into a more detailed discussion of the proposed approach, an overview of the techniques used in this paper is given in this section. 2.1. Left-Right Continuous Hidden Markov Models. Left- right vontinuous hidden Markov models (CHMMs) [14] are dynamic models in which there are two types of variables: a state variable describing the system state, which is never observed/measured directly (hidden), and an observation variable which is related to the hidden state. A left-right CHMM ๐œ† = (๐ด, ๐ถ, ๐œ‡, ๐‘ˆ, ๐œ‹) is characterized by the following: (1) ๐‘, the number of states in the model, ๐‘† = {๐‘†1, . . . , ๐‘† ๐‘} the set of states, and ๐‘ ๐‘ก the state at time ๐‘ก, (2) the state transition probability distribution ๐ด = {๐‘Ž๐‘–๐‘—} where ๐‘Ž๐‘–๐‘— = ๐‘ƒ (๐‘ ๐‘ก+1 = ๐‘†๐‘— | ๐‘ ๐‘ก = ๐‘†๐‘–) , 1 โ‰ค ๐‘–, ๐‘— โ‰ค ๐‘, (1) (3) X๐‘ก, the continuous observation vector emitted from the state at time ๐‘ก, (4) the observation density, represented by a finite mix- ture: ๐‘๐‘— (X) = ๐‘€ โˆ‘ ๐‘š=1 ๐‘๐‘—๐‘šโ„ต (X, ๐œ‡ ๐‘—๐‘š, U ๐‘—๐‘š) , 1 โ‰ค ๐‘— โ‰ค ๐‘, (2) where X is the vector being modeled, ๐‘๐‘—๐‘š the mixture coefficient for the mth mixture in state ๐‘—, and โ„ต is any log-concave or elliptically symmetric density (usually a Gaussian), with mean vector ๐œ‡๐‘—๐‘š and covariance matrix U ๐‘—๐‘š for the mth mixture component in state ๐‘—. The parameters ๐‘๐‘—๐‘š, ๐œ‡๐‘—๐‘š, and U ๐‘—๐‘š, are the coefficients of ๐ถ, ๐œ‡ and ๐‘ˆ, respectively, (5) the initial state distribution ๐œ‹ = {๐œ‹๐‘–} where ๐œ‹๐‘– = ๐‘ƒ (๐‘ 1 = ๐‘†๐‘–) , 1 โ‰ค ๐‘– โ‰ค ๐‘. (3) It is proved that the probability density function of (2) can be used to approximate, arbitrarily closely, any finite, continuous density function [33]. Hence it can be applied to a wide range of problems. The left-right topology has the property that the state index can either increase or stay the same, as time goes on. In this case, the following holds for the coefficients of the transition matrix: ๐‘Ž๐‘–๐‘— = 0, ๐‘— < ๐‘–, (4) ๐‘Ž๐‘–๐‘— = 0, ๐‘— > ๐‘– + ฮ”, (5) ๐‘Ž ๐‘๐‘ = 1, (6) ๐‘Ž ๐‘๐‘– = 0, ๐‘– < ๐‘, (7) where (4) avoids transition to previous states and (5) avoids jumps of more than ฮ” states, while the state ๐‘ is an absorbing state. Moreover, since the model starts from state ๐‘†1, the initial state probabilities have the following property: ๐œ‹๐‘– = { 0 ๐‘– ฬธ= 1 1 ๐‘– = 1. (8) For CHMMs applied to condition monitoring, learning and inference procedures can be used. (1) Learning: adjust the model parameters ๐œ† = (๐ด, ๐ถ, ๐œ‡, ๐‘ˆ, ๐œ‹) in order to maximize the probability ๐‘ƒ(X | ๐œ†); that is, for the considered CHMM structure, find the model parameters that better fit the historical sensor measurements X. This problem is solved through the Baum-Welch algorithm [34]. (2) Decoding: given a model ๐œ† = (๐ด, ๐ถ, ๐œ‡, ๐‘ˆ, ๐œ‹) and a se- quence of sensor measurements X = X1, X2, . . . , X ๐‘‡, find the hidden state sequence ๐‘† = ๐‘ 1, ๐‘ 2, . . . , ๐‘  ๐‘‡ that has most likely produced the observation sequence. The solution of this problem is given by the Viterbi algorithm [24]. In the next section we will describe how we can fit the multidimensional time-series input coming from the machine sensors into a CHMM model. 2.2. Feature Extraction and Dimensionality Reduction. In this paper, the inputs are vibration signals originating from sensors on the machine under study. For this purpose, one accelerometer or more accelerometers are placed in the component housing and are used to translate the vibrations emitted to an electric continuous signal. For each sensor, the electric signal measured is then sampled at a predefined sampling frequency. The monitoring system of the machine acquires and stores such samples one at a time, under the form of a digital signal, which represents the raw input data, referred to as ๐‘Ÿ(๐‘ก) with 1 โ‰ค ๐‘ก โ‰ค ๐‘‡. It is assumed that a
  • 4. 4 Advances in Mechanical Engineering r(t) Window 1 Window 2 Window 3 Window n โˆ’ 1 Window n X(1) X(2) X(3) X(n โˆ’ 1) X(n)ยท ยท ยท ยท ยท ยท Figure 1: Feature extraction from raw signals. characteristic shape of the measured vibration is related to the actual condition of the component. Due to the high sampling frequency that is often used, a high-dimensional signal is measured. Moreover, consecutive values of a time series are usually not independent but highly correlated; thus there is a lot of redundancy. Hence, rather than looking at each point in time sequentially to detect anomalies, it is common to use sliding windows of length ๐ฟ. Within each window, ๐‘˜ and ๐‘‘ features are calculated, resulting in a vector of ๐‘‘ components X(๐‘˜) = [๐‘ฅ1(๐‘˜), ๐‘ฅ2(๐‘˜), . . . , ๐‘ฅ ๐‘‘(๐‘˜)] ๐‘‡ . Finally, the raw signal ๐‘Ÿ(๐‘ก) is trans- formed in a multivariate time series as (10). The process is illustrated in Figure 1. Consider Raw signal: ๐‘Ÿ (๐‘ก) = ๐‘Ÿ (1) , . . . , ๐‘Ÿ (๐‘‡) โ†“ features extraction, X (๐‘ก) = [X (1) , . . . , X (๐‘›)] , (9) X (๐‘ก) = [ [ [ [ [ [ ๐‘ฅ1 (1) ... ๐‘ฅ ๐‘‘ (1) ] ] ] , . . . , [ [ [ ๐‘ฅ1 (๐‘›) ... ๐‘ฅ ๐‘‘ (๐‘›) ] ] ] ] ] ] . (10) In this work, as also explained in Sections 4.2.1 and 4.2.3, we considered windows of size ๐ฟ = 20480 (fixed by the used data set) and ๐‘‘ = 3 time domain features, namely, the mean, the root mean square, and the kurtosis, of the sensory signals. It has to be noted that frequency [35] (i.e., Fourier transform) or time-frequency [36, 37] (i.e., wavelet transform) domain features could be used as well. Since the considered CPD algorithm is able to process 1-dimensional signals, a further dimensionality reduction technique has to be applied on the ๐‘‘-dimensional features vectors. In this paper, kernel principal component analysis (KPCA) [29โ€“31] is applied as a technique for dimensionality reduction. In the literature, principal component analysis (PCA) has been extensively used as a features extraction technique for condition assessment applications [18]. However, KPCA has been shown to perform better than the standard PCA for processes with particularly nonlinear characteristics [31]. The basic idea of KPCA is that when the variations are nonlinear, the data can always be mapped into a higher-dimensional space, by means of nonlinear kernel functions, in which they vary linearly. In this work Gaussian radial basis kernels have been used. The reader is referred to [29, 31] for the details on KPCA. The main advantages of KPCA are the fact that (i) it does not involve nonlinear optimization [29] since it requires only linear algebra, making it as simple as standard PCA which requires only the solution of an eigenvalue problem; (ii) due to its ability to use different kernels, it can handle a wide range of nonlinearities; (iii) it does not require the number of components to be extracted be specified prior to modeling. Applying KPCA on feature vectors X(๐‘ก) generates a 1- dimensional time-series, ๐‘ฆ(๐‘ก), (๐‘ก = 1, . . . , ๐‘›), which can be used as input to the CPD algorithm. 2.3. Change Point Detection Algorithm. The change point detection approaches apply data mining techniques to iden- tify the time points at which the changes, that is, events, occur. This has been discussed in several applications such as fraud detection [38], rare event discovery [39], event/trend change detection [27], and activity monitoring [40]. In [27] a method has been proposed for the detection of the appropriate set of number of points that minimizes the error in fitting a predefined function using maximum likelihood. There is no fixed number of change-points to be detected. Moreover, no constraints are imposed on the class of functions that will be fitted to the subsequences between successive change-points. Following the notation in [27], let ๐‘ฆ(๐‘ก), (๐‘ก = 1, . . . , ๐‘›), be the time series to be segmented. It is assumed that the time series can be modeled mathematically, where each model is characterized by a set of parameters. The problem of change-points detection is formulated as finding a piecewise segmented model, given by the following: ๐‘Œ = ๐‘“1 (๐‘ก, V1) + ๐‘’1 (๐‘ก) , (1 < ๐‘ก โ‰ค ๐œƒ1) , = ๐‘“2 (๐‘ก, V2) + ๐‘’2 (๐‘ก) , (๐œƒ1 < ๐‘ก โ‰ค ๐œƒ2) , = โ‹… โ‹… โ‹… = ๐‘“๐‘™ (๐‘ก, V๐‘™) + ๐‘’๐‘™ (๐‘ก) , (๐œƒ๐‘™โˆ’1 < ๐‘ก โ‰ค ๐‘›) , (11) where ๐‘“๐‘–(๐‘ก, V๐‘–) is the function (with its vector of parameters V๐‘–) that is fitted to the segment ๐‘–. The ๐œƒ๐‘–โ€™s are the change-points between successive segments, and ๐‘’๐‘–(๐‘ก)โ€™s are error terms.
  • 5. Advances in Mechanical Engineering 5 Several types of basis functions, ๐‘“๐‘–(๐‘ก, V๐‘–), can be consid- ered, for example, algebraic polynomials, wavelet, Fourier, and so forth [41]. In our current implementation polynomial fitting functions have been selected, since they represent a good compromise between expressiveness power and com- putational effort. In [27], two CPD approaches have been proposed, the batch (offline) and the incremental (online). In the batch version, the entire data set is available in advance, from which the best set of change-points is determined. In the incremental version, the algorithm receives new data points one at a time and determines if the new observation causes a new change-point to be discovered. Both approaches are summarized here after. Details on the implementation can be found in [27]. 2.3.1. Batch Algorithm. The batch CPD algorithm assumes that the entire time series data is available before the analysis begins. Let ๐‘† be the residual sum of squares for the selected fitting function that maintains time points ๐‘ก๐‘–, ๐‘ก๐‘–+1, . . . , ๐‘ก๐‘—. In this paper we define the likelihood criterion as L(๐‘–, ๐‘—) = ๐‘† since we used a homoscedastic error model [27]. The key idea behind the algorithm in [27] is that, at every iteration, each segment is examined to see whether it can be split into two significantly different segments. The splitting procedure can be illustrated by a consideration of the first stage, since all subsequent stages consist of equivalent scaled- down problems. Let the data set cover the time points ๐‘ก1, ๐‘ก2, . . . , ๐‘ก ๐‘›. The change-point in the first stage is the ๐‘— minimizing L(1, ๐‘—) + L(๐‘— + 1, ๐‘›), say ๐‘—โˆ— . Here ๐‘—โˆ— is defined as follows: L (1, ๐‘—โˆ— ) + L (๐‘—โˆ— + 1, ๐‘›) = min ๐‘โ‰ค๐‘—โ‰ค๐‘›โˆ’๐‘ L (1, ๐‘—) + L (๐‘— + 1, ๐‘›) . (12) The range of ๐‘— depends on the fact that at least ๐‘ points are needed for model fitting in each segment. At the second stage, the best candidate change-points ๐‘1 and ๐‘2 of each of the two segments are located. The selection of the better of these two candidates yields a division of the original sequence into three segments. Assuming that point ๐‘1 is chosen, now the likelihood criteria of the model becomes as follows: L = (L (1, ๐‘1) + L (๐‘1 + 1, ๐‘—โˆ— ) + L (๐‘—โˆ— + 1, ๐‘›)) < (L (1, ๐‘—โˆ— ) + L (๐‘—โˆ— + 1, ๐‘2) + L (๐‘2 + 1, ๐‘›)) . (13) The procedure above is repeated until a stopping criterion is reached. Once the algorithm has detected all โ€œrealโ€ change- points, adding any more change-points the likelihood will increase. Therefore, the algorithm should stop when the like- lihood criteria becomes stable or starts to increase. Formally, if in iterations ๐‘ง and ๐‘ง + 1 the respective likelihood criteria values are L ๐‘ง and L ๐‘ง+1, the algorithm should stop if (L ๐‘ง โˆ’ L ๐‘ง+1) L ๐‘ง < ๐‘ , (14) where ๐‘  is a user-defined stability threshold. When stability threshold ๐‘  is set to 0%, the algorithm stops only when the likelihood criteria starts increasing. 2.3.2. Incremental Algorithm. In an online setting, the change-point detection must proceed concurrently with data collection. The key idea of the incremental algorithm is that if the next data point collected by the sensor reflects a signif- icant change in phenomenon, then the likelihood of being a change-point is going to be smaller than the likelihood that it is not. However, if the difference in likelihoods is small, we cannot definitively conclude that a change did occur, since it may be the artifact of a large amount of noise in the data. Therefore a user-defined likelihood increase threshold is introduced. Consider (Lno change โˆ’ Lchange) Lno change > ๐›ฟ, (15) where ๐›ฟ is a user-defined likelihood increase threshold. Suppose that the last change-point was detected at time ๐‘ก๐‘™โˆ’1. At time ๐‘ก๐‘™ the algorithm starts by collecting enough data to fit the regression model. Suppose at time ๐‘ก๐‘— a new data point is collected. The candidate change-point is found by determining ๐‘ก๐‘–, with likelihood criterion Lmin(๐‘™, ๐‘—), such that Lmin (๐‘™, ๐‘—) = min ๐‘™<๐‘–โ‰ค๐‘— L (๐‘™, ๐‘–) + L (๐‘– + 1, ๐‘—) . (16) If this minimum is significantly smaller than L(๐‘™, ๐‘—), that is, the likelihood criteria of no change-points from ๐‘ก๐‘™ to ๐‘ก๐‘—, then ๐‘ก๐‘– is a change-point. Otherwise, the process should continue with the next point, that is, ๐‘ก๐‘—+1. In the incremental algorithm, execution time is a sig- nificant factor. If enough information is stored, some of the calculations can be avoided. Thus, at time ๐‘ก๐‘—+1 to find likelihood criteria Lmin (๐‘™, ๐‘— + 1) = min ๐‘™<๐‘–โ‰ค๐‘— L (๐‘™, ๐‘–) + L (๐‘– + 1, ๐‘— + 1) , (17) it is only necessary to calculate L(๐‘– + 1, ๐‘— + 1), since (๐‘™, ๐‘–) was calculated in the previous iteration. It should be noted that if a change-point is not detected for a long time, the successive computations become increasingly expensive. A possible solution is to consider a sliding window of only the last ๐‘ž points. 3. Framework The framework of the proposed approach is illustrated in Figure 2. As discussed in the introduction, the methodology is applied in two phases, the training (phase 1) and the online application (phase 2). 3.1. Phase 1: Estimation of the Initial Number of States and Parameters Initialization. During the first phase, we assume that enough historical raw data have been collected from the monitoring system of the machine under study. Such data can be measurements from multiple sensor types (temperature,
  • 6. 6 Advances in Mechanical Engineering Phase 1-training Raw signal training set Features extraction Kernel PCA Kernel PCA Batch CPD algorithm Segmented Number of states estimation Initial state number Guess = Nโˆ— X X y y Model Parameters initialization Initial parameters Guess = ๐œ†0 Learning procedure (Baum Welch) Learned parameters ๐œ†โˆ— CHMM model Phase 2-testing Online testing signal Last L raw data points Features extraction Online CPD algorithm New state detected? Yes Add the testing signal to the training set Add a new state to the CHMM (Nโˆ— = Nโˆ— + 1) X X y Condition assessment (Viterbi path) Figure 2: Framework of the proposed approach. pressure, acceleration,. . .etc.). For each sensor measurement, as described in Section 2.2, we apply the feature extraction, yielding multidimensional feature vectors X(๐‘ก), on which we apply KPCA dimensionality reduction. The resulting 1- dimensional signal, ๐‘ฆ(๐‘ก), is then input to the CPD algorithm, which in turn delivers (detects) ๐‘โˆ— segments. Each segment is mapped to a HMM state. The intuition of this approach comes from the assumption that there is a correlation between observations and states. Thus, a variation in the condition of the system corresponds to a change in the behavior of the sensorial measurements. Once the number of states (๐‘โˆ— ) has been estimated, and knowing the end-points of each segment, the initialization of the HMM parameters ๐œ†0 = (๐œ‹0, ๐ด0, ๐ถ0, ๐œ‡0, U0) can be performed. Since left-right HMMs are considered, ๐œ‹0 is initialized by using (8). Using (4), (6), (7), and (5) with ฮ” = 1, the transition matrix ๐ด0 is initialized as follows: ๐‘Ž๐‘–๐‘— = # tr (๐‘–, ๐‘—) #๐‘ ๐‘– , ๐‘— = ๐‘– + 1 (18) โˆ€๐‘–, ๐‘Ž๐‘–๐‘– = 1 โˆ’ โˆ‘ ๐‘— ฬธ=๐‘– ๐‘Ž๐‘–๐‘—, (19) where # tr(๐‘–, ๐‘—) is the number of transitions from segment ๐‘– to ๐‘— (which is equal to 1 if left-right HMMs) and #๐‘ ๐‘– the number of data points in segment ๐‘–. The transition matrix is ensured to satisfy the stochastic constraint, being โˆ€๐‘–, โˆ‘ ๐‘ ๐‘—=1 ๐‘Ž๐‘–๐‘— = 1. Concerning the coefficients ๐œ‡๐‘—๐‘š, U ๐‘—๐‘š, and ๐‘๐‘—๐‘š relative to the observation density parameters (see equation (2)), they are estimated from the multidimensional features signal X(๐‘ก) for each segment (state) 1 โ‰ค ๐‘— โ‰ค ๐‘โˆ— with the following equations: โˆ€๐‘š, ๐œ‡ ๐‘—๐‘š = X(๐‘ก), ๐‘ก โˆˆ ๐‘—; (20) โˆ€๐‘š, U ๐‘—๐‘š = Cov (X (๐‘ก)) , ๐‘ก โˆˆ ๐‘—, (21) where X(๐‘ก) and Cov(X(๐‘ก)) are, respectively, the mean vector and the covariance matrix of the multidimensional features signal. The mixture coefficients ๐‘๐‘—๐‘š are initialized by using the
  • 7. Advances in Mechanical Engineering 7 uniform distribution, for each ๐‘— and ๐‘š : ๐‘๐‘—๐‘š = 1/๐‘€, ๐‘€ being the number of mixtures. Together with the features extracted from the training histories, the estimated initial parameters ๐œ†0 are used as inputs to the Baum-Welch procedure. The obtained optimal model parameters, ๐œ†โˆ— , of the ๐‘โˆ— -states left-right CHMM, are then used in the following phase 2 for the online adaptive condition assessment. 3.2. Phase 2: Online Adaptive Condition Assessment. The application phase is performed on new incoming data which are referred to as testing signal in Figure 2. These raw data are acquired online, during the normal runtime of the machine under study and are collected one sample a time. When a window of predefined length ๐ฟ of points are acquired, the feature extraction procedure is executed to produce multidimensional features. Two parallel process are then executed: the trained CHMM is used to estimate the current systemโ€™s state, while the online CPD algorithm keeps track of the encountered changes of conditions to possibly detect previously unknown situations. In particular: (i) for condition assessment, the multidimensional fea- tures signal is input to the inference algorithm, using the current ๐‘โˆ— -states left-right CHMM model and its optimal parameters ๐œ†โˆ— , to calculate, via the Viterbi algorithm [14], the most probable state sequence, given the observations up to that moment. The last state in the inferred sequence is then selected as the actual condition of the machine; (ii) for possible HMM state and parameters update, we map the multidimensional features to 1-dimensional signal using the out-of-sample extension according to the kernel PCA parameters calculated in the training phase. The obtained 1-dimensional signal is input to the online CPD algorithm to detect possible changes, which could correspond to state changes. Here, we assume that an unknown situation is experienced by the system when the detected segments (change- points) exceed ๐‘โˆ— , being the number of states of the optimal CHMM of the previous phase or iteration. In such case, when enough samples coming from the new condition are acquired, a new state is added to the CHMM, to obtain a new model with ๐‘โˆ— = ๐‘โˆ— + 1 states. The model parameters, ๐œ†โˆ— , are also adapted to the new situation, by performing the training procedure as described above in phase 1, using the current testing signal together with the training histories. 4. Experimental Results To verify the effectiveness of the proposed methodology, sev- eral experiments have been performed both on simulated and on real data. The simulated data were generated according to the parameters described in the example reported in [25], while condition monitoring data related to bearings taken from the NASA benchmark repository [42] have been used as real data. All the parameters concerning kernel PCA and CPD algorithms have been carefully estimated through linear searches. 4.1. Simulated Experiment. In this section we illustrate the performances of the proposed approach by means of a numerical example proposed in [25]. The goal of these experiments is to study the case in which some of the hidden states of the CHMM are not known in advance, but they appear during the normal functioning of the system. To better illustrate the effectiveness of the proposed methodology we compare it to method of Lee et al. [25], denoted here after as modified HMM approach. Moreover, we evaluate the results according to two criteria: (i) the classification accuracy, defined as the percentage of states correctly classified by the Viterbi algorithm with respect to the true known state sequence, (ii) the detection time, defined as the moment in time in which the unknown state is detected. 4.1.1. Data Generation. A training set composed of 5 histories from a 4-state CHMM has been generated. Supposing that two signals (๐‘‹1 and ๐‘‹2) are monitored, the observation density follows a two-jointly Gaussian distribution. The CHMM parameters used to generate the training data are as follows: ๐œ‹ = [ [ [ [ 1 0 0 0 ] ] ] ] , ๐ด = [ [ [ [ 0.99 0.01 0 0 0 0.99 0.01 0 0 0 0.99 0.01 0 0 0 1 ] ] ] ] , ๐œ‡1 = [ 20 20 ] , ๐œ‡2 = [ 20 35 ] , ๐œ‡3 = [ 35 35 ] , ๐œ‡4 = [ 35 20 ] , ๐‘ˆ1 = [ 20 0 0 20 ] , ๐‘ˆ2 = [ 15 0 0 15 ] , ๐‘ˆ3 = [ 15 โˆ’2 โˆ’2 15 ] , ๐‘ˆ4 = [ 5 0 0 5 ] . (22) For each history, 500 data points have been sampled. One possible example of observable signals is illustrated in Figure 3. To evaluate the behavior of the proposed methodology under the presence of an unknown situation, we sampled the testing case using a CHMM with 5 states: for the first 4 states we used the same parameters as the training set, while a fifth (previously unknown) state has been added, with the following parameters: ๐œ‡5 = [ 50 50 ] , ๐‘ˆ5 = [ 10 3 3 10 ] . (23) By knowing beforehand the true states sequence of the testing case as well as the the switch time to the fifth state (time 511), in the following we compare the classification accuracy and the unknown state detection time obtained by our method and by the modified HMM of Lee et al. [25].
  • 8. 8 Advances in Mechanical Engineering 5 10 15 20 25 30 35 40 45 50 5 10 15 20 25 30 35 40 45 50 Signal X1 SignalX2 State S1 State S2 State S3 State S4 Figure 3: Scatter plot of the first simulated case. For the artificial experiment, 500 observation samples from a 4-state CHMM with 2 monitored signals (๐‘‹1 and ๐‘‹2, following a 2-joint Gaussian observation distribution) have been sampled for 5 histories. 4.1.2. Proposed Approach. For each training history, we applied kernel PCA, using a Gaussian kernel with variance which is equal to 1, in order to reduce the 2-dimensional observation to a 1-dimensional signal. An example of dimen- sionality reduction is illustrated in Figure 4 for the first training history. The obtained 1-dimensional signals have been input to the batch CPD algorithm to perform the initial states numbers estimation. As shown in Figure 5, the 4 segments are correctly detected. The selected parameters of the batch CPD algorithm are as follows: a second degree polynomial fitting function and a stability threshold ๐‘  = 0.01 (see equation (14)). Once the initial number of segments is obtained, we consider a ๐‘โˆ— = 4-state left-right CHMM model for which the parameter ๐œ† has to be estimated using the Baum-Welch algorithm. To evaluate the efficiency of the proposed initial parameters estimation approach, we performed 11 runs of the Baum-Welch algorithm using as initial values the estimated ๐œ†0 as described in Section 3.1 and 10 randomly set ๐œ†0โ€™s. For these experiments, we selected a single Gaussian as observation density of the CHMM and a maximum number of 15 iterations. Figure 6 shows that the learning procedure converges in 4 iterations when the initial parameters are set using the pro- posed approach, compared to the randomly set parameters. Moreover, the final likelihood value is comparable with the best random initialization. The 4-state CHMM trained model is then used to perform the online condition assessment as described in Section 3.2. For these experiments we employed the following parameters for the online CPD algorithm: a first order polynomial fitting function and a likelihood threshold ๐›ฟ = 0.4 (see equation (15)). As illustrated in Figure 7(d), until time 522, the trained 4-state CHMM has been used, and the Viterbi path correctly assesses the four states. At time 512 (denoted by the last cross in Figure 7(c)), the online CPD detects a further state. After 10 more data have been acquired, the CHMM is enriched with a new state, and the training procedure is executed again to estimate the paramters of a 5-state CHMM. From the time point 522 on, the new 5-state CHMM is used, for which the Viterbi path correctly assesses the fifth state until the end of the experiment. The accuracy achieved by the Viterbi path is 97.23% of correctly classified states in the sequence. 4.1.3. Modified HMM Approach [25]. The results above have been compared to the ones using the approach of Lee et al. [25] that, in contrast to the proposed methodology which employs data mining and curve fitting techniques, is based on multivariate statistical tests. The detection of a new state is performed by evaluating a statistically significant shift in the mean of the observation signal using the hotelling multivariate control chart (referred to as hotelling MCC as follows). Hotelling MCC considers as measure of similarity the Mahalanobis distance, denoted as ๐ท2 (O, ๐œ‡๐‘–), between the empirical mean O of the last ๐‘ observations (set to 10 in our experiments) and the ๐œ‡๐‘– parameter relative to the state ๐‘–, which is the most probable state given by the Viterbi path: ๐ท2 (O, ๐œ‡๐‘–) = ๐‘ (O โˆ’ ๐œ‡๐‘–) ๐‘‡ Uโˆ’1 ๐‘– (O โˆ’ ๐œ‡๐‘–) , (24) where U๐‘– is the covariance matrix relative to the most probable state ๐‘–. If this statistic exceeds a certain threshold, called upper control limit (UCL), for more than ๐‘… consecutive times, then a significant change is assumed in the process and a new state is added. ๐‘… can be seen as a measure of the detection sensitivity of the algorithm, if ๐‘… is increased, the detection algorithm may become more robust against false detections caused by the process randomness itself; on the other hand, the algorithm may respond more slowly to the unknown state. Concerning the UCL, since ๐œ‡๐‘– and U๐‘– are known, the ๐ท2 statistic follows the ๐œ’2 distribution with ๐‘ degrees of freedom, where ๐‘ is the number of multivariates, that is, the number of features considered as observations. Thus, the UCL can be obtained as UCL = ๐œ’2 ๐›ผ,๐‘, where ๐›ผ is the risk level. An example of the Lee et al. [25] approach with ๐‘… = 3 is illustrated in Figure 8. A zoom of Figure 8(c), which represents the Mahalanobis distance, is presented in Figure 9. Here it is more clear that, after the three points exceeding the UCL (denoted by out of control) have been detected and the new state is added to the HMM, the Mahalanobis distance comes back again under control. Table 1 summarizes the classification accuracy (percent- age of states correctly classified by the Viterbi algorithm) and the detection time of the new state (knowing that the switch to state 5 happened at time 511 in the ground truth data), for both our proposed approach and the one of [25]. The classification accuracy of the Lee et al. [25] method clearly depends on the choice of the parameter ๐‘…: if it is increased, the detection
  • 9. Advances in Mechanical Engineering 9 0 500 0 3 Sample โˆ’3 Signal X1 (a) Signal ๐‘‹1 versus time 0 500 0 3 Sample โˆ’3 Signal X2 (b) Signal ๐‘‹2 versus time 0 500 0 1 Sample KPCA โˆ’1 (c) Dimensionality reduction with KPCA Figure 4: Dimensionality reduction with KPCA. The information available in the two-dimensional signal shown in plots (a) and (b) is combined in a single signal shown in plot (c) using KPCA. This transformation keeps information related to changes in the original signals. 0 500 0 0.6 Samples KPCA 1-dimensional signal Change-points โˆ’0.6 (a) Offline CPD on training history #1 0 500 0 0.6 Samples KPCA 1-dimensional signal Change-points โˆ’0.6 (b) Offline CPD on training history #2 Figure 5: Initial states number estimation with batch CPD. The 4 segments associated with the 4 CHMM states are correctly detected. algorithm may become more robust against false detections caused by the process randomness itself; on the other hand, the algorithm may respond more slowly to the unknown state. 4.1.4. Discussion. From the obtained results we can conclude that the proposed method (i) is effective in estimating correctly the initial number of HMM states as well as the initial parameters; (ii) improves the convergency speed of the learning algo- rithm; (iii) outperforms the Lee et al. [25] approach in terms of classification accuracy and earlier detection of the new condition. 4.2. Real Data. In this section we describe the application of our approach to a real case study. With the aim of investigating the case in which a failure condition is unknown during the training phase, we consider bearing monitoring data taken from the NASA benchmark repository [42], which have been used in several other works [37, 43โ€“45]. As the available data set does not contain any explicit information about the bearing conditions, we use the detection time of the Table 1: Comparison of the classification accuracy and detection time (real new state switch time = 511). ๐‘… Accuracy Detection time Proposed method 97.23% 512 Hotelling MCC 2 96.31% 531 Hotelling MCC 3 95.23% 541 Hotelling MCC 4 93.69% 551 failure state as a measure of model performance. In addition, we evaluate the reliability of the proposed CPD methodology, for the identification of the initial number of systemโ€™s states, compared to clustering techniques and the Dunnโ€™s Cluster Validity Indexes [28]. Moreover, similarly to the simulated experiment, we compare our approach with the one of Lee et al. [25]. 4.2.1. Data Description. The NASA benchmark repository [42] contains monitoring measurements collected during several run-to-failure tests performed on bearings under constant load conditions on a special designed test rig (see Figure 10). Four (4) bearings (bearing-1,. . .,bearing-4) were installed on a rotating shaft and the test has been performed
  • 10. 10 Advances in Mechanical Engineering 0 5 10 15 Iterations Loglikelihood Proposed initialization โˆ’2 โˆ’1.9 โˆ’1.8 โˆ’1.7 โˆ’1.6 โˆ’1.5 โˆ’1.4 โˆ’1.3 ร—104 (a) Learning curves 1 2 3 4 5 6 7 8 9 10 11 0 Loglikelihood Initializations Random initializations Proposed initialization โˆ’15000 โˆ’10000 โˆ’5000 (b) Final loglikelihood ranking 1 2 3 4 5 6 7 8 9 10 11 0 5 10 15 Numberofiterations Initializations Random initializations Proposed initialization (c) Number of iterations Figure 6: Training Evaluation, the proposed ๐œ†0 estimation versus 10 randomly set ๐œ†0. By using the proposed approach for parameters initialization the Baum-Welch algorithm converges after only 4 iterations (the fastest in this comparison) and the final likelihood value is comparable with the best random initialization. by keeping a constant rotation speed of 2000 rpm, while a radial load of 6000 lbs has been added to the shaft and bearing by a spring mechanism. All the bearings are forcedly lubricated and the failure of a bearing is detected by means of a magnetic plug installed in the oil feedback pipe, that collects debris from the oil as evidence of bearing degradation. When the accumulated debris adhered to the magnetic plug exceeds a certain level, causing an electrical switch to close, a failure of at least one of the four bearings is assumed and the test stops [36]. The dataset consists of vibration signals collected by means of accelerometers. On each bearing housing, two accelerometers have been installed (one on the vertical ๐‘Œ- axis and one on the horizontal ๐‘‹-axis) for a total of 8 accelerometers (see Figure 10). Three run-to-failure tests, test-1 to test-3, have been performed on the test rig. One-second snapshot data were collected at a sampling rate of 20 kHz (20480 samples for each snapshot) every 20 minutes for test-1 and every 10 minutes for test-2 and test-3. At the end of the first run-to-failure experiment a crack was found near the shoulder of bearing-3, while test-2 and test-3 ended up with an outer race defect in bearing-1 and bearing-3, respectively (see Figure 11). In our experiments we considered as training set 4 histories of bearings that did not experience failure (channels ๐‘‹ and ๐‘Œ of the first and the second bearing in test-1), and for testing we used the history of a bearing that ended with failure (the third in test-1). The raw data of the training set have been preprocessed by extracting three time-domain features: mean, root mean square (RMS), and kurtosis within windows of length ๐ฟ = 20480, which is, for simplicity, the same length of the given snapshots. Figure 12 shows the raw vibration data and the resulting RMS signal of length 2156.
  • 11. Advances in Mechanical Engineering 11 0 100 200 300 400 500 600 700 0 20 40 60 Sample SignalX1 (a) Testing signal: observation 1 0 100 200 300 400 500 600 700 0 20 40 60 Sample SignalX2 (b) Testing signal: observation 2 0 100 200 300 400 500 600 700 0 0.5 1 KPCA Sample โˆ’1 โˆ’0.5 1-dimensional signal Change-points Detection time (c) Online change point detection 0 100 200 300 400 500 600 700 1 2 3 4 5 States Sample (d) Viterbi path Figure 7: Online condition assessment using the online CPD algorithm on the testing case. At time 512 the online CPD algorithm detects a further state, which is added to the original 4-state CHMM. From time 522 on, the fifth state, which was unknown during the training phase, is correctly assessed by the Viterbi algorithm. 0 100 200 300 400 500 600 700 0 20 40 60 Sample SignalX1 (a) Testing signal: observation 1 0 100 200 300 400 500 600 700 0 20 40 60 Sample SignalX2 (b) Testing signal: observation 2 0 100 200 300 400 500 600 700 Mahalanobis distance Sample Under control 100 Out of control UCL = 13.81 (๐›ผ = 0.001) (c) Hotelling multivariate control chart 0 100 200 300 400 500 600 700 1 2 4 3 5 States Sample (d) Viterbi path Figure 8: Online condition assessment using the hotelling multivariate control chart. After ๐‘… = 3 consecutive times that the Mahalanobis distance exceeds the upper control limit, an unknown situation is assumed and the fifth state is added to the CHMM. 4.2.2. Proposed Methodology Results. We applied KPCA, with a Gaussian kernel of variance which is equal to 5, to the fea- ture signals. The obtained 1-dimentional signal is illustrated in Figure 13. As it can be seen, the obtained 1-dimensional signal carries the relevant information contained in the three extracted features (mean, RMS, and kurtosis). We then applied the batch CPD algorithm (with a polynomial degree of the fitting function which is equal to 1 and a stability threshold ๐‘  = 0.03) on the 1-dimensional signal, resulting in the detection of 2 states, as illustrated in Figure 14. Since no explicit ground truth about the number of (bear- ings) states has been made available in the benchmark data [42], we propose using a clustering approach to estimate it. K- means clustering, with varying number of clusters from 2 to 10, has been applied on the (historical) 3-dimensional features observations. The generalized Dunnโ€™s validity indexes [28] have been used to determine the optimal number of clusters corresponding to the number of bearing states. Figure 15 illustrates 5 of the 18 validity indexes (๐‘‰11, . . . , ๐‘‰51) of [28] versus the number of clusters. High index values indicate the optimal number of clusters, in this case 2. These results confirm that the proposed CPD algorithm applied to the 1-dimensional signal obtained after KPCA dimensionality reduction produces the correct number of bearing states. For the next steps of bearing condition assessment, a 2-state CHMM is considered, for which, as illustrated in Figure 16, the proposed CHMM parameters initialization improves the convergence speed with a final likelihood com- pared to the value obtained by the best random initialization.
  • 12. 12 Advances in Mechanical Engineering 0 10 20 30 40 50 60 70 Mahalanobisdistance Under control Out of control State change 10โˆ’3 10โˆ’2 10โˆ’1 100 101 102 103 104 Number of sample (sample size = 10) UCL = 13.81 (๐›ผ = 0.001) Figure 9: The hotelling multivariate control chart for testing data. After the three points out of control have been detected and the new state is added to the CHMM, the Mahalanobis distance comes back again under control. Accelerometers Radial load Thermocouples Motor Bearing-1 Bearing-2 Bearing-3 Bearing-4 Figure 10: Testing rig and sensors positions [36]. The 4 bearings installed on a rotating shaft under a constant load and kept at a constant rotation speed for 3 run-to-failure tests. The measurements available are collected by means of 2 accelerometers installed in each bearing case (one on the vertical axis and one on the horizontal axis) for a total of 8 accelerometers. (a) (b) (c) Figure 11: Damaged bearings [36]. At the end of the first test the bearing-3 experienced a crack near the shoulder (a), while at the end of the second and the third test an outer race defect occurred in bearing-1 (b) and bearing-3 (c), respectively.
  • 13. Advances in Mechanical Engineering 13 0 0.5 1 1.5 2 0 0.2 0.4 0.6 Samples (a) Acceleration Raw signal-snapshot 1 0 500 1000 1500 2000 2500 0.12 0.13 0.14 0.15 0.16 0.17 0.18 0.19 0.2 0.21 0.22 Time (b) RMS Feature extraction โˆ’0.8 โˆ’0.6 โˆ’0.4 โˆ’0.2 ร—104 Figure 12: Raw vibration data (a) versus RMS features (b) for bearing-2 in test-1. 0 2156 0 2 4 Mean Time (unit = 20min) โˆ’2 (a) Mean 0 2156 0 5 RMS Time (unit = 20min) โˆ’5 (b) RMS 0 2156 0 5 Kurtosis Time (unit = 20min) โˆ’5 (c) Kurtosis 0 2156 0 0.5 KPCA โˆ’1 โˆ’0.5 Time (unit = 20min) (d) Dimensionality reduction with KPCA Figure 13: Dimensionality reduction with KPCA on the bearing data. 0 2156 0 KPCA 1-dimensional signal Change-points โˆ’0.8 โˆ’0.6 โˆ’0.4 โˆ’0.2 (a) Offline CPD on training history #1 0 2156 โˆ’0.8 โˆ’0.6 โˆ’0.4 โˆ’0.2 0 0.2 KPCA 1-dimensional signal Change-points (b) Offline CPD on training history #2 Figure 14: Initial states number estimation with batch CPD for training histories #1 and #2. Two states are detected, also for the third and the fourth training cases. For the CHMM observation density, we used a mixture of 3 Gaussians, allowing trade-off between precision and computation time. For simulating the online condition assessment using the proposed phase 2 method of Section 3.2, we considered faulty bearing vibration signals to which we applied the feature estimation (mean, RMS, and kurtosis) as described earlier. The resulted features have been successively input to (i) the learned CHMM to calculate the Viterbi path and (ii) to the KPCA dimensionality reduction algorithm for the further
  • 14. 14 Advances in Mechanical Engineering 1 2 3 4 5 6 7 8 9 10 11 0 0.2 0.4 0.6 0.8 1 1.2 1.4 1.6 Number of clusters Indexesvalues (a) Dunnโ€™s indexes for training history 1 1 2 3 4 5 6 7 8 9 10 11 0 0.5 1 1.5 Number of clusters Indexesvalues (b) Dunnโ€™s indexes for training history 2 1 2 3 4 5 6 7 8 9 10 11 0 0.2 0.4 0.6 0.8 1 1.2 1.4 Number of clusters Indexesvalues Max value V11 V21 V31 V41 V51 (c) Dunnโ€™s indexes for training history 3 1 2 3 4 5 6 7 8 9 10 11 0 0.2 0.4 0.6 0.8 1 1.2 1.4 Number of clusters Indexesvalues Max value V11 V21 V31 V41 V51 (d) Dunnโ€™s indexes for training history 4 Figure 15: Dunnโ€™s validity indexes applied on the 4 training histories. application of the online CPD. For the latter, a second degree polynomial fitting function and a threshold ๐›ฟ = 0.6 (equation (15)) have been considered. The described process above is illustrated in Figure 17. At time 1848, the online CPD correctly detects the beginning of a new degradation modality (Figure 17(d)), which will bring the bearing to fail. A new state is then added to the HMM and the classification of the third state is performed correctly until the end. From a visual analysis of the observations (see Figure 17), it is clear that the starting of the severe degradation that leads to the bearing failure is reflected by the kurtosis, which at time around 1830 starts to have a discontinuous pattern. This information is correctly captured by the kernel PC and further detected by the online CPD. 4.2.3. Discussion on Features. In this section we discuss different features with the aim of demonstrating that our approach allows determining bearings condition right up until failure and detects the unknown degradation stage at a very early stage, also using different sets of features. As feature selection is out of the scope of this work, we evaluated the basic state of the art features: mean, RMS, kurtosis, skewness, and wavelet features. Kurtosis Only. By using only one feature, both the offline and the online CPD algorithms are applied directly on the feature signal itself. With kurtosis alone, our method fails during the initial state estimation. As can be noticed from Figure 18, the offline CPD algorithm fails to detect the correct point at which a switch between state 1 and state 2 happens for the third and the fourth training histories. Concerning the online detection of the new state, kurtosis works fine in discovering the bearing degradation at an early stage. In fact, our method detects the unknown condition at sample 1829, 19 samples (6 hours) before the detection using mean, RMS, and kurtosis (see Section 4.2.2). This fact confirms that kurtosis describes well the beginning of the bearing degradation. However, if we base our method only on the kurtosis, we loose the information about the switch that happens between
  • 15. Advances in Mechanical Engineering 15 0 5 10 15 Iterations Loglikelihood Proposed initialization ร—104 โˆ’5 โˆ’4.5 โˆ’4 โˆ’3.5 โˆ’3 โˆ’2.5 โˆ’2 โˆ’1.5 โˆ’1 โˆ’0.5 (a) Learning curves 1 2 3 4 5 6 7 8 9 10 11 0 Loglikelihood Initializations Random initializations Proposed initialization โˆ’9000 โˆ’8000 โˆ’7000 โˆ’6000 โˆ’5000 โˆ’4000 โˆ’3000 โˆ’2000 โˆ’1000 (b) Final loglikelihood ranking 1 2 3 4 5 6 7 8 9 10 11 0 5 10 15 Numberofiterations Initializations Random initializations Proposed initialization (c) Iterations number Figure 16: Training Evaluation, the proposed ๐œ†0 estimation versus 10 randomly set ๐œ†0. The effectiveness of the proposed approach for parameters initialization is confirmed in the real case, since the Baum-Welch algorithm converges after only 3 iterations (the fastest in this comparison) and the final likelihood value is comparable with the best random initialization. state 1 and state 2, which is contained in the mean and in the RMS. As can be noticed from Figure 19, both the CHMM and the online CPD agree on the change of condition right at the beginning (sample 8), which is not the case. As a consequence, by using only kurtosis, we fail to determine the correct bearing condition from the beginning to the end. Mean and Root Mean Square. By considering only dimen- sional features like mean and RMS, we obtain the correct results concerning the initial state estimation with offline CPD, as shown in Figure 20. However, dimensional features do not contain informa- tion about the beginning of the bearing degradation. Hence, the online CPD for the new states detection fails to recognize when the bearing starts to degrade and gives an alert for a new failure condition at sample 2135 (Figure 21). According to the data, it is too late because the bearing is already broken. This is further confirmed from the data description provided by NASA (see Section 4.2.1). We can then conclude that, although combined mean and RMS can determine bearing condition right up until failure, it leads to a late alert of the bearing failure, missing the desirable characteristic of condition monitoring of advising the operator before a catastrophic failure occurs. Mean, RMS, and Skewness. By substituting skewness to kurtosis and keeping mean and RMS, Figure 22 shows that the offline CPD applied on the KPCA detects the two states of the training bearings, in the same way as it did when using kurtosis (see Section 4.2.2). Concerning the online identification of the new third state, the online CPD takes advantages from skewness in
  • 16. 16 Advances in Mechanical Engineering 0 2156 0 20 40 Mean โˆ’20 (a) Testing signal: observation 1 0 2156 0 10 20 RMS โˆ’10 (b) Testing signal: observation 2 0 2156 0 5 10 15 Kurtosis โˆ’5 (c) Testing signal: observation 3 0 1848 2156 0 0.5 1 KPCA 1-dimensional signal Change-points Detection time (d) Online change point detection 0 1848 2156 1 2 3 States (e) Viterbi path Figure 17: Online condition assessment using the proposed approach. The beginning of a new degradation modality is detected by the online CPD algorithm at an early stage (time 1848), which allowed augmenting the CHMM with a new state. The augmented CHMM correctly detects the degraded state of the bearing (Viterbi path). 0 0 2 4 Change-points 2156 Kurtosis 1-dimensional signal โˆ’2 (a) Offline CPD on training history #3 0 2156 0 2 4 Kurtosis 1-dimensional signal Change-points โˆ’2 (b) Offline CPD on training history #4 Figure 18: Initial states number estimation with batch CPD for training histories #3 and #4 based on kurtosis only. The method fails to detect the correct moment in which the state change occurs. terms of earlier detection of the bearing degradation and slightly outperforms the results obtained in Section 4.2.2. In fact, it detects the third state at sample 1824, 8 hours before. However, as shown in Figure 23, the Viterbi path of the CHMM fails in detecting the correct state at the end. Hence, we can conclude that, despite the earlier detection of the bearing degradation, the configuration mean, RMS, and skewness is not appropriate to determine bearing condition right up until failure. Wavelet Packet Transform. Finally, we tried our method on time-frequency domain features extracted using the wavelet packet transform (WPT). As described in the work of Pandya et al. [46], we combined the energy and the kurtosis of wavelet coefficients at the second level of the wavelet tree. Figure 24 shows that offline CPD works fine in detecting the two states of the training bearings, while Figure 25 shows that our methodology detects the new bearing degradation state at sample 1831 when applied on time-frequency features, performing slightly better (around 5 hours earlier detection) than when mean, RMS, and kurtosis are used. From the feature study, one can conclude that the pro- posed mean, RMS, and kurtosis features give good results for condition monitoring of bearings. However, an extended
  • 17. Advances in Mechanical Engineering 17 0 2156 0 5 10 15 Kurtosis โˆ’5 (a) Testing signal: observation 1 0 1829 2156 0 5 10 15 Kurtosis 1-dimensional signal Change-points Detection time โˆ’5 (b) Online change point detection 0 1829 2156 1 2 3 States (c) Viterbi path Figure 19: Online condition assessment using the proposed approach on kurtosis only. Although the beginning of the bearing degradation is correctly detected at an early stage, kurtosis does not contain the information on the switch from state 1 to state 2 and, consequently, the method fails in determining the correct bearing condition. 0 2156 0 0.2 KPCA 1-dimensional signal Change-points โˆ’0.8 โˆ’0.6 โˆ’0.4 โˆ’0.2 (a) Offline CPD on training history #1 0 2156 0 0.2 KPCA 1-dimensional signal Change-points โˆ’0.8 โˆ’0.6 โˆ’0.4 โˆ’0.2 (b) Offline CPD on training history #2 Figure 20: Initial states number estimation with batch CPD for training histories #1 and #2 based on mean and RMS. Two states are correctly detected. 0 2156 0 20 40 Mean โˆ’20 (a) Testing signal: observation 1 0 2156 0 10 20 RMS โˆ’10 (b) Testing signal: observation 2 0 2135 0 0.5 1 KPCA 1-dimensional signal Change-points Detection time โˆ’0.5 (c) Online change point detection 0 2135 1 2 3 States (d) Viterbi path Figure 21: Online condition assessment using the proposed approach on mean and RMS. The online CPD detects a new failure condition too late, when the bearing is already broken.
  • 18. 18 Advances in Mechanical Engineering 0 2156 0 0.2 0.4 0.6 0.8 KPCA 1-dimensional signal Change-points โˆ’0.2 (a) Offline CPD on training history #1 0 2156 0 0.2 0.4 0.6 0.8 KPCA 1-dimensional signal Change-points โˆ’0.2 (b) Offline CPD on training history #2 Figure 22: Initial states number estimation with batch CPD for training histories #1 and #2 based on mean, RMS and skewness. Two states are correctly detected. 0 2156 0 20 40 Mean โˆ’20 (a) Testing signal: observation 1 0 2156 0 10 20 RMS โˆ’10 (b) Testing signal: observation 2 0 2156 0 10 Skewness โˆ’20 โˆ’10 (c) Testing signal: observation 3 0 1824 2156 0 0.5 1 KPCA 1-dimensional signal Change-points Detection time โˆ’0.5 (d) Online change point detection 0 1824 2156 1 2 3 States (e) Viterbi path Figure 23: Online condition assessment using the proposed approach on mean, RMS, and skewness. Although The online CPD detects a new failure condition at an early stage, the CHMM fails to assess the correct state sequence at the end. 0 2156 0 0.2 0.4 0.6 KPCA 1-dimensional signal Change-points โˆ’0.4 โˆ’0.2 (a) Offline CPD on training history #1 0 2156 0 0.2 0.4 0.6 KPCA 1-dimensional signal Change-points โˆ’0.4 โˆ’0.2 (b) Offline CPD on training history #2 Figure 24: Initial states number estimation with batch CPD for histories #1 and #2 using the energy and the kurtosis of the second level wavelet packet transform coefficients. Two states are correctly detected.
  • 19. Advances in Mechanical Engineering 19 0 2156 0 10 20 30 WPT-coefficient 1 energy โˆ’10 (a) Testing signal: observation 1 0 2156 0 10 20 30 WPT-coefficient 2 energy โˆ’10 (b) Testing signal: observation 2 0 2156 0 5 10 15 WPT-coefficient 1 kurtosis โˆ’5 (c) Testing signal: observation 3 0 2156 0 10 20 30 WPT-coefficient 2 kurtosis โˆ’10 (d) Testing signal: observation 4 0 1831 2156 0 0.2 0.4 0.6 KPCA 1-dimensional signal Change-points Detection time โˆ’0.4 โˆ’0.2 (e) Online change point detection 0 1831 2156 1 2 3 States (f) Viterbi path Figure 25: Online condition assessment using the proposed approach on the energy and the kurtosis of WPT treeโ€™s second level coefficients. The results obtained with the proposed time domain features are confirmed by using time-frequency domain features. 0 2156 0 20 40 Mean โˆ’20 (a) Testing signal: observation 1 0 2156 0 10 20 RMS โˆ’10 (b) Testing signal: observation 2 0 2156 0 5 10 15 Kurtosis โˆ’5 (c) Testing signal: observation 3 0 2061 Mahalanobis distance Under control Out of control 10โˆ’10 10โˆ’5 100 105 UCL = 16.27 (๐›ผ = 0.001) (d) Hotelling multivariate control chart 0 2061 1 2 3 States (e) Viterbi path Figure 26: Online condition assessment using the hotelling multivariate control chart of [25]. The detection of the degradation modality occurs at time 2061 (71 hours later than the proposed approach).
  • 20. 20 Advances in Mechanical Engineering 0 206 Mahalanobis distance Under control Out of control State change Number of sample (sample size = 10) 10โˆ’1 100 101 102 103 104 UCL = 16.26 (๐›ผ = 0.001) Figure 27: The hotelling multivariate control chart for testing data. The observations modeling through a joint Gaussian distribution is highly sensitive to small differences between the mean of the testing and the training observation signals. Due to this small difference, several out of control points are observed at the beginning, that would have wrongly brought the method to detect an unknown situation. analysis of feature extraction and selection should be made depending on the problem in hand and on the used sensors. 4.2.4. Modified HMM Results. Figure 26 illustrates the approach of Lee et al. [25], in which the detection sensitivity parameter ๐‘… has been set as ๐‘… = 6. As it can be seen, the detection of the unknown condition happens at time 2061. Figure 27, which represents the Mahalanobis distance versus number of sample, illustrates the limits of the approach and its observations modeling, being a joint Gaussian distri- bution and not a mixture of Gaussians. Hotelling MCC is in fact highly sensitive to small differences between the mean of the testing and the training observation signals. Hence, it is applicable only when training and testing observations are very similar. As demonstrated by the first part of Figure 27, where there are several points out of control at the beginning because the mean of the testing signals in this portion is slightly different from the training set. Although such method would have wrongly detected an unknown situation (a new state would), it has been forced to avoid this behavior by adjusting the parameter ๐‘…. 4.2.5. Discussion. The above results highlight that the pro- posed methodology (i) overcomes the limitations imposed by the hotelling MCC since a complex density can be used to model the observations; (ii) is parametric-free; the only parameters are the thresh- olds of the CPD algorithms, which could be set according to the sensor measurements during the training phase; (iii) is effective in detecting failure modalities at an early stage. 5. Conclusion and Future Work In this paper, a new approach to the online adaptive condition monitoring has been introduced. The methodology is based on continuous observation hidden Markov models that, combined with the change point detection algorithm, are enhanced to deal with variable state space and are used for condition assessment and unknown state detection. More- over, a new approach for the HMM initial states number estimation has been provided, together with a method for the parameters initialization. The obtained results illustrate that the proposed methodology (1) correctly estimates the initial number of states for the HMM; (2) improves the execution speed of the training procedure in terms of iterations number reduction with a final likelihood comparable with the best random initialization; (3) classify the current state of the system more effectively than the modified HMM coupled with the hotelling multivariate control chart; (4) detects an anomalous condition or an unknown state at an early stage; and (5) changes the structure and the parameters of the HMM to adapt the model to the new situations. Our approach has been implemented in Matlab (version 8.3.0), under Windows 7 Enterprise 64-bit Operating System on a desktop computer with an Intel Core i7 3.07 GHz CPU and 16 GB RAM. In terms of computational cost, phase 1, that is, the training phase, takes 296 seconds to learn the CHMM parameters from historical data. It has to be noted that this phase is applied only once for a given machine. Phase 2, that is, the online adaptive condition assessment, takes, on average, 0.2 seconds each time a new snapshot of data is collected (20480 samples). As a consequence, the algorithm can be used in real time condition assessment. In terms of hardware cost, a 164 kBytes buffer is used to store the incoming data, while for the CHMM the memory consumption is proportional to the model structure. In general terms, (๐‘ โˆ’ 1) + (๐‘ โˆ’ 1) โ‹… ๐‘ + ๐ท โ‹… ๐‘ โ‹… ๐‘€ + ๐ท โ‹… ๐ท โ‹… ๐‘ โ‹… ๐‘€ + (๐‘€ โˆ’ 1) โ‹… ๐‘ double-precision memory is required, with ๐‘ the number of states, ๐‘€ the number of Gaussian mixtures in the observation model, and ๐ท the number of considered features. The present work highlights through experiments per- formed on simulated and real data that the presented methodology can be used in practice as a decision support tool to corrective maintenance strategies. However, it has some problems that must be considered. The dimensionality reduction through kernel principal cmponent analysis causes an inevitable loss of information in the signal used for the change point detection algorithm. To avoid this problem, future researches will consider the application of a modified multidimensional CPD algorithm performed directly in the features hyperplane. Moreover, by using left-right HMMs, it has been assumed that the degradation pattern of the system is characterized by a monotonically increasing evolution which gradually crosses all the wearing phases until the complete failure. Despite being reasonable for bearings, this assumption does not hold for several real systems. In order to render the methodology more generally applicable, it is necessary to consider in future work a method which is able to distinguish also intermediate states and to recognize if the
  • 21. Advances in Mechanical Engineering 21 new detected state has been already encountered. Although the present-time researches are oriented to clustering tech- niques coupled with statistical similarity measures, other methods will be taken into consideration, with the final goal to use as degradation model the more general full ergodic HMM. Conflict of Interests The authors declare that there is no conflict of interests regarding the publication of this paper. References [1] T. Honkanen, Modelling industrial maintenance systems and the effects of automatic condition monitoring [Ph.D. thesis], Helsinki University of Technology, Information and Computer Systems in Automation, Helsinki, Finland, 2004. [2] R. Dekker, โ€œApplications of maintenance optimization models: a review and analysis,โ€ Reliability Engineering & System Safety, vol. 51, no. 3, pp. 229โ€“240, 1996. [3] L. Solomon, โ€œEssential elements of maintenance improvement programs,โ€ in Production Control in the Process Industry: Pro- duction Control in the Process Industry No 8: Proceedings of the IFAC Workshop, Osaka, 29-31 and Kariya, Japan, 1-2 Nov. 1989 (Hardback), E. Oshima and C. van Rijn, Eds., pp. 195โ€“198, Pergamon Press, Oxford, UK, 1989. [4] H. Wang, โ€œA survey of maintenance policies of deteriorating systems,โ€ European Journal of Operational Research, vol. 139, no. 3, pp. 469โ€“489, 2002. [5] Y. Peng, M. Dong, and M. J. Zuo, โ€œCurrent status of machine prognostics in condition-based maintenance: a review,โ€ The International Journal of Advanced Manufacturing Technology, vol. 50, no. 14, pp. 297โ€“313, 2010. [6] A. Sfetsos, โ€œShort-term load forecasting with a hybrid clustering algorithm,โ€ IEE Proceedings: Communications, vol. 150, no. 3, pp. 257โ€“262, 2003. [7] R. Vilalta and M. Sheng, โ€œPredicting rare events in temporal domains,โ€ in Proceedings of the 2nd IEEE International Confer- ence on Data Mining (ICDM โ€™02), pp. 474โ€“481, December 2002. [8] C. Domeniconi, C.-S. Perng, R. Vilalta, and S. Ma, โ€œA classi- fication approach for prediction of target events in temporal sequences,โ€ in Proceedings of the 6th European Conference on Principles of Data Mining and Knowledge Discovery (PKDD โ€™02), pp. 125โ€“137, Springer, London, UK, 2002. [9] K. Medjaher, J.-Y. Moya, and N. Zerhouni, โ€œFailure prognostic by using dynamic Bayesian Networks,โ€ in Dependable Control of Discrete Systems, M. P. Fanti and M. Dotoli, Eds., vol. 1, IFAC, Bari, Italy, 2009, https://hal.archives-ouvertes.fr/hal-00402938/en/. [10] P. Baruah and R. B. Chinnam, โ€œHMMs for diagnostics and prognostics in machining processes,โ€ International Journal of Production Research, vol. 43, no. 6, pp. 1275โ€“1293, 2005. [11] X. Zhang, R. Xu, C. Kwan, S. Y. Liang, Q. Xie, and L. Haynes, โ€œAn integrated approach to bearing fault diagnostics and prognostics,โ€ in Proceedings of the IEEE American Control Conference (ACC โ€™05), vol. 4, pp. 2750โ€“2755, June 2005. [12] Z. Li, Z. Wu, Y. He, and C. Fulei, โ€œHidden Markov model- based fault diagnostics method in speed-up and speed-down process for rotating machinery,โ€ Mechanical Systems and Signal Processing, vol. 19, no. 2, pp. 329โ€“339, 2005. [13] H. M. Ertunc, K. A. Loparo, and H. Ocak, โ€œTool wear condition monitoring in drilling operations using hidden Markov models (HMMs),โ€ International Journal of Machine Tools and Manufac- ture, vol. 41, no. 9, pp. 1363โ€“1384, 2001. [14] L. R. Rabiner, โ€œA tutorial on hidden Markov models and selected applications in speech recognition,โ€ Proceedings of the IEEE, vol. 77, no. 2, pp. 257โ€“286, 1989. [15] N. Ye, Y. Zhang, and C. M. Borror, โ€œRobustness of the Markov- chain model for cyber-attack detection,โ€ IEEE Transactions on Reliability, vol. 53, no. 1, pp. 116โ€“123, 2004. [16] C. Bunks, D. McCarthy, and T. Al-Ani, โ€œCondition-based main- tenance of machines using hidden Markov models,โ€ Mechanical Systems and Signal Processing, vol. 14, no. 4, pp. 597โ€“612, 2000. [17] M. Bjerkeseth, Using hidden markov models for fault diagnostics and prognostics in condition based maintenance systems [Ph.D. dissertation], University of Agder, 2010. [18] T. Liu, J. Chen, and G. Dong, โ€œApplication of continuous hide markov model to bearing performance degradation assess- ment,โ€ in Proceedings of the 24th International Congress on Condition Monitoring and Diagnostics Engineering Management (COMADEM โ€™11), pp. 166โ€“172, 2011. [19] H. Ocak and K. A. Loparo, โ€œA new bearing fault detection and diagnosis scheme based on hidden Markov modeling of vibration signals,โ€ in Proceedings of the IEEE Interntional Conference on Acoustics, Speech, and Signal Processing, pp. 3141โ€“ 3144, IEEE Computer Society, Washington, DC, USA, May 2001. [20] G. Almeida and S. Park, โ€œProcess monitoring in chemical industriesโ€”a hidden Markov model approach,โ€ in Proceedings of the 18th European Symposium on Computer Aided Process Engineering (ESCAPE โ€™18), 2008. [21] I. Strachan and D. Clifton, A Hidden Markov Model for Condi- tion Monitoring of a Manufacturing Drilling Process, vol. 1, (IET) Condition Monitoring, Dublin, Ireland, 2009. [22] G. Almeida and S. Park, โ€œFault detection in continuous indus- trial chemical processes: a new approach using the hidden Markov modeling. Case study: a boiler from a Brazilian cellu- lose pulp mill,โ€ in Intelligent Data Engineering and Automated Learningโ€”IDEAL 2012, vol. 7435, pp. 743โ€“752, Springer, Berlin, Germany, 2012. [23] T. Liu, J. Chen, X. N. Zhou, and W. B. Xiao, โ€œBearing per- formance degradation assessment using linear discriminant analysis and coupled HMM,โ€ Journal of Physics: Conference Series, vol. 364, no. 1, Article ID 012028, 2012. [24] G. D. Forney Jr., โ€œThe Viterbi algorithm,โ€ Proceedings of the IEEE, vol. 61, pp. 268โ€“278, 1973. [25] S. Lee, L. Li, and J. Ni, โ€œOnline degradation assessment and adaptive fault detection using modified hidden markov model,โ€ Journal of Manufacturing Science and Engineering, vol. 132, no. 2, Article ID 021010, 11 pages, 2010. [26] F. Cartella, T. Liu, S. Meganck, J. Lemeire, and H. Sahli, โ€œOnline adaptive learning of left-right continuous HMM for bearings condition assessment,โ€ Journal of Physics: Conference Series, vol. 364, no. 1, Article ID 012031, 2012. [27] V. Guralnik and J. Srivastava, โ€œEvent detection from time series data,โ€ Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD โ€™99), pp. 33โ€“42, 1999. [28] J. C. Bezdek and N. R. Pal, โ€œSome new indexes of cluster validity,โ€ IEEE Transactions on Systems, Man, and Cybernetics Part B: Cybernetics, vol. 28, no. 3, pp. 301โ€“315, 1998.
  • 22. 22 Advances in Mechanical Engineering [29] B. Schยจolkopf, A. Smola, and K.-R. Mยจuller, โ€œNonlinear compo- nent analysis as a kernel eigenvalue problem,โ€ Neural Computa- tion, vol. 10, no. 5, pp. 1299โ€“1319, 1998. [30] S. Mika, B. Scholkopf, A. Smola, K. R. Muller, M. Scholz, and G. Ratsch, โ€œKernel pca and de noising in feature spaces,โ€ in Advances in Neural Information Processing Systems, vol. 11, pp. 536โ€“542, MIT Press, 1999. [31] S. Romdhani, S. Gong, and A. Psarrou, โ€œA multi-view nonlinear active shape model using kernel pca,โ€ in Proceedings of the British Machine Vision Conference, pp. 483โ€“492, 1999. [32] P. Oโ€™Donnell, โ€œReport of large motor reliability survey of indus- trial and commercial installations, part II,โ€ IEEE Transactions on Industry Applications, vol. 21, no. 4, pp. 865โ€“872, 1985. [33] J. Q. Li and A. R. Barron, โ€œMixture density estimation,โ€ in Advances in Neural Information Processing Systems 12, pp. 279โ€“ 285, MIT Press, Cambridge, Mass, USA, 1999. [34] A. P. Dempster, N. M. Laird, and D. B. Rubin, โ€œMaximum likelihood from incomplete data via the EM algorithm,โ€ Journal of the Royal Statistical Society B: Methodological, vol. 39, no. 1, pp. 1โ€“38, 1977. [35] R. B. Randall, Frequency Analysis, Bruel and Kjaer, 1987. [36] H. Qiu, J. Lee, J. Lin, and G. Yu, โ€œWavelet filter-based weak signature detection method and its application on rolling element bearing prognostics,โ€ Journal of Sound and Vibration, vol. 289, no. 4-5, pp. 1066โ€“1090, 2006. [37] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, โ€œEstimation of the remaining useful life by using wavelet packet decomposition and hmms,โ€ in Proceedings of the IEEE Aerospace Conference AIAA, pp. 1โ€“10, IEEE Computer Society, Los Alamitos, Calif, USA, 2011. [38] P. Burge and J. Shaw-Taylor, โ€œDetecting cellular fraud using adaptive prototypes,โ€ in Proceedings of the AI Approaches to Fraud Detection and Risk Management, pp. 9โ€“13, 1997. [39] K. Yamanishi, J. Takeuchi, G. Williams, and P. Milne, โ€œOn- line unsupervised outlier detection using finite mixtures with discounting learning algorithms,โ€ in Proceedings of the 6th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD โ€™00), pp. 250โ€“254, ACM Press, 2000. [40] T. Fawcett and F. Provost, โ€œActivity monitoring: noticing inter- esting changes in behavior,โ€ in Proceedings of the 5th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD โ€™99), pp. 53โ€“62, 1999. [41] V. Cherkassky and F. Mulier, Learning from Data, Wiley- Interscience, New York, NY, USA, 1998. [42] NSF-I/UCRC, โ€œCenter for intelligent maintenance systems. prognostic data repository: Bearing data set,โ€ http://ti.arc.nasa .gov/tech/dash/pcoe/prognostic-data-repository/. [43] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, โ€œA mix ture of gaussians hidden markov model for failure diagnostic and prognostic,โ€ in Proceedings of the 6th Annual IEEE International Conference on Automation Science and Engineering (CASE โ€™10), pp. 338โ€“343, IEEE, August 2010. [44] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, โ€œHidden Markov Models for failure diagnostic and prognostic,โ€ in Proceedings of the Prognostics and System Health Management Conference (PHM-Shenzhen โ€™11), pp. 1โ€“8, Shenzhen, China, May 2011. [45] D. A. Tobon-Mejia, K. Medjaher, N. Zerhouni, and G. Tripot, โ€œA data-driven failure prognostics method based on mixture of gaussians hidden markov models,โ€ IEEE Transactions on Reliability, vol. 61, no. 2, pp. 491โ€“503, 2012. [46] D. Pandya, S. Upadhyay, and S. P. Harsha, โ€œAnn based fault diagnosis of rolling element bearing using time-frequency domain feature,โ€ International Journal of Engineering Science and Technology, vol. 4, pp. 2878โ€“2886, 2012.
  • 23. Submit your manuscripts at http://www.hindawi.com VLSI Design Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 International Journal of Rotating Machinery Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Hindawi Publishing Corporation http://www.hindawi.com Journal of Engineering Volume 2014 Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Shock and Vibration Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Mechanical Engineering Advances in Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Civil Engineering Advances in Acoustics and Vibration Advances in Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Electrical and Computer Engineering Journal of Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Distributed Sensor Networks International Journal of The Scientific World JournalHindawi Publishing Corporation http://www.hindawi.com Volume 2014 Sensors Journal of Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Modelling & Simulation in Engineering Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Active and Passive Electronic Components Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Chemical Engineering International Journal of Control Science and Engineering Journal of Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Antennas and Propagation International Journal of Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Navigation and Observation International Journal of Advances in OptoElectronics Hindawi Publishing Corporation http://www.hindawi.com Volume 2014 Robotics Journal of Hindawi Publishing Corporation http://www.hindawi.com Volume 2014