Measurement Science and Technology
ACCEPTED MANUSCRIPT
An optimal deep sparse autoencoder with gated recurrent unit for rolling
bearing fault diagnosis
To cite this article before publication: Ke Zhao et al 2019 Meas. Sci. Technol. in press https://doi.org/10.1088/1361-6501/ab3a59
Manuscript version: Accepted Manuscript
Accepted Manuscript is “the version of the article accepted for publication including all changes made as a result of the peer review process,
and which may also include the addition to the article by IOP Publishing of a header, an article ID, a cover sheet and/or an ‘Accepted
Manuscript’ watermark, but excluding any other editing, typesetting or other changes made by IOP Publishing and/or its licensors”
This Accepted Manuscript is © 2019 IOP Publishing Ltd.
During the embargo period (the 12 month period from the publication of the Version of Record of this article), the Accepted Manuscript is fully
protected by copyright and cannot be reused or reposted elsewhere.
As the Version of Record of this article is going to be / has been published on a subscription basis, this Accepted Manuscript is available for reuse
under a CC BY-NC-ND 3.0 licence after the 12 month embargo period.
After the embargo period, everyone is permitted to use copy and redistribute this article for non-commercial purposes only, provided that they
adhere to all the terms of the licence https://creativecommons.org/licences/by-nc-nd/3.0
Although reasonable endeavours have been taken to obtain all necessary permissions from third parties to include their copyrighted content
within this article, their full citation and copyright line may not be present in this Accepted Manuscript version. Before using any content from this
article, please refer to the Version of Record on IOPscience once published for full citation and copyright details, as permissions will likely be
required. All third party content is fully copyright protected, unless specifically stated otherwise in the figure caption in the Version of Record.

An optimal deep sparse autoencoder with gated recurrent
unit for rolling bearing fault diagnosis
Ke Zhao, Hongkai Jiang¹, Xingqiu Li, Ruixin Wang
School of Aeronautics, Northwestern Polytechnical University, 710072 Xi’an, China
Abstract: Effective fault diagnosis of rolling bearings is of great importance in guaranteeing the
normal operation of rotating machinery. However, the measured rolling bearing vibration signals
are highly nonlinear and corrupted by background noise, making it hard to obtain representative
fault features. To address this, an optimal fault diagnosis method is proposed in this paper to
diagnose rolling bearing faults accurately and stably. The proposed method contains the
following stages. Firstly, a gated recurrent unit and a sparse autoencoder are combined into a novel
hybrid deep learning model to mine the fault information of rolling bearing
vibration signals directly and effectively. Secondly, the key parameters of the constructed model are
optimized by the grey wolf optimizer algorithm to achieve better diagnosis performance. Finally, the
features obtained by the constructed model are input into the classifier to get the final diagnosis
results. The proposed method is validated using experimental and practical engineering bearing data,
and the results confirm that its diagnosis performance is more effective and robust than that of other methods.
Keywords: Rolling bearing fault diagnosis; Hybrid deep learning model; Gated recurrent unit;
Sparse autoencoder; Grey wolf optimizer
1. Introduction
With the rapid development of society and technology, worsening working environments
and increasing working hours lead to various failures of rotating machinery [1]. Rolling bearings are
a key part of rotating machinery, and their failure may result in immeasurable losses and
catastrophic damage. However, rolling bearing fault diagnosis is difficult, mainly because of
high-intensity working conditions and because the vibration characteristics of a rolling bearing are
affected by local defects, including their edge shape and size [2-4]. Consequently, accurate and stable
diagnosis of rolling bearing faults is a realistic and urgent need in practical engineering.
For decades, vibration mechanism analysis has played a major role in machinery fault
diagnosis [5-7]. Machine equipment is becoming increasingly complex, and the measured
bearing vibration signals are highly nonlinear and non-stationary with much noise. Therefore, how
to effectively obtain the fault features from the measured bearing vibration signals is the crux of
bearing fault diagnosis [8]. Currently, intelligent diagnosis methods are widely used for rolling
bearings because they do not require abundant expertise and present diagnosis results
automatically [9, 10]. Artificial neural network (ANN) and support vector machine (SVM) are
the two most prevalent intelligent diagnosis methods in bearing fault diagnosis [11]. Unal et al.
extracted the features of vibration signals with the Hilbert transform and then used an ANN
to classify the processed features [12]. Zarei et al. obtained the domain features of
bearing data and applied an ANN to get the diagnosis results [13]. Yan et al.
captured the multi-domain features of bearing signals and developed an optimized SVM
to detect faults [14]. Zheng et al. extracted the features of raw vibration signals and
used an SVM to classify bearing fault conditions [15]. To summarize, ANN
and SVM have made progress in the field of bearing fault diagnosis, but representative features
must first be extracted from the raw vibration signals, and sensitive features must then be selected
from them before diagnosis. Moreover, the diagnosis accuracy relies heavily on the extracted and
selected features. Extracting features from raw vibration signals requires advanced signal processing
techniques, and the quality of the selected features depends strongly on engineering experience
[16, 17]. All of these factors limit the wide application of ANN and SVM, so it is essential
to develop a novel method that can directly and effectively extract the fault features of raw
vibration signals and does not require abundant engineering experience.
1 Corresponding author
Email address: jianghk@nwpu.edu.cn
Page 1 of 23 AUTHOR SUBMITTED MANUSCRIPT - MST-108890.R1
Deep learning has been a focus of research in recent years; it assembles multi-layer data
processing units into deep architectures to extract multiple levels of data abstraction [18]. So far
deep learning has made great achievements in natural language processing, speech recognition,
medical image analysis and so on [19-21]. Deep learning methods therefore hold great
potential to remove the reliance on advanced signal processing techniques and manual
feature extraction [22]. Up to now, autoencoder (AE), convolutional neural network (CNN), deep
belief network (DBN) and recurrent neural network (RNN) are the four most commonly used deep
learning methods. Shao et al. constructed a novel deep AE for rotating machinery
fault diagnosis [23]. Meng et al. proposed an enhancement denoising AE for rolling
bearing fault diagnosis [24]. Lu et al. designed a hierarchical CNN
for rolling bearing fault diagnosis [25]. Huang et al. adopted an improved CNN
for bearing diagnosis [26]. Shao et al. applied a DBN with
dual-tree complex wavelet packet for bearing fault diagnosis [27]. Tang et al. developed an adaptive
DBN for rotating machinery fault diagnosis [28]. Jiang et al. used an RNN
to classify bearing fault conditions [29]. Zhao et al. proposed a novel
RNN for machine health monitoring [30]. According to the above
literature review, previous fault diagnosis methods mainly focus on single deep learning models.
However, single deep learning models struggle to deal with increasingly complex diagnosis issues,
so they usually need advanced signal processing techniques or other model improvement methods.
This paper is devoted to developing a novel method to
tackle increasingly complex diagnosis issues that relies only on deep learning models,
without advanced signal processing techniques or model improvement methods. Up
to now, AE has been the most prevalent deep learning method for rolling bearing fault diagnosis
because of its simple structure, ease of extension and powerful feature learning ability [31]. Sparse
autoencoder (SAE), a variant of AE, can learn more robust feature representations than the basic
AE [32]. The measured bearing vibration signals are time series data, and gated recurrent unit (GRU),
a novel variant of RNN, shows extraordinary ability in extracting the time relevance of sequential
signals [33]. Thus, to maximize the advantages of GRU and SAE, a hybrid deep learning model that
combines GRU and SAE is constructed in this paper. GRU is first used to extract the features
of bearing sequential signals, and then the extracted features are input into SAE to obtain more
robust feature representations. Finally, the robust features are input into the classifier to obtain the
final diagnosis results. As is well known, tuning the parameters of deep learning models
is time-consuming and laborious, so it is essential and meaningful to obtain the key
parameters of the constructed model automatically and quickly [34]. The grey wolf optimizer (GWO)
algorithm is a novel optimization algorithm that is flexible, simple, robust and easy to
implement, and it has been successfully applied to bearing fault diagnosis [35, 36]. Thus, the GWO
algorithm is applied to automatically and quickly obtain the key parameters of the constructed model.
In this paper, an optimal deep sparse autoencoder (SAE) with gated recurrent unit (GRU) is
proposed for rolling bearing fault diagnosis. The proposed method is validated using
experimental and practical engineering bearing data, and the results confirm that its diagnosis
performance is more effective and robust than that of other methods. The main
contributions of our work can be summarized as follows.
(1) Our proposed framework can be regarded as a hybrid model for automatic feature learning
based on deep learning models. The hybrid model is constructed from GRU and SAE to directly
extract representative fault features from raw vibration signals. The extracted fault
features are then input into the classifier to obtain the final diagnosis results.
(2) The hybrid model has many parameters to tune, and tuning them is
time-consuming and laborious. Thus, the key parameters of the hybrid model are
obtained by the GWO algorithm to save time and to achieve better diagnosis performance.
(3) Comprehensive experimental studies cover experimental bearing fault diagnosis and practical
engineering bearing fault detection. The effectiveness and generalization capability of the
proposed method are verified.
The organization of the remainder is as follows: The basic theory of the constructed model is
described in Section 2. Section 3 introduces the proposed method in detail. The proposed model is
verified by the experimental bearing data in Section 4. Section 5 gives the practical engineering
application of the proposed method. The general conclusion is given in Section 6.
2. The basic theory of the constructed model
This part is mainly to illustrate the basic theory of the constructed model. Section 2.1
introduces the basic theory of RNN and Section 2.2 describes the basic theory of GRU. The principle
of AE is illustrated in Section 2.3.
2.1 The basic theory of recurrent neural network
Unlike other deep learning models, RNN builds dependencies between its hidden units through a
directed cycle [37]. In other words, the output of a hidden layer at time t-1 is fed back into the
same layer at time t. Fig. 1 (a) shows the basic architecture of RNN and Fig. 1 (b) shows the
architecture of RNN across a time step.
Fig. 1 (a) the basic architecture of RNN (input, hidden and output layers), (b) the architecture of RNN across a time step (from t-1 to t)
The above-mentioned procedure is presented mathematically by Eq. (1) and Eq. (2):
$$H_t = f_H(U_{iH} X_t + U_{HH} H_{t-1} + b_H) \tag{1}$$
$$y_H = f_0(U_{H0} H_t + b_0) \tag{2}$$
where $f_H$ and $f_0$ are the activation functions of the hidden layer and output layer, $U_{iH}$, $U_{HH}$ and
$U_{H0}$ are the weight matrices, $b_H$ and $b_0$ are the bias vectors of the hidden layer and output layer, and $H_t$
and $y_H$ are the outputs of the hidden layer and output layer.
2.2 The basic theory of GRU
The basic theory of RNN is presented in Section 2.1. RNN has a powerful ability to extract
the time relevance of sequential signals. However, the vanishing or exploding gradient problem
greatly limits the application of RNN [30]. Thus, GRU, a novel variant of RNN that addresses
this problem by using a gating mechanism, is used to extract the features of raw vibration signals
in this paper [38]. Fig. 2 shows the structure of GRU.
Fig. 2 The structure of GRU, with input Xt, previous state Ht-1, reset gate Rt, update gate Zt, candidate activation Ct and output Ht [38].
As Fig. 2 shows, the main difference between GRU and RNN is that GRU has two gates, the reset
gate R and the update gate Z. The reset gate determines how the inputs and the previously stored
information are combined. The update gate controls how much of the previously stored information
is retained. The formulas are as follows:
$$Z_t = \sigma(U_Z X_t + V_Z H_{t-1} + b_Z) \tag{3}$$
$$R_t = \sigma(U_R X_t + V_R H_{t-1} + b_R) \tag{4}$$
$$C_t = \tanh(U X_t + V (R_t \odot H_{t-1}) + b) \tag{5}$$
$$H_t = (1 - Z_t) \odot H_{t-1} + Z_t \odot C_t \tag{6}$$
where $H_t$ and $C_t$ are the activation and the candidate activation at time t, $Z_t$ and $R_t$ denote the update
and reset gates, $\sigma$ and $\tanh$ are the sigmoid and hyperbolic tangent functions, $U_Z$, $U_R$, $U$, $V_Z$, $V_R$
and $V$ are weight matrices, $b_Z$, $b_R$ and $b$ are bias parameters, and $\odot$ is
the element-wise product.
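To make Eqs. (3)-(6) concrete, the following is a minimal sketch of a single GRU step in plain Python. The function names (`sigmoid`, `matvec`, `gru_step`) and the dictionary layout of the parameters are assumptions made for illustration; they are not taken from the paper, and a practical implementation would normally use a numerical library.

```python
import math


def sigmoid(x):
    """Logistic function used by the update and reset gates."""
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def gru_step(x_t, h_prev, p):
    """Compute H_t from X_t and H_{t-1} following Eqs. (3)-(6).

    p is a hypothetical dict holding U_Z, V_Z, b_Z, U_R, V_R, b_R, U, V, b.
    """
    # Eq. (3): update gate
    z = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["U_Z"], x_t), matvec(p["V_Z"], h_prev), p["b_Z"])]
    # Eq. (4): reset gate
    r = [sigmoid(a + b + c) for a, b, c in
         zip(matvec(p["U_R"], x_t), matvec(p["V_R"], h_prev), p["b_R"])]
    # Element-wise product of the reset gate and the previous state
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    # Eq. (5): candidate activation
    c = [math.tanh(a + b + d) for a, b, d in
         zip(matvec(p["U"], x_t), matvec(p["V"], gated), p["b"])]
    # Eq. (6): blend the previous state and the candidate activation
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, c)]
```

With all weights and biases set to zero, both gates equal 0.5 and the candidate activation is 0, so the step simply halves the previous state, which is a convenient sanity check.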
2.3 The principle of standard autoencoder
Compared with RNN, AE is a type of unsupervised neural network whose goal is to make the
output equal to the input. The basic structure of AE is shown in Fig. 3. The input
is first encoded, and the encoded data are then processed by the activation function. Finally, the
processed data are decoded as the output, so that the output $Y_i$ is approximately equal to the
input $X_i$. Because the number of hidden units is usually smaller than the dimension of the input, the
hidden units can be regarded as a compressed, high-level representation of the input. For a given
input $X_i$ ($X_i \in \mathbb{R}^{l \times 1}$), the hidden representation $H(X_i)$ is given by Eq. (7).
$$H(X_i) = f_S(V_{ij} X_i + b_1) \tag{7}$$
where $f_S(\cdot)$ denotes the sigmoid activation function, $f_S(t) = 1/(1 + e^{-t})$, $V_{ij} \in \mathbb{R}^{n \times l}$ is the
weight matrix and $b_1 \in \mathbb{R}^{n \times 1}$ is the bias vector.
The hidden vector $H(X_i)$ is then transformed into the reconstructed vector $Y_i$ ($Y_i \in \mathbb{R}^{l \times 1}$):
$$Y_i = f_S(W_{ji} H(X_i) + b_2) \tag{8}$$
where $W_{ji} \in \mathbb{R}^{l \times n}$ denotes the weight matrix and $b_2 \in \mathbb{R}^{l \times 1}$ is the bias vector.
The loss function of the standard AE is the mean square error (MSE), and the parameters are
optimized to minimize the reconstruction error:
$$L(\theta) = \frac{1}{n} \sum_{i=1}^{n} \left( \frac{1}{2} \| Y_i - X_i \|^2 \right) \tag{9}$$
where $\theta$ denotes the parameters.
Fig. 3 The basic structure of AE
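The encoding, decoding and loss of Eqs. (7)-(9) can be sketched in a few lines of plain Python. The helper names (`affine`, `ae_forward`, `mse_loss`) are illustrative assumptions rather than the authors' code, and training (the optimization of the parameters) is deliberately left out.

```python
import math


def sigmoid(t):
    """Sigmoid activation f_S(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + math.exp(-t))


def affine(weights, x, bias):
    """Return weights @ x + bias for a matrix given as a list of rows."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, bias)]


def ae_forward(x, V, b1, W, b2):
    """Encode x with Eq. (7), then decode the hidden vector with Eq. (8)."""
    h = [sigmoid(a) for a in affine(V, x, b1)]  # hidden representation H(X_i)
    y = [sigmoid(a) for a in affine(W, h, b2)]  # reconstruction Y_i
    return h, y


def mse_loss(inputs, outputs):
    """Eq. (9): mean over samples of half the squared reconstruction error."""
    total = 0.0
    for x, y in zip(inputs, outputs):
        total += 0.5 * sum((yi - xi) ** 2 for xi, yi in zip(x, y))
    return total / len(inputs)
```

Here `V` has n rows of length l and `W` has l rows of length n, matching the dimensions stated for Eqs. (7) and (8).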
3. The proposed method.
An optimal method is proposed for rolling bearing fault diagnosis. This part is a detailed
illustration of the proposed method. Section 3.1 details the construction of the hybrid deep learning
model. Section 3.2 describes the optimization process of the constructed model. Section 3.3 shows
the general process of the proposed method.
3.1 The model construction
Worsening environments and increasing working hours contribute to various failures of
rolling bearings. Consequently, accurate and stable diagnosis of bearing faults is a realistic and
urgent need. The measured bearing vibration signals are sequential signals, which are complex and
nonlinear. GRU has a powerful ability to extract the time relevance of sequential signals,
and SAE can learn more robust feature representations. Thus, to maximize the advantages of
GRU and SAE, a hybrid deep learning model that combines GRU and SAE is constructed in this
paper; the constructed model is shown in Fig. 4. The raw bearing signals are first processed
by the GRU layer to obtain Feature 1, Feature 1 is input into the first SAE to get Feature
2, and Feature 2 then becomes the input of the second SAE to obtain Feature 3 (the final
features). Finally, the final features are fed into the classifier to obtain the final diagnosis results.
The cross-entropy loss function is applied to the GRU to minimize the reconstruction
error by optimizing the parameters, and the formula is as follows:
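$$L(\theta, M, F) = -\frac{1}{n} \sum_{j=1}^{n} m_j \log_2 f_j \tag{10}$$
where $n$ is the number of training samples, $\theta$ denotes the optimized parameters, $M = \{m_j \mid j = 1, \cdots, n\}$
denotes the actual output set of the training samples and $F = \{f_j \mid j = 1, \cdots, n\}$ is the corresponding
label set; $j$ indexes the jth training sample. GRU is trained by back propagation through time (BPTT).
The adaptive gradient method is used to update the weight matrices as follows:
$$\Delta\theta = -\frac{\varepsilon}{\sqrt{\sum_{j=1}^{t} \left( \frac{\partial L(\theta, M, F)}{\partial \theta} \right)_j^2}} \cdot \left( \frac{\partial L(\theta, M, F)}{\partial \theta} \right)_t \tag{11}$$
$$\theta_t = \theta_{t-1} + \Delta\theta \tag{12}$$
where $\varepsilon$ is the learning rate, $\left( \frac{\partial L(\theta, M, F)}{\partial \theta} \right)_t$ denotes the gradient at step t, $\theta_t$ denotes the
parameters at step t and $\Delta\theta$ is the update.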
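As an illustration of the update rule in Eqs. (11) and (12), the sketch below applies one adaptive gradient step to a list of parameters. The function name `adagrad_step` and the small constant `eps` are assumptions added here; `eps` does not appear in Eq. (11) and only guards against division by zero when the accumulated squared gradient is still zero.

```python
import math


def adagrad_step(theta, grad, accum, lr, eps=1e-8):
    """Apply one update of Eqs. (11)-(12) to every parameter.

    theta: current parameters, grad: gradient at this step,
    accum: running sum of squared gradients from earlier steps,
    lr: learning rate (epsilon in Eq. (11)).
    eps is an added safeguard that is not part of Eq. (11).
    """
    # Accumulate the squared gradients up to and including this step
    new_accum = [a + g * g for a, g in zip(accum, grad)]
    # Scale each gradient by the root of its accumulated squares
    new_theta = [t - lr * g / (math.sqrt(a) + eps)
                 for t, g, a in zip(theta, grad, new_accum)]
    return new_theta, new_accum
```

Because the accumulated sum only grows, the effective step size of each parameter shrinks over time, which is the behaviour that makes the learning rate ε a sensitive parameter to tune.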
Compared with the standard AE, SAE is more robust and has stronger inference ability, which lets it
learn more reliable and effective features [32]. The main difference between SAE and AE lies in their
loss functions. A regularization term and a sparsity constraint are added to the cost
function to obtain a sparse representation of the features. The whole loss function of SAE is as
follows:
$$L_{SAE} = L_{AE} + \frac{\tau}{2} \sum (w_{ij})^2 + \beta \sum_{j=1}^{p} \left( \rho \log \frac{\rho}{\hat{\rho}_j} + (1 - \rho) \log \frac{1 - \rho}{1 - \hat{\rho}_j} \right) \tag{13}$$
where $\tau$ is the regularization parameter that penalizes the weights $w$. The third term is the Kullback–
Leibler divergence, which measures the difference between $\rho$ and $\hat{\rho}_j$. $\rho$ is a predefined
sparsity parameter, $\hat{\rho}_j$ is the average activation value of hidden unit j and $\beta$ is the sparsity penalty
factor.
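The three terms of Eq. (13) can be evaluated as in the following sketch. The function names `kl_sparsity` and `sae_loss` are assumptions for illustration, and the reconstruction loss `l_ae` is taken as a precomputed number rather than derived from data.

```python
import math


def kl_sparsity(rho, rho_hat):
    """Sum over hidden units of the KL divergence between rho and rho_hat_j."""
    return sum(rho * math.log(rho / r)
               + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - r))
               for r in rho_hat)


def sae_loss(l_ae, weights, rho_hat, tau, beta, rho):
    """Eq. (13): reconstruction loss + weight penalty + sparsity penalty.

    weights is a list of rows; rho_hat holds the average activation of
    each hidden unit and must lie strictly between 0 and 1.
    """
    weight_penalty = 0.5 * tau * sum(w * w for row in weights for w in row)
    return l_ae + weight_penalty + beta * kl_sparsity(rho, rho_hat)
```

The sparsity penalty is zero when every average activation equals ρ and grows as the activations move away from it, which is how the term pushes the hidden layer towards a sparse code.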
3.2 The constructed model optimization
3.2.1 GWO algorithm
In the GWO algorithm, $\alpha$ is the best result, $\beta$ is the second best and $\delta$ is the third best result.
The formulas are as follows:
$$D = 2a \cdot r_1 - a \tag{14}$$
$$Y(k+1) = Y_m(k) - D \cdot |E \cdot Y_m(k) - Y(k)| \tag{15}$$
where k is the current iteration, $Y$ and $Y_m$ are the position vectors of the wolf and the prey, $E$ and $r_1$
are random vectors, and $a$ decreases linearly from 2 to 0 during the iterative process. The other wolves
update their positions according to $\alpha$, $\beta$ and $\delta$:
$$Y_A = Y_\alpha - D_1 \cdot |E_1 \cdot Y_\alpha - Y| \tag{16}$$
$$Y_B = Y_\beta - D_2 \cdot |E_2 \cdot Y_\beta - Y| \tag{17}$$
$$Y_C = Y_\delta - D_3 \cdot |E_3 \cdot Y_\delta - Y| \tag{18}$$
$$Y(k+1) = \frac{Y_A + Y_B + Y_C}{3} \tag{19}$$
where $D_1$, $D_2$ and $D_3$ are defined like $D$, and $E_1$, $E_2$ and $E_3$ are defined like $E$. The details of
the GWO algorithm can be found in Ref. [39]. The GWO algorithm is applied to obtain the optimal
parameters of the hybrid deep learning model.
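The position update of Eqs. (14)-(19) can be sketched as a small minimizer in plain Python. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function name `gwo_minimize`, the clipping of positions to the search bounds, and drawing the random coefficient E as twice a uniform number (a common choice in GWO implementations) are all assumptions.

```python
import random


def gwo_minimize(fitness, bounds, n_wolves=10, n_iter=20, seed=0):
    """Minimize fitness over the box given by bounds with a basic GWO loop."""
    rng = random.Random(seed)
    wolves = [[rng.uniform(lo, hi) for lo, hi in bounds]
              for _ in range(n_wolves)]
    best_pos, best_fit = None, float("inf")
    for k in range(n_iter):
        # Rank the pack; the three best wolves act as alpha, beta and delta
        ranked = sorted(wolves, key=fitness)
        leaders = [list(w) for w in ranked[:3]]
        alpha_fit = fitness(leaders[0])
        if alpha_fit < best_fit:
            best_pos, best_fit = list(leaders[0]), alpha_fit
        a = 2.0 * (1.0 - k / n_iter)  # decreases linearly from 2 towards 0
        new_wolves = []
        for y in wolves:
            new_y = []
            for d, (lo, hi) in enumerate(bounds):
                candidates = []
                for lead in leaders:
                    D = 2.0 * a * rng.random() - a  # Eq. (14)
                    E = 2.0 * rng.random()          # assumed form of E
                    # Eqs. (16)-(18): move relative to one leader
                    candidates.append(lead[d] - D * abs(E * lead[d] - y[d]))
                value = sum(candidates) / 3.0       # Eq. (19)
                new_y.append(min(max(value, lo), hi))  # keep inside bounds
            new_wolves.append(new_y)
        wolves = new_wolves
    # Also consider the positions produced by the last update
    for w in wolves:
        f = fitness(w)
        if f < best_fit:
            best_pos, best_fit = list(w), f
    return best_pos, best_fit
```

A call such as `gwo_minimize(lambda p: sum(v * v for v in p), [(-5.0, 5.0)] * 2)` searches for the minimum of a simple quadratic. The defaults of 10 wolves and 20 iterations mirror the values of N and K reported in Table 2.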
3.2.2 The constructed model optimization
A hybrid deep learning model that combines GRU and SAE is constructed for rolling bearing
fault diagnosis. The process described in Section 3.1 seems quite easy. However, it always requires
a lot of tuning to get a satisfactory diagnosis result, which takes a lot of time. The values of
ε in Eq. (11) and of τ, β and ρ in Eq. (13) greatly affect the diagnosis performance of the constructed
model. Thus, GWO, a novel optimization algorithm, is applied to obtain the optimal values of these
parameters. The optimization process is as follows:
⚫ Step 1: Construct the proposed model.
⚫ Step 2: Initialize the GWO algorithm and set the number of wolves N and the number of iterative
steps K. $c_i = [\varepsilon_i; \tau_i; \beta_i; \rho_i]$ ($i = 1, 2, 3, \cdots, N$) is the parameter set to be optimized. The
classification error rate is taken as the fitness function of GWO.
⚫ Step 3: Initialize the state of each search agent randomly within the given ranges.
Update the positions of the search agents by Eq. (19). At each iteration of GWO, the fitness of each
search agent is calculated, and the minimum fitness and the corresponding optimal state are saved.
⚫ Step 4: Finish the optimization process when the iterative step reaches K and output the optimized
parameter set.
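The role of the fitness function in Step 2 can be sketched as follows. The names `decode`, `error_rate` and `make_fitness`, as well as the callable `train_and_predict`, are hypothetical and introduced only for illustration; the paper does not specify how the model is trained inside the fitness evaluation.

```python
def decode(position):
    """Map a search agent c_i = [epsilon, tau, beta, rho] to named parameters."""
    epsilon, tau, beta, rho = position
    return {"learning_rate": epsilon, "weight_reg": tau,
            "sparsity_weight": beta, "sparsity_prop": rho}


def error_rate(predicted, actual):
    """Fraction of samples whose predicted label differs from the actual one."""
    wrong = sum(1 for p, a in zip(predicted, actual) if p != a)
    return wrong / len(actual)


def make_fitness(train_and_predict, labels):
    """Build the GWO fitness: classification error for one parameter set.

    train_and_predict is a hypothetical callable that trains the model
    with the given parameters and returns one predicted label per sample.
    """
    def fitness(position):
        predicted = train_and_predict(**decode(position))
        return error_rate(predicted, labels)
    return fitness
```

Each search agent is then scored by training the model once with its parameter set, so the total cost of the optimization is roughly N × K model trainings.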
3.3 The general steps of the proposed method
An optimal bearing fault diagnosis method is proposed in this paper. The framework is shown
in Fig. 5, and the general process is as follows.
◼ Step 1: Use data acquisition system to measure the bearing vibration signals.
◼ Step 2: Construct the proposed model.
◼ Step 3: Use GWO to obtain the key parameters of the constructed model.
◼ Step 4: Verify the effectiveness of the optimization process and output the final diagnosis
result.
Fig. 5 The framework of the proposed method [22]: vibration signal acquisition from the rolling bearings (measured vibration signals, Sample 1 to Sample N); construction of the proposed model (raw vibration signals, GRU layer, Feature 1, SAE1 layer, Feature 2, SAE2 layer, Feature 3, softmax classifier, diagnosis result) with the optimized parameters; application of the proposed method (diagnosis results, visualization of the learned features).
Fig. 4 The constructed model: raw vibration signals, GRU layer, Feature 1, SAE1 layer, Feature 2, SAE2 layer, Feature 3, softmax classifier, diagnosis result
4. Experimental verification
4.1 Data description
The bearing data used for verification are from Case Western Reserve University (CWRU) [40]. The
experimental device is shown in Fig. 6 (a), and the schematic illustration is shown in Fig. 6 (b). The
experimental platform contains a torque sensor, a motor, electronic control equipment and a power
meter. The fault diameters are 0.007, 0.014 and 0.021 inches (1 inch = 2.54 cm) and the sampling
frequency is 12 kHz.
Fig. 6 (a) Experimental device, (b) schematic illustration
The drive-end data used in this paper were measured at 1797 rpm. In this case study, 10 kinds of
working conditions were designed. The data are summarized in Table 1; the fault
conditions include ball (B), outer race (OR) and inner race (IR) faults. Each condition includes 150
samples and each sample includes 800 points. The first 100 samples are used for training and the
rest for testing. The raw vibration signals of the 10 conditions are shown in Fig. 7.
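The sample construction described above (150 samples of 800 points per condition, the first 100 for training) can be sketched as below. The function names are illustrative, and cutting the signal into consecutive, non-overlapping segments is an assumption, since the text does not state how the samples are cut from the recorded signal.

```python
def segment(signal, n_samples=150, length=800):
    """Cut a signal into consecutive, non-overlapping samples (assumed scheme)."""
    needed = n_samples * length
    if len(signal) < needed:
        raise ValueError(f"signal has {len(signal)} points, need {needed}")
    return [signal[i * length:(i + 1) * length] for i in range(n_samples)]


def split(samples, n_train=100):
    """Use the first n_train samples for training and the rest for testing."""
    return samples[:n_train], samples[n_train:]
```

Applied to one condition, `split(segment(signal))` gives 100 training and 50 testing samples, which matches the 100/50 split listed in Table 1.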
Table 1
Introduction of training samples and testing samples
Health condition   Fault diameter (in.)   Training/testing samples   Condition
Normal             0                      100/50                     1
B                  0.007                  100/50                     2
B                  0.014                  100/50                     3
B                  0.021                  100/50                     4
IR                 0.007                  100/50                     5
IR                 0.014                  100/50                     6
IR                 0.021                  100/50                     7
OR                 0.007                  100/50                     8
OR                 0.014                  100/50                     9
OR                 0.021                  100/50                     10
Fig. 7 The vibration signals of the ten bearing conditions, plotted as amplitude (m/s^2) against time (s); the panel numbers (1)-(10) correspond to the condition numbers in Table 1 [9].
4.2 Diagnosis results and analysis
This part evaluates the diagnosis ability of the proposed method on raw vibration
signals. The constructed hybrid model without GWO, Deep GRU, SAE, standard AE, ANN and
SVM are compared with the proposed method, and the only inputs to all methods are raw vibration
signals. GWO is used to obtain the main parameters of the constructed model; the parameters
are listed in Table 2 and the iteration process is shown in Fig. 8. A rule similar to [41] is
followed in deciding the structure of the constructed model, and the structure is chosen as
800-400-200-100-10.
Each method is run 10 times under its respective parameters to estimate its stability.
The results of all methods are shown in Fig. 9, and the confusion matrix (the first trial),
which describes the fault conditions in detail, is shown in Fig. 10. Table 3 shows the detailed results
of the experiment.
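The stability estimate over repeated trials amounts to a mean and a standard deviation of the per-trial accuracies, as in the sketch below. The function name `summarize` is an assumption, and so is the use of the sample standard deviation, because the text does not say whether the sample or the population formula was used.

```python
import statistics


def summarize(accuracies):
    """Return the mean and sample standard deviation of per-trial accuracies."""
    return statistics.mean(accuracies), statistics.stdev(accuracies)
```

For the hypothetical accuracies 96.0, 97.0 and 98.0, `summarize` returns a mean of 97.0 and a standard deviation of 1.0.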
Table 2
The parameters in the experiment
Description                     Symbol   Value
The learning rate of GRU        ε        0.0531 (given by GWO)
Weight regularization of SAE    τ        4.6023 (given by GWO)
Sparsity proportion of SAE      ρ        0.2901 (given by GWO)
Sparsity weight of SAE          β        0.3755 (given by GWO)
The number of wolves of GWO     N        10
The iterative steps of GWO      K        20
Table 3
Average diagnosis accuracy and standard deviation of the 10 trials
Categories         Methods               Average accuracy (%)   Standard deviation (%)
Deep learning      The proposed method   97.130                 0.625
Deep learning      Constructed model     95.425                 0.787
Deep learning      Deep GRU              91.313                 1.095
Deep learning      SAE                   86.647                 1.612
Deep learning      Deep AE               81.673                 2.108
Shallow learning   SVM                   70.546                 2.993
Shallow learning   ANN                   59.589                 3.473
Fig. 8 The GWO optimization process for bearing vibration signals (accuracy (%) against iterative number)
Fig. 9 Diagnosis results of the 10 trials for each method (accuracy (%) against trial number)
Fig. 10 The confusion matrix of the proposed method (the first trial; actual label against predicted label)
As Table 3 shows, the average accuracies of the proposed method, the constructed hybrid model
without GWO, Deep GRU, SAE, Deep AE, SVM and ANN are 97.130%, 95.425%, 91.313%,
86.647%, 81.673%, 70.546% and 59.589%, respectively. The diagnosis accuracy of
the proposed method is therefore higher than that of all the other methods. The standard deviation
of the proposed method is only 0.625%, which is smaller than those of the constructed hybrid model
without GWO, Deep GRU, SAE, Deep AE, SVM and ANN, which are 0.787%, 1.095%, 1.612%, 2.108%, 2.993% and
3.473%, respectively. According to Table 3 and Fig. 9, we can conclude that: (1) Deep learning
methods achieve better diagnosis performance than shallow learning methods, mainly because
deep architectures let deep learning methods extract more representative fault
features from the raw bearing vibration signals. (2) The proposed method performs more effectively
and robustly than Deep GRU, SAE and Deep AE, mainly because the proposed method
can effectively extract the time relevance of sequential signals and learn more robust feature
representations. (3) The only difference between the proposed method and the constructed model is
whether GWO is used. The comparison results show that the proposed method achieves
more effective and robust results, mainly because GWO provides the constructed
model with an optimized set of parameters. These comparisons show that the proposed method is more
effective and robust than the other methods.
The specific parameters are as follows: (1) the constructed hybrid model without GWO (the
parameters are obtained by repeated experiments): the structure is 800-400-200-100-10, the learning
rate of GRU is 0.1, the weight regularization of SAE is 4.0, the sparsity proportion of SAE is 0.3
and the sparsity weight of SAE is 0.3. (2) Deep GRU: the structure is 400-200-100-10, the learning
rate is 0.06 and the number of iterative steps is 100. (3) SAE: the structure is 400-200-100-10, and the
weight regularization, the sparsity proportion and the sparsity weight are 3, 0.3 and 0.4, respectively. (4)
Deep AE: the structure is 400-200-100-10, and the learning rate and momentum are 0.12 and 0.6, respectively. (5)
SVM: the RBF kernel is applied, the penalty factor is 2 and the radius of the kernel function is 0.25.
(6) ANN: the structure is 600-100-10, the learning rate is 0.05 and the number of iterative steps is 550.
The following part evaluates the feature learning ability of the constructed model
on raw bearing vibration signals. The features at each level are visualized by the t-SNE
algorithm, including the raw bearing vibration signals, the learned features at the GRU layer, and the
learned features at the first and second layers of SAE. Fig. 11 shows the visualizations.
Fig. 11 3D visualization of the features at each level. (a) The raw vibration signals. (b) The learned
features at GRU layer. (c) The learned features at SAE first layer. (d) The learned features at SAE
second layer.
The features at each level are shown in Fig. 11. It can be seen that the raw vibration signals of the
ten categories are heavily mixed, which makes it impossible to distinguish them. The learned
features at the GRU layer are more recognizable, which demonstrates the effectiveness of GRU in
extracting the time relevance of sequential signals. With the learned features at the first and second
layers of SAE, the ten categories can be distinguished clearly, which shows that SAE can learn robust
feature representations. Consequently, this case illustrates that the proposed method can effectively
and adaptively learn the features of raw vibration signals.
Besides, some published deep learning methods are compared with the
proposed method to demonstrate its superiority, and the specific comparison results are shown
in Table 4.
Table 4
The specific comparison results
References    Accuracy (raw CWRU data sets)    Accuracy (processed CWRU data sets)
[9]           95.20%                           /
[42]          96.36%                           /
[43]          /                                99.98%
[44]          96.75%                           97.47%
In reference [9], the authors constructed a deep wavelet autoencoder to effectively
capture the signal characteristics of raw CWRU data, and the captured characteristics are input
into an extreme learning machine to obtain the final diagnosis result. Comparing the final
diagnosis result of reference [9] with that of the proposed method, it can
be clearly seen that the proposed method achieves higher diagnostic accuracy
than the method in reference [9]. In reference [42], a hybrid model that combines a denoising
autoencoder and a contractive autoencoder is constructed to automatically extract the
features of raw CWRU data. Two different ways are then used to process the extracted
features. In the first way, the extracted features are directly input into the softmax
classifier and the final diagnosis accuracy is 93.18%. In the other way, the extracted
features are first reduced by a modified t-distributed stochastic neighbor embedding to
achieve higher diagnosis accuracy, and the final diagnosis accuracy is 96.36%. The
final diagnosis accuracy of the proposed method is also higher than that of the method in
reference [42]. In reference [43], the authors proposed a novel batch-normalized stacked
autoencoder for machine fault diagnosis. The diagnosis accuracy of the method in
reference [43] is 99.98%, which is higher than that of our method (97.13%). However,
it should be emphasized that the inputs in reference [43] are frequency domain signals. As
is known, compared with raw vibration signals, frequency domain signals make it
easier for deep learning models to obtain good diagnostic results. In reference [44], the
authors combined compressed sensing and deep learning for rolling bearing fault
diagnosis. For raw CWRU data, the diagnosis accuracy is 96.75%, which is lower than
our result. After the raw data are processed by compressed sensing, the diagnosis accuracy
is 97.47%, which is higher than our result. However, two points cannot be
ignored. First, the raw CWRU data need to be processed by compressed sensing, which not
only increases the complexity of the method but also requires engineering experience
and places higher demands on operators. Second, reference [44] classifies only 7
fault conditions, whereas ours classifies 10. The above comparisons support the
superiority of the proposed method and show that the present result is reasonable.
5. Engineering verification
5.1 Data description
The actual locomotive bearing vibration signals are applied to evaluate the reliability of the
proposed method in practical engineering. The experimental device is shown in Fig. 12. The signals
are collected at a sampling frequency of 12.8 kHz. More details can be found in reference [16]. Table 5 lists the
nine conditions of the bearing vibration signals. Each condition includes 300 samples and each sample
has 800 points. The first 200 samples of each condition are used for training the model, and the remaining 100 are used for testing.
The 8192 data points of each condition are shown in Fig. 13.
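The sample layout described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' preprocessing code: the samples are assumed to be consecutive and non-overlapping, the function names are hypothetical, and the motor speed listed in Table 5 is assumed to be the shaft speed when estimating how much of a revolution one sample covers.

```python
from typing import List, Sequence, Tuple

SAMPLING_RATE_HZ = 12_800      # from the text: 12.8 kHz
POINTS_PER_SAMPLE = 800        # from the text
SAMPLES_PER_CONDITION = 300    # from the text
TRAIN_PER_CONDITION = 200      # from the text: first 200 train, the rest test


def segment(signal: Sequence[float], points: int = POINTS_PER_SAMPLE) -> List[List[float]]:
    """Cut a 1-D record into consecutive, non-overlapping samples (assumed layout)."""
    n_samples = len(signal) // points
    return [list(signal[i * points:(i + 1) * points]) for i in range(n_samples)]


def split(samples: List[List[float]],
          n_train: int = TRAIN_PER_CONDITION) -> Tuple[List[List[float]], List[List[float]]]:
    """Use the first n_train samples for training and the remainder for testing."""
    return samples[:n_train], samples[n_train:]


def sample_duration_s(points: int = POINTS_PER_SAMPLE, fs: int = SAMPLING_RATE_HZ) -> float:
    """Time span covered by one sample, in seconds."""
    return points / fs


def revolutions_per_sample(rpm: float) -> float:
    """Shaft revolutions covered by one sample (assumes rpm is the shaft speed)."""
    return rpm / 60.0 * sample_duration_s()


# Placeholder record of zeros standing in for one condition's raw signal.
record = [0.0] * (SAMPLES_PER_CONDITION * POINTS_PER_SAMPLE)
train, test = split(segment(record))
print(len(train), len(test))          # number of training and testing samples
print(sample_duration_s())            # seconds per sample
print(revolutions_per_sample(490.0))  # revolutions per sample at 490 rpm
```

With the stated sampling frequency, one 800-point sample spans 800 / 12800 = 0.0625 s, so at the speeds listed in Table 5 a single sample covers well under one revolution if the listed speed is the shaft speed.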
Fig. 12 The experimental device of the electric locomotive (the accelerometer is labelled in the figure)
Fig. 13 The collected vibration signals of the nine conditions [23]. (a) Normal condition. (b)
Slight outer race damage. (c) Serious outer race damage. (d) Roller damage. (e) Inner race damage.
(f) Compound faults (outer race and inner race). (g) Compound faults (outer race and roller). (h)
Compound faults (roller and inner race). (i) Compound faults (outer race, inner race and roller).
Fig. 13 axes: amplitude (m/s²) versus time (s) for panels (a) to (i).
Table 5
Introduction of training sample A and testing sample B
Health condition                                       Motor speed (rpm)    Training/testing samples    Label
Normal                                                 490                  200/100                     1
Slight outer race damage                               490                  200/100                     2
Serious outer race damage                              481                  200/100                     3
Roller damage                                          531                  200/100                     4
Inner race damage                                      498                  200/100                     5
Compound faults (outer race and inner race)            525                  200/100                     6
Compound faults (outer race and roller)                521                  200/100                     7
Compound faults (roller and inner race)                640                  200/100                     8
Compound faults (outer race, inner race and roller)    549                  200/100                     9
5.2 Results and analysis
This part evaluates the diagnosis ability of the proposed method on actual
engineering data. The constructed hybrid model without GWO, Deep GRU, SAE, standard AE,
ANN and SVM are compared with the proposed method, and the only inputs to all methods are raw
vibration signals. GWO is used to obtain the same main parameters as in Table 2: ε is 0.1658,
τ is 3.5329, ρ is 0.1258 and β is 0.3254. The structure is selected as 800-400-200-100-9.
Each method is run 10 times under its respective parameters to evaluate its
stability. The results of each method are shown in Fig. 14. Table 6 shows the detailed
results of the experiment.
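To illustrate how a sparsity proportion and a sparsity weight typically act on an SAE, the sketch below evaluates the widely used Kullback-Leibler sparsity penalty. It assumes that ρ denotes the sparsity proportion and β the sparsity weight, and that the penalty takes the standard form β Σ_j KL(ρ ‖ ρ̂_j), where ρ̂_j is the mean activation of hidden unit j. The exact cost function of the proposed model is defined elsewhere in the paper and may differ. The mean activations in the usage lines are hypothetical values chosen only for illustration.

```python
import math
from typing import Sequence


def kl_bernoulli(rho: float, rho_hat: float) -> float:
    """KL divergence between Bernoulli(rho) and Bernoulli(rho_hat)."""
    if not (0.0 < rho < 1.0 and 0.0 < rho_hat < 1.0):
        raise ValueError("rho and rho_hat must lie strictly between 0 and 1")
    return (rho * math.log(rho / rho_hat)
            + (1.0 - rho) * math.log((1.0 - rho) / (1.0 - rho_hat)))


def sparsity_penalty(mean_activations: Sequence[float], rho: float, beta: float) -> float:
    """Weighted sum of KL terms over the hidden units (standard SAE form, assumed)."""
    return beta * sum(kl_bernoulli(rho, a) for a in mean_activations)


RHO = 0.1258   # value reported in the text; assumed to be the sparsity proportion
BETA = 0.3254  # value reported in the text; assumed to be the sparsity weight

# Hypothetical mean activations of four hidden units (not from the paper).
hypothetical_activations = [0.05, 0.1258, 0.30, 0.60]
penalty = sparsity_penalty(hypothetical_activations, RHO, BETA)
print(f"sparsity penalty: {penalty:.4f}")
```

A unit whose mean activation equals ρ contributes nothing to the penalty, while units that are active far more often than ρ are penalized increasingly strongly, which is what pushes the hidden representation towards sparsity.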
Fig. 14 Diagnosis results of the 10 trials for each method (accuracy in % versus trial number)
As shown in Table 6, the average accuracy of the proposed method is 93.834%, which is
clearly higher than those of the other methods, which are 91.304%, 88.647%, 83.084%, 79.208%,
64.656% and 57.814%, respectively. The standard deviation of the
proposed method is 1.154%, which is less than those of the other methods, which are 1.392%, 1.631%, 1.495%,
1.971%, 2.595% and 2.853%, respectively. The diagnosis performance of the proposed
method is therefore more effective and robust than that of the other methods, which confirms
that the proposed method is reliable in practical
engineering.
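The average accuracy and standard deviation over repeated trials, as reported in Table 6, can be computed as sketched below. The ten accuracies in the example are hypothetical placeholders, not the paper's per-trial results, and the source does not state whether the sample or the population standard deviation is used, so both are shown. The function name summarize_trials is hypothetical.

```python
from statistics import mean, pstdev, stdev
from typing import Dict, Sequence


def summarize_trials(accuracies: Sequence[float]) -> Dict[str, float]:
    """Return the mean and both common standard deviations of trial accuracies (%)."""
    if len(accuracies) < 2:
        raise ValueError("at least two trials are needed")
    return {
        "mean": mean(accuracies),
        "sample_std": stdev(accuracies),       # divides by n - 1
        "population_std": pstdev(accuracies),  # divides by n
    }


# Hypothetical accuracies (%) of 10 trials, for illustration only.
hypothetical_trials = [93.0, 95.0, 94.0, 92.0, 96.0, 94.0, 93.0, 95.0, 94.0, 94.0]
summary = summarize_trials(hypothetical_trials)
print(summary)
```

A higher mean indicates better diagnosis accuracy, and a smaller standard deviation across the trials indicates a more stable method.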
Table 6
Average diagnosis accuracy and standard deviation of the 10 trials
Methods                                Average accuracy (%)    Standard deviation (%)
The proposed method                    93.834                  1.154
The constructed model (without GWO)    91.304                  1.392
Deep GRU                               88.647                  1.631
SAE                                    83.084                  1.495
Deep AE                                79.208                  1.971
SVM                                    64.656                  2.595
ANN                                    57.814                  2.853
The specific parameters are as follows: (1) the constructed hybrid model without GWO (the
parameters are obtained by repeated experiments): the structure is 800-400-200-100-9, the learning
rate of GRU is 0.1, the weight regularization of SAE is 3.0, the sparsity proportion of SAE is 0.1
and the sparsity weight of SAE is 0.4. (2) Deep GRU: the structure is 400-200-100-9, the learning
rate is 0.1 and the number of iterative steps is 80. (3) SAE: the structure is 400-200-100-9, and the weight
regularization, the sparsity proportion and the sparsity weight are 3.5, 0.18 and 0.32, respectively.
(4) Deep AE: the structure is 400-200-100-9, and the learning rate and momentum are 0.15 and 0.5,
respectively. (5) SVM: the RBF kernel is applied, the penalty factor is 2 and the radius of the kernel
function is 0.25. (6) ANN: the structure is 400-150-9, the learning rate is 0.06 and the number of iterative steps
is 800.
The features at each level, namely the raw bearing vibration signals, the learned features at the
GRU layer, and the learned features at the first and second layers of the SAE, are visualized by the
t-SNE algorithm. Fig. 15 shows the visualizations. The visualization results confirm that the proposed
method can effectively and adaptively learn the features of raw vibration signals.
Fig. 15 3D visualization of the features at each level. (1) The raw vibration signals. (2) The learned
features at GRU layer. (3) The learned features at first SAE layer. (4) The learned features at second
SAE layer.
6. Conclusion
In this paper, an optimal hybrid deep learning model based on gated recurrent unit (GRU) and
sparse autoencoder (SAE) is proposed for rolling bearing fault diagnosis. The key parameters of
the hybrid deep learning model are obtained adaptively, and the model can
directly process the raw bearing vibration signals. The proposed method is verified on
experimental and practical engineering bearing data, and the results show that the proposed method
achieves more effective and robust diagnosis performance than the other compared methods. In addition, the
results also demonstrate the ability of GRU to extract the time relevance of sequential
signals. Consequently, GRU is a promising tool for bearing fault diagnosis.
However, compared with some of the latest bearing fault diagnosis articles, the diagnosis accuracy
in this paper has no advantage. The authors will continue to investigate this topic in
future studies to fully exploit the feature extraction ability of GRU in processing sequential signals.
Acknowledgement
This research is supported by the major research plan of the National Natural Science
Foundation of China (No. 91860124), the National Natural Science Foundation of China (No.
51875459), the Aeronautical Science Foundation of China (No. 20170253003) and the Synergy
Innovation Foundation of the University and Enterprise for Graduate Students in Northwestern
Polytechnical University (No. XQ201901).
References
[1] H.K. Jiang, C.L. Li, H.X. Li, An improved EEMD with multiwavelet packet for rotating
machinery multi-fault diagnosis, Mech. Syst. Signal Process., 36 (2013) 225-239.
[2] Y. Zhang, B.P. Wang, Y. Han, L. Deng, Bearing performance degradation assessment based on
time-frequency code features and SOM network, Meas. Sci. Technol., 28 (2017) 045601.
[3] Y.Y. Zhang, X.Y. Li, L. Gao, L.H. Wang, L. Wen, Imbalanced data fault diagnosis of rotating
machinery using synthetic oversampling and feature learning, J. Manuf. Syst., 48 (2018) 34-50.
[4] J.L. Chen, Y.Y. Zi, Z.J. He, J. Yuan, Improved spectral kurtosis with adaptive redundant
multiwavelet packet and its applications for rotating machinery fault detection, Meas. Sci. Technol.,
23 (2012) 045608.
[5] Y.G. Lei, J. Lin, Z.J. He, M.J. Zuo, A review on empirical mode decomposition in fault diagnosis
of rotating machinery, Mech. Syst. Signal Process., 1-2 (2013) 108-126.
[6] Y.Y. Zhang, X.Y. Li, L. Gao, P.G. Li, A new subset based deep feature learning method for
intelligent fault diagnosis of bearing, Expert Syst. Appl., 110 (2018) 125-142.
[7] H.R. Cao, L.K. Niu, S.T. Xi, X.F. Chen, Mechanical model development of rolling bearing-rotor
systems: A review, Mech. Syst. Signal Process., 102 (2018) 37-58.
[8] J.D. Zheng, J.S. Cheng, Y. Yu, A rolling bearing fault diagnosis approach based on LCD and
fuzzy entropy, Mech. Mach. Theory, 70 (2013) 441-453.
[9] H.D. Shao, H.K. Jiang, X.Q. Li, S.P. Wu, Intelligent fault diagnosis of rolling bearing using deep
wavelet auto-encoder with extreme learning machine, Knowl.-Based Syst., 140 (2018) 1-14.
[10] S. Ma, F.L. Chu, Ensemble deep learning-based fault diagnosis of rotor bearing systems,
Comput. Ind., 105 (2019) 143-152.
[11] H.D. Shao, H.K. Jiang, K. Zhao, D.D. Wei, X.Q. Li, A novel tracking deep wavelet auto-
encoder method for intelligent fault diagnosis of electric locomotive bearings, Mech. Syst. Signal
Process., 110 (2018) 193–209.
[12] M. Unal, M. Onat, M. Demetgul, H. Kucuk, Fault diagnosis of rolling bearings using a genetic
algorithm optimized neural network, Measurement, 58 (2014) 187-196.
[13] J. Zarei, Induction motors bearing fault detection using pattern recognition techniques, Expert
Syst. Appl., 39 (2012) 68-73.
[14] X.A. Yan, M.P. Jia, A novel optimized SVM classification algorithm with multi-domain feature
and its application to fault diagnosis of rolling bearing, Neurocomputing, 313 (2018) 47-64.
[15] J.D. Zheng, H.Y. Pan, S.B. Yang, J.S. Chen, Generalized composite multiscale permutation
entropy and laplacian score based rolling bearing fault diagnosis, Mech. Syst. Signal Process., 99
(2018) 229-243.
[16] H.D. Shao, H.K. Jiang, H.Z. Zhang, T.C. Liang, Electric locomotive bearing fault diagnosis
using a novel convolutional deep belief network, IEEE T. Ind. Electron., 65 (2017) 2727-2736.
[17] H.D. Shao, H.K. Jiang, X. Zhang, M.G. Niu, Rolling bearing fault diagnosis using an
optimization deep belief network, Meas. Sci. Technol., 26 (2015) 115002.
[18] S.H. Wang, J.W. Xiang, Y.T. Zhong, Y.Q. Zhou, Convolutional neural network-based hidden
Markov models for rolling element bearing fault identification, Knowl.-Based Syst., 144 (2018)
65-76.
[19] J.H. Dou, J.Y. Qin, Z.X. Jin, Z. Li, Knowledge graph based on domain ontology and natural
language processing technology for Chinese intangible cultural heritage, J. Visual. Lang. Comput.,
48 (2018) 19-28.
[20] H.M. Fayek, M. Lech, L. Cavendon, Evaluating deep learning architectures for speech
emotion recognition, Neural Networks, 92 (2017) 60-68.
[21] Z.L. Hu, J.S. Tang, Z.M. Wang, K. Zhang, L. Zhang, Q.L. Sun, Deep learning for image-based
cancer detection and diagnosis - A survey, Pattern Recognit., 83 (2018) 134-149.
[22] H.D. Shao, H.K. Jiang, F.A. Wang, H.W. Zhao, An enhancement deep feature fusion method
for rotating machinery fault diagnosis, Knowl.-Based Syst., 119 (2017) 200-220.
[23] H.D. Shao, H.K. Jiang, H.W. Zhao, F.A. Wang, A novel deep autoencoder feature learning
method for rotating machinery fault diagnosis, Mech. Syst. Signal Process., 95 (2017) 187-204.
[24] Z. Meng, X.Y. Zhan, J. Li, Z.Z. Pan, An enhancement denoising autoencoder for rolling bearing
fault diagnosis, Measurement, 130 (2018) 448-454.
[25] C. Lu, Z.Y. Wang, B. Zhou, Intelligent fault diagnosis of rolling bearing using hierarchical
convolutional network based health state classification, Adv. Eng. Inform., 32 (2017) 139-151.
[26] W.Y. Huang, J.S. Cheng, Y. Yang, G.Y. Guo, An improved deep convolutional neural network
with multi-scale information for bearing fault diagnosis, Neurocomputing, 2019.
[27] S.H. Tang, C.Q. Shen, D. Wang, S. Li, W.G. Huang, Z.K. Zhu, Adaptive deep feature learning
network with Nesterov momentum and its application to rotating machinery fault diagnosis,
Neurocomputing, 305 (2018) 1-14.
[28] H.D. Shao, H.K. Jiang, F.A. Wang, Y.N. Wang, Rolling bearing fault diagnosis using adaptive
deep belief network with dual-tree complex wavelet packet, ISA T., 69 (2017) 187-201.
[29] H.K. Jiang, X.Q. Li, H.D. Shao, K. Zhao, Intelligent fault diagnosis of rolling bearing using
improved deep recurrent neural network, Meas. Sci. Technol., 29 (2018) 065107.
[30] R. Zhao, D.Z. Wang, R.Q. Yan, K.Z. Mao, F. Shen, J.J. Wang, Machine health monitoring
using local feature-based gated recurrent unit networks, IEEE T. Ind. Electron., 99 (2017)
1539-1548.
[31] D.T. Hoang, H.J. Kang, A survey on Deep Learning based bearing fault diagnosis,
Neurocomputing, 335 (2019) 327-335.
[32] L. Xu, M.Y. Cao, B.Y. Song, J.S. Zhang, Y.R. Liu, F.E. Alsaadi, Open-circuit fault diagnosis of
power rectifier using sparse autoencoder based deep neural network, Neurocomputing, 311 (2018)
1-10.
[33] H. Liu, J.Z. Zhou, Y. Zheng, W. Jiang, Y.C. Zhang, Fault diagnosis of rolling bearings with
recurrent neural network-based autoencoders, ISA T., 77 (2018) 167-178.
[34] F.A. Wang, H.K. Jiang, H.D. Shao, W.J. Duan, S.P. Wu, An adaptive deep convolutional neural
network for rolling bearing fault diagnosis, Meas. Sci. Technol., 28 (2017) 095005.
[35] X. Zhang, Z.W. Liu, Q. Miao, L. Wang, An optimized time varying filtering based empirical
mode decomposition method with grey wolf optimizer for machinery fault diagnosis, J. Sound Vib.,
418 (2018) 55-78.
[36] X. Zhang, Q. Miao, Z.W. Liu, Z.J. He, An adaptive stochastic resonance method based on grey
wolf optimizer algorithm and its application to machinery fault diagnosis, ISA T., 71 (2017)
206-214.
[37] A. Graves, Supervised Sequence Labelling with Recurrent Neural Networks, Springer
Berlin Heidelberg, 2012.
[38] X.Q. Li, H.K. Jiang, X. Xiong, H.D. Shao, Rolling bearing health prognosis using a modified
health index based hierarchical gated recurrent unit network, Mech. Mach. Theory, 133 (2019) 229-
249.
[39] S. Mirjalili, S.M. Mirjalili, A. Lewis, Grey Wolf Optimizer, Adv. Eng. Softw., 69 (2014) 46-61.
[40] W.A. Smith, R.B. Randall, Rolling element bearing diagnostics using the Case Western Reserve
University data: a benchmark study, Mech. Syst. Signal Process., 64–65 (2015) 100-131.
[41] F. Jia, Y.G. Lei, J. Lin, X. Zhou, N. Lu, Deep neural networks: a promising tool for fault
characteristic mining and intelligent diagnosis of rotating machinery with massive data, Mech. Syst.
Signal Process., 72-73 (2016) 303-315.
[42] W. Jiang, J.Z. Zhou, H. Liu, Y.H. Shan, A multi-step progressive fault diagnosis method for
rolling element bearing based on energy entropy theory and hybrid ensemble auto-encoder, ISA T.,
87 (2019) 235-250.
[43] J.R. Wang, S.M. Li, Z.H. An, X.X. Jiang, W.W. Qian, S.S. Ji, Batch-normalized deep neural
networks for achieving fast intelligent fault diagnosis of machines, Neurocomputing, 329 (2019)
53-65.
[44] J.D. Sun, C.H. Yan, J.T. Wen, Intelligent bearing fault diagnosis method combining
compressed data acquisition and deep learning, IEEE T. Instrum. Meas., 67 (2018) 185-195.