This document describes a novel statistical damage detection approach using unsupervised support vector machines (SVM). It aims to identify damage in structural components through vibration-based methods. The proposed approach builds a statistical model through unsupervised learning, avoiding the need for measurements from damaged structures. It is computationally efficient even with large numbers of features and does not suffer from local minima problems like artificial neural networks. Numerical simulations show the approach can accurately detect both the occurrence and location of damage.
The document provides an overview of key statistical concepts including:
- Random variables and their probability distributions and functions
- Common estimators like the sample mean and variance
- The distributions of common estimators and how they relate to the underlying population parameters
- Confidence intervals and how they are used to quantify the uncertainty in estimates based on sample data
- Hypothesis testing framework including defining the null and alternative hypotheses, calculating a test statistic, and determining whether to reject or fail to reject the null based on probability thresholds
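To make the confidence-interval and hypothesis-testing ideas concrete, here is a minimal sketch using only Python's standard library; the sample values and the null hypothesis mu = 5.0 are made up for illustration:

```python
import math
import statistics

# Hypothetical sample; in practice these would be measured data.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample standard deviation (n-1 denominator)
se = sd / math.sqrt(n)          # standard error of the mean

# 95% confidence interval (normal approximation, z = 1.96)
ci = (mean - 1.96 * se, mean + 1.96 * se)

# Two-sided z-test of H0: mu = 5.0 against H1: mu != 5.0
z = (mean - 5.0) / se
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(ci, z, p_value)
```

With a small n one would normally use the t distribution rather than the normal approximation; the z version keeps the sketch dependency-free.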
CVPR2010: Advanced ITinCVPR in a Nutshell: part 3: Feature Selection (zukun)
This document discusses high-dimensional feature selection for images, genes, and graphs. It covers several key topics:
1) Feature selection aims to reduce dimensionality in order to improve classifier performance and to identify important patterns. This is challenging with thousands of features.
2) Mutual information is proposed as an optimal criterion for evaluating feature subsets, as it relates to the Bayesian error rate.
3) The mRMR criterion is introduced to maximize feature relevance while minimizing redundancy between features.
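The mRMR idea can be sketched in a few lines of numpy: greedily add the feature whose mutual information with the label is highest after subtracting its average mutual information with the features already chosen. This is a generic illustration on made-up binary data, not the paper's implementation:

```python
import numpy as np

def mutual_info(x, y):
    """Mutual information (in nats) between two discrete vectors."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    for i, j in zip(x_idx, y_idx):
        joint[i, j] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mrmr(X, y, k):
    """Greedy mRMR: add the feature maximizing relevance I(f; y)
    minus mean redundancy with the already-selected features."""
    relevance = [mutual_info(X[:, f], y) for f in range(X.shape[1])]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for f in range(X.shape[1]):
            if f in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, f], X[:, s]) for s in selected])
            if relevance[f] - redundancy > best_score:
                best, best_score = f, relevance[f] - redundancy
        selected.append(best)
    return selected

# Toy data: feature 0 drives the label, feature 1 is an exact duplicate,
# feature 2 is independent noise; the label flips feature 0 five percent of the time.
rng = np.random.default_rng(0)
f0 = rng.integers(0, 2, 200)
noise = rng.integers(0, 2, 200)
X = np.column_stack([f0, f0, noise])
y = np.where(rng.random(200) < 0.05, 1 - f0, f0)

sel = mrmr(X, y, 2)
print(sel)  # the duplicate feature 1 is penalized for redundancy
```

The redundancy penalty is what keeps the duplicated feature out of the selected subset, even though its relevance ties with feature 0.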
Sparse data formats and efficient numerical methods for uncertainties in nume... (Alexander Litvinenko)
A description of the methodologies and an overview of the numerical methods used for modeling and quantifying uncertainties in numerical aerodynamics.
This document proposes a simple procedure for beginners to obtain reasonable results when using support vector machines (SVMs) for classification tasks. The procedure involves preprocessing data through scaling, using a radial basis function kernel, selecting model parameters through cross-validation grid search, and training the full model on the preprocessed data. The document provides examples applying this procedure to real-world datasets, demonstrating improved accuracy over approaches without careful preprocessing and parameter selection.
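Assuming scikit-learn is available, that procedure (scale, RBF kernel, cross-validated grid search over C and gamma) can be sketched as a short pipeline; the grid values and dataset here are illustrative, not the document's own:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Scale features, use an RBF kernel, and search (C, gamma) on a coarse log grid.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": [0.01, 0.1, 1]}
search = GridSearchCV(pipe, grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_, search.score(X_test, y_test))
```

Putting the scaler inside the pipeline matters: it guarantees the scaling parameters are fit on each cross-validation fold's training split only, avoiding leakage into the validation folds.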
This document discusses radial basis function networks. It begins by introducing the basic structure of RBF networks, which typically involve an input layer, a hidden layer that applies a nonlinear transformation using radial basis functions, and an output layer with a linear transformation. The document then discusses Cover's theorem, which states that pattern classification problems are more likely to be linearly separable when mapped to a higher-dimensional space through a nonlinear transformation. Several key concepts are introduced, including dichotomies, phi-separable functions, and using hidden functions to map patterns to a hidden feature space.
RECENT ADVANCES in PREDICTIVE (MACHINE) LEARNING (butest)
This document provides an introduction to recent advances in predictive machine learning, specifically support vector machines and boosted decision trees. It begins with an overview of predictive learning and common methods. It then describes kernel methods, including how they were extended to support vector machines. Next, it discusses extending decision trees with boosting. The document concludes by comparing support vector machines and boosted decision trees, and noting they are not the only recent advances in machine learning.
Support vector machines are widely used binary classifiers known for their ability to handle high-dimensional data. They classify data by separating the classes with a hyperplane that maximizes the margin between them; the data points closest to the hyperplane are known as support vectors. The selected decision boundary is thus the one that minimizes the generalization error (by maximizing the margin between the classes).
The document summarizes statistical pattern recognition techniques. It is divided into 9 sections that cover topics like dimensionality reduction, classifiers, classifier combination, and unsupervised classification. The goal of pattern recognition is supervised or unsupervised classification of patterns based on features. Dimensionality reduction aims to reduce the number of features to address the curse of dimensionality when samples are limited. Multiple classifiers can be combined through techniques like stacking, bagging, and boosting. Unsupervised classification uses clustering algorithms to construct decision boundaries without labeled training data.
A simple numerical procedure for estimating nonlinear uncertainty propagation (ISA Interchange)
This document presents a numerical method for estimating nonlinear uncertainty propagation. The method approximates the nonlinear function with piecewise linear segments. It then calculates the probability density function of the dependent variable based on the transformations of the linear segments. For functions of a normally distributed independent variable, the mean and confidence intervals of the dependent variable can be calculated using only the error function. A simple example of applying this method to a parabolic function is presented to demonstrate the technique.
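A crude numerical sketch in the same spirit (not the paper's exact algorithm): split the x-axis into many short segments, treat the function as linear on each, and weight each segment by its normal probability mass computed from the error function alone. The parabolic test function and the parameters mu = 1, sigma = 0.5 are chosen for illustration:

```python
import math

def normal_cdf(x, mu, sigma):
    # Normal CDF written in terms of the error function.
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def propagate_mean(f, mu, sigma, n_segments=2000, span=8.0):
    """Approximate E[f(X)] for X ~ N(mu, sigma) by splitting the x-axis into
    small segments, treating f as linear on each, and weighting each segment's
    midpoint value by its normal probability mass (erf only, no sampling)."""
    lo, hi = mu - span * sigma, mu + span * sigma
    total = 0.0
    for i in range(n_segments):
        a = lo + (hi - lo) * i / n_segments
        b = lo + (hi - lo) * (i + 1) / n_segments
        mass = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
        total += mass * f(0.5 * (a + b))  # midpoint as the segment representative
    return total

# Parabolic example: y = x^2 with X ~ N(1, 0.5).
mu, sigma = 1.0, 0.5
approx = propagate_mean(lambda x: x * x, mu, sigma)
exact = mu**2 + sigma**2  # analytic E[X^2] for a normal variable
print(approx, exact)
```

For y = x^2 the approximation can be checked against the closed-form moment E[X^2] = mu^2 + sigma^2, which is the point of using a parabola as the demonstration case.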
Unsupervised learning uses unlabeled data. It is applied to problems such as clustering, dimensionality reduction, and association rule learning.
In the first section we discuss several clustering methods: k-means, mean shift, Gaussian mixture models, and affinity propagation. We also define and use silhouette scores, which help select the most appropriate number of clusters for the data.
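Assuming scikit-learn, selecting the number of clusters by silhouette score can be sketched as follows; synthetic blob data stands in for whatever the notebook actually uses:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with 3 well-separated clusters.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.7, random_state=0)

# Fit k-means for several candidate k and record each partition's silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The silhouette score rewards tight, well-separated clusters, so it should peak at the true number of blobs here.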
[Notebook](https://colab.research.google.com/drive/1g4hcSfiO-TW35JbiQ_kGQAsgMZDPkp7L)
Tutorial on Markov Random Fields (MRFs) for Computer Vision Applications (Anmol Dwivedi)
The goal of this mini-project is to implement a pairwise binary label-observation Markov Random Field model for bi-level image segmentation. Specifically, two inference algorithms, Iterated Conditional Modes (ICM) and Gibbs sampling, will be implemented to perform the segmentation.
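A minimal numpy sketch of ICM for a binary pairwise MRF follows; this uses a generic Ising-style energy with made-up weights and a synthetic two-region image, not necessarily the project's exact model:

```python
import numpy as np

def icm(noisy, beta=2.0, n_iters=5):
    """Iterated Conditional Modes for a binary (-1/+1) pairwise MRF.
    Each pixel takes the label maximizing its local conditional score:
    agreement with the observation plus beta times agreement with 4-neighbors."""
    labels = noisy.copy()
    H, W = labels.shape
    for _ in range(n_iters):
        for i in range(H):
            for j in range(W):
                neigh = 0
                if i > 0:
                    neigh += labels[i - 1, j]
                if i < H - 1:
                    neigh += labels[i + 1, j]
                if j > 0:
                    neigh += labels[i, j - 1]
                if j < W - 1:
                    neigh += labels[i, j + 1]
                # Score of label s is s*(observation + beta*neighbor_sum);
                # pick the sign that makes it non-negative.
                labels[i, j] = 1 if (noisy[i, j] + beta * neigh) >= 0 else -1
    return labels

# Ground truth: left half -1, right half +1; flip 10% of pixels as noise.
rng = np.random.default_rng(0)
truth = np.ones((40, 40), dtype=int)
truth[:, :20] = -1
flip = rng.random(truth.shape) < 0.10
noisy = np.where(flip, -truth, truth)

restored = icm(noisy)
print((noisy != truth).mean(), (restored != truth).mean())
```

Because ICM makes a greedy local update at every pixel, it converges quickly but only to a local optimum; Gibbs sampling instead draws each label from its conditional distribution, which is what lets it escape such optima at the cost of more iterations.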
The document discusses applying random distortion testing (RDT) in a spectral clustering context. RDT is a framework for guaranteeing a false alarm probability threshold in detecting distorted data using threshold-based tests. The document introduces RDT and spectral clustering concepts. It then proposes using the p-value from RDT as the similarity function or kernel in spectral clustering, to handle disturbed data. Experiments are conducted to compare the partitioning performance of the RDT p-value kernel to the Gaussian kernel.
The document discusses the multiple linear regression model and ordinary least squares (OLS) estimation. It presents the econometric model, where a dependent variable is modeled as a linear function of explanatory variables, plus an error term. It describes the assumptions of the linear regression model, including linearity, independence of observations, exogeneity of regressors, and properties of the error term. It then discusses OLS estimation, goodness of fit, hypothesis testing, confidence intervals, and asymptotic properties of the OLS estimator.
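The OLS mechanics can be illustrated directly with numpy; the design matrix, coefficients, and noise level below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)           # linear model + error term

# OLS estimator: beta_hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Goodness of fit: R^2 = 1 - SSR/SST
resid = y - X @ beta_hat
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
print(beta_hat, r2)
```

With exogenous regressors and well-behaved errors, the estimates land close to the true coefficients, and their accuracy improves as n grows, which is the consistency property the document's asymptotic section refers to.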
Principal Component Analysis For Novelty Detection (Jordan McBain)
This document summarizes a journal article that proposes using principal component analysis (PCA) for novelty detection in condition monitoring applications. It describes how PCA can be used to reduce the dimensionality of feature spaces while retaining most of the variation in the data. The authors modify the standard PCA technique to maximize the difference between the spread of normal data and the spread of outlier data from the mean of the normal data. They validate the approach on artificial and machinery vibration data and show it can effectively distinguish outliers. Future work could involve extending the technique to non-linear data using kernel methods.
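The standard PCA novelty score (reconstruction error from the retained subspace, not the authors' modified criterion) can be sketched with numpy on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data: 2D structure embedded in 5 dimensions plus small noise.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 5))

# Fit PCA on normal data only: center, then take the top right singular vectors.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]  # retain the 2 directions carrying most of the variance

def novelty_score(x):
    """Reconstruction error: distance from x to the principal subspace."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.linalg.norm(centered - recon)

normal_point = latent[0] @ mixing       # lies on the learned structure
outlier = rng.normal(size=5) * 5.0      # generic off-subspace point
print(novelty_score(normal_point), novelty_score(outlier))
```

Points resembling the training data reconstruct almost perfectly from the retained components, while outliers leave a large residual, which is what makes the reconstruction error usable as a novelty score in condition monitoring.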
Covariance matrices are central to many adaptive filtering and optimisation problems. In practice, they have to be estimated from a finite number of samples; on this, I will review some known results from spectrum estimation and multiple-input multiple-output communications systems, and how properties that are assumed to be inherent in covariance and power spectral densities can easily be lost in the estimation process. I will discuss new results on space-time covariance estimation, and how the estimation from finite sample sets will impact on factorisations such as the eigenvalue decomposition, which is often key to solving the introductory optimisation problems. The purpose of the presentation is to give you some insight into estimating statistics as well as to provide a glimpse on classical signal processing challenges such as the separation of sources from a mixture of signals.
Analytical study of feature extraction techniques in opinion mining (csandit)
Although opinion mining is at a nascent stage of development, the ground is set for dense growth of research in the field. One of the important activities of opinion mining is to extract people's opinions based on the characteristics of the object under study. Feature extraction in opinion mining can be done in various ways, such as clustering and support vector machines. This paper is an attempt to appraise the various techniques of feature extraction: the first part discusses the techniques and the second part makes a detailed appraisal of the major ones.
This chapter summary discusses discrete probability distributions. It distinguishes between discrete and continuous random variables and distributions. It describes how to determine the mean and variance of discrete distributions. It introduces some common discrete distributions like the binomial and Poisson distributions. For the binomial distribution, it explains how to calculate the probability of a given number of successes in a given number of trials. For the Poisson distribution, it provides the probability formula and explains that it models independent events occurring continuously over an interval.
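Both pmfs follow directly from their formulas; a standard-library sketch with n, p, and lambda chosen arbitrarily:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Binomial mean is n*p (variance n*p*(1-p)); Poisson mean equals its variance, lam.
n, p = 10, 0.3
mean_binom = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(binomial_pmf(3, n, p), mean_binom)  # mean should come out to n*p = 3.0

# For large n and small p, Binomial(n, p) is well approximated by Poisson(n*p).
print(binomial_pmf(2, 1000, 0.002), poisson_pmf(2, 2.0))
```

The last line illustrates the classic limit linking the two distributions: many independent low-probability trials behave like events arriving at a constant average rate.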
Inference & Learning in Linear Chain Conditional Random Fields (CRFs) (Anmol Dwivedi)
This mini-project considers inference and learning in linear chain CRFs, applied to handwritten word recognition. Handwritten word recognition is a task many have explored with different machine learning methods; characters can be evaluated individually or as a whole word to account for context between characters. Here, linear chain CRF models are used to exploit that context and improve word recognition accuracy.
This document outlines the key concepts that will be covered in Lecture 2 on Bayesian modeling. It introduces the likelihood function and how it can be used to determine the most likely parameter values given observed data. It provides examples of applying Bayesian modeling to proportions, normal distributions, linear regression with one predictor, and linear regression with multiple predictors. The lecture aims to give students a basic understanding of how Bayesian analysis works and prepare them for fitting linear mixed models.
This document summarizes an article that proposes three new secret key sharing schemes based on the Chinese Remainder Theorem (CRT). It begins by providing background on CRT and secret sharing schemes. It then presents the main result, which is three theorems and algorithms for authenticated key distribution using a given set of primes. The first theorem describes how to construct three secret shares from a secret S such that combining the shares recovers S. It proves this using a lemma about finding integers that satisfy a system of congruences. The next sections provide examples and algorithms to motivate the secret sharing schemes. In summary, the document presents new methods for secret sharing based on number theory and the CRT.
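The CRT mechanics behind such schemes can be sketched in a few lines; the moduli and secret below are toy values, and this shows only the basic share-and-reconstruct step, not the article's full authenticated scheme:

```python
import math

# Pairwise coprime moduli (hypothetical choices) and a secret below their product.
moduli = [97, 101, 103]
secret = 123456
assert secret < math.prod(moduli)

# Each share is the pair (secret mod m_i, m_i).
shares = [(secret % m, m) for m in moduli]

def crt_reconstruct(shares):
    """Combine residue shares via the Chinese Remainder Theorem."""
    M = math.prod(m for _, m in shares)
    total = 0
    for r, m in shares:
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(x, -1, m) is the modular inverse
    return total % M

print(crt_reconstruct(shares))  # recovers 123456
```

Because the moduli are pairwise coprime, the system of congruences has a unique solution modulo their product, which is exactly the lemma-style argument the article's first theorem builds on.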
Big Data analysis involves building predictive models from high-dimensional data using techniques like variable selection, cross-validation, and regularization to avoid overfitting. The document discusses an example analyzing web browsing data to predict online spending, highlighting challenges with large numbers of variables. It also covers summarizing high-dimensional data through dimension reduction and model building for prediction versus causal inference.
The document introduces factor of safety and probability of failure in engineering design. It discusses using sensitivity studies to systematically vary parameters over their credible ranges to determine the influence on factor of safety. This allows a more rational assessment of design risks than relying on a single calculated factor of safety. The document then provides an introduction to probability theory and statistical concepts used in probabilistic analyses, including random variables, probability distributions, sampling techniques, and calculating the probability of failure for a slope design example.
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets (Derek Kane)
The document discusses various regression techniques including ridge regression, lasso regression, and elastic net regression. It begins with an overview of advancements in regression analysis since the late 1800s/early 1900s enabled by increased computing power. Modern high-dimensional data often has many independent variables, requiring improved regression methods. The document then provides technical explanations and formulas for ordinary least squares regression, ridge regression, lasso regression, and their properties such as bias-variance tradeoffs. It explains how ridge and lasso regression address limitations of OLS through regularization that shrinks coefficients.
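The shrinkage effect is easy to see from the closed-form ridge estimator; a numpy sketch on synthetic data (the dimensions and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.normal(size=(n, d))
beta = rng.normal(size=d)
y = X @ beta + rng.normal(scale=0.5, size=n)

def ridge(X, y, lam):
    """Closed-form ridge estimator: (X'X + lam*I)^{-1} X'y (lam = 0 gives OLS)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Increasing lambda shrinks the coefficient vector toward zero,
# trading a little bias for a reduction in variance.
norms = [np.linalg.norm(ridge(X, y, lam)) for lam in (0.0, 1.0, 10.0, 100.0)]
print(norms)
```

The lasso has no such closed form because of the non-differentiable L1 penalty; it is typically fit by coordinate descent, and unlike ridge it can shrink coefficients exactly to zero.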
This document provides a practical guide for using support vector machines (SVMs) for classification tasks. It recommends beginners follow a simple procedure: 1) preprocess data by converting categorical features to numeric and scaling attributes, 2) use a radial basis function kernel, 3) perform cross-validation to select optimal values for hyperparameters C and γ, and 4) train the full model on the training set using the best hyperparameters. The guide explains why this procedure often provides reasonable results for novices and illustrates it using examples of real-world classification problems.
This document provides an overview of kernel machines and the kernel trick in machine learning. It discusses how the kernel trick allows projecting data into a higher dimensional space to make it linearly separable. It describes using kernels like polynomial kernels in the dual formulation to calculate dot products without explicitly performing the projection. The kernel trick avoids having to compute in the higher dimensional space, improving computational efficiency.
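The degree-2 polynomial kernel makes the equivalence concrete: the kernel value equals a dot product in a 6-dimensional feature space that is never explicitly constructed. A small numpy check:

```python
import numpy as np

def phi(x):
    """Explicit degree-2 polynomial feature map for 2D input:
    (x1^2, x2^2, sqrt(2)*x1*x2, sqrt(2)*x1, sqrt(2)*x2, 1)."""
    x1, x2 = x
    return np.array([x1**2, x2**2, np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2, 1.0])

def poly_kernel(x, z):
    """k(x, z) = (x.z + 1)^2, computed without leaving the 2D input space."""
    return (np.dot(x, z) + 1.0) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# The two numbers agree: the kernel is a dot product in the mapped space.
print(poly_kernel(x, z), np.dot(phi(x), phi(z)))
```

For a degree-d kernel on p-dimensional inputs the explicit feature space has O(p^d) coordinates, so computing the kernel directly in the input space is where the efficiency gain comes from.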
This document proposes using machine learning techniques to predict COVID-19 infections based on chest x-ray images. Specifically, it involves using discrete wavelet transform to extract space-frequency features from chest x-rays, reducing the dimensionality of features using Shannon entropy, and then training standard machine learning classifiers like logistic regression, support vector machine, decision tree, and convolutional neural network on the extracted features to classify images as COVID-19 positive or negative. The document provides background on the proposed techniques of discrete wavelet transform, entropy, and various machine learning models.
1) The document discusses Vapnik's approach to statistical modeling and machine learning, focusing on the concepts of generalization, overfitting, and VC dimension.
2) It introduces the idea of Structural Risk Minimization (SRM), which aims to control a model's complexity through the VC dimension in order to maximize generalization. SRM selects the model that minimizes the total risk of empirical risk and confidence interval.
3) As an example, it describes how SRM can be implemented in an industrial data mining context to optimize variables, model structure, and hyperparameters for tasks like classification and regression.
Support Vector Machine topic of machine learning.pptxCodingChamp1
Support Vector Machines (SVM) find the optimal separating hyperplane that maximizes the margin between two classes of data points. The hyperplane is chosen such that it maximizes the distance from itself to the nearest data points of each class. When data is not linearly separable, the kernel trick can be used to project the data into a higher dimensional space where it may be linearly separable. Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid kernels. Soft margin SVMs introduce slack variables to allow some misclassification and better handle non-separable data. The C parameter controls the tradeoff between margin maximization and misclassification.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
And Then There Are Algorithms - Danilo Poccia - Codemotion Rome 2018Codemotion
In machine learning, training large models on a massive amount of data usually improves results. Our customers report, however, that training such models and deploying them is either operationally prohibitive or outright impossible for them. We created a collection of machine learning algorithms that scale to any amount of data, including k-means clustering for data segmentation, factorization machines for recommendations, time-series forecasting, linear regression, topic modeling, and image classification. This talk will discuss those algorithms, understand where and how they can be used.
In this chapter, our goal is to introduce the foundational principles of supervised learning. As we progress, we place particular emphasis on both regression and classification techniques, offering learners a more comprehensive perspective on the practical application of these methodologies in real-world scenarios. By the end of this chapter, learners will not only possess a robust understanding of the core principles but will also be armed with valuable insights into the tangible applications of supervised learning. This knowledge empowers them to skillfully navigate and leverage the full potential of this influential paradigm within the vast expanse of machine learning.
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSrajalakshmi5921
This document discusses support vector machines (SVM), a supervised machine learning algorithm used for classification and regression. It explains that SVM finds the optimal boundary, known as a hyperplane, that separates classes with the maximum margin. When data is not linearly separable, kernel functions can transform the data into a higher-dimensional space to make it separable. The document discusses SVM for both linearly separable and non-separable data, kernel functions, hyperparameters, and approaches for multiclass classification like one-vs-one and one-vs-all.
This document provides an overview of machine learning concepts related to overfitting and model selection. It discusses overfitting in k-nearest neighbors and regression models. It introduces bias-variance decomposition and structural risk minimization. Methods for controlling overfitting like cross-validation, regularization, feature selection and model selection are covered. The concepts of consistency, model convergence speed, and strategies for controlling generalization capacity are explained.
This document discusses different types of regression analysis techniques including linear regression, polynomial regression, support vector regression, decision tree regression, ridge regression, lasso regression, and logistic regression. Linear regression finds the relationship between a continuous dependent variable and one or more independent variables. Polynomial regression handles nonlinear relationships through higher-order terms. Support vector regression and decision tree regression can handle both linear and nonlinear data. Ridge and lasso regression are regularization techniques used to prevent overfitting. Logistic regression is for classification rather than regression problems.
A Fuzzy Interactive BI-objective Model for SVM to Identify the Best Compromis...ijfls
This document summarizes a research paper that proposes a fuzzy bi-objective support vector machine (SVM) model to identify infected COVID-19 patients. The model uses SVM classification with two objectives - maximizing margin between classes and minimizing misclassification errors. An α-cut transforms the fuzzy model into a classical bi-objective problem solved using weighting methods. This generates multiple efficient solutions. An interactive process then identifies the best compromise based on minimizing the number of support vectors in each class. The model constructs a utility function to measure COVID-19 infection levels based on the SVM classification.
A FUZZY INTERACTIVE BI-OBJECTIVE MODEL FOR SVM TO IDENTIFY THE BEST COMPROMIS...ijfls
A support vector machine (SVM) learns the decision surface from two different classes of the input points. In several applications, some of the input points are misclassified and each is not fully allocated to either of these two groups. In this paper a bi-objective quadratic programming model with fuzzy parameters is utilized and different feature quality measures are optimized simultaneously. An α-cut is defined to transform the fuzzy model to a family of classical bi-objective quadratic programming problems. The weighting method is used to optimize each of these problems. For the proposed fuzzy bi-objective quadratic programming model, a major contribution will be added by obtaining different effective support vectors due to changes in weighting values. The experimental results, show the effectiveness of the α-cut with the weighting parameters on reducing the misclassification between two classes of the input points. An interactive procedure will be added to identify the best compromise solution from the generated efficient solutions. The main contribution of this paper includes constructing a utility function for measuring the degree of infection with coronavirus disease (COVID-19).
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...gerogepatton
A support vector machine (SVM) learns the decision surface from two different classes of the input points, there are misclassifications in some of the input points in several applications. In this paper a bi-objective quadratic programming model is utilized and different feature quality measures are optimized simultaneously using the weighting method for solving our bi-objective quadratic programming problem. An important contribution will be added for the proposed bi-objective quadratic programming model by getting different efficient support vectors due to changing the weighting values. The numerical examples, give evidence of the effectiveness of the weighting parameters on reducing the misclassification between two classes of the input points. An interactive procedure will be added to identify the best compromise solution from the generated efficient solutions.
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...ijaia
A support vector machine (SVM) learns the decision surface from two different classes of the input points, there are misclassifications in some of the input points in several applications. In this paper a bi-objective quadratic programming model is utilized and different feature quality measures are optimized simultaneously using the weighting method for solving our bi-objective quadratic programming problem. An important contribution will be added for the proposed bi-objective quadratic programming model by getting different efficient support vectors due to changing the weighting values. The numerical examples, give evidence of the effectiveness of the weighting parameters on reducing the misclassification between two classes of the input points. An interactive procedure will be added to identify the best compromise solution from the generated efficient solutions.
This document provides a tutorial on support vector machines (SVM). It begins with an abstract briefly introducing SVM and discussing sources used to compile the tutorial. The introduction defines machine learning and SVM, noting SVM was introduced in 1992 and can be used for classification and regression. It assumes familiarity with linear algebra, analysis, neural networks, and artificial intelligence. The tutorial then discusses statistical learning theory, learning and generalization, and introduces SVM by explaining why it was developed due to some limitations of neural networks for certain tasks. It presents illustrations of data classification and the maximum margin classifier concept in SVM.
The document provides a tutorial on support vector machines (SVM). It begins with an abstract briefly introducing SVM and discussing how the tutorial was compiled from various sources. It then provides an introduction on machine learning and how SVM relates. The core concepts of SVM are explained, including statistical learning theory, maximizing margins, soft-margin classifiers, and the kernel trick. Common kernel functions for SVM are also listed. The tutorial is intended to give a brief overview of SVM for readers familiar with linear algebra, analysis, neural networks, and artificial intelligence concepts.
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
This document discusses evaluating various classification algorithms to address class imbalance problems using the bank marketing dataset in WEKA. It first introduces data mining and classification algorithms like decision trees, naive Bayes, neural networks, support vector machines, logistic regression and random forests. It then discusses the class imbalance problem that occurs when one class is underrepresented. To address this, it explores sampling techniques like random under-sampling of the majority class, random over-sampling of the minority class, and SMOTE. It uses these techniques on the bank marketing dataset to evaluate the algorithms based on metrics like precision, recall, F1-score, ROC and AUCPR for the minority class.
Data Science - Part IX - Support Vector MachineDerek Kane
This lecture provides an overview of Support Vector Machines in a more relatable and accessible manner. We will go through some methods of calibration and diagnostics of SVM and then apply the technique to accurately detect breast cancer within a dataset.
Este documento analiza el modelo de negocio de YouTube. Explica que YouTube y otros sitios de video online representan un nuevo modelo de negocio para contenidos audiovisuales debido al cambio en los hábitos de consumo causado por las nuevas tecnologías. Describe cómo YouTube aprovecha la participación de los usuarios para mejorar continuamente y atraer una audiencia diferente a la de los medios tradicionales.
The defense was successful in portraying Michael Jackson favorably to the jury in several ways:
1) They dressed Jackson in ornate costumes that conveyed images of purity, innocence, and humility.
2) Jackson was shown entering the courtroom as if on a red carpet, emphasizing his celebrity status.
3) Jackson appeared vulnerable, childlike, and in declining health during the trial, eliciting sympathy from jurors.
4) Defense attorney Tom Mesereau effectively presented a coherent narrative of Jackson as a victim and portrayed Neverland as a place of refuge, undermining the prosecution's arguments.
Michael Jackson was born in 1958 in Gary, Indiana and rose to fame in the 1960s as the lead singer of The Jackson 5, topping music charts in the 1970s. As a solo artist in the 1980s, his album Thriller broke music records. In the 1990s and 2000s, Jackson faced several legal issues related to child abuse allegations while continuing to release music. He married Lisa Marie Presley and Debbie Rowe and had two children before his death in 2009.
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
This document appears to be a list of popular books from various authors. It includes over 150 book titles across many genres such as fiction, non-fiction, memoirs, and novels. The books cover a wide range of topics from politics to cooking to autobiographies.
The prosecution lost the Michael Jackson trial due to several key mistakes and weaknesses in their case:
1) The lead prosecutor, Thomas Sneddon, was too personally invested in the case against Jackson, having pursued him for over a decade without success.
2) Sneddon's opening statement was disorganized and weak, failing to effectively outline the prosecution's case.
3) The accuser's mother was not credible and damaged the prosecution's case through her erratic testimony, history of lies and con artist behavior.
4) Many prosecution witnesses were not credible due to prior lawsuits against Jackson, debts owed to him, or having been fired by him. Several witnesses even took the Fifth Amendment.
Here are three examples of public relations from around the world:
1. The UK government's "Be Clear on Cancer" campaign which aims to raise awareness of cancer symptoms and encourage early diagnosis.
2. Samsung's global brand marketing and sponsorship activities which aim to increase brand awareness and favorability of Samsung products worldwide.
3. The Brazilian government's efforts to improve its international image and relations with other countries through strategic communication and diplomacy.
The three most important functions of public relations are:
1. Media relations because the media is how most organizations reach their key audiences. Strong media relationships are crucial.
2. Writing, because written communication is at the core of public relations and how most information is
Michael Jackson Please Wait... provides biographical information about Michael Jackson including his birthdate, birthplace, parents, height, interests, idols, favorite foods, films, and more. It discusses his background, career highlights including influential albums like Thriller, and films he appeared in such as The Wiz and Moonwalker. The document contains photos and details about Jackson's life and illustrious music career.
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
The document discusses the process of manufacturing celebrity and its negative byproducts. It argues that celebrities are rarely the best in their individual pursuits like singing, dancing, etc. but become famous due to being products of a system controlled by wealthy elites. This system stifles opportunities for worthy artists and creates feudalism. The document also asserts that manufactured celebrities should not be viewed as role models due to behaviors like drug abuse and narcissism that result from the celebrity-making process.
Michael Jackson was a child star who rose to fame with the Jackson 5 in the late 1960s and early 1970s. As a solo artist in the 1970s and 1980s, he had immense commercial success with albums like Off the Wall, Thriller, and Bad, which featured hit singles and groundbreaking music videos. However, his career and public image were plagued by controversies related to allegations of child sexual abuse in the 1990s and 2000s. He continued recording and performing but faced ongoing media scrutiny into his private life until his death in 2009.
Social Networks: Twitter Facebook SL - Slide 1butest
The document discusses using social networking tools like Twitter and Facebook in K-12 education. Twitter allows students and teachers to share short updates and can be used to give parents a window into classroom activities. Facebook allows targeted advertising that could be used to promote educational activities. Both tools could help facilitate communication between schools and communities if used properly while managing privacy and security concerns.
Facebook has over 300 million active users who log on daily, and allows brands to create public profile pages to interact with users. Pages are for brands and organizations only, while groups can be made by any user about any topic. Pages do not show admin names and have no limits on fans, while groups display admin names and are limited to 5,000 members. Content on pages should aim to provoke action from subscribers and establish a regular posting schedule using a conversational tone.
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
Hare Chevrolet is a car dealership located in Noblesville, Indiana that has successfully used social media platforms like Twitter, Facebook, and YouTube to create a positive brand image. They invest significant time interacting directly with customers online to foster a sense of community rather than overtly advertising. As a result, Hare Chevrolet has built a large, engaged audience on social media and serves as a model for how brands can use online presences strategically.
Welcome to the Dougherty County Public Library's Facebook and ...butest
This document provides instructions for signing up for Facebook and Twitter accounts. It outlines the sign up process for both platforms, including filling out forms with name, email, password and other details. It describes how the platforms will then search for friends and suggest people to connect with. It also explains how to search for and follow the Dougherty County Public Library page on both Facebook and Twitter once signed up. The document concludes by thanking participants and providing a contact for any additional questions.
Paragon Software announces the release of Paragon NTFS for Mac OS X 8.0, which provides full read and write access to NTFS partitions on Macs. It is the fastest NTFS driver on the market, achieving speeds comparable to native Mac file systems. Paragon NTFS for Mac 8.0 fully supports the latest Mac OS X Snow Leopard operating system in 64-bit mode and allows easy transfer of files between Windows and Mac partitions without additional hardware or software.
This document provides compatibility information for Olympus digital products used with Macintosh OS X. It lists various digital cameras, photo printers, voice recorders, and accessories along with their connection type and any notes on compatibility. Some products require booting into OS 9.1 for software compatibility or do not support devices that need a serial port. Drivers and software are available for download from Olympus and other websites for many products to enable use with OS X.
To use printers managed by the university's Information Technology Services (ITS), students and faculty must install the ITS Remote Printing software on their Mac OS X computer. This allows them to add network printers, log in with their ITS account credentials, and print documents while being charged per page to funds in their pre-paid ITS account. The document provides step-by-step instructions for installing the software, adding a network printer, and printing to that printer from any internet connection on or off campus. It also explains the pay-in-advance printing payment system and how to check printing charges.
The document provides an overview of the Mac OS X user interface for beginners, including descriptions of the desktop, login screen, desktop elements like the dock and hard disk, and how to perform common tasks like opening files and folders. It also addresses frequently asked questions for Windows users switching to Mac OS X, such as where documents are stored, how to save or find documents, and what the equivalent of the C: drive is in Mac OS X. The document concludes with sections on file management tasks like creating and deleting folders, organizing files within applications, using Spotlight search, and an overview of the Dashboard feature.
This document provides a checklist for securing Mac OS X version 10.5, focusing on hardening the operating system, securing user accounts and administrator accounts, enabling file encryption and permissions, implementing intrusion detection, and maintaining password security. It describes the Unix infrastructure and security framework that Mac OS X is built on, leveraging open source software and following the Common Data Security Architecture model. The checklist can be used to audit a system or harden it against security threats.
This document summarizes a course on web design that was piloted in the summer of 2003. The course was a 3 credit course that met 4 times a week for lectures and labs. It covered topics such as XHTML, CSS, JavaScript, Photoshop, and building a basic website. 18 students from various majors enrolled. Student and instructor evaluations found the course to be very successful overall, though some improvements were suggested like ensuring proper software and pairing programming/non-programming students. The document also discusses implications of incorporating web design material into existing computer science curriculums.
Vibration-Based Damage Detection
Using Unsupervised Support Vector Machine
Ching-Huei Tsou1 and John R. Williams2
{tsou, jrw}@mit.edu
Abstract:
Vibration-based damage detection methods can be used to identify hidden damage in structural components. The traditional modal-based system identification paradigm requires a detailed model of the structure, such as a finite element model. This paper describes a novel statistical damage detection approach based on a support vector machine methodology. The proposed approach is computationally efficient even when the number of features is large, and it does not suffer from the local minima problem encountered by artificial neural networks. We build the statistical model through unsupervised learning, avoiding the need for measurements from the damaged structure, which are unrealistic to obtain in many real-world problems. Extracting significant features from raw vibration time series data is crucial to the efficiency and scalability of statistics-based methods; a feature selection algorithm is therefore presented along with the construction of our statistical model. Numerical simulations, including the ASCE benchmark problem, are analyzed to examine the accuracy and scalability of our approach. We show that the proposed approach is able to detect both the occurrence and the location of damage, and that our feature selection scheme can effectively reduce the required dimensionality while retaining high accuracy.
1 Graduate Student, Intelligent Engineering Systems Laboratory (IESL), Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
2 Associate Professor, Director of IESL, Department of Civil and Environmental Engineering and Engineering Systems Division, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
1. Introduction
The process of implementing a damage detection strategy is referred to as structural health monitoring (SHM) and can be categorized into five stages [1]: (1) detecting the existence of damage, (2) locating the damage, (3) identifying the type of damage, (4) determining the severity of the damage, and (5) predicting the remaining service life of the structure. Research has been conducted in this field over the past decade, and detailed literature reviews of vibration-based damage detection methods can be found in [2-4]. The basic reasoning behind all vibration-based damage detection is that the stiffness, mass, or energy dissipation behavior of the structure changes significantly when damage occurs. These changes can be detected by monitoring the dynamic response of the system. Compared to other nondestructive damage detection (NDD) techniques, such as ultrasonic scanning, acoustic emission, and x-ray inspection, the vibration-based method has the advantage of providing a global evaluation of the state of the structure.
Traditional vibration-based damage identification applications rely on detailed finite element models of the undamaged structures, and damage diagnosis is made by comparing the modal responses, such as frequencies and mode shapes, of the model and the potentially damaged structure. These system identification approaches have been shown to be very accurate provided that the models can produce robust and reliable modal estimates and that large amounts of high-quality data are available. These two requirements, however, cannot always be met in the field.
To overcome these difficulties, pattern recognition based approaches have been proposed [5-8]. Instead of building models from the physical properties of the structures, these methods construct statistical models from the vibration response data directly. This reduces the complexity of the modeling process, at the cost of losing the physical meaning of the model. These methods have been shown to be accurate in damage detection and are less sensitive to data quality; however, some problems still remain. For example, methods that use autoregressive models (AR/ARX) [5] may not fit the vibration data well because they give only linear approximations. More complex statistical models are less efficient and offer little control over the generalization bound, i.e., they may fit the historical data perfectly but provide no guarantee on future data. Methods based on artificial neural networks (ANN) [8] often suffer from the local minima problem, cannot be trained efficiently, and do not scale well to large problems. Methods using support vector machines (SVM) have also been proposed [7, 9], with the SVM used to perform supervised, binary classification.
In this paper, we propose using one-class SVM and support vector regression (SVR) to perform unsupervised learning. This does not require training samples from the damaged structure, which are usually unavailable in practice. Training an SVM is mathematically equivalent to solving a convex quadratic programming (QP) problem, which has no local minima; the absence of local minima means it can be trained faster than an ANN. Finally, an SVM with a linear kernel is used to reduce the number of features in our model. This leads to a statistical model that is efficient, accurate, and easy to implement. Numerical simulations are provided to examine the performance and accuracy of this approach.
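As a minimal sketch of the unsupervised scheme (illustrative only, not the authors' implementation), the following code trains scikit-learn's `OneClassSVM` on hypothetical vibration features drawn from the undamaged state alone, then flags measurements whose statistics deviate from that state. The feature values here are synthetic placeholders.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical vibration features; only the undamaged state is used for training
train_undamaged = rng.normal(0.0, 1.0, size=(200, 4))

# New measurements: some from the undamaged state, some with shifted statistics
test_undamaged = rng.normal(0.0, 1.0, size=(20, 4))
test_damaged = rng.normal(4.0, 1.0, size=(20, 4))

# One-class SVM learns the support of the undamaged feature distribution
detector = OneClassSVM(kernel="rbf", gamma=0.1, nu=0.05).fit(train_undamaged)

# predict() returns +1 for inliers (consistent with training data), -1 for outliers
flags_ok = detector.predict(test_undamaged)
flags_damaged = detector.predict(test_damaged)
```

With `nu=0.05`, roughly 5% of the training data may be treated as outliers; samples whose features shift far from the learned support are flagged as potential damage.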
2. Theory of SVM and Its Application in Damage Detection
We propose an SVM-based approach in this paper because of its theoretical advantages over other learning algorithms. SVM has been applied in various pattern recognition fields, and it is not new to introduce SVM into SHM. Nevertheless, SVM itself has evolved considerably over the past few years, and these developments shed new light on its applications in SHM. In this section, we first review the motivation and algorithm of SVM. We then introduce two extensions of SVM that are able to perform unsupervised learning and show how they can be applied in the damage detection scheme.
2.1 Support Vector Machine
The support vector machine was developed by Vapnik et al. [10] based on the structural risk minimization (SRM) principle from statistical learning theory, rather than the empirical risk minimization (ERM) principle used by most other learning algorithms (risk means test error in this context). This fundamental difference allows SVM to select, from a family of functions, the classifier that not only fits the training data well but also provides a bounded generalization error, i.e., better prediction power [11]. Together with kernel techniques, SVM has shown superior performance in both speed and accuracy, and it has outperformed artificial neural networks (ANN) in a wide variety of applications [12]. We introduce the algorithm by discussing the simplest case, a linear classifier trained on separable binary data. Assume we have $l$ training examples,
$$\{\mathbf{x}_i, y_i\}, \quad i = 1, \ldots, l, \quad \text{where } y_i \in \{-1, 1\},\ \mathbf{x}_i \in \mathbb{R}^n$$
The $\mathbf{x}_i$ are often referred to as patterns or inputs, and the $y_i$ are called the labels or outputs of the examples. A linear classifier (a hyperplane in $\mathbb{R}^n$) can be defined as
$$f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = 0$$
where $\mathbf{w}_1$ is a vector normal to the hyperplane and $w_0$ is a scalar constant. We can also define two auxiliary hyperplanes by $f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = \pm 1$. It is easy to show that each of these two parallel hyperplanes has a perpendicular distance to the original hyperplane equal to $1/\|\mathbf{w}_1\|$. This distance is often referred to as the "margin". Because the data is separable, we can always find hyperplanes that separate the training samples perfectly. The solution is obviously not unique, and the SVM algorithm looks for the one that gives the maximum margin. The optimization problem for the above process can be expressed as
$$\text{minimize:} \quad \frac{1}{2}\|\mathbf{w}_1\|^2$$
$$\text{subject to:} \quad y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0) \geq 1$$
To extend to inseparable data, we can introduce slack variables $\xi_i$ to relax the constraints and then add a penalty on the relaxation. The new optimization problem becomes
$$\text{minimize:} \quad \frac{1}{2}\|\mathbf{w}_1\|^2 + C\sum_{i=1}^{l} \xi_i$$
$$\text{subject to:} \quad y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0) \geq 1 - \xi_i \quad \text{and} \quad \xi_i \geq 0$$
Note that at the optimum $\xi_i = \max\left\{1 - y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0),\ 0\right\}$, so the penalty term is the hinge loss on the training examples.
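The role of the penalty constant $C$ can be observed numerically. In the sketch below (an illustration on synthetic overlapping classes, not from the paper), a small $C$ tolerates many margin violations, so a wide margin is chosen and many training points fall inside it, each becoming a support vector.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
# Two heavily overlapping Gaussian classes (linearly inseparable)
X = np.vstack([rng.normal(-1.0, 1.5, (100, 2)), rng.normal(1.0, 1.5, (100, 2))])
y = np.array([-1] * 100 + [1] * 100)

n_sv = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    n_sv[C] = int(clf.n_support_.sum())
# Small C -> wide margin -> many margin violations -> many support vectors
```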
where $C$ is a constant determining the trade-off between our two conflicting goals: maximizing the margin and minimizing the training error. For computational convenience, we can further transform the optimization problem into its dual form using Lagrange multipliers, denoted by $\alpha_i$, and the result becomes
$$\text{maximize:} \quad \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j$$
$$\text{subject to:} \quad \sum_{i=1}^{l} \alpha_i y_i = 0 \quad \text{and} \quad 0 \leq \alpha_i \leq C$$
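The dual constraints can be checked on a fitted model. In scikit-learn (an illustrative sketch, not the authors' code), the `dual_coef_` attribute of `SVC` stores the products $\alpha_i y_i$ for the support vectors, so the equality constraint $\sum_i \alpha_i y_i = 0$ appears as a zero sum, and the box constraint $0 \leq \alpha_i \leq C$ bounds the entries in absolute value by $C$.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)

# dual_coef_ holds alpha_i * y_i for each support vector
alpha_y = clf.dual_coef_.ravel()
sum_constraint = abs(alpha_y.sum())    # equality constraint: should be ~0
box_constraint = np.abs(alpha_y).max() # box constraint: should not exceed C
```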
For every constraint that is not met as a strict equality at the optimum, the corresponding α_i must be zero; this follows from the Karush–Kuhn–Tucker (KKT) conditions of optimization theory. Examples with non-zero α_i are called support vectors, and the classifier is determined by the support vectors alone,
f(x_i) = x_i^T w_1 + w_0 = ∑_{j=1}^{N_SV} α_j y_j x_j^T x_i + w_0
where N SV denotes the total number of support vectors.
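As a concrete illustration, the support-vector expansion above can be evaluated directly. The following sketch (function and variable names are ours, not from the paper) computes f(x) = ∑_j α_j y_j x_j^T x + w_0 for a linear kernel:

```python
import numpy as np

def svm_decision(x, support_vectors, alphas, labels, w0):
    """Evaluate f(x) = sum_j alpha_j * y_j * x_j^T x + w0.

    Only the support vectors (examples with non-zero alpha_j) appear
    in the sum; all other training examples have dropped out.
    """
    x = np.asarray(x, dtype=float)
    sv = np.asarray(support_vectors, dtype=float)
    coef = np.asarray(alphas, dtype=float) * np.asarray(labels, dtype=float)
    return float(coef @ (sv @ x) + w0)

# Toy example: two support vectors straddling the decision boundary x1 = 0.
sv = [[1.0, 0.0], [-1.0, 0.0]]
alphas, labels, w0 = [1.0, 1.0], [1, -1], 0.0
print(svm_decision([2.0, 0.0], sv, alphas, labels, w0))   # positive side
print(svm_decision([-2.0, 0.0], sv, alphas, labels, w0))  # negative side
```

The sign of f(x) gives the predicted class, and |f(x)| grows with the distance from the separating hyperplane.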
To extend the algorithm from linear to nonlinear, we define a mapping φ : R^n → H which maps x_i from its original Euclidean space to a reproducing kernel Hilbert space (RKHS). The original space is often referred to as the sample space, and the RKHS is called the feature space. Without losing generality in our context, we can simply think of an RKHS as a generalization of Euclidean space that may have infinitely many dimensions. By replacing x_i in the optimization problem with φ(x_i) and performing linear classification in the corresponding RKHS, the solution becomes,
f(x_i) = φ(x_i)^T w_1 + w_0
= ∑_{j=1}^{N_SV} α_j y_j φ(x_i)^T φ(x_j) + w_0 = ∑_{j=1}^{N_SV} α_j y_j K(x_i, x_j) + w_0
The mapping φ is often called the feature map, and the dot product φ(x_i)^T φ(x_j) = K(x_i, x_j)
is known as the kernel. Popular choices include the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. When a nonlinear kernel is used, the discriminant function above is no longer linear in the original Euclidean space.
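For reference, one common parameterization of the RBF kernel can be sketched as follows (the function name and the exp(−d²/2σ²) convention are our choices; a gamma-factor parameterization is equally common):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma2=20.0):
    """RBF kernel K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    sigma2 plays the role of the sigma^2 parameter used in the
    numerical studies later in the paper.
    """
    d2 = float(np.sum((np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)) ** 2))
    return float(np.exp(-d2 / (2.0 * sigma2)))

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical points give 1.0
```

Note the kernel is symmetric and decays toward zero as the two points move apart, which is why the RBF kernel is translation invariant.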
Because solving an SVM corresponds to solving a convex QP problem, it has no local minima and can be trained faster than algorithms that do, such as ANNs. It can be shown that N_SV << l for easier problems, i.e., problems with small generalization errors. This makes the solution sparse, which means the optimization problem can be solved efficiently. Also, through structural risk minimization (SRM) and the VC dimension [10], SVM provides a bounded generalization error and a systematic way to select the complexity of the solution function, which effectively controls the problem of overfitting. Detailed discussion of these properties is beyond the scope of this paper and can be found in many recent statistical learning textbooks [13, 14].
2.2 One-Class Support Vector Machine
SVM is originally a supervised, batch learning algorithm, and has been applied in the SHM field [7, 9] to perform binary classification tasks. A major challenge is that data measured from damaged structures are often not available in practical situations, so unsupervised learning methods are more desirable [15]. Similar needs also occur in other domains, and researchers in the machine learning and pattern recognition communities have extended the idea of SVM to unsupervised learning, often referred to as one-class SVM [16].
Instead of finding a hyperplane that maximizes the margin between two classes in the RKHS, one-class SVM maximizes the distance from the hyperplane to the origin. The corresponding optimization problem becomes,
minimize: (1/2)||w_1||^2 + (1/(νl)) ∑_i ξ_i − ρ
subject to: φ(x_i)^T w_1 ≥ ρ − ξ_i and ξ_i ≥ 0
where ν ∈ (0, 1] is a parameter similar to the C introduced earlier, and ρ is an offset that is calculated automatically during the optimization. If a translation-invariant kernel is used (e.g., the RBF kernel), the goal of one-class SVM can also be thought of as finding a small sphere that contains most of the training samples.
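To give a feel for the novelty-detection workflow, the sketch below is a deliberately simplified Parzen-style stand-in, not the one-class SVM QP itself: it scores a point by its mean RBF similarity to the training set and sets the outlier threshold so that roughly a fraction ν of the training samples falls below it. All names and the scoring rule are our own illustrative assumptions.

```python
import numpy as np

def fit_novelty_detector(X_train, sigma2=20.0, nu=0.1):
    """Simplified stand-in for one-class SVM (illustrative, not the QP).

    A point is scored by its mean RBF similarity to the training set
    (a Parzen-style density proxy); the outlier threshold is the
    nu-quantile of the training scores, so ~nu of the training data
    would be flagged, mimicking the role of the nu parameter.
    """
    X_train = np.asarray(X_train, dtype=float)

    def score(x):
        d2 = np.sum((X_train - np.asarray(x, dtype=float)) ** 2, axis=1)
        return float(np.mean(np.exp(-d2 / (2.0 * sigma2))))

    train_scores = np.array([score(x) for x in X_train])
    threshold = float(np.quantile(train_scores, nu))
    return score, threshold

# Points clustered around the origin; a far-away point should be an outlier.
score, thr = fit_novelty_detector([[1, 0], [-1, 0], [0, 1], [0, -1]])
print(score([0.0, 0.0]) > thr)    # inlier near the cluster center
print(score([10.0, 10.0]) > thr)  # outlier far from the cluster
```

In the actual one-class SVM the score is the signed distance φ(x)^T w_1 − ρ obtained from the dual solution, but the inlier/outlier decision rule has the same shape.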
2.3 Support Vector Regression
SVM was first developed for classification, and the labels y_i represent a finite number of possible categories. The algorithm can be extended to estimate real-valued functions by allowing y_i to take real values and defining a suitable loss function [17]. The following loss function,
|f(x_i) − y_i|_ε = max{|f(x_i) − y_i| − ε, 0}
known as the ε-insensitive loss function, pays no penalty for points within the ε range, and this carries the sparseness property over from SVM to SVR. Again, the estimated function can be expressed in the kernel expansion form above, and the goal now is to minimize,
(1/2)||w_1||^2 + C ∑_{i=1}^{l} max{|f(x_i) − y_i| − ε, 0}
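The ε-insensitive loss is simple enough to state in a line of code (a sketch; the function name is ours):

```python
def eps_insensitive_loss(pred, y, eps=0.1):
    """max(|f(x_i) - y_i| - eps, 0): zero penalty inside the eps-tube."""
    return max(abs(pred - y) - eps, 0.0)

print(eps_insensitive_loss(1.05, 1.0))  # inside the tube -> 0.0
```

Points whose residual falls inside the tube contribute nothing to the objective, which is exactly why only the points on or outside the tube become support vectors.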
The basic ideas of SVM, one-class SVM and SVR are summarized in Table 1 and Figure 1. For simplicity, the discriminant functions of SVM and one-class SVM are drawn as linear functions. As mentioned, when nonlinear kernels are used, the functions are by no means linear in the sample space.
                Maximize                                     Penalty
SVM             distance between two hyperplanes             misclassified samples and samples within the margin
One-class SVM   distance between the hyperplane and origin   misclassified samples
SVR             smoothness of the function                   samples outside the ε-tube
Table 1. Comparison of SVM, one-class SVM and SVR
Figure 1. Geometric interpretation of SVM, one-class SVM and SVR in 2D
2.4 Damage Detection Using SVM
Vibration-based damage detection approaches are grounded in the assumption that the dynamic response of the system changes significantly when damage occurs. We propose using SVM for the detection, either through novelty detection or through regression; before the data can be fed into SVM, it is essential to have a reasonable representation of the dynamic response.
A time series is usually modeled by splitting it into a series of windows, with the value at each time point determined by a set of its previous values, i.e.,
x_t = f(x_{t−τ}, x_{t−2τ}, …, x_{t−mτ})
where m and τ are referred to as the embedding dimension and delay time [18], respectively. Through this representation, an acceleration response series can be transformed into a data set of fixed-length vectors and used by SVM. Damage detection is conducted by examining the similarity and dissimilarity among data collected from different structural states. The detailed analysis procedure is given in the numerical studies section.
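A minimal sketch of this delay embedding (function and variable names are ours): each row of X collects the m delayed values used to predict the sample x_t.

```python
import numpy as np

def delay_embed(series, m, tau):
    """Build patterns [x_{t-tau}, ..., x_{t-m*tau}] with target x_t.

    m is the embedding dimension and tau the delay time; each row of
    X is one fixed-length pattern, and y holds the value it predicts.
    """
    s = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(m * tau, len(s)):
        X.append([s[t - k * tau] for k in range(1, m + 1)])
        y.append(s[t])
    return np.array(X), np.array(y)

X, y = delay_embed([0, 1, 2, 3, 4, 5], m=2, tau=1)
print(X.shape)  # (4, 2): four patterns of length 2
```

For classification the rows of X are used directly as patterns; for SVR the pairs (X, y) define the regression problem.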
3. Feature Selection
As mentioned, the statistical approximation approach is an attractive alternative to approaches based on high-order physical models: it is computationally competitive, less sensitive to modeling error and data quality, and requires only measured signals to build the model. SVM is among the fastest algorithms in statistical learning; however, for large-scale problems the SVM algorithm is still slow, and further reducing the computational complexity is necessary.
3.1 The Motivation of Feature Selection in the Proposed Approach
Although in the dual form of SVM we face a QP problem whose computational complexity is approximately proportional to the square of the number of training examples l, not to the number of features, reducing the number of features nevertheless helps to improve performance. For example, dot products between feature vectors are required frequently when evaluating a kernel function, and this is time-consuming when the number of features is large. When implementing an SVM solver, these results are often cached to improve performance, which raises the issue of memory consumption. Besides, field data is often polluted by noise and redundant information, and feature selection provides a way of identifying and eliminating them from the feature set. This not only improves computational efficiency but also increases accuracy.
3.2 Feature Selection using SVM
By looking at the solution of the primal form of SVM, we can see that each component of w_1 can be thought of as the weight of its corresponding feature φ(x_i) in the RKHS. Feature reduction is done by removing features with zero weight from the set.
The primal SVM optimization problem minimizes ||w_1||^2 while obeying all the constraints, which forces the value of each component w_i to be small but does not set it to zero, because the derivative of ||w_1||^2 near w_i ≈ 0 is small. We could replace ||w_1||^2 by the L1 norm ||w_1||_1 in the objective to promote exact zeros, but this would prevent us from transforming the SVM into its dual form and we would lose all the advantages. The simplest way around this is to set a threshold on |w_i| and remove the features whose associated weights are smaller than the threshold.
Using the time series model above, the target of feature reduction in our damage detection approach is to reduce the embedding dimension, i.e., the length of the patterns in the sample space. Because we work directly in the sample space, no feature mapping is needed, and a linear kernel is suitable for this scenario because of its efficiency. Note that the choice of kernel in feature selection is independent of the choice of kernel in the classification or regression stage. When a linear kernel is used, the features in the RKHS are the patterns in the sample space themselves. This is why we use the conventional term “feature selection” throughout this paper, although we are actually doing “pattern selection”.
The value of w_1 is not calculated when solving the SVM, because only its dot products are required, and these can be obtained more efficiently by evaluating the kernel function. When doing feature selection, we need to calculate w_1 explicitly. The following relation is obtained while deriving the dual form of the SVM with a linear kernel,
w_1 = ∑_{i=1}^{N_SV} α_i y_i x_i
which can be used to determine w_1 once the corresponding SVM is solved.
This feature selection approach allows us to reduce the number of features while keeping the accuracy, as our numerical studies show.
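The relation w_1 = ∑ α_i y_i x_i and the subsequent thresholding can be sketched as follows (function names are ours, and the threshold value is problem-dependent):

```python
import numpy as np

def linear_weights(alphas, labels, X_sv):
    """w_1 = sum_i alpha_i * y_i * x_i for a linear kernel."""
    coef = np.asarray(alphas, dtype=float) * np.asarray(labels, dtype=float)
    return coef @ np.asarray(X_sv, dtype=float)

def select_features(w, threshold):
    """Keep the indices of features whose |w_i| exceeds the threshold."""
    return [i for i, wi in enumerate(np.abs(w)) if wi > threshold]

# Two support vectors in a 3-feature space; only the first feature
# ends up carrying weight after the signed sum.
w = linear_weights([1.0, 1.0], [1, -1], [[2.0, 0.0, 1.0], [1.0, 0.0, 1.0]])
print(select_features(w, 0.5))
```

Re-training on the selected indices then gives the reduced model whose accuracy is compared against the full model in the numerical studies.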
4. Numerical Studies
In this section, we demonstrate the proposed approach using a simple 2-story shear building and
the ASCE benchmark problem [19]. In all examples, acceleration responses are first normalized
using,
a_i = (a_i − μ_a) / σ_a
where µa and σ a are the sample mean and sample standard deviation of the acceleration signal,
respectively. Also, in all examples, “acceleration response” means the relative acceleration response between two adjacent floors, i.e., the acceleration difference between the current floor and the one below. By doing this, we do not need to deal with the scale and units of the loading, and we can better isolate the effect of damage in each story. The values of SVM-related parameters, such as C, ν, ε and the σ in the RBF kernel, are selected based on common practice in pattern recognition and are specified in each example. In general, we obtain similar results as long as those parameters are within a reasonable range.
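The normalization and the relative (inter-story) acceleration are straightforward to implement; a sketch with our own function names:

```python
import numpy as np

def normalize(a):
    """Subtract the sample mean and divide by the sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()

def relative_acceleration(floor, floor_below):
    """Acceleration difference between a floor and the one directly below."""
    return np.asarray(floor, dtype=float) - np.asarray(floor_below, dtype=float)

z = normalize([1.0, 2.0, 3.0, 4.0])
print(z)  # zero-mean, unit-variance version of the signal
```

After normalization the signal is dimensionless with zero mean and unit variance, so responses recorded under different loading scales become directly comparable.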
4.1 Two-Story Shear Building
We start with the simple 2-story shear building shown in Figure 2. Damage is modeled by reducing the stiffness of a column. Vibration data are collected through accelerometers attached under each floor. Three different SVM-based approaches are used for damage detection, namely, (1) supervised SVM, (2) one-class SVM, and (3) support vector regression.
Figure 2. Plane steel frame under transverse seismic load (EI = 6.379 N·m² for all columns; story heights 0.5 m, width 0.2 m; sensor locations A and B)
4.1.1 Damage Detection Using Supervised Support Vector Machine
Figure 3 shows the acceleration response of the structure under the 1940 El Centro earthquake load; each time series corresponds to a different structural state. Damage in a floor is modeled by reducing the stiffness of one of the columns in that floor by 50%.
Figure 3. Acceleration measurements from accelerometer A (1F, El Centro), for the undamaged, 1F-damaged, and 2F-damaged states
The vibration data are recorded at locations A and B with a sampling rate of 50 Hz. Therefore, with a 2-second window, we can extract 100 patterns from the time series for each example. Knowing the patterns and their corresponding labels (undamaged, 1st floor damaged, or 2nd floor damaged), we can feed these data into a support vector machine (using C = 100 and an RBF kernel with σ² = 20). The results of a 5-fold cross validation are shown in Table 1. We can see that SVM is able to detect the occurrence as well as the location of the damage with very high accuracy, provided the patterns are long enough. The trial-and-error way of selecting patterns here will be replaced by our feature selection algorithm in section 4.2.2.
# of Patterns CV 1 CV 2 CV 3 CV 4 CV 5 Average
100 97 / 120 89 / 120 90 / 120 87 / 120 78 / 120 76.2%
150 111 / 120 111 / 120 119 / 120 117 / 120 116 / 120 95.7%
200 120 / 120 120 / 120 120 / 120 120 / 120 120 / 120 100%
Table 1. Cross Validation Results (3 structural states; El Centro)
In the next example, the same structure is excited using two different seismic loads. For each load, acceleration responses in two different structural states (undamaged / 1st floor damaged) are recorded, as shown in Figure 4.
Figure 4. Acceleration measurements from accelerometer A (El Centro and Kobe; undamaged and 1F-damaged states)
The purpose here is to detect damage in the structure regardless of the excitation that accompanies it. We mix the acceleration responses measured from the structure under the different excitations and train SVM with 2 classes (damaged or undamaged) instead of 4. The cross validation results are shown in Table 2. Even with mixed excitations, SVM can still achieve accurate detection. When we group the training samples with the same structural state together, we implicitly indicate that the excitation is not a feature we care about, so SVM focuses on maximizing the differences caused by changes in the structure.
# of Patterns CV 1 CV 2 CV 3 CV 4 CV 5 Average
200 145 / 160 151 / 160 146 / 160 144 / 160 147 / 160 91.6%
Table 2. Cross Validation Results (2 structural states; El Centro & Kobe)
4.1.2 Damage Detection Using One-Class Support Vector Machine
Although supervised SVM classification is accurate and easy to implement, in practice we often do not have vibration data from the damaged structure beforehand. In this section, we apply one-class SVM to the same structure used before. Similarly, we extract features from the acceleration response by setting a window size equal to 2 seconds (100 patterns for each example), and we again use an RBF kernel with σ² = 20, and ν = 0.1. Three one-class SVM models are trained using response data measured from the undamaged structure, each with a different seismic load. Each model is then used to test response data measured from both the damaged and the undamaged structure under the 3 seismic loads. The results are shown below.
Training El Centro Golden Gate Kobe
Testing El C. G.G. Kobe El C. G.G. Kobe El C. G.G. Kobe
Undam. 14.3% 29.6% 34.9% 85.3% 16.3% 97.8% 18.9% 21.6% 17.4%
1F 89.1% 90.6% 75.0% 98.8% 97.2% 100% 86.7% 57.8% 68.8%
2F 80.9% 100% 73.4% 100% 100% 100% 75.4% 100% 66.6%
Table 3. Proportion of outliers (800 testing samples)
Figure 5. Proportion of outliers (800 testing samples); models built using El Centro, Golden Gate, and Kobe seismic loading respectively, left to right
Table 3 and Figure 5 indicate that when damage occurs in the structure, the percentage of outliers increases significantly. Note that each SVM model is trained using positive samples measured from one particular seismic load. When the model trained on the Golden Gate earthquake is applied to monitor the same structure under a different seismic load, a large portion of the signals measured from the undamaged structure are also flagged as outliers. This is due to the fact that both the external force and the structural state affect the acceleration response, and a model built on one particular loading history does not generalize well to monitoring arbitrary loading. To reduce this unwanted effect, we train SVM models using a larger database consisting of a mixture of acceleration responses measured from the undamaged structure under different seismic loads. By grouping these responses together, we implicitly tell SVM to ignore the differences caused by excitation variability. Table 4 and Figure 6 show the results of damage detection using models built on 3 data sets of different sizes (left to right, training data measured from the structure under a. Golden Gate; b. El Centro and Golden Gate; c. El Centro, Golden Gate, Corral, Hach and Hachinohe seismic loads).
Training Golden Gate 2 mixture 5 mixture
Testing El Centro Kobe El Centro Kobe El Centro Kobe
Undam. 85.3% 97.8% 9.5% 31.6% 1.0% 16.0%
1F 98.8% 100% 90.5% 75.1% 64.0% 66.6%
2F 100% 100% 79.0% 73.5% 63.8% 64.5%
Table 4. Proportion of outliers (800 testing samples) detected at location A
Figure 6. Proportion of outliers detected at location A
As shown in Figure 6, when the SVM model is trained on the mixed data set, the effect of loading variability is averaged out and the change in structural properties becomes dominant, i.e., the model is able to detect damage caused by arbitrary loads. Note that the acceleration response measured under the Kobe earthquake is never included in the training set, yet the result is still good, i.e., the model generalizes well to unseen data. Nonetheless, when damage occurs in either floor, the model detects a significant change at both sensors and fails to tell the location of the damage.
4.1.3 Damage Detection Using Regression-based Methods
Using a regression-based novelty detection approach for damage detection has been suggested by Los Alamos National Laboratory (LANL) [5] and followed by others with minor modifications [6, 20]. The concept of this two-step approach is as follows: for each structure, a “reference database” is created recording the acceleration responses of the undamaged structure perturbed by many different excitations. When a new acceleration response a_TBD(t) is measured from a structure whose current state is to be determined, the first step is to select the acceleration response a_un(t) from the predefined database that is closest to the current measurement. This step is referred to as “data normalization”. The second step is to fit a_un(t) using an auto-regressive model with exogenous inputs (ARX), and to use the ARX model to predict a_TBD(t). Denoting the training error between the ARX model and a_un(t) at time t as ε_un(t), and the prediction error between the ARX model and a_TBD(t) at time t as ε_TBD(t), the ratio of the standard deviations of the two errors is defined as the damage-sensitive feature,
h = σ(ε_TBD) / σ(ε_un)
and an empirical threshold limit is used to indicate the occurrence of damage.
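Computing the damage-sensitive feature from the two error series is a one-liner (a sketch; the function name is ours):

```python
import numpy as np

def damage_feature(err_tbd, err_un):
    """h = std(eps_TBD) / std(eps_un); values well above 1 suggest damage."""
    return float(np.std(err_tbd) / np.std(err_un))

# If the prediction errors have twice the spread of the training errors, h = 2.
print(damage_feature([0.0, 2.0, 0.0, 2.0], [0.0, 1.0, 0.0, 1.0]))
```

When the structure is undamaged and the model fits well, the two error series have similar spread and h stays near 1; damage inflates the prediction errors and pushes h above the chosen threshold.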
We adopt the concept of the damage-sensitive indicator and make three modifications to the LANL approach. First, instead of selecting the closest acceleration response from the reference database and building a regression model from that one response, we build our model from all responses in the database. This simulates the worst case in the first step of the LANL approach, i.e., no similar excitation can be found in the reference database. Second, in the LANL approach, ε_un(t) is the training error of building the ARX model, and ε_TBD(t) is the prediction error when the ARX model is used to predict unseen data. To be more consistent, in our approach ε_un(t) is calculated by using our regression model to predict an arbitrary piece of unseen response data from the undamaged structure. Third, linear regression is replaced by SVR, which does not have to be linear and guarantees a bounded generalization error. Also, combined with our feature selection scheme, SVR provides a systematic way of determining the embedding dimension, a free parameter in the time series model.
The 2-story steel frame shown in Figure 2 is used in this example. Two experiments are conducted by exciting the structure using the El Centro and Golden Gate earthquakes, respectively. For each experiment, a 5-second acceleration response, measured from the structure 5 seconds after the start of excitation, is used as the training data. The response measured in the next 1 second is used as the testing data. We choose C = 100 and ε = 0.1 for the SVR, and an RBF kernel with σ = 10 is used. The damage detection results are shown in Table 5.
Seismic load El Centro Golden Gate
Location of Damage 1F 2F 1F 2F
h (location A, 1F) 2.984 1.474 2.240 1.173
h (location B, 2F) 1.207 2.554 1.244 2.344
Table 5. Damage Detection in a 2-story Frame using SVR
As expected, the SVR model built from the undamaged structure yields significantly higher prediction errors when used to predict the response of the damaged structure. When a suitable threshold limit is chosen for the damage-sensitive feature h, the proposed approach is able to indicate both the existence and the location of the damage.
4.2 ASCE Benchmark Problem
Structural health monitoring studies often apply different methods to different structures, which makes side-by-side comparison of those methods difficult. To coordinate the studies, the ASCE Task Group on Health Monitoring built a 4-story, 2-bay by 2-bay steel frame benchmark structure and provided two finite element based models, a 12DOF shear building and a more realistic 120DOF 3D model [19]. The benchmark problem is studied in the following examples.
4.2.1 Support Vector Regression
Five damage patterns are defined in the benchmark study, and we apply the SVR detection
procedure to the first two patterns: (1) all braces in the first story are removed, and (2) all braces
in both the first story and the third story are removed. Acceleration responses of these two
damage patterns are generated by using the 12DOF analytical model under ambient wind load.
The results of damage detection and localization using the damage-sensitive feature h are shown in Table 6. The training data is a mixture of 5-second acceleration responses obtained from the undamaged structure under 10 different ambient loads. For each damage pattern, two 1-second acceleration responses caused by different ambient loads (denoted as L1 and L2 in Table 6) are used as the testing data. We choose C = 100 and ε = 0.1 in the SVR, and an RBF kernel with σ = 10.
Damage pattern 1 Damage pattern 2
# of patterns 30 100 30 100
Ambient load L1 L2 L1 L2 L1 L2 L1 L2
h (1F) 2.57 2.46 1.69 1.56 2.03 2.07 1.78 1.58
h (2F) 1.74 1.07 1.32 0.88 1.48 1.11 1.28 1.09
h (3F) 1.30 1.43 1.26 1.07 2.19 1.92 1.71 1.48
h (4F) 1.30 1.23 1.02 0.89 1.20 1.11 1.08 1.12
Table 6. Damage detection and localization results for damage pattern I and II
Compared to the results given in [6] and [20], the differences in the damage-sensitive features between the damaged and undamaged structure are less pronounced, because we simulate a worst case in the data normalization step. Nevertheless, our approach successfully indicates the occurrence and the location of the damage in both damage patterns, whereas the second floor in damage pattern 2 is classified as damaged in [6].
We can see that the value of h varies when the length of the patterns is changed. Although the approach is able to distinguish the structural states from one another at both pattern lengths, a systematic way of feature selection is more desirable. We apply the feature selection scheme discussed in section 3.2 in the following example.
4.2.2 Feature Selection
We use the feature reduction scheme on both the supervised SVM and the SVR approach. Recall that in our first example in section 4.1.1, the number of features was selected via trial and error, and more than 100 features were required in order to achieve 80% accuracy. Using the relation w_1 = ∑ α_i y_i x_i, we plot the absolute values of the components of w_1 in Figure 7. It is clear that some features are more important than others, and we can understand why a long pattern was required. Table 7 shows that by selecting features based on the magnitude of w_i, we can obtain the same level of accuracy with far fewer features. Note that we use the terms feature and pattern interchangeably in this section, because we are selecting features in the input (pattern) space.
Figure 7. Absolute values of the components of the w_1 vector
              First k patterns              Selected k patterns
k             50      100     150     200   100     40
CV 1 (120)    60      97      111     120   120     104
CV 2 (120)    62      89      111     120   120     108
CV 3 (120)    68      90      119     120   120     105
CV 4 (120)    61      87      117     120   120     106
CV 5 (120)    63      78      116     120   120     110
Average       52.3%   76.2%   95.7%   100%  100%    88.8%
Table 7. Feature selection in supervised SVM damage detection
Similarly, we apply the feature selection approach to the ASCE benchmark example. The results are shown in Figure 8 and Table 8. In this case, we can see that a long pattern is not necessary: using only the first 9 features, the model generates results similar to those obtained using 100 features.
Figure 8. Distribution of the components of the w_1 vector
Damage pattern 1 Damage pattern 2
# of patterns 9 100 9 100
Ambient load L1 L2 L1 L2 L1 L2 L1 L2
h (1F) 2.16 2.18 1.69 1.56 1.90 1.85 1.78 1.58
h (2F) 1.55 1.14 1.32 0.88 1.48 1.13 1.28 1.09
h (3F) 1.35 1.30 1.26 1.07 2.08 1.74 1.71 1.48
h (4F) 1.31 0.99 1.02 0.89 1.15 1.03 1.08 1.12
Table 8. Feature selection in SVR-based damage detection
5. Conclusions
SVM has achieved remarkable success in pattern recognition and machine learning, and its continuing development also sheds new light on its applications in SHM. This paper has described two approaches that apply unsupervised SVM algorithms to vibration-based damage detection, in addition to the supervised SVM introduced earlier by other researchers. Combining SVM-based novelty detection techniques with the vibration-based damage detection approach eliminates the need for data from damaged structures. These approaches are easy to implement because only vibration responses measured from the structure are required to build the models. Numerical examples have shown that the SVR approach is able to detect both the occurrence and the location of damage. Furthermore, high-dimensional feature vectors introduce more noise and restrict the scalability of most statistical pattern recognition methods. The idea of regularization in SVM is extended to feature selection, and we show that the reduced model retains the same level of accuracy.
Acknowledgement
This research is supported by the …
References
1. Rytter, A., Vibration based inspection of Civil Engineering structures, in Department of
Building Technology and Structural Engineering. 1993, University of Aalborg: Denmark.
2. Doebling, S.W., C.R. Farrar, and M.B. Prime, A Summary Review of Vibration-Based
Damage Identification Methods. The Shock and Vibration Digest, 1998. 30(2): p. 91-105.
3. Stubbs, N., et al. A Methodology to Nondestructively Evaluate the Structural Properties
of Bridges. in Proceedings of the 17th International Modal Analysis Conference. 1999.
Kissimmee, Fla.
4. N. Haritos and J.S. Owen, The Use of Vibration Data for Damage Detection in Bridges:
A Comparison of System Identification and Pattern Recognition Approaches.
International Journal of Structural Health Monitoring, 2004.
5. Hoon Sohn and Charles R Farrar, Damage Diagnosis Using Time Series Analysis of
Vibration Signals, in Smart Materials and Structures. 2001.
6. Y. Lei, et al. An Enhanced Statistical Damage Detection Algorithm Using Time Series
Analysis. in Proceedings of the 4th International Workshop on Structural Health
Monitoring. 2003.
7. Worden, K. and A.J. Lane, Damage Identification using Support Vector Machines. Smart
Materials and Structures, 2001. 10(3): p. 540-547.
8. Yun, C.B., et al., Damage Estimation Method Using Committee of Neural Networks.
Smart Nondestructive Evaluation and Health Monitoring of Structural and Biological
Systems II. Proceedings of the SPIE, 2003. 5047: p. 263-274.
9. Ahmet Bulut, Peter Shin, and L. Yan. Real-time Nondestructive Structural Health
Monitoring using Support Vector Machines and Wavelets. in Proceedings of Knowledge
Discovery in Data and Data Mining. 2004. Seattle, WA.
10. Vladimir N. Vapnik, The Nature of Statistical Learning Theory. 1995, New York:
Springer-Verlag.
11. Christopher J.C. Burges, A Tutorial on Support Vector Machines for Pattern
Recognition. Knowledge Discovery and Data Mining, 2(2), 1998.
12. Byvatov E., et al., Comparison of support vector machine and artificial neural network
systems for drug/nondrug classification. Journal of Chemical Information and Computer
Sciences, 2003. 43(6): p. 1882-1889.
13. Bernhard Schölkopf and Alex Smola, Learning with Kernels - Support Vector Machines,
Regularization, Optimization and Beyond. 2002: MIT Press.
14. John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis. 2004:
Cambridge University Press.
15. Michael L. Fugate, Hoon Sohn, and C.R. Farrar. Unsupervised Learning Methods for
Vibration-Based Damage Detection. in Proceedings of the 18th International Modal
Analysis Conference. 2000. San Antonio, Texas.
16. Bernhard Schölkopf, et al., Estimating the Support of a High-Dimensional Distribution.
Neural Computation, 2001. 13: p. 1443-1471.
17. Alex J. Smola and Bernhard Schölkopf, A Tutorial on Support Vector Regression, in
NeuroCOLT2 Technical Report Series. 1998.
18. Mead, W.C., et al. Prediction of Chaotic Time Series using CNLS-Net-Example: The
Mackey-Glass Equation. in Nonlinear Modeling and Forecasting. 1992: Addison
Wesley.
19. Johnson, E.A., et al. A Benchmark Problem for Structural Health Monitoring and
Damage Detection. in Proceedings of the 14th Engineering Mechanics Conference. 2000.
Austin, Texas.
20. K.K. Nair, et al. Application of time series analysis in structural damage evaluation. in
Proceedings of the International Conference on Structural Health Monitoring. 2003.
Tokyo, Japan.