Vibration-Based Damage Detection Using Unsupervised Support Vector Machine

Ching-Huei Tsou¹ and John R. Williams²
{tsou, jrw}@mit.edu

Abstract: Vibration-based damage detection methods can be used to identify hidden damage in structural components. The traditional modal-based system identification paradigm requires a detailed model of the structure, such as a finite element model. This paper describes a novel statistical damage detection approach based on a support vector machine (SVM) methodology. The proposed approach is computationally efficient even when the number of features is large, and it does not suffer from the local minima problem encountered by artificial neural networks. We build the statistical model through unsupervised learning, avoiding the need for measurements from the damaged structure, which are unavailable in many real-world problems. Extracting significant features from raw vibration time series data is crucial to the efficiency and scalability of statistics-based methods, and a feature selection algorithm is presented along with the construction of our statistical model. Numerical simulations, including the ASCE benchmark problem, are analyzed to examine the accuracy and scalability of our approach. We show that the proposed approach is able to detect both the occurrence and the location of damage, and that our feature selection scheme can effectively reduce the required dimensions while retaining high accuracy.

¹ Graduate Student, Intelligent Engineering Systems Laboratory (IESL), Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
² Associate Professor, Director of IESL, Department of Civil and Environmental Engineering and Engineering Systems Division, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
1. Introduction

The process of implementing a damage detection strategy is referred to as structural health monitoring (SHM) and can be categorized into five stages [1]: (1) detecting the existence of damage, (2) locating the damage, (3) identifying the type of damage, (4) determining the severity of damage, and (5) predicting the remaining service life of the structure. Research has been conducted in this field over the past decade, and detailed literature reviews of vibration-based damage detection methods can be found in [2-4]. The basic reasoning behind all vibration-based damage detection is that the stiffness, mass, or energy dissipation behavior of a structure changes significantly when damage occurs, and these changes can be detected by monitoring the dynamic response of the system. Compared to other nondestructive damage detection (NDD) techniques, such as ultrasonic scanning, acoustic emission, and x-ray inspection, vibration-based methods have the advantage of providing a global evaluation of the state of the structure.

Traditional vibration-based damage identification applications rely on detailed finite element models of the undamaged structure, and damage diagnosis is made by comparing the modal responses, such as frequencies and mode shapes, of the model and the potentially damaged structure. System identification approaches have been shown to be very accurate provided the models can produce robust and reliable modal estimates and large amounts of high-quality data are available, but these two requirements cannot always be met in the field. To overcome these difficulties, pattern recognition based approaches have been proposed [5-8]. Instead of building models from the physical properties of the structures, these methods construct statistical models from the vibration response data directly. This reduces the complexity of the modeling process, at the cost of losing the physical meaning of the model.
These methods have also been shown to be accurate in damage detection and less sensitive to data quality; however, some problems remain. For example, methods that use autoregressive models (AR/ARX) [5] may not fit the vibration data well because they give only linear approximations. Complex statistical models are less efficient and offer little control over the generalization bound, i.e., they may fit the historical data perfectly yet provide no guarantee on future data. Methods based on artificial neural networks (ANN) [8] often suffer from the local minima problem, cannot be trained efficiently, and do not scale well to large problems. Methods using support vector machines (SVM) have also been proposed [7, 9], with SVM used to perform supervised, binary classification.

In this paper, we propose using one-class SVM and support vector regression (SVR) to perform unsupervised learning. This does not require training samples from the damaged structure, which are usually unavailable in practical situations. Training an SVM is mathematically equivalent to solving a convex quadratic programming (QP) problem that has no local minima; the lack of local minima means it can be trained faster than an ANN. Finally, SVM with a linear kernel is used to reduce the number of features in our model. This leads to a statistical model that is efficient, accurate, and easy to implement. Numerical simulations are provided to examine the performance and accuracy of this approach.

2. Theory of SVM and Its Application in Damage Detection

We propose an SVM-based approach in this paper because of its theoretical advantages over other learning algorithms. SVM has been applied in various pattern recognition fields, and introducing SVM into SHM is not new. Nevertheless, SVM itself has evolved considerably during the past few years, and these developments shed new light on its applications in SHM. In this section, we first review the motivation and algorithm of SVM. We then introduce two extensions of SVM that are able to perform unsupervised learning, and show how they can be applied in the damage detection scheme.
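The claim that SVM training is a convex problem with no local minima can be illustrated with a small sketch: minimizing a regularized hinge-loss objective (the primal form reviewed in section 2.1) by plain subgradient descent on toy data. Everything here — the `train_linear_svm` helper, the clusters, the learning rate — is our own illustrative assumption, not the solver or data used in the paper.

```python
import numpy as np

def train_linear_svm(X, y, C=1.0, lr=0.01, epochs=200):
    """Minimize 0.5*||w||^2 + C * sum(hinge losses) by subgradient descent.

    Because the objective is convex, any descent run moves toward the same
    global optimum -- there are no local minima to get trapped in.
    """
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        viol = margins < 1  # samples misclassified or inside the margin
        grad_w = w - C * (y[viol, None] * X[viol]).sum(axis=0)
        grad_b = -C * y[viol].sum()
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

# Toy separable data (hypothetical): two well-separated clusters in R^2.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.r_[-np.ones(20), np.ones(20)]
w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
print((pred == y).mean())  # separable clusters -> accuracy 1.0
```

Any reasonable learning rate and starting point reach the same separating hyperplane up to numerical tolerance, which is the practical content of the convexity argument above.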
2.1 Support Vector Machine

The support vector machine was developed by Vapnik et al. [10] based on the structural risk minimization (SRM) principle from statistical learning theory, rather than the empirical risk minimization (ERM) used by most other learning algorithms (risk means test error in this context). This fundamental difference allows SVM to select from a family of functions the classifier that not only fits the training data well but also provides a bounded generalization error, i.e., better prediction power [11]. Together with kernel techniques, SVM has shown superior performance in both speed and accuracy, and it has outperformed artificial neural networks (ANN) in a wide variety of applications [12].

We introduce the algorithm through the simplest case, a linear classifier trained on separable binary data. Assume we have $l$ training examples $\{\mathbf{x}_i, y_i\}$, $i = 1, \ldots, l$, where $y_i \in \{-1, 1\}$ and $\mathbf{x}_i \in \mathbb{R}^n$. The $\mathbf{x}_i$ are often referred to as patterns or inputs, and the $y_i$ are called labels or outputs of the examples. A linear classifier (a hyperplane in $\mathbb{R}^n$) can be defined as

$f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = 0$

where $\mathbf{w}_1$ is a vector normal to the hyperplane and $w_0$ is a scalar constant. We can also define two auxiliary hyperplanes by $f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = \pm 1$. It is easy to show that each of these two parallel hyperplanes has a perpendicular distance from the original hyperplane equal to $1/\|\mathbf{w}_1\|$; this distance is often referred to as the "margin". Because the data are separable, we can always find hyperplanes that separate the training samples perfectly. The solution is obviously not unique, and the SVM algorithm looks for the one that gives the maximum margin. The optimization problem for the above process can be expressed as

minimize: $\frac{1}{2}\|\mathbf{w}_1\|^2$
subject to: $y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0) \geq 1$

To extend to inseparable data, we can introduce slack variables $\xi_i$ to relax the constraints and add a penalty for the relaxation. The new optimization problem becomes

minimize: $\frac{1}{2}\|\mathbf{w}_1\|^2 + C \sum_{i=1}^{l} \max\{1 - y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0),\ 0\}$

subject to: $y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0) \geq 1 - \xi_i$ and $\xi_i \geq 0$

where $C$ is a constant determining the trade-off between our two conflicting goals: maximizing the margin and minimizing the training error. For computational simplicity, we can further transform the optimization problem into its dual form using Lagrange multipliers, denoted by $\alpha_i$, and the result becomes

maximize: $\sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j$

subject to: $\sum_{i=1}^{l} \alpha_i y_i = 0$ and $0 \leq \alpha_i \leq C$

For all constraints that are not strictly met as equalities, the corresponding $\alpha_i$ must be zero; this is known as the Karush-Kuhn-Tucker (KKT) conditions in optimization theory. Examples with non-zero $\alpha_i$ are called the support vectors, and the classifier is determined by the support vectors alone:

$f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = \sum_{j=1}^{N_{SV}} \alpha_j y_j \mathbf{x}_j^T \mathbf{x}_i + w_0$

where $N_{SV}$ denotes the total number of support vectors.

To extend the algorithm from linear to nonlinear, we define a mapping function $\phi: \mathbb{R}^n \rightarrow H$ which maps $\mathbf{x}_i$ from its original Euclidean space to a reproducing kernel Hilbert space (RKHS). The original space is often referred to as the sample space, and the RKHS is called the feature space.
Without loss of generality in our context, we can think of an RKHS simply as a generalized Euclidean space that may have infinitely many dimensions. By replacing $\mathbf{x}_i$ in the optimization problem with $\phi(\mathbf{x}_i)$ and performing linear classification in the corresponding RKHS, the solution becomes

$f(\mathbf{x}_i) = \phi^T(\mathbf{x}_i)\mathbf{w}_1 + w_0 = \sum_{j=1}^{N_{SV}} \alpha_j y_j \phi^T(\mathbf{x}_i)\phi(\mathbf{x}_j) + w_0 = \sum_{j=1}^{N_{SV}} \alpha_j y_j K(\mathbf{x}_i, \mathbf{x}_j) + w_0$

The function $\phi$ is called the mapping function, and its dot product $\phi^T(\mathbf{x}_i)\phi(\mathbf{x}_j) = K(\mathbf{x}_i, \mathbf{x}_j)$ is known as the kernel. Popular kernels include the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. When nonlinear kernels are used, the decision function is no longer linear in the original Euclidean space.

Because solving an SVM corresponds to solving a convex QP problem, it has no local minima and can be trained faster than algorithms that do, such as ANN. It can be shown that $N_{SV} \ll l$ for easier problems, i.e., problems with small generalization errors. This leads to a sparse solution, which means the optimization problem can be solved efficiently. Also, through SRM and the VC dimension [10], SVM provides a bounded generalization error and a systematic way to select the complexity of the solution function, which effectively controls overfitting. Detailed discussion of these properties is beyond the scope of this paper and can be found in many recent statistical learning textbooks [13, 14].

2.2 One-Class Support Vector Machine

SVM is originally a supervised, batch learning algorithm, and it has been applied in the SHM field [7, 9] to perform binary classification tasks. A major challenge is that data measured from a damaged structure are often unavailable in practical situations, so unsupervised learning methods are more desirable [15]. Similar needs occur in other domains, and researchers in the machine learning and pattern recognition communities have extended the idea of SVM to unsupervised learning, often referred to as one-class SVM [16].
Instead of finding a hyperplane that maximizes the margin between two classes in the RKHS, one-class SVM maximizes the distance from the hyperplane to the origin. The corresponding optimization problem becomes

minimize: $\frac{1}{2}\|\mathbf{w}_1\|^2 + \frac{1}{\nu l}\sum_i \xi_i - \rho$

subject to: $\phi^T(\mathbf{x}_i)\mathbf{w}_1 \geq \rho - \xi_i$ and $\xi_i \geq 0$

where $\nu \in (0, 1]$ is a parameter similar to the $C$ introduced above, and $\rho$ is an offset that is calculated automatically during the optimization. If a translation-invariant kernel is used (e.g., the RBF kernel), the goal of one-class SVM can also be thought of as finding small spheres that contain most of the training samples.

2.3 Support Vector Regression

SVM was first developed for classification, where the labels $y_i$ represent a finite number of possible categories. The algorithm can be extended to estimate real-valued functions by allowing $y_i$ to take real values and defining a suitable loss function [17]. The following loss function,

$\max\{|f(\mathbf{x}_i) - y_i| - \varepsilon,\ 0\}$

known as the $\varepsilon$-insensitive loss function, pays no penalty for points within the $\varepsilon$ range, and this carries the sparseness property over from SVM to SVR. Again, the estimated function can be expressed in the kernel expansion form above, and the goal now is to minimize

$\frac{1}{2}\|\mathbf{w}_1\|^2 + C \sum_{i=1}^{l} \max\{|f(\mathbf{x}_i) - y_i| - \varepsilon,\ 0\}$

The basic ideas of SVM, one-class SVM, and SVR are summarized in Table 1 and Figure 1. For simplicity, the discriminant functions of SVM and one-class SVM are drawn as linear functions.
As mentioned, when nonlinear kernels are used, the functions are by no means linear in the sample space.

  Method          Maximize                                        Penalize
  SVM             distance between the two hyperplanes            misclassified samples and samples within the margin
  One-class SVM   distance between the hyperplane and the origin  misclassified samples
  SVR             smoothness of the function                      samples outside the ε-tube

  Table 1. Comparison of SVM, one-class SVM and SVR

  Figure 1. Geometric interpretation of SVM, one-class SVM and SVR in 2D

2.4 Damage Detection Using SVM

Vibration-based damage detection approaches are grounded in the assumption that the dynamic response of the system will change significantly when damage occurs. We propose using SVM for the detection, either through novelty detection or regression, and it is essential to have a reasonable representation of the dynamic response before we can feed the data into SVM. A time series is usually modeled by splitting it into a series of windows, with the value at each time point determined by a set of its previous values, i.e.,

$x_t = f(x_{t-\tau}, x_{t-2\tau}, \ldots, x_{t-m\tau})$

where $m$ and $\tau$ are referred to as the embedding dimension and the delay time [18], respectively. Through this representation, an acceleration response series can be transformed into a data set of fixed-length vectors and used by SVM. Damage detection is conducted by examining the similarity and dissimilarity among data collected from different structural states.
A detailed analysis procedure is given in the numerical studies section.

3. Feature Selection

As mentioned, the statistical approach is an attractive alternative to approaches based on high-order physical models in the sense that it is computationally competitive, less sensitive to modeling error and data quality, and requires only measurement signals to build the model. SVM is among the fastest algorithms in statistical learning; however, for large-scale problems the SVM algorithm is still slow, and further reducing the computational complexity is necessary.

3.1 The Motivation for Feature Selection in the Proposed Approach

Although in the dual form of SVM we face a QP problem whose computational complexity is approximately proportional to the square of the number of training examples $l$, not the number of features, reducing the number of features nevertheless helps improve performance. For example, dot products between feature vectors are frequently required when evaluating a kernel function, and this process is time-consuming when the number of features is large. When implementing an SVM solver, we often cache these results to improve performance, which in turn raises memory consumption. Moreover, field data are often polluted by noise and redundant information, and feature selection provides a way to identify and eliminate them from the feature set. This not only improves computational efficiency but also increases accuracy.
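The windowing model of section 2.4 is straightforward to sketch. The following minimal example (the `embed` helper and the numbers are our own illustrative assumptions, not the paper's implementation) turns a 1-D signal into fixed-length pattern vectors with embedding dimension m and delay time τ:

```python
import numpy as np

def embed(series, m, tau=1):
    """Map a 1-D signal to rows [x_{t-tau}, ..., x_{t-m*tau}] with target x_t.

    Each row is one fixed-length pattern; the paired target is the current
    sample, matching the model x_t = f(x_{t-tau}, ..., x_{t-m*tau}).
    """
    series = np.asarray(series, dtype=float)
    start = m * tau
    X = np.column_stack([series[start - k * tau:len(series) - k * tau]
                         for k in range(1, m + 1)])
    y = series[start:]
    return X, y

a = np.arange(10.0)      # stand-in for a normalized acceleration record
X, y = embed(a, m=3, tau=1)
print(X.shape, y.shape)  # (7, 3) and (7,)
print(X[0], y[0])        # [2. 1. 0.] paired with target 3.0
```

The resulting rows of `X` are exactly the "patterns" fed to the SVM in the numerical studies, and reducing their length is the target of the feature selection scheme below.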
3.2 Feature Selection Using SVM

Looking at the solution of the primal form of SVM, each component of $\mathbf{w}_1$ can be thought of as the weight of its corresponding feature $\phi(\mathbf{x}_i)$ in the RKHS. Feature reduction is done by removing features with zero weight from the set.

The primal SVM optimization problem minimizes $\|\mathbf{w}_1\|^2$ while obeying all the constraints, which forces the value of each component $w_i$ to be small but does not set it to zero, because the derivative of $\|\mathbf{w}_1\|^2$ near $w_i \approx 0$ is small. We could replace $\|\mathbf{w}_1\|^2$ by the 1-norm $\|\mathbf{w}_1\|_1$ in the SVM to address this, but that would prevent us from transforming the SVM into its dual form and lose all the associated advantages. The simplest way around this is to set a threshold on $|w_i|$ and remove the features associated with weights smaller than the threshold.

Using the time series model above, the target of feature reduction in our damage detection approach is to reduce the embedding dimension, i.e., the length of the patterns in the sample space. Because our target is the sample space, no feature mapping is needed, and a linear kernel is suitable for this scenario because of its efficiency. Note that the choice of kernel in feature selection is independent of the choice of kernel in the classification or regression stage. When a linear kernel is used, the features in the RKHS are the patterns in the sample space themselves. This is why we use the conventional term "feature selection" throughout this paper, although we are actually doing "pattern selection".

The value of $\mathbf{w}_1$ is not calculated when solving the SVM, because only its dot product is required, and that can be obtained more efficiently by evaluating the kernel function. When doing feature selection, we need to calculate $\mathbf{w}_1$ explicitly. The following relation is obtained while deriving the dual form of SVM with a linear kernel:
$\mathbf{w}_1 = \sum_{i=1}^{N_{SV}} \alpha_i y_i \mathbf{x}_i$

which can be used to determine $\mathbf{w}_1$ once the corresponding SVM is solved. This feature selection approach allows us to reduce the number of features while keeping the accuracy, as our numerical studies show.

4. Numerical Studies

In this section, we demonstrate the proposed approach using a simple 2-story shear building and the ASCE benchmark problem [19]. In all examples, acceleration responses are first normalized using

$a_i = (a_i - \mu_a)/\sigma_a$

where $\mu_a$ and $\sigma_a$ are the sample mean and sample standard deviation of the acceleration signal, respectively. Also, in all examples, "acceleration response" means the relative acceleration response between two adjacent floors, i.e., the acceleration difference between the current floor and the one below. With these conventions, we do not need to deal with the scale and units of the loading, and we can better isolate the effect of damage in each story. The values of SVM-related parameters, such as $C$, $\nu$, $\varepsilon$, and the $\sigma$ in the RBF kernel, are selected based on common practice in pattern recognition and are specified in each example. In general, similar results are obtained as long as those parameters are within a reasonable range.

4.1 Two-Story Shear Building

We start with the simple 2-story shear building shown in Figure 2. Damage is modeled by reducing the stiffness of a column. Vibration data are collected through accelerometers attached under each floor. Three different SVM-based approaches are used for damage detection: (1) supervised SVM, (2) one-class SVM, and (3) support vector regression.
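The relation above can be sketched directly. The snippet below (the support vectors, labels, dual coefficients, and threshold are made-up illustrative values, not results from the paper) recovers the primal weight vector from a hypothetically solved linear-kernel dual and keeps only the features whose weight magnitude exceeds a threshold:

```python
import numpy as np

# Hypothetical solved dual: support vectors X, labels y, and multipliers alpha.
X = np.array([[ 1.0,  0.0,  0.2],
              [ 0.9,  0.1, -0.1],
              [-1.0,  0.0,  0.1],
              [-1.1, -0.1,  0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
alpha = np.array([0.5, 0.5, 0.5, 0.5])  # assumed to satisfy sum(alpha * y) = 0

# w_1 = sum_i alpha_i * y_i * x_i  (linear kernel)
w = X.T @ (alpha * y)

# Keep only the features whose weight magnitude exceeds a chosen threshold.
threshold = 0.5
selected = np.flatnonzero(np.abs(w) > threshold)
print(w)         # the first component dominates
print(selected)  # -> [0]
```

With a linear kernel each component of `w` weights one pattern position, so thresholding `|w|` is exactly the embedding-dimension reduction used in section 4.2.2.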
  Figure 2. Plane steel frame under transverse seismic load (EI = 6.379 N·m² for all columns; accelerometers at locations A (1F) and B (2F))

4.1.1 Damage Detection Using Supervised Support Vector Machine

Figure 3 shows the acceleration response of the structure under the 1940 El Centro earthquake load; each time series corresponds to a different structural state. Damage in a floor is modeled by reducing the stiffness of one of the columns in that floor by 50%.

  Figure 3. Acceleration measurements from accelerometer A (1F, El Centro)

The vibration data are recorded at locations A and B with a sampling rate of 50 Hz. Therefore, with a 2-second window, we can extract 100 patterns from the time series for each example. Knowing the patterns and their corresponding labels (undamaged, 1st floor damaged, or 2nd floor damaged), we feed these data into a support vector machine (using C = 100 and an RBF kernel with σ² = 20). The results of a 5-fold cross validation are shown in Table 1.
We can see that SVM is able to detect the occurrence as well as the location of the damage with very high accuracy, provided the patterns are long enough. The trial-and-error way of selecting patterns here will be replaced by our feature selection algorithm in section 4.2.2.

  # of Patterns   CV 1      CV 2      CV 3      CV 4      CV 5      Average
  100             97/120    89/120    90/120    87/120    78/120    76.2%
  150             111/120   111/120   119/120   117/120   116/120   95.7%
  200             120/120   120/120   120/120   120/120   120/120   100%

  Table 1. Cross validation results (3 structural states; El Centro)

In the next example, the same structure is excited using two different seismic loads. For each load, acceleration responses in two different structural states (undamaged / 1st floor damaged) are recorded, as shown in Figure 4.

  Figure 4. Acceleration measurements from accelerometer A (El Centro and Kobe)

The purpose here is to detect damage in the structure regardless of the source of excitation. We mix the two acceleration responses measured from the structure under different excitations and train the SVM with 2 classes (damaged or undamaged) instead of 4. The cross validation results are shown in Table 2. Even with mixed excitations, SVM can still achieve accurate detection. When we group the training samples with the same structural state together, we implicitly indicate that the excitation is not a feature we care about, hence the SVM focuses on maximizing the differences caused by changes in the structure.
  # of Patterns   CV 1      CV 2      CV 3      CV 4      CV 5      Average
  200             145/160   151/160   146/160   144/160   147/160   91.6%

  Table 2. Cross validation results (2 structural states; El Centro & Kobe)

4.1.2 Damage Detection Using One-Class Support Vector Machine

Although supervised SVM classification is accurate and easy to implement, in practice we often do not have vibration data from the damaged structure beforehand. In this section, we apply one-class SVM to the same structure used before. Similarly, we extract features from the acceleration response by setting a window size of 2 seconds (100 patterns for each example), and we again use an RBF kernel with σ² = 20, and ν = 0.1. Three one-class SVM models are trained using response data measured from the undamaged structure, each with a different seismic load. Each model is then used to test response data measured from both the damaged and the undamaged structure under the 3 seismic loads. The results are shown below.

  Training:   El Centro               Golden Gate             Kobe
  Testing:    El C.   G.G.   Kobe    El C.   G.G.   Kobe    El C.   G.G.   Kobe
  Undam.      14.3%   29.6%  34.9%   85.3%   16.3%  97.8%   18.9%   21.6%  17.4%
  1F          89.1%   90.6%  75.0%   98.8%   97.2%  100%    86.7%   57.8%  68.8%
  2F          80.9%   100%   73.4%   100%    100%   100%    75.4%   100%   66.6%

  Table 3. Proportion of outliers (800 testing samples)

  Figure 5. Proportion of outliers (800 testing samples; models built using El Centro, Golden Gate, and Kobe seismic loading respectively, left to right)

Table 3 and Figure 5 indicate that when damage occurs in the structure, the percentage of outliers increases significantly. Note that each SVM model is trained using positive samples measured from one particular seismic load.
When the model trained using the Golden Gate earthquake is applied to monitor the same structure under a different seismic load, a large portion of the signals measured from the undamaged structure are also considered outliers. This is because both the external force and the structural state affect the acceleration response, and a model built on one particular loading history does not generalize well to monitoring under arbitrary loading. To reduce this unwanted effect, we train SVM models using a larger database consisting of a mixture of acceleration responses measured from the undamaged structure under different seismic loads. By grouping these responses together, we implicitly tell the SVM to ignore the differences caused by excitation variability. Table 4 and Figure 6 show the results of damage detection using models built on 3 data sets of different sizes (training data measured from the structure under: a. Golden Gate; b. El Centro and Golden Gate; c. El Centro, Golden Gate, Corral, Hach, and Hachinohe seismic loads).

  Training:   Golden Gate          2 mixture            5 mixture
  Testing:    El Centro   Kobe    El Centro   Kobe     El Centro   Kobe
  Undam.      85.3%       97.8%   9.5%        31.6%    1.0%        16.0%
  1F          98.8%       100%    90.5%       75.1%    64.0%       66.6%
  2F          100%        100%    79.0%       73.5%    63.8%       64.5%

  Table 4. Proportion of outliers (800 testing samples) detected at location A

  Figure 6. Proportion of outliers detected at location A

As shown in Figure 6, when the SVM model is trained using the mixed data set, the effect of loading variability is averaged out and the change in structural properties becomes dominant, i.e., the model is able to detect damage caused by arbitrary loads.
Note that the acceleration response measured from the Kobe earthquake is never included in the training set, and the result is still good, i.e., the model generalizes well to unseen data. Nonetheless, when damage occurs in either floor, the model detects a significant change in both sensors and fails to tell the location of the damage.

4.1.3 Damage Detection Using Regression-Based Methods

Using a regression-based novelty detection approach for damage detection has been suggested by Los Alamos National Laboratory (LANL) [5] and followed by others with minor modifications [6, 20]. The concept of this two-step approach is as follows. For each structure, a "reference database" is created, recording the acceleration responses of the undamaged structure perturbed by many different excitations. When a new acceleration response a_TBD(t) is measured from a structure whose current state is to be determined, the first step is to select from the predefined database the acceleration response a_un(t) closest to the current measurement; this step is referred to as "data normalization". The second step is to fit a_un(t) using an auto-regressive model with exogenous inputs (ARX) and use the ARX model to predict a_TBD(t). Denoting the training error between the ARX model and a_un(t) at time t as ε_un(t), and the prediction error between the ARX model and a_TBD(t) at time t as ε_TBD(t), the ratio of the standard deviations of the two errors is defined as the damage-sensitive feature

$h = \sigma(\varepsilon_{TBD}) / \sigma(\varepsilon_{un})$

and an empirical threshold limit is used to indicate the occurrence of damage.

We adopt the concept of the damage-sensitive indicator and make three modifications to the LANL approach. First, instead of selecting the closest acceleration response from the reference database and building a regression model from that one response, we build our model from all responses in the database.
This simulates the worst case in the first step of the LANL approach, i.e., no similar excitation can be found in the reference database. Second, in the LANL approach, ε_un(t) is the training error of building the ARX model, while ε_TBD(t) is the prediction error when the ARX model is used to predict unseen data. To be more consistent, in our approach ε_un(t) is calculated by using our regression model to predict an arbitrary piece of unseen response data from the undamaged structure. Third, linear regression is replaced by SVR, which does not have to be linear and can guarantee a bounded generalization error. Combined with our feature selection scheme, SVR also provides a systematic way of determining the embedding dimension, a free parameter in the time series model.

The 2-story steel frame shown in Figure 2 is used in this example. Two experiments are conducted by exciting the structure using the El Centro (1940) and Golden Gate (1989) earthquakes, respectively. For each experiment, a 5-second acceleration response, measured from the structure 5 seconds after the start of excitation, is used as training data. The response measured in the next 1 second is used as testing data. We choose C = 100 and ε = 0.1 for the SVR, and an RBF kernel with σ = 10 is used. The damage detection results are shown in Table 5.

  Seismic load:          El Centro        Golden Gate
  Location of damage:    1F      2F       1F      2F
  h (location A, 1F)     2.984   1.474    2.240   1.173
  h (location B, 2F)     1.207   2.554    1.244   2.344

  Table 5. Damage detection in a 2-story frame using SVR

As expected, the SVR model built from the undamaged structure yields significantly higher prediction errors when used to predict the response of the damaged structure. When a suitable threshold limit is chosen for the damage-sensitive feature h, the proposed approach is able to indicate both the existence and the location of the damage.
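The damage-sensitive feature h is simple to compute once prediction residuals are available. The sketch below uses synthetic Gaussian residuals (our own assumption for illustration, not data from the paper) to show how h stays near 1 when the structure is unchanged and grows when the regression model misfits the new response:

```python
import numpy as np

def damage_feature(err_tbd, err_un):
    """h = std(prediction error on new data) / std(error on undamaged data).

    h near 1 suggests the regression model still fits, i.e. no damage;
    h well above a chosen threshold flags a change in the structure.
    """
    return np.std(err_tbd) / np.std(err_un)

rng = np.random.default_rng(1)
err_un = rng.normal(0.0, 1.0, 1000)       # residuals on undamaged response
err_same = rng.normal(0.0, 1.0, 1000)     # new data, structure unchanged
err_damaged = rng.normal(0.0, 2.5, 1000)  # larger misfit after simulated damage

print(damage_feature(err_same, err_un))     # close to 1
print(damage_feature(err_damaged, err_un))  # well above 1
```

In practice the threshold separating the two regimes is chosen empirically, as in the LANL approach described above.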
4.2 ASCE Benchmark Problem

Structural health monitoring studies often apply different methods to different structures, which makes side-by-side comparison of those methods difficult. To coordinate the studies, the ASCE Task Group on Health Monitoring built a 4-story, 2-bay by 2-bay steel frame benchmark structure and provided two finite element based models: a 12DOF shear building and a more realistic 120DOF 3D model [19]. The benchmark problem is studied in the following examples.

4.2.1 Support Vector Regression

Five damage patterns are defined in the benchmark study, and we apply the SVR detection procedure to the first two: (1) all braces in the first story are removed, and (2) all braces in both the first story and the third story are removed. Acceleration responses for these two damage patterns are generated using the 12DOF analytical model under ambient wind load. The results of damage detection and localization using the damage-sensitive feature h are shown in Table 6. The training data are a mixture of 5-second acceleration responses obtained from the undamaged structure under 10 different ambient loads. For each damage pattern, two 1-second acceleration responses caused by different ambient loads (denoted L1 and L2 in Table 6) are used as testing data. We choose C = 100 and ε = 0.1 for the SVR, and an RBF kernel with σ = 10.

  Damage pattern:        1                           2
  # of patterns:    30          100            30          100
  Ambient load:   L1    L2    L1    L2       L1    L2    L1    L2
  h (1F)          2.57  2.46  1.69  1.56     2.03  2.07  1.78  1.58
  h (2F)          1.74  1.07  1.32  0.88     1.48  1.11  1.28  1.09
  h (3F)          1.30  1.43  1.26  1.07     2.19  1.92  1.71  1.48
  h (4F)          1.30  1.23  1.02  0.89     1.20  1.11  1.08  1.12

  Table 6. Damage detection and localization results for damage patterns 1 and 2

Compared to the results given in [6] and [20], the differences in the damage-sensitive features between the damaged and undamaged structure are less significant, because we simulate a worse case in the data normalization step.
Nevertheless, our approach indicates the occurrence and the
location of the damage in both damage patterns successfully, whereas the second floor in damage pattern 2 is classified as damaged in [6]. We can see that the value of h varies when the length of the patterns is changed. Although the approach is able to distinguish the structure's status at both pattern lengths, a systematic way of selecting features is more desirable. We will apply the feature selection scheme discussed in section 3.2 in the following example.

4.2.2 Feature Selection

We use the feature reduction scheme on both the supervised SVM and the unsupervised SVR approach. Recall that in our first example in section 4.1.1, the number of features was selected via trial-and-error, and more than 100 features were required in order to achieve 80% accuracy. Using Eq., we plot the absolute values of the components of w̃1 in Figure 7. It is clear that some features are more important than others, and we can understand why a long pattern was required. Table 7 shows that by selecting features based on the magnitude of the components of w̃1, we can obtain the same level of accuracy with far fewer features. Note that we use the terms feature and pattern interchangeably in this section, because we are selecting features in the input (pattern) space.

  [Figure 7. Absolute value of the components of the w̃1 vector; x-axis: feature index (1-196), y-axis: magnitude (0-3).]
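The ranking rule behind Figure 7, keeping the k features with the largest weight magnitudes in a trained linear SVM, can be sketched as below. The synthetic data and variable names are illustrative assumptions; in the paper the features are the vibration-response patterns themselves.

```python
import numpy as np
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)

# Synthetic binary problem: 200 input features, only the first 5 informative.
n_samples, n_features, n_informative = 600, 200, 5
X = rng.normal(size=(n_samples, n_features))
y = (X[:, :n_informative].sum(axis=1) > 0).astype(int)

clf = LinearSVC(C=1.0).fit(X, y)
w_abs = np.abs(clf.coef_.ravel())       # |w_i| for every input feature

k = 10
selected = np.argsort(w_abs)[::-1][:k]  # indices of the k largest |w_i|
```

Retraining on X[:, selected] then gives a much smaller model; Table 7 reports the accuracy retained by this kind of reduction in our experiments.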
                 First k patterns           Selected k patterns
  k              50     100    150    200   100     40
  CV 1 (120)     60     97     111    120   120     104
  CV 2 (120)     62     89     111    120   120     108
  CV 3 (120)     68     90     119    120   120     105
  CV 4 (120)     61     87     117    120   120     106
  CV 5 (120)     63     78     116    120   120     110
  Average        52.3%  76.2%  95.7%  100%  100%    88.8%

  Table 7. Feature selection in supervised SVM damage detection

Similarly, we apply the feature selection approach to the ASCE benchmark example. The results are shown in Figure 8 and Table 8. In this case, we can see that a long pattern is not necessary: using only the first 9 features, the model generates results similar to those obtained using all 100 features.

  [Figure 8. Distribution of the components of the w̃1 vector; x-axis: feature index (1-100), y-axis: magnitude (0-7000).]

  Damage pattern         1                          2
  # of patterns        9            100          9            100
  Ambient load       L1    L2    L1    L2      L1    L2    L1    L2
  h (1F)             2.16  2.18  1.69  1.56    1.90  1.85  1.78  1.58
  h (2F)             1.55  1.14  1.32  0.88    1.48  1.13  1.28  1.09
  h (3F)             1.35  1.30  1.26  1.07    2.08  1.74  1.71  1.48
  h (4F)             1.31  0.99  1.02  0.89    1.15  1.03  1.08  1.12

  Table 8. Feature selection in SVR-based damage detection

5. Conclusions

SVM has achieved remarkable success in pattern recognition and machine learning, and its continuing development also sheds new light on its applications in SHM. This paper has described
two approaches that apply unsupervised SVM algorithms to vibration-based damage detection, in addition to the supervised SVM introduced earlier by other researchers. By combining SVM-based novelty detection techniques with the vibration-based damage detection approach, we eliminate the need for data from the damaged structure. These approaches are easy to implement because only vibration responses measured from the structure are required to build the models. Numerical examples have shown that the SVR approach is able to detect both the occurrence and the location of damage. Furthermore, high-dimensional feature vectors introduce more noise and restrict the scalability of most statistical pattern recognition methods. We extend the idea of regularization in SVM to feature selection and show that the reduced model still retains the same level of accuracy.

Acknowledgement

This research is supported by the …

References

1. Rytter, A. Vibration Based Inspection of Civil Engineering Structures. Department of Building Technology and Structural Engineering, University of Aalborg, Denmark, 1993.
2. Doebling, S.W., Farrar, C.R., and Prime, M.B. A Summary Review of Vibration-Based Damage Identification Methods. The Shock and Vibration Digest, 1998, 30(2): 91-105.
3. Stubbs, N., et al. A Methodology to Nondestructively Evaluate the Structural Properties of Bridges. Proceedings of the 17th International Modal Analysis Conference, Kissimmee, Fla., 1999.
4. Haritos, N. and Owen, J.S. The Use of Vibration Data for Damage Detection in Bridges: A Comparison of System Identification and Pattern Recognition Approaches. International Journal of Structural Health Monitoring, 2004.
5. Sohn, H. and Farrar, C.R. Damage Diagnosis Using Time Series Analysis of Vibration Signals. Smart Materials and Structures, 2001.
6. Lei, Y., et al. An Enhanced Statistical Damage Detection Algorithm Using Time Series Analysis.
Proceedings of the 4th International Workshop on Structural Health Monitoring, 2003.
7. Worden, K. and Lane, A.J. Damage Identification Using Support Vector Machines. Smart Materials and Structures, 2001, 10(3): 540-547.
8. Yun, C.B., et al. Damage Estimation Method Using Committee of Neural Networks. Smart Nondestructive Evaluation and Health Monitoring of Structural and Biological Systems II, Proceedings of the SPIE, 2003, 5047: 263-274.
9. Bulut, A., Shin, P., and Yan, L. Real-time Nondestructive Structural Health Monitoring Using Support Vector Machines and Wavelets. Proceedings of Knowledge Discovery in Data and Data Mining, Seattle, WA, 2004.
10. Vapnik, V.N. The Nature of Statistical Learning Theory. Springer-Verlag, New York, 1995.
11. Burges, C.J.C. A Tutorial on Support Vector Machines for Pattern Recognition. Knowledge Discovery and Data Mining, 1998, 2(2).
12. Byvatov, E., et al. Comparison of Support Vector Machine and Artificial Neural Network Systems for Drug/Nondrug Classification. Journal of Chemical Information and Computer Sciences, 2003, 43(6): 1882-1889.
13. Schölkopf, B. and Smola, A. Learning with Kernels: Support Vector Machines, Regularization, Optimization and Beyond. MIT Press, 2002.
14. Shawe-Taylor, J. and Cristianini, N. Kernel Methods for Pattern Analysis. Cambridge University Press, 2004.
15. Fugate, M.L., Sohn, H., and Farrar, C.R. Unsupervised Learning Methods for Vibration-Based Damage Detection. Proceedings of the 18th International Modal Analysis Conference, San Antonio, Texas, 2000.
16. Schölkopf, B., et al. Estimating the Support of a High-Dimensional Distribution. Neural Computation, 2001, 13: 1443-1471.
17. Smola, A.J. and Schölkopf, B. A Tutorial on Support Vector Regression. NeuroCOLT2 Technical Report Series, 1998.
18. Mead, W.C., et al. Prediction of Chaotic Time Series Using CNLS-Net-Example: The Mackey-Glass Equation. Nonlinear Modeling and Forecasting, Addison Wesley, 1992.
19. Johnson, E.A., et al. A Benchmark Problem for Structural Health Monitoring and Damage Detection. Proceedings of the 14th Engineering Mechanics Conference, Austin, Texas, 2000.
20. Nair, K.K., et al. Application of Time Series Analysis in Structural Damage Evaluation. Proceedings of the International Conference on Structural Health Monitoring, Tokyo, Japan, 2003.
