This document describes a novel statistical damage detection approach using unsupervised support vector machines (SVM). It aims to identify damage in structural components through vibration-based methods. The proposed approach builds a statistical model through unsupervised learning, avoiding the need for measurements from damaged structures. It is computationally efficient even with large numbers of features and does not suffer from local minima problems like artificial neural networks. Numerical simulations show the approach can accurately detect both the occurrence and location of damage.
The document provides an overview of key statistical concepts including:
- Random variables and their probability distributions and functions
- Common estimators like the sample mean and variance
- The distributions of common estimators and how they relate to the underlying population parameters
- Confidence intervals and how they are used to quantify the uncertainty in estimates based on sample data
- Hypothesis testing framework including defining the null and alternative hypotheses, calculating a test statistic, and determining whether to reject or fail to reject the null based on probability thresholds
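To make the confidence-interval and hypothesis-testing ideas concrete, here is a minimal sketch using only Python's standard library; the sample values and the null hypothesis mu = 5.0 are made up for illustration:

```python
import math
import statistics

# Hypothetical sample; in practice these would be measured data.
sample = [4.8, 5.1, 5.0, 4.9, 5.3, 5.2, 4.7, 5.0, 5.1, 4.9]

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)   # sample standard deviation (n-1 denominator)
se = sd / math.sqrt(n)          # standard error of the mean

# 95% confidence interval (normal approximation, z = 1.96)
ci = (mean - 1.96 * se, mean + 1.96 * se)

# Two-sided z-test of H0: mu = 5.0 against H1: mu != 5.0
z = (mean - 5.0) / se
p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))

print(ci, z, p_value)
```

With a small n one would normally use the t distribution rather than the normal approximation; the z version keeps the sketch dependency-free.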
CVPR2010: Advanced ITinCVPR in a Nutshell: part 3: Feature Selection (zukun)
This document discusses high-dimensional feature selection for images, genes, and graphs. It covers several key topics:
1) Feature selection aims to reduce dimensionality in order to improve classifier performance and to identify important patterns. This is challenging with thousands of features.
2) Mutual information is proposed as an optimal criterion for evaluating feature subsets, as it relates to the Bayesian error rate.
3) The mRMR criterion is introduced to maximize feature relevance while minimizing redundancy between features.
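The mRMR idea can be sketched in a few lines of numpy: greedily add the feature whose mutual information with the label is highest after subtracting its average mutual information with the features already chosen. This is a generic illustration on made-up binary data, not the paper's implementation:

```python
import numpy as np

def mutual_info(x, y):
    """Mutual information (in nats) between two discrete vectors."""
    xs, x_idx = np.unique(x, return_inverse=True)
    ys, y_idx = np.unique(y, return_inverse=True)
    joint = np.zeros((len(xs), len(ys)))
    for i, j in zip(x_idx, y_idx):
        joint[i, j] += 1
    joint /= joint.sum()
    px = joint.sum(axis=1, keepdims=True)
    py = joint.sum(axis=0, keepdims=True)
    nz = joint > 0
    return float((joint[nz] * np.log(joint[nz] / (px @ py)[nz])).sum())

def mrmr(X, y, k):
    """Greedy mRMR: add the feature maximizing relevance I(f; y)
    minus mean redundancy with the already-selected features."""
    relevance = [mutual_info(X[:, f], y) for f in range(X.shape[1])]
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best, best_score = None, -np.inf
        for f in range(X.shape[1]):
            if f in selected:
                continue
            redundancy = np.mean([mutual_info(X[:, f], X[:, s]) for s in selected])
            if relevance[f] - redundancy > best_score:
                best, best_score = f, relevance[f] - redundancy
        selected.append(best)
    return selected

# Toy data: feature 0 drives the label, feature 1 is an exact duplicate,
# feature 2 is independent noise; the label flips feature 0 five percent of the time.
rng = np.random.default_rng(0)
f0 = rng.integers(0, 2, 200)
noise = rng.integers(0, 2, 200)
X = np.column_stack([f0, f0, noise])
y = np.where(rng.random(200) < 0.05, 1 - f0, f0)

sel = mrmr(X, y, 2)
print(sel)  # the duplicate feature 1 is penalized for redundancy
```

The redundancy penalty is what keeps the duplicated feature out of the selected subset, even though its relevance ties with feature 0.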
Sparse data formats and efficient numerical methods for uncertainties in nume... (Alexander Litvinenko)
A description of the methodologies and an overview of the numerical methods used for modeling and quantifying uncertainties in numerical aerodynamics.
This document proposes a simple procedure for beginners to obtain reasonable results when using support vector machines (SVMs) for classification tasks. The procedure involves preprocessing data through scaling, using a radial basis function kernel, selecting model parameters through cross-validation grid search, and training the full model on the preprocessed data. The document provides examples applying this procedure to real-world datasets, demonstrating improved accuracy over approaches without careful preprocessing and parameter selection.
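Assuming scikit-learn is available, that procedure (scale, RBF kernel, cross-validated grid search over C and gamma) can be sketched as a short pipeline; the grid values and dataset here are illustrative, not the document's own:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0)

# Scale features, use an RBF kernel, and search (C, gamma) on a coarse log grid.
pipe = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
grid = {"svc__C": [0.1, 1, 10, 100], "svc__gamma": [0.01, 0.1, 1]}
search = GridSearchCV(pipe, grid, cv=5)
search.fit(X_train, y_train)

print(search.best_params_, search.score(X_test, y_test))
```

Putting the scaler inside the pipeline matters: it guarantees the scaling parameters are fit on each cross-validation fold's training split only, avoiding leakage into the validation folds.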
This document discusses radial basis function networks. It begins by introducing the basic structure of RBF networks, which typically involve an input layer, a hidden layer that applies a nonlinear transformation using radial basis functions, and an output layer with a linear transformation. The document then discusses Cover's theorem, which states that pattern classification problems are more likely to be linearly separable when mapped to a higher-dimensional space through a nonlinear transformation. Several key concepts are introduced, including dichotomies, phi-separable functions, and using hidden functions to map patterns to a hidden feature space.
RECENT ADVANCES in PREDICTIVE (MACHINE) LEARNING (butest)
This document provides an introduction to recent advances in predictive machine learning, specifically support vector machines and boosted decision trees. It begins with an overview of predictive learning and common methods. It then describes kernel methods, including how they were extended to support vector machines. Next, it discusses extending decision trees with boosting. The document concludes by comparing support vector machines and boosted decision trees, and noting they are not the only recent advances in machine learning.
Support vector machines are widely used binary classifiers known for their ability to handle high-dimensional data. They classify data by separating the classes with a hyperplane that maximizes the margin between them; the data points closest to the hyperplane are known as support vectors. The selected decision boundary is thus the one that minimizes the generalization error (by maximizing the margin between the classes).
The document summarizes statistical pattern recognition techniques. It is divided into 9 sections that cover topics like dimensionality reduction, classifiers, classifier combination, and unsupervised classification. The goal of pattern recognition is supervised or unsupervised classification of patterns based on features. Dimensionality reduction aims to reduce the number of features to address the curse of dimensionality when samples are limited. Multiple classifiers can be combined through techniques like stacking, bagging, and boosting. Unsupervised classification uses clustering algorithms to construct decision boundaries without labeled training data.
A simple numerical procedure for estimating nonlinear uncertainty propagation (ISA Interchange)
This document presents a numerical method for estimating nonlinear uncertainty propagation. The method approximates the nonlinear function with piecewise linear segments. It then calculates the probability density function of the dependent variable based on the transformations of the linear segments. For functions of a normally distributed independent variable, the mean and confidence intervals of the dependent variable can be calculated using only the error function. A simple example of applying this method to a parabolic function is presented to demonstrate the technique.
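A crude numerical sketch in the same spirit (not the paper's exact algorithm): split the x-axis into many short segments, treat the function as linear on each, and weight each segment by its normal probability mass computed from the error function alone. The parabolic test function and the parameters mu = 1, sigma = 0.5 are chosen for illustration:

```python
import math

def normal_cdf(x, mu, sigma):
    # Normal CDF written in terms of the error function.
    return 0.5 * (1 + math.erf((x - mu) / (sigma * math.sqrt(2))))

def propagate_mean(f, mu, sigma, n_segments=2000, span=8.0):
    """Approximate E[f(X)] for X ~ N(mu, sigma) by splitting the x-axis into
    small segments, treating f as linear on each, and weighting each segment's
    midpoint value by its normal probability mass (erf only, no sampling)."""
    lo, hi = mu - span * sigma, mu + span * sigma
    total = 0.0
    for i in range(n_segments):
        a = lo + (hi - lo) * i / n_segments
        b = lo + (hi - lo) * (i + 1) / n_segments
        mass = normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)
        total += mass * f(0.5 * (a + b))  # midpoint as the segment representative
    return total

# Parabolic example: y = x^2 with X ~ N(1, 0.5).
mu, sigma = 1.0, 0.5
approx = propagate_mean(lambda x: x * x, mu, sigma)
exact = mu**2 + sigma**2  # analytic E[X^2] for a normal variable
print(approx, exact)
```

For y = x^2 the approximation can be checked against the closed-form moment E[X^2] = mu^2 + sigma^2, which is the point of using a parabola as the demonstration case.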
Unsupervised learning uses unlabeled data. It is applied to problems such as clustering, dimensionality reduction, and association rule learning.
In the first section we discuss several clustering methods: k-means, mean shift, Gaussian mixture models, and affinity propagation. We also define and use silhouette scores, which help select the most appropriate number of clusters for the data.
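Assuming scikit-learn, selecting the number of clusters by silhouette score can be sketched as follows; synthetic blob data stands in for whatever the notebook actually uses:

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

# Synthetic data with 3 well-separated clusters.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.7, random_state=0)

# Fit k-means for several candidate k and record each partition's silhouette score.
scores = {}
for k in range(2, 7):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = silhouette_score(X, labels)

best_k = max(scores, key=scores.get)
print(scores, best_k)
```

The silhouette score rewards tight, well-separated clusters, so it should peak at the true number of blobs here.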
[Notebook](https://colab.research.google.com/drive/1g4hcSfiO-TW35JbiQ_kGQAsgMZDPkp7L)
Tutorial on Markov Random Fields (MRFs) for Computer Vision Applications (Anmol Dwivedi)
The goal of this mini-project is to implement a pairwise binary label-observation Markov Random Field model for bi-level image segmentation. Specifically, two inference algorithms, Iterated Conditional Modes (ICM) and Gibbs sampling, will be implemented to perform the segmentation.
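A minimal numpy sketch of ICM for a binary pairwise MRF follows; this uses a generic Ising-style energy with made-up weights and a synthetic two-region image, not necessarily the project's exact model:

```python
import numpy as np

def icm(noisy, beta=2.0, n_iters=5):
    """Iterated Conditional Modes for a binary (-1/+1) pairwise MRF.
    Each pixel takes the label maximizing its local conditional score:
    agreement with the observation plus beta times agreement with 4-neighbors."""
    labels = noisy.copy()
    H, W = labels.shape
    for _ in range(n_iters):
        for i in range(H):
            for j in range(W):
                neigh = 0
                if i > 0:
                    neigh += labels[i - 1, j]
                if i < H - 1:
                    neigh += labels[i + 1, j]
                if j > 0:
                    neigh += labels[i, j - 1]
                if j < W - 1:
                    neigh += labels[i, j + 1]
                # Score of label s is s*(observation + beta*neighbor_sum);
                # pick the sign that makes it non-negative.
                labels[i, j] = 1 if (noisy[i, j] + beta * neigh) >= 0 else -1
    return labels

# Ground truth: left half -1, right half +1; flip 10% of pixels as noise.
rng = np.random.default_rng(0)
truth = np.ones((40, 40), dtype=int)
truth[:, :20] = -1
flip = rng.random(truth.shape) < 0.10
noisy = np.where(flip, -truth, truth)

restored = icm(noisy)
print((noisy != truth).mean(), (restored != truth).mean())
```

Because ICM makes a greedy local update at every pixel, it converges quickly but only to a local optimum; Gibbs sampling instead draws each label from its conditional distribution, which is what lets it escape such optima at the cost of more iterations.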
The document discusses applying random distortion testing (RDT) in a spectral clustering context. RDT is a framework for guaranteeing a false alarm probability threshold in detecting distorted data using threshold-based tests. The document introduces RDT and spectral clustering concepts. It then proposes using the p-value from RDT as the similarity function or kernel in spectral clustering, to handle disturbed data. Experiments are conducted to compare the partitioning performance of the RDT p-value kernel to the Gaussian kernel.
The document discusses the multiple linear regression model and ordinary least squares (OLS) estimation. It presents the econometric model, where a dependent variable is modeled as a linear function of explanatory variables, plus an error term. It describes the assumptions of the linear regression model, including linearity, independence of observations, exogeneity of regressors, and properties of the error term. It then discusses OLS estimation, goodness of fit, hypothesis testing, confidence intervals, and asymptotic properties of the OLS estimator.
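The OLS mechanics can be illustrated directly with numpy; the design matrix, coefficients, and noise level below are invented for the example:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 regressors
beta_true = np.array([2.0, -1.0, 0.5])
y = X @ beta_true + rng.normal(scale=0.1, size=n)           # linear model + error term

# OLS estimator: beta_hat = (X'X)^{-1} X'y
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# Goodness of fit: R^2 = 1 - SSR/SST
resid = y - X @ beta_hat
r2 = 1 - resid @ resid / ((y - y.mean()) @ (y - y.mean()))
print(beta_hat, r2)
```

With exogenous regressors and well-behaved errors, the estimates land close to the true coefficients, and their accuracy improves as n grows, which is the consistency property the document's asymptotic section refers to.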
Principal Component Analysis For Novelty Detection (Jordan McBain)
This document summarizes a journal article that proposes using principal component analysis (PCA) for novelty detection in condition monitoring applications. It describes how PCA can be used to reduce the dimensionality of feature spaces while retaining most of the variation in the data. The authors modify the standard PCA technique to maximize the difference between the spread of normal data and the spread of outlier data from the mean of the normal data. They validate the approach on artificial and machinery vibration data and show it can effectively distinguish outliers. Future work could involve extending the technique to non-linear data using kernel methods.
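The standard PCA novelty score (reconstruction error from the retained subspace, not the authors' modified criterion) can be sketched with numpy on synthetic data:

```python
import numpy as np

rng = np.random.default_rng(0)

# "Normal" training data: 2D structure embedded in 5 dimensions plus small noise.
latent = rng.normal(size=(300, 2))
mixing = rng.normal(size=(2, 5))
X = latent @ mixing + 0.05 * rng.normal(size=(300, 5))

# Fit PCA on normal data only: center, then take the top right singular vectors.
mean = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
components = Vt[:2]  # retain the 2 directions carrying most of the variance

def novelty_score(x):
    """Reconstruction error: distance from x to the principal subspace."""
    centered = x - mean
    recon = centered @ components.T @ components
    return np.linalg.norm(centered - recon)

normal_point = latent[0] @ mixing       # lies on the learned structure
outlier = rng.normal(size=5) * 5.0      # generic off-subspace point
print(novelty_score(normal_point), novelty_score(outlier))
```

Points resembling the training data reconstruct almost perfectly from the retained components, while outliers leave a large residual, which is what makes the reconstruction error usable as a novelty score in condition monitoring.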
Covariance matrices are central to many adaptive filtering and optimisation problems. In practice, they have to be estimated from a finite number of samples; on this, I will review some known results from spectrum estimation and multiple-input multiple-output communications systems, and how properties that are assumed to be inherent in covariance and power spectral densities can easily be lost in the estimation process. I will discuss new results on space-time covariance estimation, and how the estimation from finite sample sets will impact on factorisations such as the eigenvalue decomposition, which is often key to solving the introductory optimisation problems. The purpose of the presentation is to give you some insight into estimating statistics as well as to provide a glimpse on classical signal processing challenges such as the separation of sources from a mixture of signals.
Analytical study of feature extraction techniques in opinion mining (csandit)
Although opinion mining is at a nascent stage of development, the ground is set for dense growth of research in the field. One of the important activities of opinion mining is to extract people's opinions based on the characteristics of the object under study. Feature extraction in opinion mining can be done in various ways, such as clustering and support vector machines. This paper is an attempt to appraise the various techniques of feature extraction: the first part discusses the techniques and the second part makes a detailed appraisal of the major ones.
This chapter summary discusses discrete probability distributions. It distinguishes between discrete and continuous random variables and distributions. It describes how to determine the mean and variance of discrete distributions. It introduces some common discrete distributions like the binomial and Poisson distributions. For the binomial distribution, it explains how to calculate the probability of a given number of successes in a given number of trials. For the Poisson distribution, it provides the probability formula and explains that it models independent events occurring continuously over an interval.
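Both pmfs follow directly from their formulas; a standard-library sketch with n, p, and lambda chosen arbitrarily:

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam)."""
    return lam**k * math.exp(-lam) / math.factorial(k)

# Binomial mean is n*p (variance n*p*(1-p)); Poisson mean equals its variance, lam.
n, p = 10, 0.3
mean_binom = sum(k * binomial_pmf(k, n, p) for k in range(n + 1))
print(binomial_pmf(3, n, p), mean_binom)  # mean should come out to n*p = 3.0

# For large n and small p, Binomial(n, p) is well approximated by Poisson(n*p).
print(binomial_pmf(2, 1000, 0.002), poisson_pmf(2, 2.0))
```

The last line illustrates the classic limit linking the two distributions: many independent low-probability trials behave like events arriving at a constant average rate.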
Inference & Learning in Linear Chain Conditional Random Fields (CRFs) (Anmol Dwivedi)
This mini-project considers inference and learning in linear chain CRFs, applied to handwritten word recognition. Handwritten word recognition is a task many have explored with different machine learning methods; characters can be evaluated individually or as a whole word to account for context between characters. Here, linear chain CRF models are used to exploit that context and improve word recognition accuracy.
This document outlines the key concepts that will be covered in Lecture 2 on Bayesian modeling. It introduces the likelihood function and how it can be used to determine the most likely parameter values given observed data. It provides examples of applying Bayesian modeling to proportions, normal distributions, linear regression with one predictor, and linear regression with multiple predictors. The lecture aims to give students a basic understanding of how Bayesian analysis works and prepare them for fitting linear mixed models.
This document summarizes an article that proposes three new secret key sharing schemes based on the Chinese Remainder Theorem (CRT). It begins by providing background on CRT and secret sharing schemes. It then presents the main result, which is three theorems and algorithms for authenticated key distribution using a given set of primes. The first theorem describes how to construct three secret shares from a secret S such that combining the shares recovers S. It proves this using a lemma about finding integers that satisfy a system of congruences. The next sections provide examples and algorithms to motivate the secret sharing schemes. In summary, the document presents new methods for secret sharing based on number theory and the CRT.
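The CRT mechanics behind such schemes can be sketched in a few lines; the moduli and secret below are toy values, and this shows only the basic share-and-reconstruct step, not the article's full authenticated scheme:

```python
import math

# Pairwise coprime moduli (hypothetical choices) and a secret below their product.
moduli = [97, 101, 103]
secret = 123456
assert secret < math.prod(moduli)

# Each share is the pair (secret mod m_i, m_i).
shares = [(secret % m, m) for m in moduli]

def crt_reconstruct(shares):
    """Combine residue shares via the Chinese Remainder Theorem."""
    M = math.prod(m for _, m in shares)
    total = 0
    for r, m in shares:
        Mi = M // m
        total += r * Mi * pow(Mi, -1, m)  # pow(x, -1, m) is the modular inverse
    return total % M

print(crt_reconstruct(shares))  # recovers 123456
```

Because the moduli are pairwise coprime, the system of congruences has a unique solution modulo their product, which is exactly the lemma-style argument the article's first theorem builds on.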
Big Data analysis involves building predictive models from high-dimensional data using techniques like variable selection, cross-validation, and regularization to avoid overfitting. The document discusses an example analyzing web browsing data to predict online spending, highlighting challenges with large numbers of variables. It also covers summarizing high-dimensional data through dimension reduction and model building for prediction versus causal inference.
The document introduces factor of safety and probability of failure in engineering design. It discusses using sensitivity studies to systematically vary parameters over their credible ranges to determine the influence on factor of safety. This allows a more rational assessment of design risks than relying on a single calculated factor of safety. The document then provides an introduction to probability theory and statistical concepts used in probabilistic analyses, including random variables, probability distributions, sampling techniques, and calculating the probability of failure for a slope design example.
Data Science - Part XII - Ridge Regression, LASSO, and Elastic Nets (Derek Kane)
The document discusses various regression techniques including ridge regression, lasso regression, and elastic net regression. It begins with an overview of advancements in regression analysis since the late 1800s/early 1900s enabled by increased computing power. Modern high-dimensional data often has many independent variables, requiring improved regression methods. The document then provides technical explanations and formulas for ordinary least squares regression, ridge regression, lasso regression, and their properties such as bias-variance tradeoffs. It explains how ridge and lasso regression address limitations of OLS through regularization that shrinks coefficients.
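The shrinkage effect is easy to see from the closed-form ridge estimator; a numpy sketch on synthetic data (the dimensions and noise level are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 10
X = rng.normal(size=(n, d))
beta = rng.normal(size=d)
y = X @ beta + rng.normal(scale=0.5, size=n)

def ridge(X, y, lam):
    """Closed-form ridge estimator: (X'X + lam*I)^{-1} X'y (lam = 0 gives OLS)."""
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# Increasing lambda shrinks the coefficient vector toward zero,
# trading a little bias for a reduction in variance.
norms = [np.linalg.norm(ridge(X, y, lam)) for lam in (0.0, 1.0, 10.0, 100.0)]
print(norms)
```

The lasso has no such closed form because of the non-differentiable L1 penalty; it is typically fit by coordinate descent, and unlike ridge it can shrink coefficients exactly to zero.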
This document provides a practical guide for using support vector machines (SVMs) for classification tasks. It recommends beginners follow a simple procedure: 1) preprocess data by converting categorical features to numeric and scaling attributes, 2) use a radial basis function kernel, 3) perform cross-validation to select optimal values for hyperparameters C and γ, and 4) train the full model on the training set using the best hyperparameters. The guide explains why this procedure often provides reasonable results for novices and illustrates it using examples of real-world classification problems.
This document provides an overview of kernel machines and the kernel trick in machine learning. It discusses how the kernel trick allows projecting data into a higher dimensional space to make it linearly separable. It describes using kernels like polynomial kernels in the dual formulation to calculate dot products without explicitly performing the projection. The kernel trick avoids having to compute in the higher dimensional space, improving computational efficiency.
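The degree-2 polynomial kernel makes the equivalence concrete: the kernel value equals a dot product in a 6-dimensional feature space that is never explicitly constructed. A small numpy check:

```python
import numpy as np

def phi(x):
    """Explicit degree-2 polynomial feature map for 2D input:
    (x1^2, x2^2, sqrt(2)*x1*x2, sqrt(2)*x1, sqrt(2)*x2, 1)."""
    x1, x2 = x
    return np.array([x1**2, x2**2, np.sqrt(2) * x1 * x2,
                     np.sqrt(2) * x1, np.sqrt(2) * x2, 1.0])

def poly_kernel(x, z):
    """k(x, z) = (x.z + 1)^2, computed without leaving the 2D input space."""
    return (np.dot(x, z) + 1.0) ** 2

x = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])

# The two numbers agree: the kernel is a dot product in the mapped space.
print(poly_kernel(x, z), np.dot(phi(x), phi(z)))
```

For a degree-d kernel on p-dimensional inputs the explicit feature space has O(p^d) coordinates, so computing the kernel directly in the input space is where the efficiency gain comes from.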
This document proposes using machine learning techniques to predict COVID-19 infections based on chest x-ray images. Specifically, it involves using discrete wavelet transform to extract space-frequency features from chest x-rays, reducing the dimensionality of features using Shannon entropy, and then training standard machine learning classifiers like logistic regression, support vector machine, decision tree, and convolutional neural network on the extracted features to classify images as COVID-19 positive or negative. The document provides background on the proposed techniques of discrete wavelet transform, entropy, and various machine learning models.
1) The document discusses Vapnik's approach to statistical modeling and machine learning, focusing on the concepts of generalization, overfitting, and VC dimension.
2) It introduces the idea of Structural Risk Minimization (SRM), which aims to control a model's complexity through the VC dimension in order to maximize generalization. SRM selects the model that minimizes the total risk of empirical risk and confidence interval.
3) As an example, it describes how SRM can be implemented in an industrial data mining context to optimize variables, model structure, and hyperparameters for tasks like classification and regression.
Support Vector Machine topic of machine learning.pptxCodingChamp1
Support Vector Machines (SVM) find the optimal separating hyperplane that maximizes the margin between two classes of data points. The hyperplane is chosen such that it maximizes the distance from itself to the nearest data points of each class. When data is not linearly separable, the kernel trick can be used to project the data into a higher dimensional space where it may be linearly separable. Common kernel functions include linear, polynomial, radial basis function (RBF), and sigmoid kernels. Soft margin SVMs introduce slack variables to allow some misclassification and better handle non-separable data. The C parameter controls the tradeoff between margin maximization and misclassification.
The International Journal of Engineering and Science (The IJES)theijes
The International Journal of Engineering & Science is aimed at providing a platform for researchers, engineers, scientists, or educators to publish their original research results, to exchange new ideas, to disseminate information in innovative designs, engineering experiences and technological skills. It is also the Journal's objective to promote engineering and technology education. All papers submitted to the Journal will be blind peer-reviewed. Only original articles will be published.
And Then There Are Algorithms - Danilo Poccia - Codemotion Rome 2018Codemotion
In machine learning, training large models on a massive amount of data usually improves results. Our customers report, however, that training such models and deploying them is either operationally prohibitive or outright impossible for them. We created a collection of machine learning algorithms that scale to any amount of data, including k-means clustering for data segmentation, factorization machines for recommendations, time-series forecasting, linear regression, topic modeling, and image classification. This talk will discuss those algorithms, understand where and how they can be used.
In this chapter, our goal is to introduce the foundational principles of supervised learning. As we progress, we place particular emphasis on both regression and classification techniques, offering learners a more comprehensive perspective on the practical application of these methodologies in real-world scenarios. By the end of this chapter, learners will not only possess a robust understanding of the core principles but will also be armed with valuable insights into the tangible applications of supervised learning. This knowledge empowers them to skillfully navigate and leverage the full potential of this influential paradigm within the vast expanse of machine learning.
Support Vector Machines USING MACHINE LEARNING HOW IT WORKSrajalakshmi5921
This document discusses support vector machines (SVM), a supervised machine learning algorithm used for classification and regression. It explains that SVM finds the optimal boundary, known as a hyperplane, that separates classes with the maximum margin. When data is not linearly separable, kernel functions can transform the data into a higher-dimensional space to make it separable. The document discusses SVM for both linearly separable and non-separable data, kernel functions, hyperparameters, and approaches for multiclass classification like one-vs-one and one-vs-all.
This document provides an overview of machine learning concepts related to overfitting and model selection. It discusses overfitting in k-nearest neighbors and regression models. It introduces bias-variance decomposition and structural risk minimization. Methods for controlling overfitting like cross-validation, regularization, feature selection and model selection are covered. The concepts of consistency, model convergence speed, and strategies for controlling generalization capacity are explained.
This document discusses different types of regression analysis techniques including linear regression, polynomial regression, support vector regression, decision tree regression, ridge regression, lasso regression, and logistic regression. Linear regression finds the relationship between a continuous dependent variable and one or more independent variables. Polynomial regression handles nonlinear relationships through higher-order terms. Support vector regression and decision tree regression can handle both linear and nonlinear data. Ridge and lasso regression are regularization techniques used to prevent overfitting. Logistic regression is for classification rather than regression problems.
A Fuzzy Interactive BI-objective Model for SVM to Identify the Best Compromis...ijfls
This document summarizes a research paper that proposes a fuzzy bi-objective support vector machine (SVM) model to identify infected COVID-19 patients. The model uses SVM classification with two objectives - maximizing margin between classes and minimizing misclassification errors. An α-cut transforms the fuzzy model into a classical bi-objective problem solved using weighting methods. This generates multiple efficient solutions. An interactive process then identifies the best compromise based on minimizing the number of support vectors in each class. The model constructs a utility function to measure COVID-19 infection levels based on the SVM classification.
A FUZZY INTERACTIVE BI-OBJECTIVE MODEL FOR SVM TO IDENTIFY THE BEST COMPROMIS...ijfls
A support vector machine (SVM) learns the decision surface from two different classes of the input points. In several applications, some of the input points are misclassified and each is not fully allocated to either of these two groups. In this paper a bi-objective quadratic programming model with fuzzy parameters is utilized and different feature quality measures are optimized simultaneously. An α-cut is defined to transform the fuzzy model to a family of classical bi-objective quadratic programming problems. The weighting method is used to optimize each of these problems. For the proposed fuzzy bi-objective quadratic programming model, a major contribution will be added by obtaining different effective support vectors due to changes in weighting values. The experimental results, show the effectiveness of the α-cut with the weighting parameters on reducing the misclassification between two classes of the input points. An interactive procedure will be added to identify the best compromise solution from the generated efficient solutions. The main contribution of this paper includes constructing a utility function for measuring the degree of infection with coronavirus disease (COVID-19).
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...gerogepatton
A support vector machine (SVM) learns the decision surface from two different classes of the input points, there are misclassifications in some of the input points in several applications. In this paper a bi-objective quadratic programming model is utilized and different feature quality measures are optimized simultaneously using the weighting method for solving our bi-objective quadratic programming problem. An important contribution will be added for the proposed bi-objective quadratic programming model by getting different efficient support vectors due to changing the weighting values. The numerical examples, give evidence of the effectiveness of the weighting parameters on reducing the misclassification between two classes of the input points. An interactive procedure will be added to identify the best compromise solution from the generated efficient solutions.
A BI-OBJECTIVE MODEL FOR SVM WITH AN INTERACTIVE PROCEDURE TO IDENTIFY THE BE...ijaia
A support vector machine (SVM) learns the decision surface from two different classes of the input points, there are misclassifications in some of the input points in several applications. In this paper a bi-objective quadratic programming model is utilized and different feature quality measures are optimized simultaneously using the weighting method for solving our bi-objective quadratic programming problem. An important contribution will be added for the proposed bi-objective quadratic programming model by getting different efficient support vectors due to changing the weighting values. The numerical examples, give evidence of the effectiveness of the weighting parameters on reducing the misclassification between two classes of the input points. An interactive procedure will be added to identify the best compromise solution from the generated efficient solutions.
This document provides a tutorial on support vector machines (SVM). It begins with an abstract briefly introducing SVM and discussing sources used to compile the tutorial. The introduction defines machine learning and SVM, noting SVM was introduced in 1992 and can be used for classification and regression. It assumes familiarity with linear algebra, analysis, neural networks, and artificial intelligence. The tutorial then discusses statistical learning theory, learning and generalization, and introduces SVM by explaining why it was developed due to some limitations of neural networks for certain tasks. It presents illustrations of data classification and the maximum margin classifier concept in SVM.
The document provides a tutorial on support vector machines (SVM). It begins with an abstract briefly introducing SVM and discussing how the tutorial was compiled from various sources. It then provides an introduction on machine learning and how SVM relates. The core concepts of SVM are explained, including statistical learning theory, maximizing margins, soft-margin classifiers, and the kernel trick. Common kernel functions for SVM are also listed. The tutorial is intended to give a brief overview of SVM for readers familiar with linear algebra, analysis, neural networks, and artificial intelligence concepts.
IRJET- Evaluation of Classification Algorithms with Solutions to Class Imbala...IRJET Journal
This document discusses evaluating various classification algorithms to address class imbalance problems using the bank marketing dataset in WEKA. It first introduces data mining and classification algorithms like decision trees, naive Bayes, neural networks, support vector machines, logistic regression and random forests. It then discusses the class imbalance problem that occurs when one class is underrepresented. To address this, it explores sampling techniques like random under-sampling of the majority class, random over-sampling of the minority class, and SMOTE. It uses these techniques on the bank marketing dataset to evaluate the algorithms based on metrics like precision, recall, F1-score, ROC and AUCPR for the minority class.
Data Science - Part IX - Support Vector MachineDerek Kane
This lecture provides an overview of Support Vector Machines in a more relatable and accessible manner. We will go through some methods of calibration and diagnostics of SVM and then apply the technique to accurately detect breast cancer within a dataset.
Este documento analiza el modelo de negocio de YouTube. Explica que YouTube y otros sitios de video online representan un nuevo modelo de negocio para contenidos audiovisuales debido al cambio en los hábitos de consumo causado por las nuevas tecnologías. Describe cómo YouTube aprovecha la participación de los usuarios para mejorar continuamente y atraer una audiencia diferente a la de los medios tradicionales.
The defense was successful in portraying Michael Jackson favorably to the jury in several ways:
1) They dressed Jackson in ornate costumes that conveyed images of purity, innocence, and humility.
2) Jackson was shown entering the courtroom as if on a red carpet, emphasizing his celebrity status.
3) Jackson appeared vulnerable, childlike, and in declining health during the trial, eliciting sympathy from jurors.
4) Defense attorney Tom Mesereau effectively presented a coherent narrative of Jackson as a victim and portrayed Neverland as a place of refuge, undermining the prosecution's arguments.
Michael Jackson was born in 1958 in Gary, Indiana and rose to fame in the 1960s as the lead singer of The Jackson 5, topping music charts in the 1970s. As a solo artist in the 1980s, his album Thriller broke music records. In the 1990s and 2000s, Jackson faced several legal issues related to child abuse allegations while continuing to release music. He married Lisa Marie Presley and Debbie Rowe and had two children before his death in 2009.
Popular Reading Last Updated April 1, 2010 Adams, Lorraine The ...butest
This document appears to be a list of popular books from various authors. It includes over 150 book titles across many genres such as fiction, non-fiction, memoirs, and novels. The books cover a wide range of topics from politics to cooking to autobiographies.
The prosecution lost the Michael Jackson trial due to several key mistakes and weaknesses in their case:
1) The lead prosecutor, Thomas Sneddon, was too personally invested in the case against Jackson, having pursued him for over a decade without success.
2) Sneddon's opening statement was disorganized and weak, failing to effectively outline the prosecution's case.
3) The accuser's mother was not credible and damaged the prosecution's case through her erratic testimony, history of lies and con artist behavior.
4) Many prosecution witnesses were not credible due to prior lawsuits against Jackson, debts owed to him, or having been fired by him. Several witnesses even took the Fifth Amendment.
Here are three examples of public relations from around the world:
1. The UK government's "Be Clear on Cancer" campaign which aims to raise awareness of cancer symptoms and encourage early diagnosis.
2. Samsung's global brand marketing and sponsorship activities which aim to increase brand awareness and favorability of Samsung products worldwide.
3. The Brazilian government's efforts to improve its international image and relations with other countries through strategic communication and diplomacy.
The three most important functions of public relations are:
1. Media relations because the media is how most organizations reach their key audiences. Strong media relationships are crucial.
2. Writing, because written communication is at the core of public relations and how most information is
Michael Jackson Please Wait... provides biographical information about Michael Jackson including his birthdate, birthplace, parents, height, interests, idols, favorite foods, films, and more. It discusses his background, career highlights including influential albums like Thriller, and films he appeared in such as The Wiz and Moonwalker. The document contains photos and details about Jackson's life and illustrious music career.
The MYnstrel Free Press Volume 2: Economic Struggles, Meet Jazzbutest
The document discusses the process of manufacturing celebrity and its negative byproducts. It argues that celebrities are rarely the best in their individual pursuits like singing, dancing, etc. but become famous due to being products of a system controlled by wealthy elites. This system stifles opportunities for worthy artists and creates feudalism. The document also asserts that manufactured celebrities should not be viewed as role models due to behaviors like drug abuse and narcissism that result from the celebrity-making process.
Michael Jackson was a child star who rose to fame with the Jackson 5 in the late 1960s and early 1970s. As a solo artist in the 1970s and 1980s, he had immense commercial success with albums like Off the Wall, Thriller, and Bad, which featured hit singles and groundbreaking music videos. However, his career and public image were plagued by controversies related to allegations of child sexual abuse in the 1990s and 2000s. He continued recording and performing but faced ongoing media scrutiny into his private life until his death in 2009.
Social Networks: Twitter Facebook SL - Slide 1butest
The document discusses using social networking tools like Twitter and Facebook in K-12 education. Twitter allows students and teachers to share short updates and can be used to give parents a window into classroom activities. Facebook allows targeted advertising that could be used to promote educational activities. Both tools could help facilitate communication between schools and communities if used properly while managing privacy and security concerns.
Facebook has over 300 million active users who log on daily, and allows brands to create public profile pages to interact with users. Pages are for brands and organizations only, while groups can be made by any user about any topic. Pages do not show admin names and have no limits on fans, while groups display admin names and are limited to 5,000 members. Content on pages should aim to provoke action from subscribers and establish a regular posting schedule using a conversational tone.
Executive Summary Hare Chevrolet is a General Motors dealership ...butest
Hare Chevrolet is a car dealership located in Noblesville, Indiana that has successfully used social media platforms like Twitter, Facebook, and YouTube to create a positive brand image. They invest significant time interacting directly with customers online to foster a sense of community rather than overtly advertising. As a result, Hare Chevrolet has built a large, engaged audience on social media and serves as a model for how brands can use online presences strategically.
Welcome to the Dougherty County Public Library's Facebook and ...butest
This document provides instructions for signing up for Facebook and Twitter accounts. It outlines the sign up process for both platforms, including filling out forms with name, email, password and other details. It describes how the platforms will then search for friends and suggest people to connect with. It also explains how to search for and follow the Dougherty County Public Library page on both Facebook and Twitter once signed up. The document concludes by thanking participants and providing a contact for any additional questions.
Paragon Software announces the release of Paragon NTFS for Mac OS X 8.0, which provides full read and write access to NTFS partitions on Macs. It is the fastest NTFS driver on the market, achieving speeds comparable to native Mac file systems. Paragon NTFS for Mac 8.0 fully supports the latest Mac OS X Snow Leopard operating system in 64-bit mode and allows easy transfer of files between Windows and Mac partitions without additional hardware or software.
This document provides compatibility information for Olympus digital products used with Macintosh OS X. It lists various digital cameras, photo printers, voice recorders, and accessories along with their connection type and any notes on compatibility. Some products require booting into OS 9.1 for software compatibility or do not support devices that need a serial port. Drivers and software are available for download from Olympus and other websites for many products to enable use with OS X.
To use printers managed by the university's Information Technology Services (ITS), students and faculty must install the ITS Remote Printing software on their Mac OS X computer. This allows them to add network printers, log in with their ITS account credentials, and print documents while being charged per page to funds in their pre-paid ITS account. The document provides step-by-step instructions for installing the software, adding a network printer, and printing to that printer from any internet connection on or off campus. It also explains the pay-in-advance printing payment system and how to check printing charges.
The document provides an overview of the Mac OS X user interface for beginners, including descriptions of the desktop, login screen, desktop elements like the dock and hard disk, and how to perform common tasks like opening files and folders. It also addresses frequently asked questions for Windows users switching to Mac OS X, such as where documents are stored, how to save or find documents, and what the equivalent of the C: drive is in Mac OS X. The document concludes with sections on file management tasks like creating and deleting folders, organizing files within applications, using Spotlight search, and an overview of the Dashboard feature.
This document provides a checklist for securing Mac OS X version 10.5, focusing on hardening the operating system, securing user accounts and administrator accounts, enabling file encryption and permissions, implementing intrusion detection, and maintaining password security. It describes the Unix infrastructure and security framework that Mac OS X is built on, leveraging open source software and following the Common Data Security Architecture model. The checklist can be used to audit a system or harden it against security threats.
This document summarizes a course on web design that was piloted in the summer of 2003. The course was a 3 credit course that met 4 times a week for lectures and labs. It covered topics such as XHTML, CSS, JavaScript, Photoshop, and building a basic website. 18 students from various majors enrolled. Student and instructor evaluations found the course to be very successful overall, though some improvements were suggested like ensuring proper software and pairing programming/non-programming students. The document also discusses implications of incorporating web design material into existing computer science curriculums.
Vibration-Based Damage Detection
Using Unsupervised Support Vector Machine
Ching-Huei Tsou1 and John R. Williams2
{tsou, jrw}@mit.edu
Abstract:
Vibration-based damage detection methods can be used to identify hidden damage in structural components. The traditional modal-based system identification paradigm requires a detailed model of the structure, such as a finite element model. This paper describes a novel statistical damage detection approach based on a support vector machine methodology. The proposed approach is computationally efficient even when the number of features is large, and it does not suffer from the local minima problem encountered by artificial neural networks. We build the statistical model through unsupervised learning, avoiding the need for measurements from the damaged structure, which are unrealistic to obtain in many real-world problems. Extracting significant features from raw vibration time series data is crucial to the efficiency and scalability of statistics-based methods; a feature selection algorithm is therefore presented along with the construction of our statistical model. Numerical simulations, including the ASCE benchmark problem, are analyzed to examine the accuracy and scalability of our approach. We show that the proposed approach is able to detect both the occurrence and the location of damage, and that our feature selection scheme can effectively reduce the required dimensionality while retaining high accuracy.
1 Graduate Student, Intelligent Engineering Systems Laboratory (IESL), Department of Civil and Environmental Engineering, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
2 Associate Professor, Director of IESL, Department of Civil and Environmental Engineering and Engineering Systems Division, Massachusetts Institute of Technology, Cambridge, MA 02139, USA.
1. Introduction
The process of implementing a damage detection strategy is referred to as structural health monitoring (SHM) and can be categorized into five stages [1]: (1) detecting the existence of damage, (2) locating the damage, (3) identifying the type of damage, (4) determining the severity of the damage, and (5) predicting the remaining service life of the structure. Research has been conducted in this field over the past decade, and detailed literature reviews of vibration-based damage detection methods can be found in [2-4]. The basic reasoning behind all vibration-based damage detection is that the stiffness, mass, or energy dissipation behavior of the structure changes significantly when damage occurs. These changes can be detected by monitoring the dynamic response of the system. Compared to other nondestructive damage detection (NDD) techniques, such as ultrasonic scanning, acoustic emission, and x-ray inspection, the vibration-based method has the advantage of providing a global evaluation of the state of the structure.
Traditional vibration-based damage identification applications rely on detailed finite element models of the undamaged structures, and damage diagnosis is made by comparing the modal responses, such as frequencies and mode shapes, of the model and the potentially damaged structure. These system identification approaches have been shown to be very accurate provided that the models can produce robust and reliable modal estimates and that large amounts of high-quality data are available. These two requirements, however, cannot always be met in the field.
To overcome these difficulties, pattern recognition based approaches have been proposed [5-8]. Instead of building models from the physical properties of the structures, these methods construct statistical models from the vibration response data directly. This reduces the complexity of the modeling process, at the cost of losing the physical meaning of the model. These methods have been shown to be accurate in damage detection and are less sensitive to data quality; however, some problems still remain. For example, methods that use autoregressive models (AR/ARX) [5] may not fit the vibration data well because they give only linear approximations. More complex statistical models are less efficient and offer little control over the generalization bound, i.e., they may fit the historical data perfectly but provide no guarantee on future data. Methods based on artificial neural networks (ANN) [8] often suffer from the local minima problem, cannot be trained efficiently, and do not scale well to large problems. Methods using support vector machines (SVM) have also been proposed [7, 9], with the SVM used to perform supervised, binary classification.
In this paper, we propose using one-class SVM and support vector regression (SVR) to perform unsupervised learning. This does not require training samples from the damaged structure, which are usually unavailable in practice. Training an SVM is mathematically equivalent to solving a convex quadratic programming (QP) problem, which has no local minima; the absence of local minima means it can be trained faster than an ANN. Finally, an SVM with a linear kernel is used to reduce the number of features in our model. This leads to a statistical model that is efficient, accurate, and easy to implement. Numerical simulations are provided to examine the performance and accuracy of this approach.
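As a minimal sketch of the unsupervised scheme (illustrative only, not the authors' implementation), the following code trains scikit-learn's `OneClassSVM` on hypothetical vibration features drawn from the undamaged state alone, then flags measurements whose statistics deviate from that state. The feature values here are synthetic placeholders.

```python
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
# Hypothetical vibration features; only the undamaged state is used for training
train_undamaged = rng.normal(0.0, 1.0, size=(200, 4))

# New measurements: some from the undamaged state, some with shifted statistics
test_undamaged = rng.normal(0.0, 1.0, size=(20, 4))
test_damaged = rng.normal(4.0, 1.0, size=(20, 4))

# One-class SVM learns the support of the undamaged feature distribution
detector = OneClassSVM(kernel="rbf", gamma=0.1, nu=0.05).fit(train_undamaged)

# predict() returns +1 for inliers (consistent with training data), -1 for outliers
flags_ok = detector.predict(test_undamaged)
flags_damaged = detector.predict(test_damaged)
```

With `nu=0.05`, roughly 5% of the training data may be treated as outliers; samples whose features shift far from the learned support are flagged as potential damage.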
2. Theory of SVM and Its Application in Damage Detection
We propose an SVM-based approach in this paper because of its theoretical advantages over other learning algorithms. SVM has been applied in various pattern recognition fields, and it is not new to introduce SVM into SHM. Nevertheless, SVM itself has evolved considerably over the past few years, and these developments shed new light on its applications in SHM. In this section, we first review the motivation and algorithm of SVM. We then introduce two extensions of SVM that are able to perform unsupervised learning and show how they can be applied in the damage detection scheme.
2.1 Support Vector Machine
The support vector machine was developed by Vapnik et al. [10] based on the structural risk minimization (SRM) principle from statistical learning theory, rather than the empirical risk minimization (ERM) principle used by most other learning algorithms (risk means test error in this context). This fundamental difference allows SVM to select, from a family of functions, the classifier that not only fits the training data well but also provides a bounded generalization error, i.e., better prediction power [11]. Together with kernel techniques, SVM has shown superior performance in both speed and accuracy, and it has outperformed artificial neural networks (ANN) in a wide variety of applications [12]. We introduce the algorithm by discussing the simplest case, a linear classifier trained on separable binary data. Assume we have $l$ training examples,
$$\{\mathbf{x}_i, y_i\}, \quad i = 1, \ldots, l, \quad \text{where } y_i \in \{-1, 1\},\ \mathbf{x}_i \in \mathbb{R}^n$$
The $\mathbf{x}_i$ are often referred to as patterns or inputs, and the $y_i$ are called the labels or outputs of the examples. A linear classifier (a hyperplane in $\mathbb{R}^n$) can be defined as
$$f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = 0$$
where $\mathbf{w}_1$ is a vector normal to the hyperplane and $w_0$ is a scalar constant. We can also define two auxiliary hyperplanes by $f(\mathbf{x}_i) = \mathbf{x}_i^T \mathbf{w}_1 + w_0 = \pm 1$. It is easy to show that each of these two parallel hyperplanes has a perpendicular distance to the original hyperplane equal to $1/\|\mathbf{w}_1\|$. This distance is often referred to as the "margin". Because the data is separable, we can always find hyperplanes that separate the training samples perfectly. The solution is obviously not unique, and the SVM algorithm looks for the one that gives the maximum margin. The optimization problem for the above process can be expressed as
$$\text{minimize:} \quad \frac{1}{2}\|\mathbf{w}_1\|^2$$
$$\text{subject to:} \quad y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0) \geq 1$$
To extend to inseparable data, we can introduce slack variables $\xi_i$ to relax the constraints and then add a penalty on the relaxation. The new optimization problem becomes
$$\text{minimize:} \quad \frac{1}{2}\|\mathbf{w}_1\|^2 + C\sum_{i=1}^{l} \xi_i$$
$$\text{subject to:} \quad y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0) \geq 1 - \xi_i \quad \text{and} \quad \xi_i \geq 0$$
Note that at the optimum $\xi_i = \max\left\{1 - y_i(\mathbf{x}_i^T \mathbf{w}_1 + w_0),\ 0\right\}$, so the penalty term is the hinge loss on the training examples.
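The role of the penalty constant $C$ can be observed numerically. In the sketch below (an illustration on synthetic overlapping classes, not from the paper), a small $C$ tolerates many margin violations, so a wide margin is chosen and many training points fall inside it, each becoming a support vector.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
# Two heavily overlapping Gaussian classes (linearly inseparable)
X = np.vstack([rng.normal(-1.0, 1.5, (100, 2)), rng.normal(1.0, 1.5, (100, 2))])
y = np.array([-1] * 100 + [1] * 100)

n_sv = {}
for C in (0.01, 100.0):
    clf = SVC(kernel="linear", C=C).fit(X, y)
    n_sv[C] = int(clf.n_support_.sum())
# Small C -> wide margin -> many margin violations -> many support vectors
```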
where $C$ is a constant determining the trade-off between our two conflicting goals: maximizing the margin and minimizing the training error. For computational convenience, we can further transform the optimization problem into its dual form using Lagrange multipliers, denoted by $\alpha_i$, and the result becomes
$$\text{maximize:} \quad \sum_{i=1}^{l} \alpha_i - \frac{1}{2}\sum_{i=1}^{l}\sum_{j=1}^{l} \alpha_i \alpha_j y_i y_j \mathbf{x}_i^T \mathbf{x}_j$$
$$\text{subject to:} \quad \sum_{i=1}^{l} \alpha_i y_i = 0 \quad \text{and} \quad 0 \leq \alpha_i \leq C$$
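The dual constraints can be checked on a fitted model. In scikit-learn (an illustrative sketch, not the authors' code), the `dual_coef_` attribute of `SVC` stores the products $\alpha_i y_i$ for the support vectors, so the equality constraint $\sum_i \alpha_i y_i = 0$ appears as a zero sum, and the box constraint $0 \leq \alpha_i \leq C$ bounds the entries in absolute value by $C$.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2.0, 1.0, (50, 2)), rng.normal(2.0, 1.0, (50, 2))])
y = np.array([-1] * 50 + [1] * 50)

C = 1.0
clf = SVC(kernel="linear", C=C).fit(X, y)

# dual_coef_ holds alpha_i * y_i for each support vector
alpha_y = clf.dual_coef_.ravel()
sum_constraint = abs(alpha_y.sum())    # equality constraint: should be ~0
box_constraint = np.abs(alpha_y).max() # box constraint: should not exceed C
```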
For every constraint that is not met as a strict equality at the optimum, the corresponding α_i must be zero; this follows from the Karush–Kuhn–Tucker (KKT) conditions of optimization theory. Examples with non-zero α_i are called support vectors, and the classifier is determined by the support vectors alone,
f(x_i) = x_i^T w_1 + w_0 = ∑_{j=1}^{N_SV} α_j y_j x_j^T x_i + w_0
where N SV denotes the total number of support vectors.
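As a concrete illustration, the support-vector expansion above can be evaluated directly. The following sketch (function and variable names are ours, not from the paper) computes f(x) = ∑_j α_j y_j x_j^T x + w_0 for a linear kernel:

```python
import numpy as np

def svm_decision(x, support_vectors, alphas, labels, w0):
    """Evaluate f(x) = sum_j alpha_j * y_j * x_j^T x + w0.

    Only the support vectors (examples with non-zero alpha_j) appear
    in the sum; all other training examples have dropped out.
    """
    x = np.asarray(x, dtype=float)
    sv = np.asarray(support_vectors, dtype=float)
    coef = np.asarray(alphas, dtype=float) * np.asarray(labels, dtype=float)
    return float(coef @ (sv @ x) + w0)

# Toy example: two support vectors straddling the decision boundary x1 = 0.
sv = [[1.0, 0.0], [-1.0, 0.0]]
alphas, labels, w0 = [1.0, 1.0], [1, -1], 0.0
print(svm_decision([2.0, 0.0], sv, alphas, labels, w0))   # positive side
print(svm_decision([-2.0, 0.0], sv, alphas, labels, w0))  # negative side
```

The sign of f(x) gives the predicted class, and |f(x)| grows with the distance from the separating hyperplane.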
To extend the algorithm from linear to nonlinear, we define a mapping φ : R^n → H which maps x_i from its original Euclidean space to a reproducing kernel Hilbert space (RKHS). The original space is often referred to as the sample space, and the RKHS is called the feature space. Without losing generality in our context, we can simply think of an RKHS as a generalization of Euclidean space that may have infinitely many dimensions. By replacing x_i in the optimization problem with φ(x_i) and performing linear classification in the corresponding RKHS, the solution becomes,
f(x_i) = φ(x_i)^T w_1 + w_0
= ∑_{j=1}^{N_SV} α_j y_j φ(x_i)^T φ(x_j) + w_0 = ∑_{j=1}^{N_SV} α_j y_j K(x_i, x_j) + w_0
The mapping φ is often called the feature map, and the dot product φ(x_i)^T φ(x_j) = K(x_i, x_j)
is known as the kernel. Popular choices include the linear kernel, the polynomial kernel, and the radial basis function (RBF) kernel. When a nonlinear kernel is used, the discriminant function above is no longer linear in the original Euclidean space.
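For reference, one common parameterization of the RBF kernel can be sketched as follows (the function name and the exp(−d²/2σ²) convention are our choices; a gamma-factor parameterization is equally common):

```python
import numpy as np

def rbf_kernel(xi, xj, sigma2=20.0):
    """RBF kernel K(x_i, x_j) = exp(-||x_i - x_j||^2 / (2 * sigma^2)).

    sigma2 plays the role of the sigma^2 parameter used in the
    numerical studies later in the paper.
    """
    d2 = float(np.sum((np.asarray(xi, dtype=float) - np.asarray(xj, dtype=float)) ** 2))
    return float(np.exp(-d2 / (2.0 * sigma2)))

print(rbf_kernel([1.0, 2.0], [1.0, 2.0]))  # identical points give 1.0
```

Note the kernel is symmetric and decays toward zero as the two points move apart, which is why the RBF kernel is translation invariant.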
Because solving an SVM corresponds to solving a convex QP problem, it has no local minima and can be trained faster than algorithms that do, such as ANNs. It can be shown that N_SV << l for easier problems, i.e., problems with small generalization errors. This makes the solution sparse, which means the optimization problem can be solved efficiently. Also, through structural risk minimization (SRM) and the VC dimension [10], SVM provides a bounded generalization error and a systematic way to select the complexity of the solution function, which effectively controls the problem of overfitting. Detailed discussion of these properties is beyond the scope of this paper and can be found in many recent statistical learning textbooks [13, 14].
2.2 One-Class Support Vector Machine
SVM is originally a supervised, batch learning algorithm, and has been applied in the SHM field [7, 9] to perform binary classification tasks. A major challenge is that data measured from damaged structures are often not available in practical situations, so unsupervised learning methods are more desirable [15]. Similar needs also occur in other domains, and researchers in the machine learning and pattern recognition communities have extended the idea of SVM to unsupervised learning, often referred to as one-class SVM [16].
Instead of finding a hyperplane that maximizes the margin between two classes in the RKHS, one-class SVM maximizes the distance from the hyperplane to the origin. The corresponding optimization problem becomes,
minimize: (1/2)||w_1||^2 + (1/(νl)) ∑_i ξ_i − ρ
subject to: φ(x_i)^T w_1 ≥ ρ − ξ_i and ξ_i ≥ 0
where ν ∈ (0, 1] is a parameter similar to the C introduced earlier, and ρ is an offset that is calculated automatically during the optimization. If a translation-invariant kernel is used (e.g., the RBF kernel), the goal of one-class SVM can also be thought of as finding a small sphere that contains most of the training samples.
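To give a feel for the novelty-detection workflow, the sketch below is a deliberately simplified Parzen-style stand-in, not the one-class SVM QP itself: it scores a point by its mean RBF similarity to the training set and sets the outlier threshold so that roughly a fraction ν of the training samples falls below it. All names and the scoring rule are our own illustrative assumptions.

```python
import numpy as np

def fit_novelty_detector(X_train, sigma2=20.0, nu=0.1):
    """Simplified stand-in for one-class SVM (illustrative, not the QP).

    A point is scored by its mean RBF similarity to the training set
    (a Parzen-style density proxy); the outlier threshold is the
    nu-quantile of the training scores, so ~nu of the training data
    would be flagged, mimicking the role of the nu parameter.
    """
    X_train = np.asarray(X_train, dtype=float)

    def score(x):
        d2 = np.sum((X_train - np.asarray(x, dtype=float)) ** 2, axis=1)
        return float(np.mean(np.exp(-d2 / (2.0 * sigma2))))

    train_scores = np.array([score(x) for x in X_train])
    threshold = float(np.quantile(train_scores, nu))
    return score, threshold

# Points clustered around the origin; a far-away point should be an outlier.
score, thr = fit_novelty_detector([[1, 0], [-1, 0], [0, 1], [0, -1]])
print(score([0.0, 0.0]) > thr)    # inlier near the cluster center
print(score([10.0, 10.0]) > thr)  # outlier far from the cluster
```

In the actual one-class SVM the score is the signed distance φ(x)^T w_1 − ρ obtained from the dual solution, but the inlier/outlier decision rule has the same shape.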
2.3 Support Vector Regression
SVM was first developed for classification, and the labels y_i represent a finite number of possible categories. The algorithm can be extended to estimate real-valued functions by allowing y_i to take real values and defining a suitable loss function [17]. The following loss function,
|f(x_i) − y_i|_ε = max{|f(x_i) − y_i| − ε, 0}
known as the ε-insensitive loss function, pays no penalty for points within the ε range, and this carries the sparseness property over from SVM to SVR. Again, the estimated function can be expressed in the kernel expansion form above, and the goal now is to minimize,
(1/2)||w_1||^2 + C ∑_{i=1}^{l} max{|f(x_i) − y_i| − ε, 0}
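The ε-insensitive loss is simple enough to state in a line of code (a sketch; the function name is ours):

```python
def eps_insensitive_loss(pred, y, eps=0.1):
    """max(|f(x_i) - y_i| - eps, 0): zero penalty inside the eps-tube."""
    return max(abs(pred - y) - eps, 0.0)

print(eps_insensitive_loss(1.05, 1.0))  # inside the tube -> 0.0
```

Points whose residual falls inside the tube contribute nothing to the objective, which is exactly why only the points on or outside the tube become support vectors.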
The basic ideas of SVM, one-class SVM and SVR are summarized in Table 1 and Figure 1. For simplicity, the discriminant functions of SVM and one-class SVM are drawn as linear functions. As mentioned, when nonlinear kernels are used, the functions are by no means linear in the sample space.
                Maximize                                     Penalty
SVM             distance between two hyperplanes             misclassified samples and samples within the margin
One-class SVM   distance between the hyperplane and origin   misclassified samples
SVR             smoothness of the function                   samples outside the ε-tube
Table 1. Comparison of SVM, one-class SVM and SVR
Figure 1. Geometric interpretation of SVM, one-class SVM and SVR in 2D
2.4 Damage Detection Using SVM
Vibration-based damage detection approaches are grounded in the assumption that the dynamic response of the system changes significantly when damage occurs. We propose using SVM for the detection, either through novelty detection or through regression; before the data can be fed into SVM, it is essential to have a reasonable representation of the dynamic response.
A time series is usually modeled by splitting it into a series of windows, with the value at each time point determined by a set of its previous values, i.e.,
x_t = f(x_{t−τ}, x_{t−2τ}, …, x_{t−mτ})
where m and τ are referred to as the embedding dimension and delay time [18], respectively. Through this representation, an acceleration response series can be transformed into a data set of fixed-length vectors and used by SVM. Damage detection is conducted by examining the similarity and dissimilarity among data collected from different structural states. The detailed analysis procedure is given in the numerical studies section.
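A minimal sketch of this delay embedding (function and variable names are ours): each row of X collects the m delayed values used to predict the sample x_t.

```python
import numpy as np

def delay_embed(series, m, tau):
    """Build patterns [x_{t-tau}, ..., x_{t-m*tau}] with target x_t.

    m is the embedding dimension and tau the delay time; each row of
    X is one fixed-length pattern, and y holds the value it predicts.
    """
    s = np.asarray(series, dtype=float)
    X, y = [], []
    for t in range(m * tau, len(s)):
        X.append([s[t - k * tau] for k in range(1, m + 1)])
        y.append(s[t])
    return np.array(X), np.array(y)

X, y = delay_embed([0, 1, 2, 3, 4, 5], m=2, tau=1)
print(X.shape)  # (4, 2): four patterns of length 2
```

For classification the rows of X are used directly as patterns; for SVR the pairs (X, y) define the regression problem.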
3. Feature Selection
As mentioned, the statistical approximation approach is an attractive alternative to approaches based on high-order physical models: it is computationally competitive, less sensitive to modeling error and data quality, and requires only measured signals to build the model. SVM is among the fastest algorithms in statistical learning; however, for large-scale problems the SVM algorithm is still slow, and further reducing the computational complexity is necessary.
3.1 The Motivation of Feature Selection in the Proposed Approach
Although in the dual form of SVM we face a QP problem whose computational complexity is approximately proportional to the square of the number of training examples l, not to the number of features, reducing the number of features nevertheless helps to improve performance. For example, dot products between feature vectors are required frequently when evaluating a kernel function, and this is time-consuming when the number of features is large. When implementing an SVM solver, these results are often cached to improve performance, which raises the issue of memory consumption. Besides, field data is often polluted by noise and redundant information, and feature selection provides a way of identifying and eliminating them from the feature set. This not only improves computational efficiency but also increases accuracy.
3.2 Feature Selection using SVM
By looking at the solution of the primal form of SVM, we can see that each component of w_1 can be thought of as the weight of its corresponding feature φ(x_i) in the RKHS. Feature reduction is done by removing features with zero weight from the set.
The primal SVM optimization problem minimizes ||w_1||^2 while obeying all the constraints, which forces the value of each component w_i to be small but does not set it to zero, because the derivative of ||w_1||^2 near w_i ≈ 0 is small. We could replace ||w_1||^2 by the L1 norm ||w_1||_1 in the objective to promote exact zeros, but this would prevent us from transforming the SVM into its dual form and we would lose all the advantages. The simplest way around this is to set a threshold on |w_i| and remove the features whose associated weights are smaller than the threshold.
Using the time series model above, the target of feature reduction in our damage detection approach is to reduce the embedding dimension, i.e., the length of the patterns in the sample space. Because we work directly in the sample space, no feature mapping is needed, and a linear kernel is suitable for this scenario because of its efficiency. Note that the choice of kernel in feature selection is independent of the choice of kernel in the classification or regression stage. When a linear kernel is used, the features in the RKHS are the patterns in the sample space themselves. This is why we use the conventional term “feature selection” throughout this paper, although we are actually doing “pattern selection”.
The value of w_1 is not calculated when solving the SVM, because only its dot products are required, and these can be obtained more efficiently by evaluating the kernel function. When doing feature selection, we need to calculate w_1 explicitly. The following relation is obtained while deriving the dual form of the SVM with a linear kernel,
w_1 = ∑_{i=1}^{N_SV} α_i y_i x_i
which can be used to determine w_1 once the corresponding SVM is solved.
This feature selection approach allows us to reduce the number of features while keeping the accuracy, as our numerical studies show.
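The relation w_1 = ∑ α_i y_i x_i and the subsequent thresholding can be sketched as follows (function names are ours, and the threshold value is problem-dependent):

```python
import numpy as np

def linear_weights(alphas, labels, X_sv):
    """w_1 = sum_i alpha_i * y_i * x_i for a linear kernel."""
    coef = np.asarray(alphas, dtype=float) * np.asarray(labels, dtype=float)
    return coef @ np.asarray(X_sv, dtype=float)

def select_features(w, threshold):
    """Keep the indices of features whose |w_i| exceeds the threshold."""
    return [i for i, wi in enumerate(np.abs(w)) if wi > threshold]

# Two support vectors in a 3-feature space; only the first feature
# ends up carrying weight after the signed sum.
w = linear_weights([1.0, 1.0], [1, -1], [[2.0, 0.0, 1.0], [1.0, 0.0, 1.0]])
print(select_features(w, 0.5))
```

Re-training on the selected indices then gives the reduced model whose accuracy is compared against the full model in the numerical studies.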
4. Numerical Studies
In this section, we demonstrate the proposed approach using a simple 2-story shear building and
the ASCE benchmark problem [19]. In all examples, acceleration responses are first normalized
using,
a_i = (a_i − μ_a) / σ_a
where µa and σ a are the sample mean and sample standard deviation of the acceleration signal,
respectively. Also, in all examples, “acceleration response” means the relative acceleration response between two adjacent floors, i.e., the acceleration difference between the current floor and the one below. By doing this, we do not need to deal with the scale and units of the loading, and we can better isolate the effect of damage in each story. The values of SVM-related parameters, such as C, ν, ε and the σ in the RBF kernel, are selected based on common practice in pattern recognition and are specified in each example. In general, we obtain similar results as long as those parameters are within a reasonable range.
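The normalization and the relative (inter-story) acceleration are straightforward to implement; a sketch with our own function names:

```python
import numpy as np

def normalize(a):
    """Subtract the sample mean and divide by the sample standard deviation."""
    a = np.asarray(a, dtype=float)
    return (a - a.mean()) / a.std()

def relative_acceleration(floor, floor_below):
    """Acceleration difference between a floor and the one directly below."""
    return np.asarray(floor, dtype=float) - np.asarray(floor_below, dtype=float)

z = normalize([1.0, 2.0, 3.0, 4.0])
print(z)  # zero-mean, unit-variance version of the signal
```

After normalization the signal is dimensionless with zero mean and unit variance, so responses recorded under different loading scales become directly comparable.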
4.1 Two-Story Shear Building
We start with the simple 2-story shear building shown in Figure 2. Damage is modeled by reducing the stiffness of a column. Vibration data are collected through accelerometers attached under each floor. Three different SVM-based approaches are used for damage detection, namely, (1) supervised SVM, (2) one-class SVM, and (3) support vector regression.
Figure 2. Plane steel frame under transverse seismic load (EI = 6.379 N·m² for all columns; story heights 0.5 m, width 0.2 m; sensor locations A and B)
4.1.1 Damage Detection Using Supervised Support Vector Machine
Figure 3 shows the acceleration response of the structure under the 1940 El Centro earthquake load; each time series corresponds to a different structural state. Damage in a floor is modeled by reducing the stiffness of one of the columns in that floor by 50%.
Figure 3. Acceleration measurements from accelerometer A (1F, El Centro), for the undamaged, 1F-damaged, and 2F-damaged states
The vibration data are recorded at locations A and B with a sampling rate of 50 Hz. Therefore, with a 2-second window, we can extract 100 patterns from the time series for each example. Knowing the patterns and their corresponding labels (undamaged, 1st floor damaged, or 2nd floor damaged), we can feed these data into a support vector machine (using C = 100 and an RBF kernel with σ² = 20). The results of a 5-fold cross validation are shown in Table 1. We can see that SVM is able to detect the occurrence as well as the location of the damage with very high accuracy, provided the patterns are long enough. The trial-and-error way of selecting patterns here will be replaced by our feature selection algorithm in section 4.2.2.
# of Patterns CV 1 CV 2 CV 3 CV 4 CV 5 Average
100 97 / 120 89 / 120 90 / 120 87 / 120 78 / 120 76.2%
150 111 / 120 111 / 120 119 / 120 117 / 120 116 / 120 95.7%
200 120 / 120 120 / 120 120 / 120 120 / 120 120 / 120 100%
Table 1. Cross Validation Results (3 structural states; El Centro)
In the next example, the same structure is excited using two different seismic loads. For each load, acceleration responses in two different structural states (undamaged / 1st floor damaged) are recorded, as shown in Figure 4.
Figure 4. Acceleration measurements from accelerometer A (El Centro and Kobe; undamaged and 1F-damaged states)
The purpose here is to detect damage in the structure regardless of the excitation that accompanies it. We mix the acceleration responses measured from the structure under the different excitations and train SVM with 2 classes (damaged or undamaged) instead of 4. The cross validation results are shown in Table 2. Even with mixed excitations, SVM can still achieve accurate detection. When we group the training samples with the same structural state together, we implicitly indicate that the excitation is not a feature we care about, so SVM focuses on maximizing the differences caused by changes in the structure.
# of Patterns CV 1 CV 2 CV 3 CV 4 CV 5 Average
200 145 / 160 151 / 160 146 / 160 144 / 160 147 / 160 91.6%
Table 2. Cross Validation Results (2 structural states; El Centro & Kobe)
4.1.2 Damage Detection Using One-Class Support Vector Machine
Although supervised SVM classification is accurate and easy to implement, in practice we often do not have vibration data from the damaged structure beforehand. In this section, we apply one-class SVM to the same structure used before. Similarly, we extract features from the acceleration response by setting a window size equal to 2 seconds (100 patterns for each example), and we again use an RBF kernel with σ² = 20, and ν = 0.1. Three one-class SVM models are trained using response data measured from the undamaged structure, each with a different seismic load. Each model is then used to test response data measured from both the damaged and the undamaged structure under the 3 seismic loads. The results are shown below.
Training El Centro Golden Gate Kobe
Testing El C. G.G. Kobe El C. G.G. Kobe El C. G.G. Kobe
Undam. 14.3% 29.6% 34.9% 85.3% 16.3% 97.8% 18.9% 21.6% 17.4%
1F 89.1% 90.6% 75.0% 98.8% 97.2% 100% 86.7% 57.8% 68.8%
2F 80.9% 100% 73.4% 100% 100% 100% 75.4% 100% 66.6%
Table 3. Proportion of outliers (800 testing samples)
Figure 5. Proportion of outliers (800 testing samples); models built using El Centro, Golden Gate, and Kobe seismic loading respectively, left to right
Table 3 and Figure 5 indicate that when damage occurs in the structure, the percentage of outliers increases significantly. Note that each SVM model is trained using positive samples measured from one particular seismic load. When the model trained on the Golden Gate earthquake is applied to monitor the same structure under a different seismic load, a large portion of the signals measured from the undamaged structure are also flagged as outliers. This is due to the fact that both the external force and the structural state affect the acceleration response, and a model built on one particular loading history does not generalize well to monitoring arbitrary loading. To reduce this unwanted effect, we train SVM models using a larger database consisting of a mixture of acceleration responses measured from the undamaged structure under different seismic loads. By grouping these responses together, we implicitly tell SVM to ignore the differences caused by excitation variability. Table 4 and Figure 6 show the results of damage detection using models built on 3 data sets of different sizes (left to right, training data measured from the structure under a. Golden Gate; b. El Centro and Golden Gate; c. El Centro, Golden Gate, Corral, Hach and Hachinohe seismic loads).
Training Golden Gate 2 mixture 5 mixture
Testing El Centro Kobe El Centro Kobe El Centro Kobe
Undam. 85.3% 97.8% 9.5% 31.6% 1.0% 16.0%
1F 98.8% 100% 90.5% 75.1% 64.0% 66.6%
2F 100% 100% 79.0% 73.5% 63.8% 64.5%
Table 4. Proportion of outliers (800 testing samples) detected at location A
Figure 6. Proportion of outliers detected at location A
As shown in Figure 6, when the SVM model is trained on the mixed data set, the effect of loading variability is averaged out and the change in structural properties becomes dominant, i.e., the model is able to detect damage caused by arbitrary loads. Note that the acceleration response measured under the Kobe earthquake is never included in the training set, yet the result is still good, i.e., the model generalizes well to unseen data. Nonetheless, when damage occurs in either floor, the model detects a significant change at both sensors and fails to tell the location of the damage.
4.1.3 Damage Detection Using Regression-based Methods
Using a regression-based novelty detection approach for damage detection has been suggested by Los Alamos National Laboratory (LANL) [5] and followed by others with minor modifications [6, 20]. The concept of this two-step approach is as follows: for each structure, a “reference database” is created recording the acceleration responses of the undamaged structure perturbed by many different excitations. When a new acceleration response a_TBD(t) is measured from a structure whose current state is to be determined, the first step is to select the acceleration response a_un(t) from the predefined database that is closest to the current measurement. This step is referred to as “data normalization”. The second step is to fit a_un(t) using an auto-regressive model with exogenous inputs (ARX), and to use the ARX model to predict a_TBD(t). Denoting the training error between the ARX model and a_un(t) at time t as ε_un(t), and the prediction error between the ARX model and a_TBD(t) at time t as ε_TBD(t), the ratio of the standard deviations of the two errors is defined as the damage-sensitive feature,
h = σ(ε_TBD) / σ(ε_un)
and an empirical threshold limit is used to indicate the occurrence of damage.
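Computing the damage-sensitive feature from the two error series is a one-liner (a sketch; the function name is ours):

```python
import numpy as np

def damage_feature(err_tbd, err_un):
    """h = std(eps_TBD) / std(eps_un); values well above 1 suggest damage."""
    return float(np.std(err_tbd) / np.std(err_un))

# If the prediction errors have twice the spread of the training errors, h = 2.
print(damage_feature([0.0, 2.0, 0.0, 2.0], [0.0, 1.0, 0.0, 1.0]))
```

When the structure is undamaged and the model fits well, the two error series have similar spread and h stays near 1; damage inflates the prediction errors and pushes h above the chosen threshold.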
We adopt the concept of the damage-sensitive indicator and make three modifications to the LANL approach. First, instead of selecting the closest acceleration response from the reference database and building a regression model from that one response, we build our model from all responses in the database. This simulates the worst case in the first step of the LANL approach, i.e., no similar excitation can be found in the reference database. Second, in the LANL approach, ε_un(t) is the training error of building the ARX model, and ε_TBD(t) is the prediction error when the ARX model is used to predict unseen data. To be more consistent, in our approach ε_un(t) is calculated by using our regression model to predict an arbitrary piece of unseen response data from the undamaged structure. Third, linear regression is replaced by SVR, which does not have to be linear and guarantees a bounded generalization error. Also, combined with our feature selection scheme, SVR provides a systematic way of determining the embedding dimension, a free parameter in the time series model.
The 2-story steel frame shown in Figure 2 is used in this example. Two experiments are conducted by exciting the structure using the El Centro and Golden Gate earthquakes, respectively. For each experiment, a 5-second acceleration response, measured from the structure 5 seconds after the start of excitation, is used as the training data. The response measured in the next 1 second is used as the testing data. We choose C = 100 and ε = 0.1 for the SVR, and an RBF kernel with σ = 10 is used. The damage detection results are shown in Table 5.
Seismic load El Centro Golden Gate
Location of Damage 1F 2F 1F 2F
h (location A, 1F) 2.984 1.474 2.240 1.173
h (location B, 2F) 1.207 2.554 1.244 2.344
Table 5. Damage Detection in a 2-story Frame using SVR
As expected, the SVR model built from the undamaged structure yields significantly higher prediction errors when used to predict the response of the damaged structure. When a suitable threshold limit is chosen for the damage-sensitive feature h, the proposed approach is able to indicate both the existence and the location of the damage.
4.2 ASCE Benchmark Problem
Structural health monitoring studies often apply different methods to different structures, which makes side-by-side comparison of those methods difficult. To coordinate the studies, the ASCE Task Group on Health Monitoring built a 4-story, 2-bay by 2-bay steel frame benchmark structure and provided two finite element based models, a 12DOF shear building and a more realistic 120DOF 3D model [19]. The benchmark problem is studied in the following examples.
4.2.1 Support Vector Regression
Five damage patterns are defined in the benchmark study, and we apply the SVR detection
procedure to the first two patterns: (1) all braces in the first story are removed, and (2) all braces
in both the first story and the third story are removed. Acceleration responses of these two
damage patterns are generated by using the 12DOF analytical model under ambient wind load.
The results of damage detection and localization using the damage-sensitive feature h are shown in Table 6. The training data is a mixture of 5-second acceleration responses obtained from the undamaged structure under 10 different ambient loads. For each damage pattern, two 1-second acceleration responses caused by different ambient loads (denoted as L1 and L2 in Table 6) are used as the testing data. We choose C = 100 and ε = 0.1 in the SVR, and an RBF kernel with σ = 10.
Damage pattern 1 Damage pattern 2
# of patterns 30 100 30 100
Ambient load L1 L2 L1 L2 L1 L2 L1 L2
h (1F) 2.57 2.46 1.69 1.56 2.03 2.07 1.78 1.58
h (2F) 1.74 1.07 1.32 0.88 1.48 1.11 1.28 1.09
h (3F) 1.30 1.43 1.26 1.07 2.19 1.92 1.71 1.48
h (4F) 1.30 1.23 1.02 0.89 1.20 1.11 1.08 1.12
Table 6. Damage detection and localization results for damage pattern I and II
Compared to the results given in [6] and [20], the differences in the damage-sensitive features between the damaged and undamaged structure are less pronounced, because we simulate a worst case in the data normalization step. Nevertheless, our approach successfully indicates the occurrence and the location of the damage in both damage patterns, whereas the second floor in damage pattern 2 is classified as damaged in [6].
We can see that the value of h varies when the length of the patterns is changed. Although the approach is able to distinguish the structural states from one another at both pattern lengths, a systematic way of feature selection is more desirable. We apply the feature selection scheme discussed in section 3.2 in the following example.
4.2.2 Feature Selection
We use the feature reduction scheme on both the supervised SVM and the SVR approach. Recall that in our first example in section 4.1.1, the number of features was selected via trial and error, and more than 100 features were required in order to achieve 80% accuracy. Using the relation w_1 = ∑ α_i y_i x_i, we plot the absolute values of the components of w_1 in Figure 7. It is clear that some features are more important than others, and we can understand why a long pattern was required. Table 7 shows that by selecting features based on the magnitude of w_i, we can obtain the same level of accuracy with far fewer features. Note that we use the terms feature and pattern interchangeably in this section, because we are selecting features in the input (pattern) space.
Figure 7. Absolute values of the components of the w_1 vector
              First k patterns              Selected k patterns
k             50      100     150     200   100     40
CV 1 (120)    60      97      111     120   120     104
CV 2 (120)    62      89      111     120   120     108
CV 3 (120)    68      90      119     120   120     105
CV 4 (120)    61      87      117     120   120     106
CV 5 (120)    63      78      116     120   120     110
Average       52.3%   76.2%   95.7%   100%  100%    88.8%
Table 7. Feature selection in supervised SVM damage detection
Similarly, we apply the feature selection approach to the ASCE benchmark example. The results are shown in Figure 8 and Table 8. In this case, we can see that a long pattern is not necessary: using only the first 9 features, the model generates results similar to those obtained using 100 features.
Figure 8. Distribution of the components of the w_1 vector
Damage pattern 1 Damage pattern 2
# of patterns 9 100 9 100
Ambient load L1 L2 L1 L2 L1 L2 L1 L2
h (1F) 2.16 2.18 1.69 1.56 1.90 1.85 1.78 1.58
h (2F) 1.55 1.14 1.32 0.88 1.48 1.13 1.28 1.09
h (3F) 1.35 1.30 1.26 1.07 2.08 1.74 1.71 1.48
h (4F) 1.31 0.99 1.02 0.89 1.15 1.03 1.08 1.12
Table 8. Feature selection in SVR-based damage detection
5. Conclusions
SVM has achieved remarkable success in pattern recognition and machine learning, and its continuing development also sheds new light on its applications in SHM. This paper has described two approaches that apply unsupervised SVM algorithms to vibration-based damage detection, in addition to the supervised SVM introduced earlier by other researchers. Combining SVM-based novelty detection techniques with the vibration-based damage detection approach eliminates the need for data from damaged structures. These approaches are easy to implement because only vibration responses measured from the structure are required to build the models. Numerical examples have shown that the SVR approach is able to detect both the occurrence and the location of damage. Furthermore, high-dimensional feature vectors introduce more noise and restrict the scalability of most statistical pattern recognition methods. The idea of regularization in SVM is extended to feature selection, and we show that the reduced model retains the same level of accuracy.
Acknowledgement
This research is supported by the …
References
1. Rytter, A., Vibration based inspection of Civil Engineering structures, in Department of
Building Technology and Structural Engineering. 1993, University of Aalborg: Denmark.
2. Doebling, S.W., C.R. Farrar, and M.B. Prime, A Summary Review of Vibration-Based
Damage Identification Methods. The Shock and Vibration Digest, 1998. 30(2): p. 91-105.
3. Stubbs, N., et al. A Methodology to Nondestructively Evaluate the Structural Properties
of Bridges. in Proceedings of the 17th International Modal Analysis Conference. 1999.
Kissimmee, Fla.
4. N. Haritos and J.S. Owen, The Use of Vibration Data for Damage Detection in Bridges:
A Comparison of System Identification and Pattern Recognition Approaches.
International Journal of Structural Health Monitoring, 2004.
5. Hoon Sohn and Charles R Farrar, Damage Diagnosis Using Time Series Analysis of
Vibration Signals, in Smart Materials and Structures. 2001.
6. Y. Lei, et al. An Enhanced Statistical Damage Detection Algorithm Using Time Series
Analysis. in Proceedings of the 4th International Workshop on Structural Health
Monitoring. 2003.
7. Worden, K. and A.J. Lane, Damage Identification using Support Vector Machines. Smart
Materials and Structures, 2001. 10(3): p. 540-547.
8. Yun, C.B., et al., Damage Estimation Method Using Committee of Neural Networks.
Smart Nondestructive Evaluation and Health Monitoring of Structural and Biological
Systems II. Proceedings of the SPIE, 2003. 5047: p. 263-274.
9. Ahmet Bulut, Peter Shin, and L. Yan. Real-time Nondestructive Structural Health
Monitoring using Support Vector Machines and Wavelets. in Proceedings of Knowledge
Discovery in Data and Data Mining. 2004. Seattle, WA.
10. Vladimir N. Vapnik, The Nature of Statistical Learning Theory. 1995, New York:
Springer-Verlag.
11. Christopher J.C. Burges, A Tutorial on Support Vector Machines for Pattern
Recognition. Knowledge Discovery and Data Mining, 2(2), 1998.
12. Byvatov E., et al., Comparison of support vector machine and artificial neural network
systems for drug/nondrug classification. Journal of Chemical Information and Computer
Sciences, 2003. 43(6): p. 1882-1889.
13. Bernhard Schölkopf and Alex Smola, Learning with Kernels - Support Vector Machines,
Regularization, Optimization and Beyond. 2002: MIT Press.
14. John Shawe-Taylor and Nello Cristianini, Kernel Methods for Pattern Analysis. 2004:
Cambridge University Press.
15. Michael L. Fugate, Hoon Sohn, and C.R. Farrar. Unsupervised Learning Methods for
Vibration-Based Damage Detection. in Proceedings of the 18th International Modal
Analysis Conference. 2000. San Antonio, Texas.
16. Bernhard Schölkopf, et al., Estimating the Support of a High-Dimensional Distribution.
Neural Computation, 2001. 13: p. 1443-1471.
17. Alex J. Smola and Bernhard Schölkopf, A Tutorial on Support Vector Regression, in
NeuroCOLT2 Technical Report Series. 1998.
18. Mead, W.C., et al. Prediction of Chaotic Time Series using CNLS-Net-Example: The
Mackey-Glass Equation. in Nonlinear Modeling and Forecasting. 1992: Addison
Wesley.
19. Johnson, E.A., et al. A Benchmark Problem for Structural Health Monitoring and
Damage Detection. in Proceedings of the 14th Engineering Mechanics Conference. 2000.
Austin, Texas.
20. K.K. Nair, et al. Application of time series analysis in structural damage evaluation. in
Proceedings of the International Conference on Structural Health Monitoring. 2003.
Tokyo, Japan.