Draft November 27, 2017
Metamodeling of ANSYS-Calculated
Force Redistribution caused by Bolt Failures across a Plate
Manas Gupte † Sahil Mohamad † Clarence Worrell
Introduction
A simplified ANSYS model was created of a plate with 208 uniformly spaced bolts exposed to a spatially heterogeneous loading; the model estimates the force at each bolt location in the presence of bolt failures. Figure 1 is a simplified depiction of the plate and its coordinate system.
This study explores the potential for classical statistical modeling and machine learning to develop
a metamodel approximation of the ANSYS model.
Figure 1: Simplified Illustration of Plate and Coordinate System
Sampling Strategy and Data Generation
A dataset consisting of 25,000 samples, each representing some combination of failed and intact
bolts, was generated using three strategies. First, a batch of approximately 5,000 samples was
generated using engineering judgment to represent small clusters of failed bolts. Next, 10,000
samples were generated via Monte-Carlo sampling with 15 as the maximum number of bolt
failures for any given sample. Finally, 10,000 samples were generated via Monte-Carlo sampling
with 50 as the maximum number of bolt failures for any given sample.
For each Monte-Carlo sample, the number of failed bolts was sampled first, and that number of failures was then randomly assigned to specific bolt locations. Figure 2 includes several selected samples to illustrate the range of postulated failures.
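As an illustration only (not taken from the study's generation code), the following R sketch draws one Monte-Carlo failure pattern; the 208-location indexing comes from the plate definition, while the function name and the uniform draw of the failure count are assumptions:
# Sketch: draw one Monte-Carlo sample of failed-bolt locations
sampleFailurePattern <- function(nBolts = 208, maxFailures = 15) {
  nFailed <- sample(1:maxFailures, 1)        # sample the number of failed bolts
  pattern <- integer(nBolts)                 # 0 = intact, 1 = failed
  pattern[sample(1:nBolts, nFailed)] <- 1    # assign the failures to random locations
  pattern
}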
80% of the samples were selected randomly for model training, and the remaining 20% were
reserved for model testing.
[Figure 2 data, first panel: an 8 × 26 failure map with no failed bolts (all zeros) and the corresponding base-case bolt-force map.]
Figure 2: Selected Samples to Illustrate Range of Postulated Failures
Exploratory Data Analysis
The input data consist of 208 predictor variables, each representing the bolt state at a given
location. The output data consist of 208 response variables, each representing the force at a given
bolt location. Bolt locations are specified as r#c#; for example, r3c5 refers to the bolt at Row 3, Column 5. Table 1 and Table 2 characterize the input and output data structure.
[Figure 2 data, remaining panels: 8 × 26 failure maps for samples with one, seven, and several dozen failed bolts.]
Table 1: Raw Input Data Structure
r1c1 r2c1 r3c1 … r8c26
Sample 1 0 0 0 … 0
Sample 2 1 0 0 … 0
Sample 3 0 0 1 … 0
. . . . … .
. . . . … .
. . . . … .
Sample 25,000 0 1 0 … 1
Table 2: Raw Output Data Structure
r1c1 r2c1 r3c1 … r8c26
Sample 1 2.34 3.90 3.30 … 1.18
Sample 2 0.00 5.48 3.31 … 1.57
Sample 3 2.32 4.10 0.00 … 1.23
. . . . … .
. . . . … .
. . . . … .
Sample 25,000 2.82 0.00 3.63 … 1.00
Figure 3 is a correlation plot of ANSYS-estimated bolt forces for a neighborhood of 50 bolt
locations. The plot identifies various pairs of moderately correlated locations. Each correlated pair
represents two immediately adjacent bolt locations. This indicates that the forces observed at any
given location are largely a function of the immediately adjacent locations, and the influence of
distant locations is weak, or even non-existent.
Also note that the correlations are negative: when a bolt breaks, the force at its location drops to near zero, while its immediate neighbors absorb the load the broken bolt carried when it was intact, so their forces increase.
Figure 3: Correlation Plot of Bolt Forces at a Neighborhood of 50 Bolts
Figure 4 depicts histograms of the forces across one example row and one example column. The figure shows that forces range between 0 and 15, and there appears to be more variation in the vertical (columnar) direction than in the horizontal (row-wise) direction.
Figure 4: Histograms of the Forces across One Example Row and One Example Column
Data Pre-Processing
The data in their raw form present at least two challenges:
• High dimensionality
• Binary (0/1) nature of the predictors
The high dimensionality (208 predictors and 208 responses) makes a full grid sampling study infeasible. However, strong symmetry between Columns 1-13 and Columns 14-26 was noted during exploratory data analysis, which allowed the data to be rearranged to double the sample size from 25,000 to 50,000 samples, with each sample containing 104 bolt locations.
While 50,000 samples for 104 predictors is still far from an ideal full grid, the correlation plots generated during exploratory data analysis indicated that the force observed at any given location is most influenced by its immediately adjacent neighbors. Subsequent modeling efforts may therefore focus on a local neighborhood surrounding the location to be predicted.
The binary (zero/one; intact/failed) nature of the predictors significantly challenged early model fitting efforts. The input data were therefore transformed such that:
• The predictor variable at each intact location was set to the force observed at that location in the base case of zero failures.
• The predictor variable at each failed location was changed from zero to an arbitrarily large number (100) to distinguish it from the intact locations.
This approach informs the predictor data with the known base-case force distribution across the plate. It also makes the predictor data continuous instead of binary. Table 3 characterizes the resulting pre-processed input data structure, and a short sketch of the transformation follows the table.
Table 3: Pre-Processed Input Data Structure
r1c1 r2c1 r3c1 … r8c13
Sample 1 2.34 3.90 3.30 … 1.18
Sample 2 100 3.90 3.30 … 1.18
Sample 3 2.34 3.90 100 … 1.18
. . . . … .
. . . . … .
. . . . … .
Sample 50,000 2.34 100 3.30 … 1.18
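The following R sketch illustrates this transformation. It is a sketch only, consistent in spirit with the pre-processing in Appendix A; the object names (failures, baseForces) and the helper function are assumptions:
# Sketch: convert a binary failure matrix (1 = failed) to pre-processed predictors
# failures:   samples x 104 matrix of 0/1 failure indicators (assumed)
# baseForces: length-104 vector of base-case (zero-failure) forces (assumed)
preprocessPredictors <- function(failures, baseForces, failedValue = 100) {
  X <- matrix(baseForces, nrow = nrow(failures), ncol = length(baseForces), byrow = TRUE)
  X[failures == 1] <- failedValue            # flag failed locations with a large value
  X                                          # intact locations keep the base-case force
}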
Multiple Linear Regression
Multiple linear regression was first attempted using all 104 locations as predictor variables and the
ANSYS-estimated force at one location as the response variable. The location at Row 4 Column
6 is roughly in the center of the plate and was selected for the prediction.
The caret package of R was used to implement lm with 10-fold cross-validation as follows:
ctrl=trainControl(method="cv", number=10)                                  # 10-fold cross-validation
modelLinear<-train(x.train, y.train[,44], method="lm", trControl = ctrl)   # column 44 = Row 4, Column 6
Figure 5 is a histogram of the coefficients of the resulting linear model. The vast majority of coefficient values are near zero, indicating that those parameters individually contribute negligibly to the force observed at the selected location.
The coefficient of about negative three (-3) corresponds to the failed bolt itself. There are two (2) coefficients with values near 1.75, and these represent the two bolts immediately and horizontally adjacent to the failed location.
Finally, the model intercept is about 2.5, which in theory would represent the force at the selected location if all bolts were failed. However, this intercept is not physically meaningful, given the poor overall fit of the linear model. In addition, no samples involved more than 50 bolt failures, so the all-failed estimate implied by the intercept is not well trained.
Figure 5: Histogram of Model Coefficient Values for Linear Model
Figure 6 plots the "observed" (ANSYS-calculated) bolt forces versus those predicted by the trained linear regression model. The linear regression captures some of the trend at a gross level; however, it is clearly a poor model for this application. The likely explanation is that the physics of force redistribution is highly non-linear.
Figure 6: Observed (ANSYS-Calculated) versus Predicted
Figure 7 plots the residuals for the trained linear regression model. The residuals are clearly not Gaussian, reaffirming that linear regression is a poor model type for this application. The two distinct groups are likely associated with samples where (1) there are no failures in the local neighborhood, and (2) there are failures in the local neighborhood. The negatively sloped 45° line is believed to represent cases where the predicted location (Row 4, Column 6) is itself failed.
Figure 7: Residuals for Trained Linear Regression Model
Regularized Regression
Lasso, ridge, and elastic net models were trained with 10-fold cross-validation using the glmnet package of R; the fitting calls are sketched after the list below. The mean squared error of each trained model is as follows:
• MSEridge = 4.9
• MSEelastic = 5.7
• MSElasso = 5.3
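The fitting calls are sketched below, abridged from the full code in Appendix A; train.x, train.y, test.x, and test.y are the matrices defined there, and cv.glmnet uses 10-fold cross-validation by default:
fitRidge   <- cv.glmnet(train.x, train.y, alpha=0,   family="gaussian", type.measure="mse")
fitElastic <- cv.glmnet(train.x, train.y, alpha=0.5, family="gaussian", type.measure="mse")
fitLasso   <- cv.glmnet(train.x, train.y, alpha=1,   family="gaussian", type.measure="mse")
yhatRidge  <- predict(fitRidge, s=fitRidge$lambda.1se, newx=test.x)   # held-out predictions
mseRidge   <- mean((test.y - yhatRidge)^2)                            # test mean squared error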
The response variables generally range from 0 to 7, so the above mean squared errors indicate a poor fit, similar to the multiple linear regression model. Again, this is likely because the physics of force redistribution is non-linear.
Next, the significance of each model coefficient was categorized using a 1% threshold. Figure 8 depicts this classification and indicates that the locations immediately adjacent to the predicted location (x=11, y=2; i.e., Row 2, Column 11) are most significant, while all other locations are insignificant.
Figure 8: Significance Classification (using 1% Threshold) of each Bolt Location
k-Nearest Neighbors
k-Nearest Neighbors was tested using Row 4, Column 6 (the same location used for the linear model) as the location to be predicted. The caret package of R was used to implement kNN with 10-fold cross-validation over a tuning-parameter space of k = 1:30 as follows:
ctrl=trainControl(method="cv", number=10)
modelKNN<-train(x=x.train[1:10000,], y=y.train[1:10000,44], method="knn",
                preProc=c("center","scale"), tuneGrid=data.frame(.k=1:30),
                trControl=ctrl)
Figure 9 plots the cross-validated model accuracy (measured by RMSE) over a range of model complexities between k = 1 and k = 30 neighbors. The plot indicates that the optimal (lowest-RMSE) kNN model has seven (7) neighbors.
Figure 9: kNN RMSE over Range of Neighbors used for Model Training
Figure 10 plots the observed (ANSYS-calculated) forces versus those predicted by the k = 7 kNN model. The model struggles to reflect any of the observed variation. This poor performance may be related to inadequate sample size. The exploratory data analysis and the linear models identified that the force at any one location is largely driven by the status of the immediately adjacent locations. When the kNN model is trained to predict the force at one location, in this case Row 4, Column 6, there are likely not enough samples involving variation immediately surrounding that location to adequately train the model. The vast majority of samples, which vary only at locations distant from the location of interest, are largely "wasted" and may impose noise on the learning process.
Figure 10: Observed (ANSYS-Calculated) versus Predicted
Given the kNN challenges with insufficient sample size relative to the high dimensionality, and the insight from the linear models that only the conditions at the adjacent bolts are significant, it may be worth:
• Reshaping the training data such that the response at any one location is a function only of the eight (8) immediately adjacent neighbors of the bolt location to be predicted; a sketch of such a neighborhood feature appears below.
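The sketch below illustrates how such a neighborhood feature could be assembled. It is illustrative only; it assumes the half-plate's 8 x 13 grid with locations numbered down each column (consistent with the indexing in Appendix B), and the helper function is hypothetical:
# Sketch: states of the eight bolts surrounding (row, col) for one sample
# failureRow: length-104 vector of 0/1 failure indicators for one sample (assumed)
neighborStates <- function(failureRow, row, col, nRows = 8, nCols = 13) {
  idx <- function(r, c) (c - 1) * nRows + r                  # column-major location index
  offsets <- expand.grid(dr = -1:1, dc = -1:1)
  offsets <- offsets[!(offsets$dr == 0 & offsets$dc == 0), ] # the 8 surrounding cells
  sapply(seq_len(nrow(offsets)), function(k) {
    r <- row + offsets$dr[k]; c <- col + offsets$dc[k]
    if (r < 1 || r > nRows || c < 1 || c > nCols) NA         # off the plate edge
    else failureRow[idx(r, c)]
  })
}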
Kriging
Kriging was tested using the gstat package of R, which is commonly used for two-dimensional interpolation problems, such as the gold-grade predictions in mining for which the method was originally developed by Daniel Krige. The basic R coding is summarized as follows:
forces.vgm <- variogram(force~1, krigSample)                                 # empirical variogram of bolt forces
forces.fit <- fit.variogram(forces.vgm, model=vgm(1, "Ste", 1))              # fit a Matern (Stein parameterization) variogram model
plate.kriged <- krige(force ~ 1, krigSample, plate.grid, model=forces.fit)   # ordinary kriging over the plate grid
Three Kriging models were trained, and the results are visualized in Figures 11-13.
Figure 11: Predict Forces between all Bolt Locations, with No Failed Bolts, and the Forces at all Bolts Known
Figure 12: Predict Forces between all Bolt Locations, Given a Random Sample of Nine (9) Failed Bolts, and the Forces at all Bolts Known
Figure 13: Predict Forces at the 34 Locations Immediately Adjacent to the Nine (9) Failed Bolts, Given the Forces at all Bolt Locations except the 34 Adjacent to Failures
The predictions in Figures 12 and 13 are for the same sample involving nine (9) failed bolts. However, in Figure 12 the model was trained with the force at all bolt locations known, so the Kriging prediction is essentially an interpolation between known values; the predicted heat map is likely close to the true heat map. In Figure 13, Kriging is predicting forces at the bolt locations surrounding the failures (not simply interpolating between known forces at each location).
Note one distinction between the figures: in Figure 12, the failures are shown as bright green circles indicating zero force, which is the form of the raw ANSYS output data. In Figure 13, the failed locations have been changed to a value of five (5) to make the Kriging interpolation more realistic, since interpolating to a value of zero would erroneously indicate low forces surrounding each failure.
Comparing Figure 12 and Figure 13, Kriging performed reasonably well. There is some over-estimation in Figure 13, likely due to choosing a value of five (5) to represent the force at each failed bolt location. The Kriging model could be tuned by adjusting this value.
Principal Component Analysis (PCA) and Artificial Neural Network (ANN)
First, contourf() plots of the correlation within the input and output data were generated; these plots are shown in Figure 14 and Figure 15, respectively.
Figure 14: contourf() Correlation Plot of Input Space
Figure 14 indicates essentially zero correlation within the input space, which is sensible given that the input space simply indicates where randomly selected bolt failure locations exist for each sample.
Figure 15: contourf() Correlation Plot of Output Space (Left) and Sample Heat Map (Right)
The left panel of Figure 15 similarly indicates near-zero output-space correlation across most of the plate. However, there are two distinct and discontinuous lines of strong negative correlation parallel to the diagonal.
These lines are offset by exactly eight (8) index locations from the diagonal, which corresponds to the two bolts horizontally adjacent to each failed bolt location. The negative correlation indicates that when a failure occurs, the force absorbed at that location decreases to zero, while the forces absorbed at the adjacent locations increase. Bolts beyond the immediately adjacent locations show no change.
The discontinuous character of the two parallel lines of negative correlation is a feature of the plate design that can be seen in the right panel of Figure 15, a heat map of forces at each location assuming no bolt failures. The discontinuities at Columns 2-3 and 8 correspond to the discontinuities in the contourf() plot; these are sections of the plate that are known to have little influence on the adjacent portions of the plate because of their configuration. The left panel indicates a third discontinuity near the 40th index, but this is not as clear in the heat map.
Next, PCA was applied to the response data using the pca() function of MATLAB 2015b. Figure 16 plots the cumulative percent of variance explained as a function of the number of scores retained. The plot indicates that 50 scores would need to be retained to explain 90% of the variance, which would represent a ~50% dimensionality reduction.
Figure 16: Cumulative Variance Explained by PCA Score
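The study used MATLAB's pca(); an equivalent check in R, shown as a sketch only and assuming the 104-column response matrix is held in an object named responses, would be:
pcaResp      <- prcomp(responses, center=TRUE, scale.=FALSE)
varExplained <- cumsum(pcaResp$sdev^2) / sum(pcaResp$sdev^2)   # cumulative fraction of variance
min(which(varExplained >= 0.90))                               # scores needed for 90% of variance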
Finally, an artificial neural network was developed using the MATLAB Machine Learning Toolbox (nnfit). As a first attempt, all 104 predictor variables were fitted to 32 response PCA scores, which explain 75% of the variance in the response data. A default neural network with one hidden layer containing 10 neurons was fitted using Levenberg-Marquardt back-propagation. 70% of the data was used for training, 15% for validation, and 15% for testing.
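For reference, a rough single-hidden-layer analogue in R could use the nnet package; this is a sketch only (nnet trains with BFGS rather than Levenberg-Marquardt, and the object names predictors and pcaScores are assumptions):
library(nnet)
# predictors: samples x 104 predictor matrix; pcaScores: first 32 response PCA scores (assumed)
annFit <- nnet(x=predictors, y=pcaScores, size=10, linout=TRUE,   # one hidden layer of 10 neurons
               maxit=500, MaxNWts=5000)                           # allow enough weights/iterations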
The ANN trained for six (6) hours without terminating. Inspection of the performance plot (Figure 17 below) suggested that the error had effectively converged but would not reach the minimum required for training to terminate automatically.
Figure 17: ANN Training Performance
The model training was therefore terminated and the workspace saved. Figure 18 depicts the observed versus predicted values for the trained ANN model.
Figure 18: ANN Observed versus Predicted
Figure 18 indicates a relatively poor fit (correlation, R, between observed and predicted of ~70%). However, inspecting the plot shows that the bulk of the predicted data do align with the observations, but that a horizontally oriented subset of data skews the fit off-diagonal, clockwise. This horizontal subset occurs where the prediction is zero (0), which is likely related to the failed bolt locations being identified as zero (0) in the predictor data.
The trained ANN was then used to predict the responses for all samples. It is acknowledged that
this includes all of the data upon which the ANN was trained. First, the PCA scores were predicted
by the trained ANN, and the scores were transformed back into the original data space by
multiplying the scores and eigenvectors, using the following code:
predictedScores = net(predictors');
predictedScores = predictedScores';
predicted = predictedScores * pcaRESP.coeff(1:32,:);
Figure 19 plots the observed responses versus those predicted by the PCA and ANN.
Figure 19: Observed vs. Predicted for One Selected Bolt Location
Figure 19 indicates that the combined PCA and ANN model was unable to predict the observations with reasonable accuracy. This is in part because the model fitting process included the following two known degradations:
• Only enough PCA scores were retained to explain 75% of the variation. This was done to manage the dimensionality and keep ANN training feasible.
• The ANN achieved only an R = 0.60 correlation between observed and predicted.
Figure 19 also illustrates the discrete nature of the response. There are five (5) distinct groupings of response data, likely corresponding to the range of nearby bolt failures postulated in the input space. It is possible this characteristic could be used to generate a very simple model, perhaps one that is directly proportional to the number of immediately adjacent bolt failures.
Alternate Formatting of Input Data
Finally, given that all of the attempted models have struggled with dimensionality, an alternate
format of the input data was developed. The new input data table has 30 columns, with each pair
of columns ([1,2], [3,4], [5,6], etc.) corresponding to the coordinate locations of each failed bolt.
All samples with more than 15 bolt failures were truncated from the input space, leaving ~45,000
samples available for modeling.
Table 4 illustrates the alternate data format. For example, Sample 1 has one failed bolt at position
(X=2, Y=2). Sample 2 has failed bolts at positions (X=5, Y=3) and (X=2, Y=8). Sample 3 has 15
failed bolt locations.
Table 4: Alternate Input Data Format
X1 Y1 X2 Y2 X3 Y3 … X15 Y15
Sample 1 2 2 0 0 0 0 … 0 0
Sample 2 5 3 2 8 0 0 … 0 0
Sample 3 1 6 5 3 12 3 … 6 6
. . . . . . . … . .
. . . . . . . … . .
. . . . . . . … . .
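The conversion from the original 104-column format is sketched below. It is a sketch only, using the same column-major indexing as the MATLAB pre-processing in Appendix B; the binary input format and the helper name are assumptions:
# Sketch: convert one binary failure vector (1 = failed) to zero-padded (X, Y) pairs
failureXY <- function(failureRow, nRows = 8, maxFailures = 15) {
  idx <- which(failureRow == 1)                 # indices of the failed bolts
  xy  <- numeric(2 * maxFailures)               # 30 columns, zero-padded
  for (k in seq_along(idx)) {
    xy[2*k - 1] <- (idx[k] - 1) %%  nRows + 1   # X (row) coordinate
    xy[2*k]     <- (idx[k] - 1) %/% nRows + 1   # Y (column) coordinate
  }
  xy
}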
Initial model fitting using the alternate input data format struggled to predict with reasonable accuracy. The primary reason is suspected to be that the output data are not similarly indexed: the output data are indexed 1:104, whereas the alternate input data are indexed by X and Y position. It is also possible that the models interpret Xn and Yn as separate predictors, when in reality they are related because the two parameters together indicate the position of one bolt.
Conclusion
Appendix A: R Coding
###Load Libraries
```{r eval=TRUE, message=FALSE, warning=FALSE}
library(AppliedPredictiveModeling)
library(caret)
library(corrplot)
library(dplyr)
library(ggplot2)
library(glmnet)
library(gridExtra)
library(gstat)
library(Lahman)
library(magrittr)
library(nnet)
library(scales)
library(sp)
library(stats)
library(tidyr)
```
###Import and Pre-Process Data
```{r eval=TRUE}
predictors<-read.table("_predictors.csv",header=TRUE,sep=",",dec=".")
responses<-read.table("_responses.csv",header=TRUE,sep=",",dec=".")
predictors<-predictors[,1:105]
responses<-responses[,1:105]
#Replace all zero (failed) locations with large number.
predictors[predictors==0]<-100
predictors=predictors[,2:105]
responses=responses[,2:105]
```
###Split into Training (80%) and Test (20%) Data
```{r eval=TRUE}
indices = sample(1:nrow(predictors), size=0.2*nrow(predictors))
x.test = predictors[indices,]
x.train = predictors[-indices,]
y.test = responses[indices,]
y.train = responses[-indices,]
rm(indices, predictors, responses)
```
###Exploratory Data Analysis
```{r eval=TRUE}
corrplot(cor(y.train[,1:50]), order="hclust", tl.cex=0.75,
         title="Correlation of Bolt Forces across Locations\n(Local Neighborhood of 50 Bolts)",
         mar=c(0,0,2,0))
oneCol<-data.frame(c(y.train[,97],y.train[,98],y.train[,99],y.train[,100],
                     y.train[,101],y.train[,102],y.train[,103],y.train[,104]))
oneRow<-data.frame(c(y.train[,4],y.train[,12],y.train[,20],y.train[,28],y.train[,36],
                     y.train[,44],y.train[,52],y.train[,60],y.train[,68],y.train[,76],
                     y.train[,84],y.train[,92],y.train[,100]))
ggplot(data=oneRow, aes(x=oneRow[,1]))+
geom_histogram()+
ggtitle("Range of Bolt Forces across One Row")+
xlab("Bolt Force")+
ylab("Count")+
theme(axis.title = element_text(face="bold",size = 18))+
theme(axis.text = element_text(face="bold",size=12))+
theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))
ggplot(data=oneCol, aes(x=oneCol[,1]))+
geom_histogram()+
ggtitle("Range of Bolt Forces across One Column")+
xlab("Bolt Force")+
ylab("Count")+
theme(axis.title = element_text(face="bold",size = 18))+
theme(axis.text = element_text(face="bold",size=12))+
theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))
rm(oneCol,oneRow)
```
###Linear Regression
```{r, message=FALSE, warning=FALSE}
ctrl=trainControl(method="cv", number=10)
modelLinear<-train(x.train[], y.train[,44], method="lm", trControl = ctrl)
predictLinear<-x.test
predictLinear$predicted<-predict(modelLinear,predictLinear)
predictLinear$observed<-data.frame(y.test[,44])
ObsFit<-data.frame(predictLinear$observed,predictLinear$predicted)
names(ObsFit) <- c("obs", "pred")
modelstats=defaultSummary(ObsFit)
RMSE=round(modelstats[1],2)
Rsq=round(modelstats[2],2)
ggplot(data=predictLinear, aes(x=observed,y=predicted))+
geom_point(alpha=0.1)+
geom_abline(intercept = 0,slope = 1)+
ggtitle("Multiple Linear Regression")+
xlim(0,13)+
ylim(0,13)+
xlab("ANSYS Bolt Force")+
ylab("Predicted")+
theme(axis.title = element_text(face="bold",size = 18))+
theme(axis.text = element_text(face="bold",size=12))+
theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))+
annotate("text",x=12,y=3,label = paste("R^2 =",Rsq),fontface=2,size=4)+
annotate("text",x=12,y=2,label = paste("RMSE =",RMSE),fontface=2,size=4)
xyplot(resid(modelLinear) ~ predict(modelLinear),
       type = c("p","g"),
       xlab = "Predicted", ylab = "Residuals",
       main = "Linear Model Residuals\n(This is not Normal)")
coef<-as.data.frame(coef(modelLinear$finalModel))
coefSignificant<-data.frame(summary(modelLinear)$coef[summary(modelLinear)$coef[,4] <= 0.05, 4])
ggplot(data=coef, aes(x=coef(modelLinear$finalModel)))+
geom_histogram()+
ggtitle("Linear Model Coefficient Values")+
#xlim(-4,3)+
xlab("Coefficient Value")+
ylab("Count")+
theme(axis.title = element_text(face="bold",size = 18))+
theme(axis.text = element_text(face="bold",size=12))+
theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))+
annotate("text",x=2.65,y=25,label="Intercept",fontface=2,size=4)+
annotate("text",x=1.5,y=50,label="Immediately Adjacent\n(Left-Right)",fontface=2,size=4)+
annotate("text",x=-1.0,y=175,label="Vast Majority\nInsignificant",fontface=2,size=4)+
annotate("text",x=-3,y=25,label="Failed Bolt\n(Force ~ Zero)",fontface=2,size=4)
rm(coef,coefSignificant,ctrl,modelstats,RMSE,Rsq,ObsFit,modelLinear,predictLinear)
```
###Regularized Regression
```{r}
# Fit models
failure<-"r2c11"
predictors<-read.csv("_predictors.csv")
responses<-read.csv("_responses.csv")
predictors<-predictors[,-1]
responses<-responses[,-1]
responses<-responses[,1:104]
sam <- sample(1:nrow(predictors),floor(0.8*nrow(predictors)))
train.x <- as.matrix(predictors[sam,])
train.y <- as.matrix(responses[sam,failure])
test.x <- as.matrix(predictors[-sam,])
test.y <- as.matrix(responses[-sam,failure])
fit.ordinary <- lm(train.y ~ train.x)
hist(length(fit.ordinary$coefficients))
fit.lasso <- glmnet(train.x, train.y, family="gaussian", alpha=1)
dev.new(); hist(fit.lasso$df)
fit.ridge <- glmnet(train.x, train.y, family="gaussian", alpha=0)
dev.new(); hist(fit.ridge$df)
fit.elnet <- glmnet(train.x, train.y, family="gaussian", alpha=.5)
dev.new(); hist(fit.elnet$df, breaks=seq(10,200,10))
#
# 10-fold cross validation for each alpha (ranging between Ridge Regression and Lasso across the Elastic Net)
#
for (i in 0:10) {
assign(paste("fit", i, sep=""), cv.glmnet(train.x, train.y,
type.measure="mse", alpha=i/10,family="gaussian"))
}
#
# Plot solution paths (to choose regularized models within each "family")
#
dev.new()
par(mfrow=c(1,2))
plot(fit10)
grid()
plot(fit.lasso, xvar="lambda")
#
dev.new()
par(mfrow=c(1,2))
plot(fit0)
grid()
plot(fit.ridge, xvar="lambda")
#
dev.new()
par(mfrow=c(1,2))
plot(fit5)
grid()
plot(fit.elnet, xvar="lambda")
#
# Calculating MSE for each ALPHA
#
yhat0 <- predict(fit0, s=fit0$lambda.1se, newx=test.x)
yhat1 <- predict(fit1, s=fit1$lambda.1se, newx=test.x)
yhat2 <- predict(fit2, s=fit2$lambda.1se, newx=test.x)
yhat3 <- predict(fit3, s=fit3$lambda.1se, newx=test.x)
yhat4 <- predict(fit4, s=fit4$lambda.1se, newx=test.x)
yhat5 <- predict(fit5, s=fit5$lambda.1se, newx=test.x)
yhat6 <- predict(fit6, s=fit6$lambda.1se, newx=test.x)
yhat7 <- predict(fit7, s=fit7$lambda.1se, newx=test.x)
yhat8 <- predict(fit8, s=fit8$lambda.1se, newx=test.x)
yhat9 <- predict(fit9, s=fit9$lambda.1se, newx=test.x)
yhat10 <- predict(fit10, s=fit10$lambda.1se, newx=test.x)
mse0 <- mean((test.y - yhat0)^2)
mse1 <- mean((test.y - yhat1)^2)
mse2 <- mean((test.y - yhat2)^2)
mse3 <- mean((test.y - yhat3)^2)
mse4 <- mean((test.y - yhat4)^2)
mse5 <- mean((test.y - yhat5)^2)
mse6 <- mean((test.y - yhat6)^2)
mse7 <- mean((test.y - yhat7)^2)
mse8 <- mean((test.y - yhat8)^2)
mse9 <- mean((test.y - yhat9)^2)
mse10 <- mean((test.y - yhat10)^2)
print(paste0("MSE for Ridge: ", mse0))
print(paste0("MSE for Elastic Net:",mse5))
print(paste0("MSE for Lasso:", mse10))
coeff_ridge<-coef(fit0,s='lambda.min')
coeff_elastic<-coef(fit5,s='lambda.min')
coeff_lasso<-coef(fit10,s='lambda.min')
sig_coeff_ridge<-as.matrix(coeff_ridge[which(abs(coeff_ridge)>=.01),])
sig_coeff_elastic<-as.matrix(coeff_elastic[which(abs(coeff_elastic)>=.01),])
sig_coeff_lasso<-as.matrix(coeff_lasso[which(abs(coeff_lasso)>=.01),])
viz_grid<-function(coeff,failure){
a = as.data.frame(expand.grid(1:13,1:8))
colnames(a) = c('x', 'y')
b<-strsplit(row.names(coeff),"[a-z]+")
matrix<-as.matrix(coeff)
x.c<-rep(0,length(matrix))
y.c<-rep(0,length(matrix))
for(i in 2:length(matrix)){
x.c[i]<-as.numeric(b[[i]][3])
y.c[i]<-as.numeric(b[[i]][2])
}
x.c<-x.c[-1]
y.c<-y.c[-1]
sigbolts<-as.data.frame(cbind(x.c,y.c)) #contains the coordinates
a$indicator<-0
for (i in 1:length(sigbolts[,2])){
for(j in 1:length(a[,2])){
if(sigbolts[i,1]==a[j,1] & sigbolts[i,2]==a[j,2])
a[j,3]<-1
}
}
fail<-strsplit(failure,"[a-z]+")
index<-which(a$x==fail[[1]][3] & a$y==fail[[1]][2])
a[index,3]<-2
ggplot() + geom_point(data = a, aes(x = x, y = y,color=factor(indicator))) +
geom_point() + scale_y_continuous(trans='reverse') +
scale_color_discrete(label=c("insignificant","significant","failure"))
}
viz_grid(sig_coeff_ridge,failure)
viz_grid(sig_coeff_elastic,failure)
viz_grid(sig_coeff_lasso,failure)
rm(predictors,responses,test.x,test.y,train.x,train.y)
rm(sig_coeff_elastic,sig_coeff_lasso,sig_coeff_ridge)
rm(yhat0,yhat1,yhat2,yhat3,yhat4,yhat5,yhat6,yhat7,yhat8,yhat9,yhat10)
rm(coeff_elastic,coeff_lasso,coeff_ridge)
rm(failure,fit.elnet,fit.lasso,fit.ordinary,fit.ridge)
rm(fit0,fit1,fit2,fit3,fit4,fit5,fit6,fit7,fit8,fit9,fit10,i)
rm(mse0,mse1,mse2,mse3,mse4,mse5,mse6,mse7,mse8,mse9,mse10)
rm(sam,viz_grid)
```
###K-Nearest Neighbors
```{r eval=TRUE}
ctrl=trainControl(method="cv", number=2)
modelKNN<-train(x=x.train[1:5000,], y=y.train[1:5000,44], method="knn",
preProc=c("center","scale"), tuneGrid=data.frame(.k=1:30), trControl=ctrl)
print(modelKNN)
plot(modelKNN)
predictKNN<-x.test
predictKNN$predicted<-predict(modelKNN,predictKNN)
predictKNN$observed<-data.frame(y.test[,44])
ObsFit<-data.frame(predictKNN$observed,predictKNN$predicted)
names(ObsFit) <- c("obs", "pred")
modelstats=defaultSummary(ObsFit)
RMSE=round(modelstats[1],2)
Rsq=round(modelstats[2],2)
ggplot(data=predictKNN, aes(x=observed,y=predicted))+
geom_point(alpha=0.5)+
geom_abline(intercept = 0,slope = 1)+
ggtitle("K-Nearest Neighbors")+
xlim(0,13)+
ylim(0,13)+
xlab("ANSYS Bolt Force")+
ylab("Predicted")+
theme(axis.title = element_text(face="bold",size = 18))+
theme(axis.text = element_text(face="bold",size=12))+
theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))+
annotate("text",x=12,y=3,label = paste("R^2 =",Rsq),fontface=2,size=4)+
annotate("text",x=12,y=2,label = paste("RMSE =",RMSE),fontface=2,size=4)
```
###Kriging
```{r eval=TRUE}
for (i in 3:5){
krigSample<-read.table("_krigingData.csv",header=TRUE,sep=",",dec=".")
krigSample<-krigSample[,c(1,2,i)]
krigSample<-na.omit(krigSample)
colnames(krigSample) <- c("x", "y", "force")
coordinates(krigSample) <- ~ x + y
class(krigSample)
krigSample %>% as.data.frame %>%
ggplot(aes(x, y)) + geom_point(aes(color=force,size=force)) +
scale_colour_gradient(low = "green", high = "red")+
ggtitle("Bolt Force at each Location") + coord_equal()
forces.vgm <- variogram(force~1, krigSample)
forces.fit <- fit.variogram(forces.vgm, model=vgm(1, "Ste", 1))
plot(forces.vgm,forces.fit)
Sr1 = Polygon(cbind(c(0,0,13,13),c(0,8,8,0)))
Srs1 = Polygons(list(Sr1), "s1")
SpP=SpatialPolygons(list(Srs1))
plate.grid=spsample(SpP,10000,"regular")
plate.kriged <- krige(force ~ 1, krigSample, plate.grid, model=forces.fit)
plot1 <- krigSample %>% as.data.frame %>%
ggplot(aes(x, y)) + geom_point(aes(color=force),size=1) +
scale_colour_gradient(low = "green", high = "red", breaks=c(1,7),
labels=c(1,7), limits=c(0,7))+
ggtitle("ANSYS Force at each Bolt Location") + coord_equal()
plot2 <- plate.kriged %>% as.data.frame %>%
  ggplot(aes(x=x1, y=x2)) + geom_tile(aes(fill=var1.pred)) + coord_equal() +
  scale_fill_gradient(low = "green", high="red", breaks=c(1,7),
                      labels=c(1,7), limits=c(0,7)) +
  scale_x_continuous(labels=comma) + scale_y_continuous(labels=comma) +
  theme_bw() + ggtitle("Krige-Interpolated Force across Plate")
grid.arrange(plot1, plot2, ncol = 2)
}
```
Appendix B: MATLAB 2015b Coding
%% INITIALIZE AND IMPORT DATA
clc; clear all; close all;
set(0,'defaultfigurecolor',[1 1 1]);
cd ('C:\Users\worrelcl\Desktop\Clarence\IE2065 (Stat Analysis & Optimization)\Project');
predictors = csvread('_predictors.csv', 1, 1); predictors = predictors(:,1:104);
responses = csvread('_responses.csv', 1, 1); responses = responses(:,1:104);
numSamples = length(predictors);
pcaPRED = struct('original', predictors);
pcaRESP = struct('original', responses);
%% PRE-PROCESS DATA FROM 104 COLUMNS TO 30 COLUMNS
% where each pair of columns identify the X,Y coordinates of each failed
% bolt location. Truncate samples with more than 15 bolt failures.
% First, count number of failures for each sample,
% identify samples with <=15 failures, and
% truncate predictor/response data to samples with <=15 failures
numFailed = zeros(numSamples,1);
counter = 1;
for sample = 1:numSamples
numFailed(sample,1) = sum(predictors(sample,:)==0);
if numFailed(sample,1) <= 15
keepIndex(counter,1) = sample;
counter = counter + 1;
end
end
predictors = predictors(keepIndex,:);
responses = responses(keepIndex,:);
numSamples = length(predictors);
% Next, get X,Y coordinates of each failure
failCols = zeros(numSamples,15);
for sample = 1:numSamples
counter = 1;
for col = 1:104
if predictors(sample,col)==0
failCols(sample,counter) = col;
counter = counter + 1;
end
end
end
% Finally, populate a new 30-column predictor matrix with X,Y location of
% each failed bolt
predictorsXY = zeros(numSamples,30);
for sample = 1:numSamples
counter = 0;
for col = 1:15
if failCols(sample,col) > 0
counter = counter + 1;
failedColIndex = failCols(sample,col);
predictorsXY(sample,2*counter-1) = rem((failedColIndex-1),8)+1;  % X coord
predictorsXY(sample,2*counter) = floor((failedColIndex-1)/8)+1;  % Y coord
end
end
end
clear col counter failedColIndex keepIndex sample
%% Correlation Plots
zPRED = zscore(pcaPRED.original);
cPRED = (zPRED'*zPRED) / (numSamples-1);
figure; contourf(cPRED), colorbar, title('Correlation between Predictor Variables');
zRESP = zscore(pcaRESP.original);
cRESP = (zRESP'*zRESP) / (numSamples-1);
figure; contourf(cRESP), colorbar, title('Correlation between Response Variables');
clear cPRED cRESP zPRED zRESP pcaPRED
%% PCA
[pcaRESP.coeff, pcaRESP.score, pcaRESP.latent, pcaRESP.tsquared, pcaRESP.explained] = pca(pcaRESP.original);
pcaRESP.explained=cumsum(pcaRESP.explained);
figure; bar(pcaRESP.explained), title('Cumulative Percent of Variance Explained'), xlabel('Score'), ylabel('Percent'), xlim([0,100]);
NNtargetPCA = pcaRESP.score(:,1:32);
predictedScores = net(predictors');
predictedScores = predictedScores';
predicted = predictedScores * pcaRESP.coeff(1:32,:);
boltLoc = 84;
figure; hold on
scatter(responses(:,boltLoc),predicted(:,boltLoc) + mean(responses(:,boltLoc)));
title('Observed vs. Predicted for One Selected Bolt Location');
xlabel('Observed');
ylabel('Predicted (PCA+ANN)');
xlim([0,15]);
ylim([0,15]);
hold off
More Related Content

Similar to Predicting Force Redistribution caused by bolt failures across a Plate

Mechanism Design and Kinematics Analysis of Display Bracket Based on Adams
Mechanism Design and Kinematics Analysis of Display Bracket Based on AdamsMechanism Design and Kinematics Analysis of Display Bracket Based on Adams
Mechanism Design and Kinematics Analysis of Display Bracket Based on AdamsIJRESJOURNAL
 
[This sheet must be completed and attached to the last page of.docx
[This sheet must be completed and attached to the last page of.docx[This sheet must be completed and attached to the last page of.docx
[This sheet must be completed and attached to the last page of.docxhanneloremccaffery
 
Lec4 State Variable Models are used for modeing
Lec4 State Variable Models are used for modeingLec4 State Variable Models are used for modeing
Lec4 State Variable Models are used for modeingShehzadAhmed90
 
Applied numerical methods lec6
Applied numerical methods lec6Applied numerical methods lec6
Applied numerical methods lec6Yasser Ahmed
 
Nelson maple pdf
Nelson maple pdfNelson maple pdf
Nelson maple pdfNelsonP23
 
Pajek chapter2 Attributes and Relations
Pajek chapter2 Attributes and RelationsPajek chapter2 Attributes and Relations
Pajek chapter2 Attributes and RelationsChengjun Wang
 
G03402048053
G03402048053G03402048053
G03402048053theijes
 
Durbin watson tables unyu unyu bgt
Durbin watson tables unyu unyu bgtDurbin watson tables unyu unyu bgt
Durbin watson tables unyu unyu bgtFergieta Prahasdhika
 
Rabotna tetratka 5 odd
Rabotna tetratka 5 oddRabotna tetratka 5 odd
Rabotna tetratka 5 oddMira Trajkoska
 
NETWORK THEORY MESH N SUPER MESH TOPIC REVIEW
NETWORK THEORY MESH N SUPER MESH TOPIC REVIEWNETWORK THEORY MESH N SUPER MESH TOPIC REVIEW
NETWORK THEORY MESH N SUPER MESH TOPIC REVIEWSharmanya Korde
 
The sexagesimal foundation of mathematics
The sexagesimal foundation of mathematicsThe sexagesimal foundation of mathematics
The sexagesimal foundation of mathematicsMichielKarskens
 
Table durbin watson tables
Table durbin watson tablesTable durbin watson tables
Table durbin watson tablesDIANTO IRAWAN
 
X Bar And S Charts Mini Tutorial
X Bar And S Charts Mini TutorialX Bar And S Charts Mini Tutorial
X Bar And S Charts Mini Tutorialahmad bassiouny
 

Similar to Predicting Force Redistribution caused by bolt failures across a Plate (20)

AEN-VAR-AEN.pdf
AEN-VAR-AEN.pdfAEN-VAR-AEN.pdf
AEN-VAR-AEN.pdf
 
Mechanism Design and Kinematics Analysis of Display Bracket Based on Adams
Mechanism Design and Kinematics Analysis of Display Bracket Based on AdamsMechanism Design and Kinematics Analysis of Display Bracket Based on Adams
Mechanism Design and Kinematics Analysis of Display Bracket Based on Adams
 
SQC Project 01
SQC Project 01SQC Project 01
SQC Project 01
 
[This sheet must be completed and attached to the last page of.docx
[This sheet must be completed and attached to the last page of.docx[This sheet must be completed and attached to the last page of.docx
[This sheet must be completed and attached to the last page of.docx
 
Lec4 State Variable Models are used for modeing
Lec4 State Variable Models are used for modeingLec4 State Variable Models are used for modeing
Lec4 State Variable Models are used for modeing
 
Applied numerical methods lec6
Applied numerical methods lec6Applied numerical methods lec6
Applied numerical methods lec6
 
Control charts
Control chartsControl charts
Control charts
 
Nelson maple pdf
Nelson maple pdfNelson maple pdf
Nelson maple pdf
 
solution for 2D truss1
solution for 2D truss1solution for 2D truss1
solution for 2D truss1
 
Pajek chapter2 Attributes and Relations
Pajek chapter2 Attributes and RelationsPajek chapter2 Attributes and Relations
Pajek chapter2 Attributes and Relations
 
G03402048053
G03402048053G03402048053
G03402048053
 
Durbin watson tables unyu unyu bgt
Durbin watson tables unyu unyu bgtDurbin watson tables unyu unyu bgt
Durbin watson tables unyu unyu bgt
 
Rabotna tetratka 5 odd
Rabotna tetratka 5 oddRabotna tetratka 5 odd
Rabotna tetratka 5 odd
 
Cluto presentation
Cluto presentationCluto presentation
Cluto presentation
 
1سلمي 2
1سلمي 21سلمي 2
1سلمي 2
 
NETWORK THEORY MESH N SUPER MESH TOPIC REVIEW
NETWORK THEORY MESH N SUPER MESH TOPIC REVIEWNETWORK THEORY MESH N SUPER MESH TOPIC REVIEW
NETWORK THEORY MESH N SUPER MESH TOPIC REVIEW
 
The sexagesimal foundation of mathematics
The sexagesimal foundation of mathematicsThe sexagesimal foundation of mathematics
The sexagesimal foundation of mathematics
 
Durbin watson tables
Durbin watson tablesDurbin watson tables
Durbin watson tables
 
Table durbin watson tables
Table durbin watson tablesTable durbin watson tables
Table durbin watson tables
 
X Bar And S Charts Mini Tutorial
X Bar And S Charts Mini TutorialX Bar And S Charts Mini Tutorial
X Bar And S Charts Mini Tutorial
 

Recently uploaded

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAroojKhan71
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiSuhani Kapoor
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...Suhani Kapoor
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysismanisha194592
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxolyaivanovalion
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationshipsccctableauusergroup
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Serviceranjana rawat
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxolyaivanovalion
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingNeil Barnes
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Callshivangimorya083
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxolyaivanovalion
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxfirstjob4
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystSamantha Rae Coolbeth
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...Suhani Kapoor
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionfulawalesam
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiSuhani Kapoor
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxMohammedJunaid861692
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxolyaivanovalion
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfLars Albertsson
 

Recently uploaded (20)

Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al BarshaAl Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
Al Barsha Escorts $#$ O565212860 $#$ Escort Service In Al Barsha
 
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
VIP Call Girls Service Charbagh { Lucknow Call Girls Service 9548273370 } Boo...
 
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service BhilaiLow Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
Low Rate Call Girls Bhilai Anika 8250192130 Independent Escort Service Bhilai
 
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
VIP High Profile Call Girls Amravati Aarushi 8250192130 Independent Escort Se...
 
April 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's AnalysisApril 2024 - Crypto Market Report's Analysis
April 2024 - Crypto Market Report's Analysis
 
Midocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFxMidocean dropshipping via API with DroFx
Midocean dropshipping via API with DroFx
 
04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships04242024_CCC TUG_Joins and Relationships
04242024_CCC TUG_Joins and Relationships
 
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
(PARI) Call Girls Wanowrie ( 7001035870 ) HI-Fi Pune Escorts Service
 
BigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptxBigBuy dropshipping via API with DroFx.pptx
BigBuy dropshipping via API with DroFx.pptx
 
Brighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data StorytellingBrighton SEO | April 2024 | Data Storytelling
Brighton SEO | April 2024 | Data Storytelling
 
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip CallDelhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
Delhi Call Girls CP 9711199171 ☎✔👌✔ Whatsapp Hard And Sexy Vip Call
 
Mature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptxMature dropshipping via API with DroFx.pptx
Mature dropshipping via API with DroFx.pptx
 
Introduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptxIntroduction-to-Machine-Learning (1).pptx
Introduction-to-Machine-Learning (1).pptx
 
Unveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data AnalystUnveiling Insights: The Role of a Data Analyst
Unveiling Insights: The Role of a Data Analyst
 
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
VIP High Class Call Girls Jamshedpur Anushka 8250192130 Independent Escort Se...
 
Week-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interactionWeek-01-2.ppt BBB human Computer interaction
Week-01-2.ppt BBB human Computer interaction
 
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service AmravatiVIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
VIP Call Girls in Amravati Aarohi 8250192130 Independent Escort Service Amravati
 
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptxBPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
BPAC WITH UFSBI GENERAL PRESENTATION 18_05_2017-1.pptx
 
BabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptxBabyOno dropshipping via API with DroFx.pptx
BabyOno dropshipping via API with DroFx.pptx
 
Industrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdfIndustrialised data - the key to AI success.pdf
Industrialised data - the key to AI success.pdf
 

Predicting Force Redistribution caused by bolt failures across a Plate

  • 1. Draft November 27, 2017 Metamodeling of ANSYS-Calculated Force Redistribution caused by Bolt Failures across a Plate Manas Gupte † Sahil Mohamad † Clarence Worrell Introduction A simplified ANSYS model of a plate exposed to a spatially heterogeneous loading with 208 uniformly spaced bolts has been created to estimate the force at each location in the presence of bolt failures. Figure 1 is a simplified depiction of the plate and its coordinate system. This study explores the potential for classical statistical modeling and machine learning to develop a metamodel approximation of the ANSYS model. Figure 1: Simplified Illustration of Plate and Coordinate System Sampling Strategy and Data Generation A dataset consisting of 25,000 samples, each representing some combination of failed and intact bolts, was generated using three strategies. First, a batch of approximately 5,000 samples was generated using engineering judgment to represent small clusters of failed bolts. Next, 10,000 samples were generated via Monte-Carlo sampling with 15 as the maximum number of bolt failures for any given sample. Finally, 10,000 samples were generated via Monte-Carlo sampling with 50 as the maximum number of bolt failures for any given sample. For each Monte-Carlo sample, first the number of failed bolts was sampled, and then then that number of failures was randomly assigned to specific locations. Figure 2 includes several selected samples to illustrate the range of postulated failures. 80% of the samples were selected randomly for model training, and the remaining 20% were reserved for model testing. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 2.34 1.45 1.31 1.68 1.93 2.28 1.94 1.41 1.87 1.92 1.89 1.87 1.84 1.87 1.87 1.90 1.92 1.85 1.36 1.81 2.48 2.35 1.46 1.23 1.40 2.33 2 3.90 1.87 2.30 3.49 3.74 3.48 3.60 2.50 3.97 4.25 4.25 4.22 4.20 4.20 4.21 4.26 4.25 3.93 2.41 3.36 3.56 4.07 3.34 2.12 1.97 3.87 3 3.30 1.68 1.93 2.97 3.20 3.00 3.07 2.08 3.34 3.62 3.59 3.56 3.54 3.54 3.57 3.60 3.61 3.33 2.08 2.99 3.00 3.25 2.99 1.93 1.71 3.29 4 2.73 1.48 1.58 2.39 2.72 2.65 2.56 1.66 2.77 3.02 3.04 3.01 2.98 2.99 3.00 3.02 3.03 2.76 1.70 2.51 2.67 2.68 2.44 1.62 1.50 2.74 5 2.34 1.40 1.41 2.13 2.36 2.40 2.40 1.41 2.38 2.65 2.67 2.65 2.63 2.63 2.64 2.67 2.65 2.37 1.38 2.38 2.40 2.34 2.08 1.37 1.41 2.32 6 2.05 1.34 1.20 1.81 2.03 2.25 2.24 1.16 2.05 2.33 2.37 2.37 2.35 2.36 2.36 2.38 2.33 2.05 1.15 2.21 2.20 2.09 1.82 1.23 1.31 2.05 7 1.66 1.15 0.98 1.44 1.61 2.01 1.98 0.85 1.59 1.89 1.96 1.96 1.96 1.96 1.96 1.96 1.89 1.59 0.85 2.00 2.02 1.62 1.44 0.98 1.17 1.66 8 1.19 0.82 0.77 0.82 1.09 1.35 1.20 0.68 0.92 1.06 1.08 1.08 1.08 1.07 1.08 1.08 1.05 0.92 0.70 1.21 1.36 1.08 0.84 0.76 0.81 1.18 RowRow Column
  • 2. Draft November 27, 2017 Figure 2: Selected Samples to Illustrate Range of Postulated Failures Exploratory Data Analysis The input data consist of 208 predictor variables, each representing the bolt state at a given location. The output data consist of 208 response variables, each representing the force at a given bolt location. The bolt location coordinate system is specified as r#c#. For example r3c5 refers to the bolt at Row 3, Column 5. Table 1 and Table 2 characterize the input and output data structure. 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 7 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Row Column 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 2 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 3 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 0 4 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 6 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 7 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 Row Column 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 1 1 0 0 2 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 0 0 1 0 0 3 0 0 0 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 0 0 1 0 0 1 0 0 4 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 1 0 0 0 0 5 0 0 0 0 1 1 0 0 0 1 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 1 6 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 7 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 8 0 0 0 0 1 0 0 0 0 1 0 1 1 0 0 0 0 1 0 1 0 0 0 0 0 0 Row Column
  • 3. Draft November 27, 2017 Table 1: Raw Input Data Structure r1c1 r2c1 r3c1 … r8c26 Sample 1 0 0 0 … 0 Sample 2 1 0 0 … 0 Sample 3 0 0 1 … 0 . . . . … . . . . . … . . . . . … . Sample 25,000 0 1 0 … 1 Table 2: Raw Output Data Structure r1c1 r2c1 r3c1 … r8c26 Sample 1 2.34 3.90 3.30 … 1.18 Sample 2 0.00 5.48 3.31 … 1.57 Sample 3 2.32 4.10 0.00 … 1.23 . . . . … . . . . . … . . . . . … . Sample 25,000 2.82 0.00 3.63 … 1.00 Figure 3 is a correlation plot of ANSYS-estimated bolt forces for a neighborhood of 50 bolt locations. The plot identifies various pairs of moderately correlated locations. Each correlated pair represents two immediately adjacent bolt locations. This indicates that the forces observed at any given location are largely a function of the immediately adjacent locations, and the influence of distant locations is weak, or even non-existent. Also note that the correlation is negative. This is because the force at a broken bolt location is near zero, and the force at its neighbors increases because it is absorbing the force that the broken bolt had absorbed when it was intact. So when a bolt breaks, the force at its location decreases while the force at its neighbors increases.
Figure 3: Correlation Plot of Bolt Forces at a Neighborhood of 50 Bolts

Figure 4 depicts histograms of the forces across one example row and one example column. The figure shows that the forces range between zero and 15. There also appears to be more variation in the vertical (column-wise) direction than in the horizontal (row-wise) direction.

Figure 4: Histograms of the Forces across One Example Row and One Example Column
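The apparent difference in variation can be quantified by comparing the spread of forces along one row with the spread along one column. A short sketch, reusing the example row and column indices from Appendix A:

# Forces along one example row (13 bolts, indices stepping by 8) and one
# example column (8 bolts, consecutive indices)
rowIdx <- seq(4, 100, by = 8)
colIdx <- 97:104

rowForces <- unlist(y.train[, rowIdx])
colForces <- unlist(y.train[, colIdx])

# Compare the spread in the two directions
c(rowSD = sd(rowForces), colSD = sd(colForces))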
Data Pre-Processing

The data in their raw form present at least two challenges:

• High dimensionality
• Binary (0/1) nature of the predictors

The high dimensionality (208 predictors and 208 responses) makes a full grid sampling study infeasible. However, strong symmetry between Columns 1-13 and Columns 14-26 was noted during exploratory data analysis, and this allowed rearranging the data to double the sample size from 25,000 to 50,000 samples, with each sample containing 104 bolt locations. While 50,000 samples for 104 predictors is still far from an ideal full grid, the correlation plots generated during exploratory data analysis identified that the force observed at any given location appears most influenced by its immediately adjacent neighbors. Subsequent modeling efforts may therefore focus on a local neighborhood surrounding the location to be predicted.

The binary (intact/failed) nature of the predictors significantly challenged early model fitting efforts. The input data were therefore transformed such that the:

• Predictor variable at each intact location was set to the force observed at that location in the base case of zero failures.
• Predictor variable at each failed location was changed from zero to an arbitrarily large number (100) to distinguish it from the intact locations.

This approach informs the predictor data with the known base-case force distribution across the plate. It also makes the predictor data continuous instead of binary. Table 3 characterizes the resulting pre-processed input data structure.

Table 3: Pre-Processed Input Data Structure

                 r1c1   r2c1   r3c1   ...   r8c13
Sample 1         2.34   3.90   3.30   ...    1.18
Sample 2          100   3.90   3.30   ...    1.18
Sample 3         2.34   3.90    100   ...    1.18
...
Sample 50,000    2.34   0.00   3.30   ...    1.18
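A minimal sketch of the predictor transformation described above, using illustrative object names (a binary failure matrix failed and a base-case force vector baseForce for the 104 half-plate locations); the production pipeline in Appendix A instead starts from a CSV in which failed locations are already coded as zero:

# failed:    n x 104 matrix, 1 where the bolt has failed, 0 where intact
# baseForce: length-104 vector of ANSYS forces with zero failures
# (both assumed to exist; the names are illustrative)

# Start every sample from the base-case force distribution...
x <- matrix(baseForce, nrow = nrow(failed), ncol = 104, byrow = TRUE)

# ...then flag failed locations with an arbitrarily large value (100), which
# distinguishes them from intact locations and keeps the predictor space
# continuous rather than binary
x[failed == 1] <- 100

colnames(x) <- colnames(failed)   # keep the r#c# location labels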
Multiple Linear Regression

Multiple linear regression was first attempted using all 104 locations as predictor variables and the ANSYS-estimated force at one location as the response variable. The location at Row 4, Column 6 is roughly in the center of the plate and was selected for the prediction. The caret package of R was used to implement lm with 10-fold cross-validation as follows:

ctrl=trainControl(method="cv", number=10)
modelLinear<-train(x.train[], y.train[,44], method="lm", trControl = ctrl)

Figure 5 is a histogram of the resulting model coefficients for the linear model. The vast majority of coefficient values are near zero, indicating that those parameters individually contribute negligibly to the force observed at the selected location. The coefficient of about negative three (-3) corresponds to the failed bolt itself. There are two (2) coefficients with values near 1.75, and these represent the two bolts immediately and horizontally adjacent to the failed location. Finally, the model intercept is about 2.5, which in theory would represent the force at the selected location should all bolts be failed. However, this intercept is not physically meaningful due to the poor overall fit of the linear model. In addition, no samples involved more than 50 bolt failures, so the all-failed estimate implied by the intercept is not well trained.

Figure 5: Histogram of Model Coefficient Values for Linear Model
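The coefficients behind Figure 5 can also be inspected directly from the fitted caret object; a short sketch:

# Rank the fitted coefficients by absolute magnitude; the largest terms
# should be the predicted location itself and its two horizontally
# adjacent neighbors, as discussed above
cf <- coef(modelLinear$finalModel)
head(sort(abs(cf), decreasing = TRUE), 10)   # ten most influential terms
cf["(Intercept)"]                            # intercept (~2.5 in this fit)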
Figure 6 plots the "observed" (ANSYS-calculated) bolt forces versus those predicted by the trained linear regression model. The linear regression seems to capture some of the trend at a gross level; however, it is clearly a poor model for this application. The likely explanation is that the physics of force redistribution is highly non-linear.

Figure 6: Observed (ANSYS-Calculated) versus Predicted

Figure 7 plots the residuals for the trained linear regression model. The residuals are clearly not Gaussian, reaffirming that linear regression is a poor model type for this application. The two distinct groups are likely associated with samples in which (1) there are no failures in the local neighborhood and (2) there are failures in the local neighborhood. The negatively sloped 45° line is believed to represent cases where the predicted location (Row 4, Column 6) itself is failed.

Figure 7: Residuals for Trained Linear Regression Model
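One way to test this interpretation is to split the test-set residuals by whether the predicted bolt, or one of its immediate neighbors, has failed. A sketch, assuming the Appendix A objects (x.test, y.test, modelLinear) and the column indexing in which column 44 is Row 4, Column 6 and horizontally adjacent bolts are offset by eight columns:

# Residuals on the test set for the Row 4, Column 6 model
res <- y.test[, 44] - predict(modelLinear, x.test)

# Failed locations are coded as 100 in the pre-processed predictors
selfFailed     <- x.test[, 44] == 100
neighborIdx    <- c(43, 45, 36, 52)   # above, below, left, right (assumed layout)
neighborFailed <- rowSums(x.test[, neighborIdx] == 100) > 0

# Compare the residual distributions for the three hypothesized groups
grp <- ifelse(selfFailed, "target failed",
              ifelse(neighborFailed, "neighbor failed", "no local failure"))
tapply(res, grp, summary)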
Regularized Regression

Lasso, ridge, and elastic net models were trained with 10-fold cross-validation using the glmnet package of R. The mean squared error of each trained model is as follows:

• MSEridge = 4.9
• MSEelastic = 5.7
• MSElasso = 5.3

The response variables generally range from 0 to 7, so the above mean squared errors indicate poor fits, similar to the multiple linear regression model. This again is likely due to the physics of force redistribution being non-linear.

Next, the significance of each model coefficient was categorized using a 1% (absolute value >= 0.01) threshold. Figure 8 depicts this classification and indicates that the locations immediately adjacent to the predicted location (x=11, y=2) are most significant, and all other locations are insignificant.

Figure 8: Significance Classification (using 1% Threshold) of each Bolt Location
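The three regularized fits follow the usual glmnet pattern; a condensed sketch with 10-fold cross-validation, using the train.x and train.y objects defined in Appendix A (the full sweep over intermediate alpha values is also in Appendix A):

library(glmnet)

# train.x: predictor matrix; train.y: force at the target location (r2c11)
cvRidge   <- cv.glmnet(train.x, train.y, alpha = 0,   nfolds = 10)  # ridge
cvElastic <- cv.glmnet(train.x, train.y, alpha = 0.5, nfolds = 10)  # elastic net
cvLasso   <- cv.glmnet(train.x, train.y, alpha = 1,   nfolds = 10)  # lasso

# Coefficients at the CV-selected lambda, classified with the 0.01
# magnitude threshold used for Figure 8 (intercept included)
cf <- coef(cvRidge, s = "lambda.min")
significant <- rownames(cf)[abs(as.vector(cf)) >= 0.01]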
k-Nearest Neighbors

k-Nearest Neighbors was tested using the bolt at Row 4, Column 6 as the location to be predicted. The caret package of R was used to implement kNN with 10-fold cross-validation over a tuning parameter space of k = 1 to 30 as follows:

ctrl=trainControl(method="cv", number=10)
modelKNN<-train(x=x.train[1:10000,], y=y.train[1:10000,44], method="knn",
                preProc=c("center","scale"), tuneGrid=data.frame(.k=1:30),
                trControl=ctrl)

Figure 9 plots the cross-validated model accuracy (measured by RMSE) over a range of model complexities between k = 1 and k = 30 neighbors. This plot indicates that the optimal (lowest-RMSE) kNN model has seven (7) neighbors.

Figure 9: kNN RMSE over Range of Neighbors used for Model Training

Figure 10 plots the observed (ANSYS-calculated) forces versus those predicted by the k = 7 kNN model. The model struggles to reflect any of the observed variation. This poor performance may be related to inadequate sample size. The exploratory data analysis and the linear models identified that the force at any one location is largely driven by the status of the immediately adjacent locations. When the kNN model is trained to predict the force at one location, in this case Row 4, Column 6, there are likely not enough samples involving variation immediately surrounding that location to adequately train the model. The vast majority of samples, whose variation is distant from the location of interest, are largely "wasted" and may impose noise onto the learning process.
Figure 10: Observed (ANSYS-Calculated) versus Predicted

Given the kNN challenges with insufficient sample size relative to the high dimensionality, and the insight from the linear models that only conditions at the adjacent bolts are significant, it may be worth:

• Reshaping the training data such that the response at any one location is modeled as a function of only the eight (8) immediately adjacent neighbors of the bolt location to be predicted.

Kriging

Kriging was tested using the gstat package of R, which is commonly used for two-dimensional interpolation problems, such as the gold mine predictions for which the method was originally developed by Daniel Krige. The basic R coding is summarized as follows:

forces.vgm <- variogram(force~1, krigSample)
forces.fit <- fit.variogram(forces.vgm, model=vgm(1, "Ste", 1))
plate.kriged <- krige(force ~ 1, krigSample, plate.grid, model=forces.fit)

Three Kriging models were trained, and the results are visualized in Figures 11-13.
Figure 11: Predict Forces between all Bolt Locations, with No Failed Bolts, and the Forces at all Bolts Known

Figure 12: Predict Forces between all Bolt Locations, Given a Random Sample of Nine (9) Failed Bolts, and the Forces at all Bolts Known

Figure 13: Predict Forces at the 34 Locations Immediately Adjacent to the Nine (9) Failed Bolts, Given the Forces at all Bolt Locations except the 34 Adjacent to Failures
The predictions in Figure 12 and Figure 13 are of the same sample involving nine (9) failed bolts. However, in Figure 12 the model was trained with the force at all bolt locations known, so the Kriging prediction is essentially an interpolation between known values, and the predicted heat map is likely close to the true heat map. In Figure 13, Kriging is predicting forces at bolt locations surrounding the failures, not simply interpolating between known forces at each location.

Note one distinction between the figures: in Figure 12, the failures are shown as bright green circles indicating zero force, which is the form of the raw ANSYS output data. In Figure 13, the failed locations have been changed to a value of five (5) to make the Kriging interpolation more realistic, since interpolating to a value of zero would erroneously indicate low forces surrounding each failure.

Comparing Figure 12 and Figure 13, Kriging performed reasonably well. There is some over-estimation in Figure 13, likely due to choosing a value of five (5) to represent the force at each failed bolt location. The Kriging model could be tuned by adjusting this value.

Principal Component Analysis (PCA) and Artificial Neural Network (ANN)

First, contourf() plots of the correlation within the input and output data were generated; these plots are shown in Figure 14 and Figure 15, respectively.

Figure 14: contourf() Correlation Plot of Input Space

Figure 14 indicates essentially zero correlation within the input space, which is sensible given that the input space simply indicates where the randomly selected bolt failures are located in each sample.
Figure 15: contourf() Correlation Plot of Output Space (Left) and Sample Heat Map (Right)

The left panel of Figure 15 similarly indicates near-zero output-space correlation across most of the plate. However, there are two distinct and discontinuous lines of strong negative correlation parallel to the diagonal. These lines are offset by exactly eight (8) index locations from the diagonal, which corresponds to the two bolts horizontally adjacent to each failed bolt location. The negative correlation indicates that when a failure occurs, the force absorbed at that location decreases to zero, while the forces absorbed at the adjacent locations increase. Bolts beyond the immediately adjacent locations show no change.

The discontinuous character of the two parallel lines of negative correlation is a feature of the plate design that can be seen in the right panel of Figure 15. The right panel is a heat map of the forces at each location assuming no bolt failures. The discontinuities at Columns 2-3 and 8 correspond to the discontinuities in the contourf() plot; these are sections of the plate design that, because of their configuration, are known to have little influence on the adjacent portions of the plate. The left panel indicates a third discontinuity near the 40th index, but this is not as clear in the heat map.

Next, PCA was applied to the response data using the pca() function of MATLAB 2015b. Figure 16 plots the cumulative percent of variance explained as a function of the number of scores retained. The plot indicates that 50 scores would need to be retained to explain 90% of the variance, which would represent a ~50% dimensionality reduction.
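For reference, the equivalent decomposition in R is a one-liner with prcomp(); a minimal sketch, assuming the training responses y.train from Appendix A (the analysis reported here used MATLAB's pca(), Appendix B):

# PCA of the 104 response columns; prcomp() centers the data by default
pcaResp <- prcomp(y.train, center = TRUE, scale. = FALSE)

# Cumulative percent of variance explained by the retained scores
varExplained <- cumsum(pcaResp$sdev^2) / sum(pcaResp$sdev^2) * 100
which(varExplained >= 90)[1]   # number of scores needed for ~90% (about 50 here)

# Reduced-dimension representation: the first 32 score columns
scores <- pcaResp$x[, 1:32]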
Figure 16: Cumulative Variance Explained by PCA Score

Finally, an artificial neural network was developed using the MATLAB Machine Learning Toolbox (nnfit). As a first attempt, all 104 predictor variables were fitted to 32 response PCA scores, which explain about 75% of the variance in the response data. A default neural network with one hidden layer containing 10 neurons was fitted using Levenberg-Marquardt backpropagation. 70% of the data was used for training, 15% for validation, and 15% for testing.

The ANN trained for six (6) hours without the run terminating on its own; inspection of the performance plot (Figure 17 below) suggested that the error had effectively converged but would not reach the minimum required for training to stop automatically.

Figure 17: ANN Training Performance

The model training was therefore terminated and the workspace saved. Figure 18 depicts the observed versus predicted values for the trained ANN model.
Figure 18: ANN Observed versus Predicted

Figure 18 indicates a relatively poor fit (correlation, R, between observed and predicted of ~70%). However, inspection of the plot shows that the bulk of the predicted data do align with the observations, but that a horizontally oriented subset of data skews the fit off-diagonal (clockwise). This horizontal band occurs where the prediction is zero (0), which is likely related to the failed bolt locations being identified as zero (0) in the predictor data.

The trained ANN was then used to predict the responses for all samples. It is acknowledged that this includes all of the data upon which the ANN was trained. First, the PCA scores were predicted by the trained ANN, and the scores were then transformed back into the original data space by multiplying the scores and the retained eigenvectors, using the following code:

predictedScores = net(predictors');
predictedScores = predictedScores';
predicted = predictedScores * pcaRESP.coeff(:,1:32)';

Figure 19 plots the observed responses versus those predicted by the PCA and ANN.

Figure 19: Observed vs. Predicted for One Selected Bolt Location
Figure 19 indicates that the PCA and ANN model was unable to predict the observations with reasonable accuracy. This is in part because the model fitting process included the following two known degradations:

• Only enough PCA scores were retained to explain 75% of the variation. This was done to manage the dimensionality so that ANN training remained feasible.
• The ANN achieved only an R = 0.60 correlation between observed and predicted.

Figure 19 also illustrates the discrete nature of the response. There are five (5) distinct groupings of response data, likely corresponding to the range of nearby bolt failures postulated in the input space. It is possible this characteristic could be used to generate a very simple model, perhaps one that is directly proportional to the number of immediately adjacent bolt failures.

Alternate Formatting of Input Data

Finally, given that all of the attempted models have struggled with dimensionality, an alternate format of the input data was developed. The new input data table has 30 columns, with each pair of columns ([1,2], [3,4], [5,6], etc.) giving the coordinate location of one failed bolt. All samples with more than 15 bolt failures were truncated from the input space, leaving ~45,000 samples available for modeling. Table 4 illustrates the alternate data format. For example, Sample 1 has one failed bolt at position (X=2, Y=2). Sample 2 has failed bolts at positions (X=5, Y=3) and (X=2, Y=8). Sample 3 has 15 failed bolt locations.

Table 4: Alternate Input Data Format

            X1   Y1   X2   Y2   X3   Y3   ...   X15   Y15
Sample 1     2    2    0    0    0    0   ...     0     0
Sample 2     5    3    2    8    0    0   ...     0     0
Sample 3     1    6    5    3   12    3   ...     6     6
...

Initial model fitting using the alternate input data format struggled to predict with reasonable accuracy. The primary suspected reason is that the output data are not similarly indexed: the output data are indexed 1:104, whereas the alternate input format is indexed by X and Y position. It is also possible that the models interpret Xn and Yn as separate predictors when in reality they are related, because the two parameters together indicate the position of one bolt.

Conclusion
Appendix A: R Coding

###Load Libraries
```{r eval=TRUE, message=FALSE, warning=FALSE}
library(AppliedPredictiveModeling)
library(caret)
library(corrplot)
library(dplyr)
library(ggplot2)
library(glmnet)
library(gridExtra)
library(gstat)
library(Lahman)
library(magrittr)
library(nnet)
library(scales)
library(sp)
library(stats)
library(tidyr)
```

###Import and Pre-Process Data
```{r eval=TRUE}
predictors<-read.table("_predictors.csv",header=TRUE,sep=",",dec=".")
responses<-read.table("_responses.csv",header=TRUE,sep=",",dec=".")
predictors<-predictors[,1:105]
responses<-responses[,1:105]

#Replace all zero (failed) locations with large number.
predictors[predictors==0]<-100

predictors=predictors[,2:105]
responses=responses[,2:105]
```

###Split into Training (80%) and Test (20%) Data
```{r eval=TRUE}
indices = sample(1:nrow(predictors), size=0.2*nrow(predictors))
x.test = predictors[indices,]
x.train = predictors[-indices,]
y.test = responses[indices,]
y.train = responses[-indices,]
rm(indices, predictors, responses)
```

###Exploratory Data Analysis
```{r eval=TRUE}
corrplot(cor(y.train[,1:50]),order="hclust",tl.cex = 0.75,
         title="Correlation of Bolt Forces across Locations\n(Local Neighborhood of 50 Bolts)",
         mar = c(0,0,2,0))

oneCol<-data.frame(c(y.train[,97],y.train[,98],y.train[,99],y.train[,100],
                     y.train[,101],y.train[,102],y.train[,103],y.train[,104]))
oneRow<-data.frame(c(y.train[,4],y.train[,12],y.train[,20],y.train[,28],y.train[,36],
                     y.train[,44],y.train[,52],y.train[,60],y.train[,68],y.train[,76],
                     y.train[,84],y.train[,92],y.train[,100]))

ggplot(data=oneRow, aes(x=oneRow[,1]))+
  geom_histogram()+
  ggtitle("Range of Bolt Forces across One Row")+
  xlab("Bolt Force")+
  ylab("Count")+
  theme(axis.title = element_text(face="bold",size = 18))+
  theme(axis.text = element_text(face="bold",size=12))+
  theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))

ggplot(data=oneCol, aes(x=oneCol[,1]))+
  geom_histogram()+
  ggtitle("Range of Bolt Forces across One Column")+
  xlab("Bolt Force")+
  ylab("Count")+
  theme(axis.title = element_text(face="bold",size = 18))+
  theme(axis.text = element_text(face="bold",size=12))+
  theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))

rm(oneCol,oneRow)
```

###Linear Regression
```{r, message=FALSE, warning=FALSE}
ctrl=trainControl(method="cv", number=10)
modelLinear<-train(x.train[], y.train[,44], method="lm", trControl = ctrl)

predictLinear<-x.test
predictLinear$predicted<-predict(modelLinear,predictLinear)
predictLinear$observed<-data.frame(y.test[,44])

ObsFit<-data.frame(predictLinear$observed,predictLinear$predicted)
names(ObsFit) <- c("obs", "pred")
modelstats=defaultSummary(ObsFit)
RMSE=round(modelstats[1],2)
Rsq=round(modelstats[2],2)

ggplot(data=predictLinear, aes(x=observed,y=predicted))+
  geom_point(alpha=0.1)+
  geom_abline(intercept = 0,slope = 1)+
  ggtitle("Multiple Linear Regression")+
  xlim(0,13)+
  ylim(0,13)+
  xlab("ANSYS Bolt Force")+
  ylab("Predicted")+
  theme(axis.title = element_text(face="bold",size = 18))+
  theme(axis.text = element_text(face="bold",size=12))+
  theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))+
  annotate("text",x=12,y=3,label = paste("R^2 =",Rsq),fontface=2,size=4)+
  annotate("text",x=12,y=2,label = paste("RMSE =",RMSE),fontface=2,size=4)

xyplot(resid(modelLinear) ~ predict(modelLinear),
       type = c("p","g"),
       xlab = "Predicted", ylab = "Residuals",
       main = "Linear Model Residuals\n(This is not Normal)")

coef<-as.data.frame(coef(modelLinear$finalModel))
coefSignificant<-data.frame(summary(modelLinear)$coef[summary(modelLinear)$coef[,4] <= 0.05, 4])

ggplot(data=coef, aes(x=coef(modelLinear$finalModel)))+
  geom_histogram()+
  ggtitle("Linear Model Coefficient Values")+
  #xlim(-4,3)+
  xlab("Coefficient Value")+
  ylab("Count")+
  theme(axis.title = element_text(face="bold",size = 18))+
  theme(axis.text = element_text(face="bold",size=12))+
  theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))+
  annotate("text",x=2.65,y=25,label="Intercept",fontface=2,size=4)+
  annotate("text",x=1.5,y=50,label="Immediately Adjacent\n(Left-Right)",fontface=2,size=4)+
  annotate("text",x=-1.0,y=175,label="Vast Majority\nInsignificant",fontface=2,size=4)+
  annotate("text",x=-3,y=25,label="Failed Bolt\n(Force ~ Zero)",fontface=2,size=4)

rm(coef,coefSignificant,ctrl,modelstats,RMSE,Rsq,ObsFit,modelLinear,predictLinear)
```

###Regularized Regression
```{r}
# Fit models
failure<-"r2c11"

predictors<-read.csv("_predictors.csv")
responses<-read.csv("_responses.csv")
predictors<-predictors[,-1]
responses<-responses[,-1]
responses<-responses[,1:104]

sam <- sample(1:nrow(predictors),floor(0.8*nrow(predictors)))
train.x <- as.matrix(predictors[sam,])
train.y <- as.matrix(responses[sam,failure])
test.x <- as.matrix(predictors[-sam,])
test.y <- as.matrix(responses[-sam,failure])

fit.ordinary <- lm(train.y ~ train.x)
hist(fit.ordinary$coefficients)

fit.lasso <- glmnet(train.x, train.y, family="gaussian", alpha=1)
dev.new(); hist(fit.lasso$df)
fit.ridge <- glmnet(train.x, train.y, family="gaussian", alpha=0)
dev.new(); hist(fit.ridge$df)
fit.elnet <- glmnet(train.x, train.y, family="gaussian", alpha=.5)
dev.new(); hist(fit.elnet$df, breaks=seq(10,200,by=10))

#
# 10-fold cross validation for each alpha (ranging between Ridge Regression and Lasso across the Elastic Net)
#
for (i in 0:10) {
  assign(paste("fit", i, sep=""),
         cv.glmnet(train.x, train.y, type.measure="mse", alpha=i/10, family="gaussian"))
}

#
# Plot solution paths (to choose regularized models within each "family")
#
dev.new()
par(mfrow=c(1,2))
plot(fit10)
grid()
plot(fit.lasso, xvar="lambda")
#
dev.new()
par(mfrow=c(1,2))
plot(fit0)
grid()
plot(fit.ridge, xvar="lambda")
#
dev.new()
par(mfrow=c(1,2))
plot(fit5)
grid()
plot(fit.elnet, xvar="lambda")

#
# Calculating MSE for each ALPHA
#
yhat0 <- predict(fit0, s=fit0$lambda.1se, newx=test.x)
yhat1 <- predict(fit1, s=fit1$lambda.1se, newx=test.x)
yhat2 <- predict(fit2, s=fit2$lambda.1se, newx=test.x)
yhat3 <- predict(fit3, s=fit3$lambda.1se, newx=test.x)
yhat4 <- predict(fit4, s=fit4$lambda.1se, newx=test.x)
yhat5 <- predict(fit5, s=fit5$lambda.1se, newx=test.x)
yhat6 <- predict(fit6, s=fit6$lambda.1se, newx=test.x)
yhat7 <- predict(fit7, s=fit7$lambda.1se, newx=test.x)
yhat8 <- predict(fit8, s=fit8$lambda.1se, newx=test.x)
yhat9 <- predict(fit9, s=fit9$lambda.1se, newx=test.x)
yhat10 <- predict(fit10, s=fit10$lambda.1se, newx=test.x)

mse0 <- mean((test.y - yhat0)^2)
mse1 <- mean((test.y - yhat1)^2)
mse2 <- mean((test.y - yhat2)^2)
mse3 <- mean((test.y - yhat3)^2)
mse4 <- mean((test.y - yhat4)^2)
mse5 <- mean((test.y - yhat5)^2)
mse6 <- mean((test.y - yhat6)^2)
mse7 <- mean((test.y - yhat7)^2)
mse8 <- mean((test.y - yhat8)^2)
mse9 <- mean((test.y - yhat9)^2)
mse10 <- mean((test.y - yhat10)^2)

print(paste0("MSE for Ridge: ", mse0))
print(paste0("MSE for Elastic Net:",mse5))
print(paste0("MSE for Lasso:", mse10))

coeff_ridge<-coef(fit0,s='lambda.min')
coeff_elastic<-coef(fit5,s='lambda.min')
coeff_lasso<-coef(fit10,s='lambda.min')

sig_coeff_ridge<-as.matrix(coeff_ridge[which(abs(coeff_ridge)>=.01),])
sig_coeff_elastic<-as.matrix(coeff_elastic[which(abs(coeff_elastic)>=.01),])
sig_coeff_lasso<-as.matrix(coeff_lasso[which(abs(coeff_lasso)>=.01),])

viz_grid<-function(coeff,failure){
  a = as.data.frame(expand.grid(1:13,1:8))
  colnames(a) = c('x', 'y')
  b<-strsplit(row.names(coeff),"[a-z]+")
  matrix<-as.matrix(coeff)
  x.c<-rep(0,length(matrix))
  y.c<-rep(0,length(matrix))
  for(i in 2:length(matrix)){
    x.c[i]<-as.numeric(b[[i]][3])
    y.c[i]<-as.numeric(b[[i]][2])
  }
  x.c<-x.c[-1]
  y.c<-y.c[-1]
  sigbolts<-as.data.frame(cbind(x.c,y.c)) #contains the coordinates
  a$indicator<-0
  for (i in 1:length(sigbolts[,2])){
    for(j in 1:length(a[,2])){
      if(sigbolts[i,1]==a[j,1] & sigbolts[i,2]==a[j,2]) a[j,3]<-1
    }
  }
  fail<-strsplit(failure,"[a-z]+")
  index<-which(a$x==fail[[1]][3] & a$y==fail[[1]][2])
  a[index,3]<-2
  ggplot() +
    geom_point(data = a, aes(x = x, y = y,color=factor(indicator))) +
    geom_point() +
    scale_y_continuous(trans='reverse') +
    scale_color_discrete(label=c("insignificant","significant","failure"))
}

viz_grid(sig_coeff_ridge,failure)
viz_grid(sig_coeff_elastic,failure)
viz_grid(sig_coeff_lasso,failure)

rm(predictors,responses,test.x,test.y,train.x,train.y)
rm(sig_coeff_elastic,sig_coeff_lasso,sig_coeff_ridge)
rm(yhat0,yhat1,yhat2,yhat3,yhat4,yhat5,yhat6,yhat7,yhat8,yhat9,yhat10)
rm(coeff_elastic,coeff_lasso,coeff_ridge)
rm(failure,fit.elnet,fit.lasso,fit.ordinary,fit.ridge)
rm(fit0,fit1,fit2,fit3,fit4,fit5,fit6,fit7,fit8,fit9,fit10,i)
rm(mse0,mse1,mse2,mse3,mse4,mse5,mse6,mse7,mse8,mse9,mse10)
rm(sam,viz_grid)
```

###K-Nearest Neighbors
```{r eval=TRUE}
ctrl=trainControl(method="cv", number=2)
modelKNN<-train(x=x.train[1:5000,], y=y.train[1:5000,44], method="knn",
                preProc=c("center","scale"), tuneGrid=data.frame(.k=1:30),
                trControl=ctrl)
print(modelKNN)
plot(modelKNN)

predictKNN<-x.test
predictKNN$predicted<-predict(modelKNN,predictKNN)
predictKNN$observed<-data.frame(y.test[,44])

ObsFit<-data.frame(predictKNN$observed,predictKNN$predicted)
names(ObsFit) <- c("obs", "pred")
modelstats=defaultSummary(ObsFit)
RMSE=round(modelstats[1],2)
Rsq=round(modelstats[2],2)

ggplot(data=predictKNN, aes(x=observed,y=predicted))+
  geom_point(alpha=0.5)+
  geom_abline(intercept = 0,slope = 1)+
  ggtitle("K-Nearest Neighbors")+
  xlim(0,13)+
  ylim(0,13)+
  xlab("ANSYS Bolt Force")+
  ylab("Predicted")+
  theme(axis.title = element_text(face="bold",size = 18))+
  theme(axis.text = element_text(face="bold",size=12))+
  theme(plot.title = element_text(hjust = 0.5,face="bold",size = 20))+
  annotate("text",x=12,y=3,label = paste("R^2 =",Rsq),fontface=2,size=4)+
  annotate("text",x=12,y=2,label = paste("RMSE =",RMSE),fontface=2,size=4)
```

###Kriging
```{r eval=TRUE}
for (i in 3:5){
  krigSample<-read.table("_krigingData.csv",header=TRUE,sep=",",dec=".")
  krigSample<-krigSample[,c(1,2,i)]
  krigSample<-na.omit(krigSample)
  colnames(krigSample) <- c("x", "y", "force")
  coordinates(krigSample) <- ~ x + y
  class(krigSample)

  krigSample %>% as.data.frame %>%
    ggplot(aes(x, y)) +
    geom_point(aes(color=force,size=force)) +
    scale_colour_gradient(low = "green", high = "red")+
    ggtitle("Bolt Force at each Location") +
    coord_equal()

  forces.vgm <- variogram(force~1, krigSample)
  forces.fit <- fit.variogram(forces.vgm, model=vgm(1, "Ste", 1))
  plot(forces.vgm,forces.fit)

  Sr1 = Polygon(cbind(c(0,0,13,13),c(0,8,8,0)))
  Srs1 = Polygons(list(Sr1), "s1")
  SpP=SpatialPolygons(list(Srs1))
  plate.grid=spsample(SpP,10000,"regular")

  plate.kriged <- krige(force ~ 1, krigSample, plate.grid, model=forces.fit)

  plot1 <- krigSample %>% as.data.frame %>%
    ggplot(aes(x, y)) +
    geom_point(aes(color=force),size=1) +
    scale_colour_gradient(low = "green", high = "red", breaks=c(1,7), labels=c(1,7), limits=c(0,7))+
    ggtitle("ANSYS Force at each Bolt Location") +
    coord_equal()

  plot2 <- plate.kriged %>% as.data.frame %>%
    ggplot(aes(x=x1, y=x2)) +
    geom_tile(aes(fill=var1.pred)) +
    coord_equal() +
    scale_fill_gradient(low = "green", high="red", breaks=c(1,7), labels=c(1,7), limits=c(0,7)) +
    scale_x_continuous(labels=comma) +
    scale_y_continuous(labels=comma) +
    theme_bw() +
    ggtitle("Krige-Interpolated Force across Plate")

  grid.arrange(plot1, plot2, ncol = 2)
}
```
Appendix B: MATLAB 2015b Coding

%% INITIALIZE AND IMPORT DATA
clc; clear all; close all;
set(0,'defaultfigurecolor',[1 1 1]);
cd ('C:\Users\worrelcl\Desktop\Clarence\IE2065 (Stat Analysis & Optimization)\Project');

predictors = csvread('_predictors.csv', 1, 1);
predictors = predictors(:,1:104);
responses = csvread('_responses.csv', 1, 1);
responses = responses(:,1:104);
numSamples = length(predictors);

pcaPRED = struct('original', predictors);
pcaRESP = struct('original', responses);

%% PRE-PROCESS DATA FROM 104 COLUMNS TO 30 COLUMNS
% where each pair of columns identify the X,Y coordinates of each failed
% bolt location. Truncate samples with more than 15 bolt failures.

% First, count number of failures for each sample,
% identify samples with <=15 failures, and
% truncate predictor/response data to samples with <=15 failures
numFailed = zeros(numSamples,1);
counter = 1;
for sample = 1:numSamples
    numFailed(sample,1) = sum(predictors(sample,:)==0);
    if numFailed(sample,1) <= 15
        keepIndex(counter,1) = sample;
        counter = counter + 1;
    end
end
predictors = predictors(keepIndex,:);
responses = responses(keepIndex,:);
numSamples = length(predictors);

% Next, get X,Y coordinates of each failure
failCols = zeros(numSamples,15);
for sample = 1:numSamples
    counter = 1;
    for col = 1:104
        if predictors(sample,col)==0
            failCols(sample,counter) = col;
            counter = counter + 1;
        end
    end
end

% Finally, populate a new 30-column predictor matrix with X,Y location of
% each failed bolt
predictorsXY = zeros(numSamples,30);
for sample = 1:numSamples
    counter = 0;
    for col = 1:15
        if failCols(sample,col) > 0
            counter = counter + 1;
            failedColIndex = failCols(sample,col);
            predictorsXY(sample,2*counter-1) = rem((failedColIndex-1),8)+1;   %X coord
            predictorsXY(sample,2*counter) = floor((failedColIndex-1)/8)+1;   %Y coord
        end
    end
end
clear col counter failedColIndex keepIndex sample

%% Correlation Plots
zPRED = zscore(pcaPRED.original);
cPRED = (zPRED'*zPRED) / (numSamples-1);
figure;
contourf(cPRED), colorbar, title('Correlation between Predictor Variables');

zRESP = zscore(pcaRESP.original);
cRESP = (zRESP'*zRESP) / (numSamples-1);
figure;
contourf(cRESP), colorbar, title('Correlation between Response Variables');

clear cPRED cRESP zPRED zRESP pcaPRED

%% PCA
[pcaRESP.coeff, pcaRESP.score, pcaRESP.latent, pcaRESP.tsquared, pcaRESP.explained] = pca(pcaRESP.original);
pcaRESP.explained = cumsum(pcaRESP.explained);
figure;
bar(pcaRESP.explained), title('Cumulative Percent of Variance Explained'), xlabel('Score'), ylabel('Percent'), xlim([0,100]);

% First 32 PCA scores used as the ANN targets
NNtargetPCA = pcaRESP.score(:,1:32);

% net: trained network object (created separately with the neural network fitting tool)
predictedScores = net(predictors');
predictedScores = predictedScores';
predicted = predictedScores * pcaRESP.coeff(:,1:32)';

boltLoc = 84;
figure;
hold on
scatter(responses(:,boltLoc), predicted(:,boltLoc) + mean(responses(:,boltLoc)));
title('Observed vs. Predicted for One Selected Bolt Location');
xlabel('Observed');
ylabel('Predicted (PCA+ANN)');
xlim([0,15]);
ylim([0,15]);
hold off