Condition Monitoring at Rolling Mills with Data-Driven Residual-Based Fault Detection
Francisco Serdio Fernández
Department of Knowledge-Based Mathematical Systems
Johannes Kepler University
Linz - Austria
francisco.serdio@jku.at
http://www.flll.jku.at/staff/francisco
Francisco Serdio, Edwin Lughofer, Kurt Pichler, Thomas Buchegger, Hajrudin Efendic
Index 
• Residual Based Approach 
• Framework 
» Data Cleaning 
» System Identification 
» Model Training 
» Model Testing 
• Reference Method 
» Principal Component Analysis – PCA 
» Multi Scale Principal Component Analysis – MSPCA 
Index 
• Current Challenges 
» Global approaches 
» Fixed thresholds 
• Artificial faults 
» Constant Failure 
» Drift Failure 
• Results 
» ROC Curves 
» Detection Rates 
• Conclusions 
• Outlook 
Basic Idea of Residual-Based Approach 
Increasing the dimensionality of the joint channel space decreases the likelihood that a
fault affects all channels with the same intensity and direction!
[Figure: joint channel space (smooth dependency). Panel labels: "Fault"; "No fault, but non-smooth pattern of signal"]
Framework
• Off-line stage
» Data cleaning
– Produces a new dataset to be used in the following step
» System identification
– Iterative process → Identifies which channels explain others
» Model training
– Produces a model for each previously identified system
• On-line stage
» Model testing
– Determines when there is a fault in the running system
Framework – Data cleaning
• Remove duplicated channels
» Duplicated? → R² greater than 0.95
• Remove outliers
» Outlier? → pairwise distance in the training data → outlier degree
• Downsample data set
» Keep the shape of the channel
• Remove constant channels
» Constant?
• Remove binary channels
» Binary?
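The duplicated-channel check can be sketched as follows. This is only an illustration under stated assumptions: R² between two channels is taken as the squared Pearson correlation (which equals the R² of a simple linear fit of one channel on the other), constant channels are assumed to have been removed beforehand, and the function name and the keep-the-first-channel policy are hypothetical rather than taken from the slides.

```python
import numpy as np

def remove_duplicated_channels(data, r2_threshold=0.95):
    """Return the indices of the channels to keep.

    data: array of shape (samples, channels).
    A channel is dropped when its R^2 with an already kept channel
    exceeds r2_threshold (assumption: R^2 = squared Pearson correlation).
    """
    data = np.asarray(data, dtype=float)
    kept = []
    for j in range(data.shape[1]):
        duplicated = False
        for k in kept:
            r = np.corrcoef(data[:, j], data[:, k])[0, 1]
            if r * r > r2_threshold:
                duplicated = True
                break
        if not duplicated:
            kept.append(j)
    return kept

# Usage: channel 1 is a linear copy of channel 0 and is dropped.
t = np.linspace(0.0, 1.0, 50)
channels = np.column_stack([t, 2.0 * t + 1.0, np.sin(20.0 * t)])
kept_channels = remove_duplicated_channels(channels)
```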
Framework – System Identification 
• Identify channel dependencies 
» Forward selection with orthogonalization 
– Achieves channel ranking according to their importance level 
for explaining target (most important first) 
» GA based feature selection (included in Box-Cox) 
– Outputs individuals with 1’s and 0’s indicating whether a 
variable is included or not 
• Determine optimal number of dimensions in ranking 
scheme 
» Find a knee in the cumulative quality sum curve 
– Determined automatically by means of the gradient
– Keeps the inputs modelling the useful information 
– Discards the inputs modelling the noise 
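A minimal sketch of forward selection with orthogonalization, assuming a Gram-Schmidt style scheme: at each step the candidate channel that explains most of the remaining target variance is selected, and the target and the remaining candidates are then orthogonalized against it. The function name and the use of incremental R² as the quality measure are assumptions made for illustration; the slides do not specify them.

```python
import numpy as np

def forward_selection_orthogonal(X, y, max_inputs=None):
    """Rank input channels by importance for explaining the target y.

    Returns (ranking, gains): channel indices, most important first, and the
    fraction of target variance each one adds (incremental R^2).
    Assumes y is not constant.
    """
    Xr = np.asarray(X, dtype=float)
    Xr = Xr - Xr.mean(axis=0)
    yr = np.asarray(y, dtype=float)
    yr = yr - yr.mean()
    total = float(yr @ yr)
    orig_norm2 = np.sum(Xr ** 2, axis=0)
    remaining = list(range(Xr.shape[1]))
    max_inputs = len(remaining) if max_inputs is None else max_inputs
    ranking, gains = [], []
    for _ in range(max_inputs):
        best, best_gain = None, 0.0
        for j in remaining:
            norm2 = float(Xr[:, j] @ Xr[:, j])
            if norm2 <= 1e-10 * orig_norm2[j]:
                continue  # channel already explained by the selected ones
            gain = float(Xr[:, j] @ yr) ** 2 / norm2 / total
            if gain > best_gain:
                best, best_gain = j, gain
        if best is None:
            break
        q = Xr[:, best] / np.linalg.norm(Xr[:, best])
        yr = yr - (q @ yr) * q            # remove the explained part of the target
        Xr = Xr - np.outer(q, q @ Xr)     # orthogonalize the remaining candidates
        remaining.remove(best)
        ranking.append(best)
        gains.append(best_gain)
    return ranking, gains

# Usage on synthetic data: the target depends strongly on channel 2, weakly on channel 0.
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 4))
y_demo = 3.0 * X_demo[:, 2] + 0.5 * X_demo[:, 0]
ranking, gains = forward_selection_orthogonal(X_demo, y_demo)
```

The cumulative sum of `gains` gives one possible "cumulative quality" curve on which a knee can be searched.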
Framework – System Identification 
• Determine optimal number of dimensions 
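One way to locate a knee by means of the gradient is sketched below. The rule used here (stop at the first input whose marginal gain falls below a fraction of the largest marginal gain) and the parameter `rel_tol` are assumptions chosen for illustration; the slides only state that the knee is determined automatically from the gradient.

```python
import numpy as np

def knee_by_gradient(cumulative_quality, rel_tol=0.05):
    """Return how many ranked inputs to keep.

    cumulative_quality[i] is the model quality when using the first i+1
    ranked inputs. The slope of this curve is the marginal gain per input;
    inputs are kept until the slope first drops below rel_tol * max slope.
    """
    curve = np.asarray(cumulative_quality, dtype=float)
    slopes = np.diff(curve, prepend=0.0)
    flat = np.flatnonzero(slopes < rel_tol * slopes.max())
    return int(flat[0]) if flat.size else int(curve.size)

# Usage: the curve flattens after the third input.
n_inputs = knee_by_gradient([0.60, 0.85, 0.95, 0.96, 0.965])
```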
Framework – Model Training
• Models applied, with stepwise increasing degree of non-linearity
» Ridge Regression (linear)
"T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference and Prediction - Second Edition. Springer, New York Berlin Heidelberg, 2009"
– Global MLR with Tikhonov regularization included
» Genetic Box-Cox (slightly non-linear)
"R.M. Sakia. The Box-Cox transformation technique: a review. The Statistician, 41:168--178, 1992."
– Combining original Box-Cox with GA
- Transform the inputs to introduce slight non-linearities
- Use linear regression over the transformed inputs
- The transformations are learnt using a GA
» SparseFIS (highly non-linear)
"E. Lughofer and S. Kindermann. SparseFIS: Data-driven learning of fuzzy systems with sparsity constraints. IEEE Transactions on Fuzzy Systems, 18(2): 396--411, 2010."
– Top-down fuzzy modeling approach applying numerical sparsity-constrained optimization, out-weighting unimportant rules and parameters
– Employing iterative VQ, projected gradient descent and Semi-Smooth Newton
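The first two model types can be illustrated with a short sketch: a ridge regression in closed form, and a Box-Cox transformation of the inputs followed by the same linear fit. The Box-Cox exponents are passed in by hand here; in the Genetic Box-Cox variant they would be learnt by a GA, which is omitted. Function names and the regularization value are assumptions for illustration only.

```python
import numpy as np

def box_cox(x, lam):
    """Box-Cox transform of a strictly positive array."""
    x = np.asarray(x, dtype=float)
    if np.any(x <= 0):
        raise ValueError("Box-Cox requires strictly positive values")
    if abs(lam) < 1e-12:
        return np.log(x)
    return (x ** lam - 1.0) / lam

def transform_inputs(X, lambdas):
    """Apply one Box-Cox exponent per input column."""
    X = np.asarray(X, dtype=float)
    return np.column_stack([box_cox(X[:, j], lam) for j, lam in enumerate(lambdas)])

def ridge_fit(X, y, alpha):
    """Ridge regression (Tikhonov regularization) with an unpenalized intercept."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xc = X - x_mean
    w = np.linalg.solve(Xc.T @ Xc + alpha * np.eye(X.shape[1]), Xc.T @ (y - y_mean))
    b = y_mean - x_mean @ w
    return w, b

# Usage: a logarithmic dependency becomes linear after a Box-Cox transform with lambda = 0.
x_demo = np.linspace(1.0, 10.0, 50)
y_demo = 2.0 * np.log(x_demo) + 1.0
Z_demo = transform_inputs(x_demo[:, None], [0.0])
w_demo, b_demo = ridge_fit(Z_demo, y_demo, alpha=1e-8)
```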
Example of Input Transformations 
Overview of Training Methods
Method            | Type                | Training effort | System Identification | Model Training
Linear Regression | Linear              | Low             | Forward selection     | 10-fold CV with MSE
Box-Cox           | Slightly non-linear | Medium          | Genetic algorithm     | 10-fold CV with MSE
SparseFIS         | Highly non-linear   | High            | Forward selection     | 10-fold CV with MSE + grid search
On-line Analysis of Residual Signals
• Computation of residuals
» The residuals are the differences between the observed values and the predicted ones
• Computation of error bars
» Two types: global and local
– Global: based on CV model error → one single value is provided for all points in the testing data set
– Local: based on adaptive confidence intervals according to variation in the data distribution over space → a different value for each point in the testing data set is provided
• Combine residuals and error bars
» The error bars are used to normalize the residuals
– The residuals are now expressed in error bar units
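The normalization step can be sketched as below. A global error bar is passed as a single number and a local one as an array with one value per test point; how these error bars are actually computed from the CV error or the confidence intervals is not shown, and the function name is a hypothetical choice.

```python
import numpy as np

def normalized_residuals(observed, predicted, error_bars):
    """Residuals (observed - predicted) expressed in error bar units.

    error_bars: a scalar (global) or one value per test point (local).
    """
    observed = np.asarray(observed, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    bars = np.broadcast_to(np.asarray(error_bars, dtype=float), observed.shape)
    if np.any(bars <= 0):
        raise ValueError("error bars must be strictly positive")
    return (observed - predicted) / bars

# Usage with a global and a local error bar.
obs = [1.0, 2.0, 4.0]
pred = [1.0, 1.5, 3.0]
res_global = normalized_residuals(obs, pred, 0.5)
res_local = normalized_residuals(obs, pred, [0.5, 0.25, 2.0])
```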
On-line Analysis of Residual Signals
• On-line tracking of the residuals
» The average μ and the standard deviation σ are tracked
– A window of time is used → values outside the tolerance band trigger a fault alarm and do not update the tracking
[Formula annotations: current residual at time instance k generated from the i-th model; incremental / decremental μ and σ over a sliding window with size T]
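A minimal sketch of such a tracker, assuming a tolerance band of the form μ ± n·σ. The band width `n_sigma`, the warm-up length `min_samples` and the class name are assumptions, since the slide text does not give the exact formula. The running sums are updated incrementally when a value enters the window and decrementally when one leaves it.

```python
from collections import deque
import math

class ResidualTracker:
    """Tracks mean and standard deviation of residuals over a sliding window."""

    def __init__(self, window_size=50, n_sigma=3.0, min_samples=10):
        self.window = deque()
        self.window_size = window_size
        self.n_sigma = n_sigma          # assumed width of the tolerance band
        self.min_samples = min_samples  # warm-up before alarms are raised
        self.total = 0.0
        self.total_sq = 0.0

    def _stats(self):
        n = len(self.window)
        mean = self.total / n
        var = max(self.total_sq / n - mean * mean, 0.0)
        return mean, math.sqrt(var)

    def update(self, residual):
        """Return True if the residual triggers a fault alarm."""
        if len(self.window) >= self.min_samples:
            mean, std = self._stats()
            if abs(residual - mean) > self.n_sigma * std:
                return True  # alarm: the value does not update the tracking
        # incremental update with the new value
        self.window.append(residual)
        self.total += residual
        self.total_sq += residual * residual
        # decremental update when the window exceeds its size
        if len(self.window) > self.window_size:
            old = self.window.popleft()
            self.total -= old
            self.total_sq -= old * old
        return False

# Usage: stable residuals around 1.0, then a jump.
tracker = ResidualTracker(window_size=20, n_sigma=3.0, min_samples=10)
warmup_alarms = [tracker.update(0.9 if i % 2 == 0 else 1.1) for i in range(30)]
jump_alarm = tracker.update(5.0)
```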
Dynamic Residual Signals Analysis - Example
[Figure: residual signals for a fault with 50% level and a fault with 10% level]
Reference methods
• Principal Component Analysis – PCA
» State of the art in fault detection
– D. Garcia-Alvarez. Fault detection using principal component analysis (PCA) in a wastewater treatment plant (WWTP). In Proceedings of the 62-th Int. Student's Scientific Conference, 13-17, Saint-Petersburg, Russia, 2009.
– P.F. Odgaard, B. Lin, and S.B. Jorgensen. Observer and data-driven-model-based fault detection in power plant coal mills. IEEE Transactions on Energy Conversion, 23(2): 659-668, 2008.
» The monitoring can be reduced to two variables (T² and Q) characterizing two orthogonal subspaces of the original space
– Hotelling's T² represents the major variation in the data
– Q represents the random noise in the data
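The two monitoring statistics can be sketched with plain NumPy as follows. This is a generic textbook formulation and not necessarily the exact variant used in the cited works; the function names are hypothetical and the thresholds on T² and Q are not shown.

```python
import numpy as np

def fit_pca_monitor(X_train, n_components):
    """Fit the PCA model: mean, loadings and variances of the retained components."""
    X = np.asarray(X_train, dtype=float)
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    loadings = vt[:n_components].T
    variances = s[:n_components] ** 2 / (X.shape[0] - 1)
    return mean, loadings, variances

def t2_and_q(x, mean, loadings, variances):
    """Hotelling's T^2 (variation inside the model) and Q (squared residual)."""
    xc = np.asarray(x, dtype=float) - mean
    scores = xc @ loadings
    t2 = np.sum(scores ** 2 / variances, axis=-1)
    residual = xc - scores @ loadings.T
    q = np.sum(residual ** 2, axis=-1)
    return t2, q

# Usage: training data lies on the line x1 = x2.
t = np.linspace(-1.0, 1.0, 21)
train = np.column_stack([t, t])
pca_model = fit_pca_monitor(train, n_components=1)
t2_off, q_off = t2_and_q([1.0, -1.0], *pca_model)  # breaks the channel dependency
t2_far, q_far = t2_and_q([3.0, 3.0], *pca_model)   # follows it, but far from training data
```

In this toy example the first point shows up only in Q and the second only in T².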
Reference method (cont’d)
• Multi Scale Principal Component Analysis – MSPCA
» State of the art in process monitoring
– B.R. Bakshi. Multiscale PCA with application to multivariate statistical process monitoring. AIChE Journal, 44, 1596-1610, 1998.
» It uses wavelets to reconstruct the original signal
– The reconstruction attempts to remove useless information from the signal, mainly noise
» Monitoring uses the same statistics as in PCA
– Hotelling's T² represents the major variation in the data
– Q represents the random noise in the data
Current Challenges
• Global approaches
» PCA and MSPCA use the dataset as a whole
» When channels are added to or removed from the system, the method should be trained again
– Low cascadability
» It’s a rigid approach
• Fixed thresholds
» PCA and MSPCA use a fixed threshold based on training data
– Does not take into account train and test dataset differences
– When train and test differ considerably, the approach becomes useless
– The threshold remains unchanged during the online operation of the system
Artificial Faults
• Artificial faults were introduced in the data
» Regions where channel values are zero were ignored
• Different fault types with different intensities
» Fault types
– Constant failure: means a jump in the original signal
– Drift failure: means a progressive increase in the original signal
– Different slopes → different shapes (from exponential to logarithmic)
» Fault intensities (% added to the original signal)
– 5%, 10%, 20%, 50%, 100%
• Introduction of faults was shuffled 10 times to avoid unlucky situations (due to a bad coverage of faulty channels)
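The two fault types can be sketched as below, assuming the intensity is applied as a fraction of the original signal from a given start index onwards. The `shape` exponent used to bend the drift ramp is a hypothetical stand-in for the different drift shapes mentioned above; function names are likewise illustrative.

```python
import numpy as np

def add_constant_fault(signal, start, level):
    """Jump: add level (e.g. 0.10 for 10%) of the signal from index start on."""
    faulty = np.array(signal, dtype=float)
    faulty[start:] *= 1.0 + level
    return faulty

def add_drift_fault(signal, start, level, shape=1.0):
    """Drift: the added fraction grows from 0 to level between start and the end.

    shape = 1 gives a linear ramp; other values bend the ramp (assumption).
    """
    faulty = np.array(signal, dtype=float)
    ramp = np.linspace(0.0, 1.0, faulty.size - start) ** shape
    faulty[start:] *= 1.0 + level * ramp
    return faulty

# Usage on a flat channel with a 50% fault level.
clean = np.ones(10)
jumped = add_constant_fault(clean, start=5, level=0.5)
drifted = add_drift_fault(clean, start=5, level=0.5)
```

Because the fault is a percentage of the signal, it has no effect where the channel is zero, which is consistent with ignoring those regions.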
Artificial Faults Examples 
Results
• ROC Curves
» For sensitivity analysis, plotting true positives vs. false positives → Detection vs. Overdetection
» Depict the following useful information
– How much the detection rate influences the overdetection rate
– How sensitive the method is to its parameters
– Which method is best
– A higher AUC (Area Under the Curve) points to a better method, as higher detection rates (y-axis, values far from the x-axis) can be achieved with lower false alarm rates (x-axis, values close to the y-axis).
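A ROC curve of this kind can be sketched by sweeping an alarm threshold and recording detection and overdetection rates, then integrating with the trapezoidal rule. The names and the choice of the threshold as the swept parameter are assumptions for illustration; the data below is made up and does not reproduce any result from the slides.

```python
import numpy as np

def roc_points(scores, is_fault, thresholds):
    """Detection and overdetection rate for each alarm threshold."""
    scores = np.asarray(scores, dtype=float)
    is_fault = np.asarray(is_fault, dtype=bool)
    detection, overdetection = [], []
    for thr in thresholds:
        alarm = scores > thr
        detection.append(alarm[is_fault].mean())       # true positive rate
        overdetection.append(alarm[~is_fault].mean())  # false positive rate
    return np.array(overdetection), np.array(detection)

def area_under_curve(overdetection, detection):
    """Trapezoidal AUC, with the end points (0, 0) and (1, 1) added."""
    x = np.concatenate(([0.0], overdetection, [1.0]))
    y = np.concatenate(([0.0], detection, [1.0]))
    order = np.lexsort((y, x))  # sort by x, ties broken by y
    x, y = x[order], y[order]
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

# Usage: faulty samples get clearly higher scores than fault-free ones.
demo_scores = [0.1, 0.2, 0.8, 0.9]
demo_labels = [False, False, True, True]
over, det = roc_points(demo_scores, demo_labels, thresholds=[0.0, 0.5, 1.0])
auc_separable = area_under_curve(over, det)
```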
Results – Multi Scale PCA
• Proves to be useless for our problem
» The wavelet reconstruction is not able to reconstruct the signals properly
– Poor channel reconstruction
– The percentage of channels reconstructed using the wavelets with accuracy greater than or equal to 90% is around 55% to 65% of the total number of channels for all the datasets
– Noise is introduced during the channel reconstruction, even in the channels reconstructed with good quality
» Unacceptable overdetection rates in all the datasets
– The method is not able to operate below a 10% overdetection rate → useless for our problem
Results – Multi Scale PCA 
Results – ROC Curves – Scenario 1 
Results – ROC Curves – Scenario 2 
Results – ROC Curves – Scenario 4 
Results – Detection Rates – Scenario 1
Results – Detection Rates – Scenario 2
Results – Detection Rates – Scenario 4
Statistical preference of methods 
• Two statistical tests using 
» (i) Rankings / (ii) Absolute detection rates 
– Plus denotes significant superiority over the other methods 
– Minus denotes inferiority to the other methods 
– 0 indicates no difference 
– na indicates not applicable 
Conclusions
• MSPCA is not applicable in our problem
• PCA is either not applicable or outperformed by our residual-based approach
• In the pessimistic (real-world) case, Box-Cox showed the best performance, thus favoring slight non-linearities in the models
• A significant performance boost over the pessimistic case could be recognized for all models
» Fault misses can be largely explained by not having a (good) model available for a channel where a fault occurs!
Outlook 
• Deal with the non-stable behaviour of the residuals 
(enhanced pattern analysis, model update schemes) 
• Deal with the data from different products 
(probably operator’s feedback required) 
Thanks a lot for your attention! 

IFAC MIM 2013
