Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

Distillation Column Process Fault Detection in the Chemical Industries


Published on

Chemical plants are complex large-scale systems which need designing robust fault detection schemes to ensure high product quality, reliability and safety under different operating conditions. The present paper is concerned with a feasibility study of the application of the black-box modeling method and Kullback Leibler divergence (KLD) to the fault detection in a distillation column process. A Nonlinear Auto-Regressive Moving Average with eXogenous input (NARMAX) polynomial model is firstly developed to estimate the nonlinear behavior of the plant. Furthermore, the KLD is applied to detect abnormal modes. The proposed FD method is implemented and validated experimentally using realistic faults of a distillation plant of laboratory scale. The experimental results clearly demonstrate the fact that proposed method is effective and gives early alarm to operators.

Published in: Technology
  • Be the first to comment

  • Be the first to like this

Distillation Column Process Fault Detection in the Chemical Industries

  1. 1. Research Article Fault detection in the distillation column process using Kullback Leibler divergence Lakhdar Aggoune n , Yahya Chetouani 1 , Tarek Raïssi 2 a Laboratoire d’Automatique de Sétif, Département d’Electrotechnique, Université de Sétif 1, Cité Maabouda, Route de Béjaia, 19000 Sétif, Algeria b Université de Rouen, Département Génie Chimique, Rue Lavoisier, 76821 Mont Saint Aignan Cedex, France c Conservatoire National des Arts et Métiers, Département EASY, Cedric-laetitia, 292, Rue St-Martin, case 2D2P10, 75141 Paris Cedex 03, France a r t i c l e i n f o Article history: Received 28 December 2014 Received in revised form 9 September 2015 Accepted 13 March 2016 Available online 25 March 2016 This paper was recommended for publica- tion by Didier Theilliol. Keywords: Fault detection Safety NARMAX model Kullback Leibler divergence Dynamic processes a b s t r a c t Chemical plants are complex large-scale systems which need designing robust fault detection schemes to ensure high product quality, reliability and safety under different operating conditions. The present paper is concerned with a feasibility study of the application of the black-box modeling method and Kullback Leibler divergence (KLD) to the fault detection in a distillation column process. A Nonlinear Auto-Regressive Moving Average with eXogenous input (NARMAX) polynomial model is firstly developed to estimate the nonlinear behavior of the plant. Furthermore, the KLD is applied to detect abnormal modes. The proposed FD method is implemented and validated experimentally using realistic faults of a distillation plant of laboratory scale. The experimental results clearly demonstrate the fact that proposed method is effective and gives early alarm to operators. & 2016 ISA. Published by Elsevier Ltd. All rights reserved. 1. Introduction With advances in modern engineering processes, fault and change detection are becoming more and more important for increasing the reliability, safety, availability, and environmental protection. The central goal is to improve the continuity and quality levels of the industrial production for a wide range of plants. To this end, much effort has been devoted to the devel- opment of fault detection (FD) schemes with various assumptions [1–5]. The diversity of the FD strategies developed in the past dec- ades, was required by the growing complexity of industrial installations. Among the most well-known is the model based approaches [6–8] which have found great interest since many modern industries require a faster detection time and the on-line implementation of the FD methods. In practice, the latter methods are based on the use of mathematical models. The FD procedure includes residual generation and residual evaluation. Usually, residual is generated by using the difference between the pre- dicted output and real measurements data. For this kind of FD techniques, the performance of detection depends very much on the accuracy of the model used. Process modeling forms a significant role in many fields of engineering systems. There are two different methods of modeling approaches: first principles models or black-box models. First principles models are based on physical laws in order to obtain the relationship between the input and output. Unfortunately in this approach it is necessary to know accurately the complete knowl- edge of the plant. Insufficient knowledge about the physical properties of the process prevents the design and implementation of FD techniques, and decreases the security (human and envir- onmental). As an alternative strategy, the black-box modeling, based on experimental data, does not need any prior knowledge about the physical insight of the system. It can be used to solve the problem of modeling in most cases encountered in engineering practice. On the other hand, the combination of these two modeling approaches gives a grey-box modeling. This method defines the model structure from a first principles models and determines the parameters from a experimental data using estimation techniques. Because of its satisfactory results in change detection, many researchers have implemented black-box models for monitoring process industrial systems such as, artificial neural networks [1], set membership identification [9], subspace identification [10], polynomial models [11,12]. However, in this Contents lists available at ScienceDirect journal homepage: ISA Transactions 0019-0578/& 2016 ISA. Published by Elsevier Ltd. All rights reserved. n Corresponding author. Tel./fax: þ213 36 61 12 11. E-mail address: (L. Aggoune). 1 Fax: þ33 2 35 14 61 30. 2 Tel.: þ33 1 40 27 21 69. ISA Transactions 63 (2016) 394–400
  2. 2. study we are interested only in an NARMAX polynomial model according to our experimental investigation. Once the residual is generated by using the model of the supervised process, its evaluation is necessary in order to reveal any deviation from the normal mode of the system. Due to pre- sence of modeling errors and measurement noises, residual eva- luation often includes a reliable decision module which can be obtained by using statistical hypothesis testing [6,13]. As an alternative approach, Kullback–Leibler divergence (KLD) [14] is proposed in this work. The KLD is a good tool to evaluate the similarity between two distributions. This paper investigates the modeling and parameter estimation of a chemical plant as a distillation column by using the black-box modeling approach based on NARMAX structures whose parameters are estimated by using the output error with extended prediction model (OEEPM) algorithm for the specific purpose of change detec- tion. An optimal NARMAX model was selected using statistical cri- teria such as Akaike’s Information Criterion (AIC), Root Mean Square Error (RMSE) and Nash-Sutcliffe Efficiency (NSE). The difference between the current measurement and the estimated output by the NARMAX model is defined as the residual of the monitored plant. The residual is compared to a fault detection threshold with the KLD which measures the probability distribution of the current residual compared to a reference one. The main contribution consists in the combination of the black-box identification method and KLD for the design of a fault detection scheme. The organization of the paper is the following: Section 2 gives a description of the Kullback–Leibler divergence as a fault detection criterion. Section 3 presents the experimental set-up. Section 4 describes the system identification procedure. Section 5 presents the experimental results of the proposed FD technique. Finally, some concluding remarks are given in Section 6. 2. Kullback-Leibler divergence Kullback-Leibler divergence (KLD), or relative entropy, is a measure of the similarity between two probability distributions. This divergence has been applied successfully in many fields such as pattern recognition [15,16]. Harmouche et al. [17] developed a fault detection approach under the PCA context which firstly used a PCA statistical model to extract a reference probability dis- tribution for each latent principal score from data and then monitored the probability distributions of these scores for each new set of data by using the KLD. They demonstrated its superior fault detection performance over the Hotelling T2 statistic. Houerbi et al. [18] proposed a change detection method for scan surveil- lance in Internet Networks. They used KLD for monitoring the variation of the distribution of scan features in a space spanned by IP source address, IP destination address, source port, and desti- nation port numbers. In the current work, The KLD can efficiently detect the abnor- mal events in distillation column by monitoring the dissimilarity between the reference probability density function (pdf) of the normal mode and the actual one based on the residual of the identified model. The KLD is an example of f divergence measure which is used to quantify discrepancy between pairs of probability distributions. In order to measure the difference between two discrete pdfs p0 (ε) (normal mode) and p1 (ε) (abnormal mode) of random variable ε, the KLD is defined as follow: KLðp1ðϵÞ‖p0ðϵÞÞ ¼ XN i ¼ 1 p1 ϵið Þlog p1 ϵið Þ p0 ϵið Þ ð1Þ where N is the data number and ε represents the sequence of the residual given by the difference between the process output yðtÞ and its estimate ^yðtÞasεðtÞ ¼ yðtÞÀ ^yðtÞ. Two fundamental properties of KLD are: 1. Positivity i.e. KLðp1ðϵÞ p0ðϵÞÞZ0 with equality if and only if p1ðεÞ ¼ p0ðεÞ, 2. Asymmetry i.e. KLðp1ðϵÞ p0ðϵÞÞaKLðp0ðϵÞ p1ðϵÞÞ . If p0 (ε) ¼ p1 (ε), the actual pdf corresponds to the one obtained in normal mode and the KLD is close to zero. Otherwise, large values of KLD correspond to the case of distributions p1 (ε) and p0 (ε) totally different and it is easy to detect an abnormal mode. The goal of the proposed FD technique is to detect the presence of an abnormal mode using KLD. The basic idea of this method is to compute the difference between the probability density func- tions of normal behavior and abnormal event, which can be rea- lised by calculating the KLD between the two distributions. The detection of faults affecting the process can be formulated as a hypothesis testing problem, considering two possible operating conditions or hypotheses: the null hypothesis H0, where the parameters of KLD are the same as those of the normal operating conditions, and the hypothesis H1 is when the parameters are different to those of the normal process behavior (anomaly). KLD values can run from zero to infinity. The KLD can be easily computed in the case of normal dis- tributions. Given two normal densities p0 (ε) and p1 (ε) such that p0 εð Þ $ Nðμ0; σ2 0Þ andp1 εð Þ $ Nðμ1; σ2 1Þ, where μ0; μ1 are the means and σ0; σ1are the standard deviations for p0 (ε) and p1 (ε) respectively. In this case the KLD may be written as [14]: KLðp1ðϵÞ‖p0ðϵÞÞ ¼ 1 2 σ2 1 σ2 0 þ ðμ1 Àμ0Þ2 σ2 0 þ log σ2 0 σ2 1 À1 ! ð2Þ The standard deviation of the distribution p1 (ε) is assumed unchanged after the occurrence of a fault σ0 ¼ σ1ð Þ. Then, we can write: KLðp1ðϵÞ‖p0ðϵÞÞ ¼ 1 2 ðμ1 Àμ0Þ2 σ2 0 ! ð3Þ μ0 and σ0 are obtained from the measurement of the residual of NARMAX model recorded in normal and safe operating modes, while the parameter μ1 is calculated at the end of each fixed sampling interval. Considering Eq. (3), the above hypothesis can be formulated in terms of KLD as: H0 : KLðp1ðϵÞ‖p0ðϵÞÞrTD H1 : KLðp1ðϵÞ‖p0ðϵÞÞ4TD ð4Þ where TD is the predetermined threshold. The value of TD is cal- culated based on the three sigma rule [19–21]. 3. Experimental device The process used in this work is a distillation column, a separation process habitually found in the refinery and petro- chemical industries. Fig. 1 represents the setup considered here. A toluene methylcyclohexane mixture is introduced in the tank in order to be separated with a mass composition at 23% methylcy- clohexane. Feed preheating system is constituted by three ele- ments of 250 W each one. The reciprocating feed pump is constituted by a membrane allowing firstly the suction of the mixture and the discharge towards the tank with a flow capacity F¼4.32 L/h. The column has also a reboiler of 2 L hold-up capacity, L. Aggoune et al. / ISA Transactions 63 (2016) 394–400 395
  3. 3. an immersion heater of a power Qb ¼3.3 kW, and of a level liquid switch sensor which allows the automatic stop of heating if the level is insufficient. The internal packing is made of Multiknit stainless 316 L which enhances the mass transfer between the vapor and liquid phases. A condenser is placed at the column overhead in order to condense the entire vapor coming out from the column. The cooling medium used in exchangers is water. The heat-transfer area of the total overhead condenser is 0.08 m2 . The thermocouples were coupled to a calibrated amplification cir- cuit (4–20 mA, 0–150 °C) whose signals are accessible on-line from computer, which permits the bottom and top tempera- tures to be obtained. The unit has twelve PT100 sensors which measure continuously the temperature throughout the col- umn. The measurements of top and bottom pressures of the separation unit are realized by two WIKA ECO-Tronic pressure transmitters. All sensors are interfaced with a control compu- ter using RS485/232 converters. A ETP-200 program is used to register the variables. In this set-up, the most significant variables that can be mea- sured are reflux timer (Rt), heating power (Qb), pressure drop (ΔP), preheating power (Qf), preheating temperature (Tf), feed flow rate (F), cooling rate (Qc) and overhead temperature (Td). The column is kept at a constant operating point for four hours to ensure that the column is in steady-state. The values of the measurable variables obtained on average from the nominal steady-state regime are: Qb of 45%, Rt of 14%, F of 50%, Td of 102 °C, Tf of 80 °C, Qc of 250 L/h. 4. NARMAX model system identification The problem of modeling is important in the context of simu- lation, control and fault detection and diagnosis. Several approa- ches have been developed in the literature [22–24]. The steps in the modeling procedure may be stated as follows: 1. Model structure detection, determining the model form suitable for the system of interest. 2. Model parameter estimation, estimating the unknown para- meters contained within the specified model, using experimental data. 3. Model validation, this last step includes testing to verify if the established model provides good accuracy for a fresh data set that was not used during the training phase. For nonlinear systems, one popular black-box identification method is to use the polynomial models, such as NARMAX one [22,25]. In case of multiple input, single output (MISO) systems this model is given by: yðtÞ ¼ f yðtÀ1Þ; :::; yðtÀnyÞ; u1ðtÀd1Þ; :::; u1ðtÀd1 Ànu1Þ; À …; upðtÀdpÞ; :::; upðtÀdP ÀnupÞþeðtÀ1Þ; :::; eðtÀneÞ Á þeðtÞ ð5Þ where y(t), ui(t), and e(t) represent the system output, input and noise at the discrete time t, respectively; p is the dimension of the input vector. ny; nui ; ne are the maximum time lags of the output, input, and noise, respectively;di is the input delay. f is a nonlinear function which taken here to be a polynomial expansion of its arguments with nonlinearity degree l. The expression of a poly- nomial NARMAX model is obtained as follows: yðtÞ ¼ Xn i1 ¼ 1 βi1 xi1 ðtÞþ Xn i1 ¼ 1 Xn i2 ¼ i1 βi1;i2 xi1 ðtÞxi2 ðtÞþ… þ Xn i1 ¼ 1 … Xn il ¼ il À 1 βi1;…il xi1 ðtÞ…xil ðtÞþ…þeðtÞ ð6Þ where n ¼ ny þnu1 þ:::þnup þne, β’s the model parameters, and x’s represent lagged output, input, and noise terms. The ARMAX model is a special case of the NARMAX one, which is obtained by setting l¼1. 4.1. NARMAX parameters estimation algorithm The parameter estimation of the NARMAX model has received much attention and many algorithms have been developed [26,27]. In this work, the recursive algorithm such as output error with extended prediction model (OEEPM) method is used mainly for its simplicity and superior performance [28]. Eq. (6) clearly belongs to the class of linear-in-the-parameter regression models. In order to estimate the parameters of the NARMAX model, Eq. (6) should be expressed as: yðtÞ ¼ φT ðtÞθþeðtÞ ð7Þ where φT ðtÞ includes all the output, inputs, and noise terms as well as all possible combinations up to degree l and up to time (t-1), and θ is the vector which includes the model parameters to be estimated. If the noise e(t) is known, an ordinary recursive least square (RLS) method can be used for parameters estimation. However, in general, the noise is not measurable, and the sequence e(t) is estimated iteratively as: εðtÞ ¼ yðtÞÀ ^yðtÞCeðtÞ ð8Þ where ε(t) is the residual at time t and the predicted output ^y(t) can be written as: ^yðtÞ ¼ φT ðtÀ1Þ^θ ð9Þ The parameter vector θ can be estimated by OEEPM algorithm, discussed in detail in [23]. Considering Eq. (7), one can add and subtract the term 7ða1 ^yðtÀ1Þ; :::; any ^yðtÀnyÞÞ. Note now that the regression vector φT ðtÞ includes all the predicted output, inputs, and noise terms as well as all possible combinations up to degree l and up to time (t-1). The initial values of ε(t) i.e. ε(0), ε(À1),…, ε (Àne) are set to zero. The complete algorithm used in the following Fig. 1. Experimental device: distillation column. L. Aggoune et al. / ISA Transactions 63 (2016) 394–400396
  4. 4. is described by: ^θðtÞ ¼ ^θðtÀ1ÞþPðtÞφðtÞεðtÞ PðtÞ ¼ 1 λ PðtÀ1ÞÀPðt À 1ÞφðtÞφðtÞT Pðt À 1Þ λþφðtÞT Pðt À1ÞφðtÞ εðtÞ ¼ yðtÞÀφðtÞT ^θðtÞ 8 : ð10Þ where λ is a forgetting factor, ^θ denotes the estimated value, ε(t) is the residual, and P denotes the adaptation gain. 4.2. Proper NARMAX model selection The first stage towards modeling a particular system is to select appropriate inputs and output. For this purpose, experimental tests are performed which was carried out to obtain a rich mixture in methylcyclohexane. Rt, Qb, ΔP, Qf, Tf represent the main mea- surable input variables of the process and Td its output. The aforementioned work has been reported in [29]. To identify the parameters of the model describing Td, time series of relevant data were collected continuously for 13 hours. All data are collected with a sampling period of 11 s. Experiment is performed with the aim of generating estimation and validation data rich in amplitudes and in frequencies. When the input sig- nals are modified, Td ranges from 101.5 °C to 103.5 °C. According to [23,24], once the data are collected, the first two thirds of the data recorded are used to estimate the system parameters and the remaining data for model validation. These data are presented in Figs. 2 and 3. The latter show the evolution of ΔP, Rt, Qf and Qb between 1540 s and 2365 s for better readability. The next step following the determination of a reliable model for distillation column is to choose appropriate NARMAX poly- nomial model. If the model structure is known a priori then the identification can be formulated as a standard least-squares problem. However, the identification is a hard task in reality as the structure process is often unknown. An approach in resolving the structural problem is to estimate model parameters using a simple structure and gradually increasing ny;nu;ne; d and l until a desired accuracy is achieved [23]. In order to select the optimal model, statistical criteria such as AIC, RMSE and NSE were evaluated by varying the number of parameters. The latter are obtained according to Eq. (6). A set of candidate models was developed by setting the maximum number of ny ¼ 5;nu ¼ 7; ne ¼ 5; d ¼ 5105510½ Š and l ¼ 4. The number of terms to include in the final model is determined by the comparison of statistical criteria values for each model structure. These statistical criteria are defined as follows [30,31]: AIC ¼ ln N 2 V þ 2n N ð11Þ RMSE ¼ sqrt PN i ¼ 1 ð^yðtÞÀyðtÞÞ2 N 0 B B B @ 1 C C C A ð12Þ NSE ¼ 1À varðϵðtÞÞ varðyðtÞÞ ð13Þ where V is the loss function, n is the number of estimated para- meters, N is the data length in the estimation data set. The loss function V is equal to the residual ε(t) sum of squares: V ¼ 1 N XN i ¼ 1 ε2 i ð14Þ It is clear that the three criteria require the computation of the residual ε. The AIC contains two terms: The loss function V, which Fig. 2. Estimation and validation data. Fig. 3. (a) Evolution of heating power and reflux timer and (b) evolution of pressure drop and preheating power. L. Aggoune et al. / ISA Transactions 63 (2016) 394–400 397
  5. 5. decreases with the model complexity, and the number of esti- mated parameters n, that is increasing with the model complexity. The objective of this criterion is to make a trade-off between prediction accuracy and model complexity. The RMSE evaluates the differences between predicted output and real measurements. Lower values of RMSE indicate good prediction quality. The NSE examines the relative magnitude of the residual variance com- pared to the measured output variance. A high NSE value (near to one) ensures the good accuracy of the identified model. The best model structure is chosen as the one that minimizes AIC and RMSE, and maximizes NSE. The NARMAX model is built on the basis of a linear ARMAX model. ny; nui ; ne and d were each varied and the OEEPM algo- rithm is employed to estimate the parameters of the models. For each combination, the statistical criteria were calculated using the estimation data set. Table 1 lists the computation results of these criteria. Only significant results are reported. Based on the results given on Table 1, the best ARMAX model structure is obtained with ny ¼ 2;nu ¼ 12222½ Š;ne ¼ 2; and d ¼ 14115½ Š, giving AIC¼ À0.3640, NSE¼0.9844 and RMSE¼0.0234. To select the appropriate NARMAX model for the studied dis- tillation column system, the chosen orders and delays for the best ARMAX model are used to define the best NARMAX model. Starting from the orders and delays indicated for the best ARMAX model, and increasing or decreasing their values, the optimal values for NARMAX model can be found. Based on the evaluation of statistical criteria, the best NARMAX model structure is obtained with AIC¼ À0.4126, NSE¼0.9851 and RMSE ¼ 0.0229. In this case, the model which predicts overhead temperature is given by: TdðtÞ ¼ À0:2601TdðtÀ1ÞÀ0:6901TdðtÀ2Þþ0:0228QbðtÀ1Þ À0:0427Qf ðtÀ4ÞÀ0:0558Qf ðtÀ5Þþ0:0685RtðtÀ1Þ þ0:1053RtðtÀ2Þþ0:0165ΔPðtÀ1ÞÀ0:0199ΔPðtÀ2Þ À0:0971Tf ðtÀ5ÞÀ0:0027Tf ðtÀ6Þþ0:3524eðtÀ1Þ þ0:4175eðtÀ2Þþ0:2139TdðtÀ2ÞRtðtÀ1ÞΔPðtÀ5Þ þ0:1093RtðtÀ6ÞΔPðtÀ1ÞΔPðtÀ6Þ À0:1050QbðtÀ1ÞRtðtÀ4ÞΔPðtÀ1Þ ð15Þ Eq. (15) shows that 3 nonlinear terms have been added to the model. Using the NARMAX model the AIC is reduced in compar- ison with the ARMAX model. During the development of the different models, the NARMAX model provides a good robustness against the orders and delays variations than the ARMAX model. This is because the small var- iations of the optimal values of ny, nu, ne and d found for the best ARMAX model degrade the prediction accuracy. Consequently, the NARMAX model is more suitable than the ARMAX model for the prediction of the overhead temperature. The validation phase is the last step in system identification problem, where the performances of the identified NARMAX model are evaluated for validation data set. It is necessary to analyze the properties of the residual (Fig. 4) as seen in [28,29]. There are several techniques of model validation [22]. In general, a model generating the minimum value of the residual would best represent the real plant of interest. A standard performance index is based on the whiteness test of residual and the independence between the residual and inputs [23,24,32]. In fact, if this condi- tion is satisfied, the identified model has effectively captured the system dynamics. 5. Fault detection results In this section, the performance of the monitoring strategy based on KLD is assessed through its utilization to detect faults in a distillation column. The identified model is used to generate a residual on which the anomaly is detected by the FD procedure. In order to verify the ability and effectiveness of the proposed method, experimental faults of the distillation plant of laboratory scale were performed. The adopted technique for anomaly detec- tion is based on the assumption that the residual variance remains unchanged and the residual shift only exists in residual mean. Determination of parameters μ0 and σ0 was performed by using a measurement data on the residual in fault free case. The distillation column is used widely in the fields of chemical industry. The Early detection of faults can help to avoid pro- ductivity loss and damage to human health. The distillation unit can be affected by several faults. The major categories of possible faults, including heating power, preheating power, feed pump, Table 1 Comparison between different ARMAX model structures estimated with OEEPM algorithm. Order [na, nb, nc] AIC NSE RMSE Delay d [2, 12212, 2] À0.3591 0.9843 0.0234 [1 4 1 1 5 ] [2, 12211, 2] À0.2792 0.9830 0.0244 [1 4 2 1 5 ] [2, 12111, 2] À0.2670 0.9828 0.0246 [1 4 2 1 5 ] [2, 11111, 2] À0.2635 0.9827 0.0246 [1 4 2 1 5 ] [2, 12222, 2] À0.3640 0.9844 0.0234 [1 4 1 1 5 ] [2, 12221, 2] À0.3323 0.9839 0.0238 [1 3 1 2 5 ] [2, 12312, 2] À0.3301 0.9839 0.0238 [1 3 1 2 4 ] [2, 13212, 2] À0.2370 0.9823 0.0249 [1 5 2 2 5 ] [2, 22212, 2] À0.3165 0.9837 0.0240 [1 5 1 2 6 ] [2, 12212, 2] À0.2783 0.9830 0.0244 [1 5 2 1 5 ] [2, 12312, 2] À0.2750 0.9830 0.0244 [1 3 2 1 4 ] [2, 13212, 2] À0.3524 0.9843 0.0235 [1 4 1 2 5 ] [2, 22222, 2] À0.3627 0.9844 0.0234 [1 5 1 1 5 ] [2, 12211, 2] À0.3149 0.9836 0.0240 [1 3 1 1 3 ] [2, 12213, 2] À0.3534 0.9843 0.0235 [1 5 1 2 5 ] [3, 12312, 3] À0.3561 0.9844 0.0234 [2 4 1 1 5 ] [3, 13212, 3] À0.3572 0.9844 0.0234 [2 4 1 1 6 ] [3, 12222, 3] À0.3626 0.9844 0.0234 [2 4 1 2 6 ] Fig. 4. Difference between the best NARMAX model and actual measurements of overhead temperature with validation data set. L. Aggoune et al. / ISA Transactions 63 (2016) 394–400398
  6. 6. reflux, and pressure drop. To evaluate the validity and effective- ness of the proposed method as a monitoring tool, the detection of a sudden increasing of the heating power (Qb) and the reflux timer (Rt) to 100% was selected under the assumption that only a single fault can occur at any time. These types of faults are introduced at instant 10835 S and cause a deviation in comparison with normal mode (Fig. 5). It is assumed that the first 100 values of the temperature (Td) are not faulty. Consequently, the reference pdf will be estimated. It is impor- tant to note that the effect of a sudden increasing of the heating power (Qb) was compensated by the distillation supervision con- trol system 17 samples (187 S) after its occurrence as shown in Fig. 5a. These anomalies should be detected by KLD. The dissimilarity obtained from the evaluation of KLD between actual pdf and the reference before and after the faults has been verified according to Eq. (3). Fig. 6 presents the evolution of KLD for the selected faults. This evolution shows the ability of the proposed technique to detect real faults because it exceeds a predefined threshold TD. TD is the level up to which, the KLD is considered still in the fault-free situations (its abnormality is due only to modeling errors and noises). This threshold value is defined by the KLD of the residual obtained with fault-free data. The threshold value is calculated as: TD ¼ Mþ3σ ð16Þ where M and σ are the mean and the standard deviation of the distance. As it is shown in Fig. 6, the faults detection is made with respect to a threshold of the KLD. The first fault (increasing of the heating power) was detected at 10879 S, i.e., with the delay of 44 S. This delay corresponds to a difference (ΔTd¼ À0.4 °C) between the desired (Td¼101.4 °C) and the fault overhead temperature (Td¼101.8 °C). The second fault (increasing of the reflux timer (Rt)) was confirmed 33 S after its occurrence. This corresponds to a difference ΔTd¼ À2.3 °C between the desired overhead tem- perature and the faulty one. The detection delay was defined as the difference between the time of the fault occurrence and the time of its detection. This was especially due to the fault ampli- tude, the time dependency of fault (abrupt fault, incipient fault, or intermittent fault), and the evolution of the dynamical behavior of the separation unit. As indicated by the experimental case studies, the values of KLD determined from the residual of identified NARMAX model can be used to distinguish between the normal mode and the abnormal one. 6. Conclusion Kullback Leibler divergence (KLD) has been widely used in information theory to compute the resemblance of two distribu- tions. In this work, it is used as a fault detection framework for anomaly detection that exploits the residual of NARMAX model. In this proposed fault detection approach, the black-box modeling method is used to establish a NARMAX model through input- output experimental measurements with the aim of providing a Fig. 5. Evolution of the overhead temperature caused by the chosen faults. Fig. 6. Results of KLD for the chosen faults. L. Aggoune et al. / ISA Transactions 63 (2016) 394–400 399
  7. 7. reliable model of the system considered. Then, the KLD is applied to measure the dissimilarity between the probability density of the residual of the NARMAX model and the reference distribution obtained in normal condition operation. The effectiveness of the proposed approach has been illustrated on experimental faults in distillation column. The results of these experimental faults clearly demonstrate the potential of the adopted fault detection method. The KLD can be used in order to avoid a dangerous behavior of the distillation unit by setting-on the suitable alarm and con- ducting appropriate actions on the process. References [1] Chetouani Y. Model selection and fault detection approach based on Bayes decision theory: Application to changes detection problem in a distillation column. Process Saf Environ Prot 2014;92:215–23. [2] Wong PK, Yang Z, Vong CM, Zhong J. Real-time fault diagnosis for gas turbine generator systems using extreme learning machine. Neurocomputing 2014;128:249–57. [3] Harrou F, Nounou MN, Nounou HN, Madakyaru M. Statistical fault detection using PCA-based GLR hypothesis testing. J Loss Prev Process Ind 2013;26:129–39. [4] Kouadri A, Aitouche MA, Zelmat M. Variogram-based fault diagnosis in an interconnected tank system. ISA Trans 2012;51:471–6. [5] Goffaux G, Wouwer AV, Bernard O. Continuous–discrete interval observers applied to the monitoring of cultures of microalgae. J Proc Control 2009;19:1182–90. [6] Ding SX. Model-based Fault Diagnosis Techniques: Design Schemes, Algo- rithms, and Tools. Berlin: Springer; 2008. [7] Blanke M, Kinnaert M, Lunze J, Staroswiecki M. Diagnosis and Fault-Tolerant Control. Berlin: Springer; 2006. [8] Isermann R. Fault-Diagnosis Systems, An Introduction from Fault Detection to Fault Tolerance. Berlin: Springer; 2006. [9] Chai W, Qiao J. Passive robust fault detection using RBF neural modeling based on set membership identification. Eng Appl Artif Intell 2014;28:1–12. [10] Akhenak A, Duviella E, Bako B, Lecoeuche S. Online fault diagnosis using recursive subspace identification: Application to a dam-gallery open channel system. Control Eng Pract 2013;21:797–806. [11] Peng ZK, Lang ZQ, Wolters C, Billings SA, Worden K. Feasibility study of structural damage detection using NARMAX modelling and Nonlinear Output Frequency Response Function based analysis. Mech Syst Signal Process 2011;25:1045–61. [12] Shashoa NAA, Kvaščev G, Marjanović A, Worden K. Sensor fault detection and isolation in a thermal power plant steam separator. Control Eng Pract 2013;21:908–16. [13] Basseville M, Nikiforov IV. Detection of Abrupt Changes: Theory and Appli- cation. New Jersey: Prentice Hall Englewood Cliffs; 1993. [14] Belov DI, Armstrong RD. Distributions of the Kullback–Leibler divergence with applications. Br J Math Stat Psychol 2011;64:291–309. [15] J.R. Hershey, P.A. Olsen, Approximating the kullback leibler divergence between gaussian mixture models. In: Proc. of IEEE ICASSP. Hawaii, USA; 15– 20 April 2007. 4: p. 317–20. [16] Silva J, Narayanan S. Average divergence distance as a statistical discrimination measure for hidden Markov models. IEEE Trans Audio Speech Lang Process 2006;14:890–906. [17] J. Harmouche, C. Delpha, D. Diallo, Incipient fault detection and diagnosis based on Kullback–Leibler divergence using Principal Component Analysis: Part I, Signal Process. 94, 278–287. [18] Houerbi KR, Salamatian K, Kamoun F. Scan surveillance in internet networks. Networking 2009:614–25. [19] Georgoulas G, Mustafa MO, Tsoumas IP, Antonino-Daviu JA, Climente-Alarcon V, Stylios CD, Nikolakopoulos G. Principal Component Analysis of the start-up transient and Hidden Markov Modeling for broken rotor bar fault diagnosis in asynchronous machines. Expert Syst Appl 2013;40:7024–33. [20] Shi Z, Gu F, Lennox B, Ball AD. The development of an adaptive threshold for model-based fault detection of a nonlinear electro-hydraulic system. Control Eng Pract 2005;13:1357–67. [21] Pukelsheim F. The three sigma rule. Am Stat 1994;48:88–91. [22] Billing SA. Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal Domains. New York: Wiley; 2013. [23] Landau ID, Lozano R, M'Saad M, Karimi A. Adaptive Control: Algorithms, Analysis and Applications. London: Springer; 2011. [24] Ljung L. System Identification, Theory for the User. Englewood Cliffs, New Jersey: Prentice Hall; 1999. [25] Leontaritis IJ, Billings SA. Input–output parametric models for nonlinear sys- tems. Int J Control 1985;4:303–44. [26] Rahrooh A, Shepard S. Identification of nonlinear systems using NARMAX model. Nonlinear Anal Theory Methods Appl 2009;71:e1198–202. [27] Billings SA, Korenberg MJ, Chen S. Identification of nonlinear output-affine systems using an orthogonal least squares algorithm. Int J Syst Sci 1988;19:1559–68. [28] Aggoune L, Chetouani Y, Radjeai H. Recursive identification of the dynamic behavior in a distillation column by means of autoregressive models. J Dyn Syst Meas Contr 2014;136 044506-044506-5. [29] Chetouani Y. Model-order reduction based on artificial neural networks for accurate prediction of the product quality in a distillation column. Int J Autom Control 2009;3:332–51. [30] Akaike H. A new look at the statistical model identification. IEEE Trans Autom Control 1974;19:716–23. [31] Nash JE, Sutcliffe JV. River flow forecasting through conceptual models. J Hydrol 1970;10:82–290. [32] Billings SA, Voon WSF. Correlation based model validity tests for nonlinear models. Int J Control 1986;44:235–44. L. Aggoune et al. / ISA Transactions 63 (2016) 394–400400