Fault Detection Approach Based on Weighted Principal Component Analysis Applied to Continuous Stirred Tank Reactor

Fault detection approaches based on principal component analysis (PCA) may not perform well when the process is time-varying, because time variation can adversely affect feature extraction. To solve this problem, a modified PCA that considers variance maximization is proposed, referred to as weighted PCA (WPCA). WPCA can obtain the slow feature information of observed data in a time-varying system. The monitoring statistics are based on the WPCA model, and their confidence limits are computed by kernel density estimation (KDE). A simulation example on a continuous stirred tank reactor (CSTR) shows that the proposed method achieves better performance than the conventional PCA model in terms of both fault detection rate and fault detection time.


INTRODUCTION
With the development of modern process control techniques in production, fault detection has been playing an important role in ensuring long-term and efficient operation of chemical processes. Model-based monitoring techniques require mathematical models that are difficult to build. Data-based methods are therefore often employed to build statistical models, in which only historical operating data are considered without any mathematical model. Multivariate statistical process monitoring (MSPM) methods are widely used data-based fault detection methods, and several standard models have been designed successfully during the past several decades. High-dimensional data often include redundant information, such as noise, so the key point of MSPM is to monitor the main features extracted from the observed data [1][2][3].
Among MSPM methods, principal component analysis (PCA), canonical variate analysis (CVA), and independent component analysis (ICA) have been widely used for fault detection in chemical processes [4][5][6]. For example, PCA can deal with high-dimensional data that are highly linearly correlated. Process monitoring is then performed in the principal component (PC) subspace and the residual subspace separately to detect data changes inside and outside the PC subspace. Extended methods have been reported for better monitoring performance [7][8][9]. Ku [10] proposed dynamic PCA (DPCA), in which the serial correlation of data is considered. A multi-fault diagnosis method for sensor systems based on PCA was presented in [11]. An image reconstruction method for electrical capacitance tomography based on robust PCA was given in [12]. To address the nonlinear behavior of a process, some nonlinear extensions of the traditional PCA method have already been proposed, such as autoassociative neural networks, principal curves, and other methods [13][14]. A nonlinear process monitoring method based on kernel functions was proposed in [15]. Luna [16] proposed a generalized principal component analysis (GPCA) method applied to time-of-flight values. Kernel PCA (KPCA) can map the input space into a linear feature space via a nonlinear mapping, and an improved multi-scale KPCA was presented in [17]. Lee [18] proposed multiway principal component analysis (MPCA) to extract the information in multivariable data for monitoring batch processes. To tackle multiscale and nonlinear problems, multiscale KPCA [19] was applied to capture correlations of process variables at every possible scale. For large-scale chemical processes, a fault detection approach based on multiblock KPCA was proposed in [20]; this decentralized nonlinear approach effectively captured the nonlinear relationships among the block process variables and showed superior fault diagnosis ability compared with other methods. However, in these methods the local manifold structure of high-dimensional data was considered. Deng [21] proposed a sparse kernel locality preserving projection method for nonlinear process fault detection. These findings suggest that PCA can extract useful information for feature extraction and is a promising method for data-based fault detection in chemical processes. However, the conventional PCA method fails to handle time-varying processes. To some extent, it may not extract much useful information, which limits its application.
*Address correspondence to this author at the College of Information and Control Engineering, Weifang University, Weifang 261061, PR China; Tel: 086+186-5471-0898; E-mail: gsm197851@126.com
Slow feature analysis (SFA), which has emerged as a new dimension reduction method in recent years, was proposed in [22]. SFA aims to extract invariant features from high-dimensional measurements. It can extract the slowly varying features from input signals, which is useful for classification and identification. SFA has been applied in many fields [23][24][25]. For example, Ma proposed a kernel-based method to solve the nonlinear expansion problem of SFA using an algorithm evaluation criterion [26].
However, all the above-mentioned methods may not perform well when the process is time-varying, because time variation can adversely affect feature extraction in conventional PCA. To solve this problem, a modified PCA that considers slow feature extraction from time-varying signals is proposed in this paper, referred to as weighted PCA (WPCA).

2. WEIGHTED PRINCIPAL COMPONENT ANALYSIS

Principal Component Analysis
PCA is a linear dimension reduction technique that preserves the meaningful information hidden among the original variables. In the PCA method, the high-dimensional data are projected onto a low-dimensional space by an orthogonal transformation, so that low-dimensional and uncorrelated principal components are obtained. By extracting the main features of the observed data, PCA removes linear correlations between the variables in the high-dimensional space.
Let $X = (x_1, x_2, \ldots, x_n)$ be the high-dimensional data matrix with $n$ samples of the process vector $x_i \in R^m$. Let $Y = (y_1, y_2, \ldots, y_n)$ be the low-dimensional outputs with $n$ samples of the vector $y_i \in R^d$. $W = (w_1, w_2, \ldots, w_d)$ is defined as the transformation matrix, and the vector $y_i$ can be obtained by the PCA transformation as follows:
$$y_i = W^T x_i, \quad i = 1, 2, \ldots, n \quad (1)$$
The vectors $y_i$ are also called PC vectors or score vectors. Here $w_j \in R^m$ are the projection vectors, which project the original data into the score space. Usually, the first $d$ eigenvectors are selected to build the PCA fault detection model, and the data matrix is decomposed as
$$X = WY + E$$
where the matrix $E$ is the residual matrix.
In the dimension reduction process of PCA, the sum of squared reconstruction errors between the high-dimensional and low-dimensional spaces is minimized, as follows [24]:
$$J(w) = \min_w \sum_{i=1}^{n} \| x_i - w w^T x_i \|^2 \quad (4)$$
Solving this objective function yields the transformation vectors. In addition, formula (4) should satisfy the condition
$$w^T w = 1 \quad (5)$$
Equation (4) can be further written as:
$$J(w) = \max_w w^T X X^T w \quad (6)$$
Therefore, the vectors $w$ can be obtained by solving the following eigenvalue equation:
$$X X^T w = \lambda w \quad (7)$$
For a new sample $x_{new}$, the corresponding score vector $y_{new}$ and residual vector $e_{new}$ can be calculated as:
$$y_{new} = W^T x_{new}, \quad e_{new} = x_{new} - W y_{new} \quad (8)$$
Finally, the Hotelling's $T^2$ index and the SPE (or $Q$ statistic) index are used to monitor the normal variation information in the PC subspace and the information in the residual space, respectively.
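The PCA steps above can be illustrated with a minimal NumPy sketch. It assumes zero-mean data stored column-wise, as in the definition of $X$; the function names `fit_pca` and `project` and the synthetic data are hypothetical and only serve the illustration.

```python
import numpy as np

def fit_pca(X, d):
    """Return the d leading eigenvectors of X X^T and their eigenvalues.

    X is an (m, n) array whose columns are zero-mean samples x_i.
    """
    eigvals, eigvecs = np.linalg.eigh(X @ X.T)   # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]        # indices of the d largest
    return eigvecs[:, order], eigvals[order]

def project(W, x_new):
    """Score vector and residual vector of a new sample."""
    y_new = W.T @ x_new
    e_new = x_new - W @ y_new
    return y_new, e_new

# Synthetic example: 5 correlated variables driven by 2 latent sources.
rng = np.random.default_rng(0)
latent = rng.normal(size=(2, 200))
mixing = rng.normal(size=(5, 2))
X = mixing @ latent + 0.05 * rng.normal(size=(5, 200))
X -= X.mean(axis=1, keepdims=True)               # center each variable

W, lam = fit_pca(X, d=2)
y_new, e_new = project(W, X[:, 0])
```

Because the columns of `W` are orthonormal, the residual `e_new` is orthogonal to the PC subspace, which is why the two monitoring indices look at complementary parts of a sample.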

Problem Statement
SFA is a newly emerged method for extracting temporally coherent features from high-dimensional data. SFA focuses on finding the common slowly varying components of the input signals, i.e., the higher-order statistics of the input data. In addition, the obtained feature components are mutually uncorrelated. Given an input signal $x(t) = [x_1(t), \ldots, x_m(t)]^T$, SFA seeks functions $f(x) = [f_1(x), \ldots, f_d(x)]^T$ in a function space $F$, so that the outputs $y_i(t) = f_i(x(t))$ vary as slowly as possible. The objective is to minimize the mean squared temporal derivative of each output:
$$\min_{f_j} \langle \dot{y}_j^2 \rangle_t \quad (9)$$
under the constraints
$$\langle y_j \rangle_t = 0 \quad (10)$$
$$\langle y_j^2 \rangle_t = 1 \quad (11)$$
$$\forall i < j: \; \langle y_i y_j \rangle_t = 0 \quad (12)$$
where $\dot{y}$ is the first-order derivative of $y$, and $\langle \cdot \rangle_t$ is the sample mean over all the available time. In this paper, the first-order derivative of $y$ is obtained from the discrete temporal difference
$$\dot{y}_j(t) = f_j(x(t)) - f_j(x(t-1))$$

The Calculation of Slow Feature
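Let us first consider the linear case, with input vector $x$ and weight vector $w_j$:
$$y_j(t) = w_j^T x(t)$$
In the following, we assume $x$ to have zero mean without loss of generality. Equations (9), (11) and (12) can be rewritten as
$$\Delta y_j = \langle \dot{y}_j^2 \rangle_t = w_j^T \langle \dot{x} \dot{x}^T \rangle_t w_j = w_j^T A w_j$$
$$\langle y_j^2 \rangle_t = w_j^T \langle x x^T \rangle_t w_j = 1$$
$$\forall i < j: \; \langle y_i y_j \rangle_t = w_i^T \langle x x^T \rangle_t w_j = 0$$
where $A = \langle \dot{x} \dot{x}^T \rangle_t$ is the expectation of the covariance matrix of the temporal first-order derivative of the input vectors. With the SFA algorithm, the outputs are ordered so that $\Delta y_1 \le \Delta y_2 \le \ldots \le \Delta y_d$, and the most slowly varying outputs are obtained first. This induces an order: the first output signal is the slowest one, the second is the second slowest, and so on.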
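The linear case can be sketched numerically as follows. This is a minimal illustration rather than the authors' implementation: it whitens the data so that the unit-variance and decorrelation constraints hold, then takes the eigenvectors of the whitened derivative covariance in ascending order. The name `linear_sfa`, the test signals, and the mixing matrix are assumptions made for the example.

```python
import numpy as np

def linear_sfa(X, d):
    """Linear SFA on an (m, n) array of zero-mean samples ordered in time.

    Returns W (m, d) and the slowness values Delta y_j in ascending order.
    """
    dX = np.diff(X, axis=1)                 # x(t) - x(t-1)
    B = X @ X.T / X.shape[1]                # <x x^T>_t
    A = dX @ dX.T / dX.shape[1]             # <xdot xdot^T>_t
    s, U = np.linalg.eigh(B)                # B = U diag(s) U^T, B assumed full rank
    Q = U / np.sqrt(s)                      # whitening matrix: Q^T B Q = I
    delta, V = np.linalg.eigh(Q.T @ A @ Q)  # ascending, so slowest first
    return Q @ V[:, :d], delta[:d]

# Two sources, one slow and one fast, observed through a linear mixture.
t = np.linspace(0.0, 2.0 * np.pi, 500)
slow, fast = np.sin(t), np.sin(25.0 * t)
X = np.array([[1.0, 0.6], [0.4, 1.0]]) @ np.vstack([slow, fast])
X -= X.mean(axis=1, keepdims=True)

W, delta = linear_sfa(X, d=2)
Y = W.T @ X                                 # Y[0] is the slowest output
```

In this example the first output recovers the slow sinusoid up to sign and scale, which matches the ordering $\Delta y_1 \le \Delta y_2$ described above.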

WPCA Algorithm
The essence of dimension reduction methods such as PCA is to project the observed data from a high-dimensional space into a low-dimensional space by certain transformations. A smaller number of latent variables, which represent the main information of the original data, can be acquired according to some criterion. We believe that using a weighted criterion in dimension reduction methods would extract more thorough feature information.
Time-varying information always exists in the variables as time progresses, but the PCA algorithm cannot take the time-varying characteristics of the observed data into account. As mentioned earlier, SFA succeeds in finding optimal slowly varying features hidden in the input signals. This motivates us to integrate SFA with PCA into a new multivariate statistical monitoring method that takes slow feature extraction from the input information into account. The optimization function of PCA is fused with that of SFA, and the resulting joint objective function is converted into a constrained maximization problem.
For PCA, the optimization function is as follows:
$$J_1 = \max_w w^T X X^T w$$
Let $\Delta x(t) = x(t) - x(t-1)$, and let $\Delta X = [\Delta x_1, \Delta x_2, \ldots, \Delta x_n]$ be the incremental matrix. Similarly to SFA, the corresponding objective function can be written as follows:
$$J_2 = \min_w w^T \Delta X \Delta X^T w \quad (16)$$
As analyzed previously, the two objective functions $J_1$ and $J_2$ have similar forms, so $J_2$ can be introduced into $J_1$ to integrate them. Therefore, the WPCA proposed in this paper can address the problem caused by time-varying input signals. The objective function can be reformulated further as a joint objective $J$.
Most of the information can be retained in the projected data when the high-dimensional data space is reduced to a low-dimensional space. The cumulative contribution rate can be used to determine the dimensionality $d$.
Once the projection vectors $w_1, w_2, \ldots, w_d$ are obtained, the low-dimensional linear features are extracted by projecting the data onto them.
The output principal vectors $Y$ form an orthonormal set of vectors representing the eigenvectors of the sample covariance matrix associated with $d < m$ of the PCs.
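A short sketch of how the cumulative contribution rate can be used to choose $d$ is given here. The function name `select_dimension` and the threshold value are hypothetical; the text does not state which threshold was used.

```python
import numpy as np

def select_dimension(eigvals, threshold=0.90):
    """Smallest d whose cumulative contribution rate reaches the threshold."""
    lam = np.sort(np.asarray(eigvals, dtype=float))[::-1]   # descending order
    ccr = np.cumsum(lam) / lam.sum()                        # cumulative rate
    d = int(np.searchsorted(ccr, threshold) + 1)            # first index reaching it
    return min(d, lam.size), ccr

# Illustrative eigenvalues only.
d, ccr = select_dimension([5.0, 3.0, 1.0, 1.0], threshold=0.85)
```

With these illustrative eigenvalues, the first two components explain 80% of the total and the first three explain 90%, so three components are needed to reach an 85% threshold.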
For a new sample $x_{new}$, the corresponding output vector $y_{new}$ and residual vector $e_{new}$ can be obtained as follows:
$$y_{new} = W^T x_{new}, \quad e_{new} = x_{new} - W y_{new}$$
WPCA takes into account the time-varying feature information and the data variability simultaneously. It considers the incremental covariance matrix when executing the covariance matrix decomposition in PCA, so that the collected data are examined more thoroughly from different perspectives. In addition, WPCA and DPCA can be related from the viewpoint of dynamic data.
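The following sketch shows one possible way to combine the variance term $J_1$ and the increment term $J_2$ in code. The exact joint objective is not legible in the text, so the weighted difference used here, $\alpha X X^T - (1-\alpha) \Delta X \Delta X^T$ with $w^T w = 1$, is purely an assumption for illustration and may differ from the authors' formulation. The name `fit_wpca` and the synthetic data are likewise hypothetical; the default $\alpha = 0.85$ is the value reported in the simulation section.

```python
import numpy as np

def fit_wpca(X, d, alpha=0.85):
    """Illustrative weighted PCA fit (ASSUMED objective, see lead-in).

    X is an (m, n) array of zero-mean samples ordered in time.
    """
    dX = np.diff(X, axis=1)                               # increments x(t) - x(t-1)
    # ASSUMPTION: reward variance (J1) and penalize fast variation (J2).
    M = alpha * (X @ X.T) - (1.0 - alpha) * (dX @ dX.T)
    eigvals, eigvecs = np.linalg.eigh(M)                  # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:d]                 # keep the d largest
    return eigvecs[:, order]

rng = np.random.default_rng(1)
t = np.arange(400)
slow = np.sin(2.0 * np.pi * t / 400)
X = np.vstack([slow + 0.05 * rng.normal(size=400),
               0.5 * slow + 0.05 * rng.normal(size=400),
               rng.normal(size=400)])
X -= X.mean(axis=1, keepdims=True)

W = fit_wpca(X, d=2)
y_new = W.T @ X[:, 0]                                     # output vector
e_new = X[:, 0] - W @ y_new                               # residual vector
```

Under this assumed form, setting `alpha=1.0` drops the increment term and the result spans the same subspace as ordinary PCA, which gives a simple sanity check.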

Monitoring Statistics Based on WPCA
Based on the WPCA model, the monitoring statistics $T^2$ and $Q$ are built in the PC subspace and the residual space, respectively, for fault detection. The $T^2$ monitoring index measures the variation of the feature data in the PC subspace; its definition involves a regularization parameter $\beta > 0$ and the output matrix $Y$ obtained under normal conditions in the training procedure.
The $Q$ statistic is a measure of the error between the deviation trend and the statistical model for every sample. It can also be used to estimate the external data variation in the residual space.
$$Q = SPE = e^T e = (x - Wy)^T (x - Wy) \quad (24)$$
After the monitoring statistics are obtained, the upper control limits should be calculated to determine whether the monitored process is in control. Given that the output signals do not strictly follow a Gaussian distribution, the upper control limits of $T^2$ and $Q$ can be calculated by the kernel density estimation (KDE) method.
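A minimal sketch of the two statistics and a KDE-based control limit follows. Several parts are assumptions: the regularized form used for $T^2$ (the score covariance plus $\beta I$), the Gaussian kernel, and the normal-reference bandwidth rule are not given in the text. The $Q$ statistic follows Eq. (24). The projection matrix in the example is a plain PCA stand-in, and all function names are hypothetical.

```python
import math
import numpy as np

def t2_statistic(y, Y_train, beta=1e-6):
    """ASSUMED form: y^T (Y Y^T / (n - 1) + beta * I)^(-1) y."""
    d, n = Y_train.shape
    S = Y_train @ Y_train.T / (n - 1) + beta * np.eye(d)
    return float(y @ np.linalg.solve(S, y))

def q_statistic(x, W):
    """Q = e^T e with e = x - W y and y = W^T x, as in Eq. (24)."""
    e = x - W @ (W.T @ x)
    return float(e @ e)

def kde_limit(values, confidence=0.95):
    """Quantile of a Gaussian KDE fitted to the training statistics."""
    v = np.asarray(values, dtype=float)
    h = 1.06 * v.std(ddof=1) * v.size ** (-0.2)       # assumed bandwidth rule
    def cdf(u):                                       # KDE cumulative distribution
        return sum(0.5 * (1.0 + math.erf((u - vi) / (h * math.sqrt(2.0))))
                   for vi in v) / v.size
    lo, hi = v.min() - 10.0 * h, v.max() + 10.0 * h
    for _ in range(60):                               # bisection on the monotone CDF
        mid = 0.5 * (lo + hi)
        if cdf(mid) < confidence:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

rng = np.random.default_rng(2)
X = rng.normal(size=(4, 300))
X -= X.mean(axis=1, keepdims=True)
W = np.linalg.eigh(X @ X.T)[1][:, -2:]                # stand-in projection matrix
Y = W.T @ X
t2 = np.array([t2_statistic(Y[:, i], Y) for i in range(X.shape[1])])
q = np.array([q_statistic(X[:, i], W) for i in range(X.shape[1])])
t2_limit, q_limit = kde_limit(t2), kde_limit(q)
```

The KDE limit is used instead of a chi-square or F-distribution limit because it does not rely on the statistics being Gaussian, which is the motivation stated above.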

Fault Detection Based on WPCA
The monitoring method based on the WPCA algorithm includes two stages: an offline modeling stage and an online fault detection stage. The detailed procedure is described below. In the offline modeling stage, the normal operating model is set up by WPCA, and the confidence limits of the $T^2$ and $Q$ monitoring indices are obtained by KDE. The offline modeling procedure is summarized as follows:
1) Acquire the data set $X$ under normal conditions, calculate the mean and variance of the normal operating data, and compute the corresponding incremental matrix.
2) Apply the WPCA model by solving the generalized eigenvalue problem in (17) to get the projection vectors $w$.
3) Project the data matrix $X$ into the WPCA subspace to obtain the transformed feature components, and reconstruct the corresponding residual matrix $E$.
4) Calculate the two monitoring statistics $T^2$ and $Q$.
5) Determine the control limits of the $T^2$ and $Q$ statistics by KDE.
At the end of the offline stage, the control limits of $T^2$ and $Q$ have been obtained from the model built above. In the online stage, the $T^2$ and $Q$ statistics of newly collected data are calculated to determine whether the process is currently operating normally.
1) Standardize the new data using the mean and variance from the offline modeling stage.
2) Project the standardized data onto the WPCA model and calculate the $T^2$ and $Q$ statistics.
3) Determine whether $T^2$ and $Q$ exceed their respective control limits obtained in the offline stage, and give an alarm if either statistic exceeds its corresponding limit.
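The online steps can be sketched as a single check per sample. To keep the example self-contained, the offline quantities are produced by stand-ins that are not the paper's method: the projection matrix comes from plain PCA, $T^2$ uses the unregularized score covariance, and the control limits are empirical 95% quantiles in place of KDE. All names are hypothetical.

```python
import numpy as np

def online_check(x_raw, mean, std, W, S_inv, t2_limit, q_limit):
    """Standardize a new sample, compute T^2 and Q, and raise an alarm flag."""
    x = (x_raw - mean) / std                  # step 1: offline mean and std
    y = W.T @ x                               # step 2: scores ...
    e = x - W @ y                             # ... and residual
    t2 = float(y @ S_inv @ y)
    q = float(e @ e)
    alarm = bool(t2 > t2_limit or q > q_limit)  # step 3: compare with limits
    return t2, q, alarm

# Offline stand-ins (assumptions, see lead-in).
rng = np.random.default_rng(3)
raw = rng.normal(loc=10.0, scale=2.0, size=(5, 500))
mean, std = raw.mean(axis=1), raw.std(axis=1, ddof=1)
Xn = (raw - mean[:, None]) / std[:, None]
W = np.linalg.eigh(Xn @ Xn.T)[1][:, -2:]
Y = W.T @ Xn
S_inv = np.linalg.inv(Y @ Y.T / (Y.shape[1] - 1))
t2_train = np.einsum("it,ij,jt->t", Y, S_inv, Y)
q_train = np.sum((Xn - W @ Y) ** 2, axis=0)
t2_limit = np.quantile(t2_train, 0.95)
q_limit = np.quantile(q_train, 0.95)

normal_result = online_check(mean, mean, std, W, S_inv, t2_limit, q_limit)
faulty_result = online_check(mean + 50.0 * std, mean, std, W, S_inv,
                             t2_limit, q_limit)
```

A sample equal to the training mean yields zero for both statistics and no alarm, while a sample shifted far from the mean on every variable exceeds at least one limit.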

ANALYSIS OF CSTR PROCESS
In this section, the process monitoring method based on WPCA is evaluated and compared with PCA-based fault detection on the continuous stirred tank reactor (CSTR) benchmark process. The proposed method is tested on a simulated CSTR system (Fig. 1). The CSTR is commonly used as a basic unit in chemical processes, and it is also the core of many large and complex processes, so fault detection research on the CSTR is of general interest. In the simulation, ten process variables are collected under normal conditions and under common fault conditions. The data are sampled every 12 seconds, and 2000 samples are obtained. A fault is introduced after every 300 samples. Both PCA and WPCA are applied to fault detection for performance comparison. The two monitoring statistics $T^2$ and $Q$ are plotted as solid lines and the corresponding confidence limits as dashed lines (Fig. 2). The confidence limit for fault detection is set to 95% so that the methods can be compared on the same basis. A total of 1000 samples under normal conditions are used to establish the model, and another 2000 normal samples are used to calculate the confidence limits of the monitoring statistics. The weighted parameter $\alpha$ determines how features are extracted, and its value directly influences fault detection performance. Catalyst deactivation, a common fault, is used to analyze the fault detection results with different values of $\alpha$. Based on experience and on simulation results with different parameter values, $\alpha = 0.85$ gives a better fault detection result. In addition, 6 PCs are selected, which explain about 91.39% of all the variance information using the WPCA fault detection method, as can be seen from Fig. (2).

Fig. (2). Cumulative variance contribution rate of variable (CCR).
After the WPCA fault detection model is established, the monitoring performance is evaluated using samples under normal conditions. The fault detection result is shown in Fig. (3). We can see that the proposed method has better process monitoring performance. Two typical faults are used to show the effectiveness of the WPCA method. The first fault is associated with a small step change in the inlet stream. As shown in Fig. (4), the $T^2$ and $Q$ monitoring charts of the PCA method are plotted for this fault, and the fault detection rates of the $T^2$ and $Q$ charts are 57.82% and 47.24%, respectively. When the proposed WPCA is applied, the fault detection rates of the $T^2$ and $Q$ charts are 83.29% and 73.21%, respectively. The result indicates that the proposed method has a superior capability in detecting faults. It can reduce information loss and improve the detection rates of the statistics.
The second fault is a downward ramp in the coolant feed temperature. The monitoring results are also illustrated (Fig. 5) to show the effectiveness of WPCA. The $T^2$ and $Q$ monitoring charts of PCA detect the fault at samples 561 and 580, respectively. In contrast, the $T^2$ monitoring chart of the WPCA fault detection method finds the process abnormality at the 544th sample, while the $Q$ monitoring chart of WPCA discovers the fault at the 498th sample. By detecting the fault 17 and 82 samples earlier, respectively, the proposed method demonstrates improved monitoring performance. In addition, the two methods have different fault alarming rates. The fault missing rates of PCA are 17.53% and 19.24% for the $T^2$ and $Q$ statistics, whereas WPCA performs better, with fault missing rates of 15.16% and 12.18%. WPCA can therefore detect the fault more effectively than PCA.
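The performance measures quoted above can be computed from a monitoring chart as in the following sketch. It assumes that the detection rate is the fraction of faulty samples above the control limit, that the missing rate is its complement, and that the detection time is the first sample above the limit after the fault starts; the text does not spell out these definitions. The function name and the numbers in the example are made up for illustration.

```python
import numpy as np

def detection_metrics(stat, limit, fault_start):
    """Detection rate, missing rate, and first alarm index after fault onset."""
    alarms = np.asarray(stat, dtype=float) > limit
    faulty = alarms[fault_start:]                      # samples after fault onset
    rate = float(faulty.mean())
    first = int(np.argmax(faulty)) + fault_start if faulty.any() else None
    return rate, 1.0 - rate, first

# Illustrative statistic values only; the fault starts at index 4.
stat = [0.1, 0.2, 0.1, 0.3, 0.2, 1.5, 0.4, 2.0, 2.5, 3.0]
rate, missing, first = detection_metrics(stat, limit=1.0, fault_start=4)
```

In this toy chart four of the six faulty samples exceed the limit, and the first alarm occurs one sample after the fault is introduced.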

CONCLUSION
In this paper, we present a method for fault detection based on weighted PCA (WPCA). Slow feature information is important for fault detection in practical processes. SFA can extract the linearly correlated and slowly changing characteristics hidden in the data of a time-varying process. The original PCA is modified by introducing SFA to establish WPCA. The problem is converted into a maximization problem that can be solved through a joint objective function. WPCA can acquire more complete information and diminish noise when implementing fault detection. Two monitoring statistics are constructed based on the WPCA model in the principal component subspace and the residual subspace separately. Simulation results on a numerical case and on the CSTR process are used to compare the monitoring performance of PCA and WPCA. They demonstrate that the proposed method performs better in fault detection than the PCA method. WPCA can also extract linear features of process variables. Further study is still required for nonlinear processes, and much work remains to be done in the future.