

IEEE TRANSACTIONS ON POWER DELIVERY, VOL. 28, NO. 3, JULY 2013

Detection and Classification of Faults in Power Transmission Lines Using Functional Analysis and Computational Intelligence
André de Souza Gomes, Marcelo Azevedo Costa, Thomaz Giovani Akar de Faria, and Walmir Matos Caminhas
Abstract: The transmission line is the most vulnerable element of any electrical power system due to its large physical dimension. As a consequence, many fault diagnosis algorithms have been proposed in the literature. In general, most proposals use signal-processing analysis and computational intelligence. In this paper, a new model to functionally represent the phases of a transmission line is proposed. The detection and classification strategies are developed from the analysis of the model's parameters and were evaluated using a set of simulated faults and a real database. The results show that the proposed model detects faults very quickly, using a vastly simplified mathematical process, and is able to classify faults accurately.
Index Terms: Detection and classification of faults, power transmission lines.

I. INTRODUCTION

THE ABILITY to detect faults in transmission lines as fast as possible is crucial, since they may compromise the propagation of energy to customers and the functioning of the transmission network. Therefore, efficient fault detection approaches are focused primarily on the analysis of short time intervals, or transient signals. In this context, the use of wavelet transforms has emerged as a powerful tool for feature extraction, mainly due to its ability to focus on short time intervals for the analysis of high-frequency components [1], [2]. Unlike Fourier transforms [3], wavelet transforms can use varying time windows in order to extract the coefficients of the mother wavelet [4]-[6]. Briefly, short time windows are applied for high-frequency components, and long time windows for low-frequency components. Further details about wavelets can be found in [1]. The estimates of wavelet coefficients require further analysis in order to detect or classify faults. In general, the coefficients
Manuscript received March 12, 2012; revised July 27, 2012, November 14, 2012, and February 06, 2013; accepted February 26, 2013. Date of publication March 27, 2013; date of current version June 20, 2013. This work was supported in part by CAPES-Brasil, in part by CEMIG, in part by FAPEMIG, and in part by CNPq. Paper no. TPWRD-002602012. A. S. Gomes and W. M. Caminhas are with the Graduate Program in Electrical Engineering, Federal University of Minas Gerais, Belo Horizonte 31270-901 MG, Brazil. M. A. Costa is with the Department of Production Engineering, Federal University of Minas Gerais, Belo Horizonte 31270-901 MG, Brazil. T. G. A. de Faria is with Electric Company of Minas Gerais, Minas Gerais, Belo Horizonte 30150-150, Brazil. Color versions of one or more of the figures in this paper are available online at http://ieeexplore.ieee.org. Digital Object Identifier 10.1109/TPWRD.2013.2251752

are used as inputs to classification models, such as artificial neural networks (ANNs) [7]; fuzzy systems [8]-[10]; classification and regression trees (CART) [11]; support vector machines (SVMs) [12]; and a combination of these and other techniques [13], [14]. Despite their popularity, ANNs have been widely criticized because they require a considerable amount of training [11]. As alternatives to ANNs, SVM and CART reduce training efforts. Nevertheless, there is always a need for innovative methods for transmission-line protection which can potentially detect faults faster than current routines [2]. In fact, current industrial relays for line protection are based on simpler but still effective techniques, such as the absolute sum of the differential current signal, differences in phase angle, etc. [15], [16]. The principle of differential and directional relays consists of the sequential surveillance of a protection zone (i.e., these relays detect changes in the differences between the total current output of the zone and the total current input entering the zone). For instance, threshold boundaries for differential quantities and the use of wavelets are found in recent literature [17], [18]. Similarly, in order to detect patterns that do not conform to the expected behavior, our proposed anomaly detection approach defines a protection zone, or a region that represents normal behavior within the domain of transmission lines, and then declares any behavior outside this region as an anomaly. In general, the normal behavior of transmission lines is much easier to model than the anomalous behavior. This is because the behavior of an anomaly is usually unknown and, therefore, difficult to model. In such modeling, statistical inference analysis [19] is crucial and provides mechanisms to properly build the boundaries between normal and anomalous behaviors.
That is, the error of classifying normal behavior as an anomaly is controlled; this is also known as the control of the type I error. Our approach focuses on a reliable representation of the transmission line under normal operating conditions. Our proposed mathematical model includes stochastic components which account for current and voltage stochastic deviations, or noise, under normal operating conditions. By doing so, we provide a novel stochastic representation of the transmission lines, which enables faster detection of anomalous behaviors, or faults. Therefore, our novel approach also relies on anomaly detection techniques [20]. This work was initially motivated by a research project between the Department of Electronics Engineering and the Electric Company of Minas Gerais (CEMIG), Brazil. CEMIG has a transmission network of approximately 7605 km. It is the third

0885-8977/$31.00 © 2013 IEEE

DE SOUZA GOMES et al.: DETECTION AND CLASSIFICATION OF FAULTS IN POWER TRANSMISSION LINES


largest power transmission company in Brazil and is responsible for the propagation of large amounts of energy throughout the country. Currently, the power transmission protection system provides prompt responses to faults; that is, the network is locally shut down when a fault is detected. Subsequently, maintenance teams move to the location of the fault. In most situations, there is no upfront information regarding the type of fault that will be found. Therefore, once the maintenance teams reach the fault location and identify its type, additional equipment may be required. Due to the large extent of the transmission network, the transport of extra equipment may considerably delay the return of the transmission line to its normal operating behavior. Therefore, a system that identifies the probable cause of faults can guide the maintenance team toward the proper equipment. Initially, we proposed a methodology intended to classify faults in transmission lines. We found that the boundaries built around the normal operating region can also be used to detect faults quickly. On average, the proposed model detects faults in 0.09 cycles (or 3.64 ms), or faster. It is important to point out that our detection time of 0.094 cycles at 60 Hz is superior to the recently published results of a half cycle using a sampling rate of 32 samples per cycle at 50 Hz [12], and two full cycles using a sampling rate of 256 samples per cycle at 50 Hz [11]. This paper is organized as follows: Section II presents the novel approach based on the elliptical behavior of the current and voltage signals. Section III presents the performance of the method for simulated and real databases. Finally, Section IV presents the conclusion.

II. METHODS

A.
Mathematical Background

Let the voltage and current signals in one phase of a transmission line at time $t$ be written as

(1) (2)

Likewise, we can express voltage and current signals as sine functions

$$v(t) = V_p \sin(\omega t) \tag{3}$$

$$i(t) = I_p \sin(\omega t - \varphi) \tag{4}$$

We do this in order to simplify the mathematical analysis of the proposed model. In this model, the angle $\varphi$ is the delay between the current sine signal $i(t)$ and the voltage sine signal $v(t)$. $V_p$ and $I_p$ are the peak values of the voltage and current sine signals, respectively. $\cos\varphi$ is the power factor (PF), and $\omega = 2\pi f$ is the angular frequency. In the Brazilian system, $f = 60$ Hz. For any electrical system, PF is also defined as the ratio between active power (P) and apparent power (S). Fig. 1 shows the expected 2-D behavior of the voltage and current sine signals for one phase of a transmission line at standard operation (i.e., without noise). This behavior can be modeled using a conic section mathematical equation or, more specifically, the equation of the ellipse. It can be shown that for different values of the PF, different values for the radii and the rotation angle are generated. If the value of the PF is kept constant and the peak values of the current and voltage sine signals change, then the shape of the ellipse changes as shown in Fig. 2. Considering the transmission system under standard operation, the 2-D elliptical behavior is similar for each of the three phases of a transmission line.

Fig. 1. Elliptical behavior of a phase in a transmission line for different values of PF, panels (a) and (b).

B. Geometric Representation of One Phase in a Transmission Line

Given the operational peak values for the voltage and current signals, the standardized signals can be written as

$$v_s(t) = \frac{v(t)}{V_p} = \sin(\omega t) \tag{5}$$

$$i_s(t) = \frac{i(t)}{I_p} = \sin(\omega t - \varphi) \tag{6}$$


Furthermore, by applying basic trigonometric identities, it follows:

$$\sin(\omega t - \varphi) = \sin(\omega t)\cos\varphi - \cos(\omega t)\sin\varphi \tag{7}$$

Thus, from (5) and (6) and applying (7), the Cartesian form for the equation of the ellipse is defined as

$$v_s^2 - 2\, v_s\, i_s \cos\varphi + i_s^2 = \sin^2\varphi \tag{8}$$

where $v_s = v(t)/V_p$ and $i_s = i(t)/I_p$ are the standardized voltage and current signals, $V_p$ and $I_p$ are the peak values, and $\cos\varphi$ is the PF. Equation (8) represents the expected behavior of the electrical system after applying the standardized operator, and assuming a constant power factor and normal operating conditions. The standardized operation of voltage and current signals enables the comparison of these signals using the same unit scale. The nonstandardized equation of the ellipse, for any value of $V_p$ and $I_p$, is given by

$$\left(\frac{v}{V_p}\right)^2 - 2\,\frac{v\, i}{V_p I_p}\cos\varphi + \left(\frac{i}{I_p}\right)^2 = \sin^2\varphi \tag{9}$$

Using the geometric properties of (8), it is possible to monitor the behavior of voltage and current signals for each phase of a transmission line. As will be shown, this representation allows the design of a system to identify, as fast as possible, the moment when potential faults have occurred. The parameters of the ellipse can be re-estimated after a fault; that is, the values of the radii and the slope of the ellipse, along with the estimates of the other parameters of the postfault ellipse, may be used as input variables in any classification system.

Fig. 2. Elliptical behavior of a phase in a transmission line for different peak values of current and voltage sine signals. (a) 200 kV and 0.4 kA. (b) 200 kV and 1.5 kA.

C. Estimates of the Coefficients of a Conic Section

According to Boldrini et al. [21], a conic section defined in $\mathbb{R}^2$ represents a set of points whose coordinates $(x, y)$ satisfy the general equation

$$A x^2 + B x y + C y^2 + D x + E y + F = 0 \tag{10}$$

with the parameters $A \neq 0$ or $B \neq 0$ or $C \neq 0$. The parameters can be estimated using a sample of the current and voltage signals. In this case, it is possible to describe a set of linear equations for the parameters in order to find their estimates through least squares. First, (9) is rewritten in matrix form. When rewritten as matrices, the parameters are estimated by means of the least-squares solution, written as

(11)

From the least-squares estimates, the peak values and the PF can then be estimated.

D. Rotation and Translation Operations of a Conic Section

As will be shown in Section II-E, it is of interest to represent the equation of the ellipse in a reduced form. The reduced form (or canonical form) of an ellipse can be obtained starting from the general equation, then performing a translation operation of the Cartesian axis [21], followed by a rotation operation. The translation operation can be done by simply subtracting the average values of the original voltage and current signals. The rotation operation creates a new PF between the rotated signals, which is zero.
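The ellipse relation and the least-squares estimation of Section II-C can be checked numerically with a small sketch. It is illustrative only and is not the authors' implementation: it assumes sinusoidal signals $v = V_p\sin(\omega t)$ and $i = I_p\sin(\omega t - \varphi)$, which satisfy $(v/V_p)^2 - 2(v/V_p)(i/I_p)\cos\varphi + (i/I_p)^2 = \sin^2\varphi$; it fits the reduced conic $A v^2 + B v i + C i^2 = 1$ (a centered ellipse, i.e., after removing the mean values); and the per-unit peak values, the function names, and the coefficient-matching formulas are assumptions made for the example.

```python
import math

def solve_linear(a, b):
    """Solve the square system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix [a | b]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def fit_centered_conic(v, i):
    """Least-squares estimate of (A, B, C) in A v^2 + B v i + C i^2 = 1 (assumed form)."""
    rows = [(vk * vk, vk * ik, ik * ik) for vk, ik in zip(v, i)]
    xtx = [[sum(r[p] * r[q] for r in rows) for q in range(3)] for p in range(3)]
    xty = [sum(r[p] for r in rows) for p in range(3)]  # right-hand side is all ones
    return solve_linear(xtx, xty)

def peaks_and_pf(A, B, C):
    """Match coefficients with the ellipse relation divided by sin^2(phi)."""
    pf = -B / (2.0 * math.sqrt(A * C))
    sin2 = 1.0 - pf * pf
    return 1.0 / math.sqrt(A * sin2), 1.0 / math.sqrt(C * sin2), pf

# Noise-free synthetic phase, 32 samples per cycle, per-unit peaks (illustrative values).
V_P, I_P, PF = 1.2, 0.9, 0.8
phi = math.acos(PF)
angles = [2.0 * math.pi * k / 32 for k in range(32)]
v = [V_P * math.sin(w) for w in angles]
i = [I_P * math.sin(w - phi) for w in angles]

# Largest violation of the ellipse relation over the sample (should be rounding error only).
worst = max(
    abs((vk / V_P) ** 2 - 2.0 * (vk / V_P) * (ik / I_P) * PF
        + (ik / I_P) ** 2 - math.sin(phi) ** 2)
    for vk, ik in zip(v, i)
)

A, B, C = fit_centered_conic(v, i)
v_hat, i_hat, pf_hat = peaks_and_pf(A, B, C)
```

Dividing the ellipse relation by $\sin^2\varphi$ gives $A = 1/(V_p^2\sin^2\varphi)$, $B = -2\cos\varphi/(V_p I_p \sin^2\varphi)$, and $C = 1/(I_p^2\sin^2\varphi)$, which is why the PF is recovered as $-B/(2\sqrt{AC})$ in this sketch. Working in per-unit values keeps the normal equations well conditioned; with raw kV and kA magnitudes a rescaling step would likely be needed.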


The reduced form of (8) after the operation of rotation is given as

(12)

where $a$ and $b$ are the radii of the rotated ellipse. For $\varphi = 0$ or $\varphi = \pi$, the final conic equation represents a degenerate ellipse with one of the radii equal to zero and, consequently, the equation defines a straight line. For the remaining cases, (8) and (9) represent an ellipse.

E. Modeling Stochastic Components

The elliptical representation of the behavior of both voltage and current signals does not consider, at first, the existence of the noise which is inherent in the processes of generation and transmission, or even disturbances related to digital recorders (DRs). The DRs collect current and voltage data in the power substations. Even under normal operating conditions and without the occurrence of faults, the transmission lines are constantly subjected to noise that is present in current and voltage signals. In this case, it is of interest to quantify the noise components and to incorporate these components into the parametric equation of the ellipse. In order to do this, we propose an alternative model for the behavior of voltage and current signals by adding two stochastic components: one component to the voltage signal and the second component to the current signal, as follows:

(13)

where $\varepsilon_v$ and $\varepsilon_i$ are two random variables, with means of zero and variances of $\sigma_v^2$ and $\sigma_i^2$, respectively. The model proposed in (13) has interesting mathematical properties, as will be shown next. The fact that the stochastic components have means of zero indicates that the random variables do not affect the average behavior of the ellipse. Both stochastic variables are associated only with the noise, that is, the random dispersion of the voltage and current signals with respect to their nominal values under normal operating conditions. The fact that the proposed stochastic model is multiplicative with respect to the nominal peak values means that the variances of the noise are proportional to the square of the peak values of the voltage and current signals ($V_p^2$ and $I_p^2$). This specific formulation has advantages with respect to the standardized operation of the signals. Thus, the standard equations for voltage and current signals, assuming the stochastic components, are defined as

(14)

It is possible to show that the expected value of the signal with the stochastic component is given by (8). That is, the stochastic component changes neither the rotation nor the mean value of the ellipse with respect to the Cartesian plane.

In order to fit the model with the stochastic components, it is necessary to estimate the dispersion parameters $\sigma_v$ and $\sigma_i$. We estimate these parameters by first finding the solution of a 1-D optimization problem that searches for the position on the ellipse that minimizes the squared Euclidean distance of a point to the ellipse

(15)

Therefore, the residuals are calculated as

(16)

where $a$ and $b$ are the radii of the rotated ellipse, described previously. In this case, we use the optimization method named golden search [22] to find the minimizer. We repeat the optimization procedure for a sample of $N$ points, where $N$ is the sample size. These points represent the EPS under normal operating conditions. Finally, we use the residuals to estimate the dispersion parameters

(17) (18)

It is worth mentioning that our proposed model projects the temporal behavior of the voltage and current signals into a statistic 2-D space. In this space, we estimate the residuals by simply calculating the minimum distance of a point to the ellipse. This procedure has the advantage of estimating voltage and current residuals simultaneously. Having found estimates for the dispersion parameters $\sigma_v$ and $\sigma_i$, the ultimate goal of the stochastic model is to build boundaries around the ellipse for the operation of the phases (i.e., voltage and current signals of each phase of a transmission line), under normal operating conditions. These boundaries are built based on statistical inference analysis [19].

F. Confidence Intervals for the Ellipse Under Normal Operating Conditions

The following stochastic model is being considered for each phase of any transmission line operating under normal operating conditions:

(19)

where $a$ and $b$ are the radii of the rotated ellipse, and $\varepsilon_v$ and $\varepsilon_i$ are the two random variables. We want to build a $\gamma$-level confidence interval for $a$ and $b$. To do so, we initially evaluate the points at the time instant at which the ellipse crosses one of the axes of the rotated Cartesian plane. For this specific condition, the coordinate along that axis equals the corresponding radius plus its noise component, and it follows an unknown distribution with mean equal to that radius and variance equal to that of the noise component. Similarly, at the time instant at which the ellipse crosses the other axis, the coordinate along that axis


also follows an unknown distribution, with mean equal to the other radius and variance equal to that of its noise component. These two time instants represent the axes of the Cartesian plane where the ellipse is located. If $\varepsilon_v$ and $\varepsilon_i$ are Gaussian random variables, then at these two instants the confidence intervals are defined as

(20)

where $z$ is the $z$-score statistic for the $\gamma$ confidence level. For instance, choosing a confidence level of 99.7%, then $z = 3$ [19]. In this case, the upper and lower limits of the confidence interval are represented by equations of ellipses with adjusted radii. At the first axis crossing, the radius is increased by $z$ standard deviations of the corresponding noise component, which gives the upper limit, and decreased by the same amount, which gives the lower limit. If the decreased radius is not positive, then the lower limit does not exist. The same principle applies to the second axis crossing and the other radius. Nevertheless, away from these two instants, the additive effect of the proposed stochastic model does not hold. In practice, for the construction of confidence intervals, the additive form proposed in the stochastic model does not result in significant loss in the accuracy of the upper and lower limits. Using the proposed additive model has advantages, such as generating upper and lower limits that are also ellipses, and generating simple rules to assess whether points are within the confidence region.

Fig. 3. Upper and lower limits defined by modeling the stochastic component, and assuming nominal operating conditions.

A simple rule to determine whether a point is within the confidence region is to check whether the following conditions are satisfied:

(21)

Fig. 3 shows the confidence region for the ellipse, under normal operating conditions, and considering Gaussian random noise with a dispersion parameter of 0.02 (or 2%).

1) Confidence Intervals for Non-Gaussian Residuals: Confidence intervals can also be estimated if there is evidence that the residuals do not follow a Gaussian distribution. In this situation, upper and lower limits are obtained using the order statistics or the percentiles of the residuals. For instance, assuming a 99% equal-tailed interval, the rule to determine whether a point is within the confidence region is rewritten as

(22)

where the lower limits use the 0.5th percentiles of the voltage and current residuals, and the upper limits use the 99.5th percentiles of the voltage and current residuals. Briefly, we sort the residuals in ascending order and select the value that leaves 0.5% of the residuals below it and the value that leaves 0.5% of the residuals above it. By doing so, an empirical confidence interval of 99% (confidence level) is created, if the random variables $\varepsilon_v$ and $\varepsilon_i$ are independent.

G. Fitting the Model Under Normal Operating Conditions

Using a sample of size $N$ for both voltage and current signals under normal operating conditions, the peak values $V_p$ and $I_p$ are estimated, as shown in Section II-C. Alternatively, nominal values can be used, without the need for a sample. The voltage and current signals are then standardized using these peak values. Next, the sample points are rotated. As a consequence, the expected behavior of current and voltage signals is given by the ellipse in its reduced form (see (12)). Finally, the residuals and their variances are estimated. By doing so, a control region is established by the upper and lower limits, as defined previously in Section II-F.

H. Fault Detection and False Fault Detection

Having defined the control region, each phase of the transmission line is continuously monitored until three consecutive points violate the boundary conditions by leaving the control region. For instance, if the signals are monitored using a sampling rate of 32 points per cycle, then a fault is detected in approximately 0.094 cycles. Assuming a confidence level of 99.7%, it is expected that under normal operating conditions, an average of 0.3% of the points will leave the boundaries. Therefore, a false fault occurs, on average, once every 333 sampled points. Nonetheless, if we consider three consecutive points outside the boundaries as the fault detection criterion, then, under normal operating conditions, a false fault occurs approximately once every 40 000 000 sampled points, or once every

$$\frac{1}{(1-\gamma)^{n}} \tag{23}$$

points on average. In this case, $\gamma$ is the confidence level and $n$ is the number of consecutive points. In general, the larger the confidence level and the number of consecutive points outside the boundaries, the smaller the chance that a false fault will occur.
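The figures quoted in Section II-H can be reproduced with a few lines of arithmetic. This is a sketch under the assumption, implicit in the quoted numbers, that successive samples leave the control region independently with probability $1 - \gamma$; the function names are illustrative.

```python
def false_alarm_spacing(gamma, n):
    """Average number of sampled points between false faults.

    Assumes each point leaves the control region independently with
    probability (1 - gamma) and that n consecutive exits raise a fault.
    """
    return 1.0 / (1.0 - gamma) ** n

def detection_delay_cycles(n, samples_per_cycle):
    """Minimum detection delay, in cycles, when n consecutive exits are required."""
    return n / samples_per_cycle

single_point = false_alarm_spacing(0.997, 1)   # about 333 points
three_points = false_alarm_spacing(0.997, 3)   # about 3.7e7 points
delay = detection_delay_cycles(3, 32)          # 0.09375 cycles
```

With $\gamma = 0.997$ and $n = 3$ the spacing is roughly 37 million points, which the text rounds to 40 000 000, and three samples at 32 samples per cycle give the 0.094 cycles quoted for the detection delay.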


Fig. 4. Flowchart of the proposed method for monitoring the phases of a transmission line.
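The monitoring steps of Sections II-E to II-H can be sketched in code. This is a hedged illustration, not the authors' implementation: the display equations for the residuals and the confidence region are not reproduced in this copy, so the sketch assumes an axis-aligned ellipse $(a\cos\theta, b\sin\theta)$ in the rotated plane, limit ellipses with radii $a \pm z\sigma_v$ and $b \pm z\sigma_i$, a root-mean-square dispersion estimator, and a search bracket in which the distance is unimodal. All function names are hypothetical.

```python
import math

GOLDEN = (math.sqrt(5.0) - 1.0) / 2.0  # inverse golden ratio, about 0.618

def golden_section_min(f, lo, hi, tol=1e-10):
    """Minimize a unimodal function f on [lo, hi] by golden-section search."""
    c = hi - GOLDEN * (hi - lo)
    d = lo + GOLDEN * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:                      # minimum lies in [lo, d]
            hi, d, fd = d, c, fc
            c = hi - GOLDEN * (hi - lo)
            fc = f(c)
        else:                            # minimum lies in [c, hi]
            lo, c, fc = c, d, fd
            d = lo + GOLDEN * (hi - lo)
            fd = f(d)
    return 0.5 * (lo + hi)

def residuals(x, y, a, b):
    """Offset of (x, y) from its closest point on the ellipse (a cos th, b sin th)."""
    def dist2(th):
        return (x - a * math.cos(th)) ** 2 + (y - b * math.sin(th)) ** 2
    th0 = math.atan2(y / b, x / a)       # starting guess; bracket assumed unimodal
    th = golden_section_min(dist2, th0 - math.pi / 2, th0 + math.pi / 2)
    return x - a * math.cos(th), y - b * math.sin(th)

def dispersion(values):
    """Root-mean-square of zero-mean residuals (assumed estimator)."""
    return math.sqrt(sum(e * e for e in values) / len(values))

def inside_region(x, y, a, b, sv, si, z=3.0):
    """True if (x, y) lies between the assumed inner and outer limit ellipses."""
    outer = (x / (a + z * sv)) ** 2 + (y / (b + z * si)) ** 2 <= 1.0
    ai, bi = a - z * sv, b - z * si
    # If a decreased radius is not positive, the lower limit does not exist.
    inner = ai <= 0.0 or bi <= 0.0 or (x / ai) ** 2 + (y / bi) ** 2 >= 1.0
    return outer and inner

def detect_fault(points, a, b, sv, si, n=3, z=3.0):
    """Index of the n-th consecutive point outside the region, or None."""
    run = 0
    for k, (x, y) in enumerate(points):
        run = 0 if inside_region(x, y, a, b, sv, si, z) else run + 1
        if run >= n:
            return k
    return None

# Illustrative use: two cycles at 32 samples per cycle; from sample 40 onward the
# second coordinate is scaled by 5 to mimic a sudden current increase.
A_R, B_R = 1.3, 0.6
points = []
for k in range(64):
    th = 2.0 * math.pi * k / 32
    gain = 5.0 if k >= 40 else 1.0
    points.append((A_R * math.cos(th), gain * B_R * math.sin(th)))
fault_index = detect_fault(points, A_R, B_R, 0.02, 0.02)
```

In this toy run the disturbance starts at sample 40 and is flagged at sample 42, that is, after three consecutive points outside the region. The golden-section bracket of half a turn around the starting guess is a design shortcut that works for points close to the ellipse; a more careful implementation would verify unimodality or compare several brackets.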

Nevertheless, a larger confidence level or a larger number of consecutive points also increases the time required to detect a true fault. Therefore, the choice of the confidence level $\gamma$ and the number of consecutive points $n$ must balance the time required to detect true faults against the number of false faults.

I. Classification of Faults

As shown in Section II-H, even under normal operating conditions, the model will eventually detect a false fault. We handle this situation by including the normal behavior of the EPS as one of the possible outcomes of the classifier. Therefore, when a fault is detected, the classifier is able to determine whether the fault is false. After the fault detection step, the classification stage of the fault starts. New incoming data, after fault detection, are first standardized using previous prefault estimates. That is, we use the prefault peak values. Then, the operations of rotation and translation are applied to the points. Recall that these operations also use the parameters estimated from the initial sample of size $N$, assuming normal operating conditions. A flowchart of the proposed method for monitoring the phases of a transmission line is shown in Fig. 4. For each phase, one postfault ellipse is estimated and the following parameters are used as input patterns for the classifier: radius of the ellipse, projected on the vertical axis; radius of the ellipse, projected on the horizontal axis; absolute value of the rotation angle of the ellipse, with respect to the vertical axis; peak value of the transformed voltage signal; peak value of the transformed current signal; and the postfault power factor. We explored other parameters as potential inputs for the classifier. Nevertheless, the parameters listed above achieved the best results. The output of the classifier is defined as the type of fault. It is worth noting that at the postfault stage, there is no need to build a control region. We are mostly interested in evaluating the expected behavior of the postfault ellipse. Therefore, we estimate neither the postfault variances nor the residuals.
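As a rough illustration of how such input patterns might be derived from a fitted postfault ellipse, the sketch below converts the coefficients of a centered conic $A x^2 + B x y + C y^2 = 1$ into six features. The conic form, the function name `ellipse_features`, the use of major and minor radii in place of the vertical and horizontal projections, and the coefficient-matching formulas for the peak values and PF are assumptions for this example; the exact definitions used by the authors are not given in this section.

```python
import math

def ellipse_features(A, B, C):
    """Six illustrative features of the ellipse A x^2 + B x y + C y^2 = 1."""
    mean = 0.5 * (A + C)
    spread = math.hypot(0.5 * (A - C), 0.5 * B)
    lam_small, lam_large = mean - spread, mean + spread  # eigenvalues of the quadratic form
    if lam_small <= 0.0:
        raise ValueError("coefficients do not describe an ellipse")
    r_major = 1.0 / math.sqrt(lam_small)
    r_minor = 1.0 / math.sqrt(lam_large)
    angle = 0.5 * math.atan2(B, A - C)    # direction of the minor axis
    pf = -B / (2.0 * math.sqrt(A * C))    # matches the sinusoidal ellipse relation
    sin2 = 1.0 - pf * pf
    v_peak = 1.0 / math.sqrt(A * sin2)    # relative to the prefault peak if the
    i_peak = 1.0 / math.sqrt(C * sin2)    # signals were standardized beforehand
    return [r_major, r_minor, abs(angle), v_peak, i_peak, pf]

# Standardized ellipse with PF = 0.8: A = C = 1/0.36 and B = -1.6/0.36.
features = ellipse_features(1.0 / 0.36, -1.6 / 0.36, 1.0 / 0.36)
```

For this standardized example the radii come out as $\sqrt{1.8}$ and $\sqrt{0.2}$, the axis angle has magnitude $\pi/4$, both relative peak values equal 1, and the PF is 0.8.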
Selected Classifiers: Selected classifiers were available from the machine learning platform WEKA [23]. The WEKA platform has been previously used in fault detection and classification for power transmission lines [24], and it provides an extensive list of classifiers. We applied the following classifiers: Bayesian networks, naive Bayes, logistic regression, radial basis functions, multilayer perceptron, decision table, decision table naive Bayes, $k$-nearest neighbor (kNN), AdaBoost, and bagging, among others. Overall, we tested 22 different classifiers. Among the tested classifiers, the decision table naive Bayes (DTNB), the Bayesian networks, and the $k$-nearest neighbors provided the best results, as shown next. The parameters of the classifiers were adjusted using a five-fold cross-validation procedure [25]. In the following, two examples using the monitoring (fault detection) system and the classification system are shown. In the first case, a simulated database is used to detect and classify faults in transmission lines. In the second case, a real database is used to evaluate and classify weather events that have compromised the operation of transmission lines of a power transmission company located in the state of Minas Gerais, Brazil.

III. FAULT STUDIES

It is worth mentioning that the ellipse equation aims at modeling the mean behavior of each phase of any transmission line under normal operating conditions. After a fault, we fit the same model to the voltage and current signals, but we standardize the postfault signals by applying the peak values, rotation angle, and translation parameters which were estimated before the fault. As a consequence, all parameters of the postfault ellipse are interpreted as relative values. For instance, if the postfault ellipse has a peak value of the current signal equal to 3, then it can be said that the postfault current peak value, within the 1.4 cycles after the fault detection, is three times larger than the prefault peak value. The same principle applies to the angle of the postfault ellipse. The use of our proposed parametric models has further advantages. First, our model mimics the functional behavior of the transmission line and provides robust information about the transmission line just a few moments after the fault; that is, the effects of adverse events such as fault current decay, switching of shunt capacitors, and current variations are treated as noise components and, as a consequence, do not substantially compromise the postfault estimates of the ellipse.
In practice, this means that the ellipse estimated using 1.4 postfault cycles will not differ much from an ellipse estimated using 2 or 3 postfault cycles. In the following, we present two case studies using our proposed method. The first case study uses a simulated database. The second case study uses a real database.

A. Case Study 1: Simulated Database

An EPS model was simulated using the software Power Systems CAD (PSCAD, pscad.com, accessed July 13, 2012), considering two power sources connected by a 230-kV transmission line, 200 km long. The simulated faults were of the short-circuit type (AB, AC, BC, ABC, A-G, B-G, C-G, AB-G, AC-G, BC-G, where A, B, and C are the three phases of any transmission line and G means ground) and of the open-circuit type (A-open, B-open, C-open, AB-open, AC-open, BC-open), resulting in a total of 320 faults. We also included 20 simulations without any faults. The simulations of the faults assumed different values for the following quantities: type of the fault, distance to the fault, fault resistance type, angle of the fault, and power factor. Table I shows the parameters of the simulated faults, where 100 km and . For open-circuit faults, the resistance was set at 1 MΩ. For each scenario described in Table I, 16 simulations of faults were created, as previously described. Normal operating conditions were included in the simulated faults; that is, if an AB short-circuit fault is simulated, then the type of fault for the C phase is normal operation. Therefore, for each simulated


TABLE I PARAMETERS OF SIMULATED FAULTS

TABLE II DECISION TABLE FOR THE FINAL CLASSIFICATION OF THE FAULT

fault, the possible outputs for each phase are: 1) normal operation, 2) short circuit between phases, 3) short circuit between phase and ground, and 4) open phase. The classification results for each phase are applied to a decision table, whose output indicates the final diagnosis of the fault in the transmission line. Table II shows the decision table.

B. Results of the Simulated Database

Fig. 5 shows a simulated AB short-circuit fault. Fig. 5(a) shows the behavior of the current and voltage signals just before

Fig. 5. Three-phase voltage and current signals, and the postfault ellipse for a simulated AB short-circuit fault. (a) Three-phase signals with simulated AB fault. (b) Detection step function and classification step function, which show the moment at which the fault starts (horizontal dashed lines), the moment at which the fault is detected, and the moment at which the fault is classified. (c) Prefault and postfault signals and the postfault estimated ellipse for phase A.

and after the fault. Fig. 5(b) shows the fault detection step function and the classification step function, which describe the moment at which the fault starts (horizontal dashed lines), the moment at which the fault is detected, and the moment at which the


TABLE III CLASSIFICATION TABLE

fault is classified. The fault was detected after three consecutive points out of the control region, or 0.094 cycles. The classification of the fault was achieved after two cycles. Fig. 5(c) shows the prefault and postfault signals projected into the 2-D space and the estimated postfault ellipse (solid line). It can be seen that the postfault signal is very noisy. Nevertheless, the postfault ellipse captures the average behavior of the signal within two cycles. It is worth mentioning that the estimated variances for the simulated database were very small and very similar. Therefore, we assume that the two dispersion parameters are equal, and that the residuals are independent and follow a Gaussian distribution. Table III shows the classification results. The Bayes network classifier [26] achieved the best result. The decision table naive Bayes (DTNB) [27] classifier and the $k$-nearest neighbor classifier also achieved good results. On average, the classification rate is 96.6%. Fig. 6 shows the simulated faults projected into the 2-D space of the following predictive variables: current relative peak value and the postfault power factor. It can be noticed that the classes are grouped together and nonlinearly separable. As a consequence, nonlinear classifiers with low complexity, such as decision trees, Bayesian networks, decision table naive Bayes (DTNB), and kNN, provide high classification rates, as shown in Table III. Fig. 6 also shows that the peak value of the postfault current signal, within the two cycles after the fault, may reach 20 times the peak value of the current signal under normal operating conditions for the short circuit between phases and the short circuit between phase and ground. It can be seen that the peak value of the postfault current signal for the open-phase fault is slightly larger than the peak value of the prefault current signal. This is because the time required for the current signal to reach zero is longer than 1.4 cycles after the fault.
It is also interesting that for normal operation the power factor ranges from -1 to 1. This is because of false faults which, in this case, were created using random pieces of voltage and current signals under normal operating conditions. Table IV shows the misclassified patterns for each fold of the five-fold cross-validation procedure, and for each phase (A, B, and C) of the transmission line. The first fold has three patterns with at least one misclassified phase in each. For the first pattern, only the output of phase B was incorrectly classified. It was classified as short circuit between phase and ground. Nevertheless, phases A and C were correctly classified. For the second pattern, all phases were incorrectly classified. For the third pattern, again, only one phase was incorrectly classified. Folds 2, 3, and 5 show one pattern each, again, with at least one misclassified phase. Fold 4 does not show any misclassified pattern. For the second fold, all phases were incorrectly classified, which is similar to the second pattern in the first fold. The pattern in the third fold shows the outcomes of phases A and

Fig. 6. The 2-D space of predictive variables: current relative peak value and postfault power factor.

TABLE IV MISCLASSIFICATION OUTCOMES FOR EACH ONE OF THE FIVE-FOLD VALIDATION SETS, USING THE BAYESIAN NETWORK CLASSIFIER FOR EACH PHASE (A, B, AND C) OF THE TRANSMISSION LINE

B switched. The fifth fold also shows one incorrectly classified phase. In these cases, regardless of the switched output between the two phases, the general type of fault is correctly identified. On average, only one phase is incorrectly classified for each pattern. The exceptions are the second pattern of the first fold and the pattern of the second fold, in which the true type of fault is a three-phase short circuit (ABC short circuit) and the classification output is a three-phase short circuit and ground. For these two cases, the classifier is able to partially identify the type of fault, which is a three-phase short circuit.

C. Case Study 2: CEMIG Database

The CEMIG database consists of 41 records of faults reported from 2001 to 2003. Each record provides prefault and postfault voltage and current signals for each phase of the transmission line. Four different types of faults are provided. The types of faults and their respective codes are: falling tree on transmission line (W1); electrical lightning in transmission line (ND); cable entanglement (K6); fire close to the transmission line (AQ). The CEMIG data set does not report the type of fault for each phase separately. Therefore, the classifier cannot be designed to detect the type of fault for each phase, as was done for the simulated data set. In this case, the output of the classifier is the fault type of the transmission line and the inputs are the features of the ellipse of each phase, grouped in one single input vector as


TABLE V CEMIG DATABASE

follows: we first compare the postfault peak value of each current signal, for each phase, with its prefault peak value. We create an input vector for the classifier in which the initial elements are the parameters of the ellipse of the phase, among the three phases, which achieved the highest ratio between the postfault and prefault current peak values. Next, the parameters of the ellipses of the remaining phases are included in the input vector, in descending order of the ratio between postfault and prefault current peak values. We choose the peak values of the current signals to sort the elements of the input vector because, in practice, current signals were more sensitive to faults than voltage signals.

Table V shows the number of records in the CEMIG database for each type of fault. The database also includes 41 records of the transmission line under normal operating conditions. In general, the number of records is much smaller than in the simulated database.

D. Results of the CEMIG Database

Initially, we investigate some of the properties of the voltage and current residuals, as initially shown in (13). We sampled 650 points, or approximately 20.3 cycles, from both voltage and current signals under normal operating conditions. Next, we estimate the residuals for both voltage and current signals using (16). Fig. 7 shows the histograms of the residuals for both voltage and current signals. The solid line represents the density of the Gaussian distribution. It is worth noting that the empirical distribution of the residuals, i.e., the histograms, does not follow the Gaussian distribution. By applying the Anderson-Darling normality test [28] to the voltage and current residuals, we obtain p-values of and , which provide statistical evidence that the residuals do not follow the Gaussian distribution. Furthermore, the estimates of the standard deviations of the residuals are 0.0066 and 0.0194.
We replicate this analysis for the different phases of the transmission line, and the results also indicate non-Gaussian behavior of the residuals. We therefore conclude that the assumption of a Gaussian distribution of the residuals is not consistent with the CEMIG database, and we apply the proposed analysis for non-Gaussian residuals, as shown in Section II-F1. We also tested whether the random variables and are independent. The empirical linear correlation between the current and voltage residuals is 0.2411 with a p-value of . Therefore, we conclude that the residuals are not independent. As a consequence, we cannot assume that, using the 0.5th and the 99.5th percentiles of the residuals in (22), the true confidence level is 99%. Furthermore, we cannot assume that a false fault occurs every 8000 sampled points, as shown in (23). Nevertheless, even if the residuals are correlated, it is possible to estimate the empirical false fault rate. To do so, we use the sampled

Fig. 7. Histograms of voltage and current residuals under normal operating conditions. Solid line represents the Gaussian density distribution. (a) Residuals of voltage. (b) Residuals of current.

Fig. 8. Behavior of the voltage and current signals for four different types of faults. (a) Fire close to the transmission lines. (b) Electrical lightning. (c) Falling tree on transmission line. (d) Cable entanglement.

points and, after building the control region, we estimate the proportion of points which lie within the control region. For the CEMIG database this value is 84.46%. Then, applying (23) with 0.8446 and considering three consecutive points out of the control region as our fault detection criterion, a false fault occurs every 267 sampled points. As shown in Section II-H, it is possible to change the expected time required to detect false faults by increasing the control region or the number of consecutive points outside the control region. In particular, the sampling rate of the CEMIG database is 64 points per cycle. Therefore, if we choose seven consecutive points, then a false fault will occur, on average, every 457 000 points, and the time required to detect faults will be approximately 0.11 cycles. Furthermore, even if a false fault occurs, results show that the classifier is able to correctly identify false faults.

Due to the opening time of the breaker, which usually starts, on average, within two sine cycles after the fault, the samples of the postfault signals were chosen as 1.4 sine cycles after the fault. Fig. 8 shows the prefault and postfault voltage and current signals, as well as the postfault ellipse, for four different types of faults. The figure illustrates that the angle of the postfault ellipse


TABLE VI CONFUSION MATRIX OF CEMIG DATABASE

Fig. 9. Three-phase voltage and current signals, and the projected ellipse, before and after a fault of the type fire close to the transmission line. (a) Prefault and postfault voltage and current signals of a transmission line. (b) Detection step function and classification step function, which show the moment at which the fault is detected and the moment at which the fault is classified. (c) Prefault and postfault signals and the estimated postfault ellipse for phase A.

is quite distinct among the different types of faults. It is also evident that the amplitudes of the postfault signals are different. It can be seen for the electrical lightning fault [see Fig. 8(b)]

that the ellipse is very narrow with large values of the postfault voltage and current signals (i.e., and ).

Fig. 9 shows the behavior of the current and voltage signals before and after a fault caused by fire close to the transmission line. The fault was detected after seven consecutive points out of the control region, or 0.1094 cycles (or 1.82 ms). In this case, the sampling rate is 64 points per cycle, as previously mentioned. The classification of the fault was achieved after 1.4 cycles, as shown in Fig. 9(b). Fig. 9(c) shows that, for this particular fault, the postfault ellipse is not noisy. Furthermore, 1.4 cycles after the detection of the fault were sufficient to properly estimate the ellipse before the complete shutdown of the line. It is worth mentioning that phase A achieved the highest ratio of postfault to prefault current peak values and, therefore, its estimated parameters were used as the first elements of the input vector of the classifier.

In the state of Minas Gerais, during the dry season, fires in the woods close to transmission lines are quite common. The fire may eventually cause short circuit faults, but may cause other types of faults as well. Nonetheless, the fire usually affects all phases of the transmission line almost simultaneously and, as a consequence, its pattern is quite distinct. In fact, our approach provides high classification rates for this type of fault, as shown in Table VI.

As previously described, the inputs of the classifiers are formed by six different measures of each phase. The applied measures were the same as those used for the simulated data. The classification models were adjusted using five-fold cross-validation. Among the evaluated classification methods, BayesNet achieved the best results. The confusion matrix for the BayesNet method is shown in Table VI. From Table VI, it can be seen that 63 samples were correctly classified overall, which represents a classification rate of 76.83%.
This number represents the sum of the elements of the diagonal. These results are promising, even though the sample size is small. It is worth noting that the classifier was able to correctly identify all samples related to normal operating conditions. We included this category among the fault types in order to minimize the effects of false faults, as described in Section II-H. With regard to faults related to fire close to the transmission line (AQ), the classifier achieved a classification rate of 73%. The classification rate for the four cases of cable entanglement faults (K6) is 50%; in this case, the sample size is extremely small. The classification rate for the electrical lightning fault type (ND) is 46.7%. For this class, some of the samples were erroneously classified as fire close to the transmission line (26.7%), cable entanglement (20.0%), and falling tree on transmission line (6.7%).
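As a check on the overall rate, the calculation below assumes that the total number of samples is the 41 fault records plus the 41 normal-operation records described for the CEMIG database. The symbols $n_{ii}$ (diagonal entries of the confusion matrix) and $N$ (total number of samples) are introduced here only for illustration.

```latex
\text{classification rate}
  = \frac{\sum_{i} n_{ii}}{N}
  = \frac{63}{41 + 41}
  = \frac{63}{82}
  \approx 0.7683 = 76.83\%
```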


Fig. 10. The 3-D space of the predictive variables: current relative peak value, projected radius of the postfault ellipse, and postfault power factor.

Specifically, for the falling tree on transmission line fault type (W1), the estimates of the parameters of the ellipse were compromised because of singular matrices in the least squares estimator.

Fig. 10 shows the faults projected into the 3-D space of the following predictive variables: current relative peak value, projected radius of the postfault ellipse, and postfault power factor. The figure shows the variables related to the phase which achieved the highest ratio of postfault to prefault current peak values (i.e., these variables are the first elements of the input vector of the classifier). It can be noticed that the records related to normal operating conditions (OP) do not overlap the remaining classes. Overall, the elements of the fire close to the transmission line (AQ) and electrical lightning (ND) classes share a low degree of overlap. The elements of the falling tree on transmission line (W1) and cable entanglement (K6) classes present a higher degree of overlap. The class K6 contains just a few elements.

IV. DISCUSSION AND CONCLUSION

A new methodology for fault detection and fault classification is presented in this paper. The behavior of the voltage and current signals of any transmission line is modeled using an elliptical 2-D structure. Two stochastic components were included in the elliptical structure to account for noise under normal operating conditions. Based on statistical inference, a control region under normal operating conditions is built. Faults are quickly detected when a short sequence of points leaves the control region. Furthermore, the width of the boundaries and the number of consecutive points leaving the control region can be changed in order to detect faults faster. It is shown that if three consecutive points (or 0.094 cycles) leaving the control region are chosen, then the false fault rate is one per 40 000 000 sample points.
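The run-length rule and the figures quoted for it can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the per-sample inside/outside test of (22) is assumed to be computed elsewhere, and the spacing formula 1/(1 - p)^n is an assumed reading of (23) that treats samples as independent. It does reproduce the CEMIG values reported in Section III-D (about 267 and 457 000 points for p = 0.8446 with n = 3 and n = 7). The rate of 32 points per cycle is inferred from three points corresponding to 0.094 cycles; 64 points per cycle is the rate stated for the CEMIG database.

```python
def first_detection(outside_flags, n_consecutive):
    """Index of the sample that completes the first run of n_consecutive
    out-of-region points, or None if no such run occurs."""
    run = 0
    for idx, outside in enumerate(outside_flags):
        run = run + 1 if outside else 0  # any in-region sample resets the run
        if run >= n_consecutive:
            return idx
    return None


def detection_delay_cycles(n_consecutive, points_per_cycle):
    """Detection delay expressed in cycles of the fundamental frequency."""
    return n_consecutive / points_per_cycle


def samples_per_false_fault(p_inside, n_consecutive):
    """Average spacing of false faults, assuming the form 1 / (1 - p)^n."""
    return 1.0 / (1.0 - p_inside) ** n_consecutive


# Hypothetical sequence of per-sample test results (True = outside region).
flags = [False, True, True, False, True, True, True, False]
print(first_detection(flags, 3))

print(detection_delay_cycles(3, 32))   # simulated case, inferred rate
print(detection_delay_cycles(7, 64))   # CEMIG case, stated rate

print(samples_per_false_fault(0.8446, 3))
print(samples_per_false_fault(0.8446, 7))
```

Raising the number of consecutive points lengthens the detection delay linearly while the assumed false-fault spacing grows geometrically, which is the trade-off the text describes.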
Furthermore, if six consecutive points (or 0.187 cycles) are chosen, then the new false fault rate is one per 1 370 000 000 000 000 sample points. In this case, the false fault rate is so small that, in practice, false faults are not expected to occur. It is also worth mentioning that different fault detection strategies can be designed based on the distance, in standard deviation units, of a point which lies outside the control region. Briefly, using statistical inference, the farther a point is from the control region, the less likely that point is to represent a

false fault; therefore, it is more likely that the point represents a true fault. Thus, based on a single point, our approach provides mechanisms for fault detection in 0.031 cycles. It must be cautioned, however, that narrow boundaries will falsely classify normal operating conditions as faults. Therefore, the width of the boundary can be set based on the probability of detecting false faults.

After fault detection, 1.4 cycles of current and voltage signals are used to estimate the expected behavior of the fault. In this situation, the same model used to capture the normal operating conditions of the transmission line is applied to capture unique features of the fault. These features are used as inputs for a classifier, which determines the probable type of fault. Furthermore, the computational complexity of the proposed framework is very low, and our proposal can efficiently detect faults in real time.

Estimates for the parameters of the model are presented for both prefault and postfault conditions. The use of geometric components of the postfault ellipse as input vectors in classification models proved to be a robust and innovative strategy for classifying different types of faults, as shown by the results using simulated and real databases. The simulation case study showed that the estimated variables of the ellipse provide nonlinearly separable classes. The classification rates in both cases were promising despite the small sample size of the real database.

Furthermore, our proposal presents very low complexity compared to the Fourier transform, wavelets, or artificial neural networks (ANNs). In our proposal, each new sampling point is tested using two simple mathematical expressions, as shown in (22). The estimates of the coefficients of the postfault ellipse are obtained, almost instantaneously, by solving a linear equation. Another major advantage of the proposed method is its ability to generate a fault classification space with low complexity.
Therefore, high classification rates can be achieved with low-complexity models, such as decision trees, Bayesian networks, and k-NN. It is worth noting that the proposed methodology will classify abnormal behavior of the current and voltage signals into one of the possible types of faults previously specified by the user. Therefore, in order to properly detect nonfault transient cases, the user must include such nonfault transient cases as regular types of faults.

Future work will aim at extending the study of the statistical properties of the proposed model as well as the use of different parameters of the ellipse. Noise parameters of the postfault signal, as potential inputs to the classification models, will also be investigated further. We also aim at proposing novel extensions into the 3-D and 6-D spaces. In the latter case, the 6-D space represents the current and voltage signals of the three phases of a transmission line simultaneously.

REFERENCES
[1] S. P. Valsan and K. S. Swarup, "Wavelet transform based digital protection for transmission lines," Elect. Power Energy Syst., vol. 31, pp. 379-388, 2009.
[2] A. Abdollahi and S. Seyedtabaii, "Comparison of Fourier & wavelet transform methods for transmission line fault classification," in Proc. 4th Int. Power Eng. Optimiz. Conf., 2010, pp. 579-584.
[3] K. Gayathri and N. Kumarappan, "Comparative study of fault identification and classification on EHV lines using discrete wavelet transform and Fourier transform based ANN," Int. J. Elect., Comput., Syst. Eng., vol. 2, pp. 125-136, 2008.


[4] M. Patel, "Fault detection and classification on a transmission line using wavelet multi resolution analysis and neural network," Int. J. Comput. Appl., vol. 47, no. 22, pp. 27-33, 2012.
[5] M. J. Reddy and D. K. Mohanta, "A wavelet-fuzzy combined approach for classification and location of transmission line faults," Int. J. Elect. Power Energy Syst., vol. 29, pp. 669-678, 2007.
[6] X. Dong, W. Kong, and T. Cui, "Fault classification and faulted-phase selection based on the initial current traveling wave," IEEE Trans. Power Del., vol. 24, no. 2, pp. 552-558, Apr. 2009.
[7] A. L. O. Fernandez and N. K. I. Ghonaim, "A novel approach using a FIRANN for fault detection and direction estimation for high-voltage transmission lines," IEEE Trans. Power Del., vol. 17, no. 4, pp. 894-900, Oct. 2002.
[8] N. Zhang and M. Kezunovic, "Coordinating fuzzy ART neural networks to improve transmission line fault detection and classification," in Proc. IEEE Power Eng. Soc. Gen. Meeting, Jun. 2005, vol. 1, pp. 734-740.
[9] R. Mahanty and P. D. Gupta, "A fuzzy logic based fault classification approach using current samples only," Elect. Power Syst. Res., vol. 77, pp. 501-507, 2007.
[10] O. A. S. Youssef, "Combined fuzzy-logic wavelet-based fault classification technique for power system relaying," IEEE Trans. Power Del., vol. 19, no. 2, pp. 582-589, Apr. 2004.
[11] J. Upendar, C. Gupta, and G. Singh, "Statistical decision-tree based fault classification scheme for protection of power transmission lines," Int. J. Elect. Power Energy Syst., vol. 36, no. 1, pp. 1-12, 2012. [Online]. Available: http://www.sciencedirect.com/science/article/pii/S0142061511001864
[12] V. Malathi, N. S. Marimuthu, and S. Baskar, "Intelligent approaches using support vector machine and extreme machine for transmission line protection," Neurocomputing, vol. 73, no. 10-12, pp. 2160-2167, 2010.
[13] P. Chiradeja and A. Ngaopitakkul, "Identification of fault types for single circuit transmission line using discrete wavelet transform and artificial neural networks," in Proc. Int. MultiConf. Eng. Comput. Scientists, 2009, vol. 2, pp. 1520-1525.
[14] A. Ngaopitakkul and C. Jettanasen, "Combination of discrete wavelet transform and probabilistic neural network algorithm for detecting fault location on transmission system," Int. J. Innovative Comput., Inf. Control, vol. 7, no. 4, pp. 1861-1873, 2011.
[15] H. Ferrer, R. E. O. Schweitzer, and S. E. Laboratories, Modern Solutions for Protection, Control and Monitoring of Electric Power Systems. Pullman, WA, USA: Schweitzer Engineering Laboratories, 2010.
[16] J. L. Blackburn and T. J. Domin, Protective Relaying: Principles and Applications, 3rd ed. Boca Raton, FL: CRC, 2006.
[17] M. M. Eissa, "Current directional protection technique based on polarizing current," Int. J. Elect. Power Energy Syst., vol. 44, no. 1, pp. 488-494, 2013.

[18] M. M. Eissa, "A new digital busbar protection technique based on frequency information during CT saturation," Int. J. Elect. Power Energy Syst., vol. 45, no. 1, pp. 42-49, 2013.
[19] G. Casella and R. L. Berger, Statistical Inference, 2nd ed. Pacific Grove, CA, USA: Duxbury Press, 2002.
[20] V. Chandola, A. Banerjee, and V. Kumar, "Anomaly detection: A survey," ACM Comput. Surv., vol. 41, no. 3, pp. 15:1-15:58, Jul. 2009. [Online]. Available: http://doi.acm.org/10.1145/1541880.1541882
[21] J. L. Boldrini, S. I. R. Costa, V. L. Figueiredo, and H. G. Wetzler, Linear Algebra, 3rd ed. New York: Harper & Row, 1980.
[22] J. Kiefer, "Sequential minimax search for a maximum," in Proc. Amer. Math. Soc., 1953, vol. 4, no. 3, pp. 502-506.
[23] M. Hall, E. Frank, G. Holmes, B. Pfahringer, P. Reutemann, and I. H. Witten, "The WEKA data mining software: An update," SIGKDD Explor. Newsl., vol. 11, no. 1, pp. 10-18, 2009.
[24] A. A. Yusuff, A. A. Jimoh, and J. L. Munda, "Determinant-based feature extraction for fault detection and classification for power transmission lines," IET Gen., Transm. Distrib., vol. 5, no. 12, pp. 1259-1267, 2011.
[25] T. Hastie, R. Tibshirani, and J. H. Friedman, The Elements of Statistical Learning. New York: Springer, Jul. 2003.
[26] N. Friedman, D. Geiger, and M. Goldszmidt, "Bayesian network classifiers," Mach. Learn., vol. 29, pp. 131-163, 1997.
[27] R. Kohavi, "Scaling up the accuracy of naive-Bayes classifiers: A decision-tree hybrid," in Proc. 2nd Int. Conf. Knowl. Discovery Data Mining, 1996, pp. 202-207.
[28] M. A. Stephens, "EDF statistics for goodness of fit and some comparisons," J. Amer. Stat. Assoc., vol. 69, pp. 730-737, 1974.

André de Souza Gomes, photograph and biography not available at the time of publication.

Marcelo Azevedo Costa, photograph and biography not available at the time of publication.

Thomaz Giovani Akar de Faria, photograph and biography not available at the time of publication.

Walmir Matos Caminhas, photograph and biography not available at the time of publication.
