
Power Quality Data Evaluation in Distribution Networks Based on Data Mining Techniques

Tongyou Gu*, P. Kadurek†, J. F. G. Cobben†, A. W. Endhoven*
* Alliander N.V., Netherlands
† Eindhoven University of Technology, Netherlands

Abstract-With the increasing amount of data available from
the harmonic monitoring systems in the distribution grid, it is
becoming more important to evaluate the harmonic data.
This paper presents an algorithm using a data mining technique,
in particular mixture modeling based on the Minimum Message
Length (MML) method, to classify the harmonic data into
clusters and identify useful patterns within the data. The resulting
clusters are applied to distinguish the sources of disturbances and
the time schedule of the disturbances in the distribution grid. In
addition, the C5.0 algorithm, a supervised learning method, is used
to produce rules about how the measured data is classified into
the various clusters using the decision tree technique. These
generated rules can then be utilized to predict which cluster any
new data belongs to without clustering again.

Index Terms-clustering methods, data mining, power distribution,
power system harmonics, power quality

I. INTRODUCTION

Nowadays, customers are using more and more devices
based on power electronics, which are more vulnerable to
power quality (PQ) distortion, in particular to harmonic
distortion in the voltage supply waveform. The distribution system
operator (DSO) has to provide a certain quality of voltage supply
to customers and decrease losses in the electricity grid [1].
In order to meet the requirements of the DSO and customers, the
number of measurement points in the distribution grid
is increasing, which provides numerous PQ measurements [2].
The large amounts of PQ data hold more information than
that reported using classical techniques for PQ monitoring.
Data mining methods extract the hidden information that might
be critical for identification and diagnosis of power quality
disturbances and prediction of system abnormalities or failures
[3]. Therefore, advanced evaluation of large volumes of PQ
data is an important task.
Data mining tools are an obvious candidate for analyzing
large scale data. Data mining can be understood as a process
that uses a variety of analytical tools to identify hidden patterns
within data. One important tool is clustering, which is a sort of
classification and is often used to find patterns and anomalies in
multivariate data. Once the data is classified, another tool, the
decision tree technique, can be used to build rules or relationships
between the input data and the classes [4].
The aim of this paper is to present a methodical approach
using data mining techniques to classify harmonic data into
different clusters, which can assist in analyzing large volumes
of harmonic measurements and extracting hidden information. In
this paper the following questions are addressed, namely how:

to develop an algorithm for classifying a large amount
of harmonic data into clusters using the Minimum Message
Length (MML) technique (Section II);
to apply the MML algorithm to measured data from a
computer simulation of harmonic events in a distribution
grid (Section III);
to identify and distinguish the disturbance sources in
the distribution grid with the clusters obtained from the MML
algorithm (Section IV);
to further interpret the obtained clusters and to generate
rules about the occurrence of cases using the decision tree
technique (Section IV);
to propose the requirements of measurements, in order
to improve the accuracy of the result (Section IV).

II. DATA


A. Data Mining in Power Quality Data Analysis

Data mining is popular in many research fields [5].
It is also a preferred candidate for assisting in the analysis
of power systems. Essentially, applying data mining tools to
power quality data provides the ability to identify the various
underlying contexts associated with the monitored sites and
the power quality disturbances of interest.
There are two important learning strategies in data mining:
Supervised Learning (SL) and Unsupervised
Learning (USL). Unsupervised learning is applied to discover
a number of pattern labels, subsets, or segments within the
data, without any prior knowledge of the target classes. In
supervised learning, each instance of data is mapped to
its associated label in order to find the interpretation of
these pattern labels [6].
B. Clustering Technique as Unsupervised Learning

Clustering is a process that segments or divides an initially
unlabeled collection of data with various attributes into a
certain number of groups or clusters [7]. As a result, the
data grouped in each cluster are similar, whereas data across
different clusters are dissimilar. Clustering can be considered
as a learning process, and as a method for analyzing large
volumes of data that are hard to analyze as a whole.

978-1-4673-3059-6/13/$31.00 ©2013 IEEE

There are a variety of clustering algorithms. The Mixture
Modeling method is chosen in this paper because it can process
a large amount of data and has a better accuracy [7].

1) Mixture Modeling Method [8]: Mixture Modeling can
be described as an unsupervised learning method which constructs
a model based on a mixture of statistical distributions
that have been learned from the data. The Gaussian distribution
is the most commonly used distribution in mixture models.
The two main steps used in the Mixture Modeling method
are parameter estimation and model selection. In order to
identify a suitable model, the distribution parameters that
could have plausibly generated the observed data values should
be estimated first.
For a single distribution, the mean and variance that make the
data most likely to have arisen from this distribution can be
calculated by differentiating the probability (likelihood) of all
data points with respect to these parameters and equating the
result to zero. However, in the case of
mixture distributions it is hard to calculate the probability of
each data point, as it is not known which distribution has generated
which data point. The expectation-maximization (EM)
algorithm can solve this problem [9].
First, initial values of the distribution parameters are selected.
In the expectation step, the probability that each point belongs
to each distribution is calculated from Bayes' rule. Second, in the
maximization step, the calculated probabilities are used to
re-estimate the distribution parameters. The two steps are repeated
until the estimates converge.
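The two EM steps described above can be sketched for a one-dimensional Gaussian mixture as follows. This is a minimal illustration only, not the Autoclass implementation: the function names, the quantile-based initialisation, the fixed iteration count and the synthetic data are all assumptions made for the example.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gmm_1d(data, k=2, iterations=100, min_var=1e-6):
    """Fit a k-component 1-D Gaussian mixture with plain EM (illustrative sketch)."""
    n = len(data)
    ordered = sorted(data)
    # Assumed initialisation: spread the initial means over the data quantiles.
    means = [ordered[(2 * j + 1) * n // (2 * k)] for j in range(k)]
    overall_mean = sum(data) / n
    overall_var = sum((x - overall_mean) ** 2 for x in data) / n
    variances = [max(overall_var, min_var)] * k
    weights = [1.0 / k] * k

    for _ in range(iterations):
        # E-step: Bayes' rule gives P(component j | x) for every point.
        resp = []
        for x in data:
            joint = [weights[j] * gaussian_pdf(x, means[j], variances[j]) for j in range(k)]
            total = sum(joint) or 1e-300  # guard against numerical underflow
            resp.append([p / total for p in joint])
        # M-step: re-estimate weight, mean and variance of each component.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # component received no probability mass; leave it unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            spread = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            variances[j] = max(spread, min_var)
    return weights, means, variances


# Synthetic two-regime data (hypothetical values, not taken from the paper).
random.seed(0)
samples = [random.gauss(0.0, 1.0) for _ in range(300)]
samples += [random.gauss(10.0, 1.0) for _ in range(200)]
weights, means, variances = em_gmm_1d(samples, k=2)
print(weights, means, variances)
```

A fixed number of iterations keeps the sketch short; a practical implementation would normally stop when the log-likelihood no longer improves.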
2) Minimum Message Length (MML) Technique in Mixture
Modeling: The MML technique is an important application of
the Mixture Modeling method for data segmentation. As the name
implies, MML evaluates models according to their ability to
compress a message containing the data.
Compression methods generally attain high compression by
formulating efficient models of the data to be encoded. The
encoded message consists of two parts. The first part describes
the model and the second describes the observed
data given that model. The total encoded message length is
then calculated and the best model (shortest total message
length) is selected.
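The two-part message can be written compactly in the usual information-theoretic form. The symbols below are notation introduced here for illustration and do not appear in the paper: $H$ is a candidate model, $D$ the observed data, and $I(\cdot)$ a message length in bits.

```latex
\begin{align}
  I(H, D) &= I(H) + I(D \mid H) \\
          &= -\log_2 P(H) - \log_2 P(D \mid H), \\
  H^{*}   &= \arg\min_{H} I(H, D).
\end{align}
```

A model with more clusters lengthens the first part but may shorten the second, so minimizing the total length balances model complexity against goodness of fit.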
Autoclass, a data mining software tool developed
by the National Aeronautics and Space Administration (NASA),
has been chosen as the Mixture Modeling software used in
this paper.
C. Decision Tree as Supervised Learning
Fig. 1. Schematic of a simplified typical MV distribution network

Once clusters are generated, supervised learning is usually
used to map each instance of data to its associated
class. A decision tree is one example of a supervised learning
technique. With the decision tree, a model is automatically
induced from the data, guided by a theoretical metric
such as entropy gain [10]. It proposes
plausible relationships between the input data (training set)
and the classes, here being the cluster labels obtained from the
unsupervised learning of MML. Once the model is trained, it
can then be applied to other data sets (future data). In doing
so, it is possible to predict which class a new data point
best belongs to. In this paper, the C5.0 algorithm [11] is used
to carry out the supervised learning.
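As an illustration of the entropy-gain metric mentioned above, the sketch below searches for the single threshold on one attribute that best separates cluster labels. It uses plain information gain, which is a simplification; it is not the C5.0 implementation, and the function names and the small data set are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the split `value <= threshold` with the highest gain."""
    base = entropy(labels)
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best = (None, 0.0)
    for i in range(1, n):
        low, high = pairs[i - 1][0], pairs[i][0]
        if low == high:
            continue  # no threshold can separate identical attribute values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        gain = base - (len(left) / n) * entropy(left) - (len(right) / n) * entropy(right)
        if gain > best[1]:
            best = ((low + high) / 2.0, gain)
    return best


# Hypothetical attribute values with cluster labels used as classes.
example_values = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
example_labels = ["S0", "S0", "S0", "S1", "S1", "S1"]
threshold, gain = best_threshold(example_values, example_labels)
print(threshold, gain)
```

A full decision tree applies this search recursively to every attribute and to each resulting subset of the data.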

A. Simulation Design

In this section, the harmonic measurement data from a
distribution grid, as illustrated in Fig. 1, is classified using the
Mixture Modeling program Autoclass. The distribution grid is
simulated in Simulink with a simulation time of 20 s, and the
waveform is sampled with 128 points per cycle.
As can be seen from Fig. 1, 12 monitors are installed:
one monitor at the substation incoming supply (Site 0); three
monitors at the beginning of the three individual feeders
(Site 1, Site 2, Site 3); and eight monitors, one at each Point of
Connection (POC) in the load area (Site 4-Site 11).
At POC Load5, a distortion source is simulated, which
generates 5th and 7th harmonic current in the time intervals
t=5s-10s and t=15s-20s. Near Load7, a capacitor is placed to
compensate reactive power; it is switched On at t=6s and Off at
t=9s.
B. Data Selection

Throughout the simulation, 1000 cycles (20 s) are recorded
at a frequency of 50 Hz. After decomposition of each cycle of
the measured waveforms in phase A by FFT, each order of
harmonic current can be obtained, as well as the fundamental
current.
Three sorts of attributes are selected as input of Autoclass:
fundamental currents, 5th harmonic currents and 7th harmonic
currents. The 3rd harmonic current is not selected because most
of it is blocked by the Δ/Y transformer, and higher order
harmonics are not selected because of their low values.
In sum, 36 attributes consisting of the fundamental currents,
as well as the 5th and 7th harmonic currents, at 12 sites
(Site 0-Site 11) have been selected.
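The per-cycle decomposition described above can be sketched as follows. This is a minimal illustration that evaluates only the three needed frequency bins with a direct discrete Fourier sum instead of a full FFT; the function names and the synthetic test cycle are assumptions, and only the 128 samples per cycle comes from the simulation set-up.

```python
import cmath
import math

SAMPLES_PER_CYCLE = 128  # sampling used in the simulation


def harmonic_magnitude(cycle, order):
    """Peak amplitude of one harmonic order in a single cycle of samples."""
    n = len(cycle)
    acc = sum(x * cmath.exp(-2j * math.pi * order * i / n) for i, x in enumerate(cycle))
    return 2.0 * abs(acc) / n


def cycle_attributes(cycle):
    """Return (fundamental, 5th, 7th) amplitudes, the three attributes used per site."""
    return tuple(harmonic_magnitude(cycle, h) for h in (1, 5, 7))


# Synthetic current cycle (hypothetical amplitudes, not taken from the paper).
cycle = [
    100.0 * math.sin(2 * math.pi * i / SAMPLES_PER_CYCLE)
    + 8.0 * math.sin(2 * math.pi * 5 * i / SAMPLES_PER_CYCLE)
    + 3.0 * math.sin(2 * math.pi * 7 * i / SAMPLES_PER_CYCLE)
    for i in range(SAMPLES_PER_CYCLE)
]
i1, i5, i7 = cycle_attributes(cycle)
print(i1, i5, i7)
```

Applying `cycle_attributes` to each of the 12 sites for one cycle yields the 36-attribute case passed to the clustering step.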
Besides, some additional data, such as the reactive power,
can be used to confirm any suspected events identified from
the clustering results. This will be explained in the next section.
The program Autoclass is applied to the selected 36 attributes
with an accuracy of measurement (Aom), which in
this example is almost 0% and is set to 0.1% according
to [12]. By observing how the measured data are classified
into various clusters, power utility engineers can more readily
deduce the power quality events that may have triggered a change
from one cluster to another. To confirm the observation,
additional information, such as reactive power, can be used.
A. Resulted Clusters
Seven clusters have been obtained and sorted in descending
order of abundance, as shown in Table I. The
first three clusters (S0, S1 and S2) cover most of the cases,
thus these three clusters represent the main steady states of
the power grid. The remaining clusters (S3, S4, S5 and S6) have
much lower abundance values, which means they represent the
transient moments. Each case in every generated cluster can be
considered as a profile of 36 variables (fundamental current,
5th and 7th harmonic current at 12 sites).
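The abundance of each cluster, and the time spans its cases occupy, can be computed directly from the per-cycle cluster labels. The sketch below assumes one label per 0.02 s cycle; the label sequence is hypothetical and only loosely mimics the simulated scenario.

```python
from collections import Counter
from itertools import groupby

CYCLE_SECONDS = 0.02  # one cycle at 50 Hz


def abundance(labels):
    """Fraction of all cases that fall in each cluster."""
    n = len(labels)
    return {cluster: count / n for cluster, count in Counter(labels).items()}


def time_intervals(labels, cycle_seconds=CYCLE_SECONDS):
    """Contiguous runs of one cluster as (cluster, start_s, end_s) tuples."""
    runs = []
    start = 0
    for cluster, group in groupby(labels):
        length = sum(1 for _ in group)
        runs.append((cluster, start * cycle_seconds, (start + length) * cycle_seconds))
        start += length
    return runs


# Hypothetical label sequence for 1000 cycles (not the paper's actual result).
labels = ["S0"] * 250 + ["S1"] * 50 + ["S2"] * 150 + ["S1"] * 50
labels += ["S0"] * 250 + ["S1"] * 250
shares = abundance(labels)
runs = time_intervals(labels)
print(shares)
print(runs)
```

Clusters with a very small share, or with runs lasting only a few cycles, are the candidates for transient events.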
Fig. 3 illustrates the 5th and 7th harmonic currents
at all sites in the first three clusters. It can be observed that
the harmonic currents in S0 are almost zero, which means this
cluster is generated by cases that have no harmonic distortion.
Fig. 2 gives the distribution of the three main clusters along the
time axis, which indicates the time period covered by the cases
in each cluster. Thus the clean period represented by S0 is
t=0s-5s and t=10s-15s.
Meanwhile, large harmonic currents appear in
S1 and S2. In cluster S1, significant harmonic values
occur at Site 0, Site 2 and Site 8. Together with the network
structure, this can be recognized as a route of major harmonic
current flow. In cluster S2, however, large harmonic values are
also found at two further sites, Site 3 and Site 10.
This may be another possible route of harmonic current.
An issue that should also be taken into consideration is that
capacitors may be mistaken for harmonic sources. The reason is
that the impedance of a capacitor is inversely proportional to
frequency. In normal cases, even a small amount of harmonic
voltage can lead to a very high harmonic current through the
capacitor.
So these two possibilities (Site 0-Site 2-Site 8 and Site
0-Site 3-Site 10) need to be further analyzed to locate the
harmonic distortion source.
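The frequency dependence of the capacitor impedance can be made explicit. The symbols below are introduced here for illustration, assuming an ideal capacitor: $h$ is the harmonic order, $f_1$ the fundamental frequency (50 Hz), $C$ the capacitance, and $V_h$, $I_h$ the harmonic voltage across and current through the capacitor.

```latex
\begin{align}
  |Z_C(h)| &= \frac{1}{2\pi h f_1 C}, \\
  I_h &= \frac{V_h}{|Z_C(h)|} = 2\pi h f_1 C \, V_h .
\end{align}
```

For the same voltage magnitude, the current drawn at the 5th and 7th harmonics is therefore 5 and 7 times the current drawn at the fundamental frequency, which is why a capacitor branch can resemble a harmonic source in current measurements.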
Table I. Abundance by cluster number (surviving values: 0.451, 0.147, 0.011, 0.011)

Fig. 2. Main result clusters mapping to time axis

B. Identification of Harmonic Source and Compensation Capacitor

By just observing the harmonic currents, it is difficult to
locate the harmonic source or the capacitor. The reactive power
shown in Fig. 4 provides further insight: capacitor
switching events have happened at Site 3.
The reactive power at Site 3 decreases at t=6s and increases
back at t=9s, which implies the capacitor is switched On and
Off at these two time points. This matches the time range
of S2 as demonstrated in Fig. 2. Therefore, it can be deduced
that S2 is generated by the use of the capacitor.
In addition, the cluster that has the lowest abundance is S6.
Its abundance is only 0.004, which means only 4 cases fall in
this cluster (1000 cases in total). Normally a small abundance
indicates an abnormal occurrence, which may need further
investigation. When mapping the 4 cases in S6 to the time axis, it is
found that S6 lies at the two ends of S2. By observing the current
waveform at Site 3 in Fig. 5, a capacitor switching event at
around t=6s can be recognized by the high frequency resonance.
This confirms the analysis result mentioned before.
In actual conditions, the pattern of reactive power may
not be so obvious, as the main function of a capacitor is to
compensate the reactive power when the demand is high.
Thus the value of measured reactive power is expected to
be approximately constant. In this situation, the clusters that
represent transients are decisive for identifying capacitor
switching events. This is one of the main advantages of the
clustering algorithm: clusters with low abundance can potentially
signify anomalous cases, which is of great help in deducing
and analyzing unique or different operating conditions.
To conclude, the harmonic distortion source is located
at Load5 and the compensation capacitor is near Load7. In addition,
the distorted time periods are t=5-10s and t=15-20s, with the
capacitor switched in during t=6-9s. These all agree with the
initial settings of the grid. Fig. 6 shows the time periods of
distortion, and Fig. 7 indicates the routes of harmonic current.
C. Rule Generation
Once the harmonic distortion source is identified, we may want
to gain a closer insight into cluster S1, which represents
cases with harmonic distortion, to understand the
range of the attributes that form this cluster. The
supervised learning C5.0 tool is applied to the measured data
set and the mixture model generated by the Autoclass program.
The generated clusters are used as class labels for the input
data (fundamental current, 5th and 7th harmonic current).
Fig. 3. The result of 5th and 7th harmonic currents at Site 0-Site 11 in clusters 0-2

Fig. 4. Reactive power at Site 2 and Site 3

Fig. 5. Current waveform at Site 3 around the time when the capacitor is
switched on (t=6s)

Generally, the data from all the 12 sites can be applied
to generate rules. Since the network structure is simplified
and contains only one harmonic source and one capacitor, here we
only choose data at Site 0 as input to simplify the calculation,
because the current at Site 0 can provide sufficient information
about the whole substation. The discovered rule for cluster S1 is
listed as follows:

if I5 > 9.228
and I7 ≤ 7.254
then 1.000
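The rule listed above amounts to a two-condition test on the 5th and 7th harmonic currents at Site 0. A minimal sketch is given below; the thresholds are those of the rule, while the function and constant names are hypothetical and the unit is assumed to be amperes.

```python
I5_THRESHOLD_A = 9.228  # lower bound on the 5th harmonic current at Site 0
I7_THRESHOLD_A = 7.254  # upper bound on the 7th harmonic current at Site 0


def matches_s1_rule(i5, i7):
    """True when a case satisfies the discovered rule for cluster S1."""
    return i5 > I5_THRESHOLD_A and i7 <= I7_THRESHOLD_A


def assign_with_rule(cases):
    """Label each (i5, i7) case as 'S1' or 'other' using only this one rule."""
    return ["S1" if matches_s1_rule(i5, i7) else "other" for i5, i7 in cases]
```

With such a predicate, a newly measured case can be tested against the rule without running the clustering again.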

This rule means that if the 5th harmonic current at Site 0 is
larger than 9.228 A and the 7th harmonic current does not exceed
7.254 A, the case will be classified in S1. This makes
sense because if there exist harmonic sources of order
5 and 7, the 5th and 7th harmonic currents will be higher
than normal. Furthermore, if the 7th harmonic current becomes
exceedingly high due to the capacitor switching in, another
grid operation state will be recognized and consequently
another cluster (in this example S2) will be created. Thus this
rule gives a good criterion for cluster S1.
The quality of the rules is described by two indices: the
number of instances assigned to the rules and the
proportion of correctly classified instances. In this example,
the number of data instances is 1000, with a proportion of
0.998, or 998 instances, being correctly classified.
The usefulness of supervised learning using C5.0 is that it
can help to analyse unusual events from a large amount of PQ
data. Additionally, provided the input of C5.0 covers almost all
possibilities, when new measured data arrives, the cluster
that the data should belong to can easily be predicted without
calculating or clustering again.

Fig. 6. Time axis of distortion ('h' for harmonic distortion, 'C' for capacitor
is on)

Fig. 7. Location of harmonic source (legend: route of harmonic source;
route of capacitor harmonic current; other routes of harmonic current)

D. Requirements of Measurement
To use these algorithms in real applications, the requirements
of measurement are as follows:
1) Time Scale of the Measurement: The suggested time
interval according to [12] for measurements of harmonic,
interharmonic and unbalance waveforms is 3 seconds for the very short
interval. This is not enough if the detail of transients is needed,
as the duration of a transient moment is usually as short as
only one cycle. In order to provide sufficient information for
data mining, a time scale of at most 0.02 s (one cycle at 50 Hz)
is required. In addition, if other power quality problems such as
voltage dips are under consideration, the time scale should be even
shorter since their duration is also shorter.
2) Placement among the Distribution Grid: In order to
locate the source of harmonic distortion in a distribution
network, there are 3 potential measurement locations: the
substation incoming supply; the individual feeders; and the POCs in
the load area.
3) Data of Measurement: Basically, harmonic current
measurements up to order 50 are required [12].
Additional measurements are optional, such as reactive power,
which may help to evaluate power quality data and analyse
the cause of power quality events.
V. CONCLUSION

With the increasing amount of data available from the
harmonic monitoring system, it is becoming more difficult to
extract useful information from the large scale of harmonic
monitoring data. This paper has illustrated that the application
of data mining, in particular mixture modeling based on the
Minimum Message Length (MML) method, to power quality
data can identify useful patterns within the harmonic data
obtained from simulations of a distribution grid. Each resulting
cluster can represent a specific operating condition such as
a capacitor switching operation or a harmonic event.
By observing the clusters obtained from the MML algorithm, the
operation status of each time period can be recognized,
including the details of harmonic currents. Thus the sources of
disturbance in the distribution grid can be distinguished, and
the time schedule of the disturbance can be obtained. Other
available data (which is not used in the clustering algorithm)
such as reactive power measurements can be used together
with the clusters to confirm the observations.
Once the clusters are generated using the MML method, the
C5.0 algorithm, a supervised learning method, is used to examine
how the measured data is classified into various clusters and
to generate rules about the occurrences of cases using the decision
tree technique. These generated rules can then be utilized
to predict which cluster any new data belongs to without
clustering again.
The proposed technique will be useful for power quality
analysis in the distribution network and will help DSOs
quickly recognize and categorize power quality events in the
network. This study also serves as an initial investigation
for Alliander N.V. (Dutch DSO), suggesting the possibility
of applying the data mining technique for evaluation of PQ
data.
REFERENCES

[1] Voltage characteristics of electricity supplied by public distribution
systems, Nederlandse norm Std. NEN-EN 50160, 2000.
[2] F. Provoost, "Intelligent distribution network design," Ph.D. dissertation,
Eindhoven University of Technology, 2009.
[3] W. W. Dabbs and T. E. Sabin, "Probing power quality data," IEEE
Transactions on Computer Applications, vol. 7, no. 2, pp. 8-14, 1994.
[4] H. Mannila, "Data mining: machine learning, statistics, and databases,"
in 8th Int. Conf. on Scientific and Statistical Database Systems, 1996,
pp. 2-9.
[5] R. Groth, Data Mining: Building Competitive Advantage.
Prentice Hall, 2000.
[6] A. T. M. Asheibi, "Discovery and pattern classification of large scale
harmonic measurements using data mining," Ph.D. dissertation,
University of Wollongong, 2009.
[7] T. Pang, M. Steinbach, and V. Kumar, Introduction to Data Mining.
Boston: Pearson Education, 2006.
[8] J. J. Oliver and D. J. Hand, Introduction to minimum encoding inference.
UK: Dept. Statistics, Open University, 1994.
[9] G. McLachlan and T. Krishnan, The EM Algorithm and Extensions.
John Wiley & Sons, 1997.
[10] I. H. Witten and E. Frank, Data mining: practical machine learning tools
and techniques. San Francisco: Morgan Kaufmann, 2005.
[11] J. R. Quinlan, C4.5: Programs for machine learning. Morgan Kaufmann
Publishers, Inc, 1993.
[12] Electromagnetic compatibility (EMC) - Part 4-30: Testing and
measurement techniques - Power quality measurement methods, International
Electrotechnical Commission Std. IEC 61000-4-30, 2008.