Non-Parametric Short-Term Load Forecasting: D. Asber, S. Lefebvre, J. Asber, M. Saad, C. Desbiens

Non-parametric short-term load forecasting
D. Asber a, S. Lefebvre a,*, J. Asber b, M. Saad b, C. Desbiens c
a IREQ, Institut de Recherche d'Hydro-Québec, 1800 Boul. Lionel Boulet, Varennes, Que., Canada J3X 1S1
b École de Technologie Supérieure, 1100 Notre-Dame Ouest, Montréal, Que., Canada H3C 1K3
c Hydro-Québec Distribution, 680 Sherbrooke Ouest, Montréal, Que., Canada H3C 4T8
Received 22 March 2005; received in revised form 30 May 2006; accepted 5 September 2006
Abstract
Load forecasting is an important problem in the operation and planning of electrical power generation, as well as in transmission and distribution networks. This paper is concerned with short-term load forecasting. It deals with the development of a reliable and efficient kernel regression model to forecast the load in the Hydro-Québec distribution network.
A set of past load history comprising weather information and load consumption is used. A non-parametric model serves to establish a relationship among past, current and future temperatures and the system loads. The paper proposes a class of flexible conditional probability models and techniques for classification and regression problems. A group of regression models is used, each one focusing on consumer classes characterising specific load behaviour. Each forecasting process has the information of the past 300 h and yields estimated loads for the next 120 h. Numerical investigations show that the suggested technique is an efficient way of computing forecast statistics.
© 2007 Elsevier Ltd. All rights reserved.
Keywords: Distribution network; Forecasting; Time series; Regression; Non-parametric
1. Introduction
In constructing a load forecasting model, a mathematical relation is established between the measured load and various factors of influence. The model contains several coefficients, with values to be determined, that quantify the magnitude of each influence. The coefficient values are chosen such that the overall error between model estimates and actual measured loads is minimized. The model is considered valid if tests conducted with numerous historical data sets result in small overall errors. Improvements to a deficient model could involve the use of a different mathematical relationship or of data that are more refined.
Irrespective of the forecasting technique, weather parameters are the key factors of Hydro-Québec short-term load forecasts: temperature and humidity are the most commonly used load predictors. Thermostat-based models have thus been developed (explanatory models). The time factors include the day of the week and the hour of the day, because there are important differences in load between weekdays and weekends. Furthermore, the load on different weekdays can behave differently. For example, Mondays and Fridays, being adjacent to weekends, often have structurally different loads than Tuesday through Thursday. This is particularly true during summer. Holidays are more difficult to forecast than non-holidays because of their relatively infrequent occurrence. For customer classes with similar load patterns, standard load curves, namely diagrams of loads as a function of time, can be obtained through load research studies based on modelling individual customer demands within a specific interval.
For short-term load forecasting (STLF) several factors must be considered, such as time factors, weather data and possible customer classes. Several forecasting methods have been developed using parameter regression or time series [1,2]. These technologies, despite some limitations, are widely used in the industry. Neural networks [3-6] and fuzzy techniques [7] have also been applied, and there are numerous publications in scientific journals. According to Hippert et al. [8], although these technologies seem to provide valid load forecasts, most investigators have used seemingly misspecified models that have been incompletely tested. More research on the behavior of large neural networks is needed before definite conclusions are drawn on the suitability of these approaches. To alleviate these problems, a heuristic approach in [9] has been rejuvenated in [10], which proposes using abductive networks. The technique is reported to offer the advantages of simplified and more automated model synthesis. It also provides analytical input-output models that automatically select influential inputs. Nonetheless, tuning is still required to improve forecasting accuracy through, for example, the inclusion of hourly temperature data and the development of dedicated seasonal models. Charytoniuk et al. [11] present another approach to short-term load forecasting, based on non-parametric regression. The main advantage of the non-parametric approach is that it is data driven and eliminates the need for a statistical analysis aimed at selecting a multivariate distribution fitting the data. This also assures portability of the proposed method: it can be used in any utility, regardless of the type of its load distribution.
* Corresponding author. E-mail address: lefebvre.serge@ireq.ca (S. Lefebvre).
Electrical Power and Energy Systems 29 (2007) 630-635. www.elsevier.com/locate/ijepes
0142-0615/$ - see front matter © 2007 Elsevier Ltd. All rights reserved. doi:10.1016/j.ijepes.2006.09.007
This paper builds on non-parametric regression and compares its outcome with the classical approaches. In Section 2, the paper first presents a simple time series based forecasting model and a thermostat-based regression model. In Section 3, non-parametric algorithms are used for STLF for commercial loads. In Section 4, the probability forecast functions of the aggregated load are developed. In Section 5, results from different models are presented for residential and commercial loads. For residential loads, the paper compares a thermostat-based model and a non-parametric model. For commercial loads, time series estimates and non-parametric estimates are compared since thermostat-based regression is not representative of these loads.
2. Time series and regression models
2.1. Basic time series model
Time series models use previous (historical) values of the data as input to the model, thus
$$y(t) \approx F\{x(t),\, x(t-dt),\, \ldots,\, x(t-n\,dt),\; y(t-dt),\, y(t-2dt),\, \ldots,\, y(t-n\,dt)\} \tag{1}$$
where $y(t)$, $y(t-dt)$ and $y(t-2dt)$ represent the load at $t$, $t-dt$ and $t-2dt$, $dt$ being the time period considered. Here $dt$ is 1 h, and $x(t)$ represents a vector of other factors such as the day, the hour or the temperature.
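As a rough illustration of how the inputs of Eq. (1) can be assembled, the following Python sketch builds the lagged regressors from hourly series. The function name `build_lagged_samples`, the reduction of x(t) to the temperature alone and the sample numbers are assumptions made for illustration; the mapping F itself is left open.

```python
def build_lagged_samples(load, temp, n_lags):
    """Return (features, targets) for y(t) ~ F{x(t..t-n*dt), y(t-dt..t-n*dt)}.

    load and temp are hourly sequences of equal length (dt = 1 h).
    Assumption: x(t) contains only the temperature.
    """
    if len(load) != len(temp):
        raise ValueError("load and temp must have the same length")
    features, targets = [], []
    for t in range(n_lags, len(load)):
        x_part = [temp[t - k] for k in range(0, n_lags + 1)]  # x(t), ..., x(t - n*dt)
        y_part = [load[t - k] for k in range(1, n_lags + 1)]  # y(t - dt), ..., y(t - n*dt)
        features.append(x_part + y_part)
        targets.append(load[t])
    return features, targets


# Hypothetical hourly data: load in p.u., temperature in degrees Celsius.
load = [0.50, 0.52, 0.55, 0.60, 0.58]
temp = [-5.0, -6.0, -7.0, -8.0, -7.5]
X, y = build_lagged_samples(load, temp, n_lags=2)
print(X[0], y[0])
```

Any regression or time series fit can then be applied to `X` and `y`; the first `n_lags` hours are dropped because their history is incomplete.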
2.2. A thermostat-based regression model for heating loads
In this model, system load closely follows home heating requirements as controlled by thermostat actions. The random heating cycles of numerous homes are not represented, thus the model cannot reproduce the combined load over time. However, with a single equation, it can reproduce the average load over the considered time period.
The dynamic model for the temperature of an average house having a heater regulated by a thermostat is of the form
$$\frac{dx}{dt} = Ax + BM + CQ, \qquad T = Hx \tag{2}$$
where $T$ is the room temperature as measured by the thermostat; $x$ is the thermostat state; $M$ is a vector of possibly non-linear functions of meteorological inputs such as outside air temperature, solar intensity and wind velocity; $Q$ is the heat flow of space heaters; $A$, $B$, $C$, $H$ are constant matrices whose values depend on house construction.
The thermostat reduces the thermal characteristics (and the effects of weather and lifestyle) to two variables: on-duration and off-duration. The on-duration $d_1$ is the time in minutes required for the room temperature $T$ to rise from the lower set point $(T_s - \Delta)$ to the upper set point $T_s$ when the heater is on. The off-duration $d_o$ is the time in minutes required for $T$ to decrease from $T_s$ to $(T_s - \Delta)$ when the heater is off. The mean value of the thermostat status under normal conditions is
$$\beta = \frac{d_1}{d_1 + d_o} \tag{3}$$
In this model, we substitute in the matrix $H$ the value of $\beta$ instead of using directly the temperature forecast.
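The mean thermostat status of Eq. (3) is simply the fraction of a heating cycle during which the heater is on. The short Python sketch below computes it; the helper `average_heating_load`, which scales that fraction by a rated heater power and a number of homes, is an illustrative assumption and not a formula given in the paper.

```python
def thermostat_duty_cycle(on_minutes, off_minutes):
    """Mean thermostat status of Eq. (3): on-duration over the full cycle length."""
    if on_minutes < 0 or off_minutes < 0 or on_minutes + off_minutes == 0:
        raise ValueError("durations must be non-negative and not both zero")
    return on_minutes / (on_minutes + off_minutes)


def average_heating_load(duty_cycle, rated_power_kw, n_homes):
    """Hypothetical average load: every home draws rated power while its heater is on."""
    return duty_cycle * rated_power_kw * n_homes


# Hypothetical cycle: 12 min heating, 18 min coasting.
status = thermostat_duty_cycle(12, 18)
print(status, average_heating_load(status, rated_power_kw=10.0, n_homes=1000))
```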
3. Non-parametric models
In load forecasting, optimal algorithms often require the knowledge of underlying densities of measurements and/or noise. As these densities are usually unknown, assumptions are frequently made that compromise the algorithm's performance. A common approach to this problem is data density estimation. If a particular density form is assumed or known, then parametric estimation is used. If nothing is assumed about the density shape, non-parametric estimation is the choice.
The kernel density estimator, also commonly referred to as the Parzen window estimator [12], is non-parametric. It is particularly attractive when no a priori information is available to guide the choice of density with which to fit the data, for example the number of variables affecting a forecast. A comprehensive review of non-parametric estimation is presented in [13]. In time series, the Parzen window is a weighted moving average transformation used to smooth measurements.
3.1. Parzen window
In the Parzen window approach, a hypercube cell of fixed width is used to investigate a region $R_n$. The region volume is
$$V_n = h_n^d \tag{4}$$
where $h_n$ is the length of the edge of $R_n$. Define, for example, the function
$$\varphi(u) = \begin{cases} 1, & |u_j| \le \tfrac{1}{2}, \quad j = 1, \ldots, d \\ 0, & \text{otherwise} \end{cases} \tag{5}$$
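$\varphi((x - x_i)/h_n)$ is equal to unity if $x_i$ falls within the hypercube of volume $V_n$ centered at $x$ and is equal to zero otherwise. With $n$ samples (independent observations) in total, the number $K_n$ of samples falling in this hypercube and the corresponding density estimate are
$$K_n = \sum_{i=1}^{n} \varphi\!\left(\frac{x - x_i}{h_n}\right), \qquad \frac{K_n}{n V_n} = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{V_n}\, \varphi\!\left(\frac{x - x_i}{h_n}\right) \tag{6}$$
In the one-dimensional case ($d = 1$, $V_n = h_n$), the probability estimates $P_n(x)$ are
$$P_n(x) = \frac{1}{n} \sum_{i=1}^{n} \frac{1}{h_n}\, \varphi\!\left(\frac{x - x_i}{h_n}\right) \tag{7}$$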
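The hypercube estimate built from the volume of Eq. (4) and the window of Eq. (5) can be sketched in a few lines of Python. The function names and the sample values are illustrative assumptions; the code only counts the samples inside the cube centered at the query point and divides by the number of samples times the cube volume.

```python
def window(u):
    """Unit hypercube window of Eq. (5): 1 if every |u_j| <= 1/2, else 0."""
    return 1 if all(abs(uj) <= 0.5 for uj in u) else 0


def parzen_estimate(x, samples, h):
    """Density estimate at point x (a tuple) from d-dimensional samples, edge length h."""
    d = len(x)
    volume = h ** d  # Eq. (4)
    count = sum(
        window([(xj - sj) / h for xj, sj in zip(x, s)])  # sample s inside the cube?
        for s in samples
    )
    return count / (len(samples) * volume)


# Hypothetical one-dimensional samples.
samples = [(0.1,), (0.2,), (0.4,), (0.9,), (1.5,)]
print(parzen_estimate((0.3,), samples, h=0.5))
```

With a hard-edged window the estimate jumps whenever a sample enters or leaves the cube, which is one motivation for the smooth kernels discussed next.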
The definitions for $\varphi$ may vary from uniform, triangular, Gaussian and others since the choice in Eq. (5) is not unique. This function must satisfy two conditions, namely $\int \kappa(u)\,du = 1$ and $\kappa(\cdot)$ is symmetric. By considering a Gaussian distribution $\varphi$, Eq. (7) becomes an average of normal densities centered on the samples $x_i$. The probability density function (PDF) is
$$\hat{f}_k(x) = \frac{1}{nh} \sum_{i=1}^{n} K\!\left(\frac{x_i - x}{h}\right) \tag{8}$$
The scalar function $\kappa(\cdot)$ or $\varphi(\cdot)$ is called the Parzen window function or kernel function. The Gaussian kernel is used in the rest of the paper:
$$\varphi(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right) \quad \text{and} \quad h_n = \frac{h_1}{\sqrt{n}}, \quad n \ge 1 \tag{9}$$
In practice, the kernel function chosen is not nearly as important as the kernel size $h$.
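A minimal Python sketch of the Gaussian kernel density estimate of Eq. (8), with the window width shrinking as in Eq. (9), is given below. The names `gaussian_kernel` and `kde`, the default `h1 = 1.0` and the sample values are assumptions for illustration only.

```python
import math


def gaussian_kernel(u):
    """Standard normal density, the kernel of Eq. (9)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kde(x, samples, h1=1.0):
    """Kernel density estimate of Eq. (8) with width h = h1 / sqrt(n)."""
    n = len(samples)
    h = h1 / math.sqrt(n)
    return sum(gaussian_kernel((xi - x) / h) for xi in samples) / (n * h)


# Hypothetical scalar observations.
samples = [0.0, 1.0, 2.0, 4.0]
for x in (0.0, 1.5, 3.0):
    print(x, kde(x, samples))
```

Because each term is a normal density, the estimate integrates to one for any positive width; the width only controls how smooth the result is.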
3.2. Simple Kernel regression
Regression estimation aims at finding a relationship between a dependent variable and a set of independent variables. Kernel regressions are used when we are unwilling to impose a parametric form on the regression equation and there is a lot of data.
Let the scalars $y_i$ be the outputs and $x_i$ the data inputs. Regression equations are specified as
$$y_i = m(x_i) + \varepsilon_i, \tag{10}$$
where $E[\varepsilon_i] = \operatorname{Cov}[m(x_i), \varepsilon_i] = 0$ and $m(\cdot)$ is a possibly non-linear function. The term $\varepsilon_i$ is random with mean zero and variance $\sigma^2$. It defines the variation of $y_i$ around its mean, $m(x_i)$. The mean can be expressed as a function of the probability density $f$:
$$m(x_i) = E[Y_i \mid x_i = x] = \frac{\int y\, f(x, y)\, dy}{\int f(x, y)\, dy} = \frac{\int y\, f(x, y)\, dy}{f(x)} \tag{11}$$
The kernel smoothed density estimator is assumed to be a combination of the Gaussian distribution as a function of $n$ and the kernel bandwidth $h$. The general form for this type of estimator follows
$$\hat{m}(x) = \frac{\sum_{i=1}^{n} K_h(x_i - x)\, y_i}{\sum_{i=1}^{n} K_h(x_i - x)} \tag{12}$$
where
$$K_h(x_i - x) = \frac{1}{h\sqrt{2\pi}} \exp\!\left(-\frac{(x_i - x)^2}{2h^2}\right), \qquad K(u) = \frac{1}{\sqrt{2\pi}} \exp\!\left(-\frac{u^2}{2}\right)$$
The term
$$\frac{K_h(x_i - x)}{\sum_{j=1}^{n} K_h(x_j - x)}$$
is the weight given to observation $i$. The denominator makes the weights sum to unity. As an example, the weight function could give equal weight to the $k$ values of $x_i$ that are closest to $x$ and zero weight to all other observations. The $N(0, h^2)$ PDF is commonly used. The choice of $h$ allows us to easily vary the relative weights of different observations. This weighting function is positive, so all observations get a positive weight. The weights are largest for observations near $x$ and then taper off in a bell-shaped way. A low value of $h$ means that the weights taper off fast; the weight function is then a normal PDF with a low variance.
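The weighting scheme described above can be sketched in Python as follows. The function names and the temperature and load values are illustrative assumptions. The constant factor of the Gaussian kernel is omitted because it cancels between numerator and denominator of Eq. (12).

```python
import math


def kernel_weights(x, xs, h):
    """Normalized Gaussian weights of the observations xs for a query point x."""
    raw = [math.exp(-((xi - x) ** 2) / (2.0 * h * h)) for xi in xs]
    total = sum(raw)  # note: can underflow to zero if h is tiny and x is far from all xs
    return [r / total for r in raw]


def kernel_regression(x, xs, ys, h):
    """Kernel regression estimate of Eq. (12): weighted average of the outputs ys."""
    return sum(w * y for w, y in zip(kernel_weights(x, xs, h), ys))


# Hypothetical history: outside temperature (deg C) and load (p.u.).
temps = [-10.0, -5.0, 0.0, 5.0]
loads = [0.95, 0.80, 0.65, 0.50]
print(kernel_weights(-4.0, temps, h=2.0))
print(kernel_regression(-4.0, temps, loads, h=2.0))
```

Lowering `h` concentrates nearly all the weight on the closest record, while a large `h` pushes the estimate toward the plain average of the outputs.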
3.3. Practical regression
In practice we have to estimate $\hat{m}(x)$ at a finite number of points $x$. The load forecast $y_t$ depends on many variables $(x_t, z_t, l_t, \ldots, h_t)$ and the general formulation of a local averaging estimator uses the multivariate kernel regression:
$$\hat{m}(x) = \frac{\sum_{i=1}^{n} K_{h_x}(x_i - x)\, K_{h_z}(z_i - z)\, K_{h_l}(l_i - l)\, y_i}{\sum_{i=1}^{n} K_{h_x}(x_i - x)\, K_{h_z}(z_i - z)\, K_{h_l}(l_i - l)} \tag{13}$$
where
$$K_h(x_i - x) = \frac{1}{h\sqrt{2\pi}} \exp\!\left(-\frac{(x_i - x)^2}{2h^2}\right) \tag{14}$$
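A hedged Python sketch of the product-kernel estimator of Eq. (13) follows. Each explanatory variable gets its own bandwidth, and the weight of a historical record is the product of the one-dimensional Gaussian kernels. The function name, the choice of temperature and hour as the variables, and all numbers are assumptions made for illustration.

```python
import math


def product_kernel_regression(query, inputs, outputs, bandwidths):
    """Multivariate kernel regression with one Gaussian kernel per input variable."""
    num = 0.0
    den = 0.0
    for row, y in zip(inputs, outputs):
        weight = 1.0
        for q, v, h in zip(query, row, bandwidths):
            weight *= math.exp(-((v - q) ** 2) / (2.0 * h * h))  # constants cancel in the ratio
        num += weight * y
        den += weight
    return num / den


# Hypothetical records: (temperature in deg C, hour of day) -> load in p.u.
history = [(-10.0, 8.0), (-10.0, 20.0), (5.0, 8.0), (5.0, 20.0)]
loads = [0.90, 0.80, 0.50, 0.40]
print(product_kernel_regression((-8.0, 9.0), history, loads, bandwidths=(3.0, 2.0)))
```

A record must be close to the query in every variable to receive a large weight, which is how the estimator picks out past hours with similar weather and time of day.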
Optimization estimators, on the other hand, are more amenable to incorporating additional structure.
As a prelude to our later discussion, consider the following estimator. Given
$$\hat{y}_i = \hat{m}(x_i;\, y_i(h), \ldots, l_i) + \varepsilon \tag{15}$$
Then we solve:
$$\min_{h} \frac{1}{T} \sum_{i} \left( y_i(h) - \hat{y}_i \right)^2 \tag{16}$$
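One common way to carry out a minimization of the kind in Eq. (16) is a grid search over candidate bandwidths scored by leave-one-out cross-validation. The Python sketch below takes that route as an assumption; it is not claimed to be the exact procedure used by the authors, and the function names, candidate grid and synthetic data are hypothetical.

```python
import math


def nw_estimate(x, xs, ys, h):
    """Gaussian kernel regression estimate at x from observations (xs, ys)."""
    weights = [math.exp(-((xi - x) ** 2) / (2.0 * h * h)) for xi in xs]
    total = sum(weights)
    if total == 0.0:  # every weight underflowed: fall back to the plain mean
        return sum(ys) / len(ys)
    return sum(w * y for w, y in zip(weights, ys)) / total


def loo_mse(xs, ys, h):
    """Mean squared leave-one-out prediction error for bandwidth h."""
    total = 0.0
    for i in range(len(xs)):
        rest_x = xs[:i] + xs[i + 1:]
        rest_y = ys[:i] + ys[i + 1:]
        total += (ys[i] - nw_estimate(xs[i], rest_x, rest_y, h)) ** 2
    return total / len(xs)


def select_bandwidth(xs, ys, candidates):
    """Return the candidate bandwidth with the smallest leave-one-out error."""
    return min(candidates, key=lambda h: loo_mse(xs, ys, h))


# Synthetic example: a smooth daily-like pattern sampled every half hour.
hours = [0.5 * i for i in range(21)]
loads = [0.7 + 0.2 * math.sin(t) for t in hours]
print(select_bandwidth(hours, loads, [0.1, 0.3, 1.0, 3.0, 10.0]))
```

Leaving each point out of its own prediction prevents the trivial solution of an arbitrarily small bandwidth that reproduces the training data exactly.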
3.4. Confidence intervals and bands
Define the confidence intervals $D(k)$ of the estimated load $P_t$ as follows: $D = [L, U]$, where $L$ is the lower band and $U$ the upper band of the estimator,
$$D(k) = \{\, p_{avr} - C\,\mathrm{std}(pe),\; p_{avr} + C\,\mathrm{std}(pe) \,\}$$
where $p_{avr}$ is the mean value for the estimated load and $C = x_{1-\alpha/2}$; $x_{1-\alpha/2}$ is the $(1 - \alpha/2)$ quantile of the standard Gaussian distribution with the probability equal to $p = 1 - \alpha$.
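The band defined above can be computed with the Python standard library, as in the sketch below. Reading `pe` as a set of past prediction errors is an assumption, as are the function name and the sample numbers; the Gaussian quantile comes from `statistics.NormalDist.inv_cdf`.

```python
from statistics import NormalDist, mean, stdev


def confidence_band(estimates, errors, alpha=0.05):
    """Return (lower, upper) = mean(estimates) -/+ C * std(errors).

    C is the (1 - alpha/2) quantile of the standard Gaussian distribution.
    Assumption: 'errors' holds past prediction errors of the estimator.
    """
    c = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    spread = c * stdev(errors)  # sample standard deviation
    centre = mean(estimates)
    return centre - spread, centre + spread


# Hypothetical values in p.u.
lower, upper = confidence_band([0.4, 0.5, 0.6], [-0.02, 0.0, 0.02], alpha=0.05)
print(lower, upper)
```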
4. Smoothing parameters for feeder load profile forecasting
Electric load aggregation is the process by which large
commercial, residential and industrial loads are combined
to form an aggregated load with homogeneous and non-
homogeneous loads.
The main objective of this section is to obtain the prob-
ability forecast functions of the aggregated load. The
method consists of two steps. First, smoothing techniques
based on kernel estimates are applied to derive non-para-
metric estimators. The aggregated load is the sum of the
individual category forecasts. Then, parameter smoothing
is applied to form the aggregated load. Parameter smooth-
ing is modelled as being unknown but bounded in ampli-
tude by a closed convex set.
At any given instant, the aggregated smoothing parameter $h_u^{agr}$ is equal to the weighted sum of the category parameters:
$$h_u^{agr}(t) = \sum_{i=1}^{C} k_i\, h_{u,i}(t) \tag{17}$$
subject to
$$\sum_{i=1}^{C} k_i = 1 \tag{18}$$
where $u = 1, \ldots, m$, with $m$ the total number of parameters included in the load model. Here the parameters are weather and time for a weekday load. This added constraint requires that the aggregated parameters lie within the smallest convex set containing the points formed by the category load parameters. The problem is then cast in the form of a constrained optimization problem:
$$\min_{h} \frac{1}{T} \sum_{t} \left( y_t(h) - y_t \right)^2 \quad \text{subject to} \quad \sum_{i=1}^{C} k_i = 1, \quad k_i \ge 0 \tag{19}$$
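To make the constrained problem of Eq. (19) concrete, the Python sketch below enumerates weight vectors on a regular grid of the simplex (non-negative weights summing to one), forms the aggregated smoothing parameter of Eq. (17) and keeps the combination with the lowest objective. The grid search, the function names and the toy objective are assumptions chosen for brevity; the paper does not state which solver was used, and in practice the objective would be the mean squared forecast error.

```python
def simplex_grid(n_categories, steps):
    """Yield weight tuples k with k_i >= 0 and sum(k) = 1 on a regular grid."""
    def compositions(remaining, slots):
        if slots == 1:
            yield (remaining,)
            return
        for first in range(remaining + 1):
            for rest in compositions(remaining - first, slots - 1):
                yield (first,) + rest

    for counts in compositions(steps, n_categories):
        yield tuple(c / steps for c in counts)


def best_convex_weights(category_bandwidths, objective, steps=20):
    """Return (score, weights, h_agr) minimizing objective(h_agr) over the grid."""
    best = None
    for k in simplex_grid(len(category_bandwidths), steps):
        h_agr = sum(ki * hi for ki, hi in zip(k, category_bandwidths))  # Eq. (17)
        score = objective(h_agr)
        if best is None or score < best[0]:
            best = (score, k, h_agr)
    return best


# Toy objective standing in for the forecast error: best aggregated bandwidth is 2.0.
print(best_convex_weights([1.0, 3.0], lambda h: (h - 2.0) ** 2, steps=4))
```

The grid grows quickly with the number of categories, so a dedicated constrained optimizer would be preferable beyond a handful of load classes.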
5. Illustration and comparison of methods
This section is a scoping study for the methodology based upon a data set from customers in the Montreal region of the Hydro-Québec network. In this study, the electricity consumption of five load categories sampled at 60 min intervals was recorded over a one-month period. The recorded historical data are used to compute the parameters of each forecasting technique in the paper. The historical load, the hour and the corresponding temperature are used for the modeling procedure.
[Fig. 1. Load forecast over a 24 h period for a residential load. Plot titled "Forecasting for the residential load": load in p.u. versus time in hours; curves for the non-parametric method, the measured load and the thermostat method.]
[Fig. 2. Forecasting error for the residential load. Plot titled "Relative error over a 24 hours period for a residential load": relative error versus time in hours; curves for the thermostat method and the non-parametric method.]
[Fig. 3. Forecast over a 24 h period for a commercial load. Plot titled "Load forecast over a 24 hours period for a commercial load": load in p.u. versus time in hours; curves for the time series method, the measured load and the non-parametric method.]
Fig. 1 shows the forecasts for a residential load and the recorded values. The non-parametric approach yields forecasts very close to the measured load over a 24 h period. The thermostat-based method does not yield the same accuracy; the forecasting error with these two methods is illustrated in Fig. 2. The non-parametric technique has an average error of 0.01 p.u. while the forecasting error using the thermostat approach varies between 0.04 and 0.05 p.u. From Fig. 2, we compute MAPE = 1.5% for the non-parametric model, and MAPE = 2.2% with the thermostat model.
The second data set represents a commercial load (office load). Fig. 3 illustrates the forecasted load using long proven time series [14] and non-parametric techniques. Both techniques give satisfactory results for very short-term forecasts up to about 7 h ahead. However, for longer horizons, the non-parametric approach has a better tracking performance than the time series approach. The average error is illustrated in Fig. 4. From this figure we compute MAPE = 0.87% for the non-parametric model, and MAPE = 3.04% with time series. Time series models may be made more accurate, but this requires large amounts of good-quality historical data.
Finally, Fig. 5 shows the normalized error between the aggregated commercial load and the sum of the individual category forecasts. In this case, MAPE = 2.69%, higher than without decomposition-aggregation.
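The MAPE figures quoted in this section are mean absolute percentage errors. The paper does not spell out the formula, so the Python sketch below assumes the standard definition (absolute error divided by the measured value, averaged and expressed in percent); the function name and the sample numbers are illustrative.

```python
def mape(measured, forecast):
    """Mean absolute percentage error, in percent (standard definition assumed)."""
    if len(measured) != len(forecast) or not measured:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("measured values must be non-zero")
    return 100.0 * sum(abs((m - f) / m) for m, f in zip(measured, forecast)) / len(measured)


# Hypothetical hourly loads in p.u.
measured = [0.40, 0.42, 0.45, 0.43]
forecast = [0.41, 0.42, 0.44, 0.44]
print(mape(measured, forecast))
```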
6. Conclusion
The proposed non-parametric forecasting methods exhibit very good performance with respect to the well-known thermostat-based techniques. These non-parametric algorithms give results that are more consistent. Their performance is also very competitive even with linear time series. They are automatic procedures that do not need any prior information. In short, the method searches a collection of historical observations for records similar to the current conditions and uses these to estimate the future state of the system. The method is simple, has a sound theoretical basis and provides the best forecast in the sense of a minimum expected squared error. It requires few parameters, which can be easily calculated from historical data by applying the cross validation technique.
There are drawbacks to non-parametric estimation. Whereas parametric models compress all training data into a set of equations through the process of parameter fitting, non-parametric regression retains the data and searches through them for similar past cases each time a forecast is made. If the number of variables is very large, the non-parametric methods become inefficient. This was not a problem for the load forecasting problem, but may become one in very short-term estimation.
References
[1] Haida T, Muto S. Regression based peak load forecasting using transformation technique. IEEE Trans Power Syst 1994;9(4):1788-94.
[2] Ramanathan R, Enge R, Granger CW, Vahid-Araghi F, Brace C. Short-term forecasts of electricity loads and peaks. Int J Forecasting 1997;13:161-74.
[3] Dillon TS, Sestito S, Leung S. An adaptive neural network approach in load forecasting in a power system. In: Proceedings of the first international forum on applications of neural networks to power systems; 1991. p. 17-21.
[4] Peng TM, Hubele NF, Karady G. An adaptive neural network approach to 1 week ahead load forecasting. IEEE Trans Power Syst 1993;8(3):1195-203.
[5] Kodogiannis VS, Anagnostakis EM. A study of advanced learning algorithms for short-term load forecasting. Eng Artif Intell 1999;12:159-73.
[Fig. 4. Forecasting error for the commercial load. Plot titled "Error for the commercial loads": relative error versus time in hours over 24 h; curves for the non-parametric method and the time series method.]
[Fig. 5. Error between the aggregated commercial load and the sum of category loads. Plot titled "Difference between aggregate commercial load and the sum of category loads": vertical axis labelled "Error in p.u." versus time in hours over 100 h; curves for the measured aggregated load and the estimated aggregated load.]
[6] Villalba SA, Bel CA. Hybrid demand model for load estimation and short-term load forecasting in distribution electric systems. IEEE Trans Power Deliver 2000;15(2):764-9.
[7] Liang RH, Cheng CC. Short-term load forecasting by a neuro-fuzzy based approach. Electr Power Energ Syst 2002;24:103-11.
[8] Hippert HS, Pereira CE, Souza RC. Neural networks for short-term load forecasting: a review and evaluation. IEEE Trans Power Syst 2001;16(1):44-55.
[9] Dillon TS, Morsztyn K, Phula K. Short-term load forecasting using adaptive pattern recognition and self organizing techniques. In: Proceedings of the 5th power system computational conference, Cambridge, September; 1975.
[10] Abdel-Aal RE. Short-term hourly load forecasting using abductive networks. IEEE Trans Power Syst 2004;19(1):164-73; Charytoniuk W, Chen SM, Olinda V. Non-parametric regression based short-term load forecasting. IEEE Trans Power Syst 1998;13(3):725-30.
[11] Parzen E. On estimation of a probability density function and mode. Ann Math Stat 1962;33:1065-76.
[12] Izenman AJ. Recent developments in non-parametric density estimation. J Am Stat Assoc 1991;86:205-24.
[13] Duda RO, Hart PE, Stork DG. Pattern classification; 2001.
[14] Mu G, Chen YH, Liu ZF, Fan WD. Studies on the forecasting errors of the short-term load forecast power system technology. In: Proceedings of the power con., vol. 1; 2001. p. 636-640.
