
Paper accepted for presentation at the 2003 IEEE Bologna Power Tech Conference, June 23rd-26th, Bologna, Italy

A Fast Electric Load Forecasting Using
Adaptive Neural Networks

M. L. M. Lopes, A. D. P. Lotufo, Member, IEEE, and C. R. Minussi

Abstract: This work presents a procedure for electric load forecasting based on adaptive multilayer feedforward neural networks trained by the Backpropagation (BP) algorithm. The neural network architecture is formulated by two parameters, the scaling and translation of the postsynaptic functions at each node, which are adjusted in an iterative way by the gradient descent method. Besides, the neural network also uses an adaptive process based on fuzzy logic to adjust the network training rate. This methodology provides an efficient modification of the neural network that results in faster convergence and more precise results in comparison to the conventional Backpropagation formulation. The adaptation of the training rate is carried out using the information of the global error and of the global error variation. After finishing the training, the neural network is capable of forecasting the electric load 24 hours ahead. To illustrate the proposed methodology, data from a Brazilian Electric Company are used.

Index Terms: Adaptive Parameters, Backpropagation Algorithm, Electrical Load Forecasting, Fuzzy Logic, Fuzzy Controller, Neural Networks, Postsynaptic Function.

I. INTRODUCTION
Expansion Planning, Load Flow, Economic Operation, Security Analysis and Control of Electric Energy Systems are some of the studies that effectively depend on the previous behavior of the load profile [3], i.e., on forecasting future information of a time series based on past values. In the technical literature many methods for load forecasting can be found, such as simple or multiple linear regression, exponential smoothing, state estimation, Kalman filtering, and the ARIMA models of Box and Jenkins [1]. All these methods require a previous load model before they can be applied. To model the load it is necessary to know information such as cloudy days, wind speed, sudden temperature variations and the effects of non-conventional days (holidays, strikes, etc.). After modeling the load using this information, the algorithm is initialized to obtain the results. The use of neural networks is, nowadays, a very efficient approach, mainly for the special days case.
The objective of this work is to develop a methodology for electric load forecasting using an ANN (Artificial Neural Network) [4], [11] trained by the BP (Backpropagation) algorithm [10]. The BP algorithm is considered, in the specialized literature, a benchmark in precision; however, its convergence is quite slow. This work is divided into two steps: 1) a new formulation of two parameters, the scaling and translation of the postsynaptic functions, is introduced at each neuron and adjusted in an iterative way using the gradient descent method [8]; and 2) the training rate is adjusted during the convergence process to reduce the execution time. The adjustment is carried out by a fuzzy controller. A decaying exponential function is also used to give priority to the regulator actuation in the initial training period and to avoid instability in the convergence process. These two implementations are mechanisms that reduce the convergence time and improve the precision of the results.
M. L. M. Lopes is a Ph.D. student at UNESP, Ilha Solteira, SP, Brazil (e-mail: mara@dee.feis.unesp.br).
A. D. P. Lotufo is with UNESP, Ilha Solteira, SP, Brazil (e-mail: annadiva@dee.feis.unesp.br).
C. R. Minussi is with UNESP, Ilha Solteira, SP, Brazil (e-mail: minussi@dee.feis.unesp.br).


II. NEURAL NETWORK STRUCTURE

The i-th output element (neuron) [11] computes a linear combination of the element inputs x_j that are connected to the element i by the weights w_ij:

$$s_i = \sum_j w_{ij}\, x_j \qquad (1)$$

Each element can have a bias w_{i0} fed by an extra constant input x_0 = +1. The linear output s_i is finally converted by a nonlinear function such as a sigmoid or relay [4], [11]. The relay functions are appropriate for binary systems, while the sigmoid functions can be employed for both continuous and binary systems.
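As a concrete illustration of (1) together with the scaled and translated sigmoid introduced later in (4), the following minimal Python sketch (ours, not from the paper; the names lam and beta stand for the scaling and translation constants) computes the output of a single neuron:

```python
import numpy as np

def neuron_output(x, w, lam=1.0, beta=0.0):
    """Forward pass of one neuron: linear combination (1) followed by a
    scaled/translated sigmoid, y = 1 / (1 + exp(-(lam*s + beta)))."""
    s = np.dot(w, x)                      # s_i = sum_j w_ij * x_j
    return 1.0 / (1.0 + np.exp(-(lam * s + beta)))

# example with 3 inputs plus the constant bias input x_0 = +1
x = np.array([1.0, 0.2, -0.5, 0.7])       # x[0] is the bias input
w = np.array([0.1, 0.4, -0.3, 0.2])       # w[0] is the bias weight
print(neuron_output(x, w, lam=0.3, beta=0.0))
```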


III. NEURAL NETWORK TRAINING

The BP training is initialized by presenting a pattern X to the network, which produces an output Y. Following this, an error is calculated at each output (the difference between the desired value and the output). The next step is to determine the error back-propagated by the network, associated with the partial derivative of the quadratic error of each element with respect to the weights, and finally to adjust the weights of each element. Then a new pattern is presented, and the process is repeated until total convergence is reached (|error| <= arbitrated tolerance). The initial weights are usually adopted as random numbers [11]. The BP algorithm consists of adapting the weights such that the network quadratic error is minimized. The sum of the instantaneous quadratic errors of the neurons of the last layer (network output) is given by:

$$\varepsilon = \sum_{i=1}^{no} (d_i - y_i)^2 \qquad (2)$$

where:
d_i = desired output of the i-th element of the last layer;
y_i = output of the i-th element of the last layer;
no = number of neurons of the last layer.

A. Weight Adaptive Process

Considering the i-th network neuron and using the gradient descent method, the weight adjustments are formulated by [4], [11]:

$$V_i(r+1) = V_i(r) + \Delta_i(r) \qquad (3)$$

where:
Δ_i(r) = -γ [∇ε_i(r)];
γ = stability control parameter or training rate;
∇ε_i(r) = gradient of the quadratic error with respect to the weights of neuron i;
V_i = vector containing the weights of neuron i = [w_i1 w_i2 ... w_ini]^T.

The direction adopted in (3) to minimize the objective function (the quadratic error) is opposite to the gradient direction, and the parameter γ determines the length of the vector Δ_i(r). Considering that this work deals with load forecasting (the values are always positive), the nonlinear function to be used is the sigmoid function, defined by [4], [11] (varying between 0 and +1):

$$y_i = \frac{1}{1 + \exp[-(\lambda\, s_i + \beta)]} \qquad (4)$$

where:
λ = constant that determines the slope of the function y_i;
β = constant that determines the translation of the function y_i.

The gradient ∇ε_i(r) is represented by:

$$\nabla \varepsilon_i(r) = \frac{\partial \varepsilon}{\partial V_i(r)} = \frac{\partial \varepsilon}{\partial y_i}\,\frac{\partial y_i}{\partial s_i}\,\frac{\partial s_i}{\partial V_i(r)} \qquad (5)$$

with

$$\frac{\partial s_i}{\partial V_i(r)} = X \qquad (6)$$

where:
X = pattern vector = [x_i1 x_i2 ... x_ini]^T.

This way, using the gradient descent method, the following schema is obtained for adapting the weights [5], [6], [11]:

$$V_i(r+1) = V_i(r) + 2\,\gamma\,\delta_i\,\sigma_i\,X \qquad (7)$$

where:
σ_i = ∂y_i/∂s_i = λ y_i (1 - y_i), the derivative of the sigmoid function (4).

If the i-th element is in the last layer, then:

$$\delta_i = d_i - y_i \qquad (8)$$

If the i-th element is in the other layers, then:

$$\delta_i = \sum_{k \in Q(i)} \delta_k\,\sigma_k\,w_{ki} \qquad (9)$$

where:
Q(i) = set of the indices of the elements that are in the layer following the layer of the i-th element and are interconnected to the i-th element.

The parameter γ, used as a stability control of the iterative process, depends on λ [6]. The network weights are randomly initialized in the interval {0, 1}. For convenience, the parameter γ (training rate) is redefined as follows [5]:

$$\gamma = \gamma_0 / \lambda \qquad (10)$$

Replacing (10) in (7) cancels the amplitude dependency of σ_i with respect to λ, so the amplitude is kept constant for every λ. This alternative is important considering that λ then acts only on the left and right tails of σ_i. Then, (7) is written as follows [5]:

$$V_i(r+1) = V_i(r) + \{2\,\gamma_0\,\delta_i\,\sigma_i / \lambda\}\,X$$

The BP algorithm is considered in the technical literature a benchmark in precision, although its convergence is very slow. Thus, this work proposes to adjust the training rate during the convergence process, with the objective of reducing the training execution time. The adjustment is executed by a procedure based on a fuzzy controller.
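To make the weight-adaptation schema concrete, here is a minimal Python sketch (ours, not the authors' FORTRAN implementation) of one update step of (7)-(10) for a neuron of the last layer; the numerical values are only illustrative:

```python
import numpy as np

def sigmoid(s, lam, beta):
    return 1.0 / (1.0 + np.exp(-(lam * s + beta)))      # eq. (4)

def update_output_weights(V, X, d, lam, beta, gamma0):
    """One weight-adaptation step, eqs. (7)-(10), for a last-layer neuron."""
    s = np.dot(V, X)
    y = sigmoid(s, lam, beta)
    sigma = lam * y * (1.0 - y)        # derivative of (4) w.r.t. s
    delta = d - y                      # eq. (8): last-layer error term
    gamma = gamma0 / lam               # eq. (10): lambda-normalized rate
    return V + 2.0 * gamma * delta * sigma * X   # eq. (7)

V = np.array([0.05, 0.10, -0.20])      # weights
X = np.array([1.0, 0.6, -0.3])         # pattern vector
print(update_output_weights(V, X, d=0.8, lam=0.3, beta=0.0, gamma0=0.5))
```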
B. Slope Adaptive Process of the Sigmoid Function

The general form of the postsynaptic functions used to adjust the neural network is given by [8]:

$$y_i = f(W_i, X, \lambda_i, \beta_i) \qquad (11)$$

The scaling and translation parameters have corresponding learning rates, denoted here by γ_λ and γ_β, respectively. A generalized architecture of a neuron with λ and β performing the role of scaling and translation in a multidimensional space is shown in Fig. 1 [8].

Fig. 1. Architecture of the neural network.

The adjustment of the scaling and translation of the postsynaptic functions is developed by the gradient descent method, based on the BP algorithm. Similar to the calculation used for adjusting the weights, the adjustment of the inclination (slope) parameter of the sigmoid function is carried out for the i-th neuron by the gradient descent method. This way, the adjustment of the inclination parameter of the sigmoid function is given by [8]:

$$\Lambda_i(r+1) = \Lambda_i(r) + \Delta\Lambda_i(r) \qquad (12)$$

where:
ΔΛ_i(r) = -γ_λ [∇ε_i(r)];
∇ε_i(r) = gradient of the quadratic error with respect to the slopes of neuron i;
Λ_i = vector containing the slopes of neuron i = [λ_i1 λ_i2 ... λ_ini]^T.

The gradient is obtained by differentiating (2) with respect to the vector Λ_i:

$$\nabla \varepsilon_i(r) = \frac{\partial \varepsilon}{\partial \Lambda_i(r)} = \frac{\partial \varepsilon}{\partial y_i}\,\frac{\partial y_i}{\partial \Lambda_i(r)}$$

Then, the rule that defines the adaptation of the inclination parameter of the sigmoid function is given by the following equation:

$$\Lambda_i(r+1) = \Lambda_i(r) + 2\,\gamma_\lambda\,\delta_i\,\sigma'_i$$

where:
σ'_i = ∂y_i/∂λ = s_i y_i (1 - y_i), the derivative of the sigmoid function (4) with respect to the slope.

If the i-th element is in the last layer, then:

$$\delta_i = d_i - y_i \qquad (13)$$

If the i-th element is in the other layers, then:

$$\delta_i = \sum_{k \in Q(i)} \delta_k\,\sigma_k\,w_{ki} \qquad (14)$$

C. Shift Adaptive Process of the Sigmoid Function

Using the same schema, the adjustment of the sigmoid function translation parameter is formulated by:

$$B_i(r+1) = B_i(r) + \Delta B_i(r) \qquad (15)$$

where:
ΔB_i(r) = -γ_β [∇ε_i(r)];
∇ε_i(r) = gradient of the quadratic error with respect to the shifts of neuron i;
B_i = vector containing the shifts of neuron i = [β_i1 β_i2 ... β_ini]^T.

The gradient is obtained by differentiating (2) with respect to the vector B_i [8]. The following equation then defines the adaptation rule of the translation parameter of the nonlinear function [8]:

$$B_i(r+1) = B_i(r) + 2\,\gamma_\beta\,\delta_i\,\sigma_i$$

where:
σ_i = ∂y_i/∂β = y_i (1 - y_i), the derivative of the sigmoid function (4) with respect to the translation.

If the i-th element is in the last layer, then:

$$\delta_i = d_i - y_i \qquad (16)$$

If the i-th element is in the other layers, then:

$$\delta_i = \sum_{k \in Q(i)} \delta_k\,\sigma_k\,w_{ki} \qquad (17)$$
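Under the same assumptions as the previous sketch (our notation, using the sigmoid form of (4)), the slope and translation updates of Sections III-B and III-C for a last-layer neuron can be written as:

```python
import numpy as np

def sigmoid(s, lam, beta):
    return 1.0 / (1.0 + np.exp(-(lam * s + beta)))      # eq. (4)

def update_slope_and_shift(V, X, d, lam, beta, rate_lam, rate_beta):
    """One adaptation step of the slope (12)-(13) and translation (15)-(16)."""
    s = np.dot(V, X)
    y = sigmoid(s, lam, beta)
    delta = d - y                        # eqs. (13), (16): last-layer error term
    dy_dlam = s * y * (1.0 - y)          # derivative of (4) w.r.t. lambda
    dy_dbeta = y * (1.0 - y)             # derivative of (4) w.r.t. beta
    lam_new = lam + 2.0 * rate_lam * delta * dy_dlam       # slope rule
    beta_new = beta + 2.0 * rate_beta * delta * dy_dbeta   # translation rule
    return lam_new, beta_new

V = np.array([0.05, 0.10, -0.20])
X = np.array([1.0, 0.6, -0.3])
print(update_slope_and_shift(V, X, d=0.8, lam=0.3, beta=0.0,
                             rate_lam=0.1, rate_beta=0.5))
```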

The adaptation rules for the inclination and translation parameters of the network are calculated in an iterative way for every i-th neuron. Therefore, the adaptive neural network always gives an output, and the network is faster than the conventional one. The next step is to introduce the fuzzy controller, whose objective is to control the training rate in order to obtain a better solution.

IV. NEURAL NETWORK WITH FUZZY CONTROLLER

The purpose of the fuzzy controller is to reduce the training execution time through the adjustment of the training rate γ. The basic idea of the methodology consists of determining the system state, defined by the global error ε_g and the global error variation Δε_g, and of building a control structure that leads the error to zero in a reduced number of iterations when compared to the conventional procedures. In this work, the control is formulated using fuzzy logic concepts [9]. Initially, the global error is defined as:

$$\varepsilon_g = \sum_{j=1}^{np} \sum_{i=1}^{no} (d_{ij} - y_{ij})^2 \qquad (18)$$

where:
np = number of the network pattern vectors;
d_{ij}, y_{ij} = desired and obtained outputs of the i-th neuron of the last layer for the j-th pattern.

The global error is calculated at each iteration and the parameter γ is adjusted by an increment determined by fuzzy logic. The system state and the control action are defined as [5]:

$$E_q = [\,\varepsilon_g \;\; \Delta\varepsilon_g\,]_q^T, \qquad u_q = \Delta\gamma_q \qquad (19)$$

where:
q = current iteration index.

For a very large input pattern X, ε_g and Δε_g can saturate. Then, the adaptive control is carried out using an exponentially decreasing function applied to the fuzzy controller response. In this way, the adaptive controller output is given by [5]:

$$G_q = \exp(-\xi\, q)\,\Delta\gamma_q \qquad (20)$$

where:
ξ = an arbitrary positive number;
Δγ_q = change proposed by the fuzzy controller at instant q.

This parameter is used to adjust the network weight set in the subsequent iteration. The process is repeated until the training is concluded. It is a very simple procedure whose control system requires an additional, although small, effort, considering that the controller has only two input variables and one output. This is an improvement of the controller presented in reference [2], which uses the same variables ε_g and Δε_g to execute the control. In this work, however, the following contributions are introduced: 1) an improvement of the BP algorithm; 2) an original fuzzy controller proposal (a new set of rules and the use of an exponentially decreasing function applied to the controller response).
Each state variable must be represented by between 3 and 7 fuzzy sets. The control variable must also be represented by the same number of fuzzy sets. The ε_g variable must be normalized, taking as a scale factor the first global error generated by the network, i.e., the one at q = 0. With this representation, its variation interval lies between 0 and +1. If the adaptation heuristic is well tuned, the convergence process decreases exponentially. The Δε_g variable varies between -1 and +1. If the convergence process is exponentially decreasing, the Δε_g value is always negative. In this case, although the Δε_g range is between -1 and +1, a finer adjustment must be employed in the rule set between -1 and 0; in the other interval (0, +1], the adjustment can be more relaxed.
In the fuzzy controller the rules are codified in the form of a decision table. Each entry represents the fuzzy value of the control action for given values of the global error ε_g and of the global error variation Δε_g. The parameter Δγ must be arbitrated as a function of λ (the sigmoid function slope). The Δλ and Δβ variations also follow the same procedure.
Table I shows the fuzzy rule set, with 30 rules. The number of rules can be increased to improve the network performance during the training.

TABLE I
FUZZY CONTROLLER RULES

The table is a decision matrix whose rows and columns are indexed by the fuzzy values of the global error ε_g and of the global error variation Δε_g; each of the 30 entries gives the corresponding fuzzy value of the control action. The fuzzy values are labeled as follows:

NL = Negative Large;
NS = Negative Small;
ZE = Near to Zero;
PVS = Positive Very Small;
PS = Positive Small;
ME = Medium;
PL = Positive Large;
PVL = Positive Very Large.
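To show how such a controller can be wired together, the sketch below (ours; the membership centres and rule values are hypothetical placeholders, not the 30 published rules of Table I) normalizes the state of (19), looks the correction up in a small decision table and applies the exponential decay of (20):

```python
import numpy as np

# hypothetical fuzzy-set centres for the normalized error (0..1) and its variation (-1..1)
ERR_LEVELS  = {"ZE": 0.0, "PS": 0.25, "ME": 0.5, "PL": 0.75, "PVL": 1.0}
DERR_LEVELS = {"NL": -1.0, "NS": -0.5, "ZE": 0.0, "PS": 0.5, "PL": 1.0}
# hypothetical decision table: (error label, variation label) -> training-rate change
RULES = {(e, de): 0.2 * ERR_LEVELS[e] - 0.1 * DERR_LEVELS[de]
         for e in ERR_LEVELS for de in DERR_LEVELS}

def nearest(centres, value):
    """Crisp stand-in for fuzzification: pick the closest set centre."""
    return min(centres, key=lambda k: abs(centres[k] - value))

def rate_correction(err, derr, err0, q, xi=0.5):
    """State of (19) -> table lookup -> exponential decay of (20)."""
    e_lbl = nearest(ERR_LEVELS, err / err0)   # error normalized by the first global error
    de_lbl = nearest(DERR_LEVELS, derr)
    return np.exp(-xi * q) * RULES[(e_lbl, de_lbl)]

print(rate_correction(err=0.4, derr=-0.1, err0=1.0, q=3))
```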

To analyze the performance of the developed methodology, gains are defined considering the number of cycles and the time necessary to carry out the training, respectively, in the following way:

$$GC = NBP / NFBP \qquad (21)$$
$$GT = TBP / TFBP \qquad (22)$$

where:
NBP = number of cycles of the conventional BP;
NFBP = number of cycles of the adaptive BP-FC;
TBP = execution (processing) time of the conventional BP (s);
TFBP = execution time of the adaptive BP-FC (s).
V. LOAD FORECASTING

Short term load forecasting (daily forecasting) is executed as follows: a recurrence is implemented in which the output at a given instant is used as an input at the subsequent instant. The hourly historical data in a predefined interval, e.g., a month, are considered. The network input for a given hour h is defined by the load values extracted from the historical data at four instants (the current value and the values one, two and three hours before), temperature, etc., and the data referred to time (month, day of the week, holiday, hour, etc.). The network output corresponds to the load value at hour (h+1). The input/output set is built following this strategy until the whole time series interval is covered. This scheme can be modified to improve the results by introducing other variables (hazy day, etc.). Then, the input and output vectors are respectively defined as follows [5]:

$$X(h) = [\,t \;\; L(h-3) \;\; L(h-2) \;\; L(h-1) \;\; L(h)\,]^T \in \mathbb{R}^m \qquad (23)$$
$$Y(h) = [\,L(h+1)\,] \qquad (24)$$

where:
m = dimension of the vector X;
L(h-p) = load value p hours before the current hour h;
L(h+1) = electric load value corresponding to the hour subsequent to the current hour h;
t = time vector referred to the historical data (month, day of the week, holiday, hour, etc.), represented in a way similar to the binary code (-1, +1).
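The following Python sketch (ours; the exact layout of the time vector t is not specified in the paper, so the 8-bit encoding below is an assumption) shows how training pairs of the form (23)-(24) can be assembled from an hourly load series:

```python
import numpy as np

def time_vector(weekday, hour, holiday):
    """Illustrative (-1, +1) time code t with 8 components: 3 weekday bits,
    4 hour bits (truncated) and a holiday flag. The paper only states that t
    has 8 binary (-1, +1) components."""
    bits = [(weekday >> b) & 1 for b in range(3)] + \
           [(hour >> b) & 1 for b in range(4)] + [int(holiday)]
    return np.array([1.0 if b else -1.0 for b in bits])

def make_patterns(load, weekday, holiday):
    """Build (X(h), Y(h)) pairs as in (23)-(24) from an hourly load series."""
    X, Y = [], []
    for h in range(3, len(load) - 1):
        t = time_vector(weekday[h], h % 24, holiday[h])
        X.append(np.concatenate([t, load[h-3:h+1]]))   # t plus L(h-3), ..., L(h)
        Y.append(load[h + 1])                          # L(h+1)
    return np.array(X), np.array(Y)

# tiny synthetic example: 48 hours of load with a daily cycle
load = 3000 + 300 * np.sin(np.arange(48) * 2 * np.pi / 24)
weekday = np.repeat([2, 3], 24)
holiday = np.zeros(48, dtype=int)
X, Y = make_patterns(load, weekday, holiday)
print(X.shape, Y.shape)    # (44, 12) (44,)  -> m = 12 components per input
```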

Choosing this (-1, +1) binary representation is preferable to the (0, +1) representation, considering that a network input component equal to 0 does not modify the weights. In this way the (-1, +1) representation gives a faster convergence and is, consequently, more efficient. The electric loads L(h-3), ..., L(h) represent the feedback link, with a delay in the output; therefore, this network is a recurrent one. In the specific case of the electric load forecasting problem, a relatively small sigmoid function slope (lower than 1) is used. This permits a less restrictive choice of the network weights [6] when compared to the values usually adopted in the bibliography, which reduces the possibility of occurrence of paralysis and increases the speed of the BP algorithm convergence [6]. The training data used in this work are (for each vector): the time data (month, day of the week, whether it is a holiday or not, and hour of the day), the current hour load, and the load values of the three previous hours. The future load (one hour in advance) is the network output. The temperature data are not considered because the electricity company does not provide them; however, they could be employed without any problem. Considering the binary representation, the vector t has dimension 8, which, together with the load data, completes 12 components (m = 12). The historical data are electric loads obtained from a Brazilian Electricity Company. These data contain the hourly loads of the year 1998, including non-typical days (holidays), special days (Saturdays and Sundays) and days from typical weeks. Taking into account previous experience, the historical data between July 8, 1998 and July 28, 1998 are considered. Therefore, there are 504 input/output vectors. Table II shows the principal parameters of the neural network and of the training.

TABLE II
NEURAL NETWORK SPECIFICATION

Item | Value
Number of pattern vectors | 504
Number of layers | 3
Number of neurons per layer | 12-30-1
Tolerance (%) | 4
Training rate | 5.5
Training rate of the sigmoid function slope | 0.1
Training rate of the sigmoid function translation | 0.5
Initial slope of the sigmoid functions | 0.3
Initial translation of the sigmoid functions | 0.0
Parameter ξ | 0.4281

Fig. 2 shows the load forecasting results (by the conventional BP and by the adaptive BP-FC, the adaptive BP with fuzzy controller) for a specific day, July 29, 1998. For a precision analysis, the mean absolute percentage error (MAPE) [7] and the maximum error of the daily forecasting are defined, comparing the real load values with the values estimated by the neural network, in the following way:

$$MAPE\,(\%) = \frac{1}{N} \sum_{h=1}^{N} \frac{|L(h) - \hat{L}(h)|}{L(h)} \times 100 \qquad (25)$$

$$Maximum\;error\,(\%) = \max_h \left\{ \frac{|L(h) - \hat{L}(h)|}{L(h)} \right\} \times 100 \qquad (26)$$

where:
L(h) = actual load value at hour h;
L̂(h) = estimated load value at hour h;
N = total number of hours.

Fig. 2. Load forecasting results for July 29, 1998: load (MVA) versus time (h), comparing the actual load with the adaptive BP-FC and conventional BP forecasts.
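Equations (25) and (26) translate directly into code; a minimal sketch, assuming actual and forecast are arrays of hourly loads for the forecast day:

```python
import numpy as np

def mape(actual, forecast):
    """Mean absolute percentage error, eq. (25)."""
    return 100.0 * np.mean(np.abs(actual - forecast) / actual)

def max_error(actual, forecast):
    """Maximum percentage error of the daily forecast, eq. (26)."""
    return 100.0 * np.max(np.abs(actual - forecast) / actual)

actual   = np.array([3100.0, 3240.0, 3380.0, 3290.0])   # illustrative values (MVA)
forecast = np.array([3060.0, 3275.0, 3350.0, 3330.0])
print(mape(actual, forecast), max_error(actual, forecast))
```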

The program was written in FORTRAN and processed on a Pentium 4 (1.7 GHz, 256 MB of RAM). The processing time refers only to the BP algorithm execution, excluding the data reading and output operations. Table III shows the comparative results.
TABLE III
COMPARATIVE RESULTS

Item | Conventional BP | Adaptive BP-FC
Number of cycles | 66,986 | 1,878
Processing time (s) | 1165.98 | 33.16
Gain GC | - | 35.67
Gain GT | - | 35.16
MAPE (%) | 1.80 | 0.97
Maximum error (%) | 6.20 | 2.98
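As a consistency check, the gains defined in (21) and (22) follow directly from the cycle counts and execution times reported in Table III: GC = 66,986 / 1,878 ≈ 35.67 and GT = 1165.98 s / 33.16 s ≈ 35.16, matching the tabulated gain values.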

VI. CONCLUSION

A methodology for electric load forecasting by a neural network, trained by a BP algorithm based on a fuzzy controller, has been developed. The network also has an adaptation system for the translation and inclination parameters of the sigmoid function of each network element, which, in principle, always provides a solution. The presented results consider the historical data of a Brazilian Electricity Company. The short term load forecasting is executed considering 24 hours in advance. It is verified, in this example, that the proposed formulation reduces the number of training cycles and the processing time when compared to the conventional BP algorithm. The observed mean absolute percentage error (MAPE) and the maximum error of the daily forecasting are 0.97% and 2.98%, respectively. Therefore, this network has an effective adaptation mechanism. Summing up, the formulation presented in this work provides:
1) training rate adaptation, using a fuzzy controller that, besides increasing the training speed, improves the prediction precision;
2) adaptation of the inclination and translation parameters of the sigmoid function that, besides increasing the training speed, helps to find a feasible solution for the load forecasting problem.

VII. ACKNOWLEDGMENT

The authors would like to acknowledge the financial support of FAPESP (Fundação de Amparo à Pesquisa do Estado de São Paulo), Brazil (Proc. No. 00/15120-1).

VIII. REFERENCES

[1] C. Almeida, P. A. Fishwick, and Z. Tang, "Time series forecasting using neural networks vs. Box-Jenkins methodology," Simulation Councils Inc., pp. 303-310, November 1991.
[2] P. Arabshahi, J. J. Choi, R. J. Marks II, and T. P. Caudell, "Fuzzy parameter adaptation in optimization," IEEE Computational Science & Engineering, pp. 57-65, Spring 1996.
[3] T. M. O'Donovan, Short Term Forecasting: An Introduction to the Box-Jenkins Approach, New York: John Wiley & Sons, 1983.
[4] S. V. Kartalopoulos, Understanding Neural Networks and Fuzzy Logic, New York: IEEE Press, 1996.
[5] M. L. M. Lopes, C. R. Minussi, and A. D. P. Lotufo, "A fast electric load forecasting using neural networks," 43rd Midwest Symposium on Circuits and Systems, Lansing, Michigan, USA, August 2000.
[6] C. R. Minussi and M. C. G. Silveira, "Electric power system transient stability by neural networks," 38th Midwest Symposium on Circuits and Systems, pp. 1305-1308, 1995.
[7] D. Srinivasan, S. S. Tan, C. S. Chang, and E. K. Chan, "Practical implementation of a hybrid fuzzy neural network for one-day-ahead load forecasting," IEE Proceedings - Generation, Transmission and Distribution, vol. 145, no. 6, pp. 687-692, November 1998.
[8] N. Stamatis, D. Parthimos, and T. M. Griffith, "Forecasting chaotic cardiovascular time series with an adaptive slope multilayer perceptron neural network," IEEE Transactions on Biomedical Engineering, vol. 46, no. 12, pp. 1441-1453, 1999.
[9] T. Terano, K. Asai, and M. Sugeno, Fuzzy Systems Theory and Its Application, Academic Press, 1991.
[10] P. J. Werbos, "Beyond regression: new tools for prediction and analysis in the behavioral sciences," Ph.D. thesis, Harvard University, 1974.
[11] B. Widrow and M. A. Lehr, "30 years of adaptive neural networks: perceptron, madaline, and backpropagation," Proceedings of the IEEE, vol. 78, no. 9, pp. 1415-1442, 1990.

IX. BIOGRAPHIES

Mara Lúcia M. Lopes graduated in Mathematics from UFMS, Três Lagoas, MS, Brazil, in 1997, and received her M.Sc. degree from UNESP, Ilha Solteira, SP, Brazil, in 2000. She is presently a Ph.D. student at UNESP, Ilha Solteira, SP, Brazil, doing research on load forecasting by neural networks. E-mail: mara@dee.feis.unesp.br.

Anna D. P. Lotufo graduated in electrical engineering from UFSM, Santa Maria, RS, Brazil, in 1978, and received her M.Sc. degree from UFSC, Florianópolis, SC, Brazil, in 1982. She is currently an Assistant Professor at UNESP, Ilha Solteira, SP, Brazil, and a Ph.D. student at UNESP, Ilha Solteira, SP, doing research on transient analysis and preventive control of electrical power systems. E-mail: annadiva@dee.feis.unesp.br.

Carlos R. Minussi graduated in electrical engineering from UFSM, Santa Maria, RS, Brazil, in 1978, and received the M.Sc. and Ph.D. degrees from UFSC, Florianópolis, SC, Brazil, in 1981 and 1990, respectively. He is currently an Associate Professor at UNESP, Ilha Solteira, SP, Brazil. His main interests are the analysis and control of power systems and neural networks. E-mail: minussi@dee.feis.unesp.br.
