PERGAMON Neural Networks 11 (1998) 1049–1058
Contributed article
Abstract
Constructive theorems of three-layer artificial neural networks with (1) trigonometric, (2) piecewise linear, and (3) sigmoidal hidden-layer units are proved in this paper. These networks approximate $2\pi$-periodic $p$th-order Lebesgue-integrable functions ($L^p_{2\pi}$) on $\mathbf{R}^m$ to $\mathbf{R}^n$ for $p \ge 1$ with $L^p_{2\pi}$-norm. (In the case of (1), the networks also approximate $2\pi$-periodic continuous functions ($C_{2\pi}$) with $C_{2\pi}$-norm.) These theorems provide explicit equational representations of these approximating networks, specifications for their numbers of hidden-layer units, and explicit formulations of their approximation-error estimations. The function-approximating networks and the estimations of their approximation errors can practically and easily be calculated from the results. The theorems can easily be applied to the approximation of a non-periodic function defined in a bounded set on $\mathbf{R}^m$ to $\mathbf{R}^n$. © 1998 Elsevier Science Ltd. All rights reserved.

Keywords: Three-layer artificial neural network; Function approximation; Approximating network construction; Hidden-layer unit number specification; Approximation-error estimation; Jackson's theorem
… domain. For simplicity, only the approximation to a $2\pi$-periodic function on $\mathbf{R}^m$ is discussed in this paper.

… $n\,\omega_W(f, \delta)$ for $n \in \mathbf{N}$. For any $t \ge 0$, we denote $[t]$ as the largest integer $\le t$. Then $\omega_W(f, t\delta) \le \omega_W(f, (1 + [t])\delta) \le (1 + [t])\,\omega_W(f, \delta) \le (1 + t)\,\omega_W(f, \delta)$. Q.E.D.
(ii) $K_l(t) = 1 + 2B_l \displaystyle\sum_{\substack{\text{combinations of } p \ne q \in \mathbf{N}_0^m \\ 0 \le p_u,\, q_v \le l}} b_{l,p}\, b_{l,q} \cos(p-q)t$.

(iii) $\hat{K}_l(r) = B_l \displaystyle\sum_{\substack{p - q = r,\ p, q \in \mathbf{N}_0^m \\ 0 \le p_u,\, q_v \le l}} b_{l,p}\, b_{l,q}$ $(r \in \mathbf{N}_0^m)$; thus $\hat{K}_l(0) = 1$.

(iv) $\hat{K}_l(1_i) = \cos\dfrac{\pi}{l+2}$.

Proof.

(i):
$$B_l = \Bigl(\sum_{0 \le r \le l} \sin^2\frac{(r+1)\pi}{l+2}\Bigr)^{-m} = 2^m \Bigl(l + 1 - \sum_{0 \le r \le l} \cos\frac{2(r+1)\pi}{l+2}\Bigr)^{-m} = \Bigl(\frac{2}{l+2}\Bigr)^m.$$

(ii):
$$K_l(t) = B_l \sum_{0 \le p_u \le l} b_{l,p}\, e^{ipt} \sum_{0 \le q_v \le l} b_{l,q}\, e^{-iqt} = 1 + 2B_l \sum_{\substack{\text{combinations of } p \ne q \in \mathbf{N}_0^m \\ 0 \le p_u,\, q_v \le l}} b_{l,p}\, b_{l,q} \cos(p-q)t.$$

(iii): $\langle 1, e^{-irt}\rangle = 0$ $(r \ne 0)$, $\langle \cos at, \cos bt\rangle = 1/2$ $(a = \pm b)$, $0$ (otherwise), and $\langle \cos at, \sin bt\rangle = 0$. Hence, it is derived from (ii).

(iv):
$$\cos\frac{\pi}{l+2} \sum_{0 \le q_i \le l} \sin^2\frac{(q_i+1)\pi}{l+2} = \sum_{0 \le q_i \le l-1} \sin\frac{(q_i+1)\pi}{l+2}\,\sin\frac{(q_i+2)\pi}{l+2}$$
from
$$\cos\frac{\pi}{l+2}\,\sin^2\frac{(q_i+1)\pi}{l+2} = \frac{1}{2}\Bigl\{\sin\frac{q_i\pi}{l+2}\,\sin\frac{(q_i+1)\pi}{l+2} + \sin\frac{(q_i+1)\pi}{l+2}\,\sin\frac{(q_i+2)\pi}{l+2}\Bigr\}.$$
Hence, from (iii),
$$\hat{K}_l(1_i) = B_l \sum_{\substack{0 \le q_1, \ldots, q_{i-1}, q_{i+1}, \ldots, q_m \le l \\ 0 \le q_i \le l-1}} b_{l,q}\, b_{l,q+1_i} = \frac{\displaystyle\sum_{0 \le q_i \le l-1} \sin\frac{(q_i+1)\pi}{l+2}\,\sin\frac{(q_i+2)\pi}{l+2}}{\displaystyle\sum_{0 \le q_i \le l} \sin^2\frac{(q_i+1)\pi}{l+2}} = \cos\frac{\pi}{l+2}. \qquad \text{Q.E.D.}$$
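For readers who want to check these identities numerically, the short sketch below verifies (i), (iii) and (iv) in the one-dimensional case ($m = 1$), assuming $b_{l,p} = \sin\{(p+1)\pi/(l+2)\}$ as suggested by the proofs of (i) and (iv); the script itself is an illustration added here, not part of the original paper.

```python
import numpy as np

# Numerical check of Proposition 2 (i), (iii), (iv) for m = 1,
# assuming b_{l,p} = sin((p+1)*pi/(l+2)) as in the proofs of (i) and (iv).
for l in (2, 5, 10, 50):
    p = np.arange(l + 1)
    b = np.sin((p + 1) * np.pi / (l + 2))
    B = 1.0 / np.sum(b ** 2)                     # (i): B_l = (sum of sin^2)^{-1}
    assert np.isclose(B, 2.0 / (l + 2))          # (i): B_l = (2/(l+2))^m with m = 1
    assert np.isclose(B * np.sum(b * b), 1.0)    # (iii): K_hat_l(0) = 1
    K_hat_1 = B * np.sum(b[:-1] * b[1:])         # (iv): K_hat_l(1) = B_l * sum b_p b_{p+1}
    assert np.isclose(K_hat_1, np.cos(np.pi / (l + 2)))
print("Proposition 2 (i), (iii), (iv) hold numerically for m = 1")
```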
4.1.4. Approximation by multidimensional trigonometric polynomials

The convolution of the multidimensional Fejér–Korovkin kernel and a target function gives an approximating multidimensional trigonometric polynomial. A multidimensional extension of Jackson's theorem for an approximation by trigonometric polynomials is constructively proved from the convolution.

Theorem 5 (Constructive approximation by multidimensional trigonometric polynomials). Let $f \in W$ and $K_l$ be the $m$-dimensional Fejér–Korovkin kernel. The convolution $K_l * f$ is an $m$-dimensional trigonometric polynomial which approximates $f$ such that
$$K_l * f(x) = \langle f, 1\rangle + 2B_l \sum_{\substack{\text{combinations of } p \ne q \in \mathbf{N}_0^m \\ 0 \le p_u,\, q_v \le l}} b_{l,p}\, b_{l,q}\,\bigl\{\langle f, \cos(p-q)t\rangle \cos(p-q)x + \langle f, \sin(p-q)t\rangle \sin(p-q)x\bigr\}. \qquad (9)$$
The approximation error is estimated by
$$\|f - K_l * f\|_W \le \Bigl(1 + \frac{\pi^2\sqrt{m}}{2}\Bigr)\,\omega_W\bigl(f, (l+2)^{-1}\bigr). \qquad (10)$$
If $f$ satisfies a Lipschitz condition with constant $M$ and exponent $\nu$ in $W$, then $\omega_W(f, (l+2)^{-1})$ of Eq. (10) can be replaced by $M(l+2)^{-\nu}$.

Proof. Eq. (9) is derived from (ii) of Proposition 2. We apply $K_l$ to Theorem 4. From Eq. (8) of Theorem 4, (iii) and (iv) of Proposition 2, $\gamma_l = 2\sqrt{m}\,\sin\{\pi/(2(l+2))\} < \pi^2\sqrt{m}/\{2(l+2)\}$. Put $\varepsilon = \{\gamma_l (l+2)\}^{-1} > 0$; then $\varepsilon\gamma_l = (l+2)^{-1}$ and $\varepsilon^{-1} < \pi^2\sqrt{m}/2$. Then we obtain Eq. (10) from Eq. (7) of Theorem 4 and (iii) of Proposition 2, and the Lipschitz case from (viii) of Proposition 1. Q.E.D.
Corollary 3 (Multidimensional extension of Jackson's theorem). For $f \in W$ there exists an $l$-order $m$-dimensional trigonometric polynomial $T_l$ for $l \in \mathbf{N}$ such that $\|f - T_l\|_W \le (1 + \pi^2\sqrt{m}/2)\,\omega_W(f, (l+2)^{-1})$. If $f$ satisfies a Lipschitz condition with constant $M$ and exponent $\nu$ in $W$, then $\omega_W(f, (l+2)^{-1})$ can be replaced by $M(l+2)^{-\nu}$.

Proof. The corollary is derived from Theorem 5. Q.E.D.
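As a concrete illustration of Theorem 5 and Corollary 3 in one dimension ($m = 1$), the sketch below builds $K_l * f$ from numerically computed Fourier coefficients, using the multipliers $\hat{K}_l(r) = B_l \sum_{p-q=r} b_{l,p} b_{l,q}$ of Proposition 2, and prints the actual uniform error next to the estimate of Eq. (10) for the Lipschitz target $f(x) = |\sin x|$ ($M = 1$, $\nu = 1$). The target function, the grid resolution, and the use of the $C_{2\pi}$-norm here are illustrative choices made for this sketch, not taken from the paper.

```python
import numpy as np

def fk_multipliers(l):
    """Fourier multipliers K_hat_l(r), r = 0..l, of the 1-D Fejer-Korovkin kernel,
    assuming b_{l,p} = sin((p+1)*pi/(l+2)) as in the proof of Proposition 2."""
    b = np.sin((np.arange(l + 1) + 1) * np.pi / (l + 2))
    B = 1.0 / np.sum(b ** 2)
    return np.array([B * np.sum(b[r:] * b[:l + 1 - r]) for r in range(l + 1)])

x = np.linspace(-np.pi, np.pi, 8192, endpoint=False)
f = np.abs(np.sin(x))                      # 2*pi-periodic, Lipschitz with M = 1, nu = 1

for l in (4, 8, 16, 32):
    k_hat = fk_multipliers(l)
    approx = np.full_like(x, f.mean())     # <f, 1> term of Eq. (9)
    for r in range(1, l + 1):
        a_r = 2.0 * np.mean(f * np.cos(r * x))   # Fourier coefficient of cos(rx)
        b_r = 2.0 * np.mean(f * np.sin(r * x))   # Fourier coefficient of sin(rx)
        approx += k_hat[r] * (a_r * np.cos(r * x) + b_r * np.sin(r * x))
    actual = np.max(np.abs(f - approx))                    # uniform (C_{2pi}) error
    estimate = (1 + np.pi ** 2 / 2) * (1.0 / (l + 2))      # Eq. (10), Lipschitz case, m = 1
    print(f"l = {l:2d}: actual {actual:.4f}  estimate {estimate:.4f}")
```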
K̂ l (1i ) ¼ Bl bl, q bl, q þ 1i 4.1.5. Proof of Theorem 1
0#qi #l ¹ 1 Network construction: We denote K l p f i by TN l[f i] and
X qi þ 1 q þ2 obtain Eq. (1) from Eq. (9) of Theorem 5. Because TN l[f i] is
sin p sin i p a linear combination of trigonometric functions, it is just a
0#qi #l ¹ 1 lþ2 lþ2 p
¼ X ¼ cos : Q:E:D: network on R m to R with trigonometric hidden-layer units.
q þ 1 lþ2
sin2 i p Then TN l[f] is a network on R m to R n with trigonometric
0#qi #l lþ2 hidden-layer units which approximates f with W-norm.
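To make the three-layer structure of $TN_l[f_i]$ explicit, the sketch below enumerates its hidden layer for $m = 2$: one cosine unit and one sine unit per distinct nonzero frequency $r = p - q$ with $|r_u| \le l$, each fed by the input weight vector $r$. The identification of $r$ with $-r$ and the resulting unit count $(2l+1)^m - 1$ are inferences from Eq. (9) and the counts quoted in Section 5, so treat them as assumptions rather than the paper's exact bookkeeping.

```python
import itertools

def trig_hidden_layer(l, m=2):
    """Hidden-layer description of TN_l[f_i]: a list of (kind, frequency vector r),
    one cos and one sin unit per distinct nonzero r = p - q (r and -r identified)."""
    units = []
    for r in itertools.product(range(-l, l + 1), repeat=m):
        if any(v != 0 for v in r):
            if r > tuple(-v for v in r):     # keep one representative of the pair {r, -r}
                units.append(("cos", r))
                units.append(("sin", r))
    return units

for l in (2, 4, 6, 8, 10):
    n_units = len(trig_hidden_layer(l, m=2))
    assert n_units == (2 * l + 1) ** 2 - 1
    print(f"l = {l:2d}: {n_units} trigonometric hidden-layer units")
# Prints 24, 80, 168, 288, 440 -- the counts used in the example of Section 5.
```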
$$\cdots\; \sum_{k=0}^{4|r|j-1} \sin\frac{\pi}{4j}\,\sin\frac{(2k+1)\pi}{4j}\; PL_{j,k}(rx). \qquad (12)$$
The approximation errors with $L^p_{2\pi}$-norm are estimated by
$$\|\sin rx - PS_j(rx)\|_{L^p_{2\pi}} = \|\cos rx - PC_j(rx)\|_{L^p_{2\pi}} \le \Bigl\{\frac{\pi\sqrt{m}}{(2\pi)^{m-1} j}\Bigl(\frac{4j}{\pi} - \cot\frac{\pi}{4j}\Bigr)\Bigr\}^{1/p}. \qquad (13)$$

Fig. 2. $\sin 2x$ and the functions derived from the approximating networks with piecewise linear hidden-layer units $PS_j(2x)$ at $j = 1$ and $2$.

$$\cdots\; \sum_{k=0}^{4|r|j-1} \sin\frac{\pi}{4j}\,\sin\frac{(2k+1)\pi}{4j}\; SG_{j,k}(rx). \qquad (15)$$
The approximation errors with $L^p_{2\pi}$-norm are estimated by
$$\|\sin rx - SS_j(rx)\|_{L^p_{2\pi}} = \|\cos rx - SC_j(rx)\|_{L^p_{2\pi}} \le \Bigl\{\frac{\pi\sqrt{m}}{(2\pi)^{m-1} j}\Bigl(\log 2 - \frac{1}{2} + \frac{4j}{\pi} - \cot\frac{\pi}{4j}\Bigr)\Bigr\}^{1/p}. \qquad (16)$$

Fig. 3. $\sin 2x$ and the functions derived from the approximating networks with sigmoidal hidden-layer units $SS_j(2x)$ at $j = 1$ and $2$.
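For intuition about Fig. 2, the sketch below does something simpler than the $PL_{j,k}$ construction of Theorem 6: it approximates $\sin rx$ by ordinary piecewise-linear interpolation on $4|r|j$ equal subintervals of $[-\pi, \pi]$ (matching the $4|r|j$ unit count of the construction) and reports the normalized $L^1_{2\pi}$ error. It is a stand-in illustration of how the error shrinks as $j$ grows, not the paper's network $PS_j(rx)$.

```python
import numpy as np

def pl_interp_error(r, j, n_grid=20000):
    """L^1_{2pi} error (normalized by 1/(2*pi)) of piecewise-linear interpolation of
    sin(r*x) on 4*r*j equal subintervals of [-pi, pi] -- an illustrative stand-in,
    not the PL_{j,k} construction of Theorem 6."""
    knots = np.linspace(-np.pi, np.pi, 4 * r * j + 1)
    x = np.linspace(-np.pi, np.pi, n_grid)
    target = np.sin(r * x)
    approx = np.interp(x, knots, np.sin(r * knots))   # piecewise-linear interpolant
    return np.mean(np.abs(target - approx))

for j in (1, 2, 5, 60):
    print(f"j = {j:2d}: L1 error of the piecewise-linear stand-in for sin 2x = {pl_interp_error(2, j):.6f}")
```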
Proof. Denote, by $SS_j(rx)$, Eq. (11) of Theorem 6 replacing $PL_{j,k}(rx)$ with $SG_{j,k}(rx)$. Then it is a network with $4|r|j$ sigmoidal hidden-layer units based on $SG_{j,k}$. Let $y_k = -\pi + k\pi/(2j)$ and $y = (8j/\pi)x + 8j - 4k - 2$. Then
$$\|PL_{j,k}(rx) - SG_{j,k}(rx)\|_{L^1_{2\pi}} \le \frac{\sqrt{m}}{(2\pi)^{m-1}} \int_{-\pi}^{\pi} |PL_{j,k}(|r|x) - SG_{j,k}(|r|x)|\, dx \le \frac{\sqrt{m}}{(2\pi)^{m-1}|r|} \int_{-\infty}^{\infty} |PL_{j,k}(x) - SG_{j,k}(x)|\, dx$$
$$= \frac{2\sqrt{m}}{(2\pi)^{m-1}|r|} \int_{y_k}^{(y_k + y_{k+1})/2} \{PL_{j,k}(x) - SG_{j,k}(x)\}\, dx = \frac{\pi\sqrt{m}}{(2\pi)^{m-1}\, 4|r|j}\Bigl[\int_{2}^{\infty} \{1 - \mathrm{sg}(y)\}\, dy + \int_{0}^{2} \Bigl\{\frac{y+2}{4} - \mathrm{sg}(y)\Bigr\}\, dy\Bigr] = \frac{\pi\sqrt{m}}{(2\pi)^{m-1}\, 4|r|j}\Bigl(\log 2 - \frac{1}{2}\Bigr).$$
Hence, from Eqs. (11) and (13) of Theorem 6, Eq. (16) is derived in the case of $p = 1$. $0 \le |\sin rx - SS_j(rx)| \le 1$ according to the construction of $SS_j(rx)$. Then we obtain Eq. (16) from $\|\sin rx - SS_j(rx)\|_{L^p_{2\pi}} \le \|\sin rx - SS_j(rx)\|_{L^1_{2\pi}}^{1/p}$ for $p \ge 1$. We can construct $SC_j(rx)$ in the same manner. Q.E.D.
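The value $\log 2 - 1/2$ of the bracketed integrals can be checked directly; the snippet below does so numerically, assuming $\mathrm{sg}$ is the logistic sigmoid $1/(1+e^{-y})$ (the paper's definition of $\mathrm{sg}$ is given earlier in the text and is taken as an assumption here).

```python
import numpy as np

sg = lambda y: 1.0 / (1.0 + np.exp(-y))

# integral of {1 - sg(y)} over [2, infinity), truncated at y = 60 where the tail is negligible
y1 = np.linspace(2.0, 60.0, 400000)
i1 = np.mean(1.0 - sg(y1)) * (60.0 - 2.0)

# integral of {(y+2)/4 - sg(y)} over [0, 2]
y2 = np.linspace(0.0, 2.0, 200000)
i2 = np.mean((y2 + 2.0) / 4.0 - sg(y2)) * 2.0

print(i1 + i2, np.log(2.0) - 0.5)   # both are about 0.1931
```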
4.2.2. Proofs of Theorems 2 and 3 and Corollaries 1 and 2

Proofs of Theorems 2 and 3. Network construction: We denote, by $PN_{l,j}[f_i]$, Eq. (9) of Theorem 5 replacing $\cos(p-q)x$ and $\sin(p-q)x$ respectively with $PC_j((p-q)x)$ and $PS_j((p-q)x)$ of Theorem 6 and obtain Eq. (3). Because $PN_{l,j}[f_i]$ is a linear combination of $PL_{j,k}$, it is just a network on $\mathbf{R}^m$ to $\mathbf{R}$ with piecewise linear hidden-layer units. Then $PN_{l,j}[f]$ is a network on $\mathbf{R}^m$ to $\mathbf{R}^n$ with piecewise linear hidden-layer units approximating $f$ with $L^p_{2\pi}$-norm.

Hidden-layer unit number: Each $PN_{l,j}[f_i]$ has common hidden-layer units based on $PL_{j,k}$. Let $r = p - q$ in Eq. (3); then the number is given by
$$\frac{1}{2}\sum_{\substack{r \ne 0 \\ -l \le r_i \le l}}\; \sum_{k=0}^{4|r|j-1} 1 = 2j \sum_{-l \le r_i \le l} |r| = 2j\Bigl\{l(l+1)(2l+1)^{m-1} + (2l+1)\sum_{-l \le r_2, \ldots, r_m \le l}(|r_2| + \cdots + |r_m|)\Bigr\} = 2mjl(l+1)(2l+1)^{m-1}.$$
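The sketch below evaluates this count, $2mjl(l+1)(2l+1)^{m-1}$, for the two-dimensional example of Section 5 ($m = 2$, $j = 5$ and $60$, $l = 2, 4, 6, 8, 10$); the printed values match the hidden-layer unit numbers quoted there (about $6.0 \times 10^2$ up to $5.5 \times 10^5$).

```python
def pl_unit_count(m, j, l):
    """Number of piecewise linear (or sigmoidal) hidden-layer units of PN_{l,j}[f_i]."""
    return 2 * m * j * l * (l + 1) * (2 * l + 1) ** (m - 1)

for j in (5, 60):
    counts = [pl_unit_count(2, j, l) for l in (2, 4, 6, 8, 10)]
    print(f"m = 2, j = {j:2d}: {counts}")
# m = 2, j =  5: [600, 3600, 10920, 24480, 46200]
# m = 2, j = 60: [7200, 43200, 131040, 293760, 554400]
```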
Error estimation:
$$B_l \sum_{\substack{\text{combinations of } p \ne q \in \mathbf{N}_0^m \\ 0 \le p_u,\, q_v \le l}} b_{l,p}\, b_{l,q} = \frac{1}{2} B_l\Bigl\{\sum_{0 \le p_u,\, q_v \le l} b_{l,p}\, b_{l,q} - \sum_{0 \le p_u \le l} (b_{l,p})^2\Bigr\} = \frac{1}{2}\Bigl\{B_l\Bigl(\sum_{0 \le r \le l} \sin\frac{(r+1)\pi}{l+2}\Bigr)^{2m} - 1\Bigr\}$$
$$= \frac{1}{2}\Bigl\{\Bigl(\frac{2}{l+2}\Bigr)^m\Bigl(\cot\frac{\pi}{2(l+2)}\Bigr)^{2m} - 1\Bigr\} \le \frac{1}{2}\Bigl\{\Bigl(\frac{8(l+2)}{\pi^2}\Bigr)^m - 1\Bigr\},$$
because $0.9 < \{\pi/(2(l+2))\}\cot\{\pi/(2(l+2))\} < 1$ for $l \in \mathbf{N}$. Hence, from Eq. (3), Eq. (9) of Theorem 5, Eq. (13) of Theorem 6, $|\langle f_i, \cos(p-q)t\rangle| \le \|f_i\|_{L^p_{2\pi}}$, and $|\langle f_i, \sin(p-q)t\rangle| \le \|f_i\|_{L^p_{2\pi}}$,
$$\|K_l * f_i - PN_{l,j}[f_i]\|_{L^p_{2\pi}} \le 2\|f_i\|_{L^p_{2\pi}}\Bigl\{\Bigl(\frac{8(l+2)}{\pi^2}\Bigr)^m - 1\Bigr\}\Bigl\{\frac{\pi\sqrt{m}}{(2\pi)^{m-1} j}\Bigl(\frac{4j}{\pi} - \cot\frac{\pi}{4j}\Bigr)\Bigr\}^{1/p}.$$
Thus, Eq. (4) is obtained from Eq. (10) of Theorem 5 for $W = L^p_{2\pi}$. We can prove Theorem 3 in the same manner as Theorem 2 using $SC_j$ and $SS_j$ of Theorem 7 instead of $PC_j$ and $PS_j$. Q.E.D.
5. An example

This is an example of the approximation to a two-dimensional Gaussian function $e^{-(x^2+y^2)}$ on $[-\pi, \pi] \times [-\pi, \pi]$ (cf. Fig. 4) by the neural networks proposed in this paper. The following approximating networks are constructed using Eqs. (1), (3) and (5): (i) networks with trigonometric hidden-layer units for $l = 2, 4, 6, 8$, and $10$, which respectively have 24, 80, 168, 288, and 440 hidden-layer units; (ii) networks with piecewise linear and sigmoidal hidden-layer units for $l = 2, 4, 6, 8$, and $10$ at $j = 5$, which respectively have about $6.0 \times 10^2$, $3.6 \times 10^3$, $1.1 \times 10^4$, $2.4 \times 10^4$, and $4.6 \times 10^4$ hidden-layer units, and the same $l$ values at $j = 60$, which respectively have about $7.2 \times 10^3$, $4.3 \times 10^4$, $1.3 \times 10^5$, $2.9 \times 10^5$, and $5.5 \times 10^5$ hidden-layer units. Their actual approximation-error values with $L^1_{2\pi}$-norm, which are the left sides of Eqs. (2), (4) and (6), are calculated by numerical integrations. Their estimated approximation-error values are calculated from the right sides of Eqs. (2), (4) and (6). $j$ is fixed at 5 and 60 for $l$ in this example; therefore, notice that the estimated error values of networks with piecewise linear and sigmoidal hidden-layer units do not necessarily decrease monotonically as $l$ increases, because of the second terms of the right sides of Eqs. (4) and (6).

When $j = 5$, the actual error values of the three kinds of networks are about the same and decrease monotonically as $l$ increases (cf. Fig. 5). This shows that these approximations proceed in almost the same way as $l$ increases, even when $j$ is fixed at a small value for $l$ in this example. The actual error values are always bounded by the estimated error values. The estimated error values of the networks with piecewise linear and sigmoidal hidden-layer units, however, do not decrease monotonically as $l$ increases because $j$ is fixed at a small value for $l$. When $j = 60$, these actual error values are almost equal and decrease monotonically as $l$ increases (cf. Fig. 6). This shows that these approximations proceed almost equally as $l$ increases, when $j$ is fixed at a larger value relative to $l$ in this example. In this case, the actual approximation-error values are bounded by the estimated approximation-error values, which decrease monotonically as $l$ increases. The actual error values of the three kinds of networks in both cases of $j = 5$ and $60$ in this example can be estimated by the same formulation, i.e. the right side of Eq. (2) and the first terms of the right sides of Eqs. (4) and (6), as we stated in Remark 2. As explained in the following discussion, the approximation capabilities of the three kinds of networks cannot be compared with each other using the …

Fig. 4. The three-dimensional and contour graphs of the approximated function $e^{-(x^2+y^2)}$.
Fig. 5. The actual and estimated approximation error values of the approximating networks with trigonometric, piecewise, and sigmoidal hidden-layer units for $l$ at $j = 5$.

Fig. 6. The actual and estimated approximation error values of the approximating networks with trigonometric, piecewise, and sigmoidal hidden-layer units for $l$ at $j = 60$.
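The sketch below reproduces the flavor of this example for the trigonometric networks $TN_l$: it computes the two-dimensional Fourier coefficients of $e^{-(x^2+y^2)}$ on $[-\pi,\pi] \times [-\pi,\pi]$ numerically, applies the Fejér–Korovkin multipliers of Proposition 2 coordinate-wise, and evaluates the actual $L^1_{2\pi}$ error by numerical integration. The grid resolution and the normalization $\langle f, g\rangle = (2\pi)^{-2}\iint f g\, dt$ are assumptions made for the sketch, so the printed values are only indicative of Fig. 5, not the paper's exact numbers.

```python
import numpy as np

def fk_multipliers(l):
    # 1-D Fejer-Korovkin multipliers K_hat_l(r), r = 0..l (cf. Proposition 2)
    b = np.sin((np.arange(l + 1) + 1) * np.pi / (l + 2))
    B = 1.0 / np.sum(b ** 2)
    return np.array([B * np.sum(b[r:] * b[:l + 1 - r]) for r in range(l + 1)])

n = 256
x = np.linspace(-np.pi, np.pi, n, endpoint=False)
X, Y = np.meshgrid(x, x, indexing="ij")
f = np.exp(-(X ** 2 + Y ** 2))                     # target of the example

for l in (2, 4, 6, 8, 10):
    k1 = fk_multipliers(l)
    tn = np.full_like(f, f.mean())                 # <f, 1> term of TN_l[f]
    for r1 in range(-l, l + 1):
        for r2 in range(-l, l + 1):
            if (r1, r2) == (0, 0):
                continue
            phase = r1 * X + r2 * Y
            a = 2.0 * np.mean(f * np.cos(phase))   # Fourier coefficients of the pair {r, -r}
            b = 2.0 * np.mean(f * np.sin(phase))
            weight = 0.5 * k1[abs(r1)] * k1[abs(r2)]   # halved, since r and -r are both visited
            tn += weight * (a * np.cos(phase) + b * np.sin(phase))
    actual_l1 = np.mean(np.abs(f - tn))            # L^1_{2pi} error, normalized by (2*pi)^{-2}
    print(f"l = {l:2d}: actual L1 error of TN_l = {actual_l1:.4f}")
```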
… theorem for an approximation by trigonometric polynomials. (In the case of (1), the networks also approximate $2\pi$-periodic continuous functions with $C_{2\pi}$-norm.) Because most of the previous results about the constructive approximations by neural networks approximate continuous functions, the approximations in this paper extend the space of target functions. Theorems 1, 2, and 3 provide explicit equational representations of approximating networks $TN_l$, $PN_{l,j}$, and $SN_{l,j}$ respectively with trigonometric, piecewise linear, and sigmoidal hidden-layer units, specifications for their numbers of hidden-layer units, and explicit formulations of their approximation-error estimations. Most previous constructive approximation methods are not simple enough for deriving explicit equational representations of approximating networks that can be calculated practically. The approximation methods presented in this paper only need multidimensional Fourier coefficients of a target function and can practically and easily construct approximating networks. The formulations of the approximation-error estimations in most of the previous results contain inexplicit constants. Then it is not easy to calculate the estimated values of approximation errors practically using those formulations, so that they fit the actual values. The error estimations in this paper are derived from the direct error evaluations using the modulus of continuity of a target function and can be represented by explicit formulations without any inexplicit constant. Then we can easily calculate the estimated values using these formulations. Networks $TN_l$, $PN_{l,j}$, and $SN_{l,j}$ can approximate functions with any degree of accuracy, if their numbers of hidden-layer units increase. Corollary 2 shows that their approximation errors approach 0, if $l$ increases under some conditions of $j$. The approximation methods by networks with piecewise linear and sigmoidal hidden-layer units are based on the method by networks with trigonometric hidden-layer units. In fact, for any $l$, the functions by $PN_{l,j}$ and $SN_{l,j}$ approach the function by $TN_l$ and then become almost the same, if $j$ increases. Moreover, for any $l$, the approximation errors of $PN_{l,j}$ and $SN_{l,j}$ approach the error of $TN_l$ if $j$ increases. Then they become almost the same value, which can be estimated mainly by the same formulation in terms of $l$, i.e. the right side of Eq. (2) and the first terms of the right sides of Eqs. (4) and (6), while the second terms of the right sides of Eqs. (4) and (6) are negligible, if $j$ is large enough for $l$. These results apply easily to an approximation to a non-periodic function defined in a bounded set on $\mathbf{R}^m$. The approximation example to a two-dimensional Gaussian function by our methods illustrates our results.
Acknowledgements
References