To improve traditional neural networks, the present research used the wavelet net-
work, a special feedforward neural network with a single hidden layer supported by the
wavelet theory. Prediction performance and efficiency of the proposed network were ex-
amined with a published experimental dataset of cross-flow membrane filtration. The
dataset was divided into two parts: 70 samples for training data and 330 samples for test-
ing data. Various combinations of transmembrane pressure, filtration time, ionic strength
and zeta potential were used as inputs of the wavelet network to predict the permeate
flux. The initial network led to a wavelet network model after training procedures with
fast convergence within 30 epochs. Further, the wavelet network model accurately de-
picted the positive effects of either transmembrane pressure or zeta potential on permeate
flux. Moreover, comparisons indicated the wavelet network model produced better pre-
dictability than the back-forward backpropagation neural network and the multiple re-
gression models. Thus the wavelet network approach could be employed successfully in
modeling dynamic permeate flux in cross-flow membrane filtration.
Key words:
Artificial Neural Network, membrane, simulation
(RBFNs) have been proven the two most attractive methods for modeling membrane filtration until now (Al-Abri and Hilal,13 Aydiner et al.,14 Sahoo and Ray15). The BPNN was first used to predict the dynamic process of membrane fouling during cross-flow microfiltration of cane sugar. They studied the effect of the BPNN hidden structure and different distributions of training data on the predictability, and they obtained satisfactory accuracy.

In following studies, BPNNs have been shown to be sometimes better suited and more useful for highly nonlinear processes of membrane filtration (Chellam,16 Cinar et al.,17 Curcio et al.,18 Ni Mhurchu and Foley19). Unfortunately, there is a lack of reliable rules for the choice of the hidden structure of BPNNs, and the common way is by trial and error. Moreover, initial parameters of BPNNs are often assigned random values, which causes slow convergence during the training process. On the other hand, using the experimental data of Faibish et al.,20 Chen and Kim11 found that RBFNs achieved better predictions than BPNNs. However, Sahoo and Ray15 pointed out that the results of Chen and Kim11 were obtained through ANNs with un-optimized structures. With the same experimental data of Faibish et al.,20 they showed that there was no apparent difference in predictability between BPNNs and RBFNs when the structures of both types of networks were optimized by genetic algorithms (GAs). Although the GA-optimized networks may improve the predictability of ANNs, Cheng et al.12 stated that the GAs are inclined to encounter the pitfall of overfitting when the number of neurons is too large, because of their intrinsic randomness. Cheng et al.12 presented a modified RBFN (MRBFN) which showed good convergence resulting from a well initialized structure. They constructed the MRBFN by choosing neurons from a set of candidate neurons, which were predetermined according to uniform partitions of the domain of interest at different levels. However, their method could encounter the curse of dimensionality when the input dimension increases, because the increasing dimension leads to an exponential increase of the uniform partitions of the domain. In particular, the curse of dimensionality becomes more severe when higher levels are carried out in the case of higher input dimension. Despite the vast efforts to improve the efficiency of ANNs in membrane processes as above, there are still some problems, for instance, a lack of specific methods or the danger of encountering lower efficiency in determining the initial network structures and the initial parameters. Therefore, a more efficient type of ANN would be preferred to model the complex process of membrane filtration. Wavelet networks have been shown to be a promising alternative to traditional ANNs (Zhang and Benveniste21), and many studies describe wavelet networks as a powerful tool for function approximation (Zhang and Benveniste,21 Zhang et al.22), system identification (Billings and Wei,23 Sjoberg et al.24), automatic control (Sanner and Slotine,25 Oussar et al.26), etc. Such a network is a special 3-layer feedforward network, as shown in Fig. 1, which is calculated as follows:

ŷ = Σ_{i=1}^{s} w_i ψ_i(x)   (1)

Fig. 1 – The structure of the wavelet network

Where, ŷ is the output; x is the input; w is the connecting weight; ψ is the hidden-layer neuron; s is the number of hidden neurons. As a distinguishing characteristic, ψ is a wavelet function and is often called a wavelon. The wavelet is a function that satisfies an admissibility condition and it has a property of compact support, namely, nonzero values only in a finite domain (Daubechies27). A single wavelet ψ(x), called a mother wavelet, can generate a family of wavelets by dilating and translating itself as follows:

{ψ_{m,n}(x) = a^{−dn/2} ψ(a^{−n}x − mb) : n ∈ ℤ, m ∈ ℤ^d}   (2)

Where, a and b are the dilation and translation step sizes (typically a = 2, b = 1), respectively, m is the translation index and n is the dilation level. Each wavelet in the family (2) has finite support. Support refers to the region where the function is nonzero. The wavelet function of the family (2) can exhibit a support that is compact or rapidly vanishing (Daubechies27). The wavelet family can be used as a tool for function analysis and synthesis. The concept of a wavelet family is very similar to the set of sine functions at different frequencies used in Fourier analysis. The family (2) is used as a set of candidate wavelets that constitute the original form of eq. 1, i.e., the initial wavelons are chosen from the
J. VIVIER and A. MEHABLIA, A New Artificial Network Approach for …, Chem. Biochem. Eng. Q. 26 (3) 241–248 (2012) 243
family (2). For a set of training data, only part of the wavelets in the family (2) are useful for constituting the wavelet network model, because most wavelets may cover no data due to their compact support (Zhang28). In addition, the set of training data in practical cases is sparse, and this sparseness leads to a further reduction in the number of candidate wavelets in family (2). Clearly, both the wavelet compact support and the sparseness of training data make the selection of initial wavelons in eq. 1 more convenient. Once a given number of initial wavelons have been chosen, the initial value of the connecting weight (w) can be determined by linear regression (Zhang28). Compared with traditional ANNs, wavelet networks obviously offer a highly efficient approach for constructing the initial form of networks. Many studies have exhibited the high efficiency of wavelet networks according to different constructing methods, such as the orthogonal least squares (OLS) algorithm and the backward elimination algorithm (Oussar and Dreyfus,29 Zhang28). In particular, Zhang28 has reported that wavelet networks converged fast during training processes because of their better initialization, and wavelet networks already offered a comparable accuracy even in their initial form. Specific constructing methods, better initialization of internal parameters and fast convergence all make wavelet networks more suitable than traditional ANNs for modeling complex nonlinear processes. Although the great benefit of wavelet networks has been successfully applied to many fields, no wavelet network application to membrane systems has been reported yet. In this paper, the wavelet network approach was used to model a process of cross-flow membrane filtration based on the published data set of Bowen et al.8 The purpose of the current investigation was to demonstrate the high efficiency of a wavelet network; to study its predictability of permeate flux under various operating conditions and to compare its capability with those of the BPNN and the conventional multiple regression (MR) method.

Materials and methods

Experimental data

This study used the experimental data published in Bowen et al.8 In the experiments, spherical silica colloids with a mean particle diameter of 65 nm were used to perform cross-flow membrane ultrafiltration at five different combinations of pH and ionic strength (I). For each combination of pH and I, the permeate flux was measured under five different transmembrane pressures (ΔP) as shown in Fig. 4. In this paper, I, ζ, ΔP and time (t) acted as the inputs of wavelet networks to predict the dynamic permeate flux. Out of 400 data samples, 70 samples were chosen as training data such that more samples were located in regions of greatest curvature, as shown in Fig. 3. The other 330 samples were used as testing data. Prior to being put into the wavelet network, the values of all experimental data were normalized between –1 and 1 as below:

x_norm = (x − x_cent) / x_half   (3)

Where, x_cent = (x_max + x_min)/2 is the center of the variable interval, x_half = (x_max − x_min)/2 is the half of the interval length, x_max is the maximum value of a variable and x_min is the minimum value of a variable. On the other hand, a corresponding denormalization was done to achieve a reasonable permeate flux after an output of the network was attained. The normalization method was derived from Z-score normalization (Priddy and Keller30). Such normalization can minimize bias within the wavelet network for one feature over another. It can also speed up training time by starting the training process for each feature within the same scale.

Constructing and training the wavelet network

According to eq. 1, a so-called Mexican hat wavelet (Zhang28) was chosen as the mother wavelet with the following form:

ψ(x) = (d − ||x||²) e^{−||x||²/2}   (4)

Where, d is the dimension of x and ||x||² = xxᵀ. Then a family of candidate wavelets was determined by the mother wavelet and the training data. Then, further development was done by selecting different numbers of wavelons from the candidate family according to the OLS algorithm, and the number of wavelons was determined by the generalized cross-validation (GCV) criterion. The GCV, a modified form of cross-validation, is a popular method for choosing the smoothing parameter. An important early reference is Craven and Wahba.31 The GCV criterion consists of estimating the expected performance of the model evaluated with fresh data based on the required data (Zhang28). Thus an initial form of the wavelet network model was achieved. In the end, the initial wavelet network was trained to get the final wavelet network model. The detailed steps are listed below:

1) Definition of wavelet support: For ψ_{m,n}(x) in the family (2), denote by S_{m,n} its support, i.e.,

S_{m,n} = {x ∈ ℝ^d : |ψ_{m,n}(x)| > 0.662 max |ψ_{m,n}(x)|}   (5)
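The building blocks introduced so far, the network output of eq. 1, the wavelet family of eq. 2, the normalization of eq. 3 and the Mexican hat wavelet of eq. 4, can be sketched in Python. This is a minimal illustration under our own naming conventions, not the authors' code:

```python
import numpy as np

def mexican_hat(x):
    """Mother wavelet of eq. 4: psi(x) = (d - ||x||^2) * exp(-||x||^2 / 2)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    r2 = float(x @ x)                      # ||x||^2 = x x^T
    return (x.size - r2) * np.exp(-r2 / 2.0)

def wavelon(x, m, n, a=2.0, b=1.0):
    """One member of the family of eq. 2:
    psi_{m,n}(x) = a^(-d*n/2) * psi(a^(-n) * x - m*b), with a = 2, b = 1."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    scale = a ** (-x.size * n / 2.0)
    return scale * mexican_hat(a ** float(-n) * x - np.asarray(m, dtype=float) * b)

def normalize(x, x_min, x_max):
    """Eq. 3: map a raw value into [-1, 1] via the interval center and half-length."""
    x_cent = (x_max + x_min) / 2.0
    x_half = (x_max - x_min) / 2.0
    return (x - x_cent) / x_half

def network_output(x, weights, wavelons):
    """Eq. 1: y_hat = sum over the s hidden wavelons of w_i * psi_i(x)."""
    return sum(w * psi(x) for w, psi in zip(weights, wavelons))
```

For example, normalize(5.0, 0.0, 10.0) returns 0.0, and a network holding a single wavelon with unit weight at m = 0, n = 0 simply reproduces the mother wavelet.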
W = {ψ_{m,n} : (m, n) ∈ I_1 ∪ I_2 ∪ … ∪ I_N}   (7)

Let L be the number of wavelets in W. For convenience of the following presentation, the double index (m, n) is replaced by a single index j = 1, 2, …, L, i.e.,

W = {ψ_1, ψ_2, …, ψ_L}   (8)

3) Determination of the initial wavelet network: The initial wavelet network is achieved by a hybrid algorithm (Zhang28) where the OLS algorithm determines the relative optimal combination of a certain number of candidate wavelets and then the GCV criterion points out the most reasonable combination of a certain number of candidate wavelets. The hybrid algorithm can be done as follows:

Algorithm:
I = {1, 2, …, L};
p_j = ψ_j for all j ∈ I;
l_0 = 0, q_{l_0} = 0;
Begin loop
For i = 1 : L
p_j = p_j − (ψ_j^T q_{l_{i−1}}) q_{l_{i−1}}, for all j ∈ I;
I = I − {j : p_j = 0};
If I is empty, set s = i − 1 and break the loop;
l_i = arg max_{j∈I} (p_j^T y)² / (p_j^T p_j);
a_{ji} = ψ_{l_i}^T q_{l_j} for j = 1, …, i − 1;
a_{ii} = p_{l_i}^T p_{l_i};
q_{l_i} = a_{ii}^{−1} p_{l_i};
w̃_{l_i} = q_{l_i}^T y;
ŷ_i(x) = Σ_{j=1}^{i} w̃_{l_j} ψ_{l_j}(x), which is the resulting wavelet network;
GCV_i = (1/N) Σ_{k=1}^{N} (ŷ_i(x_k) − y_k)² (1 + 2s/N);
I = I − {l_i};
End loop
s = arg min_{i∈{1,…,L}} GCV_i.

As a result, the initial wavelet network can be achieved with the form of eq. 1.

4) Training the wavelet network: A common backpropagation training method is applied to fit the training data as well as possible in this paper. The details about the backpropagation method can be found in Haykin.32 The performance of the training process is evaluated with the normalized square root of mean square error (NSRMSE):

NSRMSE = sqrt[(1/K) Σ_{k=1}^{K} (y_k − ŷ_k)²] / sqrt[(1/K) Σ_{k=1}^{K} (y_k − ȳ)²], with ȳ = (1/K) Σ_{k=1}^{K} y_k.

In addition, the NSRMSE was used as the stopping criterion for training the ANN, and it was also a performance index of model prediction in this paper.

Results and discussion

The wavelet network model

By examining each training input x_k, consisting of I, ζ, ΔP and t, a family of 152 candidate wavelets was achieved and each of the wavelets had a support covering at least one training input. By comparison, the number of candidate wavelets would be: (2⁴)⁰ + (2⁴)¹ + (2⁴)² + (2⁴)³ + (2⁴)⁴ = 69905.
Effect of I, ζ and ΔP on permeate flux

Fig. 4 shows the measurements and the predictions of the wavelet network model for permeate flux during cross-flow ultrafiltration of silica colloids at five pressures for each of the five combinations of pH and I. The wavelet network model excellently offered predictions for both training and testing data. Moreover, the nonlinearity of the flux time profiles was well reproduced by the wavelet network. As shown in Fig. 4, the predictions as well as the measurements accurately described the permeate flux behavior greatly changing with the variations of ΔP, pH, and I. The same plot could have been generated for each scenario, but it has been confined to one case to show the trend. For each combination of pH and I, the permeate flux was proportional to ΔP and the ultimate flux at higher transmembrane pressure was higher. These phenomena are due to the greater driving force for permeate flux by higher ΔP (Faibish et al.20). Meanwhile, higher ΔP led to the steeper flux decline at the earlier stage of the filtration. This trend is attributed to the higher convective mass of colloids toward the membrane surface and a more densely packed cake layer at higher pressure (Faibish et al.20). On the other hand, for each combination of pH and I, the lowest ΔP was observed with the least flux decline for both experimental data and predictions. This observation is an index of operation approaching the critical flux below which colloids do not deposit on the membrane surface (Faibish et al.20). The predictions also accurately described the positive effect of ζ on permeate flux. Also, an increasing flux followed a decreasing I with the same pH and, in the latter, the enhancement in permeate flux was preceded by an increasing pH with the same I. In fact, both of the above operations led to an increasing ζ (not considering the sign) of colloids (Bowen et al.8). The enhancements in permeate flux by ζ in Fig. 4 are all in qualitative agreement with the published reports by Huisman et al.33 and McDonogh et al.34 These authors explained the effect of ζ qualitatively by reasoning that high zeta potentials increase inter-particle repulsion, thus causing less deposition (thinner cake layers) and more permeable cake layers.

Consequently, the thinner and more permeable cake layers lead to an enhancement of permeate flux. Such an enhancement became more significant in this paper because of the colloidal particles with a smaller mean diameter of 65 nm. Faibish et al.20 have reported that for particles with diameters smaller than about 100 nm, the effect of interaction repulsion on permeate flux becomes dominant. Among all the predictions and experiments, the combination of pH = 9, I = 0.030 M and ζ = –80.5 mV led to the highest permeate flux for each ΔP in the earlier stage. Besides the effect of the highest ζ, lower I also enhanced the permeate flux for case E, which is in accordance with the report by Elzo et al.35 who stated that lower I corresponds to larger distances between particles, and such distances result in more permeable cake layers. The predictions in case E agree well with the theoretical explanation. All the case scenarios are listed below.

Case A: pH = 4, I = 0.072 M, ζ = 6.3 mV (Fig. 4)
Case B: pH = 4, I = 0.030 M, ζ = 17.9 mV
Case C: pH = 4, I = 0.0077 M, ζ = 24.0 mV
Case D: pH = 7, I = 0.031 M, ζ = 46.7 mV
Case E: pH = 9, I = 0.030 M, ζ = 80.5 mV

Performance comparison

This section describes the superiority of the wavelet network over the BPNN under the same modeling conditions. The BPNN model used the tanh function as hidden neurons. Moreover, a comparison with the conventional MR method was used to show the better initialization of the wavelet network. Two MR models were built with the identical training data. One is a linear regression equation with the following form:

J = 2.488 · 10⁻² − 2.401 · 10⁻¹{I} − 2.230 · 10⁻⁴{ζ} + 1.376 · 10⁻⁴{ΔP} − 8.353 · 10⁻⁴{t}.
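The linear MR equation above transcribes directly into code. This is our transcription of the printed coefficients; the input units follow the original data set:

```python
def mr_flux(I, zeta, dP, t):
    """Linear multiple-regression model for permeate flux J,
    with the coefficients as reported in the text."""
    return (2.488e-2
            - 2.401e-1 * I       # ionic strength
            - 2.230e-4 * zeta    # zeta potential
            + 1.376e-4 * dP      # transmembrane pressure
            - 8.353e-4 * t)      # filtration time
```

At the origin of all four inputs, mr_flux returns the intercept 2.488 · 10⁻²; the positive coefficient on ΔP reproduces the qualitative pressure dependence discussed above, while a linear form cannot capture the nonlinear flux decline with time.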
ing complex membrane filtration processes. As a modeling tool, wavelet networks could be used for both observation of membrane system performance and evaluation of experimental conditions. In addition, this modeling technique could be applied as a simulation tool to improve the operating conditions of other water and wastewater treatment systems that involve highly nonlinear processes.

Nomenclature

ŷ - output
x - input
w - connecting weight
ψ - hidden-layer neuron (wavelon)
a and b - dilation and translation step sizes
d - dimension of x
L - number of wavelets
NSRMSE - performance index of model prediction
s - number of hidden neurons
x_cent = (x_max + x_min)/2 - center of variable interval
x_half = (x_max − x_min)/2 - half of interval length

References

1. Brindle, K., Stephenson, T., Biotech. Bioeng. 49 (6) (1996) 701.
2. Liao, B. Q., Kraemer, J. T., Bagley, D. M., Crit. Rev. Environ. Sci. Tech. 36 (6) (2006) 489.
3. Visvanathan, C., Ben Aim, R., Parameshwaran, K., Crit. Rev. Environ. Sci. Tech. 30 (1) (2000) 1.
4. Defrance, L., Jaffrin, M. Y., Gupta, B., Paullier, P., Geaugey, V., Biores. Tech. 73 (2) (2000) 105.
5. Song, L. F., J. Membr. Sci. 139 (2) (1998) 183.
6. Zhang, M. M., Song, L. F., Environ. Sci. Tech. 34 (17) (2000) 3767.
7. Li, Q. L., Elimelech, M., Environ. Sci. Tech. 38 (17) (2004) 4683.
8. Bowen, W. R., Mongruel, A., Chem. Eng. Sci. 51 (18) (1996) 4321.
9. Tarleton, E. S., Wakeman, R. J., Chem. Eng. Res. Des. 71 (4) (1993) 399.
10. Chattopadhyay, S., Bandyopadhyay, G., Int. J. Remote Sens. 28 (20) (2007) 4471.
11. Chen, H. Q., Kim, A. S., 192 (1–3) (2006) 415.
12. Cheng, L. H., Cheng, Y. F., Chen, J. H., J. Membr. Sci. 308 (1–2) (2008) 54.
13. Al-Abri, M., Hilal, N., Chem. Eng. J. 141 (1–3) (2002) 27.
14. Aydiner, C., Demir, I., Yildiz, E., J. Membr. Sci. 248 (1–2) (2005) 53.
15. Sahoo, G. B., Ray, C., J. Membr. Sci. 283 (1–2) (2006) 147.
16. Chellam, S., J. Membr. Sci. 258 (1–2) (2005) 35.
17. Cinar, O., Hasar, H., Kinaci, C., J. Biotech. 123 (2) (2006) 204.
18. Curcio, S., Calabro, V., Iorio, G., J. Membr. Sci. 286 (1–2) (2006) 125.
19. Ni Mhurchu, J., Foley, G., J. Membr. Sci. 281 (1–2) (2006) 325.
20. Faibish, R. S., Elimelech, M., Cohen, Y., J. Colloid Interf. Sci. 204 (1) (1998) 77.
21. Zhang, Q. H., Benveniste, A., IEEE T. Neural Networ. 3 (6) (1992) 889.
22. Zhang, X. Y., Qi, J. H., Zhang, R. S., Liu, M. C., Xue, H. F., Fan, B. T., Comput. Chem. 25 (2) (2001) 125.
23. Billings, S. A., Wei, H. L., IEEE T. Neural Networ. 16 (4) (2005) 862.
24. Sjoberg, J., Zhang, Q. H., Ljung, L., Benveniste, A., Delyon, B., Glorennec, P. Y., Hjalmarsson, H., Juditsky, A., Automatica 31 (12) (1995) 1691.
25. Sanner, R. M., Slotine, J. J. E., Int. J. Contr. 70 (3) (1998) 405.
26. Oussar, Y., Rivals, I., Personnaz, L., Dreyfus, G., Neurocomputing 20 (1–3) (1998) 173.
27. Daubechies, I., Ten lectures on wavelets, SIAM, Philadelphia, (1992) 11.
28. Zhang, Q. H., Technical report of Linkoping University, LiTH-ISY-I1423, October (1992).
29. Oussar, Y., Dreyfus, G., Neurocomputing 34 (1) (2000) 131.
30. Priddy, K. L., Keller, P. E., Artificial neural networks: An introduction, SPIE Press, (2005) 15.
31. Craven, P., Wahba, G., Numer. Math. 31 (4) (1979) 377.
32. Haykin, S., Neural networks: A Comprehensive Foundation, 2nd ed., Prentice Hall, (1999) 183.
33. Huisman, I. H., Dutre, B., Persson, K. M., Tragardh, G., Desalination 113 (1) (1997) 95.
34. McDonogh, R. M., Fane, A. G., Fell, C. J. D., J. Membr. Sci. 43 (1) (1989) 69.
35. Elzo, D., Huisman, I., Middelink, E., Gekas, V., Colloid Surf. Physicochem. Eng. A 138 (2–3) (1998) 145.