

Control Engineering Practice 13 (2005) 681–687


Nonlinear system identification of rapid thermal processing

Caizhong Tian (a,*), Takao Fujii (b)
(a) Tokyo Electron LTD., TBS Broadcast Center, 3-6 Akasaka 5, Minato-ku, Tokyo 107-8481, Japan
(b) Graduate School of Engineering Science, Osaka University, Machikaneyama 1, Toyonaka city, Osaka 560-8531, Japan
Received 13 September 2003; received in revised form 2 April 2004
Available online 16 September 2004


Identification of rapid thermal processing parameters is examined to find a more accurate model to predict and control the
temperature of semiconductor wafers during processing. In this paper, a Wiener model is applied to identify the significant dynamics
of an RTP system. A recursive method is developed to simultaneously estimate the model parameters and states with respect to
parameter variation in RTP systems under process noise. The identification result shows that the model’s prediction error was
greatly reduced as compared to a linear model. The proposed method can also be easily applied to model-based adaptive control.
© 2004 Elsevier Ltd. All rights reserved.

Keywords: Identification; Nonlinear models; Kalman filters; Measurement noise; Recursive estimation; Adaptive control

* Corresponding author. Central Research Laboratory, Tokyo Electron Ltd., 1-8 Fuso Cho, Amagasaki City Hyogo Prefecture, 6600891 Japan. Tel.: +81-664-874-775; fax: +81-664-872-892.
E-mail address: david.tian@tel.com (C. Tian).
0967-0661/$ - see front matter © 2004 Elsevier Ltd. All rights reserved.

1. Introduction

Rapid thermal processing (RTP) is a key technology for single-wafer fabrication operations in semiconductor manufacturing. The requirements imposed by the need for high quality, high yield, and increasing feature sizes, as well as the competitive nature of semiconductor manufacturing, motivate this interest in model identification of RTP systems for control. Over the last few years, many RTP system parameters have been identified through coarse models, but the shortcomings of these models severely limit process-control accuracy.

RTP systems heat the wafer using radiative energy generated by several lamps placed near the wafer. The most common concern in manufacturing is thermal uniformity across a wafer during processing. Since the desired heating and cooling rates are often very high, the resultant stresses can easily exceed the plastic limit of the wafer (Campbell, Ahn, Knutson, Liu, & Leighton, 1991; Jin & Hyun, 2001). Experiments have shown that, given an accurate enough model, the nonuniform temperature of the wafer can be controlled by adjusting the relative power of individual lamps, which alters the heat flux density of a wafer in RTP (Cho & Gyuyi, 1997; Gyurcsik, Riley & Sorrell, 1991; Acharya et al., 2001).

Most approaches to model identification in RTP systems use a linear approximation model such as the time constant and gain (TG) model (Cho, Paulraj & Kailath, 1994), the state-space model (Cho & Kailath, 1993), and a physics-based numerical calculation model (Campbell et al., 1991; Wang & Spanos, 2002). However, the nonlinearity and parameter variation during processing prevent us from obtaining a precise model with reliable identification for a wide range of wafer temperatures. Some recent techniques on model identification have been studied in order to identify an accurate model for the RTP system.

In this paper, we present some initial experimental results on the development of a nonlinear Wiener model (Schetzen, 1980) for RTP systems. The model is valid over a larger operating envelope than a linear time-invariant model. A recursive method based on the
Extended Kalman Filter (Ljung, 1979) is applied to simultaneously estimate the model parameters and states under parameter variation in RTP systems and process noise. We also explore the implications for the future design of model-based adaptive control.

2. The identification problem of rapid thermal processing

A small custom RTP system, shown in Fig. 1, was used in identification. In this system, electrical energy, the system input, is supplied to one pot lamp arranged in the center and two ring cylindrical lamps arranged in the middle and edge of the heater, respectively. Energy transport is achieved both by radiation through a quartz window onto a thin wafer and by reflections off the walls. The low thermal mass of a single wafer allows the RTP system to rapidly increase wafer temperatures; the cold-wall system allows the wafer to be quickly cooled as well. Three thermocouples are mounted on the surface of the wafer to measure the temperature at its center, middle, and edge (the system output).

To capture the significant dynamics of this kind of physical system, we attempted to use a nonlinear Wiener model. The Wiener model is a kind of block-oriented nonlinear model consisting of a dynamic linear submodel and a static or memoryless nonlinear block, illustrated in Fig. 2. The advantages of using this kind of model lie in low computational cost for identification and suitability for control design. Here the model form is chosen as the state space representation of the linear submodel $(A\ B\ C\ D)$ and a nonlinear function $\varphi(\cdot)$ indicating static nonlinearity defined by

$$
\begin{aligned}
x_{k+1} &= A x_k + B u_k + w_k,\\
y_k &= C x_k + D u_k,\\
z_k &= \varphi(y_k) + v_k.
\end{aligned}
\tag{1}
$$

It is assumed that (1) both $w$ and $v$ are zero-mean stationary stochastic processes with rational spectral density; (2) the nonlinear function $\varphi(\cdot)$ is differentiable, and hence it can be approximated by a set of polynomials; (3) the physical parameters in RTP vary slowly during the processing and within a small range; and (4) the RTP system was treated as a three-input and three-output system in our experiment. Now the identification problem is: given the power consumption of each lamp (the model input $u_k$) and the temperature of the wafer (measured output $\hat{z}_k = [\hat{z}_{k,1}, \hat{z}_{k,2}, \hat{z}_{k,3}]$), determine a state space realization $(A\ B\ C\ D)$ and a parametric estimation of the static nonlinearity. Since it is impossible to measure the internal signal $y_k$, let $\hat{y}_k = [\hat{y}_{k,1}, \hat{y}_{k,2}, \hat{y}_{k,3}]$ be the estimated output of the linear submodel, $y_k$, in the sequel.

3. Extended Kalman filter-based recursive identification

3.1. Canonical parameterization of the Wiener model

In general, the linear dynamics and static nonlinearity of the Wiener model cannot be independently identified because of the model's cascade structure (Billings & Fakhouri, 1982). We use a canonical parameterization of the two blocks. Since a scale factor can be arbitrarily distributed between the linear dynamics and the static nonlinearity without affecting the input–output characteristics of the model, the gain can be fixed in one of them. Let the Wiener model be parameterized in the pseudo-observability form (Ljung, 1997) of the linear submodel and the Chebyshev approximation of the static nonlinearity,

$$
\begin{aligned}
A(\theta_l) &= \operatorname{diag}[A_1, A_2, A_3],\\
B(\theta_l) &= [B_1, B_2, B_3],\\
C(\theta_l) &= \operatorname{diag}[C_1, C_2, C_3],\\
D(\theta_l) &= 0,\\
\varphi_i(\hat{y}_{k,i}) &= \gamma_{0,i} T_0 + \gamma_{1,i} T_1 + \cdots + \gamma_{p-1,i} T_{p-1}, \quad i = 1, 2, 3,
\end{aligned}
\tag{2}
$$

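Fig. 1. Schematic of the small RTP system.

Fig. 2. Wiener model (block diagram: $u_k$ enters the dynamic linear block, whose output $y_k$ enters the static nonlinear block to give $z_k$; $w_k$ and $v_k$ are the noise inputs).

where $A_i$, $B_i$, $C_i$ are defined as (with $\times$ marking a free parameter)

$$
A_i = \begin{bmatrix}
0 & 1 & \cdots & 0 & 0\\
\vdots & \vdots & \ddots & \vdots & \vdots\\
0 & 0 & \cdots & 0 & 1\\
\times & \times & \cdots & \times & \times
\end{bmatrix},
\quad
B_i = \begin{bmatrix} \times \\ \vdots \\ \times \end{bmatrix},
\quad
C_i = \begin{bmatrix} 1 & 0 & \cdots & 0 \end{bmatrix},
\quad i = 1, 2, 3,
$$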
and $\theta_l$ denotes the parameters of the linear block, $\gamma_{q,i}$ $(q = 0, 1, \ldots, p-1)$ the coefficients of the polynomials, and $T_q$ $(q = 0, 1, \ldots, p-1)$ the Chebyshev polynomials:

$$
\begin{aligned}
T_0 &= 1,\\
T_1 &= \hat{y}_{k,i},\\
T_2 &= 2\hat{y}_{k,i}^2 - 1,\\
&\;\;\vdots\\
T_{q+2}(\hat{y}_{k,i}) &= 2\hat{y}_{k,i}\, T_{q+1}(\hat{y}_{k,i}) - T_q(\hat{y}_{k,i}), \quad (q \ge 0).
\end{aligned}
\tag{3}
$$

The Chebyshev polynomials are usually defined on the basic domain $L = [-1, 1]$. If the output of the linear dynamic submodel $\hat{y}_{k,i}$ is not included in $L$, then a prior transformation of $\hat{y}_{k,i}$ from $[\bar{y} - \lambda, \bar{y} + \lambda]$ to $L$ is needed:

$$
\hat{y}'_{k,i} = (\hat{y}_{k,i} - \bar{y})/\lambda,
\tag{4}
$$

where $\lambda \ge (\hat{y}_{\max} - \hat{y}_{\min})/2$ and $\bar{y}$ is the average of $\hat{y}_k$.

3.2. Preliminary experiments

To help obtain a priori information about the model, such as the delay time, static gain, and the dominating time constant, we performed a preliminary step-response experiment at a set of selected temperature points over the whole wafer temperature range. From the results, we consider the variation of the dominating time constant with temperature, which is illustrated in Fig. 3 and is similar to that reported by Cho, Paulraj & Kailath (1994), to be the main nonlinearity of the RTP system. It leads to a significant model prediction error when a linear model is used to predict the wafer temperature over a large operating temperature range.

Based on the above preliminary step-response experiment, the initial parameter values of the Wiener model and the order of the dynamic linear submodel can be obtained by linear system identification using the subspace method (Overschee & Moor, 1996). To ensure linearity of the model, the experiment was designed so that the wafer temperature would not vary more than 30 °C from a reference temperature. From the SVD of the input–output data, shown in Fig. 4, we chose six as the order of the linear submodel and empirically let the order indices of each output signal be the same. Accordingly the quadruple $(A\ B\ C\ D)$ is of the form:

$$
\begin{aligned}
A &= \operatorname{diag}\left\{
\begin{bmatrix} 0 & 1\\ \times & \times \end{bmatrix},
\begin{bmatrix} 0 & 1\\ \times & \times \end{bmatrix},
\begin{bmatrix} 0 & 1\\ \times & \times \end{bmatrix}
\right\},\\
B &= [\times]_{6 \times 3},\\
C &= \operatorname{diag}\{[\,0\ \ 1\,], [\,0\ \ 1\,], [\,0\ \ 1\,]\},\\
D &= [0]_{3 \times 3}
\end{aligned}
\tag{5}
$$

and was calculated using the remaining procedure of the subspace method. Tenth-order Chebyshev polynomials were used for the approximation of the static nonlinearity. Although the parameters were valid for only a small temperature range, they provided an initial estimation of the Wiener model for further identification over a much wider range of wafer temperatures.

3.3. Direct parameter and state estimation over a wide range of wafer temperatures

Our aim was to be able to conveniently estimate and directly track the model parameters by adjusting them until they reached the minimum of the 2-norm of the deviation between the system output and the estimate. A systematic approach to this is to use the Extended Kalman Filter (EKF) (Ljung, 1979).

The EKF is based on a linearization of the state equations at each time step and the use of linear estimation theory (Kalman filter). For the Wiener model, this means linearizing the static nonlinearity $\varphi(\cdot)$ at each time step with respect to the output of the linear dynamic block:

$$
\hat{x}_{k+1} = A\hat{x}_k + B u_k + M_k (z_k - C_\varphi \hat{x}_k), \qquad \hat{x}_0 = 0.
\tag{6}
$$
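The Chebyshev expansion of Eqs. (2)–(4), together with the slope of the static nonlinearity that the linearization relies on, can be evaluated numerically. The following is a minimal sketch rather than the authors' implementation: it uses NumPy's `chebval` and `chebder`, the function names `scale_to_unit` and `phi_and_slope` are hypothetical, and the coefficients and scaling values in the example are made up for illustration.

```python
import numpy as np
from numpy.polynomial import chebyshev as cheb


def scale_to_unit(y_hat, y_bar, lam):
    """Map y_hat from [y_bar - lam, y_bar + lam] onto [-1, 1], as in Eq. (4)."""
    return (np.asarray(y_hat, dtype=float) - y_bar) / lam


def phi_and_slope(y_hat, gamma, y_bar, lam):
    """Return phi(y_hat) and d phi / d y_hat for one output channel.

    gamma holds the Chebyshev coefficients gamma_0 .. gamma_{p-1} of Eq. (2).
    """
    s = scale_to_unit(y_hat, y_bar, lam)
    value = cheb.chebval(s, gamma)
    # Differentiate the series in the scaled variable s, then apply the
    # chain rule: d s / d y_hat = 1 / lam.
    slope = cheb.chebval(s, cheb.chebder(gamma)) / lam
    return value, slope


if __name__ == "__main__":
    # Hypothetical coefficients and scaling, chosen only for illustration.
    gamma = np.array([450.0, 120.0, -15.0])
    value, slope = phi_and_slope(3.0, gamma, y_bar=2.0, lam=4.0)
    print(value, slope)
```

Differentiating the series analytically avoids a finite-difference step size; a numerical derivative would serve equally well as a cross-check.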

Fig. 3. Time constant with respect to temperature at the wafer center (curves: polynomial fitted time constant and experimental time constant; axis label: Time Constant (sec)).

Fig. 4. Singular value decomposition of the input–output data (log of singular values versus model order).
In Eq. (6), $M_k$ is a Kalman filter gain matrix that needs to be updated at each time sample and

$$
C_\varphi = C\,\operatorname{diag}(d_1, 0, d_2, 0, d_3, 0),
\tag{7}
$$

$$
d_i = \left.\frac{d\varphi_i(\hat{y}_i)}{d\hat{y}_i}\right|_{\hat{y}_i = \hat{y}_{k,i}}, \quad i = 1, 2, 3.
\tag{8}
$$

For the parameter updating, it is natural to interpret the EKF as an attempt to minimize the expected value of the squared residuals (or performance function) associated with the model parameters. Newton's method can be used to solve the problem, which converges quickly to the asymptote. Before that, several preliminaries on this method are described below:

3.3.1. Model error and performance function

The model error characterizes the discrepancy between measurement and model prediction output when the model is excited by the input signal $u_k$. The following scaled model error $\varepsilon(k) = [\varepsilon_1(k), \varepsilon_2(k), \varepsilon_3(k)]$ expresses the model quality more accurately than the model error without scaling:

$$
\varepsilon_i(k) = \frac{z_{k,i} - \hat{z}_{k,i}}{\ldots}, \quad i = 1, 2, 3.
\tag{9}
$$

The model error $\varepsilon(k)$ includes two parts: $\varepsilon_0(k)$, which corresponds to the process and output noise, and a parameter-dependent part $\varepsilon_p(k)$:

$$
\varepsilon(k) = \varepsilon_0(k) + \varepsilon_p(k, \hat{\theta}),
\tag{10}
$$

where $\hat{\theta}$ denotes the estimated parameters of the Wiener model. Since $\varepsilon_p(k, \hat{\theta})$ cannot be measured independently, instead of minimizing $\varepsilon_p(k, \hat{\theta})$ with the parameter estimate $\hat{\theta}$, a relative performance function of quadratic form in $\varepsilon(k)$ is minimized:

$$
V(\hat{\theta}) = \frac{1}{2} \sum_{k=1}^{N} \varepsilon(k)\,\varepsilon(k)^T.
\tag{11}
$$

The EKF here reduces the influence of $\varepsilon_0(k)$ (Ljung, 1997).

3.3.2. Hessian matrix and its inverse

The Hessian matrix and its inverse (Bishop, 1997) are basic to the formulation of second-order optimization methods as an alternative to updating the model parameters in the EKF. The Hessian matrix of the cost function $V(\hat{\theta})$, denoted by $H$, can be approximately determined from the model error gradient $W(k)$,

$$
W(k) = \frac{\partial \varepsilon(k)}{\partial \hat{\theta}}
\tag{12}
$$

and

$$
H = \frac{\partial^2 V(\hat{\theta})}{\partial \hat{\theta}^2} = \sum_{k=1}^{N} W(k)\,W^T(k).
\tag{13}
$$

The gradient $W(k)$ can be determined analytically in some cases, but can always be approximated by numerical differentiation.

Since an online method is our concern and, thus, a solution has to be calculated before all the data is obtained, singular value decomposition cannot be applied as it is in canonical variate analysis and other batch methods. Eq. (13) may then be rewritten in the form of a recursion as:

$$
H(k) = \sum_{j=1}^{k} W(j)\,W^T(j) = H(k-1) + W(k)\,W^T(k).
\tag{14}
$$

Consider the matrix identity

$$
(E + FG)^{-1} = E^{-1} - E^{-1} F (I + G E^{-1} F)^{-1} G E^{-1}.
\tag{15}
$$

Identify

$$
H(k-1) = E, \qquad W(k) = F, \qquad W^T(k) = G,
\tag{16}
$$

then apply Eq. (15) to Eq. (14) to obtain

$$
H^{-1}(k) = H^{-1}(k-1) - \frac{H^{-1}(k-1)\,W(k)\,W^T(k)\,H^{-1}(k-1)}{1 + W^T(k)\,H^{-1}(k-1)\,W(k)}.
\tag{17}
$$

Let

$$
H^{-1}(0) = \delta^{-1} I,
\tag{18}
$$

where $\delta$ is a small positive number and $I$ is the identity matrix. This form of initialization assures that $H^{-1}(k)$ is always positive definite.

With these preliminaries, the whole EKF-based procedure for estimating model parameters and states in RTP systems is described as follows:

Measurement updating:

$$
\begin{aligned}
M_k &= P_{k|k-1} C_\varphi^T (R + C_\varphi P_{k|k-1} C_\varphi^T)^{-1},\\
\hat{x}_{k|k} &= \hat{x}_{k|k-1} + M_k (z_k - C_\varphi \hat{x}_{k|k-1}),\\
P_{k|k} &= (I - M_k C_\varphi) P_{k|k-1},\\
\hat{y}_k &= C \hat{x}_{k|k},\\
d_i &= \left.\frac{d\varphi_i(\hat{y}_i)}{d\hat{y}_i}\right|_{\hat{y}_i = \hat{y}_{k,i}}, \quad i = 1, 2, 3,\\
C_\varphi &= C\,\operatorname{diag}(d_1, 0, d_2, 0, d_3, 0),\\
\hat{z}_k &= C_\varphi \hat{x}_{k|k},\\
Q_k &= C_\varphi Q C_\varphi^T.
\end{aligned}
\tag{19}
$$
Time updating:

$$
\begin{aligned}
\hat{x}_{k+1|k} &= A \hat{x}_{k|k} + B u_k,\\
P_{k+1|k} &= A P_{k|k} A^T + Q_k.
\end{aligned}
\tag{20}
$$

Parameter updating:

$$
\hat{\theta}_{k+1} = \hat{\theta}_k - \eta H^{-1}(k)\,W(k).
\tag{21}
$$

In these equations,

• $\hat{x}_{k|k-1}$ is the estimate of $x_k$ given past measurements up to $z_{k-1}$.
• $\hat{x}_{k|k}$ is the updated estimate based on the last measurement $z_k$.
• $\eta$ is a step size for adjusting the convergence of the parameter update.
• $E[v(t)v(\tau)^T] = R\,\delta(t - \tau)$;
• $E[w(t)w(\tau)^T] = Q\,\delta(t - \tau)$

and the process noise covariance matrices $Q$ are also updated, since a large (or small) gain implies that the system is sensitive (or insensitive), thus, the process noise will have a strong (or weak) influence on the …

Fig. 5. The updating of the independent parameter in matrix A.

Table 1
Comparison of ARX model and Wiener model

Model               Performance function
Linear ARX model    1.9675e+003
Wiener model        1.7635e+002
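The recursive inverse of Eqs. (14)–(18), which feeds the parameter update of Eq. (21), can be checked numerically against a direct matrix inverse. The sketch below is illustrative only: the function names are hypothetical, the gradient vectors are random stand-ins rather than RTP data, and the parameter step is implemented exactly as Eq. (21) is printed.

```python
import numpy as np


def init_inverse_hessian(n_params, delta=1e-3):
    """Eq. (18): H^{-1}(0) = delta^{-1} I with a small positive delta."""
    return np.eye(n_params) / delta


def update_inverse_hessian(h_inv, w):
    """Eq. (17): rank-one update of H^{-1} when H(k) = H(k-1) + w w^T."""
    w = np.asarray(w, dtype=float).reshape(-1, 1)
    hw = h_inv @ w                      # H^{-1}(k-1) W(k)
    wh = w.T @ h_inv                    # W^T(k) H^{-1}(k-1)
    denom = 1.0 + (w.T @ hw).item()     # scalar 1 + W^T H^{-1} W
    return h_inv - (hw @ wh) / denom


def parameter_step(theta, h_inv, w, eta=1.0):
    """Eq. (21) as printed: theta_{k+1} = theta_k - eta * H^{-1}(k) W(k)."""
    return np.asarray(theta, dtype=float) - eta * (h_inv @ np.asarray(w, dtype=float))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, delta = 4, 1e-3
    h_inv = init_inverse_hessian(n, delta)
    h = delta * np.eye(n)               # H(0) implied by Eq. (18)
    for _ in range(50):
        w = rng.standard_normal(n)      # stand-in gradient, not RTP data
        h += np.outer(w, w)             # batch form, Eq. (14)
        h_inv = update_inverse_hessian(h_inv, w)
    # Compare the recursion with a direct inverse of the accumulated H.
    print(np.allclose(h_inv, np.linalg.inv(h)))
```

Each update costs on the order of n² operations for n parameters, instead of the n³ of a fresh inversion, which is what makes the scheme suitable for online use.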

4. Experimental results and implication for process control

The dynamics between the three lamps and the wafer temperatures were tested over a large operating temperature range from 350 to 600 °C using a small custom RTP system. PRBS (pseudo-random binary sequence) signals were designed as input signals based on preliminary experiments. The average switch time of the PRBS signals was 350 s, leaving enough time for a significant change to be registered by the thermocouples on the wafer. All experimental data were recorded with a sampling frequency of 1 Hz.

Both a linear ARX model and a nonlinear Wiener model were identified for comparison. The parameters of the ARX model were updated online using RLS. In total, 13000 data points were simulated and the parameters of the Wiener model converged within 3000 points. Fig. 5 shows the updating of the independent parameters in matrix A during the identification. The performance functions of the models are shown in Table 1. The model fits (center temperature of the wafer) are plotted in Figs. 6 and 7. Clearly, the model fit error is greatly reduced by the Wiener model with an online EKF-based parameter estimator: the performance function drops from 1.9675e+003 for the linear ARX model to 1.7635e+002 for the Wiener model, roughly an elevenfold reduction.

The fit of a static nonlinear function for the center temperature of the wafer at the last step during the identification is shown in Fig. 8. The static nonlinearities for the middle and edge temperature of the wafer are similar to that for the center.

Fig. 6. Fits of the Wiener model output and measured output (plot title: Identification result, real output ".", model output "-"; panel labels recovered: Center, Edge; horizontal axis: time).

The identified Wiener model immediately suggests a control strategy. Since the nonlinearity is static and assumed to be differentiable, control of the Wiener system is equivalent to control of the output of the linear subsystem. During the model parameter estimation, the EKF simultaneously estimated the states of the linear dynamic subsystem of the Wiener
model. This will allow a linear control technique to be applied by using state feedback. A block diagram of the control of Wiener-type systems is shown in Fig. 9, where a linear adaptive regulator is designed to control the output of a linear subsystem with respect to the parameter variation in the RTP system. The details of controller design and its analysis are left for future research.
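The reference path of Fig. 9, in which the temperature reference is passed through the inverse of the static nonlinearity before reaching the linear regulator, can be sketched as a table lookup. This is only an illustration under an extra assumption: the paper requires the nonlinearity to be differentiable, whereas the sketch additionally assumes it is strictly increasing over the operating range so that the inverse exists. The function names and the example map are hypothetical.

```python
import numpy as np


def invert_static_map(phi, y_lo, y_hi, z_ref, n_grid=2001):
    """Approximate y_ref = phi^{-1}(z_ref) by interpolating a lookup table.

    Assumes phi is strictly increasing on [y_lo, y_hi]; that monotonicity
    is an assumption of this sketch. References outside the tabulated
    range are clamped to the end points by np.interp.
    """
    y_grid = np.linspace(y_lo, y_hi, n_grid)
    z_grid = phi(y_grid)
    if np.any(np.diff(z_grid) <= 0.0):
        raise ValueError("phi must be strictly increasing on the grid")
    return np.interp(z_ref, z_grid, y_grid)


def regulator_error(phi, y_lo, y_hi, z_ref, y_hat):
    """Error seen by the linear regulator: e = phi^{-1}(ref) - y_hat."""
    return invert_static_map(phi, y_lo, y_hi, z_ref) - y_hat


if __name__ == "__main__":
    # Hypothetical static map in the scaled variable, for illustration only.
    def phi(y):
        return 450.0 + 120.0 * y + 15.0 * y**3

    e = regulator_error(phi, -1.0, 1.0, z_ref=500.0, y_hat=0.2)
    print(e)
```

With the reference expressed in the coordinates of the linear block, the regulator itself can be designed with ordinary linear state-feedback tools, using the state estimate supplied by the EKF.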



5. Conclusions

In this paper, a predictive model for RTP has been identified using a nonlinear Wiener model with an online EKF-based parameter estimator. It has been shown that this model is superior to a linear model in capturing the RTP dynamics, and the EKF-based online estimator for model parameters and states was implemented successfully.

The preliminary experiments were a coarse analysis to find out the magnitude or range of certain model structure parameters, such as dead-time and the nature of the nonlinearity. This increased the confidence of the user in doing the identification and supplied the iterative parameter estimation methods with reliable initial values.

The second step of identification over a wide range of wafer temperatures was formulated with respect to the EKF-based parameter optimization which converged quickly to the asymptote. Compared to other recent methods for RTP parametric identification, the proposed EKF-based identification method is more easily applied to adaptive control problems. The method finds systematic solutions for a block-oriented nonlinear model despite parameter variation and process noise.

The final goal of our work is to control the temperature uniformity across a wafer within a certain time period. With these positive results, in our future study we will design an adaptive controller to adjust the relative power of the RTP lamps to optimize the temperature uniformity of the wafer. In view of the fact that many real systems can be naturally described by a Wiener model as pointed out by Schetzen (1980), the method proposed in this paper can be applied to more general process identification.

Fig. 7. Fits of the linear ARX model output and measured output (horizontal axis: time).

Fig. 8. Polynomial fitted static nonlinearity (plot title: estimated nonlinear function; curves: polynomial fitted nonlinearity and real nonlinearity; horizontal axis: output of linear part, ×10^5).

Fig. 9. Control of Wiener-type systems (block labels: $\varphi(\ast)^{-1}$, adaptive regulator, Wiener system with its linear part; signals: ref, e, u, y, $z_k$, $\hat{x}_k$, $\hat{y}_k$).

Acknowledgements

The authors gratefully acknowledge the help provided by Professor Karl F. MacDorman (Osaka University) through his many useful discussions and constructive comments.

References

Acharya, N., Kirtikar, V., Shooshtarian, S., Hong, D., Timans, P. J., Balakrishnan, K. S., & Knutson, K. L. (2001). Uniformity optimization techniques for rapid thermal processing systems. IEEE Transactions on Semiconductor Manufacturing, 14(3), 218–226.
Billings, S. A., & Fakhouri, S. Y. (1982). Identification of systems containing linear dynamic and static nonlinear elements. Automatica, 18(1), 15–26.
Bishop, C. M. (1997). Neural networks for pattern recognition. Oxford: Oxford University Press.
Campbell, S. A., Ahn, K. H., Knutson, K. L., Liu, B. Y. H., & Leighton, J. D. (1991). Steady-state thermal uniformity and gas flow patterns in a rapid thermal processing chamber. IEEE Transactions on Semiconductor Manufacturing, 4(1), 14–19.
Cho, Y. M., & Gyuyi, P. (1997). Control of rapid thermal processing: A system theoretic approach. IEEE Transactions on Control Systems Technology, 5(6), 644–653.
Cho, Y. M., & Kailath, T. (1993). Model identification in rapid thermal processing systems. IEEE Transactions on Semiconductor Manufacturing, 6(3), 233–245.
Cho, Y. M., Paulraj, A., & Kailath, T. (1994). A contribution to optimal lamp design in rapid thermal processing. IEEE Transactions on Semiconductor Manufacturing, 7(1), 34–41.
Gyurcsik, R. S., Riley, T. J., & Sorrell, F. Y. (1991). A model for rapid thermal processing: Achieving uniformity through lamp control. IEEE Transactions on Semiconductor Manufacturing, 4(1), 9–13.
Jin, Y. C., & Hyun, M. D. (2001). A learning approach of wafer temperature control in a rapid thermal processing system. IEEE Transactions on Semiconductor Manufacturing, 14(1), 1–10.
Ljung, L. (1979). Asymptotic behavior of the extended Kalman filter as a parameter estimator for linear systems. IEEE Transactions on Automatic Control, AC-24, 36–50.
Ljung, L. (1997). System identification: Theory for the user. Englewood Cliffs, NJ: Prentice-Hall.
Overschee, P. V., & Moor, B. D. (1996). Subspace identification for linear systems. Dordrecht: Kluwer Academic Publishers.
Schetzen, M. (1980). The Volterra and Wiener theories of nonlinear systems. New York: Wiley.
Wang, J. X., & Spanos, C. J. (2002). Real-time furnace modeling and diagnostics. IEEE Transactions on Semiconductor Manufacturing, 15(4), 393–403.
