Optimum and Heuristic Data Path Scheduling Under Resource Constraints†

Cheng-Tsung Hwang, Yu-Chin Hsu, Youn-Long Lin
Department of Computer Science, Tsing Hua University
Hsinchu, Taiwan 30043, R.O.C.
27th ACM/IEEE Design Automation Conference, Paper 5.2
©1990 IEEE 0738-100X/90/0006/0065 $1.00

†This work was supported in part by the National Science Council, R.O.C. under grant no. NSC79-0404-E007-24.

ABSTRACT

This paper presents an integer linear programming model for the scheduling problem in high level synthesis under resource constraints. Extensive consideration is given to the following applications:

Multicycle operations with
    non-pipelined function units,
    pipelined function units;
Mutually exclusive operations;
Functional pipelining;
Loop folding;
Scheduling under bus constraint.

Using this model, we are able to solve all the benchmarks in the literature optimally in a few seconds. Besides the model, a new technique, called Zone Scheduling (ZS), is proposed to solve large size problems. ZS partitions the distribution graph into several zones and solves the problems they contain sequentially. A novel feature of this technique is that it schedules more than one control step at a time, allowing us to take a more global view of a scheduling problem.

1. INTRODUCTION

Operation scheduling and hardware allocation are the two major subtasks in data path synthesis. Operations are assigned to the appropriate control steps in the scheduling phase. (This is referred to as state binding in [1].) Function units, registers and interconnections are allocated for the data path in the allocation phase. These two subtasks are tightly interdependent.

Roughly speaking, operation scheduling determines the cost-speed trade-offs of a design. Once the operations are scheduled, the number and types of function units, the lifetimes of variables and the timing constraints are fixed. Thus, a good scheduler is very important to an automated data path synthesis system. We address in this paper the scheduling problem under various resource constraints such as the number of multipliers, adders and busses. We also take into account what will happen in the real world, including multi-cycle operations, mutually exclusive operations, pipelining, etc.

An integer linear programming model has been formulated to solve the above problems. Besides the model, a new technique, called Zone Scheduling (ZS), is proposed to solve large size problems. ZS partitions the distribution graph [2] into several zones and solves their problems sequentially. A novel feature of ZS is that it schedules more than one control step at a time. As compared to list scheduling techniques [3-5], which schedule one control step at a time, ZS solves a problem more globally.

This paper is organized as follows: Section 2 reviews the previous work. The integer linear programming (ILP) model for a scheduling problem under resource constraints is described in Section 3. The Zone Scheduling (ZS) method is introduced in Section 4. Section 5 extends the formulations to a number of variations. The experimental results are shown in Section 6. Finally, concluding remarks are made in Section 7.

2. PREVIOUS WORKS

The simplest scheduling technique is as soon as possible (ASAP) scheduling [6], where the operations in the control and data flow graph (CDFG) are scheduled into control steps from the first control step to the last. An operation is scheduled to the next control step only if all its predecessors have been scheduled.

List scheduling has been adopted in many systems [3-5]. In this scheme, ready operations are kept in an ordered list according to a heuristic priority function and are scheduled in order into the next control step until the number of scheduled operations exceeds the number of resources.

In force-directed scheduling (FDS) [2], "force" values are calculated for all operations at all feasible control steps. The pairing of operation and control step that has the most attractive force is selected and assigned. A force-directed list scheduling (FDLS) [7] has been proposed recently for scheduling under resource constraints. It employs "force" values to determine its priority functions.

Since the above methods assign operations to control steps one at a time, their results depend strongly on the order of the assignments. An integer linear programming formulation is able to solve the problem globally. The following systems are examples of this approach.

An integer programming model to formulate a digital logic is found in [8]. The model gives detailed specifications for a data path. However, since too much complexity is involved in digital logic synthesis, only a small problem can be solved.

An integer linear programming approach is proposed for microcode scheduling in CATHEDRAL-II [9]. The model contains data precedence, resource conflict and controller pipelining constraints. Since excessive CPU time is required to solve large problems, the model has been replaced by a graph-based heuristic scheduling algorithm.
Spaid [10] incorporates a linear programming formulation to group the operations into clusters. Operations are then scheduled into control steps by a heuristic method.

ALPS [11], which is the prototype for our present research, is the first realistic integer linear programming approach for data path scheduling. Because of its efficient constraint equations, this method obtains optimal solutions for middle size problems in acceptable run time.

Among the above approaches, list scheduling and FDLS are resource-constrained schedulers, while the FDS and ILP approaches in [11] are time-constrained schedulers.

3. AN ILP FORMULATION UNDER RESOURCE CONSTRAINTS

A resource-constrained scheduling problem is to find a fastest schedule under a given set of resources. In general, the resources given are the number of function units, such as adders, multipliers, ALUs and busses.

Our proposed scheduling approach includes list scheduling, ASAP, ALAP and ILP. List scheduling determines the upper limit on the number of control steps for the optimal solution; ASAP and ALAP determine the minimum and maximum start times of each operation; while ILP minimizes the number of control steps by solving an integer linear programming formulation.

The ILP formulation is presented here in detail. For simplicity of explanation, two assumptions are made:
(1) each operation is assumed to have a one-cycle propagation delay; and
(2) only non-pipelined data paths are considered.
Other general considerations will be discussed in the later sections.

Suppose the data flow graph (DFG) contains $n$ operations and is going to be scheduled into $s$ steps, where $s$ is obtained by list scheduling. Each of the operations is labeled $o_i$, where $1 \le i \le n$. A precedence relation between two operations $o_i$ and $o_j$ is denoted as $o_i \to o_j$, where $o_i$ is the immediate predecessor of $o_j$. The minimum start time (ASAP) and the maximum start time (ALAP) of $o_i$ are $S_i$ and $L_i$, respectively. There are $m$ types of function units available. A function unit of type $t_k$ is named $FU_{t_k}$ and the number of function units of type $t_k$ is $M_{t_k}$. A relation between an operation $o_i$ and a function unit $FU_{t_k}$ is denoted by $o_i \in FU_{t_k}$, if $FU_{t_k}$ can perform the function of $o_i$.

The variables used in the formulation are:
(1) $x_{ij}$ are 0-1 integer variables associated with $o_i$, where
    $x_{ij} = 1$, if $o_i$ is scheduled into step $j$;
    $x_{ij} = 0$, otherwise.
(2) $T_i$ are integer variables which denote the control step where $o_i$ is scheduled.
(3) $C_{step}$ is an integer variable which equals the total number of control steps required.

The problem can now be formulated:

$$\text{Minimize } C_{step} \tag{1}$$

subject to

$$\sum_{o_i \in FU_{t_k}} x_{ij} - M_{t_k} \le 0, \quad \text{for } 1 \le j \le s,\ 1 \le k \le m; \tag{2}$$

$$\sum_{j=S_i}^{L_i} x_{ij} = 1, \quad \text{for } 1 \le i \le n; \tag{3}$$

$$\sum_{j=S_i}^{L_i} j \cdot x_{ij} - T_i = 0, \quad \text{for } 1 \le i \le n; \tag{4}$$

$$T_i - T_j \le -1, \quad \text{for all } o_i \to o_j; \text{ and} \tag{5}$$

$$T_i - C_{step} \le 0, \quad \text{for all } o_i \text{ without successors.} \tag{6}$$

The objective function in (1) states that we are going to minimize the total number of control steps. Constraint (2) states that no schedule should have a control step containing more than $M_{t_k}$ function units of type $t_k$. It is clear that $o_i$ can only be scheduled into a step between $S_i$ and $L_i$; this is reflected in (3). The time variables are defined in constraint (4). Constraint (5) ensures that the precedence relations of the DFG are preserved. No operation should be scheduled after $C_{step}$, as described in constraint (6).

The number of variables and equations in our formulation grows as $O(s \cdot n)$ and $O(s \cdot m + n)$, respectively. The run time is more sensitive to the number of variables than to the number of equations.

4. ZONE SCHEDULING METHOD

In this section we present a new scheduling method, called Zone Scheduling (ZS), to solve the ILP formulation of a large size problem. Suppose we divide the distribution graph into $n$ zones, $z_1, z_2, \ldots, z_n$. ZS will solve the first zone followed by an updating of the distribution graph, then solve the second zone, and so on until all the operations are solved. The idea behind ZS is that if the range of the minimum and maximum start times of an operation falls completely inside a zone, this operation must be scheduled within the zone. If it crosses zones, it can either be scheduled immediately or delayed to a subsequent zone.

For an operation whose time frame crosses a zone boundary, a new 0-1 variable, called a delay variable, is introduced to represent the possibility that it will be scheduled in the next zone. Given resource constraints, our goal is to minimize the summation of weighted delay variables.

Let us assume the zone considered extends from control step $C_{start}$ to $C_{end}$. Before we introduce the formulation, the following variables are defined:
(1) $d_i$: a delay variable introduced for $o_i$ if its time frame crosses a zone boundary, i.e. $S_i \le C_{end} < L_i$.
    $d_i = 1$, if $o_i$ is delayed to the next zone.
    $d_i = 0$, otherwise.
(2) $C_{zn}$: the set of operations whose time frames cross the zone boundary; i.e. $C_{zn} = \{\, o_i \mid S_i \le C_{end} < L_i \,\}$.
(3) $w_i$: a weight given to $o_i$ if $o_i \in C_{zn}$.
(4) $R_{zn}$: the set of operations whose time frame falls completely inside the zone; i.e. $R_{zn} = \{\, o_i \mid C_{start} \le L_i \le C_{end} \,\}$.
(5) $c_i$: a constant.
    $c_i = 1$, if $o_i \in C_{zn}$.
    $c_i = 0$, if $o_i \in R_{zn}$.
(6) $S'_i = \max(C_{start}, S_i)$.
(7) $L'_i = \min(C_{end}, L_i)$.
(8) $T'_i$: the control step if $o_i$ is scheduled in the zone; $T'_i = 0$, otherwise.
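The zone definitions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names `asap_alap` and `zone_sets` are hypothetical, every operation takes one cycle, and the edge list describes an assumed 11-operation data flow graph scheduled into 5 steps, with the first zone covering steps 1 and 2.

```python
# Hypothetical sketch: time frames and zone sets for Zone Scheduling.
# EDGES is an assumed unit-delay data flow graph, not data taken from a tool.

def asap_alap(n_ops, edges, n_steps):
    """Return (S, L): minimum and maximum start step of operations 1..n_ops."""
    preds = {i: [] for i in range(1, n_ops + 1)}
    succs = {i: [] for i in range(1, n_ops + 1)}
    for a, b in edges:
        preds[b].append(a)
        succs[a].append(b)
    S, L = {}, {}

    def asap(i):
        # One step after the latest predecessor; step 1 if there is none.
        if i not in S:
            S[i] = 1 + max((asap(p) for p in preds[i]), default=0)
        return S[i]

    def alap(i):
        # One step before the earliest successor; last step if there is none.
        if i not in L:
            L[i] = min((alap(q) for q in succs[i]), default=n_steps + 1) - 1
        return L[i]

    for i in preds:
        asap(i)
        alap(i)
    return S, L


def zone_sets(S, L, c_start, c_end):
    """Return (C_zn, R_zn, frames) where frames[i] = (S'_i, L'_i)."""
    crossing = sorted(i for i in S if S[i] <= c_end < L[i])      # C_zn
    inside = sorted(i for i in S if c_start <= L[i] <= c_end)    # R_zn
    frames = {i: (max(c_start, S[i]), min(c_end, L[i]))
              for i in crossing + inside}
    return crossing, inside, frames


EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
S, L = asap_alap(11, EDGES, 5)
crossing, inside, frames = zone_sets(S, L, 1, 2)
print(crossing)  # [3, 6, 7, 8, 9, 10, 11]
print(inside)    # [1, 2]
```

Operations in `crossing` would each receive a delay variable, while those in `inside` must be placed within the zone; operations whose time frame starts after the zone (4 and 5 in this assumed graph) are left for later zones.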
The problem can now be formulated:

$$\text{Minimize } \sum_{o_i \in C_{zn}} w_i \cdot d_i$$

subject to

$$\sum_{o_i \in FU_{t_k}} x_{ij} - M_{t_k} \le 0, \quad \text{for } C_{start} \le j \le C_{end},\ 1 \le k \le m; \tag{7}$$

$$\sum_{j=S'_i}^{L'_i} x_{ij} = 1, \quad \text{for all } o_i \in R_{zn}; \tag{8}$$

$$\sum_{j=S'_i}^{L'_i} x_{ij} + d_i = 1, \quad \text{for all } o_i \in C_{zn}; \tag{9}$$

$$\sum_{j=S'_i}^{L'_i} j \cdot x_{ij} - T'_i = 0, \quad \text{for all } o_i \in R_{zn} \cup C_{zn}; \text{ and} \tag{10}$$

$$T'_i + c_i \cdot (C_{end} + 1) \cdot d_i - T'_j - c_j \cdot (C_{end} + 2) \cdot d_j \le -1, \quad \text{for all } o_i \to o_j. \tag{11}$$

The objective function states that we are going to minimize the summation of weighted delay variables. The weights are determined by the importance of the delayed operations. They can be calculated by using urgency or mobility as in list scheduling. It is clear that no schedule should have a control step containing more than $M_{t_k}$ function units of type $t_k$; this is reflected in (7). Constraint (8) states that $o_i$ in $R_{zn}$ should be scheduled within $[S'_i, L'_i]$. Constraint (9) states that $o_i$ in $C_{zn}$ can either be scheduled in the range $[S'_i, L'_i]$ or delayed to the next zone. The time variables are defined in constraint (10). Constraint (11) ensures the data dependencies.

Consider the data flow graph [2] in Fig. 1. The available function units are two multipliers ($FU_{t_1}$) and two ALUs ($FU_{t_2}$); i.e., $M_{t_1} = M_{t_2} = 2$. The ALU is capable of performing addition, subtraction and comparison. Also suppose the list scheduling requires five steps to finish the scheduling under these constraints. The time frame of each operation, after the ASAP and ALAP schedulings are performed, is shown in Fig. 2. Suppose we divide the distribution graph into two zones, $z_1$ and $z_2$. $z_1$ contains control steps 1 and 2; $z_2$ contains steps 3 to 5.

Fig. 1. Data Flow Graph
Fig. 2. Distribution Graph of Zone 1 Scheduling
Fig. 3. Distribution Graph of Zone 2 Scheduling
Fig. 4. The Result

Consider now the first zone (Fig. 2). The set of operations crossing the zone is $C_{z_1} = \{ o_3, o_6, o_7, o_8, o_9, o_{10}, o_{11} \}$. Delay variables, $d_i$, and weights, $w_i$, are introduced for them. In this example, the weight used is a function of the mobility after the cut. The following is the integer programming formulation for $z_1$:

Minimize $w_3 d_3 + w_6 d_6 + w_7 d_7 + w_8 d_8 + w_9 d_9 + w_{10} d_{10} + w_{11} d_{11}$,
subject to
$x_{1,1} + x_{2,1} + x_{6,1} + x_{8,1} \le 2$;
$x_{1,2} + x_{2,2} + x_{3,2} + x_{6,2} + x_{7,2} + x_{8,2} \le 2$;
$x_{10,1} \le 2$; $x_{9,2} + x_{10,2} + x_{11,2} \le 2$;
$x_{1,1} + x_{1,2} = 1$; $x_{2,1} + x_{2,2} = 1$; $x_{3,2} + d_3 = 1$;
$x_{6,1} + x_{6,2} + d_6 = 1$; $x_{7,2} + d_7 = 1$;
$x_{8,1} + x_{8,2} + d_8 = 1$; $x_{9,2} + d_9 = 1$;
$x_{10,1} + x_{10,2} + d_{10} = 1$; $x_{11,2} + d_{11} = 1$;
$x_{1,1} + 2x_{1,2} - T_1 = 0$; $x_{2,1} + 2x_{2,2} - T_2 = 0$;
$2x_{3,2} - T_3 = 0$; $x_{6,1} + 2x_{6,2} - T_6 = 0$;
$2x_{7,2} - T_7 = 0$; $x_{8,1} + 2x_{8,2} - T_8 = 0$;
$2x_{9,2} - T_9 = 0$; $x_{10,1} + 2x_{10,2} - T_{10} = 0$;
$2x_{11,2} - T_{11} = 0$;
$T_1 - T_3 - 4d_3 \le -1$; $T_2 - T_3 - 4d_3 \le -1$;
$T_6 + 3d_6 - T_7 - 4d_7 \le -1$;
$T_8 + 3d_8 - T_9 - 4d_9 \le -1$; and
$T_{10} + 3d_{10} - T_{11} - 4d_{11} \le -1$.

By solving this formulation, we have $x_{1,1}$, $x_{2,1}$, $x_{3,2}$, $x_{6,2}$, $x_{10,1}$, $x_{11,2}$, $d_7$, $d_8$ and $d_9$ equal to 1. After we solve the formulation for zone 1, ALAP scheduling is applied to the remaining nodes in the DFG. Fig. 3 shows the distribution graph for $z_2$. Since $z_2$ is the last zone, the objective function must be to minimize the total number of steps. The formulation for $z_2$ becomes:

Minimize $C_{step}$,
subject to
$x_{7,3} + x_{8,3} \le 2$; $x_{7,4} + x_{8,4} \le 2$;
$x_{4,3} \le 2$; $x_{4,4} + x_{5,4} + x_{9,4} \le 2$; $x_{5,5} + x_{9,5} \le 2$;
$3x_{4,3} + 4x_{4,4} - T_4 = 0$; $4x_{5,4} + 5x_{5,5} - T_5 = 0$;
$3x_{7,3} + 4x_{7,4} - T_7 = 0$; $3x_{8,3} + 4x_{8,4} - T_8 = 0$;
$4x_{9,4} + 5x_{9,5} - T_9 = 0$;
$T_4 - T_5 \le -1$; $T_7 - T_5 \le -1$; $T_8 - T_9 \le -1$;
$T_5 - C_{step} \le 0$; and $T_9 - C_{step} \le 0$.

By solving this formulation, we have $x_{4,3}$, $x_{5,4}$, $x_{7,3}$, $x_{8,3}$, $x_{9,4}$ equal to 1 and $C_{step} = 4$. The solution is optimal (Fig. 4).
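The result of the worked example can be cross-checked by exhaustive search, as in the following minimal Python sketch. It is not the LINDO-based procedure used in this paper: it simply enumerates every assignment of operations to steps inside their time frames and keeps the shortest feasible one. The names `feasible`, `min_steps` and `REPORTED` are hypothetical, and the edge list, operation types and time frames are assumptions read off the constraints of the example (unit delays, two multipliers, two ALUs, five steps).

```python
from itertools import product

# Assumed example data: precedence edges, unit types and (S_i, L_i) frames.
EDGES = [(1, 3), (2, 3), (3, 4), (4, 5), (6, 7), (7, 5), (8, 9), (10, 11)]
MULT = {1, 2, 3, 6, 7, 8}      # operations executed on multipliers
ALU = {4, 5, 9, 10, 11}        # operations executed on ALUs
FRAMES = {1: (1, 2), 2: (1, 2), 3: (2, 3), 4: (3, 4), 5: (4, 5), 6: (1, 3),
          7: (2, 4), 8: (1, 4), 9: (2, 5), 10: (1, 4), 11: (2, 5)}
LIMIT = 2                      # units available of each type


def feasible(T):
    """Check precedence (constraint 5) and resource limits (constraint 2)."""
    if any(T[a] >= T[b] for a, b in EDGES):
        return False
    for group in (MULT, ALU):
        for step in set(T.values()):
            if sum(1 for i in group if T[i] == step) > LIMIT:
                return False
    return True


def min_steps():
    """Enumerate all assignments within the frames; return (C_step, schedule)."""
    ops = sorted(FRAMES)
    ranges = [range(FRAMES[i][0], FRAMES[i][1] + 1) for i in ops]
    best = None
    for choice in product(*ranges):
        T = dict(zip(ops, choice))
        if feasible(T):
            length = max(T.values())
            if best is None or length < best[0]:
                best = (length, T)
    return best


# Schedule obtained by combining the zone 1 and zone 2 solutions above.
REPORTED = {1: 1, 2: 1, 3: 2, 6: 2, 10: 1, 11: 2, 7: 3, 8: 3, 4: 3, 5: 4, 9: 4}

print(feasible(REPORTED), max(REPORTED.values()))  # True 4
print(min_steps()[0])                              # 4
```

Enumeration is only practical for a graph of this size (fewer than 100,000 candidate assignments here); it serves as a sanity check that no three-step schedule exists and that the combined zone solutions meet every constraint.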
The number of variables in the formulation grows as $O(d \cdot n')$, where $d$ is the number of control steps considered in each zone and $n'$ is the number of operations whose time frame intersects the zone.

5. GENERALIZATIONS

We have generalized the ILP formulations to the following variations:
(1) Scheduling with
    a) Chaining
    b) Multi-cycle operations by non-pipelined function units
    c) Multi-cycle operations by pipelined function units
(2) Functional pipelining
(3) Loop folding
(4) Mutually exclusive operations
(5) Scheduling under bus constraint
(6) Minimizing lifetimes of variables

The formulations for scheduling with chaining, multi-cycle operations by non-pipelined function units and minimizing lifetimes of variables have been discussed in [11]. We concentrate on the other variations.

5.1. Multi-Cycle Operations with Pipelined Implementation

In a pipelined function unit, new input data can be initiated while previous data are still being computed. The time between two successive initiations is called latency. A pipelined function unit with a delay $d$ and a fixed latency $l$ can perform a new operation every $l$ cycles, where $l \le d$.

Since a pipelined function unit can be shared by the operations of any two steps $s_i$, $s_j$, where $|s_i - s_j|$ is an integer multiple of $l$, we can group the control steps into clusters. Here a function unit can be shared by the operations of different steps within a cluster but cannot be shared by those in different clusters. Therefore, the number of function units needed is the total of the function units required by the clusters; the number of function units required by a cluster is the maximum number of function units of the steps within this cluster. The range of the control steps to be considered is within $[s_j - d + 1, s_j]$.

Therefore, for a pipelined function unit of type $t_k$, with a propagation delay $d_k$ and a fixed latency $l_k$, constraint (2) is modified as

$$\sum_{p=0}^{l_k - 1} \ \max_{0 \le q \le \lfloor (d_k - 1 - p)/l_k \rfloor} \Big( \sum_{o_i \in FU_{t_k}} x_{i,\, j - p - q \cdot l_k} \Big) - M_{t_k} \le 0, \quad \text{for } 1 \le j \le s,\ 1 \le k \le m. \tag{2.1}$$

The number of function units required by all the clusters is determined by the outer summation, while the maximum function and the inner summation determine the function units required by each cluster. Note that constraint (2.1) is not linear, but it can be transformed into a set of linear equations.

To ensure the data dependencies, constraints (5) and (6) are changed to

$$T_i - T_j \le -d_i, \quad \text{for all } o_i \to o_j; \text{ and} \tag{5.1}$$

$$T_i - C_{step} \le 1 - d_i, \quad \text{for all } o_i \text{ without successors;} \tag{6.1}$$

respectively.

For the ZS formulation, constraint (11) is changed to

$$T'_i + c_i \cdot (C_{end} + 1) \cdot d_i - T'_j - c_j \cdot (C_{end} + 1 + d_i) \cdot d_j \le -d_i, \quad \text{for all } o_i \to o_j. \tag{11.1}$$

If an operation cannot be finished within the zone, it may still occupy some resources. Therefore, the formulation of the next zone must take into account the cross-zone operations both in resource utilization and precedence relations.

5.2. Functional Pipelining (Pipelined Data Path)

A pipelined data path allows the execution of multiple tasks concurrently. Two consecutive tasks can be initiated with a certain interval, which is called the latency of the pipelined data path.

For a given latency $l$, the operations in control steps $j + p \cdot l$ ($p = 0, 1, 2, \ldots$) are executed simultaneously and cannot share the same function units. Consequently, constraint (2) is modified as

$$\sum_{p=0}^{\lfloor (s-j)/l \rfloor} \ \sum_{o_i \in FU_{t_k}} x_{i,\, j + p \cdot l} \le M_{t_k}, \quad \text{for } 1 \le j \le l,\ 1 \le k \le m. \tag{2.2}$$

5.3. Loop Folding

The ideas of loop folding [12] (or loop winding in [13]) and functional pipelining are very similar. The difference is that in loop folding there are data dependencies between loop iterations, while in functional pipelining there is no data dependency between different instances.

Let $o_i \xrightarrow{d} o_j$ denote a $d$-degree [12] data dependency between $o_i$ ($o_i \in FU_{t_k}$) and $o_j$, and let $T_j^d$ be the time where $o_j$ is executed $d$ iterations later. Suppose the loop length after folding is known to be $l$; we have $T_j^d = T_j + d \cdot l$. Therefore, a new set of constraints

$$T_i - T_j^d \le -d_k, \quad \text{for all } o_i \xrightarrow{d} o_j \tag{12}$$

or equivalently

$$T_i - T_j \le d \cdot l - d_k, \quad \text{for all } o_i \xrightarrow{d} o_j \tag{13}$$

is introduced into the previous formulations to enforce the data dependency between different loop iterations. The remaining constraints are the same as those for functional pipelining.

A loop folding problem is closely related to a retiming problem in synchronous circuits [14]. Retiming relocates the positions of the separating registers in the CDFG to obtain a shorter critical path and, hence, higher throughput. Algorithms [14] for retiming have been proposed to optimize synchronous circuits. In [12], after a retiming on the CDFG, scheduling is performed on the modified data flow graph. The separation of retiming and scheduling will produce a sub-optimal design. Our formulation for loop folding performs retiming while scheduling. Experiments show that not only is a minimum number of resources required, but the delay time and lifetimes of variables are also reduced (see Section 6.3).

5.4. Mutually Exclusive Operations

As in the case of structured programming, the relationships among a set of operations, $\theta$, can be represented as a tree where the internal nodes are of two types, XOR and AND, and the leaves are the operations. Let a node have $n$ sub-trees and the number of function units needed for each sub-tree be $NFU_1$,
$NFU_2$, ..., $NFU_n$. $NFU$ can be defined as follows:

$$NFU = \begin{cases} \max_{i=1}^{n} NFU_i & \text{if the node is an XOR node} \\ \sum_{i=1}^{n} NFU_i & \text{if the node is an AND node} \\ x_{ij} & \text{if the node is a leaf} \end{cases}$$

Let $NFU_{t_k, j}(\theta)$ be the number of function units of type $t_k$ required at control step $j$, as given by the above function. Constraint (2) is changed to

$$NFU_{t_k, j}(\theta) \le M_{t_k}, \quad \text{for } 1 \le j \le s,\ 1 \le k \le m. \tag{2.3}$$

5.5. Scheduling under Bus Constraint

In a bussed architecture, when more than one operation sharing a common input variable is scheduled into the same control step, the number of busses needed for that variable is only one (via broadcasting). Thus, the number of busses required at a control step equals the number of distinct input variables of all the operations assigned to this step. (The hypothesis is that the read/write phases are interleaved and the number of reads is more than that of writes at any step.) Suppose the input variables at control step $j$ are $v_1, v_2, \ldots, v_{|V|}$. We introduce a 0-1 integer variable $y_{rj}$ for $v_r$ ($1 \le r \le |V|$) at step $j$, where $y_{rj}$ is 1 if $v_r$ is accessed at this step and 0 if not accessed. We have the constraint that the number of $y_{rj}$ assigned to be 1 does not exceed the number of busses, i.e.

$$\sum_{r=1}^{|V|} y_{rj} \le N_{bus}, \quad \text{for } 1 \le j \le s. \tag{14}$$

Since the transfer of variables during a control step ($y_{rj}$) is directly related to the assignment of operations to a control step ($x_{ij}$), we have to define the relationship between them. Let $v_r$ be a shared input of a group of $r_n$ operations, $o_{r_1}, o_{r_2}, \ldots,$ and $o_{r_n}$. The value of $y_{rj}$ is defined as follows: if $x_{r_1 j} = x_{r_2 j} = \cdots = x_{r_n j} = 0$, then $y_{rj}$ is given a value of 0; otherwise, it is given a value of 1; i.e., $y_{rj} = OR(x_{r_1 j}, x_{r_2 j}, \ldots, x_{r_n j})$. The following constraint is included to satisfy the definition of $y_{rj}$:

$$\sum_{i=1}^{r_n} x_{r_i j} - r_n \cdot y_{rj} \le 0, \quad \text{for } 1 \le r \le |V|,\ 1 \le j \le s. \tag{15}$$

Note that when $r_n = 1$, $y_{rj}$ in (14) can be directly replaced by $x_{ij}$. The corresponding equation in (15) is omitted.

6. EXPERIMENTAL RESULTS

The system called ALPS has been implemented and tested. The programs for list scheduling, ASAP, ALAP, and ILP formulations are written in C on a VAX 11/8550 running the ULTRIX operating system, and the ILP formulation is solved using the LINDO [15] package on a VAX 11/8800 running the VMS operating system. LINDO starts with an optimal linear programming solution and produces an optimal integer solution using the branch-and-bound method. The fifth order wave filter, which was borrowed from [16], is given to illustrate various requirements. It contains 26 additions and 8 multiplications. As most systems do, we suppose a multiplication takes 2 cycles while an addition takes 1 cycle to complete. The critical path length is 17 cycles. The run time for the various experiments of the example depends on the number of 0-1 variables and is within tens of seconds.

6.1. Non-pipelined data path

Tables 1 and 2 show the results with a non-pipelined data path. The multiplier can be non-pipelined (Table 1) or pipelined (Table 2). We also take into account the cost of busses. All the results are optimal.

Table 1: Non-pipelined Data Path (non-pipelined multiplier)

Table 2: Non-pipelined Data Path (pipelined multiplier)

6.2. Functional pipelining (pipelined data path)

The data flow graph of the fifth order filter is used to test functional pipelining. (There are no data dependencies between iterations; i.e. the outputs of the data flow graph will not feed back into the inputs.) We have achieved the minimal number of resources for each latency and we have also minimized the delay time. The results are shown in the first and second parts of Table 3 for non-pipelined and pipelined multipliers, respectively. The third part of Table 3 shows the results of [17], where the maximum delay is set at 10 cycles. Note that in their implementation, the cycle time is longer so that a multiplication or two additions can be executed within a single control step.

6.3. Loop folding

The critical path length of the fifth order filter can be reduced to 16 cycles after loop folding or retiming while preserving the inter-iteration data precedences. Tables 4 and 5 show the minimal sample period (= loop length) and delays using a non-pipelined multiplier and a pipelined multiplier. Here, delay means the number of control steps required for the entire DFG to be executed. Although delay time is ignored by other systems, we are concerned with it and try to minimize it for two reasons. First, whereas the sample period corresponds to the throughput of the system, the delay time is directly related to the turnaround time, which is one of the most important performance criteria. Second, a longer delay increases the lifetimes of the variables. Thus, minimizing the delay time will potentially reduce the register cost. Note that Spaid first retimed the DFG, then performed a scheduling to find the loop length (called clock cycle in Spaid). Our loop
folding technique performs retiming and scheduling simultaneously, which makes a better solution possible. Our scheduler is also able to make a scheduling under the self-timed [12] requirement.

Table 3: Fifth order Filter with Pipelined Data Path (three parts: non-pipelined multiplier, pipelined multiplier, and result of [17]; rows: latency, adders, multipliers, delay)

Table 4: Loop Folding (footnote †: applies when the self-timed requirement is imposed)

Table 5: Loop Folding (columns: Spaid, ALPS; rows: sample period, delay)

7. CONCLUSION

In this paper, we have presented a new approach for scheduling in data path synthesis under resource constraints. Our approach includes list scheduling, ASAP, ALAP and ILP. With it, we are able to solve all the benchmarks optimally in a few seconds. In addition to the model, a new technique, called Zone Scheduling, is proposed to solve large size problems. This method schedules a block of control steps at one time, allowing us to take a more global view of the scheduling problem. Excellent results are obtained when using it to solve a large size problem.

REFERENCES

[1] D.D. Gajski, N.D. Dutt and B.M. Pangrle, "Silicon Compilation (Tutorial)", Proceedings of the IEEE 1986 Custom Integrated Circuits Conference, Rochester NY, pp. 102-110, May 1986.
[2] P.G. Paulin and J.P. Knight, "Force-Directed Scheduling in Automatic Data Path Synthesis", Proc. of the 24th Design Automation Conference, pp. 195-202, June 1987.
[3] B.M. Pangrle and D.D. Gajski, "State Synthesis and Connectivity Binding for Microarchitecture Compilation", Proc. of ICCAD-86, pp. 210-213, November 1986.
[4] E.F. Girczyc, "Automatic Generation of Microsequenced Data Path to Realize ADA Circuit Description", Ph.D. Thesis, Carleton Univ., 1984.
[5] N. Park and A.C. Parker, "Sehwa: A Software Package for Synthesis of Pipelines from Behavioral Specifications", IEEE T-CAD, pp. 356-370, March 1988.
[6] C.Y. Hitchcock and D.E. Thomas, "A Method of Automatic Data Path Synthesis", Proc. of the 20th Design Automation Conference, pp. 484-489, June 1983.
[7] P.G. Paulin and J.P. Knight, "Scheduling and Binding Algorithm for High Level Synthesis", Proc. of the 26th Design Automation Conference, pp. 1-6, June 1989.
[8] L. Hafer and A.C. Parker, "A Formal Method for the Specification, Analysis, and Design of Register-Transfer Level Digital Logic", Proc. of the 18th Design Automation Conference, pp. 546-553, June 1981.
[9] H. De Man, J. Rabaey, P. Six and L. Claesen, "Cathedral-II: A Silicon Compiler for Digital Signal Processing", IEEE Design and Test, pp. 13-25, December 1986.
[10] B.S. Haroun and M.I. Elmasry, "Architectural Synthesis for DSP Silicon Compiler", IEEE T-CAD, pp. 431-447, April 1989.
[11] Jiahn-Hung Lee, Yu-Chin Hsu and Youn-Long Lin, "A New Integer Linear Programming Formulation for the Scheduling Problem in Data Path Synthesis", Proc. of ICCAD-89, November 1989.
[12] G. Goossens, J. Vandewalle and H. De Man, "Loop Optimization in Register-Transfer Scheduling for DSP-System", Proc. of the 26th Design Automation Conference, pp. 826-831, June 1989.
[13] E.M. Girczyc, "Loop Winding - a Data Flow Approach to Functional Pipelining", Proceedings of the IEEE ISCAS, pp. 382-385, May 1987.
[14] F. Rose, C. Leiserson and J. Saxe, "Optimizing Synchronous Circuitry by Retiming", Proc. Caltech Conf. on VLSI, pp. 41-67, Computer Sci. Press, 1983.
[15] "LINDO: Linear INteractive and Discrete Optimizer for Linear, Integer, and Quadratic programming problems", LINDO Systems, Inc.
[16] S.Y. Kung, H.J. Whitehouse and T. Kailath, "VLSI and Modern Signal Processing", Prentice Hall, pp. 258-264, 1985.
[17] Ki Soo Hwang, Albert E. Casavant, Ching-Tang Chang and Manuel A. d'Abreu, "Scheduling and Hardware Sharing in Pipelined Data Paths", IEEE Proc. Int. Conf. Computer-Aided Design, pp. 24-27, November 1989.
