
Hardware-Software Co-partitioning for Distributed Embedded Systems

Outline
1. Introduction
2. Related Work
3. Distributed Embedded System and System Model
4. Multi-Level Partitioning
5. Case Study

1. Introduction

Hardware-Software Codesign
Distributed Embedded System
Motivation

Task Graph
Physical Restrictions

Distributed Embedded System Codesign (DESC)

Object Modeling Technique (OMT)


Linear Hybrid Automata (LHA)
SES Models

1. Introduction (cont)

Multi-Level Partitioning

Partitioning Algorithm
Sharing, Clustering

Case Studies

2. Related Work

Target Embedded System

1-CPU and 1-ASIC Topology


n-CPU and m-ASIC Topology

Optimal Codesign
Heuristic Codesign

2. Related Work (cont)

Codesign of 1-CPU and 1-ASIC Topology

Kumar et al. 1993


Kalavade and Lee 1993
Thomas et al. 1993
Gupta and De Micheli 1993
Barros et al. 1994

2. Related Work (cont)

Codesign of n-CPU and m-ASIC Topology

Optimal Codesign Approaches:

Mixed integer linear programming


Prakash and Parker 1992

Exhaustive search
Wolf 1994, Haworth et al. 1993
D'Ambrosio and Hu 1994

2. Related Work (cont)

Heuristic Codesign Approaches:


Iterative and Constructive

Iterative:
Dick and Jha 1998 --- MOGAC, CORDS
Dick and Jha 1999 --- MOCYN

2. Related Work (cont)

Constructive:
Wolf 1996 --- object-oriented
Yen and Wolf 1996 --- sensitivity-driven
Dave, Lakshminarayana, and Jha 1999 --- COSYN
Dave and Jha 1999 --- COFTA
Dave and Jha 1998 --- COHRA
Our proposed approach: Distributed Embedded System Codesign (DESC)

3. Distributed Embedded Systems and System Models

An embedded computer system is a system that uses computers but is not a general-purpose computer.
In 1971, there were about 142,000 computers worldwide.
In 1999, there are some 350 to 400 million personal computers alone, and at least an order of magnitude more embedded devices.

3. Distributed Embedded Systems and System Models (cont)

There are several reasons to build a distributed hardware engine for an embedded system:

Cheaper
Faster response time
The devices under control may be physically distributed

3. Distributed Embedded Systems and System Models (cont)

System Models

Object Modeling Technique (OMT) Models

Object Model
Dynamic Model
Functional Model

3. Distributed Embedded Systems and System Models (cont)

Linear Hybrid Automata (LHA) Models

Internal system model
For verifying systems

SES Models

SES/workbench is a popular modeling and simulation tool for system performance evaluation

4. Multi-Level Partitioning

Multi-Level Partitioning (MLP)


Three Main Phases

Codesign Space Exploration (CSE)


System Structural Partitioning (SSP)
Binary Search Copartitioning (BSC)


Overall Flow Chart of Multi-Level Partitioning

CSE level
1. Initialization
2. Explore design space: select the number of CPUs and the hardware cost

SSP level
3. Generate structural partition: allocate CPUs to the distributed subsystems

BSC level
4. Copartitioning, followed by ASIC sharing, CPU sharing, hardware clustering, and software grouping
5. Last structural partition? If no, take the next structural partition (step 3)
6. Last design alternative? If no, return to step 2
7. Output the heuristically optimal partition

Detailed Flow Diagram of Multi-Level Partitioning

CSE level
1. Initialization: using the OMT models, place all objects of hardware parts into ILA and all other objects into MLA
2. Calculate the CPD ratio of each object in MLA
3. Sort all MLA objects in ascending order of their CPD ratios
4. Select the number of CPUs and the hardware cost (explore design space)

SSP level
5. Allocate CPUs to the distributed subsystems (generate structural partition)

BSC level (copartitioning)
6. Select the object with the median CPD ratio. Use software to implement all objects with CPD ratios not less than that of the selected median object, and hardware to implement all objects with CPD ratios less than that of the selected median object
7. Using the LHA models, check whether the partition result satisfies the system constraints:
   Cost constraint not satisfied, performance constraints satisfied: increase software objects and repeat step 6
   Cost constraint satisfied, performance constraints not satisfied: increase hardware objects and repeat step 6
   Cost and performance constraints satisfied: check whether the partition is a heuristically optimal solution. If not, increase software objects when cost is more important, or increase hardware objects when performance is more important
   No satisfactory partition: go to step 9
8. Store the structural partition result and perform sharing and clustering
9. Last structural partition? If no, take the next structural partition (step 5)
10. Last design alternative? If no, return to step 4
11. Partition found? If yes, output the least costly partition; if no, print "No partition"
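The binary search over the CPD-sorted object list can be sketched as follows. This is a minimal illustration, not the DESC implementation: the function name `binary_search_copartition`, the callback `evaluate`, and the rule "prefer more software once a feasible split is found" (cost treated as more important) are assumptions, as is the idea that moving objects to hardware never lowers cost and never worsens performance. The figures in `vpms_splits` are the cost and response-time values reported for partitions F, D, B, and A in the VPMS case study of this deck, checked against its stated constraints ($1,300, 14,000, 250).

```python
def binary_search_copartition(n_objects, evaluate):
    """Return k, the number of lowest-CPD objects put in hardware, or None.

    evaluate(k) must return (cost_ok, perf_ok) for the split where the
    first k objects of the ascending CPD order are implemented in hardware.
    """
    lo, hi = 0, n_objects
    best = None
    while lo <= hi:
        k = (lo + hi) // 2          # start from the median object
        cost_ok, perf_ok = evaluate(k)
        if cost_ok and perf_ok:
            best = k
            hi = k - 1              # assumed preference: try more software
        elif cost_ok:
            lo = k + 1              # performance violated: more hardware
        elif perf_ok:
            hi = k - 1              # cost violated: more software
        else:
            break                   # no satisfactory partition on this path
    return best


# Ascending CPD order in the case study: Sensor Driver, Counter, Motor Driver.
# k -> (partition, cost, display response, gate response) from the deck.
vpms_splits = {
    0: ("F", 1225, 13200, 1030.0),
    1: ("D", 1250, 13100, 215.0),
    2: ("B", 1280, 190, 215.0),
    3: ("A", 1450, 190, 0.2),
}


def evaluate_vpms(k):
    _, cost, display, gate = vpms_splits[k]
    return cost <= 1300, (display <= 14000 and gate <= 250)


best_k = binary_search_copartition(3, evaluate_vpms)
best_name = vpms_splits[best_k][0] if best_k is not None else None
print(best_k, best_name)
```

With three objects the search starts at the median (Counter), which corresponds to the split where only the Sensor Driver is in hardware.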

4. Multi-Level Partitioning (cont)

CPD: Cost-Performance Difference

$$\mathrm{CPD}(x) = \frac{\mathrm{Hardware\_Cost}(x) - \mathrm{Software\_Cost}(x)}{\left|\mathrm{Hardware\_Perf}(x) - \mathrm{Software\_Perf}(x)\right| / \mathrm{Perf\_Constraint}(x)}$$

where x is an object
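As a worked check of the CPD definition, the sketch below recomputes the ratio for the three VPMS parts using the cost and performance figures listed in the case study of this deck. The function and variable names are illustrative, and the pairing of each part with a performance constraint (250 for the gate-related parts, 14,000 for the Counter) is an assumption, chosen because it reproduces the CPD values reported in the deck.

```python
def cpd(hw_cost, sw_cost, hw_perf, sw_perf, perf_constraint):
    """Cost-Performance Difference of one object, following the CPD slide."""
    # Performance gap between hardware and software, normalized by the constraint.
    normalized_gain = abs(hw_perf - sw_perf) / perf_constraint
    if normalized_gain == 0:
        raise ValueError("hardware and software performance are identical")
    return (hw_cost - sw_cost) / normalized_gain


# (hardware cost, software cost, hardware perf, software perf, constraint)
# The constraint assigned to each part is an assumption (see lead-in).
vpms_parts = {
    "Sensor Driver": (115, 90, 210, 1030, 250),
    "Counter": (120, 90, 290, 13200, 14000),
    "Motor Driver": (260, 90, 820, 1030, 250),
}

cpd_values = {name: round(cpd(*row), 3) for name, row in vpms_parts.items()}

# Ascending CPD order, as used by the copartitioning step.
ranking = sorted(cpd_values, key=cpd_values.get)

print(cpd_values)
print(ranking)
```

A low CPD means hardware buys a large normalized performance gain for little extra cost, so low-CPD objects are the first candidates for hardware.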

4. Multi-Level Partitioning (cont)

CPU/ASIC Sharing

STD: Sharing Threshold Distance
SLI: Subsystem Location Inter-distance

Sharing: SLI ≤ STD
No sharing: SLI > STD

4. Multi-Level Partitioning (cont)

Interconnect Cost (IC) Model

$$\mathrm{IC}(X_1, X_2) = \alpha \times \mathrm{SLI}(S_1, S_2) \times \#\mathrm{Link}(X_1, S_2) \times \mathrm{BW}(X_1, S_2) + \mathrm{EC}(X_1)$$

SLI: Subsystem Location Inter-distance
S1 and S2: Subsystems
X1 and X2: A component (PE or ASIC)
α: A parameter that depends on the interconnection technology
#Link(X1, S2): The number of links between X1 and S2
BW(X1, S2): The communication bandwidth between X1 and S2
EC(X1): The cost for enhancing X1 such that both S1 and S2 can use X1.
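The interconnect cost model can be illustrated with a small function. This sketch assumes the terms of the model are multiplied together and scaled by the technology-dependent parameter (called `alpha` here), with the enhancement cost added at the end; the operators are not legible in every copy of the slide, so treat the product form as an assumption. All numeric inputs are made-up illustration values, not data from the case study.

```python
def interconnect_cost(alpha, sli, num_links, bandwidth, enhancement_cost):
    """IC = alpha * SLI * #Link * BW + EC (product form is an assumption)."""
    if min(alpha, sli, num_links, bandwidth, enhancement_cost) < 0:
        raise ValueError("all inputs must be non-negative")
    # Wiring cost grows with distance, link count, and bandwidth.
    wiring = alpha * sli * num_links * bandwidth
    # EC covers enhancing the component so both subsystems can use it.
    return wiring + enhancement_cost


# Hypothetical values: 2.0 m apart, 2 links, bandwidth 10.0, EC of 30.0.
example_ic = interconnect_cost(
    alpha=0.5, sli=2.0, num_links=2, bandwidth=10.0, enhancement_cost=30.0
)
print(example_ic)
```

Under this form, two co-located subsystems (SLI of zero) pay only the enhancement cost for a shared component.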

4. Multi-Level Partitioning (cont)

Algorithm 5.2 Share Components Algorithm

Share_Components(s) {
    /* s = <s1, s2, ..., sn>, where n is the number of subsystems and
       si = (si1, si2): si1 is the number of PEs in subsystem Si and
       si2 is the number of ASICs in subsystem Si; si1, si2 ∈ {0, 1, ...} */
    for (i = 1; i ≤ n; i++) {
        for (j = i; j ≤ n; j++) {
            /* Subsystems close enough to share components */
            if (SLI(Si, Sj) ≤ STD) {
                if (si1 ≠ 0 ∧ sj1 ≠ 0)
                    Share_PE(Si, Sj);      /* Refer to Algorithm 5.3 */
                if (si2 ≠ 0 ∧ sj2 ≠ 0)
                    Share_ASIC(Si, Sj);    /* Refer to Algorithm 5.4 */
            }
        }
    }
}
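The pairwise sharing test of Algorithm 5.2 can be sketched in Python as below. Several points are assumptions: the comparison is taken to be "SLI not greater than STD", both subsystems are required to hold at least one component of the kind being shared, and only distinct pairs are visited. Algorithms 5.3 and 5.4 (Share_PE and Share_ASIC) are not shown in this deck, so the sketch only records which pairs would be passed to them. The subsystem names and distances are hypothetical illustration values.

```python
from itertools import combinations


def share_components(subsystems, sli, std):
    """Return (pe_pairs, asic_pairs): subsystem pairs that qualify for sharing.

    subsystems maps a name to (number of PEs, number of ASICs).
    sli maps frozenset({a, b}) to the inter-distance between a and b.
    """
    pe_pairs, asic_pairs = [], []
    for a, b in combinations(sorted(subsystems), 2):
        if sli[frozenset((a, b))] > std:
            continue  # too far apart: no sharing
        pes_a, asics_a = subsystems[a]
        pes_b, asics_b = subsystems[b]
        if pes_a != 0 and pes_b != 0:
            pe_pairs.append((a, b))      # would call Share_PE here
        if asics_a != 0 and asics_b != 0:
            asic_pairs.append((a, b))    # would call Share_ASIC here
    return pe_pairs, asic_pairs


# Hypothetical layout: ENTRY and EXIT are 0.5 m apart, DISPLAY is 3.0 m away.
subsystems = {"ENTRY": (1, 1), "EXIT": (1, 1), "DISPLAY": (1, 0)}
sli = {
    frozenset(("ENTRY", "EXIT")): 0.5,
    frozenset(("DISPLAY", "EXIT")): 3.0,
    frozenset(("DISPLAY", "ENTRY")): 3.0,
}

pe_pairs, asic_pairs = share_components(subsystems, sli, std=1.0)
print(pe_pairs, asic_pairs)
```

Keying the distance table by `frozenset` makes the lookup independent of the order in which a pair is named.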

4. Multi-Level Partitioning (cont)

Hardware Clustering and Software Grouping

In DESC, hardware clustering is based on the basic Kernighan-Lin graph partitioning algorithm, enhanced to include DEMS characteristics.
Software grouping uses a technique similar to load balancing on multiple processors.

4. Multi-Level Partitioning (cont)

Analysis and Validation of MLP

Complexity analysis
MLP Init _ time BSC
MLP r r log r log r

[ sp( p ) ( Share _ time Cluster _ time)]

[ sp( p ) C 2 ( 2 max ( p k ) k r 2 )]

p 0 ,...,

p 0 ,...,

1 k p

r: the number of objects


: the number of subsystems

5. Case Studies

Vehicle Parking Management System (VPMS)
Examples of Sharing and Clustering in MLP
Application of MLP to Coal Mine System

5. Case Studies (cont)

Vehicle Parking Management System (VPMS)

VPMS Specifications

A VPMS consists of three subsystems: ENTRY management, EXIT management, and DISPLAY.
An ENTRY (or an EXIT) subsystem consists of three parts: a ticket facility, a gate controlled by a gate-motor, and a pair of sensors.
A DISPLAY subsystem

5. Case Study (cont)

Constraints for the VPMS system

A maximum cost of $1,300,
A maximum display response time of 14,000 μs, and
A maximum ENTRY (EXIT) gate response time of 250 μs.

5. Case Study (cont)

Specification and Mapping of VPMS

VPMS is described using OMT models consisting of Object, Dynamic, and Functional models.

Object Model of VPMS

[Figure: Object Model of VPMS. Objects shown: Vehicle Parking Management System; ENTRY Management System; EXIT Management System; Display System; Control System; Gate Controller; Control Unit; Time Stamp; Ticket Checker; Sensor; ENTRY Sensor; EXIT Sensor; ENTRY Gate; EXIT Gate; Motor; Counter; Display Device; 7-Segment; LCD; Dot Matrix; Send/Receive Device; Display Interface. Several objects are related by "isa" links.]

Dynamic Model of a DISPLAY Subsystem

[Figure: Dynamic Model of a DISPLAY Subsystem. States: Idle, Increment counter, Decrement counter, Read count, Update Display. Events and conditions: Car in; Car out; Push time stamp button; Count > 0, send ACK!; Count = 0, out of space.]

Functional Model of a DISPLAY Subsystem

[Figure: Functional Model of a DISPLAY Subsystem. Elements: ENTRY Sensor, EXIT Sensor, Increment Counter, Decrement Counter, Counter, Update Display. Data flows: Car in signal, Car out signal, Counter Data.]

5. Case Study (cont)


LHA Model of VPMS

Hardware LHA Model

Software LHA Model


Hardware LHA of a DISPLAY Subsystem

[Figure: Hardware LHA of a DISPLAY Subsystem. Initialization: Count := 500, t := 0. Locations: Idle, Increment Counter, Decrement Counter, Read Count, Update Display. Transition labels: Car in, t := 0; Car out, t := 0; Push time stamp button, t := 0; Count := Count + 1; Count := Count − 1. Timing conditions: t = 18ns, t = 42ns, t = 100ns.]

Software LHA of a DISPLAY Subsystem

[Figure: Software LHA of a DISPLAY Subsystem. Initialization: Count := 500, t := 0, x := 0. Locations: Polling, Increment Counter, Decrement Counter, Read Count, Update Display. Transition labels: Car in, t := 0; Car out, t := 0; Push time stamp button, t := 0; Count := Count + 1; Count := Count − 1. Timing conditions are placed on the clocks t and x, including t = 10ms and a 33ms bound on x.]

5. Case Study (cont)


SES Models

Using SES/workbench Model

A car-simulator

An ENTRY management subsystem

An EXIT management subsystem

A DISPLAY subsystem


5. Case Study (cont)

SES Model of a DISPLAY Subsystem


5. Case Study (cont)

Applying MLP to VPMS

Calculation of CPD for VPMS Parts

Part            Hardware  Software  Hardware     Software     CPD
                Cost      Cost      Performance  Performance
Sensor Driver   115       90        210          1,030        7.622
Counter         120       90        290          13,200       32.533
Motor Driver    260       90        820          1,030        202.381

5. Case Study (cont)

Applying MLP to the VPMS Example

Codesign Space Exploration (CSE) selects the number of CPUs, System Structural Partitioning (SSP) generates the partitions, and Binary Search Copartitioning (BSC) yields the cost, response times, and feasibility.

Number   Partition          Cost ($)  Response time (μs)   Response time (μs)  Feasibility
of CPUs                               (sensor to display)  (sensor to gate)
0        A(HC, HS, HM)      1,450     190                  0.2                 No
1        B(HC, HS, SM)      1,280     190                  215.0               Yes
2        C(HC, HS, SM^2)    1,370     13,200               820.0               No
2        D(SC, HS, SM)      1,250     13,100               215.0               Yes
3        E(SC^2, HS, SM)    1,340     13,100               210.0               No
3        F(SC, SS, SM)      1,225     13,200               1,030.0             No

H: hardware, S: software; second letter (subscript): C = Counter, S = Sensor Driver, M = Motor Driver; superscripts: 1 = one CPU, 2 = two CPUs, 3 = three CPUs
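The feasibility column and the final "output least costly partition" step can be reproduced with a short check. This is a sketch under stated assumptions: the `Partition` type and helper names are illustrative, response times are taken to share the unit of the stated constraints, and the rows for partitions C and E follow one reading of the flattened table in this deck.

```python
from typing import NamedTuple


class Partition(NamedTuple):
    name: str
    cost: float          # dollars
    display_time: float  # sensor-to-display response time
    gate_time: float     # sensor-to-gate response time


# Constraints stated for the VPMS system in this deck.
MAX_COST, MAX_DISPLAY_TIME, MAX_GATE_TIME = 1300, 14000, 250

partitions = [
    Partition("A", 1450, 190, 0.2),
    Partition("B", 1280, 190, 215.0),
    Partition("C", 1370, 13200, 820.0),  # row assignment is an assumption
    Partition("D", 1250, 13100, 215.0),
    Partition("E", 1340, 13100, 210.0),  # row assignment is an assumption
    Partition("F", 1225, 13200, 1030.0),
]


def is_feasible(p: Partition) -> bool:
    """True when cost and both response times meet their limits."""
    return (
        p.cost <= MAX_COST
        and p.display_time <= MAX_DISPLAY_TIME
        and p.gate_time <= MAX_GATE_TIME
    )


feasible = [p for p in partitions if is_feasible(p)]
least_costly = min(feasible, key=lambda p: p.cost) if feasible else None

print([p.name for p in feasible])
print(least_costly.name if least_costly else "No partition")
```

Only B and D pass all three limits, and D is the cheaper of the two, which matches the prototype chosen for emulation in the deck.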

5. Case Study (cont)

VPMS Emulation

[Figure: Block Diagram for Prototype D(SC, HS, SM). Blocks: Entry Sensor & Driver and Exit Sensor & Driver, each followed by Signal Processing; two Single-chip Processors (8751), each with an Interface; Time Stamp Machine; Ticket Checker; Display Device; Entry gate; Exit gate. Signals: Car in(i), Car out(i), Display scan data(o), Open(o) or Close(o), Push time stamp button(i), Ticket taken(i), Acknowledgment(o), Parking fees paid(i).]

5. Case Study (cont)

VPMS Emulation Results

Partition        Cost ($)  Power            Response time (μs)   Response time (μs)
                           Consumption (W)  (sensor to display)  (sensor to gate)
B(HC, HS, SM)    1278      4.76             180                  210
D(SC, HS, SM)    1240      4.20             13,000               210

5. Case Study (cont)

Examples of Sharing and Clustering in MLP

Sharing and clustering techniques in MLP are illustrated with several variants of the VPMS case study.
The examples also show how object-oriented modeling can be advantageous in hierarchical partitioning.
Coal mine control and monitoring system

Advantage of Sharing in MLP

Partitioning Results for three VPMS Specifications with and without Sharing

Specifications                 VPMS-1   VPMS-2   VPMS-3
STD (m)                        1.0      1.0      1.0
SLI(ENTRY, EXIT) (m)           6.0      0.5      0.8
SLI(Display, EXIT) (m)         7.0      3.0      0.5
SLI(Display, ENTRY) (m)        2.0      3.0      0.5

Partitioning Results

Number and Locations of PE
VPMS-1: 3: (1) ENTRY gate control, (2) EXIT gate control, (3) Display
VPMS-2: 2: (1) ENTRY/EXIT gate control, (2) Display
VPMS-3: 1: (1) ENTRY/EXIT/Display Subsystem

Number and Locations of ASIC
VPMS-1: 2: (1) ENTRY sensor control, (2) EXIT sensor control
VPMS-2: 1: (1) ENTRY/EXIT sensor
VPMS-3: 1: (1) ENTRY/EXIT/Display Subsystem Interface

                               VPMS-1   VPMS-2   VPMS-3
System Cost ($)                1,430    1,250    1,180
Display response time (μs)     13,200   13,200   14,020
Gate response time (μs)        210      210      1,030
MLP Execution Time (sec)       0.602    3.857    14.789

Advantage of Clustering in MLP

Partitioning Results for five VPMS Specifications with and without Clustering

VPMS-A
Subsystems: 1: (1) ENTRY/EXIT/Display Subsystem
PEs: 1: (1) Motor Driver/Counter
ASICs: 1: (1) Sensor Driver
System Cost ($): 1,180
Display response time (μs): 14,020
Gate response time (μs): 1,030

VPMS-B
Subsystems: 2: (1) ENTRY/EXIT Subsystem, (2) Display Subsystem
PEs: 2: (1) Motor Driver, (2) Counter
ASICs: 1: (1) Sensor Driver
System Cost ($): 1,250
Display response time (μs): 13,200
Gate response time (μs): 210

VPMS-C
Subsystems: 2: (1) ENTRY/Display Subsystem, (2) EXIT Subsystem
PEs: 2: (1) ENTRY Motor Driver/Counter, (2) EXIT Motor Driver
ASICs: 2: (1) ENTRY Sensor, (2) EXIT Sensor
System Cost ($): 1,340
Display response time (μs): 13,100
Gate response time (μs): 110

VPMS-D
Subsystems: 2: (1) ENTRY Subsystem, (2) EXIT/Display Subsystem
PEs: 2: (1) ENTRY Motor Driver, (2) EXIT Motor Driver/Counter
ASICs: 2: (1) ENTRY Sensor, (2) EXIT Sensor
System Cost ($): 1,340
Display response time (μs): 13,100
Gate response time (μs): 110

VPMS-E
Subsystems: 3: (1) ENTRY Subsystem, (2) EXIT Subsystem, (3) Display Subsystem
PEs: 3: (1) ENTRY Motor Driver, (2) EXIT Motor Driver, (3) Counter
ASICs: 2: (1) ENTRY Sensor, (2) EXIT Sensor
System Cost ($): 1,430
Display response time (μs): 13,200
Gate response time (μs): 110
