
Hierarchical Computing: An Architecture for Efficient Transaction Processing

Juan Rubio and Lizy K. John
Laboratory for Computer Architecture
The University of Texas at Austin
Austin, TX 78712

{jrubio,ljohn}@ece.utexas.edu

Abstract

Transaction processing workloads impose heavy demands on the memory and storage sub-systems and often result in large amounts of traffic on I/O and memory buses. In this paper, we propose to utilize processing elements distributed across the memory hierarchy, with the objective of performing the computation close to where the data resides. Leveraging active memory modules and active disk devices emerging from other research groups and available in the market, we propose a hierarchical computing system in which the distributed processing elements operate concurrently and communicate using a hierarchical interconnect. Transactions are partitioned across the different layers in the hierarchy depending on the affinity of code to a particular layer or other heuristics. Commands percolate down into the lower layers of the hierarchy and preprocessed/partially processed information flows up into the higher layers. All layers ACTIVELY participate in the processing of the transaction by doing tasks for which they are particularly suited. The lower layers contain inexpensive processor units and, in conjunction with the powerful central processor and other collaborating memory and disk processors, yield high performance in a cost-effective fashion. This concept is then applied to the online transaction processing benchmark TPC-C, and schemes for code partitioning are outlined. A hierarchical computing system containing four inexpensive memory processors and 32 very inexpensive disk processors can yield speedups of up to 4.52x when compared with a traditional system. Since transaction processing has been seen to contain fine-grained and coarse-grained parallelism, the proposed hierarchical computing paradigm, which exploits parallelism and reduces data transport requirements, seems to be a feasible model for future database servers.

Keywords: Computer Architecture, Parallel Processing, Memory Hierarchy, High Performance Computing, Database Servers.

Technical Area: Architecture

Introduction

It is a well-known fact that the server market is the driving factor for several of the technological advancements in the computer industry. A few years back, this market was mainly dominated by technical workloads, but during the last two decades it has changed to power a large portion of commercial operations.
One important type of application in the group of commercial workloads is Transaction Processing (TP). Transaction processing workloads are classified into two types: Online Transaction Processing (OLTP) and Decision Support Systems (DSS). OLTP systems are used to handle those operations that occur during the normal operation of a business (e.g. a client buys products, the managers check the inventory or adjust the price of an item). On the other hand, DSS systems are used to make decisions based on the data gathered by a business, which usually comes from an OLTP system (e.g. find the most popular product within a given demographic bracket, estimate the net profit of all sales in the last three months). Even though both workloads fit within the category of transaction processing, they have many differences:

- OLTP operations are of short duration, taking milliseconds to complete, whereas DSS operations take minutes.

- The same contrast applies to the dataset of an operation. While OLTP operations usually have datasets in the order of kilobytes or megabytes, DSS operations usually access megabytes or hundreds of megabytes of data. Recent literature suggests that DSS systems will be accessing gigabytes in the next couple of years [1].

- The number of concurrent operations in an OLTP system is in the order of thousands, while DSS systems normally have less than a hundred concurrent operations.

- OLTP systems constantly modify the data stored in the databases (e.g. enter a sale, deliver a package). DSS systems, on the other hand, use mostly read operations during their execution.
Transaction processing systems are typically implemented using a multi-tier architecture. The idea is to implement a functional pipeline that streams transactions from the clients to the server database in an efficient way. As can be seen from Figure 1, clients on the left are connected to an intermediate server, or Middle-Tier, through a switched network. The function of the middle-tier server is to act as a filter and reject those requests presented by the clients that are incorrectly generated. It also enforces the security in the system and serves as a parser that transforms requests formulated in one language domain (e.g. HTML) to another domain (e.g. SQL).
In this example, the Middle-Tier server is implemented as a server cluster, with a front-side connection to the clients through a load-balancing switch.
[Figure 1: Conventional System Level Architecture for a Transaction Processing System. Clients connect through a front-side switch to the Middle-Tier server, which connects through a back-end switch to the Back-End Tier server.]
The function of this switch is to distribute the load across all the nodes of the cluster. Work done in this area includes locality-aware distribution algorithms like the one developed by Pai et al. [2]. The nodes in the cluster communicate with each other through a back-side network interface, which they also use to send the requests to the database server, also referred to as the Back-End Tier server.
The final component of the system is the Back-End Tier, which is also the focus of this paper. This server is the one that manipulates the primary data of the commercial operation (e.g. it keeps the list of clients, the orders they place, the prices of items and their quantities in the warehouses). As such, the back-end tier has complete control over a large portion of the data, which is normally local to it and accessed using a Relational Database Management System (RDBMS, or commonly DBMS). Implementations of this server include symmetric multiprocessor systems (SMP) as well as cluster servers.
When we look at the execution behavior of commercial workloads, we observe that they differ from technical workloads and place heavier demands on the memory and storage sub-systems [3, 4, 5]. In fact, studies that analyzed transaction processing workloads indicate that systems spend around 90% of the time waiting for the I/O devices to access the data [6]. Once the data are brought to memory, the processor spends between 25% and 45% of the execution time handling memory accesses [7]. This results in a sub-optimal utilization of the latency-hiding features of modern dynamically scheduled processors [8].
One of the reasons for this imbalance between computation and data access traces back to the principles of traditional memory hierarchies, where data moves from the storage sub-system to the processor before it can be processed. Although we have become accustomed to this execution model, which works well for technical and some other applications, it is far from optimal when used with a transaction processing workload. The action of moving data back and forth between the storage and the computing elements not only results in a high volume of traffic, which hurts the scalability of the system, but also creates an artificial bottleneck by serializing the execution in an environment with ample parallelism.
This paper presents the Hierarchical Computing model as a possible solution to the problems presented above. The next section introduces the idea, giving special attention to the operation of the hardware and the communication of the devices. Section 3 covers the programming model used in the system, how we plan to partition the problem, and the use of a set of basic primitives to operate on the data. Section 4 presents a basic code partitioning scheme and applies it to a conventional transaction processing workload, which helps us determine the feasibility of the model. Section 5 performs a mathematical analysis of the idea and identifies the parameters that affect the performance of this technique. Section 6 looks at other ideas proposed in the literature. Section 7 concludes with a highlight of the most significant contributions.

Hierarchical Computing

To address the problems presented in the previous section, most transaction processing systems exploit the coarse-grain parallelism present in the form of concurrent transactions. Before proceeding further it is important to define what a transaction is. A transaction is a sequence of operations which are executed atomically (atomic), which always maintain a consistent state in the database (consistent), whose execution is not affected by concurrent transactions (isolation) and whose effects are permanent (durable). This set of properties, commonly referred to as the ACID properties after the first letter of each property, is the basis of transaction processing theory. Current system implementations use a thin layer of software known as the Transaction Manager. Its function is to enforce an order in the arrival of the transactions and the ACID properties required by the transaction model. The queries that form a transaction are then passed to the database processes which run on each one of the processors in the system. Although this approach has been relatively successful, it can cause resource management problems, manifested in the form of hot spots during the access of the database tables, thus inhibiting the exploitation of all the parallelism in the system.
The Hierarchical Computing execution model exploits the parallelism available within a single transaction, in addition to the thread-level or coarse-level parallelism that can be exploited using other means. The idea is to distribute the computation across a computer system on behalf of a single transaction. In the model, communication follows a message-based approach over a hierarchical topology of interconnects. Since this model exploits a different type of parallelism than a traditional system (intra-transaction vs. inter-transaction), its use can be orthogonal to existing methods. While in a traditional system all the computations are required to be performed by the CPU, under this model there is no need to insist on that; in fact, several operations may be efficiently performed at the data residence. This is an important difference between the two models, visible in the way a server handles a query from a client, and it brings a set of interesting tradeoffs, which are presented and evaluated in the next sections.
To distribute computing in this way, some computing power is located in memory and some in disk, close to the location of the data, where it performs simple computations such as comparison and accumulation. The components are coupled using a hierarchical interconnect (any sort of directed acyclic graph, such as a binary tree).
[Figure 2: Sample topology for a Hierarchical Computing system. Processing elements (shaded) appear at five levels: (1) processor (ILP), (2) interconnect, (3) main memory (multibank), (4) interconnect, (5) disk (array).]


Figure 2 shows the topology of a sample system based on the above ideas, in which we perform computations at five different levels. The additional points of computation are represented by shaded areas and are located in the memory banks, storage devices and interconnects between the main levels. To exemplify the operation of the system, we can introduce the analogy of a corporate office, where the employees are organized in a well-defined hierarchy, each of them with an amount of data in their close vicinity and over which they have complete ownership. As a team, they handle each of the transactions received by the office in a very distributed fashion. Even though at a particular point in time only some members work on the same task, each one operates on the transaction at some point. Managers at different levels do not need to know every detail of the operations of their subordinates. Each level in the hierarchy conveys only the right amount of information to the layer above.
A system with, say, 1 main memory module and 4 disk modules needs only 3 levels of computation (1 in the main CPU, 1 in memory and 1 in the disks), whereas processors in the interconnect will be critical to larger hierarchical computing systems. The number of levels of interconnect processors depends on the number of disks reporting to the same memory, as well as on the interconnect network; for a binary tree the number of levels of interconnects is given by:

    Levels_interconnect = Max(log_2(Number of disks) - log_2(Number of memory units) - 1, 0)    (1)
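Equation 1 is reconstructed here from partially garbled fragments, so its exact form should be read as an assumption rather than as a definitive formula. A minimal Python sketch of the reconstruction:

    import math

    def interconnect_levels(num_disks: int, num_memory_units: int) -> int:
        # Levels of interconnect processors in a binary-tree hierarchy,
        # per the reconstruction of Equation 1 above (an assumption).
        return max(int(math.log2(num_disks)) - int(math.log2(num_memory_units)) - 1, 0)

    # For the N = (1, 4, 32) configuration evaluated later: 4 memories, 32 disks.
    print(interconnect_levels(32, 4))   # -> 2 levels of interconnect processors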
The intelligence in the different levels can be realized using intelligent memory modules investigated in recent research [9, 10, 11, 12] and intelligent or active disks [13, 14, 15, 16, 17]. It is possible to find storage devices in the market with a 150 MIPS core and up to 2 MB of main memory [18, 19, 20, 21]. Ongoing efforts in intelligent or active memories and disk components can thus be leveraged to implement hierarchical computing systems. If computing capability is required in the network or switches, it can be realized using chips similar to micro-controllers embedded in the switch/bus interface. It may be noted that the computing resources required in the storage devices or memory are significantly cheaper than the central processor. The use of a powerful processor which exploits instruction-level parallelism and thread-level parallelism is favored at the root of the hierarchy.
The hierarchical mode of operation has worked relatively well in our society, and we expect it to work as well in a computer system due to the following characteristics:

- Raw data stay local to a level of the hierarchy, which gives more freedom to the upper levels to operate and hold temporary results of the operations in their fast but reduced storage space.

- It suggests a specialization of the units, which permits a system to use dynamically scheduled processors in the upper levels of the hierarchy, where complex decisions are required and control flow is hard to predict. In-order processors, or narrower power-efficient processors, can be used to handle the bulk of the data in the lower levels.

The implications of the first point affect the mode of operation of the hierarchy and its programming model and are covered in the next section. The second point directly affects the hardware used in the system and will be studied in subsequent sections.

Programming Model

As mentioned in the previous section, the basic element behind the Hierarchical Computing model is the use of computation engines that sit close to the location of the data. The idea is to expedite the movement of data from its natural point of residence to the computation unit. This computation unit can be the main processor, for tasks that require the high computation power provided by a high-frequency, dynamically scheduled processor, or it could as well be a much simpler microprocessor or microcontroller that functions as an intelligent memory or disk controller. The partitioning of the data and operations to take advantage of this new architecture is covered in Section 4. This section deals with the programming model used to support a transaction processing workload.
The heart of a transaction processing system is the database server. Nowadays, databases are based on the relational model [22]. In this model data is stored in the form of tables with a variable number of rows of a predetermined width. To access these tables, these systems use data manipulation languages (DML), of which the structured query language (SQL) is the one most frequently used. The operations supported by the SQL language include:

- Scan: locates rows that match a particular criterion. The criterion can be a single predicate (e.g. list all flights to Pittsburgh) or a complex predicate (e.g. find all vehicles in the state of Texas registered after 1973). It is called select in some data manipulation languages. This operates on a single table.

- Join: similar to the scan operation, but operates on two or more tables (e.g. find all vehicles registered before 1966 and currently owned by individuals born after 1966). The two pieces of information are contained in separate tables.

- Insert: creates a new row in an existing table.

- Remove: homologous to insert, it removes a row from a table.

- Update: modifies one or more fields within one or more rows.

- Sort: is not an independent type of operation, but can be applied to a scan operation (a sketch of these semantics appears below).
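The following minimal Python sketch illustrates the semantics of these operations over in-memory tables represented as lists of dictionaries; the table contents and field names are illustrative only.

    def scan(table, predicate):
        # Scan: rows of a single table that match a criterion.
        return [row for row in table if predicate(row)]

    def join(left, right, on):
        # Join: pairs of rows from two tables satisfying a predicate
        # (naive nested loops, for illustration).
        return [(l, r) for l in left for r in right if on(l, r)]

    def insert(table, row):
        # Insert: create a new row in an existing table.
        table.append(row)

    def remove(table, predicate):
        # Remove: delete the rows that match.
        table[:] = [row for row in table if not predicate(row)]

    def update(table, predicate, changes):
        # Update: modify fields within the matching rows.
        for row in table:
            if predicate(row):
                row.update(changes)

    def sorted_scan(table, predicate, key):
        # Sort: applied on top of a scan, not an independent operation.
        return sorted(scan(table, predicate), key=key)

    flights = [{"dest": "Pittsburgh", "no": 101}, {"dest": "Austin", "no": 205}]
    print(scan(flights, lambda r: r["dest"] == "Pittsburgh"))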
In order to implement the functionality required by the relational model and the SQL language and to take advantage of the concurrency provided by the Hierarchical Computing model, we have opted for a message-based dataflow model. This concept is presented in Figure 3, which shows two levels of a sample hierarchy (although the model is flexible enough to support several levels).
We assume that data is already partitioned and that it resides in the lower level (L{1,2}). The top level (T{1}) is the level that initiates the operation by issuing a command (CMD) to one or more modules in the lower levels (L{1} and/or L{2}). The action of sending the command can be a broadcast or multicast (e.g. select all rows which match a criterion), or it can also be a unicast (e.g. insert row in table). The commands encompass enough information to allow the lower levels to perform the computations on behalf of the upper level. Once the upper level sends the command to the lower level, it waits for data, which depending on the operation can be a null response, a single element or a sequence of elements. The hardware provides basic flow-control signals to help the upper level handle the amount of data that might result from an operation.
[Figure 3: Execution of an operation in the Hierarchical Computing model. The top level T{1} issues a CMD to the lower level L{1,2}, whose active components reply with a stream of DATUM messages.]


From the perspective of the lower level, once it receives a command from the upper level, it performs a preorder traversal starting at its own level. Thus, the node proceeds to access the data over which it has control. If the data is not present in its level, it forwards the command to the level immediately under it and relays all responses to the upper level. If there are no levels under it, a blank response is sent to the upper level.
In order to support this mechanism, both commands and responses need to be tagged with a unique identifier. This technique is similar to the tokens present in traditional dataflow machines [23]. We also tag the commands with the ID of the level that initiated them, which reduces the overhead of processing the responses. Finally, the model also implements a name-space locator in the form of a table allocation index. This index permits the processor in a level to locate data within its boundaries. It is also used to determine whether data is absent, thus avoiding a lengthy traversal of all the data.
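A minimal Python sketch of this mechanism is shown below. The Node class, the command layout (tag, originating level, table name, predicate) and the sample data are hypothetical, introduced only to make the traversal concrete.

    import itertools

    _tags = itertools.count()   # unique identifier for every command

    class Node:
        def __init__(self, tables=None, children=()):
            self.tables = tables or {}   # table allocation index: name -> rows
            self.children = children     # nodes in the level immediately below

        def handle_command(self, cmd):
            # Preorder traversal: look at local data first, then go down.
            tag, origin, table, predicate = cmd
            if table in self.tables:     # data under this node's control
                return [(tag, row) for row in self.tables[table] if predicate(row)]
            if not self.children:        # leaf without the data: blank response
                return []
            responses = []               # forward the CMD, relay responses upward
            for child in self.children:
                responses.extend(child.handle_command(cmd))
            return responses

    disks = [Node(tables={"stock": [{"item": 1, "qty": 3}, {"item": 2, "qty": 9}]}),
             Node()]                     # second disk holds no relevant table
    memory = Node(children=disks)
    cmd = (next(_tags), "T1", "stock", lambda r: r["qty"] < 5)
    print(memory.handle_command(cmd))    # -> [(0, {'item': 1, 'qty': 3})]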
After studying the different SQL operations and the algorithms used in transaction processing workloads, we designed two types of operations: individual and aggregate. They are based on the execution presented in Figure 3, but differ in the way the lower level generates the results and how the upper level interprets them.

- Type 1: Individual
This primitive is named individual because the lower level informs the upper level of every single result. It effectively acts as an unbuffered filter. The semantics can be designed to allow the operation to return on the first event triggered or to continue operating until it reaches the end of the region.
Example: Search a range of data for a string.

[Figure 4: Primitive Type 1 (Individual). The CMD is propagated down to the active components; every matching element (5 and 7 in the example) is returned individually to the issuing level.]

- Type 2: Aggregate
For this primitive, the lower level accesses its associated data and finds those elements that match a particular criterion. However, it does not send all these results to the upper level. Instead it produces an aggregate number and sends it once all its data has been analyzed. In this context, an aggregate function Aggregate() is any function that produces a single number based on a set of numbers ({I_0, ..., I_N}). The most common aggregate functions in transaction processing workloads are sum(), count(), average(), max(), and min(). In this case, the results returned by the different nodes in the lower level might need to be combined in the upper level to produce a unique answer. We can accomplish that by applying the same aggregate function in the upper level to the values returned. This works for all the examples shown above except for the average() function. For those circumstances the algorithm is changed to return both sum() and count() from the nodes in the lower levels, and the upper level performs the computation of average(), as the sketch below illustrates.
Example: Count orders placed in January 2001.
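A minimal Python sketch of this recombination, with illustrative data: each lower-level node reduces its local rows to a partial result, and the upper level combines the partials (shipping (sum, count) pairs so that average() can be finished at the top).

    def node_aggregate(rows, predicate):
        # Lower level: filter local rows, return (sum, count) over the matches.
        amounts = [r["amount"] for r in rows if predicate(r)]
        return (sum(amounts), len(amounts))

    def combine_average(partials):
        # Upper level: apply sum() to the partials, then finish the average.
        total = sum(s for s, _ in partials)
        count = sum(c for _, c in partials)
        return total / count if count else 0.0

    disk0 = [{"month": "2001-01", "amount": 10}, {"month": "2001-02", "amount": 99}]
    disk1 = [{"month": "2001-01", "amount": 30}]
    jan_2001 = lambda r: r["month"] == "2001-01"
    partials = [node_aggregate(d, jan_2001) for d in (disk0, disk1)]
    print(sum(c for _, c in partials))   # count of January 2001 orders -> 2
    print(combine_average(partials))     # average order amount -> 20.0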

Code partitioning

When we introduced the programming model in Section 3, we assumed that the data was already laid out correctly on disk. The distribution of code among the different levels of the hierarchy impacts performance. In this section we address the issue of how to partition a transaction in order to achieve good performance with the Hierarchical Computing model.

[Figure 5: Primitive Type 2 (Aggregate). The CMD is propagated as before, but each node returns a single aggregate value (aggr<3>, aggr<5,7> in the example) computed over its local matches rather than the individual elements.]


To study the data and ode partitioning, we begin by looking at the TPC-C ben hmark [24, a popular
transa tion pro essing workload. The TPC-C ben hmark is developed by the Transa tion Pro essing
Coun il (TPC), and is intended to serve as a standard ben hmark for Online Transa tion Pro essing
systems. The ben hmark models the operation of a business with ve di erent types of transa tions (new
order, payment, order status, delivery and sto k level).
Table 1 shows the hara teristi s of the database tables used by the ben hmark. The parameter W
represents the number of warehouses in the ben hmark, and is used to s ale it to di erent hardware
on gurations. The ardinality olumn indi ates the number of rows in a database table. The next olumn
shows the size of a row for our implementation using IBM DB2 Universal Database [25. The last olumn
shows the size of the tables for a on guration with 17,500 warehouses, whi h is lose to the highest
non- lustered re ord to TPC-C by the time we ondu ted this study.
Specified as part of the TPC-C benchmark is a high-level description of each of the five transactions. For each of these transactions, we show the query execution plans (Figures 6 to 10). Execution plans are directed graphs that represent the tables in the database, the flow of data and the operations performed over the data. The tables are represented by circles, operations by rectangles and the flow of data by the arcs.
Note that there is no reference to time in this representation, as the computation is driven by the arrival of data. Also, to read the plans correctly, they should be traversed in postorder (i.e. visit children before entering the node). So if we look at, say, Figure 8, we observe that we cannot perform the scan over table order-line before we perform the scan over table order, which itself needs the scan over table customer together with a sort operation. An additional clarification is needed for the transaction shown in Figure 6, where there is a dotted box around some operations. It indicates that a section of the plan is repeated several times, in this particular case once for every item in the order.

[Figure 6: Execution Plan for the TPC-C New-Order Transaction. Operations: Update over Scan(Warehouse), Scan(District), Scan(Customer); Insert(Order), Insert(Order-Line); a dotted box encloses the repeated Update over Scan(Item), Scan(Stock).]

[Figure 7: Execution Plan for the TPC-C Payment Transaction. Operations: Updates over Scan(Warehouse), Scan(District), Scan(Customer); Insert(History).]

[Figure 8: Execution Plan for the TPC-C Order-Status Transaction. Operations: Scan(Customer) with Sort, then Scan(Order), then Scan(Order-Line).]

[Figure 9: Execution Plan for the TPC-C Delivery Transaction. Operations: Scan(New-Order) with Sort and Remove; Scan(Order) with Update; Scan(Order-Line) with Update and Aggregate; Update over Scan(Customer).]

[Figure 10: Execution Plan for the TPC-C Stock-Level Transaction. Operations: Scan(District), Scan(Order-Line), Join with Scan(Stock), Count.]

Table        Cardinality   Row Size (bytes)   Table Size (GB) (W=17.5k)
Warehouse    W             101                <0.1
District     W x 10        107                <0.1
Customer     W x 30k       701                342.8
Stock        W x 100k      330                537.9
Item         100k          90                 <0.1
Order        W x 30k+      40                 19.6
New-Order    W x 9k+       10                 1.5
Order-Line   W x 300k+     80                 391.2
History      W x 30k+      68                 33.3
Total                                         1,270.4

Table 1: Dimensions of tables for the TPC-C benchmark.
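The per-table sizes in Table 1 follow directly from the cardinality and row-size columns, as the short check below reproduces (to within rounding; for the "+" entries the initial cardinalities are used, and sizes are in binary gigabytes):

    W = 17_500                      # warehouses, as in the last column of Table 1
    tables = {                      # name: (rows, row size in bytes)
        "Warehouse":  (W,           101),
        "District":   (W * 10,      107),
        "Customer":   (W * 30_000,  701),
        "Stock":      (W * 100_000, 330),
        "Item":       (100_000,      90),
        "Order":      (W * 30_000,   40),
        "New-Order":  (W * 9_000,    10),
        "Order-Line": (W * 300_000,  80),
        "History":    (W * 30_000,   68),
    }
    for name, (rows, row_size) in tables.items():
        print(f"{name:10s} {rows * row_size / 2**30:8.1f} GB")   # e.g. Customer -> 342.8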
In a cold system all the data is stored on disk, but as the execution progresses, we can expect tables, a subset of the rows in a table, or indices to move to upper levels in the hierarchy. We call this process data promotion. Unlike what happens in a traditional system with caches, this operation is not transparent to the software, and is controlled by the mapping algorithms.
Based on the query execution plans, we use a static cost analysis similar to the one used in some of the first query optimizers [26], where the cost (in instructions) of accessing a table is computed as a function of the dimensions of the table and the memory capacity. Given a table Table_j of size Size(Table_j) and with Rows(Table_j) rows, in a hierarchy with M levels, the process used to partition the code is shown in Figure 11. For join operations we use a second table Table_k of size Size(Table_k) and Rows(Table_k) rows. This is a simple model, and by no means should it be considered an optimal partition. Additional information used by the algorithm might include the notion of processing affinity [27], where a module of computation is sent to the component which will execute it in the minimum amount of time.
Evaluation

To evaluate the potential of the hierarchical computing model, we estimate the time needed to perform a task in this model and compare it with the time needed in a conventional system. Equation 2 shows the basic expression for the time necessary to complete a task, where performance gains can be obtained by improving any of the factors of the equation. We have chosen to express the time required to execute an instruction (TPI) as the product of the clock period and the CPI (cycles-per-instruction). The first is commonly associated with the hardware implementation details, while the second is a factor of the processor architecture and the workload being executed.

Initial table allocation:
  for each Level_i
    for each Table_j
      if (i = M)
        TableInLevel(Table_j, Level_i) <- True
      else if (Size(Table_j) < Threshold(Level_i))
        TableInLevel(Table_j, Level_i) <- True
      else
        TableInLevel(Table_j, Level_i) <- False

Cost estimation: traverse the execution plan
  for each operation
    obtain type(operation)
    set the level where the computation is performed:
      Levels <- { Level_i | TableInLevel(Table(operation), Level_i) = True }
      HighestLevel <- Min(Levels)
      if (type(operation) in {SCAN, INSERT, REMOVE, UPDATE})
        ExecOpInLevel(operation) <- HighestLevel
      if (type(operation) = JOIN)
        ExecOpInLevel(operation) <- level of the table with Min(Rows(Table_j), Rows(Table_k))
        Assist(operation) <- Max(HighestLevel - 1, 1)
    remember the tables used by the level:
      LevelUsesTable(ExecOpInLevel(operation), TablesUsedBy(operation)) <- True
    compute cost(operation):
      if (type(operation) = SCAN)      cost(operation) <- Rows(Table_j)
      if (type(operation) = INSERT)    cost(operation) <- 1
      if (type(operation) = REMOVE)    cost(operation) <- 1
      if (type(operation) = UPDATE)
        if (dependencies(operation) > 0) cost(operation) <- Rows(Table_j)
        else                             cost(operation) <- 0
      if (type(operation) = AGGREGATE) cost(operation) <- 0
      if (type(operation) = SORT)      cost(operation) <- Rows(Table_j)
      if (type(operation) = JOIN)      cost(operation) <- Max(Rows(Table_j), Rows(Table_k))
      add cost(operation) to CostOfLevel(ExecOpInLevel(operation))

Remove unused copies of tables:
  for each Level_i with i < M
    for each Table_j
      if (LevelUsesTable(Level_i, Table_j) = False)
        TableInLevel(Table_j, Level_i) <- False

Figure 11: Simple scheme for partitioning transaction processing workloads.
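For concreteness, a small Python rendering of the Figure 11 scheme follows. It is a sketch of the reconstruction above: the function names and the threshold values are illustrative, not taken from the paper.

    # Levels are numbered 1..M from the main processor down; level M (disk)
    # always receives a copy of every table.
    def allocate_tables(tables, thresholds):
        # tables: {name: (rows, size_bytes)}; thresholds: capacity per level 1..M-1.
        M = len(thresholds) + 1
        in_level = {}
        for i in range(1, M + 1):
            for name, (rows, size) in tables.items():
                in_level[(i, name)] = (i == M) or (size < thresholds[i - 1])
        return in_level, M

    def cost(op_type, rows_j, rows_k=0, dependencies=0):
        # Instruction-count cost per operation type, as in Figure 11.
        return {"SCAN": rows_j, "INSERT": 1, "REMOVE": 1,
                "UPDATE": rows_j if dependencies > 0 else 0,
                "AGGREGATE": 0, "SORT": rows_j,
                "JOIN": max(rows_j, rows_k)}[op_type]

    tables = {"Item": (100_000, 9_000_000), "Stock": (1_750_000_000, 577_500_000_000)}
    in_level, M = allocate_tables(tables, thresholds=[64 * 2**20, 4 * 2**30])
    print(in_level[(1, "Item")], in_level[(1, "Stock")])  # True False: only Item fits in level 1
    print(cost("SCAN", rows_j=100_000))                   # scanning Item costs 100,000 instructions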

    t = (Instructions / N_processors) × TPI    (2)

    TPI = T_clk × CPI    (3)

This execution model performs computations in the different levels of the storage hierarchy, in what can be considered heterogeneous computing elements; thus Equation 4 expresses the time spent computing by a single level. The index i goes from 1 to M, where M is the total number of levels in the hierarchy.^1 The parameter N_i indicates the number of processing elements in the particular level i.

    t_i = (Instructions_i / N_i) × TPI_i    (4)

Another interesting characteristic of this model is that it allows operations to be performed concurrently in the different levels. However, there might be algorithms that do not allow that level of parallelism because they have serial components in their execution. Taking that into account, Equations 5 and 6 show the range of values for the total time taken by a task. When we can exploit full parallelism, the resulting time will be the maximum of the individual times. Situations where all the operations need to be serialized result in a time equal to the sum of all the times. Hence the range of execution time can be calculated as follows:

    t_max = sum_{i=1}^{M} t_i    (no overlapping)     (5)

    t_min = max_{i=1}^{M} t_i    (full overlapping)   (6)

To enable this model, computation is partitioned so that operations are assigned to the computation elements in each one of the levels. Considering the nature of transaction processing workloads, the partitioning of operations can be done statically or using query optimizers like the ones built into most commercial databases [26]. With these systems, we express the distribution of computation in Equation 7. The coefficients φ_i indicate the fraction of the total number of instructions executed in level i of the hierarchy.

    Instructions = sum_{i=1}^{M} Instructions_i    (7)

    φ_i = Instructions_i / Instructions    (8)

^1 Our convention is to number the hierarchy from the top to the bottom, so the level with the main processor is assigned the value i = 1.

The number of instructions a processor is capable of executing is a function of a myriad of factors in the system. Among them, the type of workload plays an important role. Given that we are evaluating this model in the light of transaction processing workloads, we decompose the CPI into a computation component and a storage component:

    CPI = CPI_computation + CPI_storage    (9)

A characteristic of this model is the use of relatively simple computation engines in the lower levels of the hierarchy. This affects the balance of the operations in the system, according to Equation 10:

    CPI_computation,i < CPI_computation,j    for i < j    (10)

Assuming a hierarchy of three levels like the one shown in Figure 12, it is possible to expand the above relations, showing the factors into which the time decomposes. We have assigned the first level to operations computed by the main processor, the second level to the main memory and the third level to those operations performed by the disk controller.
[Figure 12: Topology of a Simple Hierarchical Computing system with M = 3. P.main sits at the Main level (i = 1), N2 P.memory processors at the Memory level (i = 2), and N3 P.disk processors at the Disk level (i = 3).]


    t = (Instructions_1 / N_1) × TPI_1 + (Instructions_2 / N_2) × TPI_2 + (Instructions_3 / N_3) × TPI_3    (11)
              (main)                           (memory)                          (disk)

Since typically the disk and memory processors will not be as sophisticated as the central processor, we use a degradation factor to indicate how the memory and disk processors compare to the central processor. The degradation factor for TPI is defined as δ_i = TPI_i / TPI_1, which we use to obtain the speedup for the non-overlapping mode using Equation 5:

    Speedup = 1 / (φ_1/N_1 + φ_2·δ_2/N_2 + φ_3·δ_3/N_3)    (12)

Similarly, the speedup when the algorithm shows full overlapping is expressed as:

    Speedup = 1 / Max(φ_1/N_1, φ_2·δ_2/N_2, φ_3·δ_3/N_3)    (13)

[Figure 13: Speedup surface for Φ = (17%, 25%, 58%) and N = (1, 4, 32) as the degradation factors δ2 (2.0 to 9.6) and δ3 (3.0 to 14.5) vary; the speedup ranges from under 0.25 up to 3.00.]


Let us use the tuple N to represent the number of computation elements in each level of the hierarchy. Code will be partitioned between the different layers based on affinity or other heuristics. We use the tuple Φ to represent the fractions of the instructions executed in each level of the hierarchy (φ_i). To analyze the effect of the selection of devices, we generate the speedup surface for variations of δ2 and δ3 (Figure 13). As expected, we achieve the maximum speedup when we use the smallest values of δ2 and δ3 (i.e. P.memory and P.disk are not significantly slower than P.main). There are, however, several points with equal speedup for multiple pairs of degradation factors, which gives designers a great degree of freedom when selecting components.
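A small Python sketch of Equations 12 and 13; the degradation factors used here are illustrative endpoints of the ranges assumed in the sensitivity analysis below (2 < δ2 < 10, 4 < δ3 < 15):

    def speedup_serial(phi, delta, N):
        # Equation 12: no overlap between levels (per-level times add up).
        return 1.0 / sum(p * d / n for p, d, n in zip(phi, delta, N))

    def speedup_overlap(phi, delta, N):
        # Equation 13: full overlap between levels (slowest level dominates).
        return 1.0 / max(p * d / n for p, d, n in zip(phi, delta, N))

    phi  = (0.00, 0.25, 0.75)   # code partition of the best row of Table 2
    N    = (1, 4, 32)           # one CPU, 4 memory processors, 32 disk processors
    fast = (1, 2, 4)            # optimistic degradation factors (delta_1 = 1)
    slow = (1, 10, 15)          # pessimistic degradation factors
    print(round(speedup_serial(phi, fast, N), 2))   # ~4.57, close to the 4.52x in Table 2
    print(round(speedup_serial(phi, slow, N), 2))   # ~1.02, close to the 1.01 minimum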



N            Φ                  Speedup Min   Speedup Max
(1, 4, 32)   (10%,  5%, 85%)    1.60          4.32
(1, 4, 32)   (35%, 10%, 55%)    1.17          2.13
(1, 4, 32)   (15%, 25%, 60%)    0.95          2.86
(1, 4, 32)   ( 0%, 25%, 75%)    1.01          4.52
(1, 4, 32)   (10%, 20%, 70%)    1.08          3.48
(1, 4, 16)   (10%, 20%, 70%)    0.80          2.67
(1, 4, 32)   (10%, 20%, 70%)    1.08          3.48
(1, 4, 64)   (10%, 20%, 70%)    1.31          4.10
(1, 2, 32)   (10%, 20%, 70%)    0.70          2.58
(1, 2, 64)   (10%, 20%, 70%)    0.79          2.91

Table 2: Speedups of a hierarchical computing system with conventional memory and disk components.
The algorithm presented in Figure 11 is used to obtain a preliminary estimate of the code partitioning. For the situation where Threshold(Level_i) = Capacity(Level_i), Table 3 shows the fractions of the computation that should be performed in each level for a configuration of N = (1, 4, 16).

Transaction type   φ_1      φ_2      φ_3
New Order          10.0 %   0.8 %    89.2 %
Payment            36.3 %   9.1 %    54.6 %
Order Status       16.7 %   25.0 %   58.3 %
Delivery           0.0 %    25.0 %   75.0 %
Stock Level        6.3 %    18.8 %   74.9 %

Table 3: Partition of code for the TPC-C benchmark.


Now we proceed to do a sensitivity analysis assuming a hierarchical computing system built with available technologies. We assume a main processor running at 1 GHz, memory processors running between 100 and 500 MHz (i.e. 2 < δ2 < 10), and disk processors running between 66 and 250 MHz (i.e. 4 < δ3 < 15). Considering the database sizes of some implementations of transaction processing systems, we selected a configuration with four memory processors and 32 disks (expressed by N = (1, 4, 32)). The first section of Table 2 shows the maximum and minimum speedups obtained for different partitions of the code (i.e. varying values of Φ). Depending on the distribution of work, we can observe speedups of up to 4.52x when comparing with a traditional uniprocessor system. If the memory and disk processors are very slow, potential slowdowns may be observed for certain partitions of the code. For instance, the distribution (15%, 25%, 60%) encounters a slowdown when δ2 = 10 and δ3 = 15. However, the same configuration is capable of a speedup of 2.86x given more powerful nodes.
Code partitioning is dependent on the nature of the transaction, and as such is relatively hard to change without restructuring the algorithm. It is therefore important to analyze the effect that changing the hardware configuration has on the speedup for a given code partition. This is shown in the lower half of Table 2. Here we fix the partition at (10%, 20%, 70%) and change the number of memory and disk modules. Given the large amount of code assigned to the disk modules, we observe that incrementing the number of disks results in an increase in the speedup. As in any other multiprocessor system, the increase is not linear in the number of processing elements added. It may also be noted that active memory systems and active disk systems are subsets of the proposed general hierarchical computing system.

Related Work

During the 1970s, computer scientists looked at database applications and proposed specially designed machines to handle the increasing performance gap between primary and secondary storage as well as the overwhelming software complexity present in database applications [28, 29]. Known as Database Machines, these systems incorporated specialized components in the form of per-disk, per-track and per-head processors and associative memories, in order to facilitate the access of data. The problem with these systems was that the use of non-commodity hardware drastically increased their cost. Additionally, they were designed to handle only database workloads, which resulted in a declining interest by the rest of the architecture and software communities.
Database machines saw their last days with the development of parallel databases [30], which proved to be a cost-effective solution for the problems of the day. Since then, modern commercial databases have adopted several of the proposed algorithms: parallel sort [31], parallel join [32] and other algorithms that trade off memory utilization for I/O bandwidth [33].
In addition to database machines, several research projects have looked at the idea of having computation elements close to the data. Intelligent memories have mostly been targeted at regular numeric applications [9, 10], but recent attempts also look at their use in non-regular applications [11, 34, 12, 35]. Likewise, the idea of the intelligent disk has been covered by different research groups. The workloads considered for this technology consist of Decision Support Systems [13, 14, 15, 17], data-mining and multimedia applications [16].
Related to our research, the X-Tree architecture [36, 37] looks at a multiprocessor organization where processors are connected using a binary tree. This topology facilitates the design of high-bandwidth systems, as the average distance between the nodes increases only logarithmically with the number of nodes in the system. However, the emphasis in the X-Tree system was to build VLSI chips based on the idea of recursive architectures [38], where it was possible to design a computer system by constructing a hierarchy with the same type of processors. Another example of a recursive system is the Data Driven Machine (DDM1), designed by Davis et al. [39]. DDM1 was able to exploit concurrency due to its implementation of Data Driven Nets (DDN), which constitute a form of dataflow similar to the one used by our model.
While the computing paradigm presented in this paper has similarities to the aforementioned research efforts, it must be noted that the merits of several past architectural paradigms are being synergistically combined and applied to transaction processing in our current research effort. We are leveraging advancements in active memories and active disks while at the same time taking advantage of the advancements made during the last twenty years in parallel databases and query optimizers.

Conclusions

In this paper, we have presented a hierarchical model of computing, which is based on the concept of performing computations in a hierarchical manner distributed over the memory hierarchy. This research is intended to alleviate the imbalance of computation and data accessing experienced by large-scale transaction processing systems. The building blocks that help to realize the proposed hierarchical model are the active memory units and active disk devices that are emerging in the market. The basic principle is to use computation engines that sit close to the location of the data and to use a hierarchy to connect these computation engines. This paradigm also brings in benefits of the dataflow model of computation.
We described the computation paradigm and outlined a simple code partitioning scheme. Then we applied the code partitioning scheme to the TPC-C benchmark, and observed that it is possible to split the transactions using static information about the database tables and the storage capacity of the computation nodes. Using the simple partitioning algorithm and assuming inexpensive memory and disk processors, we show that a hierarchical computing system with four memory processors and 32 disk processors can obtain speedups of up to 4.52x when compared with traditional uniprocessor systems. While performance degradations will occur in non-parallelizable code, or in systems with very slow memory or disk processors, it is seen that judicious partitioning of code can yield performance improvements. Since transaction processing has been shown to contain significant amounts of parallelism, partitioning code in a fruitful way is not immensely difficult. In summary, hierarchical computing, which exploits parallelism, distributes computations, and reduces the data transport requirements, is a feasible model of computation for future database servers.
One attractive result of using this paradigm is the feasibility of using processors with different performance ratings in the same system. This is possible due to the use of heterogeneous processing elements in the different layers of the hierarchy. Processors which are used at the top of the hierarchy move down as memory processors once a new high-performance processor generation arrives. Simultaneously, the current memory processors move to the disk as disk processors. This maximizes the lifetime of a processor design, amortizing the cost of the design.

References

[1] R. Winter, "The growth of enterprise data: Implications for the storage infrastructure," Whitepaper, Winter Corporation, 1998.
[2] V. S. Pai, M. Aron, G. Banga, M. Svendsen, P. Druschel, W. Zwaenepoel, and E. Nahum, "Locality-aware request distribution in cluster-based network servers," in Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA, USA), pp. 205-216, Oct. 2-7, 1998.
[3] A. M. G. Maynard, C. M. Donnelly, and B. R. Olszewski, "Contrasting characteristics and cache performance of technical and multi-user commercial workloads," in Proceedings of the Sixth International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA, USA), pp. 145-156, Oct. 4-7, 1994.
[4] S. E. Perl and R. L. Sites, "Studies of Windows NT performance using dynamic execution trees," in Proceedings of the 2nd USENIX Symposium on Operating Systems Design and Implementation, (Seattle, WA, USA), pp. 169-184, Oct. 28-31, 1996.
[5] L. Barroso, K. Gharachorloo, and E. Bugnion, "Memory system characterization of commercial workloads," in Proceedings of the 25th Annual International Symposium on Computer Architecture (ISCA-98), (Barcelona, Spain), pp. 3-14, June 27-July 1, 1998.
[6] M. Rosenblum, E. Bugnion, S. A. Herrod, E. Witchel, and A. Gupta, "The impact of architectural trends on operating system performance," in Proceedings of the Fifteenth ACM Symposium on Operating Systems Principles, (Copper Mountain, CO), pp. 285-298, ACM Press, Dec. 1995.
[7] A. Ailamaki, D. J. DeWitt, M. D. Hill, and D. A. Wood, "DBMSs on a modern processor: Where does time go?," in Proceedings of the 25th Conference on Very Large Data Bases (VLDB'99), (Edinburgh, Scotland), pp. 15-26, Sept. 7-10, 1999.
[8] K. Keeton, D. Patterson, Y. Q. He, R. C. Raphael, and W. E. Baker, "Performance characterization of a Quad Pentium Pro SMP using OLTP workloads," in Proceedings of the 25th Annual International Symposium on Computer Architecture (ISCA-98), (Barcelona, Spain), pp. 15-26, June 27-July 1, 1998.
[9] D. G. Elliott, W. M. Snelgrove, and M. Stumm, "Computational RAM: A memory-SIMD hybrid and its application to DSP," in Custom Integrated Circuits Conference, pp. 30.6.1-30.6.4, May 1992.
[10] D. Patterson, T. Anderson, N. Cardwell, R. Fromm, K. Keeton, C. Kozyrakis, R. Thomas, and K. Yelick, "A case for intelligent RAM: IRAM," IEEE Micro, Apr. 1997.
[11] M. Oskin, F. Chong, and T. Sherwood, "Active pages: A computation model for intelligent memory," in Proceedings of the 25th Annual International Symposium on Computer Architecture (ISCA-98), (Barcelona, Spain), pp. 192-203, June 27-July 1, 1998.
[12] M. Hall, P. Kogge, J. Koller, P. Diniz, J. Chame, J. Draper, J. LaCoss, J. Granacki, J. Brockman, A. Srivastava, W. Athas, V. Freeh, J. Shin, and J. Park, "Mapping irregular applications to DIVA, a PIM-based data-intensive architecture," in Proceedings of the High Performance Networking and Computing Conference (SC99), (Portland, OR), Nov. 13-19, 1999.
[13] K. Keeton, D. A. Patterson, and J. M. Hellerstein, "A case for intelligent disks (IDISKs)," in Proceedings of the ACM SIGMOD International Conference on Management of Data (SIGMOD-98), (Seattle, WA, USA), pp. 42-52, June 1-4, 1998.
[14] A. Acharya, M. Uysal, and J. Saltz, "Active disks: Programming model, algorithms and evaluation," in Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA, USA), pp. 81-91, Oct. 2-7, 1998.
[15] G. A. Gibson, D. F. Nagle, K. Amiri, J. Butler, F. W. Chang, H. Gobioff, C. Hardin, E. Riedel, D. Rochberg, and J. Zelenka, "A cost-effective, high-bandwidth storage architecture," in Proceedings of the Eighth International Conference on Architectural Support for Programming Languages and Operating Systems, (San Jose, CA, USA), pp. 92-103, Oct. 2-7, 1998.
[16] E. Riedel, G. Gibson, and C. Faloutsos, "Active storage for large-scale data mining and multimedia," in Proceedings of the 24th VLDB Conference, Aug. 24-27, 1998.
[17] G. Memik, M. T. Kandemir, and A. Choudhary, "Design and evaluation of smart disk architecture for DSS commercial workloads," in Proceedings of the 2000 International Conference on Parallel Processing, (Toronto, ON, Canada), pp. 335-342, Aug. 21-24, 2000.
[18] Cirrus Logic, Inc., "Preliminary product bulletin CL-SH8665," June 1998.
[19] Intel Corporation, "i960 HX microprocessor developer's manual," Order Number 272484-002, Sept. 1998.
[20] Siemens Microelectronics, "TriCore architecture overview handbook," Feb. 1999.
[21] A. Tessardo, "TMS320C27x: New generation of embedded processors looks like a microcontroller, runs like a DSP," White Paper SPRA446, Digital Signal Processing Solutions, 1998.
[22] E. F. Codd, "A relational model of data for large shared data banks," Communications of the ACM, vol. 13, no. 6, pp. 377-387, 1970.
[23] Arvind and R. S. Nikhil, "Executing a program on the MIT tagged-token dataflow architecture," in PARLE '87, Parallel Architectures and Languages Europe, Volume 2: Parallel Languages (J. W. de Bakker, A. J. Nijman, and P. C. Treleaven, eds.), Berlin: Springer-Verlag, 1987. Lecture Notes in Computer Science 259.
[24] "TPC-C specification." http://www.tpc.org/cspec.html.
[25] "IBM DB2 Universal Database." http://www.software.ibm.com/data/db2/udb/.
[26] M. Jarke and J. Koch, "Query optimization in database systems," ACM Computing Surveys, vol. 16, pp. 111-152, June 1984.
[27] J. Lee, Y. Solihin, and J. Torrellas, "Automatically mapping code on an intelligent memory architecture," in Proceedings of the Seventh International Symposium on High Performance Computer Architecture (HPCA-7), (Monterrey, Mexico), Jan. 19-24, 2001.
[28] D. K. Hsiao, ed., Advanced Database Machine Architecture. Englewood Cliffs, NJ: Prentice-Hall, 1983.
[29] L. L. Miller, A. R. Hurson, and S. H. Pakzad, eds., Parallel Architectures for Data/Knowledge-Based Systems. Los Alamitos, CA: IEEE Computer Society Press, 1995.
[30] D. J. DeWitt and J. Gray, "Parallel database systems: The future of high-performance database systems," Communications of the ACM, vol. 35, pp. 85-98, June 1992.
[31] M. H. Nodine and J. S. Vitter, "Greed sort: Optimal deterministic sorting on parallel disks," Journal of the ACM, vol. 42, pp. 919-933, July 1995.
[32] A. Segev, "Optimization of join operations in horizontally partitioned database systems," ACM Transactions on Database Systems, vol. 11, pp. 48-80, Mar. 1986.
[33] L. D. Shapiro, "Join processing in database systems with large main memories," ACM Transactions on Database Systems, vol. 11, pp. 239-264, Sept. 1986.
[34] Y. Kang, W. Huang, S.-M. Yoo, D. Keen, Z. Ge, V. Lam, P. Pattnaik, and J. Torrellas, "FlexRAM: Toward an advanced intelligent memory system," in Proceedings of the International Conference on Computer Design (ICCD99), (Austin, TX, USA), Oct. 1999.
[35] K. Mai, T. Paaske, N. Jayasena, R. Ho, W. J. Dally, and M. Horowitz, "Smart Memories: A modular reconfigurable architecture," in Proceedings of the 27th Annual International Symposium on Computer Architecture (ISCA'00), (Vancouver, BC, Canada), pp. 161-171, June 12-14, 2000.
[36] A. M. Despain and D. A. Patterson, "X-TREE: A tree structured multi-processor computer architecture," in Proceedings of the 5th Annual International Symposium on Computer Architecture, (Palo Alto, CA, USA), pp. 144-151, Apr. 3-5, 1978.
[37] D. A. Patterson, E. S. Fehr, and C. H. Sequin, "Design considerations for the VLSI processor of X-TREE," in Proceedings of the 6th Annual International Symposium on Computer Architecture, (Philadelphia, PA, USA), pp. 90-101, Apr. 23-25, 1979.
[38] P. C. Treleaven, "VLSI processor architectures," IEEE Computer, pp. 33-45, June 1982.
[39] A. L. Davis, "The architecture and system method of DDM1: A recursively structured data driven machine," in Proceedings of the 5th Annual International Symposium on Computer Architecture, (Palo Alto, CA, USA), pp. 210-215, Apr. 3-5, 1978.
