
I/O Performance Measures:

Austin Orgah
Chapter 8.6,7,8,9

Examples from Disk and File Systems
How should we compare I/O systems?
- This is complex because I/O
performance depends on many aspects of
the I/O system.
- Designers can also make complex trade-offs
between response time and throughput,
making it impossible to measure just one
aspect in isolation.

Examples from Disk and File Systems (contd.)
For example:
Handling a request as early as possible
generally minimizes response time, although
greater throughput can be achieved by handling
related requests together.
Throughput may be increased on a disk by
grouping requests that access locations that
are close together.
This will increase response time for some
requests, probably leading to a larger variation in
response time.
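The trade-off above can be illustrated with a toy model. The sketch below is an assumption-laden illustration, not a disk simulator: service time is taken to be the seek distance only, all requests are queued at time 0, the head starts at track 50, and the track numbers are made up. It compares serving requests in arrival order with serving them grouped by location.

```python
def completion_times(order, tracks, head=50):
    """Completion time of each request when served in the given order.

    Toy cost model (an assumption): time to serve a request equals the
    distance the head travels to reach its track.
    """
    now = 0
    done = {}
    for i in order:
        now += abs(tracks[i] - head)
        head = tracks[i]
        done[i] = now
    return done

# Hypothetical track numbers, listed in arrival order.
tracks = [60, 10, 62, 12, 64]

in_arrival_order = completion_times(range(len(tracks)), tracks)
grouped_by_location = completion_times(
    sorted(range(len(tracks)), key=lambda i: tracks[i]), tracks
)

for name, result in [("arrival order", in_arrival_order),
                     ("grouped", grouped_by_location)]:
    total = max(result.values())
    print(f"{name}: total time {total}, request 0 finishes at {result[0]}")
```

In this toy case grouping finishes all five requests sooner (higher throughput), but the first request to arrive waits much longer than it would in arrival order, which is the larger variation in response time the slide describes.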

Examples from Disk and File Systems (contd.)
Although throughput will be increased,
some benchmarks constrain the maximum
response time for any request, making any
of these optimizations (disk and file)
potentially problematic.

Some benchmarks are proposed for
determining the performance of disk
systems.
These benchmarks are affected by a
variety of system features such as:
Disk technology
How the disks are connected
The memory system
The processor
The file system provided by the operating
system

Important Note:
Terminology/units:
Performance of I/O systems depends on
the rate at which the system transfers data.
The transfer rate depends on the clock
rate, which is given in GHz (1 GHz = 10^9 cycles/sec).
The transfer rate is usually quoted in GB/sec.
In I/O systems, GBs are measured using
base 10, so 1 GB = 10^9 = 1,000,000,000
bytes.
Memory is measured using base 2, so
1 GB = 2^30 = 1,073,741,824 bytes.

Important Note (contd.):
In base 10: 1K = 1000
In base 2: 1K = 1024
For calculation, instead of converting
between the two, treating the two as if they
are equal will introduce little error.
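To see how large the gap between the two conventions is, the following minimal sketch computes the relative difference between the base-2 and base-10 value of each prefix. The function and variable names are illustrative only.

```python
def relative_gap(power_of_ten, power_of_two):
    """Fractional amount by which 2**power_of_two exceeds 10**power_of_ten."""
    decimal = 10 ** power_of_ten
    binary = 2 ** power_of_two
    return (binary - decimal) / decimal

gap_k = relative_gap(3, 10)   # 1000 vs 1024
gap_m = relative_gap(6, 20)   # 10^6 vs 2^20
gap_g = relative_gap(9, 30)   # 10^9 vs 2^30

for prefix, gap in [("K", gap_k), ("M", gap_m), ("G", gap_g)]:
    print(f"{prefix}: base 2 is {gap * 100:.1f}% larger than base 10")
```

The gap is about 2.4% at K and grows to about 7.4% at G, so whether it counts as negligible depends on the prefix and on the precision the calculation needs.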

Benchmarks
Transaction Processing I/O
File System and Web I/O

Transaction Processing I/O Benchmarks
Transaction processing (TP): a type of
application that involves handling small, short
operations (transactions) that require both I/O
and computation. TP applications typically have
both response time requirements and a
performance measurement based on the
throughput of transactions.
TP applications are mainly concerned with I/O rate, measured
as the number of disk accesses per second, rather than
data rate, measured in bytes of data per second.

Transaction Processing I/O Benchmarks
I/O rate: performance measure of I/Os
per unit time, such as reads per second.
Data rate: performance measure of bytes
per unit time, such as GB/sec.
TP applications involve changes to a large database, with the
system meeting some response time
requirements as well as gracefully handling
certain types of failures. For example, banks use
TP systems.
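The difference between I/O rate and data rate defined above can be made concrete with a small calculation. The workloads and numbers below are hypothetical, chosen only to show that a high I/O rate does not imply a high data rate.

```python
def data_rate(io_rate, bytes_per_io):
    """Bytes transferred per second for a given I/O rate and request size."""
    return io_rate * bytes_per_io

# Hypothetical workloads (illustrative numbers, base-10 units).
tp_io_rate, tp_size = 1000, 4_000             # many small accesses
stream_io_rate, stream_size = 100, 1_000_000  # few large accesses

tp_data_rate = data_rate(tp_io_rate, tp_size)
stream_data_rate = data_rate(stream_io_rate, stream_size)

print(f"TP-like:   {tp_io_rate} I/Os per sec, {tp_data_rate / 1e6:.0f} MB/sec")
print(f"Streaming: {stream_io_rate} I/Os per sec, {stream_data_rate / 1e6:.0f} MB/sec")
```

The TP-like workload performs ten times as many accesses per second yet moves far fewer bytes, which is why TP benchmarks report accesses per second rather than bytes per second.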

Transaction Processing I/O Benchmarks
The best-known set of benchmarks is
developed by the Transaction Processing
Performance Council (TPC).
TPC-C, created in 1992, simulates a
complex query environment.
TPC-H models ad hoc decision support:
the queries are unrelated, and knowledge
of past queries cannot be used to optimize
future queries.

Transaction Processing I/O Benchmarks
TPC-R simulates a business decision
support system where users run a
standard set of queries.
TPC-W is a web-based transaction
benchmark that simulates the activities of
a business-oriented transactional web
server.
For more information, visit
www.tpc.org.

File System and Web I/O Benchmarks
File systems stored on disks have a different
access pattern.
Measurements of UNIX file systems (engineering
environment) show that:
80% of accesses are to files < 10 KB.
90% of all file accesses are to data with sequential addresses
on the disk.
67% of the accesses are reads.
27% are writes.
6% are read-modify-write accesses, which read, modify, and
rewrite data to the same location.
These measurements have led to the creation of
synthetic file system benchmarks.

File System and Web I/O Benchmarks
A popular synthetic file system benchmark
has 5 phases and uses 70 files:
MakeDir: Constructs a directory subtree that is
identical in structure to the given directory
subtree.
Copy: Copies every file from the source subtree
to the target subtree.
ScanDir: Recursively traverses a directory
subtree and examines the status of every file in it.
ReadAll: Scans every byte of every file in a
subtree once.
Make: Compiles and links all the files in a
subtree.

File System and Web I/O Benchmarks
In addition to processor benchmarks,
SPEC offers a file server benchmark (SPECSFS)
and a web server benchmark (SPECWeb).
SPECSFS is a benchmark for measuring NFS (Network
File System) performance using a script of file server
requests. It tests the performance of the I/O system, including
disk and network I/O, as well as the processor. It is a
throughput-oriented benchmark with important response time
requirements.
SPECWeb is a web server benchmark that simulates
multiple clients requesting both static and dynamic
pages from a server, as well as clients posting data to the
server.

I/O Performance Versus Processor Performance
Impact of I/O on System Performance:
Suppose we have a benchmark that executes in 100 s of
elapsed time, where 90 s is CPU time and the rest is I/O
time. If CPU time improves by 50% per year for the next
five years but I/O time doesn't, how much faster will our
program run at the end of five years?
Elapsed time = CPU time + I/O time
100 = 90 + I/O time
Therefore: I/O time = 10 s.

After n years   CPU time            I/O time   Elapsed time   % I/O time
0               90 secs             10 secs    100 secs       10%
1               90/1.5 = 60 secs    10 secs    70 secs        14%
2               60/1.5 = 40 secs    10 secs    50 secs        20%
3               40/1.5 = 27 secs    10 secs    37 secs        27%
4               27/1.5 = 18 secs    10 secs    28 secs        36%
5               18/1.5 = 12 secs    10 secs    22 secs        45%

CPU improvement over 5 years is:
90/12 = 7.5
The improvement in elapsed time is:
100/22 = 4.5
So the share of elapsed time spent on I/O
increased from 10% to 45%, even though the
I/O time itself stayed at 10 s.
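The year-by-year figures can be reproduced with a short script. This is a minimal sketch using the numbers from the example (90 s CPU time, 10 s I/O time, CPU time divided by 1.5 each year); the function name and row layout are illustrative. It does not round intermediate values, so its results differ slightly from the rounded ones on the slide.

```python
def project(cpu=90.0, io=10.0, yearly_speedup=1.5, years=5):
    """Return (year, cpu, io, elapsed, io_share) for each year from 0 to `years`."""
    rows = []
    for year in range(years + 1):
        elapsed = cpu + io
        rows.append((year, cpu, io, elapsed, io / elapsed))
        cpu /= yearly_speedup  # only CPU time improves; I/O time is unchanged
    return rows

rows = project()
for year, cpu, io, elapsed, io_share in rows:
    print(f"{year}  CPU {cpu:5.1f} s  I/O {io:4.1f} s  "
          f"elapsed {elapsed:5.1f} s  I/O share {io_share * 100:4.1f}%")

cpu_improvement = rows[0][1] / rows[-1][1]
elapsed_improvement = rows[0][3] / rows[-1][3]
print(f"CPU improvement: {cpu_improvement:.1f}x, "
      f"elapsed improvement: {elapsed_improvement:.1f}x")
```

Without rounding, the CPU improvement is about 7.6 and the elapsed-time improvement about 4.6, in line with the slide's 7.5 and 4.5.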

Designing an I/O System
Two primary specifications that designers
encounter in I/O systems:
Latency constraints
Bandwidth constraints
Knowledge of the traffic pattern affects the
design and analysis.

Latency constraints involve ensuring
that the latency to complete an I/O
operation is bounded by a certain amount.
Designing an I/O system to meet a set of
bandwidth constraints, given a workload, takes three steps:
Find the weakest link in the I/O system, which is the component
in the I/O path that will constrain the design. Depending on the
workload, this component can be anywhere, including the CPU,
the memory system, the backplane bus, the I/O bus, the I/O
controllers, or the devices. The workload and configuration limits
may dictate where the weakest link is located.
Configure this component to sustain the required bandwidth.
Determine the requirements for the rest of the system and
configure them to support this bandwidth.

I/O System Design Example
A CPU that sustains 3 billion instructions/sec and averages 100,000
instructions in the operating system per I/O operation.
A memory backplane bus capable of sustaining a transfer rate of
1000 MB/sec.
SCSI Ultra320 controllers with a transfer rate of 320 MB/sec and
accommodating up to 7 disks.
Disk drives with read/write bandwidths of 75 MB/sec and an average
seek plus rotational latency of 6 ms.
If the workload consists of 64 KB reads (where the block is
sequential on a track) and the user program needs 200,000
instructions per I/O operation, find the maximum sustainable I/O rate and
the number of disks and SCSI controllers required. Assume that the
reads can always be done on an idle disk if one exists (i.e., ignore
disk conflicts).
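The slide states the problem without a worked answer. The sketch below is one way to work it with the weakest-link approach described earlier, under stated assumptions: KB and MB are base 10 (as the terminology note says for I/O), the CPU and the backplane bus are the two fixed components that can cap the I/O rate, and disk conflicts are ignored as the problem allows. All variable names are illustrative.

```python
import math

# Parameters taken from the problem statement.
CPU_INSTR_PER_SEC = 3e9
OS_INSTR_PER_IO = 100_000
USER_INSTR_PER_IO = 200_000
BUS_BYTES_PER_SEC = 1000e6
SCSI_BYTES_PER_SEC = 320e6
DISKS_PER_CONTROLLER = 7
DISK_BYTES_PER_SEC = 75e6
DISK_LATENCY_SEC = 6e-3      # average seek plus rotational latency
IO_SIZE_BYTES = 64e3         # 64 KB, base 10 (assumption)

# Step 1: find the weakest link among the fixed components.
cpu_io_rate = CPU_INSTR_PER_SEC / (OS_INSTR_PER_IO + USER_INSTR_PER_IO)
bus_io_rate = BUS_BYTES_PER_SEC / IO_SIZE_BYTES
max_io_rate = min(cpu_io_rate, bus_io_rate)

# Step 2: configure enough disks to sustain that rate.
time_per_io = DISK_LATENCY_SEC + IO_SIZE_BYTES / DISK_BYTES_PER_SEC
disk_io_rate = 1 / time_per_io
disks = math.ceil(max_io_rate / disk_io_rate)

# Step 3: check whether a full controller saturates, then count controllers.
controller_load = DISKS_PER_CONTROLLER * IO_SIZE_BYTES / time_per_io
controller_saturated = controller_load > SCSI_BYTES_PER_SEC
controllers = math.ceil(disks / DISKS_PER_CONTROLLER)

print(f"CPU limit: {cpu_io_rate:.0f} I/Os per sec, bus limit: {bus_io_rate:.0f} I/Os per sec")
print(f"Max sustainable I/O rate: {max_io_rate:.0f} I/Os per sec")
print(f"Disks: {disks}, controllers: {controllers}, "
      f"controller saturated: {controller_saturated}")
```

Under these assumptions the CPU is the weakest link at 10,000 I/Os per second, which calls for 69 disks and 10 controllers. Seven disks load a controller with roughly 65 MB/sec, well under its 320 MB/sec rating, so the controller count is set by the 7-disk limit rather than by bandwidth.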

Real Stuff: A Digital Camera
Digital cameras are embedded computers
with removable, writable, nonvolatile
storage and interesting I/O devices. See the
Sanyo VPC-SX500.

Digital Camera (contd.)
When powered on, the microprocessor
first runs diagnostics on all components
and writes any error messages to the
liquid crystal display (LCD). When a picture
is about to be taken, the photographer
holds the shutter halfway so that the
microprocessor can take a light reading.
The microprocessor then keeps the
shutter open to get the necessary light,
which is captured by a charge-coupled
device (CCD) as red, green, and blue
pixels.

Digital Camera (contd.)
The pixels are then scanned out row by row,
passed through routines for white
balance, color, and aliasing correction, and
then stored in a 4 MB frame buffer. The
next step is to compress the image into a
standard format such as JPEG and store it
in the removable flash memory. The
microprocessor updates the LCD
to show that there is room for one less
picture. The camera has other features
such as video recording and sleep mode,
among many others.

Digital Camera (contd.)
The camera allows the use of a Microdrive
disk instead of CompactFlash memory. Fig
8.15 compares the two.

Digital Camera (contd.)
The electronic brain of the Sanyo camera is an
embedded computer with several special
functions embedded on the chip. These kinds of
chips are called systems on a chip (SOC). An
SOC integrates into a single chip all the parts that
were found on a small printed circuit board in
the past. SOCs reduce size and lower power
compared to less integrated solutions. The SOC
enables the camera to operate on half the
number of batteries and to offer a smaller form
factor than competitors' cameras.
Fig 8.16

The SOC has two buses. The 16-bit bus is for
the many slower I/O devices, such as the Smart
Media interface, program and data memory,
and DMA. The 32-bit bus is for the SDRAM,
the signal processor (which is connected to
the CCD), the Motion JPEG encoder, and the
NTSC/PAL encoder (which is connected to
the LCD). Unlike desktop microprocessors, the
SOC must integrate a large variety of I/O
buses. This 700 mW chip contains
1.8M transistors in a 10.5 x 10.5 mm die
implemented using a 0.35-micron process.

Fallacies and Pitfalls
Fallacy: The rated mean time to failure (MTTF) of
disks is 1,200,000 hours, or almost 140
years, so disks practically never fail.
This number far exceeds the lifetime of a disk.
To make sense of such a large MTTF,
manufacturers argue that the calculation
corresponds to a user who buys a disk and
keeps replacing it every 5 years (the lifespan of
the disk).
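A quick calculation shows what the rated MTTF does and does not imply. The sketch below converts the MTTF to years and then estimates the fraction of disks expected to fail per year, using the common approximation that the annual failure rate is hours per year divided by MTTF. That approximation assumes a constant failure rate, and the fleet of 1,000 disks is a hypothetical example.

```python
HOURS_PER_YEAR = 24 * 365   # 8760, ignoring leap years

mttf_hours = 1_200_000
mttf_years = mttf_hours / HOURS_PER_YEAR

# Approximation (assumes a constant failure rate and MTTF >> 1 year).
annual_failure_rate = HOURS_PER_YEAR / mttf_hours

fleet_size = 1000           # hypothetical number of disks in service
expected_failures_per_year = fleet_size * annual_failure_rate

print(f"MTTF: {mttf_years:.0f} years")
print(f"Annual failure rate: {annual_failure_rate * 100:.2f}%")
print(f"Expected failures per year in a fleet of {fleet_size}: "
      f"{expected_failures_per_year:.1f}")
```

Read this way, the rating says that a little under 1% of disks fail each year, so a large installation should expect regular failures. It does not say that a single disk lasts 137 years.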

Fallacy: Magnetic disk storage is on its last
legs and will be replaced shortly.
This is both a fallacy and a pitfall. Magnetic
bubble memories, optical storage, and
holographic storage have been unsuccessful
contenders. None has matched the
combination of characteristics that favor
magnetic disks: high reliability, nonvolatility,
low cost, reasonable access time, and so on.
Instead, magnetic storage keeps improving at the same
or a faster pace than it has sustained over the past
25 years.

Fallacy: A 100 MB/sec bus can transfer
100 MB of data in 1 sec.
First, you cannot use 100% of any computer
resource. For a bus, you would be fortunate to
get 70% to 80% of the peak bandwidth. The time
to send the address, the time to acknowledge the
signals, and stalls while waiting to use a busy
bus all keep utilization below 100%.
Second, the MB of storage and the MB/sec
of bandwidth do not agree: storage is measured
in base 2 (1 MB = 2^20 bytes), while bandwidth is
quoted in base 10 (1 MB/sec = 10^6 bytes/sec).
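Both effects can be combined in a small calculation. This is a rough sketch under stated assumptions: the 100 MB of data is counted in base 2 (2^20 bytes per MB), the bus rating in base 10 (10^6 bytes/sec per MB/sec), and the bus sustains between 70% and 80% of its peak. The function name is illustrative.

```python
def transfer_time(data_mb, bus_mb_per_sec, utilization):
    """Seconds to move data_mb (base-2 MB) over a bus rated in base-10 MB/sec."""
    data_bytes = data_mb * 2**20
    effective_bytes_per_sec = bus_mb_per_sec * 10**6 * utilization
    return data_bytes / effective_bytes_per_sec

best_case = transfer_time(100, 100, 0.80)
worst_case = transfer_time(100, 100, 0.70)

print(f"At 80% utilization: {best_case:.2f} s")
print(f"At 70% utilization: {worst_case:.2f} s")
```

Under these assumptions the transfer takes roughly 1.3 to 1.5 seconds rather than 1 second.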

Pitfall: Using the peak transfer rate of a
portion of the I/O system to make
performance projections or performance
comparisons.
The components of an I/O system, from the
devices to the controllers to the buses, are
specified using their peak bandwidth. These
peak bandwidth measurements are often
based on unrealistic assumptions about the
system or are unattainable because of other
system limitations. Amdahl's law tells us that
the throughput of an I/O system will be limited
by the lowest-performance component in the
I/O path.

Pitfall: Using magnetic tapes to back up
disks.
This is both a fallacy and a pitfall. Tapes use
technology similar to disks. The cost
difference between disks and tapes is based
on the fact that a rotating disk has lower
access times than sequential tape access.
In the past, a tape could hold the contents of
many disks, and since tape was 10 to 100 times
cheaper per gigabyte than disk, it was a
useful backup medium. Today, disks have improved
much more rapidly than tapes, and tapes have
compatibility problems that are not imposed
on disks.

Pitfall: Trying to provide features only
within the network versus end to end.
The concern is providing at a lower level
features that can only be accomplished at the
highest level, thus only partially satisfying the
communication demand.

Pitfall: Moving functions from the CPU to
the I/O processor, expecting to improve
performance without a careful analysis.
A frequent instance of this pitfall is the use of
intelligent I/O interfaces, which, because of
the higher overhead to set up an I/O request,
can turn out to have worse latency than a
processor-directed activity.
