
Chapter 13: Disk Storage, Basic File Structures, and Hashing

CS 6360.501 Database Design (Fall 2009)
Instructor: Sunan Han
The University of Texas at Dallas

Chapter Outline
 Disk Storage Devices
 Files of Records
 Operations on Files
 Ordered and Unordered Files
 Hashed Files
 Extendible and Linear Hashing Techniques
 RAID Technology
 Storage Area Networks

Slide 13- 2

Primary and Secondary Storage Devices

 Primary storage: random access memory (RAM) with the fastest access time. Volatile in nature – stored data is gone when power is off. Technology limits its size (Moore's Law)
 Main memory: stores data being processed by the CPU: programs and system data
 Cache memory: stores the most frequently used data to reduce the number of accesses to the secondary storage or the main memory
 Stores dynamic databases, such as network routing tables
 Secondary storage: magnetic disks or tapes: massive capacity. Data persists regardless of power. Slower access time than RAM. Inexpensive
 Databases are usually stored in secondary storage

Disk Storage Devices

 Preferred secondary storage device for high storage capacity, non-volatility and low cost
 Data stored as magnetized areas on magnetic disk surfaces
Disk Storage Devices (contd.)

 A track is divided into smaller sectors
 because it usually contains a large amount of information
 The division of a track into sectors is hard-coded on the disk surface and cannot be changed (see example on next slide)
 One type of sector organization calls a portion of a track that subtends a fixed angle at the center a sector
 Note: the number of blocks per track may vary: outer tracks can hold more blocks than inner tracks

Disk Storage Devices (contd.)

 A track is divided into blocks by the computer operating system
 A block may be a subdivision of a sector or a sector itself
 The block size is fixed for each system
 Typical block sizes range from 512 bytes to 4096 bytes
 Whole blocks are transferred between disk and main memory for processing (blocks are the units of data transfer at the operating system's level)

Disk Storage Devices (contd.)

 A read-write head moves to the track that contains the block to be transferred
 Disk rotation moves the block under the read-write head for reading or writing
 A physical disk block (hardware) address consists of:
 a cylinder number (imaginary collection of tracks of the same radius from all recorded surfaces): decides the head position
 the track number or surface number (within the cylinder): decides the disk/surface
 and the block number (within the track): decides the actual location
 Reading or writing a disk block is time consuming because of the seek time and rotational delay (latency)
 Double buffering can be used to speed up the transfer of contiguous disk blocks


Buffering of Blocks

 Storing some blocks of data from the disk in the main memory may accelerate the entire data retrieval process
 The buffering technique is used so that, while transferring data from the disk, the CPU can simultaneously process data already in the memory
 Double buffering: reading and processing consecutive blocks from disk (see example on next slide)

Double Buffering

 Note that the data transfer is directly between the memory and the disk, after being set up by the CPU
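The idea above can be sketched in Python as a toy model (not the slides' code): a reader thread "transfers" blocks from disk into one of two buffers while the main thread processes the other. The 4-byte blocks, the buffer count, and the queue-based handoff are illustrative assumptions.

```python
import threading
import queue

def read_blocks(blocks, filled, free):
    """Producer: simulate the disk filling whichever buffer is free."""
    for block in blocks:
        buf = free.get()          # wait for an empty buffer
        buf[:] = block            # simulate the disk-to-memory transfer
        filled.put(buf)           # hand it to the consumer
    filled.put(None)              # signal end of file

def process_all(blocks):
    """Consumer: process one buffer while the other is being filled."""
    filled, free = queue.Queue(), queue.Queue()
    for _ in range(2):            # exactly two buffers: "double" buffering
        free.put(bytearray(4))
    t = threading.Thread(target=read_blocks, args=(blocks, filled, free))
    t.start()
    out = []
    while (buf := filled.get()) is not None:
        out.append(bytes(buf))    # "process" the block (here: just copy it)
        free.put(buf)             # return the buffer for reuse
    t.join()
    return out
```

With only one buffer, the CPU would have to wait for each transfer to finish; with two, transfer and processing overlap.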

Typical Disk Parameters

(Table of typical disk drive parameters; courtesy of Seagate Technology)

Records

 Fixed and variable length records (for entities)
 Records contain fields (for attributes) which have values of a particular type
 E.g., amount, date, time, age
 Fields themselves may be fixed length or variable length
 Variable length fields can be mixed into one record:
 Separator characters or lengths of fields are needed so that the record can be "parsed."
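The separator approach can be sketched in a few lines of Python; the field values and the choice of the ASCII unit separator character are demo assumptions, not from the slides.

```python
SEP = "\x1f"  # ASCII "unit separator" used as the field delimiter (assumed)

def pack_record(fields):
    """Join variable-length field values into one record string."""
    return SEP.join(fields)

def parse_record(record):
    """Split ("parse") a record back into its field values."""
    return record.split(SEP)
```

The alternative the slide mentions, storing each field's length in front of its value, avoids reserving a separator character that can never appear in data.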


Blocking

 Blocking: refers to storing a number of records in one block on the disk
 Blocking factor (bfr) refers to the number of records per block
 There may be empty space in a block if an integral number of records do not fit in one block
 Unspanned records: records cannot be stored across two or more blocks
 Spanned records: records that are stored in more than one block (to make use of the empty space), or records that exceed the size of one or more blocks and hence span a number of blocks
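The bfr arithmetic for unspanned blocking can be written out directly; the block size, record size, and record count below are example values, not from the slides.

```python
import math

def blocking_factor(block_size, record_size):
    """Unspanned blocking: whole records per block, bfr = floor(B / R)."""
    return block_size // record_size

def blocks_needed(num_records, bfr):
    """Blocks needed to hold r records: b = ceil(r / bfr)."""
    return math.ceil(num_records / bfr)

# e.g. B = 512 bytes, R = 100 bytes: bfr = 5, leaving 512 - 5*100 = 12
# bytes of empty space per block (the space spanned records could reuse)
```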

Spanned and Unspanned Records

(Figure: record layouts under spanned and unspanned organization)

Files of Records

 A file is a sequence of records, where each record is a collection of data values (or data items). A file is a good unit to represent a table, but the mapping is not limited to one file per table
 A file descriptor (or file header) includes information that describes the file, such as the field names and their data types, and the addresses of the file blocks on disk. (It may contain other information such as hashing structures)
 Records are stored in disk blocks
 The blocking factor (bfr) for a file is the (average) number of records of the file stored in a disk block
 A file can have fixed-length records or variable-length records


Files of Records (contd.)

 The physical disk blocks that are allocated to hold the records of a file can be contiguous, linked, or indexed
 In a file of fixed-length records, all records have the same format. Usually, unspanned blocking is used with such files
 Files of variable-length records require additional information to be stored in each record, such as separator characters and field types
 Usually spanned blocking is used with such files

Operations on Files

 OPEN: Readies the file for access, and associates a pointer that will refer to a current file record at each point in time
 FIND: Searches for the first file record that satisfies a certain condition, and makes it the current file record
 FINDNEXT: Searches for the next file record (from the current record) that satisfies a certain condition, and makes it the current file record
 READ: Reads the current file record into a program variable
 INSERT: Inserts a new record into the file and makes it the current file record
 DELETE: Removes the current file record from the file, usually by marking the record to indicate that it is no longer valid
 MODIFY: Changes the values of some fields of the current file record
 CLOSE: Terminates access to the file
 REORGANIZE: Some files need to be reorganized periodically (e.g. sorting)
 READORDERED: Reads the file blocks in order of a specific field of the file
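A few of these operations, and in particular the current-record pointer they share, can be sketched as a toy in-memory Python class. This only models the interface; a real DBMS performs these operations over disk blocks.

```python
class RecordFile:
    """Toy file of records illustrating OPEN / FIND / FINDNEXT / READ."""

    def __init__(self, records):
        self.records = list(records)   # the "file"
        self.current = None            # the current-record pointer

    def open(self):                    # OPEN: ready the file, reset pointer
        self.current = -1

    def find(self, predicate):         # FIND: first record satisfying condition
        for i, rec in enumerate(self.records):
            if predicate(rec):
                self.current = i
                return rec
        return None

    def find_next(self, predicate):    # FINDNEXT: search from current record
        for i in range(self.current + 1, len(self.records)):
            if predicate(self.records[i]):
                self.current = i
                return self.records[i]
        return None

    def read(self):                    # READ: current record into a variable
        return self.records[self.current]
```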

Unordered Files

 Also called a heap or a pile file
 New records are inserted at the end of the file
 A linear search through the file records is necessary to search for a record
 This requires reading and searching half the file blocks on the average, and is hence quite expensive
 Record insertion is quite efficient
 Reading the records in order of a particular field requires sorting the file records

Ordered Files

 Also called a sequential file
 File records are kept sorted by the values of an ordering field
 Insertion is expensive: records must be inserted in the correct order
 It is common to keep a separate unordered overflow (or transaction) file for new records to improve insertion efficiency; this is periodically merged with the main ordered file
 A binary search can be used to search for a record on its ordering field value
 This requires reading and searching log2 of the file blocks on the average, an improvement over linear search
 Reading the records in order of the ordering field is quite efficient


Ordered Files

 Binary search can be applied to an ordered list:
 1. Start from the middle of the list
 2. Compare: stop if found; otherwise select the half of the list that may contain the record, make it the current list, and go to 1

Average Access Times

 The following table shows the average access time to access a specific record for a given type of file (b is the number of blocks the file uses on the disk)

 File type                       Average blocks read to access a record
 Unordered (heap) file           b/2 (linear search)
 Ordered file (binary search)    log2 b
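The two averages above can be written out as a trivial sketch; the value of b used below is just an example.

```python
import math

def avg_blocks_linear(b):
    """Unordered (heap) file: linear search reads b/2 blocks on average."""
    return b / 2

def avg_blocks_binary(b):
    """Ordered file: binary search reads about log2(b) blocks."""
    return math.log2(b)

# For a file of b = 1024 blocks, linear search averages 512 block reads
# while binary search needs only about 10.
```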

Hashing Techniques – Internal Hashing

 Hashing is a special type of file organization for fast file access (vs. ordered log2(n) and unordered (n/2)). The search condition must be an equality on a single field
 Internal hashing is used for fast access of the records in a table in the memory
 A key field (attribute) in a table uniquely identifies the records. If it is used as the address of the table, then no search is necessary: as long as the key of a record is given, the record can be accessed right away
 The process of mapping the key field (K) to the address is called hashing
 A hashing function does the mapping. A simple approach is the remainder operator MOD (note: it generates collisions)
 h(K) = K MOD M, where M is the number of records in the file
 h(K) generates integer numbers from 0 to M-1, referring to the M entries/addresses in the table

Hashing Algorithms

 Algorithm 13.2. The key field (K) assumes a data type of 20 characters. code returns the ASCII value of a character

(a) temp = 1;
    for i = 1 to 20 do temp = temp * code(K[i]) MOD M;
    hash_address = temp MOD M;

(b) i = hash_address; a = i; new_hash_address = i;
    if (location i is occupied)
    then { i = (i+1) MOD M;
           while (i <> a and location i is occupied)
               { i = (i+1) MOD M; }
           if (i == a)
           then { print "List full"; exit; }
           else { new_hash_address = i; }
         }
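Algorithm 13.2 can be transcribed into Python roughly as follows. This is a sketch: the table size M and the demo key are assumptions, and the slide's fixed 20-character key is relaxed to a string of any length.

```python
M = 6  # table size (number of addresses); a demo assumption

def hash_address(key):
    """Algorithm 13.2(a): fold the characters of the key field into an
    address in 0..M-1; ord() plays the role of the slide's code()."""
    temp = 1
    for ch in key:
        temp = (temp * ord(ch)) % M
    return temp % M

def resolve_collision(table, addr):
    """Algorithm 13.2(b): open addressing. Probe forward circularly from
    the hash address until a free slot is found; None means the table is
    full (the slide's "List full")."""
    i = addr
    if table[i] is None:
        return i
    i = (i + 1) % M
    while i != addr and table[i] is not None:
        i = (i + 1) % M
    if i == addr:
        return None
    return i
```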
Internal Hashing Example

h(K) = K MOD 6, hashing on Ssn. Algorithm 13.2 (a) is used to calculate the hash function; obviously, address collisions occur, and Algorithm 13.2 (b) is used to resolve them:

 Name  Ssn  Job  Salary   h(K) = K MOD 6         new_hash
 John  23   A    100      h(23) = 5              ==> 5
 Kyle  14   B    200      h(14) = 2              ==> 2
 Jay   32   A    150      h(32) = 2 (collision)  ==> 3
 Sue   98   C    100      h(98) = 2 (collision)  ==> 4
 Deb   39   B    200      h(39) = 3 (collision)  ==> 0
 Sean  18   C    150      h(18) = 0 (collision)  ==> 1

Resulting memory storage (address: record):
 0: Deb 39 B 200
 1: Sean 18 C 150
 2: Kyle 14 B 200
 3: Jay 32 A 150
 4: Sue 98 C 100
 5: John 23 A 100

Discussion on Collision Resolution

 Open addressing: from the hash address, check the subsequent positions in order until an unused (open) position is found (Algorithm 13.2 (b))
 Chaining: additional overflow space is provided for collided hash addresses. All collided records are chained together (see example on next slide)
 Multiple hashing: apply a second hash function if the first results in a collision. If another collision results, use open addressing or apply a third hash function and then use open addressing if necessary
 Trade-off of simplicity, space and computation time
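Chaining can be sketched with a Python dict of lists standing in for the overflow space; M and the demo keys below are assumptions.

```python
M = 6  # number of main hash addresses (demo value)

def insert_chained(table, key, value):
    """Collision resolution by chaining: each address holds a chain (list),
    and all records that collide on the same address are linked into it."""
    table.setdefault(key % M, []).append((key, value))

def lookup_chained(table, key):
    """Walk the chain at the key's hash address."""
    for k, v in table.get(key % M, []):
        if k == key:
            return v
    return None
```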

Collision Resolution by Chaining

(Figure: collided records chained together in an overflow area)

External Hashing – Hashed Files

 Hashing for disk files is called external hashing
 The file blocks are divided into M equal-sized buckets, numbered bucket0, bucket1, ..., bucketM-1
 Typically, a bucket corresponds to one (or a fixed number of) disk block
 One of the file fields is designated to be the hash key of the file
 The record with hash key value K is stored in bucket i, where i = h(K), and h is the hashing function
 Search is very efficient on the hash key
 Collisions occur when a new record hashes to a bucket that is already full
 An overflow file is kept for storing such records
 Overflow records that hash to each bucket can be linked together
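External hashing with a per-bucket overflow chain can be sketched as follows; the bucket count, bucket capacity, and demo keys are assumptions.

```python
M = 4            # number of buckets (demo value)
BUCKET_CAP = 2   # records per bucket, i.e. one "disk block" here

def h(key):
    return key % M

def insert(buckets, overflow, key):
    """Store the record in bucket h(K); if that bucket (block) is full,
    the record goes to the overflow area, chained per bucket."""
    b = h(key)
    if len(buckets[b]) < BUCKET_CAP:
        buckets[b].append(key)
    else:
        overflow.setdefault(b, []).append(key)

def search(buckets, overflow, key):
    """Check the home bucket first, then its overflow chain."""
    b = h(key)
    return key in buckets[b] or key in overflow.get(b, [])
```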
Bucket Number to Block Matching

(Figure: h(K) yields a bucket number, which is matched to a disk block address via a table stored in the file header)

Hashed Files – Overflow Handling

(Figure: overflow records from the same bucket are chained together in the overflow area)

Hashed Files Discussions

 To reduce overflow records, a hash file is typically kept 70-80% full
 The hash function h should distribute the records uniformly among the buckets
 Otherwise, search time will be increased because many overflow records will exist
 Main disadvantages of static external hashing:
 Fixed number of buckets M is a problem if the number of records in the file grows or shrinks
 Ordered access on the hash key is quite inefficient (requires sorting the records)

Dynamic File Hashing

 Dynamic hashing techniques
 Hashing techniques are adapted to allow the dynamic growth and shrinking of the number of file records
 Extendible hashing
 Linear hashing


Extendible Hashing

 Extendible hashing uses the binary representation of the hash value h(K) in order to access a directory
 The directory is an array of size 2^d, where d is called the global depth and is the number of binary digits used for the directory addresses; it determines the number of entries of the directory
 d can be increased or decreased by one at a time, doubling or halving the size of the directory
 Assume a record's key field value is K. Then the first d binary digits of h(K) determine which directory entry it belongs to, and therefore which bucket it belongs to

Extendible Hashing (contd.)

 The directories can be stored on disk, and they expand or shrink dynamically
 Each directory entry points to a bucket, which points to the disk block/blocks that contain the stored records
 Each bucket has a local depth d', d' ≤ d
 An insertion in a disk block that is full causes the block/bucket to split into two by increasing the local depth (e.g. 01 becomes 010 and 011), and the records are redistributed among the two blocks based on their hashed keys with the new local depth d'
 Extendible hashing does not require an overflow area
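A compact, illustrative Python sketch of these rules follows. The bucket capacity and using K itself as h(K) are demo assumptions, and the sketch indexes the directory on the low-order d bits of the key rather than the leading bits, which keeps the arithmetic simple but is otherwise the same mechanism (directory of 2^d entries, local depth d' per bucket, bucket split on overflow, directory doubling when d' = d).

```python
class ExtendibleHash:
    """Toy extendible hashing: directory of 2**d entries, buckets carry a
    local depth d' <= d, no overflow area."""

    CAP = 2  # records per bucket (demo value)

    def __init__(self):
        self.d = 1                                 # global depth
        b0 = {"depth": 1, "keys": []}
        b1 = {"depth": 1, "keys": []}
        self.dir = [b0, b1]                        # 2**d directory entries

    def _index(self, key):
        return key % (2 ** self.d)                 # low-order d bits pick entry

    def insert(self, key):
        while True:
            b = self.dir[self._index(key)]
            if len(b["keys"]) < self.CAP:
                b["keys"].append(key)
                return
            if b["depth"] == self.d:               # no spare bit left:
                self.d += 1                        # double the directory
                self.dir = self.dir + self.dir
            self._split(b)                         # then split the full bucket

    def _split(self, b):
        nd = b["depth"] + 1                        # new local depth d' + 1
        lo = {"depth": nd, "keys": []}
        hi = {"depth": nd, "keys": []}
        for k in b["keys"]:                        # redistribute on one more bit
            (hi if (k >> (nd - 1)) & 1 else lo)["keys"].append(k)
        for i, entry in enumerate(self.dir):       # repoint directory entries
            if entry is b:
                self.dir[i] = hi if (i >> (nd - 1)) & 1 else lo

    def find(self, key):
        return key in self.dir[self._index(key)]["keys"]
```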

Extendible Hashing (example)

(Figure: a directory with global depth d = 3 and buckets of local depth d' = 2 and d' = 3.
 A bucket with d' = 2, e.g. prefix 01, before expansion: if full, expand to d' = 3 and redistribute records based on 010 and 011.
 A bucket with d' = 3, e.g. prefix 110, before expansion: if full, expand d to 4 and redistribute records based on 1100 and 1101; the directory then has 16 entries.)

Linear Hashing

 Linear hashing does require an overflow area but does not use a directory
 Initial hashing is h1(K) = K MOD M, M = number of buckets
 Buckets are split in linear order as overflows occur: when the first overflow occurs, bucket0 is split into two; when the second overflow occurs, bucket1 is split into two, ...
 A variable n keeps track of which bucket has just been split (i.e. how many have been split). When a bucket is split, its records are redistributed between the two buckets based on a new hash function h_{i+1}(K) = K MOD 2^i·M (this is possible because of the property h_{i+1}(K) = either h_i(K) or h_i(K) + 2^(i-1)·M)
 At each bucket, an overflow chain is needed for records that cause overflows while the bucket's turn to split is yet to come
Linear Hashing (example)

(Figure: M = 5 buckets, h1(K) = K MOD 5. Buckets 0, 1 and 2 have been split in turn, creating buckets 5, 6 and 7, so n = 3. When h1(K) < n, or when splitting, h2(K) = K MOD 2M is used.)

Linear Hashing (contd.)

 In retrieving a record with field value K, if h(K) < n, then re-hash it using h_{i+1}(K), because the record was in a bucket which has been split. So n divides the records as far as which hash function is used
 When n = M, all original M buckets have been split and h_{i+1} applies to all buckets
 n can then be reset to zero, and any new overflow leads to the use of a new hash function h_{i+2}(K) = K MOD 4M
 In general, during pass j of splitting (j = 0, 1, 2, ...), the hash function h_{i+j}(K) = K MOD 2^j·M is applied first; records that hash to a bucket number < n are re-hashed with h_{i+j+1}(K) = K MOD 2^(j+1)·M

Linear Hashing Search Algorithm

 Algorithm 13.3. Given field value K of a record, decide its hash value (a is the hash value, a zero or positive integer pointing to a bucket):

a = hj(K);
if (a < n)
then { a = hj+1(K); }

Parallelizing Disk Access using RAID Technology

 Secondary storage technology must take steps to keep up in performance and reliability with processor technology
 A major advance in secondary storage technology is represented by the development of RAID, which stands for Redundant Arrays of Inexpensive Disks
 The main goal of RAID is to even out the widely different rates of performance improvement of disks against those in memory and microprocessors. (Disk performance – access speed, capacity, etc. – has been found to improve at a much slower rate than memory and CPU)
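The split rule and Algorithm 13.3's addressing can be sketched together in Python. This is a toy model: M, the bucket capacity, and using K itself as the hash input are demo assumptions, and overflow chains are modeled implicitly by letting a bucket list grow past its capacity.

```python
class LinearHash:
    """Toy linear hashing: buckets split in linear order as overflows
    occur; n tracks the next bucket to split."""

    CAP = 2  # records per bucket before an overflow triggers a split

    def __init__(self, M=4):
        self.M = M                        # current base modulus for h_i
        self.n = 0                        # next bucket to split
        self.buckets = [[] for _ in range(M)]

    def _addr(self, key):
        a = key % self.M                  # Algorithm 13.3: a = h_i(K)
        if a < self.n:                    # bucket a was already split:
            a = key % (2 * self.M)        # re-hash with h_{i+1}(K)
        return a

    def insert(self, key):
        a = self._addr(key)
        self.buckets[a].append(key)
        if len(self.buckets[a]) > self.CAP:
            self._split()                 # split the NEXT bucket in line,
                                          # not necessarily the one that overflowed

    def _split(self):
        self.buckets.append([])           # image bucket n + M
        old, self.buckets[self.n] = self.buckets[self.n], []
        self.n += 1
        for k in old:                     # redistribute with h_{i+1}
            self.buckets[k % (2 * self.M)].append(k)
        if self.n == self.M:              # pass complete: h_{i+1} everywhere
            self.n = 0
            self.M *= 2

    def find(self, key):
        return key in self.buckets[self._addr(key)]
```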


RAID Technology

 A natural solution is a large array of small independent disks acting as a single higher-performance logical disk
 A concept called data striping is used, which utilizes parallelism to improve disk performance
 Bit-level striping and block-level striping
 Data striping distributes data transparently over multiple disks to make them appear as a single large and fast disk
 Redundancy is necessary. One technique is called mirroring (shadowing): data is stored redundantly on two identical disks, but they are treated as one logical disk
 The reliability is significantly improved: for each pair of mirrored disks, the MTBF becomes 190,304 years

Improving Reliability

 Assume that a disk's mean time between failures (MTBF) is about 200,000 hours (22.8 years)
 For 100 disks running in RAID, the MTBF becomes 2,000 hours, or 83.3 days. This is very bad: if the mean time to repair (MTTR) is 24 hours, the downtime per year is (365/83.3)*24 = 105 hours, or 4.38 days, which means a lot of data loss. The unavailability (U) is 24/(2000+24) = 0.012, so the availability (A) = 1 – 0.012 = 0.988. The industry standard usually requires five 9's (0.99999 = 99.999%, about 5 minutes downtime/year)
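The reliability arithmetic above can be checked in a few lines of Python; all the numbers are taken from the slide itself.

```python
# A disk's MTBF, from the slide: 200,000 hours (about 22.8 years).
disk_mtbf_hours = 200_000

# 100 independent disks: the array fails roughly 100x as often.
array_mtbf_hours = disk_mtbf_hours / 100      # 2,000 hours, about 83.3 days

# With a 24-hour mean time to repair (MTTR):
mttr_hours = 24
downtime_per_year = (365 / (array_mtbf_hours / 24)) * 24   # about 105 hours

unavailability = mttr_hours / (array_mtbf_hours + mttr_hours)   # about 0.012
availability = 1 - unavailability                               # about 0.988
```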

RAID Organizations and Levels

 Different RAID organizations were defined based on different combinations of two factors: the granularity of data interleaving (striping) and the redundancy pattern
 RAID level 0 has no redundant data and hence has the best write performance, at the risk of data loss
 RAID level 1 uses mirrored disks
 RAID level 2 uses memory-style redundancy by using Hamming codes, which contain parity bits for distinct overlapping subsets of components. Level 2 includes both error detection and correction
 RAID level 3 uses a single parity disk, relying on the disk controller to figure out which disk has failed
 RAID levels 4 and 5 use block-level data striping, with level 5 distributing data and parity information across all disks
 RAID level 6 applies the so-called P + Q redundancy scheme using Reed-Solomon codes to protect against up to two disk failures by using just two redundant disks

(Figure: multiple levels of RAID)
Direct-Attached Storage (DAS)

(Figure: servers with directly attached disks; JBOD = Just a Bunch Of Disks, with no redundancy and no parallelism)

Network-Attached Storage (NAS)

(Figure: storage device attached to an Ethernet LAN; LAN = Local Area Network)

Storage Area Networks

(Figure: servers and storage interconnected over Ethernet, Ethernet/Fibre Channel, SONET/DWDM/fiber, and Ethernet over fiber links)

Storage Area Networks (contd.)

 The demand for higher storage has risen considerably in recent times
 Organizations have a need to move from a static, fixed data-center-oriented operation to a more flexible and dynamic infrastructure for information processing
 Thus they are moving to the concept of Storage Area Networks (SANs)
 A SAN is a network of interconnected computers and data storage devices
 In a SAN, online storage peripherals are configured as nodes on a high-speed network and can be attached to and detached from servers in a very flexible manner
 This allows storage systems to be placed at longer distances from the servers and provides different performance and connectivity options
Storage Area Networks (contd.)

 Advantages of SANs are:
 Flexible many-to-many connectivity among servers and storage devices using Fibre Channel hubs and switches
 Up to 10 km separation between a server and a storage system using appropriate fiber optic cables (this can be significantly extended by such technology as dense wavelength division multiplexing)
 Better isolation capabilities allowing non-disruptive addition of new peripherals and servers
 SANs face the problem of combining storage options from multiple vendors and dealing with evolving standards of storage management software and hardware

Summary

 Disk Storage Devices
 Files of Records
 Operations on Files
 Ordered and Unordered Files
 Hashed Files
 Extendible and Linear Hashing Techniques
 RAID Technology
 NAS and SAN

Assignment #11

 Page 507: 13.27, 13.28
 Due 11/16/09
