
Modes of file access

Serial file

A serial file is one in which records are stored, one after the other, in the order in which they are
added, not in the order of a key field.
This means that new records are stored at the end of the file.

The following shows a serial file that is used to store the number of entries for EdExcel GCSE
Mathematics. The entries were received in the order: Kettlewood, Queens Park, St Marys, Wilton
High, West Orling.
Centre Number   Centre Name   No of Candidates
27102           Kettlewood    85
38240           Queens Park   103
64715           St Marys      121
30446           Wilton High   156
12304           West Orling   105

Note that the key field in this file would be Centre Number (it uniquely identifies each school).
Both disks and tapes can be used to store a file serially.
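
The behaviour of a serial file can be sketched in a few lines of Python. This is only an illustration: the file is modelled as an in-memory list, and the names add_serial and find_serial are made up for this example. It shows that new records go on the end and that a search has to read from the beginning.

```python
# A serial file modelled as a list (an assumption made for illustration).
# Each record is (centre_number, centre_name, no_of_candidates).
serial_file = []

def add_serial(file, record):
    """Store a new record at the end of the file, whatever its key."""
    file.append(record)

def find_serial(file, centre_number):
    """Read from the start; return (record, number of records read)."""
    for reads, record in enumerate(file, start=1):
        if record[0] == centre_number:
            return record, reads
    return None, len(file)

# Records arrive in the order given in the text above.
for rec in [
    (27102, "Kettlewood", 85),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
    (30446, "Wilton High", 156),
    (12304, "West Orling", 105),
]:
    add_serial(serial_file, rec)

record, reads = find_serial(serial_file, 12304)
print(record, reads)  # (12304, 'West Orling', 105) 5
```

West Orling was added last, so all five records have to be read before it is found.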

Sequential file
A sequential file is one in which the records are stored, one after the other, in the order of the key
field.
The following shows a sequential file that is used to store the number of entries for EdExcel GCSE
Mathematics. The entries were added in the order: Kettlewood, Queens Park, St Marys, Wilton
High, West Orling but they are stored in the order of the key field Centre Number:
Centre Number   Centre Name   No of Candidates
12304           West Orling   105
27102           Kettlewood    85
30446           Wilton High   156
38240           Queens Park   103
64715           St Marys      121

As with a serial file, both tape and disks can be used to store a file sequentially and access to the
records must take place from the beginning of the file.
Benefits
Sequential files allow the records to be displayed in the order of the key field. Keeping the records
in order makes the process of adding a record slower, but significantly speeds up searches.
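
A minimal Python sketch of this trade-off follows. The file is again modelled as a list, and add_sequential and find_sequential are hypothetical names. Adding a record means finding the right position and shifting later records along. A search can stop as soon as it passes the place where the key would be.

```python
import bisect

# A sequential file modelled as a list kept in key order (illustrative only).
sequential_file = []

def add_sequential(file, record):
    """Insert the record at the position that keeps the keys in order."""
    keys = [r[0] for r in file]
    position = bisect.bisect_left(keys, record[0])
    file.insert(position, record)  # later records are shifted along

def find_sequential(file, centre_number):
    """Read from the start, stopping early once the keys pass the target."""
    for reads, record in enumerate(file, start=1):
        if record[0] == centre_number:
            return record, reads
        if record[0] > centre_number:
            return None, reads  # the key cannot appear further on
    return None, len(file)

for rec in [
    (27102, "Kettlewood", 85),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
    (30446, "Wilton High", 156),
    (12304, "West Orling", 105),
]:
    add_sequential(sequential_file, rec)

print([r[0] for r in sequential_file])          # [12304, 27102, 30446, 38240, 64715]
print(find_sequential(sequential_file, 20000))  # (None, 2)
```

The search for the missing key 20000 gives up after two reads, because 27102 is already larger than 20000. In a serial file every record would have to be read.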

Indexed sequential file


An indexed sequential file is one in which the records are stored, one after the other, in the order of
the key field, but which also has an index that enables records to be accessed directly.
Index
An index is a file with two fields, created from the main file, which contains a list of:
the key fields (sorted sequentially);
pointers to where the records can be found in the main file.
Indexed sequential files are useful when:
it is sometimes necessary to process all the records in sequential order; and
it is sometimes necessary to access individual records randomly.
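
The idea of an index can be sketched in Python as follows. This is an illustration under simple assumptions: the main file is a list in key order, a pointer is just a position in that list, and find_indexed is a made-up name.

```python
import bisect

# Main file, stored in key order (Centre Number).
main_file = [
    (12304, "West Orling", 105),
    (27102, "Kettlewood", 85),
    (30446, "Wilton High", 156),
    (38240, "Queens Park", 103),
    (64715, "St Marys", 121),
]

# The index has two fields: the key, and a pointer to the record in the main file.
index_keys = [record[0] for record in main_file]
index_pointers = list(range(len(main_file)))

def find_indexed(centre_number):
    """Look the key up in the index, then go straight to the record."""
    i = bisect.bisect_left(index_keys, centre_number)
    if i < len(index_keys) and index_keys[i] == centre_number:
        return main_file[index_pointers[i]]
    return None

# Direct access to one record through the index.
print(find_indexed(38240))  # (38240, 'Queens Park', 103)

# Sequential processing of every record, in key order.
total_candidates = sum(record[2] for record in main_file)
print(total_candidates)  # 570
```

The same file serves both kinds of access: the index for individual records, and the main file itself when every record is processed in order.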

Examples of indexed sequential files


Company employee file
At the end of each month all the records will be processed sequentially, in order to produce
payslips. However, some records will need to be accessed randomly at other times, for example
when an employee changes address.
A school's student file
When an attendance report is printed, the file will be accessed sequentially, but when the details of
an individual student are required the index will be used to find the required record quickly.

Random (direct) access file


A random access file is one in which a record can be written or retrieved without first examining
other records.
A random access file must be stored on disk and the disk address is calculated from the primary
key.
In its simplest form a record with a primary key of 1 will be stored at block 1, a record with a
primary key of 2 will be stored at block 2, a record with a primary key of 3 will be stored at block 3,
and so on.

It should be noted that this very simple method where [disk address] = [primary key] is very
inefficient in respect of disk space. For example:

If the lowest primary key is 1001, then all the disk space below block 1001 will be wasted.
If there are some values which the primary key never takes (for example odd values), these
storage spaces will be wasted.
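
The scale of the waste can be checked with a short calculation. The sketch below assumes the five centre numbers used in the earlier examples are the primary keys and that blocks are numbered from 1.

```python
# With [disk address] = [primary key], blocks 1 up to the highest key are reserved.
centre_numbers = [27102, 38240, 64715, 30446, 12304]

blocks_reserved = max(centre_numbers)  # blocks 1..64715
blocks_used = len(centre_numbers)      # one block per record
blocks_wasted = blocks_reserved - blocks_used

print(blocks_reserved, blocks_used, blocks_wasted)  # 64715 5 64710
```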

In order to be more efficient with the use of disk space, random access files calculate disk addresses
by using a hashing algorithm (also known as just hashing).

Hashing

Hashing is a calculation that is performed on a primary key in order to calculate the storage address
of a record.
A hashing algorithm will typically divide the primary key by the number of disk blocks that are
available for storage, work out the remainder and add the start address. The answer will be the
storage address of the record.
[disk address] = [primary key] MOD [number of blocks] + [start address]
Example
If a file was to be stored on the first 5000 blocks of a disk, the start address is 0, so:
[disk address] = [primary key] MOD 5000
That is, the primary key of each of the records would be divided by 5000 and the remainder would
be the disk address for the record.
This means that a record with primary key of 27102 would be stored at the disk address calculated
as follows:
27102 / 5000 = 5 remainder 2102

This means that the disk address for this record will be 2102.
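
The formula can be written as a short Python function. The name disk_address and the default start address of 0 are choices made for this sketch. Python's % operator gives the remainder, which is what MOD means in the formula.

```python
def disk_address(primary_key, number_of_blocks, start_address=0):
    """[disk address] = [primary key] MOD [number of blocks] + [start address]"""
    return primary_key % number_of_blocks + start_address

# The worked example: 27102 / 5000 = 5 remainder 2102
print(divmod(27102, 5000))        # (5, 2102)
print(disk_address(27102, 5000))  # 2102
```

If the file began at some other block, for instance a hypothetical start address of 1000, then disk_address(27102, 5000, 1000) would give 3102.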

The table shows some other disk addresses calculated using the same hashing algorithm:
Centre Number   Centre Name   No of Candidates   Disk Address
27102           Kettlewood    85                 2102
38240           Queens Park   103                3240
64715           St Marys      121                4715
30446           Wilton High   156                446
12304           West Orling   105                2304

Problems with hashing


One problem that could occur with hashing is that a block may already contain a record and be full.
For example, records with key fields of 38240 and 43240 will both be assigned a disk address of
3240.
If this happens then the new record will need to be written somewhere else. Two common ways of
determining this alternative location are:
The record can be written to the next available block. Note that if the last block is full,
the search for an available space continues from the first block.
The record can be written to a separate overflow area, with a tag placed in the calculated
location to indicate exactly where in this overflow area the record can be found.
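
Both approaches can be sketched in Python. This is a simplified illustration, not a description of how any particular filing system does it. Each block holds one key only, the disk is a list, and the names store_next_available and store_with_overflow are made up for this example.

```python
def store_next_available(blocks, key):
    """Method 1: try the hashed block, then each following block,
    going back to the first block after the last one."""
    number_of_blocks = len(blocks)
    home = key % number_of_blocks
    for step in range(number_of_blocks):
        address = (home + step) % number_of_blocks
        if blocks[address] is None:
            blocks[address] = key
            return address
    raise RuntimeError("no free block left")

def store_with_overflow(blocks, overflow, key):
    """Method 2: put a colliding key in the overflow area and leave a
    tag in the calculated block saying where it went."""
    home = key % len(blocks)
    if blocks[home] is None:
        blocks[home] = {"key": key, "tags": []}
        return ("main", home)
    overflow.append(key)
    blocks[home]["tags"].append(len(overflow) - 1)
    return ("overflow", len(overflow) - 1)

blocks = [None] * 5000
print(store_next_available(blocks, 38240))  # 3240
print(store_next_available(blocks, 43240))  # 3241

tagged_blocks = [None] * 5000
overflow_area = []
print(store_with_overflow(tagged_blocks, overflow_area, 38240))  # ('main', 3240)
print(store_with_overflow(tagged_blocks, overflow_area, 43240))  # ('overflow', 0)
```

With the first method, key 43240 ends up in block 3241 because 3240 is taken. With the second, it goes to position 0 of the overflow area and block 3240 keeps a tag pointing to it.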
