The operating system (OS) must provide and manage hardware resources as well as
provide an interface between the user and the machine and between applications
software and the machine. The OS must also provide other services such as data
security.
Originally, if a program needed input, the program itself had to contain the code to
do this; the same was true if output was required. This led to duplication of code, so the idea
of an OS was born. The OS contained the necessary input and output functions that
could be called by an application. Similarly, disk input and output routines were
incorporated into the OS. This led to the creation of subroutines to do these simple
tasks, such as read a character from a keyboard or send a character to a printer. The
joining together of all these basic input and output routines led to the input-output
control system (IOCS). Originally, the IOCS could only read a punched card or send
data to a card punch. However, as new input and output media, such as magnetic tape
and disk, were developed the IOCS became more complex.
Another complication was added when assembly and high-level languages were
developed as the machine did not use these languages. Machines use binary codes for
very simple instructions. With the development of these new types of programming
language a computer would have to
The system had now become too complex for the human user to be able to organise it
all. Also, as the processor could work much faster than the manual operator and the
input and output devices, much time was wasted.
Further, to make full use of the processor, more than one program should be stored in
memory and the processor should give time to each of the programs. Suppose two
programs are stored in memory and, if one is using an input or output device (both
very slow compared to the processor), it makes sense for the other program to use the
processor. In fact this can be extended to more than two programs as shown in Fig.
3.1.a.1.
The OS must now manage the memory so that all three programs shown in Fig.
3.1.a.1 are kept separate as well as any data that they use. It must also schedule the
jobs into a sequence that makes best use of the processor.
[Fig. 3.1.a.1: timeline for several programs (Program A, ..., Program C) showing the periods labelled 'Using processor' and 'I/O required', and the periods when the processor is idle.]
The I/O phase should not hold up the processor too much which can easily happen if
the I/O devices are very slow, like a keyboard or printer. This can be overcome by
using Simultaneous Peripheral Operations On-Line (spooling). The idea is to store all
input and output on a high-speed device such as a disk. Fig. 3.1.a.2 shows how this
may be achieved.
[Fig. 3.1.a.2: input device, read process, input spool, application program, output spool, write process, output device.]
Another problem is that programs may not be loaded into the same memory locations
each time they are loaded. For example, suppose that three programs are loaded in
the order A, B, C on one occasion and in the order C, A, B on another occasion. The
results are shown in Fig. 3.1.a.3.
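[Fig. 3.1.a.3: two memory maps, each starting with the OS and ending with free space. In the first the programs are stored in the order A, B, C; in the second they are stored in the order C, A, B.]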
A further problem occurs if two or more users wish to use the same program at the
same time. For example, suppose user X and user Y both wish to use a compiler for
C++ at the same time. Clearly it is a waste of memory if two copies of the compiler
have to be loaded into main memory at the same time. It would make much more
sense if user X's program and user Y's program are stored in main memory together
with a single copy of the compiler as shown in Fig. 3.1.a.4.
[Fig. 3.1.a.4: memory map showing the OS, user X's program and data, user Y's program and data, a single copy of the compiler, and free space.]
Now the two users can use the compiler in turns and will want to use different parts of
the compiler. Also note that there are two different sets of data for the compiler, user
X's program and user Y's program. These two sets of data and the outputs from the
compiler for the two programs must be kept separate. Programs such as this compiler,
working in the way described, are called re-entrant.
Memory management, scheduling and spooling are described in more detail in the
following Sections.
Distributed systems have operating systems that arrange for the sharing of the
resources of the system by the users of that system. No action is required from the
user. For example, when a user asks for a particular resource in a network
system, the OS allocates the resource in a way that is transparent to the
user.
3.1 (b) Interrupts
[Fig. 3.1.b.1: flowchart of the processing cycle: Start, Fetch instruction, Execute instruction, End.]
• I/O interrupt
o Generated by an I/O device to signal that a job is complete or an error
has occurred. E.g. printer is out of paper or is not connected.
• Timer interrupt
o Generated by an internal clock indicating that the processor must
attend to time critical activities (see scheduling later).
• Hardware error
o For example, power failure which indicates that the OS must close
down as safely as possible.
• Program interrupt
o Generated due to an error in a program such as violation of memory
use (trying to use part of the memory reserved by the OS for other use)
or an attempt to execute an invalid instruction (such as division by
zero).
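[Fig. 3.1.b.2: flowchart of the processing cycle with an interrupt check: Fetch instruction, Execute instruction, then the decision 'Is there an interrupt?' with No and Yes branches, and End.]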
This diagram shows that, after the execution of an instruction, the OS must see if an
interrupt has occurred. If one has occurred, the OS must service the interrupt if it is
more important than the task already being carried out (see priorities later). This
involves obeying a new set of instructions. The real problem is 'how can the OS
arrange for the interrupted program to resume from exactly where it left off?' In order
to do this the contents of all the registers in the processor must be saved so that the OS
can use them to service the interrupt. Chapter 3.3 explains registers that have to have
their contents stored as well as explaining the processing cycle in more detail.
Another problem the OS has to deal with happens if an interrupt occurs while another
interrupt is being serviced. There are several ways of dealing with this but the
simplest is to place the interrupts in a queue and only allow return to the originally
interrupted program when the queue is empty. Alternative systems are explained in
Section 3.1.c. Taking the simplest case, the order of processing is shown in
Fig. 3.1.b.3.
[Fig. 3.1.b.3: flowchart: Start, Fetch instruction, Execute instruction, then the decision 'Is there an interrupt in the interrupt queue?' with No and Yes branches.]
The queue of interrupts is the normal first in first out (FIFO) queue and holds
indicators to the next interrupt that needs to be serviced.
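The simplest case described above can be sketched in Python. This is only an illustration: the instruction names, the `run` function and the way interrupts are supplied to it are assumptions made for the example, not part of any real OS.

```python
from collections import deque


def run(program, interrupts_at):
    """Simulate the cycle of Fig. 3.1.b.3.

    program       -- list of instruction names (hypothetical)
    interrupts_at -- dict mapping an instruction index to the interrupts
                     raised while that instruction is executing
    Returns a log of the order in which work was carried out.
    """
    queue = deque()  # FIFO queue of interrupts waiting to be serviced
    log = []
    for index, instruction in enumerate(program):
        log.append(f"execute {instruction}")
        queue.extend(interrupts_at.get(index, []))
        # Only return to the interrupted program when the queue is empty.
        while queue:
            log.append(f"service {queue.popleft()}")
    return log


trace = run(["LOAD", "ADD", "STORE"], {0: ["timer", "printer"]})
```

The trace shows both interrupts being serviced, in the order in which they were raised, before the program resumes with its next instruction.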
3.1 (c) Scheduling
One of the tasks of the OS is to arrange the jobs that need to be done into an
appropriate order. The order may be chosen to ensure that maximum use is made of
the processor; another order may make one job more important than another. In the
latter case the OS makes use of priorities.
Suppose the processor is required by program A, which is printing wage slips for the
employees of a large company, and by program B, which is analysing the annual,
world-wide sales of the company which has a turnover of many millions of pounds.
Program A makes little use of the processor and is said to be I/O bound. Program B
makes a great deal of use of the processor and is said to be processor bound.
If program B has priority over program A for use of the processor, it could be a long
time before program A can print any wage slips. This is shown in Fig. 3.1.c.1.
[Fig. 3.1.c.1: timeline of processor use by programs A and B when B has priority over A.]
Fig. 3.1.c.2 shows what happens if A is given priority over B for use of the processor.
This shows that the I/O bound program can still run in a reasonable time and much
better throughput is achieved.
Fig. 3.1.c.2
The objectives of scheduling are to
To achieve these objectives some criteria are needed in order to determine the order in
which jobs are executed. The following is a list of criteria which may be used to
determine a schedule which will achieve the above objectives.
• Priority. Give some jobs a greater priority than others when deciding which
job should be given access to the processor.
• I/O or processor bound. If a processor bound job is given the main access to
the processor it could prevent the I/O devices being serviced efficiently.
• Type of job. Batch processing, on-line and real-time jobs all require different
response times.
• Resource requirements. The amount of time needed to complete the job, the
memory required, I/O and processor time.
• Resources used so far. The amount of processor time used so far, how much
I/O used so far.
• Waiting time. The time the job has been waiting to use the system.
[Fig. 3.1.c.3: state diagram with the states READY, RUNNING and BLOCKED; jobs enter the system at READY and leave the system from RUNNING.]
When entering the system a job is placed in the ready queue by a part of the OS called
the High Level Scheduler (HLS). The HLS makes sure that the system is not
overloaded.
Sometimes it is necessary to swap jobs between the main memory and backing store
(see Memory Management in Section 3.1.d). This is done by the Medium Level
Scheduler (MLS).
Moving jobs in and out of the ready state is done by the Low Level Scheduler (LLS).
The LLS decides the order in which jobs are to be placed in the running state. There
are many policies that may be used to do scheduling, but they can all be placed in one
of two classes. These are pre-emptive and non-pre-emptive policies.
A pre-emptive scheme allows the LLS to remove a job from the running state so that
another job can be placed in the running state. In a non-pre-emptive scheme each job
runs until it no longer requires the processor. This may be because it has finished or
because it needs an I/O device.
• First come first served (FCFS)
o simply means that the first job to enter the ready queue is the first to
enter the running state. This favours long jobs.
• Shortest job first (SJF)
o simply means sort jobs in the ready queue in ascending order of time
expected to be needed by each job. New jobs are added to the queue in
such a way as to preserve this order.
• Round robin (RR)
o this gives each job a maximum length of processor time (called a time
slice) after which the job is put at the back of the ready queue and the
job at the front of the queue is given use of the processor. If a job is
completed before the maximum time is up it leaves the system.
• Shortest remaining time (SRT)
o the ready queue is sorted on the amount of expected time still required
by a job. This scheme favours short jobs even more than SJF. Also
there is a danger of long jobs being prevented from running.
• Multi-level feedback queues (MFQ)
o involves several queues of different priorities with jobs migrating
downwards.
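The effect of some of these policies can be seen in a small Python sketch. The processor times (24, 3 and 3 units) and the time slice of 4 units are made-up figures, and the function names are assumptions for the example. All jobs are taken to be in the ready queue at the start.

```python
from collections import deque


def average_wait(times_needed):
    """Average waiting time when jobs run to completion in the given order."""
    waited, clock = 0, 0
    for time_needed in times_needed:
        waited += clock        # this job waited for every job before it
        clock += time_needed
    return waited / len(times_needed)


def round_robin(jobs, time_slice):
    """Return the order in which jobs finish under round robin.

    jobs -- dict of job name -> processor time needed, in ready-queue order
    """
    ready = deque(jobs.items())
    finished = []
    while ready:
        name, remaining = ready.popleft()
        if remaining <= time_slice:
            finished.append(name)  # job completes and leaves the system
        else:
            # Time slice used up: go to the back of the ready queue.
            ready.append((name, remaining - time_slice))
    return finished


times = [24, 3, 3]
fcfs_wait = average_wait(times)          # (0 + 24 + 27) / 3 = 17.0
sjf_wait = average_wait(sorted(times))   # (0 + 3 + 6) / 3 = 3.0
finish_order = round_robin({"A": 24, "B": 3, "C": 3}, time_slice=4)
```

With these figures FCFS makes the two short jobs wait behind the long one, SJF cuts the average wait sharply, and round robin lets B and C finish before A.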
There are other ways of allocating priorities. Safety critical jobs will be given very
high priority, on-line and real time applications will also have to have high priorities.
For example, a computer monitoring the temperature and pressure in a chemical
process whilst analysing results of readings taken over a period of time must give
high priority to the control program. If the temperature or pressure goes out of a
pre-defined range, the control program must take over immediately. Similarly, if a bank's
computer is printing bank statements over night and someone wishes to use a cash
point, the cash point job must take priority. This scheme is shown in Fig. 3.1.c.4; this
shows that queues are needed for jobs with the same priority.
Fig. 3.1.c.4
In this scheme, any job can only be given use of the processor if all the jobs at higher
levels have been completed. Also, if a job enters a queue that has a higher priority
than the queue from which the running program has come, the running program is
placed back in the queue from which it came and the job that has entered the higher
priority queue is placed in the running state.
Multi-level feedback queues work in a similar way except that each job is given a
maximum length of processor time. When this time is up, and the job is not
completely finished, the job is placed in the queue which has the next lower priority
level. At the lowest level, instead of a first in first out queue a round robin system is
used. This is shown in Fig. 3.1.c.5.
Fig. 3.1.c.5
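A minimal sketch of multi-level feedback queues in Python follows. It assumes that all jobs arrive at the top level at the start and that no new jobs arrive later; the job names, times and the function name are hypothetical.

```python
from collections import deque


def multilevel_feedback(jobs, time_slice, levels=3):
    """Run jobs through `levels` queues, highest priority first.

    jobs -- dict of job name -> processor time needed
    Returns a list of (job name, queue level) in the order the jobs ran.
    """
    queues = [deque() for _ in range(levels)]
    for name, time_needed in jobs.items():
        queues[0].append((name, time_needed))
    history = []
    while any(queues):
        # Always take the next job from the highest non-empty queue.
        level = next(i for i, queue in enumerate(queues) if queue)
        name, remaining = queues[level].popleft()
        history.append((name, level))
        remaining -= time_slice
        if remaining > 0:
            # Not finished: move down one level. At the lowest level the
            # job rejoins the same queue, which gives round robin.
            queues[min(level + 1, levels - 1)].append((name, remaining))
    return history


history = multilevel_feedback({"A": 5, "B": 2}, time_slice=2)
```

Job B finishes within its first time slice, while job A migrates downwards through the queues until it is complete.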
3.1 (d) Memory Management
This section can become very complex. In an examination the questions will be
limited to basic definitions and explanations. Calculations of addresses and other
detail will not be required; they are included here for completeness, for
those students who wish to understand the topic in more detail.
In order for a job to be able to use the processor the job must be stored in the
computer's main memory. If there are several jobs to be stored, they, and their data,
must be protected from the actions of other jobs.
Suppose jobs A, B, C and D require 50k, 20k, 10k and 30k of memory respectively
and the computer has a total of 130k available for jobs. (Remember the OS will
require some memory.) Fig. 3.1.d.1 shows one possible arrangement of the jobs.
Free     20k
Job D    30k
Job C    10k
Job B    20k
Job A    50k
(The five areas above make up the 130k available for jobs.)
OS
Fig. 3.1.d.1
Now suppose job C terminates and job E, requiring 25k of memory, is next in the
ready queue. Clearly job E cannot be loaded into the space that job C has
relinquished. However, there is 20k + 10k = 30k of memory free in total. So the OS
must find some way of using it. One solution to the problem would be to move job D
up to job B. This would make heavy use of the processor as not only must all the
instructions be moved but all addresses used in the instructions would have to be
recalculated because all the addresses will have changed.
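The situation can be modelled in Python using the job sizes from the example. The representation of memory as a list of (name, size) pairs, with `None` standing for free space, is an assumption made for this sketch, as are the function names.

```python
def free_holes(memory):
    """Sizes of the free blocks in a memory map of (name, size) pairs."""
    return [size for name, size in memory if name is None]


def compact(memory):
    """Move the jobs together so that the free space forms a single block."""
    jobs = [(name, size) for name, size in memory if name is not None]
    free = sum(free_holes(memory))
    return jobs + [(None, free)] if free else jobs


# Memory (in k) from the OS upwards, after job C (10k) has terminated.
memory = [("A", 50), ("B", 20), (None, 10), ("D", 30), (None, 20)]

fits_before = any(hole >= 25 for hole in free_holes(memory))
fits_after = any(hole >= 25 for hole in free_holes(compact(memory)))
```

Before compaction neither hole (10k or 20k) can hold the 25k job E; afterwards the single 30k block can. The sketch ignores the cost of moving the jobs and recalculating their addresses.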
When jobs are loaded into memory, they may not always occupy the same locations.
Supposing, instead of jobs A, B, C and D being needed and loaded in that order, it is
required to load jobs A, B, D and E in that order. Now job D occupies different
locations in memory to those shown above. So again there is a problem of using
different addresses.
The OS has the task of both loading the jobs and adjusting the addresses. The part of
the OS which carries out these tasks is a program called the loader. The calculation of
addresses can be done by recalculating each address used in the instructions once the
address of the first instruction is known. Alternatively, relative addressing can be
used. That is, addresses are specified relative to the first instruction.
This system is known as variable partitioning with compaction. Imagine that each job
needs a space to fit into; this space is the partition. Each of the jobs requires a
different size of space, hence “variable partitions”. These variable partitions are
normally called segments and the method of dividing memory up is called
segmentation. We also saw that sometimes it is necessary to move jobs around so that
they fill the ‘holes’ left by jobs that leave; this is called “compaction”.
An alternative method is to divide both the memory and the jobs into fixed size units
called “pages”. As an example, suppose jobs A, B, C, D and E consist of 6, 4, 1, 3
and 2 pages respectively. Also suppose that the available memory for jobs consists of
12 pages and jobs A, B and C have been loaded into memory as shown in Fig. 3.1.d.2.
Fig. 3.1.d.2
Now suppose job B terminates, releasing four pages, and jobs D and E are ready to be
loaded. Clearly we have a similar problem to that caused by segmentation. The 'hole'
consists of four pages into which job D (three pages) will fit, leaving one page plus
the original one page of free memory. E consists of two pages, so there is enough
memory for E but the pages are not contiguous, in other words they are not joined
together and we have the situation shown in Fig. 3.1.d.3.
Memory (top to bottom): Free, C1, Free, D3, D2, D1, A6, A5, A4, A3, A2, A1
Job E (waiting to be loaded): Page 1, Page 2
Fig. 3.1.d.3
The big difference between partitioning and paging is that jobs do not have to occupy
contiguous pages. Thus the solution is shown in Fig. 3.1.d.4.
Memory (top to bottom): E2, C1, E1, D3, D2, D1, A6, A5, A4, A3, A2, A1
(Job E is split: E1 and E2 are separated by C1.)
Fig. 3.1.d.4
The problem with paging is again address allocation. This can be overcome by
keeping a table that shows which memory pages are used for the job pages. Then, if
each address used in a job consists of a page number and the distance the required
location is from the start of the page, a suitable conversion is possible.
Suppose, in job A, an instruction refers to a location that is on page 5 and is 46
locations from the start of page 5. This may be represented by the pair (5, 46).
If the table shows that page 5 of job A is held in memory page 8, then
(5, 46) becomes (8, 46).
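The conversion can be sketched in Python as a simple table look-up. Only the entry mapping page 5 to memory page 8 comes from the example above; the other entries in the table, and the function name `translate`, are invented for illustration.

```python
def translate(page_table, page, displacement):
    """Convert (job page, displacement) into (memory page, displacement)."""
    if page not in page_table:
        raise KeyError(f"page {page} is not loaded")
    # The displacement within the page is unchanged by the conversion.
    return page_table[page], displacement


# Page table for job A: job page -> memory page (mostly hypothetical).
page_table_a = {4: 2, 5: 8, 6: 11}

address = translate(page_table_a, 5, 46)
```

Because only the page number changes, the instructions in the job do not need to be altered when a page is placed in a different part of memory.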
Paging uses fixed length blocks of memory. An alternative is to use variable length
blocks. This method is called segmentation. In segmentation, programmers divide
jobs into segments, possibly of different sizes. Usually, the segments would consist
of data, or sub-routines or groups of related sub-routines.
[Fig. 3.1.d.5: jobs A and B divided into segments (A1, A2 and B1, B2, B3), with the segments held separately in memory.]
Now suppose that an instruction specifies an address as segment 3, displacement
(from start of segment) 132. The OS will look up, in the process segment table, the
base address (in memory) of segment 3. The OS checks that the displacement is not
greater than the segment size. If it is, an error is reported. Otherwise the
displacement is added to the base address to produce the actual address in memory to
be used. This process is summarised in Fig. 3.1.d.6. With the values shown there,
the displacement 132 is not greater than the segment size 1500, so the actual
address is 3500 + 132 = 3632.
Segment Table
Seg. No.    Seg. Size    Base Address
1
2
3           1500         3500
4
Fig. 3.1.d.6
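The check and the addition described above can be written as a short Python function. The table holds only the entry for segment 3 given in Fig. 3.1.d.6; the names `SegmentError` and `actual_address` are assumptions for this sketch.

```python
class SegmentError(Exception):
    """Raised when a displacement lies outside its segment."""


def actual_address(segment_table, segment, displacement):
    """Return the memory address for (segment, displacement)."""
    size, base = segment_table[segment]
    # As in the text: report an error if the displacement is greater
    # than the segment size.
    if displacement > size:
        raise SegmentError(
            f"displacement {displacement} exceeds segment size {size}"
        )
    return base + displacement


# Segment number -> (segment size, base address), from Fig. 3.1.d.6.
segment_table = {3: (1500, 3500)}

address = actual_address(segment_table, 3, 132)   # 3500 + 132
```

A displacement such as 1600 would be rejected, which stops a job from reaching memory that belongs to another segment.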
Paging and segmentation lead to another important technique called virtual memory.
We have seen that jobs can be loaded into memory when they are needed using a
paging technique. When a program is running, only those pages that contain code that
is needed need be loaded. For example, suppose a word processor has been written
with page 1 containing the instructions to allow users to enter text and to alter the text.
Also suppose that page 2 contains the code for formatting characters, page 3 the code
for formatting paragraphs and page 4 contains the code for cutting, copying and
pasting. To run this word processor only page 1 needs to be loaded initially. If the
user then wants to format some characters so that they are in bold, then page 2 will
have to be loaded. Similarly, if the user wishes to copy and paste a piece of text, page
4 will have to be loaded. When other facilities are needed, the appropriate page can
be loaded. If the user now wishes to format some more characters, the OS does not
need to load page 2 again as it is already loaded.
Now, what happens if there is insufficient space for the new page to be loaded? As
only the page containing active instructions need to be loaded, the new page can
overwrite a page that is not currently being used. For example, suppose the user
wishes to use paragraph formatting; then the OS can load page 3 into the memory
currently occupied by page 2. Clearly, this means that programs can be written and
used that are larger than the available memory.
There must be some system that decides which pages to overwrite. There are many
systems such as overwrite the page that has not been used for the longest period of
time, replace the page that has not recently been used or the first in first out method.
All of these create extra work for the OS.
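As an illustration, the first of these systems (overwrite the page that has not been used for the longest time) can be sketched in Python. The sequence of page requests follows the word processor example, and the assumption that there is room for three pages is made up for the sketch.

```python
from collections import OrderedDict


def count_page_loads(requests, frames):
    """Count how many times a page has to be loaded into memory.

    requests -- sequence of page numbers in the order they are needed
    frames   -- number of pages that fit in memory at once
    """
    memory = OrderedDict()  # least recently used page is kept first
    loads = 0
    for page in requests:
        if page in memory:
            memory.move_to_end(page)        # already loaded: mark as used
        else:
            loads += 1
            if len(memory) >= frames:
                memory.popitem(last=False)  # overwrite least recently used
            memory[page] = True
    return loads


# Enter text (1), format characters (2), copy and paste (4),
# format characters again (2), format paragraphs (3), characters again (2).
loads = count_page_loads([1, 2, 4, 2, 3, 2], frames=3)
```

Six requests are made but only four loads are needed, because page 2 is still in memory each time it is requested again.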
To further complicate matters not every page can be overwritten. Some pages contain
a job's data that will change during the running of a program. To keep track of this
the OS keeps a flag for each page that can be initially set to zero. If the content of the
page changes, the flag can be set to 1. Now, before overwriting a page, the OS can
see if that page has been changed. If it has, then the OS will save the page before
loading a new page in its place. The OS now has to both load and save pages. If the
memory is very full, this loading and saving can use up a great deal of time and can
mean that most of the processor's time is involved in swapping pages. This situation
is called disk thrashing.
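The use of the flag can be sketched as follows. The `Page` class, the dictionary standing in for the backing store and the page names are all assumptions made for this example.

```python
class Page:
    def __init__(self, name, contents):
        self.name = name
        self.contents = contents
        self.changed = 0       # flag: 0 = unchanged since loading

    def write(self, contents):
        self.contents = contents
        self.changed = 1       # flag set when the content changes


def replace_page(frames, index, new_page, backing_store):
    """Overwrite frames[index], saving the old page first if it has changed.

    Returns True if the old page had to be saved.
    """
    old = frames[index]
    saved = old.changed == 1
    if saved:
        backing_store[old.name] = old.contents
    frames[index] = new_page
    return saved


backing_store = {}
frames = [Page("code", "instructions"), Page("data", "draft 1")]
frames[1].write("draft 2")

saved_code = replace_page(frames, 0, Page("format", "more code"), backing_store)
saved_data = replace_page(frames, 1, Page("paste", "other code"), backing_store)
```

Only the data page is written back before being overwritten; the unchanged code page is simply replaced, which saves a disk access.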
Systems can use both multi-programming and virtual memory. Also, virtual memory
can use segmentation as well as paging although this can become very complex.
3.1 (e) Spooling
Spooling was mentioned in Section 3.1.a and is used to place input and output on a
fast access device, such as disk, so that slow peripheral devices do not hold up the
processor. In a multi-programming, multi-access or network system, several jobs may
wish to use the peripheral devices at the same time. It is essential that the input and
output for different jobs do not become mixed up. This can be achieved by using
Simultaneous Peripheral Operations On-Line (spooling).
Suppose two jobs, in a network system, are producing output that is to go to a single
printer. The output is being produced in sections and must be kept separate for each
job. Opening two files on a disk, one for each job, can do this. Suppose we call these
files File1 and File2. As the files are on disk, job 1 can write to File1 whenever it
wishes and job 2 can write to File2. When the output from a job is finished, the name
(and other details) of the file can be placed in a queue. This means that the OS now
can send the output to the printer in the order in which the file details enter the queue.
As the name of a file does not enter the queue until all output from the job to the
corresponding file is complete, the output from different jobs is kept separate.
Spooling can be used for any number of jobs. It is important to realise that the output
itself is not placed in the queue. The queue simply contains the details of the files that
need to be printed so that the OS sends the contents of the files to the printer only
when the file is complete. The part of the OS that handles this task is called the
spooler or print spooler.
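A minimal sketch of such a spooler in Python is given below. The class and method names are hypothetical, and a dictionary stands in for the files on disk.

```python
from collections import deque


class Spooler:
    def __init__(self):
        self.files = {}             # file name -> sections of output so far
        self.print_queue = deque()  # names of completed files only

    def write(self, name, text):
        """A job adds a section of output to its own file."""
        self.files.setdefault(name, []).append(text)

    def close(self, name):
        """Output is complete: only now does the file name join the queue."""
        self.print_queue.append(name)

    def print_all(self):
        """Send each queued file to the printer in queue order."""
        printed = []
        while self.print_queue:
            name = self.print_queue.popleft()
            printed.append("".join(self.files.pop(name)))
        return printed


spooler = Spooler()
spooler.write("File1", "a1 ")
spooler.write("File2", "b1 ")
spooler.write("File1", "a2")
spooler.close("File1")
spooler.write("File2", "b2")
spooler.close("File2")
output = spooler.print_all()
```

Although the two jobs wrote their sections in an interleaved order, each file is printed as a whole, so the output of the two jobs is not mixed.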
It should be noted that spooling not only keeps output from different jobs separate, it
also saves the user having to wait for the processor until the output is actually printed
by a printer (a relatively slow device). Spooling is used on personal computers as
well as large computer systems capable of multi-programming.
3.1 (f) Desktop PC Operating Systems
There are basically two types of OS used on PCs: those that are command driven and
those that use a graphical user interface (GUI). Probably the best known of these are
MS-DOS (command driven) and Windows (GUI). These differ in the way the user
uses them and in the tasks that can be carried out.
All OSs for PCs allow the user to copy, delete and move files as well as letting the
user create a hierarchical structure for storing files. They also allow the user to
check the disk and tidy up the files on the disk.
However, Windows allows the user to use much more memory than MS-DOS and it
allows multi-tasking. This is when the user opens more than one program at a time
and can move from one to another. Try opening a word processor and the clipboard
in Windows at the same time. Adjust the sizes of the windows so that you can see
both at the same time. Now mark a piece of text and copy it to the clipboard. You
will see the text appear in the clipboard window although it is not the active window.
This is because the OS can handle both tasks apparently at the same time. In fact the
OS is swapping between the tasks so fast that the user is not aware of the swapping.
Another good example of multi-tasking is to run the clock program while using
another program. You will see that the clock is keeping time although you are using
another piece of software. Try playing a CD while writing a report!
The OS not only offers the user certain facilities, it also provides application software
with I/O facilities. In this Section you will see how an OS is loaded and how it
controls the PC.
This section, printed with a shaded background, is not required by the CIE Computing
Specification, but may be interesting and useful for understanding how the system
works.
When a PC is switched on, it contains only a very few instructions. The first thing the
computer does is to run the power-on-self-test (POST) routine that resides in
permanent memory. The POST routine clears the registers in the CPU and loads the
address of the first instruction in the boot program into the program counter. This
boot program is stored in read-only memory (ROM) and contains the basic
input/output system (BIOS).
Control is now passed to the boot program which first checks itself and the POST
program. The CPU then sends signals to check that all the hardware is working
properly. This includes checking the buses, systems clock, RAM, disk drives and
keyboard. If any of these devices, such as the hard disk, contain their own BIOS, this
is incorporated with the system's BIOS. Often the BIOS is copied from a slow CMOS
BIOS chip to the faster RAM chips.
The PC is now ready to load the OS. The boot program first checks drive A to see if a
disk is present. If one is present, it looks for an OS on the disk. If no OS is found, an
error message is produced. If there is no disk in drive A, the boot program looks for
an OS on disk C. Once found, the boot program looks, in the case of Windows
systems, for the files IO.SYS and MSDOS.SYS. Once the files are found, the boot
program loads the boot record, about 512 bytes, which then loads IO.SYS. IO.SYS
holds extensions to the ROM BIOS and contains a routine called SYSINIT. SYSINIT
controls the rest of the boot procedure. SYSINIT now takes control and loads
MSDOS.SYS which works with the BIOS to manage files and execute programs.
The OS searches the root directory for a boot file such as CONFIG.SYS which tells
the OS how many files may be opened at the same time. It may also contain
instructions to load various device drivers. The OS tells MSDOS.SYS to load a file
called COMMAND.COM. This OS file is in three parts. The first part is a further
extension to the I/O functions and it joins the BIOS to become part of the OS. The
second part contains resident OS commands, such as DIR and COPY.
The files CONFIG.SYS and AUTOEXEC.BAT are created by the user so that the PC
starts up in the same configuration each time it is switched on.
The OS supplies the user, and applications programs, with facilities to handle input
and output, copy and move files, handle memory allocation and any other basic tasks.
In the case of Windows, the operating system loads into different parts of memory.
The OS then guarantees the use of a block of memory to an application program and
protects this memory from being accessed by another application program. If an
application program needs to use a particular piece of hardware, Windows will load
the appropriate device driver. Windows also uses virtual memory if an application
has not been allocated sufficient main memory.
As mentioned above, Windows allows multi-tasking; that is, the running of several
applications at the same time. To do this, Windows uses the memory management
techniques described in Section 3.1.d. In order to multi-task, Windows gives each
application a very short period of time, called a time-slice. When a time-slice is up,
an interrupt occurs and Windows passes control to the next application. In order to do
this, the OS has to save the contents of the CPU registers at the end of a time-slice and
load the registers with the values needed by the next application. Control is then
passed to the next application. This is continued so that all the applications have use
of the processor in turn. If an application needs to use a hardware device, Windows
checks to see if that device is available. If it is, the application is given the use of that
device. If not, the request is placed in a queue. In the case of a slow peripheral such
as a printer, Windows saves the output to the hard disk first and then does the printing
in the background so that the user can continue to use the application. If further
printing is needed before other printing is completed, then spooling is used as
described in Section 3.1.e.
Any OS has to be able to find files on a disk and to be able to store user's files. To do
this, the OS uses the File Allocation Table (FAT). This table uses a linked list to
point to the blocks on the disk that contain files. In order to do this the OS has a
routine that will format a disk. This simply means dividing the disk radially into
sectors and into concentric circles called tracks. Two or more sectors on a single
track make up a cluster. This is shown in Fig. 3.1.f.1.
[Fig. 3.1.f.1: a disk divided into tracks and sectors, with one cluster made up of 3 sectors on a single track.]
A typical FAT table is shown in Fig 3.1.f.2. The first column gives the cluster
number and the second column is a pointer to the next cluster used to store a file. The
last cluster used has a null pointer (usually FFFFH) to indicate the end of the linking.
The directory entry for a file has a pointer to the first cluster in the FAT table. The
diagram shows details of two files stored on a disk.
Cluster   Pointer
0         FFFD
1         FFFF
2         3         <- pointer from directory entry for File 1
3         5
4         6         <- pointer from directory entry for File 2
5         8
6         7
7         10
8         9
9         FFFF      (end of File 1 is in cluster 9)
10        11
11        FFFF      (end of File 2 is in cluster 11)
Fig. 3.1.f.2
In order to find a file, the OS looks in the directory for the filename and, if it finds it,
the OS gets the cluster number for the start of the file. The OS can then follow the
pointers in the FAT to find the rest of the file.
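Following the pointers can be sketched in Python using the values from Fig. 3.1.f.2. The dictionaries and the function name `clusters_of` are assumptions for the example; the starting clusters 2 and 4 are the only clusters in the table that no other entry points to.

```python
END = 0xFFFF  # null pointer marking the last cluster of a file

# Cluster number -> pointer to the next cluster (from Fig. 3.1.f.2).
fat = {
    0: 0xFFFD, 1: 0xFFFF, 2: 3, 3: 5, 4: 6, 5: 8,
    6: 7, 7: 10, 8: 9, 9: END, 10: 11, 11: END,
}

# Directory: file name -> first cluster of the file.
directory = {"File 1": 2, "File 2": 4}


def clusters_of(filename):
    """Return the list of clusters that hold the named file, in order."""
    cluster = directory[filename]
    chain = [cluster]
    while fat[cluster] != END:
        cluster = fat[cluster]
        chain.append(cluster)
    return chain


file1 = clusters_of("File 1")
file2 = clusters_of("File 2")
```

The two chains end in clusters 9 and 11 respectively, as shown in the figure.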
In this table any unused clusters have a zero entry. Thus, when a file is deleted, the
clusters that were used to save the file can be set to zero. In order to store a new file,
all the OS has to do is to find the first cluster with a zero entry and to enter the cluster
number in the directory. Now the OS only has to linearly search for clusters with zero
entries to set up the linked list.
It may appear that using linear searches will take a long time. However, the FAT
table is normally loaded into RAM so that continual disk accesses can be avoided.
This will speed up the search of the FAT.
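Storing a new file can be sketched in the same way. The small FAT used here is made up for the example, with the FAT held as a list in which the index is the cluster number; the function name `store_file` is also an assumption.

```python
FREE = 0      # entry for an unused cluster
END = 0xFFFF  # null pointer marking the last cluster of a file


def store_file(fat, clusters_needed):
    """Link together free clusters for a new file.

    Returns the first cluster, which would be entered in the directory.
    """
    if clusters_needed < 1:
        raise ValueError("a file needs at least one cluster")
    # Linear search for clusters with a zero entry.
    free = [c for c, entry in enumerate(fat) if entry == FREE]
    free = free[:clusters_needed]
    if len(free) < clusters_needed:
        raise MemoryError("not enough free clusters on the disk")
    # Set up the linked list: each cluster points to the next one.
    for current, following in zip(free, free[1:]):
        fat[current] = following
    fat[free[-1]] = END
    return free[0]


# Clusters 2-3 and 6-7 already hold files; clusters 4, 5 and 8 are unused.
fat = [0xFFFD, 0xFFFF, 3, END, FREE, FREE, 7, END, FREE]
first_cluster = store_file(fat, 3)
```

The new file occupies clusters 4, 5 and 8, which shows that the clusters of a file need not be next to each other on the disk.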
Note that Windows 95/98 uses virtual FAT (VFAT) which allows files to be saved 32
bits at a time (FAT uses 16 bits). It also allows file names of up to 255 characters.
Windows 98 uses FAT 32 which allows hard drives greater than 2 Gbytes to be
formatted.
3.1 (g) Network Operating Systems
This Section should be read in conjunction with Chapters 1.6 from the AS text and
3.10 from this A2 text.
The facilities provided by a NOS depend on the size and type of network. For
example, in a peer-to-peer network all the stations on the network have equal status.
In this system one station may act as a file server and another as a print server. At the
same time, all the stations are clients. A client is a computer that can be used by users
of the network. A peer-to-peer network has little security so the NOS only has to
handle
• communications,
• file sharing,
• printing.
In other types of network the NOS has to handle a wider range of tasks:
• file sharing,
• file security,
• accounting,
• software sharing,
• hardware sharing (including print spooling),
• communications,
• the user interface.
File sharing allows many users to use the same file at the same time. In order to avoid
corruption and inconsistency, the NOS must only allow one user write access to the
file; other users must only be allowed read access. Also, the NOS must only allow
users with the appropriate access rights to use files; that is, it must prevent unauthorised
access to data. It is important that users do not change system files (files that are
needed by the NOS). It is common practice for the NOS to not only make these files
read only, but to hide them from the users. If a user looks at the disk to see what files
are present, these hidden files will not appear. To prevent users changing read only
files to read write files, and to prevent users showing hidden files, the NOS does not
allow ordinary users to change these attributes.
To ensure the security of data, the network manager gives users access rights. When
users log onto a network they must enter their user identity and password. The NOS
then looks up, in a table, the users' access rights and only allows them access to those
files for which access is permitted. The NOS also keeps a note of how the users want
their desktops to look. This means that when users log on they are always presented
with the same screen. Users are allowed to change how their desktops look and these
changes are stored by the NOS for future reference.
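The table lookup described above might be sketched as follows. This is only an illustration: the user names, file names and the layout of `ACCESS_RIGHTS` are assumptions, not part of any particular NOS.

```python
# Hypothetical access-rights table: user -> {file name: permitted operations}.
ACCESS_RIGHTS = {
    "alice": {"report.doc": {"read", "write"}, "prices.xls": {"read"}},
    "bob": {"prices.xls": {"read", "write"}},
}

def is_allowed(user, file_name, operation):
    """Return True only if the table grants this user this operation on the file."""
    rights = ACCESS_RIGHTS.get(user, {})
    return operation in rights.get(file_name, set())

print(is_allowed("alice", "prices.xls", "read"))   # True
print(is_allowed("alice", "prices.xls", "write"))  # False
print(is_allowed("carol", "report.doc", "read"))   # False
```

Anything that is not listed in the table is refused, which matches the idea that users only reach files for which access has been permitted.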
As many users may use the network and its resources, it may be necessary for the
NOS to keep details of who has used the network, when and for how long and for
what purpose. It may also record which files the user has accessed. This is so that the
user can be charged for such things as printing, the amount of time that the network
has been used and storage of files. This part of the NOS may also restrict the users'
amount of storage available, the amount of paper used for printing and so on. From
time to time the network manager can print out details of usage so that charges may
be made.
The NOS must share the use of applications such as word processors, spreadsheets
and so on. Thus when a user requests an application, the NOS must send a copy of
that application to the user's station.
Several users may well wish to use the same hardware at the same time. This is
particularly true of printers. When a user sends a file for printing, the file is split into
packets. As many users may wish to use a printer, the packets from different users
will arrive at the print server and will have to be sorted so that the data from different
users are kept separate. The NOS receives these packets and stores the data in
different files for different users. When a particular file is complete, it can be added
to the print queue as described in Section 3.1.e.
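The sorting of packets into separate files can be sketched roughly as below. The packet layout `(user, data, is_last)` is an assumption made for the example, and the sketch assumes that each user's packets arrive in the right order.

```python
from collections import defaultdict, deque

def spool(packets):
    """Sort interleaved packets into one buffer per user and queue finished files."""
    buffers = defaultdict(list)   # user -> data received so far
    print_queue = deque()         # completed files, in order of completion
    for user, data, is_last in packets:
        buffers[user].append(data)
        if is_last:
            # The file is complete, so it can join the print queue.
            print_queue.append((user, "".join(buffers.pop(user))))
    return list(print_queue)

packets = [
    ("ann", "Dear ", False),
    ("raj", "Invoice ", False),
    ("ann", "Sir", True),
    ("raj", "42", True),
]
print(spool(packets))  # [('ann', 'Dear Sir'), ('raj', 'Invoice 42')]
```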
The NOS must also ensure that users' files are saved on the server and that they
cannot be accessed by other users. To do this the network manager will allocate each
user a fixed amount of disk space and the NOS will prevent a user exceeding the
amount of storage allocated. If a user tries to save work when there is insufficient
space left, the NOS will ask the user to delete some files before the user can save any
more. In order to do this, the server's hard drive may be partitioned into many logical
drives. This means that, although there may be only one hard drive, different parts of
it can be treated as though they are different drives. For example, one part may be
called the H drive which is where users are allowed to save their work. This drive
will be divided up into folders, each of which is allocated to a different user. Users
only have access to their own folders but can create sub-folders in their own folders.
The NOS must provide this service as well as preventing users accessing other users'
folders. Another part of the drive may be called (say) the U drive where some users
can store files for other users who will be allowed to retrieve, but not alter, them
unless they are saved in the user's own area. The NOS will also only allow access to
certain logical drives by a restricted set of users.
For all the above to work, the NOS will have to handle communications between
stations and servers. Thus, the NOS is in two parts. One part is in each station and
the other is in the server(s). These two parts must be able to communicate with one
another so that messages and data can be sent around the network. The need for rules
to ensure that communication is successful was explained in Chapter 1.6 in the AS
text.
Finally, the NOS must provide a user interface between the hardware, software and
user. This has been discussed in Section 3.1.f, but a NOS has to offer different users
different interfaces. When a user logs onto a network, the NOS looks up the needs of
the user and displays the appropriate icons and menus for that user, no matter which
station the user uses. The NOS must also allow users to define their own interfaces
within the restrictions laid down by the network manager.
It must be remembered that users should not need an understanding of all the tasks
undertaken by the NOS. As far as users are concerned they are using a PC as if it
were solely for their own use. The whole system is said to be transparent to the user.
This simply means that users are unaware of the hardware and software actions. A
good user interface has a high level of transparency and this should be true of all
operating systems.
3.1 Example Questions
2. State three different types of interrupt that may occur and say in what order the
three interrupts would be handled if they all arrived at the processor together,
giving a reason for your answer. (5)
5. Explain how interrupts are used in a round robin scheduling operating system.
(3)
6. a) Explain the difference between paging and segmenting when referring to
memory management techniques. (2)
b) Describe how virtual memory can allow a word processor and a spreadsheet to
run simultaneously in the memory of a computer even though both pieces of
software are too large to fit into the computer’s memory. (3)
7. Describe how a PC operating system uses a file allocation table to find files
when necessary.
Chapter 3.2 The Functions and Purposes of Translators
When electronic computers were first used, the programs had to be written in machine
code. This code consisted of simple instructions, each of which was represented
by a binary pattern in the computer. To produce these programs, programmers had to
write the instructions in binary. This not only took a long time, it was also prone to
errors. To improve program writing, assembly languages were developed. Assembly
languages allowed the use of mnemonics and names for locations in memory. Each
assembly instruction mapped to a single machine instruction which meant that it was
fairly easy to translate a program written in assembly language to machine code. To
speed up this translation process, assemblers were written which could be loaded into
the computer and then the computer could translate the assembly language to machine
code. Writing programs in assembly language, although easier than using machine
code, was still tedious and took a long time.
After assembly languages, came high-level languages which used the type of
language used by the person writing the program. Thus FORTRAN (FORmula
TRANslation) was developed for science and engineering programs and it used
formulae in the same way as would scientists and engineers. COBOL (Common
Business Oriented Language) was developed for business applications. Programs
written in these languages needed to be translated into machine code. This led to the
birth of compilers.
Fig. 3.2.a.1
The problem with using a compiler is that it uses a lot of computer resources. It has
to be loaded in the computer's memory at the same time as the source code and there
has to be sufficient memory to hold the object code. Further, there has to be sufficient
memory for working storage while the translation is taking place. Another
disadvantage is that when an error in a program occurs it is difficult to pin-point its
source in the original program.
computers lacked the power and memory needed for compilation. This method also
has the advantage of producing error messages as soon as an error is encountered.
This means that the instruction causing the problem can be easily identified. Against
interpretation is the fact that execution of a program is slow compared to that of a
compiled program. This is because the original program has to be translated every
time it is executed. Also, instructions inside a loop will have to be translated each
time the loop is entered.
Stages of compilation: Source Program → Lexical Analysis → Syntax Analysis → Semantic Analysis → Intermediate Language → Code Generation → Code Optimisation → Object Program
3.2 (b) Lexical Analysis
The lexical analyser uses the source program as input and creates a stream of tokens
for the syntax analyser. Tokens are normally 16-bit unsigned integers. Each group of
characters is replaced by a token. Single characters, which have a meaning in their
own right, are replaced by their ASCII values. Multiple character tokens are
represented by integers greater than 255 because the ones up to 255 are reserved for
the ASCII codes. Variable names will need to have extra information stored about
them. This is done by means of a symbol table. This table is used throughout
compilation to build up information about names used in the program. During lexical
analysis only the variable's name will be noted. It is during syntax and semantic
analysis that details such as the variable's type and scope are added. The lexical
analyser may output some error messages and diagnostics. For example, it will report
errors such as an identifier or variable name which breaks the rules of the language.
At various stages during compilation it will be necessary to look up details about the
names in the symbol table. This must be done efficiently so a linear search is not
sufficiently fast. In fact, it is usual to use a hash table and to calculate the position of
a name by hashing the name itself. When two names are hashed to the same address,
a linked list can be used to avoid the symbol table filling up.
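A minimal sketch of such a symbol table is given below. The hash function (sum of character codes) and the table size are assumptions chosen to keep the example short, and a Python list stands in for the linked list in each slot.

```python
class SymbolTable:
    """Hash table with chaining; each slot holds a list of (name, info) entries."""

    def __init__(self, size=8):
        self.slots = [[] for _ in range(size)]

    def _index(self, name):
        # Toy hash function (an assumption): sum of character codes modulo table size.
        return sum(ord(ch) for ch in name) % len(self.slots)

    def insert(self, name, **info):
        chain = self.slots[self._index(name)]
        for entry_name, entry_info in chain:
            if entry_name == name:
                entry_info.update(info)   # name already known: add the new details
                return
        chain.append((name, dict(info)))

    def lookup(self, name):
        for entry_name, entry_info in self.slots[self._index(name)]:
            if entry_name == name:
                return entry_info
        return None

table = SymbolTable()
table.insert("ab")                  # lexical analysis: only the name is noted
table.insert("ba")                  # hashes to the same slot as "ab"
table.insert("ab", type="integer")  # later stages add details such as the type
print(table.lookup("ab"))  # {'type': 'integer'}
print(table.lookup("ba"))  # {}
print(table.lookup("zz"))  # None
```

Only the names in one slot have to be compared during a lookup, which is why this is faster than a linear search of the whole table.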
The lexical analyser also removes redundant characters such as white space (these are
spaces, tabs, etc., which we may find useful to make the code more readable, but the
computer does not want) and comments. Often the lexical analysis takes longer than
the other stages of compilation. This is because it has to handle the original source
code, which can have many formats. For example, the following two pieces of code
are equivalent although their format is considerably different.
When the lexical analyser has completed its task, the code will be in a standard format
which means that the syntax analyser (which is the next stage, and we will be looking
at in 3.2.c) can always expect the format of its input to be the same.
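As a rough illustration of how layout disappears during lexical analysis, the sketch below tokenises a made-up BASIC-like statement written in two different formats. The keyword list, the token values above 255 and the use of an apostrophe to start a comment are all assumptions for the example.

```python
import re

KEYWORDS = {"IF": 256, "THEN": 257, "PRINT": 258}  # hypothetical token values
IDENTIFIER = 300                                    # hypothetical token for any name
NUMBER = 301                                        # hypothetical token for any number

# White space, comments, names, numbers, then any other single character.
TOKEN_PATTERN = re.compile(r"\s+|'[^\n]*|[A-Za-z][A-Za-z0-9]*|[0-9]+|.")

def tokenise(source):
    tokens = []
    for lexeme in TOKEN_PATTERN.findall(source):
        if lexeme.isspace() or lexeme.startswith("'"):
            continue                       # white space and comments are discarded
        if lexeme.upper() in KEYWORDS:
            tokens.append(KEYWORDS[lexeme.upper()])
        elif lexeme[0].isalpha():
            tokens.append(IDENTIFIER)      # the name itself would go in the symbol table
        elif lexeme.isdigit():
            tokens.append(NUMBER)
        else:
            tokens.append(ord(lexeme))     # single characters keep their ASCII value
    return tokens

compact = "IF X=1 THEN PRINT Y"
spread = "if  x = 1   ' test x\n    then\n        print y"
print(tokenise(compact))  # [256, 300, 61, 301, 257, 258, 300]
print(tokenise(compact) == tokenise(spread))  # True
```

Both layouts produce the same stream of tokens, so the syntax analyser never sees the difference.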
3.2 (c) Syntax Analysis
This Section should be read in conjunction with Section 3.5.j which discusses Backus-
Naur Form (BNF) and syntax diagrams.
During this stage of compilation the code generated by the lexical analyser is parsed
(broken into small units) to check that it is grammatically correct. All languages have
rules of grammar and computer languages are no exception. The grammar of
programming languages is defined by means of BNF notation or syntax diagrams. It
is against these rules that the code has to be checked.
and expression is
and the parser must take the output from the lexical analyser and check that it is of
this form.
If the statement is
which becomes
and then
<assignment statement>
which is valid.
and this does not represent a valid statement hence an error message will be returned.
It is at this stage that invalid names can be found such as PRIT instead of PRINT as
PRIT will be read as a variable name instead of a reserved word. This will mean that
the statement containing PRIT will not parse to a valid statement. Note that in
languages that require variables to be declared before being used, the lexical analyser
may pick up this error because PRIT has not been declared as a variable and so is not
in the symbol table.
Most compilers will report errors found during syntax analysis as soon as they are
found and will attempt to show where the error has occurred. However, they may not
be very accurate in their conclusions nor may the error message be very clear.
During syntax analysis certain semantic checks are carried out. These include label
checks, flow of control checks and declaration checks.
Some languages allow GOTO statements (not recommended by the authors) which
allow control to be passed, unconditionally, to another statement which has a label.
The GOTO statement specifies the label to which the control must pass. The
compiler must check that such a label exists.
Certain control constructs can only be placed in certain parts of a program. For
example in C (and C++) the CONTINUE statement can only be placed inside a loop
and the BREAK statement can only be placed inside a loop or SWITCH statement.
The compiler must ensure that statements like these are used in the correct place.
Many languages insist on the programmer declaring variables and their types. It is at
this stage that the compiler verifies that all variables have been properly declared and
that they are used correctly.
3.2 (d) Code Generation
It is at this stage, when all the errors due to incorrect use of the language have been
removed, that the program is translated into code suitable for use by the computer's
processor.
During lexical and syntax analysis a table of variables has been built up which
includes details of the variable name, its type and the block in which it is valid. The
address of the variable is now calculated and stored in the symbol table. This is done
as soon as the variable is encountered during code generation.
Before the final code can be generated, an intermediate code is produced. This
intermediate code can then be interpreted or translated into machine code. In the
latter case, the code can be saved and distributed to computer systems as an
executable program. Two methods can be used to represent the high-level language
in machine code. This syllabus does not require knowledge of either of these methods,
but a brief outline is given for those who may be interested.
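One uses a tree structure and the other a three address code (TAC). TAC allows no
more than three operands. Instructions take the form
Result := Operand1 Operator Operand2
For example, the assignment statement
A := (B + C) * (D – E) / F
can be written in TAC as
R1 := B + C
R2 := D – E
R3 := R1 * R2
A := R3 / F
where R1, R2 and R3 hold intermediate results.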
The same statement represented as a tree:
:=
├── A
└── /
    ├── *
    │   ├── +
    │   │   ├── B
    │   │   └── C
    │   └── -
    │       ├── D
    │       └── E
    └── F
Fig. 3.2.d.1
Other statements can be represented in similar ways and the final stage of compilation
can then take place.
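The link between the tree and the three address code can be sketched by walking the tree from the leaves upwards and giving each operator node a temporary name. The example below borrows Python's own `ast` module to build the tree, which is purely a convenience: it assumes the expression happens to be valid Python, and the names `to_tac` and `R1`, `R2`, ... are chosen for the illustration.

```python
import ast

OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_tac(target, expression):
    """Translate 'target := expression' into a list of three address instructions."""
    lines = []

    def emit(node, dest=None):
        if isinstance(node, ast.Name):
            return node.id                       # a leaf: just use the variable name
        if isinstance(node, ast.BinOp):
            left = emit(node.left)               # deal with both sub-trees first
            right = emit(node.right)
            result = dest or f"R{len(lines) + 1}"
            lines.append(f"{result} := {left} {OPERATORS[type(node.op)]} {right}")
            return result
        raise ValueError("only names and + - * / are handled in this sketch")

    emit(ast.parse(expression, mode="eval").body, dest=target)
    return lines

for line in to_tac("A", "(B + C) * (D - E) / F"):
    print(line)
# R1 := B + C
# R2 := D - E
# R3 := R1 * R2
# A := R3 / F
```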
The compiler has to consider, at this point, the type of code that is required. Code can
be improved in order to increase the speed of execution or in order to make the size of the
program as small as possible. Often compilers try to compromise between the two.
This process of improving the code is known as optimisation.
An example of code optimisation is shown in Fig. 3.2.d.2 where the code on the left
has been changed to that on the right so that r1 * b is only evaluated once.
a := 5 + 3          a := 5 + 3
b := 5 * 3          b := 5 * 3
r1 := a + b         r1 := a + b
r2 := r1 * b        r2 := r1 * b
r3 := r2 / a        r3 := r2 / a
r4 := r1 * b        r4 := r2
r5 := r4 + 6        r5 := r4 + 6
c := r3 – r5        c := r3 – r5
Fig. 3.2.d.2
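This kind of improvement, where a repeated calculation is replaced by its earlier result, can be sketched as follows. The sketch takes the unoptimised code with r4 calculated as r1 * b, as the description states, and it assumes that each name is assigned only once so that a stored result never goes out of date. The function name is made up for the example.

```python
def eliminate_common_subexpressions(code):
    """Replace a repeated 'x op y' with the name that already holds its value.

    `code` is a list of (destination, expression) pairs.
    """
    seen = {}        # expression text -> name that first calculated it
    optimised = []
    for dest, expr in code:
        if expr in seen:
            optimised.append((dest, seen[expr]))   # reuse the earlier result
        else:
            seen[expr] = dest
            optimised.append((dest, expr))
    return optimised

code = [
    ("a", "5 + 3"), ("b", "5 * 3"), ("r1", "a + b"), ("r2", "r1 * b"),
    ("r3", "r2 / a"), ("r4", "r1 * b"), ("r5", "r4 + 6"), ("c", "r3 - r5"),
]
for dest, expr in eliminate_common_subexpressions(code):
    print(f"{dest} := {expr}")
```

Only the line for r4 changes: it becomes `r4 := r2`, so r1 * b is evaluated once.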
There are many other ways of optimising code but the last section has hopefully made
the concept clear. Candidates will not be expected to optimise code, but should be
aware of what it is.
3.2 (e) Linkers and Loaders
Programs are usually built up in modules. These modules are then compiled into
machine code that has to be loaded into the computer's memory. This process is done
by the loader. The loader decides where in memory to place the code and then adjusts
memory addresses as described in Chapter 3.1. As the whole program may consist of
many modules, all of which have been separately compiled, the modules will have to
be correctly linked once they have been loaded. This is the job of the linker. The
linker calculates the addresses of the separate pieces that make up the whole program
and then links these pieces so that all the modules can interact with one another.
The idea of using modules that can be used in many programs was explained in
Section 1.3 in the AS text. This method of creating programs is important as it
reduces the need to keep rewriting code and will be further discussed under object
oriented programming in Section 3.5.f.
3.2 Example Questions.
1. Explain why the size of the memory available is particularly relevant to the
process of compilation. (4)
b) Give one advantage of the use of each of the two translation techniques. (2)
3. State the three stages of compilation and describe, briefly, the purpose of each.
(6)
4. Explain, in detail, the stage of compilation known as lexical analysis. (6)
Chapter 3.3 Computer Architecture and the Fetch-Execute Cycle
John Von Neumann introduced the idea of the stored program. Previously data and
programs were stored in separate memories. Von Neumann realised that data and
programs are indistinguishable and can, therefore, use the same memory. This led to
the introduction of compilers which accepted text as input and produced binary code
as output.
The Von Neumann architecture uses a single processor which follows a linear
sequence of fetch-decode-execute. In order to do this, the processor has to use some
special registers. These are
Register Meaning
PC Program Counter
CIR Current Instruction Register
MAR Memory Address Register
MDR Memory Data Register
Accumulator Holds results
The program counter keeps track of where to find the next instruction so that a copy
of the instruction can be placed in the current instruction register. Sometimes the
program counter is called the Sequence Control Register (SCR) as it controls the
sequence in which instructions are executed.
The memory address register is used to hold the memory address that contains either
the next piece of data or an instruction that is to be used.
The memory data register acts like a buffer and holds anything that is copied from the
memory ready for the processor to use it.
The central processor contains the arithmetic-logic unit (also known as the arithmetic
unit) and the control unit. The arithmetic-logic unit (ALU) is where data is processed.
This involves arithmetic and logical operations. Arithmetic operations are those that
add and subtract numbers, and so on. Logical operations involve comparing binary
patterns and making decisions.
The control unit fetches instructions from memory, decodes them and synchronises
the operations before sending signals to other parts of the computer.
The accumulator is in the arithmetic unit, the program counter and the instruction
registers are in the control unit and the memory data register and memory address
register are in the processor.
A typical layout is shown in Fig. 3.3.a.1 which also shows the data paths.
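[Diagram: Main Memory and the Central Processing Unit (CPU). Inside the CPU are the Control Unit (containing the PC and CIR), the ALU (containing the Accumulator), the MAR and the MDR.]
Fig 3.3.a.1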
3.3 (b) The Fetch-Decode-Execute-Reset Cycle
The following is an algorithm that shows the steps in the cycle. At the end the cycle
is reset and the algorithm repeated.
1. Load the address that is in the program counter (PC) into the memory address
register (MAR).
2. Increment the PC by 1.
3. Load the instruction that is in the memory address given by the MAR into the
memory data register (MDR).
4. Load the instruction that is now in the MDR into the current instruction
register (CIR).
5. Decode the instruction that is in the CIR.
6. If the instruction is a jump instruction then
a. Load the address part of the instruction into the PC
b. Reset by going to step 1.
7. Execute the instruction.
8. Reset by going to step 1.
Steps 1 to 4 are the fetch part of the cycle. Steps 5, 6a and 7 are the execute part of
the cycle and steps 6b and 8 are the reset part.
Step 1 simply places the address of the next instruction into the memory address
register so that the control unit can fetch the instruction from the right part of the
memory. The program counter is then incremented by 1 so that it contains the address
of the next instruction, assuming that the instructions are in consecutive locations.
The memory data register is used whenever anything is to go from the central
processing unit to main memory, or vice versa. Thus the next instruction is copied
from memory into the MDR and is then copied into the current instruction register.
Now that the instruction has been fetched the control unit can decode it and decide
what has to be done. This is the execute part of the cycle. If it is an arithmetic
instruction, this can be executed and the cycle restarted as the PC contains the address
of the next instruction in order. However, if the instruction involves jumping to an
instruction that is not the next one in order, the PC has to be loaded with the address
of the instruction that is to be executed next. This address is in the address part of the
current instruction, hence the address part is loaded into the PC before the cycle is
reset and starts all over again.
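The eight steps can be traced with a small simulation. The machine below is hypothetical: it has five made-up instructions (LOAD, ADD, STORE, JUMP, HALT), each held as an (opcode, address) pair, and HALT is added only so that the loop can stop.

```python
def run(memory):
    """Simulate the fetch-decode-execute cycle on a toy one-address machine."""
    pc, acc = 0, 0
    while True:
        mar = pc                      # step 1: copy the PC into the MAR
        pc += 1                       # step 2: increment the PC
        mdr = memory[mar]             # step 3: copy the instruction into the MDR
        cir = mdr                     # step 4: copy the MDR into the CIR
        opcode, address = cir         # step 5: decode
        if opcode == "JUMP":          # step 6: a jump loads the address part into the PC
            pc = address
            continue
        if opcode == "HALT":
            return acc
        mar = address                 # step 7: execute, using the MAR and MDR for data
        if opcode == "LOAD":
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":
            mdr = memory[mar]
            acc += mdr
        elif opcode == "STORE":
            mdr = acc
            memory[mar] = mdr
        else:
            raise ValueError(f"unknown opcode {opcode}")
        # step 8: reset by going round the loop again

memory = [
    ("LOAD", 6),    # address 0
    ("ADD", 7),     # address 1
    ("JUMP", 4),    # address 2: skips the instruction at address 3
    ("ADD", 7),     # address 3: never executed
    ("STORE", 8),   # address 4
    ("HALT", 0),    # address 5
    20, 22, 0,      # addresses 6, 7 and 8 hold data
]
print(run(memory))   # 42
print(memory[8])     # 42
```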
3.3 (c) Parallel Processor Systems
Instruction 2 Instruction 1
Fig. 3.3.c.1
This helps with the speed of throughput unless the next instruction in the pipe is not
the next one that is needed. Suppose Instruction 2 is a jump to Instruction 10. Then
Instructions 3, 4 and 5 need to be removed from the pipe and Instruction 10 needs to
be loaded into the fetch part of the pipe. Thus, the pipe will have to be cleared and
the cycle restarted in this case. The result is shown in Fig. 3.3.c.2
Instruction 2 Instruction 1
Instruction 10
Instruction 11 Instruction 10
Fig. 3.3.c.2
The effect of pipelining is that there are three instructions being dealt with at the
same time. This should reduce the execution times considerably (to approximately
1/3 of the standard times). However, this would only be true for a very linear program.
Once jump instructions are introduced, the problem arises that the wrong instructions
are in the pipeline waiting to be executed, so every time the sequence of instructions
changes, the pipeline has to be cleared and the process started again.
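The "approximately 1/3" figure, and the cost of clearing the pipe, can be checked with a simplified model. It assumes that each of the three stages takes one clock cycle and that every jump taken wastes the two instructions behind it; real processors are more complicated, so this is only a sketch.

```python
STAGES = 3  # fetch, decode, execute

def cycles_without_pipeline(instructions):
    """Each instruction passes through all three stages before the next one starts."""
    return STAGES * instructions

def cycles_with_pipeline(instructions, jumps_taken=0):
    """Cycles for a three-stage pipe, with (STAGES - 1) cycles lost per jump taken."""
    if instructions == 0:
        return 0
    fill = STAGES - 1                 # cycles needed before the first result appears
    return fill + instructions + jumps_taken * (STAGES - 1)

print(cycles_without_pipeline(1000))                 # 3000
print(cycles_with_pipeline(1000))                    # 1002
print(cycles_with_pipeline(1000, jumps_taken=200))   # 1402
```

With no jumps the pipelined time is very close to one third of the original; with one jump in every five instructions the saving is noticeably smaller.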
Another type of computer architecture is to use many processors, each carrying out an
individual instruction at the same time as its partners. This type of processing uses an
architecture known as parallel processing, which involves many independent
processors working in parallel on the same program. One of the difficulties with this
is that the programs running on these systems need to have been written specially for
them. If the programs have been written for standard architectures, then some
instructions cannot be completed until others have been completed. Thus, checks
have to be made to ensure that all prerequisites have been completed. However, these
systems are in use particularly when systems are receiving many inputs from sensors
and the data need to be processed in parallel. A simple example that shows how the
use of parallel processors can speed up a solution is the summing of a series of
numbers. Consider finding the sum of n numbers such as
2 + 4 + 23 + 21 + …. + 75 + 54 + 3
Using a single processor would involve (n – 1) additions, one after the other. Using
n/2 processors we could simultaneously add n/2 pairs of numbers in the same time it
would take a single processor to add one pair of numbers. This would leave only n/2
numbers to be added and this could be done using n/4 processors. Continuing in this
way the time to add the series would be considerably reduced.
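The saving can be counted with a short sketch. The list comprehension below runs on one processor, so it only models the idea: each pass of the loop stands for one time step in which every pair is added at once. The eight numbers are made up for the example and the list is assumed not to be empty.

```python
def parallel_sum(numbers):
    """Add numbers in pairs; each pass of the loop models one parallel time step."""
    values = list(numbers)
    steps = 0
    while len(values) > 1:
        pairs = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:          # an odd value out is carried to the next step
            pairs.append(values[-1])
        values = pairs
        steps += 1
    return values[0], steps

total, steps = parallel_sum([2, 4, 23, 21, 75, 54, 3, 18])
print(total, steps)  # 200 3
```

A single processor would need 7 additions one after the other for these 8 numbers, whereas the pairwise method needs only 3 time steps.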
3.3 Example Questions
The questions in this section are meant to mirror the type and form of
questions that a candidate would expect to see in an exam paper. As before,
the individual questions are each followed up with comments from an
examiner.
b) Describe two ways in which the program counter can change during the
normal execution of a program, explaining, in each case, how this change is
initiated. (4)
c) Describe the initial state of the program counter before the running of the
program. (2)
b) State one type of instruction that would cause the pipeline system to be reset,
explaining why such a reset is necessary. (3)
Chapter 3.4 Data Representation, Data Structures and Data Manipulation
Counting is one of the first skills that a young child masters, and none of us consider
counting from 1 to 100 difficult. However, to count, we have to learn, by heart, the
meanings of the symbols 0,1,2…9 and also to understand that two identical symbols
mean totally different things according to their ‘place’ in the number. For instance, in
23 the 2 actually means 2 * 10. But why multiply by 10? Why not multiply by 6? The
answer is simply that we were taught to do that because we have 10 fingers, so we can
count on our fingers until we get to the last one, which we remember in the next
column and then start again.
We don’t need to count in tens. The ancient Babylonians counted in a system, which
is similar to counting in sixties. This is very difficult to learn because of all the
symbols needed, but we still use a system based on sixties today: 60 minutes = 1 hour;
60 seconds = 1 minute; 6 * 60 degrees = 1 revolution.
Instead of increasing the number of symbols in a system, which makes the system
more difficult, it seems reasonable that if we decrease the number of symbols the
system will be easier to use.
A computer is an electronic machine. Electricity can be either on or off. If electricity
is not flowing through a wire it can stand for 0. If electricity is flowing, then it stands
for 1. The difficulty is what to do for the number 2. We can’t just pump through twice
as much electricity, what we need is a carry system, just like what happens when we
run out of fingers. What we need is another wire.
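Starting from 0 and adding 1 each time:
ADD 1: first wire electricity (1), second wire no electricity (0) = 1
ADD 1: first wire no electricity (0) with a carry, second wire electricity (1) = 2
ADD 1: first wire electricity (1), second wire electricity (1) = 3
ADD 1: first wire no electricity (0) with a carry, second wire no electricity (0) with a carry, third wire electricity (1) = 4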
The computer can continue like this for ever, just adding more wires when it gets
bigger numbers.
This system, where there are only two digits, 0 and 1, is known as the binary system.
Each wire, or digit, is known as a binary digit. This name is normally shortened to
BIT. So each digit, 0 or 1, is one bit. A single bit has very few uses so they are
grouped together. A group of bits is called a BYTE. Usually a byte has 8 bits in it.
The first thing we must be able to do with the binary system is to change numbers
from our system of 10 numbers (the denary system) into binary, and back again.
There are a number of methods for doing this, the simplest being to use the sort of
column diagrams, which were used in primary school to do simple arithmetic
Thousands Hundreds Tens Units
except, this time we are using binary, so the column headings go up in twos instead of
tens
32s 16s 8s 4s 2s units
To turn a denary number into a binary number simply put the column headings, start
at the left hand side and follow the steps:
• If the column heading is less than or equal to the number, put a 1 in the column and
then subtract the column heading from the number. Then start again with the next
column on the right.
• If the column heading is greater than the number, put a 0 in the column and start
again with the next column on the right.
Note: You will be expected to be able to do this with numbers up to 255, because that
is the biggest number that can be stored in one byte of eight bits.
e.g. Change 117 (in denary) into a binary number.
Answer: Always use the column headings for a byte (8 bits)
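128 64 32 16 8 4 2 1
128 is greater than 117, so put a 0 and repeat.
128 64 32 16 8 4 2 1
0
64 is less than 117, so put a 1.
128 64 32 16 8 4 2 1
0 1
Take 64 from 117 = 53, and repeat.
32 is less than 53, so put a 1.
128 64 32 16 8 4 2 1
0 1 1
Take 32 from 53 = 21, and repeat.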
If you continue this the result (try it) is
128 64 32 16 8 4 2 1
0 1 1 1 0 1 0 1
So 117 (in denary) = 01110101 (in binary).
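The column method translates directly into a short routine. The function name is chosen for the illustration; the test used is "less than or equal to", so that a number equal to a column heading also gets a 1.

```python
def denary_to_binary(number):
    """Convert 0..255 to an 8-bit string using the column-heading method."""
    if not 0 <= number <= 255:
        raise ValueError("one byte can only hold 0 to 255")
    bits = ""
    for heading in (128, 64, 32, 16, 8, 4, 2, 1):
        if heading <= number:
            bits += "1"            # the heading fits, so subtract it
            number -= heading
        else:
            bits += "0"
    return bits

print(denary_to_binary(117))  # 01110101
```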
To turn a binary number into denary, simply put the column headings above the
binary number and add up all the columns with a 1 in them.
This principle can be used for any number system, even the Babylonians’ sixties if
you can learn the symbols.
e.g. If we count in eights (called the OCTAL system) the column headings go up in
8’s.
512 64 8 1
Another system is called HEXADECIMAL (counting in 16’s). This sounds very
difficult, but it needn’t be, just use the same principles.
256 16 1
So 117 (in denary) is 7 lots of 16 (112) plus an extra 5. Fitting this in the columns
gives
256 16 1
0 7 5
Notice that 7 in binary is 0111 and that 5 is 0101, put them together and we get
01110101 which is the binary value of 117 again. So binary, octal and hexadecimal
are all related in some way.
There is a problem with counting in 16’s instead of the other systems. We need
symbols going further than 0 to 9 (only 10 symbols and we need 16!).
We could invent 6 more symbols but we would have to learn them, so we use 6 that
we already know, the letters A to F. In hexadecimal A stands for 10, B stands for 11
and so on to F stands for 15.
So a hexadecimal number BD stands for 11 lots of 16 and 13 units
= 176 + 13
= 189 ( in denary)
Note: B = 11, which in binary = 1011
D = 13, which in binary = 1101
Put them together to get 10111101 = the binary value of 189.
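The link between hexadecimal and binary can be checked with a small sketch: each hexadecimal digit is replaced by its own group of four bits. The function names are made up for the example.

```python
HEX_DIGITS = "0123456789ABCDEF"

def hex_to_denary(hex_string):
    """Work along the digits, multiplying the running total by 16 each time."""
    total = 0
    for digit in hex_string.upper():
        total = total * 16 + HEX_DIGITS.index(digit)
    return total

def hex_to_binary(hex_string):
    """Replace each hexadecimal digit by its 4-bit binary pattern."""
    return "".join(format(HEX_DIGITS.index(d), "04b") for d in hex_string.upper())

print(hex_to_denary("BD"))  # 189
print(hex_to_binary("BD"))  # 10111101
print(hex_to_binary("75"))  # 01110101
```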
3.4 (b) Negative Integers
If a computer system uses a byte to store a number in the way that was suggested in
3.4.a there are three problems that arise. The first is that the biggest number that can
be represented is 255 because there aren’t enough bits to store bigger numbers. This is
easily solved by using more than one byte to represent a number. Most computer
systems use either two or four bytes to store a number. There is still a limit on the size
that can be represented, but it is now much larger. The second problem is not so easy
to solve, how to represent fractions. This will be looked at later in this chapter. The
third problem is how to store negative numbers.
The example we used for binary storage was 117 which becomes 01110101 in binary.
If we want to store +117 or –117, these numbers need a second piece of data to be
stored, namely the sign. There are two simple ways to store negative numbers.
Two’s Complement
The MSB stays as a number, but is made negative. This means that the column
headings are
-128 64 32 16 8 4 2 1
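A minimal sketch of these column headings in use is shown below, limited to one byte. For a negative number the -128 column is switched on and the other columns must make up the difference, for example -117 = -128 + 11. The function names are chosen for the illustration.

```python
HEADINGS = (-128, 64, 32, 16, 8, 4, 2, 1)

def twos_complement_value(bits):
    """Add up the headings of every column that contains a 1."""
    return sum(heading for heading, bit in zip(HEADINGS, bits) if bit == "1")

def to_twos_complement(number):
    """Write a number from -128 to 127 as an 8-bit two's complement string."""
    if not -128 <= number <= 127:
        raise ValueError("one byte can only hold -128 to 127")
    sign_bit = "1" if number < 0 else "0"
    remainder = number + 128 if number < 0 else number   # what the other columns add up to
    return sign_bit + format(remainder, "07b")

print(to_twos_complement(117))             # 01110101
print(to_twos_complement(-117))            # 10001011
print(twos_complement_value("10001011"))   # -117
```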
3.4 (c) Binary Arithmetic
The syllabus requires the addition of two binary integers, and the ability to take one
away from another. The numbers and the answers will be limited to one byte.
Addition.
There are four simple rules:
0 + 0 = 0
0 + 1 = 1
1 + 0 = 1
and the difficult one
1 + 1 = 0 (carry 1)
e.g. Add together the binary equivalents of 91 and 18.
Answer:   91 = 01011011
          18 = 00010010 +
               01101101 = 109
Carries:         1  1
Subtraction.
This is where two’s complement is useful. To take one number away from another,
simply write the number to be subtracted as a two’s complement negative number and
then add them up.
e.g. Work out 91 – 18 using their binary equivalents.
Answer:   91 = 01011011
–18 as a two’s complement number is –128 + 110
= –128 + (64 + 32 + 8 + 4 + 2)
= 11101110
Now add them:    01011011
                 11101110 +
               1 01001001
Carries:       1 111111
But the answer can only be 8 bits, so cross out the 9th bit giving
01001001 = 64 + 8 + 1 = 73.
Notes: Lots of carrying here makes the sum more difficult, but the same rules are
used.
One rule is extended slightly because of the carries, 1+1+1 = 1 (carry 1)
Things can get harder but this is as far as the syllabus goes.
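Both worked examples can be reproduced with a column-by-column adder. In this sketch the negative number is found by flipping every bit and adding 1, which is a standard shortcut that gives the same pattern as working from the -128 column; the function names are made up for the example.

```python
def add_bytes(a_bits, b_bits):
    """Add two 8-bit strings column by column; return (8-bit result, final carry)."""
    result, carry = "", 0
    for a, b in zip(reversed(a_bits), reversed(b_bits)):
        total = int(a) + int(b) + carry
        result = str(total % 2) + result   # the digit written in this column
        carry = total // 2                 # the carry passed to the next column
    return result, carry

def negate(bits):
    """Two's complement negation: flip every bit, then add 1."""
    flipped = "".join("1" if bit == "0" else "0" for bit in bits)
    return add_bytes(flipped, "00000001")[0]

print(add_bytes("01011011", "00010010"))          # ('01101101', 0)
print(negate("00010010"))                         # 11101110
print(add_bytes("01011011", negate("00010010")))  # ('01001001', 1)
```

The final carry of 1 in the subtraction is the 9th bit that is crossed out.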
3.4 (d) Floating Point Representation
In the first part of this chapter we learned how to represent both positive and negative
integers in two's complement form. It is important that you understand this form of
representing integers before you learn how to represent fractional numbers.
In decimal notation the number 23.456 can be written as 0.23456 x 10^2. This means
that we need only store, in decimal notation, the numbers 0.23456 and 2. The number
0.23456 is called the mantissa and the number 2 is called the exponent. This is what
happens in binary.
For example, consider the binary number 10111. This could be represented by
0.10111 x 2^5 or, writing the exponent in binary, 0.10111 x 2^101. Here 0.10111 is
the mantissa and 101 is the exponent.
Similarly, in decimal, 0.0000246 can be written 0.246 x 10^-4. Now the mantissa is
0.246 and the exponent is –4.
Thus, in binary, 0.00010101 can be written as 0.10101 x 2^-11 and 0.10101 is the
mantissa and –11 is the exponent.
It is now clear that we need to be able to store two numbers, the mantissa and the
exponent. This form of representation is called floating point form. Numbers that
involve a fractional part, like 2.467 (base 10) and 101.0101 (base 2), are called real numbers.
3.4 (e) Normalising a Real Number
In the above examples, the point in the mantissa was always placed immediately
before the first non-zero digit. This is always done like this with positive numbers
because it allows us to use the maximum number of digits.
Suppose we use 8 bits to hold the mantissa and 8 bits to hold the exponent. The
binary number 10.11011 becomes 0.1011011 x 2^10 and can be held as
Mantissa            Exponent
0 1 0 1 1 0 1 1     0 0 0 0 0 0 1 0
Notice that the first digit of the mantissa is zero and the second is one. The mantissa
is said to be normalised if the first two digits are different. Thus, for a positive
number, the first digit is always zero and the second is always one. The exponent is
always an integer and is held in two's complement form.
Now consider the binary number 0.00000101011 which is 0.101011 x 2^-101. Thus the
mantissa is 0.101011 and the exponent is -101. Again, using 8 bits for the mantissa
and 8 bits for the exponent, we have
Mantissa            Exponent
0 1 0 1 0 1 1 0     1 1 1 1 1 0 1 1
The reason for normalising the mantissa is in order to hold numbers to as high a
degree of accuracy as possible.
Care needs to be taken when normalising negative numbers. The easiest way to
normalise negative numbers is to first normalise the positive version of the number.
Consider the binary number -1011. The positive version is 1011 = 0.1011 x 2^100 and
can be represented by
Mantissa            Exponent
0 1 0 1 1 0 0 0     0 0 0 0 0 1 0 0
Now find the two's complement of the mantissa and the result is
Mantissa            Exponent
1 0 1 0 1 0 0 0     0 0 0 0 0 1 0 0
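As another example, change the decimal fraction -11/32 into a normalised floating
point binary number. Ignoring the sign for the moment,
11/32 = 0.01011 (in binary) = 0.1011 x 2^-1
so the positive version has the mantissa 0 1 0 1 1 0 0 0 and the exponent -1. Taking
the two's complement of the mantissa, we have
Mantissa            Exponent
1 0 1 0 1 0 0 0     1 1 1 1 1 1 1 1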
The fact that the first two digits are always different can be used to check for invalid
answers when doing calculations.
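The normalisation rule can be tried out with the following Python sketch. It is only an illustration under stated assumptions: the function name normalise, the use of exact fractions and the truncation of bits that do not fit are choices made here, not part of the text. It shifts the value until a positive mantissa lies between 1/2 and 1 (or a negative one between -1 and -1/2) and then writes the mantissa and exponent in two's complement.

```python
from fractions import Fraction


def normalise(value, mantissa_bits=8, exponent_bits=8):
    """Return (mantissa, exponent) bit strings for a non-zero value."""
    value = Fraction(value)
    if value == 0:
        raise ValueError("zero has no normalised form")
    exponent = 0
    # Positive mantissas lie in [1/2, 1); negative ones in [-1, -1/2).
    while not (Fraction(1, 2) <= value < 1 or -1 <= value < Fraction(-1, 2)):
        if abs(value) >= 1:
            value /= 2
            exponent += 1
        else:
            value *= 2
            exponent -= 1
    # Scale to an integer; int() truncates any bits that do not fit.
    scaled = int(value * 2 ** (mantissa_bits - 1))
    mantissa = format(scaled & (2 ** mantissa_bits - 1), f"0{mantissa_bits}b")
    exp = format(exponent & (2 ** exponent_bits - 1), f"0{exponent_bits}b")
    return mantissa, exp


print(normalise(Fraction(91, 32)))   # binary 10.11011
print(normalise(-11))                # binary -1011
print(normalise(Fraction(-11, 32)))
```

In every result the first two digits of the mantissa differ, which is the check for a normalised number described above.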
3.4 (f) Accuracy and Range
There are always a finite number of bits that can be used to represent numbers in a
computer. This means that if we use more bits for the mantissa we will have to use
fewer bits for the exponent.
Let us start off by using 8 bits for the mantissa and 8 bits for the exponent. The
largest positive value we can have for the mantissa is 0.1111111 and the largest
positive number we can have for the exponent is 01111111 (+127). This means that the
largest positive number we can represent is
0.1111111 x 2^127 = (1 - 2^-7) x 2^127
The smallest positive mantissa is 0.1000000 and the smallest exponent is 10000000
(-128). This represents
0.1000000 x 2^-128 = 2^-1 x 2^-128 = 2^-129
The largest negative number (i.e. the negative number closest to zero) is
1.0111111 x 2^-128 = -(2^-1 + 2^-7) x 2^-128
Note that we cannot use 1.1111111 for the mantissa because it is not normalised. The
first two digits must be different.
The smallest negative number (i.e. the negative number furthest from zero) is
1.0000000 x 2^127 = -1 x 2^127
Have you noticed that zero cannot be represented in normalised form? This is
because 0.0000000 is not normalised because the first two digits are the same.
Usually, the computer uses the smallest positive number to represent zero.
Notice also that when we talk about the size of a number we mean its position on a
number line: the further to the right, the larger the number, so -1 is a bigger number
than -2. If, however, we talk about magnitude, then -2 has a greater magnitude than -1
because its value, ignoring the sign, is greater.
Now suppose we use 12 of the bits for the mantissa and the other four for the
exponent. This will increase the number of bits in the mantissa that can be used to
represent the number. That is, we shall have more binary places in the mantissa and
hence greater accuracy. However, the range of values in the exponent is from –8 to
+7 and this produces a very small range of numbers. We have increased the accuracy
but at the expense of the range of numbers that can be represented.
Similarly, reducing the size of the mantissa reduces the accuracy, but we have a much
greater range of values as the exponent can now take larger values.
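The trade-off between accuracy and range can be explored with a small Python sketch. The function name extremes and its return order are assumptions made for illustration. It works out, for a given split of bits, the largest and smallest exponents and the largest and smallest positive normalised numbers.

```python
from fractions import Fraction


def extremes(mantissa_bits, exponent_bits):
    """Return (max_exp, min_exp, largest_positive, smallest_positive)."""
    places = mantissa_bits - 1                 # binary places after the point
    max_exp = 2 ** (exponent_bits - 1) - 1     # e.g. +127 for 8 bits
    min_exp = -(2 ** (exponent_bits - 1))      # e.g. -128 for 8 bits
    # Largest mantissa is 0.111...1, smallest positive mantissa is 0.100...0.
    largest = (1 - Fraction(1, 2 ** places)) * Fraction(2) ** max_exp
    smallest = Fraction(1, 2) * Fraction(2) ** min_exp
    return max_exp, min_exp, largest, smallest


print(extremes(8, 8)[:2])      # exponent range with an 8/8 split
print(extremes(12, 4)[:2])     # exponent range with a 12/4 split
print(float(extremes(12, 4)[2]))
```

With 12 bits for the mantissa the exponent only runs from -8 to +7, so the largest number that can be held is just under 128, although it is held to 11 binary places.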
3.4 (g) Static and Dynamic Data Structures
Static data structures are those structures that do not change in size while the program
is running. A typical static data structure is an array because once you declare its size,
it cannot be changed. (In fact, there are some languages that do allow the size of
arrays to be changed in which case they become dynamic data structures.)
Dynamic data structures can increase and decrease in size while a program is running.
A typical example is a linked list.
The following table gives advantages and disadvantages of the two types of data
structure.
                    Advantages                   Disadvantages
Static structures   Compiler can allocate        Programmer has to estimate the
                    space during compilation.    maximum amount of space that is
                    Easy to program.             going to be needed.
3.4 (h) Algorithms
Consider Fig. 3.4.h.1 which shows a linked list and a free list. The linked list is
created by removing cells from the front of the free list and inserting them in the
correct position in the linked list.
Fig. 3.4.h.1
Now suppose we wish to insert an element between the second and third cells in the
linked list. The pointers have to be changed to those in Fig. 3.4.h.2.
Fig. 3.4.h.2
The algorithm must check for an empty free list as there is then no way of adding new
data. It must also check to see if the new data is to be inserted at the front of the list.
If neither of these applies, the algorithm must search the list to find the position
for the new data. The algorithm is given below.
1. Check that the free list is not empty.
2. If it is empty report an error and stop.
3. Set NEW to equal FREE.
4. Remove the node from the free list by setting FREE to the pointer in the cell
pointed to by FREE.
5. Copy data into cell pointed to by NEW.
6. Check for an empty list by seeing if HEAD is NULL
7. If HEAD is NULL then
a. Pointer in cell pointed to by NEW is set to NULL
b. Set HEAD to NEW and stop.
8. If data is less than data in first cell THEN
a. Set pointer in cell pointed to by NEW to HEAD.
b. Set HEAD to NEW and stop
9. Search list sequentially until the cell found is the one immediately before the
new cell that is to be inserted. Call this cell PREVIOUS.
10. Copy the pointer in PREVIOUS into TEMP.
11. Make the pointer in PREVIOUS equal to NEW
12. Make the pointer in the cell pointed to by NEW equal to TEMP and stop.
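The twelve steps above can be sketched in Python using two parallel arrays, one for the data and one for the pointers. This is an illustration only: the class name LinkedList, the use of -1 for NULL and cells numbered from 0 are assumptions made for this sketch.

```python
NULL = -1  # marks the end of a list (an assumption of this sketch)


class LinkedList:
    def __init__(self, size):
        self.data = [None] * size
        # Initially every cell is on the free list, each pointing to the next.
        self.pointer = [i + 1 for i in range(size)]
        if size:
            self.pointer[-1] = NULL
        self.head = NULL
        self.free = 0 if size else NULL

    def insert(self, item):
        if self.free == NULL:                       # steps 1 and 2
            raise OverflowError("free list is empty")
        new = self.free                             # step 3
        self.free = self.pointer[self.free]         # step 4
        self.data[new] = item                       # step 5
        if self.head == NULL or item < self.data[self.head]:
            self.pointer[new] = self.head           # steps 6 to 8
            self.head = new
            return
        previous = self.head                        # step 9
        while (self.pointer[previous] != NULL
               and self.data[self.pointer[previous]] < item):
            previous = self.pointer[previous]
        temp = self.pointer[previous]               # step 10
        self.pointer[previous] = new                # step 11
        self.pointer[new] = temp                    # step 12

    def items(self):
        """Follow the pointers from HEAD and collect the data in order."""
        result, cell = [], self.head
        while cell != NULL:
            result.append(self.data[cell])
            cell = self.pointer[cell]
        return result


numbers = LinkedList(4)
for value in (30, 10, 20, 40):
    numbers.insert(value)
print(numbers.items())
```

Steps 6 to 8 are combined here because setting the new cell's pointer to HEAD gives NULL when the list is empty, which is exactly what step 7 requires.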
Suppose we wish to delete the third cell in the linked list shown in Fig. 3.4.h.1. The
result is shown in Fig. 3.4.h.3.
Fig. 3.4.h.3
In this case, the algorithm must make sure that there is something in the list to delete.
Linked Lists - Amendment
Amendments can be done by searching the list to find the cell to be amended.
The algorithm is
Assuming that the data in a linked list is in ascending order of some key value, the
following algorithm explains how to find where to insert a cell containing new data
and keep the list in ascending order. It assumes that the list is not empty and the data
is not to be inserted at the head of the list.
Note: A number of methods have been shown here to describe algorithms associated
with linked lists. Any method is acceptable provided it explains the method. An
algorithm does not have to be in pseudo code, indeed, the sensible way of explaining
these types of algorithm is often by diagram.
Stacks – Insertion
Fig. 3.4.h.4 shows a stack and its head pointer. Remember, a stack is a last-in-first-
out (LIFO) data structure. If we are to insert an item into a stack we must first check
that the stack is not full. Having done this we shall increment the pointer and then
insert the new data item into the cell pointed to by the stack pointer. This method
assumes that the cells are numbered from 1 upwards and that, when the stack is
empty, the pointer is zero.
[Fig. 3.4.h.4: a stack of data cells with a pointer to the top of the stack]
Stacks – Deletion
When an item is deleted from a stack, the item's value is copied and the stack pointer
is moved down one cell. The data itself is not deleted. This time, we must check that
the stack is not empty before trying to delete an item.
These are the only two operations you can perform on a stack.
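The two stack operations can be sketched in Python as follows. The class name Stack and the method names push and pop are assumptions; the sketch follows the description above, with cells numbered from 1 and a pointer of zero meaning an empty stack.

```python
class Stack:
    def __init__(self, size):
        self.cells = [None] * (size + 1)  # cell 0 is unused so cells start at 1
        self.size = size
        self.top = 0                      # zero means the stack is empty

    def push(self, item):
        if self.top == self.size:
            raise OverflowError("stack is full")
        self.top += 1                     # increment the pointer first
        self.cells[self.top] = item       # then store the new item

    def pop(self):
        if self.top == 0:
            raise IndexError("stack is empty")
        item = self.cells[self.top]       # copy the value; the cell is not cleared
        self.top -= 1                     # move the pointer down one cell
        return item


stack = Stack(3)
stack.push("A")
stack.push("B")
print(stack.pop(), stack.pop())
```

The last item pushed is the first one returned, as expected of a LIFO structure.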
Queues - Insertion
Fig. 3.4.h.5 shows a queue and its head and tail pointers. Remember, a queue is a
first-in-first-out (FIFO) data structure. If we are to insert an item into a queue we
must first check that the queue is not full. Having done this we shall increment the
head pointer and then insert the new data item into the cell pointed to by the head
pointer. This method assumes that the cells are numbered from 1 upwards and that, when
the queue is empty, the two pointers point to the same cell.
[Fig. 3.4.h.5: a queue of data cells with a head pointer and a tail pointer]
Queues - Deletion
Before trying to delete an item, we must check to see that the queue is not empty.
Using the representation above, this will occur when the head and tail pointers point
to the same cell.
These are the only two operations that can be performed on a queue.
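A simple (non-circular) queue can be sketched in Python as below. The class and method names are assumptions. The sketch follows the text in writing new items at the head pointer and treating equal pointers as an empty queue; removing items by advancing the tail pointer is an assumption of this sketch, since the text does not spell out that step.

```python
class Queue:
    def __init__(self, size):
        self.cells = [None] * (size + 1)  # cell 0 unused; cells numbered from 1
        self.size = size
        self.head = 0                     # last cell written
        self.tail = 0                     # last cell read

    def insert(self, item):
        if self.head == self.size:
            raise OverflowError("queue is full")
        self.head += 1
        self.cells[self.head] = item

    def delete(self):
        if self.head == self.tail:        # equal pointers: queue is empty
            raise IndexError("queue is empty")
        self.tail += 1
        return self.cells[self.tail]


queue = Queue(3)
queue.insert("first")
queue.insert("second")
print(queue.delete(), queue.delete())
```

Cells are not reused in this sketch, so once the head pointer reaches the last cell the queue reports that it is full even if items have been removed.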
3.4 (i) Algorithms for Trees
Trees - Insertion
When inserting an item into a binary tree, it is usual to preserve some sort of order.
Consider the tree shown in Fig. 3.4.i.1 showing a tree containing data that is stored
according to its alphabetic order.
To add a new value, we look at each node starting at the root. If the new value is less
than the value at the node move left, otherwise move right. Repeat this for each node
arrived at until there is no node. Insert a new node at this point and enter the data.
Now let's try putting "Jack Spratt could eat no fat" into a tree. Jack must be the root
of the tree. Spratt comes after Jack so go right and enter Spratt. could comes before
Jack, so go left and enter could. eat is before Jack so go left, it's after could so go
right. This is continued to produce the tree in Fig. 3.4.i.1.
          Jack
         /    \
     could    Spratt
         \    /
        eat  no
          \
          fat
Fig. 3.4.i.1
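Using this algorithm to add the word "and", we follow these steps: "and" comes before
Jack so go left; it comes before could so go left again. There is no node here, so
insert a new node and enter "and". The result is shown in Fig. 3.4.i.2.
          Jack
         /    \
     could    Spratt
     /   \    /
   and  eat  no
          \
          fat
Fig. 3.4.i.2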
If the values are read from the left to the right, visiting each node's left subtree,
then the node itself, then its right subtree, the values can be read in order. There
are many different ways of reading the values from a tree, but the simplest is the one
illustrated by the dashed line in the diagram above. It is also the only way that you
will be asked to read a tree in an examination.
The diagram with the dashed line would be read as
Could, Eat, Fat, Jack, No, Spratt
Note that the words are now in alphabetic order.
3.4 (j) Searching Methods
There are many different methods of searching but only two will be considered here.
These methods are the serial search and the binary search.
The serial search expects the data to be in consecutive locations such as in an array. It
does not expect the data to be in any particular order. To find the position of a
particular value involves looking at each value in turn and comparing it with the value
you are looking for. When the value is found you need to note its position. You must
also be able to report that a value has not been found in some circumstances.
This method can be very slow, particularly if there are a large number of values. The
least number of comparisons is one; this occurs if the first item in the list is the one
you want. However, if there are n values in the list, you will need to make n
comparisons if the value you want is the last value. This means that, on average, you
will look at about n/2 values in each search. Clearly, if n is large, this can be a
very large number of comparisons.
Now suppose the list is sorted into ascending order as shown below.
Anne
Bhari
Chu
Diane
Ejo
Frank
Gloria
Hazel
Suppose we wish to find the position of Chu. We compare Chu with the value that is in
the middle of the table. With eight values, Diane and Ejo are both near the middle; we
usually take the smaller value, Diane. As Chu is before Diane, we now look only at
this list.
Anne
Bhari
Chu
Diane
Now we only have a list of four entries. That is we have halved the length of the list.
We now compare Chu with Bhari and find that Chu is greater than Bhari so we use
the list
Chu
Diane
which is half the length of the previous list. Comparing Chu with Chu we have found
the position we want. This has only taken three comparisons.
Clearly, this is more efficient than the serial search method. However, the data must
first be sorted and this can take some time.
The formal algorithms for these searching techniques are not necessary as part of the
syllabus, but are included here because the authors are aware that some students are
interested in the more formal solutions to problems. It must be stressed that the
remainder of this section 3.4 (j) is not part of the CIE syllabus but is included here
for interest only.
Serial Search
Let the data consist of n values held in an array called DataArray which has subscripts
numbered from 1 upwards. Let X be the value we are trying to find. We must check
that the array is not empty before starting the search.
Note that the for loop only acts on the indented line. If there is more than one
operation to perform inside a loop, make sure that all the lines are indented.
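A serial search along the lines described above might look like this in Python. The function name serial_search and the choice to return None when the value is not found are assumptions; positions are reported from 1 upwards to match the array subscripts used in the text.

```python
def serial_search(data_array, x):
    """Return the position (from 1) of x in data_array, or None if absent."""
    if not data_array:                   # check that the array is not empty
        return None
    for position, value in enumerate(data_array, start=1):
        if value == x:
            return position              # found: note the position and stop
    return None                          # every value examined without success


names = ["Anne", "Bhari", "Chu", "Diane", "Ejo", "Frank", "Gloria", "Hazel"]
print(serial_search(names, "Chu"))
print(serial_search(names, "Zak"))
```

As in the note above, everything that belongs to the loop is indented beneath it.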
Binary Search
Assume that the data is held in an array as described above but that the data is in
ascending order in the array. We must split the list in two. If there is an even
number of values, dividing by two will give a whole number and this will tell us
where to split the list. However, if the list consists of an odd number of values,
dividing by two will not give a whole number, so we take the integer part of the
result, as an array subscript must be an integer.
We must also make sure that, when we split a list in two, we use the correct one for
the next search. Suppose we have a list of eight values. Splitting this gives a list of
the four values in cells 1 to 4 and four values in cells 5 to 8. When we started we
needed to consider the list in cells 1 to 8. That is the first cell was 1 and the last cell
was 8. Now, if we move into the first list (cells 1 to 4), the first cell stays at 1 but the
last cell becomes 4. Similarly, if we use the second list (cells 5 to 8), the first cell
becomes 5 and the last is still 8. This means that if we use the first list, the first cell in
the new list is unchanged but the last is changed. However, if we use the second list,
the first cell is changed but the last is not changed. This gives us the clue of how to
do the search.
1. While the search list is not empty
   a. Find the cell at the mid-point of the current search list.
   b. If the value in this cell is the required value, return the cell position and
      stop.
   c. If the value to be found is less than the value in the mid-point cell then
      i. Make the search list the first half of the current search list
   d. Else make the search list the second half of the current search list.
2. Report error, item not in the list.
A more detailed algorithm is given below that is useful if you wish to program the
binary search. You would not be expected to be able to reproduce this during an
examination.
In this algorithm, Vector is a one-dimensional array that holds the data in ascending
order. X is the value we are trying to find and n is the number of values in the array.
{Initialisation}
First = 1
Last = n
Found = FALSE
{Perform the search}
WHILE First <= Last AND NOT Found DO
    {Obtain index of mid-point of interval}
    Mid = INT((First + Last) / 2)
    {Compare the values}
    IF X < Vector[Mid] THEN
        Last = Mid - 1
    ELSE
        IF X > Vector[Mid] THEN
            First = Mid + 1
        ELSE
            Output 'Value is at ', Mid
            Found = TRUE
        ENDIF
    ENDIF
ENDWHILE
IF NOT Found THEN
    {Unsuccessful search}
    Output 'Value not in the list'
ENDIF
END
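For readers who wish to run the algorithm, here is a Python sketch of the same binary search. The function name binary_search is an assumption. Python lists are numbered from 0, so the sketch adds 1 when reporting the position in order to match the subscripts used in the text, and it returns None when the value is not in the list.

```python
def binary_search(vector, x):
    """Return the position (from 1) of x in the sorted list, or None."""
    first, last = 0, len(vector) - 1
    while first <= last:
        mid = (first + last) // 2        # integer part of the mid-point
        if x < vector[mid]:
            last = mid - 1               # keep the first half
        elif x > vector[mid]:
            first = mid + 1              # keep the second half
        else:
            return mid + 1               # found
    return None                          # unsuccessful search


names = ["Anne", "Bhari", "Chu", "Diane", "Ejo", "Frank", "Gloria", "Hazel"]
print(binary_search(names, "Chu"))
print(binary_search(names, "Zak"))
```

Searching for Chu compares it with Diane, then Bhari, then Chu, which is the same three comparisons as in the worked example.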
3.4 (k) Sorting and Merging
Sorting is placing values in an order such as numeric order or alphabetic order. The
order may be ascending or descending. For example, the values
3 5 6 8 12 16 25
are in ascending numeric order.
Merging is taking two lists which have been sorted into the same order and putting
them together to form a single sorted list. For example, two sorted lists of names
can be merged to give the single sorted list
Annis Bharri Chu Emi Kris Liz Mattu Medis Parrash Roger Ste Will
There are many methods that can be used to sort lists. You only need to understand
two of them. These are the insertion sort and the merge sort. This Section describes
the two sorts and a merge in general terms; the next Section gives the algorithms.
Insertion Sort
In this method we compare each number in turn with the numbers before it in the list.
We then insert the number into its correct position.
20 47 12 53 32 84 85 96 45 18
We start with the second number, 47, and compare it with the numbers preceding it.
There is only one and it is less than 47, so no change in the order is made. We now
compare the third number, 12, with its predecessors. 12 is less than 20 so 12 is
inserted before 20 in the list to give the list
12 20 47 53 32 84 85 96 45 18
This is continued until the last number is inserted in its correct position. In Fig.
3.4.k.1 the blue numbers are the ones before the one we are trying to insert in the
correct position. The red number is the one we are trying to insert.
20 47 12 53 32 84 85 96 45 18 Original list, start with second number.
20 47 12 53 32 84 85 96 45 18 No change needed.
20 47 12 53 32 84 85 96 45 18 Now compare 12 with its predecessors.
12 20 47 53 32 84 85 96 45 18 Insert 12 before 20.
12 20 47 53 32 84 85 96 45 18 Move to next value.
12 20 47 53 32 84 85 96 45 18 53 is in the correct place.
12 20 47 53 32 84 85 96 45 18 Move to the next value.
12 20 32 47 53 84 85 96 45 18 Insert it between 20 and 47
12 20 32 47 53 84 85 96 45 18 Move to the next value.
12 20 32 47 53 84 85 96 45 18 84 is in the correct place.
12 20 32 47 53 84 85 96 45 18 Move to the next value.
12 20 32 47 53 84 85 96 45 18 85 is in the correct place.
12 20 32 47 53 84 85 96 45 18 Move to the next value.
12 20 32 47 53 84 85 96 45 18 96 is in the correct place.
12 20 32 47 53 84 85 96 45 18 Move to the next value.
12 20 32 45 47 53 84 85 96 18 Insert 45 between 32 and 47.
12 20 32 45 47 53 84 85 96 18 Move to the next value.
12 18 20 32 45 47 53 84 85 96 Insert 18 between 12 and 20.
Fig. 3.4.k.1
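The insertion sort traced in Fig. 3.4.k.1 can be sketched in Python as follows. The function name insertion_sort is an assumption, and the sketch works on a copy so that the original list is left unchanged.

```python
def insertion_sort(values):
    """Return a new list holding the values in ascending order."""
    values = list(values)
    for i in range(1, len(values)):          # start with the second number
        current = values[i]                  # the number we are trying to insert
        j = i - 1
        # Move larger predecessors one place to the right.
        while j >= 0 and values[j] > current:
            values[j + 1] = values[j]
            j -= 1
        values[j + 1] = current              # insert in its correct position
    return values


print(insertion_sort([20, 47, 12, 53, 32, 84, 85, 96, 45, 18]))
```

The result should match the last row of the figure.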
Merging
Consider the two sorted lists
2 4 7 10 15 and 3 5 12 14 18 26
In order to merge these two lists, we first compare the first values in each list, that is 2
and 3. 2 is less than 3 so we put it in the new list.
New = 2
Since 2 came from the first list we now use the next value in the first list and compare
it with the number from the second list (as we have not yet used it). 3 is less than 4 so
3 is placed in the new list.
New = 2 3
As 3 came from the second list we use the next number in the second list and compare
it with 4. This is continued until one of the lists is exhausted. We then copy the rest
of the other list into the new list. The full merge is shown in Fig. 3.4.k.2.
First list     Second list        New list
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7 10
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7 10 12
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7 10 12 14
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7 10 12 14 15
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7 10 12 14 15 18
2 4 7 10 15    3 5 12 14 18 26    2 3 4 5 7 10 12 14 15 18 26
Fig. 3.4.k.2
3.4 (l) Sorting Algorithms
Insertion Sort
Merging
The following algorithm merges two sets of data that are held in two
one-dimensional arrays called VectorA[1 to M] and VectorB[1 to N] into a third
one-dimensional array called VectorC. VectorA and VectorB have been sorted into
ascending order and if there are any duplicates they are only copied once into
VectorC.
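A merge of this kind can be sketched in Python. This is an illustration rather than the syllabus algorithm: the names merge and append_once are assumptions, and Python lists stand in for the arrays VectorA, VectorB and VectorC.

```python
def merge(vector_a, vector_b):
    """Merge two ascending lists, copying any duplicate value only once."""
    vector_c = []
    i = j = 0

    def append_once(value):
        # Skip the value if it was the last one copied into vector_c.
        if not vector_c or vector_c[-1] != value:
            vector_c.append(value)

    while i < len(vector_a) and j < len(vector_b):
        if vector_a[i] <= vector_b[j]:
            append_once(vector_a[i])
            i += 1
        else:
            append_once(vector_b[j])
            j += 1
    # One list is exhausted; copy the rest of the other list.
    for value in vector_a[i:] + vector_b[j:]:
        append_once(value)
    return vector_c


print(merge([2, 4, 7, 10, 15], [3, 5, 12, 14, 18, 26]))
```

Because both inputs are in ascending order, a duplicate can only ever equal the value most recently copied, so comparing with the last item of the new list is enough.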
3.4 (m) Sorting using a Binary Tree
See the end of section 3.4 (i) for the simple algorithm to turn the items in
a binary tree into a sorted list.
3.4 Example Questions
4. Describe a floating point representation for real numbers using two bytes. (4)
5. a) Explain how the fraction part of a real number can be normalised. (2)
b) State the benefit obtained by storing real numbers using normalised form. (1)
that can be stored. Give each answer as an 8 bit binary value and as a decimal
equivalent. (4)
b) Explain the relationship between accuracy and range when storing floating
point representations of real numbers. (4)
7. State the difference between dynamic and static data structures giving an
example of each. (3)
8. a) Show how a binary tree can be used to store the data items Feddi, Eda, Joh,
Sean, Dav, Gali in alphabetic order. (4)
b) Explain why problems may arise if Joh is deleted from the tree and how such
problems may be overcome. (4)
9. Describe two types of search routine, giving an indication of when it would be
advisable to use each. (6)
10. Describe the steps in sorting a list of numbers into order using an insertion
sort. (4)
11. Given two sorted lists describe an algorithm for merging them into one sorted
list. (6)
Chapter 3.5 Programming Paradigms
3.5 Introduction
Thus, treat the code in this Chapter simply as an explanation of the different facilities
in different programming paradigms. Do not think that you have to be able to
program in all the languages used in the following Sections.
3.5 (a) Programming Paradigms
Figure 3.5.a.1
The next advance was the development of procedural languages. These are third
generation languages and are also known as high-level languages. These languages
are problem oriented as they use terms appropriate to the type of problem being
solved. For example, COBOL (Common Business Oriented Language) uses the
language of business. It uses terms like file, move and copy.
The problem with procedural languages is that it can be difficult to reuse code and to
modify solutions when better methods of solution are developed. In order to address
these problems, object-oriented languages (like Eiffel, Smalltalk and Java) were
developed. In these languages, data and the methods of manipulating the data are kept
as a single unit called an object. The only way that a user can access the data is via the
object's methods. This means that, once an object is fully working, it cannot be
corrupted by the user. It also means that the internal workings of an object may be
changed without affecting any code that uses the object.
A further advance was made when declarative programming paradigms were
developed. In these languages the computer is told what the problem is, not how to
solve the problem. Given a database the computer searches for a solution. The
computer is not given a procedure to follow as in the languages discussed so far.
3.5 (b) Programming Paradigms and examples.
Procedural languages specify, exactly, the steps required to solve a problem. These
languages use the constructs: sequence, selection and repetition (see Section 1.3 in the
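AS text). For example, to find the area of a rectangle the steps are
1. Read the length.
2. Read the width.
3. Multiply the length by the width and store the result.
4. Output the result.
Here each line of code is executed one after the other in sequence.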
Most procedural languages have two methods of selection. These are the IF …
THEN … ELSE statement and the SWITCH or CASE statement. For example, in
C++, we have
if (Number > 0)
    cout << "The number is positive.";
else
{
    if (Number == 0)
        cout << "The number is zero.";
    else
        cout << "The number is negative.";
}
In C++ multiple selections can be programmed using the SWITCH statement. For
example, suppose a user enters a single letter and the output depends on that letter, a
typical piece of code could be
switch (UserChoice)
{
    case 'A':
        cout << "A is for Apple.";
        break;
    case 'B':
        cout << "B is for Banana.";
        break;
    case 'C':
        cout << "C is for Cat.";
        break;
    default:
        cout << "I don't recognise that letter.";
}
Repetition is programmed using loops. The usual forms are
FOR … NEXT
REPEAT … UNTIL …
WHILE … ENDWHILE
A typical use of a loop is to add a series of numbers. The following pieces of C++
code add the first ten positive integers.
The point to note with these procedural languages is that the programmer has to
specify exactly what the computer is to do.
Procedural languages are used to solve a wide variety of problems. Some of these
languages are more robust than others. This means that the compiler will not let the
programmer write statements that may lead to problems in certain circumstances. As
stated earlier, there are procedural languages designed to solve scientific and
engineering problems while others are more suitable for solving business problems.
There are some that are particularly designed for solving problems of control that
need real time solutions.
Procedural languages may use functions and procedures but they always specify the
order in which instructions must be used to solve a problem. The use of functions and
procedures helps programmers to reuse code but there is always the danger of variables
being altered inadvertently.
In the 1970s it was realised that code was not easily reused and there was little
security of data in a program. Also, the real world consists of objects not individual
values. My car, registration number W123ARB, is an object. Kay's car, registration
number S123KAY, is another object. Both of these objects are cars and cars have
similar attributes such as registration number, engine capacity, colour, and so on.
That is, my car and Kay's car are instances of a class called cars. In order to model
the real world, the Object-oriented Programming (OOP) paradigm was developed.
Unfortunately, OOP requires a large amount of memory and, in the 1970s, memory
was expensive and CPUs still lacked power. This slowed the development of OOP.
However, as memory became cheaper and CPUs more powerful, OOP became more
popular. By the 1980s Smalltalk, and later Eiffel, had become well established.
These were true object-oriented languages. C++ also includes classes although the
programmer does not have to use them. This means that C++ can be used as a
standard procedural language or an object-oriented language or a mixture of both!
Java, with a syntax similar to C++, is a fully object-oriented language. Although
OOP languages are procedural in nature, OOP is considered to be a new programming
paradigm.
The following is an example, using Java, of a class that specifies a rectangle and the
methods that can be used to access and manipulate the data.
class Shapes {
}//end of class Shapes.
class Rectangle {
//Declare the variables related to a rectangle
int length;
int width;
int area;
This example contains two classes. The first is called Shapes and is the main part of
the program. It is from here that the program will run. The second class is called
Rectangle and it is a template for the description of a rectangle.
The class Shapes has a constructor called Shapes, which declares two objects of type
Rectangle. This is a declaration and does not assign any values to these objects. In
fact, Java simply says that, at this stage, they have null values. Later, the new
statement creates actual rectangles. Here small is given a width of 2 and a length of 5,
medium is given a width of 10 and a length of 25 and large is given a width of 50 and
a length of 100. When a new object is created from a class, the class constructor,
which has the same name as the class, is called.
The class Rectangle has a constructor that assigns values to width and length and then
calculates the area of the rectangle.
The class Rectangle also has a method called write( ). This method has to be used to
output the details of the rectangles.
In the class Shapes, its constructor then prints a heading and the details of the
rectangles. The latter is achieved by calling the write method. Remember, small,
medium and large are objects of the Rectangle class. This means that, for example,
small.write( ) will cause Java to look in the class called Rectangle for a write method
and will then use it.
The functional programming paradigm provides a very high level view of
programming. All programs consist of a series of functions that use parameters for
input and pass values to other functions. There are no variables like the ones in
procedural languages. However, like procedural languages, the programmer has to
tell the computer the precise steps to be taken to solve a problem. For example, in the
language "Haskell", the following returns the square of a number.
square :: Int -> Int
square n = n * n
The first line,
square :: Int -> Int
says that we have a function called square that takes an integer as input and outputs an
integer. The second line,
square n = n * n
says that the function requires the value of n as input and outputs n * n.
Another example is
Here, different takes three integers as input and outputs a Boolean value True or
False. The output is True if a, b and c are not all the same.
Try the function with the following inputs:
(i) a = 2, b = 2 and c = 3;
(ii) a = 2, b = 3 and c = 3;
(iii) a = 5, b = 5 and c = 5.
You should find that (i) and (ii) give an output of True and (iii) gives an output of
False.
Most functions use guards to determine the output. Consider the following example.
Here | x <= y is a guard. The function first checks to see if x <= y is True. If it is, the
function outputs x and ends. If x <= y is False, Haskell moves to the next line and
checks the guard. In this case there is no guard. So the function outputs y.
Here the function is expecting three integers as input and outputs an integer. It in fact
outputs the minimum of three integers. If x <= y and x <= z then x must be the
minimum and the function outputs x and stops. However, if this is not true, x is not
the minimum therefore y or z must be the minimum. The second guard checks this.
If this returns a True value, y is output. If the guard returns a False value, the
function continues and outputs z.
This uses two functions namely the minimum of three integers and the minimum of
two integers. But we already have solutions to these problems so why not use them
and we have
sum = 0
For count = 1 to n
sum = sum + count
Next count
sum :: Int -> Int
sum n
  | n == 1 = 1
  | otherwise = n + sum (n-1)
This is simply saying that the function sum expects an integer as input and outputs an
integer. The first guard says that if the input integer is 1, output 1. If this guard is
False, then output n plus the value of sum (n – 1). Then sum (n – 1) calls the same
function but with the input being 1 less than the last call. This is repeated until the
input is reduced to 1 when a value of 1 is output.
sum 3
  | 3 == 1      False
  | otherwise = 3 + sum 2  →  sum 2
                                | 2 == 1      False
                                | otherwise = 2 + sum 1  →  sum 1
                                                              | 1 == 1   True
                                                              = 1
                              = 2 + 1
                              = 3
  = 3 + 3
  = 6
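The same recursion can be tried in Python, which may be more readily available than Haskell. The function name sum_to is an assumption (chosen to avoid Python's built-in sum), and the sketch assumes the input is an integer of at least 1, as the Haskell definition does.

```python
def sum_to(n):
    """Sum of the integers 1 to n, mirroring the two guards."""
    if n == 1:                    # first guard: input is 1, so output 1
        return 1
    return n + sum_to(n - 1)      # otherwise: n plus the sum for n - 1


print(sum_to(3))
```

Each call waits for the call below it to return, so the additions are carried out on the way back up, exactly as in the trace above.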
Another programming paradigm is the declarative one. Declarative languages tell the
computer what is wanted but do not provide the details of how to do it. These
languages are particularly useful when solving problems in artificial intelligence such
as medical diagnosis, fault finding in equipment and oil exploration. The method is
also used in robot control. An example of a declarative language is Prolog. The idea
behind declarative languages is shown in Fig. 3.5.b.1.
User  <->  Search Engine  <->  Database
Fig. 3.5.b.1
Here the user inputs a query to the search engine, which then searches the database for
the answers and returns them to the user. For example, using Prolog, suppose the
database is
female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
Note that in Prolog values start with a lowercase letter and variables start with an
uppercase letter. A user may want to know the names of all the males. The query
male(X).
will return
X = charnjit
X = jaz
X = tom
Notice that the user does not have to tell Prolog how to search for the values of X that
satisfy the query. In a procedural language the database may be held in a two-
dimensional array Gender as shown below.
Array Gender
     1        2
1    female   Jane
2    female   Anne
3    female   Sandip
4    male     Charnjit
5    male     Jaz
6    male     Tom
The names of the males can then be output using a loop such as
For count = 1 To 6
    If Gender[count, 1] = "male" Then
        picResults.Print Gender[count, 2]
    End If
Next count
This is fairly straightforward. However, suppose we now add to the Prolog database
the following data.
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
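and suppose we wish to know the name of the mother of Atif. In Prolog we use the
query
parent(X, atif), female(X).
The result is
X = sandip
To list every father together with his children we use the query
parent(X, Y), male(X).
The result is
X = charnjit Y = mary
X = charnjit Y = rajinder
X = jaz Y = atif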
If we only want a list of fathers we use the underscore and create the query
parent(X, _ ), male(X).
and the result is
X = charnjit
X = charnjit
X = jaz
Further examples are given in Section 3.5.g. At this stage the important point is that
the programmer does not have to tell the computer how to answer the query. There
are no FOR … NEXT, WHILE … DO … or REPEAT … UNTIL … loops as such.
There is no IF … THEN … statement. The system simply consists of a search engine
and a database of facts and rules. Examples of facts are given above. Examples of
rules will be given in Section 3.5.g.
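To make the contrast concrete, here is a minimal Python sketch of the work a procedural programmer has to spell out by hand to answer the query parent(X, _ ), male(X). The data layout (a set of names and a list of pairs) and the function name fathers are assumptions made for illustration; they are not part of Prolog.

```python
# Facts from the Prolog database above, held as ordinary Python data.
MALE = {"charnjit", "jaz", "tom"}
PARENT = [
    ("jane", "mary"),
    ("jane", "rajinder"),
    ("charnjit", "mary"),
    ("charnjit", "rajinder"),
    ("sandip", "atif"),
    ("jaz", "atif"),
]


def fathers():
    """Procedural equivalent of the query parent(X, _ ), male(X)."""
    results = []
    for parent, _child in PARENT:   # explicit loop: the programmer says HOW
        if parent in MALE:          # explicit test
            results.append(parent)
    return results


print(fathers())  # ['charnjit', 'charnjit', 'jaz']
```

The loop and the test are exactly the parts that the Prolog programmer does not have to write.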
3.5 (c) Structured Design
A complex problem needs to be broken down into smaller and smaller sub-problems
until all the sub-problems can be solved easily. This process is called step-wise
refinement or top-down design.
Consider the problem of calculating the wages for an hourly paid worker. The worker
is paid 6.50 per hour for up to 40 hours and time-and-a-half for all hours over 40. Tax
and other contributions have to be deducted. This can be represented by Fig. 3.5.c.1.
Wages
Fig. 3.5.c.1
An alternative way of writing this is to use numbered statements. This can be easier if
there are many sub-problems to be solved.
1. Wages
1.1 Get number of hours
1.2 Calculate gross pay
1.2.1 Calculate normal wages
1.2.2 Calculate overtime
1.3 Calculate deductions
1.3.1 Calculate tax
1.3.2 Calculate other deductions
1.4 Calculate net pay
1.5 Output wages slip
Either of these designs can be turned into a series of functions and procedures. The
program could be called Wages and consist of the following functions and procedure.
Wages
GetHours( ) returns an integer in range 0 to 60
CalculateWages(Hours) returns gross wage
CalculateNormalWages(Hours) returns wage for up to 40 hours
CalculateOvertime(Hours) returns pay for any hours over 40
CalculateDeductions(GrossWage) returns total deductions
CalculateTax(GrossWage) returns tax due
CalculateOthers(GrossWage) returns other deductions due
CalculateNetWage(GrossWage, Deductions) returns net wage after deductions
Procedure OutputResults(Hours, GrossWage, Tax, Others, Deductions, NetWage)
Procedure to print the wage slip
Here we can see that if a single value is to be returned, the simplest way to do this is
to use a function. If there are no values to be returned, then a procedure should be
used. If more than one value is to be returned, a procedure should be used.
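The heading of a function states the type of value it returns and the parameters it
expects. In Visual Basic, for example, the heading
Function GetHours() As Integer
states that the function will return an integer value and that it does not expect any
values to be fed into it. Similarly,
Function CalculateNormalWages(Hours As Integer) As Double
expects to be given an integer value as input and returns a value of type Double.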
Note: If you are programming in C, C++ or Java, there are no procedures. These
languages only use functions. A function has to be typed. That is, the programmer
must specify the type of value to be returned. This is true of statically typed languages
in general. In C, C++ and Java, if a function is not going to return a value, its return
type is void. That is, no value is actually returned.
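The list of functions for the Wages program can be sketched in Python as follows. The pay rate, the 40-hour limit and time-and-a-half come from the problem statement; the tax rate and the rate for other deductions are assumptions chosen only so that the sketch runs, since the text does not give them.

```python
HOURLY_RATE = 6.50        # from the problem statement
NORMAL_HOURS = 40         # from the problem statement
OVERTIME_FACTOR = 1.5     # time-and-a-half
TAX_RATE = 0.20           # ASSUMED for illustration
OTHER_RATE = 0.05         # ASSUMED for illustration


def calculate_normal_wages(hours):
    """Wage for up to 40 hours."""
    return min(hours, NORMAL_HOURS) * HOURLY_RATE


def calculate_overtime(hours):
    """Pay for any hours over 40."""
    return max(hours - NORMAL_HOURS, 0) * HOURLY_RATE * OVERTIME_FACTOR


def calculate_wages(hours):
    """Gross wage: normal wages plus overtime."""
    return calculate_normal_wages(hours) + calculate_overtime(hours)


def calculate_tax(gross_wage):
    return gross_wage * TAX_RATE


def calculate_others(gross_wage):
    return gross_wage * OTHER_RATE


def calculate_deductions(gross_wage):
    """Total deductions: tax plus other deductions."""
    return calculate_tax(gross_wage) + calculate_others(gross_wage)


def calculate_net_wage(gross_wage, deductions):
    return gross_wage - deductions


def output_results(hours, gross_wage, deductions, net_wage):
    """A procedure: it prints the wage slip and returns no value."""
    print(f"Hours: {hours}  Gross: {gross_wage:.2f}  "
          f"Deductions: {deductions:.2f}  Net: {net_wage:.2f}")


gross = calculate_wages(45)
deductions = calculate_deductions(gross)
output_results(45, gross, deductions, calculate_net_wage(gross, deductions))
```

Each function solves one numbered sub-problem from the design, so each can be tested on its own before the program is assembled.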
Another type of diagram is used with the Jackson Structured Programming (JSP)
technique. Fig. 3.5.c.1 shows the sequence of steps.
Calculate gross pay consists of the sequence
Calculate tax
Calculate Others
The diagram does not illustrate selection (IF … THEN … ELSE …) nor does it show
repetition (FOR … DO … , WHILE … DO … , etc.). Fig. 3.5.c.2 shows selection in
JSP design. Note the use of a circle inside the boxes to indicate that the operations B
and C are conditional.
Condition Else
BO CO
Fig. 3.5.c.2
Repetition (also known as iteration) is shown in Fig. 3.5.c.3. The asterisk is used to
indicate that B is an iterative process; that is, B has to be repeated.
B *
Fig. 3.5.c.3
Initially we shall consider diagrams that represent data. Consider a pack of ordinary
playing cards which has been divided into red and black suits. Fig. 3.5.c.4 shows this.
The top level shows we are using a pack; the second layer shows that the pack is
divided into red and black components. The third level shows that the red component
consists of many (which may be zero) cards. Similarly, black consists of many cards.
Pack
Red O Black O
Red * Black *
card card
Fig. 3.5.c.4
Pack
Fig. 3.5.c.5
Now suppose we deal a hand of cards from a shuffled pack until a spade is dealt. The
data structure is shown in Fig. 3.5.c.6.
Hand (non-spade cards followed by a spade card)
Cards before a spade    Spade card
Card * (many cards, may be zero)
Fig. 3.5.c.6
Finally consider a sequential file that contains daily takings of a shop in date order.
At the end of each month's takings is a grand total for the month. At the end of the
file is the total for all the months recorded. These totals act as validation checks. At
the start of the file is a header record describing the contents of the file. Fig. 3.5.c.7
shows this data structure. Present and absent are processes that are performed
according to the presence or absence of the respective totals.
Sales
O O
*
Month's Present Absent
data
Month's Month's
body total
O O
*
Day's total Present Absent
Fig. 3.5.c.7
The JSP diagrams we have seen so far have shown physical data structures. Logical
data structures describe the data with respect to a given application.
Suppose we wish to extract the takings for all Mondays from the file shown in Fig.
3.5.c.7. The logical diagram is shown in Fig. 3.5.c.8. Notice that the diagram does
not violate the data structure shown in the previous diagram. Also, notice the use of
null ( ─ ) in the decision at the bottom of the diagram; this shows that the ELSE part
of the decision does nothing. Further there are no decisions below the totals in this
diagram because they are not being used.
Sales
*
Month's
data
Month's Month's
body total
*
Day's total
O O
Monday ─
Fig. 3.5.c.8
Now suppose the application is to extract February's data. In this case we need to
read past January totals so these must be shown in the logical data structure.
However, once we have dealt with February, we do not wish to read the remaining
months' data. Fig. 3.5.c.9 shows the logical structure for this problem.
Sales
Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec
* Month * Month
Day's total Day's total
data data
Fig. 3.5.c.9
The next step is to enter, on the diagram, the constraints. This is done by numbering
the constraints and then listing their meanings. Placing the constraints on Fig. 3.5.c.8
produced Fig. 3.5.c.10 where
Sales
C1
*
Month's
data
Month's Month's
body total
C2
*
Day's total
C3 ELSE
O O
Monday ─
Fig. 3.5.c.10
In this example, separate procedures can be written for each box in the logical
structure. That is, each section in the diagram can be a separate procedure or
function. Each procedure and function can use parameters to pass data to and from
the calling procedure or function. This is further explained in the next Section.
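As a sketch of how the boxes of the Monday-takings structure become procedures, the Python below uses one function per box: iteration (*) becomes a loop and selection (O) becomes an if statement. The way the file is represented here (a list of months, each holding a list of (weekday, takings) pairs and a total) is an assumption made for illustration.

```python
def process_day(day, mondays):
    """Selection (O boxes): keep Monday takings, ELSE do nothing."""
    weekday, takings = day
    if weekday == "Mon":
        mondays.append(takings)


def process_month_body(days, mondays):
    """Iteration (*): a month's body is zero or more day's totals."""
    for day in days:
        process_day(day, mondays)


def process_sales(months):
    """Iteration (*): the file is zero or more months' data."""
    mondays = []
    for month in months:
        process_month_body(month["days"], mondays)
        # The month's total is read past but not used by this application.
    return mondays


# Hypothetical sample data, not taken from the text.
sales = [
    {"days": [("Mon", 120.0), ("Tue", 95.5)], "total": 215.5},
    {"days": [("Sun", 80.0), ("Mon", 130.25)], "total": 210.25},
]
print(process_sales(sales))  # [120.0, 130.25]
```

Note how the list mondays is passed as a parameter from each routine to the one it calls.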
3.5 (d) Standard Programming Techniques.
The previous Section discussed top-down design and JSP diagrams. Both systems lead
to modules that can be programmed easily. Each module is a solution to an individual
problem and each module has to interface with other modules. As long as the
interfaces are clearly specified, each module can be given to a different programmer
to code. All that the programmers need to know is the problem and how its solution
must communicate with the solutions to other modules. This means that two
programmers may happen to use the same name for a variable. Also, the
programmers will need to pass values to other modules and be able to accept values
from other modules. This was briefly discussed in Section 1.3 in the AS text.
Let us first consider how data can be input to a function or procedure. This is done by
means of parameters. The function below, written in Visual Basic, finds the perimeter
of a rectangle given its length and breadth. This is not the only way of finding the
perimeter and it probably is not the best way. However, it has been written like this in
order to illustrate certain programming points.
Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y
End Function
In this function X and Y are integers the values of which must be passed to the
function before it can find the perimeter of the rectangle. These variables are called formal
parameters. To use this function, another program will have to call it and provide the
values for X and Y. This can be done by means of a statement of the form
Perimeter = PerimeterOfRectangle(4, 6)
or we can use
A=3
B=4
Perimeter = PerimeterOfRectangle(A, B)
In both of these statements the values inside the parentheses (4 and 6 in the first
example and A and B in the second) are called actual parameters. How the values
are passed to the function or procedure depends on the programming language. In the
first example the values 4 and 6 are stored in the variables X and Y. In the second
example, in Visual Basic, the addresses of A and B are passed to the function so that
X and Y have the same address as A and B respectively. In C++, in both cases the
actual values are passed to the function which stores them in its own variable space.
Thus we have two different ways of passing parameters. Fig. 3.5.d.1 shows how
Visual Basic normally passes parameters.
Calling Program Memory Location Function
A 3 X
B 4 Y
Fig. 3.5.d.1
Fig. 3.5.d.2 shows what normally happens when C++ passes parameters. Notice that
two extra memory locations are used and that C++ makes a copy of the values of A
and B and stores them in separate locations X and Y.
Fig. 3.5.d.2
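By default, Visual Basic is said to pass parameters by reference (or address) and C++
passes them by value. (This describes Visual Basic 6 and earlier, as used here; Visual
Basic .NET passes parameters by value by default.) It is interesting to see the effect of
passing values by address. Here is the function described above and a copy of the
calling routine in Visual Basic.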
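Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
'X and Y are passed by reference (the default)
X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y
End Function
Private Sub cmdShow_Click()
Dim A As Integer
Dim B As Integer
Dim Perimeter As Integer
A = 3
B = 4
picResults.Print "Before call to Sub A = "; A; " and B = "; B
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A = "; A; " and B = "; B
End Sub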
Fig. 3.5.d.3
Notice that after the function has been run the values of A and B have changed. This
is because the addresses of A and B were passed not their actual values.
Visual Basic can pass parameters by value and C++ can pass parameters by reference.
In Visual Basic we have to use the ByVal key word if we want values to be passed by
value. Here is a modified form of the Visual Basic function together with the output
from running the modified program.
Public Function PerimeterOfRectangle(ByVal X As Integer, ByVal Y As Integer) As Integer
X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.4
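The two behaviours can be imitated in Python. Python has no ByVal or ByRef keywords, so this is only an analogy: a list handed to the function stands in for parameters passed by reference, and a copy of the list stands in for parameters passed by value. The variable names are assumptions made for illustration.

```python
def perimeter_of_rectangle(sides):
    """Doubles both sides IN PLACE, then returns their sum."""
    sides[0] = 2 * sides[0]
    sides[1] = 2 * sides[1]
    return sides[0] + sides[1]


# Imitating pass by reference: the function works on the caller's own storage.
a_b = [3, 4]
perimeter = perimeter_of_rectangle(a_b)
print(perimeter, a_b)        # 14 [6, 8]   the caller's values have changed

# Imitating pass by value: the function is given a separate copy.
c_d = [3, 4]
perimeter_copy = perimeter_of_rectangle(list(c_d))
print(perimeter_copy, c_d)   # 14 [3, 4]   the caller's values are untouched
```

In both cases the perimeter is the same; the difference lies only in what has happened to the caller's variables afterwards.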
Variables can have different values in different parts of the program. Look at the
following Visual Basic code and its output, shown in Fig. 3.5.d.5.
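Private Sub cmdShow_Click()
Dim A As Integer
Dim B As Integer
Dim C As Integer
Dim Perimeter As Integer
A = 3
B = 4
C = 5
picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A = "; A; " and B = "; B; " and C = "; C
End Sub
Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
Dim C As Integer 'local to this function
C = 10
X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y
End Function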
Fig. 3.5.d.5
This shows that C has a different value in the function PerimeterOfRectangle from its
value in the calling routine cmdShow_Click. C is said to be a local variable and the C in
PerimeterOfRectangle is stored at a different address from the C in cmdShow_Click.
Local variables only exist in the block in which they are declared. This is very
helpful as it means that different programmers, writing different routines, do not have
to worry about the names of variables used by other programmers. However, it is
sometimes useful to be able to use the same variable in many parts of a program. To
do this, the variable has to be declared as global. In Visual Basic this is done by
means of the statement
Public C As Integer
which is placed in a module. If we do this with the previous example the code
becomes
Private Sub cmdShow_Click()
Dim A As Integer
Dim B As Integer
Dim Perimeter As Integer
A = 3
B = 4
C = 5
picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A = "; A; " and B = "; B; " and C = "; C
End Sub
Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
C = 10 'changes the global C
X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.6 shows that the value of C, when changed in the function
PerimeterOfRectangle, is changed in the calling routine also. In fact it is
changed throughout the program.
Fig. 3.5.d.6
In C++ variables can be declared at any point in the program. This means that a
variable can be local to a small block of code. In the following example i is only
available in the for loop. The output statement after the loop is illegal as i no longer
exists.
We now have the ability to allow variables to be used only in certain parts of a
program or in any part. Global variables should be used as sparingly as possible as
they can cause a program to be very difficult to debug. This is because it is not
always clear when global variables are being changed.
What happens if a variable is declared as both global and local? The following code
declares C as global and C as local in the function PerimeterOfRectangle and the
result of running it is shown in Fig. 3.5.d.7. Notice that the value of C in the function
cmdShow_Click is not changed although its value is changed in
PerimeterOfRectangle. This is because C is declared as a local variable in this
function and this means that the global C is not used.
Public C As Integer 'global declaration
Private Sub cmdShow_Click()
Dim A As Integer
Dim B As Integer
Dim Perimeter As Integer
A = 3
B = 4
C = 5
picResults.Print "Before call to Sub A = "; A; " and B = "; B; " and C = "; C
Perimeter = PerimeterOfRectangle(A, B)
picResults.Print "Perimeter = "; Perimeter
picResults.Print "After call to Sub A = "; A; " and B = "; B; " and C = "; C
End Sub
Public Function PerimeterOfRectangle(X As Integer, Y As Integer) As Integer
Dim C As Integer 'local declaration hides the global C
C = 10
X = 2 * X
Y = 2 * Y
PerimeterOfRectangle = X + Y
End Function
Fig. 3.5.d.7
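The same idea can be sketched in Python, where a name assigned inside a function is local unless it is declared with the global keyword. Python's scoping rules are not identical to Visual Basic's, so treat this as an illustration of the principle only; the function names are assumptions.

```python
c = 5   # global variable


def change_local():
    c = 10          # a new local c; the global c is not used
    return c


def change_global():
    global c        # refer to the global c instead of creating a local one
    c = 10


change_local()
after_local = c     # the global c still holds 5
change_global()
after_global = c    # the global c now holds 10
print(after_local, after_global)  # 5 10
```

The first call leaves the global value alone, which is why local variables are the safer default.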
3.5 (e) Stacks and Procedures
When a procedure or function is called, the computer needs to know where to return
to when the function or procedure is completed. That is, the return address must be
known. Further, functions and procedures may call other functions and procedures
which means that not only must several return addresses be stored but they must be
retrieved in the right order. This can be achieved by using a stack. Fig. 3.5.e.1 shows
what happens when three functions are called after one another. The numbers
represent the addresses of the instructions following the calls to functions.
Main program
.
.
.
Call Function A Function A
100 … .
. .
. .
. Call Function B Function B
. 150 … .
. . .
. . .
End Return Call Function C Function C
250 … .
. .
. .
Return Return
Fig. 3.5.e.1
Notice that the addresses will be stored in the order 100, 150 then 250. When the
returns take place the addresses will be needed in the order 250, 150 then 100. That
is, the last address stored is the first address needed on returning from a function.
This means that we need a data structure that provides a last in first out facility. A
stack does precisely this, so we store the return addresses in a stack. In the above
example, the addresses will be stored in the stack each time a function is called and
will be removed from the stack each time a return instruction is executed. This is
shown in Fig. 3.5.e.2.
Calls and Returns Stack
Call Function A
Push return address
onto stack
Call Function B
Push return address
onto stack
Call Function C
Push return address
onto stack
Stack pointer 250
150
100
Return from C
Pop return address off
stack
250
Stack pointer 150
100
Return from B
Pop return address off
stack
250
150
Stack pointer 100
Return from A
Pop return address off
stack
250
150
100
Stack pointer NULL
Fig. 3.5.e.2
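The order in which return addresses are stored and retrieved can be checked with a short Python sketch, using a list as the stack. The helper names call and ret are assumptions; a real processor does this in hardware with a stack pointer.

```python
stack = []   # a Python list used as a stack


def call(return_address):
    """Calling a function pushes the return address onto the stack."""
    stack.append(return_address)


def ret():
    """Returning pops the most recently pushed address: last in, first out."""
    return stack.pop()


call(100)   # main program calls Function A
call(150)   # Function A calls Function B
call(250)   # Function B calls Function C
order_of_returns = [ret(), ret(), ret()]
print(order_of_returns)  # [250, 150, 100]
```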
Now suppose that values need to be passed to, or from, a function or procedure.
Again a stack can be used. Suppose we have a main program and two procedures,
Proc A(A1, A2) and Proc B(B1, B2, B3). That is, A1 and A2 are the formal
parameters for Proc A and B1, B2 and B3 are the formal parameters for Proc B. Now
look at Fig. 3.5.e.3 which shows the procedures being called and the return addresses
that must be placed on the stack.
Main program
.
.
.
Call Proc A(X1,X2) Proc A(A1,A2)
200 … .
.
.
Call Proc B(Y1,Y2,Y3) Proc B(B1,B2,B3)
400 … .
. .
. .
Return Return
Fig. 3.5.e.3
Now let us suppose that all the parameters are being passed by value. Then, when the
procedures are called the actual parameters must be placed on the stack and the
procedures must pop the values off the stack and store the values in the formal
parameters. This is shown in Fig. 3.5.e.4; note how the stack pointer is moved each
time an address or actual parameter is pushed onto or popped off the stack.
Call Proc A(X1, X2)
PUSH 200
PUSH X1 Stack pointer X2
PUSH X2 X1
200
A2 = X2 (POP X2)
A1 = X1 (POP X1)
X2
X1
Stack pointer 200
B3 = Y3 (POP Y3)
B2 = Y2 (POP Y2) Y3
B1 = Y1 (POP Y1) Y2
Y1
Stack pointer 400
200
Next we must consider what happens if the values are passed by reference. This
works in exactly the same way except that the addresses of the variables are passed
instead of their values, so there is no need to return the values via parameters. The
procedures, or functions, will access the actual addresses where the variables are
stored. Finally, how do functions return values? Simply push them on the stack
immediately before returning. The calling program can then pop the value off the
stack. Note that the return address has to be popped off the stack before pushing the
return value onto the stack.
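The sequence of pushes and pops for one call that passes two parameters by value and returns a result can be sketched in Python. The routine function_a is modelled on Proc A; its body (adding the two parameters) is an assumption, as the text does not say what Proc A does.

```python
stack = []


def function_a():
    """Callee: pops its parameters in the reverse order of pushing."""
    a2 = stack.pop()                 # A2 = X2
    a1 = stack.pop()                 # A1 = X1
    result = a1 + a2                 # ASSUMED body, for illustration only
    return_address = stack.pop()     # pop the return address first...
    stack.append(result)             # ...then push the return value
    return return_address


# Caller: the instruction after the call is at address 200.
x1, x2 = 3, 4
stack.append(200)    # PUSH 200 (return address)
stack.append(x1)     # PUSH X1
stack.append(x2)     # PUSH X2
resume_at = function_a()
value = stack.pop()  # the caller pops the returned value
print(resume_at, value)  # 200 7
```

After the call the stack is empty again, which is what allows calls to be nested safely.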
3.5 (f) Object-Oriented Programming (OOP)
Data encapsulation (or data hiding) has been explained in Section 3.5.a. It is the
concept that data can only be accessed via the methods provided by the class. This is
shown in Fig. 3.5.f.1 where the objects, that is, instantiations of a class, are prevented
from directly accessing the data by the methods.
Object Object
Object Object
Fig. 3.5.f.1
Objects can only access the data by using the methods provided. An object cannot
manipulate the data directly. In the case of the rectangle class, an object of this class
cannot directly calculate its area. That is, we cannot write
To find the area of myRectangle, class Rectangle must provide a suitable method.
The example in Section 3.5.a does not do this. The class Rectangle calculates the area
when an object of the class is instantiated. The only way to find the area is to use the
write( ) method which outputs the area. If a user wishes to access the width and
length of a rectangle, the class must provide methods to do this. Methods to return the
width and length are given below.
integer getWidth( ) {
getWidth := width;
}//end of getWidth method.
integer getLength( ) {
getLength := length;
}//end of getLength method.
myRectangle can now use these methods to get at the width and length. However, it
cannot change their values. To find the perimeter we can write
myWidth := myRectangle.getWidth( );
myLength := myRectangle.getLength( );
myPerimeter := 2 * (myWidth + myLength);
Thus, an object consists of the data and the methods provided by the class. The
concept of data being only accessible by means of the methods provided is very
important as it ensures data integrity. Once a class has been written and fully tested,
neither its methods nor the data can be tampered with. Also, if the original design of
a method is found to be inefficient, the design can be changed, provided the method's
interface stays the same, without the user's program being affected.
Another powerful concept is that of inheritance. Inheritance allows the re-use of code
and the facility to extend the data and methods without affecting the original code. In
the following diagrams, we shall use a rounded rectangle to represent a class. The
name of the class will appear at the top of the rectangle, followed by the data followed
by the methods.
Consider the class Person that has data about a person's name and address and
methods called outputData( ), which outputs the name and address, and getName( ) and
getAddress( ), which return the name and address respectively. This is shown in Fig.
3.5.f.2.
name
Data
address
outputData( )
getName( ) Methods
getAddress( )
Fig. 3.5.f.2
Now suppose we want a class Employee that requires the same data and methods as
Person and also needs to store and output an employee's National Insurance number.
Clearly, we do not wish to rewrite the contents of the class Person. We can avoid this by
creating a class called Employee that inherits all the details of the class Person and
adds on the extra data and methods needed. This is shown in Fig. 3.5.f.3 where the
arrow signifies that Employee inherits the data and methods provided by the class
Person. Person is called the super-class of Employee and Employee is the derived
class from Person. An object of type Employee can use the methods provided by
Employee and those provided by Person.
Person
name
address
outputData( )
getName( )
getAddress( )
Employee
NINumber
outputData( )
getNINumber( )
Fig. 3.5.f.3
Notice that we now have two methods with the same name. How does the program
determine which one to use? If myPerson is an instantiation of the Person class, then
myPerson.outputData( );
will use the outputData( ) method from the Person class. The statement
myEmp.outputData( );
will use the method outputData( ) from the Employee class if myEmp is an
instantiation of the Employee class.
Now suppose we have two types of employee; one is hourly paid and the other is paid
a salary. Both of these require the data and methods of the classes Person and
Employee but they also need different data to one another. This is shown in Fig.
3.5.f.4.
Person
name
address
outputData( )
getName( )
getAddress( )
Employee
NINumber
outputData( )
getNINumber( )
HourlyPaidEmp SalariedEmp
hourlyRate salary
outputData( ) outputData( )
getHourlyRate( ) getSalary( )
Fig. 3.5.f.4
How can an object of type Employee output the name and address as well as the N.I.
number? The outputData( ) method in class Employee can refer to the outputData( )
method of its superclass. This is done by writing a method, in class Employee, of the
form
void outputData( ) {
super.outputData( );
System.out.println("The N.I. number is " + NINumber);
}//end of outputData method.
Here super.outputData( ) calls the outputData( ) method of the super-class; the method
then outputs the N.I. number. Similarly, the other derived classes can call the methods
of their super-classes.
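The Person and Employee classes can be sketched in Python as follows. To keep the example easy to check, output_data here returns the lines it would print rather than printing them itself; that choice, and the sample name, address and N.I. number, are assumptions made for illustration.

```python
class Person:
    def __init__(self, name, address):
        self.name = name
        self.address = address

    def get_name(self):
        return self.name

    def get_address(self):
        return self.address

    def output_data(self):
        return [f"Name: {self.name}", f"Address: {self.address}"]


class Employee(Person):                      # Employee inherits from Person
    def __init__(self, name, address, ni_number):
        super().__init__(name, address)      # reuse the super-class set-up
        self.ni_number = ni_number

    def get_ni_number(self):
        return self.ni_number

    def output_data(self):
        lines = super().output_data()        # name and address from Person
        lines.append(f"The N.I. number is {self.ni_number}")
        return lines


my_emp = Employee("Jane", "1 High Street", "AB123456C")  # hypothetical data
for line in my_emp.output_data():
    print(line)
```

An Employee object can still call get_name( ), even though that method is written only in Person.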
In the above, we have explained the meanings of terms such as data encapsulation,
class and inheritance. However, sometimes the examiner may ask you to simply state
the meanings of these terms. In this case a simple definition is all that is required.
Note also that there will only be one (or possibly two) marks for this type of question.
The following definitions would be satisfactory answers to questions that say 'State
the meaning of the term … '.
Definitions
Data encapsulation is the combining together of the variables and the methods that
can operate on the variables so that the methods are the only ways of using the
variables.
A class describes the variables and methods appropriate to some real-world entity.
Inheritance is the ability of a class to use the variables and methods of a class from
which the new class is derived.
3.5 (g) Declarative Languages
In Section 3.5.b, we saw that, in declarative languages, the programmer can simply
state what is wanted having declared a set of facts and rules. We now look at how
this works using examples of Prolog scripts. In order to do this, we shall use the
following facts.
female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
Remember that variables must start with an uppercase letter; constants start with a
lowercase letter.
Suppose we ask
male(X).
Prolog starts searching the database and finds male(charnjit) matches male(X) if X is
given the value charnjit. We say that X is instantiated to charnjit. Prolog now
outputs
X = charnjit
Prolog then goes back to the database and continues its search. It finds male(jaz) so
outputs
X = jaz
and again continues its search. It continues in this way until the whole database has
been searched. The complete output is
X = charnjit
X = jaz
X = tom
No
The query male(X) is known as a goal to be tested. That is, the goal is to find all X
that satisfy male(X). If Prolog finds a match, we say that the search has succeeded
and the goal is true. When the goal is true, Prolog outputs the corresponding values of
the variables.
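A Prolog database can also contain rules. Consider the rule
father(X, Y) :- parent(X, Y), male(X).
This rule states that X is father of Y if (the :- symbol) X is a parent of Y AND (the
comma) X is male. Adding this rule to the facts gives the database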
female(jane).
female(anne).
female(sandip).
male(charnjit).
male(jaz).
male(tom).
parent(jane,mary).
parent(jane, rajinder).
parent(charnjit, mary).
parent(charnjit, rajinder).
parent(sandip, atif).
parent(jaz, atif).
father(X, Y) :- parent(X, Y), male(X).
Suppose our goal is to find the father of rajinder. That is, our goal is to find all X that
satisfy
father(X, rajinder).
In the database and the rule the components female, male, parent and father are called
predicates and the values inside the parentheses are called arguments. Prolog now
looks for the predicate father and finds the rule
father(X, Y) :- parent(X, Y), male(X).
In this rule Y is instantiated to rajinder and Prolog starts to search the database for
parent(X, rajinder)
This is the new goal. Prolog finds the match
parent(jane, rajinder)
if X is instantiated to jane. Prolog now uses the second part of the rule
male(X)
with X = jane. That is, Prolog's new goal is male(jane) which fails. Prolog does not
give up at this stage but backtracks to the match
parent(jane, rajinder)
and starts again, from this point in the database, to try to match the goal
parent(X, rajinder)
Prolog finds the match
parent(charnjit, rajinder)
with X instantiated to charnjit. The next step is to try to satisfy the goal
male(charnjit)
This is successful so Prolog outputs
X = charnjit
Prolog continues to see if there are any more matches. There are no more matches so
Prolog outputs
No
A powerful tool in Prolog is recursion. This can be used to create alternative versions
for a rule. Fig. 3.5.g.1 shows how ancestor is related to parent.
a is ancestor of b a is parent of b
b
a is ancestor of d c is parent of d
b is ancestor of d
d
Fig. 3.5.g.1
This shows that X is an ancestor of Y if X is a parent of Y. But it also shows that X is
an ancestor of Y if X is a parent of Z and Z is a parent of Y. It also shows that X is an
ancestor of Y if X is a parent of Z , and Z is a parent of W and W is a parent of Y.
This can continue forever. Thus the rule is recursive. In Prolog we require two rules
that are written as
ancestor(X, Y) :- parent(X, Y).
ancestor(X, Y) :- parent(X, Z), ancestor(Z, Y).
The second rule is in two parts. Let us see how it works using Fig. 3.5.g.1 which
represents the database
parent(a, b).
parent(b, c).
parent(c, d).
Suppose the goal is
ancestor(a, c).
Prolog finds the first rule and tries to match parent(a, c) with each predicate in the
database. Prolog fails but does not give up. It backtracks and looks for another rule
for ancestor. Prolog finds the second rule and tries to match
parent(a, Z).
It finds
parent(a, b)
so instantiates Z to b.
This is now put into the second part of the rule to produce
ancestor(b, c).
This means that Prolog has to look for a rule for ancestor. It finds the first rule
parent(b, c)
and succeeds.
This means that with X = a, Y = c we have Z = b and the second rule succeeds.
Therefore Prolog returns Yes.
Now try tracing the goals
ancestor(a, d)
and
ancestor(c, b).
You should find that the first goal succeeds and the second fails.
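The recursive definition of ancestor can be traced with a short Python sketch. It follows the two cases described above (X is a parent of Y, or X is a parent of some Z and Z is an ancestor of Y), but it is an ordinary recursive function, not a Prolog interpreter, and it assumes the parent facts contain no cycles.

```python
PARENT = [("a", "b"), ("b", "c"), ("c", "d")]


def ancestor(x, y):
    # First case: X is an ancestor of Y if X is a parent of Y.
    if (x, y) in PARENT:
        return True
    # Second case: X is a parent of some Z and Z is an ancestor of Y.
    for p, z in PARENT:
        if p == x and ancestor(z, y):
            return True
    # No fact or rule could be satisfied, so the goal fails.
    return False


print(ancestor("a", "c"))  # True
print(ancestor("a", "d"))  # True
print(ancestor("c", "b"))  # False
```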
The words in italics are part of the names of relationships. We also have atomic
conclusions such as
In the last atomic conclusion, x is a variable. The meaning of the conclusion is that
Frank likes anything (or anybody) that likes computing.
Joint conditions use the logical operators OR and AND, examples of which are
The atomic formulae that serve as conditions and conclusions may be written in a
simplified form. In this form the name of the relation is written in front of the atomic
formula. The names of the relations are called predicate symbols. Examples are
loves(Mary, Harry)
likes(Philip, Zak)
The AND is represented by a comma in the condition part of the atomic conclusion.
For example
These examples show the connection between Prolog and predicate calculus. You do
not need to understand how to manipulate the examples you have seen any further
than has been shown in this Section.
As with the previous Section we include here the definitions of terms used in this
Section. Remember, they can be used when a question says 'State the meaning of the
term … '.
Definitions
A goal is a statement that we are trying to prove to be either True or False.
3.5 (h) Use of Special Registers/Memory Addressing Techniques
Fig. 3.5.h.1 shows the minimum number of registers needed to execute instructions.
Remember that these are used to execute machine code instructions not high-level
language instructions.
Fig. 3.5.h.1
The program counter (PC) is used to keep track of the location of the next instruction
to be executed. This register is also known as the Sequence Control Register (SCR).
The memory address register (MAR) holds the address of the instruction or data that
is to be fetched from memory.
The current instruction register (CIR) holds the instruction that is to be executed,
ready for decoding.
The memory data register (MDR) holds data to be transferred to memory and data
that is being transferred from memory, including instructions on their way to the CIR.
Remember that the computer cannot distinguish between data and instructions. Both
are held as binary numbers. How these binary numbers are interpreted depends on the
registers in which they end up. The MDR is the only route between the other registers
and the main memory of the computer.
The accumulator is where results are temporarily held and is used in conjunction with
a working register to do calculations.
The index register is a special register used to adjust the address part of an instruction.
This will be explained in more detail later.
Note that the diagram does not show the control bus and the signals needed for
instructions to be correctly executed. These are not required for this examination.
We shall now see how these registers are used to execute instructions. In order to do
this we shall assume that a memory location can hold both the instruction code and
the address part of the instruction. For example, a 32-bit memory location may use 12
bits for the instruction code and 20 bits for the address part. This will allow us to use
up to 2^12 (= 4096) instruction codes and 2^20 (= 1 048 576) memory addresses.
Code Meaning
LDA load the accumulator
STA store the accumulator
ADD add the contents of memory to the accumulator
STOP Stop
We shall also use decimal numbers rather than binary for the address part of an
instruction.
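The split of a 32-bit location into a 12-bit instruction code and a 20-bit address part can be checked with a short Python sketch. The numeric code 1 used for LDA is an assumption; the text gives only the mnemonics, not their binary values.

```python
OPCODE_BITS = 12
ADDRESS_BITS = 20


def encode(opcode, address):
    """Packs the instruction code into the top 12 bits of a 32-bit word."""
    return (opcode << ADDRESS_BITS) | address


def decode(word):
    """Splits a 32-bit word back into instruction code and address part."""
    opcode = word >> ADDRESS_BITS
    address = word & ((1 << ADDRESS_BITS) - 1)
    return opcode, address


print(2 ** OPCODE_BITS, 2 ** ADDRESS_BITS)  # 4096 1048576
word = encode(1, 400)    # opcode number 1 is an ASSUMED code for LDA
print(decode(word))      # (1, 400)
```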
Suppose four instructions are stored in locations 300, 301, 302 and 303 as shown in
the following table and that the PC contains the number 300.
Address Contents Notes
. .
. .
. .
300 LDA 400 Load accumulator with contents of location 400
301 ADD 401 Add contents of location 401 to accumulator
302 STA 402 Store contents of accumulator in location 402
303 STOP Stop
304
. .
. .
. .
400 5 Location 400 contains the number 5
401 7 Location 401 contains the number 7
402 ? Not known what is in location 402
. .
. .
. .
The fetch part of the instruction is
The instruction is now decoded (not shown in the table) and is interpreted as 'load the
contents of the location whose address is given into the accumulator'.
We now start the execution phase. As the contents of an address are needed, the
address part of the instruction is copied into the MAR, in this case 400.
Now use the MAR to find the value required and copy it into the MDR.
Now use the same steps to fetch and execute the next instruction. Note that the PC
already contains the address of the next instruction.
Note that all data moves between memory and the MDR via the data bus. All
addresses use the address bus.
A summary of the steps needed to fetch and execute the LDA instruction is shown in
Fig. 3.5.h.2.
Set PC to address of first instruction
Fetch phase
Copy PC to MAR
Increment PC
Copy contents of location pointed to by MAR to MDR
Decode instruction
Execute phase for LDA instruction
Copy contents of address in MAR to MDR
Fig. 3.5.h.2
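The fetch-execute cycle for the four-instruction program stored in locations 300 to 303 can be simulated with a short Python sketch. It is a simplification under stated assumptions: instructions are held as (code, address) pairs rather than binary numbers, memory is a dictionary, and the working register and buses are not modelled.

```python
memory = {
    300: ("LDA", 400),
    301: ("ADD", 401),
    302: ("STA", 402),
    303: ("STOP", None),
    400: 5,
    401: 7,
    402: None,          # contents not known before the program runs
}


def run(memory, start):
    pc = start          # program counter
    acc = 0             # accumulator
    while True:
        # Fetch phase
        mar = pc                # copy PC to MAR
        pc = pc + 1             # increment PC
        mdr = memory[mar]       # copy addressed location to MDR
        cir = mdr               # copy MDR to CIR
        # Decode
        opcode, address = cir
        # Execute phase
        if opcode == "STOP":
            break
        mar = address           # address part of instruction to MAR
        if opcode == "LDA":
            mdr = memory[mar]
            acc = mdr
        elif opcode == "ADD":
            mdr = memory[mar]
            acc = acc + mdr
        elif opcode == "STA":
            mdr = acc
            memory[mar] = mdr
    return acc, pc


acc, pc = run(memory, 300)
print(memory[402], acc, pc)  # 12 12 304
```

Because the PC is incremented during every fetch, it already points past the STOP instruction when the program halts.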
In Fig. 3.5.h.2, what happens during the execute cycle depends on the instruction. For
example, the STA n (store the contents of the accumulator in the location with address
n) has the execute steps shown in Fig. 3.5.h.3.
Fig. 3.5.h.3
This process works fine but only allows for the sequential execution of instructions.
This is because the PC is only changed by successively adding 1 to it. How can we
arrange to change the order in which instructions are fetched? Consider these
instructions.
Suppose the PC contains the number 300. After the instruction ADD 500 has been
fetched and executed, the PC will hold the number 301. Now the instruction JLZ 300
will be fetched in the usual way and the PC will be incremented to 302. The next step
is to execute this instruction. The steps are shown in Fig. 3.5.h.4.
Is accumulator < 0? (Yes / No branches)
Fetch next instruction (see Fig. 3.5.i.2)
Fig. 3.5.h.4
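The execute step for JLZ can be sketched as follows. We assume here, as is usual for a jump instruction, that when the test succeeds the address part of the instruction is copied into the PC; the function name execute_jlz is our own.

```python
# Sketch of the execute step for JLZ (jump if the accumulator is less than zero).
# Assumption: a successful test loads the address part of the instruction into the PC.

def execute_jlz(pc, acc, address):
    """Return the value the PC should hold when the next instruction is fetched."""
    if acc < 0:
        return address    # jump: the next fetch uses the address given in the instruction
    return pc             # no jump: the PC already points at the next instruction

# JLZ 300 was fetched from location 301, so the PC has already been incremented to 302.
print(execute_jlz(pc=302, acc=-4, address=300))   # 300
print(execute_jlz(pc=302, acc=9, address=300))    # 302
```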
So far we have used two copy instructions (LDA and STA), one arithmetic instruction
(ADD) and one jump instruction (JLZ). In the case of the copy and arithmetic
instructions, the address part has specified where to find or put the data. This is
known as direct addressing.
Fig. 3.5.h.5
The advantage of this mode of addressing is that the actual address used in our
example can be the full 32 bits, giving 2^32 addresses.
ADDX 700
Before the ADDX instruction is executed the contents of the IR must be added to the
MAR to give
Thus the contents of address 705 are added to the accumulator. The programmer then
increments the IR to make it 6 so that the next time the ADDX 700 instruction is
executed the address used will be 706.
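The arithmetic of this example can be checked with a short sketch. The function name effective_address is our own; the values 700, 5 and 6 are those used in the text.

```python
# Sketch of indexed addressing: the address actually used is the address part
# of the instruction plus the contents of the index register (IR).

def effective_address(address_part, index_register):
    return address_part + index_register

index_register = 5
print(effective_address(700, index_register))   # 705
index_register += 1                             # the programmer increments the IR
print(effective_address(700, index_register))   # 706
```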
3.5 (i) Third and Fourth Generation Languages
Third generation languages are those that use a structured syntax such as C, C++ and
Pascal. Early versions of Fortran and BASIC were not structured and are usually
treated as second generation languages. However, Visual Basic is structured and can
be treated as a third generation language.
Third generation languages (procedural languages) need the user to specify clearly all
the steps that need to be taken to solve a problem. Fourth generation languages
(declarative languages) do not do this. Languages that accompany modern database,
word processing and spreadsheet packages do not need the user to do this. The users
of these packages tell the application what they want to do, not how to do it. An
example is mail merge. Here all the user has to do is
tell the software what table or database to use and the mail merge will take place.
Databases often use query by example (QBE). Here the user simply states what is
required and the software will do the task. For example, Microsoft Access lets a user
specify conditions such as DOB < 01/01/90 and the necessary coding will be done. In
fact Access uses the Structured Query Language (SQL) to create the queries.
Consider the following table called Students.
> 150
for the criteria. We could also specify, by means of a check box, that only the name
should be printed. The result would be
Dalvinder
Frank
Georgina
SELECT name
FROM Students
WHERE height > 150;
Notice that we do not have to give the steps needed to check each entry in the table
Students. A more complicated query is
SELECT name
FROM Students
WHERE height > 145
AND
weight > 32;
Again, we do not tell the computer exactly how to find the answer required as we
would with a third generation language.
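The contrast can be seen by answering the first query in both styles. The sketch below uses Python with its built-in sqlite3 module. The rows of Students are invented for illustration, since the table itself is not reproduced here, and ORDER BY has been added to the query only so that the output order is fixed.

```python
import sqlite3

# Hypothetical rows (name, height, weight); the original table is not shown in the text.
students = [
    ("Alan", 148, 34),
    ("Dalvinder", 156, 35),
    ("Frank", 160, 40),
    ("Georgina", 152, 31),
]

# Third generation style: every step of the search is spelt out by the programmer.
tall_procedural = []
for name, height, weight in students:
    if height > 150:
        tall_procedural.append(name)

# Fourth generation style: state what is wanted and let the database do the search.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Students (name TEXT, height INTEGER, weight INTEGER)")
conn.executemany("INSERT INTO Students VALUES (?, ?, ?)", students)
tall_declarative = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM Students WHERE height > 150 ORDER BY name"
    )
]
conn.close()

print(tall_procedural)
print(tall_declarative)
```

Both lists contain the same names, but only the first version had to say how each entry is checked.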
The development of fourth generation languages has meant that people who are not
programmers can produce useful results. (Advantage of fourth generation languages)
3.5 (j) Backus Naur Form and Syntax Diagrams
For count = 1 To 10
A Visual Basic compiler would not understand the C++ syntax and vice versa. We
therefore need, for each language, a set of rules that specify precisely every part of the
language. These rules are specified using Backus Naur Form (BNF) or syntax
diagrams.
All languages use integers, so we shall start with the definition of an integer. An
integer is a sequence of the digits 0, 1, 2, … , 9. Now the number of digits in an
integer is arbitrary. That is, it can be any number. A particular compiler will restrict
the number of digits only because of the storage space set aside for an integer. But a
computer language does not restrict the number of digits. Thus the following are all
valid integers.
0
2
415
3040513002976
0000000123
where the vertical line is read as OR. Notice that all the digits have to be specified and
that they are not inside angle brackets (< and >) like <integer> and <digit>. This is
because integer and digit have definitions elsewhere; the digits 0, 1, 2, … , 9 do not.
But how are we going to specify integers of any length? Consider the integer
147
This is a single digit integer ( 1 ) followed by the integer 47. But 47 is a single digit
integer ( 4 ) followed by a single digit integer ( 7 ). Thus, all integers of more than
one digit start with a single digit and are followed by an integer. Eventually the final
integer is a single digit integer. Thus, an indefinitely long integer is defined as
=<digit><digit><integer>
=<digit><digit><digit><integer>
To stop this we use the fact that, eventually, <integer> is a single digit and write
Fig. 3.5.j.1 (syntax diagrams for integer and digit; diagram not reproduced)
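The recursive idea, a digit followed by a shorter integer until only a single digit is left, can be turned directly into a recogniser. The sketch below assumes the rule has the form <unsigned integer> ::= <digit> | <digit><unsigned integer>, and the function names are our own.

```python
# Recogniser for an unsigned integer, assuming the recursive rule
#   <unsigned integer> ::= <digit> | <digit><unsigned integer>

DIGITS = "0123456789"

def is_digit(s):
    return len(s) == 1 and s in DIGITS

def is_unsigned_integer(s):
    if is_digit(s):
        return True       # the recursion stops at a single digit
    # otherwise: a digit followed by a (shorter) unsigned integer
    return len(s) > 1 and is_digit(s[0]) and is_unsigned_integer(s[1:])

for text in ["0", "2", "415", "3040513002976", "0000000123", "12a", ""]:
    print(repr(text), is_unsigned_integer(text))
```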
+27
-3415
and we can use the earlier definition of an <unsigned integer>. It is usual to say that
an integer is an unsigned integer or a signed integer. If we do this we get the
following definition, in BNF, of an integer.
There are other valid ways of writing these definitions. However, it is better to use
several definitions than try to put all the possibilities into a single definition. In other
words, try to start at the top with a general definition and then try to break the
definitions down into simpler and simpler ones. That is, we have used top-down
design when creating these definitions. We have broken the definitions down until we
have terms whose values can be easily determined.
Fig. 3.5.j.2 shows the corresponding syntax diagrams.
Fig. 3.5.j.2 (syntax diagrams for integer, with its sign, and for digit; diagram not reproduced)
Care must be taken when positioning the recursion in the definitions using BNF.
Suppose we define a variable as a sequence of one or more characters starting with a
letter. The characters can be any letter, digit or the underscore. Valid examples are
A
x
sum
total24
mass_of_product
MyAge
Let us see what happens if we use a similar definition to that for an unsigned integer.
<character> ::= <letter>|<digit>|<under-score>
<character><variable>
with <character> = 2 and <variable> = Sum. Continuing in this way we use 2, S and
u for <character> and then m for <letter>. This means that our definition simply
means that we must end with a letter not start with one. We must rewrite our
definition in such a way as to ensure that the first character is a letter. Moving the
recursive call to the front of <character> can do this. This means that the last time it
is called it will be a letter and this will be at the head of the variable. The correct
definition is
A syntax diagram can also represent this. This is left as an exercise. You should also
note that, in the definition of integer, we used tail recursion, but here we have used
head recursion.
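A recogniser makes the effect of head recursion easy to test. The sketch below assumes the corrected rule has the form <variable> ::= <letter> | <variable><character>, which is our reading of the description above; the function names are our own.

```python
import string

# Recogniser for a variable, assuming the head-recursive rule
#   <variable>  ::= <letter> | <variable><character>
#   <character> ::= <letter> | <digit> | <under-score>

LETTERS = string.ascii_letters
DIGITS = string.digits

def is_letter(s):
    return len(s) == 1 and s in LETTERS

def is_character(s):
    return len(s) == 1 and (s in LETTERS or s in DIGITS or s == "_")

def is_variable(s):
    if is_letter(s):
        return True       # the recursion ends with a letter at the head of the name
    # otherwise: a (shorter) variable followed by one character
    return len(s) > 1 and is_variable(s[:-1]) and is_character(s[-1])

for text in ["A", "x", "sum", "total24", "mass_of_product", "MyAge", "2Sum", "_x"]:
    print(text, is_variable(text))
```

Because the recursive call comes first, the last piece to be matched is the leading character, and that piece must be a letter. This is why 2Sum is rejected.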
Let us now use our definition of an integer to define a real number such as
0.347
-2.862
+14.34
00235.006
Finally, suppose we do not want to allow leading zeros in our integers. That is
zero digit
non-zero digit
non-zero digit followed by any digit.
This means that an integer is
zero or digits
<digits> must be a single non-zero digit or a non-zero digit followed by any digits.
This gives us
where
<zero> ::= 0
<non-zero digit> ::= 1|2|3|4|5|6|7|8|9
<digit> ::= <zero>|<non-zero digit>
Fig. 3.5.j.4 (syntax diagrams: integer is 0 or digits; digits is a non-zero digit, 1 to 9, followed by any number of digits, 0 to 9; diagram not reproduced)
3.5 Example Questions
Having worked through the 66 pages of section 3.5, many students will be worried about the
detail offered and the need to answer examination questions on what is very difficult and
complex work. However, take heart! The whole examination only lasts 2 hours, and in that
time the examiner has to examine not only this section but also the other 9 sections in module
3. This works out at only 12 minutes for each section, so long, complex, algorithm-type
questions are not feasible. You are going to be asked questions that will be taken from
the syllabus but which will be fairly short and knowledge based. The exception may be with
the types of language, which may produce a question along the lines of the object oriented
question in the sample material.
4. Explain the difference between direct and indirect addressing and explain why
indirect addressing allows access to more memory locations than direct addressing.
(6)
5. An amount of money can be defined as
• A $ sign followed by either
• A positive integer or
• A positive integer, a point, and a two digit number or
• A point and a two digit number
A positive integer has been defined as <INTEGER>
A digit is defined as <DIGIT>::= 0/1/2/3/4/5/6/7/8/9.
a) Define, using Backus Naur form, the variable <AMOUNT OF MONEY>
b) Using the previously defined values of INTEGER and DIGIT, draw a syntax diagram
to define AMOUNT OF MONEY.
Chapter 3.6 Databases
Originally all data were held in files. A typical file would consist of a large number
of records each of which would consist of a number of fields. Each field would have
its own data type and hold a single item of data. Typically a stock file would contain
records describing stock. Each record may consist of the following fields.
The problem is when we check the stock the next day, we will create a new order
because the stock that has been ordered has not been delivered. To overcome this we
could introduce a new field called On Order of type Boolean. This can be set to True
when an order has been placed and reset to False when an order has been delivered.
Unfortunately it is not that easy.
The original software is expecting the original seven fields not eight fields. This
means that the software designed to manipulate the original file must be modified to
read the new file layout.
Further, ad hoc enquiries are virtually impossible. What happens if management asks
for a list of best-selling products? The file has not been set up for this, and changing it
so that such a request can be satisfied in the future involves modifying all existing
software. Further, suppose we want to know which products are supplied by Food &
Drink Ltd. In some cases the company's name has been entered as Food & Drink
Ltd., sometimes as Food and Drink Ltd. and sometimes the full stop after Ltd has
been omitted. This means that a match is very difficult because the data is
inconsistent. Another problem is that each time a new product is added to the
database both the name and address of the supplier must be entered. This leads to
redundant data or data duplication.
The following example, shown in Fig. 3.6.a.1, shows how data can be proliferated
when each department keeps its own files.
Department             Programs                                       File contents
Purchasing Department  Programs to place orders when stocks are low   Stock Code, Description, Re-order level, Cost Price, Sale Price, Supplier name and address, etc.
Accounts Department    Programs to record accounts of customers       Customer name and address, amount owing, dates of orders, etc.
Fig. 3.6.a.1
This method of keeping data uses flat files. Flat files have the following limitations.
• Duplication of data
• Data dependence
• Incompatibility of files
To try to overcome the search problems of sequential files, two types of database
were introduced. These were hierarchical and network databases. Examples of these
are shown in Fig. 3.6.a.2 and Fig. 3.6.a.3 respectively.
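(Fig. 3.6.a.2 and Fig. 3.6.a.3: diagrams not reproduced. Surviving labels: Employee, with Part-Time and Full-Time beneath it; Smart Cards, with Access Control and Travelling beneath it.)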
The hierarchical model can still lead to inconsistent and redundant data. A network
database is similar to a hierarchical one, except that it has more complex pointers.
A hierarchical database allows movement up and down the tree-like structure. A
network database allows movement up, down and across the tree-like structure. The
diagram in Fig. 3.6.a.3 shows how complex the pointers can become. This makes it
very difficult to maintain a network database.
3.6 (b) Relational Databases and Normalisation
Note: When reading this Section you may wish to turn off the facility that underlines
spelling mistakes. This can be done by choosing Options… from the Tools menu and
clicking on the Spelling and Grammar tab. Then check the boxes 'Hide spelling
errors on this document' and 'Hide grammatical errors in this document'. This is so
that you can see the keys.
1 Table
2 Desk
3 Chair
In this example, the delivery note has more than one part on it. This is called a
repeating group. In the relational database model, each field must contain only one
item of data. Also, each record must be of a fixed length, so a variable number of
fields is not allowed. In this example, we cannot say 'let there be three fields for the
products', as some customers may order more products than this and others fewer.
So, repeating groups are not allowed.
At this stage we should start to use the correct vocabulary for relational databases.
Instead of fields we call the columns attributes and the rows are called tuples. The
files are called relations (or tables).
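The delivery note can then be written as
DELNOTE(Num, CustName, City, Country, (ProdID, Description))
where DELNOTE is the name of the relation (or table) and Num, CustName, City,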
Country, ProdID and Description are the attributes. ProdID and Description are put
inside parentheses because they form a repeating group. In tabular form the data may
be represented by Fig. 3.6 (b)2.
Num  CustName    City    Country  ProdID  Description
005  Bill Jones  London  England  1       Table
                                  2       Desk
                                  3       Chair
This again shows the repeating group. We say that this is in un-normalised form
(UNF). To put it into 1st normal form (1NF) we complete the table and identify a key
that will make each tuple unique. This is shown in Fig. 3.6 (b)3.
To make each row unique we need to choose Num together with ProdID as the key.
Remember, another delivery note may have the same products on it, so we need to use
the combination of Num and ProdID to form the key. We can write this as
To indicate the key, we simply underline the attributes that make up the key.
Because we have identified a key that uniquely identifies each tuple, we have
removed the repeating group.
Definition of 1NF
Let us now see how to move from 1NF to 2NF and on to 3NF.
Definition of 2NF
In our example, using the data supplied, CustName, City and Country depend only on
Num and not on ProdID. Description only depends on ProdID, it does not depend on
Num. We say that
and write
If we do this, we lose the connection that tells us which parts have been delivered to
which customer. To maintain this connection we add the dependency
Note the keys (underlined) for each relation. DEL_PROD needs a compound key
because a delivery note may contain several parts and similar parts may be on several
delivery notes. We now have the relations in 2NF.
Can you see any more data repetitions? The following table of data may help.
Definition of 3NF
A relation that is in 1NF and 2NF, and in which no non-primary key attribute is
transitively dependent on the primary key is in 3NF. That is, all non-key elements are
fully dependent on the primary key.
In our example we are saying
City → Country
Let us now use the data above and see what happens to it as the relations are
normalised.
1NF
DELNOTE
Num CustName City Country ProdID Description
005 Bill Jones London England 1 Table
005 Bill Jones London England 2 Desk
005 Bill Jones London England 3 Chair
008 Mary Hill Paris France 2 Desk
008 Mary Hill Paris France 7 Cupboard
014 Anne Smith New York USA 5 Cabinet
002 Tom Allen London England 7 Cupboard
002 Tom Allen London England 1 Table
002 Tom Allen London England 2 Desk
Convert to
2NF
DELNOTE
Num CustName City Country
005 Bill Jones London England
008 Mary Hill Paris France
014 Anne Smith New York USA
002 Tom Allen London England
PRODUCT
ProdID Description
1 Table
2 Desk
3 Chair
7 Cupboard
5 Cabinet
DEL_PROD
Num ProdID
005 1
005 2
005 3
008 2
008 7
014 5
002 7
002 1
002 2
Convert to
3NF
DELNOTE
Num CustName City
005 Bill Jones London
008 Mary Hill Paris
014 Anne Smith New York
002 Tom Allen London
DEL_PROD
Num ProdID
005 1
005 2
005 3
008 2
008 7
014 5
002 7
002 1
002 2
PRODUCT
ProdID Description
1 Table
2 Desk
3 Chair
7 Cupboard
5 Cabinet
CITY_COUNTRY
City Country
London England
Paris France
New York USA
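The move from the 1NF table to the 3NF relations can be checked mechanically. In the sketch below each 3NF relation is obtained by keeping only the wanted attributes of the 1NF rows and discarding repeated tuples. The helper name project is our own; the data are the 1NF rows given earlier.

```python
# Sketch: derive the 3NF relations from the 1NF DELNOTE rows by projection.

COLUMNS = ("Num", "CustName", "City", "Country", "ProdID", "Description")
rows_1nf = [
    ("005", "Bill Jones", "London", "England", 1, "Table"),
    ("005", "Bill Jones", "London", "England", 2, "Desk"),
    ("005", "Bill Jones", "London", "England", 3, "Chair"),
    ("008", "Mary Hill", "Paris", "France", 2, "Desk"),
    ("008", "Mary Hill", "Paris", "France", 7, "Cupboard"),
    ("014", "Anne Smith", "New York", "USA", 5, "Cabinet"),
    ("002", "Tom Allen", "London", "England", 7, "Cupboard"),
    ("002", "Tom Allen", "London", "England", 1, "Table"),
    ("002", "Tom Allen", "London", "England", 2, "Desk"),
]

def project(rows, columns, wanted):
    """Keep only the wanted attributes and drop duplicate tuples (first-seen order)."""
    positions = [columns.index(name) for name in wanted]
    result = []
    for row in rows:
        tuple_ = tuple(row[i] for i in positions)
        if tuple_ not in result:
            result.append(tuple_)
    return result

delnote = project(rows_1nf, COLUMNS, ("Num", "CustName", "City"))
del_prod = project(rows_1nf, COLUMNS, ("Num", "ProdID"))
product = project(rows_1nf, COLUMNS, ("ProdID", "Description"))
city_country = project(rows_1nf, COLUMNS, ("City", "Country"))

for name, relation in [("DELNOTE", delnote), ("DEL_PROD", del_prod),
                       ("PRODUCT", product), ("CITY_COUNTRY", city_country)]:
    print(name, relation)
```

The nine 1NF rows shrink to four delivery notes, five products and three city and country pairs, while DEL_PROD keeps all nine links between notes and products.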
In this Section we have seen the data presented as tables. These tables give us a view
of the data. The tables do NOT tell us how the data is stored in the computer, whether
it be in memory or on backing store. Tables are used simply because this is how users
view the data. We can create new tables from the ones that hold the data in 3NF.
Remember, these tables simply define relations.
Users often require different views of data. For example, a user may wish to find out
the countries to which they have sent desks. This is a simple view consisting of one
column. We can create this table by using the following relations (tables).
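As a sketch of how such a view might be built, the code below follows the links from PRODUCT through DEL_PROD and DELNOTE to CITY_COUNTRY, using the 3NF data from the example. The function name countries_sent is our own, and a real DBMS would do this with a query rather than hand-written code.

```python
# Sketch: build the one-column view "countries to which a product has been sent"
# from the 3NF relations of the delivery note example.

delnote = [("005", "Bill Jones", "London"), ("008", "Mary Hill", "Paris"),
           ("014", "Anne Smith", "New York"), ("002", "Tom Allen", "London")]
del_prod = [("005", 1), ("005", 2), ("005", 3), ("008", 2), ("008", 7),
            ("014", 5), ("002", 7), ("002", 1), ("002", 2)]
product = [(1, "Table"), (2, "Desk"), (3, "Chair"), (7, "Cupboard"), (5, "Cabinet")]
city_country = [("London", "England"), ("Paris", "France"), ("New York", "USA")]

def countries_sent(description):
    prod_ids = {prod_id for prod_id, desc in product if desc == description}
    nums = {num for num, prod_id in del_prod if prod_id in prod_ids}
    cities = {city for num, _name, city in delnote if num in nums}
    return sorted(country for city, country in city_country if city in cities)

print(countries_sent("Desk"))   # ['England', 'France']
```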
Films are shown at many cinemas, each of which has a manager. A manager may
manage more than one cinema. The takings for each film are recorded for each
cinema at which the film was shown.
Converting this to 1NF can be achieved by 'filling in the blanks' to give the relation
FID Title CID Cname Loc MID MName Takings
15 Jaws TF Odeon Croyden 01 Smith £350
15 Jaws GH Embassy Osney 01 Smith £180
15 Jaws JK Palace Lye 02 Jones £220
23 Tomb Raider TF Odeon Croyden 01 Smith £430
23 Tomb Raider GH Embassy Osney 01 Smith £200
23 Tomb Raider JK Palace Lye 02 Jones £250
23 Tomb Raider FB Classic Sutton 03 Allen £300
23 Tomb Raider NM Roxy Longden 03 Allen £290
45 Cats & Dogs TF Odeon Croyden 01 Smith £390
45 Cats & Dogs LM Odeon Sutton 03 Allen £310
56 Colditz TF Odeon Croyden 01 Smith £310
56 Colditz NM Roxy Longden 03 Allen £250
Therefore 2NF is
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID, MName)
TAKINGS(FID, CID, Takings)
In Cinema, the non-key attribute MName is dependent on MID. This means that it is
transitively dependent on the primary key. So we must move this out to get the 3NF
relations
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID)
TAKINGS(FID, CID, Takings)
MANAGER(MID, MName)
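The dependencies used in this decomposition can be tested against the data. The sketch below checks whether equal values of one group of attributes always go with equal values of another. The function name holds is our own. Note that sample data can only show that a dependency fails; agreement with the data does not prove that the dependency is true in general.

```python
# Sketch: test functional dependencies against the 1NF cinema data.

COLUMNS = ("FID", "Title", "CID", "Cname", "Loc", "MID", "MName", "Takings")
ROWS = [
    (15, "Jaws", "TF", "Odeon", "Croyden", "01", "Smith", 350),
    (15, "Jaws", "GH", "Embassy", "Osney", "01", "Smith", 180),
    (15, "Jaws", "JK", "Palace", "Lye", "02", "Jones", 220),
    (23, "Tomb Raider", "TF", "Odeon", "Croyden", "01", "Smith", 430),
    (23, "Tomb Raider", "GH", "Embassy", "Osney", "01", "Smith", 200),
    (23, "Tomb Raider", "JK", "Palace", "Lye", "02", "Jones", 250),
    (23, "Tomb Raider", "FB", "Classic", "Sutton", "03", "Allen", 300),
    (23, "Tomb Raider", "NM", "Roxy", "Longden", "03", "Allen", 290),
    (45, "Cats & Dogs", "TF", "Odeon", "Croyden", "01", "Smith", 390),
    (45, "Cats & Dogs", "LM", "Odeon", "Sutton", "03", "Allen", 310),
    (56, "Colditz", "TF", "Odeon", "Croyden", "01", "Smith", 310),
    (56, "Colditz", "NM", "Roxy", "Longden", "03", "Allen", 250),
]

def holds(determinant, dependent):
    """True if equal determinant values always go with equal dependent values in ROWS."""
    det = [COLUMNS.index(name) for name in determinant]
    dep = [COLUMNS.index(name) for name in dependent]
    seen = {}
    for row in ROWS:
        key = tuple(row[i] for i in det)
        value = tuple(row[i] for i in dep)
        if seen.setdefault(key, value) != value:
            return False
    return True

print(holds(("FID",), ("Title",)))                # True:  Title depends on FID alone
print(holds(("CID",), ("Cname", "Loc", "MID")))   # True:  cinema details depend on CID alone
print(holds(("MID",), ("MName",)))                # True:  the transitive dependency
print(holds(("FID",), ("Takings",)))              # False: Takings needs the whole key
print(holds(("FID", "CID"), ("Takings",)))        # True
```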
3.6 (c) Entity-Relationship (E-R) Diagrams
The statements show two types of relationship. There are in fact four altogether.
These are
one-to-one represented by
one-to-many represented by
many-to-one represented by
many-to-many represented by
Fig. 3.6 (c)1 is the E-R diagram showing the relationships between DELNOTE,
CITY_COUNTRY, PRODUCT and DEL_PROD.
Fig. 3.6 (c)1 (E-R diagram connecting DELNOTE, CITY_COUNTRY, DEL_PROD and PRODUCT; diagram not reproduced)
If the relations are in 3NF, the E-R diagram will not contain any many-to-many
relationships. If there are any one-to-one relationships, one of the entities can be
removed and its attributes added to the entity that is left.
Let us now look at our solution to the cinema problem which contained the relations
FILM(FID, Title)
CINEMA(CID, Cname, Loc, MID)
TAKINGS(FID, CID, Takings)
MANAGER(MID, MName)
in 3NF.
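FILM takes TAKINGS; TAKINGS is for FILM (connected by FID)
CINEMA takes TAKINGS; TAKINGS is for CINEMA (connected by CID)
MANAGER manages CINEMA; CINEMA is managed by MANAGER (connected by MID)
(E-R diagram connecting MANAGER, CINEMA, TAKINGS and FILM; diagram not reproduced)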
That is
CINEMA FILM
If you now look at Fig. 3.6.c.2, you will see that the link entity is TAKINGS.
Form Design
Section 2.1 (c) discussed the design of screens and forms. All that was said in that
section applies to designing forms for data entry, data amendments and for queries.
The main thing to remember when designing screen layouts is not to fill the screen
too full. You should also make sure that the sequence of entering data is logical and
that, if there is more than one screen, it is easy to move between them.
Let us consider a form that will allow us to create a new entry in DELNOTE which
has the attributes Num, CustName, City. Num is the key and, therefore, it should be
created by the database system. Fig. 3.6 (c)3 shows a suitable form.
Fig. 3.6 (c)3 (form not reproduced). Annotations on the form: Num is entered by the system. Country appears automatically when City is completed, if the City is in the database. Drop down lists can be used to complete entries.
With this form, if a new City is input the user can input the Country and the
City_Country table will be updated. If the City exists in the database, then Country
will appear automatically.
Now let us design a form to allow a user to input a customer's order. In this case we
shall need to identify the customer before entering the order details. This is best done
by entering the customer's ID. However, this is not always known. An alternative, in
this case, is to enter the customer's name. The data entry form should allow us to
enter either of these pieces of data and the rest of the details should appear
automatically as a verification that the correct customer has been chosen. Fig. 3.6(c)4
shows a form that is in two parts. The upper part is used to identify the customer and
the lower part allows us to enter each item of data that is on the customer's order.
Fig. 3.6 (c)4 (form not reproduced). Annotations on the form:
Customer details: Enter either the customer's number OR the customer's name. The other three boxes will then be completed automatically by the system.
Order details: some boxes are entered by the user and others by the system.
Notice how certain boxes are automatically completed. Also, because the form
requires a customer ID (Number), orders can only be taken for customers whose
details are on the database. This ensures the entry of customer details before an order
can be entered. In order to be consistent, the positions of the boxes for customer
details are the same on the Order Entry form as on the Add New Delivery Note form.
It is usual for both these forms to be password protected. This ensures that only
authorised personnel can enter data into the database.
This is a very simple example. Suppose the customer's ID is not known. We have
seen one way of overcoming this which satisfies the needs of the problem given.
Some systems allow the post code to be entered in order to identify the address. In
this case, the street, town and county details are displayed and the user is asked for the
house number. Other systems allow the user to enter a dummy ID such as 0000 and
then a list of customers appears from which the user can choose a name.
Alternatively, part of the name can be entered and then a short list of possible names
is displayed. Again the user can choose from this list.
Deletion and modification screens are similar, but must be password protected as
before so that only authorised personnel can change the database.
A query screen should not allow the user to change the database. Also, users should
only be allowed to see what they are entitled to see. To see how this may work, let us
consider a query requesting the details of all the cinemas in our second example. The
view presented to the users will give details of cinema names, locations, manager
names and film names as shown in Fig. 3.6 (c)5.
However, another user may be given the view shown in Fig. 3.6 (c)6.
Another user may be given all the details, including the cinema and manager IDs.
Notice that the columns do not have to have the same names as the attributes in the
database. This means that these names can be made more user friendly.
In order to create the query a user will normally be presented with a data entry form.
This form may contain default values, as shown in Fig. 3.6 (c)7, which allows a user
to list cinemas that have takings for films between set limits. This form allows users
to choose all the films, all the cinemas and all locations or to be more selective by
choosing from drop down lists. When the user clicks the OK button a table, such as
those given above, will appear.
In this Figure, the boxes are initially completed with default values. In this case, if
the OK button is clicked, all cinemas and films would be listed.
However, suppose we want to know which films at the Odeon, Croyden took less than
£400. The user could modify the boxes, using drop down lists, as shown in
Fig. 3.6 (c)8.
When the OK button is clicked, a report like that shown in Fig. 3.6(c)9 would appear
together with a button allowing the user to print the results or return to the query form.
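Behind such a form the DBMS has to run a query on the 3NF relations. The sketch below shows one way the request 'films at the Odeon, Croyden that took less than £400' might be expressed, using Python's built-in sqlite3 module and the cinema data from the earlier example. The function name films_below and the exact SQL are our own; they are not taken from any particular package.

```python
import sqlite3

# Sketch: the query behind the form, run against the 3NF cinema relations.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE FILM (FID INTEGER PRIMARY KEY, Title TEXT);
CREATE TABLE CINEMA (CID TEXT PRIMARY KEY, Cname TEXT, Loc TEXT, MID TEXT);
CREATE TABLE TAKINGS (FID INTEGER, CID TEXT, Takings INTEGER, PRIMARY KEY (FID, CID));
""")
conn.executemany("INSERT INTO FILM VALUES (?, ?)",
                 [(15, "Jaws"), (23, "Tomb Raider"), (45, "Cats & Dogs"), (56, "Colditz")])
conn.executemany("INSERT INTO CINEMA VALUES (?, ?, ?, ?)",
                 [("TF", "Odeon", "Croyden", "01"), ("GH", "Embassy", "Osney", "01"),
                  ("JK", "Palace", "Lye", "02"), ("FB", "Classic", "Sutton", "03"),
                  ("NM", "Roxy", "Longden", "03"), ("LM", "Odeon", "Sutton", "03")])
conn.executemany("INSERT INTO TAKINGS VALUES (?, ?, ?)",
                 [(15, "TF", 350), (15, "GH", 180), (15, "JK", 220), (23, "TF", 430),
                  (23, "GH", 200), (23, "JK", 250), (23, "FB", 300), (23, "NM", 290),
                  (45, "TF", 390), (45, "LM", 310), (56, "TF", 310), (56, "NM", 250)])

def films_below(cname, loc, limit):
    """Titles and takings at one cinema where the takings are below the limit."""
    sql = """
        SELECT FILM.Title, TAKINGS.Takings
        FROM TAKINGS
        JOIN FILM ON FILM.FID = TAKINGS.FID
        JOIN CINEMA ON CINEMA.CID = TAKINGS.CID
        WHERE CINEMA.Cname = ? AND CINEMA.Loc = ? AND TAKINGS.Takings < ?
        ORDER BY FILM.Title
    """
    return conn.execute(sql, (cname, loc, limit)).fetchall()

result = films_below("Odeon", "Croyden", 400)
print(result)   # [('Cats & Dogs', 390), ('Colditz', 310), ('Jaws', 350)]
```

The values chosen on the form are passed to the query as parameters, so the user never needs to see or write the SQL.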
Fig. 3.6 (c)9
3.6 (d) Advantages of Using a Relational Database (RDB)
Advantage: Notes
Control of data redundancy: Flat files have a great deal of data redundancy that is removed using a RDB.
Consistency of data: There is only one copy of the data so there is less chance of data inconsistencies occurring.
Data sharing: The data belongs to the whole organisation, not to individual departments.
More information: Data sharing by departments means that departments can use other departments' data to find information.
Improved data integrity: Data is consistent and valid.
Improved security: The database administrator (DBA) can define data security, that is, who has access to what. This is enforced by the Database Management System (DBMS).
Enforcement of standards: The DBA can set and enforce standards. These may be departmental, organisational, national or international.
Economy of scale: Centralisation means that it is possible to economise on size. One very large computer can be used with dumb terminals or a network of computers can be used.
Improved data accessibility: This is because data is shared.
Increased productivity: The DBMS provides file handling processes instead of each application having to have its own procedures.
Improved maintenance: Changes to the database do not cause applications to be re-written.
Improved back-up and recovery: DBMSs automatically handle back-up and recovery. There is no need for somebody to remember to back-up the database each day, week or month.
3.6 (e) The Purpose of Keys
We have used keys in all our earlier examples to uniquely identify tuples (rows) in a
relation (table). A key may consist of a single attribute or many attributes, in which
case it is called a compound key.
The key used to uniquely identify a tuple is called the primary key.
In some cases more than one attribute, or group of attributes, could act as the primary
key. Suppose we have the relation
Clearly, EmpID could act as the primary key. However, NINumber could also act as
the primary key as it is unique for each employee. In this case we say that EmpID
and NINumber are candidate keys. If we choose EmpID as the primary key, then
NINumber is called a secondary key.
We see that MID occurs in CINEMA and is the primary key in MANAGER. In
CINEMA we say that MID is the foreign key.
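The three kinds of key can be seen at work in a small sketch using Python's built-in sqlite3 module and the cinema relations. The rows with CID 'ZZ' and MID '99' are invented to provoke an error, and the helper name try_insert is our own. In SQLite, foreign keys are only checked after PRAGMA foreign_keys = ON has been issued.

```python
import sqlite3

# Sketch: primary, compound and foreign keys in the cinema relations.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")   # SQLite checks foreign keys only when asked
conn.executescript("""
CREATE TABLE MANAGER (MID TEXT PRIMARY KEY, MName TEXT);
CREATE TABLE CINEMA (
    CID TEXT PRIMARY KEY,                  -- primary key (single attribute)
    Cname TEXT,
    Loc TEXT,
    MID TEXT REFERENCES MANAGER(MID)       -- foreign key
);
CREATE TABLE TAKINGS (
    FID INTEGER,
    CID TEXT,
    Takings INTEGER,
    PRIMARY KEY (FID, CID)                 -- compound key
);
""")
conn.execute("INSERT INTO MANAGER VALUES ('01', 'Smith')")
conn.execute("INSERT INTO CINEMA VALUES ('TF', 'Odeon', 'Croyden', '01')")
conn.execute("INSERT INTO TAKINGS VALUES (15, 'TF', 350)")

def try_insert(sql):
    """Return 'accepted' or 'rejected' depending on whether the keys allow the row."""
    try:
        conn.execute(sql)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_cinema = try_insert("INSERT INTO CINEMA VALUES ('TF', 'Roxy', 'Longden', '01')")
unknown_manager = try_insert("INSERT INTO CINEMA VALUES ('ZZ', 'Ritz', 'Lye', '99')")
duplicate_takings = try_insert("INSERT INTO TAKINGS VALUES (15, 'TF', 400)")
new_takings = try_insert("INSERT INTO TAKINGS VALUES (23, 'TF', 430)")
print(duplicate_cinema, unknown_manager, duplicate_takings, new_takings)
```

The first three rows are rejected: two repeat an existing key and one refers to a manager who is not in MANAGER. The last row is accepted because the pair (23, 'TF') has not been used before.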
3.6 (f) Access Rights
Sometimes it must not be possible for a user to access the data in a database. For
example, in a banking system, accounts must be updated with the day's transactions.
While this is taking place users must not be able to access the database. Thus, at
certain times of the day, users will not be able to use a cash point. Another occasion
is if two people have a joint account and one of them is withdrawing cash from a cash
point. In this case the one user will be able to change the contents of the database
while the other will only be allowed to query the database.
Similarly, while a database system is checking stock for re-ordering purposes, the
POS terminals will not be able to use the database as each sale would change the
stock levels. Incidentally, there are ways in which the POS terminals could still
operate. One is to only use the database for querying prices and to create a
transaction file of sales which can be used later to update the database.
It is often important that users have restricted views of the database. Consider a large
hospital that has a large network of computers. There are terminals in reception, on
the wards and in consulting rooms. All the terminals have access to the patient
database which contains details of the patients' names and addresses, drugs to be
administered and details of patients' illnesses.
It is important that when a patient registers at reception the receptionist can check the
patient's name and address. However, the receptionist should not have access to the
drugs to be administered nor to the patient's medical history. This can be done by
means of passwords. That is, the receptionists' passwords will only allow access to
the information to which receptionists are entitled. When a receptionist logs onto the
network the DBMS will check the password and will ensure that the receptionist can
only access the appropriate data.
Now the terminals on the wards will be used by nurses who will need to see what
drugs are to be administered. Therefore nurses should have access to the same data as
the receptionists and to the information about the drugs to be given. However, they
may not have access to the patients' medical histories. This can be achieved by giving
nurses a different password to the receptionists. In this case the DBMS will recognise
the different password and give a higher level of access to the nurses than to the
receptionists.
Finally, the consultants will want to access all the data. This can be done by giving
them another password.
All three categories of user of the database, receptionist, nurse and consultant, must
only be allowed to see the data that is needed by them to do their job.
So far we have only mentioned the use of passwords to give levels of security.
However, suppose two consultants are discussing a case as they walk through
reception. Now suppose they want to see a patient's record. Both consultants have
the right to see all the data that is in the database but the terminal is in a public place
and patients and receptionists can see the screen. This means that, even if the
consultants enter the correct password, the system should not allow them to access all
the data.
This can be achieved by the DBMS noting the address of the terminal and, because
the terminal is not in the right place, refusing to supply the data requested. This is a
hardware method of preventing access. All terminals have a unique address on their
network cards. This means that the DBMS can decide which data can be supplied to a
terminal.
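The two checks described in this Section, one on the user's level of access and one on the terminal's address, can be sketched as follows. Everything named here is hypothetical: the patient record, the terminal address 'reception-1' and the decision to show only names and addresses on a public terminal are assumptions made for the illustration.

```python
# Sketch: restricting the view of a patient record by user category and by terminal.
# The record, the terminal names and the public-terminal rule are all hypothetical.

PATIENT = {
    "name": "A. Patient",
    "address": "1 Example Road",
    "drugs": "example drug",
    "history": "example medical history",
}

FIELDS_BY_CATEGORY = {
    "receptionist": {"name", "address"},
    "nurse": {"name", "address", "drugs"},
    "consultant": {"name", "address", "drugs", "history"},
}

PUBLIC_TERMINALS = {"reception-1"}       # terminals whose screens can be seen by the public
PUBLIC_FIELDS = {"name", "address"}      # assumed limit for a public terminal

def view(record, category, terminal):
    """Return only the fields this category of user may see at this terminal."""
    allowed = set(FIELDS_BY_CATEGORY[category])
    if terminal in PUBLIC_TERMINALS:
        allowed &= PUBLIC_FIELDS         # the terminal's address overrides the password level
    return {field: value for field, value in record.items() if field in allowed}

print(sorted(view(PATIENT, "nurse", "ward-3")))
print(sorted(view(PATIENT, "consultant", "consulting-2")))
print(sorted(view(PATIENT, "consultant", "reception-1")))
```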
3.6 (g) Database Management System (DBMS)
Let us first look at the architecture of a DBMS as shown in Fig. 3.6 (g)1.
EXTERNAL LEVEL (individual users): User 1, User 2, User 3
CONCEPTUAL LEVEL (integration of all user views): Company Level
INTERNAL LEVEL (storage view): disk/file organisation
At the external level there are many different views of the data. Each view consists of
an abstract representation of part of the total database. Application programs will use
a data manipulation language (DML) to create these views.
At the conceptual level there is one view of the data. This view is an abstract
representation of the whole database.
The internal view of the data occurs at the internal level. This view represents the
total database as actually stored. It is at this level that the data is organised into
random access, indexed and fully indexed files. This is hidden from the user by the
DBMS.
The DBMS contains a data definition language (DDL). The DDL is used, by the
database designer, to define the tables of the database. It allows the designer to
specify the data types and structures and any constraints on the data. The Structured
Query Language (SQL) contains facilities to do this. A DBMS such as Microsoft
Access allows the user to avoid direct use of a DDL by presenting the user with a
design view in which the tables are defined.
The DDL cannot be used to manipulate the data. When a set of instructions in a DDL
is compiled, tables are created that hold data about the data in the database. That is,
they hold information about the data types of attributes, the attributes in a relation and
any validation checks that may be required. Data about data is called meta-data.
These tables are stored in the data dictionary that can be accessed by the DBMS to
validate data when input. The DBMS normally accesses the data dictionary when
trying to retrieve data so that it knows what to retrieve. The data dictionary contains
tables that are in the same format as a relational database. This means that the data
can be queried and manipulated in the same way as any other data in a database.
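As a minimal sketch of these ideas, the following uses Python's built-in sqlite3 module. This is an assumption made for illustration; the text itself names only SQL and Microsoft Access. The table ENGINEER and its columns are hypothetical. The DDL statement defines the table, SQLite records the definition in its own catalogue table (sqlite_master), and that catalogue is then queried with ordinary SQL, much like the data dictionary described above.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind.
conn = sqlite3.connect(":memory:")

# DDL: define a table, its data types and a validation constraint.
conn.execute(
    """
    CREATE TABLE ENGINEER (
        EngineerID INTEGER PRIMARY KEY,
        Name       TEXT NOT NULL,
        Telephone  TEXT CHECK (length(Telephone) <= 15)
    )
    """
)

# Meta-data: SQLite keeps table definitions in the sqlite_master table,
# which is queried with ordinary SQL.
table_names = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column: (cid, name, type, ...).
columns = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(ENGINEER)")}

# The stored constraints are used to validate data when it is input.
try:
    conn.execute("INSERT INTO ENGINEER (EngineerID, Name) VALUES (1, NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(table_names, columns, rejected)
```

The NOT NULL rule is never checked by the program itself; the database applies it from its stored definition of the table.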
The other language used is the data manipulation language (DML). This language
allows the user to insert, update, delete and retrieve data. SQL includes this
language. Again, Access allows a user to avoid directly using the DML by providing
query by example (QBE) as was mentioned in Section 3.5 (j).
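The four DML operations can be sketched with SQL statements issued through Python's sqlite3 module. The choice of SQLite and the PRODUCT rows used here are assumptions made purely for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE PRODUCT (ProductID INTEGER PRIMARY KEY, Description TEXT)")

# Insert three rows.
conn.executemany(
    "INSERT INTO PRODUCT (ProductID, Description) VALUES (?, ?)",
    [(1, "Washing machine"), (2, "Dishwasher"), (3, "Fridge")],
)

# Update one row.
conn.execute(
    "UPDATE PRODUCT SET Description = ? WHERE ProductID = ?",
    ("Fridge freezer", 3),
)

# Delete one row.
conn.execute("DELETE FROM PRODUCT WHERE ProductID = ?", (2,))

# Retrieve what is left.
remaining = conn.execute(
    "SELECT ProductID, Description FROM PRODUCT ORDER BY ProductID"
).fetchall()
print(remaining)
```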
Appendix: Designing Databases
Although not stated as part of the syllabus, students may find the following to be of
value, particularly when normalising a database.
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the Engineer's ID. Each form has a unique reference number.
When a repair is complete, the customer is given a copy of the repair form.
The task is to create a database for this problem. In order to do this, you must first
analyse the problem to see what entities are involved. The easiest way to do this is to
read the scenario again and to highlight the nouns involved. These are usually the
entities involved. This is done here.
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the engineer's ID. Each form has a unique reference number.
When an engineer has repaired a product, the customer is given a copy of the repair
form.
engineer
product
customer
repair form
Now we look for the relationships between the entities. These can usually be
established by highlighting the verbs as done here.
A company employs engineers who service many products. A customer may own
many products but a customer's products are all serviced by the same engineer. When
engineers service products they complete a repair form, one form for each product
repaired. The form contains details of the customer and the product that has been
repaired as well as the Engineer's ID. Each form has a unique reference number.
When a repair is complete, the customer is given a copy of the repair form.
Relationship                 Type           Notes
engineer services product    many-to-many   An engineer services many products and a product can be serviced by many engineers. For example, many engineers service washing machines.
customer owns product        many-to-many   A customer may own many products and a product can be owned by many customers. For example, many customers own washing machines.
engineer completes form      one-to-many    An engineer completes many forms but a form is completed by only one engineer.
customer is given form       one-to-many    A customer may receive many forms but a form is given to only one customer.
product is on form           one-to-many    Only one product can be on a form but a product may be on many different forms.
[E-R diagram: ENGINEER services PRODUCT (serviced by) and completes FORM (completed by); CUSTOMER owns PRODUCT (owned by) and is given FORM (given to).]
[E-R diagram: the same entities and relationships, with the link entity ENG_PROD placed between ENGINEER and PRODUCT and the link entity CUST_PROD placed between CUSTOMER and PRODUCT.]
This suggests the following relations (not all the attributes are given).
ENGINEER(EngineerID, Name, … )
ENG_PROD(EngineerID, ProductID)
PRODUCT(ProductID, Description, … )
CUST_PROD(CustomerID, ProductID)
CUSTOMER(CustomerID, Name, … )
FORM(FormID, CustomerID, EngineerID, … )
These are in 3NF, but you should always check that they are.
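One way to try these relations out is to create them as tables. The sketch below assumes SQLite through Python's sqlite3 module and uses invented sample rows; the attributes hidden behind the "…" in each relation are simply left out. The link tables ENG_PROD and CUST_PROD each take a composite key made from the two keys they join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request
conn.executescript(
    """
    CREATE TABLE ENGINEER (EngineerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE PRODUCT  (ProductID  INTEGER PRIMARY KEY, Description TEXT);
    CREATE TABLE CUSTOMER (CustomerID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE ENG_PROD (
        EngineerID INTEGER REFERENCES ENGINEER(EngineerID),
        ProductID  INTEGER REFERENCES PRODUCT(ProductID),
        PRIMARY KEY (EngineerID, ProductID)
    );
    CREATE TABLE CUST_PROD (
        CustomerID INTEGER REFERENCES CUSTOMER(CustomerID),
        ProductID  INTEGER REFERENCES PRODUCT(ProductID),
        PRIMARY KEY (CustomerID, ProductID)
    );
    CREATE TABLE FORM (
        FormID     INTEGER PRIMARY KEY,
        CustomerID INTEGER REFERENCES CUSTOMER(CustomerID),
        EngineerID INTEGER REFERENCES ENGINEER(EngineerID)
    );
    INSERT INTO ENGINEER VALUES (1, 'Ali'), (2, 'Beth');
    INSERT INTO PRODUCT  VALUES (10, 'Washing machine'), (11, 'Dishwasher');
    INSERT INTO ENG_PROD VALUES (1, 10), (2, 10), (2, 11);
    """
)

# The many-to-many relationship is followed through the link table.
washing_machine_engineers = [
    row[0]
    for row in conn.execute(
        """
        SELECT ENGINEER.Name
        FROM ENGINEER
        JOIN ENG_PROD ON ENG_PROD.EngineerID = ENGINEER.EngineerID
        WHERE ENG_PROD.ProductID = 10
        ORDER BY ENGINEER.Name
        """
    )
]

# A link row that refers to a non-existent engineer is refused.
try:
    conn.execute("INSERT INTO ENG_PROD VALUES (99, 10)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

print(washing_machine_engineers, orphan_rejected)
```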
Another useful diagram shows the life history of an entity. This simply shows what
happens to an entity during its life. An entity life history diagram is similar to a JSP
diagram (see Section 3.5 (c)). The next diagram shows the life history of an engineer
in our previous problem.
[Entity life history diagram: Engineer, with an iterated "Detail changes" box marked *.]
This tells us
1. A new record for an engineer is created when an engineer joins the Company.
2. While the engineer is working for the Company, he/she may change their
name, address or telephone number as many times as is necessary (hence the
use of the *).
3. When the engineer leaves the Company, the main life-cycle ends and the
engineer's record is updated to indicate they no longer work for the Company.
4. 12 months after the engineer has left the Company his/her record is archived
and removed from the database.
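The four steps above can be sketched as a small state machine. Everything in this example (the class name EngineerRecord, its methods and the sample dates) is hypothetical and only illustrates how a life history restricts what may happen to a record and when.

```python
from datetime import date


class EngineerRecord:
    """Hypothetical record that follows the four life-history steps above."""

    def __init__(self, name: str, joined: date) -> None:
        # Step 1: the record is created when the engineer joins.
        self.name = name
        self.joined = joined
        self.left = None
        self.state = "employed"

    def change_details(self, name: str) -> None:
        # Step 2: may happen any number of times (the * iteration).
        if self.state != "employed":
            raise ValueError("details can only change while employed")
        self.name = name

    def leave(self, when: date) -> None:
        # Step 3: the record is marked as a leaver, not deleted.
        if self.state != "employed":
            raise ValueError("engineer is not currently employed")
        self.left = when
        self.state = "left"

    def archive(self, today: date) -> None:
        # Step 4: only allowed once 12 full months have passed since leaving.
        if self.state != "left":
            raise ValueError("only a leaver's record can be archived")
        months = (today.year - self.left.year) * 12 + (today.month - self.left.month)
        if today.day < self.left.day:
            months -= 1
        if months < 12:
            raise ValueError("fewer than 12 months since the engineer left")
        self.state = "archived"


record = EngineerRecord("A. Smith", date(2020, 1, 15))
record.change_details("A. Jones")
record.change_details("A. Brown")
record.leave(date(2021, 3, 1))
try:
    record.archive(date(2021, 12, 1))  # only 9 months after leaving
    archived_early = True
except ValueError:
    archived_early = False
record.archive(date(2022, 3, 1))
print(record.name, record.state, archived_early)
```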
3.6 Example Questions
3. Every student in a school belongs to a form. Every form has a form tutor and all
the form tutors are members of the teaching body.
Draw an entity relationship diagram to show the relationships between the four
entities STUDENT, FORM, TUTOR, TEACHERS (6)
Chapter 3.7 Use of Systems and Data
This automatic stock reordering has two cost effects. First it means that the
organisation should rarely run out of stock which would cause a loss of sales if it were
to happen and, hence, loss of income. It also means that the organisation should not
need to store large quantities of stock which would lead to high inventory costs.
If the organisation also keeps data showing the rates of sales of products, the system
can recognise changes in these rates and so change its ordering patterns.
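A rough sketch of how the rate of sales might drive reordering is given below. The rule used (reorder when stock falls to the expected sales during the delivery lead time plus a safety margin) is a common textbook rule and is an assumption here, not something the text specifies; the function names and figures are also invented.

```python
def reorder_level(daily_sales: float, lead_time_days: int, safety_stock: int) -> float:
    """Stock level at which a new order should be placed (assumed rule)."""
    return daily_sales * lead_time_days + safety_stock


def needs_reorder(stock: int, daily_sales: float, lead_time_days: int, safety_stock: int) -> bool:
    """True when the current stock has fallen to the reorder level or below."""
    return stock <= reorder_level(daily_sales, lead_time_days, safety_stock)


# The same stock of 100 items, judged against two different rates of sale.
slow_sales = needs_reorder(stock=100, daily_sales=10, lead_time_days=5, safety_stock=20)
fast_sales = needs_reorder(stock=100, daily_sales=20, lead_time_days=5, safety_stock=20)
print(slow_sales, fast_sales)
```

When sales double, the reorder level rises from 70 to 120, so the same stock level now triggers an order. This is the sense in which a change in the rate of sales changes the ordering pattern.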
Thus, data about products in stock and rates of sales are valuable as they improve the
profitability of the organisation.
In order for data to be of value they must be accurate and up-to-date. Often data are
inaccurate due to them not being frequently updated. If the sales figures are only used
once a week to update the stock database, the stock levels are soon out of date and the
data have little value.
These days, banks offer services other than banking. They offer mortgages, insurance
and business support. If a bank is considering a loan, it is important that the bank is
aware of the risks involved. Keeping data about previous borrowers, such as age,
income and social background, and comparing the data for a potential new borrower
with the historical data can help to determine whether or not to make the loan. This is
often done using artificial intelligence (AI) techniques and leads to fewer people
reneging on their loan. Thus, the data used is very valuable to the bank.
However, how does the senior executive in one country know what is happening in
other countries? Modern companies keep databases that can be accessed on a world-
wide basis. In order to do this, value added network services (VANS) are used.
These simplify the exchange of data between users of the service by using computer
networks.
In these systems, users plug into the interface provided by the VANS operating
company and the software does everything else. A VANS may operate in a single
company or may be of use to several companies. For example, estate agents may
share a VANS in order to match potential buyers with sellers over a much wider area
than is possible if each estate agent only has access to their own data. This system is
also used by solicitors having access to local authority databases for conveyancing
purposes. Eventually VANS will operate on a world-wide basis. Thus data that was
only of value to a small number of users is now of value to many more. This means
that the data have increased in value.
One of the problems with so much data being available is trying to sift the data for
useful information. This is often achieved using data mining techniques. A lot of
work is going on to develop sophisticated data mining software which looks for
patterns in vast quantities of data.
This can lead to much better targeting of customers, with the result that there are better
returns on investments.
A great deal of work is being done on data mining as many companies can make use
of the results. Indeed, some companies sell lists of people who may be valuable
customers to other companies.
3.7 (b) Standardisation
Text files. These are used to hold characters represented by the ASCII code. Text
files are used to transfer data between application packages. The data consists of
individual characters and there is no formatting applied to the characters.
Comma Separated Variable files. These are used to transfer tabular data between
applications. Each field is separated by a comma.
Tab Separated Variable files. These are used to transfer tabular data between
applications. Each field is separated by a tab character.
Standard Interchangeable Data files. These are used to transfer tabular data
between applications. They are not common outside the UK education market.
Rich Text Format files. These are a complex format used to store data from a word
processor. They include information about fonts, sizes, colour and styles.
Picture files. These are used to represent sound/pictures in digital format. There are
many different formats such as BMP (bit mapped), JPEG (Joint Photographic Experts
Group), GIF (Graphics Interchange Format) and MPEG (Moving Picture Experts
Group). JPEG and MPEG involve compression techniques. It is these techniques that
allow pictures to be quickly transferred over the Internet. MPEG has also allowed the
introduction of many more television channels through a more efficient use of the
bandwidth available over the media used.
Sound files. As with picture files, there are many different formats that store sound in
digital form. WAV files are common on PCs. Storing sound requires a great deal of
memory. CDs sample at the rate of 44,100 samples/sec and DVD (Digital Versatile
Disk) at 96,000 samples/sec. Thus 3 minutes of music at the DVD rate requires
3 x 60 x 96,000 = 17,280,000 samples which, assuming one byte per sample, is about
16 Mbytes. A typical DVD can hold 4.3 Gbytes or about 13 hours of music.
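The storage arithmetic for sound can be checked with a few lines of code. The sketch assumes one byte per sample and a single channel, which is what the 16 Mbyte figure implies, and takes 1 Mbyte as 2^20 bytes and 1 Gbyte as 2^30 bytes. The function name audio_bytes is invented for the example.

```python
def audio_bytes(seconds: int, samples_per_second: int,
                bytes_per_sample: int = 1, channels: int = 1) -> int:
    """Uncompressed storage needed for a recording, in bytes."""
    return seconds * samples_per_second * bytes_per_sample * channels


# Three minutes at the 96,000 samples/sec rate quoted for DVD.
three_minutes = audio_bytes(3 * 60, 96_000)
three_minutes_mbytes = three_minutes / 2**20

# How long a 4.3 Gbyte disk lasts at one byte per sample.
dvd_hours = (4.3 * 2**30) / audio_bytes(1, 96_000) / 3600

print(three_minutes, round(three_minutes_mbytes, 1), round(dvd_hours, 1))
```

Raising bytes_per_sample or channels scales the result directly; an audio CD, for instance, stores two bytes per sample on each of two channels.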
The method of transferring data over a wide area is usually by means of ISDN
(integrated services digital network) connections. ISDN is used by telephone
companies to connect digital exchanges. Most homes use analogue connections to the
local exchange but after that ISDN is used as shown in Fig. 3.7(b)1.
[Diagram: the User is connected to a Digital Exchange; the Digital Exchanges are joined to one another by digital links.]
Fig. 3.7(b)1
ISDN has a standard format that is used world-wide. There are two standard ISDN
services known as primary rate access (ISDN 30) and basic rate access (ISDN 2). The
difference is the number of channels and the methods used to deliver the services to
the user.
ISDN 2 will probably be used by most small businesses and individual users. It
provides two channels at 64kbps (B-channels) and a signalling channel of 16kbps (D-
channel). The three channels are multiplexed onto a single communications medium.
Data is packaged into frames, one type for transmission from the network to the
terminal and the other for transmission from the terminal to the network. Each type of
frame consists of 48 bits that have to be in a prescribed order. This ensures that the
data can be reassembled properly when received. This system can use the same wires
as in current telephone networks.
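The channel rates quoted for ISDN 2 can be combined as follows. This is only a sketch: it takes 1 kbps as 1,000 bits per second, ignores framing overhead, and uses an invented helper name, transfer_seconds.

```python
B_CHANNEL_BPS = 64_000  # each of the two B-channels
D_CHANNEL_BPS = 16_000  # the signalling D-channel

# Total rate multiplexed onto the single communications medium.
total_bps = 2 * B_CHANNEL_BPS + D_CHANNEL_BPS


def transfer_seconds(size_bytes: int, b_channels: int = 2) -> float:
    """Time to send size_bytes of user data over the given number of B-channels."""
    return size_bytes * 8 / (b_channels * B_CHANNEL_BPS)


one_megabyte = transfer_seconds(1_000_000)
print(total_bps, one_megabyte)
```

Only the B-channels carry user data, so the time for a file depends on 128 kbps rather than on the full 144 kbps.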
In order that data are understood when received, it is not sufficient to package data
into a format that can be sent along ISDN connections. The data may represent
sound, pictures, text or many other things. It is necessary to package this data into
some standard format first. The standard used is the Open Systems Interconnection
Reference Model, usually simply called the OSI model. The OSI model is simply a set
of rules (protocol) for the transmission of data from one piece of hardware to another.
These rules have to cover the medium used for the transmission and the software
itself. This is an obvious subdivision of the rules that are needed, but each part can be
subdivided further. In all, the OSI model has 7 subdivisions (other models have more
or fewer, but all work on the same basic principle). The point is that if the whole
protocol were treated as a single entity then every time a small change was necessary,
perhaps a different peripheral being added to a system or different software being
used, the whole protocol would need to be altered. However, in the OSI system, only
one of the subdivisions (actually called ‘layers’) needs to be altered.
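The layering idea can be illustrated with a toy model. The seven layer names are the standard OSI ones; everything else (wrapping the data in a text label at each layer, and the function names send and receive) is a simplification invented for this sketch and is not how real protocols encode their headers.

```python
# The seven OSI layers, from the top of the stack to the bottom.
LAYERS = [
    "Application", "Presentation", "Session", "Transport",
    "Network", "Data Link", "Physical",
]


def send(message: str) -> str:
    """Pass a message down the stack; each layer adds its own wrapper."""
    frame = message
    for layer in LAYERS:
        frame = f"[{layer}]{frame}"
    return frame


def receive(frame: str) -> str:
    """Pass a frame up the stack; each layer removes only its own wrapper."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"missing {layer} wrapper")
        frame = frame[len(header):]
    return frame


sent = send("hello")
received = receive(sent)
print(sent)
print(received)
```

Each layer deals only with its own wrapper, so the way one layer works could be changed without touching the other six.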
3.7 (c) Computers and Communication
Computers are now used to aid communication between many devices and to provide
extra facilities that were not available with the old telephone networks.
Voice mail digitises spoken messages and stores them on disk. When recipients
access the messages they are converted back into sound.
Digital telephone systems provide many facilities. Because computers can maintain
very large databases, it is possible for users to have itemised bills, recall stored
numbers and to have accurate timing of calls. Although itemised bills can be sent out
on a regular basis, users can, using the Internet, access their own accounts at any time
and see what calls they have made and the costs of these calls. These systems also
allow the use of voicemail. Mobile phones rely heavily on computers to route calls.
[Diagram: the Retailer's computer and the Customer's computer exchange an Order, Payment, Invoice, Delivery Note and Price List.]
3.7 (d) New Business
The use of the Internet by media reporters can mean that news can be quickly updated
and that information is in electronic form. This means that it can be manipulated for
use on other media.
Estate agents can set up sites that enable them to sell property throughout the world.
The applications are endless and you should keep abreast of modern developments as
they are published in the media. It is also worth reading Business @ the Speed of
Thought by Bill Gates published by Penguin Books (ISBN 0140283110).
3.7 (e) Training
Training in the use of IT is essential if users are to make the best use of it. Young
people are growing up in an IT environment and receive basic training in its use.
However, older generations find using IT daunting and need careful and appropriate
training. This may be as simple as switching on a PC and loading software or may
involve the use of particular packages. In the latter case, the packages taught need to
be pertinent to the jobs carried out by the learner. It is very easy to alienate learners
by teaching them how to use software facilities that they will never use.
It is also important that courses provide sufficient time for the learners to practise new
skills and to be provided with sufficient notes to enable them to redo tasks, set during
the course, at a later date. Online help is not enough; most people prefer to have their
notes in printed form. This is because they need to look at their work and their notes
at the same time. Adjusting the size of windows so that the work and the notes are
both on the screen at the same time is often unsatisfactory. Also, learners like to flick
back and forth through their notes and this is much easier when the notes are on
paper.
IT is changing the way things are done all the time. Robots weld cars; what is to be
done with the people who used to do the welding? They will have to be retrained to
do a different job. Bank clerks used to add up columns of figures; now they press
keys on a keyboard. However, they are now expected to provide new services to the
customer other than handling cash and cheques. They have had to be retrained as
sales persons as banks now sell mortgages, insurance and other services.
Organisations are setting up help desks for customers to contact when they have a
query. At present, most of these help desks involve large numbers of people. In
future a lot of this help will be provided electronically by means of databases that
hold data about frequently asked questions (FAQs). This means that the operators of
the help desks will have to be retrained to create these databases.
Training in the use of IT is not sufficient in itself. Employees can be trained to use
email but also need training in how email can be used to enhance their work. Instead
of groups of workers meeting, say, once a week, the workers can keep one another
informed of progress when it happens. This means that all workers on a project know
the current stage of development of that project. This speeds up the work. However,
training is needed in these new working methods, particularly to prevent an
overloading of email communications.
3.7 (f) Changing Work Patterns
These were mentioned in the previous Section. At one time a sales person went to a
customer with a catalogue and a price list. If a customer wanted something unusual,
the sales person had to go back to the office to get details. Now a laptop and a
modem can allow the sales people to access the company's database from customers'
premises. This allows them to spend more time with customers.
A similar example is that of selling double-glazing. At one time someone went to the
customer's house and measured all the windows. The next step was to go back to the
office and prepare a quotation which was then sent to the customer. Now, the sales
person can use a laptop, with suitable software, to prepare a quotation on the spot.
It is quite common for people to work on a project in the office, email it home and
continue working on it later at home.
Like banks, factories have seen major changes in working patterns. Fewer people are
needed in the assembly process because of more machines being used, many of them
intelligent and robotic. However, more technicians are needed to maintain the
automated plant. This seems like a balance of jobs lost against jobs gained, but it is
more complicated because the jobs lost are normally low skill while the new jobs are
high skill. This movement of skill levels in the work force has major implications for
the education of people and also means that many who were employed may not be
able to learn new skills and hence get new jobs. There is a consequent social problem
in society of a whole new underclass of people who are unable to gain satisfactory
employment.
Hotel receptionists have access to a database for all the hotels in a group. This means
that they can now book hotels for customers other than the one in which they work.
Staff who work in stores only take stock a few times a year instead of weekly. Stock
levels are kept on computer databases and need to be checked occasionally in case
stock is removed without passing through point-of-sale terminals. (This may be due
to products being damaged or stolen.)
Teachers and lecturers often set assignments using computer networks. Students then
post their work to their tutors electronically. Tutors view the work on screen and
return the marked work, with comments, electronically.
Products can be manufactured to a much higher standard because of the use of
computerised machines and robots. This increase in accuracy has led to an increase
in quality. Self-assembly furniture is easier to put together because the parts are
made more accurately. Children's building toys look much better because the
components are more accurately made and are of better, more consistent, quality.
This increase in quality has led to fewer faults in end products such as motor cars.
This means that, in the case of motor cars, mechanics spend more time servicing
vehicles and less time correcting errors in manufacture. However, the increase in
quality has also led to a reduction in the need to service motor vehicles.
3.7 Example Questions
Example questions are not offered here as all the work is either a repeat of previously
visited work or is based on definitions that can be taken straight from the text.
As all the examiner comments have already been made, any content here would
simply be a repeat of what has gone before.
Chapter 3.8 Systems Development, Implementation, Management and
Applications
You have already used techniques such as entity models and normalization in Chapter
3.6. Chapter 1.7 discussed the system life cycle and techniques that can be used to
develop computer solutions.
Another useful technique is the Structured Systems Analysis and Design Method
(SSADM). Fig. 3.8.a.1 shows the stages involved when using SSADM.
Feasibility Study
Requirements Analysis
Requirements Specification
Physical Design
Fig. 3.8.a.1
These steps have been explained, informally, in Chapter 1.7. In this Section we shall
show how diagrams can help the development of the stages shown in Fig. 3.8.a.1.
You do not need to know the techniques of SSADM but Data Flow Diagrams (DFDs) are
important. DFDs provide a graphic representation of information flowing through a
system. The system may be manual, computerized or a mixture of both.
The advantages of using a DFD are that it
• is a simple technique;
• is easy to understand by users, analysts and programmers;
• gives an overview of the system;
• is a good design aid;
• can act as a checking device;
• clearly specifies communications in a system;
• ensures quality.
DFDs use only four symbols. These are shown in Fig. 3.8 (a)2.
All names used should be meaningful to the users, whether they are computer literate or
not.
The steps to be taken when developing DFDs are given in Table 3.8 (a)1.
Step                                 Notes
1. Identify dataflows.               e.g. documents, VDU screens, phone messages.
2. Identify the external entities.   e.g. Customer, Supplier.
3. Identify functional areas.        e.g. Departments, individuals.
4. Identify data paths.              Identify the paths taken by the dataflows identified in step 1.
5. Agree the system boundary.        What is inside the system and what is not.
6. Identify the processes.           e.g. Production of invoices, delivery notes, payroll production.
7. Identify data stores.             Determine which data are to be stored and where.
8. Identify interactions.            Identify the interaction between the data stores and the processes.
9. Validate the DFD.                 Check that meaningful names have been used. Check that all processes have data flows entering and leaving. Check with the user that the diagram represents what is happening now or what is required.
10. Fill in the details.
Fig. 3.8 (a)3 shows the different levels that can be used in DFDs.
0 Payroll System
Level 1 (Top Level):          1 Get hours worked; 2 Calculate wages; 3 Produce wage slips
Level 2 (Lower Level):        2.1 Validate Data; 2.2 Calculate gross wage; 2.3 Calculate deductions; 2.4 Calculate net wage
Level 3 (Not always needed)
Fig. 3.8 (a)3
Now consider the following scenario.
A hotel reception receives a large number of enquiries each day about the availability of
accommodation. Most of these are by telephone. It also receives confirmation of
bookings. These are entered onto a computer database.
While a guest is resident in the hotel, any expenses incurred by the guest are entered into
the database by the appropriate personnel. If guests purchase items from the bar or
restaurant, they have to sign a bill which is passed to a receptionist who enters the details
into the database.
When guests leave the hotel they are given an invoice detailing all expenditure. When
they pay, the database is updated and a receipt is issued.
[Data flow diagram of the hotel system (Fig. 3.8 (a)4): external entity Customer; processes located at Reception, Bar and Restaurant; data stores D1 Customer Details and D2 Customer Accounts; data flows include Enquiry, Reply, Confirmation of booking, Customer Details, Final Bill, Payment, Receipt, Drinks Order, Drinks Bill and Food Bill.]
The symbol for Customer, an external entity, has a diagonal line to indicate that it occurs
more than once. This does not mean that these symbols represent different customers. It
is simply used to make the diagram clearer. Without this, there would be too many flow
lines between this symbol and the internal processes.
Data stores may also be duplicated. This is done by having a double vertical line on the
left hand side as shown in Fig. 3.8 (a)5.
M1 Customer data
Notice this data store is numbered M1 whereas those in Fig. 3.8 (a)4 were numbered D1
and D2. In data stores, M indicates a manual store and D indicates a computer based data
store. Also, there can be no data flows between an external entity and a data store. The
flow of data from, or to, an external entity must be between the external entity and a
process.
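Two of the rules stated so far lend themselves to an automatic check: every process must have data flows entering and leaving, and no data flow may run directly between an external entity and a data store. The sketch below is one possible way to test a diagram against them; the data layout, the function name validate_dfd and the small extract of the hotel system are all assumptions made for illustration.

```python
def validate_dfd(kinds, flows):
    """Return a list of rule violations; an empty list means none were found.

    kinds maps each symbol name to "external", "process" or "store".
    flows is a list of (source, target, label) tuples.
    """
    problems = []
    for name, kind in kinds.items():
        if kind != "process":
            continue
        if not any(target == name for _, target, _ in flows):
            problems.append(f"{name}: no data flow entering")
        if not any(source == name for source, _, _ in flows):
            problems.append(f"{name}: no data flow leaving")
    for source, target, label in flows:
        if {kinds[source], kinds[target]} == {"external", "store"}:
            problems.append(f"{label}: flows directly between {source} and {target}")
    return problems


# A small extract of the hotel system.
kinds = {
    "Customer": "external",
    "Reception": "process",
    "D1 Customer Details": "store",
}
flows = [
    ("Customer", "Reception", "Enquiry"),
    ("Reception", "Customer", "Reply"),
    ("Reception", "D1 Customer Details", "Customer Details"),
]

ok_problems = validate_dfd(kinds, flows)
bad_problems = validate_dfd(
    kinds, flows + [("Customer", "D1 Customer Details", "Booking")]
)
print(ok_problems, bad_problems)
```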
The example of the hotel system only shows one external entity, the customer. Usually
there is more than one external entity. Suppose we are dealing with a mail order
company. Clearly, one external entity is the customer. However, another is the supplier.
Note that although there is more than one customer and supplier, in the diagram they are
written in the singular.
With a number of tasks, some may be possible at the same time, while with others it
becomes important to do them in the correct order.
It may also be important to work out how long the project should take to complete.
Major projects like this can be represented graphically to show the different tasks and
how they join together, or relate to each other.
Take, as an example, the major project of building a bungalow. It can be divided into a
number of tasks.
A Concreting the foundations takes 4 days
B Building the walls takes 4 days
C Making the doors and windows takes 7 days
D Tiling the roof takes 2 days
E Installing the plumbing takes 3 days
F Doing the interior carpentry takes 4 days
G Installing the electrics takes 6 days
H Decorating takes 5 days
One way of deciding how long the bungalow takes to build is to add up all the separate
times, which gives 35 days. On the other hand, they are separate jobs so, as long as enough
people are working, it will only take as long as the longest task, 7 days. This is silly: the
decorating can’t be done before the roof is on!
The real time for the project is somewhere between 7 and 35 days.
[Bar chart: tasks A to H on the vertical axis, plotted against No of Days (4 to 24 in steps of 4) on the horizontal axis.]
Another type of graph might be similar to a flow diagram. The circles represent stages
and the arrows the tasks needed to reach that stage and the time, in days, needed to carry
them out.
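[Network diagram: nodes 1 to 7 joined by arrows labelled with the tasks and their durations in days: A (4), B (4), C (7), D (2), E (3), F (4), G (6) and H (5).]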
1 Start
2 Foundations finished
3 Start making the windows and doors
4 Walls finished
5 Roof finished
6 Interior finished
7 Bungalow finished
There are many routes through the diagram. The one that gives the time taken to
complete the bungalow is
1 → 2 → 4 → 5 → 6 (via G) → 7
that is, tasks A, B, D, G and H, a total of 4 + 4 + 2 + 6 + 5 = 21 days. This is the
critical path. The bungalow cannot be built in a shorter time.
The arrows show the order in which the tasks must be carried out, so E (the plumbing)
cannot be done before B (building the walls), but can be done at the same time as tiling
the roof (D). Each node can have an optimum time worked out. Node 6 has an optimum
time of 4+4+2+6 = 16 days. This is the time to be sure of getting to node 6 whichever
route you choose. Each node can also have a latest starting time before it holds up
another node. Node 1 must start immediately otherwise the walls (4) won’t be finished
by day 8. However, node 3 could start immediately or wait a day without affecting the
rest.
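The optimum (earliest) and latest times for each node can be worked out mechanically. The sketch below follows the arrows described in the text for tasks A, B, C, D, G and H. The positions of E (taken here as node 4 to node 6) and F (node 5 to node 6), and the zero-length link from node 1 to node 3, are assumptions, since the text does not state them; they do not affect the 21-day result.

```python
# task: (from_node, to_node, days)
TASKS = {
    "A": (1, 2, 4),
    "B": (2, 4, 4),
    "start C": (1, 3, 0),  # assumed zero-length link so that C can begin at the start
    "C": (3, 4, 7),
    "D": (4, 5, 2),
    "E": (4, 6, 3),        # assumed position
    "F": (5, 6, 4),        # assumed position
    "G": (5, 6, 6),
    "H": (6, 7, 5),
}

nodes = sorted({n for u, v, _ in TASKS.values() for n in (u, v)})

# Forward pass: earliest time each node can be reached.
# Every arrow runs from a lower to a higher node number, so one pass is enough.
earliest = {n: 0 for n in nodes}
for n in nodes:
    for u, v, days in TASKS.values():
        if u == n:
            earliest[v] = max(earliest[v], earliest[u] + days)

# Backward pass: latest time each node can be reached without delaying the finish.
finish = earliest[nodes[-1]]
latest = {n: finish for n in nodes}
for n in reversed(nodes):
    for u, v, days in TASKS.values():
        if v == n:
            latest[u] = min(latest[u], latest[v] - days)

# A task is critical when it has no spare time at all.
critical = [t for t, (u, v, days) in TASKS.items() if latest[v] - earliest[u] - days == 0]

print(finish, earliest[6], latest[3] - earliest[3], critical)
```

Node 3 has one day of slack, which matches the remark that making the doors and windows could start immediately or wait a day.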
Complex computing projects require effective management or they may get out of
control. Different personnel will be needed for different tasks. Their time must be
booked in advance so that they are free from working on other tasks or projects when
they are needed. This means that an accurate prediction of the start times of the various
tasks must be made. It is also necessary to assess how long the different tasks will last so
that other projects can book personnel.
There is now specialist project management software available that can take more
complex projects than the one that we were considering and produce the type of analysis
that we have been talking about. If the software is compatible with the diary software
used by a firm then it can automatically book the workers when they are needed.
As an exercise try to draw a chart to help in the planning of the work for an A2 project.
One possible answer is shown below.
Evaluation
User documentation
Technical documentation
Implementation
Testing
Software development
Design
Analysis
Nature of problem
Time
3.8 (b) The Purpose of Documentation
An E-R diagram shows the relationships between entities so that we can check these
relationships with the end user. These diagrams help us to ask questions like
Thus, we can ensure that the relationships are correct. We can also ensure that we have
included all the entities.
Similarly, our Data Flow Diagrams (DFDs) show how data is moving through the
organization and we can ask questions like
Once we are sure that our analysis is complete and accurate, by continually checking it
with the end user, we can move on to the design stage. At the design stage the diagrams,
so far produced, will be modified to show exactly what we are going to produce. This
may be different from the analysis due to unforeseen constraints. However, as they
should be very similar to those produced at the analysis stage, we can check that we have
included everything required by the user. Indeed we should go back to the user to
validate our design.
We can now design the user interfaces and check that they will allow the user to input the
data required and output the results expected. Again these can be checked with the user.
A good way of doing this is to produce a prototype of your solution. This will appear to
work but, in fact, it only shows how the interfaces work. There is no code behind the
interfaces to manipulate the data. This can cause problems, as users often think that they
are seeing the real thing when there are still many months’ work needed to produce the final
solution. When the interfaces have been designed they can also be checked against the
E-R and data flow diagrams.
This continual cross-checking with previous steps is very important. Fig. 3.8 (b)1 shows
that, as each stage is developed, it is checked against all previous stages. This continual
validation process is essential if we are to reduce the cost of maintenance due to errors
and omissions. The careful documentation also helps to maintain a piece of software
when it is to be upgraded by adding extra facilities.
User Request
Initial Study
Feasibility Study
Systems Analysis
Systems Design
Implementation
Change Over
Evaluation and Maintenance
Fig. 3.8 (b)1
3.8 (c) Technical Requirements
A quick look at any advert for computer hardware will make it obvious that different
computer systems have different technical specifications. One system will have a 700
Megahertz processor while another will have a 1.5 Gigahertz processor. One will have
256 Megabytes of RAM while another will have 64 Megabytes. Is one of these systems
preferable to the other, and if so, which one?
To answer the second question first, “None.” This answer is a little bald, but strictly true.
An examination paper cannot ask for the specification for a PC because everyone will
have their own idea what specification will be appropriate and, anyway, the answer that
the examiner has put on the mark scheme will be out of date by the time the examination
is marked. Strictly, there is no right answer to such a question. So the requirements of
the syllabus come down to the first question – What specifications are important to allow
the system to perform the operations expected of it?
The speed of the processor is simply a measurement of the number of operations that are
possible every second. This is being typed, using ‘Word’ word processing software, on a
computer with a 1.33 Megahertz processor. If I had a 1.33 Gigahertz processor (1000
times faster) it would make no difference; I still can’t type any faster. So the speed of the
processor is largely irrelevant to this particular task. My neighbour uses her computer to
edit video material as part of a service that she offers to local industries. My
computer would simply not be able to process the data quickly enough to produce
satisfactory images without considerable jerking of the picture. In her case she needs the
faster processor. A student uses their computer to produce essays for their English
course, whilst another produces high quality colour pictures for an Art course. The first
student is storing relatively small text files while the other student needs to store
large quantities of data. My neighbour’s digitised video is going to need all of the 70
Gigabyte hard drive that she has bought.
The English student will need to have a CDROM drive in order to load software to use.
The Art student decides that the images are too valuable to lose and opts for a DVD drive
to act as a back up storage to the hard drive. My neighbour has clients who want large
numbers of copies of the videos that she produces on CDs, so she has
invested in a CD writer, a fast one so that it takes less time to produce each copy.
The English student has invested in an ink jet printer with a separate black cartridge
because most of the work will be in black and white. The artist has a simple one
cartridge printer because black will not be needed very often.
From these examples, hopefully, the student can understand that the important thing is
the need to satisfy the needs of the application rather than the need for the student to
memorise large quantities of data about system specifications which would be out of date
very quickly anyway.
3.8 (d) Appropriate Response Times
This section is, very much, a recap on work covered in the AS part of the course.
The point being made is that different applications require different things from the
computer system. A classic example is the file of goods that needs to be accessed
directly if it is to be used at the point of sale terminal, and sequentially if the details are to
be used for ordering of replacements. The solution is to store the file in an indexed
sequential format. At the point of sale the item required is identified by a laser scanner
whereas at the head office the item name will likely be typed in at a keyboard. If a single
item needs to be found then a program can be written to find the details directly from the
key field by using a hashing algorithm. Alternatively, the items can be held sequentially in
an index and a program can be written to find the particular item by using a binary search
of the index. Dependent upon the method of search the data needs to be stored
differently.
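The two ways of finding a single item can be sketched as follows. This is only an illustration: the item codes and names are made up, a Python dictionary stands in for the hashed file, and a sorted list stands in for the index.

```python
from bisect import bisect_left

# Hypothetical stock records: (item code, description).
RECORDS = [(1031, "Milk"), (1003, "Tea"), (1047, "Bread"), (1010, "Coffee"), (1024, "Sugar")]

# Direct access: a dictionary is a hash table, so the key field leads
# straight to the record without looking at any other record.
HASHED_FILE = {code: description for code, description in RECORDS}

# Indexed access: the records are held in key order with an index of keys.
SORTED_RECORDS = sorted(RECORDS)
INDEX = [code for code, _ in SORTED_RECORDS]


def find_direct(code):
    """Find an item through the hash table (point of sale use)."""
    return HASHED_FILE.get(code)


def find_by_index(code):
    """Find an item by a binary search of the ordered index."""
    position = bisect_left(INDEX, code)
    if position < len(INDEX) and INDEX[position] == code:
        return SORTED_RECORDS[position][1]
    return None


def list_sequentially():
    """Read every item in key order (ordering of replacements)."""
    return [description for _, description in SORTED_RECORDS]
```

Both functions return the same record for the same key; the difference lies in how the data has to be stored for each search to work.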
Students will be expected to be able to determine sensible types of hardware and software
for particular applications dependent upon their characteristics, just as they were expected
to do in the AS part of the syllabus.
3.8 (e) Implementation Techniques
The different methods for implementing new systems have all been explained before.
The inclusion here is simply to reinforce the ideas and forewarn students that they are
likely to meet this topic in this examination as well as others in the past.
The attention of students is especially directed to Chapter 1.7 the systems development
life cycle, from the AS work, and also to the notes on module 4, the project, for
references to how to implement a system.
3.8 (f) Managing, Monitoring and Maintenance of Systems
As with so much of the work in this module, a lot of the work in this section refers to the
work already covered in other sections.
The need for managing, monitoring and maintenance of systems goes back to section 1.7
in the AS text which was to do with systems analysis and the system life cycle. Not only
should one be aware of the need for documentation, but it should be up to date. In a
subject like Computing, where things are changing so very quickly, it is not reasonable to
suggest that a thing has been done once and that it therefore does not have to be
considered again. Software and hardware change on such a regular basis that, even if we
are happy with our system, the outside world is going to affect us and force change.
There are very few systems that are totally self-contained and when another system is
updated our original system may no longer be compatible. This process of change should
not be a static one. A firm should not sit back and wait for a change to be forced upon it,
rather there should be a continual process when managing a system, of measuring the
system that is being used against what is currently available, a software (and hardware)
audit. The outputs from systems should be studied to ensure that they are acceptable; this
is quality control. The documentation must also keep up with the rest of the system in order
to provide the necessary information about the system to the various users.
3.8 Example Questions
Chapter 3.9 Simulation and Real-time Processing.
A real-time system is one that can react quickly enough to data input to affect the real
world. If this is true it implies that the output from the system must be produced quickly
enough to produce the effect on the world outside the computer before that world has
enough time to alter. Consider the case of the airline booking system. The “world” that
we are talking about is the world of the database that contains all the booking details. The
use of a real-time system here refers to the concept that if a ticket is bought by a member
of the public, the database must be updated before the next person has a chance to book a
ticket. Notice that the idea of working “incredibly fast” or “in billionths of a second” does
not apply here. In some real-time applications these comments may be reasonable, but it
depends on the ‘world’ that the application is concerned with.
When asked to describe a real-time application the first thing that needs to be described is
the world of the application. Everything else falls into place. Students should then
describe the hardware necessary to allow input and output from that world and the
decisions that the software must take.
A nuclear reactor may start to react too violently and sensors inform the computer
controlling the reaction that this is happening. The computer takes the decision to insert
the graphite rods to slow the reaction down. This is a real-time application. The world of
the application has been identified, the input devices are the sensors that inform the
computer of the state of the reaction, the computer makes an immediate decision and the
graphite rods are now moved into place. Notice that the movement of the rods is not immediate
but will take place over a period of time; the decision, however, was taken immediately. Note
also that the sensors simply report on the state of the world; there is no hint of decision
making on the part of the sensors. Many students would phrase their answer in the form
of “The sensors spot that the reaction is too violent and the processor makes …”. Here,
the sensors are being credited with having processing power in that they can interpret the
readings that are produced.
Most sensors are surprisingly simple, relying on one of two methods to gain information.
They either use some type of spring mechanism to physically change the position of
something in the sensor, or a means of turning some reading into a variable voltage. The
spring ones are like the pressure pad, held open by a spring which is overcome by the
weight of the burglar. Similarly, a bumper around a robot vehicle can be kept away from
the body of the vehicle by springs which are overcome if the robot moves too close to
something blocking its path. A thermistor, used to measure the room temperature for a
central heating system, converts the ambient temperature to a voltage so that a decision
can be made by the processor. A light meter at a cricket match converts the available
light to a voltage so that the processor can decide whether there is enough light for play.
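The division of labour between sensor and processor can be sketched as follows. The sensor supplies nothing but a voltage; the decision is taken in software. The linear calibration and the target temperature are assumptions made for the illustration, and a real thermistor would need its own calibration.

```python
def volts_to_celsius(volts):
    """Convert the sensor reading to a temperature.

    Hypothetical linear calibration: 0.0 V means 0 C and 5.0 V means 50 C.
    """
    return volts * 10.0


def heating_decision(volts, target_celsius=20.0):
    """The processor's decision; the sensor only reported a voltage."""
    if volts_to_celsius(volts) < target_celsius:
        return "heating on"
    return "heating off"
```

For example, `heating_decision(1.5)` is treated as 15 degrees and switches the heating on, while `heating_decision(2.5)` switches it off.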
Notice that the digital sensors are really switches. If the bumper around the robot is
depressed is it necessary for a processor to make a decision? Probably not, the switch
would simply switch off the motor. The question arises whether this is a sensor in the true
sense of a sensor in computing. The answer is that the action of turning the motor off
required no decision and hence no processing, but that the need to do so will be reported
to the processor because it now has to make a decision about what to do next and the
input from that sensor is going to be an important part of that decision.
One last point about sensors. There are as many different sensors as there are physical
quantities that need measuring, but their reports need to be kept as simple as possible to
allow the processor to make decisions quickly. The idea of a sensor being a TV camera
because it can show the processor what is going on in a large area is unrealistic because it
would provide too much information. What would be possible would be a TV picture
which could be scanned by a processor for any kind of movement in order to indicate the
presence of a burglar. Never lose sight of the idea that the processor is limited in the
amount of data that can be interpreted, as is the software that the processor is running.
When the processor makes its decisions it must be able to take some action. In the type of
scenario we are talking about it will probably be necessary to alter something in the
physical world. This may involve making the robot move in a different direction, it may
be a matter of switching on some lights or telephoning the police station to report an
intruder on the premises. Some of these are simple electric circuits that the computer can
trip by itself, however, making the robot change direction is rather more complex. Such
movement involves the use of an actuator. An actuator is the device that can accept a
signal from the computer and turn it into a physical movement.
3.9 (c) The Use of Robots
A computer system has the ability to perform a large number of calculations in a short
space of time. If a physical action can be portrayed as the result of a series of formulae
and their results, acting upon one another, then the computer can, by doing the
calculations, pretend to be carrying out the physical action. This is what is meant by a
simulation.
The rate of growth of a sunflower is known from observations taken over many years.
The effects of different chemicals on the growth of sunflowers are known from simple
experiments using one chemical at a time. The effects of different combinations of the
chemicals are not known. If the computer is programmed with all the relevant formulae
dictating how it should grow in certain circumstances, the computer can play the part of
the sunflower and show how a real one can be expected to react. In this way the effects of
different growing conditions can be shown in seconds rather than waiting 6 months for
the sunflower to grow. One can imagine that in the course of a day a programmer can
come up with a suitable cocktail of additives to allow sunflowers to grow on the fringes
of a desert and consequently create a cash crop for farmers, in such conditions, where
they had no cash crop before. This is an example of the use of a computer simulation to
speed up a process in order to give results in a more reasonable time scale.
Simulation can be used to predict the results of actions, or model situations that would be
otherwise too dangerous. What will happen when a certain critical condition is exceeded
in a nuclear reactor? I don’t want them to try it on the one down the road from where I
live and I’m sure no one else does either. Program a computer to pretend to be a nuclear
reactor and it is no longer necessary to do it for real.
Some things are impossible. It is not possible to fly through the rings of Saturn. Program
a computer to pretend and a virtual reality world can be created to make it seem possible.
A car company is planning a new suspension system for a range of cars. One way of
testing different designs is to build prototypes and take them out in different conditions to
test how they work. This is very expensive, as well as time consuming. A computer can
be programmed to take the characteristics of each possible system and report how well
they will work, at a fraction of the cost. The same simulation can be made to vary the
conditions under which it operates. A fairly simple change to the parameters of one of the
formulae being used can simulate driving on a motorway or on a country lane.
People can have ideas which need testing to see if they are valid. An engineer may design
a new leaf spring for a suspension system. The hypothesis is that the spring will give
more steering control when travelling on rough surfaces. The computer can be set up to
simulate the conditions and give evidence to either support or contradict the hypothesis.
A financial package stores data concerning the economy. It can be used to provide
information about past performance of the various criteria being measured or it can be set
up to predict what will happen in the future. If the graph of a particular measure is linear
then extrapolation of what will happen in a year’s time is not difficult, in fact you
certainly don’t need a computer to provide the prediction. However, if the graph is
non-linear the mathematics becomes more difficult. More importantly, economic indicators do
not exist in isolation. If the unemployment figures go up then there is less money in the
economy, so people can buy less, so firms sell less, so more people are laid off. Unless
the Bank of England brings down interest rates which will encourage people to borrow
more and hence buy more, so firms need to employ more people in order to put more
goods in shops… When the relationships become intertwined like this the calculations of
predictions become very complex and computers are needed.
Description of a Simulation
Generally, a student would be expected to understand that in a given situation there are a
number of variables that control the outcome and the results that may be predicted. There
should be an awareness that the values of these variables do not just appear by magic but
must be collected and that sensible limits should be set within which the variable values
must lie. There should be an awareness that the results are going to be based on the use of
these variables in specific formulae that relate the variables to one another. Finally, there
should be an awareness that the results produced are subject to a degree of error, the size
of which will result from, not just the validity of the variable values and the relationships,
but also the validity of the model that is used.
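These ideas (variables, sensible limits, and formulae that relate the variables) can be sketched with a toy version of the sunflower example. Everything numeric here is invented for the illustration: the limits, the growth formula and its constants are assumptions, not measured data, so the output says nothing about real sunflowers.

```python
# Hypothetical sensible limits for each variable: (lowest, highest).
LIMITS = {"water": (0.0, 10.0), "fertiliser": (0.0, 5.0)}


def check_limits(values):
    """Reject any variable value that lies outside its sensible limits."""
    for name, value in values.items():
        low, high = LIMITS[name]
        if not low <= value <= high:
            raise ValueError(f"{name}={value} is outside {low}..{high}")


def simulate_height(values, weeks):
    """Return the predicted height in cm after each week (made-up formula)."""
    check_limits(values)
    max_height = 300.0  # assumed full-grown height
    # Assumed relationship between the variables and the weekly growth rate.
    rate = 0.05 + 0.02 * values["water"] + 0.03 * values["fertiliser"]
    height = 1.0
    heights = []
    for _ in range(weeks):
        # Growth slows as the plant approaches its full height.
        height += rate * height * (1 - height / max_height)
        heights.append(height)
    return heights
```

A call such as `simulate_height({"water": 5.0, "fertiliser": 2.0}, 26)` produces 26 weekly heights in a fraction of a second. How far the numbers can be trusted depends entirely on the validity of the formula.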
As mentioned before, if the predictions are based on a simple linear relationship between
two variables then there is little need for a computer because there is little to calculate
and the calculations are simple. If the predictions are based on a more complex
relationship then the computer becomes useful but only in terms of a glorified calculator.
However, if the predictions are based on complex, multiple, relationships the calculations
become immense and must be done quickly, if related to a time sensitive example like the
weather, otherwise, if tomorrow’s weather forecast takes two weeks to compute, it
doesn’t matter how accurate the results are, they are useless.
This combination of the vast quantities of data, the inter relationships and the consequent
volume of calculations means that computer power becomes essential to giving a sensible
result. Indeed, with something like the weather forecast, ordinary computers are too slow.
There is a need for high speed calculation, a need that is satisfied by the use of parallel
processing.
If 100 numbers need adding together, a standard, single-processor computer will need about
100 cycles to complete the calculation. A computer with 50 processors can add the 50 pairs of
values together in one cycle. The 50 answers can be added in pairs in a second cycle to
give 25 answers and so on. A total of 7 cycles are needed to add the set of 100 numbers.
This simple example gives an illustration of how simple arithmetic can be significantly
speeded up using parallel processing, and hence how parallel processing can be so
important to simulations.
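The count of 7 cycles can be checked with a short sketch. It does not run anything in parallel; each pass of the loop simply stands for one cycle in which every pair would be added at the same time by its own processor. The function name is made up for the illustration, and a non-empty list is assumed.

```python
def parallel_sum(values):
    """Add values in pairs; return the total and the number of cycles used."""
    values = list(values)
    cycles = 0
    while len(values) > 1:
        # One cycle: every pair is added, conceptually all at once.
        sums = [values[i] + values[i + 1] for i in range(0, len(values) - 1, 2)]
        if len(values) % 2:
            sums.append(values[-1])  # an unpaired value carries over
        values = sums
        cycles += 1
    return values[0], cycles
```

With 100 values the list shrinks 100, 50, 25, 13, 7, 4, 2, 1, which is 7 cycles, against roughly 100 for a single processor adding one value at a time.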
3.9 (f) Advantages of Simulation
This has already been discussed in this section. Briefly, the advantages of simulation to
testing are that tests can be carried out safely, at a fraction of the normal cost and in a
fraction of the time that the testing would otherwise have taken.
Limitations of Simulation
The syllabus no longer expects the limitations to be taught as a separate topic, but the
value of computer simulations cannot be appreciated without considering such
limitations. Those students with a mathematical background will know the difficulty of
coming up with true randomness and particularly in relation to physical events. In this
context random events must be interpreted as unpredicted or unpredictable events. The
previous sections have discussed problems associated with unpredicted events like the
effect of the sun storms on the weather. It is not suggested that the effects of these storms
could not be included in our model, simply that it was not considered when the model
was created, hence this is a fault of the model and not a suggestion that because of sun
storms it is impossible to predict the weather. However, there are some situations that are
not predictable. They are to do with events that are so complex that it is impossible to
design a model for them, or to do with human beings, the behaviour of which does not
normally follow easily interpreted relationships.
If it were possible to predict the outcome of the lottery draw then there would be some
very rich computer programmers. Mathematically, the outcome is not random and should
be predictable, perhaps by modelling the behaviour of the individual atoms inside the
machine that chooses the balls. However, this is impossible, certainly with present
technology.
If it were possible to predict accurately that human beings would all buy a particular song
in preference to another, then the record industry would not have to produce such a
volume of material in order to have a single hit. Human behaviour is very difficult to
predict.
3.9 Example Questions
1. Describe the real time application of a computer used to control a burglar alarm
system. (4)
2. Explain why some applications require parallel architecture to carry out their
processing and describe what is meant by parallel processing.
Chapter 3.10 Common Network Environments, Connectivity and
Security Issues
Most of this is covered in Chapter 1.6 in the AS text. Remember that questions may
be asked on any part of the A Level Computing syllabus in the exam for module 3.
In this Chapter you will learn how to connect LANs and WANs. Section 3.7 (b)
showed how analogue signals are used from your home PC (or network) to the local
telephone exchange. This connection is analogue if a modem is used. From then on
digital signals are used until the final local exchange. From this exchange analogue
signals must be used if the ordinary home telephone and modem are used by the
receiver.
LANs use digital signals to transfer data between nodes. The rate of transmission of
the data depends on the topology of the network and the transmission medium used to
join nodes in the network. Fig. 3.10 (a)1 shows a ring network. The most common
medium used in this type of network is unshielded twisted pair (UTP) as described in
Section 3.10 (b). This makes ring networks easy to install but limits bandwidth and,
therefore, the maximum speed of the network.
Fig. 3.10 (a)1 (ring network; labels: Station, Repeater)
Media, other than UTP, are used in ring networks, details of which are given in Table
3.10 (a)1. You are not expected to remember the exact transmission rates and other
details. However, you do need to remember the relative details. Details of these
media are given in Section 3.10 (b).
In bus networks the communication network is simply the transmission medium. Bus
networks can use any medium and details are given in Table 3.10 (a)2.
The limits on transfer rates given in the two tables are typical but they are being
extended all the time as technology advances.
3.10 (b) Transmission Media
The other main type of cable used in LANs is coaxial cable. This has a central
conductor enclosed in a plastic sheath which is surrounded by a copper sheath. This
copper screen is surrounded by a plastic coating as shown in Fig. 3.10 (b)2.
Fig. 3.10 (b)2 (coaxial cable; labels: Copper screen conductor, Central conductor,
Plastic insulators)
The transfer rates for these media are given in the Tables in Section 3.10 (a).
Sometimes it is very difficult to lay cables so low-power radio may be used. This
uses radio signals between networks and nodes, with other forms of media used to
link other parts of a network together. This is now being used in schools that have
mobile classrooms, sometimes known as demountables.
3.10 (c) Network Components
Switches use the same type of wiring as hubs (see Section 3.7 (d)). However, each
connector has full network speed. A typical layout is shown in Fig. 3.10 (c)1. Here,
each station has full speed access to the server. However, if any of these stations wish
to access the main network, they would have to share the connection to the main
network.
Fig. 3.10 (c)1 (stations and a server connected to a switch, which has one link to the
main network)
If the number of stations is increased and they all want to access the main network,
the increased local speed would be less useful because of sharing access to the main
network. In a case like this, it may be necessary to upgrade the link to the main
network.
A router is used to connect different types of network together. A router can alter
packets of data so that two connected networks (LANs or WANs) need not be the
same. Routers use network addresses and addresses of other routers to create a route
between two networks. This means that routers must keep tables of addresses. These
tables are often copied between routers using routing information protocol (RIP).
(Figure: LAN, Router, Public network, Router, LAN, connected in that order)
In order to route data round a network, a router takes the following steps.
Note that, in the case of the Internet, the destination address is the IP address.
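One part of a router's work, choosing where to send a packet by looking up the destination address in its table, can be sketched as follows. This is only an illustration: the table entries and next-hop names are hypothetical, and the rule used here (choose the most specific matching network, often called longest-prefix matching) is an assumption rather than something taken from these notes.

```python
import ipaddress

# Hypothetical routing table: destination network -> where to send the packet.
ROUTING_TABLE = {
    ipaddress.ip_network("192.168.1.0/24"): "deliver locally",
    ipaddress.ip_network("10.0.0.0/8"): "router B",
    ipaddress.ip_network("0.0.0.0/0"): "router A (default)",
}


def next_hop(destination):
    """Return the table entry for the most specific network containing the address."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in ROUTING_TABLE if address in network]
    # The default entry 0.0.0.0/0 matches every address, so matches is never empty.
    best = max(matches, key=lambda network: network.prefixlen)
    return ROUTING_TABLE[best]
```

For example, `next_hop("10.2.3.4")` gives `"router B"`, while an address that matches nothing else falls through to the default entry.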
Usually a router is slower than a bridge. A bridge links two LANs which may, or
may not, be similar. It uses packets and the address information in each packet. To
route data efficiently, a bridge learns the layouts of the networks.
Suppose a bridge is used to link two segments together that are not far apart, say in
the same building. The two segments can work independently but, if data needs to go
from one segment to another, the bridge will allow this. Fig. 3.10 (c)3 shows this
situation.
Fig. 3.10 (c)3 (two segments linked by a bridge)
The bridge has to learn where each node is situated. The bridge will receive data that
does not have to be passed from one segment to another. Initially, any data the bridge
receives is buffered and passed to both segments. The bridge stores a table containing
the addresses of sending nodes and the segment from which the data was sent.
Eventually, when all nodes have sent data, the bridge will know on which segment
each node is.
Now, when the bridge receives data being sent from one node to another, it can make
a decision whether, or not, the receiving node is on the same segment as the sending
node.
This leads to the following algorithm.
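A minimal sketch of a learning bridge along the lines described above is given here. The class name, the node names and the wording of the returned actions are made up for the illustration; a real bridge works on hardware addresses and also has to cope with buffering and overload.

```python
class Bridge:
    """A bridge that learns which segment each node is on."""

    def __init__(self):
        self.table = {}  # node address -> segment the node was last seen on

    def handle(self, source, destination, arrived_on):
        """Return the action taken for one packet arriving on a segment."""
        # Learn (or refresh) where the sending node is.
        self.table[source] = arrived_on
        known_segment = self.table.get(destination)
        if known_segment is None:
            return "pass to both segments"  # receiver not yet learned
        if known_segment == arrived_on:
            return "ignore"  # sender and receiver share a segment
        return f"forward to segment {known_segment}"
```

The first packet from an unknown node is passed everywhere; once both nodes have sent data, traffic between nodes on the same segment no longer crosses the bridge.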
However, bridges
• introduce delays,
• can become overloaded.
Modems are needed to convert analogue data to digital data and vice versa. A modem
combines the data with a carrier to provide an analogue signal. This means that
ordinary telephone lines can be used to carry data from one computer to another. This
was explained in Section 3.7 (b).
3.10 (d) Common Network Environments
Probably the largest network in use is the Internet. The Internet provides facilities to
link computers world-wide, usually using telecommunications systems. It allows fast
communications between people, the transfer of data between computers and the
distribution of information.
Messages are passed from the source computer, through other computers, to the
destination computer.
In order for this system to work, there are Internet Service Providers (ISP) who
connect a subscriber to the backbone of the Internet. These providers then pass data
between them and onto their respective clients. Fig. 3.10 (d)1 (on the next page)
shows how data, including electronic mail (see Section 3.10 (g)), are passed from one
computer to another.
An intranet is a network offering the same facilities as the Internet but solely within a
particular company or organisation.
An intranet has to have very good security for confidential information. Sometimes
the organisation allows the public to access certain parts of its intranet, allowing it to
advertise. This Internet access to an intranet is called an extranet.
Suitable software is required to make these systems work. Browsers allow a user to
locate information using a universal resource locator (URL). This is the address for
data on the Internet. The URL includes the transfer protocol to be used, for example
http, the domain name where the data is stored, and other information such as an
individual filename.
e.g. http://www.bcs.org.uk/ will load the British Computer Society's home page.
Domain names are held in a hierarchical structure. Each name is for a location on
the Internet. Each location has a unique name. The names in the various levels of the
hierarchy are assigned by the bodies that have control over that area.
PC195-staff.acadnet.wlv.ac.uk
The domain is uk and the ac would be assigned to a particular authority. (In this case
UKERNA). This authority would then assign the next part, i.e. wlv. As this is
Wolverhampton University, it is responsible for all the parts prior to wlv. Those in
charge of acadnet are responsible for PC195-staff.
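The hierarchy in a domain name can be made visible by splitting the name at the dots and reading it from right to left. The function name below is made up for the illustration.

```python
def domain_levels(name):
    """List the parts of a domain name from the top of the hierarchy down."""
    return list(reversed(name.split(".")))
```

For the name above, `domain_levels("PC195-staff.acadnet.wlv.ac.uk")` gives `['uk', 'ac', 'wlv', 'acadnet', 'PC195-staff']`, with each part assigned by whoever controls the part before it.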
Each computer linked to the Internet has a numeric address called its IP
(Internet protocol) address. This address uniquely identifies the
computer on the Internet. The domain name server converts the domain name
into its corresponding IP address.
3.10 (e) Hypertext Links
The World Wide Web stores vast amounts of data on machines that are connected to
the Internet. This data may be in the form of text, databases, programs, video, films,
audio and so on. In order to view this data you must use a browser such as Internet
Explorer or Netscape. However, the browser will need to know how to retrieve and
display this data.
All the data is situated on computers all over the world. These computers have unique
addresses and the data is held in folders on these computers. However, not all
computers use the same hardware and software. This means that there must be some
protocol that allows all the computers to communicate and be able to pass the data
from one computer to another. One of the protocols to do this is the hypertext transfer
protocol (http) that is used by the browsers to receive and transmit data. A typical
URL is
http://www.bcs.org.uk/
Here, the URL starts http:// where http tells the browser which protocol to use. The
portion :// is a separator marking off the transmission protocol from the rest. This
URL connects the user to the home page of the British Computer Society. If a
particular piece of data is required, such as a weather forecast, you can specify a
folder to move to directly. This one
http://bbc.co.uk/weather/
loads a page from the directory weather at bbc.co.uk. In turn, this page will have
links to other directories and pages.
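The parts of a URL described above can be separated with Python's standard `urllib.parse` module. The helper name `url_parts` is made up for the illustration.

```python
from urllib.parse import urlsplit


def url_parts(url):
    """Return the protocol, the domain name and the path of a URL."""
    parts = urlsplit(url)
    return parts.scheme, parts.netloc, parts.path
```

For example, `url_parts("http://bbc.co.uk/weather/")` gives `('http', 'bbc.co.uk', '/weather/')`: the protocol, the domain name where the data is stored, and the directory to move to.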
This means that the browser now knows where to look for the data. Links may be
placed so that a user can quickly move around a document or to another document,
which may be at a completely different site. Fig. 3.10 (e)1 shows links to documents
that are at the same site as the document containing the links.
Fig. 3.10 (e)1 (a page headed Smart Cards with a Contents list of links: Definitions,
Applications, The Electronic Purse, Home page)
The links are usually displayed in a different colour to the rest of the text and are
underlined. When you place your pointer on a link, the pointer becomes a pointing
finger. If you now click the mouse button you will be connected to the appropriate
site and the data will be downloaded. (Try it.). In this document, when you leave the
pointer on a link, the URL will be displayed. Fig. 3.10 (e)2 shows part of the page
that is displayed when Applications is clicked on.
Applications
Electronic Purse
Access Control and Security
Travelling
The Future
Smart Card Contents
Home Page
Electronic Purse
This acts like cash. The card can be charged up at modified automatic teller machines
(ATMs), modified BT payphones and at new points installed by the provider. Mondex
is one of the largest suppliers of these smart cards and trials are taking place at Aston,
Exeter and York Universities as well as at Swindon. The card is loaded with
electronic cash and it can then be used to pay for goods and services in a similar way
to using a charge card. The difference being that 'cash' is being transferred from the
card to the retailer. The cards can transfer 'cash' from one card to another. Thus, if two
people, such as a parent and a child, each have a card, the parent can transfer the
child's pocket money from one card to the other.
Another large provider of smart cards is Visa. They produce both disposable and
reloadable cards. Visa Cash, as it is called, can provide secure trading on the Internet
as well as facilities similar to those of Mondex. Click here for more details.
Start of Applications
3.10 (f) Hypertext Mark-up Language (HTML)
Using http, your browser can transfer data between computers. However, the browser
still needs to know how to display the data. This is done by using the hypertext
markup language (HTML). You are not expected to be able to produce detailed
HTML in the examination. However, you may find it useful to remember a few
examples in order to explain an answer to a question.
HTML uses tags to indicate how to display the data. Tags are enclosed in angle
brackets < and >. For example <B>. Some tags have two parts. One indicates the
start point and the other the end point. For example, <B>bold</B> would produce the
word bold in bold type. Similarly, <I>italic</I> would produce the word italic in
italic type.
An HTML document is in two parts called the HEAD and the BODY. What is in the
HEAD is not normally displayed, although some browsers will display a title if it is
included in the HEAD. Level 2 HTML requires users to include a title of up to 64
characters. This is because some search programs enter it in a database so that the
search engine can find it if it contains what the searcher wants. Thus it is a good idea
to include some keywords in the title. The heading tags <H1>…</H1> to
<H6>…</H6> are used to create headings. The layout is decided by the browser, so
blank lines, tabs and extra spaces are ignored. If you want these, you must use tags to
do it. This is because the browser has to fit the output to the display screen attached
to the receiver. These may be set up in many different ways. Fig. 3.10 (f)1 shows a
simple example of HTML. In this piece of HTML the <HR> tags draw a horizontal line
across the page. Line breaks typed into the HTML have no such effect because the Web
browser ignores the carriage return and new line characters.
<HTML>
<HEAD>
<TITLE> An Example of HTML </TITLE>
</HEAD>
<BODY>
<HR>
<H1>An Example of HTML </H1>
<HR>
This piece of text has been produced using HTML. The text may be
<B>bold</B> or <I>italic</I>.
Although this piece of text is on a new line here, it may not be when displayed by the
browser. Remember, the Web browser decides the layout unless tags are used.
</BODY>
</HTML>
The result of a browser running this HTML will vary, but will be something like that
shown in Fig. 3.10 (f)2.
An Example of HTML
This piece of text has been produced using HTML. The text may be bold or
italic. Although this piece of text is on a new line here, it may not be when
displayed by the browser. Remember, the Web browser decides the layout
unless tags are used.
A line space and a thick line precede headings. A line space and a thick line also
follow them.
Exactly how the information is displayed will depend on the browser. Also, some
browsers do not recognise all tags. If a browser encounters an unknown tag, it should
ignore it. However, there is no guarantee of this. The result is that a page that looks
outstanding when you design it may not look very good on a different browser.
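The way a browser discards line breaks, tabs and extra spaces can be imitated with a short sketch. The Python below is only an illustration, not how any real browser works: it uses the standard html.parser module to collect the text between tags and then collapses every run of white space to a single space. The names TextCollector, rendered_text and sample are assumptions made for this example.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the text found between tags and ignores the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for each piece of text between tags.
        self.chunks.append(data)

    def text(self):
        # Runs of spaces, tabs and new lines collapse to one space,
        # which is roughly what a browser does when it lays out text.
        return " ".join("".join(self.chunks).split())


def rendered_text(html):
    """Return the text of an HTML fragment with white space collapsed."""
    collector = TextCollector()
    collector.feed(html)
    collector.close()
    return collector.text()


# Hypothetical fragment containing a new line, extra spaces and a tab.
sample = "The text may be\n<B>bold</B>   or\t<I>italic</I>.\nA new line here."
print(rendered_text(sample))
```

The new line after "may be" and the one before "A new line here" do not survive, which is why tags are needed whenever the layout matters.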
In Fig. 3.10 (e)2 you will see many links, most of which link you to different sites
around the world. For example, Mondex links you to mondex.com, the home page for
Mondex who specialise in applications of smart cards.
To use links as shown in the previous Section, you need to use the anchor tag <A>.
For example, to create the hypertext Smart Cards you write
<A>Smart Cards</A>
in the HTML document. However, this will not create the link; it only creates the
hypertext. This hypertext must now be linked to the site. You do this by giving the
anchor attributes, using a hypertext reference (HREF). This points to where the
document to be displayed is kept. A typical example is shown in Fig. 3.10 (f)3. Note
this only shows the HTML necessary to create the link.
A shortened version can be used if the link is to a document in the same directory as
the one being viewed. In this case the HREF need only contain the file name of the
document. If the document is in a subdirectory of the directory containing the page
being viewed, the HREF contains the subdirectory name, a forward slash and then the
file name.
Links can also be created to points in the same document by using the NAME
attribute.
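How a shortened reference is turned into a full address can be sketched with Python's standard urllib.parse.urljoin function, which applies the usual rules for relative references. The page address, file names and the "top" anchor below are all hypothetical, chosen only to illustrate the three cases described above.

```python
from urllib.parse import urljoin

# Hypothetical address of the page currently being viewed.
current_page = "http://www.example.com/cards/index.html"

# Link to a document in the same directory: the file name is enough.
same_directory = urljoin(current_page, "smart.html")

# Link to a document in a subdirectory: subdirectory name, slash, file name.
subdirectory = urljoin(current_page, "info/smart.html")

# Link to a named point in the same document.
same_document = urljoin(current_page, "#top")

print(same_directory)  # http://www.example.com/cards/smart.html
print(subdirectory)    # http://www.example.com/cards/info/smart.html
print(same_document)   # http://www.example.com/cards/index.html#top
```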
Inserting an image for interest is done by means of the <IMG> tag, which has no end
tag. You must specify where the image is stored, known as the source (SRC). The
source may be given relative to a BASE address that has been set in the HEAD of the
document. If you want the image to be a hypertext link, place the <IMG> tag between
the <A> and </A> tags of an anchor.
3.10 (g) Electronic Mail (email)
Electronic mail is a fast and cheap method of corresponding with others. It does not
matter what time you send it: you do not have to consider that at 08:00 in London it is
only 03:00 in New York. Also, email can be delivered when nobody is available to
receive it. The facilities offered by email are numerous as are their advantages.
Electronic mail systems allow the user to compose mail and to attach documents, in
many formats, to the message. Suppose several people are working on different
chapters of a book. It is easy for them to pass their work to one another as an
attachment so that others can make comments and revisions before returning them.
This book was created in this way. The ability to attach all kinds of documents can
prove very useful. The author of this Chapter uses email to collect homework.
Students can word process their work and send it as an attachment. I can then mark it
and return my comments. Even better, students attach programs they have been asked
to write and I can run them to see if they work!
Often emails are sent to people who need to pass the message on to someone else.
This is easy as there is a forward facility with all email services. All the user has to
do when an email is to be passed on to someone else is to click a button, enter the
email address and press the Send button.
It is easy to reply to an email as you only have to click a Reply button and the original
sender's address automatically becomes the address to which the reply is to be sent.
Another useful facility is the ability to send the same email (and
attachments) to a group of people. For example, if I wish to send a message to the
whole of one of my classes I can do this. All that is necessary is for me to create a
group by inserting in it the email addresses of all the students in the class. I can then
type the message once and send it to the whole group by means of a single click on
Send.
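Composing one message for a whole group, with an attachment, can be sketched using Python's standard email.message.EmailMessage class. This is only an illustration of the idea: the function name compose_group_email, the addresses and the file name are all hypothetical, and the sketch builds the message without sending it.

```python
from email.message import EmailMessage


def compose_group_email(sender, group, subject, body,
                        attachment_name, attachment_text):
    """Build one message addressed to every member of the group."""
    message = EmailMessage()
    message["From"] = sender
    # One message, many recipients: the group is a list of addresses.
    message["To"] = ", ".join(group)
    message["Subject"] = subject
    message.set_content(body)
    # Attach a plain-text document to the message.
    message.add_attachment(attachment_text, filename=attachment_name)
    return message


# Hypothetical class group and message.
class_group = ["amy@example.com", "ben@example.com", "cara@example.com"]
message = compose_group_email(
    "teacher@example.com",
    class_group,
    "Homework",
    "Please read the attached notes before the next lesson.",
    "notes.txt",
    "Notes on HTML tags.",
)
print(message["To"])
```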
Users of email can also set message priorities and request confirmation of receipt.
It is also possible to use voice mail in a similar way to email. In this case the spoken
message is digitised and stored electronically on a disk. When the recipient checks
for mail, the digitised form is turned back into sound and the receiver can hear the
message. These messages can also be forwarded, stored and replied to.
3.10 (h) Confidentiality of Data
Once an organisation opens some of its network facilities up, there is a problem of
confidentiality of data. An organisation may well wish that potential customers have
access to their product database. However, they will not want them to have access to
employee files.
A first step is to encrypt the confidential data and this is addressed in the next Section.
Another solution is to install firewalls. These sit between WANs and LANs. The
firewall uses names, Internet Protocol addresses, applications, and so on that are in the
incoming message to authenticate the attempt to connect to the LAN. There are two
methods of doing this. These are proxies and stateful inspection. Proxies stop the
packets of data at the firewall and inspect them before they pass to the other side.
Once the packets have been checked and found to be satisfactory, they are passed to
the other side. The message does not pass through the firewall but is passed to the
proxy. This method tends to degrade network performance but offers better security
than stateful inspection.
Stateful inspection tracks each packet and identifies it. To do this, the method uses
tables to identify all packets that should not pass through the firewall. This is not as
secure as the proxy method because some data do pass through the firewall.
However, the method uses fewer network resources.
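The idea of using tables to identify packets that should not pass can be sketched as follows. This is a much simplified illustration and not a real firewall: the Packet class, the may_pass function, and the blocked addresses and port are all assumptions made for the example, and a real product would track far more about each connection.

```python
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    """A hypothetical packet, reduced to its source address and port."""
    source: str
    port: int


# Tables of things that must not pass (example values only).
BLOCKED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
BLOCKED_PORTS = {23}


def may_pass(packet):
    """Return True if the packet matches no entry in the tables."""
    address = ipaddress.ip_address(packet.source)
    if any(address in network for network in BLOCKED_NETWORKS):
        return False
    return packet.port not in BLOCKED_PORTS


print(may_pass(Packet("198.51.100.7", 80)))  # allowed source and port
print(may_pass(Packet("203.0.113.9", 80)))   # source is in a blocked network
```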
3.10 (i) Encryption, Authorisation and Authentication
Authentication is used so that both parties to the message can be certain that the other
party is who they say they are. This can be done by using digital signatures and
digital certificates. Digital signatures require encryption. Basically, a digital signature
is a code that is attached to a message.
In order to understand how public key cryptography works, suppose Alice and Bob
wish to send secure mail to each other:
• First, both Bob and Alice need to create their public/private key pairs. This is
usually done with the help of a Certification Authority (CA).
• Alice and Bob then exchange their public keys. This is done by exchanging
certificates.
• Bob can then use his private key to digitally sign messages, and Alice can
check his signature using his public key.
• Bob can use Alice's public key to encrypt messages, so that only she can
decrypt them.
A primary advantage of public-key cryptography is the application of digital
signatures, which help combat repudiation, i.e. denial of involvement in a transaction.
Since the owner keeps their private key secret, anything signed using that key can
only have been signed by the owner.
The predominant public-key algorithm is RSA, which was developed in 1977 by, and
named after, Ron Rivest, Adi Shamir, and Leonard Adleman. The RSA algorithm is
included as part of Web browsers from Netscape and Microsoft and also forms the
basis for many other products.
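The exchange between Alice and Bob can be illustrated with a toy version of RSA. The sketch below is for illustration only: the primes are far too small to be secure, messages are represented as small whole numbers, and real systems add padding and hashing that are omitted here. The function names and the chosen primes are assumptions made for this example, and the three-argument form pow(e, -1, phi) needs Python 3.8 or later.

```python
def make_keys(p, q, e=17):
    """Build a toy RSA key pair from two small primes p and q."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)          # private exponent: inverse of e modulo phi
    return (e, n), (d, n)        # (public key, private key)


def apply_key(value, key):
    """Raise value to the key's exponent, modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


def encrypt(message, public_key):
    return apply_key(message, public_key)


def decrypt(cipher, private_key):
    return apply_key(cipher, private_key)


def sign(message, private_key):
    return apply_key(message, private_key)


def verify(message, signature, public_key):
    # The signature is genuine if the public key turns it back into the message.
    return apply_key(signature, public_key) == message


# Each person has their own key pair (tiny example primes).
bob_public, bob_private = make_keys(61, 53)
alice_public, alice_private = make_keys(47, 59)

message = 65  # a message represented as a number smaller than both moduli

# Bob signs with his private key; Alice checks with Bob's public key.
signature = sign(message, bob_private)
print(verify(message, signature, bob_public))

# Bob encrypts with Alice's public key; only Alice's private key decrypts it.
cipher = encrypt(message, alice_public)
print(decrypt(cipher, alice_private))
```

Because only Bob holds his private key, a signature that verifies against his public key could only have come from him, which is the property that combats repudiation.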