Mark Gregory
Designing and Building Access
Database Systems
Mark Gregory
École Supérieure de Commerce de Rennes
Previously,
DESIGNING AND BUILDING ACCESS DATABASE SYSTEMS
MARK GREGORY
PREVIOUSLY,
Edition 0
March 2000
Edition 7
October 2009
1.8. Limitations
1.9. Acknowledgements
2. A BRIEF INTRODUCTION TO DATABASES
2.1. Databases (bases de données) and how they are designed
2.5. Background
2.13. Attribute
2.18. Queries
3.5. Decide the purpose and basic contents of the database – Data Modelling
3.5.1. Basic Constructs of ER Modelling
3.5.2. Deciding entity types
3.5.3. Entities
3.5.4. Relationships
3.5.5. Fields: What are the attributes of each entity?
3.5.6. Data type: Domain
3.5.7. Identify Domains
3.5.8. Classifying Relationships
3.5.9. Keys: primary and secondary (“foreign”)
3.5.10. Normalisation
3.5.11. ER Notation
3.5.12. Online tutorial
3.5.13. DFDs and ERDs – why both? How are they linked?
3.5.14. Why BOTH Data and Process models?
3.9. How will Input / Update be carried out (Forms etc.)?
4.2. An Exercise
7.7. Relationships
7.7.1. Relationships and linking: Enforcing referential integrity where appropriate
SECTION 3 – THE ANYTOWN DISTANCE LEARNING BUSINESS SCHOOL EXAMPLE
16. PROCESSES
16.1. Process Applicants
17. DOCUMENTS
17.1. Course Description
17.1.1. List of Modules
22. LEVEL 1 DFD
28. REFERENCES
28.1. Basics of structured analysis
30.4. The components of a DFD
31.5. Conclusion
2. APPENDIX 5: ACCESS HINTS - DESIGNING FOR USE
2.1. Getting more help
2.3. Some difficulties associated with forms and subforms and how to overcome them
2.5. Detail subform does not show the subset of records based on the value of the current master form record
3. APPENDIX 6: NORMALISATION
3.1. Introduction to Normalisation
3.2. Introduction
3.4. Terminology
3.4.1. Records
3.4.2. Field names
3.4.3. Keys
1. Introduction: Who is this document for?
1.1. Preface
This booklet aims to help you learn how to design and build applications using Microsoft
Access. It is written to be read and understood as you work on your own
design and build experiments.
This Access database design and implementation document is a higher-level self-instruction
booklet; it is assumed that you are already a fairly competent Access user.
If you need to learn how to use Microsoft Access, please see section 1.6 for further advice.
In this section, we summarise what we consider to be the basic knowledge and ability you
need to have in Microsoft Access.
The most important first step is to take a first step! Get hold of a copy of Access and start to
use it. As you do so, tick off the various things on the list below. You can start reading this
Designing and Building Access Database Systems guide in parallel, but please bear in mind
that you cannot fully understand what is in this book without actually testing your practical
ability and knowledge.
1.9. Acknowledgements
I should like to thank:
♦ Former Huddersfield colleagues Dr. Steve Wade and Dr. Ken
Lunn
♦ ESC Rennes colleagues, notably Dr. Renaud Macgilchrist
♦ Previous ESC Rennes students
The following students gave me permission to reuse parts of their excellent
work on the Anytown Business School group case. I have incorporated this
case as a worked example in this document, and made significant use of
these students’ work:
Marine CORRE; Marie GALATAUD; Emmanuelle HAMEURY;
Naïla MALTI
SECTION 1 - THE PRINCIPLES OF DATABASE
[Figure: a Data Source sends Data to the Data Processing System, which sends Information to a Recipient; the system stores data in, and retrieves data from, the Database.]
This diagram, which shows the structure of a data processing system (a synonym for business
information system), highlights the central importance of the database as the place where
data is stored and from which it is retrieved.
2.4. Why study Databases?
2.5. Background
Some understanding of what a database is, how it is used, and (to a greater or lesser extent)
how databases are designed is essential to understanding electronic business.
Businesses are systems; they use Information Systems, which are based on Information and
Communications Technology.
Example: any e-commerce company provides a Web window onto its internal catalogue,
that is, a web page connected to a database.
Every stakeholder needs information from the business. They generally obtain this as
information presented on forms (screens), reports and dynamic web pages (web pages which
show the current contents of a database and permit stakeholders to update that database).
Relative strengths and weaknesses of Word, Excel and Access for storing data

| Method | Advantages | Disadvantages |
|---|---|---|
| Word processing: e.g. Word | Simple, well understood by people with weak computing skills | No formulae (or only very rudimentary ones) |
| | Excellent formatting options | Tables are not related in any way. Can only be updated by one person at a time. The data in a table has no “structure” known to the computer. |
| Spreadsheet: e.g. Excel | Some degree of structure – cells organised into rows and columns, with links possible between the cells | Persistent data is not safe. |
| | Very powerful data manipulation using formulae | Size limits – 65,536 rows (until Office 2007). |
| | Separate tables can be held in different worksheets | No design methodology or coherence – it is possible and easy to mix data up in a way which makes it impossible to find, update and relate. |
| | Items of data can be related together using lookup formulae such as VLOOKUP (RECHERCHEV) and HLOOKUP (RECHERCHEH) | Poor support for queries – searching is slow, and the lookup formulae are far from being intuitive. |
| | | Can only be updated by one person at a time |
| Database: e.g. Access | Each kind of data is stored by the database management system (DBMS) in its own separate table. The tables are related together in accordance with the Relational data model – this gives coherence to the collection of tables, which is the whole database | More difficult to use and to learn (at first) |
| | Very powerful data structuring and querying. In fact a query is just a results table which combines together selected data from more than one stored table. The database program enables users to say what data they need by constructing a query which precisely specifies what data is to be retrieved into the results table | Requires thoughtful use and advance planning |
| | Safer persistent data (though less safe than bigger, more powerful DBMS programs like Microsoft SQL Server, ORACLE etc) | Access databases are not directly web-accessible |
| | Is multi-user: that is, more than one person at a time can change (update) the database | |
| | Since every record in a table has the same basic structure, it is much easier and / or more cost-effective to process complete sets of records under program control | But the programming language within Access, VBA, is too difficult and / or inappropriate for most business users to learn. |
Figure 1 Comparative strengths and weaknesses of data storage in two dimensional tables:
Microsoft Office tools
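| Order number | Customer name | Customer address | Product code | Product description | Unit of sale | Price per unit | Quantity | Amount |
|---|---|---|---|---|---|---|---|---|
| O001 | GREGORY Mark | 1 La Rue | P001 | Apples | kg | 0,80 € | 2,5 | 2,00 € |
| O002 | GREGORY Mark | 1 La Rue | P876 | Oranges | kg | 0,90 € | 1 | 0,90 € |
| O003 | MACGILCHRIST Renaud | 1 La Croix | P001 | Apples | kg | 0,80 € | 3 | 2,40 € |
| O004 | GREGORY Mark | 1 La Rue | P001 | Apples | kg | 0,80 € | 2 | 1,60 € |
| O005 | MACGILCHRIST Renaud | 1 La Croix | P876 | Oranges | kg | 1,05 € | 1,5 | 1,58 € |
| O006 | GREGORY Mark | 11 La Rue | P001 | Apples | kg | 0,90 € | 2 | 1,80 € |
| O007 | GOT Guillaume | 1 L’Avenue | P001 | Apples | kg | 0,90 € | 3 | 2,70 € |
| O008 | GREGORY Mark | 99 Le Chemin | P876 | Oranges | kg | 0,90 € | 1,5 | 1,35 € |

Callouts on the original figure: “Mistyped address” and “Changed address” (compare the addresses recorded for GREGORY Mark on orders O006 and O008 with those on his earlier orders).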
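The weakness of this flat table, where the customer's address is repeated on every order row, can be checked mechanically. The following is a minimal Python sketch, not part of the original booklet; the function names are hypothetical, and the rows are copied from the table above (with a plain apostrophe in the one address that contains one).

```python
from collections import defaultdict

# (order number, customer name, customer address), copied from the flat table.
orders = [
    ("O001", "GREGORY Mark", "1 La Rue"),
    ("O002", "GREGORY Mark", "1 La Rue"),
    ("O003", "MACGILCHRIST Renaud", "1 La Croix"),
    ("O004", "GREGORY Mark", "1 La Rue"),
    ("O005", "MACGILCHRIST Renaud", "1 La Croix"),
    ("O006", "GREGORY Mark", "11 La Rue"),
    ("O007", "GOT Guillaume", "1 L'Avenue"),
    ("O008", "GREGORY Mark", "99 Le Chemin"),
]

def addresses_by_customer(rows):
    """Collect every distinct address recorded against each customer name."""
    found = defaultdict(set)
    for _order_no, name, address in rows:
        found[name].add(address)
    return dict(found)

def inconsistent_customers(rows):
    """Return the customers whose orders carry more than one address."""
    return {
        name: sorted(addresses)
        for name, addresses in addresses_by_customer(rows).items()
        if len(addresses) > 1
    }

conflicts = inconsistent_customers(orders)
for name, addresses in conflicts.items():
    print(name, "->", addresses)
```

The check cannot tell a typing error from a genuine change of address; it only shows that one customer now has several addresses on file. Storing each customer once, in a table of their own, removes the opportunity for this kind of disagreement.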
∗ PRODUCT table

| Product code | Product description | Unit of sale | Standard price per unit |
|---|---|---|---|
| P001 | Apples | kg | 0,80 € |
| P876 | Oranges | kg | 0,90 € |

Callout on the original figure: Price per unit is on both tables! One is standard, the other order-specific.
∗ ORDER table

| Order number | Customer number | Product code | Actual price per unit | Quantity | Amount |
|---|---|---|---|---|---|
| O001 | C001 | P001 | 0,80 € | 2,5 | 2,00 € |
| O002 | C001 | P876 | 0,90 € | 1 | 0,90 € |
| O003 | C002 | P001 | 0,80 € | 3 | 2,40 € |
| O004 | C001 | P001 | 0,80 € | 2 | 1,60 € |
| O005 | C002 | P876 | 1,05 € | 1,5 | 1,58 € |
| O006 | C001 | P001 | 0,90 € | 2 | 1,80 € |
| O007 | C003 | P001 | 0,90 € | 3 | 2,70 € |
| O008 | C001 | P876 | 0,90 € | 1,5 | 1,35 € |
♦ This still isn’t perfect, since Orders and their Details continue to
be mixed together in one table. 1
2.12. An example: Students by Programme
[Figure: two panels, “Entity relationship diagram” and “Sample data”.]
2.13. Attribute
An attribute is a property of an entity, a single fact about the entity. An entity type will
normally have several different attributes, one (or occasionally more) of which uniquely
identifies every instance of the entity type. The identifying attribute or group of attributes is
called the primary key for the entity type.
The attributes of Programme are Programme Code (primary key), Programme Name, and
Programme leader.
The attributes of Student are Number (primary key), First name, Last name, and Programme
Code (foreign key).
Programme code has to be present as a foreign key in the student entity in order to represent
the relationship which exists between programme and student.
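The roles of the primary key and the foreign key can be tried out in a few lines. This is a minimal sketch that uses Python's built-in sqlite3 module rather than Access (an assumption made so the example is self-contained); the table and column names follow the attributes listed above, while the sample rows and the helper try_insert_student are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE programme (
    programme_code   TEXT PRIMARY KEY,
    programme_name   TEXT,
    programme_leader TEXT
);
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    first_name     TEXT,
    last_name      TEXT,
    programme_code TEXT REFERENCES programme(programme_code)  -- foreign key
);
""")

# Hypothetical sample rows, for illustration only.
conn.execute("INSERT INTO programme VALUES ('P01', 'Example Programme', 'Leader A')")
conn.execute("INSERT INTO student VALUES (1001, 'Ann', 'Example', 'P01')")

def try_insert_student(number, first_name, last_name, programme_code):
    """Return True if the row is accepted, False if a key rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO student VALUES (?, ?, ?, ?)",
            (number, first_name, last_name, programme_code),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(try_insert_student(1002, "Bob", "Sample", "P01"))  # new number, known programme
print(try_insert_student(1002, "Cy", "Repeat", "P01"))   # student number already used
print(try_insert_student(1003, "Di", "Orphan", "P99"))   # programme code does not exist
```

The second insert fails because the primary key must identify every student uniquely; the third fails because the foreign key must match an existing programme.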
2.18. Queries
The purpose of a database is to enable users to get the specific information they need. This
can be done using queries. Queries are both useful in themselves, and also are used as the
basis for reports and for forms.
♦ To answer a question like: who is programme leader for a
given student? We can get all the necessary information by a
query on both tables - programme and student
♦ Note that the name of the programme leader should be an
attribute of programme, and definitely NOT of student!
Answering such a question is the work of the relational database management system
software (RDBMS). A user of the database formulates a query, and the RDBMS looks up the
matching occurrences in both entity types, programme and student, and joins them together
into a single result presented to the user.
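The question "who is programme leader for a given student?" can be sketched as a query over both tables. The example below is illustrative only: it uses Python's sqlite3 module instead of Access, and the sample rows and the function programme_leader_for are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE programme (
    programme_code   TEXT PRIMARY KEY,
    programme_name   TEXT,
    programme_leader TEXT
);
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    first_name     TEXT,
    last_name      TEXT,
    programme_code TEXT REFERENCES programme(programme_code)
);
-- Hypothetical sample rows.
INSERT INTO programme VALUES ('P01', 'Example Programme A', 'Leader A');
INSERT INTO programme VALUES ('P02', 'Example Programme B', 'Leader B');
INSERT INTO student VALUES (1001, 'Ann', 'Example', 'P01');
INSERT INTO student VALUES (1002, 'Bob', 'Sample', 'P02');
""")

def programme_leader_for(student_number):
    """Look up the leader of the programme that the given student is on."""
    row = conn.execute(
        """
        SELECT p.programme_leader
        FROM student AS s
        JOIN programme AS p ON p.programme_code = s.programme_code
        WHERE s.student_number = ?
        """,
        (student_number,),
    ).fetchone()
    return row[0] if row else None

print(programme_leader_for(1001))
```

Because the leader's name is stored once, on the programme, changing it there changes the answer for every student on that programme.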
[Figure: entity relationship diagram with two boxes, Module and Student, joined by a line with a crow's foot at both ends.]
In this diagram, the two rectangular boxes represent entity types. Here, they are Module and
Student. The relationship is Many-to-Many. The diagram represents this relationship by using
a line with a crow's foot at both ends of it. The end of the crow's foot represents the many end
of the many-to-many relationship, often represented simply as M:M or M:N.
This model reflects the empirical observations that:
1. Any one student studies many modules
2. Any one module has many students
Many-to-Many relationships are very common. They are also problematical – this is because
actual database management systems like Access (and almost all others) cannot support
Many-to-Many relationships directly.
However, by following simple rules, it is possible to eliminate many-to-many relationships.
The many-to-many relationship is removed by:
♦ Introducing a link or intersection entity
♦ Drawing 1-to-many links FROM each original entity TO the new
one
∗ Note that the primary key of EACH parent entity
becomes part of the COMPOUND primary key of the
link entity
Here, the primary key of Module is Module Code, and that of
Student is Student No. Together they form the compound primary
key of the Registration entity
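[Figure: entity relationship diagram resolving the many-to-many relationship, with sample data.
Module: PK Module code
Module Registration: PK,FK1 Module code; PK,FK2 Student no; Module result
Student: PK Student no; Surname; Forenames
Sample Module Registration rows (Module code, Student no, Module result):
IS402E 20099235 B
IS402E 20099897 C
OB401E 20099234 Fx]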
♦ Note that there is only one primary key, made up of two
attributes
∗ Neither Module code nor Student no is unique in the
Module registration table – but the combination is
unique
The exception arises when a student is allowed to take a module a second
time; it is then necessary to add a further attribute, usually a date,
to the compound primary key in order to make it unique again.
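The link entity and its compound primary key can be sketched as follows. This is an illustration in Python's sqlite3 module, not Access; the column names follow the figure, the three registration rows are the sample rows from the figure, and the helper register is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE module  (module_code TEXT PRIMARY KEY);
CREATE TABLE student (student_no INTEGER PRIMARY KEY, surname TEXT, forenames TEXT);
CREATE TABLE module_registration (
    module_code   TEXT    NOT NULL REFERENCES module(module_code),
    student_no    INTEGER NOT NULL REFERENCES student(student_no),
    module_result TEXT,
    PRIMARY KEY (module_code, student_no)   -- one key made up of two attributes
);
""")
conn.executemany("INSERT INTO module VALUES (?)", [("IS402E",), ("OB401E",)])
conn.executemany(
    "INSERT INTO student (student_no) VALUES (?)",
    [(20099234,), (20099235,), (20099897,)],
)
conn.executemany(
    "INSERT INTO module_registration VALUES (?, ?, ?)",
    [("IS402E", 20099235, "B"), ("IS402E", 20099897, "C"), ("OB401E", 20099234, "Fx")],
)

def register(module_code, student_no, result=None):
    """Return True if the registration is accepted, False if a key rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO module_registration VALUES (?, ?, ?)",
            (module_code, student_no, result),
        )
        return True
    except sqlite3.IntegrityError:
        return False

print(register("OB401E", 20099235))       # same student, different module
print(register("IS402E", 20099235, "A"))  # same module and student a second time
```

The first call succeeds because only the combination has to be unique; the second is rejected, which is exactly the situation that calls for a further key attribute if retakes are allowed.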
2.22. Towards a more complete entity relationship attribute model
As analysis proceeds, the model is gradually refined and improved. Still incomplete, it might
look like this:
[Figure: a more complete entity relationship attribute model. Entities and attributes recoverable from the diagram:
Programme: Programme name; LMD level; Programme leader surname; Programme leader forenames
Qualification
Student: Student no (PK); Student surname; Student forenames; Programme code (FK1); Student gender; Student birthdate
Award: Qualification (PK, FK1); Student no (PK, FK2); Award result
Module: Module code (PK)
Module Operation: Module code (PK, FK1); Module year (PK); semester; Module leader
Module Registration: Module grade; Module mark]
Note that this model has introduced a number of changes:
♦ There is greater precision in the attribute names chosen
♦ We wish to record a student’s qualifications, so we have
introduced Qualification
♦ Because a many-to-many relationship exists between student
and qualification, an intermediate (link) entity has been
introduced; we observe that in the real world a specific award
is given to each student who qualifies in something, so we’ve
called the link entity Award
♦ We observe that many modules are offered and “run” (that is,
they occur and are taught) for several years in succession (and
sometimes in more than one semester in a year), and further
that in some cases students take a module one year, fail it,
and do it again in a subsequent year; therefore we introduce a
Module Operation, the run of a module in a given year and
semester
♦ The model remains incomplete but it’s now good enough to be
worth prototyping (building and testing) in Access – so that we
can check that it meets our needs for storing data and (above
all) retrieving information in a very flexible way
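The effect of introducing Module Operation can be prototyped quickly. The sketch below uses Python's sqlite3 module rather than Access. It assumes, since the model is described as still incomplete, that a registration is keyed by the module run (module code, year, semester) plus the student number; the Student table is left out for brevity, and the years, grades and helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE module (module_code TEXT PRIMARY KEY);
CREATE TABLE module_operation (
    module_code   TEXT    NOT NULL REFERENCES module(module_code),
    module_year   INTEGER NOT NULL,
    semester      INTEGER NOT NULL,
    module_leader TEXT,
    PRIMARY KEY (module_code, module_year, semester)
);
CREATE TABLE module_registration (
    module_code  TEXT    NOT NULL,
    module_year  INTEGER NOT NULL,
    semester     INTEGER NOT NULL,
    student_no   INTEGER NOT NULL,
    module_grade TEXT,
    module_mark  REAL,
    PRIMARY KEY (module_code, module_year, semester, student_no),
    FOREIGN KEY (module_code, module_year, semester)
        REFERENCES module_operation (module_code, module_year, semester)
);
""")
conn.execute("INSERT INTO module VALUES ('IS402E')")
conn.executemany(
    "INSERT INTO module_operation VALUES (?, ?, ?, ?)",
    [("IS402E", 2008, 1, "Leader A"), ("IS402E", 2009, 1, "Leader A")],
)

def register(module_code, module_year, semester, student_no, grade=None):
    """Return True if the registration is accepted, False if a key rule rejects it."""
    try:
        conn.execute(
            "INSERT INTO module_registration "
            "(module_code, module_year, semester, student_no, module_grade) "
            "VALUES (?, ?, ?, ?, ?)",
            (module_code, module_year, semester, student_no, grade),
        )
        return True
    except sqlite3.IntegrityError:
        return False

def attempts(student_no, module_code):
    """List a student's runs of one module as (year, semester, grade), oldest first."""
    return conn.execute(
        "SELECT module_year, semester, module_grade FROM module_registration "
        "WHERE student_no = ? AND module_code = ? ORDER BY module_year, semester",
        (student_no, module_code),
    ).fetchall()

register("IS402E", 2008, 1, 20099235, "Fx")  # first attempt
register("IS402E", 2009, 1, 20099235, "B")   # retake in a later year
print(attempts(20099235, "IS402E"))
```

Both attempts are stored because the year and semester are now part of the key, while a second registration for the same run, or for a run that was never offered, is still refused.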
I created it using the query design wizard (assistant) in Access. In the simple query wizard, I
specified fields from the student table and from the programme table which I wished to
appear in the result. Here, I wanted a list of students with details of the programme they are
following. The screen shot shows the resulting query: it indicates that there are two tables
which are joined together in the preparation of the result, and it also indicates which fields
take part in the result.
How has Access created this result? Probably something like this: it reads each record in the
student table. One of the attributes of student is the programme code. Programme code is the
foreign key in the student table; it is also the primary key in the programme table. Access
looks up the details from the programme corresponding to the programme code for each
student record. In effect, it joins together the two tables on the basis of the linking foreign
key.
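The lookup described above can be imitated in plain Python. This is only a sketch of the idea of a join on a foreign key, with hypothetical sample data and a hypothetical function name; it makes no claim about how Access actually executes the query internally.

```python
# Programmes indexed by their primary key (programme code).
programmes = {
    "P01": {"programme_name": "Example Programme A", "programme_leader": "Leader A"},
    "P02": {"programme_name": "Example Programme B", "programme_leader": "Leader B"},
}

# Each student record carries the programme code as a foreign key.
students = [
    {"student_number": 1001, "last_name": "Example", "programme_code": "P01"},
    {"student_number": 1002, "last_name": "Sample", "programme_code": "P02"},
    {"student_number": 1003, "last_name": "Another", "programme_code": "P01"},
]

def join_students_to_programmes(student_rows, programme_index):
    """Read each student and look up the programme matching its foreign key."""
    result = []
    for student in student_rows:
        programme = programme_index.get(student["programme_code"])
        if programme is None:
            continue  # no matching programme: the row is left out of the result
        result.append({**student, **programme})
    return result

joined = join_students_to_programmes(students, programmes)
for row in joined:
    print(row["student_number"], row["last_name"], row["programme_name"])
```

Each result row combines the student's own fields with the fields of the one programme that the foreign key points to.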
2.30. Why it’s important to relate entities
Construction of the query in the previous example was eased because a proper design process
had been undertaken in order to determine what entity types would be represented in the
database, and how they would be related. This design process resulted in the simple entity
relationship model presented in section 2.22. A line was used to link the two entity boxes
together; this line had a crow's foot at the many end of the one to many relationship which
analysis indicated exists between programme and student.
So one of the most important results of analysis is to establish what the entity types are, and how
they are related. Relationships are links between entities which are significant for this type of
information and which are normally true in reality. A relationship can have a name: actually,
it can have two, one read from one end of the link, the other from the other end. In the earlier
example of Student and Programme, we can recognise two relationships - "student is on
programme" and "programme enrols student". The relationship is said to have a degree of
1:M, which can be read one to many. In this case, the relationship is said to be mandatory:
that is to say, a student is not a student if they are not on a programme, and a programme is
not a programme if it has no students. (The second assertion may not always be the case, and
it is possible to represent a relationship as being optional from one or both ends.)
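The two ends of a mandatory relationship are checked in different ways. The sketch below uses Python's sqlite3 module rather than Access, with hypothetical sample rows and helper names: a NOT NULL foreign key makes the student end mandatory, while the programme end ("a programme is not a programme if it has no students") is not enforced by the table design here and is instead checked with a query.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE programme (programme_code TEXT PRIMARY KEY, programme_name TEXT);
CREATE TABLE student (
    student_number INTEGER PRIMARY KEY,
    last_name      TEXT,
    -- NOT NULL: every student must be on a programme (mandatory at this end).
    programme_code TEXT NOT NULL REFERENCES programme(programme_code)
);
-- Hypothetical sample rows.
INSERT INTO programme VALUES ('P01', 'Example Programme A');
INSERT INTO programme VALUES ('P02', 'Example Programme B');
INSERT INTO student VALUES (1001, 'Example', 'P01');
""")

def accepts_student(number, last_name, programme_code):
    """Return True if the student row is accepted by the table rules."""
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?)",
                     (number, last_name, programme_code))
        return True
    except sqlite3.IntegrityError:
        return False

def programmes_without_students():
    """List programme codes that currently have no student at all."""
    rows = conn.execute(
        """
        SELECT p.programme_code
        FROM programme AS p
        LEFT JOIN student AS s ON s.programme_code = p.programme_code
        WHERE s.student_number IS NULL
        ORDER BY p.programme_code
        """
    ).fetchall()
    return [code for (code,) in rows]

print(accepts_student(1002, "Nowhere", None))  # student with no programme
print(programmes_without_students())
```

Dropping NOT NULL from the foreign key would make the student end optional, which is how a relationship that is optional from one end can be represented.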
It is absolutely critical to identify the primary key for each entity type, and to ensure that
there is a foreign key at the many end of any one to many relationship which is discovered
as you think about how the entity types are related.
2
UML is the Unified Modelling Language, a set of notations largely used by information systems
professionals and particularly associated with a style of programming called Object Oriented or OO. The only
UML notation we employ in this module is the Use Case diagram, UCD.
3
However, it is a serious error to use a spreadsheet when a database is necessary. Please see appendices 31
and 1 for a discussion of reasons why a database is often superior to a spreadsheet.
blown database is appropriate, you need to consider the steps outlined in the main
parts of this section, many of them associated with a particular method. 5
3.1.3. Assumptions
The approach described in this document is applicable only to relatively small
applications: such as proof-of-concept prototype systems or perhaps end user
computing systems. So:
∗ The requirement is relatively small scale
E.g. the specific needs of the department in which you work; or
(part or all) of a small business
∗ A prototype (and perhaps target system) can be
implemented using Microsoft Access or a similar end-
user orientated database
Even if it’s too large for Access, people often create an initial (or
“prototype”) system in Microsoft Access. This is then used to
establish the complete requirements for an eventual full, or
“target” system. Or the target system may be sufficiently small to
be realisable using Microsoft Access.
∗ You are acting as the Analyst or System Designer
This document exists to help people design an effective database
application. In business, it is normal to distinguish between those
who use a system, the so-called users, and those who analyse,
design and implement a system – the developers.
This document treats you throughout as though you were acting in
the developer role.
What if you are the user as well as the developer?
Then you are in the situation sometimes described as end user
development, where a business person or student develops a
system for their own use and perhaps also for the use of other
members of their team or department.
Wherever possible, get someone else – e.g. a member of your
team – to act in the role of a true system user. Their perspective
may be different but also complementary.
∗ You’re an entrepreneur and you want to build a
business
Then you need to know how to analyse and build what you
need
♦ Dataflow diagrams
This technique is intended specifically for use with business users, and it is
reasonably visual. It also breaks large problems down into smaller, more-
manageable ones. It is therefore a very good basis for a dialogue between
you as system users and IS professionals.
3.1.9. SSADM
SSADM itself is less widely used than once it was but remains
important, not least because it is relatively easy for business
people to understand when compared with more modern
techniques.
For a good worked example of all SSADM techniques, please
see http://www.systemsanalysis.org.uk/ accessed 24/11/2008.
Wikipedia (accessed 26/02/2008) has a useful summary of SSADM (Structured
Systems Analysis and Design Method):
http://en.wikipedia.org/wiki/Structured%20Systems%20Analysis%20and%20Design%20Method
The following material was found at http://www.edrawsoft.com/SSADM.php
accessed 03/01/2009.
♦ Introduction - Structured Systems Analysis and Design
Methodology (SSADM)
SSADM (Structured Systems Analysis and Design Method) is
another method dealing with information systems design. It was
developed in the UK by the CCTA (Central Computer and
Telecommunications Agency) in the early 1980s. It is the UK
government's standard method for carrying out the systems
analysis and design stages of an information technology project.
SSADM has traditionally been used for the development of medium
or large systems. However, one variant of SSADM is 'Micro SSADM',
which is for small systems. SSADM starts from defining the
information system strategy and then develops a feasibility study
module. These are followed by requirements analysis, requirements
specification, logical system specification and a final physical
system design.
♦ Structured Systems Analysis and Design Methodology
(SSADM) Stages
SSADM consists of 5 main stages (which are broken down into several
sub-stages). The 5 main stages are:
♦ Feasibility Study
The Feasibility Study involves a high level analysis of a
business area to determine whether it’s feasible to develop
a particular system. Data Flow Modelling and (high-level)
Logical Data Modelling can be used as techniques during this
stage.
♦ Requirements Analysis
In the Requirements Analysis stage requirements are
identified and the current business environment is
modelled, business system options are produced and
presented. One of these options will be chosen then refined.
Data Flow Modelling and Logical Data Modelling can be used
as techniques during this stage.
♦ Requirements Specification
In the Requirements Specification the functional and non-
functional requirements are specified as a result of the
previous stage. Data Flow Modelling, Logical Data Modelling
and Entity Event Modelling can be used as techniques during
this stage.
♦ Logical System Specification
In the Logical System Specification the development and
implementation environment are specified, and the logical
design of update and enquiry processing and system
dialogues are carried out.
♦ Physical Design
During the Physical Design the logical system specification
and technical specification are used to create a physical
design and a set of program specifications.
♦ Applicability of SSADM
Unlike rapid application development, which conducts steps in
parallel, SSADM builds each step on the work that was prescribed in
the previous step with no deviation from the model. Because of the
rigid structure of the methodology, SSADM is praised for its control
over projects and its ability to develop better quality systems. Most
current developers find it too onerous in its application, however.
3.1.10. MERISE
This is a French equivalent to SSADM.
See, for example,
http://www.commentcamarche.net/merise/concintro.php3 accessed
24/11/2008.
3.5.3. Entities 6
Entities are the principal data objects about which information is to be collected.
Entities are usually recognizable concepts, either concrete or abstract, such as
persons, places, things, or events which have relevance to the database. Some specific
examples of entities are Employees and Projects. An entity is analogous to a table in
the relational model.
6
This material was in part found at http://www.edrawsoft.com/datamodel.php checked 18/10/2009.
Entities can be classified as independent or dependent (in some
methodologies, the terms used are strong and weak,
respectively). An independent entity is one that does not rely on
another for identification. A dependent entity is one that relies on
another for identification.
An entity occurrence (also called an instance) is an individual occurrence of an
entity. An occurrence is analogous to a row in the relational model.
♦ Special Entity Types
∗ Associative entities (also known as link or intersection
entities) are entities used to associate two or more
entities in order to reconcile a many-to-many
relationship.
∗ Subtype entities are used in generalisation hierarchies
to represent a subset of instances of their parent entity,
called the supertype, but which have attributes or
relationships that apply only to the subset.
An example is B2B Customer, a specialisation of Customer. The
Customer entity has the main attributes. The B2B Customer entity
then has additional attributes specific to B2B, for example, credit
arrangements or contact details. Customer and B2B Customer have
a one-to-one relationship.
Associative entities and generalisation hierarchies are discussed in more
detail below.
♦ What are the main entities / tables?
We now go on to decide which tables are necessary and how they link
together. There should be a table for each class of real-world thing, or
'entity'.
3.5.4. Relationships
A Relationship represents an association between two or more entities. Examples of
such relationships might be:
1. Employees are assigned to projects
2. Projects have subtasks
3. Departments manage one or more projects
Relationships are classified in terms of degree, connectivity, cardinality, and
existence. These concepts are discussed below.
♦ Relationships and linking
How are the entity types inter-related? There are three basic possibilities,
sometimes referred to as the cardinality of the relationship. Cardinality
specifies how many instances of an entity relate to one instance of another
entity.
Ordinality is also closely linked to cardinality. While cardinality specifies
the number of occurrences of a relationship, ordinality describes the
relationship as either mandatory or optional. In other words, cardinality
specifies the maximum number of related records and ordinality specifies
the absolute minimum number of related records. When the minimum
number is zero, the relationship is usually called optional and when the
minimum number is one or more, the relationship is usually called
mandatory.
∗ 1:1 (one to one)
In a one-to-one relationship, each record in Table A can have only
one matching record in Table B, and each record in Table B can
have only one matching record in Table A. This type of
relationship is not common, because most information related in
this way would be in one table. For example, it may not be
necessary to have a separate credit reference entity; instead, its
attributes could appear on the customer entity.
You might use a one-to-one relationship to divide a table with
many fields, to isolate part of a table for security reasons, or to
store information that applies only to a subset of the main table.
For example, you might want to create a table to track employees
participating in a fundraising soccer game. The additional
attributes for employees who are also football players would be
stored in a football player table, linked one-to-one to employee.
This is done because the vast majority of employees will not be
football players. Similarly, you might have a general customer
table, and then link it to a B2B table (for B2B-specific elements)
and a B2C one. See also generalisation hierarchies below.
∗ 1:M (one to many)
A one-to-many relationship is the most common type of
relationship. In a one-to-many relationship, a record in Table A
can have many matching records in Table B, but a record in Table
B has only one matching record in Table A.
∗ M:N (many to many) and its resolution into two one-to-many
relationships (1:M and 1:N) to a new link entity
In a many-to-many relationship, a record in Table A can have
many matching records in Table B, and a record in Table B can
have many matching records in Table A. This type of relationship
can only be stored in a database by defining a third table (called a
junction table, or a link or intersection entity) whose primary key
consists of or includes two fields - the foreign keys from both
Tables A and B. A many-to-many relationship is really two one-
to-many relationships with a third table. For example, an Orders
table and a Products table have a many-to-many relationship that's
defined by creating two one-to-many relationships to an Order
Details table.
It is occasionally necessary to add another attribute to the key to
ensure uniqueness – often this is a date/time field.
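The resolution of a many-to-many relationship into two one-to-many relationships can be sketched in code. The following is a minimal illustration only: it uses Python's built-in sqlite3 module rather than Access, and the table and column names (orders, products, order_details, quantity) are assumptions chosen to match the Orders / Products / Order Details example above.

```python
import sqlite3

# In-memory database; all table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE orders   (order_id   INTEGER PRIMARY KEY, customer TEXT NOT NULL);
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name     TEXT NOT NULL);
-- Junction table: one row per product appearing on an order.
CREATE TABLE order_details (
    order_id   INTEGER NOT NULL REFERENCES orders(order_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)  -- the two foreign keys form the key
);
INSERT INTO orders   VALUES (1, 'Alice'), (2, 'Bob');
INSERT INTO products VALUES (10, 'Pen'), (20, 'Ink');
INSERT INTO order_details VALUES (1, 10, 3), (1, 20, 1), (2, 10, 5);
""")


def products_on_order(order_id):
    """Names of the products on one order (one order, many products)."""
    rows = conn.execute(
        "SELECT p.name FROM order_details d "
        "JOIN products p ON p.product_id = d.product_id "
        "WHERE d.order_id = ? ORDER BY p.name",
        (order_id,),
    )
    return [name for (name,) in rows]


def orders_for_product(product_id):
    """Order numbers containing one product (one product, many orders)."""
    rows = conn.execute(
        "SELECT order_id FROM order_details "
        "WHERE product_id = ? ORDER BY order_id",
        (product_id,),
    )
    return [order_id for (order_id,) in rows]
```

Because the primary key of the junction table is the pair of foreign keys, the same product cannot be entered twice on the same order; if that were a legitimate case, a further attribute such as a date/time would have to be added to the key.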
Each of the characteristics represents a different field in the table, and each field
needs a unique name to distinguish it from the others. A database management system such
as Access needs to be told the name of each field (attribute) and the type of data (text,
numeric, date etc.) which that field represents. For a text field, the maximum size in
characters, e.g. the length of the longest name to be stored, must also be specified.
[Figure: entity-relationship diagram of the entity types Programme, Qualification,
Student and Award. Programme: Programme name, LMD level, Programme leader surname,
Programme leader forenames. Student: Student no (PK), Student surname, Student
forenames, Programme code (FK1), Student gender, Student birthdate. Award:
Qualification (PK, FK1), Student no (PK, FK2), Award result.]
Note that, in accordance with the rule that the primary key of the one end of a
one-to-many relationship becomes an attribute of the many end – where it is known
as a foreign key – the entity type Award has attributes Qualification and Student no.
Very frequently, the combination of the foreign keys is the best primary key for the
new entity type. However, it is sometimes necessary to add a date or time attribute
to make the key unique – this is arguably necessary here because it is possible to
envisage a student achieving a qualification on more than one date. However, for
simplicity, we have ignored this rare possibility here.
3.5.7. Identify Domains
This concept, which goes beyond the Chen model, is both theoretically
well founded and very useful in practice. A Domain is the list
of all possible values of an attribute. Thus you might know that the
set of all possible values of a Sex attribute is Male and Female (for
mammals); you might also choose to add the value Hermaphrodite
(to cover worms). It is also very common to permit a Null value,
meaning that for a particular individual we do not know what their
sex is. However, with these four permitted values, we have defined
ALL possible values of that attribute. From this we can state that a
Sex attribute should be a 1-character Text field, and that a
Validation rule should permit only the values M, F, H and (perhaps)
space, representing <null>. We have identified the domain of the
Sex attribute.
It is important to think about the Domain of an attribute for two
reasons:
♦ The Domain determines the data type, size and permitted
values
All attributes having the same Domain should have the
same data type, size and permitted values.
Therefore a Surname should be defined in the same way
throughout a database implementation.
Neither MS Access nor the vast majority of actual database
management systems provide direct support for the Domain
concept - instead, it is the responsibility of the implementer
to ensure that all attributes which share the same domain
are defined with the same type (e.g. numeric integer, text
….), size (e.g. long, double, 5-character text ….), and that
appropriate validation rules are defined and enforced.
In the case of the animal type entity, the number of legs
attribute is an integer number in the range 2 to 1000. The
data type is integer; the domain is the total set of possible
values, in this case, 2, 4, 6, 8… 1000 (millipede!).
♦ Validation rules: What rules apply to each field having this
domain?
Simple example: Sex may be male or female. All other values should be
disallowed by a validation rule, which permits only M (male/masculine) or
F (female/feminine) (and, perhaps, unknown) as values for a Sex attribute.
Consider the validation rules for each data attribute. For example, in the
animal entity, the attribute number of legs must be a value in the domain of
all possible values. Values such as three and 5 are never valid. Consider
setting a rule which does not permit these values. This has the benefit that
it decreases the likelihood of storing bad data.
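The way a domain turns into validation rules can be sketched as follows. This is a minimal illustration, not Access itself: it uses Python's built-in sqlite3 module, where a CHECK constraint plays the role of an Access validation rule. The table name animal and its column names are assumptions; the permitted values are those discussed above (M, F, H or null for sex; an even number of legs from 2 to 1000).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE animal (
    name           TEXT PRIMARY KEY,
    -- Domain of number_of_legs: 2, 4, 6, 8 ... 1000
    number_of_legs INTEGER NOT NULL
        CHECK (number_of_legs BETWEEN 2 AND 1000 AND number_of_legs % 2 = 0),
    -- Domain of sex: M, F, H; NULL (unknown) is also permitted
    sex            TEXT CHECK (sex IN ('M', 'F', 'H'))
)
""")


def try_insert(name, legs, sex):
    """Return True if the row satisfies the validation rules, else False."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO animal VALUES (?, ?, ?)", (name, legs, sex))
        return True
    except sqlite3.IntegrityError:
        return False
```

For example, try_insert("dog", 4, "M") is accepted, whereas a row with 3 legs or with sex "X" is rejected before it can be stored.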
There is one and only one primary key per entity type. One field (sometimes a
combination of fields) uniquely identifies each occurrence of an entity in a database;
we therefore set it to be the primary key.
The primary key of an animal patient might be its name or the owner name.
However, both of these are bad choices. Why? What better alternative can you
suggest?
Patient also needs to contain a foreign key - the name of the animal type. Why?
3.5.10. Normalisation
We should now go back and check that each attribute list:
∗ Is complete
∗ Has the right attributes on the right entities
We may choose to use the formal relational data analysis technique
called normalisation. This technique is described in appendix 3. It is
a useful cross-check, but is not essential.
3.5.11. ER Notation
There is no standard for representing data objects in ER diagrams. Each modelling
methodology uses its own notation.
The original notation used by Chen is widely used in academic texts
and journals but rarely seen in either CASE (Computer Aided
Software Engineering) tools or publications by non-academics.
Today, there are a number of notations used, among the more
common being Bachman, crow's foot, IDEF1X and SSADM.
The symbols used in this document for the basic ER constructs are taken
from the American Information Engineering tradition and are also called
the crow’s foot notation (in French, patte d’oie).
∗ Entities are represented by labelled rectangles. The
label is the name of the entity. Entity names should be
singular nouns.
∗ Relationships are represented by a solid line connecting
two entities. The name of the relationship is written
above the line. Relationship names should be verbs.
∗ Attributes, when included, are listed inside the entity
rectangle. Attributes which are identifiers are
underlined. Attribute names should be singular nouns.
∗ Cardinality of many is represented by a line ending in a
crow's foot. If the crow's foot is omitted, the cardinality is
one.
∗ Existence is represented by placing a circle or a
perpendicular bar on the line. Mandatory existence is
shown by the bar (which looks like a 1) next to the entity
of which an instance is required. Optional existence is
shown by placing a circle next to the entity that is
optional.
There are many different ways of drawing entity-relationship diagrams. In
most of this document, we show one-to-many relationships using the
crow’s foot notation without particular concern for the ordinality.
Where it is desirable or necessary to consider ordinality
(whether or not a relationship is mandatory) we can use an
extended set of symbols:
3.5.12. Online tutorial
For an additional online tutorial about entity relationship modelling, see
http://www.cems.uwe.ac.uk/~tdrewry/lds.htm checked 24/11/2008. Note that this
tutorial sticks rigidly to the SSADM modelling conventions and names and makes
reference to Logical Data Structures, LDS. As it makes clear, “Logical data
structures are data models, and are sometimes called entity-relationship (ER) models
or even entity-attribute-relationship models.” In other words, LDS is a synonym for
Entity Relationship Model.
3.6.2. Time dimension
Ensure that, for all major entity types, there are processes which
CReate, Update, and Delete (CRUD!) them. It may be necessary to
create specific processes to carry out operations which create,
update or delete entity types. But note that some systems never
delete data; instead, they may archive it.
Formal Entity Life History models exist as part of SSADM but are rarely
constructed nowadays. It is usually sufficient simply to ensure that the
above points 3.6.1 and 3.6.2 are respected. See also
http://www.cems.uwe.ac.uk/~tdrewry/modeling.htm#Modeling%20Techniques
The fault in this case may lie with you or the user or both, but you have to
reach some agreement on what needs to be done to put the system right,
and it will normally be at your expense.
♦ The user has not themselves sufficiently thought through what
they require of a system
Even though the "fault" here is more obviously with the user, you still have
to reach some agreement on what needs to be done to put the system right.
♦ The user, inspired by seeing the implemented system, decides
that they would like the system to do more
Great! A business opportunity! You enter the required additional
functionality on a document which you may grandly term the Enhancement
Register, work out the implications in terms of additional design and
implementation effort, and tell the user what the enhancements will cost -
in terms of later delivery and / or an increased bill. You should never allow
yourself to get dragged into a cycle of continuously responding to such
changes as you go along, without explicit renegotiation of the terms of
reference agreed at the outset of the project.
4.2. An Exercise
Assume that you are the people who originally designed the University of
Anytown database used as an example later in this booklet. Now that you
know about the various stages required to analyse user needs and design
a database solution, carry out those steps for yourself for a business
school. Go through the various stages and carefully document what you
do at each stage. Or, if you are responsible for database design in an
assignment I have set, do the same thing for that database.
This is a significant piece of work - it will probably take you at least a few
hours of effort, and may well take you a week of on-and-off effort.
When you have finished the University of Anytown database design,
compare the results of your work with those of the original analyst /
designers. You should find that you have reached similar or better
conclusions.
4.3.3. Further study
Before going much further with database design, and assuming you
are full of enthusiasm for databases, you are advised to study the
topic in a textbook, learning more about concepts such as
normalisation, which is a very useful technique for ensuring that
each table has exactly the right attributes. Normalisation is
introduced in appendix 3. For a basic treatment, see [Hughes
2000]; for an advanced treatment, see [Date 2003].
5.2. The history of databases
♦ Hierarchic and network databases were invented in the 1960s.
♦ 1970: Dr. E.F. Codd introduces the concept of the relational
database
In 1970, the expatriate British researcher Edgar Codd was
working for IBM in the United States. He suggested that a
better basis for database implementation was relational set
theory, a mathematical approach.
♦ Concepts are relatively simple and have a strong theoretical
basis, that of mathematical set theory
In a relational database, care is taken to keep all the data
for a set of like entities in what is mathematically a relation
or a set, but what we would probably refer to as a table of
records: e.g. student. Each different kind of entity is kept
separately: so we might also have a programme entity (or
relation or table - these are equivalent terms). Student
records are linked to a programme record by means of a
shared linking attribute, in this case, the programme code.
♦ Relational model does have limitations but is currently the
dominant paradigm (way of thinking)
♦ Object databases are just beginning to become commercially
significant and might dominate eventually
The relational database paradigm has been dominant since
about 1980, and has yet to be displaced by the more recent
object database approach.
∗ Oracle – hybrid object-relational approach
The diagram shows a situation in which a foreign key, programme
code, in a Student table is being linked to the corresponding
programme code in the Programme table. It is necessary for a
Programme to exist before a Student can be registered. It is
probably appropriate automatically to cascade any change to the
programme code in Programme to each Student record having that
code. By contrast, the deletion of a Programme might not require
the deletion of linked Students (who perhaps studied on the
programme before it was deleted).
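The behaviour described above can be sketched as follows. This is a minimal illustration using Python's built-in sqlite3 module rather than Access; the table and column names and the sample data are assumptions. A change to a programme code is cascaded to the linked students, while deleting a programme leaves the student records in place. Setting the student's programme code to null on deletion is only one possible choice, shown here as an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only on request
conn.executescript("""
CREATE TABLE programme (
    programme_code TEXT PRIMARY KEY,
    programme_name TEXT NOT NULL
);
CREATE TABLE student (
    student_no     INTEGER PRIMARY KEY,
    surname        TEXT NOT NULL,
    programme_code TEXT REFERENCES programme(programme_code)
                        ON UPDATE CASCADE   -- code changes follow through
                        ON DELETE SET NULL  -- students survive a deletion
);
INSERT INTO programme VALUES ('BBA', 'Bachelor in Business');
INSERT INTO student   VALUES (1, 'Dupont', 'BBA');
""")


def register(student_no, surname, programme_code):
    """Register a student; fails if the programme does not yet exist."""
    try:
        with conn:
            conn.execute("INSERT INTO student VALUES (?, ?, ?)",
                         (student_no, surname, programme_code))
        return True
    except sqlite3.IntegrityError:
        return False


def change_programme_code(old_code, new_code):
    with conn:
        conn.execute("UPDATE programme SET programme_code = ? "
                     "WHERE programme_code = ?", (new_code, old_code))


def delete_programme(code):
    with conn:
        conn.execute("DELETE FROM programme WHERE programme_code = ?", (code,))


def programme_of(student_no):
    row = conn.execute("SELECT programme_code FROM student "
                       "WHERE student_no = ?", (student_no,)).fetchone()
    return row[0]
```

In Access, the equivalent choices are made in the Relationships window, by enforcing referential integrity and selecting whether related fields are updated, and related records deleted, in cascade.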
♦ The emphasis has to be on effective data retrieval in order to
answer arbitrarily complex questions
5.9. What Is a DBMS?
♦ Database management system (DBMS) = an integrated set of
programs, used to define, update, and control the database
♦ Examples
∗ Small: MS Access, OpenOffice.org Base
∗ Medium: MS SQL Server, MySQL, PostgreSQL
∗ Large: ORACLE, IBM DB2
SECTION 2 – USING MICROSOFT ACCESS TO BUILD GOOD
DATABASES
[Figure: network diagram showing workstations, a local server, a web server and a
central database server. Annotations recovered from the diagram:
Client workstation: in addition to displaying web pages, this PC runs a copy of
Microsoft Access which can also query the remote database (SQL Server or ORACLE or
whatever) which runs on the main database server computer.
Links: HTTP carrying HTML; HTTP carrying HTML, plus ODBC (open database connectivity
feature) for database queries.
Web Server: responds to Internet HTTP and delivers HTML; the web pages can include
database forms, with data stored on the database server computer.
Database server computer: it is this central computer that runs the company’s main
transaction processing and database software.]
6.3. Further facilities of more advanced DBMS
♦ Support many users and multiple applications
∗ MS Access does this, sort of ... an individual database may support a handful of
users
♦ Depend upon a data dictionary (sometimes called a repository)
♦ Integrate with the CASE (Computer Aided
Software Engineering) tool which created and
maintains the data dictionary
♦ Implement resilience and recovery mechanisms
These things include roll-forward and / or roll-back mechanisms so
that complete transactions (only) are carried out. Such mechanisms
are essential to prevent situations where, for example, money
leaves one company’s bank account, but never reaches another
company’s.
♦ Enforce security
Only privileged users should be able to see things like payroll data.
6.4.1. The relative ease-of-use of MS Access
Industrial strength databases, such as ORACLE, are harder to learn and less well
integrated into the PC environment, whereas MS Access is easily accessible (sic!).
7.4. Attribute types in MS Access
What data type should you use for a field in a table?
Decide what kind of data type to use for a field based on these considerations:
• What kind of values do you want to allow in the field? For
example, you can't store text in a field with a Number data type.
• How much storage space do you want to use for values in the
field?
• What types of operations do you want to perform on the values in
the field? For example, Microsoft Access can sum values in Number or
Currency fields, but not values in Text or OLE Object fields.
• Do you want to sort or index a field? Memo, Hyperlink, and OLE
Object fields can't be sorted or indexed.
• Do you want to use a field to group records in queries or reports?
Memo, Hyperlink, and OLE Object fields can't be used to group records.
• How do you want to sort values in a field? In a Text field,
numbers sort as strings of characters (1, 10, 100, 2, 20, 200, and so on), not
as numeric values. Use a Number or Currency field to sort numbers as
numeric values. Also, many date formats will not sort properly if entered in
a Text field. Use a Date/Time field to ensure proper sorting.
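The sorting point in the last bullet can be demonstrated in a few lines. This sketch uses Python rather than Access, and the sample values and the day/month/year date format are assumptions, but the effect is the same: values held as text sort character by character, not by their numeric or chronological value.

```python
from datetime import datetime

# Numbers stored as text sort as strings of characters.
numbers_as_text = ["1", "2", "10", "20", "100", "200"]
text_order = sorted(numbers_as_text)
numeric_order = sorted(int(value) for value in numbers_as_text)

# Dates stored as text (day/month/year) sort by their first characters.
dates_as_text = ["31/01/2009", "01/02/2009"]
date_text_order = sorted(dates_as_text)
date_true_order = sorted(
    dates_as_text, key=lambda text: datetime.strptime(text, "%d/%m/%Y")
)

print(text_order)       # ['1', '10', '100', '2', '20', '200']
print(numeric_order)    # [1, 2, 10, 20, 100, 200]
print(date_text_order)  # ['01/02/2009', '31/01/2009']
print(date_true_order)  # ['31/01/2009', '01/02/2009']
```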
Data type: Single (Réel simple). Size: 4 bytes.
Stores numbers from –3.402823E38 to –1.401298E–45 for negative values and from
1.401298E–45 to 3.402823E38 for positive values.

Data type: Double (Réel double). Size: 8 bytes.
Stores numbers from –1.79769313486231E308 to –4.94065645841247E–324 for negative
values and from 1.79769313486231E308 to 4.94065645841247E–324 for positive values.
15 decimal places.

Data type: Date/Time (Date/Heure). Size: 8 bytes.
Dates and times.

Data type: Currency (Monétaire). Size: 8 bytes.
Currency values. Use the Currency data type to prevent rounding off during
calculations. Accurate to 15 digits to the left of the decimal point and 4 digits to
the right.

Data type: AutoNumber (Numérotation automatique). Size: 4 bytes.
Unique sequential (incrementing by 1) or random numbers automatically inserted when a
record is added.
NB: if you use an automatically numbered field as part of the primary key of a table,
and you also have to use it as the foreign key in a linked table, the data type
required in the many end is long integer, which is how in fact an AutoNumber field is
stored.

Data type: Yes/No (Oui/Non). Size: 1 bit.
Fields that will contain only one of two values, such as Yes/No, True/False, On/Off.

Data type: OLE Object (Liaison OLE). Size: up to one gigabyte (subject to disc space!).
Objects (such as Microsoft Word documents, Microsoft Excel spreadsheets, pictures,
sounds, or other binary data), created in other programs using the OLE protocol, that
can be linked to or embedded in a Microsoft Access table. You must use a bound object
frame in a form or report to display the OLE object.

Data type: Hyperlink (Hyperlien). Size: up to 64,000 characters.
Field that will store hyperlinks. A hyperlink can be a UNC (Universal Naming
Convention) path to a file, or a URL.

Data type: Assistant for choosing from a list (Assistant Liste de choix). Size: the
same size as the primary key of the corresponding table. In the (common) case where
this is an AutoNumber field, it will be 4 bytes in length.
Creates a field which permits you to choose, from a scrolling list, a value which
comes either from another table or from a specified list of permitted values. If you
choose this option, a wizard appears to help you to define the field.
7.6. Keys
7.6.3. Multi-part primary keys
The primary key may be multipart. To create a multipart primary key in Access,
select the first field, then, holding down the Control key, select the second and
subsequent parts. Once all parts of the primary key are selected, use Edit / Primary
key to set the selected attributes as the primary key.
7.7. Relationships
Defining relationships in Access involves adding the tables you want to relate to the
Relationships window, and then dragging the primary key field from one table and dropping
it on the foreign key field in the other table.
The kind of relationship that Microsoft Access creates depends on how the related fields are
defined:
♦ One-to-many relationship
A one-to-many relationship is created if only one of the related fields is a
primary key or has a unique index. This is usually the case.
♦ One-to-one relationship
A one-to-one relationship is created if both of the related fields are
primary keys and / or have unique indexes.
Sometimes Access recognises this automatically, as here, when a B2B
customer table is being created to hold fields specific to B2B customers:
The result is:
♦ Many-to-many relationship
A many-to-many relationship is really two one-to-many relationships with
a third table whose primary key consists of two fields [7] - the foreign keys
from the two other tables. This has already been discussed in section 2.21.
7.8.1. Queries
A query is a temporary results table resulting from joining together fields taken from
one or more database tables. A query can also include calculated fields.
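As a minimal sketch of this idea, the following query joins fields from two tables and adds a calculated field. It uses Python's built-in sqlite3 module rather than the Access query designer; the tables, the column names and the calculated field line_total are all assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT, unit_price INTEGER);
CREATE TABLE order_details (order_id INTEGER, product_id INTEGER, quantity INTEGER);
INSERT INTO products VALUES (10, 'Pen', 2), (20, 'Ink', 7);
INSERT INTO order_details VALUES (1, 10, 3), (1, 20, 1);
""")

# The query joins the two tables and computes line_total, a calculated
# field that is not stored in either table.
ORDER_LINES = (
    "SELECT d.order_id, p.name, d.quantity, "
    "       d.quantity * p.unit_price AS line_total "
    "FROM order_details d JOIN products p ON p.product_id = d.product_id "
    "WHERE d.order_id = ? ORDER BY p.name"
)


def order_lines(order_id):
    """Return the temporary results table for one order as a list of rows."""
    return conn.execute(ORDER_LINES, (order_id,)).fetchall()


print(order_lines(1))  # [(1, 'Ink', 1, 7), (1, 'Pen', 3, 6)]
```

The result exists only for as long as it is needed; nothing is stored beyond the two underlying tables.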
7.8.2. Reports
Reports are comprehensive summaries of a situation, and normally involve data
from several tables. As such, a report is usually based on a query rather than on a
single table. A report is frequently intended to be printed, rather than viewed on-screen.
[7] Or includes them, along with another attribute which ensures uniqueness, usually a date.
7.8.3. Forms
Forms are used to get data into a system, and may also be used to get information
out -- see the next section.
7.9.4. Table-level checks on forms
On a form, it is possible to cross check fields. For example, you
might not allow the title Mr for a person whose gender is female.
However, to do so requires the use of VBA.
However, SQL is more than a conventional query language.
It also provides set manipulation facilities, that is, it is
possible to create whole new sets of data and to store them
in tables, and / or it is possible to update complete sets of
records in a single operation. Access implements this
functionality as ‘append’ and ‘update’ queries – see below,
section 7.11.1.
♦ Occasional need for record-at-a-time navigation and
processing
Access usually manipulates records a set at a time.
Sometimes, it is necessary to carry out record-at-a-time
navigation and processing under program control. This is
achieved in Access by means of:
∗ Recordsets: the Access mechanism for making tables
and queries available a record at a time
∗ Visual Basic: the language in which you can manipulate
individual records
ESC students should not normally try to learn how to do this.
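The difference between set-at-a-time and record-at-a-time processing can be sketched as follows. This illustration uses Python's built-in sqlite3 module in place of Access queries and VBA recordsets; the tables result and result_archive, the sample marks and the capping rule are all assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE result (student_no INTEGER, module_code TEXT, mark INTEGER,
                     PRIMARY KEY (student_no, module_code));
CREATE TABLE result_archive (student_no INTEGER, module_code TEXT, mark INTEGER);
INSERT INTO result VALUES (1, 'M101', 48), (2, 'M101', 62), (1, 'M202', 55);
""")

# Set at a time, 'append' query: copy a whole set of records into another table.
conn.execute(
    "INSERT INTO result_archive SELECT * FROM result WHERE module_code = 'M101'"
)

# Set at a time, 'update' query: change a whole set of records in one operation.
conn.execute("UPDATE result SET mark = mark + 2 WHERE module_code = 'M101'")


def cap_marks_record_at_a_time(limit):
    """Visit each record in turn under program control and cap its mark."""
    rows = conn.execute(
        "SELECT student_no, module_code, mark FROM result"
    ).fetchall()
    changed = 0
    for student_no, module_code, mark in rows:
        if mark > limit:
            conn.execute(
                "UPDATE result SET mark = ? "
                "WHERE student_no = ? AND module_code = ?",
                (limit, student_no, module_code),
            )
            changed += 1
    return changed
```

The loop does nothing that a single update query could not do here; record-at-a-time code is only worth writing when the processing of each record is too intricate to express as one set operation.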
7.11.2. Macros
Macros are stored sequences of user commands.
8. Ways in which to learn more MS Access
[8] However, please note that this Contact Management system will NOT meet the requirement set out in section 4.3.1!
SECTION 3 – THE ANYTOWN DISTANCE LEARNING BUSINESS
SCHOOL EXAMPLE
9. Example scenario: Anytown Distance Learning Business
School
The Anytown Distance Learning Business School offers general business courses at undergraduate
and postgraduate levels. The undergraduate course is a Bachelor of Arts (BA) course called Business
Studies. The postgraduate course is a Master of Business Administration (MBA). Each course is
administered by a Course Coordinator.
Students apply for a course, BA or MBA [9]. They send in an application form containing their personal
details, and their desired course. On behalf of the School, the appropriate Course Coordinator checks
whether the course is available and that the student has already obtained the necessary academic
qualifications. If the course is available (not yet full) and the student is qualified, he or she is enrolled
in the course, and the School confirms the enrolment by sending a confirmation letter to the student.
If the course is unavailable or the student is not sufficiently qualified, the student is sent a rejection
letter.
[9] Note that course in this case study is neither programme nor module – but, as we will see, it is closer to
programme than module.
• Students study modules drawn from two lists of modules held for the School, one of
undergraduate modules, the other of postgraduate ones.
• Modules have titles and a unique identifying code. Each module has a pre-defined value
expressed as a number of credits.
• Modules are of two kinds - some modules are core, some are electives, that is they are
optional.
• Core modules must be taken by all students on the course. The course regulations will
specify how many optional modules a student can take and what these options might be.
• Students who pass a module are awarded the number of credits specified as that module’s
value. If they fail, they get zero credits.
• Students construct a programme of study by doing core modules, to which are added the
optional (elective) modules they select from those available.
• Every course defines a maximum period of enrolment within which time the course must be
completed. This is normally five years for an undergraduate course and three years for a
postgraduate course. If a student does not complete within this time, the decision of the next
exam board will be that they have failed the course.
• Students may suspend studies or withdraw from the course. The date on which this happens
must be recorded.
• Each component (coursework or exam) of a module has a certain percentage weighting and a
student’s overall mark for a module is calculated by combining the marks for each component.
• An exam board (jury) meets after each semester to consider the marks obtained by students
and to determine whether they have passed or failed the modules they were registered for, and
what their status on the course now is. This process is described in more detail in section 12.
1. Passed all the necessary credits, including all the core modules for the course and at
least the necessary number of options from the course’s collection of optional modules; in
this case, the decision is that they have succeeded in the course and they are awarded a BA or
MBA.
2. Not yet passed all the necessary credits, but are making satisfactory progress: the
decision is that they may proceed, taking further credits as necessary.
3. Are not making satisfactory progress, that is, they are failing to complete too many
modules or have exceeded the maximum time they may stay on the course: the decision is
that they have failed in the course as a whole.
After the exam board, a revised version of the Student Record is printed and sent to the students.
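The marking rules given in the bullets above (percentage weightings for each component, and credits awarded only on a pass) can be sketched as follows. The function names are hypothetical, and the pass mark of 50 is an assumption made for illustration: the scenario does not state what the pass mark is.

```python
PASS_MARK = 50  # assumption: the scenario does not specify a pass mark


def module_mark(components):
    """Combine component marks into an overall module mark.

    components is a list of (weighting_percent, mark) pairs, one pair per
    coursework or exam component; the weightings must total 100.
    """
    total_weighting = sum(weighting for weighting, _ in components)
    if total_weighting != 100:
        raise ValueError("component weightings must total 100%")
    return sum(weighting * mark for weighting, mark in components) / 100


def credits_awarded(mark, module_credits):
    """A pass earns the module's full credit value; a fail earns zero."""
    return module_credits if mark >= PASS_MARK else 0


# Coursework weighted 40% with a mark of 60, exam weighted 60% with a mark of 45.
overall = module_mark([(40, 60), (60, 45)])
print(overall)                       # 51.0
print(credits_awarded(overall, 10))  # 10
```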
14. Simplifying Assumptions
This database does not contain full details of qualifications [10]. So data about the following things is
simply stored as large text fields (Access Memo type), because it does not need to be queried:
Course Qualifications Required
Applicant / Student Previous Qualifications
Each module has only one teacher, and that teacher is the module leader. One teacher may however
be the module leader for a number of modules.
[10] For a more thoughtful approach to how to manage qualifications, please see section 2.22
17. Documents
The Course Coordinators currently produce and maintain the following documents:
17.1.1. List of Modules
For each module, the following data has to be kept:
♦ Module Code
♦ Module Title
♦ Course Code – The Course on which the module is used – a
module is used either on the BA or the MBA
♦ The Lecturer who is the Module Leader
♦ Elective or Core?
♦ Examination weighting %
Student Name
Student Address
Date of Birth
Previous Qualifications (stored as a Memo field)
Status (Applicant / Enrolled / Passed / Failed / Withdrawn / Suspended / Progressing)
20. Anytown high-level Use Case diagram
Please note that the label <<include>> can also be written « include ». Note also that Microsoft
Visio employs <<uses>> or « uses » instead of << include >> - they mean the same thing.
[Figure: high-level use case diagram.
Actors: Applicant, Course coordinator, Student, Jury, Module leader, Dean.
Use cases: Confirm applicant as student; Record student module choices; Print module
results for jury; Print student results for jury; Update student status; Print student
results letters; Record new staff member; Submit coursework; Change course structure;
Allocate teacher; Request management reports.
The diagram contains two «uses» relationships.]
21. Anytown: Context diagram
[Figure: context diagram for the University of Anytown System.
External entities: Programme co-ordinator, Student, Dean / Management.
Data flows: Application; Acceptance or rejection; Decision; Module results; Course
description; Management reports; Module choices.]
22. Level 1 DFD
[Figure: Level 1 data flow diagram.
Processes: 2 Admit students to course; 3 Register students on core and elective
modules; 4 Prepare for and hold exam board; 5 Teach and assess module; 6 Review
Courses and modules.
Data stores: D1 Applicant details; D2 Students; D3 Module registrations; D4 Module
results; D5 Module & Course specs.
External entities: Student, Course co-ordinator, Dean / Management.
Data flows: Application; Acceptance or rejection; Decision; Student status; Module
choices; Coursework and exams for assessment; Module results letters; Semester results
letters; Course descriptions; Proposed changes to module specification; Revised module
specification; Potential changes to Course; Management reports.]
23. Example Level 2 DFD
[Figure: example Level 2 data flow diagram. Recoverable labels: process 4.1; data
stores D4 Module results and D2 Students; Students.]
24. Data dictionary
We now need to move towards a good ERA model by means of top-down entity attribute modelling.
The approach I have adopted here is to work on the basis of the list of "obvious" entities which I identified in section 18, put them into a spreadsheet, and gradually add the
appropriate attributes. The spreadsheet is an extended example of what is sometimes called a Data Dictionary.
Description
External entities
Applicant
Student
Course Coordinator
Module Leader
Dean
Process name | P / S | No | Description
Register students on core and elective modules | P | 3 |
Prepare for and hold exam board | P | 4 |
Collect module results and produce student profile | S | 4.1 |
Review module results | S | 4.2 | The scaling factor (if any) is applied to the recorded student results before the Student Results Summary is reprinted
Review programmes and modules | P | 6 |
Produce management reports | P | 7 |
1 Applicants
2 Students
3 Module registrations
4 Module results
5 Module specifications
6 Student profiles
7 Course specifications
Data Flows
External entity | Process No | Direction | Name of flow | Process name
Applicant | 1 | Inward | Application | Process applications
Applicant | 1 | Outward | Acceptance or rejection | Process applications
Course Coordinator | 1 | Outward | Application | Process applications
Course Coordinator | 1 | Inward | Decision | Process applications
Course Coordinator | 6 | Inward | Course description | Review programmes and modules
Student | 3 | Inward | Module choices | Register students on core and elective modules
Student | 4.2 | Inward | Coursework and exams for module assessment | Review module results
Student | 5 | Outward | Module results letters | Teach and assess module
Student | 4.4 | Outward | Student results letters | Print results letters
Module Leader | 6 | Inward | Proposed changes to module specification | Review programmes and modules
Module Leader | 6 | Outward | Module specification as revised and agreed | Review programmes and modules
Module Leader | 6 | Inward | Module results | Review programmes and modules
Dean | 7 | Outward | Management reports (analysis of the results of each module; course description; list of modules) | Produce management reports
Dean | 6 | Inward | Proposed changes to programme | Review programmes and modules
Entities
Attribute | Primary? (Y or C) | Foreign? | Type | Size | Format | Input mask | Validation rules | Description

Applicant / Student (An applicant becomes a student when they are enrolled)
Student number | Y | | Text | 11 | > | LLL00000000 | |
Student forenames | | | Text | 30 | | | |
Student last name | | | Text | 20 | > | | |
Finishing date | | | Date/time | 20 | Date | 00/00/0000 | |
Status | | | Text | 12 | | | Applicant / Enrolled / Passed / Failed / Withdrawn / Suspended / Progressing |
Term address line 1 | | | Text | 20 | | | |
Term address line 2 | | | Text | 20 | | | |
Term city | | | Text | 50 | > | | | Defaults to Anytown
Term postcode | | | Text | 8 | | | |
Contact details | | | Text | 60 | | | |
Previous qualifications | | | Memo | | | | |

Employee
Employee number | Y | | Text | 7 | > | LLL0000 | | e.g. EMP1234
Employee forenames | | | Text | 30 | | | |
Employee last name | | | Text | 20 | | | |
Employee role | | | Text | 20 | | | Course coordinator / module leader / Dean |
Social security number | | | Text | 16 | | | |
Employee address line 1 | | | Text | 20 | | | |
Employee address line 2 | | | Text | 20 | | | |
Employee city | | | Text | 50 | | | |
Employee postcode | | | Text | 8 | | | |
Employee country | | | Text | 32 | | | |
Employee contact details | | | Memo | | | | | E.g. telephone numbers, etc.

Programme
Level | Y | | Text | 1 | > | L | P/U |
Credits per module | | | Integer | | | | | 10 if undergrad; 15 if postgrad
Project credits | | | Integer | | | | | 60 if postgrad
Credits required | | | Integer | | | | | 360 if undergrad; 180 if postgrad

Course
Course code | Y | | Text | 3 | | | MBA / BA |
Course name | | | Text | 40 | | | | BA Business Studies or Master of Business Administration
Course coordinator | | Y | Long Integer | | | | | Employee who manages the course
Level | | Y | Text | 1 | | | |
Required qualifications | | | Memo | | | | |
Max number of students | | | Integer | | | | |
Normal number of years | | | Integer | | | | |
Max number of years | | | Integer | | | | |
Module value | | | Integer | | | | |
Modules per semester | | | Integer | | | | |
Taught semesters per year | | | Integer | | | | |
Description | | | Memo | | | | |

Module
Module code | Y | | Text | 4 | | | |
Course code | | Y | Text | 1 | | | |
Specification | | | Hypertext link | | | | |

Module operation (A Module Operation is the operation, or running, of a module in a given year)
Module code | C | Y | Text | 4 | | | |
Year | C | | Text | 4 | | | |
Core / elective / obsolete? | | | Text | 1 | C/E/O | L | |
Examination weighting | | | Integer | | % | | |
Teacher | | | Text | 7 | > | LLL0000 | |
Scaling factor applied | | | Single | | % | | | Scaling factor, in %, decided by jury

Registration Result
Module code | C | Y | Text | 4 | | L000 | |
Student number | C | Y | Text | 11 | > | LLL00000000 | |
Date course work received | | | Date/time | | | | |
Course work mark | | | Integer | | % | | |
Relationships
Parent | Relationship | Child | Degree | Description
Programme | Includes | Course | 1:M | A Course is part of either a Postgraduate or an Undergraduate Programme
Course | Enrols | Applicant / Student | 1:M | An application is made by an Applicant for a Course. If they are acceptable, they may Enrol on the Course. They are then a Student on that Course
Course | Consists Of | Module | 1:M | A Course is delivered as a series of Modules
Employee | Coordinates | Course | 1:M | An Employee whose role is Course Co-ordinator coordinates the Course
Employee | Leads | Module | 1:M | An Employee whose role is Lecturer Leads the Module
Module | Results In | Registration Result | 1:M | Registration Result resolves the many-to-many relationship between Module and Student
Applicant / Student | Is Registered On | Registration Result | 1:M | Registration Result resolves the many-to-many relationship between Module and Student
Module | Runs as | Module operation | 1:M |
System Outputs
Reports | Description
Analysis of the results of each module | Average mark, standard deviation, percentage of students who have not passed
Student Results Summary | See section 10 of scenario
Forms | Sub-Forms | Description
Queries | Description

System Inputs
Forms | Sub-Forms | Description
Applicant details
Student details
Record student module choices: Programme and module details
Update course structure: Module and Module Operation details
Update module
Update student: Registration results
Update member of staff
Coursework receipt
25. Anytown ER diagram
[Figure: Anytown ER diagram. Entities shown: Programme, Module, Registration / Result, Applicant / Student. Relationships shown: Includes, Consists of, Leads, Runs as, Results in, Is Registered on.]
26. Anytown system implementation
To build on the analysis and design work already done, begin by translating the ERA model
(data model) into the equivalent Access objects: entities become tables, attributes become
fields, and relationships are defined as relationships! Similarly, the Use Case diagram has
already been used to identify the inputs and outputs listed in the data dictionary above.
Implementing these in Access means converting them into the equivalent forms and subforms.
You might like to try this for yourself, because we have not uploaded an Anytown database.
Over to you…
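As a starting point, the translation from entities to tables can be sketched outside Access. The following is a minimal sketch only: it uses Python's built-in sqlite3 module as a stand-in for Access (an assumption made so that the example runs anywhere), it covers just four of the entities, the column names are shortened versions of the attributes in the data dictionary, and the sample rows are invented for illustration.

```python
import sqlite3

# SQLite stands in for Access here (an assumption); data types are approximate.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request
conn.executescript("""
CREATE TABLE Course (
    CourseCode TEXT PRIMARY KEY,      -- entity -> table, attribute -> field
    CourseName TEXT
);
CREATE TABLE Module (
    ModuleCode TEXT PRIMARY KEY,
    CourseCode TEXT NOT NULL REFERENCES Course(CourseCode)  -- Course Consists Of Module (1:M)
);
CREATE TABLE Student (
    StudentNumber TEXT PRIMARY KEY,
    LastName TEXT
);
CREATE TABLE RegistrationResult (     -- resolves the many-to-many between Module and Student
    ModuleCode TEXT NOT NULL REFERENCES Module(ModuleCode),
    StudentNumber TEXT NOT NULL REFERENCES Student(StudentNumber),
    CourseWorkMark INTEGER,
    PRIMARY KEY (ModuleCode, StudentNumber)   -- compound key
);
""")

# Invented sample rows, for illustration only.
conn.execute("INSERT INTO Course VALUES ('MBA', 'Master of Business Administration')")
conn.execute("INSERT INTO Module VALUES ('M101', 'MBA')")
conn.execute("INSERT INTO Student VALUES ('ABC00000001', 'Smith')")
conn.execute("INSERT INTO RegistrationResult VALUES ('M101', 'ABC00000001', 65)")


def marks_for_student(student_number):
    """Return (module code, course name, course work mark) rows for one student."""
    sql = """SELECT m.ModuleCode, c.CourseName, r.CourseWorkMark
             FROM RegistrationResult AS r
             JOIN Module AS m ON m.ModuleCode = r.ModuleCode
             JOIN Course AS c ON c.CourseCode = m.CourseCode
             WHERE r.StudentNumber = ?"""
    return conn.execute(sql, (student_number,)).fetchall()
```

In Access itself the same structure would be built in table Design view and in the Relationships window rather than typed as SQL; the point of the sketch is only the one-to-one mapping from the data dictionary to tables, fields and relationships.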
28. References
29. Appendix 1 Business Process Analysis using Use Case Analysis
With thanks to Dr. Ken Lunn, former colleague at the University of Huddersfield, whose
material has formed the main basis for this section.
A Use Case is a definition of a meaningful interaction with a computer system. If you have
used the internet to buy things, an example of a Use Case would be choosing something from
an online catalogue, and another might be paying for the goods.
Use Case modelling is part of requirements definition and systems analysis. At a high level, a
set of Use Case diagrams defines how the system presents itself to the outside world, and these
diagrams are excellent tools for discussion with stakeholders of a system, such as users and
sponsors. At a more detailed level, Use Cases are used to fully specify the external
functionality of a system.
Use Cases are part of the information required by developers to design and implement a
system.
Use Case diagrams say "what" a system does. The detailed analysis of Use Cases begins to say
something of "how" the system behaves in an environment. However, it does not say "how" a
system is structured internally to provide that behaviour. In computer system development you
will frequently see this separation emphasised. Before you decide how a system works, you
need to determine what it does first - a simple and obvious rule, but one so often forgotten to
many people's ultimate regret. That’s why Use Case diagrams (UCDs) and Use Case models
(UCDs with supporting text documents) can be so useful.
We draw a Use Case as an ellipse with the name of the Use Case underneath:
[Figure: an ellipse with the label "A Use Case" underneath]
Sometimes the name is put inside:
[Figure: an ellipse with the label "A Use Case" inside]
The Use Case name is a concise, active description of the behaviour carried out by the
Use Case, such as "print invoice". Do not write mini-essays to describe the behaviour
of the Use Case - we shall use a more elaborate means for describing the behaviour in
full.
[Figure: an Actor]
[Figure: an Actor connected by an arrow to a Use Case]
This means that an Actor uses the Use Case. In any relationship there will be two way
communications. The direction of the arrow indicates who initiates the interaction.
Often in an interactive system, it is the Actor that initiates the dialogue, but it can be
the Use Case. Sometimes the arrow is left out.
A Use Case can use another Use Case. If you have a piece of well-defined
functionality, it makes sense to re-use this wherever possible. Also, sometimes a Use
Case gets too big to manage sensibly and it makes sense to break this down into
smaller Use Cases.
There are two ways Use Cases can relate. The first is where a Use Case "includes"
another Use Case. In this case the second Use Case is always invoked as part of the
execution of the first. This is drawn with an arrow pointing to the Use Case that is
included, with the label <<include>> tagged to the line:
Please note that the label <<include>> can also be written « include ». Note also that
Microsoft Visio employs <<uses>> or « uses » instead of << include >>.
Sometimes a Use Case is only called occasionally from another Use Case. From the
scenario analysis of the business, this will often be to support an alternative path or an
exception. We draw this with an arrow pointing the other way (yes it is confusing at
first) where the arrow points to the calling Use Case. So below, Chase Payment
sometimes calls Issue Warning Letter.
[Figure: Use Case diagram. Actor: Credit Control Clerk. Use Cases: Receive Payment, Process Payment, Telephone Reminder, Correct Invoice, Correct Delivery. Relationships shown: one <<include>> and three <<extend>>.]
With a Use Case Diagram like the one above, you are getting a clear picture of who
uses a system, and what they can do with it. You also have forced some decisions, and
provided some external structure to the system.
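The difference between <<include>> and <<extend>> can also be seen by analogy with ordinary function calls. The sketch below is only an analogy, not part of the Use Case notation: the Use Case names Chase Payment and Issue Warning Letter come from the text above, while Record Contact and the 30-day threshold are hypothetical, introduced purely for illustration.

```python
def record_contact(log):
    """Hypothetical included Use Case: always runs as part of Chase Payment."""
    log.append("Record Contact")


def issue_warning_letter(log):
    """Extending Use Case: runs only on the alternative path."""
    log.append("Issue Warning Letter")


def chase_payment(days_overdue):
    """Return the list of Use Cases executed, in order."""
    log = ["Chase Payment"]
    record_contact(log)            # <<include>>: invoked every time
    if days_overdue > 30:          # <<extend>>: invoked only sometimes (threshold is hypothetical)
        issue_warning_letter(log)
    return log
```

Calling chase_payment(10) runs only the included behaviour, whereas chase_payment(45) also triggers the extending Use Case.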
The use of these techniques is not taught on this module, nor is it described in
this document. That is because the scope of activity for the kind of systems
described in the rest of the document is assumed to be relatively small scale,
[Figure: the structure of a set of DFDs. NB: this is NOT a DFD; the diagram shows the STRUCTURE of a DFD.
Context: University of Anytown Student System. There is one and only one Context diagram in a DFD.
The L1 DFD identifies between two and seven or eight main processes; each such main process may be expanded into a Level 2 DFD (here, only two are shown: L2 DFD process 1, Process applications, and L2 DFD process 4, Prepare for and hold exam board).
The L2 DFD identifies between two and … main processes; each such main process may be expanded into a Level 3 DFD (remainder of caption lost).
L3 DFD processes 1 to 3: Collect module results and produce student profile; Review module results; Decide student status.]
[Figure: DFD notation.
A Process box, holding a number (e.g. 1) and a Process Description: the Number and Description are the same as in the Elements List (data dictionary).
A Data Store, e.g. "D1 Name of Data Store": the Number and Description are the same as in the Elements List.
Source or Destination (Source/Destination).
Arrows show DATA FLOWS.]
30.16. Supporting documentation
♦ Elementary Process Descriptions
∗ Description in natural language of a process
which is not further exploded in lower-level
diagrams
∗ Shouldn’t need to be long – if it is, may indicate
need for another level of diagram
♦ The list of elements - External entity list, etc – which is
typically stored in a Data Dictionary
31.1. Introduction
We normally store data on computers when we have many occurrences of a specific
kind of record, and we want to process specific records, or the complete set of records.
For example, we may want to maintain a list of companies. For purposes of
comparison, we will normally choose to store the same items of data about each
occurrence. For example, we will store the name of each company, its principal sector
of activity, and the address of its global headquarters. A widely accepted way of
storing data, indeed, we may even refer to it as the “natural” way to store such data, is
by means of two-dimensional tables.
Many widely used office productivity programs provide good facilities for storing two-
dimensional tables. We can use a word processing program, such as Microsoft Word; a
spreadsheet, such as Microsoft Excel; or a database, such as Microsoft Access. However, each
program has specific strengths and weaknesses when it stores data in this way. Refer back to
section 2.6 for more on this.
31.2.4. Summary
When you are manipulating data for yourself alone, or as part of a small
team, or in a very small business, spreadsheets are likely to be more intuitive,
initially more productive and easier to get started with. However, as the
volumes of data, or the number of users, increase, databases become much
the preferable option. It is often a sensible and viable option to prototype the
requirements for a small business information system using a spreadsheet,
and then, when the requirements are quite clear, to transfer the data storage
element to a database.
See also http://www.epinions.com/content_972857476 (checked 20/11/2008)
Since the days of Lotus 1-2-3, people have used spreadsheet programs for everything
from word processing to data management. Doing the former is silly. Doing the latter,
however, is viable, especially in the latest version of Microsoft Excel. But though you
may be more comfortable with Excel, a real relational database program like
Microsoft Access is a better choice for managing data—for a number of reasons.
♦ Databases are safer. Excel, for example, does everything
in memory, so that any unsaved data may be lost if your
system crashes. Databases write data to the hard drive
immediately.
♦ Databases can handle more data. Sure, Excel can
technically handle more than 65,000 rows of data, but
doing so will likely bog down even the fastest PC.
♦ Databases can easily link tables of related data together,
such as customers and orders or musical groups and
albums (as well as the songs on each album). This is
where the words relational and database come together.
Storing related data together in a single table or
spreadsheet can be unwieldy and invite errors.
We'll look at a situation for which Access is a better tool than Excel and show you how
an Access solution works. If you've never used Access before, that's okay; we'll walk
you through how to create everything from scratch. We used Access 2002 for the
instructions, but you'll find the process is similar in all versions of Access. We chose
Access because so many users have it already, but you can do the same things in other
relational databases such as FileMaker or Microsoft SQL Server. For more on picking
the right database, see "Databases for All Reasons" in our issue of January 2003 at
http://www.pcmag.com/article2/0,1759,760886,00.asp (checked 24/11/2008).
If you want, you can add a description for each field to explain its contents as well as
a caption. The caption is a name that is used in place of the field name in reports and
forms. If you use shortened or cryptic field names, captions are a good idea.
To set a primary key, right-click on the area to the left of the Owner Nr field and
choose Primary Key. A key icon will appear, indicating that the field is the primary
key. Save the file with the name Owner, and click on the table's Close button.
Repeat this process to create a second table for pets with these fields:
Now you can enter data into the tables. Click on Tables in the Objects bar and double-
click on Clients to open it in datasheet view. Type the following data into the table
(the number in the Owner Nr field will be entered automatically):
Close the table and then repeat the process to add the following data to the Pet table
(the patient no will be added automatically):
patient no | owner code | animal type | patient name | condition | treatment date | leave date | date of birth
1 | 2 | Cat | Peaches | fever | 30/04/2004 | 01/05/2004 | 01/04/2003
2 | 1 | Dog | Sam | | | | 01/04/2003
3 | 3 | Horse | Dobbin | | | | 03/03/1999
4 | 3 | Cat | Ginger | | | | 01/04/2003
Now you can add a new owner and his pet, as well as add a new patient to one of the
existing clients. To see what is happening behind the scenes, close the form and open
the patients table. You'll see that the data has been entered into the fields patient no
and owner code, even though neither field was included on the form. The patient no
number is automatically entered, because the field type is AutoNumber and the owner
code field is automatically set to the owner’s number, since the records are related
through the form's design.
Remove a client from the Clients table by opening the table, selecting the client, and
clicking on Delete. You'll be warned that a record in another data file will be affected
(the client's pets will be removed when the client is). This is the result of selecting the
Cascade Delete Related Records check box when setting up the relationship. The same
does not work in reverse and it is possible to have a client with no pets in the Clients
table.
”
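The two behaviours described in the article, automatic numbering and Cascade Delete Related Records, can be reproduced in a few lines for experimentation. This is a minimal sketch under stated assumptions: Python's sqlite3 module stands in for Access, the table and column names are adapted from the article, and the owner names are invented because the article's Clients data is not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys on request
conn.executescript("""
CREATE TABLE clients (
    owner_nr   INTEGER PRIMARY KEY AUTOINCREMENT,   -- plays the role of AutoNumber
    owner_name TEXT
);
CREATE TABLE pets (
    patient_no   INTEGER PRIMARY KEY AUTOINCREMENT,
    owner_code   INTEGER NOT NULL
                 REFERENCES clients(owner_nr) ON DELETE CASCADE,  -- cascade delete
    animal_type  TEXT,
    patient_name TEXT
);
""")

# Owner names are invented; the pets follow the table in the article.
conn.executemany("INSERT INTO clients (owner_name) VALUES (?)",
                 [("Owner A",), ("Owner B",), ("Owner C",)])
conn.executemany(
    "INSERT INTO pets (owner_code, animal_type, patient_name) VALUES (?, ?, ?)",
    [(2, "Cat", "Peaches"), (1, "Dog", "Sam"), (3, "Horse", "Dobbin"), (3, "Cat", "Ginger")],
)

# Deleting client 3 also deletes that client's pets (Dobbin and Ginger).
conn.execute("DELETE FROM clients WHERE owner_nr = 3")
remaining_pets = [row[0] for row in
                  conn.execute("SELECT patient_name FROM pets ORDER BY patient_no")]

# The reverse does not hold: removing a pet leaves its owner in place.
conn.execute("DELETE FROM pets WHERE patient_name = 'Sam'")
remaining_clients = conn.execute("SELECT COUNT(*) FROM clients").fetchone()[0]
```

After the first delete only Peaches and Sam remain in the pets table; after the second, both remaining clients are still present, one of them with no pets.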
11
You don’t need to create this SQL statement yourself. Instead, separately create a query that
combines the fields that you need in the usual way, using Design mode (mode création). Test that it
works, then display it in SQL mode. Copy the SQL SELECT statement that Access has generated and
use it to replace the SELECT statement in the Row Source mentioned above.
2.3. Some difficulties associated with forms and subforms and how to overcome them
The examples here are based on the following database structure:
This is a test form which shows how the ProductID combo box
displays values based on the CategoryID selected.
The content, the RowSource for the ProductID, is obtained
using the SQL statement:
SELECT distinct Products.ProductID,
Products.ProductName FROM Products WHERE
(((Products.CategoryID)=[forms]![frmComboTest]!
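The principle behind this row source, a product list filtered by the currently selected category, can be tried outside Access. The sketch below is an illustration under assumptions: Python's sqlite3 module stands in for Access, the sample products are invented, and the `?` parameter plays the role of the form control reference, which is cut off in the statement above and is not reconstructed here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Products (ProductID INTEGER PRIMARY KEY, ProductName TEXT, CategoryID INTEGER)"
)
# Invented sample rows, for illustration only.
conn.executemany(
    "INSERT INTO Products VALUES (?, ?, ?)",
    [(1, "Chai", 1), (2, "Chang", 1), (3, "Aniseed Syrup", 2)],
)


def product_row_source(category_id):
    """Rows the ProductID combo box would list for the selected category."""
    sql = (
        "SELECT DISTINCT ProductID, ProductName FROM Products "
        "WHERE CategoryID = ? ORDER BY ProductID"
    )
    return conn.execute(sql, (category_id,)).fetchall()
```

Each time the selected category changes, the query has to be run again with the new value so that the product list stays in step.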
3.2. Introduction
♦ The relational database has a mathematical basis in Set
Theory
♦ It is possible to exploit the mathematical basis for
relational database design to improve the quality of the
actual design. Normalisation is a formal technique for
ensuring that the right attributes appear on the right
entities
♦ Also called relational data analysis, the technique of
normalisation is based on a property of data called
dependency or functional dependency.
♦ Normalisation aims to yield a set of entities designed to
∗ Minimise data redundancy
∗ Avoid consistency problems
♦ Normalisation is a “Bottom up” technique
Instead of starting with a top-down analysis of user
requirements, this technique starts with the existing
situation: the technique examines business documents
as they are currently used in existing business
processes. From this, it induces the necessary
database entities. For example, the starting point
might be an existing purchase order form. As we saw
above in section 2, normalisation enables us to deduce
the need for several entity types, including purchase
order, supplier, product and order detail line.
♦ It is applied to attributes discovered on paper and
computer forms, viewed as a table (cf. Spreadsheet view)
3.4. Terminology
3.4.1. Records
Data tends to be held in groups of items - each individual item
of data is a field, and the group of fields constitutes a record.
3.4.2. Field names
Or attributes.
3.4.3. Keys
♦ Introduction
Before we can store details of (facts about) things in a
database, we need unique labels, that is, identifiers or
names, for the entities about which attributes are to be
stored. These identifiers, or keys, need to be chosen
with precision and consistency.
Candidate keys are possible labels / names /
identifiers.
Answer: one
This involves representing the data in the Purchase Order in the following
format:
Remove repeating groups, i.e. groups of data fields (or a single data field)
that may have multiple values for a single value of the key.
Set such groups up as a separate entity.
The key to this new entity will be a compound key comprising the original
key plus additional information to identify individual occurrences.
Applying this to the above example gives us the following:
We now have two entities, purchase order and purchase order detail.
There must be only one value per cell (row/column intersection) in the
entity. Put in another way, an entity is in first normal form (1NF) if there
are no repeating groups of attributes.
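The step to first normal form can be sketched in a few lines of code. This is a minimal illustration, not the worked example from this section: the purchase order below is invented, and the field names are hypothetical apart from part number, product name and packet size.

```python
# An invented un-normalised purchase order: the order lines form a repeating group.
unnormalised = {
    "order_no": 1001,
    "order_date": "2009-10-01",
    "lines": [
        {"part_number": "P1", "product_name": "Widget", "packet_size": 10, "quantity": 5},
        {"part_number": "P2", "product_name": "Gadget", "packet_size": 1, "quantity": 2},
    ],
}


def to_first_normal_form(order):
    """Split one order into a purchase order row and purchase order detail rows."""
    # Purchase order: everything except the repeating group.
    header = {key: value for key, value in order.items() if key != "lines"}
    # Purchase order detail: one row per line, keyed by the compound key
    # (order_no, part_number) = original key plus the line identifier.
    details = [{"order_no": order["order_no"], **line} for line in order["lines"]]
    return header, details


header, details = to_first_normal_form(unnormalised)
```

Each detail row now holds exactly one value per cell, and the pair (order_no, part_number) identifies it uniquely.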
Any attributes that are dependent on only a part of the key should be
removed and stored in their own entity along with the part-key on which
they depend.
Applying this rule to our example leads us to produce the following
representation:
Part Number
Product Name
Packet size
What was wrong with 1NF, and what have we gained by moving to 2NF?
The answer is that the 1NF representation contains unnecessary repetition
of "Part Description" and "Packet size" information for every part ordered.
The same part may be ordered many hundreds of times so that storing the
data in 1NF could represent a waste of disk space. More importantly, this
amount of redundancy in the way data is stored could lead to significant
update problems.
Another problem with our 1NF representation is that there is nowhere in the
database to store information about Parts which are not currently on order.
So to summarise, by normalising, we have discovered a third entity type,
which is going to be called something like Product, or Stock item.
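The move from 1NF to 2NF can be sketched in the same way. Again this is an illustration with invented data: product name and packet size depend only on the part number, which is just one part of the compound key (order number, part number), so they are moved to their own Product entity.

```python
# Invented 1NF detail rows: part P1 is ordered twice, so its name and
# packet size are stored twice.
details_1nf = [
    {"order_no": 1001, "part_number": "P1", "product_name": "Widget", "packet_size": 10, "quantity": 5},
    {"order_no": 1001, "part_number": "P2", "product_name": "Gadget", "packet_size": 1, "quantity": 2},
    {"order_no": 1002, "part_number": "P1", "product_name": "Widget", "packet_size": 10, "quantity": 7},
]


def to_second_normal_form(details):
    """Move attributes that depend only on part_number into a Product entity."""
    products = {}
    order_details = []
    for row in details:
        # Product: keyed by the part of the key these attributes depend on.
        products[row["part_number"]] = {
            "product_name": row["product_name"],
            "packet_size": row["packet_size"],
        }
        # Order detail keeps only what depends on the whole compound key.
        order_details.append({
            "order_no": row["order_no"],
            "part_number": row["part_number"],
            "quantity": row["quantity"],
        })
    return order_details, products


order_details, products = to_second_normal_form(details_1nf)
```

The description of each part is now stored once however often the part is ordered, and a part that is not currently on order can still have a row in the Product entity.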
To avoid the possibility of the database becoming inconsistent (with some
copies of the same data being updated whilst other copies are overlooked)
we would ideally like to store each piece of data only once. This is really
what normalisation is all about.
Applying this rule to our example would give the following representation:
Note: The items in CAPITALS above are suggested names for the entities now
identified
Again we should ask the question: "What's wrong with 2NF"?
2. Given the following table are the following statements true or false?
The entries marked * have values that have already been entered, i.e. *
represents 'ditto'.
(I) What problems might result from storing this data in a single table?
(II) Take the data in the file through to third normal form.
(III) Does the new file structure address all the problems identified
in (I)?
(IV) If the sales manager wanted to add the following data to the
files:
- Supplier Name for each item
- Name of store manager for each store
- Maximum quantity of each item to be stored in each
store
Where would the data be stored in your 3NF model? Would
any new tables be required?
4. The following table shows the breakdown of student marks on different courses
by assignment number. In this example we have a repeating group inside a
repeating group. For each course there is repeating student data and for each
student there is repeating assignment data.
2 | Entity Relationship Models | 54
3 | Normalisation | 32
2 | Prototype Implementation | 64
i) Take the data in this report through to 3NF. What are the benefits of
storing this data in third normal form?
5. The Natural Yoghurt Company sells many products. Each product is composed
of several raw ingredients that are supplied by various vendors. A particular
ingredient is always supplied by the same vendor; however a vendor may
supply more than one ingredient. The product line (product offering) is divided
up so that only one department is responsible for a particular product.
However, each department is responsible for more than one product. Each
manager manages exactly one department. The following data items must be
stored in the Natural Yoghurt Company’s database:
Derive an entity relationship model and a set of 3NF tables from the above
description.
4.1. Introduction
MS Visio is now available to ESC students via the Microsoft Developers’ Network
Academic Alliance (MSDNAA) Electronic Licence Management System (ELMS). You
should by now have received an email from e-academy telling you how you can benefit
from this scheme.
In order to create a drawing of a particular kind, you use both a template file and a
stencil file. These together tell Visio what kind of symbols can be used. The
equivalent terms in French are un modèle and un gabarit.
Microsoft Office Visio 2007 makes it easy for business and ICT professionals to
visualise, explore, and communicate complex information. Rather than complicated
text and tables that are hard to understand, you can use Visio diagrams that
communicate information at a glance. Instead of static pictures, you can create data-
connected Visio diagrams that display data, are easy to refresh, and dramatically
increase your productivity. You can use the wide variety of diagrams in Office Visio
2007 to understand, act on, and share information about organizational systems,
resources, and processes throughout an enterprise.
Office Visio 2007 is available in two stand-alone editions: Office Visio Professional
and Office Visio Standard. Office Visio Standard 2007 has the same basic
functionality as Visio Professional 2007 and includes a subset of its features and
templates. Office Visio Professional 2007 offers advanced functionality, such as data
connectivity and visualization features, that Office Visio Standard 2007 does not.
12
Please note : the zero-defect ideal is emphatically not expected in the work that you do for
assessment. Instead, we are aiming for “good enough”! This appendix is included only because of the
extremely useful technique it illustrates.
serious errors of style. The designer of the artefact should correct the problems
subsequently.
Keys to success in the use of structured walk-through include:
Correctly assembling the right group of colleagues.
Distributing the artefact to participants before the meeting.
Total concentration on the artefact itself, rather than the person -- individual
criticism should be avoided.
The meeting should be scheduled in advance and of fixed duration.
The benefits of structured walk-throughs can be summarised as:
The quality of the artefact is improved because more faults are found, and
because errors of style -- which can lead subsequently to errors of
interpretation by others -- are eliminated.
Misunderstandings of the original requirements are more likely to be
detected.
The earlier a problem is found with an artefact, the cheaper it will be to fix it.
But there are obvious problems in using this technique in an organisational culture that
is not collaborative and supportive.