
Software Engineering

Computer Science & Engineering


(Software Engineering)
SRS and its Specification
The fundamental problem of S/W Engg. is the
problem of scale
As the scale changes to more complex and
larger s/w systems, new problems occur
This leads to redefining the priorities of the
activities that go into developing s/w
Software requirements is one such area to which little importance was attached in the early days of s/w development
More emphasis was on design & coding since it
was assumed that the developers understood
the problem clearly when it was explained to
them
Introduction
As systems became more complex, it became clear that the system can't be easily understood
Hence the need for a more rigorous requirements phase arose
Now for large s/w systems requirements
analysis is perhaps the most difficult
activity
It is also very error-prone
Many s/w engineers believe that this
critical area is the weakest!!
IEEE defines requirements as:
1. A condition or capability needed by the user to solve a problem or achieve an objective
2. A condition or a capability that must be
met or possessed by a system to satisfy
a contract, standard, specification or
other formally imposed document
Software Requirements
It is to be noted that in the case of software, requirements are the capabilities that the system must have once it is developed
But since the system doesn't yet exist in any form, specifying its capabilities becomes all the more complicated
Regardless of how the requirements phase
proceeds, it ultimately ends with the SRS
SRS is a document that describes what the proposed s/w should do without describing how the s/w will do it
The basic goal of the requirements phase is to
produce the SRS


Need for SRS
The origin of most s/w is the need of a client
The s/w system itself is created by the developer
Finally, the completed system is used by the end users
It is imperative that the requirements for the system that
will satisfy the needs of the clients and the concerns of
the end users will have to be communicated to the
developer
But the problem is that the client usually doesn't understand s/w or the s/w development process
The developer doesn't understand the client's problem and the application area
The basic purpose of the SRS is to bridge this
communication gap
SRS is the medium through which the
client and the user needs are accurately
specified
Hence the SRS forms the basis of s/w
development
A good SRS should satisfy all the parties
This is something which is very hard to
achieve
It involves trade-offs and persuasion
Another important purpose of developing
an SRS is helping the clients understand
their own needs
Developing an SRS helps a client and the
users to:
Think
Interact
Discuss with others (including the requirement
analyst)
to identify the requirements
Advantages of SRS
1. An SRS establishes the basis for
agreement between the client and the
supplier on what the s/w product will do
This basis of agreement is frequently
formalized into a legal contract between
the client and the developer
Through the SRS, the client clearly
describes what it expects from the supplier
and the developer clearly understands
what should be the capabilities of the s/w

Without such an agreement, it is almost
guaranteed that once the development is
over, the project will have an unhappy
client
This will obviously lead to unhappy
developers
Client: Hey! There's a bug!!
Developer: No! It's a s/w feature!!
The reality is that even with such an
agreement, the client is frequently not
satisfied
2. An SRS provides a reference for
validation of the final product
The SRS helps the client determine if the s/w meets the requirements
Without a proper SRS:
There's no way a client can determine if the s/w being developed is what was ordered
There's no way a developer can convince the client that all requirements have been fulfilled

3. A high-quality SRS is a prerequisite to high-quality s/w
It is known that the primary forces driving a
project are:
Cost
Schedule
Quality
Consequently anything that has a favourable
effect on these factors is desirable
Boehm found that in some projects 54% of all
the errors were detected after coding and testing
was done
Out of these, 45% of the errors originated during the requirements and early design stages
That is, out of the total, approx. 25% errors
occur during requirements and early design
stages
Some other horror stories were:
More than 500 errors were found in an SRS
that was earlier approved
More than 250 errors were found in a previously reviewed SRS by stating the requirements in a structured manner and using tools to analyze the document
It is clear that many errors are made
during the requirements phase
An error in the SRS will most likely
manifest itself as an error in the final
system implementing the SRS

A wrong SRS will lead to an incorrectly
functioning system that will not satisfy the
client
Clearly, to get a high-quality s/w with fewer
errors, a high-quality SRS is a must

A high quality SRS reduces the development
cost
We know that the cost of fixing an error
increases almost exponentially as time
progresses
That is, a requirement error, if detected and removed after the system has been developed, can cost up to 100 times more than removing it during the requirements phase itself.
Phase                      Cost (person-hours)
Requirements               2
Design                     5
Coding                     15
Testing                    50
Operation & Maintenance    150

This table clearly shows that there can be a tremendous reduction in project cost by reducing the errors in the SRS

It is known that the requirements frequently change
Some of the changes are inevitable due to the changing needs and perceptions
But many changes are due to the requirements not being properly analyzed
Not enough effort was put into validating the requirements
With a high-quality SRS, requirement changes that come due to improperly analyzed requirements should be reduced considerably
It is also a fact that frequent changes in the SRS escalate the cost and severely disturb the project schedule
A well-written SRS will reduce the project cost in addition to improving its chances of finishing on schedule
Hence, the quality of the SRS impacts:
the customer & developer satisfaction
System validation
Quality of the final s/w
S/w development cost
All these points illustrate the importance of a
high-quality SRS
Unfortunately it is not a common practice:
due to lack of understanding of the role and the
importance of the SRS
To speed up development and cut costs by
eliminating non-essential activities
Non-essential activities here mostly mean any activity other than coding


As a result many s/w projects in the
industry start with a poor-quality SRS
This frequently leads to cost and schedule
overruns
This has led to the s/w industry acquiring a reputation for developing poor-quality products
Characteristics of an SRS
To properly satisfy the basic goals, an SRS
should have certain properties and should
contain different types of requirements
A good SRS is:
1. Correct
2. Complete
3. Unambiguous
4. Verifiable
5. Consistent
6. Ranked for importance and/or stability
7. Modifiable
8. Traceable
1. Correct: An SRS is correct if every
requirement included in the SRS
represents something required in the
final system
2. An SRS is complete if everything the s/w
is supposed to do and the responses of
the s/w to all classes of input data are
specified in the SRS
Correctness and completeness go hand-in-
hand
To ensure completeness, one has to detect
the absence of specifications

3. An SRS is unambiguous if and only if
every requirement stated has one and
only one interpretation
Ambiguity is often due to the use of natural language to write an SRS
Natural language is easy to understand
Formal specification languages can be used to avoid ambiguities
But they are difficult to read and understand, particularly for the users and the clients

4. An SRS is verifiable if and only if every
stated requirement is verifiable
A requirement is verifiable if there exists
some cost-effective process that can check
whether the final s/w meets that requirement
Unambiguity is essential for verifiability
Verification is often done through reviews
It implies that the SRS should be understandable by the developer, the client and the users
Understandability is extremely important as one of the goals of the requirements phase is to produce a document on which the client, the developer and the users can agree

5. An SRS is consistent if there is no requirement that conflicts with another
Terminologies may cause inconsistencies
There may be logical or temporal conflicts between
requirements
6. An SRS can be ranked for importance and/or
stability if for each requirement the importance
and the stability of the requirement are indicated
Generally, all requirements are not of equal importance; some are:
critical
Important
Desirable
Similarly some requirements are core requirements
that are not likely to change as time passes
Others are more dependent on time


7. An SRS is modifiable if its structure and
style are such that any necessary change
can be made easily while preserving
completeness and consistency
Writing an SRS is an iterative process
Even when the requirements of a system are
specified, they are later modified as the needs
of the clients change
Hence it should be easy to modify
But presence of redundancy is a major
hindrance to modifiability
8. An SRS is traceable if the origin of each of its requirements is clear and if it facilitates the referencing of each requirement in future development
Forward traceability means that each
requirement should be traceable to some
design & code elements
Backward traceability requires that it is
possible to trace design and code elements to
the requirements they support
Traceability aids in verification and validation
Conclusion
Of all the above characteristics, completeness is
the most important and hardest to ensure
One of the most common problems in requirements specification is that some of the requirements of the client are not specified
This necessitates additions and modifications to
the requirements later in the development cycle
This is often expensive to incorporate
Incompleteness is a major source of
disagreement between the client and the
supplier

Structure of an SRS Document
All the requirements of the system have to be included in
a document that is clear and concise
For this it is necessary to organize the requirements
document as sections and subsections
There can be many ways to structure a requirements
document
Many stds. and methods have been proposed for
organizing an SRS
The main advantage of standardizing the structure of the
document is that with an available std., each SRS will fit
a certain pattern
This will make it easier for others to understand
This is also one of the roles of any std.
Another advantage that these stds. give is that by having the requirements in a std. way, they help ensure that the analyst does not forget some major property
The IEEE stds. recognize that different projects
may require their requirements to be organized
differently
That is, there is no one method that is suitable to
all projects
Thus IEEE provides different ways of structuring the SRS
The first two sections of the SRS are the same for all of them
The organization of the SRS as proposed in the
IEEE guide to SRS is given in the next slides

1.Introduction
1.1 Purpose
1.2 Scope
1.3 Definitions, Acronyms & Abbreviations
1.4 References
1.5 Overview
2. Overall Description
2.1 Product Perspective
2.2 Product Functions
2.3 User Characteristics
2.4 General Constraints
2.5 Assumptions & Dependencies

3. Specific Requirements
3.1 External Interface Requirements
3.1.1 User Interfaces
3.1.2 Hardware Interfaces
3.1.3 Software Interfaces
3.1.4 Communication Interfaces
3.2 Functional Requirements
3.2.1 Mode 1
3.2.1.1 Functional Requirement 1.1
..
..
3.2.1.n Functional Requirement 1.n
..
..

3.2.m Mode m
3.2.m.1 Functional Requirement m.1
..
..
3.2.m.n Functional Requirement m.n
3.3 Performance Requirements
3.4 Design Constraints
3.5 Attributes
3.6 Other Requirements

Introduction Section:
Contains the purpose, scope, overview, etc.,
of the requirements document
It also contains the references cited in the
document and any definitions that are used
Overall Description:
Describes the general factors that affect the
product and its requirements
Specific requirements are not mentioned
A general overview is presented that make
the understanding of the specific
requirements easier

Product Perspective: essentially the relationship of the product to other products is defined
It defines if the product is independent or a part of
a larger product and what would be the main
interfaces of the product
Product Functions: it provides a general
abstract description of the functions to be
performed by the product
Schematic diagrams showing a general view of the different functions and their relationships with each other are also useful
User Characteristics: End user skill level etc.
General Constraints: Any other conditions that may constrain the product


Specific Requirements:
Describes all the details that the s/w
developer needs to know for designing and
developing the system
This is typically the largest and the most
important part of the document
This section may have different organizations
The requirements can be organized by:
The modes of operation
User class
Object
Feature
Stimulus
Functional hierarchy
External interface: specifies all the interfaces of the
s/w to:-
The people
Other software
Hardware
Other systems
User interface: specifies each human interface the system plans to have; it includes:
Screen formats
Content of the menus
Command structure
Hardware interface: the logical characteristics of each
interface between the s/w and the h/w on which the
s/w can run
Essentially any assumptions that the s/w is making about the
h/w are to be listed here



S/w Interface: all other s/w that is needed for
this s/w to run is specified here along with the
interfaces
Communication Interface: specifies if the s/w
communicates with other entities in other
machines

The design process often has 2 levels:
At the 1st level, the focus is on deciding:
Which modules are needed for the system
The specifications of these modules
How these modules should be interconnected
This is called the system design or top-level design
At the 2nd level, the focus is on:
The internal design of the modules
How the specifications of the module can be
satisfied
This is called the detailed design or logic design

The detailed design essentially expands the
system design to contain more detailed
description of the processing logic and data
structures so that the design is sufficiently
complete for coding.
Since the detailed design is an extension of the
system design, the system design controls the
major structural characteristics of the system
The system design has a major impact on the
testability and modifiability of the system
It also impacts the efficiency
Much of the design effort for designing s/w is
spent creating the system design
A design methodology is a systematic approach
to creating a design by applying of a set of
techniques and guidelines
These guidelines are not formalized and do not
reduce the design activity to a sequence of steps
that can be followed by the designer
The input to the design phase is the
specifications for the system to be designed
It is important that the specifications are:
Stable
Have been approved
Complete
Consistent
Unambiguous etc
The output of the top-level design phase is
the architectural design or the system
design for the s/w to be built
A reasonable exit criteria for the phase
could be that the design has been verified
against the input specifications and has
been evaluated and approved for quality

Object-oriented or Function oriented
A design can be:
Object oriented
Function oriented
In function-oriented design, the design consists of
Module definitions, with each module supporting
a functional abstraction
In object-oriented design, the modules in the
design represent data abstraction
In the function oriented design approach, a
system is viewed as a transformation function,
transforming the inputs to the desired outputs
Since the purpose of the design phase is to
specify the components, each component is also
viewed as a transformation function

The most commonly used function
oriented design methodology is known as
the structured design methodology
The basic output of the system design
phase when a function oriented design
approach is being followed is:
The definition of all the major data structures
in the system
All the major modules of the system
How the modules interact with each other
Design Principles
The design of a system should be:
Correct
Verifiable
Complete
Traceable
Efficient: makes proper use of scarce resources
Simple: makes the job of maintenance easier, faster and cost-effective
These criteria are not independent and
increasing one may affect the other
Thus the design approach frequently requires
tradeoffs
Its the job of the designer to achieve a balance



But for most purposes, simplicity is the
primary property
Thus the objective of the design process is
to produce designs that are simple to
understand
But creating a simple design of a large
system can be an extremely complex task
It requires good engg. judgment
It is a creative activity and can't be reduced to a series of steps, though guidelines can be provided

The principles used in the design
process are the same used in problem
analysis
In fact, the methods are also similar
since both are concerned with
constructing models
But there are some fundamental
differences
1. In the problem analysis the model is made for the problem
domain while in design a model is made for the solution
domain
2. In problem analysis, the analyst has limited degrees of
freedom in selecting the models as the problem is given and
modeling has to represent it
In design, the designer has a great deal of freedom in deciding the models, as the system that the designer is modeling doesn't exist
In fact, the designer is creating a model for the system that will
be the basis of building the system
It simply means that in design, the system depends on the
model, while in problem analysis, the model depends on the
system
3. The basic aim of modeling in the problem domain is to understand, while the basic aim of modeling in design is to optimize (in most cases, for simplicity and performance)
It follows that though the basic principles and techniques look similar, the activities of analysis and design are very different.

Problem Partitioning and Hierarchy
Dividing a problem into manageable small
pieces that can be solved separately is a goal of
s/w design
The reason behind this is that if the pieces of the problem are solvable separately, the cost of solving the entire problem is less than the cost of solving it as a whole
But the different pieces can't be solved completely independently of each other, as together they solve the larger problem
This adds a complexity that was not there in the original problem

As the components increase, the cost of
partitioning together with the added complexity
may become more than the savings achieved by
partitioning
Thus the designer has to decide when to stop
partitioning
Partitioning helps in:
Understandability
Maintainability
Modifiability
The design process should support as much
independence as possible
But total independence is not possible
Dependence between modules is one of the
reasons for high maintenance costs


Proper partitioning will make the system easier to maintain and the design easier to understand
Problem partitioning also aids design verification
Problem partitioning, which is essential for
solving a complex problem, leads to hierarchies
in the design
That is, the design produced by using problem
partitioning can be represented as a hierarchy of
components
The relationship between the elements in this
hierarchy can vary depending on the method
used

Abstraction
It is a very powerful concept that is used in all
engg. disciplines
It is a tool that permits a designer to consider a
component at an abstract level
This means that the designer doesn't have to worry about the implementation details of the component
An abstraction of a component describes the
external behaviour of the component without
bothering about the internal details that produce
the behaviour
This means that the abstract definition of a
component is much simpler than the component
itself

Abstraction is essential for problem
partitioning
It is used to determine the components of
the system
It can be used for existing components as
well
There are 2 common abstraction
mechanisms for s/w systems:
Functional abstraction: a module is specified
by the function it performs
Data abstraction: a module is specified by the
data object it generates

Functional abstraction:
Is the basis of partitioning in function-oriented
approaches
When the problem is being partitioned, the
overall function for the system is partitioned
into smaller functions that comprise the
system
That is, the decomposition of the system is in
terms of functional modules

Data abstraction:
Here data is treated as objects with some
predefined operations on them
These operations are the only operations that
can be performed on those objects
These operations depend on the object and
the environment in which it is used
From outside of an object, the internals are
hidden, only the operations on the object are
visible
Data abstraction forms the basis for object-oriented design
Here the system is viewed as a set of objects, each providing some services
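As a small illustration of data abstraction (a hypothetical stack module, not an example from the slides), the internal representation is hidden and only the defined operations are visible from outside:
```python
# Hypothetical sketch of data abstraction: the internal representation
# (a Python list) is hidden; clients only use the operations defined
# on the object.
class Stack:
    def __init__(self):
        self._items = []              # internal detail, not part of the abstraction

    def push(self, item):
        self._items.append(item)      # operation visible to the outside

    def pop(self):
        return self._items.pop()      # operation visible to the outside

    def is_empty(self):
        return len(self._items) == 0


s = Stack()
s.push(10)
s.push(20)
print(s.pop(), s.is_empty())          # 20 False
```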
Modularity
The real power of partitioning comes if a
system is partitioned into modules so that
the modules are solvable and modifiable
separately
It is even better if the modules are
compilable separately
A system is considered to be modular:
If it consists of discreet components so that
each component can be implemented
separately
Change to one component has minimal
impact on other components
Modularity is a desirable property of the system
It helps in system debugging
It helps in system building
But a system can't be made modular by simply chopping it into a set of modules
Each module needs to support a well-defined
abstraction and a clear interface through which it
can interact with other modules
Modularity is where abstraction and partitioning
come together
For easily understandable and maintainable
systems, modularity is a basic objective
Partitioning and abstraction are the concepts to
achieve modularity
Top-down and Bottom-up Strategies
A system consists of components which may
have their own components
Thus it is a hierarchy of components
The highest level component corresponds to the
total system
To design such a hierarchy, there are 2
approaches:
Top-down: starts from the highest level component of
the hierarchy and proceeds to the lower level
Bottom-up: starts from the lowest-level component of
the hierarchy to progressively higher levels to the top-
level component



Top-down approach
Starts by identifying the major components
of the system, decomposing them into
lower-level components and iterating until
some desired level of detail is achieved
This often results in stepwise refinement
Starting from an abstract design, in each step the design is refined to a more concrete level, until a level is reached where no more refinement is needed and the design can be implemented directly
But this approach is suitable only:
If the specifications of the system are clearly
known
The system development is from scratch
Hence, it is a reasonable approach if a
waterfall type of process model is being
used
However if it is to be built from an existing
system, a bottom-up approach is more
suitable
Bottom-up Approach
Starts with designing the most basic
components and proceeds to higher-level
components that use their lower-level
components
Starting from the very bottom, operations that
provide a layer of abstraction are implemented
The operations of this layer are then used to implement more powerful operations and a still higher layer of abstraction
This is done until a stage is reached where the
operations supported by the layer are those
desired by the system
This approach is more suitable if a system
is to be built from an existing system
The components from the existing system
can be used to make the new system
This approach is suitable if an iterative
enhancement type of process is being
followed
Conclusion
Pure top-down and bottom-up strategies
are often not practical
A common approach is to combine the two
approaches
This is done by providing a layer of
abstraction for the application domain of
interest through libraries of functions
This contains the functions of interest to
the application domain

Then the top-down approach can be used to
determine the modules in the system, assuming
that the abstract machine available for
implementing the system provides the
operations supported by the abstract layer
This is the approach used for developing
systems
Almost universally these days, most
developments make use of the layer of
abstraction supported in a system consisting of
the library functions provided by the OS,
programming languages and special-purpose
tools
Module Level concepts
These concepts are specific to function-oriented
design
A module is a logically separable part of a program
It is a program unit that is discrete and identifiable with respect to compiling and loading
In terms of programming language constructs, a module can be a:
Macro
Function
Procedure (or a subroutine)
Process
Package
In systems using functional abstraction, a module is usually a procedure or function, or a collection of these

To produce modular designs, some criteria
must be used to select modules so that
the modules:
support well defined abstractions
are solvable
are modifiable separately
In systems using functional abstraction,
there are 2 modularization criteria which
are often used together
coupling
cohesion
Coupling
The notion of coupling attempts to capture
how strongly different modules are
interconnected
It is also a measure of interdependence
among modules
Highly coupled modules are joined by strong
interconnections
Loosely coupled modules have weak
interconnections
Independent modules have no
interconnections


To solve and modify a module separately,
a module should be loosely coupled with
other modules
The choice of modules decides the
coupling between modules
Since the modules are created during
system design, the coupling between
modules is largely decided during system
design and cant be reduced during
implementation

It is an abstract concept and is not easily
quantifiable
But some major factors can be identified as
influencing coupling between modules
The most important among them are:
1. Type of connections between modules
2. The complexity of the interface
3. The type of information flow between modules
Coupling increases with complexity and
obscurity of the interface between the modules
To keep coupling low the above factors should
be low


Interface: it is used to pass information to
and from other modules
Coupling is reduced if only the defined
entry interface of a module is used by
other modules
For e.g: passing information to and from a module exclusively through parameters
Coupling would increase if a module is
used by other modules via an indirect
interface
For e.g: directly using the internals of a
module or using shared variables
Complexity is another factor that affects
coupling
The more complex the interface, the
higher will be the degree of coupling
For e.g: if a field of a record is needed by a procedure, often the entire record is passed, rather than just that field
This increases the coupling
The interface should be as simple and
small as possible
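A minimal sketch of the two situations above (the employee record and field names are invented for illustration): passing only the needed field keeps the interface small, while passing the whole record, or relying on a shared variable, increases coupling:
```python
# Low coupling: only the data item actually needed crosses the defined
# parameter interface.
def annual_salary(monthly_salary):
    return 12 * monthly_salary

# Higher coupling: the whole record is passed, so this module now depends
# on the structure of the record; using a global/shared variable instead
# of parameters would make the interface even more obscure.
def annual_salary_from_record(employee):
    return 12 * employee["monthly_salary"]

employee = {"name": "A", "monthly_salary": 3000}
print(annual_salary(employee["monthly_salary"]))   # 36000
print(annual_salary_from_record(employee))         # 36000
```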

The type of information flow along the interface
also affects coupling
There are 2 types of information that can flow
along an interface:
Data
Control
Passing or receiving control information means
the action of the modules will depend on this
control information
Transfer of data information means that a
module passes as input some data to another
module and gets some data as output
This allows a module to be treated as a simple
input-output function that performs some
transformation on the input data to produce the
output data


In general, interfaces with only data
communication result in the lowest degree of
coupling
The next higher degree of coupling is when there
is transfer of control data
Coupling is highest if the data is hybrid
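A small hypothetical illustration of the difference: the first module only exchanges data with its caller, while the second also receives control information (a flag) that decides what it does:
```python
# Data communication only: the module is a simple input-output function.
def total(prices):
    return sum(prices)

# Control communication: the caller passes a flag telling the module
# which action to perform, so the caller depends on the module's internals.
def summarize(prices, operation):
    if operation == "total":
        return sum(prices)
    if operation == "average":
        return sum(prices) / len(prices)

print(total([10, 20, 30]))                 # 60
print(summarize([10, 20, 30], "average"))  # 20.0
```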

Coupling   Interface complexity     Type of connection       Type of communication
Low        Simple, obvious          To module by name        Data
                                                             Control
High       Complicated, obscure     To internal elements     Hybrid
Cohesion
Another way of reducing the coupling among
modules is to strengthen the bond between
elements of the same module
This is done by maximizing the relationship
between elements of the same module
Cohesion is a concept that tries to capture this
intra-modular behaviour
Cohesion determines how closely the elements
of a module are related to each other or how
tightly bound are the internal elements of the
module to each other



it gives an idea to the designer about whether
the different elements of a module belong
together
Cohesion and coupling are related
Usually, the greater the cohesion of each module
in the system, the lower the coupling between
the modules
There are several levels of cohesion (from lowest to highest):
Coincidental
Logical
Temporal
Procedural
Communicational
Sequential
Functional
These levels do not form a linear scale
Sometimes two levels of cohesion are
applicable between elements of a module
In such cases, the higher level is
considered

Coincidental: occurs when there is no
meaningful relationship among the
elements of a module
It occurs when a program is modularized by chopping it into pieces and creating different modules
Mostly occurs when a module is created to
avoid duplicate code
It leads to strong interconnections between
modules



Logical: A module has a logical cohesion if
there is some logical relationship between
the elements of module and the elements
perform functions that fall in the same
logical class
A module with this type of cohesion often leads to hybrid information flow, which leads to the worst form of coupling
Such modules also have clumsy or tricky
code
In general, logically cohesive modules
should be avoided, if possible
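A hypothetical sketch of a logically cohesive module: all the elements fall in the same logical class ("produce output"), and a flag from the caller selects which one runs, giving exactly the hybrid/control information flow mentioned above:
```python
# Logically cohesive module: several kinds of output lumped together,
# selected by a flag passed in by the caller (control coupling).
def write_output(data, kind):
    if kind == "screen":
        print(data)
    elif kind == "file":
        with open("out.txt", "w") as f:   # file name is illustrative only
            f.write(str(data))
    elif kind == "report":
        print("=" * 20)
        print(data)
        print("=" * 20)

write_output("hello", "screen")

# A more cohesive alternative is one small functionally cohesive module
# per output form, e.g. write_to_screen(data) and write_to_file(data).
```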
Temporal: is the same as logical cohesion
except that the elements are also related
in time and are executed together
Modules that perform activities like
initialization, termination and clean-up
are usually temporally bound
Since all the elements are executed
together, there is no need of passing a flag
and the code is thus simpler

Procedural: contains elements that belong
to a common procedural unit
For e.g: a loop or a sequence of decision
statements in a module may be combined
together to form a separate module
Such modules occur when modular
structure is determined from some form of
flowchart
Such cohesion often cuts across functional
lines
Such a module may contain only a part of
a complete function or parts of several
functions
Communicational: occurs when a module
has elements that are related by a
reference to the same input or output data
That is, in a communicationally bound
module, the elements are together
because they operate on the same input or
output data
For e.g: a module to print a record

Sequential: When elements are together in
a module because the output of one forms
the input to the other
Many possibilities exist:
Combine all in one module
Put the 1st half in one module and the 2nd half in another, and so on
Sequentially cohesive modules bear a
close resemblance to the problem structure
Functional: is the strongest cohesion
All the elements of the module are related
to performing a single function
Functions do not mean mathematical
function but modules accomplishing a
single goal
Functions like "compute square root" and "sort the array" are examples of functionally cohesive modules

Conclusion
There is no mathematical formula to
determine the cohesion level of a module
But a useful technique is to write a
sentence that describes, fully and
accurately the function or purpose of the
module
The following tests can be made:
1. If the sentence is a compound sentence, if it
contains a comma, or it has more than one
verb, the module is performing more than
one function, then it probably has sequential
or communicational cohesion
2. If the sentence contains words relating to time, like "first", "next", "when" and "after", the module probably has sequential or temporal cohesion
3. If the predicate of the sentence doesn't contain a single specific object following the verb (such as "edit all data"), the module probably has logical cohesion
4. Words like "initialize" and "clean up" imply temporal cohesion

Modules with functional cohesion can
always be described by a simple sentence
However, if a description is a compound sentence, it doesn't mean that the module doesn't have functional cohesion
Functionally cohesive modules can also be described by compound sentences
If a module can't be described using a simple sentence, the module is not likely to have functional cohesion

Design notation and specification
While designing, a designer needs to record his
thoughts and decisions and to represent the
design so that he can view it and play with it
For this design notations are used
Design notations are largely meant to be used during the process of design and are used to represent design decisions
They are meant largely for the designer so that
he can quickly represent his decisions in a
compact manner that he can evaluate and
modify
These notations are largely graphical
Once the designer is satisfied with the design he
has produced, the design is to be precisely
specified in the form of a document
To specify the design, specification languages
are used
Producing the design specification is the ultimate
objective of the design phase
The purpose of this design document is quite
different from the design notation
Whereas a design represented using the design
notation is largely used by the designer, a design
specification has to be so precise and complete
that it can be used as a basis of further
development by the programmers
Generally design specifications use textual
structures with the design notation helping
understanding
The design notation uses structure charts
to represent function-oriented design
Then a design language is used to specify
a design
Structure Charts
For a function-oriented design, the design can
be represented graphically by structure charts
The structure of a program is made up of
modules of that program together with the
interconnection between the modules
Every computer program has a structure and
given a program, its structure can be determined
The structure chart of a program is a graphic
representation of its structure
In a structure chart, a module is represented by
a box with the module name written in the box

An arrow from module A to module B represents
that module A invokes module B
B is called the subordinate of A
A is called the super ordinate of B
The arrow is labeled by the parameters received
by B as input and the parameters returned by B
as output
The direction of the flow of the input and the
output parameters are represented by small
arrows
The parameters can be shown to be:
data (unfilled circle at the tail of a label)
Control (filled circle at the tail)
[Figure: structure chart of the sort-and-add program - Main invokes read_nums (which returns a, n), sort (which receives a, n, returns the sorted a, and in turn invokes switch with x, y) and add_n (which receives a, n and returns sum)]
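The same structure can be sketched as code; this is an assumed minimal implementation (the slides only give the chart), with Main acting as the coordinate module and the labelled parameters becoming arguments and return values:
```python
# read_nums: input module - delivers the array a and its size n to Main
def read_nums():
    a = [int(x) for x in input("Enter numbers: ").split()]
    return a, len(a)

# switch: subordinate of sort - exchanges two values x, y
def switch(x, y):
    return y, x

# sort: transform module - receives a, n and returns the sorted array
def sort(a, n):
    for i in range(n):
        for j in range(n - i - 1):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = switch(a[j], a[j + 1])
    return a

# add_n: transform module - receives a, n and returns the sum
def add_n(a, n):
    return sum(a[:n])

# Main: coordinate module - invokes the subordinates and routes the data
def main():
    a, n = read_nums()
    a = sort(a, n)
    print("sorted:", a, "sum:", add_n(a, n))

if __name__ == "__main__":
    main()
```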
[Figure: iteration and decision representation - module A with subordinates B, C and D, shown once with a looping arrow (iteration) and once with a diamond in A (decision)]
If module A repeatedly calls the modules
C and D then it is represented by a looping
arrow around the arrows joining the
subordinates C and D to A
All the subordinate modules activated
within a common loop are enclosed in the
same looping arrow
If the invocation of modules C and D in
module A depends on the outcome of
some decision, that is represented by a
small diamond in the box for A with the
arrows joining C and D coming out of this
diamond
Modules in a system can be categorized into a few classes
There are some modules that obtain information
from their subordinates and then pass it to their
super ordinates
This kind of module is an input module
Similarly there are output modules
These take information from their super
ordinates and pass it on to their subordinates
Some modules are called transform modules
These modules transform data from one form
into another
Most of the computational modules fall under
this category
[Figure: different types of modules - an input module (delivers data to its superordinate), an output module (receives data from its superordinate), a coordinate module, a transform module (transforms x into y), and a composite module that gets x from one subordinate, has it converted into y, and feeds y to its superordinate]
Finally, some modules manage the flow of data
to and from different subordinates
Such modules are called coordinate modules
A module can perform function of more than one
type of module
Such a module is called a composite module
For e.g: in the previous fig. the composite
module is an input module from the point of view
of its super ordinate as it feeds data y to the
super ordinate
Internally it is also a coordinate module since it
gets data x from one subordinate and passes it
on to another subordinate which converts it into
y
Modules in actual systems are often composite
modules
Conclusion
A structure chart is a nice representation
mechanism for a design that uses functional
abstraction
It shows:
The modules
Their call hierarchy
The interfaces between the modules
What information passes between the modules
It is a convenient and compact notation that is
very useful while creating the design
A designer can make effective use of structure
charts to represent the model he is creating
while he is designing

However it is not very useful for representing the final design since it doesn't give all the information needed about the design, such as:
The scope
Structure of data
Specifications of each module, etc
Design Document Specification
Using some design rules or methodology, a
conceptual design of the system can be
produced in terms of a structure chart
This is all right while the designer is designing
but inadequate when the design is to be
communicated to other programmers or
archived for future reference
To avoid these problems, the design needs to
be formally specified
The design specification is then communicated
in the design document
It is used as a reference for the design phase for
later activities like detailed design or
implementation
As the design document is the means by which the design is communicated, it should contain all information relevant to future phases
It should contain:
Problem specification
Major data structures
Modules and their specifications
Design decisions
The requirements document specifies the
problem in the terminology of the problem
domain
Often, before starting design, the requirements
are translated into a specification of a problem
more suited for design purposes
Thus problem restatement is the 1st step in the design process
During system design, the major data
structures for the s/w are identified
Without these, the system modules can't be meaningfully defined during design
In the design specification, formal definition
of these data structures should be given


Module specification is the major part of
system design specification
All modules in the system should be
identified when the system design is
complete and these modules should be
specified in the document
During system design only the module
specification is obtained because the
internal details of the modules are defined
later

To specify a module, the design document
must specify:
The interface of the module (all data items,
their types, whether they are for input and/or
output)
The abstract behaviour of the module (what the module does) by specifying the module's functionality or its input/output behaviour
all other modules used by the module being
specified
This information is quite useful in
maintaining and understanding the design
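For instance, a module of the word-counting system discussed later might be specified at the design level roughly as follows (the format and names are illustrative, not a prescribed specification language):
```python
# Illustrative design-level specification of one module; only the
# interface, abstract behaviour and used modules are given here,
# the internal logic is left to detailed design.
def count_different_words(sorted_word_list):
    """Interface : sorted_word_list - list of words, sorted, input only
                   returns count    - integer, output only
    Behaviour : returns the number of distinct words in sorted_word_list
    Uses      : no other modules
    """
    raise NotImplementedError("internals are supplied during detailed design")
```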
Hence, a design specification will necessarily
contain specification of the major data structures
and modules in the system
After a design is approved (using some
verification mechanism), the modules will have to
be implemented in the target language
This requires that the module headers for the
target language first be created from the design
The translation of the design to the target language can introduce errors if it's done manually
To eliminate these errors, if the target language is known, it's better to have a design specification language whose module specifications can be used directly in programming
This not only minimizes the translation errors that
may occur but also reduces the effort required for
translating the design to programs
This provides an added incentive to the designers
that the design document will not be thrown away
after review but will be used directly in coding
To aid in understanding of the design, all major
design decisions made by the designers during
the design process should be explained
thoroughly
The choices that were available and the reasons
for making a particular choice should be
explained
This makes the design more visible
Structured Design Methodology
Creating the s/w system design is a major
concern of the design phase
The aim of the design methodologies is
not to reduce the process of design to a
sequence of steps
But it is to provide guidelines to aid the
designer during the design process
The SDM is one of the design
methodologies for developing system
designs
SDM views every s/w system as having
some inputs that are converted into
desired outputs by the s/w system
The s/w is viewed as a transformation
function that transforms the given inputs
into the desired outputs
The central problem of designing s/w systems is properly designing this transformation function
Due to this view, the SDM is primarily
function oriented
Thus it relies heavily on functional
abstraction and functional decomposition
The concept of the structure of a program lies at the heart of the SDM
During design, SDM aims to control and
influence the structure of the final program
The aim is to design a system so that the programs implementing the design:
have a hierarchical structure
have functionally cohesive modules
have as few interconnections between modules as possible





In a properly designed system, it is often the
case that a module with subordinates does not
actually perform much computation
The bulk of actual computation is performed by
its subordinates
The module itself largely coordinates the data
flow between the subordinates to get the
computations done
The subordinates in turn get the bulk of their
work done by their subordinates until the
atomic modules which have no subordinates
are reached
Factoring is a process of decomposing a module
so that the bulk of its work is done by its
subordinates
A system is said to be completely factored if all
the actual processing is accomplished by
bottom-level atomic modules and if the non-
atomic modules largely perform the jobs of
control and coordination
SDM attempts to achieve a structure that is
close to being completely factored
The overall strategy is to identify the input and
the output streams and the primary
transformations that have to be performed to
produce the output
High-level modules are then created to perform
these major activities which are later refined
There are 4 major steps in this strategy:
1. Restate the problem as a data flow diagram
2. Identify the input and output data elements
3. First-level factoring
4. Factoring of input, output, and transform
branches

1. Restate the problem as a DFD
The 1st step is to construct the DFD
DFDs can be drawn during requirement analysis
and during structured design
In the requirements analysis, a DFD is drawn to
model the problem domain
During design activity, the DFD represents how
the data will flow in the system when it is built
In this modeling, the major transforms or functions in the software are decided
The DFD shows the major transforms that the
s/w will have and how the data will flow through
the different transforms
DFD
Data flow diagrams (also called data flow
graphs) are useful in understanding a system
and can be effectively used in analysis and
design
It views a system as a function that transforms
the inputs into the desired outputs
Any complex system will not perform this
transformation in a single step but data will
typically undergo a series of transformations
before it becomes the output
A DFD aims to capture the transformations that
take place within a system to the input data so
that eventually the output data is produced
The agent that performs the
transformation of data from one state to
another is called a process or a bubble
The processes are shown by named
circles and data flows are represented by
named arrows entering or leaving the
bubbles
A rectangle represents a source or sink and is an originator or consumer of data
A source or a sink is typically outside the
main system of study
[Figure: DFD notation - a named circle (e.g. Sort) is a process, a named arrow (e.g. x, y) is a data flow, a rectangle (e.g. Worker) is an originator/consumer of data, * and + mark multiple data flows (* = AND, + = OR), and a labeled store symbol (e.g. Employee Record) denotes an external file]
Most of the systems are too large for a single
DFD to describe the data processing clearly
It is necessary that some decomposition and
abstraction mechanism be used for such
systems
DFDs can be hierarchically organized which
helps in progressively partitioning and analyzing
large systems
Such DFDs are called leveled DFD set
A leveled DFD set has a starting DFD which is a
very abstract representation of the system
It only identifies the major inputs, outputs and
the major processes in the system
Then each bubble is refined and a DFD drawn
for the process i.e. a bubble in a DFD is
expanded into a DFD during refinement
For the hierarchy to be consistent, it is important
that the net inputs and outputs of a DFD for a
process are the same as the inputs and outputs
of the process in the higher level DFD
This refinement stops when each bubble is considered to be atomic, i.e. each bubble can be easily understood
It is to be noted that during refinement, though the net input and output are preserved, a refinement of the data might also occur
That is, a unit of data may be broken into its
components for processing when the detailed
DFD for a process is being drawn
So, as the processes are decomposed, data
decomposition also occurs
Some points to remember:
A DFD is not a flowchart
A DFD represents the flow of data while a flowchart represents the flow of control
A DFD doesn't represent procedural information
While drawing a DFD, the procedural details and
thinking must be avoided
For e.g: considerations of loops and decisions
must be ignored
The designer of the DFD should only specify the
major transforms in the path of the data flowing
from the input to the output
How those transforms are performed is not an
issue while drawing the DFD
Following are some suggestions to draw a
DFD:
Work your way consistently from the inputs to
the outputs or vice versa. If you get stuck,
reverse direction
Start with a high-level data flow graph with a
few major transformations describing the
entire transformation from the inputs to the
outputs and then refine each transform with
more detailed transformations
Never try to show control logic. If you find
yourself thinking in terms of loops or
decisions, it is time to stop and start again


Label each arrow with proper data
elements. Inputs and outputs of each
transform should be carefully identified
Make use of * and + operations and show
sufficient detail in the data flow graph
Try drawing alternate data flow graphs
before settling down on one
Some common errors while drawing a
DFD are:
Unlabeled data flows
Missing data flows: information required by a
process is not available
Extraneous data flows: some information is
not being used in the process
Consistency not maintained during refinement
Missing processes
Contains control information
Unlabeled data flows:
It is the most common error
If an analyst can't label the data flow, it is likely that he doesn't understand the purpose and structure of that data flow
Missing data flow:
To check for this the analyst should ask:
Can the process build the outputs shown
from the given inputs?
Extraneous data flow:
Check for redundant data flows by asking:
Are all data flows required in the computation
of the output?

Consistency:
Can be easily lost if new data flows are added
to the DFD during modification
If such changes are made, appropriate
changes should be made in the parent or the
child DFD
If a new data flow is added in a lower-level
DFD, it should also be reflected in the higher-
level DFDs
If a data flow is added in a higher-level DFD,
the DFD for the processes affected by the
change should also be appropriately modified

Missing processes and control information:
The DFD should be carefully scrutinized to
make sure that all the processes in the
physical environment are shown in the DFD
It should also be ensured that none of the
data flows is actually carrying control
information
A data flow without any structure or
composition is a potential candidate for
control information
An example: Word counting
Problem: To determine the no. of different
words in the file

[Figure: levelled DFDs for the word-counting problem - Level 0 DFD: a single process "Determine the no. of different words" takes the input word file and produces the no. of words; Level 1 DFD: "Get word list" followed by "Count no. of different words"; Level 2 DFD: "Get word list", "Sort the list", "Count the no. of different words" and "Print the count", with the sorted word list marked mai and the count marked mao]
2. Identify the most abstract input
and output data elements
Sometimes the transformations cant be easily
applied to the actual physical input and produce
the desired physical output
Then it becomes necessary to convert the
inputs/outputs into a form on which the
transformation can be applied with ease
The goal of this step is to separate the
transforms in the DFD that convert the
input/output to the desired format from the ones
that perform the actual transformations
For this separation, once the DFD is ready the
next step is to identify the highest abstract level
of input/output
The most abstract input data elements (mai) are
those that are furthest removed from the
physical inputs but can still be considered inputs
to the system
MAI data elements are recognized by starting from the physical inputs and traveling towards the outputs in the DFD until data elements are reached that can no longer be considered incoming
The aim is to go as far as possible from the
physical inputs without losing the incoming
nature of the data element
MAO: Identified by starting from the outputs in
the DFD and traveling towards the inputs
These are data elements that are most removed
from the actual outputs but can still be
considered outgoing
The MAO data elements may also be
considered the logical data items that are to be
converted into a form that is required as the
output
There will usually be some transforms left
between the MAI and MAO data items
Central Transforms: performs the basic
transformation for the system taking the MAI and
transforming it into the MAO



[Figure: DFD for the word-counting problem with the mai and mao marked - input word file -> Get word list -> w.l. -> Sort the list -> sorted w.l. (mai) -> Count the no. of different words -> count (mao) -> Print the count]
The arcs in the DFD show the MAI and the
MAO
The choice for the MAI is done as:
Start with the input
The input file is converted into a word-list
which is essentially the input in a different
form
The sorted word-list is still basically the input,
as it is still the same list in a different order
This appears to be the MAI since the next
data(i.e. the count) is not just another form of
the input data
The MAO is chosen as:
Count is the natural choice:
All other data items are classified as inputs
Only one more transformation (printing the count) is left after it


3. First-Level Factoring
Having identified the central transforms, the MAI
and MAO, the next step is to identify some
modules for the system
First the main module is to be specified whose
purpose is to invoke the subordinates
The main module is therefore the coordinate
module
For each of the MAI data items, an immediate
subordinate module to the main module is
specified
Each of these modules is an input module,
whose purpose is to deliver to the main module
the most abstract data item for which it is
created

[Figure: first-level factoring of the word-counting problem - Main has three subordinates: the input module Get sorted word list (returns the sorted w.l.), the transform module Count no. of different words (receives the sorted w.l. and returns the count) and the output module Output count (receives the count)]
Similarly, for each MAO data item, a subordinate
module that is an output module that accepts
data from the main module is specified
Each of the arrows connecting these input and output subordinate modules is labeled with the respective abstract data item flowing in the proper direction
Finally, for each central transform, a module
subordinate to the main one is specified
These modules will be the transform modules
whose purpose is to accept data from the main
module and then return appropriate data back to
the main module
The data items coming to a transform module
from the main module are on the incoming arcs
of the corresponding transform in the DFD
The data items returned are on the outgoing
arcs of that transform
The structure after the first-level factoring of the
word-counting problem is shown
In this example, there is one input module which
returns the sorted word list to the main module
The output module takes the value of the count
from the main module
There is one central transform in this example,
and a module is drawn for that

4. Factoring the input, output and transform branches
The first-level factoring results in a very-high
level structure, where each subordinate module
has a lot of processing to do
To simplify these modules, they must be
factored into subordinate modules that will
distribute the work of a module
Each of the input, output and transform modules
must be considered for factoring
Lets consider the input modules first:
The purpose of an input module, as viewed by
the main program, is to produce some data
To factor an input module, the transform in the
DFD that produced the data item is now treated
as a central transform
The process performed for the first-level
factoring is repeated here with this new central
transform with the input module being
considered the main module
A subordinate input module is created for each
input data stream coming into this new central
transform
The new input modules now created can then be
factored again, until the physical inputs are
reached
Factoring of input modules will usually not yield
any output subordinate modules
[Figure: factoring the input module - Get sorted list invokes Get word list (which returns the w.l.) and Sort (which returns the sorted w.l.); Get word list in turn invokes Get a word and Add to word list]
The factoring of the input module get-sorted-list in the first-level structure is shown above
The transform producing the input returned by
this module (i.e. the sort transform) is treated as
a central transform
Its input is the word list
Thus in the first factoring we have an input
module to get the list and a transform module to
sort the list
The input module can be factored further, as the
module needs to perform 2 functions:
Getting a word
Adding it to the list
The looping arrow is used to show iteration
Factoring the output module:
The factoring of the output module is
symmetrical to the factoring of the input modules
For an output module, a search is made for the
next transform that can be applied to bring the
output closer to the ultimate output
This now becomes the central transform, and an
output module is created for each data stream
going on this transform
During the factoring of output modules, there will
always be no input modules
In this e.g. there is only one transform after the
MAO so this factoring need not be done

If the DFD of the problem is sufficiently detailed,
factoring of the input and the output modules is
straightforward
But there are no such rules for factoring the
central transforms
The goal is to determine sub-transforms that will
together compose the overall transform
Repeat the process for the newly found transforms, until atomic modules are reached
Factoring the central transform is essentially an exercise in functional decomposition and will depend on the designer's experience and judgment

Transform module:
One way to factor a transform module is to treat
it as a problem in its own right and start with a
DFD for it
The inputs to the DFD are the data coming into
the module and the outputs are the data being
returned by the module
Each transform in this DFD represents a sub-
transform of this transform
The central transform can be factored by
creating a subordinate transform module for
each of the transforms in this DFD
The process can be repeated for the new
transform modules that are created until the
atomic level is reached
The factoring of the central transform
count-the-no.-of-different-words is shown below:
(Structure chart: count-no.-of-different-words calls get-a-word, same-as-previous and increment-count; the current word, a same-as-previous flag and the running count are the data items passed between them.)
Factoring the central transform
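Again as a hedged illustration, the factored central transform could look roughly like the Python sketch below; the names same_as_previous, increment_count and count_different_words are hypothetical and only mirror the structure chart, they are not the author's code:
def same_as_previous(word, previous):
    # Sub-transform: flag telling whether the current word repeats the previous one.
    return word == previous
def increment_count(count):
    # Sub-transform: advance the running count of different words.
    return count + 1
def count_different_words(sorted_word_list):
    # Central transform: walk the sorted list, counting words not equal to their predecessor.
    count = 0
    previous = None
    for word in sorted_word_list:           # "get a word" from the sorted list
        if not same_as_previous(word, previous):
            count = increment_count(count)
        previous = word
    return count
if __name__ == "__main__":
    print(count_different_words(["brown", "dog", "fox", "lazy", "quick", "the", "the"]))  # -> 6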
Design Heuristics
The design steps mentioned earlier do not
reduce the design process to a series of steps
that can be followed blindly
The strategy requires the designer to exercise
sound judgment and commonsense
The basic objective is to make the program
structure reflect the problem as closely as
possible
The SDM should be treated as an initial
structure which may need to be modified
Hence there are some heuristics that can be
used to modify the structure, if necessary
Again these are merely pointers
The designer is still the final judge of whether a
particular heuristic is useful in a particular situation or not
Module size is often considered to be an
indication of module complexity
Modules that are very large may not be
implementing a single function and can
therefore be broken into several modules,
each implementing a different function
On the other hand, modules that are too
small may not require any additional
identity and can be combined with other
modules
However, the decision to split a module or
combine different modules should not be
based on size alone
Cohesion and coupling of modules should be the
primary guiding factors
A module should be split into separate modules
only if the cohesion of the original module was
low, the resulting modules have a higher degree
of cohesion, and the coupling between the
modules does not increase
Similarly, two or more modules should be
combined only if the resulting module has a
higher degree of cohesion and the coupling of the
resulting module is not greater than the coupling
of the sub-modules
Also, a module should usually not be split
or combined with another module if it is
subordinate to many different modules
As a rule of thumb, the designer should
take a hard look at modules that will be
larger than about 100 lines of source code
or will be less than a couple of lines
Another parameter that can be considered
while fine-tuning the structure is the fan-in
and fan-out of modules
Fan-in of a module is the no. of arrows
coming into the module
This indicates the no. of super-ordinates of
a module

Fan-out of a module is the no. of arrows going
out of that module.
It indicates the no. of subordinates of the module
A very high fan-out is not very desirable as it
means that the module has to control and
coordinate too many modules and may therefore
be too complex
Fan-out can be reduced by creating a
subordinate and making many of the current
subordinates subordinate to the newly created
module
In general the fan-out should not be increased
above 5 or 6

Whenever possible, the fan-in should be
maximized
But this should not be obtained at the cost
of increasing the coupling or decreasing
the cohesion of modules
For e.g. implementing different functions
into a single module simply to increase the
fan-in is not a good idea
Fan-in can often be increased by
separating out common functions from
different modules and creating a module to
implement that function
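A small, assumed example of this heuristic: the hypothetical modules print_invoice and print_receipt below both call a common formatting function that has been separated out, so the fan-in of format_amount becomes 2 without increasing coupling:
def format_amount(value):
    # Common function separated out from the two report modules below;
    # both now call it, so its fan-in is 2.
    return "Rs. {:,.2f}".format(value)
def print_invoice(amount):
    print("Invoice total:", format_amount(amount))
def print_receipt(amount):
    print("Receipt total:", format_amount(amount))
if __name__ == "__main__":
    print_invoice(1234.5)
    print_receipt(99)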

Another important factor that should be
considered is the correlation of the scope of
effect and scope of control
Scope of effect of a decision (in a module) is the
collection of all the modules that contain any
processing that is conditional on that decision or
whose invocation is dependent on the outcome
of the decision
The scope of control of a module is the module
itself and all its subordinates (not just the
immediate subordinates)
The system is usually simpler when the scope of
effect of a decision is a subset of the scope of
control of the module in which the decision is
located

Ideally the scope of effect should be limited to
the modules that are immediate subordinates of
the module in which the decision is located
Violation of this rule of thumb often results in
more coupling between modules
There are some methods that a designer can
use to ensure that the scope of effect of a
decision is within the scope of control of the
module
The decision can be removed from the module
and moved-up in the structure
Alternatively, modules that are in the scope of
effect but are not in the scope of control can be
moved down the hierarchy so that they fall within
the scope of control
Transaction Analysis
The structured design technique discussed earlier is
called transform analysis, where most of the transforms in
the DFD have a few inputs and a few outputs
There are situations where a transform splits an input
stream into many different sub-streams with a different
sequence of transforms specified for the different sub-
streams
For e.g. this is the case with systems where there are
many different sets of possible actions and the actions to
be taken depend on the input command specified
In such situations the transform analysis can be
supplemented by transaction analysis and the detailed
DFD of the transform splitting the input may look like the
DFD shown below
(Figure: a transaction-centre transform T splits its incoming stream into alternative sub-streams a, b, c and d, each processed by its own sequence of transforms.)
DFD for transaction analysis
The module splitting the input is called the
transaction centre
It need not be a central transform and may
occur on either the input branch or the
output branch of the DFD of the system
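A minimal sketch of a transaction centre, assuming hypothetical commands add, delete and list; it only illustrates how an input command can be routed to its own sequence of transforms and is not taken from the text:
def add_record(args):
    return "added " + args
def delete_record(args):
    return "deleted " + args
def list_records(args):
    return "listing records"
# Transaction centre: splits the input stream by command and routes each
# sub-stream to its own handler (the commands here are hypothetical).
DISPATCH = {"add": add_record, "delete": delete_record, "list": list_records}
def transaction_centre(command, args=""):
    handler = DISPATCH.get(command)
    if handler is None:
        return "unknown command: " + command
    return handler(args)
if __name__ == "__main__":
    print(transaction_centre("add", "record-1"))
    print(transaction_centre("list"))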
Design verification and reviews
The output of the system design phase
should be verified before proceeding with
the activities of the next phase
The most common approach for
verification is design reviews or inspection
Design Reviews
The purpose of the design review is to ensure
that the design satisfies the requirements and
is of good quality
If errors are made during the design process,
they will ultimately reflect themselves in the
code and the final system
As the cost of removing faults caused by
errors that occur during design phase
increases with delay in detecting the errors, it
is best if design errors are detected early,
before they manifest themselves in the
system
Detecting errors in design is the aim of design
reviews
In a system design review process a group of
people get together to discuss the design with
the aim of revealing design errors or undesirable
properties
The group must include:
a member of the system design team
a member of the detailed design team
the author of the SRS
The author responsible for maintaining the design
document
An independent s/w quality engineer
Each member studies the design before
the meeting
With the aid of a checklist, the items that the
reviewer feels are incorrect or need clarification
are marked
The member asks questions and the chief
engineer tries to explain the situation
Discussion ensues and during its course
design errors are revealed
It should be noted that the aim of the
meeting is to uncover the design errors
and not to try to fix them; fixing is done later
A sample checklist:
The use of a checklist is extremely useful for
any review
The checklist can be used by each
member for design study and review
meeting
For best results, the checklist should be
tailored to the project at hand
This helps to uncover problem-specific
errors
But a general checklist can be:
1. Is each of the functional requirements
taken into account?
2. Are there analyses to demonstrate that
performance requirements can be met?
3. Are all assumptions explicitly stated and
are they acceptable?
4. Are there any limitations or constraints
on the design beyond those in the
requirements?
5. Are external specifications of each
module completely specified?

6. Have exceptional conditions been
handled?
7. Are all data formats consistent with the
requirements?
8. Are the operator and user interfaces
properly addressed?
9. Is the design modular and does it conform
to local standards?
10. Are the sizes of data structures
estimated? Are provisions made to guard
against overflow?

The analysis model
The analysis model must achieve 3
primary objectives:
1. To describe what the customer requires
2. To establish a basis for the creation of a s/w
design
3. To define a set of requirements that can be
validated once the s/w is built
To accomplish these objectives, the analysis
model takes the form shown below:
(Figure: the data dictionary lies at the core, surrounded by the entity relationship diagram (ERD), the data flow diagram (DFD) and the state transition diagram (STD); the ERD is elaborated by data object descriptions, the DFD by process specifications (PSPEC) and the STD by the control specification (CSPEC).)
The structure of the analysis model
At the core of the model lies the data
dictionary
It is a repository that contains
descriptions of all data objects consumed
or produced by the s/w
3 different diagrams surround the core
1. The entity-relationship diagram (ERD)
2. The data flow diagram (DFD)
3. The state-transition diagram (STD)
1. The ERD
Depicts relationships between data
objects
The ERD is the notation that is used to
conduct the data modeling activity
The attributes of each data object noted in the
ERD can be described using a data object
description

2. The DFD
Serves 2 purposes:
1. To provide an indication of how data are
transformed as they move through the
system
2. To depict the functions and sub-functions
that transform the data flow
A description of each function presented
in the DFD is contained in a process
specification (PSPEC)
3. The STD
Indicates how the system behaves as a
consequence of external events
To accomplish this, the STD represents
the various modes of behaviour (called
states) of the system and the manner in
which transitions are made from state to
state
The STD serves as a basis for behavioral
modeling
Additional information about the control aspects
of the s/w is contained in the control
specification (CSPEC)
Translating the analysis model into
a software design
(Figure: the analysis model (data dictionary, ERD, DFD, STD, PSPEC, CSPEC and data object descriptions) feeding the four layers of software design: data design, architectural design, interface design and procedural design.)
Each of the elements of the analysis
model provides information that is required
to create a design model
The flow of information during software
design is shown in the previous slide
Software requirements, manifested by the
data, functional and behavioral models,
feed the design step
Using one of a no. of design methods, the
design step produces a data design, an
architectural design, an interface design,
and a procedural design
Risk Management
Risk concerns future happenings
It is concerned with what might cause the
software project to go awry
Risk is mainly due to changes in:
Customer requirements
Development techniques
Target computers
Choices:
what methods and tools to use
How many people should be involved
How much emphasis on quality is enough?


It is impossible to eliminate risk
But it can be minimized
To do this, it is important to identify all the
risks that are obvious to both software
engineers and managers
Reactive Vs. Proactive Risk
Strategies
Reactive risk strategies: react when the risk
actually becomes a problem
Majority of the software project teams rely on
reactive risk strategy
This strategy monitors a project for likely risks
Resources are set aside for them should they
become actual problems
Most commonly the software team does nothing
about the risks until something goes wrong
Then the team flies into action in an attempt to
correct the problem rapidly
This is often called the fire-fighting mode
When this fails, crisis management
takes over and the project is in real
jeopardy
A more intelligent strategy for risk
management is to be proactive
This strategy begins long before technical
work is initiated.
Potential risks are identified, their
probability and impact are assessed and
they are prioritized by importance
Then the software team establishes a plan
for managing a risk
The primary objective is to avoid risk
But because not all risks can be avoided,
the team works to develop a contingency
plan that will enable it to respond in a
controlled and effective manner

Software Risks
Risk involves two characteristics:
Uncertainty: The event that characterizes the
risk may or may not happen, i.e. there are
no 100% probable risks
Loss: if the risk becomes a reality, unwanted
consequences or losses will occur
When risks are analyzed it is important to
quantify the level of uncertainty and
degree of loss associated with each risk
To accomplish this, different categories of
risk are considered
Project risks: threaten the project plan
That is, if the project risks become real it is
likely that the project schedule will slip and
that costs will increase
Project risks identify potential:
budgetary
schedule
personnel (staffing and organization)
resource
customer
requirements problems and their impact on a
software project

Technical risks: threaten the quality and
timeliness of the software to be produced
If the technical risk becomes a reality,
implementation may become difficult or
impossible
Technical risks identify potential:
Design
Implementation
Interfacing
Technical uncertainty
Technical obsolescence
Leading-edge technology
Technical risks occur because the problem is
harder to solve than it was thought to be
Business risks: threaten the viability of the
software to be built
Business risks often jeopardize the project or
the product
Candidates for the top 5 business risks are:
1. Building an excellent product that no one really
wants (market risk)
2. Building a product that no longer fits into the overall
business strategy for the company (strategic risk)
3. Building a product that the sales force doesn't
understand how to sell
4. Losing the support of senior management due to
change in focus or a change in people (management
risk)
5. Losing budgetary or personnel commitment (budget
risk)
It is extremely important to note that
simple categorization won't always work.
Some risks are simply unpredictable in
advance
Another known categorization of risks as
proposed by Charette are:
Known risks
Predictable risks
Unpredictable risks
Known risks are those that can be
uncovered after:
careful evaluation of the project plan
the business and technical environment in
which the project is being developed
Other reliable sources, e.g:
unrealistic delivery date,
lack of documented requirements or software
scope,
poor development environment


Predictable risks are extrapolated from
past project experience, e.g.:
staff turnover,
poor communication with the customer,
dilution of staff effort as ongoing maintenance
requests are serviced
Unpredictable risks are the joker in the
deck
They can and do occur but they are extremely
difficult to identify in advance

Risk Identification
Risk identification is a systematic attempt
to specify threats to the project plan
(estimates, schedule, resource loading
etc)
By identifying known and predictable risks,
the project manager takes a first step
toward avoiding them when possible and
controlling them when necessary
There are two distinct types of risks for
each of the categories presented earlier:
generic risks and product-specific risks
Generic risks are a potential threat to every
software project
Product-specific risks can only be identified by
those with a clear understanding of the
technology, the people and the environment that
is specific to the project at hand
To identify product-specific risks, the project plan
and the software statement of scope are
examined and an answer to the following
question is developed:
What special characteristics of this product may
threaten our project plan?
Both generic and product-specific risks
should be identified systematically
If you don't actively attack the risks, they will
actively attack you
One method for identifying risks is to create
a risk item checklist
The checklist can be used for risk
identification and focuses on some subset
of known and predictable risks in the
following generic subcategories:
Product size- Risks associated with the overall
size of the software to be built or modified
Business impact- risks associated with
constraints imposed by management or the
marketplace
Customer characteristics- risks associated with
the sophistication of the customer and the
developers ability to communicate with the
customer in a timely manner
Process definition- risks associated with the
degree to which the software process has been
defined and is followed by the development
organization
Development environment- risks associated with
the availability and quality of the tools to be used
to build the product
Technology to be built- risks associated
with the complexity of the system to be
built and the newness of the technology
that is packaged by the system
Staff size and experience- risks associated
with the overall technical and project
experience of the software engineers who
will do the work
The answers to these questions allow the
planner to estimate the impact of risk
Finally a set of risk components and
drivers are listed along with their
probability of occurrence
Risk components and drivers
The U.S. Air Force has written guidelines
for software risk identification and
abatement
The USAF approach requires that the
project manager identify the risk drivers
that affect the software risk components, which
are defined in the following manner:
Performance risk- the degree of uncertainty
that the product will meet its requirements and
be fit for its intended use

Cost risk- the degree of uncertainty that the
project budget will be maintained
Support risk- the degree of uncertainty that
the software will be easy to correct, adapt and
enhance
Schedule risk- the degree of uncertainty that
the project schedule will be maintained and
that the product will be delivered on time
The impact of each driver on the risk
component is divided into one of four
categories- negligible, marginal, critical
and catastrophic

Risk Projection
Risk projection, also called risk estimation,
attempts to rate each risk in two ways:
The likelihood or probability that the risk is real
The consequences of the problems associated with
the risk, should it occur
The project planner, along with other managers
and technical staff performs 4 risk projection
activities:
1. Establish a scale that reflects the perceived
likelihood of a risk
2. Delineate the consequences of the risk
3. Estimate the impact of the risk on the project and the
product
4. Note the overall accuracy of the risk projection so
that there will be no misunderstandings
Developing a risk table
A risk table provides a project manager with a
simple technique for risk projection
A project team begins by listing all the risks
(no matter how remote) in the first column of
the table
Each risk is categorized in the second column
The probability of occurrence of each risk is
entered in the next column
The probability value of each risk can be
estimated by team members individually
Individual values are averaged to develop a
single consensus probability



Next the impact of each risk is assessed in
terms of it being negligible, marginal,
critical or catastrophic
The categories for each of the 4 risk
components-performance, support, cost
and schedule are averaged to determine
an overall impact value


Risks                                   Category  Probability  Impact  RMMM
Size estimate may be significantly low  PS        60%          2
Larger no. of users than planned        PS        30%          3
Less reuse than planned                 PS        70%          2
End users resist system                 BU        40%          3
Delivery deadline will be tightened     BU        50%          2
Funding will be lost                    CU        40%          1
Customer will change requirements       PS        80%          2
Technology will not meet expectations   TE        30%          1
Lack of training on tools               DE        80%          3
Staff inexperienced                     ST        30%          2
Staff turnover will be high             ST        60%          2
...
Impact values: 1 - catastrophic, 2 - critical, 3 - marginal, 4 - negligible
Once the first four columns of the risk table have been
completed, the table is sorted by probability and
by impact
High-probability, high-impact risks percolate to
the top of the table, and low-probability risks drop
to the bottom
This accomplishes first-order risk prioritization
The project manager studies the resultant sorted
table and defines a cut-off line
The risks that lie above the line will be given
further attention
Risks that fall below the line are re-evaluated to
accomplish second order prioritization
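As an assumed illustration of first-order prioritization, the short Python sketch below sorts a few rows of the sample risk table by probability and impact and applies a hypothetical cut-off of 50%; the numbers are the sample values from the table above, not real project data:
# Each entry: (risk, category, probability, impact), impact 1 = catastrophic ... 4 = negligible.
risks = [
    ("Size estimate may be significantly low", "PS", 0.60, 2),
    ("Larger no. of users than planned",       "PS", 0.30, 3),
    ("Less reuse than planned",                "PS", 0.70, 2),
    ("Customer will change requirements",      "PS", 0.80, 2),
    ("Technology will not meet expectations",  "TE", 0.30, 1),
]
# First-order prioritization: high probability and high impact (low impact value) first.
risks.sort(key=lambda r: (-r[2], r[3]))
CUT_OFF_PROBABILITY = 0.50   # hypothetical cut-off line chosen by the project manager
for name, category, probability, impact in risks:
    managed = "manage (RMMM)" if probability >= CUT_OFF_PROBABILITY else "re-evaluate"
    print(f"{probability:.0%}  impact={impact}  {category}  {name}  -> {managed}")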
Risk impact and probability have a distinct
influence on management concern
A risk factor that has a high impact but a very
low probability of occurrence should not absorb
a significant amount of management time
However, high-impact risks with moderate to
high probability and low impact risks with high
probability should be carried forward into the
management steps that follow
All risks that lie above the cut-off line must be
managed
The column labeled RMMM contains a pointer
into a Risk Mitigation, Risk Monitoring and
Management Plan
The RMMM is developed for all risks that lie
above the cut-off
Risk probability can be determined by making
individual estimates and then developing a single
consensus value.
Risk drivers can be assessed on a qualitative
probability scale that has the following values:
Impossible
Improbable
Probable
Frequent
Mathematical probability can then be associated
with each qualitative value (for e.g. a probability
of 0.7 to 1.0 implies a highly probable risk)
(Figure: risk and management concern. Management concern grows from 'disregard' to very high as a risk factor's impact rises from very low to very high and its probability of occurrence rises from 0 to 1.0.)
Assessing Risk Impact
3 factors affect the consequences that are likely if a risk
does occur
Its nature
Its scope
Its timing
Nature of the risk indicates the problems that are likely if
it occurs.
For e.g. a poorly defined external interface to the
customer hardware (a technical risk) will lead to system
integration problems late in a project
The scope of a risk combines its severity with its overall
distribution (how much of the project will be affected or
how many customers are harmed)
The timing of a risk considers when and how long the
impact will be felt
In most cases, a project manager might want the bad
news to occur as soon as possible but in some cases,
the longer the delay, the better
To determine the overall consequences of
a risk, use the USAF guidelines for risk
analysis
1. Determine the average probability of occurrence
value for each risk component
2. Determine the impact for each component
based on the criteria shown earlier
3. Complete the risk table and analyze the
results

Risk Assessment
At this point of the risk management
process, a set of triplets has been
established as
[r_i, l_i, x_i]
where:
r_i = the risk
l_i = the likelihood (probability) of the risk
x_i = the impact of the risk
During risk assessment :
examine the accuracy of the estimates that
were made during risk projection,
attempt to prioritize the risks that have been
uncovered and
Begin thinking of ways to control and/or avert
the risk that are likely to occur
For this assessment to be useful, a risk
referent level must be defined
Risk referent levels are:
Performance, cost, support and schedule
That is, there is level for:
performance degradation,
cost overrun,
support difficulty or
schedule slippage
There can also be a combination of the four
These can cause the project to be terminated
If a combination of risks create problems that
cause one or more of these referent levels to be
exceeded, work will stop
A risk referent level will have a point called the
referent point or break point
At this point, a decision as to whether to
continue or terminate a project will be equally
acceptable
The figure on the next slide shows this situation
graphically
If a combination of risks lead to problems that
cause cost and schedule overruns, there will be
a level
This level is represented by a curve in the figure
When this level is exceeded, it will lead to
project termination (shaded region)
At a referent point, the decisions to proceed or to
terminate are equally weighted
In reality, the referent level can be rarely
represented as a smooth line on a graph
In most cases, it is a region in which there are
areas of uncertainty
(Figure: the risk referent level plotted against projected cost overrun and projected schedule overrun. Beyond the referent curve, in the shaded region, project termination will occur; a referent point on the curve is a (cost value, time value) pair at which the decisions to proceed or to terminate are equally acceptable.)
Therefore during risk assessment the
following steps are performed:
1. Define the risk referent levels for the project
2. Attempt to develop a relationship between
each [r_i, l_i, x_i] and each of the referent levels
3. Predict the set of referent points that define
a region of termination, bounded by a curve
or areas of uncertainty
4. Try to predict how compound combinations
of risks will affect a referent level



Risk Mitigation, Monitoring &
Management
All of the risk analysis activities have a
single goal:
to assist the project team in developing a
strategy for dealing with risk
If a software team adopts a proactive
approach to risk, avoidance is always the
best strategy
This is achieved by developing a plan for
risk mitigation
For e.g. assume that high staff turnover is
noted as a project risk, r_1
Based on past history and management
intuition, the likelihood, l_1, of high turnover
is estimated at 0.7 (high)
The impact, x_1, is projected as critical for
project cost and schedule
To mitigate this risk, the project
management must develop a strategy for
reducing turnover
Among the possible steps to be taken are:

Meet with the current staff to determine causes for
turnover (e.g poor working conditions, low pay,
competitive job market)
Act to mitigate those causes that are under management
control before the project starts
Once project commences, assume turnover will occur
and develop techniques to ensure continuity when
people leave
Organize project team so that information about each
development activity is widely dispersed
Define documentation standards and establish
mechanisms to be sure that documents are developed in
a timely manner
Conduct peer reviews of all work so that more than one
person is familiar with the work
Define a backup staff member for every critical person
Risk Monitoring
As the project proceeds, risk monitoring begins
The project manager monitors factors that may provide
an indication of whether the risk is becoming more or
less likely
In case of high staff turnover, the following factors can
be monitored:
General attitude of team members based on project
pressures
The degree to which the team has jelled
Interpersonal relationships among the team members
Potential problems with compensation and benefits
The availability of jobs within the company and
outside it
In addition to monitoring the factors above, the project
manager should also monitor the effectiveness of risk
mitigation steps
Risk management and contingency
planning
This stage assumes that the mitigation efforts have failed
and that the risk has become a reality
For e.g. a project is well underway and a no. of people
announce that they will be leaving
If mitigation strategy has been followed, backup is
available, information is documented and knowledge has
been dispersed across the team
In addition, the project manager may temporarily refocus
resources and readjust the project schedule
Those individuals who are leaving are asked to stop all
work and spend their last weeks in knowledge transfer
mode to newcomers
Conclusion
It is important to note that RMMM steps incur additional
cost
Thus it is necessary to evaluate when the benefits
accrued by the RMMM steps are outweighed by the
costs associated with implementing them
For a large project, 30-40 risks may be identified
If 3-7 risk management steps are applied to each risk
then risk management may become a project in itself!
So apply the Pareto 80-20 rule
It says that 80% of the overall project risk can be
accounted for by only 20% of the identified risks
So the RMMM steps can be applied to only those critical
20%

Software Testing
Testing performs a very important role for quality
assurance and for ensuring the reliability of
software
During testing, the software is executed with a
set of test cases and the output of the program
for the test cases is evaluated to determine if the
program is performing as expected
Due to its approach, testing can only determine
the presence of errors in the program
Clearly, the success of testing in revealing errors
in programs depends critically on the test cases

Testing a large system is a complex activity
It is thus broken into smaller activities
Due to this, incremental testing is performed
Here the components and subsystems of the
system are tested separately before integrating
them to form the system for system testing
But integration in itself is an important issue
Thus this phase is also called integration and
testing

Testing fundamentals
Error, fault and failure:
IEEE defines these terms as:
Error: Used in two different ways
It refers to the discrepancy between a
computed, observed or measured value and
the true, specified or theoretically correct
value
That is, it is the difference between the actual
output of a software and the correct output
Error is also used to refer to human actions
that result in software containing a defect or
a fault
Fault: is a condition that causes a system
to fail in performing its required function
It is the basic reason for software
malfunction and is also referred to as a
bug
It should be noted that the only faults that
a software has are design faults; there's
no wear and tear in software

Failure: is the inability of a system or a
component to perform a required function
according to its specifications
A software failure occurs if the behaviour of the
software is different from the specified behaviour
Failures may be caused by functional or
performance reasons
A failure is produced only when there's a fault in
the system
But the presence of a fault doesn't guarantee a
failure
It is also possible that a failure may occur but not
be detected
This means that observing a system for some
time duration and not finding any failures doesn't
guarantee the absence of faults
On the other hand, if failures are observed during
a time duration, this means that there are faults
in the system
In testing, system behaviour is observed
By observing the behaviour of a system or a
component during testing it can be determined
whether there's a failure or not
Because of this fundamental reliance on
external behaviour observation, testing can only
reveal the presence of faults, not their absence

But by not observing failures in the system
during a testing interval it can't be said that there
are no faults in the system
It thus becomes very hard to decide for how long
should the system be tested
During the testing process, only failures are
observed by which the presence of faults is
deduced
The actual faults are identified by separate
activities commonly referred to as debugging
This means that after testing, debugging has to
be performed
This is why testing is an expensive method

Top-Down and Bottom-Up
strategies
Generally parts of a program are tested before
testing the entire program
Besides partitioning the problem for testing,
another reason for testing parts separately is
that if a test case detects an error in a large
program, it will be extremely difficult to pinpoint
the source of error
That is, if a large program doesn't work
determining which module has errors can be a
formidable task
Also it would be very difficult to construct test
cases so that different modules are executed in
a sufficient no. of different conditions
In some cases, it is even difficult to
construct test cases so that all modules
will be executed
This increases the chances of a modules
errors going undetected
Hence it's clear that for a large system first
the different parts of the system should be
tested independently before testing the
entire system

In incremental testing, some parts of the system
are first tested independently
These parts are then combined to form a
subsystem which is then tested independently
This combination can contain either only
modules that have already been tested
independently, or some new untested modules
combined with the tested modules
Both of these approaches require that the order
in which the modules are to be tested and
integrated be planned before commencing
testing
If the system has a hierarchy of modules then
there are two ways the modules can be
combined as they are tested to form a working
program: top-down and bottom-up
In top-down strategy, first the top of the
hierarchy is tested and those modules that it
calls are added incrementally and then the new
combined system is tested
This approach requires stubs to be written
A stub is a dummy routine that simulates a
module
In the top-down approach, a module (or a collection
of modules) cannot be tested in isolation because it
invokes some other modules
To allow the modules to be tested before their
subordinates have been coded, stubs simulate
the behaviour of the subordinates
The bottom-up approach starts from the bottom
of the hierarchy
First the modules at the very bottom which have
no subordinates are tested
Then these modules are combined with higher
level modules for testing
At any stage of testing all the subordinate
modules exist and have been tested earlier
To perform bottom-up testing, drivers are
needed to set up the appropriate environment
and invoke the module
It is the job of the driver to invoke the module
under testing with the different set of test cases
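A hedged Python sketch of these ideas is shown below; compute_bill, tariff_lookup_stub and driver_for_compute_bill are hypothetical names used only to show what a stub and a driver look like:
# Module under test in a top-down step: it calls a subordinate that is not coded yet.
def compute_bill(units, tariff_lookup):
    return units * tariff_lookup(units)
def tariff_lookup_stub(units):
    # Stub: a dummy routine that simulates the real tariff-lookup subordinate
    # by returning a fixed, known response.
    return 5.0
# Driver for bottom-up testing: sets up the environment and invokes the
# module under test with different test cases.
def driver_for_compute_bill():
    test_cases = [(0, 0.0), (10, 50.0), (100, 500.0)]
    for units, expected in test_cases:
        actual = compute_bill(units, tariff_lookup_stub)
        status = "pass" if actual == expected else "fail"
        print(f"units={units}: expected={expected} actual={actual} -> {status}")
if __name__ == "__main__":
    driver_for_compute_bill()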
It should be noticed that both top-down and
bottom-up approaches are incremental
They start with testing single modules and then
adding untested modules to those that have
been tested until the entire system has been
tested
In the first case, stubs must be written to perform
testing and in the other, drivers need to be
written
Top-down testing is advantageous if major flaws
occur towards the top of the hierarchy
Bottom-up is better if the major flaws occur
towards the bottom
Often writing stubs can be more difficult
than writing drivers because one may
need to know beforehand the set of inputs
for the module being simulated by the stub
and to determine the proper responses for
these inputs
Finally, as stubs only simulate a module,
deciding test cases is often very difficult

Conclusion
It is often best to select the testing method
to conform with the development method
Thus, if the system is developed in a top-
down manner, top-down testing should be
used
If the system is developed in a bottom-up
manner, bottom-up testing should be used


Testing Objectives
In his classic book on software testing,
Glen Myers states a no. of rules that can
serve well as testing objectives:
1. Testing is a process of executing a
program with the intent of finding an error
2. A good test case is one that has a high
probability of finding an as-yet
undiscovered error
3. A successful test is one that uncovers an
as-yet undiscovered error
The above objectives imply a dramatic change
in viewpoint
They move counter to the commonly held view
that a successful test is one in which no errors
are found
The objective should be to design tests that
systematically uncover different classes of errors
and do so with a minimum amount of time and
effort
If testing is conducted successfully it will uncover
errors in the software
A secondary benefit is that testing proves that
software functions appear to be working
according to specification and that performance
requirements appear to have been met
In addition, data collected as testing is
conducted provides a good indication of
software reliability and some indication of
quality as a whole

Testing principles
Before applying methods to design effective
test cases, a software engineer must
understand the basic principles that guide
software testing
Davis suggested a set of testing principles:
All tests should be traceable to customer
requirements:
It has been observed that the most severe
defects (from the customer's point of view) are
those that cause the program to fail to meet its
requirements
Tests should be planned long before
testing begins
Test planning can begin as soon as the
requirements model is complete
Detailed definition of test cases can begin
as soon as the design model has been
finalized
Therefore, all tests can be planned and
designed before any code has been
generated
The Pareto principle applies to software testing
This implies that 80% of all errors uncovered
during testing will likely to be traceable to 20% of
all program modules

Testing should begin in the small and progress
towards testing in the large
The first tests planned and executed generally
focus on the individual program modules
As testing progresses, testing shifts focus to
include integrated clusters of modules and
ultimately the entire system
Exhaustive testing is not possible
The no. of path permutations for even a
moderately sized program is exceptionally
large
For this reason, it is impossible to execute
every combination of paths during testing
It is possible to adequately cover program
logic and to ensure that all conditions in
the procedural design have been executed

To be most effective, testing should be
conducted by an independent third party
By most effective, it is meant a technique
that has the highest probability of finding
errors
It implies that the software engineer who
created the system is not the best person
to conduct all the tests for the software
Test case design
The design of the test cases for software can be
as challenging as the initial design of the product
itself
But software engineers often treat testing as an
afterthought
Over the last 20 years, a variety of test case
design methods have evolved for software
These methods provide the developer with a
systematic approach to testing
More importantly, these methods provide a
mechanism that can help ensure the
completeness of tests and provide the highest
likelihood for uncovering errors in software
Any software can be tested in 2 ways:
Black-box testing: Knowing the specified
function that a product has been designed
to perform, tests can be conducted that
demonstrate each function is fully operational, at
the same time searching for errors in each
function
White-box testing: Knowing the internal workings
of a product, tests can be conducted to ensure
that all internal operations perform according to
specification and all internal components have
been adequately exercised
Black-box testing alludes to tests that are
conducted at the software interface
Although they are designed to uncover
errors, black-box tests are used to
demonstrate that software functions are
operational; that inputs are properly
accepted and output is correctly produced;
and that the integrity of external
information (e.g data files) is maintained
A black-box test examines some
fundamental aspects of a system with little
regard for internal logical structure of the
software
White-box testing of software is predicated on
close examination of procedural detail
Logical paths through the software are tested by
providing test cases that exercise specific sets of
conditions and/or loops
The status of the program may be examined at
various points to determine if the expected or
asserted status corresponds to the actual status
It seems that white-box testing would lead to
100% correct programs
But unfortunately, exhaustive testing has certain
practical problems
For even small programs, the no. of possible
logical paths can be very large


White-box testing however can't be
dismissed as impractical
A limited no. of important logical paths can
be selected and exercised
Important data structures can be probed
for validity
The attributes of both black and white box
testing can be combined to provide an
approach that validates the software
interface and selectively assures that the
internal workings of the software are
correct
White-box testing
It's also called glass-box testing
Using this approach, test cases can be derived
that:
1. Guarantee that all independent paths within a
module have been exercised at least once
2. Exercise all logical decisions on their true and
false sides
3. Execute all loops at their boundaries and within
their operational bound
4. Exercise internal data structures to ensure their
validity
But a reasonable question is:
Why spend time and energy on the minute
logical details when its better to expend
effort ensuring that program requirements
have been met?
In other words, why shouldn't all the effort
be concentrated on black-box tests?
The answer lies in the nature of the
software defects:
Logic errors and incorrect assumptions are
inversely proportional to the probability
that a program path will be executed
It implies that errors tend to creep into the work
when functions, conditions or control are
designed and implemented that are out of the
mainstream
That is, everyday processing tends to be well
understood while special case processing
tends to fall into the cracks
It's often believed that a logical path is not likely
to be executed when, in fact, it may be executed
on a regular basis
The logical flow of a program is sometimes
counterintuitive
Unconscious assumptions about the flow of
control and data lead to design errors that are
uncovered only once path testing commences

Typographical errors are random
When a program is translated into
programming language source, it is likely
that some typing errors will occur
Many will be uncovered by the syntax
checking mechanism but others will go
undetected until testing begins
Each of these reasons provides an
argument for conducting white-box tests
Black-box tests, no matter, how thorough,
may miss the kind of errors noted above

Basis Path testing
Basis path testing is a white-box testing
technique
It enables the test case designer to derive
a logical complexity measure of a procedural design
This measure acts as a guide for defining
a basis set of execution paths
Test cases derived to exercise the basis
set are guaranteed to execute every
statement in the program at least once
during testing
Flow graph notation
Before the basis path method can be
introduced, a simple notation for the
representation of control flow, called a flow
graph (or program graph) must be
introduced
The flow graph depicts the logical flow
using the notation shown in the next slide
(Figure: flow graph notation for the sequence, if, while, until and case constructs.)
To illustrate the use of a flow graph, an example
of a procedural design representation (a flowchart) is considered
The flowchart is mapped into a corresponding
flow graph
Each circle is called a node
It represents one or more procedural statements
A sequence of process boxes and a decision
diamond can map into a single node
The arrows on the flow graph are called edges
or links
They represent the flow of control and are
similar to flowchart arrows
An edge must terminate at a node, even if the
node doesn't represent any procedural
statements (node 11)
Areas bounded by edges and nodes are called
regions
When counting regions the area outside the
graph is also counted as a region
In fact, any program construct can be mapped
into a flow graph
When compound conditions are encountered in
a procedural design, the generation of the flow
graph becomes more complicated
A compound condition occurs when one or more
Boolean operators (logical OR,AND, NAND,
NOR) are present in a conditional statement

Note that a separate node is created for
each of the conditions a and b in the
statement IF a OR b
Each node that contains a condition is
called a predicate node and is
characterized by two or more edges
emanating from it
Cyclomatic complexity
It is a software metric that provides a
quantitative measure of the logical complexity of
a program
When this metric is used in the context of basis
path testing method, the value computed for
cyclomatic complexity defines the no. of
independent paths in the basis set of a program
It also provides an upper bound for the no. of
tests that must be conducted to ensure that all
statements have been executed at least once
An independent path is any path through a
program that introduces at least one new set of
processing statements or a new condition
When stated in terms of a flow graph, an
independent path must move along at least one
edge that has not been traversed
For example, a set of independent paths for the
flow graph shown is:
Path 1: 1-11
Path 2: 1-2-3-4-5-10-1-11
Path 3: 1-2-3-6-8-9-10-1-11
Path 4: 1-2-3-6-7-9-10-1-11
Note that each new path introduces a new edge
The path 1-2-3-4-5-10-1-2-3-6-8-9-10-1-11 is not
considered an independent path because it is
simply a combination of already specified paths and
does not traverse any new edges
So, paths 1,2,3,4 defined above comprise a
basis set for the flow graph
That is, if tests can be designed to force
execution of these paths ( a basis set), every
statement in the program will be executed at
least one time and every condition will have
been executed on its true and false side
It should be noted that the basis set is
not unique and any no. of different basis
sets can be derived for a given
procedural design
But the question is, how many paths
should be searched for?
Cyclomatic complexity provides the
answer
It can be computed in 3 ways:
1. The no. of regions of the flow graph
correspond to the cyclomatic complexity

2. Cyclomatic complexity, V(G) for a flow graph
G is defined as
V(G)= E-N+2
where E is the no. of flow graph edges and N
is the no. of flow graph nodes
3. Cyclomatic complexity, V(G) for a flow graph
is also defined as
V(G)=P+1
where P is the no. of predicate nodes
contained in a flow graph G

For the flow graph shown the cyclomatic
complexity can be computed as:
1. The flow graph has 4 regions
2. V(G) = 11 edges - 9 nodes + 2 = 4
3. V(G) = 3 predicate nodes + 1 = 4
Thus the cyclomatic complexity of the given
flow graph is 4
V(G) also provides an upper bound for the no.
of independent paths that comprise the basis
set
This implies an upper bound on the no. of tests
that must be designed and executed to
guarantee coverage of all program statements
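As a worked illustration (assuming a hypothetical routine named classify, not taken from the notes), the Python sketch below has two predicate nodes, so V(G) = 2 + 1 = 3 and at most three basis-set test cases are needed:
def classify(x):
    # Hypothetical routine with two predicate nodes: the 'if' and the 'while'.
    if x < 0:                # predicate node 1
        x = -x
    count = 0
    while x > 0:             # predicate node 2
        x //= 10
        count += 1
    return count
# V(G) = P + 1 = 2 predicate nodes + 1 = 3, so at most 3 basis-set paths
# (and 3 test cases) are needed to execute every statement at least once,
# e.g. x = 5, x = -42 and x = 0.
print(classify(5), classify(-42), classify(0))   # -> 1 2 0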
Deriving Test cases
The basis path testing method can be applied
to a procedural design or to a source code
Basis path testing can be done as a series of
steps
1. Using the design or a code as foundation,
draw a corresponding flow graph
2. Determine the cyclomatic complexity of the
resultant flow graph
3. Determine a basis set of linearly independent
paths
4. Prepare test cases that will force execution of
each path in the basis set
Test cases should be of the form:
Provided Value:
Expected Value:
Actual Value:
Test Status :
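A minimal sketch of this form, assuming the hypothetical classify routine from the previous sketch and hand-picked expected values; it simply prints each test case in the provided/expected/actual/status layout:
def classify(x):
    # Same hypothetical routine as above, repeated so this sketch runs on its own.
    if x < 0:
        x = -x
    count = 0
    while x > 0:
        x //= 10
        count += 1
    return count
basis_set_cases = [(5, 1), (-42, 2), (0, 0)]   # one test case per basis path
for provided, expected in basis_set_cases:
    actual = classify(provided)
    status = "pass" if actual == expected else "fail"
    print(f"Provided: {provided}  Expected: {expected}  Actual: {actual}  Status: {status}")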
Control Structure Testing
The basis path testing is one of the no. of
techniques for control structure testing
Although it is simple and highly effective, it
is not sufficient in itself
There are other variations for control
testing which broaden and improve the
quality of white-box testing

1. Condition Testing
It is a test case design method that
exercises the logical conditions of a
program
A simple condition is a Boolean variable or
a relational expression such as
E1 (relational-operator) E2
where E1 and E2 are arithmetic
expressions and the relational operator can be
any one of the following:
<, >, ≤, ≥, =, ≠
A compound condition is composed of two
or more simple conditions, Boolean
operators and parentheses
Therefore, the possible types of
components in a condition include:
Boolean operator
A Boolean variable
A pair of Boolean parentheses
A relational operator
Or, an arithmetic expression
If a condition is incorrect, then at least one
component of the condition is incorrect
Thus types of errors in a condition include:
Boolean operator error
Boolean variable error
Boolean parenthesis error
Relational operator error
Arithmetic expression error
The condition testing method focuses on
testing each condition in the program
Condition testing strategies have 2
advantages:

1. Measurement of test coverage of a condition is
simple
2. Test coverage of conditions in a program
provides guidance for the generation of
additional tests for the program
The purpose of condition testing is to detect
not only errors in the conditions of a program
but also other errors in the program
If a test set for a program P is effective for
detecting errors in the conditions then it is
likely that it will be effective for detecting other
errors in P
A no. of condition testing strategies have been
proposed
Branch testing is the simplest condition testing
strategy
For a compound condition C, the true and false
branches of C and every simple condition in C
need to be executed at least once
Domain testing requires 3 or 4 tests to be
derived for a relational expression
For a relational expression E1 (relational-operator) E2,
3 tests are required:
Make E1 greater than, equal to and less than E2
If the relational operator is incorrect and E1 and E2
are correct, then these 3 tests guarantee the
detection of the relational-operator error
To detect errors in E1 and E2, a test that makes the
value of E1 greater or less than that of E2 should
make the difference between these values as small
as possible
For a Boolean expression with n variables, all of the
2^n possible tests are required.
This strategy can detect Boolean operator,
variable and parenthesis errors but is
practical only for small values of n
Error sensitive tests for Boolean
expressions can also be derived
For a singular Boolean expression (a Boolean
expression in which each Boolean variable occurs
only once) a test set with fewer than 2^n tests can
be generated

Tai[89] suggests a condition testing
strategy that builds on the above
techniques
Called BRO (branch and relational
operator) testing
This guarantees the detection of branch
and relational operator errors in a
condition provided that all Boolean
variables and relational operators in the
condition occur only once and have no
common variables
The BRO strategy uses condition
constraints for a condition C
A condition constraint for C with n simple
conditions is defined as (D1, D2, ..., Dn), where Di
is a symbol specifying a constraint on the outcome
of the ith simple condition in C
A condition constraint D for a condition C
is said to be covered by an execution of C
if during this execution of C the outcome of
each simple condition in C satisfies the
corresponding constraint in D
For a Boolean variable B, a constraint is
specified on the outcome of B that states
that B must either be true or false
Similarly, for a relational expression, the
symbols >,= and < are used to specify
constraints on the outcome of the
expression
For example, consider C1: B1 & B2, where B1 and
B2 are Boolean variables
The condition constraint for C1 is of the form
(D1, D2), where each of D1 and D2 is t or f
The value (t, f) is a condition constraint for C1
It is covered by the test that makes the value of B1
true and the value of B2 false
The BRO testing strategy requires that the
constraint set {(t,t), (t,f), (f,t)} be covered by the
executions of C1
If C1 is incorrect due to one or more Boolean
operator errors, at least one member of the
constraint set will force C1 to fail
The complete coverage of the constraint
set will guarantee detection of errors
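A small assumed illustration of BRO coverage for C1: B1 & B2; the code below just evaluates the condition under each constraint in the set, and the closing comment notes why a Boolean operator error would be caught:
def c1(b1, b2):
    # The compound condition under test: C1 = B1 & B2.
    return b1 and b2
# BRO constraint set for C1 (each pair constrains the outcomes of B1 and B2).
constraint_set = [(True, True), (True, False), (False, True)]
for b1, b2 in constraint_set:
    print(f"B1={b1!s:5} B2={b2!s:5} -> C1={c1(b1, b2)}")
# A faulty version such as B1 or B2 would give a different outcome for at
# least one of these constraints, so the operator error would be detected.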
Loop testing
Loops are the cornerstone for the vast majority
of all algorithms implemented in software
But they are paid little attention while testing
Loop testing is a white-box testing technique that
focuses exclusively on the validity of loop
constructs
4 different classes of loops can be defined:
Simple loops
Concatenated loops
Nested loops
Unstructured loops
1. Simple loops:
The following set of tests should be applied
to simple loops, where n is the max. no.
of allowable passes through the loop:
1. Skip the loop entirely
2. Only one pass through the loop
3. 2 passes through the loop
4. m passes through the loop where m<n
5. n-1, n, n+1 passes through the loop
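As a hedged example, the Python sketch below applies these guideline values to a hypothetical simple loop whose maximum number of allowable passes is n = 5; the function average and the chosen test sizes are assumptions for illustration only:
MAX_PASSES = 5   # n, the maximum number of allowable passes (hypothetical)
def average(readings):
    # Hypothetical simple loop over at most MAX_PASSES readings.
    total, count = 0.0, 0
    for r in readings[:MAX_PASSES]:
        total += r
        count += 1
    return total / count if count else 0.0
# Test cases from the simple-loop guidelines: skip the loop, 1 pass, 2 passes,
# m passes (m < n), and n-1, n, n+1 passes.
for size in [0, 1, 2, 3, MAX_PASSES - 1, MAX_PASSES, MAX_PASSES + 1]:
    print(size, "items ->", average(list(range(1, size + 1))))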
2. Nested Loops:
The test approach for simple loops cannot be
extended for nested loops since the no. of
possible tests would grow geometrically as the
level of nesting increases
This would result in an impractical no. of tests
Beizer[90] suggested an approach that will help
to reduce the no. of tests:
1. Start at the innermost loop. Set all other loops
to min. value
2. Conduct simple loop tests for the innermost
loop while holding the outer loops at their min.
iteration parameter (e.g loop counter) values.
Add other tests for out-of-range or excluded
values

3. Work outward, conducting tests for the
next loop, but keeping all other outer loops
at min. values and other nested loops to
typical values
4. Continue until all loops have been tested
3. Concatenated loops:
Concatenated loops can be tested using
the approach defined above for simple
loops if each loop is independent of the
other
But if 2 loops are concatenated and the
loop counter for loop1 is used as the initial
value for loop2, then the loops are not
independent
When the loops are not independent, the
approach applied to nested loops are
recommended

4. Unstructured loops:
Whenever possible, this class of loops
should be redesigned to reflect the use of
structured programming constructs
Black Box Testing
It focuses on the functional requirements of the
software
That is, black-box testing enables a software
engineer to derive sets of input conditions that
will fully exercise all functional requirements for
a program
Black box testing is not an alternative to white-
box techniques
Rather it is a complementary approach that is
likely to uncover a different class of errors than
white-box methods
Black-box testing attempts to find errors
in the following categories:
1. Incorrect or missing functions
2. Interface errors
3. Errors in data structures or external
database access
4. Performance errors
5. Initialization and termination errors

Unlike white-box testing, which is
performed early in the testing process,
black box testing tends to be applied
during later stages of testing
Black box testing purposely disregards the
control structure
Tests are designed to answer the following
questions:
How is functional validity tested?
What classes of input will make good test
cases?
Is the system particularly sensitive to certain
input values?
How are the boundaries of a data class isolated?
What data rates and data volume can the system
tolerate?
What effect will specific combinations of data have on
system operation?
By applying black-box techniques, a set of test
cases can be derived that satisfy the following
criteria:
1. Test cases that reduce the no. of additional
test cases that must be designed to achieve
reasonable testing
2. Test cases that tell something about the
presence or absence of classes of errors,
rather than errors associated only with the
specific test at hand

Boundary Value Analysis
For reasons that are not completely clear, a
greater no. of errors tends to occur at the
boundaries of the input domain than in the
centre
It is for this reason that boundary value analysis
(BVA) has been developed as a testing
technique
BVA leads to a selection of test cases that
exercise bounding values
BVA is a test case design technique that
complements earlier discussed techniques
BVA leads to a selection of test cases at the
edges of the classes
Rather than focusing solely on input
conditions, BVA derives test cases from the
output domain as well
Some guidelines for BVA can be:
1. If an input condition specifies a range bounded
by values a and b, test cases should be
designed with values just above and just below
a and b
2. If an input condition specifies a no. of values,
test cases should be developed that exercise
the min. and max. nos. Values just above min.
and just below max. are also tested
3. Guidelines 1 and 2 are applied to output
conditions
For e.g. if a table is required as output from
a program, then test cases should be
designed to create a table with the max.
and min. allowable no. of entries
4. If internal program data structures have
prescribed boundaries (for e.g. an array of
size 100) then test cases should be
designed to exercise the data structure at
its boundary
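A minimal sketch of BVA test value selection, assuming a hypothetical input range of 0 to 100 (a percentage) and a hypothetical function under test; it is only meant to show the just-below, at and just-above pattern from guideline 1:
def bva_values(a, b):
    # Boundary value analysis for an input range bounded by a and b:
    # values just below, at and just above each boundary.
    return [a - 1, a, a + 1, b - 1, b, b + 1]
def accept_percentage(p):
    # Hypothetical function under test: valid input is 0..100 inclusive.
    return 0 <= p <= 100
for value in bva_values(0, 100):
    print(value, "->", "accepted" if accept_percentage(value) else "rejected")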
Software Quality
Quality is defined as a characteristic or attribute
of something
As an attribute of an item, quality refers to
measurable characteristics such as:
Length
Colour
Etc
But for software, defining its attributes is more
challenging than for physical objects
But measures of a programs characteristics do
exist
These include:
Cyclomatic complexity
Cohesion
No. of FPs
LOCs
Etc
When an item is examined based on its measurable
characteristics, two kinds of quality may be
encountered
Quality of design
Quality of conformance
Quality of design: refers to characteristics that
designers specify for an item
For software, quality of design encompasses:
Requirements specifications
Design of the system
Quality of conformance: is the degree to which
the design specifications are followed during
manufacturing
Quality of conformance is an issue that focuses
primarily on implementation
If the implementation follows the design and the
resulting system meets its requirements and
performance goals, conformance quality is high


The Quality Movement
The quality movement began in the 1940s
initiated by W. Edwards Deming
Using Deming's approach, the Japanese
developed a systematic approach to the
elimination of the root causes of product defects
Their work migrated to the western world in the
1970s and 80s
This technique came to be known as Total
Quality Management or TQM
It has 4 basic steps

1. Kaizen: Refers to a system of continuous
process improvement
The goal of Kaizen is to develop a system
(in this case, a software process) that is
visible, repeatable and measurable
2. Atarimae Hinshitsu: This step follows
after Kaizen has been achieved
This step examines factors that affect the
process and works to optimize their impact
on the process
For e.g. the software process may be
affected by high staff turnover
High staff turnover may be due to frequent
reorganizations within the company
Thus, by ensuring a stable organization, the
quality of the software can be improved
Atarimae Hinshitsu would lead management
to suggest changes in the way
reorganization occurs
3. Kansei: While the first two steps focus on
the process, this step concentrates on
the user of the product
Kansei translates to the five senses
By examining the way the user applies the
product, Kansei leads to improvement in the
product itself and potentially to the process
that created it
4. Miryokuteki Hinshitsu: it broadens management
concern beyond the immediate product
It forces the management to observe the use of the
product in the marketplace
In the software world, Miryokuteki Hinshitsu might be
an attempt to uncover new and profitable products or
applications that are an outgrowth from an existing
computer-based system
For most companies, Kaizen should be of
immediate concern
Until a good software process has been
achieved, there's little point in moving to the next
steps

Statistical quality assurance reflects a growing trend
throughout the industry to become more quantitative
about quality
For software statistical quality assurance implies the
following steps:
1. Information about software defects is collected
and categorized
2. An attempt is made to trace each defect to its
underlying cause e.g.
Non conformance to specification
Design error
Violation of standards
Poor communication with customer

Statistical Quality Assurance
3. Using the Pareto principle (80% of the
defects traced to 20% of all causes),
isolate the 20% (the vital few)
4. Once all the vital causes have been
identified, move to correct problems that
have caused the defects

This simple concept represents an
important step towards an adaptive
software engg. process in which changes
are made to improve those elements that
introduce error

To explain the process, assume that a
software company collects information on
defects for a period of 1 year
Some errors are uncovered as software is
being developed
Other defects are encountered after the
software has been released to its end user
Although 100s of different errors are
uncovered, all can be traced to one (or
more) of the following causes:
Incomplete or erroneous specifications (IES)
Misinterpretation of customer communication (MCC)
Intentional deviation from specification (IDS)
Violation of programming standards (VPS)
Error in data representation (EDR)
Inconsistent module interface (IMI)
Error in design logic (EDL)
Incomplete or erroneous testing (IET)
Inaccurate or incomplete documentation (IID)
Error in programming language translation of design
(PLT)
Ambiguous or inconsistent human-computer interface
(HCI)
Miscellaneous (MIS)

To apply statistical quality assurance, a table
such as the one below is built (No. = count of
errors, % = share of that column's total):

Error    Total       Serious     Moderate    Minor
         No.   %     No.   %     No.   %     No.   %
IES      205   22    34    27    68    18    103   24
MCC      156   17    12    9     68    18    76    17
IDS      48    5     1     1     24    6     23    5
VPS      25    3     0     0     15    4     10    2
EDR      130   14    26    20    68    18    36    8
IMI      58    6     9     7     18    5     31    7
EDL      45    5     14    11    12    3     19    4
IET      95    10    12    9     35    9     48    11
IID      36    4     2     2     20    5     14    3
PLT      60    6     15    12    19    5     26    6
HCI      28    3     3     2     17    4     8     2
MIS      56    6     0     0     15    4     41    9
Totals   942   100   128   100   379   100   435   100
The table indicates that IES,MCC and EDR are
the vital few causes that account for 53% of all
errors
It should be noted that IES, EDR, PLT and EDL
would be selected as the vital few only when
serious errors are considered
Once the vital few causes are determined, the
software development organization can begin
corrective action
For e.g. to correct MCC, the software developer
might try to think of methods to improve the
quality of customer communication and
specification
As the vital few causes are corrected, new
candidates pop to the top of the stack
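As a rough sketch of steps 3 and 4, the Python lines below rank causes by defect count and pick the "vital few" that together account for a chosen share of all defects. The counts are the Total column of the table above; the cut-off values and the function name are illustrative assumptions.

# A minimal sketch of the Pareto "vital few" selection used in
# statistical quality assurance. The counts are the Total column of
# the table above; the cut-off and function name are illustrative.

defect_counts = {
    "IES": 205, "MCC": 156, "IDS": 48, "VPS": 25, "EDR": 130, "IMI": 58,
    "EDL": 45, "IET": 95, "IID": 36, "PLT": 60, "HCI": 28, "MIS": 56,
}

def vital_few(counts, cutoff=0.80):
    """Return the most frequent causes until `cutoff` of all defects is covered."""
    total = sum(counts.values())
    selected, covered = [], 0
    for cause, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        selected.append(cause)
        covered += n
        if covered / total >= cutoff:
            break
    return selected

# With a 50% cut-off this returns ['IES', 'MCC', 'EDR'] -- the three causes
# that together account for about 53% of all errors, as noted above.
print(vital_few(defect_counts, cutoff=0.50))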


In conjunction with this defect information,
software developer can calculate an error index
(EI) for each major step in the software engg.
process as follows:

1. After analysis, design, coding, testing and
release, the following data are gathered:
Ei = the total no. of errors uncovered during the
ith step in the software engg. process
Si = the no. of serious errors
Mi = the no. of moderate errors
Ti = the no. of minor errors
PS = the size of the product (LOC, design
statements, pages of documentation) at the
ith step

2. Next, weight factors are chosen as:
ws = 10 for serious errors
wm= 3 for moderate errors
wt = 1 for minor (trivial) errors
These weighting factors for each phase
should become larger as development
progresses
This rewards an organization that finds
errors early
3. At each step of the software engg.
process, a phase index (PIi) is
computed as:
PIi = ws(Si/Ei) + wm(Mi/Ei) + wt(Ti/Ei)
4. The error index (EI) is computed by
calculating the cumulative effect of each PIi

5. The weighting factors encountered later
in the software engg. process weigh
more heavily:
EI = Σ(i × PIi)/PS
   = (PI1 + 2PI2 + 3PI3 + ... + i×PIi)/PS
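The arithmetic of steps 1 to 5 can be seen in the short Python sketch below. The per-phase error counts and the product size PS are invented figures used only to show how PIi and EI are computed; they are not taken from any real project.

# A minimal sketch of the phase index (PIi) and error index (EI)
# computation described in steps 1-5. The per-phase error counts and
# the product size PS are invented figures, used only for illustration.

WS, WM, WT = 10, 3, 1    # weights for serious, moderate and minor errors

# (serious, moderate, minor) errors found at each step, in order:
# analysis, design, coding, testing, release
phase_errors = [(4, 10, 20), (6, 12, 30), (8, 20, 40), (3, 9, 25), (1, 2, 5)]
PS = 12000               # product size, e.g. LOC

def phase_index(s, m, t):
    e = s + m + t        # Ei: total errors uncovered in this step
    return WS * (s / e) + WM * (m / e) + WT * (t / e)

# EI = sum(i * PIi) / PS, weighting later phases (larger i) more heavily
EI = sum(i * phase_index(*errors) for i, errors in enumerate(phase_errors, start=1)) / PS
print(round(EI, 4))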
Conclusion
The error index is an overall indication of
improvement in software quality
The application of statistical SQA and the
Pareto principle says:
Spend your time focusing on things that
really matter, but be sure that you
understand what really matters!
Software Reliability
The reliability of a computer program is an
important element of its overall quality
If a program repeatedly and frequently fails to
perform, it matters little whether other software
quality factors are acceptable
Software reliability can also be measured,
directed and estimated using historical and
developmental data
Software reliability is defined in statistical terms
as the probability of failure free operation of a
computer program in a specified environment for
a specified time

For e.g. if a program X is estimated to have a
reliability of 0.96 over 8 elapsed processing
hours then:
It means that if program X were to be
executed 100 times, each run requiring 8 hours
of elapsed processing time (execution time), it
would be likely to operate correctly (without
failure) 96 times out of 100
But the important question is: What is meant by
the term failure?
In the context of software quality and reliability,
failure is nonconformance to software
requirements
But there are grades of failure
Failure can be annoying or catastrophic
One failure can be corrected within seconds
while another may require weeks or even
months to correct
Complicating this issue even further, the
correction of one failure may result in the
introduction of other errors that ultimately
result in other failures

Measures of reliability and safety
Early work in software reliability attempted
to extrapolate reliability in terms of
hardware
Most hardware-related reliability models
are predicated on failure due to wear rather
than failure due to design defects
In hardware, failures due to physical wear
are more likely than design-related failures
Unfortunately, the opposite is true for
software

In fact, all software failures can be traced to
design or implementation problems
Wear and tear does not enter into the picture
But there are a few simple concepts that can be
applied to software reliability and availability
For a computer-based system, a simple
measure of reliability is mean time between
failure (MTBF) where,
MTBF= MTTF+ MTTR
where,
MTTF= Mean time to failure
MTTR= Mean time to repair
Many researchers agree that MTBF is a
far more useful measure than
defects/KLOC
The reason is that an end user is
concerned with failures, not with the total
defect count
Because each defect contained within a
program doesn't have the same failure
rate, the total defect count provides little
indication of the reliability of the system

For e.g. consider a program that has been in
operation for 14 months
Many defects in the program may remain
undetected for decades before they are
discovered
The MTBF of such obscure defects might be 50
or even 100 years
Other defects, as yet undiscovered, might have
a failure rate of 18 or even 24 months
Even if every one of the first category of defects
(those with long MTBF) is removed, the impact
on software reliability is negligible
This is because hardly anyone uses a piece of
software for that long
In addition to a reliability measure, a measure of
availability is also developed
Software availability is the probability that a
program is operating according to requirements
at a given point in time and is defined as:
Availability = [MTTF/(MTTF + MTTR)] × 100%
The MTBF reliability measure is equally
sensitive to MTTF and MTTR
But the availability measure is somewhat more
sensitive to MTTR
MTTR is also an indirect measure of the
maintainability of software
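A tiny worked example of the two measures above is sketched below; the MTTF and MTTR figures are invented, purely for illustration.

# A minimal sketch of the MTBF and availability measures above.
# The MTTF and MTTR figures are invented, purely for illustration.

MTTF = 680.0    # mean time to failure, in hours
MTTR = 20.0     # mean time to repair, in hours

MTBF = MTTF + MTTR                            # 700.0 hours
availability = MTTF / (MTTF + MTTR) * 100     # ~97.14 %

print(f"MTBF = {MTBF} hours, availability = {availability:.2f}%")

# Halving MTTR to 10 hours raises availability to ~98.55% while MTBF barely
# changes, which illustrates why availability is more sensitive to MTTR.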

McCall Quality Factors
The factors that affect quality can be
categorized in 2 broad groups:
1. Factors that can be directly measured
(e.g defects per FP)
2. Factors that can be measured only
indirectly (e.g usability and
maintainability)
But in each case, measurement must
occur

McCall and his colleagues proposed a
useful categorization of factors that affect
software quality
These software quality factors focus on 3
important aspects of a software product:
Its operational characteristics (Product
Operation)
Its ability to undergo change (Product
Revision)
Its adaptability to new environment (Product
Transition)

It is difficult and sometimes impossible to
develop direct measures of the above
quality factors
Therefore, a set of metrics are defined and
used to develop expressions for each of
the factors according to the following
relationship:
Fq = c1m1 + c2m2 + ... + cnmn

where: Fq is a software quality factor
cn are regression coefficients
mn are the metrics that affect the
quality factor

The metrics are in the form of a checklist that is
used to grade specific attributes of the software
The grading scheme proposed by McCall is a 0
(low) to 10 (high) scale
Some of the metrics used in the grading scheme
are:
Accuracy
Completeness
Consistency
Error tolerance
Security
Simplicity
Training
etc
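A small sketch of how the relationship Fq = c1m1 + c2m2 + ... + cnmn might be evaluated in practice is given below. The 0 to 10 metric grades and the regression coefficients are invented placeholders, not McCall's published values.

# A minimal sketch of evaluating a McCall quality factor
# Fq = c1*m1 + c2*m2 + ... + cn*mn.
# The 0-10 metric grades and the regression coefficients below are
# invented placeholders, not McCall's published values.

metric_grades = {            # graded via the checklist, 0 (low) to 10 (high)
    "accuracy": 8, "completeness": 7, "consistency": 9, "simplicity": 6,
}
coefficients = {             # regression coefficients for one quality factor
    "accuracy": 0.30, "completeness": 0.30, "consistency": 0.25, "simplicity": 0.15,
}

def quality_factor(grades, coeffs):
    """Fq as the weighted sum of the metric grades."""
    return sum(coeffs[m] * grades[m] for m in coeffs)

print(quality_factor(metric_grades, coefficients))   # 7.65 for this made-up factor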

FURPS
HP developed a set of software quality
factors
They are known as FURPS
FURPS stands for functionality, usability,
reliability, performance and supportability
The FURPS quality factors derive from
earlier work
The attributes are defined as:

Functionality is assessed:
by evaluating the feature set and capabilities
of the program
the generality of the functions that are
delivered, and
the security of the overall system
Usability is assessed:
By considering human factors
Overall aesthetics
Consistency, and
documentation
Reliability is evaluated:
By measuring the frequency and severity of
failure
The accuracy of the output results
The mean time between failures (MTBF)
The ability to recover from failure, and
The predictability of the program
Performance is measured:
By processing speed
Response time
Resource consumption
Throughput, and
Efficiency
Supportability combines:
Ability to extend the program (extensibility)
Adaptability
Serviceability
Testability
Compatibility
Configurability
Ease of installation
Ease with which problems can be localized
The FURPS quality factors and attributes
described above can be used to establish
quality metrics for each activity in the
software process
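One way of putting that last point into practice, sketched below as an assumption rather than an HP-defined format, is to keep a simple per-factor checklist of the attributes to be assessed for each activity in the process.

# An illustrative FURPS checklist structure; the attribute groupings
# mirror the definitions above, but the format itself is an assumption,
# not something prescribed by HP.

furps_checklist = {
    "functionality":  ["feature set and capabilities", "generality of functions", "security"],
    "usability":      ["human factors", "overall aesthetics", "consistency", "documentation"],
    "reliability":    ["frequency/severity of failure", "accuracy of output", "MTBF", "recoverability"],
    "performance":    ["processing speed", "response time", "resource consumption", "throughput"],
    "supportability": ["extensibility", "adaptability", "serviceability", "testability", "configurability"],
}

for factor, attributes in furps_checklist.items():
    print(f"{factor}: {', '.join(attributes)}")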

Software Quality Assurance
Software Quality Assurance is an essential
activity for any business that produces products
to be used by others
During the early days of computing, quality was
the sole responsibility of the programmer
Standards for quality assurance for software were
first introduced by the military during the 1970s
in its software development contracts
But today, the significance is such that many
different constituencies in an organization have
SQA responsibility
They include people ranging from:
Software Engineers
Project managers
Customers
Salespersons
Individuals who serve within the SQA group
SQA is defined as a planned and systematic
pattern of actions to ensure quality in software
The SQA group serves as the customer's in-
house representative
This means that the people who perform SQA
must look at software from the customer's point
of view
Some of the questions they must look into are:
Does the software adequately meet the
quality factors noted in requirements?
Has the software development been
conducted according to pre-established
standards?
Have technical disciplines properly performed
their roles as part of the SQA activity?
The SQA group attempts to answer these and
other questions to ensure that software
quality is maintained

SQA Activity
It is composed of a variety of tasks
associated with two different
constituencies:
1. The software engineers who do technical
work
2. SQA group that has the responsibility for
quality assurance planning, oversight,
record keeping, analysis and reporting

Software engineers address quality (and
perform quality assurance) by:
applying solid technical methods and
measures
Conducting formal technical reviews
Performing well-planned software testing
The SQA group works to assist the
software engineering team in achieving a
high quality end-product
The Software Engineering Institute
recommends a set of SQA activities that
address:
Quality assurance planning
Oversight
Record keeping
Analysis, and
Reporting
It is these activities that are performed by an
independent SQA group:
1. Prepares an SQA plan for a project:
The plan is developed during project planning and
is reviewed by all interested parties
QA activities performed by the software engg. team
and the SQA group are governed by the plan
The plan identifies:
Procedures for error reporting and tracking
Documents to be produced by the SQA group
Amount of feedback provided to software
project team
2. Participates in the development of the
project's software process description:
The software engg. team selects a process
for the work to be performed
The SQA group reviews the process
description for compliance with the
organizational policy, internal software
standards, externally imposed standards (e.g
ISO 9001) and other parts of the software
project plan
3. Reviews software engg. activities to verify
compliance with the defined software process
The SQA group identifies, documents, and tracks
deviations from the process and verifies that
corrections have been made
4. Audits designated software work products to
verify compliance with those defined as part of
the software process
The SQA group reviews selected work products,
identifies, documents, and tracks deviations, and
verifies that corrections have been made
The group periodically reports the results of its work
to the project manager
5. Ensures that deviations in software work and
work products are documented and handled
according to a documented procedure
Deviations may be encountered in the project plan,
process description, applicable standards or technical
work products
6. Records non-compliance and reports to senior
management
Noncompliance items are tracked until they are
resolved
The SQA group coordinates the control and
management of change and helps collect and
analyze software metrics