By Richard Bender
When you ask testers how they know they are done testing, the most common responses are:

We test until we are out of time and resources;

We test until all of the test cases we created ran successfully at least once and there are no outstanding severe defects.

I admire the honesty of the first answer, which comes from the "clean conscience" school of testing: "I did all the testing I could under the constraints management gave me and my conscience is clear". The obvious question that follows the second answer is how much function and code were actually tested? In the vast majority of cases the team has no quantitative measure of their level of testing.

Stepping back, testing is divided into the following eight activities:

Define Test Completion Criteria: The test effort has specific, quantifiable goals. Testing is completed only when the goals have been reached (e.g., testing is complete when the tests that address 100% functional coverage of the system all have executed successfully).

Design Test Cases: Logical test cases are defined by five characteristics: the initial state of the system prior to executing the test; the data in the system (e.g., data base values); the inputs; the expected results; and the system state after the test executes.

Build Test Cases: There are two parts needed to build test cases from logical test cases: creating the necessary data; and building the components to support testing (e.g., build the navigation to get to the portion of the program being tested).

Execute Tests: Execute the test case steps against the system being tested and document the results.

Verify Test Results: Verify that the expected test results match the observed results. [Note: This pre-supposes that the specifications are clear enough and detailed enough to actually calculate the expected answer ahead of time. Testing specifications to ensure that they are correct, unambiguous, logically consistent, and written in sufficient detail is a non-trivial issue and the subject of another paper.]

Verify Test Coverage: Track the amount of coverage achieved by the successful execution of each test.

Manage the Test Library: Maintain the relationships between the test cases and the programs being tested. Keep track of what tests have/have not been executed, and whether the executed tests have passed or failed.

Manage the Resolution of Identified Defects: Track the status of defects and retest as needed.

The first two steps are totally intertwined. Testing, by definition, is comparing an expected answer to the observed answer. You need to define quantitatively and qualitatively how much testing is enough and then design tests that will ensure that those criteria are met. You must do this for each type of testing: functional, performance, usability, security, etc. Given the space constraints, we will only address functional testing in this paper.

The first thing we need to understand is that you cannot exhaustively test any software system. The upper limit to the total number of tests for a program is:

[2^n (L1*L2*...*Lx)(V1*V2*...*Vy)]!

where "n" is the number of decisions, "Li" is the number of times a given decision can loop, x is the number of decisions which cause loops (x <= n), "Vi" is the number of all of the possible values that each input variable can have, and y is the number of input variables. The factorial ("!") is there because the order in which the tests are executed does make a difference to the results. This number is actually absolutely meaningless mathematically as well as being practically impossible to achieve. In many programs this number exceeds the number of molecules in the universe [10^80 according to Stephen Hawking].

The goal of test case design is to identify an extremely small subset of the possible combinations of data that will give you
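To make the upper-limit formula for the total number of tests concrete, here is a minimal Python sketch that evaluates it for a deliberately tiny program. The function names (`upper_limit_base`, `log10_factorial`) and the sample program (3 decisions, one loop, two two-valued inputs) are illustrative assumptions, not taken from the paper. Because the factorial is far too large to print usefully, the sketch reports its base-10 logarithm instead.

```python
import math


def upper_limit_base(n_decisions, loop_counts, value_counts):
    """Return 2^n * (L1*...*Lx) * (V1*...*Vy), the quantity inside the factorial."""
    if len(loop_counts) > n_decisions:
        raise ValueError("x (looping decisions) cannot exceed n (decisions)")
    return (2 ** n_decisions) * math.prod(loop_counts) * math.prod(value_counts)


def log10_factorial(m):
    """Return log10(m!) using lgamma, so the huge integer is never built."""
    return math.lgamma(m + 1) / math.log(10)


# Hypothetical tiny program: 3 decisions, one of which can loop twice,
# and two input variables with 2 possible values each.
base = upper_limit_base(3, [2], [2, 2])   # 8 * 2 * 4 = 64
magnitude = log10_factorial(base)         # log10(64!)

print(f"quantity inside the factorial: {base}")
print(f"upper limit is about 10^{magnitude:.1f} tests")
print("exceeds 10^80:", magnitude > 80)
```

Even this toy program, far smaller than any real system, already pushes the bound past 10^80, which is why the bound is useful only as an argument against exhaustive testing.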
Figure 3

Figure 4

B is always true. There is no Geneva Convention for software which limits us to one defect per function.

Figure 3 shows the results of running the tests. When we run test variation 1 the software says A is not true, it is false. However, it also says B is not false, it is true. The result is we get the right answer for the wrong reason. When we run the second test variation we enter B true, which the software always thinks is the case, so we get the right answer. When we enter the third variation with just C true, the software thinks both B and C are true. Since this is an inclusive "or" we still get the right answer. We are by now reporting to management that we are three quarters done with our testing and everything is looking great. Only one more test to run and we are ready for production. However, it is only when we enter the fourth test with all inputs false and still get D true that we know we have a problem.

There are two key things about this example so far. The first is that software, even when it is riddled with defects, will still produce correct results for many of the tests. The second thing is that if you do not pre-calculate the answer you were expecting and compare it to the answer you got, you are not really testing. Sadly, the majority of what purports to be testing in our industry does not meet this criterion. People look at the test results and just see if they look "reasonable". Part of the problem is that the specifications are not in sufficient detail to meet the most basic definition of testing.

... can now see the A defect. When any defect is detected all of the related tests must be rerun.

The above example addresses the issue that two or more defects can sometimes cancel each other out, giving the right answers for the wrong reasons. The problem is worse than that. The issue of observability must be taken into account. When you run a test how do you know it worked? You look at the outputs. For most systems these are updates to the databases, data on screens, data on reports, and data in communications packets. These are all externally observable.

In Figure 5 let us assume that node G is the observable output. C and F are not externally observable. We will indirectly deduce that the A, B, C function worked by looking at G. We will indirectly deduce that the D, E, F function worked by looking at G. Let us further assume there is a defect at A where the code always assumes that A is false no matter what the input is. A fairly obvious test case would be to have all of the inputs set to true. This should result in C, F, and G being set to true. When this test is entered the software says A is not true, it is false. Therefore, C is not set to the expected true value but is set to false. However, when we get to G it is still true as we expected because the D, E, F leg worked. In this case we did not see the defect at C because it was hidden by the F leg working correctly.

Therefore, the test case design algorithms must factor in:

The relations between the variables (e.g., and, or, not);
The constraints between the data attributes (e.g., it is physically impossible for variables one and two to be true at the same time);
The functional variations to test (i.e., the primitives to test for each logical relationship); and
Node observability.

The design of the set of tests must be such that if one or more defects are present, you are mathematically guaranteed that at least one test case will fail at an observable point. When that defect is fixed, if any additional defects are present, then one or more tests will fail at an observable point.

A by-product of these algorithms is that some variations get flagged as "untestable". That means there is no way to design a set of tests which includes this variation and still guarantees that all defect scenarios will be observable. This is caused by a combination of constraints

Figure 5
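The two masking effects described in the text (defects that cancel each other out, and a defect hidden behind an unobservable node) can be reproduced with a small Python sketch. The figures are not available here, so the logic is an assumption inferred from the prose: D = A or B or C for the first example, and C = A and B, F = D and E, G = C or F for the second. All function names are hypothetical.

```python
from itertools import product


# Example 1 (assumed logic): D = A or B or C, with two defects in the code.
def spec_d(a, b, c):
    return a or b or c


def buggy_d(a, b, c):
    a = False  # defect: A is always treated as false
    b = True   # defect: B is always treated as true
    return a or b or c


# The four test variations from the text: A only, B only, C only, all false.
variations = [
    (True, False, False),
    (False, True, False),
    (False, False, True),
    (False, False, False),
]
failing = [v for v in variations if spec_d(*v) != buggy_d(*v)]
print("variations that expose a defect:", failing)


# Example 2 (assumed logic): C = A and B, F = D and E, G = C or F.
# Only G is externally observable.
def spec_g(a, b, d, e):
    return (a and b) or (d and e)


def buggy_g(a, b, d, e):
    a = False  # defect: A is always treated as false
    return (a and b) or (d and e)


all_true = (True, True, True, True)
print("all-true test passes despite the defect:",
      spec_g(*all_true) == buggy_g(*all_true))

# Tests in which the defect at A propagates to the observable output G.
exposing = [t for t in product([False, True], repeat=4)
            if spec_g(*t) != buggy_g(*t)]
print("tests that expose the A defect at G:", exposing)
```

Under these assumptions, only the all-false variation reveals a problem in the first example, and the A defect in the second example is visible at G only when the D, E, F leg evaluates to false. That is the sense in which test design has to account for node observability, not only for the logic at each node.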