
International Journal of Computational Intelligence and Information Security, December 2011 Vol. 2, No. 12

Code Coverage Technique Selection and Prioritization of Unit Test Cases in Regression Testing

Ratnesh Kumar Dubey, M.Tech (CSE) Student, LNCT, Bhopal, India, ratneshdub@gmail.com

Alka Gulati, CSE Dept., LNCT, Bhopal, India, gulatialka@rediffmail.com

Dr Binod Kumar, H.O.D., MCA Deptt., LNCT, Bhopal, India, binod.istar.1970@gmail.com

Abstract

A regression test selection technique selects an appropriate number of test cases from a test suite that might expose a fault in the modified program. One of the common performance goals is to run those test cases that achieve total code coverage at the earliest. In this paper, we propose both a regression test selection technique and a prioritization technique, based on a model that achieves 100% code coverage optimally during version-specific regression testing. We implemented our regression test selection technique and demonstrated in two case studies that it is effective at selecting and prioritizing test cases.

Keywords: Test Case, Prioritization of Test Cases, Regression Testing.

1. Introduction
Test suite development is an expensive process, and even a conscientiously maintained test suite can grow quite large. Most of the time, running an entire suite is not possible because it takes a significant amount of time to run all the tests in it. So researchers have proposed various techniques for minimizing test suites. Test suite minimization techniques lower costs by reducing a test suite to a minimal subset that maintains coverage equivalent to that of the original set with respect to a particular test adequacy criterion [1, 5, 8, 10]. Test suite minimization techniques, however, can have drawbacks: although some empirical evidence indicates that, in certain cases, there is little or no loss in the ability of a minimized test suite to reveal faults in comparison to its unminimized original [10, 11], other empirical evidence shows that the fault detection capability of test suites can be severely compromised by minimization [1]. The other way of assisting testing is to prioritize test cases on the basis of some criterion.

2. Test Case Prioritization


Test case prioritization is the process of scheduling test cases in an order that meets some performance goal. The test case prioritization problem is defined in [2] as follows.

Given: T, a test suite; PT, the set of permutations of T; f, a function from PT to the real numbers.

Problem: Find $T' \in PT$ such that $(\forall T'')\,(T'' \in PT)\,(T'' \neq T')\;[f(T') \geq f(T'')]$.

Here, PT represents the set of all possible prioritizations (orderings) of T, and f is a function that, applied to any such ordering, yields an award value for that ordering. The definition assumes that higher award values are preferred over lower ones. There can be a number of possible goals of test case prioritization: testers may wish to increase the rate of fault detection, to find the critical faults at the earliest, or to cover the maximum code at the earliest during testing. All these goals lead to one prime goal: testers want to increase their confidence in the reliability of the system at a faster rate. So testers are interested in finding the maximum number of faults, as well as the most critical faults, at the earliest [4].

We address the problem of test case prioritization during regression testing. Regression testing is performed on modified software to provide confidence that the software behaves correctly and that the modifications have not adversely impacted software quality. Software engineers often save the test suites they develop so that they can reuse them later as the software evolves.

There are two varieties of test case prioritization, viz. general test case prioritization and version-specific test case prioritization. In general test case prioritization, for a given program P and test suite T, we prioritize the test cases in T so that the ordering will be useful over a succession of subsequent modified versions of P, without any knowledge of the modified versions. In version-specific test case prioritization, we prioritize the test cases in T when P is modified to P', with knowledge of the changes that have been made to P. Various researchers have proposed different prioritization techniques [2, 3, 6, 7, 9]. Studies have shown that some of these techniques can significantly increase the rate of fault detection of a test suite in comparison to an unordered or randomly ordered test suite [2, 7]. Most of these prioritization techniques are general prioritization techniques. In this work, we concentrate on version-specific test case prioritization. We propose a prioritization technique that achieves modified code coverage at the fastest rate possible.
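The definition from [2] given earlier in this section can be illustrated with a small brute-force search over PT. This is only a sketch: the three test names, their coverage, and the award function f (which favours orderings that cover new lines early) are hypothetical and are not taken from the paper.

```python
from itertools import permutations

# Hypothetical coverage data: test name -> set of covered line numbers.
coverage = {"A": {1, 2}, "B": {2}, "C": {3, 4, 5}}

def award(ordering):
    """Hypothetical award function f: lines covered earlier score higher."""
    seen, score = set(), 0
    n = len(ordering)
    for position, test in enumerate(ordering):
        new_lines = coverage[test] - seen
        score += len(new_lines) * (n - position)
        seen |= new_lines
    return score

def best_ordering(tests, f):
    """Return an ordering T' in PT with f(T') >= f(T'') for every other T''."""
    return max(permutations(tests), key=f)

print(best_ordering(coverage, award))  # ('C', 'A', 'B')
```

Enumerating PT costs n! evaluations of f, so a search like this is only practical for very small suites.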

3. Problem Statement
Let P be a procedure or program, P' be a modified version of P, and T be a set of code-coverage-based tests (a test suite) created to test P. When P is modified to P', we have to find T', a subset of T that achieves maximum code coverage at the earliest and should be given the highest priority during regression testing. For this purpose, we want to identify tests that:
(i) Execute code that has been deleted, so that test cases that have become redundant can be deleted from T.
(ii) Execute modified code at least once at the earliest.
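One possible way to state the selection part of this problem formally is as a covering problem. The notation is an assumption introduced here for illustration and does not appear in the paper: L(t) is the set of lines covered by test case t, and M is the set of modified lines.

```latex
\[
T' \;=\; \arg\min_{S \subseteq T} \; |S|
\quad \text{subject to} \quad
M \cap \bigcup_{t \in T} L(t) \;\subseteq\; \bigcup_{t \in S} L(t)
\]
```

Read this way, T' is a smallest subset of T that still covers every modified line that T covers. This is an instance of the set cover problem, for which picking the test case with the most uncovered lines at each step is a standard heuristic.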

4. Regression Test Selection Algorithm


We propose a version-specific regression test selection algorithm. This has the advantage over general test case prioritization that we can use the information about the changed code to prioritize the test cases. We are interested in executing only those lines of code that have been modified, and our aim is to execute the modified lines of code with the minimum number of test cases. We have a test case history that records the code lines covered by each test case.

Suppose that, for a program of 60 lines of code, there are 10 code-coverage-based test cases. Test case T1 covers lines 1, 2, 20, 30, 40, 50; test case T2 covers lines 1, 3, 4, 21, 31, 41, 51; and so on, up to test case T10, which covers lines 1, 2, 3, 4, 60. This test case coverage information is given as input to the algorithm and stored in the test case array:

Test case array =
T1:   1, 2, 20, 30, 40, 50
T2:   1, 3, 4, 21, 31, 41, 51
T3:   5, 6, 7, 8, 22, 32, 42, 52
T4:   6, 9, 10, 23, 24, 33, 43, 54
T5:   5, 9, 11, 12, 13, 14, 15, 20, 29, 37, 38, 39
T6:   15, 16, 17, 18, 19, 23, 24, 25, 34, 35, 36
T7:   26, 27, 28, 40, 41, 44, 45, 46
T8:   46, 47, 48, 49, 50, 53, 55
T9:   55, 56, 57, 58, 59
T10:  1, 2, 3, 4, 60

Suppose that, after the initial tests, lines 1, 2, 5, 15, 35, 45, 55 are modified. The simplest way to execute all modified lines of code is to run every test case that covers any of the modified lines. In this problem there are 8 test cases, viz. T1, T2, T3, T5, T6, T7, T8, T9, that cover at least one modified line. So the simplest solution is to run all these 8 test cases. But this is not the optimum solution. The proposed model reduces the number of test cases that need to be run for 100% coverage of the modified code. The rest of this section shows how we achieve this.

Test case T1 covers two modified lines, lines 1 and 2. Similarly, test case T2 covers one modified line, line 1. This is computed for all test cases, as shown in Table 1.


Table 1
Test case   Lines matched (found)   No. of matches (nfound)
T1          1, 2                    2
T2          1                       1
T3          5                       1
T4          -                       0
T5          5, 15                   2
T6          15, 35                  2
T7          45                      1
T8          55                      1
T9          55                      1
T10         -                       0
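The matching step behind this table amounts to intersecting each test case's covered lines with the modified lines. A minimal sketch for four of the test cases follows; the function name `find_matches` and the dictionary layout are illustrative choices, not the authors' C++ implementation.

```python
def find_matches(covered_lines, modified_lines):
    """Return the modified line numbers that a test case covers, ascending."""
    return sorted(set(covered_lines) & set(modified_lines))

mod_locode = [1, 2, 5, 15, 35, 45, 55]

# Coverage of four of the ten test cases from the example.
test_cases = {
    "T1": [1, 2, 20, 30, 40, 50],
    "T2": [1, 3, 4, 21, 31, 41, 51],
    "T4": [6, 9, 10, 23, 24, 33, 43, 54],
    "T5": [5, 9, 11, 12, 13, 14, 15, 20, 29, 37, 38, 39],
}

for name, lines in test_cases.items():
    found = find_matches(lines, mod_locode)
    print(name, found, len(found))  # test case, found, nfound
```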

We sort the number of matches found (the nfound array) in descending order and select the test case with the maximum number of matches as the next test case to be executed. This is represented in Table 2 by setting its candidate value = 1. Since test case T1 has now been selected and it covers lines 1 and 2, the modified lines of code still to be executed are [1, 2, 5, 15, 35, 45, 55] - [1, 2] = [5, 15, 35, 45, 55]. Again we count the remaining modified lines of code covered by each test case, sort the counts in descending order, and select the test case with the maximum number of matches. This is repeated until all the modified lines of code are covered, as shown in Tables 2, 3 and 4.

Table 2
Test case   No. of matches (nfound)   Matches found   Candidate
T1          2                         1, 2            1
T5          2                         5, 15           0
T6          2                         15, 35          0
T2          1                         1               0
T3          1                         5               0
T7          1                         45              0
T8          1                         55              0
T9          1                         55              0

Table 3
Test case   No. of matches (nfound)   Matches found   Candidate
T5          2                         5, 15           1
T6          2                         15, 35          0
T3          1                         5               0
T7          1                         45              0
T8          1                         55              0
T9          1                         55              0

Modified (mod_locode) = [35, 45, 55]

Table 4
Test case   No. of matches (nfound)   Matches found   Candidate
T6          1                         35              1
T7          1                         45              1
T8          1                         55              1
T9          1                         55              0

From Tables 2, 3 and 4, the test cases that have candidate value = 1 are T1, T5, T6, T7, T8. So these are the test cases that must be run.


Discussion: Out of the ten test cases in the above example, we need to run only 5 for 100% coverage of the modified lines of code, a 50% saving in test cases. If we instead ran every test case that covers any of the modified lines of code, we would need to run 8 test cases (T1, T2, T3, T5, T6, T7, T8, T9). So even compared with that approach, the proposed algorithm saves 3 more test cases, i.e. 30% of the suite.

Let us consider another example, in which lines of code have been modified as well as deleted. The example considers 20 lines of code and 5 code-coverage-based test cases, viz. T1, T2, T3, T4, T5.

Test case array =
T1:  1, 5, 7, 15, 20
T2:  2, 3, 4, 5, 8, 16, 20
T3:  6, 8, 9, 10, 11, 12, 13, 14, 17, 18
T4:  1, 2, 5, 8, 17, 19
T5:  1, 2, 6, 8, 9, 13

Suppose that, after the initial tests, lines 6, 13, 17, 20 are changed and the following lines of code are deleted: (del_locode) = [4, 7, 15]. In this case, we first remove the deleted lines of code from the test case history information. After removing the deleted lines of code from all test cases, they appear as:

Test case array =
T1:  1, 5, 20
T2:  2, 3, 5, 8, 16, 20
T3:  6, 8, 9, 10, 11, 12, 13, 14, 17, 18
T4:  1, 2, 5, 8, 17, 19
T5:  1, 2, 6, 8, 9, 13

Table 5
Test case   Line number of LOC   Found in test case   Redundant (Y/N)
T1          1                    T4                   Y
T1          5                    T2                   Y
T1          20                   T2                   Y
T5          6                    T3                   Y
T5          8                    T3                   Y
T5          9                    T3                   Y
T5          1                    T4                   Y
T5          2                    T2                   Y
T5          13                   T3                   Y

The remaining test cases are [T2, T3, T4].

Test case array =
T2:  2, 3, 5, 8, 16, 20
T3:  6, 8, 9, 10, 11, 12, 13, 14, 17, 18
T4:  1, 2, 5, 8, 17, 19


The number of matches for each test case is found and represented in Table 6.

Table 6
Test case   Lines matched (found)   No. of matches (nfound)
T2          20                      1
T3          6, 13, 17               3
T4          17                      1

Test cases are sorted in descending order of the number of matches, as shown in Table 7. T4's only match, line 17, is already covered by T3, so T4 is not selected.

Table 7
Test case   No. of matches (nfound)   Lines matched (found)   Candidate
T3          3                         6, 13, 17               1
T2          1                         20                      1
T4          1                         17                      0

Hence, test cases T2 and T3 need to be run, and the redundant test cases (to be deleted) are T1 and T5. Discussion: Out of the five test cases in the above example, we need to run only 2 for 100% coverage of the modified lines of code, a 60% saving in test cases. If we instead ran every test case that covers any of the modified lines of code, we would need to run 3 test cases (T2, T3, T4). So even compared with that approach, the proposed algorithm saves 1 more test case, i.e. 20% of the suite.
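The percentages quoted in both discussions are consistent with expressing the saving as a fraction of the whole suite rather than of the baseline. The sketch below makes that arithmetic explicit; the function name and that reading of the figures are assumptions, not something the paper states.

```python
def saving_percent(baseline_runs, selected_runs, suite_size):
    """Test runs saved, as a percentage of the whole suite (assumed reading)."""
    return 100 * (baseline_runs - selected_runs) / suite_size

# First example: 10 test cases in the suite, 5 selected.
print(saving_percent(10, 5, 10))  # versus running all 10: 50.0
print(saving_percent(8, 5, 10))   # versus the 8 touching modified lines: 30.0

# Second example: 5 test cases in the suite, 2 selected.
print(saving_percent(5, 2, 5))    # versus running all 5: 60.0
print(saving_percent(3, 2, 5))    # versus the 3 touching modified lines: 20.0
```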

5. Implementation of Algorithm
The above algorithm is implemented as follows.

5.1 Inputs
The algorithm uses the following inputs:
(i) The number of test cases T1, T2, T3, ..., Tn, stored in n_testcases.
(ii) The line numbers of the lines of code covered by each test case, stored in the two-dimensional test cases array.
(iii) The total number of modified lines of code, stored in n_modloc, and their line numbers, stored in the mod_locode array.
(iv) The total number of deleted lines of code, stored in n_delloc, and their line numbers, stored in the del_locode array.

5.2 Test Selection Algorithm


The algorithm comprises two modules, which execute in the order given:
(i) Adjust module
(ii) Reduce module

Adjust module
This module performs the following operations, in this order:
Step 1: For each test case, check every line number against the values in the del_locode array. If a value is found, remove it from that test case.
Step 2: Set an array deleted, of size equal to the number of test cases, to all zeros (no test case has yet been marked redundant).
Step 3: For each test case i = 1 to n_testcases, assume that it is redundant. Search for every line number of test case i in every other test case j (j = 1 to n_testcases, i != j) that has deleted[j] = 0. If every line number of test case i is found in some such test case j, set deleted[i] = 1; if one or more line numbers are not found, leave deleted[i] = 0.
Thus, at the end of the Adjust module, test cases having deleted value = 1 are redundant and are not considered by the Reduce module.
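The Adjust module can be sketched as follows, using the second example from Section 4 as input. This is one reading of the steps (a test case is redundant when each of its remaining lines is also covered by some other test case not yet marked redundant); the function name and data layout are illustrative and not the authors' C++ code.

```python
def adjust_module(test_cases, del_locode):
    """Strip deleted lines, then flag test cases whose lines are all covered elsewhere."""
    deleted_lines = set(del_locode)
    # Remove deleted line numbers from every test case.
    adjusted = {
        name: [line for line in lines if line not in deleted_lines]
        for name, lines in test_cases.items()
    }
    # deleted[name] == 1 marks a redundant test case.
    deleted = {name: 0 for name in adjusted}
    for i, lines_i in adjusted.items():
        # Lines covered by the other test cases not yet marked redundant.
        others = set()
        for j, lines_j in adjusted.items():
            if i != j and deleted[j] == 0:
                others.update(lines_j)
        if all(line in others for line in lines_i):
            deleted[i] = 1
    return adjusted, deleted

test_cases = {
    "T1": [1, 5, 7, 15, 20],
    "T2": [2, 3, 4, 5, 8, 16, 20],
    "T3": [6, 8, 9, 10, 11, 12, 13, 14, 17, 18],
    "T4": [1, 2, 5, 8, 17, 19],
    "T5": [1, 2, 6, 8, 9, 13],
}
del_locode = [4, 7, 15]

adjusted, deleted = adjust_module(test_cases, del_locode)
redundant = [name for name, flag in deleted.items() if flag == 1]
print(redundant)  # ['T1', 'T5']
```

Because each test case is checked only against test cases not yet marked redundant, the result can depend on the order in which test cases are visited.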


Reduce Module
This module performs the following operations, in this order:
Step 1: Take an array nfound and another array found.
Step 2: Take an array candidate and initialize it to all zeros. The candidate value for a test case will be set to 1 if it is to be re-run.
Step 3: Scan all line numbers of all test cases. If any line number matches a line number in the mod_locode array (line numbers of modified lines of code), increment the value of nfound for that test case and store the matched value in found. Thus, after step 3, nfound holds the number of values matched in every test case and found holds the actual line numbers matched.
Step 4: Sort the nfound array in decreasing order of the number of matches found in every test case.
Step 5: Select the first test case (the one having the maximum value of nfound), set its candidate value = 1, and modify the mod_locode array by deleting the entries in found for this test case.
Step 6: Repeat steps 3 to 5 with the modified mod_locode array until there are no entries left in the mod_locode array.
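The Reduce module can be sketched as a greedy loop, run here on the first example from Section 4. Two details are assumptions rather than part of the paper: ties are broken in favour of the earlier test case, and the loop stops if no test case covers any remaining modified line. The function name and dictionary layout are also illustrative.

```python
def reduce_module(test_cases, mod_locode):
    """Greedily select test cases until the modified lines are covered."""
    remaining = set(mod_locode)
    candidate = {name: 0 for name in test_cases}
    while remaining:
        # Step 3: modified lines still uncovered that each test case hits.
        found = {name: remaining & set(lines) for name, lines in test_cases.items()}
        # Steps 4-5: take the first test case with the maximum number of matches.
        best = max(found, key=lambda name: len(found[name]))
        if not found[best]:
            break  # assumed guard: no test case covers the remaining lines
        candidate[best] = 1
        remaining -= found[best]
    return [name for name, flag in candidate.items() if flag == 1]

test_cases = {
    "T1": [1, 2, 20, 30, 40, 50],
    "T2": [1, 3, 4, 21, 31, 41, 51],
    "T3": [5, 6, 7, 8, 22, 32, 42, 52],
    "T4": [6, 9, 10, 23, 24, 33, 43, 54],
    "T5": [5, 9, 11, 12, 13, 14, 15, 20, 29, 37, 38, 39],
    "T6": [15, 16, 17, 18, 19, 23, 24, 25, 34, 35, 36],
    "T7": [26, 27, 28, 40, 41, 44, 45, 46],
    "T8": [46, 47, 48, 49, 50, 53, 55],
    "T9": [55, 56, 57, 58, 59],
    "T10": [1, 2, 3, 4, 60],
}
mod_locode = [1, 2, 5, 15, 35, 45, 55]

print(reduce_module(test_cases, mod_locode))  # ['T1', 'T5', 'T6', 'T7', 'T8']
```

With the coverage as listed, T10 also covers modified lines 1 and 2. Since T1 is picked first and covers both, the selected set is the same as in Tables 2 to 4.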

5.3 Output of the algorithm


The following steps display the final test cases:
(i) Display the serial numbers of the test cases that have deleted value = 1. These are redundant test cases, to be removed.
(ii) Display the serial numbers of the test cases that have candidate value = 1 and deleted value = 0. These are the test cases to be re-run.
The algorithm has been implemented in C++. The cost of implementing the algorithm is almost negligible, but it saves the cost and effort of running extra test cases. Once coverage of the modified code has been attained, the rest of the test cases can be prioritized using any of the prioritization techniques already available in the literature.

References:
1. G. Rothermel, M.J. Harrold, J. Ostrin, and C. Hong. An empirical study of the effects of minimization on the fault detection capabilities of test suites. In Proceedings of the International Conference on Software Maintenance, 34-43, Nov. 1998.
2. G. Rothermel, R.H. Untch, C. Chu, and M.J. Harrold. Test Case Prioritization. IEEE Transactions on Software Engineering, 27(10): 928-948, Oct. 2001.
3. J.A. Jones and M.J. Harrold. Test Suite Reduction and Prioritization for Modified Condition/Decision Coverage. In Proceedings of the International Conference on Software Maintenance, Nov. 2001.
4. K.K. Aggarwal and Yogesh Singh. Software Engineering: Programs, Documentation, Operating Procedures. New Age International Publishers, 2001.
5. M.J. Harrold, R. Gupta, and M.L. Soffa. A methodology for controlling the size of a test suite. ACM Transactions on Software Engineering and Methodology, 2(3): 270-285, July 1993.
6. S. Elbaum, A. Malishevsky, and G. Rothermel. Incorporating varying test costs and fault severities into test case prioritization. In International Conference on Software Engineering, 329-338, May 2001.
7. S. Elbaum, A. Malishevsky, and G. Rothermel. Test Case Prioritization: A family of empirical studies. IEEE Transactions on Software Engineering, 28(2): 159-182, Feb. 2002.
8. T.Y. Chen and M.F. Lau. Dividing strategies for the optimization of a test suite. Information Processing Letters, 60(3): 135-141, March 1996.
9. W.E. Wong, J.R. Horgan, S. London, and H. Agrawal. A study of effective regression in practice. In Proceedings of the Eighth International Symposium on Software Reliability Engineering, 230-238, Nov. 1994.
10. W.E. Wong, J.R. Horgan, S. London, and A.P. Mathur. Effect of test set minimization on fault detection effectiveness. Software Practice and Experience, 28(4): 347-369, April 1998.
11. Z. Li, M. Harman, and R.M. Hierons. Search algorithms for regression test case prioritization. IEEE Transactions on Software Engineering, 33(4), April 2007.
12. Emelie Engstrom, Per Runeson, and Andreas Ljung. Improving Regression Testing Transparency and Efficiency with History-Based Prioritization: an Industrial Case Study. In 2011 Fourth IEEE International Conference on Software Testing, Verification and Validation.
