Victor R. Basili, Lionel C. Briand, and Walcélio L. Melo

How Reuse Influences Productivity in Object-Oriented Systems
1. Class C is reused verbatim from the library. In this case, Reuse(C) = Size(C).
As we are dealing with an OO language allowing
inheritance, all ancestors of C, as well as all classes
aggregated by C, also have to be included in S. Because
all of C's ancestors and all classes aggregated by C also
belong to the library, including a library class may
trigger an apparently large amount of verbatim reuse.
2. Class C is a new class created by specializing,
through inheritance, a library class LC. This case is
a variation of the first case, that is, the class LC
and all its ancestors and subclasses (aggregated
classes) will be included in S and will be dealt with
in a way similar to verbatim reuse.
3. Class C is a new class that aggregates a library class
LC. This case is also a variation of the first case,
that is, the class LC and all its ancestors and subclasses will be included in S and will be considered
in a way similar to verbatim reuse.
4. Class C has been created by changing the existing
class EC. Reuse can be estimated as:
Reuse(C) = (1 - %Change) * Size(C)
where %Change represents the percentage of C
added to or modified from EC.
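The four cases above can be condensed into a small measurement sketch. This is a hypothetical illustration of the rules, not the instrument used in the study; the `Cls` model, its `ancestors`/`aggregated` fields, and the `closure` helper for building the set S are all assumptions introduced here.

```python
# Sketch of the reuse measure described above (hypothetical class model).
# Case 1: verbatim library reuse          -> Reuse(C) = Size(C)
# Cases 2/3: specializing/aggregating a library class LC pulls LC and
#            its closure into S, treated like verbatim reuse
# Case 4: modifying an existing class EC  -> Reuse(C) = (1 - %Change) * Size(C)

class Cls:
    def __init__(self, name, size, ancestors=(), aggregated=()):
        self.name = name
        self.size = size                  # e.g., SLOC of the class
        self.ancestors = list(ancestors)
        self.aggregated = list(aggregated)

def closure(cls):
    """All classes transitively reachable via inheritance or aggregation."""
    seen, stack = set(), [cls]
    while stack:
        c = stack.pop()
        if c.name not in seen:
            seen.add(c.name)
            stack.extend(c.ancestors + c.aggregated)
    return seen

def reuse(cls, pct_change=0.0):
    """Reuse(C); pct_change in [0, 1] applies only to case 4."""
    return (1.0 - pct_change) * cls.size

# Example: a verbatim library class and a class modified from it.
lib = Cls("List", size=120)
mod = Cls("SortedList", size=200, ancestors=[lib])
reuse(lib)                    # case 1: 120.0
reuse(mod, pct_change=0.25)   # case 4: (1 - 0.25) * 200 = 150.0
closure(mod)                  # classes pulled into S
```

The closure step matters because, as noted above, including one library class can pull its whole ancestor and aggregation chain into S.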
However, the percentage of change is difficult to
obtain. As a simplification, we asked the developers to
The study's validity can be analyzed from two perspectives: internal (what are the threats to the conclusions we can draw from the study?) and external (how generalizable are these results?) [17]. With respect to internal validity, subjects were classified according to their ability and assigned randomly to form equivalent teams. Therefore, we are less likely to obtain biased results due to differences in ability across teams. However, performing a formal controlled experiment would have required assigning random levels of reuse to projects and classes, which is not feasible in practice. Because of this inability to assign random levels of reuse, reuse rates might be associated with other factors.
Concerning external validity, even though students are not industrial programmers, we trained them thoroughly and used an application domain intuitive enough to avoid misunderstandings when interpreting the requirements. In addition, we used a development environment representative of what is available in industry for OO software development. Another possible threat to external validity is that our systems were relatively small, so their conceptual complexity may be limited compared to large software development applications. However, this inherent limitation of such empirical studies cannot be avoided.
Software Product Reuse and Software Quality
However, these differences should be assessed statistically; the significance of these trends should be calculated. To perform this statistical analysis, defect densities were computed for each project and each reuse category; the results were combined to derive defect densities for each project's subset of classes
Reuse category   No. of SLOC   No. of Faults   Fault Density   No. of Errors   Error Density   Rework
New              25,642        247             9.63            157             6.11            336.35
Ext.             15,165        93              6.13            74              4.89            160.04
Slight           6,685         11              1.57            10              1.50            22.5
Verbatim         16,015        6               0.37            2               0.12            3
Total            63,537        356             4.88            243             3.82            521.89
[Figure: mean fault density (left) and mean error density (right) per reuse category: New, Ext., Slight, Verbatim]
p-values
           Slight   Ext.     New
Verbatim   0.46     0.012    0.012
Slight              0.012    0.012
Ext.                         0.26

p-values
           Slight   Ext.     New
Verbatim   0.08     0.012    0.0136
Slight              0.012    0.025
Ext.                         0.26
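The pairwise p-values above come from the Wilcoxon T (signed-rank) test used throughout the study. A minimal sketch of the T statistic behind such comparisons follows; this is my own illustration (the paper's exact procedure and the p-value step are not shown), and the example differences are invented.

```python
# Minimal Wilcoxon signed-rank T statistic. Zero differences are
# dropped, absolute differences are ranked (ties receive the average
# rank), and T is the smaller of the positive-rank and negative-rank
# sums. Converting T to a p-value (exact tables or a normal
# approximation) is omitted here.

def wilcoxon_t(diffs):
    d = [x for x in diffs if x != 0]
    order = sorted(range(len(d)), key=lambda i: abs(d[i]))
    ranks = [0.0] * len(d)
    i = 0
    while i < len(order):
        j = i
        # group tied absolute values
        while j < len(order) and abs(d[order[j]]) == abs(d[order[i]]):
            j += 1
        avg_rank = (i + 1 + j) / 2          # mean of ranks i+1 .. j
        for k in range(i, j):
            ranks[order[k]] = avg_rank
        i = j
    w_pos = sum(r for r, x in zip(ranks, d) if x > 0)
    w_neg = sum(r for r, x in zip(ranks, d) if x < 0)
    return min(w_pos, w_neg)

# Invented per-project differences (e.g., fault-density differences
# between two reuse categories across eight projects):
wilcoxon_t([1.2, -0.4, 2.1, 0.9, -0.2, 1.5, 0.7, 1.1])
```

A small T relative to the number of pairs indicates a systematic difference between the two categories.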
Project   No. of SLOC   No. of Errors   Reuse Rate   Error Density       Error Density          Rework
                                                     (with verbatim      (without verbatim
                                                     reuse code)         reuse code)
1         13,981        24              47.29        1.72                3.48                   51
2         5,068         33              2.23         6.51                6.60                   71
3         9,735         42              31.44        4.31                5.42                   92
4         8,543         33              18.08        3.86                3.86                   72
5         8,173         26              40.05        3.18                4.78                   59
6         6,368         25              48.67        3.93                6.15                   51
7         6,571         15              64.01        2.28                2.96                   31
8         5,068         44              0.00         8.68                9.36                   93
should be expected to be around 7, and each additional 10 percentage points in the reuse rate decreases this density by nearly 1 (the estimate is 0.86) within the range covered by the data set (we limit ourselves to interpolation). No outlier seems to be the cause of a spurious correlation; therefore, this result should be meaningful (see Figure 3a).
Figures 3b and 3c show the relationships between reuse rate and, respectively, error density without verbatim reused classes (R² = 0.54 and p-value = 0.04) and number of errors (R² = 0.66 and p-value = 0.01).
In both cases, a significant negative relationship can
be observed, confirming our interpretation of the
relationship identified in Figure 3a. In other words,
we obtained consistent results using three different
measures of error density as a dependent variable.
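The coefficients quoted for Figure 3a (intercept near 7, about 0.86 fewer errors per KSLOC for each additional 10 percentage points of reuse) can be reproduced from the per-project table above with ordinary least squares. The script is my own sketch, not the authors' analysis:

```python
# Ordinary least squares on the eight projects reported above:
# error density (with verbatim reused code) against reuse rate.
reuse_rate = [47.29, 2.23, 31.44, 18.08, 40.05, 48.67, 64.01, 0.00]
err_density = [1.72, 6.51, 4.31, 3.86, 3.18, 3.93, 2.28, 8.68]

n = len(reuse_rate)
mean_x = sum(reuse_rate) / n
mean_y = sum(err_density) / n
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(reuse_rate, err_density))
sxx = sum((x - mean_x) ** 2 for x in reuse_rate)
slope = sxy / sxx                    # ~ -0.086: ~0.86 fewer errors/KSLOC per 10 points
intercept = mean_y - slope * mean_x  # ~ 7 errors/KSLOC at 0% reuse
```

The fitted slope is about -0.086 per percentage point and the intercept about 7.0, matching the figures quoted in the text.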
Figure 3a. Linear relationship between error density and reuse rate
Figure 3b. Linear relationship between error density without verbatim reuse code and reuse rate
Figure 3c. Linear relationship between a project's number of errors and its reuse rate
check whether the total amount of reuse per project is related to a reduction in the project rework
effort.
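That question can be checked directly against the reuse-rate and rework columns of the per-project table above. The least-squares sketch below is my own illustration, not the authors' analysis script; it reproduces the roughly 88 person-hour baseline and the drop of about 7.5 person-hours per additional 10 percentage points of reuse that the study reports.

```python
# Ordinary least squares on the eight projects reported above:
# total project rework effort (person-hours) against reuse rate.
reuse_rate = [47.29, 2.23, 31.44, 18.08, 40.05, 48.67, 64.01, 0.00]
rework = [51, 71, 92, 72, 59, 51, 31, 93]

n = len(reuse_rate)
mean_x = sum(reuse_rate) / n
mean_y = sum(rework) / n
sxy = sum((x - mean_x) * (y - mean_y)
          for x, y in zip(reuse_rate, rework))
sxx = sum((x - mean_x) ** 2 for x in reuse_rate)
slope = sxy / sxx                    # ~ -0.75: ~7.5 fewer person-hours per 10 points
intercept = mean_y - slope * mean_x  # ~ 88 person-hours at 0% reuse
```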
We are interested in seeing whether the effort needed to repair reused classes is lower than the effort needed to repair classes created from scratch or extensively modified. We looked at three different measures to answer this question: total rework per reuse category, rework normalized by class size (rework density), and rework normalized by the number of faults.
[Figure: mean rework, mean rework/LOCS, and mean rework/#faults per reuse category: New, Ext., Slight, Verbatim]
per reuse category. The results for the first metric are
shown in Table 5.
Based on Table 5, we can conclude that reuse
reduces the amount of rework, even when the code is
extensively modified. Again there is no observable difference between the verbatim-reused and slightly modified classes. These results show that from the perspective
of total rework, extensively modified code might still
bring benefits and, once again, slightly modified code is
nearly as good as code reused verbatim.
p-values
           Slight   Ext.     New
Verbatim   0.138    0.018    0.025
Slight              0.012    0.017
Ext.                         0.05

p-values
           Slight   Ext.     New
Verbatim   0.144    0.043    0.028
Slight              0.043    0.042
Ext.                         0.16
As for defect density and rework, we used the Wilcoxon T test to analyze the variations of the second metric, that is, rework normalized by the size of the classes belonging to each reuse category (referred to as rework density) across reuse categories. Results are presented in Table 6.

Based on Table 6, we conclude that reuse reduces rework density except when the code is extensively modified. There is no observable difference between the verbatim-reused and slightly modified classes. These results show that from the perspective of rework density, reuse brings benefits and slightly modified code is nearly as good as code reused verbatim.

Some projects had no faults in their verbatim or slightly reused classes. The number of data points in these categories thus became too small for applying a Wilcoxon T test to the third metric. In such cases, we could look only at the difference between new and extensively modified classes. No significant difference could be observed in this case.

As a last attempt to look at change difficulty (the third metric), we performed an analysis at the class level, where rework effort per class was normalized by the number of faults detected and corrected in these classes. No significant differences in distribution could be observed across reuse categories. If these results were confirmed by further studies, it would mean that differences in rework effort across reuse categories would be mainly due to differences in fault-proneness and not to differences in ease of modification.

To conclude, rework seems to be lower in high-reuse categories, but there is no statistically significant evidence that faults are easier to detect and correct.

Table 7. Overview of the projects' data collected after the errors have been fixed

Project   SLOC Delivered   SLOC Reused   Reuse Rate   Effort   Productivity   Productivity (without
                                                                              verbatim reused code)
1         14,222           6,611         46.48        155      91.75          47.55
2         5,105            113           2.21         280      18.23          17.70
3         11,687           3,061         26.19        365      32.01          18.28
4         10,390           1,545         14.87        303      34.3           23.10
5         8,173            3,273         40.04        159      51.4           30.82
6         8,216            3,099         37.71        264      31.12          12.38
7         9,736            4,206         43.2         140      69.54          16.89
8         5,255            0             0            264      19.9           19.20

effort should be expected to be around 88 person-hours for each project, and for each additional 10 percentage points in reuse rate (within the reuse rate interval covered by our data set) rework effort will decrease by nearly 7.5 person-hours. These results are, of course, specific to the system requirements implemented in the study, but they could be generalized as follows: Each additional 10 percentage points in reuse rate, within the reuse rate interval covered by our data set, decreases rework by nearly 8.5%. Again, no outlier seems to be causing any spurious correlation (see Figure 7).

To better capture the concept of rework, it would be better to look at rework normalized by the size of the changes that occurred during the repair phase. Unfortunately, we could not capture this information accurately with the data collection procedures we had in place. As a rough approximation, we looked at rework normalized by the number of faults. However, no significant differences were observed between reuse categories. This result was confirmed when we attempted to investigate rework normalized by the number of faults at the class level.

In conclusion, the results support the assumption that reuse in OO software development results in lower rework effort.
Note that the data in the reuse rate and SLOC delivered columns are different from the data in Table 4. This difference stems from the fact that Table 8 presents the results at the end of the lifecycle, that is, after the errors have been fixed, whereas Table 4 presents the data collected at the end of the implementation phase.
[Figures: productivity vs. reuse ratio, productivity (no verbatim reused code) vs. reuse ratio, and effort vs. reuse ratio]