
Inspection data is difficult to gather and interpret.

At AT&T Bell Laboratories, the authors have defined nine key metrics that project managers can use to plan, monitor, and improve inspections. Graphs of these metrics can expose problems early and help managers evaluate the inspection process itself.

MANAGING CODE INSPECTION INFORMATION

JACK BARNARD and ART PRICE, AT&T Bell Laboratories

There is now general agreement that inspections reduce development costs and improve product quality. However, the percentage of defects removed through code inspection varies widely, from about 30 to 75 percent. We believe that better management of inspection information will lead to more consistently effective inspections. Project managers who gather and use inspection information effectively can better
+ allocate resources,
+ control conformance to procedures,
+ determine the quality of inspected software,
+ measure the effectiveness of inspections, and
+ improve them.

We have established a measurement system that defines nine metrics to help plan, monitor, control, and improve the code-inspection process at AT&T Bell Laboratories. This measurement system has helped us achieve defect-removal efficiencies of more than 70 percent. Since 1986, we have applied our metrics to more than two dozen projects that use a code-inspection process based on Michael Fagan's.6 These projects are primarily real-time embedded systems written in C by teams of three to 80 developers. The projects inspected new, modified, and reused code. They also inspected fixes to defects detected in the field. Our measurement system helped reduce the cost of removing faults with code inspections by 300 percent, compared


with testing alone. We delivered one third as many faults to customers, and doubled the productivity of our code-inspection process. Although this practice specifically addresses code inspections, we feel it is applicable to software inspections in general.

NINE KEY MEASUREMENTS

A good measurement plan is essential. The plan should define the metrics, describe how they are used, and explain procedures for collecting the data. Without a careful plan, a project could collect too much or too little data or the wrong data, or it could fail to use the data properly.

A common problem in developing a plan is determining what to measure. Too often, the solution is to collect every possible measure and figure out what it means later. But while this ensures that the information you need is available, it places an unnecessary burden on the development staff and adds expense.

A more effective technique for selecting measurements is the Goal-Question-Metric paradigm.7 GQM is a systematic approach to translate measurement needs into metrics. You begin by clearly identifying measurement goals, then pose specific questions, in measurable terms, whose answers fulfill the goals. Finally, you enumerate the metrics, the answers to those questions. In this way, we constructed the list of goals, questions, and metrics in Table 1. Some of these questions could apply to any process we measure; others are specific to inspections. Each question relates to a single measurement goal.

Careful consideration of each question, together with several iterations through the measurement cycle, led us to the nine metrics and their constituent data items, defined in the box below:
1. Total noncomment lines of source code inspected, in thousands (KLOC).
2. Average lines of code inspected.
3. Average preparation rate.
4. Average inspection rate.
5. Average effort per KLOC.
6. Average effort per fault detected.
7. Average faults detected per KLOC.
8. Percentage of reinspections.
9. Defect-removal efficiency.

Our projects used this same basic set of metrics. However, they also experimented with additional metrics, such as the degree of pretesting, percentage of faults found in preparation, level of inspector experience, code complexity, and code status (new, modified, or reused).

Total KLOC inspected. The total number of noncomment source lines of code inspected, in thousands:

  Total KLOC inspected = ( sum of LOC_i, for i = 1 to N ) / 1,000

where N is the total number of inspections.

Average LOC inspected. The average number of noncomment source lines of code inspected per inspection meeting:

  Average LOC inspected = ( Total KLOC inspected x 1,000 ) / N

Average preparation rate. The metric is defined so that the preparation time it uses is equal to the average time an inspector spends preparing for the inspection. It is a good intuitive generalization of the preparation rate for a single inspection. We rejected an unweighted average of preparation rates because it does not account for differences in the sizes of the individual inspections. With an unweighted average, a large module's preparation rate has the same influence on the result as a small module's preparation rate. An average of the two ignores size differences and is not indicative of the preparation applied to the code as a whole.
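A minimal sketch of how these first metrics might be computed from per-inspection records. The field names, and the size-weighted form chosen for the preparation rate (total lines over total preparation time), are assumptions for illustration, not the article's exact formulas.

from dataclasses import dataclass

@dataclass
class Inspection:
    loc: int             # noncomment source lines covered by this inspection
    prep_hours: float    # preparation time (assumed here to be summed per inspection)

def total_kloc(inspections):
    # Metric 1: total noncomment KLOC inspected.
    return sum(i.loc for i in inspections) / 1000.0

def average_loc(inspections):
    # Metric 2: average LOC covered per inspection meeting.
    return total_kloc(inspections) * 1000.0 / len(inspections)

def average_preparation_rate(inspections):
    # Metric 3, size-weighted: total LOC divided by total preparation time,
    # so a large module influences the result more than a small one.
    # Whether the denominator should be team-total or per-inspector time
    # is an assumption in this sketch.
    total_prep = sum(i.prep_hours for i in inspections)
    return total_kloc(inspections) * 1000.0 / total_prep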


TABLE 1. GOALS, QUESTIONS, AND METRICS (EXCERPT)

Goal: Monitor and control

Question: What is the quality of the inspected software?
Metrics: Average faults detected per KLOC; Average inspection rate; Average preparation rate

Question: To what degree did the staff conform to the procedures?
Metrics: Average inspection rate; Average preparation rate; Average lines of code inspected; Percentage of reinspections

Question: What is the status of the inspection process?
Metrics: Total KLOC inspected


USING METRICS TO PLAN

At the beginning of each project phase, the manager must decide how much time and effort inspection will require. If not enough time and staff are allocated, the inspectors will not have enough time to remove the faults. Several metrics are used to estimate the time needed and the cost of inspections.

How much does inspection cost? Effort is the most important factor in determining the cost of inspection. Inspection effort is the number of person-months needed to inspect and reinspect the code. The average effort per KLOC (metric 5) and the percentage of reinspections (metric 8) are used to estimate cost. If you are content with past performance, you begin with the value of metric 5 for past inspections. If you decide to improve the inspection process by increasing effort, adjust this number to reflect the change in effort before making a new estimate. For example, a decision to slow the inspection rate from 200 to 150 lines of code per hour will add, for each participant, a little over one hour for every thousand lines of noncomment code. If you don't have data from previous projects, we suggest you use 50 hours per KLOC as an average effort. This assumes a preparation and inspection rate of about 150 lines of code an hour.

Once you know the average effort per KLOC, multiply it by the KLOC to be inspected to estimate total effort. You must then determine the effort required for reinspections. To do this, use metric 8 to estimate the percentage of code that will be reinspected, again using historical data. Multiply this percentage by the KLOC to be inspected, then by average effort per KLOC (metric 5) to get the total reinspection effort. Add this to the estimated inspection effort, and multiply the result by the average cost per person-hour. This is the total cost.

How much time will it take? The time an inspection process will take is difficult to determine accurately, but you can obtain a reasonable estimate. First, as a baseline, use the time a similar project required by simply counting the months between the first and last inspection meetings. Then estimate the baseline project's average effort per month by multiplying its average effort per KLOC (metric 5) by the total KLOC inspected (metric 1), and dividing by the number of months.

You must account for differences between the baseline and the new project. When you can identify changes in coding staff ratios or staff availability, simply adjust the baseline project's average effort per month proportionally. This will give you an estimate for the average effort per month on the new project.
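A rough sketch (not from the article) of the cost estimate just described. The 50 hours per KLOC, the reinspection percentage, and the labor rate are placeholder inputs that a project would replace with its own historical values.

def inspection_cost(kloc_to_inspect,
                    effort_per_kloc=50.0,        # metric 5 from past inspections, hours/KLOC
                    reinspection_fraction=0.15,  # metric 8 from past inspections
                    cost_per_person_hour=60.0):  # assumed labor rate
    first_pass = kloc_to_inspect * effort_per_kloc
    reinspection = kloc_to_inspect * reinspection_fraction * effort_per_kloc
    total_effort = first_pass + reinspection       # person-hours
    return total_effort, total_effort * cost_per_person_hour

# e.g. 25 KLOC of new code:
effort_hours, dollars = inspection_cost(kloc_to_inspect=25.0)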

Data items. The data items that make up these equations are collected for each individual inspection. Most data can be gathered at the inspection meeting.


Finally, to obtain the number of calendar months you will need, divide the predicted total effort (from the cost estimate) by the predicted average effort per month.
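A companion sketch (again illustrative, not the authors' tool) of the schedule estimate: derive the baseline project's average effort per month, adjust it for staffing differences, and divide the predicted total effort by it.

def estimated_months(new_total_effort_hours,
                     baseline_effort_per_kloc,   # metric 5 on the baseline project
                     baseline_kloc,              # metric 1 on the baseline project
                     baseline_months,
                     staffing_ratio=1.0):        # new staff availability relative to baseline
    baseline_effort_per_month = baseline_effort_per_kloc * baseline_kloc / baseline_months
    new_effort_per_month = baseline_effort_per_month * staffing_ratio
    return new_total_effort_hours / new_effort_per_month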

USING METRICS TO MONITOR AND CONTROL

You should monitor the inspection process to get an early estimate of the software quality, assess the staff's conformance to inspection procedures, and determine the status of the inspection process. When it is warranted, you must exercise appropriate control, to minimize variations that arise from differences in code characteristics, participants, and individual practices.

What is the quality of the inspected software? To measure quality, our metrics provide a subjective assessment of the number of faults remaining in the code after inspection. They help managers decide if inspections are removing the expected percentage of defects. For a quantitative estimate, you can use more rigorous approaches, but they require you to collect more data.

To assess quality, three metrics apply: average inspection rate (metric 4), average preparation rate (metric 3), and average faults detected per KLOC (metric 7).

The first step is to estimate the expected value for the average faults detected per KLOC. To do this, you select past inspections on the same project for which the average inspection rate (metric 4) and preparation rate (metric 3) are within project guidelines. Then you compute the average faults detected per KLOC (metric 7) on these inspections. This number is your baseline. In our experience, you must sample at least 10 inspections before the average faults detected per KLOC is meaningful enough to use as a baseline. The set of inspections for this baseline by definition have slower inspection and preparation rates on average than the entire set of inspections for this project.

Preparation time. The total time participants spend preparing for the inspection, not including the time spent by the moderator and the author in planning. However, it does include their time if they prepared for the meeting as inspectors. Traditionally, the moderator asks participants at the beginning of the meeting for a spoken account of their preparation time. However, to obtain data less likely to be influenced by peer pressure, we have participants write down their preparation time for submission to the moderator.

Inspection duration. The total time spent inspecting the code. A typical meeting should not last more than two hours, so several meetings may be necessary to complete an inspection. Inspection duration is the total time spent in all the meetings.

Faults detected. The number of faults detected at the inspection. We use the total number of faults and do not classify faults by severity. Classification by severity (for example, major versus minor, or observable versus nonobservable) diminishes the importance inspectors place on detecting certain faults. They pay less attention to identifying currently insignificant faults that may become important after code modification or maintenance.

Rework effort. The time spent by the author correcting the faults detected in the inspection. It is collected by the moderator from the author after rework is complete.

Disposition. At the end of an inspection, the moderator assigns an inspection disposition. (If multiple meetings are required for a code unit, the disposition is assigned at the end of the last meeting.) There are three possible dispositions:
+ Accept. If the nature and number of faults warrant it, the moderator, with input from the inspection team, decides it would not be cost-effective to reinspect the code unit or to inspect the rework. Normally, the moderator verifies (desk checks) the rework.
+ Inspect rework. It is cost-effective for the inspection team to inspect the rework.
+ Reinspect. The moderator, with input from the inspection team, decides it would be cost-effective to reinspect the entire code unit. Faults were likely missed because the code had so many faults, and the rework will be so significant that the code unit will essentially be rewritten.

Total faults. The total number of faults detected in the inspection process, plus the faults identified in the inspected code during subsequent unit, integration, function, and system testing, plus faults detected by customers. A single fault may cause multiple failures, and a single failure may be caused by more than one fault. Each distinct fault should be counted.
REFERENCES
1. H.J. Barnard and A.L. Price, "Automating the Inspection Process," Proc. Eighth Int'l Conf. Testing Computer Software, Software Quality Engineering, Jacksonville, Fla., 1991, pp. 189-198.
2. H.J. Barnard and R.B. Collicott, "COMPAS: A Development-Process Support System," AT&T Technical J., Mar. 1990, pp. 52-64.
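A minimal sketch (not from the article) of a per-inspection record holding the data items defined in this box; the field names and the enumeration are illustrative only.

from dataclasses import dataclass
from enum import Enum

class Disposition(Enum):
    ACCEPT = "accept"
    INSPECT_REWORK = "inspect rework"
    REINSPECT = "reinspect"

@dataclass
class InspectionRecord:
    loc: int                   # noncomment source lines covered
    preparation_hours: float   # total preparation time, all participants
    duration_hours: float      # total time across all inspection meetings
    faults_detected: int       # all faults, not classified by severity
    rework_hours: float        # author's time correcting the faults
    disposition: Disposition   # assigned by the moderator at the last meeting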


TABLE 2. SUMMARY OF THE SAMPLE PROJECT'S FIRST 27 INSPECTIONS

Total KLOC inspected                    9.3
Average LOC inspected (module size)     343
Average preparation rate (LOC/hour)     194
Average inspection rate (LOC/hour)      172
Total faults detected per KLOC          106
Percentage of reinspections             11

The next step is to compare the baseline value and the current value of metric 7:
+ If the values are close, the process is removing the expected number of defects and you can conclude that the quality of the inspected software is good.
+ If the current value is much lower than the baseline value, you can conclude that the process is not removing the expected number of defects and that the inspected software's quality is probably low. Because, by definition, the inspection and preparation rates for the entire set of inspections are faster than the baseline, and because the entire set of inspections are also finding fewer faults on average than expected, you should take corrective action to improve conformance to the inspection rate and preparation rate guidelines.
+ We have never seen the case where the current value is much higher than the baseline value. Again, you know that the inspection and preparation rates for the entire set of inspections are faster than the baseline set, so a current value much higher than the baseline value implies that inspections at rates faster than the recommended rates are finding more faults, on average, than inspections at slower rates. We do not believe this can happen.

For example, Table 2 summarizes a sample project whose first 27 inspections - approximately 35 percent of the total - have been completed. The total faults detected per KLOC is 106. Our baseline is 118.5, which we derived from the 12 of the 27 inspections whose inspection and preparation rates were at or below 150 lines of code per hour. Because 106 is only about 10 percent less than the baseline, we conclude that the quality of the inspected software is good. Our experience with this software in test and after release confirmed this assessment.
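The baseline comparison can be sketched as follows. Each inspection is given as a tuple (loc, preparation rate, inspection rate, faults); the 150-LOC/hour guideline and the 10-inspection minimum come from the text, everything else is illustrative.

def quality_ratio(inspections, rate_guideline=150.0, min_baseline=10):
    def faults_per_kloc(rows):
        loc = sum(r[0] for r in rows)
        return sum(r[3] for r in rows) / (loc / 1000.0)

    # Baseline: inspections whose preparation and inspection rates met the guideline.
    baseline_rows = [r for r in inspections
                     if r[1] <= rate_guideline and r[2] <= rate_guideline]
    if len(baseline_rows) < min_baseline:
        return None                      # baseline not yet meaningful
    baseline = faults_per_kloc(baseline_rows)
    current = faults_per_kloc(inspections)
    return current / baseline            # near 1.0 is good; much lower suggests missed faults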

To what degree did the staff conform to the procedures? You determine conformance to inspection procedures by monitoring activities that are critical to its effectiveness, are subject to wider statistical variation, and have a higher probability of low conformance: namely, the inspection rate, preparation rate, lines of code inspected, and reinspection procedures. When deviations exceed expectations, you take corrective actions. You can begin to monitor conformance after about 20 inspections and periodically thereafter.

Inspection rate. The inspection rate is the speed with which the inspectors cover the material in the inspection meeting. It includes the time to paraphrase each line of code, record the faults found, and discuss the material. A project usually sets guidelines for the inspection rate - between 100 and 150 lines of code per hour is typical. The average inspection rate (metric 4) should be less than or equal to this guideline. If it isn't, you should take immediate action, because inspections that are conducted at too fast a pace could seriously jeopardize the quality of the delivered software.

The inspection rate can vary, depending on the reader's skill, the inspectors' preparedness, the code complexity, and the number of code comments. If the average inspection rate is faster than your guideline, you can try to determine the cause by plotting the overall distribution of inspection-rate data across projects, as Figure 1a shows. This can help determine if the fast average inspection rate indicates a trend or is caused by an excessive number of outliers. It is also useful to plot inspection rate versus time, as Figure 1b shows. This reveals when a fast average inspection rate is caused by a fluctuation in the nature of the project over time.

For example, the project in Table 2 shows an average inspection rate of 172 lines of code per hour, 15 percent faster than the recommended 150. Figure 1a shows that the rates are not reasonably distributed: Forty-one percent of the inspections exceeded the recommended rate; 19 percent of them exceeded it by more than 50 percent! Furthermore, Figure 1b shows a recent trend toward faster meetings: The project was functioning more effectively at the beginning; something has caused a change. In this case, an approaching deadline had put extreme pressure on the inspection teams.

Preparation rate. The preparation rate is the speed with which the inspectors cover the inspection material before the inspection meeting, including the time it takes to study the design specifications and review the code. Preparation guidelines are usually stated in terms of lines of code per hour - typically 100 to 150 lines of code per hour - even though preparation involves studying the design as well. Inspectors who prepare too quickly may not be effective in the inspection meeting. For that reason, it often helps to have moderators assess inspectors' preparedness before the inspection, postponing the meeting if necessary.

You monitor preparation rate just as you monitor inspection rate. If the average preparation rate (metric 3) is above the project recommendation, examine the distribution of preparation rates and their behavior over time for insight into the cause.


For our sample project, the average preparation rate is 194 lines of code per hour, significantly faster than the recommendation. Figure 1c shows three major outliers and many other inspections exceeding the guidelines - the manager should determine what happened in these three inspections and take appropriate control action.

Lines of code inspected. We strongly recommend that you limit the size of the module inspected. We recommend a limit of 500 lines of code. Anything larger encourages hasty preparation and less thorough inspection meetings. Conformance to this requirement is monitored using the same tools used to monitor conformance to the inspection and preparation rate guidelines. In the sample project in Table 2, the average lines of code inspected is 343, well below the recommendation.

Reinspections. Developers must change a large amount of code to fix faults found in inspections. This may result in new faults. You should have guidelines for inspecting this reworked code. We suggest that if more than 90 faults are detected in a single inspection or if the average faults detected per KLOC (metric 7) for that inspection is more than 90, the reworked code must be reinspected.

To monitor conformance to the reinspection guidelines, plot the faults found and inspected module size. In Figure 1d, the solid line indicates 90 faults found; the dashed line indicates 90 faults per KLOC found. Ideally, every inspection plotted that falls above either line should indicate a need for a reinspection. Yet only three modules - indicated with open dots - were reinspected. The manager should take corrective action.
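The conformance checks in this section reduce to a few simple rules. The sketch below is illustrative (the parameter names are not from the article) and uses the 150-LOC/hour rate guideline, the 500-LOC module limit, and the 90-fault reinspection thresholds given above.

def conformance_flags(loc, prep_rate, insp_rate, faults, reinspected,
                      rate_limit=150.0, size_limit=500, fault_limit=90):
    flags = []
    if insp_rate > rate_limit:
        flags.append("inspection rate above guideline")
    if prep_rate > rate_limit:
        flags.append("preparation rate above guideline")
    if loc > size_limit:
        flags.append("module larger than the recommended 500 LOC")
    faults_per_kloc = faults / (loc / 1000.0)
    if (faults > fault_limit or faults_per_kloc > fault_limit) and not reinspected:
        flags.append("reinspection guideline triggered but code not reinspected")
    return flags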


What is the status of the inspection process? You can assess the status of the process by comparing the amount of code inspected with the amount you had planned to inspect. When too many developers wait too long to inspect their code, preparation and inspection rates suffer and it becomes difficult to schedule the right people at each inspection. To prevent this, monitor the rate at which code is inspected.

A simple method is to graph the total KLOC inspected (metric 1) versus the days since the first inspection meeting, as in Figure 2. This shows if the current rate of inspection coincides with expectations and if the trend indicates the project will complete all inspections as planned. The horizontal dashed line in the figure indicates the KLOC the project planned to inspect; the vertical dotted line, the planned completion of all inspections.

For several weeks after the first inspection, not many inspections were conducted. After about three months, the manager encouraged developers to inspect completed submodules instead of waiting until they had completed the entire work product. The developers responded and, although the original completion date is still in jeopardy, progress toward completion is good.

USING METRICS TO IMPROVE INSPECTIONS

Data analysis can also be used to improve the inspection process itself. This can be done at the end of a phase, a release, or an increment. On very large projects, it can be done whenever the data reasonably represents the project as a whole. The manager determines the focus for the process improvement, often making trade-offs among cost, quality, and schedule. We believe the major emphasis should be on quality. It is much cheaper to find faults early in the development cycle, so overall development cost generally decreases when the effectiveness of inspections increases. You can use the key metrics, inferring how changing one might change others, to formulate process changes and predict their effects. However, before you make changes you should also conduct interviews, surveys, and retrospectives.

Productivity and effectiveness are related. A change in the inspection process that improves productivity could diminish effectiveness, and a change that improves effectiveness can lower productivity. It is important to study both before recommending process changes.

TABLE 3. SAMPLE PROJECT SUMMARY

Number of inspections in sample                                  55
Total KLOC inspected                                             7.7
Average LOC inspected (module size)                              409
Average preparation rate (LOC/hour)                              121.9
Average inspection rate (LOC/hour)                               154.8
Total faults detected (observable and nonobservable) per KLOC    89.7
Average effort per fault detected (hours/fault)                  0.5

How effective is the inspection process? Before you can measure inspection effectiveness, you must decide what aspects of the process are most important. Inspections reduce testing and maintenance costs, improve quality, and reduce development intervals. They also help in training, team building, and knowledge building. These benefits are important, but we focus on defect removal as a measure of inspection effectiveness.

Defect-removal efficiency (metric 9) measures the percentage of faults removed by inspection, compared with faults found in testing and by customers. It is a bottom-line number you can use to assess the effect of changes to the inspection process and to compare results. It also reflects additional faults introduced during inspection rework. Faults by themselves are not a good indication of effectiveness, because faults can vary with the type of software under development and its quality before inspection. Defect-removal efficiency, however, is independent of the code's inherent fault density.

Defect-removal efficiency cannot be computed until the end of development, when the next release may already be under development. Because defect-removal efficiency improves as faults found in inspections increases, you can use average faults detected per KLOC as an early indication of inspection effectiveness.
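Metric 9 can be written as a one-line calculation. This sketch assumes the three fault counts are tallied as in the Total faults data item; the example numbers are invented.

def defect_removal_efficiency(faults_found_by_inspection,
                              faults_found_in_testing,
                              faults_found_by_customers):
    total_faults = (faults_found_by_inspection
                    + faults_found_in_testing
                    + faults_found_by_customers)
    return 100.0 * faults_found_by_inspection / total_faults

# e.g. 700 inspection faults, 250 found in test, 50 from the field -> 70 percent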


We are presented with a trade-off. Because the current productivity level, 0.5 hours per fault, is satisfactory, we decide that we are not willing to sacrifice effectiveness for productivity and conclude that decreasing effort is not an acceptable way to improve productivity. Even though Figures 4a and 4b show that increasing effort will decrease productivity, the average productivity is likely to remain at or below one hour per fault. Because this is still extremely economical, we recommend the project decrease its inspection and preparation rates and accept the decrease in productivity.

Managers should be aware that changing the inspection process to improve effectiveness generally lowers productivity, but the cost of this decrease is negligible when compared with the cost of removing defects later in development or test. Some improvements in effectiveness also increase productivity, of course. In general, we think you should continuously strive to improve the effectiveness of the inspection process and simply monitor productivity, considering trade-offs in detail only if the cost per fault gets close to that of testing. Unfortunately, we have seen projects make changes to improve productivity - such as combining inspection roles, limiting preparation, and reducing the number of participants - only to find they have decreased the effectiveness of the inspection process.
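As an illustrative rule of thumb (not a prescription from the article), the monitoring suggested above could be automated as a simple check that fires when inspection cost per fault approaches the cost of finding a fault in test. The 8-hours-per-test-fault figure and the 0.5 ratio are placeholders a project would calibrate from its own data.

def productivity_warning(inspection_hours_per_fault,
                         test_hours_per_fault=8.0,
                         warning_ratio=0.5):
    # True when inspection is no longer clearly cheaper than test,
    # i.e. when a detailed trade-off analysis becomes worthwhile.
    return inspection_hours_per_fault >= warning_ratio * test_hours_per_fault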

ACKNOWLEDGMENTS
We thank John Betten, formerly of AT&T Bell Laboratories, coauthor of the AT&T Best Current Practice on Code Inspections, for collaborating with us on this article. Much of the information in this article was used in the 1992 issue of that best current practice.

REFERENCES
1. W.S. Humphrey, Managing the Software Process, Addison-Wesley, Reading, Mass., 1989.
2. C. Jones, Applied Software Measurement, McGraw-Hill, New York, 1991.
3. J.S. Collofello and S.N. Woodfield, "Evaluating the Effectiveness of Reliability-Assurance Techniques," J. Systems and Software, Vol. 9, 1989, pp. 191-195.
4. N. Whitten, Managing Software Development Projects, Wiley, New York, 1990.
5. E.F. Weller, "Lessons Learned from Three Years of Inspection Data," IEEE Software, Sept. 1993, pp. 38-45.
6. M.E. Fagan, "Design and Code Inspections to Reduce Errors in Program Development," IBM Systems J., No. 3, 1976, pp. 182-211.
7. V.R. Basili and H.D. Rombach, "The TAME Project: Towards Improvement-Oriented Software Environments," IEEE Trans. Software Eng., June 1988, pp. 758-773.
8. J.E. Gaffney, Jr., "Estimating the Number of Faults in Code," IEEE Trans. Software Eng., July 1984, pp. 459-464.
9. W.S. Cleveland, The Elements of Graphing Data, Wadsworth, Monterey, Calif., 1985.


Jack Barnard is a distinguished member of technical staff in the Development Technologies Department, AT&T Bell Laboratories, where his interests are software tools and process-improvement technologies. Barnard received a BS in mathematics and an MS in computer science from the University of Houston. He is secretary of the IEEE Computer Society technical committee on software engineering.


Art Price is a distinguished member of technical staff in the Quality Management Systems Group at AT&T Bell Laboratories, where he works as an internal consultant on software process and quality technology. His research interests include software cost estimation, research and development process management and improvement, and technology transfer. Price received a BS, an MS, and a PhD in mathematics from Rensselaer Polytechnic Institute. He is a member of ACM, the American Society for Quality Control, and the

Address questions about this article to the authors at AT&T Bell Laboratories, 11900 N. Pecos St., Denver, CO 80234; h.j.barnard@att.com.

