
Methods of software quality assurance

Daniel Osielczak (d.osielczak@gmail.com), Sebastian Mianowski (aeons@wp.pl)

Table of Contents
1 Introduction
  1.1 SQA
  1.2 Overview of methods
2 Description of testing methods
  2.1 Black-box testing
  2.2 White-box testing
  2.3 Unit testing
  2.4 Integration testing
  2.5 Functional testing
  2.6 End-to-end testing
  2.7 Sanity (Smoke) testing
  2.8 Acceptance testing
  2.9 Load testing
  2.10 Usability testing
  2.11 Recovery testing
  2.12 Security testing
  2.13 Exploratory testing
3 Summary

1 Introduction

1.1 SQA

Software Quality Assurance (SQA) consists of a means of monitoring the software engineering processes and methods used to ensure quality. It does this by means of audits of the quality management system under which the software system is created. These audits are backed by one or more standards, usually ISO 9000. It is distinct from software quality control, which includes reviewing requirements documents and software testing. SQA encompasses the entire software development process, which includes processes such as software design, coding, source code control, code reviews, change management, configuration management, and release management. Whereas software quality control is a control of products, software quality assurance is a control of processes.

Software quality assurance is related to the practice of quality assurance in product manufacturing. There are, however, some notable differences between software and a manufactured product. These differences stem from the fact that the manufactured product is physical and can be seen, whereas the software product is not visible. Therefore its function, benefit and costs are not as easily measured. What's more, when a manufactured product rolls off the assembly line, it is essentially a complete, finished product, whereas software is never finished. Software lives, grows, evolves, and metamorphoses, unlike its tangible counterparts. Therefore, the processes and methods to manage, monitor, and measure its ongoing quality are as fluid and sometimes elusive as the defects that they are meant to keep in check. [1]

SQA is also responsible for gathering and presenting software metrics. For example, the Mean Time Between Failures (MTBF) is a common software metric (or measure) that tracks how often the system fails. This metric is relevant to the reliability characteristic of software and, by extension, the availability characteristic. SQA may gather these metrics from various sources, but note the important pragmatic point of associating an outcome (or effect) with a cause. In this way SQA can measure the value or consequence of having a given standard, process, or procedure. Then, in the form of continuous process improvement, feedback can be given to the various process teams (analysis, design, coding, etc.) and a process improvement can be initiated.
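As a brief, illustrative sketch of how such a metric might be gathered, the snippet below computes MTBF from a hypothetical incident log; the timestamps and the log format are assumptions invented for illustration, not data from any real project.

    # Sketch: computing MTBF from an assumed list of failure timestamps.
    from datetime import datetime

    failure_times = [                     # assumed incident log
        datetime(2024, 1, 3, 9, 30),
        datetime(2024, 1, 10, 14, 0),
        datetime(2024, 1, 24, 8, 15),
    ]

    # Hours elapsed between each pair of consecutive failures.
    gaps = [
        (later - earlier).total_seconds() / 3600.0
        for earlier, later in zip(failure_times, failure_times[1:])
    ]
    mtbf_hours = sum(gaps) / len(gaps)    # mean time between failures
    print(f"MTBF: {mtbf_hours:.1f} hours")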

1.2 Overview of methods

Software Quality Assurance takes several forms. A brief list of testing methods that should be considered [2]:

Black box testing - not based on any knowledge of internal design or code. Tests are based on requirements and functionality.
White box testing - based on knowledge of the internal logic of an application's code. Tests are based on coverage of code statements, branches, paths and conditions.
Unit testing - the most micro scale of testing; to test particular functions or code modules. Typically done by the programmer and not by testers, as it requires detailed knowledge of the internal program design and code. Not always easily done unless the application has a well-designed architecture with tight code; may require developing test driver modules or test harnesses.
Incremental integration testing - continuous testing of an application as new functionality is added; requires that various aspects of an application's functionality be independent enough to work separately before all parts of the program are completed, or that test drivers be developed as needed; done by programmers or by testers.
Integration testing - testing of combined parts of an application to determine if they function together correctly. The parts can be code modules, individual applications, client and server applications on a network, etc. This type of testing is especially relevant to client/server and distributed systems.
Functional testing - black-box type testing geared to functional requirements of an application; this type of testing should be done by testers. This doesn't mean that the programmers shouldn't check that their code works before releasing it (which of course applies to any stage of testing).
System testing - black-box type testing that is based on overall requirements specifications; covers all combined parts of a system.
End-to-end testing - similar to system testing; the macro end of the test scale; involves testing of a complete application environment in a situation that mimics real-world use, such as interacting with a database, using network communications, or interacting with other hardware, applications, or systems if appropriate.
Sanity (Smoke) testing - typically an initial testing effort to determine if a new software version is performing well enough to accept it for a major testing effort. For example, if the new software is crashing systems every 5 minutes, bogging down systems to a crawl, or corrupting databases, the software may not be in a sane enough condition to warrant further testing in its current state.
Regression testing - re-testing after fixes or modifications of the software or its environment. It can be difficult to determine how much re-testing is needed, especially near the end of the development cycle. Automated testing tools can be especially useful for this type of testing.
Acceptance testing - final testing based on specifications of the end-user or customer, or based on use by end-users/customers over some limited period of time.
Load testing - testing an application under heavy loads, such as testing of a web site under a range of loads to determine at what point the system's response time degrades or fails.
Stress testing - term often used interchangeably with load and performance testing. Also used to describe such tests as system functional testing while under unusually heavy loads, heavy repetition of certain actions or inputs, input of large numerical values, large complex queries to a database system, etc.
Performance testing - term often used interchangeably with stress and load testing. Ideally performance testing (and any other type of testing) is defined in requirements documentation or QA or Test Plans.
Usability testing - testing for user-friendliness. Clearly this is subjective, and will depend on the targeted end-user or customer. User interviews, surveys, video recording of user sessions, and other techniques can be used. Programmers and testers are usually not appropriate as usability testers.
Install/uninstall testing - testing of full, partial, or upgrade install/uninstall processes.
Recovery testing - testing how well a system recovers from crashes, hardware failures, or other catastrophic problems.
Failover testing - typically used interchangeably with recovery testing.
Security testing - testing how well the system protects against unauthorized internal or external access, willful damage, etc.; may require sophisticated testing techniques.
Compatibility testing - testing how well software performs in a particular hardware/software/operating system/network/etc. environment.
Exploratory testing - often taken to mean a creative, informal software test that is not based on formal test plans or test cases; testers may be learning the software as they test it.
Ad-hoc testing - similar to exploratory testing, but often taken to mean that the testers have significant understanding of the software before testing it.
Context-driven testing - testing driven by an understanding of the environment, culture, and intended use of software. For example, the testing approach for life-critical medical equipment software would be completely different than that for a low-cost computer game.
User acceptance testing - determining if software is satisfactory to an end-user or customer.
Comparison testing - comparing software weaknesses and strengths to competing products.
Alpha testing - testing of an application when development is nearing completion; minor design changes may still be made as a result of such testing. Typically done by end-users or others, not by programmers or testers.
Beta testing - testing when development and testing are essentially completed and final bugs and problems need to be found before final release. Typically done by end-users or others, not by programmers or testers.
Mutation testing - a method for determining if a set of test data or test cases is useful, by deliberately introducing various code changes (bugs) and retesting with the original test data/cases to determine if the bugs are detected. Proper implementation requires large computational resources.

2 Description of testing methods

2.1 Black-box testing

Black box testing takes an external perspective of the test object to derive test cases. These tests can be functional or non-functional, though usually functional. The test designer selects valid and invalid input and determines the correct output. There is no knowledge of the test object's internal structure. This method of test design is applicable to all levels of software testing: unit, integration, functional, system and acceptance. The higher the level, and hence the bigger and more complex the box, the more one is forced to use black box testing to simplify. While this method can uncover unimplemented parts of the specification, one cannot be sure that all existent paths are tested.

User input must be validated to conform to expected values. For example, if the software program is requesting input on the price of an item, and is expecting a value such as 3.99, the software must check to make sure all invalid cases are handled. A user could enter the price as -1 and achieve results contrary to the design of the program. Other examples of entries that could be entered and cause a failure in the software include: 1.20.35, Abc, 0.000001, and 999999999. These are possible test scenarios that should be entered for each point of user input. Other domains, such as text input, need to restrict the length of the characters that can be entered. If a program allocates 30 characters of memory space for a name, and the user enters 50 characters, a buffer overflow condition can occur. Typically when invalid user input occurs, the program will either correct it automatically, or display a message to the user that their input needs to be corrected before proceeding. [1]
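The price example above can be turned into concrete black-box test cases. The sketch below assumes a hypothetical validate_price function and an assumed valid price range; the test inputs come purely from the specification and the values listed above, not from any knowledge of the implementation.

    # Black-box sketch: test cases are derived from the specification only.
    # validate_price and its business limits are illustrative assumptions.
    def validate_price(text):
        """Accept a price string such as 3.99; raise ValueError otherwise."""
        value = float(text)                 # rejects "Abc", "1.20.35", ...
        if not (0.01 <= value < 10_000):    # assumed valid price range
            raise ValueError("price out of range")
        return round(value, 2)

    valid_inputs = ["3.99", "0.01", "9999.99"]
    invalid_inputs = ["-1", "Abc", "1.20.35", "0.000001", "999999999"]

    for text in valid_inputs:
        assert validate_price(text) > 0

    for text in invalid_inputs:
        try:
            validate_price(text)
        except ValueError:
            pass                            # expected: invalid input rejected
        else:
            raise AssertionError(f"{text!r} was wrongly accepted")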

2.2 White-box testing

This testing method is also known as glass-box testing. It uses an internal perspective of the system to design test cases based on internal structure. It requires programming skills to identify all paths through the software. The tester chooses test case inputs to exercise paths through the code and determines the appropriate outputs. Since the tests are based on the actual implementation, if the implementation changes, the tests probably will need to change as well. This adds financial resistance to the change process, so buggy products may stay buggy.

While white box testing is applicable at the unit, integration and system levels of the software testing process, it is typically applied to the unit. While it normally tests paths within a unit, it can also test paths between units during integration, and between subsystems during a system level test. Though this method of test design can uncover an overwhelming number of test cases, it might not detect unimplemented parts of the specification or missing requirements, but one can be sure that all paths through the test object are executed. Typical white box test design techniques include control flow and data flow testing. [1]
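As a minimal sketch of control-flow-based test design, the function below has three paths and one test input is chosen per path; the function and its thresholds are assumptions made up for illustration, not anything from the article.

    # White-box sketch: inputs are chosen by reading the branches, so that
    # every path through apply_discount is executed at least once.
    def apply_discount(total, is_member):
        if is_member and total >= 100:    # path 1: member discount
            return total * 0.75
        if total >= 500:                  # path 2: bulk discount
            return total * 0.5
        return total                      # path 3: no discount

    assert apply_discount(200, True) == 150.0   # covers path 1
    assert apply_discount(600, False) == 300.0  # covers path 2
    assert apply_discount(50, False) == 50      # covers path 3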

2.3 Unit testing

Unit testing is a procedure used to validate that individual units of source code are working properly. A unit is the smallest testable part of an application. In procedural programming a unit may be an individual program, function, procedure, etc., while in object-oriented programming the smallest unit is a method, which may belong to a base/super class, abstract class or derived/child class. Ideally, each test case is independent from the others; mock objects and test harnesses can be used to assist testing a module in isolation. Unit testing is typically done by developers and not by software testers or end-users.

Testing, in general, cannot be expected to catch every error in the program. The same is true for unit testing. By definition, it only tests the functionality of the units themselves. Therefore, it may not catch integration errors, performance problems, or other system-wide issues. Unit testing is more effective if it is used in conjunction with other software testing activities.

Unit testing frameworks, which help simplify the process of unit testing, have been developed for a wide variety of languages. It is generally possible to perform unit testing without the support of a specific framework by writing client code that exercises the units under test and uses assertion, exception, or early exit mechanisms to signal failure. This approach is valuable in that there is a non-negligible barrier to the adoption of unit testing. However, it is also limited in that many advanced features of a proper framework are missing or must be hand-coded. [1]
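For illustration, the sketch below shows a unit test written with Python's standard unittest framework; the function under test, word_count, is an assumed example rather than anything from the article.

    # A minimal unit-test sketch using Python's standard unittest framework.
    import unittest

    def word_count(text):
        """Return the number of whitespace-separated words in text."""
        return len(text.split())

    class WordCountTest(unittest.TestCase):
        def test_empty_string(self):
            self.assertEqual(word_count(""), 0)

        def test_multiple_words(self):
            self.assertEqual(word_count("unit tests run in isolation"), 5)

    if __name__ == "__main__":
        unittest.main()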

2.4 Integration testing

Integration testing (sometimes called Integration and Testing, abbreviated I&T) is the phase of software testing in which individual software modules are combined and tested as a group. It follows unit testing and precedes system testing. Integration testing takes as its input modules that have been unit tested, groups them into larger aggregates, applies tests defined in an integration test plan to those aggregates, and delivers as its output the integrated system ready for system testing. Integration testing can expose problems with the interfaces among program components before trouble occurs in real-world program execution.

There are two major ways of carrying out an integration test, called the bottom-up method and the top-down method. Bottom-up integration testing begins with unit testing, followed by tests of progressively higher-level combinations of units called modules or builds. In top-down integration testing, the highest-level modules are tested first and progressively lower-level modules are tested after that. In a comprehensive software development environment, bottom-up testing is usually done first, followed by top-down testing. The process concludes with multiple tests of the complete application, preferably in scenarios designed to mimic those it will encounter in customers' computers, systems and networks. [3]
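A small, assumed sketch of the idea: two modules that were unit tested in isolation (a repository and a cart service) are now exercised together through their shared interface. All class and method names are made up for illustration.

    # Integration sketch: the service and the repository must agree on
    # their interface when combined, not just pass their own unit tests.
    class InMemoryProductRepository:
        """Stands in for the real database layer during integration testing."""
        def __init__(self):
            self._prices = {"apple": 2.50, "bread": 4.00}

        def price_of(self, name):
            return self._prices[name]

    class CartService:
        def __init__(self, repository):
            self.repository = repository

        def total(self, items):
            return sum(self.repository.price_of(name) for name in items)

    repo = InMemoryProductRepository()
    cart = CartService(repo)
    assert cart.total(["apple", "bread", "apple"]) == 9.00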

2.5 Functional testing

Functional testing covers how well the system executes the functions it is supposed to execute, including user commands, data manipulation, searches and business processes, user screens, and integrations. Functional testing covers the obvious surface type of functions, as well as the back-end operations (such as security and how upgrades affect the system). It can therefore be described together with system testing, which is also conducted on a complete, integrated system to evaluate the system's compliance with its specified requirements.

As a rule, system testing takes as its input all of the integrated software components that have successfully passed integration testing, as well as the software system itself integrated with any applicable hardware system(s). The purpose of integration testing is to detect any inconsistencies between the software units that are integrated together (called assemblages) or between any of the assemblages and the hardware. System testing is a more limited type of testing; it seeks to detect defects both within the inter-assemblages and also within the system as a whole. System testing is done on the entire system against the Functional Requirement Specification(s) (FRS) and/or the System Requirement Specification (SRS). Moreover, system testing is an investigatory testing phase, where the focus is to have almost a destructive attitude and test not only the design, but also the behaviour and even the believed expectations of the customer. It is also intended to test up to and beyond the bounds defined in the software/hardware requirements specification(s). [1], [4]
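As a rough sketch, a functional test checks observable behaviour against a stated requirement rather than against internal structure. The catalogue, the search_catalog function and the requirement labels below are all assumptions invented for illustration.

    # Functional-test sketch: assertions mirror requirements, not code paths.
    CATALOG = ["notebook", "pencil", "eraser"]

    def search_catalog(query):
        """Return catalog entries containing the query, case-insensitively."""
        return [item for item in CATALOG if query.lower() in item]

    # Requirement FR-12 (illustrative): an exact product name must be found.
    assert search_catalog("pencil") == ["pencil"]
    # Requirement FR-13 (illustrative): unknown names return an empty result.
    assert search_catalog("stapler") == []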

2.6 End-to-end testing

Although it is quite similar to system testing, there is a major difference between those tests. While system testing is essentially testing the core functionality of the SUT (System Under Test), end-to-end testing tests both functional and non-functional elements of the SUT.

There are two ways to do end-to-end testing. The most commonly understood meaning of end-to-end testing is that testing occurs horizontally, across multiple applications. For example, a Web-based order-entry system might interface to the back-office accounts receivable, inventory, and shipping systems. A horizontal end-to-end test includes verifying transactions through each application, start to finish, assuring that all related processes occur correctly; for example, an ordered item is properly removed from inventory, shipped to the customer, and invoiced. This level of integration might exist within a single enterprise resource planning (ERP) application, but most companies have a mixture of applications acquired from different vendors or internally developed at different times.

Vertical end-to-end testing refers to testing each of the layers of a single application's architecture from top to bottom. For example, the order-entry system might use HTML to access a Web server that calls an API on the transaction server, which in turn generates SQL transactions against the mainframe database. Other applications may share the API and SQL components, so those interfaces must be tested both individually and vertically in an end-to-end environment. This is a special challenge, because intermediate layers such as API or SQL middleware are headless in the sense that they have no user interface that lends itself to the usual manual testing fallback. As a result, you must either test these layers through the various applications that they support, a daunting task because there can be so many of them, or you must write an artificial front end, thus creating yet another development project. [2], [5]
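A toy sketch of the horizontal case: one transaction is followed through several cooperating components (order entry, inventory, invoicing), and the state of every downstream system is verified at the end. All component names and data are illustrative assumptions.

    # Horizontal end-to-end sketch: one order flows through every system.
    inventory = {"SKU-1": 10}
    invoices = []

    def place_order(sku, quantity):
        """Order entry: reserve stock and raise an invoice."""
        if inventory[sku] < quantity:
            raise RuntimeError("out of stock")
        inventory[sku] -= quantity                       # inventory system
        invoices.append({"sku": sku, "qty": quantity})   # accounts receivable

    place_order("SKU-1", 3)
    assert inventory["SKU-1"] == 7                       # stock was removed
    assert invoices == [{"sku": "SKU-1", "qty": 3}]      # customer was invoiced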

2.7 Sanity (Smoke) testing

It is a very brief run-through of the functionality of a computer program, system, calculation, or other analysis, to assure that the system or methodology works as expected, often prior to a more exhaustive round of testing. If it is the first test made after repairs or first assembly, it is called a smoke test; otherwise it is a sanity test. Another difference is that, generally, a smoke test is scripted (either using a written set of tests or an automated test), whereas a sanity test is usually unscripted.

With the evolution of test methodologies, sanity tests are useful both for initial environment validation and future iterative increments. The process of sanity testing begins with the execution of some online transactions of various modules and batch programs of various modules, to see whether the software runs without any hindrance or abnormal termination. This practice can help identify most of the environment-related problems. A classic example of this in programming is the "hello world" program. If a person has just set up a computer and a compiler, a quick sanity test can be performed to see if the compiler actually works: write a program that simply displays the words "hello world".

A smoke test is a collection of written tests that are performed on a system prior to being accepted for further testing. This is also known as a build verification test. This is a shallow and wide approach to the application. The tester touches all areas of the application without getting too deep, looking for answers to basic questions like: "Can I launch the test item at all?", "Does it open to a window?", "Do the buttons on the window do things?". There is no need to get down to field validation or business flows. If you get a "no" answer to basic questions like these, then the application is so badly broken that there is effectively nothing there to allow further testing. [1]
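A scripted smoke test for a web application might look roughly like the sketch below: a handful of shallow checks run right after a new build is deployed, before any deeper testing starts. The base URL and the checked pages are illustrative assumptions.

    # Smoke-test sketch: shallow, wide checks against an assumed deployment.
    import urllib.request

    BASE_URL = "http://localhost:8000"   # assumed test deployment

    def smoke_test():
        checks = {
            "application answers at all": f"{BASE_URL}/",
            "login page is reachable": f"{BASE_URL}/login",
            "health endpoint reports OK": f"{BASE_URL}/health",
        }
        for name, url in checks.items():
            with urllib.request.urlopen(url, timeout=5) as response:
                assert response.status == 200, f"smoke check failed: {name}"
        print("build accepted for further testing")

    if __name__ == "__main__":
        smoke_test()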

2.8 Acceptance testing

Acceptance testing is a term referring to the functional testing of a user story by the software development team during the implementation phase. The customer specifies scenarios to test when a user story has been correctly implemented. A story can have one or many acceptance tests, whatever it takes to ensure the functionality works. Acceptance tests are black box system tests. Each acceptance test represents some expected result from the system. Customers are responsible for verifying the correctness of the acceptance tests and reviewing test scores to decide which failed tests are of highest priority. Acceptance tests are also used as regression tests prior to a production release.

A user story is not considered complete until it has passed its acceptance tests. This means that new acceptance tests must be created for each iteration or the development team will report zero progress. The results of these tests give confidence to the clients as to how the system will perform in production. They may also be a legal or contractual requirement for acceptance of the system. [1]
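An acceptance test can be phrased the way a customer would state the story, for example "a registered user can log in with a valid password". The sketch below assumes a hypothetical login function and user store; the story and its behaviour are invented for illustration.

    # Acceptance-test sketch for one assumed user story.
    USERS = {"alice": "s3cret"}

    def login(username, password):
        return USERS.get(username) == password

    # Given a registered user, when she logs in with the correct password,
    # then access is granted.
    assert login("alice", "s3cret") is True

    # Given a registered user, when she logs in with a wrong password,
    # then access is denied.
    assert login("alice", "wrong") is False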

2.9 Load testing

Load testing generally refers to the practice of modeling the expected usage of a software program by simulating multiple users accessing the program's services concurrently. As such, this testing is most relevant for multi-user systems, often built using a client/server model, such as web servers. However, other types of software systems can be load-tested also. For example, a word processor or graphics editor can be forced to read an extremely large document, or a financial package can be forced to generate a report based on several years' worth of data. The most accurate load testing occurs with actual, rather than theoretical, results.

When the load placed on the system is raised beyond normal usage patterns, in order to test the system's response at unusually high or peak loads, it is known as stress testing. The load is usually so great that error conditions are the expected result, although no clear boundary exists where an activity ceases to be a load test and becomes a stress test.

Load tests are major tests, requiring substantial input from the business, so that anticipated activity can be accurately simulated in a test environment. If the project has a pilot in production, then logs from the pilot can be used to generate usage profiles that can be used as part of the testing process, and can even be used to drive large portions of the load test. Load testing must be executed on today's production-size database, and optionally with a projected database. If some database tables will be much larger in some months' time, then load testing should also be conducted against a projected database. It is important that such tests are repeatable, and give the same results for identical runs. They may need to be executed several times in the first year of wide-scale deployment, to ensure that new releases and changes in database size do not push response times beyond prescribed SLAs (Service Level Agreements). [6]

Stress testing is particularly important for mission-critical software, but is used for all types of software. Stress tests commonly put a greater emphasis on robustness, availability, and error handling under a heavy load than on what would be considered correct behavior under normal circumstances. In particular, the goals of such tests may be to ensure the software doesn't crash in conditions of insufficient computational resources (such as memory or disk space), unusually high concurrency, or denial of service attacks. [1]
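As a very small sketch of the idea, the snippet below simulates a number of concurrent users hitting one endpoint and summarises the response times. The URL and the user count are illustrative assumptions; real load tests use dedicated tools and far larger, business-derived usage profiles.

    # Load-test sketch: concurrent requests against an assumed endpoint.
    import time
    import urllib.request
    from concurrent.futures import ThreadPoolExecutor

    URL = "http://localhost:8000/search?q=test"   # assumed system under test
    SIMULATED_USERS = 50

    def one_request(_):
        start = time.perf_counter()
        with urllib.request.urlopen(URL, timeout=10) as response:
            response.read()
        return time.perf_counter() - start

    if __name__ == "__main__":
        with ThreadPoolExecutor(max_workers=SIMULATED_USERS) as pool:
            durations = sorted(pool.map(one_request, range(SIMULATED_USERS)))
        print(f"median: {durations[len(durations) // 2]:.3f}s")
        print(f"worst:  {durations[-1]:.3f}s")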

2.10 Usability testing

Usability testing is a technique used to evaluate a product by testing it on users. This can be seen as an irreplaceable usability practice, since it gives direct input on how real users use the system. This is in contrast with usability inspection methods, where experts use different methods to evaluate a user interface without involving users. Usability testing focuses on measuring a human-made product's capacity to meet its intended purpose. Examples of products that commonly benefit from usability testing are web sites or web applications, computer interfaces, documents, or devices. Usability testing measures the usability, or ease of use, of a specific object or set of objects, whereas general human-computer interaction studies attempt to formulate universal principles.

Setting up a usability test involves carefully creating a scenario, or realistic situation, wherein the person performs a list of tasks using the product being tested while observers watch and take notes. Several other test instruments such as scripted instructions, paper prototypes, and pre- and post-test questionnaires are also used to gather feedback on the product being tested. For example, to test the attachment function of an e-mail program, a scenario would describe a situation where a person needs to send an e-mail attachment, and ask him or her to undertake this task. The aim is to observe how people function in a realistic manner, so that developers can see problem areas, and what people like. Techniques popularly used to gather data during a usability test include the think-aloud protocol and eye tracking. [1]

2.11 Recovery testing

Recovery testing is the activity of testing how well the software is able to recover from crashes, hardware failures and other similar problems. Recovery testing is the forced failure of the software in a variety of ways to verify that recovery is properly performed. Several reliability metrics are relevant here:

Mean time to failure (MTTF): the average or mean time between initial operation and the first occurrence of a failure or malfunction; in other words, the expected value of the system failure time.
Mean time to repair (MTTR): the average time that it takes to repair a failure.
Mean time between failures (MTBF): a statistical measure of reliability, calculated to indicate the anticipated average time between failures; the longer, the better. [1]

This kind of test is quite similar to a failover test. Failover tests verify redundancy mechanisms while the system is under load. This is in contrast to load tests, which are conducted under anticipated load with no component failure during the course of a test. For example, in a web environment, failover testing determines what will happen if multiple web servers are being used under peak anticipated load, and one of them dies. Failover testing allows technicians to address problems in advance, in the comfort of a testing situation, rather than in the heat of a production outage. It also provides a baseline of failover capability, so that a sick server can be shut down with confidence, in the knowledge that the remaining infrastructure will cope with the surge of failover load. After verifying that a system can sustain a component outage, it is also important to verify that when the component is back up, it is available to take load again, and that it can sustain the influx of activity when it comes back online. [6]
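Under one common reliability-engineering convention (a standard formulation assuming steady-state operation, not taken from the cited sources), these metrics are related as follows:

    \[
      \mathrm{MTBF} = \mathrm{MTTF} + \mathrm{MTTR},
      \qquad
      \text{Availability} \approx \frac{\mathrm{MTTF}}{\mathrm{MTTF} + \mathrm{MTTR}}
    \]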

2.12 Security testing

This process determines that software protects data and maintains functionality as intended. The six basic security concepts that need to be covered by security testing are:

1. Confidentiality: a security measure which protects against the disclosure of information to parties other than the intended recipient; this is by no means the only way of ensuring confidentiality.
2. Integrity: a measure intended to allow the receiver to determine that the information it receives has not been altered in transit or by anyone other than the originator of the information. Integrity schemes often use some of the same underlying technologies as confidentiality schemes, but they usually involve adding additional information to a communication to form the basis of an algorithmic check, rather than encoding all of the communication.
3. Authentication: a measure designed to establish the validity of a transmission, message, or originator. It allows a receiver to have confidence that the information it receives originated from a specific known source.
4. Authorisation: the process of determining that a requester is allowed to receive a service or perform an operation. Access control is an example of authorisation.
5. Availability: assuring that information and communications services will be ready for use when expected. Information must be kept available to authorized persons when they need it.
6. Non-repudiation: a measure intended to prevent the later denial that an action happened, or that a communication took place, etc. In communication terms this often involves the interchange of authentication information combined with some form of provable time stamp.

Some testing techniques are predominantly manual, requiring an individual to initiate and conduct the test. Other tests are highly automated and require less human involvement. Regardless of the type of testing, staff that set up and conduct security testing should have significant security and networking knowledge. [1], [7]
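A small sketch of an automated check for the authorisation concept above: a request without a valid token must be refused, and one with a valid token must succeed. The endpoint, the token and the expected status codes are illustrative assumptions about a hypothetical system under test.

    # Security-test sketch: unauthorised access must be rejected.
    import urllib.request
    import urllib.error

    PROTECTED_URL = "http://localhost:8000/admin/users"   # assumed endpoint

    def status_of(url, token=None):
        request = urllib.request.Request(url)
        if token:
            request.add_header("Authorization", f"Bearer {token}")
        try:
            with urllib.request.urlopen(request, timeout=5) as response:
                return response.status
        except urllib.error.HTTPError as error:
            return error.code

    assert status_of(PROTECTED_URL) in (401, 403)                     # refused
    assert status_of(PROTECTED_URL, token="valid-test-token") == 200  # allowed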

2.13 Exploratory testing

Exploratory testing is the tactical pursuit of software faults and defects, driven by challenging assumptions. It is an approach to software testing with simultaneous learning, test design and test execution. While the software is being tested, the tester learns things that, together with experience and creativity, generate new good tests to run. It is very similar to ad-hoc testing, which is the less formal (e.g. undocumented) method, although it is based on the same ideas and rules.

Exploratory testing seeks to find out how the software actually works, and to ask questions about how it will handle difficult and easy cases. The testing is dependent on the tester's skill at inventing test cases and finding defects. The more the tester knows about the product and different test methods, the better the testing will be. When performing exploratory testing, there are no exact expected results; it is the tester who decides what will be verified, critically investigating the correctness of the result.

In reality, testing almost always is a combination of exploratory and scripted testing, but with a tendency towards either one, depending on context. The main advantages of exploratory testing are that less preparation is needed, important bugs are found fast, and it is more intellectually stimulating than scripted testing. Another major benefit is that testers can use deductive reasoning based on previous results to guide their further testing on the fly. They do not have to complete a current series of scripted tests before focusing on, or moving on to, exploring a more target-rich environment. This also accelerates bug detection when used intelligently.

The disadvantages are that the tests can't be reviewed in advance (and thereby prevent errors in code and test cases), and that it can be difficult to show exactly which tests have been run. Exploratory testing is especially suitable if requirements and specifications are incomplete, or if there is a lack of time. The approach can also be used to verify that previous testing has found the most important defects. It is common to perform a combination of exploratory and scripted testing where the choice is based on risk. [1]

3 Summary

Although we only briefly described some of the most popular methods of SQA, the article is already quite long. This shows how important to software engineering, and how complex, Software Quality Assurance is. With the growing sophistication and complexity of modern applications and systems, and the amount of money spent each year by corporations on their development, testing has become one of the most important tasks in software engineering. Most organizations have found that an important key to achieving the shortest possible schedules is focusing their development processes so that they do their work right the first time. As the old chestnut goes: "If you don't have time to do the job right, where will you find the time to do it over?"

References
1. http://www.wikipedia.org/
2. http://www.softwareqatest.com/
3. http://searchsoftwarequality.techtarget.com/
4. http://www.ece.cmu.edu/
5. http://itmanagement.earthweb.com/
6. http://www.loadtest.com.au/
7. http://www.csrc.nist.gov/
