Mapua University

Abstract

Software testing is a crucial activity in, and an integral part of, software development: it evaluates the quality of a software program by identifying the defects and problems that form the basis of its improvement. It has evolved from the code-and-fix process of finding coding errors, to a separate and distinct stage in the software development life cycle, to a collaborative effort embedded throughout the entire software development process and performed even by business customers. This paper examines its application in software development over the years and underscores its importance by exemplifying the adverse effects of its inadequacy in the real world. It is a literature review of 91 journals, books, theses and reports which explores the growth of the discipline and identifies the contributors to its development over the years.

I. Introduction

Software testing is the process of identifying errors, gaps or missing requirements and ensuring that the software being developed is fit for use and has met its stakeholders' expectations. It should not be confused with software debugging: testing aims to identify undiscovered errors and software deficiencies, whereas debugging is the process of investigating discovered errors and subsequently removing them through design or code modification (Li, 1990).

Software testing typically consumes 30% to 40% of development effort. Andersson and Bergstrand (1995) estimated that 30% of effort was put into testing in the 1990s, and the International Software Testing Qualifications Board [ISTQB] (2016) reported that the average expenditure for testing in an IT project was 35% in 2015. This clearly points to the fact that software testing is an important aspect of software development, yet despite this level of importance it is still not seen as a lucrative career. Waychal and Capretz (2016) surveyed 73 junior students of a reputable computer engineering program and 22 testing professionals and concluded that the testing profession is far from popular: only 7% of the students were considering testing careers, and 73% of the professionals regarded their roles as second-class citizenship, citing exclusion from the decision process, getting no credit for good-quality products while being discredited for bad-quality ones, and the pressure on testers to compensate for developers' overruns as the major reasons. This does not discount the fact that software testing plays an indispensable role in software development, as it continues to be a part of every software development methodology that has emerged.

This paper delves into that claim and investigates the co-evolution of software testing and software development over the years, which tackles the first research question: How is software testing applied in software development from the 1950s until the 2010s? To emphasize how important testing software is, a supplementary research question is posed: What are the real-life examples of expensive and fatal software errors that could have been prevented by software testing? The next section provides a brief review of software testing to familiarize readers with the basic concepts and terminology used to expound the research questions. Sections III and IV answer the research questions, and the last section provides a narrative conclusion to this study.

II. Software Testing Concepts
A. Objectives of testing

- Discover defects, errors and deficiencies in the software.
- Determine the capabilities and limitations of the software.
- Improve the quality of the software.
- Ensure the software meets the business and customer requirements.

B. Types of testing

Black box or functional testing purely observes the results produced from input values, without any analysis of the code. Its main purpose is to ensure proper acceptance of the input and correct production of the output. White box or structural testing allows examination of the code but gives no attention to its specifications. It is basically a process of monitoring how the system processes input values to produce the desired output.

C. Levels of testing

Unit testing is the first level of testing, usually done by the developer or someone with proper knowledge of the core program design, to isolate each part of the program and ensure that individual parts are correct in terms of requirements and functionality. Integration testing is usually done after unit testing: modules are assembled and integrated to form the complete package, which is tested to uncover errors associated with interfacing. System testing rigorously tests the whole application to ensure that it meets the specified quality standards. Acceptance testing is done to ensure that the software meets the intended specifications and satisfies the client's requirements. It employs black box testing, as it focuses on the system's functionality rather than its code. Alpha testing takes place at the developer's site and involves testing of the software by internal staff before it is released to external customers, while beta testing takes place at customers' sites and involves testing by a group of customers who use the system at their own locations and provide feedback.

D. Limitations of software testing

Even after satisfactory completion of the testing phase in software development, it is impossible to guarantee that the developed system is free from error, since it is not practical to test exhaustively with respect to every value the input can assume (Saini & Rai, 2013). For example, a small 100-line program with some nested paths and a single loop executing fewer than twenty times may require 10^14 possible paths to be executed, which would take 3,170 years to test assuming each path can be evaluated in a millisecond (Pressman, 2001); a quick sketch of this arithmetic appears at the end of this subsection.

1. Software testing cannot guarantee that the software being developed is free from error. It can only show the presence of errors, never their absence (Miller & Howden, 1981).
2. Software testing is not a factor in deciding whether to release a product on deadline with errors or to compromise the deadline to fix the errors (Pressman, 2001).
3. Software testing cannot establish that a product functions properly under all conditions; it can only establish that it does not function properly under specific conditions (Istyaq & Zargar, 2010).
4. Software testing does not help in finding the root causes that resulted in the injection of defects in the first place. Locating the root causes of failures can help prevent the injection of such faults in the future (Anupriya & Ajeta, 2014).

Software testing is a trade-off among budget, time and quality, and the optimal stopping rule in software testing is when either reliability meets the requirement or the benefit from continued testing cannot justify the testing cost (Yang & Chao, 1995).
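As a quick back-of-the-envelope check of Pressman's example, the following Python sketch (using the path count and one-millisecond evaluation time quoted above) reproduces the figure:

    # Rough check of Pressman's (2001) exhaustive-testing example:
    # 10^14 distinct execution paths, one millisecond per path.
    paths = 10 ** 14
    seconds = paths * 1e-3                   # 1e11 seconds of pure execution
    years = seconds / (60 * 60 * 24 * 365)   # seconds in a (non-leap) year
    print(f"{years:,.0f} years")             # about 3,171 years; Pressman rounds to 3,170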
III. Software Testing in Software Development

There were only two steps in the software development process in the 1950s: an analysis step followed by a coding step (Royce, 1987). This was all that was required at the time, as computer programs were operated by those who built them, and software problems were attributed to hardware reliability, which received the focus in the earliest days of the digital computer (Gelperin & Hetzel, 1988). The concepts of debugging and testing, however, were not clearly differentiated and were used interchangeably. Charles Baker distinguished the two terms in a review of Dan McCracken's book Digital Computer Programming (Gelperin & Hetzel, 1988). Baker (1957) defined debugging as ensuring the program runs and testing as ensuring the program solves the problem; both ensure the software satisfies its requirements.

A decade later, people realized that software was easier to modify than hardware and did not require expensive production environments to develop (Boehm, 2006). This resulted in a shift towards a code-and-fix approach, which is simply writing code and fixing the problems in it (Connell, Carta, & Baer, 1993). It preferred frequent patches over exhaustive critical design reviews before execution ("measure twice, cut once"), but this often resulted in unwieldy spaghetti code and all-nighters spent hastily patching faulty code to meet deadlines (Boehm, 2006). The spaghetti code problem was addressed by Dijkstra's famous letter to the Communications of the ACM on the harms of go-to statements (Dijkstra, 1968). Böhm and Jacopini (1966) proved that sequential programs could be constructed without go-to statements in their paper "Flow diagrams, Turing machines and languages with only two formation rules", which led to the Structured Programming movement (Boehm, 2006). This is significant: it may have eliminated only a fraction of errors, namely sequence and control errors, but these were important because they tended to persist into the later, more difficult stages of validation in critical real-time programs (Boehm, Brown, & Lipow, 1976). This claim is supported by the Air Force Systems Command study called Information Processing/Data Automation Implications of Air Force Command and Control Requirements in the 1980s, or CCIP-85, which showed that the sequencing and control errors making up 27% of the errors in 7 batch programs were carried through to the final validation phase, where they accounted for 51% of the total errors (Boehm & Haile, 1985). But as computer applications increased in number, cost and complexity, it became evident that computer systems contained many deficiencies and that the cost of recovering from and fixing these problems was substantial (Gelperin & Hetzel, 1988). This prompted users and managers to place greater emphasis on a more effective approach to detecting problems before product delivery.

Traditional software development

A synthesis of the 1950s and 1960s paradigms was provided by Royce's version of the waterfall model in the 1970s, establishing defined phases of software development, incorporating the stakeholder perspective into the development process and emphasizing requirements analysis and the importance of testing activities (Pettichord, 2002). It is a sequential software development model that moves downward from requirements analysis to design, coding, testing and maintenance, with each phase proceeding in order without overlap.
The testing phase can be categorized into unit testing, system testing and acceptance testing, which verify that the individual components and the integrated solution have minimized error probability and meet the software requirements specification. Defects found during the testing stage are provided as feedback to the developers, who in turn fix the issues (Dorette, 2011). But placing the testing activities at the end of the sequential process became one of the model's most significant criticisms, as it often results in increased cost (Gillenson, Racer, Richardson, & Zhang, 2011): Boehm and Basili (2001) observed that it is 100 times more expensive to find and fix software problems after delivery than during the requirements and design phases. Several data points supporting this claim were enumerated during the expert workshop at the 2002 Metrics Symposium, one of which is Don O'Neill's description of IBM Rochester data showing a defect slippage ratio from code to field of about 117:1. Another is Yoshiro Matsumoto's software factory of 2,600 IT workers, which had an average rework time after shipment of 22.85 hours, versus less than 10 minutes if the work had been done prior to shipment (Shull, Basili, Boehm, Winsor Brown, Costa, Lindvall, & Zelkowitz, 2002, p. 2).

Dalal, Horgan and Kettenring (1993) considered software testing a coordinated effort that is part of the overall software engineering process. It was no longer seen as just an error-finding activity but as a prevention activity that not only reduced costs but improved the quality of the software being developed. It was identified as an important part of a cooperative process in which multiple actors such as testers, designers, programmers and users link together to accomplish a collective set of tasks over the entire software development life cycle (Kraut & Streeter, 1995). This proved beneficial: Waligora and Coon (1997) assessed the value of this change in the Flight Dynamics Division at NASA Goddard, comparing a typical waterfall life cycle, in which the system is fully developed before any system testing begins, against a new approach in which an independent test group did all the functional testing and began testing as soon as the first build was completed. The new approach reduced the overall system testing effort from 41% to 31%, reaped 35% cost savings, and decreased the average error rate from 4.3 to 1.5 errors per KSLOC. This also supports Schach's suggestion to perform testing throughout the software life cycle and his prediction that testing in the future would prevent faults rather than detect them (Schach, 1996). Paul Rook had recognized this exactly a decade earlier and proposed a software development process that can be considered an extension of the waterfall model, called the V-Model (Rook, 1986). The V-Model associates a testing phase with each phase of the development life cycle, forming a V shape instead of moving down in a linear way. It employs testing activities such as test design at the beginning of the project, incorporating testing into the entire software development life cycle. A typical V-Model consists of unit testing performed during the module design phase, integration testing during the architectural design phase, system testing during the system design phase and acceptance testing during the requirements analysis phase (Taya & Gupta, 2011).
The waterfall methodology continued to evolve in the decades following its introduction (Benediktsson, Dalcher, & Thorbergsson, 2006), but the advancement in its application of testing was insignificant. Succeeding methodologies such as the iterative waterfall model added feedback paths from every phase to its preceding phases, creating iterations that allow the correction of errors committed during one phase but detected in later phases, which mandates not only reworking the design but also redoing the coding and the system testing (Ghezzi, Jazayeri, & Mandrioli, 2002). The Unified Process model, developed by James Rumbaugh, Grady Booch and Ivar Jacobson during the late 1980s and early 1990s, is a use-case-driven, architecture-centric, iterative and incremental framework which employs a transition phase that transfers the software from the developer to the end user for beta testing and acceptance testing (Jacobson, Booch, & Rumbaugh, 1999). The Spiral model was proposed in 1988 by Barry Boehm in his paper "A spiral model of software development and enhancement"; unit testing, integration testing and acceptance testing are performed between coding and implementation as in the classical waterfall method, but the same set of life-cycle phases is repeated until development is complete, with the main emphasis given to risk analysis (Boehm & Hansen, 2001). The incremental/iterative model divides the waterfall cycle into smaller iterations, each passing through design, development, testing and implementation phases, progressively building the software being developed (Jalote, 2005). Other software methodologies such as Rapid Application Development and the Rational Unified Process emerged as well but are not discussed in this paper, as they employ more or less the same concept of testing as the waterfall. In 2001, however, a group of 17 IT professionals with similar objectives and ideas convened in Salt Lake City, Utah, USA and formulated a set of principles to uncover better ways of developing software by prioritizing individuals and interactions over processes and tools, working software over comprehensive documentation, customer collaboration over contract negotiation, and responding to change over following a plan, called the Agile Manifesto (Talby, Keren, Hazzan, & Dubinsky, 2006). This would eventually change software testing dramatically.

Agile Testing

Agile software development methods evolved in the mid-1990s as a reaction to the issues traditional software development methodologies had with the cost of changing requirements, long time to market and high risk of failing the expectations of end users in a changing business environment (Mwambe & Lutsaievskyi, 2013). Agile methodologies such as Extreme Programming, Scrum, Lean Development and Crystal follow the core principles of the Agile Manifesto (Rajasekhar & Shafi, 2014). Agile testing is a software testing practice that follows these principles. Unlike in the traditional software development process, there is no specific testing phase in agile methods; instead, testing is integrated into the development process (Abrahamsson, Salo, Ronkainen, & Warsta, 2002) and involves all members of the project team. Test cases are created collaboratively among subject matter experts, testers, developers, business analysts and customers.
The significant difference in agile testing is the quick feedback from testing, which helps developers find issues at a very early stage (Tripathi & Goyal, 2014). Systematic, a CMMI Level 5 company, doubled productivity and cut defects by 40% compared to waterfall projects in 2006 by adopting the Scrum agile methodology and focusing on early testing and time to fix builds (Jakobsen & Sutherland, 2009, p. 333). Both the development and testing teams take responsibility for analyzing the business specifications and defining the Sprint goal, the goal of each set period of time during which specific work must be completed and made ready for review. The testing team defines the scope of the testing, which is validated and approved by the entire team and the client, after which the testing team works on the test case design and documents it either in a testing tool or in an Excel spreadsheet to be handed over to the development team and the project sponsor from the business side, ensuring that the test coverage is as complete as possible. This is performed while the development team codes the modules in the very first Sprint. The testing team then begins testing in the test environment once the test case review and modifications are completed for a Sprint. Defects found during testing are logged in a defect-tracking tool, since fixing them is taken care of only during the next Sprint. The team, along with the project sponsor, determines at the end of each Sprint which defects are to be fixed in the current iteration on a priority basis, and this continues until all planned Sprints are completed. When the code is ready to test at the end of each Sprint, the testing team works with the development team to execute test cases in the development environment to identify early-stage defects so that the developers can fix them during the next round on a priority basis. This process is repeated throughout development.

Scrum

Scrum is an example of an agile software methodology. It is an iterative framework which breaks the complete project into small, shippable product-increment deliverables that can be tested at the end of each sprint, and it employs daily 15-minute stand-up meetings and the delivery of workable software at the end of every sprint (Schwaber & Beedle, 2001), which normally lasts 2-4 weeks. There is no separate testing team in a Scrum team; instead, developers are expected to perform unit testing during the design and development stage of a sprint, create test cases for the next sprint, and execute test cases and document the results during the quality assurance stage (Downey & Sutherland, 2013).

Extreme Programming

Extreme Programming (XP) is another example of an agile software methodology, developed by Kent Beck in 1999 to address the specific needs of software development conducted by small teams in the face of vague and changing requirements (Beck, 1999). Beck requires XP tests to be isolated and automatic (Beck, 1999). XP unit tests are written by programmers, while the functional tests are written by customers with the help of at least one dedicated tester who translates the customer's testing ideas into real, automatic and isolated tests. These two are the heart of the XP testing strategy, as Beck explained in Extreme Programming Explained. Scrum and Extreme Programming are just two examples of agile software methodologies.
Other methodologies that adopt the Agile Manifesto are Lean Software Development, the Crystal methodology, the Dynamic Systems Development Method and Feature-Driven Development. They are not discussed in this paper, since they apply agile testing in the same way: software testing is integrated into the entire software development cycle and is a collaborative effort among business users, business analysts, developers, testers and managers to guarantee that the business value desired by the client is delivered at a sustainable and continuous rhythm (O'Connor et al., 2009). However, the software engineering practices these agile methodologies employ provide significant innovations over traditional testing techniques. This claim is supported and discussed in the succeeding subsections.

Test-Driven Development

Test-driven development, despite its name, is not a testing technique but rather a development and design technique which requires tests to be written prior to the production code (Beck, 2001). The first step is the developer writing tests that concentrate on testing a function (e.g., if a system should be able to handle multiple inputs, the tests should reflect multiple inputs), which forces the developer to focus on the requirements before writing the code; this is a subtle but differentiating feature of test-driven development versus writing a regular unit test after the code is written. Next, all tests are run to ensure that the new test fails, confirming that the new test requires new code to be written and that the required behavior does not already exist; this also rules out the possibility that the new test is flawed and will always pass. Code that causes the test to pass is then written, and the tests are run again to see whether all test cases pass, confirming that the written code meets the test requirements and does not break any existing functionality. Refactoring, that is, restructuring the existing code without changing its external behavior, should follow, and the test cases should be re-run throughout each refactoring phase to give the developer confidence that the process is not altering any existing functionality. The cycle is repeated, starting with another new test, until the completion of the software (Astels, 2003); a minimal sketch of the cycle appears at the end of this subsection. Refactoring is assumed to have positive effects on a software's extensibility, modularity, reusability, complexity, maintainability and efficiency (Mens & Tourwé, 2004), but Kannangara and Wijayanayake (2015), in a study comparing non-refactored code (4,922 lines) and refactored code (5,005 lines), found no improvement in code analyzability, changeability or time behavior after applying ten refactoring techniques; the results did, however, show an improvement in the maintainability of the refactored code. While software maintenance is not within the scope of this research, it is worth noting that it is a serious cost factor in software development: the relative cost of maintaining software and managing its evolution represented more than 90% of its total cost in the year 2000 (Erlikh, 2000), and annual software maintenance cost in the US has been estimated at more than $70 billion (Sutherland, 1995).
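To make the red-green-refactor cycle concrete, here is a minimal illustrative sketch in Python using the standard unittest module (the is_leap_year function and its tests are hypothetical examples, not drawn from the literature above):

    import unittest

    # Step 1 (red): the test is written first; run on its own, it fails
    # because is_leap_year() does not exist yet.
    class TestLeapYear(unittest.TestCase):
        def test_leap_year_rules(self):
            self.assertTrue(is_leap_year(2024))    # divisible by 4
            self.assertFalse(is_leap_year(1900))   # century year, not by 400
            self.assertTrue(is_leap_year(2000))    # divisible by 400

    # Step 2 (green): just enough production code to make the test pass.
    def is_leap_year(year: int) -> bool:
        return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)

    # Step 3 (refactor): restructure the code as needed and re-run the
    # tests after every change to confirm behavior is unchanged.
    if __name__ == "__main__":
        unittest.main()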
Acceptance Test-Driven Development

Acceptance Test-Driven Development (ATDD) encompasses the same practices as regular test-driven development but differs in writing acceptance tests instead of unit tests before coding (Larman & Vodde, 2010). The developer, tester and business customer collaboratively create the acceptance tests prior to coding, through requirements analysis, and specify them in business domain terms. Failing these tests provides quick feedback that the requirements are not being met. This creates an interrelation between tests and requirements: tests which do not refer to a requirement are considered unneeded, and acceptance tests developed after implementation represent new requirements (Pugh, 2011). Results in the literature report that ATDD improves communication between customers and developers, improves productivity and allows software to be tested automatically at the business level. An empirical study was conducted on the application of ATDD in the migration of the independent text-mode applications supporting all of a restaurant chain company's activity (taking and managing orders, cashing, reporting, etc.) into a web-based environment, with only six months given as a deadline. The project was delivered after five and a half months, and the study concluded that the use of ATDD improved the understanding and capture of the business requirements. It also reported that the participation of the customer in the specification and definition of the acceptance tests was the key factor (Latorre, 2014).

Behavior-Driven Development

Behavior-Driven Development (BDD) was originally developed by Dan North as a response to issues in Test-Driven Development and Acceptance Test-Driven Development, one of which is the use of an unstructured and unbounded natural language to describe test cases, which makes them hard to understand, especially for business stakeholders and end users who may have little software development knowledge. It is regarded as the evolution of TDD and ATDD. In BDD, the tests are clearly written and easily understandable because they utilize a ubiquitous language that helps stakeholders specify their tests (Keogh, 2010). Cucumber, JBehave (Java) and RSpec (Ruby) are some of the toolkits which support BDD; they employ a domain-specific language (DSL) for defining scenarios, such as the Gherkin language, and derive executable code from it through regular-expression matching (Chelimsky, Astels, Dennis, & Hellesoy, 2011).
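As a brief illustration of how the ubiquitous Given/When/Then language maps onto executable code, here is a hypothetical scenario, assuming the Python behave library (the Gherkin text, normally kept in a separate .feature file, is shown as comments):

    # Hypothetical Gherkin scenario:
    #
    #   Feature: Account withdrawal
    #     Scenario: Withdraw within the balance
    #       Given an account with a balance of 100
    #       When the user withdraws 30
    #       Then the remaining balance is 70
    #
    # Matching step definitions; behave binds each step to a function.
    from behave import given, when, then

    @given("an account with a balance of {amount:d}")
    def step_balance(context, amount):
        context.balance = amount

    @when("the user withdraws {amount:d}")
    def step_withdraw(context, amount):
        context.balance -= amount

    @then("the remaining balance is {amount:d}")
    def step_check(context, amount):
        assert context.balance == amount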
Pair Programming

Pair programming is an essential XP practice that Williams and Kessler (2002) defined as "a style of programming in which two programmers work side by side at one computer, continually collaborating on the same design, algorithm, code, or test. One of the pair, called the driver, is typing at the computer or writing down a design. The other partner, called the navigator, has many jobs, one of which is to observe the work of the driver, looking for tactical and strategic defects which occur when the driver is headed down the wrong path. The navigator is the strategic, long-range thinker. Another great thing is that the driver and the navigator can brainstorm on-demand at any time and switch roles periodically" (p. 208). It contributes to testing in that it amounts to a continuous review of the code or document produced, and it can be considered one of XP's strengths over traditional programming in terms of testing. Xu (2006) investigated the effects of extreme programming practices in game development through a case study with 12 advanced undergraduate students assigned to implement a simple game application: 4 pairs used XP practices such as pair programming, test-driven development and refactoring, while 4 individuals applied a traditional waterfall-like approach. The average number of test cases passed by the pairs was 11.8, against 9.8 passed by those who worked individually, for an average of 291.7 and 180.5 lines of code respectively. The results of the case study showed that the paired students completed their tasks faster and with higher quality than the individuals, passed more test cases, and wrote cleaner code with higher cohesion. One criticism of pair programming, however, is that it doubles code development expense and manpower needs. But in 1999, a controlled experiment at the University of Utah investigated the economics of pair programming with advanced undergraduates in a software engineering course: one third of the class coded their projects individually, and the rest completed theirs using pair programming. The study disproved the assumption that development costs double with pair programming: pairs only spent about 15% more time than the individuals and, significantly, their resulting code had about 15% fewer defects (Williams, Kessler, Cunningham, & Jeffries, 2000). The 15% increase in code development expense is recovered through the reduction in defects. Consider a program of 50,000 lines of code developed by individual programmers and by paired programmers, and assume programmers inject 100 defects per thousand lines of code, as Watts Humphrey reports in his book A Discipline for Software Engineering (Humphrey, 1995). If a thorough development process removes 70% of the defects, the individuals are left with 1,500 defects in their program, and the paired programmers with 15% fewer, or 1,275 in total (225 fewer defects). Per Humphrey, industry data reports that each defect takes between 3.3 and 8.8 hours to fix (Humphrey, 1995); using a conservative factor of 4 hours per defect, the extra 225 defects would cost the individual programmers 900 hours more than the paired programmers (Cockburn & Williams, 2001). This supports the economic benefit of early defect identification and removal in pair programming; the arithmetic is traced in the sketch below.
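A short Python sketch of the cost-recovery argument (the figures are those quoted above; the variable names are ours):

    # Cockburn & Williams' arithmetic, using Humphrey's (1995) figures.
    loc = 50_000
    injected = (loc / 1000) * 100          # 100 defects per KLOC -> 5,000 defects
    after_removal = injected * (1 - 0.70)  # thorough process removes 70% -> 1,500

    solo_defects = after_removal               # 1,500 defects remain (individuals)
    pair_defects = after_removal * (1 - 0.15)  # pairs: 15% fewer -> 1,275

    hours_per_defect = 4                   # conservative; reported range is 3.3-8.8
    extra_hours = (solo_defects - pair_defects) * hours_per_defect
    print(extra_hours)                     # -> 900.0 extra hours for individuals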
Exploratory Testing

Exploratory testing is an approach in which the developer or tester learns, designs and executes the tests simultaneously (Bach, 2004). The developers or testers explore the software, learn its functionality and perform test execution based on their intuition, which enables them to control the design of the tests while executing them and learning more about the software (Bhatti & Ghazi, 2010). It is valuable in situations that require rapid feedback, learning a product, testing from an end user's point of view, or when there is not enough time for systematic testing approaches (Naseer & Zulfiqar, 2010). Tuomikoski and Tervonen (2009) performed action research on the application of exploratory testing in a company with 20 years of software development experience, whose developers and test engineers averaged 8 years of software development experience, and concluded that exploratory testing provides an easy way to familiarize oneself with new features and that almost everyone could participate in exploratory testing sessions. Itkonen (2011) supports this in his case study on the defect-reporting contribution of different organizational groups in three software product development organizations using exploratory testing: 21.5% of the defects found were attributed to the internal miscellaneous team, 18.1% were found by customers, 17.0% by developers, 7.3% by the sales and consulting team and only 9.8% by professional testers.

Automation as the core of Agile Testing

Crispin and Gregory (2009) claimed that test automation is the key to successful agile development and the core of agile testing. Research conducted by Collins, Dias-Neto and De Lucena (2012) in three different software projects run with the Scrum agile methodology had outstanding results: test automation sped up testing and promoted team harmony, collaboration, knowledge transfer and fast feedback from sprint results. Automation of software test activities such as the development and execution of test scripts, test requirements, unit testing, functional testing and regression testing is encouraged in agile testing to diminish the time spent doing repetitive tasks manually, since software delivery in agile methods happens in very short iterations (Gil et al., 2016), but the trade-off is high initial cost, especially at the time of writing the test cases or configuring the automation framework (Haugset & Hanssen, 2008). Karhu, Repo, Taipale and Smolander (2009) observed in their empirical research on the use of software testing automation in five organizational units, published in their conference paper "Empirical Observations on Software Testing Automation", that "automated software testing may reduce costs and improve quality because of more testing in less time, but it causes new costs in, for example, implementation, maintenance, and training" (p. 3). That said, test automation remains the current trend in most types of software testing. PractiTest (2017), an end-to-end QA and test management solutions company, surveyed 1,600 professional testers from 61 countries in 2017 and found that of the 87% of respondents adopting an agile methodology, 85% utilize automation in their projects, with 75% having automated functional or regression testing, 41% stress testing and 37% automated unit testing. While the waterfall method remains significant even with a decrease from 39% in 2016 to 37%, another methodology has shown a significant figure: DevOps usage has risen from 14% in 2015 to 26% in 2017 and appears to be an emerging trend in software development after the traditional waterfall and agile methods.

DevOps: The emerging trend after Agile

Dimensional Research reported in January 2017 that 88% of respondents were either adopting DevOps or considering it, and only 6% had no plans or had not even considered it. However, only 10% had fully embraced DevOps across their entire company, 25% had just started, and 24% had only a few teams fully immersed (Dimensional Research, 2017). This is also supported by the 2016-2017 World Quality Report by Capgemini, Hewlett Packard Enterprise and Sogeti, which reported that the share of respondents using DevOps in more than half of their projects rose from 21% in 2015 to 32% in 2016, and in almost half of their projects from 17% in 2015 to 32% in 2016 (Buenen & Muthukrishnan, 2017). This shows that DevOps continues to grow but is still at an emerging stage. DevOps integrates software development and operational deployment continuously, employing continuous testing, a process involving the automation of the testing process and the prioritization of test cases to help reduce the time between the introduction of errors and their detection, with the aim of eliminating root causes effectively (Gotlieb, Marijan, & Sen, 2013).
Saff and Ernst (2003) introduced the concept of continuous testing in an experiment which showed that it can reduce overall development time by as much as 15%, and they suggested that it can be an effective tool for reducing waiting time in software development.

IV. The Real Cost of Insufficient Software Testing

Software is used in safety-critical areas such as aeronautics and national defense, to name only a few, where the smallest software flaw can have devastating consequences, including the loss of life; hence adequate and extensive testing plays an important role in developing software intended for these industries. The following are some examples of catastrophic accidents caused, directly or indirectly, by faulty software.

On February 25, 1991, a Patriot system failed to track and intercept an incoming Iraqi Scud missile at an army base in Dhahran, Saudi Arabia. The failure let the Scud missile reach its target, killing 28 American soldiers and wounding roughly 100 others. The failure's cause was a rounding error resulting in a clock drift that worsened with increased operational time between system reboots. The army had already worked out a software fix for the problem, but the updated software arrived at the base just one day late (Zhivich & Cunningham, 2009).

Another example is the explosion of the Ariane 5 launcher on June 4, 1996, just 37 seconds after liftoff. The rocket had been under development for a decade at an estimated expense of $7 billion. The cause of the software failure was the attempted conversion of a 64-bit floating-point number to a 16-bit signed integer. The input was larger than 32,767 and thus outside the range representable by a 16-bit signed integer, so the conversion failed due to an overflow. The loss of attitude control resulted from the simultaneous shutdown of both the active and backup computers when the error arose. The inquiry report stated that there was no test to verify that the SRI would behave correctly when subjected to the countdown and flight-time sequence and the trajectory of Ariane 5; this and many other ground tests could have been performed during acceptance testing and would have exposed the failure (Nuseibeh, 1997).
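The failure mode itself is easy to reproduce in miniature. A hypothetical Python sketch follows (the original SRI code was Ada, where the unprotected conversion raised an unhandled Operand Error exception; here the silent wrap-around of a 16-bit signed integer is shown instead):

    import ctypes

    def to_int16(value: float) -> int:
        # Convert a float to a 16-bit signed integer with no range check;
        # ctypes makes the two's-complement wrap-around explicit.
        return ctypes.c_int16(int(value)).value

    print(to_int16(32000.0))   # 32000: within range, converts correctly
    print(to_int16(40000.0))   # -25536: overflow silently yields garbage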
In addition, the US-based National Institute of Standards and Technology (NIST) reported in its 2002 study that the aerospace industry alone lost over $2.6 billion and incurred 163 fatalities from 1993 to 1999 due to avoidable software defects.

V. Conclusion

The abovementioned examples illustrate just how important it is to ensure the reliability and robustness of software. Software errors can be expensive and fatal, and we cannot test exhaustively to guarantee that a piece of software is free from any error that might cause such catastrophic events to happen again. Yet software testing continues to co-evolve with software development: from its earliest approach of simply fixing errors after coding, to a separate testing phase performed by independent professional testers, to integrating the discipline throughout the whole software development life cycle and involving end users, to automating the process and achieving better results than the traditional methods. The studies and results presented in this paper show that the progression of software testing over the years provides not only alternative ways of minimizing errors but new ways of engineering software as well; hence we can conclude that software testing remains an absolute necessity in software development.

References:

Abrahamsson, P., Salo, O., Ronkainen, J., & Warsta, J. (2002). Agile software development methods: Review and analysis. VTT Publications, 478.
Andersson, M., & Bergstrand, J. (1995). Formalizing use cases with message sequence charts (Unpublished master's thesis). Lund Institute of Technology.
Anupriya, & Ajeta. (2014). Software Testing: Principles, Lifecycle, Limitations and Methods. International Journal of Science and Research (IJSR), 3(10), 1000-1002.
Astels, D. R. (2003). Test-Driven Development: A Practical Guide. New Jersey: Prentice Hall.
Bach, J. (2004). The Testing Practitioner (2nd edition). UTN.
Baker, C. (1957). Review of D. D. McCracken's Digital Computer Programming. Mathematical Tables and Other Aids to Computation, 11(60), 298-305.
Beck, K. (2001). Aim, fire [test-first coding]. IEEE Software, 18(5), 87-89.
Beck, K. (1999). Extreme Programming Explained: Embrace Change (1st edition). Boston: Addison-Wesley Longman Publishing Co., Inc.
Benediktsson, O., Dalcher, D., & Thorbergsson, H. (2006). Comparison of Software Development Life Cycles: A Multiproject Experiment. IEE Proceedings - Software, 153(3), 87-101.
Bhatti, K., & Ghazi, A. (2010). Effectiveness of Exploratory Testing: An empirical scrutiny of the challenges and factors affecting the defect detection efficiency (Master's thesis, Blekinge Institute of Technology, 2010) (pp. 1-78). SWELL (Swedish Software Verification and Validation Excellence).
Boehm, B. W., Brown, J. R., & Lipow, M. (1976). Quantitative evaluation of software quality. ICSE '76: Proceedings of the 2nd International Conference on Software Engineering, 592-605.
Boehm, B. W., & Haile, A. C. (1985). Information Processing/Data Automation Implications of Air Force Command and Control Requirements in the 1980s (CCIP-85). Space and Missile Systems Organization, Los Angeles, CA, 11.
Boehm, B. W. (1988). A spiral model of software development and enhancement. IEEE Computer, 21(5), 61-72.
Boehm, B., & Basili, V. R. (2001). Software Defect Reduction Top 10 List. Foundations of Empirical Software Engineering: The Legacy of Victor R. Basili, 426, 135-137.
Boehm, B. W., & Hansen, W. J. (2001). The Spiral Model as a Tool for Evolutionary Acquisition. The Journal of Defense Software Engineering.
Boehm, B. (2006). A View of 20th and 21st Century Software Engineering. ICSE '06: Proceedings of the 28th International Conference on Software Engineering, 12-29.
Buenen, M., & Muthukrishnan, G. (2017). World Quality Report 2016-2017 (8th ed., Rep.). Capgemini Publications.
Böhm, C., & Jacopini, G. (1966). Flow diagrams, Turing machines and languages with only two formation rules. Communications of the ACM, 9(5), 366-371.
Chelimsky, D., Astels, D., Dennis, Z., Hellesoy, A., Helmkamp, B., & North, D. (2011). The RSpec Book: Behaviour Driven Development with RSpec, Cucumber, and Friends. Pragmatic Bookshelf.
Cockburn, A., & Williams, L. (2001). The Costs and Benefits of Pair Programming. Boston: Addison-Wesley Longman Publishing Co.
Collins, E., Neto, A., & De Lucena, V. F. (2012). Strategies for Agile Software Testing Automation: An Industrial Experience. Computer Software and Applications Conference Workshops (COMPSACW), 2012 IEEE 36th Annual, 440-445.
Connell, M. C., Carta, J. J., & Baer, D. M. (1993). Programming generalization of in-class transition skills: Teaching preschoolers with developmental delays to self-assess and recruit contingent teacher praise. Society for the Experimental Analysis of Behavior, 25-33.
Cordeiro, L., Fischer, B., & Marques-Silva, J. (2010). Continuous Verification of Large Embedded Software Using SMT-Based Bounded Model Checking. ECBS '10: Proceedings of the 2010 17th IEEE International Conference and Workshops on the Engineering of Computer-Based Systems, 160-169.
Crispin, L., & Gregory, J. (2009). Agile Testing: A Practical Guide for Testers and Agile Teams. Boston: Addison-Wesley Professional.
Dalal, S., Horgan, J., & Kettenring, J. (1993). Reliable software and communication: software quality, reliability, and safety. ICSE '93: Proceedings of the 15th International Conference on Software Engineering, 425-435.
Deuter, A. (2013). Slicing the V-model: Reduced effort, higher flexibility. 2013 IEEE 8th International Conference on Global Software Engineering, 1-10.
Dijkstra, E. W. (1968). Go to statement considered harmful. Communications of the ACM, 11(3), 147-148.
Dorette, J. J. (2011). Comparing Agile XP and Waterfall Software Development Processes in two Start-up Companies (Master's thesis, Chalmers University of Technology, 2011). Sweden: Chalmers Publication.
Downey, S., & Sutherland, J. (2013). Scrum Metrics for Hyper Productive Teams: How They Fly like Fighter Aircraft. Proceedings of the 46th Hawaii International Conference on System Sciences, 1530-1605.
Erlikh, L. (2000). Leveraging legacy system dollars for e-business. IT Professional, 2(3), 17-23.
Gelperin, D., & Hetzel, B. (1988). The growth of software testing. Communications of the ACM, 31(6).
Ghezzi, C., Jazayeri, M., & Mandrioli, D. (2002). Fundamentals of Software Engineering. New Jersey: Prentice Hall.
Gil, C., Diaz, J., Orozco, M., De La Hoz, A., De La Hoz, E., & Morales, R. (2016). Agile testing practices in software quality: State of the art review. Journal of Theoretical and Applied Information Technology, 92(1), 28-36.
Gillenson, M. L., Racer, M. J., Richardson, S. M., & Zhang, X. (2011). Engaging Testers Early and Throughout the Software Development Process: Six Models and a Simulation Study. Journal of Information Technology Management, 22(1), 8-27.
Gotlieb, A., Marijan, D., & Sen, S. (2013). Test case prioritization for continuous regression testing: An industrial case study. Software Maintenance (ICSM), 2013 29th IEEE International Conference on, 540-543.
Haugset, B., & Hanssen, G. (2008). Automated Acceptance Testing: A Literature Review and an Industrial Case Study. Agile 2008 Conference, 27-38.
Hirsch, P. (1972). What's wrong with the air traffic control system? Datamation, 48-53.
Humphrey, W. S. (1995). A Discipline for Software Engineering. Boston, MA: Addison-Wesley.
Humphrey, W. S. (1995). Why should you use a personal software process? ACM SIGSOFT Software Engineering Notes, 20(3), 33-36.
ISTQB (International Software Testing Qualifications Board). (2016). ISTQB Worldwide Software Testing Practices Report 2015-2016 (p. 13, Rep.). Brussels: ISTQB.
Istyaq, S., & Zargar, A. (2010). Debugging, Advanced Debugging and Run time Analysis. (IJCSE) International Journal on Computer Science and Engineering, 2(2), 246-249.
Itkonen, J. (2011). Empirical studies on exploratory software testing (Doctoral dissertation, Aalto University, 2011). Helsinki: Aalto University publication series.
Jacobson, I., Booch, G., & Rumbaugh, J. (1999). The Unified Software Development Process. Addison-Wesley.
Jakobsen, C., & Sutherland, J. (2009). Scrum and CMMI: Going from Good to Great. Agile Conference, 2009, 333-337.
Jalote, P. (2005). An Integrated Approach to Software Engineering (3rd edition). New York: Springer.
Kannangara, S. H., & Wijayanayake, W. M. (2015). An Empirical Evaluation of Impact of Refactoring on Internal and External Measures of Code Quality. International Journal of Software Engineering & Applications (IJSEA), 6(1), 51-68.
Karhu, K., Repo, T., Taipale, O., & Smolander, K. (2009). Empirical Observations on Software Testing Automation. Software Testing, Verification and Validation, 2009. ICST '09. International Conference on.
Keogh, E. (2010). BDD: A Lean Toolkit. In Proceedings of the Lean Software & Systems Conference.
Khan, E., & Khan, F. (2014). Importance of Software Testing in Software Development Life Cycle. IJCSI International Journal of Computer Science Issues, 11(2), 2nd ser., 120-123.
Kraut, R. E., & Streeter, L. A. (1995). Coordination in Software Development. Communications of the ACM, 38(3), 69-81.
Larman, C., & Vodde, B. (2010). Practices for Scaling Lean & Agile Development: Large, Multisite, and Offshore Product Development with Large-Scale Scrum. Boston: Addison-Wesley Professional.
Latorre, R. (2014). A successful application of a Test-Driven Development strategy in the industrial environment. Empirical Software Engineering, 19(3), 753-773.
Li, E. Y. (1990). Software Testing in a System Development Process: A Life Cycle Perspective. Journal of Systems Management, 41(8), 23-31.
Lions, J. L. (1996). Ariane 5 Flight 501 Failure: Report by the Inquiry Board. Retrieved from http://www.esrin.esa.it/htdocs/tidc/Press/Press96/ariane5rep.html
Mens, T., & Tourwé, T. (2004). A survey of software refactoring. IEEE Transactions on Software Engineering, 30(2), 126-139.
Miller, E., & Howden, W. E. (1981). Tutorial, Software Testing & Validation Techniques. IEEE Computer Society Press.
Morgan, T., & Roberts, J. (2002). An Analysis of the Patriot Missile System. Retrieved from http://seeri.etsu.edu/SECodeCases/ethicsC/PatriotMissile.htm
Mwambe, O., & Lutsaievskyi, O. (2013). Selection and Application of Software Testing Techniques to Specific Conditions of Software Projects. International Journal of Computer Applications, 70(18), 22-28.
Naseer, A., & Zulfiqar, M. (2010). Investigating Exploratory Testing in Industrial Practice (Master's thesis, Blekinge Institute of Technology, 2010). SWELL (Swedish Software Verification and Validation Excellence).
National Institute of Standards and Technology. (2002). The Economic Impacts of Inadequate Infrastructure for Software Testing (pp. 1-11, Rep.). Research Triangle Park, NC: RTI Health, Social, and Economics Research.
Nuseibeh, B. (1997). Ariane 5: Who Dunnit? IEEE Software, 14(3), 15-16.
O'Connor, R., Baddoo, N., Cuadrado-Gallego, J., Muslera, R., Smolander, K., & Messnarz, R. (2009). Software Process Improvement: 16th European Conference. Spain: Springer Science & Business Media.
Pettichord, B. (2002). Design for Testability. Pacific Northwest Software Quality Conference.
Phaphoom, N., Sillitti, A., & Succi, G. (n.d.). Pair Programming and Software Defects: An Industrial Case Study. XP 2011: Agile Processes in Software Engineering and Extreme Programming, 208-222.
PractiTest. (2017). 2017 State of Testing Report (pp. 1-22, Rep.). PractiTest & Tea Time with Testers.
Pressman, R. S. (2001). Software Engineering: A Practitioner's Approach (5th edition). Boston: McGraw-Hill Higher Education.
Pugh, K. (2011). Lean-Agile Acceptance Test-Driven Development: Better Software Through Collaboration (1st edition). Boston: Addison-Wesley Professional.
Rajasekhar, P., & Mahammad Shafi, R. (2014). Agile Software Development and Testing: Approach and Challenges in Advanced Distributed Systems. Global Journal of Computer Science and Technology: B Cloud and Distributed, 14(1).
Rook, P. (1986). Controlling software projects. Software Engineering Journal, 1(1), 7-16.
Royce, W. W. (1987). Managing the development of large software systems: concepts and techniques. ICSE '87: Proceedings of the 9th International Conference on Software Engineering, 328-338.
Saff, D., & Ernst, M. D. (2003). Reducing wasted development time via continuous testing. ISSRE '03: Proceedings of the 14th International Symposium on Software Reliability Engineering, 281.
Saini, G., & Rai, K. (2013). An Analysis on Objectives, Importance and Types of Software Testing. International Journal of Computer Science and Mobile Computing, 2(9), 18-23.
Schach, S. R. (1996). Testing: principles and practice. ACM Computing Surveys (CSUR), 28(1), 277-279.
Schwaber, K., & Beedle, M. (2001). Agile Software Development with Scrum (1st edition). New Jersey: Prentice Hall.
Shull, F., Basili, V., Boehm, B., Winsor Brown, A., Costa, P., Lindvall, M., . . . Zelkowitz, M. (2002). What We Have Learned About Fighting Defects. METRICS '02: Proceedings of the 8th International Symposium on Software Metrics, 249.
Sutherland, J. (1995). Business objects in corporate information systems. ACM Computing Surveys, 27(2), 274-276.
Talby, D., Keren, A., Hazzan, O., & Dubinsky, Y. (2006). Agile software testing in a large-scale project. IEEE Software, 23(4), 30-37.
Taya, S., & Gupta, S. (2011). Comparative Analysis of Software Development Life Cycle Models. IJCST - International Journal of Computer Science and Technology, 2(4), 536-539.
Testing Trends in 2017 - Sauce Labs. (n.d.). Retrieved August 21, 2017.
Tripathi, V., & Goyal, A. (2014). Agile Testing: Challenges and Critical Success Factors. International Journal of Computer Science & Engineering Technology (IJCSET), 5(6), 66-69.
Tuomikoski, J., & Tervonen, I. (2009). Absorbing Software Testing into the Scrum Method. Product-Focused Software Process Improvement, 10th International Conference, 199-215.
Tuteja, M., & Dubey, G. (2012). A Research Study on Importance of Testing and Quality Assurance in Software Development Life Cycle (SDLC) Models. International Journal of Soft Computing and Engineering (IJSCE), 2(3), 251-257.
Vuori, M. (2011). Agile Development of Safety-Critical Software. Tampere University of Technology, Department of Software Systems, 14, 2-109.
Waligora, S., & Coon, R. (1997). Improving the Software Testing Process in NASA's Software Engineering Laboratory. IEEE Software, 12(6), 83-87.
Waychal, P., & Capretz, L. (2016). Why is a testing career not the first choice of engineers? 123rd Annual Conference of the American Society for Engineering Education, 1-10.
Williams, L., Kessler, R. R., Cunningham, W., & Jeffries, R. (2000). Strengthening the Case for Pair Programming. IEEE Software, 17(4), 19-25.
Williams, L., & Kessler, R. R. (2002). Pair Programming Illuminated. Boston: Addison-Wesley Professional.
Xu, S. (2006). Empirical Validation of Test-Driven Pair Programming in Game Development (Master's thesis, Laurentian University, 2006). Also in: IEEE/ACIS International Conference on Computer and Information Science and 1st IEEE/ACIS International Workshop on Component-Based Software Engineering, 2006, Honolulu.
Yang, M., & Chao, A. (1995). Reliability-estimation and stopping-rules for software testing, based on repeated appearances of bugs [Abstract]. IEEE Transactions on Reliability, 44(2), 315-321.
Zhivich, M., & Cunningham, R. K. (2009). The Real Cost of Software Errors. IEEE Security & Privacy Magazine, 7(2), 87-90.