Manual software testing can never catch all errors–so can automation help? David Norfolk looks at the pros and cons of automated testing and offers advice–and warnings–on its use.
One thing you can say for sure about software testing is that there is not enough time in the world to test manually all possible paths and combinations of paths through a realistically large–that is, useful–program. The tests will never be anything like exhaustive, so they must be made cost-efficient: you must find the greatest number of defects with the resources available.
This has consequences. First, you cannot afford to run tests that do not find defects, so structured test plans that maximise the chances of finding them are vital.
Second, automation makes a lot of sense as a way to maximise the use of those resources–but only if it fits the demands of the structured test plan.
The basic principles of automated testing go back to early keystroke-recording utilities. If you record the keystrokes made by someone entering a test case, you can rerun the test case at very little cost, perhaps to check that a defect that caused the test to fail on its initial run has now been fixed. In fact, you now have a stored regression test that can be run after any code maintenance, to ensure that unexpected errors have not been introduced elsewhere in the program while part of it was being changed.
But you can do better than this. One automated test script can be cheaply cloned into dozens of scripts, each testing for a similar type of defect: for example, exploring the boundary conditions of an input value–too small, zero, too big, negative.
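To make that concrete, here is a minimal sketch of such cloning using a parameterised test in Python's pytest, where one test specification stands in for a dozen near-identical scripts; the parse_quantity() routine is purely illustrative, a stand-in for whatever input-handling code the recorded script exercises.

    import pytest

    def parse_quantity(text):
        # Illustrative stand-in for the code under test: accepts whole numbers 1-999.
        value = int(text)                     # raises ValueError for non-numeric input
        if not 1 <= value <= 999:
            raise ValueError("quantity out of range")
        return value

    @pytest.mark.parametrize("raw", ["1", "500", "999"])
    def test_accepts_valid_quantities(raw):
        assert parse_quantity(raw) == int(raw)

    # One 'cloned' script per boundary condition: too small, negative, too big, empty.
    @pytest.mark.parametrize("raw", ["0", "-1", "1000", ""])
    def test_rejects_boundary_violations(raw):
        with pytest.raises(ValueError):
            parse_quantity(raw)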
An important facility in any automated testing tool is a repository of test specifications. These can be developed early, before coding starts, and used to develop 'what if' scenarios for validating requirements specifications and so on. The most cost-effective time to find and remove defects is during requirements analysis–so requirements analysis tools can usefully be considered as part of the automated testing toolset, although they do more than just help validate requirements.
Test specifications in an automated repository are extremely cost-effective, because they can be used to generate test cases regardless of the maintenance state (database record formats and so on) of the system. Stored test cases can be rendered almost useless if the format of the records being processed changes. In addition, the test specification should make the business justification for performing a set of tests clear.
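As a rough sketch of why the specification outlives the stored case, consider the fragment below (the spec layout, field names and build_record() helper are all hypothetical): the business intent lives in the specification, and only the small generator needs to change when the record format does.

    # Abstract specification: business terms only, with its justification recorded.
    SPEC = {
        "purpose":  "Orders over the credit limit must be rejected",
        "inputs":   {"customer": "C042", "order_value": 10_001},
        "expected": "REJECTED",
    }

    # The current record layout -- the only thing that changes under maintenance.
    FIELD_ORDER = ["customer", "order_value"]

    def build_record(spec):
        """Generate a concrete test record from the abstract specification."""
        return [spec["inputs"][field] for field in FIELD_ORDER]

    print(build_record(SPEC))   # -> ['C042', 10001]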
An interesting new application of automated testing is in extreme programming. Every functional requirement has its associated test case–prepared before coding–and these are run regularly, and the results published, to keep developers focused on the user requirements. Unit test cases are also prepared for all code and regression test cases for all bugs (to prevent their return) and code must pass all the tests before it is released. Managing and running all these tests would be impractical without automated test tools.
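A minimal sketch of that convention, assuming a hypothetical bug reference and apply_discount() routine: each fixed defect gets a named regression test, so it cannot quietly return.

    def apply_discount(total, percent):
        # Fixed code: the defective version applied the discount twice.
        return round(total * (1 - percent / 100), 2)

    def test_bug_1234_discount_not_applied_twice():
        # 10% off £200.00 must be £180.00, not the £162.00 the defect produced.
        assert apply_discount(200.00, 10) == 180.00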
Automated testing is a real godsend to development quality in general, and although the basic principles still apply, the products have become increasingly sophisticated.
Nevertheless, automation has its limits and certainly has not de-skilled the process. It is easy for the inexperienced to confuse high volumes of test results–easily produced with automated tools–with effective testing.
The theory of testing talks of 'equivalence classes': sets of test cases that all find the same general type of defect. If a program accepts integers between 1 and 9, for instance, the integers between 2 and 8 form an equivalence class: if the program processes one of them successfully, it will probably process the rest successfully too.
Once you have tested from within an equivalence class, it is cost effective, in terms of the resources used for each defect found, to explore different equivalence classes in your next tests. In the example, having tested that 7 is accepted, it is best to explore boundary conditions such as 1, 9, 10 or even zero, -1 or 1.5 next time, instead of 2, 3 and so on.
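The selection logic can be written down directly. The sketch below (accepts() is a hypothetical stand-in for the 1-to-9 check) takes one representative from the valid class plus the boundary and out-of-range values, rather than piling up near-identical cases such as 2, 3 and 4.

    def accepts(value):
        # Hypothetical stand-in for the program's 1-9 integer check.
        return isinstance(value, int) and 1 <= value <= 9

    CASES = [
        (7,   True),    # one representative of the valid equivalence class
        (1,   True),    # lower boundary
        (9,   True),    # upper boundary
        (0,   False),   # just below the range
        (10,  False),   # just above the range
        (-1,  False),   # negative
        (1.5, False),   # non-integer
    ]

    def test_partitions():
        for value, expected in CASES:
            assert accepts(value) == expected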
If automated testing tools are used mindlessly, it is easy to generate a false sense of security by running thousands of tests within the one equivalence class. It is also easy to succumb to the temptation to test what automated tools are good at testing–they are particularly useful for regression testing, load testing and performance testing–rather than what you really need to test, which is usually your requirements.
Performance testing and so on are useful, but you should work to a test plan driven solely by the business need to remove defects, including performance defects, not by the limitations of the available technology.
The mantra for testers should be 'early defect removal is cheap defect removal'–and, traditionally, automated tools have been weak at the early stages of development. This situation is improving: Rational, for example, has a tool called Quality Architect, which lets you test Web-based applications for expansion and other factors, from a UML design, before any code is available; it then generates the code harnesses you will need for early testing of code.
Life cycle testing–defect removal at all stages of development, the earlier the better–should be implemented, supported by appropriate tools. Look also for holistic testing tools, which take in the entire user experience, from hitting the 'submit' button in an e-commerce transaction through correct updating of databases to delivery of the goods. Original Software is one supplier in this area.
Automated testing is set to expand in scope. Big commercial systems have run routine test transactions since commercial computing started. The Australian Department of Health's pharmaceutical benefit system, written in the late 1970s, for example, ran a standard artificial payment transaction once a month, to ensure that the basic arithmetic of the system had not been compromised. Modern automated testing tools can do something similar: evaluating the end-to-end service of a Web transaction, for example, so that potential performance problems can be addressed proactively. Mercury Interactive's Topaz product is an example.
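In outline, such a standing check needs nothing more elaborate than a synthetic transaction with a known answer, along these lines (process_payment() and the figures are invented for illustration, not taken from the system described):

    def process_payment(amount, benefit_rate):
        # Stand-in for the production calculation being sanity-checked.
        return round(amount * benefit_rate, 2)

    KNOWN_INPUT  = (100.00, 0.85)   # synthetic claim: £100 at an 85% benefit rate
    KNOWN_ANSWER = 85.00            # result recorded when the system was known to be correct

    assert process_payment(*KNOWN_INPUT) == KNOWN_ANSWER, "basic arithmetic has been compromised"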
This idea could be extended. Even with automated testing, current systems are probably too complex to test completely in any reasonable time, but computer power is cheap, so monitoring system outputs for unusual patterns is possible. If a program is routinely selling televisions online for £400-£4,000, for example, it could take corrective action, typically alerting an operator, if TVs suddenly appear at £4.
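A sketch of that kind of output monitoring, with the price band and the alerting hook as illustrative assumptions, might look like this:

    EXPECTED_PRICE_BAND = (400.00, 4000.00)   # routine online TV prices, per the example above

    def check_sale(product, price, alert=print):
        low, high = EXPECTED_PRICE_BAND
        if not low <= price <= high:
            # Corrective action is limited to raising an alert for a person to investigate.
            alert(f"ALERT: {product} sold at £{price:.2f}, outside £{low:.0f}-£{high:.0f}")

    check_sale("42in television", 3.99)   # triggers the alert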
This does not replace proper testing, but intelligent system design could limit the impact of unavoidably limited testing resources–or fraud, or data entry errors.
One could imagine a pattern matching program continually checking outputs for unusual patterns indicative of system defects. It is unlikely that such a program would be allowed to automatically correct defects, but this is a plausible approach to managing the impact of systems complexity on conventional testing overheads.
Perhaps a more immediate approach to controlling software defects is the exploitation of mathematically based development methods and tools to produce provably correct software components for building systems. The use of such methods on systems of significant size was described to the BCS Bristol Branch recently by Roderick Chapman of systems consultancy Praxis. A key finding was that the use of formal proof was easily cost-justified: developing a formal specification took 5% of the project effort and found 3.25% of the faults, but proving this specification then took only 2.5% of the effort and found 16% of the defects. Against this, unit testing also found 16% of the defects, but took ten times the effort (25%).
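Working through the figures quoted (and taking them at face value), the cost-justification is a one-line calculation: defects found per unit of effort.

    activities = {
        "writing the formal specification": (3.25, 5.0),    # (% of defects found, % of effort)
        "proving the specification":        (16.0, 2.5),
        "unit testing":                     (16.0, 25.0),
    }
    for name, (defects, effort) in activities.items():
        print(f"{name}: {defects / effort:.2f} defects per unit of effort")
    # Proof: 6.40 defects per unit of effort; unit testing: 0.64 -- the tenfold difference.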
Testing is expensive–but then so are defects in production systems. It is expensive because it takes intellectual resources to design a test plan that will find defects efficiently, and because running tests consumes system resources that could otherwise be used for business.
Automation cannot yet help much with designing clever test plans. It can, however, let testers concentrate on designing test plans that will winkle out more defects rather than on coding test cases: one test specification can generate many tests. Nevertheless, even with automation you still need skilled testers, who think differently from developers.
Automation cannot design a structured test suite, nor can it do more than hint that testing might have finished. A person still has to decide what defects are unacceptable and whether all significant defects in the program have been found, with some degree of confidence–or merely all the defects that a limited test plan is capable of finding.