Lessons Learned in Performance, Stress, and Volume Testing

Summary:

In this article, the author shares insights from previous testing engagements to help testing professionals make informed decisions before they initiate a performance test, whether for capacity planning, measuring an application's response times, or identifying degradation points, breaking points, and bottlenecks. Flawed testing techniques observed on several projects are pointed out as lessons learned, along with a mitigation path for overcoming them.

Waiting until the 11th hour: stress/volume/performance test execution 4 days before deployment

I was on a project where the GUI (the client the end user interacts with) for an ERP system was upgraded, and my client wanted to execute a performance, volume, and stress test with an automated load-generation tool 4 days before the GUI changes were moved to the production environment. The test revealed that the upgraded application, running in a production-like environment, had unacceptable response times, degradation points, and many bottlenecks. The project wanted to repeat the same performance, volume, and stress tests after all the fixes were incorporated into the ERP system and before deploying it to production. The problem was that it took nearly 2 working weeks to troubleshoot the ERP problems and introduce the fixes, which delayed the deployment schedule and cost the company tens of thousands of dollars in delaying the release of the ERP upgrade.

Much time was spent among the various support personnel trying to identify, troubleshoot, and pinpoint the bottlenecks and degradation points within the application. The test engineer, DBA, infrastructure engineer, and middleware engineer had to review multiple graphs and charts from the automated testing tool and other performance-monitoring tools, in addition to troubleshooting and fixing the problems, for several days. This caused the project director to delay the release and deployment of the software until the application's response times were in line with the service level agreements.

A recommendation is to execute and complete a performance, volume, load, soak, or stress test 3-4 weeks before the actual deployment deadline or release date, when the application under test is migrated to its final destination, the production environment. One should plan to finish the performance/stress/volume test well in advance of the deployment date, since the tests may reveal performance problems that force the testing and support teams to repeat the performance, stress, and volume tests multiple times while troubleshooting and fixing the application under test. In addition, time will be needed to review graphs and interpret results from the various performance-monitoring tools.

Furthermore, additional tasks or support personnel may have to be scheduled for troubleshooting and fixing the system under test if tests need to be repeated. Examples of such additional tasks include:

1. Obtaining more unique data values from the subject matter experts to re-execute the tests with multiple iterations of data for processes that have unique data constraints
2. Tuning the database
3. Rewriting programs with inefficient SQL statements
4. Upgrading the LAN
5. Upgrading the hardware

For these reasons it is unwise, if not risky, not to have completed a performance, stress, volume, and load test with 3-4 weeks left before the deployment deadline. If you cannot answer the question "What will you do if problems arise out of the performance/stress/volume test?" because you have an impending deployment deadline or a compressed and unrealistic schedule, you will probably face a tradeoff between deploying a system into production with unacceptable response times and delaying your project's deployment deadline to properly tune the system's performance.

Missing trial runs (Proof of concept runs)
In one project that I worked on, the SAP project manager wanted to jump into the execution of the performance and stress tests with a maximum load of concurrent users in an environment different from the one where the automated scripts were developed.

The execution environment for the automated scripts in that project had configurations that differed from the development environment where the scripts were created. These environment differences caused many of the automated scripts to fail, and the scripts that did run could only do so for a few emulated end users, since all the data in the test scripts came from the development environment and some of it was not recognized in the execution environment.

I have found that a more pragmatic approach, before jumping into a full-blown performance/volume/stress test, is to conduct two trial runs of the automated scripts while the various support personnel monitor the system under test: a first run with a load equal to 10% of the expected peak load in the production environment, and a second run with a load equal to 25% of the expected peak load. With this approach the support personnel can first validate, via proof-of-concept trial runs, that they are able to monitor the various components of the system under test, including the LAN; second, the test engineer can validate that the emulated end users can play back the recorded business processes. Lastly, trial runs can also validate that the application under test receives data from interface programs that run as scheduled batch jobs and generate traffic within the application.

After the trial runs are successfully completed with a partial load of emulated end users as described above, a decision can be made to initiate the execution of the performance/stress/volume test.
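As a rough illustration of the proof-of-concept loads described above, the minimal Python sketch below simply derives the 10% and 25% trial-run user counts from an assumed production peak of 1,200 concurrent users (an illustrative figure); the actual ramp-up would be configured in whichever load-generation tool the project uses.

    # Minimal sketch: derive the proof-of-concept loads from the expected
    # production peak. The peak figure and fractions are illustrative
    # assumptions; substitute your project's own numbers and configure the
    # ramp-up in your load-generation tool.
    def trial_run_loads(peak_users, fractions=(0.10, 0.25)):
        """Return the emulated-user count for each trial run."""
        return [max(1, round(peak_users * f)) for f in fractions]

    if __name__ == "__main__":
        peak = 1200  # expected concurrent end users in production (example)
        for run, users in enumerate(trial_run_loads(peak), start=1):
            print("Trial run %d: ramp up to %d emulated end users while all "
                  "support personnel monitor the system under test" % (run, users))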

Unnecessary risks in the live production environment
On one project, the testing team planned to test the application in the live production environment to simulate real-world conditions. This practice should be avoided at all costs unless you have the luxury of bringing down an entire live production system without affecting end users or the company's bottom line.

Testing in a production environment carries many risks:

1. It can crash the production environment.
2. It can introduce false records or corrupt the production database.
3. It can violate regulatory constraints.
4. You may not be able to get the production environment a second time to repeat the test if needed.
5. If the production environment is shared and you crash it for one application, you can bring down other applications that share the same environment.
6. You may have to create dummy user ids to emulate end users in the production environment, which may compromise your application's security.

These reasons are not all-inclusive; they are only meant as examples of the risks of conducting a performance/stress/volume test in a production environment.

A project is better off re-creating an environment that closely mimics or parallels the production environment, with the same number of servers, the same hardware, the same database size, etc., and using this production-like environment to conduct the performance/stress/volume/load test. If the production-like environment is a virtual image of the production environment, then the tester may even have obviated the need to extrapolate performance results from one environment to the other.

Prevent Data Caching
On some projects I found that the same set of records was entered into the application by the same automated scripts every time the performance/stress/load/volume tests were performed. This approach causes the data to be cached and buffered, which does not fully exercise the database.

Creating automated test scripts with enough unique data records will prevent the data-caching problem from occurring. The test engineer, who has hands-on experience with the automation tool but may lack functional knowledge of the application under test, should work with the subject matter experts and testers with functional knowledge of the application to identify enough sets of unique data records to repeat the performance/stress/volume/load test with distinct values of data. For smaller applications, the DBA may restore and refresh the database after each performance/stress/volume/load test to cleanse it of values entered during a previous test run; this way, when the test is repeated with the same set of records, the records will not be cached or buffered.
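As a hedged illustration of this data-parameterization idea, here is a minimal Python sketch that assigns each emulated end user its own non-overlapping slice of unique records from a prepared data file. The file name and column name are hypothetical, and most commercial tools offer an equivalent "unique per user/iteration" parameter setting that achieves the same thing.

    # Minimal sketch: hand each emulated user distinct data records so repeated
    # runs do not hit the same cached rows. File and column names are
    # hypothetical placeholders.
    import csv
    from itertools import islice

    def load_unique_records(path):
        """Read the file of unique values prepared with the subject matter experts."""
        with open(path, newline="") as f:
            return list(csv.DictReader(f))

    def assign_records(records, users, iterations):
        """Give each emulated user its own non-overlapping slice of records."""
        needed = users * iterations
        if len(records) < needed:
            raise ValueError("Need %d unique records, only %d available"
                             % (needed, len(records)))
        it = iter(records)
        return {user: list(islice(it, iterations)) for user in range(1, users + 1)}

    if __name__ == "__main__":
        records = load_unique_records("unique_orders.csv")        # hypothetical file
        plan = assign_records(records, users=120, iterations=10)
        print("User 1 enters orders:",
              [r["order_number"] for r in plan[1]])                # hypothetical column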

Failure to notify system users and disable user accounts
I have witnessed a project where, in the middle of a performance/load/volume/stress test execution, users not associated with the test logged on to the environment where the application was under test, thereby skewing the test results. Furthermore, these unexpected users called the helpdesk and the system administrator several times to complain about the performance of the system, wasting other groups' time.

It should go without saying, but notify all users in advance, via email or system messages, of the date, time, and environment in which a performance/stress/volume test will take place. As a second precautionary step, disable the log-on ids of all users not associated with the test in the environment where the test will take place. Permit only the emulated end users to log on to the environment under test, using the previously created dummy user ids (e.g., user001, user002, etc.), along with the users who are associated with the test for tasks such as performance monitoring.
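To make the dummy-id convention concrete, here is a small Python sketch that generates the sequential log-on ids (user001, user002, and so on). The padding width and count are assumptions for illustration; actually creating the accounts and disabling all others would be done through your own administration tools.

    # Minimal sketch: generate the dummy log-on ids for the emulated end users
    # so every other account can be disabled for the duration of the test.
    def dummy_user_ids(count, prefix="user", width=3):
        return ["%s%0*d" % (prefix, width, n) for n in range(1, count + 1)]

    if __name__ == "__main__":
        ids = dummy_user_ids(1200)
        print(ids[:3], "...", ids[-1])   # ['user001', 'user002', 'user003'] ... 'user1200'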

Throughput emulation: same number of emulated end users as actual end users
I was present on a project where the application's middleware engineer wanted the system to be tested with a load of emulated users that was unrepresentative of the total number of expected concurrent end users in the production environment. The actual expected number of concurrent end users in production was 1,200, whereas the middleware engineer wanted the automation tool to log on only 120 emulated end users, based on a calculated ratio of 10 to 1 in which 1 emulated end user from the automated testing tool works at the rate of 10 actual human end users.

Emulated end users from an automated testing tool work at a constant rate, can perform multiple iterations without stopping, unlike a human being, and can generate as much traffic and execute as many business processes per hour as the expected number of concurrent end users in production. However, by utilizing a ratio of emulated end users to actual end users, the test engineer may not be able to find out whether the application under test can withstand the log-on of the maximum number of expected production end users. In the example above, the test engineer may demonstrate that the application under test can support 120 emulated end users logged on simultaneously, but how can the test engineer say with any confidence that the application can withstand 1,200 end users logged on concurrently? Another reason to emulate the actual expected number of end users in production is that the test engineer and DBA can ascertain that the application's database has the number of database connections correctly set up.

To avoid having an emulated end user that works at a rate much faster than that of an actual end user, I'd recommend, first, that the automated scripts be constructed in exactly the same manner in which an end user would execute a business process in the production environment; and second, that the test engineer utilize the "think time" feature of the automation tool, which plays back the script at the same rate at which it was recorded. If the "think time" feature is not available, the test engineer can artificially delay the execution of the script by introducing wait times within the test script. The objective is to play back the script at a rate similar to the way a human being would execute the business process in production and thus emulate throughput with the same number of emulated end users as the total expected number of production end users.
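The minimal Python sketch below illustrates the think-time idea using a hypothetical four-step business process; the step names, pause lengths, and execute_step() stub are stand-ins. In a commercial tool you would simply enable recorded think-time playback, but the same effect can be approximated with explicit waits as shown.

    # Minimal sketch of "think time" pacing: the emulated user pauses between
    # steps so it works at roughly a human rate rather than machine speed.
    import time

    BUSINESS_PROCESS = [
        ("open_order_entry_screen", 5.0),    # seconds a human pauses after the step
        ("enter_order_header",      12.0),
        ("enter_line_items",        20.0),
        ("save_order",              8.0),
    ]

    def execute_step(step_name):
        """Stand-in for the tool's playback of one recorded step."""
        pass

    def run_business_process(think_time_scale=1.0):
        for step_name, think_time in BUSINESS_PROCESS:
            execute_step(step_name)
            time.sleep(think_time * think_time_scale)   # artificial wait if no think-time feature

    if __name__ == "__main__":
        run_business_process()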

Not testing the entire round trip
Sometimes projects will test an application's response time at the back-end level only. But what about the interaction between the end user and the application through the GUI, or client side, of the application; shouldn't this be tested too? The answer is unequivocally yes: the response times starting from the client side should be tested too.

The application's entire response-time round trip starts from the client side, where the end user works with the application. The end user does not work with the back end or the database. Hence the test engineer should ascertain that the end user will have adequate response times from the point at which the end user interacts with the application.

While ensuring that the application has satisfactory response times at the back-end level, it is equally important, if not more important, that the end users or customers working with the application also have adequate response times from the client side (GUI) of the application.

Missing Personnel
I can recall one project in which a performance/stress test was conducted while many of the supporting personnel were working remotely or on call; key support personnel were not present at the site of the test. On this project, an in-house developed bar-coding application was tested for performance, volume, and stress during the middle of the night.

The application worked fine for an initial load of emulated end users, but it encountered all sorts of problems when the number of emulated end users was gradually increased. Many of the problems came down to the fact that the application under test needed an obscure parameter enabled to allow the concurrent log-on of multiple emulated end users from the same desktop. The personnel at the testing site did not have a clue how to fix the problem or enable the parameter, since they were not the developers of the bar-coding software. The testers on site attempted to reach the developer who created the application, but the developer could not be reached by pager for a long time, and when the developer was finally reached, a resolution could not be identified remotely. The result was that the testing had to be halted completely for the night, wasting much time and many billable contractor hours.

A better approach is to have all personnel associated with the testing present on site during the entire test.

Absence of a Contingency Plan
Even the best test engineers can crash an application or bring down the LAN when conducting a stress test, since stress testing is an inexact science. In reality, many companies make educated guesses and assumptions about an application's expected performance that may easily be proven false once the test takes place. On one project I worked on, the NT servers crashed and the application was rendered inoperable at 6:00 a.m., when the first-shift workers were expected to arrive at 7:30 a.m. and would be affected by the inoperable application. The problem was not solved until 10 hours later, and the company had no contingency plans for such an emergency. It is essential that test managers, test engineers, and support personnel not only document all potential risks but also define mitigation actions to reduce those risks before starting a stress test.

Knowledge walks out the door: contractors leave the project after one test release, leaving no documentation
Often projects will hire contractors for short engagements lasting 2-3 months to develop the automated testing scripts, plan the test, execute the scripts, and interpret the results. What these projects fail to do is document which scripts the contractors developed, where the scripts are stored, and where the test execution results, graphs, and charts are stored. This becomes a problem when the contractors walk out the door. The idea is to create reusable automated scripts that can be reused as is, or with some modifications, from one release to the next, but often test managers let the contractors wrap up and exit the project without knowing where the automated scripts were saved. Other times, if questions arise about the test results once the system has been deployed to production, the test manager has no idea what the contractors did with the test execution results and is not able to regenerate test graphs after the contractors have left. It is important to document and store automated scripts and results in a central repository or test management tool where the project can access them long after the contractors have left.

No consistent method for tracking expected volumes and business process throughput
Another problem I recall from previous projects is that many projects did not have detailed requirements to help the test engineer create automated test scripts that emulate, with an automated testing tool, the expected volumes of transactions and business processes in the production environment for a given number of concurrent users. Other projects rely on interviews with SMEs to learn what production volumes need to be emulated, and this information is stored in emails or personal notes, or not documented at all. To overcome these problems, I suggest that the test engineer create Excel spreadsheet templates for consistently collecting from the subject matter experts, developers, and middleware engineers what the expected volumes of data and traffic will be in the production environment. The test engineer should seek to learn how many concurrent users execute a given scenario or business process, and how many business processes a user executes per hour on a typical day and on a day when production traffic is much greater than usual. For instance, on a typical day a sales entry clerk takes 10 orders for toys per hour, but over the period leading up to the holidays in December a sales entry clerk takes 50 orders for toys per hour.
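As a worked example of turning those SME figures into test targets, the Python sketch below computes the overall hourly throughput and the pacing interval each emulated user must keep. The 10 and 50 orders-per-hour rates come from the example above, while the figure of 1,200 concurrent clerks is an illustrative assumption.

    # Minimal sketch: convert SME interview figures into throughput and pacing
    # targets for the emulated load.
    def throughput_targets(concurrent_users, processes_per_user_per_hour):
        total_per_hour = concurrent_users * processes_per_user_per_hour
        pacing_seconds = 3600.0 / processes_per_user_per_hour   # one iteration every N seconds per user
        return total_per_hour, pacing_seconds

    if __name__ == "__main__":
        for label, rate in (("typical day", 10), ("December peak", 50)):
            total, pacing = throughput_targets(concurrent_users=1200,
                                               processes_per_user_per_hour=rate)
            print("%s: %d orders/hour overall, each emulated clerk starts an "
                  "order every %d seconds" % (label, total, pacing))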

No record of kilobytes of data transmitted, no network topology diagram
When executing a stress, volume, or performance test, some network segments or routers are at risk of being saturated, affecting multiple users or bringing the LAN/WAN connections down completely. Before starting a stress/volume test, find out how many KB of data are currently transmitted on a typical workday for all applications and, if possible, try to isolate how many KB of data the application under test sends on a given day for one end user. The test engineer should also work closely with the infrastructure team to obtain a network topology diagram and assess which network segments and routers will be affected by the stress/volume test. The network topology diagram and the KB of data sent per day will help the test engineer plan the test and mitigate the associated risks.
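As a rough, hedged calculation of the extra network load a test will add, the Python sketch below converts an assumed per-process payload and overall test throughput into kilobits per second and compares the result with an assumed segment capacity. All input figures are illustrative and should be replaced with numbers from the infrastructure team.

    # Minimal sketch: first-order estimate of the traffic the test adds to a
    # network segment. All inputs are illustrative assumptions.
    def added_load_kbps(kb_per_process, processes_per_hour_total):
        """Average kilobits per second generated by the test."""
        return kb_per_process * 8.0 * processes_per_hour_total / 3600.0

    if __name__ == "__main__":
        kbps = added_load_kbps(kb_per_process=60, processes_per_hour_total=12000)
        segment_capacity_kbps = 100000.0    # e.g., a 100 Mbps segment (illustrative)
        print("Test adds roughly %.0f kbps, about %.1f%% of the segment's nominal capacity"
              % (kbps, 100.0 * kbps / segment_capacity_kbps))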

No forum for reporting and interpreting test results
In many projects, after a stress test is completed and problems are encountered, there is much finger-pointing over which group is responsible for the problem. Some groups will say the database is not tuned properly, or the LAN needs to be upgraded, or the hardware needs to be upgraded, and so on. To avoid finger-pointing, the test engineer should schedule a meeting after the conclusion of the stress test where all monitoring support personnel bring the graphs and reports they gathered and all the teams interpret their results together. Any graphs with abnormal spikes or deviations should be thoroughly discussed to help pinpoint potential bottlenecks or degradation points. The supporting personnel should also confirm when their graphs and charts behave as expected and report this to the other monitoring groups, to help eliminate potential causes of problems.

No knowledge of hardware limitations
Some projects that I have worked with spent hundreds of thousands of dollars on automated testing tools to conduct performance, stress, and volume tests, only to find out 2 weeks before the execution of the test that they did not have enough hardware to emulate more than a few hundred end users, when in fact their production environment would have over a thousand end users. Before a stress/performance test begins it is necessary to understand that each emulated end user requires a few MB of RAM and that only so many end users can be emulated on a single desktop. Investigate as soon as possible how many end users will need to be emulated with the automated testing tools and how much hardware is available in the project before spending untold thousands on licenses to emulate thousands of end users when this will not be possible due to existing hardware limitations. If the project has sufficient budget to procure new desktops and servers to support the stress test, then this hardware should be procured as soon as possible to prevent delays to the execution of the stress test.
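As a hedged sanity check of this point, the Python sketch below estimates how many load-generator machines are needed for a given target of emulated users. The memory figures (3 MB per emulated user, roughly 2 GB usable per desktop) are illustrative assumptions, so measure your own tool's footprint before committing to licenses or hardware.

    # Minimal sketch: capacity check for load-generator machines. The per-user
    # and per-machine memory figures are illustrative assumptions.
    import math

    def generators_needed(target_users, mb_per_user, usable_mb_per_machine):
        users_per_machine = int(usable_mb_per_machine // mb_per_user)
        return math.ceil(target_users / users_per_machine), users_per_machine

    if __name__ == "__main__":
        machines, per_machine = generators_needed(target_users=1200,
                                                  mb_per_user=3,
                                                  usable_mb_per_machine=2048)
        print("About %d emulated users per desktop; roughly %d load-generator "
              "machines needed for 1,200 users" % (per_machine, machines))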

Duplication of testing efforts
It's possible for a project to be so large that testing efforts are duplicated. I saw this firsthand on a large ERP project where the supply chain and distribution processes were thoroughly stress tested with an automated testing tool. One of the process team leads did not have firsthand knowledge of the work the QA team had performed to test the ERP system's response times and had his entire team re-test the response times of shipping orders, duplicating many aspects of the previously executed automated stress test. This effort was unnecessary and wasted many man-hours.

No access to SMEs (Subject Matter Experts)
The test engineer may have extensive knowledge of the automated testing tools but little or no knowledge of how the application under test actually works and what sets of data are necessary to construct automated scripts. This is especially true of ERP applications such as SAP R/3, where there are multiple environments to log on to and the data that works in one environment may not work in another. The automated tester may have difficulty creating automated scripts if he or she has no access to the subject matter experts and no opportunity to discuss with them which data sets are valid and necessary or how to navigate a particular business process that needs to be automated. It's imperative that the QA manager coordinate with other team leads to get the test engineer support from the SMEs when questions arise during the creation of automated scripts.

Testing on an unstable environment
Performance/stress/volume testing is not intended to ascertain that the functionality of the application is working properly. The system under test should have been thoroughly tested for functionality before a performance/stress/volume test is undertaken. The performance/stress/volume test will help the test engineer discover bottlenecks, perform capacity planning, optimize the system's performance, and so on, as emulated traffic is generated against the application under test. But the functionality of the application should be robust and stable before a performance/stress/volume test is initiated.

Inexperienced testers
Many test managers incorrectly assume that because a tester has experience creating automated scripts with an automation tool, the tester can automatically conduct a performance/stress/volume test. This assumption is erroneous: performance/stress/volume testing is an art that requires a lot of hands-on experience and should be led by an experienced tester who understands how to generate traffic within an application, understands the risks of the test (such as crashing the application), and has the ability to interpret test results, monitor the test, and coordinate the testing effort with multiple parties, since performance/stress/volume testing does not take place in a silo.

Not knowing what will be monitored
Many projects head into a stress test without knowing what will be monitored during the stress test.

Every project has applications with their own nuances, customizations, and idiosyncrasies that make them unique. While it would be very difficult to produce a generic list of all the components of an application that need to be monitored during a stress test, it is fair to say that at a minimum the following should be monitored: the database, the project's infrastructure (i.e., the LAN), the servers, and the application under test. I advise companies to hold meetings with the various managers, owners, and stakeholders of the application under test to discuss all potential risks and the areas that need to be monitored before conducting a stress test. Once all the areas to be monitored have been identified, create a point-of-contact list with the names of the individuals, their phone numbers, and their tasks during the stress test. Every person associated with the stress test should have his or her role and responsibility clearly defined.

No formal testing definitions
Many projects assume that a performance test is the same as a stress test, a load test, a volume test, a soak test, and so on; this is a faulty assumption. Test managers should understand the scope of each test they are trying to perform. For instance, a stress test may find the application's breaking points, whereas a performance test may help a test engineer benchmark the application without revealing its breaking points. Test managers should understand the definitions and consequences of each testing effort.

Not disabling processes that do not affect the outcome of the test
I was on a project where a business process was recorded and played back, and every time the process was played back it kicked off a print job, sending output to a printer and wasting hundreds of pages each time. The printing itself did not generate any traffic within the application under test. It is important to understand which processes are not associated with the system's performance and to disable them if they are launched by the performance test. If during the performance test the test engineer is wasting printed paper or triggering events such as emails, and these processes do not affect the outcome of the test, then they should be disabled.

Not extrapolating results
If a stress/performance/volume test is conducted in an environment that does not come close to emulating the actual production environment, then the results need to be extrapolated from one environment to the other. Many projects fail to do so and assume that if the application under test has adequate response times in one environment, the same holds true in another environment. There are many tools on the market from multiple vendors to help companies extrapolate results, and if the test environment is a smaller-scale version of the production environment, these third-party tools should be employed to extrapolate the test results.
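For illustration only, here is a deliberately naive Python sketch of first-order extrapolation under the assumption that the test environment is a known fraction of production capacity and that behavior scales roughly linearly; real systems degrade non-linearly as load grows, which is precisely why the purpose-built extrapolation tools mentioned above are recommended.

    # Deliberately naive sketch: linear extrapolation from a scaled-down test
    # environment. Treat the output only as an optimistic sanity check.
    def extrapolate_supported_users(users_sustained_in_test, test_env_capacity_fraction):
        """e.g., a test environment at 25% of production capacity sustained 300 users."""
        return users_sustained_in_test / test_env_capacity_fraction

    if __name__ == "__main__":
        estimate = extrapolate_supported_users(300, 0.25)
        print("First-order estimate: production might sustain about %.0f users "
              "(optimistic; verify with a proper capacity-modeling tool)" % estimate)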

Lessons Learned
The aforementioned lessons are some of those I have learned after many years of hands-on experience with automated testing tools, and in particular with tools that emulate traffic within an application. Whether you want to test your application's performance and response times by hand or with an automated testing tool, it is important to foster a culture within your organization that follows best practices and is consistent and repeatable. Testers initially have a proclivity for resisting rigorous and proven testing processes that deviate from their own experience, but it is the role of the test manager to foster proven practices and not repeat costly lessons.

Many companies depend on their software and applications as their only source of revenue and do not have the luxury of an application so slow or inflexible that customers will not use it again. For example, companies on the web that sell products such as books or CDs, where these products are their only source of revenue, cannot afford to have a customer experience excessive delays when buying a book, or be unable to log on to buy a CD because the application is inoperable under a maximum load of concurrent end users. Test managers have the onus of understanding the application's traffic and thoroughly testing the application for performance before deploying it to a production environment.

The test manager should document lessons learned from his or her own project and from other projects and store them in a commercial tool or in-house repository. The test manager should ensure that the testers, whether they are company employees or contractors, understand the lessons learned and take corrective action to follow best testing practices at all times.

