Testers are often on the critical path for getting a software release out. They must plan carefully in order to minimize the critical path, while still doing a complete job of testing. This schedule pressure is taken to an extreme when a production server must be taken offline in order to deploy the software, and everyone is waiting for the final test results before the system can go live again. Karen Johnson describes her company's carefully planned and orchestrated method for doing a final check of an installed system. Her story is relevant to e-commerce companies as well as IT shops that are under pressure to keep systems updated while minimizing downtime.
Beyond industry knowledge and technical skills, a quality assurance person brings to the table a critical career skill that has nothing to do with the latest or coolest technology: the skill of planning.
It's important to be organized in a profession where you can often sense other people (such as the rest of the technical and management team) pacing while they wait for QA to finish. Quality assurance needs to be clear with the rest of the team about what needs to be done and the risks we take when corners are cut because we are short on staff or short on time.
As if the development process didn't put enough time pressure on quality assurance, the release cycle turns up the heat even higher. During the development cycle, new functionality and features are created and tested. This typically involves weeks or months of build, test, and rebuild cycles. During the release cycle, the new code is ready to be delivered and the final CD is cut, or in my case, installed onto a Web site. Quality assurance is the final step before a release is placed in the hands of the customer.
Life on the Web
While the theory and promise of a 24/7 Web site are talked about, the reality for many company Web sites is that they need to be taken offline (at least occasionally) to install new releases or upgrade software or hardware. In my environment, we occasionally need to take our production site offline for an hour or two during the night to install a new release. A variety of people are involved with these releases and the pressure is high: everybody takes their turn working through their piece of the process, knowing that offline time means lost revenue and reduced customer service. In this race against the clock, quality assurance is designated the last position, the final checkpoint before the site is open to the public again.
For us, these middle-of-the-night activities begin during the day. We collect statistics on our site, learning what day of the week and what time of day our site has the least amount of customer activity. This is how we select what time to take the site offline. When the install time is selected and the new release is ready for install, we hold a release meeting. We map out a tight schedule of what needs to be done and who needs to complete each task.
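As an illustration of choosing the install window from site statistics, here is a minimal Python sketch. It assumes the statistics are available as (weekday, hour, session count) samples; that data shape, the function name, and the sample numbers are hypothetical and not taken from our actual tooling.

```python
from collections import defaultdict

def quietest_slot(samples):
    """Return the (weekday, hour) with the lowest average session count.

    samples: iterable of (weekday, hour, sessions) tuples, one per observation.
    """
    totals = defaultdict(lambda: [0, 0])  # slot -> [sum of sessions, observations]
    for weekday, hour, sessions in samples:
        entry = totals[(weekday, hour)]
        entry[0] += sessions
        entry[1] += 1
    if not totals:
        raise ValueError("no traffic samples supplied")
    # Compare slots by their average sessions per observation.
    return min(totals, key=lambda slot: totals[slot][0] / totals[slot][1])

# Hypothetical data: two readings for each of three candidate slots.
samples = [
    ("Tue", 3, 12), ("Tue", 3, 8),
    ("Sun", 3, 5), ("Sun", 3, 9),
    ("Sun", 4, 4), ("Sun", 4, 6),
]
print(quietest_slot(samples))
```

Averaging per slot rather than taking a single reading keeps one unusually quiet night from deciding the schedule.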
When the magic hour approaches, our network staff begins by trying to close the site as gently as possible for customers, waiting for people to sign off and hoping more people aren't logging in at three o'clock in the morning. Once the customer count is as low as possible, we post a maintenance page stating that our site will be offline temporarily, and an email will be sent when the site is available again. I use a connection outside of our firewall to verify that this maintenance page is what is being shown to the "outside world." I enter my email address to receive notification when the site is live again. Then, the network staff makes any hardware or software upgrades as needed, followed by configuration changes.
Meanwhile, our database administrators (DBAs) begin making modifications to the production databases. Database tables may be added, changed, or deleted. Sometimes these database changes can only happen when the DBAs can gain exclusive access to a table. The only database changes that are made are those agreed upon in advance by development, quality assurance, and the DBAs. These database changes are well documented and executed according to the agreed-upon plan.
Our network staff continues by installing the new software. They follow release notes that were prepared by quality assurance in advance. The developers review configuration settings, ensure content is moved to production, and check that the correct files are in place. Once the network, DBA, and developers are done, quality assurance steps in. QA ensures that the release is ready, and that the site is ready to go live again.
What does quality assurance do at four o'clock in the morning with a production Web site on hold? To begin with, there is no time to plan during offline time; you have to be ready to work. We use a checklist that was prepared ahead of time. This reduces release stress and gives quality assurance the ability to ask for other team members to lend a hand in speeding up the process. At this stage, release cycle testing is the same as development cycle testing: planning is essential.
Steps to Checking the Release
Here are the steps our quality assurance team follows to verify the correct release has been installed on each production server.
Use a Specially Created Login Page to Log in to Each Server.
My company uses multiple production servers to support our Web site. We use load-balancing software that pushes each new customer session onto the server with the lowest number of users. While this load balancer is an important element of our production site, it is not helpful when quality assurance or development needs/wants to log in to a particular production server. To work around this issue, our developers built a simplified version of our login page that allows us to specify which production server we want our session to be run on. For our middle-of-the-night installations, this page becomes an important tool as we use the simplified page to log in to each server and ensure each server is ready to go.
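The routing behavior described above can be sketched in a few lines of Python. This is only a model of the idea, not our load balancer or our developers' login page: the function name, the `forced` parameter, and the server names are hypothetical.

```python
def choose_server(active_sessions, forced=None):
    """Pick the server for a new session.

    active_sessions: dict mapping server name -> current session count.
    forced: optional server name requested through the special login page.
    """
    if forced is not None:
        if forced not in active_sessions:
            raise KeyError(f"unknown server: {forced}")
        return forced
    # Normal customer traffic: the server with the fewest sessions wins.
    return min(active_sessions, key=active_sessions.get)

pool = {"web1": 40, "web2": 12, "web3": 27}  # hypothetical names and counts
print(choose_server(pool))                   # normal routing
print(choose_server(pool, forced="web3"))    # QA pins the session to one server
```

Without the override, a tester would always land on whichever server happened to be least busy and could not be sure every server had been visited.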
Check the Basics First
For each server, some basic elements are checked that have nothing to do with the new release. I make sure I can access each server and log in using a customer account that has been created ahead of time. I also check that basic transactions relevant to our site can be performed (in our case, this means searching for an item, browsing the aisles, putting an item in the shopping cart, submitting an order, etc.).
Ensure SSL is Working on Every Server
SSL (secure sockets layer) is an essential part of an e-commerce site, because it makes entering a credit card number online reasonably safe. I check SSL to make sure it's working on each production server after every installation. There are secure and nonsecure pages in our site. There are secure pages exposed only to existing customers and secure pages exposed to new customers. I check both, and I check each server.
As with any assessment done by quality assurance, you have to weigh risk, likelihood, and impact. How big a risk would it be to your site if SSL weren't working? How likely would it be for a customer to encounter the flaw? And could you survive the impact of that flaw on your production site? When you consider these points, checking each secure page on each server is like maintaining your company's own life insurance policy, one on which you wouldn't want to skip a payment.
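For readers who want to script part of this, the following Python sketch enumerates every server and secure page combination and offers a basic handshake probe. The host names and paths are hypothetical, and the probe only shows that a certificate-verified secure connection can be opened to a host; it does not prove that each individual page is served securely, so it supplements rather than replaces the manual check.

```python
import socket
import ssl
from itertools import product

def secure_page_checks(servers, secure_paths):
    """List every (server, path) pair that needs a secure-page check."""
    return list(product(servers, secure_paths))

def tls_handshake_ok(host, port=443, timeout=5.0):
    """Return True if a certificate-verified handshake with host succeeds."""
    context = ssl.create_default_context()  # verifies certificate and hostname
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            with context.wrap_socket(sock, server_hostname=host):
                return True
    except OSError:  # covers connection errors and ssl.SSLError
        return False

# Hypothetical servers and pages: one page for existing customers,
# one for new customers, and the checkout page.
servers = ["web1.example.com", "web2.example.com"]
paths = ["/login", "/register", "/checkout"]
for server, path in secure_page_checks(servers, paths):
    print(f"check https://{server}{path}")
```

Generating the full list up front makes it harder to skip a combination when working quickly.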
Use a Checklist of Visual Features that Changed for the Release
While most releases contain some functionality change that is not visually noticeable, they may include some change to the user interface such as a newly revamped page or a new icon/graphic. To prepare for our middle-of-the-night releases, I create a list of these quick visual checkpoints ahead of time. I log into each production server and use these checkpoints to ensure each server has the correct release in place. Remember, the entire release has already been through the quality assurance process. This is the release process, not the development process.
You can see how forcing a session on a particular server combined with knowing the visual changes of a release creates a reasonably fast, manual means of ensuring the new release is in place on each server. Logging in, checking the basics, checking SSL, and making a visual checkpoint of a few items have to be done on each server. Because I'm working against the clock, I first log in to each server rapidly so that if a particular server is having an issue, I can call out that issue to the network staff immediately. Then I go back through each server and check the basics, SSL, and visual checkpoints. Making sure the basic build is working before testing new features in depth is similar to the QA strategy typically used during the development process.
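The two-pass order of work can be sketched as follows. Our process is manual, so this Python outline is only a model of the sequence: the function, the check names, and the stand-in checks are hypothetical, and real checks would replace the lambdas.

```python
def verify_release(servers, login, detailed_checks):
    """Two-pass release check.

    login: callable(server) -> bool, the fast login probe.
    detailed_checks: dict of check name -> callable(server) -> bool.
    Returns (servers that failed login, list of (server, check name) failures).
    """
    # Pass 1: touch every server quickly so a broken one is reported at once.
    login_failures = [s for s in servers if not login(s)]

    # Pass 2: run the slower checks only where login worked.
    check_failures = []
    for server in servers:
        if server in login_failures:
            continue
        for name, check in detailed_checks.items():
            if not check(server):
                check_failures.append((server, name))
    return login_failures, check_failures

# Stand-in checks for demonstration: web2 cannot be logged in to,
# and web3 fails its SSL check.
servers = ["web1", "web2", "web3"]
login = lambda s: s != "web2"
checks = {
    "basics": lambda s: True,
    "ssl": lambda s: s != "web3",
    "visual": lambda s: True,
}
print(verify_release(servers, login, checks))
```

Separating the passes means the network staff hears about an unreachable server while the detailed checks on the others are still under way.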
Check INI, Property Files, or Other Configuration Files
We use a combination of property files and INI files for configuration settings that are specific to the environment. These files define where content and graphics are being read from, session timeout values, error file locations, etc. Since our system test environment is not configured the same as our production environment, these configuration files must be carefully checked with each release and for each server. Our network staff uses scripts that force the files into sync from one server to another, but I still take the time to at least spot-check the configuration files on each production server. Similar to the visual checkpoints, I build a list of any configuration file changes for the release in advance. This list is used by the network staff who modify the files and by quality assurance, so everyone on the team is reading from the same document.
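A spot-check of this kind can be partly automated. The Python sketch below compares one server's INI settings against the expected values for the release; the section names, keys, and values are invented for illustration, and property files would need a separate key=value parser that is not shown here.

```python
import configparser

def load_ini(text):
    """Parse INI text into {(section, key): value}."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        (section, key): value
        for section in parser.sections()
        for key, value in parser.items(section)
    }

def config_mismatches(expected, actual):
    """Return {(section, key): (expected, actual)} for settings that differ."""
    keys = expected.keys() | actual.keys()
    return {
        k: (expected.get(k), actual.get(k))
        for k in sorted(keys)
        if expected.get(k) != actual.get(k)
    }

# Hypothetical settings: the release expects a 30-minute session timeout,
# but one server still carries the old value.
expected = load_ini("[session]\ntimeout = 30\n[content]\nimage_root = /prod/images\n")
web2 = load_ini("[session]\ntimeout = 20\n[content]\nimage_root = /prod/images\n")
print(config_mismatches(expected, web2))
```

A setting that is missing on one side shows up with `None` in that position, so additions and deletions are reported along with changed values.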
Scenario-Specific Checkpoints
Sometimes we make a change that impacts aspects of the site that are worth checking in production. For example, we might add a required field in online registration that won't work correctly unless the corresponding database modification has been made. If needed, I create accounts in advance for this type of testing. If your database design is complicated or includes triggers, stored procedures, etc., you might want to enter the first few orders or records and check the data in the database before your site goes live.
Time to Go Live
Once these checkpoints have been made and everyone on the team is confident the build is ready, we go live. I watch to receive my user email notification indicating that the site is now online, and I return to my test lab to check how the site is being presented outside the firewall.
Steps to Verify the Release is Working When the Site Is Live
Check the First One or More Records in the Database
Typically, everyone on the team has their eye on the servers as we watch for customers to return to the site. If new tables have been added to the database or major changes have been made, the first few records in the production database are checked by the DBA and QA team to ensure the data is accurate.
Sometimes what quality assurance looks for on the early morning releases is impossible to plan. With one release, we began service in a new city. After the install was complete and the site was live, two people on the team placed "real production orders." We discovered that our order-generating system was starting with the number 1. The fact that the order number so clearly indicated how many orders had been placed in that market wasn't the kind of fact we wanted our competitors to discover so easily! A quick change to the database was made and the team placed additional production orders to ensure the order numbering had been properly corrected.
Check Any Third-Party Links
We have a couple of links that work with other companies and their Web sites. When our site goes live, I check to see if these links are working to ensure the sites are working together as defined. I check them from both inside and outside of our firewall.
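A simple link check could look like the following Python sketch. The partner URLs are placeholders, and the script only confirms that each link answers with HTTP status 200; whether the two sites work together as defined still needs a person to look. To cover both sides of the firewall, the same script would have to be run once from each location.

```python
import urllib.error
import urllib.request

def http_status(url, timeout=10.0):
    """Return the HTTP status code for url, or None if the request fails outright."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # the server answered, but with an error status
    except OSError:
        return None       # DNS failure, refused connection, timeout, and so on

def broken_links(urls, fetch=http_status):
    """Return the URLs whose status is not 200."""
    return [url for url in urls if fetch(url) != 200]

# Demonstration with canned statuses instead of real network calls.
fake_statuses = {
    "https://partner-a.example/": 200,
    "https://partner-b.example/": 503,
}
print(broken_links(list(fake_statuses), fetch=fake_statuses.get))
```

Passing the fetch function as a parameter lets the list logic be exercised without touching the network, as the demonstration does.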
Verify that Nightly or Weekly Jobs Are Ready for the Environment
Last on the list is checking that any nightly or weekly jobs that need to be run in production have been modified and are in place. Typically, our DBA will check to ensure data warehousing jobs and other jobs are in place and configured correctly. Following the release, the team keeps an extra eye open for any production issues. Our technical support team, which has been told in advance of each release, also waits to hear if any issues arise. If so, the release team is immediately called into action.
In Conclusion
Preparing a checklist in advance helps minimize time and confusion during the release cycle. This is especially helpful if you have to complete the release cycle at three or four o'clock in the morning. By planning, organizing, and maintaining a commonsense approach when searching for flaws during any step of the process, QA testers make valuable contributions to the team.