Old software may not always work as well as it seems. The mentality of "If it ain't broke, don't fix it" could be the culprit. In this column, Linda Hayes offers a few suggestions to help you look at your software with a more critical eye, which might help you realize where your old software is broken or in need of attention.
I have long excoriated capture/playback as an ineffective test automation approach because it lacks structure, reliability, and maintainability. But there is something more subtle and disturbing about it. When you get right down to it, all capture/playback can do is tell you whether the same steps got the same results, not whether the steps or results were right in the first place.
Batch testing is also often automated by simply capturing production data, declaring it the baseline, and thereafter comparing against it. As expedient as this is for describing a lot of functionality in shorthand, it unfortunately means that an error memorialized in the baseline can go undetected for a long time and do a lot of damage.
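To make the contrast concrete, here is a minimal, hypothetical sketch in Python. The price_mortgage function, the rate table, and the pytest-style tests are all invented for illustration, not taken from any real system. The baseline-style test can only confirm that the system still produces whatever it produced when the baseline was recorded; the rule-based test checks the output against what the business rule says it should be.

```python
# Hypothetical illustration: captured-baseline comparison vs. rule-based expectation.
# price_mortgage() and the rate table are invented stand-ins, not real code.

# The documented business rule: the rate each product is supposed to carry.
RATE_TABLE = {"30_year_fixed": 0.0675, "15_year_fixed": 0.0625}

def price_mortgage(product: str) -> float:
    # Imagine this is the system under test, and suppose it silently uses
    # the wrong rate for 15-year loans.
    return {"30_year_fixed": 0.0675, "15_year_fixed": 0.0575}[product]

def test_against_captured_baseline():
    # Capture/playback style: the "expected" value is whatever the system
    # produced when the baseline was recorded. If the baseline captured the
    # defect, this test keeps passing.
    captured_baseline = {"30_year_fixed": 0.0675, "15_year_fixed": 0.0575}
    for product, recorded in captured_baseline.items():
        assert price_mortgage(product) == recorded   # passes; defect stays hidden

def test_against_business_rule():
    # Specification style: the expected value comes from the documented
    # business rule, independent of what the system happens to produce.
    for product, required_rate in RATE_TABLE.items():
        assert price_mortgage(product) == required_rate  # fails on the 15-year loan
```

The mechanics are trivial; what matters is where the expected value comes from. In the first test it comes from the system itself, so an error memorialized in the baseline is invisible. In the second it comes from an independent statement of what the software should do.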
Don't get me wrong: I can understand the attraction, even the need, to do this. After years in production, any system documentation is long out of date and the original authors are probably long gone, so the best source of knowledge about what the software should be doing is what it is doing. After all, if no one is complaining, then it must be working. Right?
Not necessarily. HomeSide Lending, a New Jersey mortgage broker acquired by National Australia Bank, had been under-pricing mortgages for years due to a "computer error" in which the wrong interest rate was used. After all the dust settled, the write-offs amounted to $4 billion, and the bank's market value fell by more than $6.5 billion. The bank was nearly bankrupted.
Aside from the shocking price tag, it's amazing how long the error went undetected. Mortgage pricing is central to HomeSide's core business and was supposedly its special competency. I don't know precisely how it happened, but obviously someone, somewhere, at some point decided the software was pricing mortgages correctly when it wasn't. No one questioned it until the Australian banking authority audited the software.
It stands to reason that the best, indeed the only, way to truly test software is to first decide what it should do and then prove that it does, rather than the other way around. This is hard, especially for older systems, because of the long accumulation of changes and the tangled network of integrations. Still, is any excuse adequate against the possibility of long-term, perhaps even fatal, damage?
Sarbanes-Oxley wasn't around in 2001 when the HomeSide Lending disaster occurred, but if it had been, I'm sure the repercussions for executives would have been far more serious than being fired. And while decades ago a "computer error" was easier to cloak in technical mystery, today technology is commonplace and errors are understood as defects that have a cause.

It would be better to avoid hidden assumptions that lead to sudden surprises by following the example of the Social Security Administration's Unified Measurement System project. Tasked with updating the existing performance-evaluation system for 1,300 field offices, program manager Harriet Hardy started by defining the business rules first, then evaluating the existing system to establish the baseline. This process uncovered differences between the documented performance-measurement policies and how those policies were actually implemented in the software. Had Harriet simply assumed that the existing system was operating correctly because no one said otherwise, the errors could have been perpetuated even longer.
Of course, not all management is willing to invest in functional introspection. Discovering and documenting functionality takes time but produces no lines of code, and unfortunately, code is sometimes confused with value. The true underlying asset, the knowledge expressed in the system, is often neglected. Ironically, it is this knowledge that gives the software value, not the number of lines of code; no one buys software by the pound. Yet time for detailed analysis, planning, and documentation is often squeezed by the need to show supposedly tangible progress.
A project to rearchitect one of our core components has taken thousands of hours so far, and most of that was not analysis, design, or coding; it was basic domain-knowledge transfer. But I know from expensive experience that ensuring a comprehensive understanding among all team members from the beginning will avoid far more costly issues later.
So expose your hidden assumptions, bring them into the light, and question them. It may be the most important quality step you can take.