Agile development and continuous integration challenges

To travel by air, you get to the airport, check in, with any luck get on a plane, and get to your destination. Perhaps you do not arrive precisely on time, but close enough that you make that important meeting or family event. Like jumbo jet pilots, software development teams and project managers have a lot more to worry about than the final consumers of the software or the passengers on the plane. Pilots have to go through a rigorous pre-flight check routine to ensure the plane is in top condition, the correct amount of fuel is on board, the hydraulics and electrical systems are all working properly, and the flight plan and latest weather have been reviewed - and it's all done on the same jet that is being flown. The pilot wouldn't do their pre-flight check on their private Cessna and then jump on board the corporate Gulfstream, or a Boeing 767, would they?
One of the primary goals of continuous integration is fast feedback for developers; they need the technology infrastructure that allows them to build and test as early and often as possible. This has driven the market for continuous integration for years, with the goal being to build high quality software for customers, whether it's a small team implementing an internal application for financial reporting, or a multi-million dollar gaming company about to roll out its latest cross-platform blockbuster hit.
To build the best software, development teams must hit their important milestones to allow for sufficient QA testing, and they struggle when their code breaks. Why? Because many still do their validation and tests on their own machines, not on production-class systems. The development team is following the continuous integration and agile methodology of testing early and often, yet the build breaks when their code is integrated. They respond, "It worked on my machine!" Something in the production environment differed from the developer's machine, and the build failed. Perhaps the developer had different tools, or different versions of the same tools - multiply that by a team of tens or hundreds of developers and it quickly gets unruly.
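The "different tools, or versions of tools" problem can be made concrete with a small comparison. What follows is only a minimal sketch, not a description of any particular product: the function name `find_drift` and the tool names and version strings are hypothetical, chosen purely for illustration.

```python
from typing import Dict, Optional, Tuple


def find_drift(
    reference: Dict[str, str], local: Dict[str, str]
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Return the tools whose local version differs from the reference.

    `reference` describes the production-class build environment and
    `local` describes a developer's machine. Both map a tool name to a
    version string. A tool missing locally is reported with None.
    """
    drift = {}
    for tool, expected in reference.items():
        actual = local.get(tool)
        if actual != expected:
            drift[tool] = (expected, actual)
    return drift


# Hypothetical environments, for illustration only.
production = {"compiler": "12.2", "database": "15.4", "test-runner": "7.1"}
developer = {"compiler": "12.2", "database": "14.9"}

for tool, (expected, actual) in find_drift(production, developer).items():
    print(f"{tool}: production has {expected}, developer has {actual}")
```

Even a check this simple shows why the problem grows with team size: every developer machine is one more environment that can drift from the reference.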
Or, instead of the "it worked on my machine" problem, the infrastructure itself breaks. The code was good and it tested fine, but something else happened. A machine goes down, a script is bad, or something else goes wrong. Everything stops. Everyone waits. The jet sits on the tarmac, and phones and inboxes start to light up. Who broke it? How do we back it out? Roll out the snack cart and headsets.
The impact on continuous integration in this common scenario is threefold. First, developers are afraid to check in late in the day - they don't want to be the one who breaks the build and has to stay all night to get things straightened out - so productivity is the first casualty. Second, delayed code submission causes the schedule to slip. Third, the developers who are creating clean code submit, wait, and get an "inconclusive - build infrastructure failed" result. Not a motivating result. Morale goes into a nosedive.
And we're not taking into consideration the added challenges of a global development team. With a U.S. and overseas development team working through time zone and language challenges, the impact of lost productivity can go from hours to days. As we all know, lost productivity is lost revenue.
And the loss in morale should not be underestimated. I've witnessed customers losing great developers just because of this issue. Very good companies have been unable to hold onto their best developers because builds would continually break due to infrastructure issues, not bad code.
So what's the solution?
Just as a pilot would not do the pre-flight check on a different plane before takeoff, the development team shouldn't be validating its builds on anything but a production-class environment. Early, frequent building on the right OS, database versions, and tool chains, combined with the ability to test cross-platform, is key to an effective agile strategy. Pre-flight build and test - building and testing before check-in, including both unit tests and a subset of system tests, in a production-class environment - eliminates the big productivity losses and drops in morale suffered in the examples given above. The developer gets fast feedback. Is the code clean? If the answer is yes, it is automatically checked in and ready for QA. If it fails, it's kicked back to the developer, but the problems are caught much earlier in the testing process and the rest of the team continues to be productive, checking in code with no waiting. So a mistake that might not have been detected until takeoff is now found before anyone boards the plane. What would have been hours of delay can now be isolated and fixed without affecting the passengers or the other planes waiting to take off.
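The gate described above (build and test first, check in automatically only on success, otherwise kick the change back) can be sketched in a few lines. This is an illustrative outline under stated assumptions, not any vendor's implementation: the names `preflight` and `StepResult` are hypothetical, and the steps are plain callables standing in for real build and test jobs that would run on a production-class machine.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StepResult:
    name: str
    passed: bool


def preflight(
    steps: List[Tuple[str, Callable[[], bool]]],
    check_in: Callable[[], None],
) -> Tuple[bool, List[StepResult]]:
    """Run each step in order and check in only if every step passes.

    Each step is a (name, callable) pair; the callable returns True on
    success. On the first failure the run stops and check_in is never
    called, so the shared code line is left untouched.
    """
    results = []
    for name, step in steps:
        passed = step()
        results.append(StepResult(name, passed))
        if not passed:
            # Kick the change back to the developer.
            return False, results
    check_in()  # every step passed: commit automatically
    return True, results


# Hypothetical stand-ins for real build and test jobs.
shared_code_line = []
ok, results = preflight(
    [
        ("build", lambda: True),
        ("unit tests", lambda: True),
        ("system test subset", lambda: True),
    ],
    check_in=lambda: shared_code_line.append("change-1"),
)
print("checked in" if ok else "kicked back", [r.name for r in results])
```

The design point is that a failure stays with the one developer whose change failed; nothing reaches the shared code line, so nobody else has to wait.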
So how does it really work? We have several customers that put pre-flight builds into practice every day. We have one customer who has seen a 90 percent reduction in broken builds in just a couple of months. And it's worth noting that the remaining 10 percent of broken builds were due to developers who were circumventing the pre-flight build and continuous integration methodology.
The result of testing early and often is that you discover problems in the pre-flight stage rather than "mid air," get faster feedback, and end up with a cleaner code base. One plane goes back to the terminal, but 20 others take off in the meantime. In agile development, you get working software and can release at any time because you are hitting your milestones. Your demos are ready to roll; QA is testing the right version and can run a clean test. And, as we all know, it costs a few dollars to catch bugs early, rather than hundreds or thousands when they reach the customer.
Continuous integration is great, but it only tells you there's a problem after it's too late to prevent the downstream damage of that problem - high-performing development teams have incorporated pre-flight build and test to gain agility and focus on the code, not the infrastructure.