Conventional wisdom holds that complex projects involving large groups of engineers cannot benefit from Agile techniques. Our experience suggests otherwise: certain Agile practices, when properly used, can benefit even relatively large development projects with large teams. What's even more interesting, these practices can be introduced in "mid-stream," with little preparation, to large teams of "old school" developers. These developers may initially resist the methodology, but the Agile practices still win people over and bring tremendous results in productivity, product quality, and team morale in a very short period of time.
Introducing XP
XP was first introduced at StarSoft in 2002, after our CEO read Kent Beck's "Extreme Programming Explained: Embrace Change." Later that year, StarSoft won its first Agile customer, a world-leading chip manufacturer. The customer picked us specifically because its internal development organization was based around the practice of XP. Since then, StarSoft has successfully delivered over 60 Agile projects. Today, over four years later, we can say that XP has become very much a part of the company's culture. Our COO, whose office is in the hallway that leads to the cafeteria, says he can tell when an XP team is going to lunch as opposed to one of the "waterfall" teams: they even walk faster because they feel they have no time to waste! Engineers either like or hate XP; however, conversions are entirely possible even for the most die-hard "old school" engineers. We have found that for most young developers, the clarity, focus, and high productivity yielded by XP are infectious, and people become ardent promoters of the methodology within the organization. No engineer who has tried XP has ever wanted to go back to "the old ways."
The Labka II Project
StarSoft introduced XP to a large-scale software development project for Computer Sciences Corporation (CSC) in Scandinavia, yielding some interesting results. A large and complex system called Labka II that had been in development for three years was about to go into production. Labka II is a medical laboratory information and production planning system supporting requisition and analysis of clinical chemical samples, blood cell analysis, and the like. The system communicates with a large number of advanced lab instruments. Labka II controls test requests, definition of work, schedules, collecting and processing test responses, validation and approval, reporting, data exchange, quality assurance, production statistics, and extract for invoicing. The underlying technology is based on BEA WebLogic, Oracle, AIX, MS SQL and Windows. Labka II can service a large hospital or a network of hospitals in a region or a county. The StarSoft project team size reached up to 80 people at peak times. The system contains over one million LOC, 700+ DB tables, and 40MB+ of source code.
The StarSoft team was having trouble getting performance and quality under control. Labka II is replacing the legacy Labka system and detailed specifications were developed and handed down to the team up front. Inevitably, CSC and its client (one of the largest Scandinavian hospital systems) introduced numerous changes in the course of the project to take advantage of the new technologies and include advanced functionality in the new version of the system. The problem was that we were not very wise in how we processed the changes to the scope: we simply accepted the requests without communicating back to the client the impact that the requests were going to have on the project's complexity and timeline. We were just saying "yes" to everything and hoping it would somehow work. At the same time, our internal project organization was largely inefficient and was not coping with the amount of work (see more on that later). Eventually, we had to learn from our mistakes the hard way.
After several months spent polishing the system, two things became increasingly clear. First, some of the algorithms created acute performance problems: they were suboptimal "patchwork" code, written by separate people at different times because the influx of change requests had been poorly processed over a long period. Second, the number of fixed defects was constantly lagging behind the number of new defects found. Both CSC and its client were unhappy because the deadline was approaching and there just seemed to be no light at the end of the tunnel.
The StarSoft management realized that something had to change fast, before the situation exploded in everyone's faces. We could not risk stressing this client relationship any further, so the entire management and production team worked together to identify the key problem areas where improvements absolutely had to be made:
- Frustration from scope creep. The individual frustration of developers was largely caused by the scope management process, which could be likened to the communist "central planning economy." Decisions were made high up and then communicated down to "the common people," sometimes disconnected from the reality of what was going on "on the farms" and "in the factories." The creeping scope and the time pressure created the feeling among the engineers that this project was never going to get to a "done" state: no matter how hard they worked, there never seemed to be light at the end of the tunnel.
- Disconnect between individual effort and team results. The number of defects per developer was not dramatic, but since the project was quite big, the aggregate numbers looked pretty horrible. People were de-motivated because the overall project results did not seem to correlate with their individual efforts.
- Disconnect between developers and testers. Developers and testers didn't talk to each other a great deal, resulting in a communication delay and lack of focus. Developers and testers were clearly out of sync and when a defect was fixed, it could take up to three or four days to get it verified and fed back to the developers.
- Fatigue from being driven too hard. Management (all levels, from the EVP and PM to the team leads) had been driving the team to work too hard, for too long, with too little result. Frustration and indifference were common and the team was unable to focus on the root causes of the problem.
Changing our behavior took courage: it is not easy to tell the customer that it will not get everything it wants at the time it said it wanted it. But we had to make it clear that unless the scope was fixed and change requests were put on hold, we would not be able to dramatically improve quality and performance. Eventually, CSC was understanding and supportive, and worked with StarSoft to prioritize the tasks based on business value. As a result, management agreed upon a scope that would not be changed for several months, giving the team time to straighten out the defects and deal with performance issues.
New Practices
Drawing on StarSoft's several years of experience with XP, and with the help and coaching from the group of our XP-savvy project managers, we implemented the following practices:
- Smaller, fully functional teams. The large team was split up into seven fully functional teams, each consisting of a team lead, 3-6 developers, and 1-4 testers. Each team lead became a project manager of a small Agile project. This helped bridge the communication gap and ensured much faster turnaround.
- Agile planning and scope management. The project moved from "central planning" to "market economy." Developers were finally involved in determining how much would be done and by what time! Teams worked on their respective areas of the project in very short iterations (4 to 10 days), with a half-day planning game at the start of each iteration.
- Individual commitment to results. For each performance or quality target, there was now a personal commitment from a team lead. Team leads were assured by the management that scope creep would no longer be allowed, and they were effectively insulated from new change requests until the system was thoroughly cleaned up. Team leads in turn got commitments from individual team members on a daily basis.
- Each day is a project. In the morning standup meeting, tasks for the day were defined and agreed to by the team members. At the evening standup meeting, each team member personally reported the results. Not only were the tasks shared, they were also written down on a dozen flipchart sheets and posted on the walls at the beginning of each iteration. As developers completed their tasks, they would walk up to the chart and cross off their tasks. This cemented the teams and restored the feeling in everyone that what they did mattered, and that they were actually getting something done. Everyone was accountable every day, and there was great energy in the teams. (In fact, it was especially difficult to get some people to use the charts. In some cases, the company's EVP had to personally oversee planning games and morning standup meetings to make sure people actually wrote tasks down on charts and posted them on the walls.)
- Granular planning and estimates on the level of hours. Every morning, individual granular tasks were estimated (in hours) and committed to by engineers. A review followed every evening, so estimates were corrected and re-negotiated within the team almost in real time. This practice dramatically improved quality of estimates, within days of being first introduced.
- Refactoring. Most iterations of most teams were focused on fixing certain defects. Sometimes engineers were able to trace a certain group of bugs to a particular code fragment. Then, at the planning game, instead of deciding to simply go ahead and fix those defects one by one, the team would decide to refactor that piece of code and eliminate the root cause of the problem. As a result, we saw better code structure, better stability, and better performance (with occasional performance boosts of one or even two orders of magnitude).
- Pair programming. This was the practice that, while recommended by the management, was met with particular skepticism and resistance. However, eventually the infectious enthusiasm of the "XP mentors" convinced some people to try it. Immediately, it was clear that productivity did not really decrease as the "non-believers" had feared (of course, the XP folks knew it wouldn't). At the same time, the quality improved dramatically. The other benefits noted by developers were mutual education and joint code ownership. A number of experienced engineers ended up using pair programming for more complex tasks, e.g. refactoring.
- Open communication. XP-style communication eliminated a lot of waste and created a great atmosphere conducive to creative and productive work. The mood clearly changed from "we're working hard but it is all in vain anyway so leave us alone" to "if you know how to do things better, just tell us."
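The hour-level estimate-and-review loop from the list above can be illustrated with a minimal sketch. It is not the tooling the team used (the article describes flipcharts on the wall); the `Task` class, the function names, and the sample tasks and hours are all hypothetical. It only assumes that each task carries a morning estimate and an evening actual, both in hours.

```python
from dataclasses import dataclass


@dataclass
class Task:
    name: str               # hypothetical task label
    estimated_hours: float  # committed to at the morning standup
    actual_hours: float     # reported at the evening standup


def estimate_ratio(tasks):
    """Return total actual hours divided by total estimated hours.

    A value above 1.0 means the day's work was underestimated.
    """
    estimated = sum(t.estimated_hours for t in tasks)
    actual = sum(t.actual_hours for t in tasks)
    if estimated == 0:
        raise ValueError("no estimated hours recorded")
    return actual / estimated


def adjusted_estimate(raw_hours, ratio):
    """Scale the next day's raw estimate by the observed ratio."""
    return round(raw_hours * ratio, 1)


# Illustrative data only; these figures do not come from the project.
day = [
    Task("fix report rounding defect", 2.0, 3.0),
    Task("refactor sample lookup", 4.0, 5.0),
    Task("verify instrument import fix", 2.0, 2.0),
]
ratio = estimate_ratio(day)
next_day_hours = adjusted_estimate(4.0, ratio)
```

With these sample numbers the day took 10 hours against 8 estimated, so a task that would naively be estimated at 4 hours the next morning would be re-negotiated to 5. In practice the correction happened in conversation within the team each evening, not by formula.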
The results were, to put it simply, unbelievable. The metrics chart (see Figure 1) shows the dramatic decline in the number of active defects starting at the beginning of December 2004 (precisely when the XP practices were first introduced). The quality and performance metrics charts were posted on the wall of the project room daily, and this chart has since been framed and placed on the wall in the CEO's office as a testament to the power of agility.
Figure 1: Active defects over time
The team initially resisted XP simply because it was new and because for most people these practices seemed quite radical. Generally, developers just laughed at the XP practices, but only until they realized that something important had changed: the feeling of hopelessness was gone. They could finally see themselves make real, tangible progress, and that made them quickly embrace the program.
Morale And Motivation
The most significant changes happened in the area of morale and motivation. Daily stand-up meetings and incremental planning with real-time adjustments led to goals being clear and attainable. It was no longer "oh no, here is this tremendous task for the next 3 months that I have no idea if I am ever going to complete" but rather, "these are small tasks that I have to do today." If someone had four small tasks to complete for the day, and saw that he had already crossed off three, and if he only stayed an extra 30 minutes, he could complete the last one, he would do it, go home proud and come back the next morning even more energized and motivated.
At the same time, a great deal of pressure was eliminated by moving from Waterfall top-down scope management to a business value-based discussion that involved the development team. The engineers were finally being asked, "How much can you accomplish in this time?" "Can you do it with the agreed level of quality?" Knowing that no wild change request would suddenly fall out of the sky tomorrow helped developers relax and focus on tasks at hand, which led to great increases in productivity.
Another boost to the team's motivation came from the fact that the management team demoted itself and got involved hands-on. The CEO effectively assumed the role of program manager, negotiating quality and performance targets with the clients. The EVP became a project manager and moved from his office to the project room for three months, participating in stand-up meetings every morning. And the project manager became one of the team leads, taking on the most challenging part of the project with his team.
This particular project is currently in production, end users of the system are happy, and our relationship with CSC is stronger than ever. Part of this success is clearly attributable to the use of XP. Although we would not argue that a project of any size, no matter how large, can be successfully implemented under the XP paradigm, we have no doubt that the use of certain elements of XP, if done correctly, can benefit almost any project in immediately obvious ways.
About the Labka II Program
The program team comprises over 120 staff from CSC Denmark, CSC UK, CSC France, CSC Sweden, and StarSoft Development Labs in St. Petersburg, Russia. Labka II is CSC's largest Nordic offshore program, with over 80 StarSoft team members based in St. Petersburg. CSC has 5-10 team members working on-site with StarSoft. Currently the first release of Labka II is live at five sites in Denmark supporting 1800 end users, and will be live at nine sites with 8000 end users by the end of September.
(St. Petersburg Labka II Development Center) (CSC and StarSoft Architects in St. Petersburg)
(One of Labka II sites) (One of Labka II sites)