In his CM: the Next Generation series, Joe Farah gives us a glimpse into the issues that CM experts will need to tackle and master, based upon industry trends and future technology challenges.
From my perspective, the two most important guidelines are to keep it simple and to automate. But how do you keep CM simple? And how much effort do you spend automating? What's the most effective marching strategy? Let's take a look.
Process and Team Work
Take a fresh development team who have been working away for weeks or months without any CM tools. Now tell them that they have to start using a CM tool and put all of their existing work into the CM tool. The complaints begin: There's not enough time to introduce a new tool. Designers will balk at the idea. They don't want to lose focus and we've got a deadline to meet.
Your development team may have some valid points. If you try the approach: "I know you'll take a hit, but it will improve product quality," you may make some inroads, but you may not get very far. Rather than imposing a CM tool on your team, take a different approach. "I want to make everyone's job easier. The result will be a better product." Now you'll get a bit more attention. So how do you deliver?
You have to get your product from the start line to the finish line. Not only that, but you have to make sure that what you finish with satisfies the requirements you started with. And you have to make sure the result is of high quality. Walk through the steps. How are you going to accomplish this? Well, you'll need to test the product at the end. You'll also need to ensure that all of the requirements are covered by the tests. And then there's everything in between:
· You take the requirements of the customer/market and turn them into a product functional specification
· You take the functional specification and create a design around it
· You take the design and you structure your code base accordingly
· You fill in the modules of your code base using the best available integrated development environments
· You integrate the modules and do some sanity testing
· You do your full verification testing
That's one way: the waterfall method. Or perhaps you prefer a more agile, iterative approach. The big difference here is that you have various iterations running through these steps in an overlapping fashion. You may start your specification and even your design based on only a small subset of your requirements. Your requirements may be phased in so that you can give your customer feedback and iteratively refine the requirements. Your verification team can start building tests to hard requirements, or even dig up an old test suite which covers a portion of your requirements.
How you go about your design process may be very important. The waterfall method may be most appropriate. Agile development may work better. But the bottom line remains that you have to start with a set of requirements and end up with a quality product that meets those requirements.
Your CM tools and processes must support you in doing this. And don't forget, you've already promised to make everyone's job easier.
The entire team needs to be part of the process. They need to have an overview of what the ALM processes are and must have open avenues for making recommendations. Furthermore, information and input solicitation meetings should be made available. When everyone feels they are part of the process team, they will use the available avenues to recommend process improvements or to highlight problems in the process. This will lead to continuous process improvement, and higher quality products.
What's My Job
"Hi, I'm Bob, a programmer on the project. My manager gave me an API that I have to implement. I just want to write the code in a few files in my directory, compile them and hand over the results. I don't want a CM tool getting in my way. I know what I have to do and I have the tools to do it! How are you going to make my job easier?"
Here we have a programmer who doesn't understand his job. Like it or not, the programmer will have to do the following things in addition to the steps he perceives in his mind:
· Make backup copies of the code in case of disk crashes, etc.
· Put copyright notices into each of his files
· Run tests to ensure that the resulting code is working.
· Have others run tests to ensure that the API implementation is successful
· Keep track of problems that others have found in the implementation
· Fix these problems
· Let others know when the problems are fixed
· Identify changes to the API requirements for the next release
· Implement and test those changes while making sure that the original release still works.
At this point, the programmer realizes that he needs to keep two versions of his implementation: one for the current release and one for the next release. Not only that, but he has had to keep track of which requirements (i.e. versions of the API) apply to which release. And which tests are to be run on each release. Then there are the problems that are reported: do they apply to the current release or the next one, or both? And what happens when he fixes a problem in the next release and then realizes that it has to be fixed in the previous release as well?
So, maybe the programmer didn't really understand his job that well. Quality requires that each team member understands the job assigned to his/her role. Well-defined and well-understood roles allow the puzzle to fit together without any missing pieces. Still, he argues, he could just maintain two parallel directory structures, one for each release.
Hiding the Complexity
The programmer has accepted that his original job understanding fell a bit short, but has developed a directory structure to hold the API requirements, test scripts, code, problem descriptions, etc. in a nicely ordered fashion. He's comfortable. He doesn't really need a CM tool. That is, until someone comes to him and asks:
· That problem that you fixed, which files were changed to fix it?
· Can we do a code review on that fix?
· We delivered our product 2 months ago to customer X. Is that problem fixed for that customer?
· Can you propagate the change you made last month to the next release? Are there any other changes that should be propagated?
· By the way, we'll be shipping release 2 next month, can you start work on release 3 tomorrow?
Bob is a smart programmer. He knows he could continue to evolve his own processes to help him to track things. But now he realizes that this is a much larger task than he originally thought and that he could use a few tools to help him out. His world is just becoming a bit too complex.
Many programmers will be able to keep all of the complexities in mind as they learn them. Many will not. The CM tool needs to hide these complexities. Complexities in the process and at the user interface will lead to human error for any number of reasons. A good CM tool will hide the complexities by ensuring that it has sufficient context information to allow the user to work more simply. Simpler user interface, fewer errors, higher quality.
I've seen and heard of many projects with very complex branching strategies, along with labeling instructions and repetitive merging. The developer doesn't want to have to spend hours understanding a branching strategy only to find out that he has to create a maze of branches, label them and finally work himself out of the maze. Instead he would like to say: I'm working on this release of this product. I'm making the changes approved for this feature or problem report. He then goes to the source tree, checks out the files, makes the changes and checks in the tested results. The CM tool lets him know if he needs to create branches, or if he needs to reconcile his code because someone else had a parallel checkout. The CM tool tracks the status of the change, rather than forcing him to keep it on his working directory until such time as the build team allows him to check it in.
Look at the product manager's tasks. Or the build and integration team tasks. These are complex. It's up to the CM tool to hide the complexity for them as well.
The build doesn't work. The CM tool gives the difference between this build and the previous working one, but not in terms of lines of code. It’s in terms of changes, features, problems fixed, developers who made the changes. If the build manager wants to drill down to lines of code, fine; otherwise, hide this complexity.
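As a rough illustration of a build comparison expressed in changes rather than lines of code, here is a minimal Python sketch. It assumes each build is recorded as a list of change identifiers and that each change carries traceability data; the function name, the field names and the identifiers are all hypothetical, not taken from any particular CM tool.

```python
def build_difference(working_build, broken_build, change_info):
    """Describe what the broken build adds relative to the last working one.

    Both builds are given as collections of change identifiers;
    change_info maps each identifier to its (hypothetical) traceability data.
    """
    new_changes = sorted(set(broken_build) - set(working_build))
    return [
        {
            "change": change_id,
            "developer": change_info[change_id]["developer"],
            "implements": change_info[change_id]["implements"],
        }
        for change_id in new_changes
    ]


# Hypothetical data: three changes, each tied to a problem report or feature.
change_info = {
    "C-101": {"developer": "bob", "implements": "PR-12"},
    "C-102": {"developer": "alice", "implements": "FR-7"},
    "C-103": {"developer": "bob", "implements": "PR-15"},
}
suspects = build_difference(
    working_build=["C-101"],
    broken_build=["C-101", "C-102", "C-103"],
    change_info=change_info,
)
for entry in suspects:
    print(entry["change"], entry["developer"], entry["implements"])
```

The report lists only the changes that are new in the failing build, which is usually a far shorter list to inspect than a line-by-line diff.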
The marketing team wants to put together a glossy about the new features for the upcoming release. It has to be accurate. The faster they can do this, the later they can start and the higher the accuracy level. If they have to spend days or weeks collecting the information, their job is too complex. It should be presented to them by the CM tool which has all of the traceability information and clearly distinguishes features that are already implemented from those that may not make the release deadline.
The project manager informs the product manager that the release cannot be ready on time. The product manager does not want to spend days collecting information and trying to put together alternative plans. He wants the CM system to tell him what will make it into the release if he maintains the same deadline, and what won't, so that he can go back to the customer and/or marketing team and negotiate. He also wants to ensure, up front, that all of the high priority items will make it. After negotiations, he wants to easily adjust the release so that development, documentation and verification teams have an accurate picture of what has changed. It is good to communicate these changes through meetings and documents, but if the CM tool is not driving the to-do lists and tasks based on this new picture, it will take a lot longer to turn the ship to its new destination. If the complexity is hidden behind a few buttons, it disappears from view.
The CM Tool: A Help, Not Overhead
Some CM tools are designed to help the development team members with the addition of a minimal amount of overhead. I would argue that this is not good enough. A CM tool should have negative overhead. That is, it should actually improve the team member productivity while collecting the data necessary to improve management control.
Packaging Files Into a Change
A software change/update often affects more than a single file. A file-based tool requires checking out each file, putting the reason against each checkout, editing/compiling each file, doing a delta/diff report on each file and consolidating them for a code review, doing a check-in of each file, doing a promotion of each file to a ready state. Now a change-based tool still requires a checkout of each file against the change package identifier, but the reason is supplied once against the change, the delta/diff report is done on the change, the change is checked in, and the change is promoted. This is a time-saver for the developer. It also has the advantage that those downstream in the process can deal with a handful of changes rather than a much larger number of individual files.
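To make the contrast concrete, here is a minimal Python sketch of a change package. The `Change` class, its fields and the status names are hypothetical illustrations, not the data model of any specific CM tool.

```python
from dataclasses import dataclass, field


@dataclass
class Change:
    """A hypothetical change package: one reason, many files, one status."""
    change_id: str
    reason: str                                  # supplied once, not per file
    files: list = field(default_factory=list)
    status: str = "open"

    def check_out(self, path):
        # Each file is still checked out, but against the change identifier.
        if path not in self.files:
            self.files.append(path)

    def promote(self, new_status):
        # One promotion moves every file in the change together.
        self.status = new_status


change = Change("C-101", "Fix null handle in API init")
change.check_out("src/api.c")
change.check_out("include/api.h")
change.promote("ready")
print(change.change_id, change.status, change.files)
```

Because the reason and the status live on the change rather than on each file, a reviewer or build manager downstream handles one object instead of two (or twenty) separate file records.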
Searching for Past Changes
If you don't have a change-based CM tool, you probably don't realize that you tend to look at past changes (and their deltas) a lot more frequently than you do at currently open changes. If it's difficult to do, it's rarely done. But if it's easy to do, you look at it to see how you implemented a change in the past, or more frequently, how someone else introduced a feature or a problem fix. This is especially true if the documentation is not yet up to date. Perhaps you have a strange behavior in your latest build and so you browse through changes to see which one is a potential candidate for the behavior. Maybe you're looking through all of the changes made involving a particular file. This is sometimes a timesaver, but more often it’s an added value capability-enabler.
Delivering to a Customer
The customer has release 2.2.4 and you're about to deliver release 3.1. The customer needs to know what problems have been fixed, and what features have been implemented. It's fine to wait until marketing produces its release notes and then you may pass these on to the customer. But it's much better as you approach the end of the implementation cycle to be able to tell each customer what's coming down the pipe, which of his requests have been addressed, which have not, and to receive feedback if there are any critical problem fixes that have not been covered. You still have time to address a couple of these. You have an extra interaction with your customer and the customer feels that the product is evolving with their concerns at hand. It's not just the next version of the product.
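A per-customer preview of this kind is essentially a comparison of request lists. The Python sketch below assumes the CM tool can report, for any release, the set of requests (problem reports and features) it resolves; the function name and the request identifiers are hypothetical.

```python
def customer_preview(customer_requests, installed_release, upcoming_release):
    """Split a customer's requests into those the upcoming release newly
    addresses and those still outstanding.

    installed_release and upcoming_release are the sets of request
    identifiers resolved in each release (an assumed representation).
    """
    addressed = [
        r for r in customer_requests
        if r in upcoming_release and r not in installed_release
    ]
    outstanding = [r for r in customer_requests if r not in upcoming_release]
    return addressed, outstanding


# Hypothetical traceability data for releases 2.2.4 and 3.1.
release_2_2_4 = {"PR-10", "FR-3"}
release_3_1 = {"PR-10", "FR-3", "PR-12", "FR-7"}
requests_from_customer = ["PR-12", "PR-15", "FR-7"]

addressed, outstanding = customer_preview(
    requests_from_customer, release_2_2_4, release_3_1
)
print("addressed:", addressed)
print("outstanding:", outstanding)
```

The outstanding list is the one to discuss with the customer while there is still time to act on it.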
If you're trying to sell your CM tool as just a little bit of overhead, it's going to be a tough sell. If you're going to make life easier for the user, or give the user new capabilities, or make the user look good when dealing with customers, you just have to identify the benefits and maybe he'll come running to you for your proposed solution.
The March to Quality
There's no way around it. If you want quality, you have to keep things simple, and you have to automate to ensure reproducibility and to eliminate human error. Here are a few of the areas I see that need attention in the CM industry at large. Some tools have a much better record at dealing with them than others.
User Context
A CM tool needs to be many things to a very wide range of users. This begs for complexity. The best way to simplify is to know the user's context. This may involve:
· What products can the user work on?
· What role(s) does the user play?
· What product and release is the user working on?
· What was he working on the last time he used the tool?
· What are the user's workspaces and how does each relate to his work?
This is a huge starting point. All of a sudden the CM tool can reduce the menus to reflect the user roles. It can reduce the set of data the user has to browse through. It can set intelligent defaults on forms (e.g., automatically populating the user, product and development stream on a new change). It can identify the appropriate set of to-do lists/inboxes that the user needs to look at. It can allow the user to work through normal change control operations without having to look at revision numbers or labels.
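Here is a minimal Python sketch of how user context might drive menus and form defaults. The `UserContext` fields, the role names and the menu entries are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class UserContext:
    """Hypothetical working context remembered by the CM tool."""
    user: str
    roles: set
    product: str
    stream: str


# Hypothetical mapping of menu entries to the roles allowed to see them.
MENU_ROLES = {
    "Check out file": {"developer"},
    "Create build": {"build manager"},
    "Review change": {"developer", "reviewer"},
}


def visible_menu(ctx):
    """Return only the menu entries relevant to the user's roles."""
    return sorted(item for item, roles in MENU_ROLES.items() if roles & ctx.roles)


def new_change_defaults(ctx):
    """Pre-fill a new change form from the user's context."""
    return {"owner": ctx.user, "product": ctx.product, "stream": ctx.stream}


bob = UserContext("bob", {"developer"}, "api-lib", "rel3")
print(visible_menu(bob))
print(new_change_defaults(bob))
```

Bob never sees the build manager's menu entries, and a new change is already filled in with his product and stream before he types anything.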
Product Information
There's a lot of information that a lot of CM tools simply don't bother to capture. This results in additional complexity. For example, your CM tool should be able to tell you:
· Who is the primary owner of each data record?
· How do products relate to/depend on one another?
· How do I find the source tree for each product?
· What is the development history for the product?
Now this list could go on forever as many requirements spawn data schema requirements. But some of these are more crucial to the simplicity of the user interface than others. Given the development history for the product, the CM tool can recommend to the user where branches are to be branched from. Given owner information, data can be restricted or presented in the user-preferred order more easily.
Basic CM Process
If you really want to simplify and automate the CM function, you need to have your process reflect what is being done and you need your CM tool to capture the data needed to support the process and its automation. Here are a number of ways your process/tool combination can help. Not all of these capabilities will be possible with all tools, but perhaps someday they will be.
Change-based CM
The industry already agrees this is good, but it's actually more than good, it's crucial. You will not be able to build a successful CM tool with file-based CM. You need to be able to track changes and change files only against these changes. The changes themselves will contain the traceability information necessary to interact with the rest of the ALM functions.
Select changes for a build, roll-back changes, add changes to a baseline, propagate a change from one development stream to another. If you're doing these operations on a file-by-file basis, you're on shaky ground and the day will come when you put half a change into a release and your customer pays for it. If your IDE plug-ins work on a file-based mechanism only, you'll need your vendor to improve them or you'll need some extra steps in your process to help ensure that changes are properly defined.
Change Context and Change Dependencies
One step better is to ensure that the change contains adequate information so that it can flow through the system automatically. The change context identifies the product and development stream for the change. This is vital to ensure that the correct changes are delivered to the correct build teams and processes. Ideally, this context information is automatically added to the change based on the user's working context. Change dependencies will be both implicit and explicit. The system can identify implicit dependencies based on the order of file revision changes. The user can add explicit dependencies to ensure that changes are delivered in the right order, where necessary.
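One way to picture implicit and explicit dependencies is the Python sketch below. It assumes the repository can list file check-ins in order, each tagged with its change identifier, and it treats a change as implicitly dependent on whichever change produced the previous revision of the same file. The function name and data are hypothetical; `graphlib` is part of the Python standard library from version 3.9.

```python
from graphlib import TopologicalSorter


def implicit_dependencies(checkins):
    """Derive change dependencies from the order of file revisions.

    checkins is a list of (path, change_id) pairs in check-in order.
    A change depends on the change that made the previous revision
    of any file it touches.
    """
    last_change_for_file = {}
    deps = {}
    for path, change_id in checkins:
        deps.setdefault(change_id, set())
        previous = last_change_for_file.get(path)
        if previous is not None and previous != change_id:
            deps[change_id].add(previous)
        last_change_for_file[path] = change_id
    return deps


# Hypothetical check-in history.
history = [
    ("api.c", "C-1"),
    ("api.h", "C-1"),
    ("api.c", "C-2"),
    ("util.c", "C-3"),
    ("api.c", "C-3"),
    ("guide.txt", "C-4"),
]
implicit = implicit_dependencies(history)

# The user adds one explicit dependency: C-4 documents what C-3 introduced.
combined = {change: set(d) for change, d in implicit.items()}
combined["C-4"].add("C-3")

# A valid delivery order places every change after the changes it needs.
delivery_order = list(TopologicalSorter(combined).static_order())
print(delivery_order)
```

Only direct dependencies are recorded; the ordering step takes care of the transitive ones.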
Change Promotion
Branches are overused. If you're using branching to define your change promotion model, you could be doing better. Unfortunately few tools support context views based on change status (i.e. promotion level). If you have one of those tools, you will simplify your branching structure and eliminate a lot of branching, merging and labeling activity.
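To show what a context view based on change status could look like, here is a simplified Python sketch. The promotion level names and the revision records are hypothetical, and the sketch deliberately ignores the case where a promoted revision was built on top of an unpromoted one, which a real tool would have to handle.

```python
# Hypothetical promotion levels, lowest to highest.
LEVELS = ["open", "ready", "integrated", "verified"]


def view_at(revisions, level):
    """Compute a view without any promotion branches.

    revisions is a list of (path, revision_number, change_status).
    For each file, the view holds the latest revision whose change
    has reached the requested promotion level or higher.
    """
    threshold = LEVELS.index(level)
    view = {}
    for path, revision, status in revisions:
        if LEVELS.index(status) >= threshold:
            view[path] = max(revision, view.get(path, 0))
    return view


revisions = [
    ("api.c", 1, "verified"),
    ("api.c", 2, "integrated"),
    ("api.c", 3, "open"),
    ("api.h", 1, "verified"),
]
print(view_at(revisions, "verified"))
print(view_at(revisions, "integrated"))
print(view_at(revisions, "open"))
```

All three views are computed from the same single line of revisions; nothing was branched, merged or labeled to produce them.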
Changing Tree Structure
Adding files and directories, moving files to new directories, removing files from the tree: these are examples of changes. Rather than being made in the source code, they are made to the directories. And just like source file modifications, a change may also have tree structure modifications. If your tool tracks these, great: you're way ahead of the game. If your tool allows you to specify these changes without having to manually check out and create new directory revisions, all the better. When you have to create the directory revisions, you're implicitly introducing a dependency on change order which probably would not otherwise have to occur.
Reduce Branching and Manual Labeling
Branching and labeling run rampant in some CM processes. The reason is that the underlying tools have a branching mechanism, but no other mechanisms to deal with the requirements of your process. If you have to branch to create promotion lineups, to identify short term parallel checkouts, to collect files into a change or to define a baseline or build definition, you'll likely have a lot of labeling to do as well, and a lot of scripting to help automate that labeling. Preferably you have underlying technology which can deal with baselines, builds and changes as first order objects. Hopefully you have technology that can compute views and deal with parallel checkouts without forcing unnecessary branching into the picture. A runner-up capability might be one that helps those branches disappear when they no longer add value to the CM repository.
Automated Branching
Your CM repository is tracking all of your development branches, where they've evolved from, etc. It also knows the branching information for each file, directory and ideally the product as a whole. It should let you share unmodified code from earlier streams in a later stream and let you inherit changes from the earlier stream automatically if you so choose. If your branching is well structured, your CM tool should be able to tell you at checkout time whether or not you need to branch and should also identify the branch point for you. It should be automated, without branching strategies to learn. You shouldn't even have to look at the file revisions when you do a checkout operation. There should be enough information in the CM tool to allow this to happen and that's been the case with the tools I've used over the past 20 years. And to go one step further, because the CM repository contains all of the history, it should let you know when you make a change to one branch that may have to be applied to other branches.
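The checkout-time decision can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: streams form a single linear ancestry, a later stream shares a file with the nearest earlier stream until it is modified, and the stream names and revision identifiers are hypothetical.

```python
# Hypothetical stream ancestry: each stream evolved from its parent.
STREAM_PARENT = {"rel1": None, "rel2": "rel1", "rel3": "rel2"}


def checkout_plan(file_revisions, stream):
    """Decide at checkout time whether a branch is needed.

    file_revisions maps a stream to the latest revision of the file
    made in that stream. Returns (branch_needed, base_revision).
    """
    current = stream
    while current is not None:
        if current in file_revisions:
            # A revision shared from an earlier stream must be branched.
            return (current != stream, file_revisions[current])
        current = STREAM_PARENT[current]
    raise LookupError(f"no revision visible from stream {stream}")


def streams_needing_propagation(file_revisions, stream):
    """Later streams that already have their own branch of the file will
    not inherit a change made in `stream` automatically."""
    result = []
    for candidate in STREAM_PARENT:
        parent = STREAM_PARENT[candidate]
        while parent is not None and parent != stream:
            parent = STREAM_PARENT[parent]
        if parent == stream and candidate in file_revisions:
            result.append(candidate)
    return result


api_c = {"rel1": "1.4", "rel2": "2.1"}
print(checkout_plan(api_c, "rel2"))                # already branched for rel2
print(checkout_plan(api_c, "rel3"))                # shared from rel2, so branch
print(streams_needing_propagation(api_c, "rel1"))  # rel2 has its own branch
```

The developer supplies only the stream he is working in; the branch decision, the branch point and the propagation warning all come from history the repository already holds.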
Baseline Trees versus Collections
As an added benefit, and going back to change-based CM, branching of directory objects (assuming your tool supports these) should be automated. The changes are defined in the set of software updates you've prepared. You should only have to request a new baseline from the root (or any sub-tree for that matter) and new revisions of directories should be created according to the updates. In some cases, directories may have to be branched. A tree-based baseline structure integrated with a change-based CM system can eliminate all manual directory branching decisions and tasks. If your baselines are defined simply as a set of revisions with directory specification stored against each, rather than through a revised file tree structure, this may be difficult to accomplish.
Configuration View Flexibility
If you have to define a baseline in order to navigate a configuration, you're very limited in your configuration view capabilities. Many tools allow you to specify what's in your view without having to create a baseline which defines that view. In cases where the specification is rule-based, the view changes as the data changes. You can even go one step further if your CM tool will automatically show you your preferred view based on your context, without you having to specify any rules. Your tool should let you wander from one configuration to another as necessary, to look at the contents of a specific build, to view parallel development streams, etc. The quality of your decisions is going to be only as good as the quality of the information you're viewing. And if you're looking at an "almost the same" configuration because it’s too difficult to look at the actual one, you're asking for trouble.
Stream-Based Main Branches
Many of you have had debates over what's better, a single main branch or a main branch for each persistent development stream. Although a single main branch may seem simpler, it breaks the rule "as simple as possible, but not too simple". As a result, the process required to support a single main branch is much more complex, introducing artificial dates for starting and ending a stream's tenure in the main branch. It also requires three different processes for dealing with a development stream: before, during and after its tenure as the main branch. Stream-based main branches are simpler: one main branch per stream means the process for that stream is always the same and there are no artificial start and end dates. This in turn leads to easier automation of the CM process.
Nightly Build Automation
If your process and/or tool does not allow you to fully automate your nightly (or other) builds, they could use some refinement. What's going into the build? What tools are being used? Where are the results being placed? Who's being notified on errors/completion? Your tools and processes should allow all of this to be fully automated. Perhaps "what's going into the build" is the most difficult. Some simplify by using "whatever's submitted to the repository". This is dangerous, as it forces users to hold completed code back from the repository if they don't want it going into the build. A change promotion model works better - all changes at a given promotion level or higher go into the build for that promotion level. You may have builds at various promotion levels. If your tool permits and your project is very large, you may even perform incremental builds automatically. If you can't do automatic nightly builds, take another look at your CM process and data/tools.
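The four questions above can be captured in a single build definition. The Python sketch below is a hypothetical illustration: the `BuildDefinition` fields, the promotion level names, the toolchain label, the paths and the e-mail address are all assumptions, and actually running the compiler or sending notifications is left out.

```python
from dataclasses import dataclass

# Hypothetical promotion levels, lowest to highest.
PROMOTION_ORDER = ["open", "ready", "integrated", "verified"]


@dataclass
class BuildDefinition:
    stream: str            # which development stream is built
    promotion_level: str   # what goes in: changes at this level or higher
    toolchain: str         # what tools are used
    output_dir: str        # where the results are placed
    notify: list           # who hears about errors/completion


def changes_for_build(changes, build):
    """Select the changes that belong in a build at its promotion level."""
    minimum = PROMOTION_ORDER.index(build.promotion_level)
    return [
        c["id"]
        for c in changes
        if c["stream"] == build.stream
        and PROMOTION_ORDER.index(c["status"]) >= minimum
    ]


nightly = BuildDefinition(
    stream="rel3",
    promotion_level="ready",
    toolchain="gcc-toolchain",
    output_dir="/builds/rel3/nightly",
    notify=["build-team@example.com"],
)
changes = [
    {"id": "C-1", "stream": "rel3", "status": "verified"},
    {"id": "C-2", "stream": "rel3", "status": "open"},
    {"id": "C-3", "stream": "rel2", "status": "ready"},
    {"id": "C-4", "stream": "rel3", "status": "ready"},
]
selected = changes_for_build(changes, nightly)
print(selected)
```

Change C-2 is safely checked in to the repository but stays out of the nightly build until its owner promotes it, which is exactly the behavior "whatever's submitted" cannot give you.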
This is a partial list, but a good starting point for introducing quality into your configuration management process.
What Else?
The difference between levels 4 and 5 of the Capability Maturity Model (from the Software Engineering Institute at Carnegie Mellon University) is that a level 5 process is continually optimizing itself. If you want to achieve that level, you need tools that permit you to optimize your process. The easier it is to optimize, the faster you'll move along the quality curve. There are three things to look for to support your process optimization efforts:
(1) Easy to customize the solution. Whatever solution you select for CM, it must be easy to customize to meet your needs. You have a process in mind because of your requirements. If the tools can't support that process, you'll either never get there or you'll spend a significant portion of your time either doing things manually or building your own tools. Neither is recommended. Make sure your CM solution is easy to customize to your needs.
(2) Easy to incrementally add customizations. This is not a lot different from the previous one, but the fact is, your needs are going to change. Not only that, but you might want to start using your new solution before it's fully customized to meet all of your current requirements. In both cases, you'll need to customize your solution incrementally. Ideally, you can create a customization, apply it, and roll it back if it creates unwanted side-effects. If you have to take your user base off line when you do a customization, you'll either be working a lot of late hours, or reduce your customization flexibility. On top of that, consider the case where you have your process and tools replicated over multiple sites. Ideally you can make a change to all sites at the same time. Better yet, you can use your CM tool as the repository for your customizations so that when a change is made, it is replicated at all of the sites.
(3) Integration of data and user interface. The customization of process is going to be very difficult if your CM data is scattered among several repositories and/or if your user interface is different for each part of the process. Yes, the buttons and actions are going to be different, but if you need different tool-level expertise to customize each of the different management areas, you'll have a greater learning curve and a higher potential for error. Look for a single user interface across your ALM functions. And look for a single repository that can handle data revisions as well as file revisions across all functions.
Summary
This seems like a bit of work or maybe even a lot of work. OK, it could be a huge effort. The good news is that most of this has already been done. You don't have to start from scratch. There are consultants and vendors out there that can help you. They've been there before and their experience is really priceless. Trial and error, learning from your mistakes, is a great way to learn, as long as it's not at the expense of your product team. It does not have to be an expensive proposition. In fact, because you're making everyone's job easier and increasing product quality, it's well worth the investment. If you can't convince management to invest, keep marching. Take smaller steps and make the improvements to your current solution that will provide the biggest benefits to the greatest number of people.
But if you're the one making the improvements, you might want to make improvements that give you more time first!