Present Day and Future CM Tool Features: An Interview with Steve Berczuk

Summary:

We're sitting down with CM experts to discuss not just their backgrounds in the field, but what their favorite tools are and why. This week, Steve Berczuk provides a great deal of information on what tools he prefers to use today, and what he would love to see from them in the future.

Noel: What's the primary tool you're working with, and how long have you been working with it?

Steve: I'm currently using Git in my day-to-day project work. Until recently Subversion was the tool that I used most frequently, since about 2002.

Noel: What tools do you have past experience with, and what made you switch to your current favorite?

Steve: I've used whatever tools the organization I was working at had in place. In most cases this was Subversion. I've also used Perforce, MKS, StarTeam, CVS, SCCS, and even Visual SourceSafe.

To say that I have a "favorite" would not be totally accurate. When things got challenging with a given tool, I tried to figure out how best to use the tool in our context. In most cases it was possible to make small changes in how we worked to get around problems, though some tools were easier to use than others. I've been happy with Subversion, though Git makes some things easier.

The most important thing for me is that the VCS supports the workflow that the team uses and that it integrates well with the other tools in the development ecosystem, like IDEs, issue tracking, and continuous integration systems. A version control system should, in many ways, be almost invisible most of the time. Most modern tools integrate well enough for that not to be a big issue.

The bigger challenge is that the tool enables a work style that works well for the team. In some cases "works well for the team" means that the tool supports the workflow the team already has. In others, it can mean that the tool prefers a workflow that the team can adopt.

Most problems that people have with source code management systems stem from a mismatch between the model the tool has and the model the team wants to use. In some cases, the team's model is the right one for the project, and they really should be using a different tool. In others, the team's vision of how they want to work isn't optimal or hasn't really been thought out. Rather than leveraging the framework the tool uses, they force a workflow onto the tool.

Since version control tools are something everyone on the team interacts with many times a day, any overhead the tool adds has a high cost.

Noel: What allows a tool to be scalable or adaptable to other organizations or companies?

Steve: Almost any tool can be adapted to the needs of an organization with enough thought and effort. The question is how much thought or effort you want to apply.

For example, when I first started working, I was part of a distributed engineering team, and one roadblock to collaboration was network latency when accessing the code repository. Having the repository local to us meant that the other team would face slow response times. Having it local to them meant our team would be slowed down.

What we wanted was for each team to be able to "own" their part of the repository while having fast read-only access to the other team's code. We ended up developing our own infrastructure for a distributed versioning system using SCCS and Unix tools. It wasn't perfect, but it solved the problem.

Now, between better infrastructure (so having a remote centralized repository is not really a bottleneck) and DVCS tools, there is less of a need to build your own supporting tools. You still need to think about the best way for a team to work, and what could get in the way of efficient development.

So, if we were working on the same project today, we could consider a DVCS, but a centralized VCS with a fast, reliable network connection could work just as well.

Noel: What do CM tools need to be equipped with to be able to be implemented on an agile project? 

Steve: Since agile is about individuals and interactions more than tools, SCM tools that work best in agile environments are the ones that are most transparent. Integration with the IDEs and build systems the team uses needs to be easy.

Since a canonical agile team wants to integrate frequently and keep code working through tests and good software craftsmanship, one could argue that a tool that discourages branching would be best, arguing against a DVCS like Git. But it makes no sense to restrict the power of your tools to force a work style when you can enforce good practices in other ways. And tools like DVCSs have some useful features.

Much like how languages have evolved to make some design patterns trivial, some tools implicitly implement some SCM Patterns. For example, the distributed staging model of Git gives you the "Private Versioning" pattern for free. With Subversion you'd need to do a bit of work, but it's still possible.

I've often said that the best tool a team can have to improve their version management process is good tests. If you have frameworks in place to keep your main line working so that developers can update and commit code frequently, you can avoid most of the challenges that VCS tools attempt to solve, and sometimes cause.

Noel: Are there any features that can be problematic?

Steve: One issue with tool features is that as tools make it easier to do things, you need to actually avoid doing them if they don't work with your model.

For example, Git makes creating branches easy, but you can still get into situations that cause problems if you branch without thinking through your workflow.

If you create too many branches, defer integration, and don't keep the branches you will eventually merge back in sync with the trunk as you go, you can run into trouble. But the good news is that if a workflow with task branches makes sense, you don't have to avoid it because of the constraints of your tool.

Noel: What future capabilities do CM tools perhaps not have today, but would be nice to have in the future?

Steve: The set of features that I'd most like to see are those that connect changes with meaning, and which treat code as something other than simply loosely associated text documents.

For example, tools that understand code as something other than a stream of text would make it easier to understand the meaning of a change, and make merging easier when you need to merge. There has been some research on refactoring-aware SCMs that Danny Dig at UIUC has worked on. At least one company, Codice Software, has introduced a semantic merge tool.
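To make the difference between "a stream of text" and "code" concrete, here is a minimal sketch in Python. It is only an illustration: it uses the standard `ast` module to compare parsed structure, and it says nothing about how the refactoring-aware research tools or Codice Software's merge tool actually work. The function names and sample snippets are assumptions made up for this example.

```python
import ast
import difflib


def text_changed(old_src, new_src):
    """True if the two sources differ when treated as plain text."""
    return old_src != new_src


def structure_changed(old_src, new_src):
    """True if the two sources differ after parsing.

    ast.dump() omits line/column attributes by default, and the parser
    drops comments and redundant parentheses, so layout-only edits
    compare as equal.
    """
    return ast.dump(ast.parse(old_src)) != ast.dump(ast.parse(new_src))


def text_diff(old_src, new_src):
    """Line-based diff, roughly what a text-oriented VCS would show."""
    return list(
        difflib.unified_diff(old_src.splitlines(), new_src.splitlines(), lineterm="")
    )


# Hypothetical sample versions of one function.
OLD = "def area(w, h):\n    return w * h\n"
REFORMATTED = "def area(w,h):\n    # width times height\n    return (w * h)\n"
EDITED = "def area(w, h):\n    return w + h\n"

if __name__ == "__main__":
    for label, new in (("reformatted", REFORMATTED), ("edited", EDITED)):
        print(label, "text changed:", text_changed(OLD, new),
              "structure changed:", structure_changed(OLD, new))
```

A text-based diff reports both the reformatted and the edited version as changes, while the structural comparison flags only the one where the behavior differs. A real semantic tool would need to go much further than this, for example by tracking renames and moved methods.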

Also, anything that helps connect the details of the code change with the reason for the change would be nice. A common approach now is to add issue numbers in commit comments. This allows your issue tracking system to list all of the changes that went into an issue by parsing commit messages. But there is still room for user error, and it can be awkward to correct mistakes. For example, if you type in the wrong issue ID in a Git commit, but don't realize it for a while, changing the commit message can be a challenge.
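As a rough sketch of what "parsing commit messages" can look like, the Python below pulls issue keys out of commit messages and groups commits by issue. The `PROJ-123` key format, the function names, and the sample commits are all assumptions for illustration; real issue trackers define their own conventions and do this work internally.

```python
import re
from collections import defaultdict

# Assumed convention: issue keys look like "PROJ-123". This format is
# hypothetical; the interview does not specify one.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")


def extract_issue_keys(message):
    """Return issue keys found in a commit message, in order, without duplicates."""
    seen = []
    for key in ISSUE_KEY.findall(message):
        if key not in seen:
            seen.append(key)
    return seen


def group_commits_by_issue(commits):
    """Map each issue key to the ids of the commits that mention it.

    `commits` is an iterable of (commit_id, message) pairs. Commits with
    no recognizable key are collected under None so they can be reviewed.
    """
    by_issue = defaultdict(list)
    for commit_id, message in commits:
        keys = extract_issue_keys(message)
        if not keys:
            by_issue[None].append(commit_id)
        for key in keys:
            by_issue[key].append(commit_id)
    return dict(by_issue)


if __name__ == "__main__":
    # Made-up sample data.
    sample = [
        ("a1b2c3", "PROJ-12: fix off-by-one in pager"),
        ("d4e5f6", "Refactor pager tests (PROJ-12, PROJ-15)"),
        ("0a9b8c", "Tidy whitespace"),
    ]
    for issue, ids in group_commits_by_issue(sample).items():
        print(issue, ids)
```

Note what this cannot do: a commit that carries a well-formed but wrong key is filed under the wrong issue without complaint, which is exactly the kind of user error described above.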

Some tools provide tighter integration with issue tracking systems, but tighter integration between the systems out of the box would be nice.

Steve Berczuk is a Principal Engineer and ScrumMaster at Fitbit in Boston, MA. The author of Software Configuration Management Patterns: Effective Teamwork, Practical Integration, he is a recognized expert in software configuration management and agile software development. Steve is passionate about helping teams work effectively to produce quality software. He has an M.S. in operations research from Stanford University and an S.B. in Electrical Engineering from MIT, and is a certified, practicing ScrumMaster.
