Principles of Agile Version Control: From OOD to TBD

Summary:
In this article, the authors discuss the principles of version control that help enable agile development. With an understanding of the principles of object-oriented design, as well as the principles of agile development, they derive the principles of agile version control. The focus is on the principles of object-oriented design (OOD) and how they can be used to derive corresponding version control principles for task-based development (TBD).

In this article, we will discuss the principles of version control that help enable agile development. With an understanding of the principles of object-oriented design, as well as the principles of agile development, we can derive the principles of agile version control. We will focus on the principles of object-oriented design (OOD) and how we can use them to derive corresponding version control principles for task-based development (TBD).

Symptoms of Poor Version Control

If agile and iterative development constantly integrate, build and test software in very small increments, how can we ever have stable baselines or codelines? How can we ever propagate fixes or enhancements from legacy releases to the current under-development release and make sure we aren't doing too much branching, merging and baselining? How can we make sure that the branching, merging and baselining that we are doing follows a simple, yet structured fashion that safeguards the integrity and reproducibility of the software without hindering productivity and development creativity?

Robert Martin describes several symptoms of poor design in [1] that translate quite readily into symptoms of poor version control:

  • Rigidity/Inertia: The software is difficult to integrate and deploy/upgrade because every update impacts, or is impacted by, dependencies upon other parts of the development, integration, or deployment environment.
  • Fragility/Inconsistency: Builds are easily “broken” because integrating new changes impacts other fixes, features, or enhancements that are seemingly unrelated to the change that was just made, or because changes keep disappearing and reappearing, or are difficult to identify/reproduce.
  • Immobility/Inactivity: New fixes, features, or enhancements take too long to develop because configurations and their updates take a long time to integrate and propagate or build and test.
  • Viscosity/Friction: The “friction” of the software process against the development flow of client-valued change is too high because the process has an inappropriate degree of ceremony or control.
  • Needless Complexity/Tedium: Procedures and tasks are overly onerous and/or error prone because of too many procedural roles, steps, and artifacts; overly fine-grained “micro” tracking and status accounting; or overly strict enforcement of a rigid and inflexible workflow.
  • Needless Repetition/Redundancy: The version-control process exhibits excessive branching and merging, workspaces, or baselining in order to copy the contents of files, changes and versions to maintain multiple views of the codebase.
  • Opacity/Incomprehensibility: It is difficult to understand and untangle the branching, merging, and baselining schemes into a simple and transparent flow of changes for multiple tasks and features developed from multiple collaborating teams that are working on multiple projects from multiple locations for multiple customers.

Why do these things happen? They may happen for a host of reasons. Perhaps the practitioners involved haven't yet learned or appreciated the importance of sound version control principles and practices, and haven't suffered the consequences of not following them. Or perhaps they have suffered the consequences but don't understand why. The problem may be blamed on the wrong thing, leading practitioners to try to prevent change when they simply need to make change easier. Or practitioners may overcompensate for something “bad” that happened by going too far to the other extreme.

What's a seasoned veteran version-control practitioner to do?  They do what they've always done! They use sound practices and judgment honed from years of experience to recognize these symptoms and their underlying problems, identify which principles have been violated, and then apply the appropriate SCM patterns [2] for their particular context.

Many of the practices that help us recognize and resolve these problems come from what is commonly called change-based (or change-oriented) versioning and task-based development. The corresponding principles have much in common with the principles of object-oriented design.

Principles of Object-Oriented Design

 

The principles of object-oriented design [1] address the need to minimize the complexity and impact of change by minimizing dependencies through the use of loosely coupled and highly cohesive classes, interfaces, and packages. These are the logical containers of software functionality, and they are realized in the “physical” containers of version control (configuration elements), which exist in the form of files, libraries, and components. The principles are as follows:

Principles of Class Design

  • SRP, The Single Responsibility Principle: A class should have one, and only one, reason to change.
  • OCP, The Open-Closed Principle: A class should be open for extension (of its behavior) but closed against modification (of its contents).
  • LSP, The Liskov Substitution Principle: Derived classes must be substitutable for (require no more, and ensure no less than) their base classes.
  • DIP, The Dependency Inversion Principle: Depend on abstract interfaces, not on concrete details.
  • ISP, The Interface Segregation Principle: Make fine grained interfaces that are client specific.
  • LOD, The Law of Demeter (Principle of Least Assumed Knowledge): Objects and their methods should assume as little as possible about the structure or properties of other objects (including its own subcomponents). [3]
  • DRY, The "Don't Repeat Yourself" Principle: Every piece of knowledge should have one authoritative, unambiguous representation within the system. [4]

Principles of Package Design

  • REP, The Release Reuse Equivalency Principle: The granule of reuse is the granule of release.
  • CCP, The Common Closure Principle: Classes that change together are packaged together.
  • CRP, The Common Reuse Principle: Classes that are used together are packaged together.

Principles of Package Coupling

  • ADP, The Acyclic Dependencies Principle: The dependency graph of packages shall contain no cycles.
  • SDP, The Stable Dependencies Principle: Depend in the direction of stability.
  • SAP, The Stable Abstractions Principle: Abstractness increases with stability.

We wish to derive similar principles for version control using the logical containers of change-based versioning and task-based development:

  • Changes
  • Workspaces (sandboxes)
  • Versions (labels/tags)
  • Configurations (builds & baselines)
  • Streams (codelines)

The resulting version control principles should address the need to minimize the complexity and impact of change by minimizing dependencies through the use of loosely coupled and highly cohesive, correct and consistent changes, versions and codelines. We must be mindful of the notion of time when deriving these principles in order to properly and accurately translate their meaning from the domain of classes and packages to the domain of version control.

In addition to terms like classes and packages, we will also need to translate the meaning of terms like abstract, interface and stability into the domain of version control:

  • Classes and packages might correspond to any of the aforementioned types of entities (changes, workspaces, versions, configurations, codelines). An instance of a particular class is an object.  Each object has its own behavior, state (its data or contents), and unique identity. Previously published notions of container-based/component-based SCM are a precedent for this frame of thinking (see [5]-[8]).
  • An interface might be some set of identifying characteristics or acceptance criteria that is common to a set of items (e.g., instances of configurations), or to a particular state of an item.
  • Abstraction/abstractness is somewhat different from the above. The more broadly used a container is, the more it serves as a basis for subsequent ongoing evolution, just as the more abstract a class is, the more it serves as a basis for subsequent derivation. This relates to the liveness or velocity of the flow of change/activity for an item's contents. A more conservative flow tries to conserve present value, while a more progressive flow attempts to add value at a faster rate of progress.
  • Stability corresponds to the dependability of an abstract container or the reliability or safety with which one can depend upon it to preserve the interface or invariants of the container. This directly relates to the level of quality assurance of an item. The stability of a container is the extent to which its contents are of consistently reliable quality.
  • Change is Evolution! Most times when the principles of OOD refer to change or modification, it means evolution in the context of version control.

With the above terminology translations in effect, we are now prepared to attempt translating the principles of OOD into version control principles, and to explore their validity and applicability.

The Principles of Task-Based Development (TBD)

To derive these principles, we need to identify the objects that correspond to the version-control abstractions and which dependencies we wish to minimize and manage. Each version-control object is a container that encapsulates its associated state (data/contents). The key behavior a container exhibits is that of evolution: either the container's contents are allowed to change, while still being the same container, or else the container (but not its contents) undergoes a change of state as it evolves. Any time a container evolves, there is some set of characteristics (its interface) that it tries to preserve within its own context. The part that is unchanged by the evolution operation is an important part of the context for the object.

  • Changes take place within the context of a workspace (sandbox) in which elements are created/revised and then built/tested before the change is considered complete.
  • Workspaces (sandboxes) are populated, and then subsequently updated, with the appropriate "base view" of files & versions from the codeline.  Some of these are modified or new files created as part of the change that is to be committed (checked-in) to the codeline.
  • Configurations of the codeline are created every time a codeline or workspace is updated. Each change is against an existing current configuration and results in a new configuration that, when checked-in, becomes the new current configuration of the codeline.
  • Baselined versions are created when we tag/label a checked-in configuration with a meaningful version name that corresponds to the contents of that configuration at that point in time. Once identified, these baselined configurations typically undergo one or more levels of quality assurance before they can be ultimately released.
  • Codelines (streams) encapsulate an evolving current configuration of a set of files and other objects that are eventually built and tested together. A codeline represents one flow of changes to an evolving configuration.
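The relationships among these containers can be sketched as a small data model. This is only an illustration under stated assumptions: the class and field names (Configuration, Change, Codeline, task_id, and so on) are hypothetical and do not come from any particular version-control tool, and workspaces are left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Configuration:
    """An immutable snapshot, stored as sorted (path, revision) pairs."""
    revisions: tuple

    def as_dict(self) -> Dict[str, int]:
        return dict(self.revisions)


@dataclass(frozen=True)
class Change:
    """One development task's worth of edits, as (path, revision) pairs."""
    task_id: str
    edits: tuple


@dataclass
class Codeline:
    """A stream whose current configuration evolves one change at a time."""
    name: str
    history: List[Configuration] = field(default_factory=list)
    baselines: Dict[str, Configuration] = field(default_factory=dict)

    @property
    def current(self) -> Configuration:
        return self.history[-1]

    def commit(self, change: Change) -> Configuration:
        """Apply a change to the current configuration, yielding a new one."""
        merged = self.current.as_dict()
        merged.update(dict(change.edits))
        new_config = Configuration(tuple(sorted(merged.items())))
        self.history.append(new_config)
        return new_config

    def baseline(self, label: str) -> Configuration:
        """Tag the current configuration with a meaningful version name."""
        self.baselines[label] = self.current
        return self.current


# Hypothetical usage: two tasks committed, with a baseline taken in between.
main = Codeline("main", history=[Configuration((("app.c", 1),))])
main.commit(Change("TASK-1", (("app.c", 2), ("util.c", 1))))
main.baseline("REL_1_0")
main.commit(Change("TASK-2", (("util.c", 2),)))
```

In this sketch the baseline keeps pointing at the configuration it labeled, even after the codeline's current configuration moves on.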

The principles of OOD refer almost exclusively to classes or packages, but we will need to determine how they apply to all of the above. Some OOD principles may apply to more than one of these version-control objects, possibly resulting in multiple principles. We must keep this in mind as we derive our principles of version-control. A first pass through the principles would seem to suggest the following:

Single Responsibility - Cohesive Flow

The Single Responsibility Principle for OOD says that, “a class should have one and only one reason to change.” This is a statement of cohesion for a class, that it should encapsulate a single coherent and cohesive responsibility. Restating this in terms of version-control would yield: A container should have one, and only one, reason for its state to change. This is a statement about cohesive flow of evolution. Every evolutionary step should happen for a distinct reason and all the things that changed should have been very closely related. This is an extremely generic statement and would probably be a lot easier to understand if we broke it down by the various kinds of containers to which it may apply. We'll do that later but, for now, let's press onward.
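As one possible reading of cohesive flow for a single change, the sketch below treats a change as cohesive when all of its edits share one reason. The mapping from file path to task ID, and the function names, are assumptions made for illustration only.

```python
from typing import Dict, Set


def reasons_for_change(edits: Dict[str, str]) -> Set[str]:
    """Return the distinct task IDs behind a set of edited files.

    `edits` maps a file path to the (hypothetical) task ID that motivated it.
    """
    return set(edits.values())


def is_cohesive(edits: Dict[str, str]) -> bool:
    """A change is cohesive when every edit shares exactly one reason."""
    return len(reasons_for_change(edits)) == 1


# Hypothetical changes: one focused on a single task, one mixing two tasks.
focused = {"parser.c": "TASK-7", "parser.h": "TASK-7"}
mixed = {"parser.c": "TASK-7", "README": "TASK-9"}
```

Here `is_cohesive(focused)` holds while `is_cohesive(mixed)` does not, which suggests the mixed change would be better split into two.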

Open-Closed - Identity Preservation

The Open-Closed Principle for OOD states that a class should be “open for extension, but closed for modification.” Translating this into version-control terms is a little bit tricky. Extension corresponds to deriving or evolving a new entity from an existing entity. If we add to the existing entity, we create a new one that reuses the essential characteristics of the original. But we shouldn’t have to modify the original entity in order to do it. What does that mean?

The original entity or container corresponded to a particular meaning at a certain point in time. Evolving it into something new shouldn’t destroy the existence or history of what it evolved from. The evolved result may be new, but this is version control, and the thing that evolved must somehow maintain its identity or essence. This is a statement about Identity Preservation: A container should be open for evolution, but closed against redefinition. What does this mean? We'll say more about that later, too. Onward again!
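One way to picture "open for evolution, but closed against redefinition" is a label registry that accepts new labels freely but refuses to rebind an existing one. This is a tentative interpretation, not the authors' definition, and the class name LabelRegistry and the configuration identifiers are hypothetical.

```python
class LabelRegistry:
    """Maps baseline labels to configuration identifiers, write-once."""

    def __init__(self) -> None:
        self._labels = {}

    def define(self, label: str, config_id: str) -> None:
        """Bind a label to a configuration; refuse to rebind an existing label."""
        if label in self._labels and self._labels[label] != config_id:
            raise ValueError(f"label {label!r} is already defined")
        self._labels[label] = config_id

    def resolve(self, label: str) -> str:
        """Look up the configuration a label was bound to."""
        return self._labels[label]


registry = LabelRegistry()
registry.define("REL_1_0", "config-41")
registry.define("REL_1_1", "config-57")  # evolution: a new label for a new configuration

try:
    registry.define("REL_1_0", "config-57")  # redefinition: rejected
    redefined = True
except ValueError:
    redefined = False
```

Under this reading, REL_1_0 keeps meaning what it meant when it was created, and newer content gets its own name.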

Liskov Substitution - Evolution Integrity

The Liskov Substitution Principle for OOD says that, “derived classes should be substitutable for their base classes.” In this context, substitutable means “requires no stronger pre-conditions, and ensures no weaker post-conditions.” In version-control terms, this means that derived containers must be substitutable for (require no more, and ensure no less than) their base containers. This is a statement of evolution integrity, but we first need to determine what a derived container is. Within the flow of a codeline, the current configuration is derived from its predecessor configuration, as suggested by Anne Mette Hass in [9]. A derived container might also mean a branch, which inherits its initial content from its branchpoint.
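The "require no more, and ensure no less" test can be sketched with sets. The assumption here is that a configuration's pre-conditions and post-conditions can each be summarized as a set of named items; ConfigContract and the example names are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class ConfigContract:
    requires: FrozenSet[str]  # what consumers must provide (pre-conditions)
    ensures: FrozenSet[str]   # checks the configuration passes (post-conditions)


def is_substitutable(derived: ConfigContract, base: ConfigContract) -> bool:
    """True when derived requires no more and ensures no less than base."""
    return derived.requires <= base.requires and derived.ensures >= base.ensures


# Hypothetical predecessor configuration and two candidate successors.
base = ConfigContract(frozenset({"gcc"}), frozenset({"builds", "unit-tests"}))
good = ConfigContract(
    frozenset({"gcc"}), frozenset({"builds", "unit-tests", "smoke-tests"})
)
bad = ConfigContract(frozenset({"gcc", "libfoo"}), frozenset({"builds"}))
```

The `good` successor could replace `base` on the codeline without surprising anyone who depended on `base`; the `bad` one demands an extra prerequisite and no longer passes the unit tests.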

Dependency Inversion - Container-based Dependency

The Dependency Inversion Principle (DIP) for OOD says to “depend upon abstractions, not concretions” or equivalently “abstractions should not depend upon details; details should depend upon abstractions.” This is a statement about separating policy from mechanism to depend in the direction of increasing abstraction (or decreasing detail).

In the case of abstraction, a version control abstraction might be a baseline or a codeline, which abstracts and encapsulates a single or current configuration from the details of its content. A detail would be a detail about the composition or content of the configuration, or of a particular change. So DIP would mean we should depend upon a specific codeline or labeled configuration, and not upon their specific contents or context. This gives us: Depend upon named containers, not upon their specific contents or context. This is the version control equivalent of saying, "program to an interface, not to an implementation." It is closely related to the Principle of Least Assumed Knowledge or Law of Demeter.
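A minimal sketch of depending on named containers: the consumer records only a baseline label per component, and the file revisions behind that label are resolved at population time. The BASELINES table, the label names, and populate_workspace are all hypothetical.

```python
from typing import Dict

# Hypothetical baselines: label -> {file path: revision}
BASELINES: Dict[str, Dict[str, int]] = {
    "LIBFOO_2_3": {"foo.c": 14, "foo.h": 9},
    "LIBFOO_2_4": {"foo.c": 15, "foo.h": 9, "bar.c": 1},
}


def populate_workspace(dependencies: Dict[str, str]) -> Dict[str, Dict[str, int]]:
    """Resolve each component's named baseline into concrete file revisions."""
    return {
        component: dict(BASELINES[label])
        for component, label in dependencies.items()
    }


# The consumer names a baseline and never lists individual file revisions.
deps = {"libfoo": "LIBFOO_2_3"}
workspace = populate_workspace(deps)

# Moving to a newer baseline changes one name, not a list of file versions.
deps["libfoo"] = "LIBFOO_2_4"
upgraded = populate_workspace(deps)
```

Because the consumer knows only the label, a file added to the component in the newer baseline (bar.c here) arrives without any change to how the dependency is expressed.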

Least Assumed Knowledge - Evolution Insulation

The Law of Demeter is not one of the principles from [1]. It is actually a separate principle that is really more of a style rule for structure shyness in object-oriented programming. In its more general form, it simply says that objects should assume as little knowledge as possible about the structure and data/attributes of other objects, including their own sub-parts. This would translate into a version-control container assuming as little knowledge as possible about the content of other containers, including its own subparts.

DRY - Container Encapsulation

The DRY Principle would seem to translate directly to version control terminology: Each piece of version-control information should have a single unambiguous authoritative representation within the version-control system.

Interface Segregation - Promotion Leveling & Codeline Branching

The Interface Segregation Principle for OOD states, “make fine-grained interfaces that are client-specific.” All consumers of a class should be partitioned into one or more client types and a separate abstract interface should be defined for each type of client the class must serve.

Interfaces are a form of abstraction and for version control we believe that translates into acceptance criteria for the users of a container and also a level of visibility of a container. For version control, interface segregation can be applied to changes, baselines, and codelines:

  • For changes, there are the times when they are private and still in-progress and then are subsequently ready to be promoted in visibility and used in progressively more public scopes of the enterprise.
  • For baselines, their contents do not change. What changes is the level of assurance we have as to their overall quality and readiness for production release/deployment. An in-progress change is just an initial, private and not yet promoted state of a configuration that may eventually be baselined and further promoted.
  • For codelines, their contents do change and evolve, sometimes to the point where it needs to be decided whether or not to branch a new codeline to support a parallel path of evolution.

All of these are really just forms of Visible Incremental Evolution, when restating the general form of ISP in terms of version-control containers: Make fine grained acceptance-criteria that are client-specific.
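Client-specific acceptance criteria can be sketched as an ordered list of promotion levels, each with its own set of checks. The level names and criteria below are hypothetical examples, not a prescribed promotion model.

```python
from typing import Dict, List, Set

# Hypothetical promotion levels, ordered from most private to most public,
# each with the acceptance criteria that its consumers care about.
PROMOTION_LEVELS: List[str] = ["private", "integrated", "qa-approved", "released"]
CRITERIA: Dict[str, Set[str]] = {
    "private": set(),
    "integrated": {"builds", "unit-tests"},
    "qa-approved": {"builds", "unit-tests", "system-tests"},
    "released": {"builds", "unit-tests", "system-tests", "sign-off"},
}


def highest_level(passed: Set[str]) -> str:
    """Return the most public level whose criteria are all satisfied.

    Levels are checked in order, and promotion stops at the first level
    whose criteria are not met.
    """
    level = PROMOTION_LEVELS[0]
    for candidate in PROMOTION_LEVELS:
        if CRITERIA[candidate] <= passed:
            level = candidate
        else:
            break
    return level
```

With these definitions, a configuration that has passed only "builds" and "unit-tests" would be visible at the "integrated" level, while its contents stay exactly the same as it is promoted further.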

Reuse, Release and Evolution Granularity

The OOD principles of reuse-release equivalence, common closure, and common reuse are all about the appropriate granularity of packages of classes. We wish to investigate the corresponding granularity of packages of evolution within a version control environment. We can consider the scope and granularity of a change, a baseline, a codeline, and beyond. In each case, both (re)use and release imply reuse by some consumer and releasing to some target consumer. Once again, the notion of abstractness in OOD translates to the level of visibility of versioned content, which will play a key role.

In version control, the reuse of changes and versions occurs when we view or update (merge/modify) a version of one or more files in our workspace in order to develop, build, test and release our changes. The release of changes and versions occurs when we commit changes to a codeline, transfer changes between codelines, baseline a configuration, or promote a configuration to a new promotion-level. Based on this, we already know a few things about change granularity:

  • The granule of change is not an individual file/checkout, but is rather a single development task that holds changes we then commit to the codeline.
  • The granule of baselining and promotion is not an individual change but is, instead, the unit of integration for building and testing the component or product undergoing change on the codeline, which is the configuration.
  • The granule of progressive collaboration and evolution is the codeline.

Not all of the above require principles. Furthermore, it’s not apparent that each of these three OOD principles will translate to individual version control principles. Instead, we want to see if we can apply these three principles together as a group for each of changes, baselines, and codelines.

Package Coupling - Change-Flow

The package coupling principles of OOD translate almost directly to version-control, with only slight modification.  Sometimes dependency translates into flow, but note that change-flow does not imply dependency. These deal with branching, merging, and the flow and structure of changes across codelines.
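As an illustration of the acyclic-dependencies idea applied to codelines, the sketch below orders a hypothetical dependency graph and reports a cycle if one exists. It assumes Python 3.9 or later for the standard-library graphlib module; the codeline names and the edges between them are invented for the example and say nothing about which way changes should flow.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set


def dependency_order(depends_on: Dict[str, Set[str]]) -> List[str]:
    """Order codelines so each appears after the codelines it depends on.

    `depends_on` maps a codeline to the set of codelines it depends upon.
    Raises graphlib.CycleError if the dependency graph contains a cycle.
    """
    return list(TopologicalSorter(depends_on).static_order())


# Hypothetical acyclic graph: feature-x depends on main, main on release-1.0.
acyclic = {"release-1.0": set(), "main": {"release-1.0"}, "feature-x": {"main"}}
order = dependency_order(acyclic)

# Hypothetical cyclic graph: two codelines that each depend on the other.
cyclic = {"main": {"feature-x"}, "feature-x": {"main"}}
try:
    dependency_order(cyclic)
    has_cycle = False
except CycleError:
    has_cycle = True
```

A check like this could run whenever a new codeline relationship is introduced, so that a cycle is caught before it complicates merging.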

What's Next?

Now that we've set the stage and introduced the players, we will exit the stage until next month, when we try to directly apply our translations to each of the different types of version control containers: changes/workspaces, baselines, and codelines. We're very interested in your feedback on these ideas and our initial mapping of them into the version-control domain. So if you have some insights to share, please let us know.

References

[1] Agile Software Development: Principles, Patterns, and Practices; by Robert C. Martin; Prentice-Hall, 2002

[2] Software Configuration Management Patterns: Effective Teamwork, Practical Integration; by Stephen Berczuk and Brad Appleton; Addison-Wesley, November 2002.

[3] Object-Oriented Programming: An Objective Sense of Style; by Karl J. Lieberherr, Ian Holland, Arthur Riel; Proceedings of the 1988 Conference on Object-Oriented Programming Systems, Languages, and Applications (OOPSLA'88); September 1988, San Diego, CA, pp. 323-334.

[4] The Pragmatic Programmer: From Journeyman to Master; by Andrew Hunt and David Thomas; Addison-Wesley, 1999.

[5] A Software Configuration Management Model for Supporting Component-Based Software Development; by Hong Mei, Lu Zhang, Fuqing Yang; ACM SIGSOFT Software Engineering Notes, Vol. 26, Issue 2; (March 2001), pp. 53-58; ISSN:0163-5948

[6] A Component-Based Software Configuration Management Model And Its Supporting System; by Hong Mei, Lu Zhang, Fuqing Yang; Journal of Computer Science and Technology, Vol. 17, Issue 4; (July 2002), pp. 432-441; ISSN:1000-9000

[7] Container-Based SCM and Inter-File Branching; by Laura Wingerd; 1st BCS CMSG Conference, April 2003

[8] Flexible Configuration Management for a Component-based Software Asset Repository; by Tom Brett; BCS CMSG event: Why Software Asset Management and Configuration Management is essential, March 2004 (also see accompanying presentation)

[9] Configuration Management Principles and Practice; by Anne Mette Hass; Addison-Wesley, December 2002. Chapter 1, “What is Configuration Management?” (available online)
