Tuesday, July 05, 2005

Whitgift's Principles of Effective SCM

In an effort to deduce/derive The Principles of SCM, I'm going through the SCM literature to see what other published SCM experts have identified as SCM Principles.

Among the best "oldies but goodies" are the books by David Whitgift (Methods and Tools for SCM, Wiley 1991) and Wayne Babich (SCM: Coordination for Team Productivity, Addison-Wesley 1986). These books are 15-20 years old, but most of what they say still seems relevant for software development and well-aligned with agile and iterative development methodologies.

Today I'll focus on David Whitgift's writings. In the very first chapter ("Introduction") he says that CM is more concerned with the relationships between items than with the contents of the items themselves:
  • This is because CM needs to understand the decomposition + dependencies among and between all the things that are necessary for a change to result in a correct + consistent version of the system that is usable to the rest of the team (or its stakeholders).

  • He also states that most of the problems that arise from poor CM and which CM is meant to resolve are issues of coordination/communication and control. And he gives a shopping list of common problems encountered in each of five different areas (change-management, version-management, build-management, repository-management, item identification/relationship management).
At the end of section 1.2 in the first chapter, Whitgift writes:
In the course of the book, five principles become apparent which are the keys to effective CM. They are that CM should be:
  • Proactive. CM should be viewed not so much as a solution to the problems listed in the previous section but as a collection of procedures which ensure the problems do not arise. All too often CM procedures are instituted in response to problems rather than to forestall them. CM must be carefully planned.

  • Flexible. CM controls must be sensitive to the context in which they operate. Within a single project an element of code which is under development should not be subject to restrictive change control; once it has been tested and approved, change control needs to be formalized. Different projects may have very different CM requirements.

  • Automated. All aspects of CM can benefit from the use of software tools; for some aspects, CM tools are all but essential. Much of this book is concerned with describing how CM tools can help. Beware, however, that no CM tool is a panacea for all CM problems.

  • Integrated. CM should not be an administrative overhead with which engineers periodically have to contend. CM should be the linchpin which integrates everything an engineer does; it provides much more than a repository where completed items are deposited. Only if an engineer attempts to subvert CM controls should he or she be conscious of the restrictions which CM imposes.

  • Visible. Many of the issues raised in the previous section stem from ignorance of the content of items, the relationships between items, and the way items change. CM requires that any activity which affects items should be conducted according to clearly defined procedures which leave a visible and auditable record of the activity.

Whitgift's "proactivity principle" might seem a bit "Big Planning/Design Up Front" rather than responsive, adaptive, or emergent. And his "visibility principle" may seem like heavyweight traceability and documentation. However, in light of being flexible, automated, and integrated, it might not be as bad as it sounds.

The above five Principles of Effective CM seem (to me) to be potentially competing objectives ("forces") that need to be balanced when resolving a particular SCM problem. I wouldn't regard them as principles in the same sense as Robert Martin's "Principles of Object-Oriented Design."

Each of the subsequent chapters in Whitgift's text delves into various principles for the various sub-areas of CM. I'll write about those in subsequent blog entries.


Bob Corrick said...

If we take configuration management to be the disciplined use of a tool or a service that supports the creation, development and maintenance of some system of interest; and if we take principles to be guidelines that can inform our practices; how about these for a start?


Natural identification:

Identify items by their real name (eg MyClass.java, my_package.sql). This avoids ambiguity when reading or updating items; it assumes that a naming convention applies when creating an item.

Apparent removal:

Remove an item from the scope of reading or updating; preserve the knowledge of its existence. This allows access to historical items, and avoids their accidental re-creation.

Unique location:

Find (and therefore put) an item in its natural place (eg java/com/ourcompany/package.jar, java/com/ourcompany/package/MyClass.java, oracle/my_schema/my_package.sql, java/org/junit/junit.jar). I think "use", not "reuse" (because the default way to reuse something is to, um, er, copy it).

And we probably need some more, perhaps under WHICH VERSION?

meaningful tags

minimise branches ("history is better than geography")

spot the difference


locking or merging


Brad Appleton said...

Hi Bob! Thanks for the comments. You identify some good patterns (natural identification, apparent removal, etc.)

I say "patterns" rather than "principles" here because when I say principles I'm seeking more than guidelines or rules of thumb for what to do. I'm looking for fundamental rules or "solution design principles" that should not be violated when doing a particular SCM practice or achieving an SCM objective.

For details see my 20-April-2005 blog entry or the Principles of CM discussion thread on CMCrossroads.com

Bob Corrick said...

Hi Brad!

I was thinking some more about the scope of software configuration management...

Let's assume that it does not include: machine or network configuration; document management; or software deployment & distribution.

Let's assume that it does include: version control; build automation; and release management for 'stuff' (items, components, packages).

There are a few things that I either take for granted in SCM, or have wished were available. Maybe these things are more like principles:

COMPLETE HISTORY: all item versions are identifiable and retrievable (even if 'deleted' or 'moved')

IDENTIFIABLE CONFIGURATIONS: any combination of items, each at a particular version, can be identified in its own right (as well as, and independently of, individual revision numbers or tags on versions of items)

EXPLICIT ACCESS CONTROL: any configuration (say a 'release'), and any version of any item, can be made visible / readable / updatable (ie updatable for 'deletion' or 'movement', but not for replacement as that would violate COMPLETE HISTORY)
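
To play through a scenario against the first two of these, here is a minimal Python sketch of a toy in-memory repository. Everything in it (the `Repository` class, its method names, version numbers counted from 1) is a hypothetical illustration and not the behaviour of any real SCM tool. EXPLICIT ACCESS CONTROL is left out to keep it short.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """One item's full history; versions are numbered from 1."""
    versions: list = field(default_factory=list)  # contents, oldest first
    removed: bool = False                         # hidden, never erased


class Repository:
    """Toy repository: hypothetical names, not a real tool's API."""

    def __init__(self):
        self._items = {}    # path -> Item
        self._configs = {}  # configuration name -> {path: version}

    def add(self, path, content):
        """Add a first version, or supersede with a later version."""
        item = self._items.setdefault(path, Item())
        item.removed = False
        item.versions.append(content)
        return len(item.versions)

    def remove(self, path):
        """'Delete' an item: hide it from listings, keep every version."""
        self._items[path].removed = True

    def visible(self):
        """Paths currently in scope for reading or updating."""
        return sorted(p for p, i in self._items.items() if not i.removed)

    def get(self, path, version):
        """COMPLETE HISTORY: any version is retrievable, even if removed."""
        versions = self._items[path].versions
        if not 1 <= version <= len(versions):
            raise KeyError(f"{path} has no version {version}")
        return versions[version - 1]

    def identify(self, name, selection):
        """IDENTIFIABLE CONFIGURATIONS: name a set of {path: version}."""
        for path, version in selection.items():
            self.get(path, version)  # raises if the version does not exist
        self._configs[name] = dict(selection)

    def checkout(self, name):
        """Return the content of every item in a named configuration."""
        return {p: self.get(p, v) for p, v in self._configs[name].items()}
```

In this sketch a configuration keeps working after one of its items is 'deleted', because removal only changes what `visible()` reports.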

One might check how well these support the life cycle of creation, development, release, modification, support and retirement of 'stuff' by playing through some scenarios or use cases, I thought.

Thanks for thinking about this, and for opening up the whole discussion.


Brad Appleton said...

Interesting ideas Bob! Now I have to think harder about what I consider a "principle." You have identified some objectives/abilities that you want as a result of "good" CM.

If those are things that CM "must" do (and not merely as a side-effect of achieving something else), then many would consider such "objectives" to be principles.

The kinds of principles I spoke of (when I compared them to Bob Martin's principles of OOD) are more like "invariants." They are conditions that must be preserved or conserved in the design/implementation of a solution that meets such objectives.

If I don't preserve them ... if I violate them somehow, then something is out of balance, or perhaps more accurately, out of equilibrium. And either I didn't really achieve the objective I had set out to achieve, or I did so, but at the expense/neglect of something else or some other objective.

I guess that means I'm seeking principles as rules/invariants of SCM design.

Bob Corrick said...

Hi Brad

How about:


OPEN means that any item can be...
* added to your SCM as a first version
* read for building and testing
* modified and superseded (ie the same item is added at a later version)
* tagged and re-tagged freely for tracking and co-ordination
...typically by software developers

CLOSED means that a combination of versions of items is identified, adjusted, confirmed...
...and that such a combination, once confirmed, defines a release

Perhaps that sounds simplistic? In my experience people do have to take special care to 'close' a release in their SCM system.

On my last team, we used a common 'product release' tag on all included items - each at the desired version. This tag was like 'Rx_y', as distinct from the tags like 'F123' that we used in the development of certain items for a given feature.

The tool did not support different meanings for the 'F' and 'R' tags; our conventions and discipline were needed to keep to the principle.
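
As a rough idea of what tool support for that distinction might look like, here is a minimal Python sketch. The `TagStore` class and its methods are hypothetical, and the regular expression is only an assumed reading of the 'Rx_y' convention; the point is just that development tags stay freely movable while a confirmed release tag does not.

```python
import re

# Assumed reading of the 'Rx_y' convention, e.g. R2_1. Tags such as
# 'F123' do not match and therefore can never be closed.
RELEASE_TAG = re.compile(r"R\d+_\d+")


class TagStore:
    """Toy tag store: hypothetical names, not a real tool's API."""

    def __init__(self):
        self._tags = {}       # tag name -> {item: version}
        self._closed = set()  # release tags that have been confirmed

    def tag(self, name, item, version):
        """Apply or move a tag; refused once the tag has been closed."""
        if name in self._closed:
            raise PermissionError(f"tag {name!r} is closed")
        self._tags.setdefault(name, {})[item] = version

    def close(self, name):
        """Confirm a release; only existing 'R' tags can be closed."""
        if not RELEASE_TAG.fullmatch(name):
            raise ValueError(f"{name!r} is not a release tag")
        if name not in self._tags:
            raise KeyError(name)
        self._closed.add(name)

    def items(self, name):
        """Return a copy of the {item: version} set behind a tag."""
        return dict(self._tags[name])
```

A release tag can still be adjusted up to the moment `close` is called, which matches the "identified, adjusted, confirmed" sequence; after that, any attempt to move it raises an error instead of relying on discipline alone.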

Now I can't think of another one!


Brad Appleton said...

Hi again Bob! EXCELLENT! You have arrived at exactly the sort of thing I'm seeking.

"OPEN for development, but CLOSED for release" sounds right on the money to me! In the "Principles of CM" thread on CMCrossroads.com I translated the Open-Closed Principle into pretty much the same sentiment (great minds think alike :)

I called it the Baseline Immutability Principle or "BLIP" (meaning, once a configuration is "baselined", then the name/id of that baseline must always refer to the set of elements and versions it was made-up of at the time it was baselined.)
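
One way to make that invariant checkable, sketched here in Python purely as an illustration, is to derive an identifier from the baseline's exact membership, so that a name can never quietly come to mean a different set of versions. The `baseline_id` function and `BaselineRegistry` class are hypothetical names, and using a truncated SHA-256 digest is an assumption of this sketch, not something the principle itself prescribes.

```python
import hashlib
import json


def baseline_id(members):
    """Derive an ID from the exact {item: version} membership."""
    # Sorting makes the ID independent of the order items were listed in.
    canonical = json.dumps(sorted(members.items()))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]


class BaselineRegistry:
    """Toy registry: hypothetical names, not a real tool's API."""

    def __init__(self):
        self._by_name = {}  # baseline name -> (id, frozen membership)

    def baseline(self, name, members):
        """Record a baseline; reusing a name for other content fails."""
        new_id = baseline_id(members)
        existing = self._by_name.get(name)
        if existing is not None and existing[0] != new_id:
            raise ValueError(f"baseline {name!r} already refers to {existing[0]}")
        self._by_name[name] = (new_id, tuple(sorted(members.items())))
        return new_id

    def members(self, name):
        """Return the membership recorded when the name was baselined."""
        return dict(self._by_name[name][1])
```

Re-recording the same membership under the same name is harmless here, while any change to an element or version yields a different ID and is rejected.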

And I (perhaps loosely) translated the "Liskov Substitution Principle" into the Configuration Promotion Principle, and spoke of a possible Codeline Consistency Principle or Codeline Integrity Principle.

Bob Corrick said...

Hi Brad - thank you for keeping the conversation open here!

Re. OPEN / CLOSED, on projects so far I've had to provide some things that the tools seem to lack (or rather than "provide", it's more like "support", through our agreed working practices):

* preventing the alteration of tags that define a release, while...

* allowing free use of tags during development

(Maybe we haven't been using the best tools!)

Re. SUBSTITUTION, I wonder if the analogy with OO principles is quite as strong?

Things like backward compatibility and package structure are, I feel, more to do with a product in itself - the 'stuff' whose configuration we are managing, as it slowly changes during development and successive releases.


Maybe a database analogy could also suggest some 'candidate' principles... I just thought of third normal form, when expressed as 'the key, the whole key, and nothing but the key'.

This could suggest a principle of SCM usage like: "provide only what is both necessary and sufficient about a product configuration".

At a release, this would avoid ambiguity.

In development, one of the 'pains' that this could ease is the confusion and effort of dealing with 'cruft' - all faithfully tracked, of course.

I think I was reaching for something like this, earlier - when I was thinking in terms of features, like 'identifiable configurations' and 'explicit (complete?) access control'.