Saturday, August 27, 2005

The Baseline Immutability Principle

Adding more baselining principles to my Principles of SCM. So far I've described the Baseline Reproducibility Principle (BLREP) and the Baseline Identification Principle (BLIDP). Now I want to describe the Baseline Immutability Principle (BLIMP).

The Baseline Immutability Principle (BLIMP) is really just a rephrasing of The Open-Closed Principle (OCP) from The Principles of Object-Oriented Design as applied to baselines (baselined configurations). The OCP (first stated by Bertrand Meyer in the classic book Object-Oriented Software Construction) states that "Software entities (classes, modules, functions, etc.) should be open for extension but closed for modification."

The OCP means I should have a way of being able to extend a thing without changing the thing itself. Instead I should be able to create some new "thing" of my own that reuses the existing thing and somehow combines that with just my additions, resulting in an operational "extension" of the original thing. The OCP is the basis for letting me reuse rather than reinvent when I need to create something that is "like" an existing thing but which still requires some additional stuff.

If applied for baselined configurations (a.k.a. baselines) the OCP would read "A baseline should be open for extension but closed for modification." That means if I want to create a "new" configuration that extends the previously baselined configuration, I should do so by creating a new configuration that is the baseline PLUS my changes. The result is not a "changed" baseline - the baselined configuration stays the same as it was before my change. We don't actually ever "change" a baseline. What we do is request/apply one or more changes against/to a baseline; and the result is a new configuration, possibly resulting in a new baseline.

According to the Baseline Immutability Principle ...
    If a baseline is to be reproducible, and if it needs to be identifiable, then the name that identifies the baseline with its corresponding configuration must always refer the exact same configuration: the one that was released/baselined.
For example, suppose I have release 1.2 of my product and I apply a label/tag of "REL-1.2" to everything that was used to make 1.2 (not just the code, but ALL of it: requirements, designs, tests, make/ANT files, etc.). Suppose that version of element FUBAR was one of the file revisions that was labeled. Now suppose that during the following month, "REL-1.2" is moved/reapplied to version of FUBAR.

In this example, I have just violated the baseline immutability principle. If a customer needs me to be able to reproduce Release 1.2, and if Release 1.2 contained v1.2.3.4 of FUBAR, then if I use "REL-1.2" to recreate the state of the codebase for Release 1.2, I just got the wrong result, because the version of FUBAR in Release 1.2 is different from the version that is tagged with the "REL-1.2" label.

Notice that I am not saying that we can't make changes against a baseline. We most certainly can. And the result is a new configuration!
    When we make a change to a baseline, we aren't really changing the configuration that was baselined and then trying to use the same name for the result. Our changed result is a new configuration that took the current baseline and added our changes to it. And if we chose to name this new configuration, we give it a new name (one that is different from the name of any previously baselined configuration).
So a baseline name and the configuration it references are married: once the configuration is baselined, that name must forever after be faithfully monogamous to that configuration for better or for worse, for richer or for poorer, in sickness and in health for as long as they both shall live.

Always and forever? What about a divorce, or an anullment?
    An "anullment" in this case is when I didnt get it right the first time. Either I "blessed" a configuration as "baselined" that didnt really meet the criteria to be called a "baseline." Or else I incorrectly identified the corresponding configuration: I might have labeled the wrong version of a file, or I forgot to label some file (e.g., people often forget to label their makefiles), or I labeled something I shouldnt have.

    Correcting a baseline's labeled-set so that it accurately identifies ("tags") the baselined configuration isnt really changing the baseline; it's merely correcting the identification of it (because it was wrong up until then).

    What about a "divorce"? We all know that a divorce can be quite expensive, and require making payments for a long time thereafter. Retiring (and trying to reuse) a baseline name can have significant business impact. Retiring the baseline often means no longer providing support for that version of the product. Trying to then reuse the same baseline name of the same product for a new configuration can create lots of costly confusion and can even be downright misleading.

Note that the term "a baseline" should not be confused with the term "the baseline":
  • The term "the baseline" really means the latest/current baseline. It is a reference!

  • This means that "the baseline" is really just shorthand for "the latest baseline." And when we "change the baseline", we are changing the designation of which baseline is considered "latest": we are changing the reference named "latest baseline" to point to a newer configuration.
So The Baseline Immutability Principle states that once a configuration is baselined, the identification of the baseline name with its corresponding configuration is immutable: The set of elements (e.g., files and revisions) referenced by the baseline name must always be the same set. And that set must always correspond to the set that was used to produce the version of the product that was baselined.

I think this may be equivalent to Damon Poole's "TimeSafe Property" -- see Damon's paper The TimeSafe Property: a Formal Statement of Immutability for CM.

Let me know what you think!


Unknown said...

Could you please use reasy-to-remember abbreviations of the principles. For example:

BREP (Baseline Reproducibility Principle)
BIDP (Baseline Identification Principle)
BIMP (Baselnie Immutibility Principle)

Especially mixing uppercase and lowercase is hard to read.


Brad Appleton said...

Hi Frank! Thanks for the feedback.

Hmmn - I thought that "BLIMP" might be easier to remember than "BIMP". (Granted, I think "BlIP" is easier to remember than "BlIDP" :-).

I tend to use "BL" as an abbreviation for "Baseline". I'm going to keep doing that for now. (I'm not sure yet if some principles will refer to 'B' for "Build" versus for Baseline --Don't be surprised if I use "CL" for Codeline.) After I flesh out more of these principles, I'll revisit whether that's necessary.

I can see that mixed case look horrid for sans-serif fonts (especially for lowercase 'l' versus uppercase 'I').
So I'll try to stick to uppercase from now on (maybe I'll even edit this blog-entry).

I was wondering if the "P" should be mandatory for each abbreviation. That is, do I call it "BLIDP" or the "BLID principle"? (same goes for "BLIM" versus "BLIMP")

I sort of like it better without the P at the end. Thus far I'd have BLREP (still have the P here for reProducubility), BLID, and BLIM.