Tuesday, August 09, 2005

SCM Design Smells

First, I learned early this morning of the passing of Peter Jennings (ABC World News Tonight anchor). I'm very saddened by this. The world has lost a great mind and communicator, and I've lost the trusted advisor I had let into my home every evening since I was a teen to tell me what was going on elsewhere in the world.

Getting back to my earlier topic of The Principles of SCM, I'd like to step through each of the Object-Oriented Design Principles mentioned in Robert Martin's book Agile Software Development: Principles, Patterns, and Practices, looking at how each one applies to SCM.

Before I do that, however, I'd first like to look at what "Uncle Bob" (as he is affectionately called) refers to as design smells. These are as follows:
  • Fragility - Changes cause the system to break easily and require other changes.
  • Immobility - Difficult to disentangle entities that can be reused in other systems.
  • Viscosity - Doing things wrong/sloppy causes more friction and slows you down the next time you navigate through that code.
  • Needless Complexity - The System contains infrastructure that has no direct benefit (overengineering and/or "gold plating").
  • Needless Repetition - Repeated structures that should have a single abstraction (Redundancy).
  • Opacity - Code is hard to understand.
How might each of these apply to your SCM process and procedures? How might they apply to your branching & merging structure? Or to the organization of your source-tree?

Here's one possible "translation" of how these might apply to Software CM process and procedures:
  • Intolerant/Fragility - Changes cause the project, team, or organization to fall apart easily and require change to other parts of the project, team, or organization.
  • Rigidity/Immobility - Difficult to identify or disentangle practices and policies that can be reused by other projects, teams, or organizations.
  • Friction/Viscosity - Doing things wrong/sloppy causes more friction and slows you down the next time you navigate through that workflow or go on to the next one.
  • Wasteful/Needless Complexity - The Process contains "waste" in the form of extra steps, processing, handoff, waiting, or intermediate artifacts that do not "add value" for the customer, project, or organization.
  • Manual Tedium/Repetition - Repeated or tedious steps and activities should have a single mechanism to automate them.
  • Opacity - The project or process is hard to understand. (Lack of Transparency)

How would you translate design smells into process smells for Software CM?


Bob Corrick said...

I think there's another smell: CRUFT. It's waste, it slows us down while we examine it, and if we eventually learn to ignore it then it will still smell to other people...

[As a flip side of that, I suggest a 3NF-like principle for SCM: "The release, the whole release, and nothing but the release."]

How many times have you seen unnecessary materials faithfully tracked in an SCM system?

I think it happens because the discipline of keeping out anything unnecessary is actually quite difficult.

I have seen that the growth of cruft can start quite early in the life of a repository, due to fatigue or relief once the first sufficient configuration has been defined.

Later on, the temptation to 'keep it just in case' is still strong - especially if we are feeling a lack of familiarity, or perhaps some time pressure :-).

Still later, the effort of housekeeping seems much larger than any benefit in "smell reduction" that we can imagine.

This is a pity, because while we know, I mean while you and I are actually adding or reviewing or reorganizing something in the repository, the effort of identifying and keeping only what is both necessary and sufficient is relatively small.


Brad Appleton said...

Hi Bob! Thanks again for the comments. Interesting that you described CRUFT as "waste." In an earlier draft of my blog-entry, I actually wrote "Waste / Needless Complexity" - then later I replaced "Waste" with "Wasteful." Your comment makes me think my earlier draft was the better word choice.

Regarding what you call 3NF (Third Normal Form - The Release, the whole release, and nothing but the release) ... can you tell me more about how it is analogous to what 3NF means for databases, and maybe describe separately the intended meaning of each of (1) "the Release", (2) "the whole release" and (3) "nothing but the release"?

Bob Corrick said...

Hi again Brad

I think my 3NF analogy is only slight; "the truth, the whole truth, and nothing but the truth" might express the meaning just as well. Hmm... this all seems rather lightweight now that I break it down, but here goes:


It is possible to use SCM systems without ever identifying the exact configurations of versions of items that make up a release. I've seen teams (especially corporate) that track developer versions of items, or track change requests to packages of items, without ever trying to label a release.

So a principle for me is to always label a release explicitly.


And when you do so, label everything in the repository that is needed for a successful release or upgrade.

Furthermore, if there's something you need that isn't in the repository, put it in and label it. (You can spot this if you use only the SCM system and an empty target environment to make a test release.)


You will certainly have items in the repository that are not needed in a given release. I don't mean cruft; just stuff that has to be there for some other product or variant or release.

So, don't accidentally label such items as part of your release, as that would lead to cruft when you distribute your release, and to possible confusion within the repository.
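As a rough illustration of the three points above, here is a minimal Python sketch that compares what a release label covers against what the release needs. The function name `check_release_label` and the item names in the usage note are hypothetical, and it assumes you can already list both sets from your own SCM tool; it is not tied to any particular system.

```python
def check_release_label(labeled, required):
    """Compare the items carrying a release label with the items the release needs.

    labeled:  iterable of item names that carry the release label (assumed input)
    required: iterable of item names needed for a successful release (assumed input)
    """
    labeled, required = set(labeled), set(required)
    missing = required - labeled   # needed but not labeled: "the whole release" is violated
    extra = labeled - required     # labeled but not needed: "nothing but the release" is violated
    return {
        "missing": sorted(missing),
        "extra": sorted(extra),
        "ok": not missing and not extra,
    }
```

For example, `check_release_label(["app.c", "notes_old.txt"], ["app.c", "install.sh"])` would report `install.sh` as missing and `notes_old.txt` as extra, so the label is neither sufficient nor free of cruft.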

I just had a further thought...


I've realised that my comments about 'cruft' relate more to the state of a repository than to SCM processes.

I find it easier to think about stable states than to think about flow or process. So designing some process seems to go well for me when I think of it in terms of the possible states of the relevant system - here, a repository.

So a rule of thumb for me would be to design SCM processes in terms of the possible, valid states of the material in the repository.


Brad Appleton said...

Thanks for clarifying Bob, that helped a lot! So it seems like what you are getting at is explicitly identifying (i.e. "labeling" or "tagging") everything that is both necessary and sufficient to reproduce the released version.

Is that correct? So if something is necessary, but not sufficient - I need to tag more. And if something is not necessary (even if it's sufficient), then it's "cruft."

The "cruft" makes me think of refactoring, only in a somewhat different sense than we might typically think.

Bob Corrick said...

Brad, spot on:


'explicitly identify (i.e. "label" or "tag") everything that is both necessary and sufficient to reproduce the released version' sums up exactly what I mean.


Cruft would be anything that is still in the repository, but would never be needed in connection with any release that is still being supported.

As I see it, the waste of cruft comes from:

* people taking time to find the relevant (non-cruft) items among a larger than necessary number of items

* cruft being deployed and distributed, to become a distraction in all the target environments as well as in the repository

* cruft being used in error, e.g. examined during an investigation for which it has no possible relevance

* cruft being archived and duplicated etc. during housekeeping of the repository

I reckon it's worth avoiding it or getting rid of it. "If you ain't gonna need it, trash it."
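
One way to picture that definition of cruft is as a set difference: everything in the repository minus everything some supported release still needs. The sketch below is only an illustration under that assumption; `find_cruft`, the release names, and the item names are hypothetical, and how you obtain the lists depends on your own SCM tool.

```python
def find_cruft(repository_items, supported_releases):
    """Return repository items that no supported release needs.

    repository_items:   iterable of all item names in the repository (assumed input)
    supported_releases: dict mapping a release name to the item names it needs (assumed input)
    """
    # Union of everything needed by any release that is still supported.
    needed = set().union(*supported_releases.values())
    return sorted(set(repository_items) - needed)
```

For example, with items `["app.c", "install.sh", "scratch.bak"]` and supported releases `{"1.0": ["app.c"], "2.0": ["app.c", "install.sh"]}`, only `scratch.bak` would be flagged as a candidate for removal.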