Saturday, September 24, 2005

Quantum Agility and Organizational Gravity

Just a random synapse firing in my brain ... I remember back in my high school days being enthralled with physics and the latest grand-unified theories (GUTs), and how gravity was always the "odd ball" in trying to unify the four fundamental forces of nature into a single, simple, consistent and coherent theory:
  • Quantum mechanics could unify all but gravity. It was great, and incredibly accurate at explaining all the rich and myriad interactions of things at the molecular, atomic and subatomic levels.

  • But throw in celestial bodies and large distances, and the thing called "gravity" rears its ugly head and makes things complicated. In theory it's nowhere near as strong as the other forces, and yet any time you had to scale up to things large enough and far enough away to need a telescope instead of a microscope, it made everything fall apart.
Sometimes I think Agile "theory" and large projects and organizations are the same dichotomy.
  • The "Agile" stuff seems great in small teams and projects that can be highly collaborative and iterative over short (collocated) distances with small "lightweight" teams and processes.

  • But throw it into a large project or organization, and "gravity" sets in, adding weight and mass and friction to processes and communication, and yet necessarily so, in order to scale to a larger living system of systems of systems.
So we are left with quantum agility and organizational gravity and trying to reconcile the two. What's an Agile SCMer to do about all that?

Saturday, September 17, 2005

Can I have just one repository please?

One of the things I spend a lot of time dealing with is integration between application lifecycle management tools and their corresponding process areas: requirement management, configuration management, test management, document management, content management, change management, defect management, etc.

So I deal with process+tool architecture integration for a user community of several thousand, and the requirements, version control, change-tracking, and test management tools almost always each have their own separate repositories. Occasionally the change-tracking and version-control are integrated, but the other two are still separate.

And then if there is a design modeling tool, it too often tries to be a "world unto itself": not merely a modeling environment, but one that stores each model or set of models as a "version archive" with its own checkin/checkout. That makes it that much more of a pain in the you-know-what to get the models versioned and labeled/baselined together with the code, particularly if code-generation is involved and needs to be part of the build process.

And what really gets to me is that, other than the version control tool, the tools for requirements management, test management, and (typically) change management usually have little or no capability to deal with branching (much less merging). So heaven forbid you have to support multiple concurrent versions of anything more than just the code.

The amount of additional effort for tool customization and configuration and synchronization and administration to make these other tools be able to deal with what is such a basic fundamental version-control capability is enormous (not to mention issues of architectural platforms and application server farms for a large user base). So much so that it makes me wonder sometimes if the benefit gained by using all these separate tools is worth the extra integration effort. What if I simply managed them all as text files in the version control system?

At least then I get my easy branching and merging back. Plus I can give them structure with XML (and then some), and could easily use something like Eclipse to create a nice convenient GUI for manipulating their contents in a palatable fashion.

And all the data and metadata would be in the same database (or at least one single "virtual" database). No more having to sync with logically related but physically disparate data in foreign repositories and dealing with platform integration issues, just one big (possibly virtual) repository for all my requirements, designs, code, tests, even change-requests, without all the performance overhead and data redundancy and synchronization issues.

It could all be plain structured text with XML and Eclipse letting each artifact-type retain its own "personality" without having to be a separate tool in order to do it.
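To make the "plain structured text with XML" idea concrete, here's a minimal, hypothetical sketch (the schema, the file name REQ-042.xml, and all field names are invented for illustration): a requirement stored as an ordinary versioned text file, with a thin tool layer parsing it into something a "skin" could display:

```python
import xml.etree.ElementTree as ET

# Hypothetical requirement artifact, stored as a plain text file
# (e.g., REQ-042.xml) in the same repository as the code -- so it gets
# branched, merged, and baselined exactly like any source file.
REQ_XML = """<requirement id="REQ-042" status="approved">
  <title>User login</title>
  <traces-to>TEST-107</traces-to>
  <traces-to>DES-019</traces-to>
</requirement>"""

def load_requirement(xml_text):
    """Parse one requirement artifact into a plain dict for a GUI 'skin'."""
    root = ET.fromstring(xml_text)
    return {
        "id": root.get("id"),
        "status": root.get("status"),
        "title": root.findtext("title"),
        "traces_to": [e.text for e in root.findall("traces-to")],
    }

req = load_requirement(REQ_XML)
```

The traceability linkages are just text, so diffing, merging, and labeling a requirement costs no more than doing the same to a source file.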

Why can't someone make that tool? What is so blasted difficult about it!!!

I think the reason we don't have it is that we are used to disconnected development as "the rule" rather than the exception. Companies that shell out the big bucks for all of those different tools usually have separate departments of people for each of requirements (systems/requirements engineers), design (software architects), source-code ("programmers"), test (testers), and change-management.

It's a waterfall-based way of organizing large projects and it seems to be the norm. So we make separate tools for each "discipline" to help each stay separate and disconnected, and those of us doing EA/EAI or full lifecycle management of software products have to deal with all the mess of wires and plumbing of integration and platforms and workflow.

Oh how I wish I could take a combination of tools:
  • a good, stream-based version control tool like AccuRev
  • a fully Java/XML extensible issue-tracker like Jira (or combination of the two, like SpectrumSCM)
  • a framework like Eclipse
  • and a collaborative knowledge/content management system like Confluence
and roll them together into a single integrated system with a single integrated repository.

Notice I didn't mention any specific tools for requirements-management or test-management. Not that I don't like the ones available (I do), but I think it's time for a change in how we do those things with such tools:
    they basically store structured data, often hierarchically with traceability linkages; provide a way of viewing and manipulating the objects as a structured collection; and allow attaching all sorts of metadata, event-triggers, and queries/reports
I think a great wiki + CMS like Confluence, together with Jira, can do all that if integrated together: just add another "skin" or two to give a view of requirements and tests both individually and as collections (both annotated and plain).

The same database/repository could give me both individual and hierarchical collection-based views of my requirements, designs, code, tests and all their various "linkages." Plus linking things in the same database is a whole lot easier to automate, especially thru the same basic IDE framework like Eclipse.
  • the requirements "skin" gives me a structured view of the requirements, and collaborative editing of individual requirements and structured collections of them;
  • ditto for the test "skin";
  • and almost "ditto" for the "change-management" skin (but with admittedly more workflow involved)
  • the design tool gives me a logical (e.g., UML-based) view of the architecture
  • the IDE gives me a file/code/build-based view of my architecture
  • And once MS-Office comes out with the standard XML-based versions, then maybe it will be pretty trivial to do for documents too (and to integrate XML-based Word/Office/PPT "documents" with structured requirements and tests in a database)
Oh why oh why can't I have a tool like that! Pretty please can I have it?

Sunday, September 11, 2005

Change-Packaging Principles

In my previous blog-entry I tried translating Uncle Bob's OOD Principles of package cohesion into the version-control domain by substituting "release" with "promote" or "commit", and "reuse" with "test".

I think that didn't work too well. I still think "promotion" corresponds to "release", but "reuse" corresponds to something else. I'm going to try translating "reuse" to "integration". If I integrate (e.g., merge) someone else's changes into my workspace, I am quite literally reusing their work. If I commit my own change to the codeline, then I am submitting my work for reuse by the rest of the team that is using the codeline (particularly the "tip" of the codeline) as the basis of their subsequent changes.

So if I equate "release" with "promotion", and "reuse" with "integration" I think the result is the following:
  • The Promotion-Integration Equivalency Principle -- The granule of integration is the granule of promotion. (So it's not just the change content, but also the context – the entire configuration – that we end up committing to the codeline/workstream.)

  • The Change Closure Principle -- Elements that must be changed together are promoted together (implies task-level commit).

  • The Change Promotion Principle -- Elements that must be integrated together are promoted together (implies doing workspace update prior to task-level commit)
These "work" for me much better than the previous translation attempt. Note that the "change closure principle" didn't change much from before - it was just clarified a bit to indicate the dependency between elements.

This also makes me think I've stumbled onto the proper meaning for translating the Interface Segregation Principle (ISP): ISP states "Make fine-grained interfaces that are client-specific." If "integration" is reuse, then each atom/granule of change is an interface or "container" of the smallest possible unit of reuse.

The smallest possible unit of logical change that I can "commit" that doesn't break the build/codeline would be a very specific, individually testable, piece of behavior. Granted, sometimes it might not be run-time behavior ... it could be build-time behavior, or behavior exhibited at some other binding time.

This would yield the following translation of the ISP into the version-control domain:
    The Change Separation Principle -- Make fine-grained incremental changes that are behavior-specific. (i.e., partition your task into separately verifiable/testable yet minimal increments of behavior.)
I'm not thrilled about the name (please feel free to suggest a better one -- for example ... how about "segmentation" instead of "separation"?) but I think the above translation "works" quite well, and also speaks to "right-sizing" the amount of change that is committed to the codeline as an individual "transaction" of change. The way it's worded seems like it's talking exclusively about "code", but I think it really applies to more than just code, so long as we aren't constraining ourselves to execution-time "behavior."

Let me know what you think about these 4 additions to the family of SCM principles!

Saturday, September 03, 2005

The Blog ate my homework!

[NOTE: due to some comment-spam on my last entry (which I have since deleted), I have turned on "word verification" for comments.]

When I was composing my previous blog entry, something very frustrating happened: The blog ate my homework!

I frequently save intermediate drafts of my blog entries before I publish them. I had been working on my most recent draft for a couple hours. I'd been finalizing many of the sentences and paragraphs, making sure they flowed, checking the word usage, spellchecking, adding and verifying links, and then ... when I was finally ready to publish, I hit the publish button on the blogger compose window, and it asked me to login again. When I did, my final edits were GONE! I'd just lost two hours' worth of work.

My first thought was ARRRRRRRGGGGGHHHHH! My next thought was "no freakin' WAY did that just happen to ME!" Then much profanity ensued (at least in my own silent frustration) and I tried my darndest to look thru any and all temp files and saved files on my system and on blogger.com, all for naught. I had indeed fallen victim to one of the most basic things that CM is supposed to help me prevent. How infuriating! How frustrating! How embarrassing. I was most upset not about the lost text, but about the lost time!

I figure there must be a lesson in there somewhere to pass along. Ostensibly, the most obvious lesson would be to use the Private Versions pattern as outlined in my book. The thing is ... I had been doing just that! It was in the very act of saving my in-progress draft (before publishing it) that my changes were lost.

What I could (and possibly should) have done instead was not use blogger's composer to compose my drafts. I could have done it locally instead, on my own machine (and with my own spellchecker). And perhaps I will do that a bit more from now on. Still, it's pretty convenient to compose it with blogger because:
  • I get rapid feedback as to what it will actually look like, and ...
  • I can access it from any machine (not just the one I use late at night)
I later realized why it happened. I was trying to do two things at once:
  • In one window I was composing my blog entry.
  • In another browser window I was visiting webpages I wanted to hyperlink to from my entry and verifying the links.
Okay - so there's nothing wrong with that. I mean I was doing two things at the same time, but I wasn't really trying to multi-task because I was still trying to work on my blog-entry.

The real culprit wasn't that I had two windows open at the same time; it was that one of the webpages I wanted to hyperlink to was also a blogger.com hosted blog-entry. And since I was posing a question in my entry that referred to this one, I also wanted to create a comment in the referred-to entry that asked the question and referenced back to my own blog.

Posting that comment required me to enter my blogger id and password, and that essentially forced a new login - which made it look like my current login (where I was composing my entry) either ended, or had something unusual going on that warranted blogger wanting me to re-authenticate myself. And when it did, I lost my changes! OUCH!

Actually, I hadn't even posted the comment - I had only previewed it (saving it as a draft). Anyway - I was too upset (and it was too late at night) to try and recreate my changes then. So I waited another day before doing it. I have to say I'm not as happy with the result. I had really painstakingly satisfied myself with my wording and phrasing before I lost my changes. I wasn't as thorough the second time around because I wanted to be done with it!

So what was my big mistake? I was using private versions, and I wasn't trying to multi-task. I was in some sense trying to simultaneously perform "commits" of two different things at the same time, but they were to different "sections" of the same repository, so that really shouldn't have been such a terrible thing.

My big mistake wasn't so much a lack of good CM as it was a lack of good "agility": I let too much time lapse in between saving my drafts. I wasn't working in small enough batch-sizes (increments/iterations)!

Granted, I don't want to interrupt my flow of thought mid-sentence or mid-paragraph to do a commit. But certainly every time I was about to visit and verify another hyperlink in my other browser window, I should have at least saved my current draft before doing so. And I probably should have made sure I did so at least every 15-20 minutes. (You can be darn sure that's what I did this time around :-)

This sort of relates to how frequently someone should commit their changes in a version control system. Some of the SCM principles that I haven't described yet relate to this. Uncle Bob's Principles of Object-Oriented Design have a subset that are about "package cohesion" and granularity:
  • REP: The Release Reuse Equivalency Principle -- The granule of reuse is the granule of release.

  • CCP: The Common Closure Principle -- Classes that change together are packaged together.

  • CRP: The Common Reuse Principle -- Classes that are used together are packaged together.
In the context of version control, these "packages of classes" would probably correspond to "packages of changes" that make up a single logical "change transaction" or "commit" operation. If that is a valid analogy, then I need to decide what "reuse" and "release" mean in this context:
  • I think "release" would mean to "promote" or "commit" my changes so they are visible to others using the same codeline.

  • I think "reuse" would mean ... hmmn that's a tough one! It could be many things. I think that if a change is to be reusable, it must be testable/tested. Other things come to mind too, but that's the first one that sticks.
So let's see what happens if I equate "release" with "commit", equate "reuse" with "test" and see if the result is coherent and valid. This would give me the following:
  • The Commit/Test Equivalency Principle -- The granule of test is the granule of commit.

  • The Change Closure Principle -- Files that change together are committed together.

  • The Test Closure Principle -- Files that are tested together are committed together (including the tests).
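These translations can be sketched as a toy commit gate (the names here are illustrative, not any real tool's API): a task-level commit is an all-or-nothing transaction that includes the tests and is accepted only if those tests pass first:

```python
# Sketch of a task-level commit gate: the granule of test is the
# granule of commit, and files + their tests go in together.

def task_commit(files, tests, run_test):
    """Commit all files of a task as one transaction, but only if the
    tests covering the change pass first (Commit/Test Equivalency)."""
    failures = [t for t in tests if not run_test(t)]
    if failures:
        # Nothing is committed: the change and its tests are one unit.
        raise RuntimeError(f"tests failed, nothing committed: {failures}")
    # Test Closure: the tests are committed along with the files they test.
    return {"committed": sorted(files + tests)}

result = task_commit(
    files=["parser.c", "parser.h"],      # changed together (Change Closure)
    tests=["test_parser.c"],             # tested together (Test Closure)
    run_test=lambda t: True,             # stand-in for a real test runner
)
```

The key property is atomicity: either the whole task (files plus tests) lands on the codeline, or none of it does.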
Comments? Thoughts? What do these mean to you? Does it mean anything more than using a task-level commit rather than individual file checkin? Should these always "hold true" in your experience? When shouldn't they? (and why?)

Oh - and feel free to suggest better names if you don't like the ones I used. I'm not going to supply abbreviations for these, or name any blog-entries after them just yet, because I'm not yet certain they are even valid.

Saturday, August 27, 2005

The Baseline Immutability Principle

Adding more baselining principles to my Principles of SCM. So far I've described the Baseline Reproducibility Principle (BLREP) and the Baseline Identification Principle (BLIDP). Now I want to describe the Baseline Immutability Principle (BLIMP).

The Baseline Immutability Principle (BLIMP) is really just a rephrasing of The Open-Closed Principle (OCP) from The Principles of Object-Oriented Design as applied to baselines (baselined configurations). The OCP (first stated by Bertrand Meyer in the classic book Object-Oriented Software Construction) states that "Software entities (classes, modules, functions, etc.) should be open for extension but closed for modification."

The OCP means I should have a way of being able to extend a thing without changing the thing itself. Instead I should be able to create some new "thing" of my own that reuses the existing thing and somehow combines that with just my additions, resulting in an operational "extension" of the original thing. The OCP is the basis for letting me reuse rather than reinvent when I need to create something that is "like" an existing thing but which still requires some additional stuff.

If applied to baselined configurations (a.k.a. baselines), the OCP would read "A baseline should be open for extension but closed for modification." That means if I want to create a "new" configuration that extends the previously baselined configuration, I should do so by creating a new configuration that is the baseline PLUS my changes. The result is not a "changed" baseline - the baselined configuration stays the same as it was before my change. We don't actually ever "change" a baseline. What we do is request/apply one or more changes against/to a baseline; and the result is a new configuration, possibly resulting in a new baseline.

According to the Baseline Immutability Principle ...
    If a baseline is to be reproducible, and if it needs to be identifiable, then the name that identifies the baseline with its corresponding configuration must always refer to the exact same configuration: the one that was released/baselined.
For example, suppose I have release 1.2 of my product and I apply a label/tag of "REL-1.2" to everything that was used to make 1.2 (not just the code, but ALL of it: requirements, designs, tests, make/ANT files, etc.). Suppose that version 1.2.3.4 of element FUBAR was one of the file revisions that was labeled. Now suppose that during the following month, "REL-1.2" is moved/reapplied to version 1.2.3.5 of FUBAR.

In this example, I have just violated the baseline immutability principle. If a customer needs me to be able to reproduce Release 1.2, and if Release 1.2 contained v1.2.3.4 of FUBAR, then if I use "REL-1.2" to recreate the state of the codebase for Release 1.2, I just got the wrong result, because the version of FUBAR in Release 1.2 is different from the version that is tagged with the "REL-1.2" label.
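In manifest form, the principle is easy to state and to check. Here's an illustrative sketch (not any real tool's API): a baseline name maps to a frozen configuration of element revisions, and a moved label is detectable by simple comparison:

```python
BASELINES = {}   # baseline name -> frozen configuration (element -> revision)

def baseline(name, configuration):
    """Bless a configuration under a name; the marriage is permanent."""
    if name in BASELINES:
        raise ValueError(f"{name} is already baselined and immutable")
    BASELINES[name] = dict(configuration)   # snapshot, not a live reference

def verify(name, labeled_set):
    """Compare what the label currently tags against the recorded baseline."""
    return labeled_set == BASELINES[name]

baseline("REL-1.2", {"FUBAR": "1.2.3.4", "Makefile": "1.7"})

# A month later someone moves the "REL-1.2" label on FUBAR to 1.2.3.5:
drifted = {"FUBAR": "1.2.3.5", "Makefile": "1.7"}
print(verify("REL-1.2", drifted))   # prints False: the label was moved
```

Attempting to re-`baseline` under the same name raises an error, which is exactly the immutability the principle demands; correcting a mislabeled set would be the "annulment" case, handled outside this happy path.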

Notice that I am not saying that we can't make changes against a baseline. We most certainly can. And the result is a new configuration!
    When we make a change to a baseline, we aren't really changing the configuration that was baselined and then trying to use the same name for the result. Our changed result is a new configuration that took the current baseline and added our changes to it. And if we chose to name this new configuration, we give it a new name (one that is different from the name of any previously baselined configuration).
So a baseline name and the configuration it references are married: once the configuration is baselined, that name must forever after be faithfully monogamous to that configuration for better or for worse, for richer or for poorer, in sickness and in health for as long as they both shall live.

Always and forever? What about a divorce, or an annulment?
    An "anullment" in this case is when I didnt get it right the first time. Either I "blessed" a configuration as "baselined" that didnt really meet the criteria to be called a "baseline." Or else I incorrectly identified the corresponding configuration: I might have labeled the wrong version of a file, or I forgot to label some file (e.g., people often forget to label their makefiles), or I labeled something I shouldnt have.

    Correcting a baseline's labeled-set so that it accurately identifies ("tags") the baselined configuration isn't really changing the baseline; it's merely correcting the identification of it (because it was wrong up until then).

    What about a "divorce"? We all know that a divorce can be quite expensive, and require making payments for a long time thereafter. Retiring (and trying to reuse) a baseline name can have significant business impact. Retiring the baseline often means no longer providing support for that version of the product. Trying to then reuse the same baseline name of the same product for a new configuration can create lots of costly confusion and can even be downright misleading.

Note that the term "a baseline" should not be confused with the term "the baseline":
  • The term "the baseline" really means the latest/current baseline. It is a reference!

  • This means that "the baseline" is really just shorthand for "the latest baseline." And when we "change the baseline", we are changing the designation of which baseline is considered "latest": we are changing the reference named "latest baseline" to point to a newer configuration.
So The Baseline Immutability Principle states that once a configuration is baselined, the identification of the baseline name with its corresponding configuration is immutable: The set of elements (e.g., files and revisions) referenced by the baseline name must always be the same set. And that set must always correspond to the set that was used to produce the version of the product that was baselined.

I think this may be equivalent to Damon Poole's "TimeSafe Property" -- see Damon's paper The TimeSafe Property: a Formal Statement of Immutability for CM.

Let me know what you think!

Sunday, August 21, 2005

The Baseline Identification Principle

Yesterday (actually just a few hours ago) was my 40th birthday. I had a really nice celebration with my wife and kids at a picnic in the park. I really don't feel like I'm 40. My body thinks I am 50 - at least that's how it seems to be acting. My mind still isn't used to the fact that I'm now more than just a little bit older than all those leading men and leading ladies on TV and in movies. (Guess I can no longer identify them as part of my historical "baseline" :-)

Back again to describing The Principles of SCM! Last time I described The Baseline Reproducibility Principle. Now we'll take the next logical step and talk about the need to identify baselines.

If the ability to reproduce a baseline is fundamental to SCM, then it stands to reason that the ability to identify a baseline that I must be able to reproduce should also be pretty fundamental. If I have to be able to "show it", then I must first be able to "know it." And if I can't uniquely identify a baseline, it's pretty hard to reproduce it, because I can't be sure what I'm trying to reproduce.

So the baseline reproducibility principle gives rise to The Baseline Identification Principle: a baseline must be identified by a unique name that can be used to derive all the constituent elements of the baseline. In other words, we have to have a name, and a way of associating that name with all the objects (e.g., files) and their revisions that participate in the baseline.

How do we identify a baseline? By defining a name (or a naming system) to use, and using that name to reference the set of elements that were used to build/create the baselined version of the product.

A "label" or "tag" is one common way that a version control tool allows us to identify the sources of a baseline. This lets us associate a name with a specific set of repository elements and their corresponding revisions. Or it lets us associate a name with an existing configuration or event from which the set of elements and versions may be derived.

Sometimes tagging all the "essential" files and revisions in the repository is sufficient. Sometimes I need more information. I can always take any files or information that weren't previously in the version control repository and put them in the repository:
  • I can put additional information in a text file and checkin the file
  • I can export a database or binary object into some appropriate format (e.g., XML, or other formatted text)
  • some tools let me directly checkin a binary object (e.g., compilers, libraries, images, models) to the repository

If you currently have to label or tag more than just source-code and manually created text-files, then tell me about the other kinds of things you checkin and tag, and what special things you do to ensure they are identified as part of a baseline.

Monday, August 15, 2005

The Baseline Reproducibility Principle

Getting back to my earlier topic of The Principles of SCM, I think probably the first and most fundamental principle would be the requirement to be able to reproduce any baselined/released version of the software.

I'll call this The Baseline Reproducibility Principle: a baseline must be reproducible. We must be able to reproduce the "configuration" and content of all the elements that are necessary to reproduce a "released" version of the product.

By "released" I really mean "baselined" - it doesn't have to be a release to a customer. It could be a hand-off to any other stakeholder outside of development (like a test group, or a CM group, or QA, etc.). There is some basic vocabulary we need, like the terms "baseline" and "configuration." Damon Poole has started a vocabulary/glossary for SCM. Damon defines configuration but doesn't yet define a baseline.

A baseline is really shorthand for a "baselined configuration." And a baselined configuration is basically "a configuration with an attitude!" The fact that it's been "baselined" makes it special, and more important than other configurations that aren't baselined. We baseline a configuration when we need to promote/release it to another team/organization. By "baselining" it, we are saying it has achieved some consensually agreed upon level of "blessedness" regarding what we said it would contain and do, and what it actually contains and does.

Why do we need to be able to reproduce a baselined version of the product we produce and deliver? For several reasons:

  • Sometimes we want to be able to reproduce a reported problem. It helps to be able to reproduce the exact versions of the source code that made up the version of the product that the customer is using.

  • In general, when we hand-off a version of the product to anyone that may report problems or request enhancements, it is useful to be able to reproduce the versions of the files that make-up that version of the system to verify or confirm their observations and expectations.

  • When a "fix" is needed, customers are not always ready/willing to deploy our latest version (containing new funcitonality plus the fix). Even if they are, sometimes our business is not - it wants to "give" them the fix, but make more money on any new functionality. So we must provide a "patch" to their existing version

  • When a baseline is a version of the product, it includes the specs as well as the executable software. Configuration auditing requires us to know the differences between the current product+specs and their actual+planned functionality at the time the product was released.
Those are just a few reasons. There are many more I'm sure.

What does it mean to reproduce a baseline? At the very least it means being able to reproduce the exact set of files/objects and their corresponding versions that were used to produce/generate the delivered version of the product. (That includes the specs that may be audited against, as well as the code).

Sometimes being able to reproduce the source files for the code+docs (and build scripts) is enough. Often we need to be able to do more than that. Sometimes it may be necessary to reproduce one or more of the following as well:

  • The version of the compilers/linkers or other tools used to create that version of the product

  • The version of any third-party libraries, code/interfaces/headers used to build the product

  • Any other "significant" aspect of the computing environment/network utilized during the creation of the delivered version of the product
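One way to sketch capturing those extra aspects (all field names here are invented for illustration) is a manifest that records the toolchain, third-party library versions, and significant environment facts alongside the tagged sources, and is itself checked in and tagged with the baseline:

```python
import json
import platform
import sys

def build_manifest(sources, tools, third_party):
    """Assemble a baseline manifest covering more than just the sources."""
    return {
        "sources": sources,              # file -> revision, from the tag
        "tools": tools,                  # compiler/linker versions used
        "third_party": third_party,      # library versions built against
        "platform": platform.system(),   # significant environment facts
        "python": sys.version.split()[0],
    }

# Hypothetical example values for a small baseline:
manifest = build_manifest(
    sources={"main.c": "1.4"},
    tools={"gcc": "3.4.4", "ld": "2.16"},
    third_party={"zlib": "1.2.3"},
)
print(json.dumps(manifest, indent=2))
```

How much of this is worth recording is exactly the business/technical judgment call described above; the manifest just makes the decision explicit and reproducible.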
It can be too easy to go to more effort than necessary to ensure reproducibility of more than is absolutely essential. What is essential to reproduce may depend upon many business and technical factors (including some possible contractual factors regarding deployment/upgrade, operational usage and support).

The ability to reproduce a baseline is so basic to SCM that I can't believe it hasn't been a "named" principle before. I know others have certainly written about it as a principle; I'm just not recalling whether any of them gave the principle a name.

I think names are powerful things. Part of what makes software patterns so powerful is that they give a name to an important and useful solution to a recurring problem in a particular context. The pattern name becomes an element of the vocabulary of subsequent discussion on the subject. So I can use the terms "Private Workspace" or "Task Branch" in an SCM-related conversation instead of having to describe what they are over and over again.

This is why I'd like to develop a set of named principles for SCM. I think lots of folks have documented SCM principles, but didn't give them names. And they might "stick" better if we gave them names. If you know of any examples of SCM principles that are already well known and have a name, please let me know! (Please include a reference or citation if possible)

Tuesday, August 09, 2005

SCM Design Smells

First, the news of the passing of Peter Jennings (ABC World News Tonight Anchor) became known to me early this morning. I'm very saddened by this. The world has lost a great mind and communicator, and I've lost the trusted advisor I used to let into my home every evening since I was a teen to tell me about what was going on elsewhere in the world.

Getting back to my earlier topic of The Principles of SCM, I'd like to step through each of the Object-Oriented Design Principles mentioned in Robert Martin's book Agile Software Development: Principles, Patterns, and Practices, looking at how each applies to SCM.

Before I do that however, I'd first like to look at what "Uncle Bob" (as he is more affectionately called) refers to as design smells. These are as follows:
  • Fragility - Changes cause the system to break easily and require other changes.
  • Immobility - Difficult to disentangle entities that can be reused in other systems.
  • Viscosity - Doing things wrong/sloppy causes more friction and slows you down the next time you navigate through that code.
  • Needless Complexity - The System contains infrastructure that has no direct benefit (overengineering and/or "gold plating").
  • Needless Repetition - Repeated structures that should have a single abstraction (Redundancy).
  • Opacity - Code is hard to understand.
How might each of these apply to your SCM process and procedures? How might they apply to your branching & merging structure? Or to the organization of your source-tree?

Here's one possible "translation" of how these might apply to Software CM process and procedures:
  • Intolerant/Fragility - Changes cause the project, team, or organization to fall apart easily and require change to other parts of the project, team, or organization.
  • Rigidity/Immobility - Difficult to identify or disentangle practices and policies that can be reused by other projects, teams, or organizations.
  • Friction/Viscosity - Doing things wrong/sloppy causes more friction and slows you down the next time you navigate through that workflow or go on to the next one.
  • Wasteful/Needless Complexity - The Process contains "waste" in the form of extra steps, processing, handoff, waiting, or intermediate artifacts that do not "add value" for the customer, project, or organization.
  • Manual Tedium/Repetition - Repeated or tedious steps and activities should have a single mechanism to automate them.
  • Opacity - The project or process is hard to understand. (Lack of Transparency)

How would you translate design smells into process smells for Software CM?

Monday, August 01, 2005

The Customer Inversion Principle of Process Design

Looking back on last week's blog-entry suggesting we should CM to an Interface, not an Implementation, I wonder if that was really an instance of the stated design principle, or of something else ...

Oftentimes, the process and procedures that development must follow in order to comply with CM needs were developed by the people who receive the outputs of development, but who don't necessarily perform the development activities themselves. These process-area experts are the process designers, and the developers are the end-users of their process.

The conclusion of CM to an interface, not an implementation was to essentially invert or "flip" the relationship between who is the process "producer" and who is its "customer." The Principles of Lean Thinking suggest that processes should be designed by the practitioners who are most intimately familiar with performing the activities and their reasons for being a necessary step in the process: Those who receive the outputs of that process are its customers, and they get to specify the requirements, but not the implementation.

If true, this could perhaps be a statement of a different principle that we might call The Customer-Inversion Principle of Process Design:
  • Upstream Development procedures should not depend on downstream CM procedures, both should depend upon the abstract interfaces represented by development's exit criteria and CM's entry criteria.
  • Procedures should not be designed for their practitioners by the downstream customer of their results; practitioners should design their own procedures to meet the requirements of their downstream customers.
Not only does this "inversion" of the process producer/customer relationship conform with the design principle to separate interface from implementation, and with principles of lean thinking, it also aligns with Agile principles of putting the customer in charge and preferring customer collaboration over contract negotiation, when "negotiating" the right balance between the process requirements and the procedural implementation.
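By analogy with the object-oriented Dependency-Inversion Principle it echoes, the inversion might be sketched in code. This is only an illustrative sketch with made-up names (`DeliveryCriteria`, `CMEntryCriteria`), not any real tool's API:

```python
from abc import ABC, abstractmethod

# The abstract "interface": development's exit criteria = CM's entry criteria.
# Neither side depends on the other's concrete procedures; both depend on this.
class DeliveryCriteria(ABC):
    @abstractmethod
    def is_satisfied_by(self, delivery: dict) -> bool: ...

# CM (the customer of development's output) specifies *requirements*,
# not the implementation of how development meets them.
class CMEntryCriteria(DeliveryCriteria):
    def is_satisfied_by(self, delivery: dict) -> bool:
        return bool(delivery["built"]
                    and delivery["tests_passed"]
                    and delivery["change_id"])

# Development (the practitioners) owns *how* the criteria get met.
def development_procedure() -> dict:
    # ...whatever workflow the team designs for itself goes here...
    return {"built": True, "tests_passed": True, "change_id": "CR-123"}

criteria: DeliveryCriteria = CMEntryCriteria()
delivery = development_procedure()
assert criteria.is_satisfied_by(delivery)
```

Either side can change its internals without renegotiating with the other, so long as the shared criteria still hold.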

It also somewhat "inverts" (or at least turns on its head) what might be the more stereotypical perception by many agilists of CM as "controlling opponents" into one of "collaborating customers", and hopefully lends a new perspective on how to successfully pair with other organizational stakeholders who make additional demands for more formal standards, documentation, and tools upon an agile project. (See my earlier blog-entry on Building Organizational Trust by Trusting the Organization.)

Surely there must be some exceptions. What about when development has absolutely no CM knowledge or appreciation whatsoever? Should a knowledgeable CM person define development's CM activities for them?

To me this sounds similar to the situation of an expert needing to play the role of coach for a more junior engineer. A more directive or coaching style of leadership may be required, where CM doesn't necessarily give all the answers, but still plays a strong collaborative role: not only specifying its requirements, but also educating development about existing SCM patterns and their applicability and context, and helping them choose the most appropriate patterns and tradeoffs to design the CM procedures that development should use.

If development is not yet able to understand, and/or is willing to be initially "told" what to do, then "telling"/directing (instead of coaching) might be the first step. But ultimately I believe practitioners of a process need to feel a sense of ownership over their own process and procedures if they are to continue being effective. By helping them understand the process requirements, and the applicable patterns and principles, we help them become better developers and better advocates of effective CM. At least that's been my experience.

What do you think? Does it sound nice in theory but not work "in practice" in your own experience?

Friday, July 22, 2005

CM to an Interface, not to an Implementation

It's been a very busy and heated week or so for "Agile CM" discussion on CMCrossroads.com. First, we published the debate-provoking article about some fictitious "newly discovered" National Treasures of Agile Development. Then there was, and still is, the ensuing (and much heated) discussion on the Balancing CM and Agility discussion thread.

The gist of the article was that it purported to have discovered some historical artifact whose wording resembled the July 1776 Declaration of Independence, but was referring to agile developers complaining about, and wanting freedom from, the so-called "tyrannies" of many recurring heavyweight software CM process implementations. Of course, since it was posted in a forum for CM professionals, the article sparked some very strong reactions from some people (you can read them in the "Balancing CM and Agility" thread if you like).

One thing regarding the article and discussion thread that struck me is very much related to one of the SCM Principles from GoF Design Patterns that I blogged about last week:
  • Program to an Interface, Not an Implementation (a.k.a "Separate Interface from Implementation")

Some regarded the "National Treasures" article as an attack against CM and CM principles in general instead of taking it for what it was, a grievance with some commonly recurring characteristics of some heavyweight SCM process implementations.

That got me to thinking: maybe these could be recast as instances of violating an important SCM principle! Previously, I had blogged about "separating interface from implementation" as it applies to designing build scripts/recipes as well as CM processes and workflow. But I didn't really talk about how it applies to the roles involved in defining the CM activities and in executing them.

In the case of CM and development, I think it is frequently the case in many heavyweight SCM implementations that the developers did not get much say in defining the process and procedures they must follow to meet the needs of CM. It was instead defined primarily by configuration/build managers to meet needs like reproducibility, repeatability, traceability, and auditability, without enough consideration of development's needs for:
  • high quality+velocity achieved through rapid and frequent feedback that comes from very frequent integration (including being able to do their own merges+builds on their own "Active Development Line")

  • unit, integration, and regression testing with the latest development state of the codeline

  • effect fixes/changes quickly and efficiently without long wait/approval periods for authorization of changes they are likely to know need to be made ASAP (like most bugfixes and maintainability/refactoring changes found after the corresponding task was committed to the development codeline)
This leads me to think an important manifestation of separating interface from implementation is to:
      Configuration Manage to an Interface, Not to an Implementation!
What I mean by this is that when defining CM policies, processes and procedures, the impacted stakeholders (particularly the ones who will execute the procedures) not only need to be involved, but as the "consuming end-user" of the process and procedures, they need to play a key role in defining the process implementation.

  • CM (and other downstream stakeholders) should define their needs/requirements in terms of interfaces: policies, invariants, constraints, and entry criteria for what is given to configuration/build managers.

  • Development should be empowered to attempt to define the implementation: the processes/procedures and conventions for how they will meet or conform to the "interface" needed by CM.
This seems very "Agile" to me in that it fosters collaboration between CM and development. It lets CM be the "customer" of agile developers by giving them their CM process requirements and allowing them to drive and fine-tune the implementation's "acceptance tests" to ensure it meets their needs. It also allows development, the folks who are executing the development activities, to collaboratively define their own processes (in accordance with lean development/manufacturing principles).

Does that mean every different development group that delivers to CM should be allowed to come up with its own process? What about having consistent and repeatable processes? If the requirements/interfaces are defined and repeatedly met, why and when do we need to care? Each development group is defining and repeatably executing its process to meet consistent CM entry-criteria. And doesn't the CMM/CMMI allow for project-specific tailoring of standard organizational processes?

Still, there should be some mechanism for larger-grained collaboration, such as a so-called "community of practice" for all the development projects to share their successful development practices so that knowledge can be reused throughout the organization. And every time CM collaborates with yet another project, they can point to an existing body of development processes being successfully used that the project they are engaging might want to consider adopting/adapting.

I do think that if they (development + CM) choose to use a different process or significantly adapt an existing one, it would be good to know the differences and WHY they were necessary. Seems to me that matches the description of what an SCM Pattern is: something that defines a solution to an SCM problem in a context and captures the forces, their resolution, and the rationale behind it.

Then when CM and development work together the next time for the next project, they simply look at the set of SCM patterns they have been growing (into a pattern language perhaps) and decide which combination of patterns to apply, and how to apply them, to balance and resolve the needs of both CM and development, collaboratively!

Thursday, July 14, 2005

SCM Principles from GoF Design Patterns

I was reading Leading Edge Java online and saw an article on Design Principles from Design Patterns. The article is part III of a conversation with Erich Gamma, one of the famous Gang of Four who authored the now legendary book Design Patterns: Elements of Reusable Object-Oriented Software.

In this third installment, Gamma discusses two design principles highlighted in the GoF book:
  • program to an interface, not an implementation (a.k.a "separate interface from implementation")

  • favor object composition over class inheritance
Another recurring theme echoed throughout the book is:
  • encapsulate the thing that varies (separate the things that change from the things that stay the same during a certain moment/interval)
I think these three GoF Design Principles have pretty clear translations into the SCM domain:
  • Separate Interface from Implementation - this applies not only to code, but to Make/ANT scripts when dealing with multi-platform issues like building for different operating environments, windowing systems, or component libraries from different vendors for the same functionality. We are often able to use variables to represent the target platform and corresponding set of build options/libraries: the rules for building the targets operate at the abstract level, independent of the particular platform. This can also apply to defining the process itself, trying to ensure the high-level workflow roles, states & actions are mostly independent of a particular vendor tool.

  • Prefer Composition & Configuration over Branching & Merging - This is one of my favorites, because it talks about one of my pet peeves: inappropriate use of branching to solve a problem that is better solved by using compile-time, install-time, or run-time configuration options to "switch on" the desired combinations of variant behavior. Why deal with the pain of branching, and then having to merge every subsequent fix/enhancement to multiple variant-branches, if you can code it once in the same directory structure with patterns like Strategy, Decorator, Wrapper-Facade, or other product-line practices?

  • Isolate Variation - this one is almost too obvious. Private Workspaces and Private Branches isolate variation/change, as does just about any codeline. And we do the same things in the build-scripts too.
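As a toy illustration of the second principle, here is a minimal Python sketch (all names hypothetical) of selecting variant behavior with a run-time configuration switch and the Strategy pattern, instead of maintaining a branch per variant:

```python
# One codeline, with per-platform variants selected by configuration via the
# Strategy pattern, instead of a "windows" branch and a "unix" branch that
# must each absorb every subsequent fix by merging.

class ArchiveStrategy:
    def pack(self, files):
        raise NotImplementedError

class ZipStrategy(ArchiveStrategy):
    def pack(self, files):
        return "zip:" + ",".join(files)

class TarStrategy(ArchiveStrategy):
    def pack(self, files):
        return "tar:" + ",".join(files)

# The "configuration option" that switches on the desired variant behavior.
STRATEGIES = {"windows": ZipStrategy(), "unix": TarStrategy()}

def make_release(files, platform):
    # A fix made here lands exactly once -- no cross-branch merge needed.
    return STRATEGIES[platform].pack(files)

print(make_release(["a.c", "b.c"], "unix"))   # prints "tar:a.c,b.c"
```

Adding a third variant means adding one new strategy class and one dictionary entry, not creating and forever merging a third branch.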

Can you think of any other valid interpretations of the above design rules in terms of how they translate into the SCM domain?

Tuesday, July 05, 2005

Whitgift's Principles of Effective SCM

In an effort to try and deduce/derive The Principles of SCM, I'm going through the SCM literature to see what other published SCM experts have identified as SCM Principles.

Among the best "oldies but goodies" are the books by David Whitgift (Methods and Tools for SCM, Wiley 1991) and Wayne Babich (SCM: Coordination for Team Productivity, Addison-Wesley 1986). These books are 15-20 years old, but most of what they say still seems relevant for software development and well-aligned with agile and iterative development methodologies.

Today I'll focus on David Whitgift's writings. In the very first chapter ("Introduction") he says that CM is more concerned with the relationships between items than with the contents of the items themselves:
  • This is because CM needs to understand the decomposition + dependencies among and between all the things that are necessary for a change to result in a correct + consistent version of the system that is usable to the rest of the team (or its stakeholders).

  • He also states that most of the problems that arise from poor CM and which CM is meant to resolve are issues of coordination/communication and control. And he gives a shopping list of common problems encountered in each of five different areas (change-management, version-management, build-management, repository-management, item identification/relationship management).
At the end of section 1.2 in the first chapter, Whitgift writes:
In the course of the book, five principles become apparent which are the keys to effective CM. They are that CM should be:
  • Proactive. CM should be viewed not so much as a solution to the problems listed in the previous section but as a collection of procedures which ensure the problems do not arise. All too often CM procedures are instituted in response to problems rather than to forestall them. CM must be carefully planned.

  • Flexible. CM controls must be sensitive to the context in which they operate. Within a single project an element of code which is under development should not be subject to restrictive change control; once it has been tested and approved, change control needs to be formalized. Different projects may have very different CM requirements.

  • Automated. All aspects of CM can benefit from the use of software tools; for some aspects, CM tools are all but essential. Much of this book is concerned with describing how CM tools can help. Beware, however, that no CM tool is a panacea for all CM problems.

  • Integrated. CM should not be an administrative overhead with which engineers periodically have to contend. CM should be the linchpin which integrates everything an engineer does; it provides much more than a repository where completed items are deposited. Only if an engineer attempts to subvert CM controls should he or she be conscious of the restrictions which CM imposes.

  • Visible. Many of the issues raised in the previous section stem from ignorance of the content of items, the relationships between items, and the way items change. CM requires that any activity which affects items should be conducted according to clearly defined procedures which leave a visible and auditable record of the activity.

Whitgift's "proactivity principle" might seem a bit "Big Planning/Design Up Front" rather than responsive, adaptive, or emergent. And his "visibility principle" may seem like heavyweight traceability and documentation. However, in light of being flexible, automated, and integrated, it might not be as bad as it sounds.

Each of the above five Principles of Effective CM seem (to me) to be potentially competing objectives ("forces") that need to be balanced when resolving a particular SCM problem. I wouldn't regard them as principles in the same sense as Robert Martin's "Principles of Object-Oriented Design."

Each of the subsequent chapters in Whitgift's text delves into various principles for the various sub-areas of CM. I'll write about those in subsequent blog-entries.

Monday, June 27, 2005

FDD - an agile alternative to XP

I think there are a lot of folks out there who judge all of "Agile" by what they know of Extreme Programming (XP), quite possibly because that is all or most of what they've heard about agile development.

I think folks (especially agile skeptics) should take a close look at Feature-Driven Development (FDD), if for no other reason than because it is an example of an agile method that is very different from XP. FDD is quite agile while still employing many of the traditional practices that agile skeptics are probably more accustomed to seeing.

For example, FDD has the more traditional progression of waterfall-ish phases in its iterations (while still being highly iterative and collaborative). FDD does conduct up-front planning, design and documentation and relies very heavily upon domain modeling. FDD also uses feature-teams with chief architects/programmers and traditional code-ownership and code-review (as opposed to pair-programming and refactoring).

To that end, here are some more resources about FDD so folks can learn more about it and become more aware of the fact that XP isn't the only agile "game" in town when it comes to development practices (SCRUM and DSDM are focused primarily on planning rather than development practices):

Sunday, June 19, 2005

Language Workbenches

Wouldn't ya know it! On the very same day that I blogged about Customer-Oriented Requirements Architecture (CORA) as The Next Big Thing, it turns out Martin Fowler wrote an article about Language Workbenches which seems to be getting at exactly the same core idea:
  • Language Workbenches utilize things like Meta-Programming Systems and Domain-Specific Languages (DSLs) to let the developer work more closely in the conceptual domain of the various subject-matter "spaces" of the requirements and the design.

  • They provide all sorts of IDE and refactoring support for the language domain they are created to support.

  • It seems a bit more focused on the design-end, whereas my CORA idea is a bit more focused on applying architectural principles and design patterns to the expression and maintenance of the requirements. I believe the end-result is the same however.

The more enabled we become at formally expressing the requirements in a language and framework more closely bound to the problem domain, the more important it will be to apply principles, patterns, and practices of refactoring, encapsulation, modularity, etc. to the groupings of requirements we develop and the relationships within and between them. And Language Workbenches become part of the environment that supports, maintains, and automates requirements dependency management and traceability.

Feature-Driven Development (FDD) has an interesting way of trying to do some of this with its feature-sets and color modeling patterns. See recent discussions on the agilemanagement YahooGroup and the newly created colormodeling YahooGroup for more details.

I may blog in the future about the relationship between FDD, Color-modeling, "grammar rules" for domain-modeling, DSLs, and the Law of Demeter.

Sunday, June 12, 2005

Customer-Oriented Requirements Architecture - The Next Big Thing?

Robert Martin writes about what is "The Next Big Thing." From structured programming, to modular programming, object-oriented, extreme, aspect-oriented, and service-oriented ... all have been heavily hyped in their heyday.

Bob notes almost all of those started off as just "programming" and then each "trifurcated", having "design" and later "analysis" come afterward - completing the waterfall "triplet." Robert (Uncle Bob) Martin writes:
“I hope the next big thing is the big thing that we’ve needed for the last thirty years and haven’t had the guts to actually say. I hope the next big thing is professionalism. Perhaps a better word would be Craftsmanship.”
I notice that, within the past year, Aspect-Oriented Programming (AOP) and Aspect-Orientation has now "trifurcated", completing the waterfall triplet: the last few years have seen books and articles on Aspect-Oriented "Design", and within the last 6-8 months we have a book on "Aspect-Oriented Analysis and Design" and another on Aspect-Oriented Use-Cases.

Aspects and analysis/requirements (use-cases) interest me because I wonder if the trend we're seeing isn't so much aspects, but the same trend indicated by Charles Simonyi's emphasis on Intentional Programming (he now calls it "Intentional Software"), by the emphasis on Domain-Specific Languages (DSLs) in Microsoft's "Software Factories", and to some extent even by eXtreme Programming's representation of tests as precise requirements specification, and the way ObjectMentor's own FitNesse framework is almost a DSL-like front-end for that.

I think programming languages and software specifications have evolved beyond the domain of the coder and designer into the domain of the analyst and the customer.
  1. Assembly language was just some ever-so-slight mnemonic sugar (it wasn't even syntactic sugar) on "raw" machine code. We used some short, cryptic but mnemonic names, but it was still making the programmer think in terms of the way the computer's "processing" was organized. We had to think in terms of opcodes and registers and addresses in storage. We had to try and think like the machine.

  2. Then we got structured languages. You know, the kind where all that pseudo-code we used to write with words like "if then else", "while" and "for" could now be written directly in the language. We made the programming language try and represent the way the sequential logic seemed to go down in our heads. But it was still coding.

  3. Then with abstractions and objects we made programming languages cross the threshold from mere coding & programming to design, where we could now express not merely logical processing directly in the language, but design abstractions and encapsulation and various associations and interfaces.

  4. Then we built on that with patterns and libraries, and have now adorned it not just with inheritance and templates, but also with Aspects and Java's "Annotations".
But one of our biggest technical problems in software development is still the accurate communication of the customers' and users' needs and wants to the folks who have to translate that into working software. All those "shall" and "should" statements and "formal methods" like VDM and Z weren't as much help as we hoped.

Enter test-driven development, where we let the customers talk to us in their native language, but we try to write the technical requirements not as vague prose, but rather as executable code in the form of tests, so we can get immediate feedback and work in short cycles. Fit and FitNesse attempt some of the goals of intentional programming by making a little DSL and GUI that let us write something that looks closer to prose but still generates source code for our executable tests.
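As a toy illustration of the idea (not actual Fit/FitNesse syntax; the requirement and all names are hypothetical), here is a customer requirement written as executable code instead of a "shall" statement:

```python
# Requirement, in the customer's words: "A returned item is refunded at the
# purchase price, minus a 10% restocking fee."

def refund(purchase_price: float) -> float:
    """Compute the refund amount for a returned item."""
    return round(purchase_price * 0.90, 2)

# The same requirement, as an executable test instead of vague prose.
# If the implementation drifts from the customer's intent, this fails loudly.
def test_refund_deducts_restocking_fee():
    assert refund(100.00) == 90.00
    assert refund(19.99) == 17.99

test_refund_deducts_restocking_fee()
```

The prose statement can be debated; the test either passes or fails, which is exactly the rapid feedback TDD is after.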

What this also does, along with use-cases, is bring the world of encapsulation and abstraction into the requirements/analysis domain. We can more formally and precisely attempt to package or encapsulate requirements into logical entities and more formally manage the dependencies between them.

Use-cases were a start at this, though it was rare to see the various use-case relationships (extends, specializes, generalizes, uses, etc.) utilized all that much. Rick Lutowski's little-known FREEDOM development methodology also gets very formal about applying encapsulation to requirements. It might become a little better known now that his book, Software Requirements: Encapsulation, Quality and Reuse, is out as of May 2005.

With DSLs and Intentional software combined with the ability to encapsulate requirements, we can then start talking about managing their dependencies much the same way we do for code/design today, which means we’ll be able to talk about things like Refactoring of requirements, and “Design Patterns” of Requirements Design/Engineering (and ultimately “requirements reuse”).

And if we ever get even close to that, much of what we call "traceability" today will become a thing of the past. The dependencies in the requirements will be precisely specified and apparent from how we express them in encapsulations and their formal relationships (just like C++/C#/Java code today specifies classes, interfaces, packages, and their inheritance, composition, and uses relationships).

So, if it is true that:
  • Computers and software are useful primarily to the extent that they allow us to visualize & virtually manipulate the concepts we could previously only imagine within our minds.
  • The evolution of programming languages has gone from trying to make it easier for the programmer to understand the machine's representational language, to making it easier for the language to represent the thoughts of programmers and designers
  • And with software becoming ever more pervasive and ubiquitous along with the increasing demand for regulatory traceability
Then perhaps Aspects and Sarbanes-Oxley combined with DSLs (and/or Intentional Software) and Test-Driven development will get us from Object-Oriented past Aspect-Oriented or Service-Oriented to arrive at Customer-Oriented Requirements Architecture of encapsulated requirements and the corresponding automated "acceptance" tests.

Then maybe the next big thing for software development may well be Customer-Oriented Requirements Architecture (CORA) and the evolution of expressive environments (I won't call them languages, because I don't think they'll be strictly textual) that allow the business analyst and customer to express their needs and requirements in terms closer to their own thoughts and vocabulary, and to directly transform that into encapsulated entities from which tests can be automatically generated and executed.

It would basically be creating a simulation environment with rapid feedback for exploring thoughts about what the software might do and analyzing the consequences. And just maybe it would enable the kind of Craftsmanship that Bob Martin's been doing for programming and design, only with the actual requirements!

But will we ever get there? Or will we be too busy chasing the next “Next Big Thing”? Or maybe I just need more sleep! What do you think? Is it too far fetched?

Sunday, June 05, 2005

The Art of Project Management

I just received a new O'Reilly & Associates book in the mail: The Art of Project Management, by Scott Berkun.

I didn't ask for it; O'Reilly just sent it to me. I perused it briefly. Judging from the table of contents, it looks promising and seems to focus on the reality of managing software projects. This is not a book about using MS-Project or PERT/Gantt charts or estimation. This book is about pragmatic project-management realities rather than project-management science or theory.

Looking through some of the text, I saw a few things I had some strong negative feelings about, and some other things that strongly resonated with me. So I'm not sure yet if I'll end up loving it or hating it, but I'll definitely have a lot to ponder and learn from it.

Scott also has a new blog that looks pretty good, and has some excellent essays on leadership and teamwork.

Sunday, May 29, 2005

The Trouble with Traceability

I've taken many of my thoughts from my previous blog-entries on traceability and expounded & expanded upon them in my May 2005 "Agile SCM" column of CMCrossroads Journal. The article is entitled The Trouble with Tracing: Traceability Dissected. It describes:
  • Ten Commandments of Traceability
  • Nine Complaints against Traceability
  • Eight Reasons for Traceability
  • The Seven Functions of SCM
  • The Six Facets of Traceability
  • The Five Orders of Traceability
  • The Four Rings of Stakeholder Visibility
  • Three Driving Forces for Traceability
  • Two Overarching Objectives: Transparency & Identification
  • One Ultimate Mission: Fostering a Community of Trust

On a separate note ... I recently realized that the date on which I first author my blog-entries has been very different from the published date of the entry. I was saving them, but not publishing them.

Normally I may take a day or two to "clean up" an entry (add the URLs, fix the formatting, and fight with the WYSIWYG blog-composer) and publish it within a couple of days of creating it. But we had a grave illness (and then a death) in my family that consumed most of May for us, and I didn't realize until recently that my blog-entries for the last half of April and most of May hadn't been published yet.

So for that I must apologize. And I'll try to do better about publishing more regularly (ideally at least weekly).

Saturday, May 21, 2005

Situational Ownership: Code Stewardship Revisited

I had some interesting feedback on my previous blog-entry about Code Stewardship. Most apparent was that I utterly failed to successfully convey what it was. Despite repeated statements that stewardship was not about code access, it seems everyone who read it thought the code steward, as described, was basically a "gatekeeper" from whom others must acquire some "write-token" for permission to edit a class/module.

I am at a loss for what to say other than that is not at all how it works. The code-steward serves as a mentor and trail-guide to the knowledge in the code. Consulting the code-steward is not about getting permission, it is about receiving knowledge and guidance:
  • The purpose of the protocol for consulting the code-steward is to ensure two-way communication and learning (and foster collaboration). That communication is a dialogue rather than a mere "token" transaction. It's not a one-way transfer of "control", but a two-way transfer of knowledge!
Perhaps I would have been better off saying more about how Peter Block defines stewardship in his book of the same name (Stewardship: Choosing Service over Self-Interest, see an interesting review here and another one over here):
  • Stewardship is being accountable to the larger team or organization by "operating in service, rather than in control, of those around us."
  • "We choose service over self-interest most powerfully when we build the capacity of the next generation to govern themselves"
  • Stewardship offers a model of partnership that distributes the power of choice and resources to each individual.
  • Stewardship is personal - everyone is responsible for outcomes; mutual trust is the basic building block, and the willingness to risk and be vulnerable is a given.
  • Total honesty is critical. End secrecy. Give knowledge away, because it is a form of power.
When practiced properly, collective code ownership is in fact an ideal form of stewardship (but not the only form). Stewardship may ultimately end-up as collective-ownership if given a sufficient period of time with the same group of people.

However, early on I would expect stewards to take a more direct role. The form of code-ownership that Feature-Driven Development (FDD) practices may seem fairly strict at first, but I believe it is really intended to be the initial stage of code-stewardship, corresponding to the first two quadrants of the situational leadership model.

I believe the form in which stewardship should manifest itself is situational, depending on the skill and motivation of the team and its members. In Ken Blanchard's model of situational leadership, there are four quadrants of leadership-style, each of which should be used for the corresponding combination of hi-lo motivation and hi-lo skill at a given task:
  • Directing (hi directive + lo supportive, for "enthusiastic beginners")
  • Coaching (hi directive + hi supportive, for "disillusioned learners")
  • Supporting (lo directive + hi supportive, for "reluctant contributors")
  • Delegating (lo directive + lo supportive, for "peak performers")
If we apply the concepts and principles of "stewardship" using the appropriate situational leadership-style, the outwardly visible result may appear to transition from individual code ownership, to code guardians/gate-keepers, then code coaches/counselors, and finally to truly collective ownership.

So I would say it is the presence of stewardship that is the key to succeeding with either individual or collective code ownership. If stewardship is present, then both can succeed; if it is absent, it's likely that neither will. The collective and individual "styles" are the extreme ends of the spectrum, with "code counselors" as the style in between those two extremes.

Saturday, May 14, 2005

Dreamy SCM Patterns Superheroes!

My SCM Patterns co-author and I received an honorable mention in the letters-to-the-editor section of the June 2005 issue of Software Development Magazine. The April issue had an article about the software development "dream team," giving nicknames and characteristics of these fictitious "software superheroes."

In this month's letters section is a letter from Curtis Yanko entitled "Team Member Missing" ... Curtis writes:
I just read 'The Dream Team' (Apr 2005), and while I found it informative and entertaining, I couldn't help but feel a little empty. I expected Software Development, of all magazines, to recognize the importance of the configuration manager. This superhero would mix Martin Fowler's appreciation of adapting the right amount of agility for any given team with Stephen Berczuk and Brad Appleton's understanding of SCM patterns. Automated builds and continuous integration are what will really allow this Dream Team to make super-human progress!
Many thanks Curtis!

Friday, May 06, 2005

Single Codebase - How many Codelines?

On the YahooGroup for the 2nd edition of Kent Beck's book Extreme Programming Explained, Kent described a practice he calls Single Code Base:

There is only one code stream. You can develop in a temporary branch, but never let it live longer than a few hours. Multiple code streams are an enormous source of waste in software development. I fix a defect in the currently deployed software. Then I have to retrofit the fix to all the other deployed versions and the active development branch. Then you find that my fix broke something you were working on and you interrupt me to fix my fix. And on and on.

There are legitimate reasons for having multiple versions of the source-code active at one time. Sometimes, though, all that is at work is simple expedience, a micro-optimization taken without a view to the macro-consequences. If you have multiple code bases, put a plan in place for reducing them gradually. You can improve the build system to create several products from a single code base. You can move the variation into configuration files. Whatever you have to do, improve your process until you no longer need them. [... example removed ...]

Don't make more versions of your source code. Rather than add more codebases, fix the underlying design problem that is preventing you from running from a single code base. If you have a legitimate reason for having multiple versions, look at those reasons as assumptions to be challenged rather than absolutes. It might take a while to unravel deep assumptions, but that unraveling may open the door to the next round of improvement.
Kent is equating creation of a new codeline with establishing a new code base within the same repository. He does so with good reason: creating a new codeline for supporting a legacy release and/or multiple customer variants is indeed creating a new project instance, complete with its own separately evolving copy of the code.

I posted a lengthy response to Kent's initial description of a Single Code Base. I feel I understand Kent's position intimately well. At the same time, I think that having to support one or more legacy releases is a business constraint that is far more unavoidable than Kent's post seems to suggest. I summarized my opinion as:
  1. Transient branches are fine (even ones that last more than a few hours) and do not cause the waste/retrofitting described. But you do need to follow some rules regarding integration structure and frequency.
  2. Variant branches are "evil," and should be solved with good architecture/factoring, or else with configuration that happens at later binding-time.
  3. Multiple release branches are often a necessity in order to support multiple releases. Supporting multiple releases is highly undesirable, but often unavoidably mandated by the business/customer.
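The alternative in the second point — pushing variation into configuration that binds at run time rather than into variant branches — can be sketched as follows. This is a minimal illustration, not taken from the post; the customer name and settings are hypothetical:

```python
import json

# One code base: customer-specific behavior is data, not a branch.
# These hypothetical settings would otherwise be hard-coded on
# separate per-customer variant branches.
DEFAULTS = {"currency": "USD", "max_users": 100, "audit_log": False}

def load_variant(config_text):
    """Merge one customer's configuration over the shared defaults."""
    settings = dict(DEFAULTS)
    settings.update(json.loads(config_text))
    return settings

# In practice the JSON would live in a per-deployment config file.
acme = load_variant('{"currency": "EUR", "audit_log": true}')
print(acme["currency"])   # customer override
print(acme["max_users"])  # inherited from the shared default
```

Any behavior not overridden falls through to the single shared code base, so a fix to the defaults reaches every "variant" at once — no retrofitting across branches.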
Under the heading of "Transient Branching", I include patterns like Private Branch, Task Branch, and "organizational coping" branches like Docking Line and the Release-Line + Active-Development-Line pair. Another example (though not short-lived) is Third Party Codeline. And of course, if any branching is done, then proper use of a Mainline is essential (I think Mainline does for branching what refactoring does for source-code).

So while I vigorously agree with the desire to avoid adding new codelines to support multiple releases, and that it's certainly a good idea to question whether one is truly necessary (and to fight "like heck" against using branches as a long-term solution for handling multiple variants), I don't think it's a great idea to challenge the multiple-maintenance constraint too vehemently once you understand the business need that is driving it.

We might still disagree with doing it, but at that point I think we need to "bite the bullet" and do it while perhaps exploring softer communication alternatives to persuade the business in the future. Part of that can be getting the business to:
  • Fully acknowledge and appreciate that each additional supported release/variant is a bona fide new project, with all the associated implications of added cost, effort, management, and administration. (Often a new variant-line or release-line adds 40%-80% additional effort to support and coordinate.)
  • Agree that if multi-tasking an individual decreases productivity and flow by increasing interruptions and context-switching, then "multi-projecting" the same team/codebase is an even grander black hole that sucks away resources and productivity for many of the same reasons
  • Agree that when we do decide to support an additional release/variant, the new project should have some sort of charter and/or service-level-agreement (SLA) that clearly defines the scope and duration of the agreed upon effort and its associated costs.
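That 40%-80% overhead figure compounds quickly as supported lines accumulate, which is worth making concrete when negotiating with the business. Here is a back-of-envelope sketch (the cost model and the 60% midpoint figure are illustrative assumptions, not measured data):

```python
# Rough cost model: each extra supported release/variant line adds
# some fraction of a full project's effort as ongoing overhead.
# overhead=0.60 is just the midpoint of the 40%-80% range above.
def total_effort(base_effort, extra_lines, overhead=0.60):
    """Effort units needed to sustain one mainline plus extra lines."""
    return base_effort * (1 + extra_lines * overhead)

base = 10.0  # say, person-months for the main development line alone
for n in range(4):
    print(n, "extra line(s):", total_effort(base, n), "person-months")
```

Even under this crude linear model, three supported legacy lines nearly triple the total effort — exactly why each one deserves its own charter and SLA.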

For some additional reading, see DualMaintenance and UseOneCodeLine on the original Wiki-web, and BranchingAndMerging, ContinuousIntegration and AgileSCMArticles on the CMWiki Web