Brad Appleton's ACME Blog: Agile SCM Principles

Monday, July 24, 2006

Agile SCM Principles - From OOD to TBD+CBV+POB

I finally finished a set of articles that I'd been working on, on and off, for almost 10 years, on the subject of "translating" principles of OOD into principles of SCM. See the following:

The principles of OOD translated into principles of Task-Based Development (TBD), Container-based Versioning (CBV), and Project-Oriented Branching (POB).

Here are the principles that I translated. Most of them are from Robert Martin's book Agile Software Development: Principles, Patterns, and Practices, but a couple of them are from The Pragmatic Programmers:

Principles of Class Design
SRP	The Single Responsibility Principle	A class should have one, and only one, reason to change.
OCP	The Open-Closed Principle	A class should be open for extension (of its behavior) but closed against modification (of its contents).
LSP	The Liskov Substitution Principle	Derived classes must be substitutable for (require no more, and ensure no less than) their base classes.
DIP	The Dependency Inversion Principle	Depend on abstract interfaces, not on concrete details.
ISP	The Interface Segregation Principle	Make fine grained interfaces that are client specific.
LOD	The Law Of Demeter (Principle of Least Assumed Knowledge)	Objects and their methods should assume as little as possible about the structure or properties of other objects (including its own subcomponents).
DRY	The "Don't Repeat Yourself" Principle	Every piece of knowledge should have one authoritative, unambiguous representation within the system.
Principles of Package Design
REP	The Release Reuse Equivalency Principle	The granule of reuse is the granule of release.
CCP	The Common Closure Principle	Classes that change together are packaged together.
CRP	The Common Reuse Principle	Classes that are used together are packaged together.
Principles of Package Coupling
ADP	The Acyclic Dependencies Principle	The dependency graph of packages shall contain no cycles.
SDP	The Stable Dependencies Principle	Depend in the direction of stability.
SAP	The Stable Abstractions Principle	Abstractness increases with stability.

Here is what I ended up translating them into. Note that some of the OOD principles translated into more than one version-control principle, because they applied to more than one of changes/workspaces, baselines, and codelines. I'm not real thrilled about the names & acronyms for several of them and am open to alternative names & acronyms:

General Principles of Container-Based Versioning
The Content Encapsulation Principle (CEP)	All version-control knowledge should have a single authoritative, unambiguous representation within the system that is its "container." In all other contexts, the container should be referenced instead of duplicating or referencing its content.
The Container-Based Dependency Principle (CBDP)	Depend upon named containers, not upon their specific contents or context. More specifically, the contents of changes and workspaces should depend upon named configurations/codelines.
The Identification Insulation Principle (IDIP)	A unique name should not identify any parts of its context, nor of its related containers (parent, child or sibling), that are subject to evolutionary change.
The Acyclic Dependencies Principle (ADP)	The dependency graph of changes, configurations, and codelines should have no cycles.
Principles of Task-Based Development
The Single-Threaded Workspace Principle (STWP)	A private workspace should be used for one and only one development change at a time.
The Change Identification Principle (CHIP)	A change should clearly correspond to one, and only one, development task.
The Change Auditability Principle (CHAP)	A change should be made auditably visible within its resulting configuration.
The Change/Task Transaction Principle (CHTP)	The granule of work is the transaction of change.
Principles of Baseline Management
The Baseline Integrity Principle (BLIP)	A baseline's historical integrity must be preserved - it must always accurately correspond to what its content was at the time it was baselined.
The Promotion Leveling Principle (PLP)	Define fine-grained promotion-levels that are consumer/role-specific.
The Integration/Promotion Principle (IPP)	The scope of promotion is the unit of integration & baselining.
Principles of Codeline Management
The Serial Commit Principle (SCP)	A codeline, or workspace, should receive changes (commits/updates) to a component from only one source at a time.
The Codeline Flow Principle (CLFP)	A codeline's flow of value must be maintained - it should be open for evolution, but closed against disruption of the progress/collaboration of its users.
The Codeline Integrity Principle (CLIP)	Newly committed versions of a codeline should consistently be no less correct or complete than the previous version of the codeline.
The Collaboration/Flow Integration Principle (CFLIP)	The throughput of collaboration is the cumulative flow of integrated changes.
The Incremental Integration Principle (IIP)	Define frequent integration milestones that are client-valued.
Principles of Branching & Merging
The Codeline Nesting Principle (CLNP)	Child codelines should merge and converge back to (and be shorter-lived than) their base/parent codeline.
The Progressive-Synchronization Principle (PSP)	Synchronizing change should flow in the direction of historical progress (from past to present, or from present to future): more conservative codelines should not sync-up with more progressive codelines; more progressive codelines should sync-up with more conservative codelines.
The Codeline Branching Principle (CLBP)	Create child branches for value-streams that cannot "go with the flow" of the parent.
The Stable Promotion Principle (SPP)	Changes and configurations should be promoted in the direction of increasing stability.
The Stable History Principle (SHIP)	A codeline should be as stable as it is "historical": The less evolved it is (and hence more mature/conservative), the more stable it must be.
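
To make the Acyclic Dependencies Principle (ADP) a bit more concrete, here is a minimal sketch of a cycle check over a dependency graph of named containers. The function name `find_cycle`, the dictionary layout, and the container names are assumptions made up for illustration; they do not come from the articles or from any version-control tool.

```python
from typing import Dict, List, Optional


def find_cycle(deps: Dict[str, List[str]]) -> Optional[List[str]]:
    """Return one dependency cycle as a list of names, or None if acyclic.

    `deps` maps a container name (change, configuration, or codeline)
    to the names of the containers it depends upon. Hypothetical layout.
    """
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color: Dict[str, int] = {}
    path: List[str] = []

    def visit(node: str) -> Optional[List[str]]:
        color[node] = GRAY
        path.append(node)
        for dep in deps.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path: that closes a cycle
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None


# Hypothetical example: generated items may depend on hand-coded packages,
# but not the other way around.
layered = {"generated_etl": ["hand_coded_pkg"], "hand_coded_pkg": []}
tangled = {"a": ["b"], "b": ["a"]}
print(find_cycle(layered))
print(find_cycle(tangled))
```

A check like this could run as part of a build or a commit hook, but where it belongs depends on how a given team records its dependencies.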

You can read the 2nd article to see which version-control principles were derived from which OOD principles. Like I mentioned before, I'm not real thrilled about the names & acronyms for several of them and am open to alternative names & acronyms. So please share your feedback on that (or on any of the principles, and how they were "derived").

3 comments:

Anonymous said...: I'm finding it very difficult to relate the principles to my current version control environment (CVS supporting an Oracle Data Warehouse development).

There are things that we are trying to achieve, and most of our habits seem to be helpful.

I'm sure we could do better, and I'm trying to use the principles to apply to our situation. Here's where the approach looks promising...

* Content encapsulation - we represent each component of our ETL solution as a single exported text file in CVS

* Acyclic dependencies - component usage is hierarchical (ETL items generated from the tool can depend on hand-coded packages but not vice versa); the CVS repository structure mirrors the code deployment structure

...and here's where I'm struggling:

* Container-based dependency - I find this hard to understand. Can you provide an example?

* Identification insulation - each of our component names is chosen to reflect the context to some extent, so that its purpose can be inferred more easily in the deployed environment; and the exported, 'encapsulated', components actually have their location within the code hierarchy embedded in the file (a bit like 'package com.acme.blah' in a Java file?)

regards
Bob; 7:02 PM
Brad Appleton said...: Hi Bob!
I agree that this draft of version control principles isn't real easy to comprehend. The words and phrasing are still more steeped in the object-oriented domain and not targeted to a version-control audience. I need to work on rewording that, and probably renaming several of the principles too (so your feedback is greatly appreciated).

Content Encapsulation is really about identifying those things from the "version control" domain that correspond to classes/objects (and hence units of encapsulation and abstraction). I think those things are:

Changes: a "change" encapsulates a set of revisions to a set of files, and they are all checked-in/merged together.

Versions: a version (or a "named configuration") is represented by a tag or label. If the version is "blessed", we baseline it and call it a "baseline". We refer to the contents of the version by using the name of the label/tag instead of trying to enumerate specific file revisions that belong to that "version".

Codelines: a codeline encapsulates a "current/latest" version in an evolving progression of versions. Instead of trying to keep-up with the name of the most recent tag/label I simply use the name of the codeline, and I either rely on the "tip" of the codeline to be the "latest and greatest stuff", or else I rely on some kind of "floating" or "sticky" tag to always reflect the "last good build" of the codeline.

This is where Container-Based Dependency comes into play. Anytime I want to obtain the contents of a particular "view" or "configuration" of files, I should try to refer to the name of the container, instead of trying to point to specific contents.

This happens most often when I am first populating my workspace (sandbox) with file versions, and also when I'm merging a set of files into my workspace. The application of CBDP would be ...

When populating a workspace, use a codeline or baseline to reference the initial view of versions to see/use. Don't try to cherry-pick specific file revisions, or specific changes, if a codeline-name or version-name will get the job done.

When merging changes into a workspace, again you typically want to merge from a codeline (or a baseline) rather than from a specific change or a specific set of file-revisions.
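
As a rough sketch of that idea (not the API of CVS or any other tool), the toy model below lets a workspace record only the name of the container it is based on, and resolves that name to file revisions when it is populated. The class names `Repository` and `Workspace`, the file names, the revision numbers, and the label `REL_1_0` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# A configuration maps a file path to a revision id.
Configuration = Dict[str, str]


@dataclass
class Repository:
    # Named containers: baselines are fixed labels, codelines have a moving tip.
    baselines: Dict[str, Configuration] = field(default_factory=dict)
    codelines: Dict[str, Configuration] = field(default_factory=dict)

    def resolve(self, container: str) -> Configuration:
        """Look up the configuration behind a container name (returns a copy)."""
        if container in self.baselines:
            return dict(self.baselines[container])
        if container in self.codelines:
            return dict(self.codelines[container])
        raise KeyError(f"unknown container: {container}")

    def commit(self, codeline: str, path: str, revision: str) -> None:
        """Advance the tip of a codeline with a new file revision."""
        self.codelines.setdefault(codeline, {})[path] = revision


@dataclass
class Workspace:
    based_on: str  # a container name, never an enumerated list of revisions

    def populate(self, repo: Repository) -> Configuration:
        return repo.resolve(self.based_on)


repo = Repository()
repo.commit("main", "load_orders.sql", "1.4")
repo.commit("main", "load_items.sql", "1.2")
repo.baselines["REL_1_0"] = repo.resolve("main")  # label the current tip

repo.commit("main", "load_orders.sql", "1.5")  # the codeline moves on

latest = Workspace(based_on="main").populate(repo)
released = Workspace(based_on="REL_1_0").populate(repo)
print(latest["load_orders.sql"], released["load_orders.sql"])
```

Neither workspace had to know which file revisions it wanted; the one based on the codeline follows the tip, and the one based on the baseline stays at what was labeled.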

For identification insulation, the idea is that the "unique identifier" should be stuff that won't change. If a branch is dedicated for a particular release or iteration, then naming it after that release or iteration is fine.

If I then create a task-branch for a feature named "ABC" and I name the task-branch "rel1_iter3_abc", then what happens if that feature is deferred to iteration-4 or even release-2?

Do I leave the branch-name "as is" and live with the inconsistency? Or do I rename the branch and hope no other person, tool, query or system was using that information or (worse yet) copied it into some other tool or system expecting it to be a 'foreign key' for my change?

If I don't put the release or iteration in the name of the task-branch, I don't have to face that dilemma! Those are pieces of context information about the "intended target". Making them part of the branch-name violates the version-control equivalent of the Law-of-Demeter.

It also violates the "Lean" principle of "deferring commitment" (I'm "committing" the association between that content and that context in the database, and if it changes, I have a dilemma no matter what).

Granted, for usability, it might be really nice to see that information in the branch name itself. But at what cost/impact if the context changes?

How terrible is it if I add a level of indirection and perhaps make the release+iteration instead be separate attributes/properties that are associated with the branch (either in the version-control tool, or the change-tracking tool) instead of being part of its name?
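
Here is a minimal sketch of that level of indirection, assuming a made-up `TaskBranch` record rather than any real tool's data model: the name stays fixed, and release/iteration live in attributes that can change.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class TaskBranch:
    name: str  # stable identifier; other tools may store it as a foreign key
    attributes: Dict[str, str] = field(default_factory=dict)  # changeable context

    def retarget(self, release: str, iteration: str) -> None:
        """Move the task to a different release/iteration without renaming it."""
        self.attributes["release"] = release
        self.attributes["iteration"] = iteration


branch = TaskBranch("abc", {"release": "1", "iteration": "3"})
tracker_reference = branch.name  # e.g. copied into a change-tracking tool

# Feature "ABC" is deferred to iteration 4: only the attributes change.
branch.retarget(release="1", iteration="4")
print(branch.name, branch.attributes)
```

Anything that copied the branch name before the deferral still points at the right branch afterwards, which is the whole point of keeping the context out of the name.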

That example used a branch-name. There are similar cases for things like document names/identifiers, and names of "subparts" that assume they know what "containing-part" they belong to (how do we know it won't be refactored to another location/component someday in the near future, or perhaps be part of multiple components or products instead of just one?).

Hope that helps! (and thanks for the feedback!); 10:40 AM
Anonymous said...: Thanks, Brad

I'm looking for ways to improve the structure of our code, and the organisation of our version control repository.

After a year, it's become apparent where some of the dependencies are: the aspects of our current setup that are somewhat awkward to handle. The (apparent) inertia of the version control repository and the development environment means that we don't actually reorganise much. We're too busy :-)

I believe that we could organise our version control much more as we might "wish we had done", and as we do so I think the structure of the code also needs to change somewhat.

At the moment, I'm trying to map some of these concepts onto our environment, as we learn more about what we've built (!).

Thanks again for the conversation.
Regards,
Bob; 3:31 PM
