Brad Appleton's ACME Blog

Tuesday, November 29, 2005

John Vlissides

I just learned from Martin Fowler's Bliki that John Vlissides passed away on Nov 24, 2005 after a long-term battle with cancer.

John was probably best known as one of the "Gang of Four" who authored the book Design Patterns, which was the seminal work on the subject of patterns if not on all of O-O software design, and one of the best selling computer-science books of all time. A wiki-page created in John’s memory is available for all to read, and to contribute to for those who remember him or have been influenced by him. I'll be posting the following memory there in a couple of days...

My first encounter with John was in 1995 on the "patterns" and "patterns-discussion" mailing lists. I was just a lurker on those lists at the time, and didn't feel "weighty" or "worthy" enough to post anything to them.

Then after having lunch (Pizza actually) with Robert Martin ("Uncle Bob") who encouraged me to do so, I ventured a posting to the patterns-list and described the Pizza Inversion pattern. I was actually quite nervous about it - me being a complete unknown and "daring" to post something that poked a little fun at patterns. John and Richard Gabriel were among the first to respond, and the response was very positive. I felt I had been officially "warmly welcomed" into the software patterns community.

A couple years later I attended the PLoP'97 conference and got to meet John in person for the first time at one of the lunches. Like many others, I was in awe of how unpretentious and humble he was. Again he made me feel very welcome amidst himself and others at the table of "rock star status" in the patterns community: he apparently recognized my name and included me in the running conversation, mentioning that when he first read my Pizza Inversion pattern, he "thought it was briliant!"

Later, at PLoP'98 and PLoP'99, John encouraged me to get together with Steve Berczuk and write a book on Software CM Patterns for the Addison-Wesley Software Patterns Series of books, for which he was the series editor. And during 1999 I actually became editor for the Patterns++ section of the C++ Report, including John's "Pattern Hatching" column and Jim Coplien's "Column Without a Name."

It was both an exciting and humbling experience for me to serve as editor for the contributions of two people so famous and revered in the patterns and object-oriented design communities. They both mentored and taught me so much (as did Bob Martin) during the "hey day" of patterns and OOD.

During the years between 1998 and 2002, John personally shared with me a great deal of insight and sage advice about writing, authoring and editing, as well as lending loads of encouragement and support. I truly feel like I have lost one of my mentors in the software engineering community. John's humor, insight, humility and clarity will be sorely missed.

Thursday, November 24, 2005

Pragmatic Book Reviews

HAPPY THANKSGIVING EVERYONE! (Even if you're not in the US :-)

As I mentioned in my previous blog-entry, I'll be attempting to post reviews of several books in the next month or two, mostly from the Pragmatic Programmers and from Addison-Wesley Professional. The ones I currently have are the following:

My Job Went to India (And All I Got Was This Lousy Book): 52 Ways to Save Your Job
by Chad Fowler

Agile Web Development with Rails: a Pragmatic Guide
by Dave Thomas and David Heinemeier Hansson with Leon Breedt, Mike Clark, Thomas Fuchs, and Andreas Schwarz

Ship It! A Practical Guide to Successful Software Projects
by Jared Richardson and Will Gwaltney

Behind Closed Doors: Secrets of Great Management
by Johanna Rothman and Esther Derby

Data Crunching: Solve Everyday Problems using Java, Python, and more
by Greg Wilson

The Build Master: Microsoft's Software Configuration Management Best Practices
by Vincent Maraia

Agile Estimating and Planning
by Mike Cohn

Saturday, November 19, 2005

Book Review: JUnit Recipes

I have a whole bunch of reviewer-copies of books that I've been intending to review for several months. So I'll be doing a number of book reviews throughout the remainder of this year, particularly titles from The Pragmatic Programmers and from Addison-Wesley Professional (who were nice enough to give me copies of the books).

Today however I'll be posting about a review fo a book from a different publisher. I did a review of the book JUnit Recipes for StickyMinds a few months ago. My summary of my review was:

JUnit Recipes should probably be mandatory reading for anyone using Java, J2EE and JUnit in the real-world. This comprehensive and imminently pragmatic guide not only conveys a great deal of highly practical wisdom but also clearly demonstrates and explains the code to accomplish and apply the techniques it describes.

The full review is featured this month on the StickyMinds front page and is available from their website at http://www.stickyminds.com/s.asp?F=S767_BOOK_4

Saturday, November 12, 2005

Commodity, Integrity, Simplicity

In a previous blog-entry on the subject of perishable -vs- durable value, I wrote about how business value is depreciable and therefore the business value of a feature is a perishable commodity. I then went on to describe what I thought were more durable forms of value: Integrity and Simplicity.

I defined Integrity as a triplet of related properties: {Correctness, Consistency, Completeness}. Integrity is a property of a deliverable item such as a feature, a configuration or a configuration item. So a feature or item has "integrity" if it is correct, consistent and complete.

I also defined Simplicity as a triplet of related properties: {Clarity, Cohesiveness (Coherency), Conciseness}. So a feature, item, or logical entity is "simple" if it is clear, cohesive and concise.

I then asked the question:

What about "form, fit and function"? Are "form" and "fit" also components of perishable value?

What I've been thinking since then is that the perishable form of value is the extrinsic value that it is given by the customer. From the end-consumers perspective, what they perceive as the form, fit, and function of the deliverable is what makes it valuable or not. We might call type of value "Commodity" or "Marketability". [Note: There are several things I both like and dislike about both those possible names, so please comment if you have a preference for one over the other (or for something else) and let me know why.]

I suggested this in a posting to the continuousintegration YahooGroup entitled "Commodity, Integrity, Simplicity (was Re: Extreme Frequency and Extreme Integrity)". Some relevant excerpts from the discussion:

Commodity is customer-desired Form, Fit and Function. ... Commodity has to do with what requirements are most valued by the customer at a given time. I think maybe those requirements are in terms of "Form, Fit, and Function". Which requirements those are and how much they are valued is most definitely time-sensitive. When I add "commodity"-based value to a codebase, I am adding time-sensitive perishable value that can depreciate or greatly fluctuate over time.
...
[from Ron Jeffries]:
A thing, to me, has integrity and simplicity but is a commodity.
I thought about this. And I completely agree - that probably is the main thing that makes the word "commodity" stand-out apart from the other two like "one of those things that just doesnt belong" with them.

Then I think about it some more, and I think, maybe the thing that makes it seem so "wrong" when listed with the other two is perhaps what is so "right" about it after all. Maybe it's a good think to think that a feature (or "story") is a commodity.

Maybe that's what it is first and foremost (a commodity) that we should always keep in mind, and where the most direct value to the customer is perceived. And maybe those other two things (integrity and simplicity) are the "secret sauce" that make all the difference in how we do it:
Maybe the integrity is the "first derivative" that gives us velocity AND continuity at a sustainable pace.

And maybe when we throw in simplicity, that is the second derivative of value, and it maybe harder for the customer to see directly, but when we do it right, that gives us more than just continuity+sustainability, it also gives us the acceleration to adaptiveness and responsiveness and "agility" to overcome that cost-of-change curve.
...
[follow-up from Ron Jeffries]:
However, a bit further insight (or what I use in place of insight) for why it troubles me. A "commodity" is a kind of product with value, but it is a fungible one. A commodity is a product usually sold in bulk at a price per item or per carload. One potato is like every other potato. A story/feature, in an important sense, isn't like every other story/feature.

Thanks Ron for all the thoughtful feedback. You are spot-on of course. And that notion of a commodity as a bulk shipment or mass purchase of units definitely "kills" the notion of value I'm trying to get at.

I'm still at a loss for a word/term that I like better. Marketability perhaps? It's more syllables than I'd like, although there is a precedent set for it in the book Software By Numbers in its use of an "Incremental Funding Method" (IFM) with "Minimal, Marketable Features" (MMFs).

So to my readers that have read this far ... what is your take on all of this talk about commodity/marketability and "perishable value"? Are commodity, integrity, and simplicity each just different perspectives of form, fit, and function, where:

"commodity/marketability" would be the customer view
"integrity" would be the view of requirements analysts/engineers, V&V/QA, and CM
"simplicity" would be the view of the developers and architects

What do form, fit and function mean for software anyway?

Is it container, context and content?
Is it interface, integration and implementation?

And how should that all trace back to our discussion about value and whether that value is extrinsic or intrinsic, and whether it is perishable, durable, or latent/emergent?

I admit I dont have a lot of coherent thoughts here, just a lot of incoherent ramblings and inconsistent questions. Let me know how you think this should all make sense (or if it shouldn't).

Saturday, November 05, 2005

Agile Lifecycle Collapses the V-model upon itself

Many are familiar with the waterfall lifecycle model for software development. Fewer are familiar with the 'V' lifecycle model for software development.

These two lifecycles models are very similar. The main difference is that the 'V' model makes a deliberate attempt to "engage" stakeholders located on the back-end of the 'V' during the corresponding front-end phase:

During Requirements/Analysis, system testers are engaged to not only review the requirements (which they will have to test against), but also to begin developing the tests.

During architectural and high-level design, integrators and integration testers are engaged to review the design and the interface control specs, as well as to begin developing plans and test-cases for integration and integration-testing

at this point, hopefully you get the main idea ... at a given phase where deleverables are produced, the folks who are responsible for validating conformance to those specs are engaged to review the result and to begin development of their V&V plans and artifacts

When used in conjunction with Test-Driven Development (TDD), and especially with a lean focus on minimizing intermediate artifacts, the agile lifecycle in a very real sense makes the two sides of the 'V' converge to create almost a single line (instead of two lines forming a 'V'):

TDD attempts to use tests as the requirements themselves to the greatest extent possible

emphasis on lean, readable/maintainable code oftenlead to a literate programming style (e.g., JavaDocs) and/or a verbose naming convention style such that detailed design and source code are one and the same.

focus on simplicity and eliminating redundancy increases this trend via principles and practices such as those mentioned in the Principle of Locality of Reference Documentation and Single-Source Information)

Use of iterative development with short iterations makes the 'V' (re)start and then converge over and over again throughout the development of a release.

The result: using cross-lifecycle collaboration in combination with tests as requirements and self-documenting code as detailed design and writing tests before the code makes the ends of the 'V' model converge together so that each end practically collapses against the other in a thick, almost single line. Plus successive short iterations serve to increase the frequency of this trend.

The agile lifecycle tries to eliminate (or at least create a tessarect for) the distance between the symmetric points at each end of the V-model by making the stakeholders come together and collaborate on the same artifacts (rather than separate ones) while also working in many small vertical slices on a feature-by-feature (or story-by-story) basis. There are no separately opposing streams of workflow: just a single stream of work and workers that collaborate to deliver business value down this single stream as lean + agile as possible.

Saturday, October 29, 2005

Codelines as Code Portals

I've been thinking a bit about the evolution of branching capability in version control tools.

First we had no branching support

Then we had very primitive branching support at the physical level of individual files using funky looking numbers like 1.1.1.1 that were basically 4-level revision numbers

Then we had better branching support, but still file-based, and it allowed us to use some reasonably readable-looking symbolic names to identify a branch

Then we has support for branching at the project/product level across the entire configuration item

Nowadays the better tools (such as AccuRev, ClearCase/UCM, Subversion, OurayCM, SpectrumSCM, Neuma CM+, and others) have "streams"

Among the differences between streams and project-oriented branches were that project-oriented branches were still only the changes that took place on that branch; whereas streams gave me a dynamically evolving "current configuration" of the entire item (not just the changes); And in many cases "streams" are first-class entities which can have other attributes as well.

Streams are, in a sense, giving a view of a codeline that is similar to a web portal. They are a "code portal" that pulls the right sets of elements and their versions into the "view" of the stream and eases the burden of configuration specification and selection by providing us this nice "portal."

So what might be next in the evolution of branches and branching after this notion of "code portal"?

Will it be in the area of distribution across multiple sites and teams?

Will it be in the area of coordination, collaboration and workflow?

Will it be in the area of increasing scale? What would a "stream of streams" look like?

Maybe it will be all three! Maybe a stream of streams is a composite stream where the parent stream gave a virtual view across several (possibly remotely distributed) streams and repositories, but via a dynamic reference (rather than a copy), so that the current configuration was a view of the combined currenty configuration of each consitituent stream? (somewhat reminiscent of how composite baselines work in ClearCase/UCM)?

What do you think will be the next steps in the evolution of branching beyond "streams" and what do you think are the trends that will fuel the move in that direction?

Saturday, October 22, 2005

Bugs versus Enhancements

On the SCRUM Develoment Yahoo Group, Stephen Bobick initiated a discussion about Bugs versus Enhancements:

Here's something I've run into agile and non-agile projects alike: the blurring of distinction between bugs and enhancement requests. To me a bug is erroneous operation of the software based on the customer's requirements. That's fine when both sides agree to what the requirements are. Sometimes a bug can also be caused by a misunderstanding of the requirements by the team, however, and yes I'll still call this a bug. Often, however, customers will dub "missing" functionality (which was never discussed initially) or "nice-to-have" features, shortcuts and so on as "bugs"....

When I have tried to make the distinction between bugs and enhancements clearer to the PO or customer, sometimes through a SM, the customer thinks we are nit-picking, or trying to "play the blame game", rather than properly categorize and identify their feedback. One approach is to keep trying to educate and convince them anyways (on a case by case basis, if necessary). Another approach is just to let them call anything they want a "bug". Of course this can screw up your metrics (incidence of bugs) - something we are interested in at my current job (i.e. reducing the rate of new bugs and fixing bugs in the backlog).

Any words from the wise out in the trenches on how to best approach this? Obviously, with unit testing and other XP practices there is a claim that bug rates will be low. But if anything can be declared a bug, it becomes more difficult to make management and the customer believe the claims you make about your software development process and practices. And when this happens, the
typical response is to revert to "old ways" (heavy-handed, waterfall-type approaches with formal QA).

-- Stephen

I've actually had a lot of personal experience in this for the past several years. Here are some of the things I have learned...

1. DONT ASSUME ALL DEFECTS ARE BUGS!

The term "bug" and the term "defect" don't always mean the same thing:

Bug tends to refer to something "wrong" in the code (either due to nonconformance with design or requirements).

Defect often means something that is "wrong" in any work-product (including the requirements).

Hence, many consider ALL of incorrect, inconsistent, incomplete, or unclear requirements to be "defects": if they believe a requirement is"missing" or incorrectly interpreted, it's still a "bug" in their eyes.

Ive also seen some folks define "bug" as: anything that requires changing ONLY the code to make it work "as expected". If it requires a change to docs, the consider it a "change request" (and the issue ofwhether or not it is still a "defect" isnt really addressed)

Also, many folk's metrics (particularly waterfall-ish metrics for phase containment and/or screening, but I think also orthogonal-defect classification -- ODC) explicitly identify "missing requirements" as a kind of defect

2. DO YOU TREAT BUGS DIFFERENTLY FROM ENHANCEMENTS?

If so, then be prepared to battle over the differences. Very often, the difference between them is just a matter of opinion, and the resolution will almost always boil down to a matter of which process (the bugfix process or the enhancement process) is most strongly desired for the particular issue, or else will become an SLA/contractual dispute. Then you can bid farewell to the validity of your defect metrics.

If your development process/practice is to treat "bugs" differently than "enhancements" (particularly if there is some contractual agreement/SLA on how soon/fast "bugs" are to be fixed and whether or not enhancements cost more $$$ but bugfixes are "free"), then definitions of what a bug/defect is will matter only to the extent outlined in the contract/SLA, and it will be in the customer's interest to regard any unmet expectation as a "bug".

If, on the other hand, you treat all customer reported "bugs" and "enhancements" sufficiently similar, then you will find many of the previous battles you used to have over what is a "bug" and what isn't will go away, and wont be as big of an issue. And you can instead focus on getting appropriate prioritization and scheduling of all such issues using the same methods.

If the customer learns that the way to get the thing they want when they want it is a matter of prioritization by them, and if the "cost" for enhancements versus bugfixes is the same or else isn't an issue, then they will learn that in order to get what they want, they don't have to claim its a bug, they just need to tell you how important it is to them with respect to everything else they have to prioritize for you.

3. IT'S ALL ABOUT SETTING AND MANAGING EXPECTATIONS!

None of the above (or any other) dickering over definitions is what really matters. What really matters is managing and meeting expectations. Sometimes business/organizational conditions mandate some contractual definition of defects versus enhancements and how each must be treated and their associated costs. If your project is under such conditions, then you may need to clearly define "bug" and "enhancement" and the expectations for each, as well as any agreed upon areas of "lattitude"

Other times, we don't have to have such formal contractual definitions. And in such cases, maybe you can treat enhancements and defects/bugs the same way (as noted earlier above).

Lastly, and most important of all, never forget that ...

4. EVERYONE JUST WANTS TO FEEL HEARD, UNDERSTOOD, AND VALUED!

If you can truly listen empathically and non-defensively (which isn't always easy), connecting with their needs at an emotional as well as intellectual level, and demonstrate that it is important to you, then EVERYONE becomes a whole lot easier to work with and that makes everything a whole lot easier to do.

Then it's no longer about what's a bug or what's an enhancement; and not even a matter of treating bugs all that differently from enhancements ... it simply becomes a matter of hearing, heeding and attending to their needs in a win-win fashion.

I'm sure there are lots of other lessons learned. Those were the ones that stuck with me the most. I've become pretty good at the first two, and have become competent at the third. I still need a LOT of work on that fourth one!!!

Sunday, October 16, 2005

TDD/BDD + TBD + IDE = EBT 4 Free?

I've been thinking a bit more about inter-relationships between Test-Driven Development (TDD), Task-Based Development (TBD), a spiffy interactive development environment (IDE) such as Eclipse, and the trouble with traceability ...

One thing that occurs to me that might actually make traceability be easier for agile methods is that some agile methods work in extremely fine-grained functional increments. I'm talking about more than just iterations or features. I mean individually testable behaviors/requirements:

Behavior-Driven Development (BDD)

This means, with TDD/BDD, a single engineering task takes a single requirement through the entire lifecycle: specification (writing the test for the behavior), implementation (coding the behavior), verification (passing the test for the behavior), and design.

That doesnt happen with waterfall or V-model development lifecycles. With the waterfall and V models, I do much of the requirements up front. By the time I do design for a particular requirement it might be months later and many tasks and engineers later. Ditto for when the code for the requirement actually gets written.

So traceability for a single requirement thru to specs, design, code, and test seems much harder to establish and maintain if those things are all splintered and fragmented across many disjointed tasks and engineers over many weeks or months.

But if the same engineering task focused on taking just that one single requirement thru its full lifecycle, and if I am doing task-based development in my version control tool, then ...

The change-set that I commit to the repository at the end of my change-task represents all of that work across the entire lifecycle of the realization of just that one requirement, then the ID of that one task or requirement can be associated with the change-set as a result of the commit operation/event taking place.And voila! Ive automatically taken care of much of the traceability burden for that requirement!

If I had a spiffy IDE that gave me a more seamless development environment integration and event/message passing with my change/task tracking tool, and my version-control tool, and the interface I use to edit code, models, requirements, etc., then it would seem to me that:

The IDE could easily know what kind of artifact Im working on (requirement, design, code, test

Operations in the IDE and the version-control tool would be able broadcast "events" that know my current context (my task, my artifact type, my operation) and could automatically create a "traceability link" in the appropriate place.

I realize things like CASE tools and protocols like Sun's ToolTalk and HP's SoftBench tried to do this over a decade ago, but we didnt have agile methods quite so formalized then and werent necessarily working in a TDD/TBD fashion. I think this is what Event-Based Traceability (EBT) is trying to help achieve.

If I had (and/or created) the appropriate Eclipse plug-ins, and were able to develop all my artifacts using just one repository, then if I used TDD/BDD with TBD in this IDE, I might just be able to get EBT for free! (Or at least come pretty darn close)

Wouldn't I?

Tuesday, October 11, 2005

XP as an overreaction?

Response to Damon Poole's blog-entry asking "Is XP an overreaction?" ...

I believe Extreme Programming (XP) and other Agile Methods are indeed a strong counter-reaction to some prevailing management and industry trends from arround 1985-1995. [Note I said counter-reaction rather than over-reaction]

I think the issue ultimately revolves around empowerment and control. During 1985-1995 two very significant things became very trendy and management and organizations bought into their ideas: The SEI Software Capability Maturity Model (CMM), and Computer-Aided Software Engineering.

During this same time, programming and design methods were all caught up in the hype of object-oriented programming+design, and iterative+incremental development.

Many a large organization (and small ones too) tried to latch-on to one or more of these things as a "silver bullet." Many misinterpreted and misimplemented CMM and CASE as a magic formula for creating successful software with plug-and-play replaceable developers/engineers:

Lots of process documentation was created
Lots of procedures and CASE tools were deployed with lots of contraints regarding what they may and may not do
and "compliance/conformance" to documented process was audited against.

Many felt that the importance of "the people factor" had been dismissed, and that creativity and innovation were stifled by such things. And many felt disempowered from being able to do their best work and do the things that they new were required to be successful, because "big process" and "big tools" were getting and their way and being forced upon them.

(Some would liken this to the classic debate between Hamiltonian and Jeffersonian philosophies of "big government" and highly regulated versus "that governemnt is best which governs least")

I think this is the "crucible" in which Agile methods like XP were forged. They wanted to free themselves from the ball and chain of restrictive processes and disabling tools.

So of course, what do we do when the pendulum swings so far out of balance in a particular direction that it really makes us say "we're mad as h-ll and we're not gonna take it any more!" ??

Answer: we do what we always do, we react with so much countering force that instead of putting the pendulum back in the middle where it belongs and is "balanced", we kick it as far as we can in the other direction. And we keep kicking as hard as we can until we feel "empowered" and "in control of our own destiny" again.

Then we don't look back and see when the pendulum (or the industry) starts self-correcting about every 10 years or so and starts to swing back and bite us again :)

XP started around 1995 and this years marks its 10th anniversary. Agile methods have been officially embraced by industry buzz somewhere around 2002, and for the last couple years, there has been some work on how to balance agility with large organizations and sophisticated technology.

Among the main things coming out of it that are generating a goodly dose of much deserved attention are:

testing and integration/buidling are getting emphasized much earlier in the lifecycle, and by development (not just testers and builders)

the "people factor" and teaming and communication is getting "equal time"

iterative development is being heavily emphasized up the management hierarchy - and not just iterative but HIGHLY iterative (e.g., weeks instead of months)

These are all good things!

There are some folks out there who never forgot them to begin with. They never treated CASE or CMM as a silver bullet and took a balanced approach from the start. And they didnt treat "agile" as yet another silver bullet either. And they have been quietly delivering successful systems without a lot of noise - and we didnt hear much about them because they weren't being noisy.

Unfortunately some other things may seem like they are "babies" being "thrown out with the bathwater". Agile puts so much emphasis on the development team and the project - that practitioners of some of the methods seem to do so at the expense of other important disciplines and roles across the organization (including, and perhaps even especially, SCM)

Saturday, October 08, 2005

When to Commit: Perishable Value and Durable Value

We had a recent (and interesting) discussion on the scm-patterns YahooGroup about the notion of "value" and Frank Schophuizen got me thinking about what is the "value" associated with a configuration or a codeline: how does value increase or decrease when a configuration is "promoted" or when/if the codeline is branched/split?

Agile methods often talk about business value. They work on features in order of the most business-value. They eschew activities and artifacts that don't directly contribute to delivery business value. etc...

David Anderson, in several of his articles and blogs at agilemanagement.net, notes that the value of a feature (or other "piece" of functionality) is not dependent upon the cost to produce it, but upon what a customer is willing to pay for it. Therefore the value of a feature is perishable and depreciates over time:

The longer it takes to receive delivery of a feature, the less a customer may begin to value it.

If it doesn't get shipped in the appropriate market-window of opportunity, the value may be significantly lost.

If the lead-time to market for the feature is too long, then competitive advantage may be lost and your competitor may be able to offer it to them sooner than you can, resulting in possible price competition, loss of sale or business

So business value is depreciable; and the value of a feature is a perishable commodity.

Might there be certain aspects to business value that are not perishable? Might there be certain aspects that are of durable value? Is it only the functionality associated with the feature that is of perishable value? Might the associated "quality" be of more durable value?

I've seen the argument arise in Agile/XP forums about whether or not one should "commit" one's changes every time the code passes the tests, or if one should wait until after refactoring, or even until more functionality is implemented (to make it "worth" the time/effort to update/rebase, reconcile merge conflicts and then commit).

Granted, I can always use the Private Versions pattern to checkin my changes at any time (certainly any time they are correct+consistent) without also committing them to the codeline for the rest of the team to see and use. So, assuming that the issue is not merely having it secured in the repository (private versions), when is it appropriate to commit my changes to the codeline for the rest of the team to (re)use?

If refactoring is a "behavior preserving transformation" of the structure of the code, and if it improves the design and makes it "simpler", then is "good design" or "simplicity" something that adds durable value to the implementation of a running, tested feature? Kent Beck's initial criteria for "simple code" (and how to know when you are done refactoring your latest change) was described in an XPMagazine article by Ron Jeffries as the following, in order of importance:

it passes all the tests (correctly :-)

it contains no redundancy (the DRY principle: Don't Repeat Yourself)

it expresses every thought we intended it to convey about the program (i.e. reveals all our intent, and intends all that it reveals)

it minimizes the size and number of classes and methods

If I squint a little when I read thru the above, it almost looks like it's saying the same thing that writing-instructors and editor's say about good writing! It should be: correct, consistent, complete, clear and concise!

I have often heard "correct, consistent and complete" used as a definition of product integrity. So maybe integrity is an aspect of durable value! And I have sometimes heard simplicity defined as "clear and concise" or "clear, concise and coherent/cohesive" (where "concise" would be interpreted as having very ruthlessly rooted out all unnecessary/extraneous or repeated verbage and thoughts). So maybe simplicity is another aspect of durable value.

And maybe integrity is not enough, and simplicity is needed too! That could possibly explain why it might make more sense to wait until after a small change has been refactored (simplified) before committing it instead of waiting only until it is correct+consistent+complete.

Perhaps the question "when should I commit my changes?" might be answered by saying "whenever I can assure that I am adding more value than I might otherwise be subtracting by introducing a change into a 'stable' configuration/codeline!"

If my functionality isn't even working, then it's subtracting a lot of value, even if did get it into the customer's hands sooner. It causes problems (and costs) for my organization and team to fix it, has less value to the customer if it doesn't work, and can damage the trust I've built (or am attempting to build) in my relationship with that customer

if my functionality is working, but the code isn't sufficiently simple, the resulting lack of clarity, presence of redundancy or unnecessary dependency can make it a lot harder (and more costly) for my teammates to add their changes on top of mine

if I wait too long, and/or don't decompose my features into small enough working, testable increments of change, then the business value of the functionality I am waiting to commit is depreciating!

Now I just have to figure out some easy and objective means of figuring out the "amount" of value I have added or subtracted :-)

So are "integrity" (correct + consistent + complete) and "simplicity" (clear + concise + coherent/cohesive) components of durable value? Is functionality the only form of perishable value?

What about "form, fit and function"? Are "form" and "fit" also components of perishable value? Am I onto something or just spinning around in circles?

Saturday, October 01, 2005

The Single Configuration Principle

I'm wondering if I tried to bite off too much at once with my Baseline Immutability Principle. Maybe there needed to be another step before that on the way from the Baseline Identification Principle ...

The baseline identification principle said that I need to be able to identify what I have to be able to reproduce. The baseline immutability principle said that the definition of a baselined configuration needs to be timesafe: once baselined, the identified set of elements and versions associated with that baseline must always be the same set of elements and versions, no matter how that baseline evolves in the form of subsequent changes and their resulting configurations.

Maybe somewhere in between the baseline identification principle and the baseline immutability principle should be the single configuration principle:

Single Configuration Principle

Of course the baseline itself might be an assembly of other baselined configurations, but then it still corresponds to the one configuration that represents that assembly of configurations. So the same baseline "identification" shouldnt be trying to represent multiple configurations; just one configuration.

What does that mean? It means don't try to make a tag or label serve "double-duty" for more than one configuration. This could have several ramifications:

maybe it implies that "floating" or "dynamic" configurations, that are merely "references", should have a separate identifier, even when the reference the same configuration as what was just labeled. So maybe the identifiers like "LATEST or "LAST_GOOD_BUILD" should be different from the one that identifies the current latest build-label (e.g., "PROD-BUILD-x.y.z-a.b")

maybe it might also imply that when we use a single label to capture a combination of component versions, that we really want true "
composite" labeling support. This would literally let me define "PROD_V1.2" as "Component-One_V1.1" and "Component-Two_V1.0" without requiring the label to explicitly tag all the same elements already tagged by the component labels

maybe it implies something similar for the notion of a "composite current configuration" or even a "composite codeline" where a product-wide "virtual" codeline could be defined in terms of multiple component codelines

What do you think? Is the single configuration principle a "keeper" or not?

Saturday, September 24, 2005

Quantum Agility and Organizational Gravity

Just a random synapse firing in my brain ... I remember back in my high school days being enthralled with physics and the latest grand-unified theories (GUTS), and how gravity was always the "odd ball" in trying to unify the four fundamental forces of nature into a single, simple, consistent and coherent theory:

Quantum mechanics could unify all but gravity. It was great, and incredibly accurate at explaining all the rich and myriad interactions of things at the molecular, atomic and subatomic levels.

But throw in celestial bodies and large distances, and the thing called "gravity" rears its ugly head and makes things complicated. In theory it's nowhere near as strong as the other forces, and yet any time you had to scale up to things large enough and far enough away to need a telescope instead of a microscope, it made everything fall apart.

Sometimes I think Agile "theory" and large projects and organizations are the same dichotomy.

The "Agile" stuff seems great in small teams and projects that can be highly collaborative and iterative over short (collocated) distances with small "lightweight" teams and processes.

But throw it into a large project or organization, and "gravity" sets in, adding weight and mass and friction to processes and communication, and yet necessarily so, in order to scale to a larger living system of systems of systems.

So we are left with quantum agility and organizational gravity and trying to reconcile the two. What's an Agile SCMer to do about all that?

Saturday, September 17, 2005

Can I have just one repository please?

One of the things I spend a lot of time dealing with is integration between application lifecycle management tools and their corresponding process areas: requirement management, configuration management, test management, document management, content management, change management, defect management, etc.

So I deal with process+tool architecture integration for a user community of several thousand, and the requirements, version control, change-tracking, and test management tools almost always each have their own separate repositories. Occasionally the change-tracking and version-control are integrated, but the other two are still separate.

And then if there is a design modeling tool, it too often tries to be a "world unto itself" by being not merely a modeling environment but attempting to store each model or set of models as a "version archive" with its own checkin/checkout, which makes it that much more of a pain in the you-know-what to get it versioned and labeled/baselined together with the code, particularly if code-generation is involved and needs to be part of the build process.

And what really gets to me is that, other than the version control tool, the other tools for requirements and test management, and typically change management usually have little or no capability to deal with branching (much less merging). So heaven forbid one has to support multiple concurrent versions of more than just the code if you use one of the other tools.

The amount of additional effort for tool customization and configuration and synchronization and administration to make these other tools be able to deal with what is such a basic fundamental version-control capability is enormous (not to mention issues of architectural platforms and application server farms for a large user base). So much so that it makes me wonder sometimes if the benefit gained by using all these separate tools is worth the extra integration effort. What if I simply managed them all as text files in the version control system?

At least then I get my easy branching and merging back. Plus I can give them structure with XML (and then some), and could easily use something like Eclipse to create a nice convenient GUI for manipulating their contents in a palatable fashion.

And all the data and metadata would be in the same database (or at least one single "virtual" database). No more having to sync with logically related but physically disparate data in foreign repositories and dealing with platform integration issues, just one big (possibly virtual) repository for all my requirements, designs, code, tests, even change-requests, without all the performance overhead and data redundancy and synchronization issues.

It could all be plain structured text with XML and Eclipse letting each artifact-type retain its own "personality" without having to be a separate tool in order to do it.

Why can't someone make that tool? What is so blasted difficult about it!!!

I think the reason we dont have it is because we are use to disconnected development as "the rule" rather than as the exception. Companies that shell out the big bucks for all of those different tools usually have separate departments of people for each of requirements (systems/requirements engineers), design (software architects), source-code ("programmers"), test (testers), and change-management.

It's a waterfall-based way of organizing large projects and it seems to be the norm. So we make separate tools for each "discipline" to help each stay separate and disconnected, and those of us doing EA/EAI or full lifecycle management of software products have to deal with all the mess of wires and plumbing of integration and platforms and workflow.

Oh how I wish I could take a combination of tools:

a good, stream-based version control tool like Accu-Rev
a fully Java/XML extensible issue-tracker like Jira (or combination of the two, like SpectrumSCM)
a framework like Eclipse
and a collaborative knowledge/content management system like Confluence

and roll them together into a single integrated system with a single integrated repository.

Notice I didn't mention any specific tools for requirements-management or test-management. Not that I dont like any of the ones available, I do, but I think it's time for a change in how we do those things with such tools:

they basically allow storing structured data, often in a hierarchical fashion with traceability linkages, and a way of viewing and manipulating the objects as a structured collection, while being able to attach all sorts of metadata, event-triggers, and queries/reportsI think a great wiki + CMS like Confluence and Jira can do all that if integrated together; Just add another "skin" or two to give a view of requirements and tests both individually and as collections (both annotated and plain).

The same database/repository could give me both an individual and hierarchical collection-based views of my requirements, designs, code, tests and all their various "linkages." Plus linking things in the same database is a whole lot easier to automate, especially thru the same basic IDE framework like Eclipse.

the requirements "skin" gives me a structured view of the requirements, and collaborative editing of individual requirements and structured collections of them;
ditto for the test "skin";
and almost "ditto" for the "change-management" skin (but with admittedly more workflow involved)
the design tool gives me a logical (e.g., UML-based) view of the architecture
the IDE gives me a file/code/build-based view of my architecture
And once MS-Office comes out with the standard XML-based versions, then maybe it will be pretty trivial to do for documents too (and to integrate XML-based Word/Office/PPT "documents" with structured requirements and tests in a database)

Oh why oh why can't I have a tool like that! Pretty please can I have it?

Sunday, September 11, 2005

Change-Packaging Principles

In my previous blog-entry I tried translating Uncle Bob's OOD Principles of package cohesion into the version-control domain by substituting "release" with "promote" or "commit", and "reuse" with "test".

I think that didn't work too well. I still think "promotion" corresponds to "release", but "reuse" corresponds to something else. I'm going to try translating "reuse" to "integration". If I integrate (e.g., merge) someone else's changes into my workspace, I am quite literally reusing their work. If I commit my own change to the codeline, then I am submitting my work for reuse by the rest of the team that is using the codeline (particularly the "tip" of the codeline) as the basis of their subsequent changes.

So if I equate "release" with "promotion", and "reuse" with "integration" I think the result is the following:

The Promotion-Integration Equivalency Principle -- The granule of integration is the granule of promotion. (So it's not just the change content, but also the context – the entire configuration – that we end up committing to the codeline/workstream.)

The Change Closure Principle -- Elements that must be changed together are promoted together (implies task-level commit).

The Change Promotion Principle -- Elements that must be integrated together are promoted together (implies doing workspace update prior to task-level commit)

These "work" for me much better than the previous translation attempt. Note that the "change closure principle" didn't change much from before - it was just clarified a bit to indicate the dependency between elements.

This also makes me think I've stumbled onto the proper meaning for translating the Interface Segregation Principle (ISP): ISP states "Make fine-grained interfaces that are client-specific." If "integration" is reuse, then each atom/granule of change is an interface or "container" of the smallest possible unit of reuse.

The smallest possible unit of logical change that I can "commit" that doesn't break the build/codeline would be a very specific, individually testable, piece of behavior. Granted, sometimes it might not be run-time behavior ... it could be build-time behavior, or behavior exhibited at some other binding time.

This would yield the following translation of the ISP into the version-control domain:

The Change Separation Principle

I'm not thrilled about the name (please feel free to suggest a better one -- for example ... how about "segmentation" instead of "separation"?) but I think the above translation "works" quite well, and also speaks to "right-sizing" the amount of change that is committed to the codeline as an individual "transaction" of change. The way it's worded seems like it's talking exclusively about "code", but I think it really applies to more than just code, so long as we arent constraining ourselves to execution-time "behavior."

Let me know what you think about these 4 additions to the family of SCM principles!

Saturday, September 03, 2005

The Blog ate my homework!

[NOTE: due to some comment-spam on my last entry (which I have since deleted), I haved turned on "word verification" for comments.]

When I was composing my previous blog entry, something very frustrating happened: The blog ate my homework!

I frequently save intermediate drafts of my blog entries before I publish them. I had been working on my most recent draft for a couple hours. I'd been finalizing many of the sentences and paragraphs, making sure the flowed, checking the word usage, spellchecking, adding and verifying links, and then ... when I was finally ready to publish, I hit the publish button on the blogger compose window, and it asked me to login again. When I did, my final edits were GONE! I'd just lost two hours worth of work.

My first thought was ARRRRRRRGGGGGHHHHH! My next thought was "no freakin' WAY did that just happen to ME!" Then much profanity ensued (at least in my own silent frustration) and I tried my darndest to look thru any and all temp files and saved files on my system and on blogger.com, all for naught. I had indeed fallen victim to one of the most basic things that CM is supposed to help me prevent. How infuriating! How frustrating! How embarrassing. I was most upset not about the lost text, but about the lost time!

I figure there must be a lesson in there somewhere to pass along. Ostensibly, the most obvious lesson would be to use the Private Versions pattern as outline in my book. The thing is ... I had been doing just that! It was in the very act of saving my in-progress draft (before publishing it) that my changes were lost.

What I could (and possibly should) have done instead was not use blogger's composer to compose my drafts. I could have done it locally instead, on my own machine (and my own spellchecker). And perhaps I will do that a bit more from now on. Still, it's pretty convenient to compose it with blogger because"

I get rapid feedback as to what it will actually look like, and ...
I can access it from any machine (not just the one I use late at night)

I later realized why it happened. I was trying to do two things at once:

In one window I was composing my blog entry.
In another browser window I was visit webpages I wanted to hyperlink to from my entry and verifying the link.

Okay - so there's nothing wrong with that. I mean I was doing two things at the same time, but I wasn't really trying to multi-task because I was still trying to work on my blog-entry.

The real culprit wasnt that I had two windows open at the same time, it was that one of the webpages I wanted to hyperlink to was also a blogger.com hosted blog-entry. And since I was positing a question in my entry that referred to this one, I also wanted to create a comment in the referred-to entry that asked the question and referenced back to my own blog.

Posting that comment caused me to have enter my blogger id and passwrod, and that essentially forced a new login - which made it look like my current login (where I was composing my entry) either ended, or had something unusual going on that warranted blogger wanting me to re-authenticate myself. And when it did, I lost my changes! OUCH!

Actually, I hadnt even posted the comment - I had only previewed it (saving it as a draft). Anyway - I was too upset (and it was too late at night) to try and recreate my change sthen. So I waited another day before doing it. I have to say Im not as happy with the result. I had really painstakingly satisfied myself with my wording and phrasing before I lost my changes. I wasn't as thorough the second time around because I wanted to be done with it!

So what was my big mistake? I was using private versions, and I wasn't trying to multi-task. I was in some sense trying to simultaneously perform "commits" of two different things at the same time, but they were to different "sections" of the same repository, so that really shouldn't have been such a terrible thing.

My big mistake wasn't so much a lack of good CM as it was a lack of good "agility": I let too much time lapse in between saving my drafts. I wasn't working in small enough batch-sizes (increments/iterations)!

Granted, I don't want to interrupt my flow of thought mid-sentence or mid-paragraph to do a commit. But certainly every time I was about to visit and verify another hyperlink in my other browser window, I should have at least saved my current draft before doing so. And I probably should have made sure I did so at least every 15-20 minutes. (You can be darn sure that's what I did this time around :-)

This sort of relates to how frequently someone should commit their changes in a version control system. Some of the SCM principles that I havent described yet will relate to this. Uncle Bob's Principles of Object-Oriented Design have a subset that are about "package cohesion" and granularity

REP: The Release Reuse Equivalency Principle -- The granule of reuse is the granule of release.

CCP: The Common Closure Principle -- Classes that change together are packaged together.

CRP: The Common Reuse Principle -- Classes that are used together are packaged together.

In the context of version control, these "packages of classes" would probably correspond to "packages of changes" that make up a single logical "change transaction" or "commit" operation. If that is a valid analogy, then I need to decide what "reuse" and "release" mean in this context:

I think "release" would mean to "promote" or "commit" my changes so they are visible to others using the same codeline.

I think "reuse" would mean ... hmmn that's a tough one! It could be many things. I think that if a change is to be reusable, it must be testable/tested. Other things come to mind too, but that's the first one that sticks.

So let's see what happens if I equate "release" with "commit", equate "reuse" with "test" and see if the result is coherent and valid. This would give me the following:

The Commit/Test Equivalency Principle -- The granule of test is the granule of commit.

The Change Closure Principle -- Files that change together are committed together.

The Test Closure Principle -- Files that are tested together are committed together (including the tests).

Comments? Thoughts? What do these mean to you? Does it mean anything more than using a task-level commit rather than individual file checkin? Should these always "hold true" in your experience? When shouldnt they? (and why?)

Oh - and feel free to suggest better names if you dont like the ones I used. I'm not going to supply abbreviations for these because, or name any blog-entries after them just yet because I'm not yet certain if they are even valid.

Saturday, August 27, 2005

The Baseline Immutability Principle

Adding more baselining principles to my Principles of SCM. So far I've described the Baseline Reproducibility Principle (BLREP) and the Baseline Identification Principle (BLIDP). Now I want to describe the Baseline Immutability Principle (BLIMP).

The Baseline Immutability Principle (BLIMP) is really just a rephrasing of The Open-Closed Principle (OCP) from The Principles of Object-Oriented Design as applied to baselines (baselined configurations). The OCP (first stated by Bertrand Meyer in the classic book Object-Oriented Software Construction) states that "Software entities (classes, modules, functions, etc.) should be open for extension but closed for modification."

The OCP means I should have a way of being able to extend a thing without changing the thing itself. Instead I should be able to create some new "thing" of my own that reuses the existing thing and somehow combines that with just my additions, resulting in an operational "extension" of the original thing. The OCP is the basis for letting me reuse rather than reinvent when I need to create something that is "like" an existing thing but which still requires some additional stuff.

If applied for baselined configurations (a.k.a. baselines) the OCP would read "A baseline should be open for extension but closed for modification." That means if I want to create a "new" configuration that extends the previously baselined configuration, I should do so by creating a new configuration that is the baseline PLUS my changes. The result is not a "changed" baseline - the baselined configuration stays the same as it was before my change. We don't actually ever "change" a baseline. What we do is request/apply one or more changes against/to a baseline; and the result is a new configuration, possibly resulting in a new baseline.

According to the Baseline Immutability Principle ...

always

For example, suppose I have release 1.2 of my product and I apply a label/tag of "REL-1.2" to everything that was used to make 1.2 (not just the code, but ALL of it: requirements, designs, tests, make/ANT files, etc.). Suppose that version 1.2.3.4 of element FUBAR was one of the file revisions that was labeled. Now suppose that during the following month, "REL-1.2" is moved/reapplied to version 1.2.3.5 of FUBAR.

In this example, I have just violated the baseline immutability principle. If a customer needs me to be able to reproduce Release 1.2, and if Release 1.2 contained v1.2.3.4 of FUBAR, then if I use "REL-1.2" to recreate the state of the codebase for Release 1.2, I just got the wrong result, because the version of FUBAR in Release 1.2 is different from the version that is tagged with the "REL-1.2" label.

Notice that I am not saying that we can't make changes against a baseline. We most certainly can. And the result is a new configuration!

When we make a change to a baseline, we aren't really changing the configuration that was baselined and then trying to use the same name for the result. Our changed result is a new configuration that took the current baseline and added our changes to it. And if we chose to name this new configuration, we give it a new name (one that is different from the name of any previously baselined configuration).So a baseline name and the configuration it references are married: once the configuration is baselined, that name must forever after be faithfully monogamous to that configuration for better or for worse, for richer or for poorer, in sickness and in health for as long as they both shall live.

Always and forever? What about a divorce, or an anullment?

Note that the term "a baseline" should not be confused with the term "the baseline":

The term "the baseline" really means the latest/current baseline. It is a reference!

This means that "the baseline" is really just shorthand for "the latest baseline." And when we "change the baseline", we are changing the designation of which baseline is considered "latest": we are changing the reference named "latest baseline" to point to a newer configuration.

So The Baseline Immutability Principle states that once a configuration is baselined, the identification of the baseline name with its corresponding configuration is immutable: The set of elements (e.g., files and revisions) referenced by the baseline name must always be the same set. And that set must always correspond to the set that was used to produce the version of the product that was baselined.

I think this may be equivalent to Damon Poole's "TimeSafe Property" -- see Damon's paper The TimeSafe Property: a Formal Statement of Immutability for CM.

Let me know what you think!

Sunday, August 21, 2005

The Baseline Identification Principle

Yesterday (actually just a few hours ago) was my 40th birthday. I had a really nice celebration with my wife and kids at a picnic in the park. I really dont feel like I'm 40. My body thinks I am 50 - at least that how it seems to be acting. My mind still isnt used the the fact that I'm now more than just a little bit older than all those leading men and leading ladies on TV and movies. (Guess I can no longer identify them as part of my historical "baseline" :-)

Back again to describing The Principles of SCM! Last time I described The Baseline Reproducibility Principle. Now we'll take the next logical step and talk about the need to identify baselines.

If the ability to reproduce a baseline is fundamental to SCM, then it stands to reason that the ability to identify a baseline that I must be able to reproduce should also be pretty fundamental. If I have to be able to "show it", then I must first be able to "know it." If I can't uniquely identify a baseline, then it's pretty hard to reproduce it if I'm not sure what I'm trying to reproduce.

So the baseline reproducibility principle gives rise to The Baseline Identification Principle: a baseline must be identified by a unique name that can be used to derive all the constituent elements of the baseline. In other words, we have to have a name, and a way of associating that name with all the object (e.g. files) and their revisions that participate in the baseline.

How do we identify a baseline? By defining a name (or a naming system) to use, and using that name to reference the set of elements that were used to build/create the baselined version of the product.

A "label" or "tag" is one common way that a version control tool allows us to identify the sources of a baseline. This lets us associate a name with a specific set of repository elements and their corresponding revisions. Or it lets us associate a name with an existing configuration or event from which the set of elements and versions may be derived.

Sometimes tagging all the "essential" files and revisions in the repository is sufficient. Sometimes I need more information. I can always take any files or information that werent previously in the version control repository, and put them in the repository:

I can put additional information in a text file and checkin the file
I can export a database or binary object into some appropriate format (e.g., XML, or other formatted text)
some tools let me directly checkin a binary object (e.g, compilers, libraries, images, models) to the repository

If you currently have to label or tag more than just source-code and manually created text-files, then tell me about the other kinds of things you checkin and tag, and what special things you do to ensure they are identified as part of a baseline.

Monday, August 15, 2005

The Baseline Reproducibility Principle

Getting back to my earlier topic of The Principles of SCM, I think probably the first and most fundamental principle would be the requirement to be able to reproduce any baselined/released version of the software.

I'll call this The Baseline Reproducibility Principle: a baseline must be reproducible. We must be able to reproduce the "configuration" and content of all the elements that are necessary to reproduce a "released" version of the product.

By "released" I really mean "baselined" - it doesn't have to be a release to a customer. It could be a hand-off to any other stakeholder outside of development (like a test group, or a CM group, or QA, etc.). There is some basic vocabulary we need, like the terms "baseline" and "configuration." Damon Poole has started a vocabulary/glossary for SCM. Damon defines configuration but doesn't yet define a baseline.

A baseline is really shorthand for a "baselined configuration." And a baselined configuration is basically "a configuration with an attitude!" The fact that it's been "baselined" makes it special, and more important than other configurations that aren't baselined. We baseline a configuration when we need to promote/release it to another team/organization. By "baselining" it, we are saying it has achieved some consensually agreed upon level of "blessedness" regarding what we said it would contain and do, and what it actually contains and does.

Why do we need to be able to reproduce a baselined version of the product we produce and deliver? For several reasons:

Sometimes we want to be able to reproduce a reported problem. It helps to be able to reproduce the exact versions of the source code that made up version of the product that the customer is using.

In general, when we hand-off a version of the product to anyone that may report problems or request enhancements, it is useful to be able to reproduce the versions of the files that make-up that version of the system to verify or confirm their observations and expectations.

When a "fix" is needed, customers are not always ready/willing to deploy our latest version (containing new funcitonality plus the fix). Even if they are, sometimes our business is not - it wants to "give" them the fix, but make more money on any new functionality. So we must provide a "patch" to their existing version

When a baseline is a version of the product, it includes the specs and the executable software. Configuration auditing requires us to know the differences between the current product+specs versus their actual+planned functionality at the time that the product was released to them.

Those are just a few reasons. There are many more I'm sure.

What does it mean to reproduce a baseline? At the very least it means being able to reproduce the exact set of files/objects and their corresponding versions that were used to produce/generate the delivered version of the product. (That includes the specs that may be audited against, as well as the code).

Sometimes being able to reproduce the source files for the code+docs (and build scripts) is enough. Often we need to be able to do more than that. Sometimes it may be necessary to reproduce one or more of the following as well:

The version of the compilers/linkers or other tools used to create that version of the product

The version of any third-party libraries, code/interfaces/headers used to build the product

Any other "significant" aspect of the computing environment/network utilized during the creation of the delivered version of the product

It can be too easy to go to more effort than necessary to ensure reproducibility of more than is absolutely essential. What is essential to reproduce may depend upon many business and technical factors (including some possible contractual factors regarding deployment/upgrade, operational usage and support).

The ability to be able to reproduce a baseline is so basic to SCM; I can't believe it hasn't been a "named" principle before. I know others have certainly written about it as a principle, I'm just not recalling if any of them gave the principle a name.

I think names are powerful things. Part of what makes software patterns so powerful is that they give a name to an important and useful solution to a recurring problem in a particular context. The pattern name becomes an element of the vocabulary of subsequent discussion on the subject. So I can use the terms "Private Workspace" or "Task Branch" in an SCM-related conversation instead of having to describe what they are over and over again.

This is why I'd like to develop a set of named principles for SCM. I think lots of folks have documented SCM principles, but didn't give them names. And they might "stick" better if we gave them names. If you know of any examples of SCM principles that are already well known and have a name, please let me know! (Please include a reference or citation if possible)

Tuesday, August 09, 2005

SCM Design Smells

First, the news of the passing of Peter Jennings (ABC World News Tonight Anchor) became known to me early this morning. I'm very saddened by this. The world has lost a great mind and communicator, and Ive lost the trusted advisor I used to let into my home every evening since I was a teen to tell me about what was going on elsewhere in the world.

Getting back to my earlier topic of The Principles of SCM, I'd like to step through each of the Object-Oriented Design Principles mentioned in Robert Martin's book Agile Software Development: Principles, Patterns, and Practices and step through each principle, looking for how it applies to SCM.

Before I do that however, I'd first like to look at what "Uncle Bob" (as he is more affectionately called) refers to as design smells. These are as follows:

Fragility - Changes cause the system to break easily and require other changes.
Immobility - Difficult to disentangle entities that can be reused in other systems.
Viscosity - Doing things wrong/sloppy causes more friction and slows you down the next time you navigate through that code.
Needless Complexity - The System contains infrastructure that has no direct benefit (overengineering and/or "gold plating").
Needless Repetition - Repeated structures that should have a single abstraction (Redundancy).
Opacity - Code is hard to understand.

How might each of these apply to your SCM process and procedures? How might they apply to your branching & merging structure? Or to the organization of your source-tree?

Here's one possible "translation" of how these might apply to Software CM process and procedures:

Intolerant/Fragility - Changes cause the project, team, or organization to fall apart easily and require change to other parts of the project, team, or organization.
Rigidity/Immobility - Difficult to identify or disentangle practices and policies that can be reused by other projects, teams, or organizations.
Friction/Viscosity - Doing things wrong/sloppy causes more friction and slows you down the next time you navigate through that workflow or go on to the next one.
Wasteful/Needless Complexity - The Process contains "waste" in the form of extra steps, processing, handoff, waiting, or intermediate artifacts that do not "add value" for the customer, project, or organization.
Manual Tedium/Repetition - Repeated or tedious steps and activities should have a single mechanism to automate them.
Opacity - The project or process is hard to understand. (Lack of Transparency)

How would you translate design smells into process smells for Software CM?

Monday, August 01, 2005

The Customer Inversion Principle of Process Design

Lookingback on last week's blog-entry suggesting we should CM to an Interface, not an implementation, I wonder if that was really an instance of the stated design principle, or of something else ...

Often times, the process and procedures that development must follow in order to comply with CM needs were developed by the people who receive the outputs of development but who dont necessarily perform the development activities themselves. These process-area experts are the process designers and the developers are the end-users of their process.

The conclusion of CM to an interface, not an implementation was to essentially invert or "flip" the relationship between who is the process "producer" and who is its "customer." The Principles of Lean Thinking suggest that processes should be designed by the practitioners who are most intimately familiar with performing the activities and their reasons for being a necessary step in the process: Those who receive the outputs of that process are its customers, and they get to specify the requirements, but not the implementation.

If true, this could perhaps be a statement of a different principle that we might call The Customer-Inversion Principle of Process Design:

Upstream Development procedures should not depend on downstream CM procedures, both should depend upon the abstract interfaces represented by development's exit criteria and CM's entry criteria.
Procedures should not be designed for their practitioners by the upstream customer of their results, Practitioners should design their own procedures to meet the requirements of their upstream customers.

Not only does this "inversion" of the process producer/customer relationship conform with the design principle to separate interface from implementation, and with principles of lean thinking, it also aligns with Agile principles of putting the customer in charge and preferring customer collaboration over contract negotiation, when "negotiating" the right balance between the process requirements and the procedural implementation.

It also somewhat "inverts" (or at least turns on its head) what might be the more stereotypical perception by many agilists of CM as "controlling opponents" into one of "collaborating customers", and hopefully helps lend some a new perspective about how to successfully pair with other organizational stakeholders making additional demands upon the use of more formal standards, documentation, and tools upon an agile project. (See my earlier blog-entry on Building Organizational Trust by Trusting the Organization.)

Surely there must be some exceptions. What about when development has absolutely no CM knowledge or appreciation whatsoever? Should a knowledgeable CM person define development's CM activities for them?

To me this sounds similar to the situation of an expert needing to play the role of coach for a more junior engineer. A more directive or coaching style of leadership may be required, where CM doesnt necessarily give all the answers, but still plays a strong collaborative role in specifying not only their requirements, but in educating development about existing SCM patterns and their applicability and context, and helping them choose the most appropriate patterns and tradeoffs to design the CM procedures that development should use.

If development is not yet able to understand and/or is willing to be initially "told" what to do - then "telling"/directing (instead of coaching) might be the first step. But ultimately I believe practitioners of a process need to feel a sense of ownership over their own process and procedures if they are to continue being effective. By helping them understand the process requirements, and the applicable patterns and principles, we help them become better developers, and better advocates of effective CM. At least that's been my experience.

What do you think? Does it sound nice in theory but not work "in practice" in your own experience?

Friday, July 22, 2005

CM to an Interface, not to an Implementation

It's been a very busy and heated week or so for "Agile CM" discussion on CMCrossroads.com. First, we published the debate-provoking article about some fictitious "newly discovered" National Treasures of Agile Development. Then there was, and still still is, the ensuing (and much heated) discussion on the Balancing CM and Agility discussion thread.

The gist of the article was that it purported to have discovered some historical artifact whose wording resembled the July 1776 declaration of independence, but was referring to agile developers complaining about, and wanting freedom from, the so called "tyrannies" of many recurring heavyweight Software CM process implementations. Of course since it was posted in a forum for CM professionals, the article sparked some very strong reactions from some people (and you can read them in the "Balancing CM and Agility" thread if you like).

One thing regarding the article and discussion thread that struck me is very much related to one of the SCM Principles from GoF Design Patterns that I blogged about last week:

Program to an Interface, Not an Implementation (a.k.a "Separate Interface from Implementation")

Some regarded the "National Treasures" article as an attack against CM and CM principles in general instead of taking it for what it was, a grievance with some commonly recurring characteristics of some heavyweight SCM process implementations.

That got me to thinking - maybe these could be recast as instances of violating an important SCM principle! Previously, I had blogged about "separating interface from implementation" as it applies to designing build scripts/recipes as well as CM processes and workflow. But I didnt really talk about how it applies to the roles involved in defining the CM activities and in executing the CM activities.

In the case of CM and development, I think it is frequently the case in many heavyweight SCM implementation, that the developers quite likely did not get a lot of say in defining the process and procedures they must follow to meet the needs of CM. It was instead defined primarily by configuration/build managers to meet needs needs like reproducibility, repeatability, traceability, and auditability without enough considerations to development's needs for:

high quality+velocity acheived though rapid and frequent feedback that comes from very frequent integration (including being able to do their own merges+builds on their own "Active Development Line")

unit, integration, and regression testing with the latest development state of the codeline

effect fixes/changes quickly and efficiently without long wait/approval periods for authorization of changes they are likely to know need to be made ASAP (like most bugfixes and maintainability/refactoring changes found after the corresponding task was committed to the development codeline)

This leads me to think an important manifestation of separating interface from implementation is to:

Configuration Manage to an Interface, Not to an Implementation!

What I mean by this is that when defining CM policies, processes and procedures, the impacted stakeholders (particularly the ones who will execute the procedures) not only need to be involved, but as the "consuming end-user" of the process and procedures, they need to play a key role in defining the process implementation.

CM (and other downstream stakeholders) should define their needs/requirements in terms of interfaces: policies, invariants, constraints, and entry criteria for what is given to configuration/build managers.

Development should be empowered to attempt to define the implementation: the processes/procedures and conventions for how they will meet or conform to the "interface" needed by CM.

This seems very "Agile" to me in that it fosters collaboration between CM and development. It lets CM be the "customer" of agile developers by giving them their CM process requirements and allowing them to drive and fine-tune the implementation's "acceptance tests" to ensure it meets their needs. It also allows development, the folks who are executing the development activities, to collaboratively define their own processes (in accordance with lean development/manufacturing principles).

Does that mean every different development group that delivers to CM should be allowed to come up with their own process? What about having consistent and repeatable processes? If the requirements/interfaces are defined and repeatedly met, why and when do we need to care? Each development group is defining and repeatably executing its process to meet consistent CM entry-criteria. And doesnt the CMM/CMMI allow for project-specific tailoring of standard organizational processes?

Still, there should be some mechanism for a larger-grained collaboration, such as a so called "community of practice" for all the development projects to share their successful development practices so that that knowledge can be reused throughout the organization. And every time CM collaborates with yet another project, they can point to an existing body of development processes being successfully used that the project they are engaging might want to consider adopting/adapting.

I do think that if they (development + CM) choose to use a different process or significantly adapt an existing one, it would be good to know the differences and WHY they were necessary. Seems to me that matches the description of what an SCM Pattern is: something that defines a solution to an SCM problem in a context and captures the forces, their resolution, and the rationale behind it.

Then when CM and development work together the next time for the next project, they simply look at the set of SCM patterns they have been growing (into a pattern language perhaps) and decide which combination of patterns to apply, and how to apply them, to balance and resolve the needs of both CM and development, collaboratively!

Thursday, July 14, 2005

SCM Principles from GoF Design Patterns

I was reading Leading Edge Java online and saw an article on Design Principles from Design Patterns. The article is part III of a conversation with Erich Gamma, one of the famous Gang of Four who authored the now legendary book Design Patterns: Elements of Reusable Object-Oriented Software.

In this third installment, Gamma discusses two design principles highlighted in the GoF book:

program to an interface, not an implementation (a.k.a "separate interface from implementation")

favor object composition over class inheritance

Another recurring theme echoed throughout the book is:

encapsulate the thing that varies (separate the things that change from the things that stay the same during a certain moment/interval)

I think these three GoF Design Principles have pretty clear translations into the SCM domain:

Separate Interface from Implementation - this applies not only to code, but to Make/ANT scripts when dealing with multi-platform issues like building for different operating environments, or windowing systems, or component libraries from different vendors for the same functionality. We are often able to use variables to represent the target platform and corresponding set of build options/libraries: the rules for building the targets operate at the abstract level, independent of the particular platform. This can also apply to defining the process itself, trying to ensure the high-level workflow roles, states & actions are mostly independent of a particular vendor tool

Prefer Composition & Configuration over Branching & Merging - This is one of my favorites, because it talks about one of my pet peeves: inapproproriate use of branching to solve a problem that is better solved by using either compile-time, install-time, or run-time configuration options to "switch on" the desired combinations of variant behavior. Why deal with the pain of branching and then having to merge a subsequent fix/enhancement to multiple variant-branches if you can code it once in the same directory structure with patterns like Strategy, Decorator, Wrapper-Facade, or other product-line practices

Isolate Variation - this one is almost too obvious. Private Workspaces and Private Branches isolate variation/change, as does just about any codeline. And we do the same things in the build-scripts too

Can you think of any other valid interpretations of the above design rules in terms of how they translate into the SCM domain?

Tuesday, July 05, 2005

Whitgift's Principles of Effective SCM

In an effort to try and deduce/derive The Principles of SCM, I'm going through the SCM literature to see what other published SCM experts have identified as SCM Principles.

Among the best "oldies but goodies" are the books by David Whitgift (Methods and Tools for SCM, Wiley 1991) and Wayne Babich (SCM: Coordination for Team Productivity, Addison-Wesley 1986). These books are 15-20 years old, but most of what they say still seems relevant for software development and well-aligned with agile and iterative development methodologies.

Today I'll focus on David Whitgift's writings. In the very first chapter ("Introduction") he says that CM is more concerned with the relationships between items than with the contents of the items themselves:

This is because CM needs to understand the decomposition + dependencies among and between all the things that are necessary for a change to result in a correct + consistent version of the system that is usable to the rest of the team (or its stakeholders).

He also states that most of the problems that arise from poor CM and which CM is meant to resolve are issues of coordination/communication and control. And he gives a shopping list of common problems encountered in each of five different areas (change-management, version-management, build-management, repository-management, item identification/relationship management).

At the end of section 1.2 in the first chapter, Whitgift writes:

In the course of the book, five principles become apparent which are the keys to effective CM. They are that CM should be:
Proactive. CM should be viewed not so much as a solution to the problems listed in the previous section but as a collection of procedures which ensure the problems do not arise. All too often CM procedures are instituted in response to problems rather than to forestall them. CM must be carefully planned.

Flexible. CM controls must be sensitive to the context in which they operate. Within a single project an element of code which is under development should not be subject to restrictive change control; once it has been tested and approved, change control needs to be formalized. Different projects may have very different CM requirements.

Automated. All aspects of CM can benefit from the use of software tools; for some aspects, CM tools are all but essential. Much of this book is concerned with describing how CM tools can help. Beware, however, that no CM tool is a panacea for all CM problems.

Integrated. CM should not be an administrative overhead with which engineers periodically have to contend. CM should be the linchpin which integrates everything an engineer does; it provides much more than a repository where completed items are deposited. Only if an engineer attempts to subvert CM controls should he or she be conscious of the restrictions which CM imposes.

Visible. Many of the issues raised in the previous section stem from ignorance of the content of items, the relationships between items, and the way items change. CM requires that any activity which affects items should be conducted according to clearly defined procedures which leave a visible and auditable record of the activity.

Whitgift's "proactivity principle" might seem a bit "Big Planning/Design Up Front" rather than responsive, adaptive, or emergent. And his "visibility principle" one may seem like heavyweight traceability and documentation. However, in light of being flexible, automated, and integrated, it might not be as bad as it sounds.

Each of the above five Principles of Effective CM seem (to me) to be potentially competing objectives ("forces") that need to be balanced when resolving a particular SCM problem. I wouldn't regard them as principles in the same sense as Robert Martin's "Principles of Object-Oriented Design."

Each of the subsequent chapters in Whitgift's text delves into various principles for the various sub-areas of CM. I write about those in subsequent blog-entries.