Sunday, March 27, 2005

Individual vs Collective Code Ownership: Stewardship to the Rescue

I heard another argument today claiming that collective code ownership often results in "no ownership" of the code, and that individual code ownership is better for managing attempts at concurrent changes.

Then I heard someone try to counter by saying individual ownership inhibits refactoring and goes against the team ethic of XP and other Agile methods.

The problem I have with both of these is that they are each extreme positions. I know from experience there is a successful middle ground, one that is sometimes referred to as code stewardship.

Individual Code Ownership, in its purest form, means exclusive access control: no one but the "owner" is allowed to edit the particular module or class. When code ownership is interpreted so rigidly as to exclude allowing others to make updates, I have definitely seen it lead to the following problems on a regularly recurring basis:
  • Causes lots of wait-time and productivity blocks
  • Gets used as a parallel-development block to avoid merging concurrent changes
  • Holds changes/code "too close to the vest" and leads to defensiveness and/or local optimization at the expense of the "whole" team/system
  • Increases the project's exposure to the "truck factor" by limiting the number of people with intimate knowledge of that module/class within the overall system
At the same time, I have seen cases where "Collective Code Ownership" degrades into "no ownership" when there is no collective sense of team ownership or accountability.

So what is the balance? I think it is Code Stewardship, where the code-steward is both guardian and mentor for the body of knowledge embodied by the module/class. The Code Steward's job is not to safeguard against concurrent-access, but to instead safeguard the integrity+consistency of that code (both conceptually and structurally) and to widely disseminate knowledge and expertise about it to others.

  • When a portion of a module or class needs to change, the task-owner needs the agreement of the code-steward.
  • The task-owner explains what they are trying to accomplish, the code-steward helps them arrive at the best way to accomplish it, and provides any insights about other activities that might be related to it.
  • The task-owner makes their changes, seeking advice/counsel from the code-steward if needed (possibly thru pairing or dialogue or other means)
  • Before committing changes to the codebase, the code-steward is involved in review or desk-checking (and possible coordination of concurrent activities)
This can be accomplished by pairing with the code-steward, or simply by seeking them out as an FDD programmer would a Chief programmer/architect. The code-steward is like the "editor-in-chief" of the module or class. They do not make all the changes, but their knowledge and expertise is still applied throughout. (A minimal sketch of one way to record steward assignments appears after the list of benefits below.) The benefits are:

  • Concurrent changes can still be made and wait-times avoided, while notifications and coordination still take place
  • Knowledge is still disseminated (rather than hoarded) and spread around the team
  • Collective ownership and its practices, such as refactoring, are still enabled
  • Pair programming can still be done, and pairing assignments can be based in part on who the "steward" is for a piece of code. (At some point a steward can even hand off the baton to another.)
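To make the steward protocol above a little more concrete, here is a minimal sketch (in Python, with purely hypothetical module and steward names) of one way a team might record steward assignments and look up whom to consult before touching a piece of code. A wiki page or a plain text file in the codebase would serve just as well; the point is only that the mapping is explicit and visible.

    # stewards.py -- a minimal, hypothetical registry of code-stewards.
    # The module paths and steward names below are made up for illustration.
    STEWARDS = {
        "billing/":   "alice",    # steward for the billing module
        "inventory/": "bob",      # steward for the inventory module
        "reports/":   "carol",    # steward for the reporting module
    }

    def steward_for(path):
        """Return the steward to consult before changing the given file."""
        for prefix, steward in STEWARDS.items():
            if path.startswith(prefix):
                return steward
        return None   # no designated steward; plain collective ownership applies

    # A task-owner about to change billing/invoice.py would first seek out
    # alice for a quick dialogue or pairing session before making the change.
    print(steward_for("billing/invoice.py"))    # -> alice
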
I guess the bottom-line for me is that collaborative ownership and authorship is still essential, and code ownership isn't supposed to be about controlling concurrent access (it is a very suboptimal concurrency-strategy, even though some merge-a-phobic shops will swear by it). If we take "ownership" to either extreme, the result is impractical and imbalanced.

Sunday, March 20, 2005

The Five Orders of Traceability

Traceability is supposed to help us track and link related pieces of knowledge as we progress thru the software development lifecycle. The software development lifecycle is devoted to creating knowledge that transforms fuzzy functional concepts into tangible working results. It is the process of transforming theoretical concepts into executable reality.

Phil Armour, in his magnificent book The Laws of Software Process, maintains that software is not a "product" in the usual production-oriented sense of the word, but is really a medium for capturing executable knowledge. He then uses this to argue that software development is therefore not a product-producing activity but rather a knowledge-creating and knowledge-acquiring activity.

Armour goes on to describe "The Five Orders of Ignorance" and how, if software development is a process of knowledge-acquisition and knowledge-creation, then it is also ultimately a process of "ignorance reduction" whereby we progressively reduce our ignorance of what the system is, what it needs to do, how it needs to do it, and how we need to do it and manage it.

Perhaps it should follow then, as a corollary to all of this, that the mission of traceability is to connect all the dots between these various pieces and "orders" of knowledge to help those of us auditing or reviewing them to learn the lessons of how and why that knowledge came to be created in its current form. To that end, I hereby propose The Five Orders of Traceability:
  • 0th Order Traceability – Existence: Tracking Knowledge Content
  • 1st Order Traceability – Structure: Tracking Knowledge Composition
  • 2nd Order Traceability – History: Tracking Knowledge Context
  • 3rd Order Traceability – Transformation: Tracking Knowledge Creation (causal connection in the knowledge creation/derivation process)
  • 4th Order Traceability – Meta-Traceability: Tracking Knowledge of the Five Orders of Traceability.
It's not entirely clear to me whether I have the 2nd and 3rd items mixed up in the above (perhaps structure should come after context). When deriving the ordering and identification of the above, I basically took my cue from Armour's 5 orders of ignorance, and made the assumption that it is the intent of a particular "order" of traceability to eliminate or address the corresponding "order" of ignorance! Hence 3rd order traceability should help resolve 3rd order ignorance, 2nd order traceability should help resolve 2nd order ignorance, and so on.

With that in mind, I'll elaborate further upon each of the five orders of traceability I've "coined" above ...

0th Order Traceability is merely the existence of knowledge content. There are no additional linkages or associations to navigate through that content. There is no explicit traceability - the content is simply there.

1st Order Traceability is an attempt to structurally organize the knowledge content and provide links that navigate the decomposition from one level of organization/detail to another. This would be like an outline structure with cross-referencing/indexing capabilities. A number of tools give us a means of organizing a particular portion of system knowledge (a minimal sketch of such purely structural links follows the list below):
  • Basic requirements document management tools provide a way of organizing and viewing the requirements, and even the design documentation, for a project
  • Modeling tools provide a way of organizing and viewing problem-domain and solution-domain abstractions
  • Many integrated development environments (IDEs) give a logical (e.g., object-based) view of the code and/or a physical (e.g., file-based) view of the file+folder structure of the codebase.
  • Many of these tools even provide a way to link from a section of a document to a model entity (e.g., a UML object or package) and/or a code-construct (e.g., a class, method, or module)
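As a minimal illustration of what 1st order traceability amounts to, the sketch below (with made-up requirement, model, and code names) shows nothing more than structural links: which requirement section maps to which model element and which code construct. There is no context or rationale yet, just navigable composition.

    # A hypothetical 1st-order trace: pure structure, no context or rationale.
    # (requirement section, model element,      code construct)
    first_order_links = [
        ("SRS 3.1.2",       "uml:OrderPackage", "order/Order.java"),
        ("SRS 3.1.3",       "uml:PaymentClass", "order/Payment.java"),
    ]

    # Navigation is simple lookup: which code realizes requirement SRS 3.1.2?
    code_for_req = [code for (req, model, code) in first_order_links
                    if req == "SRS 3.1.2"]
    print(code_for_req)    # -> ['order/Order.java']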

2nd Order Traceability goes beyond mere content and structure to provide contextual awareness. Not only are there links, but there is also contextual information (typically in the form of metadata) giving a clue as to the events that transpired to create it: who authored the content, when they did it, where they did it (physically or virtually).

This type of historical/log information assists in auditing and recovery, and is typically most-suited for automatic capture/recording by the application used to create/modify and store the information (perhaps using a mechanism like dynamic event-based traceability). The information might also include application integration data (for example, recording the identifier of a requirement or a change-request and associating it with the corresponding portion of the design, code, or tests when it is created or updated).
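
Here is a hedged sketch of what 2nd order (contextual) traceability might look like if captured automatically, say by a commit hook: the link itself is unchanged, but each one now carries metadata about who, when, and where, plus an associated change-request identifier. The function and field names are hypothetical.

    import datetime, getpass, socket

    def record_trace_event(artifact, change_request_id):
        """Hypothetical auto-capture of 2nd-order (contextual) metadata,
        as a tool or commit hook might record it."""
        return {
            "artifact":       artifact,              # what was touched
            "change_request": change_request_id,     # application integration data
            "who":            getpass.getuser(),     # who authored the change
            "when":           datetime.datetime.now().isoformat(),
            "where":          socket.gethostname(),  # physical/virtual locale
        }

    # Example: associating a code change with (hypothetical) change-request CR-1042
    print(record_trace_event("order/Payment.java", "CR-1042"))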

3rd Order Traceability is the "nirvana" of traceability for many. Not only do we have structure and context, but we have additional associations and attributes (e.g., metalinks between [meta]data) that capture the causal connection between related pieces of knowledge in the value chain. Some call this "rich traceability" and there are some high-end requirements management tools capable of doing this.

Still, tracing all the way through to design models and code remains very effort-intensive unless all knowledge artifacts are captured in the same repository (requirements, models, code, tests, project-activities, change-requests) where these advanced "rich tracing" capabilities exist. [MKS claims to have taken a step toward achieving this with its new RM tool, which tracks requirements in the same repository as the source-code version control system.]

With 3rd order traceability, we are effectively capturing important decisions, criteria, constraints, and rationale at the various points in the knowledge creation lifecycle where one form of knowledge (e.g., prose, model, code) is being either transformed into another form of knowledge, or is being elaborated to another level of decomposition within the same form of knowledge.
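
A sketch of what one such 3rd order "rich" trace link might capture at a transformation point: not just the endpoints and the context, but the decision, criteria, and rationale behind the transformation. The structure and field names are purely illustrative assumptions, not any particular tool's schema.

    # A hypothetical 3rd-order "rich" trace link: causal connection plus rationale.
    rich_link = {
        "from":      "SRS 3.1.3 (accept credit-card payments)",
        "to":        "order/Payment.java",
        "transform": "requirement -> design -> code",
        "decision":  "use an external payment gateway rather than build in-house",
        "criteria":  ["compliance", "time-to-market", "cost"],
        "rationale": "gateway vendor is already certified; avoids a lengthy audit",
        "context":   {"who": "alice", "when": "2005-03-20", "cr": "CR-1042"},
    }
    print(rich_link["rationale"])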

4th Order Traceability is meta-traceability, or tracking of knowledge about the five orders of traceability within or across systems. (Sorry - I couldn't come up with anything better that is analogous to Armour's 5th order of ignorance - a.k.a. meta-ignorance. If you have a different idea of what 4th order traceability should be, feel free to comment.)

What about Ontology or Epistemology? I don't honestly know. I would imagine there must be some way of tying together the above with terms and concepts from "knowledge management," such as transforming tacit knowledge to explicit knowledge, and maybe even relating XML schemas and ontologies back to all of this. I leave that as an undertaking for someone much more versed than I in those domains.

Tuesday, March 15, 2005

Traceability and TRUST-ability

Traceability is one of those words that evokes a strong "gag" reflex among many hardcore agilists. They are all in favor of tracing tests to features, which is extremely straightforward when one is doing test-driven development (TDD). When it comes to tracing requirements thru to design and code, images of manually maintained traceability matrices that are hopelessly effort-intensive and never quite up-to-date seem to spring to mind.

So what are the main goals that traceability supposedly serves? Based on sources like CMMI and SWEBOK, and several others, I think the goals of traceability are to assist or enable the following:
  1. change impact-analysis: assess the impact and risk of a proposed change to facilitate communication, coordination and estimation [show the "know-how" to "know-what" to change, "know-where" to change it, and "know-who" will change it] (see the sketch after this list)

  2. product conformance: assure that necessary and sufficient requirements were implemented, and ensure the implementation of each requirement was verified/validated [prove we "did the right things" and "did the things right"]

  3. process compliance: assure that the necessary procedural activities (e.g., reviews and tests) were executed for each feature/requirement/code-change and ensure they were executed satisfactorily [prove we "walk the walk" and not just "talk the talk"]

  4. project accountability: assure that each change of each artifact was authorized and ensure that they correspond to requested functionality/business-value [safeguard against "gold plating"]

  5. baseline reproducibility: assure that the elements necessary to reproduce each baseline have been captured and ensure that the baselines can be reproduced [so "all the king's horses and all the king's men" can put your fallen "humpty-dumpty" build back together again]

  6. organizational learning: assure that the elements necessary to rediscover the knowledge of the system have been captured and ensure that the rationale behind critical decisions can be reproduced -- e.g., for root-cause analysis, or to transfer system knowledge to a deployment/support team. ["know-why" you did what you did when you did it]
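To illustrate the first goal above (change impact-analysis), here is a minimal sketch of how trace links, once captured, can be walked to answer "if this requirement changes, what else is affected?" The link data is made up; the point is only that impact analysis reduces to a reachability query over the trace graph.

    # Hypothetical trace links: requirement -> design -> code -> test
    links = {
        "REQ-7":              ["DES-PaymentModel"],
        "DES-PaymentModel":   ["order/Payment.java"],
        "order/Payment.java": ["test_payment.py"],
    }

    def impacted_by(item, links):
        """Walk the trace graph to find everything downstream of a changed item."""
        impacted, to_visit = set(), [item]
        while to_visit:
            current = to_visit.pop()
            for downstream in links.get(current, []):
                if downstream not in impacted:
                    impacted.add(downstream)
                    to_visit.append(downstream)
        return impacted

    # If REQ-7 changes, what do we need to look at (know-what / know-where)?
    print(impacted_by("REQ-7", links))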

Many would argue that these goals all boil down to questions of trust and communication:
  • Do I trust the analysts/architects to do what I said and say what they did?

  • Do I trust the product or its producers to correctly realize the correct requirements?

  • Do I trust the process or the engineers following it?

  • Do I trust the project or the people managing it?

  • Do I trust the environment in which it is built and tested?

  • Do I trust the organization to be able to remember what they learned and learn from their mistakes?

Whether or not traceability efforts successfully achieve any of these goals is another matter entirely. Often, traceability efforts achieve at most one of these (conformance), and sometimes not even that. Many question whether the effort required actually adds more value than it subtracts (particularly when traceability efforts themselves create additional artifacts and activities that require tracing).

Traceability is often desired because it's presumed to somehow alleviate fears and give us "TRUSTability." I suspect that's really an illusion. Traceability is useful to the extent that it facilitates more effective communication & interaction between individuals, teams, and stakeholders. And traceability can be useful to the extent that it helps us successfully convey, codify and validate system knowledge.

So to the extent that traceability helps us more quickly identify the right people to interact with and the right information to initiate that interaction, it can be extremely useful. It is the people and the quality of their interactions that provide the desired "trustability." Beyond that, unless it is unintrusively automated, traceability quickly turns into "busy work" that adds a great deal of "friction" to delivering new functionality as "working software".

To that end, I have a few slides in a presentation I gave on Agile Configuration Management Environments (see slides 24-25) that talk about "Lean Traceability", and a big part of that is using encapsulation, modularity, and dependency management (a.k.a. refactoring) to track at the coarse-grained level rather than a fine-grained one. This alone can reduce the traceability burden by 1-2 orders of magnitude. A more detailed discussion is near the end of the paper The Agile Difference for SCM.
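
As a rough, back-of-the-envelope illustration of that claim (all the numbers below are hypothetical), tracing a couple hundred requirements to a handful of well-encapsulated modules yields a few hundred links, while tracing the same requirements to thousands of individual classes or files yields tens of thousands:

    # Hypothetical numbers, just to show the order-of-magnitude difference.
    requirements         = 200
    links_per_req_coarse = 2     # a requirement typically touches ~2 modules
    links_per_req_fine   = 50    # ... but ~50 individual classes/files

    coarse_links = requirements * links_per_req_coarse   # 400 links to maintain
    fine_links   = requirements * links_per_req_fine     # 10,000 links to maintain
    print(fine_links // coarse_links)   # -> 25, i.e., well over an order of magnitude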

The LoRD Principle is also mentioned (LoRD := Locality of Reference Documentation). You can read more about LoRD on the Wiki-Web, in Volume 3 of the Agile Times (pp. 37-40), and in Scott Ambler's essay Single Source Information. For automating traceability, there is some interesting work from the DePaul University Applied Requirements Engineering Lab and the SABRE Project (Software Architecture-Based Requirements Engineering) on the subject of dynamic event-based traceability.

Thursday, March 10, 2005

Building Trust with Transparency

I was catching up on some of my trade journals last weekend and I came across a coincidental confluence of several seemingly divergent streams of thought, all revolving around this recurring theme of building trust ...

First, the January 2005 issue of Software Development arrived, with the featured theme "RFID Everywhere: a primer for the new-age of self-tracking products" plastered on the cover. Then the February 2005 CACM had an article entitled "Trust in E-Commerce". Lastly, the February 2005 issue of Software Development had two fantastic articles: one by Kirk Knoernschild entitled "Benefits of the Build" and an interview profiling David Anderson about his new position, "Managing at Microsoft."

The CACM article on e-commerce trust was about building trust between the vendor and the consumer. But it got me thinking about building trust between the agile development "vendor" (e.g., an agile team in a large organization) and the quite probably non-agile "consumer" with whom they need to develop a trustworthy working relationship.

Then Kirk's article on the benefits of the build talked about one such benefit being the Regular Frequency Indicating Development status/health (which of course made me think of RFID). A frequent build usually happens at regular intervals that many refer to as the rhythm, pulse, heartbeat, or cadence of the team's development progress and health. When the results of the build are visibly reported in a public location/webpage, it's like a "blinking beacon" that the team and its stakeholders can monitor.

Such build status reports are part of the aspect of CM more formally known as configuration status accounting. Basically, status accounting lets us account for the status of any change-request, development-task, build, iteration, or release, at any time (sounds even more like RFID, doesn't it?).

Then I read the interview with David Anderson. I know David. I had occasion to meet him in July of last year, when he gave a presentation to the Chicago Agile Developers group (ChAD) about FDD and its history with Peter Coad and the "Color Modeling" patterns, and then a few days later presented a business case for Agile Management to the "Agile track" at the 2004 Motorola Engineering Symposium. [I was the "track chair" who arranged to have David speak both to ChAD and to Motorola, and I also had the chance to chat with him at length. I'm unendingly impressed by David and am in awe of his mindfulness, vast knowledge, keen insight, and systems-thinking abilities when it comes to the union of agility, lean, theory of constraints, software development, and management theory.]

Anyways, near the end of his interview with SDMagazine, David says the following little gem about how Transparency Enhances Trust when he was asked how a manager can introduce and encourage transparency in a culture that is opaque and compartmentalized:

Economists talk about measuring the level of trust in a society. It's been proven that societies where there is greater trust experience faster economic growth—"My word is my bond." The same is true in software. To have trust, you must not be afraid that someone is hiding something from you. Transparency enhances trust.

However, to have transparency, you must drive out fear. Many people believe that data hiding is required because figures can be misinterpreted. Hence, they hide information from people whom they believe will draw the wrong conclusions ... I prefer to educate people to draw the right conclusions ... Information sharing (or transparency) is an enabler to team power. Team software development is about collaboration ....

And there you have it: transparency engenders trust! When we do regular builds and other regular development activities and publish build-status reports and burndown-charts and cumulative flow diagrams in a visible fashion, we are giving agile stakeholders a form of RFID for each task, feature, build, iteration and release.

This type of frequent and regular feedback gives transparency of an agile development team's activities to groups like CM, V&V/QA, Systems Engineering, and Program Management. That transparency helps establish trust, which in turn enhances cooperation, and ultimately enables collaboration.

Wednesday, March 02, 2005

Building Trust with Trustworthy Builds

More on "trusting the organization" ... Roger Session’s “Enterprise Rings” bear a striking resemblance to the different kinds of builds and promotion levels I describe in “Agile Build Promotion: Navigating the Ocean of Promotion Notions” where each kind of build corresponds to a different level of visibility within the organization:
  • Private Build: Individual / Task
  • Integration Build: Team / Project
  • QA/CM Release Build: Organizational / Program
  • Customer Release Build: Everybody else (business customer, portfolio investor, etc.)
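Here is a minimal sketch (with hypothetical level names) of how those promotion levels might be modeled, with each build record promoted from one level of visibility to the next as it earns trust:

    # Hypothetical promotion levels, ordered from narrowest to widest visibility.
    PROMOTION_LEVELS = ["private", "integration", "qa_cm_release", "customer_release"]

    class Build:
        def __init__(self, label):
            self.label = label
            self.level = 0      # every build starts its life as a private build

        def promote(self):
            """Promote the build to the next level of organizational visibility."""
            if self.level < len(PROMOTION_LEVELS) - 1:
                self.level += 1
            return PROMOTION_LEVELS[self.level]

    build = Build("build-2005.03.02-1")
    print(build.promote())    # -> integration
    print(build.promote())    # -> qa_cm_release
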
As I wrote in my previous blog-entry on building organizational trust, each one of these levels of scope corresponds to a boundary between stakeholders that must be bridged with communication and trust.

One way for development to build trust with the CM organization is to engage CM up front and elicit "stories" about what "trust" means to them for handoffs from development to CM. What are their biggest complaints/concerns about "trusting" development? Do they fear development will:
  • Break the build if they do their own integration/merging and "commits"?
  • Hand off a build that is not reproducible and reliable?
  • Use/create unstable developmental builds that are inconsistent, incorrect, incomplete, or incoherent?
  • Neglect to create named stable baselevels at appropriate times?
  • Monopolize the build servers/resources and software licenses that CM needs to use?
If CM doesn't trust development about these things (and others), find out why, and partner together to create "shared stories" that will evolve into the acceptance criteria for development to hand off builds to CM. For example:
  • If developers use the SCM patterns of Private Build and Task-Level Commit, and pass all the automated tests, will that eliminate the fear of development breaking the build? If not, what else will? Do they need a way to be sure that these practices are actually followed?
  • If there is an automated build that development systematically uses to reliably ensure that a sandbox environment and build-time options/flags are "sufficiently comparable" to what CM uses, will that eliminate fears of inconsistency and unreliability of the build and its reproducibility?
  • If there are business requirements to support multiple releases or "customer special" variations, many agilists/extremists may not want to hear this, but they must learn to respect it as both a business requirement and a CM concern. Then they must learn the requirements to be met so that a solution can be proposed (which might be different from the solution CM had been assuming should be used)
What other things must CM do with the build that they are afraid they won't be able to do if development does their own merging/building? How can those be incorporated into the automated build+commit protocol that development will use?

Once these stories are gathered, fears have been expressed, and needs have been elaborated, then development (with CM's assistance) should develop a systematic mechanism for automating and reliably reproducing such "trustworthy builds" and codelines.
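
As one hedged sketch of what such a systematic mechanism might look like on the development side, here is a simple outline of an automated build+commit protocol. The build, test, and SCM commands are placeholders; the real ones would be whatever development and CM agree on:

    import subprocess, sys

    def run(cmd):
        """Run one step of the agreed build+commit protocol; report success/failure."""
        print("==>", cmd)
        return subprocess.call(cmd, shell=True) == 0

    def task_level_commit(task_id):
        # 1. Private Build: build in the developer's own workspace first.
        if not run("make clean all"):                  # placeholder build command
            sys.exit("private build failed -- nothing gets committed")
        # 2. Run the automated test suite that development and CM agreed on.
        if not run("make test"):                       # placeholder test command
            sys.exit("tests failed -- nothing gets committed")
        # 3. Task-Level Commit: commit the whole task as one consistent change,
        #    tagged with the task id so status accounting can pick it up.
        run("scm commit -m '%s: task complete'" % task_id)   # placeholder SCM command

    if __name__ == "__main__":
        task_level_commit("TASK-123")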

What might be some comparable sorts of concerns and solutions for building trust with V&V, or Systems Engineering?