Sunday, March 20, 2005

The Five Orders of Traceability

Traceability is supposed to help us track and link related pieces of knowledge as we progress through the software development lifecycle. That lifecycle is devoted to creating knowledge that transforms fuzzy functional concepts into tangible working results. It is the process of transforming theoretical concepts into executable reality.

Phil Armour, in his magnificent book The Laws of Software Process, maintains that software is not a "product" in the usual production-oriented sense of the word, but that software is really a medium for capturing executable knowledge. From this he derives that software development is not a product-producing activity but rather a knowledge-creating and knowledge-acquiring activity.

Armour goes on to describe "The Five Orders of Ignorance" and how, if software development is a process of knowledge-acquisition and knowledge-creation, then it is also ultimately a process of "ignorance reduction" whereby we progressively reduce our ignorance of what the system is, what it needs to do, how it needs to do it, and how we need to do it and manage it.

Perhaps it should follow then, as a corollary to all of this, that the mission of traceability is to connect all the dots between these various pieces and "orders" of knowledge to help those of us auditing or reviewing them to learn the lessons of how and why that knowledge came to be created in its current form. To that end, I hereby propose The Five Orders of Traceability:
  • 0th Order Traceability – Existence: Tracking Knowledge Content
  • 1st Order Traceability – Structure: Tracking Knowledge Composition
  • 2nd Order Traceability – History: Tracking Knowledge Context
  • 3rd Order Traceability – Transformation: Tracking Knowledge Creation (causal connection in the knowledge creation/derivation process)
  • 4th Order Traceability – Meta-Traceability: Tracking Knowledge of the Five Orders of Traceability.
It's not entirely clear to me whether I have the second and third items in the list above (structure and context) mixed up; perhaps structure should come after context. When deriving the ordering and identification of the above, I basically took my cue from Armour's five orders of ignorance, and made the assumption that the intent of a particular "order" of traceability is to eliminate or address the corresponding "order" of ignorance! Hence 3rd order traceability should help resolve 3rd order ignorance, 2nd order traceability should help resolve 2nd order ignorance, and so on.

With that in mind, I'll elaborate further upon each of the five orders of traceability I've "coined" above ...

0th Order Traceability is merely the existence of knowledge content. There are no additional linkages or associations to navigate through that content. There is no explicit traceability - the content is simply there.

1st Order Traceability is an attempt to structurally organize the knowledge content and provide links that navigate the decomposition from one level of organization/detail to another. This would be like an outline structure and cross-referencing/indexing capabilities. A number of tools give us a means of organizing a particular portion of system knowledge:
  • Basic requirements document management tools provide a way of organizing and viewing the requirements, and even the design documentation, for a project
  • Modeling tools provide a way of organizing and viewing problem-domain and solution-domain abstractions
  • Many interactive development environments give a logical (e.g., object-based) view of the code and/or a physical (e.g., file-based) view of the file+folder structure of the codebase.
  • Many of these tools even provide a way to link from a section of a document to a model entity (e.g., a UML object or package) and/or a code-construct (e.g., a class, method, or module)
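
To make the structural idea a little more concrete, here is a minimal sketch of 1st order traceability as a data structure: a tree that captures decomposition, plus cross-references between entries. The Artifact class, its fields, and the identifiers (SPEC, REQ-1, CLS-Auth) are all hypothetical names invented for this illustration; they do not reflect how any particular tool stores its data.

```python
from dataclasses import dataclass, field


@dataclass
class Artifact:
    """One piece of knowledge content (hypothetical model for illustration)."""
    ident: str
    kind: str       # e.g. "document", "requirement", "code"
    title: str
    children: list = field(default_factory=list)   # decomposition into finer detail
    see_also: list = field(default_factory=list)   # cross-references, by identifier


def outline(node, depth=0):
    """Return the decomposition as indented outline lines."""
    lines = ["  " * depth + f"{node.ident} [{node.kind}] {node.title}"]
    for child in node.children:
        lines.extend(outline(child, depth + 1))
    return lines


def build_index(node):
    """Map every identifier in the tree to its artifact, so links can be followed."""
    index = {node.ident: node}
    for child in node.children:
        index.update(build_index(child))
    return index


# A document section linked to a code construct, as in the last bullet above.
spec = Artifact("SPEC", "document", "Login feature", children=[
    Artifact("REQ-1", "requirement", "User can log in", see_also=["CLS-Auth"]),
    Artifact("CLS-Auth", "code", "class Authenticator"),
])
index = build_index(spec)

print("\n".join(outline(spec)))
print([index[i].title for i in index["REQ-1"].see_also])
```

Note that nothing here says who created a link, when, or why. That is exactly what separates 1st order traceability from the higher orders.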

2nd Order Traceability goes beyond mere content and structure to provide contextual awareness. Not only are there links, but there is also contextual information (typically in the form of metadata) giving a clue as to the events that transpired to create it: who authored the content, when they did it, where they did it (physically or virtually).

This type of historical/log information assists in auditing and recovery, and is typically best suited to automatic capture/recording by the application used to create/modify and store the information (perhaps using a mechanism like dynamic event-based traceability). The information might also include application integration data (for example, recording the identifier of a requirement or a change-request and associating it with the corresponding portion of the design, code, or tests when it is created or updated).
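
As a rough sketch of what such automatically captured context might look like, the snippet below records who, when, and where for each change, along with an optional change-request identifier. The ChangeEvent and History names, the field layout, and the sample identifiers (CR-42, auth.py) are assumptions made up for this example, not the schema of any real version control or requirements tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class ChangeEvent:
    """Context for one change (hypothetical record layout)."""
    artifact_id: str
    author: str                           # who
    when: datetime                        # when
    where: str                            # where, e.g. a host or branch name
    change_request: Optional[str] = None  # integration data from another tool


class History:
    """Append-only log that the authoring tool would fill in automatically."""

    def __init__(self):
        self._events = []

    def record(self, artifact_id, author, where, change_request=None, when=None):
        # Default to the current UTC time when the caller supplies no timestamp.
        stamp = when or datetime.now(timezone.utc)
        event = ChangeEvent(artifact_id, author, stamp, where, change_request)
        self._events.append(event)
        return event

    def for_artifact(self, artifact_id):
        """Audit view: everything that happened to one artifact."""
        return [e for e in self._events if e.artifact_id == artifact_id]

    def for_change_request(self, change_request):
        """Integration view: everything touched on behalf of one change request."""
        return [e for e in self._events if e.change_request == change_request]


history = History()
history.record("auth.py", "alice", "branch:main", change_request="CR-42")
history.record("auth_test.py", "bob", "branch:main", change_request="CR-42")
history.record("auth.py", "alice", "branch:main")

print([e.artifact_id for e in history.for_change_request("CR-42")])
print(len(history.for_artifact("auth.py")))
```

The log tells us that auth.py and auth_test.py were both touched for CR-42, but not how one piece of knowledge was derived from another. That causal connection is the business of the next order.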

3rd Order Traceability is the "nirvana" of traceability for many. Not only do we have structure and context, but we have additional associations and attributes (e.g., metalinks between [meta]data) that capture the causal connection between related pieces of knowledge in the value chain. Some call this "rich traceability" and there are some high-end requirements management tools capable of doing this.

Still, tracing all the way through to design models and code remains very effort-intensive unless all knowledge artifacts (requirements, models, code, tests, project activities, change-requests) are captured in the same repository, where these advanced "rich tracing" capabilities exist. [MKS claims to have taken a step toward achieving this with its new RM tool, which tracks requirements in the same repository as the source-code version control system.]

With 3rd order traceability, we are effectively capturing important decisions, criteria, constraints, and rationale at the various points in the knowledge creation lifecycle where one form of knowledge (e.g., prose, model, code) is being either transformed into another form of knowledge, or is being elaborated to another level of decomposition within the same form of knowledge.
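
One way to picture this is as a set of typed links, each carrying the rationale for a transformation, that can be walked upstream to answer "why does this exist?". The sketch below is only an illustration under assumed names: TraceLink, the relation labels ("refines", "implements"), and the identifiers REQ-1, DES-1, and CODE-1 are all hypothetical and not taken from any rich-traceability tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TraceLink:
    """A causal link between two pieces of knowledge (hypothetical model)."""
    source: str      # the upstream knowledge
    target: str      # the knowledge derived from it
    relation: str    # kind of transformation, e.g. "refines" or "implements"
    rationale: str   # the decision or constraint behind the transformation


def why(target, links):
    """Walk upstream from target, collecting the links that led to it."""
    by_target = {}
    for link in links:
        by_target.setdefault(link.target, []).append(link)

    chain, seen, pending = [], set(), [target]
    while pending:
        current = pending.pop()
        if current in seen:      # guard against cycles in the link graph
            continue
        seen.add(current)
        for link in by_target.get(current, []):
            chain.append(link)
            pending.append(link.source)
    return chain


links = [
    TraceLink("REQ-1", "DES-1", "refines",
              "Passwords must never be stored in plain text"),
    TraceLink("DES-1", "CODE-1", "implements",
              "Salted hashing chosen to satisfy the storage constraint"),
]

for link in why("CODE-1", links):
    print(f"{link.target} {link.relation} {link.source}: {link.rationale}")
```

Starting from CODE-1, the walk reaches DES-1 and then REQ-1, surfacing the rationale recorded at each transformation along the way.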

4th Order Traceability is meta-traceability, or tracking of knowledge about the five orders of traceability within or across systems. (Sorry - I couldn't come up with anything better that is analogous to Armour's 4th order of ignorance, the fifth and last of his five - a.k.a. meta-ignorance. If you have a different idea of what 4th order traceability should be, feel free to comment.)

What about Ontology or Epistemology? I don't honestly know. I would imagine there must be some way of tying together the above with terms and concepts from "knowledge management," such as transforming tacit knowledge to explicit knowledge, and maybe even relating XML schemas and ontologies back to all of this. I leave that as an undertaking for someone much more versed than I in those domains.


Anonymous said...

Any chance of some concrete examples of each level - I get kinda hazy trying to pick up exactly what level 4/5 imply.

Anonymous said...

interesting, good post, especially in light of:

1) all the folks who get an agile bug up their ass to do away with any and all documentation.

2) all the other people who say they want some documentation so start a wiki but then nobody bothers to actually invest the effort required and the wiki just becomes kinda super dead and out of date and possibly worse than not having it at all.

fun times.