Monday, September 18, 2006

TEA-Time - a metric to elicit TDD behavior?

I've been thinking about a metric that might elicit Test-Driven behaviors in my organization. As a first step to TDD, we definitely want folks to create automated tests as much as feasible and execute them frequently. Once they get that, I've been thinking about what sort of metric might encourage them to actually work in short, test-driven cycles, where requirements are elaborated test-by-test (given a use-case or story, write the first test, write the code to pass the test, refactor, rinse-lather-repeat).

Some of these folks are very much ingrained in a systems engineering V-model lifecycle that does a lot of big-requirements-up-front. So ensuring they work to automate tests and execute them frequently isn't enough to encourage them to use an iterative approach of fine-grained TDD-style elaboration. An idea I had for what to measure is something I chose to call Test-Execution Availability Time, or TEA-Time (I figure if nothing else, my friends in the U.K. will like the name :-).

As proposed, Test-Execution Availability Time (or TEA-Time) would be defined as the mean time between when a system requirement description is first baselined and the time at which a "sufficient" number of automated tests for the requirement are implemented and ready (available) to be executed.
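To make the definition concrete, here is a minimal sketch of how TEA-Time might be computed. Everything in it is an assumption for illustration: the function name `tea_time`, the record layout (`baselined`, `tests_ready`), the sample dates, and the choice to skip requirements that have not yet reached the "sufficient" count.

```python
from datetime import datetime, timedelta
from statistics import mean


def tea_time(requirements, sufficient=1):
    """Mean time from baseline until the `sufficient`-th automated test is ready.

    Each requirement is a dict (hypothetical layout) with:
      "baselined":   datetime when the requirement was first baselined
      "tests_ready": datetimes when each automated test became executable
    Requirements with fewer than `sufficient` tests are skipped (an assumption).
    """
    waits = []
    for req in requirements:
        ready = sorted(req["tests_ready"])
        if len(ready) < sufficient:
            continue  # not enough tests yet, so no availability time to record
        available_at = ready[sufficient - 1]
        waits.append((available_at - req["baselined"]).total_seconds())
    if not waits:
        return None  # nothing measurable yet
    return timedelta(seconds=mean(waits))


# Hypothetical sample data: two requirements baselined on the same day.
reqs = [
    {"baselined": datetime(2006, 9, 1),
     "tests_ready": [datetime(2006, 9, 3), datetime(2006, 9, 5)]},
    {"baselined": datetime(2006, 9, 1),
     "tests_ready": [datetime(2006, 9, 11)]},
]

result = tea_time(reqs, sufficient=1)
print(result)
```

With `sufficient=1`, the two requirements above wait 2 and 10 days for their first test, so the mean is 6 days. What `sufficient` ought to be remains the open question.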

I was thinking that if a group was measuring this and wanted to behave in a way that minimized TEA-Time, it might encourage them to elaborate requirements in an iterative and incremental fashion, in smaller and smaller "functional slices". One thing I'm not sure of is what "a sufficient number of automated tests" should be in the above.

Any thoughts or comments?


Anonymous said...

I really like the idea of defining a formal metric to encourage iterative and incremental elaboration of requirements, so please don't take my comments to mean I don't like the idea. Several undesirable properties of your definition would be that (1) TEA-Time increases as the number of automated tests decreases, and (2) TEA-Time increases as the automated test quality decreases.

Maybe you could improve your definition by including the test definitions in the requirement description? For example,

"... when a system requirement description including definitions of required automated tests is first baselined, and the time at which the defined tests for the requirement ..."

Brad Appleton said...

Hi Chris! Thanks for the feedback (much appreciated).

Does TEA-Time increase as the number of automated tests decreases? That would mean writing fewer automated tests for a requirement increases the time it takes to have executable tests available? You must have meant TEA-Time improves (gets smaller) as the number of automated tests decreases. (yes?)

I think that is only true if TEA-Time is measured until all automated tests for the requirement are implemented. If the limit is not "all", but instead a "sufficient" number, then I don't think that is the case. The big question is how many is "sufficient". If N=1, then writing more tests for that requirement doesn't "degrade" TEA-Time, but 1 may not be a "sufficient" number.
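A small sketch of that argument, using a hypothetical helper (`days_until_available`) and made-up dates: with a fixed threshold N the measurement ignores any tests written after the N-th one, whereas measuring until "all" tests exist grows with every later test.

```python
from datetime import date


def days_until_available(baselined, tests_ready, sufficient=None):
    """Days from baseline until the `sufficient`-th test is ready.

    When `sufficient` is None, measure until ALL tests are ready instead.
    (Hypothetical helper, for illustration only.)
    """
    ready = sorted(tests_ready)
    n = len(ready) if sufficient is None else sufficient
    return (ready[n - 1] - baselined).days


baselined = date(2006, 9, 1)
few = [date(2006, 9, 3)]                              # one test
many = few + [date(2006, 9, 8), date(2006, 9, 15)]    # same test plus two later ones

# Fixed threshold N=1: the extra tests do not change the measurement.
n1_few = days_until_available(baselined, few, sufficient=1)
n1_many = days_until_available(baselined, many, sufficient=1)

# Threshold "all": the extra tests push the measurement out.
all_few = days_until_available(baselined, few)
all_many = days_until_available(baselined, many)

print(n1_few, n1_many, all_few, all_many)
```

Here the N=1 measurement is 2 days for both sets of tests, while the "all" measurement goes from 2 days to 14 once the two later tests are added.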

I think the same applies to your 2nd statement about quality of automated tests. The tricky part is figuring out what is "sufficient." Obviously 1 test per requirement likely isn't enough. And yet there is some threshold beyond which adding more tests doesn't add value. So how do we find that threshold?

If we take the Agile/XP view that tests are an executable form of requirements, then creating too many tests is akin to "gold-plating" the requirements or "over-engineering" the system. Some tests add value, other proposed tests might not be adding much additional value. How do we decide that?

For requirements, we typically ask the customer to determine the value/priority. Is that reasonable to do for the detailed test-cases? Can the customer be expected to grok that level of detail? (If not, then who? and how do they judge?)

My concern about the "requirements baseline" including the definition of the tests (test cases) is that up until the requirements are initially baselined, TEA-Time alone doesn't provide much incentive to develop the requirements in a non-waterfall manner. So you could still do big-requirements-up-front (BRUF) before the clock started on TEA-Time. Adding the test-case definition to what is needed before that clock starts could (I fear) contribute even more to doing the initial requirements in a BRUF fashion all the way thru to detailed test cases (ouch).

In my organization, the system being delivered includes hardware as well as software, and they use a systems engineering approach (and lifecycle). At some point in that lifecycle, some level of BRUF and requirements decomposition already has to take place in order to determine a set of software products and their corresponding features/stories. So that level of BRUF is unavoidable here (or at least outside my sphere of influence :-).

I was hoping to devise a metric that would encourage incremental/iterative elaboration of requirements (all the way thru to executable tests) starting at the point after the systems engineering folks hand off system-software requirements to the software product teams for them to further decompose and develop.

Fredrik said...

Isn't TEA-Time just a variation of lead time for requirements, considering that the definition of done for the requirement step is that there are tests that ensure the requirement is fulfilled?

Brad Appleton said...

Hi Fredrik!
Can you say more about lead-time for requirements? Is it lead-time for specifying ALL the requirements? or for a single requirement?