One Bug, One Failing Test

I’ll admit it right now.

For years, I wrote tests on the simple premise that more is better. I would look proudly at a set of tests I had written, a job made easier by parameterized tests, and be impressed by their number, often in the dozens or even hundreds.

Here’s the problem: when dozens of tests fail, it can be difficult to go through them all and determine the underlying cause. More turns out to be too much; ideally, a single failing test should be enough to pinpoint the problem.

Now, this is an ideal, not an absolute, inflexible criterion. But in general, tests are cleaner and more informative when a piece of broken code causes only a few tests, or maybe only one, to fail.

I call the opposite behavior “cascading tests”, or “downstream failures”: one errant piece of code “upstream” leads to a panoply of failures. So I try to isolate code and tests so that a failing line of code results in a very clear subset of failing tests.

How does one get to this state? Simple (yet difficult): have one test per edge condition.

An example is a simple method from the Str class in IJDK:

    public int length() {
        return isNull() ? 0 : str().length();
    }    

And the test:

    @Test @Parameters @TestCaseName("{method} {index} {params}")
    public void length(int expected, String str) {
        int result = new Str(str).length();
        assertThat(result, equalTo(expected));
    }

    public List<Object[]> parametersForLength() {
        return paramsList(params(0, null), 
                          params(0, ""),   
                          params(1, "a"),
                          params(2, "ab"));
    }

Note that this is simple, yet adequate. The rule of thumb is to always test the sizes 0, 1, and N, plus null and, for strings, the empty string.
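Here is a minimal sketch of that rule of thumb in plain Java, without JUnit or JUnitParams, so it can run on its own. The names `EdgeCaseSketch`, `Case`, and `failingConditions` are hypothetical, and the lambda in `main` is a simplified stand-in for `Str.length()`, not the IJDK code itself. Each case is labeled with the edge condition it covers, so a failure names the condition that broke.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

class EdgeCaseSketch {
    // One case per edge condition: a label, the expected length, and the input.
    static final class Case {
        final String condition;
        final int expected;
        final String input;

        Case(String condition, int expected, String input) {
            this.condition = condition;
            this.expected = expected;
            this.input = input;
        }
    }

    // null, empty (0), one (1), and many (N): the same four inputs as the test above.
    static final List<Case> CASES = Arrays.asList(
        new Case("null", 0, null),
        new Case("empty", 0, ""),
        new Case("one", 1, "a"),
        new Case("many", 2, "ab"));

    // Returns the labels of the edge conditions that the implementation gets wrong.
    // An exception (such as a NullPointerException) counts as a failure.
    static List<String> failingConditions(Function<String, Integer> impl) {
        List<String> failing = new ArrayList<>();
        for (Case c : CASES) {
            boolean ok;
            try {
                ok = impl.apply(c.input) == c.expected;
            } catch (RuntimeException e) {
                ok = false;
            }
            if (!ok) {
                failing.add(c.condition);
            }
        }
        return failing;
    }

    public static void main(String[] args) {
        // Simplified stand-in for Str.length(): null counts as length 0.
        Function<String, Integer> length = s -> s == null ? 0 : s.length();
        System.out.println(failingConditions(length)); // prints []
    }
}
```

Passing `s -> s.length()` instead (no null check) yields `[null]`, and `s -> s == null ? 0 : 1` yields `[empty, many]`, so each broken variant points at the condition it mishandles.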

Breaking the code as follows, by dropping the null check, produces exactly one failure:

    public int length() {
        return str().length();
    }

    org.incava.ijdk.lang.StrTest > length 0 0, null FAILED
        java.lang.NullPointerException
            at org.incava.ijdk.lang.Str.length(Str.java:747)
            at org.incava.ijdk.lang.StrTest.length(StrTest.java:299)

Similarly:

    public int length() {
        return isNull() ? 0 : 1;
    }

    org.incava.ijdk.lang.StrTest > length(...) #1; [0, ] FAILED
        java.lang.AssertionError:
        Expected: <0>
             but: was <1>
            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
            at org.incava.ijdk.lang.StrTest.length(StrTest.java:300)

    org.incava.ijdk.lang.StrTest > length(...) #3; [2, ab] FAILED
        java.lang.AssertionError:
        Expected: <2>
             but: was <1>
            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:8)
            at org.incava.ijdk.lang.StrTest.length(StrTest.java:300)

Okay, that’s two failures, not just one (the "a" case still passes, but only because its length happens to be 1). Even so, it’s much better than 20, where 18 or so would have been redundant and confusing.
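To make the comparison concrete, here is a rough sketch that counts failures for the four-case set against a hypothetical bloated set of 20 cases. Everything here is an assumption for illustration: the class `RedundantCasesSketch`, its methods, and the 20-case list (null, the empty string, and strings of length 1 through 18) are invented, and the broken lambda mimics the `isNull() ? 0 : 1` variant rather than calling the real `Str` class.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

class RedundantCasesSketch {
    // An expected length paired with the input that should produce it.
    static final class Case {
        final int expected;
        final String input;

        Case(int expected, String input) {
            this.expected = expected;
            this.input = input;
        }
    }

    // The four cases from the parameterized test: null, "", "a", "ab".
    static List<Case> minimal() {
        List<Case> cases = new ArrayList<>();
        cases.add(new Case(0, null));
        cases.add(new Case(0, ""));
        cases.add(new Case(1, "a"));
        cases.add(new Case(2, "ab"));
        return cases;
    }

    // Hypothetical bloated set: null, "", and strings of length 1 through 18.
    static List<Case> redundant() {
        List<Case> cases = new ArrayList<>();
        cases.add(new Case(0, null));
        cases.add(new Case(0, ""));
        StringBuilder sb = new StringBuilder();
        for (int n = 1; n <= 18; n++) {
            sb.append('a');
            cases.add(new Case(n, sb.toString()));
        }
        return cases;
    }

    // Counts the cases where the implementation throws or returns the wrong length.
    static int countFailures(List<Case> cases, Function<String, Integer> impl) {
        int failures = 0;
        for (Case c : cases) {
            try {
                if (impl.apply(c.input) != c.expected) {
                    failures++;
                }
            } catch (RuntimeException e) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Mimics the broken variant: every non-null string reports length 1.
        Function<String, Integer> broken = s -> s == null ? 0 : 1;
        System.out.println(countFailures(minimal(), broken));   // prints 2
        System.out.println(countFailures(redundant(), broken)); // prints 18
    }
}
```

In this sketch the extra 16 cases add 16 more failures without adding any information: every string longer than one character fails for the same reason that "ab" does.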

Of course, it’s better to “err” on the side of excessive test coverage, but focusing on a subset of possible conditions (input and state) can lead to more precise and more concise tests.
