Code Coverage – Part III: Statement coverage and some myths

This post is the third in a four-part series on code coverage.

Consider the following scenario: After running your tests, you find that

Service A has 100% statement coverage
Service B has 80% statement coverage
Service C has 5% statement coverage

Which of the following can we say with confidence?

“We have fully tested Service A.”
“We need to test Service B more.”
“Service B is sufficiently tested.”
“Service C is insufficiently tested.”

Take some time to jot down your answers (and perhaps some examples!)

Done? Great! Let’s review the four statements together.

1. “We have fully tested Service A.”

While 100% statement coverage seems like a perfect score, we can’t actually say that Service A is fully tested without reading the code. Consider the following block of code:

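// functionWithSideEffect and doSomething are placeholders defined elsewhere.
func HidingBugsSomewhere(ok bool) {
	// || short-circuits: when ok is true, functionWithSideEffect is never called.
	if ok || functionWithSideEffect() {
		doSomething()
	}
}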

Any test in which ok is true will produce 100% statement coverage for this block of code. However, because || short-circuits, functionWithSideEffect never runs in that test case, so whatever it does remains untested and any bug in it goes unnoticed. A compound condition like this one can quietly hide bugs behind a perfect coverage score for HidingBugsSomewhere.
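To make the short-circuit visible, here is a minimal, self-contained sketch of the same function. The counter sideEffectCalls, the flag didSomething and the bodies of functionWithSideEffect and doSomething are my own stand-ins, added purely for illustration; the original functions could do anything.

```go
package main

import "fmt"

// sideEffectCalls counts how often functionWithSideEffect actually runs.
// (Hypothetical instrumentation, not part of the original example.)
var sideEffectCalls int

// didSomething records whether the body of the if statement ran.
var didSomething bool

// functionWithSideEffect is a stand-in: it only records that it was called.
func functionWithSideEffect() bool {
	sideEffectCalls++
	return true
}

// doSomething is a stand-in for the guarded work.
func doSomething() { didSomething = true }

func HidingBugsSomewhere(ok bool) {
	// When ok is true, the right-hand side of || is skipped entirely.
	if ok || functionWithSideEffect() {
		doSomething()
	}
}

func main() {
	HidingBugsSomewhere(true)
	fmt.Println("body executed:", didSomething)
	fmt.Println("functionWithSideEffect calls:", sideEffectCalls)
}
```

With ok set to true, every statement inside HidingBugsSomewhere executes, yet the call counter stays at zero. Only a second test with ok set to false would exercise functionWithSideEffect.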

While statement coverage seems insufficient to give us reliable reports on our testing, other coverage metrics like edge coverage or functional coverage might do the job. More on that later…

2. “We need to test Service B more.”

80% statement coverage means that 20% of executable lines were not run during the tests.

But what if these 20% of “executable” lines can never actually be executed by your testing framework?

For example, this code may include error handlers for packet drops or network delays, or code for particular database connections that are not used in your test environment. Or maybe the code is meant to be used as a template instead of being tested directly, but was counted towards the statement coverage by default. Perhaps the code can be reached via an extreme edge case, but it’s not worthwhile modifying the current testing environment to allow such a case to be automated.
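As a rough illustration of the first case, the sketch below shows an error handler that a well-behaved test environment never triggers. Everything here (the Store interface, healthyStore, failingStore and Lookup) is hypothetical and only meant to show the shape of the problem.

```go
package main

import (
	"errors"
	"fmt"
)

// Store is a hypothetical dependency, for example a database client.
type Store interface {
	Get(key string) (string, error)
}

// healthyStore mimics a test environment in which the database never fails.
type healthyStore struct{}

func (healthyStore) Get(key string) (string, error) {
	return "value-for-" + key, nil
}

// failingStore simulates the connection failure that the environment above
// cannot produce on its own.
type failingStore struct{}

func (failingStore) Get(string) (string, error) {
	return "", errors.New("connection lost")
}

// Lookup has two statements in its error branch that only run when Get fails.
func Lookup(s Store, key string) string {
	v, err := s.Get(key)
	if err != nil {
		fmt.Println("lookup failed:", err)
		return "fallback"
	}
	return v
}

func main() {
	// Against the healthy store, the error branch is never executed.
	fmt.Println(Lookup(healthyStore{}, "a"))
	// The error branch runs only once a failure is injected.
	fmt.Println(Lookup(failingStore{}, "a"))
}
```

Tests that only ever see healthyStore will report the two statements in the error branch as uncovered, however thorough they are otherwise. Whether building something like failingStore is worth the effort is exactly the judgement call that the raw percentage cannot make for you.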

As you can see, such inferences can only be made by reading through the code and analysing the code coverage profile. Statement coverage is just a number – what matters is the story behind it. As a wise man puts it,

Achieving 100% code coverage without analysis is just like reading through a book without understanding it.

3. “Service B is sufficiently tested.”

Without knowing what Service B does, it’s impossible to make any conclusive statements like this. In some organisations, it may be helpful to assign a baseline percentage as to how much code needs to be covered by testing before a ticket can be signed off or a branch can be merged, for practicality or efficiency’s sake. It may be safe to assume that if 80% did not cause the system to crash and burn, the other 20% is probably okay. But are you willing to take that risk?

Harking back to point 1, it is possible that the uncovered 20% of lines actually contains the most convoluted and complex code, and accounts for 80% of the system logic. In that case, the system would be a lot more vulnerable than we imagined! I repeat: statement coverage cannot be interpreted out of context – read the book and understand it!
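A small sketch of why the aggregate number can mislead: the per-function figures below are invented for illustration (the function names and statement counts are hypothetical, not taken from any real service), but they show how an overall 80% can sit on top of a critical function with no coverage at all.

```go
package main

import "fmt"

// funcCoverage holds statement counts for one function in a coverage profile.
type funcCoverage struct {
	name    string
	total   int // executable statements in the function
	covered int // statements executed by the tests
}

// percent returns covered/total as a percentage, or 0 for an empty function.
func percent(covered, total int) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(covered) / float64(total)
}

// overall computes statement coverage across all functions combined.
func overall(fs []funcCoverage) float64 {
	var covered, total int
	for _, f := range fs {
		covered += f.covered
		total += f.total
	}
	return percent(covered, total)
}

func main() {
	// Hypothetical profile: two simple helpers fully covered,
	// one complex function never executed by any test.
	profile := []funcCoverage{
		{"parseFlags", 40, 40},
		{"logRequest", 40, 40},
		{"settlePayment", 20, 0},
	}
	fmt.Printf("overall: %.0f%%\n", overall(profile))
	for _, f := range profile {
		fmt.Printf("%-14s %.0f%%\n", f.name, percent(f.covered, f.total))
	}
}
```

In a Go project, a comparable per-function breakdown can be obtained by writing a profile with go test -coverprofile=coverage.out and then running go tool cover -func=coverage.out; other languages have similar tooling.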

4. “Service C is insufficiently tested.”

If all your tests have run and 95% of statements were never executed, it’s probably a signal that your tests are blind to the main functionality of the system. Either that, or the source code is stale or redundant – what’s it doing there, then?

Assuming the service is suitably complex, useful and still relevant, any sizeable number of tests that achieves such low statement coverage deserves some rethinking. We can say with certainty that it is insufficiently tested not just in the lines of code but also in terms of function and logic.

High statement coverage does not imply high functional coverage. Low statement coverage definitely implies low functional coverage.

Now that we’ve gone through some primers on code coverage, let’s explore more useful and interesting code coverage metrics. See you in the next post!

Published by quiche