Unit Testing

This is still a technical blog first and a personal blog second, so despite a VP debate yesterday, the first Fracture reviews coming out yesterday, and other junk I’d like to talk about, this blog needs a technical post and I’m not going to let reality delay that.

Commonly abbreviated TDD, test driven development is one of the more recent techniques to show up in software engineering. It’s an interesting approach because it takes the normal process of software development and turns it on its head. In order to be able to discuss TDD effectively, though, it’s necessary to make sure everybody is on the same page regarding testing itself, both in the basic process and how it alters the dynamics of software development.

Traditionally, you develop software by iteratively writing pieces, then compiling and running it to confirm that the code you wrote is working as expected. If you’re at a company, there’s probably a QA department that is also poking at the software to try and find holes. If not, well then it’s up to your developer instinct to test likely spots for bugs and ensure that everything is fine. No matter how well organized the QA team is, this testing is still going to be relatively haphazard and time consuming. The reasons that’s not desirable are obvious, but it’s still important to have a test department. They are good at poking at software because it’s their job, and they have a lot more time to dedicate to it than the engineers. They can also identify bugs that are not code bugs. For example, a feature may work exactly as intended, but the actual intention can be flawed or misguided. A good QA department will identify things that are “weird”, not just things that are broken.

It’s also interesting to keep in mind that this does not test the code. It tests the software, which is the end result of a whole lot of code interacting. Most of these interactions are intentional, but some may not be. For example, there might be a block of code to load a configuration file and check that it is sane. This code will sit on top of a function to actually open the file, which should raise an exception if it fails. Now suppose the configuration loader has a bug where it doesn’t correctly check for that error. It would let that exception spill out into its parent code, which would probably result in an application crash or similarly undesirable and obvious behavior. But if the file open function is itself broken so that it doesn’t create that exception the way it’s supposed to, the configuration loader might just keep going as if everything is fine, which could lead to subtle bugs that are not noticed for a long time. If that scenario sounds overly contrived, then you probably don’t have much software development experience. This stuff happens all the time.
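To make that scenario concrete, here is a minimal Python sketch. Everything in it is hypothetical and invented for illustration: `read_text` stands in for the file open function (with an in-memory dict playing the role of the disk), `read_text_broken` is the variant that forgets to raise, and `load_config` is the loader sitting on top.

```python
import json


class ConfigError(Exception):
    """Raised by the loader when the configuration cannot be read."""


def read_text(path, files):
    """File-open layer (an in-memory dict stands in for the disk).

    Specified to raise FileNotFoundError when the path does not exist.
    """
    if path not in files:
        raise FileNotFoundError(path)
    return files[path]


def read_text_broken(path, files):
    """Buggy variant: returns empty text instead of raising."""
    return files.get(path, "")


def load_config(path, files, reader=read_text):
    """Loader built on top of the reader; empty text means 'use defaults'."""
    defaults = {"volume": 5}
    try:
        text = reader(path, files)
    except FileNotFoundError as exc:
        raise ConfigError(f"missing config: {path}") from exc
    if not text:
        return defaults
    return {**defaults, **json.loads(text)}


def raises_not_found(reader):
    """A unit-level check aimed directly at the reader's contract."""
    try:
        reader("missing.json", {})
    except FileNotFoundError:
        return True
    return False
```

With the broken reader, `load_config("missing.json", {}, reader=read_text_broken)` quietly hands back the defaults, and nothing at the application level looks wrong. A check aimed at the reader alone, like `raises_not_found`, flags the problem immediately.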

That’s where automated code testing comes in, also known as unit testing. This type of testing works by writing extra code that actually uses the code being tested in a synthetic setup, where it’s isolated and run in a specific way to yield specific results. The test passes if the expected result matches the actual result, and fails otherwise. It’s important for tests to be repeatable, independent, and compact. When tests are repeatable, they will always have the same exact results. That’s important for being able to nail down bugs quickly and efficiently. Independent tests will not interact with each other, which avoids nasty interactions between tests that can create all sorts of bugs (or hide bugs, which is equally bad). And tests need to be compact, because that ensures that they test very specific blocks of code independently, thus minimizing the potential “surface area” for a problem to occur in that test. These tests are typically part of your build system. The compiler checks that your code fits the rules of your language; the tests check that your code fits the rules of your specifications. The specifications, in turn, have now been converted from the original loosely descriptive English explanation into rigorous mathematical definitions of what is actually expected from correctly written code.
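As a rough illustration of what such tests look like, here is a small sketch using Python's standard `unittest` module. The `clamp` function is a made-up example of code under test, not something from SlimDX. Each test is compact (one behavior per test), independent (no shared state), and repeatable (no clocks, files, or randomness involved).

```python
import unittest


def clamp(value, low, high):
    """Hypothetical function under test: restrict value to [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


class ClampTests(unittest.TestCase):
    def test_value_inside_range_is_unchanged(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_value_below_range_is_raised_to_low(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_value_above_range_is_lowered_to_high(self):
        self.assertEqual(clamp(42, 0, 10), 10)

    def test_inverted_range_is_rejected(self):
        with self.assertRaises(ValueError):
            clamp(5, 10, 0)


def run_suite():
    """Run the tests programmatically, the way a build step might."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ClampTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run_suite().wasSuccessful()` gives a single pass or fail answer that a build script can act on, which is what lets the tests sit next to the compiler in the build process.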

In order to be effective, unit tests need to be small and numerous, like little Zerglings attacking your code. What that means is that writing tests really sucks. It’s tedious, it’s repetitive, and it can feel pretty silly to be writing tests instead of going through your code for places to refactor and examine. I’ve never liked doing it and I’ll usually avoid it. Recently though, Josh Petrie added NUnit to the SlimDX build process, and I decided to take a couple of hours to write some unit tests for the oldest and most used class in it, the Direct3D 9 Device. By the time I was done, I’d identified and fixed three bugs.

It’s hard to argue with results. These were bugs in the single most popular class in a production library that gets several thousand downloads every release. Some of the bugs weren’t even terribly obscure; one only showed up with one particular generic type parameter, and one was actually an interface design mistake. (The set function took two arguments, but the get function returned them as a single bitwise ORed value like the native library does.) As a result, the plan is to significantly expand the number of tests in SlimDX. It’s not fun, but when it comes to effectively writing code that actually works reliably, unit tests are absolutely necessary. Writing them even forces you to examine your interface from the perspective of someone using your code, and as a result you can identify mistakes in it that made sense from the perspective of writing that code, but not using it.
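The set/get mismatch is the kind of mistake a simple round-trip test exposes. The sketch below is purely illustrative and written in Python. `Sampler`, `set_mode`, `get_mode`, and `KIND_MASK` are invented names, not the SlimDX or Direct3D API, and the bit layout is an assumption made up for the example.

```python
KIND_MASK = 0x0F  # assumed layout: low four bits hold the "kind"


class Sampler:
    """Hypothetical wrapper with an asymmetric set/get pair."""

    def __init__(self):
        self._packed = 0

    def set_mode(self, kind, options):
        # Two separate arguments go in and are packed into one value.
        self._packed = kind | options

    def get_mode(self):
        # Only the single packed value comes back out.
        return self._packed


class FixedSampler(Sampler):
    """Same storage, but the getter mirrors the setter's two arguments."""

    def get_mode(self):
        return self._packed & KIND_MASK, self._packed & ~KIND_MASK


def round_trips(sampler, kind, options):
    """Check that what was set is exactly what comes back."""
    sampler.set_mode(kind, options)
    return sampler.get_mode() == (kind, options)
```

Here `round_trips(Sampler(), 0x02, 0x30)` is false while `round_trips(FixedSampler(), 0x02, 0x30)` is true. Nothing in the first class is broken in the crash sense. The asymmetry only becomes obvious once you write code that uses both halves of the interface, which is exactly what a test does.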

As I said before, tests provide a rigorous definition of how you expect code to behave. They provide a proper mathematical specification, and writing the tests requires you to take a good hard look at how your specification translates into actually using the code. But what I’ve described so far isn’t what test driven development means. There are a lot of implications to TDD, and my next technical entry will take a good look at some of those implications. The basic concept, though, is dead simple. I’ll sum it up in one sentence: What happens if you write your tests before writing the code they’re intended to test?
