Sunday, 5 January 2014

Unit Tests - Challenges

Happy New Year! Here I return to my series of posts on Unit Tests.

I guess you know by now that Unit Tests have many advantages, so why isn't everybody using them? Well, you'd be surprised how many people are using them! But the uptake has been slow, probably due to ignorance of their benefits (already covered) and various "challenges". This week I will explain the challenges, why they are far outweighed by the benefits, and how to overcome them or at least lessen their effect.

First, I will mention something that is a major challenge in itself - actually implementing Unit Tests properly. Poor practices are the main reason that Unit Tests are rejected after they have been tried. (I had intended to cover this now but it will be in the next post Unit Tests - Best Practice which will include a discussion of poor practices like incomplete test coverage, etc -- so stay tuned!!)

1. You need to Understand the Code

The first problem with Unit Tests is something that nobody talks about - except that some people mention it as an advantage of them. (Maybe it is an advantage sometimes, but in my experience it is a drawback.)

In C, you can write code and know it will just work without truly understanding it. The design of the language, such as its proper handling of zero (see Zero) and asymmetric bounds (see Asymmetric Bounds), makes coding fast and reliable. But with Unit Tests you suddenly have to understand things like boundary conditions and different combinations of inputs.

For example, a C programmer can typically write the code for the standard library routine strncpy(), and get it right, in a minute or two. (As a test I wrote the code for strncpy() in 81 seconds.) However, to consider all possible values and combinations of input parameters and create Unit Tests for them all would take at least an order of magnitude longer - at least 30 minutes, probably more.

In this example I assume the coder knows the correct behavior of strncpy() - eg, that the destination string is not nul-terminated if the source string is as long as, or longer than, the given length.
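
To make the comparison concrete, here is a rough sketch of a byte-at-a-time strncpy() followed by a few of the boundary cases its Unit Tests have to consider. The names my_strncpy and test_my_strncpy are made up for this illustration, and the list of cases is nowhere near exhaustive. Even so, the tests are already several times longer than the code they test.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical re-implementation with the standard strncpy() semantics. */
char *my_strncpy(char *dst, const char *src, size_t n)
{
    size_t i = 0;

    /* Copy up to n characters, stopping at the terminating nul. */
    for (; i < n && src[i] != '\0'; ++i)
        dst[i] = src[i];

    /* Pad whatever is left of the n bytes with nuls. */
    for (; i < n; ++i)
        dst[i] = '\0';

    return dst;
}

/* Returns 0 if every case passes (assert aborts on the first failure). */
int test_my_strncpy(void)
{
    char buf[8];

    /* n == 0: nothing may be written. */
    memset(buf, 'x', sizeof buf);
    my_strncpy(buf, "abc", 0);
    assert(buf[0] == 'x');

    /* Source shorter than n: copied, then padded with nuls up to n. */
    memset(buf, 'x', sizeof buf);
    my_strncpy(buf, "abc", 6);
    assert(memcmp(buf, "abc\0\0\0xx", 8) == 0);

    /* Source exactly n long: no room for the terminating nul. */
    memset(buf, 'x', sizeof buf);
    my_strncpy(buf, "abcd", 4);
    assert(memcmp(buf, "abcdxxxx", 8) == 0);

    /* Source longer than n: truncated and NOT nul-terminated. */
    memset(buf, 'x', sizeof buf);
    my_strncpy(buf, "abcdefgh", 3);
    assert(memcmp(buf, "abcxxxxx", 8) == 0);

    /* Empty source: the first n bytes are all nuls. */
    memset(buf, 'x', sizeof buf);
    my_strncpy(buf, "", 2);
    assert(memcmp(buf, "\0\0xxxxxx", 8) == 0);

    /* The return value is the destination pointer. */
    assert(my_strncpy(buf, "a", 2) == buf);

    return 0;
}
```

Calling test_my_strncpy() from main() or from a test runner is enough to run all the cases.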

Despite this I still believe that Unit Tests are more than worthwhile (though this was the main reason that I initially rejected them - see my previous post on Personal Experiences). First, for larger modules, actually trying to understand how the code might react to different combinations of inputs may reveal bugs and lead to better code (see the Design section in Unit Tests - Advantages).

For modules that are likely to change (ie, most) there is an even more important consideration: you can change the module to your heart's content and then simply run the Unit Tests to ensure that bugs have not been introduced. This can more than compensate for the effort needed to understand the code and create the Unit Tests.

Even the code for strncpy(), mentioned above, might need to change. For example, it may have to be rewritten to copy 4 (aligned) bytes at a time for better performance. In this case having Unit Tests can be very useful for checking that the code still behaves correctly. (Of course, you probably need to add new tests after this change, for example to check that 4-character and 5-character strings, strings with different alignments, etc are copied correctly.)
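
One way to write those extra tests, sketched here as an assumption rather than a recommendation, is to compare the rewritten routine against the standard strncpy() for every combination of string length, count and buffer offset. The names check_against_strncpy and copy_bytes, and the limits chosen, are hypothetical. The byte-wise copy_bytes merely stands in for the optimized version, which I have not written here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef char *(*strncpy_fn)(char *, const char *, size_t);

/* Stand-in for the rewritten routine (a real one would copy 4 bytes at a time). */
char *copy_bytes(char *dst, const char *src, size_t n)
{
    size_t i = 0;
    for (; i < n && src[i] != '\0'; ++i)
        dst[i] = src[i];
    for (; i < n; ++i)
        dst[i] = '\0';
    return dst;
}

/* Returns the number of cases where fn disagrees with the standard strncpy(). */
int check_against_strncpy(strncpy_fn fn)
{
    enum { MAX_LEN = 9, MAX_OFF = 4, BUF = 32 };
    int failures = 0;

    for (size_t len = 0; len <= MAX_LEN; ++len) {
        for (size_t n = 0; n <= MAX_LEN + 2; ++n) {
            for (size_t soff = 0; soff < MAX_OFF; ++soff) {
                for (size_t doff = 0; doff < MAX_OFF; ++doff) {
                    char src[BUF], expect[BUF], actual[BUF];

                    /* Build a source string of length len starting at offset soff. */
                    memset(src, 0, sizeof src);
                    memset(src + soff, 'a', len);

                    /* Guard bytes reveal any write outside the n destination bytes. */
                    memset(expect, '#', sizeof expect);
                    memset(actual, '#', sizeof actual);

                    strncpy(expect + doff, src + soff, n);
                    fn(actual + doff, src + soff, n);

                    if (memcmp(expect, actual, sizeof expect) != 0)
                        ++failures;
                }
            }
        }
    }
    return failures;
}
```

The offsets only vary the position within a local array, so they exercise different relative alignments rather than guaranteeing any particular absolute address.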

Finally, there are tools available to address this problem such as the excellent tool from Microsoft called Pex. Pex analyses your code and works out different sets of test values, for example, to check boundary conditions. It is not a fully automated system so you still have to understand the code, but it can save a lot of time.

2. They Take Time

There is no getting around it - writing Unit Tests takes time. Writing good Unit Tests can take much more time than writing the code being tested, for the reasons discussed below and next week. Many proponents of Unit Tests say that once you are used to writing them they do not add appreciably to the time to write the code, but either they are not writing tests properly, are lying, or they know something that I don't.

Of course, you understand by now that, even if they take time, this is not wasted time. In the long term this is time exceedingly well spent. The trouble is you can't always control how you spend your own time. Many managers would not agree to anything that adds 10% to development time, 50% extra would make them very angry, 100% would be unthinkable - so imagine their reaction if you tell them you need to create some mock objects and a meta language for your Unit Tests which will add 500% to development time!

One day it will be standard practice to create Unit Tests (with full code coverage) as an accepted part of the process of creating software. However, until that time you either have to pray for a good boss, work overtime creating Unit Tests, or just grossly overestimate to give yourself some free time.

The good news is that there are a lot of tools and libraries around to assist with creating Unit Tests. I mentioned Pex above, but there are also tools that make it easy to create mock objects. These can save a lot of time.

3. They Require a Good Design

Before all else, in order to create Unit Tests you need a good design. (Of course, that is not the only advantage of good design - see Fundamentals of Good Design.) You need well-defined modules, each with simple well-understood interface(s). This is one reason it is hard to add Unit Tests to legacy code (see the next section).

There are different ways to split any design into modules. The way this is done can also affect how easy it is to test them. For example, the MVVM (Model, View, View-Model) design pattern is very useful for isolating the GUI from the rest of the system, which makes it much easier to test the whole system in isolation from the GUI.

There are other aspects of the design which are rather specific to Unit Testing. For example, Dependency Injection (DI) makes it very easy to test modules with mock objects (see the section after next).
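
As a rough illustration of Dependency Injection in C, the module below is handed its dependency as a function pointer (plus a context pointer) instead of calling it directly, so a test can pass in a double. Everything here (struct mailer, send_reminder, fake_send and so on) is invented for the example and is not taken from any real library.

```c
#include <assert.h>
#include <string.h>

/* The dependency is an interface: a function pointer plus its context. */
struct mailer {
    int (*send)(void *ctx, const char *to, const char *body);
    void *ctx;
};

/* Module under test: it uses whatever mailer it is given.
   Returns 0 if nothing was due, 1 if a reminder was sent, -1 on failure. */
int send_reminder(const struct mailer *m, const char *to, int days_overdue)
{
    if (days_overdue <= 0)
        return 0;
    return m->send(m->ctx, to, "Your invoice is overdue") == 0 ? 1 : -1;
}

/* Test double: records the call instead of sending a real email. */
struct fake_outbox {
    int calls;
    char last_to[64];
};

static int fake_send(void *ctx, const char *to, const char *body)
{
    struct fake_outbox *box = ctx;
    (void)body;
    box->calls++;
    strncpy(box->last_to, to, sizeof box->last_to - 1);
    box->last_to[sizeof box->last_to - 1] = '\0';
    return 0;
}

/* Returns 0 if every check passes (assert aborts on the first failure). */
int test_send_reminder(void)
{
    struct fake_outbox box = {0};
    struct mailer m = { fake_send, &box };

    /* Not overdue: the mailer must not be touched. */
    assert(send_reminder(&m, "a@example.com", 0) == 0);
    assert(box.calls == 0);

    /* Overdue: exactly one message goes to the right address. */
    assert(send_reminder(&m, "a@example.com", 3) == 1);
    assert(box.calls == 1);
    assert(strcmp(box.last_to, "a@example.com") == 0);
    return 0;
}
```

In production the same send_reminder() would be given a struct mailer whose send function talks to the real mail system.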

Finally, even just creating Unit Tests while you write the code (especially if you use TDD) will result in a better design.

4. They're Hard to Retrofit

Like a lot of programmers I have spent a large part of my career maintaining and modifying existing code (and cursing the bad to egregious design I had been lumbered with :). I have been quietly pushing the idea of Unit Tests to management at various employers for about 20 years - to which I get the obvious response: "Add some Unit Tests to our existing code to demonstrate the benefits".

Unfortunately, adding Unit Tests to existing code is difficult to impossible. Some people have reported some success in retrofitting Unit Tests to legacy code but in my experience these tests are not that useful (probably missing at least 80% of test cases).

A major problem for many legacy projects is that they do not have well-defined interfaces, which are mandatory for being able to add Unit Tests (see previous section). Some do not even have any recognizable modules at all.

Even if you are lucky enough to have a well-designed legacy program there is still another problem. Only when the code is being written is it well understood. Even the person(s) who originally wrote it will begin to lose some understanding of it within a few days.

Unit Tests should be written at the same time as the code by the person who wrote the code since they (at that time) understand it best.

5. Often Need Test Doubles

The main reason that Unit Tests are not used for many new projects is that modules/objects have external dependencies that make it difficult or impossible to create Unit Tests without creating code to emulate those external dependencies. There are various ways that this emulation can be accomplished but they all take time and/or have other limitations. All methods of emulation nowadays fall under the umbrella term of test doubles, but there are various types (simulators, stubs, mock objects, etc) with different applicability and advantages/disadvantages as discussed below and in my next post.

Why do we need test doubles?

There are two general reasons to use test doubles. First, there may be undesirable consequences of using the real module which we want to avoid. Second, we may want to test conditions which the real module does not normally produce.

Here are examples of modules that require test doubles due to their undesirable behaviour:
  • communicates with a remote server and so is much too slow to use in a test
  • modifies a live database, such as updating customer details
  • uses resources, such as a printer module that consumes paper
  • depends on an environment that may not exist where the Unit Tests are run
  • sends test emails to real customers
  • requires manual intervention, such as clicking to confirm a message
  • returns different results each time it is used, such as stock values
  • does not even exist, for example a sub-module that has not been written
  • is not the final version, so has defects (bugs, poor performance, ...)

Here are some examples of external modules that require test doubles in order to simulate certain conditions:
  • hardware device driver, so you can simulate any possible error condition
  • system time, so that you can test what happens at midnight
  • remote communications, so you can test time-outs and other errors
  • any external module, so you can have it return out-of-range or atypical values
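
Taking the system time as an example, the sketch below replaces the clock with a stub so that a test can place itself one second before midnight. The names clock_fn, seconds_until_midnight and stub_time are hypothetical, and the code assumes that time_t counts seconds since the epoch (as on POSIX systems) and works in UTC.

```c
#include <assert.h>
#include <time.h>

/* Clock interface: production code passes time(), tests pass a stub. */
typedef time_t (*clock_fn)(time_t *);

/* Hypothetical module under test: seconds left until the UTC day rolls over.
   Assumes time_t is a count of seconds since the epoch. */
long seconds_until_midnight(clock_fn now)
{
    const long day = 24L * 60 * 60;
    long since_midnight = (long)((long long)now(NULL) % day);
    return day - since_midnight;
}

/* Stub clock: returns whatever the test has set. */
static time_t stub_time_value;

static time_t stub_time(time_t *out)
{
    if (out)
        *out = stub_time_value;
    return stub_time_value;
}

/* Returns 0 if every check passes (assert aborts on the first failure). */
int test_midnight(void)
{
    /* 23:59:59 on day 10 after the epoch. */
    stub_time_value = 86400 * 10 - 1;
    assert(seconds_until_midnight(stub_time) == 1);

    /* Exactly midnight: a full day remains. */
    stub_time_value = 86400 * 10;
    assert(seconds_until_midnight(stub_time) == 86400);
    return 0;
}
```

Production code would simply call seconds_until_midnight(time).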

Problems with Test Doubles

That's enough about explaining what test doubles are. Now, what are their problems? First, though often necessary, they can take a lot of time to create. Here again tools, like mock frameworks, can save time but may have their own limitations, which in turn make it difficult to decide what to do.

Even deciding whether or not to use a test double can be difficult. In my experience it is better not to use a test double unless you really have to. By not using a double you may discover integration problems, caused by communication between modules, that would otherwise not be found until much later.

(The module or unit being tested is conventionally called the System Under Test. I will use the acronym SUT.)

On the other hand many people recommend using test doubles for all external dependencies to avoid chasing problems not related to the SUT. This appeals to those of a scientific background since you isolate the SUT from any external influence that may affect your test results; that is, it reduces complexity by eliminating unknown variables.

I personally am of the opinion that it all boils down to how reliably the external module performs its task and, equally important, how well-defined its interface is (which goes back to point 3 about having a good design). If it has bugs then it may be better to use a test double in order to avoid chasing bugs not caused by the SUT. Another possibility is to create the same test twice, first using the real external module and then using a test double, so it is obvious where the problem is occurring.

Another problem to watch for with test doubles is that they may return the correct or expected results even when sent the wrong information, so a test passes when it should have failed. This problem of "false positives" is very common.
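
Here is a small sketch of how I read this problem, using made-up names throughout. The function under test has a deliberate bug: it always asks for the wrong stock symbol. A careless double that gives the same canned answer to every request lets the test pass anyway, while a stricter double that checks what it was asked exposes the bug.

```c
#include <assert.h>
#include <string.h>

typedef int (*price_lookup_fn)(const char *symbol);

/* SUT with a deliberate bug: it always asks for "XYZ" and ignores symbol. */
int position_value(price_lookup_fn lookup, const char *symbol, int quantity)
{
    (void)symbol;
    return quantity * lookup("XYZ");
}

/* Careless double: the same canned price whatever it is asked for. */
static int lenient_lookup(const char *symbol)
{
    (void)symbol;
    return 10;
}

/* Stricter double: only answers for the symbol the test expects. */
static int strict_lookup(const char *symbol)
{
    return strcmp(symbol, "ABC") == 0 ? 10 : -1;
}

/* The same test written against each double; returns 1 for pass, 0 for fail. */
int lenient_test_passes(void)
{
    return position_value(lenient_lookup, "ABC", 3) == 30;
}

int strict_test_passes(void)
{
    return position_value(strict_lookup, "ABC", 3) == 30;
}
```

With the bug in place lenient_test_passes() reports a pass and strict_test_passes() reports a failure, which is the behaviour you actually want from a test.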

The final problem that I will mention with test doubles is deciding which type to use. For example, it might be simple to use a mock object, but in some circumstances mock objects are not a good idea as they can end up testing how the SUT is implemented internally. As I have mentioned before (in Unit Tests - White Box Testing), testing the implementation is a bad idea, since changing how a module is implemented without changing its public interface should not invalidate your Unit Tests. This is a common mistake with mock objects that I will discuss next week.

Solutions

Mock objects are a neat solution for quickly creating test doubles. There are many mock frameworks that allow you to create a "mock" object on the fly that accepts function calls and responds with a canned response. (How easy this is depends on the framework and the language being used.) But note that with complex tests even setting up mock objects can be time-consuming.

There is also a danger to using mock objects -- as mentioned above, you may end up testing how the SUT is implemented rather than simply its external behaviour. I will discuss this next week but in brief it depends on the type of module the test double is created for. If you want to test how a module communicates with other peer modules then mocks are ideal. If the module is a private (subservient) module you normally want to check "what, not how", and avoid mocking.

An alternative is to use a simulator (some people call them "fakes") but the problem is that simulators may take a lot longer to create than mock objects. An accurate simulation may take almost as long to create as the module it is simulating, while a trivial simulation may be useless. My advice is to start with a simple simulator and enhance it if and when required.

Another alternative is to ask the supplier of a 3rd-party module to provide a simulator. This can be a perfect solution as it will save you time and give more accurate results (since they should know better than anyone how their software is supposed to behave), and they can provide the same simulator to other customers. This practice is sometimes used by hardware manufacturers but could be used for any sort of module. It couldn't hurt to ask.

If the module to be simulated is developed in-house then another alternative is to clone the original and hollow it out into a shell that emulates the real thing without actually doing anything. You can also add the ability for the simulator to return possible error conditions so you can test error-handling.
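
A minimal sketch of such a hollowed-out simulator is shown below. The storage module, its error codes and the save_record() function are all hypothetical. The simulator keeps data in memory instead of writing it anywhere, and has a fail_next field that a test can set to inject an error. For brevity the code under test calls the simulator directly, whereas a real program would go through whatever interface the original module exposes.

```c
#include <assert.h>
#include <string.h>

enum { SIM_OK = 0, SIM_EIO = -1, SIM_ENOSPC = -2 };

/* Hollowed-out stand-in for a hypothetical in-house storage module. */
struct storage_sim {
    char data[256];
    size_t used;
    int fail_next;   /* error code for the next write to return, or SIM_OK */
};

int sim_write(struct storage_sim *s, const char *bytes, size_t len)
{
    if (s->fail_next != SIM_OK) {
        int err = s->fail_next;
        s->fail_next = SIM_OK;          /* inject the error once only */
        return err;
    }
    if (len > sizeof s->data - s->used)
        return SIM_ENOSPC;
    memcpy(s->data + s->used, bytes, len);
    s->used += len;
    return SIM_OK;
}

/* Code under test: retries once after an I/O error before giving up. */
int save_record(struct storage_sim *s, const char *rec)
{
    size_t len = strlen(rec);
    int rc = sim_write(s, rec, len);
    if (rc == SIM_EIO)
        rc = sim_write(s, rec, len);
    return rc;
}

/* Returns 0 if every check passes (assert aborts on the first failure). */
int test_save_record(void)
{
    struct storage_sim s = {0};

    /* Normal path. */
    assert(save_record(&s, "abc") == SIM_OK);
    assert(s.used == 3);

    /* Simulate a transient I/O error: the retry should succeed. */
    s.fail_next = SIM_EIO;
    assert(save_record(&s, "de") == SIM_OK);
    assert(memcmp(s.data, "abcde", 5) == 0);
    return 0;
}
```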

6. Need Provability

Another problem with Unit Tests is that even if you have lots of tests and they all give a green light you still cannot be sure that they are adequate. People are lazy and/or make mistakes. Analysis of typical Unit Tests shows they cover only about 50%-60% of the code.

Like all the problems I have mentioned here, there is a (partial) solution. A code coverage tool can tell you how much of the code your Unit Tests are executing. The target should be to get to 100% code coverage, or as close as possible. (Of course even 100% code coverage does not mean that there are no bugs.)
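
The caveat about 100% coverage is easy to demonstrate with a contrived example (the function and test names are my own). The two assertions below execute every line and both branches of is_leap_year(), so a coverage tool would report full coverage, yet the function is wrong for century years such as 1900.

```c
#include <assert.h>

/* Buggy on purpose: ignores the century rules of the Gregorian calendar. */
int is_leap_year(int year)
{
    if (year % 4 == 0)
        return 1;
    return 0;
}

/* Executes every line and both branches above, and still misses the bug.
   Returns 0 if every check passes (assert aborts on the first failure). */
int test_is_leap_year(void)
{
    assert(is_leap_year(2012) == 1);
    assert(is_leap_year(2013) == 0);
    return 0;
}
```

Only a test for a year like 1900, which is not a leap year, would catch the mistake, and no coverage figure will tell you that test is missing.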

7. Poorly Implemented

Many shops try Unit Tests but strike problems due to poor practices. For example, if the tests are too slow people will stop running them. There are lots of pitfalls like these. I will discuss them next time, as well as how to look out for and avoid them.

Conclusion

That concludes the discussion on the inherent problems of Unit Testing. In summary, in order to create good Unit Tests you need a good design and to write the tests at the same time as (or before) writing the code to be tested. They do take time to create but the benefits far outweigh the costs. Plus there are many time-saving tools and techniques such as mock frameworks, though you need to be aware of the limitations of any tools and techniques you use.
