Software Development: Inversion of Control

I first heard the term inversion of control (IOC) last year. (Apparently it is mentioned in the GOF book which I read over ten years ago, but most of that book went over my head. :) I thought I'd better do a bit of reading to find out what all the fuss is about. After reading half a dozen definitions I have concluded that it is rather vaguely defined, or perhaps just not well understood by most people that talk about it.

My understanding is that essentially it's purpose is decoupling (a very worthwhile purpose) using the mechanism of callbacks. Now you may think of callbacks as simply function pointers, but with object-oriented languages the mechanisms for "calling back" are less obvious (though under the hood it all boils down to function pointers) - I explain this in Object-Oriented Callbacks below. Though the term only seems to be used in "object-oriented" literature, it can be applied in any language that supports function pointers.

What confuses most people (if they don't just ignore it) is understanding what is actually being inverted. Essentially, using a callback mechanism allows a program or module to give control to a subservient module with the understanding it can get back control at relevant places (using callbacks). So the "inversion" is simply that the calling module gives up control of a process to the called module. See the code example in C, and diagrams, below if this is unclear.

Another term that is often seen in a related context is dependency injection. This is related to inversion of control in that it separates how callbacks are found (or injected) into the calling module. It is another example of separation of concerns to facilitate decoupling; and, in a way, is itself an example of inversion of control.

IOC Examples

The term inversion of control appears to have been first used in Designing Reusable Classes by Johnson and Foote, in the context of frameworks. Traditionally (before frameworks) programs were created as a main program which called into pre-written library routines to accomplish tasks such user I/O. As user interfaces became more advanced it was recognized that the control in many programs followed a pattern, so instead an approach was taken where a pre-written framework was invoked when the application started up which called back into the custom user interface code. In a sense the main control was "inverted" by being passed from the application designer to the framework. This had the advantage that much of the code of an application did not have to be rewritten each time, but it meant that the application had to conform to the framework's way of working.

Actually, even before the invention of frameworks, control had been inverted by modern GUIs. For example, in the Windows operating system the main control of most programs is within the operating system in the "message loop" - as the user interacts with the program this sends messages back to the custom application code. This mechanism afforded users much more freedom when compared to earlier software so some people also think of inversion of control as giving control to the user rather than having the software strictly control the process.

Inversion of control has actually been in use in software development for a very long time (though not given that name). It is applied whenever a common complex process is recognized and incorporated into a reusable module. The main process cedes control to this module but still retains relevant control of the behavior via callback(s). The main advantage is that similar or identical code is not duplicated in every piece of software that needs to perform the process. It also assists decoupling which is a primary goal of well-written software.

C Code Example

Here is a C code example which demonstrates inversion of control used in sorting.

It is a common requirement to need to sort a list of strings with options that control the ordering such as top-to-bottom, bottom-to-top, case-insensitive, etc. The sorting code is left out below, for brevity, but the important thing is the calls to one of several functions that determines the relative order of the strings.

enum sort_type { ALPHA_TOP_TO_BOTTOM, ALPHA_BOTTOM_TO_TOP, CASE_INSENSITIVE_TOP_TO_BOTTOM, ... };

void sort(const char *str[], size_t num, enum sort_type how)
{
...

bool is_before;
switch (how)
{
case ALPHA_TOP_TO_BOTTOM:
is_before = compare(str[index1], str[index2]);
break;
case ALPHA_BOTTOM_TO_TOP:
is_before = compare(str[index2], str[index1]);
break;
case CASE_INSENSITIVE_TOP_TO_BOTTOM:
is_before = case_insensitive_compare(str[index1], str[index2]);
break;
case ...
....
}
...
}

This approach to sorting means that you have to write all the sorting code yourself, which gives a hierarchical structure where the main code calls the sub-routine sort() which in turn repeatedly calls one of the comparison routines.

Diagram 1. Sorting using Normal (Hierarchical) Control

Obviously, sorting is a very common thing to do, so it would be useful to encapsulate the sorting in a reusable module. This is exactly what the standard C library function qsort() is intended for.

The standard C library function qsort() implements the "Quick Sort" algorithm. Using qsort() is where beginners usually first encounter the concept of function pointers.

But to control the ordering of the sorted objects qsort() needs to make a call back into the calling code -- this is accomplished by a function pointer supplied as the fourth parameter to qsort(). A function pointer used like this is often called a callback.

...
const char *str[] = {" Abc", "def", "123", "aaa", "abc", }; // test data

int (*pcomp)(void *, void *);

switch (how)
{
case ALPHA_TOP_TO_BOTTOM:
pcomp = &compare;
break;
case ALPHA_BOTTOM_TO_TOP:
pcomp = &reverse_compare;
break;
case CASE_INSENSITIVE_TOP_TO_BOTTOM:
pcomp = &case_insensitive_compare;
break;
case ...
....
}
qsort(str, sizeof(str)/sizeof(*str), sizeof(*str), pcomp);
...

}

// Call back functions
int compare(const void *p1, const void *p2)
{
return strcmp(*(const char **)p1, *( const char **)p2);
}
int reverse_compare(const void *p1, const void *p2)
{
return strcmp(*(const char **)p2, *( const char **)p1);
}
int case_insensitive_compare(const void *p1, const void *p2)
{
return _stricmp(*(const char **)p1, *( const char **)p2);
}
...

Diagram 2. Inversion of Control using a CallBack

Function Pointers as Callbacks

The idea of callbacks is almost as old as stored-program computers. After all any subroutine has a starting address which can be stored in memory. As long as the calling convention and parameters of different subroutines are the same they can be substituted for each other, giving an effect now known as polymorphism.

Numerous uses for addresses of functions (commonly called function pointers) were found in assembly languages and they soon found there way into high-level languages like PL/1, C and Pascal. We saw above how function pointers can be used for callbacks from the standard C library function qsort().

Object-Oriented "Callbacks"

With the advent of object-oriented languages, inversion of control became easier and safer. However, it is a bit more convoluted so I will take some time to explain it. As usual we have the calling code that relinquishes control and the receiving module or class that controls the process. However, instead of passing one or more function pointers (callbacks) for the controlling module to call, a pointer (or reference) to an object is supplied.

This "callback" object either inherits from an abstract base class or implements an interface. Note that in C++ we use an abstract base class, since the language does not support interfaces like newer languages (such as Java and C#). However, they are essentially the same thing so for simplicity I will just use the term interface. Of course, the idea of using an interface (ie, a set of functions) for polymorphism is a very old one but has only recently been formalized in object-oriented languages.

An object that implements an interface basically just provides an array of function

A vtable is not to be confused with a delegate (as in C#). A delegate is an array of function pointers where each function has exactly the same signature (parameters etc). Whereas a vtable (or interface) is an array of function pointers where each function has a different purpose so the signatures are not necessarily the same.

pointers. In C++ this is called the "vtable", because each of the function pointers points to a "virtual" function which can vary depending on which particular class implements the interface. Each function of the interface has a name (at least at compile time) and an signature which says precisely what parameters it is passed, what it returns, etc. Polymorphism is achieved by the fact that different classes can implement the same interface in different ways.

Interface Example

Now I will show the same sorting code that we saw above using an "interface" instead of a function pointer. Note that this code is in C++ which does not support interfaces so we use an abstract base class instead. If written in Java or C# we would define a comparer interface, then each comparer class would implement (inherit from) the comparer interface instead of inheriting from an abstract base class. The sort function is then simply passed a reference (pointer) to the interface.

Of course, for this example the interface only needs one function (which takes two strings and returns an int). The advantage of an interface or abstract base class over a simple function pointer is that it can define more than one function. Also since an interface is attached to an object, the object itself may have data (private members) that are used in calls to the interface.

class IComparer{public: virtual int compare(const char*, const char*)=0;};

void sort(const IComparer & comp, const char * str[], size_t num_strings);

// Implements normal alphabetic compare
class alpha_compare : public IComparer
{
public:
int compare(const char *s1, const char *s2) {return strcmp(s1, s2);}
};

// Implements case-insensitive compare
class case_insensitive_alpha_compare : public IComparer
{
public:
int compare(const char *s1, const char *s2) {return stricmp(s1, s2);}
};

...
const char *str[] = {" Abc", "def", "123", "aaa", "abc", }; // test data

...
alpha_compare ac; // sort using normal alpha compare
sort(ac, str, sizeof(str)/sizeof (*str));
...

Diagram 3. Inversion of Control using an Interface

Now the module or class that gets control accepts an interface (technically a pointer to an object of a class that derives from the interface) and the caller that relinquishes control implements the class derived from the interface. This class implements the call-back function (in the above code comparer::compare()).

Misuse of Inversion of Control

Like many new ideas, inversion of control tends to be used when it is not really

I will have more to say about this in a future blog on the Gas Factory anti-pattern. Following fads and experimenting with novel designs is a major contributor to this problem.

necessary and even when it is detrimental to a design. It is important to understand different options when designing software then choose the most appropriate. Usually, the most appropriate is the simplest (YAGNI).

If you know anything about XML you may know that there are two main programming "APIs" that were invented for processing XML files. The first is DOM that allows you to very easily read and modify an entire XML file. The problem with DOM is that it usually requires storing the entire contents of the XML file in memory; for a large XML file this may be impractical. So SAX was invented to allow sequential processing of an XML file from start to finish basically so that XML data of indefinite length could be processed.

With SAX the main program (using the inversion of control principle) hands over control of the XML processing to the SAX module which uses callback functions to signal when it encounters something that may be of use to the caller.

There are many problems with SAX, such as the fact it is very time-consuming, inflexible and tedious to use. Despite being designed for efficiency (as opposed to DOM) it is often difficult or impossible to accomplish some tasks efficiently, both in terms of time and memory use. When it is compared with certain alternatives (such as the .Net XmlReader class) it is glaringly obvious that it is a poor design.

The main problem with SAX is that, in this case, it is a poor idea to invert control. Passing control to the SAX parser results in all sorts of problem for the callbacks. For example, the callbacks usually have to store some (often a lot of) information between calls. Even just ensuring that elements are parsed in the correct order will require "state" information to be kept. Further, the whole file must be parsed even if most of the data is irrelevant - because control has been given to the SAX parser there is no way to control which parts of the file are used.

Conclusion

Inversion of control is simply the technique of passing control of a process to a subservient module. Then, when necessary, control is handed back to the caller using callbacks.

This technique, under different names, has been used for probably close to half a century. It is a very powerful technique for separating a process into a separate module, while still allowing necessary control. However, as we saw with SAX it is not always appropriate.

Finally, I was also going to discuss the related topic of dependency injection, but this blog post is long enough already, so I will have to cover it in a later post.

Software Development

Tuesday, 19 June 2012

Inversion of Control

No comments:

Post a Comment