Friday, April 18, 2008

When Interface Dictates Implementation

This whole topic is probably kind of obvious, but I recently ran into a great example of how a seemingly minor difference in an interface can have a large impact on possible implementations. I was evaluating two libraries for inclusion in a project at work. Both provided a C API that we would use, and the two interfaces were only slightly different. This difference would prove to be significant.

There are only a handful of ways to create and manipulate "objects" with a C library. The first library, "Toolkit S", had an interface like this:

int CreateObject(...); /* returns a handle or a reserved value to indicate failure */
int Foo(int handle, ...);
int Bar(int handle, ...);
int DestroyObject(int handle);

That is, all required allocations are done internally, and clients interact via a numeric id. This is similar to how _open works: it returns an integer file descriptor that later calls take as an argument. The second library, "Toolkit L", had an interface like:

typedef struct tag_Object{ ... } Object;

int CreateObject(Object* obj); /* returns an error code, initializes struct */
int Foo(Object* obj, ...);
int Bar(Object* obj, ...);
int DestroyObject(Object* obj);

In this case, the caller allocates the memory for the object and passes that struct to every call (like the Win32 critical section functions, which operate on a caller-provided CRITICAL_SECTION). In this particular library the Object struct also contains pointers to memory that the library may allocate, but that isn't generally true of this style of interface.

There are pros and cons to each design; I'm not going to declare one superior to the other in all cases. However, what I realized (this is kind of obvious if you've thought about it) is that the first library must have a similar struct defined internally, along with an internal data structure that maps int handles to it. Because that data structure is shared by every thread that uses the library, it needs some kind of synchronization around it, which causes an unavoidable performance penalty in a multi-threaded environment that the other library avoids completely. No matter how access is controlled, there are use cases where the handle-based design takes a performance hit (above and beyond the extra level of indirection). The benchmarks showed what I knew they would: Toolkit L made much better use of my test machine's second core.
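To make the difference concrete, here is a minimal sketch of what the two styles might look like on the inside. I have no access to either library's source, so everything here is an assumption: the S_ and L_ prefixes, the fixed-size table, the single global pthread mutex, and the counter that stands in for real object state are all made up for illustration. A real handle-based library could use finer-grained locking, but it still has to guard the shared handle lookup somehow.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* ---- Handle style (in the spirit of "Toolkit S"; hypothetical internals) ---- */

typedef struct {
    int in_use;  /* nonzero while the slot is owned by a live handle */
    int value;   /* stand-in for the object's real state */
} SObject;

#define S_MAX_OBJECTS 64

/* One table shared by every thread, so every access must take the lock. */
static SObject s_table[S_MAX_OBJECTS];
static pthread_mutex_t s_table_lock = PTHREAD_MUTEX_INITIALIZER;

/* Returns a handle (slot index), or -1 if the table is full. */
int S_CreateObject(void) {
    int handle = -1;
    pthread_mutex_lock(&s_table_lock);
    for (int i = 0; i < S_MAX_OBJECTS; ++i) {
        if (!s_table[i].in_use) {
            s_table[i].in_use = 1;
            s_table[i].value = 0;
            handle = i;
            break;
        }
    }
    pthread_mutex_unlock(&s_table_lock);
    return handle;
}

/* Increments the object's counter; returns the new value, or -1 on a bad handle. */
int S_Foo(int handle) {
    int result = -1;
    if (handle < 0 || handle >= S_MAX_OBJECTS) return -1;
    pthread_mutex_lock(&s_table_lock);   /* threads using unrelated objects still contend here */
    if (s_table[handle].in_use) result = ++s_table[handle].value;
    pthread_mutex_unlock(&s_table_lock);
    return result;
}

/* Frees the slot; returns 0 on success, -1 on a bad handle. */
int S_DestroyObject(int handle) {
    int result = -1;
    if (handle < 0 || handle >= S_MAX_OBJECTS) return -1;
    pthread_mutex_lock(&s_table_lock);
    if (s_table[handle].in_use) {
        s_table[handle].in_use = 0;
        result = 0;
    }
    pthread_mutex_unlock(&s_table_lock);
    return result;
}

/* ---- Caller-allocated style (in the spirit of "Toolkit L"; hypothetical) ---- */

typedef struct {
    int value;  /* stand-in for the object's real state */
} LObject;

/* No shared table, so no lock: each thread touches only the struct it owns. */
int L_CreateObject(LObject *obj) {
    assert(obj != NULL);
    obj->value = 0;
    return 0;  /* 0 means success in this sketch */
}

/* Increments the object's counter and returns the new value. */
int L_Foo(LObject *obj) {
    assert(obj != NULL);
    return ++obj->value;
}

int L_DestroyObject(LObject *obj) {
    assert(obj != NULL);
    obj->value = 0;
    return 0;
}
```

In the handle version, two threads working on two completely unrelated objects still queue up on s_table_lock. In the caller-allocated version there is nothing shared to protect, so if the application wants to share one LObject between threads, it decides for itself whether and how to lock.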

They've been talking about the "multicore revolution" for years, and we're finally getting to the point where odds are that your code is running on a multi-core machine. You can't always take advantage of it - especially as a library author - but you have to do what you can to not get in the way of those who can take advantage.
