Writing a wrapper for the CUDA memory allocation API


As a software developer, you may need to modify the behavior of a third-party program. For example, suppose you are writing a debugger that checks how the program being debugged uses memory allocation. You often do not have the source for the program, so you cannot change it. Nor do you have the source for the memory allocation library that the program uses, so you cannot change that code either. This is exactly the problem I faced with NVIDIA's CUDA API [1]. Although it is not a particularly hard problem, it turned out to be yet another example of how frustrating it is to look for a solution that should be well documented but is not.

