Thursday, November 10, 2005

Subversion in C++?

So a while back one of the Subversion committers mentioned that it might make sense to write some (as yet theoretical) future version of Subversion in a dialect of C++ instead of C.

You see, Subversion is VERY object oriented, at least for C code, and we jump through a LOT of hoops as a result of the C language. For example, there's about a million places in the source tree where we pass around void pointers as a way of storing context for some callback. In C++ we could pass objects that carry their own context around with them, instantly cutting in half the number of arguments we need to keep track of.
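To make the void pointer point concrete, here is a small sketch contrasting the two styles. None of this is actual Subversion code: `walk_c`, `walk_cpp`, `count_baton`, `add_to_total`, and `Counter` are hypothetical names made up for illustration.

```cpp
#include <cstddef>

// C style: the callback and its context ("baton") travel as two separate
// arguments, and the context is an untyped void pointer.
typedef void (*visit_fn)(void *baton, int value);

void walk_c(const int *items, std::size_t n, visit_fn visit, void *baton)
{
    for (std::size_t i = 0; i < n; ++i)
        visit(baton, items[i]);
}

struct count_baton { int total; };

void add_to_total(void *baton, int value)
{
    // Unchecked cast: handing in the wrong baton type still compiles.
    static_cast<count_baton *>(baton)->total += value;
}

// C++ style: a single object carries both the behaviour and its context.
struct Counter
{
    int total;
    Counter() : total(0) {}
    void operator()(int value) { total += value; }
};

template <typename Visitor>
void walk_cpp(const int *items, std::size_t n, Visitor &visit)
{
    for (std::size_t i = 0; i < n; ++i)
        visit(items[i]);
}

// Sum the same three numbers both ways.
int sum_c_style()
{
    const int items[] = { 1, 2, 3 };
    count_baton b = { 0 };
    walk_c(items, 3, add_to_total, &b);   // function + baton: two arguments
    return b.total;
}

int sum_cpp_style()
{
    const int items[] = { 1, 2, 3 };
    Counter c;
    walk_cpp(items, 3, c);                // one argument, fully typed
    return c.total;
}
```

The C++ version drops the separate baton argument and the cast along with it, which is the kind of saving being described here.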

Anyway, I was thinking about this on the drive home from work today, and I started to wonder what would be required to make that really possible. The first thing that jumped to mind was making APR interoperate with C++. Virtually all resource management in Subversion is done via APR pools, and we're kind of used to that by now, so switching to a non-pool based world would be kind of weird. But APR pools deal in void pointers, low level raw memory C stuff, and setting up higher level cleanups is a pretty manual process.

Now it's possible to take the raw memory you got from a pool and turn it into an instance of a C++ object, via something called placement new, but it's kind of a pain in the ass, and even then you have to manually call the destructor when you're done with the object, which is kind of contrary to the point of allocating things out of a memory pool...
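For anyone who hasn't run into placement new before, a minimal sketch of the manual dance looks something like this. `Widget` and `placement_new_demo` are hypothetical names, and `std::malloc` is only standing in for memory handed back by a pool.

```cpp
#include <cstdlib>
#include <new>      // declares placement new

// Hypothetical class that counts how many instances are currently alive.
struct Widget
{
    static int live;
    Widget()  { ++live; }
    ~Widget() { --live; }
};
int Widget::live = 0;

// Returns true if the object was alive after construction and gone
// after the explicit destructor call.
bool placement_new_demo()
{
    // Raw, untyped memory, standing in for what a pool would return.
    void *raw = std::malloc(sizeof(Widget));
    if (!raw)
        return false;

    Widget *w = new (raw) Widget;      // construct the object in place
    bool alive = (Widget::live == 1);

    w->~Widget();                      // nobody calls this for you
    bool gone = (Widget::live == 0);

    std::free(raw);                    // a pool would reclaim this on clear
    return alive && gone;
}
```

Forget the explicit `w->~Widget()` and the memory still goes away with the pool, but the destructor never runs.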

So the question is, how do we allocate a C++ object from a pool, but automatically register a cleanup that takes care of calling the object's destructor when the pool is cleared or destroyed?

It took a little doing, but I managed to come up with something I like. It looks like this:

    int
    main (int argc, char *argv[])
    {
      apr::pool p;

      for (int i = 0; i < 10; ++i)
        {
          Foo *f = p.allocate<Foo>();
        }
    }


Now to make that work, you do have to jump through one little hoop. The Foo class needs to have a static 'cleanup' function defined on it, which calls the object's destructor. That function will be used as the pool cleanup callback, so you don't have to worry about cleaning it up yourself. To simplify the process of writing this function, there's a helper macro that pounds it out for you, so the Foo class looks like this:

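    struct Foo
    {
      Foo () { std::printf("in constructor\n"); }

      ~Foo () { std::printf("in destructor\n"); }

      // the helper macro that generates the static 'cleanup' function goes here
    };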


That really doesn't seem like such a horrible price to pay for the convenience of being able to work with pools the way we've come to expect, right?
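To give a feel for how the pieces could fit together, here is a self-contained sketch of the idea. It is not the code linked from this post and it does not use APR at all: `toy::pool`, `TOY_POOL_CLEANUP`, `Tracked`, and `live_after_scope` are all invented for illustration, and `std::malloc`/`std::free` stand in for pool memory. With real APR, the registration step would presumably go through `apr_pool_cleanup_register`, whose callbacks return an `apr_status_t` rather than `void`.

```cpp
#include <cstdlib>
#include <new>
#include <vector>

namespace toy {

// Stand-in for an APR pool: hands out raw memory and remembers a cleanup
// callback for every object constructed in it.
class pool
{
    struct entry { void *mem; void (*cleanup)(void *); };
    std::vector<entry> entries_;

public:
    pool() {}
    pool(const pool &) = delete;
    pool &operator=(const pool &) = delete;
    ~pool() { clear(); }

    // Construct a T in pool memory and register T::cleanup for it.
    template <typename T>
    T *allocate()
    {
        // Reserve the bookkeeping slot first so push_back cannot throw later.
        entries_.reserve(entries_.size() + 1);

        void *mem = std::malloc(sizeof(T));
        if (!mem)
            throw std::bad_alloc();

        T *obj;
        try { obj = new (mem) T; }
        catch (...) { std::free(mem); throw; }

        entry e = { mem, &T::cleanup };
        entries_.push_back(e);
        return obj;
    }

    // Run cleanups in reverse order of registration, then release memory.
    void clear()
    {
        while (!entries_.empty()) {
            entry e = entries_.back();
            entries_.pop_back();
            e.cleanup(e.mem);
            std::free(e.mem);
        }
    }
};

} // namespace toy

// Hypothetical helper macro: defines the static 'cleanup' function that
// just calls the destructor on the object it is given.
#define TOY_POOL_CLEANUP(T) \
    static void cleanup(void *data) { static_cast<T *>(data)->~T(); }

struct Tracked
{
    static int live;
    Tracked()  { ++live; }
    ~Tracked() { --live; }
    TOY_POOL_CLEANUP(Tracked)
};
int Tracked::live = 0;

// Allocate ten objects, let the pool go out of scope, report survivors.
int live_after_scope()
{
    {
        toy::pool p;
        for (int i = 0; i < 10; ++i)
            p.allocate<Tracked>();
        if (Tracked::live != 10)
            return -1;
    }   // pool destroyed here; every registered cleanup runs
    return Tracked::live;
}
```

The caller never touches a destructor; clearing or destroying the pool does it, which is the behaviour we're used to from pools in C.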

Anyway, I doubt this will ever really be used for anything, but if you want to check out the code I wrote to make this work, you can grab it here.


  1. Ha! And you implied that you didn't have copious free time on your hands... This actually looks pretty cool, but I understood that one of the reasons Subversion went with apr in the first place over the Netscape portability library was the desire to use C instead of C++. Has the culture within the project changed enough that this is no longer the case?

  2. Well, this is all sort of theoretical; I don't know if the project would ever actually accept using C++. Plus, even if we did, we'd still probably want to keep using APR, both to minimize the disruption while converting to C++ and because mod_dav_svn would basically mean we have to use APR at some point anyway.

  3. Interesting. I've been working on pool allocators this week myself. IMHO the most direct way to do it is with a factory that returns a shared_ptr with an overridden deleter. Then use placement new and delete to grab memory from the pool.
    One problem with this is that shared_ptr has a count that is allocated from the heap. I thought about using auto_ptr, but auto_ptr doesn't allow user-provided deleters.
    I'll probably blog about this this week.

  4. I think the real problem with the shared_ptr approach is that the calling of the destructor is tied to the object you returned, not the pool itself. In my scheme, if I clear or destroy the pool, all the destructors get called. In your scheme (if I understand correctly) the destructor would only get called when the last shared_ptr reference went away, which could be after the pool was actually destroyed, which would be a bad thing...
    Of course, in my scheme you do get the potential problem of still having dangling pointers sticking around after the pool has reused the memory they point to, but I guess I'm just used to that from dealing with APR pools in C code, and I'm not sure I'd be willing to buy into some scheme that uses the shared_ptr reference counts to prevent the pool from being cleared/destroyed until the pointers were gone.
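For reference, the shared_ptr approach discussed in comments 3 and 4 might look roughly like the sketch below. It uses `std::shared_ptr` from modern C++ (at the time of this post that would have meant the Boost or TR1 version), and `Node`, `placement_deleter`, `make_node`, and `destructor_follows_last_reference` are hypothetical names. `std::malloc`/`std::free` again stand in for taking memory from a pool and giving it back.

```cpp
#include <cstdlib>
#include <memory>
#include <new>

// Hypothetical class that counts how many instances are currently alive.
struct Node
{
    static int live;
    Node()  { ++live; }
    ~Node() { --live; }
};
int Node::live = 0;

// Custom deleter: run the destructor by hand, then return the memory.
struct placement_deleter
{
    void operator()(Node *n) const
    {
        n->~Node();
        std::free(n);
    }
};

// Factory: placement new into raw memory, wrapped in a shared_ptr.
std::shared_ptr<Node> make_node()
{
    void *raw = std::malloc(sizeof(Node));
    if (!raw)
        throw std::bad_alloc();
    Node *n = new (raw) Node;          // Node's constructor cannot throw
    return std::shared_ptr<Node>(n, placement_deleter());
}

// The destructor runs when the LAST reference goes away, not when any
// particular pool is cleared.
bool destructor_follows_last_reference()
{
    std::shared_ptr<Node> a = make_node();
    std::shared_ptr<Node> b = a;       // second owner
    a.reset();                         // still alive: b holds it
    bool alive_after_first = (Node::live == 1);
    b.reset();                         // last owner gone: deleter runs
    return alive_after_first && Node::live == 0;
}
```

Since the deleter fires on the last reference, if the memory really came from a pool that had already been destroyed by then, the deleter would be touching freed memory, which is the hazard raised in comment 4.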