Sunday, February 27, 2005

Happy Birthday Apache

So Roy Fielding posted to the httpd mailing list today to let people know that 10 years ago today the Apache project really got off the ground with the initial creation of its first source repository and user accounts for developers.

I was a little curious what it was like back then, so I've been reading over the list archives (check it out, they really do go all the way back) and man was it a different world...

There was no CVS back then; they were using RCS, with local user accounts and manual merging of patches.

There was a lot of discussion of incompatibilities between various clients, the same as today, but at far lower levels than I'd expected. It's easy to forget that at one point basic things like forms and CGI didn't always work the way you'd expect.

Apache was a purely forking server at that point, to the extent that the code in the child process didn't even bother to free memory, although you can see the discussions of what would eventually become the memory pools used today in APR.
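For the curious, here's a minimal sketch of the pool idea in C. It's a toy under my own assumptions, not APR's actual implementation or API: the names toy_pool, toy_pool_alloc and toy_pool_clear are made up for illustration. The point is just that callers allocate from a pool and never free individual pieces; everything goes away at once when the pool is cleared.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical toy pool; none of these names come from APR. */
typedef union pool_block {
    union pool_block *next;   /* previously allocated block, or NULL */
    max_align_t align;        /* C11: keeps the payload suitably aligned */
} pool_block;

typedef struct {
    pool_block *blocks;       /* linked list of every allocation made */
    size_t count;             /* number of live allocations */
} toy_pool;

/* Start with an empty pool. */
static void toy_pool_init(toy_pool *p)
{
    assert(p != NULL);
    p->blocks = NULL;
    p->count = 0;
}

/* Allocate size bytes owned by the pool; returns NULL on failure. */
static void *toy_pool_alloc(toy_pool *p, size_t size)
{
    pool_block *b;

    assert(p != NULL);
    if (size > SIZE_MAX - sizeof(*b))
        return NULL;              /* header + payload would overflow */

    b = malloc(sizeof(*b) + size);
    if (b == NULL)
        return NULL;

    b->next = p->blocks;          /* remember it so clear can free it */
    p->blocks = b;
    p->count++;
    return b + 1;                 /* payload starts right after the header */
}

/* Free everything the pool ever handed out, in one pass. */
static void toy_pool_clear(toy_pool *p)
{
    assert(p != NULL);
    while (p->blocks != NULL) {
        pool_block *next = p->blocks->next;
        free(p->blocks);
        p->blocks = next;
    }
    p->count = 0;
}
```

A real pool would carve allocations out of larger chunks rather than calling malloc each time, but the ownership model is the same: allocate freely during a request, then clear the pool when the request is done.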

There was no autoconf, just manual ifdefs for each platform.

Apache's own web server seems to have been down rather often in the early days ;-)

Perhaps most interesting are the people involved, some of whom are still active in the ASF, some of whom are not, but regardless we all owe them a great deal, and it's nice to be reminded of that every now and then.

Saturday, February 26, 2005

Oh, How Cute, It's A Podling

After a little stalling due to various people lacking free time, the Lucene4c project has started to move into the ASF Incubator.

The source code has been moved into Subversion, and we now have a status page where you can track how we're progressing through the incubation process.

Of course, the status page is already a bit out of date, but I'm sure it'll be updated soon.

Still on the TODO list are bug tracking and a mailing list, but hopefully those will get finished up in the next few days, and I'll be able to return to writing code and recruiting developers instead of bugging people about moving things along in the ASF infrastructure world.

Oh, and as far as actual progress on the project itself goes, I got the beginnings of the query and scorer code sketched out, to the point where I can now search for documents that contain "foo" and "bar". Support for OR and NOT will hopefully get fleshed out soon.
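To give a feel for what an AND query boils down to, here's a minimal sketch in C. It assumes each term maps to an ascending list of document ids, and intersects two such lists with a linear merge, which is a common textbook approach. The function name intersect_postings is hypothetical, and this is not the Lucene4c query or scorer code.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Intersect two ascending arrays of document ids (hypothetical helper).
 * Matches are written to out, which must have room for at least
 * min(na, nb) entries. Returns the number of matches written.
 */
static size_t intersect_postings(const unsigned *a, size_t na,
                                 const unsigned *b, size_t nb,
                                 unsigned *out)
{
    size_t i = 0, j = 0, n = 0;

    assert(out != NULL);
    while (i < na && j < nb) {
        if (a[i] < b[j]) {
            i++;                  /* a is behind, advance it */
        } else if (b[j] < a[i]) {
            j++;                  /* b is behind, advance it */
        } else {
            out[n++] = a[i];      /* both terms appear in this document */
            i++;
            j++;
        }
    }
    return n;
}
```

OR would be the matching union of the two lists, and NOT a difference, which is why they fall out fairly naturally once AND works.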

Monday, February 21, 2005

Lucene4c 0.03: Let there be Documentation!

I'm happy to report the latest release of Lucene4c, which hopefully will be the last release I do before the project moves into the ASF Incubator, a precursor to it becoming part of the new Lucene TLP at the ASF.

You can get the scoop on exactly what changed from the CHANGES file, but for the curious, the major changes are the addition of new APIs for manipulating an entire index instead of simply a single segment, the addition of document and field objects, improvement of many error messages and algorithms, fixes to the compound file stream code, and most importantly the addition of doxygen documentation for the public APIs.

You can grab the new version (along with all previous versions) from the project web site.

Saturday, February 19, 2005

Subversion + Ruby + SWIG = Fun!

So thanks to the work of Kouhei Sutou we now have a set of SWIG based Subversion bindings for Ruby. They're currently living on a branch in the Subversion repository, and I've been meaning to get around to trying them out for a while now.

My first attempt resulted in a bit of a failure because the version of Ruby I got with my Ubuntu Linux install doesn't seem to expose the symbol rb_hash_foreach, so whenever I tried to run the tests I'd get failures due to relocation errors when it tried to use that function.

I have no clue why that symbol isn't being exposed (if anyone does have an idea, please let me know, 'cause it's a pain in the ass), but once I built my own version of Ruby (to go along with my own version of SWIG, since Ubuntu's version of SWIG has bugs that keep it from working for the Subversion Ruby bindings) everything went along swimmingly.

It's quite nice to have a set of reasonably complete Subversion bindings for such a cool language. All they need is a little documentation and I suspect they'll be just as functional as the excellent Perl bindings we've had for some time.

Thursday, February 17, 2005

New O'ReillyNet Article

I got home this evening after playing pool with a friend and discovered that Google Alerts had already found my new O'ReillyNet article. This one is about preserving backwards compatibility in open source projects, something that we've taken quite seriously in Subversion for obvious reasons. I think the article came out pretty good (especially after my editor gave it a once over), but as usual I'd welcome any feedback.

Monday, February 14, 2005

I Love This Part

You know the part of a program where you've implemented a whole bunch of the low level bits, and you've just started stitching together the higher level interface, and the whole thing feels like it's coming together without any real effort on your part?

I just hit that part in Lucene4c.

I'm sure it'll pass, as I'm bound to run out of lower level bits I've actually implemented soon enough, but for now I actually seem to have enough of it there to provide something useful.

For example, tonight I implemented an interface to let you iterate over all the documents in an index that contain a given field. I had done much of this before, but that was for a given segment. Now I'm at the point where I'm stitching together the multiple segments that can be part of an index, and it all seems to be working out so far...
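Here's a rough sketch in C of what stitching segments together can look like. It assumes each segment numbers its documents from zero, so an index-wide document number is the segment-local number plus the total document count of all earlier segments. All of the toy_ names are hypothetical and this is not the actual Lucene4c interface.

```c
#include <assert.h>
#include <stddef.h>

/* One segment's view: hypothetical, for illustration only. */
typedef struct {
    const unsigned *docs;   /* ascending local doc numbers containing the field */
    size_t ndocs;           /* length of docs */
    unsigned max_doc;       /* total number of documents in this segment */
} toy_segment;

/* Iterator over every matching document in a multi-segment index. */
typedef struct {
    const toy_segment *segs;
    size_t nsegs;
    size_t seg;             /* index of the current segment */
    size_t pos;             /* position within the current segment's docs */
    unsigned base;          /* sum of max_doc for all earlier segments */
} toy_index_iter;

static void toy_index_iter_init(toy_index_iter *it,
                                const toy_segment *segs, size_t nsegs)
{
    assert(it != NULL);
    it->segs = segs;
    it->nsegs = nsegs;
    it->seg = 0;
    it->pos = 0;
    it->base = 0;
}

/* Stores the next index-wide doc number in *doc and returns 1,
 * or returns 0 once every segment is exhausted. */
static int toy_index_iter_next(toy_index_iter *it, unsigned *doc)
{
    assert(it != NULL && doc != NULL);
    while (it->seg < it->nsegs) {
        const toy_segment *s = &it->segs[it->seg];
        if (it->pos < s->ndocs) {
            *doc = it->base + s->docs[it->pos++];
            return 1;
        }
        /* Done with this segment: shift the base and move on. */
        it->base += s->max_doc;
        it->seg++;
        it->pos = 0;
    }
    return 0;
}
```

The nice part of this shape is that the caller never sees segment boundaries; it just keeps calling next until it returns 0.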

Tomorrow I'll tackle lcn_index_get_document, which should bring me up to the level of functionality I was at in my initial release, but with a whole index instead of a single segment.

Sunday, February 13, 2005

Lucene4c 0.02: Compound File Streams!

Lucene4c 0.02 escaped from my Subversion repository into the wild today.

This release adds support for the compound file stream directory format used by modern versions of Lucene, recasts the public interfaces in terms of a new abstract directory object similar to that used in Java Lucene, and fixes a staggering number of bugs.

It also brings with it a web site.

Links to the new release, the previous release, and the URL of the project's Subversion repository can all be found there.

Next on the agenda is support for reading the remaining parts of the Lucene index format, after which we can head towards actual useful searching functionality.

Comments, questions, and of course contributions are always welcome.

Tuesday, February 8, 2005

Books I Plan To Read

A while back I posted my "recommended reading" list.

Basically all the best technical books I could think of at the time.

Today, I want to talk about something a little different. This is the list of books that are coming out in the reasonably near future that I see as likely to make their way onto that list.

  • Advanced Programming in the Unix Environment (2nd Edition) - the first edition made my original list, but it's absolutely due for an update, so pick up the new one when it hits the streets.
  • Higher Order Perl - Mark-Jason Dominus is a fantastic speaker, top-notch programmer, and generally a very interesting guy. I've been reading the mailing list associated with this book for a while now, and he's put a ton of work into it. Even if you're not a Perl programmer I suggest reading it; the techniques he discusses here are applicable to many languages, and they'll make you a better programmer.
  • Practical Common Lisp - I've always thought Lisp was interesting, but never had a chance to learn it, at least not until I found the online version of this book. I'm absolutely going to pick up the dead-tree version as soon as possible, even though I've already read most of it. It's just that good.

Anyway, that's the current "Oh my $DEITY I must read this when it comes out" list. Enjoy.