[This is part of the Wide Finder 2 series.] We’re a few weeks in now, so I should provide an update. Those who are really interested might want to join the Wide Finder group.
I’m still getting email almost every day asking for access to the test data. As of June 21st, 42 people had downloaded the 100K-line test data, I’d given out 20 WF-related accounts on the test server, and nine people had reported results on the Wiki.
That results page is real interesting already, with lots more data to come. A few obvious results leap to the eye:
The best results are now approaching I/O speed.
Fan is a very interesting new language.
Check out wfii.groovy; I’m not going to prejudice the audience by leading with conclusions, but it’s very damn thought-provoking.
There’s a discussion of possible next steps here.
There are still lots of results to come. It’s not too late to get involved.
From: Damian Cugley (Jun 25 2008, at 05:30)
Rather chastening to me that the fastest result at the moment is in ML, a very high-level language I mainly remember for its incredible slowness and memory-hogging tendencies back when I was at university.
I am now trying the phrase "recoded main loop in ML for speed" on for size... :-)
From: Yuri Schimke (Jun 27 2008, at 23:59)
One thing to point out is that wfii.groovy is only a query-side implementation. The parsing and parallelism are done by the Kolja framework. So there is a ton of code involved in getting the report running.
Ideally you would configure your log format like
http://fisheye.codehaus.org/browse/kolja/trunk/kolja/kolja-widefinder/src/main/config/wf1.xml?r=192
but the test results used a custom line parser matching strings, instead of Regular Expressions.
I don't want to optimise too much for specific log formats or large datasets, as Kolja is still mainly intended as an interactive log viewer "less on steroids".