· · · Topfew
Topfew Release 1.0
Back in 2021-22, I wrote a series of blog posts about a program called “topfew” (tf from your shell command-line). It finds the field values (or combinations of values) which appear most often in a stream of records. I built it to explore large-scale data crunching in Go, and to investigate how performance compared to Rust. There was plentiful input, both ideas and code, from Dirkjan Ochtman and Simon Fell. Anyhow, I thought I was finished with it but then I noticed I was using the tf command more days than not, and I have pretty mainstream command-line needs. Plus I got a couple of random pings about whether it was still live. So I turned my attention back to it on April 12th and on May 2nd pushed v1.0.0 ... [1 comment]
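For readers who have not met the problem, here is a minimal Go sketch of the "most frequent field values" idea described above. It is only an illustration, not topfew's actual code: the `Counter` and `Entry` types, the whitespace field splitting, and the hard-coded field 1 and top 10 in `main` are all assumptions made for the example.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"sort"
	"strings"
)

// Entry pairs a field value with the number of records it appeared in.
type Entry struct {
	Key   string
	Count int
}

// Counter tallies occurrences of one whitespace-separated field.
// The type and its methods are hypothetical, not topfew's API.
type Counter struct {
	field  int // 1-based field number
	counts map[string]int
}

// NewCounter returns a Counter for the given 1-based field number.
func NewCounter(field int) *Counter {
	return &Counter{field: field, counts: make(map[string]int)}
}

// Add counts the chosen field of one record.
// Records with too few fields are skipped.
func (c *Counter) Add(record string) {
	fields := strings.Fields(record)
	if c.field < 1 || c.field > len(fields) {
		return
	}
	c.counts[fields[c.field-1]]++
}

// Top returns up to n entries, most frequent first; ties are broken by key.
func (c *Counter) Top(n int) []Entry {
	entries := make([]Entry, 0, len(c.counts))
	for k, v := range c.counts {
		entries = append(entries, Entry{Key: k, Count: v})
	}
	sort.Slice(entries, func(i, j int) bool {
		if entries[i].Count != entries[j].Count {
			return entries[i].Count > entries[j].Count
		}
		return entries[i].Key < entries[j].Key
	})
	if n < 0 {
		n = 0
	}
	if n < len(entries) {
		entries = entries[:n]
	}
	return entries
}

func main() {
	// Assumed defaults for the sketch: count field 1, print the top 10.
	c := NewCounter(1)
	scanner := bufio.NewScanner(os.Stdin)
	for scanner.Scan() {
		c.Add(scanner.Text())
	}
	if err := scanner.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	for _, e := range c.Top(10) {
		fmt.Printf("%d %s\n", e.Count, e.Key)
	}
}
```

Piping a log file into this program prints the ten most common values of the first field, each preceded by its count. The sketch is single-threaded and sorts every distinct key at the end, so it says nothing about the performance work the posts discuss.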
Topfew+Amdahl.next
I’m in fast-follow mode here, with more Topfew reportage. Previous chapters (reverse chrono order) here, here, and here. Fortunately I’m not going to need 3500 words this time, but you probably need to have read the most recent chapter for this to make sense. Tl;dr: It’s a whole lot faster now, mostly due to work from Simon Fell. My feeling now is that the code is up against the limits and I’d be surprised if any implementation were noticeably faster. Not saying it won’t happen, just that I’d be surprised. With a retake on the Amdahl’s-law graphics that will please concurrency geeks ...
Topfew and Amdahl
On and off this past year, I’ve been fooling around with a program called Topfew (GitHub link), blogging about it in Topfew fun and More Topfew Fun. I’ve just finished adding a few nifty features and making it much faster; I’m here today first to say what’s new, and then to think out loud about concurrent data processing, Go vs Rust, and Amdahl’s Law, of which I have a really nice graphical representation. Apologies because this is kind of long, but I suspect that most people who are interested in either are interested in both ... [6 comments]
More Topfew Fun
Back in May I wrote a little command-line utility called Topfew (GitHub). It was fun to write, and faster than the shell incantation it replaced. Dirkjan Ochtman dropped in a comment noting that he’d written Topfew along the same lines but in Rust (GitHub) and that it was 2.86 times faster; at GitHub, the README now says that with a few optimizations it’s 6.7x faster. I found this puzzling and annoying so I did some optimization too, encountering surprises along the way. You already know whether you’re the kind of person who wants to read the rest of this ... [7 comments]
Topfew Fun
This was a long weekend in Canada; since I’m unemployed and have no workaday cares, I should have plenty of time to do family stuff. And I did. But I also got interested in a small programming problem and, over the course of the weekend, built a tiny tool called topfew. It does a thing you can already do, only faster, which is what I wanted. But I remain puzzled ... [8 comments]
By Tim Bray.
The opinions expressed here are my own, and no other party necessarily agrees with them.
A full disclosure of my professional interests is on the author page.