We announced a bunch of new boxes this morning (of course, the damn Register has had the poop for weeks; I find our leakiness irritating). There’s a Real Big Opteron server (personally, I’m more of a scale-out than scale-up kinda guy, but big iron is a big part of our business). There’s a blade box; I know nothing about blades, never been near one. Then there’s the Thumper, oops, X4500; it’s interesting. I even have a grainy amateurish photo of the inside of a pre-production model.
At a recent internal meeting, we got an up-close-and-intimate walk-through from Andy Bechtolsheim on How It Works; if we could figure out how to clone him and give the whole world that kind of pitch, we’d sell these things by the freighter-load. Suffice it to say that the maintainability and I/O bandwidth of these boxes are remarkable.
Now, the Thumper: it’s a 4U box with two fast dual-core Opterons and some silly, idiotic, enormous number (48 of them) of 250G or 500G disks. And really big, fat pipes throughout. Here are a few of the disks.
Ask the Bloggers · When I was getting ready to write this, I had a detail question about the Thumper, and since all the people I know in the Systems org are out on the marketing road-show today, I asked the internal bloggers’ list. That turned out to be a smart move; I learned some interesting things:
If one of the disks fails, the little LED beside it lights up. The software handles it (see below) and things go on running; the intent is that you service it about once a year, swapping out the failed drives, which are easy to find. Bringing down maintenance costs is a big deal with a lot of our customers.
The LEDs are actually three-state: activity (green), OK-to-remove (blue), and fault (amber).
It ain’t light. I suppose we’ll do a try-and-buy, and if you try one, please don’t try to hoist it into place yourself. I gather that internally, we use a Genie Lift.
SATA What?!?! · Now, here’s the really interesting part. These are all SATA disks; i.e., pretty fast, really cheap, typically regarded as not suitable for use in big back-room servers. The thing that makes it all work is ZFS, which goes fast not by using the fastest disks, but by using lots of paths to the data, and which assumes that disks are going to fail sometimes and is built to just deal with it.
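I’m not going to try to explain RAID-Z or ZFS’s checksumming and copy-on-write here (the real thing is a lot cleverer than this), but the basic redundancy trick is old and simple: stripe your data across a bunch of cheap disks, add a parity block, and you can lose any one of them without losing data. Here’s a toy Python sketch of just that idea; it has nothing to do with the actual ZFS code, the names are all made up, and it’s only there to show why a dead SATA drive is a shrug rather than a crisis.

# Toy single-parity sketch: N data blocks plus one XOR parity block.
# Lose any one "disk" in the stripe and you can rebuild it from the rest.
# (ZFS's RAID-Z is far more sophisticated; this is just the core arithmetic.)

def make_stripe(data_blocks):
    """Return the data blocks plus an XOR parity block (all equal length)."""
    parity = bytes(len(data_blocks[0]))
    for block in data_blocks:
        parity = bytes(a ^ b for a, b in zip(parity, block))
    return data_blocks + [parity]

def reconstruct(stripe, failed_index):
    """Rebuild the block at failed_index by XOR-ing all the survivors,
    because x ^ x == 0: everything except the missing block cancels out."""
    survivors = [b for i, b in enumerate(stripe) if i != failed_index]
    rebuilt = bytes(len(survivors[0]))
    for block in survivors:
        rebuilt = bytes(a ^ b for a, b in zip(rebuilt, block))
    return rebuilt

if __name__ == "__main__":
    stripe = make_stripe([b"spam", b"eggs", b"tofu"])  # 3 data blocks + 1 parity
    assert reconstruct(stripe, 1) == b"eggs"           # "disk" 1 died; data survives
    print("rebuilt block:", reconstruct(stripe, 1))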
Remember, RAID used to stand for Redundant Array of Inexpensive Disks? That’s the idea here. This sucker was originally designed as a streaming-video server. But everywhere I look I see the data getting bigger and bigger and the transaction rate getting higher and higher. So I bet there are lots of places where this will come in handy.