Like almost everyone, I have a long list of things that I regret not having
done, and mine includes writing a Unix filesystem.
So instead, I measure ’em, with the help of my old friend Bonnie.
I just spent some time addressing the question: “How much does FileVault slow down a Macintosh?”
And turned up a couple other interesting results, too, including a fairly
startling three-way OS X/Linux/Solaris comparison.
[Update: Many readers write on the subject of Linux and hdparm(8).]
The Linux/hdparm updates are below, under “Fixing Linux”.
FileVault · How much does it hurt? Not at all reading data, a factor of four or more writing it (which is why you can’t combine it with FireWire video capture).
OS X vs. Debian vs. Solaris · After running Bonnie on my laptop, I went and ran it on the Athlon 2200/Debian box that served the fine ongoing content you are now reading. A couple weeks earlier, I’d run it on one of my own zed-boxes. The Mac and Linux tests were run using a one-gig file because they’re 32-bit machines; I tested the V20z’s with an 8G file. I suppressed the “Random Seeks” numbers because both the Mac and the Debian box had enough RAM to buffer most of the file and produced loopy results.
The Mac had my typical Photoshop and NetBeans and Safari and Emacs and NetNewsWire and Mail running, but all that idles along at well under 10% of the CPU. The Linux box was carrying a normal 10PM webserver load, mostly ongoing and Matt’s political party, but was like 98% idle when Bonnie wasn’t running. The Solaris box was entirely unused. Here are the numbers:
              -------Sequential Output-------- ---Sequential Input--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block---
Machine    MB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU
NoFV     1000  10.2 77.4  15.7  8.3   6.5  4.0   7.7 52.8  12.4  3.6
FV       1000   2.5 18.6   2.6  1.4   5.4  3.7   7.1 39.4  12.0  3.1
Debian   1000   5.0 73.0   7.3 36.7   3.1 41.5   4.2 87.6   5.9 33.0
V20z     8000  51.5 66.1  52.3 26.0  21.7 14.3  59.2 70.9  63.4 16.9
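If you’d rather not eyeball the ratios, here is a minimal Python sketch that divides one row of the table by another. The M/sec figures are copied straight from the table; the dictionary layout and the phase names (`out_char`, `out_block` and so on) are just labels I made up for illustration, not anything Bonnie emits.

```python
# Throughput in M/sec, copied from the Bonnie table above.
# The phase names are illustrative labels, not Bonnie's own.
PHASES = ("out_char", "out_block", "rewrite", "in_char", "in_block")

RESULTS = {
    "NoFV":   dict(zip(PHASES, (10.2, 15.7, 6.5, 7.7, 12.4))),
    "FV":     dict(zip(PHASES, (2.5, 2.6, 5.4, 7.1, 12.0))),
    "Debian": dict(zip(PHASES, (5.0, 7.3, 3.1, 4.2, 5.9))),
    "V20z":   dict(zip(PHASES, (51.5, 52.3, 21.7, 59.2, 63.4))),
}


def ratio(fast, slow, phase):
    """How many times faster `fast` was than `slow` in the given phase."""
    return RESULTS[fast][phase] / RESULTS[slow][phase]


if __name__ == "__main__":
    for phase in PHASES:
        print(
            f"{phase:10s} "
            f"NoFV/FV = {ratio('NoFV', 'FV', phase):.1f}x  "
            f"NoFV/Debian = {ratio('NoFV', 'Debian', phase):.1f}x"
        )
```

On these figures FileVault costs roughly 4x on per-char writes and 6x on block writes, while both read phases stay within about 10% of the unencrypted numbers. The same arithmetic puts the unencrypted Mac at around twice the Debian box in every phase.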
It’s frankly weird that the Mac seemed to do I/O about as fast as the Athlon box, and was even faster at some things. Either that OS X filesystem is some hot stuff, or Debian’s been misconfigured, or it has really doggy disk drives. Anybody got some Bonnie numbers from Apple XServes? (If you want to run ’em, ask me for the latest source first.)
As for the V20z’s, well those are some seriously fast I/O mofos. Holy crap, 60+ Meg/sec using ordinary filesystem 8K reads! Mind you, a Dell roughly equivalent to the Athlon box lists at about $1500 (and I bet Matt’s colo paid less), while the V20z’s list at about $3000 (but then, I bet you can get them for less too). If I were serious about this, I’d do some Bonnie-throughput-per-dollar benchmarking.
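For what it’s worth, a back-of-the-envelope version of that throughput-per-dollar idea might look like the sketch below. It assumes the rough list prices quoted above ($1500 for the Athlon-class box, $3000 for the V20z) and uses only the block-I/O columns from the table; the metric name and the choice of “M/sec per $1000” are my own invention, and street prices would change the answer.

```python
# Rough list prices (USD) from the text and block-I/O throughput (M/sec)
# from the Bonnie table. The prices are approximations, not quotes.
MACHINES = {
    "Debian": {"list_price": 1500, "block_out": 7.3, "block_in": 5.9},
    "V20z":   {"list_price": 3000, "block_out": 52.3, "block_in": 63.4},
}


def mb_per_sec_per_kilodollar(machine, phase):
    """Throughput bought per $1000 of list price (hypothetical metric)."""
    spec = MACHINES[machine]
    return spec[phase] / spec["list_price"] * 1000


if __name__ == "__main__":
    for name in MACHINES:
        out = mb_per_sec_per_kilodollar(name, "block_out")
        inp = mb_per_sec_per_kilodollar(name, "block_in")
        print(f"{name:7s} block out {out:5.2f}  block in {inp:5.2f}  (M/sec per $1000)")
```

Even at twice the price the V20z comes out several times ahead per dollar on these numbers, though the Debian figures are the untuned ones, so treat the gap as an upper bound.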
Fixing Linux · Martin Kenny, Chris Lightfoot, Michael Hall, Russell Coker (of Bonnie++ fame), Gregory Maxwell, and Gaute Strokkenes all wrote to agree that the Debian number up there is idiotically slow.
The consensus seems to be that you should expect to be able to improve that performance by a factor of four or better via judicious application of hdparm(8), in particular to turn on DMA.
Apparently, Linux distros in general and Debian in
particular default to extremely conservative parameters, because there are
certain drive/controller combinations where hdparm
aggressiveness can cause disaster.
Having said that, the test is still somewhat fair in that the Apple, the
Opteron/Solaris box, and the Athlon/Debian box were all running the
out-of-the-box defaults.
I wonder what proportion of Linux sysadmins are savvy about hdparm?
I wonder what could be done to make the Mac and Solaris boxes faster?
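For anyone who hasn’t met hdparm, here is a hedged sketch of the kind of invocations the correspondents were talking about, wrapped in Python so the commands are spelled out explicitly. The flags are standard hdparm ones (`-d` to query the using_dma flag, `-d1` to turn it on, `-tT` to time reads), but the device name `/dev/hda` is an assumption, the helper names are mine, and nothing here was run on the Debian box in question. Given the warning about certain drive/controller combinations, look at the query output before enabling anything.

```python
import subprocess

# Assumption: the disk is the first IDE device. Adjust for your machine.
DEVICE = "/dev/hda"


def hdparm_cmd(*flags, device=DEVICE):
    """Build an hdparm argument list without running anything."""
    return ["hdparm", *flags, device]


QUERY_DMA = hdparm_cmd("-d")     # report the current using_dma setting
ENABLE_DMA = hdparm_cmd("-d1")   # turn DMA on (risky on some hardware)
TIME_READS = hdparm_cmd("-tT")   # time cached and buffered device reads


def run(cmd):
    """Run one command and return its stdout. Needs root for hdparm."""
    done = subprocess.run(cmd, capture_output=True, text=True, check=True)
    return done.stdout


if __name__ == "__main__":
    # Only print the commands; call run() yourself once you trust them.
    for cmd in (QUERY_DMA, TIME_READS, ENABLE_DMA, TIME_READS):
        print(" ".join(cmd))
```

The idea is to time reads, flip DMA on, and time them again, so any improvement shows up before you bother with a full Bonnie run.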
Weird I/O Stuff · So while I was running Bonnie, I was using top and vmstat and various other system-readout things to watch what was happening.
The first result is that I’m dubious about the %CPU column in Bonnie’s readout.
Look at it this way: in the OS X/FileVault scenario, you’ve got the Mach
microkernel with at least some filesystem stuff in userland, plus you’ve got
the FileVault encrypted-image stuff getting in there every time you call read, write, or lseek.
On the Debian (and I think Solaris) systems, you’ve got a traditional monolithic
kernel doing all the filesystem stuff. I don’t think the proportion of CPU
cycles the respective accounting subsystems charge you with is all that
interesting.
On the other hand, if you were comparing different filesystems or disks on the
same box, it might be useful.
One thing baffled me: on Bonnie’s per-char phases, both the OS X and Debian boxes showed the CPU as pegged, no idle time. I thought we were long past the days of I/O being CPU-limited; or maybe I’m misinterpreting what I’m seeing. Even weirder, during the block I/O phases, Debian was still showing the CPU pegged and 98% in System state. Huh? This could really be a symptom of something wrong, because I’d expect a modern Linux server to do I/O (block I/O, forsooth) quite a bit faster than the 4 to 7½ M/sec we’re seeing here.
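To make the per-char versus block distinction concrete, here is a small Python sketch in the spirit of those two phases. It is not Bonnie: the function names, the 4 MB default size and the scratch-file handling are all my own assumptions, and only the 8K chunk size comes from the text. Each phase reports throughput plus a %CPU figure computed as process CPU time over elapsed time, which is roughly the kind of number the %CPU column represents.

```python
import os
import tempfile
import time

CHUNK = 8192  # block size, matching the 8K I/O mentioned in the post


def timed(phase):
    """Run phase() and report M/sec and CPU time as a percent of wall time."""
    wall0, cpu0 = time.perf_counter(), time.process_time()
    nbytes = phase()
    wall = time.perf_counter() - wall0
    cpu = time.process_time() - cpu0
    return {
        "bytes": nbytes,
        "mb_per_sec": nbytes / wall / 1e6,
        "cpu_pct": 100.0 * cpu / wall,
    }


def write_per_char(path, size):
    """One byte per call through a buffered file, loosely like putc()."""
    with open(path, "wb") as f:
        for i in range(size):
            f.write(bytes((i & 0x7F,)))
        f.flush()
        os.fsync(f.fileno())
    return size


def write_blocks(path, size):
    """CHUNK-sized write() calls straight to the file descriptor."""
    buf = bytes(CHUNK)
    written = 0
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        while written < size:
            n = min(CHUNK, size - written)
            written += os.write(fd, buf[:n])
        os.fsync(fd)
    finally:
        os.close(fd)
    return written


def compare(size=4 * 1024 * 1024):
    """Run both phases against a scratch file in a temporary directory."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "scratch")
        return {
            "per_char": timed(lambda: write_per_char(path, size)),
            "block": timed(lambda: write_blocks(path, size)),
        }


if __name__ == "__main__":
    for name, stats in compare().items():
        print(f"{name:8s} {stats['mb_per_sec']:8.1f} M/sec  {stats['cpu_pct']:5.1f}% CPU")
```

Python’s interpreter overhead exaggerates the per-char cost compared with C, and a file this small will mostly land in the buffer cache, so the absolute numbers mean little. The point is only that a call per byte burns CPU in a way that a call per 8K block shouldn’t.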
The V20z’s never breathed hard; the CPU peaked at 71% (of one of the two processors), and the other 130% was completely idle.
There’s scope for a whole lot more really interesting work to be done here, but I’m not gonna do it, I’ve got other fish to fry.