I did some research on storage-system performance for my QCon San Francisco preso and have numbers for your delectation (pictures too). Now is a good time to do this research because the hardware spectrum is shifting, and there are a bunch of things you might mean by the word “disk”. I managed to test some of the very hottest and latest stuff.

Methodology · Even after all these years, I still like to measure with Bonnie. Yeah, it’s old (18 years!) and is a fairly blunt instrument, but it has the virtue that you don’t have to think very much before running it, and I’m still proud of how clear and compact the output is, and I still believe that the very few things it measures are really useful things to measure.
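
If you want to run it yourself, the recipe is short; the only thing you really have to get right is the test-file size, which should be at least twice your physical RAM so the OS can’t serve the reads back out of its buffer cache. A minimal sketch, assuming an 8G machine and a made-up scratch directory:

# Build Bonnie, then point it at a scratch filesystem. -s is the
# test-file size in megabytes; twice physical RAM defeats caching.
make
./Bonnie -d /scratch -s 16000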

I’m not alone, either. Last week at ApacheCon, during his httpd-tuning talk, Colm MacCárthaigh talked about using it (the Bonnie++ flavor) to get a grip on filesystem performance. He said, looking kind of embarrassed, something along the lines of “Yeah, it’s old and it’s simplistic but it’s easy to use and has decent output.” [Smile]

Also, Steve Jenson has been using it to look at MacBook Pro filesystem performance; see More RPMs means faster access times. No news there. (Hey Steve, it’s OK to cut out all Bonnie’s friendly in-progress chat about how it’s readin’ and writin’, and just include the ’rithmetic.)

And hey, just to brighten up this dry technical post, here’s a picture of Bonnie Raitt, after whom the program is named. She’s older than me; doesn’t she look great?

[Photo: Bonnie Raitt]

What Does “Disk” Mean? · I think it can mean three distinct things, these days: a spinning disk (or a few of them) wired straight into your computer; a flash-memory SSD; or a storage appliance off at the other end of a network link. The systems below sample all three.

Systems Under Test · There are four different tests here, representing (I think) a pretty fair sampling of the storage options system builders have to choose from. The titles in the next few sub-sections correspond to the row labels in the summary table below.

MacPro · This is my own Mac Pro at home that I use for photo and video editing. It’s a meat-grinder; dual quad-core 2.8GHz Xeons, 6GB of RAM. There’s one 250G disk; whatever Apple thinks is appropriate, which bloody well better be pretty damn high-end considering what I paid for this puppy.

T2K · This is the Sun T2000 that hosts the Wide Finder 2 project; eight 1.2GHz cores, 32G of RAM, two 170G disks; whatever Sun thinks is appropriate. There’s a single ZFS filesystem splashed across them, taking all the defaults.
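
(For the record, “taking all the defaults” amounts to a one-liner. With no RAID keyword, ZFS stripes dynamically across whatever disks you hand it and mounts a filesystem named after the pool; the pool and device names below are invented for illustration.)

# Striped pool across two disks, everything defaulted; ZFS creates
# and mounts the filesystem at /wf2 in the same step.
zpool create wf2 c1t0d0 c1t1d0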

7410 · This is a Sun Storage 7410 appliance, the top of the line that we just announced. It has an 11TB filesystem, backed by some combination of RAM and SSDs and spinning rust. They gave me a smaller box with 8G of RAM to run the actual test on, connected to the 7410 via 10G Ethernet.

IntelSSD · This is one of the latest-and-greatest; in fact the very one that Paul Stamatiou recently wrote up in Review: Intel X25-M 80GB SSD. It’s attached to a recent 4G MacBook Pro, which Paul also reviewed. What happened was, I filled out Paul’s contact form and wondered politely if he’d be open to doing a Bonnie run. He wrote back with the output; what a guy.

The Table · There are notes below commenting on each of the four lines of numbers but, if you’re the kind of person who cares about this kind of thing, take a minute to review them and think about what you’re seeing.

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
MacPro     12  64.7 82.0  67.4 10.0  29.8  5.0  64.8 76.7  67.9  6.5   190  0.7
T2K       100  20.5  100 150.1  100  61.4 64.8  19.8 98.9 148.9 76.7   214 10.7
7410       16 121.5 97.7 222.2 51.0  75.3 27.2 100.0 95.6 254.2 47.3   975 76.6
IntelSSD    8  44.8 66.4  69.3 12.8  51.5 10.7  73.4 94.3 246.0 27.0  7856 43.2
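
A note on the columns, for those who haven’t met Bonnie: the per-char tests push data through stdio a byte at a time, so they mostly measure CPU overhead, while the block tests move data in big chunks and measure what the I/O path can actually sustain. You can get a crude feel for the difference with dd; this sketch (scratch path invented) overstates the effect by doing a system call per byte, where Bonnie’s per-char loop is at least buffered:

# A megabyte one byte at a time: per-call overhead swamps the disk.
time dd if=/dev/zero of=/scratch/t bs=1 count=1000000

# Roughly the same megabyte in 16K chunks: the disk sets the pace.
time dd if=/dev/zero of=/scratch/t bs=16k count=64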

Mac Pro Results · Given that Apple’s HFS+ filesystem is held in fairly low regard by serious I/O weenies, these numbers are not too bad. Salient point: that under-30M/sec number for in-place update of a big file (the Rewrite column) is pretty poor.

T2000 Results · This thing has a much slower and wider CPU than the Mac Pro, and a massively more ambitious I/O subsystem; it’s designed for life as a Web server.

7410 Results · Remember, in this one, there’s a (fast) network in between the computer and the disk subsystem. Still: way north of 200M/second both in and out.

Intel SSD · Um, one of these things is not like the other, and this would be the one.

Take-Aways · If your application is I/O-bound, and lots are, you’re going to have to go parallel, and be smart about doing block I/O.
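
To make “go parallel” concrete, here’s a deliberately crude shell sketch: several readers, each streaming a different slice of one big file in large blocks. (The file name and sizes are invented; a real application would use threads or asynchronous I/O rather than a herd of processes.)

# Four concurrent readers, each pulling a 2G slice in 1M blocks.
# Note that dd counts 'skip' in units of bs.
for i in 0 1 2 3; do
  dd if=/scratch/bigfile of=/dev/null bs=1024k skip=$((i * 2048)) count=2048 &
done
wait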

What a great time to be in this business.



Contributions


From: alphageek (Nov 21 2008, at 03:12)

Nice recap of a variety of systems, although I'd point out one little thing - the T2000 is pretty clearly CPU-bound on some operations (which makes sense, since ZFS management is handled by the main CPU along with everything else), and its other numbers are just over twice those of the MacPro, which seems clearly the result of having two spindles and a more efficient file system than HFS (which showed up much better than I'd imagined). I'd clarify that it's the two-disk configuration, rather than the box itself, that seems to make the difference.

I'd be curious to see the results on the Mac Pro with two disks using the experimental ZFS project from Apple.

The 7410 obviously rocks out, and it would be really interesting to see the results with multiple concurrent clients. Given the architecture, I suspect that you'll be able to produce almost the same profile on two to four clients concurrently - love to see a scale-up chart on that one.

The SSD justifies ZFS as the filesystem to use in the future, with its ability to easily integrate SSD cache and log devices with this kind of performance fronting for cheap high-capacity devices. Moore's law is working great on spinning-rust capacity - it's just in IOPS that it falls down.


From: robert (Nov 21 2008, at 10:07)

>> If your application is I/O-bound, and lots are, you’re going to have to go parallel, and be smart about doing block I/O.

This sounds like what an RDBMS engine does. Can you say 4NF? Too bad Sun bought a flat-file database.


From: Sam Pullara (Nov 21 2008, at 10:09)

Can you publish the command line that you used? When I try and run this on my system I get ridiculously high results which imply that some pretty severe caching is being done.

Sam


From: Tim (Nov 21 2008, at 10:15)

Sam: The trick is the -s option. You need at least twice as much data as you have RAM. So on a 4G machine, you'd say:

./Bonnie -s 8000


From: Sam Pullara (Nov 21 2008, at 12:50)

Interesting test. I've been looking forward to replacing my bootdisk on my Mac Pro with an SSD and it looks like they are getting close. I went ahead and ran this benchmark on two of my disk subsystems:

              -------Sequential Output-------- ---Sequential Input-- --Random--
              -Per Char- --Block--- -Rewrite-- -Per Char- --Block--- --Seeks---
Machine    GB M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU M/sec %CPU  /sec %CPU
MacPro     16  56.2 88.5 305.1 53.9 105.2 23.4  70.7 97.6 474.0 60.2   758  4.0
RAIDZ      16  59.0 71.2 137.0 31.9 126.7 36.2  89.4 96.9 587.0 81.8   162  3.1

The first is the 8 x 1TB disk array described here:

http://flickr.com/photos/spullara/2923859802/

The second result is using ZFS on my Mac Pro from here: http://zfs.macosforge.org/trac/wiki configured across three 500GB disks using RAIDZ. I was quite impressed with the performance of ZFS!
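
For anyone who wants to reproduce the RAIDZ half of this, the pool is a one-liner; the disk names here are placeholders for whatever your three devices are called:

# Single-parity RAIDZ across three disks; the filesystem mounts at /tank.
zpool create tank raidz disk1 disk2 disk3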


From: Ric Davis (Dec 03 2008, at 05:56)

"That under-30G/sec number for in-place update of a big file is pretty poor." for the MacPro.

Shouldn't that be 30M/sec, or am I misunderstanding?

Looking at the GB column, it looks as if for the 7410 test, you've gone with a file the same size as the test system RAM. Isn't that going to tend to flatter it?


From: Tim (Dec 03 2008, at 07:55)

Ric - Oops, you caught 2 typos. Should be 30M and 16G. The Network Is The Editor. Shocking that nobody caught this yet.


From: Tom Matthews (Dec 04 2008, at 02:25)

The 7410 results overview states:

"Way north of 200G/second both in and out"

Surely 200M/second?


From: uli.w (Dec 05 2008, at 03:46)

Does Sun force its employees to include PR and marketing in their own private blogs? That's really sad.


From: Edward Vielmetti (Dec 13 2008, at 21:01)

Tim -

There's a class of disk systems that you didn't look at which fits into a useful category - the ones where the drive controllers deliberately spin down the drives to keep power consumption down, at the expense of latency. This is the "massive array of idle disks" (MAID) approach, where the cost of a "miss" might be 10 seconds or more.

You have to compare it to tape for near-line storage. Power consumption should be a fraction of that of equivalent always-on drives, since maybe 75% of the disks are powered down at any moment.
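
You can get a small taste of the approach on an ordinary Linux box by letting a drive spin itself down when idle; with hdparm, -S values from 241 to 251 encode 30-minute units (the device name is a placeholder):

# Spin the drive down after an hour of idleness (242 = 2 x 30 minutes).
hdparm -S 242 /dev/sdb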

