Thanks a Million

The ongoing logfiles flip over early Sunday mornings, and sometimes I run some basic stats over them. This last Sunday they said that a total of 995,213 pages have been read, so there is a chance that if you’re reading this on the 29th or 30th of September, you will get the millionth page. Thanks to all; herewith a couple more statistics and some discussion of them.

But before the stats, I want to reiterate that thank-you to everyone who takes the time to read this. I really mean it; ongoing has filled a hole in my life that I previously didn’t know existed. I can’t imagine not doing this.

How Do You Count? · Anyone who publishes anything wants to know who’s reading it. On the Web, it’s hard to figure out the right questions, and then it’s hard to figure out the answers. So when I say “a million pages read,” what do I really mean? Well, for the unix-literate, here’s an exact characterization of what I mean:


zcat *.log.gz | \
 egrep '"GET /ongoing/.* 200 ' | \
 awk ' {print $7}' | \
 egrep -v '\.' | \
 wc -l

For the rest, an approximate English description would be “everything that was fetched successfully whose URI began with /ongoing/ and which didn’t contain a dot.” Leaving out anything with a dot excludes all graphics as well as the RSS feed and the CSS stylesheet. So it really is a pretty decent approximation of the number of times someone looked at a page.
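To make the filter concrete, here is a small sketch that wraps the same pipeline in a shell function and runs it over a few made-up log lines. The sample lines, the IP addresses, and the file names ongoing.css and ongoing.rss are illustrative assumptions, not excerpts from the real logs; the only thing taken from the pipeline above is the assumption that the request path is the seventh whitespace-separated field.

```shell
# Same filter as the pipeline above, reading log lines on stdin.
# grep -E is the modern spelling of egrep.
count_page_reads() {
  grep -E '"GET /ongoing/.* 200 ' |   # successful fetches under /ongoing/
    awk '{print $7}' |                # keep only the request path
    grep -v '\.' |                    # drop anything containing a dot
    wc -l
}

# Hypothetical sample data: two page views, one stylesheet,
# one feed fetch, and one failed request.
sample_log() {
  cat <<'EOF'
10.0.0.1 - - [28/Sep/2003:00:01:02 -0700] "GET /ongoing/When/200x/2003/09/27/Example HTTP/1.1" 200 5120
10.0.0.1 - - [28/Sep/2003:00:01:03 -0700] "GET /ongoing/ongoing.css HTTP/1.1" 200 800
10.0.0.2 - - [28/Sep/2003:00:02:00 -0700] "GET /ongoing/ongoing.rss HTTP/1.1" 200 20480
10.0.0.3 - - [28/Sep/2003:00:03:00 -0700] "GET /ongoing/ HTTP/1.1" 200 30000
10.0.0.3 - - [28/Sep/2003:00:03:05 -0700] "GET /ongoing/Missing HTTP/1.1" 404 300
EOF
}

sample_log | count_page_reads   # prints 2
```

Only the first and fourth lines survive: the stylesheet and the feed are dropped by the dot test, and the 404 never gets past the first grep.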

It’s not perfect: it overestimates because some proportion of that million or so fetches were by Google, Inktomi, and many less-skilful robots and crawlers and so on. On the other hand, it underestimates because it excludes all the fetches of the full-size versions of the images, and all fetches of the source-code snippets and so on that I’ve posted. Also, it leaves out all the single-paragraph postings that are contained entirely in the RSS feed and are read that way. I’m willing to bet that the two errors kind of cancel each other out, and say that about a million stories have been read.

In that same time-span, my RSS feed has been fetched 1,856,905 times.

How Many Different People? · Resources at ongoing have been accessed from 228,855 different IP addresses. The RSS feed has been fetched from 49,703 addresses, 21,836 of them since August first.
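For the unix-literate again, a count like that can be had along these lines. This is a sketch rather than the exact command behind the numbers above: it assumes the client address is the first field of each log line and that “resources at ongoing” means any request whose URI begins with /ongoing/, whatever the status. The sample lines are made up.

```shell
# Count distinct client addresses among requests under /ongoing/.
# Assumption: the address is the first whitespace-separated field.
count_unique_ips() {
  grep -E '"GET /ongoing/' |
    awk '{print $1}' |   # keep only the client address
    sort -u |            # collapse repeats
    wc -l
}

# Against real logs this would be: zcat *.log.gz | count_unique_ips
# Hypothetical sample: two addresses under /ongoing/, one elsewhere.
printf '%s\n' \
  '10.0.0.1 - - [28/Sep/2003:00:01:02 -0700] "GET /ongoing/ HTTP/1.1" 200 30000' \
  '10.0.0.1 - - [28/Sep/2003:00:01:09 -0700] "GET /ongoing/ongoing.rss HTTP/1.1" 200 20480' \
  '10.0.0.2 - - [28/Sep/2003:00:02:00 -0700] "GET /ongoing/ HTTP/1.1" 200 30000' \
  '10.0.0.9 - - [28/Sep/2003:00:04:00 -0700] "GET /elsewhere/ HTTP/1.1" 200 100' |
  count_unique_ips   # prints 2
```

Narrowing the grep pattern to the feed’s URI would give the feed-only figure.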

Everyone knows that IPs are a lousy way to count people. Counting them estimates high because people move around: I have one address at home, another at work, and have shown up from any number of hotel rooms and conferences. On the other hand, it estimates low because everyone at AOL has one IP address, as does everyone at Microsoft. My gut tells me that the number of unique IP addresses overcounts the number of unique people, maybe by a factor of two? But we shouldn’t have to rely on my gut, since there are people out there who count subscribers properly with cookies and so on, and would have a good feel for what the real ratio is. Anyhow, I’d be surprised if I had fewer than five thousand subscribers or more than fifteen thousand.

The Hit Parade · Q: What do people like reading? A: You’re a bunch of hopeless geeks, but that’s OK, so am I. I live in hope that one of my notes about nature or politics or music gets noticed outside the coterie of markup-slingin’ webheads who apparently are my natural audience.

Fetches	Essay
153116	ongoing
83816	XML Is Too Hard For Programmers
44539	Why XML Doesn’t Suck
30650	The Web’s the Place
17152	The Door Is Ajar
14601	I Like Pie
10133	Truth
9232	Language Fermentation
8649	What This Is
7715	Author
7402	Technology
7106	What · Technology · XML
6941	iYear
6286	iTunes Music Store and the WWW
5833	Business
5762	On the Goodness of Unicode
5739	Colophon
5474	What
5454	The RDF.net Challenge
5049	When
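A table like this one can be roughed out by tallying request paths. The sketch below is an assumption about how one might do it, not necessarily how the numbers above were produced: it reuses the page-read filter, counts each path, and lists the most-fetched first. It yields URIs rather than essay titles, so the titles would still have to be matched up by hand. The sample lines and paths are made up.

```shell
# List the N most-fetched page paths (default 20), busiest first.
top_pages() {
  grep -E '"GET /ongoing/.* 200 ' |
    awk '{print $7}' |        # request path
    grep -v '\.' |            # pages only, as in the count above
    sort | uniq -c |          # tally identical paths
    sort -rn |                # largest count first
    head -n "${1:-20}"
}

# Against real logs this would be: zcat *.log.gz | top_pages 20
# Hypothetical sample: /ongoing/A fetched three times, /ongoing/B once.
printf '%s\n' \
  '10.0.0.1 - - [28/Sep/2003:00:01:02 -0700] "GET /ongoing/A HTTP/1.1" 200 100' \
  '10.0.0.2 - - [28/Sep/2003:00:01:03 -0700] "GET /ongoing/B HTTP/1.1" 200 100' \
  '10.0.0.3 - - [28/Sep/2003:00:01:04 -0700] "GET /ongoing/A HTTP/1.1" 200 100' \
  '10.0.0.4 - - [28/Sep/2003:00:01:05 -0700] "GET /ongoing/A HTTP/1.1" 200 100' |
  top_pages 2
```

Each output line is a count followed by a path, with /ongoing/A listed ahead of /ongoing/B for the sample above.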

Pix · More geekery; the only full-size pictures that people look at are screen grabs and pictures of Macintoshes. The top three non-tech pictures that people actually looked at were the panoramic second shot in the write-up on my Canon S50 (330 views), the close-up of Byron’s Troy at the end of the Slim Book of Verse photo-essay (307 views), and of course the Bit Bucket (298 views). The lesson for me is obvious: the way I present the pictures on the page is the way they’re gonna get seen, so maybe the current approach of crushing them all down to 300 pixels wide is sub-optimal.

September 29, 2003
· The World (155 fragments)
· · Journalism (37 more)

By Tim Bray.
