Over the last 24 hours I learned a lot about how the Web of A.D. 2003 works, and it's not like it used to be.
At about 3PM on Feb. 27th, I sent out emails to a few well-connected
acquaintances announcing ongoing, and also put a pointer to it in my standard
email signature.
Then I started hovering over the access_log like an expectant mother.
You and Your access_log ·
I've only ever run web sites on Apache or one of its ancestors, and this
lineage of web servers has always written its statistics into a file named
access_log.
I think anyone who's running a Web site, or who cares about the Web, ought
to, on a regular basis, spend some time watching the access_log in real time.
On unix-like systems, the command is:
tail -f access_log
Too often we get this image of the Web as a vast well-oiled machine, with
glossy browser screens in front and masses of gleaming software in back.
Watching the access_log is like a window into the side lobby
of the legislature, or a tour of the fermentation vats at the brewery.
People who are Web-savvy and have spent years looking at access logs can skip a couple of paragraphs, and I'll get into some interesting statistics.
Here's a single line picked out of the access_log:
pool-151-203-239-239.bos.east.verizon.net - - [28/Feb/2003:10:42:13 -0800] "GET /ongoing/ HTTP/1.1" 200 12693 "http://www.scripting.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
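By looking at this, proceeding from left to right, we can tell, among other things, that the visitor asked for /ongoing/ and that the request succeeded, which is what the 200 means.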
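The left-to-right reading of a log line can also be done mechanically. What follows is a minimal sketch, assuming the line is in Apache's "combined" log format, which the sample above appears to match. The names LOG_PATTERN and parse_line are made up for illustration.

```python
import re
from datetime import datetime

# Assumed layout (Apache "combined" format): host, identity, user, [time],
# "request", status, bytes sent, "referrer", "user agent".
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    # A "-" in the size column means no response body was sent.
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    # A well-formed request looks like "GET /ongoing/ HTTP/1.1".
    parts = fields["request"].split()
    if len(parts) == 3:
        fields["method"], fields["path"], fields["protocol"] = parts
    return fields


SAMPLE = (
    'pool-151-203-239-239.bos.east.verizon.net - - '
    '[28/Feb/2003:10:42:13 -0800] "GET /ongoing/ HTTP/1.1" 200 12693 '
    '"http://www.scripting.com/" '
    '"Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"'
)

if __name__ == "__main__":
    for name, value in parse_line(SAMPLE).items():
        print(f"{name}: {value}")
```

Run against the sample line, this separates out the visitor's host name, the time of the request, the page asked for, the status code and byte count, the referring page, and the browser's description of itself.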
The Word Spreads · As I said, I sent the emails out at around 3PM. The first external visitor was Jon Udell of InfoWorld, who showed up at 3:24 PM.
By 4:19 PM, others were starting to trickle in, although the first to show up with an obvious link from Jon's blog wasn't till just past 7 PM.
Almost at the same time, the first of the RSS feed readers showed up.
At 8 PM, there was a visitor from Google, but it was a real person on a Linux box, not a robot.
By 9 PM, there were visitors from Denmark, Singapore, and Australia.
Just past 11 PM, the Inktomi robot showed up. Google's robot put in its first appearance at 6:34 Friday morning.
By about 10 AM, three or four bloggers had put in pointers, and the traffic was flowing in.
At about 1:30 PM, ongoing had had a thousand or so unique visitors.
Humans and Others ·
There are three kinds of visitors to a web site: humans, robots, and RSS
readers.
Of the thousand or so unique visitors, 14 were robots of one kind or another:
Google, Inktomi, Verity, research.att.com, and some others I
didn't recognize.
214 of them were RSS readers of one kind or another, with the usual suspects well-represented: NetNewsWire, Radio Userland, Syndirella, Amphetadesk, and a whole lot of home-cooked readers.
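A breakdown like this can be approximated by sorting visitors on their user-agent strings. This is only a rough sketch: the substring lists below are illustrative assumptions, not a catalogue of what these programs actually send, and the function names are hypothetical. It is not how the numbers above were produced.

```python
from collections import Counter

# Assumed, incomplete hints; real user-agent strings vary a great deal.
RSS_HINTS = ("netnewswire", "radio userland", "syndirella", "amphetadesk")
ROBOT_HINTS = ("googlebot", "slurp", "crawler", "spider", "bot")


def classify(agent):
    """Guess whether a user-agent string is an RSS reader, a robot, or a human."""
    lowered = agent.lower()
    if any(hint in lowered for hint in RSS_HINTS):
        return "rss"
    if any(hint in lowered for hint in ROBOT_HINTS):
        return "robot"
    return "human"


def count_unique_visitors(records):
    """Count each host once per kind; records is an iterable of (host, agent)."""
    seen = set()
    totals = Counter()
    for host, agent in records:
        kind = classify(agent)
        if (host, kind) not in seen:
            seen.add((host, kind))
            totals[kind] += 1
    return totals
```

Counting by host name undercounts people behind a shared proxy and overcounts people on dial-up pools, so the totals are best read as estimates.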
Anomalies · I'm not going to invest the time in running the numbers properly, but a couple of glaring anomalies emerge:
I guess this says something about the kinds of people who watch RSS feeds and check out the new weblogs.
The Vast Sucking Sound ·
When you're watching the access_log, what's really remarkable
is the steady pounding from the RSS aggregators.
ongoing has been up a day and a bit, and as I watch this, I'm seeing maybe four
or five hits a minute on the RSS feed.
When you consider the number of sites out there with RSS feeds, and the number of people who subscribe to a bunch of them, we're talking some pretty serious traffic here. Architecturally, this seems pretty dumb, and you have to worry whether or not it's going to scale. On the other hand, the architecture of the whole Web is pretty simple, more or less built to be as simple as possible without breaking. I wrote a note on this back in January, but now I have direct personal experience, and yes, Houston, we (potentially) have a problem.
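To put rough numbers on the pounding, here is a back-of-the-envelope sketch. It takes the four or five hits a minute observed above, and it assumes, purely for illustration, that the 214 RSS readers counted earlier are the ones doing the polling and that they poll evenly. The function names are hypothetical.

```python
def hits_per_day(hits_per_minute):
    """Scale a steady per-minute rate up to a full day."""
    return hits_per_minute * 60 * 24


def implied_poll_interval_minutes(readers, hits_per_minute):
    """If `readers` clients poll evenly, how many minutes pass between polls?"""
    return readers / hits_per_minute


if __name__ == "__main__":
    for rate in (4, 5):
        print(rate, "hits/minute is", hits_per_day(rate), "hits/day")
    # prints 5760 hits/day for 4 and 7200 hits/day for 5
    print(implied_poll_interval_minutes(214, 4))  # 53.5
```

On those assumptions each reader is fetching the feed a bit more often than once an hour, whether or not anything has changed, and that is for one small site a day after launch.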