Recently, someone from a Google competitor told me that they were catching up, within a few percentage points. I didn’t believe that at all, but I decided that intuition is boring and hard data is interesting. So I went and ran the search-engine referral numbers for ongoing, week by week, through 2005. The numbers are surprising, to say the least. [Update: Thought-provoking feedback, and some conclusions] [And more feedback from Search Engine Watch.]
The surprise, of course, is that Google image search sends me more traffic than ordinary Google search. For the whole year, the average weekly referral numbers are: Ask Jeeves 14, Google 3967, Google images 5468, MSN 139, and Yahoo 644.
I may decide to make this a weekly feature if there’s sufficient interest out there.
Feedback: Watch that Extrapolation · In private email, Tim Converse made a subtle but telling point. I observe that I’m getting 38% or so more traffic from Google Image search than regular Web search; would it be reasonable to conclude that Image search is 38% more popular across the length & breadth of the Web? I doubt it; the numbers probably have something to do with the fact that I publish way more pictures than your average blog, and put quite a bit of work into them, and make sure they’ve all got nicely-searchable metadata.
Well, if you believe that, you should also be careful about concluding that Google has about 6 times the traffic (or more than 14 times, if you count the image searches) that Yahoo gets; quite possibly it’s something about ongoing causing at least part of the gap.
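The ratios quoted above can be checked directly from the yearly averages. Here is a minimal Python sketch of that arithmetic; it is not part of the Perl script used for the measurements, and the names `weekly` and `ratio` are purely illustrative.

```python
# Average weekly referrals for 2005, as quoted in the post.
weekly = {
    "Ask Jeeves": 14,
    "Google": 3967,
    "Google Images": 5468,
    "MSN": 139,
    "Yahoo": 644,
}


def ratio(a, b):
    """Return how many times larger a is than b."""
    return a / b


# Image search relative to ordinary Web search.
image_vs_web = ratio(weekly["Google Images"], weekly["Google"])
# Google relative to Yahoo, without and with image search.
google_vs_yahoo = ratio(weekly["Google"], weekly["Yahoo"])
all_google_vs_yahoo = ratio(
    weekly["Google"] + weekly["Google Images"], weekly["Yahoo"]
)

print(f"Google Images vs. Google Web: {image_vs_web - 1:.0%} more")
print(f"Google Web vs. Yahoo: {google_vs_yahoo:.1f}x")
print(f"Google Web + Images vs. Yahoo: {all_google_vs_yahoo:.1f}x")
```

These are ratios of referrals to one site, so the same caution about extrapolating to the whole Web applies to every line of the output.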
Tim also pointed to this report at Search Engine Watch; scroll down to the second pie chart under the heading “Search Providers”, which is apparently measuring the same thing I’m trying to. It shows a much more evenly-divided playing field than my numbers do.
Mind you, that study (as, to be fair, Tim points out) has transparency issues; the description of how ComScore develops those numbers sounds kind of plausible, but you’d want to place a lot of trust in their user-selection, traffic-monitoring, and programming techniques to believe the numbers in their chart. I can see why this report’s making Yahoo happy, but I’m unconvinced, because I don’t think ongoing is that far off the mainstream. But hey, I’m just a guy with a geek/flower-pix website and a Perl script. On the other hand, if ComScore has posted an actual description of where their numbers come from, let me know and I’ll highlight it.
Danny Says... · That would be Danny Sullivan of Search Engine Watch, approximately the center of the world for those who care about this stuff. Anyhow, Danny wrote to point out two more interesting market-share pieces: one from NetRatings, and a really interesting analysis by Andrew Goodman.
Conclusions · Taking all the difficulty and doubt into account, two conclusions still jump out at you, one of which is surprising:
Google is still in the lead.
You can drive a ton of traffic to your site with some pretty pictures and a bit of care & attention to metadata.
Methodology · Here is the script that I ran over the weekly Apache access_log files. I’d love to hear if someone spots an error in the code, and will report and update if so.
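#!/usr/bin/perl
use strict;

# Counts search-engine referrals in an Apache access log read from STDIN.
# The log's file name is passed as the first argument and is used only
# to label the output row.

my $f = '\S+\s+';    # one whitespace-delimited field
my %engines;
my $lastGImage = '';

while (<STDIN>)
{
  my $who = '';
  my $referrer = '';

  # Client host, nine fields to skip (ident, user, timestamp, request
  # line, status, size in the usual combined format), then the referrer.
  if (/^(\S+)\s+$f$f$f$f$f$f$f$f$f\"(\S+)\"/)
  {
    $who = $1;
    $referrer = $2;
  }

  my $e = '';
  next if ($referrer =~ /^-?$/);

  if ($referrer =~ /search\.yahoo/) { $e = 'Yahoo'; }
  elsif ($referrer =~ m@^http://images\.google@)
  {
    # Count a run of image-search hits from one host only once.
    if ($who ne $lastGImage) { $e = 'Google Images'; }
    $lastGImage = $who;
  }
  elsif ($referrer =~ m@^http://www\.google.*search@) { $e = 'Google'; }
  elsif ($referrer =~ m@^http://search\.msn\.com@) { $e = 'MSN'; }
  elsif ($referrer =~ m@^http://web\.ask\.com@) { $e = 'Ask Jeeves'; }

  if ($e) { $engines{$e}++; }
}

# One tab-separated row: the label, then the counts in alphabetical
# order of engine name.
my $e;
$ARGV[0] =~ s/.log.*$//;
print "$ARGV[0]\t";
foreach $e (sort keys(%engines))
{
  print "$engines{$e}\t";
}
print "\n";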
Note the trick with Google Image search; the result pages are set up in such a way that in almost every case, you see two or three accesses in a row from the same image search; so I count a bunch of successive accesses from the same host just once.
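The counting trick can be shown in isolation. This is a minimal Python sketch rather than the Perl above, and it assumes the input has already been narrowed down to the client hosts of Google Images hits, in log order; the function name `count_image_searches` and the sample addresses are made up for illustration.

```python
def count_image_searches(hosts):
    """Count image-search referrals, treating a run of hits from one host as one.

    `hosts` holds the client host of each Google Images hit, in log order.
    """
    count = 0
    last_host = None
    for host in hosts:
        # Only a change of host starts a new search.
        if host != last_host:
            count += 1
        last_host = host
    return count


# Three hits from one host, one from another, then two more from the first.
hits = ["10.0.0.1", "10.0.0.1", "10.0.0.1", "10.0.0.2", "10.0.0.1", "10.0.0.1"]
print(count_image_searches(hits))
```

A host that comes back after someone else's search is counted again, which matches the script's behavior: it remembers only the most recent image-search host, so two genuinely separate searches made back to back by the same host would be merged into one.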