“Reference Publishing” is the business of publishing “reference works”: dictionaries, encyclopedias, and the like. By definition, it includes the Wikipedia, which is intended as a reference work. I have remarked kindly on the Wikipedia and taken exception when it came under attack. There have been a lot of voices chiming in on this issue; herewith a survey and, based on my years in reference publishing and because I care profoundly, I’ll add my own observations.

Just a word on my background: I spent three years full-time working on the New Oxford English Dictionary Project at the end of the Eighties. Then, while I was at Open Text, we did projects with Random House in New York on one of the big Webster dictionaries, and with Grolier, a New England encyclopedia publisher.

After I left Open Text, I did some consulting for Encyclopædia Britannica on their back-room editorial and publishing processes. My consulting clients also included the European Union’s Office of Official Publications in Luxembourg and the tech-publishing arms of a couple of technology companies.

Why This Matters · Encyclopedias are wonderful things, if only as manifestations of the lovable human urge to take on ridiculously huge projects; to build something much bigger than oneself. The drive to write down, well, everything, briefly and in alphabetical order, combines hubris with a love of language and learning; it is wholly admirable.

When I was young we had a Britannica in its own bookcase, a dozen or more volumes, richly illustrated. As a bookworm kid, I would cheerfully haul a volume off to bed and amble my eyes through it an hour or more at a time.

I’m worried that my own kid may not have this option. Having a printed encyclopedia around has come to feel fairly exotic. The Britannica has struggled for a business model; you can still buy one (for US$1,095), but it seems aimed mainly at libraries, not families.

So it may be the case that we have to figure out how to make the Wikipedia good, because that’s all there’s going to be.

Is It Just a Load of Crap? · On the subject of the Wikipedia, one vocal school of thought holds it to be worthless; that because anyone can edit it, there is no basis for trusting it and that it will necessarily be full of errors and distortions. Put another way, the Wikipedia blows off the tradition of scholarly authority that, in the great Enlightenment, supplanted the Medieval tradition of received scriptural wisdom, except for in Wahhabi territory and the Red States.

Bob McHenry’s piece that I complained about elsewhere is one good example; he closes by comparing the Wikipedia to a public toilet because, as he says, “you can’t be sure who was there before you.” Another example can be found in a selection of letters to the Register provoked by this, whose URL suggests that the Wikipedians are on a path to become mass murderers (which is coming it a bit high even for the Register).

If you’d like to study more criticisms of the Wikipedia, one good place to look would be in its helpful entry on the subject (currently a couple of months behind, but it’ll catch up).

To start with the conclusion, these critics apparently feel that the Wikipedians should just abandon the project and walk away.

There are several problems with this line of argument. First of all, it fails to account for the fact that most of the Wikipedia is quite good. The vast majority of times I go there to look something up, I find a sensible, clear explanation, well-furnished with pointers to primary resources.

In fact, since I discovered it, I’ve used it regularly here at ongoing when I wanted a pointer to background on something I was mentioning in passing: recent examples include Godwin's Law, Podcasting, Yggdrasil, Resonance, and the Fortune 500. (By the way, quite a bit of poking about failed to turn up any other free online version of the F500).

Second, the it’s-all-crap argument fails to explain the observation that the quality of the Wikipedia’s entries, by and large, improves reliably and steadily.

Third, those who engage in this sort of frontal attack on the Wikipedia tend to be mean-spirited. Really, McHenry’s “public restroom”, and Orlowski’s “Khmer-Rouge in daipers (sic)”? Which does not mean that their points can be ignored, but this childish crudeness sits poorly with high-minded appeals to Authority and Trust.

Finally, the proposition that the Wikipedia is a misguided waste of time is boring. Something poorly-understood is happening here, and the observed results are immensely better than intuition from first principles would suggest. This is interesting; it seems obvious to me that there are lessons to learn here, about reference publishing in particular and knowledge husbandry in general.

So I for one am not willing to give up and walk away. And, I’ll keep on using the Wikipedia as long as it keeps on performing well for me.

The Problems · Clearly, there are problems with the Wikipedia. McHenry’s randomly-chosen entry turned out to be flawed and inconsistent, and Alex Halavais showed that you can introduce errors which will not immediately be corrected.

Also troubling are the entries on contentious subjects such as George W. Bush and the Israeli-Palestinian conflict, which have been made subject to repeated lockdown after outbreaks of “attack editing”.

Other Critics · There are those who acknowledge the Wikipedia’s problems but who offer, instead of cheap polemics, constructive suggestions about how to address them. My favorite in recent times is this from Mitch Ratcliffe; his proposal is to offer, in the contentious areas, more than one entry: alternative views of reality, for the cases where those views are too far apart to reconcile.

David Weinberger offers a similar argument, and strengthens it by a powerful analogy to how collisions of names and opinions are dealt with in the world of Linnaean taxonomy.

Kill, Cook, and Freeze · The Wikipedia is not perfect, but then neither is traditional reference publishing. Perhaps a brief survey of how that is conventionally done would be helpful; the title of this section, while meant to be amusing, is a fair description.

Different publications source their material differently; in dictionaries the entries are written by full-time in-house staff, while encyclopedias obtain some portion of their articles from external, usually academic, experts. The Britannica in particular has often had articles authored by famous names.

After the material is written, it goes through a hierarchical editing process, whereby it has to be signed off on by (at least) a front-line editor, a supervising and/or subject-matter editor, and eventually the Chief Editors whose names appear in the front pages of the book.

Independently, a lot of work goes into terminology and naming: Encyclopedias worry a lot about the names of places, people, and things, which are prone to severe variance in spelling and usage; a famous example is the name Muammar Qadaffi, which is known to have a dozen variations or more when appearing in English. Reference publishers handle this by the aid of laboriously-maintained “authority files” which contain the officially-blessed identifications and spellings.

Encyclopedias also put a lot of work into subject indices; the Britannica in particular was fanatical about it, to an extent I found unreasonable because it seemed way out of proportion to the usefulness of the actual index.

Along with all this editing, which is aimed at ensuring accuracy and consistency in style and tone, there are multiple proofreading passes, looking for typos and thinkos.

Finally, there is the process of page make-up and typesetting, which is astonishingly time-consuming and expensive, because the volume of text is enormous, the space is finite, and the content is thickly laced with pictures, tables, formulas, and mathematics. It is no accident that most encyclopedia pages look very fine.

So, you take a living document from the hands of the author and kill it to stop it moving; then you cook it carefully via all these editorial processes until you think it’s nicely done, then you freeze it into its finally-typeset form.

Which is to say, at the moment that an encyclopedia is printed or blasted onto disk, it’s obsolete.

The Living Cycle · The Wikipedia’s process is profoundly different in that it has no end. Once you get past that, it has strong points of similarity with conventional reference publishing; the articles are written by external contributors, then they are reviewed by one or more of the contingent of people who make it their business to do this, checking for style, basic adherence to facts, and so on. This process is repeated (not with 100% reliability) every time an entry is updated. Put another way, entries are cooked, but neither killed nor frozen.

One really interesting part of the Wikipedia process is the person/place/thing naming issue. It’s handled cheaply and reliably by means of hyperlinks. No matter how many spellings there are for Mr. Qadhaffi’s name, there’s only one entry for the man himself, and an excellent level of consistency is achieved simply by getting the pointers right.

In my view, the level of consistency approaches that provided by Britannica, at a cost that has to be orders of magnitude lower. In reference-publishing jargon, the Wikipedia serves as its own online authority file.
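For the programmers in the audience, the "own authority file" idea can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the table of variant spellings, the choice of canonical title, and the function names `normalize`, `resolve`, and `lookup` are all made up for the example, and none of it claims to show how the Wikipedia's software really works.

```python
from typing import Dict, Optional

# Illustrative data only: the variant spellings and the choice of canonical
# title below are assumptions made for this sketch.
CANONICAL_TITLE = "Muammar Gaddafi"

# Many spellings point at one canonical title, much as redirecting
# hyperlinks point at a single entry.
REDIRECTS: Dict[str, str] = {
    "muammar qadaffi": CANONICAL_TITLE,
    "muammar qadhaffi": CANONICAL_TITLE,
    "moammar kadafi": CANONICAL_TITLE,
    "muammar gaddafi": CANONICAL_TITLE,
}

# Exactly one entry exists for the man himself.
ENTRIES: Dict[str, str] = {
    CANONICAL_TITLE: "The single article about the man himself.",
}


def normalize(name: str) -> str:
    """Fold case and collapse whitespace so trivial differences do not matter."""
    return " ".join(name.lower().split())


def resolve(name: str) -> Optional[str]:
    """Return the canonical title for a spelling, or None if it is unknown."""
    return REDIRECTS.get(normalize(name))


def lookup(name: str) -> Optional[str]:
    """Return the entry text reached from any known spelling, or None."""
    title = resolve(name)
    return ENTRIES.get(title) if title is not None else None
```

The point of the design is that consistency lives in the pointers: adding one more spelling means adding one more key to the redirect table, and the entry itself never has to be duplicated or touched.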

Furthermore, the hyperlinks in the Wikipedia work immensely better than the cross-references in any conventional reference work, simply because they are so much easier to use.

Obviously, the typographical effort is limited by the narrow boundaries of what can be accomplished in HTML, so once again the Wikipedia’s cost is a laughable fraction of that of publications bound in actual paper. But still, I think the online pages are clear, attractive, and visually well organized.

Where To? · I don’t know which way the Wikipedia’s headed, and I don’t know if it’ll survive; it’s certainly not invulnerable to concerted malicious attack, and we’ve seen that happen often enough on the Net. But that may not happen, or it may grow sufficient defenses, or it may adopt some processes more like those of mainstream reference publishing.

It may flower, but on the other hand, it may dissolve in a pool of demotivation and recrimination. Who can tell? Not me; but I’ll watch, and I’ll use it, and I’ll help out a bit. One thing is sure: the Wikipedia dwarfs its critics.


December 06, 2004