This chapter of the On Search saga is a side-trip: a look at an unusual search user interface I built a dozen years ago. One of the reasons it didn’t catch on back then was that there wasn’t enough XML in the world. Now that there is, maybe this bit of legacy code will provoke an idea or two. Just maybe, it contains some ideas that will be useful to the folk who are wondering how to make the power of XPath and XQuery useful to ordinary people.
Promises, Promises · The next outing will get back to a systematic tour through real-world everyday search; some of the candidate subjects are stopwords, internationalization, APIs, result ranking, and metadata; I see this stretching on for a while. But one of the many nice things about this style of serial publishing is that you can take side-trips as they occur to you; there’s no editor, no publisher, no trade-show that you have to ship the book for.
Lector and PatMotif · This was actually provoked by a note out of the blue from Håkon Lie, formerly of the W3C, now of Opera. He asked me about an old University of Waterloo/Open Text product called Lector, which was a proto-browser; it was entirely stylesheet-driven, routinely ran with multiple stylesheets, and did multi-column rendering, all things that modern HTML browsers are only beginning to approach. Håkon was researching the history of stylesheets, but I wasn’t able to help him much.
I did, however, run across some screenshots of another old product which we called PatMotif. Here’s one, with PatMotif on the left and Lector (the window labeled “Valentine”) on the right. If you’re going to follow the narrative you might want to right-click that and open the full-size version in another window.
It was called PatMotif because the back-end search engine was called “pat” and the UI used the “Motif” toolkit that ran on top of X windows. At one time, a lot of smart people thought Motif was going to be the GUI framework of the future and beat Windows, and if it hadn’t been for criminal stupidity in the Unix vendor community, and Sun Microsystems’ bone-headed land-grab attempt on behalf of their competing “Open Look” toolkit, that might even have happened.
I wrote PatMotif starting in 1990, originally based on release 1.0 of Motif, which was a trail of tears indeed, but it matured as Motif did and eventually became very solid, and ran pleasingly fast on pretty well anything that could run X. That was the first time I’d done real GUI work.
There are four working areas on the screen. At the top left you type in your query: we had a nice little query language, but the idea of PatMotif was that you didn’t need to use it, so most queries were just strings. Down the right side is a list of the structures available in the document base. This was before XML, but in XML terms this is a list of the elements and attributes. In the example picture, the database is the online Second Edition of the Oxford English Dictionary, so the structures are the kind of things you find in a large scholarly dictionary. It was pretty complex; I seem to remember about fifty element types, of which only a dozen or so were really useful for searching.
Chaucer Walk-Through · In this example, someone has typed chaucer in the search field. This caused the line labeled 1. "chaucer" to appear in the left middle window, which shows your search results so far. Also, it caused a sample of the matches to chaucer to appear in the bottom left window: this kind of listing is called KWIC (KeyWord In Context) in search jargon and it can be very useful. We actually included the raw XML markup in the KWIC display, which I thought then was a good idea, but I was probably wrong.
Next, the person clicked on First Quot., one of the dictionary’s elements, which selected all the occurrences of chaucer that were in a dictionary entry’s earliest-dated quotation. That produced the display you see in the illustration. If they’d gone on to click Entry in the “Components” window, they would have generated a listing of all words in the OED whose first known use is by Chaucer, something that a linguistics or literature researcher might well care about.
That’s Lector over on the right, showing off one of those entries: I bet you didn’t know before today that Chaucer gets credit for the first known use of “valentine” as a noun.
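For readers who haven’t met a KWIC listing before, here is a minimal sketch in Python of how one can be produced. It is not PatMotif’s code; the function name `kwic`, the `width` parameter, the bracket markers, and the sample text are all made up for illustration.

```python
import re

def kwic(text, keyword, width=30):
    """Return one context line per case-insensitive match of keyword."""
    lines = []
    for m in re.finditer(re.escape(keyword), text, flags=re.IGNORECASE):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keyword lines up in a column.
        lines.append(f"{left:>{width}}[{m.group(0)}]{right}")
    return lines

# Invented sample text, not real dictionary content.
sample = ("c1386 Chaucer is cited in this made-up quotation. "
          "Another made-up line citing CHAUCER as its source.")

for line in kwic(sample, "chaucer", width=20):
    print(line)
```

Each match comes back with a fixed amount of text on either side, which is what lets the eye run down the keyword column and skim many hits quickly.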
Guessing + Hierarchy = Intuitiveness · The idea was that any time you clicked on something, the software tried to figure out a reasonable way to combine what you’d clicked on with your most recent result.
What would have happened when you clicked Entry? The most recent result was matches to chaucer in the First Quot. element. Given that result set and the Entry element, the software would have done the obvious thing and found the result set members which were contained by Entry. In this case, it would have noticed that going from contained-by-First Quot. to contained-by-Entry didn’t change anything (since everything’s in an Entry), so it would have turned the query around and returned the (relatively few) Entry elements which contained the Chaucer matches.
Suppose you wondered about the linguistic history of the words first traced to Chaucer, so you’d click on Etymology. PatMotif would try to find instances of Etymology that included your most recent result (a bunch of entries) and not find any, so it would turn around and try it the other way, looking for Etymology elements contained within those Chaucerian Entry elements, and return you those.
This guesswork, combined with the naturally hierarchical structure of document databases, produced a remarkably intuitive way to work your way through a bunch of structured text.
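The guessing described above can be sketched in a few lines of Python. This is a reconstruction from the description, not the original PatMotif logic: it assumes results and elements are both lists of (start, end) character offsets, and the function names, the order in which the fallbacks are tried, and the toy offsets are all hypothetical.

```python
def contains(outer, inner):
    """True if span inner lies entirely inside span outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

def within(items, containers):
    """Items that lie inside at least one container."""
    return [i for i in items if any(contains(c, i) for c in containers)]

def including(containers, items):
    """Containers that hold at least one item."""
    return [c for c in containers if any(contains(c, i) for i in items)]

def guess_combine(result, clicked):
    """Combine the latest result with the clicked element's instances."""
    # First guess: keep the result members inside the clicked element.
    narrowed = within(result, clicked)
    if narrowed and narrowed != result:
        return "within", narrowed
    # That changed nothing (or found nothing): turn the query around and
    # return the clicked elements that contain a result member.
    wrappers = including(clicked, result)
    if wrappers:
        return "including", wrappers
    # Last try: clicked elements that sit inside a result member.
    return "contained", within(clicked, result)

# Toy document: two entries, each with a first quotation and an etymology.
entries = [(0, 100), (100, 200)]
first_quots = [(10, 40), (110, 140)]
etymologies = [(50, 70), (150, 170)]
matches = [(15, 22), (180, 187)]  # second match is in a later quotation

step1 = guess_combine(matches, first_quots)
step2 = guess_combine(step1[1], entries)
step3 = guess_combine(step2[1], etymologies)
print(step1, step2, step3)
```

The three calls mirror the walk-through: clicking First Quot. narrows the matches, clicking Entry flips to the containing entries, and clicking Etymology falls through to the etymologies inside those entries.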
And if you didn’t want the guesswork, you could pull down that “Combine” menu, which let you do AND and OR and NOT and INCLUDE and WITHIN combinations manually by clicking on the things you wanted combined.
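As a rough illustration of what five such operators might look like over result sets, here is a small Python sketch. The operator names come from the menu, but their semantics here are assumptions: AND, OR, and NOT are treated as plain set intersection, union, and difference, and INCLUDE and WITHIN as containment tests on (start, end) spans. The `COMBINE` table and the sample offsets are hypothetical.

```python
def contains(outer, inner):
    """True if span inner lies entirely inside span outer."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]

# Assumed semantics; the real menu may have behaved differently.
COMBINE = {
    "AND": lambda a, b: [x for x in a if x in b],
    "OR": lambda a, b: a + [x for x in b if x not in a],
    "NOT": lambda a, b: [x for x in a if x not in b],
    "INCLUDE": lambda a, b: [x for x in a if any(contains(x, y) for y in b)],
    "WITHIN": lambda a, b: [x for x in a if any(contains(y, x) for y in b)],
}

entries = [(0, 100), (100, 200)]
matches = [(15, 22), (180, 187), (250, 257)]  # last match is outside any entry

inside = COMBINE["WITHIN"](matches, entries)
print(inside)
print(COMBINE["INCLUDE"](entries, matches))
print(COMBINE["NOT"](matches, inside))
```

Keeping the operators in a table keyed by menu label means a menu click only has to look up a name and apply it to the two selected result sets.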
We introduced it to nontechnical users by saying “Just type some words in and start clicking on things and you’ll see what it’s doing.” And they did. It was fun demonstrating it to a computer programmer, though; they’d do a couple of clicks and get this puzzled “How are they doing that?” look in their eyes.
XPath, XQuery, Etc. · I look at the recent drafts coming out of the XPath and XQuery group, and for damn sure these things are going to give the programmers of the world the tools they need to pick apart XML instances and retrieve from XML databases (in fact, the programmers are getting way more than they need, but that’s another debate).
But I suspect the number of people who are going to be able to use XPath/XQuery directly is even smaller than the number who can do SQL, which is already a pretty small number. So we’re obviously going to need some sort of user interface for ordinary people to use. I wouldn’t be surprised at all if PatMotif had some useful ideas to contribute to that design process.