A Web Interface for Web Publishing

Today Sam Ruby launched a discussion of API options for weblogs, or more generally for Episodic Web Publishing, or even more generally for the Writeable Web. This is in conjunction with the ongoing effort to develop a next-generation syndication format aimed at the same problem. This essay considers the technical issues around a pure Web interface.

XML-RPC & SOAP · I’ll leave to others the task of figuring out how to do this in XML-RPC and/or SOAP. I’ve never really liked either; they feel like a pair of bracketing shots to me.

XML is good at taking a chunk of data and applying labels to its parts in a simple and straightforward way. I’ve always thought that XML-RPC, with its unlabeled positional parameters, throws this essential value away. And SOAP on the other hand has about five times as much complexity and abstraction as I think you ought to need to accomplish the task of building applications around message flow.

On the other hand, there are lots of implementations for both of them, and I gather that XML-RPC is falling-off-a-log easy to get going, and that if you’re playing in the Microsoft sandbox, the SOAP toolbox is outstanding.

The HTTP Way · But I think the basic low-level machinery that you get for free with the Web provides a potentially outstanding platform for this API. I’m not the first person to have this idea; the best-known expression of it, as far as I know, is Joe Gregorio’s RESTLog; my approach differs in detail but is essentially identical in spirit.

The Web wants you to call important things “Resources” and identify them with URIs. In this situation, the publication (“weblog”) is a Resource, and so is each entry in it.

The Web wants you to access resources using a very small set of verbs; in practice, those defined for HTTP in RFC 2616. GET, the most common, asks a server to send a representation of a resource. POST is designed to do the following (quoting from section 9.5 of the RFC):

Annotation of existing resources;
Posting a message to a bulletin board, newsgroup, mailing list, or similar group of articles;
Providing a block of data, such as the result of submitting a form, to a data-handling process;
Extending a database through an append operation.

PUT requests that a representation of a resource be stored, and DELETE requests that a resource be deleted.

The RFC helpfully provides appropriate return codes for each of these verbs in various scenarios.
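As a rough illustration of those return codes, here is a small Python lookup of the success codes that RFC 2616 suggests for each verb. The names `SUCCESS_CODES` and `success_code`, and the scenario labels, are my own shorthand for this sketch and are not defined anywhere in the RFC.

```python
from http import HTTPStatus

# Success codes RFC 2616 associates with each verb (sections 9.3 to 9.7).
# The scenario labels are illustrative shorthand, not RFC terminology.
SUCCESS_CODES = {
    ("GET", "representation returned"): HTTPStatus.OK,
    ("POST", "new resource created"): HTTPStatus.CREATED,
    ("POST", "processed, result in body"): HTTPStatus.OK,
    ("POST", "processed, nothing to return"): HTTPStatus.NO_CONTENT,
    ("PUT", "new resource created"): HTTPStatus.CREATED,
    ("PUT", "existing resource modified"): HTTPStatus.OK,
    ("DELETE", "deleted, status in body"): HTTPStatus.OK,
    ("DELETE", "accepted, not yet enacted"): HTTPStatus.ACCEPTED,
    ("DELETE", "deleted, nothing to return"): HTTPStatus.NO_CONTENT,
}


def success_code(verb: str, scenario: str) -> int:
    """Look up the numeric status code for a verb and scenario label."""
    return int(SUCCESS_CODES[(verb.upper(), scenario)])
```

A server built along these lines would mostly care about the POST rows, since the rest of this essay routes nearly everything through POST.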

Creating an Entry · There are two ways you could go about this. You could PUT the entry’s contents to its URI, or you could POST a request-new-entry transaction to the weblog’s URI. On balance, POST seems to be the way to go. First, when you create the entry, you might not know what its URI is going to be; a lot of weblog publishing systems cook up the URI when you create the entry. Second, the description above of what POST is designed for sounds like pretty much exactly what we’re trying to do here.

Let’s create a short new entry in the publication at http://example.com/moby. The client software connects to example.com on port 80 and the interaction goes like this:

POST /moby HTTP/1.1
Host: example.com
Content-Type: application/not-echo+xml

<?xml version='1.0'?>
<entry>
 <title>Chapter 1</title>
 <author><name>Herman Melville</name></author>
 <created>1851-10-18T02:00:00Z</created>
 <content type='text/plain'>Call me Ishmael.</content>
</entry>
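For a sense of what the client side might look like, here is a minimal Python sketch that builds the same POST using only the standard library. The helper names `build_entry` and `build_create_request` are hypothetical, and the request is constructed but deliberately not sent, since example.com hosts no such weblog.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_entry(title, author, created, text):
    """Serialize an <entry> document like the one shown above, as UTF-8 bytes."""
    entry = ET.Element("entry")
    ET.SubElement(entry, "title").text = title
    name = ET.SubElement(ET.SubElement(entry, "author"), "name")
    name.text = author
    ET.SubElement(entry, "created").text = created
    content = ET.SubElement(entry, "content", {"type": "text/plain"})
    content.text = text
    return ET.tostring(entry, encoding="utf-8")


def build_create_request(weblog_uri, body):
    """Wrap the entry bytes in a POST aimed at the weblog's URI."""
    return urllib.request.Request(
        weblog_uri,
        data=body,
        method="POST",
        headers={"Content-Type": "application/not-echo+xml"},
    )


request = build_create_request(
    "http://example.com/moby",
    build_entry("Chapter 1", "Herman Melville",
                "1851-10-18T02:00:00Z", "Call me Ishmael."),
)
# urllib.request.urlopen(request) would actually send it.
```

The library fills in the Host and Content-Length headers when the request is sent, so the client code only has to care about the entry itself.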

Assuming the entry creation works, the response message from the server might look like this:

HTTP/1.1 201 Created
Location: http://example.com/moby/chap1
Content-Type: application/not-echo+xml

<?xml version='1.0'?>
<entry>
 <title>Chapter 1</title>
 <author><name>Herman Melville</name></author>
 <link>http://example.com/moby/chap1</link>
 <created>1851-10-18T02:00:00Z</created>
 <issued>2003-07-02T23:11:12Z</issued>
 <modified>2003-07-02T23:11:12Z</modified>
 <content type='text/plain'>Call me Ishmael.</content>
</entry>
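The useful thing about getting the whole entry back is that the client learns what the server decided, most importantly the new URI. A minimal sketch of pulling those fields out in Python follows; `entry_fields` is a hypothetical helper, and the element names are simply the ones used in the sample response.

```python
import xml.etree.ElementTree as ET

# The response body from the example exchange above.
RESPONSE_BODY = """<?xml version='1.0'?>
<entry>
 <title>Chapter 1</title>
 <author><name>Herman Melville</name></author>
 <link>http://example.com/moby/chap1</link>
 <created>1851-10-18T02:00:00Z</created>
 <issued>2003-07-02T23:11:12Z</issued>
 <modified>2003-07-02T23:11:12Z</modified>
 <content type='text/plain'>Call me Ishmael.</content>
</entry>"""


def entry_fields(xml_text):
    """Return the entry's simple child elements as a dict of tag -> text."""
    root = ET.fromstring(xml_text)
    fields = {
        child.tag: (child.text or "").strip()
        for child in root
        if len(child) == 0  # skip elements with children, such as <author>
    }
    fields["author"] = root.findtext("author/name", default="")
    return fields


fields = entry_fields(RESPONSE_BODY)
print(fields["link"])
```

A client would typically remember the `link` value, since that is what it needs in order to update or remove the entry later.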

Updating an Entry · Since you have to know an entry’s URI to update it, PUT seems to offer more utility for updating than for creating entries. But on balance, I’d lean toward doing this with POST too; if a POST to a weblog includes an entry’s URI and that entry exists, you can conclude that this is an attempt to update.

This isn’t rocket science, and realistically, the architecture of your publishing application is apt to be simpler if you can run both create and edit transactions through your POST handler.
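To make the single-handler idea concrete, here is a minimal in-memory sketch in Python. Everything about it is an assumption made for illustration: the `Weblog` class, the `entryN` URI scheme, the use of a `<link>` element to carry the entry's URI, and the choice of 404 when a POST names a URI that doesn't exist (the text above doesn't say what should happen in that case).

```python
import xml.etree.ElementTree as ET


class Weblog:
    """Toy publication that runs both create and update through one POST handler."""

    def __init__(self, base_uri):
        self.base_uri = base_uri
        self.entries = {}      # entry URI -> stored entry XML
        self._next_id = 1

    def handle_post(self, body):
        """Return (status, entry URI) for a POSTed <entry> document."""
        entry = ET.fromstring(body)
        link = entry.findtext("link")
        if link is not None:
            if link in self.entries:
                # The entry names an existing URI: treat it as an update.
                self.entries[link] = body
                return 200, link
            # Assumed behavior: an unknown URI is an error, not a create.
            return 404, link
        # No URI supplied: the server cooks one up and creates the entry.
        link = f"{self.base_uri}/entry{self._next_id}"
        self._next_id += 1
        ET.SubElement(entry, "link").text = link
        self.entries[link] = ET.tostring(entry, encoding="unicode")
        return 201, link
```

The create and update paths share the parsing step and the storage, which is the simplification being argued for here.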

Removing an Entry · Once again, if you know the URI, you could reasonably expect to use a DELETE against it. But I’d be inclined to use POST for this one too, for the same reason: it seems like less work to run all this stuff through the POST handler.

So what I’d suggest would be that to remove an entry, you POST to the weblog’s URI a body containing only:

<delete link="URI-of-entry-to-nuke" />
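On the server side, handling that body could be as small as the following Python sketch. The function name `handle_delete`, the dict used as an entry store, and the status codes chosen (204 on success, 404 for an unknown URI, 400 for anything that isn't a `delete` element) are all assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def handle_delete(entries, body):
    """Remove the entry named by a POSTed <delete link="..."/> body.

    `entries` is a dict mapping entry URIs to stored entries.
    Returns an HTTP status code (the specific codes are assumed).
    """
    root = ET.fromstring(body)
    if root.tag != "delete":
        return 400  # not a delete request at all
    link = root.get("link")
    if link not in entries:
        return 404  # nothing stored at that URI
    del entries[link]
    return 204      # removed, nothing further to report
```

In a real publishing system this check would presumably sit at the top of the same POST handler that deals with creating and updating entries.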

Note that if you require the use of PUT and DELETE against individual entry URIs, it makes strategies such as serving them from static files quite a bit trickier.

What, No Query? · So far, I’ve said nothing whatsoever about querying or retrieving or using GET. This is by design. If you know an entry’s URI, you ought to be able to GET it and trust the publishing system to do something sensible. The publication ought to have published a set of URIs that can be used for retrieving recent entries, or entries selected according to a SQL query, or an XQuery, or a full-text search, or whatever. I can’t see any reason at this point to try to constrain the behavior of servers or standardize on a query mechanism.

Comments, Trackbacks, Etc. · I’ve never implemented these, so I won’t dive deep here. However, I will note that running this kind of thing through the publication’s POST handler is apt to be more straightforward than trying to interact directly with individual entry resources, and to have no real downside for the client-side programmer.

A Challenge · This definition is only just sketched in, but it wouldn’t be that much work to polish up; there’s lots of experience and prior art out there. If we define a Web API along these lines, and others built on XML-RPC and/or SOAP, I bet that a competent programmer is going to be able to bash out a publishing system through the Web API quicker than the XML-RPCniks or SOAPsters, and it’ll be less complicated, and it’ll be easier to understand and maintain.

But I could be wrong. The neat thing is that I think there’s a good chance we’ll actually be doing the experiment in the very near future.

July 03, 2003

By Tim Bray.

