It so happens that my name is on the front page of Namespaces in XML 1.0, a technology which is pretty broadly disliked. Well, it seemed like a good idea at the time. But I think we’ve learned some useful things since then and can make some good consensus recommendations for people doing this kind of thing, especially if they’re using JSON.
Problems of History · A good place to start brushing up on this would be with James Clark’s recent XML Namespaces. James is authoritative on technology, but I’m going to quibble with his take on history: “the argument for naming namespaces with URIs is that you can do a GET on the URI and get something human- or machine-readable back that tells you about the semantics of the namespace.” I’m not sure. I recall it more simply: namespaces needed to have names and over at the Web consortium, it’s basic dogma that whenever you’re naming things, you should name them using URIs unless there’s a good reason not to. The benefits of using URIs are many and don’t necessarily include using them to retrieve data.
In any case, the rest of James’ argument is well worth reading. I have another approach I’d like to explore here, which is marginal for XML but real interesting for JSON.
Do It Like Java · My Java classes have names like org.tbray.framer.Framer or com.sun.cloud.VM, depending on the context. It’s very unlikely they’ll collide with anything else. We really couldn’t adopt this approach for XML, because the dot “.” had historically been allowed, and commonly used, in SGML element types and attribute names. I bet that, if it hadn’t, we might well have done that.
Especially for JSON · A lot of protocols these days format their messages in JSON, which makes perfect sense if you’re not trying to interchange document-like things. These messages tend to be dictionaries; here’s an example from the Sun Cloud API:
{
"name" : "Database",
"uri": "/vdc/m~001",
"run_status" : "HALTED",
"description" : "MySQL host",
"tags" : [ "sql" ]
}
In the Sun Cloud API, as in many others, the dictionary keys don’t use any dots. And like many others, it has a MustIgnore policy: consumers skip over keys they don’t recognize. So, while I still have an action item to make this explicit, the extensibility policy is obvious: use Java-style names with dots. So if I wanted to add a Sun-specific proprietary extension to this particular resource representation, it’d look like this:
{
"name" : "Database",
"uri": "/vdc/m~001",
"run_status" : "HALTED",
"description" : "MySQL host",
"tags" : [ "sql" ],
"com.sun.cloud.solaris-version" : "2009.06"
}
Simple, obvious, and explained in a single short paragraph; I like it a lot.
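Here is a minimal sketch, in Python, of how a consumer might combine the MustIgnore policy with the dotted-name convention. The set of core keys is copied from the example above, and the function name split_extensions is hypothetical; neither is defined by the Sun Cloud API.

```python
import json

# Core keys copied from the example message; a real API would define its own set.
CORE_KEYS = {"name", "uri", "run_status", "description", "tags"}


def split_extensions(message):
    """Split a decoded JSON dictionary into core fields and dotted extensions.

    Keys containing a dot are treated as Java-style extension names.
    Undotted keys that are not in CORE_KEYS are dropped (MustIgnore).
    """
    core, extensions = {}, {}
    for key, value in message.items():
        if "." in key:
            extensions[key] = value
        elif key in CORE_KEYS:
            core[key] = value
        # Anything else is silently ignored.
    return core, extensions


doc = json.loads("""
{
  "name": "Database",
  "uri": "/vdc/m~001",
  "run_status": "HALTED",
  "description": "MySQL host",
  "tags": ["sql"],
  "com.sun.cloud.solaris-version": "2009.06"
}
""")

core, extensions = split_extensions(doc)
```

A consumer that knows nothing about Sun's extensions can simply discard the second dictionary; one that does can look up the full dotted name in it.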
Comment feed for ongoing:
From: Peter Keane (Jan 17 2010, at 21:17)
This sounds good, but I'd guess there will be a need to easily port over namespaces from XML (e.g. a JSONified Atom) as well. Now all we need is a way to define link semantics in JSON and it'll be good to go. ...
From: Tom Passin (Jan 17 2010, at 21:30)
I don't think this would completely solve the problem, because the biggest part of the problem is the definition of prefixes - as aliases for the longer actual namespace names - and their binding to the actual namespace. Or, somewhat equivalently, establishing the scope of each namespace alias.
In Java, as an example, you can import com.a.b.c.d.FancyThing, and then refer to it just as FancyThing, anywhere in the importing class file. Python has a similar ability, as do many other languages. It's this ease of establishing a readable, easily writable name whose scope is easy to understand that is missing in XML. Not saying there aren't good reasons for it, but still, there it is.
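For dotted JSON keys, one way to get an import-like short name is to bind the prefix once in the consuming code rather than in the message. The following is only a sketch under that assumption; bind_prefix is a hypothetical helper, not part of any API discussed here.

```python
def bind_prefix(prefix):
    """Return a lookup function that resolves short names under one dotted prefix."""
    def lookup(message, short_name, default=None):
        # The alias lives in the consumer's code; the message keeps full names.
        return message.get(f"{prefix}.{short_name}", default)
    return lookup


# Roughly analogous to an import: bind the long prefix to a short local name.
sun = bind_prefix("com.sun.cloud")

message = {
    "name": "Database",
    "com.sun.cloud.solaris-version": "2009.06",
}
version = sun(message, "solaris-version")
```

The scope of the alias is then just the ordinary scope of the variable sun, which sidesteps the prefix-binding question inside the data itself.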
From: Stuart Langridge (Jan 18 2010, at 01:05)
For the desktopcouch project, we define extensibility through an "application_annotations" dictionary in each JSON document, which is where apps should put data which is specific to them. So, a standard "contact" record might be:
{
"name": "Stuart Langridge",
"phone": "01 811 8055",
"web": "http://www.kryogenix.org/"
}
and if Thunderbird, say, wanted to store thunderbird-specific data in that record, it'd be:
{
"name": "Stuart Langridge",
"phone": "01 811 8055",
"web": "http://www.kryogenix.org/",
"application_annotations": {
"Thunderbird": {
"internal_id": "927423984"
}
}
}
or similar.
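A rough sketch of how an application might read and write its own section of such a record follows. The helper names get_annotation and set_annotation are hypothetical and are not desktopcouch's actual API; only the "application_annotations" layout comes from the description above.

```python
def get_annotation(record, app, key, default=None):
    """Look up one app-specific value, tolerating records with no annotations."""
    annotations = record.get("application_annotations", {})
    return annotations.get(app, {}).get(key, default)


def set_annotation(record, app, key, value):
    """Store an app-specific value without touching the shared fields."""
    annotations = record.setdefault("application_annotations", {})
    annotations.setdefault(app, {})[key] = value


contact = {
    "name": "Stuart Langridge",
    "phone": "01 811 8055",
    "web": "http://www.kryogenix.org/",
}
set_annotation(contact, "Thunderbird", "internal_id", "927423984")
```

Because each application writes only under its own name, two applications can both use a key like "internal_id" without colliding.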
From: Bram Cohen (Jan 18 2010, at 20:22)
On a technical level, your suggestion involves doing nothing, which I think is absolutely correct. You're admitting that everyone lives in a single global namespace, and proposing a convention for people to avoid stepping on each other's toes. Thank you for being in touch with reality.
From a practical standpoint, there are two approaches: either have names which unambiguously specify the vendor, thus ensuring no collisions but also possibly having lots of redundant keys with identical semantics and problems migrating to consistent ones, or simply trust everyone to use reasonably descriptive names and give them the responsibility of not stepping on each other's toes. Once there's a com.sun.java.lang.encoding.bitrate and a com.ibm.codec.bitrate, both of which mean the same thing, that's obviously a worse situation than if they'd both used the name 'bitrate' to begin with, but in the other case you have to worry about everybody getting the units the same for the bitrate key. In my experience doing forward compatibility with a bitfield doesn't give enough room for people to avoid stepping on each other's toes, but having a dictionary with UTF-8 keys does. My preference is strongly towards simply expecting people to be grown-ups, because the alternative is trying to solve a problem which probably doesn't exist.
Trying to be too serious about these issues can result in some truly ludicrous results. For example, the mimetype for BitTorrent is application/x-bittorrent, because I was told that you're supposed to use x- for things which aren't 'real' standards, so I did, but from now until the end of time the Apache default config won't include it, because anything in x-* is by definition not a 'real' standard, and there's a strict policy that they only include 'real' standards in the default config. I'm going to switch mimetypes when pigs fly, because nobody on the planet seriously suggests that the mapping of .torrent to application/x-bittorrent isn't a clear de facto standard, and converting would be extremely painful, and at this point most real distributions have the Apache config file fixed, for that and a bunch of other reasons. The Apache people are very serious about this wankery though.
So I guess my point is that you should clearly make the statement 'This is a convention for avoiding stepping on each other's toes. It only applies to tags which are only for a single vendor. Please act like grown-ups and don't do anything stupid when coordinating tags which are shared between multiple vendors.'
From: masklin (Jan 19 2010, at 00:58)
> Simple, obvious, and explained in a single short paragraph; I like it a lot.
Except, of course, it breaks e.g. javascript's equivalence between `[]` and `.`.