The IETF has just revised its JSON spec; the new version is RFC7159. That link is to the IETF’s traditional line-printer format; I’ve parked an HTML version at rfc7159.net for people who want to actually read the thing, not just link to it. [Disclosure: I edited RFC7159.]
Highlights · RFC7159 cleans up some ambiguities and inconsistencies in various JSON definitions, none of which caused any real-world pain.
More important, it captures industry experience about stupid things you can do in your JSON that are allowed by the spec but will cause problems in practice. If you’re interested, I recommend opening up the HTML version and searching forward for the string “interop”. There are 17 occurrences. If you’re generating JSON (something a lot of us do all the time) and you make sure to avoid the mistakes highlighted in those 17 places, you’re very unlikely to cause pain or breakage in software that’s receiving it.
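To make two of those interop hazards concrete, here is a minimal sketch in Python, assuming only the standard `json` module. The helper names `strict_loads` and `strict_dumps` are made up for illustration and are not part of any spec or library. The first refuses objects with repeated member names; the second refuses NaN and Infinity, which the JSON number grammar cannot represent.

```python
import json


def reject_duplicate_names(pairs):
    """object_pairs_hook: build a dict, raising on a repeated member name."""
    seen = {}
    for name, value in pairs:
        if name in seen:
            raise ValueError(f"duplicate object member name: {name!r}")
        seen[name] = value
    return seen


def strict_loads(text):
    # Hypothetical helper: parse JSON but refuse objects with repeated names.
    # The hook is invoked for every object in the text, including nested ones.
    return json.loads(text, object_pairs_hook=reject_duplicate_names)


def strict_dumps(value):
    # Hypothetical helper: allow_nan=False makes json.dumps raise ValueError
    # for NaN and Infinity instead of emitting non-JSON tokens.
    return json.dumps(value, allow_nan=False)
```

By default Python's `json.loads` silently keeps the last of several duplicate names, and `json.dumps` emits `NaN` for a not-a-number float, so both checks have to be opted into.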
History · JSON was invented by Douglas Crockford. When I first came across it in the early 2000s, I poked around and ended up at Crockford’s JSON.org, which remains an excellent example of how to do a technical tutorial right: short, clear, and with no BS.
In 2005 or so the IETF noticed that gobs and gobs of JSON were being sent back and forth with HTTP, and the rules say that when you do that you should have a “Content-Type” header with a registered Media Type. There wasn’t one, so Crockford pitched in and we got RFC 4627, which was probably what most people thought of as the official definition of JSON.
Since the “JS” in JSON stands for JavaScript, and the official spec for that is done in ECMA (where it’s called “ECMAScript”), it shouldn’t be surprising that another description of JSON turned up in section 15.12 of the ECMAScript Version 5.1 spec.
Which brings us to 2013. There was considerable grumbling in the IETF over JSON specs and in particular RFC4627. First of all, it was “Informational” rather than “Standards track” which meant there were bureaucratic problems referring to it in certain contexts. Secondly, it and the ECMA version were inconsistent in that 4627 required a JSON text to be either an object or a list, whereas the ECMA version was fine with just "42" or true. Finally, 4627 allowed a few things, like duplicate keys in objects and broken Unicode strings, that everyone agrees are bad practices.
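The top-level disagreement is easy to see with a small sketch in Python, assuming the standard `json` module, which accepts any JSON value at the top level as the ECMA description does. The function name `is_rfc4627_text` is hypothetical, made up here to show the stricter 4627 rule.

```python
import json


def is_rfc4627_text(text):
    # Hypothetical helper: True only when the top-level value is an object
    # or an array, which is what RFC 4627 required of a JSON text.
    # Text that is not JSON at all still raises from json.loads.
    return isinstance(json.loads(text), (dict, list))
```

Under this check `{"a": 1}` and `[1, 2]` pass, while a bare `"42"` or `true` parses fine but is rejected as a 4627 JSON text.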
So the IETF spun up a JSON Working Group in 2013 with the objective of revising 4627 to fix these problems. Douglas Crockford signed up to edit the revision, but couldn’t stand the chaos and noise of the IETF process and stepped away. Since I’d been making suggestions on the mailing list and have experience editing markup-language specifications, they asked me if I’d mind finishing up the editing process. I didn’t think it would be much work (I was right, it wasn’t) and agreed.
Then we had a completely ridiculous standards-organization kitten-fight; someone told the ECMA working group that the IETF had gone crazy and was going to rewrite JSON with no regard for compatibility and break the whole Internet and something had to be done urgently about this terrible situation. Apparently nobody bothered to read the IETF Working Group’s charter. So ECMA rushed out ECMA 404, yet another specification of JSON, on the double-quick hurry-up. It’s extremely minimal and claims to specify only the syntax but not the semantics of JSON (I don’t understand what they mean by those words). It doesn’t address any of the gripes that were motivating the IETF revision.
It got pretty silly for a while, with demands that everyone officially recognize ECMA 404 as the only “normative” definition of JSON. But I think this little flurry of excitement left no permanent scars.
Anyhow, by my count, the Internet now has at least 5 different things that can be thought of as JSON specifications. Fortunately, everyone pretty well agrees on the right way to do JSON, so that’s not a problem.
Comment feed for ongoing:
From: Grahame Grieve (Mar 06 2014, at 03:50)
"none of which caused any real-world pain".. actually, nearly all of them had come up for us (http://hl7.org/fhir project) in one place or another. In particular, property order has been an issue since there are order dependent processors out there, and it can make quite a difference to how you parse the JSON.
So it's good to have that issue clarified. Thanks.
From: Nelson (Mar 06 2014, at 13:06)
That's a remarkably good-humored writeup. Thanks for working on the RFC!
From: Janne (Mar 06 2014, at 19:53)
My one major pain point with JSON - and one I had sincerely hoped would have been addressed - is that the numeric format is not IEEE 754 compliant. In fact, this text puts an explicit nail in that coffin:
"Numeric values that cannot be represented in the grammar below (such as Infinity and NaN) are not permitted."
Which means we're often left with either putting all numeric data in strings; or simply disregarding the standard and using "Inf" and "NaN" in our JSON numeric fields. Neither is ideal of course, but such is life.
From: Aaron (Mar 07 2014, at 09:06)
I'm going to show my ignorance... Why is it so important to codify null in the spec (besides legacy)?
From: tom jones (Mar 08 2014, at 17:33)
ironically, the "line-printer" format of the spec is actually more readable to me.
the 160 character lines are close to 3 times the length of the recommendation for an enjoyable read.