Jean Paoli called last week to tip me off about the release of the MS Office XML schema-ware. It turns out that the actual data won’t be on view till December 5, so I don’t have a lot to say about the technical details. I also haven’t gone very deep on the patent and licensing issues, but Jean’s description made it sound like they’re trying not to get in the way. However, there are a couple of points he emphasized on the phone that don’t seem to have made it into the press coverage. [Update: Jean sends some pointers.]
Human-Oriented · Jean says that along with the XSD schemas, there is going to be a bunch of human-oriented technical documentation explaining what all those elements and attributes mean. This is important. Now, it won’t be complete: since this markup can express essentially all of the semantics of Word and Excel, representing thousands of person-years of software development, there’s no way you could capture all that and write it down and maintain it. But it’s a whole lot better than nothing. In general, I think that this kind of tech doc is an order of magnitude more important than schemas, which only tell you whether the elements and attributes have reasonable values and are in the proper order. So I think the impact on the world will be substantially a function of the quality and usability of this tech doc.
Update: Jean sent me two pointers to some of this tutorial stuff. Here is a WordML write-up; start with the overview.
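To make the schema-versus-documentation point concrete, here is a minimal sketch of what a structural check can and cannot tell you. It is a toy checker standing in for a real XSD validator, and the vocabulary (`doc`, `para`, `run`, `style`, `bold`) is invented for illustration; it is not real WordprocessingML, and the `is_valid` helper is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical mini-vocabulary for illustration only; NOT real WordprocessingML.
ALLOWED_CHILDREN = {"doc": ["para"], "para": ["run"], "run": []}

# Allowed attributes per element. None means "any string value is accepted".
ALLOWED_ATTRS = {"para": {"style": None}, "run": {"bold": {"yes", "no"}}}


def is_valid(elem):
    """Check nesting and attribute values, which is roughly what a schema covers."""
    if elem.tag not in ALLOWED_CHILDREN:
        return False
    allowed = ALLOWED_ATTRS.get(elem.tag, {})
    for name, value in elem.attrib.items():
        if name not in allowed:
            return False
        if allowed[name] is not None and value not in allowed[name]:
            return False
    for child in elem:
        if child.tag not in ALLOWED_CHILDREN[elem.tag]:
            return False
        if not is_valid(child):
            return False
    return True


good = '<doc><para style="Heading1"><run bold="yes">Hello</run></para></doc>'
bad_value = '<doc><para style="Heading1"><run bold="maybe">Hello</run></para></doc>'
odd_style = '<doc><para style="NoSuchStyle"><run bold="no">Hello</run></para></doc>'

print(is_valid(ET.fromstring(good)))       # True
print(is_valid(ET.fromstring(bad_value)))  # False: "maybe" is not an allowed value
print(is_valid(ET.fromstring(odd_style)))  # True: structurally fine, meaning unknown
```

The third document passes even though nothing here says what a style named `NoSuchStyle` would do when rendered. That gap is the part only human-oriented documentation can fill.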
Branding · Jean was at pains, on two or three occasions during our brief phone call, to emphasize that the three markup languages in question (for Word, Excel, and InfoPath) had their own shiny new names: WordprocessingML, SpreadsheetML, and FormTemplateML. Now if I were a hardened cynic, I’d be speculating that this is not unrelated to the fact that OpenOffice.org has its own set of public, unencumbered XML dialects, and that these might have the potential to become de facto or de jure standards. Wait a second, I am a hardened cynic.