Recently I wrote of my disgruntlement with JSON Schema. Since then I’ve learned that its authors plan more work, and that there are several other efforts to build a schema facility for JSON. This note is just a complaint about a particular use-case, with the hope that it might inform these efforts.

Here’s the problem: the language I’m building includes a big object whose fields are also objects, and each of these child objects has a Type field whose value is, in effect, an enum: a constrained set of string values. The rest of the fields in each child object depend on the Type value.

Like so:


{
  "o1": {
    "Type": "Animal",
    ... other fields specific to Animal ...
  },
  "o2": {
    "Type": "Vegetable",
    ... other fields specific to Vegetable ...
  },
  "o3": {
    "Type": "Mineral",
    ... other fields specific to Mineral ...
  }
}

In JSON Schema it’s easy enough to use a oneOf construct to describe what you want, but when you validate a document against such a schema, the error messages go right off the rails. For example, suppose my schema says that if Type is Mineral, then there must be a boolean-valued Organic field. And then suppose I try to validate a doc with an object that has "Type": "Mineral" and "Organic": 42.

The validators I’ve been working with give me a massively unhelpful flurry of messages saying that:

  • This can’t be Animal because the Type is wrong.

  • This can’t be Vegetable because the Type is wrong.

  • This can’t be Mineral because the Organic is wrong.

  • This can’t be a valid instance because the oneOf doesn’t match.

Of these, only one error message is helpful, but I don’t see any way to write a schema processor to report what’s really wrong in a human-friendly way.

Anyhow, I can think of specific recommendations for schema-language design that fall out of this observation. But I’m not building any of them so I’ll stick with just reporting the use-case.
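To make the complaint concrete, here is a toy sketch in Python of how a naive oneOf check might end up with that flurry of messages. It is not the algorithm of any real validator. The branch table and the function names are made up for illustration, and the Legs and Edible fields are hypothetical; only Type and the boolean Organic field come from the example above.

```python
# Toy oneOf check: try every branch, collect why each one failed.
# Each branch maps a field name to a required literal (str) or Python type.
BRANCHES = {
    "Animal": {"Type": "Animal", "Legs": int},           # "Legs" is hypothetical
    "Vegetable": {"Type": "Vegetable", "Edible": bool},  # "Edible" is hypothetical
    "Mineral": {"Type": "Mineral", "Organic": bool},     # from the example in the text
}


def first_failure(name, rules, obj):
    """Return the first reason obj fails this branch, or None if it matches."""
    for field, expected in rules.items():
        value = obj.get(field)
        if isinstance(expected, str):
            ok = value == expected
            wanted = repr(expected)
        else:
            # Exact type check, so that True is not accepted as an int.
            ok = type(value) is expected
            wanted = expected.__name__
        if not ok:
            return f"not {name}: {field} is {value!r}, expected {wanted}"
    return None


def validate_one_of(obj):
    """Return [] if exactly one branch matches, else every branch's complaint."""
    failures = []
    matches = 0
    for name, rules in BRANCHES.items():
        reason = first_failure(name, rules, obj)
        if reason is None:
            matches += 1
        else:
            failures.append(reason)
    if matches == 1:
        return []
    return failures + [f"oneOf: expected exactly 1 matching branch, found {matches}"]


if __name__ == "__main__":
    for message in validate_one_of({"Type": "Mineral", "Organic": 42}):
        print(message)
```

For the bad Mineral object this yields four messages, with the only useful one (about Organic) buried third. Without extra knowledge, the checker cannot tell that the Mineral branch was the intended one.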



    Contributions


    From: Erik Wilde (May 22 2016, at 15:34)

    you have to do what you have to do. but you'd run into this issue with the majority of schema languages. this kind of "structural value-based co-constraint" is hard to design cleanly into a schema language. any chance to ditch the container names o1 and so forth and use the types instead? frankly, just looking at it i wouldn't consider this a great JSON model, so i have sympathy for any schema language not working great for it.


    From: Grahame Grieve (May 22 2016, at 20:36)

    Actually, this is a relatively common pattern, and indeed, most schema languages can describe it, but don't give processors a realistic way to navigate it. This means that code generators are impossible, or produce incoherent code, and validators produce obscure error messages.

    The key is to introduce into the schema language the notion of a 'discriminator': an explicit instruction about which element to use as a master to navigate the options, along with a requirement that the chosen discriminator is unambiguous (and demonstrably so) for all of the choices. For schema processors, then, this collapses to some logical variant of a switch statement, and generating code and coherent error messages becomes straightforward.
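As a rough illustration of the switch-statement idea in the comment above, here is a minimal Python sketch, assuming the schema declares Type as the discriminator. The rule table, the DISCRIMINATOR constant, and the function name are hypothetical and not taken from any schema language; the Legs and Edible fields are made up, while Organic comes from the post's example.

```python
# Discriminator-driven check: look at the tag first, then validate
# only the one branch it selects.
DISCRIMINATOR = "Type"

# Per-type required fields (everything except Organic is hypothetical).
RULES = {
    "Animal": {"Legs": int},
    "Vegetable": {"Edible": bool},
    "Mineral": {"Organic": bool},
}


def validate_discriminated(obj):
    """Return a list of error messages for obj; empty means valid."""
    tag = obj.get(DISCRIMINATOR)
    if not isinstance(tag, str) or tag not in RULES:
        allowed = ", ".join(sorted(RULES))
        return [f"{DISCRIMINATOR} is {tag!r}; expected one of: {allowed}"]
    # The tag picked exactly one branch, so every message is about that branch.
    return [
        f"{tag}: {field} is {obj.get(field)!r}, expected {expected.__name__}"
        for field, expected in RULES[tag].items()
        if type(obj.get(field)) is not expected
    ]


if __name__ == "__main__":
    print(validate_discriminated({"Type": "Mineral", "Organic": 42}))
```

Because the tag is checked first, the bad Mineral object produces a single message about Organic, and an unknown Type produces a single message listing the allowed values.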


    May 22, 2016