Lessons Learnt from JSON Designs I've Worked On
Over the last couple of years, I've worked on a few JSON schemas, including IPTC's NINJS (for representing news) and the W3C ODRL CG's ODRL in JSON (for representing permissions and restrictions). I've also done some work on JSON internal to AP, for various APIs and search systems. Along the way, I've learnt some lessons about better or worse ways to design JSON, both about the process and about JSON "style". I've broken this into three posts:
- An approach to designing JSON
- Handy tools and standards (this one)
- My thoughts on JSON style
A Couple of Handy JSON Tools
For basic syntax checking of your JSON documents, JSONLint is invaluable. Alternatives include JSON Formatter and Validator (online) and demjson (Python). Most of the XML tools (such as XMLSpy or oXygen) also support JSON.
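If you just want the same kind of syntax check from a script, the Python standard library's json module is enough. This is a minimal sketch, not a replacement for the tools above; the function name and the sample documents are made up for illustration.

```python
import json

def check_json_syntax(text):
    """Return (True, None) if text parses as JSON, else (False, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno are the 1-based position of the first problem found
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None

# Hypothetical sample documents
good = '{"headline": "Example", "urgency": 3}'
bad = '{"headline": "Example", "urgency": 3,}'  # trailing comma is not valid JSON

print(check_json_syntax(good))
print(check_json_syntax(bad))
```

The second call reports a failure along with the position of the error; the exact wording of the message varies between Python versions.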
Even though JSON is touted as being a lightweight alternative to XML, equivalents of many of the features of XML are gradually being added to JSON. One that I make extensive use of is JSON Schema. This is an IETF effort. Even though - at the time of writing - JSON Schema is still a draft, it already has decent software support - including online validation and support in many languages. Having a JSON schema for your format is a great way to document how you intend the format to be used and it can help you spot certain kinds of errors. (My blog post Ban Unknown Properties! discusses some of the finer points of JSON validation).
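To give a feel for what a schema buys you, here is a hand-rolled sketch that checks a document against a small schema written with four JSON Schema keywords (type, required, properties, additionalProperties). It is only an illustration of the idea: it covers a tiny subset of JSON Schema, the property names are hypothetical, and in practice you would use an existing validator rather than write your own.

```python
import json

# Hypothetical schema for a tiny news item; the property names are made up.
SCHEMA = {
    "type": "object",
    "required": ["headline"],
    "properties": {
        "headline": {"type": "string"},
        "urgency": {"type": "integer"},
    },
    "additionalProperties": False,
}

# Maps the JSON Schema type names used above to Python types.
TYPES = {"object": dict, "string": str, "integer": int}

def validate(instance, schema):
    """Return a list of error strings; an empty list means no problems found."""
    expected = TYPES[schema["type"]]
    # bool is a subclass of int in Python, so reject it explicitly for "integer"
    is_bool_as_int = expected is int and isinstance(instance, bool)
    if not isinstance(instance, expected) or is_bool_as_int:
        return [f"expected {schema['type']}"]
    if schema["type"] != "object":
        return []
    errors = []
    props = schema.get("properties", {})
    for name in schema.get("required", []):
        if name not in instance:
            errors.append(f"missing required property '{name}'")
    for name, value in instance.items():
        if name in props:
            errors.extend(f"{name}: {e}" for e in validate(value, props[name]))
        elif schema.get("additionalProperties") is False:
            errors.append(f"unknown property '{name}'")
    return errors

# A misspelt property name is valid JSON, but the schema catches it.
doc = json.loads('{"headline": "Example", "urgencey": 3}')
print(validate(doc, SCHEMA))  # ["unknown property 'urgencey'"]
```

Note that the misspelling is only caught because the schema sets additionalProperties to false; without it, the unknown property would pass silently.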
Selecting and Querying JSON
One of the fundamentals of XML (and related standards including XSLT and XQuery) is XPath. So, imagine my excitement when I discovered JSONPath, which has the tag line "XPath for JSON". It holds out the promise of a language-independent way to specify properties within a given JSON document. Very handy - and there are a couple of language bindings, already. Unfortunately, it only seems to work for fairly simple expressions - it certainly doesn't have the full power of XPath. And it isn't backed by a standards body or a consortium of companies, so the future path (sic) of JSONPath isn't clear to me.
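To make the idea of a path expression concrete, here is a small sketch that resolves paths such as $.item.places[1].name against a parsed document. It is not a JSONPath implementation: it handles only dotted property names and numeric array indexes, and the function name and the sample document are invented for this example.

```python
import re

# One path step: either ".name" or "[index]"
STEP = re.compile(r"\.([A-Za-z_][A-Za-z0-9_]*)|\[(\d+)\]")

def select(document, path):
    """Follow a simple '$.a.b[0]' style path and return the value it points at."""
    if not path.startswith("$"):
        raise ValueError("path must start with '$'")
    current = document
    pos = 1
    while pos < len(path):
        match = STEP.match(path, pos)
        if match is None:
            raise ValueError(f"unsupported path syntax at position {pos}")
        name, index = match.groups()
        # A name step looks up a property; an index step looks up an array item
        current = current[name] if name is not None else current[int(index)]
        pos = match.end()
    return current

# Hypothetical document
doc = {"item": {"places": [{"name": "Paris"}, {"name": "London"}]}}
print(select(doc, "$.item.places[1].name"))  # London
```

Anything beyond this (wildcards, filters, recursive descent) is where the real libraries earn their keep, and where they start to differ from each other.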
Perhaps more promising is JSONiq, a fully fledged query language for JSON, which claims to be "The SQL of NoSQL". In fact, JSONiq is based very much on XQuery. Again, this is not backed by an independent standards body. It has been implemented on top of some XQuery engines (28.io, zorba.io, IBM's WebSphere and Pascal). However, notably, the major JSON-native engines are not directly supporting it, which means you need to use their proprietary query languages.
And it seems that there is a bit of a Cambrian Explosion going on in this area. Tim Bray recently published his blog post Fat JSON. In part, he illustrates why you need a tool to pick out properties from within a JSON document (basically, some JSON objects contain far more properties than you need for a particular purpose). He discusses one approach - support Partial Responses in your API. That works if you're the author of the API but more likely you're the client of an API or are dealing with a complete JSON document from MongoDB or Elasticsearch or the like.
He points out several attempts to recreate XPath for JSON, which are similar to JSONPath (none of which I have tried yet, but which are all imaginatively called "[jJ][Pp]ath"):
Not to be outdone, Mr. Bray has knocked together JWalk - some Java source code to very simply pick out properties based on their names alone (i.e. not based on parent names or child property values, as you would want from a more full-fat XPath-style library). I suspect that this won't be the last attempt to solve this problem.
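The "names alone" approach is easy to sketch. The following is my own Python illustration of the idea as described above, not a port of JWalk; in particular, whether to keep descending into a value that has already matched is an assumption on my part, and the sample document is made up.

```python
def pick(node, wanted):
    """Yield (name, value) for every property whose name is in `wanted`, at any depth."""
    if isinstance(node, dict):
        for name, value in node.items():
            if name in wanted:
                yield name, value
            # Assumption: keep descending even into values that already matched
            yield from pick(value, wanted)
    elif isinstance(node, list):
        for item in node:
            yield from pick(item, wanted)

# Hypothetical "fat" document with the same property name at several levels
doc = {"id": 1, "user": {"id": 2, "name": "x"}, "entities": [{"id": 3}]}
print(list(pick(doc, {"id"})))  # [('id', 1), ('id', 2), ('id', 3)]
```

The output shows the limitation: all three id values come back, with nothing to say which parent each belonged to.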
JSON Standards
As is probably obvious by now, I'm a big fan of standards. Not just because I've helped to create a few (e.g. MDDL, NewsML-G2, hNews, rNews, RightsML, NINJS) but also because - whenever I'm faced with solving a problem - I think "surely someone has done this before me?". I've found that looking at how someone else has attempted to tackle some domain is very instructive. In the best case, you can simply adopt someone else's hard work, along with documentation, working code and a thriving community who will help to quickly bring you up to speed. Of course, not all prior work is great - the compromises required to create a consensus standard are notorious for producing unwieldy solutions. But, even then, it can be instructive to help you understand what you don't want to do.
Whilst developing IPTC's News in JSON (NINJS), for example, we looked at previous efforts - both public and proprietary - to render articles, blog posts, photos and video using JSON properties. We also researched particular areas that are not directly tied to news. For example, when we were figuring out how to represent place metadata, we found it really helpful to examine the different approaches taken by GeoJSON and Geonames, amongst others. (In the end, rather than pick a winner, we decided to add a "pattern property" into NINJS so that providers could select the JSON geometry representation that best fits their needs).
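A pattern property lets a schema accept a whole family of property names, rather than a fixed list, by matching each name against a regular expression. The sketch below shows the mechanism only: the fixed names, the geometry_ prefix and its regular expression are illustrative assumptions, not quoted from the NINJS schema.

```python
import re

# Illustrative assumptions: neither the fixed names nor the pattern are taken from NINJS
KNOWN = {"name", "rel"}
GEOMETRY = re.compile(r"^geometry_[a-zA-Z0-9_]+$")

def unknown_properties(place):
    """Return property names that are neither fixed names nor match the pattern."""
    return [n for n in place if n not in KNOWN and not GEOMETRY.search(n)]

# Hypothetical place object; GeoJSON coordinates are [longitude, latitude]
place = {
    "name": "Paris",
    "geometry_geojson": {"type": "Point", "coordinates": [2.35, 48.86]},
    "geomtry_wkt": "POINT (2.35 48.86)",  # misspelt prefix, so it does not match
}
print(unknown_properties(place))  # ['geomtry_wkt']
```

A provider can add a new geometry representation under its own suffix without a schema change, while a misspelt name is still flagged.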
A somewhat different type of JSON-related standard is JSON-LD. JSON Linked Data is a way to serialize the RDF data model to and from the JSON format. This W3C Recommendation is an increasingly popular way to structure JSON and is equivalent to the XML and Turtle serializations of RDF. So, if you are fundamentally working with RDF, then you should consider it (however, there are at least some JSON-LD dissenters). If you are not working with the RDF data model, then I would consider whether the additional features and complexity of JSON-LD are going to be a barrier to adoption.
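To show what the extra machinery looks like, here is a small JSON-LD document whose @context maps short property names to IRIs, plus a deliberately naive function that applies that mapping. This is not the JSON-LD expansion algorithm, which does a good deal more (nested contexts, type coercion, value objects, and dropping terms that have no mapping); the document and the function name are made up for illustration.

```python
import json

# Hypothetical JSON-LD document: ordinary JSON plus an "@context"
doc = json.loads("""
{
  "@context": {
    "headline": "http://schema.org/headline",
    "author": "http://schema.org/author"
  },
  "headline": "Example",
  "author": "A. Reporter"
}
""")

def expand_terms(document):
    """Naively replace top-level property names with the IRIs given in @context."""
    context = document.get("@context", {})
    expanded = {}
    for key, value in document.items():
        if key == "@context":
            continue
        # Simplification: names without a mapping are kept as they are
        expanded[context.get(key, key)] = value
    return expanded

print(json.dumps(expand_terms(doc), indent=2))
```

A consumer who ignores the @context still sees a perfectly ordinary JSON object, which is a large part of JSON-LD's appeal; the complexity arrives once you need to honour the context properly.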
As I will discuss in the third and final post in this series, one goal I prize when designing a JSON schema is that simple examples make sense "intuitively". I want them to look sufficiently appealing to, say, a Ruby developer that she decides to use that schema rather than make one up herself.
JSON Design: A Series
Part one discussed an approach to designing JSON schema. Part three will discuss JSON style.