JSON-LD is the Bee’s Knees

Full disclosure: I’m one of the primary authors and editors of the JSON-LD specification. I am also the chair of the group that created JSON-LD and have been an active participant in a number of Linked Data initiatives: RDFa (chair, author, editor), JSON-LD (chair, co-creator), Microdata (primary opponent), and Microformats (member, haudio and hvideo microformat editor). I’m biased, but also well informed.

JSON-LD has been getting a great deal of good press lately. It was adopted by Google, Yahoo, Yandex, and Microsoft for use in schema.org. The PaySwarm universal payment protocol is based on it. It was also integrated with Google’s Gmail service, and the open social networking folks have started integrating it into the Activity Streams 2.0 work.

The existence of all of these positive adoption stories is precisely why Shane Becker’s post on why JSON-LD is an Unneeded Spec was so surprising. If you haven’t read it yet, you may want to, as the rest of this post will dissect the arguments he makes (it’s a pretty quick 5 minute read). The post is a broad-brush opinion piece based on a number of factual errors and misinformed opinions. I’d like to clear up these errors in this blog post and underscore some of the reasons JSON-LD exists and how it has been developed.

A theatrical interpretation of the “JSON-LD is Unneeded” blog post

Shane starts with this claim:

Today I learned about a proposed spec called JSON-LD. The “LD” is for linked data (Linked Data™ in the Uppercase “S” Semantic Web sense).

When I started writing the original JSON-LD specification, one of the goals was to try and merge lessons learned in the Microformats community with lessons learned during the development of RDFa and Microdata. This meant figuring out a way to marry the lowercase semantic web with the uppercase Semantic Web in a way that was friendly to developers. For developers that didn’t care about the uppercase Semantic Web, JSON-LD would still provide a very useful data structure to program against. In fact, Microformats, which are the poster-child for the lowercase semantic web, were supported by JSON-LD from day one.

Shane’s article is misinformed with respect to the assertion that JSON-LD is solely for the uppercase Semantic Web. JSON-LD is mostly for the lowercase semantic web, the one that developers can use to make their applications exchange and merge data with other applications more easily. JSON-LD is also for the uppercase Semantic Web, the one that researchers and large enterprises are using to build systems like IBM’s Watson supercomputer, search crawlers, Gmail, and open social networking systems.

Linked data. Web sites. Standards. Machine readable.
Cool. All of those sound good to me. But they all sound familiar, like we’ve already done this before. In fact, we have.


We haven’t done something like JSON-LD before. I wish we had because we wouldn’t have had to spend all that time doing research and development to create the technology. When writing about technology, it is important to understand the basics of a technology stack before claiming that we’ve “done this before”. An astute reader will notice that at no point in Shane’s article is any text from the JSON-LD specification quoted, just the very basic introductory material on the landing page of the website. More on this below.

Linked data
That’s just the web, right? I mean, we’ve had the <a href> tag since literally the beginning of HTML / The Web. It’s for linking documents. Documents are a representation of data.

Speaking as someone that has been very involved in the Microformats and RDFa communities, yes, it’s true that the document-based Web can be used to publish Linked Data. The problem is that the standard way of expressing a link to another piece of data that can be followed did not carry over to the data-based Web. That is, most JSON-based APIs don’t have a standard way of encoding a hyperlink.

The other implied assertion with the statement above is that the document-based Web is all we need. If this were true, sending HTML documents to Web applications would be all we needed. Web developers know that this isn’t the case today for a number of obvious reasons. We send JSON data back and forth on the Web when we need to program against things like Facebook, Google, or Twitter’s services. JSON is a very useful data format for machine-to-machine data exchange. The problem is that JSON data has no standard way of doing a variety of things we do on the document-based Web, like expressing links, expressing the types of data (like times and dates), and a variety of other very useful features for the data-based Web. This is one of the problems that JSON-LD addresses.
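To make that gap concrete, here is a minimal Python sketch (standard library only) of the kind of convention JSON-LD standardizes. The `CONTEXT` mapping, the `describe` helper, and the example.org vocabulary URLs are hypothetical names invented for illustration. This is not the JSON-LD processing algorithm, just a rough picture of how a shared context could let a consumer tell a link from a date from a plain string.

```python
import json
from datetime import date

# Hypothetical context: maps each key to a full IRI and says how to read its value.
CONTEXT = {
    "name": {"iri": "http://example.org/vocab#name", "kind": "text"},
    "born": {"iri": "http://example.org/vocab#born", "kind": "date"},
    "spouse": {"iri": "http://example.org/vocab#spouse", "kind": "link"},
}


def describe(raw_json):
    """Return {iri: (kind, value)} for every key the context knows about."""
    data = json.loads(raw_json)
    result = {}
    for key, value in data.items():
        rule = CONTEXT.get(key)
        if rule is None:
            continue  # keys without a rule are skipped in this sketch
        if rule["kind"] == "date":
            value = date.fromisoformat(value)  # turn the string into a real date
        result[rule["iri"]] = (rule["kind"], value)
    return result


plain = (
    '{"name": "John Lennon", "born": "1940-10-09", '
    '"spouse": "http://dbpedia.org/resource/Cynthia_Lennon"}'
)
described = describe(plain)
```

Without an agreed context, every API has to document by hand which strings are links and which are dates. With one, that knowledge travels with the data.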

Web sites
If it’s not wrapped in HTML and viewable in a browser it, is it really a website? JSON isn’t very useful in the browser by itself. It’s not style-able. It’s not very human-readable. And worst of all, it’s not clickable.

Websites are composed of many parts. It’s a weak argument to say that if a site is mainly composed of data that isn’t in HTML, and isn’t viewable in a browser, that it’s not a real website. The vast majority of websites like Twitter and Facebook are composed of data and API calls with a relatively thin varnish of HTML on top. JSON is the primary way that applications interact with these and other data-driven websites. It’s almost guaranteed these days that any company that has a popular API uses JSON in their Web service protocol.

Shane’s argument here is pretty confused. It assumes that the primary use of JSON-LD is to express data in an HTML page. Sure, JSON-LD can do that, but focusing on that brush stroke is missing the big picture. The big picture is that JSON-LD allows applications that use it to share data and interoperate in a way that is not possible with regular JSON, and it’s especially useful when used in conjunction with a Web service or a document-based database like MongoDB or CouchDB.
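As a rough illustration of that interoperability, the Python sketch below merges records from two sources that describe the same thing. It assumes both sources use the same context and identify the subject with the same `@id`. The `merge_by_id` helper and the two sample records are hypothetical, and a real JSON-LD processor would do this through the data model rather than by comparing keys.

```python
def merge_by_id(*documents):
    """Merge JSON objects that share the same "@id" into one record per identifier."""
    merged = {}
    for doc in documents:
        node = merged.setdefault(doc["@id"], {})
        for key, value in doc.items():
            if key not in node:
                node[key] = value
            elif node[key] != value:
                # Sources disagree or add detail: keep every distinct value.
                existing = node[key] if isinstance(node[key], list) else [node[key]]
                if value not in existing:
                    existing.append(value)
                node[key] = existing
    return merged


# Two hypothetical services describing the same person.
from_service_a = {
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
}
from_service_b = {
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "born": "1940-10-09",
}

combined = merge_by_id(from_service_a, from_service_b)
```

With regular JSON, the two services would first have to agree out of band on which field identifies the person. Here the shared identifier does that work.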

Standards based
To their credit, JSON-LD did license their website content Creative Commons CC0 Public Domain. But, the spec itself isn’t. It’s using (what seems to be) a W3C boilerplate copyright / license. Copyright © 2010-2013 W3C® (MIT, ERCIM, Keio, Beihang), All Rights Reserved. W3C liability, trademark and document use rules apply.


Nope. The JSON-LD specification has been released under a Creative Commons Attribution 3.0 license multiple times in the past, and it will be released under a Creative Commons license again, most probably CC0. The JSON-LD specification was developed in a W3C Community Group using a Creative Commons license and then released to be published as a Web standard via W3C using their W3C Community Final Specification Agreement (FSA), which allows the community to fork the specification at any point in time and publish it under a different license.

When you publish a document through the W3C, they have their own copyright, license, and patent policy associated with the document being published. There is a legal process in place at W3C that asserts that companies can implement W3C published standards in a patent and royalty-free way. You don’t get that with CC0, in fact, you don’t get any such vetting of the technology or any level of patent and royalty protection.

What we have with JSON-LD is better than what is proposed in Shane’s blog post. You get all of the benefits of having W3C member companies vet the technology for technical and patent issues while also being able to fork the specification at any point in the future and publish it under a license of your choosing as long as you state where the spec came from.

Machine readable
Ah… “machine readable”. Every couple of years the current trend of what machine readable data should look like changes (XML/JSON, RSS/Atom, xml-rpc/SOAP, rest/WS-*). Every time, there are the same promises. This will solve our problems. It won’t change. It’ll be supported forever. Interoperability. And every time, they break their promises. Today’s empires, tomorrow’s ashes.


At no point has any core designer of JSON-LD claimed 1) that JSON-LD will “solve our problems” (or even your particular problem), 2) that it won’t change, and 3) that it will be supported forever. These are straw-man arguments. The current consensus of the group is that JSON-LD is best suited to a particular class of problems and that some developers will have no need for it. JSON-LD is guaranteed to change in the future to keep pace with what we learn in the field, and we will strive for backward compatibility for features that are widely used. Without modification, standardized technologies have a shelf life of around 10 years, 20-30 if they’re great. The designers of JSON-LD understand that, like the Web, JSON-LD is just another grand experiment. If it’s useful, it’ll stick around for a while, if it isn’t, it’ll fade into history. I know of no great software developer or systems designer that has ever made these three claims and been serious about it.

We do think that JSON-LD will help Web applications interoperate better than they do with plain ol’ JSON. For an explanation of how, there is a nice video introducing JSON-LD.

With respect to the “Today’s empires, tomorrow’s ashes” cynicism, we’ve already seen a preview of the sort of advances that Web-based machine-readable data can unleash. Google, Yahoo!, Microsoft, Yandex, and Facebook all use a variety of machine-readable data technologies that have only recently been standardized. These technologies allow for faster, more accurate, and richer search results. They are also the driving technology for software systems like Watson. These systems exist because there are people plugging away at the hard problem of machine readable data in spite of cynicism directed at past failures. Those failures aren’t ashes, they’re the bedrock of tomorrow’s breakthroughs.

Instead of reinventing the everything (over and over again), let’s use what’s already there and what already works. In the case of linked data on the web, that’s html web pages with clickable links between them.

Microformats, Microdata, and RDFa do not work well for data-based Web services. Using Linked Data with data-based Web services is one of the primary reasons that JSON-LD was created.

For open standards, open license are a deal breaker. No license is more open than Creative Commons CC0 Public Domain + OWFa. (See also the Mozilla wiki about standards/license, for more.) There’s a growing list of standards that are already using CC0+OWFa.

I think there might be a typo here, but if not, I don’t understand why open licenses are a deal breaker for open standards, especially things like the W3C FSA or the Creative Commons licenses we’ve published the JSON-LD spec under. Additionally, CC0 + OWFa might be neat. Shane’s article was the first time that I had heard of OWFa and I’d be a proponent for pushing it in the group if it granted more freedom to the people using and developing JSON-LD than the current set of agreements we have in place. After skimming the legal text of the OWFa, I can’t see what CC0 + OWFa buys us over CC0 + W3C patent attribution. If someone would like to make these benefits clear, I could take a proposal to switch to CC0 + OWFa to the JSON-LD Community Group and see if there is interest in using that license in the future.

No process is more open than a publicly editable wiki.

A counter-point to publicly accessible forums

Publicly editable wikis are notorious for edit wars; they are not a panacea. Just because you have a wiki does not mean you have an open community. For example, the Microformats community was notorious for having a different class of unelected admins that would meet in San Francisco and make decisions about the operation of the community. This seemingly innocuous practice would creep its way into the culture and technical discussion on a regular basis, leading to community members being banned from time to time. Similarly, Wikipedia has had numerous issues with publicly editable wikis and the behavior of their admins.

Depending on how you define “open”, there are a number of processes that are far more open than a publicly editable wiki. For example, the JSON-LD specification development process is completely open to the public, based on meritocracy, and is consensus-driven. The mailing list is open. The bug tracker is open. We have weekly design teleconferences where all the audio is recorded and minuted. We have these teleconferences to this day and will continue to have them into the future because we make transparency a priority. JSON-LD, as far as I know, is the first such specification in the world developed where all the previously described operating guidelines are standard practice.

(Mailing lists are toxic.)

A community is as toxic as its organizational structure enables it to be. The JSON-LD community is based on meritocracy and consensus, and it has operated in a very transparent manner since the beginning (open meetings, all calls are recorded and minuted, anyone can contribute to the spec, etc.). This has, unsurprisingly, resulted in a very pleasant and supportive community. That said, there is no perfect communication medium. They’re all lossy and they all have their benefits and drawbacks. Sometimes, when you combine multiple communication channels as a part of how your community operates, you get better outcomes.

Finally, for machine readable data, nothing has been more widely adopted by publishers and consumers than microformats. As of June 2012, microformats represents about 70% of all of the structured data on the web. And of that ~70%, the vast majority was h-card and xfn. (All RDFa is about 25% and microdata is a distant third.)

Microformats are good if all you need to do is publish your basic contact and social information on the Web. If you want to publish detailed product information, financial data, medical data, or address other more complex scenarios, Microformats won’t help you. There have been no new Microformats released in the last 5 years and the mailing list traffic has been almost non-existent for around 5 years. From what I can tell, most everyone has moved on to RDFa, Microdata, or JSON-LD.

There are a few that are working on Microformats 2, but I haven’t seen anything that it provides that is not already provided by existing solutions that also have the added benefit of being W3C standards or backed by major companies like Google, Facebook, Yahoo!, Microsoft, and Yandex.

Maybe it’s because of the ease of publishing microformats. Maybe it’s the open process for developing the standards. Maybe it’s because microformats don’t require any additions to HTML. (Both RDFa and microdata required the use of additional attributes or XML namespaces.) Whatever the reason, microformats has the most uptake. So, why do people keep trying to reinvent what microformats is already doing well?

People aren’t reinventing what Microformats are already doing well; they’re attempting to address problems that Microformats do not solve.

For example, one of the reasons that Google adopted JSON-LD is because markup was much easier in JSON-LD than it was in Microformats, as evidenced by the example below:

Back to JSON-LD. The “Simple Example” listed on the homepage is a person object representing John Lennon. His birthday and wife are also listed on the object.

        {
          "@context": "http://json-ld.org/contexts/person.jsonld",
          "@id": "http://dbpedia.org/resource/John_Lennon",
          "name": "John Lennon",
          "born": "1940-10-09",
          "spouse": "http://dbpedia.org/resource/Cynthia_Lennon"
        }

I look at this and see what should have been HTML with microformats (h-card and xfn). This is actually a perfect use case for h-card and xfn: a person and their relationship to another person. Here’s how it could’ve been marked up instead.

        <div class="h-card">
          <a href="http://dbpedia.org/resource/John_Lennon" class="u-url u-uid p-name">John Lennon</a>
          <time class="dt-bday" datetime="1940-10-09">October 9<sup>th</sup>, 1940</time>
          <a rel="spouse" href="http://dbpedia.org/resource/Cynthia_Lennon">Cynthia Lennon</a>.
        </div>

I’m willing to bet that most people familiar with JSON will find the JSON-LD markup far easier to understand and get right than the Microformats-based equivalent. In addition, sending the Microformats markup to a REST-based Web service would be very strange. Alternatively, sending the JSON-LD markup to a REST-based Web service would be far more natural for a modern day Web developer.
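For the sake of illustration, here is a minimal Python sketch of what sending that JSON-LD document to a REST-based Web service could look like, using only the standard library. The `https://api.example.com/people` endpoint is hypothetical, and the request is built but deliberately not sent. `application/ld+json` is the media type registered for JSON-LD.

```python
import json
import urllib.request

# The John Lennon example from above, as a plain Python dictionary.
person = {
    "@context": "http://json-ld.org/contexts/person.jsonld",
    "@id": "http://dbpedia.org/resource/John_Lennon",
    "name": "John Lennon",
    "born": "1940-10-09",
    "spouse": "http://dbpedia.org/resource/Cynthia_Lennon",
}

# Hypothetical endpoint; the original post does not name any such service.
request = urllib.request.Request(
    "https://api.example.com/people",
    data=json.dumps(person).encode("utf-8"),
    headers={"Content-Type": "application/ld+json"},
    method="POST",
)

# urllib.request.urlopen(request) would send it; omitted so the sketch runs offline.
```

The request body is ordinary JSON, so nothing about the client changes beyond the media type. Doing the same with the Microformats markup would mean shipping an HTML fragment to the service and parsing it on the other side.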

This HTML can be easily understood by machine parsers and humans parsers. Microformats 2 parsers already exists for: JavaScript (in the browser), Node.js, PHP and Ruby. HTML + microformats2 means that machines can read your linked data from your website and so can humans. It means that you don’t need an “API” that is something other than your website.

You have been able to do the same thing, and much more, using RDFa and Microdata for far longer (since 2006) than you have been able to do it in Microformats 2. Let’s be clear, there is no significant advantage to using Microformats 2 over RDFa or Microdata. In fact, there are a number of disadvantages for using Microformats 2 at this point, like little to no support from the search companies, very little software tooling, and an anemic community (of which I am a member) for starters. Additionally, HTML + Microformats 2 does not address the Web service API issue at all.

Please don’t waste time and energy reinventing all of the wheels. Instead, please use what already works and what works the webby way.


Do not miss the irony of this statement. RDFa has been doing what Microformats 2 does today since 2006, and it’s a Web standard. Even if you don’t like RDFa 1.0, RDFa 1.1, RDFa Lite 1.1, and Microdata all came before Microformats 2. To assert that wheels should not be reinvented and then promote Microformats 2, which was created long after there were already a number of well-established solutions, is quite a strange position to take.

Conclusion

JSON-LD was created by people that have been directly involved in the Linked Data, lowercase semantic web, uppercase Semantic Web, Microformats, Microdata, and RDFa work. It has proven to be useful to them. There are a number of very large technology companies that have adopted JSON-LD, further underscoring its utility. Expect more big announcements in the next six months. The JSON-LD specifications have been developed in a radically open and transparent way, and the document copyright and licensing provisions are equally open. I hope that this blog post has helped clarify most of the misinformed opinions in Shane Becker’s blog post.

Most importantly, cynicism will not solve the problems that we face on the Web today. Hard work will, and there are very few communities that I know of that work harder and more harmoniously than the excellent volunteers in the JSON-LD community.

If you would like to learn more about Linked Data, a good video introduction exists. If you want to learn more about JSON-LD, there is a good video introduction to that as well.

10 Comments


  1. Thanks for taking the time to write this rebuttal, Manu. While Mr. Becker’s post may not have actually ended up dissuading anyone from employing JSON-LD, these same sorts of arguments are used by others trying to discredit the utility of structured and/or linked data protocols, so it’s good to see them addressed so thoroughly.

    That the ubiquity of microformats proves their continued utility is a facile conclusion. That urban transport in 1900 predominantly consisted of horses didn’t mean that motor vehicles were superfluous (though I’ve little doubt that certain detractors then, too, asked “why do people keep trying to reinvent what horses already do well?”).

    As you point out, no new microformats have been released in some time. The core of this fact – that is, that microformats must be “released” – shows the utility of RDFa and microdata in itself. These more modern methods of marking up structured data are much more versatile than microformats because they are schema-independent, and JSON-LD takes this one step further.

    It’s convenient that Mr. Becker used the John Lennon example in support of his arguments. I don’t think a microformat would perform so well where a web page described the causes of a particular medical condition, or – heavens forbid – dynamically described an action like someone ordering a book.

  2. H.E.A.T. says:

    When you are personally involved in the development of a product, stepping back and listening to critics can be difficult. I am reading more into Mr. Becker’s post than just the written words. I will tie in what I think I understand with what I know I feel.

    JSON-LD is just another tool under the umbrella of what is an undefined web — the Semantic Web (SW). There are so many tools (technology, so to speak) claiming association with the SW. Here is the problem: who is using any of them in the trenches?

    Auto companies market cars to common people. Phone companies market mobile devices to common people. Contrary to this “theory”, the W3C markets SW to corporations — the few.

    There are more developers operating outside of corporations that, if the W3C marketed this tech to them, maybe, just maybe, all this tech would make sense to the common people. As it stands today, the W3C is riding the fence with RDFa and Microdata — which one does the W3C truly support to be part of the SW? Mr. Sporny, you are the only one defending JSON-LD while the W3C continues to kowtow to the WHATWG and schema.org.

    Maybe if the W3C would roll up its sleeves, put on some boxing gloves, use their massive intellect and governance, and let all members of their once mighty and irreproachable consortium know where the web needs to go and lay out the pathway that WILL be followed, then maybe the web community can jump on board and make sense of it all.

    Will the W3C shelve JSON-LD like XHTML 2 when a renegade faction decides to pick up their ball and go home? Will the W3C shelve RDFa 1.1 (and any derivatives) in favor of Microdata when Google decides to take over the web?

    How can anyone trust any new specification coming out of the W3C when they are allowing the WHATWG to use fancy wording to disguise purely presentational tags in HTML5 (you know, B and I)? Is this same level of politics at play with JSON-LD?

    I am not questioning the value of JSON-LD nor am I criticizing your stance to passionately defend a product you helped build, but the past actions of the W3C are smearing distrust over your cause. Should not the W3C be defending and marketing this product with the same passion?

    Mr. Becker sounds frustrated to me. He sounds as if he is tired of this smorgasbord of semantic or linked data tech constantly spewing out of the W3C with no understandable link to each other. Touting that corporations are using the tech does not matter to the common developer — you know, the one building a website for his grandmother or putting together a recipe-sharing site for her Aunt Sally.

    There are more folks like this than there are corporations. They could benefit from this tech, if the W3C would market it to them. However, when I am constantly reading from the W3C’s blog that corporations are using this and that, I say, “Well, they are not talking to me so I guess this tech is not for me.”

    Sometimes, being defensive can distort the listening process. If you were to take a second (or third) read of Mr. Becker’s post from a perspective that you want to push JSON-LD to the common people, maybe you will have more empathy toward his passion.

    • ManuSporny says: (Author)

      Here is the problem: who is using any of them in the trenches?

      Here are the people using JSON-LD today: The Web Payments community (PaySwarm specs), Gmail, the Activity Streams 2.0 community, and schema.org. It’s still early days, but Web developers are using the technology in experimental projects that are meant to be broadly deployed… used by “common people” as you put it.

      let all members of their once mighty and irreproachable consortium know where the web needs to go and lay out the pathway that WILL be followed

      The W3C has never worked like this and never will. This isn’t consensus you’re asking for, it’s dictatorial rule, which will have a largely negative effect on the Web.

      Should not the W3C be defending and marketing this product with the same passion?

      Since when has the W3C defended or marketed any product in that manner? :)

      Mr. Becker sounds frustrated to me.

      Oh, I definitely agree with that, and I understand his frustration. However, he’s taking it out on the wrong community and specification. We’re actively working to try and fix some of the issues he complains about in his post. He’s criticizing the very people that are trying to fix some of the problems he raises.

      If you were to take a second (or third) read of Mr. Becker’s post from a perspective that you want to push JSON-LD to the common people, maybe you will have more empathy toward his passion.

      I do have empathy toward what he might be feeling. However, I disagree with the approach he took, which was a misinformed rant against a technology that isn’t trying to do many of the things he is railing against in his blog post. Sometimes the only thing you can do is just deconstruct the entire argument to show people the actual reality vs. the one that is falsely being presented as reality.

      • H.E.A.T. says:

        Consensus, by definition, is based on reaching a general agreement on a matter, usually by some kind of voting process. I am not expecting the W3C to operate as a dictatorship in the web community, but to stand by their decision after reaching such a consensus.

        Once the W3C decided to go the course of HTML5, they committed themselves to an all-out campaign to show their full support for the tech. Believe me, I spend a lot of time over their site reading their blogs, specifications, and things out of their working groups. At the beginning, HTML5 was being touted as the end all to the be all. I would read articles with Tim Berners-Lee himself selling HTML5 as the next step for the web.

        Again, this is not the problem I am having.

        First, let me say something. Sometimes, frustrations are taken out on others because these others are showing they give a doggone. I can see you give a care about RDFa and JSON-LD regardless of your part in their development. I feel you believe that, at present, they are the best tech for their intended purpose. My comments are not in contrast to the efficacy of this tech.

        Mr. Becker sounds as though he is frustrated at the level of uncertainty spewing out from the W3C. I remember reading an article on your blog about your frustration with the W3C thinking about moving the Microdata specification forward. If members like yourself, who are working to produce this tech, cannot rely on it being fully supported, then what’s the point? This is why I can empathize with Mr. Becker.

        Regardless of the amount of misinformation or misdirected frustration from Mr. Becker, the point is that it exists. He may be one of the people writing about it, but more people are thinking it. I remember reading on the W3C’s own blog about a gentleman wanting the W3C to disband, or for some other organization to take over the reins. This demonstrates that there are some people who want to see leadership–true leadership–from the W3C.

        I cannot speak for Mr. Becker, so all my references to his article are based on my assessments. I guess I affiliated myself with the underlying emotions of his article because I have similar frustrations. I use XML, XHTML 1.1, SVG, and RDFa because I believe in structure–clear structure. I guess I am expecting the same out of the consortium that developed that tech.

  3. Matt Yoing says:

    json-ld introduces key words into json, using the @ operator.

    If instead we introduced key words as names without an operator, then you get javascript:

    var context: {etc}

    Skip the quotes in json, and the result is javascript itself as active data, and javascript is executed from a json expression graph.

    This is the solution we are headed for, because any database that executes a join of two json graphs will be very close to a javascript interpreter. The web community will see the benefit of active data, and the show is over. We end up executing javascript out of a database storing an expression tree.

    • ManuSporny says: (Author)

      That is one potential future, yes. I don’t quite know how we get there, but JSON-LD may be a stepping stone in that direction. I think you’re getting at the notion of a programmable graph, where the data isn’t static, but it lives (for lack of a better term).

  4. Colin Maudry says:

    I am Product documentation analyst at NXP semiconductors, an organization that could fall in the category of “the few”.
    Though we don’t use JSON-LD yet, linked data technologies enabled us to deploy better tools, faster than we could dream. This versatility had a positive effect at multiple levels:
    – costs reduced = our bosses, common people ignorant of what linked data is, were happy
    – more focus on core activities = common employees, who manipulate data everyday, felt more useful. They have a rough idea of what XML is.
    – more data published = external collaborators, who just want some tailored CSV to fill their product data base, get it in a few days vs. a few weeks previously.

    The point is that I think the common people don’t want to know whether data handlers use microformats or linked data compliant formats: they want more features, better features, faster features. And I can tell that the openness and the versatility of linked data technologies make it much easier for me to deliver these features.

    Thanks to Manu Sporny and all the others who decided to work for an open Web.

  5. Colin Maudry says:

    Now that I think of it, my colleague John Walker published an insightful blog post about our plans regarding Linked data:
    http://blog.nxp.com/is-linked-data-the-future-of-data-integration-in-the-enterprise/

    This was more than half a year ago and we got a lot of questions from other semiconductor companies, because we plan to solve one of their biggest problems: how to deal with data when there is a lot, when it is complex and when it changes often. We have made good progress in that direction, and if it goes well for us, it will make a lot of people’s lives easier.
