Full Disclosure: I am one of the primary creators of JSON-LD, lead editor on the JSON-LD 1.0 specification, and chair of the JSON-LD Community Group. This is an opinionated piece about JSON-LD. A number of people in this space don’t agree with my viewpoints. My statements should not be construed as official statements from the JSON-LD Community Group, W3C, or Digital Bazaar (my company) in any way, shape, or form. I’m pretty harsh about the technologies covered in this article and want to be very clear that I’m attacking the technologies, not the people that created them. I think most of the people that created and promote them are swell and I like them a lot, save for a few misguided souls, who are loveable and consistently wrong.
JSON-LD became an official Web Standard last week. This is after exactly 100 teleconferences typically lasting an hour and a half, fully transparent with text minutes and recorded audio for every call. There were 218+ issues addressed, 2,000+ source code commits, and 3,102+ emails that went through the JSON-LD Community Group. The journey was a fairly smooth one with only a few jarring bumps along the road. The specification is already deployed in production by companies like Google, the BBC, HealthData.gov, Yandex, Yahoo!, and Microsoft. There is a quickly growing list of other companies that are incorporating JSON-LD. We’re off to a good start.
In the previous blog post, I detailed the key people that brought JSON-LD to where it is today and gave a rough timeline of the creation of JSON-LD. In this post I’m going to outline the key decisions we made that made JSON-LD stand out from the rest of the technologies in this space.
I’ve heard many people say that JSON-LD is primarily about the Semantic Web, but I disagree, it’s not about that at all. JSON-LD was created for Web Developers that are working with data that is important to other people and must interoperate across the Web. The Semantic Web was near the bottom of my list of “things to care about” when working on JSON-LD, and anyone that tells you otherwise is wrong.
TL;DR: The desire for better Web APIs is what motivated the creation of JSON-LD, not the Semantic Web. If you want to make the Semantic Web a reality, stop making the case for it and spend your time doing something more useful, like actually making machines smarter or helping people publish data in a way that’s useful to them.
If you don’t know what JSON-LD is and you want to find out why it is useful, check out this video on Linked Data and this one on an Introduction to JSON-LD. The rest of this post outlines the things that make JSON-LD different from the traditional Semantic Web / Linked Data stack of technologies and why we decided to design it the way that we did.
Decision 1: Decrypt the Cryptic
Many W3C specifications are so cryptic that they require the sacrifice of your sanity and a secret W3C decoder ring to read. I never understood why these documents were so difficult to read, and after years of study on the matter, I think I found the answer. It turns out that most specification editors are just crap at writing.
It’s not like many of the things that are in most W3C specifications are complicated, it’s just that the editor is bad at explaining them to non-implementers, which are most of the web developers that end up reading these specification documents. This approach is often defended by raising the point that readability of the specification by non-implementers is viewed as secondary to its technical accuracy for implementers. The audience is the implementer, and you are expected to cater to them. To counter that point, though, we all know that technical accuracy is a bad excuse for crap writing. You can write something that is easy to understand and technically accurate, it just takes more effort to do that. Knowing your audience helps.
We tried our best to eliminate complex techno-babble from the JSON-LD specification. I made it a point to not mention RDF at all in the JSON-LD 1.0 specification because you didn’t need to go off and read about it to understand what was going on in JSON-LD. There was tremendous push back on this point, which I’ll go into later, but the point is that we wanted to communicate at a more conversational level than typical Internet and Web specifications because being pedantic too early in the spec sets the wrong tone.
It didn’t always work, but it certainly did set the tone we wanted for the community, which was that this Linked Data stuff didn’t have to seem so damn complicated. The JSON-LD 1.0 specification starts out by primarily using examples to introduce key concepts. It starts at basics, assuming that the audience is a web developer with modest training, and builds its way up slowly into more advanced topics. The first 70% of the specification contains barely any normative/conformance language, but after reading it, you know what JSON-LD can do. You can look at the section on the JSON-LD Context to get an idea of what this looks like in practice.
This approach wasn’t a wild success. Reading sections of the specification that have undergone feedback from more nitpicky readers still make me cringe because ease of understanding has been sacrificed at the alter of pedantic technical accuracy. However, I don’t feel embarrassed to point web developers to a section of the specification when they ask for an introduction to a particular feature of JSON-LD. There are not many specifications where you can do that.
Decision 2: Radical Transparency
One of the things that has always bothered me about W3C Working Groups is that you have to either be an expert to participate, or you have to be a member of the W3C, which can cost a non-trivial amount of money. This results in your typical web developer being able to comment on a specification, but not really having the ability to influence a Working Group decision with a vote. It also hobbles the standards-making community because the barrier to entry is perceived as impossibly high. Don’t get me wrong, the W3C staff does as much as they can to drive inclusion and they do a damn good job at it, but that doesn’t stop some of their member companies from being total dicks behind closed door sessions.
The W3C is a consortium of mostly for-profit companies and they have things they care about like market share, quarterly profits, and drowning goats (kidding!)… except for GoatCoats.com, anyone can join as long as you pay the membership dues! My point is that because there is a lack of transparency at times, it makes even the best Working Group less responsive to the general public, and that harms the public good. These closed door rules are there so that large companies can say certain things without triggering a lawsuit, which is sometimes used for good but typically results in companies being jerks and nobody finding out about it.
So, in 2010 we kicked off the JSON-LD work by making it radically open and we fought for that openness every step of the way. Anyone can join the group, anyone can vote on decisions, anyone can join the teleconferences, there are no closed door sessions, and we record the audio of every meeting. We successfully kept the technical work on the specification this open from the beginning to the release of JSON-LD 1.0 web standard a week ago. People came and went from the group over the years, but anyone could participate at any level and that was probably the thing I’m most proud of regarding the process that was used to create JSON-LD. Had we not have been this open, Markus Lanthaler may have never gone from being a gifted student in Italy to editor of the JSON-LD API specification and now leader of the Hypermedia Driven Web APIs community. We also may never have had the community backing to do some of the things we did in JSON-LD, like kicking RDF in the nuts.
Decision 3: Kick RDF in the Nuts
RDF is a shitty data model. It doesn’t have native support for lists. LISTS for fuck’s sake! The key data structure that’s used by almost every programmer on this planet and RDF starts out by giving developers a big fat middle finger in that area. Blank nodes are an abomination that we need, but they are applied inconsistently in the RDF data model (you can use them in some places, but not others). When we started with JSON-LD, RDF didn’t have native graph support either. For all the “RDF data model is elegant” arguments we’ve seen over the past decade, there are just as many reasons to kick it to the curb. This is exactly what we did when we created JSON-LD, and that really pissed off a number of people that had been working on RDF for over a decade.
I personally wanted JSON-LD to be compatible with RDF, but that’s about it. You could convert JSON-LD to and from RDF and get something useful, but JSON-LD had a more sane data model where lists were a first-class construct, you had generalized graphs, and you could use JSON-LD using a simple library and standard JSON tooling. To put that in perspective, to work with RDF you typically needed a quad store, a SPARQL engine, and some hefty libraries. Your standard web developer has no interest in that toolchain because it adds more complexity to the solution than is necessary.
So screw it, we thought, let’s create a graph data model that looks and feels like JSON, RDF and the Semantic Web be damned. That’s exactly what we did and it was working out pretty well until…
Decision 4: Work with the RDF Working Group. Whut?!
Around mid-2012, the JSON-LD stuff was going pretty well and the newly chartered RDF Working Group was going to start work on RDF 1.1. One of the work items was a serialization of RDF for JSON. The lead solutions for RDF in JSON were things like the aptly named RDF/JSON and JTriples, both of which would look incredibly foreign to web developers and continue the narrative that the Semantic Web community creates esoteric solutions to non-problems. The biggest problem being that many of the participants in the RDF Working Group at the time didn’t understand JSON.
The JSON-LD group decided to weigh in on the topic by pointing the RDF WG to JSON-LD as an example of what was needed to convince people that this whole Linked Data thing could be useful to web developers. I remember the discussions getting very heated over multiple months, and at times, thinking that the worst thing we could do to JSON-LD was to hand it over to the RDF Working Group for standardization.
It is at that point that David Wood, one of the chairs of the RDF Working Group, phoned me up to try and convince me that it would be a good idea to standardize the work through the RDF WG. I was very skeptical because there were people in the RDF Working Group who drove some of thinking that I had grown to see as toxic to the whole Linked Data / Semantic Web movement. I trusted Dave Wood, though. I had never seen him get religiously zealous about RDF like some of the others in the group and he seemed to be convinced that we could get JSON-LD through without ruining it. To Dave’s credit, he was mostly right.
Decision 5: Hate the Semantic Web
It’s not that the RDF Working Group was populated by people that are incompetent, or that I didn’t personally like. I’ve worked with many of them for years, and most of them are very intelligent, capable, gifted people. The problem with getting a room full of smart people together is that the group’s world view gets skewed. There are many reasons that a working group filled with experts don’t consistently produce great results. For example, many of the participants can be humble about their knowledge so they tend to think that a good chunk of the people that will be using their technology will be just as enlightened. Bad feature ideas can be argued for months and rationalized because smart people, lacking any sort of compelling real world data, are great at debating and rationalizing bad decisions.
I don’t want people to get the impression that there was or is any sort of animosity in the Linked Data / Semantic Web community because, as far as I can tell, there isn’t. Everyone wants to see this stuff succeed and we all have our reasons and approaches.
That said, after 7+ years of being involved with Semantic Web / Linked Data, our company has never had a need for a quad store, RDF/XML, N3, NTriples, TURTLE, or SPARQL. When you chair standards groups that kick out “Semantic Web” standards, but even your company can’t stomach the technologies involved, something is wrong. That’s why my personal approach with JSON-LD just happened to be burning most of the Semantic Web technology stack (TURTLE/SPARQL/Quad Stores) to the ground and starting over. It’s not a strategy that works for everyone, but it’s the only one that worked for us, and the only way we could think of jarring the more traditional Semantic Web community out of its complacency.
I hate the narrative of the Semantic Web because the focus has been on the wrong set of things for a long time. That community, who I have been consciously distancing myself from for a few years now, is schizophrenic in its direction. Precious time is spent in groups discussing how we can query all this Big Data that is sure to be published via RDF instead of figuring out a way of making it easy to publish that data on the Web by leveraging common practices in use today. Too much time is spent assuming a future that’s not going to unfold in the way that we expect it to. That’s not to say that TURTLE, SPARQL, and Quad stores don’t have their place, but I always struggle to point to a typical startup that has decided to base their product line on that technology (versus ones that choose MongoDB and JSON on a regular basis).
I like JSON-LD because it’s based on technology that most web developers use today. It helps people solve interesting distributed problems without buying into any grand vision. It helps you get to the “adjacent possible” instead of having to wait for a mirage to solidify.
Decision 6: Believe in Consensus
All this said, you can’t hope to achieve anything by standing on idealism alone and I do admit that some of what I say above is idealistic. At some point you have to deal with reality, and that reality is that there are just as many things that the RDF and Semantic Web initiative got right as it got wrong. The RDF data model is shitty, but because of the gauntlet thrown down by JSON-LD and a number of like-minded proponents in the RDF Working Group, the RDF Data Model was extended in a way that made it compatible with JSON-LD. As a result, the gap between the RDF model and the JSON-LD model narrowed to the point that it became acceptable to more-or-less base JSON-LD off of the RDF model. It took months to do the alignment, but it was consensus at its best. Nobody was happy with the result, but we could all live with it.
To this day I assert that we could rip the data model section out of the JSON-LD specification and it wouldn’t really affect the people using JSON-LD in any significant way. That’s consensus for you. The section is in there because other people wanted it in there and because the people that didn’t want it in there could very well have turned out to be wrong. That’s really the beauty of the W3C and IETF process. It allows people that have seemingly opposite world views to create specifications that are capable of supporting both world views in awkward but acceptable ways.
JSON-LD is a product of consensus. Nobody agrees on everything in there, but it all sticks together pretty well. There being a consensus on consensus is what makes the W3C, IETF, and thus the Web and the Internet work. Through all of the fits and starts, permathreads, pedantry, idealism, and deadlock, the way it brings people together to build this thing we call the Web is beautiful thing.
I’d like to thank the W3C staff that were involved in getting JSON-LD to offical Web standard status (and the staff, in general, for being really fantastic people). Specifically, Ivan Herman for simultaneously pointing out all of the problems that lay in the road ahead while also providing ways to deal with each one as we came upon them. Sandro Hawke for pushing back against JSON-LD, but always offering suggestions about how we could move forward. I actually think he may have ended up liking JSON-LD in the end . Doug Schepers and Ian Jacobs for fighting for W3C Community Groups, without which JSON-LD would not have been able to plead the case for Web developers. The systems team and publishing team who are unknown to most of you, but work tirelessly to ensure that everything continues to operate, be published, and improve at W3C.
From the RDF Working group, the chairs (David Wood and Guus Schreiber), for giving JSON-LD a chance and ensuring that it got a fair shake. Richard Cyganiak for pushing us to get rid of microsyntaxes and working with us to try and align JSON-LD with RDF. Kingsley Idehen for being the first external implementer of JSON-LD after we had just finished scribbling the first design down on paper and tirelessly dogfooding what he preaches. Nobody does it better. The rest of the RDF Working Group members without which JSON-LD would have escaped unscathed from your influence, making my life a hell of a lot easier, but leaving JSON-LD and the people that use it in a worse situation had you not been involved.