The Origins of JSON-LD

Full Disclosure: I am one of the primary creators of JSON-LD, lead editor on the JSON-LD 1.0 specification, and chair of the JSON-LD Community Group. These are my personal opinions and not the opinions of the W3C, JSON-LD Community Group, or my company.

JSON-LD became an official Web Standard last week. This is after exactly 100 teleconferences typically lasting an hour and a half, fully transparent with text minutes and recorded audio for every call. There were 218+ issues addressed, 2,071+ source code commits, and 3,102+ emails that went through the JSON-LD Community Group. The journey was a fairly smooth one with only a few jarring bumps along the road. The specification is already deployed in production by companies like Google, the BBC, HealthData.gov, Yandex, Yahoo!, and Microsoft. There is a quickly growing list of other companies that are incorporating JSON-LD, but that’s the future. This blog post is more about the past, namely where did JSON-LD come from? Who created it and why?

I love origin stories. When I was in my teens and early twenties, the only origin stories I liked to read about were of the comic and anime variety. Spider-Man, great origin story. Superman, less so, but entertaining. Nausicaä, brilliant. Major Motoko Kusanagi, nuanced. Spawn, dark. Those connections with characters fade over time as you understand that this world has more interesting ones. They are interesting because they touch the lives of billions of people, and since I’m a technologist, some of my favorite origin stories today consist of finding out the personal stories behind how a particular technology came to be. The Web has a particularly riveting origin story. These stories are hard to find because they’re rarely written about, so this is my attempt at documenting how JSON-LD came to be and the handful of people that got it to where it is today.

The Origins of JSON-LD

When you’re asked to draft the press pieces on the launch of new world standards, you have two lists of people in your head. The first is the “all inclusive list”, which is every person that uttered so much as a word that resulted in a change to the specification. That list is typically very long, so you end up saying something like “We’d like to thank all of the people that provided input to the JSON-LD specification, including the JSON-LD Community, RDF Working Group, and individuals who took the time to send in comments and improve the specification.” With that statement, you are sincere and cover all of your bases, but feel like you’re doing an injustice to the people without whom the work would never have survived.

The all inclusive list is very important: those people helped refine the technology to the point that everyone could achieve consensus on it being something that is world class. However, 90% of the back-breaking work to get the specification to the point that everyone else could comment on it is typically undertaken by 4-5 people. It’s a thankless and largely unpaid job, and this is how the Web is built. It’s those people that I’d like to thank while exploring the origins of JSON-LD.

Inception

JSON-LD started around late 2008 as the work on RDFa 1.0 was wrapping up. We were under pressure from Microformats and Microdata, which we were also heavily involved in, to come up with a good way of programming against RDFa data. At around the same time, my company was struggling with the representation of data for the Web Payments work. We had already made the switch to JSON a few years earlier and were storing that data in MySQL, mostly because MongoDB didn’t exist yet. We were having a hard time translating the RDFa we were ingesting (products for sale, pricing information, etc.) into something that worked well in JSON. Meanwhile, Mark Birbeck, one of the creators of RDFa, and I were thinking about making something RDFa-like for JSON. Mark had proposed a syntax for something called RDFj, which I thought had legs, but which Mark didn’t necessarily have the time to pursue.

The Hard Grind

After exchanging a few emails with Mark about the topic over the course of 2009, and letting the idea stew for a while, I wrote up a quick proposal for a specification and passed it by Dave Longley, Digital Bazaar’s CTO. We kicked the idea around a bit more and, in May of 2010, published the first working draft of JSON-LD. While Mark was instrumental in injecting the first set of basic ideas into JSON-LD, Dave Longley would become the key technical mind behind how to make JSON-LD work for web programmers.

At that time, JSON-LD had a pretty big problem. You can represent data in JSON-LD in a myriad of different ways, making it hard to tell if two JSON-LD documents are the same or not. This was an important problem to Digital Bazaar because we were trying to figure out how to create product listings, digital receipts, and contracts using JSON-LD. We had to be able to tell if two product listings were the same, and we had to figure out a way to serialize the data so that products and their associated prices could be listed on the Web in a decentralized way. This meant digital signatures, and you have to be able to create a canonical/normalized form for your data if you want to be able to digitally sign it.
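To make the problem above concrete, here is a minimal sketch in Python. It is not the JSON-LD Graph Normalization algorithm, which works on the underlying graph and is far more involved. It only shows the simplest form of the issue for plain JSON: two documents that carry the same data serialize to different bytes, so a hash (and therefore a signature) over the raw bytes would not match until both are put into one canonical form. The product listing fields and the helper names `canonical_bytes` and `digest` are made up for illustration.

```python
import hashlib
import json


def canonical_bytes(doc):
    """Serialize a JSON-compatible object deterministically.

    Sorting keys and fixing the separators is an illustrative stand-in
    for a real canonicalization step; it handles key order only.
    """
    text = json.dumps(doc, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(doc):
    """Hash the canonical form; a signature would be computed over these bytes."""
    return hashlib.sha256(canonical_bytes(doc)).hexdigest()


# Hypothetical product listings: same data, different key order.
listing_a = {"name": "Widget", "price": "9.99", "currency": "USD"}
listing_b = {"currency": "USD", "price": "9.99", "name": "Widget"}

# Serializing as-is preserves insertion order, so the bytes differ.
same_naive = json.dumps(listing_a).encode("utf-8") == json.dumps(listing_b).encode("utf-8")

# After canonicalization the two listings hash to the same value.
same_canonical = digest(listing_a) == digest(listing_b)

print(same_naive, same_canonical)
```

Key ordering is only the easiest case. JSON-LD documents can also differ in how terms are mapped, how nodes are nested, and how unnamed nodes are labeled, which is why a graph-level normalization was needed rather than a serialization trick like this one.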

Dave Longley invented the JSON-LD API, JSON-LD Framing, and JSON-LD Graph Normalization to tackle these canonicalization/normalization issues and did the first four implementations of the specification in C++, JavaScript, PHP, and Python. The JSON-LD Graph Normalization problem itself took roughly 3 months of concentrated 70+ hour work weeks and dozens of iterations by Dave Longley to produce an algorithm that would work. To this day, I remain convinced that there are only a handful of people on this planet with a mind that is capable of solving those problems. He was the first and only one that cracked those problems. It requires a sort of raw intelligence, persistence, and ability to constantly re-evaluate the problem solving approach you’re undertaking in a way that is exceedingly rare.

Dave and I continued to refine JSON-LD, with him working on the API and me working on the syntax for the better part of 2010 and early 2011. When MongoDB started really taking off in 2010, the final piece just clicked into place. We had the makings of a Linked Data technology stack that would work for web developers.

Toward Stability

Around April 2011, we launched the JSON-LD Community Group and started our public push to try and put the specification on a standards track at the World Wide Web Consortium (W3C). It is at this point that Gregg Kellogg joined us to help refine the rough edges of the specification and provide his input. For those of you that don’t know Gregg, I know of no other person that has done complete implementations of the entire stack of Semantic Web technologies. He has Ruby implementations of quad stores, Turtle, N3, N-Quads, SPARQL engines, RDFa, JSON-LD, etc. If it’s associated with the Semantic Web in any way, he’s probably implemented it. His depth of knowledge of RDF-based technologies is unmatched and he focused that knowledge on JSON-LD to help us hone it to what it is today. Gregg helped us with key concepts, specification editing, implementations, tests, and a variety of input that left its mark on JSON-LD.

Markus Lanthaler also joined us around the same time (2011) that Gregg did. The story of how Markus got involved with the work is probably my favorite way of explaining how the standards process should work. Markus started giving us input while a master’s student at Technische Universität Graz. He didn’t have a background in standards and he didn’t know anything about the W3C process or specification editing; he was as green as one can be with respect to standards creation. We all start where he did, but I don’t know of many people that became as influential as quickly as Markus did.

Markus started by commenting on the specification on the mailing list, then quickly started joining calls. He’d raise issues and track them. He started on his PHP implementation, then started making minor edits to the specifications, then major edits, until he earned our trust to become lead specification editor for the JSON-LD API specification and one of the editors for the JSON-LD Syntax specification. There was no deliberate process we used to make him lead editor; it just sort of happened based on all the hard work he was putting in, which is the way it should be. He went through a growth curve that normally takes most people 5 years in about a year and a half, and it happened exactly how it should happen in a meritocracy. He earned it and impressed us all in the process.

The Final Stretch

Of special mention as well is Niklas Lindström, who joined us starting in 2012 on almost every JSON-LD teleconference and provided key input to the specifications. Aside from being incredibly smart and talented, Niklas is particularly gifted in his ability to find a balanced technical solution that moved the group forward when we found ourselves deadlocked on a particular decision. Paul Kuykendall joined us toward the very end of the JSON-LD work in early 2013 and provided fresh eyes on what we were working on. Aside from being very level-headed, Paul helped us understand what was important to web developers and what wasn’t toward the end of the process. It’s hard to find perspective as work wraps up on a standard, and luckily Paul joined us at exactly the right moment to provide that insight.

There were literally hundreds of people that provided input on the specification throughout the years, and I’m very appreciative of that input. However, without this core of 4-6 people, JSON-LD would have never had a chance. I will never be able to find the words to express how deeply appreciative I am to Dave, Markus, Gregg, Niklas, and Paul, who did the work on a primarily volunteer basis. At this moment in time, the Web is at the core of the way humankind communicates, and the most ardent protectors of this public good create standards to ensure that the Web continues to serve all of us. It boils my blood to then know that they will go largely unrewarded by society for creating something that will benefit hundreds of millions of people, but that’s another post for another time.

The next post in this series tells the story of how JSON-LD was nearly eliminated on several occasions by its critics and proponents while on its journey toward a web standard.
