Linked JSON: RDF for the Masses

There are times when we can see ourselves doing things that will be successful, and then there are times when we can see ourselves screwing it all up. I’ve just witnessed the latter in the RDF Working Group at the World Wide Web Consortium and thought that it may help to do a post-mortem on what went wrong. This was a social failure, not a technical one. Unlike technical failures, social failures are so much more complicated – so, let’s see if we can find out what went wrong.

Background

I spend a great deal of my time trying to convince technology leaders at large companies like Google, the New York Times, Facebook, Twitter, Sony, Universal and Warner Brothers to choose a common path forward that will help the Web flourish. Most of that time is spent at the World Wide Web Consortium (W3C), in standards working groups, trying to predict and build the future of the Web. I’m currently the Chair of the RDF Web Applications Working Group, formerly known as the RDFa Working Group. My participation covers many different working groups at the W3C: RDFa, HTML5, RDF, WebID, Web Apps, Social Web, Semantic Web Coordination, and a few others. The hope is that all of these groups are building technologies that will actually make all of our lives easier – especially for those who create and build the Web.

The Pull of Linked Data

There is a big push on the Web right now to publish data in an interoperable way. RDFa is a good example of this new push to get as much Linked Data out there as possible. Our latest work in the RDF Working Group was to try to find a way to bring Linked Data to JSON. That is, we were given the task of figuring out a way to get companies like Google, Yahoo!, The New York Times, Facebook and Twitter to publish their data in a standards-compliant format that the rest of the world could use. We’ve already convinced some of these large companies to publish their data in RDFa. This was a huge win for the Web, but it was only a fraction of the interesting data out there. The rest of it is locked up in Web Services – in volumes of JSON data that are passed back and forth via JSON-REST APIs every day.

Wouldn’t it be great if we had something like RDFa for JSON? A way for a standard software stack to extract globally meaningful objects from Web Services? In fact, that is what JSON-LD was designed to do. There are also a number of other JSON formats that could be read not only as JSON, but also as RDF. If we could get the world to start publishing their JSON data as Linked Data, we would have more transparency and more interoperable systems. The rate at which we re-use data from other JSON-based systems would grow by leaps and bounds.
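
To make the idea concrete, here is a minimal sketch in Python of what "reading JSON as Linked Data" could look like. It is an illustration only, not the JSON-LD processing algorithm and not the syntax of any JSON-LD draft: the `context` and `id` keys, the `to_triples` helper, and the example.org URLs are all assumptions made up for this example.

```python
import json

# A plain Web Service response plus a hypothetical "context" that maps
# short, developer-friendly keys to globally unique identifiers (IRIs).
# The "context" and "id" key names are illustrative assumptions.
DOCUMENT = """
{
  "context": {
    "name": "http://xmlns.com/foaf/0.1/name",
    "homepage": "http://xmlns.com/foaf/0.1/homepage"
  },
  "id": "http://example.org/people#jane",
  "name": "Jane Doe",
  "homepage": "http://example.org/jane/"
}
"""


def to_triples(doc):
    """Return (subject, predicate, object) tuples for every mapped key."""
    context = doc.get("context", {})
    subject = doc["id"]
    triples = []
    for key, value in doc.items():
        if key in ("context", "id"):
            continue  # bookkeeping keys, not data
        if key in context:  # keys without a mapping are ignored
            triples.append((subject, context[key], value))
    return triples


data = json.loads(DOCUMENT)

# A Web developer can keep using the object as ordinary JSON...
print(data["name"])

# ...while generic software can read the same document as a graph.
for triple in to_triples(data):
    print(triple)
```

The point of the sketch is that the same document serves both audiences: `data["name"]` works with nothing but a JSON parser, while the mapping gives each key a meaning that holds across sites.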

This is what the charge of the RDF Working Group was, and at the Face-to-Face meeting a little over a week ago, we failed miserably to deliver on that promise.

Failure Timeline

Here is a quick run-down of what happened:

  • March 2010: Work starts on JSON-LD – focusing on an easy-to-use, stripped down version of Linked Data for Web Developers. The work builds on previous work done by lots of smart people across the Web.
  • Summer 2010: A W3C RDF Workshop finds that there is a deep desire in the community for a JSON-based RDF format.
  • January 2011: The RDF Working Group starts up and begins to analyze 10 different RDF-in-JSON format proposals. There is general confusion in the group as to the exact community we’re attempting to address. Some think it’s people who are already using RDF/Graph Stores and SPARQL, others believe we are attempting to bring independent Web Developers into the world of Linked Data. I was of the latter mindset – we don’t need to convince people who are already using RDF to keep using RDF.
  • March 2011: Arguments continue about what features of JSON we’ll use and whether or not we are just creating another triple-based serialization for RDF, or if we are creating an easier to use form of Linked Data in JSON.
  • April 2011: At the RDF Face-to-Face, a show of hands decides to place the JSON work intended for independent Web Developers on the back burner for a year or more. The reason was that there was no consensus that we were solving a problem that needed to be solved.

Before I get into what went wrong, I don’t intend any of this to be bashing anyone in the RDF Working Group. They’re all good people who want to do good things for the Web. Many of them have put years of work into RDF – they want to see it succeed. They are also very smart people – they are the world’s leading experts in this stuff. There was no politics or back-room dealing. The criticism is more about the group dynamic – why we failed to deliver what some of us saw as our primary directive in the group.

What Went Wrong?

How did we go from knowing that people wanted to get Linked Data out of JSON to deciding to back-burner the work on providing just that to the people that build the Web? I pondered what went wrong for about a week and came up with the following list:

  • I failed to gather support and evidence that people wanted to get Linked Data out of JSON. I place most of the blame on myself for not educating the group before the decision needed to be made. I wouldn’t be saying this if the vote had been close, but when it came time to show who supported the work – out of a group of 20-some-odd people, only two raised their hands. One of those people was me. I should have spent more time getting the larger companies to weigh in on the matter. I should have had more documentation and evidence ready on why the world needed to get Linked Data out of JSON. I should have had more one-on-one conversations with the people that I could see struggling with why we needed Linked Data for JSON. I assumed that it was obvious that the world needed this and that assumption came back to kick our collective asses.
  • A lack of Web App developers in the RDF Working Group helped compound the problem stated above. Most of the group didn’t understand why just serializing triples to JSON wasn’t good enough, as most of them had APIs to make sense of the triples. They were also not convinced that we needed to bring Web App developers into the RDF community. RDF is already successful, right? Wrong. Every RDF serialization format is losing out to JSON when it comes to data interchange – not by a little, but by a staggering margin. The RDF community is pathetically tiny compared to the Web App development community. The people around the world who use JSON as their primary data serialization format easily outnumber those using RDF 100-fold. I’m convinced that there is a problem. I don’t think that the majority of the traditional RDF community thinks that there is a problem.
  • Lacking a common vision will kill a community. It has been said that standards groups should not innovate, but instead they should standardize solutions that are already out in the marketplace. There are days where I believe this – the TURTLE work has been easy to move forward in the RDF Working Group. There are also days where I know this is not true. Standards groups can be fantastic innovators – just look at the WHATWG, CSS, RDFa, Web Applications, and HTML5 Working Groups. At the heart of the matter is whether or not a group has a common vision. If you don’t have a common vision, you go nowhere. We didn’t have a common vision for the Linked Data in JSON work.
  • Only one company in the group was depending on the technology to be completed in order to ship a product. That company was Digital Bazaar, for the PaySwarm work. None of the other companies really had any skin in the game. Sure, some of them would like to see something developed, but they’re not dependent on it. One recipe for disaster is to get a group of people together to work on something without hardly any negative consequence for failure.
  • I pushed JSON-LD too hard when discussing the various possibilities. I pushed it because I thought it was the best solution, and still do. I think my sense of urgency came across as being too pushy and authoritarian. This strategy, if you could call it that, backfired. Rather than open up a debate on the proper Linked Data JSON format, it seemed as if some people refused to have any sort of debate on the formats and instead chose to debate which community we were attempting to address in order to slow down the decision process until they could catch up with the state of all of the serialization formats.
  • Old school RDF people compose the majority of the RDF Working Group. It’s hard to pinpoint, but I saw what I could only describe as an “old world” mentality in the RDF Working Group. Browser-based APIs and development weren’t that important to them. They had functioning graph storage engines, operational SPARQL query engines, and PhDs to solve all of the hard problems that they may find in their everyday usage of RDF. Independent Web developers rarely have all of these advantages – many of them have none of these advantages. Many Web developers only have a browser, JavaScript, some server side code and JSON.parse() for performing data serialization and deserialization. JSON coupled with REST is simple, fast, stable and works for most everything we do. To solve 80% of our problems, there is no need for the added complexity that the “old school” RDF crowd brings to the table.
  • The RDF Working Group didn’t do their homework. We are all busy, I get that. However, even after two months, it was painfully clear that many in the group had not taken the time to understand the proposals on the table in any amount of depth. In some cases, I’m convinced that some did not even look at the proposals before passing judgement on whether or not the solution was sound.
  • Experts tend to over-analyze and cripple themselves and their colleagues with all of the potential failure scenarios. There were assertions in the group at times that, while they had some basis of validity, were not constructive and came across as typical academic nay-saying. It is easier to find reasons why a particular direction will not succeed when you’re an expert. This nay-saying was very active in the RDF Working Group. We didn’t have a group that was saying “Yes, we can make this happen.” Instead, we had a minority that set the tone for the group by repeating “I don’t know if this’ll work, let’s not do it.”
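
The gap described in the list above, between a triple serialization and the JSON that Web developers actually work with, can be sketched in a few lines of Python. Both structures are hypothetical and are not taken from any of the proposals the group reviewed; the `objects_of` helper merely stands in for the kind of API that RDF tooling provides.

```python
import json

FOAF_NAME = "http://xmlns.com/foaf/0.1/name"

# Hypothetical "triples in JSON": faithful to the RDF model, but the
# consumer needs a helper (or a library) to answer simple questions.
TRIPLES_JSON = json.dumps([
    {"s": "http://example.org/people#jane", "p": FOAF_NAME, "o": "Jane Doe"},
    {"s": "http://example.org/people#jane",
     "p": "http://xmlns.com/foaf/0.1/homepage",
     "o": "http://example.org/jane/"},
])

# Hypothetical idiomatic JSON of the kind a typical REST API returns.
OBJECT_JSON = json.dumps({
    "name": "Jane Doe",
    "homepage": "http://example.org/jane/",
})


def objects_of(triples, subject, predicate):
    """Scan a list of triple dicts and return the matching objects."""
    return [t["o"] for t in triples
            if t["s"] == subject and t["p"] == predicate]


triples = json.loads(TRIPLES_JSON)
name_from_triples = objects_of(
    triples, "http://example.org/people#jane", FOAF_NAME)[0]

person = json.loads(OBJECT_JSON)
name_from_object = person["name"]  # no helper, no graph API

# Both routes recover the same fact; only the effort differs.
print(name_from_triples == name_from_object)
```

Both forms carry the same fact. The difference is how much machinery the consumer needs before the data becomes useful, which is the concern of a developer who has only a browser and a JSON parser.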

I think the RDF Working Group has lost its way – we have forgotten the end-goal of enabling everyone on the Web to use Linked Data. We have chosen to deal with the easier problems instead of taking the biggest problem (adoption) seriously. There are many rational arguments to be made about why we’re not doing the work – none of those reasons are going to help spread Linked Data outside the modestly sized community that it enjoys at the moment. We need to get Web Apps developers using Linked Data – JSON is one way to accomplish that goal. It is a shame that we’re passing up this opportunity.

All is Not Lost

The RDF Working Group is only working on one interesting thing right now – and that’s how to represent multiple graphs in the various RDF serializations. Call them Named Graphs, or Graph Literals, or something else – but at least we’re taking that bull by the horns. As for the rest of the work that the RDF Working Group plans to do – it’s uninspired. I joined the group hoping to breathe some new life into RDF – make it exciting and useful to JavaScript developers. Instead, we ended up spending most of our time polishing things that are already working for most people. Don’t get me wrong, it’s good that some of these things are being polished – but it’s not going to impact RDF adoption rates in any significant way.

All is not lost. We decided to create a public linked data in JSON mailing list (not activated yet) where the people that would like to see something come of JSON in Linked Data could continue the work. We’re already revving JSON-LD and updating it to reflect issues that we discovered over the past several months. That’s where I’ll be spending most of my effort on Linked Data in JSON from now on – the RDF Working Group has demonstrated that we can’t accomplish the goal of growing the Linked Data community there.

5 Comments

  1. Brian Peterson says:

    A JSON serialization of RDF was the only way I could get Web developers to work with RDF. More specifically, JSON that included RDF but didn’t look anything like RDF. Now our developers can work with RDF in Linked Data without knowing anything about RDF. I hope to interact with the mailing list once it is activated.

  2. Pavel Arapov says:

    I will participate in the mailing list activity; for me personally it’s bad news that RDF JSON has turned more toward the classical RDF way. I am working on my PhD ( Semantic Wiki ) and my proposal was based on JavaScript and JSON for an equal environment on all levels of interaction : client, server, database ( like MongoDB ).
    I hope that we will find a solution for RDF JSON anyway, to simplify developers’ lives! Thank you for your work and great ideas.

  3. zazi says:

    I don’t believe that an RDF/JSON serialisation format will be a big win at all. We shouldn’t waste our time in trying to fit the RDF knowledge representation structure that powers the Semantic Web knowledge representation languages and vocabularies into a kind of hack that seems to be fashionable these days. I insist that the days where JSON is the preferred serialisation format for data (! – not knowledge) exchange are numbered. There is nothing inbuilt in JSON which shortcuts any useful semantics. Every time I had a look at the various RDF/JSON serialisation proposals popping up, e.g., recently the nice comparison at http://www.w3.org/2011/rdf-wg/wiki/JSON-Serialization-Examples#JSON_Serializations_Lineup, I thought: “Please have a look at N3/Turtle” (*). It is equally simple and, on top of that, delivers nice built-in shortcuts, e.g., ‘=’ for owl:sameAs. Instead of convulsively trying to fit RDF into an ill-fitting JSON costume, we should make every effort to push forward N3/Turtle as the most practical serialisation format for RDF.
    Besides, JSON does not meet the requirements that are outlined by the principles of the REST architecture style (as defined by Roy T. Fielding) at all. It is just the belief of the “easy going” Web 2.0 developer community that they are deploying “RESTful web services” (which is not the case at all; I’m not aware of any existing one), especially by providing a proprietary JSON serialisation that neither includes any inbuilt semantics nor fits the REST constraints completely. In some years hopefully the majority of these web developers will grasp that it was a waste of time to code all these dirty hacked mash-ups that make use of the proprietary interfaces of the single information services. “Web 2.0 is the messy way that the Semantic Web is actually happening,” says O’Reilly (from http://www.businessweek.com/technology/content/apr2007/tc20070409_248062.htm).
    The Web wouldn’t be the Web if every service deployed its own communication protocol as a replacement for HTTP. So why not apply a proven, well-suited knowledge representation structure such as RDF as a replacement for all these messy proprietary formats that exist at the moment? Let’s start a clean-up process to make life more relaxed. Let’s share what we _know_ (**)!

    (*) I have to admit that I was a big fan of RDF/JSON at the beginning
    (**) know is derived from/strongly related to knowledge

  4. Sandro Hawke says:

    (Background: I’m one of the W3C staff contacts for the RDF Working Group, and I proposed one of the technical solutions in this space, JRON, last year.)

    Manu, I agree with your observations in general, but I’d put it together a little differently. I think the main reason for the WG postponing this work is that it realized that significant design work was needed, and that this is not the right forum for that work. This Linked JSON solution needs to be designed mostly by a smaller group who really knows and cares about this space.

    My sense is that it should be people who want to publish data to web developers and also have that data readable as RDF. It’s unclear who is really motivated toward that at the moment. The big gap is in seeing the need for reading it as RDF — that is, for being able to gather data from multiple websites using the same software. It may not be in the business interest of data producers to commoditize themselves like that.

    Alternatively, maybe this can come from the consumer side: folks who want to consume data from many sites using the same software. That’s closer to the current RDF WG, but still not a perfect match. Unfortunately for your cause, these multi-site data consumers don’t really need what you’re proposing; they’d be happy with just plain old RDF (possibly in a JSON wrapper, which is something the WG is still going to do). If you want one format to satisfy both these many-site consumers and the one-site consumers, it has to be produced by a group that somehow balances those needs. That may be very hard.

    In any case, I agree it’s unfortunate the RDF WG isn’t currently the place to do this work, but I’m glad that was quickly recognized, rather than having the group waste a lot of time or produce something which did more harm than good.

    Now the trick is seeding and gathering the right group around the right technology.

  5. “One recipe for disaster is to get a group of people together to work on something without hardly any negative consequence for failure.”

    Exactly. And that’s why the working group made the right decision in not pursuing JSON-LD or something with similarly ambitious goals. It’s an admission by the group that the group doesn’t have the right composition to tackle this work. It’s an admission that designing, prototyping and validating JSON-LD requires a different and more focused set of people.

    You are right that this is a lost opportunity. If you ask me, this opportunity was already lost way earlier, when W3C chartered this WG with a strong focus on small cleanups, rubber-stamping already implemented but nonstandard features, and backwards compatibility. With such a charter, it was clear that the WG would attract participation mainly from old RDF hands with strong investment in the status quo and reluctance to change.

    And to be honest, I see nothing wrong with that. Innovation just works better outside of standards bodies.
