JSON-LD: Cycle Breaking and Object Framing

We’ve been doing a bit of research at Digital Bazaar on how to best meld graph-based object models with what most developers are familiar with these days – JSON-based object programming (aka: associative-array based object models). We want to enable developers to use the same data models that they use in JavaScript today, but to work with arbitrary graph data.

This is an issue that we think is at the heart of why RDF has not caught on as a general data model – the data is very difficult to work with in programming languages. There is no native data structure that is easy to work with without a complex set of APIs.

When a JavaScript author gets JSON-LD from a remote source, the graph that the JSON-LD expresses can take a number of different but valid forms. That is, the information expressed by the graph can be identical, but each graph can be structured differently.

Think of these two statements:

The Q library contains book X.
Book X is contained in the Q library.

The information that is expressed in both sentences is exactly the same, but the structure of each sentence is different. Structure is very important when programming. When you write code, you expect the structure of your data to not change.

However, when we program using graphs, the structure is almost always unknown, so a mechanism to impose a structure is required in order to help the programmer be more productive.

The way the graph is represented is entirely dependent on the algorithm used to normalize and the algorithm used to break cycles in the graph. Consider the following example, which is a graph with three top-level objects – a library, a book and a chapter. Each of the items is related to one another, thus the graph can be expressed in JSON-LD in a number of different ways:

{
   "#":
   {
      "dc": "http://purl.org/dc/elements/1.1/",
      "ex": "http://example.org/vocab#"
   },
   "@":
   [
      {
         "@": "http://example.org/test#library",
         "a": "ex:Library",
         "ex:contains":  ""
      },
      {
         "@": "",
         "a": "ex:Book",
         "dc:contributor": "Writer",
         "dc:title": "My Book",
         "ex:contains": ""
      },
      {
         "@": "http://example.org/test#chapter",
         "a": "ex:Chapter",
         "dc:description": "Fun",
         "dc:title": "Chapter One"
      }
   ]
}

The JSON-LD graph above could also be represented like so:

{
   "#":
   {
      "dc": "http://purl.org/dc/elements/1.1/",
      "ex": "http://example.org/vocab#"
   },
   "@": "http://example.org/test#library",
   "a": "ex:Library",
   "ex:contains":
   {
      "@": "",
      "a": "ex:Book",
      "dc:contributor": "Writer",
      "dc:title": "My Book",
      "ex:contains":
      {
         "@": "http://example.org/test#chapter",
         "a": "ex:Chapter",
         "dc:description": "Fun",
         "dc:title": "Chapter One"
      }
   }
}

Both of the examples above express the exact same information, but the graph structure is very different. If a developer can receive both of the objects from a remote source, how do they ensure that they only have to write one code path to deal with both examples?

That is, how can a developer reliably write the following code:

// print all of the books and their corresponding chapters
var library = jsonld.toObject(jsonLdText);
for(var bookIndex = 0; bookIndex < library["ex:contains"].length;
    bookIndex++)
{
   var book = library["ex:contains"][bookIndex];
   var bookTitle = book["dc:title"];
   for(var chapterIndex = 0; chapterIndex < book["ex:contains"].length;
       chapterIndex++)
   {
      var chapter = book["ex:contains"][chapterIndex];
      var chapterTitle = chapter["dc:title"];
      console.log("Book: " + bookTitle + " Chapter: " + chapterTitle);
   }
}

The answer boils down to ensuring that the data structure that is built for the developer from the JSON-LD is framed in a way that makes property access predictable. That is, the developer provides a structure that MUST be filled out by the JSON-LD API. The working title for this mechanism is called "Cycle Breaking and Object Framing" since both mechanisms must be operable in order to solve this problem.

The developer would specify a Frame for their language-native object like the following:

{
   "#": {"ex": "http://example.org/vocab#"},
   "a": "ex:Library",
   "ex:contains":
   {
      "a": "ex:Book",
      "ex:contains":
      {
         "a": "ex:Chapter"
      }
   }
}

The object frame above asserts that the developer expects to get a library containing one or more books containing one or more chapters returned to them. This ensures that the data is structured in a way that
is predictable and only one code path is necessary to work with graphs that can take multiple forms. The API call that they would use would look something like this:

var library = jsonld.toObject(jsonLdText, objectFrame);

The discussion on this particular issue is continued on the JSON-LD mailing list.

Leave a Comment

Let us know your thoughts on this post but remember to play nicely folks!