All posts by ManuSporny

Google adds JSON-LD support to Search and Google Now

Full disclosure: I’m one of the primary designers of JSON-LD and the Chair of the JSON-LD group at the World Wide Web Consortium.

Last week, Google announced support for JSON-LD markup in Gmail. Putting JSON-LD in front of 425 million people is a big validation of the technology.

Hot on the heels of last week’s announcement, Google has just announced additional JSON-LD support for two more of their core products! The first is their flagship product, Google Search. The second is their new intelligent personal assistant service called Google Now.

The addition of JSON-LD support to Google Search now allows you to do incredibly accurate personalized searches. For example, here’s a search for “my flights”:

and here’s an example for “my hotel reservation for next week”:

Web developers who mark up certain types of information as JSON-LD in the e-mails that they send to you can now enable new functionality in these core Google services. For example, using JSON-LD will make it really easy for you to manage flights, hotel bookings, reservations at restaurants, and events like concerts and movies from within Google’s ecosystem. It also makes it easy for services like Google Now to push a notification to your phone when your flight has been delayed:

Or, show your boarding pass on your mobile phone when you’ve arrived at the airport:

Or, let you know when you need to leave to make your reservation for a restaurant:

Google Search and Google Now can make these recommendations to you because the information that you received about these flights, boarding passes, hotels, reservations, and other events was marked up in JSON-LD format when it hit your Gmail inbox. The most exciting thing about all of this is that it’s just the beginning of what Linked Data can do for all of us. Over the next decade, Linked Data will be at the center of getting computing and the monotonous details of our everyday grind out of the way so that we can focus more on enjoying our lives.

If you want to dive deeper into this technology, Google’s page on schemas is a good place to start.

Google adds JSON-LD support to Gmail

Google announced support for JSON-LD markup in Gmail at Google I/O 2013. The design team behind JSON-LD is delighted by this announcement and applauds the Google engineers who integrated JSON-LD with Gmail. This blog post examines what this announcement means for Gmail customers and offers some suggestions to the Google Gmail engineers on how they could improve their JSON-LD markup.

JSON-LD enables the representation of Linked Data in JSON by describing a common JSON representation format for expressing graphs of information (see Google’s Knowledge Graph). It allows you to mix regular JSON data with Linked Data in a single JSON document. The format has already been adopted by large companies such as Google in their Gmail product and is now available to over 425 million people via currently live software products around the world.

The syntax is designed to not disturb already deployed systems running on JSON, but provide a smooth upgrade path from JSON to JSON-LD. It is primarily intended to be a way to use Linked Data in Web-based programming environments, to build inter-operable Linked Data Web services, and to store Linked Data in JSON-based storage engines.

For Google’s Gmail customers, this means that Gmail will now be able to recognize people, places, events and a variety of other Linked Data objects. You can then take actions on the Linked Data objects embedded in an e-mail. For example, if someone sends you an invitation to a party, you can do a single-click response on whether or not you’ll attend the party right from your inbox. Doing so will also create a reminder for the party in your calendar. There are other actions that you can perform on Linked Data objects as well, like approving an expense report, reviewing a restaurant, saving a coupon for a free online movie, making a flight, hotel, or restaurant reservation, and many other really cool things that you couldn’t do before from the inside of your inbox.

What Google Got Right and Wrong

Google followed the JSON-LD standard pretty closely, so the vast majority of the markup looks really great. However, there are four issues that the Google engineers will probably want to fix before pushing the technology out to developers.

Invalid Context URL

The first issue is a fairly major one. Google isn’t using the JSON-LD @context parameter correctly in any of their markup examples. It’s supposed to be a URL, but they’re using a text string instead. This means that their JSON-LD documents are unreadable by all of the conforming JSON-LD processors today. For example, Google does the following when declaring a context in JSON-LD:

  "@context": "schema.org"

When they should be doing this:

  "@context": "http://schema.org/"

It’s a fairly simple change; just add “http://” to the beginning of the “schema.org” value. If Google doesn’t make this change, it’ll mean that JSON-LD processors will have to include a special hack to translate “schema.org” to “http://schema.org/” just for this use case. I hope that this was just a simple oversight by the Google engineers that implemented these features and not something that was intentional.
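To make the "special hack" concrete, here is a minimal Python sketch of the kind of workaround a processor might need. The function name `normalize_context` and the alias table are hypothetical and are not taken from any JSON-LD library; the `Event` type is only a placeholder.

```python
# Hypothetical workaround: map known scheme-less context strings to real URLs.
# The alias table below is an assumption made for illustration only.
KNOWN_CONTEXT_ALIASES = {
    "schema.org": "http://schema.org/",
}


def normalize_context(doc):
    """Return a copy of doc whose string @context is rewritten if it is a known alias."""
    fixed = dict(doc)
    context = fixed.get("@context")
    if isinstance(context, str) and context in KNOWN_CONTEXT_ALIASES:
        fixed["@context"] = KNOWN_CONTEXT_ALIASES[context]
    return fixed


email_markup = {"@context": "schema.org", "@type": "Event"}
print(normalize_context(email_markup)["@context"])
```

Every processor would have to carry a table like this one, which is why fixing the markup at the source is the better option.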

Context isn’t Online

The second issue has to do with the JSON-LD Context for schema.org. There doesn’t seem to be a downloadable context for schema.org at the moment. Not having a Web-accessible JSON-LD context is bad because the context is at the heart and soul of a JSON-LD document. If you don’t publish a JSON-LD context on the Web somewhere, applications won’t be able to resolve any of the Linked Data objects in the document.

The Google engineers could fix this fairly easily by providing a JSON-LD Context document when a web client requests a document of type “application/ld+json” from the http://schema.org/ URL. The JSON-LD community would be happy to help the Google engineers create such a document.
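As a rough sketch of what such a request could look like from the client side, the Python below builds a request with the suggested media type in the `Accept` header. It deliberately does not send the request, since whether the server answers with a context document is the open question here.

```python
import urllib.request

# Build (but do not send) a request that asks for the JSON-LD representation.
# Whether http://schema.org/ answers this with a context document is exactly
# the open question raised above, so the request is only constructed here.
request = urllib.request.Request(
    "http://schema.org/",
    headers={"Accept": "application/ld+json"},
)

print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```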

Keyword Aliasing, FTW

The third issue is a minor usability issue with the markup. The Google help pages on the JSON-LD functionality use the @type keyword in JSON-LD to express the type of Linked Data object that is being expressed. The Google engineers that wrote this feature may not have been aware of the Keyword Aliasing feature in JSON-LD. That is, they could have just aliased @type to type. Doing so would mean that the Gmail developer documentation wouldn’t have to mention the “specialness” of the @type keyword.
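A small sketch of what keyword aliasing looks like in practice, written as Python data so it can be run. The inline context and the `resolve_keyword_aliases` helper are assumptions made for illustration; a real JSON-LD processor does this as part of its expansion step.

```python
import json

# A context that aliases the JSON-LD keyword @type to the plain key "type".
# The context is inlined so the sketch does not depend on any remote document.
doc = {
    "@context": {
        "type": "@type",
        "name": "http://schema.org/name",
    },
    "type": "http://schema.org/Person",
    "name": "Joe",
}


def resolve_keyword_aliases(document):
    """Replace aliased keys with the JSON-LD keywords they stand for.

    This is a toy stand-in for one small part of what a real JSON-LD
    processor does during expansion.
    """
    context = document.get("@context", {})
    aliases = {
        term: target
        for term, target in context.items()
        if isinstance(target, str) and target.startswith("@")
    }
    return {aliases.get(key, key): value for key, value in document.items()}


print(json.dumps(resolve_keyword_aliases(doc), indent=2))
```

With the alias in place, developer-facing documentation can talk about a plain `type` key while processors still see `@type`.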

Use RDFa Lite

The fourth issue concerns the use of Microdata. JSON-LD was designed to work seamlessly with RDFa Lite 1.1; you can easily and losslessly convert data between the two markup formats. JSON-LD is compatible with Microdata, but pairing the two is a sub-optimal design choice. When JSON-LD data is converted to Microdata, information is lost due to data fidelity issues in Microdata. For example, there is no mechanism to specify that a value is a URL in Microdata.

RDFa Lite 1.1 does not suffer from these issues and has been proven to be a drop-in replacement for Microdata without any of the downsides that Microdata has. The designers of JSON-LD are the same designers behind RDFa Lite 1.1 and have extensive experience with Microdata. We specifically chose not to pair JSON-LD with Microdata because it would have been a bad design choice for a number of reasons. I hope that the Google engineers will seek out advice from the JSON-LD and RDFa communities before finalizing the decision to use Microdata, as there are numerous downsides associated with that decision.

Closing

All in all, the Google engineers did a good job of implementing JSON-LD in Gmail. With a few small fixes to the Gmail documentation and code examples, they will be fully compliant with the JSON-LD specifications. The JSON-LD community is excited about this development and looks forward to working with Google to improve the recent release of JSON-LD for Gmail.

Permanent Identifiers for the Web

Web applications that deal with data on the web often need to specify and use URLs that are very stable. They utilize services such as purl.org to ensure that applications using their URLs will always be re-directed to a working website. These “permanent URL” redirection services operate kind of like a switchboard, connecting requests for information with the true location of the information on the Web. These switchboards can be reconfigured to point to a new location if the old location stops working.

How Does it Work?

If the concept sounds a bit vague, perhaps an example will help. A web author could use the following link (https://w3id.org/payswarm/v1) to refer to an important document. That link is hosted on a permanent identifier service. When a Web browser attempts to retrieve that link, it will be re-directed to the true location of the document on the Web. Currently, that location is https://payswarm.com/contexts/payswarm-v1.jsonld. If the location of the payswarm-v1.jsonld document changes at any point in the future, the only thing that needs to be updated is the re-direction entry on w3id.org. That is, all Web applications that use the https://w3id.org/payswarm/v1 URL will be transparently re-directed to the new location of the document and will continue to “Just Work™”.
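The switchboard idea can be sketched in a few lines of Python. The redirect table here is a toy in-memory dictionary, and the "moved" location is made up for the example; the real service uses web server redirect rules rather than anything like this.

```python
# Toy redirect table: permanent identifier -> current location.
# The single entry mirrors the example in the text; the table itself is hypothetical.
REDIRECTS = {
    "https://w3id.org/payswarm/v1": "https://payswarm.com/contexts/payswarm-v1.jsonld",
}


def resolve(permanent_url, redirects=REDIRECTS):
    """Return the current location for a permanent identifier."""
    return redirects[permanent_url]


print(resolve("https://w3id.org/payswarm/v1"))

# If the document moves, only the table entry changes. Callers keep using
# the same permanent identifier. The new location below is made up.
REDIRECTS["https://w3id.org/payswarm/v1"] = "https://example.org/moved/payswarm-v1.jsonld"
print(resolve("https://w3id.org/payswarm/v1"))
```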

w3id.org Launches

Permanent identifiers on the Web are an important thing to support, but until today there was no organization that would back a service for the Web to keep these sorts of permanent identifiers operating over the course of multiple decades. A number of us saw that this is a real problem and so we launched w3id.org, which is a permanent identifier service for the Web. The purpose of w3id.org is to provide a secure, permanent URL re-direction service for Web applications. This service will be run and operated by the W3C Permanent Identifier Community Group.

Specifically, the following organizations have pledged responsibility to ensure the operation of this service for the decades to come: Digital Bazaar, 3 Round Stones, OpenLink Software, Applied Testing and Technology, and Openspring. Many more organizations will join in time.

These organizations are responsible for all administrative tasks associated with operating the service. The social contract between these organizations gives each of them full access to all information required to maintain and operate the website. The agreement is set up such that a number of these companies could fail, lose interest, or become unavailable for long periods of time without negatively affecting the operation of the site.

Why not purl.org

While many web authors and data publishers currently use purl.org, there are a number of issues or concerns that we have about the website:

  1. The site was designed for the library community and was never intended to be used by the general Web.
  2. Requests for information or changes to the service frequently go unanswered.
  3. The site does not support HTTPS connections, which means it cannot be used to serve documents for security-sensitive industries such as medicine and finance. Requests to migrate the site to HTTPS have gone unanswered.
  4. There is no published backup or fail-over plan for the website.
  5. The site is run by a single organization, with a single part-time administrator, on a single machine. It suffers from multiple single points of failure.

w3id.org Features

The launch of the w3id.org website mitigates all of the issues outlined above with purl.org:

  1. The site is specifically designed for web developers, authors, and data publishers on the general Web. It is not tailored for any specific community.
  2. Requests for information can be sent to a public mailing list that contains multiple administrators that are accountable for answering questions publicly. All administrators have been actively involved in world standards for many years and know how to run a service at this scale.
  3. The site supports HTTPS security, which means it can be used to securely serve data for industries such as medicine and finance.
  4. Multiple organizations, with multiple administrators per organization have full access to administer all aspects of the site and recover it from any potential failure. All important site data is in version control and is mirrored across the world on a regular basis.
  5. The site is run by a consortium of organizations that have each pledged to maintain the site for as long as possible. If a member organization fails, a new one will be found to replace the failing organization while the rest of the members ensure the smooth operation of the site.

All identifiers associated with the w3id.org website are intended to be around for as long as the Web is around. This means decades, if not centuries. If the final destination for a popular identifier used by this service fails in such a way as to be a major inconvenience or danger to the Web, the community will mirror the information for the popular identifier and set up a working redirect to restore service to the rest of the Web.

Adding a Permanent Identifier

Anyone with a GitHub account and knowledge of simple Apache redirect rules can add a permanent identifier to w3id.org by performing the following steps:

  1. Fork w3id.org on GitHub.
  2. Add a new redirect entry and commit your changes.
  3. Submit a pull request for your changes.

If you wish to engage the community in discussion about this service for your Web application, please send an e-mail to the public-perma-id@w3.org mailing list. If you are interested in helping to maintain this service for the Web, please join the W3C Permanent Identifier Community Group.


Note: The letters ‘w3’ in the w3id.org domain name stand for “World Wide Web”. Other than hosting the software for the Permanent Identifier Community Group, the “World Wide Web Consortium” (W3C) is not involved in the support or management of w3id.org in any way.

Browser Payments 1.0

Kumar McMillan (Mozilla/FirefoxOS) and I (PaySwarm/Web Payments) have just published the first draft of the Browser Payments 1.0 API. The purpose of the spec is to establish a way to initiate payments from within the browser. It is currently a direct port of the mozPay API framework that is integrated into Firefox OS. It enables Web content to initiate payment or issue a refund for a product or service. Once implemented in the browser, a Web author may call the navigator.payment() function to initiate a payment.

This is work that we intend to pursue in the Web Payments Community Group at W3C. The work will eventually be turned over to a Web Payments Working Group at W3C, which we’re trying to kick-start at some point this year.

The current Browser Payments 1.0 spec can be read here:

http://web-payments.github.io/browser-payments/

The GitHub repository for the spec is here:

https://github.com/web-payments/browser-payments/

Keep in mind that this is a very early draft of the spec. There are lots of prose issues as well as bugs that need to be sorted out. There are also a number of things that we need to discuss about the spec and how it fits into the larger Web ecosystem. Things like how it integrates with Persona and PaySwarm are still details that we need to suss out. There is a bug and issue tracker for the spec here:

https://github.com/web-payments/browser-payments/issues

The Mozilla guys will be on next week’s Web Payments telecon (Wednesday, 11am EST) for a Q/A session about this specification. Join us if you’re interested in payments in the browser. The call is open to the public; details about joining and listening in can be found here:

https://payswarm.com/minutes/

Identifiers in JSON-LD and RDF

TL;DR: This blog post argues that the extension of blank node identifiers in JSON-LD and RDF for the purposes of identifying predicates and naming graphs is important. It is important because it simplifies the usage of both technologies for developers. The post also provides a less-optimal solution if the RDF Working Group does not allow blank node identifiers for predicates and graph names in RDF 1.1.

As humans, we need identifiers to convey complex messages. Identifiers let us refer to a certain thing by naming it in a particular way. Not only do humans need identifiers, but our computers need identifiers to refer to data in order to perform computations. It is no exaggeration to say that our very civilization depends on identifiers to manage the complexity of our daily lives, so it is no surprise that people spend a great deal of time thinking about how to identify things. This is especially true when we talk about the people that are building the software infrastructure for the Web.

The Web has a very special identifier called the Uniform Resource Locator (URL). It is probably one of the best known identifiers in the world, mostly because everybody that has been on the Web has used one. URLs are great identifiers because they are very specific. When I give you a URL to put into your Web browser, such as the link to this blog post, I can be assured that when you put the URL into your browser you will see what I see. URLs are globally scoped; they’re supposed to always take you to the same place.

There is another class of identifier on the Web that is not globally scoped and is only used within a document on the Web. In English, these identifiers are used when we refer to something as “that thing”, or “this widget”. We can really only use this sort of identifier within a particular context where the people participating in the conversation understand the context. Linguists call this concept deixis. “Thing” doesn’t always refer to the same subject, but based on the proper context, we can usually understand what is being identified. Our consciousness tags the “thing” that is being talked about with a tag of sorts and then refers to that thing using this pseudo-identifier. Most of this happens unconsciously (notice how your mind unconsciously tied the use of ‘this’ in this sentence to the correct concept?).

The take-away is that there are globally-scoped identifiers like URLs, and there are also locally-scoped identifiers that require a context in order to understand what they refer to.

JSON and JSON-LD

In JSON, developers typically express data like this:

{
  "name": "Joe"
}

Note how that JSON object doesn’t have an identifier associated with it. JSON-LD creates a straight-forward way of giving that object an identifier:

{
  "@context": ...,
  "@id": "http://example.com/people/joe",
  "name": "Joe"
}

Both you and I can refer to that object using http://example.com/people/joe and be sure that we’re talking about the same thing. There are times that assigning a global identifier to every piece of data that we create is not desired. For example, it doesn’t make much sense to assign an identifier to a transient message that is a request to get a sensor reading. This is especially true if there are millions of these types of requests and we never want to refer to the request once it has been transmitted. This is why JSON-LD doesn’t force developers to assign an identifier to the objects that they express. The people that created the technology understand that not everything needs a global identifier.

Computers are less forgiving. They need identifiers for almost everything, but a great deal of that complexity can be hidden from developers. When an identifier becomes necessary in order to perform computations upon the data, the computer can usually auto-generate an identifier for the data.
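As a rough illustration of auto-generated identifiers, the Python sketch below gives a document-local label to every object that lacks an `@id`. The helper `label_blank_nodes` and the `_:b0`, `_:b1` naming scheme are assumptions for this example, and the sketch does not guard against labels that are already in use in the input.

```python
import itertools


def label_blank_nodes(objects):
    """Give every object without an @id a document-local blank node identifier."""
    counter = itertools.count()
    labeled = []
    for obj in objects:
        if "@id" in obj:
            labeled.append(dict(obj))
        else:
            labeled.append({"@id": f"_:b{next(counter)}", **obj})
    return labeled


people = [
    {"@id": "http://example.com/people/joe", "name": "Joe"},
    {"name": "Susan"},
    {"name": "Dave"},
]

for person in label_blank_nodes(people):
    print(person["@id"], person["name"])
```

The developer never writes the generated labels by hand; they exist only so that software has something to refer to.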

RDF, Graphs, and Blank Node Identifiers

The Resource Description Framework (RDF) primarily uses an identifier called the Internationalized Resource Identifier (IRI). Where URLs can typically only express links in Western languages, an IRI can express links in almost every language in use today including Japanese, Tamil, Russian and Mandarin. RDF also defines a special type of identifier called a blank node identifier. This identifier is auto-generated and is locally scoped to the document. It’s an advanced concept, but is one that is pretty useful when you start dealing with transient data, where creating a global identifier goes beyond the intended usage of the data. An RDF-compatible program will step in and create blank node identifiers on your behalf, but only when necessary.

Both JSON-LD and RDF have the concept of a Statement, Graph, and a Dataset. A Statement consists of a subject, predicate, and an object (for example: “Dave likes cookies”). A Graph is a collection of Statements (for example: Graph A contains all the things that Dave said and Graph B contains all the things that Mary said). A Dataset is a collection of Graphs (for example: Dataset Z contains all of the things Dave and Mary said yesterday).
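One way to picture these three concepts is as plain data structures. The Python below is only a sketch: the statements about Mary are invented for illustration, and a real RDF Dataset also has a default graph, which is left out here.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    predicate: str
    object: str


# A Graph is modeled as a set of Statements.
graph_a = {Statement("Dave", "likes", "cookies")}
graph_b = {Statement("Mary", "likes", "cake"), Statement("Mary", "knows", "Dave")}

# A Dataset is modeled as a mapping from graph name to Graph.
dataset_z = {"Graph A": graph_a, "Graph B": graph_b}

for name, graph in dataset_z.items():
    for statement in sorted(graph):
        print(name, statement.subject, statement.predicate, statement.object)
```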

In JSON-LD, at present, you can use a blank node identifier for subjects, predicates, objects, and graphs. In RDF, you can only use blank node identifiers for subjects and objects. There are people, such as myself, in the RDF WG that think this is a mistake. There are people that think it’s fine. There are people that think it’s the best compromise that can be made at the moment. There is a wide field of varying opinions strewn between the various extremes.

The end result is that the current state of affairs has put us into a position where we may have to remove blank node identifier support for predicates and graphs from JSON-LD, which comes across as a fairly arbitrary limitation to those not familiar with the inner guts of RDF. Don’t get me wrong, I feel it’s a fairly arbitrary limitation. There are those in the RDF WG that don’t think it is and that may prevent JSON-LD from being able to use what I believe is a very useful construct.

Document-local Identifiers for Predicates

Why do we need blank node identifiers for predicates in JSON-LD? Let’s go back to the first example in JSON to see why:

{
  "name": "Joe"
}

The JSON above is expressing the following Statement: “There exists a thing whose name is Joe.”

The subject is “thing” (aka: a blank node) which is legal in both JSON-LD and RDF. The predicate is “name”, which doesn’t map to an IRI. This is fine as far as the JSON-LD data model is concerned because “name”, which is local to the document, can be mapped to a blank node. RDF cannot model “name” because it has no way of stating that the predicate is local to the document since it doesn’t support blank nodes for predicates. Since the predicate doesn’t map to an IRI, it can’t be modeled in RDF. Finally, “Joe” is a string used to express the object and that works in both JSON-LD and RDF.

JSON-LD supports the use of blank nodes for predicates because there are some predicates, like every key used in JSON, that are local to the document. RDF does not support the use of blank nodes for predicates and therefore cannot properly model JSON.
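The mapping from plain JSON to Statements can be sketched as follows. The helper `json_to_statements` and the default subject label `_:b0` are hypothetical; the point is only that each JSON key becomes a document-local predicate.

```python
def json_to_statements(obj, subject="_:b0"):
    """Turn a flat JSON object into (subject, predicate, object) statements.

    Every key becomes a document-local predicate written as a blank node
    identifier, which is what the JSON-LD data model discussed here permits.
    """
    return [(subject, f"_:{key}", value) for key, value in obj.items()]


for statement in json_to_statements({"name": "Joe"}):
    print(statement)
```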

Document-local Identifiers for Graphs

Why do we need blank node identifiers for graphs in JSON-LD? Let’s go back again to the first example in JSON:

{
  "name": "Joe"
}

The container of this statement is a Graph. Another way of writing this in JSON-LD is this:

{
  "@context": ...,
  "@graph": {
    "name": "Joe"
  }
}

However, what happens when you have two graphs in JSON-LD, and neither one of them is the RDF default graph?

{
  "@context": ...,
  "@graph": [
    {
      "@graph": {
        "name": "Joe"
      }
    }, 
    {
      "@graph": {
        "name": "Susan"
      }
    }
  ]
}

In JSON-LD, at present, it is assumed that a blank node identifier may be used to name each graph above. Unfortunately, in RDF, the only thing that can be used to name a graph is an IRI, and a blank node identifier is not an IRI. This puts JSON-LD in an awkward position. JSON-LD can either:

  1. Require that developers name every graph with an IRI, which seems like a strange demand because developers don’t have to name all subjects and objects with an IRI, or
  2. Auto-generate a regular IRI for each predicate and graph name, which seems strange because blank node identifiers exist for this very purpose (not to mention this solution won’t work in all cases, more below), or
  3. Auto-generate a special IRI for each predicate and graph name, which would basically re-invent blank node identifiers.

The Problem

The problem surfaces itself when you try to convert a JSON-LD document to RDF. If the RDF Working Group doesn’t allow blank node identifiers for predicates and graphs, then what do you use to identify predicates and graphs that have blank node identifiers associated with them in the JSON-LD data model? This is a feature we do want to support because there are a number of important use cases that it enables. The use cases include:

  1. Blank node predicates allow JSON to be mapped directly to the JSON-LD and RDF data models.
  2. Blank node graph names allow developers to use graphs without explicitly naming them.
  3. Blank node graph names make the RDF Dataset Normalization algorithm simpler.
  4. Blank node graph names prevent the creation of a parallel mechanism to generate and manage blank node-like identifiers.

It’s easy to see the problem exposed when performing RDF Dataset Normalization, which we need to do in order to digitally sign information expressed in JSON-LD and RDF. The rest of this post will focus on this area, as it exposes the problems with not supporting blank node identifiers for predicates and graph names. In JSON-LD, the two-graph document above could be normalized to this NQuads (subject, predicate, object, graph) representation:

_:bnode0 _:name "Joe" _:graph1 .
_:bnode1 _:name "Susan" _:graph2 .

This is illegal in RDF since you can’t have a blank node identifier in the predicate or graph position. Even if we were to use an IRI in the predicate position, the problem (of not being able to normalize “un-labeled” JSON-LD graphs like the ones in the previous section) remains.
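The restriction can be written down as a small check. This Python sketch treats each quad as a tuple of strings and flags blank node identifiers in the predicate or graph position; the function names are made up for the example.

```python
def is_blank(term):
    """True if the term is written as a blank node identifier."""
    return term.startswith("_:")


def illegal_positions(quad):
    """List the positions where a blank node is not allowed by RDF as described above."""
    subject, predicate, obj, graph = quad
    problems = []
    if is_blank(predicate):
        problems.append("predicate")
    if graph is not None and is_blank(graph):
        problems.append("graph")
    return problems


quads = [
    ("_:bnode0", "_:name", '"Joe"', "_:graph1"),
    ("_:bnode1", "_:name", '"Susan"', "_:graph2"),
    ("_:bnode0", "<http://example.com/base#name>", '"Joe"', "<http://example.com/base#graph1>"),
]

for quad in quads:
    print(quad, "->", illegal_positions(quad) or "ok")
```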

The Solutions

This section will cover the proposed solutions to the problem, in order from least desirable to most desirable.

Don’t allow blank node identifiers for predicates and graph names

Doing this in JSON-LD ignores the point of contention. The same line of argumentation can be applied to RDF. The point is that by forcing developers to name graphs using IRIs, we’re forcing them to do something that they don’t have to do with subjects and objects. There is no technical reason that has been presented where the use of a blank node identifier in the predicate or graph position is unworkable. Telling developers that they must name graphs using IRIs will be surprising to them, because there is no reason that the software couldn’t just handle that case for them. Requiring developers to do things that a computer can handle for them automatically is anti-developer and will harm adoption in the long run.

Generate fragment identifiers for graph names

One solution is to generate fragment identifiers for graph names. This, coupled with the base IRI would allow the data to be expressed legally in NQuads:

_:bnode0 <http://example.com/base#name> "Joe" <http://example.com/base#graph1> .
_:bnode1 <http://example.com/base#name> "Susan" <http://example.com/base#graph2> .

The above is legal RDF. The approach is problematic when you don’t have a base IRI, such as when JSON-LD is used as a messaging protocol between two systems. In that use case, you end up with something like this:

_:bnode0 <#name> "Joe" <#graph1> .
_:bnode1 <#name> "Susan" <#graph2> .

RDF requires absolute IRIs and so the document above is illegal from an RDF perspective. The other down-side is that you have to keep track of all fragment identifiers in the output and make sure that you don’t pick fragment identifiers that are used elsewhere in the document. This is fairly easy to do, but now you’re in the position of tracking and renaming both blank node identifiers and fragment IDs. Even if this approach worked, you’d be re-inventing the blank node identifier. This approach is unworkable for systems like PaySwarm that use transient JSON-LD messages across a REST API; there is no base IRI in this use case.
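The base IRI dependency is easy to demonstrate with Python's standard URL tools. In this sketch, `resolve_fragment` and `is_absolute` are hypothetical helpers, and "has a scheme" is used as a simplified test for an absolute IRI.

```python
from urllib.parse import urljoin, urlparse


def resolve_fragment(fragment, base):
    """Resolve a generated fragment identifier against an optional base IRI."""
    return urljoin(base or "", fragment)


def is_absolute(iri):
    """An IRI with a scheme is treated as absolute for this sketch."""
    return bool(urlparse(iri).scheme)


with_base = resolve_fragment("#graph1", "http://example.com/base")
without_base = resolve_fragment("#graph1", None)

print(with_base, is_absolute(with_base))
print(without_base, is_absolute(without_base))
```

With a base IRI the result is usable as a graph name; without one, the fragment stays relative and cannot be used in RDF.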

Skolemize to create identifiers for graph names

Another approach is skolemization, which is just a fancy way of saying: generate a unique IRI for the blank node when expressing it as RDF. The output would look something like this:

_:bnode0 <http://blue.example.com/.well-known/genid/2938570348579834> "Joe" <http://blue.example.com/.well-known/genid/348570293572375> .
_:bnode1 <http://blue.example.com/.well-known/genid/2938570348579834> "Susan" <http://blue.example.com/.well-known/genid/49057394572309457> .

This would be just fine if there was only one application reading and consuming data. However, when we are talking about RDF Dataset Normalization, there are cases where two applications must read and independently verify the representation of a particular IRI. One scenario that illustrates the example fairly nicely is the blind verification scenario. In this scenario, two applications de-reference an IRI to fetch a JSON-LD document. Each application must perform RDF Dataset Normalization and generate a hash of that normalization to see if they retrieved the same data. Based on a strict reading of the skolemization rules, Application A would generate this:

_:bnode0 <http://blue.example.com/.well-known/genid/2938570348579834> "Joe" <http://blue.example.com/.well-known/genid/348570293572375> .
_:bnode1 <http://blue.example.com/.well-known/genid/2938570348579834> "Susan" <http://blue.example.com/.well-known/genid/49057394572309457> .

and Application B would generate this:

_:bnode0 <http://red.example.com/.well-known/genid/J8Sfei8f792Fd3> "Joe" <http://red.example.com/.well-known/genid/j28cY82Pa88> .
_:bnode1 <http://red.example.com/.well-known/genid/J8Sfei8f792Fd3> "Susan" <http://red.example.com/.well-known/genid/k83FyUuwo89DF> .

Note how the two graphs would never hash to the same value because the Skolem IRIs are completely different. The RDF Dataset Normalization algorithm would have no way of knowing which IRIs are blank node stand-ins and which ones are legitimate IRIs. You could say that publishers are required to assign the skolemized IRIs to the data they publish, but that ignores the point of contention, which is that you don’t want to force developers to create identifiers for things that they don’t care to identify. You could argue that the publishing system could generate these IRIs, but then you’re still creating a global identifier for something that is specifically meant to be a document-scoped identifier.
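The blind verification failure can be reproduced with a short Python sketch. The `skolemize` and `digest` helpers are assumptions made for illustration: Skolem IRIs are minted with random UUIDs, and the "normalization" is just sorting the serialized lines, which is far simpler than the real RDF Dataset Normalization algorithm.

```python
import hashlib
import uuid


def skolemize(quads, authority):
    """Replace blank node predicates and graph names with fresh Skolem IRIs."""
    minted = {}

    def skolem(term):
        if not term.startswith("_:"):
            return term
        if term not in minted:
            minted[term] = f"<http://{authority}/.well-known/genid/{uuid.uuid4().hex}>"
        return minted[term]

    return [(s, skolem(p), o, skolem(g)) for s, p, o, g in quads]


def digest(quads):
    """Hash the quads after serializing them one per line in sorted order."""
    lines = sorted(" ".join(quad) + " ." for quad in quads)
    return hashlib.sha256("\n".join(lines).encode("utf-8")).hexdigest()


document = [
    ("_:bnode0", "_:name", '"Joe"', "_:graph1"),
    ("_:bnode1", "_:name", '"Susan"', "_:graph2"),
]

hash_a = digest(skolemize(document, "blue.example.com"))
hash_b = digest(skolemize(document, "red.example.com"))
print(hash_a == hash_b)
```

Both applications start from the same document, yet the final comparison comes out unequal because each one minted its own identifiers.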

A more lax reading of the Skolemization language might allow one to create a special type of Skolem IRI that could be detected by the RDF Dataset Normalization algorithm. For example, let’s say that since JSON-LD is the one that is creating these IRIs before they go out to the RDF Dataset Normalization Algorithm, we use the tag IRI scheme. The output would look like this for Application A:

_:bnode0 <tag:w3.org,2013:dsid:345> "Joe" <tag:w3.org,2013:dsid:254> .
_:bnode1 <tag:w3.org,2013:dsid:345> "Susan" <tag:w3.org,2013:dsid:363> .

and this for Application B:

_:bnode0 <tag:w3.org,2013:dsid:a> "Joe" <tag:w3.org,2013:dsid:b> .
_:bnode1 <tag:w3.org,2013:dsid:a> "Susan" <tag:w3.org,2013:dsid:c> .

The solution still doesn’t work, but we could add another step to the RDF Dataset Normalization algorithm that would allow it to rename any IRI starting with tag:w3.org,2013:. Keep in mind that this is exactly the same thing that we do with blank nodes, and it’s effectively duplicating that functionality. The algorithm would allow us to generate something like this for both applications doing a blind verification:

_:bnode0 <tag:w3.org,2013:dsid:predicate-1> "Joe" <tag:w3.org,2013:dsid:graph-1> .
_:bnode1 <tag:w3.org,2013:dsid:predicate-1> "Susan" <tag:w3.org,2013:dsid:graph-2> .
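The extra renaming step could look something like the Python sketch below. It is a simplification under stated assumptions: the `relabel` helper is hypothetical, and it names identifiers by order of first appearance, whereas a real normalization algorithm cannot rely on input order.

```python
PREFIX = "<tag:w3.org,2013:dsid:"


def relabel(quads):
    """Rename special tag IRIs by order of first appearance."""
    names = {}
    counters = {"predicate": 0, "graph": 0}

    def rename(term, kind):
        if not term.startswith(PREFIX):
            return term
        if term not in names:
            counters[kind] += 1
            names[term] = f"{PREFIX}{kind}-{counters[kind]}>"
        return names[term]

    return [(s, rename(p, "predicate"), o, rename(g, "graph")) for s, p, o, g in quads]


app_a = [
    ("_:bnode0", "<tag:w3.org,2013:dsid:345>", '"Joe"', "<tag:w3.org,2013:dsid:254>"),
    ("_:bnode1", "<tag:w3.org,2013:dsid:345>", '"Susan"', "<tag:w3.org,2013:dsid:363>"),
]
app_b = [
    ("_:bnode0", "<tag:w3.org,2013:dsid:a>", '"Joe"', "<tag:w3.org,2013:dsid:b>"),
    ("_:bnode1", "<tag:w3.org,2013:dsid:a>", '"Susan"', "<tag:w3.org,2013:dsid:c>"),
]

print(relabel(app_a) == relabel(app_b))
for quad in relabel(app_a):
    print(" ".join(quad), ".")
```

Note that the bookkeeping in `relabel` is the same bookkeeping a processor already performs for blank node identifiers.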

This solution does violate one strong suggestion in the Skolemization section:

Systems wishing to do this should mint a new, globally unique IRI (a Skolem IRI) for each blank node so replaced.

The IRI generated is definitely not globally unique, as there will be many tag:w3.org,2013:dsid:graph-1s in the world, each associated with data that is completely different. This approach also goes against something else in Skolemization that states:

This transformation does not appreciably change the meaning of an RDF graph.

It’s true that using tag IRIs doesn’t change the meaning of the graph when you assume that the document will never find its way into a database. However, once you place the document in a database, it certainly creates the possibility of collisions in applications that are not aware of the special-ness of IRIs starting with tag:w3.org,2013:dsid:. The data is fine taken by itself, but a disaster when merged with other data. We would have to put a warning in some specification for systems to make sure to rename the incoming tag:w3.org,2013:dsid: IRIs to something that is unique to the storage subsystem. Keep in mind that this is exactly what is done when importing blank node identifiers into a storage subsystem. So, we’ve more-or-less re-invented blank node identifiers at this point.
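
To make the collision risk concrete, here is a rough sketch of the renaming that a storage subsystem would have to perform on import. Everything in it is an assumption made for illustration: quads are plain tuples of strings, the function name localize_dsid_iris is hypothetical, and urn:uuid: is just one possible way to mint store-unique identifiers.

```python
import uuid

DSID_PREFIX = "tag:w3.org,2013:dsid:"

def localize_dsid_iris(quads):
    """Replace each relabelable IRI in ONE document with a fresh store-unique IRI."""
    replacements = {}

    def localize(term):
        if not term.startswith(DSID_PREFIX):
            return term
        # The same incoming IRI must map to the same new IRI within a document.
        if term not in replacements:
            replacements[term] = f"urn:uuid:{uuid.uuid4()}"
        return replacements[term]

    return [tuple(localize(term) for term in quad) for quad in quads]

# Two unrelated documents that happen to use the same normalized names.
DOC_A = [
    ("_:bnode0", "tag:w3.org,2013:dsid:predicate-1", '"Joe"', "tag:w3.org,2013:dsid:graph-1"),
    ("_:bnode1", "tag:w3.org,2013:dsid:predicate-1", '"Susan"', "tag:w3.org,2013:dsid:graph-2"),
]
DOC_B = [
    ("_:bnode0", "tag:w3.org,2013:dsid:predicate-1", '"Alice"', "tag:w3.org,2013:dsid:graph-1"),
]

STORED_A = localize_dsid_iris(DOC_A)
STORED_B = localize_dsid_iris(DOC_B)
```

Without a step like this, the graph-1 of the first document and the graph-1 of the second would silently merge into one graph in the store.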

Allow blank node identifiers for graph names

This leads us to the question of why not just extend RDF to allow blank node identifiers for predicates and graph names? Ideally, that’s what I would like to see happen in the future as it places the least burden on developers, and allows RDF to easily model JSON. The responses from the RDF WG are varied. These are all of the current arguments against that I have heard:

There are other ways to solve the problem, like fragment identifiers and skolemization, than introducing blank nodes for predicates and graph names.

Fragment identifiers don’t work, as demonstrated above. There is really only one workable solution based on a very lax reading of skolemization, and as demonstrated above, even the best skolemization solution re-invents the concept of a blank node.

There are other use cases that are blocked by the introduction of blank node identifiers into the predicate and graph name position.

While this has been asserted, it is still unclear exactly what those use cases are.

Adding blank node identifiers for predicates and graph names will break legacy applications.

If blank nodes for predicates and graph names were illegal before, wouldn’t legacy applications reject that sort of input? The argument that there are bugs in legacy applications that make them not robust against this type of input is valid, but should that prevent the right solution from being adopted? There has been no technical reason put forward for why blank nodes for predicates or graph names cannot work, other than software bugs prevent it.

The PaySwarm work has chosen to model the data in a very strange way.

The people that have been working on RDFa, JSON-LD, and the Web Payments specifications for the past 5 years have spent a great deal of time attempting to model the data in the simplest way possible, and in a way that is accessible to developers that aren’t familiar with RDF. Whether or not it may seem strange is arguable since this response is usually levied by people not familiar with the Web Payments work. This blog post outlines a variety of use cases where the use of a blank node for predicates and graph naming is necessary. Stating that the use cases are invalid ignores the point of contention.

If we allow blank nodes to be used when naming graphs, then those blank nodes should denote the graph.

At present, RDF states that a graph named using an IRI may denote the graph or it may not denote the graph. This is a fancy way of saying that the IRI that is used for the graph name may be an identifier for something completely different (like a person), but de-referencing the IRI over the Web results in a graph about cars. I personally think that is a very dangerous concept to formalize in RDF, but there are others that have strong opinions to the contrary. The chances of this being changed in RDF 1.1 are next to none.

Others have argued that while that may be the case for IRIs, it doesn’t have to be the case for blank nodes that are used to name graphs. In this case, we can just state that the blank node denotes the graph because it couldn’t possibly be used for anything else since the identifier is local to the document. This makes a great deal of sense, but it is different from how an IRI is used to name a graph and that difference is concerning to a number of people in the RDF Working Group.

However, that is not an argument to disallow blank nodes from being used for predicates and graph names. The group could still allow blank nodes to be used for this purpose while stating that they may or may not be used to denote the graph.

The RDF Working Group does not have enough time left in its charter to make a change this big.

While this may be true, not making a decision on this is causing more work for the people working on JSON-LD and RDF Dataset Normalization. Having the tag:w3.org,2013:dsid: identifier scheme is also going to make many RDF-based applications more complex in the long run, resulting in a great deal more work than just allowing blank nodes for predicates and graph names.

Conclusion

I have a feeling that the RDF Working Group is not going to do the right thing on this one due to the time pressure of completing the work that they’ve taken on. The group has already requested, and has been granted, a charter extension. Another extension is highly unlikely, so the group wants to get everything wrapped up. This discussion could take several weeks to settle. That said, the solution that will most likely be adopted (a special tag-based skolem IRI) will cause months of work for people living in the JSON-LD and RDF ecosystem. The best solution in the long run would be to solve this problem now.

If blank node identifiers for predicates and graphs are rejected, here is the proposal that I think will move us forward while causing an acceptable amount of damage down the road:

  1. JSON-LD continues to support blank node identifiers for use as predicates and graph names.
  2. When converting JSON-LD to RDF, a special, relabelable IRI prefix will be used for blank nodes in the predicate and graph name position of the form tag:w3.org,2013:dsid:

Thanks to Dave Longley for proofing this blog post and providing various corrections.

DRM in HTML5

A few days ago, a proposal was put forward in the HTML Working Group (HTML WG) by Microsoft, Netflix, and Google to take DRM in HTML5 to the next stage of standardization at W3C. This triggered another uproar about the morality and ethics behind DRM and building it into the Web. There are good arguments about morality/ethics on both sides of the debate but ultimately, the HTML WG will decide whether or not to pursue the specification based on technical merit. I (@manusporny) am a member of the HTML WG. I was also the founder of a start-up that focused on building a legal, peer-to-peer, content distribution network for music and movies. It employed DRM much like the current DRM in HTML5 proposal. During the course of 8 years of technical development, we had talks with many of the major record labels. I have first-hand knowledge of the problem and of building a technical solution to address it.

TL;DR: The Encrypted Media Extensions (DRM in HTML5) specification does not solve the problem the authors are attempting to solve, which is the protection of content from opportunistic or professional piracy. The HTML WG should not publish First Public Working Drafts that do not effectively address the primary goal of a specification.

The Problem

The fundamental problem that the Encrypted Media Extensions (EME) specification seems to be attempting to solve is to find a way to reduce piracy (since eliminating piracy on the Web is an impossible problem to solve). This is a noble goal as there are many content creators and publishers that are directly impacted by piracy. These are not faceless corporations, they are people with families that depend on the income from their creations. It is with this thought in mind that I reviewed the specification on a technical basis to determine if it would lead to a reduction in piracy.

Review Notes for Encrypted Media Extensions (EME)

Introduction

The EME specification does not specify a DRM scheme; rather, it explains the architecture for a DRM plug-in mechanism. This will lead to plug-in proliferation on the Web. Plugins are detrimental to inter-operability because it is inevitable that the DRM plugin vendors will not be able to support all platforms at all times. So, some people will be able to view content, others will not.

A simple example of the problem is Silverlight by Microsoft. Take a look at the Plugin details for Silverlight, specifically, click on the “System Requirements” tab. Silverlight is Microsoft’s creation. Microsoft is a HUGE corporation with very deep pockets. They can and have thrown a great deal of money at solving very hard problems. Even Microsoft does not support their flagship plugin on Internet Explorer 8 on older versions of their operating system and the latest version of Chrome on certain versions of Windows and Mac. If Microsoft can’t make their flagship Web plugin work across all major Operating Systems today, what chance does a much smaller DRM plugin company have?

The purpose of a standard is to increase inter-operability across all platforms. It has been demonstrated that plug-ins, on the whole, harm inter-operability in the long run and often create many security vulnerabilities. The one shining exception is Flash, but we should not mistake an exception for the rule. Also note that Flash is backed by Adobe, a gigantic multi-national corporation with very deep pockets.

1.1 Goals

The goals section does not state the actual purpose of the specification. It states meta-purposes like: “Support a range of content security models, including software and hardware-based models” and “Support a wide range of use cases.” While those are sub-goals, the primary goal isn’t stated once in the Goals section. The only rational primary goal is to reduce the amount of opportunistic piracy on the Web. Links to piracy data collected over the last decade could help make the case that this is worth doing.

1.2.1. Content Decryption Module (CDM)

When we were working on our DRM system, we took almost exactly the same approach that the EME specification does. We had a plug-in system that allowed different DRM modules to be plugged into the system. We assumed that each DRM scheme had a shelf-life of about 2-3 months before it was defeated, so our system would rotate the DRM modules every 3 months. We had plans to create genetic algorithms that would encrypt and watermark data into the file stream and mutate the encryption mechanism every couple of months to keep the pirates busy. It was a very complicated system to keep working because one slip up in the DRM module meant that people couldn’t view the content they had purchased. We did get the system working in the end, but it was a nightmare to make sure that the DRM modules to decrypt the information were rotated often enough to be effective while ensuring that they worked across all platforms.

Having first-hand knowledge of how such a system works, it’s a pretty terrible idea for the Web because it takes a great deal of competence and coordination to pull something like this off. I would expect the larger Content Protection companies to not have an issue with this. The smaller Content Protection companies, however, will inevitably have issues with ensuring that their DRM modules work across all platforms.

The bulk of the specification

The bulk of the specification is what you would expect from a system like this, so I won’t go into the gory details. There were two major technical concerns I had while reading through the implementation notes.

The first is that key retrieval is handled by JavaScript code, which means that anybody using a browser could copy the key data. This means that if a key is sent in the clear, the likelihood that the DRM system could be compromised goes up considerably because the person that is pirating the content knows the details necessary to store and decrypt the content.

If the goal is to reduce opportunistic piracy, all keys should be encrypted so that snooping by the browser doesn’t result in the system being compromised. Otherwise, all you would need to do is install a plugin that shares all clear-text keys with something like Mega. Pirates could use those keys to then decrypt byte-streams that do not mutate between downloads. To my knowledge, most DRM’ed media delivery does not encrypt content on a per-download basis. So, the spec needs to make it very clear that opaque keys MUST be used when delivering media keys.

One of the DRM systems we built, which became the primary way we did things, would actually re-encrypt the byte stream for every download. So even if a key was compromised, you couldn’t use the key to decrypt any other downloads. This was massively computationally expensive, but since we were running a peer-to-peer network, the processing was pushed out to the people downloading stuff in the network and not our servers. Sharing of keys was not possible in our DRM system, so we could send the decryption keys in the clear. I doubt many of the Content Protection Networks will take this approach as it would massively spike the cost of delivering content.
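
The property described above, that a key captured from one download is useless against any other download, can be illustrated with a toy sketch. To be clear about the assumptions: this is not the system we built, the function names are hypothetical, and the cipher is a deliberately simple SHA-256 keystream XOR chosen so that the example runs with only the Python standard library. It is not suitable for protecting real content.

```python
import hashlib
import secrets

def keystream_xor(key: bytes, data: bytes) -> bytes:
    """Toy stream cipher: XOR data with a SHA-256 based keystream (same call decrypts)."""
    stream = bytearray()
    counter = 0
    while len(stream) < len(data):
        stream.extend(hashlib.sha256(key + counter.to_bytes(8, "big")).digest())
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def prepare_download(content: bytes):
    """Encrypt the content under a fresh random key for one single download."""
    key = secrets.token_bytes(32)
    return key, keystream_xor(key, content)

CONTENT = b"pretend this is a movie byte stream" * 4

KEY_A, STREAM_A = prepare_download(CONTENT)  # first customer
KEY_B, STREAM_B = prepare_download(CONTENT)  # second customer
```

Because every download has its own key and its own byte stream, publishing KEY_A only helps someone who also holds STREAM_A. The cost is that the content has to be encrypted again for every download, which is the expense discussed above.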

6. Simple Decryption

The “org.w3.clearkey” Key System indicates a plain-text clear (unencrypted) key will be used to decrypt the source. No additional client-side content protection is required.

Wow, what a fantastically bad idea.

  1. This sends the decryption key in the clear. This key can be captured by any Web browser plugin. That plugin can then share the decryption key and the byte stream with the world.
  2. It duplicates the purpose of Transport Layer Security (TLS).
  3. It doesn’t protect anything while adding a very complex way of shipping an encrypted byte stream from a Web server to a Web browser.

So. Bad. Seriously, there is nothing secure about this mechanism. It should be removed from the specification.

9.1. Use Cases: “user is not an adversary”

This is not a technical issue, but I thought it would be important to point it out. This “user is not an adversary” text can be found in the first question about use cases. It insinuates that people that listen to radio and watch movies online are potential adversaries. As a business owner, I think that’s a terrible way to frame your customers.

Thinking of the people that are using the technology that you’re specifying as “adversaries” is also largely wrong. 99.999% of people using DRM-based systems to view content are doing it legally. The folks that are pirating content are not sitting down and viewing the DRM stream, they have acquired a non-DRM stream from somewhere else, like Mega or The Pirate Bay, and are watching that. This language is unnecessary and should be removed from the specification.

Conclusion

There are some fairly large security issues with the text of the current specification. Those can be fixed.

The real goal of this specification is to create a framework that will reduce content piracy. The specification has not put forward any mechanism that demonstrates that it would achieve this goal.

Here’s the problem with EME – it’s easy to defeat. In the very worst case, there exist piracy rigs that allow you to point an HD video camera at a HD television and record the video and audio without any sort of DRM. That’s the DRM-free copy that will end up on Mega or the Pirate Bay. In practice, no DRM system has survived for more than a couple of years.

Content creators, if your content is popular, EME will not protect your content against a content pirate. Content publishers, your popular intellectual property will be no safer wrapped in anything that this specification can provide.

The proposal does not achieve the goal of the specification, so it is not ready for First Public Working Draft publication via the HTML Working Group.

Aaron Swartz, PaySwarm, and Academic Journals

For those of you that haven’t heard yet, Aaron Swartz took his own life two days ago. Larry Lessig has a follow-up on one of the reasons he thinks led to his suicide (the threat of 50 years in jail over the JSTOR case).

I didn’t know Aaron at all. A large number of people that I deeply respect did, and have written about his life with great admiration. I, like most of you that have read the news, have done so while brewing a cauldron of mixed emotions. Saddened that someone who had achieved so much good in their life is no longer in this world. Angry that Aaron chose this ending. Sickened that this is the second recent suicide, Iilya’s being the first, involving a young technologist trying to make the world a better place for all of us. Afraid that other technologists like Aaron and Iilya will choose this path over persisting in their noble causes. Helpless. Helpless because this moment will pass, just like Iilya’s did, with no great change in the way our society deals with mental illness, and with none of the change that Aaron was fighting for having been realized.

Nobody likes feeling helpless. I can’t mourn Aaron because I didn’t know him. I can mourn the idea of Aaron, of the things he stood for. While reading about what he stood for, several disconnected ideas kept rattling around in the back of my head:

  1. We’ve hit a point of ridiculousness in our society where people at HSBC knowingly laundering money for drug cartels get away with it, while people like Aaron are labeled felons and face upwards of 50 years in jail for “stealing” academic articles. This, even after the publisher of said academic articles drops the charges. MIT never dropped their charges.
  2. MIT should make it clear that he was not a felon or a criminal. MIT should posthumously pardon Aaron and commend him for his life’s work.
  3. The way we do peer-review and publish scientific research has to change.
  4. I want to stop reading about all of this, it’s heartbreaking. I want to do something about it – make something positive out of this mess.

Ideas, Floating

I was catching up on news this morning when the following floated past on Twitter:

clifflampe: It seems to me that the best way for we academics to honor Aaron Swartz’s memory is to frigging finally figure out open access publishing.

1Copenut: @clifflampe And finally implement a micropayment system like @manusporny’s #payswarm. I don’t want the paper-but I’ll pay for the stories.

1Copenut: @manusporny These new developments with #payswarm are a great advance. Is it workable with other backends like #Middleman or #Sinatra?

This was interesting because we have been talking about how PaySwarm could be applied to academic publishing for a while now. All the discussions to this point have been internal, so we didn’t know if anybody would make the connection between the infrastructure that PaySwarm provides and how it could be applied to academic journals. This is up on our ideas board as a potential area where PaySwarm could be applied:

  • Payswarm for peer-reviewed, academic publishing
    • Use Payswarm identity mechanism to establish trusted reviewer and author identities for peer review
    • Use micropayment mechanism to fund research
    • Enable university-based group-accounts for purchasing articles, or refunding researcher purchases

Journals as Necessary Evils

For those in academia, journals are often viewed as a necessary evil. They cost a fortune to subscribe to, farm out most of their work to academics that do it for free, and employ an iron-grip on the scientific publication process. Most academics that I speak with would do away with journal organizations in a heartbeat if there was a viable alternative. Most of the problem is political, which is why we haven’t felt compelled to pursue fixing it. Political problems often need a groundswell of support and a number of champions that are working inside the community. I think the groundswell is almost here. I don’t know who the set of academic champions are that will be the ones to push this forward. Additionally, if nobody takes the initiative to build such a system, things won’t change.

Here’s what we (Digital Bazaar) have been thinking. To fix the problem, you need at least the following core features:

  • Web-scale identity mechanisms – so that you can identify reviewers and authors for the peer-review process regardless of which site is publishing or reviewing a paper.
  • Decentralized solution – so that universities and researchers drive the process – not the publishers of journals.
  • Some form of remuneration system – you want to reward researchers with heavily cited papers, but in a way that makes it very hard to game the system.

Scientific Remuneration

PaySwarm could be used to implement each of these core features. At its core, PaySwarm is a decentralized payment mechanism for the Web. It also has a decentralized identity mechanism that is solid but does not violate your privacy. There is a demo that shows how it can be applied to WordPress blogs where just an abstract is published, and if the reader wants to see more of the article, they can pay a small fee to read it. It doesn’t take a big stretch of the imagination to replace “blog article” with “research paper”. The hope is that researchers would set access prices on articles such that any purchase to access the research paper would then go directly to funding their current research. This would empower universities and researchers with an additional revenue stream while reducing the grip that scientific publishers currently have on our higher-education institutions.

A Decentralized Peer-review Process

Remuneration is just one aspect of the problem. Arguably, it is the lesser of the problems in academic publishing. The biggest technical problem is how you do peer review on a global, distributed scale. Quite obviously, you need a solid identity system that can identify scientists over the long term. You need to understand a scientist’s body of work and how respected their research is in their field. You also need a review system that is capable of pairing scientists and papers in need of review. PaySwarm has a strong identity system in place using the Web as the identification mechanism. Here is the PaySwarm identity that I use for development: https://dev.payswarm.com/i/manu. Clearly, paper publishing systems wouldn’t expose that identity URL to people using the system, but I include it to show what a Web-scale identifier looks like.

Web-scale Identity

If you go to that identity URL, you will see two sets of information: my public financial accounts and my digital signature keys. A PaySwarm Authority can annotate this identity with even more information, like whether or not an e-mail address has been verified against the identity. Is there a verified cellphone on record for the identity? Is there a verified driver’s license on record for the identity? What about a Twitter handle? A Google+ handle? All of these pieces of information can be added and verified by the PaySwarm Authority in order to build an identity that others can trust on the Web.

What sorts of pieces of information need to be added to a PaySwarm identity to trust its use for academic publishing? Perhaps a list of articles published by the identity? Review comments for all other papers that have been reviewed by the identity? Areas of research that others have certified that the identity is an expert on? This is pretty basic Web-of-trust stuff, but it’s important to understand that PaySwarm has this sort of stuff baked into the core of the design.

The Process

Leveraging identity to make decentralized peer-review work is the goal, and here is how it would work from a researcher perspective:

  1. A researcher would get a PaySwarm identity from any PaySwarm Authority. There is no cost associated with getting such an identity. This sub-system is already implemented in PaySwarm.
  2. A researcher would publish an abstract of their paper in a Linked Data format such as RDFa. This abstract would identify the authors of the paper and some other basic information about the paper. It would also have a digital signature on the information using the PaySwarm identity that was acquired in the previous step. The researcher would set the cost to access the full article using any PaySwarm-compatible system. All of this is already implemented in PaySwarm.
  3. A paper publishing system would be used to request a review among academic peers. Those peers would review the paper and publish digital signatures on review comments, possibly with a notice that the paper is ready to be published. This sub-system is fairly trivial to implement and would mirror the current review process with the important distinction that it would not be centralized at journal publications.
  4. Once a pre-set limit on the number of positive reviews has been met, the paper publishing system would place its stamp of approval on the paper. Note that different paper publishing systems may have different metrics just as journals have different metrics today. One benefit to doing it this way is that you don’t need a paper publishing system to put its stamp of approval on a paper at all. If you really wanted to, you could write the software to calculate whether or not the paper has gotten the appropriate amount of review because all of the information is on the Web by default. This part of the system would be fairly trivial to write once the metrics were known. It may take a year or two to get the correct set of metrics in place, but it’s not rocket science and it doesn’t need to be perfect before systems such as this are used to publish papers.
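
The check in the last step, whether a paper has gathered enough positive review, could be sketched roughly as follows. This is a minimal illustration and not part of PaySwarm: the Review fields, the function name has_enough_review, and the example.org identity URLs are all hypothetical, and signature verification is assumed to have happened elsewhere and is represented here by a simple boolean.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Review:
    reviewer: str           # identity URL of the reviewer (hypothetical field)
    signature_valid: bool   # outcome of verifying the digital signature elsewhere
    ready_to_publish: bool  # the reviewer's claim that the paper is ready

def has_enough_review(reviews, required_positive):
    """True once enough distinct, verified reviewers say the paper is ready."""
    approving = {
        r.reviewer for r in reviews if r.signature_valid and r.ready_to_publish
    }
    return len(approving) >= required_positive

REVIEWS = [
    Review("https://dev.payswarm.com/i/manu", True, True),
    Review("https://example.org/i/reviewer-a", True, True),
    Review("https://example.org/i/reviewer-a", True, True),   # duplicate, counted once
    Review("https://example.org/i/reviewer-b", False, True),  # bad signature, ignored
    Review("https://example.org/i/reviewer-c", True, False),  # not yet ready
]

APPROVED_AT_TWO = has_enough_review(REVIEWS, required_positive=2)
APPROVED_AT_THREE = has_enough_review(REVIEWS, required_positive=3)
```

Different paper publishing systems could plug in their own threshold or a more elaborate metric. The point is only that the calculation can be done by anyone, since the signed reviews are on the Web.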

From a reviewer perspective, it would work like so:

  1. You are asked to review papers by your peers once you have an acceptable body of published work. All of your work can be verified because it is tied to your PaySwarm identity. All review comments can be verified as they are tied to other PaySwarm identities. This part is fairly trivial to implement, since most of the work is already done for PaySwarm.
  2. Once you review a paper, you digitally sign your comments on the paper. If it is a good paper, you also include a claim that it is ready for broad publication. Again, technically simple to implement.
  3. Your reputation builds as you review more papers. The way that reputation is calculated is outside of the scope of this blog post mainly because it would need a great deal of input from academics around the world. Reputation is something that can be calculated, but many will argue about the algorithm and I would expect this to oscillate throughout the years as the system grows. In the end, there will probably be multiple reputation algorithms, not just one. All that matters is that people trust the reputation algorithms.

Freedom to Research and Publish

The end-goal is to build a system that empowers researchers and research institutions, is far more transparent than the current peer-reviewed publishing system, and remunerates the people doing the work more directly. You will also note that at no point does a traditional journal enter the picture to give you a stamp of approval and charge you a fee for publishing your paper. Researchers are in control of the costs at all stages. As I’ve said above, the hard part isn’t the technical nature of the project, it’s the political nature of it. I don’t know if this is enough of a pain-point among academics to actually start doing something about it today. I know some are, but I don’t know if many would use such a system over the draw of publications like Nature, PLOS, Molecular Genetics and Genomics, and Planta. Quite obviously, what I’ve proposed above isn’t a complete road map. There are issues and details that would need to be hammered out. However, I don’t understand why a system like this doesn’t already exist, so I implore the academic community to explain why what I’ve laid out above hasn’t been done yet.

It’s obvious that a system like this would be good for the world. Building such a system may have reduced the possibility of us losing someone like Aaron in the way that we did. He was certainly fighting for something like it. Talking about it makes me feel a bit less helpless than I did yesterday. Maybe making something good out of this mess will help some of you out there as well. If others offer to help, we can start building it.

So how about it researchers of the world, would you publish all of your research through such a system?

Objection to Microdata Candidate Recommendation

Full disclosure: I’m the current chair of the standards group at the World Wide Web Consortium that created the newest version of RDFa, editor of the HTML5+RDFa 1.1 and RDFa Lite 1.1 specifications, and I’m also a member of the HTML Working Group.

Edit: 2012-12-01 – Updated the article to rephrase some things, and include rationale and counter-arguments at the bottom in preparation for the HTML WG poll on the matter.

The HTML Working Group at the W3C is currently trying to decide if they should transition the Microdata specification to the next stage in the standardization process. There has been a call for consensus to transition the spec to the Candidate Recommendation stage. The problem is that we already have a set of specifications that are official W3C recommendations that do what Microdata does and more. RDFa 1.1 became an official W3C Recommendation last summer. From a standards perspective, advancing Microdata as well is a mistake and sends a confused signal to Web developers. Officially supporting two specifications that do almost exactly the same thing in almost exactly the same way is, ultimately, a failure to standardize.

The fact that RDFa already does what Microdata does has been elaborated upon before:

Mythical Differences: RDFa Lite vs. Microdata
An Uber-comparison of RDFa, Microdata, and Microformats

Here’s the problem in a nutshell: The W3C is thinking of ratifying two completely different specifications that accomplish the same thing in basically the same way. The functionality of RDFa, which is already a W3C Recommendation, overlaps Microdata by a large margin. In fact, RDFa Lite 1.1 was developed as a plug-in replacement for Microdata. The full version of RDFa can also do a number of things that Microdata cannot, such as datatyping, associating more than one type per object, embed-ability in languages other than HTML, ability to easily publish and mix vocabularies, etc.
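
To show how small the syntactic gap is, here is a rough Python sketch that rewrites a Microdata snippet into RDFa Lite 1.1 by renaming attributes. It is only an illustration under stated assumptions: the function name microdata_to_rdfa_lite is hypothetical, the rewriting is naive string substitution rather than real HTML parsing, and it only handles itemscope immediately followed by itemtype, plus itemprop.

```python
import re

def microdata_to_rdfa_lite(html: str) -> str:
    """Naively rename Microdata attributes to their RDFa Lite 1.1 counterparts."""
    def convert_type(match):
        # Split "http://schema.org/Person" into a vocabulary and a type name.
        vocab, type_name = match.group(1).rsplit("/", 1)
        return f'vocab="{vocab}/" typeof="{type_name}"'

    html = re.sub(r'itemscope\s+itemtype="([^"]+)"', convert_type, html)
    return re.sub(r'\bitemprop=', 'property=', html)

MICRODATA = (
    '<div itemscope itemtype="http://schema.org/Person">'
    '<span itemprop="name">Manu Sporny</span>'
    '</div>'
)

RDFA_LITE = microdata_to_rdfa_lite(MICRODATA)
# RDFA_LITE is now:
# <div vocab="http://schema.org/" typeof="Person"><span property="name">Manu Sporny</span></div>
```

For simple markup like this, the two syntaxes differ only in attribute names, which is why having both on the Recommendation track is so hard to justify.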

Microdata would have easily been dead in the water had it not been for two simple facts: 1) The editor of the specification works at Google, and 2) Google pushed Microdata as the markup language for schema.org before also accepting RDFa markup. The first enabled Google and the editor to work on schema.org without signalling to the public that it was creating a competitor to Facebook’s Open Graph Protocol. The second gave Microdata enough of a jump start to establish a foothold for schema.org markup. There have been a number of studies that show that Microdata’s sole use case (99% of Microdata markup) is for the markup of schema.org terms. Microdata is not widely used outside of that context. We now have data to back up what we had predicted would happen when schema.org made their initial announcement for Microdata-only support. Note that schema.org now supports both RDFa and Microdata.

It is typically a bad idea to have two formats published by the same organization that do the same thing. It leads to Web developer confusion surrounding which format to use. One of the goals of Web standards is to reduce, or preferably eliminate, the confusion surrounding the correct technology decision to make. The HTML Working Group and the W3C are failing miserably on this front. There is more confusion today about picking Microdata or RDFa because they accomplish the same thing in effectively the same way. Both exist only for political reasons.

If we step back and look at the technical arguments, there is no compelling reason that Microdata should be a W3C Recommendation. There is no compelling reason to have two specifications that do the same thing in basically the same way. Therefore, as a member of the HTML Working Group (not as a chair or editor of RDFa) I object to the publication of Microdata as a Candidate Recommendation.

Note that this is not a W3C formal objection. This is an informal objection to publishing Microdata along the Recommendation track. This objection will not become an official W3C formal objection if the HTML Working Group holds a poll to gather consensus around whether Microdata should proceed along the Recommendation publication track. I believe the publication of a W3C Note will continue to allow Google to support Microdata in schema.org, but will hopefully correct the confused message that the W3C has been sending to Web developers regarding RDFa and Microdata. We don’t need two specifications that do almost exactly the same thing.

The message sent by the W3C needs to be very clear: There is one recommendation for doing structured data markup in HTML. That recommendation is RDFa. It addresses all of the use cases that have been put forth by the general Web community, and it’s ready for broad adoption and implementation today.

If you agree with this blog post, make sure to let the HTML Working Group know that you do not think that the W3C should ratify two specifications that do almost exactly the same thing in almost exactly the same way. Now is the time to speak up!

Summary of Facts and Arguments

Below is a summary of arguments presented as a basis for publishing Microdata along the W3C Note track:

  1. RDFa 1.1 is already a ratified Web standard as of June 7th 2012 and absorbed almost every Microdata feature before it became official. If the majority of the differences between RDFa and Microdata boil down to different attribute names (property vs. itemprop), then the two solutions have effectively converged on syntax and W3C should not ratify two solutions that do effectively the same thing in almost exactly the same way.
  2. RDFa is supported by all of the major search crawlers, including Google (and schema.org), Microsoft, Yahoo!, Yandex, and Facebook. Microdata is not supported by Facebook.
  3. RDFa Lite 1.1 is feature-equivalent to Microdata. Over 99% of Microdata markup can be expressed easily in RDFa Lite 1.1. Converting from Microdata to RDFa Lite is as simple as a search and replace of the Microdata attributes with RDFa Lite attributes. Conversely, Microdata does not support a number of the more advanced RDFa features, like being able to tell the difference between feet and meters.
  4. You can mix vocabularies with RDFa Lite 1.1, supporting both schema.org and Facebook’s Open Graph Protocol (OGP) using a single markup language. You don’t have to learn Microdata for schema.org and RDFa for Facebook – just use RDFa for both.
  5. The creator of the Microdata specification doesn’t like Microdata. When people are not passionate about the solutions that they create, the desire to work on those solutions and continue to improve upon them is muted. The RDFa community is passionate about the technology that they have created together and have strived to make it better since the standardization of RDFa 1.0 back in 2008.
  6. RDFa Lite 1.1 is fully upward-compatible with RDFa 1.1, allowing you to seamlessly migrate to a more feature-rich language as your Linked Data needs grow. Microdata does not support any of the more advanced features provided by RDFa 1.1.
  7. RDFa deployment is broader than Microdata. RDFa deployment continues to grow at a rapid pace.
  8. The economic damage generated by publishing both RDFa and Microdata along the Recommendation track should not be underestimated. W3C should try to provide clear direction in an attempt to reduce the economic waste that a “let the market sort it out among two nearly identical solutions” strategy will generate. At some point, the market will figure out that both solutions are nearly identical, but only after publishing and building massive amounts of content and tooling for both.
  9. The W3C Technical Architecture Group (TAG), which is responsible for ensuring that the core architecture of the Web is sound, has raised their concern about the publication of both Microdata and RDFa as recommendations. After the W3C TAG raised their concerns, the RDFa Working Group created RDFa Lite 1.1 to be a near feature-equivalent replacement for Microdata that was also backwards-compatible with RDFa 1.0.
  10. Publishing a standard that does almost exactly the same thing as an existing standard in almost exactly the same way is a failure to standardize.
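To make the "search and replace" claim in point 3 concrete, here is a minimal Python sketch of such a conversion. The attribute mapping (itemprop to property, itemid to resource, itemscope plus itemtype to vocab plus typeof) is an assumption based on how the two syntaxes line up, and the function name is hypothetical. It is a naive string rewrite, not a real HTML parser: it assumes the type IRI ends in a slash-separated term and it ignores features such as itemref.

```python
import re

# Assumed Microdata -> RDFa Lite 1.1 attribute mapping (illustrative, not normative).
ATTRIBUTE_MAP = {
    "itemprop": "property",
    "itemid": "resource",
}


def microdata_to_rdfa_lite(html: str) -> str:
    """Naive search-and-replace conversion (hypothetical helper, not a parser)."""
    # itemscope itemtype="<vocab>/<Type>" becomes vocab="<vocab>/" typeof="<Type>".
    # Assumes itemtype directly follows itemscope and the IRI uses "/" before the term.
    html = re.sub(
        r'\bitemscope(?:="[^"]*")?\s+itemtype="([^"]*/)([^"/]+)"',
        r'vocab="\1" typeof="\2"',
        html,
    )
    # The remaining attributes are plain renames.
    for microdata_attr, rdfa_attr in ATTRIBUTE_MAP.items():
        html = re.sub(rf'\b{microdata_attr}=', f'{rdfa_attr}=', html)
    return html


microdata = (
    '<div itemscope itemtype="http://schema.org/Person">'
    '<span itemprop="name">Manu Sporny</span></div>'
)
print(microdata_to_rdfa_lite(microdata))
```

For real documents, a proper HTML parser would be safer than regular expressions; the sketch only shows how small the syntactic gap between the two languages is.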

Counter-arguments and Rebuttals

[This is a] classic case of monopolistic anti-competitive protectionism.

No, this is an objection to publishing two specifications that do almost exactly the same thing in almost exactly the same way along the W3C Recommendation publication track. Protectionism would have asked that all work on Microdata be stopped and the work scuttled. The proposed resolution does not block anybody from using Microdata, nor does it try to stop or block the Microdata work from happening in the HTML WG. The objection asks that the W3C decide what the best path forward for Web developers is based on a fairly complicated set of predicted outcomes. This is not an easy decision. The objection is intended to ensure that the HTML Working Group has this discussion before we proceed to Candidate Recommendation with Microdata.

<manu1> I'd like the W3C to work as well, and I think publishing two specs that accomplish basically 
        the same thing in basically the same way shows breakage.
<annevk> Bit late for that. XDM vs DOM, XPath vs Selectors, XSL-FO vs CSS, XSLT vs XQuery, 
         XQuery vs XQueryX, RDF/XML vs Turtle, XForms vs Web Forms 2.0, 
         XHTML 1.0 vs HTML 4.01, XML 1.0 4th Edition vs XML 1.0 5th Edition, 
         XML 1.0 vs XML 1.1, etc.

[link to full conversation]

While W3C does have a history of publishing competing specifications, there have been features in each competing specification that were compelling enough to warrant the publication of both standards. For example, XHTML 1.0 provided a standard set of rules for validating documents that was aligned with XML and a decentralized extension mechanism that HTML4.01 did not. Those two major features were viewed as compelling enough to publish both specifications as Recommendations via W3C.

For authors, the differences between RDFa and Microdata are so small that, for 99% of documents in the wild, you can convert a Microdata document to an RDFa Lite 1.1 document with a simple search and replace of attribute names. That demonstrates that the syntaxes for both languages are different only in the names of the HTML attributes, and that does not seem like a very compelling reason to publish both specifications as Recommendations.

Microdata’s processing algorithm is vastly simpler, which makes the data extracted more reliable and, when something does go wrong, makes it easier for 1) users to debug their own data, and 2) easier for me to debug it if they can’t figure it out on their own.

Microdata’s processing algorithm is simpler for two major reasons:

The complexity of implementing a processor has little bearing on how easy it is for developers to author documents. For example, XHTML 1.0 had a simpler processing model, which made the extracted data more reliable and made it easier to debug when something went wrong. However, HTML5 supports more use cases and recovers from errors where it can, which has made it more popular with Web developers in the long run.

Additionally, authors of Microdata and RDFa should be using tools like RDFa Play to debug their markup. This is true for any Web technology. We debug our HTML, JavaScript, and CSS by loading it into a browser and bringing up the debugging tools. This is no different for Microdata and RDFa. If you want to make sure your markup does what you want, make sure to verify it by using a tool and not by trying to memorize the processing rules and running them through your head.

For what it is worth, I personally think RDFa is generally a technically better solution. But as Marcos says, “so what”? Our job at W3C is to make standards for the technology the market decides to use.

If we think one of these technologies is a technically better solution than the other one, we should signal that realization at some level. The most basic thing we could do is to make one an official Recommendation, and the other a Note. I also agree that our job at W3C is to make standards that the technology market decides to use, but clearly this particular case isn’t that cut-and-dried. Schema.org’s only option in the beginning was to use Microdata, and since authors didn’t want to risk not showing up in the search engines, they used Microdata. This forced the market to go in one direction.

This discussion would be in a different place had Google kept the playing field level. That is not to say that Google didn’t have good reasons for making the decisions that they did at the time, but those reasons influenced the development of RDFa, and RDFa Lite 1.1 was the result. The differences between Microdata and RDFa have been removed and a new question is in front of us: given two almost identical technologies, should the W3C publish two specifications that do almost exactly the same thing in almost exactly the same way?

… the [HTML] Working Group explicitly decided not to pick a winner between HTML Microdata and HTML+RDFa

The question before the HTML WG at the time was whether or not to split Microdata out of the HTML5 specification. The HTML Working Group did not discuss whether the publishing track for the Microdata document should be the W3C Note track or the W3C Recommendation track. At the time the decision was made, RDFa Lite 1.1 did not exist, let alone as a W3C Recommendation, nor did RDFa and Microdata overlap in functionality as greatly as they do now. Additionally, the HTML WG decision at that time states the following under the “Revisiting the issue” section:

“If Microdata and RDFa converge in syntax…”

Microdata and RDFa have effectively converged in syntax. Since Microdata can be interpreted as RDFa with a simple search-and-replace of attributes, the only remaining difference is the attribute names. The proposal is not to have work on Microdata stopped. Let work on Microdata proceed in this group, but let it proceed on the W3C Note publication track.

Closing Statements

I felt uneasy raising this issue because it’s a touchy and painful subject for everyone involved. Even if the discussion is painful, it is a healthy one for a standardization body to have from time to time. What I wanted was for the HTML Working Group to have this discussion. If the upcoming poll finds that the consensus of the HTML Working Group is to continue with the Microdata specification along the Recommendation track, I will not pursue a W3C Formal Objection. I will respect whatever decision the HTML Working Group makes as I trust the Chairs of that group, the process that they’ve put in place, and the aggregate opinion of the members in that group. After all, that is how the standardization process is supposed to work and I’m thankful to be a part of it.

The Problem with RDF and Nuclear Power

Full disclosure: I am the chair of the RDFa Working Group and the JSON-LD Community Group, and a member of the RDF Working Group as well as other Semantic Web initiatives. I believe in this stuff, but am critical about the path we’ve been taking for a while now.

The Resource Description Framework (a model for publishing data on the Web) has a horrible public perception, akin to how many people in the USA view nuclear power. The coal industry campaigned quite aggressively to implant the notion that nuclear power was not as safe as coal. Couple this public misinformation campaign with a few nuclear-power-related catastrophes and it is no surprise that the current public perception toward nuclear power can be summarized as: “Not in my back yard”. Never mind that, per terawatt, nuclear power generation has killed far fewer people since its inception than coal. Never mind that it is one of the more viable power sources if we gaze hundreds of years into Earth’s future, especially with the recent renewed interest in Liquid Fluoride Thorium Reactors. When we look toward the future, the path is clear, but public perception is preventing us from proceeding down that path at the rate that we need to in order to prevent more damage to the Earth.

RDF shares a number of similarities with nuclear power. RDF is one of the best data modeling mechanisms that humanity has created. Looking into the future, there is no equally powerful, viable alternative. So, why has progress been slow on this very exciting technology? There was no public misinformation campaign, so where did this negative view of RDF come from?

In short, RDF/XML was the Semantic Web’s Three Mile Island incident. When it was released, developers confused RDF/XML (bad) with the RDF data model (good). There weren’t enough people or enough time to counteract the negative press that RDF was receiving as a result of RDF/XML and thus, we are where we are today because of this negative perception of RDF. Even Wikipedia’s page on the matter seems to imply that RDF/XML is RDF. Some purveyors of RDF think that the public perception problem isn’t that bad. I think that when developers hear RDF, they think: “Not in my back yard”.

The solution to this predicament: Stop mentioning RDF and the Semantic Web. Focus on tools for developers. Do more dogfooding.

To explain why we should adopt this strategy, we can look to Tesla for inspiration. Elon Musk, a co-founder of PayPal and now the CEO of Tesla Motors, recently announced the Tesla Supercharger project. At a high level, the project accomplishes the following jaw-dropping things:

  1. It creates a network of charging stations for electric cars that are capable of charging a Tesla in less than 30 minutes.
  2. The charging stations are solar powered and generate more electricity than the cars use, feeding the excess power into the local power grid.
  3. The charging stations are free to use for any person that owns a Tesla vehicle.
  4. The charging stations are operational and available today.

This means that, in 4-5 years, any owner of a Tesla vehicle will be able to drive anywhere in the USA, for free, powered by the sun. No person in their right mind (with the money) would pass up that offer. No fossil fuel-based company will ever be able to provide “free”, clean energy. This is the sort of proposition we, the RDF/Linked Data/Semantic Web community, need to make; I think we can re-position ourselves to do just that.

Here is what the RDF and Linked Data community can learn from Tesla:

  1. The message shouldn’t be about the technology. It should be about the problems we have today and a concrete solution on how to address those problems.
  2. Demonstrate real value. Stop talking about the beauty of RDF, theoretical value, or design. Deliver production-ready, open-source software tools.
  3. Build a network of believers by spending more of your time working with Web developers and open-source projects to convince them to publish Linked Data. Dogfood our work.

Here is how we’ve applied these lessons to the JSON-LD work:

  1. We don’t mention RDF in the specification, unless absolutely necessary, and in many cases it isn’t necessary. RDF is plumbing, it’s in the background, and developers don’t need to know about it to use JSON-LD.
  2. We purposefully built production-ready tools for JSON-LD from day one: a playground, multiple production-ready implementations, and a JavaScript implementation of the browser-based API.
  3. We are working with Wikidata, Wikimedia, Drupal, the Web Payments and Read Write Web groups at W3C, and a number of other private clients to ensure that we’re providing real value and dogfooding our work.
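To illustrate the first point, here is a small Python sketch of how a developer might handle a JSON-LD document using nothing but ordinary JSON tooling. The event and all of its values are made up for illustration. Only the `@context` and `@type` keys are JSON-LD keywords, and the use of the schema.org context is an assumption about which vocabulary a publisher would pick.

```python
import json

# A hypothetical event description; every value here is invented for illustration.
document = {
    "@context": "http://schema.org/",  # maps the plain keys below to a vocabulary
    "@type": "Event",
    "name": "Example Concert",
    "startDate": "2013-06-01T19:00:00-04:00",
    "location": {"@type": "Place", "name": "Example Hall"},
}

# Serialize and parse it like any other JSON; no RDF library or terminology required.
serialized = json.dumps(document, indent=2)
round_tripped = json.loads(serialized)

print(round_tripped["name"])
print(round_tripped["location"]["name"])
```

The point of the sketch is that the document stays ordinary JSON: a developer who ignores `@context` entirely can still read and write it, while Linked Data tooling can use the context to interpret the same keys.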

Ultimately, RDF and the Semantic Web are of no interest to Web developers. They also have a serious public perception problem. We should stop talking about them. Let’s shift the focus to Linked Data, explaining the problems that Web developers face today and offering concrete, demonstrable solutions to those problems.

Note: This post isn’t meant as a slight against any one person or group. I was just working on the JSON-LD spec, aggressively removing prose discussing RDF, and the analogy popped into my head. This blog post was an exercise in organizing my thoughts on the matter.

HTML5 and RDFa 1.1

Full disclosure: I’m the chair of the newly re-chartered RDFa Working Group at the W3C as well as a member of the HTML WG.

The newly re-chartered RDFa Working Group at the W3C published a First Public Working Draft of HTML5+RDFa 1.1 today. This might be confusing to those of you that have been following the RDFa specifications. Keep in mind that HTML5+RDFa 1.1 is different from XHTML+RDFa 1.1, RDFa Core 1.1, and RDFa Lite 1.1 (which are official specs at this point). This is specifically about HTML5 and RDFa 1.1. The HTML5+RDFa 1.1 spec reached Last Call (aka: almost done) status at W3C via the HTML Working Group last year. So, why are we doing this now and what does it mean for the future of RDFa in HTML5?

Here’s the issue: the document was being unnecessarily held up by the HTML5 specification. In the most favorable scenario, HTML5 is expected to become an official standard in 2014. RDFa Core 1.1 became an official standard in June 2012. Per the W3C process, HTML5+RDFa 1.1 would have had to wait until 2014 to become an official W3C specification, even though it would be ready to go a few months from now. W3C policy states that all specs that your spec depends on must reach official spec status before your spec becomes official. Since HTML5+RDFa 1.1 is a language profile for RDFa 1.1 that is layered on top of HTML5, it had no choice but to wait for HTML5 to become official. Boo.

Thankfully the chairs of the HTML WG, RDFa WG, and W3C staff found an alternate path forward for HTML5+RDFa 1.1. Since the specification doesn’t depend on any “at risk” features in HTML5, and since all of the features that RDFa 1.1 uses in HTML5 have been implemented in all of the Web browsers, there is very little chance that those features will be removed in the future. This means that HTML5+RDFa 1.1 could become an official W3C specification before HTML5 reaches that status. So, that’s what we’re going to try to do. Here’s the plan:

  1. Get approval from W3C member companies to re-charter the RDFa WG to take over publishing responsibility of HTML5+RDFa 1.1. [Done]
  2. Publish the HTML5+RDFa 1.1 specification under the newly re-chartered RDFa WG. [Done]
  3. Start the clock on a new patent exclusion period and resolve issues. Wait a minimum of 6 months to go to W3C Candidate Recommendation (feature freeze) status, due to patent policy requirements.
  4. Fast-track to an official W3C specification (test suite is already done, inter-operable implementations are already done).

There are a few minor issues that still need to be ironed out, but the RDFa WG is on the job and those issues will get resolved in the next month or two. If everything goes according to plan, we should be able to publish HTML5+RDFa 1.1 as an official W3C standard in 7-9 months. That’s good for RDFa, good for Web Developers, and good for the Web.

HTML5+RDFa 1.1 published – plan to become official spec in 7 months! http://t.co/oCx8YS7S #w3c #html5 #rdfa

A very moving Haka performed for fallen soldiers in New Zealand (video): http://t.co/wxHhs4Of #visceral #haka #nz #kiwi

If you didn’t see Bill Clinton’s speech at the DNC, it was fantastically precise: http://t.co/lNBy5rSG #dnc #math #greatspeech

@rouninmedia Thanks – glad you discovered RDFa and all the great work (and people) behind it. #w3c #rdfa

RT @rouninmedia: no need to learn Microdata for http://t.co/KJRNfw8o & RDFa for FB OpenGraph. RDFa suffices. Here comes the Semantic Web.

New RDFa WG publishes HTML5+RDFa 1.1, intends to go to REC in 8-9 months: http://t.co/MdnJ2RAu #w3c #rdfa #html5

o_O – Have you /seen/ Michelle Obama’s speech!? Totally blows the doors off of every Obama speech ever given: http://t.co/Vq1LAqZQ

Occupy Wall Street Tech Working Group drops by to chat with W3C Web Payments Working Group: http://t.co/vAbh8Wfu #ows #w3c #payswarm

JSON-LD group discusses NoSQL talk, RDF terminology, syntax intro, and future of .flatten()/.frame(): http://t.co/fonVXXDH #w3c #jsonld

Web Foundation releases global stats on the Web’s growth, utility and impact on people & nations: http://t.co/cpL4435S /via @timberners_lee

RT @ivan_herman: RDFa, microdata, turtle-in-HTML, and RDFLib http://t.co/yS9YRYgF

@venessamiemis Happy birthday! Hope your weekend will be filled with celebrating. :)

@benadida congrats on your new little one (and your AMAZING SAVINGS!) – hope each of you are doing well – all the best.

Tea-partier picks fight with Irish president (2010), does not go well: http://t.co/htzV7EVu /via Nadine Hack

“Let’s build a goddamn Tesla Museum” raises $1M in 8 days via Matt Inman (The Oatmeal) & Indiegogo: http://t.co/Sg6TutHJ #tesla

Foul mouthed grannies let Akin really know how they feel about his “legitimate rape” comments: http://t.co/XUostHah #nomeansno #akin

@agebhard blame the people with the opinions… besides, you should know better than to abet a religious war before getting on a plane. :)

@sideshowbarker … and browser manufacturers have stated very clearly that they’re not interested in an RDFa API.

@sideshowbarker JSON-LD API: http://t.co/IiegGQtN (the issue is: browser manufacturers don’t care yet…)

@sideshowbarker I was kidding… note the “:P *ducks*” in the o.p. <– This is why I’m not involved in governmental politics. /cc @danbri

Go see this! RT @gkellogg: Talking about publishing structured data from wikis today at 2:00pm. #jsonld #mongodb #nosqlnow

+1 RT @gkellogg: I agree that #microdata made #rdfa better. Now that’s done, its time to move on and get with RDFa.

My new hobby: Trolling @danbri on Twitter. :P /cc @agebhard @scorlosquet @gkellogg

@danbri @gkellogg @scorlosquet @agebhard RDFa is better than Microdata, that’s a fact. :P *ducks*

Great post on JSON-LD, MongoDB, & MediaWiki/Wikia: http://t.co/kE5S5JBW /by @gkellogg /via @ivan_herman #mongo #wiki #w3c #jsonld

@danbri @agebhard @scorlosquet Yes, absolutely! What Stephane said. (although, there were better ways of approaching that issue). :)

@danbri My point still stands – no good technical reason to use Microdata.

@danbri That said, we’ve seen very little interest in an in-browser API to extract metadata – that’s why we didn’t pursue that route.

@danbri Is there any large deployment of the Microdata API? RDFa API is going to be RDFa -> JSON-LD, and we’re working on it.

@agebhard I agree. That said, now that RDFa Lite 1.1 exists – there is no good technical reason for Microdata: http://t.co/suLnJ1MQ

RT @bergie: still unconvinced of the necessity for #Microdata in a #RDFa world, despite @linclark ‘s excellent #DrupalCon session

Earthworm-like robot oozes along ground, can survive sledgehammers and stomping from puny humans: http://t.co/KXp7ZuxP #mit #robotics

Autonomous robotic plane flies indoors, through parking garage at 10m/s: http://t.co/CgSEUI2O #mit #uav

@aymericbrisse it was, we took care of it. Shouldn’t happen again (hopefully)

JSON-LD group discusses Drupal 8 support, optional features, property generators, language maps: http://t.co/UNyPT9r1 #w3c #jsonld

Web Payments group discusses payment code example, PaySwarm Alpha 4 release, HTML5 WebApp store: http://t.co/OKkL3312 #w3c #payswarm

PaySwarm Alpha 4 released (support for HTML5 Web App stores, new release process, bug fixes): http://t.co/6ymrEvVR #w3c #payswarm

Brilliant talk by Nick Hanauer on the true job creators: http://t.co/bUILUBCx #ted #middle #class

Summary of all JSON-LD specification updates that have happened in the last month: http://t.co/VmDGaVyd #w3c #jsonld

RT @ptwobrussell: If true, this is unbelievably despicable: This is how Visa works: http://t.co/cXCmRlVk /via @rands

PaySwarm Alpha 4 released – support HTML5 app stores, new build system, bug fixes: http://t.co/OTEQ7Iw0 #w3c #payswarm

“Researching” HTML5 games at work… RAPT is awesome (as long as you have a friend you can play it with): http://t.co/uqcb6Qk1

RT @niklasl: I’m well on the way towards implementing a redesigned RDFa DOM API: http://t.co/9Fon3Ryz Live updates, not triple-centric.

RT @danbri: We’re close to being able to round-trip the http://t.co/KJRNfw8o site through RDFa 1.1 #html5 #google #seo #rdfa

FACT: All dogs in Ukraine are trained in Parkour from an early age: http://t.co/jtdVLEMa /via @bsletten #parkour #dogs #ukraine

Call Me Maybe + Chatroulette + Cross Dressing == http://t.co/T0t9dFRD #party

PaySwarm Alpha 3 released – W3C Web Payments reference implementation nears commercialization: http://t.co/eQTET8LU #payswarm #w3c

W3C RDFa Working Group plans to take HTML5+RDFa to official standard in the next six months: http://t.co/m3CmMF6I #w3c #html5 #rdfa

@cygri while not perfect, I think this is a solid step forward: http://t.co/cN54FmhT #rdf #vocab #docs

RT @thelal: @payswarm = Universal #Payment Standard for the #Web and the New Economy http://t.co/IBjk5VPs #futureofmoney

Web Payments group discusses decentralized HTML5 Web App stores, listing assets for sale: http://t.co/kXqI101m #w3c #html5 #payswarm

W3C JSON-LD group discusses pre-processing JSON, synchronous API, array-position-based properties: http://t.co/diGkWb9S #jsonld #w3c

If you missed the Curiosity touchdown on Mars – here’s a video of what happened: http://t.co/uvurdKnB #drama #ridiculous #awesome

Watch live as Curiosity lands on Mars in 75 minutes – 10:30pm PST, 1:30am EST – live stream here: http://t.co/CpviIApr #msl

Why men can’t have it all: http://t.co/7zfeAu1L /via @pemo #fatherhood #startups

Current corporate office status: Gangnam Style – http://t.co/26KZ0pbG #korea #horse #dancing #techno

RT @doriantaylor: Paywalls are awesome because they are super effective reminders that I have better things to do with my time.

@edithyeung Great chatting with you too – glad to hear about http://t.co/9UCLgdwv fighting for developers and the Web! :)

RT @edithyeung: @manusporny Great chatting with you! :) You guys are doing some exciting @w3c stuff for payment! http://t.co/5sfzbsJA

JSON-LD support for Wikidata / Drupal 8 REST APIs (internationalization support): http://t.co/TNrOzcle /cc @Dries #jsonld #w3c

JavaScript on V8 now firmly kicking PHP, Ruby, Python, and Perl’s keister: http://t.co/PMhtDBww /via @davegeist #programming

Bruce Schneier on the Aurora shootings and ‘security theatre’: http://t.co/cXd3gCS2 /via @davegeist #usa #guns #security

Favorite quote of the day: “By all measures, @scorlosquet is a semantic web bad ass.”: http://t.co/4Bl4y7Pl #rdfa #w3c #schema

Phase2 integrates RDFa, rNews & http://t.co/KJRNfw8o into publishing platform 4 news sites: http://t.co/XGA1Z6zJ #rdfa #rnews #w3c

@openpublish online publishing platform improves RDFa support in Drupal 7: http://t.co/IhxN5nsG #rdfa #drupal

“…a vast porno cluster can be seen between Brazil and Japan…”: http://t.co/D4ArWcjJ #ohinternetyousofunny

A Google maps-like map of the Web: http://t.co/ZyLVK6Ev /via @webr3 #web #science

The dark future of retinal displays and biomods: http://t.co/PkiRXDbo /via +Gregory Esau #film #hmm

“OAuth 2.0… the biggest professional disappointment of my career.” — Eran Hammer, resigns as lead of OAuth: http://t.co/2vos4hT2

@agebhard I’ll see what I can pull together for you… :P

“GNOME3 turned that stupid up to eleven” — on how the Gnome project is dying: http://t.co/O3OxlZWI

@agebhard I could hire some clowns and juggle baby chicks while singing “Poker Face”… if that would help re-infuse some randomness?

pic of plane crash showing pilot/passenger getting stuff out of the plane: http://t.co/kG3sB8Te

whoa – plane just buzzed 100ft over the office, crashed on the other side of building – pilot/passenger OK – caught by the chain link fence.

Femto-photography – imaging at a trillion frames per second: http://t.co/B173sB98 #ted #takethatcanon

Zynga management dumps stock at 4x current stock price just before crash… booo: http://t.co/B5ELXlFO

Google Residential Fiber (holy crap this looks amazing / dammit it’s not offered in Blacksburg, VA): http://t.co/taX5ktVE

RT @cs_conferences: Congrats to AKSW’s Ali Khalili, who won Best Paper at @compsac 2012 for “The RDFa Content Editor”! http://t.co/PO2CKCTH

The bear ladder technique (video): http://t.co/IiQ3Kavs #rescue

A Visual Mind Map of SCRUM: http://t.co/yP2jQdbC #scrum #mindmap

Vint Cerf (father of the Internet) calls bullshit on revisionist Internet history: http://t.co/ECYUaVwm #arpanet #crovitz

Web Payments group on aligning JSON Web Keys & Security vocab, new payswarm.js release, Web Keys http://t.co/EYP3yUCF #payswarm #w3c

JSON-LD group discusses single term to multiple IRIs, a formal grammar, @context within @context, more: http://t.co/flxP4wB9 #jsonld

Want to find out more about today’s PaySwarm release? Listen in 15 minutes: http://t.co/dtMsDBtd #futureofmoney #payswarm

Just pushed an update for http://t.co/S685NAb1 REST APIs that allows one to use payswarm.js: http://t.co/f439h1tD #payswarm #w3c

@yoichiro @andraz @mterenzio @orangeaurochs Also, check out the live RDFa editor/visualizer/viewer: http://t.co/yqJY4mM3 #rdfa

@yoichiro @andraz @mterenzio The RDFa Lite specification is under 5 pages, simple: http://t.co/WmSVpDAI #rdfa #html5

@bsletten Best way to eat Marmite: toast white bread, spread butter, put 1 tsp of marmite on top – delicious. #yesreally

Ever seen a master mason lay 12-inch block? It’s very zen-like: http://t.co/Pfgvdrmm /via @sivers #mastery #masonry

The onion gives the most biting coverage of the Colorado shootings: http://t.co/I6r575RG