5 RDFa Features Inspired by Microdata and Microformats

Full disclosure: I am the current Chair of the group at the World Wide Web Consortium that created RDFa. That said, this is my personal blog – I am not speaking on behalf of the W3C, RDFa Working Group, RDF Web Apps Working Group or my company, Digital Bazaar.

I’ve seen a few comments by Web authors and developers like this over the past several years:

“as a web developer, I have to say … w3c was neglecting web developers with rdfa for last X years.” – Andraz Tori

Every time that statement is made in my presence, I attempt to calmly explain that this is not true. Sometimes it’s a bad experience that the person has had with a standards body, but most of the time the commenter doesn’t understand how the Internet and the Web are built. Here’s the explanation I typically give:

The RDFa Working Group cares very deeply about what Web developers have to say. All RDFa Working Group meetings are publicly recorded and available, anyone can join the public mailing list and contribute, and we have a public issue tracker. There is nothing to stop anyone from participating and contributing. If people demonstrate deep knowledge of structured data and contribute frequently, they’re usually asked to join the Working Group as Invited Experts. We are required to address all public input – you cannot get a Web/Internet spec until you do that. If you don’t prove that you have addressed all public input, you don’t get an official spec – it’s as simple as that.

We take public input very seriously because we want to create a standard that works for the greatest number of people while keeping the complexity of the specification at a manageable level. That is, when forced to decide between the two, we put Web publishers and developers first – and parser implementers second.

You don’t need to know much about the history of Microformats, RDFa and Microdata to understand this post. Microformats and RDFa came about at roughly the same time, around 2004. RDFa has had a number of its features inspired by Microformats. Microdata started off as direct modifications to RDFa that removed some of the features that the RDFa folks felt were necessary. RDFa has also pulled in a number of newer features from Microdata. The rest of the article describes what these features are, where they came from, and why we included them.

1. Profiles and Terms

I’ve spent a good deal of time in the Microformats community. When I was asked to join the RDFa Working Group, a great deal of that thinking came along with me. Luckily, many of the others in the RDFa Working Group shared much of this thinking about Microformats. One of the most striking features of Microformats is the simplicity of the markup and the vocabularies. These Web vocabularies are typically expressed using Profiles. Here is the list of Microformats profiles.

We wanted to provide the same sort of simple markup in RDFa 1.1, so we introduced the concept of Terms and RDFa Profiles. This feature allows you to use Microformats-like markup in RDFa 1.1:

<body profile="http://microformats.org/profile/hcard">
...
<div typeof="vcard">
    <span property="fn">Tantek Çelik</span> is known on Twitter as <span property="nickname">t</span>.
</div>
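
Conceptually, a profile hands the processor a table that maps short terms to full IRIs. Here is a minimal sketch of that lookup. The `hcardTerms` table, the IRIs in it, and the `resolveTerm` helper are all assumptions made up for illustration; they are not taken from the actual profile document or from any RDFa library.

```javascript
// Hypothetical term table, standing in for what a processor might load
// from a profile document. The IRIs are assumptions for illustration only.
const hcardTerms = {
  vcard: "http://www.w3.org/2006/vcard/ns#VCard",
  fn: "http://www.w3.org/2006/vcard/ns#fn",
  nickname: "http://www.w3.org/2006/vcard/ns#nickname",
};

// Resolve a bare term against the table. Unknown terms yield null so the
// caller can decide whether to ignore them.
function resolveTerm(term, terms) {
  return Object.prototype.hasOwnProperty.call(terms, term) ? terms[term] : null;
}

console.log(resolveTerm("fn", hcardTerms));
console.log(resolveTerm("unknown", hcardTerms));
```

The point of the sketch is only that the author writes `fn` while the processor still ends up with a full identifier.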

2. Absolute IRIs

RDFa 1.0 allowed people to compact IRIs so that fewer mistakes are made when typing in a whole bunch of property names. The Microdata folks felt that compact IRIs are problematic because prefixes can be re-bound to different values. People carelessly copying and pasting source code could accidentally mess up what the chunk of HTML is supposed to mean if they forget to declare the prefixes, or declare them differently. IRI support in all RDFa 1.1 attributes was added to address this concern. If Web developers are generating code that they expect people to cut-and-paste, and they don’t like CURIEs, they can use absolute IRIs instead. This means that the following markup:

<div prefix="dc: http://purl.org/dc/terms/">
   <h2 property="dc:title">My Blog Post</h2> by 
   <h3 property="dc:creator">Alice</h3>
   ...
</div>

can be written like this instead:

<div>
   <h2 property="http://purl.org/dc/terms/title">My Blog Post</h2> by 
   <h3 property="http://purl.org/dc/terms/creator">Alice</h3>
   ...
</div>

The markup above doesn’t need to have prefixes declared, nor is it susceptible to some types of careless cut-and-pasting.
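
To make the difference concrete, here is a rough sketch of how a compact IRI might be expanded. It is a simplification and not the full RDFa 1.1 processing rules; the `expandCurie` helper and the `examplePrefixes` map are hypothetical names invented for this example.

```javascript
// Expand a compact IRI such as "dc:title" using a prefix map.
// Simplified sketch: values that already look like absolute IRIs, or whose
// prefix is not declared, are returned unchanged.
function expandCurie(value, prefixes) {
  const colon = value.indexOf(":");
  if (colon === -1) return value;
  const prefix = value.slice(0, colon);
  const reference = value.slice(colon + 1);
  // "http://..." splits into "http" and "//...": treat it as an absolute IRI.
  if (reference.startsWith("//")) return value;
  if (!Object.prototype.hasOwnProperty.call(prefixes, prefix)) return value;
  return prefixes[prefix] + reference;
}

const examplePrefixes = { dc: "http://purl.org/dc/terms/" };
console.log(expandCurie("dc:title", examplePrefixes));
console.log(expandCurie("http://purl.org/dc/terms/title", examplePrefixes));
```

Both calls produce the same IRI, but only the first depends on the prefix map. If a copied snippet lands in a page where `dc` is bound to something else, the first result changes and the second does not.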

3. The @vocab Attribute

Microdata and Microformats are clever in the way that you don’t need to use CURIEs or URIs to express properties. Unfortunately, for the RDFa folks, those properties are not very good semantic web identifiers because they are not dereferenceable. That is, a human could not stick a shortened vocabulary term from Microdata into a Web browser and find out what that term is all about. A machine could not follow the Microdata vocabulary term URL and hope to find anything useful at the end of the URL. The ability to follow any URL and find out more about it is often referred to as “follow-your-nose”, and is an important part of the design of RDFa.

The RDFa 1.1 work focused on pulling this feature over from Microdata’s itemtype attribute, but also ensuring that it would work for follow-your-nose. The following markup demonstrates how an RDFa 1.1 processor can use Microdata-like markup when using a single vocabulary, but still support follow-your-nose:

<div vocab="http://schema.org/">
   <ul>
      <li typeof="Person">
        <a rel="url" href="http://example.com/bob/">Bob</a>
      </li>
      <li typeof="Person">
        <a rel="url" href="http://example.com/eve/">Eve</a>
      </li>
      <li typeof="Person">
        <a rel="url" href="http://example.com/manu/">Manu</a>
      </li>
   </ul>
</div>

If we take the http://schema.org/Person term, we can plug that into a Web browser and find out more about the vocabulary term. Unfortunately, schema.org doesn’t provide a machine-readable version of their vocabulary. For an example of a human-and-machine readable vocabulary, please see http://purl.org/media/audio.
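
The way a bare term such as `Person` turns into a followable IRI can be sketched as simple concatenation with the vocabulary in scope. This is a simplified illustration and not the complete RDFa 1.1 term-resolution algorithm; `resolveWithVocab` is a hypothetical helper name.

```javascript
// Resolve a bare term against the vocabulary IRI in scope by concatenation.
// Simplified sketch: anything containing ":" is assumed to be a CURIE or an
// IRI and is left alone; with no vocabulary in scope the term is dropped.
function resolveWithVocab(term, vocab) {
  if (term.includes(":")) return term;
  return vocab ? vocab + term : null;
}

console.log(resolveWithVocab("Person", "http://schema.org/"));
console.log(resolveWithVocab("url", "http://schema.org/"));
```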

4. Web Apps API

Web developers typically don’t want to be bothered with the document markup when they are programming. The Microdata specification provides a DOM API in order to read items into JavaScript objects so that structured data in the page can be processed by Web Applications. This was clearly one of the key differentiators of Microdata in the beginning, and seemed to be a feature that many Web developers were excited about. Of particular note was that Microformats historically have not had a clear generic parsing model or an API, which may have held back their adoption in Web Applications. These two shortcomings are being actively discussed in the microformats-2 work.

The RDFa Working Group paid close attention to these developments, learned from them, and finally concluded that an RDFa DOM API was necessary in order to make the use of RDFa for Web Developers easier. For example, to find out all of the subjects on the page that contain a name, one need only do something like this:

var thingsWithNames = document.data.getSubjects("foaf:name");

To get all of the names associated with a particular thing, a Web developer could do this:

var thingNames = document.data.getValues(thing, "foaf:name");
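
For readers who want to experiment with the shape of these two calls outside a browser, here is a small stand-in. It is not the RDFa API itself: `createDataStore` is a hypothetical helper, it queries a hand-written list of triples instead of parsing a page, and the sample data is made up.

```javascript
// Hypothetical stand-in for document.data. It queries a hand-written list of
// triples rather than extracting RDFa from a document.
function createDataStore(triples) {
  return {
    // All distinct subjects that have at least one value for the property.
    getSubjects(property) {
      const matches = triples.filter((t) => t.property === property);
      return [...new Set(matches.map((t) => t.subject))];
    },
    // All values of the property on one subject.
    getValues(subject, property) {
      return triples
        .filter((t) => t.subject === subject && t.property === property)
        .map((t) => t.value);
    },
  };
}

// Made-up sample data for illustration.
const data = createDataStore([
  { subject: "#albert", property: "foaf:name", value: "Albert Einstein" },
  { subject: "#albert", property: "foaf:name", value: "Albert" },
  { subject: "#page", property: "dc:title", value: "Physicists" },
]);

for (const thing of data.getSubjects("foaf:name")) {
  console.log(thing, data.getValues(thing, "foaf:name"));
}
```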

5. Projections/JSON-mapping

Everyone loves JSON. It is a simple data format that is incredibly expressive, compact and maps easily to JavaScript, Python, Ruby and many other scripting languages. Microdata has a native mapping from markup on the page to a JavaScript object and JSON serialization. The RDFa Working Group saw this as a powerful feature, but also thought that Web Developers should have the ability to map objects to whatever layout made the most sense to them. The concept of a Projection was proposed and now closely mirrors all of the benefits provided by the Microdata-to-JSON mapping, along with giving developers the added benefit of freely “projecting” objects from structured data in a Web page.

For example, developers could get all people on the page like so:

var people = document.data.getProjections("rdf:type", "foaf:Person");

or they could build specific objects, and access the object’s members like so:

var albert = document.data.getProjection("#albert", {"name": "foaf:name"});
var name = albert.name;

This feature is detailed in the RDFa API right now, but may become more generalized and apply to any structured data language like Microformats or Microdata.
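
The idea behind a projection can be sketched in a few lines: the developer supplies a template naming the members they want, and gets back a plain object. This is only an illustration of the concept. The `project` function and the sample triples are assumptions for this example, not the RDFa API implementation, and the sketch keeps just the first matching value for each member.

```javascript
// Build a plain object for one subject. The template maps the member names
// the developer wants to the properties those members are read from.
function project(triples, subject, template) {
  const result = {};
  for (const [member, property] of Object.entries(template)) {
    const match = triples.find(
      (t) => t.subject === subject && t.property === property
    );
    // Only the first matching value is kept; members without a match are
    // simply absent from the result.
    if (match) result[member] = match.value;
  }
  return result;
}

// Made-up sample data for illustration.
const sampleTriples = [
  { subject: "#albert", property: "foaf:name", value: "Albert Einstein" },
  { subject: "#albert", property: "rdf:type", value: "foaf:Person" },
];

const albertSketch = project(sampleTriples, "#albert", { name: "foaf:name" });
console.log(albertSketch.name);
console.log(JSON.stringify(albertSketch));
```

Because the result is an ordinary object, it serializes to JSON directly, which is the part of the Microdata mapping we wanted to keep.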

Closing Thoughts

The RDFa Working Group cares very deeply about what Web developers have to say. All three syntaxes for structured data on the Web today have cross-pollinated with one another – that’s a good thing. We feel that with RDFa 1.1, we took some of the best features of Microdata and Microformats and made them better. We provide functionality in a way that allows Web Developers to use as few or as many of these features as they so desire. We continue to listen and improve RDFa 1.1 in order to make it an effective tool for Web authors, publishers and developers. After all, one of the goals of the RDFa Working Group is to discover and standardize what the Web community wants – to make authoring and using RDFa content easier.

Thanks to DL, MB, DB, and DIL for reviewing the post and providing feedback and change suggestions.
