During the recent schema.org kerfuffle, Tantek Çelik and I found ourselves agreeing with each other on the fundamentals of how a Web vocabulary should be developed. We hoped that, like any technology standard meant for the world to use, it would be developed transparently and scientifically. Tantek asked me to review the new Microformats 2 work and I thought it would be interesting to see what they’ve been up to recently.
I’ve been a contributing member of the Microformats community for some time, having participated in the design work for the hAudio, hVideo, hMedia, hProduct, hRecipe, currency, collection and measurement Microformats, among others. I’ve documented the process, commented on inconsistencies in the community, been critical of the confusing spec-creation steps, raised governance and technical issues, pushed the community to more clearly address patent and copyright concerns as well as admit that the lack of a unified parsing model is holding Microformats back. I have been harsh about how the community was run, but continued to participate because there were a number of redeeming qualities in the Microformats movement.
All of the frustration with the various inconsistencies, the administrators, and the lack of progress led me to take a hiatus from the community. I think many others in the community felt this frustration around the same time: discussion dropped from an average of 125 messages per month to an average of 10 per month, and it has stayed there to this day. When I took a leave from the Microformats work, I joined the RDFa Working Group at the W3C, the group that created RDFa, which I now chair. In 2007, my company was working on expressing music on the Web as structured data and RDFa seemed like a much better way to do it, so we shifted our focus to RDFa and distributed vocabulary development. Fast forward to today and both PaySwarm and MusicBrainz publish all of their data as RDFa. However, with the recent launch of schema.org, an interesting question was pushed into the public view once again: What is the best way to develop a Web vocabulary for structured data in HTML if millions of people are going to depend on it?
The Microformats 2 work attempts to address a number of concerns that have been raised in the community over the past several years. Most of these issues were logged during a period of peak activity in the community, between 2007 and 2009, during the development of the hAudio, hVideo, hMedia, hProduct, collection and measurement Microformats. Here’s a quick breakdown of my initial thoughts on the Microformats 2 work:
There are a number of really great things proposed for Microformats 2 that could breathe new life into the community.
- Unified parsing model – Microformats 2 has one, and it is one of the best changes in the new direction.
- Flat set of properties – All Microformats are treated as objects with a flat set of properties. This maps to JSON nicely and is another move in the right direction.
- Hungarian prefixing – All Microformats 2.0 markup will now have an h-* prefix for the Microformat, a p-* prefix for string properties, a u-* prefix for URLs, and a d-* prefix for datetimes.
- Vendor extensions – I hope this catches on – it allows a path toward experimentation which we desperately needed for the PaySwarm work. The Microformats community has a saying, “Pave the cowpaths”. This philosophy effectively boils down to ensuring that standards are rooted in existing practice. However, you can’t pave cowpaths that aren’t there yet. Typically, innovation requires the first cow to start making the cowpath. It would be nice to have an open community that you can innovate within – this could provide that mechanism. Moo.
- Separation of Syntax from Vocabularies – Tantek mentioned that the Microformats 2 work would separate vocabularies from syntax. I couldn’t find that statement on the page, but I think it would be great to do that. I’ve always believed that the real contribution of the Microformats community to the Web was in the development of well-researched Web vocabularies. We now have syntaxes that are capable of expressing Microformats: RDFa and Microdata. Why do we need yet another syntax? The part of this new Microformats 2 reboot I’m most interested in participating in is the vocabulary part. Specifically, I’d like to port all of the Microformats Vocabularies over to RDFa 1.1 Profiles. The markup would be almost exactly the same as what is proposed on the Microformats 2 wiki page (example below).
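To make the flat-properties and prefixing points in the list above concrete, here is a minimal Python sketch. It is not the Microformats 2 parsing algorithm, just a toy built on the standard library's `html.parser`. The names `classify`, `FlatCardParser` and `parse_card` are made up for this illustration, the prefix table simply copies the four prefixes listed above, and only p-* string properties are collected.

```python
from html.parser import HTMLParser
import json

# Prefix -> value kind, copied from the list above (an assumption of this
# sketch, not a normative table).
PREFIX_KINDS = {"h": "root", "p": "string", "u": "url", "d": "datetime"}

# Elements that never get an end tag, so they must not go on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def classify(class_name):
    """Split a class name such as 'p-given-name' into (kind, name), or None."""
    prefix, sep, name = class_name.partition("-")
    if sep and name and prefix in PREFIX_KINDS:
        return PREFIX_KINDS[prefix], name
    return None


class FlatCardParser(HTMLParser):
    """Toy parser: collects the text of every p-* element into one flat dict."""

    def __init__(self):
        super().__init__()
        self.root = None       # root class name, e.g. "h-card"
        self.properties = {}   # flat map: property name -> raw text
        self._open = []        # per open element: property names it declares

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        names = []
        for class_name in (dict(attrs).get("class") or "").split():
            found = classify(class_name)
            if found is None:
                continue
            kind, name = found
            if kind == "root":
                self.root = class_name
            elif kind == "string":
                names.append(name)
                self.properties.setdefault(name, "")
        self._open.append(names)

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._open:
            self._open.pop()

    def handle_data(self, data):
        # Text counts toward every property element that is currently open.
        for names in self._open:
            for name in names:
                self.properties[name] += data


def parse_card(markup):
    """Return {'type': ..., 'properties': {...}} with whitespace collapsed."""
    parser = FlatCardParser()
    parser.feed(markup)
    parser.close()
    props = {k: " ".join(v.split()) for k, v in parser.properties.items()}
    return {"type": parser.root, "properties": props}


MARKUP = (
    '<h1 class="h-card"> <span class="p-fn"> '
    '<span class="p-given-name">Chris</span> '
    '<abbr class="p-additional-name">R.</abbr> '
    '<span class="p-family-name">Messina</span> </span> </h1>'
)

if __name__ == "__main__":
    print(json.dumps(parse_card(MARKUP), indent=2))
```

Even though `p-given-name` is nested inside `p-fn` in the markup, both end up as sibling keys in a single JSON object, which is what makes the flat model pleasant to consume. A real parser would also need to handle u-* and d-* values, nested h-* objects and malformed markup, none of which this sketch attempts.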
Some of the changes to Microformats aren’t really necessary, nor do I think that they will result in stronger uptake of Microformats.
- Root Class Name Only – Microformats aren’t that difficult to publish. Simplifying them down to one tag will probably not result in much uptake or data that is interesting or helpful.
- “hcard” instead of “vcard” – Yes, it was a point of confusion. I don’t think it really prevented people from implementing Microformats.
Some of the most important things that the Microformats community needs to change are not addressed. I’d like to see them addressed before assuming that new work done in that community will have a lasting impact:
- The Administrators – One of the strongest criticisms by the community has always been the status of the self-appointed leaders. They do a good job most of the time, but having a mechanism where the community elects the leaders and administrators would get us closer to a meritocracy. Not allowing the community to govern itself shows that you don’t trust the membership of the community. If you don’t trust us, how can we trust you? If there is a “you” and a “them”, then it becomes easy to have a “you versus them” situation. The Microformats community could learn a great deal from the Debian community in this respect.
- The Process – I had previously complained that it was not very clear what you need to meet each hurdle in the Microformats process. This seems to have been clarified with the new Microformats 2 work. I’m still concerned that too much is left in the hands of the “leaders”. There was a great deal of what I felt was “moving the goalposts” when developing hAudio. The process kept changing. If the process keeps changing, it can mean that all of your hard work may not end up making it to the “official” Microformats standard stage. So, I am suspicious of the process if the community has no power over who gets to change the process and when.
- Open Innovation – How does one innovate in the Microformats community? That is, how do we have an open discussion about the Commerce, Signature and PaySwarm Web vocabularies in the Microformats community? We’re trying to solve a real-world problem – Universal Payment on the Web. We need to have an open discussion about the Web vocabularies used to accomplish this goal. How can we have this discussion in the Microformats community?
- Collaboration – How can the RDFa community, Microdata folks and the Microformats community work together? I’d really like all of us to work together. I’ve been trying to make this happen for several years now, and each attempt has met with varying levels of failure. Our continued track record of not reaching out and working with one another on a regular basis is damaging structured data adoption on the Web – and each community feels as if it is blameless for the current state of affairs. “If only they’d listen to us, we wouldn’t be in this mess!” Schema.org is just one signal that all of us need to come together and work on a unified way forward.
So, how do we collaborate on this? We have added Microformats-like features to RDFa over the past few years because we wanted RDFa 1.1 markup to be just as easy as Microformats markup. This example is used on the Microformats 2 page:
<h1 class="h-card">
  <span class="p-fn">
    <span class="p-given-name">Chris</span>
    <abbr class="p-additional-name">R.</abbr>
    <span class="p-family-name">Messina</span>
  </span>
</h1>
The markup above can be easily expressed in RDFa 1.1, using RDFa Profiles like so:
<h1 typeof="hcard">
  <span property="fn">
    <span property="given-name">Chris</span>
    <abbr property="additional-name">R.</abbr>
    <span property="family-name">Messina</span>
  </span>
</h1>
This is useful to the Microformats 2 work because every RDFa 1.1 compliant parser could easily become a compliant Microformats 2 parser. Food for thought.
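The correspondence between the two snippets above is mechanical enough to sketch as a string rewrite. The Python below is only an illustration under stated assumptions: `to_rdfa` is a name invented here, it relies on a regular expression rather than a real HTML parser, and it covers only the h-* and p-* prefixes because those are the only ones the example shows. It also assumes that an RDFa Profile defining terms such as `hcard` and `fn` exists, which is the part that still has to be written.

```python
import re

# Matches a double-quoted class attribute. A real tool would use an HTML
# parser; this shortcut is an assumption of the sketch.
CLASS_ATTR = re.compile(r'class="([^"]*)"')


def to_rdfa(markup):
    """Rewrite h-* and p-* class names as RDFa typeof/property attributes."""

    def replace(match):
        types, properties, other = [], [], []
        for token in match.group(1).split():
            if token.startswith("h-") and len(token) > 2:
                # Follows the example above: h-card becomes typeof="hcard".
                types.append("h" + token[2:])
            elif token.startswith("p-") and len(token) > 2:
                properties.append(token[2:])
            else:
                # u-*, d-* and ordinary classes are left alone, since the
                # example does not say how they should map.
                other.append(token)
        parts = []
        if types:
            parts.append('typeof="%s"' % " ".join(types))
        if properties:
            parts.append('property="%s"' % " ".join(properties))
        if other:
            parts.append('class="%s"' % " ".join(other))
        # An empty class attribute is returned unchanged.
        return " ".join(parts) if parts else match.group(0)

    return CLASS_ATTR.sub(replace, markup)


MF2 = (
    '<h1 class="h-card"> <span class="p-fn"> '
    '<span class="p-given-name">Chris</span> '
    '<abbr class="p-additional-name">R.</abbr> '
    '<span class="p-family-name">Messina</span> </span> </h1>'
)

if __name__ == "__main__":
    print(to_rdfa(MF2))
```

Going in the other direction, an RDFa 1.1 parser that treated `class="h-card"` as `typeof="hcard"` and `class="p-fn"` as `property="fn"` would see the same structure in both snippets, which is the sense in which the two syntaxes are close.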
Let’s try to work together on this. As a first step, I think that the RDFa community could easily generate RDFa Profiles for Microformats. This would give people the ability to use Microformats either in the Microformats 2 syntax, or in RDFa 1.1 syntax. That would drive further adoption of the Microformats vocabularies – which would be great for both communities. How can we make this happen?
Thanks to DL and DIL for reviewing this post.