The Downward Spiral of Microdata

Full disclosure: I’m the chair of the RDFa Working Group and have been heavily involved during the RDFa and Microdata standardization initiatives. I am biased, but also understand all of the nuanced decisions that were made during the creation of both specifications.

Support for the Microdata API has just been removed from WebKit (Apple Safari). It was also removed from Blink (Google Chrome) a few months ago. This means that neither Apple Safari nor Google Chrome will support the Microdata API going forward. The removal of the feature from these browsers also shows us a likely future for Microdata: less and less support.

In addition, this discussion on the Blink developer list demonstrates that there isn’t anyone to pick up the work of maintaining the Microdata implementation. Microdata has also been ripped out of the main HTML5 specification at the W3C, with the caveat that the Microdata specification will only continue “if editorial resources can be found”. Translation: if an editor doesn’t step up to edit the Microdata specification, Microdata is dead at W3C. It just takes someone to raise their hand to volunteer, so why is it that out of a group of hundreds of people, no one has stepped up to maintain, create a test suite for, and push the Microdata specification forward?

A number of observers have been surprised by these events, but for those who have been involved in the month-to-month conversation around Microdata, it makes complete sense. Microdata doesn’t have an active community supporting it. It never really did. For a Web specification to be successful, it needs an active community around it that is willing to do the hard work of building and maintaining the technology. RDFa has that in spades; Microdata does not.

Microdata was, primarily, a shot across the bow at RDFa. The warning worked because the RDFa community reacted by creating RDFa Lite, which matches Microdata feature-for-feature, while also supporting things that Microdata is incapable of doing. The existence of RDFa Lite left the HTML Working Group in an awkward position. Publishing two specifications that did the exact same thing in almost the exact same way is a position that no standards organization wants to be in. At that point, it became a race to see which community could create the developer tools and support web developers that were marking up pages.
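To make the feature-for-feature claim a little more concrete, here is a minimal sketch of how one element's Microdata attributes line up with RDFa Lite attributes. The function name `microdata_to_rdfa_lite` is hypothetical and is not part of either specification. The sketch assumes every item carries an `itemtype` whose vocabulary URL ends in a slash (as schema.org's does), and it does not handle `itemref`.

```python
def microdata_to_rdfa_lite(attrs: dict) -> dict:
    """Translate one element's Microdata attributes into RDFa Lite attributes.

    Hypothetical helper for illustration only. Assumes slash-terminated
    vocabularies such as http://schema.org/ and ignores itemref.
    """
    out = {}
    for name, value in attrs.items():
        if name == "itemscope":
            # RDFa Lite has no separate scope marker; typeof starts the item.
            continue
        elif name == "itemtype":
            # "http://schema.org/Person" -> vocab "http://schema.org/", typeof "Person"
            vocab, _, type_name = value.rpartition("/")
            out["vocab"] = vocab + "/"
            out["typeof"] = type_name
        elif name == "itemprop":
            out["property"] = value
        elif name == "itemid":
            out["resource"] = value
        else:
            # Non-Microdata attributes (class, href, ...) pass through untouched.
            out[name] = value
    return out


if __name__ == "__main__":
    person = {"itemscope": "", "itemtype": "http://schema.org/Person"}
    print(microdata_to_rdfa_lite(person))
    print(microdata_to_rdfa_lite({"itemprop": "name"}))
```

In other words, under these assumptions `<div itemscope itemtype="http://schema.org/Person">` becomes `<div vocab="http://schema.org/" typeof="Person">`, and `itemprop="name"` becomes `property="name"`.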

Microdata, to this day, still doesn’t have a specification editor, an active community, a solid test suite, or any of the other things that are necessary to become a world class technology. To be clear, I’m not saying Microdata is dying (4 million out of 329 million domains use it), just that not having these basic things in place will be very problematic for the future of Microdata.

To put that in perspective, HTML5+RDFa 1.1 will become an official W3C Recommendation (world standard) next Thursday. There was overwhelming support from the W3C member companies to publish it as a world standard. There have been multiple specification editors for RDFa throughout the years, there are hundreds of active people in the community integrating RDFa into pages across the Web, there are 7 implementations of RDFa in a variety of programming languages, there is a mailing list, website and an IRC channel dedicated to answering questions for people learning RDFa, and there is a test suite with 800 tests covering RDFa in 6 markup languages (HTML4, HTML5, XML, SVG, XHTML1 and XHTML5). If you want to build a solution on a solid technology, with a solid community and solid implementations, RDFa is that solution.

22 Comments


  1. H.E.A.T. says:

    “Microdata was, primarily, a shot across the bow at RDFa. The warning worked because the RDFa community reacted by creating RDFa Lite…”

    This was the source of my frustration and lack of trust in RDFa. At least, you are finally admitting to this fact; a fact that I have expressed in comments on earlier posts on the subject.

    For me, it is not that Microdata is an inferior syntax (although I feel it is). RDFa is genetically matched to RDF and RDF is supposed to be the foundation for the semantic web. My distrust came from the W3C not being decisive and standing by its own tech. My distrust also came from the W3C allowing usurpers to continue amongst its ranks.

    Maybe now, the ambivalence of an authoritative metadata syntax will allow the W3C to focus their efforts on making RDFa usable for the common people. Maybe RDFa will be developed to make it sensible to the end user and not just big corporations (or big data).

    For me, it is not about bias in favor of one tech over another. I just like to know that the creators of a particular tech of interest will stand behind that tech regardless of the size of the opposition. The question still remains, what happens if someone decides to take on development of Microdata? Will the W3C go back to the fractured support for a particular syntax?

    • ManuSporny says: (Author)

      “My distrust came from the W3C not being decisive and standing by its own tech. My distrust also came from the W3C allowing usurpers to continue amongst its ranks.”

      This is not how the W3C works. The W3C is just a consortium, which means that it does what its membership and the public want it to do. Its membership consists of 392 member companies (Microsoft, Google, Facebook, Twitter, etc.), and the rest is made up of hundreds of regular Web developers and Internet advocates. The HTML WG itself is composed of hundreds of people with varying opinions on just about everything that is discussed. The W3C works on consensus, which means that everything is a compromise between hundreds of interested parties. This is what gave us the Web we have today.

      So, if you say you distrust the W3C, you are saying that you distrust those 392 member companies and the people in the public that take part in building this technology on a month-to-month basis.

      “I just like to know that the creators of a particular tech of interest will stand behind that tech regardless of the size of the opposition.”

      I hope that the RDFa community has demonstrated that at this point. Throughout RDFa’s history, we’ve had to go against the will of the lead HTML5 editor, the WHAT WG, the HTML WG, and all of the major search companies to bring RDFa to where it is today. That’s standing behind our tech regardless of the size of the opposition. We know it’s the best that’s out there, and that’s why we keep improving it and evangelizing it (in spite of others wanting us to stop).

      “The question still remains, what happens if someone decides to take on development of Microdata? Will the W3C go back to the fractured support for a particular syntax?”

      If someone picks up Microdata, yes, the W3C will go back to supporting both if that’s where the consensus is. There is a very low barrier to compete in Web standards, and that’s a very good thing. That’s what gave us the Web that we have today.

      • H.E.A.T. says:

        The W3C is an organization with a leadership base. Once all input has been taken from members and all discussions finalized, the leadership makes a decision. Now, all must execute that mission with maximum effort toward success.

        The W3C had made a decision to move the web forward with XHTML 1.1 and RDFa. This fact was included on their own site. The WHATWG continued to disagree and, instead of moving forward with their consortium’s decision, decided to bully the consortium into shelving XHTML2 and bastardizing RDFa into RDFa Lite. This is a known fact.

        Yes, a consortium will and should accept feedback from its members, but once the final decision is made, then all need to move forward like professionals.

        The work done to extract RDFa Lite from the RDFa specification was a complete waste of effort. This effort would have been better spent on making the RDFa specification more readable (accessible, if you will). This is the cause of distrust in me. First, the WHATWG caused XHTML2 (the most professional tech) to be shelved for the sloppy and malformed HTML5. Second, schema.org caused RDFa to be broken down into two still unreadable specifications.

        Step back for a second and look at this situation from the common developer’s perspective. How does one decide which syntax to use to mark up pages? Even if RDFa is the better tech, the major search engines are using the inferior tech and I must use that one. Will this move the web in a positive direction?

        This is the trust folks want in the W3C: to stand up for the better tech regardless of negative pressure; even from within its own ranks. I see the W3C as a leader in moving the web forward. I feel this way because they posture themselves up in this manner. And now you say, in agreement, that if someone takes up Microdata, then it’s back to business as usual.

        And I am supposed to trust anything else coming out of the consortium?

        Is JSON-LD a high-quality tech? Or is it just a trend until Google decides to continue use of Microdata, or Microdata-LD (to be created)? A tech that is approved by the W3C as a Recommendation must be able to stand up against scrutiny. If not, then it is not a “standard” or complete specification. Yes, I know the W3C does not push out standards in the technical sense of the word, but they allow their specs to be touted as such.

        Read the other comments. No one is devaluing RDFa or JSON-LD. What is being said is that browser implementations are confusing the masses as to which syntax to employ. There is no need to defend every point being made; that shows pettiness and a lack of professionalism. Your commenters are just as passionate as you about using W3C’s tech, but do not want to use it if all it will become is bloat in the pages.

      • mattur says:

        “Throughout RDFa’s history, we’ve had to go against the will of the lead HTML5 editor, the WHAT WG, the HTML WG, and all of the major search companies to bring RDFa to where it is today.”

        Come now.

        There were specific problems with RDFa. These problems were presented to the RDF priesthood many years ago by Hixie, hsivonen et al. The RDF priesthood “blew them off” for theological reasons, making Microdata “inevitable”. You know this.

        http://krijnhoetmer.nl/irc-logs/whatwg/20110817#l-947

        When Microdata was adopted by schema.org, it forced the RDF priesthood to finally recognise and fix these problems (with RDFa Lite). All this could have been avoided. You know the history Manu, so no excuse for misrepresenting how we ended up here.

        • ManuSporny says: (Author)

          Here’s a summary of the feedback we keep receiving from the WHAT WG for any technology based on RDF:

          Ian Hickson: “The main problem with [RDFa/JSON-LD] is that it’s based on RDF, which is a trivial solution to a non-problem.” [ref]

          There is very little common ground when that is the basis for comments made against a spec that is based on RDF.

          I don’t know who the RDF priesthood is, could you name the members, please? I have a feeling that when you name them, very few of them were involved in the creation of RDFa. Especially now, most of the people involved in RDFa are people that were not originally from the RDF community.

          I never said there weren’t specific problems with RDFa. We had discussed something along the lines of the RDFa Lite spec for a while, but the schema.org decision made it an easier sell. However, even after the launch of RDFa Lite (and its feature-for-feature match against Microdata), the people pushing Microdata didn’t back down (also, for theological reasons).

          Perhaps you could be more specific about what I’m misrepresenting here? That would lead to a more constructive discussion.

  2. “If you want to build a solution on a solid technology, with a solid community and solid implementations; RDFa is that solution.”

    How does this coincide with the fact that schema.org itself is promoting the use of Microdata?
    Because if the major search engines promote the use of Microdata surely its popularity will only be rising. Implementation will grow and the need for advancements will arise. Won’t it simply be a matter of time before somebody picks up Microdata?

    And what do you feel is or should be the role of the major search engines in maintaining the Microdata implementation? Surely by now they can’t afford to let it slowly wither and disappear anymore.

    • ManuSporny says: (Author)

      “How does this coincide with the fact that schema.org itself is promoting the use of Microdata?”

      It’s difficult to say what schema.org will do in light of these recent changes. I’m sure they’ll continue to support Microdata, but to what extent is unclear. Remember, there was a time when they only supported Microformats and RDFa. When schema.org launched, they quickly dropped support for RDFa and Microformats and moved to Microdata. Now that Microdata support is waning, the pendulum could swing back the other way. Only time will tell; it’s always been hard to predict what the search vendors will do.

      For example, I’m the lead editor of JSON-LD. I had no idea that Google was going to put it into schema.org and 425 million Gmail accounts until the news hit the tech sites. I have very little insight into what they plan to do in the future because they tend to be very secretive about their future plans.

      “Because if the major search engines promote the use of Microdata surely its popularity will only be rising. Implementation will grow and the need for advancements will arise. Won’t it simply be a matter of time before somebody picks up Microdata?”

      Replace “Microdata” with “Microformats” and “RDFa 1.0” and you’ll start to see a trend. Don’t forget that those two and Rich Snippets came before Microdata and both were unceremoniously tossed to the side once Microdata came along. They may decide to throw everything out and just switch over to JSON-LD since they’re now having trouble with people marking up things with Microdata.

      “And what do you feel is or should be the role of the major search engines in maintaining the Microdata implementation?”

      Well, to date, their role in maintaining the public Microdata implementations has been non-existent. They have Microdata implementations, but none of them have ever been made public (to my knowledge, it’s only for use in their internal systems). They have RDFa implementations, but those have never been made public either for similar reasons. What I feel is not important, reality is. Reality is that the search engine companies do not engage in that type of activity.

      “Surely by now they can’t afford to let it slowly wither and disappear anymore.”

      Sure they can, because they know that SEOs will do whatever they tell them to in order to get better search rankings. So, if schema.org said that you had to mark all your data up in JSON-LD to get better search rankings, you’d do it in a heartbeat because that’s what people hire you to do.

  3. “For example, I’m the lead editor of JSON-LD. ”
    About that, thanks for your efforts with that. It’s much appreciated.

    “Replace “Microdata” with “Microformats” and “RDFa 1.0” and you’ll start to see a trend.”
    Hehe, you probably have a solid point here. Although the amount of time I invested in making the switch from RDFa to Microdata makes me very much wish you are wrong with this. It nevertheless could very well turn out to be the ugly truth (for Microdata adopters).

    “you’d do it in a heartbeat because that’s what people hire you to do.”
    Partially true. But in my own defence, I started using RDFa before it had any real effects in search engines. I started ‘playing’ with it because I saw the need and usefulness of it even before it had any form of SEO benefits, and therefore started a long time ago. I’ve built news sites which have been mentioned at the W3C as being among the very earliest large-scale adopters of RDFa (and all the faults that come with it). So I’m definitely no stranger to adopting new technologies even before it’s clear what their benefits are.

    Even though I’m a person who works in SEO and should follow what is current in relation to that, deep inside I am a Semantics fanboy, in just about any format it presents itself in. Microformats is just about the only format I have ever refused to work with. My roots lie with web accessibility, and in the time Microformats was gaining popularity I was a strong advocate of not implementing it due to the conflicts it has with accessibility.

    So even though most of the time I follow so called SEO-best-practices, I also can turn against them if I see serious downsides in a technique. Because if SEO were to be so black and white there wouldn’t be a difference between Black- and White-hat SEO. Not all SEOs lack ethics you know. A substantial part of SEOs are actually trying to help improve the internet as well as getting clients to perform well. One doesn’t necessarily exclude the other.

    I truly hope that the energy I’ve poured into working with Microdata doesn’t prove to have been for nothing, but if what you say is fact, I can’t deny you may have a very correct yet scary (for me) hypothesis.

    • ManuSporny says: (Author)

      “Not all SEOs lack ethics you know.”

      That’s not what I was trying to imply. What I am saying is that if your customers want good search rankings, you will use whatever technology will give you the best result. So, if Google makes it seem as if Microdata+schema.org is preferred, you will listen to their advice and implement it (because not doing so is going to be against your customers’ best interests).

      Google supports RDFa Lite in schema.org, but their lag in updating the documentation has led to a large number of people getting a false impression of what schema.org’s preferences are: http://blog.schema.org/2012/06/semtech-rdfa-microdata-and-more.html

  4. There’s still schema.org promoting its use in the name of major search providers, with its own cross-domain vocab. Its cross-domain vocab allows us to think of information as data, which is good for archiving. If we traverse in the reverse direction from data towards information, where every domain has a functional (one-to-one mapping) relationship with an ontology, then the semantic rich (domain experts) can get comfortable space (vocab) on the Web platform. As long as investment in schema.org is disproportional to other vocabs, it might simply resurrect microdata.

    • ManuSporny says: (Author)

      “As long as investment in schema.org is disproportional to other vocabs, it might simply resurrect microdata.”

      Exactly right. The thing that could keep the Microdata zombie shambling along is Google’s insistence on its usage for schema.org, since that’s really the only strong usage Microdata has ever seen.

  5. Shoresite says:

    The ultimate markup standard supported by the W3C is the linked data stack.
    In that context, isn’t schema.org only a temporary solution, along with microformats and microdata?
    It provides browsers with semantic markup as in-page open data for hard-coded presentation and plugin hooks.
    For SEO and custom search, are the standards still applied by SE indexing algorithms through site maps?

    • ManuSporny says: (Author)

      I don’t know if schema.org is temporary. The way it is being developed (through an open community, but a closed website that you cannot contribute to) is certainly not ideal.

      I don’t quite understand the other question you’re asking, or comment you’re making.

  6. tomByrer says:

    H.E.A.T said
    > No one is devaluing RDFa or JSON-LD. What is being said is that browser implementations is confusing the masses as to which syntax to employ…. Your commenters… do not want to use it if all it will become is bloat in the pages.

    Exactly how I feel. A few years ago I thought Microdata.org was the best resource; had major names as backers including the majority holder of the search market. Now that same said company is dropping support for Microdata, I feel I’ve wasted my time.

    I do think competition is healthy; it often helps both become better. But we busy web devs just want the best for ourselves & clients… so why does Microdata.org fully switch to RDFa then?

  7. Bernardo Medeiros says:

    So why does Google recommend Microdata and ditch RDFa?

    Here:
    https://support.google.com/webmasters/answer/99170?hl=en

    And here:
    https://support.google.com/webmasters/answer/162163?hl=en&ref_topic=1088474

    If the major player in search tells me not to use RDFa, why should I listen to you?

  8. As a web designer, I’m finding it really confusing as to which markup should be used for structured data (Microdata/Microformats/RDFa). Although, I have now switched over to RDFa Lite after reading this article (from Microformats – I’ve never liked Microdata personally) the fact that Google is still recommending Microdata makes me feel uneasy. I think more pressure should be put on Google to clarify their position, perhaps they should just remove the recommendation in favour of Microdata and then leave web designers/developers to decide/do their own research.

    In implementing RDFa Lite I did find it very hard to find actual markup examples, so using schema.org as a guide I created my own in this blog post. I’d be really grateful if someone could sanity check them or point me in the direction of a community where I can find someone who can:

    http://www.enov8.co.uk/web-design-blog/2014/03/17/structured-data-using-rdfa-lite/

