A New Way Forward for HTML5
By halting the XHTML2 work and announcing more resources for the HTML5 project, the World Wide Web Consortium has sent a clear signal on the future markup language for the Web: it will be HTML5. Unfortunately, the decision comes at a time when many working with Web standards have taken issue with the way the HTML5 specification is being developed.
The shutdown of the XHTML2 Working Group has brought to a head a long-standing set of concerns related to how the new specification is being developed. This page outlines the current state of development and suggests that there is a more harmonious way to move forward. By adopting some or all of the proposals outlined below, the standards community will ensure that the greatest features for the Web are integrated into HTML5.
What’s wrong with HTML5?
There are likely as many reasons why HTML5 is problematic as there are reasons why HTML5 will succeed where XHTML2 didn’t. Some of these reasons are technical in nature, some are based on process, and others may lack sufficient evidential support.
Many, including the author of this document, have praised the WHAT WG for making steady progress on the next version of HTML. Using implementation data to back up additions, removals or re-writes to the core HTML specification has helped improve the standard. In general, browser implementors have been very supportive of the current direction, so there is much to be celebrated when it comes to HTML5.
The HTML5 editorial process, however, has also drawn several complaints from long-time members of the web standards community who find themselves marginalized as the specification proceeds. The problem has more to do with politics than with science but, as is often the case in the real world, the politics are shaping the science.
The biggest complaint with the current process is that the power to change the originating specification lies with one individual. This gives that one individual, or group of individuals, an advantage that has created an acrimonious environment.
In a Consortium like the W3C, a process that shows favor to certain members by giving them privileges that other members can never attain is fundamentally unfair.
In this particular case, it tilts the table toward the current HTML5 specification editor and toward the browser manufacturers. Search companies, tool developers, web designers and developers, usability experts and many others have suddenly found themselves without a voice. Some say that this approach is a good thing — it focuses on those that must adhere to the standard and on those that are producing results. Unfortunately, the approach also creates conflict. The secondary communities feel a sense of unfairness because they feel their needs for the Web are not being met. It is not a simple problem with an easy solution.
HTML5 is now the way forward. In order to ensure that dissenting argumentation can have an impact on the specification, if the argumentation is valid, we must subtly change the editorial process. The changes should not affect the speed at which HTML5 is proceeding, so a certain finesse must be applied to any action we take to make the HTML5 community better.
This set of proposals addresses our current situation and what other similar communities have done to improve their own development processes.
The goal of the actions listed in this document is to allow all of the communities interested in HTML5 to collaborate on the specification in an efficient and agreeable manner. The tools that we elect to use have an effect on the perceived editorial process in place. Currently, the process does not allow for wide-scale collaboration and the sharing and discussion of proposals that support consortium-based specification authoring.
The strategy for moving HTML5 forward should focus on being inclusive without increasing disruption, red tape, or maintenance headaches for those who are contributing the most to the current HTML5 specification. Any strategy employed should ensure that we create a more open, egalitarian environment where everyone who would like to contribute to the HTML5 specification has the ability to do so without the barriers to entry that exist today.
About the Author
Manu Sporny is a Founder of Digital Bazaar and the Commons Design Initiative, an Invited Expert to the W3C, the editor for the hAudio Microformat specification, the RDF Audio and Video vocabularies, a member of the RDFa Task Force and the editor of the HTML5+RDFa specification.
Over the next six months, he will be raising funding from private enterprise and public institutions to address many of the issues outlined below. If your company depends on the Internet and can afford to fund just a small fraction of the work below (minimum $8K grant), then please contact him at firstname.lastname@example.org. If you know of an institution that is able to fund the work described on this page, please have them contact Manu.
The majority of this document lists some of the current HTML5 issues and attempts to provide actions that the standards community could take to address them.
Problem: A Kitchen Sink Specification
The HTML5 specification currently weighs in at 4.1 megabytes of HTML. If one were to print it out, single-spaced and using 12-point font, the document would span 844 pages. It is not even complete yet and it is already roughly the same length as the new edition of “War and Peace” – a full 3 inches thick.
Reading an 844-page technical specification is daunting for even the most masochistic of standards junkies. The HTML 4.01 specification, problematic in its own right, is 389 pages. Not only are large specifications overwhelming, but it can be almost impossible to find all of the information you need in them. Clearly the specification needs to be long enough to be specific, but no longer than that. The issue has more to do with focus and accessibility to web authors and developers than with length.
The current HTML5 specification contains information that is of interest to web authors and designers, parser writers, browser manufacturers, HTML rendering engine developers, and CSS developers. Unfortunately, the spec attempts to address all of these audiences at once, quickly losing its focus. A document that is not focused on its intended audience will reduce its utility for all audiences. Therefore, some effort should be placed into making the current document more coherent for each intended audience.
Action: Splitting HTML5 into Logically Targeted Documents
“Know your audience” is a lesson that many creative writers learn long before their first comic, book, or novel is published. Instead of creating a single uber-specification, we should split the document into logically targeted sections that are more focused on their intended audience. For example, the following is one such break-out:
- HTML5: The Language – Syntax and Semantics
  - This document lists all of the language elements and their intended use
  - Useful to authors and content creators
- HTML5: Parsing and Processing Rules
  - This document lists all of the text->DOM parsing and conversion rules
  - Useful for validator and parser writers
- HTML5: Rendering and Display Rules
  - This document lists all of the document rendering rules
  - Useful for browser manufacturers and developers of other applications that perform visual and auditory display
- HTML5: An Implementer’s Guide
  - This document lists implementation guidelines, common algorithms, and other implementation details that don’t fit cleanly into the other three documents.
  - Useful for application writers who consume HTML5
Problem: Commit Then Review
The HTML5 specification, to date, has been edited in a way that has enjoyed great success in other open source projects. It uses a development philosophy called Commit-Then-Review (CTR). This process is used from time to time at the W3C for small changes, while larger, possibly contentious changes use a process called Consensus-Then-Commit (CTC).
In a CTR process, developers make changes to an open source project and commit their changes for review by other developers. If the new changes work better, they are kept. If they don’t work, they are removed. Source control systems such as CVS, Subversion, and Git are heavily relied upon to provide the ability to rewind history and recover lost changes. This process of making additions to a specification can cause an unintended psychological effect; ideas that are already in a specification are granted more weight than those that are not. In the worst case, one may use the fact that text exists to solve a certain problem to squelch arguments for better solutions.
In a CTC process, as used at the W3C, consensus should be reached before an editor changes a document. This approach assumes that it is much harder to remove language than it is to add it. The approach is often painfully slow and it can take weeks to reach consensus on particularly touchy items. Many have asserted that the HTML5 specification could not have been developed via CTC, an assertion that is also held by the author of this document.
There is nothing wrong with either approach as long as certain presumptions hold. One of the presumptions that CTR makes is that there are many people who may edit a given project. This ensures that good ideas are improved upon, bad ideas are quickly replaced by better ideas, and that no one has the ability to impose his or her views on an entire community without being challenged. However, in HTML5, there is only one person who has editing privileges for the source document. This shifts the power to that individual and requires everyone else in the project to react to changes implemented by that committer.
CTR also requires a community where there is mutual trust among the project leaders and contributors. The HTML5 community is, unfortunately, not in that position yet.
Action: More Committers + Distributed Source Control
Commit Then Review is a valuable philosophy, but it is dangerous when you only have one committer. Having only one committer creates a barrier to dissenting opinion. It does not allow anyone else to make a lasting impact on the only product of HTML WG and WHAT WG – the HTML5 specification.
It is imperative that more editors are empowered in order to level the playing field for the HTML5 specification. It is also important that edit-wars are prevented by adopting a system that allows for distributed editing and source management.
The Git source control system is one such distributed editing and management solution. It doesn’t require linear development and it is used to develop the Linux kernel – a project dealing with many more changes per day than the HTML5 specification. It allows for separate change sets to be used and, most importantly, there is no one in “control” of the repository at any given point. There is no central authority, no political barrier to entry, and no central gatekeeper preventing someone from working on any part of the specification.
Distributed source control systems are like peer-to-peer networks. They make the playing field flat and are very difficult to censor. The more people that have the power to clone and edit the source repository, the larger the network effects. We would go from the one editor we have now, to ten editors in a very short time, and perhaps a hundred contributors over the next decade.
Problem: No Way for Experts to Contribute in a Meaningful Way
There have been three major instances spanning 12-24 months in which expert opinion was not respected during the development of the current HTML5 specification. These instances concerned Scalable Vector Graphics (SVG), Resource Description Framework in Attributes (RDFa), and the Web Accessibility Initiative (WAI).
The situation has received enough attention that there are now web comics and blogs devoted to the conflict between various web experts and the author of the HTML5 specification. These disagreements have become a spectacle, with many experts now refusing to take part in the development of the HTML5 specification, citing others’ unsuccessful attempts to convey decades of research to the current editor of the HTML5 specification.
Similarly, the editor of the current specification has many valid reasons not to integrate suggestions into the massive document that is HTML5. However, an imperfection in a particular suggestion does not eliminate the need for a discussion of alternative proposals. The way the problem is being approached, by both sides, is fundamentally flawed.
Action: Alternate, Swappable Specification Sections
The HTML5 document should be broken up into smaller sections. Let’s call them microsections. The current specification source is 3.4 megabytes of editable text and is very difficult to author. It is especially daunting to someone who only wants to edit a small section related to their area of expertise. It is even more challenging if one wishes to re-arrange how the sections fit together or to re-use a section in two documents without having to keep them in sync with one another.
Ideally, certain experts, W3C Task Forces, Working Groups, and technology providers could edit the HTML5 specification microsections without having to worry about larger formatting, editing, merging, or layout issues. These microsections could then be processed by a documentation build system into different end-products: for example, HTML5+RDFa, HTML5+RDFa+ARIA, or HTML5+ARIA-RDFa-Microdata. Specifications containing different technologies could be produced very quickly without the overhead of having to author and maintain an entirely new set of documents. You wouldn’t need an editor per proposal to keep all of them in sync with one another, thus reducing the cost of making new proposals. Some microsections could even be used across 2-3 HTML5-related documents.
This approach, coupled with the move to a more decentralized source control mechanism, would provide a path for anyone who can clone a Git repository and edit an HTML file to contribute to the HTML5 specification. Merging changes into the “official” repository could be done via W3C staff contacts to ensure fair treatment to all proposals.
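As a rough illustration of the build step described above, the following Python sketch assembles a few end-products from a shared pool of microsections. The microsection names, their contents, and the `build_product` function are all hypothetical; no such tool exists in the current HTML5 tooling, and a real build system would presumably read each microsection from its own file in the repository.

```python
from typing import Dict, List

# Hypothetical microsections: name -> HTML fragment.
# In practice each fragment would live in its own file.
MICROSECTIONS: Dict[str, str] = {
    "core-elements": "<section><h2>Elements</h2></section>",
    "rdfa": "<section><h2>RDFa</h2></section>",
    "aria": "<section><h2>ARIA</h2></section>",
}

# Hypothetical end-products, each an ordered list of microsection names.
PRODUCTS: Dict[str, List[str]] = {
    "HTML5": ["core-elements"],
    "HTML5+RDFa": ["core-elements", "rdfa"],
    "HTML5+RDFa+ARIA": ["core-elements", "rdfa", "aria"],
}


def build_product(name: str) -> str:
    """Concatenate the microsections listed for one end-product."""
    wanted = PRODUCTS[name]
    missing = [s for s in wanted if s not in MICROSECTIONS]
    if missing:
        raise KeyError(f"unknown microsections: {missing}")
    return "\n".join(MICROSECTIONS[s] for s in wanted)


if __name__ == "__main__":
    print(build_product("HTML5+RDFa"))
```

Note that the "core-elements" microsection is written once and reused by all three products, which is the point of the approach: no one has to keep parallel copies in sync.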
Problem: Mixing Experimental Features with Stable Ones
There is language in the HTML5 specification that indicates that different parts of the specification are at different levels of maturity. However, it is difficult to tell which parts of the specification are at which level of maturity without deep knowledge of the history of the document. This needs to change if we are going to start giving the impression of a stable HTML5 specification.
When someone who is not knowledgeable about the arcane history of the HTML5 Editor’s Draft sees the <datagrid> element in the same document as the <canvas> element, it is difficult for them to discern the level of maturity of each. In other words, canvas is implemented in many browsers while datagrid is not, yet they are outlined in the same document. The HTML5 specification does have a pop-up noting which features are implemented, have test cases, and are marked for removal; however, it is difficult to read the document and know exactly which paragraphs, sentences and features are experimental and which ones are not.
This is not to say that either <datagrid> or <canvas> shouldn’t be in the HTML5 specification. Rather, there should be different HTML5 specification maturity levels and only features that have reached maturity should be placed into a document entering Last Call at the W3C. We should clearly mark what is and isn’t experimental in the HTML5 specification. We should not standardize on any features that don’t have working implementations.
Action: Shifting to an Unstable/Testing/Stable Release Model
There is much that the standards community can learn from the release processes of larger communities like Debian, Ubuntu, Red Hat, the Linux kernel, FreeBSD and others. Each of those communities clearly differentiates software that is very experimental, software that is entering a pre-release testing phase, and software that is tested and intended for public consumption.
If the HTML5 specification generation process adopts microsections and a distributed source control mechanism, it should be easy to add, remove, and migrate features from an experimental document (Editor’s Draft), to a testing specification (Working Draft), to a stable specification (Recommendation).
While it may seem as if this is how W3C already operates, note that there is usually only a single stage that is being worked on at a time. This doesn’t fit well with the way HTML5 is being developed in that there are many people working on stable features, testing features, and experimental features simultaneously. A new release process is needed to ensure a smooth, non-disruptive transition from one phase to the next.
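To make the unstable/testing/stable idea a little more concrete, here is a minimal Python sketch of features moving between maturity levels. Everything in it is an assumption made for illustration: the `Maturity` enum, the `promote` and `document_contents` functions, the rule that each document contains every feature at or above its level, and the maturity labels assigned to canvas and datagrid.

```python
from enum import IntEnum
from typing import Dict, List


class Maturity(IntEnum):
    UNSTABLE = 0  # experimental; appears only in the Editor's Draft
    TESTING = 1   # pre-release; appears in the Working Draft
    STABLE = 2    # tested; appears in the Recommendation


# Illustrative labels only, not an official status for either element.
features: Dict[str, Maturity] = {
    "canvas": Maturity.TESTING,
    "datagrid": Maturity.UNSTABLE,
}


def promote(name: str) -> Maturity:
    """Move one feature up a single maturity level."""
    current = features[name]
    if current is Maturity.STABLE:
        raise ValueError(f"{name} is already stable")
    features[name] = Maturity(current + 1)
    return features[name]


def document_contents(level: Maturity) -> List[str]:
    """Assumed rule: a document holds features at or above its level."""
    return sorted(n for n, m in features.items() if m >= level)


if __name__ == "__main__":
    print(document_contents(Maturity.TESTING))   # Working Draft contents
    promote("datagrid")
    print(document_contents(Maturity.TESTING))   # after promotion
```

Because all three documents are generated from the same labeled feature set, people can keep working on experimental, testing, and stable features at the same time without one phase blocking another.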
Problem: Two Communities, One Specification
When the WHAT WG started what was to become HTML5, the group was asked to do the work outside of the World Wide Web Consortium process. As a benefit of working outside the W3C process, the HTML5 specification was authored and gained support and features very quickly. Letting anybody join the group, having a single editor, dealing directly with the browser manufacturers and focusing on backwards compatibility all resulted in the HTML5 specification as we know it today.
When the W3C and WHAT WG decided to collaborate on HTML5 as the future language of the web, it was decided that work would continue both in the HTML WG and the WHAT WG. People from the WHAT WG joined the HTML WG and vice-versa to show good faith and move towards openly collaborating with one another. At first, it seemed as if things were fine between the two communities. That is, until the emergence of an “us vs. them” undercurrent – both at the W3C and in WHAT WG.
Keeping both communities active was and will continue to be a mistake. Instead of combining mailing lists, source control systems, bug trackers and the variety of other resources controlled by each group, we now operate with duplicates of many of the systems. It is not only confusing, but inefficient to have duplicate resources for a community that is supposed to be working on the same specification. It sends the wrong signal to the public. Why would two communities that are working on the same thing continue to separate themselves from one another, unless there was a more fundamental issue that existed?
Action: Merging the Communities
The communities should be merged slowly. Data should be migrated from each duplicate system and a single system should be selected. The mailing lists should be one of the first things to be merged. If either community feels that the other community isn’t the proper place for a list, then a completely new community should be created that merges everyone into a single, cohesive group.
The two communities should bid on an html5.xyz domain (html5.org, html5.info, html5.us) and consolidate resources. This would not only be more efficient, but also eventually remove the “us vs. them” undercurrent.
Problem: Specification Ambiguity
A common problem for specification writers is that, over the years, their familiarity with the specification makes them unable to see ambiguities and errors. This is one of the reasons why all W3C specifications must go through a very rigorous internal review process as well as a public review process. Public review and feedback are necessary in order to clarify specification details, gather implementation feedback, and ultimately produce a better specification.
In order to contribute bug reports, features, or comments on the HTML5 specification, one must send an e-mail to either the HTML WG or the WHAT WG. The combined HTML WG and WHAT WG mailing list traffic can range from 600 to 1,200 very technical e-mails a month. Asking those who would like to comment on the HTML5 specification to devote a large amount of their time to writing and responding to mailing list traffic is a formidable request, so formidable that many choose not to participate in the development of HTML5.
Action: In-line Specification Commenting and Bug Tracking
There are many websites that allow people to interactively comment on articles or reply to comments on web pages. We need to ensure that there are as few barriers as possible for commenting on the HTML5 specification. Sending an e-mail to the HTML WG or WHAT WG mailing lists should not be a requirement for providing specification feedback. It should be fairly easy to create a system that allows specification readers to comment directly on text ambiguities, propose alternate solutions, or suggest text deletions when viewing the HTML5 specification.
The down-side to this approach is that there may be a large amount of noise and a small amount of signal but that would be better than no signal at all. We must understand that many web developers, authors, and those who have an interest in the future of the Web cannot put as much time into the HTML5 specification as those who are paid to work on it.
Problem: No Way to Propose Lasting Alternate Proposals
If one were to go to the trouble of adding several new sections into HTML5 or modifying parts of the document, they would then need to keep their changes in sync with the latest version of the document at svn.whatwg.org. This is because there is currently only one person who is allowed to make changes to the “official” HTML5 specification. Keeping documents in sync is time-consuming and should not be required of participants in order to effect change.
Since the HTML5 specification document is changed on an almost daily basis, the current approach forces editors to play a perpetual game of catch-up. This results in them spending more time merging changes than contributing to the HTML5 document. It may also cause the feeling that their changes are less important than those further upstream.
Action: At-will Specification Generation from Interchangeable Modules
As previously mentioned, moving to a microsectioned approach can help in this case as well. There can be a number of alternative microsections that editors may author so that each section can be a drop-in replacement. The document build system could be instructed, via a configuration file, on which microsections to use for a particular output product. Therefore if someone wanted to construct a version of HTML5 with a certain feature X, all that would be required is the authoring of the microsection and an instruction for the documentation build system to generate an alternative specification with the microsection included.
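The configuration-driven generation described above might look something like the following Python sketch. The JSON configuration format, the slot names, the "semantics/feature-x" alternative, and the `generate` function are all invented here for illustration; they are assumptions about how such a build system could work, not a description of any existing tool.

```python
import json
from typing import Dict

# Hypothetical default microsections, keyed by the slot they fill.
DEFAULT_SECTIONS: Dict[str, str] = {
    "semantics": "<section>Default text on embedded semantics</section>",
    "forms": "<section>Forms</section>",
}

# Hypothetical alternative microsections authored as drop-in replacements.
ALTERNATIVES: Dict[str, str] = {
    "semantics/feature-x": "<section>Alternative text with feature X</section>",
}

# Hypothetical configuration file contents for one output product.
CONFIG_JSON = """
{
  "order": ["semantics", "forms"],
  "replace": {"semantics": "semantics/feature-x"}
}
"""


def generate(config_text: str) -> str:
    """Build one specification variant from a JSON configuration."""
    config = json.loads(config_text)
    replacements = config.get("replace", {})
    parts = []
    for slot in config["order"]:
        if slot in replacements:
            # Swap in the alternative microsection for this slot.
            parts.append(ALTERNATIVES[replacements[slot]])
        else:
            parts.append(DEFAULT_SECTIONS[slot])
    return "\n".join(parts)


if __name__ == "__main__":
    print(generate(CONFIG_JSON))
```

Under this arrangement, an editor proposing feature X maintains only the one alternative microsection and a few lines of configuration, rather than a full fork of the specification that must be merged against upstream almost daily.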
Problem: Partial Distributed Extensibility
One of the things that will vanish when the XHTML2 Working Group is shut down at the end of this year is the concept that there would be a unified, distributed platform extensibility mechanism for the web. In short, distributed platform extensibility allows for the HTML language to be extended to contain any XML document. Examples of XHTML-based extensibility include embedding SVG, MathML, and a variety of other specialized XML-based markup languages into XHTML documents.
HTML5 is specifically designed not to be extended in a decentralized manner for the non-XML version of the language. It special-cases SVG and MathML into the HTML platform. It also disallows platform extensibility and language extensibility in HTML5 (not XHTML5) using the same restricted rubric when they are clearly different types of extensibility mechanisms.
Many proponents of distributed extensibility are very concerned by this rubric and resulting design decision. At the heart of distributed extensibility is the assertion that anyone should be able to extend HTML in the future to address their markup needs. It is a forward-looking statement that asserts that the current generation cannot know how the world might want to extend HTML. The power should be placed into the hands of web developers so that they may have more tools available to them to solve their particular set of problems.
Action: A Set of Proposals for Distributed Extensibility
Whether or not distributed extensibility will ever be used on a large scale is not the issue. The issue is that there are currently no proposals for distributed extensibility in HTML5 (again, not XHTML5). Without a proposal, there is no counter-point for the “no distributed extensibility” assertion that HTML5 makes. Thus, if the W3C were to form consensus at this point, there would be only one option.
Consensus around a single option is not consensus. At the very least, HTML5 needs draft language for distributed extensibility. It doesn’t need to be the same solution that XHTML5 provides; it doesn’t even need to be close, but alternatives should exist. XHTML spent many years solving this problem and because of that, SVG and MathML found a home in XHTML documents. Enabling specialist communities, such as the visual arts and mathematics, to extend HTML to do things that were not conceivable during the early days of the Web is a fundamental pillar of the way the Web operates today.
Similarly, a set of tools to provide data extensibility does not exist in a form that is acceptable to the standards community. These tools are also going to fundamentally shape the way we embed and transfer information in web pages. If we are realistic about the expanse of problems that the Web is being called upon to solve, we should ensure that data extensibility capabilities are provided as we move forward.
Problem: Disregarding Input from the Accessibility Community
Accessibility is rarely seen as important until one finds oneself in a position where one’s own or a loved one’s vision, hearing, or motor skills do not function at a level that makes it easy to navigate the web. Accessible websites are important not only to those with disabilities, but also to those who cannot interact with a website in a typical fashion. For example, web accessibility is also important when one is on a small form factor device, using a text-only interface, or a sound-based interface.
Members of the Web Accessibility Initiative (WAI) and the creators of the Accessible Rich Internet Applications (ARIA) technical specification have noted on a number of occasions that they feel as if they are being ignored by the HTML5 community.
Action: Integrate the Accessibility Community’s Input
Empowering WAI to edit the HTML5 specification in a way that does not conflict with others, but produces an accessibility-enhanced HTML5 specification, is important to the future of the Web. Microsections and distributed source control would allow this type of collaboration without affecting the speed at which HTML5 is being developed. It may be that WAI needs a specification writer who is capable of producing unambiguous language that will enable browser manufacturers to easily create interoperable implementations.
The Plan of Action
In order to provide a greater impact on the near-term health of HTML5, the proposals listed above should be performed in the following order (which is not the order in which they are presented above):
1. Implementation of Git for distributed source control.
2. Microsection splitter and documentation build system.
3. Recruit more committers into the HTML5 community.
4. Split features based on their experimental nature into unstable and testing during Last Call.
5. Implement in-line feedback mechanism for HTML5 spec.
6. Distributed extensibility proposals for HTML5.
7. Better, more precise accessibility language for HTML5.
8. Merge the HTML WG and WHAT WG communities.
The author would like to thank the following people for reviewing this document and providing feedback and guidance (in alphabetical order): Ben Adida, John Allsopp, Tab Atkins Jr., L. David Baron, Dan Connolly, John Drinkwater, Micah Dubinko, Michael Hausenblas, Ian Hickson, Mike Johnson, David I. Lehn, Dave Longley, Samantha Longley, Shelley Powers, Sam Ruby, Doug Schepers, and Kyle Weems.