
The Web Browser API Incubation Anti-Pattern

This blog post is about a number of hard lessons learned at the World Wide Web Consortium (W3C) while building new features for the Web with browser vendors and how political and frustrating the process can be.

The W3C Web Payments Community Group (a 220 person pre-standardization effort around payments on the Web) has been in operation now for a bit more than four years. This effort was instrumental in starting conversations around the payments initiative at W3C and has, over the years, incubated many ideas and pre-standards specifications around Web Payments.

When the Web Payments Working Group (a different, official standardization group at W3C) was formed, a decision was made to incubate specifications from two Community Groups and then merge them before the Web Payments Browser API First Public Working Draft was released in April 2016.

The first community group was composed of roughly 220 people from the general public (the Web Payments Community Group). The second community group was composed of a handful of employees from browser vendor companies (fewer than 7 active). When it came time to merge these two specifications, the browser vendors dug in their heels, making no compromises, and the Web Payments Community Group had no choice but to sacrifice its own specifications in the hope of making progress based on the Microsoft/Google specifications.

It is currently unclear how much the Web Payments Community Group or the Web Payments Working Group will be able to sway the browser vendors on the Web Payments Browser API specification. I do believe that there is still room for the Web Payments Community Group to influence the work, but we face a grueling uphill slog.

The rest of this post analyzes a particular W3C standards development anti-pattern that we discovered over the last few years and attempts to advise future groups at W3C so that they may avoid the trap in which we became ensnared.

A Brief History of Web Payments in and Around W3C

Full disclosure: I’m one of the founders and chairman of several W3C Community Groups (JSON-LD, Permanent Identifiers, Credentials, and Web Payments). A great deal of my company’s time, effort, and money (to the tune of hundreds of thousands of dollars) has gone into these groups in an effort to help them create the next generation Web. I am biased, so take this post with a grain of salt. That said, my bias is a well-informed one.

For a while now, the W3C Community Group process seemed to be working as designed. It took a great deal of effort to get a new activity started at W3C, but a small group of very motivated “outsiders” were able to kick-start the most supported initiative in W3C history. Here’s a brief history of payments in and around W3C:

  • 1998-2001 – The W3C launches a Micropayments activity that ultimately fails.
  • 2010 May – The PaySwarm open payments community launches outside of the W3C with a handful of people, a few specifications, and the goal of standardizing payments, offers of sale, digital receipts, and automated digital contracts on the Web.
  • 2012 May – W3C staff contact, Doug Schepers, convinces a small handful of us to create the Web Payments Community Group at W3C and bring the work into W3C. In parallel, a W3C Headlights process identifies payments as an area to keep an eye on.
  • 2012 – 2014 – Web Payments Community Group expands to 100 people with several experimental specifications under active development. Web Payments Community Group members engage the United Nations’ Internet Governance Forum, the banking/finance industry, and the technology industry to gather support from large organizations.
  • 2013 October – W3C TPAC in Shenzhen, China – W3C Management is convinced by the Web Payments Community Group and W3C staff contact Dave Raggett to hold a workshop for Web Payments.
  • 2014 March – W3C Workshop for Web Payments is held in Paris, France. The result is a proposal to start official work at the W3C.
  • 2014 October – First official Web Payments Activity, the Web Payments Interest Group, is launched to create a vision and strategy for Web Payments. It does so, along with the creation of a Web Payments Working Group charter. A great deal of the Web Payment Community Group’s input is taken into account.
  • 2015 September – Second official Web Payments activity, the Web Payments Working Group, is launched to execute on the Web Payments Interest Group’s vision.

So far, so good. While it took twice as long to start a Working Group as we had hoped, we had far greater support going in than we imagined when we embarked on our journey in 2010. The Web Payments Interest Group was being receptive to input from the Web Payments Community Group. The Working Group was launched and that’s when the first set of issues between the Web Payments Community Group, Web Payments Interest Group, and the Web Payments Working Group started.

The Web Browser API Incubation Anti-Pattern

  • 2015 October – A decision is made to incubate the Web Payments Working Group input specifications in separate Community Groups.

When W3C Working Groups are created, they are expected to produce technical specifications that expand the capabilities of the Web. Often input is taken from Community Group documents, existing specifications and technology created by corporate organizations, research organizations, and workshops. This input is then merged into one or more Working Group specifications under the control of the Working Group.

That is how it is supposed to work, anyway. In reality, there are many shenanigans that are employed by some of the largest organizations at W3C that get in the way of how things are supposed to work.

Here is one such story:

The Web Payments Community Group has a body of specification work going back 4+ years. When the Web Payments Working Group started, Microsoft and Google did not have a Web Payments specification. As a result, they put something together in a couple of weeks and incubated it at the Web Incubator Community Group. In an attempt to “play on the same field”, some of the editors of the Web Payments Community Group specifications moved that community’s specifications to the Web Incubator Community Group and incubated them there as well.

The explanation that the Microsoft and Google representatives used was that they were just figuring this stuff out and wanted a bit of time to incubate their ideas before merging them into the Web Payments Working Group. At the time I remember it sounding like a reasonable plan and was curious about the Web Incubator Community Group process; it couldn’t hurt to incubate the Web Payments Community Group specifications alongside the Microsoft/Google specification. I did raise the point that I thought we should merge as quickly as possible and the idea that we’d have “competing specifications” where one winner would be picked at the end was a very undesirable potential outcome.

The W3C staff contacts, chairs, and Microsoft/Google assured the group that we wouldn’t be picking a winner and would merge the best ideas from each set of specifications in time. I think everyone truly believed that would happen at the time.

Until…

  • 2016 February – The months-old Microsoft/Google specification is picked as the winner over the years-old work that went into the Web Payments Community Group specification. Zero features from the Web Payments Community Group specification are merged; instead, the Web Payments Community Group is told to submit pull requests if it would like modifications made to the Microsoft/Google specification.

Four Months of Warning Signs

The thing the Web Payments Working Group did not want to happen, the selection of one specification with absolutely zero content being merged in from the other specification, ended up happening. Let’s rewind a bit and analyze how this happened.

  • 2015 October – The Web Payments Working Group has its first face-to-face meeting at W3C TPAC in Sapporo, Japan.

Warning Sign #1: Lack of Good-Faith Review

At the start of the Working Group, two sets of specification editors agreed to put together proposals for the Web Payments Working Group. The first set of specification editors consisted of members of the Web Payments Community Group. The second set of specification editors consisted of employees from Microsoft and Google. A mutual agreement was made for both sets of specification editors to review each other’s specifications and raise issues on them in the hope of eventually hitting a point where a merge was the logical next step.

The Web Payments Community Group’s editors raised 24+ issues on the Microsoft/Google proposal and compared and contrasted the two sets of specifications in order to highlight the differences between them. This took an enormous amount of effort and the work continued for weeks. The Microsoft/Google editors raised zero issues on the Web Payments Community Group’s counter-proposals. To this day, it’s unclear if they actually read through the counter-proposals in depth.

Warning Sign #2: Same Problem, Two Factions

When the Web Payments Working Group agreed to work on their first specification, noting that there were going to be at least two proposals, the idea was raised by the Microsoft/Google representatives that they wanted to work on their specification in the Web Incubator Community Group. Even though the Web Payments Community Group editors chose to incubate their community’s specifications in the Web Incubator Community Group, the inevitable structure of that work forced the editors to split into two factions to make rapid progress on each set of specifications. This led to a breakdown in the lines of communication between the editors, with Microsoft and Google collaborating closely on their specification and the Web Payments Community Group editors focusing on trying to keep their specifications in sync with the Microsoft/Google specification without significant engagement from the Microsoft/Google editors.

This inevitably led to closed discussion, misunderstood motives, backchannel discussions, and a variety of other things that are harmful to the standardization process. In hindsight, this was trouble waiting to happen. This sort of system pits the editors and the group against itself and the most likely outcome is a Web Payments Community Group vs. Microsoft/Google gambit. Systems often force a particular outcome and this system forces harmful competition and a winner-take-all outcome.

Warning Sign #3: One-Sided Compromise

A few weeks before the Web Payments Working Group face-to-face meeting, the editors of the Web Payments Community Group specifications discovered (through a backchannel discussion) that “the browser vendors are never going to adopt JSON-LD and the Web Payments Community Group specification will never gain traction as a result”. The Web Payments Community Group specifications used JSON-LD to meet the extensibility needs of Web Payments. This desire to not use JSON-LD by the browser vendors was never communicated through any public channel nor was it directly communicated by the browser vendors privately. This is particularly bad form in an open standardization initiative as it leaves the editors to guess at what sort of proposal would be attractive to the browser vendors.

Once again, the Web Payments Community Group editors (in an attempt to propose a workable specification to the browser vendors) reworked the Web Payments Community Group specifications to compromise and remove JSON-LD as the message extensibility mechanism, ensuring it was purely optional and that no browser would have to implement any part of it. We did this not because we thought it was the best solution for the Web, but in an effort to gain some traction with the browser vendors. This modification was met with no substantive response from the browser vendors.

Warning Sign #4: Limited Browser Vendor Cycles

Two weeks before the Web Payments Working Group face-to-face meeting, the Web Payments Community Group editors personally reached out to the Microsoft/Google editors to see why more collaboration wasn’t taking place. Both editors from Microsoft/Google noted that they were just trying to come up to speed, incubate their specifications, and make sure their proposal was solid before the face-to-face meeting. It became clear that they too could be stressed for time and did not have the bandwidth to review or participate in the counter-proposal discussions.

The Web Payments Community Group specification editors, in a further attempt to compromise, forked the Microsoft/Google specification for the purposes of demonstrating how some aspects of the Web Payments Community Group design could be achieved with the browser vendor’s specification and prepared a presentation to outline four compromises that could be made to get to a merged specification. The proposal received no substantive response or questions from the browser vendors.

Warning Sign #5: Insufficient Notice

Google put forward a registration proposal 12 hours before the Web Payments face-to-face meeting and announced it as a proposal during the face-to-face. This has happened again recently with the Microsoft/Google proposed Web Payments Method Identifiers specification. Instead of adding an issue marker to the specification to add features, as was being done for other pull requests, the editors unilaterally decided to merge one particular pull request that took the specification from three paragraphs to many pages overnight. This happened the day before a discussion about taking that specification to First Public Working Draft.

Clearly, less than 24 hours isn’t enough time to do a proper review before making a decision and if the Web Payments Community Group editors had attempted to do something of that nature it would have been heavily frowned upon.

Things Finally Fall Apart

  • 2016 February – The Web Payments Working Group Face-to-Face Meeting in San Francisco

The Web Payments Community Group specification editors went into the meeting believing that there would be a number of hard compromises made during the meeting, but a single merged specification would come out as a result of the process. There were presentations by Microsoft/Google and by the Web Payments Community Group before the decision making process of what to merge started.

For most of the afternoon, discussion went back and forth with the browser vendors vaguely pushing back on suggestions made by the Web Payments Community Group specifications. Most of the arguments were kept high-level instead of engaging directly with specification text. Many of the newer Web Payments Working Group participants were either unprepared for the discussion or did not know how to engage in the discussion.

Further, solid research data seemed to have no effect on the position of the browser vendors. When research data covering multiple large studies on shopping cart abandonment was presented to demonstrate how the Microsoft/Google solution could actually make things like shopping cart abandonment worse, the response was that “the data doesn’t apply to us, because we (the browser vendors) haven’t tried to solve this particular problem”.

After several hours of frustrating discussion, one of the Chairs posed an interesting thought experiment. He asked whether, if the group flipped a coin and picked a specification, there were any Web Payments Working Group members that could not live with the outcome. The lead editor of the Microsoft/Google specification said he could not support the outcome because he could not understand the Web Payments Community Group’s proposals. Keep in mind that one of these proposals was a fork of the Microsoft/Google proposal with roughly four design changes.

At that point, it became clear to me that all of our attempts to engage the browser vendors had been entirely unsuccessful. In the best case, they didn’t have the time to review our specifications in detail and thus didn’t have a position on them other than they wanted to use what was familiar, which was their proposal. Even if it was purely an issue of time, they made no effort to indicate they saw value in reviewing our specifications. In the worst case, they wanted to ensure that they had control over the specification and the Community Group’s technical design didn’t matter as much as them being put in charge of the specifications. The Working Group was scheduled to release a First Public Working Draft of the Web Payments Browser API a few weeks after the meeting, so there was a strong time pressure for the group to make a decision.

In the end, there were only two outcomes that I could see. The first was a weeks-to-months-long protracted fight which had a very high chance of dividing the group further and greatly reducing the momentum we had built. The second was to kill the Web Payments Community Group specifications, give the browser vendors what they wanted (control), and attempt to modify what we saw as the worst parts of the Microsoft/Google proposals to something workable to organizations that aren’t the browser vendors.

The Key Things That Went Wrong

After having some time to reflect on what happened during the meeting, here are the most prominent things that went wrong along the way:

  • Prior to the start of the Working Group, we tried, multiple times, to get the browser vendors involved in the Web Payments Community Group. Every attempt failed due to their concerns over Intellectual Property and patents. Only the creation of the Working Group was able to bring about significant browser vendor involvement.
  • The Microsoft/Google specification editors were not engaged in the Web Payments Interest Group and thus had very little context on the problems that we are trying to solve. This group discusses the future of payments on the Web at a much higher-level and provides the context for creating Working Groups to address particular problems. Without that context, the overarching goals may be lost.
  • The Microsoft/Google specification editors were not engaged with the Community Group specifications. It was a mistake to never require them to at least review the Community Group specifications and raise issues on them. They were asked to review them and they agreed to, but the Working Group should have put a stop on other work until those reviews were completed. At a minimum, this may have caused some of the ideas to be incorporated into the Microsoft/Google specification once work resumed.
  • The Working Group decided to incubate specifications outside of the Working Group, which meant that they had no control over how those specifications were incubated or the timeline around them. In hindsight, it is clear now that this was not just an experiment, but a dereliction of duty.
  • The Working Group decided to incubate technical specifications between two groups that are grossly unmatched in power: the Web Payments Community Group and the browser vendors. Imagine if we were to throw members of the middle class into the same room as the largest financial companies in the world and ask them to figure out how to extend capital to individuals in the future. Each group would have a different idea on what would be the best approach, but only one of the groups needs to listen to the other.

Two Proposed Solutions to the W3C Browser Incubation Anti-Pattern

We know that the following patterns hold true at W3C:

  • Attempting to propose a browser API with little to no browser vendor involvement from the beginning, regardless of the technical merit of the specification, does not work.
  • Attempting to propose a browser API where the browser vendors do not feel sufficiently in control of the specification does not work.

Giving browser vendors significant control over a browser API specification and providing feedback via pull requests is what currently works best at W3C. The downside here, of course, is that no one but the browser vendors can expect to have a significant voice in the creation of new browser APIs.

There are a few exceptions to the patterns listed above as some specification editors are more benevolent than others. Some put in more due diligence than others. Some standardization initiatives are less politically charged than others. Overall, there has to be a better way of doing specifications for browser APIs than what we have at W3C now.

The simplest solution to this problem is this:

As a best practice, a Working Group should never have multiple specifications that are still being incubated when it starts. The Working Group should pick either one specification before it starts or start with a blank specification and build from there. All discussion of that specification and decisions made about it should be done out in the open and with the group’s consent.

Barring what is said above, if the group decides to incubate multiple specifications (which is a terrible idea), here is a simple three-step proposal that could have been used to avoid the horror show that was the first couple of months of the Web Payments Working Group:

  1. Any Working Group that decides to incubate multiple specifications that do more or less the same thing, and will attempt to merge later, will always start out with one blank specification under the control of the Working Group.
  2. As editors of competing specifications finish up work on their incubated specifications, they make pull requests on the blank specification with concepts from their competing specification. The pull requests should be:
    • Small in size, integrating one concept at a time
    • Pulled in alternating between both specifications (with priorities on what to pull and when assigned by the originating editors)
    • Commented on by all editors in a timely fashion (or merged automatically after 5 business days; it’s better to have conflicting material in the spec that needs resolution than to ignore the desires of some of the group)
  3. The Working Group Chairs merge pull requests in. If there is a conflict, both options are pulled into the specification and an issue marker is created to ensure that the WG discusses both options and eventually picks one or combines them together.

The proposal is designed to:

  • Ensure that the Working Group is in control of the specification at all times.
  • Ensure that proper due diligence is carried out by all editors during the merging process.
  • Ensure that a mix of ideas is integrated into the specification.
  • Ensure that steady progress can be made on a single specification.
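
The three-step procedure above can be sketched in code. This is only an illustration of the idea, not a tool that exists at W3C: the class and function names, the example queues, and the rule used to detect a conflict (two different sources contributing to the same section) are all assumptions made for the sake of the sketch.

```python
from dataclasses import dataclass, field
from datetime import date, timedelta
from itertools import zip_longest


@dataclass
class PullRequest:
    source: str   # which incubated specification the concept comes from
    section: str  # part of the Working Group specification it touches
    concept: str  # one small concept per pull request


@dataclass
class Specification:
    """Starts blank and stays under the Working Group's control."""
    sections: dict = field(default_factory=dict)
    issues: list = field(default_factory=list)

    def merge(self, pr: PullRequest) -> None:
        options = self.sections.setdefault(pr.section, [])
        # Assumed conflict rule: another source already contributed here.
        conflict = any(o.source != pr.source for o in options)
        options.append(pr)  # both options are kept in the specification
        marker = f"ISSUE: pick or combine the options in '{pr.section}'"
        if conflict and marker not in self.issues:
            self.issues.append(marker)


def alternate(queue_a, queue_b):
    """Interleave two editor-prioritized queues: a1, b1, a2, b2, ..."""
    for a, b in zip_longest(queue_a, queue_b):
        if a is not None:
            yield a
        if b is not None:
            yield b


def business_days_between(start: date, end: date) -> int:
    """Count weekdays after `start`, up to and including `end`."""
    days, current = 0, start
    while current < end:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            days += 1
    return days


def ready_to_merge(opened: date, today: date, all_editors_commented: bool) -> bool:
    """Merge once every editor has commented, or automatically after 5 business days."""
    return all_editors_commented or business_days_between(opened, today) >= 5


# Hypothetical queues, already prioritized by their originating editors.
community_group = [
    PullRequest("community-group", "message-format", "extensible messages"),
    PullRequest("community-group", "registration", "payment app registration"),
]
browser_vendors = [
    PullRequest("browser-vendors", "message-format", "fixed message fields"),
]

spec = Specification()
for pr in alternate(community_group, browser_vendors):
    spec.merge(pr)
```

In this sketch the "message-format" section ends up holding both options along with an issue marker for the Working Group to resolve, which is the behavior step 3 asks of the Chairs. A pull request that nobody comments on would still become mergeable once `ready_to_merge` sees five business days elapse.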

While I doubt the browser vendors will find the proposal better than the current state of things, I don’t think they would find it so unpalatable that they wouldn’t join any group using this procedure. This approach does inject some level of fairness to those participants that are not browser vendors and forces due diligence into the process (from all sides). If the Web is supposed to be for everyone, our specification creation process for browser APIs needs to be available to more than just the largest browser vendors.

Web Payments and the World Banking Conference

The standardization group for all of the banks in the world (SWIFT) was kind enough to invite me to speak at the world’s premier banking conference about the Web Payments work at the W3C. The conference, called SIBOS, happened last week and brought together 7,000+ people from banks and financial institutions around the world. The event was held in Dubai this year. They wanted me to present on the new Web Payments work being done at the World Wide Web Consortium (W3C) including the work we’re doing with PaySwarm, Mozilla, the Bitcoin community, and Ripple Labs.

If you’ve never been to Dubai, I highly recommend visiting. It is a city of extremes. It contains the highest density of stunning, award-winning skyscrapers while the largest expanses of desert loom just outside of the city. Man-made islands dot the coastline, willed into shapes like that of a multi-mile-wide palm tree or massive lumps of stone, sand, steel and glass resembling all of the countries of the world. I saw the largest in-mall aquarium in the world and ice skated in 105-degree weather. Poverty lines the outskirts of Dubai while ATMs that vend gold can be found throughout the city. Lamborghinis, Ferraris, Maybachs, and Porsches roared down the densely packed highways while plants struggled to survive in the oppressive heat and humidity.

The extravagances nestle closely to the other extremes of Dubai: a history of indentured servitude, women’s rights issues, zero-tolerance drug possession laws, and political self-censorship of the media. In a way, it was the perfect location for the world’s premier banking conference. The capital it took to achieve everything that Dubai had to offer flowed through the banks represented at the conference at some point in time.

The Structure of the Conference

The conference was broken into two distinct areas. The more traditional banking side was on the conference floor and resembled what you’d expect of a well-established trade show. It was large, roughly the size of four football fields. Innotribe, the less-traditional and much hipper innovation track, was outside of the conference hall and focused on cutting-edge thinking, design, and new technologies. The banks are late to the technology game, but that’s to be expected in any industry that has a history that can be measured in centuries. Innotribe is trying to fix the problem of innovation in banking.

“Customers”

One of the most surprising things that I learned during the conference was the different classes of customers a bank has and which class of customers is most profitable to the banks. Many people are under the false impression that the most valuable customer a bank can have is the one that walks into one of their branches and opens an account. In general, the technology industry tends to value the individual customer as the primary motivator for everything that it does. This impression, with respect to the banking industry, was shattered when I heard the head of an international bank utter the following with respect to banking branches: “80% of our customers add nothing but sand to our bottom line.” The banker was alluding to the perception that the most significant thing that customers bring into the banking branch is the sand on the bottom of their shoes. The implication is that most customers are not very profitable to banks and are thus not a top priority. This summarizes the general tone of the conference with respect to customers when it came to the managers of these financial institutions.

Fundamentally, a bank’s motives are not aligned with most of its customers’ needs because that’s not where it makes the majority of its money. Most of a bank’s revenue comes from activities like short-term lending, utilizing leverage against deposits, float-based leveraging, high-frequency trading, derivatives trading, and other financial exercises that are far removed from what most people in the world think of when they think of the type of activities one does at a bank.

For example, it has been possible to do realtime payments over the current banking network for a while now. The standards and technology exist to do so within the vast majority of the bank systems in use today. In fact, enabling this has been put to a vote for the last five years in a row. Every time it has been up for a vote, the banks have voted against it. The banks make money on the day-to-day float against the transfers, so the longer it takes to complete a transfer, the more money the banks make.

I did hear a number of bankers state publicly that they cared about the customer experience and wanted to improve upon it. However, those statements rang pretty hollow when it came to the product focus on the show floor, which revolved around B2B software, high-frequency trading protocols, high net-value transactions, etc. There were a few customer-focused companies, but they were dwarfed by the size of the major banks and financial institutions in attendance at the conference.

The Standards Team

I was invited to the conference by two business units within SWIFT. The first was the innovation group inside of SWIFT, called Innotribe. The second was the core standards group at SWIFT. There are over 6,900 banks that participate in the SWIFT network. Their standards team is very big, many times larger than the W3C, and extremely well funded. The primary job of the standards team at SWIFT is to create standards that help their member companies exchange financial information with the minimum amount of friction. Their flagship product is a standard called ISO 20022, which is a 3,463 page document that outlines every sort of financial message that the SWIFT network supports today.

The SWIFT standards team are a very helpful group of people that are trying their hardest to pull their membership into the future. They fundamentally understand the work that we’re doing in the Web Payments group and are interested in participating more deeply. They know that technology is going to eventually disrupt their membership and they want to make sure that there is a transition path for their membership, even if their membership would like to view these new technologies, like Bitcoin, PaySwarm, and Ripple as interesting corner cases.

In general, the banks don’t view technical excellence as a fundamental part of their success. Most view personal relationships as the fundamental thing that keeps their industry ticking. Most bankers come from an accounting background of some kind and don’t think of technology as something that can replace the sort of work that they do. This means that standards and new technologies almost always take a back seat to other more profitable endeavors such as implementing proprietary high-frequency trading and derivatives trading platforms (as opposed to customer-facing systems like PaySwarm).

SWIFT’s primary customers are the banks, not the banks’ customers. Compare this with the primary customer of most Web-based organizations and the W3C, which is the individual. Since SWIFT is primarily designed to serve the banks, and banks make most of their money doing things like derivatives and high-frequency trading, there really is no champion for the customer in the banking organizations. This is why using your bank is a fairly awful experience. Speaking from a purely capitalistic standpoint, individuals that have less than a million dollars in deposits are not a priority.

Hobbled by Complexity

I met with over 30 large banks while I was at SIBOS and had a number of low-level discussions with their technology teams. The banking industry seems to be crippled by the complexity of their current systems. Minor upgrades cost millions of dollars due to the requirement to keep backwards compatibility. For example, at one point during the conference, it was explained that there was a proposal to make the last digit in an IBAN number a particular value if the organization was not a bank. The amount of push-back on the proposal was so great that it was never implemented since it would cost thousands of banks several million dollars each to implement the feature. Many of the banks are still running systems as part of their core infrastructure that were created in the 1980s, written in COBOL or Fortran, and well past their initial intended lifecycles.

A bank’s legacy systems mean that they have a very hard time innovating on top of their current architecture, and it could be that launching a parallel financial systems architecture would be preferable to broadly interfacing with the banking systems in use today. Startups launching new core financial services are at a great advantage as long as they limit the number of places that they interface with these old technology infrastructures.

Commitment to Research and Development

The technology utilized in the banking industry is, from a technology industry point of view, archaic. For example, many of the high-frequency trading messages are short ASCII text strings that look like this:

8=FIX.4.1#9=112#35=0#49=BRKR#56=INVMGR#34=235#52=19980604-07:58:28#112=19980604-07:58:28#10=157#

Imagine anything like that being accepted as a core part of the Web. Messages are kept to very short sequences because they must be processed in less than 5 microseconds. There is no standard binary protocol, even for high-frequency trading. Many of the systems that are core to a bank’s infrastructure pre-date the Web, sometimes by more than a decade or two. At most major banking institutions, there is very little R&D investment into new models of value transfer like PaySwarm, Bitcoin, or Ripple. In a room of roughly 100 bank technology executives, when asked how many of them had an R&D or innovation team, only around 5% of the people in the room raised their hands.
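
To make the format concrete, here is a minimal Python sketch that splits the example message above into its tag/value pairs. It assumes that the `#` characters in the example stand in for the field delimiter (on the wire, FIX uses the non-printing SOH character, 0x01). The `parse_fix` function and the `TAG_NAMES` table are hypothetical helpers written for this illustration, not part of any FIX library.

```python
# Illustrative sketch only: parse_fix and TAG_NAMES are hypothetical helpers.
# Assumption: '#' in the example stands in for the SOH (0x01) delimiter.

# Human-readable names for the FIX tags that appear in the example message.
TAG_NAMES = {
    8: "BeginString",
    9: "BodyLength",
    10: "CheckSum",
    34: "MsgSeqNum",
    35: "MsgType",
    49: "SenderCompID",
    52: "SendingTime",
    56: "TargetCompID",
    112: "TestReqID",
}


def parse_fix(message: str, delimiter: str = "#") -> list[tuple[int, str]]:
    """Split a tag=value message into (tag, value) pairs, preserving order."""
    fields = []
    for part in message.split(delimiter):
        if not part:
            continue  # skip the empty string after the trailing delimiter
        tag, _, value = part.partition("=")
        fields.append((int(tag), value))
    return fields


message = (
    "8=FIX.4.1#9=112#35=0#49=BRKR#56=INVMGR#34=235#"
    "52=19980604-07:58:28#112=19980604-07:58:28#10=157#"
)

fields = parse_fix(message)
for tag, value in fields:
    print(f"{TAG_NAMES.get(tag, 'Unknown')} ({tag}) = {value}")
```

Every field has to be looked up by number to mean anything to a human reader, which is part of why the format feels so foreign next to the self-describing formats common on the Web.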

Compare this with the technology industry, which devotes a significant portion of its revenue to R&D activities and continually tries to disrupt its own industry through the creation of new technologies.

No Shared Infrastructure

The technology utilized in the banking industry is typically created and managed in-house. It is also highly fractured; the banks share the messaging data model, but that’s about it. The SWIFT data model is implemented over and over again by thousands of banks. There is no set of popular open source software that one can use to do banking, which means that almost every major bank writes their own software. There is a high degree of waste when it comes to technology re-use in the banking industry.

Compare this with how much of the technology industry shares in the development of core infrastructure like operating systems, Web servers, browsers, and open source software libraries. This sort of shared development model does not exist in the banking world and the negative effects of this lack of shared architecture are evident in almost every area of technology associated with the banking world.

Fear of Technology Companies

The banks are terrified of the thought of Google, Apple, or Amazon getting into the banking business. These technology companies have hundreds of millions of customers, deep brand trust, and have shown that they can build systems to handle complexity with relative ease. At one point it was said that if Apple, Google, or Amazon wanted to buy Visa, they could. Then in one fell swoop, one of these technology companies could remove one of the largest networks that banks rely on to move money in the retail space.

While all of the banks seemed to be terrified of being disrupted, there seemed to be very little interest in making any sort of drastic change to their infrastructure. In many cases, the banks are just not equipped to deal with the Web. They tend to want to build everything internally and rarely acquire technology companies to improve their technology departments.

Relatively few of the bank executives that I spoke with were able to carry on even a fairly high-level conversation about things like Web technology. It demonstrated that it is still going to be some time before the financial industry understands the sort of disruption that things like PaySwarm, Bitcoin, and Ripple could trigger. Many know that a large chunk of jobs is going to go away, but those same individuals do not have the skill set to react to the change, or are too busy with paying customers to focus on the coming disruption.

A Passing Interest in Disruptive Technologies

There was a tremendous amount of interest in Bitcoin, PaySwarm, and Ripple, and in how they could disrupt banking. However, much like the music industry, only a few of the banks seemed to want to learn how they could adopt or use the technology. Many of the conversations ended with a general malaise related to technological disruption with no real motivation to dig deeper lest they find something truly frightening. Most executives would express how nervous they were about competition from technology companies, but were not willing to make any deep technological changes that would undermine their current revenue streams. There were parallels between many bank executives I spoke with, the innovator’s dilemma, and how many of the music industry executives I had been involved with in the early 2000s reacted to the rise of Napster, peer-to-peer file trading networks, and digital music.

Many higher-level executives were dismissive about the sorts of lasting changes Web technologies could have on their core business, often to the point of being condescending when they spoke about technologies like Bitcoin, PaySwarm, and Ripple. Most arguments boiled down to the customer needing to trust some financial institution to carry out the transaction, demonstrating that they did not fundamentally understand the direction that technologies like Bitcoin and Ripple are headed.

Lessons Learned

We were able to get the message out about the sort of work that we’re doing at W3C when it comes to Web Payments and it was well received. I have already been asked to present at next year’s conference. There is a tremendous opportunity here for the technology sector to either help the banks move into the future, or to disrupt many of the services that have been seen as belonging to the more traditional financial institutions. There is also a big opportunity for the banks to seize the work that is being done in Web Payments, Bitcoin, and Ripple, and apply it to a number of the problems that they have today.

The trip was a big success in that the Web Payments group now has very deep ties into SWIFT, major banks, and other financial institutions. Many of the institutions expressed a strong desire to collaborate with the group on future Web Payments work. The financial institutions we spoke with thought that many of these technologies were 10 years away from affecting them, so there was no real sense of urgency to integrate the technology. I’d put the timeline closer to 3-4 years than 10 years. That said, there was general agreement that these technologies mattered. The lines of communication are now more open than they used to be between the traditional financial industry and the Web Payments group at W3C. That’s a big step in the right direction.

Interested in becoming a part of the Web Payments work, or just peeking in from time to time? It’s open to the public. Join here.

Aaron Swartz, PaySwarm, and Academic Journals

For those of you that haven’t heard yet, Aaron Swartz took his own life two days ago. Larry Lessig has a follow-up on one of the things that he thinks led to the suicide (the threat of 50 years in jail over the JSTOR case).

I didn’t know Aaron at all. A large number of people that I deeply respect did, and have written about his life with great admiration. I, like most of you that have read the news, have done so while brewing a cauldron of mixed emotions. Saddened that someone that had achieved so much good in their life is no longer in this world. Angry that Aaron chose this ending. Sickened that this is the second recent suicide, Iilya’s being the first, involving a young technologist trying to make the world a better place for all of us. Afraid that other technologists like Aaron and Iilya will choose this path over persisting in their noble causes. Helpless. Helpless because this moment will pass, just like Iilya’s did, with no great change in the way our society deals with mental illness. And with none of the change that Aaron was fighting for having been realized.

Nobody likes feeling helpless. I can’t mourn Aaron because I didn’t know him. I can mourn the idea of Aaron, of the things he stood for. While reading about what he stood for, several disconnected ideas kept rattling around in the back of my head:

  1. We’ve hit a point of ridiculousness in our society where people at HSBC knowingly laundering money for drug cartels get away with it, while people like Aaron are labeled felons and face upwards of 50 years in jail for “stealing” academic articles. This, even after the publisher of said academic articles drops the charges. MIT never dropped their charges.
  2. MIT should make it clear that he was not a felon or a criminal. MIT should posthumously pardon Aaron and commend him for his life’s work.
  3. The way we do peer-review and publish scientific research has to change.
  4. I want to stop reading about all of this, it’s heartbreaking. I want to do something about it – make something positive out of this mess.

Ideas, Floating

I was catching up on news this morning when the following floated past on Twitter:

clifflampe: It seems to me that the best way for we academics to honor Aaron Swartz’s memory is to frigging finally figure out open access publishing.

1Copenut: @clifflampe And finally implement a micropayment system like @manusporny’s #payswarm. I don’t want the paper-but I’ll pay for the stories.

1Copenut: @manusporny These new developments with #payswarm are a great advance. Is it workable with other backends like #Middleman or #Sinatra?

This was interesting because we have been talking about how PaySwarm could be applied to academic publishing for a while now. All the discussions to this point have been internal; we didn’t know if anybody would make the connection between the infrastructure that PaySwarm provides and how it could be applied to academic journals. This is up on our ideas board as a potential area where PaySwarm could be applied:

  • Payswarm for peer-reviewed, academic publishing
    • Use Payswarm identity mechanism to establish trusted reviewer and author identities for peer review
    • Use micropayment mechanism to fund research
    • Enable university-based group-accounts for purchasing articles, or refunding researcher purchases

Journals as Necessary Evils

For those in academia, journals are often viewed as a necessary evil. They cost a fortune to subscribe to, farm out most of their work to academics that do it for free, and keep an iron grip on the scientific publication process. Most academics that I speak with would do away with journal organizations in a heartbeat if there was a viable alternative. Most of the problem is political, which is why we haven’t felt compelled to pursue fixing it. Political problems often need a groundswell of support and a number of champions that are working inside the community. I think the groundswell is almost here. I don’t know who the academic champions are that will push this forward. Additionally, if nobody takes the initiative to build such a system, things won’t change.

Here’s what we (Digital Bazaar) have been thinking. To fix the problem, you need at least the following core features:

  • Web-scale identity mechanisms – so that you can identify reviewers and authors for the peer-review process regardless of which site is publishing or reviewing a paper.
  • Decentralized solution – so that universities and researchers drive the process – not the publishers of journals.
  • Some form of remuneration system – you want to reward researchers with heavily cited papers, but in a way that makes it very hard to game the system.

Scientific Remuneration

PaySwarm could be used to implement each of these core features. At its core, PaySwarm is a decentralized payment mechanism for the Web. It also has a decentralized identity mechanism that is solid, but in a way that does not violate your privacy. There is a demo that shows how it can be applied to WordPress blogs where just an abstract is published, and if the reader wants to see more of the article, they can pay a small fee to read it. It doesn’t take a big stretch of the imagination to replace “blog article” with “research paper”. The hope is that researchers would set access prices on articles such that any purchase to access the research paper would then go to directly funding their current research. This would empower universities and researchers with an additional revenue stream while reducing the grip that scientific publishers currently have on our higher-education institutions.

A Decentralized Peer-review Process

Remuneration is just one aspect of the problem. Arguably, it is the lesser of the problems in academic publishing. The biggest technical problem is how you do peer review on a global, distributed scale. Quite obviously, you need a solid identity system that can identify scientists over the long term. You need to understand a scientist’s body of work and how respected their research is in their field. You also need a review system that is capable of pairing scientists and papers in need of review. PaySwarm has a strong identity system in place using the Web as the identification mechanism. Here is the PaySwarm identity that I use for development: https://dev.payswarm.com/i/manu. Clearly, paper publishing systems wouldn’t expose that identity URL to people using the system, but I include it to show what a Web-scale identifier looks like.

Web-scale Identity

If you go to that identity URL, you will see two sets of information: my public financial accounts and my digital signature keys. A PaySwarm Authority can annotate this identity with even more information, like whether or not an e-mail address has been verified against the identity. Is there a verified cellphone on record for the identity? Is there a verified driver’s license on record for the identity? What about a Twitter handle? A Google+ handle? All of these pieces of information can be added and verified by the PaySwarm Authority in order to build an identity that others can trust on the Web.

What sorts of information need to be added to a PaySwarm identity to trust its use for academic publishing? Perhaps a list of articles published by the identity? Review comments for all other papers that have been reviewed by the identity? Areas of research that others have certified that the identity is an expert on? This is pretty basic Web-of-trust stuff, but it’s important to understand that PaySwarm has this sort of stuff baked into the core of the design.

The Process

Leveraging identity to make decentralized peer-review work is the goal, and here is how it would work from a researcher perspective:

  1. A researcher would get a PaySwarm identity from any PaySwarm Authority; there is no cost associated with getting such an identity. This sub-system is already implemented in PaySwarm.
  2. A researcher would publish an abstract of their paper in a Linked Data format such as RDFa. This abstract would identify the authors of the paper and some other basic information about the paper. It would also have a digital signature on the information using the PaySwarm identity that was acquired in the previous step. The researcher would set the cost to access the full article using any PaySwarm-compatible system. All of this is already implemented in PaySwarm.
  3. A paper publishing system would be used to request a review among academic peers. Those peers would review the paper and publish digital signatures on review comments, possibly with a notice that the paper is ready to be published. This sub-system is fairly trivial to implement and would mirror the current review process with the important distinction that it would not be centralized at journal publications.
  4. Once a pre-set limit on the number of positive reviews has been met, the paper publishing system would place its stamp of approval on the paper. Note that different paper publishing systems may have different metrics just as journals have different metrics today. One benefit to doing it this way is that you don’t need a paper publishing system to put its stamp of approval on a paper at all. If you really wanted to, you could write the software to calculate whether or not the paper has gotten the appropriate amount of review because all of the information is on the Web by default. This part of the system would be fairly trivial to write once the metrics were known. It may take a year or two to get the correct set of metrics in place, but it’s not rocket science and it doesn’t need to be perfect before systems such as this are used to publish papers.
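
The check described in the last step can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not part of PaySwarm: the `Review` record, its field names, the example identity URLs, and the `has_enough_review` function are all hypothetical, and the `signature_valid` flag stands in for a real digital signature check that would be performed elsewhere.

```python
# Hypothetical sketch: none of these names come from PaySwarm itself.
from dataclasses import dataclass


@dataclass(frozen=True)
class Review:
    reviewer_id: str        # identity URL of the reviewer (assumed format)
    paper_id: str           # URL of the paper abstract being reviewed
    ready_to_publish: bool  # reviewer's claim that the paper is ready
    signature_valid: bool   # assumed outcome of a signature check done elsewhere


def has_enough_review(reviews: list[Review], paper_id: str, min_positive: int) -> bool:
    """Return True once enough distinct reviewers have approved the paper."""
    approving_reviewers = {
        r.reviewer_id
        for r in reviews
        if r.paper_id == paper_id and r.signature_valid and r.ready_to_publish
    }
    # A set is used so that one reviewer approving twice only counts once.
    return len(approving_reviewers) >= min_positive


paper = "https://example.org/papers/1"
reviews = [
    Review("https://example.org/i/alice", paper, True, True),
    Review("https://example.org/i/bob", paper, True, True),
    Review("https://example.org/i/bob", paper, True, True),     # duplicate reviewer
    Review("https://example.org/i/carol", paper, False, True),  # not ready yet
    Review("https://example.org/i/dave", paper, True, False),   # bad signature
]

print(has_enough_review(reviews, paper, min_positive=2))
print(has_enough_review(reviews, paper, min_positive=3))
```

Different paper publishing systems could run the same function with a different `min_positive` value, or swap in a different metric entirely, since the reviews themselves would be public.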

From a reviewer perspective, it would work like so:

  1. You are asked to review papers by your peers once you have an acceptable body of published work. All of your work can be verified because it is tied to your PaySwarm identity. All review comments can be verified as they are tied to other PaySwarm identities. This part is fairly trivial to implement; most of the work is already done for PaySwarm.
  2. Once you review a paper, you digitally sign your comments on the paper. If it is a good paper, you also include a claim that it is ready for broad publication. Again, technically simple to implement.
  3. Your reputation builds as you review more papers. The way that reputation is calculated is outside of the scope of this blog post mainly because it would need a great deal of input from academics around the world. Reputation is something that can be calculated, but many will argue about the algorithm and I would expect this to oscillate throughout the years as the system grows. In the end, there will probably be multiple reputation algorithms, not just one. All that matters is that people trust the reputation algorithms.

Freedom to Research and Publish

The end-goal is to build a system that empowers researchers and research institutions, is far more transparent than the current peer-reviewed publishing system, and remunerates the people doing the work more directly. You will also note that at no point does a traditional journal enter the picture to give you a stamp of approval and charge you a fee for publishing your paper. Researchers are in control of the costs at all stages. As I’ve said above, the hard part isn’t the technical nature of the project, it’s the political nature of it. I don’t know if this is enough of a pain-point among academics to actually start doing something about it today. I know some are, but I don’t know if many would use such a system over the draw of publications like Nature, PLOS, Molecular Genetics and Genomics, and Planta. Quite obviously, what I’ve proposed above isn’t a complete road map. There are issues and details that would need to be hammered out. However, I don’t understand why a system like this doesn’t already exist, so I implore the academic community to explain why what I’ve laid out above hasn’t been done yet.

It’s obvious that a system like this would be good for the world. Building such a system may have reduced the possibility of us losing someone like Aaron in the way that we did. He was certainly fighting for something like it. Talking about it makes me feel a bit less helpless than I did yesterday. Maybe making something good out of this mess will help some of you out there as well. If others offer to help, we can start building it.

So how about it, researchers of the world: would you publish all of your research through such a system?

A New Way Forward for HTML5

By halting the XHTML2 work and announcing more resources for the HTML5 project, the World Wide Web Consortium has sent a clear signal on the future markup language for the Web: it will be HTML5. Unfortunately, the decision comes at a time when many working with Web standards have taken issue with the way the HTML5 specification is being developed.

The shut down of the XHTML2 Working Group has brought to a head a long-standing set of concerns related to how the new specification is being developed. This page outlines the current state of development and suggests that there is a more harmonious way to move forward. By adopting some or all of the proposals outlined below, the standards community will ensure that the greatest features for the Web are integrated into HTML5.

What’s wrong with HTML5?

There are likely as many reasons for why HTML5 is problematic as there are for why HTML5 will succeed where XHTML2 didn’t. Some of these reasons are technical in nature, some are based on process, and others may lack sufficient evidential support.

Many, including the author of this document, have praised the WHAT WG for making steady progress on the next version of HTML. Using implementation data to back up additions, removals or re-writes to the core HTML specification has helped improve the standard. In general, browser implementors have been very supportive of the current direction, so there is much to be celebrated when it comes to HTML5.

The HTML5 editorial process, however, has also created several complaints among long-time members of the web standards community that find themselves marginalized as the specification proceeds. The problem has more to do with politics than it does science, but as we find in the real world — the politics are shaping the science.

The biggest complaint with the current process is that the power to change the originating specification lies with one individual. This gives that one individual, or group of individuals, an advantage that has created an acrimonious environment.

In a Consortium like the W3C, a process that shows favor to certain members by giving them privileges that other members can never attain is fundamentally unfair.

In this particular case, it tilts the table toward the current HTML5 specification editor and toward the browser manufacturers. Search companies, tool developers, web designers and developers, usability experts and many others have suddenly found themselves without a voice. Some say that this approach is a good thing — it focuses on those that must adhere to the standard and on those that are producing results. Unfortunately, the approach also creates conflict. The secondary communities feel a sense of unfairness because they feel their needs for the Web are not being met. It is not a simple problem with an easy solution.

HTML5 is now the way forward. In order to ensure that dissenting argumentation can have an impact on the specification, if the argumentation is valid, we must subtly change the editorial process. The changes should not affect the speed at which HTML5 is proceeding, so there is a certain finesse that must be applied to any action we take to make the HTML5 community better.

This set of proposals addresses our current situation and what other similar communities have done to improve their own development processes.

The Goal

The goal of the actions listed in this document is to allow all of the communities interested in HTML5 to collaborate on the specification in an efficient and agreeable manner. The tools that we elect to use have an effect on the perceived editorial process in place. Currently, the process does not allow for wide-scale collaboration and the sharing and discussion of proposals that support consortium-based specification authoring.

The Strategy

The strategy for moving HTML5 forward should focus on being inclusive without increasing disruption, red tape, or maintenance headaches for those who are contributing the most to the current HTML5 specification. Any strategy employed should ensure that we create a more open, egalitarian environment where everyone who would like to contribute to the HTML5 specification has the ability to do so without the barriers to entry that exist today.

About the Author

Manu Sporny is a Founder of Digital Bazaar and the Commons Design Initiative, an Invited Expert to the W3C, the editor for the hAudio Microformat specification, the RDF Audio and Video vocabularies, a member of the RDFa Task Force and the editor of the HTML5+RDFa specification.

Over the next six months, he will be raising funding from private enterprise and public institutions to address many of the issues outlined below. If your company depends on the Internet and can afford to fund just a small fraction of the work below (minimum $8K grant), then please contact him at msporny@digitalbazaar.com. If you know of an institution that is able to fund the work described on this page, please have them contact Manu.

The Issues

The majority of this document lists some of the current HTML5 issues and attempts to provide actions that the standards community could take to address them.

Problem: A Kitchen Sink Specification

The HTML5 specification currently weighs in at 4.1 megabytes of HTML. If one were to print it out, single-spaced and using 12-point font, the document would span 844 pages. It is not even complete yet and it is already roughly the same length as the new edition of “War and Peace” – a full 3 inches thick.

Reading an 844 page technical specification is daunting for even the most masochistic of standards junkies. The HTML 4.01 specification, problematic in its own right, is 389 pages. Not only are large specifications overwhelming, but it can be almost impossible to find all of the information you need in them. Clearly the specification needs to be long enough to be specific, but not any larger. The issue has more to do with focus and accessibility to web authors and developers than length.

The current HTML5 specification contains information that is of interest to web authors and designers, parser writers, browser manufacturers, HTML rendering engine developers, and CSS developers. Unfortunately, the spec attempts to address all of these audiences at once, quickly losing its focus. A document that is not focused on its intended audience will reduce its utility for all audiences. Therefore, some effort should be placed into making the current document more coherent for each intended audience.

Action: Splitting HTML5 into Logically Targeted Documents

“Know your audience” is a lesson that many creative writers learn long before their first comic, book, or novel is published. Instead of creating a single uber-specification, we should split the document into logically targeted sections that are more focused on their intended audience. For example, the following is one such break-out:

  • HTML5: The Language – Syntax and Semantics
    • This document lists all of the language elements and their intended use
    • Useful to authors and content creators
  • HTML5: Parsing and Processing Rules
    • This document lists all of the text->DOM parsing and conversion rules
    • Useful for validator and parser writers
  • HTML5: Rendering and Display Rules
    • This document lists all of the document rendering rules
    • Useful for browser manufacturers and developers of other applications that perform visual and auditory display
  • HTML5: An Implementers Guide
    • This document lists implementation guidelines, common algorithms, and other implementation details that don’t fit cleanly into the other 3 documents.
    • Useful for application writers who consume HTML5

Problem: Commit Then Review

The HTML5 specification, to date, has been edited in a way that has enjoyed large success in other open source projects. It uses a development philosophy called Commit-Then-Review (CTR). This process is used from time to time at the W3C for small changes, with larger, possibly contentious changes using a process called Consensus-Then-Commit (CTC).

In a CTR process, developers make changes to an open source project and commit their changes for review by other developers. If the new changes work better, they are kept. If they don’t work, they are removed. Source control systems such as CVS, Subversion, and Git are heavily relied upon to provide the ability to rewind history and recover lost changes. This process of making additions to a specification can cause an unintended psychological effect; ideas that are already in a specification are granted more weight than those that are not. In the worst case, one may use the fact that text exists to solve a certain problem to squelch arguments for better solutions.

In a CTC process, as used at the W3C, consensus should be reached before an editor changes a document. This approach assumes that it is much harder to remove language than it is to add it. The approach is often painfully slow and it can take weeks to reach consensus on particularly touchy items. Many have asserted that the HTML5 specification could not have been developed via CTC, an assertion that is also held by the author of this document.

There is nothing wrong with either approach as long as certain presumptions hold. One of the presumptions that CTR makes is that there are many people who may edit a given project. This ensures that good ideas are improved upon, bad ideas are quickly replaced by better ideas, and that no one has the ability to impose his or her views on an entire community without being challenged. However, in HTML5, there is only one person who has editing privileges for the source document. This shifts the power to that individual and requires everyone else in the project to react to changes implemented by that committer.

CTR also requires a community where there is mutual trust among the project leaders and contributors. The HTML5 community is, unfortunately, not in that position yet.

Action: More Committers + Distributed Source Control

Commit Then Review is a valuable philosophy, but it is dangerous when you only have one committer. Having only one committer creates a barrier to dissenting opinion. It does not allow anyone else to make a lasting impact on the only product of HTML WG and WHAT WG – the HTML5 specification.

It is imperative that more editors are empowered in order to level the playing field for the HTML5 specification. It is also important that edit-wars are prevented by adopting a system that allows for distributed editing and source management.

The Git source control system is one such distributed editing and management solution. It doesn’t require linear development and it is used to develop the Linux kernel – a project dealing with many more changes per day than the HTML5 specification. It allows for separate change sets to be used and, most importantly, there is no one in “control” of the repository at any given point. There is no central authority, no political barrier to entry, and no central gatekeeper preventing someone from working on any part of the specification.

Distributed source control systems are like peer-to-peer networks. They make the playing field flat and are very difficult to censor. The more people that have the power to clone and edit the source repository, the larger the network effects. We would go from the one editor we have now, to ten editors in a very short time, and perhaps a hundred contributors over the next decade.

Problem: No Way for Experts to Contribute in a Meaningful Way

There have been three major instances spanning 12-24 months in which expert opinion was not respected during the development of the current HTML5 specification. These instances concerned Scalable Vector Graphics (SVG), Resource Description Framework in Attributes (RDFa), and the Web Accessibility Initiative (WAI).

The situation has received enough attention so that there are now web comics and blogs devoted to the conflict between various web experts and the author of the HTML5 specification. These disagreements have become a spectacle with many experts now refusing to take part in the development of the HTML5 specification citing others’ unsuccessful attempts to convey decades of research to the current editor of the HTML5 specification.

Similarly, the editor of the current specification has many valid reasons not to integrate suggestions into the massive document that is HTML5. However, an imperfection in a particular suggestion does not eliminate the need for a discussion of alternative proposals. The way the problem is being approached, by both sides, is fundamentally flawed.

Action: Alternate, Swappable Specification Sections

The HTML5 document should be broken up into smaller sections. Let’s call them microsections. The current specification source is 3.4 megabytes of editable text and is very difficult to author. It is especially daunting to someone who only wants to edit a small section related to their area of expertise. It is even more challenging if one wishes to re-arrange how the sections fit together or to re-use a section in two documents without having to keep them in sync with one another.

Ideally, certain experts, W3C Task Forces, Working Groups, and technology providers could edit the HTML5 specification microsections without having to worry about larger formatting, editing, merging, or layout issues. These microsections could then be processed by a documentation build system into different end-products, for example HTML5+RDFa, HTML5+RDFa+ARIA, or HTML5+ARIA-RDFa-Microdata. Specifications containing different technologies could be produced very quickly without the overhead of having to author and maintain an entirely new set of documents. You wouldn’t need an editor per proposal to keep all of them in sync with one another, thus reducing the cost of making new proposals. Some microsections could even be used across 2-3 HTML5-related documents.

This approach, coupled with the move to a more decentralized source control mechanism, would provide a path for anyone who can clone a Git repository and edit an HTML file to contribute to the HTML5 specification. Merging changes into the “official” repository could be done via W3C staff contacts to ensure fair treatment to all proposals.
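As a rough illustration of the idea, here is a minimal Python sketch of a documentation build step that assembles end-products from shared microsections. Everything in it is an assumption made for illustration: the microsection names, their placeholder contents, and the `build` function are hypothetical and do not describe any existing W3C or WHAT WG tooling.

```python
# Hypothetical microsections. In practice each would be a small HTML file
# in the repository; short strings stand in for them here.
MICROSECTIONS = {
    "core": "<h2>Core HTML5</h2>",
    "rdfa": "<h2>RDFa</h2>",
    "aria": "<h2>ARIA</h2>",
}

# Hypothetical end-products: each is an ordered list of microsection names.
# The same microsection can appear in several products without being copied.
PRODUCTS = {
    "HTML5+RDFa": ["core", "rdfa"],
    "HTML5+RDFa+ARIA": ["core", "rdfa", "aria"],
}


def build(product: str) -> str:
    """Concatenate the microsections listed for a product, in order."""
    return "\n".join(MICROSECTIONS[name] for name in PRODUCTS[product])


if __name__ == "__main__":
    for name in PRODUCTS:
        print(f"--- {name} ---")
        print(build(name))
```

Because both products reference the same "core" and "rdfa" entries, an edit to either microsection shows up in every product that uses it, which is the property that removes the need to keep parallel documents in sync by hand.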

Problem: Mixing Experimental Features with Stable Ones

There is language in the HTML5 specification that indicates that different parts of the specification are at different levels of maturity. However, it is difficult to tell which parts of the specification are at which level of maturity without deep knowledge of the history of the document. This needs to change if we are going to start giving the impression of a stable HTML5 specification.

When someone who is not knowledgeable about the arcane history of the HTML5 Editors Draft sees the <datagrid> element in the same document as the <canvas> element, it is difficult for them to discern the level of maturity of each. In other words, canvas is implemented in many browsers while datagrid is not, yet they are outlined in the same document. The HTML5 specification does have a pop-up noting which features are implemented, have test cases, and are marked for removal. However, it is still difficult to read the document and know exactly which paragraphs, sentences and features are experimental and which ones are not.

This is not to say that either <datagrid> or <canvas> shouldn’t be in the HTML5 specification. Rather, there should be different HTML5 specification maturity levels and only features that have reached maturity should be placed into a document entering Last Call at the W3C. We should clearly mark what is and isn’t experimental in the HTML5 specification. We should not standardize on any features that don’t have working implementations.

Action: Shifting to an Unstable/Testing/Stable Release Model

There is much that the standards community can learn from the release processes of larger communities like Debian, Ubuntu, RedHat, the Linux kernel, FreeBSD and others. Each of those communities clearly differentiates software that is very experimental, software that is entering a pre-release testing phase, and software that is tested and intended for public consumption.

If the HTML5 specification generation process adopts microsections and a distributed source control mechanism, it should be easy to add, remove, and migrate features from an experimental document (Editors Draft), to a testing specification (Working Draft), to a stable specification (Recommendation).

While it may seem as if this is how W3C already operates, note that there is usually only a single stage that is being worked on at a time. This doesn’t fit well with the way HTML5 is being developed in that there are many people working on stable features, testing features, and experimental features simultaneously. A new release process is needed to ensure a smooth, non-disruptive transition from one phase to the next.
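The unstable/testing/stable model described in this section could be sketched roughly as follows. This is only an illustration under stated assumptions: the `Maturity` levels mirror the three stages named above, but the maturity assigned to each feature and the `features_for` and `promote` helpers are hypothetical.

```python
from enum import IntEnum


class Maturity(IntEnum):
    UNSTABLE = 0  # experimental document (Editors Draft)
    TESTING = 1   # testing specification (Working Draft)
    STABLE = 2    # stable specification (Recommendation)


# Hypothetical maturity assignments, chosen only for illustration.
FEATURES = {
    "canvas": Maturity.TESTING,
    "datagrid": Maturity.UNSTABLE,
}


def features_for(document_level: Maturity) -> list:
    """Return the features mature enough for a document at this level."""
    return sorted(
        name for name, level in FEATURES.items() if level >= document_level
    )


def promote(name: str) -> None:
    """Move a feature one stage forward, stopping at STABLE."""
    FEATURES[name] = Maturity(min(FEATURES[name] + 1, Maturity.STABLE))


if __name__ == "__main__":
    print(features_for(Maturity.UNSTABLE))
    print(features_for(Maturity.TESTING))
```

With this arrangement all three documents can be generated at any time from the same feature set, so people working on experimental, testing, and stable features do not have to wait for one another.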

Problem: Two Communities, One Specification

When the WHAT WG started what was to become HTML5, the group was asked to do the work outside of the World Wide Web Consortium process. As a benefit of working outside the W3C process, the HTML5 specification was authored and gained support and features very quickly. Letting anybody join the group, having a single editor, dealing directly with the browser manufacturers and focusing on backwards compatibility all resulted in the HTML5 specification as we know it today.

When the W3C and WHAT WG decided to collaborate on HTML5 as the future language of the web, it was decided that work would continue both in the HTML WG and the WHAT WG. People from the WHAT WG joined the HTML WG and vice-versa to show good faith and move towards openly collaborating with one another. At first, it seemed as if things were fine between the two communities. That is, until the emergence of an “us vs. them” undercurrent – both at the W3C and in WHAT WG.

Keeping both communities active was and will continue to be a mistake. Instead of combining mailing lists, source control systems, bug trackers and the variety of other resources controlled by each group, we now operate with duplicates of many of the systems. It is not only confusing, but inefficient to have duplicate resources for a community that is supposed to be working on the same specification. It sends the wrong signal to the public. Why would two communities that are working on the same thing continue to separate themselves from one another, unless there was a more fundamental issue that existed?

Action: Merging the Communities

The communities should be merged slowly. Data should be migrated from each duplicate system and a single system should be selected. The mailing lists should be one of the first things to be merged. If either community feels that the other community isn’t the proper place for a list, then a completely new community should be created that merges everyone into a single, cohesive group.

The two communities should bid on an html5.xyz domain (html5.org, html5.info, html5.us) and consolidate resources. This would not only be more efficient, but also eventually remove the “us vs. them” undercurrent.

Problem: Specification Ambiguity

A common problem for specification writers is that, over the years, their familiarity with the specification makes them unable to see ambiguities and errors. This is one of the reasons why all W3C specifications must go through a very rigorous internal review process as well as a public review process. Public review and feedback are necessary in order to clarify specification details, gather implementation feedback, and ultimately produce a better specification.

In order to contribute bug reports, features, or comments on the HTML5 specification, one must send an e-mail to either the HTML WG or the WHAT WG. The combined HTML WG and WHAT WG mailing list traffic can range from 600 to 1,200 very technical e-mails a month. Asking those who would like to comment on the HTML5 specification to devote a large amount of their time to write and respond to mailing list traffic is a formidable request. So formidable that many choose not to participate in the development of HTML5.

Action: In-line Specification Commenting and Bug Tracking

There are many websites that allow people to interactively comment on articles or reply to comments on web pages. We need to ensure that there are as few barriers as possible for commenting on the HTML5 specification. Sending an e-mail to the HTML WG or WHAT WG mailing lists should not be a requirement for providing specification feedback. It should be fairly easy to create a system that allows specification readers to comment directly on text ambiguities, propose alternate solutions, or suggest text deletions when viewing the HTML5 specification.

The down-side to this approach is that there may be a large amount of noise and a small amount of signal, but that would be better than no signal at all. We must understand that many web developers, authors, and those who have an interest in the future of the Web cannot put as much time into the HTML5 specification as those who are paid to work on it.

Problem: No Way to Propose Lasting Alternate Proposals

If one were to go to the trouble of adding several new sections into HTML5 or modifying parts of the document, they would then need to keep their changes in sync with the latest version of the document at svn.whatwg.org. This is because there is currently only one person who is allowed to make changes to the “official” HTML5 specification. Keeping documents in sync is time-consuming and should not be required of participants in order to effect change.

Since the HTML5 specification document is changed on an almost daily basis, the current approach forces editors to play a perpetual game of catch-up. This results in them spending more time merging changes than contributing to the HTML5 document. It may also cause the feeling that their changes are less important than those further upstream.

Action: At-will Specification Generation from Interchangeable Modules

As previously mentioned, moving to a microsectioned approach can help in this case as well. There can be a number of alternative microsections that editors may author so that each section can be a drop-in replacement. The document build system could be instructed, via a configuration file, on which microsections to use for a particular output product. Therefore if someone wanted to construct a version of HTML5 with a certain feature X, all that would be required is the authoring of the microsection and an instruction for the documentation build system to generate an alternative specification with the microsection included.
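A minimal sketch of such a configuration-driven build might look like the following. It assumes a JSON configuration format, and the slot names, the "replace" key, and the `generate` function are all hypothetical; the point is only to show an alternative microsection acting as a drop-in replacement for a default one.

```python
import json

# Hypothetical default microsection for each slot of the specification.
DEFAULT_SECTIONS = {
    "introduction": "Introduction (default)",
    "feature-x": "Feature X (default)",
}

# Hypothetical alternative microsections that editors have authored.
ALTERNATES = {
    "feature-x/alternate": "Feature X (alternate proposal)",
}


def generate(config_text: str) -> str:
    """Build a specification, swapping in any alternates the config names."""
    config = json.loads(config_text)
    overrides = config.get("replace", {})  # maps slot name -> alternate name
    parts = []
    for slot, default_body in DEFAULT_SECTIONS.items():
        alternate = overrides.get(slot)
        parts.append(ALTERNATES[alternate] if alternate else default_body)
    return "\n".join(parts)


if __name__ == "__main__":
    # The default product, followed by one that swaps in the alternate.
    print(generate("{}"))
    print(generate('{"replace": {"feature-x": "feature-x/alternate"}}'))
```

Under this arrangement an alternate proposal consists of one microsection plus a short configuration entry, so its author never has to re-merge the rest of the document when the upstream text changes.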

Problem: Partial Distributed Extensibility

One of the things that will vanish when the XHTML2 Working Group is shut down at the end of this year is the concept that there would be a unified, distributed platform extensibility mechanism for the web. In short, distributed platform extensibility allows for the HTML language to be extended to contain any XML document. Examples of XHTML-based extensibility include embedding SVG, MathML, and a variety of other specialized XML-based markup languages into XHTML documents.

HTML5 is specifically designed not to be extended in a decentralized manner for the non-XML version of the language. It special-cases SVG and MathML into the HTML platform. It also disallows platform extensibility and language extensibility in HTML5 (not XHTML5) using the same restricted rubric when they are clearly different types of extensibility mechanisms.

Many proponents of distributed extensibility are very concerned by this rubric and resulting design decision. At the heart of distributed extensibility is the assertion that anyone should be able to extend HTML in the future to address their markup needs. It is a forward-looking statement that asserts that the current generation cannot know how the world might want to extend HTML. The power should be placed into the hands of web developers so that they may have more tools available to them to solve their particular set of problems.

Action: A Set of Proposals for Distributed Extensibility

Whether or not distributed extensibility will ever be used on a large scale is not the issue. The issue is that there are currently no proposals for distributed extensibility in HTML5 (again, not XHTML5). Without a proposal, there is no counter-point for the “no distributed extensibility” assertion that HTML5 makes. Thus, if the W3C were to form consensus at this point, there would be only one option.

Consensus around a single option is not consensus. At the very least, HTML5 needs draft language for distributed extensibility. It doesn’t need to be the same solution as XHTML5 provides; it doesn’t even need to be close, but alternatives should exist. XHTML spent many years solving this problem and because of that, SVG and MathML found a home in XHTML documents. Enabling specialist communities, such as the visual arts and mathematics, to extend HTML to do things that were not conceivable during the early days of the Web is a fundamental pillar of the way the Web operates today.

Similarly, a set of tools to provide data extensibility does not exist in a form that is acceptable to the standards community. These tools are also going to fundamentally shape the way we embed and transfer information in web pages. If we are realistic about the expanse of problems that the Web is being called upon to solve, we should ensure that data extensibility capabilities are provided as we move forward.

We cannot be everything to everyone, but we should provide some combination of features like JavaScript, embedding XML documents, and RDFa, in both HTML5 and XHTML5, to help web developers solve their own problems without needing to effect change in the HTML5 specification.

Problem: Disregarding Input from the Accessibility Community

Accessibility is rarely seen as important until one finds that one’s own or a loved one’s vision, hearing, or motor skills do not function at a level that makes it easy to navigate the web. Accessible websites are important not only to those with disabilities, but also to those who cannot interact with a website in a typical fashion. For example, web accessibility is also important when one is on a small form factor device, using a text-only interface, or a sound-based interface.

Members of the Web Accessibility Initiative (WAI) and the creators of the Accessible Rich Internet Applications (ARIA) technical specification have noted on a number of occasions that they feel as if they are being ignored by the HTML5 community.

Action: Integrate the Accessibility Community’s Input

Empowering WAI to edit the HTML5 specification in a way that does not conflict with others, but produces an accessibility-enhanced HTML5 specification, is important to the future of the Web. Microsections and distributed source control would allow this type of collaboration without affecting the speed at which HTML5 is being developed. It may be that WAI needs a specification writer who is capable of producing unambiguous language that will enable browser manufacturers to easily create interoperable implementations.

The Plan of Action

In order to provide a greater impact on the near-term health of HTML5, the proposals listed above should be performed in the following order (which is not the order in which they are presented above):

  1. Implementation of Git for distributed source control.
  2. Microsection splitter and documentation build system.
  3. Recruit more committers into the HTML5 community.
  4. Split features based on their experimental nature into unstable and testing during Last Call.
  5. Implement in-line feedback mechanism for HTML5 spec.
  6. Distributed extensibility proposals for HTML5.
  7. Better, more precise accessibility language for HTML5.
  8. Merge the HTML WG and WHAT WG communities.

Acknowledgements

The author would like to thank the following people for reviewing this document and providing feedback and guidance (in alphabetical order): Ben Adida, John Allsopp, Tab Atkins Jr., L. David Baron, Dan Connolly, John Drinkwater, Micah Dubinko, Michael Hausenblas, Ian Hickson, Mike Johnson, David I. Lehn, Dave Longley, Samantha Longley, Shelley Powers, Sam Ruby, Doug Schepers, and Kyle Weems.