Credentials: A Retrospective of Mild Successes and Dramatic Failures

Over the past 20 years, various organizations have tried to “solve” problems related to identity and credentialing on the Web and have been met with varying degrees of mild success, but mostly failure. Understanding what worked and what did not is necessary if we’re going to make progress on identity and credentialing on the Web.

The word identity means different things to different people and is often discussed as a problem waiting to be solved on the Web. In the physical world, we have many identities. We have an identity for work life and home life. We have an identity that we use when we talk with our friends and one that we use when we talk with our families. The concept of identity is as nuanced as it is broad.

There are aspects of our identities that have very little consequence to others, such as whether we have dark brown hair or black hair. There are also aspects of our identities that are vital for proving that we should be able to perform certain tasks, like a driver's license or nursing license. Then there are aspects of our identities that are important for social reasons, such as the rapport that we build with our friends over multiple decades.

Many aspects of our identity are expressed via credentials, which can be seen as verifiable statements made by one person or organization about another. There have been multiple attempts at formalizing credentials on the Web; each has been met with varying degrees of mild success, but mostly failure. This blog post explores the development goals and system capabilities that would lead to a healthy credentialing ecosystem and why previous attempts to achieve these goals have been met with limited success.

Credentialing Ecosystem Goals

A healthy credentialing ecosystem should have a number of qualities:

  • Credentials should be interoperable and portable. Credentials should be usable by as broad a range of organizations as possible. The recipient of a credential should be able to store, manage, and share credentials throughout their lifetime with relative ease.
  • The ecosystem should scale to the 3 billion people using the Web today and then to the 6 billion people who will be using the Web by the year 2020.
  • The process of exchanging a credential should be privacy enhancing and recipient controlled such that the system protects the privacy of the individual or organization using the credential by placing the recipient in control of who is allowed to access their credential.
  • Implementing systems that issue and consume credentials should be easy for Web developers in order to lower barriers to entry and increase the number of software solutions in the ecosystem.
  • Accessibility should be a fundamental design criterion, as 10% of the world’s population have disabilities and the solution should be usable by as many people as possible.
  • The solution should follow a number of core Web principles such as being patent and royalty-free, adhering to Web architecture fundamentals, supporting network and device independence, and being machine-readable where possible to enable automation and engagement of non-human actors.

Credentialing Capabilities

A solution for a healthy credential ecosystem should have the following capabilities:

  • Extensible Data Model – A data model that supports an entity making an unbounded set of claims about another entity. This enables a very broad applicability of credentials to different use cases and market verticals.
  • Choice of Syntax – A data model that is capable of being expressed in a variety of data syntaxes. This increases interoperability between disparate credentialing systems and increases the long-term viability of the technology.
  • Decentralized Vocabularies – A formal mechanism of expressing new types of claims without centralized coordination. This promotes a high degree of parallel adoption and innovation.
  • Web-based PKI – A digital signature mechanism that does not require out-of-band information to verify the authenticity of claims; instead it should enable public keys to be automatically fetched via the Web during verification. It should not render the signed data opaque because opaque data is harder to learn from, program to, and debug. This makes the digital signature mechanism easier to use for developers and system integrators.
  • Choice of Storage – A protocol for storing a credential at an arbitrary identity provider after it has been issued by an arbitrary issuer. This helps create a level playing field for all actors in the ecosystem.
  • App Integration – A protocol for managing credentials by a recipient using arbitrary 3rd party applications. This promotes a healthy application ecosystem for managing credentials.
  • Privacy-enhanced Sharing – A protocol that enables the recipient to share their credentials without revealing the intended destination to their identity provider. This enhances privacy.
  • Credential Portability – A protocol for migrating from one identity provider to another without the need to reissue each credential. This promotes a healthy identity provider ecosystem.
  • Credential Revocation – A protocol for revocation of a previously issued credential by the credential issuer. This enables issuers to ensure that the credentials they have issued accurately represent their claims.
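A few of the capabilities above (extensible data model, choice of syntax, decentralized vocabularies, and revocation) can be made concrete with a minimal sketch. Everything here is an assumption made for illustration: the `Credential` and `Issuer` classes, the `example` URLs, and both serializations are hypothetical and do not correspond to any existing standard or library. Signatures, storage, and sharing protocols are deliberately left out.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Credential:
    """One issuer's claims about one recipient (hypothetical model)."""
    id: str          # URL naming this credential
    issuer: str      # URL naming the entity making the claims
    recipient: str   # URL naming the entity the claims are about
    claims: dict = field(default_factory=dict)  # claim-type URL -> value


def to_json(cred: Credential) -> str:
    """One possible syntax: a JSON object."""
    return json.dumps(
        {"id": cred.id, "issuer": cred.issuer,
         "recipient": cred.recipient, "claims": cred.claims},
        sort_keys=True,
    )


def to_statements(cred: Credential) -> list[str]:
    """Another possible syntax: one 'subject property value' line per claim."""
    return [f"<{cred.recipient}> <{prop}> {json.dumps(value)} ."
            for prop, value in sorted(cred.claims.items())]


class Issuer:
    """Issues credentials and tracks which of its own it has revoked."""

    def __init__(self, url: str):
        self.url = url
        self._revoked = set()

    def issue(self, cred_id: str, recipient: str, claims: dict) -> Credential:
        return Credential(cred_id, self.url, recipient, dict(claims))

    def revoke(self, cred: Credential) -> None:
        if cred.issuer != self.url:
            raise ValueError("only the issuer may revoke a credential")
        self._revoked.add(cred.id)

    def is_valid(self, cred: Credential) -> bool:
        return cred.issuer == self.url and cred.id not in self._revoked


# Claim types are URLs, so the issuer can mint new ones in its own
# namespace without asking a central registry for permission.
board = Issuer("https://nursing-board.example/")
license_ = board.issue(
    "https://nursing-board.example/credentials/1",
    "https://alice.example/#me",
    {"https://nursing-board.example/vocab#licensedAs": "Registered Nurse"},
)

print(to_json(license_))
print("\n".join(to_statements(license_)))
print(board.is_valid(license_))   # True
board.revoke(license_)
print(board.is_valid(license_))   # False
```

The same `Credential` object is written out in two different syntaxes without changing the underlying claims, which is the property the Choice of Syntax capability asks for.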

A Foreword on the Analysis

There are a number of existing identity, credentialing, and general authentication solutions that have been deployed in the past and have seen success according to the design goals of the particular solution. The design goals and capabilities listed in the previous sections do not always align with the design goals and capabilities of the systems analyzed below. It is fair to say that some of the analysis below is “unfair” as some of the systems are being judged against criteria that they were not meant to address. These solutions have been included because 1) they come up in conversation as potential solutions, and/or 2) they have been successful in meeting some of the criteria listed above, and/or 3) they have failed in important ways that we need to learn from to ensure that a new initiative doesn’t make the same mistakes.

With that said, let’s get on with the analysis.

SAML

The grandparent of these identity initiatives is SAML, an XML-based, open-standard data format for exchanging authentication and authorization data between parties, in particular, between an identity provider and a service provider.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | Problematic | XML-based data model. Extensible, but extensibility is rarely used leading to very limited claim types. |
| Choice of Syntax | Poor | Only XML is supported. |
| Decentralized Vocabularies | Problematic | The use of simple key-value pairs created the possibility of name clashes, limiting decentralized innovation and adoption. |
| Web-based PKI | Problematic | Service Providers had to explicitly trust Identity Providers leading to scalability issues. |
| Choice of Storage | Poor | Recipients can only have credentials issued and stored via a single identity provider. |
| App Integration | Poor | A recipient could only manage their credentials via their identity provider interface. |
| Privacy-enhanced Sharing | Poor | Identity providers cannot be prevented from knowing where recipients use their credentials. |
| Credential Portability | Poor | Credentials cannot be transferred from one identity provider to another. |
| Credential Revocation | Problematic | Credentials can only be issued and revoked by identity providers. |

SAML has failed to gain traction outside the education and public service organization sectors and is rarely used as an option to log into non-enterprise websites. SAML isn’t a viable solution for the goals and capabilities listed above because 1) it is hobbled by older technologies such as XML and SOAP, 2) it has scalability issues due to the need to manually set up federations of identity and service providers, 3) it restricts the organizations that can issue valid credentials to identity providers, 4) it enables tracking and violates a number of privacy requirements listed above, and 5) it encourages centralization and super providers as more people use the system because of the administrative overhead of managing the identity and service provider federations as they grow.
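The name-clash problem with simple key-value pairs noted in the summary above is easy to reproduce. The sketch below is illustrative only: the `merge` helper, the claim names, and the `example` URLs are assumptions, and the dictionaries are not SAML's actual attribute syntax. It shows two independent issuers choosing the same short name for unrelated claims, and how URL-qualified names avoid the collision without any coordination.

```python
def merge(*claim_sets):
    """Merge claim dictionaries from several sources.

    Returns (merged, clashes), where clashes lists every key that two
    sources used with different values. Later sources overwrite earlier ones.
    """
    merged, clashes = {}, []
    for claims in claim_sets:
        for key, value in claims.items():
            if key in merged and merged[key] != value:
                clashes.append(key)
            merged[key] = value
    return merged, clashes


# Two issuers independently pick the short name "level".
university = {"level": "graduate"}
game_site = {"level": 42}
merged, clashes = merge(university, game_site)
print(merged)    # {'level': 42}  (the university's claim was lost)
print(clashes)   # ['level']

# The same claims, named with URLs in each issuer's own namespace.
university_q = {"https://university.example/vocab#level": "graduate"}
game_site_q = {"https://games.example/vocab#level": 42}
merged_q, clashes_q = merge(university_q, game_site_q)
print(len(merged_q))   # 2
print(clashes_q)       # []
```

Avoiding the clash with short names requires the two issuers to agree on names in advance, which is the centralized coordination that limits decentralized innovation.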

Windows CardSpace / Infocard

Microsoft released CardSpace (code named InfoCard) in late 2006. CardSpace stored references to your digital identity, presenting them for selection as visual Information Cards. CardSpace provided a consistent interface designed to help you easily and securely use these identities in applications and websites.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | Problematic | Extensibility was possible (XML and URLs), but hard-coded strings were used for parameter names. |
| Choice of Syntax | Poor | Only XML was supported. |
| Decentralized Vocabularies | Problematic | URLs were used for parameter values, but the solution was ultimately Windows-only. |
| Web-based PKI | Good | XML enveloping signatures with attached X.509 certificates were used. |
| Choice of Storage | Good | Credentials were stored in an application controlled by the recipient. |
| App Integration | Problematic | Credentials were managed by a Windows application. |
| Privacy-enhanced Sharing | Good | Credential consumers requested credentials directly from the recipient. |
| Credential Portability | Problematic | Credentials could only live on Windows devices with no automatic cross-device synchronization capability (although manual export/import was supported). |
| Credential Revocation | Good | Security tokens are not granted for revoked credentials. |

Windows CardSpace failed to gain traction and the project was canceled in 2011. The initiative has been replaced with U-Prove, a Microsoft Research project. CardSpace did get a number of things right, but ultimately failed because 1) it was largely a Microsoft-centric solution that required Windows and Active Directory, 2) early adopters who initially backed the project did not feel there was strong near-term demand for a solution and thus didn’t roll out products to support the standard, 3) it didn’t scale to mobile and was largely tied to the desktop, 4) it was a separate product requiring manual installation and not a feature of Internet Explorer, 5) while it supported verified claims through a trusted third party, very few applications surfaced that enabled that optional feature, and 6) it was partly based on the WS-* technology stack, which has been derided as “bloated, opaque, and insanely complex”.

Shibboleth

Shibboleth is a middleware initiative for an architecture and open-source implementation for identity management and federated identity-based authentication and authorization. It is based on SAML and thus has many of the same advantages and drawbacks.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | Problematic | XML-based data model. Extensible, but extensibility is rarely used leading to very limited claim types. |
| Choice of Syntax | Poor | Only XML is supported. |
| Decentralized Vocabularies | Problematic | The use of simple key-value pairs created the possibility of name clashes, limiting decentralized innovation and adoption. |
| Web-based PKI | Problematic | Service Providers had to explicitly trust Identity Providers leading to scalability issues. |
| Choice of Storage | Poor | Recipients can only have credentials issued and stored via a single identity provider. |
| App Integration | Poor | A recipient can only manage their credentials via their identity provider interface. |
| Privacy-enhanced Sharing | Poor | Identity providers cannot be prevented from knowing where recipients use their credentials. |
| Credential Portability | Poor | Credentials cannot be transferred from one identity provider to another. |
| Credential Revocation | Problematic | Credentials can only be issued and revoked by identity providers. |

Shibboleth has failed to gain traction outside the research, education, and public service organization sectors and is rarely used as an option to log into non-enterprise websites. Shibboleth is not a viable solution for the goals and capabilities listed above because 1) it is hobbled by older technologies such as XML and SOAP, 2) it has scalability issues due to the need to manually set up federations of identity and service providers, 3) it restricts the organizations that can issue valid credentials to identity providers, 4) it enables tracking and violates a number of privacy requirements listed above, and 5) it encourages centralization and super providers as more people use the system because of the administrative overhead of managing the identity and service provider federations as they grow.

OAuth 2.0

While not an identity or credentialing solution, OAuth 2.0 often comes up as being in the same class of solution and has been used (by Facebook and OpenID Connect) to achieve an authentication/authorization solution. Introduced in 2006 and finalized as OAuth 2.0 in 2012, the latest version of the framework has wide deployment among Facebook, Google, and Microsoft but suffers from multiple non-interoperable implementations.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | Poor | Key-value pairs, which require centralized coordination to avoid conflicts. |
| Choice of Syntax | Poor | Only string-based key-value pairs are supported. |
| Decentralized Vocabularies | Poor | Key-value pairs, which require centralized coordination to avoid conflicts. |
| Web-based PKI | N/A | OAuth 2.0 is not designed to perform digital signatures. |
| Choice of Storage | N/A | OAuth 2.0 is not designed to express credentials. |
| App Integration | N/A | OAuth 2.0 is not designed to manage credentials. |
| Privacy-enhanced Sharing | N/A | OAuth 2.0 is not designed to request credentials. |
| Credential Portability | N/A | OAuth 2.0 is not designed to port credentials. |
| Credential Revocation | Problematic | OAuth 2.0 tokens time out after a while, but operate as bearer tokens (if they are stolen, they can be used by other people for a non-trivial amount of time). |

OAuth 2.0 was never designed to achieve the goals and capabilities listed at the beginning of this blog post, and so the comparison above isn’t very fair. That said, there are a number of reasons that OAuth 2.0 is not a good fit for the goals and required capabilities listed above: 1) it is often criticized as being problematic from a security standpoint, overly complex, and favoring large enterprise deployments, 2) it doesn’t have a data model that is flexible enough to model more than the most basic type of credentials, 3) it does not support digital signatures, and 4) it doesn’t solve the credentialing problem described above because it is designed for a completely different set of use cases: providing access to 3rd parties that want to perform certain operations on your account. While OAuth 2.0 is not a solution to the credentialing problem, it can be used as part of a credentialing solution, which is what OpenID Connect does.
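The bearer-token concern raised in the summary above can be illustrated with a toy model. This is a sketch under stated assumptions, not the OAuth 2.0 wire protocol: the `TokenService` class, its method names, and the one-hour lifetime are all hypothetical. The point is only that the check looks at the token and its expiry time, never at who is presenting it.

```python
import secrets
import time


class TokenService:
    """Toy bearer-token issuer (hypothetical; not an OAuth 2.0 implementation)."""

    def __init__(self, lifetime_seconds):
        self.lifetime = lifetime_seconds
        self._tokens = {}  # token -> (account, expiry time)

    def grant(self, account, now=None):
        """Issue a random token that grants access to `account`."""
        now = time.time() if now is None else now
        token = secrets.token_urlsafe(16)
        self._tokens[token] = (account, now + self.lifetime)
        return token

    def account_for(self, token, now=None):
        """Return the account a token unlocks, or None if unknown or expired.

        Nothing here identifies the caller: possession of the token is enough.
        """
        now = time.time() if now is None else now
        entry = self._tokens.get(token)
        if entry is None or now >= entry[1]:
            return None
        return entry[0]


service = TokenService(lifetime_seconds=3600)
token = service.grant("alice", now=0)

# The legitimate app and a thief holding a copy of the token look identical.
print(service.account_for(token, now=10))     # alice
print(service.account_for(token, now=1800))   # alice (stolen copy still works)
print(service.account_for(token, now=3600))   # None (expired)
```

Until the token expires, a stolen copy is indistinguishable from the original, which is why expiry alone is rated as a problematic form of revocation.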

OpenID Connect

OpenID Connect is a simple identity layer on top of the OAuth 2.0 protocol, which allows clients to verify the identity of an end-user based on the authentication performed by an authorization server, as well as to obtain basic profile information about the end-user in an interoperable and REST-like manner.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | Problematic | JSON is extensible, but needs a single centralized registry for terms or full URLs must be used. |
| Choice of Syntax | Poor | Only JSON is supported. |
| Decentralized Vocabularies | Poor | Extensions require a single centralized registry for terms or full URLs must be used. |
| Web-based PKI | Poor | URLs can be used for JWK key IDs, but the mechanism to do discovery isn’t specified in the specifications. |
| Choice of Storage | Poor | Credential information is not portable. |
| App Integration | Problematic | Recipients can only manage credentials via their identity provider. |
| Privacy-enhanced Sharing | Poor | Identity providers can track all logins. |
| Credential Portability | Poor | Credentials are not portable between identity providers. |
| Credential Revocation | Problematic | Distributed credentials can be revoked at any time, but are rarely used. |

The OpenID initiative started in 2005. Today Google, Microsoft, Deutsche Telekom, and Salesforce use the OpenID Connect protocol to support federated login, but Facebook and Twitter do not. OpenID Connect has been fairly successful in creating a Web-scale single sign-on solution, but it fails to address the goals and required capabilities for the following reasons: 1) it depends heavily on centralized registries to define types of credentials, 2) credentials issued by 3rd parties and digital signatures are rarely used (there is no rich 3rd party credential ecosystem), 3) there is no protocol for transferring credentials from one provider to another, 4) it enables tracking and other privacy violations, and 5) it promotes centralization and super-providers due to a reliance on email-addresses-as-identifiers, ensuring that email super providers become the new credential super providers.

WebID+TLS

WebID+TLS, previously known as FOAF+SSL, is a decentralized and secure authentication protocol built on W3C standards around Linked Data and utilizing client-side X.509 certificates and TLS to bootstrap the identity discovery process.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | Good | WebID is based on RDF which is a provably extensible data model. |
| Choice of Syntax | Good | WebID is based on RDF and thus supports multiple syntaxes. |
| Decentralized Vocabularies | Good | All data is expressed using vocabularies, which can be created by anyone. Data merging is designed to happen automatically. |
| Web-based PKI | Problematic | All claims in a WebID are self-issued claims. No clear specification for 3rd party claims. |
| Choice of Storage | Good | WebIDs can be stored at the recipient’s preferred location. |
| App Integration | Problematic | Possible, but no clear specification exists to do so. |
| Privacy-enhanced Sharing | Problematic | Identity Providers can track logins by default (unless a mixnet is used). |
| Credential Portability | Problematic | Self-issued credentials can be ported easily. If there were 3rd party credentials tied to a WebID URL, those credentials would not be portable. |
| Credential Revocation | Problematic | Revocation is possible, but there is no clear specification for doing so. |

WebID+TLS was first presented for the W3C Workshop on the Future of Social Networking in 2009, but has failed to gain any significant traction in the past six years. WebID+TLS is also problematic because 1) it depends on browser client-side certificate handling, which provides a poor user experience, 2) it depends in a non-trivial way on the KEYGEN HTML element, whose removal from certain browsers is currently under discussion, and 3) it does not clearly specify how to achieve many of the credential goals and required capabilities listed at the beginning of this post.

Mozilla Persona

Persona was launched in 2011 and shared some of its goals with similar authentication systems like OpenID or Facebook Connect, but it differed in several important ways: 1) it used email addresses as identifiers, 2) it was more focused on privacy, and 3) it was intended to be fully integrated in the browser.

| Capability | Rating | Summary |
| --- | --- | --- |
| Extensible Data Model | N/A | Not a design requirement. |
| Choice of Syntax | N/A | Not a design requirement. |
| Decentralized Vocabularies | N/A | Not a design requirement. |
| Web-based PKI | Problematic | Public keys are discoverable but content is largely hidden. |
| Choice of Storage | Poor | Identity provider is tied to login domain. |
| App Integration | N/A | There are no credentials to manage. |
| Privacy-enhanced Sharing | Good | Logins are privacy-enhancing and not easily trackable. |
| Credential Portability | Poor | There are no credentials to port. |
| Credential Revocation | N/A | There are no credentials to revoke. |

Persona is solely an authentication system and was not designed to support arbitrary credentials. Mozilla was hoping for widespread adoption of the protocol, but that did not occur because super providers like Google and Facebook had already developed their own competing login mechanisms. All full-time developers were pulled from the project in 2014.

Omitted Technologies

There are a number of technologies, such as U-Prove, the WebAppSec Login/Credentials API, and Mozilla’s Firefox Account API, that were not included in this analysis because they are still very early in the research and development phases. APIs like Facebook Connect and Login with Google were not included because they are not intended to be open standards.

Conclusion

There are a number of goals and required capabilities that need to be fulfilled in order to create a vibrant credentialing ecosystem. There have been various attempts at addressing subsets of the problems in the ecosystem, and those solutions have been met with small successes and varied failures. There is still no widely deployed and adopted way of issuing, storing, managing, and transmitting credentials on the Web today, but we do have quite a bit of insight into why the prior attempts at solving the general identity/credentialing problem have failed.

These findings lead to a simple question: Is it time to do something about Credentials on the Web?
