Stanford Web Security Research

Beware of Coarser-Grained Origins

You forgot the scheme.
Collin Jackson, Adam Barth
24 May 2008

Many browser security features require serializing the origin of a document to a string. Proposals for these features often focus exclusively on the host component of the origin, adding the scheme (and port) as an afterthought, if at all.

We recommend the HTML 5 origin serialization algorithm, which includes both the scheme and host.

Why to specify the scheme

Browser security proposals typically try to provide security against two classes of attackers:

  • Web attackers own a server with an untrusted domain name and want to impersonate a trusted domain name.
  • Active network attackers have the ability to inspect or corrupt HTTP traffic, but not HTTPS traffic. Network attacks may be launched using a compromised network router or insecure wireless network.

Checking the host provides protection against web attackers. Checking the scheme (if HTTPS) provides protection against active network attackers. Thus, to handle both classes of attackers, browser security mechanisms should include both the host and the scheme in a serialized origin.
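As a rough illustration of this argument, the Python sketch below compares a host-only check with a check on both scheme and host. The trusted site bank.example, the attacker URLs, and the function names are hypothetical and chosen only for this example.

```python
from urllib.parse import urlsplit

# Hypothetical trusted site, used only for illustration.
TRUSTED_SCHEME = "https"
TRUSTED_HOST = "bank.example"


def host_only_check(url: str) -> bool:
    """Accept any URL whose host matches, ignoring the scheme."""
    return urlsplit(url).hostname == TRUSTED_HOST


def scheme_and_host_check(url: str) -> bool:
    """Accept a URL only if both the scheme and the host match."""
    parts = urlsplit(url)
    return parts.scheme == TRUSTED_SCHEME and parts.hostname == TRUSTED_HOST


# A web attacker controls a different host, so both checks reject it.
web_attacker = "https://attacker.example/login"
# An active network attacker can forge HTTP content for the trusted host,
# so only the check that includes the scheme rejects it.
network_attacker = "http://bank.example/login"
legitimate = "https://bank.example/login"
```

Under these assumptions, the host-only check accepts network_attacker, while the scheme-and-host check accepts only legitimate.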

How to specify the scheme

The HTML 5 specification proposes the following algorithm for serializing an origin:

  1. If the origin in question is not a scheme/host/port tuple, then return the empty string.
  2. Otherwise, let result be the scheme part of the origin tuple.
  3. Append the string "://" to result.
  4. Apply the IDNA ToUnicode algorithm to each component of the host part of the origin tuple, and append the results — each component, in the same order, separated by U+002E FULL STOP characters (".") — to result.
  5. If the port part of the origin tuple gives a port that is different from the default port for the protocol given by the scheme part of the origin tuple, then append a U+003A COLON character (":") and the given port, in base ten, to result.
  6. Return result.
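The steps above can be sketched in Python roughly as follows. This is an illustrative reading of the algorithm, not a reference implementation: the tuple representation of an origin, the DEFAULT_PORTS table (limited here to HTTP and HTTPS), and the function name serialize_origin are assumptions made for this example.

```python
from encodings.idna import ToUnicode
from typing import Optional, Tuple

# Default ports for the two schemes discussed here; an assumption of this
# sketch, not an exhaustive table.
DEFAULT_PORTS = {"http": 80, "https": 443}

Origin = Tuple[str, str, int]  # (scheme, host, port)


def serialize_origin(origin: Optional[Origin]) -> str:
    # Step 1: anything that is not a scheme/host/port tuple serializes to "".
    if not (isinstance(origin, tuple) and len(origin) == 3):
        return ""
    scheme, host, port = origin
    # Steps 2-3: start with the scheme, followed by "://".
    result = scheme + "://"
    # Step 4: apply IDNA ToUnicode to each host component, joined by ".".
    result += ".".join(ToUnicode(label) for label in host.split("."))
    # Step 5: append the port only when it differs from the scheme's default.
    if DEFAULT_PORTS.get(scheme) != port:
        result += ":" + str(port)
    # Step 6: return the serialized origin.
    return result
```

For example, serialize_origin(("https", "example.com", 443)) yields "https://example.com", while serialize_origin(("http", "example.com", 8080)) yields "http://example.com:8080".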

Features that forgot the scheme

  • postMessage. The postMessage API allows the recipient of a message to determine its sender. The original implementation in Opera included only the host of the sender's origin. (The Firefox and Safari postMessage implementations follow the HTML 5 origin serialization algorithm. Internet Explorer 8 beta currently includes the scheme and host as separate strings.)
  • JSONRequest. The JSONRequest proposal originally used a Domain HTTP header to specify the message's origin. This header can be used to reject requests from undesirable origins. It specified only the host of the origin, not the scheme. We proposed that this header be replaced by an Origin header that specifies the full origin. Our proposal was adopted on May 25, 2008.
  • updateEnabled. Firefox's updateEnabled API determines which origins are allowed to install extensions using a whitelist. By default, the only sites on the whitelist are addons.mozilla.org and update.mozilla.org. Both of these sites redirect automatically to HTTPS. However, the whitelist only specifies the host of the origin, not the scheme. Thus, a network attacker can spoof http://addons.mozilla.org and install extensions. This vulnerability can be prevented by using the HTML 5 origin serialization algorithm. (See Mozilla bug 432532.)

Features that have always specified the scheme

  • Firefox Saved Passwords. When a user saves a password in Firefox, the password is autofilled not only for the current page, but for all pages for the same origin. Firefox stores the origin of the page using an algorithm that is similar to the HTML 5 origin serialization algorithm.
  • Cross-site XMLHttpRequest. The Access Control for Cross-Site Requests proposal specifies the origin of requests using the Origin header. The Origin header is generated using an algorithm that is similar to the HTML 5 origin serialization algorithm.
  • XDomainRequest. The XDomainRequest proposal specifies the origin of requests using the Referer header. For XDomainRequest, this header is generated using an algorithm that is similar to the HTML 5 origin serialization algorithm.
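A server that receives a serialized origin in a request header might use it along the following lines. This is a minimal sketch, assuming the headers arrive as a mapping keyed by the exact name "Origin"; the allowlist contents and the function name accept_request are hypothetical.

```python
from typing import Mapping

# Hypothetical allowlist of serialized origins (scheme and host, plus the
# port when it is not the default).
ALLOWED_ORIGINS = frozenset({"https://a.com"})


def accept_request(headers: Mapping[str, str]) -> bool:
    """Accept a request only if its Origin header names an allowed origin."""
    origin = headers.get("Origin", "")
    # Comparing the whole serialized origin rejects http://a.com, which an
    # active network attacker could forge, as well as unrelated hosts.
    return origin in ALLOWED_ORIGINS
```

Because the comparison covers the full string, a request claiming http://a.com is rejected even though its host matches an allowed entry.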

When to forget the scheme

There is one case when omitting the scheme is sensible: when the browser will fill it in automatically with a safe value.

To provide protection against active network attackers, HTTPS sites need to avoid importing libraries or exporting confidential information over HTTP. However, web sites often mirror exactly the same content over HTTP and HTTPS. If the developer expected some of the content to be served over HTTP only, the developer is likely to embed scripts using absolute paths containing the HTTP scheme:

<script src="http://a.com/foo.js"></script>

Unfortunately, this tag compromises the security of HTTPS on the entire site, because an active attacker can navigate the user's browser to the broken page over HTTPS, replace the insecure script with the attacker's own, and invade the security context of the secure site. This mistake can easily be corrected by using scheme-relative paths:

<script src="//a.com/foo.js"></script>

Forms can also use scheme-relative paths:

<form action="//a.com/foo.cgi">...</form>

These paths cause the browser to import and export over HTTP when the current page is viewed over HTTP and over HTTPS when the current page is viewed over HTTPS. Using this technique, a site can benefit from caching and increased performance when the page is viewed over HTTP but retain security when the page is viewed over HTTPS.
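The resolution behavior described above can be checked with Python's standard urljoin, which follows the same generic rules for relative references. This is only a sketch of how a scheme-relative path inherits the scheme of the containing page; the page URLs and the helper name resolve are assumptions made for illustration.

```python
from urllib.parse import urljoin

SCRIPT_REFERENCE = "//a.com/foo.js"  # scheme-relative path from the text


def resolve(page_url: str, reference: str = SCRIPT_REFERENCE) -> str:
    """Resolve a reference against the URL of the page that contains it."""
    return urljoin(page_url, reference)


print(resolve("http://a.com/index.html"))   # http://a.com/foo.js
print(resolve("https://a.com/index.html"))  # https://a.com/foo.js
```

The same reference therefore loads over HTTP from an HTTP page and over HTTPS from an HTTPS page, without the author hard-coding either scheme.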