Patterns for authentication at the edge
Identity is a boring but necessary element of most website builds. Validating a user’s identity and access rights sits in the critical performance path, is required site-wide, and is often implemented in a bespoke way. Moving it to the edge improves performance and can simplify your application architecture.
When I first started working in web development, I was building intranet applications for the UK’s air traffic control service. Their network at the time was based on Windows NT and used Internet Explorer in a special trusted mode when browsing local intranet sites. This meant the user’s Windows domain username was sent in an HTTP header on every request, so to find out who was browsing my site, I literally just had to read it, which in 2000-era PHP was something like:
```
echo $_SERVER['PHP_AUTH_USER'];
```
This printed my Windows domain username.
Well, that was easy! Of course, on the wider web this doesn’t work. So as developers we spend a long time making elaborate sign-up and sign-in systems that are specific to our applications. But it’s always struck me as architecturally elegant to apply identity as a layer in front of the application, just as I experienced in that intranet job, and an edge network like Fastly is actually a really great place to do it.
With help and permission from the lovely people at Conde Nast, the Financial Times (FT), Vice, Nikkei, USA TODAY NETWORK, Svenska Dagbladet (SvD) and 7digital, let’s dive into some of the identity and access patterns in use by real Fastly customers, what they use them for, and how they work.
Almost all authentication solutions these customers use can be built from a combination of four basic patterns:
1. Edge auth: Direct authentication against a credential database stored at the edge
2. Tokens: Reading, writing and validating signed tokens to persist an authentication state
3. Preflight: Sending a request to one backend for authentication prior to sending to another for the content
4. Redirect: Redirecting the user to a third party or other out-of-band service, and expecting to get them back later with some kind of token
The customers who participated in this research use a combination of these. Let’s look at each pattern in detail.
Edge auth: Direct authentication with HTTP Basic auth
No-one uses HTTP auth anymore, right? So I thought, but it turns out lots of Fastly customers do, mostly to protect staging, beta and other pre-production environments, or internal tools. With TLS now an essential component of any modern site, the ‘digest’ variant is no longer in common use, but if you favor this venerable browser-native authentication mechanism, implementing the ‘basic’ variant in VCL is pretty easy:
Using Fastly’s support for cryptographic and hashing functions in VCL, you can decode the Base64 encoded credential of the Basic auth format, look up the supplied username in an Edge dictionary, and check that the password matches.
Edge dictionaries allow the key-value pairs of a named VCL data table to be updated via the Fastly API. If you use an edge dictionary to store credentials, we recommend making it private, which masks the values anywhere the generated VCL can be viewed.
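As a minimal sketch (not the production fiddle linked in the resources section), the check might look like this, assuming a private edge dictionary named `credentials` holding username/password pairs; a real deployment would store hashed passwords rather than plaintext:

```
# A private edge dictionary of username: password pairs (illustrative values)
table credentials {
  "alice": "opensesame",
}

sub vcl_recv {
  declare local var.decoded STRING;
  declare local var.user STRING;
  declare local var.pass STRING;

  if (req.http.Authorization !~ "^Basic ") {
    error 401 "Unauthorized";
  }

  # Strip the "Basic " prefix and decode the base64-encoded credential
  set var.decoded = digest.base64_decode(regsub(req.http.Authorization, "^Basic ", ""));
  set var.user = regsub(var.decoded, ":.*$", "");
  set var.pass = regsub(var.decoded, "^[^:]*:", "");

  # Constant-time comparison against the dictionary entry; the sentinel
  # default ensures an unknown user can never match an empty password
  if (!digest.secure_is_equal(table.lookup(credentials, var.user, "!missing!"), var.pass)) {
    error 401 "Unauthorized";
  }
}

sub vcl_error {
  if (obj.status == 401) {
    # Prompt the browser for credentials
    set obj.http.WWW-Authenticate = {"Basic realm="Restricted""};
    return(deliver);
  }
}
```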
Vice is one Fastly customer using this technique to protect their staging and test sites.
Direct edge auth with client IP
Another useful application of the edge auth pattern is authentication based on the source IP address. This is common for sites that offer automatic premium access to people who work at a particular organization, or who are visiting a particular partner location, like an airport business lounge. __SvD__ and the __Financial Times__ both use IP addresses to create premium ‘zones’ in which access is automatically granted, though while SvD does it in VCL, the FT performs the same logic using a backend service. Here’s how you could do it using an access control list (ACL) in VCL:
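For example (the ACL entries and header name here are invented for illustration):

```
# Hypothetical partner network ranges that get automatic premium access
acl premium_networks {
  "203.0.113.0"/24;
  "198.51.100.17";
}

sub vcl_recv {
  # Never trust a client-supplied value for this header
  unset req.http.Premium-Zone;
  if (client.ip ~ premium_networks) {
    set req.http.Premium-Zone = "1";
  }
}
```

The `Premium-Zone` header can then drive logic at the edge or at the origin, safe in the knowledge that only the edge can set it.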
Tokens: repeatable stateless session validation
Of course, most public websites don’t use HTTP auth, and don’t want to push their entire user database to an edge cloud. Instead, at a high level, most have a web form which submits a username and password to a login endpoint, and then sets a cookie to allow access for the duration of the session.
Generally, in a production setting you will want to provide your own service to perform this authentication. But setting, reading, and validating the cookie can all be done at the edge. You can use the popular [JSON Web Token](https://jwt.io/) format for that, and connect to a custom origin for the login request. First, let’s visualize the flow, because this is a bit more involved:
Here’s what is happening here:
1. The first request is simply asking for the login page. We deliver this from an authentication service origin.
2. The second request is the POST to actually sign in. The authentication origin returns a set of user data that it wants to store in the user’s active session token. We encode this as a [JSON web token](https://jwt.io/) (JWT), sign it, and set it as a cookie in the response to the user. We also redirect them to the post-login page.
3. The user can now request a content page from the site, and because the request contains a cookie, we decode the JWT, validate the signature and break out the user data into separate headers, which we append to the origin request. Importantly, we also strip the Cookie header, so that we isolate the cookie behavior to the edge layer.
4. The content origin produces a page, and tells you whether any of the identity data was queried by setting a `Vary` header. This is where it is super efficient to spread the session data across multiple headers, because the content origin might only be interested in whether the user is subscribed or not, rather than their individual identity, so we can cache just two versions of this page.
5. We strip the `Vary` header after caching, because it doesn’t make any sense to the browser.
The main benefit of this is that your site doesn’t need to know anything about cookies or JWTs: you can simply read the authentication data directly out of trusted headers, achieving that nice separation of concerns we talked about in the introduction. Here’s the code to implement the flow above:
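The full flow is in the fiddle listed in the resources section; the heart of it, validating an HS256-signed JWT from a cookie and breaking the claims out into headers, might be sketched like this (the secret, cookie name, header names, and claim names are all illustrative, and a real implementation should also check the `exp` claim):

```
sub vcl_recv {
  # Never trust identity headers supplied by the client
  unset req.http.Auth-User;
  unset req.http.Auth-Subscribed;

  if (req.http.Cookie:session) {
    declare local var.jwt STRING;
    declare local var.header STRING;
    declare local var.payload STRING;
    declare local var.signature STRING;
    declare local var.computed STRING;
    declare local var.claims STRING;

    set var.jwt = req.http.Cookie:session;
    set var.header = regsub(var.jwt, "^([^\.]+)\.[^\.]+\.[^\.]+$", "\1");
    set var.payload = regsub(var.jwt, "^[^\.]+\.([^\.]+)\.[^\.]+$", "\1");
    set var.signature = regsub(var.jwt, "^[^\.]+\.[^\.]+\.([^\.]+)$", "\1");

    # Recompute the HMAC-SHA256 signature and convert it to unpadded base64url
    set var.computed = digest.hmac_sha256_base64("my-jwt-secret", var.header "." var.payload);
    set var.computed = regsuball(var.computed, "\+", "-");
    set var.computed = regsuball(var.computed, "/", "_");
    set var.computed = regsuball(var.computed, "=", "");
    if (!digest.secure_is_equal(var.computed, var.signature)) {
      error 401 "Invalid session";
    }

    # Break the claims out of the payload into trusted request headers
    set var.claims = digest.base64url_nopad_decode(var.payload);
    set req.http.Auth-User = regsub(var.claims, {"^.*"sub"\s*:\s*"([^"]*)".*$"}, "\1");
    set req.http.Auth-Subscribed = regsub(var.claims, {"^.*"subscribed"\s*:\s*(\w+).*$"}, "\1");

    # Isolate cookie handling to the edge layer
    unset req.http.Cookie;
  }
}
```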
Variations of this mechanism are used by lots of Fastly customers, including USA TODAY NETWORK, Nikkei and the FT. Fastly supports multiple forms of crypto, so you can adapt this pattern to the format of your own authentication tokens.
Single-use, time-limited tokens
The token decoding principle is incredibly versatile, and is also commonly used to generate links intended to grant access to a download or video stream, for a single invocation. In an edge world, we try to avoid maintaining globally synchronized state, so rather than expiring a token when it is used, it’s often easier to place a short-duration time limit on it. This can be combined with limiting the token to a single IP address and user-agent.
To do this, have your origin server generate a link to a file at a protected path, and append a query string variable which contains the expiry time in the clear, along with a signed hash of:
`{object URL} + {expiry time} + {user agent} + {client IP} + {secret}`
We can then recompute the hash at the edge, and if it matches and the expiry time has not been reached, deliver the object:
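At the edge this could look something like the following sketch, assuming the origin appends `exp` (a Unix timestamp) and `sig` (a hex-encoded SHA-256) to the query string and shares the secret with the edge:

```
sub vcl_recv {
  declare local var.expiry STRING;
  declare local var.signature STRING;

  set var.expiry = subfield(req.url.qs, "exp", "&");
  set var.signature = subfield(req.url.qs, "sig", "&");

  # Recompute the hash from the same inputs the origin signed
  set req.http.Sig-Input = req.url.path var.expiry req.http.User-Agent client.ip "my-shared-secret";
  if (!digest.secure_is_equal(var.signature, digest.hash_sha256(req.http.Sig-Input))) {
    error 403 "Bad token";
  }

  # Reject links whose expiry time has passed
  if (time.is_after(now, std.integer2time(std.atoi(var.expiry)))) {
    error 403 "Link expired";
  }
}
```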
Preflight: per-request auth for paywalls
Using an authentication service only for the initial validation of the login credentials is efficient: subsequently you can just get everything you need by decoding a cookie. However, some sites can’t do this, because the ability for a given user to access a given piece of content is much more complicated than can be expressed in a cookie.
For example, perhaps content has properties that can change, like whether it is accessible to ‘standard tier’ subscribers (rather than ‘premium’ ones), or whether it counts towards a metered allowance, or maybe even how much it costs (for sites that charge micropayments for content). You may also need to update some aspect of a user’s account based on the act of accessing the page they want to see — to deduct a credit, for example, or register that they’ve bought this content so they don’t get charged twice.
These kinds of use cases benefit from a pattern we call __preflight__. In a preflight, a request for a page is ‘parked’ and a separate request for metadata is made.
Let’s look at a complex example that includes preflights for both content and user:
Here’s what we’re doing:
1. First we receive a request for an article, and retrieve that article as normal. But the response includes a `Paywall` header, so we know that we need to consult the referenced URL to check whether the user will be able to view this content.
2. Now, while holding the same client-side request open, we make a second backend request to the paywall service, and set the URL to the one specified by the content. However, we also add the user or session ID of the current user as a header.
3. The paywall service returns result information in HTTP headers, such as `Paywall-Result` and `Paywall-Meta`.
4. We transfer those headers onto the request and restart again to recover the content that the user wanted. We’ll be able to get this from the local cache since we already loaded it very recently.
5. The content is delivered to the user, and we can optionally include a short-lived cookie containing the paywall-related metadata, so that the page can tell the user the status of their credit (e.g. “You have 5 articles remaining this month”).
You might look at this and think that’s a lot of sequential, blocking requests. Combining the backend requests into one might well make sense, but consider the cache implications of doing that. In the above example, the first backend request is very cacheable. We’ll only do one of these per article, not per user, so Fastly’s cache will quickly become a distributed database of content for your site, accessible in a few milliseconds at most. Our CEO, Artur, is very fond of the idea that you can think of Fastly simply as a massive key value store keyed by URL.
If a lot of your content is not metered, or otherwise has properties that indicate there’s no need to make the paywall request, then we can simply deliver the content without a paywall preflight. Overall, the objective is to minimize the absolute number of preflights, and you can do that by increasing their cacheability, and your ability to be smart about whether to make them at all.
Here’s the VCL for the above example, which connects to a toy paywall service I threw together in Glitch:
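The fiddle has the full detail, but the shape of the restart dance is roughly this (the `F_paywall` backend name and the header names are placeholders):

```
sub vcl_recv {
  if (req.http.Preflight-URL) {
    # Park the original URL and consult the paywall service instead
    set req.http.Orig-URL = req.url;
    set req.url = req.http.Preflight-URL;
    set req.backend = F_paywall;
    set req.http.Session-ID = req.http.Cookie:session;
  }
}

sub vcl_deliver {
  if (resp.http.Paywall && !req.http.Paywall-Result && !req.http.Preflight-URL) {
    # The content says it is paywalled: restart to do the preflight
    set req.http.Preflight-URL = resp.http.Paywall;
    restart;
  }
  if (req.http.Preflight-URL && resp.http.Paywall-Result) {
    # The preflight has answered: carry the result back onto the request
    # and restart again to fetch the (now locally cached) article
    set req.http.Paywall-Result = resp.http.Paywall-Result;
    set req.url = req.http.Orig-URL;
    unset req.http.Preflight-URL;
    restart;
  }
}
```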
The `Paywall` header is a neat solution to allow a generic implementation of paywall functionality at the edge. Since semantically it’s a form of “linked content”, you could also take a more abstract approach using the `Link` header with a non-standard `rel` parameter, moving more configuration into HTTP and hard coding less behavior at the edge. I’m making up internet standards here, but you could imagine something like this:
```
Link: <https://paywall.example.com/check>; rel="x-paywall"
```
Paywalls are a common use case for Fastly customers. The preflight pattern is used by Conde Nast for the New Yorker paywall, for example, as well as SvD and the FT. Some customers have their own backends for doing metering and paywall access decisions, while others use third party services such as Evolok. The pattern also has use cases outside of paywalls: 7digital uses it to authenticate access to music streams.
Client-side redirection: SSO & OAuth
So far, so good, but what about federated identity systems, single-sign-on etc? All these kinds of use cases fall under the general pattern of redirecting the user to an out of band service, and then expecting to get them back later with some additional metadata.
Let’s look at OAuth2, which is the most common type of provider protocol encountered by Fastly customers.
The body problem
The challenge in supporting these flows at the edge is that they commonly require constructing a request body to send to the authentication provider, and reading the response body that comes back, in order to perform a verification over a private, trusted backchannel. Fastly currently does not permit constructing custom request bodies, nor do we provide access to the response body of a response from an upstream server, because we stream requests: your code runs before we have access to the body. We can solve this in a few ways:
- Use an OAuth provider whose token exchange endpoint supports GET requests and returns tokens and profile information in HTTP headers. If you are the OAuth provider, you can of course make an OAuth backend that does this, but it’s not common, nor is it technically standards-compliant.
- Use a simple proxy that converts an OAuth callback request into a token exchange call, and use the traditional authorization code grant type (e.g. as supported by GitHub’s OAuth flow). This will work with virtually all OAuth services.
- Use a mechanism that does not require body access. If your OAuth server supports the implicit grant type, and issues tokens that can be validated using a mechanism that doesn’t require an API call, then this is possible. Google’s OpenID Connect service supports this method.
Running your own OAuth service
Everything’s pretty easy if you run your own OAuth service. Let’s start with the user attempting to do something for which you need them to be logged in. You’ll redirect them to the authentication provider, for login and consent, and then get back an authorization code. Here’s the full flow:
The user’s browser, following the redirect back to us after logging in with the authentication service, will load a ‘callback’ URL with a code parameter. At this point, according to the OAuth spec, the request to the token endpoint must use an HTTP POST. However, to avoid having to construct a request body at the edge, your OAuth service will need to support GET on this endpoint. The spec also requires that the response to token requests be JSON-encoded in the response body; you need the response in headers at the edge, so your OAuth service needs to extend the spec and include the response data in headers as well.
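Intercepting the callback and performing the GET-based exchange might be sketched as follows (the `/callback` and `/token` paths, the `F_auth` backend, the client credentials, and the `Access-Token` response header are all assumptions about your own service):

```
sub vcl_recv {
  if (req.url.path == "/callback") {
    # Exchange the authorization code via GET (our own service allows
    # this; the OAuth2 spec itself requires a POST here)
    set req.url = "/token"
      "?grant_type=authorization_code"
      "&code=" subfield(req.url.qs, "code", "&")
      "&client_id=my-client-id"
      "&client_secret=my-client-secret";
    set req.backend = F_auth;
  }
}

sub vcl_deliver {
  if (resp.http.Access-Token) {
    # Our extended token response echoes the token in a header; persist
    # it as a session cookie and send the user to the post-login page
    set resp.http.Set-Cookie = "session=" resp.http.Access-Token "; Path=/; Secure; HttpOnly";
    unset resp.http.Access-Token;
    set resp.status = 302;
    set resp.http.Location = "/";
  }
}
```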
This example uses a custom OAuth-like provider, and as with the paywall I made the code available on Glitch.
Using a proxy for third party OAuth services
This is all well and good if you can roll your own OAuth provider and don’t mind stretching the spec a bit. But if you can’t, you can add a simple proxy. This will take the GET request from the callback and convert it into a POST to the real token endpoint, read the result from the response body, and convert it to headers on the response back to the edge:
From the perspective of Fastly, this solution is identical to the previous one. The only difference is that the Fastly-friendly OAuth backend is not actually the OAuth provider but a proxy to a less Fastly-friendly standards-compliant OAuth backend. Here’s a [glitch example](https://glitch.com/edit/#!/fiddle-oauth-proxy) showing the necessary transformations.
Implicit grants
Google’s OAuth provider is particularly fully featured, and supports the [implicit grant type](https://oauth.net/2/grant-types/implicit/), which is relatively rare for OAuth providers in general. It also formats its access token as a JSON Web Token, and provides public keys that allow the token to be validated without an API call. Your callback flow can now look like this:
This method comes with pros and cons. With the auth code grant, we had to check each login with the auth provider. With implicit grants, that backchannel doesn’t exist, and in Google’s case they provide public keys that enable offline validation of tokens. These keys are rotated daily, but are cacheable, so you can use Fastly’s global cache to retrieve and store the keys on demand. Fastly’s RSA Verify function allows us to use the public key to check the signature from the JWT.
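Validation of the token then stays entirely at the edge. A sketch, assuming the PEM-encoded public key has already been fetched (and cached) into a `Google-Public-Key` header on an earlier restart, and that the ID token arrives in a cookie (both names are mine):

```
sub vcl_recv {
  declare local var.jwt STRING;
  declare local var.header STRING;
  declare local var.payload STRING;
  declare local var.signature STRING;

  set var.jwt = req.http.Cookie:id_token;
  set var.header = regsub(var.jwt, "^([^\.]+)\.[^\.]+\.[^\.]+$", "\1");
  set var.payload = regsub(var.jwt, "^[^\.]+\.([^\.]+)\.[^\.]+$", "\1");
  set var.signature = regsub(var.jwt, "^[^\.]+\.[^\.]+\.([^\.]+)$", "\1");

  # RS256: verify the signature over header.payload with Google's public key
  if (!digest.rsa_verify(sha256, req.http.Google-Public-Key,
      var.header "." var.payload, var.signature, url_nopad)) {
    error 401 "Invalid ID token";
  }
}
```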
Here’s a full featured demo using the Google provider and an implicit grant type:
Note that this demo uses the real Google Cloud Platform provider, so after granting access to your Google account, you may want to revoke that access on the [Apps with access to my account](https://myaccount.google.com/permissions?pli=1) page of your Google account settings.
Wrapping up
There’s no one right way to manage access and identity. Even within these general patterns there is massive variety, but you can always start from here and adapt these patterns to your use case. I’m always interested in new and creative ways of processing identity and access at the edge; if you have one, feel free to [drop me a line](mailto:abetts@fastly.com) or post in our [community forum](https://community.fastly.com/).
Resources
Here’s a quick summary of all the demos and example code featured in this article:
* Edge auth: [Fiddle](https://fiddle.fastlydemo.net/fiddle/95dab48c)
* Tokens: [Session token fiddle](https://fiddle.fastlydemo.net/fiddle/115f15c5), [single-use token fiddle](https://fiddle.fastlydemo.net/fiddle/a04d81ca)
* Preflight: [Fiddle](https://fiddle.fastlydemo.net/fiddle/f01adda8), [Example paywall glitch](https://glitch.com/edit/#!/fiddle-paywall)
* Redirects: [Custom provider fiddle](https://fiddle.fastlydemo.net/fiddle/29868408), Example custom [OAuth server glitch](https://glitch.com/edit/#!/fiddle-oauth), [implicit grant fiddle](https://fiddle.fastlydemo.net/fiddle/08678488)