Overriding Origin TTL in Varnish, or My Beginner's Mistake
A long time ago, I was helping out at a gaming conference where there was an intranet CMS using a Twitter search plugin. Unfortunately, the rather saturated Internet connection was slowing down all of the Twitter search requests. Each page needed 4 searches, at 500ms each, for a total of 2-3 seconds per page.
I had recently heard about Varnish, and thought this was a great opportunity to see what it could do. I set up Varnish with Twitter as a backend (the Varnish terminology for origin), and used /etc/hosts to point search.twitter.com at 127.0.0.1. I verified with the varnishlog command that the requests were going through Varnish, but there was no caching yet.
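The setup looked roughly like this (a sketch from memory; the IP address is a hypothetical placeholder for whatever search.twitter.com resolved to at the time):

```
# /etc/hosts -- send the CMS's Twitter lookups to the local Varnish
127.0.0.1   search.twitter.com
```

```
# Varnish backend definition. Note that the backend must be defined
# by IP here: using the hostname would resolve via /etc/hosts back
# to Varnish itself, creating a loop.
backend twitter {
  .host = "203.0.113.10";   # hypothetical resolved IP of search.twitter.com
  .port = "80";
}
```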
After staring at the logs for a bit, I noticed that there was a Cache-Control: max-age=0 header, causing Varnish not to cache the responses. So I wrote my very first VCL (the Varnish Configuration Language) to rewrite the Cache-Control header in vcl_fetch. This did not have the expected result. As far as I could tell, there was no difference at all.
I hopped on the official Varnish IRC channel (irc.linpro.no, channel #varnish) and asked what I was doing wrong. The friendly folks there quickly explained my problem to me.
How TTL is Handled in Varnish
Varnish examines the response headers to determine the TTL before vcl_fetch is run. So any changes you make to Cache-Control end up on the object returned to the client, but make no difference to the TTL.
In the Twitter search plugin example, what I had to do instead was this:
sub vcl_fetch {
  # cache everything for 5 minutes, ignoring any cache headers
  set beresp.ttl = 5m;
}
That was it! Other than the backend definition pointing at Twitter, I needed no further VCL.
With this 5-minute TTL, the Twitter searches went from 500ms to below 1ms, and the page render times went from 2-3 seconds to 20-50 milliseconds.
What Would I Do Differently Now?
Back then, this setup was about the extent of what Varnish could do, but it was sufficient. However, every 5 minutes or so, a user would pay the price and have to wait for Varnish to fetch fresh copies of the 4 searches. While not ideal, this was sporadic enough that users didn't notice. But things have changed since then, and these days no one needs to wait for Varnish to fetch fresh objects from the origin.
With Fastly, stale-while-revalidate is available, so if you use Fastly's Varnish, you can write:
sub vcl_fetch {
  # cache everything for 5 minutes, ignoring any cache headers
  set beresp.ttl = 5m;
  # use stale to answer immediately if expired less than a minute ago
  set beresp.stale_while_revalidate = 1m;
  # use stale for a lot longer in case of fail whale, or our internet is down
  set beresp.stale_if_error = 48h;
}
With this setup, even if your Internet went out, the searches would still work and the CMS wouldn't break. And as long as any page with the plugin was getting requested about once a minute, no one would have to wait for the fresh results to be fetched from Twitter, since stale_while_revalidate triggers a background fetch.
If, like the historical me in this tale, you are stuck on an intranet without the option to use Fastly, you can use Varnish 4. The background fetch is the default behavior when delivering a stale object, but doing both stale-if-error and stale-while-revalidate is a little more complex. Varnish Software has an excellent blog post that outlines how to set up both cases.