The lengthiest HTTP headers
Large web page bodies make your page load slowly, but what about large headers?
HTTP headers helpfully attach metadata about the page body. Some of these headers are crucial, like Content-Type, which communicates the type of content, say text/html or image/jpeg. Most headers are quite short, but what are the largest, lengthiest headers?
Andrew Betts posted previously on this blog about The headers we want and The headers we don't want. He included a recommendation:
“So as well as trying to remove unnecessary headers, try to keep the values of the headers that you do include as short as possible.”
How can we find these lengthy headers? The HTTP Archive tracks information on over 1 million pages every month. They make the data available via Google BigQuery. They recently introduced some new datasets which are even easier to use. I queried the HTTP Archive with DuckDB, specifically the 2024-10-01 crawl.
HTTP header names
Headers consist of a name and a value. Let’s start by looking at their names. Most header names are very short, with the most common being 4 or 13 characters long:
These short header names will not cause any performance problems.
HTTP header values
Let’s shift our focus to the header values. Looking at all the values, half are under 15 characters long, but we can see that 1% of values are over 263 characters long:
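To make the median and 99th percentile concrete, here is a minimal Python sketch of the calculation. It is not the DuckDB query used for this analysis: the sample values and the nearest-rank `percentile` helper are assumptions made purely for illustration.

```python
import math

def percentile(sorted_values, p):
    """Nearest-rank percentile of an ascending list, for 0 < p <= 100."""
    rank = math.ceil(p / 100 * len(sorted_values))
    return sorted_values[max(rank, 1) - 1]

# Hypothetical header values; the real analysis covers an entire crawl.
sample_values = [
    "text/html",
    "gzip",
    "max-age=3600",
    "nginx",
    "session=" + "x" * 300,
]

# Measure each value's length in characters, then sort ascending.
lengths = sorted(len(value) for value in sample_values)

median_length = percentile(lengths, 50)
p99_length = percentile(lengths, 99)
print(median_length, p99_length)
```

Even in this tiny sample, a single long value dominates the top percentile while leaving the median untouched, which is the same shape the real data shows.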
The most interesting of these long values come from the Set-Cookie and Content-Security-Policy headers.
Set-Cookie header values
The Set-Cookie header sends a cookie from the server to the browser, which the browser then sends back to the server on later requests. Typically these cookies are a short identifier, but some are kilobytes long:
Some of these are misconfigurations and some of these push a lot of state to the browser, sometimes even whole databases of products. I recommend keeping cookies small.
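One way to keep an eye on cookie size is to flag oversized Set-Cookie values before they ship. The sketch below is a rough example only: the `oversized_cookies` helper and the sample cookies are hypothetical, and the 4096-byte default is an assumption based on the per-cookie size that browsers commonly support, so pick a limit that suits your own site.

```python
def oversized_cookies(set_cookie_values, limit_bytes=4096):
    """Return (cookie name, size in bytes) for each value above limit_bytes."""
    oversized = []
    for value in set_cookie_values:
        # Size of the whole header value, attributes included.
        size = len(value.encode("utf-8"))
        if size > limit_bytes:
            name = value.split("=", 1)[0].strip()
            oversized.append((name, size))
    return oversized

# Hypothetical Set-Cookie values: a short identifier and a bloated state blob.
cookies = [
    "sid=abc123; Path=/; HttpOnly",
    "cart=" + "A" * 5000 + "; Path=/",
]

flagged = oversized_cookies(cookies)
print(flagged)
```

A check like this could run in a test suite or a log-processing job, so that a cookie which quietly grows over time gets noticed.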
Content-Security-Policy header values
The Content-Security-Policy header defines directives that govern resource fetching, the state of a document, aspects of navigation, and reporting. For very simple sites, the policy can be short, but for most sites the policy ends up quite long:
A small Content-Security-Policy
Let’s look at a small policy. The policy for https://www.w3.org/ is:
There are two directive names, separated by a semicolon: frame-ancestors and upgrade-insecure-requests.
The frame-ancestors directive restricts the URLs which can embed the page using <frame>, <iframe>, <object>, or <embed>. This avoids the risk of being embedded into potentially hostile contexts. The frame-ancestors directive values are:
'self': the page can be embedded by pages from the same origin
https://cms.w3.org/: the page can be embedded by the content management system
https://cms-dev.w3.org/: the page can be embedded by the development content management system
The upgrade-insecure-requests directive asserts to the user agent that the owner intends a site to load only secure resources, and that insecure URLs ought to be treated as though they had been replaced with equivalent secure URLs.
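The structure of a policy is simple enough to take apart in a few lines of code. Here is a minimal Python sketch, assuming a policy string reconstructed from the two directives and three values described above rather than copied from a captured response header. The `parse_csp` helper is hypothetical and skips the finer points of the specification's parsing rules.

```python
def parse_csp(policy):
    """Split a policy string into {directive name: [values]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments, e.g. after a trailing semicolon
        name = tokens[0].lower()
        # If a directive name repeats, keep only its first occurrence.
        directives.setdefault(name, tokens[1:])
    return directives

# Assumption: rebuilt from the directive values listed in the text.
policy = (
    "frame-ancestors 'self' https://cms.w3.org/ https://cms-dev.w3.org/; "
    "upgrade-insecure-requests"
)

parsed = parse_csp(policy)
for name, values in parsed.items():
    print(name, values)
```

Note that upgrade-insecure-requests takes no values at all, so it parses to an empty list.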
A long Content-Security-Policy
Now let’s look at a longer policy. The policy for https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy is:
That’s a lot of directives! Let’s look at just one: the img-src directive restricts the URLs from which image resources may be loaded. In this case, images may be loaded from the origin itself, from inline data: URLs, or from a number of other sources:
'self'
*.githubusercontent.com
*.googleusercontent.com
*.gravatar.com
data:
firefoxusercontent.com
https://*.google-analytics.com
https://*.googletagmanager.com
www.gstatic.com
https://mdn.dev/
https://mdn.github.io/shared-assets/
interactive-examples.mdn.allizom.net
interactive-examples.mdn.mozilla.net
mdn.dev
mozillausercontent.com
profile.accounts.firefox.com
profile.stage.mozaws.net
upload.wikimedia.org
wikipedia.org
For example, the CSS <image> image-set() reference page loads an image https://mdn.github.io/shared-assets/images/examples/balloons-landscape.jpg from mdn.github.io, which is allowed by the policy.
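To show roughly how a browser decides this, here is a deliberately simplified Python sketch of source matching. It is not the full matching algorithm from the CSP specification: the `source_allows` helper is hypothetical, it ignores ports and keyword sources such as 'self', it assumes a source without a scheme means https, and it uses only a small subset of the values listed above.

```python
from urllib.parse import urlsplit

def source_allows(source, url, page_scheme="https"):
    """Simplified check of one source expression against one URL."""
    target = urlsplit(url)
    if source.startswith("'"):
        return False  # keyword sources such as 'self' are out of scope here
    if source.endswith(":"):
        return target.scheme == source[:-1].lower()  # scheme source, e.g. data:
    if "://" in source:
        scheme, rest = source.split("://", 1)
    else:
        scheme, rest = page_scheme, source  # assumption: no scheme means the page's scheme
    host, slash, path = rest.partition("/")
    target_host = target.hostname or ""
    if host.startswith("*."):
        # "*.example.com" matches subdomains, not example.com itself.
        host_ok = target_host.endswith(host[1:].lower())
    else:
        host_ok = target_host == host.lower()
    source_path = slash + path
    target_path = target.path or "/"
    if not source_path:
        path_ok = True  # no path in the source: any path on that host
    elif source_path.endswith("/"):
        path_ok = target_path.startswith(source_path)  # trailing slash: prefix match
    else:
        path_ok = target_path == source_path  # otherwise: exact path match
    return target.scheme == scheme.lower() and host_ok and path_ok

# A subset of the img-src values from the policy above.
img_src = ["'self'", "data:", "*.gravatar.com", "https://mdn.github.io/shared-assets/"]

def image_allowed(url):
    return any(source_allows(source, url) for source in img_src)

balloons = "https://mdn.github.io/shared-assets/images/examples/balloons-landscape.jpg"
print(image_allowed(balloons))
print(image_allowed("https://example.com/tracker.gif"))
```

The balloons image passes because its path starts with /shared-assets/, while an image elsewhere on mdn.github.io would not match that particular source.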
Most common directives
If I look at the largest policies, the top five directives are:
img-src, which restricts image loading URLs
connect-src, which restricts EventSource, WebSockets, and XMLHttpRequest URLs
script-src, which restricts script execution URLs
frame-src, which restricts <iframe> URLs
default-src, which is a fallback for the other fetch directives
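A ranking like this comes down to counting directive names across many policies. Here is a minimal Python sketch of that counting step, assuming three made-up policies. The `directive_names` helper and the sample data are hypothetical, and the real analysis ran against the HTTP Archive rather than a hand-written list.

```python
from collections import Counter

def directive_names(policy):
    """Return the directive names that appear in one policy string."""
    names = []
    for part in policy.split(";"):
        tokens = part.split()
        if tokens:
            names.append(tokens[0].lower())
    return names

# Hypothetical policies, for illustration only.
policies = [
    "default-src 'self'; img-src 'self' data:",
    "img-src *; script-src 'self'",
    "img-src data: blob:; connect-src 'self'",
]

# Tally every directive name across all policies.
counts = Counter(name for policy in policies for name in directive_names(policy))
top = counts.most_common(1)
print(top)
```

The same tally works for directive values: count the tokens after the directive name instead of the name itself.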
Most common directive values for img-src
Let’s pick the most common directive, img-src. Its top ten values are:
data:
blob:
https://*.google-analytics.com
android-webview-video-poster:
*.facebook.com
https://googleads.g.doubleclick.net
*.fbcdn.net
*.cdninstagram.com
https://*.fbsbx.com
*.giphy.com
The analytics and advertising domains are almost always third parties. More interesting are the “CDN” domains: many sites have an entire domain for image assets.
Smaller policies
I recommend having smaller policies. You can think of all of these policies as security exceptions. Simplicity is key to keeping security and web performance issues in check. If simplicity were the top priority for your website, then all of the assets would come from the website itself. When browsers create separate HTTP connections to many additional websites, this significantly slows down page loading.
Andrew already articulated this recommendation in great detail in Taming third parties with a single-origin website. You can proxy other domains through Fastly so that they are swiftly served from your main domain. This means you can maintain a small, strict Content-Security-Policy and you can maximise web performance.