The lengthiest HTTP headers

Leon Brocard

Principal Solutions Architect, Fastly

Large web page bodies make your page load slowly, but what about large headers?

HTTP headers helpfully attach metadata about the page body. Some of these headers are crucial, like Content-Type, which communicates the type of content, say text/html or image/jpeg. Most headers are quite short, but what are the largest, lengthiest headers?

Andrew Betts posted previously on this blog about The headers we want and The headers we don't want. He included a recommendation:

“So as well as trying to remove unnecessary headers, try to keep the values of the headers that you do include as short as possible.”

How can we find these lengthy headers? The HTTP Archive tracks information on over 1 million pages every month. They make the data available via Google BigQuery. They recently introduced some new datasets which are even easier to use. I queried the HTTP Archive with DuckDB, specifically the 2024-10-01 crawl.

HTTP header names

Headers consist of a name and a value. Let’s start by looking at their names. Most header names are very short, with the most common being 4 or 13 characters long.

These short header names will not cause any performance problems.

HTTP header values

Let’s shift our focus to the header values. Looking at all the values, half are under 15 characters long, but 1% of values are over 263 characters long.
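The kind of summary described above can be sketched in a few lines of Python. This is only an illustration, not the DuckDB query run against the HTTP Archive: the `sample_headers` list is made-up data, and the nearest-rank percentile method is an assumption about how one might compute the cut-offs.

```python
import math

def percentile(sorted_values, pct):
    """Nearest-rank percentile of an already sorted, non-empty list."""
    if not sorted_values:
        raise ValueError("no values to summarise")
    rank = max(1, math.ceil(pct / 100 * len(sorted_values)))
    return sorted_values[rank - 1]

def value_length_percentiles(headers, pcts=(50, 90, 99)):
    """Map each requested percentile to a header value length in characters."""
    lengths = sorted(len(value) for _name, value in headers)
    return {pct: percentile(lengths, pct) for pct in pcts}

# Hypothetical sample; a real analysis would read headers from a crawl.
sample_headers = [
    ("content-type", "text/html"),
    ("content-length", "1234"),
    ("cache-control", "max-age=3600"),
    ("set-cookie", "id=" + "x" * 300),
]

print(value_length_percentiles(sample_headers))
```

With a large enough sample, the gap between the 50th and 99th percentiles shows how a small share of headers accounts for the very long values.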

The most interesting of these long values comes from the Set-Cookie and the Content-Security-Policy header names.

Set-Cookie header values

The Set-Cookie header sends a cookie from the server to the browser, which the browser then sends back to the server on later requests. Typically these cookies are a short identifier, but some are kilobytes long.

Some of these are misconfigurations and some of these push a lot of state to the browser, sometimes even whole databases of products. I recommend keeping cookies small.
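One way to spot oversized cookies is to measure the name=value pair of each Set-Cookie value and flag the large ones. The sketch below is a rough illustration: the function names are hypothetical, and the 1024-byte `limit` is an arbitrary threshold chosen for the example, not a recommendation from this post or a browser limit.

```python
def cookie_size(set_cookie_value):
    """Return (cookie name, size in bytes of the name=value pair).

    Attributes such as Path or HttpOnly are ignored, because only the
    name=value pair is sent back to the server on later requests.
    """
    pair = set_cookie_value.split(";", 1)[0].strip()
    name, _, _value = pair.partition("=")
    return name, len(pair.encode("utf-8"))

def oversized_cookies(set_cookie_values, limit=1024):
    """List (name, size) for cookies whose pair exceeds `limit` bytes."""
    sizes = (cookie_size(value) for value in set_cookie_values)
    return [(name, size) for name, size in sizes if size > limit]

# Hypothetical Set-Cookie values: a short identifier and a large state blob.
values = [
    "sid=abc123; Path=/; HttpOnly",
    "cart=" + "x" * 2000 + "; Path=/",
]

print(oversized_cookies(values))
```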

Content-Security-Policy header values

The Content-Security-Policy header defines directives that govern resource fetching, the state of a document, aspects of navigation, and reporting. For very simple sites the policy can be short, but for most sites it ends up quite long.

A small Content-Security-Policy

Let’s look at a small policy. The policy for https://www.w3.org/ is:

Content-Security-Policy: frame-ancestors 'self' https://cms.w3.org/ https://cms-dev.w3.org/; upgrade-insecure-requests

There are two directive names, separated by a semicolon: frame-ancestors and upgrade-insecure-requests.

The frame-ancestors directive restricts the URLs which can embed the page using <frame>, <iframe>, <object>, or <embed>. This avoids the risk of being embedded into potentially hostile contexts. The frame-ancestors directive values are:

  • 'self': the page can be embedded by pages from the same origin

  • https://cms.w3.org/: the page can be embedded by the content management system

  • https://cms-dev.w3.org/: the page can be embedded by the development content management system

The upgrade-insecure-requests directive asserts to the user agent that the owner intends a site to load only secure resources, and that insecure URLs ought to be treated as though they had been replaced with equivalent secure URLs.
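The structure described above (directives separated by semicolons, each a name followed by space-separated values) can be sketched as a small parser. This is a simplified illustration with a hypothetical `parse_csp` helper, not a complete implementation of the CSP parsing rules.

```python
def parse_csp(policy):
    """Split a Content-Security-Policy value into {directive name: [values]}."""
    directives = {}
    for part in policy.split(";"):
        tokens = part.split()
        if not tokens:
            continue  # skip empty segments, e.g. after a trailing semicolon
        name = tokens[0].lower()  # directive names are case-insensitive
        # Keep only the first occurrence of a repeated directive.
        directives.setdefault(name, tokens[1:])
    return directives

policy = (
    "frame-ancestors 'self' https://cms.w3.org/ https://cms-dev.w3.org/; "
    "upgrade-insecure-requests"
)

for name, values in parse_csp(policy).items():
    print(name, values)
```

The upgrade-insecure-requests directive takes no values, so it maps to an empty list.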

A long Content-Security-Policy

Now let’s look at a longer policy. The policy for https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Content-Security-Policy is:

Content-Security-Policy: default-src 'self'; script-src 'report-sample' 'self' https://www.google-analytics.com/analytics.js https://www.googletagmanager.com/gtag/js assets.codepen.io production-assets.codepen.io https://js.stripe.com 'sha256-EehWlTYp7Bqy57gDeQttaWKp0ukTTEUKGP44h8GVeik=' 'sha256-XNBp89FG76amD8BqrJzyflxOF9PaWPqPqvJfKZPCv7M='; script-src-elem 'report-sample' 'self' https://www.google-analytics.com/analytics.js https://www.googletagmanager.com/gtag/js assets.codepen.io production-assets.codepen.io https://js.stripe.com 'sha256-EehWlTYp7Bqy57gDeQttaWKp0ukTTEUKGP44h8GVeik=' 'sha256-XNBp89FG76amD8BqrJzyflxOF9PaWPqPqvJfKZPCv7M='; style-src 'report-sample' 'self' 'unsafe-inline'; object-src 'none'; base-uri 'self'; connect-src 'self' developer.allizom.org bcd.developer.allizom.org bcd.developer.mozilla.org updates.developer.allizom.org updates.developer.mozilla.org https://*.google-analytics.com https://*.analytics.google.com https://*.googletagmanager.com https://observatory-api.mdn.allizom.net https://observatory-api.mdn.mozilla.net https://api.github.com/search/issues stats.g.doubleclick.net https://api.stripe.com; font-src 'self'; frame-src 'self' interactive-examples.mdn.mozilla.net interactive-examples.mdn.allizom.net mdn.github.io live-samples.mdn.mozilla.net live-samples.mdn.allizom.net *.mdnplay.dev *.mdnyalp.dev *.play.test.mdn.allizom.net https://v2.scrimba.com https://scrimba.com jsfiddle.net www.youtube-nocookie.com codepen.io survey.alchemer.com https://js.stripe.com; img-src 'self' data: *.githubusercontent.com *.googleusercontent.com *.gravatar.com mozillausercontent.com firefoxusercontent.com profile.stage.mozaws.net profile.accounts.firefox.com mdn.dev interactive-examples.mdn.mozilla.net interactive-examples.mdn.allizom.net wikipedia.org upload.wikimedia.org https://mdn.github.io/shared-assets/ https://mdn.dev/ https://*.google-analytics.com https://*.googletagmanager.com www.gstatic.com; manifest-src 'self'; media-src 'self' archive.org videos.cdn.mozilla.net https://mdn.github.io/shared-assets/; child-src 'self'; worker-src 'self';

That’s a lot of directives! Let’s look at just one: the img-src directive restricts the URLs from which image resources may be loaded. In this case, images may be loaded from the origin itself ('self'), from inline data: URLs, or from a number of other hosts and URLs:

  • 'self'

  • *.githubusercontent.com

  • *.googleusercontent.com

  • *.gravatar.com

  • data:

  • firefoxusercontent.com

  • https://*.google-analytics.com

  • https://*.googletagmanager.com

  • www.gstatic.com

  • https://mdn.dev/

  • https://mdn.github.io/shared-assets/

  • interactive-examples.mdn.allizom.net

  • interactive-examples.mdn.mozilla.net

  • mdn.dev

  • mozillausercontent.com

  • profile.accounts.firefox.com

  • profile.stage.mozaws.net

  • upload.wikimedia.org

  • wikipedia.org

For example, the CSS <image> image-set() reference page loads an image https://mdn.github.io/shared-assets/images/examples/balloons-landscape.jpg from mdn.github.io, which is allowed by the policy.
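To make the check above concrete, here is a deliberately simplified sketch of how a URL might be compared against a list of sources. It is not the browser's actual matching algorithm: the helper names are hypothetical, ports are not handled, quoted keywords such as 'self' are skipped (they need the page's own origin), and a source without a scheme is assumed to accept any scheme.

```python
from urllib.parse import urlsplit

def source_matches(source, url):
    """Simplified check of one CSP source expression against a URL."""
    target = urlsplit(url)
    # Scheme-only sources such as "data:" or "blob:".
    if source.endswith(":") and "/" not in source:
        return target.scheme == source[:-1].lower()
    # Quoted keywords such as 'self' are not handled in this sketch.
    if source.startswith("'"):
        return False
    scheme, sep, rest = source.partition("://")
    if not sep:
        scheme, rest = "", source  # no scheme given in the source
    host, slash, path = rest.partition("/")
    path = slash + path
    if scheme and target.scheme != scheme.lower():
        return False
    target_host = (target.hostname or "").lower()
    host = host.lower()
    if host.startswith("*."):
        # A wildcard matches subdomains only, not the bare domain.
        if not target_host.endswith(host[1:]):
            return False
    elif target_host != host:
        return False
    if path and path != "/":
        if path.endswith("/"):
            return target.path.startswith(path)  # directory-style prefix
        return target.path == path  # exact file path
    return True

def allowed_by(sources, url):
    """True if any source in the list matches the URL."""
    return any(source_matches(source, url) for source in sources)

# A subset of the img-src sources from the policy above.
img_src = ["data:", "*.githubusercontent.com", "mdn.dev",
           "https://mdn.github.io/shared-assets/"]
image = ("https://mdn.github.io/shared-assets/images/examples/"
         "balloons-landscape.jpg")

print(allowed_by(img_src, image))
```

In this sketch the image is accepted because of the https://mdn.github.io/shared-assets/ entry, whose trailing slash makes it a path prefix.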

Most common directives

If I look at the largest policies, the top five directives are:

  • img-src, which restricts image loading URLs

  • connect-src, which restricts EventSource, WebSockets, and XMLHttpRequest URLs

  • script-src, which restricts script execution URLs

  • frame-src, which restricts <iframe> URLs

  • default-src, which is a fallback for the other fetch directives
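A ranking like this could be produced by counting how many policies use each directive. The sketch below assumes that reading of "top directives"; the `policies` list is invented for illustration and is not HTTP Archive data.

```python
from collections import Counter

def directive_counts(policies):
    """Count how many policies contain each directive name."""
    counts = Counter()
    for policy in policies:
        names = set()  # count each directive at most once per policy
        for part in policy.split(";"):
            tokens = part.split()
            if tokens:
                names.add(tokens[0].lower())
        counts.update(names)
    return counts

# Hypothetical policies standing in for a crawl.
policies = [
    "default-src 'self'; img-src 'self' data:",
    "img-src *; script-src 'self'",
    "img-src data:; default-src 'none'",
]

print(directive_counts(policies).most_common(2))
```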

Most common directive values for img-src

Let’s pick the most common directive, img-src. Its top ten values are:

  • data:

  • blob:

  • https://*.google-analytics.com

  • android-webview-video-poster:

  • *.facebook.com

  • https://googleads.g.doubleclick.net

  • *.fbcdn.net

  • *.cdninstagram.com

  • https://*.fbsbx.com

  • *.giphy.com

The analytics and advertising domains are almost always third parties. More interesting are the “CDN” domains: many sites have an entire domain for image assets.

Smaller policies

I recommend having smaller policies. You can think of every entry in these policies as a security exception. Simplicity is key to keeping security and web performance issues in check. If simplicity were the top priority for your website, then all of the assets would come from the website itself. When browsers have to open separate HTTP connections to many additional websites, page loading slows down significantly.

Andrew already articulated this recommendation in great detail in Taming third parties with a single-origin website. You can proxy other domains through Fastly so that they are swiftly served from your main domain. This means you can maintain a small, strict Content-Security-Policy and you can maximise web performance.