How Fastly builds POPs
Building a new point of presence (POP) from scratch involves all of the engineering groups within Fastly. Our data center infrastructure (DCI) team spearheads and coordinates the POP build from hardware procurement to putting the POP into production and serving traffic.
Today, we announced a new POP in Osaka, Japan — our second POP in the region. We’ll continue to add POPs in new regions as our customers and growth demand it. You can expect to see POPs in Brazil, South Africa, Spain, the United Arab Emirates, and India in the future. Check out our map of current and planned locations for more information.
We thought it might be interesting to share how we build a POP from the ground up:
Naming our POPs
It’s no secret that we use IATA airport codes to name our POPs. For example, our POP in Osaka, Japan is called “ITM.” Our POP in Stockholm, Sweden is “BMA.” Using these codes allows us (and our customers) to know exactly where each POP is located.
You can see these POP codes in the X-Served-By debug header. This is a quick and easy way to know exactly where, geographically, your request was served.
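As a quick illustration, here’s a small helper (our own sketch, not official Fastly tooling) that pulls the POP code out of an X-Served-By value, assuming the common cache-&lt;node&gt;-&lt;POP&gt; format:

```python
def pop_code(x_served_by: str) -> str:
    """Extract the IATA POP code from a Fastly X-Served-By value.

    Values typically look like "cache-itm18824-ITM". When a request
    traverses multiple caches (e.g. with shielding enabled), the hops
    are comma-separated and the last one answered the request.
    """
    last_hop = x_served_by.split(",")[-1].strip()
    return last_hop.rsplit("-", 1)[-1]

print(pop_code("cache-itm18824-ITM"))                     # ITM
print(pop_code("cache-bma1620-BMA, cache-itm18824-ITM"))  # ITM
```

Run curl with the -I flag against any Fastly-fronted site and you can feed the header value straight into a helper like this.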
Location and size
We determine the location and size of the POPs we build based on a few key parameters:
Population: The number of “eyeballs” that would use the POP. A POP in a major metropolitan area would be larger than one in a smaller city.
Data center and interconnection capabilities: Some parts of the world are better connected than others, so we strategically locate POPs where we can connect to the core of the Internet. You’ll note that most of our POPs today are near major Internet peering fabrics.
Customer/user demand: If a customer has a large user base in a location where we are building out a new POP, that POP may need to be sized to serve more of the customer’s traffic.
Once the location and size of the POP are determined, we begin the procurement process to order the hardware and ship it out to the data center.
Build out
Once the hardware arrives at the data center, our DCI team hits the ground running and begins unpacking and racking it.
All of the hardware is installed and wired by Fastly staff to ensure that we meet stringent internal quality controls; the hardware and cabling are double-checked and thoroughly tested before the DCI team signs off on the build and leaves the data center.
Connectivity
With the physical layer work completed, our Network Engineering team is responsible for bringing basic connectivity online to the POP. They will configure transit and peering links, establish BGP sessions, and bring up basic network functionality.
Bootstrap and configuration
At this point, the new POP is handed over to our Network Systems team for bootstrap, configuration, and testing.
This process is mostly automated, using a combination of in-house tools and Chef. As each machine is PXE-booted, it’s given a unique name, the operating system is installed, and all of the Fastly “secret sauce” is installed and configured.
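To make the naming step concrete, here’s a minimal sketch (the hostname scheme and helper are our own illustration, not Fastly’s actual tooling) of how a bootstrap system might hand each PXE-booted machine a unique, POP-scoped name:

```python
from itertools import count

def hostname_allocator(pop_code: str, role: str = "cache"):
    """Yield unique, POP-scoped hostnames such as "cache-itm-001".

    In a real bootstrap this would be backed by an inventory database
    so names survive restarts; a simple counter is enough to show the
    idea here.
    """
    for n in count(1):
        yield f"{role}-{pop_code.lower()}-{n:03d}"

alloc = hostname_allocator("ITM")
print(next(alloc))  # cache-itm-001
print(next(alloc))  # cache-itm-002
```

Each name encodes the machine’s role and POP, which keeps hostnames meaningful in monitoring and on-call tooling.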
Internal systems control every aspect of our CDN: from loading Varnish Configuration Language (VCL) onto the servers, to enabling Real-Time Purging and Real-Time Stats, we work with teams across engineering to get everything online. It takes a handful of GitHub pull requests coordinated by a few lead engineers to make it happen.
Once all of the machines in the POP are configured, they are thoroughly tested to ensure that they can handle the anticipated load.
Pre-flight checklists
There are lots of moving pieces between the physical build, the network build, and the systems build. Before the POP goes live, we run a series of “GO/NO-GO” meetings between our teams to make sure that every checkbox is checked, every subsystem is running, and every test is complete and passing. Even with automation, we check our work at every step to make sure we’ve completed all the required prerequisites.
Going live
This is where Fastly teamwork really shines. Deploying a new POP takes coordination with all the engineering teams at Fastly.
On the go-live date, we use an internal Slack channel to coordinate all of the necessary activities. Once the network and DNS configurations have been updated, the new POP will start taking traffic. We also update our status page, which lets our customers know that a new POP is going live. A short time later, our Customer Portal is updated to allow our customers to choose that POP for shielding.
All eyes are on our monitoring and alerting system to ensure that this new POP is performing as expected and that the appropriate traffic is now being sent to it.
Traffic engineering
With the new POP now available for serving requests, our Network Engineering team may need to do some Traffic Engineering (TE) to ensure that the optimal network paths are used to reach it. We keep a constant eye on performance and make sure our customers’ users reach the closest POP as efficiently as possible, which means constantly tweaking the way traffic is routed to the Fastly network.
We’re hiring
If you think that the above sounds exciting and you’d like to be a part of our team, check out our open positions — we’re always looking for talented people!