Take Back Control: Make AI Bots Play by Your Rules

Lorraine Bellon

Senior Product Marketing Manager, Security

Tracy Hinds

Fast Forward Program Lead

April 15, 2025

Security Product

Disclaimer: This post was written by humans, for humans.

Fastly Bot Management just got even better. Tired of AI bots scraping your data? Want to protect your IP from random AI crawlers? What’s the TL;DR? You are in control.

Fastly AI Bot Management can help you:

Understand which AI bots are crawling your content
Control which AI bots can crawl or scrape your content
Block any bots that take things without your consent
Stop AI bots from costing you money

The Open Web is under attack. Creating openly online, without barriers, feels in many ways like a dying art, yet it’s more valuable than ever. Every day, our customers raise the alarm about AI bots that scrape, consume, and learn from their intellectual property to evolve their own proprietary products, whether by building knowledge bases or by creating derivative content. These crawlers do this without the consent of the creator and without giving credit back to the original source. There’s no question that content creators are hurting. From bloggers and journalists to global free and open source projects with millions of users, creators are facing a critical inflection point. The honor system of publishing in the open is being eviscerated, and the benefit to the public good is being eroded for short-term gains.

Unauthorized scraping enables AI companies to exploit the valuable content their bots crawl, learn from, and adopt into their models, without giving the content owner the opportunity to consent. It’s not just a minor annoyance, either. This directly threatens the business models of organizations that rely on the value of their original content to generate revenue – and the livelihoods of the creators who do the work. AI bots are also overwhelming major free and open source projects, putting in jeopardy the open code and content work that 70% of the world relies on. As AI tools proliferate, the impact on content creators and hosting platforms grows exponentially. Without effective countermeasures, we risk a future where original content and code published on the Open Web lose their intrinsic value, and the organizations that depend on revenue and collaborative progress from that content can no longer sustain themselves.

Too many bots, too little time

To make matters worse, AI scraping can cause massive, unwanted traffic increases, which can degrade site performance for legitimate users and lead to bandwidth overage charges. The Wikimedia Foundation recently highlighted this issue’s severity. Their infrastructure, built for human traffic spikes, is suffering under relentless AI scraper bot attacks, and the costs and risks associated with unchecked AI content scraping are skyrocketing. Drew DeVault, a prominent open source community figure, stated bluntly, “Over the past few months, instead of working on our priorities at SourceHut, I have spent anywhere from 20-100% of my time in any given week mitigating hyper-aggressive LLM crawlers at scale.” Non-profit and open source organizations are particularly challenged by this problem because they are already resource-constrained.

What can be done to stop this? There are a few techniques in the toolbox. Traditional defenses like robots.txt files have proven ineffective against some AI bots: the types that crawl indiscriminately, ignoring established protocols and etiquette. This leaves content creators watching helplessly as their work is consumed and repurposed without permission or compensation.
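To make the robots.txt limitation concrete, here is a minimal sketch in Python of how a well-behaved crawler would consult those rules before fetching a page. The robots.txt contents, the example.com URL, and the is_allowed helper are illustrative assumptions, not a Fastly configuration; GPTBot is used only as an example of a user-agent token a site might name.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption for this sketch): ask one AI crawler
# to stay away while leaving the site open to every other user agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots.txt permits this user agent to fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/articles/1"
    print(is_allowed("GPTBot", page))       # False: the site asked this bot not to crawl
    print(is_allowed("Mozilla/5.0", page))  # True: everyone else is allowed
```

The important detail is who runs this check: the crawler itself. A bot that never calls anything like is_allowed is not slowed down by the file at all, which is why robots.txt on its own cannot stop indiscriminate scrapers.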

Even with existing bot management tools, pinpointing and mitigating specific AI scraper activity has been difficult. First, security teams need to be able to detect and identify the presence of AI bots. From there, they may wish to block them entirely or launch more sophisticated countermeasures to intercept or deceive the bots, or even enforce monetization. At the same time, they don’t want their protective measures to stop the AI bots they do welcome, like those that enable AI-powered search engine results. Creators need a way to manage AI bots strategically, letting the good ones through while blocking malicious or undesired AI bots.

Introducing Fastly AI Bot Management

Fastly AI Bot Management builds on the power of Fastly Bot Management, trusted by brands like JetBlue and Le Monde to protect their websites from attacks and keep systems resilient for customers. It gives you the power to manage and control the behavior of AI bots that crawl and scrape website content. Detect which AI bots are accessing your content, and take action to block, intercept, or allow particular AI bots based on your own unique policies and desired responses. It’s now available for all Fastly Bot Management customers, and major FOSS and Open Web projects, along with the non-profits that serve them, can add it at no charge through our Fast Forward program. Fastly delivers one million requests per second on behalf of the open source projects we support.

Figure 1: Fastly AI Bot Management

To make this possible, we’ve introduced new signals for two separate categories of verified AI bots.

AI Crawler

This signal identifies AI bots that crawl the internet building up knowledge, with or without consent from the content owner or attribution of credit.

AI Fetcher

This signal identifies bots that provide answers in real time with data found on the internet. Think about when you do an AI-powered Google search for “flu symptoms” or ask OpenAI’s ChatGPT to help you research a topic for a new blog post. These bots generally provide attribution to the website from which they are obtaining the information.

No one can verify the identity of a bot that does not provide verifiable methods to do so. What does that mean, exactly? A bot operator must publish a method to make its bot identifiable to others. Usually, this is a list of IP addresses that the operator attests that the bot will use exclusively. However, not all bot operators publish verifiable methods, particularly those that scrape content without consent or attribution.
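As a rough illustration of verification against a published IP list, the sketch below checks whether a request's source address falls inside the ranges an operator has attested to. It is a simplified assumption of how such a check could look, not a description of how Fastly implements verification; the bot name, the PUBLISHED_RANGES table, and the is_verified helper are hypothetical, and the addresses come from blocks reserved for documentation.

```python
import ipaddress

# Hypothetical published ranges for a hypothetical bot. These CIDR blocks are
# reserved for documentation and do not belong to any real bot operator.
PUBLISHED_RANGES = {
    "example-ai-crawler": ["192.0.2.0/24", "2001:db8::/32"],
}


def is_verified(bot_name: str, client_ip: str, published=PUBLISHED_RANGES) -> bool:
    """Return True if client_ip lies inside a range published for bot_name."""
    try:
        address = ipaddress.ip_address(client_ip)
    except ValueError:
        return False  # not a parseable IP address, so nothing to verify
    for cidr in published.get(bot_name, []):
        network = ipaddress.ip_network(cidr)
        # Compare only like with like (IPv4 vs. IPv6) before testing membership.
        if address.version == network.version and address in network:
            return True
    return False


if __name__ == "__main__":
    print(is_verified("example-ai-crawler", "192.0.2.17"))   # True: inside a published range
    print(is_verified("example-ai-crawler", "203.0.113.5"))  # False: claims the name, wrong IP
    print(is_verified("unlisted-bot", "192.0.2.17"))         # False: operator published nothing
```

The last case is the one the paragraph above describes: when an operator publishes no ranges, there is nothing to check a request against, so the bot's claimed identity cannot be confirmed.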

To address these unverifiable AI bots, we’ve added two other AI bot signals that identify suspected AI Crawler or AI Fetcher bots based on their user-agent information. Customers can take the same actions on these signals as they would on the verified signals.
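One way to picture the four signals and the actions attached to them is the sketch below. Everything in it is an assumption made for illustration: the signal labels, the user-agent tokens, and the policy table are hypothetical and are not Fastly's actual signal names or rule syntax. The ip_verified flag stands in for whatever verification result is available for the request.

```python
from enum import Enum


class Signal(Enum):
    # Hypothetical labels for the two verified and two suspected categories.
    VERIFIED_AI_CRAWLER = "verified-ai-crawler"
    VERIFIED_AI_FETCHER = "verified-ai-fetcher"
    SUSPECTED_AI_CRAWLER = "suspected-ai-crawler"
    SUSPECTED_AI_FETCHER = "suspected-ai-fetcher"
    NONE = "none"


# Hypothetical user-agent substrings; a real deployment would use a maintained list.
CRAWLER_TOKENS = ("examplecrawler",)
FETCHER_TOKENS = ("examplefetcher",)


def classify(user_agent: str, ip_verified: bool) -> Signal:
    """Map a request's user agent and verification result to a signal."""
    ua = user_agent.lower()
    if any(token in ua for token in CRAWLER_TOKENS):
        return Signal.VERIFIED_AI_CRAWLER if ip_verified else Signal.SUSPECTED_AI_CRAWLER
    if any(token in ua for token in FETCHER_TOKENS):
        return Signal.VERIFIED_AI_FETCHER if ip_verified else Signal.SUSPECTED_AI_FETCHER
    return Signal.NONE


# Example site policy (an assumption): welcome verified fetchers, block the rest.
POLICY = {
    Signal.VERIFIED_AI_FETCHER: "allow",
    Signal.VERIFIED_AI_CRAWLER: "block",
    Signal.SUSPECTED_AI_CRAWLER: "block",
    Signal.SUSPECTED_AI_FETCHER: "block",
}


def action_for(signal: Signal) -> str:
    """Look up the configured action; traffic with no AI signal is allowed."""
    return POLICY.get(signal, "allow")
```

For example, action_for(classify("ExampleFetcher/1.0", ip_verified=True)) yields "allow", while the same user agent arriving from an unverified address is classified as suspected and yields "block". Each site would choose its own table.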

Scrape-proof your valuable content

Protecting the Open Web and supporting the free exchange of knowledge is crucial. By empowering content creators and platforms to make informed choices about AI bot access, we can help preserve the integrity of online content and code, and ensure fair compensation and the opportunity to consent for those who produce it.

Looking to protect your IP and data, or gain better insights into what’s crawling your site? Chat with our team of security experts for a personalized demo to see what AI Bot Management can do for you! If you’re already using Fastly Bot Management, it’s easy to get started today just by using the new AI bot signals. If you’re a free and open source project or an organization that serves them, get in touch with us to enroll in Fast Forward and get protected – for free!