This conversation is imagined. Martijn Koster is very real — a Dutch software engineer who created robots.txt in 1994 and, by most accounts, the internet's first search engine. His actual words, where they exist in the public record, are cited. Everything else is constructed from what's known about his perspective, his temperament, and the dry pragmatism that runs through three decades of his public writing. He did not participate in this interview, has not endorsed it, and would probably find the whole exercise slightly odd.
In September 1993, a scrounged Sun 4/330 server at Nexor in the UK started behaving strangely. The logs showed the same sequence of documents being retrieved "at an enormous rate, in the order of 1 document per second."[1] A web crawler was hammering it.
The server belonged to Martijn Koster. Five months later, he posted a message to the www-talk mailing list — at the time, the primary communication channel for nearly everyone building the web[2] — proposing a simple text file that site owners could place on their servers to tell bots which parts to leave alone. He called it robots.txt. The proposal was modest by design: "This proposed standard doesn't require any server/client/protocol changes, and can provide a partial solution to problems caused by robots."[3]
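The entire mechanism fits in a few lines of plain text, and it is still parsed the same way today. A minimal sketch using Python's standard-library `urllib.robotparser`; the file contents and bot names here are illustrative, not taken from the original proposal:

```python
from urllib.robotparser import RobotFileParser

# A robots.txt in the spirit of the 1994 proposal: plain text,
# no protocol changes, purely advisory.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Disallow: /tmp/

User-agent: example-bot
Disallow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A well-behaved crawler asks before fetching; nothing forces it to.
print(parser.can_fetch("some-crawler", "/public/page.html"))   # True
print(parser.can_fetch("some-crawler", "/private/page.html"))  # False
print(parser.can_fetch("example-bot", "/public/page.html"))    # False
```

Note that the check is entirely on the crawler's side: the file expresses a preference, and compliance is a choice the bot makes.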
Thirty-two years later, robots.txt is used by over 500 million websites.[4] In 2025, a federal court in Manhattan called it "more akin to a sign than a barrier."[5] A peer-reviewed study found AI crawlers check the file less than 40% of the time.[6] And every serious proposal for governing AI's access to the web — RSL, ai.txt, llms.txt — is built on top of it.
We wanted to talk to Koster about signs, barriers, and what happens when a neighborly fix outlives the neighborhood.
When you wrote robots.txt, were you thinking about permissions at all?
Martijn: No. I was thinking about my server falling over. The web had maybe ten thousand sites.[7] I knew most of the people running crawlers, or I knew someone who knew them. The problem wasn't philosophical.
I should say — I'd built ALIWEB, which was essentially a search engine. So I understood why someone would want to crawl. I wasn't trying to stop crawling. I was trying to give crawlers a way to be told where not to go.
The www-talk list, the Geneva conference in May 1994 — the whole community was, what, a few hundred people?
Martijn: The Geneva conference had 380 attendees, and that felt enormous.[8] People were meeting email colleagues in person for the first time. You have to understand — when I posted the proposal, I literally asked everyone to keep the discussion focused and "not degenerate in a 'robots are good/bad' discussion that won't be resolved."[9] I knew these people. I knew their bots. The compliance model wasn't naive. It was appropriate. For that room.
And now a federal judge has described what you built as "more akin to a sign than a barrier."
Martijn: Well, it is a sign. That's what I made. A sign. Not a lock. Not a fence. A text file that says "please don't go here."
The judge got it exactly right. The question is why anyone expected a sign to do the work of a barrier.
Do you have a theory?
Martijn: For about twenty years, the sign worked. Not because signs are powerful, but because the people reading them had reasons to cooperate. Search engines needed good relationships with webmasters. If you ignored robots.txt, webmasters would block you other ways, or just complain loudly enough that it mattered. The incentives were aligned.
What changed isn't the sign. What changed is who's walking past it.
The Duke study found AI crawlers are the least likely category of bot to even check robots.txt — under 40% within a week-long window. And when the rules are stricter, compliance drops further.[6]
Martijn: That's the data confirming what everyone already suspected. But the interesting part isn't the non-compliance. It's the selectivity. They check when the rules are permissive. They stop checking when the rules say no. That's a protocol being read and deliberately ignored. Which is a very different problem from a protocol nobody reads.
RSL — Really Simple Licensing — launched in 2025 with over 1,500 endorsing organizations.[10] It extends robots.txt with machine-readable licensing terms. Cloudflare and Akamai can enforce it at the network level.[11] Does that change anything?
Martijn: It's clever. Building on robots.txt rather than replacing it — that's pragmatic, and I appreciate pragmatism. But notice what's actually happening. You're adding licensing terms to a text file, and then you need Cloudflare to enforce them. The enforcement doesn't live in the file. It lives in the infrastructure provider's decision to act on the file.
So the question becomes about Cloudflare's incentives. Not about the standard.
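As publicly described, RSL keeps the existing robots.txt in place and adds a pointer to a separate machine-readable file holding the licensing terms. A hypothetical sketch of that shape (the directive name and XML elements below are illustrative, not quoted from the RSL specification):

```text
# robots.txt: the access rules stay where they always were,
# plus a pointer to licensing terms
User-agent: *
Disallow: /private/
License: https://example.com/license.xml

<!-- license.xml (illustrative): terms that a compliant crawler,
     or an enforcing CDN, could act on -->
<rsl>
  <content url="/articles/">
    <license>
      <permits type="usage">search</permits>
      <prohibits type="usage">ai-train</prohibits>
    </license>
  </content>
</rsl>
```

The point Koster makes survives the sketch: nothing in either file stops a non-compliant crawler. Enforcement depends on an intermediary such as Cloudflare choosing to act on the terms.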
The sign still needs someone willing to enforce it.
Martijn: The sign always needed that. In 1994, the enforcement was social. Everyone knew everyone. Now it has to be structural — someone with a chokepoint has to decide it's worth their while. I was solving a problem between colleagues. That's a fundamentally different situation.
You participated in the effort to formalize robots.txt as an IETF standard. That became RFC 9309 in 2022 — twenty-eight years after your original proposal.[12] What took so long?
Martijn: [laughs] It worked without being a standard. That's the honest answer. Nobody needed it to be formal because everyone just... used it. Which is also why it's so hard to replace now. You can't deprecate something that was never officially anything.
It succeeded so completely that it became impossible to fix.
Martijn: Yes. And every proposal to replace it references it. ai.txt, llms.txt, RSL — they're all defined in relation to robots.txt. They either extend it or explain why they're different from it. The original file has become the conceptual frame for its own successors. Even the replacements carry its assumptions.
Which assumptions worry you most?
Martijn: That the problem is access control. Robots.txt asks: should this bot be allowed to visit this URL? But the real question in 2026 is what happens to the content after the bot visits. I can tell a crawler not to index a page. I cannot tell a language model not to learn from it. Robots.txt doesn't have a vocabulary for that. It was never meant to.
If you could go back to February 1994 and rewrite the proposal, would you change anything?
Martijn: No.
And I mean that. The proposal was right for the problem I had. A text file, no protocol changes, partial solution. I said "partial" in the original email. I was explicit about that.[3] The mistake was everyone else treating a partial solution as a complete one for thirty years.
Twenty thousand pages crawled for every single referral returned — that's one estimate of the current ratio for AI training crawlers. What replaces the sign?
Martijn: I don't know. But I'll tell you what won't work: a better sign.
Footnotes

1. Martijn Koster, "Robots.txt is 25 years old," greenhills.co.uk, 2019. https://www.greenhills.co.uk/posts/robotstxt-25/
2. Wikipedia, "robots.txt." https://en.wikipedia.org/wiki/Robots.txt
3. Martijn Koster, robots exclusion proposal posted to the www-talk mailing list, February 1994.
4. Koster, "Robots.txt is 25 years old": "robots.txt endures; it is used by over 500 million websites according to Google."
5. Ziff Davis v. OpenAI, No. 1:25-cv-04315 (S.D.N.Y. 2025), as cited in FKKS Technology Law, "Is Your Site's Robots.txt Giving Content to AI Models for Free?" December 23, 2025. https://technologylaw.fkks.com/post/102lz1g/is-your-sites-robots-txt-giving-content-to-ai-models-for-free
6. Taein Kim et al., "Scrapers Selectively Respect robots.txt Directives," ACM Internet Measurement Conference, 2025. https://dl.acm.org/doi/10.1145/3730567.3764471
7. Cybercultural, "1994: Cool Site of the Day and the rise of curated web design," February 17, 2026. https://cybercultural.com/p/1994-cool-site-of-the-day/
8. Computer History Museum, "'Woodstock of the Web' at 25," October 2019. https://computerhistory.org/blog/woodstock-of-the-web-at-25/
9. Koster, "Robots.txt is 25 years old."
10. Globe Newswire / RSL Collective, "RSL AI Licensing 1.0 Now an Official Industry Standard," December 10, 2025. https://www.globenewswire.com/news-release/2025/12/10/3203217/0/en/
11. Shelly Palmer, "AI Web-Scraping Gets an 'Official' Standard," December 11, 2025. https://shellypalmer.com/2025/12/ai-web-scraping-gets-an-official-standard/
12. RFC 9309, IETF, September 2022, as cited in Kim et al. (2025).
