The rise and fall of robots.txt
David Pierce,
The Verge,
Feb 16, 2024
This article is getting a lot of circulation on social media. It describes robots.txt, a text file that sits in the root directory of a web site (including mine) and tells crawlers what they may index and what they should avoid. As David Pierce reports, following robots.txt is not a legal requirement, but it is something considerate web crawlers and search engine indexers have long done. No more: AI engines are insatiable in their quest for data, and they run right through robots.txt as though it didn't exist. That's bad for the long-term future of the web. On the other hand, replacing robots.txt with something more enforceable is also bad for the long-term future of the web.
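For readers who haven't looked inside one, here is a rough sketch of how a considerate crawler might consult robots.txt before fetching a page, using Python's standard urllib.robotparser module. The file contents, the bot names (ExampleAIBot, FriendlyBot) and the example.com URLs are all made up for illustration; they are not taken from the article or from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt contents (illustrative only): one named bot is
# shut out entirely, everyone else is only asked to stay out of /private/.
ROBOTS_TXT = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Disallow: /private/
"""


def allowed(user_agent: str, url: str) -> bool:
    """Return True if the rules above permit user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines; a live crawler would
    # normally call set_url() and read() to download the file instead.
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("FriendlyBot", "ExampleAIBot"):
        for url in ("https://example.com/posts/1",
                    "https://example.com/private/notes"):
            print(agent, url, allowed(agent, url))
```

Note that the check happens entirely on the crawler's side. Nothing stops a crawler from skipping the call to can_fetch and requesting the page anyway, which is why the convention depends on good behaviour rather than enforcement.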