Illustration by Alex Castro / The Verge
The New York Times has blocked OpenAI’s web crawler, meaning that OpenAI can’t use content from the publication to train its AI models. If you check the NYT’s robots.txt page, you can see that the NYT disallows GPTBot, the crawler that OpenAI introduced earlier this month. Based on the Internet Archive’s Wayback Machine, it appears the NYT blocked the crawler as early as August 17th.
Screenshot by Jay Peters / The Verge
A snippet of the NYT’s robots.txt showing that the company has disallowed GPTBot.
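For readers curious how a robots.txt rule like this works in practice, here is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt text below is a hypothetical example written for illustration, not a copy of the NYT's actual file, and the `is_allowed` helper, the `SomeOtherBot` name, and the example.com URL are likewise assumptions. Note that robots.txt is a voluntary convention: it only affects crawlers that choose to honor it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only.
# It is NOT copied from the NYT's actual file.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://www.example.com/some-article"  # placeholder URL
    for agent in ("GPTBot", "SomeOtherBot"):
        print(agent, "allowed:", is_allowed(SAMPLE_ROBOTS_TXT, agent, url))
```

In this sketch, a `Disallow: /` rule under `User-agent: GPTBot` tells that one crawler to stay away from every path on the site, while other crawlers fall through to the wildcard entry.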
The change comes after the NYT updated its terms of service at the beginning of this month to prohibit the use of its content to train AI models. The NYT and OpenAI didn’t immediately reply to a request for comment.