A growing number of websites are taking steps to ban AI bot traffic so that their work isn’t used as training data and their servers aren’t overwhelmed by non-human users. However, some companies are ignoring the bans and scraping anyway.

Online traffic analysis conducted by BuiltWith, a web metrics biz, indicates that the number of publishers trying to prevent AI bots from scraping content for use in model training has surged since July.

About 5.6 million websites have now added OpenAI’s GPTBot to the disallow list in their robots.txt file, up from about 3.3 million at the start of July 2025. That’s an increase of almost 70 percent.
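For readers unfamiliar with the mechanism, here is a minimal sketch of what such a disallow entry looks like and how a well-behaved crawler would interpret it, using Python's standard `urllib.robotparser`. The robots.txt content, the `example.com` URL, the `SomeOtherBot` agent name, and the `is_allowed` helper are all hypothetical and chosen for illustration. The check is advisory: nothing in it stops a crawler that simply ignores the file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot everywhere, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    url = "https://example.com/articles/story.html"  # illustrative URL
    for agent in ("GPTBot", "SomeOtherBot"):
        print(f"{agent}: allowed={is_allowed(ROBOTS_TXT, agent, url)}")
```

A crawler that honours the file would run a check like this before each request and skip any URL for which it returns `False`.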

Websites can signal to visiting crawlers whether they allow automated requests to harv…
