Prevent ChatGPT from Scraping Your Site
August 11th, 2023 • filed under Programming
A short but sweet post. Ready?
OpenAI quietly published the crawler name/user agent for ChatGPT, creatively named GPTBot.
Now that we know the user agent, we can prevent it from crawling a site with robots.txt, like so:
User-agent: GPTBot
Disallow: /
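If you want to sanity-check the rule before deploying it, here is a minimal sketch using Python's standard-library urllib.robotparser. The helper name is_allowed and the example.com URL are made up for illustration; only the two robots.txt lines come from this post.

```python
from urllib.robotparser import RobotFileParser

# The robots.txt rule from this post, one directive per line.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if robots_txt lets user_agent fetch url (hypothetical helper)."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # example.com is a placeholder; swap in a URL from your own site.
    print("GPTBot allowed:", is_allowed("GPTBot", "https://example.com/any/page"))
    print("Other bot allowed:", is_allowed("SomeOtherBot", "https://example.com/any/page"))
```

This only tells you how a standards-following parser reads the file. Whether a crawler actually honors robots.txt is up to the crawler.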
OpenAI was also generous enough to provide a list of IP ranges their crawler will connect from, so go ahead and add these to your firewall rules, too:
22.214.171.124/28
126.96.36.199/28
188.8.131.52/28
184.108.40.206/28
220.127.116.11/28
18.104.22.168/28
22.214.171.124/28
126.96.36.199/28
188.8.131.52/28
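If you would rather check addresses in application code (or just want to test a range list before pasting it into a firewall), here is a rough sketch using Python's standard-library ipaddress module. The ranges in BLOCKED_RANGES are placeholders taken from the reserved documentation blocks, not OpenAI's ranges, and is_blocked is a made-up helper name; substitute the list OpenAI publishes.

```python
import ipaddress

# PLACEHOLDER ranges from the reserved documentation blocks (RFC 5737).
# These are NOT OpenAI's ranges; replace them with the published list.
BLOCKED_RANGES = ["192.0.2.0/28", "198.51.100.16/28"]

# Parse each CIDR string once up front. ip_network() raises ValueError
# if a range has host bits set, which catches typos in the list early.
NETWORKS = [ipaddress.ip_network(cidr) for cidr in BLOCKED_RANGES]


def is_blocked(ip: str) -> bool:
    """Return True if ip falls inside any configured range (hypothetical helper)."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in NETWORKS)


if __name__ == "__main__":
    for candidate in ("192.0.2.5", "192.0.2.16", "203.0.113.1"):
        print(candidate, "blocked" if is_blocked(candidate) else "allowed")
```

A /28 covers 16 addresses, so 192.0.2.0/28 matches 192.0.2.0 through 192.0.2.15 and nothing beyond that.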