Update BadBotBlocker to include crawlers that collect training data for LLMs
fisharebest committed Oct 11, 2023
1 parent f64ed0a commit abf0d18
Showing 1 changed file with 5 additions and 0 deletions.
5 changes: 5 additions & 0 deletions app/Http/Middleware/BadBotBlocker.php
@@ -71,9 +71,13 @@ class BadBotBlocker implements MiddlewareInterface
'Barkrowler',
'BLEXBot',
'Bytespider',
'CCBot', // Used to train a number of LLMs
'ChatGPT-User', // Used by ChatGPT during operation
'DataForSEO',
'DataForSeoBot', // https://dataforseo.com/dataforseo-bot
'DotBot',
'FacebookBot', // Collects training data for Facebook's LLM translator.
'Google-Extended', // Collects training data for Google Bard
'GPTBot', // Collects training data for ChatGPT
'Grapeshot',
'Honolulu-bot', // Aggressive crawler, no info available
@@ -83,6 +87,7 @@ class BadBotBlocker implements MiddlewareInterface
'MegaIndex.ru',
'MJ12bot',
'netEstate NE',
'Omgilibot', // Collects training data for LLMs
'panscient',
'PetalBot',
'proximic',
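
A minimal sketch of how a list like this could be checked against an incoming `User-Agent` header. The diff does not show how `BadBotBlocker` performs the match, so the constant `BAD_ROBOTS`, the function `isBadRobot()`, and the choice of case-insensitive substring matching are all assumptions made for illustration, not the project's actual implementation.

```php
<?php

declare(strict_types=1);

// Hypothetical subset of the user-agent fragments listed in the diff above.
const BAD_ROBOTS = [
    'CCBot',
    'ChatGPT-User',
    'FacebookBot',
    'Google-Extended',
    'GPTBot',
    'Omgilibot',
];

/**
 * Return true when the User-Agent string contains any blocked fragment.
 *
 * Assumption: a case-insensitive substring test is good enough here;
 * the real middleware may match differently.
 *
 * @param string        $userAgent Raw User-Agent header value
 * @param array<string> $badRobots Fragments that identify unwanted crawlers
 */
function isBadRobot(string $userAgent, array $badRobots = BAD_ROBOTS): bool
{
    foreach ($badRobots as $robot) {
        if (stripos($userAgent, $robot) !== false) {
            return true;
        }
    }

    return false;
}
```

A middleware could call `isBadRobot()` with the request's `User-Agent` header and return an error response instead of passing the request on when it returns `true`.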