[fpc-pascal] Effect of blocking AI crawlers in forums and wikis
Martin Frb
lazarus at mfriebe.de
Sat Sep 6 17:53:38 CEST 2025
On 06/09/2025 17:20, Wayne Sherman via fpc-pascal wrote:
> Something similar can happen when search engine crawlers are blocked.
> How useful would Google Search be for Free Pascal related queries if
> none of the Free Pascal websites were indexed?
>
> What do you think? How would FPC and Lazarus related queries perform
> on an LLM with no training data for FPC and Lazarus in the form of
> articles, answers, and discussions?
As far as I understand, neither search engines nor AI crawlers are
blocked by default.
The pages have a robots.txt, and any bot that obeys it can crawl (afaik).
Bots that don't obey it get blocked.
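To make "obeys robots.txt" concrete, here is a minimal sketch of the check a well-behaved crawler would do before fetching a page, using Python's standard urllib.robotparser. The rules, the bot name and the example.org URLs are made up for illustration; they are not the actual robots.txt of any FPC or Lazarus site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
"""


def may_crawl(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given rules allow `user_agent` to fetch `url`."""
    parser = RobotFileParser()
    # Parse the rules from text; a real crawler would download
    # /robots.txt from the target site first.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    print(may_crawl("ExampleBot", "https://example.org/wiki/Main_Page"))
    print(may_crawl("ExampleBot", "https://example.org/private/page"))
```

A bot that skips this check and fetches disallowed paths anyway is the kind that ends up blocked.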
On the wiki there is Anubis, but some bots are exempt from the
check, and in principle bots can solve the challenge, the same as normal
browsers do for their users. Anubis does not block access. It just adds
a cost, and that cost is tiny unless you keep getting your IP banned,
keep coming back from many different IPs, and have to solve it many
thousands (or 100k) of times. Then the cost starts to accumulate.
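To illustrate how a tiny per-visit cost adds up, here is a rough sketch of a generic hash-based proof-of-work. It is not Anubis's actual algorithm or configuration: the challenge string, the difficulty value and the function names are all assumptions chosen for the example.

```python
import hashlib
from itertools import count


def verify(challenge: str, nonce: int, difficulty: int) -> bool:
    """Check that SHA-256(challenge + nonce) starts with `difficulty` hex zeros."""
    digest = hashlib.sha256(f"{challenge}{nonce}".encode()).hexdigest()
    return digest.startswith("0" * difficulty)


def solve(challenge: str, difficulty: int) -> tuple[int, int]:
    """Brute-force a valid nonce; return (nonce, number of hashes tried)."""
    for nonce in count():
        if verify(challenge, nonce, difficulty):
            return nonce, nonce + 1


def expected_hashes(difficulty: int, solves: int) -> int:
    """Average total hashes: each hex digit multiplies the work by 16."""
    return (16 ** difficulty) * solves


if __name__ == "__main__":
    nonce, tried = solve("demo-challenge", 2)  # hypothetical challenge
    print(f"one solve: nonce={nonce}, hashes tried={tried}")
    print("expected hashes, 1 solve:   ", expected_hashes(2, 1))
    print("expected hashes, 100k solves:", expected_hashes(2, 100_000))
```

A browser that solves the challenge once and keeps its session pays the cost once. A crawler that shows up from a fresh IP every time pays it on every visit, which is where the 100k multiplier comes from.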
There are companies that provide crawl services, using that many
different IPs from all over the world. Those companies know that they
break the content providers they target (they go to great lengths to keep
breaking others' services). They don't care.
But as long as there are other companies that respect robots.txt,
those abusers will lose out eventually, as they will not have the
data to compete. Their problem.