alcinnz

I'm wondering about deploying Nepenthes on my personal site. To punish any AI crawlers.

zadzmo.org/code/nepenthes/

On the one hand I'm wary about tanking my search rankings, on the other... I'm not finding the major search engines much good anymore anyways!

I do like SearchMySite, but I *think* it's smart enough to not be tripped up by it.

O.K., I've got a clear answer already!

A few people are concerned about the wastefulness, but most are keen to see them poisoned! And reflecting upon my motivation to not be DoS'd again...

I think I'll configure aggressive ratelimits 1st, and then I'm seeing some more tools to choose between...
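For reference, one common way to implement a rate limit is a token bucket. The following is only a minimal sketch of that general technique, not the configuration of any particular web server or of the tools named in this thread; the class name `TokenBucket` and its parameters are hypothetical, and a real deployment would more likely use the rate limiting built into the server or reverse proxy.

```python
import time


class TokenBucket:
    """Illustrative per-client rate limiter (hypothetical helper)."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity    # start with a full bucket
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        """Return True if a request may proceed, False if it should be rejected."""
        now = time.monotonic() if now is None else now
        elapsed = max(0.0, now - self.updated)
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In this sketch one bucket would be kept per client (for example per IP address), and a request for which `allow()` returns False would get an HTTP 429 response.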

I considered Poison the WeLLMs before...

@skyfaller @OliverUv @lucabtz

Iocaine: git.madhouse-project.org/alger
Quixotic: marcusb.org/hacks/quixotic.htm
Poison the WeLLMs: codeberg.org/MikeCoats/poison-

@alcinnz I saw a comment about that software saying you burn CPU cycles to make some other software burn CPU cycles. In the end it is a lot of wasted resources that end up warming the planet.

Made me rethink it

@lucabtz Fair complaint.

I would deploy it in the defensive setting rather than offensive, which should minimize that drawback.

My goal is to avoid getting DoS'd again, so I'm open to other suggestions!

@alcinnz I'm not sure I know enough to evaluate the strengths and weaknesses of different approaches, but have you seen iocaine?

git.madhouse-project.org/alger

@skyfaller Thanks for the links, I'm investigating!

@alcinnz any search engine worth its salt honors robots.txt, so you can use that to protect them. These tools were specifically made to punish those who do not honor robots.txt

As far as I'm aware, Google still honors it

@OliverUv The issue as I understand it is that any bot would see the Nepenthes tarpit, even well-behaved ones. So unless the crawler's limiting how much of your site it visits...

Or we could ask well-behaved bots not to crawl the tarpit, I'm guessing the misbehaved ones won't honor that hint...
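A rough sketch of that idea, assuming the tarpit is mounted under a hypothetical `/tarpit/` path: a robots.txt `Disallow` rule tells compliant crawlers to stay out, and Python's standard `urllib.robotparser` can show how such a crawler would interpret it. The path, the `ExampleBot` user agent, and the `allowed` helper are made up for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt asking every crawler to skip the tarpit path.
ROBOTS_TXT = """\
User-agent: *
Disallow: /tarpit/
"""


def allowed(path, agent="ExampleBot"):
    """Return True if a robots.txt-compliant crawler may fetch `path`."""
    parser = RobotFileParser()
    parser.parse(ROBOTS_TXT.splitlines())
    return parser.can_fetch(agent, path)


print(allowed("/blog/post"))     # regular content stays crawlable
print(allowed("/tarpit/page1"))  # compliant bots are told to stay out
```

Crawlers that ignore robots.txt never run a check like this, so they would be the only ones wandering into the tarpit.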