Captcha on searches?
Bots are ruining everything!
NFG #1
Member since Sep 2006 · 120 posts
Subject: Captcha on searches?
Apparently I have 5,000+ visitors every day, even though actual human visitors are like 1% of that. Bots looping endlessly through the search pages are completely screwing with the stat count, raising bandwidth usage, and so on.

Is it possible to easily add a simple 'check here if you're human' checkbox to the search page, so bots are blocked?  This sort of simple mechanism completely stopped spam on one of my other sites, so I imagine it'd work well here.
Yves (Administrator) #2
User title: UNB developer & webmaster
Member since Jan 2004 · 3814 posts · Location: Erlangen, Germany
A simple checkbox may be checked by the bots as well. An image CAPTCHA (like on registration or guest posting) may be a bit too much of a burden for regular users. There are ways in between, like simple text tasks or maths questions. I have no experience with those, nor would I know what helps and how it should be designed to work well for everybody (users, disabled users, crazy bots, evil bots). But in principle there's session support already, so it shouldn't be too hard a technical problem to put such a check into the search process. It just hasn't been done yet. I could imagine adding two plug-in hooks for that (one to display the question and another one to evaluate its answer) so there could be different methods.
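
To make the two-hook idea concrete, here is a rough sketch of a session-backed maths question. It is not UNB code: the function names and the 'search_challenge' session key are made up for illustration, and it assumes PHP 7 or later (for random_int and scalar type hints).

```php
<?php
// Hypothetical sketch, not part of UNB. One function per proposed hook.

// "Display" hook: build a question and remember the expected answer
// in the session array that is passed in (for example $_SESSION).
function searchChallengeCreate(array &$session): string
{
    $a = random_int(1, 9);
    $b = random_int(1, 9);
    $session['search_challenge'] = $a + $b;
    return "What is $a plus $b?";
}

// "Evaluate" hook: compare the submitted answer with the stored one.
// The stored answer is discarded after one attempt, so it cannot be replayed.
function searchChallengeVerify(array &$session, string $answer): bool
{
    if (!isset($session['search_challenge'])) {
        return false;
    }
    $expected = $session['search_challenge'];
    unset($session['search_challenge']);

    $answer = trim($answer);
    // Accept only one or two digits; anything else fails outright.
    return preg_match('/^\d{1,2}$/', $answer) === 1
        && (int) $answer === $expected;
}
```

In a real page this would be called as searchChallengeCreate($_SESSION) after session_start() when rendering the search form, and searchChallengeVerify($_SESSION, ...) with the submitted value before running the search.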
♪ ...nanananah, all in all we’re just brilliant thieves, nanananah... ♪♬
NFG #3
As I mentioned, I use a checkbox on a contact form, and it stopped the bots dead in their tracks.  Most of them are blind scripts, and if it fails no one comes to check why 'cause there's a million other forums to annoy.  Even if they work out they should check the box, simply changing the form name of the box is often enough to stop them again, until someone rewrites the script.  Which basically never happens once, never mind twice.  =)

A checkbox is also the least intrusive manual option for the users.  It's a single click, no thinking, no effort.  A session check would definitely be better, but that assumes the bots aren't using capable browser substitutes...  But even restricting searches to logged-in users would work, keeping the bots at bay.
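
A minimal sketch of the checkbox approach, with the field name kept in one constant so it can be renamed when a bot script adapts. This is not UNB code: HUMAN_FIELD and both function names are hypothetical, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical sketch, not part of UNB.
// Changing this one constant renames the form field everywhere.
const HUMAN_FIELD = 'human_check_1';

// Returns the HTML for the "I am human" checkbox.
function humanCheckboxHtml(): string
{
    return '<label><input type="checkbox" name="' . HUMAN_FIELD
        . '" value="1"> Check here if you are human</label>';
}

// Returns true only if the submitted request data has the box ticked.
function humanCheckboxTicked(array $request): bool
{
    return isset($request[HUMAN_FIELD]) && $request[HUMAN_FIELD] === '1';
}
```

The search handler would call humanCheckboxTicked($_GET) or humanCheckboxTicked($_POST), depending on how the form is submitted, and refuse the search when it returns false.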

I added rel="nofollow" and the bots still followed it.  I excluded the search in robots.txt and they still followed it.  I had 5728 'guests' hammering the searches yesterday.
Yves (Administrator) #4
Can you identify a certain user-agent name of the offending bot? If so, you could lock it out by a .htaccess file.
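
For the .htaccess route, a minimal sketch could look like the following. It assumes Apache with mod_rewrite enabled, and "ExampleBot" is only a placeholder for whatever user-agent string actually shows up in the logs.

```apache
# Assumption: mod_rewrite is available. "ExampleBot" is a placeholder name.
RewriteEngine On
# Match the user-agent header case-insensitively ...
RewriteCond %{HTTP_USER_AGENT} ExampleBot [NC]
# ... and answer every matching request with 403 Forbidden.
RewriteRule .* - [F,L]
```

This only helps while the bot sends an identifiable user-agent; a bot that fakes a browser string will pass straight through.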

Or you might try with the following line in search.inc.php right after the if (!defined... line:

if (!$UNB['Client']['is_browser']) die('Die, bot, die!');

(Untested, may require small corrections.)

This relies on the board's user-agent recognition to classify browsers and bots. To avoid false positives, a browser is everything that is not a bot, and a bot is recognised by a couple of signatures, amongst them the major search engines and some other (older) crap.
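
The classification rule described here (a bot is anything matching a known signature, a browser is everything else) can be sketched roughly like this. The function name and the signature list are illustrative assumptions, not the board's actual code, and PHP 7 or later is assumed.

```php
<?php
// Hypothetical sketch; UNB's real signature list and function names
// are not shown in this thread.
function isBrowser(string $userAgent, array $botSignatures): bool
{
    foreach ($botSignatures as $signature) {
        // Case-insensitive substring match against each known bot signature.
        if (stripos($userAgent, $signature) !== false) {
            return false;
        }
    }
    // No signature matched: treat the client as a browser,
    // which keeps false positives against real users low.
    return true;
}

// Example values only; "ExampleBot" is a placeholder.
$signatures = ['Googlebot', 'ExampleBot'];
$client = ['is_browser' => isBrowser($_SERVER['HTTP_USER_AGENT'] ?? '', $signatures)];
```

The trade-off is that any bot missing from the list counts as a browser, so the list needs a new entry whenever an unknown bot turns up.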
NFG #5
I added that line and it seems fine.  I also added Chrome to the list of browsers, and explicitly added the bot that's annoying me in this case (a fake Majesticle?) as a bot.

So far testing indicates things are working fine.  I'll keep an eye on the logs.  =)

Over 8,000 bot hits yesterday.  Gotta stop that crap.

UPDATE: I've added an extra bit of code to track the action:

if (!$UNB['Client']['is_browser']) {
    UnbAddLog('A bot was killed here.');
    die('Die, bot, die!');
}

I'm getting bot death notices every three or four seconds for long stretches.  Looks like a total success.  =)
This post was edited on 2010-08-17, 06:38 by NFG.
NFG #6
Subject: Total success!
Total success!  Thanks Yves.

Killing the bots this way has cut unnecessary searches to zero. The logs are now filled with actual page views (by the same combination of bots and search engines, of course), with no more endless search loops.
This board is powered by the Unclassified NewsBoard software, 20110527-dev, © 2003-2011 by Yves Goergen