Blocking Java suckers
There are a number of bots using Sun’s Java implementation. I found one of their IP numbers on a list of honeypot-trapped IP numbers for e-mail harvesting.
So I’m banning the suckers.
Here’s how you can do it, in .htaccess:
SetEnvIfNoCase User-Agent Java/1.4. spambot=yes
SetEnvIfNoCase User-Agent Java/1.5. spambot=yes
deny from env=spambot
The reason I’m not banning Java outright and being done with it is that it might be used for legitimate bots as well. For more background, read the Webmasterworld thread on this.
Update April 18
I found an entry in my log that had been blocked. Not a good thing, because it was a link checker from Dmoz. The user agent (in this case):
TulipChain/6.03 (http://ostermiller.org/tulipchain) Java/1.4.2_05 (http://apple.com/) Mac_OS_X/10.3.9
I’ve been put in the bookmarks section of an editor there, so that’s why the link checker came by. I think I need to change the .htaccess. I’ll see what I can figure out.
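One way the .htaccess could be changed, as a rough sketch rather than a tested fix: anchor the patterns to the start of the User-Agent header, so only clients sending the bare default Java string are flagged, and clear the flag again for the link checker. The anchored patterns and the TulipChain exemption are assumptions based on the log entry above; `!spambot` is the mod_setenvif syntax for unsetting a variable.

```apache
# Flag only user agents that START with the default Java string.
# Dots are escaped so they match a literal "." and nothing else.
SetEnvIfNoCase User-Agent ^Java/1\.4\. spambot=yes
SetEnvIfNoCase User-Agent ^Java/1\.5\. spambot=yes

# Assumed exemption: clear the flag again for the Dmoz link checker.
# This must come after the lines above, since they are applied in order.
SetEnvIfNoCase User-Agent TulipChain !spambot

deny from env=spambot
```

Either measure on its own should let the TulipChain entry above through, since its user agent does not begin with Java/. The exemption line would also work with the original unanchored patterns.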
March 25th, 2005 at 5:01 am
Ann Elizabeth, does that go in the mod_rewrite rules?
BTW, I see you’re using WordPress. You might like ScriptyGoddess’ Subscribe to Comments plugin:
http://www.scriptygoddess.com/archives/2004/06/03/wp-subscribe-to-comments/
March 25th, 2005 at 5:30 am
Nah, I’ve got a comment feed. That should be enough for now.
March 25th, 2005 at 4:09 pm
Oops, missed a question.
No, it’s not part of mod_rewrite. It’s part of mod_setenvif.
It can also be done with BrowserMatch and a few other directives.
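For what it’s worth, the BrowserMatch form would look roughly like this. BrowserMatchNoCase is mod_setenvif’s shorthand for SetEnvIfNoCase User-Agent; the patterns are simply copied from the post.

```apache
# Same effect as: SetEnvIfNoCase User-Agent Java/1.4. spambot=yes
BrowserMatchNoCase Java/1.4. spambot=yes
BrowserMatchNoCase Java/1.5. spambot=yes
deny from env=spambot
```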
July 20th, 2006 at 3:15 pm
First off, thank you. This is much easier than typing in each offender’s IP. Have you come up with a solution to allow good bots? What if your trick were part of an allow,deny rule?
July 20th, 2006 at 3:52 pm
Yikes Jonathan!
Your stumbleupon page gets some traffic, huh?
Umm, the htaccess tricks won’t even work on all servers. I have only come up with a solution to block each type of bot. If you come up with something better, feel free to contribute!
July 20th, 2006 at 4:24 pm
Ha-ha! Actually, I wrote that about my blog.
So far there have been no more visits from Java programs. I didn’t know how else to put it in the .htaccess file, so I made it part of a Limit argument, and following that I had the allow,deny setting. Should I add good Java bots along with allow all? For instance:
July 21st, 2006 at 4:10 pm
[…] 12 Spamhuntress: “ Blocking Java Suckers” […]
December 13th, 2006 at 11:43 am
I added
SetEnvIfNoCase User-Agent Java/1.4. spambot=yes
SetEnvIfNoCase User-Agent Java/1.5. spambot=yes
and
[quote]
order allow,deny
allow from all
deny from env=spambot
[/quote]
to my .htaccess file in November, and then wondered why I was getting “client denied by server configuration: /home/xxxxxx/public_html/modules/frozen_bubble/frozenBubble.jar” errors every time someone tried to use the applet.
So I removed (actually, commented out) those lines from my .htaccess file, and now all is well. What did I do wrong?
December 13th, 2006 at 11:45 am
Let me try that again. Inside [quote] and [/quote] it should have said:
<Files ~ "^.*$">
order allow,deny
allow from all
deny from env=ban
deny from env=spambot
</Files>
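A likely explanation, offered as a guess: the log line shows the .jar request itself being denied, and the Java plug-in that downloads an applet typically identifies itself with a user agent containing Java/1.x, so it matches the spambot patterns. One possible sketch is to clear the flag for .jar requests before the deny rule is applied. The Request_URI exemption is an assumption, not something from the original post, and it would also let any Java bot fetch .jar files.

```apache
SetEnvIfNoCase User-Agent Java/1.4. spambot=yes
SetEnvIfNoCase User-Agent Java/1.5. spambot=yes

# Assumed exemption: applets are fetched by the Java plug-in itself,
# so unset the flag for any request path ending in .jar.
SetEnvIf Request_URI "\.jar$" !spambot

<Files ~ "^.*$">
# With "allow,deny" the allow rules are checked first, then the deny
# rules, and a matching deny wins.
order allow,deny
allow from all
deny from env=ban
deny from env=spambot
</Files>
```

An applet that loads other files from the same server (images, separate .class files) would presumably need those paths exempted in the same way.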