Comment spam trainee?

I got a couple of google.com referrers in my log, and got suspicious. ALWAYS check out those google referrers with no search terms!

It appears to be a comment spam trainee. Not quite making it work.

80.77.90.229
which is from hqhost.net. Traditionally a linkspammer lair.
Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)

It’s the same pattern over and over: coming in to a few (often the same) pages with http://www.google.com as the referrer, then finally going to one post, then trying to post a comment with that post as the referrer. And although they get a 200 status code, the comment isn’t successfully entered.
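If you want to pull these hits out of a log automatically, here is a minimal Python sketch. It assumes the log is in Apache's "combined" format, and the names `LOG_RE`, `is_bare_google_referrer` and `suspicious_hits` are made up for this example. It only flags Google referrers that carry no `q` search parameter; a hit is a reason to look closer, not proof of spam.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumption: Apache "combined" log format. Adjust the pattern for other formats.
LOG_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) \S+ '
    r'"(?P<referrer>[^"]*)" "(?P<agent>[^"]*)"'
)


def is_bare_google_referrer(referrer):
    """True if the referrer is a google.com URL without a `q` search term."""
    parts = urlsplit(referrer)
    host = (parts.hostname or "").lower()
    if host != "google.com" and not host.endswith(".google.com"):
        return False
    return "q" not in parse_qs(parts.query)


def suspicious_hits(lines):
    """Yield (ip, path, user agent) for every hit with a bare Google referrer."""
    for line in lines:
        match = LOG_RE.match(line)
        if match and is_bare_google_referrer(match.group("referrer")):
            yield match.group("ip"), match.group("path"), match.group("agent")
```

Feeding it an open log file, for example `for hit in suspicious_hits(open("access.log")): print(hit)`, lists the candidates; the file name here is only a placeholder.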

4 Responses to “Comment spam trainee?”

  1. Olliver Says:

    I’ve seen quite a lot of these test runs over the past years, the “Google check” being only one of them. Other variants may include:
    an obviously faked or empty user agent string from the IP range of hosting companies (like ThePlanet and EV1)
    continuously changing IP addresses (typically open proxies) with the same user agent hitting a particular page that is normally not requested very often
    visitors hitting the site with continuously changing user agents (switching between bots and browsers)
    visitors with browser user agents making HEAD requests
    a couple of entries in Apache’s error log complaining about HTTP violations, like sending HTTP 1.1 without a Host header or GET requests containing backslashes
    Some of these spambots even reveal an odd sense of humour, for instance only spamming posts related to, well, referrer spam :-) . In any case, the test run will either include a URL that looks unsuspicious or contain no referrer at all.

    The bot nature will become apparent if you look into Apache’s log files, as none of the css, js, or image files embedded in the delivered page will be requested, just the page/post itself, because in most cases the bots are too dumb to understand HTML. Also, a lot of the HTTP headers usually sent with a browser request are missing (in most cases only the Host, User-Agent and Referer headers are defined). So checking for missing headers in requests made by typical browser user agents may be a criterion for filter rules (mod_rewrite would be the preferred choice here).
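    The two checks described above can be sketched offline in Python, as a rough illustration rather than a replacement for mod_rewrite rules. All names here are hypothetical, and the list of asset suffixes and of "usual" browser headers are assumptions to tune for your own site. Browsers serving assets from cache and text-only clients will also show up as page-only visitors, so treat the result as a hint.

```python
from collections import defaultdict

# Assumption: these suffixes cover the assets embedded in your pages.
ASSET_SUFFIXES = (".css", ".js", ".png", ".gif", ".jpg", ".jpeg", ".ico")

# Assumption: headers a real browser normally sends along with a page request.
BROWSER_HEADERS = ("accept", "accept-language", "accept-encoding")


def page_only_clients(requests):
    """Return the IPs that requested pages but never an embedded asset.

    `requests` is an iterable of (ip, path) pairs already parsed from the log.
    """
    pages = defaultdict(int)
    assets = defaultdict(int)
    for ip, path in requests:
        bare_path = path.split("?", 1)[0].lower()  # ignore the query string
        if bare_path.endswith(ASSET_SUFFIXES):
            assets[ip] += 1
        else:
            pages[ip] += 1
    return sorted(ip for ip in pages if assets[ip] == 0)


def looks_like_fake_browser(headers):
    """True if the user agent claims to be a browser but sends none of the usual headers.

    `headers` is a dict of request header names to values.
    """
    names = {name.lower() for name in headers}
    agent = next((v for k, v in headers.items() if k.lower() == "user-agent"), "")
    claims_browser = agent.startswith("Mozilla/")
    return claims_browser and not any(h in names for h in BROWSER_HEADERS)
```

    Note that a standard access log does not record request headers, so the second check only works where you can see the full request, for example inside the application that handles the comment form.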

  2. Paulo Says:

    I’ve found that “google.com” also comes up as a referrer when someone gets to your site via a click on “I’m Feeling Lucky,” so you really have to check the originating server.

  3. jon Says:

    Wow, so my trackback spam didn’t work?

  4. Search Engines Web :- O Says:

    ///// ALWAYS check out those google referrers with no search terms!

    What does a GOOGLE referrer or an MSN referrer with no search terms mean? Are they in fact falsely created referrer strings? Does certain information in a referrer NEUTRALIZE duplicates - in other words, does creating a false referrer in a string eliminate the default info due to the limitations of server log technology?

    Those two Search Engines seem to produce a relatively large number of them - but Yahoo seldom does.

    When doing a “feel lucky” search, the referrer keywords did in fact appear.
