Archive for the 'Search engine spam' Category

I’m feeling lucky spam

Saturday, January 5th, 2008

I got a spam tonight that had a URL that led to Google. That piqued my curiosity, so I checked it out and figured out why it worked.

The spam was of the Canadian pharmacy variety. Just the usual stuff. So I tried recreating the URL, pointing to one of my sites.

Their URL was a variant on this:

http://google.com///search?hl=en&q=ann+elisabeth&btnI=5437

But it does work even without the ID at the end. So what’s the point? In my tests I couldn’t see the referrer when I used a similar URL, so it can’t be for referrers, unless they have tools that retain more info than regular referrer logs.

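The &btnI= parameter at the end triggers Google’s “I’m Feeling Lucky” option, which skips the results page and sends the visitor straight to the top result. Which means that if the spammers feel safe their site will be returned as the lucky result for that query, this sort of spam works.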
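For anyone who wants to spot these links in their own mail, here is a minimal Python sketch that checks for the btnI parameter on a Google search URL. The function names are my own, and the host check is an assumption about which domains you care about, not a complete list of Google hosts.

```python
from urllib.parse import urlsplit, parse_qs


def is_lucky_link(url):
    """Return True if the URL is a Google search link carrying btnI."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()
    # Assumption: only google.com and its subdomains are checked here.
    on_google = host == "google.com" or host.endswith(".google.com")
    # keep_blank_values=True so that a bare "btnI=" is still detected.
    params = parse_qs(parts.query, keep_blank_values=True)
    return on_google and "btnI" in params


def search_terms(url):
    """Return the decoded q= value, or an empty string if there is none."""
    return parse_qs(urlsplit(url).query).get("q", [""])[0]


if __name__ == "__main__":
    spam_url = "http://google.com///search?hl=en&q=ann+elisabeth&btnI=5437"
    print(is_lucky_link(spam_url), search_terms(spam_url))
```

The value after btnI= doesn’t matter to the check, which matches what I saw: the link works with or without the ID.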

Scraper sells off sites

Monday, August 28th, 2006

Ajay got a bit hot under the collar regarding a site that stole his content via an RSS scraper. That site is now for sale, apparently as is.

Stolen Content, Sitepoint and Host turn blind eye!

BTW, I’ve had some questions about why I don’t like my entire articles being scraped and republished without my permission. Besides the obvious (I want people to come to my site rather than thinking it’s over somewhere else), there’s another reason:

I often update my posts, sometimes even weeks and months after I first wrote them. If new information comes to my attention, or I got something wrong, I will update the post. I don’t like the thought of another site having an entire copy of an outdated post, without my edits on it!

Fourth fake Spamcop site

Saturday, August 26th, 2006

I’ve written about a few fake Spamcop sites.

The fourth is abusecenter.org.

It’s on 82.179.172.131, which holds maybe 50 Italian language websites.

I checked one of them, and it had Google websearch results, complete with links to the cache (which has a Google IP). But people won’t see that page, and won’t click on those links (which would tip Google off), because human visitors are redirected through a tricky obfuscated JavaScript (not the same script as the other fake Spamcop sites I’ve seen so far). That JavaScript sends you to

http://js.gbeb.cc/advertizing/?ref=

There’s an even trickier redirect on there, that will spit you out to abusecenter.org if you’re not coming directly from a search engine.

But I’m not going to try coming from a search engine - at least not on this machine. Because I found some Italians talking about a trojan on that IP, and they mentioned the site I tried specifically. This was yesterday, and the Babelfish translation isn’t good enough to figure out exactly what they’re complaining about. I did figure out they’re complaining about the search engine spam this group is committing, though.

So, this MIGHT be another spammer, with a similar MO. I haven’t been able to find any throwaways pointing to this version, so I don’t know for sure what’s going on.

Whois:

Domain Name:ABUSECENTER.ORG
Created On:26-Jul-2006 12:28:40 UTC
Last Updated On:26-Jul-2006 12:28:46 UTC
Expiration Date:26-Jul-2007 12:28:40 UTC
Sponsoring Registrar:EstDomains, Inc. (R1345-LROR)
Registrant Name:Josef Gehringer
Registrant Organization:none
Registrant Street1:Lexington Avenue 91 47
Registrant City:NEWARK
Registrant State/Province:New Jersey
Registrant Postal Code:07175
Registrant Country:US
Registrant Phone:+1.2012246424
Registrant Phone Ext.:
Registrant FAX:
Registrant FAX Ext.:
Registrant Email: admin@abusecenter.org
Name Server:MANAGEDNS1.ESTHOST.COM
Name Server:MANAGEDNS2.ESTHOST.COM

And this one too was registered with EstDomains, though the other domains were registered somewhere else. And the IP is from St. Petersburg, Russia.
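If you look at a lot of these records, it helps to pull the fields out automatically. Below is a minimal Python sketch that splits “Key:Value” whois lines like the ones above into a dictionary and works out how old the domain was on a given day. The function names are mine, and the date format is an assumption based on this one record; other registries format their output differently.

```python
from datetime import date, datetime

# A few lines copied from the whois record above.
SAMPLE = """Domain Name:ABUSECENTER.ORG
Created On:26-Jul-2006 12:28:40 UTC
Sponsoring Registrar:EstDomains, Inc. (R1345-LROR)
Name Server:MANAGEDNS1.ESTHOST.COM
Name Server:MANAGEDNS2.ESTHOST.COM"""


def parse_whois(text):
    """Map each field name to the list of values seen for it."""
    record = {}
    for line in text.splitlines():
        # Split on the first colon only, since timestamps contain colons too.
        key, sep, value = line.partition(":")
        if sep:
            record.setdefault(key.strip(), []).append(value.strip())
    return record


def domain_age_days(record, on_day):
    """Days between the 'Created On' date and on_day."""
    # Assumption: the timestamp looks like '26-Jul-2006 12:28:40 UTC'.
    created = datetime.strptime(record["Created On"][0], "%d-%b-%Y %H:%M:%S UTC")
    return (on_day - created.date()).days


if __name__ == "__main__":
    record = parse_whois(SAMPLE)
    print(record["Sponsoring Registrar"][0])
    print(domain_age_days(record, date(2006, 8, 26)))  # 31
```

Fields that repeat, such as Name Server, end up as lists with more than one entry, which is why every value is stored in a list.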

What is search engine spam?

Thursday, August 24th, 2006

There’s a project named Web Spam Test Collections. They’ve got some definitions of what web spam is. Actually, what they’re describing is search engine spam, i.e. spammy directories etc. Useful definitions (of search engine spam).

Web spam is of course much more than what they’re describing here, just to be accurate. It includes the actual spamming of interactive webservices such as blogs, forums and guestbooks.

Tricky stuff, these definitions, ey?

Hat tip to Threadwatch.

Why free webhosts need to get rid of spammer redirects

Monday, August 21st, 2006

I’ve been talking about the blasted redirects for some time now. They’re not new. This has been going on for a looong time. What was new to me was the use of unauthorized uploads.

But it’s time to focus on free website services and redirects.

Why is that so bad?

Free webservices are generally based on one business idea: Ad revenue.

And that’s not what the spammers are after. They want surfers to see THEIR ads, and nobody else’s. Which breaks the free webservice business model. Which means the free webhosts need to break the redirects, and preferably scan their hard drives for websites using those kinds of redirects and boot them off the service.
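A scan like that doesn’t have to be fancy to be a start. Here is a rough Python sketch of the idea: run each hosted file’s source through a handful of patterns that commonly show up in redirect pages. The pattern list and the names are my own assumptions, not a vetted signature set, and well-obfuscated scripts will slip past it, so anything it flags still needs a human look.

```python
import re

# Hypothetical heuristics; a real scanner would need a much longer list.
REDIRECT_PATTERNS = {
    "location assignment": re.compile(
        r"\b(?:window\.|document\.|top\.)?location(?:\.href)?\s*=", re.I),
    "location.replace": re.compile(r"\blocation\.replace\s*\(", re.I),
    "meta refresh": re.compile(
        r"<meta[^>]+http-equiv\s*=\s*[\"']?refresh", re.I),
    "eval of decoded string": re.compile(
        r"\beval\s*\(\s*unescape\s*\(", re.I),
}


def redirect_signals(source):
    """Return the names of all patterns found in one file's source text."""
    return [name for name, pattern in REDIRECT_PATTERNS.items()
            if pattern.search(source)]


if __name__ == "__main__":
    page = '<script>window.location = "http://example.invalid/";</script>'
    print(redirect_signals(page))
```

Plenty of legitimate pages use redirects too, so a hit only means “look closer”, not “boot this account”.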

My latest analysis was of alice.it. There’s a JavaScript file pointed to from xoomer.alice.it/put0/3/, which contains a convoluted but easy to decipher script that does the redirect. Not understanding Italian, I couldn’t find a way to notify them. If there are any Italian speaking people here, could you help? Comment here if you notify alice.it.

Manila Industries location

Saturday, August 19th, 2006

Manila Industries first came to my attention as a spammer. But later on it’s gotten a lot of folks riled up as an outfit that buys domains people forgot to renew. The domains are then used to earn ad income.

Today, someone left a new edit on the Manila Industries wiki page where contact info was added. I peeked at the logs, and he or she has been thinking about this for close to a week before deciding on adding that information.

Here’s the text from the wiki page:

In speaking with someone at Manila Industries named Jill, who thought I was a prospective job candidate for the legal department (with extensive trademark experience, as she requested), I was provided the following contact information.

Jill Johnson
Manila Industries
714-920-9883

60 Palatine 112
irvine,ca 92612

3845 S bristol 628
Santa Ana, CA 92704

The Santa Ana address has been seen before in whois info, but the Irvine address is new. I also checked the phone number. It’s (provided the number hasn’t been ported somewhere else) a Nextel phone registered in Anaheim.

I checked satellite images on Google maps and yellow pages listings. The Irvine address is also the location of 24-7 Radiology. There’s a residential area nearby, but I’m not quite sure what the house in question is. Looks a bit scuzzy from orbit ;-) (Eek, that Google thing isn’t completely housebroken. Next time I searched, the arrow was somewhere else. This time it’s inside the nice gated residential area! If I search for the address, it’s the scuzzy area, and if I search for the Radiology place, it’s the nice area). The Santa Ana location has the arrow pointing at a parking lot. But I’m guessing it’s a mall, and there’s a Nextel retailer in the yellow pages with the same address.

SEO hacking cpanel

Monday, August 14th, 2006

There’s a thread on Search Engine Watch puzzling over server side search engine cloaking of an innocent third party’s website (thanks Joe for the tip).

After the conversation had died down, Brian White (works for Matt Cutts at Google) came around and told them:

“…We’ve discovered that the likely explanation is that a third party gained access to a number of sites and dropped files in these accounts (including a modified .htaccess using rewrite rules) for the purpose of rewriting the home page through a proxy script. The proxy script adds links when Googlebot visits, and in a sinister twist, adds the rel=nofollow link to cap off PageRank bound for any external URL not under control of this third party. As Danny noted, they also add a NOARCHIVE meta tag to disable the cached version in results…”

“…We don’t know how the third party got the files on the webhosts, but cPanel seems to be the common denominator. We’re in touch with some hosts who appear be affected by this….”
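If you suspect one of your own sites has been hit this way, one way to check is to save the home page twice, once as a normal browser sees it and once as served to a Googlebot user agent, and compare the two. The Python sketch below does only the comparison, on two HTML strings you supply. The class and function names are mine, and it looks for just the two symptoms mentioned in the quote: links that only Googlebot gets, and a NOARCHIVE robots meta tag.

```python
from html.parser import HTMLParser


class PageFacts(HTMLParser):
    """Collects link targets and robots meta directives from one page."""

    def __init__(self):
        super().__init__()
        self.links = set()
        self.robots = set()

    def handle_starttag(self, tag, attrs):
        attr = {name: (value or "") for name, value in attrs}
        if tag == "a" and attr.get("href"):
            self.links.add(attr["href"])
        elif tag == "meta" and attr.get("name", "").lower() == "robots":
            for directive in attr.get("content", "").split(","):
                self.robots.add(directive.strip().lower())


def page_facts(html):
    parser = PageFacts()
    parser.feed(html)
    parser.close()
    return parser


def cloaking_signals(html_for_browser, html_for_googlebot):
    """Compare the two versions of a page and report suspicious differences."""
    browser = page_facts(html_for_browser)
    bot = page_facts(html_for_googlebot)
    return {
        "links_only_for_googlebot": sorted(bot.links - browser.links),
        "noarchive_for_googlebot": "noarchive" in bot.robots,
    }
```

An empty list and False means the two copies agree on these points. It doesn’t prove the site is clean, since a proxy script could key on the visitor’s IP address instead of the user agent.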

I guess it was bound to happen. Hacking for SEO…

Dirty subdomains on popular sites

Wednesday, August 2nd, 2006

Just a week ago, I received a tip about a site that had lots of spammy pages. The root site was a business site that at least in the past had seemed solid. But the spammy pages seemed unrelated, and had also been spamvertized.

While we were looking, the pages disappeared from the site and from Google. I never found out if the site had been hacked or if the spammy pages were there on purpose.

And here’s a related story from May 2006. Syndic8 got “tricked” into accepting and promoting spammy subdomains, with the resulting fallout.

Jeff from Syndic8 wrote a blog post about his stupidity.

I thought it was a good tale, and hopefully it will make another webmaster think twice about going down that path.

Proof of harvesting

Friday, July 21st, 2006

I just got an e-mail to a spamtrap requesting a return link from one of my websites to: mybaby.net.au

They’d already added a link to one of my pages on a subpage they specified.

The thing is, that e-mail address is not connected with that website at all. …except it’s displayed (on the index page) below an image with this text:

The e-mail address below is a spamtrap. Do NOT e-mail. I feed all e-mail to that address to my spamfilters, unread.

None of my visitors have ever tried e-mailing me there…

So, it’s a scraper site, and should be blocked from Google. They boast PR3-4 and rising, so they (almost) know what they’re doing.

Microsoft research into webspam

Wednesday, July 19th, 2006

Microsoft just released a research project they think can catch webspamming attempts.

Strider Search Defender: Automatic and Systematic Discovery of Search Spammers through Non-Content Analysis

I also found their application Strider URL Tracer interesting. Too bad it doesn’t look secure, from what I’m seeing. Keep in mind that the visitors to my blog are likely to use this tool for the most risky tasks. Anything that uses the actual Internet Explorer to visit questionable sites isn’t secure. They say you should use a virtual system or non-mission critical machine in order to do Batched Scanning. I’d heed that warning…

They married it to Internet Explorer because of its other uses, of course.