Talk:Omni-explorer
From Spamhuntress
I'm locking all the talk pages that get misused by spammers. If you've got something to talk about regarding this page, do it on this universal talk page, and make a note of which article you're commenting on. I'll transfer your contribution to the right page manually.
Aleksandar Vukelja
This Czech claims to have been involved in the development of CarsCrawler for omni-explorer.com.
http://proxima-soft.com/o_proximi.html
The guy has another site, http://www.masstheory.org, to promote his physics theories (which aren't getting much traction [1] ;), but in light of the heavy crawling so many of us have experienced, the site's legal disclaimer is interesting:
http://www.masstheory.org/legal.html
In particular, this part:
I am inviting readers to do as I do: If you have something that you basically want to give for free, then do not be afraid of people. Make it as free as the air that you breath. Without some hidden charge, without explicit or hidden threat. Do it generously. Be happy if you succeed in helping the common thing of us all.
OmniExplorer also unescapes the URIs it finds before it moves on to spider them, without re-escaping the result.
So something along the lines of:
http://www.example.com/getDoc?s=name%20like%20%27%25adept%25%27
will be converted by their spider into
http://www.example.com/getDoc?s=name+like+"%adept%"
which, when submitted to most servers, will 404, since %ad is interpreted as an escaped byte of decimal value 173 (which appears to be defined as a "soft hyphen").
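For anyone who wants to reproduce the effect, here is a minimal sketch in Python using only the standard urllib.parse module. It is an illustration of the decode-without-re-encode mistake, not OmniExplorer's actual code, and the variable names are made up for the example. Note that Python's unquote yields spaces and apostrophes rather than the plus signs and double quotes shown above; the part that matters is the bare %ad left in the string.

```python
from urllib.parse import quote, unquote, unquote_to_bytes

# Query value exactly as it appears in the published link.
original = "name%20like%20%27%25adept%25%27"

# Assumed crawler behaviour: decode once, then request the decoded text as-is.
decoded = unquote(original)

# The server decodes the request a second time, so the bare "%ad"
# is now read as a single escaped byte (0xAD, decimal 173).
server_sees = unquote_to_bytes(decoded)

# What a well-behaved crawler would send instead: the re-escaped value,
# which round-trips back to the original link.
re_escaped = quote(decoded, safe="")

print(decoded)
print(server_sees)
print(re_escaped)
```

The design point is simply that a URI should be decoded at most once between the page it was found on and the request that fetches it, or be re-escaped before the request is made.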
When did it become acceptable for spider authors to ignore the primary RFCs by which any of this Internet crap actually works?
These guys are clueless amateurs. Block 'em.
We banned it too
I've had several run-ins with this bot.
- Firstly, it's ravenous. It downloads more than all the other search engines combined.
- Secondly, it thinks it's clever. I've seen examples where it appears to be indexing javascript: links, and poorly at that. I saw an error in my logs for "/link/" + variable + "/whatever" (not exactly that, but the details are not important).
- Thirdly, despite being around for ages, they have not informed AWStats (a major stats package) of their presence, which means my logs now show thousands of 'real' visits that didn't really happen.
- Fourthly (and this is the point at which we banned it), it somehow got access to another user's session, or found some other way into a secure section of the site (requiring login authentication), and deleted an article from our site. What REALLY pisses me off is that the action requires a javascript dialog confirmation and the bot "clicked" Yes. It's simply common sense not to index javascript, yet this bot is essentially greedy and doesn't seem to care what it does.
Omni-Explorer / WordIndexer or whatever you are - you are permanently banned from all sites we manage!
SpliFF
