Jun 24, 2007
A few days ago I wrote a quick and nasty script that emails me a dump of the PHP $_SERVER variables. The purpose was to find out some connection information from a client - for whatever reason his IP address wasn't being detected properly, and I needed some more information to work with.
I put this script online, tested that it was working properly, and asked the client to visit a specific URL. About 2 days later, the script sent me an email, as expected. Except the details of the connection were for the Alexa crawler, not the client.
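For illustration only, here is a minimal sketch of that kind of diagnostic script. The original was written in PHP and is not reproduced here; this is a rough Python analogue that assumes the request variables arrive as a dictionary (much like $_SERVER). The function name `build_report` and the addresses are hypothetical.

```python
from email.message import EmailMessage


def build_report(environ, to_addr="me@example.com"):
    """Build an email whose body is a sorted dump of the request variables.

    `environ` is assumed to be a dict of request variables, similar to
    PHP's $_SERVER. Sending the message is left to the caller.
    """
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["From"] = "debug@example.com"  # hypothetical sender address
    # Put the visitor's address in the subject so it can be read at a glance.
    msg["Subject"] = "Request dump from " + str(environ.get("REMOTE_ADDR", "unknown"))
    # One "NAME: value" line per variable, sorted for easy comparison.
    lines = [f"{key}: {value}" for key, value in sorted(environ.items())]
    msg.set_content("\n".join(lines))
    return msg
```

Including the user agent in the dump is what makes a crawler visit easy to tell apart from a real visitor.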
The obvious explanation is that the Alexa widget of my Searchstatus Firefox extension (this extension is a must-have for web developers) phoned home to Alexa and let them know about this new page, and Alexa then sent its bot out to crawl it.
Lessons learned
This whole experience isn't really remarkable, but I was surprised at how quickly Alexa came through and crawled this URL. The take-home lesson is to always password protect any page you don't want found by random crawlers.
So many developers put pages on their sites and don't link to or publish the URL anywhere - and then consider this "hidden" page to be safe. It's not.
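As a rough sketch of what password protecting a page can mean in practice, the function below checks an HTTP Basic Authorization header against expected credentials. It is written in Python for illustration. The function name `is_authorized` and the idea of passing the credentials in as arguments are assumptions, not something from the original script.

```python
import base64
import hmac


def is_authorized(auth_header, username, password):
    """Return True if an HTTP Basic Authorization header matches the credentials."""
    if not auth_header:
        return False
    scheme, _, encoded = auth_header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        # The header carries base64("username:password").
        decoded = base64.b64decode(encoded.strip(), validate=True).decode("utf-8")
    except ValueError:
        # Covers malformed base64 and undecodable bytes alike.
        return False
    expected = f"{username}:{password}"
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(decoded.encode("utf-8"), expected.encode("utf-8"))
```

A request that fails this check would normally get a 401 response with a `WWW-Authenticate: Basic` header, so a crawler that stumbles on the URL sees nothing useful. The web server's own password protection does the same job without any code.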
Comments
Chris Giddings - Aug 21, 2007
Wouldn't creating a directory and setting it aside specifically for testing/etc. be as useful if the directory was excluded from crawling by a .htaccess file or had the proper meta-tagging?
Alex Mielus - Sep 4, 2007
A passworded directory would be ok for testing I think. I haven't really bothered about this until now. Maybe I will when my sites are hacked ... dunno :)
Vincent AM - Mar 14, 2008
Such crawlers tend to multiply nowadays; you are right in saying that hidden doesn't mean secured. Nice article.

Naresh Kumar - Jul 10, 2007
Very good article you have released, but I have some other techniques for improving the rank of the site.