Alexa bot on the ball

A few days ago I wrote a quick and nasty script that emails me a dump of the PHP $_SERVER variables. The purpose was to find out some connection information from a client - for whatever reason his IP address wasn't being detected properly, and I needed some more information to work with.
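
For anyone wanting to do the same, here is a minimal sketch of that kind of script. It is not the exact script I used: the function name `build_server_dump` and the address `me@example.com` are made up for illustration, and it assumes PHP's `mail()` is configured on the server.

```php
<?php
// Build a plain-text dump of the request details, one "KEY: value" per line.
// The array is passed in as a parameter so the function can be tried with sample data.
function build_server_dump(array $server): string
{
    ksort($server); // sort by key so the email is easier to scan
    $lines = [];
    foreach ($server as $key => $value) {
        // Some entries (for example argv) are arrays, so encode non-scalars as JSON.
        $lines[] = $key . ': ' . (is_scalar($value) ? (string) $value : json_encode($value));
    }
    return implode("\n", $lines) . "\n";
}

// Only send the email when handling a real web request, not when run from the command line.
if (PHP_SAPI !== 'cli') {
    // Hypothetical recipient address: replace with your own.
    mail('me@example.com', 'Visitor connection details', build_server_dump($_SERVER));
}
```

When chasing an IP detection problem, entries such as `REMOTE_ADDR` and, if the visitor sits behind a proxy that sets it, `HTTP_X_FORWARDED_FOR` are probably the first ones to look at in the resulting email.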

I put this script online, tested that it was working properly, and asked the client to visit a specific URL. About 2 days later, the script sent me an email, as expected. Except the details of the connection were for the Alexa crawler, not the client.

The most likely explanation is that the Alexa widget of my Searchstatus Firefox extension (this extension is a must-have for web developers) phoned home to Alexa when I tested the script and let them know about this new page. Alexa then sent its bot out to crawl it.

Lessons learned

This whole experience isn't really remarkable, but I was surprised at how quickly Alexa came through and crawled this URL. The take-home lesson is to always password-protect any page you don't want random crawlers to find.

So many developers put pages on their sites and don't link to or publish the URL anywhere - and then consider this "hidden" page to be safe. It's not.
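
As a rough illustration of the password-protection advice, here is one way to put HTTP Basic authentication at the top of a PHP page. Treat it as a sketch under assumptions: the names `PAGE_USER`, `PAGE_PASS` and `credentials_ok` are hypothetical, the credentials are placeholders, and it relies on PHP filling in `$_SERVER['PHP_AUTH_USER']` and `$_SERVER['PHP_AUTH_PW']`, which may not happen under some CGI/FastCGI setups.

```php
<?php
// Hypothetical credentials for illustration only; keep real ones out of the source file.
const PAGE_USER = 'debug';
const PAGE_PASS = 'change-me';

// Return true only if both the username and the password match.
function credentials_ok(?string $user, ?string $pass): bool
{
    // hash_equals() compares in constant time; run both checks before combining them.
    $userOk = hash_equals(PAGE_USER, (string) $user);
    $passOk = hash_equals(PAGE_PASS, (string) $pass);
    return $userOk && $passOk;
}

// Only enforce the check when handling a real web request.
if (PHP_SAPI !== 'cli') {
    if (!credentials_ok($_SERVER['PHP_AUTH_USER'] ?? null, $_SERVER['PHP_AUTH_PW'] ?? null)) {
        // Ask the browser (or crawler) for credentials and stop here.
        header('WWW-Authenticate: Basic realm="Private"');
        http_response_code(401);
        exit('Authentication required.');
    }
    // ... the private page continues here ...
}
```

A crawler that turns up without credentials gets a 401 response and nothing else. Basic authentication only encodes the password rather than encrypting it, so this is best served over HTTPS.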

Tags: alexa