
Remember: Check Your Robots.txt File!

When I launched Early Miser, I simply copied over my robots.txt file from Best Buys Zone. This was a mistake, and since it is one of my own, let's learn from it.

The robots.txt file was over 4 years old and included numerous exclusions. As I have stated before, Best Buys Zone is an extremely brittle application from my viewpoint. (Yes, I designed the site 4 years ago.) All caching is MySQL based, so it hits the database especially hard. I have a single box serving it, so this isn't much of a problem. It was NEVER designed to be hugely popular, merely to explore the role of RSS in comparison shopping. (That said, traffic varies between 2,500 and 3,000 visitors a day.)

Because the application was more brittle than I would like, I excluded Teoma, Ask and Looksmart from crawling the site. I remember that Ask was an especially hungry bot, and I was getting no traffic from ask.com in return. So no problem, I thought: "It's only a thought experiment anyway." After letting the thing run for a few years, when I got Early Miser ready to launch I decided, "Hey, the robots.txt file for bestbuyszone.com worked great. I will use that one." I had forgotten that it excluded a whole bunch of crawlers, including Ask, Looksmart and Google Image Search. Looksmart, with Furl.net, is looking like a pretty good source of traffic.

I didn't really notice anything, of course, until I began using Google Webmaster Tools. I had loaded up the robots.txt analysis tool and was looking at Early Miser's robots.txt. At first I wondered why I was banning large parts of the web from indexing the site. Then it all came rushing back to me. After slapping myself on the forehead, I quickly corrected the issue, since earlymiser.com is a pretty robust application with a much more robust caching scheme.
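If you want to run the same kind of check without Google's tool, here is a minimal sketch in Python using the standard library's urllib.robotparser. The robots.txt contents, the user-agent tokens and the blocked_agents helper are all assumptions for illustration; this is not the actual file from either site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt in the spirit of the copied file. The real file's
# rules and exact user-agent tokens are not shown in this post.
ROBOTS_TXT = """\
User-agent: Teoma
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow:
"""


def blocked_agents(robots_txt, agents, url="/"):
    """Return the agents from `agents` that may not fetch `url`."""
    parser = RobotFileParser()
    # parse() accepts the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    crawlers = ["Teoma", "Googlebot-Image", "Googlebot"]
    for agent in blocked_agents(ROBOTS_TXT, crawlers):
        print("blocked:", agent)
```

With the sample file above, Teoma and Googlebot-Image come back as blocked while Googlebot does not. To check a live site instead, RobotFileParser also has set_url() and read() for fetching the file directly.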

So the lesson here is pretty clear: check your robots.txt file periodically. You need to compare it against any new bots that have arisen, so check your log files and the appropriate forums at WebmasterWorld.com. I probably wouldn't have caught this if I hadn't been using Google's Webmaster Tools, so I always recommend that tool. I am interested in seeing how much traffic will come to earlymiser.com now that Looksmart, Ask.com and others will actually index the site!
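For the log file part, a rough sketch along these lines can list which crawlers are actually visiting. It assumes an access log in the common "combined" format, where the user agent is the last quoted field on each line; the BOT_HINTS list and the count_bot_agents name are my own guesses for illustration, so adjust them to whatever shows up in your logs.

```python
import re
from collections import Counter

# Assumes the "combined" access log format, where the user agent is the
# last double-quoted field on the line.
UA_PATTERN = re.compile(r'"([^"]*)"\s*$')

# Hypothetical substrings that suggest a crawler; extend as needed.
BOT_HINTS = ("bot", "crawl", "spider", "slurp", "teoma")


def count_bot_agents(log_lines):
    """Count requests per user agent for agents that look like crawlers."""
    counts = Counter()
    for line in log_lines:
        match = UA_PATTERN.search(line)
        if not match:
            continue  # line is not in the expected format
        agent = match.group(1)
        if any(hint in agent.lower() for hint in BOT_HINTS):
            counts[agent] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: python bots.py access.log
    with open(sys.argv[1], encoding="utf-8", errors="replace") as log:
        for agent, hits in count_bot_agents(log).most_common():
            print(hits, agent)
```

Any agent in that list that your robots.txt blocks is worth a second look: either the block still makes sense, or it is a leftover like mine.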
