Thalasar Ventures

Remember to Check Your Robots.txt File!

When I launched Early Miser, I simply copied over my robots.txt file from Best Buys Zone. This was a mistake, and since it is one of my own, let’s learn from it.


The robots.txt file was over 4 years old and included numerous exclusions. As I have stated before, Best Buys Zone is an extremely brittle application from my viewpoint. (Yes, I designed the site 4 years ago.) All caching is MySQL based, so it hits the database especially hard. I have a single box serving it, so this isn’t much of a problem. It was NEVER designed to be hugely popular, merely to explore the role of RSS in comparison shopping. (That said, traffic varies between 2500 and 3000 visitors a day.)
Because the application was more brittle than I would like, I excluded Teoma, Ask and Looksmart from crawling the site. I remember that Ask was an especially hungry bot, and I was getting no traffic from ask.com in return. So no problem, I thought: “It’s only a thought experiment anyway.” After letting the thing run for a few years, I got Early Miser ready to launch and decided, “Hey, the robots.txt file for bestbuyszone.com worked great. I will use that one.” I had forgotten that it excluded a whole bunch of crawlers, including Ask, Looksmart and Google Image Search. Looksmart, with Furl.net, is looking like a pretty good source of traffic.
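To make the problem concrete, here is a rough sketch of what that kind of exclusion looks like and how to check which crawlers it shuts out, using Python's standard urllib.robotparser module. The file contents are an illustration, not my actual robots.txt: the user-agent tokens (Teoma, Googlebot-Image) are assumptions about what such a file would contain, and blocked_agents is a made-up helper name.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt in the spirit of the one described above.
# The user-agent tokens are assumptions, not the real file's contents.
OLD_ROBOTS_TXT = """\
User-agent: Teoma
Disallow: /

User-agent: Googlebot-Image
Disallow: /

User-agent: *
Disallow:
"""


def blocked_agents(robots_txt, agents, url="/"):
    """Return the agents from `agents` that may not fetch `url`."""
    parser = RobotFileParser()
    # parse() takes an iterable of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in agents if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_agents(OLD_ROBOTS_TXT, ["Teoma", "Googlebot-Image", "Googlebot"]))
```

Copy a file like this to a new site and those crawlers stay locked out there too, whether or not the new site needs the protection.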
I didn’t really notice anything, of course, until I began using Google Webmaster Tools. I had loaded up the robots.txt analysis tool and was looking at Early Miser’s robots.txt. After wondering why I was banning large parts of the web from indexing the site, it all came rushing back to me. After slapping myself on the forehead, I quickly corrected the issue, since earlymiser.com is a pretty robust application with a much more robust caching scheme.
So the lesson here is pretty clear: check your robots.txt file periodically. You need to compare it against any new bots that have arisen, so check your log files and the appropriate forums at WebmasterWorld.com. I probably wouldn’t have caught this if I hadn’t been using Google’s Webmaster Tools, so I always recommend that tool. I am interested in seeing how much traffic will be generated to earlymiser.com now that Looksmart, Ask.com and others will actually index the site!
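If you want to automate that periodic check, something along these lines might do as a starting point. It is only a sketch under a few assumptions: the access log is in the common "combined" format with the user agent as the last quoted field, the list of bot tokens is one you maintain yourself, and the function name audit and the sample log lines are invented for illustration.

```python
import re
from collections import Counter
from urllib.robotparser import RobotFileParser

# Assumes "combined" log format: the user agent is the last quoted field.
UA_FIELD = re.compile(r'"([^"]*)"\s*$')


def audit(robots_txt, log_lines, bot_tokens):
    """For each bot token, count its requests in the log and report
    whether robots.txt currently lets it fetch the site root."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())

    hits = Counter()
    for line in log_lines:
        match = UA_FIELD.search(line)
        if not match:
            continue
        user_agent = match.group(1).lower()
        for token in bot_tokens:
            # Simple case-insensitive substring match on the token.
            if token.lower() in user_agent:
                hits[token] += 1

    return {
        token: {"requests": hits[token], "allowed": parser.can_fetch(token, "/")}
        for token in bot_tokens
    }


if __name__ == "__main__":
    # Made-up sample data, not real log entries.
    sample_robots = "User-agent: Teoma\nDisallow: /\n"
    sample_log = [
        '192.0.2.1 - - [01/Jan/2000:00:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
        '"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
        '198.51.100.7 - - [01/Jan/2000:00:00:05 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
        '"Mozilla/2.0 (compatible; Ask Jeeves/Teoma)"',
    ]
    for token, info in audit(sample_robots, sample_log, ["Googlebot", "Teoma"]).items():
        print(token, info)
```

A bot that shows up in the log but comes back with allowed set to False is exactly the kind of forgotten exclusion that bit me here, and worth a second look.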

