If you own or manage a website, chances are you’ll have built it with a target audience of people in mind. However, did you know that about half of all web traffic isn’t people visiting websites, but “bots”?
Bots are programs that interact with websites and the best known ones are search engine spiders, such as Googlebot and Bingbot. Search engines use these to crawl websites and clearly having them visit your website is beneficial.
Unfortunately, not all bots are good. An extreme example is the bots used in coordinated DDoS (Distributed Denial of Service) attacks. Other bad bots are used to scrape and steal content from websites. Clearly, visits from such bots are not welcome.
Many people managing websites naively think they can block bad bots simply by using the robots.txt protocol. Using robots.txt, a website manager can specify which bots may or may not access a website and the content they can crawl. (You can see if your website has a robots.txt file by entering the following into your browser: www.yourwebsitename.co.nz/robots.txt.)
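To make this concrete, here is a minimal sketch of how a well-behaved bot might consult robots.txt before crawling, using Python's standard urllib.robotparser module. The rules and the bot name "BadScraperBot" are made up for illustration; they are assumptions, not taken from any real site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
ROBOTS_TXT = """\
User-agent: BadScraperBot
Disallow: /

User-agent: *
Disallow: /admin/
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text into a parser that can answer 'may I fetch this?'."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# A polite bot asks before every request and skips anything disallowed.
print(parser.can_fetch("Googlebot", "/products/"))      # allowed by the '*' rules
print(parser.can_fetch("Googlebot", "/admin/login"))    # disallowed for everyone
print(parser.can_fetch("BadScraperBot", "/products/"))  # disallowed for this bot
```

Note that the check runs inside the bot, not on your server. Nothing forces a bot to perform it, so the file only ever works as a request.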
Robots.txt works with bots that respect it (such as Googlebot and Bingbot). But — big surprise! — bad bots simply ignore it. Relying on robots.txt to keep bad bots at bay is akin to putting a sign on your unlocked front door saying “burglars please keep out.”
In the same way proper security is required to keep burglars from looting your business premises, security measures are needed to keep bad bots away from your website. Relying on robots.txt just doesn’t cut it.
If a bad bot operates from a single IP address, you can block its access to your web server through server configuration or with a network firewall. This can, however, become a case of “Whack-a-Mole” if bad bot operators simply keep changing their IP addresses. And if copies of a bot operate from lots of different IP addresses, such as hijacked PCs that are part of a large botnet, then things become more difficult. The best option then is to use advanced firewall rules that automatically block IP addresses that make many connections; unfortunately, that can hit good robots as well as bad ones.
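The idea of automatically blocking IP addresses that make many connections can be sketched as a sliding-window counter. This is a simplified illustration in Python, not a real firewall configuration; the class name, the thresholds and the allowlist parameter are all assumptions made for the example.

```python
from collections import defaultdict, deque


class ConnectionRateLimiter:
    """Block any IP that makes more than max_connections within window_seconds."""

    def __init__(self, max_connections, window_seconds, allowlist=()):
        self.max_connections = max_connections
        self.window_seconds = window_seconds
        # Hypothetical allowlist: IPs of known good bots that are never blocked.
        self.allowlist = set(allowlist)
        self.blocked = set()
        self._hits = defaultdict(deque)  # ip -> timestamps of recent connections

    def allow(self, ip, now):
        """Record a connection from ip at time now; return False if it is blocked."""
        if ip in self.allowlist:
            return True
        if ip in self.blocked:
            return False
        hits = self._hits[ip]
        hits.append(now)
        # Forget connections that have fallen out of the time window.
        while hits and hits[0] <= now - self.window_seconds:
            hits.popleft()
        if len(hits) > self.max_connections:
            self.blocked.add(ip)
            return False
        return True


limiter = ConnectionRateLimiter(max_connections=3, window_seconds=10.0)
# Four connections within three seconds from one address: the fourth is refused.
print([limiter.allow("203.0.113.5", t) for t in (0.0, 1.0, 2.0, 3.0)])
```

A rule like this counts connections and knows nothing about intent, which is why a busy but legitimate crawler can trip it. The allowlist is one possible way to soften that, assuming you can reliably identify the good bots' addresses.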
The good news is that this week Akamai announced a new tool called Bot Manager which gives website owners granular control over bots hitting their website. The tool provides detailed information about an individual bot, such as hostname, source IP address, what activities it performed and the impact that had on website performance. This level of detail enables a website manager to determine whether a bot is good or bad and decide how best to handle it using Bot Manager.
For example, an eCommerce site might let a price scraper bot through for a partner, but block bots from aggressive competitors who are using the scraped information to match or beat their prices. Or, if they want to play hardball, a page with false prices can be created and Bot Manager used to send competitor price scraping bots to this page (an evil but cool idea).
Why should you care?
If you’re having issues with your website being hammered by unfriendly bots, it may be worth taking a look at Akamai Bot Manager, which we understand will be available later this month. We also recommend reading the Incapsula Blog, which contains some great information.
Mark is a Partner and Senior Consultant at SureFire which he founded back in 2002. Prior to establishing SureFire he worked for KPMG Consulting. Today Mark heads up SEO, embracing the challenges that can come with complex website implementations. Outside of work, his interests beyond his family are running, snowsports, diving and fishing (badly).