Web Bot Technology: How Scrapers And Botnet Operators Work

By Charlie Minesinger, Director of Sales, Distil Networks

In our last post, we talked about the true scope of the threat posed by web bots. To begin fighting web bots on your own site and in your ad campaigns, though, you’ll first need to understand how they work and how to measure bot traffic. Bots can skew traffic numbers, submit fraudulent forms, click on your ads and infect your site. Learning how bots carry out these attacks can help you safeguard your site and your business from these malicious programs. Once you have identified bot traffic, the next question is how it affects your online marketing spend, your analytics and your KPIs.

How Web Bots Work

Web bots are software programs designed to mimic human behavior on websites. Some trigger payable marketing and advertising events; others mine, parse and steal online data such as copyrighted images, pricing, blog posts, product metadata and other kinds of content. Not all bots are malicious: search engines use web bots to take a snapshot of the content and websites available on the internet, and then return recommendations that match a user’s searches. Unfortunately, web bots are fairly easy to create, so the same technology can be used for fraudulent clicks, price scraping, or content mining by competitors, nefarious actors or even organized crime.

Technologically speaking, here’s how a web bot, scraper or botnet (group of bots) can work when it attacks your site:

  • The bot may use PHP’s cURL library to scan URLs and download individual web pages. It can follow URL redirects, handle encrypted (HTTPS) connections and submit login credentials, so forwarding, encryption and authentication alone will not keep it out.
  • The bot then parses the downloaded content and sorts out the desired data. Ultimately, this is how content, copy, images and pricing are stolen, sold and even duplicated.
  • In addition to downloading web pages and content, web bots can also be programmed to enter data into forms and make posts on forums. This can result in fraudulent registrations, fake form submissions, spam comments and more.
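To make the parsing step above concrete, here is a minimal sketch in Python using only the standard library. It assumes the page has already been downloaded, and that prices sit in elements carrying a `price` class; the `PriceExtractor` class, the `extract_prices` helper and the sample markup are all hypothetical, invented for illustration. Real scrapers are tuned to each target site’s markup.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collects the text of elements whose class attribute includes 'price'.

    Hypothetical helper: the 'price' class name is an assumption about
    the target page, not a convention every site follows.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; value may be None.
        classes = (dict(attrs).get("class") or "").split()
        if "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        # Simplification: any closing tag ends the current price element,
        # so nested markup inside a price element is not handled.
        self._in_price = False

    def handle_data(self, data):
        if self._in_price and data.strip():
            self.prices.append(data.strip())


def extract_prices(html):
    """Return every price string found in the given HTML text."""
    parser = PriceExtractor()
    parser.feed(html)
    return parser.prices


# Stand-in for a page the bot has already downloaded.
SAMPLE_PAGE = """
<ul>
  <li><span class="name">Widget</span> <span class="price">$19.99</span></li>
  <li><span class="name">Gadget</span> <span class="price sale">$5.00</span></li>
</ul>
"""

print(extract_prices(SAMPLE_PAGE))
```

A couple of dozen lines are enough to lift structured pricing out of a page, which is why scraped content is so easy to collect, resell and duplicate.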

Other bots may infect a legitimate consumer’s device with malware. Many times these bots simply load pages to generate fraudulent traffic for display ads (impression fraud) or click on banners (click fraud). These fraud bots typically aim to run up a competitor’s ad spend (or reduce the efficacy of the competitor’s campaign). Their operators may also run display network ads on their own sites, driving up the income they receive from the distributor through the extra clicks.

The concern here is that bots can be deployed to attack, to steal or to make fraudulent transactions, and that they may be deployed from hosting providers or from consumer devices. In these environments IP addresses are easily changed or are shared with legitimate consumers. To defend against this variety of bot technologies, websites need a breadth of bot detection techniques and must stop relying on IP addresses as a unique identifier of bot traffic.
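As a rough illustration of identifying traffic without leaning on IP addresses, the Python sketch below groups requests by a signature built from a few request headers and flags any signature that appears too often. Everything in it is an assumption made for the example: the `fingerprint` and `flag_suspects` names, the choice of headers, the threshold and the sample data. Headers are easy to forge, so this is one weak signal among the many a real detection system would combine, not a complete defense.

```python
import hashlib
from collections import Counter

# Assumed set of headers to combine; a real system would use far more signals.
FINGERPRINT_HEADERS = ("User-Agent", "Accept", "Accept-Language", "Accept-Encoding")


def fingerprint(headers):
    """Hash a fixed set of header values into a short client signature."""
    parts = [headers.get(name, "") for name in FINGERPRINT_HEADERS]
    digest = hashlib.sha256("\n".join(parts).encode("utf-8")).hexdigest()
    return digest[:16]


def flag_suspects(requests, threshold):
    """Return signatures seen more than `threshold` times, whatever the IP."""
    counts = Counter(fingerprint(r["headers"]) for r in requests)
    return {fp for fp, n in counts.items() if n > threshold}


# Hypothetical traffic: one client rotating through three IP addresses,
# plus a single ordinary visitor.
BOT_HEADERS = {"User-Agent": "ExampleFetcher/1.0", "Accept": "*/*"}
HUMAN_HEADERS = {
    "User-Agent": "Mozilla/5.0 (example browser)",
    "Accept": "text/html",
    "Accept-Language": "en-US",
    "Accept-Encoding": "gzip",
}
SAMPLE_REQUESTS = [
    {"ip": "203.0.113.5", "headers": BOT_HEADERS},
    {"ip": "203.0.113.77", "headers": BOT_HEADERS},
    {"ip": "198.51.100.9", "headers": BOT_HEADERS},
    {"ip": "192.0.2.14", "headers": HUMAN_HEADERS},
]

suspects = flag_suspects(SAMPLE_REQUESTS, threshold=2)
print(fingerprint(BOT_HEADERS) in suspects)
```

Counting per IP address would see three unrelated visitors making one request each. Counting per signature ties the three requests together even though the address changed every time.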

Both fraud and theft bots can be seriously detrimental to a website owner’s SEO efforts, traffic and bottom line. Stay tuned for our next post, and discover how they can affect your KPIs, too.

