Apr 3rd, 2020

How Do Web Servers Stop Bots?

Many web servers want to serve human visitors and keep bots out.

Here are the main ways web servers detect and stop bots:

  1. They look at the User-Agent header sent by the bot. If it does not match one sent by a known browser on a known OS, they might reject the call.

  1. The bot is making too many calls too frequently. Web servers know that humans don't do that and can block you.

  1. The bot is making calls in sequential order (for example, page 1, then page 2, then page 3), which humans rarely do.

  1. The bot is making calls at the same time every day. The call logs show it and the webmaster can block you.

  1. The bot is making calls at predictable intervals.

  1. The bot is not passing the cookies back. Even when the cookies are irrelevant to the data being requested, web servers expect them to come back, because browsers return them by default.

  1. The User-Agent header is always the same and is not rotated.

  1. The IP address is always the same and is not rotated using a rotating proxy service like Proxies API (I am the founder of Proxies API).
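
To make a few of these checks concrete, here is a minimal sketch of how a server might score one incoming request. It is not how any particular web server or bot-protection product works. The function name `bot_signals`, the thresholds, and the list of browser tokens are all hypothetical values picked for illustration. It covers four of the signals above: an unfamiliar User-Agent, a missing cookie, too many calls per minute, and calls arriving at predictable intervals.

```python
import statistics
from collections import defaultdict, deque

# Hypothetical thresholds, chosen only for illustration.
MAX_REQUESTS_PER_MINUTE = 60
MIN_INTERVAL_STDEV = 0.05  # seconds; near-zero spread suggests a timer loop
KNOWN_BROWSER_TOKENS = ("Mozilla/", "Chrome/", "Safari/", "Firefox/")

# Most recent request timestamps (in seconds), keyed by client IP.
_history = defaultdict(lambda: deque(maxlen=100))


def bot_signals(ip, user_agent, sent_cookie_back, now):
    """Return a list of reasons why this request looks automated."""
    signals = []

    # Missing or unfamiliar User-Agent header.
    if not user_agent or not any(t in user_agent for t in KNOWN_BROWSER_TOKENS):
        signals.append("unknown_user_agent")

    # A cookie was set earlier but the client did not return it.
    if not sent_cookie_back:
        signals.append("missing_cookie")

    times = _history[ip]
    times.append(now)

    # Too many calls from this IP within the last 60 seconds.
    recent = [t for t in times if now - t <= 60]
    if len(recent) > MAX_REQUESTS_PER_MINUTE:
        signals.append("rate_exceeded")

    # Predictable intervals: the gaps between calls barely vary.
    if len(times) >= 5:
        ordered = list(times)
        gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
        if statistics.pstdev(gaps) < MIN_INTERVAL_STDEV:
            signals.append("regular_intervals")

    return signals


if __name__ == "__main__":
    # A script calling every 2 seconds with a library User-Agent and no cookies.
    for t in (0.0, 2.0, 4.0, 6.0, 8.0):
        flags = bot_signals("203.0.113.7", "python-requests/2.31", False, t)
    print(flags)  # ['unknown_user_agent', 'missing_cookie', 'regular_intervals']
```

Keying the history on the IP address is also why the last item in the list matters: a bot that always calls from one IP builds up a single, easy-to-read history. A real deployment would likely combine many more signals and weigh them, since any single check can misfire (for example, many humans can share one IP behind a corporate network).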
