Stories from the Web Crawling Trenches: IP Blocking

How to Build a Simple HTTP Proxy in CSharp in just 25 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic proxy server in C# on the .NET framework using the HttpListener and WebClient classes. Avoid IP blocking with a rotating proxy service.

How to Build a Simple HTTP Proxy in Rust in just 40 lines

Author: Mohan Ganesan

Date: Oct 1, 2023

Rust is a great language for network programming. Learn how to build a basic HTTP proxy in just 40 lines of code. Also, discover the benefits of using a rotating proxy to avoid IP blocking.

How to Build a Super Simple HTTP Proxy in C++ in just 30 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic HTTP proxy in C++ in 30 lines of code. Use a rotating proxy service with an API to avoid IP blocking.

How to Build a Super Simple HTTP Proxy in JavaScript in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic proxy server with JavaScript using the Node.js http module and the request package. Avoid IP blocking with a rotating proxy service.

How to Use Proxies with Puppeteer in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

Learn how to effectively use proxies with Puppeteer for web scraping, including the importance of proxies, configuring proxies in Puppeteer, rotating multiple proxies to avoid blocks, configuring authentication for premium proxies, and advanced proxy chaining. Discover common issues and troubleshooting tips, as well as criteria for selecting proxy services. Consider leveraging Proxies API for uninterrupted web scraping with worldwide locations, built-in rotation, JavaScript rendering, CAPTCHA solving, and high availability.

How to Build a Super Simple HTTP proxy in Go in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Go is a great language for writing simple and efficient network applications. Learn how to build a basic HTTP proxy in Go in under 20 lines of code. To handle IP blocking, consider using a rotating proxy service like Proxies API.

Building a Simple Proxy Rotator with PHP and SimpleHTMLDOM

Author: Mohan Ganesan

Date: Oct 2, 2023

Implement a rotating proxy in PHP using free proxies from sslproxies.org. Use SimpleHTMLDOM and cURL to fetch and parse the proxies. Rotate IPs and User-Agent strings to avoid IP blocking, or use a rotating proxy service such as Proxies API.
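
The rotation idea in this teaser can be sketched in a few lines. The following is a minimal illustration in Python rather than the article's PHP, and everything in it is an assumption made for the example: the proxy addresses and User-Agent strings are hard-coded placeholders (a real rotator would parse them from a proxy list page), and `rotating_settings` is a hypothetical helper name.

```python
import itertools
import random

# Hypothetical placeholder values; a real rotator would scrape these from a
# proxy list page instead of hard-coding them.
PROXIES = ["203.0.113.10:8080", "203.0.113.11:3128", "203.0.113.12:80"]
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7)",
    "Mozilla/5.0 (X11; Linux x86_64)",
]


def rotating_settings(proxies, user_agents, rng=random):
    """Yield (proxy_url, headers) pairs forever, one pair per request.

    Proxies are used round-robin; the User-Agent is picked at random.
    """
    for proxy in itertools.cycle(proxies):
        headers = {"User-Agent": rng.choice(user_agents)}
        yield f"http://{proxy}", headers


if __name__ == "__main__":
    settings = rotating_settings(PROXIES, USER_AGENTS)
    for _ in range(4):
        proxy_url, headers = next(settings)
        print(proxy_url, headers["User-Agent"])
```

Each pair would be passed to whatever HTTP client makes the request, so consecutive requests leave from a different IP with a different browser signature.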

How to Build a Super Simple HTTP Proxy in Perl in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic HTTP proxy server in Perl in fewer than 20 lines of code. Use a rotating proxy service to avoid IP blocking.

Web Scraping New York Times News Headlines in Go

Author: Mohan Ganesan

Date: Dec 6, 2023

Web scraping is the process of extracting data from websites using code. This article provides a tutorial on web scraping using the Go language and the goquery library. It covers the steps to send a GET request, parse HTML content, extract data, and handle common scraping challenges like IP blocking.

Building a Super Simple HTTP Proxy in Ruby in just 9 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to create a basic HTTP proxy using Ruby's socket library and net/http. Also, discover the importance of using a rotating proxy service to avoid IP blocking.

Scraping all the Images from a Website with Rust

Author: Mohan Ganesan

Date: Dec 13, 2023

Learn how to use Rust for web scraping, including data extraction, image scraping, and error handling. Overcome IP blocking with a rotating proxy service like Proxies API.

How to Build a Super Simple HTTP Proxy in Elixir in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Elixir makes it easy to build fast and scalable network applications. Here is a basic HTTP proxy server in fewer than 20 lines of Elixir code.

Scraping All Images from a Website with R

Author: Mohan Ganesan

Date: Dec 13, 2023

Scrape web pages using R libraries, send HTTP requests, parse HTML, extract data, download images, and overcome IP blocking with a rotating proxy server.

Scraping All Images from a Website with Java

Author: Mohan Ganesan

Date: Dec 13, 2023

Web scraping is the process of extracting data from websites automatically. This article explains how to scrape dog breed images from a Wikipedia page using Java and the Jsoup library. It also discusses the use of CSS selectors and overcoming IP blocking.

Scraping Wikipedia in Java for Beginners

Author: Mohan Ganesan

Date: Dec 6, 2023

Web scraping is the process of extracting data from websites. This article provides a code example using Jsoup to scrape Wikipedia for data on US presidents. It also discusses handling IP blocking with a rotating proxy service.

Scraping Craigslist Listings with CSharp

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using C# and HtmlAgilityPack. Avoid IP blocking with a rotating proxy server.

How to Build a Super Simple HTTP Proxy in Scala in Just 20 Lines of Code

Author: Mohan Ganesan

Date: Oct 1, 2023

Scala makes it easy to build networked applications with concise syntax and strong libraries. Here is an HTTP proxy server in Scala using Akka in just 20 lines of code. Because it sends every request from a single IP, it is prone to being blocked, but a rotating proxy service like Proxies API can solve IP blocking problems.

Scraping Craigslist Listings with Go

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using Go and goquery. Avoid IP blocking with a rotating proxy server.

How to Build a Super Simple HTTP Proxy in R in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Build a basic HTTP proxy server in R using the httpuv and httr packages. Learn how to handle IP blocking with a rotating proxy service.

Building a Simple Proxy Rotator with Perl and Mojo

Author: Mohan Ganesan

Date: Oct 2, 2023

Use Mojo::UserAgent to fetch and parse proxy lists, extract proxies, refresh periodically, select a random proxy, and make proxied requests with LWP::UserAgent. Consider using a rotating proxy service like Proxies API to solve IP blocking problems.
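
The "refresh periodically, select a random proxy" steps above can be sketched as a small pool object. This is an illustrative Python sketch under stated assumptions, not the article's Perl code: `ProxyPool`, its `fetch` callable, the `max_age` parameter, and the sample addresses are all hypothetical names and values introduced for the example, and the actual fetching and parsing of a proxy list is left to the caller.

```python
import random
import time


class ProxyPool:
    """Keeps a proxy list and re-fetches it once it is older than max_age."""

    def __init__(self, fetch, max_age=600.0, clock=time.monotonic):
        self._fetch = fetch        # callable returning a list of "host:port"
        self._max_age = max_age    # seconds before the list counts as stale
        self._clock = clock        # injectable so the refresh logic is testable
        self._proxies = []
        self._fetched_at = None

    def pick(self, rng=random):
        """Return a random proxy, refreshing the list first if it is stale."""
        now = self._clock()
        stale = self._fetched_at is None or now - self._fetched_at >= self._max_age
        if stale:
            self._proxies = list(self._fetch())
            self._fetched_at = now
        if not self._proxies:
            raise RuntimeError("proxy list is empty")
        return rng.choice(self._proxies)


if __name__ == "__main__":
    # Stand-in for a real fetch-and-parse step; the addresses are placeholders.
    pool = ProxyPool(lambda: ["198.51.100.1:8080", "198.51.100.2:8080"])
    print(pool.pick())
```

Injecting the clock keeps the staleness check easy to test without waiting; a caller would route each outgoing request through whatever `pick()` returns.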

Scraping New York Times News Headlines in C++

Author: Mohan Ganesan

Date: Dec 6, 2023

Web scraping is a technique for extracting data from websites. This article explains how to scrape article titles and links from The New York Times using C++. It covers concepts like HTTP requests, HTML structure, libcurl, and Gumbo. It also mentions the challenges of IP blocking and suggests using a rotating proxy service like Proxies API.

Building a Simple Proxy Rotator with Objective-C

Author: Mohan Ganesan

Date: Oct 2, 2023

Fetch and parse proxies from free proxy pools and rotate them in Objective-C projects, or solve IP blocking problems with a rotating proxy service.

Scraping Craigslist Listings with Ruby

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using Ruby and Nokogiri. Avoid IP blocking with a rotating proxy server.

Scraping Craigslist Listings with Perl

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using Perl with the LWP::UserAgent and HTML::TreeBuilder modules. Avoid IP blocking with a rotating proxy server.

Scraping Craigslist Listings with Visual Basic

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using Visual Basic and the HtmlAgilityPack library. Avoid IP blocking with a rotating proxy server.

Is web scraping free?

Author: Mohan Ganesan

Date: Feb 20, 2024

Web scraping can be free to start, but costs can arise from bandwidth, IP blocking, and legal restrictions. Have a plan and a budget to scale safely.
