Stories from the web crawling trenches: parsing HTML

Scraping All Images from a Website with R

Author: Mohan Ganesan

Date: Dec 13, 2023

Scrape web pages using R libraries, send HTTP requests, parse HTML, extract data, download images, and overcome IP blocking with a rotating proxy server.

Building a Simple Proxy Rotator with JavaScript and Puppeteer

Author: Mohan Ganesan

Date: Oct 2, 2023

Fetch and parse proxies using Puppeteer and cheerio, and select a random proxy for JavaScript projects.

Building a Simple Proxy Rotator with C++ and libcurl

Author: Mohan Ganesan

Date: Oct 2, 2023

Build a simple proxy rotator in C++ that uses libcurl and RapidXML to fetch and parse proxies from sslproxies.org. For production, consider a rotating proxy service instead.
