Stories from the Web Crawling trenches tagged "base URL"

Scraping Multiple Pages in Java with JSoup

Author: Mohan Ganesan

Date: Oct 15, 2023

Scrape data from multiple pages in Java using JSoup. Define a base URL pattern, loop through the page numbers, send a request for each page, parse the returned HTML, and extract the data using selectors.

Scraping Multiple Pages in Go with net/http and goquery

Author: Mohan Ganesan

Date: Oct 15, 2023

Scrape data from multiple pages in Go using net/http and goquery. Define a base URL pattern with a %d placeholder and loop through the page numbers to construct each page URL. Send a request for each URL, parse the HTML with goquery to find and extract the data, then print or store the results.
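
The URL-construction step described above can be sketched roughly as follows. This is a minimal illustration using only the Go standard library; the function name pageURLs and the example.com pattern are assumptions made for the example, and the request and goquery parsing steps are deliberately left out.

```go
package main

import "fmt"

// pageURLs expands a base URL pattern containing a single %d placeholder
// into one URL per page number in the inclusive range [first, last].
// The name and signature are hypothetical, chosen for this sketch.
func pageURLs(pattern string, first, last int) []string {
	if last < first {
		return nil // empty range: nothing to construct
	}
	urls := make([]string, 0, last-first+1)
	for page := first; page <= last; page++ {
		urls = append(urls, fmt.Sprintf(pattern, page))
	}
	return urls
}

func main() {
	// Hypothetical base URL pattern; substitute the real site's pattern.
	for _, u := range pageURLs("https://example.com/products?page=%d", 1, 3) {
		fmt.Println(u)
	}
}
```

Because the pattern is passed to fmt.Sprintf, any literal percent sign in the URL (for instance in URL-encoded characters) would need to be written as %% in the pattern.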

Scraping Multiple Pages in Ruby with Nokogiri

Author: Mohan Ganesan

Date: Oct 15, 2023

Scrape data from multiple pages in Ruby using Nokogiri. Define a base URL pattern, loop through the page numbers, parse each page's HTML, and extract the data.
