Stories from the Web Crawling trenches in Go

Web Scraping New York Times News Headlines in Go

Author: Mohan Ganesan

Date: Dec 6, 2023

Web scraping is the process of extracting data from websites using code. This article is a tutorial on web scraping with the Go language and the goquery library. It covers sending a GET request, parsing HTML content, extracting data, and handling common scraping challenges such as IP blocking.

How to Build a Super Simple HTTP proxy in Go in just 20 lines of code

Author: Mohan Ganesan

Date: Oct 1, 2023

Go is a great language for writing simple and efficient network applications. Learn how to build a basic HTTP proxy in Go in under 20 lines of code. To handle IP blocking, consider using a rotating proxy service like Proxies API.
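
As a rough illustration of the idea (not the article's own 20-line listing), a forward proxy for plain HTTP can be sketched with the standard library alone. The names `proxyHandler` and `demo` are made up for this sketch, and the local test servers stand in for a real client and a real website so it runs offline. HTTPS tunnelling via CONNECT and hop-by-hop header filtering are left out.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
)

// transport sends the proxied requests. It has no Proxy configured, so the
// sketch never forwards to yet another proxy by accident.
var transport = &http.Transport{}

// proxyHandler forwards a plain-HTTP request to its destination and copies
// the response back to the caller. CONNECT (HTTPS) requests are rejected.
func proxyHandler(w http.ResponseWriter, r *http.Request) {
	// A client talking to a forward proxy sends an absolute URL.
	if !r.URL.IsAbs() {
		http.Error(w, "this proxy expects an absolute URL", http.StatusBadRequest)
		return
	}
	resp, err := transport.RoundTrip(r)
	if err != nil {
		http.Error(w, err.Error(), http.StatusBadGateway)
		return
	}
	defer resp.Body.Close()

	// Copy response headers, status, and body through unchanged.
	for name, values := range resp.Header {
		for _, v := range values {
			w.Header().Add(name, v)
		}
	}
	w.WriteHeader(resp.StatusCode)
	io.Copy(w, resp.Body)
}

// demo starts a local origin server and a local proxy, fetches the origin
// through the proxy, and returns the body it received.
func demo() (string, error) {
	origin := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "hello from origin")
	}))
	defer origin.Close()

	proxy := httptest.NewServer(http.HandlerFunc(proxyHandler))
	defer proxy.Close()

	proxyURL, err := url.Parse(proxy.URL)
	if err != nil {
		return "", err
	}
	client := &http.Client{Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)}}

	resp, err := client.Get(origin.URL)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	body, err := io.ReadAll(resp.Body)
	return string(body), err
}

func main() {
	body, err := demo()
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(body)
}
```

A long-running proxy would pass the same handler to `http.ListenAndServe` instead of a test server; the handler itself stays at roughly twenty lines.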

The Definitive Guide to Handling Proxies in Go in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

A guide to handling proxies in Go for web scraping, covering setup, security, privacy, performance, and troubleshooting. Proxies API offers a solution for developers.
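
For the setup step, one possible shape for a proxy-aware client is sketched below using only `net/http` and `net/url`. The helper name `newProxyClient`, the 15-second timeout, and the address `127.0.0.1:8080` with its credentials are all assumptions for illustration, not values from the guide.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/url"
	"time"
)

// newProxyClient returns an HTTP client that sends every request through
// the given proxy. The 15-second timeout is an arbitrary choice.
func newProxyClient(rawProxyURL string) (*http.Client, error) {
	proxyURL, err := url.Parse(rawProxyURL)
	if err != nil {
		return nil, fmt.Errorf("parse proxy URL: %w", err)
	}
	// Inputs such as "localhost:8080" parse without error but carry no host,
	// so reject anything that lacks an explicit scheme and host.
	if proxyURL.Scheme == "" || proxyURL.Host == "" {
		return nil, errors.New("proxy URL needs a scheme and host, e.g. http://127.0.0.1:8080")
	}
	return &http.Client{
		Transport: &http.Transport{Proxy: http.ProxyURL(proxyURL)},
		Timeout:   15 * time.Second,
	}, nil
}

func main() {
	// Placeholder address and credentials; nothing is contacted here.
	client, err := newProxyClient("http://user:secret@127.0.0.1:8080")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("client timeout:", client.Timeout)
}
```

Validating the URL up front keeps a malformed proxy setting from surfacing later as a confusing connection error, and setting a timeout avoids hanging on an unresponsive proxy.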

Scraping Multiple Pages in Go with net/http and goquery

Author: Mohan Ganesan

Date: Oct 15, 2023

Web scraping in Go using net/http and goquery to extract data from multiple pages. Use a base URL pattern with a %d placeholder and loop through the page numbers to construct each page URL. Send each request, parse the returned HTML with goquery to find and extract the data, then print or store the results.
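
The pagination loop can be sketched as follows. This is a minimal standard-library version: the helpers `pageURLs` and `fetch`, the `/list?page=%d` pattern, and the local test server are assumptions made so the example runs offline, and the goquery parsing step is only marked with a comment.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
)

// pageURLs expands a pattern containing one %d verb into the URLs for
// pages first through last, inclusive.
func pageURLs(pattern string, first, last int) []string {
	if last < first {
		return nil
	}
	urls := make([]string, 0, last-first+1)
	for page := first; page <= last; page++ {
		urls = append(urls, fmt.Sprintf(pattern, page))
	}
	return urls
}

// fetch returns the response body for url, or an error on a non-200 status.
func fetch(client *http.Client, url string) (string, error) {
	resp, err := client.Get(url)
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("GET %s: %s", url, resp.Status)
	}
	body, err := io.ReadAll(resp.Body)
	if err != nil {
		return "", err
	}
	return string(body), nil
}

func main() {
	// Local stand-in for a paginated site.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "<h1>Listing page %s</h1>", r.URL.Query().Get("page"))
	}))
	defer srv.Close()

	for _, u := range pageURLs(srv.URL+"/list?page=%d", 1, 3) {
		html, err := fetch(srv.Client(), u)
		if err != nil {
			fmt.Println("skipping page:", err)
			continue
		}
		// A real scraper would hand the HTML to goquery at this point.
		fmt.Println(html)
	}
}
```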

What are the fastest languages for web scraping?

Author: Mohan Ganesan

Date: Feb 5, 2024

Web scraping involves extracting data from websites. Choosing the right programming language is crucial for scraping large sites. C++ and Rust offer speed, while Go provides simplicity and speed.

Scraping eBay Listings in Go in 2023

Author: Mohan Ganesan

Date: Oct 5, 2023

Step-by-step tutorial for extracting data from eBay listings using Go. Use net/http and github.com/PuerkitoBio/goquery packages for HTML parsing.

Scraping Craigslist Listings with Go

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using Go and goquery. Avoid IP blocking with a rotating proxy server.

Converting Python Requests to Go net/http for Easier HTTP Clients

Author: Mohan Ganesan

Date: Feb 3, 2024

Learn the key differences between making HTTP requests in Python with the Requests library and in Go with the net/http package, so that converting Python Requests code to Go net/http becomes easier.
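
To make the mapping concrete, here is a hedged sketch of how a call like `requests.get(url, params=..., headers=..., timeout=10)` might translate to net/http. The helper `newGet`, the `/search` path, and the header value are invented for the example, and a local test server replaces the remote site.

```go
package main

import (
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"net/url"
	"time"
)

// newGet builds a GET request with query parameters and headers, roughly
// what requests.get(base, params=params, headers=headers) prepares.
func newGet(base string, params, headers map[string]string) (*http.Request, error) {
	u, err := url.Parse(base)
	if err != nil {
		return nil, err
	}
	q := u.Query()
	for k, v := range params {
		q.Set(k, v)
	}
	u.RawQuery = q.Encode() // Encode sorts keys and escapes values

	req, err := http.NewRequest(http.MethodGet, u.String(), nil)
	if err != nil {
		return nil, err
	}
	for k, v := range headers {
		req.Header.Set(k, v)
	}
	return req, nil
}

func main() {
	// Local server that echoes back what it received.
	srv := httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprintf(w, "q=%s ua=%s", r.URL.Query().Get("q"), r.UserAgent())
	}))
	defer srv.Close()

	req, err := newGet(srv.URL+"/search",
		map[string]string{"q": "go scraping"},
		map[string]string{"User-Agent": "example-scraper/0.1"})
	if err != nil {
		fmt.Println("error:", err)
		return
	}

	client := &http.Client{Timeout: 10 * time.Second} // Requests: timeout=10
	resp, err := client.Do(req)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	defer resp.Body.Close() // Requests closes the connection for you; Go does not

	if resp.StatusCode != http.StatusOK { // Requests: resp.raise_for_status()
		fmt.Println("unexpected status:", resp.Status)
		return
	}
	body, err := io.ReadAll(resp.Body) // Requests: resp.text
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(string(body))
}
```

The main differences visible here are that Go makes the timeout, the status check, and closing the body explicit, where Requests handles or hides them.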

Downloading Images from a Website with Go and goquery

Author: Mohan Ganesan

Date: Oct 15, 2023

Learn how to use Go and goquery to download images from a Wikipedia page, extract data from HTML tables, and scrape websites. Use Proxies API for IP rotation and CAPTCHA solving.

Scraping Real Estate Listings From Realtor with Go

Author: Mohan Ganesan

Date: Jan 9, 2024

Learn how to scrape real estate listing data from Realtor.com using Go and the goquery library. Use web scraping to collect and analyze housing data.

How to Scrape Reddit Posts in Go

Author: Mohan Ganesan

Date: Jan 9, 2024

Learn how to scrape Reddit using Go with a step-by-step guide. Extract information about posts using HTML parsing and HTTP requests.

Web Scraping with Go & ChatGPT

Author: Mohan Ganesan

Date: Sep 25, 2023

Go is a great language for web scraping, and ChatGPT can assist along the way by explaining concepts and generating code for tasks such as HTML parsing and CSV output. A web scraping API like Proxies API can handle anti-scraping measures and JavaScript rendering.
