Stories from the Web Crawling trenches in Rust

How to Build a Simple HTTP Proxy in Rust in just 40 lines

Author: Mohan Ganesan

Date: Oct 1, 2023

Rust is a great language for network programming. Learn how to build a basic HTTP proxy in just 40 lines of code. Also, discover the benefits of using a rotating proxy to avoid IP blocking.

The Ultimate Select.rs Cheat Sheet for Rust

Author: Mohan Ganesan

Date: Oct 31, 2023

select.rs is a robust HTML/XML scraping library for Rust. This cheat sheet covers its features, including installation, loading documents, selecting nodes, traversing nodes, extracting/modifying nodes, creating/inserting/removing nodes, output formats, caching and persistence, headless browsers, validation, encoding, advanced selectors, caching and performance, common recipes, troubleshooting, and ecosystem libraries.

Downloading Images from a Website with Rust and scraper

Author: Mohan Ganesan

Date: Oct 15, 2023

Learn how to use Rust and the reqwest and scraper crates to download all the images from a Wikipedia page.

Scraping all the Images from a Website with Rust

Author: Mohan Ganesan

Date: Dec 13, 2023

Learn how to use Rust for web scraping, including data extraction, image scraping, and error handling. Overcome IP blocking with a rotating proxy service like Proxies API.

Web Scraping with Rust & ChatGPT

Author: Mohan Ganesan

Date: Sep 25, 2023

With ChatGPT's help, Rust works well for web scraping. The workflow involves sending HTTP requests, using selectors, and extracting data, and ChatGPT can explain each step and generate code snippets. For more robust solutions, a web scraping API like Proxies API can be used.

Scrape Any Website with OpenAI Function Calling in Rust

Author: Mohan Ganesan

Date: Sep 25, 2023

Web scraping with OpenAI in Rust allows resilient data extraction from websites using function calling.

Scraping Multiple Pages in Rust with reqwest and selectors

Author: Mohan Ganesan

Date: Oct 15, 2023

Learn how to extract data from multiple pages in Rust using the reqwest and selectors crates, and how to use proxies to scale up scraping.

What are the fastest languages for web scraping?

Author: Mohan Ganesan

Date: Feb 5, 2024

Web scraping involves extracting data from websites. Choosing the right programming language is crucial for scraping large sites. C++ and Rust offer speed, while Go provides simplicity and speed.

Scraping eBay Listings in Rust in 2023

Author: Mohan Ganesan

Date: Oct 5, 2023

Learn how to scrape and extract data from eBay listings using Rust with the reqwest and select crates.

What's the equivalent of Python's requests package for Rust?

Author: Mohan Ganesan

Date: Feb 3, 2024

Rust is a systems programming language focused on performance, reliability, and efficiency. reqwest is a popular HTTP client library for Rust, providing a similar developer experience to Python's requests package.

Scraping Reddit Posts with Rust

Author: Mohan Ganesan

Date: Jan 9, 2024

Code walkthrough for scraping Reddit using Rust to extract post information.

Scraping Craigslist Listings with Rust

Author: Mohan Ganesan

Date: Oct 1, 2023

Learn how to scrape Craigslist apartment listings using Rust and the reqwest and selectors crates.

Web Scraping Yelp Business Listings with Rust

Author: Mohan Ganesan

Date: Dec 6, 2023

Learn how to scrape Yelp business listings using Rust, including setting up the development environment, handling proxies, making HTTP requests, parsing HTML, and extracting business details.

Downloading Images from URLs in Rust

Author: Mohan Ganesan

Date: May 5, 2024

Learn how to download images efficiently using Rust with reqwest, hyper, surf, ureq, and attohttpc libraries.
