Stories from the web crawling trenches: user-agent

Troubleshooting 403 Errors with Python Requests Despite Setting User-Agent

Author: Mohan Ganesan

Date: Feb 3, 2024

Make sure the User-Agent header mimics a real browser. Use a residential proxy or VPN if your IP is blocked. For Cloudflare-protected sites, the article suggests setting the CF-Connecting-IP header. Slow your request rate and verify your quotas. Register for API keys or have your server IP whitelisted.
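As a rough illustration of the first and fourth tips, here is a minimal sketch using Python Requests. The User-Agent string, the `make_session` and `polite_get` helper names, and the one-second delay are assumptions chosen for the example, not values taken from the article.

```python
import time

import requests

# Example browser-like User-Agent string (an assumption; substitute a current one).
BROWSER_UA = (
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) "
    "AppleWebKit/537.36 (KHTML, like Gecko) "
    "Chrome/120.0.0.0 Safari/537.36"
)


def make_session(user_agent=BROWSER_UA):
    """Return a Session whose default headers resemble a browser's."""
    session = requests.Session()
    session.headers.update(
        {
            "User-Agent": user_agent,
            "Accept-Language": "en-US,en;q=0.9",
        }
    )
    return session


def polite_get(session, url, delay_seconds=1.0, timeout=10.0):
    """Wait delay_seconds, then issue a GET, to keep the request rate low."""
    time.sleep(delay_seconds)
    return session.get(url, timeout=timeout)
```

A call such as `polite_get(make_session(), "https://example.com/")` would then send the browser-like headers on every request. Whether that is enough to avoid a 403 depends on the target site.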

Customizing HTTPX User Agents for Effective API Requests

Author: Mohan Ganesan

Date: Feb 5, 2024

Customize the User-Agent header in the HTTPX Python library for API analytics, compatibility checks, and access control.

How to Build a Reddit Scraper in Java

Author: Mohan Ganesan

Date: Jan 9, 2024

Learn how to scrape Reddit posts in Java using HTML parsing, selectors, and user-agent headers.
