Stories from the Web Crawling trenches in debugging

The Complete HTTPBin CheatSheet in Python

Author: Mohan Ganesan

Date: Dec 6, 2023

Httpbin is a popular online service for testing and debugging HTTP libraries and clients. It is useful for testing HTTP client code, experimenting with APIs, learning HTTP concepts, debugging issues, and more.

Debugging HTTP Requests in Python with Request Logging

Author: Mohan Ganesan

Date: Feb 3, 2024

Add comprehensive logging to Python requests for visibility into issues when making HTTP requests.

How to fix TooManyRedirects error in Python requests

Author: Mohan Ganesan

Date: Oct 22, 2023

The TooManyRedirects error in Python requests occurs when the request exceeds the default limit of 30 redirects. This article explains the causes of the error and provides solutions to fix it, including modifying redirect behavior, increasing max redirects, disabling redirects, and implementing custom redirect handling. It also offers best practices for handling redirects and answers frequently asked questions about the error.

Logging and Debugging with Requests

Author: Mohan Ganesan

Date: Oct 22, 2023

Guide to enable detailed logging and debugging with Requests library in Python for HTTP requests using urllib3 and http.client.

Debugging HTTP Requests with httpx Debug

Author: Mohan Ganesan

Date: Feb 5, 2024

Making HTTP requests is core functionality for many Python applications. httpx debug is a debugging proxy server that captures HTTP traffic, logs request/response data, and allows for mocking and modifying traffic for testing scenarios.

Using Proxies in LWP::UserAgent in Perl in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

Proxies are essential for web scraping to prevent blocks. LWP::UserAgent makes it easy to configure proxies for large-scale scraping. Learn how to use proxies, handle proxy authentication, make SSL/HTTPS requests, and debug common issues.

Troubleshooting the Python Requests Module Not Working

Author: Mohan Ganesan

Date: Feb 3, 2024

Reinstall packages after Python upgrades. Watch for SSL/TLS certificate problems. Simplify to basic HTTP requests for debugging. Create isolated environments to test Requests.

Debugging urllib Issues

Author: Mohan Ganesan

Date: Feb 8, 2024

Using urllib module for HTTP requests in Python can run into issues. Tips for debugging: validate URL, handle exceptions, use logging, inspect request details.

Troubleshooting HTTP 404 Errors with Python's urllib

Author: Mohan Ganesan

Date: Feb 6, 2024

Encountering HTTP 404 errors when trying to access web pages with Python's urllib module can be frustrating. This guide provides common causes and solutions for debugging 404 errors.

Keeping Track of Asyncio Loops in Python

Author: Mohan Ganesan

Date: Mar 25, 2024

Tips for detecting and keeping track of active asyncio loops in Python. Use get_running_loop() to get the current running loop. Use all_tasks() to iterate through scheduled tasks. Use contextvars to track the loop a task is running on.

Why is asyncio faster than threading python ?

Author: Mohan Ganesan

Date: Mar 17, 2024

Python's asyncio module allows for non-blocking, asynchronous code execution, achieving better performance by minimizing blocking calls and maximizing CPU utilization.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!