Stories from the Web Crawling trenches in headers

Python Requests Cheatsheet

Author: Mohan Ganesan

Date: Jan 9, 2024

Overview of Requests, a popular HTTP library for Python. Features include making GET and POST requests, handling response content and headers.

Setting the Content-Type Header for Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Properly setting the Content-Type helps the receiving server interpret and handle the data correctly. When sending JSON data or other formats, you'll want to explicitly set the header instead. Uploading multipart form data requires setting the content type accordingly. Handling responses and content types appropriately is important for robust integrations.

Finding Headers in BeautifulSoup

Author: Mohan Ganesan

Date: Oct 6, 2023

When parsing HTML and XML documents, accessing and working with headers is a common task. Understanding header tags in BeautifulSoup is important for efficient parsing and processing of documents.

Debugging HTTP Requests in Python with Request Logging

Author: Mohan Ganesan

Date: Feb 3, 2024

Add comprehensive logging to Python requests for visibility into issues when making HTTP requests.

Controlling HTTP Requests with urllib Headers

Author: Mohan Ganesan

Date: Feb 6, 2024

The Python urllib module provides a powerful way to make HTTP requests in your code. Headers allow you to specify important metadata about the request, like the user agent, authentication credentials, caching settings, and more.

Troubleshooting Bad Requests in Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

The Python requests module is invaluable for making HTTP requests in your code. Troubleshoot and fix 400 status errors by checking headers and parameters.

Configuring Headers with aiohttp Clients for Effective API Calls

Author: Mohan Ganesan

Date: Feb 22, 2024

Properly configuring headers in aiohttp is crucial for smooth API requests. Headers serve purposes like authentication, context, security, and caching.

Mastering Urllib Sessions in Python for Effective Web Scraping

Author: Mohan Ganesan

Date: Feb 8, 2024

Urllib sessions allow persisting specific parameters across multiple requests. This is very useful for web scraping authenticated sites or sites that track browser state.

Handling Responses with urllib in Python

Author: Mohan Ganesan

Date: Feb 6, 2024

The urllib module in Python provides functionality for fetching data from URLs. Properly handling the response is important for robust code.

Using Python Requests to Populate Date Fields in Web Forms

Author: Mohan Ganesan

Date: Feb 3, 2024

Use Python Requests library and headers to populate date fields in web forms with date pickers for automation.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!