Stories from the Web Crawling trenches in HTTP client

Downloading Files with Python Requests - Tips, Tricks and Code Example

Author: Mohan Ganesan

Date: Oct 31, 2023

Learn how to use Python Requests to download files from the web with ease. Requests provides a simple API for making HTTP calls, supports advanced features like streaming downloads and authentication, and is actively maintained. Use Requests to download files like a pro!

The Complete HTTPBin CheatSheet in Python

Author: Mohan Ganesan

Date: Dec 6, 2023

Httpbin is a popular online service for testing and debugging HTTP libraries and clients. It is useful for testing HTTP client code, experimenting with APIs, learning HTTP concepts, debugging issues, and more.

Getting Started with HTTPX in Python: Practical Examples and Usage Tips

Author: Mohan Ganesan

Date: Feb 5, 2024

HTTPX is a powerful Python HTTP client that makes API calls, handles authentication, timeouts, and more. Easily make GET and POST requests, handle JSON, forms, files, and headers. Supports async requests and session reuse for optimal performance.

Using Proxies with Axios in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

Learn how to integrate proxies with Axios for efficient web scraping and bot development. Avoid IP bans and scale your projects with ease.

Scraping Multiple Pages in Kotlin with HTTP Client and kotlinx.html

Author: Mohan Ganesan

Date: Oct 15, 2023

Web scraping in Kotlin using native HTTP client and kotlinx.html libraries to extract data from multiple pages. Use CSS selectors to scrape and extract information. Consider using Proxies API for scaling web scraping.

What is the difference between Httplib and Urllib?

Author: Mohan Ganesan

Date: Feb 20, 2024

Python code can make HTTP requests using urllib and httplib libraries. urllib is simpler and part of the standard library, while httplib provides more control and is suitable for advanced cases.

Sending HTTP Requests in Python: Request vs Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Python applications often require HTTP requests. The request library is built-in, while requests is a more powerful third-party library that simplifies the process.

Scraping New York Times News Headlines in CSharp

Author: Mohan Ganesan

Date: Dec 6, 2023

Automate data extraction from websites using C# and HTML Agility Pack for web scraping. Use HTTP client for making requests and XPath for parsing HTML elements.

Using Proxies in Axios in Node.js for Web Scraping in 2024

Author: Mohan Ganesan

Date: Jan 9, 2024

Configure proxies for Node.js web scraping using Axios library. Learn about proxy options, authentication, rotating proxies, environment variables, custom logic, and proxy services like Proxies API.

Scraping Multiple Pages in Scala with HTTP Client and XML Libraries

Author: Mohan Ganesan

Date: Oct 15, 2023

Web scraping in Scala using HTTP client and XML libraries to extract data from multiple pages. Use XPath expressions and proxies for scalability.

Making HTTP Requests in Python with HTTPX

Author: Mohan Ganesan

Date: Feb 5, 2024

Python HTTP client HTTPX simplifies making HTTP requests, supports HTTP/1.1 and HTTP/2, and offers features like timeouts and retries.

Enable Detailed HTTP Debug Logging in Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Enable debug logging in Python Requests library to get detailed insight into HTTP requests and save time debugging issues.

Scraping New York Times News Headlines using Kotlin

Author: Mohan Ganesan

Date: Dec 6, 2023

The New York Times homepage can be scraped programmatically using Python and JSoup to extract article titles and links.

异步HTTP客户端/服务器框架aiohttp入门指南

Author: Mohan Ganesan

Date: Mar 3, 2024

aiohttp is a powerful Python asynchronous network programming framework for building high-performance asynchronous IO applications.

Getting Started with the HTTPX Python Library

Author: Mohan Ganesan

Date: Feb 5, 2024

The HTTPX library is a powerful and user-friendly HTTP client for Python. Install it with pip and make requests easily with its elegant API.

Making HTTP Requests with aiohttp in Python

Author: Mohan Ganesan

Date: Feb 22, 2024

The aiohttp library is a popular asynchronous HTTP client/server framework for Python. It allows you to make HTTP requests without blocking your application, perfect for building highly concurrent or asynchronous services.

Tired of getting blocked while scraping the web?

ProxiesAPI handles headless browsers and rotates proxies for you.
Get access to 1,000 free API credits, no credit card required!