Stories from the Web Crawling trenches in streaming

Downloading Files in Python with aiohttp

Author: Mohan Ganesan

Date: Feb 22, 2024

Python's aiohttp library allows for asynchronous and non-blocking downloading of files. It provides a simple API, handles streams efficiently, and supports progress reporting and error handling.

Downloading Binary Files with Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Python's requests module makes it easy to download binary files from the internet. Learn how to stream the download and display a progress bar for efficient downloading.

Efficient File Uploads in Python with aiohttp

Author: Mohan Ganesan

Date: Feb 22, 2024

aiohttp provides a straightforward API for handling file uploads from clients. Validate and process uploads as byte streams, check file headers for size and type before storage, support multiple parallel uploads, and store uploaded files in a way that suits the application's needs.

Reading CSV Files with Python's urllib

Author: Mohan Ganesan

Date: Feb 8, 2024

CSV files can be downloaded with Python's urllib module and parsed with the csv module. This approach is useful for data analysis, data integration, and streaming large CSV files.
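
As a minimal sketch of the idea, assuming the CSV has a header row and is UTF-8 encoded, the response can be decoded line by line and handed to the standard csv module so the whole file never sits in memory. The function name `iter_csv_rows` is hypothetical and chosen only for illustration.

```python
import csv
from urllib.request import urlopen


def iter_csv_rows(url, encoding="utf-8"):
    """Yield each row of a remote CSV file as a dict keyed by the header row.

    iter_csv_rows is an illustrative helper, not part of urllib or csv.
    """
    with urlopen(url) as response:
        # The response yields raw byte lines; decode them lazily so that
        # only one line is held in memory at a time.
        lines = (raw.decode(encoding) for raw in response)
        yield from csv.DictReader(lines)
```

Calling `for row in iter_csv_rows(url): ...` processes rows as they arrive, which is what makes this pattern workable for large files.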

Why Your Python Requests Timeout May Not Be Timing Out As Expected

Author: Mohan Ganesan

Date: Feb 3, 2024

When using the requests library in Python, you can specify a timeout value to prevent your code from hanging indefinitely if a request gets stuck.
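
A short sketch of setting the timeout, assuming the requests package is installed. The helper name `fetch` and the default values are illustrative only. Note that the read timeout limits the wait between bytes from the server, not the total time a response may take, which is one reason a request can run longer than the number suggests.

```python
import requests


def fetch(url, connect_timeout=3.05, read_timeout=10.0):
    """Return the response body as bytes, or None if the request timed out.

    fetch is an illustrative helper; the defaults are arbitrary examples.
    """
    try:
        # The tuple sets two separate limits: how long to wait for the
        # connection, and how long to wait between bytes from the server.
        response = requests.get(url, timeout=(connect_timeout, read_timeout))
    except requests.exceptions.Timeout:
        return None
    response.raise_for_status()
    return response.content
```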

Streaming Downloads with Python Requests

Author: Mohan Ganesan

Date: Feb 3, 2024

Stream large downloads in Python using the requests library to avoid memory issues and start processing data sooner.
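
The pattern can be sketched roughly as follows, assuming the requests package is installed. The function name `stream_download`, the chunk size, and the timeout values are illustrative choices, not recommendations from the article.

```python
import requests


def stream_download(url, dest_path, chunk_size=64 * 1024, timeout=(5, 30)):
    """Download url to dest_path in chunks and return the bytes written.

    stream_download is an illustrative helper, not part of requests.
    """
    written = 0
    # stream=True defers fetching the body until iter_content is consumed,
    # so only one chunk is held in memory at a time.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()
        with open(dest_path, "wb") as out:
            for chunk in response.iter_content(chunk_size=chunk_size):
                out.write(chunk)
                written += len(chunk)
    return written
```

Because each chunk is available as soon as it arrives, the loop body is also the natural place to update a progress indicator or feed a parser.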

Efficiently Sending Files with aiohttp in Python

Author: Mohan Ganesan

Date: Mar 3, 2024

Send files over the network asynchronously in Python using the aiohttp library for efficient file transfers.

Downloading ZIP Files with aiohttp in Python

Author: Mohan Ganesan

Date: Feb 22, 2024

aiohttp is a Python library for asynchronous HTTP clients and servers. It allows for streaming ZIP file downloads in web applications and APIs.

Why Large Requests Can Fail in Python

Author: Mohan Ganesan

Date: Feb 3, 2024

The requests library in Python can encounter errors with large requests due to TCP packet size. Solutions include chunking the request body, lowering the stream threshold, compressing data, or switching protocols.
