Downloading Binary Files with Python Requests

Feb 3, 2024 · 2 min read

Python's requests module makes it easy to download files from the internet. While requests is often used to fetch text data like JSON or HTML, it can also download binary files like images, audio, PDFs, and more. In this guide, I'll walk through the key things you need to know to download binary files with requests.

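Reading the Response as Bytes

A requests response exposes the body in two forms: response.text decodes it to a string using the detected character encoding, while response.content returns the raw bytes unchanged. Binary files should always be read through response.content, response.iter_content(), or response.raw, never through response.text. If you read from response.raw directly, also set decode_content so that any gzip or deflate compression applied in transit is undone: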

response = requests.get(url, stream=True)
response.raw.decode_content = True

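Setting decode_content has nothing to do with text encoding. It only tells the underlying urllib3 response to decompress the body according to its Content-Encoding header as you read it.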
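Whatever the file type, response.content holds the body as raw bytes, whereas response.text is the decoded string. The sketch below illustrates working with those bytes directly. The helper names save_small_file and looks_like_png are hypothetical, not part of requests; the only requests attribute relied on is response.content.

```python
from pathlib import Path

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # first 8 bytes of every PNG file


def save_small_file(response, filepath):
    """Write the raw body of an already-fetched response to disk.

    Hypothetical helper: `response` is anything exposing a bytes `.content`
    attribute, such as the object returned by requests.get(url).
    """
    data = response.content  # bytes, never decoded as text
    Path(filepath).write_bytes(data)
    return len(data)


def looks_like_png(data):
    """Check the magic bytes instead of trusting the file extension."""
    return data.startswith(PNG_SIGNATURE)
```

With requests installed, this would be called as save_small_file(requests.get(url), "logo.png"). Because response.content loads the whole body into memory, this approach is only reasonable for small files.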

Stream the Download

For large files, you'll want to stream the download instead of loading the entire file into memory. This is done by setting the stream parameter:

response = requests.get(url, stream=True)

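With stream=True, requests fetches only the headers up front and leaves the body on the connection until you read it, so you can consume it in small chunks instead of holding the whole file in memory.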

Write the File Contents

To save the downloaded file, loop through the response content and write each chunk to disk:

with open(filepath, 'wb') as f:
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)  # each chunk is a bytes object

This reads the body in chunks of up to 1024 bytes and writes each one to a file opened in binary mode ('wb'), so the bytes reach disk unmodified.
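Putting streaming and chunked writing together, a complete download might look like the sketch below. The function names write_chunks and download_file, the 8192-byte default chunk size, and the 30-second timeout are illustrative assumptions rather than anything requests prescribes.

```python
def write_chunks(response, filepath, chunk_size=8192):
    """Stream `response` to `filepath`; return the number of bytes written."""
    written = 0
    with open(filepath, "wb") as f:
        for chunk in response.iter_content(chunk_size=chunk_size):
            if chunk:  # skip empty chunks
                f.write(chunk)
                written += len(chunk)
    return written


def download_file(url, filepath, chunk_size=8192, timeout=30):
    """Fetch `url` with streaming enabled and save it to `filepath`."""
    import requests  # imported lazily so write_chunks works without it

    # The context manager releases the connection even if writing fails.
    with requests.get(url, stream=True, timeout=timeout) as response:
        response.raise_for_status()  # raise on 4xx/5xx instead of saving an error page
        return write_chunks(response, filepath, chunk_size)
```

Calling raise_for_status() before writing avoids a common surprise: without it, a 404 page would be saved to disk as if it were the file you asked for.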

Handling Images and Other Media

If you are downloading images, videos, or other media, you can send an Accept header to tell the server which type of file you expect. The header is a hint for content negotiation rather than a guarantee, and a server may ignore it:

headers = {'Accept': 'image/jpeg'} 
response = requests.get(url, headers=headers, stream=True)
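Since the Accept header is only a request, it can be worth checking what the server actually sent back before saving anything. The following is a minimal sketch of that check; media_type and is_expected_type are hypothetical helper names, and the only thing assumed about the response is a headers mapping with a Content-Type entry.

```python
def media_type(response):
    """Return the response's media type without parameters, lowercased."""
    raw = response.headers.get("Content-Type", "")
    # Drop parameters such as "; charset=utf-8" and normalise case.
    return raw.split(";", 1)[0].strip().lower()


def is_expected_type(response, expected="image/jpeg"):
    """True if the server returned exactly the media type we asked for."""
    return media_type(response) == expected
```

A typical use would be to call is_expected_type(response) right after requests.get() and skip the write when it returns False, for example when a server answers an image request with an HTML error page.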

Progress Reporting

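For long downloads, you may want to show a progress bar. Because iter_content returns an iterator of chunks, you can advance a tqdm bar by the size of each chunk as you write it. The total comes from the Content-Length header, which some servers omit: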

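from tqdm import tqdm

# Content-Length may be missing; with total=None tqdm shows a plain byte counter
total_size = int(response.headers.get('Content-Length', 0)) or None
with open(filepath, 'wb') as f, tqdm(total=total_size, unit='B', unit_scale=True) as progress:
    for chunk in response.iter_content(chunk_size=1024):
        f.write(chunk)
        progress.update(len(chunk))  # advance by bytes written, not by chunk count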

tqdm takes care of drawing the progress bar and refreshing it after each chunk.
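If you would rather not add a dependency, the same idea can be sketched with the standard library alone. The helpers percent_done and write_with_progress are hypothetical names; the code assumes only that the response offers iter_content() and a headers mapping, and it falls back to a byte count when Content-Length is absent.

```python
import sys


def percent_done(written, total):
    """Return progress as an int from 0 to 100, or None when the total is unknown."""
    if not total:
        return None
    return min(100, written * 100 // total)


def write_with_progress(response, f, chunk_size=1024, out=sys.stderr):
    """Write chunks to the open binary file `f`, reporting progress to `out`."""
    total = int(response.headers.get("Content-Length", 0))
    written = 0
    for chunk in response.iter_content(chunk_size=chunk_size):
        f.write(chunk)
        written += len(chunk)
        pct = percent_done(written, total)
        label = f"{pct}%" if pct is not None else f"{written} bytes"
        out.write(f"\rDownloaded {label}")  # \r rewrites the same terminal line
    out.write("\n")
    return written
```

Progress goes to stderr by default so that it does not mix with anything the script prints to stdout.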

Following these patterns lets you download binaries of any kind, from images to executables, efficiently. Requests handles the HTTP logic, while streaming and chunked writes keep memory usage under your control.
