Speed Up Python Requests with Caching

Feb 3, 2024 · 2 min read

When making HTTP requests in Python with the popular requests library, you may notice that follow-up requests to the same host are faster. That speedup comes from reuse — specifically, from routing your requests through a requests.Session.

Under the hood, a Session (backed by urllib3's connection pool) keeps state from the initial request around so that follow-up requests can skip redundant work. For example, it can reuse:

  • The open TCP connection (no repeated DNS lookup or handshake)
  • The negotiated TLS session (for HTTPS requests)
  • HTTP authentication
  • Cookies
This means that instead of re-doing DNS lookups, setting up new TLS connections, and re-authenticating on every request, a Session can skip straight to the pooled connection and its saved state.
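That reuse is available explicitly through requests.Session, which persists cookies, authentication, and headers across calls and pools connections per host. A minimal sketch — the credentials, header, and cookie values below are placeholders, not anything the library requires:

```python
import requests

# State set on a Session persists across every request made through it.
session = requests.Session()
session.auth = ('user', 'secret')              # placeholder credentials
session.headers.update({'User-Agent': 'my-app/1.0'})
session.cookies.set('token', 'abc123')         # placeholder cookie

# Any later call, e.g. session.get('https://example.com/'), sends
# this auth and cookie automatically and reuses the pooled
# connection when it targets the same host.
```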

Here's a quick example:

    import time
    import requests

    session = requests.Session()

    start = time.perf_counter()
    resp = session.get('https://example.com/')
    end = time.perf_counter()

    print(f"Initial request took {end - start:.2f} seconds")

    start = time.perf_counter()
    resp = session.get('https://example.com/')
    end = time.perf_counter()

    print(f"Subsequent request took {end - start:.2f} seconds")

On the initial request, we time how long it takes to resolve DNS, establish a TCP (and TLS) connection, send the request, and read the response. On the second request the Session reuses the pooled connection, so it should be noticeably faster.

The ability to reuse connections and persist state this way is very useful for building fast, efficient data-retrieval workflows. Just be aware that the connection pool holds a limited number of connections per host, and servers may close idle connections, so the benefit can fade for infrequent requests.
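If the default pool is too small for your workload — say, many parallel requests to the same host — you can enlarge it by mounting an HTTPAdapter. A sketch; the pool sizes here are illustrative, not a recommendation:

```python
import requests
from requests.adapters import HTTPAdapter

session = requests.Session()

# pool_connections: number of per-host pools to cache;
# pool_maxsize: max connections kept per pool. Both default to 10.
adapter = HTTPAdapter(pool_connections=20, pool_maxsize=50)
session.mount('https://', adapter)
session.mount('http://', adapter)
```

After mounting, every URL matching those prefixes goes through the larger pool.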
