Caching in Python

Dec 6, 2023 ยท 8 min read

APIs are essential for modern applications, but excessive API requests can really slow things down. In this comprehensive guide, I'll share how to cache API responses in Python to supercharge performance.

Back when I was first learning web scraping, I ran into major slowdowns because my scripts would repeatedly call the same APIs. After banging my head for a while, I discovered caching - a total game changer! Now I'll show you how to cache like a pro.

Why is Caching Helpful?

Caching allows you to save a copy of an API response and return the cached data instead of making duplicate requests.

This improves speed and reduces traffic to APIs. For instance, say your script fetches a product list from an ecommerce API. Instead of hitting the API on each search, you could cache the response for a certain period. Now your script can read the cached data and only refetch when the cache expires.

Other caching benefits:

  • Lightning fast response times for cached data
  • Lower API costs by reducing requests
  • Smoother experience even on poor networks
  • Avoid rate limiting issues on APIs
  • Reduce server and infrastructure load
  • Caching is a must-have tool for large web scraping and data collection projects.

    Installing Caching Libraries

    Python doesn't include caching natively, but we can easily add it. The requests-cache library is purpose-built for caching API responses from Python's super popular requests module.

    Install both via pip:

    pip install requests requests-cache

    Requests-cache supports various storage backends like SQLite, Redis, and MongoDB for saving cached responses. I prefer SQLite since it's a simple single file database.

    Some other handy caching tools are cachecontrol, which implements client-side caching, and django-cache-machine for Django apps. We'll focus on requests-cache here since it's easy to use standalone.

    Basic Caching Usage

    Requests-cache provides a CachedSession class that acts like a requests Session, but with automatic caching behavior.

    Here's an example:

    import requests
    import requests_cache
    session = requests_cache.CachedSession('api_cache')
    response = session.get('<>')

    Now the first call fetches and caches the response. Subsequent calls just read from cache without hitting the API again until the cache expires. Easy enough!

    CachedSession is great for simple use cases. But what if you want to enable caching on all requests by default without managing a session?

    Requests-cache offers a clever install_cache method that patches requests globally:

    import requests
    import requests_cache
    response = requests.get('<>')

    This transparently caches all requests through requests. Much easier than wrapping every call in a session!

    A common doubt is whether enabling caching affects getting fresh data. By default, requests-cache caches indefinitely. But we can set an expiry period so caches automatically invalidate after some time:

    requests_cache.install_cache('api_cache', expire_after=3600)

    Now caches last 1 hour (3600 seconds). The next request after that fetches a new response from the API and caches it again.

    Advanced Caching Techniques

    Now that you've seen basic caching, let's dig into some more advanced tactics:

    Conditional Requests

    When a cache expires, instead of fetching the full response again, the client can make a conditional request to just check if the resource changed:

    GET /data
    If-Modified-Since: <date>

    This validates the cache against the server's current version. If unchanged, it returns a 304 status without any body, saving bandwidth. The client keeps using the existing cached data.

    Cache Invalidation

    Sometimes you need to purge the cache and fetch fresh data on demand. Requests-cache provides a cache.delete_url() method for this:

    import requests_cache
    requests_cache.uninstall_cache() # Clear all caches
    requests_cache.delete_url('<>') # Clear one URL

    This allows manual cache invalidation when the source data changes.

    Cache Hierarchies

    Real-world caching uses a hierarchy of layered caches. Requests may check the local browser cache, then a shared proxy cache, then finally the origin server.

    We can mimic this in Python by combining requests-cache with client-side caching tools like CacheControl. Add both caches and prioritize CacheControl for low-level caching.

    Avoiding Stale Caches

    Stale caches can cause hard-to-debug issues if you expect live data. Some tips:

  • Set reasonable cache expiry times
  • Add caching only for endpoints not frequently updated
  • Use cache deletion and cache-busting parameters (e.g. ?v=2)
  • Design APIs to announce data changes via headers like Last-Modified
  • When to Cache Responses

    Now let's discuss when caching makes sense and when to avoid it:

  • Unchanging data: Perfect for infinite caching. Use hashes in filenames and long cache times.
  • Frequent requests: Cache even dynamically generated data. Set shorter cache times.
  • Slow APIs: Reduce calls to speed up overall performance.
  • User-specific data: Avoid caching to prevent leakage to other users.
  • Rapidly updating data: Disable caching completely via no-cache headers.
  • Already speedy APIs: May not benefit much from caching since the API is fast already.
  • Also consider caching these types of requests:

  • Large responses: Caching avoids re-downloading large amounts of data.
  • Frequently accessed resources: The cache reduces load on frequently used resources.
  • Rate limited APIs: Caching ensures you don't exceed rate limits.
  • High cost APIs: Cut costs by caching data locally.
  • Common Caching Patterns

    Here are some useful caching architectures:


    Load data on demand, check if cached, otherwise fetch from API and cache result. Great for dynamic data.


    Fetch first from cache, return result. Then asynchronously fetch latest from API in background to keep cache fresh.


    On cache miss, load from API and add result to cache before returning response. Keeps cache populated.


    When updating API, propagate write to cache too. Ensures cache consistency with source.

    These patterns help build robust, high-performance caching that stays in sync with APIs.

    Handling Caching Gracefully

    Caching can fail unexpectedly if the cache is cleared or unavailable. Make sure your code handles these scenarios gracefully:

    import requests
    import requests_cache
      response = requests.get('<>')
    except requests_cache.CacheMissingError:
      # Cache missing - fall back to live API request
      response = requests.get('<>')

    Use cache.get_cache_response() and cache.get_cache_timeout() methods to check cache state without exceptions:

    from datetime import timedelta
    response = cache.get_cached_response(url, timedelta(minutes=15))
    if not response:
      # Cache expired, fetch from live API
      response = requests.get(url)

    This allows handling scenarios like expired caches or disabled caching more smoothly in your code.

    Caching HTTP Errors

    By default, requests-cache caches only 200 responses. To cache 404s, 500s and other HTTP errors:

    session = requests_cache.CachedSession(cache_errors=True)

    Now all responses including errors will be cached. This avoids hitting APIs repeatedly just to collect error codes.

    Set allowable_codes to cache only certain errors, like 404s:

    session = requests_cache.CachedSession(

    Caching errors can improve efficiency in scrapers that process many links. But don't cache sensitive errors containing information leaks!


    Caching is a crucial technique for improving API-driven applications. With tools like requests-cache, it's easy to add powerful caching to Python programs.

    We covered the basics of enabling caching, as well as advanced topics like cache invalidation, conditional requests, cache hierarchies, and common caching architectures.

    Key takeaways:

  • Caching reduces API requests and improves speed
  • Use requests-cache for simple caching with Python
  • Set reasonable cache expiration times
  • Invalidate caches when data changes
  • Apply common caching patterns like cache-aside
  • Now you have all the knowledge to implement caching like a pro!


    What is the difference between caching and data replication?

    Caching stores temporary copies of frequently accessed data to improve speed. Replication creates multiple redundant copies of data on different servers to improve availability if one server goes offline.

    How is Redis different than caching?

    Redis is a fast in-memory data store often used to build caches. Caching is a general concept for improving performance by saving copies of data. Redis provides an infrastructure to implement caching.

    Where is the HTTP cache stored on browsers?

    Browser caches are typically stored on disk in a cache folder specific to that browser. For example, Chrome caches are under ~/Library/Caches/Google/Chrome on macOS. Caches may also be kept in memory for even faster access.

    Should POST requests be cached?

    No, POST requests should not be cached as they are often used for data creation with side-effects. Caching POSTs can lead to duplicate data submissions if replayed from cache.

    How can I benchmark my caching performance?

    Use a tool like Locust to load test APIs with and without caching enabled. Compare metrics like response times and requests per second to quantify caching gains.

    Hope this guide helps you become a caching master! Let me know if you have any other questions.

    Browse by tags:

    Browse by language:

    The easiest way to do Web Scraping

    Get HTML from any page with a simple API call. We handle proxy rotation, browser identities, automatic retries, CAPTCHAs, JavaScript rendering, etc automatically for you

    Try ProxiesAPI for free

    curl ""

    <!doctype html>
        <title>Example Domain</title>
        <meta charset="utf-8" />
        <meta http-equiv="Content-type" content="text/html; charset=utf-8" />
        <meta name="viewport" content="width=device-width, initial-scale=1" />


    Don't leave just yet!

    Enter your email below to claim your free API key: