Retrying Failed Requests in Python Requests (with Code Examples!)

Oct 31, 2023 · 11 min read

Networks are unreliable. Servers fail. APIs go down temporarily. As developers, we've all experienced the frustration of HTTP requests failing at the worst possible time. Just when you need to process time-sensitive data or charge a customer, the API you rely on has an outage.

But with a robust retry mechanism, these inevitable failures don't have to ruin your application. By automatically retrying requests, you can vastly improve reliability despite flakey networks and services.

In this comprehensive guide, you'll learn how to retry failed requests in Python using the excellent Requests library. We'll cover:

  • The different types of request failures
  • Implementing retry logic with Sessions and HTTPAdapter
  • Building a custom retry wrapper from scratch
  • Configuring retries and delays
  • Advanced retry strategies
  • Special considerations for different HTTP methods like POST
  • Insider tips to avoid common pitfalls

To demonstrate each concept, we'll use practical code examples from real-world scenarios. By the end, you'll be able to incorporate robust request retries to handle errors and build resilient applications in Python. Let's get started!

    Why Retry Failed Requests?

    First, it helps to understand why retries are so crucial when working with remote APIs and services.

    Distributed systems fail in complex ways. Here are just some common issues that can happen:

  • Network errors - the route to the server goes down temporarily
  • Overloaded servers - too many requests flood the API server
  • Timeouts - the server takes too long to process the request
  • 5xx errors - the server encounters an internal error
  • 429 rate limiting - you've hit a rate limit and been throttled

    These failures occur frequently, especially when relying on external APIs. But in many cases, the issue is transient.

    For example, let's say you're a ridesharing company that uses a payments API to charge customers. But when a huge processing load hits your servers during peak hours, the payment API starts timing out.

    Without retries, you'll start seeing failed payments and angry customers! But if you retry the charge requests, there's a good chance the timeout was a temporary blip that will succeed on retry.

    In short, retrying failed requests provides fault tolerance against the inherent unreliability of distributed systems. This prevents transient errors from affecting your application and improves reliability immensely.

    Categorizing Request Failure Scenarios

    To implement request retries effectively, you first need to understand the various types of failures that can occur. This allows you to customize your retry behavior accordingly.

    There are two major categories:

    Network Errors

    These occur when the HTTP client cannot establish a connection to the server in the first place. Some examples include:

  • DNS lookup failures - the domain name can't be resolved to an IP address
  • Connection refusals - the server rejects the connection
  • Connection timeouts - the connection takes too long to establish

    Retrying on network errors is often safe, as connectivity issues are usually intermittent.
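
    In Requests, these network-level failures surface as exceptions you can catch and retry on. A minimal illustration (the URL is a placeholder):

    import requests
    from requests.exceptions import ConnectionError, Timeout

    try:
        response = requests.get('https://api.example.com/data', timeout=5)
    except (ConnectionError, Timeout) as exc:
        # DNS failures, refused connections, and connect timeouts all land here
        print(f'Network error, worth a retry: {exc}')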

    HTTP Errors

    Once a connection is established, the server may still return an HTTP error response:

  • 4xx client errors - invalid request, authorization failure, etc
  • 5xx server errors - internal server error, gateway timeout, etc
  • 429 Too Many Requests - rate limiting threshold exceeded

    5xx errors and 429 rate limiting specifically indicate transient server problems where a retry is appropriate.

    However, for some 4xx client errors like 401 Unauthorized, retrying will likely fail again. So you may want to avoid retrying certain 4xx errors. We'll look at how to do this later.

    Now that we understand the common failure scenarios, let's look at implementing robust retries in Python Requests!

    Configuring Retries in Requests

    The Requests library provides several options for retrying failed requests automatically.

    The two main approaches are:

    1. Using Sessions with a HTTPAdapter
    2. Building your own custom retry wrapper

    Let's explore these in detail.

    Using Sessions and HTTPAdapter

    Requests has the concept of a Session - a container that stores common settings across multiple requests.

    This allows us to configure a retry strategy once that applies to all requests using that Session.

    Here's a simple example:

    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util import Retry
    
    session = requests.Session()
    
    retries = Retry(total=5,
                    backoff_factor=1,
                    status_forcelist=[502, 503, 504])
    
    session.mount('https://', HTTPAdapter(max_retries=retries))
    

    We create a Retry object that defines our retry strategy:

  • total - Max number of retries (5)
  • backoff_factor - Sleep time increases exponentially between retries (defaults to 0)
  • status_forcelist - Retry on server errors like 5xx (502 Bad Gateway, 503 Service Unavailable, 504 Gateway Timeout)

    Next, we create an HTTPAdapter with our retry configuration and mount it to the session. This applies the retry strategy to all requests made through the session.

    Now any calls using this session will automatically retry up to 5 times on those 5xx errors:

    response = session.get('https://api.example.com/data')
    

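    If every retry is exhausted on a forcelisted status, Requests raises a RetryError instead of returning a response. A small sketch of catching it (the URL is a placeholder):

    from requests.exceptions import RetryError

    try:
        response = session.get('https://api.example.com/data')
    except RetryError:
        # Raised once the Retry budget is exhausted
        print('Gave up after 5 retries')
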
    The HTTPAdapter approach is great because it's simple and built-in. However, we don't have much control over the detailed retry logic. For that, a custom wrapper is better.

    Building a Custom Retry Wrapper

    To implement a retry wrapper, we'll create a get_with_retries() function that wraps the usual requests.get() call.

    Here's a basic example:

    import requests
    from requests.exceptions import RequestException

    MAX_RETRIES = 5

    def get_with_retries(url):
        response = None
        for i in range(MAX_RETRIES):
            try:
                response = requests.get(url, timeout=10)

                # Exit if the request succeeded
                if response.status_code == 200:
                    return response

            except RequestException:
                print(f'Request failed, retrying {i+1}/{MAX_RETRIES}...')

        return response
    

    We try making the request up to 5 times inside the for loop. If a RequestException is raised (or the response isn't a 200), we loop around and try again.

    This gives us complete control over the retry logic. Later we'll see how to customize it further.

    The retry wrapper approach requires more code, but it can handle edge cases the HTTPAdapter approach cannot. Each approach has its place.

    Now let's look at how to configure retries for robustness.

    Defining Your Retry Strategy

    Simply enabling retries isn't enough - we need to tune them for our specific use case. Here are some key parameters to consider:

    Number of Retries

    How many times should a failed request be retried before giving up?

    This depends on the API and the type of failures you expect. For transient server issues, 3-5 retries is usually sufficient, while a lower cap like 2-3 guards against excessive retrying.

    # Retry up to 3 times
    retries = Retry(total=3)
    

    For permanent errors like 400 Bad Request, retrying is pointless, so you may want total=0.

    Tuning this parameter balances reliability against excessive retrying. Start low and increase as needed.

    Delay Between Retries

    To avoid hammering a server, it's good to add a delay between retries:

    # Back off exponentially between retries
    retries = Retry(backoff_factor=1)
    

    The backoff_factor provides exponential backoff - the retry delay will grow exponentially on each try.

    The actual delay is calculated as:

    {backoff factor} * (2 ** ({number of retries} - 1))
    
    # With backoff_factor = 1, retries wait roughly:
    # 0, 2, 4, 8, 16, 32... seconds
    # (urllib3 applies no delay before the first retry)
    

    Starting with small backoff factors (0.1-1) prevents waiting too long. The delays will increase as retries continue to fail.

    Backoff Algorithms

    Two advanced backoff strategies are worth mentioning:

  • Jitter - Add random jitter to the backoff delays. This helps when making parallel requests by avoiding synchronized retries.
  • Rate limiting - Monitor recent request rates and back off dynamically based on the rate limit. This prevents banging your head against a rate limit.

    These require custom wrappers to implement, but they can make your retries smarter - see the rate-limiting sketch just below.
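
    As a minimal sketch of rate-limit-aware backoff, this hypothetical helper honors the server's Retry-After header on 429 responses (note urllib3's Retry already respects Retry-After by default; the helper name and URL are illustrative):

    import time
    import requests

    def get_respecting_rate_limits(url, max_retries=3):
        for attempt in range(max_retries):
            response = requests.get(url, timeout=10)
            if response.status_code != 429:
                return response
            # Use the server's hint (assumed to be in seconds);
            # fall back to exponential backoff if the header is absent
            delay = float(response.headers.get('Retry-After', 2 ** attempt))
            time.sleep(delay)
        return response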

    Status Codes to Retry

    By default, retries only happen on network errors before the request reaches the server. To retry on certain HTTP status codes, specify the status_forcelist:

    retries = Retry(total=3,
                    status_forcelist=[502, 503, 504])
    

    Common transient server issues like 502, 503, and 504 are good candidates to retry.

    429 Too Many Requests can indicate hitting a rate limit - retrying with backoff is recommended.

    Some 4xx client errors may be retry-worthy depending on context. But avoid retrying errors like 400 Bad Request which are client mistakes.
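
    For example, to also retry automatically on 429 with backoff, add it to the forcelist (the values shown are illustrative):

    retries = Retry(total=3,
                    backoff_factor=0.5,
                    status_forcelist=[429, 502, 503, 504])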

    Dealing with Timeouts

    Network timeouts are a special case - once a request times out, the server may still be processing the original request. Simply retrying could cause duplicate effects.

    There are a couple ways to handle this:

  • For idempotent requests, retry timeouts immediately
  • For non-idempotent requests, wait before retrying with an exponential backoff

    Idempotent requests (GET, PUT, DELETE) are safe to retry, but POST and others could duplicate. The backoff gives time for the original request to finish before retrying.

    We'll discuss idempotency more later. But dealing with timeouts properly ensures duplicate requests don't happen.
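
    As a minimal sketch, here's a helper that retries only on timeouts for an idempotent GET (the helper name and timeout value are illustrative):

    import requests
    from requests.exceptions import Timeout

    def get_with_timeout_retry(url, max_retries=3):
        # Safe for idempotent GETs: a duplicate read has no side effects
        for attempt in range(max_retries):
            try:
                return requests.get(url, timeout=5)
            except Timeout:
                print(f'Timed out, retry {attempt + 1}/{max_retries}')
        raise Timeout(f'{url} timed out after {max_retries} attempts')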

    Advanced Retry Customization

    For more control, you can get creative with your custom retry wrapper. Here are some advanced strategies:

    Retry Conditions

    Instead of a simple catch-all except, you can specify certain conditions to trigger a retry.

    For example, to retry on a 422 status code:

    # Inside your retry loop:
    if response.status_code == 422:
        continue  # treat 422 as transient for this API and retry
    

    This allows retrying on domain-specific transient failures beyond just 5xx.
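
    A runnable version of this idea might look like the following (the status set and helper name are illustrative):

    import time
    import requests

    RETRYABLE_STATUSES = {422, 429, 502, 503, 504}

    def get_with_status_retries(url, max_retries=3):
        for attempt in range(max_retries):
            response = requests.get(url, timeout=10)
            if response.status_code not in RETRYABLE_STATUSES:
                return response
            time.sleep(2 ** attempt)  # exponential backoff between attempts
        return response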

    Conditional Retries

    Certain responses may contain indicators that a retry is or isn't recommended:

    # Hypothetical markers - adapt these to your API's conventions:
    if 'shouldRetry' in response.headers:
        continue  # retry (inside your retry loop)

    if 'doNotRetry' in response.json():
        return response
    

    This gives you added flexibility beyond just status codes.

    Request Hooks

    Hooks allow attaching callbacks to different stages of a request.

    We can use them to log retries or increment counters:

    def log_response(response, *args, **kwargs):
        # Fires for every response made through the session
        print(f'Got {response.status_code} from {response.request.url}')

    session = requests.Session()
    session.hooks['response'] = [log_response]
    

    This helps debug retries and gives visibility into how often they occur.

    Exponential Backoff

    Instead of fixed delays, backing off exponentially helps you stay clear of rate limits:

    import time

    def get_backoff_time(attempt):
        # 2, 4, 8, 16... seconds for attempt = 1, 2, 3, 4...
        return 2 ** attempt

    # Inside the retry loop:
    time.sleep(get_backoff_time(attempt))
    

    Starting with 2 seconds, this waits 2, 4, 8, 16, 32... seconds between retries.

    Jittered Retries

    Adding a random "jitter" factor prevents multiple clients synchronizing retries:

    import random
    import time

    def get_jittered_backoff(attempt):
        # Scale the exponential delay by a random factor in [0.5, 1.5)
        jitter_factor = random.uniform(0.5, 1.5)
        return (2 ** attempt) * jitter_factor

    # Inside the retry loop:
    time.sleep(get_jittered_backoff(attempt))
    

    This avoids sudden spikes of retries from many clients.

    Handling Different Request Types

    Retrying gets nuanced when we consider different HTTP request methods like POST, PUT, and DELETE.

    Retrying GET Requests

    GET requests are read-only and idempotent. This means they're safe to retry without side effects - each retry is identical to the first.

    So it's usually fine to retry GETs without any special handling. Just watch for infinite loops due to programming errors.

    Retrying POST Requests

    Retrying POST requests can be dangerous - a duplicate POST may create duplicate resources.

    To safely retry POSTs:

  • Check for idempotency keys in the request
  • Add delays between retries
  • Avoid retrying 4xx client errors

    Idempotency keys are a technique where the client adds a unique key so the server can detect duplicate requests.
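
    As a sketch, assuming the API accepts an Idempotency-Key header (a convention popularized by payment APIs; the endpoint and payload are illustrative):

    import uuid
    import requests

    # Generate one key per logical operation and reuse it across retries
    idempotency_key = str(uuid.uuid4())

    # Hypothetical charge endpoint
    response = requests.post(
        'https://api.example.com/charges',
        json={'amount': 1000},
        headers={'Idempotency-Key': idempotency_key},
        timeout=10,
    )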

    Adding a backoff delay also reduces the chance of duplicates by allowing time for the original request to finish before retrying.

    And 4xx errors like 400 Bad Request usually indicate a client mistake that warrants investigation before retrying blindly.

    With proper caution, POSTs can usually be retried safely.

    Retrying PUT/PATCH Requests

    PUT is idempotent by specification, but PATCH is not, and real-world update endpoints don't always behave idempotently. Take similar precautions as with POST:

  • Use idempotency keys
  • Add delays between retries
  • Avoid retrying 4xx errors

    For example, a common pattern is to retry PUTs and PATCHes on 5xx errors but not 4xx.

    Again taking care to prevent duplicates, these request types can also be retried in most cases.

    Retrying DELETE Requests

    DELETE requests can be retried safely as they are idempotent - a duplicate DELETE causes no harm.

    The main risk is retrying when the original DELETE already succeeded - the retry may hit a 404 or, worse, delete a resource that was recreated in the meantime.

    Checking the response status before retrying avoids this edge case:

    # retry_delete() is a placeholder for your own retry helper
    if response.status_code not in (200, 204):
        retry_delete(response.request.url)
    

    With status code checking, DELETEs can usually be retried freely.

    Avoiding Problems with Retries

    While powerful, misusing retries can also cause problems. Here are some pitfalls to avoid:

    Infinite Retries

    If there's a programming bug that causes each retry to fail, you can wind up in an infinite loop. Use a conservative max_retries and increment a counter to prevent this.

    Overloading APIs

    Too many rapid retries may overload APIs and worsen outages. Exponential backoff helps by adding delays between retries.

    Duplicate Requests

    As discussed above, take care to avoid duplicating side effects when retrying PUT, POST, DELETE requests.

    Blocking Traffic

    If traffic is blocked by a firewall misconfiguration, retrying endlessly won't fix it. At some point you'll want to fail and alert developers.

    Disregarding Rate Limits

    Ignoring 429 Too Many Requests and blindly retrying may completely block access. Dynamic rate limit monitoring helps, as discussed earlier.

    Best Practices for Production

    Here are some recommended best practices when implementing request retries:

  • Start with conservative limits like 2-3 retries maximum
  • Handle 429 rate limiting errors with backoff
  • Use exponential backoff and jitter between retries
  • Be very cautious with POST/PUT/DELETE retries
  • Implement request hooks to log and monitor retries
  • Set overall request timeouts along with retries
  • Consider dynamic rate limit detection
  • Flag excessive retries and alert developers
  • Thoroughly test edge cases before deploying

    Following these will help avoid common pitfalls and ensure your retries improve reliability. The sketch below pulls several of them together.
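
    A hedged sketch of such a configuration (allowed_methods requires urllib3 >= 1.26; the URL is illustrative):

    import requests
    from requests.adapters import HTTPAdapter
    from urllib3.util import Retry

    session = requests.Session()

    retries = Retry(
        total=3,
        backoff_factor=0.5,
        status_forcelist=[429, 502, 503, 504],
        allowed_methods=['GET', 'PUT', 'DELETE'],  # skip unsafe POST retries
    )

    adapter = HTTPAdapter(max_retries=retries)
    session.mount('https://', adapter)
    session.mount('http://', adapter)

    # Always pair retries with a per-request timeout
    response = session.get('https://api.example.com/data', timeout=10)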

    Common Mistakes to Avoid

    To wrap up, here are some common mistakes that can undermine your retry logic:

  • Retrying endlessly without limits
  • Not adding any delays between retries
  • Blindly retrying all error codes without checking
  • Retrying 404 Not Found errors pointlessly
  • Failing to handle POST/PUT/DELETE retries carefully
  • Forgetting overall request timeouts in addition to retries
  • Disabling retries in production due to edge case bugs

    Carefully avoiding these blunders will set you up for success with request retries!
