Using Proxies with Python Requests

Oct 22, 2023 · 11 min read

Introduction

Requests is one of the most popular Python libraries for making HTTP requests. It abstracts away much of the complexity of working with HTTP and allows you to make API calls with minimal code.

However, when scraping or repeatedly querying APIs, you may find your IP address gets blocked. This is where proxies come in handy.

What is a Proxy Server?

A proxy server acts as an intermediary between your client (like a Python script) and the target server you want to interact with. The proxy receives the request, forwards it to the target server while appearing as the original client, gets the response, and sends it back to you.

This allows you to mask your real IP address and avoid getting blocked by servers due to repeated requests from the same IP.

Overview of Python Requests Library

The requests library provides an elegant and simple way to make HTTP calls in Python. Here are some key features:

  • Supports GET, POST, PUT, DELETE and other HTTP methods
  • Allows passing parameters, headers, cookies, files and authentication
  • Automatic JSON decoding for responses
  • Sessions with cookie persistence
  • Timeouts, plus retries through transport adapters
  • Connection pooling and keep-alive
  • Works with HTTP proxies

With this powerful library, you can interact with APIs and web services easily. Now let's see how to use it with proxies.

    Setting Up Proxies

    To use a proxy with requests, we need to configure the proxy URL. Here's how:

    Creating the Proxies Dictionary

    Requests expects us to pass a dictionary containing proxies for HTTP and HTTPS connections:

    proxies = {
      'http': 'http://10.10.1.10:3128',
      'https': 'http://10.10.1.10:1080'
    }
    

    We can then pass this proxies dictionary to any requests method.

    Proxy URL Format

    The proxy URL follows this format:

    <PROTOCOL>://<IP>:<PORT>
    

    Common protocols are HTTP, HTTPS and SOCKS. The IP is the proxy server's address, and the port is the one the proxy listens on.
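
As a quick sanity check of the format above, a proxy URL can be split into its parts with the standard library. This is only a sketch: `describe_proxy_url` is a hypothetical helper name, not part of requests.

```python
from urllib.parse import urlsplit


def describe_proxy_url(proxy_url):
    """Split a proxy URL into protocol, host and port, or raise ValueError."""
    parts = urlsplit(proxy_url)
    # A usable proxy URL needs all three pieces: scheme, host and port
    if not parts.scheme or not parts.hostname or parts.port is None:
        raise ValueError(f"Expected <PROTOCOL>://<IP>:<PORT>, got {proxy_url!r}")
    return {"protocol": parts.scheme, "host": parts.hostname, "port": parts.port}


print(describe_proxy_url("http://10.10.1.10:3128"))
```

A URL without a protocol, such as 10.10.1.10:3128, is rejected by this check, which is a common mistake when copying addresses from proxy lists.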

    Multiple Protocols

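    You can use the same or different proxies for HTTP and HTTPS traffic. The dictionary key refers to the scheme of the URL you are requesting, not to the protocol spoken by the proxy itself, which is why both proxy URLs below start with http://.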

    For example, if your HTTP proxy is 10.10.1.10:3128 and HTTPS proxy is 10.10.1.11:1080, the proxies dictionary would be:

    proxies = {
      'http': 'http://10.10.1.10:3128',
      'https': 'http://10.10.1.11:1080'
    }
    

    Requests will route HTTP requests to 10.10.1.10:3128 and HTTPS requests to 10.10.1.11:1080.
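
To see which proxy would be chosen for a given URL without sending anything, one option is the `select_proxy` helper in `requests.utils`. It is a utility that requests uses internally, so treat this as an illustration rather than a stable public interface.

```python
from requests.utils import select_proxy

proxies = {
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.11:1080",
}

# The dictionary key is matched against the scheme of the URL being requested
http_choice = select_proxy("http://example.com/", proxies)
https_choice = select_proxy("https://example.com/", proxies)

print(http_choice)
print(https_choice)
```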

    Same vs Different Proxies

    Using the same proxy for HTTP and HTTPS is also common:

    proxy = 'http://10.10.1.10:3128'
    
    proxies = {
      'http': proxy,
      'https': proxy
    }
    

    This forwards both HTTP and HTTPS traffic to 10.10.1.10:3128.

    Making Requests

    Once we have the proxies configured, we can make requests as usual:

    GET Requests

    import requests
    
    proxies = {
      'https': 'http://10.10.1.10:1080'
    }
    
    response = requests.get('https://httpbin.org/ip', proxies=proxies)
    

    This fetches the IP address that Httpbin sees for your request. When the proxy is working, that is the proxy's IP rather than your own.

    POST Requests

    Sending a POST request is just as easy:

    data = {'key1': 'value1', 'key2': 'value2'}
    
    response = requests.post('https://httpbin.org/post', data=data, proxies=proxies)
    

    This POSTs the data to Httpbin via the proxy server.

    Other HTTP Methods

    Besides GET and POST, requests supports other HTTP methods like PUT, DELETE, HEAD and OPTIONS. The syntax is similar:

    requests.put(url, data=data, proxies=proxies)
    requests.delete(url, proxies=proxies)
    requests.head(url, proxies=proxies)
    requests.options(url, proxies=proxies)
    

    So using a proxy is just an additional parameter to pass.

    Authentication

    Some proxies require authentication to allow access. Here's how to pass credentials:

    Basic Authentication

    To pass a username and password for proxy authentication, use this syntax:

    http://<USERNAME>:<PASSWORD>@<IP>:<PORT>
    

    For example:

    proxy = 'http://user123:pass456@10.10.1.10:3128'
    
    proxies = {
      'http': proxy,
      'https': proxy
    }
    

    This will authenticate using the provided credentials.

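    Credentials embedded in the proxy URL are sent using Basic authentication. Other proxy authentication schemes such as Digest and NTLM are not handled by requests out of the box and typically need a third-party package.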
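
One practical detail with credentials in the URL: characters such as @, : or / in a username or password need to be percent-encoded, otherwise the URL is parsed incorrectly. A minimal sketch using the standard library follows; `build_proxy_url` is a hypothetical helper, and the password shown is made up for illustration.

```python
from urllib.parse import quote


def build_proxy_url(username, password, host, port, scheme="http"):
    """Return a proxy URL with percent-encoded credentials."""
    # safe="" makes quote() escape every reserved character, including "/"
    user = quote(username, safe="")
    pwd = quote(password, safe="")
    return f"{scheme}://{user}:{pwd}@{host}:{port}"


proxy = build_proxy_url("user123", "p@ss:456", "10.10.1.10", 3128)
proxies = {"http": proxy, "https": proxy}
print(proxy)
```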

    Proxy URL with Credentials

    Credentials can also be set per scheme, so that only some of your traffic is authenticated:

    proxies = {
      'http': 'http://10.10.1.10:3128',
      'https': 'http://user123:pass456@10.10.1.10:3128'
    }
    

    Here requests to HTTP URLs go through the proxy without credentials, while requests to HTTPS URLs send the username and password.

    Sessions

    When working with APIs, you often need to persist cookies over multiple requests. For this, the requests Session is useful:

    Creating a Session Object

    session = requests.Session()
    

    This will initialize a new session.

    Setting Proxies on Session

    Then we set proxies on the session instead of individual requests:

    session.proxies = proxies
    

    Now all requests through this session will use the defined proxies.

    For example:

    response = session.get('https://httpbin.org/cookies/set?name=abc')
    print(response.text)
    
    response = session.get('https://httpbin.org/cookies')
    print(response.text)
    print(response.text)
    

    This persists cookies across requests.
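
Putting the session pieces together, a minimal sketch could look like this. The proxy address is the placeholder used throughout this article. Setting `trust_env = False` is optional and shown as an assumption about what you might want: it stops the session from picking up proxy settings from the environment, and it also disables other environment lookups such as .netrc credentials.

```python
import requests

session = requests.Session()

# update() merges the entries into the session's existing proxy mapping
session.proxies.update({
    "http": "http://10.10.1.10:3128",
    "https": "http://10.10.1.10:3128",
})

# Optional: ignore HTTP_PROXY / HTTPS_PROXY so only the proxies above are used
session.trust_env = False

# Every request made through this session now uses the configured proxy:
# response = session.get("https://httpbin.org/cookies")
```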

    Environment Variables

    For convenience, requests also checks for HTTP_PROXY and HTTPS_PROXY environment variables:

    HTTP_PROXY and HTTPS_PROXY

    On Linux/macOS:

    export HTTP_PROXY="http://10.10.1.10:3128"
    export HTTPS_PROXY="https://10.10.1.10:8080"
    

    On Windows:

    set HTTP_PROXY=http://10.10.1.10:3128
    set HTTPS_PROXY=https://10.10.1.10:8080
    

    With this, you don't need to pass proxies in code.
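
To make the mapping concrete, here is a simplified sketch of how these variables translate into a proxies dictionary. Requests does this lookup for you and applies extra rules (for example NO_PROXY), so `proxies_from_env` is a hypothetical helper for illustration only.

```python
import os


def proxies_from_env(environ=os.environ):
    """Build a requests-style proxies dict from HTTP(S)_PROXY variables."""
    proxies = {}
    for scheme in ("http", "https"):
        # Lowercase names are checked first, then the uppercase form
        value = environ.get(f"{scheme}_proxy") or environ.get(f"{scheme.upper()}_PROXY")
        if value:
            proxies[scheme] = value
    return proxies


print(proxies_from_env({"HTTP_PROXY": "http://10.10.1.10:3128"}))
```

Passing a plain dictionary instead of the real environment, as in the last line, makes the helper easy to try out without changing your shell settings.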

    Advantages

  • Avoid repeating proxy config in every script
  • Centralize your proxy settings
  • Dynamically change proxies
  • Override specific proxies by passing proxies dictionary

    So use environment variables or the proxies dict, depending on your needs.

    Advanced Usage

    Now that we've covered the basics, let's look at some advanced use cases:

    Rotating Proxies

    To avoid getting blocked, you can rotate between a pool of proxy IPs:

    1. Get a list of free or paid proxy servers
    2. Implement proxy rotation logic:
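    import random
    import requests
    
    # Pool of proxy URLs; each entry needs a scheme (... extend with more proxies)
    proxy_pool = ['http://1.1.1.1:8000', 'http://9.9.9.9:8001']
    
    # Pick one proxy at random for this request
    random_proxy = random.choice(proxy_pool)
    
    proxies = {
      'http': random_proxy,
      'https': random_proxy
    }
    
    url = 'https://httpbin.org/ip'  # example target
    response = requests.get(url, proxies=proxies)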
    

    This picks a random proxy from the pool on every request.

    3. On errors, discard problematic proxies and retry with a new one.
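
The discard-and-retry step can be sketched as below. To keep the example independent of the network, the actual HTTP call is passed in as a `fetch` function. Both `fetch_with_rotation` and the `fetch` signature are assumptions made for this illustration, not part of requests.

```python
import random


def fetch_with_rotation(url, proxy_pool, fetch, max_attempts=3):
    """Call fetch(url, proxies) through random proxies, dropping each one that fails."""
    pool = list(proxy_pool)  # copy, so the caller's list is left untouched
    last_error = None
    for _ in range(max_attempts):
        if not pool:
            break
        proxy = random.choice(pool)
        try:
            return fetch(url, {"http": proxy, "https": proxy})
        except Exception as exc:  # in real code, catch requests.exceptions.RequestException
            last_error = exc
            pool.remove(proxy)  # discard the problematic proxy
    raise RuntimeError(f"No working proxy for {url}") from last_error


# Usage with requests (not executed here):
# import requests
# fetch = lambda url, proxies: requests.get(url, proxies=proxies, timeout=5)
# response = fetch_with_rotation("https://httpbin.org/ip", proxy_pool, fetch)
```

Removing a proxy after a single failure is a deliberately simple policy. A longer-running scraper might instead count failures per proxy and only drop one after several errors.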

    Premium Proxies

    Free proxies often suffer from downtime and get blocked frequently.

    Paid premium proxies from providers like Bright Data (formerly Luminati), Oxylabs and Smartproxy are generally more reliable.

    Benefits:

  • High uptime and availability
  • Geo-targeting: Proxies from required city/country
  • Static vs rotating: Fix or rotate IPs as needed
  • Faster speeds with private proxies
  • Packages for different use cases

    While premium proxies require an investment, the benefits may justify the cost depending on your use case.

    SSL Certificate Verification

    By default, requests verifies TLS certificates for HTTPS connections, including those made through a proxy. This can cause errors when a proxy intercepts traffic and presents its own certificate.

    You can disable SSL verification with:

    response = requests.get(url, proxies=proxies, verify=False)
    

    Note that disabling SSL verification compromises security. Only do this if you fully trust the proxy and have no other option.
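
A safer alternative, when the proxy operator provides its CA certificate, is to point `verify` at that file instead of turning verification off. The path below is a placeholder and the proxy address is the example one from this article.

```python
import requests

session = requests.Session()
session.proxies.update({"https": "http://10.10.1.10:3128"})

# Hypothetical path to the CA certificate that signs the proxy's certificates
session.verify = "/path/to/proxy-ca.pem"

# Requests made through the session are then verified against that CA:
# response = session.get("https://httpbin.org/ip")
```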

    Timeout, Retries and Backoff

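    When using free proxies, connection errors and timeouts are common. These settings make your requests more robust:

  • timeout - Max seconds to wait for the server to respond before failing; passed directly to the request method
  • retries - Number of times to retry a failed request; this is not a requests.get argument, it is configured through urllib3's Retry class (its total parameter) and mounted on a session with HTTPAdapter
  • backoff_factor - Base delay for the exponentially growing pause between retries; also set on Retry

    For example, the timeout is set per request:

    requests.get(url, proxies=proxies, timeout=3)
    

    This fails if the server takes longer than 3 seconds to respond. Retries and backoff are configured on a session adapter rather than on the individual call.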

    Tune these parameters based on your use case.
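
A minimal sketch of retries with backoff, using urllib3's `Retry` class mounted on a session through `HTTPAdapter`. The status codes in `status_forcelist` are an assumption about which responses are worth retrying, so adjust them to your case.

```python
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

# Retry up to 5 times, with exponentially growing pauses between attempts
retry = Retry(
    total=5,
    backoff_factor=1,
    status_forcelist=[502, 503, 504],  # also retry on these HTTP status codes
)
adapter = HTTPAdapter(max_retries=retry)

session = requests.Session()
session.mount("http://", adapter)
session.mount("https://", adapter)

# The timeout is still passed per request:
# response = session.get(url, proxies=proxies, timeout=3)
```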

    Troubleshooting

    Despite the best proxy setup, sometimes things fail. Here are solutions for common issues:

    Common Error Messages

  • 407 Proxy Authentication Required - Missing or incorrect proxy credentials
  • 503 Service Unavailable - The proxy or the target server is overloaded or temporarily unavailable
  • Timeout errors - The proxy or target server isn't responding in time

    Tools for Debugging

  • Try the request without proxies to isolate issues
  • Enable debug logging for urllib3 to see connection details; requests has no verbose argument
  • Temporarily set verify=False to rule out certificate problems
  • Check your network - are direct connections failing too?
  • Test proxies individually to identify bad ones
  • Use logging to capture request details on failures

    With some diligence, you can identify and fix most proxy problems.
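
For the logging tip, one way to surface connection details is to raise the log level of the urllib3 logger, which requests uses under the hood. The log format is just an example.

```python
import logging

# Send log records to the console with timestamps
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(name)s %(levelname)s %(message)s",
)

# urllib3 logs new connections and request lines at DEBUG level
logging.getLogger("urllib3").setLevel(logging.DEBUG)
```

With this in place, each request prints which host the connection is opened to, which helps confirm whether traffic is going to the proxy or directly to the target.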

    Common Questions

    What is the default proxy for Python requests?

    By default, requests will check for HTTP_PROXY and HTTPS_PROXY environment variables. If set, requests will use those as the default proxies.

    If those environment variables are not set, requests will make direct connections without a proxy by default.

    How to resolve 407 Proxy Authentication Required error?

    A 407 error means the proxy requires authentication. Make sure to pass valid username/password credentials in the proxy URL like:

    http://username:password@proxy_ip:port
    

    Also check that the authentication method supported by the proxy (e.g. Basic, Digest) matches what you are providing through requests.

    How do you authenticate a request in Python?

    Requests has built-in support for Basic and Digest authentication; NTLM and Kerberos are available through separate plugin packages. For Basic authentication, pass a tuple of (username, password) using the auth parameter:

    requests.get(url, auth=('user', 'pass'))
    

    Note that auth authenticates against the target server. To authenticate with a proxy, include the credentials in the proxy URL itself.

    How to use SOCKS5 proxy with Python requests?

    Install requests with its SOCKS extra (the quotes keep shells like zsh from interpreting the brackets):

    pip install "requests[socks]"
    

    Then configure the proxies as:

    proxies = {
        'http': 'socks5://user:pass@host:port',
        'https': 'socks5://user:pass@host:port'
    }
    

    Why is requests not working in Python?

    Some common issues:

  • Proxy configuration incorrect
  • Authentication failing
  • SSL certificate verification failing (use verify=False if trusted)
  • URL/payload incorrect
  • Timeout too short

    Enable debug logs and check for errors to identify the root cause.

    How to resolve common proxy errors like 502, timeout, SSL, authentication?

  • 502 - Bad gateway - the proxy received an invalid response from the upstream server, or could not reach it
  • Timeout - Slow proxy, set higher timeout value
  • SSL - Set verify=False only if you trust the proxy
  • Authentication - Invalid credentials, wrong auth method

    Is SOCKS5 proxy secure?

    SOCKS5 itself does not encrypt traffic; it only relays it. Security depends on the protocol carried through the proxy, so use HTTPS for the requests you send, or run the SOCKS connection inside an encrypted tunnel such as SSH, rather than sending plain unencrypted traffic.

    Conclusion

    Proxies are a valuable tool when working with HTTP requests and APIs in Python. By routing your traffic through intermediate servers, they can improve privacy and help you avoid IP blocking.

    We learned how to configure HTTP and HTTPS proxies in Python requests using a dictionary or environment variables. We covered proxy authentication, sessions, SSL verification, timeouts and other best practices.

    With this comprehensive guide, you should feel empowered to use proxies effectively in Python requests.

    The key takeaways are:

  • Proxies help prevent IP blocking when making repeated requests
  • Pass a proxies dictionary to any requests method to use a proxy
  • Authenticate via credentials in the proxy URL
  • Rotate proxies randomly to appear like different users
  • Use sessions to persist data across requests
  • Disable SSL verification if you fully trust the proxy
  • Tweak timeouts and retries to make requests robust

    However, managing proxies programmatically can get complex. This is where Proxies API comes in - it handles all the proxy complexity behind a simple API.

    Here is how you can use Proxies API with Python requests:

    import requests
    
    api_key = 'YOUR_API_KEY'
    
    proxies = {
      'http': f'http://api.proxiesapi.com/?key={api_key}'
    }
    
    url = 'http://example.com'
    
    response = requests.get(url, proxies=proxies)
    

    This makes a request to example.com through Proxies API. The service will:

  • Rotate IP addresses automatically
  • Rotate user agents
  • Handle CAPTCHAs
  • Render JavaScript

    And you get back the final HTML.

    So Proxies API simplifies proxy management down to a single API call.

    We offer a free tier of 1,000 API calls to try it out. Get your API key and supercharge your Python requests today!

    Frequently asked questions

    How do I use proxies requests in Python?

    Pass a proxies dictionary containing the proxy URLs to any requests method like requests.get(url, proxies=proxies). See the "Making Requests" section in the article for examples.

    What is the proxy for web scraping in Python?

    You can use either HTTP or SOCKS5 proxies for web scraping in Python requests. Configure the proxy URL in the proxies dict as shown in the article.

    How to test proxies with Python?

    You can test proxies by making a simple request like requests.get('https://httpbin.org/ip', proxies=proxies), which returns the public IP the server sees. If you get the proxy's IP back, the proxy is working. You can write a script to test a list of proxies.
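
A minimal sketch of such a check follows. `is_proxy_working` is a hypothetical helper, and the test URL and timeout are assumptions you may want to change.

```python
import requests


def is_proxy_working(proxy_url, test_url="https://httpbin.org/ip", timeout=5):
    """Return True if a request through the proxy succeeds, False otherwise."""
    proxies = {"http": proxy_url, "https": proxy_url}
    try:
        response = requests.get(test_url, proxies=proxies, timeout=timeout)
        return response.ok
    except requests.exceptions.RequestException:
        # Covers connection failures, proxy errors and timeouts
        return False
```

To filter a list, you could write something like `working = [p for p in candidates if is_proxy_working(p)]`, where `candidates` is your own list of proxy URLs.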

    How to configure proxy for pip on Python?

    Set the environment variables HTTP_PROXY and HTTPS_PROXY to your proxy URLs before running pip, like:

    export HTTP_PROXY="http://10.10.1.10:3128"
    pip install package_name
    

    What is port 8080 proxy?

    Port 8080 is a common port used by proxy servers to listen for incoming connections. Configuring your system or code to use a proxy at say 10.10.1.10:8080 will route traffic through that proxy.

    How to link pip with Python?

    Pip is already bundled and configured to work with Python. Just call pip from your command prompt to install packages. If you installed Python and pip separately, ensure pip is in your PATH.

    How to use PyCharm with proxy?

    Go to Settings > Appearance & Behavior > System Settings > HTTP Proxy and enter your proxy details. This will make PyCharm use the proxy.

    How do you specify proxy in URL?

    Include username and password in the proxy URL like http://username:password@proxy_ip:port.

    How to apply proxy settings to all users?

    On Linux, set the system-wide environment variables HTTP_PROXY and HTTPS_PROXY in /etc/environment. macOS does not read that file, so set them in a system-wide shell profile instead. On Windows, set them as system variables under Advanced System Settings.


