Making Python Faster: An Introduction to Asynchronous HTTP Requests

Feb 1, 2024 · 7 min read

Making web requests is a common task in Python programming. However, traditional requests can block the execution of your code while waiting for a response. This can make your programs feel sluggish or unresponsive.

Enter asynchronous HTTP requests - a way to execute requests in a non-blocking manner. By running requests "in the background", asynchronous requests allow your Python code to continue executing while the request is pending. This leads to faster and more responsive programs.

In this post, we'll learn the basics of asynchronous requests in Python. We'll cover:

  • What asynchronous code is and why it's useful
  • How the asyncio module works
  • Making basic async HTTP requests with aiohttp
  • Using async/await syntax to write async code
  • Handling responses from asynchronous requests
  • Potential gotchas to be aware of

Let's dig in!

    Blocking vs Asynchronous Code

    To understand what asynchronous code is, let's first contrast it with standard "blocking" code.

    When you make a typical function call in Python, the execution of your program "blocks" until that function returns a value. For example:

    import requests

    response = requests.get('https://api.example.com/users')
    print(response.text)

    Here our code stops and waits while requests.get() runs. Only after it finishes does the print line execute. This is "blocking" behavior - execution is halted in one spot while waiting on an operation.

    Asynchronous code works differently. Long-running operations are started, but execution continues without having to wait for them. The operations run concurrently along with the rest of our program.

    For example, making an asynchronous HTTP request would start the request right away, allow other code to keep running, and handle the response whenever it comes back from the server:

    # Pseudocode: start_request() and get_response() are placeholders, not real functions
    start_request() # Starts immediately and returns
    print("I don't have to wait!")
    response = get_response() # Gets response whenever it's ready

    This allows a Python program to avoid unnecessary waiting and do more work while I/O operations are in progress.
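
    The pseudocode above maps fairly directly onto asyncio tasks. Here is a minimal sketch, assuming a hypothetical slow_operation coroutine that stands in for a network request (asyncio.sleep simulates the wait):

```python
import asyncio


async def slow_operation():
    # Hypothetical stand-in for a network request; the sleep simulates waiting on I/O.
    await asyncio.sleep(0.1)
    return "response body"


async def main():
    events = []
    # create_task schedules the coroutine on the running loop and returns right away.
    task = asyncio.create_task(slow_operation())
    events.append("I don't have to wait!")
    # Awaiting the task suspends main() until the result is ready.
    response = await task
    events.append(response)
    return events


events = asyncio.run(main())
print(events)
```

    The message is recorded before the response, even though the operation was started first.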

    Introducing asyncio

    In Python, asynchronous programming is centered around the asyncio module. This module provides ways to run code concurrently and includes abstractions for common async patterns like tasks, events, and streams.

    At the core of asyncio is an event loop. This loop runs your asynchronous code and handles switching between any pending tasks and operations. By leveraging the event loop, we can avoid threaded or multi-process code while still achieving concurrency in Python.

    To use asyncio, you write coroutines: functions defined with async def that can pause at each await. Calling one does not run its body; it returns a coroutine object, which the event loop then drives. The loop interleaves execution of these coroutines, switching to another whenever one is waiting.

    Let's see a simple coroutine in action:

    import asyncio
    
    async def my_coro(text):
        print("Do work before await")
        
        await asyncio.sleep(1)
        
        print(f"Finished await: {text}")
    
    asyncio.run(my_coro("Hello World!"))

    This schedules my_coro on the event loop, prints the first message, waits 1 second, then prints the finish message. The await expression allows other tasks to run during that 1 second pause.
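
    With a single coroutine there is nothing else to run during the pause, so the benefit is easier to see with two. The sketch below is an illustration rather than part of the example above: wait_and_report is a made-up helper, and asyncio.gather runs both calls concurrently.

```python
import asyncio
import time


async def wait_and_report(name, delay):
    # Hypothetical helper: pause for `delay` seconds, then return the name.
    await asyncio.sleep(delay)
    return name


async def main():
    start = time.perf_counter()
    # Both coroutines wait at the same time, so the pauses overlap.
    results = await asyncio.gather(
        wait_and_report("first", 0.5),
        wait_and_report("second", 0.5),
    )
    elapsed = time.perf_counter() - start
    return results, elapsed


results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

    The two half-second waits should finish in roughly half a second total, not a full second, because each await hands control back to the event loop.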

    Pretty cool! Now let's apply this idea to make some asynchronous web requests.

    Making Async HTTP Requests with aiohttp

    The most popular way to perform asynchronous HTTP requests in Python is with the aiohttp library. This gives us an API similar to requests, but with all async operations.

    To make an async GET request with aiohttp:

    import asyncio
    import aiohttp
    
    async def fetch_data(session, url):
        async with session.get(url) as response:
            return await response.text()
            
    async def main():
        async with aiohttp.ClientSession() as session:
            html = await fetch_data(session, "http://python.org")
            print(html[:100])
    
    asyncio.run(main())

    Breaking this down:

  • aiohttp.ClientSession manages a pool of connections, so we create one session and reuse it for our requests
  • async with session.get(url) sends the request and releases the connection when the block exits
  • await response.text() waits for the body and returns it as a string
  • Our main coroutine handles the session and the request call
  • asyncio.run starts the event loop and runs main until it completes

    By using await, we avoid blocking the event loop while the request is in progress. Other tasks scheduled on the loop can keep running until the response comes back from the server.

    Async Functions with async/await Syntax

    The coroutines introduced earlier all rely on two keywords: async and await. It is worth spelling out what each one does.

    Functions defined with async def are coroutine functions. They look like regular functions, but calling one does not run its body; it returns a coroutine object.

    We await that coroutine object from inside another coroutine to run it and get its actual return value once ready:

    import asyncio
    import aiohttp
    
    async def fetch_page(url):
        # async with closes the session and releases the response, even on errors
        async with aiohttp.ClientSession() as session:
            async with session.get(url) as response:
                return await response.text()
    
    async def main():
        # Inside a coroutine we can await fetch_page directly
        html = await fetch_page("http://python.org")
        print(html[:100])
    
    # asyncio.run creates the event loop, runs main, and closes the loop
    asyncio.run(main())

    Because main is itself a coroutine, it can await fetch_page directly. Only the top-level entry point needs asyncio.run, so there is no need to fetch the event loop and drive it by hand. Much simpler!

    Async functions allow us to use the full capabilities of asyncio while writing code in a synchronous-looking style. This makes async logic far easier to reason about.
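
    The difference between calling a coroutine function and awaiting its result is easy to check for yourself. A small sketch, using a made-up add coroutine:

```python
import asyncio


async def add(a, b):
    # Hypothetical coroutine: yield to the event loop once, then return the sum.
    await asyncio.sleep(0)
    return a + b


async def main():
    coro = add(2, 3)                      # the body has not run yet
    is_coro = asyncio.iscoroutine(coro)   # we only hold a coroutine object
    result = await coro                   # awaiting runs the body and yields its value
    return is_coro, result


is_coro, result = asyncio.run(main())
print(is_coro, result)
```

    Forgetting the await is a common slip: you end up holding a coroutine object instead of the value, and Python warns that the coroutine was never awaited.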

    Handling Responses from Async Requests

    Once an asynchronous request completes, how do we work with the response and potential errors?

    Awaiting the request, or entering it with async with, gives us the response object itself. We can check the status, headers, and other metadata on it:

    import aiohttp

    async def get_user(session, user_id):
        url = f'https://api.example.com/users/{user_id}'
        
        try:
            # async with releases the connection when the block exits
            async with session.get(url) as response:
                if response.status == 200:
                    data = await response.json()
                    return data['name']
                
                print(f"Error with status {response.status}")
                return None
        except aiohttp.ClientConnectorError:
            print("Connection problem")
            return None

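    Like synchronous code, use try/except blocks to handle connection errors and timeouts. Note that aiohttp does not raise an exception for 4xx or 5xx statuses by default: check response.status as above, or call response.raise_for_status() to turn them into exceptions. The status and headers are available as soon as you have the response object; the body is read only when you await it.

    For APIs that return JSON, await response.json() to get the parsed object. To read the raw body as a string, await response.text(). Both are coroutine methods, not plain attributes.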
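
    Timeouts can be handled with the same try/except pattern. The sketch below uses only the standard library: slow_request is a hypothetical stand-in for a real HTTP call, and asyncio.wait_for cancels it if it takes longer than the given limit.

```python
import asyncio


async def slow_request():
    # Hypothetical stand-in for an HTTP call that takes 0.2 seconds.
    await asyncio.sleep(0.2)
    return "payload"


async def fetch_with_timeout(timeout):
    try:
        # wait_for cancels slow_request() if it exceeds `timeout` seconds.
        return await asyncio.wait_for(slow_request(), timeout=timeout)
    except asyncio.TimeoutError:
        # Return None on timeout, mirroring the error handling in get_user.
        return None


too_short = asyncio.run(fetch_with_timeout(0.05))
long_enough = asyncio.run(fetch_with_timeout(1))
print(too_short, long_enough)
```

    With aiohttp you would more likely configure a timeout on the session or request itself, but the exception handling follows the same shape.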

    When Asyncio Gets Tricky

    Asynchronous programming opens up performance gains through concurrency, but it also creates new potential issues:

  • Race conditions - asyncio tasks share one thread, so they interleave rather than run in parallel, but a task can be suspended at any await. If shared state is read before an await and written after it, another task may have changed it in between.
  • Difficult debugging - a downside of concurrency is that tracebacks become non-linear, making bugs hard to reason about.
  • Compatibility - event loops are not thread-safe, and a loop belongs to the thread and process that created it. Each thread or process (for example, one launched via multiprocessing) needs its own loop, and calls from other threads must go through helpers such as asyncio.run_coroutine_threadsafe. Special care must be taken if integrating async code with other systems.
  • Blocking calls - libraries doing I/O in a synchronous way (requests.get or time.sleep, for instance) defeat the purpose of asyncio because they block the event loop and stall every other task. Watch out for blocking calls.

    While asyncio is powerful, these complexities make using it correctly non-trivial. Thorough testing is a must to avoid subtle concurrency issues. Consider if threads/processes might work better for your use case.
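
    For the blocking-call problem, one common workaround is to push the synchronous call onto a worker thread. A minimal sketch, assuming Python 3.9+ for asyncio.to_thread; blocking_io and heartbeat are made-up names, and time.sleep stands in for any blocking library call:

```python
import asyncio
import time


def blocking_io(events):
    # Hypothetical synchronous call; time.sleep would freeze the event loop
    # if it ran directly inside a coroutine.
    time.sleep(0.5)
    events.append("blocking call done")
    return "result"


async def heartbeat(events):
    # Keeps ticking only if the event loop stays responsive.
    for i in range(3):
        await asyncio.sleep(0.05)
        events.append(f"tick {i}")


async def main():
    events = []
    # to_thread runs blocking_io in a worker thread, leaving the loop free.
    result, _ = await asyncio.gather(
        asyncio.to_thread(blocking_io, events),
        heartbeat(events),
    )
    return result, events


result, events = asyncio.run(main())
print(events)
```

    The ticks are recorded while the blocking call is still running. Calling blocking_io(events) directly inside a coroutine would instead hold up the heartbeat until it returned.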

    Next Steps

    We've covered the basics, but there is much more to discover with asyncio:

  • Making multiple requests in parallel with asyncio.gather
  • Websockets for real-time communication
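  • Async database libraries such as Tortoise ORM or SQLAlchemy's asyncio extension to avoid blocking database I/O (httpx, by contrast, is an alternative async-capable HTTP client)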
  • asyncio queues and locks for coordinating coroutine execution
  • asyncio streams for data streaming applications

    Asyncio is a versatile framework for I/O-bound asynchronous tasks. Whether for highly concurrent network apps or for interfacing with async libraries, give asyncio a look for making Python faster. It does not speed up CPU-bound processing on its own; that kind of work is better handed off to processes or threads.

    The event loop handles all the difficult concurrency details - our code can focus on business logic while remaining performant, responsive and clear. This lets Python support demanding applications typically requiring lower-level languages.
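
    As a starting point for the first item in the list above, here is a rough sketch of running several requests concurrently with asyncio.gather. It makes a few assumptions: fake_fetch is a hypothetical stand-in for a real aiohttp call, the URLs are placeholders, and the semaphore limit of 2 is arbitrary.

```python
import asyncio


async def fake_fetch(url, limiter):
    # Hypothetical stand-in for an HTTP request; the semaphore caps how many
    # of these run at the same time.
    async with limiter:
        await asyncio.sleep(0.05)
        if "bad" in url:
            raise ValueError(f"could not fetch {url}")
        return f"body of {url}"


async def fetch_all(urls, max_concurrent=2):
    limiter = asyncio.Semaphore(max_concurrent)
    # return_exceptions=True puts failures in the result list instead of
    # raising the first one; results keep the same order as `urls`.
    return await asyncio.gather(
        *(fake_fetch(url, limiter) for url in urls),
        return_exceptions=True,
    )


urls = [
    "https://example.com/a",
    "https://example.com/bad",
    "https://example.com/c",
]
results = asyncio.run(fetch_all(urls))
for url, outcome in zip(urls, results):
    print(url, "->", outcome)
```

    Swapping fake_fetch for a coroutine that uses a shared aiohttp.ClientSession would turn this into a real concurrent fetcher, with one failed URL no longer taking down the rest.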
