Why is Python Multithreading Slow and How to Speed It Up

Mar 17, 2024 · 4 min read

Multithreading in Python often seems slower compared to other languages like Java and C++. This is due to something called the Global Interpreter Lock (GIL). In this article, we'll understand what the GIL is, why it makes Python multithreading slow, and explore some workarounds to speed up parallel processing in Python.

What is the Global Interpreter Lock (GIL)

The CPython implementation of Python uses a construct called the Global Interpreter Lock or GIL. This lock prevents multiple threads from running Python bytecode at once. In effect, only one thread executes Python code at any given moment, even in a multithreaded program.

At any point, only one thread holds the GIL and runs Python bytecodes. Other threads cannot start execution until the current thread releases the GIL. This serialized execution is why Python multithreading often seems slow.
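The handover between threads can be observed from Python itself. The snippet below is a minimal sketch that reads the interpreter's switch interval, which is how long a thread may run before CPython asks it to release the GIL so another thread can take a turn.

```python
import sys

# Interval (in seconds) after which CPython asks the running thread
# to give up the GIL so that a waiting thread can be scheduled.
interval = sys.getswitchinterval()
print(f"GIL switch interval: {interval * 1000:.1f} ms")
```

The value can be changed with sys.setswitchinterval(), but tuning it only changes how often threads take turns. It does not let them run Python bytecode at the same time.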

The GIL was introduced in CPython because its memory management, which relies on reference counting, is not thread-safe, and because many third-party C libraries it binds to are not thread-safe either. The lock avoids race conditions inside the interpreter, where multiple threads would otherwise modify the same internal state without synchronization.

Why the GIL Makes Multithreading Slow

The GIL causes two major performance issues for Python multithreading:

  1. Only one thread executes at a time: As we saw earlier, only one Python thread can execute bytecodes at once. Others have to wait until the current thread yields execution. This serialized execution limits parallelism and hurts performance.
  2. Cannot utilize multiple CPU cores: The GIL prevents multiple Python threads from running bytecodes in parallel. So even with multiple cores, only one core executes the Python process at a time. The GIL underutilizes available computing resources.

As a result, CPU-bound Python programs don't see much speedup from multithreading. The threads end up taking turns on one CPU core instead of utilizing all cores.
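A rough way to see this is to time the same CPU-bound function run four times in a row and then in four threads. This is a sketch only: count_down and the workload size N are hypothetical, and the timings depend on the machine and Python build, so no expected numbers are shown.

```python
import threading
import time

def count_down(n):
    """Pure-Python CPU-bound loop; the thread needs the GIL to run it."""
    while n > 0:
        n -= 1

def run_sequential(n, repeats):
    """Run the loop `repeats` times, one after another, and return the elapsed time."""
    start = time.perf_counter()
    for _ in range(repeats):
        count_down(n)
    return time.perf_counter() - start

def run_threaded(n, repeats):
    """Run the loop in `repeats` threads at once and return the elapsed time."""
    threads = [threading.Thread(target=count_down, args=(n,)) for _ in range(repeats)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start

N = 1_000_000  # hypothetical workload size
sequential_time = run_sequential(N, 4)
threaded_time = run_threaded(N, 4)
print(f"sequential: {sequential_time:.2f}s, threaded: {threaded_time:.2f}s")
```

On a standard CPython build the two timings are typically close to each other, because the threads take turns holding the GIL instead of running side by side.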

However, the GIL doesn't affect I/O-bound tasks as much. Threads release the GIL while they wait on blocking I/O operations like file, network or database access. So while one thread waits, another can run, and threads that spend most of their time on I/O can overlap their waiting.

Workarounds to Speed up Python Multithreading

There are a few ways to work around the GIL and speed up parallel processing in Python:

1. Multi-processing

We can use multiple Python processes instead of threads. Each process has its own interpreter and its own GIL, so multiple processes can utilize multiple cores.

The multiprocessing module makes it easy to spin up new processes. Communication and data sharing between processes involves serialization and IPC, which adds overhead, but the computing itself happens in parallel.

import multiprocessing

def worker(x):
    # Placeholder for CPU-bound work
    return x * 2

if __name__ == "__main__":
    # The context manager shuts the pool down once the block exits
    with multiprocessing.Pool(processes=4) as pool:
        inputs = [1, 2, 3, 4]
        outputs = pool.map(worker, inputs)
    print(outputs)  # [2, 4, 6, 8], computed across the worker processes

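So for CPU-bound work, multiprocessing is usually faster than multithreading, as long as each task does enough computation to outweigh the cost of starting processes and passing data between them.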
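The serialization cost mentioned above can be estimated without starting any processes. multiprocessing uses pickle to move arguments and results between processes, so the sketch below simply times a pickle round trip on a hypothetical payload.

```python
import pickle
import time

payload = list(range(1_000_000))  # hypothetical argument sent to a worker

start = time.perf_counter()
blob = pickle.dumps(payload)    # done by the sender before data crosses the process boundary
restored = pickle.loads(blob)   # done by the receiving process
elapsed = time.perf_counter() - start

print(f"{len(blob) / 1e6:.1f} MB pickled and restored in {elapsed * 1000:.0f} ms")
```

If this round trip takes about as long as the work itself, sending fewer and larger tasks, or passing file paths instead of the data, tends to pay off.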

2. Multi-threading for I/O-bound tasks

For I/O-bound tasks, we can still use threads for concurrency. As mentioned earlier, threads release the GIL while waiting for I/O. So for network, file and database access, multithreading works well.

Web scraping is an example of I/O-bound work. Threads can fetch multiple webpages concurrently because each one releases the GIL while it waits for a site to respond.
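Here is a minimal sketch of that pattern using a thread pool. To keep it self-contained, fake_fetch is a hypothetical stand-in that sleeps instead of making a real request (time.sleep releases the GIL in the same way blocking I/O does), and the URLs are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_fetch(url):
    """Stand-in for a network request: sleeps 0.2 s, releasing the GIL while it waits."""
    time.sleep(0.2)
    return f"content of {url}"

urls = [f"https://example.com/page/{i}" for i in range(8)]  # placeholder URLs

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=8) as executor:
    # map() returns results in the same order as the input URLs
    pages = list(executor.map(fake_fetch, urls))
elapsed = time.perf_counter() - start

print(f"fetched {len(pages)} pages in {elapsed:.2f}s")
```

Run one after another, the eight calls would need about 1.6 seconds in total. With eight threads the waits overlap, so the whole batch finishes in roughly the time of a single call.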

3. Multi-threading in external C/C++ libraries

We can implement CPU-intensive sections in external C/C++ libraries. Such libraries can release the GIL while they run code that does not touch Python objects.

So they can run computations in parallel across multiple threads. Python code calls these libraries for parallelism.

For instance, the Numerical Python (NumPy) library implements its array operations in optimized C code, and many of them release the GIL while they run. Linear algebra routines such as matrix multiplication are handed to a BLAS library, which may itself be multi-threaded depending on how NumPy was built.

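import numpy as np

# Large enough that the compiled code dominates the running time
a = np.random.rand(1000, 1000)
b = np.random.rand(1000, 1000)

# Matrix multiplication runs in compiled code; the underlying BLAS
# library may spread the work over several threads
c = np.dot(a, b)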
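The same effect can be seen with the standard library alone. According to its documentation, hashlib releases the GIL when hashing data larger than 2047 bytes, so several threads can hash large buffers at the same time. The sketch below assumes four hypothetical 5 MB buffers and only checks that the threaded results match the sequential ones.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor

def sha256_hex(data):
    """Hash a buffer in C code; hashlib releases the GIL for large inputs."""
    return hashlib.sha256(data).hexdigest()

# Four hypothetical 5 MB buffers, each filled with a different byte value
chunks = [bytes([i]) * 5_000_000 for i in range(4)]

sequential = [sha256_hex(chunk) for chunk in chunks]

with ThreadPoolExecutor(max_workers=4) as executor:
    threaded = list(executor.map(sha256_hex, chunks))

print(threaded == sequential)
```

Timing the two variants on a multi-core machine is a simple way to check how much a given C-backed call benefits from threads.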

4. Newer Python Versions

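Newer Python versions are improving multithreading capabilities. The Py_BEGIN_ALLOW_THREADS and Py_END_ALLOW_THREADS macros, which let C/C++ extensions explicitly release the GIL around blocking or long-running code, are a long-standing part of the CPython C API and not a recent addition. The bigger change is PEP 703, which makes the GIL optional: starting with Python 3.13, CPython offers an experimental free-threaded build in which threads can run Python bytecode in parallel.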

There are also alternative Python implementations like Jython and IronPython that don't use a GIL. Their threads can run in parallel on multiple cores, although both implementations lag behind CPython in language version and library support.
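To find out what a given interpreter does, the sketch below checks for CPython's free-threaded build, an experimental option available from Python 3.13. It relies on sys._is_gil_enabled(), which only exists on CPython 3.13 and later. On older CPython versions the helper assumes the GIL is on, and it is not meaningful for other implementations. describe_gil is a hypothetical helper name.

```python
import sys
import sysconfig

def describe_gil():
    """Return (free_threaded_build, gil_enabled) for the running CPython interpreter."""
    # Py_GIL_DISABLED is set in the build configuration of free-threaded builds
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() was added in CPython 3.13
    check = getattr(sys, "_is_gil_enabled", None)
    # Assumption: a CPython without this function always runs with the GIL
    gil_enabled = check() if check is not None else True
    return free_threaded_build, gil_enabled

free_threaded_build, gil_enabled = describe_gil()
print(f"free-threaded build: {free_threaded_build}, GIL enabled: {gil_enabled}")
```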

Key Takeaways

  • The Global Interpreter Lock (GIL) in CPython limits parallel execution, making Python multithreading seem slow.
  • For CPU-bound work, multiprocessing works better than multithreading.
  • For I/O-bound tasks, multithreading works well.
  • External C/C++ libraries can release the GIL for better parallelism.
  • Newer Python versions are improving multithreading capabilities.

So while the GIL affects some use cases, there are workarounds to enable parallel processing where needed. Python continues to be a versatile language for building modern applications.
