Python's multithreading capabilities often surprise newcomers: threads run CPU-bound code no faster than a single thread, and true parallelism is limited. This stems from Python's design, but there are workarounds.
## The GIL: Serializing Concurrency
Python's reference implementation, CPython, uses a Global Interpreter Lock (GIL) that allows only one thread to execute Python bytecode at a time. This effectively serializes execution and limits multicore scalability.
The GIL exists because CPython's memory management (reference counting in particular) is not thread-safe. Threads still run concurrently in the sense that the interpreter switches between them, but only one executes Python bytecode at any instant; the others block while waiting for the GIL. C extension code can release the GIL during long-running operations such as IO or heavy numeric work.
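CPython exposes the GIL's handoff cadence through `sys.getswitchinterval`; a minimal sketch (the 5 ms default is CPython's, and tuning it rarely speeds up CPU-bound code):

```python
import sys

# CPython asks the running thread to release the GIL every "switch
# interval" so waiting threads get a chance to run (default: 5 ms).
interval = sys.getswitchinterval()
print(interval)

# The interval is tunable, though shrinking it mostly adds handoff
# overhead for CPU-bound threads rather than buying parallelism:
sys.setswitchinterval(0.001)
sys.setswitchinterval(interval)  # restore the original value
```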
```python
import threading

def countdown(n):
    while n > 0:
        n -= 1

t1 = threading.Thread(target=countdown, args=(1000000,))
t2 = threading.Thread(target=countdown, args=(1000000,))
t1.start()
t2.start()
t1.join()
t2.join()
```

This takes roughly as long as running both countdowns on a single thread, despite having two threads, because the GIL lets only one of them make progress at a time!
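To measure this on your own machine, a small timing harness (numbers vary by hardware) might look like:

```python
import threading
import time

def countdown(n):
    while n > 0:
        n -= 1

N = 1_000_000

# One thread doing all the work:
start = time.perf_counter()
countdown(2 * N)
single = time.perf_counter() - start

# Two threads splitting the same work:
start = time.perf_counter()
t1 = threading.Thread(target=countdown, args=(N,))
t2 = threading.Thread(target=countdown, args=(N,))
t1.start()
t2.start()
t1.join()
t2.join()
threaded = time.perf_counter() - start

print(f"single thread: {single:.2f}s, two threads: {threaded:.2f}s")
# Expect similar times; the threaded run is often slightly slower
# because of GIL handoff overhead.
```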
## Solutions
For CPU-bound processing, multiprocessing can sidestep the GIL by running work in separate Python processes, each with its own interpreter and its own GIL:
```python
import multiprocessing

def countdown(n):
    while n > 0:
        n -= 1

if __name__ == "__main__":
    p1 = multiprocessing.Process(target=countdown, args=(1000000,))
    p2 = multiprocessing.Process(target=countdown, args=(1000000,))

    p1.start()
    p2.start()
    p1.join()
    p2.join()
```

Now the countdowns run in parallel across processes, finishing in roughly half the time of the threaded version (process startup and inter-process communication add some overhead).
For IO-bound tasks like web serving, asynchronous frameworks such as asyncio take a different approach: everything runs on a single thread, and coroutines voluntarily suspend at `await` points instead of blocking, so one thread can interleave many waiting tasks:
```python
import asyncio

async def countdown(n):
    while n > 0:
        n -= 1
        # Suspend this coroutine and yield control to the event loop,
        # which can run other coroutines before resuming here.
        await asyncio.sleep(0)
```

So while Python's multithreading can seem slow, solutions exist - you just need the right tool for the job!
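As a closing illustration of the IO-bound case, here is a small sketch where three simulated network waits overlap on a single thread (`asyncio.sleep` stands in for real IO):

```python
import asyncio
import time

async def fetch(delay):
    # asyncio.sleep here stands in for a network call; while this
    # coroutine waits, the event loop runs the other coroutines.
    await asyncio.sleep(delay)
    return delay

async def main():
    start = time.perf_counter()
    # Three 0.1s "requests" run concurrently on one thread.
    results = await asyncio.gather(fetch(0.1), fetch(0.1), fetch(0.1))
    elapsed = time.perf_counter() - start
    print(f"{results} in {elapsed:.2f}s")  # roughly 0.1s, not 0.3s
    return elapsed

elapsed = asyncio.run(main())
```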
