FastAPI Request Processing: Sequential vs Parallel Execution

Problem Statement

When testing a FastAPI endpoint that includes a blocking operation like time.sleep(5), you might notice that requests appear to process sequentially rather than concurrently. This behavior can be confusing, especially when you expect FastAPI's async capabilities to handle multiple requests simultaneously.

Here's an example that demonstrates the issue:

python
from fastapi import FastAPI, Request
import time

app = FastAPI()

@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}

When calling this endpoint from multiple browser tabs simultaneously, the output shows:

Hello
bye
Hello
bye

Instead of the expected concurrent behavior:

Hello
Hello
bye
bye

Understanding FastAPI's Execution Model

FastAPI uses an asynchronous event loop to handle requests, but the behavior depends on how you define your endpoints and what operations you perform inside them.

async def vs def Endpoints

FastAPI handles endpoints differently based on their definition:

  • async def endpoints: Run directly on the event loop
  • def endpoints: Run in a worker thread taken from an external threadpool, and the result is then awaited
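
The hand-off to a worker thread can be illustrated outside FastAPI with the standard library's asyncio.to_thread() — a minimal sketch of the idea, not FastAPI's actual internals:

```python
import asyncio
import threading

def sync_work() -> str:
    # Simulates a `def` endpoint: runs in a worker thread
    return threading.current_thread().name

async def async_work() -> str:
    # Simulates an `async def` endpoint: runs on the event loop's thread
    return threading.current_thread().name

async def main() -> None:
    loop_thread = await async_work()
    worker_thread = await asyncio.to_thread(sync_work)
    print("async def runs on:", loop_thread)
    print("def runs on:", worker_thread)

asyncio.run(main())
```

Printing the thread names shows the two calls execute on different threads, which is why a blocking `def` endpoint does not stall the loop.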

INFO

FastAPI uses Starlette's run_in_threadpool() function (which internally uses anyio.to_thread.run_sync()) to execute synchronous functions in a threadpool with a default of 40 worker threads.

The Blocking Operation Problem

The issue in the example code is that time.sleep(5) is a blocking operation that doesn't yield control back to the event loop. When used in an async def endpoint, it blocks the entire event loop, preventing other requests from being processed concurrently.
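
The effect is easy to reproduce with plain asyncio (a stdlib sketch, independent of FastAPI): two tasks that call time.sleep() run back to back, while two that await asyncio.sleep() overlap.

```python
import asyncio
import time

async def blocking(n: float) -> None:
    time.sleep(n)           # blocks the whole event loop

async def yielding(n: float) -> None:
    await asyncio.sleep(n)  # suspends this task, lets others run

async def main() -> None:
    t0 = time.perf_counter()
    await asyncio.gather(blocking(0.5), blocking(0.5))
    print(f"blocking: {time.perf_counter() - t0:.2f}s")  # ~1.0s, serialized

    t0 = time.perf_counter()
    await asyncio.gather(yielding(0.5), yielding(0.5))
    print(f"yielding: {time.perf_counter() - t0:.2f}s")  # ~0.5s, concurrent

asyncio.run(main())
```

The blocking pair takes roughly the sum of the sleeps; the yielding pair takes roughly the longest single sleep.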

Solutions

Solution 1: Use Synchronous Endpoint Definition

For endpoints with blocking operations, use a normal def instead of async def:

python
@app.get("/ping")
def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}

This allows FastAPI to run each request in a separate thread from its threadpool, enabling concurrent processing.
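
The threadpool effect can be approximated without FastAPI using concurrent.futures directly: two blocking "handlers" overlap when each runs in its own worker thread (handler and the timings here are illustrative stand-ins, not FastAPI internals):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def handler() -> str:
    time.sleep(0.5)  # stand-in for a blocking endpoint body
    return "pong"

t0 = time.perf_counter()
with ThreadPoolExecutor(max_workers=2) as pool:
    # Run two "requests" at once, one per worker thread
    results = list(pool.map(lambda _: handler(), range(2)))
elapsed = time.perf_counter() - t0
print(results, f"{elapsed:.2f}s")  # both finish in ~0.5s, not ~1.0s
```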

Solution 2: Use Asynchronous Alternatives

Replace blocking operations with their async equivalents. For sleeping, use asyncio.sleep() instead of time.sleep():

python
import asyncio
from fastapi import FastAPI, Request

app = FastAPI()

@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    await asyncio.sleep(5)  # Non-blocking sleep
    print("bye")
    return {"ping": "pong!"}

Solution 3: Run Blocking Operations in Threadpool

For CPU-bound or blocking operations in async endpoints, explicitly run them in a threadpool:

python
from fastapi import FastAPI, Request
from fastapi.concurrency import run_in_threadpool
import time

app = FastAPI()

def blocking_operation():
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}

@app.get("/ping")
async def ping(request: Request):
    result = await run_in_threadpool(blocking_operation)
    return result

Alternatively, use asyncio.to_thread() (Python 3.9+):

python
import asyncio
import time
from fastapi import FastAPI, Request

app = FastAPI()

def blocking_operation():
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}

@app.get("/ping")
async def ping(request: Request):
    result = await asyncio.to_thread(blocking_operation)
    return result

Solution 4: Use a ProcessPoolExecutor for CPU-bound Tasks

For CPU-intensive tasks, use a process pool to avoid GIL limitations:

python
import asyncio
from concurrent.futures import ProcessPoolExecutor
from fastapi import FastAPI

app = FastAPI()

def cpu_intensive_task():
    # Your CPU-bound computation here
    result = 42  # placeholder value
    return result

@app.get("/compute")
async def compute():
    loop = asyncio.get_running_loop()
    # Note: creating a pool per request is simple but pays process
    # start-up cost every time; a shared pool is cheaper
    with ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(pool, cpu_intensive_task)
    return {"result": result}
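
Because spawning worker processes is expensive, a single pool created once and reused is usually preferable. Here is a minimal, FastAPI-free sketch of that pattern (the fib function is just a stand-in for real CPU-bound work):

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def fib(n: int) -> int:
    # Deliberately slow recursive Fibonacci as a stand-in workload
    return n if n < 2 else fib(n - 1) + fib(n - 2)

# One pool, created once and shared by all tasks
pool = ProcessPoolExecutor(max_workers=2)

async def compute_all(ns: list) -> list:
    loop = asyncio.get_running_loop()
    # Submit every job to the shared pool and await them together
    return await asyncio.gather(*(loop.run_in_executor(pool, fib, n) for n in ns))

if __name__ == "__main__":
    print(asyncio.run(compute_all([10, 12, 14])))  # [55, 144, 377]
    # call pool.shutdown() when the application stops
```

In a FastAPI app the same idea applies: create the pool at startup, reuse it in endpoints, and shut it down on application shutdown.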

Browser Behavior Considerations

WARNING

Web browsers often serialize identical requests to the same URL: while one tab's GET request is in flight, another tab's request to that URL may be held back (for example, until the browser knows whether the first response is cacheable). This client-side behavior can make it look as though the server is not handling requests concurrently even when it is.

To properly test concurrent behavior, use:

  1. Browser tabs in incognito mode
  2. Different browsers
  3. Programmatic testing with httpx:
python
import asyncio

import httpx

async def test_concurrency():
    async with httpx.AsyncClient() as client:
        tasks = [client.get("http://localhost:8000/ping") for _ in range(3)]
        responses = await asyncio.gather(*tasks)
        for response in responses:
            print(response.json())

asyncio.run(test_concurrency())

Best Practices

  1. Use async def for endpoints that perform I/O operations with async libraries
  2. Use def for endpoints with blocking operations or CPU-bound tasks
  3. Never use blocking operations in async def endpoints without proper threading
  4. Consider raising the threadpool limit if you have many synchronous endpoints. Note that FastAPI runs def endpoints through anyio's threadpool, whose size is governed by anyio's capacity limiter, not by the event loop's default executor:
python
from anyio import to_thread
from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
async def startup_event():
    # Raise anyio's thread limiter (defaults to 40 tokens);
    # this governs `def` endpoints and run_in_threadpool()
    to_thread.current_default_thread_limiter().total_tokens = 100
  5. Use multiple workers for true parallelism across CPU cores:
bash
uvicorn main:app --workers 4

Conclusion

FastAPI provides excellent support for both synchronous and asynchronous request handling. The key to achieving proper concurrency is understanding the difference between async def and def endpoints, and ensuring that blocking operations don't interfere with the event loop. By following the patterns outlined above, you can ensure your FastAPI application handles requests efficiently and concurrently.