FastAPI Request Processing: Sequential vs Parallel Execution
Problem Statement
When testing a FastAPI endpoint that includes a blocking operation like time.sleep(5), you might notice that requests appear to process sequentially rather than concurrently. This behavior can be confusing, especially when you expect FastAPI's async capabilities to handle multiple requests simultaneously.
Here's an example that demonstrates the issue:
```python
from fastapi import FastAPI, Request
import time

app = FastAPI()

@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}
```

When calling this endpoint from multiple browser tabs simultaneously, the output shows:
```
Hello
bye
Hello
bye
```

Instead of the expected concurrent behavior:
```
Hello
Hello
bye
bye
```

Understanding FastAPI's Execution Model
FastAPI uses an asynchronous event loop to handle requests, but the behavior depends on how you define your endpoints and what operations you perform inside them.
async def vs def Endpoints
FastAPI handles endpoints differently based on their definition:
- `async def` endpoints run directly in the event loop.
- `def` endpoints run in a separate thread taken from an external threadpool, which is then awaited.
INFO
FastAPI uses Starlette's run_in_threadpool() function (which internally uses anyio.to_thread.run_sync()) to execute synchronous functions in a threadpool with a default of 40 worker threads.
The Blocking Operation Problem
The issue in the example code is that time.sleep(5) is a blocking operation that doesn't yield control back to the event loop. When used in an async def endpoint, it blocks the entire event loop, preventing other requests from being processed concurrently.
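The difference is easy to demonstrate outside FastAPI with plain asyncio; the sleep durations below are arbitrary:

```python
import asyncio
import time

async def blocking() -> None:
    time.sleep(0.2)           # holds the event loop; nothing else can run

async def non_blocking() -> None:
    await asyncio.sleep(0.2)  # yields control back to the event loop

async def timed(coro_factory) -> float:
    # Run three copies "concurrently" and measure the total wall time
    start = time.perf_counter()
    await asyncio.gather(*(coro_factory() for _ in range(3)))
    return time.perf_counter() - start

blocked = asyncio.run(timed(blocking))         # ~0.6 s: sleeps run back to back
overlapped = asyncio.run(timed(non_blocking))  # ~0.2 s: sleeps overlap
print(f"{blocked:.2f}s vs {overlapped:.2f}s")
```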
Solutions
Solution 1: Use Synchronous Endpoint Definition
For endpoints with blocking operations, use a normal def instead of async def:
```python
@app.get("/ping")
def ping(request: Request):
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}
```

This allows FastAPI to run each request in a separate thread from its threadpool, enabling concurrent processing.
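This isn't FastAPI's actual dispatch code, but the effect is the same as fanning blocking calls out to a stdlib `ThreadPoolExecutor` (the worker count here mirrors AnyIO's default of 40):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def ping() -> dict:
    time.sleep(0.2)          # blocking work, as in the endpoint
    return {"ping": "pong!"}

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=40) as pool:
    # Three "requests" submitted at once, each handled by its own thread
    futures = [pool.submit(ping) for _ in range(3)]
    results = [f.result() for f in futures]
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")  # the sleeps overlap: ~0.2 s total, not 0.6 s
```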
Solution 2: Use Asynchronous Alternatives
Replace blocking operations with their async equivalents. For sleeping, use asyncio.sleep() instead of time.sleep():
```python
import asyncio
from fastapi import FastAPI, Request

app = FastAPI()

@app.get("/ping")
async def ping(request: Request):
    print("Hello")
    await asyncio.sleep(5)  # Non-blocking sleep
    print("bye")
    return {"ping": "pong!"}
```

Solution 3: Run Blocking Operations in Threadpool
For CPU-bound or blocking operations in async endpoints, explicitly run them in a threadpool:
```python
from fastapi import FastAPI, Request
from fastapi.concurrency import run_in_threadpool
import time

app = FastAPI()

def blocking_operation():
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}

@app.get("/ping")
async def ping(request: Request):
    result = await run_in_threadpool(blocking_operation)
    return result
```

Alternatively, use asyncio.to_thread() (Python 3.9+):
```python
import asyncio
import time
from fastapi import FastAPI, Request

app = FastAPI()

def blocking_operation():
    print("Hello")
    time.sleep(5)
    print("bye")
    return {"ping": "pong!"}

@app.get("/ping")
async def ping(request: Request):
    result = await asyncio.to_thread(blocking_operation)
    return result
```

Solution 4: For CPU-bound Tasks, Use a ProcessPoolExecutor
For CPU-intensive tasks, use a process pool to avoid GIL limitations:
```python
import asyncio
from concurrent.futures import ProcessPoolExecutor
from fastapi import FastAPI

app = FastAPI()

def cpu_intensive_task():
    # Your CPU-bound computation here, e.g.:
    return sum(i * i for i in range(10_000_000))

@app.get("/compute")
async def compute():
    loop = asyncio.get_running_loop()
    with ProcessPoolExecutor() as pool:
        result = await loop.run_in_executor(pool, cpu_intensive_task)
    return {"result": result}
```

Note that creating a new `ProcessPoolExecutor` per request is expensive; in production, create the pool once at startup and reuse it.

Browser Behavior Considerations
WARNING
Web browsers often serialize requests to the same URL from the same session (for example, while waiting to see if a response is cacheable). This client-side behavior can make it appear that your server isn't handling requests concurrently even when it is.
To properly test concurrent behavior, use:
- Browser tabs in incognito mode
- Different browsers
- Programmatic testing with `httpx`:
```python
import httpx
import asyncio

async def test_concurrency():
    # Raise the timeout: httpx defaults to 5 s, which a 5-second
    # endpoint would trip.
    async with httpx.AsyncClient(timeout=10) as client:
        tasks = [client.get("http://localhost:8000/ping") for _ in range(3)]
        responses = await asyncio.gather(*tasks)
        for response in responses:
            print(response.json())

asyncio.run(test_concurrency())
```

If the endpoint handles requests concurrently, three 5-second calls complete in roughly 5 seconds total rather than 15.

Best Practices
- Use `async def` for endpoints that perform I/O operations with async libraries
- Use `def` for endpoints with blocking operations or CPU-bound tasks
- Never use blocking operations in `async def` endpoints without proper threading
- Consider increasing the threadpool size if you have many synchronous endpoints:
```python
import anyio
from fastapi import FastAPI

app = FastAPI()

@app.on_event("startup")
async def startup_event():
    # `def` endpoints run in AnyIO's threadpool, so raise AnyIO's thread
    # limit (default: 40). Setting asyncio's default executor would have
    # no effect here, since Starlette does not use it for endpoints.
    limiter = anyio.to_thread.current_default_thread_limiter()
    limiter.total_tokens = 100
```

- Use multiple workers for true parallelism across CPU cores:
```bash
uvicorn main:app --workers 4
```

Conclusion
FastAPI provides excellent support for both synchronous and asynchronous request handling. The key to achieving proper concurrency is understanding the difference between async def and def endpoints, and ensuring that blocking operations don't interfere with the event loop. By following the patterns outlined above, you can ensure your FastAPI application handles requests efficiently and concurrently.