Make long-running tasks time out in FastAPI
The Problem
One of the API endpoints in my FastAPI application executes a data retrieval task that sometimes takes a very long time to complete, making my application feel sluggish. How can I cancel the execution of this task if it takes more than a set amount of time and serve cached data instead?
The Solution
We can do this using the asyncio.wait_for function, which takes an awaitable and a timeout value in seconds. The awaitable is run as a task. If the task completes before the timeout is reached, wait_for returns the task's result. If the timeout is reached first, the task is canceled and wait_for raises a TimeoutError. In Python versions before 3.11, it raises asyncio.TimeoutError instead; from 3.11 onward, the two names refer to the same exception.
The following example demonstrates this in a FastAPI endpoint, using a simulated data retrieval task that sleeps for 10 seconds before returning a random number, and a timeout of 5 seconds. On Python versions below 3.11, replace except TimeoutError with except asyncio.TimeoutError.
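To make the cancellation behavior concrete, here is a minimal standalone sketch that needs only the standard library. The names slow_task, run_with_timeout, and events are hypothetical and exist only for illustration. It catches asyncio.TimeoutError, which should work on both older and newer Python versions.

```python
import asyncio

events = []  # hypothetical log of what happened to the task, for illustration only


async def slow_task(delay: float) -> str:
    """Simulate slow work and record whether it was cancelled."""
    try:
        await asyncio.sleep(delay)
    except asyncio.CancelledError:
        events.append("cancelled")
        raise  # re-raise so the cancellation can complete
    return "done"


async def run_with_timeout(delay: float, timeout: float) -> str:
    """Return the task's result, or a marker string if the timeout fires first."""
    try:
        return await asyncio.wait_for(slow_task(delay), timeout=timeout)
    except asyncio.TimeoutError:  # alias of the built-in TimeoutError on 3.11+
        return "timed out"
```

Calling asyncio.run(run_with_timeout(0.01, 1.0)) should return the task's result, while asyncio.run(run_with_timeout(1.0, 0.01)) should return the marker string and leave a "cancelled" entry in events. Note that cancellation is delivered at an await point, so a task that blocks in synchronous code would not be interrupted this way.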
import asyncio
import random

from fastapi import FastAPI

app = FastAPI()
cache = {"result": random.random()}  # seed the cache


async def long_running_task():
    await asyncio.sleep(10)  # sleep for 10 seconds
    return random.random()  # return a random number


@app.get("/retrieve-data")
async def retrieve_data():
    try:
        result = await asyncio.wait_for(long_running_task(), timeout=5)  # time out after five seconds of waiting
        cache["result"] = result  # cache the result
        return {"message": result, "cache": False}
    except TimeoutError:
        return {"message": cache["result"], "cache": True}


# Run FastAPI app
if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="0.0.0.0", port=8000)
A request to this application’s /retrieve-data endpoint will invoke long_running_task, cancel it after five seconds, and then return the cached random number. If we swap the task’s sleep time and the timeout value, a request to /retrieve-data will wait for five seconds before returning a new random number. In both cases, the API indicates whether the value returned was sourced from the cache.
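Both cases described above can be tried without starting a server. The sketch below pulls the endpoint logic into a plain coroutine with the delay and timeout as parameters; simulated_task and retrieve are hypothetical names, and the short durations are chosen only so the example runs quickly.

```python
import asyncio
import random

cache = {"result": random.random()}  # seed the cache, as in the endpoint example


async def simulated_task(delay: float) -> float:
    """Stand-in for the data retrieval task: sleep, then return a random number."""
    await asyncio.sleep(delay)
    return random.random()


async def retrieve(delay: float, timeout: float) -> dict:
    """Mirror the endpoint logic: fresh value if in time, cached value otherwise."""
    try:
        result = await asyncio.wait_for(simulated_task(delay), timeout=timeout)
    except asyncio.TimeoutError:  # alias of the built-in TimeoutError on 3.11+
        return {"message": cache["result"], "cache": True}
    cache["result"] = result  # refresh the cache only on success
    return {"message": result, "cache": False}
```

With a delay longer than the timeout, for example asyncio.run(retrieve(1.0, 0.05)), the response should carry the seeded value and "cache": True. With the values swapped, it should carry a new number and "cache": False, and the cache is updated to match.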
Monitoring Response Times in Production
While implementing timeouts helps prevent slow endpoints from degrading your API, response time monitoring in production helps you identify why endpoints are slow in the first place. Production environments reveal performance patterns you won’t see locally—database query slowdowns under load, external API latency variations, or resource contention during peak traffic. Application performance monitoring tools automatically track endpoint response times, identify slow database queries, and alert you when performance degrades, helping you optimize before timeouts become necessary.
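As a rough illustration of what tracking endpoint response times involves, the sketch below is a small ASGI middleware written with only the standard library. It is not how any particular monitoring tool works; the class name TimingMiddleware, the slow_threshold parameter, and the in-memory timings list are all assumptions made for this example.

```python
import time


class TimingMiddleware:
    """Hypothetical ASGI middleware that records how long each HTTP request takes."""

    def __init__(self, app, slow_threshold: float = 1.0):
        self.app = app
        self.slow_threshold = slow_threshold  # seconds; an arbitrary example value
        self.timings = []  # (path, seconds) pairs kept in memory, for illustration only

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            # Pass through non-HTTP traffic (lifespan, websockets) untouched.
            await self.app(scope, receive, send)
            return
        start = time.perf_counter()
        try:
            await self.app(scope, receive, send)
        finally:
            elapsed = time.perf_counter() - start
            self.timings.append((scope["path"], elapsed))
            if elapsed > self.slow_threshold:
                print(f"slow request: {scope['path']} took {elapsed:.3f}s")
```

Because this follows the plain ASGI calling convention, it should be possible to wrap a FastAPI application with it, for instance by serving TimingMiddleware(app) instead of app. A real deployment would send these measurements to a monitoring backend rather than keep them in a list.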