Owen Boyd
09/04/2025, 10:26 PM
Task run failed with exception: TaskRunTimeoutError('Scope timed out after 60.0 second(s).') - Retry 1/3 will start 10 second(s) from now 02:29:37 PM
Finished in state Completed() 02:29:22 PM
Marvin
09/04/2025, 10:31 PM
timeout_seconds cancels the task’s execution scope, but it cannot forcibly kill blocking work inside your function (e.g., a requests/DB call without its own timeout). Typical symptoms:
- The first attempt times out and logs that a retry is scheduled
- The underlying blocking call keeps running in the background thread
- The retry cannot actually start until the first attempt yields control
- If the blocking call eventually returns successfully, the task run may get marked Completed and you never see a retry execute
Given your note about a downstream resource being overloaded, this fits pretty well.
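A minimal reproduction sketch of this failure mode (assuming a synchronous task; the sleep stands in for a blocking call with no timeout of its own):
import time
from prefect import flow, task

@task(retries=3, retry_delay_seconds=10, timeout_seconds=60)
def blocking_task():
    # Stand-in for a stuck requests/DB call: it sleeps past the task
    # timeout and cannot be cancelled cooperatively.
    time.sleep(300)
    return "done"

@flow
def demo():
    blocking_task()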
What to check next
- In the UI, open the specific Task Run and review the State History. Do you see “AwaitingRetry” with a scheduled time, then no subsequent “Running” attempt? That indicates the first attempt never really yielded.
- Confirm your task configuration: retries, retry_delay_seconds, and timeout_seconds on the @task decorator. Also check for retry_condition_fn or state hooks that could change behavior.
- Ensure your task code is not catching and swallowing the timeout/cancellation. A broad except Exception: around the body can convert a failure into a normal return, which will mark the task Completed; see the first sketch after this list.
- Turn on debug logs to capture state transitions and scheduling: set PREFECT_LOGGING_LEVEL=DEBUG in the environment for the run (second sketch below).
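To illustrate the swallowing problem, a hypothetical task that masks its own timeout (call_external_service is a placeholder for your blocking call):
from prefect import task

@task(timeout_seconds=60)
def fetch():
    try:
        return call_external_service()  # hypothetical blocking call
    except Exception:
        # Also swallows the timeout error raised inside the task body, so
        # the task returns normally and is marked Completed, not Failed.
        return None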
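And one way to set the debug level when launching the flow from a script (equivalently, export the variable in your shell before starting the run):
import os

# Set before importing/running Prefect so the setting is picked up.
os.environ["PREFECT_LOGGING_LEVEL"] = "DEBUG"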
How to make timeouts and retries reliable with blocking I/O
- Add explicit timeouts to external calls so they fail quickly and cooperatively:
- requests: pass a (connect, read) timeout tuple:
import requests
url = "https://example.com/data"  # placeholder endpoint
# (connect_timeout, read_timeout) in seconds
resp = requests.get(url, timeout=(5, 60))
- Database drivers: set statement/query timeouts at the driver/session level (e.g., PostgreSQL’s statement_timeout); a sketch follows this list.
- If the external client supports cancellation/interrupts, use those APIs.
- For truly non-cooperative work, run it in a subprocess that you can terminate on timeout, then let the task retry. A small wrapper function invoked via subprocess, or a worker pool you can cancel, is often the most reliable option for misbehaving calls; see the subprocess sketch after this list.
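A sketch of a driver-level timeout, assuming PostgreSQL with psycopg2 (other drivers expose similar options):
import psycopg2

# statement_timeout is enforced server-side, in milliseconds, so a stuck
# query errors out instead of blocking past the task's own timeout.
conn = psycopg2.connect(
    "dbname=app user=app",  # placeholder connection string
    options="-c statement_timeout=60000",
)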
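And a sketch of the subprocess approach, where fetch_data.py is a hypothetical script doing the blocking work:
import subprocess
from prefect import task

@task(retries=3, retry_delay_seconds=10, timeout_seconds=90)
def run_isolated():
    # subprocess.run kills the child process when the timeout expires and
    # raises TimeoutExpired, so the task fails cleanly and can retry.
    subprocess.run(
        ["python", "fetch_data.py"],
        check=True,
        timeout=60,
    )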
Quick sanity checklist
- Is your task declared with something like:
from prefect import task
@task(retries=3, retry_delay_seconds=10, timeout_seconds=60)
def my_task(...):
    ...
- Are you invoking it normally from a flow (no custom asyncio task juggling, no manual futures you don’t await)?
- Do you see “AwaitingRetry” in the Task Run’s state timeline in the UI?
If you can share:
- Your Prefect version (output of prefect version)
- The task decorator line (showing retries/timeout config)
- How you call the task inside the flow
- Any external call(s) inside the task (e.g., requests/DB code)
I can help pinpoint the exact cause and recommend the cleanest fix.
References
- Prefect 3 task API: task options signature (retries, timeouts)
- prefect/task_engine.py: task engine source (timeout and retry handling)
- Prefect 3 release notes