Hi Everyone,
We are having an issue with Prefect v3 on Cloud: some flows are being retried mid-run.
The tasks are supposed to run sequentially using ThreadPoolTaskRunner(max_workers=1) and wait_for. A few tasks in, at a random task, logs for a new execution (run) appear and the UI shows a new, separate flow tree.
Checking the pod status, we found that the flow's original pod had been deleted and a new pod had been created.
What should we do to stop this strange behavior?
Thank you!
Bianca Hoch
03/13/2025, 5:40 PM
Hi Lina! Based on what you've described, it sounds like the job is getting evicted and k8s is attempting to re-run the pod.
Couple questions for you:
• Does the original flow ever enter a Crashed state? Does it get stuck in a particular state?
• Do you have any retries set for your flow's decorator? ie: @flow(retries=3)
Lina M
03/17/2025, 9:10 AM
Hi @Bianca Hoch, Thank you for your answer.
• No, the original flow does not change its state along the way. It stays in Running.
• No. The retries are set at the task level only.
Ulysse Petit
03/23/2025, 4:43 PM
I got the same issue with a GCP work pool (with Cloud Run Services as workers).
When I run the flow, a new run starts, and then the run suddenly retries in the middle of execution, showing "Run count 2/1" in the UI.