Hello, several times already I have run into a problem where a task with a timeout never actually times out. The Python process orchestrated by Prefect is sometimes terminated by the GPU (something like OOM, not important right now). I use the @task(state_handlers=failure_notifiers, nout=2, timeout=60 * 60 * 4) decorator to set a timeout of 4 hours. In the Cloud dashboard, there is no mention of the timeout I set for that task. Is the absence of timeout info in the dashboard expected, or is my timeout somehow not being set?
Kevin Kho
10/18/2021, 6:27 PM
Hey @Anatoly Alekseev, is the timeout taking effect? I think you would see it in the logs? Like a TimedOutError
Anatoly Alekseev
10/18/2021, 8:01 PM
No, it is not taking effect at all. The task hangs in the Running state for days...
Kevin Kho
10/18/2021, 8:14 PM
Gotcha, is the execution happening on different hardware than the flow? Like are you using a cluster or another VM?
Kevin Kho
10/18/2021, 8:16 PM
What Executor are you using?
Anatoly Alekseev
10/18/2021, 8:21 PM
No no, just a local executor and a single server:
with Flow(
    name="Ежедневное переобучение моделей",  # "Daily model retraining"
    schedule=IntervalSchedule(
        start_date=pendulum.datetime(2019, 1, 1, 2, 0, 0, tz="Europe/Moscow"),
        interval=timedelta(days=1),
    ),
    storage=Local(add_default_labels=True),
    run_config=LocalRun(working_dir=WORKING_DIR)
)
Kevin Kho
10/18/2021, 8:27 PM
Got it. Will ask the team about it
Zanie
10/18/2021, 8:54 PM
Hi! Can you run the flow with debug level logs? Perhaps set the timeout to a couple seconds so we can see if it fails quickly.
Zanie
10/18/2021, 8:57 PM
I'd also note that if your flow is killed by the machine, we can't enforce a timeout because the process that checks for the timeout is likely killed too. We generally recommend using Flow "did not finish" SLAs for this, which can be set up under "Automations" in the UI. https://docs.prefect.io/orchestration/concepts/automations.html#automations
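The point above can be illustrated with a minimal sketch. It is not Prefect's implementation and does not use Prefect at all: it only shows the general idea of enforcing a deadline from a separate supervising process, so the check survives when the worker hangs or is killed. The function name run_with_external_timeout and the status strings are hypothetical, and only the Python standard library is assumed.

```python
import subprocess
import sys


def run_with_external_timeout(cmd, timeout_s):
    """Run cmd as a child process and let the parent decide the outcome.

    Hypothetical helper for illustration only; not part of Prefect.
    """
    try:
        # The deadline is tracked by the parent, not by the worker itself.
        completed = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before re-raising.
        return "timed_out"
    if completed.returncode < 0:
        # On POSIX, a negative return code means the child died from a signal
        # (for example SIGKILL from an out-of-memory killer).
        return "killed"
    return "finished" if completed.returncode == 0 else "failed"


if __name__ == "__main__":
    # A worker that would hang for a minute is stopped after one second.
    hanging_worker = [sys.executable, "-c", "import time; time.sleep(60)"]
    print(run_with_external_timeout(hanging_worker, timeout_s=1))
```

The design choice is the same one behind a "did not finish" SLA: whatever watches the clock has to live outside the thing that can die.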
Anatoly Alekseev
10/18/2021, 9:18 PM
Got it, that must be the reason. I need to buy a paid plan! Thank you guys so much for looking into it. Still, it would be amazing if the task could time out (or maybe fail) even when the process has been killed :)