# prefect-server
a
We are on `1.1.0`. I cancelled a running flow through the UI; it was deep in the middle of running a series of mapped tasks. Turns out it continued to run for 6 days before ending with
[11 July 2022 2:49pm]: No heartbeat detected from the flow run; marking the run as failed.
What is the expected behavior when "cancelling" a flow engaged in executing mapped tasks? (We are also on the dask-executor, with thread-based parallelism.)
k
On the Prefect side, cancellation is a best-effort thing and can be hard when the execution is happening in a different process. I think Dask itself has no cancellation mechanism, so the tasks may still end up in a running state. Do some of your tasks take that long to run (days)?
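A minimal sketch of why best-effort cancellation is hard with thread-based parallelism. It uses only Python's standard library as an analogy, not Dask or Prefect internals; the `task` function and the event names are hypothetical. Once a thread has started running a task, asking to cancel it does not stop it:

```python
import threading
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-in for a mapped task that is already executing.
started = threading.Event()
release = threading.Event()

def task():
    started.set()             # signal that the task is now running
    release.wait(timeout=5)   # simulate work in progress
    return "finished anyway"

with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(task)
    started.wait(timeout=5)        # make sure the task is mid-execution
    cancelled = future.cancel()    # returns False for a running call
    release.set()                  # let the task complete
    result = future.result(timeout=5)

print(cancelled, result)
```

Here `cancel()` only succeeds for work that has not started yet; a running thread keeps going until the task function itself returns or raises.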
a
No, expected runtime is 30 seconds or so.
It looks like they each run for that expected duration, and then die with an exception related to a database server we attempt to reach.
(The external DB connection, I am quite sure, is not the issue here, given the errant runtime, as well as the final-state issue being the heartbeat problem.)
Thanks for the info on the cancellation.
k
I think this may have stopped running, and it just took a long time for Prefect to mark it as Failed in the UI. Or do you have some sign there was activity during the 6 days?
a
Good point. Could be UI time-delta wonkiness. I'll double check. On that note, because I've encountered the UI displaying odd durations before: is the fact that the UI can be out of sync with the logical duration of flow execution considered an issue, or is it intended given the nature of the service?
k
It’s cuz we don’t calculate duration on the Flow. UI just gets delta of start time and end time. So if you do stuff like restart a flow, it can definitely be wonky
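A rough sketch of that delta, assuming the UI simply subtracts one timestamp from the other. The timestamps are the two log entries from this thread (the year 2022 is taken from the heartbeat message), with the cancel attempt standing in for the last real activity; `naive_duration` is a hypothetical helper, not a Prefect function.

```python
from datetime import datetime, timedelta

def naive_duration(start: datetime, end: datetime) -> timedelta:
    """Duration as a plain end-minus-start delta, ignoring idle time."""
    return end - start

# Last real activity: the executor's cancel attempt.
last_activity = datetime(2022, 7, 5, 17, 43, 48)
# Terminal state: set when the missing heartbeat was noticed.
marked_failed = datetime(2022, 7, 11, 14, 49, 24)

gap = naive_duration(last_activity, marked_failed)
print(gap)
```

Most of that gap may be time in which nothing was executing, which would explain a flow that looks like it "ran" for about 6 days.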
a
So here's the final logs from the flow:
17:43:48 // Jul 5 | INFO | LocalDaskExecutor | Attempting to interrupt and cancel all running tasks...
14:49:24 // Jul 11 | ERROR | prefect-server.ZombieKiller.FlowRun | No heartbeat detected from the flow run; marking the run as failed.
Jul 11 is probably when I checked on the flow
Is the UI going to reach out to get the flow state on-demand?