David Elliott

01/11/2023, 2:43 PM
Hey! Question about flow cancellation & Dask workers on k8s (Prefect 2.7.7). I’ve got a Kubernetes-deployed flow which uses dask_kubernetes and spawns Dask pods (scheduler + workers) to execute the tasks. When I cancel a running flow from the Cloud UI, it terminates the flow run and deletes the Prefect job pod, but it leaves the Dask pods hanging: still alive, but not actually doing anything. Is that intended, and if so, any suggestions on the best approach for auto-cleanup of these? At the end of a failed or successful flow run they are auto-terminated, but not when the flow is cancelled. Thanks!

Christopher Boyd

01/11/2023, 2:47 PM
Hi David, you would likely need a custom handler. The Dask scheduler should be responsible for cleaning up the Dask workers, but if that’s not the case, you would need a handler to do it manually - https://distributed.dask.org/en/stable/worker.html
I can check with the team whether that’s planned to be included natively in the integration.

David Elliott

01/11/2023, 2:50 PM
Hmm, but the Dask scheduler is left hanging there as well. As in, when the flow run is cancelled, I think the prefect-job pod just gets deleted without having a chance to send a signal to the Dask scheduler telling it to terminate. Whereas when a flow runs to completion, the prefect-job sends the Dask scheduler (and its workers) the termination signal..?

Christopher Boyd

01/11/2023, 3:01 PM
Let me check. When you cancel, the cancellation goes to the agent directly, which submits it to the infrastructure. There should be a ~30 second grace period before it’s forcefully killed, but I’ll check with the team.
Would you be willing to file this as an issue here? https://github.com/PrefectHQ/prefect-dask

David Elliott

01/11/2023, 4:05 PM
Sure - I’ve tried to summarise it here: https://github.com/PrefectHQ/prefect-dask/issues/68 Let me know if it’s unclear.
Like you say, it feels like the agent should allow the flow to take a cancellation action (e.g. send a termination signal to the Dask scheduler), but at present the flow pod just gets immediately deleted.
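
For reference, a rough sketch of the kind of cancellation action I mean. It assumes the flow process actually receives SIGTERM during the grace period and that the flow code holds a reference to the cluster object, neither of which I've confirmed for Prefect 2.7.7. `install_cancellation_cleanup` is a made-up helper name, not part of Prefect or prefect-dask; `cluster` is anything with a `close()` method, such as a dask_kubernetes cluster created by the flow itself.

```python
import signal
import sys


def install_cancellation_cleanup(cluster, signals=(signal.SIGTERM,)):
    """Close `cluster` when one of `signals` arrives, then exit.

    Hypothetical helper: `cluster` is any object exposing close(),
    for example a Dask cluster object created by the flow process.
    """

    def _handler(signum, frame):
        try:
            # Ask the cluster to shut down its scheduler and workers.
            cluster.close()
        finally:
            # Exit with the conventional 128 + signal number status.
            sys.exit(128 + signum)

    # Signal handlers can only be registered from the main thread.
    for sig in signals:
        signal.signal(sig, _handler)
    return _handler
```

This only helps if the pod is given time to react. If it is killed outright with no grace period, the handler never runs and the Dask pods would still need cleaning up from outside.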

Christopher Boyd

01/11/2023, 4:07 PM
that issue is fantastic and detailed
thank you
I’ll raise this with the team