# ask-community
Hi folks -- was wondering if anyone had any guidance for us on troubleshooting something we've been seeing. We've got a scheduled flow that kicks off every minute. It runs a bunch of sync tasks in parallel using `DaskTaskRunner` (just a default Dask cluster, not connecting to an existing cluster). We're calling `<task>.submit()` and then using `fut.wait()` in a loop to ensure each one is complete. What we're seeing is the number of zombie Python processes under the prefect-agent PID ticking up every minute, which seems to be correlated with this flow. Is there something additional we should be doing to gracefully shut down the Dask workers so that we don't end up with zombie processes? After a few days of this, it ends up crashing the server. While I think we have a workaround (there's no need to use Dask for this; threads would work fine), I'd love to understand what might be going on, as we do want to use Dask for other longer-running Prefect flows.
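For anyone wanting to track the zombie count over time, here is a minimal diagnostic sketch. It assumes a Linux host with `/proc` mounted, and the helper name `zombie_children` is made up for illustration (it is not part of Prefect or Dask). It lists processes in state `Z` whose parent is a given PID, for example the prefect-agent PID.

```python
import os


def zombie_children(parent_pid):
    """Return sorted PIDs of zombie (state 'Z') processes whose parent is parent_pid.

    Hypothetical helper for illustration; Linux-only, reads /proc/<pid>/stat.
    """
    zombies = []
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(f"/proc/{entry}/stat") as fh:
                stat = fh.read()
            # The command name is wrapped in parentheses and may itself contain
            # spaces or ')', so split on the last ')'. The fields that follow
            # start with: state, ppid, ...
            fields = stat.rsplit(")", 1)[1].split()
            state, ppid = fields[0], int(fields[1])
        except (OSError, IndexError, ValueError):
            continue  # process vanished or stat was unreadable; ignore it
        if state == "Z" and ppid == parent_pid:
            zombies.append(int(entry))
    return sorted(zombies)
```

Calling `len(zombie_children(agent_pid))` once a minute, where `agent_pid` is the agent's PID, should show whether the count grows in step with the flow schedule.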
This is Prefect 2.13.1 running on amd64. The container is our own, based on Rocky Linux 9 with Python 3.11 (Prefect is pip-installed, as is prefect-dask, etc.).
Following up on my own post: upgrading to the latest dask and the latest prefect-dask (0.2.4 -> 0.2.5) seems to have helped, as did removing Dask from tasks that run very frequently (and really didn't need the parallelization).
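For reference, the thread-based approach can be sketched with nothing but the standard library. This is only an illustration of the submit-then-wait pattern without worker processes, not the actual flow: `sync_one` and `run_syncs` are hypothetical stand-ins, and inside Prefect 2 the equivalent would presumably be a thread-based task runner such as `ConcurrentTaskRunner` rather than a hand-rolled pool.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def sync_one(source):
    # Hypothetical stand-in for one of the sync tasks.
    return f"synced {source}"


def run_syncs(sources, max_workers=4):
    # Leaving the with-block joins every worker thread, so no child
    # processes (and therefore no zombies) are left behind after a run.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(sync_one, s) for s in sources]
        wait(futures)  # block until every submitted task has finished
        return [f.result() for f in futures]
```

Because the workers are threads inside the flow's own process, there is nothing for the agent to reap afterwards, which fits short, frequent, I/O-bound tasks like these.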