Hey guys! Let me describe my use case. I have a PaddlePaddle model and a TensorFlow 2.0 model that run inference on a video stream. I'm using a Local Dask Execution Environment to run these two models in parallel. After the flow finishes, the Dask worker still seems to hold on to the memory instead of releasing it.
To get the memory released, I have to kill the Dask worker (which I'm doing with a keyboard interrupt for now). So I decided to move the Prefect Local Agent, Dask Scheduler and Dask Worker under Monit (a Unix utility for monitoring processes).
1. I'm not able to get Monit to work with any of these services. Does anyone have experience with this?
2. Would it be sensible to have the final node in the DAG look up the Dask worker's PID and use pkill to kill it? That way, Monit can recover and restart the Dask worker before the next flow run starts.
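For reference, the "kill the worker by PID" idea in point 2 could look roughly like the sketch below. It uses only the Python standard library on a POSIX system, and a sleeping child process stands in for the Dask worker; `terminate_pid` is a hypothetical helper name, not part of Prefect or Dask. One caveat: if the final task itself runs on the worker being killed, the task may die with it, so the flow run might not finish cleanly.

```python
import os
import signal
import subprocess
import sys


def terminate_pid(pid: int, sig: int = signal.SIGTERM) -> bool:
    """Send `sig` to the process `pid`.

    Returns False if no process with that PID exists, True otherwise.
    (Hypothetical helper for illustration only.)
    """
    try:
        os.kill(pid, sig)
    except ProcessLookupError:
        return False
    return True


if __name__ == "__main__":
    # Stand-in for a Dask worker: a child process that just sleeps.
    worker = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    terminate_pid(worker.pid)
    # Reap the child so it does not linger as a zombie.
    worker.wait(timeout=10)
    print("worker exit code:", worker.returncode)
```

With Monit watching the real worker, sending the signal would be enough; the supervisor would then be responsible for starting a fresh process.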
I'd greatly appreciate it if someone who's worked on something similar could help, since I'm kinda clueless right now. 😫
This is my monit process.
This is the script I'm using.
12/23/2020, 4:41 PM
Hello Ganesh! Although I have never used Monit, I would suggest checking out Resource Manager, which should help you clean up the resources 👍
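The general idea behind a resource manager is a setup step paired with a cleanup step that is guaranteed to run, even when the work in between fails. The sketch below shows that pattern in plain Python only; it is not Prefect's Resource Manager API, and `managed_worker` is a hypothetical name. A sleeping child process again stands in for the Dask worker.

```python
import subprocess
import sys
from contextlib import contextmanager


@contextmanager
def managed_worker(cmd):
    """Start `cmd` as a child process and always stop it on exit.

    Hypothetical helper: setup happens before `yield`, cleanup in `finally`.
    """
    proc = subprocess.Popen(cmd)  # setup
    try:
        yield proc
    finally:
        # cleanup: ask the process to stop, escalate if it does not.
        if proc.poll() is None:
            proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()


if __name__ == "__main__":
    cmd = [sys.executable, "-c", "import time; time.sleep(60)"]
    with managed_worker(cmd) as worker:
        print("worker running, pid:", worker.pid)
    # The process is gone here, so the OS has reclaimed its memory.
    print("worker exit code:", worker.returncode)
```

Because the memory belongs to the child process, ending that process is what actually returns it to the operating system.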
12/24/2020, 8:33 AM
Hey @Mariia Kerimova! Thanks for the help. This looks to be extremely promising!!
Hey @Mariia Kerimova, I ended up using DaskExecutor() as is, which created a LocalCluster() and circumvented the need to clean up resources!