Avi A

07/25/2020, 8:39 PM
Hey there, I have a flow with many mapped tasks (about 5k), running on a
. The flow fails due to high memory usage by the Dask worker, which keeps getting killed over and over, endlessly. Questions:
1. It seems that Dask/Prefect don’t serialize the completed tasks’ outputs when the worker hits high memory usage. I’m not that proficient in Dask; I’ve used Spark a lot, and I know it dumps to disk whenever memory runs low. How can I configure Dask to do the same (or is it a Prefect thing)? Worth mentioning that the Prefect agent, Dask scheduler, and worker are all on the same machine.
2. Is there some way to have the flow fail in this case? It keeps restarting the worker instead of failing, so I never get a message that this failure happened.
Thanks!
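[Editor's note: Dask's distributed workers do support Spark-style spill-to-disk, controlled by fractional memory thresholds in the Dask config. The sketch below shows the relevant keys; the values are illustrative, not a tested recommendation for this workload.]

```yaml
# ~/.config/dask/distributed.yaml (or dask.config.set from Python)
# All values are fractions of the worker's memory limit.
distributed:
  worker:
    memory:
      target: 0.60     # start spilling least-recently-used data to disk
      spill: 0.70      # spill more aggressively
      pause: 0.80      # pause accepting new tasks on this worker
      terminate: 0.95  # the nanny restarts the worker at this point
```

The endless kill/restart cycle described above comes from the `terminate` threshold: the nanny restarts the worker rather than failing the run. Raising `terminate` (or setting it to `false` to disable the nanny kill) changes that behavior, at the risk of the OS OOM-killer stepping in instead.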
👍 2