Matic Lubej
11/10/2021, 1:03 PM

Anna Geller
$ dask-worker tcp://scheduler:port --memory-limit=auto     # TOTAL_MEMORY * min(1, nthreads / total_nthreads)
$ dask-worker tcp://scheduler:port --memory-limit="4 GiB"   # four gigabytes per worker process
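If you start the cluster from Python rather than the CLI, the same limit can be passed when the workers are created. A minimal sketch, assuming a local cluster (the worker count and sizes are just illustrative):

from dask.distributed import Client, LocalCluster

# each of the two worker processes is capped at 4 GiB of memory
cluster = LocalCluster(n_workers=2, threads_per_worker=2, memory_limit="4 GiB")
client = Client(cluster)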
But I wonder, even if you could allocate more resources to that specific Dask worker, how would you guarantee that this specific task is processed by that specific worker? I honestly don’t know how this could be accomplished, since it is the Dask scheduler that decides how to distribute work across workers.
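Perhaps Dask’s worker resources could help here, though I haven’t tried this myself. A rough sketch, where process_chunk and chunk are made-up names: the worker advertises a custom resource, and a task submitted with a matching requirement is only scheduled on workers that advertise it.

$ dask-worker tcp://scheduler:port --resources "MEMORY=8e9"   # this worker advertises a custom MEMORY resource

from dask.distributed import Client
client = Client("tcp://scheduler:port")
# only runs on workers that advertise enough of the required resource
future = client.submit(process_chunk, chunk, resources={"MEMORY": 4e9})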
Also, my understanding is that when you do a reduce, it is not the worker but the driver node that collects the mapped tasks’ results in memory, so the driver node would need to have enough memory for that.
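To illustrate with a minimal Prefect sketch (the task names are made up): the reduce step receives the full list of mapped results at once, so wherever it runs has to hold all of them in memory.

from prefect import Flow, task

@task
def transform(x):
    return x * 2

@task
def combine(results):
    # the entire list of mapped results is materialized here at once
    return sum(results)

with Flow("map-reduce") as flow:
    mapped = transform.map(list(range(100)))
    total = combine(mapped)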
Matic Lubej
11/10/2021, 1:47 PM
Anna Geller
Memory is configured on the coiled.Cluster class passed to DaskExecutor; on this cluster you can set both of the following (there is a short sketch after the list):
• scheduler_memory - defaults to 4 GiB
• worker_memory - defaults to 8 GiB
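A minimal sketch of wiring this up (assuming Prefect 1.x; the memory values are just illustrative):

import coiled
from prefect.executors import DaskExecutor

executor = DaskExecutor(
    cluster_class=coiled.Cluster,
    cluster_kwargs={
        "scheduler_memory": "4 GiB",   # the default mentioned above
        "worker_memory": "16 GiB",     # raise the per-worker memory
    },
)
# then attach it to your flow, e.g. flow.executor = executor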
I’m no Dask expert, but I can point you to some resources that can perhaps help a bit more:
• This blog post explains more about Dask Cluster configuration,
• The Prefect docs on Dask Deployment: https://docs.prefect.io/core/advanced_tutorials/dask-cluster.html#deployment-dask

Matic Lubej
11/10/2021, 2:22 PM

Kevin Kho