with Dask is there a good way to determine a reaso...
# prefect-community
b
With Dask, is there a good way to determine a reasonable number of workers and threads per worker in a containerized environment like Fargate and/or K8s? The Fargate containers we use for our flows have 0.25 vCPU. I started with 1 worker and 2 threads per worker on a local Dask cluster, but I'm having trouble finding documentation that would help me guide folks in choosing the number of workers and threads as our container sizes increase for a single-machine local Dask cluster. Any feedback would be greatly appreciated.
c
I’d argue that the number of workers is dependent entirely on the amount of parallelism you’re hoping to achieve (with the understanding that each worker is also capable of some parallelism depending on the number of threads and processes). The creator of Dask has this to say about nthreads / nprocs: https://stackoverflow.com/questions/49406987/how-do-we-choose-nthreads-and-nprocs-per-worker-in-dask-distributed
d
Hey Braun, this seems like an interesting question for the Dask community specifically. Here are some resources that might be helpful: https://docs.dask.org/en/latest/support.html
b
Thanks for this... there's not a lot on the web about Dask threads/procs as they relate to Docker. It seems like a 256 CPU-unit (0.25 vCPU) container should stick with 1 thread, but 3 threads does seem to offer some concurrency.
going to see what the gitter says
d
Let us know what you find! 😄
Matt Rocklin commented on the question and said it was not specific enough, which is fair. We are going to use one worker per vCPU, with 3 threads per core, to get some level of concurrency for single-core flows. We will optimize from there based on the performance we need. Through this we have gained a much better understanding of concurrency and parallelism in Python, which is great.
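For anyone who wants to try the same starting point, here is a rough sketch of that sizing rule, not a recommendation from the Dask team. The helper name `local_cluster_kwargs` and its `threads_per_core` parameter are made up for illustration. The only outside facts it relies on are that Fargate counts 1024 CPU units per vCPU and that `LocalCluster` in `dask.distributed` accepts `n_workers` and `threads_per_worker`.

```python
import math

# Fargate expresses CPU in units; 1024 units correspond to 1 vCPU.
FARGATE_UNITS_PER_VCPU = 1024


def local_cluster_kwargs(vcpus, threads_per_core=3):
    """Hypothetical helper: one worker per whole vCPU, fixed threads per worker."""
    if vcpus <= 0:
        raise ValueError("vcpus must be positive")
    # Fractional containers (e.g. 0.25 vCPU) still get a single worker.
    n_workers = max(1, math.floor(vcpus))
    return {"n_workers": n_workers, "threads_per_worker": threads_per_core}


# A 256-unit (0.25 vCPU) container and a 4096-unit (4 vCPU) container.
small = local_cluster_kwargs(256 / FARGATE_UNITS_PER_VCPU)
large = local_cluster_kwargs(4096 / FARGATE_UNITS_PER_VCPU)
print(small)
print(large)

# With dask.distributed installed, the dict can be passed straight through:
#   from dask.distributed import LocalCluster
#   cluster = LocalCluster(**large)
```

The 3 threads per worker is only the value we settled on for our flows; treat it as a number to tune against your own workload.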
c
That’s awesome! Yeah, if you ever learn any good rules of thumb, especially as they relate to Prefect, we’d love to hear about them!