# prefect-community
Joe
This isn't 100% Prefect related, but I thought it would be relevant for some users. We have a need for different types of Dask workers, e.g. high memory, GPU, etc. This would work really well with Prefect's support for task tagging to specify Dask worker resources: https://docs.prefect.io/api/unreleased/engine/executors.html#daskexecutor We use dask-kubernetes (on AWS EKS), which currently supports just a single k8s pod spec for all Dask workers. I'm about to start on a potential PR in dask-kubernetes that would allow for different worker types, but only in non-adaptive mode. Before I dive in, I thought I'd ask if anyone knows of existing work in this area of creating and scaling different Dask worker types, especially on k8s. I see this old issue: https://github.com/dask/distributed/issues/2118 but haven't found much else. (I've also asked on the Dask users gitter channel.)
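For context, here's a minimal sketch of the task-tagging approach linked above, assuming the `dask-resource:KEY=NUM` tag convention described in the DaskExecutor docs; the scheduler address and resource names are illustrative:

```python
# Sketch only: tasks tagged "dask-resource:KEY=NUM" are routed to Dask
# workers started with matching --resources (e.g. --resources "GPU=1").
from prefect import Flow, task
from prefect.engine.executors import DaskExecutor

@task(tags=["dask-resource:GPU=1"])
def train_on_gpu(data):
    ...  # only runs on workers advertising a GPU resource

@task(tags=["dask-resource:MEMORY=60e9"])
def load_big_dataset():
    ...  # only runs on high-memory workers

with Flow("heterogeneous-workers") as flow:
    data = load_big_dataset()
    train_on_gpu(data)

# Address is illustrative; point it at your Dask scheduler.
flow.run(executor=DaskExecutor(address="tcp://dask-scheduler:8786"))
```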
josh
Hey Joe, I'm not aware of any work on using multiple worker definitions with dask-kubernetes, but I just want to throw some past experience in here. I really like the idea of being able to provide multiple worker specs to dask-kubernetes, because in the past when I tried to use different worker types on k8s I had to abandon the library and use manual scheduler/worker deployments: a single scheduler deployment plus a couple of different worker deployments, each with its own specification. Is your intention with this new feature to work toward a point where adaptive mode would one day become possible, or is your use case to always have a static number of workers?
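For illustration, here's a rough sketch of that manual-deployments setup: one scheduler plus separate worker groups that join it with different Dask resource labels. The commands, addresses, and resource names are illustrative, not a specific deployment:

```python
# Each k8s Deployment would run one of these commands in its container:
#
#   dask-scheduler --port 8786
#   dask-worker tcp://dask-scheduler:8786 --resources "MEMORY=60e9"   # high-mem group
#   dask-worker tcp://dask-scheduler:8786 --resources "GPU=1"         # GPU group
#
# Client code then targets a specific worker group via resource annotations:
from dask.distributed import Client

client = Client("tcp://dask-scheduler:8786")
future = client.submit(lambda x: x + 1, 41, resources={"GPU": 1})
print(future.result())
```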
Joe
Hey @josh, thanks for the thoughts and question. Here's roughly what I'm thinking:
• Leave the existing worker spec as a special `default` worker type
• Adaptive mode would only use the `default` worker type
• `cluster.scale(n)` would use the `default` worker type
• Provide the ability to add any number of worker types, stored as dict entries of str -> pod spec
• Provide the ability to call `cluster.scale(n, worker_type='gpu')` to manually scale specific worker types
I have this prototyped and sort of working in a fork of dask-kubernetes. I need to clean it up, make sure it scales down worker types correctly, and do much more testing, but the basics seem to work so far. In theory, adaptive mode (using the `default` worker type) may be able to co-exist with manual scaling of specific worker types, but I haven't looked at that yet. I'm debating whether this is something appropriate to merge into dask-kubernetes or whether it's a bit of a one-off that happens to meet our needs. I'd love any thoughts & feedback on the way I'm thinking about it.
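To make the proposal concrete, here's a hypothetical sketch of what that could look like. `KubeCluster` and `make_pod_spec` are existing dask-kubernetes pieces; `add_worker_type()` and the `worker_type=` argument to `scale()` are the proposed additions and don't exist in the released library:

```python
# Hypothetical API sketch only -- add_worker_type() and scale(..., worker_type=)
# are the proposed additions, not released dask-kubernetes functionality.
from dask_kubernetes import KubeCluster, make_pod_spec

# Existing behavior: a single pod spec acts as the `default` worker type.
cluster = KubeCluster(pod_template=make_pod_spec(image="daskdev/dask:latest"))

# Proposed: register additional worker types as name -> pod spec entries.
cluster.add_worker_type(
    "gpu",
    make_pod_spec(
        image="daskdev/dask:latest",
        extra_container_config={"resources": {"limits": {"nvidia.com/gpu": 1}}},
    ),
)
cluster.add_worker_type(
    "highmem",
    make_pod_spec(image="daskdev/dask:latest", memory_request="60G", memory_limit="60G"),
)

cluster.scale(10)                      # scales the `default` worker type
cluster.scale(2, worker_type="gpu")    # proposed: manually scale a specific type
cluster.adapt(minimum=0, maximum=20)   # adaptive mode would stay on `default` only
```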