Hey guys, <@U01MEG3ET6C> and I are using `LocalDas...
# ask-community
g
Hey guys, @Dana Merrick and I are using
LocalDaskExecutor
in combination with
RunNamespacedJob
to launch a bunch of trivially parallelizable child processes on Kubernetes. However, it looks like only 2 child jobs are getting run at a time. Is there a way to increase the parallelism // are we doing this right? 😄
k
Hey @Gabe Grand ! How do you define the LocalDaskExecutor? Did you specify the number of workers?
d
nope just defaults
LocalDaskExecutor()
k
Ah ok I think we just need to up the number of workers from the default. Will look how.
🙏 2
Try something like
flow.executor = LocalDaskExecutor(num_workers=2)
ah yes, i see it now
👍 1
ty!
👍 1
g
oh nice! well that’s easy 🙂
thanks for the quick reply @Kevin Kho!
k
Of course!
g
just thinking down the line a bit here - is there a way to specify different
num_workers
for different tasks? we want high
num_workers
for these
RunNamespacedJob
tasks, but there are other tasks in the flow that I could see breaking with too many threads
d
i think you just override the
executor
in the flow definition
k
That would be cool if that worked @Dana Merrick. Will ask around and see if there’s any other ways and get back to you on this one.
g
@Kevin Kho @Dylan Could you sanity check our design pattern? We’re running
t1 = RunNamespacedJob.map(body=[b1, b2, …])
to create a bunch of parallel K8s jobs, and then we have some downstream tasks that depend on
t1
. Not sure if this is the best way to achieve MapReduce-style parallelism with Prefect.
k
So workers are allocated at initialization and overriding in the flow will not work because the latest one is the one saved for execution. Task concurrency limiting is the only way to achieve that.
👍 1
g
@Dylan ah right! I remember you mentioned this was one of the main use cases for tags. thanks!
k
Yes I think the design pattern is correct. The immediate non-mapped task after t1 will perform the reduce.
👍 1
g
We’re currently using a custom setup for Prefect server, but will think about Cloud to take advantage of the global concurrency limit tagging feature