
Braun Reyes

01/21/2020, 11:21 PM
Is there a way to implement task max concurrency when using task.map? I have a list of 6 configurations that I want to execute a task against in parallel. I was thinking I could use the Dask executor for this, but I'm having trouble finding kwargs to pass in for it. Using a Dask local cluster would let me test this.

Chris White

01/21/2020, 11:23 PM
Hey @Braun Reyes would Cloud task-tag concurrency tags work for your use case? https://docs.prefect.io/cloud/concepts/concurrency-limiting.html#task-concurrency-limiting

Braun Reyes

01/21/2020, 11:23 PM
You could only test this once it's deployed, right?

Chris White

01/21/2020, 11:24 PM
well, if you use local storage + a local agent you can still test it locally; but yea, it does require Cloud to track the concurrency
I’ll send you the Dask docs for how to replicate this feature without Cloud…

Braun Reyes

01/21/2020, 11:24 PM
Hey thanks

Chris White

01/21/2020, 11:26 PM
If you tag your tasks with a tag of the form "dask-resource:KEY=NUM", they will be parsed and passed as Worker Resources of the form {"KEY": float(NUM)} to the Dask scheduler.
And here’s how to set up Dask worker resources: https://distributed.dask.org/en/latest/resources.html
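Putting the description above together, a rough sketch of the tag-based approach might look like the following. This assumes the Prefect 0.x-era API used elsewhere in this thread; the resource name DB, the scheduler address, and the limit of 2 are all illustrative, and it requires a Dask cluster started separately (e.g. dask-scheduler, then dask-worker tcp://<scheduler>:8786 --resources "DB=2"), so it is a configuration sketch rather than a runnable example.

```python
from prefect import task, Flow
from prefect.engine.executors import DaskExecutor

# The "dask-resource:DB=1" tag is parsed by Prefect and submitted as a
# {"DB": 1.0} worker-resource requirement; a worker started with
# --resources "DB=2" would then run at most 2 of these tasks at a time.
@task(tags=["dask-resource:DB=1"])
def run_config(cfg):
    return cfg

with Flow("limited-concurrency") as flow:
    results = run_config.map(["a", "b", "c", "d", "e", "f"])

# Point the executor at the externally created cluster
# (address is illustrative):
flow.run(executor=DaskExecutor(address="tcp://<scheduler>:8786"))
```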

Braun Reyes

01/22/2020, 2:45 PM
I could not get the tags portion to work, though this did work:
executor = DaskExecutor(n_workers=1, threads_per_worker=3)
flow.run(executor=executor)
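The threads_per_worker approach caps parallelism at the executor level: with one worker and 3 threads, at most 3 mapped tasks run at once. The same idea can be illustrated in plain Python, with no Prefect or Dask required, using a thread pool to map 6 configurations with at most 3 in flight; the names here are illustrative, not Prefect APIs.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

configs = [f"config-{i}" for i in range(6)]  # six configs, as in the thread

lock = threading.Lock()
current = 0  # tasks running right now
peak = 0     # highest concurrency observed

def run_config(cfg):
    global current, peak
    with lock:
        current += 1
        peak = max(peak, current)
    time.sleep(0.1)  # simulate real work
    with lock:
        current -= 1
    return f"{cfg}: done"

# max_workers plays the role of threads_per_worker:
# at most 3 of the 6 tasks execute concurrently.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_config, configs))
```

After the pool drains, results holds one entry per config in input order, and peak never exceeds the 3-thread cap.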

Chris White

01/22/2020, 4:50 PM
@Braun Reyes did you spin up dask workers with resources? You’ll have to create a cluster outside of the Executor call for that to work

Braun Reyes

01/22/2020, 4:56 PM
I'm only trying to use Dask as a way to achieve parallel task processing, so we would only ever use a local cluster on a single machine until our tasks outgrew what a single Fargate container would give us.

Chris White

01/22/2020, 4:56 PM
ahhh OK, yea in that case you can’t use worker resources

Braun Reyes

01/22/2020, 5:22 PM
I think having a concept of flow-based concurrency using threads_per_worker should work just fine
we will pin it to 2 or 3 on a single local worker and then extend from there
dask is impressive though... I was able to map a list to SSH-tunnel-based SQL workflows with 3 tunnels running in parallel at a time

Chris White

01/22/2020, 6:08 PM
that’s awesome!