Question: I’m setting my flow to use the LocalDaskExecutor and deploying to a Docker agent. When I use a docker agent on my laptop it parallelizes just fine, but when I try to run on a VM (16 CPUs, Google Compute Engine) it’s only using a single core, and only executing one task at a time. Not sure if I’m setting it up wrong, any ideas what might be happening?
Anna Geller
11/02/2021, 7:30 PM
@Greg Adams are you using it with mapping? Mapped tasks should be parallelized across your threads or processes. You can also set the number of workers yourself using num_workers:
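from prefect.executors import LocalDaskExecutor

# Option 1: use 16 threads
flow.executor = LocalDaskExecutor(scheduler="threads", num_workers=16)
# Option 2: use 16 processes (set one executor or the other, not both)
flow.executor = LocalDaskExecutor(scheduler="processes", num_workers=16)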
Anna Geller
11/02/2021, 7:32 PM
But using a Docker agent is a bit tricky: you may need to configure Docker to use more resources, e.g. in Docker Desktop you can raise the CPU limit in the resource settings.
Anna Geller
11/02/2021, 7:37 PM
@Greg Adams it looks like you can use host_config to specify that on a per-flow basis:
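from prefect.run_configs import DockerRun
# cpuset_cpus="0-15" lets the flow run's container use CPUs 0 through 15
run_config = DockerRun(host_config=dict(cpuset_cpus="0-15"))
flow.run_config = run_config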
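To check whether the container actually sees all 16 cores, a small diagnostic like the one below could be run inside a task or in the container itself. This is only a sketch: the helper name `visible_cpus` is made up for illustration, and it relies on `os.sched_getaffinity`, which is available on Linux only. It reflects cpuset-style restrictions (such as `cpuset_cpus`), but as far as I know it does not reflect CPU quota limits.

```python
import os


def visible_cpus() -> int:
    """Return the number of CPUs this process is allowed to run on.

    Hypothetical helper for illustration; not part of Prefect.
    """
    if hasattr(os, "sched_getaffinity"):
        # Linux only: honors cpuset restrictions applied to the container
        return len(os.sched_getaffinity(0))
    # Fallback for platforms without sched_getaffinity (e.g. macOS, Windows)
    return os.cpu_count() or 1


if __name__ == "__main__":
    print(f"os.cpu_count(): {os.cpu_count()}")
    print(f"CPUs usable by this process: {visible_cpus()}")
```

If the second number comes back as 1 on the VM, the limit is probably coming from the container configuration rather than from the executor settings.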