Avi A

05/30/2020, 10:00 PM
Hey community! I’m having a problem with LocalDaskExecutor. I keep getting the following error messages, which are probably related:
Error message: can't start new thread
Error message: 'DummyProcess' object has no attribute 'terminate'
BlockingIOError: [Errno 11] Resource temporarily unavailable
More info / questions:
1. I’m running on a 32-core machine. The CPU usage is about 20%, memory ~50%, so it seems like there’s no exhaustion of resources.
2. I have 3 independent mapped tasks. It seems that each of them has 32 running tasks at any given time. Is that expected? Does Dask really want to run 32*3 threads concurrently?
3. Any ideas on how to start tackling this?
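One possible starting point for question 3, offered as a sketch rather than a diagnosis: both "can't start new thread" and Errno 11 (EAGAIN) can show up when the OS refuses to create more threads or processes, which can happen with idle CPU and free memory if a per-user limit is reached. Assuming a Linux host, where RLIMIT_NPROC counts threads as well as processes, the helper below (thread_limit_report is a hypothetical name, not part of Prefect or Dask) prints the numbers worth comparing. Whether a limit is actually the cause here is an assumption.

```python
import os
import threading

try:
    import resource  # Unix-only module; absent on Windows
except ImportError:
    resource = None


def thread_limit_report():
    """Collect numbers that bound how many threads this process can start."""
    report = {
        "cpu_count": os.cpu_count(),
        "active_threads": threading.active_count(),
        "nproc_soft_limit": None,
        "nproc_hard_limit": None,
    }
    # RLIMIT_NPROC is not defined on every platform, so check before use.
    if resource is not None and hasattr(resource, "RLIMIT_NPROC"):
        soft, hard = resource.getrlimit(resource.RLIMIT_NPROC)
        # A value equal to resource.RLIM_INFINITY means "no limit".
        report["nproc_soft_limit"] = soft
        report["nproc_hard_limit"] = hard
    return report


if __name__ == "__main__":
    for key, value in thread_limit_report().items():
        print(f"{key}: {value}")
```

Running this inside the same environment (container, user account) that executes the flow matters, since the limits are per user and may differ from those of an interactive shell.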

Alex Cano

05/30/2020, 10:15 PM
Can you share the code where you’re creating your executor? Do you run into the same issues when using the DaskExecutor? Without much more detail on where those error messages come from, it might be a bit tough to debug (without someone intimately familiar w/ Dask).

Avi A

05/31/2020, 8:31 AM
@Alex Cano I don’t create the executor, I tell the Flow to use it, like so:
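from prefect import Flow
from prefect.environments import RemoteEnvironment

with Flow("example-flow") as flow:  # placeholder name; Flow requires one
    ...

# Use the local Dask executor with its thread-based scheduler
flow.environment = RemoteEnvironment(
    executor="prefect.engine.executors.LocalDaskExecutor",
    executor_kwargs={'scheduler': 'threads'},
)
flow.register()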
I’m seeing the error in the UI when I try to run it. I haven’t tried using DaskExecutor; I see no reason to, since I’m only using one machine.

Alex Cano

05/31/2020, 8:04 PM
@Avi A Gotcha! In that case, consider running the DaskExecutor, since the core underlying difference between the DaskExecutor and the LocalDaskExecutor is which scheduler is used under the hood. One thing I will note is that I’m not aware of a scenario where using the LocalDaskExecutor is preferable to the DaskExecutor itself, so that’s probably a place for someone more familiar with Dask to chime in. See below for links to the Dask docs on the different kinds of schedulers.
Different schedulers: https://docs.dask.org/en/latest/scheduling.html
Using the distributed scheduler (DaskExecutor) on a single machine: https://docs.dask.org/en/latest/setup/single-distributed.html
Hopefully these links can help!