https://prefect.io logo
Title
a

Aliza Rayman

01/28/2020, 4:11 PM
Hi All! I've been trying to run Prefect using a pretty intense pipeline with about 210,000 tasks in total split into 7 flows which are created and run in a loop. (The maximum number of tasks mapped to 1 task is 1000.) I'm running it via a
DaskEnvironment
. I've been getting various errors, including a
CancelledError
,
distributed.client - WARNING - Couldn't gather 3 keys (refereing to TCP keys)
, and
TCP Broken Pipe Error
Has anyone experienced this/ debugged this?
j

Joe Schmid

01/28/2020, 4:55 PM
Hi @Aliza Rayman, we've run some Flows with large numbers of mapped tasks. We do run into some intermittent issues like the
CancelledError
you mentioned and have found the issues are nearly always Dask related rather than Prefect. Could you say a little more about your Dask cluster? e.g. are you using Prefect's
DaskKubernetesEnvironment
with a cloud provider like AWS or GCP? We use a long-running Dask cluster that we create and manage on our own and found that to be successful, though it certainly is more work and more complex to set up initially. We tell our Flows to run on that Dask cluster using Prefect's
RemoteEnvironment
like this:
RemoteEnvironment(
    executor="prefect.engine.executors.DaskExecutor",
    executor_kwargs={"address": "<tcp://dask-scheduler:8786>"},
    labels=["staging"],
)