Matic Lubej

01/19/2021, 8:18 PM
Hi again! I'm trying to run a process over a Fargate cluster using
API. Running tutorials and sample code from
this works great, but for Prefect I have created a dedicated Docker image, which I provide to the cluster initializer. The cluster gets created, but as soon as the flow starts, I get the following timeout error after 10 s:
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/", line 418, in get_flow_run_state
    with self.check_for_cancellation(), executor.start():
  File "/usr/local/lib/python3.8/", line 113, in __enter__
    return next(self.gen)
  File "/usr/local/lib/python3.8/site-packages/prefect/executors/", line 203, in start
    with Client(self.address, **self.client_kwargs) as client:
  File "/usr/local/lib/python3.8/site-packages/distributed/", line 748, in __init__
  File "/usr/local/lib/python3.8/site-packages/distributed/", line 953, in start
    sync(self.loop, self._start, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/distributed/", line 340, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.8/site-packages/distributed/", line 324, in f
    result[0] = yield future
  File "/usr/local/lib/python3.8/site-packages/tornado/", line 762, in run
    value = future.result()
  File "/usr/local/lib/python3.8/site-packages/distributed/", line 1043, in _start
    await self._ensure_connected(timeout=timeout)
  File "/usr/local/lib/python3.8/site-packages/distributed/", line 1100, in _ensure_connected
    comm = await connect(
  File "/usr/local/lib/python3.8/site-packages/distributed/comm/", line 308, in connect
    raise IOError(
OSError: Timed out trying to connect to <tcp://> after 10 s
[2021-01-19 20:14:19+0000] ERROR - prefect.Execute process | Unexpected error occured in FlowRunner: OSError('Timed out trying to connect to <tcp://> after 10 s')
Traceback (most recent call last):
  File "", line 114, in <module>
    assert status.is_successful()
Any ideas what is going on? Is the Dask scheduler having trouble connecting to the workers, or might it be something else? Thanks!


01/19/2021, 8:53 PM
Hi @Matic Lubej, I can't say for certain based on the information you provided, but it looks like wherever your flow is running, it cannot communicate with your Dask scheduler.
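
One quick way to test that diagnosis is to check, from the machine or container where the flow runs, whether the scheduler's TCP port is reachable at all. Below is a minimal sketch using only the standard library. The helper name `can_reach` and the host and port are placeholders for illustration, not values from this thread (8786 is Dask's default scheduler port).

```python
import socket


def can_reach(host: str, port: int, timeout: float = 10.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    # Placeholder address: substitute the scheduler address your executor uses.
    print(can_reach("127.0.0.1", 8786, timeout=2.0))
```

If this returns True but the Dask client still times out, plain networking is probably not the problem, and a protocol or security mismatch between client and scheduler becomes the more likely cause.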

Matic Lubej

01/20/2021, 9:41 PM
It turned out that I had to add the security information to the Dask executor, info provided in Thanks!
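
For anyone landing here with the same timeout, this is roughly what passing security information to the executor can look like. It is only a sketch, assuming the security information is a `distributed.security.Security` object built from TLS files. The function name `make_secure_executor` and its arguments are hypothetical, and the exact setup for a given cluster may differ. It relies on `DaskExecutor` forwarding `client_kwargs` to `Client(self.address, **self.client_kwargs)`, which is the call visible in the traceback above.

```python
def make_secure_executor(address, ca_file, cert_file, key_file):
    """Build a Prefect DaskExecutor whose Dask client uses TLS credentials.

    All four arguments are placeholders: a scheduler address (typically
    starting with tls://) and paths to the CA, client cert, and client key.
    """
    # Imported lazily so this module can be loaded without the packages installed.
    from distributed.security import Security
    from prefect.executors import DaskExecutor

    security = Security(
        tls_ca_file=ca_file,
        tls_client_cert=cert_file,
        tls_client_key=key_file,
        require_encryption=True,
    )
    # client_kwargs is passed through to distributed.Client by the executor.
    return DaskExecutor(address=address, client_kwargs={"security": security})
```

The returned executor would then be attached to the flow in the usual way for your Prefect version, for example through the flow's `executor` attribute.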