Brett Jurman

08/04/2021, 4:07 PM
Sometimes with my coiled flows it takes long enough that prefect times out while starting the cluster. Is there a way to extend the timeout?

Kevin Kho

08/04/2021, 4:10 PM
Hey @Brett Jurman, I’ll check with the team. What error do you get?
Is it an API timeout?

Brett Jurman

08/04/2021, 4:13 PM
Unexpected error: TimeoutError('Timed out after 600 seconds waiting for 1 workers to arrive, check your notifications with coiled.get_notifications() for further details')
Traceback (most recent call last):
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/prefect/engine/flow_runner.py", line 442, in get_flow_run_state
    with self.check_for_cancellation(), executor.start():
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/contextlib.py", line 113, in __enter__
    return next(self.gen)
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/prefect/executors/dask.py", line 223, in start
    with self.cluster_class(**self.cluster_kwargs) as cluster:
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/coiled/cluster.py", line 310, in __init__
    self.sync(self._start)
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/coiled/cluster.py", line 346, in sync
    return super().sync(
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/distributed/deploy/cluster.py", line 193, in sync
    return sync(self.loop, func, *args, **kwargs)
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/distributed/utils.py", line 338, in sync
    raise exc.with_traceback(tb)
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/distributed/utils.py", line 321, in f
    result[0] = yield future
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/coiled/cluster.py", line 462, in _start
    raise e
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/coiled/cluster.py", line 459, in _start
    await self._wait_for_workers(1, timeout="10 minutes")
  File "/home/brett/miniconda3/envs/neuro4/lib/python3.8/site-packages/coiled/cluster.py", line 476, in _wait_for_workers
    raise TimeoutError(err_msg)
TimeoutError: Timed out after 600 seconds waiting for 1 workers to arrive, check your notifications with coiled.get_notifications() for further details

Kevin Kho

08/04/2021, 4:14 PM
I think this is a Coiled timeout, not a Prefect one, right? All the error logs are on the Coiled side? Prefect probably fails the flow once this error is raised.

Brett Jurman

08/04/2021, 4:16 PM
I see, so you're saying it's not a Prefect timeout but a Coiled timeout.

Kevin Kho

08/04/2021, 4:17 PM
I think so yep
Could you ask Coiled? I’ll keep an eye out for the thread.
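
Note: the traceback shows the 10-minute wait hardcoded inside Coiled's `_start` (`timeout="10 minutes"`), so in that version there may be no setting to raise it from the Prefect side. One possible workaround, sketched below, is to retry cluster creation when it times out. This is only a sketch under assumptions: `start_with_retries` and its parameters are hypothetical names, not part of Prefect or Coiled, and it assumes the error raised is Python's built-in `TimeoutError`, as the traceback suggests.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def start_with_retries(
    factory: Callable[..., T],
    attempts: int = 3,
    delay_seconds: float = 0.0,
    **kwargs,
) -> T:
    """Call factory(**kwargs), retrying when it raises TimeoutError.

    Hypothetical helper: `factory` stands in for a cluster constructor.
    The last TimeoutError is re-raised once all attempts are used up.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return factory(**kwargs)
        except TimeoutError:
            if attempt == attempts:
                raise  # out of attempts: surface the original error
            time.sleep(delay_seconds)  # brief pause before the next try
```

Since the traceback shows Prefect calling `self.cluster_class(**self.cluster_kwargs)`, something like `functools.partial(start_with_retries, coiled.Cluster)` could in principle be passed as the executor's `cluster_class`, but that is untested here. It is also unknown whether a timed-out attempt leaves a partially started cluster behind, so check with Coiled before relying on it.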