Philipp Eisen
01/20/2022, 2:32 PM
No heartbeat detected from the remote task; marking the run as failed.
Are there any obvious things to look for?
Kevin Kho
thread
heartbeats to be more stable
Philipp Eisen
01/20/2022, 2:44 PM
Unexpected error: KilledWorker('xxx-148f1de6c7e840498044bee6c2534264', <WorkerState '<tcp://10.48.39.117:32963>', name: 21, status: closed, memory: 0, processing: 27>)
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.8/site-packages/prefect/engine/flow_runner.py", line 542, in get_flow_run_state
    upstream_states = executor.wait(
  File "/usr/local/lib/python3.8/site-packages/prefect/executors/dask.py", line 440, in wait
    return self.client.gather(futures)
  File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 1946, in gather
    return self.sync(
  File "/usr/local/lib/python3.8/site-packages/distributed/utils.py", line 310, in sync
    return sync(
  File "/usr/local/lib/python3.8/site-packages/distributed/utils.py", line 364, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.8/site-packages/distributed/utils.py", line 349, in f
    result[0] = yield future
  File "/usr/local/lib/python3.8/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/usr/local/lib/python3.8/site-packages/distributed/client.py", line 1811, in _gather
    raise exception.with_traceback(traceback)
distributed.scheduler.KilledWorker: ('xxx-148f1de6c7e840498044bee6c2534264', <WorkerState '<tcp://10.48.39.117:32963>', name: 21, status: closed, memory: 0, processing: 27>)
And then
No heartbeat detected from the remote task; marking the run as failed.
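For anyone reading along: the reply above appears to point at running heartbeats in a thread. A minimal sketch of what that could look like, assuming Prefect 1.x reads the heartbeat mode from the `PREFECT__CLOUD__HEARTBEAT_MODE` environment variable (with `process`, `thread`, and `off` as values). The helper name `heartbeat_env` is hypothetical and only builds the environment mapping; it does not need Prefect installed.

```python
# Assumption: Prefect 1.x picks up the heartbeat mode from this
# environment variable on the flow-run process.
HEARTBEAT_ENV_VAR = "PREFECT__CLOUD__HEARTBEAT_MODE"

# Assumption: these are the accepted modes.
VALID_MODES = {"process", "thread", "off"}


def heartbeat_env(mode: str = "thread") -> dict:
    """Build the env mapping for a flow run (hypothetical helper)."""
    if mode not in VALID_MODES:
        raise ValueError(f"unknown heartbeat mode: {mode!r}")
    return {HEARTBEAT_ENV_VAR: mode}


env = heartbeat_env("thread")

# With Prefect 1.x installed, the mapping could then be attached to the
# flow's run config, for example:
#   from prefect.run_configs import UniversalRun
#   flow.run_config = UniversalRun(env=env)
```

This would only change how heartbeats are sent. The `KilledWorker` error comes from Dask itself (a worker died while holding the task), so worker memory and the worker logs are probably still worth checking separately.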
Kevin Kho
Philipp Eisen
01/20/2022, 2:56 PM
Kevin Kho