`KeyError: 'data'` — have you seen this error mess...
# ask-community
h
KeyError: 'data'
— have you seen this error message before?
Unexpected error: KeyError('data')
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/engine/runner.py", line 48, in inner
    new_state = method(self, state, *args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/prefect/engine/flow_runner.py", line 542, in get_flow_run_state
    upstream_states = executor.wait(
  File "/usr/local/lib/python3.9/site-packages/prefect/executors/dask.py", line 440, in wait
    return self.client.gather(futures)
  File "/usr/local/lib/python3.9/site-packages/distributed/client.py", line 1969, in gather
    return self.sync(
  File "/usr/local/lib/python3.9/site-packages/distributed/client.py", line 865, in sync
    return sync(
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 327, in sync
    raise exc.with_traceback(tb)
  File "/usr/local/lib/python3.9/site-packages/distributed/utils.py", line 310, in f
    result[0] = yield future
  File "/usr/local/lib/python3.9/site-packages/tornado/gen.py", line 762, in run
    value = future.result()
  File "/usr/local/lib/python3.9/site-packages/distributed/client.py", line 1863, in _gather
    response = await future
  File "/usr/local/lib/python3.9/site-packages/distributed/client.py", line 1914, in _gather_remote
    response = await retry_operation(self.scheduler.gather, keys=keys)
  File "/usr/local/lib/python3.9/site-packages/distributed/utils_comm.py", line 385, in retry_operation
    return await retry(
  File "/usr/local/lib/python3.9/site-packages/distributed/utils_comm.py", line 370, in retry
    return await coro()
  File "/usr/local/lib/python3.9/site-packages/distributed/core.py", line 863, in send_recv_from_rpc
    result = await send_recv(comm=comm, op=key, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/distributed/core.py", line 656, in send_recv
    raise exc.with_traceback(tb)
  File "/opt/conda/lib/python3.9/site-packages/distributed/core.py", line 498, in handle_comm
  File "/opt/conda/lib/python3.9/site-packages/distributed/scheduler.py", line 5759, in gather
  File "/opt/conda/lib/python3.9/site-packages/distributed/utils_comm.py", line 88, in gather_from_workers
KeyError: 'data'
a
I have not, but it looks like an error in the reduce step after mapping. Could be due to OOM on the scheduler.
h
OK... The tasks do have a lot of leeway memory-wise, but you may be right. Can I extract Prometheus metrics from the runs to check?
a
Possibly, but I don’t have enough experience with Prometheus to guide you through that.
h
Are there APIs in the Prefect code running on Dask to fetch metrics?
a
Perhaps the Dask Performance Report can give you more information about that? It was included in the 0.15.7 release: https://github.com/PrefectHQ/prefect/pull/5032
h
Do I log its path in a final state handler?
a
Yes, specifically you could use a terminal state handler to save the report, e.g. to S3. An example is provided here: https://docs.prefect.io/orchestration/flow_config/executors.html#performance-reports
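For reference, here is a rough sketch of what such a terminal state handler might look like. It is not the example from the linked docs: it writes the report to a local directory instead of S3 so that it runs without Prefect, Dask, or AWS credentials. The `out_dir` parameter, the file naming, and the stand-in `fake_flow` object are hypothetical. It assumes, as the linked docs describe, that the report HTML is available as `flow.executor.performance_report` once the executor was created with `performance_report_path` set.

```python
import tempfile
from pathlib import Path
from types import SimpleNamespace


def save_performance_report(flow, state, reference_task_states, out_dir="reports"):
    """Terminal state handler sketch: write the Dask performance report to disk.

    The first three parameters follow the (flow, state, reference_task_states)
    shape of a Prefect terminal state handler. `out_dir` is a hypothetical
    extra parameter with a default, so the function still fits that shape.
    """
    executor = getattr(flow, "executor", None)
    report_html = getattr(executor, "performance_report", None)
    if not report_html:
        # Nothing was collected (for example, no report path was configured).
        return state

    target = Path(out_dir) / f"{flow.name}-performance-report.html"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(report_html, encoding="utf-8")
    # Log the path so it shows up next to the flow run output.
    print(f"Saved Dask performance report to {target}")
    return state  # leave the final flow state unchanged


# Stand-in objects so the sketch runs on its own; in a real flow, Prefect
# passes the actual Flow and State objects to the handler.
fake_flow = SimpleNamespace(
    name="example",
    executor=SimpleNamespace(performance_report="<html>report</html>"),
)
demo_dir = tempfile.mkdtemp()
final_state = save_performance_report(fake_flow, "Success", set(), out_dir=demo_dir)
```

In an actual flow this would presumably be passed as `terminal_state_handler=save_performance_report` when constructing the `Flow`, and the local write could be swapped for an S3 upload as in the linked docs example.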