Ilya Galperin
01/06/2023, 6:53 PM
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
(full traceback in thread)
Ilya Galperin
01/06/2023, 6:54 PM
Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 637, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/prefect/flow.py", line 160, in entrypoint
    results.append(future.result())
  File "/usr/local/lib/python3.10/site-packages/prefect/futures.py", line 226, in result
    return sync(
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 267, in sync
    return run_async_from_worker_thread(__async_fn, *args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 177, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/usr/local/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/prefect/futures.py", line 237, in _result
    return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
  File "/usr/local/lib/python3.10/site-packages/prefect/states.py", line 100, in _get_state_result
    raise MissingResult(
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
Kalise Richmond
01/06/2023, 7:33 PM
Ilya Galperin
01/06/2023, 7:45 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:02 PM
Ilya Galperin
01/06/2023, 8:03 PM
Zanie
Ilya Galperin
01/06/2023, 8:06 PM
Ilya Galperin
01/06/2023, 8:06 PM
Ilya Galperin
01/06/2023, 8:07 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:08 PM
Ilya Galperin
01/06/2023, 8:08 PM
Zanie
Ilya Galperin
01/06/2023, 8:08 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:09 PM
bad5b6dd-8f0f-428c-a045-e95cf72da4cd
82b0ca57-1501-47a2-91c8-ed621f184e82
646d0c66-320a-4435-9fb2-e0f4d03bb547
012ae57c-9296-46d0-af83-16eafc1218e0
Zanie
Zanie
File "/opt/prefect/flow.py", line 160, in entrypoint
results.append(future.result())
Ilya Galperin
01/06/2023, 8:14 PM
futures = []
for e, x in enumerate(mylist):
    futures.append(
        sometask.with_options(name=task_name).submit(
            x=x,
            y=y,
            z=z,
        )
    )
results = []
for future in futures:
    results.append(future.result())
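The loop above stops at the first future whose result cannot be retrieved, which hides which task run was affected. A minimal sketch of collecting results while recording failures is below. It uses the standard library's `concurrent.futures` as a stand-in for Prefect futures (an assumption, so the example runs on its own); `collect_results` and `double_or_fail` are hypothetical names, not part of Prefect.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_results(futures):
    """Call .result() on each future, keeping failures instead of raising.

    Returns (results, errors): results holds the successful values in
    submission order; errors maps a future's index to its exception.
    """
    results, errors = [], {}
    for index, future in enumerate(futures):
        try:
            results.append(future.result())
        except Exception as exc:  # e.g. a missing-result error
            errors[index] = exc
    return results, errors


def double_or_fail(x):
    # Hypothetical task body: fails for one input to exercise the error path.
    if x == 2:
        raise RuntimeError(f"no result for {x}")
    return x * 2


# Stand-in for the submit loop: a thread pool instead of a Prefect task runner.
with ThreadPoolExecutor(max_workers=2) as pool:
    futures = [pool.submit(double_or_fail, x) for x in [1, 2, 3]]
    results, errors = collect_results(futures)

print(results)          # values from the futures that succeeded
print(sorted(errors))   # indexes of the futures that failed
```

Logging the failing index (or, with Prefect, the task run name) next to the exception may make it easier to match a missing result to a specific task run.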
Ilya Galperin
01/06/2023, 8:15 PM
Ilya Galperin
01/06/2023, 8:15 PM
Ilya Galperin
01/06/2023, 8:15 PM
Ilya Galperin
01/06/2023, 8:16 PM
Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 637, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/prefect/flow.py", line 228, in entrypoint
    temp_table_created = create_temp_table(
  File "/usr/local/lib/python3.10/site-packages/prefect/tasks.py", line 436, in __call__
    return enter_task_run_engine(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 927, in enter_task_run_engine
    return run_async_from_worker_thread(begin_run)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 177, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/usr/local/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1068, in get_task_call_return_value
    return await future._result()
  File "/usr/local/lib/python3.10/site-packages/prefect/futures.py", line 237, in _result
    return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
  File "/usr/local/lib/python3.10/site-packages/prefect/states.py", line 100, in _get_state_result
    raise MissingResult(
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
Ilya Galperin
01/06/2023, 8:18 PM
Encountered exception during execution:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 637, in orchestrate_flow_run
    result = await run_sync(flow_call)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/opt/prefect/flow.py", line 279, in entrypoint
    cols, _ = get_table_schema(
  File "/usr/local/lib/python3.10/site-packages/prefect/tasks.py", line 436, in __call__
    return enter_task_run_engine(
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 927, in enter_task_run_engine
    return run_async_from_worker_thread(begin_run)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 177, in run_async_from_worker_thread
    return anyio.from_thread.run(call)
  File "/usr/local/lib/python3.10/site-packages/anyio/from_thread.py", line 49, in run
    return asynclib.run_async_from_thread(func, *args)
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 970, in run_async_from_thread
    return f.result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 458, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1068, in get_task_call_return_value
    return await future._result()
  File "/usr/local/lib/python3.10/site-packages/prefect/futures.py", line 237, in _result
    return await final_state.result(raise_on_failure=raise_on_failure, fetch=True)
  File "/usr/local/lib/python3.10/site-packages/prefect/states.py", line 100, in _get_state_result
    raise MissingResult(
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
Zanie
Ilya Galperin
01/06/2023, 8:20 PM
Zanie
cache_result_in_memory=False
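For context on the option named above, here is a minimal sketch of setting result options on a task. It assumes a Prefect 2 release in which `persist_result` and `cache_result_in_memory` are accepted as task keyword arguments (the thread mentions 2.7.x); the task and flow names are hypothetical. As I understand it, with in-memory caching turned off the value has to be read back from result storage, so persistence needs to be on. Whether this relates to the error in this thread is not established.

```python
from prefect import flow, task


# Hypothetical task: keep the result in storage rather than in memory.
@task(persist_result=True, cache_result_in_memory=False)
def load_rows(n: int) -> list:
    return list(range(n))


@flow
def example_flow() -> int:
    rows = load_rows.submit(3)
    # With in-memory caching off, .result() is expected to read from storage.
    return len(rows.result())


if __name__ == "__main__":
    print(example_flow())
```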
Ilya Galperin
01/06/2023, 8:20 PM
Zanie
Ilya Galperin
01/06/2023, 8:20 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:22 PM
Ilya Galperin
01/06/2023, 8:23 PM
Ilya Galperin
01/06/2023, 8:24 PM
Zanie
Ilya Galperin
01/06/2023, 8:25 PM
Ilya Galperin
01/06/2023, 8:25 PM
Zanie
.submit
the default is concurrent.
Zanie
Ilya Galperin
01/06/2023, 8:26 PM
Ilya Galperin
01/06/2023, 8:26 PM
Ilya Galperin
01/06/2023, 8:26 PM
Zanie
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:30 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:35 PM
Ilya Galperin
01/06/2023, 8:35 PM
Zanie
Zanie
Zanie
9bf44e45-bcc6-413b-a67e-0ee8a1b06c77 at 02:06:30.633 EST
0424b8f8-06ff-4f78-808d-13543167632f at 02:06:00.680 EST
3ae5277e-61f1-4a10-9ff3-cf89af740aa7 at 02:05:45.138 EST
Zanie
Ilya Galperin
01/06/2023, 8:43 PM
Ilya Galperin
01/06/2023, 8:46 PM
Ilya Galperin
01/06/2023, 8:46 PM
Ilya Galperin
01/06/2023, 8:46 PM
Ilya Galperin
01/06/2023, 8:46 PM
Ilya Galperin
01/06/2023, 8:47 PM
Ilya Galperin
01/06/2023, 8:47 PM
Zanie
Zanie
Zanie
Ilya Galperin
01/06/2023, 8:50 PM
Ilya Galperin
01/06/2023, 8:53 PM
3ae5277e-61f1-4a10-9ff3-cf89af740aa7
only uses values that are passed in as arguments to the flow function.
Zanie
Ilya Galperin
01/06/2023, 8:54 PM
Ilya Galperin
01/06/2023, 8:55 PM
3ae5277e-61f1-4a10-9ff3-cf89af740aa7
Task run '3ae5277e-61f1-4a10-9ff3-cf89af740aa7' already finished.
9bf44e45-bcc6-413b-a67e-0ee8a1b06c77
Task run '9bf44e45-bcc6-413b-a67e-0ee8a1b06c77' already finished.
9bf44e45-bcc6-413b-a67e-0ee8a1b06c77
Task run '9bf44e45-bcc6-413b-a67e-0ee8a1b06c77' already finished.
Ilya Galperin
01/06/2023, 8:55 PM
Zanie
Ilya Galperin
01/06/2023, 8:56 PM
Zanie
Ilya Galperin
01/06/2023, 8:57 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 9:05 PM
Zanie
Ilya Galperin
01/06/2023, 9:06 PM
Zanie
return_state=True
on the task call and just like.. call it again if you happen to get a pending state back, but that’s pretty hacky.
Ilya Galperin
01/06/2023, 9:08 PM
Ilya Galperin
01/06/2023, 9:08 PM
Zanie
Ilya Galperin
01/06/2023, 9:10 PM
Zanie
Ilya Galperin
01/06/2023, 9:10 PM
Ilya Galperin
01/06/2023, 9:11 PM
Zanie
PREFECT_LOGGING_LEVEL=DEBUG
- we only capture what you set the level to
Ilya Galperin
01/06/2023, 9:11 PM
Zanie
Ilya Galperin
01/06/2023, 9:11 PM
Ilya Galperin
01/06/2023, 9:12 PM
Ilya Galperin
01/06/2023, 9:12 PM
Ilya Galperin
01/06/2023, 9:12 PM
Ilya Galperin
01/06/2023, 9:14 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 9:15 PM
Zanie
Zanie
Ilya Galperin
01/06/2023, 9:17 PM
Ilya Galperin
01/06/2023, 11:11 PM
Zanie
Ilya Galperin
01/06/2023, 11:34 PM
Zanie
EXTRA_PIP_PACKAGES
variable if you’re using our image entrypoint e.g.
docker run --env EXTRA_PIP_PACKAGES="git+https://github.com/PrefectHQ/prefect@task-run-abort-log" prefecthq/prefect:2-latest prefect version
Ilya Galperin
01/10/2023, 10:34 PM
Task run '8775d5ff-1614-4df3-83bd-dc3b3fc4bb0f' received abort during orchestration: The enclosing flow must be running to begin task execution.. Task run is in PENDING state.
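The workaround mentioned earlier in the thread (pass `return_state=True` on the task call and call it again if a pending state comes back) could be sketched roughly as below. `call_until_not_pending` and `FakeState` are hypothetical names used only for illustration; the helper assumes nothing beyond the returned object exposing `is_pending()`, as Prefect 2 states do. As noted in the thread, this is a hack rather than a fix.

```python
def call_until_not_pending(call, max_attempts=3):
    """Invoke call() until the returned state is no longer pending.

    `call` is any zero-argument callable returning an object with an
    is_pending() method. With Prefect this might be
    `lambda: sometask(x, return_state=True)` (assumed usage, untested here).
    """
    state = None
    for _ in range(max_attempts):
        state = call()
        if not state.is_pending():
            break
    return state  # may still be pending if every attempt was aborted


class FakeState:
    """Stand-in for a task run state, used only to exercise the helper."""

    def __init__(self, name):
        self.name = name

    def is_pending(self):
        return self.name == "PENDING"


# Simulate a task call that is aborted once, then completes.
outcomes = iter([FakeState("PENDING"), FakeState("COMPLETED")])
final_state = call_until_not_pending(lambda: next(outcomes))
print(final_state.name)
```

Capping the number of attempts keeps the flow from looping forever if the abort keeps recurring; the caller still has to check whether the returned state is pending.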
Zanie
Zanie
Zanie
Zanie
Alix Cook
01/10/2023, 11:18 PM
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
we're on prefect 2.7.1. we use prefect cloud
we don't have any retries set up, and we use whatever the default persistence is (the docs say that it should be enabled).
The first time we saw this error was 2022-12-15 16:39:48 UTC
Alix Cook
01/10/2023, 11:19 PM
Alix Cook
01/10/2023, 11:22 PM
Andrew Huang
01/10/2023, 11:24 PM
import dask.dataframe
import dask.distributed
from prefect import flow, task
from prefect_dask import DaskTaskRunner, get_dask_client

client = dask.distributed.Client()

@task
def read_data(start: str, end: str) -> dask.dataframe.DataFrame:
    df = dask.datasets.timeseries(start, end, partition_freq="4w")
    return df

@task
def process_data(df) -> dask.dataframe.DataFrame:
    df_yearly_avg = df.groupby(df.index.year).mean()
    return df_yearly_avg.compute()

@flow(task_runner=DaskTaskRunner(address=client.scheduler.address))
def dask_flow():
    df = read_data.submit("1988", "2022")
    df_yearly_average = process_data.submit(df)
    return df_yearly_average

dask_flow()

in this case, it’s because the memory maxed out on a single worker
wrapping with get_dask_client in my case allows it to run successfully

@task
def process_data(df) -> dask.dataframe.DataFrame:
    with get_dask_client():
        df_yearly_avg = df.groupby(df.index.year).mean()
        return df_yearly_avg.compute()
Zanie
Alix Cook
01/10/2023, 11:27 PM
Ilya Galperin
01/10/2023, 11:28 PM
Flow run infrastructure exited with non-zero status code -1.
Since this is almost always crashing at the 4 hour mark, it makes me suspicious that there is some timeout we’re missing. However, our other very long-running flows are not timing out at any point. I should also note that the other flow where we saw this intermittent behavior crashed at an arbitrary time after 2 hours, so maybe not.
Andrew Huang
01/10/2023, 11:28 PM
Zanie
Zanie
Ilya Galperin
01/10/2023, 11:31 PM
Zanie
Zanie
Alix Cook
01/10/2023, 11:40 PM
Zanie
Ilya Galperin
01/11/2023, 12:34 AM
Alix Cook
01/11/2023, 4:24 PM
Ilya Galperin
01/11/2023, 6:30 PM
Zanie
Ilya Galperin
01/11/2023, 7:28 PM