Hi, under which circumstances could I get this err...
# ask-community
t
Hi, under which circumstances could I get this error:
State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
Result persistence is indeed disabled, but I do not understand what "state" would be retrieved from the API, or why. My flow returns the result of a run_deployment call (with an await).
n
hi @Ton Steijvers - would you be able to share your code? are you trying to access the result returned by the flow run triggered with run_deployment? it seems that would explain your getting that warning
t
Hi @Nate, yes I am using the results from run_deployment as follows:
import asyncio

from prefect import flow
from prefect.deployments import run_deployment

@flow
async def parallel_flow(org_ids):
    # Allow at most 8 deployment runs in flight at a time.
    concurrency_limit = asyncio.Semaphore(8)
    async def core_run(org_id: str):
        async with concurrency_limit:
            return await run_deployment(
                name="core_flow/us-east-k8s",
                parameters=dict(org_id=org_id)
            )

    results = await asyncio.gather(*[core_run(org_id) for org_id in org_ids])
    return [(result.name, result.state_name) for result in results]
The "State data is missing" message is not a warning but an error caused by an exception, and it makes the flow fail. How would run_deployment be related to the "State data is missing" error, the way it is used in the code above?
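For anyone who wants to try the concurrency pattern from the snippet above without a Prefect server, here is a minimal sketch using only the standard library. The fake_run_deployment coroutine and the dict it returns are hypothetical stand-ins for run_deployment and the FlowRun object, and the in_flight/peak counters are added only to show that the semaphore caps concurrency.

```python
import asyncio


async def fake_run_deployment(org_id: str) -> dict:
    # Hypothetical stand-in for run_deployment: no API call is made.
    await asyncio.sleep(0.01)
    return {"name": f"run-{org_id}", "state_name": "Completed"}


async def run_all(org_ids, limit: int = 8):
    semaphore = asyncio.Semaphore(limit)  # created inside the running event loop
    in_flight = 0
    peak = 0

    async def core_run(org_id: str):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_run_deployment(org_id)
            finally:
                in_flight -= 1

    # gather returns results in the same order as the inputs.
    results = await asyncio.gather(*(core_run(o) for o in org_ids))
    return [(r["name"], r["state_name"]) for r in results], peak


pairs, peak = asyncio.run(run_all([str(i) for i in range(20)], limit=8))
```

Only the name and state name of each stand-in run are read here, mirroring the return statement in the flow above; nothing in this sketch touches result data.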
n
hmm weird, i would expect that to work, since it seems you're not actually fetching the state data (where the actual result lives), but instead just the name and state_name of the FlowRun object returned by run_deployment - do you have the full trace here?
t
Don't have the full trace any more. We attempted a workaround, but that has other unwanted side-effects. I'll revert the workaround and try to get a full trace. Is a stack-trace up to the exception sufficient?
n
that would be helpful! whatever you can grab
j
hey, have we found anything on this?
n
what are you running into @Jons Cyriac?
j
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
quite often.
PREFECT_RESULTS_PERSIST_BY_DEFAULT: true
n
do you have a trace / example from when you get such an error?
j
I'm not sure how to get the trace. Where do I get it from? This comes up randomly on a flow that runs fine 99% of the time.
n
thanks! would you be able to share how you're attempting to access the results?
j
for this specific scenario, the task returns a list, and the flow then calls another task with that list passed into it
n
hmm:
• does it make a difference at all if you say persist_result=True on your flow? (it shouldnt, just sanity checking)
• where are you expecting these results to land? by default, it would be the prefect home directory where the flow is running
j
Will have to try whether persist_result=True on the flow itself makes a difference. Since this is intermittent, it'll take some time. These flows run on k8s infra, where one pod takes care of one flow run. The default works for us (home directory).
n
> These flows are run in k8s infra, one pod takes care of one flow-run. The default works for us (home directory)
ok just to be clear, that result storage will get wiped out after the pod dies, so the result will be inaccessible after that point (because it's written to disk on the pod). so if you wanted to actually persist that across flow runs, you'd want some blob result storage like s3 / gcs
> My flow returns the result of a run_deployment call (with an await).
like in OP's case, if you do
flow_run = run_deployment(..) # kick off job on k8s that persists locally (wrt itself)
flow_run.state.result() # State data is missing for the caller, bc the results died with the pod
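To make the point about pod-local storage concrete, here is a toy model that uses only the standard library and no Prefect APIs. The write_result and read_result helpers and both directories are hypothetical: one temporary directory plays the role of the pod's own disk, the other plays the role of external storage such as s3 / gcs.

```python
import json
import shutil
import tempfile
from pathlib import Path


def write_result(storage_dir: Path, key: str, value) -> Path:
    # Persist a JSON-serializable result under the given key.
    path = storage_dir / f"{key}.json"
    path.write_text(json.dumps(value))
    return path


def read_result(storage_dir: Path, key: str):
    # Raises FileNotFoundError if the storage (or the key) no longer exists.
    path = storage_dir / f"{key}.json"
    if not path.exists():
        raise FileNotFoundError(f"no result stored for {key!r}")
    return json.loads(path.read_text())


pod_disk = Path(tempfile.mkdtemp(prefix="pod-"))            # dies with the pod
external_store = Path(tempfile.mkdtemp(prefix="external-"))  # outlives the pod

write_result(pod_disk, "run-1", [1, 2, 3])
write_result(external_store, "run-1", [1, 2, 3])

shutil.rmtree(pod_disk)  # the pod exits and its disk is gone

try:
    read_result(pod_disk, "run-1")
    local_ok = True
except FileNotFoundError:
    local_ok = False

shared_value = read_result(external_store, "run-1")
shutil.rmtree(external_store)  # clean up the toy "external" store
```

The caller can still read the value from the external directory after the "pod" directory is deleted, which is the situation described above for a caller that asks for a result after the pod has gone away.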
j
ok, but we dont want to persist these over flow runs.. should I be turning off PREFECT_RESULTS_PERSIST_BY_DEFAULT?
n
sorry, i might be misunderstanding - what specifically is throwing your MissingResult error? you can't get the result of a flow run via run_deployment (via the API) without persisting it to external storage
j
umm, I'm not calling run_deployment, and I'm not trying to get the result of a flow run outside the run either. My guess is that a subsequent task calls the prefect server api to get the result of the previous task, and when that api call times out, it throws this error. this is just an intuition.
n
> umm, I'm not calling run_deployment
ah, I thought you were, because that's what OP was asking about in this thread when you said
> hey, have we found anything on this?
if you're getting
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
without trying to access results through the API, I'd be confused. can you try to isolate what line is throwing it in your code?
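One generic way to find the line that throws, and to grab the trace asked for earlier in the thread, is to wrap the suspect call and record the formatted traceback. This is a minimal standard-library sketch: the MissingResult class here is a local stand-in for prefect.exceptions.MissingResult, and load_previous_result, run_step, and capture_trace are hypothetical names used only for illustration.

```python
import traceback


class MissingResult(Exception):
    """Local stand-in for prefect.exceptions.MissingResult (not imported from Prefect)."""


def load_previous_result():
    # Hypothetical step that fails the way the thread describes.
    raise MissingResult("State data is missing.")


def run_step():
    return load_previous_result()


def capture_trace(fn):
    """Run fn; return (result, None) on success or (None, traceback text) on failure."""
    try:
        return fn(), None
    except Exception:
        # format_exc() includes the file, line number, and function of every frame.
        return None, traceback.format_exc()


result, trace = capture_trace(run_step)
if trace is not None:
    print(trace)
```

In a real flow, the same try/except could go around the call suspected of raising, with the traceback text sent to the run logger instead of print, so that an intermittent failure leaves the full stack behind.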