
Ton Steijvers

06/28/2023, 12:14 PM
Hi, under which circumstances could I get this error: `State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.` Result persistence is indeed disabled, but I do not understand what "state" would be retrieved from the API, or why. My flow returns the result of a `run_deployment` call (with an await).

Nate

06/28/2023, 5:06 PM
hi @Ton Steijvers - would you be able to share your code? are you trying to access the result returned by the flow run triggered with `run_deployment`? it seems that would explain your getting that warning

Ton Steijvers

06/29/2023, 6:34 AM
Hi @Nate, yes I am using the results from `run_deployment` as follows:
```python
import asyncio

from prefect import flow
from prefect.deployments import run_deployment


@flow
async def parallel_flow(org_ids):
    # Allow at most 8 deployment runs in flight at once
    concurrency_limit = asyncio.Semaphore(8)

    async def core_run(org_id: str):
        async with concurrency_limit:
            return await run_deployment(
                name="core_flow/us-east-k8s",
                parameters=dict(org_id=org_id),
            )

    results = await asyncio.gather(*[core_run(org_id) for org_id in org_ids])
    return [(result.name, result.state_name) for result in results]
```
The "State data is missing" message is not a warning but an error raised as an exception, and it makes the flow fail. How would `run_deployment`, the way it is used in the code above, be related to the "State data is missing" error?
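The concurrency pattern in the flow above (a semaphore shared by coroutines that are then gathered) can be checked in isolation, without Prefect. This is a minimal sketch: `fake_run_deployment` and `run_bounded` are hypothetical stand-ins introduced here, and no real deployment is triggered.

```python
import asyncio


async def fake_run_deployment(org_id: str) -> dict:
    # Hypothetical stand-in for `await run_deployment(...)`; makes no Prefect calls.
    await asyncio.sleep(0.01)
    return {"org_id": org_id, "state_name": "Completed"}


async def run_bounded(org_ids, limit=8):
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0  # coroutines currently inside the semaphore
    peak = 0       # highest value in_flight ever reached

    async def one(org_id):
        nonlocal in_flight, peak
        async with semaphore:
            in_flight += 1
            peak = max(peak, in_flight)
            try:
                return await fake_run_deployment(org_id)
            finally:
                in_flight -= 1

    # gather returns results in the order the awaitables were passed in
    results = await asyncio.gather(*(one(o) for o in org_ids))
    return results, peak


results, peak = asyncio.run(run_bounded([f"org-{i}" for i in range(20)], limit=8))
```

Here `peak` never exceeds `limit`, and `results` keeps the order of `org_ids`, which suggests the semaphore and `gather` usage is not itself the source of the error.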

Nate

06/30/2023, 3:25 PM
hmm weird, i would expect that to work, since it seems you're not actually fetching the `state` data (where the actual `result` lives), but instead just the name and state_name of the `FlowRun` object returned by `run_deployment` - do you have the full trace here?
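The distinction drawn here, between run metadata and the state's result data, can be pictured with a toy model. This is only a sketch under assumptions: `SketchState`, `SketchFlowRun`, and `SketchMissingResult` are hypothetical classes invented for illustration and are not Prefect's implementation.

```python
from dataclasses import dataclass
from typing import Any, Optional


class SketchMissingResult(Exception):
    """Hypothetical stand-in for prefect.exceptions.MissingResult."""


@dataclass
class SketchState:
    name: str
    # None models a state that came back from the API without result data
    data: Optional[Any] = None

    def result(self) -> Any:
        if self.data is None:
            raise SketchMissingResult("State data is missing.")
        return self.data


@dataclass
class SketchFlowRun:
    name: str
    state: SketchState

    @property
    def state_name(self) -> str:
        return self.state.name


run = SketchFlowRun(name="example-run", state=SketchState(name="Completed"))

# Reading metadata never touches state.data, so it works in this model
summary = (run.name, run.state_name)

# Asking for the result is what fails when the data is absent
try:
    run.state.result()
    raised = False
except SketchMissingResult:
    raised = True
```

In this model only the `result()` call can raise, which is why a full trace is needed to see where the real flow touches result data.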

Ton Steijvers

06/30/2023, 3:47 PM
Don't have the full trace any more. We attempted a workaround, but that has other unwanted side-effects. I'll revert the workaround and try to get a full trace. Is a stack-trace up to the exception sufficient?

Nate

06/30/2023, 10:27 PM
that would be helpful! whatever you can grab

Jons Cyriac

08/17/2023, 10:51 AM
hey, have we found anything on this?

Nate

08/17/2023, 3:48 PM
what are you running into @Jons Cyriac?

Jons Cyriac

08/25/2023, 3:21 PM
```
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
```
quite often.
```
PREFECT_RESULTS_PERSIST_BY_DEFAULT: true
```

Nate

08/25/2023, 3:25 PM
do you have a trace / example from when you get such an error?

Jons Cyriac

08/25/2023, 3:33 PM
I'm not sure how to get the trace. Where do I get it from? This comes up randomly on a flow that runs fine 99% of the time.

Nate

08/25/2023, 3:35 PM
thanks! would you be able to share how you're attempting to access the results?

Jons Cyriac

08/25/2023, 3:46 PM
for this specific scenario, the task returns a list, the flow calls another task with this list passed into it

Nate

08/25/2023, 4:36 PM
hmm:
• does it make a difference at all if you say `persist_result=True` on your flow? (it shouldn't, just sanity checking)
• where are you expecting these results to land? by default, it would be the prefect home directory where the flow is running

Jons Cyriac

08/25/2023, 5:01 PM
Will have to try whether `persist_result=True` on the flow itself makes a difference. Since this is intermittent, it'll take some time. These flows run on k8s infra, where one pod takes care of one flow run. The default works for us (home directory).

Nate

08/25/2023, 5:02 PM
> These flows are run in k8s infra, one pod takes care of one flow-run. The default works for us (home directory)

ok just to be clear, that result storage will get wiped out after the pod dies, so the result will be inaccessible after that point (bc it's written to disk on the pod). so if you wanted to actually persist that over flow runs, you'd want some blob result storage like s3 / gcs

> My flow returns the result of a run_deployment call (with an await).

like in OP's case, if you do
```python
flow_run = run_deployment(..) # kick off job on k8s that persists locally (wrt itself)
flow_run.state.result() # State data is missing for the caller, bc the results died with the pod
```
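The storage-lifetime point can be illustrated without Kubernetes or Prefect. In this rough sketch, a temporary directory plays the role of the pod's local disk; `run_in_pod` and `read_result` are hypothetical helpers, and the exception raised here (`FileNotFoundError`) is only what this model produces, not necessarily what Prefect raises.

```python
import json
import shutil
import tempfile
from pathlib import Path


def run_in_pod(value):
    """Model a flow run that writes its result to pod-local disk."""
    pod_disk = Path(tempfile.mkdtemp(prefix="sketch-pod-"))
    result_path = pod_disk / "result.json"
    result_path.write_text(json.dumps(value))
    # Only the path (a reference) travels back to the caller, not the value
    return pod_disk, result_path


def read_result(result_path: Path):
    """Model the caller resolving the reference at a later time."""
    return json.loads(result_path.read_text())


pod_disk, ref = run_in_pod([1, 2, 3])
before = read_result(ref)  # readable while the pod's disk still exists

shutil.rmtree(pod_disk)    # the pod exits and its local disk goes away

try:
    read_result(ref)
    readable_after = True
except FileNotFoundError:
    readable_after = False
```

The reference outlives the data it points to, which is the situation that shared blob storage such as s3 / gcs is meant to avoid.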

Jons Cyriac

08/25/2023, 6:57 PM
ok, but we don't want to persist these over flow runs. should I be turning off `PREFECT_RESULTS_PERSIST_BY_DEFAULT`?

Nate

08/25/2023, 10:11 PM
sorry, i might be misunderstanding - what specifically is throwing your `MissingResult` error? you can't get the result of a flow run via `run_deployment` (via the API) without persisting it to external storage

Jons Cyriac

08/26/2023, 11:03 AM
umm, I'm not calling `run_deployment`, and I'm not trying to get the result of a flow run outside the run either. I think that when a subsequent task calls the Prefect server API to get the result of the previous task and that API call times out, it throws this error. This is just an intuition.

Nate

08/28/2023, 2:08 PM
> umm, I'm not calling run_deployment

ah, I thought you were because that's what OP was asking about in this thread when you said

> hey, have we found anything on this?

if you're getting
```
prefect.exceptions.MissingResult: State data is missing. Typically, this occurs when result persistence is disabled and the state has been retrieved from the API.
```
without trying to access results through the API, I'd be confused. can you try to isolate what line is throwing it in your code?
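One generic Python way to find the line that throws is to catch the exception around the suspect call and record the full traceback. This is a minimal sketch: `pass_list_downstream` is a hypothetical stand-in for the failing task call, and the `RuntimeError` merely imitates the message from this thread.

```python
import traceback


def pass_list_downstream(items):
    # Hypothetical stand-in for the call that intermittently fails
    raise RuntimeError("State data is missing.")


def run_and_capture():
    """Return the formatted traceback if the call fails, else None."""
    try:
        pass_list_downstream([1, 2, 3])
    except Exception:
        # format_exc() includes the file, line number, and function
        # of every frame between here and the raise
        return traceback.format_exc()
    return None


trace = run_and_capture()
print(trace)
```

Logging the returned string before re-raising would keep an intermittent failure's trace available after the run ends.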