# prefect-kubernetes
j
Hi all, working in a GKE k8s environment. I have this issue where my pod will go down and the main deployment will restart. The issue is that the deployment will attempt to retrieve past task results but is not able to, causing the common error:
21:16:34.430 | ERROR   | Flow run 'tricky-alligator' - Finished in state Failed('Flow run encountered an exception. MissingResult: The result was not persisted and is no longer available.\n')
My guess is that new pod or new start causes all persisted data to go away. Is there something I should be doing to address this? Thanks k8 gang
n
hey @John Horn - where are you persisting your results? if you're not using GCS for this
> My guess is that new pod or new start causes all persisted data to go away.
that might be something you wanna check out - that way they can live according to lifecycle rules or something
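For reference, a rough sketch of what that suggestion can look like in Prefect 2; the bucket, project, and credential names below are placeholders, not from this thread:

import os

from prefect import flow
from prefect.filesystems import GCS

# GCS filesystem block for result storage; bucket/project names are placeholders
gcs_results = GCS(
    bucket_path="my-results-bucket/prefect-results",
    project="my-gcp-project",
    service_account_info=os.environ["GCS_CREDS"],
)

@flow(persist_result=True, result_storage=gcs_results)
def my_flow():
    # the return value is written to GCS instead of the pod's local filesystem
    return "hello from GKE"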
j
I started persisting now to GCS to see if that will help
n
nice! that's what i would do too
j
One question I have is when persisting a sub-flow result, I noticed you can't specify the serializer
so I'm assuming the unique inputs make the fingerprint for the subflow persisted results
this gets tricky, say I have a subflow that calls task1() and task2()
I guess I should focus my persistence logic on task1 and task2 including how long to persist those tasks
rather than embed GCS persist logic in the subflow
since the results should only be valid for, let's say, 30 min
I don't think the subflow gives me that level of control
but the tasks do
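A minimal sketch of that task-level control, assuming Prefect 2's task caching options; the task name and the 30-minute window are just illustrative:

from datetime import timedelta

from prefect import task
from prefect.tasks import task_input_hash

# cache on the task's inputs, persist the result, and let it expire after 30 minutes
@task(
    persist_result=True,
    cache_key_fn=task_input_hash,
    cache_expiration=timedelta(minutes=30),
)
def task1(x):
    return x * 2

# task2 would get the same treatment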
but the error I was getting was it couldn't find the persisted result on the subflow
that was what was throwing me off
n
> I noticed you can't specify the serializer
you should be able to, like this
In [5]: from prefect import flow

In [6]: @flow(persist_result=True)
   ...: def child():
   ...:     return "something important"
   ...:

In [7]: @flow
   ...: def parent():
   ...:     child.with_options(result_serializer=...)()
   ...:
are you using `persist_result=True` on the subflow? or i guess you don't need `with_options` here unless you wanna use different serializers in different cases, you could just do it in the decorator of `child` right away as well
j
If I try something like:
@flow(
    persist_result=True,
    cache_expiration=timedelta(minutes=15),
    result_storage=GCS(
        bucket_path='yo-bucket',
        project='foobar-project',
        service_account_info=os.environ['gcs-creds']
    )
)
def foobar_subflow(
then I get back:
TypeError: flow() got an unexpected keyword argument 'cache_expiration'
and in the wild when running this deployment the pod occasionally dies with:
prefect.exceptions.MissingResult: The result was not persisted and is no longer available
and the stack trace leads up to the subflow:
blah_output = foobar_subflow(
That said, I haven't tried persisting the subflow result to GCS yet
I'd almost rather it not persist at all and just retry the subflow
since it is time sensitive and not expensive to run
but that sub-flow's parent flow should do persistence
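One way that split could look, as a sketch: the subflow retries instead of persisting, while the parent persists to remote storage. The "gcs-results" saved block name is made up for illustration:

from prefect import flow
from prefect.filesystems import GCS

# cheap, time-sensitive subflow: don't persist, just retry on failure
@flow(persist_result=False, retries=2, retry_delay_seconds=30)
def foobar_subflow():
    ...

# parent persists its result to remote storage ("gcs-results" is a hypothetical saved block name)
@flow(persist_result=True, result_storage=GCS.load("gcs-results"))
def parent_flow():
    return foobar_subflow()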
n
ahh, i see.
> That said, I haven't tried persisting the subflow result to GCS yet
i would try that, otherwise there's 2 workarounds I could think of (but I anticipate the above working)
• GCS lifecycle rules
• deploy the subflow, wrap `run_deployment` in a task with `cache_expiration`, and call that from your parent
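A rough sketch of that second workaround; the deployment name below is hypothetical:

from datetime import timedelta

from prefect import flow, task
from prefect.deployments import run_deployment
from prefect.tasks import task_input_hash

# kicks off the separately-deployed subflow; the flow run it returns is cached for 30 minutes
@task(persist_result=True, cache_key_fn=task_input_hash, cache_expiration=timedelta(minutes=30))
def run_foobar_subflow():
    return run_deployment(name="foobar-subflow/k8s")  # "flow-name/deployment-name", made up here

@flow
def parent_flow():
    run_foobar_subflow()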
j
if I set the subflow to persist_result=False then if the pod goes down it should just re-run that specific subflow right?
n
well if you're calling the subflow as a python function it doesn't get its own infra (pod); it runs on the same pod as the parent
if you run a "subflow" via `run_deployment`, then that flow run gets its own infra
and in that case, if the entrypoint flow has retries, then yes it would re-run that specific flow
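To make that distinction concrete, a small sketch; the deployment name is made up:

from prefect import flow
from prefect.deployments import run_deployment

@flow
def child():
    ...

# retries on the entrypoint flow re-run the whole thing, including in-process subflows, if the pod dies
@flow(retries=1)
def parent():
    child()                                 # runs in-process, on the same pod as parent
    run_deployment(name="child/k8s-child")  # separate flow run on its own infrastructure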
j
I guess in this example
from prefect import flow

# deployment entrypoint is parent_flow

@flow
def parent_flow():

    @flow(persist_result=False)
    def sub_flow_1():
        task_1()
        task_2()

    @flow(persist_result=False)
    def sub_flow_2():
        task_1()
        task_2()

    sub_flow_1()
    sub_flow_2()
if the pod goes down mid-run I would hope that sub_flow_1 starts from scratch but sub_flow_2 can try to recover the results
assuming I rigged sub_flow_2 to persist and use GCS
n
yes, I believe that is what would happen if you had `persist_result=True` on `sub_flow_2` and the parent had to retry for some reason. that would not be possible if you didn't have remote result storage, since with `persist_result=True` but "local" result storage, the result would die along with the pod
j
gah now I'm doubting this is true because I already had persist_result=False on sub_flow_1 and was getting that persist error
oh wells thanks I'll think on this some more
n
sure - feel free to bring back any roadblocks
šŸ‘ 1