# ask-community
liren zhang:
Hi experts, we are currently having a problem with retrying a failed flow from the point of failure. We created a master control flow that uses `StartFlowRun` to invoke individual flows in their defined order. All the flows run in their own Docker containers. When the master flow fails because one of those executions fails, restarting from the failure point was not working. I read in the docs that Docker runs do not yet support retaining results/inputs/outputs. I also read a thread here by @Kevin Kho that seemed to suggest `PrefectResult` may be a workaround if we don't care about the data we push to Prefect Cloud: https://prefect-community.slack.com/archives/CL09KU1K7/p1622645858327700. I am not exactly sure how `PrefectResult` can be used in our scenario. Please help!
Kevin Kho:
Hi @liren zhang, I actually think this behavior might be more seamless if you use the new KV Store. You can persist a value for later retrieval. How big is the data being passed around?

If you grab the value from the KV Store, retrying will be more seamless.
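A minimal sketch of that idea, assuming a Prefect 1.x Cloud backend (the key name and S3 path below are just placeholders):
```python
from prefect.backend import set_key_value, get_key_value

# Persist a small, JSON-serializable value -- e.g. the S3 path of your data --
# under a key of your choosing ("my_flow_output" is a placeholder).
set_key_value(key="my_flow_output", value="s3://my-bucket/data/2021-06-01.csv")

# Later -- in a downstream flow, or on a retry -- read it back:
location = get_key_value(key="my_flow_output")
```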
l
Hi @Kevin Kho KV store seems like a good solution. I want to ask about the a more fundamental question about task retry after the entire flow end up in failed state due to a task failure inside of the flow. If the failed task has a input from the previous task result, how does Prefect actually
retry
using the previous input value? How does this work behind the scene.
Kevin Kho:
Return values of tasks are persisted as `Results` if checkpointing is on, which it is for Prefect Cloud. When a task is retried, it retrieves the upstream results it needs. `PrefectResult` specifically stores the data in Prefect Cloud itself.
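For example, a small flow where every task checkpoints via `PrefectResult` might look like this (task names here are just illustrative):
```python
from prefect import task, Flow
from prefect.engine.results import PrefectResult

# With PrefectResult, the (JSON-serializable) return value is stored
# in Prefect Cloud itself, so a retry can re-hydrate it anywhere --
# including inside a Docker container.
@task(result=PrefectResult())
def get_number():
    return 42

@task(result=PrefectResult())
def add_one(x):
    return x + 1

with Flow("prefect-result-example") as flow:
    add_one(get_number())
```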
liren zhang:
So if I use `PrefectResult` to store the task inputs/outputs, we will be OK even if we run flows in Docker containers?
Kevin Kho:
I believe so. The only note here is that, by default, Prefect does not see any of your data; using `PrefectResult` means we do see it. It also might not work for large DataFrames, because we'd be storing them in our database.

If you're actually using DataFrames, something like S3 might serve you better.
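Something like this sketch, for instance, which assumes a bucket named `my-prefect-results` and uses `PandasSerializer` to write CSVs (the location template is optional):
```python
import pandas as pd
from prefect import task, Flow
from prefect.engine.results import S3Result
from prefect.engine.serializers import PandasSerializer

# Only the *location* of the DataFrame is tracked by Prefect Cloud;
# the data itself stays in your bucket.
df_result = S3Result(
    bucket="my-prefect-results",  # placeholder bucket name
    location="{flow_name}/{task_name}-{today}.csv",
    serializer=PandasSerializer(file_type="csv"),
)

@task(result=df_result)
def make_df():
    return pd.DataFrame({"a": [1, 2, 3]})

with Flow("s3-result-example") as flow:
    make_df()
```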
liren zhang:
So when I click the `retry` button in the Prefect Cloud UI, how exactly does Prefect bring back the previous inputs and continue execution of the past flow run?

If I define an S3 storage for the results, will Prefect know to reach out to the S3 persistent storage to retrieve the previous inputs? How does it actually work?
Kevin Kho:
`S3Result`* — Storage is where the Flow itself is stored. But yes, configure the task with the Result:
`@task(result=S3Result(...))`
and then when a re-run happens, it knows where to retrieve the output of upstream tasks as it continues the Flow.
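A fuller sketch of that (the bucket name is a placeholder): if `transform` fails, the retry — or a manual restart from the UI — re-hydrates `data` from the S3 location recorded for `extract`, rather than re-running `extract`:
```python
from datetime import timedelta
from prefect import task, Flow
from prefect.engine.results import S3Result

result = S3Result(bucket="my-prefect-results")  # placeholder bucket

@task(result=result)
def extract():
    return [1, 2, 3]

@task(result=result, max_retries=3, retry_delay=timedelta(minutes=1))
def transform(data):
    # On retry/restart, `data` is loaded from the persisted
    # S3 result of `extract` rather than recomputed.
    return [x * 2 for x in data]

with Flow("retry-example") as flow:
    transform(extract())
```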
liren zhang:
I will try this out. Thanks!
Kevin Kho:
Oh, I re-read your original question though. I have been talking in the context of a single flow. For the `StartFlowRun` task, the result is not the location of the data, unfortunately. This will be fixed later on, but for now it's better to manually persist and retrieve data for dependencies between sub-flow runs.

The KV Store might be a better solution, because you can store the location of something there to be retrieved by downstream flows. The `Results` of `StartFlowRun` tasks don't propagate to downstream flow runs.
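A rough sketch of that pattern — flow names, project name, key, and S3 path are all placeholders, and the first two tasks would live in their own flows rather than in this one script:
```python
from prefect import task, Flow
from prefect.backend import set_key_value, get_key_value
from prefect.tasks.prefect import StartFlowRun

# --- Inside the child flow: record where its output landed. ---
@task
def publish_location():
    # Placeholder path: wherever the child flow actually wrote its data.
    set_key_value(key="child_flow_output", value="s3://my-bucket/output.csv")

# --- Inside the downstream flow: read that location back. ---
@task
def fetch_location():
    return get_key_value(key="child_flow_output")

# --- The master flow only orders the runs; no data passes through it. ---
child = StartFlowRun(flow_name="child-flow", project_name="my-project", wait=True)
downstream = StartFlowRun(flow_name="downstream-flow", project_name="my-project", wait=True)

with Flow("master-flow") as master:
    first = child()
    downstream(upstream_tasks=[first])
```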
liren zhang:
I am not exactly clear on how a downstream flow retrieves the previous run's input when I manually click the retry button in the Prefect UI. Do you have some examples that demonstrate this?
Kevin Kho:
Yeah, I'll make one for you. This is without `StartFlowRun`, right?
liren zhang:
I was hoping for an example with `StartFlowRun`, as that is the use case I am dealing with.

But if you cannot right now, it will still help me to start figuring this out.

In the end, I think it is about learning how to store the results somewhere and being able to retrieve the previous run's inputs in a retry.