# ask-community
c
I'm trying to understand retry behavior, specifically which tasks retry when a flow fails. I'm debugging with the flow:
from prefect import get_run_logger, task, flow

@task
def child_a():
    logger = get_run_logger()
    logger.info("child A")
    raise ValueError("simulated")

@task
def child_b():
    logger = get_run_logger()
    logger.info("child B")

@task
def child_c():
    logger = get_run_logger()
    logger.info("child C")

@flow(
    name="test_flow_retry",
)
def hello_world():
    a = child_a.submit()
    child_b.submit(wait_for=[a])
    child_c.submit()
when I deploy, run the flow, and then retry it, I'm observing that all tasks are run again (after I click retry in the UI). that's expected, right? do I need to persist results for each of those tasks so that successful tasks (e.g. `child_c`) are not executed again when the flow is retried?
k
yep, results need to be persisted to skip execution for successful tasks upon retry
c
even when the task doesn't return anything (i.e. has no results)?
k
Python functions always return something, even if that something is an implicit `None`
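to illustrate, here's a minimal plain-Python sketch (no Prefect involved; this undecorated `child_c` is only a stand-in for the task above) showing that a function with no `return` statement still hands back `None`:

```python
def child_c():
    # No return statement: Python implicitly returns None.
    print("child C")


result = child_c()

# The call still produced a value, and that value is None.
print(result is None)
```

so a task that "returns nothing" still has a result as far as Python is concerned; whether that result gets persisted is a separate question.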
c
so then why doesn't this actually work for me? it sounds like the tasks should already be getting stored (since they return `None`)
k
hm, good point
let me try that too
yeah, result persistence still has to be turned on for this to happen
c
so are the docs wrong about `None`? do they persist with `True`/`False`?
k
If `persist_result` is set to `False`, these values will never be stored.
@task(persist_result=True)
def child_c():
    logger = get_run_logger()
    logger.info("child C")
maybe it's just unclear wording?
c
but `persist_result` defaults to `None` (not `False`)
this feels like an unusual default - I would expect that by default, when my flow has some failing tasks, that asking the flow to restart would just rerun the failed tasks (like in Prefect 1?)
k
yeah I think I agree, "stored by the API without persistence to storage" seems to imply that you shouldn't have to set `persist_result` to `True` for `None` and bool type results
I think I'll have to do some asking around to know for sure what's intended so I know whether this is something we should clarify in writing or that this is something that needs fixing
c
hmmm I'm wondering if, although tasks default to `persist_result=None`, there is the setting `PREFECT_RESULTS_PERSIST_BY_DEFAULT` (docs) which defaults to `False` - so perhaps that latter setting is ultimately dictating behavior?
k
from what I can see that setting only applies if `persist_result` is not set to `True` or `False` in the flow or task decorator
looking over how we decide to persist results, my takeaway is that we will only persist results, regardless of type, if you have persistence enabled or Prefect decides it should be enabled. `persist_result` never actually stays as `None` at runtime. After that, where the result is persisted to depends on its type.
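roughly, the decision described above could be sketched like this. this is NOT Prefect's actual code: `should_persist` is a hypothetical helper, its `persist_by_default` argument stands in for the `PREFECT_RESULTS_PERSIST_BY_DEFAULT` setting, and it leaves out any cases where Prefect turns persistence on by itself:

```python
from typing import Optional


def should_persist(
    persist_result: Optional[bool],
    persist_by_default: bool = False,
) -> bool:
    """Hypothetical illustration of the resolution order, not Prefect internals.

    persist_result:      the value passed to the @task / @flow decorator
                         (None when left unset).
    persist_by_default:  stand-in for PREFECT_RESULTS_PERSIST_BY_DEFAULT.
    """
    # An explicit True/False on the decorator wins.
    if persist_result is not None:
        return persist_result
    # Otherwise fall back to the global default, so None never survives.
    return persist_by_default


print(should_persist(None))                           # unset everywhere
print(should_persist(None, persist_by_default=True))  # setting turned on
print(should_persist(True))                           # explicit on the task
print(should_persist(False, persist_by_default=True)) # explicit opt-out
```

under these assumptions, a task left at `persist_result=None` with the setting at its default of `False` ends up not persisting anything, which would match what was observed with the original flow.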
c
I've set `PREFECT_RESULTS_PERSIST_BY_DEFAULT=True`, and the flow that I originally shared behaves as I expected: retrying the flow after an initial run only reruns the previously failed `child_a`
k
I think that's a solid approach to get the behavior you're looking for
c
agreed - I think in this case the docs could have better called out that the vanilla default is not to persist any results, regardless of type, unless either:
• `persist_result=True` is set on the tasks, OR
• the `PREFECT_RESULTS_PERSIST_BY_DEFAULT=True` setting is set
Thanks for all the help @Kevin Grismore
k
thanks for digging into it with me!
y
Very informative thread! I'm pinning this. Thanks 👍