# ask-community
Noah Holm:
Is restarting a failed flow that essentially needs the whole flow to rerun not supported? I have a couple of tasks whose dependencies are managed by passing data with the functional API, plus some that just need to run in order via manually set `upstream_tasks=[some_task]`. When the last task fails, a restart only retries the failed task, not the tasks it depends on. I use S3 storage (and therefore S3 results), but I have added `checkpoint=False` to every task in the flow. When restarting in the Cloud UI I get a message saying that "restarting tasks may require more config, read docs", but I don't see how that addresses my use case. I would expect the upstream tasks to be rerun, since they didn't checkpoint any results.
Simplified example:
```python
from prefect import task, Flow


@task(checkpoint=False)
def get_var():
    return 123


@task(checkpoint=False)
def do_work():
    pass  # Do some work that doesn't need to return anything


@task(checkpoint=False)
def failing_task(var):
    pass  # Do stuff that needs var, but also needs do_work to have run first


with Flow("dbt daily") as flow:
    var = get_var()
    work = do_work()
    failing_task(var, upstream_tasks=[work])
```
I'm on ECS Fargate, so whatever do_work did (e.g., install some dependency) is gone from the fresh container when the failed task restarts. Yet do_work isn't rerun on that restart, even though checkpointing was disabled.
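For illustration only, here is a toy plain-Python model of the restart behavior described above (this is not Prefect's actual code): on restart, only tasks whose last state was Failed are re-run, and successful upstreams are skipped regardless of checkpointing, so their side effects are not recreated on a fresh container.

```python
# Toy model of the observed restart semantics (illustration, not Prefect code).
# On restart, only tasks whose last recorded state was "Failed" are re-run;
# "Success" tasks are skipped even if their results were never checkpointed.

def restart(states, run):
    """Re-run only the tasks that previously failed; return their names."""
    rerun = []
    for task_name, state in states.items():
        if state == "Failed":
            run(task_name)
            rerun.append(task_name)
    return rerun

states = {"get_var": "Success", "do_work": "Success", "failing_task": "Failed"}
executed = restart(states, run=lambda t: None)
# Only failing_task is re-run; do_work's side effect (the installed
# dependency) is missing in the fresh Fargate container.
```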
Kevin:
Hi @Noah Holm, I checked with the team and it looks like a whole-flow rerun is not supported; the recommendation is to create a new flow run with the same parameters. Would it help in this case if you installed the dependencies as part of your Docker container?
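If the dependency is something do_work installs at runtime, baking it into the flow's image means every container (including one started by a restart) already has it. A minimal sketch, assuming a Prefect base image; the installed package is a placeholder (the flow name suggests dbt, but that's a guess):

```dockerfile
# Hypothetical Dockerfile: extend the Prefect base image so the dependency
# that do_work used to install at runtime is present in every container.
FROM prefecthq/prefect:latest

# Placeholder for whatever do_work was installing.
RUN pip install dbt
```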
n
Thanks Kevin, that's what I figured. I might be able to rework it into the container, but I'm not sure the trade-off is worth it.