Noah Holm
06/21/2021, 7:02 AM
I use upstream_tasks=[some_task] for tasks that just need to run in order. When the last one fails, a restart only retries the failed task, not the ones it depends on. I use S3 storage and therefore S3 results, but I have added checkpoint=False to all my tasks in the flow. When restarting in the Cloud UI I get a message saying that "restarting tasks may require more config, read docs", but I don't see where the docs address my use case. I would expect the tasks that have dependents in the flow to be rerun, since there is no checkpoint of their results.
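(For context, the setup described above looks roughly like this in Prefect 1.x. A minimal sketch; the task name and bucket name are placeholders, not from the thread:)

from prefect import task
from prefect.engine.results import S3Result

# Results would normally be persisted to S3, but checkpoint=False
# disables persisting this task's return value entirely.
@task(checkpoint=False, result=S3Result(bucket="my-bucket"))
def some_task():
    return 123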
Noah Holm
06/21/2021, 7:09 AM
from prefect import task, Flow

@task(checkpoint=False)
def get_var():
    return 123

@task(checkpoint=False)
def do_work():
    pass  # do some work that doesn't need returning

@task(checkpoint=False)
def failing_task(var):
    pass  # do stuff that needs var, but do_work must also run first

with Flow("dbt daily") as flow:
    var = get_var()
    work = do_work()
    # checkpoint isn't a valid call-time kwarg; task_args is how task
    # attributes are overridden at call time (the decorator above
    # already sets checkpoint=False anyway)
    failing_task(var, upstream_tasks=[work], task_args=dict(checkpoint=False))
I’m on ECS Fargate, so whatever do_work did (e.g., install some dependency) is missing when the failed task is restarted, yet do_work isn’t rerun on that restart even though checkpoint was False.
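(One possible workaround, an assumption on my part rather than anything confirmed in this thread: fold the setup step into the failing task itself, so restarting that single task re-executes the setup too. install_dependency is a hypothetical helper:)

from prefect import task

def install_dependency():
    pass  # the environment setup that do_work used to handle (hypothetical)

@task(checkpoint=False)
def failing_task(var):
    install_dependency()  # reruns on every restart of this task
    pass  # then do the work that needs var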
Kevin Kho

Noah Holm
06/22/2021, 6:41 AM