    Martim Lobao

    11 months ago
    If a task in a flow failed but the flow hasn't failed yet (because there are other branched tasks in the flow run that are still running), how can I restart the failed task without having to wait for the other tasks to finish?
    Anna Geller

    11 months ago
    @Martim Lobao do you have automatic retries set on the failing task? The best way of rerunning a single failed task is via automatic retry.
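    from datetime import timedelta
    from prefect import task
    # Retry up to 5 times, waiting 1 minute between attempts
    @task(max_retries=5, retry_delay=timedelta(minutes=1))
    def my_task():  # placeholder task name
        ...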
    When it comes to “restart”, it only works if you have results configured for this task, because in order to restart it, Prefect needs to know all the inputs the task requires. A restart triggers a flow run starting from a specific failed task. Since you restart from the TaskRun page in the UI, I believe you can restart even if some tasks in this FlowRun are not finished yet. However, you cannot restart just one specific task: you restart the FlowRun from the failed TaskRun, including everything downstream of it.
    Martim Lobao

    11 months ago
    we don't, and in this case the fix required a manual intervention (though no changes to the codebase)
    so clicking the restart button even while the flow is still running should be fine?
    it'll only restart the failed task and ignore ones that are still running?
    @Anna Geller (just making sure you got the ping)
    Anna Geller

    11 months ago
    I think you could try rerunning it even if the “old” flow run is in progress, because you start a FlowRun from this failed task run. This means you go to the TaskRun page of the failed task and click Restart from there. This way, it should rerun this failed task and all tasks downstream. But note that the downstream tasks that are still running will potentially be executed twice: once in the old run where the task in question failed, and once in the restarted flow run. There is more about this here in the docs https://docs.prefect.io/orchestration/ui/flow-run.html#restarting
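To make the behaviour described in this message concrete, here is a toy model in plain Python. It is not Prefect code and does not reflect how Prefect implements restarts; the task names, the states, and the `restart_set` helper are all hypothetical. It only illustrates the idea that a restart covers the failed task plus everything downstream of it, and that any of those tasks still unfinished in the old run might end up executing twice.

```python
from collections import deque


def restart_set(downstream, failed_task):
    """Return the failed task plus every task reachable downstream of it."""
    seen = {failed_task}
    queue = deque([failed_task])
    while queue:
        current = queue.popleft()
        for child in downstream.get(current, ()):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical flow: `extract` feeds two branches that join in `report`.
downstream = {
    "extract": ["clean", "enrich"],
    "clean": ["report"],
    "enrich": ["report"],
    "report": [],
}

# Hypothetical task states in the old run at the moment Restart is clicked.
old_run_states = {
    "extract": "Success",
    "clean": "Failed",
    "enrich": "Running",
    "report": "Pending",
}

rerun = restart_set(downstream, "clean")

# Tasks in the restart set that the old run has not finished yet:
# per the thread, these are the ones that might run twice.
possibly_twice = sorted(
    name for name in rerun if old_run_states[name] in {"Running", "Pending"}
)

print(sorted(rerun))    # ['clean', 'report']
print(possibly_twice)   # ['report']
```

In this sketch the `enrich` branch is left alone because it is not downstream of the failed task, while `report` sits in both the old run and the restart.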
    Martim Lobao

    11 months ago
    hmm, I'll try restarting then and see what happens for the tasks that are still pending
    thanks for your prompt response!
    Anna Geller

    11 months ago
    but be warned, I think the downstream tasks that are still in progress will be executed twice 🙂 because the new run that results from a Restart is an independent new FlowRun that simply starts from a later task than usual
    Martim Lobao

    11 months ago
    yeah, that's not ideal. in this particular case, i don't think it's problematic if the tasks run twice, but it would be preferable if Prefect allowed some way to restart a single failed task (and its dependent tasks) without having to wait for the entire flow to finish or executing pending tasks twice
    Anna Geller

    11 months ago
    thanks for your feedback @Martim Lobao. That would be super convenient, for sure. I think it’s hard to accomplish because tasks are not independent constructs, they only run as part of a Flow. Rerunning only one single task could make it hard to govern the dependencies, and would be very difficult to accomplish given the hybrid execution model, where the Flow may run in so many different environments and the tasks can be distributed to various Dask workers. But I definitely understand what you mean.
    If rerunning only one task is super critical for you, one solution that you could try is running a flow of flows. Prefect has a task called `create_flow_run` that allows you to run a Flow as if it were a task. This way, you could rerun each “task” (because it’s an independent flow) at any time, by triggering the independent child flow that failed.
    Martim Lobao

    11 months ago
    thanks, we actually have a flow of flows set up elsewhere, but I've found it's not that well supported. for example, one issue we've had is that clicking the restart button in a flow of flows doesn’t restart a failed flow, we have to go into each flow and restart it independently.
    Kevin Kho

    11 months ago
    Hey @Martim Lobao, I think what you can do is go to the UI and change the state of that `Failed` task to `Scheduled`. This will mark that the task run should be run and I think it will be picked up again. Not super sure. You’re trying to restart one task from the UI right? and not touch the code?
    Martim Lobao

    11 months ago
    @Kevin Kho, you’re right, i just noticed that option on my laptop browser – it wasn’t visible on my phone. Side note: can I put in a feature request for a mobile app/improved mobile site? 😛
    Kevin Kho

    11 months ago
    Prefect on the phone!? Wow. But yes Orion will be mobile friendly
    Martim Lobao

    11 months ago
    sometimes you gotta restart a flow while on the go 😂