Justin Trautmann
02/16/2023, 9:46 AMPreventRedundantTransitions
orchestration policy prevents a repeated task execution as the task remains in status Running
when the instance is terminated and a new replacement instance will try to start the task from scratch, including proposing the state Running
again which leads to:
Task run <task_run_id> received abort during orchestration: This run cannot transition to the RUNNING state from the RUNNING state. Task run is in RUNNING state
I feel like this policy has a legitimate reason of preventing multiple agents from attempting to orchestrate the same run but also catches the case where the same agent tries to execute the same task multiple times which should be allowed imo.
Anyone having experience with Prefect + Ray on AWS spot and ideas on how to handle this case?
PS: @Tim Galvin not sure about Dask and SLURM but maybe your issue is somewhat relatedYaron Levi
02/16/2023, 11:55 AM