
Ben Muller

02/07/2023, 9:15 PM
Sorry, lots of questions because Prefect seems to be erroring much more than usual. I have an automation that cancels and then retries a flow that has been waiting for 30 mins. For the last few days I am being notified with an error of:
State message: Submission failed. IndexError: list index out of range
cc: @Will Raphaelson in case this is good info for you.

Will Raphaelson

02/07/2023, 9:31 PM
Thanks Ben - I think this is, as currently specified, “against the rules”. A cancelled state is terminal, so proposed state transitions from cancelled to another state are rejected. That said, I think we can improve the error message here for sure. I believe this can and should be accomplished without automations for individual flow runs (via flow retries), but automations could help here in terms of setting something like a global retry policy. We could consider surfacing a retry action type for this case. I’ll file a tracking issue internally. Going to tag in my orchestration expert colleagues who may have perspective as well, thanks.
Well, the time-based component does need to happen in automations, to retry after the specified run time. I’ll get this on our radar, we should have time to implement something like this in the next month or so.
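To illustrate the rule Will describes (a cancelled state is terminal, so any transition proposed out of it is rejected), here is a minimal sketch. It is a toy model, not Prefect's orchestration code: `StateType`, `TERMINAL`, and `propose_transition` are hypothetical names, and only CANCELLED is treated as terminal because that is the only case discussed in this thread.

```python
from enum import Enum


class StateType(Enum):
    # Hypothetical, simplified set of states for illustration only.
    SCHEDULED = "SCHEDULED"
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    CANCELLED = "CANCELLED"


# Assumption: only the terminal state mentioned in the thread is modelled.
TERMINAL = {StateType.CANCELLED}


def propose_transition(current, proposed):
    """Return (accepted, message) for a proposed state transition."""
    if current in TERMINAL:
        # A terminal state cannot be left, so the proposal is rejected
        # with an explicit reason instead of an opaque error.
        return False, f"rejected: {current.value} is terminal"
    return True, f"accepted: {current.value} -> {proposed.value}"


# Cancelling a pending run is allowed; re-scheduling a cancelled run is not.
print(propose_transition(StateType.PENDING, StateType.CANCELLED))
print(propose_transition(StateType.CANCELLED, StateType.SCHEDULED))
```

Returning an explicit rejection reason is the kind of clearer error message Will suggests would be an improvement over a bare IndexError.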

Ben Muller

02/07/2023, 9:45 PM
If you see my question two above, this can't be done with the decorator.

Will Raphaelson

02/07/2023, 9:57 PM
Ah yes, sorry, I see that now. Yeah, we agree this is a good use case for automations. I don't have a precise timeline yet but I want to get this done.

Ben Muller

02/07/2023, 10:08 PM
awesome thanks Will. Getting heaps of these recently and I believe they are Prefect issues, so would be great to have a way to handle it.

Will Raphaelson

02/07/2023, 10:24 PM
Yeah, in the interim, unfortunately, I don't think we have an elegant way to support this "retry after x time in y state" behavior. Would it work to have two actions fire - one to cancel the inferred flow run and one to kick off a new one? I know it's not ideal, but it might be a band-aid until we can introduce retry action types?

Ben Muller

02/07/2023, 10:53 PM
I think we aren't on the same page - why would this not work?

Will Raphaelson

02/07/2023, 11:08 PM
Ahh okay, I think I understand now. It is my expectation that the configuration in that screenshot would in fact work. And just to clarify, you’ve taken the retries argument off the flow, right? If so, I think I could use a GitHub issue to dig in more. Would you provide the URL, relevant stack traces, etc. in an issue in the Prefect repo? I’ll try to reproduce locally as well.

Ben Muller

02/07/2023, 11:11 PM
It is for all flows - so I can't confirm I have taken the decorator off, because it is likely that some flows will have it. The retry decorator will have zero impact here because the flows are not failing - all we are checking for is whether they are in pending or late. When the first automation action runs, it will cancel the flow run - the retry decorator only triggers for "FAILED" flows.
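Ben's point can be sketched as two independent checks: one for decorator-style retries, which by his description only apply to failed runs, and one for the automation trigger, which looks at runs stuck in pending or late. This is a toy model under those assumptions, not Prefect code; `FlowRun`, `decorator_retry_applies`, and `stuck_trigger_matches` are hypothetical names, and the 30-minute threshold comes from the automation described at the top of the thread.

```python
from dataclasses import dataclass


@dataclass
class FlowRun:
    # Hypothetical, simplified view of a flow run for illustration only.
    name: str
    state: str              # e.g. "PENDING", "LATE", "RUNNING", "FAILED"
    minutes_in_state: float
    retries_left: int = 0


def decorator_retry_applies(run):
    # Per the thread: retries configured on the flow only kick in on failure.
    return run.state == "FAILED" and run.retries_left > 0


def stuck_trigger_matches(run, threshold_minutes=30):
    # The automation instead fires for runs that never started.
    return run.state in {"PENDING", "LATE"} and run.minutes_in_state >= threshold_minutes


stuck = FlowRun("nightly-load", "PENDING", 45, retries_left=3)
print(decorator_retry_applies(stuck))  # False: the run has not failed
print(stuck_trigger_matches(stuck))    # True: pending for more than 30 minutes
```

A run stuck in pending matches the automation trigger but never reaches the decorator's retry path, which is why leaving the retries argument on some flows should not matter here.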
Wild guess - but do I need to do the "run a deployment" action before the "cancel a flow run" action? The error might be suggesting that if the flow is cancelled, the action trying to infer and run the deployment can't find the flow to run?
that might explain the IndexError
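For context on Ben's guess, the message "list index out of range" is what Python raises when code takes the first element of an empty list. The sketch below shows that general mechanism only. It is purely hypothetical and makes no claim about what Prefect's automation code actually does; the guess was not confirmed in this thread, and `first_matching_run` and `first_matching_run_safe` are invented names.

```python
def first_matching_run(runs, run_id):
    # Hypothetical lookup that skips cancelled runs.
    matches = [r for r in runs if r["id"] == run_id and r["state"] != "CANCELLED"]
    return matches[0]  # raises IndexError when nothing matched


def first_matching_run_safe(runs, run_id):
    # Same lookup, but an empty result is reported as None instead of raising.
    matches = [r for r in runs if r["id"] == run_id and r["state"] != "CANCELLED"]
    return matches[0] if matches else None


runs = [{"id": "abc", "state": "CANCELLED"}]

try:
    first_matching_run(runs, "abc")
except IndexError as exc:
    print(f"IndexError: {exc}")

print(first_matching_run_safe(runs, "abc"))
```

If something like this were happening, the order of the two automation actions could matter, which is the experiment Ben proposes.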

Will Raphaelson

02/07/2023, 11:18 PM
That would surprise me, but it's worth a try. I'm poking around now. Where exactly are you receiving that IndexError, on the event for the automation.failed, right?

Ben Muller

02/07/2023, 11:19 PM
I get it on the notification
so it would be flow_run.state.message
I have to run - sorry, won't be able to respond for a bit. Thanks for the help.

Will Raphaelson

02/07/2023, 11:20 PM
Okay, I thought the full automation was failing, including the kickoff of the second flow run.
yeah no worries, thanks for raising it and for the back and forth. Let me write up an issue and we can coordinate there.
One more thing, I'm writing up reproduction instructions for the ticket. Can you tell me the output of the CLI command:
prefect version