Is there any documentation for Orion on what happens to flows that are late? Our execution environment running the agent crashed, and the Prefect response was not what we expected. Based on our testing, it looks like the first step is that flows are marked late. If an agent becomes available, a late flow will start. But at some point, after a few hours, the late flow is marked pending and will not restart even if an agent becomes available. Is there a way to fine-tune this behavior?
Kevin Kho
06/17/2022, 4:13 PM
Ah I think there are two parts to this:
1. A late flow will always be picked up by an agent when available. You can use Automations to cancel a flow if it’s late by a certain amount. You can also use the state handler of the Flow to exit. You can compare datetime.datetime.now() to prefect.context.scheduled_start_time to exit the Flow.
2. The second piece is the “marked as pending”, which happens when the flow is unable to start. Prefect tries to mark it as failed if it’s stuck like that for a while. Our Zombie Killer hasn’t been performant lately so we are working to stabilize that.
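For the first point, the time comparison could look roughly like the sketch below. It uses only the standard library; the helper name is_too_late and the one-hour MAX_LATENESS threshold are made up for illustration, and it assumes the scheduled start time is a timezone-aware datetime.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Hypothetical threshold: how late a run may start before we give up on it.
MAX_LATENESS = timedelta(hours=1)


def is_too_late(
    scheduled_start: datetime,
    now: Optional[datetime] = None,
    max_lateness: timedelta = MAX_LATENESS,
) -> bool:
    """Return True if the run is starting more than max_lateness after its schedule.

    Assumes scheduled_start is timezone-aware; now defaults to the current UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - scheduled_start > max_lateness


# Example with fixed times so the result does not depend on the clock.
scheduled = datetime(2022, 6, 17, 12, 0, tzinfo=timezone.utc)
two_hours_late = is_too_late(scheduled, now=scheduled + timedelta(hours=2))
half_hour_late = is_too_late(scheduled, now=scheduled + timedelta(minutes=30))
```

In a flow state handler, one would presumably pass prefect.context.scheduled_start_time as scheduled_start and return a failed or cancelled state when the helper returns True. That wiring is not shown here because it depends on the Prefect version in use.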