# ask-community
s
Hi! Not sure what the best/right channel for support requests is, so I'll try here.
• Problem: A flow that previously worked suddenly stopped working yesterday
• Setup: Orchestration with Prefect Cloud, agent running on an AWS EC2 instance (agent process hosted with forever)
• Done so far:
  ◦ Restarted the agent; the last agent log line is "[2021-03-02 11:37:18,922] INFO - agent | Waiting for flow runs..."
  ◦ Restarted the flow run in Prefect Cloud, but no logs are being generated (picture attached)
This is production stuff and related to the month change process, so any pointers are highly appreciated...
EDIT: After the agent restart, the agent status was green for about 15 minutes, but it has now turned red.
j
Hi @Sami Niemelä - I’ve got a few more questions to see if we can figure out what's going on. When you say “flow has stopped working” can you give a bit more information - is it ever switching to a running state? Is it failing or is it stuck in a scheduled state? Are other flows working? How are you running the flow? Is it on a schedule? Or are you starting it from the UI? Are there any label warning errors? (A red flag that would come up by upcoming runs or in the flow run details tile)
s
• No, the flow is never switching to running state, it stays on Scheduled.
• No, both flows are facing the same issue.
• Scheduled runs and UI started runs are both facing the same issue.
• A flow that was scheduled to run yesterday has failed with an error today. This might have something to do with my reboots but the error message refers to SLA level. Could this be the cause?
Ok, I suppose I found the reason. One run was stuck and agent restart did not cancel the run completely, so the ghost flow is reserving our run slot.
How can I enforce the cancellation, so that we can get the runs started again?
Ping @Jenny
j
Ah ha! You can use the cancel button on the flow run page.
s
Tried it. The flow stays in "Cancelling" state, which still reserves the slot.
j
It can take a while for a flow run to switch to a fully cancelled state, as we wait for any running tasks to finish. How long has it been in the cancelling state?
s
"Set State" => Cancelled seems to have solved the issue. Let's see if the queued tasks get scheduled properly now.
j
Has that cleared the running flow from your runs in progress tile?
s
Yes, thanks! Now I can focus on the issue with our own logic.
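For anyone landing on this thread later, here is a rough sketch of why the ghost run blocked everything and why forcing the state helped. It is a toy model only, not Prefect's actual implementation: the `RunPool` class, the `SLOT_HOLDING_STATES` set, and the run ids are hypothetical names made up for illustration, and the assumption that a "Cancelling" run still holds a slot is taken from what was observed in the thread.

```python
from dataclasses import dataclass, field

# Assumption (from the thread): a run in "Cancelling" still reserves its slot,
# just like a "Running" one. Only a terminal state such as "Cancelled" frees it.
SLOT_HOLDING_STATES = {"Running", "Cancelling"}


@dataclass
class RunPool:
    """Hypothetical stand-in for an account with a fixed number of run slots."""
    slots: int
    states: dict = field(default_factory=dict)  # flow run id -> state name

    def used_slots(self) -> int:
        return sum(1 for s in self.states.values() if s in SLOT_HOLDING_STATES)

    def set_state(self, run_id: str, state: str) -> None:
        """Record a state change, then start any scheduled runs that now fit."""
        self.states[run_id] = state
        self._start_scheduled()

    def _start_scheduled(self) -> None:
        for run_id, state in self.states.items():
            if state == "Scheduled" and self.used_slots() < self.slots:
                self.states[run_id] = "Running"


# One slot, as in the thread ("our run slot").
pool = RunPool(slots=1)
pool.set_state("ghost-run", "Running")        # the stuck run
pool.set_state("monthly-run", "Scheduled")    # cannot start: slot is taken
pool.set_state("ghost-run", "Cancelling")     # cancel button: slot still held
state_while_cancelling = pool.states["monthly-run"]
pool.set_state("ghost-run", "Cancelled")      # "Set State" => Cancelled
state_after_forced_cancel = pool.states["monthly-run"]
```

In this model the scheduled run only starts once the ghost run reaches a terminal state, which matches the behaviour described above: the cancel button alone left the run in "Cancelling", while setting the state to Cancelled released the slot.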