<@ULVA73B9P> are there any reasons why prefect hos...
# ask-marvin
j
@Marvin are there any reasons why prefect hosted on AWS would have an agent submit the flow request twice, causing a crash?
m
I apologize for the confusion earlier. It seems that there are no existing GitHub issues related to your problem. If you're still experiencing the issue, I recommend opening a new issue on the Prefect GitHub repository to seek assistance from the Prefect community and developers. Make sure to provide detailed information about your setup, agent configuration, and the issue you're facing to help them better understand and address your problem.
n
@Marvin can you reconsider the question, focusing on the relationship between the agent and the ECS infrastructure block?
dang, looks like we're hitting some network timeouts from openai right now
@Joshua Grant can you share your crash logs?
j
no worries. I think it's related to RDS
Essentially the agent completes submission of the flow, the flow does something like this:
Apr 19 21:44:54.333
prefect-v2-agent
01:44:54.333 | INFO | prefect.agent - Submitting flow run '9d15021f-e169-41c7-b826-3af39c6b9ef0'

Apr 19 21:44:54.315
prefect-v2-agent
01:44:54.314 | INFO | prefect.agent - Submitting flow run '9d15021f-e169-41c7-b826-3af39c6b9ef0'
It submits the same flow run twice, at slightly different times. This has only happened once. Both flow runs complain:
01:46:00.764 | INFO    | prefect.engine - Engine execution of flow run '9d15021f-e169-41c7-b826-3af39c6b9ef0' aborted by orchestrator: This run cannot transition to the RUNNING state from the RUNNING state.
Then finally dies with both running flows complaining:
01:47:25.236 | INFO    | prefect.engine - Engine execution of flow run '9d15021f-e169-41c7-b826-3af39c6b9ef0' aborted by orchestrator: This run has already terminated.
n
hmm, who's responsible for submitting these flow runs? is it the scheduler / an automation / api call?
j
API call.
z
Those timestamps for submission are really close to each other. Are you only running one agent there?
j
Every flow has its own agent and work-queue.
z
Interesting. Does this happen often?
j
No, this has happened in our beta environment. Never happened in dev or alpha, but is a concern preventing promotion to prod. This is the only occurrence.
z
Hm very weird
Here’s a draft poking at a possible bug? It’d be super weird though https://github.com/PrefectHQ/prefect/pull/9280
I’m a bit confused by those logs being out of order (the submitting ones)
j
sorry, copied and pasted from DataDog, its default is most recent on top
z
Ahh okay 👍
j
ha, dev-ops team had replicas on the agents, believe that was causing the issue
z
Sounds about right 😄 we still try to guard against race conditions there but within 2ms is pretty tough
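For anyone landing on this thread later, here is a toy sketch of the race described above: several agent replicas poll the same work queue at nearly the same moment and each tries to submit the same flow run. Everything in it (`ToyOrchestrator`, `try_claim`, `agent_replica`, `simulate`) is hypothetical and is not Prefect's API or implementation. It only shows that an atomic "claim" step lets exactly one replica win. In a real deployment the check and the state write happen across a network and a database, which is part of why near-simultaneous submissions are hard to guard against.

```python
import threading


class ToyOrchestrator:
    """Hypothetical stand-in for an orchestration API (not Prefect's)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._states = {}

    def create_run(self, run_id):
        # A newly created run waits in SCHEDULED until an agent picks it up.
        self._states[run_id] = "SCHEDULED"

    def try_claim(self, run_id):
        """Atomically move SCHEDULED -> PENDING; only one caller can succeed."""
        with self._lock:
            if self._states.get(run_id) != "SCHEDULED":
                return False
            self._states[run_id] = "PENDING"
            return True


def agent_replica(name, orchestrator, run_id, barrier, submitted):
    # The barrier lines the replicas up so they all poll at the same moment.
    barrier.wait()
    if orchestrator.try_claim(run_id):
        submitted.append(name)  # only the winning replica submits the run


def simulate(replicas=2):
    """Run `replicas` agents against one flow run; return who submitted it."""
    orchestrator = ToyOrchestrator()
    run_id = "example-run"
    orchestrator.create_run(run_id)
    barrier = threading.Barrier(replicas)
    submitted = []
    threads = [
        threading.Thread(
            target=agent_replica,
            args=(f"agent-{i}", orchestrator, run_id, barrier, submitted),
        )
        for i in range(replicas)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return submitted
```

Calling `simulate(2)` returns a list with a single agent name, whichever replica won the claim. Without the lock around the check and the write, both replicas could see SCHEDULED and both would submit. Running a single agent per work queue, as was the fix here, avoids the contention entirely.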