[REPOST]
Hi all,
I am using Prefect 2.14.15 and run my own Prefect Orion server, with multiple work pools deployed on different servers.
Flows running on my machine are failing with PrefectHTTPStatusError on some of the internal Prefect API endpoints, such as /flow_run, /flows, and /set_state.
I am not able to figure out the root cause of this issue.
My infra setup:
- prefect-orion server: 16 GB RAM, 4 cores
work-pools:
- default-pool: same as prefect-orion server | 1 agent
- pool1: server2, 16 GB RAM, 4 cores | 5 agents
- pool2: same server as pool1 | 7 agents
Pool1 has a daily load of nearly 30k flow runs, of which ~5% fail due to these errors.
Pool2 has a daily load of nearly 10k flow runs, of which ~8% fail due to these errors.
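For a rough sense of scale, here is the arithmetic behind those percentages. The loads and failure rates are approximate, so the counts are estimates only:

```python
# Approximate daily load and failure rates, taken from the figures above.
pools = {
    "pool1": {"daily_flows": 30_000, "failed_percent": 5},
    "pool2": {"daily_flows": 10_000, "failed_percent": 8},
}


def estimated_failures(daily_flows: int, failed_percent: int) -> int:
    """Estimated number of failed flow runs per day for one pool."""
    return daily_flows * failed_percent // 100


failures = {name: estimated_failures(**p) for name, p in pools.items()}
total_flows = sum(p["daily_flows"] for p in pools.values())
total_failures = sum(failures.values())
combined_percent = total_failures * 100 / total_flows

print(failures)
print(total_failures, combined_percent)
```

That is roughly 1,500 failed runs a day on pool1 and 800 on pool2, or about 2,300 out of 40k (~5.75%) across both pools.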
There are around 350k flow runs in total so far.
All the machines are in the same VPC. I am using a PostgreSQL DB connected to my Prefect Orion server, and some tables (like log and flow_run_state) are approaching 700k records.
Earlier I thought the errors were timeouts caused by the large amount of data in the DB, but I have since removed nearly 30% of my old data from the DB and the error still occurs.
Does anyone have any idea what might be causing it?