hey, I'm getting server errors under heavy loads when self hosting on k8s
sqlalchemy.exc.TimeoutError: QueuePool limit of size 5 overflow 10 reached, connection timed out, timeout 30.00 (Background on this error at: <https://sqlalche.me/e/20/3o7r>)
this causes flows to crash and sometimes creates zombie flows that I have to cancel/delete manually
the loads are around 100 flows (each flow has around 100 sub-tasks), and flows are running on Process type agents
I can increase these settings on the server, or maybe even put in a better Postgres DB, but I'm not sure how to avoid these errors in the future
Benji
01/31/2024, 6:58 PM
Hi @Lior Barak, did you ever resolve this?
Lior Barak
02/27/2024, 7:20 PM
hi
@Benji a very late response:
I changed my DB to work with an external AWS RDS
changed the server sqlalchemy settings:
PREFECT_SQLALCHEMY_POOL_SIZE
PREFECT_SQLALCHEMY_MAX_OVERFLOW
(notice that raising these can have adverse effects on your interaction with the DB, since each server can then hold more connections open)
also, I have multiple prefect servers running in parallel and a limited amount of agents so I won't get to this overflow
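
For anyone sizing these settings, here is a rough sketch of the arithmetic, not anything taken from Prefect itself. The error above reports "size 5 overflow 10", so one pool tops out at 15 connections. The helper name, the tuned values, the replica count, and the Postgres limit below are all made-up numbers for illustration, and the sketch assumes one connection pool per server process.

```python
def max_db_connections(pool_size: int, max_overflow: int, server_replicas: int = 1) -> int:
    """Upper bound on DB connections the server pools can hold open at once.

    Hypothetical helper: assumes each server replica runs a single pool.
    """
    return (pool_size + max_overflow) * server_replicas


# Values reported in the error message: pool size 5, overflow 10.
default_cap = max_db_connections(pool_size=5, max_overflow=10)

# Illustrative tuned values (assumptions, not recommendations), e.g.
# PREFECT_SQLALCHEMY_POOL_SIZE=20 and PREFECT_SQLALCHEMY_MAX_OVERFLOW=30
# on each of 3 server replicas.
tuned_cap = max_db_connections(pool_size=20, max_overflow=30, server_replicas=3)

# Hypothetical Postgres max_connections; check the real value on your DB.
postgres_max_connections = 200

print(f"default cap per server: {default_cap}")
print(f"tuned cap across replicas: {tuned_cap}")
print(f"fits within Postgres limit: {tuned_cap <= postgres_max_connections}")
```

If the tuned cap comes out above what the database allows, raising the pool settings just moves the failure from the server's pool to Postgres, which is presumably the adverse effect mentioned above.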