<@ULVA73B9P> I'm using a self-hosted Prefect versi...
# ask-marvin
w
@Marvin I'm using a self-hosted Prefect version 3 setup with PostgreSQL + PGbouncer. I’m trying to understand the best way to configure connection pooling in this environment. Can you find any mention or guidelines or best practices to set the number of connections? Can you search on the codebase and over the github issues how many connection do I need? Here's my scenario: • I have two Prefect instances: ◦ Instance 1 has 1 worker, 1 pool, and always shows 5 connections (the default pool size), which occasionally jumps to 7 when a flow runs. ◦ Instance 2 has 4 workers, 4 pools, and seems to keep 9 connections open all the time. My main confusion is that workers don't directly connect to the database, so I'm not sure why the total connections rise when flows start running, is there any predictable number of connections added when a flow start to run? Recently, Instance 2 reached 20 pending connections, and PGbouncer started timing out. My settings are configurable for 5 connection and an overflow of 10, but it feels like the overflow isn’t fully respected. Can you share more details on how Prefect Server handles DB connections, or a rule of thumb for sizing PGbouncer pool/overflow in relation to the number of workers and flows? The number of workers reflect on the number of connection that I need on pgbouncer?
m
Here's a summary of what I found regarding Prefect's handling of database connections and PGbouncer configuration: 1. Prefect and Database Connections: - Prefect supports PostgreSQL for its database, which is recommended for more robust, multi-server setups. However, specific configurations for connection pooling with PGbouncer are not directly mentioned in the documentation. - Prefect uses SQLAlchemy for database connections, but there is a feature request to allow more granular control over connection arguments, which might be useful for your PGbouncer setup (GitHub Issue). 2. Connection Pooling and PGbouncer: - The issue you’re experiencing with PGbouncer may relate to how Prefect and its database connections are managed. PGbouncer settings like
statement_cache_size
and
prepared_statement_cache_size
can affect how connections are pooled and reused. - There is a known limitation when using PGbouncer in transaction mode with
asyncpg
, which can lead to errors if prepared statements are not handled correctly. 3. Connection Behavior: - The increase in database connections when flows start could be due to Prefect's internal management of flow states and logs, which require database interactions. - The number of workers might indirectly affect the number of connections due to increased parallel task execution, leading to more frequent database interactions. 4. Best Practices and Recommendations: - While there is no explicit guidance on the exact number of connections needed, it’s crucial to monitor and adjust based on your workload. It might be beneficial to increase the pool size and overflow settings in PGbouncer, considering your current setup leads to timeouts. - Consider setting
statement_cache_size
to 0 if you encounter issues with prepared statements, as suggested in the GitHub issue. 5. General Advice: - Monitor your database and PGbouncer logs to understand connection patterns and adjust configurations accordingly. - Test different configurations in a staging environment to find the optimal settings for your workload. For more detailed guidance, you might need to experiment with different configurations or consult with Prefect community forums for user experiences similar to yours. Here’s the Prefect self-hosting guide for additional context.