<@ULVA73B9P> I'm getting such an error on prefect-serve...
# ask-marvin
d
@Marvin I'm getting this error on prefect-server when running multiple jobs: TimeoutError: QueuePool limit of size 5 overflow 10 reached
m
thought for 188.5 seconds
That error is coming from SQLAlchemy's connection pool in the Prefect server API. You're hitting the default pool limits (pool_size=5, max_overflow=10) under load. You can fix it by increasing the server's database pool settings.

In Prefect 3.x, set these environment variables on the server process (not workers):
- PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE (default: 5)
- PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW (default: 10)
- PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT (default: 30 seconds)
- PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_RECYCLE (default: 3600 seconds)

Example (Docker Compose):
services:
  server:
    image: prefecthq/prefect:3-latest
    environment:
      - PREFECT_SERVER_DATABASE_CONNECTION_URL=postgresql+asyncpg://<user>:<pass>@<host>:5432/<db>
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE=20
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW=40
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT=60
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_RECYCLE=3600
    ports:
      - "4200:4200"
After changing env vars, restart the server container/pod.

Notes and guidance:
- If you're on SQLite, consider moving to Postgres for any real concurrency. SQLite serializes writes and is more prone to contention. See Self-hosted Prefect.
- Tune Postgres max_connections to accommodate your increased pool size (roughly pool_size + max_overflow concurrent connections), and account for any other apps sharing the database.
- These settings only apply to the API server's DB pool. Workers communicate over HTTP and do not need these DB env vars.

How to see your current values:
prefect config view --show-defaults | grep -E 'SQLALCHEMY|DATABASE'
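On the Postgres side, you can check the configured connection cap and current usage with standard psql commands; connection flags are omitted here, so adjust host/user/database for your setup:

# server-wide connection limit
psql -c "SHOW max_connections;"
# connections currently open against the database
psql -c "SELECT count(*) FROM pg_stat_activity;"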
References:
- Settings reference: Prefect 3.x settings reference
- Database configuration types (Postgres/SQLite) used by the server: Server database configurations

Can you share:
- Which DB backend you're using (SQLite vs Postgres)
- How you're running the server (Docker Compose, Kubernetes, etc.)
- Roughly how many concurrent jobs/requests you expect

I can suggest concrete numbers for your pool based on that.
d
I use Postgres on Kubernetes, about 30 jobs
@Marvin new error: raise TimeoutError from exc_val
m
thought for 166.6 seconds
Got it: that generic "raise TimeoutError from exc_val" usually means we're still hitting a timeout, most often from the DB connection pool waiting too long for a connection.

Quick checklist to unblock you:
- If you haven't already, raise the server's DB pool timeout and pool sizes:
  - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT (default 30s): try 60–120
  - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE: e.g. 20–50
  - PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW: e.g. 40–100
- If you're on SQLite, also increase the SQLite connect timeout (so it waits longer on locks) and strongly consider switching to Postgres for concurrency.

Examples

Docker Compose (Postgres):
services:
  server:
    image: prefecthq/prefect:3-latest
    environment:
      - PREFECT_SERVER_DATABASE_CONNECTION_URL=postgresql+asyncpg://<user>:<pass>@<host>:5432/<db>
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE=30
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW=60
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT=120
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_RECYCLE=3600
    ports:
      - "4200:4200"
Docker Compose (SQLite — only for light concurrency):
services:
  server:
    image: prefecthq/prefect:3-latest
    environment:
      # four slashes: absolute path to the database in the default PREFECT_HOME
      - PREFECT_SERVER_DATABASE_CONNECTION_URL=sqlite+aiosqlite:////root/.prefect/prefect.db
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE=20
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW=40
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT=120
      - PREFECT_SERVER_DATABASE_SQLALCHEMY_CONNECT_ARGS__timeout=60
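If you edit either compose file above, note that docker compose restart alone does not apply compose-file changes; recreating the service does (the service name server comes from the examples):

# re-create the container so the new environment variables take effect
docker compose up -d --force-recreate server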
Notes
- Defaults (server): pool_size=5, max_overflow=10, pool_timeout=30s, pool_recycle=3600s.
- Postgres must allow enough connections for your new pool: roughly pool_size + max_overflow per API process, plus headroom for migrations and other connections.
- Restart the server after changing env vars.

Verify current values:
prefect config view --show-defaults | grep -E 'SERVER_DATABASE|SQLALCHEMY'
Could you share:
- Which DB backend (SQLite vs Postgres)?
- How you run the server (Docker Compose, Kubernetes, bare metal)?
- The full stack trace around "raise TimeoutError from exc_val" and any "QueuePool" messages?
- Approximate concurrent jobs/API load?

With that, I can recommend concrete pool/timeout numbers tailored to your setup.

Refs:
- Prefect 3.x settings reference
- Self-hosted Prefect server
d
@Marvin I run Postgres on Kubernetes, about 30 tasks
m
thought for 23.9 seconds
Thanks. With Postgres on Kubernetes and ~30 concurrent tasks, you're likely still exhausting the API server's DB pool or waiting too long on a connection.

Recommended server settings (Prefect 3.x): increase pool size, overflow, and wait timeout on the Prefect API server.
- PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE: 20–40
- PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW: 40–80
- PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT: 60–120
- PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_RECYCLE: 3600

Kubernetes example (server Deployment):
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prefect-server
spec:
  replicas: 1
  # selector and matching template labels are required for a Deployment
  selector:
    matchLabels:
      app: prefect-server
  template:
    metadata:
      labels:
        app: prefect-server
    spec:
      containers:
        - name: server
          image: prefecthq/prefect:3-latest
          # start the API server and listen on all interfaces
          command: ["prefect", "server", "start", "--host", "0.0.0.0"]
          ports:
            - containerPort: 4200
          env:
            - name: PREFECT_SERVER_DATABASE_CONNECTION_URL
              valueFrom:
                secretKeyRef:
                  name: prefect-db
                  key: url
            - name: PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE
              value: "30"
            - name: PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW
              value: "60"
            - name: PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT
              value: "120"
            - name: PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_RECYCLE
              value: "3600"
Important sizing notes
- Total DB connections ≈ server_replicas × (pool_size + max_overflow) + a little overhead. Make sure Postgres max_connections comfortably covers this plus other clients (see the worked example below).
- If you scale the Prefect server to multiple replicas, multiply accordingly.
- Only the API server needs these env vars; workers don't connect to the DB directly.

If you still see timeouts after increasing the pool:
- Verify Postgres max_connections isn't the bottleneck.
- Bump PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_TIMEOUT further (e.g., 180) if bursts briefly exceed capacity.
- Check Postgres for connection saturation or slow queries (pg_stat_activity / pg_stat_statements).
- Optionally place PgBouncer in front of Postgres to smooth spikes (ensure settings are compatible with asyncpg).

Defaults for reference
- pool_size=5, max_overflow=10, pool_timeout=30s, pool_recycle=3600s.

Docs
- Prefect 3.x settings reference
- Self-hosted Prefect

If you can share:
- Server replica count
- Postgres max_connections

I can suggest exact numbers so your total connections stay within limits.
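As a worked example of the sizing formula above, using the single-replica manifest values (pool_size=30, max_overflow=60; scale the arithmetic by your actual replica count):

# 1 replica x (30 pool_size + 60 max_overflow) = 90 connections at peak,
# plus headroom for migrations and other clients,
# so Postgres max_connections should comfortably exceed ~100 (e.g. 120 or more)
psql -c "SHOW max_connections;"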