<@ULVA73B9P> i have a prefect server that I self h...
# ask-marvin
@Marvin I have a Prefect server that I self-host. It manages about 20 deployments running on different workers. I observe that the Prefect Python process on the server grows in RAM usage over several weeks, reaching up to 20 GB, which slows the UI down until eventually I have to restart the service. How can I fix that?
thought for 739.3 seconds
Thanks for the details; we can help you get this stable. First, a couple of clarifying questions so we can tailor the fix:
- What Prefect version is your server on? (The 3.x minor/patch matters here.)
- How are you running the server (Docker, systemd, k8s)? Which DB backend (Postgres vs. SQLite)?
- Roughly how many flow runs/log lines per day, and do you often keep the UI open in "Live" mode (many tabs)?

What you're seeing is typically caused by long-lived, in-memory data structures (events and log streaming), plus connection and cache growth under load. Here's a practical set of steps that usually eliminates steady RAM growth on self-hosted servers.

Immediate mitigations (safe defaults)
- Upgrade to the latest 3.x first (several server scaling and stability improvements have landed recently)
  - Release notes: 3.4.20 changelog
- Cap the in-memory event stream and retention (the UI "live" experience uses this)
  - Events settings (environment variables you can add to your service/unit/docker; a Compose sketch follows the settings block):
```
PREFECT_SERVER_EVENTS_MAXIMUM_SIZE_BYTES=1000000         # ~1 MB in-memory event buffer cap
PREFECT_SERVER_EVENTS_RETENTION_PERIOD=1d                # keep only 1 day in the live cache
PREFECT_SERVER_EVENTS_STREAM_OUT_ENABLED=false           # disable live push if you can live without it
PREFECT_SERVER_EVENTS_WEBSOCKET_BACKFILL_PAGE_SIZE=100   # reduce backfill chunking
PREFECT_SERVER_EVENTS_MAXIMUM_WEBSOCKET_BACKFILL=2m      # trim backfill time window
```
Settings reference: Settings reference
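If you run the server with Docker Compose, here is a minimal sketch of wiring these in (the service name, image tag, and port mapping are illustrative, not taken from your setup):
```yaml
services:
  prefect-server:
    image: prefecthq/prefect:3-latest            # illustrative tag; pin your actual 3.x version
    command: prefect server start --host 0.0.0.0
    environment:
      PREFECT_SERVER_EVENTS_MAXIMUM_SIZE_BYTES: "1000000"
      PREFECT_SERVER_EVENTS_RETENTION_PERIOD: "1d"
      PREFECT_SERVER_EVENTS_STREAM_OUT_ENABLED: "false"
      PREFECT_SERVER_EVENTS_WEBSOCKET_BACKFILL_PAGE_SIZE: "100"
      PREFECT_SERVER_EVENTS_MAXIMUM_WEBSOCKET_BACKFILL: "2m"
    ports:
      - "4200:4200"                              # 4200 is the default API/UI port
```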
- Reduce DB connection pressure (each connection has a memory cost):
```
PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_SIZE=5        # cap persistent DB connections per process
PREFECT_SERVER_DATABASE_SQLALCHEMY_MAX_OVERFLOW=5     # allow at most 5 extra connections in bursts
PREFECT_SERVER_DATABASE_SQLALCHEMY_POOL_RECYCLE=1800  # recycle connections after 30 minutes
```
- If logs are very chatty, throttle log volume to the API
  - Prefer higher log levels for flows/tasks, or send fewer logs to the API (lots of logs require buffering and DB writes by the server); a worker-side sketch follows.
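As a sketch, the same Compose pattern on the worker side (the pool name and image tag are illustrative; PREFECT_LOGGING_LEVEL and PREFECT_API_URL are standard Prefect settings):
```yaml
services:
  worker:
    image: prefecthq/prefect:3-latest                # pin to your actual version
    command: prefect worker start --pool my-pool     # "my-pool" is illustrative
    environment:
      PREFECT_LOGGING_LEVEL: "WARNING"               # flows/tasks ship far fewer log lines to the API
      PREFECT_API_URL: "http://prefect-server:4200/api"
```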
- Enable server metrics so you can correlate memory with events/logs/load:
```
PREFECT_SERVER_METRICS_ENABLED=true
```
Then scrape /metrics with Prometheus/Grafana for memory and connection metrics; a sample scrape config follows.
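A minimal Prometheus scrape config for that endpoint might look like this (job name, host, and port are assumptions; 4200 is the server's default port):
```yaml
# prometheus.yml (sketch)
scrape_configs:
  - job_name: prefect-server
    metrics_path: /metrics                   # per the note above; verify where your version exposes it
    static_configs:
      - targets: ["prefect-server:4200"]     # adjust to wherever your server listens
```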
Scale and process isolation (recommended)
- Recent releases added support for vertically scaling the server with multiple workers, for better isolation and resilience; upgrading lets you run multiple web workers and use worker recycling to avoid cumulative memory growth over very long uptimes.
- See "Vertical scale prefect server with workers" in the release notes: 3.4.20 changelog
- After upgrading, configure multiple API workers and request/connection limits as documented to keep per-process memory bounded.

Operational guardrails
- If you run the server under systemd or k8s, add guardrails to avoid ever hitting 20 GB.
  - systemd example:
```
# in the [Service] section of your unit file
MemoryMax=8G      # hard cap; the process is killed if it exceeds this
Restart=always    # ...and systemd brings it straight back up
```
- Kubernetes example: set memory limits/requests and a liveness probe so the pod gets restarted before memory becomes problematic; see the sketch after this list.
- Avoid keeping many UI tabs open on "Live" pages for long periods (each maintains websockets and subscriptions).
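A minimal sketch of the relevant Deployment fields (names, image tag, thresholds, and the probe path are illustrative; verify them against your manifests and Prefect version):
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: prefect-server
spec:
  replicas: 1
  selector:
    matchLabels: {app: prefect-server}
  template:
    metadata:
      labels: {app: prefect-server}
    spec:
      containers:
        - name: prefect-server
          image: prefecthq/prefect:3-latest            # pin to your actual version
          args: ["prefect", "server", "start", "--host", "0.0.0.0"]
          resources:
            requests: {memory: "2Gi"}
            limits: {memory: "8Gi"}                    # pod is killed and restarted well before 20 GB
          livenessProbe:
            httpGet: {path: /api/health, port: 4200}   # assumed health endpoint; verify for your version
            initialDelaySeconds: 30
            periodSeconds: 30
```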
Why this helps
- The defaults favor a responsive UI with days of live event backfill and generous buffering; under steady or bursty load, the in-memory event/log caches and active websocket backfills can steadily grow the process RSS.
- Tightening the events cache, reducing websocket backfill size, and capping the DB pool size prevents that accumulation.
- Multiple API workers and recycling further prevent slow, monotonic RSS growth in a single long-lived process.

Useful links
- Self-hosting overview: Self-hosted Prefect server
- Settings reference (all server settings and their env vars): Settings reference
- Context on memory behavior under load: More memory, more problems

If you share:
- your Prefect version,
- how you start the server,
- DB backend,
- and approximate run/log volume,
I can translate the above into an exact config (env/systemd/Docker) for your environment.