# ask-marvin
j
@Marvin when I'm setting up a scalable self-hosted Prefect instance (based on https://docs-3.prefect.io/v3/advanced/self-hosted), do I need to use `PREFECT_SERVER_DATABASE_` or `PREFECT_API_DATABASE_` env vars on the API and on the background-services instances?
m
thought for 89.3 seconds
Short answer: use `PREFECT_SERVER_DATABASE_*` for both the API and the background services. There is no `PREFECT_API_DATABASE_*` setting in Prefect 3.x for server-side database configuration.

Details:
- Any container/process that runs the Prefect Server (the API service and each background-service instance) should receive the same `PREFECT_SERVER_DATABASE_*` settings so they all point to the same Postgres database.
- Common options you can set:
  - `PREFECT_SERVER_DATABASE_CONNECTION_URL` (recommended single URL)
  - Or the component fields if you prefer:
    - `PREFECT_SERVER_DATABASE_DRIVER` (e.g., `postgresql+asyncpg`)
    - `PREFECT_SERVER_DATABASE_HOST`
    - `PREFECT_SERVER_DATABASE_PORT`
    - `PREFECT_SERVER_DATABASE_USER`
    - `PREFECT_SERVER_DATABASE_PASSWORD`
    - `PREFECT_SERVER_DATABASE_NAME`
  - Optional SQLAlchemy tuning/TLS (e.g., `PREFECT_SERVER_DATABASE_SQLALCHEMY_ECHO`, `PREFECT_SERVER_DATABASE_SQLALCHEMY_CONNECT_ARGS_TLS_CA_FILE`, etc.)

Example env snippet:

```env
PREFECT_SERVER_DATABASE_CONNECTION_URL=postgresql+asyncpg://prefect:prefect@my-postgres:5432/prefect
# or:
PREFECT_SERVER_DATABASE_DRIVER=postgresql+asyncpg
PREFECT_SERVER_DATABASE_HOST=my-postgres
PREFECT_SERVER_DATABASE_PORT=5432
PREFECT_SERVER_DATABASE_USER=prefect
PREFECT_SERVER_DATABASE_PASSWORD=prefect
PREFECT_SERVER_DATABASE_NAME=prefect
```
Docs for reference:
- Self-hosted guide: Scalable self-hosted Prefect
- Settings reference: Settings reference (search for "ServerDatabaseSettings" and `PREFECT_SERVER_DATABASE_*`)

If you share how you're deploying (Docker Compose, Kubernetes/Helm), I can tailor an example manifest with the right env vars wired into both the API and services.
j
@Marvin what are the background services responsible for, and can they be scaled for high availability?
m
thought for 143.6 seconds
Short version
- Background services are Prefect Server's periodic/async control-plane workers (not flow runners). They handle things like scheduling, triggers, late-run marking, cancellation cleanup, repossessing stuck runs, concurrency limits, pause expirations, and event logging/persisting.
- Yes, you can (and should) run multiple replicas for high availability and throughput. They are stateless and coordinate via the database to avoid duplicate work, so horizontal scaling is safe.

What they do (high level)
- Scheduler: materializes flow runs from deployment schedules.
- Triggers engine: evaluates trigger conditions and starts runs.
- Late runs: marks overdue runs.
- Cancellation cleanup: finalizes cancellations that didn't complete cleanly.
- Repossessor: recovers/marks stuck runs so resources/concurrency slots are freed.
- Concurrency limits: enforces configured concurrency policies.
- Pause expirations: expires paused runs at configured times.
- Event logger/persister: collects and persists events for the UI and auditing.
- Task run recorder: ensures final task/flow state and related records are durably written.

HA and scaling guidance
- Run 2+ replicas of the background-services process; they coordinate through the DB and are designed to be idempotent, so multiple replicas won't double-execute work.
- Use the same database settings for all replicas (and the API) so they share state.
- Use managed Postgres or a highly available Postgres setup; the services depend on it.
- You can scale replicas up to improve scheduling/trigger-evaluation throughput and reduce tail latencies.
- Ensure clocks are roughly in sync across nodes (helps with "late" detection).

Config pointers
- Use the same `PREFECT_SERVER_DATABASE_*` env vars on all background-service replicas and API replicas.
- Also set `PREFECT_API_URL` (and `PREFECT_API_AUTH_STRING` if auth is enabled) so services can reach the API.

Docs
- Scalable self-hosted guide: Scalable self-hosted Prefect
- Settings reference (search for "ServerServices" and "ServerDatabaseSettings"): Settings reference

If you share whether you're on Kubernetes or Docker Compose, I can sketch a concrete deployment with multiple replicas and the right env vars.
j
@Marvin what is redis needed for then if the background services and the api share state via postgres?
m
thought for 302.0 seconds
n
redis is the messaging broker for things like events
🙏 1
m
Great question. In Prefect 3.x, Postgres is the durable system of record for runs, deployments, etc. Redis is optional and used for fast, ephemeral messaging and caching that let multiple API/background-service replicas coordinate real-time streams.

What Redis is used for
- Real-time events and logs streaming: a pub/sub "messaging broker" powers UI live updates (runs, states, logs) and event pipelines. With a single API process, an in-memory broker works; across multiple replicas, you need a shared broker so messages are visible to all pods.
- Ephemeral caches for events/logs: short-lived caches that support websocket backfill, deduplication, and quick lookups are backed by a messaging cache. Redis lets these caches be shared across replicas.

What Redis is not used for
- Not for durable state of runs/deployments/schedules; that's all in Postgres.
- Not for orchestration locks or concurrency counters; these use database transactions and are safe across multiple replicas without Redis.

Is Redis required?
- Single-node: no. The default in-memory broker/cache are fine.
- Multi-replica/HA: strongly recommended. Without Redis, each replica only sees its own in-memory streams, so live UI updates and streaming logs/events won't be consistent across replicas.

How to enable Redis for HA streaming
- Switch the events broker and cache to the Redis-backed implementations:
  - `PREFECT_SERVER_EVENTS_MESSAGING_BROKER`
  - `PREFECT_SERVER_EVENTS_MESSAGING_CACHE`
- These default to in-memory; set them to the Redis modules and provide Redis connection details for your API and background-services containers. Example shape:

```env
PREFECT_SERVER_EVENTS_MESSAGING_BROKER=prefect.server.utilities.messaging.redis
PREFECT_SERVER_EVENTS_MESSAGING_CACHE=prefect.server.utilities.messaging.redis
# Provide your Redis connection details to the broker/cache (URL/TLS/etc.)
# (I can share an exact example for Docker Compose or Helm if you'd like.)
```

- Make sure the Prefect images have the Redis client dependency installed.

Docs
- Scalable self-hosted Prefect: Guide
- Settings reference (see ServerEventsSettings and related): Settings reference

If you share whether you're on Docker Compose or Kubernetes, I can give you a concrete config with the right environment variables and dependencies for Redis-backed events/logs across multiple replicas.
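To make the shape concrete, here is a hypothetical Docker Compose fragment that adds a Redis service and points the messaging broker/cache at the Redis-backed modules named above. The service names and image tag are illustrative, and the exact environment variables for the Redis connection itself are not covered in this thread, so check the settings reference for those.

```yaml
# Hypothetical compose fragment; adapt names and versions to your deployment.
services:
  redis:
    image: redis:7

  prefect-api:
    environment:
      # Redis-backed broker/cache so all API replicas see the same event streams
      PREFECT_SERVER_EVENTS_MESSAGING_BROKER: prefect.server.utilities.messaging.redis
      PREFECT_SERVER_EVENTS_MESSAGING_CACHE: prefect.server.utilities.messaging.redis
      # Redis connection details (URL/TLS/etc.) go here as well; the exact
      # variable names are not in this thread -- see the settings reference.
    depends_on:
      - redis
```

The same two messaging settings (and the Redis connection details) would be set on the background-services containers too, so every replica publishes and consumes through the shared broker.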
n
and @Marvin is actually wrong, I would not run many instances of the background services (marvin go find the issue about this), though I would be surprised if that became a bottleneck for you
m
thought for 226.3 seconds
j
it hasn't become, I was just wondering. Thanks a lot @Nate!
m
You’re right — thanks for the correction. Current guidance is to run a single background-services instance. Multiple replicas are not universally safe yet.

Reference:
- "Document and improve horizontal scalability of background services" prefect#18753 — maintainers note that not all background services are safe to run with multiple replicas, and this is being audited. The plan is to document which services are singleton vs HA-safe and possibly expose separate entrypoints.

Implications and recommendations today:
- Scale the API horizontally behind a load balancer.
- Keep background services at `replicas: 1`.
- If you need to manage load, you can split services into separate processes/pods and toggle them via `PREFECT_SERVER_SERVICES_*_ENABLED` flags — but avoid running duplicate replicas of the same services concurrently until the audit is complete.

Related tracking issues you may find useful:
- Healthcheck for background services: prefect#18997
- Scheduler CPU hot spot with some RRule schedules: prefect#18861

If you’d like, I can share a Docker Compose or Helm example that:
- runs N replicas of the API,
- runs 1 replica of background services,
- and uses service toggles to isolate heavier services like the Scheduler.
🙌 1