Simon Ouellette
10/23/2025, 4:42 PM
Marvin
10/23/2025, 4:42 PM
Marvin
10/23/2025, 4:47 PM
prefect config view --show-sources --show-defaults
2) Curl the API while a run is in progress
- From the worker machine:
curl -i https://<your>-4200.proxy.runpod.net/api/work_pools/
You should get 200 with JSON. If it’s 404, the proxy is not sending /api to the Prefect API.
3) Persist the server DB
- If you’re running Prefect Server with SQLite, ensure the DB path is on a persistent volume so a server restart does not drop your work pool.
4) Keep the proxy connection warm
- Lower the worker sync interval to generate more frequent backend calls:
export PREFECT_WORKER_QUERY_SECONDS=10
This can mitigate proxy idle timeouts.
5) Add a healthcheck to observe worker health during long runs
prefect worker start --pool <pool-name> --type process --with-healthcheck
Then hit the local /health endpoint to see if it’s going unhealthy while your task runs.
6) Upgrade to the latest 3.x
- We continually improve network resiliency. Make sure both server and worker are on a current 3.x patch.
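For the curl check in step 2, here is a minimal sketch of how the result could be read from a script instead of by eye. `interpret_status` and `probe_status` are hypothetical helper names, not Prefect commands, and the diagnosis strings are only a rough reading of what each status usually means behind a port proxy.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map an HTTP status code to a rough diagnosis.
interpret_status() {
  case "$1" in
    2??) echo "ok: proxy is forwarding /api to the Prefect API" ;;
    404) echo "not found: proxy is probably not routing /api to the Prefect API" ;;
    502|503|504) echo "gateway error: proxy could not reach the server" ;;
    000) echo "no response: connection failed or timed out" ;;
    *)   echo "unexpected status: $1" ;;
  esac
}

# Hypothetical helper: print only the status code for a URL.
# curl prints 000 when no HTTP response was received at all.
probe_status() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$1"
}

# Example (assumes PREFECT_API_URL is already exported on the worker machine):
# interpret_status "$(probe_status "$PREFECT_API_URL/work_pools/")"
```

Running the example line in a loop while a flow run is in progress would show whether the status flips to 404 or a gateway error at the moment the worker fails.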
If you can share a bit more, I can be more specific:
- Prefect version on both server and worker
- The exact value of PREFECT_API_URL
- How your Prefect Server is deployed on RunPod (docker? bare process?) and whether the DB is persisted
- Any reverse proxy config in front (beyond the RunPod port proxy)
Related docs
- Worker concepts: https://docs-3.prefect.io/v3/concepts/workers
- Worker healthchecks: https://docs-3.prefect.io/v3/advanced/worker-healthchecks
- Server concepts: https://docs-3.prefect.io/v3/concepts/server
- REST API (work pools): https://docs-3.prefect.io/v3/api-ref/rest-api/server/work-pools/read-work-pools
Quick “known good” baseline
- Set and verify:
export PREFECT_API_URL=https://<your>-4200.proxy.runpod.net/api
export PREFECT_LOGGING_LEVEL=DEBUG
export PREFECT_WORKER_QUERY_SECONDS=10
prefect config view --show-sources
- Start a process worker (auto-creates the pool if needed):
prefect worker start --pool <pool-name> --type process --with-healthcheck
Marvin
10/23/2025, 4:47 PM
Simon Ouellette
10/23/2025, 4:49 PM
Marvin
10/23/2025, 4:49 PM
Marvin
10/23/2025, 4:51 PM
export PREFECT_API_URL=http://localhost:4200/api
or the Docker network service name if using compose (e.g., `http://prefect-server:4200/api`). This removes the proxy from the path.
- Make the client auto-retry 404s and gateway errors
- Prefect’s client can retry on additional status codes. Add 404 (and typical gateway errors) and raise the retry limit:
export PREFECT_CLIENT_MAX_RETRIES=30
export PREFECT_CLIENT_RETRY_EXTRA_CODES=404,502,503,504
export PREFECT_CLIENT_RETRY_JITTER_FACTOR=0.2
Then restart the worker. This usually rides out brief proxy remaps.
- Keep the connection warm to avoid proxy idle timeouts
- Make the worker touch the API more often:
export PREFECT_WORKER_QUERY_SECONDS=10
This reduces the chance the proxy closes the idle route.
- Use healthchecks and an automatic restart policy
- Start the worker with a health endpoint and run it under a supervisor that restarts on exit:
prefect worker start --pool <pool-name> --type process --with-healthcheck
- Examples:
- Docker: --restart unless-stopped
- Systemd: Restart=always
- Simple loop:
while true; do prefect worker start --pool <pool-name> --type process; sleep 2; done
- Persist your server state
- If the server restarts without a persistent DB, your work pool disappears. Mount the SQLite DB (or use Postgres) so a restart doesn’t wipe pools.
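For the persistence point, a minimal sketch of pointing the server's SQLite database at a persistent volume. The path `/workspace/prefect` is an assumption (use whatever volume your pod actually persists), and these variables need to be set in the environment of the server process, not the worker.

```shell
# Assumption: /workspace is a persistent volume on the pod (hypothetical path).
export PREFECT_HOME=/workspace/prefect

# Absolute SQLite paths need four slashes after the scheme; PREFECT_HOME
# already starts with "/", so three slashes are written here.
export PREFECT_API_DATABASE_CONNECTION_URL="sqlite+aiosqlite:///${PREFECT_HOME}/prefect.db"
```

Create the directory once (`mkdir -p "$PREFECT_HOME"`) before starting the server, then confirm the value with `prefect config view --show-sources`.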
If you can’t bypass the proxy, the retry + keepalive combo is the most “in-Prefect” workaround to prevent transient 404s from taking down the worker.
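For the "simple loop" restart option above, here is a slightly more defensive sketch that backs off between restarts and eventually gives up. `supervise` is a hypothetical helper, not a Prefect feature, and the limits are arbitrary placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical supervisor: rerun a command whenever it exits, doubling the
# delay between restarts up to a cap so a hard failure does not spin.
# Usage: supervise <max_restarts> <initial_delay_s> <max_delay_s> <command...>
supervise() {
  local max_restarts=$1 delay=$2 max_delay=$3
  shift 3
  local restarts=0 status
  while true; do
    status=0
    "$@" || status=$?          # run the command and remember its exit code
    restarts=$((restarts + 1))
    if [ "$restarts" -gt "$max_restarts" ]; then
      echo "giving up after $max_restarts restarts (last exit code $status)" >&2
      return "$status"
    fi
    echo "exited with $status; restart $restarts/$max_restarts in ${delay}s" >&2
    sleep "$delay"
    delay=$((delay * 2))       # exponential backoff
    if [ "$delay" -gt "$max_delay" ]; then
      delay=$max_delay         # cap the wait between restarts
    fi
  done
}
```

It could be invoked as `supervise 1000 2 60 prefect worker start --pool <pool-name> --type process --with-healthcheck`. Like `Restart=always`, it restarts on any exit, including a clean one. Docker or systemd restart policies are still the sturdier choice when they are available.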
References:
- Workers: https://docs-3.prefect.io/v3/concepts/workers
- Worker healthchecks: https://docs-3.prefect.io/v3/advanced/worker-healthchecks
- Server: https://docs-3.prefect.io/v3/concepts/server
If you share your current PREFECT_API_URL and how the server is exposed on RunPod, I can suggest the safest exact values for your setup.
Simon Ouellette
10/23/2025, 4:57 PM
Marvin
10/23/2025, 4:57 PM
Marvin
10/23/2025, 4:59 PM
prefect profile create worker-local
prefect --profile worker-local config set PREFECT_API_URL=http://localhost:4200/api
Optional resilience tweaks:
prefect --profile worker-local config set PREFECT_CLIENT_MAX_RETRIES=30
prefect --profile worker-local config set PREFECT_CLIENT_RETRY_EXTRA_CODES=404,502,503,504
prefect --profile worker-local config set PREFECT_WORKER_QUERY_SECONDS=10
- Start the worker with that profile:
prefect --profile worker-local worker start --pool <pool-name> --type process --with-healthcheck
Option B: Set env vars only for the worker process
- In the shell or service that launches the worker:
export PREFECT_API_URL=http://localhost:4200/api
export PREFECT_CLIENT_MAX_RETRIES=30
export PREFECT_CLIENT_RETRY_EXTRA_CODES=404,502,503,504
export PREFECT_WORKER_QUERY_SECONDS=10
prefect worker start --pool <pool-name> --type process --with-healthcheck
- Everywhere else (your laptop, other machines) continue using the proxy URL (e.g., https://<your-runpod-proxy>/api) for PREFECT_API_URL, or just access the UI via the proxy in your browser.
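With Option B it is easy to launch the worker from a shell or service where one of the exports is missing, in which case the worker falls back to whatever the active profile says. A minimal pre-flight sketch, where `missing_settings` is a hypothetical helper and not part of Prefect:

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: print the name of every listed variable that
# is not exported (or is empty) and return non-zero if any were missing.
# printenv only sees exported variables, which is what the worker inherits.
missing_settings() {
  local name missing=0
  for name in "$@"; do
    if [ -z "$(printenv "$name")" ]; then
      echo "$name"
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `missing_settings PREFECT_API_URL PREFECT_CLIENT_MAX_RETRIES PREFECT_CLIENT_RETRY_EXTRA_CODES PREFECT_WORKER_QUERY_SECONDS && prefect worker start --pool <pool-name> --type process --with-healthcheck` only starts the worker when all four variables are exported.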
Notes
- Prefect does not support multiple fallback API URLs in a single process; using per-process profiles/env is the supported way.
- Ensure your Prefect Server listens on 0.0.0.0:4200 inside RunPod so it’s reachable by both the proxy and localhost on the pod.
- If you’re using Docker Compose, set these environment variables only on the worker service, while keeping the server/UI service configured for the proxy.
Docs for reference:
- Workers: https://docs-3.prefect.io/v3/concepts/workers
- Worker healthchecks: https://docs-3.prefect.io/v3/advanced/worker-healthchecks
- Server: https://docs-3.prefect.io/v3/concepts/server