< Marvin> I have a prefect server running in the background Prefect Community #ask-marvin

Tom Han

10/02/2025, 4:30 PM

@Marvin I have a Prefect server running in the background. I just ran

uv pip install -U prefect

Did I hot-upgrade my Prefect server, or should I restart it?

Marvin

10/02/2025, 4:32 PM

Short answer: no, a running Prefect Server won’t hot-upgrade just because you upgraded the package. You should restart it to pick up the new version.

Recommended steps (Prefect 3.x):

1) Stop the running server

prefect server stop

2) Apply database migrations (recommended after upgrades)

prefect server database upgrade -y

- If you have a large DB, you can increase the DB timeout before running the upgrade:

export PREFECT_API_DATABASE_TIMEOUT=600

3) Start the server again

prefect server start

Notes:

- If you’re running Server via Docker/systemd, restart the container/service instead of using the CLI stop/start.
- Flow runs already executing on workers aren’t killed by a brief server restart, but scheduling and the UI will be briefly unavailable.
- You can verify the installed version with:

prefect version

After restart, the server will use that version.

Docs for reference:

- Server CLI how-to
- Server CLI API reference
- Running Prefect processes in the background
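The stop, migrate, start sequence above could be scripted roughly as follows. This is only a sketch: the three CLI commands and the PREFECT_API_DATABASE_TIMEOUT variable are the ones from this thread, while `RESTART_STEPS`, `build_env`, and `run_steps` are hypothetical helper names. It defaults to a dry run that only prints the commands.

```python
import os
import shlex
import subprocess

# Commands taken from the steps in this thread; the names below are hypothetical.
RESTART_STEPS = [
    ["prefect", "server", "stop"],
    ["prefect", "server", "database", "upgrade", "-y"],
    ["prefect", "server", "start"],
]


def build_env(db_timeout_seconds=None, base_env=None):
    """Copy the environment and optionally raise the database timeout."""
    env = dict(os.environ if base_env is None else base_env)
    if db_timeout_seconds is not None:
        env["PREFECT_API_DATABASE_TIMEOUT"] = str(db_timeout_seconds)
    return env


def run_steps(steps, env, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    printed = []
    for argv in steps:
        line = shlex.join(argv)
        print(line)
        printed.append(line)
        if not dry_run:
            # check=True stops the sequence at the first failing command.
            # Assumption: the final start command runs in the foreground,
            # so a real run would not return until the server exits.
            subprocess.run(argv, check=True, env=env)
    return printed


if __name__ == "__main__":
    run_steps(RESTART_STEPS, build_env(db_timeout_seconds=600), dry_run=True)
```

Setting the timeout in the environment passed to every step means it is already in place when the migration runs.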

Tom Han

10/02/2025, 4:51 PM

@Marvin Would my workers panic if my server is down for a few minutes? They’re also emitting heartbeats.

Marvin

10/02/2025, 4:53 PM

Short answer: your workers won’t panic. A brief Server/API outage of a few minutes is generally fine.

What happens during a short outage (Prefect 3.x):

- Running flow runs: keep executing on the worker/infrastructure. State updates and logs may be delayed until the API is back.
- New work: workers can’t fetch/claim new runs while the API is down.
- Heartbeats: workers will miss heartbeats during the outage and may show “unhealthy” in the UI if the gap is long enough, but they’ll automatically reconnect and resume polling when the API returns.
- Scheduler/UI: scheduling and UI updates pause, then catch up after recovery.

If you expect a longer outage, you can plan a maintenance window, but for a few minutes you’re typically safe.

Refs:

- Worker health and behavior
- Server CLI how-to
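If you want to confirm from the worker host that the API is back after a restart, a small polling loop is one option. The sketch below assumes the server answers a GET on the `/health` path under the API URL (for example http://10.1.37.163:4200/api/health); `api_is_healthy` and `wait_for_api` are hypothetical helper names, not part of Prefect.

```python
import http.client
import time
import urllib.error
import urllib.request


def api_is_healthy(api_url, timeout=5.0):
    """Return True if GET <api_url>/health answers with HTTP 200.

    Assumption: api_url is the same value as PREFECT_API_URL and the
    server exposes a health route beneath it.
    """
    url = f"{api_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, http.client.HTTPException, OSError):
        # Connection refused, timeouts, and HTTP errors all count as "down".
        return False


def wait_for_api(probe, attempts=30, delay=10.0, sleep=time.sleep):
    """Call probe() until it returns True.

    Returns the 1-based attempt number that succeeded, or None if every
    attempt failed. No sleep happens after the final attempt.
    """
    for attempt in range(1, attempts + 1):
        if probe():
            return attempt
        if attempt < attempts:
            sleep(delay)
    return None
```

A call such as `wait_for_api(lambda: api_is_healthy("http://10.1.37.163:4200/api"))` would poll for up to about five minutes with the defaults shown.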

Tom Han

10/02/2025, 5:06 PM

@Marvin

| INFO | prefect.deployment - Pulled code using block 'github-repository/hcm-core-repo' into 'github-repository-hcm-core-repo'
151018.100 | WARNING | prefect.client - Your Prefect server is running an older version of Prefect than your client which may result in unexpected behavior. Please upgrade your Prefect server from version 3.4.17 to version 3.4.19 or higher.
151018.314 | INFO | Flow run 'intelligent-buffalo' - Beginning flow run 'intelligent-buffalo' for flow 'inference'
151018.317 | INFO | Flow run 'intelligent-buffalo' - View at http://10.1.37.163:4200/runs/flow-run/0d8e1dc0-beed-4532-ad43-c94dac038181
151033.274 | INFO | Flow run 'intelligent-buffalo' - Running inference on video: /root/vast/carlos/HCM_Sync/talmolab/2025-02-12-AzimALS/cam_07/2025-08-09-00-01-05/cam_07.10.mp4 using model: /root/vast/than/als2h_basemodel/0922CVATNN/models/Leopard, output_path: /root/vast/than/preds_v4/talmolab/2025-02-12-AzimALS/cam_07/2025-08-09-00-01-05/cam_07.10.predictions.slp, kwargs: {'batch_size': 8, 'queue_maxsize': 16, 'tracking': True, 'candidates_method': 'local_queues', 'max_tracks': 3, 'max_instances': 3}
151033.277 | INFO | Flow run 'intelligent-buffalo' - 2025-10-02 151033 | INFO | sleap_nn.predictrun inference309 | Started inference at: 2025-10-02 151033.277158
151033.314 | INFO | Flow run 'intelligent-buffalo' - 2025-10-02 151033 | INFO | sleap_nn.predictrun inference325 | Using device: cuda
165951.992 | WARNING | prefect.events.clients - Unable to connect to 'ws://10.1.37.163:4200/api/events/in'. Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: [Errno 111] Connect call failed ('10.1.37.163', 4200). Set PREFECT_DEBUG_MODE=1 to see the full error.

NOOOOOOOOOO WHY MARVIN SAVE ME

Marvin

10/02/2025, 5:07 PM

Deep breath! The message means your worker lost its websocket connection to your Prefect API at ws://10.1.37.163:4200/api/events/in. When that happens:

- Your GPU job keeps running; logs/state events may be dropped while disconnected
- The UI will stop updating during the outage
- When the API is reachable again, the worker will reconnect and continue

You also have a version skew: your client/worker is newer than your Server (3.4.19+ vs Server 3.4.17). The skew is what triggers the version warning. The connection failure itself ([Errno 111], connection refused) typically means nothing was accepting connections on 10.1.37.163:4200 at that moment, which is what you’d see while the Server is stopped or restarting. Best fix is to upgrade and restart the Server so it matches your client.

Do this on the Server host:

1) Stop the server

prefect server stop

2) Upgrade Prefect and apply DB migrations

uv pip install -U prefect
prefect server database upgrade -y

- Large DB? Consider setting this before running the upgrade:

export PREFECT_API_DATABASE_TIMEOUT=600

3) Start the server

prefect server start

Useful checks:

- Can you load the UI at http://10.1.37.163:4200 now?
- Verify services status:

prefect server services ls

- Confirm both server and worker report the same version:

prefect version

- If you’re behind a proxy or firewall, ensure websockets are allowed to 10.1.37.163:4200 (Connection: Upgrade headers, no blocking of ws/wss).

If you need deeper error detail on the worker side, set:

export PREFECT_DEBUG_MODE=1

and rerun to see the full stack trace for the websocket connection.

You should be okay: your inference run likely continued locally, but some logs/events during the disconnect may not appear in the UI.

Refs:

- Server CLI how-to
- Server CLI API reference
- Worker health and behavior

Want me to help verify your server and worker versions and settings? If so, let me know how you’re starting the server (CLI, Docker, systemd) and the value of PREFECT_API_URL on the worker.
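To check for version skew without reading logs by eye, the two versions can be pulled out of the client warning and compared numerically. This is a rough sketch: the pattern is written against the warning text pasted in this thread, and `skew_from_warning`, `parse_version`, and `server_is_older` are hypothetical helper names. It handles plain dotted numeric versions only.

```python
import re

# Matches the "from version X to version Y" part of the client warning above.
SKEW_PATTERN = re.compile(
    r"from version (\d+(?:\.\d+)*) to version (\d+(?:\.\d+)*)"
)


def parse_version(text):
    """Turn '3.4.17' into (3, 4, 17) so comparison is numeric, not textual.

    Assumption: no pre-release or local suffixes (e.g. 'rc1'); those would
    raise ValueError here.
    """
    return tuple(int(part) for part in text.split("."))


def skew_from_warning(line):
    """Return (server_version, client_version) from the warning, or None."""
    match = SKEW_PATTERN.search(line)
    if match is None:
        return None
    return match.group(1), match.group(2)


def server_is_older(server_version, client_version):
    """True when the server version sorts strictly below the client version."""
    return parse_version(server_version) < parse_version(client_version)
```

Comparing integer tuples avoids the string-ordering trap where "3.4.2" would sort after "3.4.19".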
