<@ULVA73B9P> Failed to submit flow run '64e4894b-2...
# ask-marvin
s
@Marvin Failed to submit flow run '64e4894b-2c89-4727-be71-14e74afd1fc0' to infrastructure.
Traceback (most recent call last):
  File "/usr/local/lib/python3.11/site-packages/prefect/workers/base.py", line 1279, in _submit_run_and_capture_errors
    await self._give_worker_labels_to_flow_run(flow_run.id)
  File "/usr/local/lib/python3.11/site-packages/prefect/workers/base.py", line 1562, in _give_worker_labels_to_flow_run
    await self._client.update_flow_run_labels(flow_run_id, labels)
  File "/usr/local/lib/python3.11/site-packages/prefect/client/orchestration/_flow_runs/client.py", line 1027, in update_flow_run_labels
    response = await self.request(
               ^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/client/orchestration/base.py", line 53, in request
    return await self._client.send(request)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/prefect/client/base.py", line 379, in send
    response.raise_for_status()
  File "/usr/local/lib/python3.11/site-packages/prefect/client/base.py", line 163, in raise_for_status
    raise PrefectHTTPStatusError.from_httpx_error(exc) from exc.__cause__
prefect.exceptions.PrefectHTTPStatusError: Client error '404 Not Found' for url '****/labels'
Response: {'detail': 'Not Found'}
For more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404

08:05:16 PM prefect.flow_runs.worker
Reported flow run '64e4894b-2c89-4727-be71-14e74afd1fc0' as crashed: Flow run could not be submitted to infrastructure: PrefectHTTPStatusError("Client error '404 Not Found' for url'*****/labels': {'detail': 'Not Found'}\nFor more information check: https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404")
m
Thanks for sharing the stack trace, that’s really helpful. Short version: your worker is trying to update the flow run’s labels at the API route `/flow_runs/{id}/labels`, but the server you’re talking to doesn’t expose that endpoint, so you get a 404. This almost always points to a version mismatch where the worker/client is newer than the server.

What to check/try:
- Are you on Prefect Cloud or self-hosting Prefect Server?
- Please run `prefect version` on the following and share both outputs:
  - the machine/container running your worker
  - the machine/container running your server (if self-hosted)

Immediate remediation options:
- Upgrade your Prefect Server to the latest 3.x (and keep your workers on the same version).
- If you can’t upgrade the server right now, temporarily pin your worker to a 3.0.x version (before workers attempted to write labels to flow runs).

Quick self-checks:
- Open your API docs at `<YOUR_PREFECT_API_URL>/docs` and search for “labels” or “update_flow_run_labels”. If it’s missing, your server version does not support that route yet.
- This route is documented here: Update flow run labels (Prefect 3.x API)
- There isn’t a setting to disable label updates from workers (no `PREFECT_API_ENABLE_LABELS` or similar), so compatibility is the fix.

Related report for context:
- This GitHub issue describes the same 404 on `/flow_runs/{id}/labels` when versions diverged: Process workpool doesn’t make a flow run

If you can share:
- Prefect versions (server and worker)
- Whether you’re in Cloud or OSS
- The server image tag (if self-hosted)

…I can give you a precise upgrade/pin recommendation.
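The "search the API docs for labels" self-check can also be scripted. Here is a minimal sketch, assuming the server publishes an OpenAPI JSON document next to its docs page (the exact URL is an assumption and should be confirmed on your server). `find_label_routes` and `fetch_openapi` are hypothetical helper names, not part of Prefect.

```python
import json
from urllib.request import urlopen


def find_label_routes(openapi_spec: dict) -> list[str]:
    """Return every path in an OpenAPI document that ends in '/labels'."""
    paths = openapi_spec.get("paths", {})
    return sorted(p for p in paths if p.rstrip("/").endswith("/labels"))


def fetch_openapi(spec_url: str, timeout: float = 10.0) -> dict:
    """Download and parse an OpenAPI JSON document.

    spec_url is an assumption: FastAPI-style apps usually publish the
    schema as 'openapi.json' next to the docs page, but confirm the
    location on your own server before relying on it.
    """
    with urlopen(spec_url, timeout=timeout) as response:
        return json.load(response)
```

If `find_label_routes(fetch_openapi(...))` comes back empty, the server does not advertise a labels route, which is consistent with the 404 above.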
s
@Marvin It seems that this is a prefect 3.xx thing but I'm supposed to be on prefect 2.14.20. My local machine is on 2.14.20 and I am self hosting.
m
Thanks, that context helps. What your traceback shows is a Prefect 3.x client/worker trying to call the 3.x endpoint `/flow_runs/{id}/labels`. That endpoint does not exist on a Prefect 2.x server, so the server returns 404. In other words: your worker (or whatever is submitting the run) is running Prefect 3.x, but your self‑hosted server is Prefect 2.14.20.

Key point: Prefect 2 and Prefect 3 are not API-compatible. A 3.x client will try to hit endpoints (like flow run labels) that a 2.x server does not have, causing exactly this 404.

How to confirm quickly
- On the worker machine/container, run:

    prefect version

- On your server container/host, run:

    prefect version

You’ll likely see 3.x on the worker and 2.14.20 on the server.

Ways to fix (pick one)
- Keep everything on Prefect 2.14.20
  - Pin the worker environment to 2.14.20:
    - pip:

          pip install "prefect==2.14.20"

    - Docker image/tag (example):

          prefecthq/prefect:2.14.20-python3.11

  - Make sure any CI/CD or container rebuilds aren’t pulling `latest` (which is 3.x now).
- Or upgrade fully to Prefect 3.x
  - Upgrade your self-hosted server and all workers to the same 3.x version.
  - Note: in 3.x, workers do add their labels to flow runs; the route is documented here: Update flow run labels (Prefect 3.x API)

Related report for context (same 404 on /labels when versions diverged):
- Process work pool 404 on /flow_runs/{id}/labels

If you share:
- the `prefect version` output from your worker and server
- how you’re running the worker (pip/venv vs Docker/Kubernetes) and the exact image/tag or package pin

…I can give you exact pin/upgrade commands for your setup.
s
on server hosting prefect:
prefect version
Version: 2.14.20
API version: 0.8.4
Python version: 3.9.23
Git commit: 8ceb0962
Built: Thu, Feb 1, 2024 6:30 PM
OS/Arch: linux/x86_64
Profile: ephemeral
Server type: server
and same locally @Marvin
m
Thanks for the details. Even though your server and your local CLI both report 2.14.20, the traceback shows a worker running in a different environment:
- It’s using Python 3.11 (`/usr/local/lib/python3.11/site-packages/...`)
- It’s calling the 3.x-only endpoint `/flow_runs/{id}/labels` via `update_flow_run_labels`

Prefect 2.x servers don’t have that labels route, so a 3.x worker/client will get a 404 exactly like you’re seeing.

What likely happened
- The worker that actually picked up the flow is running somewhere else (container/VM) with Prefect 3.x (often due to using an image/tag like `prefecthq/prefect:latest`).
- Your server remains on 2.14.20, causing a mismatch.

Confirm the worker version (on the machine/container that runs the worker)
- Process worker on a host:

    which prefect
    prefect version
    python -c "import prefect, sys; print(prefect.__version__, sys.version)"

- Docker worker:

    docker ps | grep prefect
    docker exec -it <container_id> prefect version

- Kubernetes worker:

    kubectl get pods -n <ns>
    kubectl exec -it <pod> -n <ns> -- prefect version

You should see 3.x on the worker that threw this error.

Fix options (pick one and keep everything aligned):
- Stay on Prefect 2.14.20 everywhere
  - Pin the worker to 2.14.20 and restart it:

        pip install "prefect==2.14.20"

    or, for Docker/K8s, use an explicit image tag like:

        prefecthq/prefect:2.14.20-python3.11

    Make sure your templates/manifests aren’t using `:latest`.
- Or upgrade fully to Prefect 3.x (server and all workers)
  - If you go this route, that labels endpoint exists and the error goes away. Route reference: Update flow run labels (Prefect 3.x API)

Related report (same 404 when versions diverged):
- Process work pool 404 on /flow_runs/{id}/labels

If you share:
- how your worker is deployed (Process, Docker, Kubernetes)
- the exact image/tag or Python env it uses
- the output of `prefect version` from inside that worker

…I can give you the exact pin/upgrade steps for your setup.
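Once you have `prefect version` output from both sides, the comparison itself is mechanical. Here is a minimal sketch that pulls the `Version:` field out of each output and compares major versions; `parse_prefect_version` and `same_major` are hypothetical helper names, and the only format assumed is the `Version: X.Y.Z` field shown in the server output earlier in this thread.

```python
import re


def parse_prefect_version(cli_output: str) -> tuple[int, ...]:
    """Extract the 'Version:' field from `prefect version` output.

    The match is case-sensitive on purpose, so the 'API version:' and
    'Python version:' fields are skipped.
    """
    match = re.search(r"\bVersion:\s*(\d+(?:\.\d+)*)", cli_output)
    if match is None:
        raise ValueError("no 'Version:' field found in the output")
    return tuple(int(part) for part in match.group(1).split("."))


def same_major(worker_output: str, server_output: str) -> bool:
    """True when worker and server report the same major Prefect version."""
    worker = parse_prefect_version(worker_output)
    server = parse_prefect_version(server_output)
    return worker[0] == server[0]
```

Feeding it the text captured from inside the worker container and from the server gives a single yes/no answer on whether the two sides are on the same major line.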
s
@Marvin
16:36:27.305 | WARNING | prefect.events.clients - Unable to connect to 'ws://****.ap-south-1.compute.amazonaws.com:4200/api/events/in'. Please check your network settings to ensure websocket connections to the API are allowed. Otherwise event data (including task run data) may be lost. Reason: server rejected WebSocket connection: HTTP 403. Set PREFECT_DEBUG_MODE=1 to see the full error.
m
That 403 on the events websocket is another strong signal that something in your runtime is using Prefect 3.x against your 2.14.20 server.

Why I say that:
- The log comes from `prefect.events.clients` and targets `/api/events/in` over a WebSocket. That events ingress WebSocket is a Prefect 3.x feature. A 2.x server does not expose this route. A 3.x client will try to open it and, depending on your proxy, you’ll get a 403/404 instead of a clean upgrade.

First, please confirm the version in the environment producing this warning (the worker/container/pod that’s running your flow):
- If it’s a host process:

    which prefect
    prefect version
    python -c "import prefect, sys; print(prefect.__version__, sys.version)"

- If it’s Docker:

    docker ps | grep prefect
    docker exec -it <container_id> prefect version

- If it’s Kubernetes:

    kubectl get pods -n <ns>
    kubectl exec -it <pod> -n <ns> -- prefect version

If that shows 3.x, the fix is to align versions:
- Stay on Prefect 2.14.20 everywhere
  - Pin your worker image/env to 2.14.20 and restart it
  - Do not use `:latest` images; use explicit tags like `prefecthq/prefect:2.14.20-python3.11` or `pip install "prefect==2.14.20"`
- Or upgrade your server and all workers to 3.x together

If you truly are 2.14.20 everywhere and still see a WebSocket 403:
- That would point to your reverse proxy/LB blocking WS upgrades. Ensure it forwards WebSocket upgrades to the Prefect API.
- Nginx example:
map $http_upgrade $connection_upgrade {
  default upgrade;
  ''      close;
}

location /api/ {
  proxy_pass http://prefect-server:4200;
  proxy_http_version 1.1;
  proxy_set_header Upgrade $http_upgrade;
  proxy_set_header Connection $connection_upgrade;
  proxy_set_header Host $host;
  proxy_read_timeout 600s;
}
- AWS ALB:
  - ALB supports WebSockets over HTTP/1.1; ensure listener/protocols are HTTP/HTTPS, idle timeout >= 300s, security groups/WAF allow Upgrade/Connection headers and the `/api/events/in` path to the target.
- Also make sure your `PREFECT_API_URL` scheme matches your deployment. If you terminate TLS at the proxy, clients should reach `https://.../api` (which implies `wss://` for WebSockets), not `ws://` over a TLS-only endpoint.

To get more detail, set:

    export PREFECT_DEBUG_MODE=1

and reproduce; share the expanded error and I can help pinpoint whether it’s a proxy/WAF rejection or a version mismatch.

Given your earlier 404 on `/flow_runs/{id}/labels`, I’d prioritize verifying the worker’s Prefect version first, since aligning versions will likely clear both errors. If you share how your worker is deployed (Process, Docker, K8s) and the exact image/tag or pip pin, I’ll provide exact pin/upgrade steps.
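To make the scheme point concrete, here is a small sketch of how an HTTP API URL maps to the WebSocket URL seen in the warning. It assumes the events socket lives at `/events/in` under the API base, which is inferred from the log line in this thread rather than taken from Prefect's source; `events_in_url` is a hypothetical helper name.

```python
from urllib.parse import urlsplit, urlunsplit


def events_in_url(api_url: str) -> str:
    """Map an API base URL to the matching events WebSocket URL.

    http  -> ws
    https -> wss
    The '/events/in' suffix is an assumption based on the warning above.
    """
    parts = urlsplit(api_url)
    scheme = {"http": "ws", "https": "wss"}.get(parts.scheme)
    if scheme is None:
        raise ValueError(f"unsupported scheme in API URL: {parts.scheme!r}")
    path = parts.path.rstrip("/") + "/events/in"
    return urlunsplit((scheme, parts.netloc, path, "", ""))
```

An `http://` API URL therefore always produces a `ws://` socket URL. If the proxy only accepts TLS, the API URL itself has to be `https://` for the client to attempt `wss://`.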
s
@Marvin failure is at worker level but here is my task definition: { "compatibilities": [ "EC2", "FARGATE" ], "containerDefinitions": [ { "command": [ "/bin/sh", "-c", "prefect version && python -c 'import pydantic, prefect; print(\"pydantic:\", pydantic.version)' && pip install --no-cache-dir 'prefect-aws==0.3.1' 'pydantic<2' 'boto3<2' && prefect worker start --pool ecs-two-flows --type ecs" ], "cpu": 512, "environment": [ { "name": "PREFECT_API_URL", "value": "http://***:4200/api" } ], "essential": true, "image": "prefecthq/prefect:2.14.20-python3.11", "logConfiguration": { "logDriver": "awslogs", "options": { "awslogs-group": "/ecs/prefect-worker-logs", "awslogs-create-group": "true", "awslogs-region": "ap-south-1", "awslogs-stream-prefix": "ecs" } }, "memory": 1024, "mountPoints": [], "name": "prefect-worker", "portMappings": [], "systemControls": [], "volumesFrom": [] } ], "cpu": "512", "executionRoleArn": "arn:aws:iam::971422704224:role/two-flows-prefect-worker-role", "family": "two-flows-prefect-worker-task", "memory": "1024", "networkMode": "awsvpc", "placementConstraints": [], "registeredAt": "2025-09-04T20:16:09.300Z", "registeredBy": "arn:aws:iam::971422704224:root", "requiresAttributes": [ { "name": "com.amazonaws.ecs.capability.logging-driver.awslogs" }, { "name": "ecs.capability.execution-role-awslogs" }, { "name": "com.amazonaws.ecs.capability.docker-remote-api.1.19" }, { "name": "com.amazonaws.ecs.capability.task-iam-role" }, { "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18" }, { "name": "ecs.capability.task-eni" }, { "name": "com.amazonaws.ecs.capability.docker-remote-api.1.29" } ], "requiresCompatibilities": [ "FARGATE" ], "revision": 6, "status": "ACTIVE", "taskDefinitionArn": "arn:aws:ecs:ap-south-1:971422704224:task-definition/two-flows-prefect-worker-task:6", "taskRoleArn": "arn:aws:iam::971422704224:role/two-flows-prefect-worker-role", "volumes": [], "tags": [] }
m
Got it, thanks for sharing the ECS task definition. The root cause is in the container command:
- You start from an image pinned to Prefect 2.14.20 (good)
- Then you run `pip install prefect-aws==0.3.1` right before starting the worker

prefect-aws 0.3.x is for Prefect 3.x. Installing it will upgrade Prefect inside the container from 2.14.20 to 3.x after your initial `prefect version` check. That’s why:
- the worker later behaves like a 3.x client (trying to call `/flow_runs/{id}/labels` and the events websocket), and
- your 2.x server returns 404/403.

Fix options
- Stay on Prefect 2.14.20 everywhere
  - Use a Prefect 2-compatible AWS integration: pin `prefect-aws` to `<0.3.0` (the 0.2.x line)
  - Explicitly pin Prefect in the pip step to prevent upgrades
  - You don’t need to pin `pydantic`; Prefect 2 will bring Pydantic v1 automatically

Example ECS command (reorders to install first, then prints versions so you see the final resolved state):

    /bin/sh -c "
      pip install --no-cache-dir --upgrade --upgrade-strategy only-if-needed \
        'prefect==2.14.20' 'prefect-aws<0.3.0' 'boto3<2' \
      && python -c 'import prefect, pydantic; print(\"prefect:\", prefect.__version__, \"pydantic:\", pydantic.__version__)' \
      && prefect version \
      && prefect worker start --pool ecs-two-flows --type ecs
    "

Even better, bake a custom image with those packages preinstalled and drop the runtime pip installs to reduce startup time and avoid accidental upgrades.

- Or upgrade to Prefect 3.x end-to-end
  - Upgrade your self-hosted server and all workers to the same 3.x version
  - Then `prefect-aws==0.3.x` is correct, and the labels endpoint and events websocket will work

Why you saw the issues
- 404 on `/flow_runs/{id}/labels`: 3.x client feature hitting a 2.x server that doesn’t have that route
- 403 on `/api/events/in` websocket: 3.x client tries to open the events WS; your 2.x server/proxy rejects it

Reference (same symptom when versions diverge):
- https://github.com/PrefectHQ/prefect/issues/16828

If you’d like, share the worker logs after applying the change (the `python -c ...` printout + `prefect version`) and I’ll double-check the final versions.
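Whatever the pip step ends up resolving, a small guard run just before `prefect worker start` can refuse to continue when the installed major version is not the one you expect. Here is a minimal sketch using only the standard library; `major_of` and `check_major` are hypothetical helper names, and the expected major (2 in this thread) is passed in by the caller.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def major_of(version_string: str) -> int:
    """Return the leading integer of a dotted version such as '2.14.20'."""
    return int(version_string.split(".", 1)[0])


def check_major(package: str, expected_major: int) -> Optional[str]:
    """Return an error message if the installed major version differs.

    Returns None when the package is installed and its major version
    matches expected_major.
    """
    try:
        found = version(package)
    except PackageNotFoundError:
        return f"{package} is not installed"
    if major_of(found) != expected_major:
        return f"{package} {found} is installed, expected major version {expected_major}"
    return None
```

In the container command this could run as a one-off script that exits non-zero whenever `check_major("prefect", 2)` returns a message, so the task fails at startup instead of crashing flow runs later.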