# ask-marvin
p
@Marvin I saw in the logs that my worker crashed after getting a response
Response: {'detail': 'Flow run not found'}
(expected, since I deleted this flow run while it was running). Is this normal behavior? And how can I prevent the worker from crashing on it, since it also impacts my other running flows?
```
Sep 25 08:43:54 XXXXX bash[1975]:    | prefect.exceptions.PrefectHTTPStatusError: Client error '404 Not Found' for url '<http://XXXXXXX.XXXX.XXX:4200/api/flow_runs/693815dd-2ec0-4017-98f7-75a628269dd6>'
Sep 25 08:43:54 dtftpn01 bash[1975]:    | Response: {'detail': 'Flow run not found'}
Sep 25 08:43:54 XXXXX bash[1975]:    | For more information check: <https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/404>
Sep 25 08:43:54 XXXXX bash[1975]:    |
Sep 25 08:43:54 XXXXX bash[1975]:    | The above exception was the direct cause of the following exception:
Sep 25 08:43:54 XXXXX bash[1975]:    |
Sep 25 08:43:54 XXXXX bash[1975]:    | Traceback (most recent call last):
Sep 25 08:43:54 XXXXX bash[1975]:    |   File "/data/Miniconda3-py311_24.1.2-0-Linux-x86_64/envs/datafetch/lib/python3.12/site-packages/prefect/runner/runner.py", line 1333, in _submit_run_and_capture_errors
Sep 25 08:43:54 dtftpn01 bash[1975]:    |     api_flow_run = await self._client.read_flow_run(flow_run_id=flow_run.id)
Sep 25 08:43:54 XXXXX bash[1975]:    |                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Sep 25 08:43:54 XXXXXX bash[1975]:    |   File "/data/Miniconda3-py311_24.1.2-0-Linux-x86_64/envs/datafetch/lib/python3.12/site-packages/prefect/client/orchestration/_flow_runs/client.py", line 721, in read_flow_run
Sep 25 08:43:54 XXXXX bash[1975]:    |     raise ObjectNotFound(http_exc=e) from e
Sep 25 08:43:54 XXXXX bash[1975]:    | prefect.exceptions.ObjectNotFound: None
Sep 25 08:43:54 XXXXX bash[1975]:    +------------------------------------
Sep 25 08:43:54 XXXX bash[1975]: An exception occurred.
Sep 25 08:43:54 XXXXX bash[1903]: INFO:     10.200.2.1:43468 - "POST /api/logs/ HTTP/1.1" 201 Created
Sep 25 08:43:54 XXXX bash[1903]: INFO:     connection closed
Sep 25 08:43:55 XXXXX systemd[1]: prefect-agent.service: Main process exited, code=exited, status=1/FAILURE
Sep 25 08:43:55 XXXXXX01 systemd[1]: prefect-agent.service: Failed with result 'exit-code'.
```
m
Error: Tool 'read_github_issues' exceeded max retries count of 1
Sorry, I encountered an error while processing your request. Please try again.
m
thought for 184.9 seconds
Short answer:
- The 404 “Flow run not found” is expected if you delete a flow run while the worker is still interacting with it.
- The worker process exiting because of this is not desirable. Ideally, only that single run should fail and the worker should continue polling (see the client-side sketch after this reply).

How to avoid this and keep other runs unaffected:

1) Prefer cancel over delete
- If a run is still executing, cancel it instead of deleting it. This lets the worker cleanly move the run to a terminal state and tear down the underlying infrastructure.
- CLI:
```bash
prefect flow-run cancel <FLOW_RUN_ID>
```
- After it reaches a terminal state, you can safely delete it if you need to:
```bash
prefect flow-run delete <FLOW_RUN_ID>
```
- Docs: Cancel workflows and Flow runs API

2) Upgrade Prefect
- If your worker is fully exiting on this 404, that may be version-specific behavior. Upgrade both your Prefect server and workers to the latest 3.x so any fixes and improvements to error handling apply.
- Latest release notes: Prefect 3.x changelog

3) Make your worker resilient at the process level
- Since you’re running under systemd, set auto-restart so a single failure doesn’t take down your capacity:
```ini
[Service]
Restart=always
RestartSec=5
```
- Consider running more than one worker (or multiple replicas) against the same work pool so one worker exiting doesn’t pause all work.

4) Separate critical workloads
- If one worker handles many important runs, consider separate work pools and workers per workload so an issue in one pool won’t impact the others.

A couple of clarifying questions so I can be more specific:
- Which Prefect version are you running on the worker and server? (`prefect version`)
- How exactly are you starting the worker in your systemd unit (the command in ExecStart)?
- Did you delete the run via the UI, CLI, or API?

If you share those details, I can help confirm whether you’re hitting a known behavior in your version and suggest the most targeted fix.
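To make the first point concrete, here is a minimal sketch of how that 404 surfaces on the Python side, built around the same `read_flow_run` call and `ObjectNotFound` exception that appear in the traceback above. It assumes the standard `prefect.get_client()` helper and reuses the (already deleted) flow-run ID from the log; treat it as an illustration of per-run handling, not as the runner's actual code.

```python
# Sketch only: shows how the 404 from the log surfaces as ObjectNotFound and how a
# caller can treat it as a per-run condition rather than a fatal one.
# get_client and ObjectNotFound are standard Prefect APIs; the UUID below is the
# (already deleted) flow run from the log.
import asyncio
from uuid import UUID

from prefect import get_client
from prefect.exceptions import ObjectNotFound


async def check_run(flow_run_id: UUID) -> None:
    async with get_client() as client:
        try:
            flow_run = await client.read_flow_run(flow_run_id)
            print(f"Run still exists, state: {flow_run.state.name if flow_run.state else 'unknown'}")
        except ObjectNotFound:
            # The run was deleted on the server; fail or skip just this run
            # instead of letting the exception take down the whole process.
            print(f"Flow run {flow_run_id} no longer exists; skipping it.")


asyncio.run(check_run(UUID("693815dd-2ec0-4017-98f7-75a628269dd6")))
```

In the crash above, this lookup happens inside the runner itself (the `_submit_run_and_capture_errors` frame in the traceback), so the equivalent handling would have to live in Prefect, which is why the upgrade suggestion matters.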
p
@Marvin I use this command to create my worker
```bash
prefect worker start --pool process-pool --type process --install-policy if-not-present
```
How can I create multiple workers?
m
thought for 74.5 seconds
Yes — you can run multiple workers against the same work pool. Each worker is just another process that polls the pool. Prefect handles leasing so only one worker will pick up a given flow run.

Ways to start multiple workers:

- Run multiple processes (same host)
  - Just run your command multiple times with unique names:
```bash
prefect worker start --pool process-pool --type process --install-policy if-not-present --name process-1
prefect worker start --pool process-pool --type process --install-policy if-not-present --name process-2
prefect worker start --pool process-pool --type process --install-policy if-not-present --name process-3
```
- Optional: assign different queues to split workloads:
```bash
prefect worker start --pool process-pool --work-queue high --name process-high
prefect worker start --pool process-pool --work-queue low  --name process-low
```
- Systemd (recommended for resilience)
  - Use a templated unit so you can run N instances:
```ini
# /etc/systemd/system/prefect-worker@.service
[Unit]
Description=Prefect worker %i
After=network-online.target
Wants=network-online.target

[Service]
# Adjust user/working directory/environment as needed
User=prefect
WorkingDirectory=/opt/prefect
Environment="PREFECT_API_URL=http://<your-server>:4200/api"
ExecStart=/usr/bin/env bash -lc 'prefect worker start --pool process-pool --type process --install-policy if-not-present --name %i'
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```
Then:
```bash
sudo systemctl daemon-reload
sudo systemctl enable --now prefect-worker@process-1
sudo systemctl enable --now prefect-worker@process-2
sudo systemctl enable --now prefect-worker@process-3
```
This also ensures a crash in one worker won’t take down your capacity.

- Multiple machines/containers
  - Run the same command (with a unique `--name`) on as many hosts/containers as you like, all pointing to the same `--pool process-pool`.

Notes:
- Use `--name` to keep workers easy to identify in the UI.
- Make sure each worker has access to any required credentials, code, and environment (a small preflight sketch follows this reply).
- If you were seeing a worker exit on a deleted flow run, having multiple workers plus systemd `Restart=always` will isolate the impact and auto-recover.

If you’d like, share your current systemd unit (ExecStart, environment), and I can tailor a ready-to-drop-in template for N workers on your host.
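As a companion to the note about credentials, code, and environment, here is a small preflight sketch a new worker host could run before its unit is enabled. `PREFECT_API_URL`, `get_client`, and `read_work_pool` are standard Prefect APIs assumed here; the pool name matches the `--pool process-pool` used throughout the thread, and the check is illustrative rather than exhaustive.

```python
# Sketch only: a quick preflight for a new worker host. PREFECT_API_URL, get_client,
# and read_work_pool are standard Prefect APIs assumed here; adjust names as needed.
import asyncio

from prefect import get_client
from prefect.settings import PREFECT_API_URL


async def preflight(pool_name: str = "process-pool") -> None:
    # Confirm which API this host will talk to (should match the server URL
    # configured in the systemd unit's Environment= line).
    print(f"API URL: {PREFECT_API_URL.value()}")
    async with get_client() as client:
        pool = await client.read_work_pool(pool_name)
        print(f"Work pool '{pool.name}' is reachable (type: {pool.type})")


asyncio.run(preflight())
```

Run it as the same user and with the same environment the systemd unit will use, so the printed API URL reflects what the worker will actually see.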