# prefect-community
a
My prefect worker is continuously crashing with the following error:
raise PrefectHTTPStatusError.from_httpx_error(exc) from exc.__cause__
prefect.exceptions.PrefectHTTPStatusError: Client error '404 Not Found' for url '<http://prefect.mls.gh.st:4200/api/deployments/762eab44-76ca-44ee-8334-f2a5d46d4883>'
Response: {'detail': 'Deployment not found'}
For more information check: <https://httpstatuses.com/404>
An exception occurred.
That deployment doesn’t actually exist. Do I need to do some cleaning in the database?
c
Yeah, if you delete all flow runs that were associated with this deployment, it should unblock you. We should also have better error handling for this situation; I can open a PR for that.
Could you share a full stack trace for me?
a
Worker 'KubernetesWorker 34e03e38-4794-4fdb-9b89-be2ab5f65c9c' started!
17:33:19.966 | INFO    | prefect.worker.kubernetes.kubernetesworker 34e03e38-4794-4fdb-9b89-be2ab5f65c9c - Found 1 flow runs awaiting cancellation.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/cli/_utilities.py", line 41, in wrapper
    return fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 260, in coroutine_wrapper
    return call()
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 245, in __call__
    return self.result()
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 173, in result
    return self.future.result(timeout=timeout)
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 451, in result
    return self.__get_result()
  File "/usr/local/lib/python3.10/concurrent/futures/_base.py", line 403, in __get_result
    raise self._exception
  File "/usr/local/lib/python3.10/site-packages/prefect/_internal/concurrency/calls.py", line 218, in _run_async
    result = await coro
  File "/usr/local/lib/python3.10/site-packages/prefect/cli/worker.py", line 103, in start
    async with worker_cls(
  File "/usr/local/lib/python3.10/site-packages/prefect/workers/base.py", line 957, in __aexit__
    await self.teardown(*exc_info)
  File "/usr/local/lib/python3.10/site-packages/prefect/workers/base.py", line 447, in teardown
    await self._runs_task_group.__aexit__(*exc_info)
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 662, in __aexit__
    raise exceptions[0]
  File "/usr/local/lib/python3.10/site-packages/prefect/workers/base.py", line 531, in cancel_run
    configuration = await self._get_configuration(flow_run)
  File "/usr/local/lib/python3.10/site-packages/prefect/workers/base.py", line 816, in _get_configuration
    deployment = await self._client.read_deployment(flow_run.deployment_id)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/orchestration.py", line 1485, in read_deployment
    response = await self._client.get(f"/deployments/{deployment_id}")
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1754, in get
    return await self.request(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1530, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/base.py", line 280, in send
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/prefect/client/base.py", line 137, in raise_for_status
    raise PrefectHTTPStatusError.from_httpx_error(exc) from exc.__cause__
prefect.exceptions.PrefectHTTPStatusError: Client error '404 Not Found' for url '<http://prefect.mls.gh.st:4200/api/deployments/762eab44-76ca-44ee-8334-f2a5d46d4883>'
Response: {'detail': 'Deployment not found'}
For more information check: <https://httpstatuses.com/404>
An exception occurred.
Is there an easy way to find the flow runs with this deployment?
c
Great, thank you. I believe you can query your DB directly, filtering on the deployment_id column of the flow run table. I believe we don't enforce a FK on that column, so it should still be present with the value
762eab44-76ca-44ee-8334-f2a5d46d4883
Otherwise I suggest looking at all "Cancelling" runs associated with your work pool
Hopefully I can get this PR into the next release, which will solve this problem automatically for you: https://github.com/PrefectHQ/prefect/pull/9464
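For anyone landing here later, the database lookup suggested above might look roughly like the sketch below. It runs against a stand-in in-memory SQLite table, not a real Prefect database: the table name `flow_run` and the columns `deployment_id` and `state_name` are assumptions based on the description in this thread, and the helper names `find_orphaned_runs` and `build_demo_db` are hypothetical. Check your own schema (and database driver, if you run Postgres) before adapting it.

```python
import sqlite3

# Deployment ID taken from the 404 error earlier in this thread.
MISSING_DEPLOYMENT_ID = "762eab44-76ca-44ee-8334-f2a5d46d4883"


def find_orphaned_runs(conn, deployment_id, state_name="Cancelling"):
    """Return (id, name, state_name) rows for flow runs that still
    reference the given deployment and sit in the given state.

    Assumes a table `flow_run` with `deployment_id` and `state_name`
    columns; verify this against your actual database schema.
    """
    query = (
        "SELECT id, name, state_name FROM flow_run "
        "WHERE deployment_id = ? AND state_name = ? "
        "ORDER BY name"
    )
    return conn.execute(query, (deployment_id, state_name)).fetchall()


def build_demo_db():
    """Create an in-memory stand-in table with made-up rows (illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flow_run ("
        "id TEXT PRIMARY KEY, name TEXT, deployment_id TEXT, state_name TEXT)"
    )
    conn.executemany(
        "INSERT INTO flow_run VALUES (?, ?, ?, ?)",
        [
            ("run-1", "stuck-run", MISSING_DEPLOYMENT_ID, "Cancelling"),
            ("run-2", "finished-run", MISSING_DEPLOYMENT_ID, "Completed"),
            ("run-3", "other-run", "another-deployment", "Cancelling"),
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for row in find_orphaned_runs(conn, MISSING_DEPLOYMENT_ID):
        print(row)
```

The query is read-only on purpose. Once the stuck runs are identified, deleting them through the Prefect UI or API is probably safer than deleting rows by hand.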
a
Finding the “Cancelling” ones and deleting them did it! Thank you!!