#prefect-community

Thomas Pedersen

11/09/2022, 10:05 AM
When flows crash (according to their logs), the Orion UI keeps showing them as running, and their run time keeps increasing. If the flow isn't running on the agent, shouldn't it show as "Failed" rather than running? I'm still looking into what actually causes the crash; this is more about how Prefect handles it. We do have flows that run successfully as well. It's just one or two flows that crash every time.
Here's the stack trace causing it - not sure it tells me much about what the issue is, but will look at it again later...
Crash details:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection.py", line 88, in handle_async_request
    raise ConnectionNotAvailable()
httpcore.ConnectionNotAvailable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1332, in report_task_run_crashes
    yield
  File "/usr/local/lib/python3.10/site-packages/prefect/engine.py", line 1068, in begin_task_run
    connect_error = await client.api_healthcheck()
  File "/usr/local/lib/python3.10/site-packages/prefect/client/orion.py", line 183, in api_healthcheck
    await self._client.get("/health")
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1751, in get
    return await self.request(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1527, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/base.py", line 159, in send
    await super().send(*args, **kwargs)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1614, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1642, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1679, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1716, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 246, in handle_async_request
    async with self._pool_lock:
  File "/usr/local/lib/python3.10/site-packages/anyio/_core/_synchronization.py", line 130, in acquire
    await event.wait()
  File "/usr/local/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError

Tim Galvin

11/10/2022, 2:13 AM
I have found this as well and I do not have a nice fix for it.

Mason Menges

11/11/2022, 10:30 PM
Hey @Thomas Pedersen @Tim Galvin, we're tracking related issues around this in https://github.com/PrefectHQ/prefect/issues/7512 and, if you have some time, we'd greatly appreciate you adding your reports there.

Thomas Pedersen

11/14/2022, 7:21 AM
@Mason Menges, done - but if the agent never reports back to the server, should the server just leave the flows as running in the UI forever? I feel like there needs to be some kind of keep-alive reporting for long-running flows, where the agent keeps telling the server that the flow is active and running. If the server never hears anything from the agent, it should then assume that the flow has crashed or similar...
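To make the keep-alive idea concrete, here is a rough sketch of what server-side tracking could look like. This is not how Prefect works and not part of its API; `HeartbeatMonitor`, `heartbeat`, `sweep`, the state names and the timeout value are all hypothetical, chosen only to illustrate the idea.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class HeartbeatMonitor:
    """Hypothetical server-side tracker (an assumption, not a Prefect class)."""

    timeout_seconds: float
    last_seen: Dict[str, float] = field(default_factory=dict)
    states: Dict[str, str] = field(default_factory=dict)

    def heartbeat(self, run_id: str, now: float) -> None:
        # Called whenever the agent reports that a flow run is still alive.
        self.last_seen[run_id] = now
        # A run is registered as "Running" on its first heartbeat only;
        # a run already marked "Crashed" is not revived by a late heartbeat.
        self.states.setdefault(run_id, "Running")

    def sweep(self, now: float) -> List[str]:
        # Mark every running flow whose last heartbeat is older than the
        # timeout as crashed, and return the ids that were changed.
        crashed = []
        for run_id, seen in self.last_seen.items():
            is_running = self.states[run_id] == "Running"
            if is_running and now - seen > self.timeout_seconds:
                self.states[run_id] = "Crashed"
                crashed.append(run_id)
        return crashed


# Timestamps are passed in explicitly so the example is deterministic.
monitor = HeartbeatMonitor(timeout_seconds=30)
monitor.heartbeat("flow-a", now=0)
monitor.heartbeat("flow-b", now=0)
monitor.heartbeat("flow-a", now=25)   # flow-a keeps reporting, flow-b goes silent
newly_crashed = monitor.sweep(now=40)
print(newly_crashed, monitor.states)
```

In a real deployment the server would run something like `sweep` periodically, and the timeout would need to be comfortably longer than the heartbeat interval so that a single delayed report does not mark a healthy flow as crashed.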