Hey Prefect - I get a heap of these errors now day...
# prefect-community
b
Hey Prefect - I get a heap of these errors nowadays with 2.0: 🧵
State message: Flow run encountered an exception. Traceback (most recent call last):
 File "/usr/local/lib/python3.9/site-packages/prefect/client/orion.py", line 204, in api_healthcheck
   await self._client.get("/health")
 File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1757, in get
   return await self.request(
 File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1533, in request
   return await self.send(request, auth=auth, follow_redirects=follow_redirects)
 File "/usr/local/lib/python3.9/site-packages/prefect/client/base.py", line 159, in send
   await super().send(*args, **kwargs)
 File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1620, in send
   response = await self._send_handling_auth(
 File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1648, in _send_handling_auth
   response = await self._send_handling_redirects(
 File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1685, in _send_handling_redirects
   response = await self._send_single_request(request)
 File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1722, in _send_single_request
   response = await transport.handle_async_request(request)
 File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
   resp = await self._pool.handle_async_request(req)
 File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 221, in handle_async_request
   await self._attempt_to_acquire_connection(status)
 File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 160, in _attempt_to_acquire_connection
   status.set_connection(connection)
 File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 22, in set_connection
   assert self.connection is None
AssertionError
The above exception was the direct cause of the following exception:
RuntimeError: Cannot orchestrate task run '9cc3fc0a-ef27-4f5d-bba6-595b7c653c35'.
any ideas?
t
it's actually annoying, but on my end it doesn't impact functionality.
b
hmm @Tuoyi Zhao I don't think so. The logs are from my task run, not my agent. As mentioned in the Discourse by @Anna Geller, I use an ECS service for my agent, so it might be something different
a
thanks for sharing more details - the traceback implies this is a connection/network problem, but I can't say why your run couldn't communicate with the backend. Do you run it in a custom VPC? Perhaps there was some disruption in that AZ? Hard to say
I'd say monitor it for a couple of days and if the problem persists, open a GitHub issue and we would need to reproduce and try to fix
is it a long-running task? you could try to move that task into a flow and check if this way the issue doesn't occur - this way we could say for sure that a long-running task causes this
j
I’ve started noticing a large number of these errors over the past few days as well. For me it’s happening when submitting a large number of tasks to the DaskTaskRunner (more than 500 in the group). Partway through adding them I’ll see the error and the flow will crash. This is using the local Dask cluster option. Crashes don’t happen on every run.
a
did anyone in this group open a GitHub issue about it? any volunteers? Slack is not a good place for bug reports, especially not for complex ones that only happen occasionally, we'd need to reproduce and investigate
b
Hi @Anna Geller - I can confirm that I get this error not when I run a long-running task, but when I run many short-running tasks with .submit and the ConcurrentTaskRunner. The exact same way of running things with .map used to work with Prefect 1.0 and now it just breaks. It is not ideal.
Crash detected! Execution was interrupted by an unexpected exception: Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/prefect/client/orion.py", line 204, in api_healthcheck
    await self._client.get("/health")
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1757, in get
    return await self.request(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1533, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.9/site-packages/prefect/client/base.py", line 159, in send
    await super().send(*args, **kwargs)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1620, in send
    response = await self._send_handling_auth(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1648, in _send_handling_auth
    response = await self._send_handling_redirects(
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1685, in _send_handling_redirects
    response = await self._send_single_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_client.py", line 1722, in _send_single_request
    response = await transport.handle_async_request(request)
  File "/usr/local/lib/python3.9/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 221, in handle_async_request
    await self._attempt_to_acquire_connection(status)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 160, in _attempt_to_acquire_connection
    status.set_connection(connection)
  File "/usr/local/lib/python3.9/site-packages/httpcore/_async/connection_pool.py", line 22, in set_connection
    assert self.connection is None
AssertionError

The above exception was the direct cause of the following exception:

RuntimeError: Cannot orchestrate task run 'ccdabf35-19ae-4fc3-91f9-2af74acf111b'. Failed to connect to API at <https://api.prefect.cloud/api/accounts/a0af308b-5af5-4edf-a50e-945f515efc16/workspaces/8e15f190-bb28-45fa-a1e8-b3335e28b718/>.
Is there any way I can handle these errors in my code, or is this just a downside of using async?
Can confirm this is an async thing - when I switch to the DaskTaskRunner the issue goes away 🦜
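On the question of handling transient connection errors in your own code: a minimal sketch of a generic retry-with-backoff wrapper, using only the standard library. The names `retry_async` and `flaky` are hypothetical, and this would only cover calls made inside your own task code. The traceback above appears to originate in Prefect's own API health check, which a wrapper like this would not intercept.

```python
import asyncio
import random


async def retry_async(make_call, attempts=3, base_delay=0.1, retry_on=(Exception,)):
    """Await make_call() up to `attempts` times, backing off between tries."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on:
            if attempt == attempts:
                # Out of attempts: let the last error propagate.
                raise
            # Exponential backoff plus a little random jitter.
            delay = base_delay * 2 ** (attempt - 1) + random.uniform(0, base_delay)
            await asyncio.sleep(delay)


calls = {"count": 0}


async def flaky():
    # Hypothetical stand-in for a request that fails twice, then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = asyncio.run(
    retry_async(flaky, attempts=5, base_delay=0.01, retry_on=(ConnectionError,))
)
print(result, "after", calls["count"], "calls")
```

Restricting `retry_on` to the specific exception types you expect keeps genuine bugs from being silently retried.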
a
GitHub issue? 🙏
Nice work figuring out what's causing it Ben
b
Does it need an issue? Isn't it just a symptom of the hardware?
a
Could you say more? I thought you found an issue running async tasks
b
I just got errors when there was heavy use of requests in async tasks, when hundreds ran in parallel. I don't think that's a bug
Maybe just a limitation of running on a single thread
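If hundreds of concurrent requests really are the trigger, one common way to ease the load is to cap how many run at once. Below is a minimal sketch using `asyncio.Semaphore`. The helper `run_bounded`, the stand-in `fake_request`, and the limit of 10 are all assumptions for illustration, not anything taken from Prefect.

```python
import asyncio

in_flight = 0  # requests currently running
peak = 0       # highest number seen running at once


async def fake_request(i):
    # Hypothetical stand-in for a real HTTP call.
    global in_flight, peak
    in_flight += 1
    peak = max(peak, in_flight)
    await asyncio.sleep(0.01)
    in_flight -= 1
    return i * 2


async def run_bounded(coros, limit=10):
    """Run coroutines with at most `limit` of them in flight at once."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(coro):
        # A coroutine only starts once it has acquired a semaphore slot.
        async with semaphore:
            return await coro

    # gather() preserves input order in its results.
    return await asyncio.gather(*(guarded(c) for c in coros))


results = asyncio.run(run_bounded([fake_request(i) for i in range(100)], limit=10))
print(len(results), "results, peak concurrency", peak)
```

All 100 calls still complete, but no more than `limit` are ever open at the same time.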
a
Gotcha. Thanks for this update and clarifying, much appreciated
a
So it's not possible to run 100s of tasks concurrently? Connection to backend cloud api keeps breaking
d
facing the same issue... wasn't happening the day before yesterday
a
So it's a problem with cloud backend? Rather than new version release?
d
I’m facing this issue on the normal task runner when I’m using the .map function. It wasn’t occurring a day ago
b
As @Anna Geller recommended, if you want this looked at, I'd put in a bug report on gh
p
Howdy y'all - thanks a ton for bringing this to our attention! @Deepanshu Aggarwal or @alvin goh, can one of you please open an issue and we'll give this a look: https://github.com/PrefectHQ/prefect/issues
d
g
btw I'm seeing the same issue with SequentialTaskRunner and no async tasks. Will try downgrading the agent and task runner to see if that helps.
ok downgrading the agent solved it for me
d
@Giuliano Mega can you share which Prefect version you are using for the agent?
g
Ouch, probably too late for that, but I'm using 2.6.5. 😬
p
Hey Giuliano - did you upgrade your httpcore version?
This wound up being an issue with one of our dependencies that they fixed in the next release
@Ben Muller double checking - did you wind up getting this resolved?
b
Hey @Peyton Runyan - I did. I upgraded httpcore to the most recent version (in my flow's Docker image)
p
Excellent - I'm glad to hear!
d
btw it worked for me as well. I used Prefect 2.6.0 and installed httpcore 0.16.2, but I think I had to update it in the image used for the Kubernetes jobs as well, not just the agent
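To confirm which httpcore version a given image actually has (agent image and job image alike), a quick check like the sketch below could be run inside each one. The 0.16.2 threshold is only the version reported to work in this thread, not a confirmed minimum, and the helper names `parse_version` and `httpcore_is_at_least` are hypothetical.

```python
from importlib.metadata import PackageNotFoundError, version


def parse_version(text):
    """Turn '0.16.2' into (0, 16, 2); non-numeric suffixes such as 'rc1' are ignored."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def httpcore_is_at_least(minimum="0.16.2"):
    """Return True or False, or None when httpcore is not installed."""
    try:
        installed = version("httpcore")
    except PackageNotFoundError:
        return None
    return parse_version(installed) >= parse_version(minimum)


# Assumption: 0.16.2 is the version one user in this thread reported as working.
print("httpcore >= 0.16.2:", httpcore_is_at_least())
```

Running it in every image involved makes it easy to spot the case described above, where the agent was upgraded but the job image was not.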