Prefect Community #ask-community

Ilya Galperin

01/11/2023, 6:29 PM

Hi all - we’re seeing sporadic `BrokenPipeError: [Errno 32] Broken pipe` crashes in one of our flows on 2.7.7, running on DaskTaskRunner. This flow runs ~1000 tasks, occasionally one of them will enter a Crashed state with this error and cause our flow to enter a Failed state. Retries on these crashed tasks don’t seem to work (I’m guessing Crashed state tasks are excluded from retry logic). Full traceback in the thread. Any ideas? Thank you!

Ilya Galperin

01/11/2023, 6:29 PM

Crash detected! Execution was interrupted by an unexpected exception: Traceback (most recent call last):
  File "/usr/local/lib/python3.10/asyncio/selector_events.py", line 924, in write
    n = self._sock.send(data)
BrokenPipeError: [Errno 32] Broken pipe

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 10, in map_exceptions
    yield
  File "/usr/local/lib/python3.10/site-packages/httpcore/backends/asyncio.py", line 51, in write
    await self._stream.send(item=buffer)
  File "/usr/local/lib/python3.10/site-packages/anyio/streams/tls.py", line 202, in send
    await self._call_sslobject_method(self._ssl_object.write, item)
  File "/usr/local/lib/python3.10/site-packages/anyio/streams/tls.py", line 168, in _call_sslobject_method
    await self.transport_stream.send(self._write_bio.read())
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1297, in send
    raise self._protocol.exception
anyio.BrokenResourceError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/usr/local/lib/python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/connection.py", line 90, in handle_async_request
    return await self._connection.handle_async_request(request)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http2.py", line 144, in handle_async_request
    raise exc
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http2.py", line 106, in handle_async_request
    await self._send_request_headers(request=request, stream_id=stream_id)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http2.py", line 205, in _send_request_headers
    await self._write_outgoing_data(request)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http2.py", line 370, in _write_outgoing_data
    raise exc
  File "/usr/local/lib/python3.10/site-packages/httpcore/_async/http2.py", line 358, in _write_outgoing_data
    await self._network_stream.write(data_to_send, timeout)
  File "/usr/local/lib/python3.10/site-packages/httpcore/backends/asyncio.py", line 49, in write
    with map_exceptions(exc_map):
  File "/usr/local/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/usr/local/lib/python3.10/site-packages/httpcore/_exceptions.py", line 14, in map_exceptions
    raise to_exc(exc)
httpcore.WriteError

The above exception was the direct cause of the following exception:

httpx.WriteError

Zanie

01/11/2023, 6:30 PM

Crashed states are indeed excluded from retries since the failure happens outside of your code.

Zanie

01/11/2023, 6:31 PM

I’m not sure of the best way to resolve these issues, perhaps we should retry on these. We retry on similar `ReadError` exceptions.
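
To illustrate what retrying on a transient write error could look like at the call site, here is a minimal sketch using only the standard library. It is not Prefect's client code and does not reflect the contents of the linked PR. The helper `call_with_retries` and its parameters are hypothetical names chosen for this example. It retries on `ConnectionError` (the built-in parent of `BrokenPipeError`). In the traceback above the error surfaces as `httpx.WriteError`, so that class would have to be passed in `retry_on` instead.

```python
import asyncio
from typing import Awaitable, Callable, Tuple, Type, TypeVar

T = TypeVar("T")


async def call_with_retries(
    request: Callable[[], Awaitable[T]],
    retry_on: Tuple[Type[BaseException], ...] = (ConnectionError,),
    max_attempts: int = 3,
    base_delay: float = 0.0,
) -> T:
    """Await request() and retry when it raises one of the retry_on exceptions.

    Hypothetical helper for illustration only. The delay doubles after each
    failed attempt (base_delay, 2 * base_delay, 4 * base_delay, ...).
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return await request()
        except retry_on:
            # Out of attempts: let the last error propagate to the caller.
            if attempt == max_attempts:
                raise
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    # Only reached when the loop body never ran.
    raise ValueError("max_attempts must be at least 1")
```

Exceptions outside `retry_on` propagate immediately, so errors raised by user code are not masked by the retry loop.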

Zanie

01/11/2023, 6:33 PM

https://github.com/PrefectHQ/prefect/pull/8145

Ilya Galperin

01/11/2023, 7:30 PM

Thank you for the clarification @Zanie - this makes total sense. We will keep track of this PR and in the meantime we can add custom retries to tasks that come back with crashed states. Do you know what might be the cause of the error/where we might want to start looking?
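
The "custom retries" workaround mentioned above could be sketched roughly as follows. This is a standard-library illustration of the control flow only: `FakeState` is a stand-in, not Prefect's state class, and `run_with_crash_retries` is a hypothetical helper. How the final state of a task run is obtained from Prefect is deliberately left out and would need to be checked against the Prefect documentation for the version in use.

```python
import time
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class FakeState:
    """Stand-in for a task-run state (hypothetical, not a Prefect class)."""

    type: str  # e.g. "COMPLETED", "FAILED", "CRASHED"
    result: Any = None


def run_with_crash_retries(
    submit: Callable[[], FakeState],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> FakeState:
    """Call submit() again while it reports CRASHED, up to max_attempts calls.

    submit is assumed to run one task to completion and return its final state.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    state = submit()
    for _ in range(max_attempts - 1):
        # Only crashes are resubmitted; FAILED states are left to the
        # task's own retry settings.
        if state.type != "CRASHED":
            break
        time.sleep(delay_seconds)
        state = submit()
    return state
```

Resubmitting only makes sense if the task is safe to run more than once, since a crash can occur after the task's side effects have already happened.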

Zanie

01/11/2023, 7:36 PM

I’m not sure, it’s a networking issue. I’ve reached out to the httpx team.

Ilya Galperin

01/11/2023, 7:41 PM

Makes sense, thanks again for your help on this Michael!
