Prefect 2.6 has brought me back... but I am gettin...
# ask-community
t
Prefect 2.6 has brought me back... but I am getting the occasional error like this. This looks to be a failure with the Prefect API? Can anyone shed some light on it for me? Thanks
11:20:34.559 | ERROR   | Task run 'Get-Items-d8ed86f1-2199' - Crash detected! Request to <https://api.prefect.cloud/api/accounts/cafe3a79-624b-468d-87a7-97fde3358a01/workspaces/5d09d677-90b8-4ef8-a9be-9760b422937a/task_runs/a13a62a5-5e54-4b1d-a976-98635a6913c6> failed: Traceback (most recent call last):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/anyio/streams/tls.py", line 130, in _call_sslobject_method
    result = func(*args)
  File "/usr/lib/python3.10/ssl.py", line 975, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLWantReadError: The operation did not complete (read) (_ssl.c:997)
z
Hm. That’s the full traceback? @Zach Angell did you see this during the reliability work?
t
I think this is the full stack trace
11:20:55.376 | ERROR   | Task run 'Get-Items-d8ed86f1-326' - Crash detected! Request to <https://api.prefect.cloud/api/accounts/cafe3a79-624b-468d-87a7-97fde3358a01/workspaces/5d09d677-90b8-4ef8-a9be-9760b422937a/task_runs/c8013fad-e78f-4da0-a9e6-8084f4e1aec6> failed: Traceback (most recent call last):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/anyio/streams/tls.py", line 130, in _call_sslobject_method
    result = func(*args)
  File "/usr/lib/python3.10/ssl.py", line 975, in do_handshake
    self._sslobj.do_handshake()
ssl.SSLWantReadError: The operation did not complete (read) (_ssl.c:997)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/backends/asyncio.py", line 67, in start_tls
    ssl_stream = await anyio.streams.tls.TLSStream.wrap(
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/anyio/streams/tls.py", line 122, in wrap
    await wrapper._call_sslobject_method(ssl_object.do_handshake)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/anyio/streams/tls.py", line 137, in _call_sslobject_method
    data = await self.transport_stream.receive()
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 1265, in receive
    await self._protocol.read_event.wait()
  File "/usr/lib/python3.10/asyncio/locks.py", line 214, in wait
    await fut
asyncio.exceptions.CancelledError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_exceptions.py", line 8, in map_exceptions
    yield
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/backends/asyncio.py", line 76, in start_tls
    raise exc
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/backends/asyncio.py", line 66, in start_tls
    with anyio.fail_after(timeout):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/anyio/_core/_tasks.py", line 118, in __exit__
    raise TimeoutError
TimeoutError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpx/_transports/default.py", line 60, in map_httpcore_exceptions
    yield
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpx/_transports/default.py", line 353, in handle_async_request
    resp = await self._pool.handle_async_request(req)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 253, in handle_async_request
    raise exc
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_async/connection_pool.py", line 237, in handle_async_request
    response = await connection.handle_async_request(request)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_async/connection.py", line 86, in handle_async_request
    raise exc
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_async/connection.py", line 63, in handle_async_request
    stream = await self._connect(request)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_async/connection.py", line 150, in _connect
    stream = await stream.start_tls(**kwargs)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/backends/asyncio.py", line 64, in start_tls
    with map_exceptions(exc_map):
  File "/usr/lib/python3.10/contextlib.py", line 153, in __exit__
    self.gen.throw(typ, value, traceback)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpcore/_exceptions.py", line 12, in map_exceptions
    raise to_exc(exc)
httpcore.ConnectTimeout

The above exception was the direct cause of the following exception:

httpx.ConnectTimeout
z
Oh well that’s really helpful 🙂 It’s a connection timeout.
You could bump PREFECT_API_REQUEST_TIMEOUT, though I’m not sure why you’d be experiencing a long connection time.
t
what is the default value? I don't currently have it set
z
❯ prefect config view --show-defaults | grep TIMEOUT
PREFECT_API_REQUEST_TIMEOUT='30.0' (from defaults)
t
TIL, cool!
z
🙂 Is your internet particularly slow? 30s is a long time and could be an issue on our end.
t
let me get some internet data. I just upgraded to fiber a few weeks ago though...
My internet looks great. 369 Mbps down, 74 up, ping of 15ms
z
Using the default task runner (concurrent)?
t
Correct
z
Great, thanks! We’ll need to wait for someone who knows more about the recent Cloud performance work to chime in. Presumably, we’ll need to fix this on our side.
Bumping the timeout may help in the meantime.
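A minimal sketch of raising the timeout from Python rather than the shell. It assumes Prefect reads `PREFECT_API_REQUEST_TIMEOUT` from the process environment when its settings are loaded, so the variable is set before `prefect` is imported. The value `60.0` is only illustrative.

```python
import os

# Assumption: Prefect picks this setting up from the environment when it
# loads its settings, so set it before importing prefect.
# 60.0 is an illustrative value, double the 30.0 default shown above.
os.environ["PREFECT_API_REQUEST_TIMEOUT"] = "60.0"

# import prefect  # import only after the variable is set
```

The same setting can likely be persisted in the active profile with `prefect config set PREFECT_API_REQUEST_TIMEOUT=60.0`.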
t
Awesome, thank you
z
Hey Tim - thanks for the write up! When did this error occur? I’m having trouble tracking down any server side error logs corresponding to your stack trace. Could you also share a bit of detail about what your flow looks like? Because the Prefect client is async, sometimes we hit this error because the event loop is blocked rather than the server returning an error.
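The point about a blocked event loop can be illustrated with the standard library alone. The sketch below is not Prefect code; `heartbeat`, `blocking_work`, `cooperative_work`, and `measure` are hypothetical names. A heartbeat coroutine stands in for the client's network I/O, and the code measures how late it wakes up when another coroutine blocks the loop versus when the same work is pushed to a thread.

```python
import asyncio
import time
from typing import Awaitable, Callable


async def heartbeat(interval: float, beats: int) -> list[float]:
    """Wake up every `interval` seconds and record the actual gap between wake-ups."""
    gaps = []
    last = time.monotonic()
    for _ in range(beats):
        await asyncio.sleep(interval)
        now = time.monotonic()
        gaps.append(now - last)
        last = now
    return gaps


async def blocking_work(seconds: float) -> None:
    # time.sleep never yields to the event loop, so nothing else can run.
    time.sleep(seconds)


async def cooperative_work(seconds: float) -> None:
    # Running the blocking call in a worker thread keeps the loop responsive.
    await asyncio.to_thread(time.sleep, seconds)


async def measure(work: Callable[[float], Awaitable[None]]) -> float:
    """Return the longest heartbeat gap observed while `work` runs."""
    hb = asyncio.create_task(heartbeat(0.05, 4))
    await asyncio.sleep(0)  # let the heartbeat start its first sleep
    await work(0.3)
    return max(await hb)


if __name__ == "__main__":
    print("blocked loop, worst gap:", asyncio.run(measure(blocking_work)))
    print("threaded work, worst gap:", asyncio.run(measure(cooperative_work)))
```

With the blocking version, the heartbeat cannot wake until the blocking call returns. A request with a timeout would be starved in the same way, even though the server never returned an error.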
t
The timestamps on the errors are in Central time; that is when they occurred. I will share some of my flow
    dest_dataset = os.environ.get("GCP_ACCOUNTS_DATASET")
    query = f"select max(CAST(id AS INT64)) as last_id from `{dest_dataset}.subscriptions_orion` where api_source = 'accounts'"
    highwater = query_highwater(query)

    params = get_params(highwater, full_active_load, full_load)

    pages_list = get_pages_list(client, "subscriptions", params)
    try:
        items = get_items_list.map(
            unmapped(client), unmapped("subscriptions"), pages_list
        )
    except Exception as e:
        return Failed("Failed on an Exception")
    sub_data = []
    for item in items:
        sub_data.extend(item.result())
    if not sub_data:
        return Completed(message="No data from the API")
    else:
        transformed_data = transform_data(sub_data)

    bqclient = bigquery.Client()
    bq_result = load_df_bq(bqclient, transformed_data)
    bq_result = None

    if isinstance(bq_result, LoadJob) and bq_result.state == "DONE":
        return Completed(message="Load Finished!")
    elif isinstance(bq_result, Failed):
        slack_webhook_block = SlackWebhook.load("data-pipeline-notifications")
        slack_webhook_block.notify("Hello from Prefect 2.0!")
        return bq_result
    else:
        slack_webhook_block = SlackWebhook.load("data-pipeline-notifications")
        slack_webhook_block.notify("Hello from Prefect 2.0!")
        return Failed(message="Load Failure")
    # breakpoint()


if __name__ == "__main__":
    flow_result = main()
The error is occurring during the mapping operation call to get_items_list
z
How frequently do you encounter it? Curious if you can reproduce against a local server.
t
I have about 3000 mapped tasks, so I can't use a local server or I hit other errors.
Also, should I be worried about seeing this occasionally?
--- Orion logging error ---
Traceback (most recent call last):
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/prefect/logging/handlers.py", line 146, in send_logs
    await client.create_logs(self._pending_logs)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/prefect/client/orion.py", line 1683, in create_logs
    await self._client.post(f"/logs/", json=serialized_logs)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpx/_client.py", line 1842, in post
    return await self.request(
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/httpx/_client.py", line 1527, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/prefect/client/base.py", line 182, in send
    response.raise_for_status()
  File "/home/tenders/.cache/pypoetry/virtualenvs/prefect-orion-HonJDUqB-py3.10/lib/python3.10/site-packages/prefect/client/base.py", line 125, in raise_for_status
    raise PrefectHTTPStatusError.from_httpx_error(exc) from exc.__cause__
prefect.exceptions.PrefectHTTPStatusError: Server error '500 Internal Server Error' for url '<https://api.prefect.cloud/api/accounts/cafe3a79-624b-468d-87a7-97fde3358a01/workspaces/5d09d677-90b8-4ef8-a9be-9760b422937a/logs/>'
Response: {'exception_message': 'Internal Server Error'}
For more information check: <https://httpstatuses.com/500>
Worker information:
    Approximate queue length: 62
    Pending log batch length: 2866
    Pending log batch size: 1031760
The log worker will attempt to send these logs again in 2.0s
z
Generally no since we'll retry. Good to have a report of it though as the server needs to be able to handle high log volumes.
t
Hmmm, I set the timeout to 45 seconds and I have been getting crashes. I closed out my terminal and venv and am trying again, but I am seeing this spam in the logs:
15:53:17.222 | ERROR   | Task run 'Get-Items-d8ed86f1-56' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.223 | ERROR   | Task run 'Get-Items-d8ed86f1-2718' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.224 | ERROR   | Task run 'Get-Items-d8ed86f1-2738' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.225 | ERROR   | Task run 'Get-Items-d8ed86f1-61' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.226 | ERROR   | Task run 'Get-Items-d8ed86f1-47' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.227 | ERROR   | Task run 'Get-Items-d8ed86f1-46' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.227 | ERROR   | Task run 'Get-Items-d8ed86f1-148' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.228 | ERROR   | Task run 'Get-Items-d8ed86f1-2732' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.229 | ERROR   | Task run 'Get-Items-d8ed86f1-126' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.230 | ERROR   | Task run 'Get-Items-d8ed86f1-66' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.231 | ERROR   | Task run 'Get-Items-d8ed86f1-2695' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.232 | ERROR   | Task run 'Get-Items-d8ed86f1-3060' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.234 | ERROR   | Task run 'Get-Items-d8ed86f1-3087' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.235 | ERROR   | Task run 'Get-Items-d8ed86f1-140' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.236 | ERROR   | Task run 'Get-Items-d8ed86f1-77' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.237 | ERROR   | Task run 'Get-Items-d8ed86f1-3070' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.238 | ERROR   | Task run 'Get-Items-d8ed86f1-2736' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.239 | ERROR   | Task run 'Get-Items-d8ed86f1-135' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.240 | ERROR   | Task run 'Get-Items-d8ed86f1-2756' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.241 | ERROR   | Task run 'Get-Items-d8ed86f1-133' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.241 | ERROR   | Task run 'Get-Items-d8ed86f1-101' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.242 | ERROR   | Task run 'Get-Items-d8ed86f1-174' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.243 | ERROR   | Task run 'Get-Items-d8ed86f1-94' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.244 | ERROR   | Task run 'Get-Items-d8ed86f1-113' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.245 | ERROR   | Task run 'Get-Items-d8ed86f1-2782' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.246 | ERROR   | Task run 'Get-Items-d8ed86f1-49' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.246 | ERROR   | Task run 'Get-Items-d8ed86f1-2735' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.247 | ERROR   | Task run 'Get-Items-d8ed86f1-41' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.248 | ERROR   | Task run 'Get-Items-d8ed86f1-2762' - Crash detected! Execution was cancelled by the runtime environment.
15:53:17.249 | ERROR   | Task run 'Get-Items-d8ed86f1-62' - Crash detected! Execution was cancelled by the runtime environment.
m
Hey @Tim Enders - just wondering if you determined the cause of this issue. I'm seeing the same thing in one of my flows.
z
If you turn on DEBUG level logs we’ll get a full traceback. That message means that an asyncio.exceptions.CancelledError was raised, though.
Sounds like we’re just seeing that cancellation from your other traceback propagate through all the tasks.
Resolving these issues is likely my next top priority.
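The way a single failure can show up as a cancellation in every sibling task can be sketched with plain asyncio. This is a simplified illustration, not Prefect's or anyio's actual internals; `worker`, `failing_request`, and `run_group` are hypothetical names. One coroutine raises a timeout, the parent cancels everything still pending, and each sibling sees a `CancelledError`.

```python
import asyncio


async def worker(name: str, cancelled: list[str]) -> None:
    """Stand-in for a mapped task run that is still waiting on I/O."""
    try:
        await asyncio.sleep(10)
    except asyncio.CancelledError:
        cancelled.append(name)  # comparable to one "Crash detected!" log line
        raise  # re-raise so the cancellation completes


async def failing_request() -> None:
    """Stand-in for the one request that hits a connect timeout."""
    await asyncio.sleep(0.01)
    raise TimeoutError("simulated connect timeout")


async def run_group(n_workers: int) -> tuple[list[str], list[str]]:
    cancelled: list[str] = []
    tasks = [
        asyncio.create_task(worker(f"task-{i}", cancelled))
        for i in range(n_workers)
    ]
    tasks.append(asyncio.create_task(failing_request()))

    # Stop waiting as soon as any task raises.
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_EXCEPTION)
    errors = [type(t.exception()).__name__ for t in done]

    # Cancel the siblings, as a task group does when one member fails.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    return sorted(cancelled), errors


if __name__ == "__main__":
    print(asyncio.run(run_group(3)))
```

Only one task actually failed here. The rest report a cancellation, which matches the pattern of many identical "cancelled by the runtime environment" lines following one underlying error.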
t
👍
Let me know if you need anything more from me, @Zanie. I had an emergency come up and have been offline for a couple of days.