Giuliano Mega

01/20/2023, 11:44 AM
Hello there, since upgrading my agents to 2.7.8 (Prefect Cloud + GKE Autopilot), my agent logs are full of errors like this:
--- Orion logging error ---
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/logging/handlers.py", line 151, in send_logs
    await client.create_logs(self._pending_logs)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/orion.py", line 1830, in create_logs
    await self._client.post(f"/logs/", json=serialized_logs)
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1848, in post
    return await self.request(
  File "/usr/local/lib/python3.10/site-packages/httpx/_client.py", line 1533, in request
    return await self.send(request, auth=auth, follow_redirects=follow_redirects)
  File "/usr/local/lib/python3.10/site-packages/prefect/client/base.py", line 253, in send
    response.raise_for_status()
  File "/usr/local/lib/python3.10/site-packages/httpx/_models.py", line 745, in raise_for_status
    raise HTTPStatusError(message, request=request, response=self)
httpx.HTTPStatusError: Client error '429 Too Many Requests' for url 'https://api.prefect.cloud/api/accounts/[account]/workspaces/[workspace]/logs/'
For more information check: https://httpstatuses.com/429
Worker information:
    Approximate queue length: 0
    Pending log batch length: 458
    Pending log batch size: 164880
The log worker has tried to send these logs 4 times and will now drop them.
I guess this isn't my fault, since the Prefect agent is in control of rate limiting calls to the logging API. Looking at the Prefect Cloud UI, it does indeed seem to be dropping logs, which I don't think was happening before. Is this a known issue? Is there anything I can do to mitigate it?
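For a first rough number, the "Worker information" block in the traceback can be turned into an average record size. This is only a sketch: it assumes the reported batch size is in bytes, and it describes the one dropped batch, not overall traffic.

```python
# Figures copied from the "Worker information" block in the traceback.
pending_log_count = 458        # "Pending log batch length"
pending_batch_size = 164_880   # "Pending log batch size" (assumed to be bytes)

# Average size of one log record in the dropped batch.
avg_per_log = pending_batch_size / pending_log_count
print(f"average size per log record: {avg_per_log:.0f}")
```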

Christopher Boyd

01/20/2023, 3:20 PM
Hi @Giuliano Mega - how many flows do you run in an hour, and how large are these logs on average? There are indeed rate limits on Prefect Cloud, but I can’t say whether this is expected or unexpected behavior without more context on your environment and how much traffic you are generating.

Giuliano Mega

01/20/2023, 5:11 PM
Hey @Christopher Boyd, thanks for replying. Stats on average log size are currently not very easy to gather (I have those in GCP Cloud Logging but they're not easily grouped by flow), but I'll try to get a number to report back. We run about 10 flows a day; a few of those run concurrently, but hardly more than 2 at a time. Another point is that I don't recall ever seeing such exceptions before the update to 2.7.8.
Hm, I realize now that I can map flows back to pods by using the flow run names, which should allow me to run historical queries. This should give us a rough estimate.
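A minimal sketch of that grouping, assuming the log entries have already been exported as dictionaries and that each pod name starts with the flow run name followed by a generated suffix. The field names `pod_name` and `message`, the sample run names, and the helper `summarize_by_flow_run` are all hypothetical, made up for illustration; the real export format may differ.

```python
from collections import defaultdict


def summarize_by_flow_run(entries, flow_run_names):
    """Count log records and message bytes per flow run.

    A pod is attributed to a flow run when its name starts with
    "<flow-run-name>-" (an assumed naming convention).
    """
    stats = defaultdict(lambda: {"count": 0, "bytes": 0})
    for entry in entries:
        pod = entry["pod_name"]
        run = next((n for n in flow_run_names if pod.startswith(n + "-")), None)
        if run is None:
            continue  # pod does not belong to any known flow run
        stats[run]["count"] += 1
        stats[run]["bytes"] += len(entry["message"].encode("utf-8"))
    return dict(stats)


# Made-up sample data standing in for an exported log query result.
entries = [
    {"pod_name": "mighty-falcon-abc12-x9z", "message": "hello"},
    {"pod_name": "mighty-falcon-abc12-x9z", "message": "world!"},
    {"pod_name": "quiet-otter-def34-k2p", "message": "ok"},
    {"pod_name": "unrelated-pod", "message": "ignored"},
]
summary = summarize_by_flow_run(entries, ["mighty-falcon", "quiet-otter"])
for run, s in summary.items():
    print(run, s["count"], "records,", s["bytes"], "bytes")
```

Dividing each run's byte total by its duration would give a rough logs-per-hour figure to report back.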