One of the problems we are seeing in multi-agent workflows running on Prefect is that when an agentic task "fails", both Prefect's retries and the manager agent's own retry mechanism kick in. It seems we now need to distinguish between errors that normal Prefect retries can recover from and errors where the agent needs to try something different.
How are others here dealing with this problem?
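
To make the distinction concrete, here is a rough sketch of one possible split, not a tested solution. The exception classes and function names (`TransientError`, `NeedsReplanError`, `should_prefect_retry`, `retry_only_transient`) are hypothetical, and treating the built-in `TimeoutError` and `ConnectionError` as transient is an assumption. The last function is shaped to match the `(task, task_run, state) -> bool` callable that Prefect's `retry_condition_fn` task option expects, but the sketch itself does not import Prefect.

```python
class TransientError(Exception):
    """Hypothetical: infrastructure failure (timeout, rate limit) that a plain retry may fix."""


class NeedsReplanError(Exception):
    """Hypothetical: the agent's approach failed; rerunning the same call is unlikely to help."""


def should_prefect_retry(exc: BaseException) -> bool:
    """Return True only for errors that an identical rerun could plausibly recover from."""
    if isinstance(exc, NeedsReplanError):
        # Leave these to the manager agent, which can change strategy.
        return False
    # Assumption: these built-in errors are also transient in this setup.
    return isinstance(exc, (TransientError, TimeoutError, ConnectionError))


def retry_only_transient(task, task_run, state) -> bool:
    """Callable in the shape of a Prefect retry_condition_fn."""
    try:
        # Re-raises the task's exception if the task run failed.
        state.result()
    except Exception as exc:
        return should_prefect_retry(exc)
    # No exception means nothing to retry.
    return False
```

Under these assumptions, a task could be declared with something like `@task(retries=3, retry_condition_fn=retry_only_transient)`, so Prefect only reruns transient failures, while the manager agent catches `NeedsReplanError` and decides what to try next. That would keep the two retry mechanisms from both firing on the same failure.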