To avoid the crash when submitting flow runs to Fargate, which is caused by too many concurrent attempts to register a new revision of the same task definition family, consider the following strategies:
1. Allocate enough CPU and memory to your agent to make sure it can handle provisioning infrastructure for your flow runs.
2. Set a work-queue concurrency limit to prevent the agent from processing all runs simultaneously, which can lead to resource contention and API call errors.
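To illustrate why a concurrency limit helps, here is a minimal sketch using only the Python standard library. It is not Prefect's implementation: `submit_all`, `submit`, and the `asyncio.sleep` placeholder are hypothetical stand-ins for the agent submitting runs, and the limit of 3 is an arbitrary example value.

```python
import asyncio


async def submit_all(run_ids, limit):
    """Submit every run, but never more than `limit` at the same time."""
    semaphore = asyncio.Semaphore(limit)
    in_flight = 0  # submissions currently running
    peak = 0       # highest number of simultaneous submissions observed

    async def submit(run_id):
        nonlocal in_flight, peak
        async with semaphore:  # waits here once `limit` submissions are active
            in_flight += 1
            peak = max(peak, in_flight)
            # Placeholder for the real work (hypothetical): registering a
            # task definition revision and starting the Fargate task.
            await asyncio.sleep(0.01)
            in_flight -= 1
        return run_id

    results = await asyncio.gather(*(submit(r) for r in run_ids))
    return results, peak


results, peak = asyncio.run(submit_all(range(10), limit=3))
```

All ten runs are still processed, but `peak` never exceeds the limit, so the number of simultaneous API calls stays bounded.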
Keep in mind that some API call failures could be due to transient client or server-side issues. Implementing robust error-handling with retries can also help mitigate these problems.
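As a rough sketch of what retrying with backoff could look like, the helper below retries a callable with exponential backoff and random jitter. Everything here is an assumption for illustration: `TransientError`, `call_with_retries`, and `flaky_register` are hypothetical names, and in a real setup you would catch whatever exception your AWS client actually raises for throttling or timeouts.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a throttling or timeout error."""


def call_with_retries(func, max_attempts=5, base_delay=0.5, max_delay=8.0,
                      sleep=time.sleep):
    """Call `func`, retrying on TransientError with exponential backoff and jitter."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Double the delay ceiling on each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** (attempt - 1))
            # Random jitter keeps concurrent callers from retrying in lockstep.
            sleep(random.uniform(0, ceiling))


# Simulated API call (hypothetical) that fails twice and then succeeds.
attempts = []


def flaky_register():
    attempts.append(1)
    if len(attempts) < 3:
        raise TransientError("simulated throttling")
    return "registered"


# The sleep function is replaced with a no-op so the example runs instantly.
result = call_with_retries(flaky_register, sleep=lambda _: None)
```

The jitter matters in this scenario: if several submissions fail at the same moment, spreading their retries out avoids repeating the same burst of concurrent calls.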
You can find more information and context in the Prefect documentation:
- Concepts/Infrastructure
Let me know if you need further assistance or details on how to implement these solutions in your specific setup.