Occasionally I am getting crashes for one of my Fa...
# ask-community
j
Occasionally I am getting crashes for one of my Fargate deployments when I schedule many runs:
Flow run could not be submitted to infrastructure: An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Too many concurrent attempts to create a new revision of the specified family.
I try to schedule my jobs at least a few seconds apart (perhaps this is not enough). I am also using concurrency limits to queue jobs, so it may be that many late jobs waiting for a concurrency slot all try to start at once? Not sure. Any help or advice on avoiding these crashes would be greatly appreciated!
k
for the task def registration request (RegisterTaskDefinition), the token bucket holds at most 20 tokens but refills at only 1 per second, so it could be that your bucket of request tokens is always nearly empty
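To make the bucket behaviour concrete, here is a minimal simulation sketch. It assumes only the two numbers quoted above (capacity 20, refill 1 per second); the class and function names are made up for illustration, it runs on a simulated clock, and it is not AWS's actual throttling implementation.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Toy token bucket using the figures quoted in the thread."""
    capacity: float = 20.0    # maximum tokens the bucket can hold (burst size)
    refill_rate: float = 1.0  # tokens added per second
    tokens: float = 20.0      # assume the bucket starts full
    last_time: float = 0.0    # simulated clock, in seconds

    def try_acquire(self, now: float) -> bool:
        """Refill for the elapsed time, then spend one token if available."""
        elapsed = now - self.last_time
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_time = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def count_throttled(request_times):
    """Return how many requests at the given timestamps would be rejected."""
    bucket = TokenBucket()
    return sum(0 if bucket.try_acquire(t) else 1 for t in request_times)


burst = [0.0] * 30                        # 30 registrations at the same instant
spaced = [float(i) for i in range(30)]    # 30 registrations, one per second
fast = [i * 0.5 for i in range(60)]       # 60 registrations, two per second

print(count_throttled(burst), count_throttled(spaced), count_throttled(fast))
```

In this toy model, anything faster than one registration per second on a sustained basis eventually drains the bucket, even when individual requests are spaced apart.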
I'm curious whether every run is registering a task def, or only under certain circumstances, like when a deployment config changes. rerunning the same deployment after making no changes shouldn't result in the registration of a task definition
there are some strategies for handling this, but they all revolve around minimizing the number of registrations by trying to keep config stable or discoverable
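One way to picture the "keep config stable or discoverable" idea is a lookup keyed on a fingerprint of the task definition config, so that a registration only happens when the config really changes. The sketch below is hypothetical: TaskDefinitionCache, config_fingerprint, and fake_register are invented names, the registration call is a stand-in rather than a real AWS call, and this is not a description of how the Prefect ECS worker behaves internally.

```python
import hashlib
import json
from typing import Callable, Dict


def config_fingerprint(task_def_config: dict) -> str:
    """Hash a canonical JSON form so key order does not change the result."""
    canonical = json.dumps(task_def_config, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class TaskDefinitionCache:
    """Hypothetical helper: register a config once, then reuse its ARN."""

    def __init__(self, register: Callable[[dict], str]) -> None:
        self._register = register        # callable that performs the real registration
        self._arns: Dict[str, str] = {}  # fingerprint -> task definition ARN
        self.registrations = 0           # counts actual registration calls

    def get_or_register(self, config: dict) -> str:
        key = config_fingerprint(config)
        if key not in self._arns:
            self._arns[key] = self._register(config)
            self.registrations += 1
        return self._arns[key]


def fake_register(config: dict) -> str:
    """Stand-in for a real registration call; returns a made-up ARN."""
    return f"arn:example:task-definition/{config['family']}:{config_fingerprint(config)[:8]}"


cache = TaskDefinitionCache(fake_register)
base = {"family": "my-flow", "cpu": "512", "memory": "1024"}

arn_a = cache.get_or_register(base)
# Same settings in a different key order: reuses the cached ARN.
arn_b = cache.get_or_register({"memory": "1024", "cpu": "512", "family": "my-flow"})
# A genuine config change triggers one more registration.
arn_c = cache.get_or_register({**base, "memory": "2048"})

print(cache.registrations)
```

The point of the sketch is that anything varying per run, such as flow run parameters, has to stay out of the hashed config; otherwise every run produces a new fingerprint and a new registration.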
j
Yeah same here, I wouldn't have expected a new task definition to be registered by runs of the same deployment. I'm only changing the parameters between runs.
> for the task def registration request, the max capacity of tokens for a given second is 20, but refills at only 1 per second, so it could be that your bucket of request tokens is always nearly empty
Helpful to know... perhaps I'll have to space my runs a little further apart or maybe decrease my concurrency limit from 20