# prefect-server
s
Hi guys, I have been using Prefect but I'm stuck on an issue. I am using ECSAgent and sometimes I get an error that says "An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Too many concurrent attempts to create a new revision of the specified family.". I am running around 80 flows, all scheduled every minute. I am planning to run more flows, maybe hundreds, all of them running every minute. Am I using Prefect incorrectly, and can it handle this scale? I checked out this link from the docs - https://docs.prefect.io/orchestration/agents/ecs.html#throttling-errors-on-flow-submission. I am already running the ECS agent with the specified options for retry mode, but these errors still happen. Is there any way to fix this? I don't really need Prefect to create a new task definition on every flow run; can I disable that behaviour? I am also using a custom task definition; could that be the cause?
Let me know if you guys need any other information.
a
Great that you found this documentation page! 1. What did you set as max attempts? Based on this issue, another user set it to 100 and it helped in their case. 2. If you don’t need to create a new task definition on each run, you can provide an explicit task definition ARN to your ECSRun. This way, the agent won’t register a new one at each run. Here’s the agent code that confirms that:
if run_config.task_definition_arn is None:
    # Register a new task definition
    self.logger.debug(
        "Registering new task definition for flow %s", flow_run.flow.id
    )
    taskdef = self.generate_task_definition(flow_run, run_config)
    resp = self.ecs_client.register_task_definition(**taskdef)
s
Hey @Anna Geller, thanks for replying. 1. I set MAX ATTEMPTS to 10 and also tried 100, but it still didn't work. 2. I am already using the
--task-definition /home/ec2-user/taskdefinition.yaml
option when running the agent; shouldn't that fix this issue? 3. Also, is Prefect really the right tool for my use case? I know Prefect has a lot of advanced features, but I am just looking for a cron that can easily run a huge number of tasks on multiple nodes.
a
Thanks for trying it. Regarding #2, I think you would need to explicitly register a task definition yourself and reference its ARN in order to reuse it. If you need more info on how to do it, LMK. For #3, absolutely! Prefect is a perfect replacement for Cron jobs and is capable of running millions of tasks on whatever infrastructure you need them to run on. It also gives you much more visibility and allows you to take action on failure, which would be hard to do with Cron.
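For #2, registering the task definition once and reusing its ARN could look roughly like the sketch below. This is a minimal illustration, not the agent's own code: the helper name `register_task_definition_once`, the `FakeECSClient` stand-in, the family name `prefect-flows`, the image name, and the account ID in the ARN are all hypothetical. The only real API shapes assumed are boto3's `register_task_definition(**kwargs)` response layout and the `task_definition_arn` argument of `ECSRun`.

```python
def register_task_definition_once(ecs_client, taskdef):
    """Register `taskdef` a single time and return the resulting ARN.

    `ecs_client` is expected to behave like boto3.client("ecs").
    """
    resp = ecs_client.register_task_definition(**taskdef)
    return resp["taskDefinition"]["taskDefinitionArn"]


class FakeECSClient:
    """Hypothetical stand-in for boto3.client("ecs"), so the sketch runs without AWS."""

    def __init__(self):
        self.calls = 0

    def register_task_definition(self, **kwargs):
        # Each registration creates a new revision of the family, as ECS does.
        self.calls += 1
        family = kwargs["family"]
        arn = f"arn:aws:ecs:us-east-1:123456789012:task-definition/{family}:{self.calls}"
        return {"taskDefinition": {"taskDefinitionArn": arn}}


# Hypothetical task definition; the container is named "flow" on the
# assumption that the agent looks for a container with that name.
taskdef = {
    "family": "prefect-flows",
    "containerDefinitions": [{"name": "flow", "image": "my-flow-image:latest"}],
}

client = FakeECSClient()
arn = register_task_definition_once(client, taskdef)

# With real AWS access (assumes boto3 and Prefect 1.x are installed):
#   import boto3
#   from prefect.run_configs import ECSRun
#   arn = register_task_definition_once(boto3.client("ecs"), taskdef)
#   flow.run_config = ECSRun(task_definition_arn=arn)
```

Run the registration once at deploy time rather than inside the flow. Every flow run then points at the same revision, so the `task_definition_arn is None` branch in the agent code above is skipped and no RegisterTaskDefinition call is made per run.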