Hi, I have an issue using ecs work pool when flow...
# ask-community
r
Hi, I have an issue with an ECS work pool: when flows start at the same moment, they all try to register an AWS task definition, and one of them fails with:
An error occurred (ClientException) when calling the RegisterTaskDefinition operation: Too many concurrent attempts to create a new revision of the specified family.
I've searched the Internet and looked through GitHub issues, and I'm asking here because nothing has really helped. This issue is similar to mine: https://github.com/PrefectHQ/prefect/issues/10102
My stack:
• prefect 2.20.2
• prefect-aws 0.4.2
• ecs work pool
• scheduled deployments
I've tried setting these variables in my worker Dockerfile:
ENV AWS_RETRY_MODE=adaptive
ENV AWS_MAX_ATTEMPTS=100
However, this didn't resolve the issue. I have 5 scheduled flows that start simultaneously each day. Each of these flows attempts to register a new task definition, likely because a new deployment version is created each night. Out of the 5 flows, 4 run successfully and 1 fails (a different one each day) with the error: Too many concurrent attempts to create a new revision of the specified family. My question is: is there a way to avoid this issue, or to prevent a new task definition from being registered every time, especially since I'm using the same Docker image for each flow with only the input parameters differing? (I have enabled Match Latest Revision In Family (Optional), but it's not working.) I've been stuck on this for a while :(
k
Have you been able to diff, say, the two most recent revisions in a family? Could it be the flow run name changing every time that's doing it? What's your ECS logging config on the work pool?
r
Thanks for your response. Here are the 2 most recent revisions:
{
  "taskDefinitionArn": "arn:aws:ecs:MY_REGION:MY_AWS_ACCOUNT:task-definition/prefect_ecs-test-work-pool_family:1315",
  "containerDefinitions": [
    {
      "name": "prefect-ecs_work_pool-flows",
      "image": "<http://MY_AWS_ACCOUNT.dkr.ecr.MY_REGION.amazonaws.com/prefect-ecs-flow-image:latest|MY_AWS_ACCOUNT.dkr.ecr.MY_REGION.amazonaws.com/prefect-ecs-flow-image:latest>",
      "cpu": 0,
      "portMappings": [],
      "essential": true,
      "environment": [],
      "mountPoints": [],
      "volumesFrom": [],
      "systemControls": []
    }
  ],
  "family": "prefect_ecs-test-work-pool_family",
  "executionRoleArn": "arn:aws:iam::MY_AWS_ACCOUNT:role/ecsTaskExecutionRole",
  "networkMode": "awsvpc",
  "revision": 1315,
  "volumes": [],
  "status": "ACTIVE",
  "requiresAttributes": [
    {
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    },
    {
      "name": "ecs.capability.execution-role-ecr-pull"
    },
    {
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "name": "ecs.capability.task-eni"
    }
  ],
  "placementConstraints": [],
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "1024",
  "memory": "2048",
  "registeredAt": "2024-10-28T07:59:56.404Z",
  "registeredBy": "arn:aws:sts::MY_AWS_ACCOUNT:assumed-role/ecsTaskExecutionRole/172ab11eca18439ea69540877e25bbe3",
  "enableFaultInjection": false,
  "tags": []
}
and the second:
{
  "taskDefinitionArn": "arn:aws:ecs:MY_REGION:MY_AWS_ACCOUNT:task-definition/prefect_ecs-test-work-pool_family:1314",
  "containerDefinitions": [
    {
      "name": "prefect-ecs_work_pool-flows",
      "image": "<http://MY_AWS_ACCOUNT.dkr.ecr.MY_REGION.amazonaws.com/prefect-ecs-flow-image:latest|MY_AWS_ACCOUNT.dkr.ecr.MY_REGION.amazonaws.com/prefect-ecs-flow-image:latest>",
      "cpu": 0,
      "portMappings": [],
      "essential": true,
      "environment": [],
      "mountPoints": [],
      "volumesFrom": [],
      "systemControls": []
    }
  ],
  "family": "prefect_ecs-test-work-pool_family",
  "executionRoleArn": "arn:aws:iam::MY_AWS_ACCOUNT:role/ecsTaskExecutionRole",
  "networkMode": "awsvpc",
  "revision": 1314,
  "volumes": [],
  "status": "ACTIVE",
  "requiresAttributes": [
    {
      "name": "com.amazonaws.ecs.capability.ecr-auth"
    },
    {
      "name": "ecs.capability.execution-role-ecr-pull"
    },
    {
      "name": "com.amazonaws.ecs.capability.docker-remote-api.1.18"
    },
    {
      "name": "ecs.capability.task-eni"
    }
  ],
  "placementConstraints": [],
  "compatibilities": [
    "EC2",
    "FARGATE"
  ],
  "requiresCompatibilities": [
    "FARGATE"
  ],
  "cpu": "1024",
  "memory": "2048",
  "registeredAt": "2024-10-28T07:59:56.333Z",
  "registeredBy": "arn:aws:sts::MY_AWS_ACCOUNT:assumed-role/ecsTaskExecutionRole/172ab11eca18439ea69540877e25bbe3",
  "enableFaultInjection": false,
  "tags": []
}
So there are only 3 differences I can see:
• revision name (taskDefinitionArn)
• revision
• registeredAt
Nothing really changed. I read in one thread that AWS logging, if configured, can lead to a new task definition being created every time, so I turned it off a while ago, but the issue still occurs. The flow name stays the same every time.
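For anyone wanting to repeat this comparison without eyeballing the JSON, here is a minimal Python sketch. It assumes the two revisions have already been loaded as dicts (for example from the output of `aws ecs describe-task-definition`), and the helper name `diff_task_definitions` and the ignore list are hypothetical choices made for illustration. It does not reproduce how prefect-aws itself decides whether an existing revision can be reused; it only shows which top-level fields differ once the values ECS sets on every registration are left out.

```python
# Fields ECS fills in on every registration. They differ between revisions
# even when the submitted definition is identical. (Assumed list, chosen
# from the fields that differ in the two revisions posted above.)
SERVER_SET_FIELDS = frozenset(
    {"taskDefinitionArn", "revision", "registeredAt", "registeredBy"}
)


def diff_task_definitions(a, b, ignore=SERVER_SET_FIELDS):
    """Return {key: (value_in_a, value_in_b)} for top-level keys that differ."""
    keys = (set(a) | set(b)) - set(ignore)
    return {k: (a.get(k), b.get(k)) for k in sorted(keys) if a.get(k) != b.get(k)}


if __name__ == "__main__":
    # Trimmed copies of the two revisions from this thread.
    rev_1315 = {
        "family": "prefect_ecs-test-work-pool_family",
        "revision": 1315,
        "cpu": "1024",
        "memory": "2048",
        "registeredAt": "2024-10-28T07:59:56.404Z",
    }
    rev_1314 = {
        "family": "prefect_ecs-test-work-pool_family",
        "revision": 1314,
        "cpu": "1024",
        "memory": "2048",
        "registeredAt": "2024-10-28T07:59:56.333Z",
    }
    print(diff_task_definitions(rev_1315, rev_1314))  # prints {}
```

An empty result means the two revisions are equivalent apart from the server-set fields, which matches what was observed here.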
@Kevin Grismore any ideas? It looks like a bug to me. Should I raise it on GitHub?
s
@Kevin Grismore or @Robert Torbiczuk, if you've managed to solve this, please let me know how.
r
@Samuel Hinton I just retry it, via a Lambda I set up in AWS 😕
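The Lambda itself isn't shown in this thread, so the following is only a generic sketch of the retry idea, not Robert's implementation. It assumes the failing operation can be wrapped in a callable; `retry_with_jitter` and `is_concurrent_revision_error` are hypothetical names, and the error is matched on the message text quoted at the top of the thread. Randomized ("full jitter") backoff is used so that several flows failing at the same instant do not all retry at the same instant again.

```python
import random
import time


def is_concurrent_revision_error(exc):
    """True if the exception carries the ECS error message quoted in this thread."""
    return "Too many concurrent attempts" in str(exc)


def retry_with_jitter(fn, is_retryable, max_attempts=5,
                      base_delay=1.0, max_delay=30.0, sleep=time.sleep):
    """Call fn(); on a retryable error, wait a random, growing delay and retry."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except Exception as exc:
            # Give up on the last attempt or on errors we don't recognize.
            if attempt == max_attempts or not is_retryable(exc):
                raise
            # Full jitter: pick a random wait up to the capped exponential delay.
            cap = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(0, cap))
```

Here `fn` stands in for whatever re-submits the flow run or registers the task definition, for example `retry_with_jitter(submit_flow_run, is_concurrent_revision_error)`, where `submit_flow_run` is a placeholder you would supply yourself.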