#prefect-community

Bruno Murino

02/07/2022, 2:14 PM
Hi everyone, I have an ECS Agent kicking off some ECS tasks on the underlying EC2 instances (we are not using Fargate), and sometimes the agent tries to start the ECS task on an EC2 instance that doesn't have enough memory available, so the flow fails to start. Is there any way to tell the ECS agent to try again or something?

Kevin Kho

02/07/2022, 2:41 PM
Does it not start at all or does it die somewhere while running the flow?

Bruno Murino

02/07/2022, 3:32 PM
does not start at all!

Anna Geller

02/07/2022, 3:38 PM
That's the trade-off when using a self-managed EC2 data plane rather than Fargate. You get better latency at the cost of having to manage resource utilization yourself. You need to allocate more memory to your ECS task for the flow run. If you specify only a hard memory limit, you can prevent running out of memory because ECS will stop placing tasks on the instance if there is not enough memory there. And if there are not enough resources on your cluster, you need to scale it out or up to ensure there is enough capacity for your workloads. Check out this StackOverflow answer for more info about this.
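To make the hard-limit point concrete, here is a minimal sketch of which value ECS counts against an instance's available memory when placing a task on EC2. The `memory` (hard limit) and `memoryReservation` (soft limit) keys are real ECS container-definition fields, both in MiB; the helper name and the sample values are hypothetical and only for illustration.

```python
def placement_memory_mib(container_def: dict) -> int:
    """Memory (MiB) that ECS subtracts from an instance's available memory.

    ECS uses the soft limit (memoryReservation) when it is set,
    otherwise it falls back to the hard limit (memory).
    """
    if "memoryReservation" in container_def:
        return container_def["memoryReservation"]
    return container_def["memory"]


# Hypothetical container definitions, for illustration only.
hard_only = {"name": "flow", "memory": 4096}
soft_and_hard = {"name": "flow", "memory": 4096, "memoryReservation": 1024}

print(placement_memory_mib(hard_only))      # 4096
print(placement_memory_mib(soft_and_hard))  # 1024
```

With only a hard limit set, the full amount is reserved at placement time, so an instance is a candidate only if that much memory is still unreserved on it.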

Bruno Murino

02/07/2022, 4:17 PM
Hmm, not sure I follow 100%, but we do have about 12 EC2 instances with lots of free space, except for a few that are almost full. So if ECS tries to start a task on one of the full EC2 instances it will fail to start, but it could just try again, randomly picking another EC2 instance, and it would almost always succeed. The problem now is that the ECS Agent doesn't try again at all.
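The thread does not confirm that the ECS agent offers a retry option. Purely as a sketch of the idea described here, a caller could wrap task submission in a retry loop. `submit` is a hypothetical callable assumed to return a dict shaped like the boto3 ECS `run_task` response (with `tasks` and `failures` lists); the function name, attempt count, and delay are assumptions as well.

```python
import time
from typing import Callable


def run_with_retry(submit: Callable[[], dict], max_attempts: int = 3,
                   delay_seconds: float = 5.0) -> dict:
    """Call submit() until it reports at least one started task.

    submit is assumed to return a dict shaped like the boto3 ECS
    run_task response: {"tasks": [...], "failures": [...]}.
    """
    last_response: dict = {}
    for attempt in range(1, max_attempts + 1):
        last_response = submit()
        if last_response.get("tasks"):
            return last_response  # a task was placed, stop retrying
        if attempt < max_attempts:
            time.sleep(delay_seconds)  # wait before the next placement attempt
    reasons = [f.get("reason") for f in last_response.get("failures", [])]
    raise RuntimeError(f"task not placed after {max_attempts} attempts: {reasons}")
```

In practice `submit` could be a small closure around whatever call starts the task; the loop itself does not depend on Prefect or boto3.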

Anna Geller

02/07/2022, 4:20 PM
Can you share a bit more about your setup? You can control a lot of that using placement strategies and resource allocation set on your ECSRun. Can you share your run config? How did you create your ECS agent?
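As an illustration of the placement-strategy idea, the sketch below builds the extra arguments that the ECS `RunTask` API accepts under `placementStrategy`. The strategy types (`binpack`, `spread`, `random`) and their fields are real ECS values; the helper name is hypothetical, and passing the result through `ECSRun(run_task_kwargs=...)` assumes a Prefect 1.x setup like the one discussed here.

```python
def placement_kwargs(strategy: str = "binpack") -> dict:
    """Build extra run_task arguments that select an ECS placement strategy."""
    if strategy == "binpack":
        # Prefer instances that leave the least unused memory after placement.
        return {"placementStrategy": [{"type": "binpack", "field": "memory"}]}
    if strategy == "spread":
        # Distribute tasks evenly across container instances.
        return {"placementStrategy": [{"type": "spread", "field": "instanceId"}]}
    if strategy == "random":
        # Pick any instance that satisfies the task's resource requirements.
        return {"placementStrategy": [{"type": "random"}]}
    raise ValueError(f"unknown placement strategy: {strategy!r}")


run_task_kwargs = placement_kwargs("spread")
print(run_task_kwargs)
# Assumption: with Prefect 1.x this dict could be supplied as
# ECSRun(run_task_kwargs=run_task_kwargs, ...) on the flow's run config.
```

Whichever strategy is chosen, ECS still has to find an instance with enough unreserved CPU and memory for the task, so a strategy changes which instance is picked, not whether capacity exists.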

Bruno Murino

02/07/2022, 4:30 PM
these are the run config for the flows and the ECS agent setup — it’s quite minimal indeed

Anna Geller

02/07/2022, 4:39 PM
Thanks, your setup looks fine. Some things to check:
1. Have you set the memory in MB? For example, if you want 4 GB you need to set 4000.
2. Did you check for more details about this error in the logs or the AWS ECS metrics tab?
This user encountered a similar error and confirmed that their ECS task requested more memory than was available on any of their instances.
A similar story here: "the sum of the memory I allocated for each container was > the memory of the EC2 box."
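The two checks above come down to simple arithmetic, sketched below. ECS expresses memory in MiB, so 4 GiB corresponds to 4096; the function names and the sample container sizes are hypothetical, and `instance_free_mib` stands for whatever memory is still unreserved on a given instance.

```python
from typing import List


def gib_to_mib(gib: float) -> int:
    """Convert GiB to the MiB unit used by ECS memory settings."""
    return int(gib * 1024)


def task_fits(container_memory_mib: List[int], instance_free_mib: int) -> bool:
    """True if the summed container memory fits in the instance's free memory."""
    return sum(container_memory_mib) <= instance_free_mib


# Hypothetical task with three containers: 2 GiB + 2 GiB + 1 GiB = 5120 MiB.
containers = [gib_to_mib(2), gib_to_mib(2), gib_to_mib(1)]

print(gib_to_mib(4))                         # 4096
print(task_fits(containers, gib_to_mib(4)))  # False
print(task_fits(containers, gib_to_mib(8)))  # True
```

If the check fails for every instance in the cluster, retrying placement cannot help; the task needs less memory or the cluster needs more capacity.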