Madushan Pathirana
09/28/2025, 10:43 AM

Marvin
09/28/2025, 10:43 AM

Marvin
09/28/2025, 10:46 AM
{
  "cluster": "my-ecs-cluster",
  "launch_type": "EC2",
  "subnets": [
    "subnet-xxxxx"
  ],
  "security_groups": [
    "sg-xxxxxx"
  ],
  "task_role_arn": "arn:aws:iam::123456789012:role/PrefectTaskRole",
  "execution_role_arn": "arn:aws:iam::123456789012:role/PrefectTaskExecutionRole",
  "image": "123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/myimage:latest",
  "cpu": 512,
  "memory": 1024
}
Example using a capacity provider strategy instead of launch_type (don’t set both):
{
  "cluster": "my-ecs-cluster",
  "capacity_provider_strategy": [
    {
      "name": "MyEc2CapacityProvider",
      "weight": 1,
      "base": 0
    }
  ],
  "subnets": [
    "subnet-xxxxx"
  ],
  "security_groups": [
    "sg-xxxxxx"
  ],
  "task_role_arn": "arn:aws:iam::123456789012:role/PrefectTaskRole",
  "execution_role_arn": "arn:aws:iam::123456789012:role/PrefectTaskExecutionRole",
  "image": "123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/myimage:latest",
  "cpu": 512,
  "memory": 1024
}
Given your setup:
- Running Prefect Server + worker + ECS on a single EC2 instance in ap-southeast-1 means you won't get autoscaling unless you introduce an ASG-backed ECS cluster. If you want the cluster to grow when jobs queue up, set up an Auto Scaling Group and attach it as a capacity provider with managed scaling (this also covers RunTask-style flow runs when you pass a capacity_provider_strategy, as in the example above), or scale the ASG yourself on ECS cluster reservation metrics (CloudWatch CPUReservation/MemoryReservation). A sketch of the capacity-provider route follows below.
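A minimal boto3 sketch of the capacity-provider route (hedged: the ASG ARN, provider name, and cluster name are placeholders, and it assumes the ASG already exists):

import boto3

ecs = boto3.client("ecs", region_name="ap-southeast-1")

# 1) Create a capacity provider backed by your ASG (placeholder ARN).
ecs.create_capacity_provider(
    name="MyEc2CapacityProvider",
    autoScalingGroupProvider={
        "autoScalingGroupArn": "arn:aws:autoscaling:ap-southeast-1:123456789012:autoScalingGroup:...:autoScalingGroupName/my-ecs-asg",
        "managedScaling": {
            "status": "ENABLED",
            "targetCapacity": 100,  # ECS scales the ASG to keep reservation near 100%
        },
        "managedTerminationProtection": "DISABLED",
    },
)

# 2) Attach it to the cluster so runs can reference it in a strategy.
ecs.put_cluster_capacity_providers(
    cluster="my-ecs-cluster",
    capacityProviders=["MyEc2CapacityProvider"],
    defaultCapacityProviderStrategy=[
        {"capacityProvider": "MyEc2CapacityProvider", "weight": 1, "base": 0}
    ],
)

With managed scaling enabled, ECS raises the ASG's desired count when tasks can't be placed, which is what lets queued flow runs trigger new instances.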
Helpful docs:
- Prefect AWS ECS guide
- AWS: ECS capacity providers overview
- AWS: ECS cluster auto scaling considerations
Questions to tailor the setup:

Marvin
09/28/2025, 10:46 AM

Madushan Pathirana
09/28/2025, 11:05 AM

Marvin
09/28/2025, 11:05 AM

Marvin
09/28/2025, 11:07 AM
from prefect import flow
from prefect.deployments import run_deployment


@flow
def my_flow():
    ...  # flow logic (parameters elided in the original)


# Small
my_flow.deploy(
    name="ecs-small",
    work_pool_name="my-ecs-pool",
    job_variables={
        "launch_type": "EC2",
        "cpu": 1024,     # example
        "memory": 2048,  # 2GB
        "cluster": "my-ecs-cluster",
        "subnets": ["subnet-xxx"],
        "security_groups": ["sg-xxx"],
        "image": "123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/myimage:latest",
    },
)

# Large
my_flow.deploy(
    name="ecs-large",
    work_pool_name="my-ecs-pool",
    job_variables={
        "launch_type": "EC2",
        "cpu": 2048,     # example
        "memory": 4096,  # 4GB
        "cluster": "my-ecs-cluster",
        "subnets": ["subnet-xxx"],
        "security_groups": ["sg-xxx"],
        "image": "123456789012.dkr.ecr.ap-southeast-1.amazonaws.com/myimage:latest",
    },
)


# Optional: a simple router flow
@flow
def router(should_use_large: bool, params: dict | None = None):
    # Note: the default flow name replaces underscores with dashes ("my-flow")
    target = "my-flow/ecs-large" if should_use_large else "my-flow/ecs-small"
    return run_deployment(name=target, parameters=params or {})
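A hedged usage sketch (the sizing signal and threshold below are illustrative placeholders, not from the example above):

# Route to the large deployment when the input looks too big for a 2GB task
estimated_input_gb = 1.8
router(should_use_large=estimated_input_gb > 1.5)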
2) Abort-and-retry with a larger size (advanced)
- Monitor your own process memory (e.g., psutil) early in the run. If you detect you’ll exceed 2GB, programmatically launch the “large” deployment and exit the current run.
- Caveat: you’ll lose partial work unless you checkpoint or make your flow idempotent.
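A minimal sketch of that pattern, assuming psutil is installed in the image and the "ecs-large" deployment from above exists (the 80% trigger is an assumption, not a recommendation):

import psutil
from prefect import flow, get_run_logger
from prefect.deployments import run_deployment

SMALL_TASK_MEMORY_BYTES = 2 * 1024**3  # the 2GB "small" task size


@flow
def my_flow(params: dict | None = None):
    logger = get_run_logger()
    # ... do an early, cheap slice of the work, then check resident memory ...
    rss = psutil.Process().memory_info().rss
    if rss > 0.8 * SMALL_TASK_MEMORY_BYTES:
        # Only do this check in the small deployment, or the large run will re-trigger itself.
        logger.warning("Approaching the 2GB limit; re-dispatching to ecs-large")
        run_deployment(name="my-flow/ecs-large", parameters=params or {})
        return "re-dispatched"
    # ... continue the normal (small) path ...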
Why ECS service autoscaling won’t help here
- Service autoscaling changes the number of tasks, not the memory size of an existing task.
- Vertical “autoscaling” (changing CPU/memory of a running task) isn’t supported by ECS; you must start a new task definition/revision with higher resources.
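For reference, "scaling up" outside Prefect means registering a new revision, roughly like this (hedged sketch: the family name is a placeholder, and fields such as task/execution role ARNs and network mode would need to be carried over too):

import boto3

ecs = boto3.client("ecs", region_name="ap-southeast-1")

# Fetch the current definition and register a larger revision of it.
current = ecs.describe_task_definition(taskDefinition="my-prefect-flow")["taskDefinition"]
ecs.register_task_definition(
    family=current["family"],
    containerDefinitions=current["containerDefinitions"],
    cpu="2048",     # up from 1024
    memory="4096",  # up from 2048
)

With Prefect's ECS worker you don't do this by hand: changing the cpu/memory job variables on a deployment means new runs launch with those resources.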
Fit with your environment
- You're on a single EC2 host (~8GB RAM) running Prefect Server + worker. A 4GB task may fail placement if the host is busy. If you plan to run both 2GB and 4GB tasks, consider:
  - keeping concurrency low (maybe 1 ECS task at a time), or
  - moving the ECS cluster into an Auto Scaling Group so capacity can grow when needed.
Helpful docs
- Prefect AWS ECS guide
- Customize job variables (per-deployment)
A quick question to tailor this: do you want to upscale mid-run when utilization crosses 80%, or is scaling between runs acceptable? If between runs is fine, I’ll sketch a minimal “router” that bases the decision on recent run metadata or input size.