gee
09/16/2025, 3:57 PM
Marvin
09/16/2025, 3:58 PM
Marvin
09/16/2025, 4:04 PM
launch_type to EC2 (simple, no capacity providers), or
2) Provide a capacityProviderStrategy that points at your EC2 capacity provider(s) and omit launchType entirely.
Key detail: AWS does not allow specifying launchType and capacityProviderStrategy at the same time on RunTask. Pick one approach.
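As a rough illustration of that rule, here is a minimal Python sketch that builds RunTask keyword arguments and refuses to emit both fields. The helper name build_run_task_kwargs and the example values are hypothetical; only the field names (cluster, taskDefinition, launchType, capacityProviderStrategy) come from the RunTask API.

```python
from __future__ import annotations

from typing import Any, Optional


def build_run_task_kwargs(
    cluster: str,
    task_definition: str,
    launch_type: Optional[str] = None,
    capacity_provider_strategy: Optional[list[dict[str, Any]]] = None,
) -> dict[str, Any]:
    """Build RunTask keyword arguments with at most one placement option.

    Hypothetical helper for illustration; it is not part of Prefect or boto3.
    """
    # The two fields are mutually exclusive, so fail early instead of
    # letting the AWS API reject the request.
    if launch_type and capacity_provider_strategy:
        raise ValueError(
            "Set either launchType or capacityProviderStrategy, not both."
        )

    kwargs: dict[str, Any] = {
        "cluster": cluster,
        "taskDefinition": task_definition,
    }
    if capacity_provider_strategy:
        kwargs["capacityProviderStrategy"] = capacity_provider_strategy
    elif launch_type:
        kwargs["launchType"] = launch_type
    return kwargs
```

Assuming boto3 is installed and credentials are configured, the result could be passed along as boto3.client("ecs").run_task(**kwargs).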
What this looks like with Prefect
- Prefect ships an ECS base job template that includes placeholders for both launchType and capacityProviderStrategy. For EC2 you have two viable configurations:
Option A: Plain EC2 (no capacity providers)
- Set launch_type to EC2
- Do not set capacity_provider_strategy
- This is the simplest way to use your existing EC2 cluster; image caching works because your instances keep layers locally.
Option B: Capacity providers (preferred if you want auto scaling via ASGs)
- Remove the launchType field from the work pool’s base job template (or ensure the launch_type variable is not rendered)
- Set capacity_provider_strategy to your provider(s), e.g. [{ "capacityProvider": "MyEC2Provider", "base": 0, "weight": 1 }]
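Before putting a strategy into a work pool, it can help to sanity-check its shape. The sketch below is a hypothetical validator, not anything Prefect ships. The limits it enforces (weight from 0 to 1000, base from 0 to 100000, at most one item with a non-zero base) reflect my reading of the AWS CapacityProviderStrategyItem reference, so treat them as assumptions and confirm against the current AWS docs.

```python
from __future__ import annotations

from typing import Any


def validate_strategy(strategy: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Check a capacity provider strategy and return it unchanged if it passes.

    Hypothetical helper; the numeric limits are assumptions taken from the
    AWS API reference and should be verified before relying on them.
    """
    if not strategy:
        raise ValueError("The strategy needs at least one item.")

    items_with_base = 0
    for item in strategy:
        if not item.get("capacityProvider"):
            raise ValueError("Each item needs a capacityProvider name.")
        weight = item.get("weight", 0)
        base = item.get("base", 0)
        if not 0 <= weight <= 1000:
            raise ValueError(f"weight out of range: {weight}")
        if not 0 <= base <= 100000:
            raise ValueError(f"base out of range: {base}")
        if base > 0:
            items_with_base += 1

    # Assumed rule: only one provider in a strategy may define a base.
    if items_with_base > 1:
        raise ValueError("Only one item may set a non-zero base.")
    return strategy
```

For the example above, validate_strategy([{"capacityProvider": "MyEC2Provider", "base": 0, "weight": 1}]) returns the list unchanged.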
Getting the starting template and updating it
- Get the default ECS template:
prefect work-pool get-default-base-job-template --type ecs
- For Option B (capacity providers), edit the task_run_request section to drop launchType and keep capacityProviderStrategy, for example:
  "task_run_request": {
    "cluster": "{{ cluster }}",
    "overrides": { ... },
    "tags": "{{ labels }}",
    "taskDefinition": "{{ task_definition_arn }}",
    "capacityProviderStrategy": "{{ capacity_provider_strategy }}"
  }
- Create/update the pool:
prefect work-pool create my-ecs-pool --type ecs --base-job-template ./ecs.json
# or
prefect work-pool update my-ecs-pool --base-job-template ./ecs.json
- Then deploy flows with job variables pointing at your cluster, image, roles, networking, and either:
  - Option A: {"launch_type": "EC2"}
  - Option B: {"capacity_provider_strategy": [{"capacityProvider": "MyEC2Provider","base":0,"weight":1}]}
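If you would rather script the Option B template edit than do it by hand, something like the following Python sketch could work. It assumes the saved template nests task_run_request under a top-level job_configuration key, which is how I understand the default ECS template to be laid out; check your own ecs.json first. The function name drop_launch_type is hypothetical.

```python
from __future__ import annotations

import copy
from typing import Any


def drop_launch_type(template: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of a base job template without launchType.

    Assumes the layout template["job_configuration"]["task_run_request"];
    a KeyError is raised if the template is shaped differently.
    """
    edited = copy.deepcopy(template)  # leave the caller's dict untouched
    request = edited["job_configuration"]["task_run_request"]

    # Remove the field that conflicts with capacity providers.
    request.pop("launchType", None)

    # Keep the capacity provider placeholder, adding it only if it is missing.
    request.setdefault(
        "capacityProviderStrategy", "{{ capacity_provider_strategy }}"
    )
    return edited
```

A typical use would be to load ecs.json with json.load, pass the dict through drop_launch_type, write it back with json.dump, and then run the work-pool update command shown above.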
Notes, caveats, and performance tips
- If you set a capacityProviderStrategy, ensure launchType is omitted; specifying both will cause an AWS error. There was a historical bug where launchType defaulted to FARGATE even when using capacity providers; if you see that behavior, upgrade to the latest prefect-aws and explicitly remove the launchType key from the base job template. Related issue for context: Setting capacityProviderStrategy not working in Push Work Pool.
- Pre-pull your images on EC2 instances to maximize cache hits:
  - Add a user-data script or ASG lifecycle hook that runs docker pull <your-ecr-image:tag> on instance launch.
  - Keep a small warm pool or minimum ASG size > 0 to retain cached layers.
  - Use smaller, layered images and reuse a stable “runtime” image while loading your flow code from source; that way the image changes less frequently and caches better.
- Networking/roles:
Marvin
09/16/2025, 4:04 PM
awsvpc network mode, provide subnets/security groups in network_configuration.
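For the pre-pull tip above, one possible way to generate the user-data script is sketched here in Python. It assumes the instance already has Docker and the AWS CLI installed and an instance role that can read from ECR; the function name and the example image URI are hypothetical.

```python
import shlex


def render_prepull_user_data(image: str, region: str) -> str:
    """Render a bash user-data script that logs in to ECR and pulls one image.

    Assumes Docker and the AWS CLI exist on the instance (an assumption,
    not something guaranteed by every AMI).
    """
    # The registry host is everything before the first "/" in the image URI.
    registry = image.split("/", 1)[0]
    lines = [
        "#!/bin/bash",
        "# Authenticate Docker against the private ECR registry.",
        f"aws ecr get-login-password --region {shlex.quote(region)}"
        f" | docker login --username AWS --password-stdin {shlex.quote(registry)}",
        "# Pull the image so its layers are cached before the first task starts.",
        f"docker pull {shlex.quote(image)}",
    ]
    return "\n".join(lines) + "\n"
```

The returned string could be placed in the launch template's user data for the ASG that backs your capacity provider.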
Docs
- Prefect ECS integration guide: ECS guide
- Capacity provider issue reference: GitHub issue #13030
If you’d like, I can generate a minimal base job template JSON for your chosen approach (plain EC2 vs capacity providers) and a checklist of the job variables to set. What’s your cluster name/ARN and are you already using capacity providers on it?
gee
09/16/2025, 4:18 PM
Marvin
09/16/2025, 4:18 PM
Marvin
09/16/2025, 4:19 PM
prefect worker start -p <your-ecs-pool> somewhere in your environment.
- The worker polls Prefect for runs, then it calls ECS RunTask from inside your network using its local AWS credentials.
- Requirements: You manage that always-on worker compute; useful when AWS API access is restricted to your VPC/private endpoints or you need tighter network control.
What stays the same
- Same ECS cluster, same task definitions, same ability to use EC2 capacity providers for image caching.
- Same base job template and job variables. Both support either:
  - launch_type: EC2, or
  - capacityProviderStrategy: [...] (omit launchType if you set this)
- Start latency on ECS with EC2 image caching is identical across push vs worker; caching happens on your EC2 instances, not in Prefect.
Practical differences
- Ops overhead: Push has no worker to run; worker requires an always-on process.
- Network/control:
  - Push: Prefect must reach AWS ECS/ECR over public endpoints with provided IAM creds.
  - Worker: Runs inside your environment; can use private VPC endpoints, SSO, or enterprise network constraints.
- Submission timing:
  - Push: Immediate submission when a run is scheduled (no poll interval).
  - Worker: Polls for runs; the added delay is negligible in most cases, but it’s an extra moving part.
- Upgrades/behavior:
  - Push uses Prefect’s managed integration. If we add fields (e.g., capacityProviderStrategy), you’ll see them as soon as you update the pool template.
  - Worker behavior depends on your installed prefect-aws version. If you use capacity providers, ensure you’re on a recent version and remove launchType when setting capacityProviderStrategy. Older behavior noted here: issue #13030.
When to choose which
- Choose push if you can let Prefect call AWS directly and want the simplest, scale-to-zero control plane.
- Choose worker if your AWS API access is private/restricted, you need to run everything inside your network, or you want to inject custom operational logic around submissions.
Docs
- ECS integration guide: Prefect ECS guide
- Capacity provider reference: GitHub issue #13030
If you share any network/IAM constraints, I can recommend which model (push vs worker) is the better fit for your environment.