
Sean Talia

02/09/2022, 4:33 PM
Hi all, does anyone here have experience running Prefect flows using AWS ECS tasks on EC2? My org currently runs a lot of flows using ECS tasks on Fargate, but Fargate has some constraints that EC2 doesn't (e.g. you can't use GPU-enabled instances with Fargate). To accommodate some of these requests, I was considering either:
1. Launching a new EC2 instance (with all the hardware the requesting teams need) on which we'd have a DockerAgent running, and then using labels to manage which DockerRun flows will or won't run on that beefed-up EC2 instance
2. Using ECS on EC2, and then simply managing the execution of the flows on the EC2 instance by configuring an ECSRun flow to use an ECS task that executes on EC2
The former option seems a little more straightforward, and we could get it up and running pretty quickly, but it would involve the overhead of managing labels, new API keys for the agents, etc. The latter is probably more flexible in the end, but there's more upfront work for us since ECS on EC2 is not a workflow that we currently support. Has anyone ever deliberated over this issue or experimented with it?

Kevin Kho

02/09/2022, 5:44 PM
We have users on number 2. It also reduces startup time compared to Fargate due to image caching. Both of these are valid though and if you need a GPU, it sounds like having a dedicated box for it would be the best because it’s not as straightforward to choose which EC2 instance gets the ECS work
You’d have to fiddle with placement strategies to get the work on the GPU I think
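A rough sketch of what steering ECS work onto GPU hosts might look like, assuming ECSRun's run_task_kwargs is forwarded to the ECS RunTask call. Note that it uses a placement constraint rather than a placement strategy, and that the helper name, the g4dn instance family, and the "ecs-gpu" label are hypothetical:

```python
def gpu_run_task_kwargs(instance_family="g4dn"):
    """Build ECS RunTask arguments that pin a task to EC2 hosts of one family.

    Assumption: the cluster already has container instances of this
    (hypothetical) GPU family registered.
    """
    return {
        "launchType": "EC2",
        "placementConstraints": [
            {
                "type": "memberOf",
                # ECS cluster query language: match on the instance type attribute
                "expression": f"attribute:ecs.instance-type =~ {instance_family}.*",
            }
        ],
    }


kwargs = gpu_run_task_kwargs()
print(kwargs)

# Hypothetical usage in a flow (not executed here):
# from prefect.run_configs import ECSRun
# flow.run_config = ECSRun(run_task_kwargs=gpu_run_task_kwargs(), labels=["ecs-gpu"])
```

The task definition would presumably also need to request a GPU in its resource requirements, which this sketch leaves out.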

Sean Talia

02/09/2022, 5:50 PM
Okay cool, this is very much what I was anticipating...with the ECS route, both the upfront overhead and the ongoing management could end up being more trouble than they're worth. Deciding which work goes where is so straightforward with the Prefect agents and labels that being able to leverage that alone seems to justify having a dedicated box that can be created/destroyed at will
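For illustration, the label routing mentioned here can be modeled in a few lines. This is a simplified sketch, assuming the rule that an agent picks up a flow run only when the run's labels are all present on the agent; the agent names and labels are hypothetical, and this is not Prefect's actual implementation:

```python
def agent_can_run(flow_labels, agent_labels):
    """True if every label on the flow run is also present on the agent."""
    return set(flow_labels) <= set(agent_labels)


def eligible_agents(flow_labels, agents):
    """Return the names of agents whose labels cover the flow run's labels."""
    return sorted(
        name for name, labels in agents.items()
        if agent_can_run(flow_labels, labels)
    )


# Hypothetical setup: the existing Fargate agent plus a dedicated GPU box.
agents = {
    "fargate-agent": {"ecs", "fargate"},
    "gpu-box-agent": {"docker", "gpu"},
}

gpu_targets = eligible_agents(["gpu"], agents)
fargate_targets = eligible_agents(["ecs", "fargate"], agents)
print(gpu_targets, fargate_targets)
```

Under that assumption, a flow labeled "gpu" is only ever claimed by the agent on the dedicated box, so tearing the box down simply leaves those runs waiting.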

Kevin Kho

02/09/2022, 6:32 PM
Yes, with ECS, managing the underlying EC2 instances is on you. Agents would handle the routing at the Prefect level

Sean Talia

02/10/2022, 2:43 PM
Btw @Kevin Kho, we're still on 0.14.22, haven't yet moved to 0.15...it looks like the host_config param wasn't introduced to the DockerRun class until 0.15.2, so is it safe to assume we can't even take advantage of passing the gpu runtime params until we do that?

Kevin Kho

02/10/2022, 2:45 PM
Yes, and also getting the container to use a GPU can be hard. docker-py doesn't have direct support for it. I don't know if you have experience with that

Sean Talia

02/10/2022, 2:46 PM
hahaha i was just reading through this open issue
it seems like this:
device_requests=[
        docker.types.DeviceRequest(count=-1, capabilities=[['gpu']])
    ]
actually was working for some people though?

Kevin Kho

02/10/2022, 2:55 PM
I don't know if the Prefect implementation of DockerRun exposes this. Or would that go in host_config?

Sean Talia

02/10/2022, 3:00 PM
I would think so, because the device_requests parameter is something that can be passed to the create_host_config method
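A minimal sketch of how that might be wired up, assuming DockerRun (0.15.2 or later) forwards host_config entries to docker-py's create_host_config, and assuming docker-py accepts a plain dict in place of a DeviceRequest object. The helper name, image name, and label are hypothetical, and none of this was confirmed in the thread:

```python
def gpu_host_config(count=-1):
    """Build host_config kwargs that request GPUs for the flow's container.

    count=-1 is meant to request all available GPUs, mirroring
    docker.types.DeviceRequest(count=-1, capabilities=[['gpu']]).
    Plain dicts are used (an assumption) so the run config stays
    JSON-serializable.
    """
    return {
        "device_requests": [
            {"count": count, "capabilities": [["gpu"]]},
        ]
    }


host_config = gpu_host_config()
print(host_config)

# Hypothetical usage in a flow (not executed here):
# from prefect.run_configs import DockerRun
# flow.run_config = DockerRun(
#     image="my-gpu-image", labels=["gpu"], host_config=gpu_host_config()
# )
```

The host would still need the NVIDIA container runtime installed for the request to be honored.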

Kevin Kho

02/10/2022, 3:04 PM
Ah, I see. Yeah, that could work, and yes, you'd need to upgrade, but I think going to 0.15.0 didn't have that many breaking changes
You can check here

Sean Talia

02/10/2022, 3:05 PM
yeah I was just reading the release notes and it doesn't seem like we'll be affected by much since most of what was deprecated appears to be pretty old