Robert Banick

08/17/2023, 6:55 PM
My team is struggling to implement a Prefect-AWS stack that lets us customize, on demand, the type of compute we use for different flows. Our use case is that we have flows w/ radically different computational needs and want to use the most cost-effective compute for each. We run each flow once a day, and outside of that window the capacity would go unused, so we want to provide the capacity only on demand. Note that these flows well exceed the limits of what’s obtainable on Fargate. The problem is that we’re unsure how to implement such a system. Prefect’s ECSTask object allows reference to a Task Definition, but TDs don’t allow us to specify the instance type and/or AWS AutoScaling Group we want, just the required CPU and memory. The triggers for increasing the capacity of AutoScaling Groups all seem tied to internal AWS metrics, not to requests passed along by the Agent. Moreover, AWS won’t let us maintain an ECS Service with 0 active deployments or tasks; the service will shut down if unused. We’re feeling a bit lost on how to implement this and not finding much corresponding documentation. Is there something we’re missing here on the Prefect side? Should the instance type / AutoScaling Group specification be implemented w/in the Agent/Worker configuration or some other interface outside of an ECSTask block? Any help would be greatly appreciated.

juan suarez (hf)

08/17/2023, 7:07 PM
I don't know if I'm able to help you, but I have some questions because I might face a similar issue in the future. Let's put Prefect to the side. How would you do this if you had to do it manually? You say that your requirements exceed Fargate, so I'm guessing that you would use EC2 with auto-scaling? If so, how would you trigger this programmatically? Couldn't you create an intermediary EC2 instance that handles the logic (with the AWS CLI) of starting and stopping the other infrastructure?

Robert Banick

08/17/2023, 7:09 PM
We’ve been managing this with AutoScaling groups that we manually scale up and down. This is pretty unsatisfying given that a) it requires lots of button clicking and b) the triggered flow does not always find the desired container, instead choosing another with the wrong resource match.
Isn’t an “intermediary EC2” effectively an Agent/Worker?

juan suarez (hf)

08/17/2023, 7:12 PM
A Worker can take an ECSTask and use it to run your task. That's why I'm asking: I feel that your problem is not related to Prefect but purely to AWS. If ECSTasks can't properly define your setup, then the Workers won't be able to either

Robert Banick

08/17/2023, 7:13 PM
I agree that the solution is probably partly to do with AWS

juan suarez (hf)

08/17/2023, 7:13 PM
So if ECSTasks are not satisfying, you would need: Prefect Server ⇒ trigger a worker ⇒ trigger an intermediary EC2 + infrastructure creation with the AWS CLI or CloudFormation that will create the infrastructure ad hoc
So you can still have the orchestration part managed by Prefect but the infrastructure definition done programmatically and not dependent on Prefect.
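A minimal sketch of the programmatic scale-up/scale-down idea, replacing the manual button clicking on the AutoScaling groups mentioned earlier. It assumes the group's minimum size is 0 and that `autoscaling` behaves like `boto3.client("autoscaling")`; the function name `temporary_capacity` and the group name in the usage note are hypothetical, not something from Prefect or from this thread.

```python
from contextlib import contextmanager


@contextmanager
def temporary_capacity(autoscaling, group_name, desired):
    """Scale an Auto Scaling group up for the duration of a block, then back to 0.

    `autoscaling` is assumed to expose set_desired_capacity() with the same
    keyword arguments as the boto3 Auto Scaling client.
    """
    autoscaling.set_desired_capacity(
        AutoScalingGroupName=group_name,
        DesiredCapacity=desired,
        HonorCooldown=False,
    )
    try:
        yield
    finally:
        # Always scale back in, even if the workload inside the block raised.
        autoscaling.set_desired_capacity(
            AutoScalingGroupName=group_name,
            DesiredCapacity=0,
            HonorCooldown=False,
        )
```

It could be used as `with temporary_capacity(boto3.client("autoscaling"), "heavy-flows-asg", 2): ...` around the call that launches the heavy flow. Note that `set_desired_capacity` returns before the instances are running, so a real version would likely also need to wait until they have joined the ECS cluster.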

Robert Banick

08/17/2023, 7:15 PM
How would the worker provide appropriate instructions to the intermediary EC2? Run a flow that does that?

juan suarez (hf)

08/17/2023, 7:17 PM
Let's say that the Worker triggers a Fargate task, right? That Fargate task runs a container with the scripts + CloudFormation. The scripts + CloudFormation set up the infrastructure that you need. Once the infrastructure is ready, you trigger the code that you need to run
In other words, you need to create a "Worker-bis" that does the infrastructure part, since the official Worker can't
And I imagine that your case is an edge case, so I don't see the Prefect Worker being able to handle complex architecture definitions in the short term
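The "set up the infrastructure, run the code, tear it down" sequence described above could look roughly like the sketch below. It assumes `cloudformation` behaves like `boto3.client("cloudformation")` and that the template is supplied by the caller; `run_on_adhoc_stack` and `workload` are hypothetical names used only for illustration.

```python
def run_on_adhoc_stack(cloudformation, stack_name, template_body, workload):
    """Create a CloudFormation stack, run `workload`, then delete the stack.

    `cloudformation` is assumed to expose create_stack(), get_waiter() and
    delete_stack() like the boto3 CloudFormation client.
    `workload` is any zero-argument callable that does the actual work.
    """
    cloudformation.create_stack(StackName=stack_name, TemplateBody=template_body)
    try:
        # Block until CloudFormation reports the stack as fully created.
        waiter = cloudformation.get_waiter("stack_create_complete")
        waiter.wait(StackName=stack_name)
        return workload()
    finally:
        # Tear the stack down whether the workload succeeded or failed.
        cloudformation.delete_stack(StackName=stack_name)
```

Here `workload` might be the function that starts the heavy flow on the newly created infrastructure. A template that creates IAM resources would additionally need the `Capabilities` argument on `create_stack`.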

Robert Banick

08/17/2023, 7:19 PM
Interesting, I wouldn’t have thought that being specific w/ architectural requirements is an edge case, but I guess so.
Will think about this solution a bit, thanks for the feedback @juan suarez (hf)

juan suarez (hf)

08/17/2023, 7:20 PM
I mean, if your needs exceed Fargate capacity and the ECSTask can't correctly define your infrastructure, I would assume that not even AWS thinks that your situation is entirely standard 😉

Dominic Tarro

08/17/2023, 8:02 PM
@Robert Banick One place to check is the run_deployment function. You may be able to apply infrastructure overrides.

Jake Kaplan

08/17/2023, 8:36 PM
I'd recommend trying to use an ECSWorker and ECS Work Pool. Even beyond the basic configuration that is exposed, in the work pool's Advanced tab you can add basically anything to the base job template that you'd like to pass to the ECS API, on a per-work-pool or per-deployment basis, for the following:
• registering the task (task_definition)
• running the task (task_run_request)
However, to the above point, if you're unable to do what you want via those endpoints directly, it's likely outside the realm of what an ECS Worker/Work Pool supports. You can still use Prefect and the above to run flows that spin the infrastructure you need up and down (the intermediary step), but it's definitely more of a manual process, since you'll need to make some specialized calls to AWS
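As an illustration of what might go into the task_run_request section, the sketch below builds a fragment in the shape of the ECS RunTask API that targets a specific capacity provider (which can be backed by an Auto Scaling group) and, optionally, a specific instance type. Whether and how a given work pool template passes these keys through is an assumption based on the description above; the helper name and the capacity provider name in the usage note are hypothetical.

```python
def task_run_request_fragment(capacity_provider, instance_type=None):
    """Build RunTask-style parameters that steer a task onto specific EC2 capacity."""
    fragment = {
        # RunTask accepts either launchType or capacityProviderStrategy, not both,
        # so any launchType already in the template would have to be removed.
        "capacityProviderStrategy": [
            {"capacityProvider": capacity_provider, "weight": 1, "base": 0}
        ]
    }
    if instance_type is not None:
        # Cluster query language expression restricting placement by instance type.
        fragment["placementConstraints"] = [
            {
                "type": "memberOf",
                "expression": f"attribute:ecs.instance-type == {instance_type}",
            }
        ]
    return fragment
```

For example, `task_run_request_fragment("heavy-flows-cp", "c5.4xlarge")` returns a dict that could be merged into the task_run_request of one work pool, with a different provider and instance type for each class of flow.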