# show-and-tell
j
My brain hurts. Shout out to @Tyler Wanner, @Kyle Moon-Wright, @Kyle DeMerritt, @Amanda Wee for their support. Special shout out to @Anna Geller (old account) and @Laura Lorenz for their content on EKS and the new 0.14 updates. I've successfully deployed my flows using Prefect Cloud, AWS ECR, ECS, and Fargate. I'm a self-taught dev and would love any feedback to clarify or include any concepts I may have luckily bypassed. https://lejimmy.com/distributed-data-pipelines-with-aws-ecs-fargate-and-prefect-cloud/
marvin 9
🎉 4
🧠 9
🚀 10
t
glad u got there! and thanks so much for sharing your feedback and learning with the community 🙏
🚀 2
f
@Jimmy Le amazing tutorial, thanks for sharing with the community. Excellently written blog post with a nice intro, screenshots etc., will definitely try to follow it step-by-step 😍
🎉 3
j
Playing around with Repl.it, I was able to deploy the ECSAgent to an "Always-On" Repl. Code here: https://repl.it/talk/share/Prefect-ECSAgent-to-deploy-AWS-ECS-Fargate-tasks/120958 It could be pretty cost effective if I can get `supervisord` to work on it. 🤔
b
@Jimmy Le did you consider running the agent as an ecs service to avoid having to deal with `supervisord`? it's an option anyway.
j
@Billy McMonagle I thought about it but haven't quite solidified how to execute on it. I'm still new to the AWS world. Would you run one agent per ECS task or would you run an ECS service?
b
The ECS terminology is confusing, so very understandable. A "Service" just means it's always running, so ECS is sort of acting like `supervisord` for you. I believe it is correct to say that the task is part of the service (or the service runs the task)
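A minimal sketch of the "agent as an ECS service" idea, assuming the agent already has its own task definition. The service name `prefect-agent`, the helper name, and the subnet IDs are hypothetical; the dictionary keys are the ones boto3's `create_service` takes. `desiredCount=1` is what makes ECS keep one agent task alive and restart it if it exits.

```python
from typing import Dict, List


def agent_service_kwargs(cluster: str, task_definition: str, subnets: List[str]) -> Dict:
    """Build arguments for ecs.create_service that keep one agent task running."""
    return {
        "cluster": cluster,
        "serviceName": "prefect-agent",      # hypothetical name
        "taskDefinition": task_definition,   # family[:revision] or full ARN
        "desiredCount": 1,                   # ECS restarts the task if it stops
        "launchType": "FARGATE",
        "networkConfiguration": {
            "awsvpcConfiguration": {
                "subnets": subnets,
                "assignPublicIp": "ENABLED",
            }
        },
    }


# Usage (needs AWS credentials, so it is left as a comment):
#   import boto3
#   kwargs = agent_service_kwargs("my-fargate-cluster", "prefect-agent", ["subnet-..."])
#   boto3.client("ecs").create_service(**kwargs)
```

The function only builds the request, so it can be inspected before anything is created in AWS.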
j
Gotcha that makes sense, so do you specify running multiple agents in the Dockerfile via `RUN` commands?
b
Hm, I'm not sure about that. I tested out ECS with one agent before going with kubernetes instead.
I think it would depend on how you are differentiating and tagging your agents.
j
Yeah, previously I had to create a bunch of agents since I was running the flows locally and each had their own virtual environment. It was nice that I could just tell `supervisord` where the flow was stored and which virtual environment it should use. It would then make sure all the agents were running. But now that all the dependencies are nicely wrapped up in the ECS image, I don't currently see a reason to create so many labels. Of course, I'm learning and would be open to any insights you've discovered.
b
hmm that's interesting. I think it depends on which storage you're using and how you are building your flows.
gimme a sec
My preferred way to use agents, so far, is to have a single agent per "environment". That is, an agent labeled "production" and an agent labeled "development" (with more, as and when needed). My flows use docker storage, which means that I don't need to worry about matching up the right virtual environment with the right flow.
If you're running flows on fargate - which i think is cool and a reasonable choice - this should work for you.
🚀 1
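To make the one-agent-per-environment idea concrete, here is a small sketch in plain Python (no Prefect import). It assumes the matching rule I understand Prefect to document: an agent picks up a flow run only when the flow's labels are a subset of the agent's labels. The agent names and helper functions are made up for illustration.

```python
from typing import Dict, Iterable, List


def agent_can_run(agent_labels: Iterable[str], flow_labels: Iterable[str]) -> bool:
    """Assumed rule: every label on the flow must also be on the agent."""
    return set(flow_labels).issubset(set(agent_labels))


def pick_agents(agents: Dict[str, List[str]], flow_labels: Iterable[str]) -> List[str]:
    """Return the names of all agents that would be eligible for this flow."""
    flow_labels = list(flow_labels)
    return sorted(
        name for name, labels in agents.items() if agent_can_run(labels, flow_labels)
    )


# Hypothetical setup: one agent per environment.
AGENTS = {
    "prod-agent": ["production"],
    "dev-agent": ["development"],
}
```

With this setup a flow labeled `production` only ever reaches `prod-agent`, and the Docker image carries the dependencies, so no per-flow agents are needed.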
a
Currently my approach is to have the EC2-based ECS service's Dockerfile use a shell script as the `ENTRYPOINT`, then from that shell script I create projects, register flows, and finally start the local agent. If my team decides to move ahead with running flows as Fargate tasks, what I'd try first is to modify the shell script to run a second agent, i.e., the ECS agent. Each agent will have its own label and flows will be labelled accordingly. This way, flows that are fast and don't consume much CPU/memory can continue to be handled by the local agent, while others will be handled by the ECS agent, and I can downsize the CPU/memory limits for this ECS service. It won't quite fit the paradigm of one ECS task corresponding to one process (well it does, except that that process is a shell script spawning the prefect agents as subprocesses, and the local agent spawns a subprocess for each flow run), but it'll be less of a hassle than setting up another ECS service (as we do so through terraform assisted by our infrastructure team). There is a difference in setup in that we're using S3 storage, so we'll need to create the Docker image for the ECS agent during our CI/CD process rather than as a result of using Docker storage.
🚀 1
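A rough Python equivalent of an entrypoint that starts two agents as subprocesses, one local and one ECS, each with its own label. It assumes the `prefect agent <type> start --label ...` CLI form from Prefect 0.14; the label values and function names are hypothetical, and project creation and flow registration are left out.

```python
import subprocess
from typing import List, Sequence, Tuple


def agent_command(kind: str, labels: Sequence[str]) -> List[str]:
    """Build the CLI command for one agent, e.g. kind='local' or kind='ecs'."""
    cmd = ["prefect", "agent", kind, "start"]
    for label in labels:
        cmd += ["--label", label]
    return cmd


def start_agents(specs: Sequence[Tuple[str, Sequence[str]]]) -> List[subprocess.Popen]:
    """Spawn every agent as a child process and return the handles."""
    return [subprocess.Popen(agent_command(kind, labels)) for kind, labels in specs]


def main() -> None:
    # Hypothetical labels: light flows go to the local agent, heavy ones to ECS.
    procs = start_agents([("local", ["light"]), ("ecs", ["heavy"])])
    for proc in procs:
        proc.wait()  # keep the container alive while the agents run
```

Calling `main()` from the container's entrypoint keeps a single parent process per ECS task while both agents run underneath it.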
j
@Jimmy Le: thank you for that article! i get an error at the very last step in the article: i launch the local ECSAgent and then attempt to run my basic-etl-prefect-flow in Prefect Cloud when my agent reports:
botocore.errorfactory.InvalidParameterException: An error occurred (InvalidParameterException) when calling the RunTask operation: Task definition does not support launch_type FARGATE.
I've gone over the article several times to verify that I've set up properly- everything looks good. Looking for troubleshooting tips from anyone. Thanks!
j
In your ECS cluster do you have Fargate as a capacity provider?
j
Per the article, I executed the following:
aws ecs create-cluster --cluster-name my-fargate-cluster --capacity-providers FARGATE
Confirmation in AWS ECS:
j
Can you check the task definition that was created and see if Fargate is included in the Compatibilities?
j
in my prefect-basic-etl-prefect-flow JSON:
j
You can pass in boto3 arguments inside ECSRun run_task_kwargs. I think there's one for requiresCompatibilities
j
from the article:
RUN_CONFIG = ECSRun(run_task_kwargs={'cluster': 'my-fargate-cluster'},
                    execution_role_arn='arn:aws:iam::<ACCOUNT>:role/<ROLE>',
                    labels=['ecs', 'dev'])
however i did add the following to the run_task_kwargs:
'launchType': 'FARGATE'
yes, i also tried:
'requiresCompatibilities': ['FARGATE']
Just a quick follow-up on this matter. Looks like others have encountered the same issue. One instance discussed in this Prefect Community Slack channel. The following ticket has been opened.
🙌 1
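For anyone hitting the same `Task definition does not support launch_type FARGATE` error, a small checker sketch may help narrow it down. As far as I know, `requiresCompatibilities` is a field of the task definition itself (boto3 `register_task_definition`), while `launchType` is a `run_task` argument, so the former would not take effect inside `run_task_kwargs`. The function and the example definition below are illustrative; the image name is hypothetical.

```python
from typing import Dict, List


def fargate_problems(task_def: Dict) -> List[str]:
    """List the reasons a task definition would be rejected for Fargate."""
    problems = []
    if "FARGATE" not in task_def.get("requiresCompatibilities", []):
        problems.append("requiresCompatibilities must include 'FARGATE'")
    if task_def.get("networkMode") != "awsvpc":
        problems.append("networkMode must be 'awsvpc'")
    for key in ("cpu", "memory"):
        if not task_def.get(key):
            problems.append(f"task-level '{key}' must be set")
    return problems


# Example definition with the fields Fargate expects.
EXAMPLE_TASK_DEF = {
    "family": "basic-etl-prefect-flow",
    "requiresCompatibilities": ["FARGATE"],
    "networkMode": "awsvpc",
    "cpu": "256",
    "memory": "512",
    "containerDefinitions": [
        {"name": "flow", "image": "my-registry/my-flow:latest"}  # hypothetical image
    ],
}
```

If the task definition shown in the ECS console fails any of these checks, one option (assuming Prefect 0.14's `ECSRun` accepts a `task_definition` dict, which I believe it does) is to pass a definition like the example above explicitly instead of relying on the default one.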