# prefect-community
f
Hey, so far I developed flows locally and tested them locally. Now that they seem robust enough, it's time for production. This would be on AWS, but I'm quite new to it and a bit lost among the AWS services. I read this article on how to deploy, but I'm still confused, especially about where the Kubernetes agent is running? I suppose this is the only thing that needs to run all the time. I'm also a novice with k8s, but if I understood correctly, this is the best option regarding scalability. Any suggestion on resources to look at to do so? Thanks!
a
where the Kubernetes agent is running?
Thanks for reading it, and sorry to hear it's not clear where the agent is running! Perhaps I could have made it clearer in the article, so thanks for pointing that out. The agent runs on the AWS EKS Kubernetes cluster as an independent deployment. The post shows how to spin up a 1-node cluster and then how to generate a deployment manifest that deploys the agent pod. This way the agent runs on Kubernetes, and thanks to the KubernetesRun run config with the same label as that agent, the flow runs also end up running on the same Kubernetes cluster as the agent.
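If it helps, a minimal sketch of that manifest step, assuming the Prefect 1.x CLI and a kubectl already pointed at the EKS cluster (the label is a placeholder, and exact flags may differ between Prefect versions):

```shell
# Hypothetical example: generate the agent deployment manifest
# with RBAC resources and a label, then apply it to the cluster
prefect agent kubernetes install --rbac --label k8s > agent.yaml
kubectl apply -f agent.yaml
```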
I suppose this is the only thing that needs to run all the time
100% correct, yes - the agent is the only process that needs to be running 24/7 when using Prefect Cloud. Re scalability, there are two options on AWS EKS:
1. a fully managed serverless data plane with Fargate - it takes care of all the management pain and scaling, but as a drawback it adds quite a bit of latency, since serverless needs to provision the actual infrastructure for each run
2. an autoscaling policy, which is nicely described here
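For the Fargate option, a hedged sketch of the cluster creation, assuming eksctl is installed and configured (the cluster name is a placeholder):

```shell
# Hypothetical: create an EKS cluster whose data plane runs on Fargate,
# so pods are provisioned serverlessly instead of on managed nodes
eksctl create cluster --name prefect-fargate --fargate
```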
f
Thanks for your responses. I think it's more that I'm really not very familiar with the AWS/Kubernetes environment than you being unclear 😅 Is it possible in that configuration to have Dask execution of the flows? Also, is there a way to "schedule" the uptime of the agent? As we would only run a few scheduled ETL jobs every day, having a 24/7 cluster running an agent is not very optimized in terms of cost.
a
Gotcha, I can relate - especially for a playground project you may not want to have an agent running 24/7. This tutorial is for Azure, but it shows one pattern that may be interesting for you: you could start any agent using supervisor with the command:
prefect agent xxx start --label zzz
and then this supervisor process could be started on VM boot using crontab:
echo "@reboot root supervisord -c /home/ec2user/supervisord.conf -l /home/ec2user/supervisord.log -u root" >> /etc/crontab
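For reference, a minimal supervisord.conf along those lines - this is a sketch, not the tutorial's exact file (the program name and agent label are placeholders; the crontab line above expects it at /home/ec2user/supervisord.conf, a local path is used here for illustration):

```shell
# Hypothetical minimal supervisord.conf keeping an agent process alive;
# adapt the path, agent type, and label to your setup
cat > supervisord.conf <<'EOF'
[supervisord]
logfile=supervisord.log

[program:prefect-agent]
command=prefect agent kubernetes start --label k8s
autostart=true
autorestart=true
EOF
```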
this way, you can use e.g. this instance scheduler for the EC2 instance running the agent, and this crontab entry will ensure that anytime the machine is started, the Prefect agent process starts too
this way the agent itself doesn't run on K8s, but if kubectl on this instance points to your EKS cluster and you start a Kubernetes agent:
prefect agent kubernetes start --label k8s
then all flow runs spun up by this agent process running on EC2 will be executed as K8s jobs
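To connect a flow to that agent, a sketch assuming the Prefect 1.x Python API - the flow name and task are made up, and the "k8s" label must match the agent's --label:

```python
# Hypothetical flow, assuming Prefect 1.x; the label must match the agent's
from prefect import Flow, task
from prefect.run_configs import KubernetesRun

@task
def extract():
    return [1, 2, 3]

with Flow("my-etl") as flow:
    extract()

# same label as `prefect agent kubernetes start --label k8s`,
# so this agent picks the flow up and runs it as a K8s job
flow.run_config = KubernetesRun(labels=["k8s"])
```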
the thing is, though - this setup would be overly complicated and likely unnecessary, because you still need to run the Kubernetes control plane 24/7, so I'm not sure whether this helps
I guess the easiest for you, if you have a limited compute budget, is to stick to EKS on Fargate with the serverless data plane - no management overhead, and you don't pay for a whole instance but only for the single pod running the agent process 24/7, while the flow runs execute in "serverless" pods
🙏 1
f
ok yeah, I'll try to measure the price of a single pod running 24/7. I also had this innocent thought from this conversation: couldn't the agent management be on the "cloud" side, while still having the computation at the same place, keeping it as a hybrid model?
oh, but the agent is the bridge between Cloud and the computation machine, so maybe not 😅 nvm
a
exactly! Great that you think about this critically - and indeed, the agent must run on your infrastructure as a bridge between the orchestration API in Cloud and the execution layer, e.g. your K8s cluster. This bridge works by polling the Cloud API for scheduled runs and then deploying them to the related infrastructure (to which the agent has direct access, but the orchestration API in Cloud does not)
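As a purely conceptual sketch of that polling bridge - this is not Prefect's actual internals; the queue, labels, and deploy callback are made-up illustrations of the pattern:

```python
import time

# Conceptual sketch: an agent repeatedly polls the orchestration API for
# scheduled runs matching its labels, then deploys each one to the
# infrastructure that only the agent (not the API) can reach.

def poll_once(api_queue, agent_labels):
    """Pick up runs whose labels are all covered by the agent's labels."""
    matched = [run for run in api_queue if set(run["labels"]) <= set(agent_labels)]
    for run in matched:
        api_queue.remove(run)
    return matched

def agent_loop(api_queue, agent_labels, deploy, iterations=1, interval=0):
    for _ in range(iterations):
        for run in poll_once(api_queue, agent_labels):
            deploy(run)  # e.g. create a Kubernetes job for this flow run
        time.sleep(interval)

# Usage: one scheduled run with a matching label gets picked up and deployed
queue = [{"id": "flow-run-1", "labels": ["k8s"]}]
deployed = []
agent_loop(queue, agent_labels=["k8s"], deploy=deployed.append)
print(deployed)  # [{'id': 'flow-run-1', 'labels': ['k8s']}]
```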
f
Ok thanks for all of those clarifications !
👍 1