# ask-community
Our data engineering team is considering the following setup:
• Snowflake as the database
• S3 buckets as storage
• Prefect or Airflow as the orchestration tool
For Airflow, AWS offers MWAA, which basically creates the whole infrastructure needed for orchestration (scheduler, UI through a webserver, workers). For Prefect, I would think that we could use Prefect Cloud, which handles the UI/webserver aspect, but we would still need to set up a server running an agent, right? Since we would prefer not to manage any servers, to keep the infrastructure as simple as possible: what would be the best option? Or did I misunderstand something? Thanks in advance for any advice!
Correct, you would deploy the agent on your own infrastructure so that all your code and data remain secure there - this is called the hybrid model. For an easy setup with the least possible infrastructure management, you could try one of the serverless execution platforms that Prefect supports:
• ECSAgent spins up your flows as ECS tasks - all infrastructure is provisioned on demand by AWS. Here is a blog post that explains how to set this up: https://towardsdatascience.com/how-to-cut-your-aws-ecs-costs-with-fargate-spot-and-prefect-1a1ba5d2e2df
• Vertex agent spins up your flows on on-demand GCP VMs - this excellent tutorial explains how you could use Vertex even with a distributed Dask cluster: https://rdrn.me/scaling-out-prefect/
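To make the ECS option a bit more concrete, here is a rough sketch of how a flow could be pointed at an ECS agent. It assumes Prefect 1.x, where `prefect.run_configs.ECSRun` is the run config picked up by the ECS agent. The helper names, the role ARN, the `ecs` label, and the CPU/memory values are all hypothetical placeholders, not something taken from the linked blog post.

```python
from typing import Any, Dict, List


def ecs_run_settings(image: str, task_role_arn: str, labels: List[str]) -> Dict[str, Any]:
    """Collect keyword arguments for Prefect 1.x's ECSRun run config.

    Hypothetical helper: the CPU/memory values are illustrative defaults.
    """
    return {
        "image": image,                  # container image the flow runs in
        "task_role_arn": task_role_arn,  # IAM role the flow's ECS task assumes
        "labels": labels,                # must match the labels of the ECS agent
        "cpu": "512",                    # 0.5 vCPU
        "memory": "1024",                # MiB
    }


def apply_ecs_run_config(flow: Any, settings: Dict[str, Any]) -> None:
    """Attach an ECSRun run config to a Prefect 1.x flow object."""
    # Imported lazily so ecs_run_settings() works without Prefect installed.
    from prefect.run_configs import ECSRun

    flow.run_config = ECSRun(**settings)


# Placeholder values; replace with your own image, role ARN, and label.
settings = ecs_run_settings(
    image="prefecthq/prefect:latest",
    task_role_arn="arn:aws:iam::123456789012:role/prefect-flow-task-role",
    labels=["ecs"],
)
```

With Prefect 1.x installed, you would call `apply_ecs_run_config(flow, settings)` before registering the flow. An ECS agent carrying the same label then launches each flow run as its own ECS task, so there is no long-running worker server to maintain beyond the agent itself.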
Thanks for the links. The ECS Article looks promising. Will dive into it tomorrow and potentially come back with some questions. 😉
sure, anytime!