Hey 🙂 Prefect so far looks to be quite interesting and streamlined! I'm just wondering if there is any good way to set up a CI/CD process?
Let's say I have a repo where I store my ETLs. How do I set things up so that Prefect loads changes from the master branch and registers them? Is this even a problem Prefect wants to solve?
01/15/2021, 3:02 PM
Haven’t actually implemented this, but conceptually I’d think you could use one of 2 options.
Option 1, CI/CD to re-register your flows:
In this option, after running your CI process, you could rebuild any deployment artifacts (say you’re using Docker images as storage, you’d rebuild your images and push them), then re-register your flows with Server (or Cloud). This is a good option b/c you hook into an existing ecosystem of CI/CD tools and have a ton of flexibility with how you want your process to look.
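A rough sketch of what the re-register step in Option 1 could look like when run from CI. It assumes the `prefect register flow --file ... --project ...` form of the CLI (I believe that is what the 0.14 releases ship, but check `prefect register --help` for your version), and the `flows/` directory layout and project name are made up for illustration.

```python
import shlex
import subprocess
from pathlib import Path
from typing import List


def build_register_commands(flows_dir: Path, project: str) -> List[List[str]]:
    """Build one registration command per flow file under flows_dir.

    Assumption: each non-underscore *.py file defines exactly one flow.
    """
    commands = []
    for flow_file in sorted(flows_dir.rglob("*.py")):
        if flow_file.name.startswith("_"):
            continue  # skip __init__.py and private helpers
        commands.append(
            ["prefect", "register", "flow",
             "--file", str(flow_file),
             "--project", project]
        )
    return commands


def run_all(commands: List[List[str]], dry_run: bool = True) -> int:
    """Print the commands (dry run) or execute them; return how many were handled."""
    for cmd in commands:
        if dry_run:
            print(" ".join(shlex.quote(part) for part in cmd))
        else:
            # check=True makes the CI job fail if a registration fails
            subprocess.run(cmd, check=True)
    return len(commands)
```

In a pipeline you would call something like `run_all(build_register_commands(Path("flows"), "etl"), dry_run=False)` after the image build and push step, so registration only happens on a successful merge.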
Option 2, use the file based storage w/ git (link to docs below):
This option points your storage at a git provider (GitHub or GitLab, but you can use other file based options if you’d like, docs: https://docs.prefect.io/core/idioms/file-based.html) and will automagically use the newly committed code as the code to run. The upside of this is that you don’t need to worry about rebuilding specific flows, or about writing custom logic to parse out the differences between old and new code, both of which would increase development overhead on your side. Basically, if code passes your CI process and merges into the branch your storage points at, it’s considered deployed!
01/16/2021, 1:14 AM
My engineering department uses ECS services as far as possible, so my team's setup basically involves breaking Prefect's docker-compose.yml file into a multi-container prefect-server ECS service (hasura, graphql, towel, apollo), a prefect-ui ECS service, and another ECS service for the local agent/local executor. My idea was that when the agent/executor ECS task starts up, it will create projects as needed, register the flows, and then start the agent. Our usual CI/CD process (triggered by pushing to GitHub) would have already created the Docker image for the container that this ECS task runs, so the updated flows will already be there, ready to be serialised to be saved in S3 storage.
It seems to work, but if you're thinking of going this path, unfortunately, we're encountering issues where the ECS task restarts for various reasons other than changes to the flows initiated by pushing to GitHub, and this then re-registers the flows unnecessarily (and eats S3 storage unnecessarily too, I believe). Consequently, I'm looking to see if we can integrate flow registration into the CI/CD process, so we only register the flows when we push to GitHub, not when we restart the ECS task.
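One way to avoid the unnecessary re-registration described above is to fingerprint the flow source files and only register when the fingerprint changes. This is a minimal sketch under assumptions, not what our setup actually does: `register` is a hypothetical callback standing in for whatever registers your flows, and the marker file would have to live somewhere that survives a task restart (a local path is used here only to keep the sketch self-contained). I believe recent Prefect versions also accept an `idempotency_key` on `flow.register`, which may solve this more directly, so check the docs before rolling your own.

```python
import hashlib
from pathlib import Path
from typing import Callable


def flows_fingerprint(flows_dir: Path) -> str:
    """Hash the relative path and contents of every *.py file under flows_dir."""
    digest = hashlib.sha256()
    for path in sorted(flows_dir.rglob("*.py")):
        digest.update(path.relative_to(flows_dir).as_posix().encode())
        digest.update(b"\0")
        digest.update(path.read_bytes())
        digest.update(b"\0")
    return digest.hexdigest()


def register_if_changed(
    flows_dir: Path, marker: Path, register: Callable[[], None]
) -> bool:
    """Call register() only when the flow sources differ from the last run.

    Returns True if registration happened, False if it was skipped.
    """
    current = flows_fingerprint(flows_dir)
    previous = marker.read_text().strip() if marker.exists() else None
    if current == previous:
        return False  # nothing changed, e.g. a plain task restart
    register()
    marker.write_text(current)  # record only after a successful registration
    return True
```

Because the marker is written after `register()` returns, a failed registration is retried on the next start instead of being silently skipped.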
Benedikt Maria Beckermann
01/16/2021, 6:50 PM
Thank you both for your input! Especially the suggestion to use the file based flow storage seems to fit the described use case! If the Prefect people are reading this: Maybe make this and the flow registration via CLI a bit more prominent in the docs 🙂
01/18/2021, 12:33 PM
@Amanda Wee Thanks for the description of your Prefect Server deployment. Could you please guide me through a similar setup? I also need to split the docker-compose.yml into a multi-container prefect-server ECS service. How should I write my task definitions? Please share any guide that would make this easier to do from the AWS console.
The only difference is that we dockerize our flows using Docker storage and run them on AWS EKS. But for our concurrency needs we need to set up Prefect Server, and currently we can't do it using the ECS CLI docker compose up approach, as we already have a lot of overlapping infra provisioned (ALB, Fargate clusters) which we can re-use.
01/26/2021, 12:21 PM
@Sagun Garg we followed the docker compose setup quite closely:
Postgres -> Aurora postgres
Hasura, graphql, towel, apollo -> prefect-server ECS service using links with two load balancers (private for the prefect agent ECS service to access; "public" for the team to access via prefect ui when connected to VPN) that listen on port 80 and forward to the apollo container on port 4200
ui -> prefect-ui ECS service with a "public" load balancer listening on port 80
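To make the mapping above a bit more concrete, here is a hedged sketch of the shape a prefect-server task definition could take, built as a Python dict you can dump to JSON. Only the container names, the use of links, and apollo's port 4200 come from the description above; the link targets are a guess at the dependency order in docker-compose.yml, the image names are placeholders you must supply, and environment variables, commands, and memory settings are left out entirely. Note that ECS `links` only work in `bridge` network mode, so this shape does not carry over to Fargate, where containers in the same task reach each other over localhost instead.

```python
import json
from typing import Dict, Iterable, Optional

APOLLO_PORT = 4200  # the port the load balancers forward to, per the message above


def container(
    name: str, image: str, links: Iterable[str] = (), port: Optional[int] = None
) -> dict:
    """Build a minimal ECS container definition (no env vars, commands, or memory)."""
    definition = {"name": name, "image": image, "essential": True}
    if links:
        definition["links"] = list(links)
    if port is not None:
        # hostPort omitted: in bridge mode ECS then assigns a dynamic host port
        definition["portMappings"] = [{"containerPort": port, "protocol": "tcp"}]
    return definition


def prefect_server_task_definition(images: Dict[str, str]) -> dict:
    """Sketch of a multi-container task definition; link targets are assumptions."""
    return {
        "family": "prefect-server",
        "networkMode": "bridge",  # required for links
        "requiresCompatibilities": ["EC2"],
        "containerDefinitions": [
            container("hasura", images["hasura"]),
            container("graphql", images["graphql"], links=["hasura"]),
            container("towel", images["towel"], links=["graphql"]),
            container("apollo", images["apollo"], links=["graphql"], port=APOLLO_PORT),
        ],
    }


def to_json(task_definition: dict) -> str:
    """Serialise for pasting into the console's JSON editor or a CLI input file."""
    return json.dumps(task_definition, indent=2)
```

Calling `to_json(prefect_server_task_definition({...}))` with your own image URIs gives a starting point to fill in with the environment variables and commands from docker-compose.yml.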