# prefect-community
b
Background

Existing Infrastructure:
• We have full-fledged Kubernetes clusters running on top of AWS/EKS.
• We have stable Jenkins CI/CD pipelines to build and deploy Docker images and Helm charts.
• We support multiple environments, separated at the namespace level.
• Our observability is via NewRelic.
• Our credential stores are wired into the Helm charts via custom charts + annotations.
• Our data warehouse is Snowflake, and the data sources are a plethora of DBs and API services.
• ELT is carried out by Meltano + DBT or Python + Celery.
• Everything runs on Kubernetes.

I am a future user of a workflow engine and I am asking around for evaluations; this is what I have constructed so far. I have to confess I spent more time in the #dagster-* channels than in other communities, and it shows in the recommendations chart below, in light of our existing infra setup. But I wanted to add more “fairness” to the evaluation ratings, and I’d highly appreciate some constructive feedback from the community on where I can improve this rating.
k
For 2.0, we have type-safe parameter validation. It’s at the flow level for now, not at the task level yet. You can also test the underlying function of a task by calling task.fn(). The Prefect column looks pretty much right. I think you should be able to do most of what you want whichever you choose; it’s just that the abstractions may be different. Dagster is more opinionated than Prefect in general, but if those abstractions resonate with you, then it sounds like the tool would give you more. If there is some awkwardness with the abstractions, then you may benefit from exploring a more general-purpose framework.
t
@Kevin Kho can you give an example of what you mean by abstractions in this case?
@Binoy Shah dunno how relevant it is for you, but for us, the stability of the project (in terms of how likely it is to continue to grow, be maintained, and not disappear one random morning) helped tip the scales towards Prefect:
• it seems like it is at round B (compared to the round A of Dagster), and
• it happens to have the same investors that we do (which we consider to be highly reputable, and we trust their judgement)

pulling some other bullets from the comparison we did 2 years ago:
• Prefect has roughly 2x the number of GitHub stars
• we found the Prefect documentation to be clearer at the time (seems like Dagster improved theirs in the meantime)
• Prefect seemed to support flow versioning, vs. what i managed to figure out from Dagster (maybe they do now?)
• the task library for Prefect matched what we need/use better (at least back then)
k
I haven’t dug into Dagster heavily myself, but I mean their Software-Defined Assets, which are a core part of their architecture. That said, we will eventually have something that can achieve that.
b
Firstly, thank you for the non-dismissive feedback. I am glad that I am not way off from doing justice to Prefect 2.0's feature set. Yes, some things will matter more in our dev practices than others. There are a lot of things we won't be using. Dynamic Data Assets (Dagster) look really good for our small team, where one-stop visibility into the data journey makes our Product Owner’s life easy
and I also understand the importance of maturity and longevity, but things can change for better or worse, “future is fluid” kind of thing. <Personal-Rant> For the longest time, I acted as an evangelist for a framework called Playframework (Lightbend), and eventually they stopped development and support for it after a few years </Personal-Rant>
I will update the matrix using the feedback provided here. Thank you
k
Definitely! We aren’t dismissive of competitors, of course 🙂
b
One question for @Kevin Kho or @Tom Klein: is there a way to separate the deployment and maintenance of Prefect servers on Kubernetes from the deployment of application/pipeline code into Kubernetes? We have 2 separate teams who would like to focus on their specialized workstreams
Ops vs Data teams
t
@Binoy Shah I'm not sure i fully understood your question but i'll try to answer:

i can't testify regarding prefect server, because we use prefect cloud (which only required us to: have a dedicated namespace in our k8s, install a [permanent] Prefect agent there, and manage some permissions, roles, and whatnot).

once everything is deployed correctly (pretty straightforward) and has the right permissions (admittedly, not always the case), you don't require any more "ops" effort and can just create new flows, etc. this is until, of course, you wanna do more advanced stuff like Dask clusters, or interacting with other k8s resources outside of the Prefect ecosystem itself. also, sometimes things break for various reasons (e.g. custom Docker images, dependency collisions, misconfiguration, etc.) and your "Ops" might have to assist with that too.

deploying (or more accurately, "registering") new flows (which i'm assuming is what you call "application/pipeline code") is very straightforward with prefect cloud.

if you wanna have a CI/CD system, that's another task that your "Ops" will have to take on themselves. such a system could be responsible for deploying flows whenever they are changed in Git(hub), running automatic tests, and so on, just like any other kind of software. because our team is very small we don't bother with having a CI/CD yet, and manually take care of registering new flows or updating them
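One way the "deploy flows whenever they are changed in Git" idea could look, as a rough sketch only. It assumes a hypothetical repo layout with flow files under a top-level `flows/` directory and a base ref of `origin/main`; the function names are made up for illustration, and the actual registration step is left out since it depends on the Prefect version and setup:

```python
import subprocess
from pathlib import PurePosixPath
from typing import Iterable, List

FLOWS_DIR = "flows"  # hypothetical: top-level directory that holds flow files


def changed_files(base: str = "origin/main", head: str = "HEAD") -> List[str]:
    """Return the paths that differ between two Git refs."""
    out = subprocess.run(
        ["git", "diff", "--name-only", f"{base}...{head}"],
        check=True,
        capture_output=True,
        text=True,
    ).stdout
    return [line for line in out.splitlines() if line]


def flows_to_register(paths: Iterable[str], flows_dir: str = FLOWS_DIR) -> List[str]:
    """Keep only Python files that live under the flows directory."""
    selected = []
    for p in paths:
        path = PurePosixPath(p)
        if path.suffix == ".py" and path.parts and path.parts[0] == flows_dir:
            selected.append(p)
    return sorted(selected)
```

A CI job could call `flows_to_register(changed_files())` after the tests pass and hand each returned path to whatever registration command the team uses, so the Ops team owns the pipeline while the Data team only touches files under `flows/`.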