Bring your towel and join one of the fastest growing data communities. Welcome to our second-generation open source orchestration platform, a completely rethought approach to dataflow automation.

Prefect Community

Hello, I’m a software engineer looking into replacing our Airflow deployments and pipelines with Prefect :slightly_smiling_face: Mainly because we can’t move some of our pipelines to Airflow, as it is missing good support for dynamic DAGs. We need a high degree of dynamism in some of our pipelines.

Welcome! Let us know how we can help :smile:

<@U01HRC7L4MD> welcome! We will be rolling out entirely new forms of dynamism in 2021 and would love to learn more about your use case

<@UKTUC906M> could you write more about what is in store with this?

My current use-case is quite simple and I think that Prefect covers it currently.

I have an input, which is either a query or a file with some datapoints (there could be from 1 to xxx million datapoints). So here I need conditional tasks, based on the input. After this I need to split those datapoints into batches of 10k and do some processing on each batch. So there can be 1 to xx thousand batches, and here I need to create a task for each batch.
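The splitting step described above can be sketched in plain Python. This is only a minimal sketch under stated assumptions: `batched`, `process_batch`, and `run` are hypothetical names, the datapoints are assumed to be any iterable, and the batch size of 10,000 is taken from the message. In Prefect, each batch would typically be handed to a mapped task rather than to a plain loop.

```python
from itertools import islice
from typing import Iterable, Iterator, List

BATCH_SIZE = 10_000  # batch size mentioned in the message above


def batched(datapoints: Iterable[int], size: int = BATCH_SIZE) -> Iterator[List[int]]:
    """Yield consecutive batches of at most `size` items without loading everything first."""
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(datapoints)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


def process_batch(batch: List[int]) -> int:
    """Hypothetical per-batch work; here it only counts the datapoints."""
    return len(batch)


def run(datapoints: Iterable[int], size: int = BATCH_SIZE) -> List[int]:
    """Create one unit of work per batch (a stand-in for one task per batch)."""
    return [process_batch(batch) for batch in batched(datapoints, size)]
```

Because `batched` is a generator, the number of batches does not need to be known before the run starts, which is the property the use case depends on.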

I will be working on this in the next couple of weeks and I hope everything works out okay, as I already promised a presentation about our Airflow and Prefect journey at a local Python meetup :slightly_smiling_face:

Currently, after some cursory research, I see a couple of things that are harder for me than in Airflow.
1. Kubernetes deployment. I understand Helm charts and operators, but I have not found anything in that regard for Prefect, so Kubernetes deployment is not clear to me (could be that I didn’t do enough research yet)
2. Secret storage. I know that Prefect supports secret storage if using your cloud solution, but we are currently looking into a full on-premises deployment.
3. UI authentication. This is similar to point 2.
4. Prefect seems a bit less mature than Airflow, but it is not a deal breaker for me.

And thanks for the warm welcome :slightly_smiling_face:

Oh, and after all the processing I have to create from 1 to 50 manual tasks, each of which should be executed only after a responsible person reviews it and starts it.

Re 1: we have a working Helm chart for deploying Prefect Server and we’re exploring making deployment easier! <https://github.com/PrefectHQ/server/tree/master/helm/prefect-server>

Oh, cool and thanks. Somehow I missed it :slightly_smiling_face:

image.png

Btw, this is my current outline of the talk/presentation. I will be modifying it as I get more experience with Prefect, but if you have any suggestions I will happily incorporate them:

The features we’ve introduced that allow users to break out of the traditional DAG-based workflow model (runtime mapping, looping, and configurable output caching, among others) have instantly proved to be our most popular features, so we’ve acquired almost 2 years of user data on how people work with dynamic or runtime-discovered workflow requirements. A major focus for this year will be to expand the building blocks we offer for generating such workflows.

Your use case about manual approval, for example, is one that we added in the very earliest days of Prefect because it seems obvious to us as data consumers! Here’s a doc describing it: <https://docs.prefect.io/core/idioms/pause-for-approval.html#pause-for-approval> Today, these manual-approval tasks will automatically put a prompt in the UI.

The thing that makes this all possible is how we view our rich `State` objects as the currency of the system, rather than a DAG object. By reacting to states, our system gets a lot more powerful because it can dynamically expand beyond the static DAG.
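The idea of reacting to states rather than to a fixed graph can be illustrated with a toy model. This is a rough sketch and not Prefect's actual implementation: the `State` classes, `run_task`, and `fan_out` below are hypothetical stand-ins written only for illustration.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass(frozen=True)
class State:
    """Toy state object: carries a message and, on success, a result."""
    message: str = ""
    result: Any = None


class Pending(State):
    pass


class Success(State):
    pass


class Failed(State):
    pass


def run_task(fn: Callable[[Any], Any], upstream: State) -> State:
    """Decide what to do by inspecting the upstream state, not a static graph."""
    if not isinstance(upstream, Success):
        return Pending(message=f"waiting on upstream ({type(upstream).__name__})")
    try:
        return Success(result=fn(upstream.result))
    except Exception as exc:  # any error becomes a Failed state instead of crashing
        return Failed(message=str(exc))


def fan_out(fn: Callable[[Any], Any], upstream: State) -> List[State]:
    """Create one child run per item discovered at runtime in the upstream result."""
    if not isinstance(upstream, Success):
        return [run_task(fn, upstream)]
    return [run_task(fn, Success(result=item)) for item in upstream.result]
```

In this sketch the number of child runs is decided only after the upstream state is known, which is roughly what expanding beyond a static DAG means here.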