Kotra Pali

04/30/2020, 12:53 PM
Hey 👋 . Thanks for a great product, I really like it! I would like to ask if our use case is a good fit for Prefect. Basically, we are processing tables via a few simple rules parametrized by the user. Imagine about 10 transformation funcs like:
Rename(source='cola', target='colb'),
Transpose(source=[...]),
Map(colname="a", values={3:4, 5:7}),
Assign(condition='colb>2', target="cola", value=100)
...
We have about 100k of these per single processing run, and there are some dependencies between commands (e.g. you can see that `cola` depends on the values of `colb` in the last `Assign` command, and we are able to parse these dependencies in advance). What I am thinking about is to create `prefect.Task`s from these transformation funcs and build a graph out of them. That means a DAG with around 100k nodes and at least some edges between them (but each node is very lightweight). Do you think that's a good fit for Prefect? Can it handle such huge DAGs, or is it rather designed for smaller ones?
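To make the "parse these dependencies in advance" part concrete, here is a minimal sketch in plain Python (no Prefect calls). It assumes each command can declare which columns it reads and which it writes; the `Command` class, its `reads`/`writes` fields, and `build_dependencies` are hypothetical names for illustration, not part of our code or of Prefect. Since the commands modify a table in place, the sketch orders a command after the last writer of anything it reads, and after earlier readers and writers of anything it writes.

```python
from dataclasses import dataclass
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Command:
    """Hypothetical wrapper: a transformation plus the columns it touches."""
    name: str
    reads: frozenset
    writes: frozenset


def build_dependencies(commands):
    """Return {command index: set of earlier command indices it must wait for}."""
    last_writer = {}          # column -> index of the latest command that wrote it
    readers_since_write = {}  # column -> indices that read it since that write
    deps = {}
    for i, cmd in enumerate(commands):
        upstream = set()
        for col in cmd.reads:
            if col in last_writer:
                upstream.add(last_writer[col])  # read-after-write
        for col in cmd.writes:
            if col in last_writer:
                upstream.add(last_writer[col])  # write-after-write
            upstream.update(readers_since_write.get(col, ()))  # write-after-read
        deps[i] = upstream
        # Record this command's own accesses only after computing its upstreams.
        for col in cmd.reads:
            readers_since_write.setdefault(col, set()).add(i)
        for col in cmd.writes:
            last_writer[col] = i
            readers_since_write[col] = set()
    return deps


# Assumed read/write sets for three of the commands from the message above.
commands = [
    Command("Rename cola->colb", frozenset({"cola"}), frozenset({"cola", "colb"})),
    Command("Map a", frozenset({"a"}), frozenset({"a"})),
    Command("Assign cola where colb>2", frozenset({"colb"}), frozenset({"cola"})),
]

deps = build_dependencies(commands)
# One valid execution order; independent commands may appear in any position.
order = list(TopologicalSorter(deps).static_order())
```

Each `(upstream, downstream)` pair in `deps` is what would become an edge between two tasks in the flow. Commands with an empty dependency set, like the `Map` on column `a` here, are free to run in parallel with the rest.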
👀 1
Laura Lorenz (she/her)

04/30/2020, 1:06 PM
Hi and welcome @Kotra Pali! Yes, that sounds like a great use case for Prefect, especially since you have some data dependencies between your tasks that you know about up front and that you’ll want to run them parameterized by user. There is no problem regarding larger flows at this size from the Prefect side. In fact, we ourselves generally recommend many lightweight tasks as opposed to fewer large ones, so you can take advantage of Prefect’s engine the most (for example retry semantics or using state handlers can be more targeted and sophisticated if you have your tasks broken down into smaller pieces). Feel free to ask any questions in here as you are getting started! Welcome again 🙂
Kotra Pali

04/30/2020, 1:12 PM
Hey Laura. Thanks a lot for a swift answer, it's great to hear that, we'll give it a shot!
🎉 1
:marvin: 1