Can anybody point to a relatively large project on GitHub that uses Prefect, so I can check out its directory tree structure?
I'm wondering how they group and name the Python files for their flows and tasks when they have a lot of them.
Kevin Kho
01/31/2022, 11:52 PM
The only big public one I have seen is the edX repo here, but it's been a while since I searched for Prefect projects.
Daniel Kornhauser
02/01/2022, 3:00 PM
@Kevin Kho, I had a look, but it's not exactly what I am looking for. The issue I am really trying to address is how to organize flows.
We settled on using
We will have many other sources like rxiv, Twitter, etc.
Each has a directory with the structure above.
I wonder what your opinion is about the pubmed- name redundancy. So far we haven't needed to create multiple files containing tasks or flows; we actually place everything in a single flow.
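To make the naming question concrete, here is a minimal sketch that lists the file paths under both conventions. The module names `flow` and `tasks` are hypothetical placeholders, not our actual layout; only the source names come from this thread.

```python
from pathlib import PurePosixPath

SOURCES = ["pubmed", "rxiv", "twitter"]  # source names mentioned in this thread
MODULES = ["flow", "tasks"]  # hypothetical module names, for illustration only


def layout(sources, modules, prefixed):
    """Return relative file paths, one directory per source.

    prefixed=True repeats the source name in each file name
    (pubmed/pubmed-flow.py); prefixed=False relies on the directory
    alone to carry the source name (pubmed/flow.py).
    """
    paths = []
    for source in sources:
        for module in modules:
            name = f"{source}-{module}.py" if prefixed else f"{module}.py"
            paths.append(str(PurePosixPath(source) / name))
    return paths


if __name__ == "__main__":
    print("With the redundant prefix:")
    for path in layout(SOURCES, MODULES, prefixed=True):
        print("  " + path)
    print("Without it:")
    for path in layout(SOURCES, MODULES, prefixed=False):
        print("  " + path)
```

One practical point either way: a file whose name contains a hyphen cannot be imported with a plain `import` statement, so a prefix like this only works for files that are run as scripts, or it needs an underscore instead.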
Kevin Kho
02/01/2022, 3:05 PM
Not an open source repo, but some people talk about their setup here. Have you seen it?
Daniel Kornhauser
02/01/2022, 3:13 PM
Wow, that looks pretty nice, thanks for sharing. It's a long read, so I will take my time going through it. It's very different from our approach.
I guess there will be different directory structures for different Prefect use cases.
For example, a machine learning application will have a very different directory structure than a harvesting application. Even within ML, a training pipeline will be very different from a NER pipeline.
There are other factors too, such as whether you organize vertically, where a developer owns a whole pipeline, or horizontally, where a developer owns a single step in several pipelines.
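As a rough sketch of how that choice could show up in the directory tree (the pipeline and step names below are made up for illustration, and this is only one possible mapping):

```python
from collections import defaultdict

# Hypothetical (pipeline, step) pairs, for illustration only.
STEPS = [
    ("pubmed", "extract"),
    ("pubmed", "load"),
    ("twitter", "extract"),
    ("twitter", "load"),
]


def group_paths(steps, vertical):
    """Map each top-level directory to the files it would contain.

    vertical=True: one directory per pipeline, one file per step.
    vertical=False: one directory per step, one file per pipeline.
    """
    tree = defaultdict(list)
    for pipeline, step in steps:
        if vertical:
            tree[pipeline].append(f"{step}.py")
        else:
            tree[step].append(f"{pipeline}.py")
    return dict(tree)


if __name__ == "__main__":
    print(group_paths(STEPS, vertical=True))
    print(group_paths(STEPS, vertical=False))
```

With the vertical grouping, a developer who owns a pipeline works inside a single directory; with the horizontal grouping, a developer who owns a step does.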
Hopefully in the future there will be more examples 🙂