Can anybody point to a relatively large project th...
# ask-community
d
Can anybody point to a relatively large project on GitHub that uses Prefect, so I can check out its directory tree structure? I am wondering how they group and name the Python files for their flows and tasks when they have a lot of them.
k
The only big public one I have seen is the edX repo here, but it’s also been a while since I searched for Prefect projects.
d
@Kevin Kho, I had a look, but it’s not exactly what I am looking for. The issue I am really trying to address is how to organize flows. We settled on using:
pipe-line-source-name/
    logic/
    tasks.py
    flow.py
For example:
pubmed/
    logic/
       pubmed-bulk-download.py
       pubmed-api-pull.py
       pubmed-to-jsonl.py
       pubmed-enrichment.py
       pubmed-indexer.py
       pubmed-storage.py
    pubmed-tasks.py
    pubmed-flow.py
We will have many other sources like rxiv, Twitter, etc., each as a directory with the structure above. I wonder what your opinion is about the "pubmed-" name redundancy. So far we haven’t needed to create multiple files containing tasks or flows; we actually place everything in a single flow.
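A minimal sketch of the layout question, not anyone's actual setup: it scaffolds one package per source with unprefixed tasks.py and flow.py files in a temporary directory and imports them, to show that the directory name already keeps sources apart. The helper names scaffold and load_sources, the SOURCE constant, and the placeholder file contents are assumptions made for illustration; real files would hold Prefect tasks and flows. It also checks a plain Python fact: a hyphenated stem such as pubmed-tasks is not a valid identifier, so it cannot be used in a regular import statement.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Source names taken from the thread; the layout below is an assumption.
SOURCES = ["pubmed", "rxiv"]


def scaffold(root: Path, source: str) -> None:
    """Create <source>/{__init__.py, tasks.py, flow.py, logic/} under root."""
    pkg = root / source
    (pkg / "logic").mkdir(parents=True)  # also creates the package directory
    (pkg / "__init__.py").write_text("")
    (pkg / "logic" / "__init__.py").write_text("")
    # Placeholder bodies; real modules would define Prefect tasks and a flow.
    (pkg / "tasks.py").write_text(f"SOURCE = {source!r}\n")
    (pkg / "flow.py").write_text(f"from {source}.tasks import SOURCE\n")


def load_sources(root: Path) -> dict:
    """Import <source>.flow for every source and return {source: SOURCE}."""
    sys.path.insert(0, str(root))
    try:
        importlib.invalidate_caches()
        return {s: importlib.import_module(f"{s}.flow").SOURCE for s in SOURCES}
    finally:
        sys.path.remove(str(root))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in SOURCES:
        scaffold(root, name)
    # pubmed.flow and rxiv.flow coexist although both files are named flow.py.
    loaded = load_sources(root)

# Hyphenated module names cannot appear in a normal import statement.
hyphen_ok = "pubmed-tasks".isidentifier()
underscore_ok = "pubmed_tasks".isidentifier()
```

With this layout the package path (pubmed.tasks, rxiv.tasks) carries the source name, so repeating it in the file name mostly helps when searching by file name in an editor.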
k
Not an open source repo, but some people talk about their setup here. Have you seen it?
d
Wow, looks pretty nice, thanks for sharing. It is a long read, so I will take my time to go through it. It is very different from our approach. I guess there will be different directory structures for different Prefect use cases. For example, a machine learning application will have a very different directory structure from a harvesting application. Even within ML, a training pipeline will look very different from a NER pipeline. There are other factors too, such as whether you organize vertically, where a developer owns a whole pipeline, or horizontally, where a developer owns a single step in several pipelines. Hopefully in the future there will be more examples 🙂
k
Yes, for sure