# prefect-dbt
k
Hi everyone! I am currently trying to implement Prefect with dbt. We have a Prefect repo with all of our organization's data workflows. I'm wondering whether we should put our dbt code inside that Prefect repo, or whether it should have its own GitHub repo. What would be the pros and cons in terms of developer experience and CI/CD?
o
For my team, mixing the two in the same repo seems to work best. It's nice to be able to review a single pull request when changes to SQL logic and flow logic are made at the same time.
🙌 1
m
You can go either way on this, but whichever path you take, leave yourself some room to change your mind in the future.
We have Prefect & dbt in separate repos. This is mostly because the CI/CD builds are different and we have different groups working on each codebase.
🙌 1
b
IMO if the same people are writing both the dbt modules and prefect flows, then you may as well have it in one repo.
🙌 1
d
@George Coyne wrote a great blog post about our recommended setup with Prefect 1, I think all recommendations / the flow as a whole translates to Prefect 2: https://medium.com/slateco-blog/prefect-orchestrating-dbt-10f3ca0baea9
🙌 1
tl;dr - it's pretty straightforward to clone and run the dbt repo as part of your flow, so it's best to keep them separate since they could be maintained by different teams / CI/CD processes
I'd love to hear about combined setups, though
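For anyone wondering what the "clone and run the dbt repo as part of your flow" approach from the tl;dr might look like, here is a minimal sketch using only the Python standard library. The repo URL, the `dbt_project` folder name, and all function names are hypothetical, and it assumes `git` and the `dbt` CLI are installed where the flow runs and that a dbt profile is already configured there. In Prefect 2 these plain functions could presumably be wrapped with the `@task` / `@flow` decorators.

```python
import subprocess
from pathlib import Path
from typing import List

# Hypothetical placeholder; replace with your own dbt repository.
DBT_REPO_URL = "https://github.com/your-org/your-dbt-project.git"


def clone_command(repo_url: str, target_dir: Path, ref: str = "main") -> List[str]:
    """Build a shallow `git clone` of a single branch or tag."""
    return ["git", "clone", "--depth", "1", "--branch", ref, repo_url, str(target_dir)]


def dbt_command(project_dir: Path, subcommand: str = "run") -> List[str]:
    """Build a dbt CLI call pointed at the cloned project directory."""
    return ["dbt", subcommand, "--project-dir", str(project_dir)]


def clone_and_run_dbt(repo_url: str, workdir: Path, ref: str = "main",
                      run=subprocess.run) -> None:
    """Clone the dbt repo into workdir, then run dbt against it.

    `run` is injectable so the command sequence can be checked without
    actually calling git or dbt.
    """
    target = workdir / "dbt_project"  # hypothetical folder name
    run(clone_command(repo_url, target, ref), check=True)
    run(dbt_command(target), check=True)
```

Calling `clone_and_run_dbt(DBT_REPO_URL, Path(tempfile.mkdtemp()))` from inside a flow would pull the latest `main` at run time. Passing a tag or release branch as `ref` is one way to keep the dbt version pinned while the repos stay separate.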
g
Also this is slightly dated as there is a hack for an earlier version of the DBT shell task, but otherwise solid!
🙌 1
Also Prefect 1 couldn't deploy all files for you, so Prefect 2 makes a monorepo a lot easier
upvote 1
k
Thanks everyone for sharing your input! We're a rather small team right now, and I wanted to onboard our data analyst (who mainly has SQL skills) on dbt, and I thought it could be a little overwhelming for them to work with the monorepo. Also, I like how Airbyte pulls a dbt repo from GitHub to run dbt transformations, and I wanted to mimic this behaviour in Prefect instead of using Airbyte's built-in dbt capability. But with the monorepo, I also like the fact that the dbt code is part of a flow, and not just an external piece that comes in at runtime, which can make sense for versioning purposes... unless there's a clean way to tag flows with commits taken from another dbt repo.
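On the last point, one possible way to record which dbt commit a flow run used is to read the commit hash from the cloned repo and turn it into a label. This is only a sketch: the function names and the `dbt-commit` prefix are made up for illustration, and it assumes the dbt repo has already been cloned to `repo_dir` with `git` available on the machine.

```python
import subprocess
from pathlib import Path


def dbt_commit_sha(repo_dir: Path, run=subprocess.run) -> str:
    """Return the full commit hash currently checked out in repo_dir.

    `run` is injectable so the function can be exercised without a real repo.
    """
    result = run(
        ["git", "-C", str(repo_dir), "rev-parse", "HEAD"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout.strip()


def dbt_version_tag(sha: str, prefix: str = "dbt-commit") -> str:
    """Build a short label from the hash (prefix is a hypothetical convention)."""
    return f"{prefix}:{sha[:7]}"
```

The resulting string could then be logged at the start of the flow or attached to the run as a tag, for example via Prefect 2's `tags` context manager (worth checking against the docs for your version), so each run shows the dbt commit it executed.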