Does anyone have a documentation framework for data pipeline/testing best practices? I guess similar to dbt's style guide but for your pipeline?
Kevin Kho
04/04/2022, 5:25 PM
Not exactly beyond this. But you can also see the unit tests in the Prefect repo for specific tasks
Anna Geller
04/04/2022, 5:35 PM
It would be great to think about what would make sense to test in your use case. dbt tests are great because they solve the problem of ensuring that your data aligns with your expected schema (not null, unique, accepted values).
What problem would you like to solve by testing your Prefect flows? You could, e.g., implement very similar tests in your custom Prefect flows in plain Python or using Pandera.
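As a rough illustration of the "just in Python" option mentioned above, here is a minimal sketch of dbt-style checks (not null, unique, accepted values) written with the standard library only. The row data, column names, and helper functions are all hypothetical; in a real flow the rows would come from an upstream task, and a library such as Pandera could replace these helpers.

```python
from collections import Counter

# Hypothetical rows standing in for the output of an upstream task.
rows = [
    {"order_id": 1, "status": "shipped"},
    {"order_id": 2, "status": "pending"},
    {"order_id": 3, "status": "shipped"},
]


def check_not_null(rows, column):
    """Return the indexes of rows where `column` is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows, column):
    """Return the values of `column` that appear more than once."""
    counts = Counter(row.get(column) for row in rows)
    return [value for value, n in counts.items() if n > 1]


def check_accepted_values(rows, column, accepted):
    """Return the indexes of rows whose `column` value is not in `accepted`."""
    return [i for i, row in enumerate(rows) if row.get(column) not in accepted]


def validate(rows):
    """Run all checks and return a dict containing only the failing ones."""
    results = {
        "order_id_not_null": check_not_null(rows, "order_id"),
        "order_id_unique": check_unique(rows, "order_id"),
        "status_accepted": check_accepted_values(
            rows, "status", {"pending", "shipped"}
        ),
    }
    return {name: found for name, found in results.items() if found}


failures = validate(rows)
if failures:
    raise ValueError(f"Data checks failed: {failures}")
```

A function like `validate` could be called from inside a task so that a failed check raises and fails the run, much like a failing dbt test would.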
Madison Schott
04/04/2022, 6:01 PM
I was thinking dbt tests would be a large part of the data model itself. Wondering if there are tests for ensuring data is going to the right database/schema/table using Prefect? Or that the right dependencies are set? That type of thing. I have a test flow where I deploy the models to dev first before prod, but the schemas and tables are a bit different there vs. prod.
Anna Geller
04/04/2022, 6:27 PM
"if there's tests for ensuring data is going to right database/schema/table using Prefect?"
Gotcha, I don't think we can do that in Prefect 1.0, but this seems to be in scope with Prefect 2.0. I'll keep your use case in mind when discussing this feature with the product team, thanks for sharing!
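Independent of any built-in Prefect feature, the two checks asked about in this thread (the right database/schema/table per environment, and the right dependencies) could be sketched as ordinary Python unit tests over the pipeline's own configuration. Everything below is an assumption made for illustration: the environment names, the target names, the step names, and the helper functions are hypothetical and are not part of Prefect's API.

```python
# Hypothetical per-environment targets; dev and prod differ on purpose.
TARGETS = {
    "dev": {"database": "analytics_dev", "schema": "staging", "table": "orders"},
    "prod": {"database": "analytics", "schema": "core", "table": "orders"},
}

# Hypothetical dependency map: each step lists its direct upstream steps.
UPSTREAM = {
    "extract": [],
    "load": ["extract"],
    "dbt_run": ["load"],
    "dbt_test": ["dbt_run"],
}


def resolve_target(env):
    """Return the fully qualified table name for the given environment."""
    target = TARGETS[env]
    return f"{target['database']}.{target['schema']}.{target['table']}"


def runs_after(step, upstream_step, upstream=UPSTREAM):
    """Return True if `upstream_step` is a direct or transitive upstream of `step`."""
    stack = list(upstream[step])
    seen = set()
    while stack:
        current = stack.pop()
        if current == upstream_step:
            return True
        if current not in seen:
            seen.add(current)
            stack.extend(upstream[current])
    return False


def test_targets_and_dependencies():
    # Dev and prod must never resolve to the same table.
    assert resolve_target("dev") != resolve_target("prod")
    assert resolve_target("prod") == "analytics.core.orders"
    # dbt tests must run only after the data has been loaded.
    assert runs_after("dbt_test", "load")
    assert not runs_after("extract", "load")


test_targets_and_dependencies()
```

Keeping the targets and the dependency map as plain data makes them testable without running the flow; the same dictionaries could then be used when the flow itself is built.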