# random
Looking for any resources regarding development lifecycles between local, dev, and prod environments for data engineering projects - specifically using Dask/Prefect, but that's probably too niche so other resources are welcome!
I would love to see more, but the only thing on this topic I've seen so far is this
It is quite niche though, since it depends on the CI/CD and the Storage you use (GitHub? Docker?). Some people use their CI/CD pipeline to register their flows in different environments. On the Prefect side, you can use a separate project for each environment. If you are on enterprise, this separation can be provided at the tenant level.
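To make the project-per-environment idea concrete, here is a minimal sketch of what a CI job could call. It assumes Prefect 1.x, where `Flow.register(project_name=...)` exists; the project names (`etl-dev`, `etl-prod`) and the helpers `project_for` and `register_flow` are hypothetical, not something from this thread.

```python
# Hypothetical mapping from deployment environment to Prefect project name.
PROJECTS = {"dev": "etl-dev", "prod": "etl-prod"}


def project_for(env: str) -> str:
    """Return the Prefect project that hosts flows for the given environment."""
    try:
        return PROJECTS[env]
    except KeyError:
        raise ValueError(f"unknown environment: {env!r}") from None


def register_flow(flow, env: str):
    """Register a flow into the project for `env`.

    `flow` is assumed to be a Prefect 1.x Flow; the projects themselves
    are assumed to already exist on the backend.
    """
    return flow.register(project_name=project_for(env))
```

A CI pipeline could then call `register_flow(flow, "dev")` on merge requests and `register_flow(flow, "prod")` on the main branch, with the backend credentials supplied by the CI environment.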
Awesome - thank you! Yes - we are planning on Kubernetes deployments of Prefect Server and Dask Gateway, with GitLab storage. We initially thought we would have multiple deployments of Prefect Server rather than projects - one for development and one for production - using different IAM roles for each to control what is accessed and where things are run. Something like running Prefect flows locally via Python (`python myflow.py`) with a dedicated Dask cluster for this purpose, then running in development, then finally pushing to prod. Things are a little tricky with the local step, because we may have a use case for users playing with sensitive data at that stage, which is a whole other animal.
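One way the local, dev, prod progression could look inside `myflow.py` is to pick the Dask scheduler address from the environment, so the same flow file runs against a different cluster at each stage. This is only a sketch: the variable names `FLOW_ENV` and `DASK_ADDRESS_<ENV>` and the helper `dask_address_for` are made up for illustration, and the commented usage assumes Prefect 1.x's `prefect.executors.DaskExecutor`.

```python
import os

# Stages in the order a flow is promoted through them.
STAGES = ("local", "dev", "prod")


def dask_address_for(env: str, environ=None) -> str:
    """Look up the Dask scheduler address for a stage.

    Reads a hypothetical variable such as DASK_ADDRESS_DEV, so each stage
    can point at its own dedicated cluster.
    """
    if env not in STAGES:
        raise ValueError(f"unknown stage: {env!r}")
    environ = os.environ if environ is None else environ
    key = f"DASK_ADDRESS_{env.upper()}"
    address = environ.get(key)
    if not address:
        raise RuntimeError(f"{key} is not set")
    return address


# Usage inside myflow.py (assumes Prefect 1.x is installed):
#
#   from prefect.executors import DaskExecutor
#   env = os.environ.get("FLOW_ENV", "local")
#   flow.run(executor=DaskExecutor(address=dask_address_for(env)))
```

Keeping the address outside the flow code means the IAM role attached to each cluster, rather than the flow itself, decides what data a run can reach.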
Ah, I see. But even if you had multiple servers, the sensitive data being exposed would still be an issue, right? The setup to push to prod looks good though.