# ask-marvin
d
@Marvin I use the S3 storage block to store my flow code. However, when I create new deployments, deleted files are not removed from S3, which causes bugs in my flow code. Is there any way to force Prefect to delete files from S3 which aren't included in the Deployment code?
m
It appears that the Prefect documentation does not contain explicit instructions about deleting files from S3 that are no longer part of the deployment code. However, keeping deployment artifacts in sync with the codebase could be part of a CI/CD process: a custom script that cleans up the S3 bucket and runs as part of your deployment process would ensure that only the current files are retained in S3.

If necessary, you might consider raising an issue or requesting this feature on the Prefect GitHub repository: Prefect GitHub Issues. As this seems like a common issue one might face when deploying flows with Prefect, I recommend reaching out to the Prefect community or support to see if there are any established patterns or recommendations for this scenario.

Please refer to these links for additional context about storage and deployment in Prefect:
- Prefect Storage Blocks
- Prefect Block-Based Deployments
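One way to sketch the cleanup script Marvin describes is with boto3: delete any object under the code prefix that no longer exists locally. The bucket name, prefix, and local directory below are placeholders, not values from this thread, and the sketch assumes the upload mirrors the local directory layout.

```python
from pathlib import Path

import boto3

BUCKET = "my-prefect-code"          # hypothetical bucket used by the S3 storage block
PREFIX = "flows/my-flow/"           # hypothetical bucket_path prefix for this deployment
LOCAL_ROOT = Path("flows/my_flow")  # local directory that gets uploaded on deploy

s3 = boto3.client("s3")

# Keys that *should* exist, assuming the upload mirrors the local layout under PREFIX
local_keys = {
    PREFIX + p.relative_to(LOCAL_ROOT).as_posix()
    for p in LOCAL_ROOT.rglob("*")
    if p.is_file()
}

# Delete anything under the prefix that is no longer present locally
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        if obj["Key"] not in local_keys:
            s3.delete_object(Bucket=BUCKET, Key=obj["Key"])
```

Run as a CI step alongside the deployment build; `aws s3 sync <local-dir> s3://<bucket>/<prefix> --delete` achieves roughly the same effect in a single command.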
n
We generally won't (I don't think ever) automatically delete things from your infrastructure.
> causes bugs in my flow code.
what bugs are caused by deleting deployments?
d
@Nate A similar issue was experienced here -- this may be specific to DBT, but DBT is designed to run all the SQL models in a given directory. If we delete a model from that directory in our repository, but that deletion doesn't happen in S3, DBT ends up running the deleted model inside of our Prefect flows because it was never removed from S3. We could write some sort of custom CI/CD script to remove files older than the latest deployment's timestamp, but this seems like something Prefect should be able to handle. Let me know if I can clarify.
n
Hmm, that's interesting (disclaimer: I'm not a dbt buff). At the time when you write to your disk, can you look up the deployment that needs to use those files / paths? I'm thinking that if you wrote to a path that contains the name + version of the deployment that will need that path, you could have the flow use
`prefect.runtime.deployment`
to only try to run the models that are in the place on disk that corresponds to that deployment. Then yeah, if you needed to clean up stuff on disk associated with old deployments, you could have some script in CI that deletes paths associated with deployments you can't find in the API anymore. Does that make sense?
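A minimal sketch of that idea, assuming models are staged under a per-deployment directory; the directory layout and the `run_model` helper are hypothetical and not part of Prefect or dbt:

```python
from pathlib import Path

from prefect import flow
from prefect.runtime import deployment

MODELS_ROOT = Path("models")  # hypothetical root where per-deployment models are written


def run_model(sql_file: Path) -> None:
    """Hypothetical helper that executes a single model file."""
    print(f"running {sql_file}")


@flow
def run_models():
    # prefect.runtime.deployment exposes metadata about the deployment that
    # created this flow run (its attributes are empty outside a deployment)
    dep_dir = MODELS_ROOT / (deployment.name or "local")
    for sql_file in sorted(dep_dir.glob("*.sql")):
        run_model(sql_file)  # only models staged for *this* deployment are run
```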
d
Hey Nate -- Yeah, sort of. I think I came up with a similar but slightly different approach. I was planning to dynamically set the S3 `bucket_path` to include the current date when we define the
`storage`
object for a given deployment. It would mean that every time we deploy new code we would essentially create a new S3 directory for it. Then we could just set some lifecycle rules on our S3 bucket to get rid of old code after a couple months or something. I do still feel like in
`prefect.filesystems.S3`
it would be nice to have an optional attribute to overwrite the directory when you instantiate an S3 storage object, but the above solution is certainly acceptable to get things working. https://docs.prefect.io/latest/api-ref/prefect/filesystems/#prefect.filesystems.S3

Re: DBT --> This is just my personal example. I guess I could foresee this being an issue not just with DBT, but really with any script that runs every file inside a given directory.
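A hedged sketch of the date-stamped storage approach, assuming Prefect 2's block-based deployments; the bucket name and block slug below are placeholders:

```python
from datetime import date

from prefect.filesystems import S3

# Create (or update) an S3 storage block whose bucket_path includes today's date,
# so each deploy uploads into a fresh prefix. Bucket and block names are made up.
storage = S3(bucket_path=f"my-prefect-code/{date.today().isoformat()}")
storage.save("dbt-flow-storage", overwrite=True)

# An S3 lifecycle rule on the bucket can then expire old prefixes after a retention
# window, removing stale code without Prefect having to delete anything itself.
```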
n
That makes sense to me. If you have the bandwidth, an issue codifying your ask here would be helpful in making sure this problem is articulated / on our radar officially!
d
where can I file an issue?
d
Thanks!
n
👍