Does a local agent delete cached files when it's d...
# ask-community
t
Does a local agent delete cached files when it's done with them? I have a machine running a local agent, set up for some larger, high-memory processes, and the disk keeps filling up with the ~/.prefect/results folder. Do I need to do something for cached files to be cleaned up automatically?
k
Hey @Tom Shaffner, it does not; Prefect does not delete files. What I’ve seen some users do is point the location of the checkpoint at the same file every time. This means you always have the checkpoints of the most recent run, and new runs overwrite the old files. Doing this gives you retries from failure while minimizing the data footprint.
You can also turn off checkpointing for some tasks if it makes sense.
t
That's a bit surprising; the files are all automatically dated, so is it not possible for Prefect to delete them after a certain amount of time? It could simply delete the ones older than the longest cache each day. I'm assuming the location you referenced above is the result option discussed at https://docs.prefect.io/core/concepts/persistence.html#output-caching-based-on-a-file-target? I'll try using it for now then; if I'm misunderstanding what you mean, please let me know.
k
Yes, exactly, on the result location. I think we’re just not in the business of deleting files in general.
t
@Kevin Kho, I just tried it using result=LocalResult(dir='~/.prefect/results/{plan}_plan_data'), and in that case the {plan} part wasn't filled in for the mapped tasks; rather, it created a directory with that literal name and saved all the files in it, in the usual name format. I realized, though, that's the dir parameter, so I thought maybe I needed to switch to a file one. According to the docs (https://docs.prefect.io/api/latest/engine/results.html#localresult), though, there IS no file parameter. So how do I do this with mapped tasks?
k
For mapped tasks, you need to template the location so that each mapped task gets a separate location.
t
@Kevin Kho isn't that what I was doing above? The brackets in the result were an attempt to do just that; is there a difference I'm missing?
Oooh, looking at the link you sent, I think it's that I was using dir, and I should use location. I'll try that.
k
Yeah that sounds right
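To illustrate the point about templating, here is a minimal sketch of why a fixed location makes every mapped task overwrite the same file, while a templated one gives each mapped task its own. It does not call Prefect; it only mimics the substitution with Python's str.format, and the helper resolve_location and the keys plan and map_index are hypothetical names chosen for illustration.

```python
from typing import Dict, List


def resolve_location(template: str, context: Dict[str, object]) -> str:
    """Fill a location template from a context dict (illustrative stand-in only)."""
    return template.format(**context)


# Hypothetical per-task values for three mapped children of one task.
contexts: List[Dict[str, object]] = [
    {"plan": "alpha", "map_index": 0},
    {"plan": "beta", "map_index": 1},
    {"plan": "gamma", "map_index": 2},
]

# A fixed location: every mapped child resolves to the same file name,
# so each write would replace the previous one.
fixed = [resolve_location("plan_data.prefect", c) for c in contexts]

# A templated location: each mapped child resolves to its own file name,
# and a later run with the same inputs overwrites only that file.
templated = [resolve_location("{plan}_plan_data.prefect", c) for c in contexts]

print(fixed)
print(templated)
```

In the thread's setup, the templated string would go in the location argument rather than in dir, which is what the earlier attempt got wrong.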
t
It's then perhaps worth noting that this location parameter doesn't show up in the docs at https://docs.prefect.io/api/latest/engine/results.html#localresult.
k
Ah cuz it’s inherited from the base class and those are autogenerated
t
Good to know; I'll check that next time, thanks! Retrying this way; if you don't hear from me it worked. Thanks for the help!
Incidentally, this worked but was only a partial solution. It worked for those particular files, but in other cases the flow still seems to be creating and caching a lot of files. To correct that I added a daily cron job that deletes all files in the results folder older than 3 days (longer than my longest cache). It's clunky, and I really think Prefect should have something like this built into it, but it's the only reliable way I have to keep my local agent from filling up my drive each week for large processes.
For this purpose, https://www.howtogeek.com/howto/ubuntu/delete-files-older-than-x-days-on-linux/ was useful, for any others following this chain.
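For anyone who would rather do the cleanup in Python than with find, the cron job described above could be sketched roughly as follows. This is not something Prefect provides; the function name delete_old_results is made up for this example, and the 3-day threshold is the value from this thread, which should stay longer than your longest cache.

```python
import time
from pathlib import Path
from typing import List, Optional


def delete_old_results(
    root: Path, max_age_days: float = 3.0, now: Optional[float] = None
) -> List[Path]:
    """Delete regular files under root last modified more than max_age_days ago.

    Returns the paths that were removed. Directories are left in place.
    """
    current = time.time() if now is None else now
    cutoff = current - max_age_days * 86400  # seconds per day
    removed: List[Path] = []
    if not root.is_dir():
        return removed
    # Materialize the listing first so deletions do not disturb the walk.
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.stat().st_mtime < cutoff:
            path.unlink()
            removed.append(path)
    return removed
```

Run daily (from cron or any scheduler), a call such as delete_old_results(Path("~/.prefect/results").expanduser()) would remove result files older than three days. Try it on a copy of the folder first, since deletions are permanent.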