# prefect-community
Hi everyone, I am trying to wrap my head around result caching 😅. On a Core-only run on my workstation, I keep failing to reuse my result on a long-running task. My latest attempt is as follows:
```python
meta_df = SnowflakePandasResultTask(
```
This persists files with arbitrary names under the target location. On every run I get a warning that my cache is no longer valid. Can anyone point me to where I am doing things wrong?
Hi @emre! Each time you run the flow containing this task, are you doing so from a new process?
I think so, I run from the terminal and every run builds the flow, runs it and then exits back to the terminal
ok gotcha - so when using Core alone, the storage of all previous cached runs occurs in memory; this means that if you call this from new processes they have no way of sharing information. However, there is a relatively simple workaround: all cached states from all tasks are stored in `prefect.context.caches`, so if you save this after each run and load it before each run, it should start behaving as you expect. Something like:
```python
import cloudpickle
import prefect

# on save (after the flow run)
with open(".prefect_cache/THE_CACHE.pkl", "wb") as f:
    cloudpickle.dump(prefect.context.caches, f)

# on load (before the flow run)
with open(".prefect_cache/THE_CACHE.pkl", "rb") as f:
    the_cache = cloudpickle.load(f)
```
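For a full round-trip, the save/load above can be wrapped in small helpers. A minimal sketch, with some assumptions: the `save_caches`/`load_caches` names and the missing-file fallback are my own, and it uses stdlib `pickle` for the demo (real Prefect `State` objects are better serialized with `cloudpickle`, as in Chris's snippet):

```python
import os
import pickle

CACHE_PATH = ".prefect_cache/THE_CACHE.pkl"


def save_caches(caches: dict, path: str = CACHE_PATH) -> None:
    """Persist the in-memory task cache dict to disk."""
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "wb") as f:
        pickle.dump(caches, f)


def load_caches(path: str = CACHE_PATH) -> dict:
    """Load a previously saved cache dict; empty dict on first run."""
    if not os.path.exists(path):
        return {}  # nothing cached yet
    with open(path, "rb") as f:
        return pickle.load(f)
```

In a Prefect 1.x-style flow script you would then (assuming the `prefect.context` context-manager API) run something like `with prefect.context(caches=load_caches()): flow.run()` and call `save_caches(prefect.context.caches)` afterwards.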
Thanks @Chris White, worked like a charm! Btw, having this behavior built in would be very useful for me; I would like to see it as a feature. If it sounds ok to you, I want to try adding it as an option to Prefect Core.
In general, for any sort of stateful work we recommend people use Prefect Server or Prefect Cloud, so I'd be hesitant to include this in Core alone: it would require more configuration for caching (where to store the cache), which is already a little confusing for folks
I see, makes sense 🙂