Hi everyone!
I am defining the cache policy for a task whose function body calls a couple of utility functions I wrote. I want the cached result to be used only when the task is called with the same inputs and neither its own source code nor the source of the functions it calls has been modified. I set
@task(cache_policy=INPUTS + TASK_SOURCE)
but with this policy, if the source of any function called inside the task changes, the cached result is still used instead of rerunning the task. How can I solve this?
Nate
03/11/2025, 2:13 PM
hi @Valerio - you'd have to use a custom cache key fn, so CacheKeyFnPolicy, and then use something like inspect to get the relevant source code and hash it. By default (and by design) the TASK_SOURCE policy doesn't recursively consider the source code of functions called inside the task
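A minimal sketch of that suggestion, using only the standard library for the hashing part. The helper functions clean and summarize are hypothetical stand-ins for the utility functions called inside the task, and source_fingerprint and helpers_cache_key are illustrative names, not part of Prefect. The Prefect wiring is left in comments and assumes that a cache key function receives (context, parameters) and that CacheKeyFnPolicy can be combined with INPUTS and TASK_SOURCE using +; check both against the Prefect version you run.

```python
import hashlib
import inspect
from typing import Any, Callable, Dict, Optional


def clean(values):
    # Hypothetical utility function called inside the task.
    return [v for v in values if v is not None]


def summarize(values):
    # Hypothetical utility function called inside the task.
    return sum(values) / len(values)


def source_fingerprint(*funcs: Callable[..., Any]) -> str:
    """Hash the source text of the given functions into one hex digest."""
    digest = hashlib.sha256()
    for fn in funcs:
        try:
            text = inspect.getsource(fn)
        except (OSError, TypeError):
            # Source not available (e.g. defined interactively): fall back to
            # the raw bytecode, which is coarser than the source text.
            text = fn.__code__.co_code.hex()
        digest.update(text.encode("utf-8"))
    return digest.hexdigest()


def helpers_cache_key(context: Any, parameters: Dict[str, Any]) -> Optional[str]:
    """Cache key fn: changes whenever the listed helpers' source changes.

    Inputs and the task's own source are ignored here on the assumption
    that INPUTS and TASK_SOURCE cover them in the combined policy.
    """
    return source_fingerprint(clean, summarize)


# Assumed Prefect 3 wiring (not executed here):
#
# from prefect import task
# from prefect.cache_policies import INPUTS, TASK_SOURCE, CacheKeyFnPolicy
#
# @task(cache_policy=INPUTS + TASK_SOURCE
#       + CacheKeyFnPolicy(cache_key_fn=helpers_cache_key))
# def my_task(values):
#     return summarize(clean(values))
```

The helpers have to be listed by hand, and anything they call in turn is not followed, so every function whose changes should invalidate the cache needs to be passed to source_fingerprint explicitly.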