# ask-community
k
Hello all 🙂 Is it possible to have dynamic cache keys? Currently they can be templated, so they are semi-dynamic, but I haven't found any way to make data that's passed through the pipeline part of the key. Use case: caching mapped tasks where the order can be non-deterministic, so
prefect.context.map_index
might not be enough
n
Hi @Krzysztof Nawara - can you help explain a little further what you're trying to achieve? Maybe a small code sample might help as well
k
Hi @nicholas 🙂 So the upstream task produces a list of files that I want to process - so I'm using a mapped task. Now I want to cache the results of those mapped tasks, but if the list grows/shrinks/changes, cache keys that rely on
map_index
incorrect results are going to be read from cache. Does it make more sense now?
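The staleness problem described above can be sketched in plain Python (a hypothetical illustration, not Prefect code): when the cache key depends only on the child's position, a change in the mapped list silently shifts which cached result each file receives.

```python
# Hypothetical sketch: why map_index-based cache keys go stale
# when the mapped input list changes between runs.

def cache_key(map_index):
    # key depends only on the child's position, not on its input
    return f"process-file-{map_index}"

# First run: cache is populated per map_index
files_run_1 = ["a.csv", "b.csv", "c.csv"]
cache = {cache_key(i): f"processed:{f}" for i, f in enumerate(files_run_1)}

# Second run: "a.csv" was removed, so every index now points at a different file
files_run_2 = ["b.csv", "c.csv"]
results = [cache[cache_key(i)] for i, f in enumerate(files_run_2)]

print(results)  # ['processed:a.csv', 'processed:b.csv'] -- results for the wrong files
```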
n
Definitely, thank you @Krzysztof Nawara! I would try something like this:
Copy code
from datetime import timedelta

from prefect import task
from prefect.engine.cache_validators import partial_inputs_only

@task(
    cache_key="some_global_key",  # shared among all mapped children
    cache_for=timedelta(days=1),
    cache_validator=partial_inputs_only(validate_on=['x', 'y']))
def add(x, y):
    return x + y
where you can validate the cache on something like the name of the file that you pass from the upstream task, and any other number of inputs
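The idea behind validating on specific inputs can be sketched in plain Python (a simplified stand-in for Prefect's `partial_inputs_only` validator, with hypothetical input names like `filename`): a cached entry is only reused when the selected inputs of the current call match those recorded with the cache.

```python
# Minimal sketch of a partial-inputs cache validator: reuse the cached
# result only when the inputs named in `validate_on` match between the
# cached run and the current run.

def partial_inputs_validator(cached_inputs, current_inputs, validate_on):
    return all(cached_inputs.get(k) == current_inputs.get(k) for k in validate_on)

# Inputs recorded when the cache entry was written (processed "b.csv")
cached = {"filename": "b.csv", "attempt": 1}

# Same file requested again -> cache hit, even though "attempt" differs
assert partial_inputs_validator(cached, {"filename": "b.csv", "attempt": 2},
                                validate_on=["filename"])

# Different file at the same map index -> cache miss, task reruns
assert not partial_inputs_validator(cached, {"filename": "c.csv", "attempt": 1},
                                    validate_on=["filename"])
```

Validating on the filename rather than the map index means the cache stays correct even when the upstream file list is reordered or resized.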
k
Does it mean that prefect will iterate over all matching cache entries (with the same cache key) until it finds the one for which the validator returns True? And another question - how is that behaviour implemented under the hood? From what I have seen in the signature, cache_validator doesn't get access to the input values of the current execution, only to the previous ones?
Copy code
- state (State): a `Success` state from the last successful Task run that
    contains the cache
- inputs (dict): a `dict` of inputs that were available on the last
    successful run of the cached Task
- parameters (dict): a `dict` of parameters that were available on the
    last successful run of the cached Task
I decided to check the source code and I'm even more confused - while state indeed comes from the cache, both inputs and parameters come from the current run of the flow, not the cached one.
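The call pattern being described can be sketched as follows (hypothetical names, not Prefect source): the validator receives the *cached* state alongside the *current* run's inputs, so it effectively compares "then" against "now" to decide whether the cache is still valid.

```python
# Sketch of the described validator contract: state comes from the cache,
# while inputs/parameters come from the current run of the flow.

class CachedState:
    """Stand-in for a cached Success state."""
    def __init__(self, result, cached_inputs):
        self.result = result
        self.cached_inputs = cached_inputs  # inputs captured at cache time

def run_with_cache(cached_state, current_inputs, validator, compute):
    # validator sees cached state + current inputs; reuse result on a hit
    if cached_state and validator(cached_state, current_inputs):
        return cached_state.result
    return compute(**current_inputs)

# Validator: cache is valid only if the recorded inputs match the current ones
validator = lambda state, inputs: state.cached_inputs == inputs
state = CachedState(result=42, cached_inputs={"filename": "b.csv"})

# Same inputs as the cached run -> cached result is returned
assert run_with_cache(state, {"filename": "b.csv"}, validator, lambda **kw: -1) == 42

# Different inputs -> validator rejects the cache and the task recomputes
assert run_with_cache(state, {"filename": "c.csv"}, validator,
                      lambda **kw: len(kw["filename"])) == 5
```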
n
Hi @Krzysztof Nawara - I'm a little confused, are you running into a problem with caching?
k
Not really a problem, but I don't think I understand how the cache validator is designed to work. It seems to me that the comments on the arguments are inconsistent with the implementation.
n
Hm, got it. That's definitely something you could open a PR for where you see inconsistencies! I know there was a little work done on those docstrings recently since we added Results
k
@nicholas Makes sense, thanks for the help 🙂
👍 1