Krzysztof Nawara

10/09/2020, 7:26 PM
Hi everyone! I'd like to ask for an explanation of how the built-in `all_inputs` cache validator is meant to work. A cache validator receives 3 arguments, but only 2 of those are relevant here:
- state (State): a `Success` state from the last successful Task run that contains the cache
- inputs (dict): a `dict` of inputs that were available on the last successful run of the cached Task
Now my current understanding (almost certainly incorrect) is that they come from the same run. But then the logic of the validator wouldn't make any sense:
elif getattr(state, "hashed_inputs", None) is not None:
    if state.hashed_inputs == {key: tokenize(val) for key, val in inputs.items()}:
        return True
    else:
        return False
elif {key: res.value for key, res in state.cached_inputs.items()} == inputs:
    return True
It just compares the inputs passed directly to the validator with the inputs extracted from the state. So it's pretty clear those 2 arguments can come from different runs, but I don't understand how that is possible. If someone could provide an explanation I'd be very grateful 🙂
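To make the comparison concrete, here is a minimal standalone sketch of the same check. It does not import Prefect: `inputs_match`, the `SimpleNamespace` objects, and the `tokenize` function below are hypothetical stand-ins, and the sketch simply assumes that `state` was recorded by one run while `inputs` may belong to another.

```python
import hashlib
from types import SimpleNamespace


def tokenize(value):
    # Hypothetical stand-in for the `tokenize` used in the snippet above:
    # any deterministic hash of the value works for this illustration.
    return hashlib.sha256(repr(value).encode()).hexdigest()


def inputs_match(state, inputs):
    # Mirrors the two branches quoted above (illustrative, not Prefect's code).
    hashed = getattr(state, "hashed_inputs", None)
    if hashed is not None:
        # Compare the stored hashes against hashes of the inputs passed in.
        return hashed == {key: tokenize(val) for key, val in inputs.items()}
    # Fall back to comparing the stored raw values.
    return {key: res.value for key, res in state.cached_inputs.items()} == inputs


# Assumption: this state was recorded by a run that received x=1.
cached_state = SimpleNamespace(hashed_inputs={"x": tokenize(1)}, cached_inputs={})

same = inputs_match(cached_state, {"x": 1})       # inputs equal to the stored ones
different = inputs_match(cached_state, {"x": 2})  # inputs from some other call

# A state without hashes falls through to the raw-value comparison.
legacy_state = SimpleNamespace(
    hashed_inputs=None, cached_inputs={"x": SimpleNamespace(value=1)}
)
legacy = inputs_match(legacy_state, {"x": 1})
```

If both arguments always came from the same run, the check could never fail, so it only carries information when `inputs` can differ from what the state stored.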

Spencer

10/09/2020, 7:39 PM
I think it depends on the scenario that you're caching. Typically the cache applies when you're re-running a flow, though there may be a case where you're caching a task's output across flows.

Krzysztof Nawara

10/09/2020, 7:43 PM
Hmmm
`the last successful Task run that contains the cache` - so this would be the run that generated the value that is in the cache, correct? Because all runs that read from the cache don't overwrite it.
`last successful run of the cached Task` - but this is the same, isn't it? Last successful run -> run that wrote the cache.
Would it matter which pipeline they belonged to as long as they have the same cache key?