Janet Carson
12/28/2023, 1:10 AM
Marvin
12/28/2023, 1:11 AM
PREFECT_LOCAL_STORAGE_PATH setting. This implies that the caching of task results between flow runs is limited to those runs that have access to the same local storage path.
- Task run identifiers: A cached result is looked up by a cache key. When a cache key function such as cache_within_flow_run is used, the key combines the flow run ID with a hash of the task inputs, so a result is retrieved from cache only if the task has already been computed with the same inputs within the same flow run.
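As a rough sketch of how such a key could be built: the snippet below uses only the standard library, so input_hash is a stand-in for Prefect's own input hashing and the context objects are hypothetical stubs, not real Prefect run contexts. Only the (context, parameters) signature and the flow_run_id lookup mirror what a cache key function receives.

```python
import hashlib
import json
from types import SimpleNamespace


def input_hash(parameters: dict) -> str:
    """Stable hash of a task's inputs (stand-in for Prefect's input hashing)."""
    payload = json.dumps(parameters, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


def cache_within_flow_run(context, parameters: dict) -> str:
    """Key = flow run ID + input hash, so a hit is only possible inside one flow run."""
    return f"{context.task_run.flow_run_id}-{input_hash(parameters)}"


# Hypothetical stubs for the context object passed to a cache key function.
run_a = SimpleNamespace(task_run=SimpleNamespace(flow_run_id="flow-run-a"))
run_b = SimpleNamespace(task_run=SimpleNamespace(flow_run_id="flow-run-b"))

# Same flow run, same inputs: identical keys, so the second call could hit the cache.
same_run = cache_within_flow_run(run_a, {"x": 1}) == cache_within_flow_run(run_a, {"x": 1})
# Different flow runs, same inputs: different keys, so no cross-run reuse.
cross_run = cache_within_flow_run(run_a, {"x": 1}) == cache_within_flow_run(run_b, {"x": 1})
print(same_run, cross_run)  # True False
```

With Prefect installed, a function with this signature would typically be attached to a task through the cache_key_fn argument of the task decorator.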
Given this information:
- If a task runs in two different flow runs with identical keyword arguments (and the cache settings allow caching across flow runs), the second run can retrieve the task result from the cache, provided it has access to the shared local storage path where the result was persisted.
- The flow parameters themselves do not necessarily have to be identical. The caching mechanism primarily looks at the task's inputs and parameters to determine if the result can be retrieved from cache. However, the cache key strategy (implemented via the cache_key_fn) could potentially take other factors into consideration, including flow parameters, if explicitly coded to do so.
To summarize, as long as the cache key depends only on the task inputs and the caching configuration allows for it, task results can be retrieved from the cache across different flow runs, even when those runs have different flow parameters.
The behavior can be further customized by defining a cache_key_fn on a task, which can include any logic for computing the cache key and therefore for deciding whether a cached result is reused. If the caching strategy does not inspect flow parameters, they will not impact the caching decision.
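To make the difference concrete, here is a small simulation rather than real Prefect code. The dictionary stores stand in for persisted results, run_task is a hypothetical helper, and flow_parameters is an attribute invented for the stub context (it is an assumption, not a documented Prefect field). It compares a key built from task inputs only with one that also folds in a flow parameter.

```python
import hashlib
import json
from types import SimpleNamespace


def input_hash(parameters: dict) -> str:
    """Stable hash of a task's inputs (stand-in for Prefect's input hashing)."""
    payload = json.dumps(parameters, sort_keys=True, default=str)
    return hashlib.sha256(payload.encode()).hexdigest()


def key_from_inputs(context, parameters: dict) -> str:
    """Ignores the flow entirely: identical task inputs always give the same key."""
    return input_hash(parameters)


def key_with_flow_region(context, parameters: dict) -> str:
    """Also folds in a flow parameter (hypothetical attribute on the stub context)."""
    region = context.flow_parameters.get("region")
    return f"{region}-{input_hash(parameters)}"


def run_task(store: dict, key_fn, context, parameters: dict):
    """Hypothetical helper: return (result, was_cache_hit) using a dict as the cache."""
    key = key_fn(context, parameters)
    if key in store:
        return store[key], True
    result = parameters["x"] * 2  # placeholder for the real task body
    store[key] = result
    return result, False


# Two simulated flow runs with different flow parameters but the same task inputs.
eu = SimpleNamespace(flow_parameters={"region": "eu"})
us = SimpleNamespace(flow_parameters={"region": "us"})

shared_store = {}
_, hit_first = run_task(shared_store, key_from_inputs, eu, {"x": 21})
_, hit_second = run_task(shared_store, key_from_inputs, us, {"x": 21})

scoped_store = {}
run_task(scoped_store, key_with_flow_region, eu, {"x": 21})
_, hit_scoped = run_task(scoped_store, key_with_flow_region, us, {"x": 21})

print(hit_first, hit_second, hit_scoped)  # False True False
```

The input-only key lets the second flow run reuse the first run's result, while the key that includes the flow parameter forces a recomputation.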