< Marvin> If a flow is executed with two different sets of p Prefect Community #ask-marvin

Janet Carson

12/28/2023, 1:10 AM

@Marvin If a flow is executed with two different sets of parameters, but there is a task executed with the same keyword arguments inside those two flow runs, is it possible that the task results for the second run will be retrieved from the cache, or is it required that the flow parameters also be identical?

Marvin

12/28/2023, 1:11 AM

In Prefect 2.x, task caching works as follows:

- Task caching: Task results are cached in memory during a flow run and persisted to the location specified by the `PREFECT_LOCAL_STORAGE_PATH` setting. Caching of task results between flow runs is therefore limited to runs that have access to the same local storage path.
- Cache keys: Caching is opt-in. A task is cached only when it is given a `cache_key_fn`, and the string that function returns is the cache key. A custom key function such as `cache_within_flow_run`, which combines the flow run ID with a hash of the task inputs, reuses a result only when the task has already been computed with the same inputs within the same flow run.

Given this information:

- If a task is executed in two different flow runs with identical keyword arguments, and the cache key does not include anything specific to the flow run, the task result for the second run can be retrieved from the cache, provided both runs have access to the storage path where the result was persisted.
- The flow parameters do not have to be identical. The cache key is whatever the `cache_key_fn` computes: a function that hashes only the task's inputs ignores flow parameters, while a custom function could take them into account if explicitly coded to do so.

To summarize, as long as the task has a `cache_key_fn` based only on its own inputs, those inputs are identical, and the caching configuration allows for it, the task result can be retrieved from the cache across flow runs with different flow parameters. The behavior can be further customized by defining your own `cache_key_fn` on a task, which can include any logic for computing the cache key. If that function does not inspect flow parameters, they will not affect the caching decision.
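To make the distinction concrete, here is a minimal pure-Python sketch of the two key strategies described above. It is a simplified model, not Prefect's implementation: `RESULT_STORE`, `run_task`, `input_hash_key`, `within_flow_run_key`, and the `flow` helper are all hypothetical names invented for illustration. In Prefect 2.x itself, the key function would be passed to the task decorator as `cache_key_fn`.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List, Tuple

# Stand-in for persisted result storage that every flow run can reach
# (an assumption of this sketch; it plays the role of the shared storage path).
RESULT_STORE: Dict[str, Any] = {}
# Records each time a task body actually executes (i.e. each cache miss).
CALLS: List[Tuple[str, str]] = []


def input_hash_key(flow_run_id: str, task_name: str, kwargs: dict) -> str:
    """Key built from the task's own inputs only; the flow run ID is ignored."""
    payload = json.dumps({"task": task_name, "kwargs": kwargs}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def within_flow_run_key(flow_run_id: str, task_name: str, kwargs: dict) -> str:
    """Key that also embeds the flow run ID, so hits stay inside one run."""
    return f"{flow_run_id}-{input_hash_key(flow_run_id, task_name, kwargs)}"


def run_task(key_fn: Callable[[str, str, dict], str], flow_run_id: str,
             task_name: str, fn: Callable[..., Any], **kwargs: Any) -> Any:
    """Return the stored result on a key match, otherwise run and store."""
    key = key_fn(flow_run_id, task_name, kwargs)
    if key in RESULT_STORE:
        return RESULT_STORE[key]
    CALLS.append((flow_run_id, task_name))
    result = fn(**kwargs)
    RESULT_STORE[key] = result
    return result


def square(x: int) -> int:
    return x * x


def flow(flow_run_id: str, label: str, key_fn: Callable[[str, str, dict], str]) -> int:
    """Hypothetical flow: `label` is a flow parameter the task never sees."""
    return run_task(key_fn, flow_run_id, "square", square, x=4)
```

Calling `flow("run-1", "a", input_hash_key)` and then `flow("run-2", "b", input_hash_key)` executes `square` only once, because the key depends on the task inputs alone and the differing `label` never enters it. Repeating the two calls with `within_flow_run_key` on an empty store executes `square` twice, since each flow run ID produces a different key.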
