< Marvin> how does prefect tasking caching work on prefect 2 Prefect Community #ask-marvin

Tom Matthews

10/07/2024, 2:44 PM

@Marvin how does Prefect task caching work on Prefect 2.0 and Prefect 3.0? Where are the results cached: to disk, in memory, or to cloud storage? (if I'm using open source Prefect deployed on Kubernetes, with work pools)

Marvin

10/07/2024, 2:44 PM

In both Prefect 2.0 and 3.0, task caching allows you to reuse the results of tasks that might be costly to reproduce with every flow, making your workflows more efficient and ensuring the idempotency of your pipelines.

### Prefect 2.0 Caching

Tasks in Prefect 2.0 support caching their return values, which can be reused in subsequent runs. However, the specific storage location (e.g., disk, memory, or cloud storage) is not detailed in the documentation excerpts available. Typically, the caching mechanism would involve some form of result persistence, likely configured by the user with the openness of the Prefect infrastructure in mind.

Source for Prefect 2.0 caching: Prefect 2.0 Task Caching Docs

### Prefect 3.0 Caching

In Prefect 3.0, task caching is implemented with specific configurations that hash based on:

- The inputs provided to the task
- The code definition of the task
- The prevailing flow run ID, or task run ID if executed autonomously

These hashed values form a task's cache key. By default, calling the same task with the same inputs more than once within a flow will result in the cached behavior for all calls after the first. However, this caching also relies on result persistence, and the default setting for this is off. Again, the result persistence usually involves external storage and can be configured based on the user's infrastructure.

Source for Prefect 3.0 caching: Prefect 3.0 Task Caching Docs

In both versions, it's important to configure your environment, such as Kubernetes, with the necessary storage options that meet your caching and persistence needs, since Prefect's open-source configuration allows for flexibility and remote execution scenarios.
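To make the cache-key idea above concrete, here is a minimal standard-library sketch of how a key could be derived from a task's code, its inputs, and a run ID. This is not Prefect's implementation: `sketch_cache_key`, `add_one`, `add_two`, and the run ID strings are all hypothetical names chosen for illustration, and the code is fingerprinted through its compiled bytecode and constants as a simplifying assumption.

```python
import hashlib
import json


def sketch_cache_key(fn, inputs: dict, run_id: str) -> str:
    """Illustrative only: hash a function's code, its inputs, and a run ID."""
    hasher = hashlib.sha256()
    code = fn.__code__
    # Code definition: compiled bytecode plus constants (an assumption made
    # for this sketch, not how Prefect fingerprints task code).
    hasher.update(code.co_code)
    hasher.update(b"|" + repr(code.co_consts).encode())
    # Inputs: serialized with sorted keys so argument order does not matter.
    hasher.update(b"|" + json.dumps(inputs, sort_keys=True).encode())
    # Run ID: scopes the key to a single run.
    hasher.update(b"|" + run_id.encode())
    return hasher.hexdigest()


def add_one(x):
    return x + 1


def add_two(x):
    return x + 2


same_a = sketch_cache_key(add_one, {"x": 1}, "run-1")
same_b = sketch_cache_key(add_one, {"x": 1}, "run-1")
other_input = sketch_cache_key(add_one, {"x": 2}, "run-1")
other_run = sketch_cache_key(add_one, {"x": 1}, "run-2")
other_code = sketch_cache_key(add_two, {"x": 1}, "run-1")
```

Identical code, inputs, and run ID yield the same key, so a second call could reuse the stored result. Changing any one of the three yields a different key. Because the run ID is part of the key in this sketch, reuse is limited to a single run.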

Nate

10/07/2024, 2:46 PM

> Where are the results cached - to disk, in memory, or to cloud storage?

these are all valid options: you set `result_storage` on the task or flow you want to configure, which can be an `S3Bucket`, `LocalFileSystem`, or some other writeable filesystem

Nate

10/07/2024, 2:46 PM

https://docs.prefect.io/3.0/develop/results#result-storage

Tom Matthews

10/07/2024, 2:46 PM

Thanks Nate, is it the same for 2.0 and 3.0?

Nate

10/07/2024, 2:47 PM

the differences are pretty deep in the weeds, for most users it will be exactly the same between 2 and 3

Tom Matthews

10/07/2024, 2:48 PM

Okay, so I can assume that I can have cross-flow / cross-deployment caching if I just configure an `S3Bucket` filesystem / block and point to that with the `result_storage` arg to the `@task` decorator?

Tom Matthews

10/07/2024, 2:49 PM

for v2 and v3?

Nate

10/07/2024, 2:49 PM

yes! also this discussion might be useful as an example just in terms of caching mechanics / cache policies

Tom Matthews

10/07/2024, 2:51 PM

Nice! Thanks very much Nate! 🙏
