Hawkar Mahmod
04/01/2021, 1:23 PM
emre
04/01/2021, 1:45 PM
<s3://your_bucket/your/path/to/pickle>
Check out results for more details:
https://docs.prefect.io/core/concepts/results.html#results
Hawkar Mahmod
04/01/2021, 1:46 PM
emre
04/01/2021, 1:49 PM
emre
04/01/2021, 1:50 PM
prefect.context object, which is in memory, and therefore short-lived. The persisting output section notes that you need a Result object to explicitly specify where and how your data will be stored.
Hawkar Mahmod
04/01/2021, 1:53 PM
emre
04/01/2021, 3:18 PM
cache_for and cache_key isn't good enough. The first run notes that the cache is invalid and runs tasks normally. Subsequent runs mark the task as cached, meaning a cache has been found, but pass None to downstream tasks, failing the flow.
Adding a result parameter, specifically a LocalResult object, made the non-cached task run persist its output, and subsequent runs used the cached value successfully.
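As a rough sketch of the setup described above, assuming the Prefect 0.14-era API that was current at the time of this thread: a task configured with cache_for and cache_key, plus a LocalResult so that its output is actually persisted. The task names, the cache key, the flow name, and the result directory are made up for illustration and do not come from the thread.

```python
from datetime import timedelta


def build_flow(result_dir="./prefect-results"):
    """Build a small flow whose first task is cached and persists its output."""
    # Prefect 1.x (0.14-era) import paths. They are imported inside the
    # function so that defining it does not itself require Prefect.
    from prefect import Flow, task
    from prefect.engine.results import LocalResult

    @task(
        cache_for=timedelta(hours=1),        # how long a cached state stays valid
        cache_key="expensive-numbers",       # hypothetical key name
        result=LocalResult(dir=result_dir),  # where the task output is written
    )
    def expensive_numbers():
        return [1, 2, 3]

    @task
    def total(numbers):
        # Downstream task that needs the real upstream value, not None.
        return sum(numbers)

    with Flow("cache-with-result") as flow:
        total(expensive_numbers())

    return flow
```

Calling build_flow() returns a flow that can be run or registered as usual. Per the observation above, it is the result argument that gives a later cached run something to read back.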
Here is what I think is going on:
A Result merely exists as a way to persist task outputs somewhere. It does not have to be involved with caching.
For Prefect Core runs, the cache is simply prefect.context. Result configurations aren't involved here at all.
For server/cloud runs, the cache is stored on the API side. The cached value is not the data itself, however, but the Result object's location. If the server API determines that a task has a valid cache, the cached location is used, alongside your Result configuration, to retrieve the actual value of your data.
Zanie
Results bridge the gap between the API and your runtime environment. Since the API is designed to maintain separation from your data, we need a way to tell the API where the data is stored in your own infrastructure.
Zanie
Jeremy Tee
04/02/2021, 1:42 AM
Hawkar Mahmod
04/06/2021, 7:48 AM
Zanie
Zanie