Hawkar Mahmod
04/01/2021, 1:23 PM
emre
04/01/2021, 1:45 PM
<s3://your_bucket/your/path/to/pickle>
Check out results for more details:
https://docs.prefect.io/core/concepts/results.html#results
Hawkar Mahmod
04/01/2021, 1:46 PM
emre
04/01/2021, 1:49 PM
emre
04/01/2021, 1:50 PM
prefect.context object, which is in memory, and therefore short-lived. The persisting output section notes that you need a Result object to explicitly specify where and how your data will be stored.
Hawkar Mahmod
04/01/2021, 1:53 PM
emre
04/01/2021, 3:18 PM
cache_for and cache_key isn't good enough. The first run notes that the cache is invalid and runs tasks normally. Subsequent runs mark the task as cached, meaning a cache has been found, but pass None to downstream tasks, failing the flow.
Adding a result parameter, specifically a LocalResult object, made the non-cached task run persist its output, and subsequent runs used the cached value successfully.
Here is what I think is going on:
A Result merely exists as a way to persist task outputs somewhere. It does not have to be involved with caching.
For Prefect Core runs, the cache is simply prefect.context. Result configurations aren't involved here at all.
For server/cloud runs, the cache is stored on the API side. The cached value, however, is not the data itself but the Result object's location. If the server API determines that a task has a valid cache, the cached location is used, alongside your Result configuration, to retrieve the actual value of your data.
Zanie
Results bridge the gap between the API and your runtime environment. Since the API is designed to maintain separation from your data, we need a way to tell the API where the data is stored in your own infrastructure.
Zanie
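The split described in this thread (the API remembers only a location, and the runtime needs a Result to turn that location back into data) can be illustrated with a toy model. This is only a sketch of the behavior as described here, not Prefect's actual implementation; ToyApi, ToyLocalResult, and run_task are hypothetical names made up for the illustration.

```python
import os
import pickle
import tempfile


class ToyLocalResult:
    """Hypothetical stand-in for a Result: knows how to write and read a value."""

    def __init__(self, directory):
        self.directory = directory

    def write(self, key, value):
        location = os.path.join(self.directory, f"{key}.pickle")
        with open(location, "wb") as fh:
            pickle.dump(value, fh)
        return location

    def read(self, location):
        with open(location, "rb") as fh:
            return pickle.load(fh)


class ToyApi:
    """Hypothetical stand-in for the API: stores locations, never the data."""

    def __init__(self):
        self.cached_locations = {}


def run_task(api, cache_key, fn, result=None):
    if cache_key in api.cached_locations:
        # Cache hit: the value can only be recovered through a Result.
        location = api.cached_locations[cache_key]
        if result is None or location is None:
            return None
        return result.read(location)
    # Cache miss: run the task and record only where the output went.
    value = fn()
    location = result.write(cache_key, value) if result else None
    api.cached_locations[cache_key] = location
    return value


api = ToyApi()

# No Result configured: the second run is "cached" but has nothing to load.
first_without = run_task(api, "no-result", lambda: 42)
second_without = run_task(api, "no-result", lambda: 42)

# Result configured: the second run loads the value from the stored location.
with tempfile.TemporaryDirectory() as tmp:
    store = ToyLocalResult(tmp)
    first_with = run_task(api, "with-result", lambda: 42, result=store)
    second_with = run_task(api, "with-result", lambda: 42, result=store)
```

In this model the second run without a Result yields None, mirroring the failure reported earlier in the thread, while the run with a Result gets the original value back.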
Jeremy Tee
04/02/2021, 1:42 AM
Hawkar Mahmod
04/06/2021, 7:48 AM
Zanie
Zanie