# prefect-getting-started
r
hi there, a question that is not very clear from the documentation: when we use cache_key_fn to cache the result, how do we make the cache write to S3? Or does this have to be persist_result with result_storage? I found this bit very confusing
k
caching layers on top of the results system, so the persisted result going to your configured result storage is the thing that'll make it work
r
do you by chance have an example we can refer to?
I am not sure I understand why I still need a cache key if I use persist_result:

```python
from datetime import timedelta

from prefect import flow, task
from prefect.tasks import task_input_hash
from prefect_aws.s3 import S3Bucket

# Configure S3 storage
s3_block = S3Bucket.load("my-s3-block")


@task(
    cache_key_fn=task_input_hash,
    cache_expiration=timedelta(days=1),
    persist_result=True,
    result_storage=s3_block,
)
def my_cached_task(input_data):
    # Task logic here
    processed_result = input_data  # placeholder for the real processing
    return processed_result


@flow(result_storage=s3_block)
def my_flow(input_data):
    result = my_cached_task(input_data)
    return result


# Run the flow
my_flow("some input")
```
k
ah, the difference between just persisting results and caching can be confusing, yeah
with only result persistence, successful tasks will be skipped when you retry the same flow run
with caching, tasks that have completed with persisted results will be skipped across different flow runs if the cache key fn returns the same value
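To make the layering concrete, here is a toy model in plain Python. It is not Prefect's implementation: `ToyTaskRunner`, `shared_store`, and `double` are hypothetical names invented for illustration, and a dict stands in for the configured result storage. It only shows the idea that a second flow run skips the task when the cache key matches and the persisted result is still reachable.

```python
import hashlib
import json


class ToyTaskRunner:
    """Toy model of caching layered on persisted results (not Prefect's real code)."""

    def __init__(self, result_storage):
        # result_storage stands in for a shared store such as an S3 bucket.
        self.result_storage = result_storage
        self.executions = 0

    def cache_key(self, fn_name, parameters):
        # Same idea as hashing task inputs: identical inputs give identical keys.
        payload = json.dumps([fn_name, parameters], sort_keys=True)
        return hashlib.sha256(payload.encode()).hexdigest()

    def run(self, fn, **parameters):
        key = self.cache_key(fn.__name__, parameters)
        if key in self.result_storage:
            # Cache hit: reuse the persisted result and skip the task body.
            return self.result_storage[key]
        self.executions += 1
        result = fn(**parameters)
        # Persist the result so later runs can find it under the same key.
        self.result_storage[key] = result
        return result


def double(x):
    return x * 2


shared_store = {}  # survives across "flow runs", like remote storage would
first_run = ToyTaskRunner(shared_store)
second_run = ToyTaskRunner(shared_store)

first_run.run(double, x=21)   # executes the task and persists the result
second_run.run(double, x=21)  # same key, so the task body is skipped
```

In this sketch `first_run.executions` ends at 1 and `second_run.executions` stays at 0, because the second runner finds the result in the shared store.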
r
so our setup is ECS Fargate with your cloud, and we would like to persist or cache some tasks across days (a new instance is launched every day). What is the best way to do that?
k
so you're saying for example, you have a deployment that is scheduled to run every x number of hours
for a given task you want it to be skipped if it completed successfully at 9am when it runs again at 9pm
r
yes, something like that - because at the moment the cache key does not work. It says: Path /root/.prefect/storage/a0e933971f834ebe8158a38ecd67a3c7 does not exist. 09:04:04 PM prefect.flow_runs ERROR Finished in state Failed('Flow run encountered an exception. ValueError: Path /root/.prefect/storage/a0e933971f834ebe8158a38ecd67a3c7 does not exist.') - I think it is because the instance is turned off and on
so my guess is I have to cache into remote S3
k
yeah, you need to use a cache key fn and set your result storage location to s3
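A rough sketch of why the local path goes missing, again as a toy model and not Prefect's actual code. It assumes the backend remembers which cache key maps to which result path, while each new instance starts with an empty local disk. `key_index`, `run_cached`, and `simulate` are hypothetical names, and temporary directories stand in for the instance disk and for durable storage such as S3.

```python
import tempfile
from pathlib import Path

# Stand-in for the orchestration backend: remembers cache key -> result path.
key_index = {}


def run_cached(key, compute, storage_dir):
    """Return the cached result for key, or compute and persist it."""
    if key in key_index:
        path = Path(key_index[key])
        if not path.exists():
            # The key is known, but the file it points at is gone.
            raise ValueError(f"Path {path} does not exist.")
        return path.read_text()
    result = compute()
    path = Path(storage_dir) / key
    path.write_text(result)
    key_index[key] = str(path)
    return result


def simulate(shared):
    """Run the same cached task on two fresh instances, one per 'day'."""
    key_index.clear()
    outcomes = []
    with tempfile.TemporaryDirectory() as durable:
        for _ in range(2):
            # Each instance gets its own local disk, wiped when it stops.
            with tempfile.TemporaryDirectory() as local:
                storage = durable if shared else local
                try:
                    outcomes.append(run_cached("k1", lambda: "done", storage))
                except ValueError:
                    outcomes.append("error")
    return outcomes


local_only = simulate(shared=False)
with_shared = simulate(shared=True)
```

With instance-local storage the second day fails because the remembered path no longer exists; with durable shared storage the second day reads the persisted result.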
r
I cannot find documentation on setting the cache storage to S3; or do you mean that just setting result_storage is the same thing?
k
yes, setting result_storage is the same thing
r
ok, makes sense, so it is, like we said, a bit confusing
all good I will try then
thank you so much for the detailed discussion
k
no problem!