
Ravi

03/06/2023, 1:52 PM
Hi all, I'm slightly confused after looking this up in the documentation (I have asked this question before, but had no luck). Is there any roadmap/intention for Prefect 2 to allow manual naming of cache files, similar to the "target" argument in older Prefect versions? The current solution of using hashed file names doesn't seem that useful when it comes to archiving data.

Christopher Boyd

03/06/2023, 2:05 PM
Hi Ravi, hashes aren't generally meant for archiving data; that is for you to decide in your code. The results are meant for state-based decisions on whether or not a task should be re-run, based on the hash of the cache as computed by the input function.
The idea is that if your inputs and your code don't change for a function you are caching, then the results don't change, and you don't need to re-compute them.
If your code produces some computed result, like a model, a file, or a table, it is generally up to you to decide how you want to save it.
👍 1

Ravi

03/06/2023, 2:07 PM
so if I wanted to store results for any other reason, then I would have to manually make processes/flows with the intent of storing that data?
although that makes some sense, it does feel like that can be a bit wasteful in terms of storage

Christopher Boyd

03/06/2023, 2:09 PM
yes
if you wanted to download a file for some purpose, we can hash the input and code to decide whether or not to re-download the file, but what you want to do with it is up to you
and where it goes
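Editor's note: a minimal sketch of "what you do with it and where it goes is up to you", assuming you want a readable, archive-friendly path alongside whatever the cache does. It uses only the standard library; save_to_target and the path layout are hypothetical and not part of Prefect.

```python
import json
import tempfile
from pathlib import Path


def save_to_target(data, root, target):
    """Write data as JSON to root/target, a path chosen by the caller."""
    path = Path(root) / target
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(data, indent=2))
    return path


# Example: archive a result under a name that describes the data itself.
with tempfile.TemporaryDirectory() as root:
    saved = save_to_target({"rows": 42}, root, "sales/2023-03-06.json")
    restored = json.loads(saved.read_text())
```

Inside a task, a call like this would sit next to the task's return value, so the archive copy does not depend on the hashed result file or on the Prefect database.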

Deceivious

03/06/2023, 3:16 PM
If you have the task-run-id, you can always use the API to get the filename generated from the hash, @Ravi.

Ravi

03/06/2023, 3:19 PM
Thanks, I am somewhat aware of this from looking at retrieval of results from flow runs, so I was under the impression that the same would apply to tasks. I think that the change in file naming convention from Prefect 1's target to Prefect 2's hashes is just a bit of a big jump from what users are used to.
I think the main issue I have is "if my Prefect DB gets corrupted, how can I retrieve previous flow run data?", because from the sounds of it, everything would have to re-run from scratch.

Deceivious

03/06/2023, 4:07 PM
It seems that when calling a task, there's a return_state param or something similar, which also has the storage id name.
Unsure about the data corruption. If you are hosting the server, you can use a Postgres database, and backups are on your end. Unsure about Prefect Cloud as well, I haven't read the "fine print" myself 😄
👍 1

Brad

03/11/2023, 10:05 PM
Hey team - I'm also wondering about this. I'm aware I can use Prefect to retrieve the paths, but this limits the use for downstream consumers of the data. I'd like to generate data and persist it to storage (S3, etc.) in a more structured form. Is there any reason we couldn't have the target concept back in an optional way?
(I'd also be willing to contribute the change)
And just to be more concrete: I'd like to be able to set the key variable in PersistedResult.create that then gets passed through to the storage_block