Prefect Community #ask-community

Ravi

03/06/2023, 1:52 PM

Hi all, slightly confused when looking this up in the documentation (I have asked this question before, but had no luck). Is there any roadmap/intention for Prefect 2 to allow manual naming of cache files, similar to the "target" argument in older Prefect versions? The current solution of using hashed file names doesn't seem that useful when it comes to archiving data.

Christopher Boyd

03/06/2023, 2:05 PM

Hi Ravi, hashes aren’t generally meant for archiving data; how you archive is generally for you to decide in code. The results are meant for state-based decisions on whether or not a task should be re-run, based on the cache hash computed from the task's inputs

Christopher Boyd

03/06/2023, 2:06 PM

The idea is that if your inputs and your code don’t change on a function you are caching, then the results don’t change, and you don’t need to re-compute it
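To make the idea above concrete, here is a minimal sketch of input-and-code based caching using only the standard library. It is not Prefect's implementation: `cache_key`, `run_cached`, and the in-memory `_results` store are hypothetical names, and the function's bytecode is used as a rough stand-in for "the code".

```python
import hashlib
import json

_results = {}  # hypothetical in-memory stand-in for a result store
calls = []     # records real executions so cache hits are visible


def cache_key(fn, *args, **kwargs):
    """Hash the function's bytecode together with its inputs."""
    h = hashlib.sha256()
    h.update(fn.__code__.co_code)  # changes when the function body changes
    inputs = json.dumps({"args": args, "kwargs": kwargs}, sort_keys=True, default=repr)
    h.update(inputs.encode())
    return h.hexdigest()


def run_cached(fn, *args, **kwargs):
    """Re-run fn only when its key (code + inputs) has not been seen before."""
    key = cache_key(fn, *args, **kwargs)
    if key not in _results:
        _results[key] = fn(*args, **kwargs)
    return _results[key]


def square(x):
    calls.append(x)  # side effect used to count real executions
    return x * x
```

Calling `run_cached(square, 4)` twice executes `square` once; a different input (or a changed function body) produces a different key and triggers a fresh run. The key is opaque by design, which is why it is a poor fit as an archive file name.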

Christopher Boyd

03/06/2023, 2:06 PM

If you have some computed result in your code like a model or file, or writing a table, that would generally be for you to decide how you want to save it
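One way to read this advice is that the function itself writes its output to a location you choose, independent of any cache hash. The sketch below assumes a simple dataset/date directory layout; `archive_path`, `save_report`, and the layout are illustrative choices, not anything Prefect prescribes.

```python
from datetime import date
from pathlib import Path


def archive_path(root, dataset, run_date, suffix=".csv"):
    """Build a human-readable path such as <root>/orders/2023/03/06/orders.csv."""
    return (
        Path(root)
        / dataset
        / f"{run_date.year:04d}"
        / f"{run_date.month:02d}"
        / f"{run_date.day:02d}"
        / f"{dataset}{suffix}"
    )


def save_report(rows, root, dataset, run_date):
    """Write rows as CSV lines to the structured archive location and return the path."""
    path = archive_path(root, dataset, run_date)
    path.parent.mkdir(parents=True, exist_ok=True)
    lines = [",".join(str(value) for value in row) for row in rows]
    path.write_text("\n".join(lines) + "\n")
    return path
```

Because the path depends only on names you control, downstream consumers can find the file without asking the orchestrator for it.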

👍 1

Ravi

03/06/2023, 2:07 PM

so if I wanted to store results for any other reason, then I would have to manually make processes/flows with the intent of storing that data?

Ravi

03/06/2023, 2:08 PM

although that makes some sense, it does feel like that can be a bit wasteful in terms of storage

Christopher Boyd

03/06/2023, 2:09 PM

yes

Christopher Boyd

03/06/2023, 2:10 PM

if you wanted to download a file for some purpose, we can hash the input and code to decide whether or not to re-download the file, but what you want to do with it is up to you

Christopher Boyd

03/06/2023, 2:10 PM

and where it goes

Deceivious

03/06/2023, 3:16 PM

If you have the task-run-id, you can always use the API to get the filename generated from the hash, @Ravi.

Ravi

03/06/2023, 3:19 PM

Thanks, I am somewhat aware of this from looking at retrieval of results from flow runs, so was under the understanding that the same would apply to tasks. I think that the naming convention of file names between Prefect 1's target and Prefect 2's hashes is just a bit of a big jump in what users are used to

Ravi

03/06/2023, 3:54 PM

I think the main issue I have is "if my Prefect db gets corrupted, how can I retrieve previous flow run data?", because from the sounds of it, everything would have to re-run from scratch

Deceivious

03/06/2023, 4:07 PM

It seems that when calling a task, there's a return_state param or something similar, which also has the storage id name.

Deceivious

03/06/2023, 4:08 PM

Unsure about the data corruption. If you are hosting the server, you can use a Postgres database, and backups are on your end. Unsure about Prefect Cloud as well, haven't read the fine print myself 😄

👍 1

Brad

03/11/2023, 10:05 PM

Hey team - I'm also wondering about this - I'm aware I can use Prefect to retrieve the paths, but this limits the use for downstream consumers of the data. I'd like to generate data and persist to storage (S3 etc) in a more structured form. Is there any reason we couldn't have the target concept back in an optional way?

Brad

03/11/2023, 10:06 PM

(I'd also be willing to contribute the change)

Brad

03/11/2023, 10:29 PM

And just to be more concrete; I'd like to be able to set the key variable in PersistedResult.create that then gets passed through to the storage_block
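The proposal above could look roughly like the following. This is a hypothetical sketch, not Prefect's actual code: `InMemoryStorage` and `persist_result` are invented for illustration, and the assumption that the storage object exposes a `write_path(path, content)` method, with a random key as the default, is labeled as such in the comments.

```python
import uuid


class InMemoryStorage:
    """Hypothetical stand-in for a storage block; assumes a write_path(path, content) method."""

    def __init__(self):
        self.files = {}

    def write_path(self, path, content):
        self.files[path] = content


def persist_result(storage_block, content, key=None):
    """Write content under a caller-chosen key, or a random one if none is given.

    The optional `key` argument is the proposed change; the random
    fallback (an assumption here) keeps the existing behaviour as the default.
    """
    if key is None:
        key = uuid.uuid4().hex  # opaque 32-character name
    storage_block.write_path(key, content)
    return key
```

With an explicit key such as `"reports/2023-03-06/orders.json"`, downstream consumers could locate the data by convention, while callers that pass no key would see no change.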
