r
Hi, what is the best practice for persisting with a custom file format? I want both the benefits of persistence and a usable file. I'm trying to write Parquet files, but Prefect wants to write an object that contains the serialized data wrapped in JSON with metadata about the serializer. My current solution is to subclass `LocalFileSystem`, deserialize the data, convert it to Parquet, and write it to a file with additional metadata about the serializer. Is this the best practice, or is there a different solution?
j
This is just my personal opinion, not an official pronouncement or anything, but I think it's best to think of the "persist" option on a task as something that is for Prefect itself, not for you. If your flow gets retried, Prefect can reuse the persisted task result instead of rerunning that particular task. If you need the data persisted for any other reason, persist it yourself at the time, in the place, and in the format that is convenient for you. Whether to turn on the task's "persist" flag and how or where to store data for other purposes are really two separate decisions, unless serialized, wrapped JSON happens to be a handy format for your purposes.
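To make the "two separate decisions" idea concrete, here is a minimal sketch, not an official Prefect pattern. It assumes `pyarrow` is installed for the Parquet step; the names `write_parquet` and `transform_and_export`, the `value` field, and the filtering step are all hypothetical placeholders for your own logic. The function writes the usable file itself and returns only the file path.

```python
from pathlib import Path
from typing import Callable


def write_parquet(rows: list[dict], path: Path) -> None:
    """Write a list of row dicts to a Parquet file (assumes pyarrow is installed)."""
    # Imported lazily so the rest of the sketch does not require pyarrow.
    import pyarrow as pa
    import pyarrow.parquet as pq

    table = pa.Table.from_pylist(rows)
    pq.write_table(table, str(path))


def transform_and_export(
    rows: list[dict],
    out_path: Path,
    writer: Callable[[list[dict], Path], None] = write_parquet,
) -> str:
    """Do the work, write the file in the format you want, return its location."""
    # Hypothetical transformation: drop rows with a missing "value" field.
    cleaned = [row for row in rows if row.get("value") is not None]

    # Your own persistence: you choose the time, place, and format.
    out_path.parent.mkdir(parents=True, exist_ok=True)
    writer(cleaned, out_path)

    # Return only the path, so anything that stores the return value
    # stores a short string rather than the data itself.
    return str(out_path)
```

If you decorate `transform_and_export` with Prefect's `@task(persist_result=True)`, what Prefect persists for its own retry bookkeeping is just the returned path, while the Parquet file stays a plain file you can open with any tool. One caveat of this design: a reused persisted result only tells you where the file was written, so the file must still exist at that path when a later task reads it.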