Alex Furrier

05/04/2021, 8:14 PM
Is there a recommended file format for flat file storage created from a DataFrame? The data may have mixed data types, including arrays stored as values. In the past I've had trouble with complex data types using the feather format, and I've sometimes run into errors with HDF as well.

Mariia Kerimova

05/04/2021, 8:27 PM
Hello Alex! Prefect is data format agnostic, and you can use any data types. I've personally never used feather, but let's see if the community has something to share.

Kevin Kho

05/04/2021, 8:34 PM
Hey @Alex Furrier, Mariia is right that Prefect is data format agnostic. Mixed types are generally hard to deal with, and I think you will probably run into errors with feather because Apache Arrow is strongly typed. If you want to force this, you can pickle that column into a binary blob and then use feather. When you load the file back into pandas and unpickle the column, I think it will work.

Dharhas Pothina

05/05/2021, 5:26 PM
For tabular data Parquet is a very good option.
👍 1