Is there a recommended file format for flat file storage created from a DataFrame? The data may have mixed data types including arrays stored as a value. In the past I've had trouble with complex data types using feather format and sometimes ran into errors with HDF as well.
Mariia Kerimova
05/04/2021, 8:27 PM
Hello Alex! Prefect is data format agnostic, and you can use any data types. I have personally never used feather, but let's see if the community has something to share.
Kevin Kho
05/04/2021, 8:34 PM
Hey @Alex Furrier, Mariia is right that Prefect is data format agnostic. Mixed types are generally hard to deal with, and I think you will probably run into errors with feather because Apache Arrow is strongly typed. If you want to force this, you can turn that column into a binary blob (for example, by pickling each value) and then use feather. When you load the file back into pandas and unpickle that column, I think it will work.
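A minimal sketch of the binary blob idea above, assuming pandas and pyarrow are installed. The helper names (`pack_values`, `unpack_values`, `roundtrip_via_feather`) and the column and path arguments are hypothetical, made up for illustration; this is one possible way to do it, not a Prefect or Arrow recommendation.

```python
import pickle


def pack_values(values):
    """Pickle each value into a bytes blob so the column has a single type."""
    return [pickle.dumps(v) for v in values]


def unpack_values(blobs):
    """Reverse pack_values: unpickle each bytes blob back into its object."""
    return [pickle.loads(b) for b in blobs]


def roundtrip_via_feather(df, column, path):
    """Write df to feather with `column` stored as pickled bytes, then read it back.

    Assumes df has the default RangeIndex and string column names.
    """
    import pandas as pd  # imported here so the helpers above stay stdlib-only

    out = df.copy()
    out[column] = pack_values(out[column])  # mixed objects -> bytes
    out.to_feather(path)                    # requires pyarrow

    loaded = pd.read_feather(path)
    loaded[column] = unpack_values(loaded[column])  # bytes -> original objects
    return loaded
```

For example, `roundtrip_via_feather(df, "payload", "data.feather")` would store a hypothetical mixed-type `payload` column as bytes and restore it on load. Only unpickle files you trust, since `pickle.loads` can execute arbitrary code, and note that the pickled column is opaque to other Arrow readers.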