iobruno
04/04/2023, 7:10 PM
For `upload_from_dataframe`, we went with `.parquet.snappy` and `.parquet.gz` for the compressed parquet files.
However, it came to my attention that tools like Tad and others that visualize tabular data EXPECT the file name to end with `.parquet` instead.
(For example, it works if I rename `file.parquet.snappy` to `file.snappy.parquet`, or `file.parquet.gz` to `file.gz.parquet`.)
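A minimal sketch of the rename described above, in case it helps to see it as code. The helper name `normalize_parquet_name` and the `COMPRESSION_SUFFIXES` set are hypothetical and only for illustration; they are not part of any library.

```python
from pathlib import PurePosixPath

# Hypothetical set of compression suffixes, taken from the two cases in this thread.
COMPRESSION_SUFFIXES = {".snappy", ".gz"}


def normalize_parquet_name(name: str) -> str:
    """Move a trailing compression suffix in front of '.parquet'.

    'file.parquet.snappy' becomes 'file.snappy.parquet'; names that do not
    end in '.parquet.<compression>' are returned unchanged.
    """
    path = PurePosixPath(name)
    suffixes = path.suffixes
    if (
        len(suffixes) >= 2
        and suffixes[-2] == ".parquet"
        and suffixes[-1] in COMPRESSION_SUFFIXES
    ):
        compression = suffixes[-1]
        # Strip '.parquet<compression>' from the end of the file name.
        stem = path.name[: -len(".parquet" + compression)]
        return str(path.with_name(f"{stem}{compression}.parquet"))
    return name


print(normalize_parquet_name("file.parquet.snappy"))
print(normalize_parquet_name("data/out/file.parquet.gz"))
```

Names that already end in `.parquet` pass through untouched, so the helper is safe to apply twice.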
I also noticed that Spark and Flink actually save compressed parquet files as `.snappy.parquet` or `.gz.parquet` instead.
iobruno
04/04/2023, 7:11 PM
`DataFrameSerializationFormat` and fixing the pytests to expect `.gz.parquet` or `.snappy.parquet` instead.
iobruno
04/04/2023, 7:12 PM
alex
04/04/2023, 7:17 PM
iobruno
04/04/2023, 7:19 PM
iobruno
04/04/2023, 8:06 PM
iobruno
04/04/2023, 8:28 PM
Zanie