iobruno
04/04/2023, 7:10 PM
upload_from_dataframe, we went with .parquet.snappy and .parquet.gz for the compressed parquet files.
However, comma, ergo, vis-a-vis, it came to my attention that when you're using tools like Tad and others to visualize tabular data, they EXPECT the file name to end with .parquet.
(For example, it works if I rename file.parquet.snappy to file.snappy.parquet, or file.parquet.gz to file.gz.parquet.)
I also noticed that Spark and Flink actually save compressed parquet files as .snappy.parquet or .gz.parquet.
iobruno
04/04/2023, 7:11 PM
DataFrameSerializationFormat and fixing the pytests to expect .gz.parquet or .snappy.parquet instead.
iobruno
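The naming convention discussed above (compression suffix first, .parquet last) could be sketched roughly as follows. This is only an illustration under stated assumptions: the names COMPRESSION_SUFFIXES, parquet_name, and fix_legacy_name are hypothetical and are not taken from DataFrameSerializationFormat or from any library.

```python
from pathlib import PurePosixPath
from typing import Optional

# Hypothetical mapping from a compression codec to the suffix that is
# placed BEFORE ".parquet" (the convention mentioned for Spark and Flink).
COMPRESSION_SUFFIXES = {None: "", "snappy": ".snappy", "gzip": ".gz"}


def parquet_name(stem: str, compression: Optional[str] = None) -> str:
    """Build a file name that always ends in ".parquet"."""
    return f"{stem}{COMPRESSION_SUFFIXES[compression]}.parquet"


def fix_legacy_name(name: str) -> str:
    """Rewrite "x.parquet.snappy" or "x.parquet.gz" so it ends in ".parquet".

    Names that do not match the legacy pattern are returned unchanged.
    """
    suffixes = PurePosixPath(name).suffixes
    if (
        len(suffixes) >= 2
        and suffixes[-2] == ".parquet"
        and suffixes[-1] in (".snappy", ".gz")
    ):
        # Drop the trailing ".parquet.<codec>" and re-append it in swapped order.
        stem = name[: -len(suffixes[-2] + suffixes[-1])]
        return f"{stem}{suffixes[-1]}.parquet"
    return name
```

With this sketch, parquet_name("file", "snappy") gives a name ending in .snappy.parquet, and fix_legacy_name swaps the last two suffixes of an old-style name while leaving already-correct names alone.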
04/04/2023, 7:12 PM
alex
04/04/2023, 7:17 PM
iobruno
04/04/2023, 7:19 PM
iobruno
04/04/2023, 8:06 PM
iobruno
04/04/2023, 8:28 PM
Zanie