Charles Liu
04/13/2021, 3:52 PM
Kevin Kho
Charles Liu
04/13/2021, 4:06 PM
Charles Liu
04/13/2021, 4:06 PM
Charles Liu
04/13/2021, 4:07 PM
Kevin Kho
Kevin Kho
Charles Liu
04/13/2021, 4:17 PM
Kevin Kho
Charles Liu
04/13/2021, 4:19 PM
Charles Liu
04/13/2021, 4:19 PM
Kevin Kho
dask.distributed read_csv, combined with s3fs. To load/save on top of S3
Charles Liu
04/13/2021, 4:23 PM
Kevin Kho
Kevin Kho
Kevin Kho
Kevin Kho
df.compute() converts a Dask DataFrame to Pandas
Charles Liu
04/13/2021, 4:26 PM
Charles Liu
04/13/2021, 4:26 PM
Kevin Kho
Charles Liu
04/13/2021, 4:28 PM
Kevin Kho
Kevin Kho
emre
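04/13/2021, 4:35 PM
from prefect.engine.results import LocalResult
from prefect.engine.serializers import PandasSerializer

pandasresult = LocalResult(
    dir="./qparse_results",
    location="{flow_id}_{task_full_name}_{flow_run_name}.csv",
    serializer=PandasSerializer(file_type="csv"),
)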
So it turns out result objects can specify custom serializers, and a pandas -> CSV serializer is supported out of the box.
Although I didn't use it with an S3Result, it has been a real treat while developing.
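For context, a pandas -> CSV serializer of this kind comes down to a bytes round trip through `DataFrame.to_csv` and `pandas.read_csv`. The sketch below is not Prefect's implementation: `CsvFrameSerializer` and the sample frame are hypothetical names used only for illustration, and it assumes pandas is installed.

```python
import io

import pandas as pd


class CsvFrameSerializer:
    """Hypothetical stand-in for a pandas -> CSV serializer (not Prefect's class)."""

    def serialize(self, value: pd.DataFrame) -> bytes:
        # Write the frame to an in-memory text buffer, then encode it,
        # since result backends generally store raw bytes.
        buffer = io.StringIO()
        value.to_csv(buffer, index=False)
        return buffer.getvalue().encode("utf-8")

    def deserialize(self, value: bytes) -> pd.DataFrame:
        # Parse the stored bytes back into a DataFrame.
        return pd.read_csv(io.BytesIO(value))


# Illustrative data only.
serializer = CsvFrameSerializer()
frame = pd.DataFrame({"query": ["a", "b"], "hits": [3, 5]})
payload = serializer.serialize(frame)
restored = serializer.deserialize(payload)
```

Note that a CSV round trip drops the index here and re-infers dtypes on read, so date columns come back as strings unless they are parsed explicitly.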
https://github.com/PrefectHQ/prefect/blob/e3b43402ac5e43c7ad4297ef36ef800360c7391b/src/prefect/engine/serializers.py#L153
Kevin Kho
emre
04/13/2021, 4:36 PM
Charles Liu
04/13/2021, 4:41 PM
Charles Liu
04/13/2021, 4:42 PM