Jan Therhaag

09/04/2019, 3:07 PM
Hi, I have a question about using Dask collections in Prefect flows. Basically, what I'm trying to accomplish is:
- read a bunch of parquet files from disk repeatedly (say, once a day)
- combine them into a Dask dataframe and apply several transformations
- write the dataframe out to a Kartothek dataset (basically also a collection of parquet files with metadata, in case you don't know the package)