Thread
#prefect-community
    Chris L.

    1 year ago
    Hi Prefect community! Does anybody have any examples of
    big_future = client.scatter(big_dask_dataframe)
    and passing
    future = client.submit(func, big_future)
    as an output from one task to be used as an input in another task? I found the same UserWarning at the bottom of the "prefect-etl" article in the Dask docs (https://examples.dask.org/applications/prefect-etl.html). Has anybody else encountered this issue, and is there a known solution? Thank you in advance!
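A minimal sketch of the scatter/submit pattern being asked about, run outside Prefect. It assumes a local in-process cluster and uses a small pandas DataFrame as a stand-in for the big Dask DataFrame (scattering a list would scatter each element separately, so a single object is used here):

```python
import pandas as pd
from dask.distributed import Client

# Local in-process cluster for illustration (no separate worker processes).
client = Client(processes=False)

# Stand-in for a large DataFrame read from S3.
big_data = pd.DataFrame({"x": range(1000)})

# scatter() ships the data to a worker once and hands back a Future.
big_future = client.scatter(big_data)

def func(df):
    # Dask resolves big_future to the scattered data on the worker,
    # so big_data is not re-serialised for every submit call.
    return len(df)

future = client.submit(func, big_future)
result = future.result()
print(result)
client.close()
```

The open question in the thread is whether such a Future survives being returned from one Prefect task and consumed by another, since Prefect serialises task results between tasks.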
    Michael Adkins

    1 year ago
    I'm not sure passing futures around like this will work; I'd be curious to hear how it goes.
    Generally we require a
    with worker_client
    block to avoid deadlocks https://distributed.dask.org/en/latest/task-launch.html#connection-with-context-manager
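The worker_client pattern Michael links to looks roughly like this sketch: a task that needs to submit further work opens a worker_client block, which lets the worker thread step aside while it waits on subtasks, avoiding deadlocks. The recursive fib function here is an illustrative stand-in, run on a local in-process cluster:

```python
from dask.distributed import Client, worker_client

def fib(n):
    if n < 2:
        return n
    # Inside a running task, worker_client() yields a client connected to
    # the same scheduler; by default it secedes from the worker's thread
    # pool while waiting, so blocked parents don't starve child tasks.
    with worker_client() as client:
        a = client.submit(fib, n - 1)
        b = client.submit(fib, n - 2)
        return a.result() + b.result()

client = Client(processes=False)
result = client.submit(fib, 6).result()
print(result)
client.close()
```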
    Chris L.

    1 year ago
    Thanks for the response Michael. I am working on an ETL pipeline with Dask Dataframes. I wanted to persist the DataFrame after reading a parquet file from S3, scatter the DataFrame into memory across the workers, then pass the future to downstream ETL tasks.
    But futures (to my knowledge) aren't serialisable? So I'll stick with calling dask.compute once inside the very last ETL task in the pipeline.