Jean-Paul Berthelot

    2 months ago
    Hi all, I would like to understand the performance tuning options for RayTaskRunner versus DaskTaskRunner. I have a requirement to develop a Prefect workflow for high-volume ingestion. The architecture I am looking at is based on Data Mesh. I have a local environment and have worked through several POCs similar to the workflow described here. Could I please be directed to relevant reading materials or examples?
    Kevin Kho

    2 months ago
    I don’t think we have any resources on making this decision. What normally happens is that people on Dask just choose the DaskTaskRunner and people on Ray just choose the RayTaskRunner (not many at the moment). You would then tune whichever engine you are using; the configuration is passed through when the task runner is initialized. From what I’ve seen, there isn’t much material comparing Dask and Ray, though.
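To illustrate passing engine configuration at initialization, here is a minimal sketch. It assumes Prefect 2.x with the prefect-dask and prefect-ray collections; the helper name make_task_runner and the numeric values are hypothetical and only for illustration. cluster_kwargs is forwarded to the Dask cluster and init_kwargs to ray.init.

```python
# Hypothetical tuning values; adjust them to the cluster actually in use.
DASK_SETTINGS = {"cluster_kwargs": {"n_workers": 4, "threads_per_worker": 2}}
RAY_SETTINGS = {"init_kwargs": {"num_cpus": 4}}


def make_task_runner(engine: str):
    """Return a configured task runner for 'dask' or 'ray' (illustrative helper)."""
    # Imports are deferred so this works when only one engine is installed.
    if engine == "dask":
        from prefect_dask import DaskTaskRunner  # requires prefect-dask

        return DaskTaskRunner(**DASK_SETTINGS)
    if engine == "ray":
        from prefect_ray import RayTaskRunner  # requires prefect-ray

        return RayTaskRunner(**RAY_SETTINGS)
    raise ValueError(f"unknown engine: {engine!r}")
```

The returned object would then be handed to the flow decorator, for example @flow(task_runner=make_task_runner("dask")).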
    Jean-Paul Berthelot

    2 months ago
    Thanks Kevin, evidently it comes down to the chosen technology stack, Dask or Ray. My question was more about benchmarking; however, that can be addressed through testing.
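For the testing route, a rough sketch of a timing harness using only the standard library. The names time_runs, compare, and placeholder_workload are hypothetical; in practice each candidate would be a zero-argument callable that runs the same ingestion flow under one task runner.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_runs(fn: Callable[[], object], repeats: int = 3) -> List[float]:
    """Call fn `repeats` times and return the wall-clock duration of each call."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def compare(candidates: Dict[str, Callable[[], object]], repeats: int = 3) -> Dict[str, float]:
    """Return the median duration in seconds for each named candidate."""
    return {
        name: statistics.median(time_runs(fn, repeats))
        for name, fn in candidates.items()
    }


def placeholder_workload() -> int:
    """Stand-in for a real flow run; replace with a call to the flow under test."""
    return sum(i * i for i in range(100_000))
```

Something like compare({"dask": run_with_dask, "ray": run_with_ray}) would then give one median per engine, where run_with_dask and run_with_ray are callables you would supply. The median is used so a single slow warm-up run (cluster start-up, for instance) does not dominate the result.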