Jean-Paul Berthelot

06/26/2022, 1:04 PM
Hi all, I would like to understand performance tuning options for
, I have a requirement to develop a Prefect workflow for high-volume ingestion. The architecture I am looking at is based on Data Mesh. I have a local environment and have worked through several POCs similar to the workflow described here. Could I please be directed to relevant reading materials or examples?

Kevin Kho

06/26/2022, 9:11 PM
I don’t think we have any resources on making this decision. What normally happens is that people already on Dask just choose the DaskTaskRunner, and people on Ray just choose the RayTaskRunner (there aren’t many of those at the moment). From there you would tune the engine you are using; its configuration is passed through when the task runner is initialized. From what I’ve seen, there isn’t much material comparing Dask and Ray, though.
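
As a rough illustration of passing engine configuration through at initialization, here is a minimal sketch. It assumes Prefect 2 with the `prefect-dask` collection installed; `dask_runner_settings`, `build_ingest_flow`, `load_batch`, and `ingest` are hypothetical names, and the worker and thread counts are placeholders rather than recommendations. `cluster_kwargs` is forwarded to the Dask cluster that the runner creates.

```python
def dask_runner_settings(n_workers: int, threads_per_worker: int) -> dict:
    """Keyword arguments for DaskTaskRunner; the values are illustrative only."""
    return {
        "cluster_kwargs": {
            "n_workers": n_workers,
            "threads_per_worker": threads_per_worker,
        }
    }


def build_ingest_flow(settings: dict):
    # Imported lazily so the settings helper above works without Prefect installed.
    from prefect import flow, task
    from prefect_dask import DaskTaskRunner

    @task
    def load_batch(batch_id: int) -> int:
        return batch_id  # placeholder for the real ingestion work

    # The engine configuration is handed to the task runner when it is created.
    @flow(task_runner=DaskTaskRunner(**settings))
    def ingest(batch_count: int = 8):
        for batch_id in range(batch_count):
            load_batch.submit(batch_id)  # each call runs on the Dask cluster

    return ingest


if __name__ == "__main__":
    ingest = build_ingest_flow(dask_runner_settings(n_workers=4, threads_per_worker=2))
    ingest()
```

The Ray equivalent would presumably swap in `RayTaskRunner` from `prefect-ray`, which takes `init_kwargs` that are passed on to `ray.init`.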

Jean-Paul Berthelot

06/27/2022, 12:43 PM
Thanks Kevin. Evidently the choice comes down to the technology stack already in use, Dask or Ray. My question was more about benchmarking; however, that can be addressed through testing.
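
One way such testing might look is a small timing harness that runs the same workload at several worker counts and compares wall-clock times. This is only a sketch: `benchmark` and `stand_in_workload` are hypothetical names, and a standard-library thread pool stands in for Dask or Ray so the example runs without a cluster. In a real test, `run` would invoke the flow with a task runner configured for the given worker count.

```python
import statistics
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable


def benchmark(
    run: Callable[[int], None],
    worker_counts: Iterable[int],
    repeats: int = 3,
) -> Dict[int, float]:
    """Return the median wall-clock seconds of run(workers) per worker count."""
    timings = {}
    for workers in worker_counts:
        samples = []
        for _ in range(repeats):
            start = time.perf_counter()
            run(workers)
            samples.append(time.perf_counter() - start)
        timings[workers] = statistics.median(samples)
    return timings


def stand_in_workload(workers: int, batches: int = 16) -> None:
    # Stand-in for the ingestion flow: each "batch" just sleeps briefly.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda _: time.sleep(0.01), range(batches)))


if __name__ == "__main__":
    for workers, seconds in benchmark(stand_in_workload, [1, 4, 8]).items():
        print(f"{workers} workers: {seconds:.3f}s")
```

Taking the median of a few repeats reduces noise from one-off effects such as cluster start-up.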