Hi,
First of all, thanks for the wonderful package!
I have created a basic scikit-learn pipeline flow, and I am using the DaskExecutor as the executor.
I wonder whether the scikit-learn algorithms are really using Dask to run in a parallelized, distributed manner, or whether only the wrapper tasks are distributed while the actual ML work inside them runs locally.
Is it the same as running dask-ml algorithms?
That will distribute work over Dask, but specifically in the sense of the "compute-bound" portion of dask-ml, not the "memory-bound" portion, where you train a model on a Dask DataFrame that is too big for one machine.
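As a rough illustration of what "compute-bound" distribution looks like (this is a standalone sketch using joblib's Dask backend, not the internals of the executor; it assumes `dask[distributed]` and scikit-learn are installed):

```python
# Minimal sketch: distributing scikit-learn's compute-bound work over Dask
# via joblib. The dataset still fits in memory on every worker; only the
# CPU work (fitting individual trees) is farmed out.
import joblib
from dask.distributed import Client
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=1_000, n_features=20, random_state=0)

client = Client(processes=False)  # small in-process cluster, just for the demo
clf = RandomForestClassifier(n_estimators=50, n_jobs=-1)

# Importing dask.distributed registers the "dask" joblib backend, so the
# parallel loops inside fit() are shipped to Dask workers.
with joblib.parallel_backend("dask"):
    clf.fit(X, y)

client.close()
```

By contrast, the memory-bound case means the training data itself is a Dask DataFrame/Array partitioned across machines, which requires dask-ml's own estimators rather than plain scikit-learn.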