Hello. My flow contains a dask-ml prediction model...
# ask-community
j
Hello. My flow contains a dask-ml prediction model such as logistic regression or xgboost. Is it possible to also run the flow with the Dask executor? It failed for both algorithms. Or should I just use a normal sklearn model inside the flow? Without specifying executor=dask_executor, the flow works fine.
c
Hi @Jie Lou interesting question -> would you mind providing some sort of traceback for how the task failed?
j
Hi Chris. Thanks for replying. It did not show much info. It just said that some tasks failed.
It seems that calling client = Client() inside the task function makes it fail.
c
Ah gotcha; tasks run on dask workers so you should use a worker client instead
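For reference, a minimal sketch of the worker-client pattern in plain Dask, outside of Prefect. The names `double`, `score_batch`, and `run_demo` are made up for illustration, and the threads-only local cluster is just an assumption to keep the example small; a real flow would run against whatever cluster the executor points to.

```python
from dask.distributed import Client, worker_client


def double(x):
    # Stand-in for per-item work such as a model prediction (hypothetical).
    return 2 * x


def score_batch(values):
    # This function runs on a Dask worker. Rather than calling Client() here,
    # borrow a client for the scheduler the worker is already connected to.
    with worker_client() as client:
        futures = client.map(double, values)
        return client.gather(futures)


def run_demo():
    # Threads-only local cluster, used here purely for illustration.
    with Client(processes=False, n_workers=1, threads_per_worker=2,
                dashboard_address=None) as client:
        # Submit the outer task; it fans out more work from inside the worker.
        return client.submit(score_batch, [1, 2, 3]).result()
```

Calling `run_demo()` submits `score_batch` as a task, and that task submits its own sub-tasks through the worker's client instead of opening a new connection.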
j
Thanks! I'll try.
c
Let me know how it goes! I definitely want prefect to support ML pipelines with ease
j
Chris, the issue is that I am using an edited version of dask-xgboost that requires calling Client() to set the host address etc. I'm not sure whether it can use the workers' client or not.
c
Oh interesting, I’ll have to look into that; in the meantime you can try using a LocalExecutor
j
cool. Thanks for your time😊
c
Anytime!
j
Some updates: other algorithms under dask-ml, like logistic regression and random forest, work well with Dask executors, and thankfully the run time is shorter. I think the problem only exists when I have to use that special version of dask-xgboost (to solve the address issue), which has to call Client(). That seems unlikely to be a typical problem for Prefect users; it's more on the dask-xgboost side.
c
Ah that’s good news! Thank you very much for the update @Jie Lou
@Marvin archive “Issue running dask-ml models in Prefect”
t
I'm looking into using Prefect as a single language/DSL to describe both an Airflow-like ML pipeline and an sklearn-like featurization pipeline. Has any work been done (or is there an example) on describing a Dask-ML pipeline in a Flow? Thank you!
c
Hi @Taleb Zeghmi - I know for sure that people have used Prefect for similar use cases, but unfortunately no examples have been made public yet; if you happen to get something up and running it’d be awesome if you shared it!
t
I worry that extra work is needed to make them work together. Could you provide an example, given that this is a main use case and pattern for ML? Thanks!
c
I personally won’t have time to put together such an example anytime soon unfortunately
Hi @Taleb Zeghmi - this is probably not exactly what you’re looking for, but here’s an example of an executable ETL workflow for “image processing” that runs on Dask + binder: https://examples.dask.org/applications/prefect-etl.html
There’s a binder link at the top of the page if you want to run it live