# ask-community
j
Hi Prefect community, for anyone attending this week's Dask Summit online conference, I'll be giving one of the keynote presentations tomorrow morning at 9am ET along with my colleague @Jie Lou on our use of Prefect & Dask for machine learning in healthcare. Prefect (and Dask) have become key parts of our technology stack and have delivered great results for our business. Paid tickets are required to attend live tomorrow, but I think the talks will get posted to the Dask YouTube channel for free access after the conference: https://summit.dask.org/schedule/presentation/62/dask-prefect-for-healthcare-machine-learning-on-aws/
upvote 1
👍 3
k
Could you also post in #CL09KTZPX so it does not get drowned out by the support stuff?
d
Hey, Joe! 👋
👋 1
e
Will this be uploaded somewhere? I missed it by a hair's breadth 😢
k
All the Dask Summit talks will be uploaded to their YouTube channel in a couple of weeks. If you're registered for the Dask Summit, they're pretty good about posting the talks at the end of each day.
e
Hello, did this ever get posted to YouTube? Please share a link if it did! Thanks!
k
As far as I know it hasn't been yet, but they will in the next month or two (they said two months after the event).
e
Ahh! I thought I had read 2 weeks!
k
Did you sign up for it though? Cuz they have the Zoom recordings on their platform if you did
e
@Kevin Kho offhand, do you know of any repos or code examples using Prefect/Dask with ML? I'm curious what people's best practices are regarding Prefect flows and sklearn pipelines.
I didn't. I wish I would have, but I didn't know about it :(
k
Are you using Pandas or Dask DataFrames? I guess Pandas cuz sklearn?
e
Well, that's the interesting part to me: it could be either. I've been looking at the dask_ml docs and have a few flows I've made using Dask as the backend, pointing to a Dask cluster instead of using joblib. I know there's some integration of sklearn with Dask DataFrames, so really examples of either would be awesome.
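For reference, a minimal sketch of that kind of setup, assuming Prefect 1.x flow syntax, a toy sklearn model, and a hypothetical Dask scheduler address (`tcp://dask-scheduler:8786`); sklearn's parallelism is routed to the cluster via the joblib Dask backend:

```python
import joblib
from dask.distributed import Client
from prefect import Flow, task
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier


@task
def train_model():
    # Connect to an existing Dask cluster (hypothetical scheduler address)
    Client("tcp://dask-scheduler:8786")

    # Toy dataset and model purely for illustration
    X, y = make_classification(n_samples=10_000, n_features=20)
    model = RandomForestClassifier(n_estimators=200, n_jobs=-1)

    # Route sklearn's joblib-based parallelism to the Dask cluster
    with joblib.parallel_backend("dask"):
        model.fit(X, y)
    return model


with Flow("sklearn-on-dask") as flow:
    train_model()

# flow.run()  # or register the flow with a project for scheduled runs
```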
k
I only have this
🙌 1
e
Awesome, that's definitely helpful! That's what I was wondering: whether most people are doing something along the lines of what you shared, which is essentially rolling your own pipeline with a Prefect flow.
k
Yes, but it's more common to persist the data between tasks and load it in downstream tasks rather than pass DataFrames around.
e
When you say persist the data between tasks, you mean something like writing to a parquet file somewhere and passing the location of that file instead of passing a df, correct?
k
Yes exactly!
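A minimal sketch of that pattern, assuming Prefect 1.x flow syntax and pandas with a Parquet engine (pyarrow or fastparquet) installed; the local file paths are placeholders, in practice they would often be S3 or similar URIs:

```python
import pandas as pd
from prefect import Flow, task


@task
def extract() -> str:
    df = pd.DataFrame({"record_id": [1, 2, 3], "score": [0.2, 0.5, 0.9]})
    path = "/tmp/scores.parquet"  # placeholder; often an S3/GCS URI in practice
    df.to_parquet(path)
    # Return the location, not the DataFrame itself
    return path


@task
def featurize(path: str) -> str:
    # Downstream task reloads the data from the persisted location
    df = pd.read_parquet(path)
    df["score_scaled"] = df["score"] / df["score"].max()
    out_path = "/tmp/scores_features.parquet"
    df.to_parquet(out_path)
    return out_path


with Flow("persist-between-tasks") as flow:
    featurize(extract())
```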
e
You are awesome, Kevin! Thanks for the help!
🙏 1