Luis Muniz

06/26/2020, 7:17 PM
Hi, another newb question: what is a reasonable usage of map()? Is using map() to handle a list of thousands (or millions?) of elements one by one reasonable? Should I micro-batch it into chunks? Does Prefect start to choke if you have many, many tasks in a flow? What about the Dashboard when you examine such a flow after it has run?
👀 1

nicholas

06/26/2020, 7:19 PM
Hi @Luis Muniz - not a newb question at all. .map is great for handling lists in the thousands, but I think as you go beyond that (or even at that), batching becomes really valuable. As you scale up, you'll want to take a look at parallelization and depth-first execution on Dask; you'll see really improved performance with the latter in particular.
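
For readers wondering what the batching mentioned here could look like, below is a minimal sketch in plain Python. It is not taken from this thread or from the Prefect docs: the helper names chunked and process_batch and the batch size of 500 are hypothetical, chosen only for illustration. The idea is to turn 10,000 per-element units of work into 20 per-batch units, so that anything mapped over the list sees 20 inputs instead of 10,000.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], size: int) -> Iterator[List[T]]:
    """Yield consecutive slices of `items`, each with at most `size` elements."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


def process_batch(batch: List[int]) -> int:
    """Hypothetical per-batch work: handle a whole chunk in a single call."""
    return sum(x * x for x in batch)


items = list(range(10_000))
batches = list(chunked(items, 500))  # 20 batches of 500 elements each

# Stand-in for a mapped task: in a flow, process_batch would presumably be
# a task mapped over `batches` (an assumption, not shown here).
results = [process_batch(batch) for batch in batches]
total = sum(results)
```

The batch size is a trade-off: larger batches mean fewer mapped runs to schedule and display, while smaller batches give finer-grained retries and parallelism.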

Luis Muniz

06/26/2020, 7:21 PM
OK, thanks, I really appreciate it. The tutorial example scraping movie scripts seemed to indicate that this is a viable pattern, but somehow my "unicorns don't exist" radar was flashing pink

nicholas

06/26/2020, 7:22 PM
To my knowledge so far, we've had users with mapped tasks in the hundreds of thousands without issue; if you find you're having trouble you can definitely report that to us and we'll be happy to help 🙂

Luis Muniz

06/26/2020, 7:22 PM
I have to say that these advanced techniques you mention are unknown to me; I would appreciate some pointers that would help me learn about them

nicholas

06/26/2020, 7:22 PM
Definitely, one sec and let me find some links (meant to include those in the initial message, mb)
👍 1

Luis Muniz

06/26/2020, 7:24 PM
Thanks a million. What a great community!
🚀 4