Chris Hart

07/31/2019, 11:09 PM
This might be a silly question, but are there any alternatives to Dask for parallelization that could just do “native” multiprocessing/async? (I’ve discovered that some of our chosen third-party libraries don’t work on Dask, and giving them up isn’t worth the tradeoff.)

Chris White

07/31/2019, 11:52 PM
Not a silly question at all; I believe Ray might be an alternative to Dask for parallelization. We might also consider implementing a multiprocessing executor of some kind if this gets traction. One of the biggest technical challenges in implementing parallelization is task scheduling within the Flow, i.e., after you fire off a bunch of parallel tasks, determining when it is appropriate to fire off the next downstream task.
🤔 1
👍 1
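
For illustration, the scheduling problem described above (when to fire off a downstream task once its parallel upstream tasks are done) can be sketched with the standard library's `concurrent.futures`. This is a minimal sketch, not how Prefect schedules tasks: `run_flow`, the `tasks`/`upstream` dictionaries, and the three toy functions are all hypothetical names made up for the example. It uses a thread pool so it runs anywhere; a `ProcessPoolExecutor` could presumably be passed in instead, assuming every task and result is picklable.

```python
from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def run_flow(tasks, upstream, executor):
    """Run tasks in dependency order on the given executor.

    tasks:    dict of task name -> callable; the callable receives the
              results of its upstream tasks as positional arguments.
    upstream: dict of task name -> list of upstream task names.
    (Both structures are hypothetical, chosen only for this sketch.)
    """
    results = {}           # task name -> finished result
    pending = {}           # future -> task name, for tasks currently running
    remaining = set(tasks)  # tasks not yet submitted

    def submit_ready():
        # Submit every task whose upstream results are all available.
        for name in sorted(remaining):
            deps = upstream.get(name, [])
            if all(dep in results for dep in deps):
                args = [results[dep] for dep in deps]
                pending[executor.submit(tasks[name], *args)] = name
                remaining.discard(name)

    submit_ready()
    while pending:
        # Block until at least one running task finishes.
        done, _ = wait(pending, return_when=FIRST_COMPLETED)
        for future in done:
            name = pending.pop(future)
            results[name] = future.result()  # re-raises any task exception
        # A finished task may have unblocked downstream tasks.
        submit_ready()

    if remaining:
        raise ValueError(f"unsatisfiable dependencies: {sorted(remaining)}")
    return results


def extract_a():
    return 1


def extract_b():
    return 2


def combine(a, b):
    return a + b


tasks = {"a": extract_a, "b": extract_b, "sum": combine}
upstream = {"sum": ["a", "b"]}

with ThreadPoolExecutor(max_workers=2) as pool:
    results = run_flow(tasks, upstream, pool)

print(results["sum"])
```

Here `a` and `b` are submitted immediately and can run in parallel, while `sum` is only submitted once both of their results are in. The sketch does not handle retries, mapping, or failed upstream tasks beyond re-raising the exception.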

Chris Hart

08/01/2019, 12:03 AM
Sweet, will have a look at Ray. If you’re curious, our final blocker to switching to Dask, after stripping out the official Elasticsearch client (which doesn’t play nicely with pickle) and validating that everything is pickle-able, ended up being this: https://github.com/profusion/sgqlc/issues/58
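
As a rough illustration of the "validating that everything is pickle-able" step, a check along these lines can flag objects that the standard `pickle` module cannot serialize. This is only a sketch under assumptions: `find_unpicklable` and the `candidates` dictionary are hypothetical, and the `threading.Lock` merely stands in for a client object that holds OS-level resources; it is not the Elasticsearch or sgqlc object from the thread.

```python
import pickle
import threading


def find_unpicklable(objects):
    """Return {name: error message} for each object pickle cannot serialize."""
    failures = {}
    for name, obj in objects.items():
        try:
            pickle.dumps(obj)
        except Exception as exc:  # pickle raises several exception types
            failures[name] = f"{type(exc).__name__}: {exc}"
    return failures


# Hypothetical stand-ins for the objects a task would send to a worker.
candidates = {
    "plain_config": {"index": "logs", "size": 10},
    "lock": threading.Lock(),  # locks cannot be pickled
}

failures = find_unpicklable(candidates)
for name, error in failures.items():
    print(f"{name}: {error}")
```

Running a check like this over task inputs and outputs before moving to a distributed executor can surface the offending object earlier than a failed remote run would.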

Chris White

08/01/2019, 12:11 AM
Definitely interested in collecting this info, thank you very much for sharing!