Hello Prefect Community, I started working on a d...
# show-and-tell
a
Hello Prefect Community, I started working on a distributed task queue library a few months back. The library is available as a python package to install a start using : daskqueue - pypi package For all its greatness, Dask implements a central scheduler (basically a simple tornado event loop) involved in every decision, which can sometimes create a central bottleneck. This is a pretty serious limitation when trying to use Dask in high-throughput situations. Daskqueue is a small python library built on top of Dask and Dask Distributed that implements a very lightweight Distributed Task Queue. Daskqueue also implements persistent queues for holding tasks on disk and surviving Dask cluster restart. This could be used as a special Executor to enqueue work. I have experiment the limits of the
.map()
method where mapping on a large Iterable will bring the Dask cluster to a halt. I also wrote an article about implementation details: https://medium.com/@aminedirhoussi1/daskqueue-dask-based-distributed-task-queue-6fb95517dfea Hope you enjoy it, can't wait to hear about your feedback 😉 !
gratitude thank you 5
🎉 4
👀 5
👍 4
marvin duck 5
🙌 6
👏 6
a
thanks for sharing!
👍 1