Chen Di

11/20/2021, 3:30 PM
Hi, I'm new to Prefect, but I've been using Airflow for a while. We ran into an issue with Airflow, so I wonder whether Prefect could handle it better. We tried to run 40,000 tasks at the same time (very simple tasks, just print commands: 1,000 DAGs, 10 DAG runs per DAG, 4 tasks per DAG run) and found that Airflow took a very long time to finish them (about 1 hour). Our server resources are good enough, so it looks like the scheduler is the bottleneck. Would Prefect be better than Airflow at handling many tasks at the same time? What's the maximum number of tasks you have seen running at the same time, and how long did Prefect take to finish them? Thanks

Anna Geller

11/20/2021, 4:33 PM
Hi @Chen Di! I can definitely understand your frustration with Airflow. Prefect users often run thousands of parallel tasks using a DaskExecutor and mapping. It would be hard to give any benchmarks since it all depends on many variables - the type of task you’re running, your infrastructure, etc. Instead, I would recommend trying it yourself and then deciding whether it works for your use case. The easiest way to get started would be to sign up for the free Standard tier on Prefect Cloud and spin up an example flow using a LocalDaskExecutor - this will allow you to run things in parallel without having to spin up a Dask cluster. You can follow this tutorial, and then to test parallelism you can use this example flow.