
Lior Barak

07/15/2023, 3:19 PM
Hi all! I'm running multiple tasks inside a subflow, using the default DaskTaskRunner, and I keep hitting a wall. Not all tasks start at once: 12 tasks start and then the rest are stuck until a few seconds later. Looking at the Dask dashboard, I have 4 workers with 3 n_threads each. How can I edit these default numbers? (All tasks are just waiting on HTTP calls, so I don't mind handling a lot of threads.)

Emil Christensen

07/17/2023, 4:21 PM
@Lior Barak you can change those in the DaskTaskRunner constructor (docs):

from prefect_dask import DaskTaskRunner

# Use 4 worker processes, each with 2 threads
DaskTaskRunner(
    cluster_kwargs={"n_workers": 4, "threads_per_worker": 2}
)
Also, if you're just making HTTP calls, you don't have to use Dask; you could use the default ConcurrentTaskRunner, which executes submitted tasks using Python's native concurrency. It should be lighter weight if you're only IO-bound.
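For context on "Python's native concurrency": ConcurrentTaskRunner runs submitted tasks on something roughly equivalent to a thread pool, which suits IO-bound HTTP work well. A plain-stdlib sketch of why threads are enough here (fake_http_call is an illustrative stand-in, not a Prefect API):

```python
import time
from concurrent.futures import ThreadPoolExecutor

def fake_http_call(i: int) -> int:
    time.sleep(0.1)  # stand-in for network round-trip latency
    return i * 2

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=20) as pool:
    # all 20 "requests" run on threads and overlap while sleeping
    results = list(pool.map(fake_http_call, range(20)))
elapsed = time.perf_counter() - start
# total time is close to one call's latency, not twenty calls' worth
```

Because the work is waiting on IO rather than computing, the GIL is released during the sleep/network wait, so 20 threads finish in roughly the time of one call.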

Lior Barak

07/18/2023, 7:44 AM
Amazing, thanks! When I used ConcurrentTaskRunner it took quite a lot of time to start the new tasks, so starting 20 tasks (HTTP requests) from a single flow took a while (maybe 4 seconds). That's why I'm looking into Dask.

Emil Christensen

07/18/2023, 2:52 PM
Hmm, that's odd… it should be very quick, quicker than Dask. Were you using .submit?

Lior Barak

07/27/2023, 9:57 AM
Ah I see, I wasn't using submit properly. Is there an upper limit for ConcurrentTaskRunner? Let's say I want to run 1000 flows with 100 tasks each in parallel (on Server, not Cloud). Can I just make Prefect run on a strong machine?
(90% of the tasks are async POST calls to external REST servers)
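Not Prefect-specific, but for scale questions like this the underlying pattern is worth seeing: a single machine can keep thousands of IO-bound POST calls in flight, with a semaphore capping how many are outstanding at once. A plain-asyncio sketch (fake_post and the numbers are illustrative, not measured limits):

```python
import asyncio

async def fake_post(i: int, sem: asyncio.Semaphore) -> int:
    async with sem:              # cap in-flight requests
        await asyncio.sleep(0.01)  # stand-in for the HTTP round trip
        return i

async def main(n: int = 100, limit: int = 25) -> list[int]:
    sem = asyncio.Semaphore(limit)
    # gather preserves submission order regardless of completion order
    return await asyncio.gather(*(fake_post(i, sem) for i in range(n)))

results = asyncio.run(main())
```

The practical ceiling is usually not the task runner but file descriptors, memory per task, and what the external REST servers will tolerate, which is why capping concurrency matters more than raw machine size.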