
Kelvin DeCosta

01/25/2023, 7:49 AM
Hey everyone! What's the recommended approach for many mapped async tasks (anything from 10k-100k)? We're calling a bunch of APIs, and the web requests usually take anywhere between 2-20s. I've tried creating batches of these tasks and then calling `.map` for all the tasks in a batch. While this works, it feels very hacky and isn't ideal for performance. Any help is appreciated.

Zanie

01/25/2023, 4:26 PM
Hey! I’m working on some improvements in this area. What kind of problems are you running into? What kind of HTTP client are you using?

Kelvin DeCosta

01/26/2023, 2:32 PM
We're using a wrapper around the `requests` module. At the moment, the main issue is the time taken for each task: sometimes a task finishes within 10-20s, other times it takes 2-5 minutes.

Zanie

01/26/2023, 4:14 PM
If you’re writing async tasks, you should use an async client like `httpx`.
Although it depends what your wrapper looks like, I’m guessing you’re sending requests in threads?
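A minimal sketch of the pattern Zanie is pointing at: send many requests concurrently from async code, with a semaphore to cap how many are in flight at once. The `fetch` coroutine and the `api.example.com` URLs are stand-ins (in real code this would be an `httpx.AsyncClient` call inside a Prefect task); only the concurrency pattern is the point here.

```python
import asyncio


async def fetch(url: str, sem: asyncio.Semaphore) -> str:
    # Stand-in for an async HTTP call (e.g. httpx.AsyncClient.get).
    # The semaphore caps how many requests are in flight at once.
    async with sem:
        await asyncio.sleep(0)  # simulates non-blocking network IO
        return f"response for {url}"


async def main() -> list:
    # Limit concurrency to something the target APIs can tolerate.
    sem = asyncio.Semaphore(100)
    urls = [f"https://api.example.com/item/{i}" for i in range(1000)]
    # gather runs all coroutines on one event loop and preserves order.
    return await asyncio.gather(*(fetch(u, sem) for u in urls))


results = asyncio.run(main())
```

Because nothing here blocks the event loop, a thousand slow requests overlap instead of running back to back.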

Kelvin DeCosta

01/27/2023, 6:11 AM
I don't think we're using threads, but I'm not sure. The wrapper essentially calls `requests.request`.

Zanie

01/27/2023, 3:44 PM
Oh, that’ll be bad for the event loop — it’ll scale very poorly that way.
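To illustrate why a synchronous `requests.request` call inside an async task scales poorly: a blocking call stalls the whole event loop, so "concurrent" tasks actually run one after another. A quick self-contained demo (using `time.sleep` as a stand-in for the blocking HTTP call) compares calling blocking code directly with offloading it via `asyncio.to_thread`:

```python
import asyncio
import time


def blocking_fetch() -> str:
    # Stand-in for requests.request: synchronous, blocks the calling thread.
    time.sleep(0.1)
    return "done"


async def blocking_in_loop() -> list:
    # Calling blocking code directly in a coroutine stalls the event loop:
    # the three calls run back to back (~0.3s total).
    return [blocking_fetch() for _ in range(3)]


async def offloaded() -> list:
    # asyncio.to_thread runs each blocking call on a worker thread,
    # so the three calls overlap (~0.1s total).
    return await asyncio.gather(
        *(asyncio.to_thread(blocking_fetch) for _ in range(3))
    )


start = time.perf_counter()
asyncio.run(blocking_in_loop())
serial = time.perf_counter() - start

start = time.perf_counter()
asyncio.run(offloaded())
overlapped = time.perf_counter() - start
```

A true async client like `httpx.AsyncClient` avoids even the worker threads by doing the IO natively on the event loop.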

Kelvin DeCosta

02/01/2023, 7:41 AM
Thanks for the heads up @Zanie From what I can tell, our `requests` wrapper previously performed maybe 5-20 times faster with `concurrent.futures.ThreadPoolExecutor` than it does now via Prefect tasks. I'm considering migrating from `requests` to `httpx`. I'd really appreciate it if you could give a rough estimate of the expected performance increase.

Zanie

02/01/2023, 3:55 PM
I can’t give an estimate, but performing synchronous IO in an asynchronous context is very bad for async performance.
Running each request in a Prefect task will always be slower, though, because each task needs to be orchestrated, which takes a few API calls of its own, so you’re increasing the total number of requests.

Kelvin DeCosta

02/02/2023, 5:26 AM
Thank you for the heads up!