Does anyone have experience using Prefect for lots (millions) of API requests? I initially created one mapped task for each request, but this caused issues with Prefect Server performance. I’ve batched the requests into ~5000 per child task, but now I can only retry entire batches, not single requests. I’m on the verge of tracking the state of each request outside of Prefect, but wonder if anyone has run into / overcome similar issues?
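For reference, one way to keep batching while still retrying individual requests is to do the per-request retry inside the batch and hand back only the requests that never succeeded. This is a minimal, library-free sketch, not Prefect's API: `chunked`, `run_batch`, and the `send` callable are hypothetical names, and it assumes each request is hashable (for example a URL string) and that `send` raises an exception on failure.

```python
import time
from typing import Callable, Dict, Hashable, Iterable, Iterator, List, Tuple


def chunked(items: Iterable, size: int) -> Iterator[List]:
    """Yield successive lists of at most `size` items."""
    batch: List = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def run_batch(
    batch: List[Hashable],
    send: Callable[[Hashable], object],
    max_attempts: int = 3,
    delay_seconds: float = 0.0,
) -> Tuple[Dict[Hashable, object], List[Hashable]]:
    """Call `send` on every request, retrying only the ones that failed.

    Returns (results keyed by request, requests that failed every attempt).
    """
    results: Dict[Hashable, object] = {}
    pending = list(batch)
    for attempt in range(max_attempts):
        still_failing = []
        for request in pending:
            try:
                results[request] = send(request)
            except Exception:
                # Assumption: any exception means "retry this request later".
                still_failing.append(request)
        pending = still_failing
        if not pending:
            break
        if attempt < max_attempts - 1:
            time.sleep(delay_seconds)
    return results, pending
```

Each child task would then call `run_batch` on one chunk and only needs a task-level retry (or a follow-up batch) when the returned failure list is non-empty, so a single bad request no longer forces all ~5000 to be re-sent.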
06/25/2020, 1:33 PM
Hi @Howard Cornwell,
Looks like you’re in a situation where scale is important. Prefect Core can definitely handle that many mapped task runs. As you’ve identified, your orchestration service is the bottleneck. You have a couple of options:
1) You can try to horizontally scale Prefect Server.
This may be difficult. You’ll need additional Apollo/GraphQL instances and you’ll probably want to connect to a different Postgres instance (possibly managed, something like CloudSQL). Doing so will require changing the Docker Compose file. If you’re comfortable with all of that, give it a shot and let us know how it goes!
2) You can migrate to Prefect Cloud
Prefect Cloud has mechanisms that will help a workflow of that size scale nicely (caching, batch writing logs/states, etc.). Thanks to the hybrid model, your data will never leave your infrastructure. You can try Prefect Cloud for free!
06/25/2020, 2:00 PM
Hey, so the issues tend to come when tasks get cancelled or unexpectedly fail. The mapped tasks (hundreds of thousands of them) all attempt to generate at once. When the tasks run normally, the server handles this fine (creating 1 every ~2 to 5 seconds), but when the failure is unexpected, it starts dumping the child tasks that never ran, at maybe ~5 per second, until it’s done. For a few smaller tasks this is fine, but when there are 100,000, that’s over 5 hours of the server spinning for nothing.
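The estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch using the rough figures quoted in the message (100,000 unrun child tasks, cleared at about 5 per second); `drain_hours` is a hypothetical helper name.

```python
def drain_hours(task_count: int, tasks_per_second: float) -> float:
    """Hours needed to clear `task_count` tasks at a steady rate."""
    return task_count / tasks_per_second / 3600


# Rough figures from the message: 100,000 tasks at ~5 per second.
hours = drain_hours(100_000, 5)
print(f"{hours:.1f} hours")  # about 5.6 hours
```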
06/25/2020, 7:09 PM
This sounds like something we could improve, actually