Prefect Community

I've been trying to figure out how to run a flow consisting of ~3000 parallel flow runs using ACI as a backend. I tried a basic Prefect server in a VM, then switched to the Helm chart, then added a postgresql-ha chart. All of these setups work up to ~500 active flow runs and then crash with various server errors (too many active connections, timeouts, etc.). In my case the server, the ACI workers, and the k8s cluster all have autoscaling enabled and seem to scale to 10-20 pods each during runs.

I was wondering whether Prefect is designed to handle such use cases at all, and whether anyone has tried this with a self-hosted server. I have no experience setting up infrastructure like this, so any tips, even the obvious ones, would be welcome.

I also experienced issues with:
1. ACI instances not getting terminated after the expected time (the flow is just a dummy sleep)
2. Concurrency slots not being freed when instances stop
3. UI showing some subflows/tasks as running/pending even after the parent flow crashes

These may all be symptoms of PostgreSQL connection issues.
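
To sanity-check the connection hypothesis, here is a rough back-of-the-envelope sketch. It assumes each server pod holds one SQLAlchemy connection pool with the library defaults (`pool_size=5`, `max_overflow=10`) and that PostgreSQL runs with its default `max_connections=100` and `superuser_reserved_connections=3`. I have not verified what the Prefect server or the postgresql-ha chart actually configure, and the function names are made up for illustration, so treat the numbers as placeholders.

```python
# Rough connection-budget estimate. All defaults below are assumptions:
# SQLAlchemy QueuePool defaults and stock PostgreSQL settings, not values
# read from an actual Prefect or postgresql-ha deployment.

def peak_connections(server_pods: int, pool_size: int = 5, max_overflow: int = 10) -> int:
    """Worst-case connections if every pod fills one pool, overflow included."""
    return server_pods * (pool_size + max_overflow)


def fits(server_pods: int, max_connections: int = 100, reserved: int = 3) -> bool:
    """True if the worst case stays within the slots open to ordinary clients."""
    return peak_connections(server_pods) <= max_connections - reserved


if __name__ == "__main__":
    # Pod counts roughly matching the autoscaling range described above.
    for pods in (1, 5, 10, 20):
        print(f"{pods:>2} pods -> up to {peak_connections(pods)} connections, fits: {fits(pods)}")
```

If these assumptions are anywhere near right, autoscaling the server to 10-20 pods could ask for more connections than a default PostgreSQL instance accepts, which would be consistent with the "too many active connections" errors. A connection pooler in front of the database, or a higher `max_connections`, would be the usual things to look at.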