Dear Prefect people, can you please help me figure out how to make Prefect (Cloud?) work for my use case? 🧑‍💻
We want to process thousands of satellite images 🛰️🌍 on a Kubernetes cluster, scaling to ~100 TB of RAM and thousands of jobs running in parallel. We have already done this successfully with our own little event-based workflow manager (REST APIs, Postgres, Docker, Python K8s API), but it lacks many of the bells and whistles that Prefect has and chokes a bit on all the messages, and we would much rather join the awesome community you have here! ❤️ Using your nice docs and the open source code, we have already run a single processing job on our K8s cluster via a Prefect flow 👍. Now we want to scale this up.
Intuitively, I would have one flow process a single image (using 3-8 K8s tasks, ~10 min per task). But we want to scale to 1000s of images being processed in parallel. Seeing that the Prefect Cloud Teams plan only allows for 2️⃣(!) concurrent flows, I am starting to doubt whether Prefect is built for this number of concurrent flows and all the logs and messages they generate.
❓ Can you please let me know whether Prefect is meant to scale to tens of thousands of tasks running in parallel? And if so, can I run 1000 flows concurrently, or should I rather have one flow branch out into 1000 tasks, each with its own subtasks (if that is possible)?
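To make the two layouts I am asking about concrete, here is a rough back-of-the-envelope sketch in plain Python (not Prefect code). The function and key names are made up for illustration; the numbers are the ones from this post.

```python
def orchestration_load(n_images: int, tasks_per_image: int) -> dict:
    """Count flow runs and task runs for the two layouts (names are illustrative)."""
    total_task_runs = n_images * tasks_per_image
    return {
        # Layout A: one flow run per image, each with its own tasks
        "flow_per_image": {"flow_runs": n_images, "task_runs": total_task_runs},
        # Layout B: a single flow run that fans out over all images
        "single_fan_out_flow": {"flow_runs": 1, "task_runs": total_task_runs},
    }


# Figures from this post: 1000 images, 3 to 8 tasks per image
for tasks_per_image in (3, 8):
    print(tasks_per_image, orchestration_load(1000, tasks_per_image))
```

Either way the total number of task runs is the same; the layouts only differ in how many flow runs the orchestrator has to track at once, which is where the concurrent-flow limit would bite.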
Please note that this is a burst workload that comes up once a month or so, with lower daily loads (10s of images) in between. I am happy to work with Prefect Core, but Cloud might also be very nice if we can afford it.