Sam Brownlow, 03/18/2022, 7:51 AM
Does the above degradation occur because the Server is deployed to a single node via docker-compose with a single agent, instead of being deployed via something like Helm, which horizontally scales scheduled flows across a cluster of agents? How quickly does the size of the PostgreSQL database generally grow, relative to the number of flows run? Is there any reason that regularly deleting old runs would be any more complicated than suggested here? Are there any case studies of Prefect Server being used to run tens of thousands of concurrent flows? I see that there used to be a Nomad agent; are there any helpful resources for running Prefect on a Nomad cluster? Thanks for any advice you are able to share. I have been diving into Prefect for only the past couple of days, so I greatly appreciate any pointers.
Anna Geller, 03/18/2022, 1:22 PM
"Prefect would be the central pipe through which the state of the data flows (but not the data itself). It would hand off all computation to microservices via (a)sync requests and its only concern would be controlling how the state is pushed from service to service."

Prefect is a dataflow automation tool, so it makes sense to leverage its benefits, e.g. with respect to passing data between tasks. If you instead write your actual pipelines in some separate (micro)service, you don't take advantage of Prefect's visibility, and Prefect then merely serves as a job scheduler rather than a workflow orchestrator and dataflow automation tool. This is a valid use case for Prefect, but it doesn't really take advantage of Prefect's capabilities. Still, we have many users who use Prefect e.g. to orchestrate AWS Batch jobs: a Prefect task triggers a batch job and polls for the execution state of that job. When it finishes successfully, you see that the task and flow run were successful. At the same time, you don't really have any information about what is happening within that job; again, a totally valid use case if you prefer this option.
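The trigger-and-poll pattern described above can be sketched roughly as follows. Everything here is illustrative rather than Prefect's or AWS's API: the fake job function stands in for a real remote call (e.g. a boto3 AWS Batch `describe_jobs` request), and the state names merely mimic Batch's.

```python
import time


def make_fake_job(states):
    """Hypothetical stand-in for a remote job API; in a real Prefect task
    this would be e.g. a boto3 AWS Batch describe_jobs call."""
    it = iter(states)

    def describe():
        # Each call returns the job's next reported state.
        return next(it)

    return describe


def poll_job(describe, terminal=("SUCCEEDED", "FAILED"), interval=0.0):
    """Poll the job's state until it reaches a terminal one, then return it.
    A Prefect task wrapping this would map the result to a task state,
    e.g. raising on FAILED so the flow run is marked Failed."""
    while True:
        status = describe()
        if status in terminal:
            return status
        time.sleep(interval)


print(poll_job(make_fake_job(["SUBMITTED", "RUNNING", "SUCCEEDED"])))  # SUCCEEDED
```

As Anna notes, the orchestrator only sees the terminal state this returns, not what happened inside the job itself.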
"Does the above degradation occur because the Server is deployed to a single node via docker-compose with a single agent, instead of being deployed via something like Helm, which horizontally scales scheduled flows across a cluster of agents?"

Scaling Server is hard, to be honest. For Prefect Cloud, we have an entire infrastructure team managing the underlying compute and ensuring that all services scale and run reliably at all times. If you want to do it yourself, the database is the only stateful component, and it's the one you need to make sure scales properly. How quickly your storage grows depends on many factors (number of flows, flow runs, logs, ...), but if you pick a Postgres-compatible cloud database like AWS Aurora or GCP Cloud Spanner, it should allow you to grow as you need to. I don't know enough here to judge whether simply having three Apollo containers instead of one can immediately be used to scale the service. I would suggest starting by scaling vertically, i.e. assigning more vCPU and RAM to your Server components when needed, before trying to scale horizontally and introducing load balancing (which is not trivial IMO).
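On the related question of regularly deleting old runs: against a self-hosted Server this can go through the GraphQL API. Below is a minimal sketch of building such a request; the `delete_flow_run` mutation name matches Prefect Server 1.x, but verify it against your Server's schema (e.g. in the Interactive API tab), and the endpoint URL mentioned in the docstring is an assumption about your deployment.

```python
import json

# Assumed Prefect Server 1.x mutation for removing one flow run;
# check your GraphQL schema before using this for real.
DELETE_MUTATION = """
mutation($id: UUID!) {
  delete_flow_run(input: {flow_run_id: $id}) {
    success
  }
}
"""


def build_delete_request(flow_run_id):
    """Build the JSON body to POST to the Server's GraphQL endpoint
    (typically something like http://localhost:4200/graphql)."""
    return json.dumps(
        {"query": DELETE_MUTATION, "variables": {"id": flow_run_id}}
    )


print(build_delete_request("00000000-0000-0000-0000-000000000000"))
```

A cleanup script would first query flow runs older than some cutoff, then POST one of these bodies per run id on a schedule (itself a good candidate for a Prefect flow).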
"horizontally scales scheduled flows across a cluster of agents?"

Scaling agents that way is not necessary if you use e.g. the Kubernetes agent. Each agent (even a local one!) is itself a lightweight process that polls for new scheduled flow runs and deploys them; the Kubernetes agent, for instance, deploys flow runs as separate Kubernetes jobs.
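As a rough CLI sketch (Prefect 1.x command names; the `--api` address is an assumption about where your Server is reachable), starting such agents looks like:

```shell
# Lightweight local agent: polls the API and runs flows as local processes
prefect agent local start --api http://localhost:4200

# Kubernetes agent: each polled flow run is deployed as a separate k8s Job
prefect agent kubernetes start --api http://localhost:4200
```

Because each flow run gets its own job, the cluster scheduler, not the agent, is what spreads concurrent runs across nodes.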
"Are there any case studies with Prefect Server being used to run 10s of thousands of concurrent flows?"

I'm not aware of any such case study; most users who need that scale are using Prefect Cloud.
"I see that there used to be a Nomad Agent, are there any helpful resources for running Prefect on a nomad cluster?"

There was one user in the community, but the conversation about it is gone from Slack and they didn't contribute it to the main repo.
Sam Brownlow, 03/19/2022, 2:10 AM