Cole Murray 05/01/2022, 9:52 PM
• the centralized nature of the Airflow scheduler provides a single point of failure for the system
In a typical master/worker deployment, you have a single-point-of-failure scheduler (yes, you can run it highly available with locking or other mechanisms) responsible for reading from a schedule DB and pushing tasks onto a queue to be processed by workers. Can someone clarify how Prefect Server solves this issue? From the docs, it seems Prefect Server is also a SPOF in this architecture. Based on the code here: https://github.com/PrefectHQ/server/blob/master/src/prefect_server/services/towel/scheduler.py#L21, we would not be able to run several instances of the server simultaneously, as no locking takes place against the DB, which would cause double execution.
• Homogenous dependencies, but may change in the future as we move to more ML-based workflows.
Given this use case, you may use SubprocessFlowRunner with a virtual environment.
Can someone clarify how Prefect server solves this issue?
Sure, we can. In the default Prefect Server configuration, the scheduler service can be seen as a SPOF, but that's not the case when you use Prefect Cloud or when you scale Server to be distributed. Usually, when you need that level of reliability and scale, you could opt for Prefect Cloud. The same is valid for Prefect 2.0. Feel free to ask more questions if my answer hasn't addressed your concerns about SPOF.
Cole Murray 05/01/2022, 11:55 PM
prefect orion kubernetes-manifest
will give you a Kubernetes Deployment manifest specifying everything you need to deploy Orion components to Kubernetes. You may then use Kubernetes-native features to ensure that all services run reliably in a fault-tolerant way. If you want to use Zookeeper, I'm afraid that is out of scope for Prefect 2.0 at the moment, but you can give it a try and perhaps even contribute a recipe or blog post once you figure out a good setup. FWIW, I can reassure you that scaling Orion is much easier than scaling e.g. Prefect 1.0 because, under the hood, Orion is composed of REST API services, which are easier to scale and load balance than e.g. GraphQL. Does this information help? Our focus is currently on building the most critical Prefect 2.0 features so that 2.0 can come out of beta, but we may revisit the distributed setting at a later time. If you are interested, I could open an issue so that you can keep track.