
Jeremy Phelps

01/04/2021, 11:03 PM
Hi all. Am I correct in my understanding that Prefect Cloud delegates all the real work to our own infrastructure, which we are responsible for setting up and maintaining?

Jim Crist-Harif

01/04/2021, 11:08 PM
Yes, this is correct. This is to keep your code and data secure - prefect cloud never has direct access to it. You can read more about our hybrid execution model here: https://www.prefect.io/why-prefect/hybrid-model

Jeremy Phelps

01/04/2021, 11:34 PM
Then what advantage is there to using Prefect Cloud instead of just running everything entirely on our infrastructure? It must be small.

Chris White

01/04/2021, 11:51 PM
Hey Jeremy - it’s beneficial to think of Prefect as two components: a workflow engine that knows how to run your code on various platforms (and knows how to communicate with a Prefect API) and a stateful orchestration API. Prefect Cloud is the stateful orchestration API - it manages global state of all of your workflows, provides a highly available notification layer, and provides many orchestration features based on the workflow signals your workflows send (permissions, concurrency limiting, SLAs, robust failure detection, etc.). If you don’t have any requirements around availability, scale or monitoring, then using Prefect Core alone is perfectly valid

Jeremy Phelps

01/05/2021, 12:01 AM
I would think that the scale comes from the infrastructure we provide to handle the workload. More workers = bigger scale. The orchestration end of it should be a light workload. It's just bookkeeping. A single modest server should be able to orchestrate thousands of worker machines. Does Prefect Core not handle workflow signals, permissions, concurrency limits, failure detection, etc.?

Chris White

01/05/2021, 12:06 AM
What I mean by “scale” is that managing hundreds of workflows without a central backend is chaotic to track and monitor. Also note the API is not a passive consumer - depending on the feature set you use, the API is making decisions for your flows and tasks. Prefect Core does handle signals and failure detection as much as it can given there is no central broker of state - for example, if your workflow process dies, there is nothing that can alert you to that fact when using Core alone.
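To illustrate the point about a central broker of state, here is a minimal sketch of heartbeat-based failure detection. It is not Prefect's implementation: the `HeartbeatMonitor` class, the run IDs, and the 30-second timeout are all hypothetical, chosen only to show why something outside the workflow process has to notice that the process has gone silent.

```python
import time
from typing import Dict, List, Optional


class HeartbeatMonitor:
    """Hypothetical central broker that records the last heartbeat per run."""

    def __init__(self, timeout_seconds: float) -> None:
        self.timeout_seconds = timeout_seconds
        self._last_seen: Dict[str, float] = {}

    def heartbeat(self, run_id: str, now: Optional[float] = None) -> None:
        # A live workflow process calls this periodically.
        self._last_seen[run_id] = time.monotonic() if now is None else now

    def dead_runs(self, now: Optional[float] = None) -> List[str]:
        # A run is presumed dead once it has been silent longer than the timeout.
        current = time.monotonic() if now is None else now
        return sorted(
            run_id
            for run_id, seen in self._last_seen.items()
            if current - seen > self.timeout_seconds
        )


monitor = HeartbeatMonitor(timeout_seconds=30.0)  # assumed timeout
monitor.heartbeat("flow-run-a", now=0.0)
monitor.heartbeat("flow-run-b", now=0.0)
monitor.heartbeat("flow-run-a", now=25.0)  # run b's process died and never reports again
print(monitor.dead_runs(now=40.0))  # ['flow-run-b']
```

A process that has died cannot report its own death, so the check has to live in a separate, long-running service that outlasts the workflow process.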

Jeremy Phelps

01/05/2021, 12:10 AM
So the server that runs when you call `prefect server start` is not a centralized back-end that can handle an arbitrary number of workflows? What are all the differences between that server and the one that Prefect Cloud provides?

Chris White

01/05/2021, 12:25 AM
Ah yes, I glossed over Server - Server is a centralized backend for monitoring and bookkeeping; Server and Cloud are similar in that respect but there are some differences:
• Server does have scaling limits largely dependent on the performance of the database layer (you most likely wouldn’t be able to run 10s of thousands of parallelized tasks w/ Server, as an extreme but still real-world example)
• if you’re multi-cloud or even multi-node, hosting Server will require knowledge of networking
• the infrastructure delta of Server depends on how far you take it but is non-zero - you’ll need to host the API / keep it available / etc., whereas the infrastructure delta of Cloud is close to 0 - typically it’s just a single agent process, as what your flows and tasks are doing / connecting to is already something in your stack
• Server has no concept of user and thus no concept of API permissions
• there is a feature gap between Server and Cloud; it’s best to think of Server as a “state / history bookkeeper” instead of an orchestrator / insurance layer
You should choose whatever combination of tools makes sense for your needs - if all you need is bookkeeping and using a hosted API isn’t in the cards for you, then Server should work just fine

Jeremy Phelps

01/05/2021, 12:32 AM
What I was really hoping to find in Prefect Cloud was a fully managed solution, including the actual workload. A lot of our current problems with Airflow/Google Cloud Composer are due to Kubernetes pods having their filesystems freeze up inexplicably. Since Prefect Cloud leaves this part of the system in our hands, we can expect to still have to manage the Kubernetes cluster, manually delete dead pods, and probably also manually intervene in the Prefect UI, just like we have to do with Airflow. How does the scaling limit in Prefect Server arise? 10s of thousands of tasks should just translate to 10s of thousands of database rows, something Postgres can handle without much difficulty.

Chris White

01/05/2021, 12:43 AM
I see; depending on the nature of the freeze, that situation should be capable of being automated away with Cloud (details highly dependent on your flows). Storage of data != accessing that data; every API request in Server involves interacting with the database, and that is where the limit fundamentally arises from. So, for example, adding logs and updating states, etc. for 10,000 things concurrently is not something every Postgres installation can natively handle. Regardless of that, if you prefer managed solutions to non-managed ones then I don’t see how taking on the hosting of Server is going to help you out.
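A back-of-envelope sketch of the "storage != access" point: the row count stays modest, but the number of database interactions per second can be much larger. Every figure below except the 10,000 concurrent tasks is an assumption made up for illustration (state updates per task, log writes per task, and the time window), not a measurement of Prefect Server.

```python
# Rough estimate of database write pressure from concurrently running tasks.
concurrent_tasks = 10_000        # from the example above
state_updates_per_task = 3       # assumed, e.g. Pending -> Running -> Success
log_writes_per_task = 10         # assumed, one database write per log line
window_seconds = 60              # assumed: all tasks finish within one minute

# Each state update and each log write is one API request that touches the database.
total_writes = concurrent_tasks * (state_updates_per_task + log_writes_per_task)
writes_per_second = total_writes / window_seconds

print(f"total writes: {total_writes}")
print(f"average writes per second: {writes_per_second:.0f}")
```

Under these assumptions the database sees on the order of a couple of thousand writes per second on average, with bursts likely higher, which is a different problem from storing 10,000 rows.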

Jeremy Phelps

01/05/2021, 12:55 AM
The way I see it is this: You charge a lot of money for Cloud, but we still have to manage an expensive chunk of the system. I think Cloud may be charging for the easy, cheap part of the job. If we have to manage the worker cluster (and also set up a server to run the Agent, and provide for the Agent's reliability), perhaps we might as well also take the small additional burden of running the Server ourselves too. The nature of the freeze is that worker processes get stuck in uninterruptible sleep while trying to access the local filesystem within GKE.

Chris White

01/05/2021, 1:30 AM
If your primary concern is running an execution cluster for your code, then I think your problems are DevOps related, rather than workflow problems - perhaps Prefect Cloud is not the product you need at this time. We released Server so that you would have the option of self-hosting a Prefect API, and if that’s preferable to you I absolutely encourage you to do so. However, as you are currently in a Cloud PoC, you have a variety of dedicated resources and support channels available to you, and I would have expected you to leverage those before airing your grievances in our support forum. These are the types of decisions and issues your account manager is for.