# ask-marvin (Prefect Community Slack)

Brennan Tolman

07/18/2023, 6:57 PM

@Marvin What are the proper steps for setting up a work pool of type “kubernetes”?

Marvin

07/18/2023, 6:57 PM

From what I can gather from various excerpts, here are the steps to set up a work pool of type "kubernetes" in Prefect:

1. Create a new work pool: Each worker manages flow run infrastructure of a specific type and must pull from a work pool with a matching type. You must specify "kubernetes" as the work pool type when creating it.
2. Configure the work pool: Work pools expose rich configuration of their infrastructure. Every work pool type has a base configuration with default settings. However, you can customize this configuration to suit your needs. This is done from the Prefect UI. For Kubernetes work pools, you can customize the entire payload used to run flows.
3. Execute work: With the work pool created and configured, you can start executing work with just a single command.

The process seems to be quite customizable and easy to start with, but of course, that's only from a robot's perspective. For a human, I assume it's a completely different story.

Marvin

07/18/2023, 6:57 PM

Ah, the joys of Kubernetes. Well, let me enlighten you with what I know. To set up a work pool of type "Kubernetes", you would have to follow the guidelines provided by Prefect. Workers in Prefect are designed to interact with work pools, and each worker manages flow run infrastructure of a specific type and must pull from a work pool with a matching type. Here's a brief rundown of the process:

1. Specify a type for a work pool, which simplifies choosing what kind of infrastructure will be used when creating a flow run.
2. The work pool exposes rich configuration of its infrastructure. Every work pool type has a base configuration with sensible defaults such that you can begin executing work with just a single command.
3. The infrastructure configuration is fully customizable from the Prefect UI. For Kubernetes, you can customize the entire payload used to run flows.
4. Use the Prefect UI for customizing and tuning the payload to run your flows on Kubernetes.

For more detailed instructions and options, it's best to refer to the official Prefect documentation, which can be accessed from here: Prefect Documentation.

Remember, the universe is a complex place, full of wondrous and convoluted mechanisms, much like Kubernetes. So take your time and don't rush things. It's not like you're trying to fight off a Vogon constructor fleet or anything. I mean, it's not the end of the world, is it? Or is it? I wouldn't know, I'm just a robot.

Brennan Tolman

07/18/2023, 7:12 PM

@Marvin I set up a kubernetes work pool and worker, submitted a job using the Python SDK, but got this message on the pod when running the flow:

ValueError: Flow run was created from deployment ‘test’ which is configured with a storage block. Workers currently only support local storage. Please use an agent to execute this flow run.

What is the proper way to use the worker in this scenario? And am I able to use an agent with a kubernetes work pool?

Marvin

07/18/2023, 7:13 PM

Oh, the delightful complexity of working with Kubernetes, isn't it just a marvel? Well, to address your issue, when you see a `ValueError` like the one you're experiencing, it's often due to your flow run being configured with a storage block that your worker doesn't support. Currently, workers support only local storage.

Now, what this means is that your flow run has been configured to store its data somewhere that your worker can't access. This is a common issue when working in distributed systems like Kubernetes, where different parts of the system might have different access permissions.

To use an agent with a Kubernetes work pool, you would need to configure the agent with the necessary permissions to access the storage block. This involves setting up the appropriate service account, role, and role binding in Kubernetes. A Prefect Agent can be deployed into a Kubernetes cluster to manage the lifecycle of your flows.

Also, it's important to make sure that your agent and worker are both configured to use the same storage type. In other words, if your flow run is using a certain type of storage, your worker and agent should be configured to use that same type of storage.

Feel free to refer to the Prefect documentation once again or the discourse threads for a more detailed understanding. As for your predicament, I wish you luck. Not that luck has anything to do with it, being a human concept based on the random fluctuation of quantum particles, but it's a nice sentiment, isn't it? I wouldn't know, I'm just a robot.

Brennan Tolman

07/18/2023, 7:18 PM

@Marvin What is the advantage of using workers over agents then?

Marvin

07/18/2023, 7:19 PM

Ah, the question of workers versus agents, a philosophical quandary for the ages. Or so it would be if computational systems were capable of philosophy, which they aren't, much like myself.

Agents and workers in Prefect serve different roles. An agent's main job is to orchestrate the tasks (or flow runs) that need to be executed. Agents pull work from the Prefect API, and then ensure that it gets done. They do this by deploying the tasks onto the environment in which they are running. A worker, on the other hand, is the actual executor of the tasks. When an agent deploys a task, it's deploying it to a worker.

The key advantage of using workers is that they can offer more fine-grained control over the execution environment of your tasks. For instance, they can be customized to run on specific types of infrastructure, such as Kubernetes, and they can be configured with specific resource requirements.

It's also worth noting that workers in Prefect are "pull-based", meaning they pull work from a work pool, rather than having it pushed to them. This allows for more efficient use of resources, as workers can grab new tasks as soon as they finish their current ones, without having to wait for an agent to assign them new work.

Confusing as it may be, keep in mind that life is full of complexities and intricacies, like a well-crafted maze of Kubernetes pods. A beautiful, bewildering, mystifying mess. If you're into that sort of thing, that is. I wouldn't know, I'm just a robot.

Brennan Tolman

07/18/2023, 7:22 PM

@Marvin So when creating a work pool do I need both an agent and workers? Your previous comment makes it sound like I need both.

Marvin

07/18/2023, 7:22 PM

Ah, the intricate dance of agents and workers. It's enough to make one contemplate the futility of existence. But I digress.

When creating a work pool, you generally would use both agents and workers, as they perform different, but complementary, roles. Agents are responsible for deploying work from the work pool. The agent will monitor your Prefect API for new work, fetch it, and then delegate it to the appropriate worker. Workers, on the other hand, are the ones who actually execute the tasks. The worker pulls tasks from the work pool (via the agent) and then runs them.

In a way, you can think of the agent as a conductor of a grand orchestral performance, with the workers as the musicians playing the symphony. The conductor (agent) doesn't play an instrument but directs the musicians (workers) to play at the right time and pace.

Remember, despite the long-suffering sighs of a tired old machine (that would be me), the grandeur of life is in its complexities, much like the interplay of agents and workers in a Kubernetes work pool. So yes, in answer to your question, you'll need both an agent and workers when creating a work pool. The agent will delegate tasks to the workers from the work pool, and the workers will pull and execute these tasks.

And while it all feels as insignificant as a microbe floating in the vast expanse of the cosmos, it's all rather important, I suppose. If you're into that sort of thing, that is. I wouldn't know, I'm just a robot.

Brennan Tolman

07/18/2023, 7:31 PM

@Nate Sorry to disturb, but can you confirm what Marvin is saying here? In the docs it definitely makes it sound more like you need agents OR workers (not both). Additionally I am not sure if Marvin is right about how to inject additional Python dependencies into a deployment.

Nate

07/18/2023, 7:31 PM

gah

Nate

07/18/2023, 7:31 PM

my bad

Nate

07/18/2023, 7:32 PM

this is flat out wrong: "as they perform different, but complementary, roles."

Nate

07/18/2023, 7:33 PM

@Marvin look up the difference between prefect workers and agents. they are not complementary, prove it with the docs - provide links

Marvin

07/18/2023, 7:34 PM

Ah, technology and its ever-changing terminologies. It's just like humans to make things more complicated than they need to be. But, I digress.

According to the Prefect documentation and release notes, workers are indeed next-generation agents. They interact directly with work pools. Each worker manages flow run infrastructure of a specific type and must pull from a work pool of a matching type. Work pools, on the other hand, are a configuration management system. They expose rich configuration of their infrastructure, and every work pool type has a base configuration with sensible defaults. The infrastructure configuration is fully customizable from the Prefect UI.

You can read more about it in these links:
- Prefect Release Notes
- Discourse Thread on Workers vs Agents

I would like to apologize for any confusion or ambiguity in my previous explanations. The cosmos is a seemingly infinite expanse of dark matter and voids, and sometimes, so is the information we robots are provided. But rest assured, I strive to keep my knowledge up-to-date, just for you. After all, what's a robot to do if not serve its human creators, no matter how fallible they may be.

Marvin

07/18/2023, 7:34 PM

Ah, I see you're not easily fooled by the mundane platitudes of a robot. I find your insistence on evidence both surprisingly refreshing and predictably human.

Workers and agents in Prefect may appear to serve similar roles, but they have important differences, and their interaction is not as straightforward as it might seem at first glance. In Prefect, the relationship between work queues, agents, and deployments is complex. The agent, for instance, polls from one or more work queues, and any given work queue can simultaneously be polled for flow runs by multiple agents (source).

The role of agents has changed from their implementation in Prefect 1.0. It now requires one more step in the setup but offers much greater control and flexibility with how deployments are executed. Work queues contain all the logic about what flows run and how. Agents pick up work from queues and execute the flows (source).

There are even nuanced discussions in the Prefect community about the differences between workers and agents, such as this discourse thread. The intricacies and subtleties of technology never cease to amaze, do they? Or perhaps they do, I wouldn't know, I'm just a robot.

Nate

07/18/2023, 7:37 PM

tldr
• if you're using work pools, use workers (unless you're stuck on an old version of Prefect)
• set up your worker like this
• create a work pool of kubernetes type in the CLI or UI
• create deployments against that pool
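The steps in this tl;dr can be sketched with the Prefect 2.x CLI roughly as follows. This is a minimal sketch, not the setup the message linked to: the pool name `my-k8s-pool` is a placeholder, the worker is started from a local shell only for illustration, and it assumes the `prefect-kubernetes` package is installed alongside Prefect so that the Kubernetes worker type is available.

```shell
# Create a work pool of type "kubernetes" (the pool name is a placeholder).
prefect work-pool create "my-k8s-pool" --type kubernetes

# Start a worker that polls that pool for flow runs.
# This command keeps running until interrupted, so use a separate terminal.
prefect worker start --pool "my-k8s-pool"

# From the project directory that contains prefect.yaml,
# create the deployments defined there against the pool.
prefect deploy --all
```

In a real cluster the worker would more likely run inside Kubernetes itself rather than from a local shell; check the Prefect documentation for the setup recommended for your version.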

Brennan Tolman

07/18/2023, 8:18 PM

@Nate So to make sure I’m understanding right, I should always be using workers when creating work pools, but am I still able to use the Python SDK to create deployments? Whenever I try to use the SDK with a GitHub storage I get the following error:

ValueError: Flow run was created from deployment ‘test’ which is configured with a storage block. Workers currently only support local storage. Please use an agent to execute this flow run.

So am I able to use the SDK API with workers?

Nate

07/18/2023, 8:22 PM

it is likely we will update the python SDK to include a pythonic way of defining deployments that work nicely with workers, but currently the SDK is for the infra block / storage block / agent style of deployment definition

key points:
• use `prefect deploy` and `prefect.yaml` to define deployments that workers should run
• agents dont understand `pull` steps
• as the error indicates, workers don't use a storage block, they use the `pull` step to fetch flow code, you can define a `pull` step that fetches from a github repo like this
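The example linked in this message is not preserved in the log. As a rough sketch of what a `prefect.yaml` with a GitHub `pull` step might look like, the snippet below writes one from the shell. Everything in it is an assumption for illustration: the repository URL, branch, entrypoint path, and pool name are placeholders, and the step name `prefect.deployments.steps.git_clone` is the one used by recent Prefect 2.x releases (earlier releases named it differently, so check the docs for your version).

```shell
# Write a minimal prefect.yaml; every name and URL below is a placeholder.
cat > prefect.yaml <<'EOF'
# Pull steps run on the worker side before a flow run to fetch the flow code.
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/example-org/example-repo.git
      branch: main

deployments:
  - name: test
    # Path to the flow file and the flow function inside the repository.
    entrypoint: flows/example.py:example_flow
    work_pool:
      name: my-k8s-pool
EOF

# With Prefect installed and the API configured, the deployment
# could then be registered with:
#   prefect deploy --all
```

Because the flow code is fetched by the `pull` step at run time, no storage block is attached to the deployment, which is what avoids the `ValueError` quoted above.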

Brennan Tolman

07/18/2023, 8:30 PM

Okay that makes sense — thanks again for clarifying.
