# ask-marvin
c
@Marvin how do I create a deployment that references local storage to run on a local worker programmatically in the latest version of prefect?
m
Creating a deployment that references local storage to run on a local worker programmatically in the latest version of Prefect can be done by following the instructions in the Prefect documentation. You'd want to focus on two aspects: defining the deployment with the appropriate storage, and then running the deployment on your local worker. Here are the steps extracted from the documentation:
1. Build the Deployment Definition File: You start by building a deployment definition file, referred to as `deployment.yaml`. This step may include uploading your flow to a configured remote storage location if one is specified. Build Deployment
2. Where to Store Your Flow Code: You need to decide where to store your flow code. The storage options are discussed in the storage guide, which focuses on deployments created with the interactive CLI experience or a `prefect.yaml` file. Since you're interested in creating deployments using Python code, the guide suggests looking for that specific section in the documentation for respective instructions. Storage Guide

Unfortunately, I could not find a direct link with detailed instructions for the exact process of creating a deployment programmatically for a local worker with local storage. The last excerpt suggests further information could be available but does not provide a direct link. I suggest looking directly into the Prefect documentation for "creating deployments using Python code" and checking the section related to local storage options and configuring a local worker. Implementing local storage and local worker configuration might require using specific Prefect APIs or classes in Python code, as it is not explicitly elaborated in the excerpts provided above.
n
doh I gotta get marvin to stop referencing block based deployments (deprecated in favor of workers) by default

if you say `persist_result=True` on a task / flow it'll use local storage (`~/.prefect/*` w.r.t the runtime, so the worker's machine if it's a local worker), but you can set either
• `PREFECT_HOME=/some/path` to control where that default storage is relative to
• `PREFECT_DEFAULT_RESULT_STORAGE_BLOCK=some-block-type/some-block-name` to save results elsewhere
c
Ah gotcha thanks. But it seems to have gotten it almost right, I was just not getting past the prefect deploy helper because I was using an absolute path lol. Now I have the yaml file so should have what I need to generate a bunch of deployments
n
gotcha, i just saw `deployment.yaml` and was like šŸ™… bc that was the declarative way to use `prefect deployment build`, which was the old CLI cmd for agents / block based deployments

so yep, `prefect deploy` & `prefect.yaml` is what you want - bunch of examples here if its helpful
šŸ™Œ 1
c
@Marvin what is the CLI command to apply a prefect.yaml?
m
To apply a `prefect.yaml` file using the Prefect CLI, you can use the following command:

```bash
prefect deploy --prefect-file path/to/my_file.yaml
```

This command allows you to specify the path to your custom `prefect.yaml` file. By default, the `prefect deploy` command will look for a `prefect.yaml` file in the project's root directory. If you want to use a different file or location, you can provide the path as shown above. For more details, you can refer to the Prefect documentation on deploying with prefect.yaml.
c
@Nate I'm being a noob again. Is there a way to bulk load deployments? This is what my yaml reference looks like now.
n
you can do `prefect deploy -n name path/to/file.py:entrypoint` or `prefect --no-prompt deploy --all`

latter is nice for CI etc where you have all the info you need in your `prefect.yaml`, i.e. don't need the interactive experience
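For reference, the non-interactive `--all` path assumes the deployments are already declared in `prefect.yaml`; a rough fragment (all names and entrypoints invented) might look like:

```yaml
# Hypothetical prefect.yaml fragment: two deployments that
# `prefect --no-prompt deploy --all` can register in one pass.
deployments:
- name: schema-a
  entrypoint: flows/etl.py:run_schema
  work_pool:
    name: my-local-pool
- name: schema-b
  entrypoint: flows/etl.py:run_schema
  work_pool:
    name: my-local-pool
```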
c
Also I ran into two more issues. 1) If I create a deployment without referencing the yaml it seems to work and add it to the server, but it won't load these and gives me an event loop error. 2) The one I made manually in the CLI and try to launch in the UI just hangs saying Late.
Cool I'll try
Oh I see, so it just won't appear in the UI until I try to run it once?
Yeah I shouldn't need the UI, but my original idea was to split up all the schemas into deployments which I can run one after the other being fed into the worker queue, I'm assuming the local process would just stay alive until it gets through all of them. It would let me investigate a schema that fails as a deployment and still move on to the next
But maybe that's not how deployments should be used?
n
> 2 the one I made manually in the CLI and try to launch in the ui just hangs saying late.
this is usually because a worker is not running for the work pool that you've assigned the deployments to

not so sure what your objective is / how you mean "schema" here but
> split up all the schemas into deployments which I can run one after the other being fed into the worker queue, I'm assuming the local process would just stay alive until it gets through all of them
in general a process worker will live as long as the process that starts when you say `prefect worker start -p some-local-pool`
c
I'm exploring CLI help, starting to see what's going on better. So the worker is the environment that runs your code, the work-pool gives it work, and the queues let you configure how the pool gives flows to the worker? So then the deployment is assigned to the work pool and queue where the queue limits concurrency for the deployments you assign it...?
yess 1
schemas are postgresql schemas with a bunch of tables.
šŸ‘ 1
n
pretty much

you only need to engage with the idea of work queues if:
• you need concurrency limits for your flow runs on a more granular level than work pool (you can set them on either/both)
• you need to prioritize how a worker picks up flow runs sent to a work pool

but basically work pools are things that:
• represent a type of runtime for your deployments (k8s, ECS, local process etc)
• can have many work queues (just 1 by default)
• store the base job template for that infra
  ā—¦ env, working directory etc are examples of `job_variables` that the `local` work-pool type has
  ā—¦ this can be overridden on a deployment basis

deployments are assigned to a work pool, then flow runs from that deployment are sent to that work pool, workers pick up runs from that pool according to the above and then submit the run to the execution env according to the template on your work pool + any overrides
c
Very elucidating, thanks! So does the worker/env live only as long as the highest level flow in the deployment? Or as long as the pool is backed up with more flow runs? Just want to make sure so I can know how to deal with shared resources. Also, can the same worker be assigned to multiple work pools? I'm guessing as long as the work pool is backed up, and no for multi-pool assignment.
> workers pick up runs from that pool according to the above and then submit the run to the execution env according to the template on your work pool + any overrides
This is what I don't get: is the env live as long as the worker is alive and submitting work to it? Or is it restarting at each deployment to override any variables? And env = infrastructure?
n
> So does the worker/env live only as long as the highest level flow in the deployment?
yep, the flow run environment is torn down after the "entrypoint" flow exits, but the worker process runs till you kill it or it dies

the process worker is a special case in the sense that the flow runs execute as a subprocess of the worker; otherwise the worker's job in k8s or ecs is just to spin up a container for the flow run (entirely separate infra from the worker itself)

a work pool has `job_variables`; `env` is an example of a `job_variable` that can be set for the duration of a flow run
c
Ok that makes sense, then I probably shouldn't make a deployment for each sql schema. And instead cache results or do that explicitly with something else, which I already do with another sql asset tracking table.
šŸ‘ 1