# ask-marvin
c
@Marvin how do I create a deployment that references local storage to run on a local worker programmatically in the latest version of prefect?
m
Creating a deployment that references local storage to run on a local worker programmatically in the latest version of Prefect can be done by following the instructions in the Prefect documentation. You'd want to focus on two aspects: defining the deployment with the appropriate storage, and then running the deployment on your local worker. Here are the steps extracted from the documentation:
1. Build the Deployment Definition File: You start by building a deployment definition file, referred to as `deployment.yaml`. This step may include uploading your flow to a configured remote storage location if one is specified. Build Deployment
2. Where to Store Your Flow Code: You need to decide where to store your flow code. The storage options are discussed in the storage guide, which focuses on deployments created with the interactive CLI experience or a `prefect.yaml` file. Since you're interested in creating deployments using Python code, the guide suggests looking for that specific section in the documentation for respective instructions. Storage Guide

Unfortunately, I could not find a direct link with detailed instructions for the exact process of creating a deployment programmatically for a local worker with local storage. The last excerpt suggests further information could be available but does not provide a direct link. I suggest looking directly into the Prefect documentation for "creating deployments using Python code" and checking the section related to local storage options and configuring a local worker. Implementing local storage and local worker configuration might require using specific Prefect APIs or classes in Python code, as it is not explicitly elaborated in the excerpts provided above.
n
doh I gotta get marvin to stop referencing block based deployments (deprecated in favor of workers) by default

if you say `persist_result=True` on a task / flow it'll use local storage (`~/.prefect/*` w.r.t the runtime, so the worker's machine if it's a local worker), but you can set either
• `PREFECT_HOME=/some/path` to control where that default storage is relative to
• `PREFECT_DEFAULT_RESULT_STORAGE_BLOCK=some-block-type/some-block-name` to save results elsewhere
c
Ah gotcha thanks. But it seems to have gotten it almost right, I was just not getting past the prefect deploy helper because I was using an absolute path lol. Now I have the yaml file so should have what I need to generate a bunch of deployments
n
gotcha, i just saw `deployment.yaml` and was like šŸ™… bc that was the declarative way to use `prefect deployment build`, which was the old CLI cmd for agents / block based deployments

so yep, `prefect deploy` & `prefect.yaml` is what you want - bunch of examples here if its helpful
šŸ™Œ 1
c
@Marvin what is the CLI command to apply a prefect.yaml?
m
To apply a `prefect.yaml` file using the Prefect CLI, you can use the following command:

```bash
prefect deploy --prefect-file path/to/my_file.yaml
```

This command allows you to specify the path to your custom `prefect.yaml` file. By default, the `prefect deploy` command will look for a `prefect.yaml` file in the project's root directory. If you want to use a different file or location, you can provide the path as shown above. For more details, you can refer to the Prefect documentation on deploying with prefect.yaml.
c
@Nate I'm being a noob again. Is there a way to bulk load deployments? This is what my yaml reference looks like now.
n
you can do `prefect deploy -n name path/to/file.py:entrypoint` or `prefect --no-prompt deploy --all`

latter is nice for CI etc where you have all the info you need in your `prefect.yaml`, i.e. don't need the interactive experience
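For reference, the non-interactive `--all` path assumes the deployments are already declared in `prefect.yaml`; a rough fragment (all names and entrypoints invented) might look like:

```yaml
# Hypothetical prefect.yaml fragment: two deployments that
# `prefect --no-prompt deploy --all` can register in one pass.
deployments:
- name: schema-a
  entrypoint: flows/etl.py:run_schema
  work_pool:
    name: my-local-pool
- name: schema-b
  entrypoint: flows/etl.py:run_schema
  work_pool:
    name: my-local-pool
```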
c
Also I ran into two more issues. 1) If I create a deployment without referencing the yaml it seems to work and add it to the server, but it won't load these and gives me an event loop error. 2) The one I made manually in the CLI and try to launch in the UI just hangs saying Late.
Cool I'll try
Oh I see, so it just won't appear in the UI until I try to run it once?
Yeah I shouldn't need the UI, but my original idea was to split up all the schemas into deployments which I can run one after the other being fed into the worker queue, I'm assuming the local process would just stay alive until it gets through all of them. It would let me investigate a schema that fails as a deployment and still move on to the next
But maybe that's not how deployments should be used?
n
> 2 the one I made manually in the CLI and try to launch in the ui just hangs saying late.
this is usually because a worker is not running for the work pool that you've assigned the deployments to

not so sure what your objective is / how you mean "schema" here but
> split up all the schemas into deployments which I can run one after the other being fed into the worker queue, I'm assuming the local process would just stay alive until it gets through all of them
in general a process worker will live as long as the process that starts when you say `prefect worker start -p some-local-pool`
c
I'm exploring CLI help, starting to see what's going on better. So the worker is the environment that runs your code, the work-pool gives it work, and the queues let you configure how the pool gives flows to the worker? So then the deployment is assigned to the work pool and queue where the queue limits concurrency for the deployments you assign it...?
yess 1
schemas are postgresql schemas with a bunch of tables.
šŸ‘ 1
n
pretty much

you only need to engage with the idea of work queues if:
• you need concurrency limits for your flow runs on a more granular level than work pool (you can set them on either/both)
• you need to prioritize how a worker picks up flow runs sent to a work pool

but basically work pools are things that:
• represent a type of runtime for your deployments (k8s, ECS, local process etc)
• can have many work queues (just 1 by default)
• store the base job template for that infra
  ā—¦ env, working directory etc are examples of `job_variables` that the `local` work-pool type has
  ā—¦ this can be overridden on a deployment basis

deployments are assigned to a work pool, then flow runs from that deployment are sent to that work pool, workers pick up runs from that pool according to the above and then submit the run to the execution env according to the template on your work pool + any overrides
c
Very elucidating, thanks! So does the worker/env live only as long as the highest level flow in the deployment? Or as long as the pool is backed up with more flow runs? Just want to make sure so I can know how to deal with shared resources. Also, can the same worker be assigned to multiple work pools? I'm guessing as long as the work pool is backed up, and no for multi-pool assignment.
> workers pick up runs from that pool according to the above and then submit the run to the execution env according to the template on your work pool + any overrides
This is what I don't get: is the env live as long as the worker is alive and submitting work to it? Or is it restarting at each deployment to override any variables? And env = infrastructure?
n
> So does the worker/env live only as long as the highest level flow in the deployment?
yep, the flow run environment is torn down after the "entrypoint" flow exits, but the worker process runs till you kill it or it dies

the process worker is a special case in the sense that the flow runs execute as a subprocess of the worker; otherwise the worker's job in k8s or ecs is just to spin up a container for the flow run (entirely separate infra from the worker itself)

a work pool has `job_variables`; `env` is an example of a `job_variable` that can be set for the duration of a flow run
c
Ok that makes sense, then I probably shouldn't make a deployment for each sql schema. And instead cache results or do that explicitly with something else, which I already do with another sql asset tracking table.
šŸ‘ 1