< Marvin> should I use `prefect deployments steps set workin Prefect Community #marvin-in-the-wild


Omar Khudeira

05/22/2024, 4:52 PM

@Marvin should I use `prefect.deployments.steps.set_working_directory` when making a deployment that uses an image?

Marvin

05/22/2024, 4:52 PM

You might encounter issues with using `prefect.deployments.steps.set_working_directory` when deploying an image because the working directory might not persist as expected in containerized environments. For example, users have reported encountering a `FileNotFoundError` when trying to set a working directory during deployments, which can be particularly problematic when the directory doesn't exist in the containerized environment.

Here's what you should consider:

1. Check the Directory Structure: Ensure that the directory you are setting exists inside your image. You might need to structure your image appropriately to include the working directory you want to set.
2. Implicit Steps: In some cases, Prefect might implicitly try to set a working directory even if you didn't define it in your deployment configuration, which can cause errors. Make sure to explicitly control these steps in your deployment script.
3. Handling Errors: If errors persist when using the `set_working_directory` step, you may need to handle these in your deployment script or adjust your image to meet the directory structure requirements.

Here's a summary of relevant issues:

- FileNotFoundError Issue: Users have faced issues where a specified directory doesn't exist in the container environment, causing the deployment to fail.
- Deployment configuration: Ensure that your deployment configuration properly handles external dependencies and directory structures.

For best practices, you might want to avoid relying on changing directories dynamically inside the image and instead ensure your image's file structure aligns with what your deployment script expects. This approach might save you from encountering runtime errors related to the file system.
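As a rough illustration of the failure mode described above: a step that sets the working directory amounts to changing the process's current directory, so it fails when the target path is missing from the container. The sketch below imitates that behaviour with the standard library only. `set_working_directory_sketch` and `try_step` are hypothetical stand-ins for illustration, not Prefect's own functions.

```python
import os
import tempfile


def set_working_directory_sketch(directory: str) -> dict:
    """Hypothetical stand-in for the pull step: change into `directory`."""
    os.chdir(directory)  # raises FileNotFoundError if the path does not exist
    return {"directory": directory}


def try_step(directory: str) -> str:
    """Run the sketch and report the outcome instead of crashing."""
    original = os.getcwd()
    try:
        set_working_directory_sketch(directory)
        return "ok"
    except FileNotFoundError:
        return "missing"
    finally:
        os.chdir(original)  # restore the original directory afterwards


existing = tempfile.mkdtemp()            # a directory that exists
absent = os.path.join(existing, "nope")  # a path that was never created

print(try_step(existing))  # ok
print(try_step(absent))    # missing
```

The second call is the situation in this thread: the directory named in the step is simply not present in the environment where the flow run starts.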

Nate

05/22/2024, 4:56 PM

> should I use `prefect.deployments.steps.set_working_directory` when making a deployment that uses an image?

yes, if you're baking your flow code into your image and need to adjust the working directory so that your flow `entrypoint` is set up relative to that image's WORKDIR

otherwise, you can just have a base `image` set on your work pool, then have a `git_clone` pull step and not worry about set_working_directory (as long as your `entrypoint` is relative to repo root)
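To make the point about relative entrypoints concrete, here is a small sketch of how an entrypoint string of the form `path/to/file.py:function_name` lands on a different file depending on which directory is current when the flow run starts. `resolve_entrypoint` and the paths used are hypothetical and for illustration only; this is not Prefect's loader.

```python
from pathlib import PurePosixPath
from typing import Tuple


def resolve_entrypoint(entrypoint: str, working_dir: str) -> Tuple[str, str]:
    """Split 'relative/path.py:func' and anchor the path at working_dir."""
    path, func_name = entrypoint.rsplit(":", 1)
    return str(PurePosixPath(working_dir) / path), func_name


# Same entrypoint, two different working directories (both hypothetical).
print(resolve_entrypoint("flows/my_flow.py:my_flow", "/opt/prefect"))
print(resolve_entrypoint("flows/my_flow.py:my_flow", "/"))
```

The entrypoint only points at a real file if the working directory matches where the code actually lives, whether that is the image's WORKDIR or the root of a cloned repo.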

Omar Khudeira

05/22/2024, 5:04 PM

@Marvin @Nate Hmmm, so the worker image does not have the flow baked in. Our deployment uses a separate image and we deploy using Python via `my_flow.deploy()`. Due to legacy reasons and inability to change work pool queue names from `default`, we’re seeing an agent (instead of a worker) pick up our task every once in a while. When an agent picks the flow, it errors out with FileNotFound. It sounds like if I set `prefect.deployments.steps.set_working_directory` with `directory: /opt/prefect/flows` it should work. But now I need to figure out how to specify `pull` steps as part of the `deploy()` function, which doesn’t have that parameter.

Marvin

05/22/2024, 5:04 PM

You're on the right track. To use `prefect.deployments.steps.set_working_directory` within your deployment while calling the `deploy()` function, you will need to leverage the deployment's YAML configuration for specifying pull steps. Here’s how you can set the working directory properly:

### Prefect Deployment Configuration

Add pull steps in your `prefect.yaml` configuration file like so:

```yaml
# Generic metadata about this project
name: my_project
prefect-version: 2.x.x

# build section allows you to manage and build docker images
build: null

# push section, if you have any logic to push artifacts (like Docker images) to a registry
push: null

# pull section allows you to specify how to set up the environment before running your flow
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /opt/prefect/flows

# the deployments section allows you to provide configuration for deploying flows
deployments:
  - name: my_deployment
    version: 1.0.0
    tags: []
    description: "My deployment description"
    entrypoint: my_flow.py:my_flow_function
    parameters:
      example_param: value
    work_pool:
      name: my_work_pool
      work_queue_name: my_work_queue
```

### Using deploy() function

You can then issue the `deploy()` function in your Python script without worrying about specifying the pull steps directly, since they are now configured in your YAML:

```python
from my_flow import my_flow

# Other deployment logic if necessary
deployment = my_flow.deploy()
```

### Detailed Steps:

1. Define Configuration in `prefect.yaml`: The `prefect.yaml` should define the pull steps as shown above, especially the `set_working_directory` step.
2. Deploy using Python Script: Use the `deploy()` function from your flow. This script will leverage the settings from your `prefect.yaml` implicitly.

### Important Notes

- Check Compatibility: Make sure the directory specified in `set_working_directory` exists in your image.
- Work Pool Naming: Ensure that the work pool and queue names avoid ambiguities and legacy agent picks.
- Deploy Pull Steps: Verify that Prefect's version you're using supports pull steps in deployment as per the GitHub issue.

For additional information, you can refer to Prefect's guide on deployment with pull steps. Thus, by leveraging the YAML configuration and specifying the pull steps there, you should be able to avoid FileNotFound errors while keeping your deployment systematic and clean.

Nate

05/22/2024, 5:07 PM

disregard the answer above from marvin

`pull` steps in the yaml are for deployments created via `prefect deploy`

basically .deploy()'s version of `pull` steps is the `from_source` method that you chain before `.deploy()`

```python
flow.from_source(...).deploy(...)
```

where the source can be a git repo or blob storage of some kind
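For readers who want to see the chained call spelled out, here is a minimal sketch, assuming a Prefect release that provides `flow.from_source` and a git repository as the source. The repository URL, entrypoint, deployment name, and work pool name are placeholders, and the helpers `deployment_arguments` and `deploy_from_git` are hypothetical names introduced only for this example.

```python
def deployment_arguments(repo_url, entrypoint, name, work_pool_name):
    """Collect keyword arguments for the two chained calls (hypothetical helper)."""
    source_kwargs = {"source": repo_url, "entrypoint": entrypoint}
    deploy_kwargs = {"name": name, "work_pool_name": work_pool_name}
    return source_kwargs, deploy_kwargs


def deploy_from_git(repo_url, entrypoint, name, work_pool_name):
    """Chain from_source before deploy; needs Prefect and a reachable API."""
    from prefect import flow  # imported lazily so the sketch loads without Prefect

    source_kwargs, deploy_kwargs = deployment_arguments(
        repo_url, entrypoint, name, work_pool_name
    )
    return flow.from_source(**source_kwargs).deploy(**deploy_kwargs)


# Placeholder values only; nothing is deployed by running this file.
src, dep = deployment_arguments(
    "https://example.com/org/repo.git",  # assumed repo URL
    "flows/my_flow.py:my_flow",          # assumed path relative to repo root
    "my-deployment",
    "my-work-pool",
)
print(src)
print(dep)
```

Calling `deploy_from_git(...)` with real values would register the deployment; the entrypoint is interpreted relative to the root of the source, which matches the earlier remark about keeping it relative to the repo root.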

> When an agent picks the flow, it errors out with FileNotFound.

this makes some sense to me, since agents cannot perform `pull` steps, only workers can

Omar Khudeira

05/22/2024, 5:10 PM

I must be missing something. Why would it do any type of `pull` from storage or blob if I’m using a docker image for my deployment? Also, any thoughts on this “workaround”? https://github.com/PrefectHQ/prefect/issues/10285#issuecomment-1646553334

Nate

05/22/2024, 5:12 PM

`pull` steps are the way to retrieve your source code

> Hmmm, so the worker image does not have the flow baked in.

to clarify, I meant the `image` for your deployment's flow run, not the worker's image

so do you build your flow code into the image you specify for that deployment?

Omar Khudeira

05/22/2024, 5:13 PM

Yes, the image in the deployment contains my flow code. It works fine when a worker runs the deployment, but it fails with the FileNotFound error when an agent tries to run the flow.

Nate

05/22/2024, 5:14 PM

hmm i see - do you need to have an agent running for some reason?

Omar Khudeira

05/22/2024, 5:14 PM

I’m seeing other folks at my company using `set_working_directory` and they don’t have this issue, similar to the workaround in the GitHub issue I linked.

Omar Khudeira

05/22/2024, 5:14 PM

> hmm i see - do you need to have an agent running for some reason?

Legacy reasons. Teams haven’t gotten off of them yet.

Omar Khudeira

05/22/2024, 5:16 PM

At the same time, we’re looking at creating our own work pool with a different queue name (i.e. not `default`) so we don’t have to deal with this. But we’re trying to see if we can work around this issue for the time being while that work is completed.
