<@ULVA73B9P> should I use `prefect.deployments.ste...
# marvin-in-the-wild
o
@Marvin should I use `prefect.deployments.steps.set_working_directory` when making a deployment that uses an image?
m
You might encounter issues with using `prefect.deployments.steps.set_working_directory` when deploying an image because the working directory might not persist as expected in containerized environments. For example, users have reported encountering a `FileNotFoundError` when trying to set a working directory during deployments, which can be particularly problematic when the directory doesn't exist in the containerized environment. Here's what you should consider:

1. Check the Directory Structure: Ensure that the directory you are setting exists inside your image. You might need to structure your image appropriately to include the working directory you want to set.
2. Implicit Steps: In some cases, Prefect might implicitly try to set a working directory even if you didn't define it in your deployment configuration, which can cause errors. Make sure to explicitly control these steps in your deployment script.
3. Handling Errors: If errors persist when using the `set_working_directory` step, you may need to handle these in your deployment script or adjust your image to meet the directory structure requirements.

Here's a summary of relevant issues:

- FileNotFoundError Issue: Users have faced issues where a specified directory doesn't exist in the container environment, causing the deployment to fail.
- Deployment configuration: Ensure that your deployment configuration properly handles external dependencies and directory structures.

For best practices, you might want to avoid relying on changing directories dynamically inside the image and instead ensure your image's file structure aligns with what your deployment script expects. This approach might save you from encountering runtime errors related to the file system.
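The `FileNotFoundError` described above can be reproduced without Prefect. What follows is a minimal sketch, assuming the step ultimately amounts to changing the process's working directory (the equivalent of `os.chdir`); the helper name `enter_working_directory` is hypothetical and is not part of Prefect.

```python
import os
import tempfile


def enter_working_directory(directory: str) -> dict:
    """Change into `directory`, failing with a clear message if it is missing.

    Hypothetical helper for illustration only; it is not Prefect's code.
    """
    if not os.path.isdir(directory):
        raise FileNotFoundError(
            f"{directory!r} does not exist in this environment; "
            "bake it into the image or point the step at an existing path"
        )
    os.chdir(directory)
    return {"directory": directory}


if __name__ == "__main__":
    original = os.getcwd()
    with tempfile.TemporaryDirectory() as existing:
        # A directory that exists: the change succeeds.
        print(enter_working_directory(existing))
        os.chdir(original)  # step back out before the temp dir is removed
    try:
        # A directory that does not exist: the same failure the thread describes.
        enter_working_directory(os.path.join(existing, "missing"))
    except FileNotFoundError as exc:
        print(f"failed as expected: {exc}")
```

Checking for the directory up front only improves the error message; the underlying fix is still to make sure the path exists in whatever environment runs the flow.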
n
> should I use `prefect.deployments.steps.set_working_directory` when making a deployment that uses an image?
yes, if you're baking your flow code into your image and need to adjust the working directory so that your flow `entrypoint` is set up relative to that image's WORKDIR
otherwise, you can just have a base `image` set on your work pool, then have a `git_clone` pull step and not worry about `set_working_directory` (as long as your `entrypoint` is relative to repo root)
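To make "relative to that image's WORKDIR" concrete, here is a rough sketch of how an entrypoint string maps onto a file path. It assumes the `path/to/file.py:function_name` entrypoint format seen elsewhere in this thread and that the path part is resolved against the current working directory; `resolve_entrypoint` is a hypothetical helper, not a Prefect function.

```python
from pathlib import PurePosixPath


def resolve_entrypoint(entrypoint: str, working_dir: str) -> tuple[PurePosixPath, str]:
    """Split 'path/to/file.py:flow_name' and join the path onto working_dir."""
    file_part, _, flow_name = entrypoint.rpartition(":")
    if not file_part or not flow_name:
        raise ValueError(f"expected 'path.py:function', got {entrypoint!r}")
    # POSIX paths are used because the target is a Linux container image.
    return PurePosixPath(working_dir) / file_part, flow_name


if __name__ == "__main__":
    path, name = resolve_entrypoint("my_flow.py:my_flow_function", "/opt/prefect/flows")
    print(path, name)
```

If the file does not sit at the resulting path inside the image, either the working directory or the entrypoint needs to change.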
o
@Marvin @Nate Hmmm, so the worker image does not have the flow baked in. Our deployment uses a separate image and we deploy using Python via `my_flow.deploy()`. Due to legacy reasons and inability to change work pool queue names from `default`, we’re seeing an agent (instead of a worker) pick up our task every once in a while. When an agent picks the flow, it errors out with FileNotFound. It sounds like if I set `prefect.deployments.steps.set_working_directory` to `directory: /opt/prefect/flows` it should work. But now I need to figure out how to specify `pull` steps as part of the `deploy()` function, which doesn’t have that parameter.
m
You're on the right track. To use `prefect.deployments.steps.set_working_directory` within your deployment while calling the `deploy()` function, you will need to leverage the deployment's YAML configuration for specifying pull steps. Here’s how you can set the working directory properly:

### Prefect Deployment Configuration

Add pull steps in your `prefect.yaml` configuration file like so:
```yaml
# Generic metadata about this project
name: my_project
prefect-version: 2.x.x

# build section allows you to manage and build docker images
build: null

# push section, if you have any logic to push artifacts (like Docker images) to a registry
push: null

# pull section allows you to specify how to set up the environment before running your flow
pull:
  - prefect.deployments.steps.set_working_directory:
      directory: /opt/prefect/flows

# the deployments section allows you to provide configuration for deploying flows
deployments:
  - name: my_deployment
    version: 1.0.0
    tags: []
    description: "My deployment description"
    entrypoint: my_flow.py:my_flow_function
    parameters:
      example_param: value
    work_pool:
      name: my_work_pool
      work_queue_name: my_work_queue
```
### Using deploy() function

You can then call the `deploy()` function in your Python script without worrying about specifying the pull steps directly, since they are now configured in your YAML:
```python
from my_flow import my_flow

# Other deployment logic if necessary
deployment = my_flow.deploy()
```
### Detailed Steps:

1. Define Configuration in `prefect.yaml`: The `prefect.yaml` should define the pull steps as shown above, especially the `set_working_directory` step.
2. Deploy using Python Script: Use the `deploy()` function from your flow. This script will leverage the settings from your `prefect.yaml` implicitly.

### Important Notes

- Check Compatibility: Make sure the directory specified in `set_working_directory` exists in your image.
- Work Pool Naming: Ensure that the work pool and queue names avoid ambiguities and pickups by legacy agents.
- Deploy Pull Steps: Verify that the Prefect version you're using supports pull steps in deployment as per the GitHub issue.

For additional information, you can refer to Prefect's guide on deployment with pull steps. Thus, by leveraging the YAML configuration and specifying the pull steps there, you should be able to avoid FileNotFound errors while keeping your deployment systematic and clean.
n
disregard the answer above from marvin
`pull` steps in the yaml are for deployments created via `prefect deploy`
basically .deploy()'s version of `pull` steps is the `from_source` method that you chain before `.deploy()` like
```
flow.from_source(...).deploy(...)
```
where the source can be a git repo or blob storage of some kind
> When an agent picks the flow, it errors out with FileNotFound.
this makes some sense to me, since agents cannot perform `pull` steps, only workers can
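A slightly fuller sketch of the `from_source(...).deploy(...)` chain mentioned above. The repository URL, entrypoint, deployment name, and work pool name are all hypothetical placeholders, and the keyword arguments (`source`, `entrypoint`, `name`, `work_pool_name`) reflect Prefect 2.x as far as can be told from this thread, so check them against your installed version.

```python
def source_and_deploy_kwargs() -> tuple[dict, dict]:
    """Return placeholder arguments; every value here is hypothetical."""
    source_kwargs = {
        "source": "https://github.com/my-org/my-repo.git",  # hypothetical repo
        "entrypoint": "flows/my_flow.py:my_flow",  # relative to the repo root
    }
    deploy_kwargs = {
        "name": "my-deployment",  # hypothetical deployment name
        "work_pool_name": "my-work-pool",  # hypothetical work pool
    }
    return source_kwargs, deploy_kwargs


def main() -> None:
    # Imported here so the module can be read without Prefect installed.
    from prefect import flow

    source_kwargs, deploy_kwargs = source_and_deploy_kwargs()
    # from_source tells Prefect where to retrieve the code at run time;
    # deploy then registers the deployment against the work pool.
    flow.from_source(**source_kwargs).deploy(**deploy_kwargs)

# Call main() only where a Prefect API and the named work pool are available.
```

As noted above, this retrieval happens on workers; it does not make an agent able to perform `pull` steps.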
o
I must be missing something. Why would it do any type of `pull` from storage or blob if I’m using a docker image for my deployment? Also, any thoughts on this “workaround”? https://github.com/PrefectHQ/prefect/issues/10285#issuecomment-1646553334
n
`pull` steps are the way to retrieve your source code
> Hmmm, so the worker image does not have the flow baked in.
to clarify, I meant the `image` for your deployment's flow run, not the worker's image
so do you build your flow code into the image you specify for that deployment?
o
Yes, the image in the deployment contains my flow code. It works fine when a worker runs the deployment, but it fails with the FileNotFound error when an agent tries to run the flow.
n
hmm i see - do you need to have an agent running for some reason?
o
I’m seeing other folks at my company using `set_working_directory` and they don’t have this issue, similar to the workaround in the GitHub issue I linked.
> hmm i see - do you need to have an agent running for some reason?
Legacy reasons. Teams haven’t gotten off of them yet.
At the same time, we’re looking at creating our own work pool with a different queue name (i.e. not `default`) so we don’t have to deal with this. But we’re trying to see if we can work around this issue for the time being while that work is completed.