# ask-marvin
s
@Marvin We're using the process work pool and I want to create a unique venv for each deployment run. You have the `prefect.deployments.steps.run_shell_script` option for YAML deployments, but using `pip install blah` in this shell script installs in the shared venv of the worker. I seem to be able to create a new venv in this step, but I can't find any way of using this venv for the subsequent flow run.
m
thought for 674.5 seconds
Great question, and you're right about the behavior you're seeing. A couple of key points:

- Deployment steps like `prefect.deployments.steps.run_shell_script` run at deploy time (where you run `prefect deploy`), not at flow run time. They can't change the worker's runtime environment for each flow run.
- With a Process work pool, the way to control which Python/venv runs your flow is by overriding the job's `command`. You can point at a specific interpreter or wrap the default command to create/use an ephemeral venv per run.

Two patterns you can use:

1) Pin a pre-created venv per deployment

- Pre-create a venv on the worker host, then set the deployment's job variable `command` to that interpreter. Example in `prefect.yaml`:
```yaml
deployments:
  - name: my-deployment
    entrypoint: path/to/flow.py:flow
    work_pool:
      name: my-process-pool
      job_variables:
        command: "/opt/prefect/envs/my-deployment/bin/python -m prefect flow-run execute {{ flow_run.id }}"
```
2) Create a throwaway venv per flow run (fully isolated each time)

- Override `command` with a small shell wrapper that:
  - creates a venv in a temp dir
  - installs dependencies (including Prefect) into it
  - execs the flow run with that venv's python
- Example (Linux/macOS) in `prefect.yaml`:
```yaml
deployments:
  - name: my-deployment
    entrypoint: path/to/flow.py:flow
    work_pool:
      name: my-process-pool
      job_variables:
        # Adjust the requirements line to your needs; ensure Prefect is installed in the venv.
        command: >
          bash -lc '
          VENV="$(mktemp -d)/venv";
          python -m venv "$VENV";
          "$VENV/bin/python" -m pip install --upgrade pip;
          "$VENV/bin/pip" install -r requirements.txt prefect;
          exec "$VENV/bin/python" -m prefect flow-run execute {{ flow_run.id }};
          '
```
Notes:
- The `{{ flow_run.id }}` Jinja variable is rendered at runtime so the command can target the current run.
- If your `requirements.txt` is in your repo, make sure your storage/source pulls it onto the worker at run time (see the pull-step sketch after the Python example below).
- Use `exec` to replace the shell process so signals/logging behave properly.

If you prefer to do this in Python when creating the deployment:
```python
from prefect import flow

@flow
def my_flow():
    ...

my_flow.from_source(
    "https://github.com/owner/repo",
    entrypoint="flows/my_flow.py:my_flow",
).deploy(
    name="my-deployment",
    work_pool_name="my-process-pool",
    job_variables={
        "command": "bash -lc 'VENV=$(mktemp -d)/venv; python -m venv $VENV; $VENV/bin/python -m pip install --upgrade pip; $VENV/bin/pip install -r requirements.txt prefect; exec $VENV/bin/python -m prefect flow-run execute {{ flow_run.id }}'"
    },
)
```
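On the `requirements.txt` note above: when you deploy from a Git source, it's the deployment's pull step that brings the repo (and its `requirements.txt`) onto the worker at run time. A minimal sketch of that pull section, assuming a public repo (the repository URL and branch are placeholders):

```yaml
# Sketch only: replace the repository URL and branch with your own.
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/owner/repo
      branch: main
```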
Why your current approach doesn't work:
- `run_shell_script` in deployments runs during deployment creation, not when a flow run starts; it cannot switch the interpreter that the worker uses at runtime.

Docs and references:
- Customize job variables
- Process worker API (job variables include `command`)

Optional alternative:
- If you want stronger isolation with less per-run setup cost, consider a Docker work pool where each deployment references an image with its deps pre-baked.

Would you like a Windows/PowerShell variant of the command, or help wiring this into your existing `prefect.yaml`?
s
@Marvin When you say, "If your `requirements.txt` is in your repo, make sure your storage/source pulls it onto the worker at run time.", I actually want to `pip install .` inside the cloned GitRepository. When that job command is run, are we already inside the cloned deployment repo?
m
thought for 180.6 seconds
Short answer: yes. If your deployment uses `flow.from_source(...).deploy(...)` (Git source), the Process worker clones the repo for each run and sets the job's working directory to the checkout root before executing the command. So a `pip install .` in your overridden `command` will run against the cloned repo.

Putting it together, here's a per-run venv example that installs your package from the repo and then executes the flow run:
```yaml
work_pool:
  name: my-process-pool
  job_variables:
    # Install Prefect in the venv (needed for `-m prefect`), then install
    # your project from the cloned repo root.
    command: >
      bash -lc '
      VENV="$(mktemp -d)/venv";
      python -m venv "$VENV";
      "$VENV/bin/python" -m pip install --upgrade pip;
      "$VENV/bin/pip" install prefect;
      "$VENV/bin/pip" install .;
      exec "$VENV/bin/python" -m prefect flow-run execute {{ flow_run.id }};
      '
```
Notes:
- If you've explicitly overridden `working_dir` in job variables, make sure it points at the repo root, or add an explicit `cd /path/to/repo` before `pip install .`.
- This assumes you're deploying from source (Git). If you aren't, the repo won't be present at runtime.

Docs:
- Customize job variables (including `command`/`working_dir`)

If you want a PowerShell/Windows variant, I can share that too.
s
@Marvin This isn't actually correct. The `command` is run before the git clone step. If I add a directory listing to that bash command, the temporary directory that the command runs in is empty. In the worker log, I can see the bash command returning an exit code and then a line with `Flow run 'fiery-serval' -  > Running git_clone step...`
m
thought for 65.0 seconds
You're absolutely right, thanks for the correction. On Process work pools, the job `command` runs first, then Prefect starts the flow-run process which performs the git clone. That means your `command` runs in an empty temp directory and cannot `pip install .` from the repo checkout.

Given that ordering, here are workable options:

Option A: Install your project from Git in the pre-run command, then exec the flow run
- Instead of `pip install .`, install your project via a Git URL pinned to a branch/tag/commit. Then start the flow run using the venv's Python.
- Example:
```yaml
work_pool:
  name: my-process-pool
  job_variables:
    # Install Prefect into the venv, install your project from Git
    # (pin to a branch/tag/commit), then run the flow in this venv.
    command: >
      bash -lc '
      VENV="$(mktemp -d)/venv";
      python -m venv "$VENV";
      "$VENV/bin/pip" install --upgrade pip;
      "$VENV/bin/pip" install prefect;
      "$VENV/bin/pip" install "git+https://github.com/ORG/REPO@main#egg=yourpkg";
      exec "$VENV/bin/python" -m prefect flow-run execute {{ flow_run.id }};
      '
```
- If your repo is private, you can:
  - Inject credentials via env (e.g., `GIT_ASKPASS`, `GIT_SSH_COMMAND`) using `work_pool.job_variables.env` (see the sketch below), or
  - Use a Git token in the URL (less ideal).
- This gives you an isolated venv per run and a project install that matches your Git ref. Prefect will still clone the repo later for loading the entrypoint, but your package deps will already be resolved in the venv.
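For the private-repo case, a minimal sketch of passing Git credentials through the `env` job variables, assuming a `GIT_ASKPASS`-compatible helper script already exists on the worker host (the paths are placeholders):

```yaml
work_pool:
  name: my-process-pool
  job_variables:
    # Combine with the `command` override shown above.
    env:
      # Hypothetical helper that prints your token.
      GIT_ASKPASS: /opt/prefect/bin/git-askpass.sh
      # Or, for SSH-based auth:
      # GIT_SSH_COMMAND: "ssh -i /opt/prefect/keys/deploy_key -o IdentitiesOnly=yes"
```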
Option B: Publish wheels to an internal index
- Build your project wheel on CI per commit and host it in an internal index (or artifact store). Then in the pre-run command:
  - Create the venv
  - `pip install --index-url ... yourpkg==<version>`
  - `exec "$VENV/bin/python" -m prefect flow-run execute {{ flow_run.id }}`
- This is fast and repeatable, and still fully isolated per run (a command sketch follows this list).
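A minimal command sketch for Option B, assuming your package is published as `yourpkg` on an internal index at `https://pypi.internal/simple` (both are placeholders):

```yaml
work_pool:
  name: my-process-pool
  job_variables:
    command: >
      bash -lc '
      VENV="$(mktemp -d)/venv";
      python -m venv "$VENV";
      "$VENV/bin/pip" install --upgrade pip;
      "$VENV/bin/pip" install --index-url https://pypi.internal/simple "yourpkg==1.2.3" prefect;
      exec "$VENV/bin/python" -m prefect flow-run execute {{ flow_run.id }};
      '
```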
Option C: Use a Docker work pool
- If you want the clone + `pip install .` semantics, Docker is a better fit: bake deps into the image or do the `pip install .` during build. Each run gets a fresh container, which gives strong isolation without per-run setup time (see the sketch below).
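A sketch of what that could look like in `prefect.yaml`, assuming a Docker work pool named `my-docker-pool` and an image built in CI with your project installed (the names and image tag are placeholders):

```yaml
deployments:
  - name: my-deployment
    entrypoint: path/to/flow.py:flow
    work_pool:
      name: my-docker-pool
      job_variables:
        # Image with Prefect and your project (`pip install .`) baked in at build time.
        image: ghcr.io/org/repo:main
```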
If you really need `pip install .` from the exact cloned checkout with a Process worker, that isn't possible today because the clone occurs inside the flow-run process after your pre-run command has finished. A potential workaround is to pin the commit you deploy and install via Git URL to that same commit in your pre-run command. If you'd like, I can help you wire an env var for the commit/ref into your job variables so the command installs the exact version you deployed.

Docs:
- Customize job variables (including `command`/`env`)
- Flow run execute (the default run command you'll exec): `prefect flow-run execute` (see `prefect flow-run execute --help`)