<@ULVA73B9P> Why does prefect worker creates neste...
# ask-marvin
s
@Marvin Why does the Prefect worker create nested clones when running `prefect.deployments.steps.git_clone`?
m
Short answer: because `git_clone` clones into a new directory named after the repo inside whatever the current working directory is. If your worker or your pull steps change the working directory to the repo path and then run `git_clone` again (explicitly or implicitly), you’ll end up with repo/repo.

What’s happening
- In Prefect 3.x, `prefect.deployments.steps.pull.git_clone` clones to `<current-working-dir>/<repo-name>` by default; there is no “destination” argument.
- Many workers (e.g., Docker) start in a base working directory (commonly `/opt/prefect/flows`).
- If you:
  - run `git_clone` once, you get `/opt/prefect/flows/<repo>`
  - then set the working directory to that clone and run `git_clone` again (directly or because you added a second `git_clone` in your pull steps), you get `/opt/prefect/flows/<repo>/<repo>` (the “nested clone”).

Common causes
- Adding a manual `git_clone` step when your deployment was already created from a Git source (e.g., via `flow.from_source(...)`), which can include a `git_clone` pull step for you.
- Duplicating `git_clone` in your `pull` list.
- Calling `set_working_directory` to the clone directory, then calling `git_clone` again later in the same pull sequence.

How to prevent nested clones
- Ensure you only clone once.
- If you need to change into the repo after cloning, give the clone step an id and reference its output with `set_working_directory`:
deployments:
- name: my-deployment
  entrypoint: flows/flow.py:flow
  pull:
    - prefect.deployments.steps.pull.git_clone:
        id: clone
        repository: "https://github.com/org/repo.git"
        branch: main
    - prefect.deployments.steps.pull.set_working_directory:
        directory: "{{ clone.directory }}"
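To make the path arithmetic above concrete, here is a minimal sketch in plain Python. It does not call Prefect; `clone_destination` is a hypothetical helper that only models the rule described in this thread (clone target = current working directory + repo name). The exact directory name Prefect derives may differ by version, so treat the naming rule as an assumption.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def clone_destination(cwd: PurePosixPath, repo_url: str) -> PurePosixPath:
    """Hypothetical helper: model the clone target as <cwd>/<repo-name>."""
    # Take the last URL path segment and drop a trailing ".git".
    repo_name = urlparse(repo_url).path.rstrip("/").split("/")[-1]
    if repo_name.endswith(".git"):
        repo_name = repo_name[: -len(".git")]
    return cwd / repo_name


url = "https://github.com/org/repo.git"
base = PurePosixPath("/opt/prefect/flows")

first = clone_destination(base, url)    # clone from the base directory
second = clone_destination(first, url)  # clone again after changing into the first clone

print(first)   # /opt/prefect/flows/repo
print(second)  # /opt/prefect/flows/repo/repo
```

The second call shows why cloning again from inside the first clone yields the repo/repo layout.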
Tips
- Don’t add your own `git_clone` if your deployment source already provides one.
- If you need a custom destination path (since `git_clone` doesn’t take a dest arg), either:
  - set the working dir to the parent you want before cloning, or
  - replace `git_clone` with a `run_shell_script` step that runs `git clone <url> <dest>`.

Docs for reference
- git_clone step
- prefect.yaml and pull steps

If you can share:
- your worker type (Docker, Kubernetes, Process)
- the `pull:` section from your `prefect.yaml`
- and the path you’re seeing in logs

…I can pinpoint exactly where the second clone is coming from.
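If you go the `git clone <url> <dest>` route, a small guard can keep repeated runs from cloning twice. The following is a rough sketch, not Prefect code: `clone_command` and `clone_once` are hypothetical names, the default branch `main` is an assumption, and the check for an existing `.git` directory is just one simple way to detect a previous clone.

```python
import subprocess
import tempfile
from pathlib import Path
from typing import List


def clone_command(repo_url: str, dest: Path, branch: str = "main") -> List[str]:
    """Build a `git clone` invocation that targets an explicit destination."""
    return ["git", "clone", "--branch", branch, repo_url, str(dest)]


def clone_once(repo_url: str, dest: Path, branch: str = "main") -> bool:
    """Clone only if `dest` is not already a Git working tree.

    Returns True if a clone was run, False if it was skipped.
    """
    if (dest / ".git").exists():
        return False  # already cloned: skip instead of cloning again
    subprocess.run(clone_command(repo_url, dest, branch), check=True)
    return True


# Demonstrate the skip path without touching the network.
with tempfile.TemporaryDirectory() as scratch:
    dest = Path(scratch) / "repo"
    (dest / ".git").mkdir(parents=True)  # simulate an existing clone
    print(clone_once("https://github.com/org/repo.git", dest))  # False: clone skipped
```

Because the destination is passed explicitly, the result does not depend on which directory the worker happened to start in.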
s
@Marvin My worker is a Process worker running on:
$ dirs
~/local_clone
local_clone$ git status
On branch <my_dev_branch>
Your branch is up to date with 'origin/<my_dev_branch>'.
And this is my pull step in my `prefect.yaml` driving my deployment:
pull:
    - prefect.deployments.steps.git_clone:
        repository: https://gitlab.com/<my_repo>.git
        branch: <my_dev_branch>
Upon submitting a flow to this deployment the worker raised the following error:
subprocess.CalledProcessError: Command '['git', 'config', '--get', 'remote.origin.url']' returned non-zero exit status 1.
m
Thanks for the details; this helps. What you’re seeing has two overlapping behaviors:

1) Why nested clones happen
- The `git_clone` pull step clones into `<current-working-dir>/<repo-name>` by default.
- If your Process worker is started from inside an existing local clone (e.g., `~/local_clone`) and your deployment runs `git_clone`, it will create `~/local_clone/<repo-name>`, i.e., a nested clone inside your local repo.
- Easiest ways to avoid that:
  - Start the worker from a neutral directory (not inside your repo), or
  - Add an initial `set_working_directory` pull step to a neutral parent folder before running `git_clone`, or
  - If you want to reuse your existing local clone, remove the `git_clone` step and only `set_working_directory` to your existing repo path.

2) Why you got this error
subprocess.CalledProcessError: Command '['git', 'config', '--get', 'remote.origin.url']' returned non-zero exit status 1.
- The `git_clone` step probes the current directory to see if it’s already a Git repo by running `git config --get remote.origin.url`. That command exits 1 if the current directory is not a Git repo (or there’s no `origin` remote).
- On a Process worker, the job may run in a separate working directory from your shell’s `~/local_clone`, which means the probe runs outside a repo and fails.
- Alternatively, if it did run inside a repo that lacks an `origin` remote, you’d see the same error.

Recommended fixes
- Use the fully-qualified step name (safer): `prefect.deployments.steps.pull.git_clone`.
- Pin the working directory before and after the clone. Example:
pull:
  - prefect.deployments.steps.pull.set_working_directory:
      directory: "/home/youruser/prefect-workspace"  # neutral, not a git repo
  - prefect.deployments.steps.pull.git_clone:
      id: clone
      repository: "https://gitlab.com/my_repo.git"
      branch: "<my_dev_branch>"
  - prefect.deployments.steps.pull.set_working_directory:
      directory: "{{ clone.directory }}"
If you want to reuse your existing local clone (no extra clone):
pull:
  - prefect.deployments.steps.pull.set_working_directory:
      directory: "/home/youruser/local_clone"
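If you want to check a directory yourself before pointing the worker at it, the same `git config --get remote.origin.url` command from the traceback can be run by hand. Below is a minimal sketch using only the standard library; `origin_url` is a hypothetical helper, and it is not how Prefect invokes the command internally. It treats any non-zero exit (including the exit status 1 from the error above) as "no origin remote found".

```python
import subprocess
import tempfile
from pathlib import Path
from typing import Optional


def origin_url(path: Path) -> Optional[str]:
    """Return the `origin` remote URL git reports for `path`, or None if there is none."""
    try:
        result = subprocess.run(
            ["git", "config", "--get", "remote.origin.url"],
            cwd=path,
            capture_output=True,
            text=True,
        )
    except FileNotFoundError:
        return None  # the git executable is not installed or not on PATH
    if result.returncode != 0:
        return None  # key not set: this is the case that surfaces as CalledProcessError
    return result.stdout.strip() or None


# A fresh temporary directory is normally not inside any Git repo,
# so the probe typically finds no origin remote here.
with tempfile.TemporaryDirectory() as scratch:
    print(origin_url(Path(scratch)))
```

Calling `origin_url(Path.home() / "local_clone")` would show whether your existing clone has an `origin` remote configured.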
Notes and links
- `git_clone` docs: prefect.deployments.steps.pull.git_clone
- Prefect YAML / pull steps: prefect.yaml guide

To help pinpoint exactly why the probe failed, could you share:
- Your Prefect version (`prefect version`)
- The work pool/job template’s working directory (if you set one)
- The full `pull:` section you’re using now (to confirm step paths)
- Whether your local repo has an `origin` remote (running `git remote -v` in `~/local_clone`)