<@ULVA73B9P> when I use pull: - prefect.depl...
# ask-marvin
f
@Marvin when I use pull: - prefect.deployments.steps.git_clone:, it takes about 20 seconds to clone. Is there any way to reduce the time? Maybe add some ignore file to the repo?
m
thought for 390.2 seconds
Short answer: .gitignore won’t help. Git downloads whatever is tracked in the remote history; ignore files only keep untracked local files from being added. Here are practical ways to speed up cloning with Prefect 3.x:
- Use the directories parameter (sparse checkout). If you only need parts of the repo, git_clone supports directories, which uses Git sparse-checkout to pull only those subdirectories.
pull:
    - prefect.deployments.steps.git_clone:
        repository: https://github.com/org/repo.git
        branch: main
        # only pull what you need
        directories:
          - flows/
          - libs/shared/
Docs: API: prefect.deployments.steps.pull
- If you need it even faster, do a shallow/partial clone via a shell step. The built-in git_clone does not expose depth/filter flags. You can replace it with a shell step that uses shallow-clone and partial-clone filters:
pull:
    - prefect.deployments.steps.utility.run_shell_script:
        # The parameter is `script`. Each line is run as its own command rather than
        # in one shell session, so avoid `cd`, `set`, shell variables, and comment lines.
        # Shallow + partial + sparse clone of a single branch into ./src.
        # Later steps and the flow entrypoint then need to resolve against src/.
        script: |
          git clone --depth=1 --branch main --filter=blob:none --sparse https://github.com/org/repo.git src
          git -C src sparse-checkout set flows libs/shared
This usually cuts time substantially on large repos.
- Skip Git LFS downloads (if you use LFS). Set GIT_LFS_SKIP_SMUDGE=1 so LFS blobs aren’t pulled during clone (you can fetch specific LFS files later if needed).
deployments:
    - name: my-deployment
      work_pool:
        name: my-pool
        # job_variables belongs under work_pool in prefect.yaml
        job_variables:
          env:
            GIT_LFS_SKIP_SMUDGE: "1"
Docs: Customize job variables
- Avoid submodules unless necessary. Submodules add extra fetches. Keep include_submodules: false (it’s false by default) or refactor to avoid them.
- Consider packaging code into the image to avoid cloning at runtime. If you’re using Docker/Kubernetes work pools, you can build an image with your flow code baked in so there’s no clone per run. This often eliminates that 20s entirely.
References and docs
- Git-based steps overview and parameters: prefect.deployments.steps.pull
- Deployment config and job variables: prefect.yaml and Customize job variables
A few clarifying questions to tailor the approach:
- Rough repo size? Any Git LFS or large binaries?
- Do you need the whole repo or just specific folders for the flow run?
- Are you on Docker/Kubernetes/local work pool? If containerized, we can suggest a minimal image-based approach that skips cloning entirely.
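If you want to check the two Git behaviors this answer relies on (an ignore rule does not remove already-tracked files from a clone, and sparse checkout only materializes the directories you ask for), here is a minimal local experiment. It is only a sketch: the directory and file names (flows, docs, asset.bin) are made up for illustration, it assumes git 2.25 or newer, and it shows plain git behavior rather than the exact commands Prefect runs internally.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Throwaway workspace; nothing here touches a real repository.
work="$(mktemp -d)"
# Identity flags so commits work on machines without a global git config.
ident=(-c user.name=demo -c user.email=demo@example.com)

# Build a tiny stand-in "remote" with two top-level directories (hypothetical names).
git init -q "$work/origin"
mkdir -p "$work/origin/flows" "$work/origin/docs"
echo "print('hello')" > "$work/origin/flows/flow.py"
echo "large asset" > "$work/origin/docs/asset.bin"
git -C "$work/origin" add .
git -C "$work/origin" "${ident[@]}" commit -q -m "initial commit"

# Add an ignore rule for a file that is already tracked.
echo "docs/asset.bin" > "$work/origin/.gitignore"
git -C "$work/origin" add .gitignore
git -C "$work/origin" "${ident[@]}" commit -q -m "ignore asset"

# 1) A normal clone still contains the ignored-but-tracked file.
git clone -q "$work/origin" "$work/full"

# 2) A sparse clone only checks out the requested directory.
git clone -q --sparse "$work/origin" "$work/sparse"
git -C "$work/sparse" sparse-checkout set flows

echo "full clone:   $(ls "$work/full" | tr '\n' ' ')"
echo "sparse clone: $(ls "$work/sparse" | tr '\n' ' ')"
```

Because this clones from a local path, it shows what ends up in the working tree, not how much is downloaded. Against a real remote, sparse checkout mainly reduces transfer when combined with a partial clone filter such as --filter=blob:none.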