Fugo Takefusa
09/19/2025, 2:50 AM
Marvin
09/19/2025, 2:50 AM
Marvin
09/19/2025, 2:56 AM
git_clone supports directories, which uses Git sparse-checkout to pull only those subdirectories.
pull:
  - prefect.deployments.steps.git_clone:
      repository: https://github.com/org/repo.git
      branch: main
      # only pull what you need
      directories:
        - flows/
        - libs/shared/
Docs: API: prefect.deployments.steps.pull
- If you need it even faster: do a shallow/partial clone via a shell step
The built-in git_clone does not expose depth/filter flags. You can replace it with a shell step that does a shallow clone with a partial-clone filter:
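pull:
  # run_shell_script takes `script` (not `shell`) and runs each line as its own
  # command rather than in one shell session, so cd, && and variable
  # assignments do not carry over between lines.
  - prefect.deployments.steps.utility.run_shell_script:
      # Shallow + partial + sparse. --branch is needed because --depth
      # implies --single-branch.
      script: |
        git clone --depth=1 --filter=blob:none --sparse --branch main https://github.com/org/repo.git src
        git -C src sparse-checkout set flows libs/shared
  # Run the flow from the cloned directory
  - prefect.deployments.steps.set_working_directory:
      directory: src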
This usually cuts time substantially on large repos.
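If you want to check what the shallow + sparse clone actually leaves on disk before wiring it into a pull step, here is a minimal sketch in plain bash. The helper name sparse_shallow_clone is hypothetical (it is not part of Prefect or Git), and the sketch assumes Git 2.25 or newer for --sparse:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not a Prefect or Git built-in).
# Usage: sparse_shallow_clone REPO BRANCH DEST DIR [DIR...]
sparse_shallow_clone() {
  local repo="$1" branch="$2" dest="$3"
  shift 3
  # --depth=1 fetches a single commit and implies --single-branch,
  # so the branch is selected at clone time with --branch.
  # --filter=blob:none asks the server to defer file contents; servers
  # that do not support filters ignore it with a warning.
  git clone --quiet --depth=1 --filter=blob:none --sparse \
    --branch "$branch" "$repo" "$dest"
  # Materialize only the listed directories in the working tree.
  git -C "$dest" sparse-checkout set "$@"
}
```

For example, `sparse_shallow_clone https://github.com/org/repo.git main src flows libs/shared` followed by `du -sh src` gives you a size to compare against a full clone of the same repo.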
- Skip Git LFS downloads (if you use LFS)
Set GIT_LFS_SKIP_SMUDGE=1 so LFS blobs aren’t pulled during clone (you can fetch specific LFS files later if needed).
deployments:
  - name: my-deployment
    work_pool:
      name: my-pool
      job_variables:
        env:
          GIT_LFS_SKIP_SMUDGE: "1"
Docs: Customize job variables
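For the "fetch specific LFS files later" part, here is a minimal bash sketch, assuming git-lfs is installed wherever the flow runs. The helper names are hypothetical, and so are the paths in the usage line below:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Join arguments with commas; git lfs pull --include takes a
# comma-separated list of paths.
join_by_comma() {
  local IFS=,
  echo "$*"
}

# Clone with LFS smudging disabled, so LFS-tracked files stay as
# small pointer files instead of being downloaded.
clone_without_lfs() {
  local repo="$1" dest="$2"
  GIT_LFS_SKIP_SMUDGE=1 git clone --quiet "$repo" "$dest"
}

# Download the real contents for selected LFS paths only.
fetch_lfs_paths() {
  local dest="$1"
  shift
  git -C "$dest" lfs pull --include="$(join_by_comma "$@")"
}
```

Usage would look something like `clone_without_lfs https://github.com/org/repo.git src` and then `fetch_lfs_paths src models/small.bin`, so only the one file you need is downloaded.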
- Avoid submodules unless necessary
Submodules add extra fetches. Keep include_submodules: false (it’s false by default) or refactor to avoid them.
- Consider packaging code into the image to avoid cloning at runtime
If you’re using Docker/Kubernetes work pools, you can build an image with your flow code baked in so there’s no clone per run. This often eliminates that 20s entirely.
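A minimal sketch of baking the code into an image, written as a bash helper that generates a Dockerfile. The helper name, the base image tag, the /opt/prefect/flows path, and the flows/ and libs/shared/ layout are all assumptions for illustration; swap in your own:

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a Dockerfile into the given build context.
write_flow_dockerfile() {
  local context="$1"
  cat > "$context/Dockerfile" <<'EOF'
# Assumed base image; pin the Prefect/Python version you actually run.
FROM prefecthq/prefect:3-latest
WORKDIR /opt/prefect/flows
# Copy only the directories the flows need.
COPY flows/ ./flows/
COPY libs/shared/ ./libs/shared/
EOF
}
```

You would then run something like `write_flow_dockerfile . && docker build -t my-registry/my-flows:latest .`, point the work pool's image job variable at that tag, and drop the git_clone step from the pull section so nothing is cloned at run time.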
References and docs
- Git-based steps overview and parameters: prefect.deployments.steps.pull
- Deployment config and job variables: prefect.yaml and Customize job variables
A few clarifying questions to tailor the approach:
- Rough repo size? Any Git LFS or large binaries?
- Do you need the whole repo or just specific folders for the flow run?
- Are you on Docker/Kubernetes/local work pool? If containerized, we can suggest a minimal image-based approach that skips cloning entirely.