Jackson Nguyen — 09/22/2025, 6:59 PM
Marvin — 09/22/2025, 7:06 PM
1) Project-based prefect.yaml with pull steps
name: my-project

pull:
  - prefect.deployments.steps.pull.git_clone:
      repository: https://github.com/acme/data-pipelines.git
      branch: main
  - prefect.deployments.steps.utility.pip_install_requirements:
      requirements_file: requirements.txt

deployments:
  - name: prod
    entrypoint: flows/etl.py:etl
    work_pool:
      name: process-pool
- Then, from the project root run:
prefect deploy
Notes
- This clones fresh on each run, so you always get HEAD of main.
- For private repos, pass an access_token or a credentials block to git_clone.
- Docs: Pull steps (git_clone); Deployment versioning: pin to branch vs commit
2) Python API with Flow.from_source + GitRepository
- Define your deployment to load from a GitRepository that tracks main:
from prefect import Flow
from prefect.runner.storage import GitRepository

# The flow itself lives in the repo at flows/etl.py;
# no local definition is needed when loading from source.
repo = GitRepository(
    url="https://github.com/acme/data-pipelines.git",
    branch="main",        # track the main branch
    # commit_sha=None,    # leave unset to follow the branch head
    # pull_interval=60,   # seconds; cached-clone refresh cadence
)

Flow.from_source(
    source=repo,
    entrypoint="flows/etl.py:etl",  # path:function inside the repo
).deploy(
    name="prod",
    work_pool_name="process-pool",
)
Notes
- Prefect maintains a cached clone and refreshes it periodically; pull_interval (default 60s) controls how often it checks for updates. If you need the freshest code on every single run, either lower this interval aggressively or use the pull-step approach above, which clones each time.
- For rollbacks, set commit_sha to a specific commit; to resume tracking the latest, remove it and keep branch="main".
- Docs: Deployment versioning: pin to branch vs commit
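One hedged pattern (not a Prefect API, just a sketch) for making rollbacks a redeploy rather than a code edit: read the pinned commit from an environment variable and build the GitRepository keyword arguments from it. The variable name ETL_COMMIT_SHA and the helper are assumptions for illustration.

```python
import os

def git_ref_kwargs(url: str, env_var: str = "ETL_COMMIT_SHA") -> dict:
    """Build kwargs for GitRepository(**kwargs): pin a commit if the
    environment variable is set, otherwise track the branch head."""
    kwargs = {"url": url}
    sha = os.environ.get(env_var)
    if sha:
        kwargs["commit_sha"] = sha   # pinned: reproducible rollback target
    else:
        kwargs["branch"] = "main"    # unset: keep following main's HEAD
    return kwargs
```

Pass the result as GitRepository(**git_ref_kwargs(repo_url)) in the deploy script above; unsetting the variable and redeploying resumes tracking main.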
Caveats by worker type
- Process workers
  - Ensure git is installed on the worker machine and that it has network access to your repo.
  - For private repos, provide credentials via the pull step's access_token or a Git credentials block.
- Docker workers
  - With either pattern above, the clone happens inside the container at runtime, so make sure your base image has git installed and can reach your repo.
  - Example Dockerfile snippet:
    FROM prefecthq/prefect:3-python3.11
    RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
  - Install your Python dependencies either with the pip_install_requirements pull step or bake them into the image.
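If you choose to bake dependencies into the image instead of installing them at runtime, one way to extend the snippet above (assuming requirements.txt sits at the build-context root) is:

```dockerfile
FROM prefecthq/prefect:3-python3.11
# git is needed for the runtime clone performed by the pull step / GitRepository
RUN apt-get update && apt-get install -y git && rm -rf /var/lib/apt/lists/*
# Bake Python dependencies into the image so runs skip pip_install_requirements
COPY requirements.txt /tmp/requirements.txt
RUN pip install --no-cache-dir -r /tmp/requirements.txt
```

With deps baked in, you can drop the pip_install_requirements pull step and keep only git_clone.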
- Kubernetes workers
  - Same as Docker: ensure the runtime image has git and network access, and use pull steps or GitRepository storage. If you bake code into images instead, you'll need CI to build and push on each commit and to ensure pods pull the new image.
Which should you choose?
- Want absolute “latest on every run” with no caching? Use the pull step approach; it will clone HEAD of main at the start of each run.
- Want to reuse a cached clone that refreshes frequently? Use Flow.from_source with GitRepository(branch="main") and tune pull_interval.
If you share which worker type you’re using (process, Docker, Kubernetes) and whether your repo is private, I can tailor an exact config (including credentials handling and a minimal Dockerfile if needed).