# ask-community
Hi Prefect team! 👋 I’m encountering an issue with `GitRepository` storage in my Prefect deployment and could use some help. Here’s the setup:
• Context: I’m trying to set up `GitRepository` storage to fetch flow definitions from a GitLab repository, as described in the docs.
• Problem: The storage fails to pull the repository correctly. The first `GitRepository.pull_code` call succeeds because it clones the repo. All subsequent pulls fail with the following error:
ERROR   | prefect.runner.storage.git-repository.data-collection - Failed to pull latest changes with exit code Command '['git', 'pull', 'origin', '--depth', '1']' returned non-zero exit status 128
• What I’ve tried: I manually ran `git clone <repo> --depth 1`, made some changes, and then pulled again with `git pull origin --depth 1`. This also fails, with the error `fatal: refusing to merge unrelated histories`. I also tried setting `git config --global pull.rebase true`, but that doesn’t help.
How can I configure git to work with the Prefect `GitRepository` storage? I’d appreciate any insights or guidance on what might be going wrong. Let me know if you need additional details or logs! Thanks in advance for your support! 🙏
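For reference, the manual steps above can be contrasted with the usual way to refresh a depth-1 clone without a merge: fetch the newest commit, then hard-reset the checkout to it. The sketch below is only an illustration of that git technique, not how `GitRepository` works internally. The function names, the `origin` remote, and the `main` branch are assumptions made for the example, and the hard reset discards any local changes in the clone.

```python
import subprocess
from pathlib import Path


def shallow_update_commands(remote: str, branch: str) -> list[list[str]]:
    """Return the git commands that refresh a depth-1 clone without merging.

    Hypothetical helper for illustration; not part of Prefect.
    """
    return [
        # Fetch only the newest commit of the branch; nothing is merged yet.
        ["git", "fetch", "--depth", "1", remote, branch],
        # Point the checkout at the fetched commit, discarding local changes.
        ["git", "reset", "--hard", "FETCH_HEAD"],
    ]


def update_shallow_clone(
    repo_dir: Path, remote: str = "origin", branch: str = "main"
) -> None:
    """Run the commands above inside repo_dir, raising on the first failure."""
    for command in shallow_update_commands(remote, branch):
        subprocess.run(command, cwd=repo_dir, check=True)
```

Calling `update_shallow_clone(Path("/path/to/clone"))` on an existing shallow clone would then replace the working tree with the latest remote commit. Because no merge is attempted, the "unrelated histories" check never comes into play.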
Here’s the code to set up the `GitRepository`:
prefect_environment.py
import os
import subprocess
from pathlib import Path

from prefect.runner.storage import GitRepository


def _get_project_source():
    return Path(__file__).parent.parent.parent


def get_default_config():
    relative_config = "resources/sources/config.yml"
    return _get_project_source() / relative_config


def get_deamon_source():
    # Running on a server.
    repo = os.environ.get("GITLAB_REPO")
    if repo:
        access_token = os.environ.get("GITLAB_ACCESS_TOKEN")
        credentials = None
        if access_token:
            credentials = {"access_token": access_token}
        # Pass the URL and optional credentials by keyword for clarity.
        return GitRepository(url=repo, credentials=credentials)
    else:
        # Running locally.
        return str(_get_project_source())

def get_git_commit():
    # Return the short hash of HEAD, or None if git reports an error
    # (for example, when the directory is not a git repository).
    try:
        commit_hash = subprocess.check_output(
            ["git", "-C", _get_project_source(), "rev-parse", "--short", "HEAD"],
            text=True,
        ).strip()
        return commit_hash
    except subprocess.CalledProcessError:
        print(f"Could not retrieve git commit for directory: {_get_project_source()}")
        return None
prefect_deamon.py
from datetime import timedelta

from prefect import flow

from data_collection.prefect_environment import get_deamon_source

if __name__ == "__main__":
    deployment = flow.from_source(
        source=get_deamon_source(),
        entrypoint="src/data_collection/prefect_flow.py:download_all",
    )

    # Schedule the deployment to run every day
    deployment.serve(
        name="data-collection-deamon",
        interval=timedelta(hours=24),
    )