Hi, I'm trying to use KubernetesRun config with Gi...
# ask-community
a
Hi, I'm trying to use KubernetesRun config with GitHub flow storage and getting a `Failed to load and execute Flow's environment: FileNotFoundError(2, 'No such file or directory')` error when running the flow with the `job_template_path` option for KubernetesRun. I can successfully register the flow, and when running the flow it seems to be respecting the `template.yml` I passed in (I see my Kubernetes cluster running an appropriate pod based on my template), but after pulling the image I get the `FileNotFoundError`. Thoughts on what is going on? My flow basically looks like this:
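from prefect import Flow
from prefect.run_configs import KubernetesRun
from prefect.storage import GitHub

with Flow("Test") as test_flow:
    ...

# Job template given as a path relative to where this script is evaluated
test_flow.run_config = KubernetesRun(
    job_template_path="template.yml"
)

# Script-based storage: the flow file is downloaded from the repo at runtime
test_flow.storage = GitHub(
    repo="<path>",
    path="flows/test_flow.py",
    access_token_secret="GITHUB_ACCESS_TOKEN"
)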
If I use `job_template` instead of `job_template_path`, the flow runs successfully, e.g.:
test_flow.run_config = KubernetesRun(
    job_template="""
apiVersion: batch/v1
kind: Job
spec:
  template:
    spec:
      containers:
        ...

"""
)
Logs from when it errored:
13:44:54 | INFO | GitHub | Downloading flow from GitHub storage - repo: '<REPO>, path: 'flows/test_flow.py', ref: 'main'
13:44:55 | INFO | GitHub | Flow successfully downloaded. Using commit: 5eb783e29559bb4abf88af04d9889d356df03875
13:44:57 | ERROR | execute flow-run | Failed to load and execute Flow's environment: FileNotFoundError(2, 'No such file or directory')
13:45:19 | ERROR | k8s-infra | Pod prefect-job-0ef90fdd-4mpx7 failed.
	Container 'test_flow' state: terminated
		Exit Code:: 1
		Reason: Error
k
I think this is similar to this .
I believe that `job_template_path` is relative to where the agent is running, which is why the example uses S3 for it.
a
@Kevin Kho thanks, I will try using GCS storage for it. the documentation at https://docs.prefect.io/api/latest/run_configs.html#kubernetesrun says:
`job_template_path (str, optional)`: Path to a job template to use. If a local path (no file scheme, or a `file`/`local` scheme), the job template will be loaded on initialization and stored on the `KubernetesRun` object as the `job_template` field. Otherwise the job template will be loaded at runtime on the agent. Supported runtime file schemes include (`s3`, `gcs`, and `agent` (for paths local to the runtime agent)).
this seems to imply that `job_template_path` should be getting converted into `job_template` and stored with the run config if it's a local path, and only use a path relative to the agent if the `agent://` file scheme is used. is that the intended behavior? right now, it seems to be expecting the file to exist on the agent even when using a local file path.
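For reference, the rule in the quoted docs can be sketched as a small classifier. This only restates the documented wording; it is not Prefect's actual implementation, and `classify_template_path` and the two scheme sets are hypothetical names made up for illustration.

```python
from urllib.parse import urlparse

# Schemes the quoted docs describe as local (template loaded on initialization).
LOCAL_SCHEMES = {"", "file", "local"}
# Schemes the quoted docs describe as loaded at runtime on the agent.
RUNTIME_SCHEMES = {"s3", "gcs", "agent"}


def classify_template_path(path: str) -> str:
    """Hypothetical helper: report when the docs say the template is loaded."""
    scheme = urlparse(path).scheme
    if scheme in LOCAL_SCHEMES:
        return "initialization"
    if scheme in RUNTIME_SCHEMES:
        return "agent-runtime"
    raise ValueError(f"Unsupported scheme: {scheme!r}")
```

Under this reading, `template.yml` has no scheme and would be loaded on initialization, while `agent://...` or `s3://...` paths would be resolved later by the agent.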
k
Gotcha seems like you are right. Let me ask the team and get back to you on this.
a
great, thank you!
k
Ok so here is what happened. GitHub storage is a script-based storage, meaning there is no build. The job_template gets read in during the storage build. So if you use storage like Docker/S3 (not as script), then it gets read in at registration time. For script-based storage, it's loaded at runtime, and then it looks for that file and it's not there. For script-based storage, this relative path would be evaluated on the agent side. I will open an issue to clarify the docs around this.
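A minimal sketch of the general effect described here, using plain Python rather than Prefect: the same relative path resolves fine in the directory where the file exists (standing in for the registration machine) and raises `FileNotFoundError` with errno 2 in a directory where it does not (standing in for the runtime environment). The helper names and the two temporary directories are assumptions made up for the illustration.

```python
import errno
import tempfile
from pathlib import Path


def read_template(relative_path: str, working_dir: str) -> str:
    """Resolve relative_path against working_dir and return the file's text."""
    return (Path(working_dir) / relative_path).read_text()


def demo():
    """Return (template text read at 'registration', whether 'runtime' hit ENOENT)."""
    with tempfile.TemporaryDirectory() as registration_dir, \
            tempfile.TemporaryDirectory() as runtime_dir:
        # The template only exists where the flow was registered.
        template = Path(registration_dir) / "template.yml"
        template.write_text("apiVersion: batch/v1\nkind: Job\n")

        at_registration = read_template("template.yml", registration_dir)
        try:
            read_template("template.yml", runtime_dir)
            missing_at_runtime = False
        except FileNotFoundError as exc:
            # Same error class and errno as in the flow-run logs above.
            missing_at_runtime = exc.errno == errno.ENOENT
        return at_registration, missing_at_runtime
```

Calling `demo()` reads the template in the first directory and reports the missing file in the second, which mirrors why a template that is only read at runtime has to exist wherever that read happens.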
a
i see, thanks. i think using pickle-based GCS storage will work for us, so i'll give that a try. currently it's only possible to specify a single `path=<file>.py` for GitHub storage, right? if it were possible to specify additional files to include for script-based storage, that could be useful for cases like this.
k
Git Storage might do it for you since it clones the whole repo.
a
I gave Git storage a try yesterday and unfortunately still ran into the same File Not Found error. If it's trying to find the file relative to the agent, I'm wondering if the problem may be because I'm running on Kubernetes with the agent on a separate pod. GCS storage worked great, so I've moved to that.
k
Sorry about that! The agent should be responsible for managing those file paths for external files. We’ll look into it for sure and work to make Git storage the go-to way to achieve this. Thanks for reporting back!
Could you give the RunConfig and Storage configurations from when you tried Git storage?
a
sure, i'll post my config here shortly. thanks for looking into it!
here is my config:
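from prefect import Flow
from prefect.run_configs import KubernetesRun
from prefect.storage import Git
from prefect.tasks.shell import ShellTask

with Flow("Template") as template_flow:
    task = ShellTask()
    result = task(command="echo Hello World")

# Job template given as a path relative to the repo root
template_flow.run_config = KubernetesRun(
    job_template_path="flows/basic_template.yaml"
)

# Git storage clones the repo at runtime
template_flow.storage = Git(
    repo="<repo>",
    flow_path="flows/template_flow.py",
    git_token_secret_name="GITHUB_ACCESS_TOKEN",
    git_token_username="<username>",
    branch_name="main",
)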
Logs from attempting to run:
10:34:17 | INFO | agent | Submitted for execution: Job prefect-job-dbd5d6a1
10:34:19 | ERROR | execute flow-run | Failed to load and execute Flow's environment: FileNotFoundError(2, 'No such file or directory')
k
Thank you!