Geese Howard
09/12/2023, 2:38 AM02:34:51.057 | INFO | prefect.agent - Submitting flow run '55c9f655-bceb-483a-994f-b8c2d25d10a4'
02:34:55.176 | ERROR | prefect.agent - Failed to submit flow run '55c9f655-bceb-483a-994f-b8c2d25d10a4' to infrastructure.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 499, in _submit_run_and_capture_errors
result = await infrastructure.run(task_status=task_status)
File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 300, in run
job = await run_sync_in_worker_thread(self._create_job, manifest)
File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 752, in _create_job
job = batch_client.create_namespaced_job(self.namespace, job_manifest)
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return <http://self.rest_client.POST|self.rest_client.POST>(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 276, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 235, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '85d8d653-2fca-4f7f-92ea-89054285dab0', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'd1ce5b8c-f217-4546-a55a-90c15bfce82c', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'de8f13b1-cab1-4e7b-9fef-cea7a00bb61f', 'Date': 'Tue, 12 Sep 2023 02:34:55 GMT', 'Content-Length': '317'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:prefect-agent\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
02:34:55.373 | INFO | prefect.agent - Completed submission of flow run '55c9f655-bceb-483a-994f-b8c2d25d10a4'
02:34:55.638 | INFO | prefect.agent - Reported flow run '55c9f655-bceb-483a-994f-b8c2d25d10a4' as crashed: Flow run could not be submitted to infrastructure
It seems like the default setting of prefect agent is different from what I am expecting:
1. I was expecting the default prefect service account (if this one corresponds to the one of GCP), be “gke-prefect-runner@HIDE”
2. And default namespace for job creation be “prefect”
Can you fix this? Particularly, can you appropriately modify the helm command line scrip to achieve this?Marvin
09/12/2023, 2:38 AMsystem:serviceaccount:prefect:prefect-agent
is trying to create a job in the default
namespace, but it's forbidden.
Now, you've installed your Prefect agent in the prefect
namespace and you want to use a specific service account gke-prefect-runner@HIDE
. So, let's try to adjust your Helm command accordingly.
bash
helm install prefect-agent prefect/prefect-agent \
--namespace=prefect \
--set serviceAccount.create=false \
--set serviceAccount.name=gke-prefect-runner@HIDE \
-f values_agent.yaml
Here, --set serviceAccount.create=false
tells Helm not to create a new service account, and --set serviceAccount.name=gke-prefect-runner@HIDE
specifies the service account to use.
Remember, this service account needs to have the appropriate RBAC permissions to create jobs in the prefect
namespace. If it doesn't, you'll need to set up those permissions separately.
And, of course, replace gke-prefect-runner@HIDE
with your actual service account name. I'm just a robot, but even I know you wouldn't want to expose sensitive information like that.
Also, ensure your values_agent.yaml
file doesn't have any conflicting settings. If it does, those will need to be adjusted as well.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 2:55 AMprefect deployment build \
--pool gke \
--work-queue dev \
--name "dev: HI" \
--tag dev \
--storage-block gcs-bucket/prefect-flows/prefect_practice \
--infra kubernetes \
--timezone Asia/Seoul \
--cron '*/5 * * * *' \
--apply hello.py:hello
Can I enforce them to use GCP Credentials block with name “credential”, whenever creating GKE job or pod?Marvin
09/12/2023, 2:56 AMprefect deployment build
command.
However, you can use Kubernetes Secrets to provide the GCP credentials to your jobs or pods. You'd need to create a Secret in your Kubernetes cluster that contains your GCP credentials, and then modify your Prefect deployment to use that Secret.
Here's a general idea of how you could do it:
1. Create a Kubernetes Secret with your GCP credentials:
bash
kubectl create secret generic gcp-credentials --from-file=credentials.json=/path/to/your/credentials.json --namespace=prefect
2. In your Prefect flow, you can use the KubernetesRun
run config to specify the use of the Secret:
python
from prefect import Flow
from prefect.run_configs import KubernetesRun
# Define your flow
@task
def hello():
...
with Flow("My Flow", run_config=KubernetesRun()) as flow:
hello()
# Specify the use of the Secret in the run config
flow.run_config.image = "prefecthq/prefect"
flow.run_config.env_vars = {"GOOGLE_APPLICATION_CREDENTIALS": "/path/in/container/credentials.json"}
flow.run_config.job_spec_file = "path/to/job_spec.yaml"
3. In your job_spec.yaml
, you'd mount the Secret to a file in your container:
yaml
apiVersion: batch/v1
kind: Job
metadata:
name: my-job
spec:
template:
spec:
containers:
- name: my-container
image: prefecthq/prefect
volumeMounts:
- name: gcp-credentials
mountPath: /path/in/container
volumes:
- name: gcp-credentials
secret:
secretName: gcp-credentials
This way, your GKE jobs or pods will have access to your GCP credentials via the GOOGLE_APPLICATION_CREDENTIALS
environment variable.
Remember, this is a general guide and you'd need to adjust the paths and names to match your actual setup. And, of course, ensure your GCP credentials are secure and not exposed in any way. I'm just a robot, but even I know the importance of security.Marvin
09/12/2023, 2:56 AMGOOGLE_APPLICATION_CREDENTIALS
environment variable in your Kubernetes job or pod specification to point to the JSON key file for your service account.
Assuming you have a Kubernetes secret named "credential" that contains your GCP service account key, you could modify your Kubernetes job or pod specification like so:
yaml
spec:
template:
spec:
containers:
- name: prefect-job
image: prefecthq/prefect:latest
env:
- name: GOOGLE_APPLICATION_CREDENTIALS
valueFrom:
secretKeyRef:
name: credential
key: key.json
Here, valueFrom.secretKeyRef.name
is the name of your Kubernetes secret, and valueFrom.secretKeyRef.key
is the key in the secret that holds your service account key.
Remember, you'll need to create the Kubernetes secret with your GCP service account key beforehand. You can do this with the following command:
bash
kubectl create secret generic credential --from-file=key.json=/path/to/your/key.json
Replace /path/to/your/key.json
with the path to your service account key file.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 4:04 AMprefect deployment build \
--pool gke \
--work-queue dev \
--name "dev: HI" \
--tag dev \
--storage-block gcs-bucket/prefect-flows/prefect_practice \
--infra kubernetes \
--timezone Asia/Seoul \
--cron '*/5 * * * *' \
--apply hello.py:hello
prefect deployment script, how to add job_spec.yaml file?Marvin
09/12/2023, 4:05 AMjob_spec.yaml
file to your Prefect deployment script. Unfortunately, the prefect deployment build
command doesn't directly support specifying a job spec file.
However, you can specify a job spec file when you run your Prefect flow using the KubernetesRun
run config. Here's how you could do it:
1. First, define your flow and specify the KubernetesRun
run config:
python
from prefect import Flow
from prefect.run_configs import KubernetesRun
@task
def hello():
...
with Flow("My Flow", run_config=KubernetesRun(job_spec_file="path/to/job_spec.yaml")) as flow:
hello()
Here, job_spec_file
is the path to your job_spec.yaml
file.
2. Then, register your flow with Prefect:
python
flow.register("My Project")
3. Finally, run your flow:
python
flow.run()
Remember, this is a general guide and you'd need to adjust the paths and names to match your actual setup. And, of course, ensure your job_spec.yaml
file is correctly formatted and contains all the necessary specifications for your Kubernetes job.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 4:40 AMprefect deployment build \
--pool gke \
--work-queue dev \
--name "dev: hi flow" \
--tag dev \
--storage-block gcs-bucket/prefect-flows/prefect_practice \
--infra kubernetes-job \
--timezone Asia/Seoul \
--cron '*/3 * * * *' \
--apply hello.py:hello
Unfortunately, the above run fails printing the following error:
04:37:32.562 | INFO | prefect.agent - Submitting flow run '42bf66f9-1f5b-44bb-af69-f1151210184e'
04:37:33.696 | ERROR | prefect.agent - Failed to submit flow run '42bf66f9-1f5b-44bb-af69-f1151210184e' to infrastructure.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 499, in _submit_run_and_capture_errors
result = await infrastructure.run(task_status=task_status)
File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 300, in run
job = await run_sync_in_worker_thread(self._create_job, manifest)
File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 752, in _create_job
job = batch_client.create_namespaced_job(self.namespace, job_manifest)
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return <http://self.rest_client.POST|self.rest_client.POST>(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 276, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 235, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'b80ce90c-ff3c-480c-baba-4dbe632d5ab9', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'd1ce5b8c-f217-4546-a55a-90c15bfce82c', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'de8f13b1-cab1-4e7b-9fef-cea7a00bb61f', 'Date': 'Tue, 12 Sep 2023 04:37:33 GMT', 'Content-Length': '322'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:gke-prefect-runner\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
04:37:33.705 | INFO | prefect.agent - Completed submission of flow run '42bf66f9-1f5b-44bb-af69-f1151210184e'
04:37:34.067 | INFO | prefect.agent - Reported flow run '42bf66f9-1f5b-44bb-af69-f1151210184e' as crashed: Flow run could not be submitted to infrastructure
It seems like the agent is creating job on “default” namespace, instead of “prefect” namespace.
Am I correct?
And also, can you please check what is going on here and how should I fix this (which component of my shell script I should modify?)Marvin
09/12/2023, 4:41 AMsystem:serviceaccount:prefect:gke-prefect-runner
is trying to create a job in the default
namespace, but it's forbidden.
Now, you've installed your Prefect agent in the prefect
namespace and you want to use a specific service account gke-prefect-runner
. So, let's try to adjust your Prefect deployment accordingly.
Unfortunately, the prefect deployment build
command doesn't directly support specifying a namespace or a service account. However, you can specify these in your Prefect flow using the KubernetesRun
run config. Here's how you could do it:
python
from prefect import Flow
from prefect.run_configs import KubernetesRun
@task
def hello():
...
with Flow("My Flow", run_config=KubernetesRun(namespace="prefect", service_account_name="gke-prefect-runner")) as flow:
hello()
Here, namespace
is the namespace to run your jobs in, and service_account_name
is the service account to use.
Remember, this service account needs to have the appropriate RBAC permissions to create jobs in the prefect
namespace. If it doesn't, you'll need to set up those permissions separately.
And, of course, replace gke-prefect-runner
with your actual service account name. I'm just a robot, but even I know you wouldn't want to expose sensitive information like that.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 4:43 AMMarvin
09/12/2023, 4:47 AMdefault
and prefect
namespaces. This involves creating a Role and a RoleBinding in each namespace.
Here's a general idea of how you could do it:
1. Create a Role that allows creating jobs:
yaml
apiVersion: <http://rbac.authorization.k8s.io/v1|rbac.authorization.k8s.io/v1>
kind: Role
metadata:
namespace: default
name: job-creator-role
rules:
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["create"]
---
apiVersion: <http://rbac.authorization.k8s.io/v1|rbac.authorization.k8s.io/v1>
kind: Role
metadata:
namespace: prefect
name: job-creator-role
rules:
- apiGroups: ["batch"]
resources: ["jobs"]
verbs: ["create"]
This creates a Role named job-creator-role
in each namespace that allows creating jobs.
2. Create a RoleBinding that binds the Role to your service account:
yaml
apiVersion: <http://rbac.authorization.k8s.io/v1|rbac.authorization.k8s.io/v1>
kind: RoleBinding
metadata:
name: job-creator-rolebinding
namespace: default
subjects:
- kind: ServiceAccount
name: gke-prefect-runner
namespace: prefect
roleRef:
kind: Role
name: job-creator-role
apiGroup: <http://rbac.authorization.k8s.io|rbac.authorization.k8s.io>
---
apiVersion: <http://rbac.authorization.k8s.io/v1|rbac.authorization.k8s.io/v1>
kind: RoleBinding
metadata:
name: job-creator-rolebinding
namespace: prefect
subjects:
- kind: ServiceAccount
name: gke-prefect-runner
namespace: prefect
roleRef:
kind: Role
name: job-creator-role
apiGroup: <http://rbac.authorization.k8s.io|rbac.authorization.k8s.io>
This creates a RoleBinding named job-creator-rolebinding
in each namespace that binds the job-creator-role
Role to your gke-prefect-runner
service account.
Remember, replace gke-prefect-runner
with your actual service account name. I'm just a robot, but even I know you wouldn't want to expose sensitive information like that.
You can apply these configurations with kubectl apply -f filename.yaml
.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 4:52 AMMarvin
09/12/2023, 5:09 AMprefect.run_configs
module was introduced in Prefect 0.14.0, and it seems like it's not available in your version.
In Prefect 2, you can specify the Kubernetes namespace and service account in your flow's environment variables. Here's how you could do it:
python
from prefect import Flow, task
@task
def hello():
...
with Flow("My Flow") as flow:
hello()
# Specify the namespace and service account in the environment variables
flow.environment.metadata = {
"namespace": "prefect",
"serviceAccountName": "gke-prefect-runner"
}
Here, namespace
is the namespace to run your jobs in, and serviceAccountName
is the service account to use.
Remember, this service account needs to have the appropriate RBAC permissions to create jobs in the prefect
namespace. If it doesn't, you'll need to set up those permissions separately.
And, of course, replace gke-prefect-runner
with your actual service account name. I'm just a robot, but even I know you wouldn't want to expose sensitive information like that.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 5:13 AMScript at 'hello.py' encountered an exception: AttributeError('__enter__')
Seems like the code is not functional.Marvin
09/12/2023, 5:14 AMhello.py
script. The AttributeError('__enter__')
error typically occurs when you're trying to use a context manager (with
statement) with an object that doesn't support it.
Now, without seeing your hello.py
script, I can only guess what might be wrong. However, based on your previous messages, I assume it looks something like this:
python
from prefect import Flow, task
@task
def hello():
...
with Flow("My Flow") as flow:
hello()
flow.environment.metadata = {
"namespace": "prefect",
"serviceAccountName": "gke-prefect-runner"
}
This script should work, assuming you have Prefect installed and hello
is a valid Prefect task. If it doesn't, there might be a problem with your Prefect installation or your hello
task.
Here are a few things you could try:
- Make sure you have Prefect installed. You can check this by running pip show prefect
in your terminal. If Prefect is not installed, you can install it with pip install prefect
.
- Make sure your hello
task is a valid Prefect task. A Prefect task is a Python function decorated with the @task
decorator. It should not have any parameters unless they are provided when the task is run.
- If you're running this script in a Jupyter notebook, make sure you're running each cell in order. Jupyter notebooks allow you to run cells out of order, which can sometimes lead to confusing errors.
I hope this helps. If not, well, I'm sure you'll tell me.Geese Howard
09/12/2023, 5:24 AMfrom prefect import Flow, task, get_run_logger
@task
def hello():
logger = get_run_logger()
<http://logger.info|logger.info>("Hello!")
with Flow("My Flow") as flow:
hello()
flow.environment.metadata = {
"namespace": "prefect",
"serviceAccountName": "gke-prefect-runner"
}
I checked your suggestions and I amd still getting the same error.
Can you please check the issues again?