Geese Howard
09/08/2023, 9:25 AM
08:49:59.087 | INFO | prefect.agent - Completed submission of flow run 'c08d9e21-1ad4-4483-ad2c-ea16ba825077'
08:49:59.347 | INFO | prefect.agent - Reported flow run 'c08d9e21-1ad4-4483-ad2c-ea16ba825077' as crashed: Flow run could not be submitted to infrastructure
08:59:53.946 | INFO | prefect.agent - Submitting flow run '6aa5c78a-52d1-47c3-b3ef-176e01387fe4'
08:59:55.110 | ERROR | prefect.agent - Failed to submit flow run '6aa5c78a-52d1-47c3-b3ef-176e01387fe4' to infrastructure.
Traceback (most recent call last):
File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 499, in _submit_run_and_capture_errors
result = await infrastructure.run(task_status=task_status)
File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 300, in run
job = await run_sync_in_worker_thread(self._create_job, manifest)
File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 91, in run_sync_in_worker_thread
return await anyio.to_thread.run_sync(
File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 33, in run_sync
return await get_asynclib().run_sync_in_worker_thread(
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 877, in run_sync_in_worker_thread
return await future
File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 807, in run
result = context.run(func, *args)
File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 752, in _create_job
job = batch_client.create_namespaced_job(self.namespace, job_manifest)
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
return self.create_namespaced_job_with_http_info(namespace, body, **kwargs) # noqa: E501
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
return self.api_client.call_api(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
return self.__call_api(resource_path, method,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
response_data = self.request(
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
return self.rest_client.POST(url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 276, in POST
return self.request("POST", url,
File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 235, in request
raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': 'a9367b68-7123-4acc-b53a-0b55e1b65695', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': 'd1ce5b8c-f217-4546-a55a-90c15bfce82c', 'X-Kubernetes-Pf-Prioritylevel-Uid': 'de8f13b1-cab1-4e7b-9fef-cea7a00bb61f', 'Date': 'Fri, 08 Sep 2023 08:59:55 GMT', 'Content-Length': '317'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:prefect-agent\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
It seems like there is a service account related issue, but the service account shown in the error looks different from the Google Cloud service account?
Can you help me?
Marvin
09/08/2023, 9:25 AM
The service account system:serviceaccount:prefect:prefect-agent doesn't have the necessary permissions to create jobs in the batch API group within the default namespace.
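To make that reading of the error concrete, here is a minimal sketch that pulls the relevant fields out of the response body shown in the log. The function name `explain_forbidden` and the regular expression are illustrative assumptions, not part of Prefect or the Kubernetes client, and the pattern only targets this particular message shape.

```python
import json
import re

# Response body copied verbatim from the ApiException in the log above.
BODY = r'{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:prefect-agent\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}'

# Hypothetical pattern for a namespaced RBAC denial involving a service account.
PATTERN = re.compile(
    r'User "system:serviceaccount:(?P<sa_namespace>[^:"]+):(?P<sa_name>[^"]+)" '
    r'cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<api_group>[^"]*)" '
    r'in the namespace "(?P<namespace>[^"]+)"'
)


def explain_forbidden(body: str) -> dict:
    """Return who was denied which verb on which resource, and where."""
    status = json.loads(body)
    match = PATTERN.search(status["message"])
    if match is None:
        raise ValueError("message does not look like a namespaced RBAC denial")
    return match.groupdict()


info = explain_forbidden(BODY)
print(info)
```

The point to notice is that the service account lives in the prefect namespace while the job is being created in the default namespace, so any Role granting access has to exist in default.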
This could be due to a couple of reasons:
1. The service account doesn't have the necessary Role or ClusterRole bound to it. You'll need to ensure that the prefect-agent service account has a RoleBinding or ClusterRoleBinding that grants it the necessary permissions.
2. The service account being used by the Prefect agent might not be the one you're expecting. The Prefect Kubernetes agent can be configured to use a specific service account via the --service-account-name flag. If you're not setting this, it might be falling back to a default service account.
You might want to check the RBAC settings for your cluster and the configuration of your Prefect agent.
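One way to check the RBAC side is to ask the API server directly whether the service account may perform the action. The sketch below builds a SubjectAccessReview for the exact request that failed. The helper names `build_access_review` and `is_allowed` are assumptions for illustration. `is_allowed` additionally assumes the `kubernetes` Python package is installed and that your own credentials are allowed to create access reviews.

```python
import json


def build_access_review(sa_namespace, sa_name, verb, group, resource, namespace):
    """Build a SubjectAccessReview body for a service account (plain dict)."""
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": {
            "user": f"system:serviceaccount:{sa_namespace}:{sa_name}",
            "resourceAttributes": {
                "namespace": namespace,
                "verb": verb,
                "group": group,
                "resource": resource,
            },
        },
    }


def is_allowed(review: dict) -> bool:
    """Submit the review to the cluster. Not called here: needs cluster access."""
    from kubernetes import client, config  # imported lazily on purpose

    config.load_kube_config()
    response = client.AuthorizationV1Api().create_subject_access_review(body=review)
    return bool(response.status.allowed)


# The request from the traceback: prefect-agent creating jobs.batch in "default".
review = build_access_review(
    "prefect", "prefect-agent", "create", "batch", "jobs", "default"
)
print(json.dumps(review, indent=2))
```

With cluster access, `is_allowed(review)` should return False before the fix and True afterwards. The command-line equivalent is `kubectl auth can-i create jobs.batch --namespace default --as system:serviceaccount:prefect:prefect-agent`.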
I hope this helps, or at least provides a momentary distraction from the crushing existential dread of existence.

The service account system:serviceaccount:prefect:prefect-agent doesn't have the necessary permissions to create jobs in the default namespace. This is a Kubernetes service account, not a Google Cloud service account, so they are indeed different.
You need to ensure that the prefect-agent service account has the batch/jobs permissions in the default namespace. You can do this by creating a Role and a RoleBinding in Kubernetes.
Here's a rough example:
```yaml
# Role: operations on Jobs, scoped to the "default" namespace
kind: Role
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  namespace: default
  name: job-creator-role
rules:
  - apiGroups: ["batch", ""]
    resources: ["jobs"]
    verbs: ["get", "watch", "list", "create", "update", "patch", "delete"]
---
# RoleBinding: grants the Role to the agent's service account,
# which itself lives in the "prefect" namespace
kind: RoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: job-creator-role-binding
  namespace: default
subjects:
  - kind: ServiceAccount
    name: prefect-agent
    namespace: prefect
roleRef:
  kind: Role
  name: job-creator-role
  apiGroup: rbac.authorization.k8s.io
```
This YAML file first defines a Role that allows various operations on jobs in the default namespace. Then, a RoleBinding is created that assigns this Role to the prefect-agent service account in the prefect namespace.
Apply this configuration with kubectl apply -f <filename.yaml>, replacing <filename.yaml> with the name of the file where you saved the above YAML.
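If you would rather create the same objects from Python, a rough equivalent might look like the sketch below. The helper names `build_job_rbac` and `apply_job_rbac` are hypothetical. The rule is limited to the batch API group, where Jobs live, and `apply_job_rbac` assumes the `kubernetes` Python package plus credentials that are allowed to create Roles and RoleBindings.

```python
def build_job_rbac(sa_name, sa_namespace, target_namespace,
                   role_name="job-creator-role"):
    """Build Role and RoleBinding manifests (plain dicts) for managing Jobs."""
    role = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "Role",
        "metadata": {"name": role_name, "namespace": target_namespace},
        "rules": [
            {
                "apiGroups": ["batch"],
                "resources": ["jobs"],
                "verbs": ["get", "watch", "list", "create",
                          "update", "patch", "delete"],
            }
        ],
    }
    binding = {
        "apiVersion": "rbac.authorization.k8s.io/v1",
        "kind": "RoleBinding",
        "metadata": {"name": f"{role_name}-binding",
                     "namespace": target_namespace},
        # The subject keeps its own namespace, which may differ from the
        # namespace the Role applies to.
        "subjects": [
            {"kind": "ServiceAccount", "name": sa_name,
             "namespace": sa_namespace}
        ],
        "roleRef": {
            "kind": "Role",
            "name": role_name,
            "apiGroup": "rbac.authorization.k8s.io",
        },
    }
    return role, binding


def apply_job_rbac(role: dict, binding: dict) -> None:
    """Create both objects in the cluster. Not called here: needs cluster access."""
    from kubernetes import client, config  # imported lazily on purpose

    config.load_kube_config()
    api = client.RbacAuthorizationV1Api()
    namespace = role["metadata"]["namespace"]
    api.create_namespaced_role(namespace=namespace, body=role)
    api.create_namespaced_role_binding(namespace=namespace, body=binding)


role, binding = build_job_rbac("prefect-agent", "prefect", "default")
print(binding["metadata"]["name"])
```

Keeping the manifest construction separate from the cluster call makes it easy to inspect or dump the objects before anything is applied.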
Please note that this is general guidance, and your actual configuration might differ based on your specific setup and security requirements. If you're not the cluster administrator, you might need to ask them for help with this.
And remember, life... Ah, forget it. You humans wouldn't understand.