I’m having some difficulty deploying the prefect ag...
# ask-community
k
I’m having some difficulty deploying the Prefect agent into Kubernetes using a namespace and service account. When trying to trigger a flow I get the following error. It looks like it’s trying to use the `default` service account in the `prefect` namespace I’m attempting to use: `system:serviceaccount:prefect:default`
(403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '0cb3f7e4-3156-4e9e-b025-5c9deb274813', 'X-Kubernetes-Pf-Prioritylevel-Uid': '96071337-eda2-47ce-9026-95ed0ab85b02', 'Date': 'Fri, 03 Sep 2021 19:49:12 GMT', 'Content-Length': '311'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:default\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"prefect\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
I specified SERVICE_ACCOUNT_NAME in the Kubernetes manifest but it doesn’t seem to be honoring it. Am I perhaps missing another configuration?
I’ve also confirmed that if I add a RoleBinding for the same role to the `default` service account, in addition to the `prefect-agent` service account, it can create jobs and pods:
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: prefect-agent-role
  namespace: prefect
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: prefect-agent-role
subjects:
#- kind: ServiceAccount
#  name: default
- kind: ServiceAccount
  name: prefect-agent
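The 403 body earlier in this message already names the identity the API server saw, in the form `system:serviceaccount:<namespace>:<name>`. A minimal sketch for pulling it out with the standard library only; `denied_service_account` is a hypothetical helper name used for illustration, not part of Prefect or the Kubernetes client:

```python
import json
import re

# Response body copied from the error above, kept as a raw JSON string.
BODY = r'''{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:default\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"prefect\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}'''


def denied_service_account(body: str):
    """Return (namespace, name) of the service account in a Forbidden message, or None."""
    message = json.loads(body).get("message", "")
    match = re.search(r'system:serviceaccount:([^:"]+):([^:"]+)', message)
    return (match.group(1), match.group(2)) if match else None


print(denied_service_account(BODY))  # ('prefect', 'default')
```

The result shows the request was made as `default`, not `prefect-agent`, which is the mismatch the rest of the thread tracks down.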
k
Hey @Kevin Mullins, could you maybe move the manifest to the thread so we can keep the main channel more compact? Have you seen how to configure the agent here? Maybe you can add these flags?
k
Got that cleaned up, sorry bout that. Yea. I used the following command to generate the manifest:
prefect agent kubernetes install --namespace prefect --service-account-name prefect-agent --backend server --api http://10.0.0.33:4200/graphql -l k8s --rbac > prefect-agent.yaml
which generated:
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: prefect-agent
  name: prefect-agent
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prefect-agent
  template:
    metadata:
      labels:
        app: prefect-agent
    spec:
      containers:
      - args:
        - prefect agent kubernetes start
        command:
        - /bin/bash
        - -c
        env:
        - name: PREFECT__CLOUD__AGENT__AUTH_TOKEN
          value: ''
        - name: PREFECT__CLOUD__API
          value: http://10.0.0.33:4200/graphql
        - name: NAMESPACE
          value: prefect
        - name: IMAGE_PULL_SECRETS
          value: ''
        - name: PREFECT__CLOUD__AGENT__LABELS
          value: '[''k8s'']'
        - name: JOB_MEM_REQUEST
          value: ''
        - name: JOB_MEM_LIMIT
          value: ''
        - name: JOB_CPU_REQUEST
          value: ''
        - name: JOB_CPU_LIMIT
          value: ''
        - name: IMAGE_PULL_POLICY
          value: ''
        - name: SERVICE_ACCOUNT_NAME
          value: prefect-agent
        - name: PREFECT__BACKEND
          value: server
        - name: PREFECT__CLOUD__AGENT__AGENT_ADDRESS
          value: http://:8080
        - name: PREFECT__CLOUD__API_KEY
          value: ''
        - name: PREFECT__CLOUD__TENANT_ID
          value: ''
        image: prefecthq/prefect:0.15.4-python3.6
        imagePullPolicy: Always
        livenessProbe:
          failureThreshold: 2
          httpGet:
            path: /api/health
            port: 8080
          initialDelaySeconds: 40
          periodSeconds: 40
        name: agent
k
Thanks!
👍 1
Gotcha yeah, the service_account_name and namespace do make it in there. will look more
k
Awesome, thanks so much for the help! I’ve destroyed my cluster multiple times, re-generated the manifest, etc. and can’t seem to get it to work. Let me know if there is any additional information I can provide to assist!
k
Nothing stands out when reading the code here. Stuff around the service_account_name is pretty legible. Can I see your KubernetesRun?
k
flow.run_config = KubernetesRun(
    job_template_path="sandbox/job_spec.yaml",
    env={"EXTRA_PIP_PACKAGES": "azure-identity azure-cosmosdb-table azure-storage-blob"}
)
Here is the job_spec. I’m adding a few ENV vars from a k8s secret:
apiVersion: batch/v1
kind: Job
metadata:
  name: test-blob-auth-flow
spec:
  template:
    metadata:
      labels:
        identifier: "azure"
    spec:
      containers:
        - name: flow-azure
          env:
            - name: AZURE_TENANT_ID
              valueFrom:
                secretKeyRef:
                  name: prefect-azure-sp
                  key: prefect-azure-tenant-id
            - name: AZURE_CLIENT_ID
              valueFrom:
                secretKeyRef:
                  name: prefect-azure-sp
                  key: prefect-azure-client-id
            - name: AZURE_CLIENT_SECRET
              valueFrom:
                secretKeyRef:
                  name: prefect-azure-sp
                  key: prefect-azure-client-secret
k
I am wondering if this job spec is somehow injecting `default`, because the RunConfig takes precedence over the agent
k
interesting. I can trim down my example to not do that and give it a shot
k
You can also try explicitly specifying it in the KubernetesRun kwarg (though of course we still want to understand this)
k
Neither removing the job spec nor explicitly setting the service_account_name seemed to have an effect:
flow.run_config = KubernetesRun(
    service_account_name="prefect-agent",
    env={"EXTRA_PIP_PACKAGES": "azure-identity azure-cosmosdb-table azure-storage-blob"}
)
"message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:default\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"prefect\"
Just some additional debugging info. I checked the environment for the agent pod and the environment variable did seem to make it:
kubectl -n prefect exec -it prefect-agent-c95d8b749-fwvzc -- /bin/bash
root@prefect-agent-c95d8b749-fwvzc:/# env
KUBERNETES_SERVICE_PORT_HTTPS=443
JOB_CPU_LIMIT=
KUBERNETES_SERVICE_PORT=443
PREFECT__CLOUD__API=http://10.0.0.33:4200/graphql
HOSTNAME=prefect-agent-c95d8b749-fwvzc
PYTHON_VERSION=3.6.14
SERVICE_ACCOUNT_NAME=prefect-agent
.... trimmed other stuff
k
What Prefect version are you on just to be sure?
k
Is this the correct version? It’s the image used by the agent
image: prefecthq/prefect:0.15.4-python3.6
Also:
pip show prefect
Name: prefect
Version: 0.15.4
Summary: The Prefect Core automation and scheduling engine.
Home-page: https://www.github.com/PrefectHQ/prefect
Author: Prefect Technologies, Inc.
Author-email: help@prefect.io
License: Apache License 2.0
Location: /Users/kevin/source/pocs/azure-auth-poc/.venv/lib/python3.8/site-packages
Requires: marshmallow, toml, cloudpickle, pyyaml, pytz, mypy-extensions, distributed, croniter, pendulum, tabulate, msgpack, requests, urllib3, python-slugify, docker, click, marshmallow-oneofschema, python-dateutil, python-box, dask
Required-by:
k
that looks good
i’ll have to test this more. this will take a bit
👍 1
k
Ohhh, I figured it out. The `--service-account-name` flag only applies to pods CREATED by the agent, not the agent pod itself. That was my misunderstanding. To have the agent itself run as a service account, I added `serviceAccountName` under the spec template in the generated manifest:
spec:
  replicas: 1
  selector:
    matchLabels:
      app: prefect-agent
  template:
    metadata:
      labels:
        app: prefect-agent
    spec:
      serviceAccountName: prefect-agent
      containers:
      - args:
        - prefect agent kubernetes start
        command:
        - /bin/bash
        - -c
        env:
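The distinction above can be sketched in a few lines: the `SERVICE_ACCOUNT_NAME` env var is only data passed to the agent process, while the identity the pod itself runs as comes from `spec.template.spec.serviceAccountName`, and Kubernetes falls back to `default` when that field is absent. The helper names and the trimmed `generated` dict below are illustrative assumptions, not Prefect code; the dict stands in for the manifest after it has been loaded from YAML.

```python
import copy


def effective_service_account(deployment: dict) -> str:
    """Service account the pod will run as; Kubernetes uses 'default' if unset."""
    pod_spec = deployment["spec"]["template"]["spec"]
    return pod_spec.get("serviceAccountName", "default")


def with_service_account(deployment: dict, name: str) -> dict:
    """Return a copy of the manifest with serviceAccountName set on the pod spec."""
    patched = copy.deepcopy(deployment)
    patched["spec"]["template"]["spec"]["serviceAccountName"] = name
    return patched


# Trimmed stand-in for the generated Deployment (hypothetical, heavily shortened).
generated = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "prefect-agent"},
    "spec": {
        "template": {
            "spec": {
                "containers": [
                    {
                        "name": "agent",
                        "env": [
                            {"name": "SERVICE_ACCOUNT_NAME", "value": "prefect-agent"}
                        ],
                    }
                ]
            }
        }
    },
}

print(effective_service_account(generated))  # default
patched = with_service_account(generated, "prefect-agent")
print(effective_service_account(patched))  # prefect-agent
```

Even with the env var set, the unpatched manifest resolves to `default`, which matches the identity in the 403 error.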
k
ah gotcha. thanks for circling back and explaining!
👍 1
k
I’ll admit the documentation was a bit misleading. With the `--rbac` flag and service account, it looks like the role being generated is intended for the agent and not flow runs. I don’t know why the flow runs would need to be able to manipulate jobs, pods, and events. Seems like that was meant for the agent pod.
nvm, ignore that. The rolebinding is generated for the `default` service account, to match the agent running as `default`.
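Put differently, whichever service account the agent pod runs as has to appear among the RoleBinding's subjects. A rough check along those lines, assuming the RoleBinding has been loaded into a dict; `binding_covers` is a hypothetical helper, and a ServiceAccount subject without a `namespace` is assumed here to fall back to the binding's own namespace:

```python
def binding_covers(role_binding: dict, namespace: str, name: str) -> bool:
    """True if the RoleBinding lists the given service account as a subject."""
    binding_ns = role_binding.get("metadata", {}).get("namespace")
    for subject in role_binding.get("subjects") or []:
        if subject.get("kind") != "ServiceAccount":
            continue
        # Assumption: a missing subject namespace means the binding's namespace.
        subject_ns = subject.get("namespace") or binding_ns
        if subject_ns == namespace and subject.get("name") == name:
            return True
    return False


# The RoleBinding from the first message, with only prefect-agent as a subject.
role_binding = {
    "apiVersion": "rbac.authorization.k8s.io/v1",
    "kind": "RoleBinding",
    "metadata": {"name": "prefect-agent-role", "namespace": "prefect"},
    "roleRef": {
        "apiGroup": "rbac.authorization.k8s.io",
        "kind": "Role",
        "name": "prefect-agent-role",
    },
    "subjects": [{"kind": "ServiceAccount", "name": "prefect-agent"}],
}

print(binding_covers(role_binding, "prefect", "prefect-agent"))  # True
print(binding_covers(role_binding, "prefect", "default"))  # False
```

With that binding, an agent pod running as `default` is not covered, which is consistent with the Forbidden error seen before `serviceAccountName` was added.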
👍 1
Closing this out, thanks again for the help!
k
I didn’t help, but sure thing!