# prefect-kubernetes
k
Hey all! I'm running into an issue where the `kubernetes-job` block is not adhering to the `namespace` or `service account name` input fields. Also, when deploying both the agent and server into the "prefect" namespace, I get an error when running a flow:
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:prefect-agent\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}
This happens even after setting the service account name to a dummy value and setting the ns to "prefect". My next thought was "well, if it needs access to the default namespace, let's just deploy it into the default namespace". After doing that, I get the same kind of error, except now it points to the "kube-system" namespace?
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"namespaces \"kube-system\" is forbidden: User \"system:serviceaccount:default:prefect-agent\" cannot get resource \"namespaces\" in API group \"\" in the namespace \"kube-system\"","reason":"Forbidden","details":{"name":"kube-system","kind":"namespaces"},"code":403}
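For anyone comparing the two 403s side by side, here is a minimal Python sketch that pulls the interesting fields (who, which verb, which resource, which namespace) out of a response body like the ones above. The helper name `summarize_forbidden` and the regex are illustrative only, and the pattern assumes the message keeps the exact wording shown in this thread.

```python
import json
import re

# Matches the "forbidden" wording seen in the two errors in this thread.
# Other Kubernetes error messages may be phrased differently.
FORBIDDEN_RE = re.compile(
    r'User "(?P<user>[^"]+)" cannot (?P<verb>\w+) resource "(?P<resource>[^"]+)" '
    r'in API group "(?P<group>[^"]*)" in the namespace "(?P<namespace>[^"]+)"'
)


def summarize_forbidden(body: str) -> dict:
    """Extract user, verb, resource, API group and namespace from a 403 Status body."""
    status = json.loads(body)
    match = FORBIDDEN_RE.search(status["message"])
    if match is None:
        raise ValueError("message does not match the expected 'forbidden' wording")
    return {"code": status["code"], **match.groupdict()}


# The first error body from this thread, copied verbatim.
first_error = r'''{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:serviceaccount:prefect:prefect-agent\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}'''

summary = summarize_forbidden(first_error)
print(summary["user"], summary["verb"], summary["resource"], summary["namespace"])
```

Read this way, the first error says the service account from the "prefect" namespace tried to create a job in "default", which suggests the job's target namespace was not the one set on the block.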
So yeah, I'm pretty lost on what to do. I've even double-checked the prefect-agent service account, and it 100% has access to the resources it claims it doesn't. Versions running:
- name: prefect-agent
  version: 2023.07.07
  repository: https://prefecthq.github.io/prefect-helm

- name: prefect-server
  version: 2023.07.07
  repository: https://prefecthq.github.io/prefect-helm
c
It’s saying it’s missing the `get` permission for namespaces on kube-system. If this is using the latest chart, there should be a value / field for the `clusterUid` that makes that permission unnecessary (it was a little too much permission to require).
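A rough Python sketch of why an explicit cluster UID removes the need for that permission. This is a simplified model and not Prefect's actual code: the function name `resolve_cluster_uid` is made up for illustration, and the environment variable name `PREFECT_KUBERNETES_CLUSTER_UID` is an assumption about how the chart value reaches the process.

```python
import os


def resolve_cluster_uid(read_kube_system_uid, environ=os.environ):
    """Prefer an explicitly configured UID; otherwise ask the cluster.

    `read_kube_system_uid` stands in for the API call that reads the
    kube-system namespace, which is the call that needs `get` on namespaces.
    """
    explicit = environ.get("PREFECT_KUBERNETES_CLUSTER_UID")  # assumed variable name
    if explicit:
        return explicit  # no API call, so no extra permission is needed
    return read_kube_system_uid()


def forbidden_lookup():
    # Simulates the 403 from this thread when the service account lacks access.
    raise PermissionError('cannot get resource "namespaces" in the namespace "kube-system"')


# With an explicit UID the forbidden lookup is never attempted.
print(resolve_cluster_uid(forbidden_lookup, {"PREFECT_KUBERNETES_CLUSTER_UID": "my-cluster"}))
```

Under that assumption, leaving the value unset forces the namespace lookup, which would explain an error that mentions kube-system regardless of where the chart is installed.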
I’d double check the service accounts you have and are using - there is probably both a `default` and a `prefect-agent`. You can do `kubectl describe role prefect-worker` (or `prefect-agent` in your case) to see what permissions the attached role has on the service account:
$ k describe role prefect-worker
Name:         prefect-worker
Labels:       app.kubernetes.io/component=worker
              app.kubernetes.io/instance=aks-worker
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=prefect-worker
              helm.sh/chart=prefect-worker-2023.04.13
              prefect-version=2.10.4-python3.11-kubernetes
Annotations:  meta.helm.sh/release-name: aks-worker
              meta.helm.sh/release-namespace: default
PolicyRule:
  Resources    Non-Resource URLs  Resource Names  Verbs
  ---------    -----------------  --------------  -----
  jobs.batch   []                 []              [get list watch create update patch delete]
  pods/log     []                 []              [get watch list]
  pods/status  []                 []              [get watch list]
  pods         []                 []              [get watch list]
k
Name:         prefect-agent
Labels:       app.kubernetes.io/component=agent
              app.kubernetes.io/instance=prefect-agent-primary
              app.kubernetes.io/managed-by=Helm
              app.kubernetes.io/name=prefect-agent
              argocd.argoproj.io/instance=prefect-agent-primary
              helm.sh/chart=prefect-agent-2023.07.07
              prefect-version=2.10.20-python3.10
Annotations:  <none>
PolicyRule:
  Resources    Non-Resource URLs  Resource Names  Verbs
  ---------    -----------------  --------------  -----
  jobs.batch   []                 []              [get list watch create update patch delete]
  pods/log     []                 []              [get watch list]
  pods/status  []                 []              [get watch list]
  pods         []                 []              [get watch list]
But when I deploy into the "prefect" ns, it complains about the "default" ns, and when I deploy into "default", it complains about getting the "kube-system" ns?
Also, when deploying into the "prefect" ns, it fails on `cannot create resource "jobs"`, but when deploying into the "default" ns, it fails on `cannot get resource "namespaces"`.
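One way to reason about the two failures is a toy Python model of a namespaced Role, assuming the role shown by `kubectl describe role` is bound only in the namespace the chart was installed into. The `role_allows` helper is hypothetical and far simpler than real Kubernetes RBAC; the rule set is copied from the describe output above.

```python
# Rules copied from the `describe role prefect-agent` output in this thread.
AGENT_RULES = {
    "jobs.batch": {"get", "list", "watch", "create", "update", "patch", "delete"},
    "pods/log": {"get", "watch", "list"},
    "pods/status": {"get", "watch", "list"},
    "pods": {"get", "watch", "list"},
}


def role_allows(rules, role_namespace, verb, resource, target_namespace):
    """Toy check: a Role only grants its rules inside its own namespace."""
    if target_namespace != role_namespace:
        return False  # the Role has no effect in other namespaces
    return verb in rules.get(resource, set())


# Chart installed in "prefect", job aimed at "default": denied.
print(role_allows(AGENT_RULES, "prefect", "create", "jobs.batch", "default"))
# Chart installed in "default", lookup of kube-system: denied, and not listed anyway.
print(role_allows(AGENT_RULES, "default", "get", "namespaces", "kube-system"))
# Chart installed in "prefect", job aimed at "prefect": allowed.
print(role_allows(AGENT_RULES, "prefect", "create", "jobs.batch", "prefect"))
```

In this model the role can look complete and still produce both errors, because each failing request targets a namespace other than the one the role lives in.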
c
What are you specifying for the Kubernetes job?
What does your Kubernetes job block look like if you’re using an agent?
And have you tried the same on the worker?
k
I haven't looked at the workers, only the agent, and the job is just pulling a python based ECR image and should be running in the prefect namespace using the prefect-agent service account.
But the error behaviour seems suspicious as it's the exact same prefect flow config, just a different ns?
b
Hi Chris. The K8s job uses the default job template without any substantial modifications other than an `envFrom` reference. @Kyle Hoffman Could it be related to that?
c
So on the first part - if you deploy into the prefect namespace, then the KubernetesJob spec should point to the namespace you are using: https://docs.prefect.io/latest/api-ref/prefect/infrastructure/#prefect.infrastructure.KubernetesJob so you can specify the following:
namespace="prefect",
service_account_name="prefect-agent",
If you aren’t specifying those in your job, it’s using the defaults for both.
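To make the effect of those two settings concrete, here is a small Python sketch of where they end up in a Kubernetes Job manifest. It is plain Python, not Prefect's own code: `build_job_manifest`, the job name, and the image name are all placeholders for illustration.

```python
def build_job_manifest(name, image, namespace="default", service_account_name=None):
    """Build a minimal batch/v1 Job manifest as a plain dict."""
    pod_spec = {
        "restartPolicy": "Never",
        "containers": [{"name": "flow", "image": image}],
    }
    if service_account_name is not None:
        # Without this field the pod runs as the namespace's "default" service account.
        pod_spec["serviceAccountName"] = service_account_name
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        # The namespace here decides where the API server is asked to create the job.
        "metadata": {"name": name, "namespace": namespace},
        "spec": {"template": {"spec": pod_spec}},
    }


# Placeholder name and image; the namespace and service account match this thread.
manifest = build_job_manifest(
    "example-flow-run",
    "example/flow-image:latest",
    namespace="prefect",
    service_account_name="prefect-agent",
)
print(manifest["metadata"]["namespace"])
print(manifest["spec"]["template"]["spec"]["serviceAccountName"])
```

If the namespace is left at its default, the create request goes to "default", which lines up with the first error reported at the top of the thread.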
o
Hi @Kyle Hoffman, did you manage to solve this? I just ran into the same error with the Prefect worker.
k
Technically not. We got it working by installing it into the default namespace and removing any reference to a namespace, but we still haven't been able to deploy it to custom namespaces.
o
Ok, thanks. That’s too bad. I’ll update here if we get it working
Adding `clusterUid` resolved the issue for us. Here is our manifest if that helps:
apiVersion: argoproj.io/v1alpha1
kind: Application
metadata:
  name: prefect-worker
  namespace: argocd
  finalizers:
    - resources-finalizer.argocd.argoproj.io
spec:
  project: default
  source:
    chart: prefect-worker
    repoURL: "https://prefecthq.github.io/prefect-helm"
    targetRevision: "2023.09.18"
    helm:
      valuesObject:
        worker:
          clusterUid: ds-compute-platform-test
          apiConfig: server
          config:
            workPool: k8s-test-pool
          serverApiConfig:
            apiUrl: "http://prefect-server.prefect.svc.cluster.local:80/api"
  destination:
    server: "https://kubernetes.default.svc"
    namespace: prefect
  syncPolicy:
    automated:
      prune: true
    syncOptions:
      - FailOnSharedResource=true
      - PruneLast=true
      - CreateNamespace=true
k
It's strange that not setting it would affect what namespace you're able to deploy the chart in?
Thanks for the feedback though!