
Sean Turner

09/23/2022, 4:21 PM
Any idea what to make of this (v2)? My EKS agent picked up a flow off of a queue but the pod never started?
prefect deployment build main.py:foo \
    -n sean-k8s-test-deployment \
    -q company-name \
    -sb s3/company-name-prefect-staging/sean-turner/foo \
    -ib kubernetes-job/test \
    --apply
EKS agent logs:
16:09:45.420 | INFO    | prefect.agent - Submitting flow run 'c8f0bba7-b104-40d2-aa96-8a278c800f1e'
16:09:45.766 | INFO    | prefect.agent - Completed submission of flow run 'c8f0bba7-b104-40d2-aa96-8a278c800f1e'
16:10:45.785 | ERROR   | prefect.infrastructure.kubernetes-job - Job 'test6qzkh': Pod never started.
My kubernetes-job/test infra block has {"EXTRA_PIP_PACKAGE": "s3fs"} and does not have a kubeconfig set. My S3 storage block s3/company-name-prefect-staging/sean-turner/foo does not have credentials because the agent and orion pods are assuming IAM roles that give permissions to read and write to the bucket.

Zanie

09/23/2022, 5:00 PM
It’s possible the pod timeout is just too low? pod_watch_timeout_seconds defaults to 60 but can be set higher.
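A minimal sketch of raising that timeout when defining the block in Python, assuming Prefect 2's KubernetesJob block from prefect.infrastructure. The 300-second value and the helper names job_block_settings and save_job_block are illustrative and not from this thread.

```python
def job_block_settings(timeout_seconds: int = 300) -> dict:
    """Keyword arguments for a KubernetesJob block (values are illustrative)."""
    return {
        # Extra packages to install in the job container, as used in the thread.
        "env": {"EXTRA_PIP_PACKAGES": "s3fs"},
        # How long the agent waits for the pod to start (the default is 60).
        "pod_watch_timeout_seconds": timeout_seconds,
    }


def save_job_block(name: str = "test") -> None:
    """Save the block under kubernetes-job/<name>; needs Prefect 2 and an API."""
    # Imported here so the settings helper works without Prefect installed.
    from prefect.infrastructure import KubernetesJob

    KubernetesJob(**job_block_settings()).save(name, overwrite=True)
```

Calling save_job_block() with Prefect installed and an API configured should overwrite the existing kubernetes-job/test block. A longer timeout only helps if the pod is slow to start, not if it is never created.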

Sean Turner

09/23/2022, 5:06 PM
Hmm, just upped it, but I don't feel like that should be the issue. Is there anywhere I can get additional logs? I also added the attached heavily redacted kubeconfig and updated the environment variable to include awscli:
{"EXTRA_PIP_PACKAGES": "s3fs,awscli"}
Maybe I need to update my EKS aws-auth configmap to allow the Agent IRSA IAM role? I'm kind of grasping at straws, I don't think this sort of thing is documented anywhere.
Updated the EKS aws-auth configmap and I'm experiencing the same issue. Is there a resource anywhere about setting the kubeconfig?

Zanie

09/23/2022, 5:28 PM
What does kubectl describe job <job-name> get you?
You can also get all pods for the job
kubectl get pods --selector=job-name=<job-name>
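The two kubectl checks above can be scripted. This is a rough sketch using only the Python standard library, assuming kubectl is on the PATH and pointed at the right cluster. The helper names and the default namespace "prefect" are assumptions, so adjust them to your setup.

```python
import subprocess


def describe_job_cmd(job_name: str, namespace: str = "prefect") -> list:
    """Command that shows the job's status and events."""
    return ["kubectl", "describe", "job", job_name, "--namespace", namespace]


def job_pods_cmd(job_name: str, namespace: str = "prefect") -> list:
    """Command that lists the pods created for the job, if any."""
    return ["kubectl", "get", "pods",
            f"--selector=job-name={job_name}", "--namespace", namespace]


def run(cmd: list) -> str:
    """Run a command and return stdout plus stderr without raising on failure."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout + result.stderr
```

For example, print(run(describe_job_cmd("test6qzkh"))) would show the events for the job named in the agent log above.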

Sean Turner

09/23/2022, 5:31 PM
Ahhh interesting! I didn't realize that the jobs were actually being created
Events:
  Type     Reason        Age                  From            Message
  ----     ------        ----                 ----            -------
  Warning  FailedCreate  5m29s (x9 over 27m)  job-controller  Error creating: pods "testkbhpn-" is forbidden: error looking up service account prefect/prefect: serviceaccount "prefect" not found
Okay huge progress. I'm unblocked. Thank you so much!
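A small sketch of pulling the missing service account out of a FailedCreate event message like the one above. The regular expression and helper names are illustrative. The thread does not say how the account was fixed, so creating it with kubectl is only one possible fix; pointing the job at an existing service account is another.

```python
import re

# Matches the job-controller message shown in the events above.
MISSING_SA = re.compile(
    r"error looking up service account (?P<namespace>[^/\s]+)/(?P<name>[^:\s]+): "
    r'serviceaccount "[^"]+" not found'
)


def missing_service_account(event_message: str):
    """Return (namespace, name) of the missing service account, or None."""
    match = MISSING_SA.search(event_message)
    return (match["namespace"], match["name"]) if match else None


def create_service_account_cmd(namespace: str, name: str) -> list:
    """kubectl command that would create the account (one possible fix)."""
    return ["kubectl", "create", "serviceaccount", name, "--namespace", namespace]
```

The returned pair can be passed straight to create_service_account_cmd, and the resulting command run by hand or via subprocess.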

Zanie

09/23/2022, 5:54 PM
👍 I believe there’s an issue to report these errors via the agent as well