# prefect-kubernetes
Gonna christen this channel with the first K8s issue, any thoughts appreciated
Trying to add these AWS credentials as env vars atm with `extraEnvVarsSecret: "aws-iam-key"` in the helm values.yaml, and a k8s secret in the cluster:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: aws-iam-key
  namespace: default
type: Opaque
stringData:
  AWS_ACCESS_KEY_ID: "redacted"
```
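If it's unclear whether the secret's values are actually reaching the flow-run pod, a quick sanity check is to log which AWS variables are visible from inside the pod. This is a plain-Python sketch (no Prefect dependency assumed); the variable names match the secret above:

```python
import os

def missing_aws_env() -> list:
    """Return the AWS credential env vars that are NOT set in this pod."""
    # AWS_ACCESS_KEY_ID comes from the secret above; AWS_SECRET_ACCESS_KEY
    # would normally live alongside it in the same secret.
    required = ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"]
    return [name for name in required if name not in os.environ]

if __name__ == "__main__":
    missing = missing_aws_env()
    print("all AWS env vars present" if not missing else f"missing: {missing}")
```

Running this inside the container (e.g. as a task, or via `kubectl exec`) tells you whether the `extraEnvVarsSecret` wiring worked at all before debugging anything AWS-side.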
Hmm. Would you mind sharing how your deployment is set up for the flow in this case? Are you using the base Prefect image or a custom image for the Kubernetes Job? Assuming adding the AWS creds doesn't address this.
Just to verify: the secret and the agent deployment are in the same namespace?
I have an agent running using the helm chart with `helm install --values k8s-agent.yaml prefect-agent prefect/prefect-agent`, where the k8s-agent.yaml is:
```yaml
    repository: prefecthq/prefect
    prefectTag: 2-python3.10
    pullPolicy: IfNotPresent

      - test

    accountId: "redact"
    workspaceId: "redact"
```
Prefect API key added as a secret as well, with `kubectl apply -f prefect-secrets.yaml`:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: prefect-api-key
  namespace: default
type: Opaque
stringData:
  key: "redact"
```
Then I added a kube_config block with:
```python
from prefect.blocks.kubernetes import KubernetesClusterConfig

my_new_k8s_config = KubernetesClusterConfig.from_file(path="~/.kube/config")
```

and selected that in the `kubernetes-job/k8s-demo` infrastructure block, then deployed to that with:
```shell
prefect deployment build log_flow.py:log_flow -n log-flow-s3 -sb s3/hubspot -q test -o log-flow-s3-k8s-deployment.yaml -t test -ib kubernetes-job/k8s-demo
```
@Jamie Zieziula yep all in default
Hey @eddy davies, this Discourse article is related to 1.0 but the concepts should apply similarly here: https://discourse.prefect.io/t/how-to-assign-iam-permissions-to-a-kubernetes-cluster-so[…]s-can-access-other-aws-services-such-as-s3-or-dynamodb/341 I haven't seen this particular error come up before. Generally speaking, you may also want to use a custom image for the agent that includes the s3fs module; by default the base image doesn't include it, and that may be where the missing module error is coming from.
I am looking into using that with IAM Roles for Service Accounts but having some difficulties. Also, installing s3fs is shown in the ECS-Task section of the docs but not in the KubernetesJob section, so I missed that; might be worth adding. (I guess you could use remote storage without s3fs, but that is an unlikely use case.)
On the s3fs point, it's not included in the KubernetesJob section because where your pods need to fetch flow code from might not be S3; it could be GCS, GitHub, or someplace else. Similarly, you only need s3fs as a dependency when running flows with ECS Task if you use S3 remote storage specifically.
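That point can be summarized with a small lookup: the extra filesystem package your flow-run image needs depends only on the storage block's scheme, not on the agent. This is an illustrative helper, not a Prefect API; the scheme-to-package mapping follows the standard fsspec implementations:

```python
from typing import Optional

# Illustrative only: map a storage block slug's scheme to the fsspec
# filesystem package the flow-run image would need to pull flow code.
EXTRA_PACKAGE_BY_SCHEME = {
    "s3": "s3fs",      # e.g. the s3/hubspot block used in this thread
    "gcs": "gcsfs",
    "azure": "adlfs",
}

def storage_dependency(block_slug: str) -> Optional[str]:
    """Return the extra pip package a storage block slug implies, if any."""
    scheme = block_slug.split("/", 1)[0]
    # GitHub (and other non-fsspec) storage needs no extra filesystem package.
    return EXTRA_PACKAGE_BY_SCHEME.get(scheme)
```

So `storage_dependency("s3/hubspot")` points at s3fs, while a GitHub storage block implies no extra package at all.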
Some form of remote storage is required though, so an example without any seems maybe lacking. Also, does this mean I have to create a Docker image from the base Prefect one? If I upload it to Docker Hub or ECR, can I then point to it from the helm chart?
```yaml
    repository: my_repo/prefect_s3
```
> Some form of remote storage is required though
yep! that's correct - maybe we can make that more clear. You shouldn't actually need s3fs in the pod running the agent specifically, because that pod is not actually pulling the flow code. When the agent creates the pod for your flow run, the image that this new pod uses will need s3fs to pull flow code from S3 before running, so you can include it in the image that you give to your infrastructure block for your deployment (or install it via the env of the infra block). Since the agent is just responsible for the submission of flow runs to other pods on your cluster, you should be fine keeping the default image in the helm chart.
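On the "via the env of the infra block" option: the official Prefect images can install packages at container startup when an `EXTRA_PIP_PACKAGES` env var is set (e.g. setting `EXTRA_PIP_PACKAGES` to `s3fs` on the KubernetesJob block's env). A rough sketch of the parsing side of that mechanism, for illustration only - the real logic lives in the image's entrypoint script:

```python
import shlex

def parse_extra_pip_packages(value: str) -> list:
    """Split an EXTRA_PIP_PACKAGES-style env var into pip arguments.

    The image's entrypoint would then run `pip install <packages>`
    before starting the flow run. Sketch only, not Prefect's actual code.
    """
    return shlex.split(value or "")
```

This trades a slower pod startup (pip install on every run) for not having to build and host a custom image.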
Useful stuff about the extra packages! Still getting this issue:
```
14:02:51.647 | INFO    | prefect.agent - Completed submission of flow run '2c74a2d0-b94f-4f4e-a3b9-1be3a0e936f3'
14:17:45.848 | INFO    | prefect.agent - Submitting flow run 'da7ba311-720f-4194-bc1b-fa7a94c6c6f8'
/usr/local/lib/python3.10/site-packages/prefect/agent.py:215: UserWarning: Block document has schema checksum sha256:686f931093d8fa3a80dee6eb66516be7b022bf29c44a38766da8571f25fede8b which does not match the schema checksum for class 'KubernetesJob'. This indicates the schema has changed and this block may not load.
  infrastructure_block = Block._from_block_document(infra_document)
14:17:46.302 | ERROR   | root - [Errno 2] No such file or directory: 'aws-iam-authenticator'
14:17:46.321 | ERROR   | prefect.agent - Failed to submit flow run 'da7ba311-720f-4194-bc1b-fa7a94c6c6f8' to infrastructure.
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/site-packages/prefect/agent.py", line 259, in _submit_run_and_capture_errors
    result = await infrastructure.run(task_status=task_status)
  File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 276, in run
    job_name = await run_sync_in_worker_thread(self._create_job, manifest)
  File "/usr/local/lib/python3.10/site-packages/prefect/utilities/asyncutils.py", line 68, in run_sync_in_worker_thread
    return await anyio.to_thread.run_sync(call, cancellable=True)
  File "/usr/local/lib/python3.10/site-packages/anyio/to_thread.py", line 31, in run_sync
    return await get_asynclib().run_sync_in_worker_thread(
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 937, in run_sync_in_worker_thread
    return await future
  File "/usr/local/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 867, in run
    result = context.run(func, *args)
  File "/usr/local/lib/python3.10/site-packages/prefect/infrastructure/kubernetes.py", line 505, in _create_job
    job = batch_client.create_namespaced_job(self.namespace, job_manifest)
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 210, in create_namespaced_job
    return self.create_namespaced_job_with_http_info(namespace, body, **kwargs)  # noqa: E501
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api/batch_v1_api.py", line 309, in create_namespaced_job_with_http_info
    return self.api_client.call_api(
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 348, in call_api
    return self.__call_api(resource_path, method,
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 180, in __call_api
    response_data = self.request(
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/api_client.py", line 391, in request
    return self.rest_client.POST(url,
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 276, in POST
    return self.request("POST", url,
  File "/usr/local/lib/python3.10/site-packages/kubernetes/client/rest.py", line 235, in request
    raise ApiException(http_resp=r)
kubernetes.client.exceptions.ApiException: (403)
Reason: Forbidden
HTTP response headers: HTTPHeaderDict({'Audit-Id': '3e57aac2-6c38-4f0b-80c2-6539f0d512e2', 'Cache-Control': 'no-cache, private', 'Content-Type': 'application/json', 'X-Content-Type-Options': 'nosniff', 'X-Kubernetes-Pf-Flowschema-Uid': '8ffede11-9091-4daa-8d22-0a80961f7f1f', 'X-Kubernetes-Pf-Prioritylevel-Uid': '7852fbbf-8622-427a-92e4-b6021eab94e3', 'Date': 'Fri, 04 Nov 2022 14:17:46 GMT', 'Content-Length': '290'})
HTTP response body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"jobs.batch is forbidden: User \"system:anonymous\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},"code":403}

14:17:46.323 | INFO    | prefect.agent - Completed submission of flow run 'da7ba311-720f-4194-bc1b-fa7a94c6c6f8'
rpc error: code = Unknown desc = Error: No such container: f8751f98ce4a07ae53d6cfe71633c904effc205b858ffb2c28be3795e1826500
```
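The 403 body itself already names the culprit. Pulling it apart (the JSON below is copied from the response body in the traceback above):

```python
import json

# Response body copied from the ApiException above.
body = (
    '{"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure",'
    '"message":"jobs.batch is forbidden: User \\"system:anonymous\\" cannot '
    'create resource \\"jobs\\" in API group \\"batch\\" in the namespace '
    '\\"default\\"","reason":"Forbidden","details":{"group":"batch","kind":"jobs"},'
    '"code":403}'
)

status = json.loads(body)
# The API server saw no identity at all ("system:anonymous"), i.e. the
# client's AWS-backed credentials never reached EKS, rather than an RBAC
# rule rejecting a known user.
print(status["code"], status["reason"])
print(status["message"])
```

That distinction matters: an RBAC problem would name the service account, while `system:anonymous` points at the authentication step (the aws-iam-authenticator path) failing entirely.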
@Anna Geller @Nate I was just on a call with Eddy and I'm also struggling to understand the error here. I don't think this has anything to do with flow storage. Rather, I think there's some disconnect in permissions. Seemingly there are two errors raised:
1. `[Errno 2] No such file or directory: 'aws-iam-authenticator'` - not sure where this is coming from… maybe a secret? None of the flow code or flow code storage has anything with that name.
2. `"status":"Failure","message":"jobs.batch is forbidden: User \"system:anonymous\" cannot create resource \"jobs\" in API group \"batch\" in the namespace \"default\""` - not sure why the user is `system:anonymous`, or why creation of jobs is forbidden.
Eddy is using our prefect-helm chart to deploy the agent.
@Emil Christensen yep, Eddy just mentioned documentation around storage blocks for deployments above in this thread, so that's why I commented on that.
1. I'm pretty sure aws-iam-authenticator is something you would set up during cluster creation.
2. I think this is a direct symptom of the pod running the agent not having the IAM permissions to hit the batch jobs create endpoint, which aws-iam-authenticator would handle (along with the helm chart, which sets a service account role binding under the hood).
@Jamie Zieziula any thoughts here?
Yeah, this looks like a bunch of the assorted things that AWS does to provide AWS auth in front of k8s endpoints. Running your agent in the same cluster as the deployment, then providing an appropriately scoped service account, will make this way easier than using a cluster config.
I would avoid passing in a cluster config
Let the agent inherit that from the pod metadata
If I am seeing this correctly, we are passing your local kubeconfig, which is going to have a bunch of extra stuff in it but also informs the aws-iam-auth piece.
+1 to George, IAM Roles for Service Accounts is the right answer here - more on that: https://eksctl.io/usage/iamserviceaccounts/
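George's suggestion - let the agent use the pod's own service account instead of a shipped kubeconfig - is essentially what the Kubernetes client does when no explicit config is passed: if the standard service-account token mount exists, use in-cluster auth. A minimal sketch of that decision (the token path is the standard Kubernetes mount point; the function itself is illustrative, not Prefect's code):

```python
import os

# Standard mount point for a pod's service-account token.
SA_TOKEN_PATH = "/var/run/secrets/kubernetes.io/serviceaccount/token"

def pick_kube_auth(token_path: str = SA_TOKEN_PATH) -> str:
    """Choose how to authenticate to the Kubernetes API.

    Inside a cluster the service-account token is mounted into the pod,
    so no kubeconfig (and no aws-iam-authenticator binary, which the
    error above was missing) is needed.
    """
    if os.path.exists(token_path):
        return "in-cluster"
    return "kubeconfig"
```

This is why dropping the KubernetesClusterConfig block helps: the agent pod falls back to in-cluster auth, and the whole aws-iam-authenticator code path in the local kubeconfig is never exercised.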
Thanks for all the links and support everyone! 🙏🏻I will look at this at the start of next week and update my progress here!
So I just removed the Kubernetes cluster config from my Kubernetes job block and it just worked!! I had already run this command before:
```shell
eksctl utils associate-iam-oidc-provider \
--region eu-west-2 \
--cluster eksdemo1 \
```
The cluster I am using now has a different name though, so not sure that impacted it.