# ask-community
c
Hi, I am attempting a KubernetesRun using DigitalOcean’s managed Kubernetes service. I am using S3 for flow storage, and a public Docker image to run the flow. The Kubernetes agent has been registered, and the S3 credentials are specified as env vars following the docs:
prefect agent kubernetes install -t token_value -l label -e aws_access_key_id=XXX -e aws_secret_access_key=YYY --rbac | kubectl apply -f -
The agent is installed and running in the kube cluster, as confirmed with kubectl get pods:
NAME                             READY   STATUS    RESTARTS   AGE
prefect-agent-58cf5b46d5-84hh6   1/1     Running   0          26m
When I run the flow in Prefect Server, it gets picked up by the agent, but fails with the error:
Failed to load and execute Flow's environment: NoCredentialsError('Unable to locate credentials')
This is despite the agent having knowledge of the S3 creds via the -e flags. I have also tried passing the S3 creds as a dict to KubernetesRun inside my flow:
from prefect.run_configs import KubernetesRun

KubernetesRun(
  image="my_image",
  labels=["my_labels"],
  env={
    "aws_access_key_id": "xxx",
    "aws_secret_access_key": "yyy"
  }
)
Am I missing something for my kube cluster to authenticate to S3?
I managed to get it to work - the S3 creds should be specified in the storage object instead of in the agent / run_config:
from prefect.storage import S3

STORAGE = S3(
  bucket="your_bucket",
  client_options={
    "aws_access_key_id": "xxx",
    "aws_secret_access_key": "yyy"
  }
)
However, I would need to specify these creds for each flow I register to Prefect Cloud.
Is there any way for a kube cluster to authenticate with S3 without passing boto3 creds to the Prefect S3 storage object via client_options?
Jim Crist-Harif
You shouldn't be passing in credentials there, as it stores your credentials along with the flow. Same goes for storing them as part of the KubernetesRun object, as everything passed to that is stored in the Cloud/Server database (and you don't want to be storing your secrets there). The "correct" way to make this work depends on the backing system:
• Your first approach (with the -e flag on the agent) would likely work, but the environment variables should be in all caps (see https://boto3.amazonaws.com/v1/documentation/api/latest/guide/credentials.html#environment-variables for docs on what environment variables AWS expects); there's a sketch of a corrected command after this list. Note that while this may work, your AWS credentials will be stored as part of the k8s Deployment object (rather than in a k8s Secret), which goes against k8s security customs.
• If you're using Prefect Cloud (rather than Prefect Server), you could store your credentials in a Prefect Secret; there's also a sketch of setting that secret after this list. Prefect's AWS integration expects a secret named AWS_CREDENTIALS containing a JSON dictionary of credentials:
# You'd set a secret in Prefect cloud containing your credentials as a JSON dictionary of the following form:
{"ACCESS_KEY": "your aws_access_key_id here",
 "SECRET_ACCESS_KEY": "your aws_secret_access_key here"}

# For now the secret name is hardcoded, so you'd need to name the secret "AWS_CREDENTIALS"

# you'd then attach the secret to your `S3` storage as follows:
flow.storage = S3(..., secrets=["AWS_CREDENTIALS"])
This is admittedly a bit unclear; we plan to change some things soon to make this configuration more flexible (and also better document it).
• You might create a k8s Secret holding your AWS credentials, then modify the pod template to mount that secret as environment variables in the pod. Note that the environment variable names should be in all caps. You could attach the secret to the pod at the agent level by setting --job-template, or at the flow level by passing job_template to KubernetesRun.
• If your k8s deployment supports automatically passing credentials to pods running on a node (say, attached to a service account), then you could forward credentials using this mechanism. All pods would automatically have the correct configuration. Many cloud providers support this, but I'm not sure if you can make this work with DigitalOcean. AWS's docs for this feature: https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
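For example, option 1 would just be your original install command with the env var names capitalized (a sketch, otherwise unchanged):

prefect agent kubernetes install -t token_value -l label \
  -e AWS_ACCESS_KEY_ID=XXX -e AWS_SECRET_ACCESS_KEY=YYY \
  --rbac | kubectl apply -f -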
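And for option 2, one way to set that secret is via the Python client (a sketch assuming Prefect Cloud; you can also create the secret in the Cloud UI):

from prefect import Client

# Store the credentials once in Prefect Cloud; S3 storage reads them at runtime.
# The secret name "AWS_CREDENTIALS" is currently hardcoded (see above).
Client().set_secret(
    name="AWS_CREDENTIALS",
    value={
        "ACCESS_KEY": "your aws_access_key_id here",
        "SECRET_ACCESS_KEY": "your aws_secret_access_key here",
    },
)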
Option 1 (fixing your environment variable names) is certainly the simplest, but comes with the risk that secret things aren't stored in k8s Secret objects. This means that anyone with read access on the k8s agent deployment or flow run pods can view the AWS credentials. Option 2 is also a good option (and one that we hope to make cleaner soon), but it only works with Prefect Cloud, so if you're a Prefect Server user this isn't an option. Options 3 & 4 are the k8s native ways of handling this, and are good options if someone on your team has some decent k8s experience; a sketch of option 3 follows below. There's a bit more configuration here, but since you have the full range of k8s configuration available to you, you can make things work however you want.
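To make option 3 concrete, a rough sketch. First create a k8s Secret holding the credentials (the secret name aws-creds is arbitrary; note the all-caps key names):

kubectl create secret generic aws-creds \
  --from-literal=AWS_ACCESS_KEY_ID=XXX \
  --from-literal=AWS_SECRET_ACCESS_KEY=YYY

Then reference it from a custom job template, e.g. at the flow level (the container name "flow" matches Prefect's default job template; adjust both names to your setup):

from prefect.run_configs import KubernetesRun

run_config = KubernetesRun(
    image="my_image",
    labels=["my_labels"],
    job_template={
        "apiVersion": "batch/v1",
        "kind": "Job",
        "spec": {
            "template": {
                "spec": {
                    "containers": [
                        {
                            "name": "flow",
                            # Expose every key in the secret as an env var:
                            "envFrom": [{"secretRef": {"name": "aws-creds"}}],
                        }
                    ],
                    "restartPolicy": "Never",
                }
            }
        },
    },
)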
c
Ahh yes, I was getting tripped up on whether the S3 creds should be passed to:
• Agent
• Run config
• Storage
Seems the simplest way would be to pass Prefect secrets into the S3 storage object using the secrets arg. I also went through src > prefect > utilities > aws and learnt how Prefect handles auth to cloud services (it reads secrets from the context and pops the dict keys), so secrets should be specified as one single dict instead of being split into multiple key:value pairs; a local-testing sketch of that shape is below. DigitalOcean indeed just provides a plain vanilla k8s service, whereas the main cloud providers give you IAM configuration (albeit manual) for the execution layer to auth to other cloud services, in my case the storage service. Many thanks for your informative response @Jim Crist-Harif! Helped me a lot. Cheers
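For anyone curious, a minimal local-testing sketch of that single-dict shape (assuming Prefect 1.x; the placeholder values are mine):

import prefect
from prefect.utilities.aws import get_boto_client

# One dict under a single secret name, not separate per-key secrets.
# Prefect pops ACCESS_KEY / SECRET_ACCESS_KEY out of this dict.
with prefect.context(secrets={
    "AWS_CREDENTIALS": {"ACCESS_KEY": "xxx", "SECRET_ACCESS_KEY": "yyy"}
}):
    s3 = get_boto_client("s3")  # resolves creds from context secrets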