# prefect-server
a
Hello! I've just deployed Prefect Server into Kubernetes on AWS using the official Helm chart and I'm trying to get my first hello-world flow to work. I am using GitLab storage and I'm wondering what is the correct (and secure) way to pass GITLAB_ACCESS_TOKEN?
a
Currently, GitLab storage expects the name of a PrefectSecret, so you would need to set a local secret on your agent:
prefect agent xxx start --env PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN="your_token"
for KubernetesAgent:
env:
        - name: PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN
          value: 'your_token'
and then reference it in your storage:
from prefect import Flow
from prefect.storage import GitLab

flow = Flow(
    "gitlab-flow",
    storage=GitLab(                                # pass storage by keyword; the second positional argument of Flow is the schedule
        repo="org/repo",                           # name of repo
        path="flows/my_flow.py",                   # location of flow file in repo
        access_token_secret="GITLAB_ACCESS_TOKEN"  # name of personal access token secret
    )
)
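To make the naming convention concrete: the name passed as access_token_secret is whatever follows the PREFECT__CONTEXT__SECRETS__ prefix in the environment variable. The sketch below only illustrates that mapping in plain Python. local_secrets_from_env is a hypothetical helper, not part of Prefect, and it assumes the secret name keeps its original case.

```python
import os

# Prefix that marks an environment variable as a Prefect local secret.
SECRET_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def local_secrets_from_env(environ=None):
    """Return {secret_name: value} for every variable carrying the prefix.

    Hypothetical helper for illustration only; Prefect does its own parsing.
    """
    environ = os.environ if environ is None else environ
    return {
        name[len(SECRET_PREFIX):]: value
        for name, value in environ.items()
        if name.startswith(SECRET_PREFIX) and len(name) > len(SECRET_PREFIX)
    }


example_env = {
    "PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN": "your_token",
    "PREFECT__BACKEND": "server",  # ordinary config, not a secret
}
secrets = local_secrets_from_env(example_env)
print(sorted(secrets))  # only the prefixed variable is treated as a secret
```

The key of the resulting dict is the string that goes into access_token_secret.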
a
For some reason I can't make that work... I've patched the Prefect agent with the new env value using
kubectl -n prefect set env deployment.apps/prefect-agent PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN="XYZ"
and checked that the agent pod was restarted and has the new env variable listed alongside the others:
Environment:
      PREFECT__CLOUD__API:                             http://prefect-apollo.prefect:4200/graphql
      NAMESPACE:                                       prefect
      IMAGE_PULL_SECRETS:                              []
      PREFECT__CLOUD__AGENT__LABELS:                   []
      JOB_MEM_REQUEST:
      JOB_MEM_LIMIT:
      JOB_CPU_REQUEST:
      JOB_CPU_LIMIT:
      IMAGE_PULL_POLICY:
      SERVICE_ACCOUNT_NAME:                            prefect-serviceaccount
      PREFECT__BACKEND:                                server
      PREFECT__CLOUD__AGENT__AGENT_ADDRESS:            http://0.0.0.0:8080
      PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN:  XYZ
but the flow run still fails with this:
Failed to load and execute Flow's environment: ValueError('Local Secret "GITLAB_ACCESS_TOKEN" was not found.')
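One way to narrow an error like this down is to check the environment of the process that actually loads the flow, which is not necessarily the agent pod that was patched. The snippet below is a hypothetical diagnostic, with a made-up function name and messages, that could be run inside the flow-run container. It reports only whether the variable is visible and never prints its value.

```python
import os

SECRET_PREFIX = "PREFECT__CONTEXT__SECRETS__"


def describe_secret(secret_name, environ=None):
    """Report whether the env variable backing a local secret is visible here."""
    environ = os.environ if environ is None else environ
    env_name = SECRET_PREFIX + secret_name
    if env_name in environ:
        # Show the length only, so the token never ends up in logs.
        return f"{env_name} is set ({len(environ[env_name])} characters)"
    return f"{env_name} is NOT set in this process"


# Checks the current process; pass a dict instead to test other environments.
print(describe_secret("GITLAB_ACCESS_TOKEN"))
```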
a
I think you really have to set it when you start the agent (or before that). This thread provides some helpful information
a
Oh, I see. I rely on the Helm chart to start the agent, so I can't edit the deployment in that manner without forking the entire Helm chart. Are there any other workarounds? Would a custom job template work?
a
Setting this Secret is equivalent to setting a corresponding environment variable on the agent. Can you check with this article whether you can add an environment variable to your Helm chart's agent deployment?
I asked the team, and you should also be able to set it in the job template for your flow runs. You can set the environment variable from a Kubernetes Secret for this, as described here.
s
I have a ConfigMap with secrets in it, and then I reference that ConfigMap in the job_template.
In the UI, you can edit the job_template directly, so you can at least test if it works.
a
@Anna Geller, @Sam Werbalowsky thank you for your advice! 🙏 Well, it took me a while to figure out. I'll share my solution here just in case anyone is interested. In addition to the GitLab access token, I also wanted to store AWS secrets to provide access to datasets stored in S3 and custom Docker images stored in ECR. I've created a user account for Prefect with all the necessary permissions, and I use the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY values created for that account.

Kubernetes Secret
We use Terraform to manage Kubernetes IaC, so it is used to create the secret:
# prefect.tf

resource "kubernetes_secret" "prefect-secrets" {
  metadata {
    name = "prefect-secrets"
    namespace = kubernetes_namespace.prefect.metadata.0.name
  }

  data = {
    "gitlab-access-token" = local.secrets.gitlab_access_token
    "prefect-aws-access-key-id" = local.secrets.prefect_aws_access_key_id
    "prefect-aws-secret-access-key" = local.secrets.prefect_aws_secret_access_key
  }
}
Instead of storing values in the code, we use references to AWS Secrets Manager.

AWS Secrets Manager
Since it is a bad idea to store secret values in the code, we use AWS Secrets Manager to store them. Terraform retrieves them when the configuration is applied to the K8s cluster and creates or updates the Kubernetes secret:
# secrets.tf

data "aws_secretsmanager_secret_version" "secrets" {
  secret_id = "k8-secrets"
}

locals {
  secrets = jsondecode(data.aws_secretsmanager_secret_version.secrets.secret_string)
}
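For jsondecode to yield the local.secrets.* values referenced in prefect.tf, the secret string presumably has to be a JSON object with matching keys. The sketch below is a rough Python equivalent of that decoding step with a key check added. parse_secret_string and the placeholder payload are assumptions for illustration and not part of the setup described here.

```python
import json

# Keys that prefect.tf reads from local.secrets.
REQUIRED_KEYS = (
    "gitlab_access_token",
    "prefect_aws_access_key_id",
    "prefect_aws_secret_access_key",
)


def parse_secret_string(secret_string):
    """Decode the secret's JSON payload and check that the expected keys exist."""
    secrets = json.loads(secret_string)
    missing = [key for key in REQUIRED_KEYS if key not in secrets]
    if missing:
        raise KeyError(f"secret is missing keys: {', '.join(missing)}")
    return secrets


# Placeholder payload; real values would live only in AWS Secrets Manager.
example = json.dumps({key: "placeholder" for key in REQUIRED_KEYS})
print(sorted(parse_secret_string(example)))
```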

Custom Job Template
Now we add references to the Kubernetes secret in a custom job template:
# prefect-job-template.yaml

apiVersion: batch/v1
kind: Job
spec:
  template:
    spec:
      containers:
        - name: flow
          env:
            - name: PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN
              valueFrom:
                secretKeyRef:
                  name: prefect-secrets
                  key: gitlab-access-token
            - name: AWS_ACCESS_KEY_ID
              valueFrom:
                secretKeyRef:
                  name: prefect-secrets
                  key: prefect-aws-access-key-id
            - name: AWS_SECRET_ACCESS_KEY
              valueFrom:
                secretKeyRef:
                  name: prefect-secrets
                  key: prefect-aws-secret-access-key
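For a quick test without a YAML file, the same structure can be built as a Python dict. This is a minimal sketch: build_job_template is a hypothetical helper, and the commented usage assumes Prefect 1.x, where KubernetesRun accepts an in-memory job_template.

```python
def build_job_template(secret_name, env_to_key, container_name="flow"):
    """Build a Kubernetes Job manifest whose env values come from one Secret."""
    env = [
        {
            "name": env_name,
            "valueFrom": {"secretKeyRef": {"name": secret_name, "key": key}},
        }
        for env_name, key in env_to_key.items()
    ]
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "spec": {
            "template": {
                "spec": {"containers": [{"name": container_name, "env": env}]}
            }
        },
    }


job_template = build_job_template(
    "prefect-secrets",
    {
        "PREFECT__CONTEXT__SECRETS__GITLAB_ACCESS_TOKEN": "gitlab-access-token",
        "AWS_ACCESS_KEY_ID": "prefect-aws-access-key-id",
        "AWS_SECRET_ACCESS_KEY": "prefect-aws-secret-access-key",
    },
)

# With Prefect 1.x installed, the dict can be passed inline, for example:
#   from prefect.run_configs import KubernetesRun
#   run_config = KubernetesRun(job_template=job_template)
```

Only references to the Secret appear in the template; the values themselves stay in Kubernetes.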
Now we can specify that as job_template in KubernetesRun, but it would be more convenient to store it in S3 and use job_template_path instead. To achieve that, we also need to provide AWS credentials to Prefect's Kubernetes agent. That means we need to add references to the Kubernetes secret to Prefect's helm_release resource in Terraform:
# prefect.tf

  ...

  set {
    name  = "agent.env[0].name"
    value = "AWS_ACCESS_KEY_ID"
  }

  set {
    name  = "agent.env[0].valueFrom.secretKeyRef.name"
    value = kubernetes_secret.prefect-secrets.metadata.0.name
  }

  set {
    name  = "agent.env[0].valueFrom.secretKeyRef.key"
    value = "prefect-aws-access-key-id"
  }

  set {
    name  = "agent.env[1].name"
    value = "AWS_SECRET_ACCESS_KEY"
  }

  set {
    name  = "agent.env[1].valueFrom.secretKeyRef.name"
    value = kubernetes_secret.prefect-secrets.metadata.0.name
  }

  set {
    name  = "agent.env[1].valueFrom.secretKeyRef.key"
    value = "prefect-aws-secret-access-key"
  }

  ...
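The set blocks follow a regular pattern: each environment variable gets an index and three flattened keys under agent.env. The sketch below spells out that pattern in Python. helm_env_settings is a hypothetical helper for illustration and not something Terraform or Helm provides.

```python
def helm_env_settings(secret_name, env_to_key):
    """Flatten env entries into the indexed name/value pairs used by `set`."""
    settings = []
    for index, (env_name, key) in enumerate(env_to_key.items()):
        prefix = f"agent.env[{index}]"
        settings.append((f"{prefix}.name", env_name))
        settings.append((f"{prefix}.valueFrom.secretKeyRef.name", secret_name))
        settings.append((f"{prefix}.valueFrom.secretKeyRef.key", key))
    return settings


pairs = helm_env_settings(
    "prefect-secrets",
    {
        "AWS_ACCESS_KEY_ID": "prefect-aws-access-key-id",
        "AWS_SECRET_ACCESS_KEY": "prefect-aws-secret-access-key",
    },
)
for name, value in pairs:
    print(f"{name} = {value}")
```

Adding another variable means one more index and three more pairs.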
⚠️ Older versions of Prefect's Helm chart don't have an agent.env section! It took me some time to notice that.

Flow
Now we can use this simple flow to check if everything works:
# workflow.py

import prefect
from prefect import task, Flow
from prefect.storage import GitLab
from prefect.run_configs import KubernetesRun
from prefect.executors import LocalExecutor


gitlab_storage = GitLab(
    repo="gitlab-group-name/prefect-test-flow-repo-name",
    ref="main",
    path="workflow.py",
    access_token_secret="GITLAB_ACCESS_TOKEN",
)

kubernetes_run_config = KubernetesRun(job_template_path="s3://bucket-name/prefect-job-template.yaml")
local_executor = LocalExecutor()


@task
def hello_task():
    logger = prefect.context.get("logger")
    logger.info("Hello, cloud!")

with Flow("hello-flow") as flow:
    hello_task()


flow.storage = gitlab_storage
flow.run_config = kubernetes_run_config
flow.executor = local_executor
a
This is great! 💯 Thank you so much for sharing!
s
Awesome, nice stuff! Are you planning to use the Dask executor at any point? I have set that up, and there are some things you can do with variables and secrets there as well, so I'm happy to share some of that if you go that route.
a
My plan is to go through the basic stuff first and then experiment with Dask. I'm not sure if Prefect's capability of creating temporary Dask clusters will be enough for us or if we'll have to spin up a permanent Dask cluster. We'll probably get there in a couple of weeks.
s
Awesome - we use Dask Gateway so that we can control the cluster parameters while still having temporary clusters and it works nicely.