# prefect-community
j
Hi, I wonder if anyone could help me with a problem I'm having with the Dask `KubeCluster`? The issue is that various secrets I have mounted on the usual flow jobs don't get carried over to the pods started by Dask. There's an added complexity: I'm using two images, a dev one and a non-dev one, tied to two different Prefect projects. I'm able to switch the image with something like this:
```python
import os

DEV_TAG = os.environ.get("DEV", "") != ""

JOB_IMAGE_NAME = f"blah/flows{':dev' if DEV_TAG else ''}"
```
And then in each flow I reference `JOB_IMAGE_NAME`; this just changes the image but otherwise uses the job template I have defined on the agent:
```yaml
apiVersion: batch/v1
kind: Job
spec:
  template:
    spec:
      containers:
        - name: flow
          imagePullPolicy: Always
          env:
            - name: SOME_ENV
              valueFrom:
                secretKeyRef:
                  name: secret-env-vars
                  key: some_env
                  optional: false
```
Now when I specify the Dask setup I do the following:
```python
executor = DaskExecutor(
    cluster_class=lambda: KubeCluster(make_pod_spec(image=JOB_IMAGE_NAME)),
    adapt_kwargs={"minimum": 2, "maximum": 3},
)
```
But this is obviously missing the `env` part of my default template. I would like not to have to respecify it (it's much bigger than the above snippet). Is it possible to grab a handle on the default template and just override the image name?
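(In case it helps anyone reading along: if you can load the agent's job template YAML yourself, one way to sketch this is a small helper that copies the template and swaps only the image. The `with_image` helper below is hypothetical, not a Prefect API, and assumes a single container in the template:)

```python
import copy


def with_image(job_template: dict, image: str) -> dict:
    """Return a deep copy of a Kubernetes Job template dict with only the
    first container's image overridden; env/secrets stay intact."""
    patched = copy.deepcopy(job_template)
    patched["spec"]["template"]["spec"]["containers"][0]["image"] = image
    return patched
```

You would still need a way to feed the patched template back to whatever launches the pods, which is exactly the sticking point with `make_pod_spec`.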
a
Are you on Prefect Cloud or Server? If you are on Cloud, you could leverage Prefect Secrets, which would make the process much easier since you could set those directly from the Prefect Cloud UI.
j
Hmm yeah unfortunately not allowed to store these there 😞
a
Why? We are SOC 2 compliant.
You could also consider storing those in some other secrets manager you trust, such as HashiCorp Vault or AWS Secrets Manager, and retrieving them in your flow when needed.
j
We just have a policy that all secrets must stay under our control; it would take a lot of bureaucracy to convince people otherwise.
So I have actually worked out that, since the Dask cluster is started from within the job pod, which has the secrets in its env, I can just do this:
```python
DASK_POD_SPEC = make_pod_spec(
    image=JOB_IMAGE_NAME,
    env={
        "SECRET_ENV_VAR": os.environ["SECRET_ENV_VAR"],
    },
)
```
I did also have to do this:
```python
DASK_POD_SPEC.spec.service_account_name = "flow-user"
```
since `make_pod_spec` doesn't let you set the service account you want to run as.
a
Thanks for sharing! So your service account points to this environment variable, or the other way around?
j
No, two things:
1. The env vars are set on the pod created by the agent; since that pod creates the Dask pods, I can just insert them there.
2. `service_account_name` is needed so that I'm able to access GCP resources; `KubeCluster` seems to default to the `default` service account.
a
I see, thanks for confirming that!