# ask-marvin
a
hello @Marvin, I am using prefect 2.19.3 with the helm chart 2024.5.30190018. I am struggling to provide the baseJobTemplate in the following:
## Deployment Configuration
worker:
  # -- unique cluster identifier, if none is provided this value will be inferred at time of helm install
  clusterUid: ""

  image:
    # -- worker image repository
    repository: prefecthq/prefect
    ## prefect tag is pinned to the latest available image tag at packaging time.  Update the value here to
    ## override pinned tag
    # -- prefect image tag (immutable tags are recommended)
    prefectTag: 2.19.3-python3.11-kubernetes
    # -- worker image pull policy
    pullPolicy: IfNotPresent
    ## Optionally specify an array of imagePullSecrets.
    ## Secrets must be manually created in the namespace.
    ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
    ## e.g:
    ## pullSecrets:
    ##   - myRegistryKeySecretName
    # -- worker image pull secrets
    pullSecrets: []
    # -- enable worker image debug mode
    debug: false

  ## general configuration of the worker
  config:
    # -- the work pool that your started worker will poll.
    workPool: ${WORK_POOL_NAME}
    # -- one or more work queue names for the worker to pull from. if not provided, the worker will pull from all work queues in the work pool
    workQueues: []

    queryInterval: 5

    prefetchSeconds: 10

    http2: true
    type: kubernetes

    installPolicy: prompt
    # -- the name to give to the started worker. If not provided, a unique name will be generated.
    name: null
    # -- maximum number of flow runs to start simultaneously (default: unlimited)
    limit: null
    # -- JSON formatted base job template. If unspecified, Prefect will use the default base job template for the given worker type. If the work pool already exists, this will be ignored.
    baseJobTemplate: |-
      {}
Passing in the JSON formatted value DOES NOT WORK. Why can't I just pass in the name of an EXISTING CONFIGMAP? I know you can do this in later versions of Prefect; however, for this version it seems IMPOSSIBLE to pass in the JSON baseJobTemplate!
m
Your message was too long by 3 tokens...
m
Hi Alan, could you share what configuration you're attempting to use and the error you saw?
a
yes, I am using the following values.yaml:
## Deployment Configuration
worker:
  # -- unique cluster identifier, if none is provided this value will be inferred at time of helm install
  clusterUid: ""

  image:
    # -- worker image repository
    repository: prefecthq/prefect
    ## prefect tag is pinned to the latest available image tag at packaging time.  Update the value here to
    ## override pinned tag
    # -- prefect image tag (immutable tags are recommended)
    prefectTag: 2.19.3-python3.11-kubernetes
    # -- worker image pull policy
    pullPolicy: IfNotPresent
    ## Optionally specify an array of imagePullSecrets.
    ## Secrets must be manually created in the namespace.
    ## ref: https://kubernetes.io/docs/tasks/configure-pod-container/pull-image-private-registry/
    ## e.g:
    ## pullSecrets:
    ##   - myRegistryKeySecretName
    # -- worker image pull secrets
    pullSecrets: []
    # -- enable worker image debug mode
    debug: false

  ## general configuration of the worker
  config:
    # -- the work pool that your started worker will poll.
    workPool: ${WORK_POOL_NAME}
    # -- one or more work queue names for the worker to pull from. if not provided, the worker will pull from all work queues in the work pool
    workQueues: []
    # -- how often the worker will query for runs
    queryInterval: 5
    # -- when querying for runs, how many seconds in the future can they be scheduled
    prefetchSeconds: 10
    # -- connect using HTTP/2 if the server supports it (experimental)
    http2: true
    ## You can set the worker type here.
    ## The default image includes only the type "kubernetes".
    ## Custom workers must be properly registered with the prefect cli.
    ## See the guide here: https://docs.prefect.io/2.11.3/guides/deployment/developing-a-new-worker-type/
    # -- specify the worker type
    type: kubernetes
    ## one of 'always', 'if-not-present', 'never', 'prompt'
    # -- install policy to use workers from Prefect integration packages.
    installPolicy: prompt
    # -- the name to give to the started worker. If not provided, a unique name will be generated.
    name: null
    # -- maximum number of flow runs to start simultaneously (default: unlimited)
    limit: null
    # -- JSON formatted base job template. If unspecified, Prefect will use the default base job template for the given worker type. If the work pool already exists, this will be ignored.
    baseJobTemplate: ${BASE_JOB_TEMPLATE}
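For reference, the value substituted for ${BASE_JOB_TEMPLATE} has to be a JSON document in the shape Prefect expects for a base job template: a job_configuration object (whose values may contain {{ placeholder }} references) plus a variables JSON schema. A minimal illustrative sketch — the field values here are hypothetical; the real content comes from baseJobTemplateGPU.json.tpl:

```json
{
  "job_configuration": {
    "namespace": "{{ namespace }}"
  },
  "variables": {
    "type": "object",
    "properties": {
      "namespace": {
        "type": "string",
        "default": "dev-prefect"
      }
    }
  }
}
```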
I am deploying it through helm_release in terraform:
resource "helm_release" "prefect_worker_gpu" {
  for_each = { for np in var.karpenter_node_pools_gpu : np.node_pool_name => np }

  name       = "prefect-worker-${each.value.workpool_name}"
  namespace  = "${var.environment}-prefect"
  chart      = "prefect-worker"
  repository = "https://prefecthq.github.io/prefect-helm"
  version    = "${var.prefect_chart_version}"


  values = [templatefile("${path.module}/values/values-worker.yaml", {
    UI_URL = "https://prefect-${var.product_name}-${var.environment}.*.com"
    EXISTING_SECRET = "${kubernetes_secret.db_secret.metadata[0].name}"
    CERTIFICATE_ARN = "${data.aws_acm_certificate.env.arn}"
    SECURITY_GROUP = "${var.prefect_security_group_id}"
    API_URL = "http://prefect-server.${var.environment}-prefect.svc.cluster.local:4200/api"
    WORK_POOL_NAME = "${each.value.workpool_name}"
    FULL_NAME_OVERRIDE = "prefect-worker-${each.value.workpool_name}"
    IS_INITIAL = false
    BASE_JOB_TEMPLATE = jsonencode(templatefile("${path.module}/values/baseJobTemplateGPU.json.tpl", {
                             prefect_hq_version  = "${var.prefect_hq_version}"
                             environment    = "${var.environment}"
                             node_pool_name = "${each.value.node_pool_name}"
                             num_gpus = "${each.value.num_gpus}"
                           }))
  })]


  depends_on = [ helm_release.prefect_server, kubernetes_config_map.base_job_template_gpu, kubernetes_config_map.base_job_template_cpu, kubernetes_namespace.prefect, kubernetes_secret.prefect_api_key ]
}


resource "helm_release" "prefect_worker_cpu" {
  for_each = { for np in var.karpenter_node_pools_cpu : np.node_pool_name => np }

  name       = "prefect-worker-${each.value.workpool_name}"
  namespace  = "${var.environment}-prefect"
  chart      = "prefect-worker"
  repository = "https://prefecthq.github.io/prefect-helm"
  version    = "${var.prefect_chart_version}"


  values = [templatefile("${path.module}/values/values-worker.yaml", {
    UI_URL = "https://prefect-${var.product_name}-${var.environment}.*.com"
    EXISTING_SECRET = "${kubernetes_secret.db_secret.metadata[0].name}"
    CERTIFICATE_ARN = "${data.aws_acm_certificate.env.arn}"
    SECURITY_GROUP = "${var.prefect_security_group_id}"
    API_URL = "http://prefect-server.${var.environment}-prefect.svc.cluster.local:4200/api"
    WORK_POOL_NAME = "${each.value.workpool_name}"
    FULL_NAME_OVERRIDE = "prefect-worker-${each.value.workpool_name}"
    IS_INITIAL = false
    BASE_JOB_TEMPLATE = jsonencode(templatefile("${path.module}/values/baseJobTemplateCPU.json.tpl", {
                          prefect_hq_version  = "${var.prefect_hq_version}"
                          environment    = "${var.environment}"
                          node_pool_name = "${each.value.node_pool_name}"
                        }))
  })]


  depends_on = [ helm_release.prefect_server, kubernetes_config_map.base_job_template_gpu, kubernetes_config_map.base_job_template_cpu, kubernetes_namespace.prefect, kubernetes_secret.prefect_api_key ]
}
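As context on the jsonencode(templatefile(...)) pattern above: the template render is already a JSON document, and jsonencode on that string produces a single JSON string literal (outer quotes added, inner quotes escaped), which is what lets it substitute into the YAML value baseJobTemplate: ${BASE_JOB_TEMPLATE} as one quoted scalar. A small Python sketch of the same double encoding (the sample template content is hypothetical):

```python
import json

# Hypothetical stand-in for the rendered baseJobTemplate*.json.tpl output:
# the render is already a JSON document, held as a string.
rendered = '{"job_configuration": {"namespace": "dev-prefect"}, "variables": {}}'

# Terraform's jsonencode() on a string behaves like json.dumps() on a str:
# it wraps the whole document in quotes and escapes the inner quotes,
# producing a single JSON string literal.
encoded = json.dumps(rendered)

# Because the encoded value begins and ends with a double quote, it lands in
# the YAML values file as a valid double-quoted scalar whose contents are the
# original JSON; decoding it round-trips back to the rendered template.
assert json.loads(encoded) == rendered
```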
I get the following error:
Error: context deadline exceeded
  with module.prefect.helm_release.prefect_worker_gpu["karpenter-nodepool-gpu-2x"],
  on tfmodules/prefect/workers.tf line 1, in resource "helm_release" "prefect_worker_gpu":
   1: resource "helm_release" "prefect_worker_gpu" {
Error: Unable to continue with install: ConfigMap "prefect-worker-base-job-template" in namespace "dev-prefect" exists and cannot be imported into the current release: invalid ownership metadata; annotation validation error: key "meta.helm.sh/release-name" must equal "prefect-worker-work-cpu": current value is "prefect-worker-work-gpu"
  with module.prefect.helm_release.prefect_worker_cpu["karpenter-nodepool-cpu-4x"],
  on tfmodules/prefect/workers.tf line 27, in resource "helm_release" "prefect_worker_cpu":
  27: resource "helm_release" "prefect_worker_cpu" {
My question is, why is it creating a ConfigMap "prefect-worker-base-job-template" when I provided a JSON string for the baseJobTemplate? Also, why is it trying to associate that ConfigMap with any one of my helm releases?!
m
when you provide the configuration for the template, it's going to create a ConfigMap with that content: https://github.com/PrefectHQ/prefect-helm/blob/main/charts/prefect-worker/templates/configmap.yaml
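Judging from the error above, the rendered object looks roughly like this — the name is static rather than release-scoped, which is why two releases in the same namespace collide (labels and the data key below are illustrative, not taken from the chart source):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: prefect-worker-base-job-template   # same name for every release
  namespace: dev-prefect
  annotations:
    meta.helm.sh/release-name: prefect-worker-work-gpu  # whichever release created it first
data:
  baseJobTemplate.json: |
    {"job_configuration": {}, "variables": {}}   # the JSON you passed in
```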
it's associating it with your helm release because that template is part of the prefect-worker helm chart. looks like you have two similarly-named releases:
prefect-worker-work-{cpu,gpu}
a
can I assign it a name to distinguish one from the other?
m
if you want to skip that and create your own configmap, do that and then provide
worker.config.baseJobTemplate.existingConfigMapName
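In chart versions that include that key, the values shape would be roughly the following (the ConfigMap name is hypothetical, and you would create that ConfigMap yourself):

```yaml
worker:
  config:
    baseJobTemplate:
      existingConfigMapName: my-base-job-template  # your own, pre-created ConfigMap
```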
a
2.19.3 does not support that (chart 2024.5.30190018)
it's literally the next chart version that has that
Currently there is a need to stick to Prefect 2.19.3 because it's well tested for us
m
understandable. if you're providing the config raw instead of an existing configmap name, that raw config is created in a configmap with a static name. any chance you could put this release in a separate namespace so there's no conflict?
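One way to sketch that in the Terraform above — a namespace per release so the statically-named ConfigMap cannot collide (the naming scheme here is illustrative):

```hcl
resource "helm_release" "prefect_worker_gpu" {
  # ... existing arguments unchanged ...

  # scope each worker release to its own namespace
  namespace = "${var.environment}-prefect-${each.value.workpool_name}"
}
```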
a
is that the only viable solution?
seems like a bug to me
m
i agree that configmap probably should have included the release name in its name so it would avoid conflicts. but either way, if we fix that, you'd need to update your chart, which it doesn't sound like you're ready to do yet
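For illustration, a release-scoped name in the chart's configmap.yaml would look something like this (a sketch of the suggested fix, not the actual prefect-helm source):

```yaml
metadata:
  name: {{ .Release.Name }}-base-job-template
```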
a
ok let me try that!!
👍 1