Hi how do i get the kubernetes flow pod to request...
# prefect-kubernetes
y
Hi, how do I get the Kubernetes flow pod to request GPU resources? In a normal Helm chart I can do this:
spec:
      containers:
        - name: nvidia-smi
          image: "nvidia/cuda:11.8.0-runtime-centos7"
          args:
            - "nvidia-smi"
          resources:
            limits:
              nvidia.com/gpu: "1"
      tolerations:
      - key: "nvidia.com/gpu"
        operator: "Exists"
        effect: "NoSchedule"
But if I do this in the prefect-worker deployment.yaml file, would it actually have any effect on the flow pods when they are launched?
n
Hi @Ying Ting Loo - you can set resource requests on your k8s work pool as a default in the Advanced tab in the UI (picture attached), or you could override those defaults for a given deployment in your prefect.yaml, like:
deployments:
- name: healthcheck-storage-test
  entrypoint: src/demo_project/healthcheck.py:healthcheck
  work_pool:
    name: k8s
    work_queue_name:
    job_variables:
      env:
        PREFECT_DEFAULT_RESULT_STORAGE_BLOCK: s3/flow-script-storage-main
      job_manifest:
          spec:
            containers:
                resources:
                  limits:
                      nvidia.com/gpu: "1"
The main difference I'll highlight here: your example changes the resource request for the pod that runs the worker itself (not where the flow run executes), whereas my example changes the spec of the job that the worker creates for each flow run it submits.
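For reference, here is a rough sketch of where the GPU limit and toleration sit in a plain Kubernetes Job manifest, which is the kind of object the worker creates for a flow run. It is trimmed to the relevant fields (metadata, image and command are omitted). The container name `prefect-job` is an assumption, and whether your work pool's base job template lets you override `job_manifest` through `job_variables` depends on how that template is defined, so treat this as illustrative rather than a confirmed configuration.

```yaml
# Sketch only: the Job created for a flow run, not the worker's own Deployment.
apiVersion: batch/v1
kind: Job
spec:
  template:
    spec:
      restartPolicy: Never
      containers:
        - name: prefect-job          # assumed name of the flow-run container
          resources:
            limits:
              nvidia.com/gpu: "1"    # GPUs are requested via limits
      tolerations:                   # pod-level, sibling of containers
        - key: "nvidia.com/gpu"
          operator: "Exists"
          effect: "NoSchedule"
```

Note that in a Job the pod spec is nested under `spec.template.spec`, and `containers` is a list, so the resources block belongs to a specific container entry.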
if I do
prefect deploy -n healthcheck-storage-test
then in the UI I see this on that deployment's configuration tab
y
Thanks so much for the reply @Nate. A little sidetrack from GPU: on the job manifest, does this look right for increasing the ephemeral storage? I applied the deployment as such, but the ephemeral storage request is still 0 in the logs. Backstory: I tried to build a big image with a lot of file downloads during the build, and the job pod keeps getting evicted due to:
The node was low on resource: ephemeral-storage. Threshold quantity: 3210844697, available: 2000796Ki. Container linkerd-proxy was using 4Ki, request is 0, has larger consumption of ephemeral-storage. Container prefect-job was using 108Ki, request is 0, has larger consumption of ephemeral-storage.
n
hi Ying, I'm probably not in a great position to give advice on what your ephemeral storage values should be since I'm not sure what you're up to. that might be a good question in and of itself for the #CL09KU1K7 channel or this channel at large