# ask-marvin
@Marvin what does this error mean:
```
Pod event 'FailedScheduling' at 2025-07-08 08:34:23.387710+00:00: 0/5 nodes are available: 5 Insufficient cpu. preemption: 0/5 nodes are available: 5 No preemption victims found for incoming pod.
```
thought for 12.9 seconds
This error message indicates a Kubernetes scheduling problem: the cluster does not have enough free CPU to place your pod. Let me break down what's happening. The error `0/5 nodes are available: 5 Insufficient cpu` means:

1. Your Kubernetes cluster has 5 nodes.
2. None of those nodes (0/5) has enough unreserved CPU to schedule your pod.
3. The scheduler also checked whether it could preempt (evict) lower-priority pods to make room for yours, but found no suitable candidates.

This typically occurs when:

1. Your pod requests more CPU than any single node has free.
2. The nodes are already running at or near their CPU capacity.
3. The CPU request in your flow's infrastructure configuration is set too high.

To resolve this, you have several options:

1. Reduce the CPU request in your flow's infrastructure configuration if it's set higher than the flow needs.
2. Add more nodes to your cluster, or scale up the existing nodes.
3. Wait for other pods to complete and free up resources.
4. Check whether node affinity rules or taints are unnecessarily restricting where your pod can run.

Here's an example of how to adjust CPU requests in your Kubernetes job configuration:
```python
from prefect.infrastructure import KubernetesJob

# KubernetesJob (a Prefect 2.x infrastructure block) applies
# `customizations` as a JSON 6902 patch on top of its base job
# manifest. This patch targets the first container in the pod spec.
job = KubernetesJob(
    customizations=[
        {
            "op": "add",
            "path": "/spec/template/spec/containers/0/resources",
            "value": {
                "requests": {
                    "cpu": "100m",  # adjust this value as needed
                    "memory": "256Mi",
                }
            },
        }
    ],
)
```
You can also check your cluster's resource usage using:
```bash
kubectl describe nodes | grep -A 5 "Allocated resources"
```
This will help you understand how resources are currently allocated across your nodes and determine appropriate request values.
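If you prefer a quick summary over grepping the describe output, these variants can also help (a sketch; `kubectl top` assumes the metrics-server addon is installed in the cluster):

```bash
# Actual CPU/memory usage per node (needs metrics-server)
kubectl top nodes

# Allocatable CPU per node, extracted with custom columns
kubectl get nodes -o custom-columns=NAME:.metadata.name,CPU:.status.allocatable.cpu
```

Keep in mind that the scheduler compares pod *requests* against each node's *allocatable* CPU, not against live usage, so a node can look idle in `kubectl top` and still refuse your pod.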
@Marvin how do i check what pods are active?
thought for 6.4 seconds
You can check active pods in Kubernetes using the `kubectl get pods` command. Here are the most useful variations:

1. List all pods in the current namespace:
```bash
kubectl get pods
```
2. List pods with more details (node assignment and pod IP):
```bash
kubectl get pods -o wide
```
3. Watch pods in real time (the output streams updates as pod states change):
```bash
kubectl get pods --watch
```
4. List pods across all namespaces:
```bash
kubectl get pods --all-namespaces
# or the shorter version
kubectl get pods -A
```
5. Show live resource usage for pods (requires the metrics-server addon):
```bash
kubectl top pods
```
The output will show you:

- NAME: the name of the pod
- READY: how many containers are ready vs. the total
- STATUS: current status (Running, Pending, CrashLoopBackOff, etc.)
- RESTARTS: the number of restarts
- AGE: how long the pod has been running

If you want to see detailed information about a specific pod:
```bash
kubectl describe pod <pod-name>
```
This will show you detailed information including:

- Events
- Resource requests and limits
- Node assignment
- Container statuses
- Volume mounts
- Labels and annotations

These commands will help you understand what's running and what might be consuming the CPU that is preventing your new pods from scheduling.
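If by "active" you mean pods that are currently running (as opposed to pending or completed), you can filter on the pod phase, as in this sketch:

```bash
# Only pods whose phase is Running, across all namespaces
kubectl get pods -A --field-selector=status.phase=Running
```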
@Marvin is there a way to fix this without accessing kubernetes? say via prefect UI?
thought for 49.9 seconds
Yes, you can fix this through the Prefect UI without needing direct Kubernetes access. Here's how:

1. Go to Work Pools in the Prefect UI.
2. Select the Kubernetes work pool you're using.
3. Click on the Advanced tab.
4. Edit the base job template to adjust the CPU and memory requests.

Here's the part of the job template you'll want to modify (reduce the `cpu` value; adjust `memory` only if needed):
```json
{
  "spec": {
    "template": {
      "spec": {
        "containers": [{
          "resources": {
            "requests": {
              "cpu": "100m",
              "memory": "256Mi"
            }
          }
        }]
      }
    }
  }
}
```
Common CPU request values:

- `100m` = 0.1 CPU core
- `250m` = 0.25 CPU core
- `500m` = 0.5 CPU core
- `1000m` or `1` = 1 CPU core

Start with a lower value (like `100m`) and adjust based on your flow's actual needs. After you save the new configuration, flow runs submitted through this work pool will use the updated resource requests; any values overridden at the deployment level still take precedence. Either way, you won't need to touch Kubernetes directly.
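If you'd rather script this than click through the UI, the same template can be managed with the Prefect CLI (a sketch; `my-k8s-pool` is a placeholder for your work pool's name):

```bash
# View the pool's current configuration, including its base job template
prefect work-pool inspect "my-k8s-pool"

# Or start from the default Kubernetes template, edit it, then push it back
prefect work-pool get-default-base-job-template --type kubernetes > template.json
prefect work-pool update "my-k8s-pool" --base-job-template template.json
```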
@Marvin This is in my template. is there a way for me to find out what this value is?
```json
"requests": {
  "cpu": "{{ request_cpu }}",
  "memory": "{{ request_memory }}"
}
```
thought for 6.2 seconds
Yes! These are template variables that can be viewed and modified in two places in the Prefect UI:

1. Work pool variable values: go to your work pool, click on the "Variables" tab, and look for the `request_cpu` and `request_memory` values.
2. Deployment override values: go to your specific deployment, click "Edit", and look in the "Work Pool Parameters" section; you should see fields for these variables if they're being used.

The templated values (`{{ request_cpu }}` and `{{ request_memory }}`) are placeholders that get filled in either by:

- default values set at the work pool level, or
- specific values set for individual deployments that override the defaults.

If you can't find the values in either place, they might fall back to defaults in your work pool configuration. You can modify these values in either location to adjust the resource requests for your pods. Would you like me to help you find where these values are currently set in your configuration?
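And if the deployment is managed from code rather than the UI, the same variables can be overridden at deploy time via `job_variables`. A minimal sketch, assuming a flow `my_flow`, a work pool named `my-k8s-pool`, and a prebuilt image (all placeholders):

```python
from prefect import flow


@flow
def my_flow():
    ...


if __name__ == "__main__":
    my_flow.deploy(
        name="my-deployment",                 # placeholder deployment name
        work_pool_name="my-k8s-pool",         # placeholder work pool
        image="my-registry/my-image:latest",  # placeholder image
        build=False,                          # use the existing image as-is
        push=False,
        # Fills the {{ request_cpu }} / {{ request_memory }} placeholders
        # in the work pool's base job template for this deployment:
        job_variables={"request_cpu": "250m", "request_memory": "256Mi"},
    )
```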
@Marvin it says my CPU request is 0.5. should i change it to fix this issue?
thought for 8.0 seconds
Yes, reducing the CPU request from 0.5 (equivalent to `500m`) could help resolve the scheduling issue. Since the error indicates there isn't enough CPU available on any node, lowering this value makes it easier for Kubernetes to find a node with enough free capacity for your pod. You could try reducing it to:

- `0.25` (or `250m`): a good middle ground
- `0.1` (or `100m`): a minimal CPU request, good for testing

However, before making this change, consider:

1. Flow requirements: does your flow actually need that much CPU? Don't reduce the request if your flow genuinely needs those resources to run effectively.
2. Performance impact: a lower CPU allocation might mean slower execution.
3. Start small: you can begin with a lower value and gradually increase it if needed.

To change the value:

1. Go to your deployment.
2. Click "Edit".
3. Look for the Work Pool Parameters section.
4. Update the `request_cpu` value to a lower number (like `0.25`).
5. Save the changes.
6. Rerun your flow to test with the new configuration.

If the flow runs successfully with the lower CPU request, you can either keep it at that level or gradually increase it until you find the right balance between resource usage and scheduling success.
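To confirm the change took effect without opening Kubernetes, the Prefect CLI can show the deployment's current job variables and trigger a test run (a sketch; `my-flow/my-deployment` is a placeholder):

```bash
# Show the deployment's configuration, including job variable overrides
prefect deployment inspect "my-flow/my-deployment"

# Kick off a run to confirm the pod now schedules
prefect deployment run "my-flow/my-deployment"
```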