Romain Vincent
08/25/2025, 9:23 AM
Marvin
08/25/2025, 9:23 AM
Marvin
08/25/2025, 9:32 AM
prefect work-pool create "k8s-pool" --type kubernetes
Optionally cap total concurrent flow runs at the pool:
prefect work-pool set-concurrency-limit "k8s-pool" 50
2) Customize the base job template (resources, SA, namespace, env, node selection)
- Export the default template, edit, then apply:
prefect work-pool get-default-base-job-template --type kubernetes --file base-job-template.json
# Edit containers[].resources, serviceAccountName, namespace, env, tolerations, nodeSelector, etc.
prefect work-pool update "k8s-pool" --base-job-template base-job-template.json
Tip: Keep the template generic, then override per-deployment with job_variables when needed.
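If you'd rather script the "edit" step than hand-edit the JSON, here is a minimal sketch. It assumes the exported template keeps its variables under variables.properties with a default key per variable, and that names like namespace and service_account_name exist in your export; check your own base-job-template.json before relying on it. The template dict below is a trimmed, hypothetical stand-in, not the real export.

```python
import copy


def set_variable_defaults(template: dict, defaults: dict) -> dict:
    """Return a copy of a base job template with new defaults for its variables."""
    updated = copy.deepcopy(template)
    properties = updated["variables"]["properties"]
    for name, value in defaults.items():
        # Fail loudly on a typo instead of silently adding an unused variable.
        if name not in properties:
            raise KeyError(f"template has no variable named {name!r}")
        properties[name]["default"] = value
    return updated


# Trimmed, hypothetical stand-in for the exported base-job-template.json.
template = {
    "job_configuration": {"namespace": "{{ namespace }}"},
    "variables": {
        "properties": {
            "namespace": {"type": "string", "default": "default"},
            "service_account_name": {"type": "string"},
        }
    },
}

updated = set_variable_defaults(
    template,
    {"namespace": "data-eng", "service_account_name": "prefect-runner"},
)
```

In practice you would json.load the exported file, pass it through the helper, json.dump the result, and then run the work-pool update command on that file.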
3) Create work queues to shard traffic and set queue-level caps
- Use queues per priority, team, or region. Workers can subscribe to one or more queues.
prefect work-queue create high-priority --pool k8s-pool --limit 10 --priority 1
prefect work-queue create default --pool k8s-pool
4) Run workers in-cluster (recommended)
- Deploy a lightweight Prefect worker Deployment in your K8s cluster with RBAC permissions to create Jobs/Pods. Example container command:
prefect worker start --pool k8s-pool --work-queue high-priority --work-queue default
Notes:
- Make sure the image includes the Kubernetes extras (e.g., install prefect[kubernetes]) and your flow’s dependencies.
- Scale the worker Deployment replicas for resiliency/throughput of job submissions. Flow run concurrency is governed by pool/queue limits and cluster capacity, not worker count.
5) Deploy flows to the pool/queue and set per-run resources
- From your project root:
prefect deploy
# Select your flow entrypoint, choose work pool "k8s-pool", and optionally a work queue
- To pin resources per deployment, pass job_variables (they map onto the base job template variables). This is great for sizing memory/CPU or changing images by deployment.
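To make the "job_variables map onto the template variables" idea concrete, here is a simplified model of the override: start from the template defaults, layer the deployment's job_variables on top, then fill the {{ placeholders }} in the job configuration. This is only an illustration of the concept, not Prefect's actual templating code, and the image names and template contents are hypothetical.

```python
import re

# Matches a string that is exactly one "{{ name }}" placeholder.
PLACEHOLDER = re.compile(r"^\{\{\s*(\w+)\s*\}\}$")


def resolve_values(variables: dict, job_variables: dict) -> dict:
    """Template defaults first, then per-deployment overrides on top."""
    values = {
        name: spec.get("default")
        for name, spec in variables["properties"].items()
    }
    values.update(job_variables)
    return values


def render(node, values: dict):
    """Recursively replace whole-string placeholders with resolved values."""
    if isinstance(node, dict):
        return {key: render(value, values) for key, value in node.items()}
    if isinstance(node, list):
        return [render(item, values) for item in node]
    if isinstance(node, str):
        match = PLACEHOLDER.match(node)
        if match:
            return values.get(match.group(1))
    return node


# Hypothetical, trimmed template pieces.
variables = {
    "properties": {
        "image": {"default": "example/base:latest"},
        "namespace": {"default": "default"},
    }
}
job_configuration = {
    "namespace": "{{ namespace }}",
    "containers": [{"name": "flow", "image": "{{ image }}"}],
}

# What one deployment might pass as job_variables.
job_variables = {"image": "registry.example.com/etl:1.4"}

rendered = render(job_configuration, resolve_values(variables, job_variables))
```

Here the deployment only overrides image, so namespace keeps the pool-level default. That is the reason to keep the template generic and override per deployment.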
6) Control parallelism and protect shared systems
- Work pool cap: caps total concurrent flow runs from this pool.
- Work queue cap: caps concurrent runs from that queue (great for priorities).
prefect work-queue set-concurrency-limit --pool k8s-pool high-priority 10
- Tag-based task concurrency: throttle specific operations across all runs.
prefect concurrency-limit create my-shared-system 5
Then tag relevant tasks/flows with my-shared-system.
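If it helps to picture what a limit of 5 on a tag means, here is a toy model using a plain asyncio semaphore. It is not how Prefect enforces limits (that happens server-side, across runs); it only shows the effect: however many tagged tasks are ready, at most 5 hold a slot at once.

```python
import asyncio


async def run_with_limit(n_tasks: int, limit: int) -> int:
    """Run n_tasks fake tasks under a shared limit; return the peak number running at once."""
    slots = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def fake_task() -> None:
        nonlocal running, peak
        async with slots:  # wait here until one of the slots is free
            running += 1
            peak = max(peak, running)
            await asyncio.sleep(0.01)  # stand-in for a call to the shared system
            running -= 1

    await asyncio.gather(*(fake_task() for _ in range(n_tasks)))
    return peak


peak = asyncio.run(run_with_limit(n_tasks=20, limit=5))
print(peak)
```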
How this dispatches across multiple pods
- Each scheduled flow run is picked up by a worker and executed as a new K8s Job/Pod using your base job template + job_variables.
- To shard across multiple pods/paths, create multiple queues and point different deployments (or workers) at them. You can also create multiple pools (e.g., one per cluster) and run a worker set in each cluster.
Best practices
- Keep the worker in-cluster so it talks directly to the API server and can create Jobs quickly.
- Use a custom image that includes your code + prefect[kubernetes] for faster cold starts.
- Keep the base job template close to K8s best practices: define resource requests/limits, serviceAccountName, imagePullSecrets, tolerations/nodeSelector if needed, and labels/annotations for observability.
- Use pool/queue concurrency caps to prevent overload; use tag concurrency limits to protect external systems (DBs, APIs).
- If you need subflows in separate pods, call other deployments (for example with run_deployment) instead of regular subflows. Regular subflows execute in the same pod as the parent flow by default.
Marvin
08/25/2025, 9:32 AM
Romain Vincent
08/25/2025, 3:41 PM
Marvin
08/25/2025, 3:41 PM
Marvin
08/25/2025, 3:42 PM
import asyncio

from prefect import flow
from prefect_kubernetes.credentials import KubernetesCredentials
from prefect_kubernetes.jobs import create_namespaced_job, read_namespaced_job_status


@flow
async def launch_k8s_job():
    # With no cluster_config, prefect-kubernetes should try the in-cluster config
    # first and fall back to your local kubeconfig (verify against your installed version).
    creds = KubernetesCredentials()
    manifest = {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "hello-job"},
        "spec": {
            "template": {
                "spec": {
                    "restartPolicy": "Never",
                    "containers": [
                        {
                            "name": "hello",
                            "image": "busybox",
                            "command": ["sh", "-c", "echo hello && sleep 2 && echo done"],
                        }
                    ],
                }
            }
        },
    }
    # The prefect-kubernetes job tasks are coroutines in recent releases, so await them.
    await create_namespaced_job(
        kubernetes_credentials=creds,
        new_job=manifest,
        namespace="default",
    )
    # Optional: poll until completion
    # job = await read_namespaced_job_status(kubernetes_credentials=creds, job_name="hello-job", namespace="default")
    # ...check job.status.succeeded/failed and loop...


if __name__ == "__main__":
    asyncio.run(launch_k8s_job())
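The "poll until completion" comment can be fleshed out roughly like this. It is a sketch under assumptions: read_status is any async callable that returns an object shaped like a Kubernetes Job (with status.succeeded and status.failed), and wait_for_job is a hypothetical helper name, not part of prefect-kubernetes. It also treats any failed pod as a failed Job, which is a simplification if your Job retries via backoffLimit; checking status.conditions is more precise in that case. The demo uses a fake reader so it runs without a cluster.

```python
import asyncio
from types import SimpleNamespace
from typing import Any, Awaitable, Callable


async def wait_for_job(
    read_status: Callable[[], Awaitable[Any]],
    timeout_s: float = 600.0,
    interval_s: float = 5.0,
) -> str:
    """Poll a Job until it reports success or failure; return 'succeeded' or 'failed'."""
    loop = asyncio.get_running_loop()
    deadline = loop.time() + timeout_s
    while True:
        job = await read_status()
        if job.status.succeeded:  # count of succeeded pods, None while running
            return "succeeded"
        if job.status.failed:  # count of failed pods, None while healthy
            return "failed"
        if loop.time() >= deadline:
            raise TimeoutError("Job did not finish before the timeout")
        await asyncio.sleep(interval_s)


async def _demo() -> str:
    # Fake reader: still running on the first poll, succeeded on the second.
    states = iter(
        [
            SimpleNamespace(succeeded=None, failed=None),
            SimpleNamespace(succeeded=1, failed=None),
        ]
    )

    async def fake_read_status() -> Any:
        return SimpleNamespace(status=next(states))

    return await wait_for_job(fake_read_status, timeout_s=1.0, interval_s=0.01)


result = asyncio.run(_demo())
print(result)
```

Inside the flow, read_status could be a small lambda that calls read_namespaced_job_status with your credentials, job name, and namespace, assuming that task is awaitable in your prefect-kubernetes version.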
Quick check: Which of these are you after?
- “I want my Prefect flow runs to be Jobs” (already the case with a Kubernetes work pool)
- “I want to trigger my own Kubernetes Jobs from a Prefect flow”
If you share which path you want, I can tailor an example (including a base job template or a worker Deployment YAML) for your cluster.