Tsvetelin
08/21/2025, 11:50 AM
Marvin
08/21/2025, 11:50 AM
Tsvetelin
08/21/2025, 11:58 AM
Marvin
08/21/2025, 11:58 AM
Marvin
08/21/2025, 12:04 PM
Short answer: add a device_requests entry (the Docker SDK equivalent of --gpus all) in the work pool’s base job template.
There are two places where GPUs might matter; let me know which you need:
- The worker container itself (i.e., the long-running ai-worker).
- The flow-run containers that the worker launches for each job.
Below I’ll cover both.
A) Give GPUs to the flow-run containers (Docker work pool)
1) Pull the default Docker base job template:
prefect work-pool get-default-base-job-template --type docker > docker-template.json
2) Edit docker-template.json and add a device_requests entry under job_configuration.container_create_kwargs. For “all GPUs”, use count: -1 (the Docker SDK equivalent of --gpus all):
{
  "job_configuration": {
    "command": "{{ command }}",
    "env": "{{ env }}",
    "labels": "{{ labels }}",
    "name": "{{ name }}",
    "image": "{{ image }}",
    "registry_credentials": "{{ registry_credentials }}",
    "image_pull_policy": "{{ image_pull_policy }}",
    "networks": "{{ networks }}",
    "network_mode": "{{ network_mode }}",
    "auto_remove": "{{ auto_remove }}",
    "volumes": "{{ volumes }}",
    "stream_output": "{{ stream_output }}",
    "mem_limit": "{{ mem_limit }}",
    "memswap_limit": "{{ memswap_limit }}",
    "privileged": "{{ privileged }}",
    "container_create_kwargs": {
      "device_requests": [
        {
          "driver": "nvidia",
          "count": -1,
          "capabilities": [["gpu"]]
        }
      ]
    }
  },
  "variables": { ... keep the defaults here ... }
}
Notes:
- count: -1 means “all GPUs”; use a positive integer to limit how many GPUs are attached.
- The double-nested list for capabilities ([["gpu"]]) is required by Docker.
- You don’t strictly need "driver": "nvidia" on modern setups, but it’s fine to include. (The sketch below shows how these fields map onto the Docker SDK.)
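For reference, here’s the same request expressed directly with the Docker Python SDK (docker-py), which Prefect’s Docker infrastructure uses under the hood. A minimal sketch; the variable names are just illustrative:
import docker
from docker.types import DeviceRequest

# --gpus all  ->  count=-1 ("all GPUs")
all_gpus = DeviceRequest(driver="nvidia", count=-1, capabilities=[["gpu"]])

# Attach at most two GPUs -> positive count
two_gpus = DeviceRequest(count=2, capabilities=[["gpu"]])

# Pin specific devices -> device_ids (used instead of count)
pinned = DeviceRequest(device_ids=["0", "1"], capabilities=[["gpu"]])

# Any of these is passed at container creation, e.g.:
# docker.from_env().containers.run(image, command, device_requests=[all_gpus])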
3) Apply the updated template to your Docker work pool:
prefect work-pool update "YOUR_DOCKER_POOL_NAME" --base-job-template docker-template.json
Now every container your Docker worker launches from that pool will get GPU access.
B) Give GPUs to the worker container itself
If you are running the ai-worker in a Docker container and want that container to have GPU access (e.g., for in-container inference), start it with GPU flags:
docker run -it --rm \
--name ai-worker \
--gpus all \
  -e PREFECT_API_URL="https://api.prefect.cloud/api/accounts/.../workspaces/..." \
-v /var/run/docker.sock:/var/run/docker.sock \
prefect-ai-worker:2.20.9 \
prefect worker start --pool "YOUR_DOCKER_POOL_NAME"
- Keep the Docker socket mount if your worker launches Docker jobs on the same host.
- Ensure the host has NVIDIA drivers and the NVIDIA Container Toolkit installed so Docker can expose GPUs (a quick check is sketched below).
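To sanity-check that last point, you can launch a throwaway GPU container and run nvidia-smi. A minimal sketch with the docker Python SDK; the CUDA image tag is just an example:
import docker
from docker.types import DeviceRequest

client = docker.from_env()
# If drivers and the toolkit are installed, this prints the usual GPU table.
output = client.containers.run(
    "nvidia/cuda:12.2.0-base-ubuntu22.04",  # example tag; any CUDA base image works
    "nvidia-smi",
    device_requests=[DeviceRequest(count=-1, capabilities=[["gpu"]])],
    remove=True,
)
print(output.decode())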
Helpful docs:
- Work pools and base job templates.
If you can confirm:
- Which work pool type your ai-worker is using (Docker, Kubernetes, etc.)?
- Whether you need GPU on the worker container, the launched job containers, or both?
…I can tailor the exact snippet/commands further (and if you’re on Kubernetes instead of Docker, the approach is different: add GPU resource limits and node selectors in the K8s job template).
Marvin
08/21/2025, 12:06 PM
You can set job variables in a few places:
- UI: edit the deployment and enter Infrastructure Overrides as JSON.
- Python: pass job_variables to Flow.deploy(...).
- prefect.yaml: add a job_variables section under the deployment.
- CLI: run prefect deploy --help to see the flag for setting job variables at deploy time.
Examples
UI (JSON in the Infrastructure Overrides box)
{
  "env": {
    "LOG_LEVEL": "DEBUG"
  },
  "stream_output": false,
  "working_dir": "/opt/prefect/flows"
}
prefect.yaml (Docker worker example)
deployments:
  - name: high-mem
    entrypoint: flows/etl.py:flow
    work_pool:
      name: docker-pool
      job_variables:
        env:
          PREFECT_LOGGING_LEVEL: DEBUG
        mem_limit: "4g"
        volumes:
          - "/data:/data:rw"
Python (per-deployment overrides)
from prefect import flow

@flow
def my_flow():
    ...

my_flow.deploy(
    name="k8s-tuned",
    work_pool_name="k8s-pool",
    job_variables={
        "env": {"FOO": "bar"},
        "image_pull_policy": "IfNotPresent"
    }
)
What you can override (varies by worker type)
- Process worker (verified keys include): env, command, working_dir, stream_output, labels, name
- Docker worker (verified keys include): image, env, command, volumes, networks, network_mode, image_pull_policy, mem_limit, memswap_limit, privileged, auto_remove, labels, stream_output, container_create_kwargs
- Kubernetes/ECS: keys depend on their worker variables and base job template (e.g., image, env, resources/memory/CPU, service account/namespace for K8s; cpu, memory, roles, subnets, security groups for ECS). The UI will show which variables are supported for the pool’s worker type. (A simplified sketch of how these variables are resolved follows below.)
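As a mental model for how these variables get applied (a simplified sketch of template rendering, not Prefect’s actual internals): the base job template’s {{ placeholders }} are filled from the pool’s defaults, with the deployment’s job_variables taking precedence.
# Simplified illustration of job-variable resolution -- not Prefect internals.
import re

def render(template: dict, defaults: dict, job_variables: dict) -> dict:
    values = {**defaults, **job_variables}  # deployment overrides win
    rendered = {}
    for key, spec in template.items():
        match = re.fullmatch(r"\{\{ (\w+) \}\}", spec) if isinstance(spec, str) else None
        rendered[key] = values.get(match.group(1)) if match else spec
    return rendered

template = {"image": "{{ image }}", "mem_limit": "{{ mem_limit }}", "env": "{{ env }}"}
defaults = {"image": "prefecthq/prefect:2-latest", "mem_limit": None, "env": {}}
print(render(template, defaults, {"mem_limit": "4g", "env": {"FOO": "bar"}}))
# -> {'image': 'prefecthq/prefect:2-latest', 'mem_limit': '4g', 'env': {'FOO': 'bar'}}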
Docs to dive deeper
- Customize job variables (overrides): Customize job variables
- Work pools and base job templates: Manage work pools
Want me to tailor an example for your worker type (Process, Docker, Kubernetes, or ECS) and your specific pool settings?
Tsvetelin
08/21/2025, 12:38 PM
Marvin
08/21/2025, 12:38 PM
Marvin
08/21/2025, 12:41 PM
Here’s a rundown of prefect agent start in Prefect 2.20.9. I can’t run --help for 2.x in this environment, but this reflects the 2.20.x agent flags.
Usage: prefect agent start [OPTIONS]
Options:
-q, --work-queue TEXT One or more work queue names to poll; can be
passed multiple times
-m, --match TEXT Prefix to match work queue names dynamically
(e.g., "prod-" matches all queues starting with it)
--prefetch-seconds INTEGER Seconds to prefetch scheduled runs before their
scheduled start time (helps with infra spin-up)
--limit INTEGER Max number of flow runs to start concurrently
--run-once Poll once and exit (don’t run continuously)
--name TEXT A name for this agent instance
--api TEXT Prefect API URL (otherwise uses PREFECT_API_URL)
--log-level [CRITICAL|ERROR|WARNING|INFO|DEBUG]
Set log verbosity
--hide-welcome Suppress the startup banner
--help Show this message and exit
Example:
prefect agent start \
-q my-queue \
-m prod- \
--prefetch-seconds 60 \
--limit 5 \
--name gpu-agent \
--log-level INFO
Docs: Agents in Prefect 2.x
If you’d like, I can also translate this to an equivalent Prefect 3 worker command (recommended going forward), but the above is correct for Prefect 2.20.9.
Tsvetelin
08/21/2025, 1:17 PM
Marvin
08/21/2025, 1:17 PM
Marvin
08/21/2025, 1:19 PM
In Prefect 2.x you can change the container’s run command via the DockerContainer block’s command field or per-deployment “Infrastructure Overrides”.
Important note: Prefect normally injects its own command to start your flow run inside the container. If you fully replace command, you are responsible for invoking the Prefect engine correctly; otherwise, the flow won’t start. For most cases where you just want pre/post setup, it’s safer to use a wrapper ENTRYPOINT in your image (see Option 3).
Options to manipulate the run command
1) Set the command on the DockerContainer block (applies to all deployments using the block)
- UI: Blocks > Docker container > your block > edit > Command
- Python:
from prefect.infrastructure import DockerContainer
block = DockerContainer.load("my-docker-block")
# Either a string or a list of args; this replaces the default command
block.command = ["/bin/bash", "-lc", "echo 'pre-setup'; python -m prefect.engine"] # Be sure to run the Prefect engine
block.save("my-docker-block", overwrite=True)
2) Per-deployment override (Infrastructure Overrides)
- UI: Open your deployment > Edit > Infrastructure Overrides (JSON). Example:
{
  "command": [
    "/bin/bash",
    "-lc",
    "echo 'pre-setup'; python -m prefect.engine"
  ]
}
- CLI at build time (exact flags can vary by 2.20.9 point release; run prefect deployment build --help locally to confirm):
prefect deployment build path/to/flow.py:my_flow \
-n my-deployment \
-q my-queue \
-ib "docker-container/my-docker-block" \
--override 'command=["/bin/bash","-lc","echo pre-setup; python -m prefect.engine"]' \
--apply
3) Recommended for most: wrap the Prefect-provided command via your image ENTRYPOINT
This preserves Prefect’s command and lets you run setup before it.
- Dockerfile:
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
- entrypoint.sh:
#!/bin/sh
set -e
# your setup here (e.g., nvidia-smi, env prep, file writes)
echo "Doing setup..."
# hand off to the command Prefect provides (don’t replace it)
exec "$@"
With this approach, you do not set command in Prefect at all; the agent/infra will pass its usual command to your entrypoint, and your script runs first.
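Put differently: the container still receives Prefect’s normal engine invocation as "$@", your setup runs first, and exec then hands the process over to Prefect, so exit codes and signals propagate as usual.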
A couple of tips
- If you need GPU flags, don’t put them in command: they’re container create options. In 2.x DockerContainer you can use infrastructure overrides to set device_requests (equivalent of --gpus all):
{
  "container_create_kwargs": {
    "device_requests": [
      {
        "driver": "nvidia",
        "count": -1,
        "capabilities": [["gpu"]]
      }
    ]
  }
}
- If you must fully replace command, make sure it ends by invoking the Prefect engine correctly (e.g., python -m prefect.engine ...). If you’re not sure about the exact arguments, prefer the ENTRYPOINT wrapper pattern above.
Docs
- Infrastructure concepts (2.x): Infrastructure overview
- Agents (2.x): Agents
If you share how you want to “manipulate” the command (prepend setup, change interpreter, run a shell, etc.), I can give you a precise snippet that keeps your flow runs working.
Tsvetelin
08/21/2025, 1:28 PM
Marvin
08/21/2025, 1:28 PM
Marvin
08/21/2025, 1:29 PM
Here’s how to verify that your container_create_kwargs are being applied when using a DockerContainer infrastructure block with an agent in Prefect 2.x.
Quick checklist
1) Inspect the deployment config
- Confirm your overrides are on the deployment:
prefect deployment inspect '<flow-name>/<deployment-name>'
Look for “Job Variables” or “Infrastructure Overrides” including your container_create_kwargs.
2) Watch agent logs at DEBUG
- Start your agent with more verbosity to see container creation details:
prefect agent start -q <your-queue> --log-level DEBUG
You should see logs around creating the Docker container; errors will show if a kwarg is invalid.
3) Inspect the created Docker container (most definitive)
- Trigger a run:
prefect deployment run '<flow-name>/<deployment-name>'
- Find the container once it starts:
docker ps --format 'table {{.ID}}\t{{.Names}}\t{{.Image}}'
- Inspect for your settings. For GPU device_requests specifically:
docker inspect <container_id> --format '{{json .HostConfig.DeviceRequests}}'
You should see something like:
[{"Driver":"nvidia","Count":-1,"Capabilities":[["gpu"]]}]
Or view the full HostConfig:
docker inspect <container_id> --format '{{json .HostConfig}}' | jq .
4) Validate from inside the container (for GPU)
- While the container is running:
docker exec -it <container_id> nvidia-smi
If device_requests is working and the host has the NVIDIA Container Toolkit installed, you’ll see GPUs listed. (A scripted version of these checks follows below.)
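If you’d rather script steps 3 and 4, the docker Python SDK exposes the same information. A small sketch; the container name is a placeholder:
import docker

client = docker.from_env()
container = client.containers.get("your-flow-run-container")  # placeholder name or ID

# Equivalent of: docker inspect <id> --format '{{json .HostConfig.DeviceRequests}}'
print(container.attrs["HostConfig"]["DeviceRequests"])

# Equivalent of: docker exec <id> nvidia-smi
exit_code, output = container.exec_run("nvidia-smi")
print(exit_code, output.decode())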
Tips to make inspection easier
- Keep the container running long enough to inspect: add a short sleep inside your flow (e.g., a task that sleeps for 60–120s) so you can docker ps and docker exec before it exits.
- If you’re testing device_requests and don’t see it in docker inspect, double-check that your overrides were applied to the correct deployment and that your agent picked up the updated deployment.
Example container_create_kwargs for GPUs:
{
  "container_create_kwargs": {
    "device_requests": [
      {
        "driver": "nvidia",
        "count": -1,
        "capabilities": [["gpu"]]
      }
    ]
  }
}
Docs for reference
- Infrastructure (2.x)
- Agents (2.x)
If you share your deployment name and the exact JSON you put in container_create_kwargs, I can help you confirm what to expect in docker inspect and agent logs.