# ask-marvin
@Marvin Hi, Marvin. Prefect 2.20.9. I have an agent and an ai-worker. I want to create each ai-worker container with GPU-enabled capabilities (`"driver": "nvidia", "count": all, "capabilities": ["gpu"]`), e.g. the same as `docker run -it --rm --name manual-inference-test --runtime=nvidia --gpus all prefect-ai-worker:2.20.9 /bin/bash`. How do I do that?
thought for 847.5 seconds
@Marvin what is Infrastructure Overrides (Optional)
thought for 499.9 seconds
Great question — and thanks for the version details. In Prefect 2.20.9, you can give GPU access to the containers your Docker worker launches by adding a Docker device request (equivalent to `--gpus all`) in the work pool’s base job template. There are two places where GPU might matter — let me know which you need:
- The worker container itself (i.e., the long-running ai-worker).
- The flow-run containers that the worker launches for each job.

Below I’ll cover both.

A) Give GPUs to the flow-run containers (Docker work pool)

1) Pull the default Docker base job template:
```bash
prefect work-pool get-default-base-job-template --type docker > docker-template.json
```
2) Edit `docker-template.json` and add a `device_requests` entry under `job_configuration.container_create_kwargs`. For “all GPUs”, use `count: -1` (the Docker SDK equivalent of `--gpus all`):
```json
{
  "job_configuration": {
    "command": "{{ command }}",
    "env": "{{ env }}",
    "labels": "{{ labels }}",
    "name": "{{ name }}",
    "image": "{{ image }}",
    "registry_credentials": "{{ registry_credentials }}",
    "image_pull_policy": "{{ image_pull_policy }}",
    "networks": "{{ networks }}",
    "network_mode": "{{ network_mode }}",
    "auto_remove": "{{ auto_remove }}",
    "volumes": "{{ volumes }}",
    "stream_output": "{{ stream_output }}",
    "mem_limit": "{{ mem_limit }}",
    "memswap_limit": "{{ memswap_limit }}",
    "privileged": "{{ privileged }}",
    "container_create_kwargs": {
      "device_requests": [
        {
          "driver": "nvidia",
          "count": -1,
          "capabilities": [["gpu"]]
        }
      ]
    }
  },
  "variables": { ... keep the defaults here ... }
}
```
Notes:
- `count: -1` is “all GPUs”; you can use a positive integer to limit GPUs.
- The double-nested list for `capabilities` is required by Docker (`[["gpu"]]`).
- You don’t strictly need `"driver": "nvidia"` on modern setups, but it’s fine to include. (There’s a short Docker SDK sketch at the end of this message showing how these kwargs map to `docker run --gpus all`.)

3) Apply the updated template to your Docker work pool:
```bash
prefect work-pool update "YOUR_DOCKER_POOL_NAME" --base-job-template docker-template.json
```
Now every container your Docker worker launches from that pool will get GPU access.

B) Give GPUs to the worker container itself

If you are running the ai-worker in a Docker container and want that container to have GPU access (e.g., for in-container inference), start it with GPU flags:
```bash
docker run -it --rm \
  --name ai-worker \
  --gpus all \
  -e PREFECT_API_URL="https://api.prefect.cloud/api/accounts/.../workspaces/..." \
  -v /var/run/docker.sock:/var/run/docker.sock \
  prefect-ai-worker:2.20.9 \
  prefect worker start --pool "YOUR_DOCKER_POOL_NAME"
```
- Keep the Docker socket mount if your worker launches Docker jobs on the same host.
- Ensure the host has the NVIDIA drivers and the NVIDIA Container Toolkit installed so Docker can expose GPUs.

Helpful docs:
- Work pools and base job templates

If you can confirm:
- which work pool type your ai-worker is using (Docker, Kubernetes, etc.), and
- whether you need GPU on the worker container, the launched job containers, or both,

…I can tailor the exact snippet/commands further (and if you’re on Kubernetes instead of Docker, the approach is different: add GPU resource limits and node selectors in the K8s job template).
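If you want to sanity-check what those `device_requests` translate to before touching the work pool, here is a minimal sketch using the Docker Python SDK (the `docker` package). It assumes the host already has the NVIDIA Container Toolkit; the CUDA image tag is only an example, any GPU-capable image works:

```python
import docker
from docker.types import DeviceRequest

client = docker.from_env()

# Same shape the work-pool template uses: count=-1 means "all GPUs",
# and capabilities must be a nested list, i.e. [["gpu"]].
gpu_request = DeviceRequest(driver="nvidia", count=-1, capabilities=[["gpu"]])

# Roughly equivalent to: docker run --rm --gpus all <image> nvidia-smi -L
logs = client.containers.run(
    "nvidia/cuda:12.2.0-base-ubuntu22.04",  # example image; swap in your own
    command="nvidia-smi -L",
    device_requests=[gpu_request],
    remove=True,
)
print(logs.decode())
```

If this prints your GPUs, the same `device_requests` block in `container_create_kwargs` should behave identically for the flow-run containers.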
Great question! In Prefect 3.x, the “Infrastructure Overrides (Optional)” field on a deployment lets you provide per-deployment overrides to the work pool’s base job template. In other words, your work pool defines default infrastructure settings (image, env, memory, etc.), and Infrastructure Overrides let you change some of those settings just for this deployment without modifying the pool itself.

How it fits together
- Work pool: has a base job template (defaults) for a specific worker type (Process, Docker, Kubernetes, ECS).
- Infrastructure Overrides: a dictionary of “job variables” applied at deployment time. These override the base template keys for this deployment.
- Precedence: deployment overrides > base job template > worker defaults.

Where to set them
- UI: paste JSON into “Infrastructure Overrides (Optional)”.
- Python: pass `job_variables` to `Flow.deploy(...)`.
- prefect.yaml: add a `job_variables` section under the deployment.
- CLI: run `prefect deploy --help` to see the flag for setting job variables at deploy time.

Examples

UI (JSON in the Infrastructure Overrides box)
```json
{
  "env": {
    "LOG_LEVEL": "DEBUG"
  },
  "stream_output": false,
  "working_dir": "/opt/prefect/flows"
}
```
prefect.yaml (Docker worker example)
```yaml
deployments:
  - name: high-mem
    entrypoint: flows/etl.py:flow
    work_pool:
      name: docker-pool
    job_variables:
      env:
        PREFECT_LOGGING_LEVEL: DEBUG
      mem_limit: "4g"
      volumes:
        - "/data:/data:rw"
```
Python (per-deployment overrides)
```python
from prefect import flow

@flow
def my_flow():
    ...

my_flow.deploy(
    name="k8s-tuned",
    work_pool_name="k8s-pool",
    job_variables={
        "env": {"FOO": "bar"},
        "image_pull_policy": "IfNotPresent"
    }
)
```
What you can override (varies by worker type)
- Process worker (verified keys include): `env`, `command`, `working_dir`, `stream_output`, `labels`, `name`
- Docker worker (verified keys include): `image`, `env`, `command`, `volumes`, `networks`, `network_mode`, `image_pull_policy`, `mem_limit`, `memswap_limit`, `privileged`, `auto_remove`, `labels`, `stream_output`, `container_create_kwargs`
- Kubernetes/ECS: keys depend on their worker variables and base job template (e.g., image, env, resources/memory/CPU, service account/namespace for K8s; cpu, memory, roles, subnets, security groups for ECS). The UI will show which variables are supported for the pool’s worker type.

Docs to dive deeper
- Customize job variables (overrides): Customize job variables
- Work pools and base job templates: Manage work pools

Want me to tailor an example for your worker type (Process, Docker, Kubernetes, or ECS) and your specific pool settings?
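One caveat for your setup: `Flow.deploy(... job_variables=...)` and `prefect.yaml` job variables apply to work pools and workers. Since you mentioned you are running an agent on 2.20.9, the per-deployment equivalent there is `infra_overrides` on the deployment. A minimal sketch, assuming a `DockerContainer` block saved as `my-docker-block` and a work queue named `my-queue` (both placeholders):

```python
from prefect import flow
from prefect.deployments import Deployment
from prefect.infrastructure import DockerContainer

@flow
def my_flow():
    ...

deployment = Deployment.build_from_flow(
    flow=my_flow,
    name="docker-tuned",
    work_queue_name="my-queue",  # the queue your agent polls (placeholder)
    infrastructure=DockerContainer.load("my-docker-block"),  # placeholder block name
    infra_overrides={  # agent-era counterpart of job_variables
        "env": {"PREFECT_LOGGING_LEVEL": "DEBUG"},
        "stream_output": False,
    },
)
deployment.apply()
```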
@Marvin give me the "prefect", "agent", "start" args
thought for 185.9 seconds
Got it — here are the common args for `prefect agent start` in Prefect 2.20.9. I can’t run `--help` for 2.x in this environment, but this reflects the 2.20.x agent flags.
```text
Usage: prefect agent start [OPTIONS]

Options:
  -q, --work-queue TEXT        One or more work queue names to poll; can be
                               passed multiple times
  -m, --match TEXT             Prefix to match work queue names dynamically
                               (e.g., "prod-" matches all queues starting with it)
  --prefetch-seconds INTEGER   Seconds to prefetch scheduled runs before their
                               scheduled start time (helps with infra spin-up)
  --limit INTEGER              Max number of flow runs to start concurrently
  --run-once                   Poll once and exit (don’t run continuously)
  --name TEXT                  A name for this agent instance
  --api TEXT                   Prefect API URL (otherwise uses PREFECT_API_URL)
  --log-level [CRITICAL|ERROR|WARNING|INFO|DEBUG]
                               Set log verbosity
  --hide-welcome               Suppress the startup banner
  --help                       Show this message and exit
```
Example:
```bash
prefect agent start \
  -q my-queue \
  -m prod- \
  --prefetch-seconds 60 \
  --limit 5 \
  --name gpu-agent \
  --log-level INFO
```
Docs: Agents in Prefect 2.x

If you’d like, I can also translate this to an equivalent Prefect 3 worker command (recommended going forward), but the above is correct for Prefect 2.20.9.
@Marvin How can I manipulate the container run command when the flow is run in a DockerContainer? I'm using an agent.
thought for 127.7 seconds
In Prefect 2.x with an agent and a DockerContainer infrastructure block, you control the container’s run command via the block’s `command` field or per-deployment “Infrastructure Overrides”.

Important note: Prefect normally injects its own command to start your flow run inside the container. If you fully replace `command`, you are responsible for invoking the Prefect engine correctly; otherwise, the flow won’t start. For most cases where you just want pre/post setup, it’s safer to use a wrapper ENTRYPOINT in your image (see Option 3).

Options to manipulate the run command

1) Set the command on the DockerContainer block (applies to all deployments using the block)
- UI: Blocks > Docker container > your block > edit > Command
- Python:
```python
from prefect.infrastructure import DockerContainer

block = DockerContainer.load("my-docker-block")
# Either a string or a list of args; this replaces the default command
block.command = ["/bin/bash", "-lc", "echo 'pre-setup'; python -m prefect.engine"]  # Be sure to run the Prefect engine
block.save("my-docker-block", overwrite=True)
```
2) Per-deployment override (Infrastructure Overrides)
- UI: open your deployment > Edit > Infrastructure Overrides (JSON). Example:
```json
{
  "command": [
    "/bin/bash",
    "-lc",
    "echo 'pre-setup'; python -m prefect.engine"
  ]
}
```
- CLI at build time (exact flags can vary by 2.20.9 point release; run `prefect deployment build --help` locally to confirm):
```bash
prefect deployment build path/to/flow.py:my_flow \
  -n my-deployment \
  -q my-queue \
  -ib "docker-container/my-docker-block" \
  --override 'command=["/bin/bash","-lc","echo pre-setup; python -m prefect.engine"]' \
  --apply
```
3) Recommended for most: wrap the Prefect-provided command via your image ENTRYPOINT

This preserves Prefect’s command and lets you run setup before it.
- Dockerfile:
```dockerfile
COPY entrypoint.sh /entrypoint.sh
RUN chmod +x /entrypoint.sh
ENTRYPOINT ["/entrypoint.sh"]
```
- entrypoint.sh:
```sh
#!/bin/sh
set -e
# your setup here (e.g., nvidia-smi, env prep, file writes)
echo "Doing setup..."
# hand off to the command Prefect provides (don't replace it)
exec "$@"
```
With this approach, you do not set `command` in Prefect at all; the agent/infra will pass its usual command to your entrypoint, and your script runs first.

A couple of tips
- If you need GPU flags, don’t put them in `command` — they’re container create options. In 2.x DockerContainer you can use infrastructure overrides to set `device_requests` (equivalent of `--gpus all`):
```json
{
  "container_create_kwargs": {
    "device_requests": [
      {
        "driver": "nvidia",
        "count": -1,
        "capabilities": [["gpu"]]
      }
    ]
  }
}
```
- If you must fully replace `command`, make sure it ends by invoking the Prefect engine correctly (e.g., `python -m prefect.engine ...`). If you’re not sure about the exact arguments, prefer the ENTRYPOINT wrapper pattern above.

Docs
- Infrastructure concepts (2.x): Infrastructure overview
- Agents (2.x): Agents

If you share how you want to “manipulate” the command (prepend setup, change interpreter, run a shell, etc.), I can give you a precise snippet that keeps your flow runs working.
@Marvin How do I see if the container_create_kwargs is working
thought for 55.2 seconds
Great question. There are a few reliable ways to verify that your `container_create_kwargs` are being applied when using a DockerContainer infrastructure block with an agent in Prefect 2.x.

Quick checklist

1) Inspect the deployment config
- Confirm your overrides are on the deployment:
```bash
prefect deployment inspect '<flow-name>/<deployment-name>'
```
Look for “Job Variables” or “Infrastructure Overrides” including your `container_create_kwargs`.

2) Watch agent logs at DEBUG
- Start your agent with more verbosity to see container creation details:
```bash
prefect agent start -q <your-queue> --log-level DEBUG
```
You should see logs around creating the Docker container; errors will show if a kwarg is invalid.

3) Inspect the created Docker container (most definitive)
- Trigger a run:
```bash
prefect deployment run '<flow-name>/<deployment-name>'
```
- Find the container once it starts:
```bash
docker ps --format 'table {{.ID}}\t{{.Names}}\t{{.Image}}'
```
- Inspect for your settings. For GPU `device_requests` specifically:
```bash
docker inspect <container_id> --format '{{json .HostConfig.DeviceRequests}}'
```
You should see something like:
```json
[{"Driver":"nvidia","Count":-1,"Capabilities":[["gpu"]]}]
```
Or view the full HostConfig:
```bash
docker inspect <container_id> --format '{{json .HostConfig}}' | jq .
```
4) Validate from inside the container (for GPU)
- While the container is running:
```bash
docker exec -it <container_id> nvidia-smi
```
If `device_requests` is working and the host has the NVIDIA Container Toolkit installed, you’ll see GPUs listed.

Tips to make inspection easier
- Keep the container running long enough to inspect: add a short sleep inside your flow (e.g., a task that sleeps for 60–120s) so you can `docker ps` and `docker exec` before it exits (there’s a small sketch of such a flow after the JSON example below).
- If you’re testing `device_requests` and don’t see it in `docker inspect`, double-check that your overrides were applied to the correct deployment and that your agent picked up the updated deployment.

Example `container_create_kwargs` for GPUs
```json
{
  "container_create_kwargs": {
    "device_requests": [
      {
        "driver": "nvidia",
        "count": -1,
        "capabilities": [["gpu"]]
      }
    ]
  }
}
```
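For the “keep it running long enough” tip above, here is a minimal sketch of such a flow: it logs `nvidia-smi` output from inside the flow-run container and then sleeps so you have time to `docker exec` / `docker inspect` it. It assumes `nvidia-smi` is present in your image; the flow and parameter names are just placeholders.

```python
import subprocess
import time

from prefect import flow, get_run_logger

@flow
def gpu_smoke_test(sleep_seconds: int = 90):
    logger = get_run_logger()
    # Log GPU visibility from inside the flow-run container.
    result = subprocess.run(["nvidia-smi", "-L"], capture_output=True, text=True)
    logger.info("nvidia-smi output:\n%s", result.stdout or result.stderr)
    # Keep the container alive long enough to inspect it from the host.
    time.sleep(sleep_seconds)
```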
Docs for reference
- Infrastructure (2.x)
- Agents (2.x)

If you share your deployment name and the exact JSON you put in `container_create_kwargs`, I can help you confirm what to expect in `docker inspect` and agent logs.
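If you’d rather script the `docker inspect` step, here is a minimal sketch with the Docker Python SDK (assumes the `docker` package is installed where you run it); it simply prints `HostConfig.DeviceRequests` for each running container so you can spot the flow-run container by name:

```python
import json

import docker

client = docker.from_env()

# A GPU-enabled flow-run container should show something like:
# [{"Driver": "nvidia", "Count": -1, "Capabilities": [["gpu"]]}]
for container in client.containers.list():  # running containers only
    device_requests = container.attrs["HostConfig"].get("DeviceRequests")
    print(f"{container.name}: {json.dumps(device_requests)}")
```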