Prefect Community #ask-community

Alan

03/25/2024, 5:29 PM

Hello guys. I need some clarification here. When I have a Docker worker that is started through

prefect worker start -t docker --pool <pool-name>

and it pulls an image that I specified in a deployment, how can I ensure that the container is run with the NVIDIA runtime? This is so the Docker container has access to my host's NVIDIA driver.

Kevin Grismore

03/25/2024, 5:36 PM

You're looking for something like the equivalent to

docker run --gpus all

Kevin Grismore

03/25/2024, 5:38 PM

or maybe more accurately:

docker run --runtime=nvidia --gpus all

Alan

03/25/2024, 5:38 PM

The runtime=nvidia, because it seems to me the worker cannot actually detect my host GPU when the worker runs the container

Alan

03/25/2024, 5:40 PM

I have the following variables in my docker image:

ENV NVIDIA_VISIBLE_DEVICES all
ENV NVIDIA_DRIVER_CAPABILITIES all

Alan

03/25/2024, 5:40 PM

so that takes care of the --gpus all

Kevin Grismore

03/25/2024, 5:59 PM

it looks like we're not accounting for runtime on the worker, but it should be possible with a few changes

Alan

03/25/2024, 6:03 PM

what kind of changes? I am not sure if this is what's expected, but it seems that the container running inside the worker cannot see the underlying host nvidia driver/toolkit because of this?
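
One way to check what the container can actually see is to run a small script inside it. The following is a minimal sketch, not something from the thread: the function name `gpu_visibility_report` is hypothetical, and it assumes that when the NVIDIA runtime is active it exposes device nodes such as /dev/nvidia0 and puts nvidia-smi on the PATH. The two environment variables are the ones set in the image above.

```python
import os
import shutil
from pathlib import Path


def gpu_visibility_report(env=None, dev_dir="/dev"):
    """Collect simple hints about whether this container can see an NVIDIA GPU."""
    env = os.environ if env is None else env
    dev = Path(dev_dir)
    # Assumption: the NVIDIA runtime injects device nodes named nvidia* into /dev.
    device_nodes = sorted(p.name for p in dev.glob("nvidia*")) if dev.is_dir() else []
    return {
        "NVIDIA_VISIBLE_DEVICES": env.get("NVIDIA_VISIBLE_DEVICES"),
        "NVIDIA_DRIVER_CAPABILITIES": env.get("NVIDIA_DRIVER_CAPABILITIES"),
        # Assumption: nvidia-smi is only on the PATH when the driver utilities are mounted.
        "nvidia_smi_on_path": shutil.which("nvidia-smi") is not None,
        "device_nodes": device_nodes,
    }


if __name__ == "__main__":
    for key, value in gpu_visibility_report().items():
        print(f"{key}: {value}")
```

If the environment variables are set but no device nodes or nvidia-smi show up, that would be consistent with the container having been started without the NVIDIA runtime.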

Alan

03/25/2024, 6:26 PM

@Kevin Grismore

Kevin Grismore

03/25/2024, 6:48 PM

sorry I was on a call

Kevin Grismore

03/25/2024, 6:48 PM

I think this is because of how we're actually starting the container

Kevin Grismore

03/25/2024, 6:48 PM

https://github.com/PrefectHQ/prefect-docker/blob/main/prefect_docker/worker.py#L724

Alan

03/25/2024, 6:49 PM

Can't I just set the default runtime to be nvidia?

Alan

03/25/2024, 6:49 PM

{
  "runtimes": {
    "nvidia": {
      "path": "nvidia-container-runtime",
      "runtimeArgs": []
    }
  },
  "default-runtime": "nvidia"
}

Kevin Grismore

03/25/2024, 6:49 PM

yeah, is that something you've tried?

Alan

03/25/2024, 6:50 PM

https://earthly.dev/blog/buildingrunning-nvidiacontainer/#:~:text=The%20NVIDIA%20runtime%20must%20be,the%20%2D%2Dgpus%20all%20flag.

Alan

03/25/2024, 6:51 PM

not sure if it would work, but I thought maybe I can set it here: /etc/docker/daemon.json
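
For anyone trying the same thing, here is a rough sketch of merging that setting into an existing daemon config without dropping other keys. The helper name `with_nvidia_default_runtime` and the sample `log-driver` setting are hypothetical; the runtime entry is the one from the JSON above. This is only an illustration of the config edit, not a confirmed fix for the worker.

```python
import json


def with_nvidia_default_runtime(config):
    """Return a copy of a Docker daemon config with nvidia as the default runtime."""
    updated = dict(config)
    runtimes = dict(updated.get("runtimes", {}))
    # Keep an existing "nvidia" entry if present; otherwise add the one from the thread.
    runtimes.setdefault(
        "nvidia", {"path": "nvidia-container-runtime", "runtimeArgs": []}
    )
    updated["runtimes"] = runtimes
    updated["default-runtime"] = "nvidia"
    return updated


if __name__ == "__main__":
    existing = {"log-driver": "json-file"}  # hypothetical existing settings
    print(json.dumps(with_nvidia_default_runtime(existing), indent=2))
```

The Docker daemon only reads /etc/docker/daemon.json at startup, so it has to be restarted after the change (for example, sudo systemctl restart docker on a systemd host). docker info should then list nvidia as the default runtime.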

Kevin Grismore

03/25/2024, 6:51 PM

we create the container through the docker python client, where runtime is a kwarg, but it doesn't look like we're accounting for it when we build the config that makes up those kwargs
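
To illustrate the point, this is roughly what passing runtime through to the Docker Python client could look like. It is a sketch under stated assumptions, not the prefect-docker worker's actual code: `build_container_kwargs` and `run_container` are hypothetical names. The `runtime` and `device_requests` keyword arguments and `docker.types.DeviceRequest` come from the docker package (docker-py).

```python
def build_container_kwargs(image, runtime=None, **extra):
    """Build keyword arguments for docker-py's client.containers.run()."""
    kwargs = {"image": image, "detach": True, **extra}
    if runtime is not None:
        # Corresponds to `docker run --runtime=<runtime>`.
        kwargs["runtime"] = runtime
    return kwargs


def run_container(image, runtime="nvidia"):
    """Start a container with the given runtime and all GPUs requested."""
    # Imported lazily so build_container_kwargs works without the docker package.
    import docker
    from docker.types import DeviceRequest

    client = docker.from_env()
    return client.containers.run(
        **build_container_kwargs(
            image,
            runtime=runtime,
            # Corresponds to `docker run --gpus all`.
            device_requests=[DeviceRequest(count=-1, capabilities=[["gpu"]])],
        )
    )
```

Calling run_container("my-image:latest") (a placeholder image name) would then be the client-side counterpart of docker run --runtime=nvidia --gpus all.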

Kevin Grismore

03/25/2024, 6:52 PM

but setting it as the default runtime might work too

Alan

03/25/2024, 6:52 PM

I am trying it now!

Alan

03/25/2024, 7:15 PM

That did not work 😓

Kevin Grismore

03/25/2024, 7:16 PM

dang

Alan

03/25/2024, 7:17 PM

what are your thoughts on using a kubernetes worker?

Kevin Grismore

03/25/2024, 7:17 PM

people are definitely using the k8s worker and doing GPU stuff
