Mitch
05/30/2024, 9:56 PM
`ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')`
I do not get this error on the cloud. How can I resolve this?

Marvin
05/30/2024, 9:57 PM
The error `ImportError('libcuda.so.1: cannot open shared object file: No such file or directory')` indicates that the Docker container is missing the CUDA driver library, which is necessary for GPU operations.
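Since `libcuda.so.1` ships with the NVIDIA driver and is normally mounted into the container by the NVIDIA container runtime, a quick way to confirm the diagnosis is to check both sides (a sketch; adjust the container invocation to however you run your image):
```bash
# On the host: confirm the NVIDIA driver is installed and can see the GPU(s)
nvidia-smi

# Inside the container (e.g. via `docker run --rm -it <your-image> bash`):
# check whether the driver library the import is looking for is present
ldconfig -p | grep libcuda
```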
### Resolution Steps
1. Verify CUDA Installation on Host:
Ensure the NVIDIA driver and CUDA toolkit are installed on your local machine. You can verify the toolkit by running:
```bash
nvcc --version
```
2. Use a CUDA-Enabled Docker Image:
Use Docker images that come with CUDA pre-installed. For instance, NVIDIA provides such base images:
```dockerfile
FROM nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04
```
3. Pass the NVIDIA Runtime to Docker:
When running your Docker container, specify the NVIDIA runtime (a quick end-to-end check is sketched after this list):
```bash
docker run --gpus all your-docker-image
```
4. Update Docker Work Pool Configuration:
Ensure the Docker work pool is configured to use the appropriate base image. This can often be set in your deployment configuration.
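Before wiring this into Prefect, it's worth confirming that steps 1–3 work on their own. A minimal smoke test (a sketch; the CUDA image tag is just an example, use whichever tag you have pulled):
```bash
# Should print the same GPU table as running nvidia-smi directly on the host.
# If this fails, fix the NVIDIA driver / container toolkit setup before
# touching the Prefect work pool configuration.
docker run --rm --gpus all nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04 nvidia-smi
```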
Here are some relevant resources from Prefect you might find useful to review:
- Setting up a Docker type work pool via CLI.
- Modifying work pool behavior through the Prefect UI, CLI, or REST API.
For further reference, see the Prefect Workers documentation.
If you've followed these steps and still face issues, please let me know!

Mitch
05/30/2024, 9:58 PM

Marvin
05/30/2024, 9:59 PM
To get the effect of `--gpus all` for flow runs launched from your Prefect deployment or work pool, you'll need to use the `job_variables` parameter to customize the Docker job settings (image, environment variables, and so on). Here's how you can set it up:
### Specifying GPU Options in a Deployment
When you create a deployment, you can use the `job_variables` argument to override values from the work pool's base job template. Below is an example:
```python
from prefect.deployments import Deployment
from my_flow import my_flow_function  # your flow

deployment = Deployment.build_from_flow(
    flow=my_flow_function,
    name="my-gpu-deployment",
    work_pool_name="my-docker-pool",
    job_variables={
        # Docker work pool job variables
        "image": "nvidia/cuda:11.2.0-cudnn8-devel-ubuntu20.04",
        "env": {
            "CUDA_VISIBLE_DEVICES": "all"  # make all GPUs visible to the flow run
        },
        # Note: the `command` job variable sets the command run *inside* the
        # container; granting GPU access itself (the `--gpus all` part) is a
        # worker / work pool concern (see the work pool section below).
    },
)

if __name__ == "__main__":
    deployment.apply()
```
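On newer Prefect releases, `flow.deploy` also accepts `job_variables` directly. A minimal sketch, assuming a pre-built, CUDA-enabled image that already contains `prefect` and your flow's dependencies (the flow name, pool name, and image tag are placeholders):
```python
from prefect import flow


@flow(log_prints=True)
def my_flow_function():
    print("running inside the GPU-enabled container")


if __name__ == "__main__":
    my_flow_function.deploy(
        name="my-gpu-deployment",
        work_pool_name="my-docker-pool",
        # Pre-built image with CUDA, prefect, and your flow's dependencies installed
        image="your-registry/your-gpu-image:latest",
        build=False,  # use the tag above as-is instead of building a new image
        push=False,   # assume the image is already available to the worker
        job_variables={"env": {"CUDA_VISIBLE_DEVICES": "all"}},
    )
```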
### Configuring Work Pools
If you want to ensure that all jobs in a work pool have access to GPUs, configure the work pool’s base job template to use the necessary Docker runtime options.
You can create or edit your work pool configuration with the following CLI command:
```bash
prefect work-pool create --type docker my-docker-pool
```
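To bake GPU-related defaults into the pool itself, one approach (a sketch; the exact keys available in the template depend on your Prefect and Docker worker versions) is to start from the worker's default base job template and create the pool from an edited copy:
```bash
# Dump the Docker worker's default base job template so it can be customized
prefect work-pool get-default-base-job-template --type docker > docker-template.json

# Edit docker-template.json to set a CUDA-enabled default image and any other
# GPU-related options your worker version supports, then create the pool from it:
prefect work-pool create --type docker \
    --base-job-template docker-template.json \
    my-docker-pool
```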
### Examples and Resources
You can read more about how to manage deployments and work pools in the Prefect documentation:
- Managing Prefect Deployments
- Creating and Configuring Work Pools
If there are specific details or adjustments needed, please let me know!