# Docker GPU Error After Installing nvidia-docker2

## Problem Statement
Users encounter the following Docker error when attempting to use NVIDIA GPU capabilities in containers, even after installing `nvidia-docker2` and updating NVIDIA drivers:

```
docker: Error response from daemon: could not select device driver "" with capabilities: [[gpu]]
```
This error typically occurs on Linux systems (like Ubuntu 22.04) where:
- NVIDIA drivers are properly installed (`nvidia-smi` works on the host)
- `nvidia-docker2` packages have been installed according to the official documentation
- Docker commands requiring GPU acceleration fail unexpectedly
- Rebooting the system doesn't resolve the issue
The most common causes include conflicts with Docker installation methods and incomplete NVIDIA Container Toolkit setup.
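As a quick first diagnostic, you can list the runtimes Docker currently knows about; if `nvidia` is absent, the Container Toolkit is not registered yet (a minimal check):

```bash
# Look for "nvidia" among the registered runtimes
docker info | grep -i runtimes
```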
## Verified Solutions

### 1. Properly Install NVIDIA Container Toolkit
::alert{type="warning"} Important: The nvidia-docker2
package requires explicit configuration of the NVIDIA Container Toolkit to function correctly. ::
```bash
# Step 1: Add repository and update
curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg && \
  curl -s -L https://nvidia.github.io/libnvidia-container/stable/deb/nvidia-container-toolkit.list | \
  sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
  sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list && \
  sudo apt-get update

# Step 2: Install the toolkit
sudo apt-get install -y nvidia-container-toolkit

# Step 3: Configure Docker runtime
sudo nvidia-ctk runtime configure --runtime=docker

# Step 4: Restart Docker
sudo systemctl restart docker
```
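If you want to confirm what Step 3 changed: `nvidia-ctk` registers the runtime in Docker's daemon configuration. On a typical setup the file looks roughly like this (contents may vary by toolkit version):

```bash
cat /etc/docker/daemon.json
# Typical contents (may differ on your system):
# {
#     "runtimes": {
#         "nvidia": {
#             "args": [],
#             "path": "nvidia-container-runtime"
#         }
#     }
# }
```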
After installation, verify with a test container:
```bash
sudo docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
```
::alert{type="info"} Explanation: Many users install nvidia-docker2
but miss the required configuration steps. This explicitly configures Docker to use NVIDIA GPU resources through Linux containers. ::
### 2. Resolve Snap Installation Conflicts
::alert{type="danger"} If you have Docker installed via Snap, this will conflict with NVIDIA Docker configurations. ::
Remove the conflicting Snap installation:

```bash
sudo snap remove docker
```

Purge all Docker components:

```bash
sudo apt purge docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
```

Reinstall Docker from the official repositories, following the official Docker installation guide.

Reinstall `nvidia-docker2`:

```bash
distribution=$(. /etc/os-release; echo $ID$VERSION_ID) \
  && curl -fsSL https://nvidia.github.io/libnvidia-container/gpgkey | sudo gpg --dearmor -o /usr/share/keyrings/nvidia-container-toolkit-keyring.gpg \
  && curl -s -L https://nvidia.github.io/libnvidia-container/$distribution/libnvidia-container.list | \
     sed 's#deb https://#deb [signed-by=/usr/share/keyrings/nvidia-container-toolkit-keyring.gpg] https://#g' | \
     sudo tee /etc/apt/sources.list.d/nvidia-container-toolkit.list
sudo apt-get update
sudo apt-get install -y nvidia-docker2
sudo systemctl restart docker
```
::alert{type="info"} Explanation: Snap installations manage services differently and often conflict with manual Docker configurations. A clean installation using native packages eliminates configuration mismatches. ::
### 3. Verify CUDA Toolkit Installation

The GPU driver alone isn't sufficient for Docker GPU support. Ensure the CUDA toolkit is installed:
Check CUDA version compatibility with your drivers:

```bash
nvidia-smi
```

Look for the "CUDA Version" field in the output.
Install the CUDA Toolkit if missing:

```bash
wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu2204/x86_64/cuda-keyring_1.1-1_all.deb
sudo dpkg -i cuda-keyring_1.1-1_all.deb
sudo apt update
sudo apt -y install cuda-toolkit-12-3
```
Add the CUDA paths to your `.bashrc`:

```bash
echo 'export PATH=/usr/local/cuda/bin:$PATH' >> ~/.bashrc
echo 'export LD_LIBRARY_PATH=/usr/local/cuda/lib64:$LD_LIBRARY_PATH' >> ~/.bashrc
source ~/.bashrc
```
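You can then confirm the toolkit is on your `PATH` (a quick sanity check):

```bash
# Should print the installed CUDA compiler version, e.g. "release 12.3"
nvcc --version
```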
### 4. Full Purge and Reinstall (Last Resort)
Remove all Docker and NVIDIA Docker packages:

```bash
sudo apt purge nvidia-container* nvidia-docker* docker*
sudo rm -rf /var/lib/docker
sudo rm -rf /etc/docker
```
Reinstall core dependencies:

```bash
sudo apt install -f
sudo apt autoremove
sudo apt update
```
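Before reinstalling, it can help to confirm nothing was left behind (an illustrative check; an empty result means the purge succeeded):

```bash
# Should list no remaining installed (ii) docker or nvidia-container packages
dpkg -l | grep -E 'docker|nvidia-container'
```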
Then reinstall Docker and NVIDIA Docker following the official guides for each, as described in Solutions 1 and 2 above.
## Final Verification
Validate your installation with both:

```bash
sudo docker run --rm --gpus all nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
sudo docker run --rm -it --gpus all ubuntu nvidia-smi
```
The expected output shows GPU information identical to running `nvidia-smi` on the host machine.
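Once `--gpus all` works, you can also expose only a specific GPU to a container (optional; device indices depend on your hardware):

```bash
# Expose only the first GPU (index 0) to the container
sudo docker run --rm --gpus device=0 nvidia/cuda:11.6.2-base-ubuntu20.04 nvidia-smi
```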
::alert{type="success"} Best Practice Tip: Always install Docker from official repositories rather than Snap to avoid configuration conflicts with GPU drivers. Maintain consistent driver versions across host system and Docker images. :::