Wiki NVIDIA Driver and CUDA Library
Ecosystem:
Hardware:
- NVIDIA GPU(Hardware)
Software:
- NVIDIA Driver(GPU, Graphics Card)
- libnvidia-encode.so
- CUDA Toolkit
- libnvcuvid.so
- libcuda.so
- cuDNN library
- NVIDIA Container Toolkit
- libnvidia-container.so
NVIDIA Driver on Ubuntu
Find out whether the host machine have NVIDIA GPU hardware
$ lspci | grep VGA
0000:ac:00.0 VGA compatible controller: NVIDIA Corporation Device 2233 (rev a1)
or,
$ sudo lshw -C display
*-display
description: VGA compatible controller
product: NVIDIA Corporation
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:ac:00.0
logical name: /dev/fb0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom fb
configuration: depth=32 driver=nouveau latency=0 resolution=1920,1080
resources: iomemory:204f0-204ef iomemory:204f0-204ef irq:68 memory:99000000-99ffffff memory:204fe0000000-204fefffffff memory:204ff0000000-204ff1ffffff ioport:3000(size=128) memory:9a080000-9a0fffff
or,
$ hwinfo --gfxcard --short
graphics card:
nVidia VGA compatible controller
Primary display adapter: #94
Check which NVIDIA driver being used
Ubuntu is using open-source Nouveau drivers
$ lsmod | grep nouveau
nouveau 2306048 1
mxm_wmi 16384 1 nouveau
i2c_algo_bit 16384 1 nouveau
drm_ttm_helper 16384 1 nouveau
ttm 86016 2 drm_ttm_helper,nouveau
drm_kms_helper 311296 1 nouveau
drm 622592 5 drm_kms_helper,drm_ttm_helper,ttm,nouveau
video 65536 2 dell_wmi,nouveau
wmi 32768 7 dell_wmi_sysman,dell_wmi,wmi_bmof,dell_smbios,dell_wmi_descriptor,mxm_wmi,nouveau
Ubuntu is not using the proprietary NVIDIA drivers
$ lsmod | grep nvidia
Install the NVIDIA driver
Ubuntu Linux Install Nvidia Driver (Latest Proprietary Driver)
Install Nvidia Beta Drivers via PPA Repository
Verify the NVIDIA driver
nvidia-smi
nvidia-smi --query-gpu=driver_version --format=csv
dconfig -p | grep nvidia
Reload NVIDIA driver
Get related drivers,
lsmod | grep nvidia
Unload drivers,
sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia_uvm
sudo rmmod nvidia
Load NVIDIA driver again,
nvidia-smi
cuda - Nvidia NVML Driver/library version mismatch - Stack Overflow
Prevent updating NVIDIA driver,
sudo apt-mark hold nvidia-driver-535
sudo apt-mark hold nvidia-dkms-535
sudo apt-mark hold nvidia-utils-535
updates - How to prevent updating of a specific package? - Ask Ubuntu
NVIDIA CUDA Toolkit on WSL
NVIDIA CUDA software stack on WSL 2:
NVIDIA CUDA Toolkit on Ubuntu
Official documentation: CUDA installation
How to Install CUDA on Ubuntu 22.04 LTS
NVIDIA Container Toolkit
Build and run containers leveraging NVIDIA GPUs, already including CUDA Toolkit.
Prerequisites on Host Machine:
Running a docker container ubuntu
,
Specialized Configurations with Docker — container-toolkit 1.14.1 documentation
$ docker run --rm --gpus all ubuntu nvidia-smi
$ docker run --rm --gpus all ubuntu ldconfig -p | grep nvidia
libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
libnvidia-pkcs11-openssl3.so.535.86.05 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.86.05
libnvidia-opencl.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
libnvidia-nvvm.so.4 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
libnvidia-ml.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
libnvidia-cfg.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1
libnvidia-allocator.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1
$ docker run --rm -e NVIDIA_DRIVER_CAPABILITIES=video --gpus all ubuntu ldconfig -p | grep nvidia
libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
libnvidia-pkcs11-openssl3.so.535.86.05 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.86.05
libnvidia-opticalflow.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1
libnvidia-opencl.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
libnvidia-nvvm.so.4 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
libnvidia-encode.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1
libnvidia-allocator.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1
Running a docker container nvidia/cuda
,
Make sure the version of CUDA container nvidia/cuda:xx.x.x-base-ubuntu22.04
such as 12.2.0
in following must be compatible with the version of the Nvidia GPU driver on the host platform such as >525.60.13
.
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
docker run --rm \
--gpus all \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,video,utility,graphics \
nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
More Dockerfile
examples:
FFmpeg in NVIDIA CUDA container
https://developer.nvidia.com/ffmpeg
https://developer.nvidia.com/blog/nvidia-ffmpeg-transcoding-guide/
Install FFmpeg on Nvidia CUDA Container
Known issues
After random long running time, in Nvidia docker the FFmpeg encoding stops and error comes out: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
.
[Issue]: NVidia Docker transcoding randomly stops working after 5 minutes to 4 hours later. · Issue #9287 · jellyfin/jellyfin · GitHub