NVIDIA Driver and CUDA Library

Ecosystem:

Hardware:

  • NVIDIA GPU(Hardware)

Software:

  • NVIDIA Driver (GPU, graphics card)
    • libcuda.so
    • libnvcuvid.so
    • libnvidia-encode.so
  • CUDA Toolkit
    • libcudart.so
  • cuDNN library
  • NVIDIA Container Toolkit
    • libnvidia-container.so

[Figure: CUDA Components]

NVIDIA Driver on Ubuntu

Find out whether the host machine has NVIDIA GPU hardware

$ lspci | grep VGA
0000:ac:00.0 VGA compatible controller: NVIDIA Corporation Device 2233 (rev a1)

or,

$ sudo lshw -C display
*-display
description: VGA compatible controller
product: NVIDIA Corporation
vendor: NVIDIA Corporation
physical id: 0
bus info: pci@0000:ac:00.0
logical name: /dev/fb0
version: a1
width: 64 bits
clock: 33MHz
capabilities: pm msi pciexpress vga_controller bus_master cap_list rom fb
configuration: depth=32 driver=nouveau latency=0 resolution=1920,1080
resources: iomemory:204f0-204ef iomemory:204f0-204ef irq:68 memory:99000000-99ffffff memory:204fe0000000-204fefffffff memory:204ff0000000-204ff1ffffff ioport:3000(size=128) memory:9a080000-9a0fffff

or,

$ hwinfo --gfxcard --short
graphics card:
nVidia VGA compatible controller

Primary display adapter: #94
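
The `grep VGA` filter above can miss a GPU that lspci reports under another device class. Below is a minimal sketch of a stricter filter. It assumes that NVIDIA cards without display outputs are listed as "3D controller", and the function name nvidia_gpu_lines is made up for illustration.

```shell
# Filter lspci-style lines on stdin and keep only NVIDIA display devices.
# Matching "3D controller" in addition to "VGA compatible controller" is an
# assumption: headless GPUs are commonly reported under that class.
nvidia_gpu_lines() {
    grep -i 'nvidia' | grep -E 'VGA compatible controller|3D controller'
}

# Usage on a real host (not executed here):
#   lspci | nvidia_gpu_lines && echo "NVIDIA GPU found"
```

The function exits with a non-zero status when no line matches, so it can be used directly in an `if` or with `&&`.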

Check which NVIDIA driver is being used

If the nouveau module is listed, Ubuntu is using the open-source Nouveau driver:

$ lsmod | grep nouveau
nouveau 2306048 1
mxm_wmi 16384 1 nouveau
i2c_algo_bit 16384 1 nouveau
drm_ttm_helper 16384 1 nouveau
ttm 86016 2 drm_ttm_helper,nouveau
drm_kms_helper 311296 1 nouveau
drm 622592 5 drm_kms_helper,drm_ttm_helper,ttm,nouveau
video 65536 2 dell_wmi,nouveau
wmi 32768 7 dell_wmi_sysman,dell_wmi,wmi_bmof,dell_smbios,dell_wmi_descriptor,mxm_wmi,nouveau

If the following command prints nothing, Ubuntu is not using the proprietary NVIDIA driver:

$ lsmod | grep nvidia
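
The two lsmod checks can be folded into one helper. This is a sketch only, and the function name gpu_driver_from_lsmod is hypothetical. It reads lsmod output on stdin and reports which GPU kernel driver is loaded.

```shell
# Read lsmod output on stdin and print which GPU kernel driver is loaded:
# "nvidia", "nouveau", or "none". Only the first column (the module name)
# is inspected, so a dependency list such as "video ... nouveau" on another
# module's line does not count as a match.
gpu_driver_from_lsmod() {
    awk '
        $1 == "nvidia"  { found_nvidia = 1 }
        $1 == "nouveau" { found_nouveau = 1 }
        END {
            if (found_nvidia)       print "nvidia"
            else if (found_nouveau) print "nouveau"
            else                    print "none"
        }'
}

# Usage on a real host (not executed here):
#   lsmod | gpu_driver_from_lsmod
```

Matching on the exact module name avoids the false positives that a plain `grep nvidia` can produce from helper modules or dependency columns.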

Install the NVIDIA driver

Ubuntu Linux Install Nvidia Driver (Latest Proprietary Driver)

Install Nvidia Beta Drivers via PPA Repository

Verify the NVIDIA driver

nvidia-smi

nvidia-smi --query-gpu=driver_version --format=csv
ldconfig -p | grep nvidia
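
A further check is whether the loaded kernel module and the user-space libraries come from the same driver release. When they differ, nvidia-smi typically reports a "Driver/library version mismatch". The sketch below assumes a typical Ubuntu x86-64 layout: the kernel module version in /proc/driver/nvidia/version and the NVML library at /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1. Both helper names are hypothetical.

```shell
# Print the first dotted version number (for example 535.86.05) found in
# the text on stdin. At least one dot is required, so "x86_64" and plain
# integers such as years are skipped.
first_version() {
    grep -oE '[0-9]+(\.[0-9]+)+' | head -n 1
}

# Succeed only when both version strings are non-empty and identical.
versions_match() {
    [ -n "$1" ] && [ "$1" = "$2" ]
}

# Usage on a real host (paths are assumptions, not executed here):
#   kernel=$(first_version < /proc/driver/nvidia/version)
#   lib=$(readlink -f /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1 | first_version)
#   versions_match "$kernel" "$lib" || echo "mismatch: kernel=$kernel lib=$lib"
```

If the versions differ, the driver usually has to be reloaded or the machine rebooted.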

Reload NVIDIA driver

List the loaded NVIDIA kernel modules:

lsmod | grep nvidia

Unload them, dependent modules first (rmmod fails while a module is still in use):

sudo rmmod nvidia_drm
sudo rmmod nvidia_modeset
sudo rmmod nvidia_uvm
sudo rmmod nvidia

Load the NVIDIA driver again by running nvidia-smi:

nvidia-smi
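
Not every host has all four modules loaded, and rmmod reports an error for a module that is absent. The sketch below uses a hypothetical helper, nvidia_unload_order, to print only the modules that are actually loaded, in the same order as the rmmod commands above.

```shell
# Given lsmod output on stdin, print the loaded NVIDIA modules in the order
# they should be removed: dependents first, the core "nvidia" module last.
nvidia_unload_order() {
    loaded=$(awk '{ print $1 }')          # module names, one per line
    for mod in nvidia_drm nvidia_modeset nvidia_uvm nvidia; do
        if printf '%s\n' "$loaded" | grep -qx "$mod"; then
            printf '%s\n' "$mod"
        fi
    done
}

# Usage on a real host (needs root; not executed here):
#   lsmod | nvidia_unload_order | while read -r mod; do sudo rmmod "$mod"; done
```

Any process that still holds the GPU, such as a display manager or a running container, has to be stopped first, otherwise the removal fails.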

cuda - Nvidia NVML Driver/library version mismatch - Stack Overflow

Prevent apt from updating the NVIDIA driver:

sudo apt-mark hold nvidia-driver-535
sudo apt-mark hold nvidia-dkms-535
sudo apt-mark hold nvidia-utils-535
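
The same three holds apply to other driver branches. This is a small sketch with a hypothetical helper, hold_commands, that prints the commands for a given branch number so they can be reviewed before running. The package name prefixes are the ones used above, and other installations may need additional packages held.

```shell
# Print the apt-mark commands that pin the driver packages of one branch.
# $1 is the driver branch number, for example 535.
hold_commands() {
    branch=$1
    for pkg in nvidia-driver nvidia-dkms nvidia-utils; do
        printf 'sudo apt-mark hold %s-%s\n' "$pkg" "$branch"
    done
}

# Usage (not executed here):
#   hold_commands 535        # review the commands
#   hold_commands 535 | sh   # apply them
```

`apt-mark showhold` lists the packages currently on hold, and `apt-mark unhold` releases them again.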

updates - How to prevent updating of a specific package? - Ask Ubuntu

NVIDIA CUDA Toolkit on WSL

[Figure: NVIDIA CUDA software stack on WSL 2]

NVIDIA CUDA Toolkit on Ubuntu

Official documentation: CUDA installation

How to Install CUDA on Ubuntu 22.04 LTS

NVIDIA Container Toolkit

Build and run containers that use NVIDIA GPUs. The host needs only the NVIDIA driver; the CUDA libraries come from the container image.

NVIDIA Container Toolkit

Prerequisites on Host Machine:

Run a plain ubuntu Docker container:

Specialized Configurations with Docker — container-toolkit 1.14.1 documentation

$ docker run --rm --gpus all ubuntu nvidia-smi

$ docker run --rm --gpus all ubuntu ldconfig -p | grep nvidia
libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
libnvidia-pkcs11-openssl3.so.535.86.05 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.86.05
libnvidia-opencl.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
libnvidia-nvvm.so.4 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
libnvidia-ml.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ml.so.1
libnvidia-cfg.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-cfg.so.1
libnvidia-allocator.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1

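With NVIDIA_DRIVER_CAPABILITIES=video, the container gets the video libraries (libnvidia-encode.so.1, libnvidia-opticalflow.so.1) instead of the utility ones (libnvidia-ml.so.1, libnvidia-cfg.so.1):

$ docker run --rm -e NVIDIA_DRIVER_CAPABILITIES=video --gpus all ubuntu ldconfig -p | grep nvidia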
libnvidia-ptxjitcompiler.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-ptxjitcompiler.so.1
libnvidia-pkcs11-openssl3.so.535.86.05 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-pkcs11-openssl3.so.535.86.05
libnvidia-opticalflow.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opticalflow.so.1
libnvidia-opencl.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-opencl.so.1
libnvidia-nvvm.so.4 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-nvvm.so.4
libnvidia-encode.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-encode.so.1
libnvidia-allocator.so.1 (libc6,x86-64) => /usr/lib/x86_64-linux-gnu/libnvidia-allocator.so.1
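
To see what a capability setting changes, the two listings can be saved to files and compared. The sketch below uses a hypothetical helper, libs_added, to print the libraries that appear only in the second listing. It assumes each file holds `ldconfig -p | grep nvidia` output in the format shown above.

```shell
# Print library names that appear in listing $2 but not in listing $1.
# Both arguments are files containing `ldconfig -p` style lines; only the
# first column (the library name) is compared.
libs_added() {
    old=$(mktemp) && new=$(mktemp) || return 1
    awk '{ print $1 }' "$1" | sort -u > "$old"
    awk '{ print $1 }' "$2" | sort -u > "$new"
    comm -13 "$old" "$new"    # column 3 suppressed: lines unique to $new
    rm -f "$old" "$new"
}

# Usage (file names are placeholders, not executed here):
#   docker run --rm --gpus all ubuntu ldconfig -p | grep nvidia > default.txt
#   docker run --rm -e NVIDIA_DRIVER_CAPABILITIES=video --gpus all ubuntu ldconfig -p | grep nvidia > video.txt
#   libs_added default.txt video.txt
```

For the two listings above this approach would report libnvidia-encode.so.1 and libnvidia-opticalflow.so.1. Swapping the arguments shows what the video setting drops.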

Run an nvidia/cuda Docker container:

Note: the CUDA version of the nvidia/cuda:xx.x.x-base-ubuntu22.04 image (12.2.0 in the commands below) must be compatible with the NVIDIA GPU driver version on the host (for CUDA 12.x, driver 525.60.13 or newer).

docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
docker run --rm \
--gpus all \
-e NVIDIA_VISIBLE_DEVICES=all \
-e NVIDIA_DRIVER_CAPABILITIES=compute,video,utility,graphics \
nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
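
The driver requirement can be checked before pulling an image. This is a minimal sketch: driver_at_least is a hypothetical helper, and the minimum 525.60.13 is the value quoted above for CUDA 12.x. It relies on `sort -V` from GNU coreutils.

```shell
# Succeed when driver version $1 is at least the minimum version $2.
# sort -V orders dotted version strings numerically field by field, so the
# minimum sorts first exactly when the check passes (525.60.13 sorts
# before 525.105.17, which a plain string comparison would get wrong).
driver_at_least() {
    [ "$(printf '%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Usage on a real host (not executed here):
#   host=$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)
#   driver_at_least "$host" 525.60.13 || echo "driver $host is too old for CUDA 12.x images"
```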

More Dockerfile examples:

FFmpeg in NVIDIA CUDA container

https://developer.nvidia.com/ffmpeg

https://docs.nvidia.com/video-technologies/video-codec-sdk/12.0/ffmpeg-with-nvidia-gpu/index.html#compiling-for-linux

https://developer.nvidia.com/blog/nvidia-ffmpeg-transcoding-guide/

Install FFmpeg on Nvidia CUDA Container

Known issues

After running for an unpredictable length of time, FFmpeg encoding inside an NVIDIA Docker container stops with the error: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected.
[Issue]: NVidia Docker transcoding randomly stops working after 5 minutes to 4 hours later. · Issue #9287 · jellyfin/jellyfin · GitHub

References

CUDA And Nvidia Graphics Driver

CUDA on WSL User Guide