podman-desktop/website/docs/podman/gpu.md
Anders Björklund effc6f84e8
fix: update the markdownlint targets (#10487)
* docs: fix lint markdown style issues

Signed-off-by: Anders F Björklund <anders.f.bjorklund@gmail.com>
2024-12-30 11:37:17 +01:00

sidebar_position: 20
title: GPU container access
description: GPU passthrough utilization within Windows, macOS and Linux

import Tabs from '@theme/Tabs'; import TabItem from '@theme/TabItem';

GPU container access

Giving a Podman container access to the GPU lets you run GPU-accelerated workloads inside containers. The instructions below show how to set up your operating system so that containers can use the GPU.

Windows

Prerequisites

  • NVIDIA Graphics Card (Pascal or later)
  • WSL2 (Hyper-V is not supported)

Procedure

  1. Install the most up-to-date NVIDIA GPU driver, which supports WSL2. You do not need to download anything else on your host machine for your NVIDIA card.

  2. Verify that WSL2 was installed when installing Podman Desktop.

  3. Create your Podman Machine.

  4. Install NVIDIA Container Toolkit onto the Podman Machine:

The Podman Machine requires the NVIDIA Container Toolkit. Install it by following the official NVIDIA guide or by running the steps below:

SSH into the Podman Machine:

$ podman machine ssh

Run the following commands on the Podman Machine, not the host system:

$ curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | \
 tee /etc/yum.repos.d/nvidia-container-toolkit.repo && \
 yum install -y nvidia-container-toolkit && \
 nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml && \
 nvidia-ctk cdi list

:::info

A configuration change might occur when you create or remove Multi-Instance GPU (MIG) devices, or upgrade the Compute Unified Device Architecture (CUDA) driver. In such cases, you must generate a new Container Device Interface (CDI) specification.

:::

Verification

To verify that containers can access the GPU, run nvidia-smi from within a container that has the NVIDIA drivers installed.

Run the following official NVIDIA container on your host machine:

$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi

Example output:

PS C:\Users\admin>  podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
Fri Aug 16 18:58:14 2024
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 545.36                 Driver Version: 546.33       CUDA Version: 12.3     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:07:00.0  On |                  N/A |
|  0%   34C    P8              20W / 170W |    886MiB / 12288MiB |      1%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A        33      G   /Xwayland                                 N/A      |
+---------------------------------------------------------------------------------------+
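
When this check is scripted, it can help to pull a single field out of the nvidia-smi table instead of reading it by eye. The following is a minimal sketch, not part of the official procedure: the function name smi_driver_version is hypothetical, and it assumes the header line keeps the "Driver Version: X.Y" layout shown in the example output.

```shell
# Extract the driver version from nvidia-smi table output read on stdin.
# Assumes the header contains "Driver Version: <number>" as in the example output.
smi_driver_version() {
    sed -n 's/.*Driver Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# Demo using the header line from the example output above.
sample='| NVIDIA-SMI 545.36                 Driver Version: 546.33       CUDA Version: 12.3     |'
printf '%s\n' "$sample" | smi_driver_version   # prints 546.33
```

In real use, the output of the podman run command above would be piped into smi_driver_version in place of the sample line. An empty result suggests that nvidia-smi did not print its usual table.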

Troubleshooting

Version mismatch

You might encounter the following error inside the containers:

# nvidia-smi
Failed to initialize NVML: N/A

This problem is caused by a mismatch between the Container Device Interface (CDI) specification and the installed driver version.

To fix this problem, generate a new CDI specification by running the following inside the Podman machine:

$ nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

:::info

You might need to restart your Podman machine.

:::
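
The error above can also be detected in a script so that the regeneration step is suggested automatically. This is only a sketch under stated assumptions: the function name nvml_init_failed is hypothetical, and it matches the literal error text shown in this section.

```shell
# Return success (0) when the text on stdin contains the NVML initialization error.
nvml_init_failed() {
    grep -q 'Failed to initialize NVML'
}

# Demo with the error text shown above; real input would be the container's output.
if printf 'Failed to initialize NVML: N/A\n' | nvml_init_failed; then
    echo "Regenerate the CDI specification inside the Podman machine."
fi
```

In practice, the output of the verification container (including stderr, for example with `2>&1`) would be piped into nvml_init_failed. If a restart turns out to be needed, `podman machine stop` followed by `podman machine start` restarts the machine.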

Additional resources

macOS

Prerequisites

  • Mac with Apple silicon (M1 or later)

Procedure

Note that using the "Metal" GPU on macOS relies on specialized software: a virtualized GPU inside the Podman Machine that translates Vulkan and MoltenVK calls to MSL (Metal Shading Language), the shading language for Apple's GPUs.

  1. Create a Podman Machine that uses libkrun:

[Image: libkrun]

Verification

Using the GPU functionality requires a specialized Containerfile that installs a patched Mesa driver.

  1. Create the following Containerfile:

FROM fedora:40
USER 0

# Install the patched mesa-krunkit drivers
RUN dnf -y install dnf-plugins-core && \
    dnf -y copr enable slp/mesa-krunkit && \
    dnf -y install mesa-vulkan-drivers vulkan-loader-devel vulkan-headers vulkan-tools vulkan-loader && \
    dnf clean all

  2. Build the image:

[Image: build_libkrun_image]

  3. Verify you can see the GPU by running a test container:

$ podman run --rm -it --device /dev/dri --name gpu-info quay.io/slopezpa/fedora-vgpu vulkaninfo | grep "GPU"

Example output:

$  podman run --rm -it --device /dev/dri --name gpu-info quay.io/slopezpa/fedora-vgpu vulkaninfo | grep "GPU"
  GPU id = 0 (Virtio-GPU Venus (Apple M1 Pro))
  GPU id = 1 (llvmpipe (LLVM 17.0.6, 128 bits))
GPU0:
 deviceType        = PHYSICAL_DEVICE_TYPE_INTEGRATED_GPU
 deviceName        = Virtio-GPU Venus (Apple M1 Pro)
GPU1:
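
The build step above can also be run from a terminal. The following is a sketch rather than the documented procedure: the image tag localhost/fedora-vgpu, the PODMAN override variable, and both function names are hypothetical, and it assumes the Containerfile from the first step sits in the build context directory.

```shell
# Hypothetical image tag; any local name works.
IMAGE_TAG="${IMAGE_TAG:-localhost/fedora-vgpu}"
# PODMAN can be overridden (for example with "echo") to do a dry run.
PODMAN="${PODMAN:-podman}"

# Build the image from the Containerfile in the given directory (default: current).
build_vgpu_image() {
    context="${1:-.}"
    "$PODMAN" build -t "$IMAGE_TAG" -f "$context/Containerfile" "$context"
}

# Run vulkaninfo in the built image with the virtualized GPU device attached.
vgpu_vulkaninfo() {
    "$PODMAN" run --rm --device /dev/dri "$IMAGE_TAG" vulkaninfo
}
```

After these definitions are loaded, `build_vgpu_image` followed by `vgpu_vulkaninfo | grep "GPU"` mirrors the verification step, on the assumption that the locally built image behaves like quay.io/slopezpa/fedora-vgpu. Setting `PODMAN=echo` prints the commands instead of running them.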

Additional resources

Note that the virtualized GPU (Virtio-GPU Venus (Apple M1 Pro)) supports only Vulkan compute shaders, not rendering or drawing. For more information on the available GPU features, run vulkaninfo from within the container.

Linux

Prerequisites

  • NVIDIA Graphics Card (Pascal or later)

Procedure

  1. Install the latest NVIDIA GPU Driver for your OS.
  2. Install the NVIDIA Container Toolkit by following the instructions for your Linux distribution.
  3. Generate the CDI Specification file for Podman:

The file is saved to either /etc/cdi or /var/run/cdi on your Linux distribution, and Podman uses it to detect your GPU(s).

Generate the CDI file:

$ nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml

Check the list of generated devices:

$ nvidia-ctk cdi list

More information, as well as troubleshooting tips, can be found in the official NVIDIA CDI guide.
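
To see which specification files are present in the two directories mentioned above, a small helper can list them. This is a minimal sketch: the function name list_cdi_specs is hypothetical, and it assumes that CDI specifications are stored as .yaml or .json files.

```shell
# List CDI specification files in the given directories.
# With no arguments, check the two locations named above.
list_cdi_specs() {
    [ "$#" -gt 0 ] || set -- /etc/cdi /var/run/cdi
    for dir in "$@"; do
        for spec in "$dir"/*.yaml "$dir"/*.json; do
            # Unmatched glob patterns stay literal, so keep only real files.
            if [ -f "$spec" ]; then
                printf '%s\n' "$spec"
            fi
        done
    done
    return 0
}

# Prints nothing when no specification has been generated yet.
list_cdi_specs
```

An empty listing suggests that the generate command has not been run yet, or that it wrote its output somewhere else.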

Verification

To verify that containers can access the GPU, run nvidia-smi from within a container that has the NVIDIA drivers installed.

Run the following official NVIDIA container on your host machine:

$ podman run --rm --device nvidia.com/gpu=all nvidia/cuda:11.0.3-base-ubuntu20.04 nvidia-smi
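
Before running the verification container from a script, it can be useful to confirm that the device name passed to --device is actually registered. The sketch below is illustrative only: the function name cdi_device_listed is hypothetical, and the sample listing is made-up text that assumes `nvidia-ctk cdi list` prints one device name per line.

```shell
# Return success (0) when the given CDI device name appears in the text on stdin.
# Defaults to the device name used in the verification command.
cdi_device_listed() {
    grep -qF -- "${1:-nvidia.com/gpu=all}"
}

# Demo with hypothetical listing text; real input would come from `nvidia-ctk cdi list`.
sample_listing='nvidia.com/gpu=0
nvidia.com/gpu=all'
if printf '%s\n' "$sample_listing" | cdi_device_listed; then
    echo "nvidia.com/gpu=all is available"
fi
```

In real use, `nvidia-ctk cdi list | cdi_device_listed` would replace the sample text. A failure here points back to the CDI generation step rather than to the container image.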

Additional resources