add multiple cards for one app support and update GPU modes description

cal-weng 2025-10-27 15:30:45 +08:00
parent 5a434b5b50
commit d25bde12c3
4 changed files with 58 additions and 42 deletions


@ -12,12 +12,13 @@ Olares allows you to harness the full power of your GPUs to accelerate demanding
This guide helps you understand and configure GPU allocation modes to maximize hardware performance.
::: tip GPU support
Olares supports **only NVIDIA GPUs** of **Turing architecture or later** (Turing, Ampere, Ada Lovelace, and Blackwell).
- Quick check: GTX/RTX **16 series and newer** consumer cards are supported.
- For other models, cross-check with the [compatible GPU table](https://github.com/NVIDIA/open-gpu-kernel-modules?tab=readme-ov-file#compatible-gpus).
- Other models: Cross-check with the [compatible GPU table](https://github.com/NVIDIA/open-gpu-kernel-modules?tab=readme-ov-file#compatible-gpus).
- Unknown model: Run `lspci | grep -i nvidia` to query the GPU architecture code and determine compatibility.
:::
:::
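The `lspci` check in the tip above can be sketched in a few lines of Python. This is a minimal illustration, not part of Olares: it assumes NVIDIA chip codes in `lspci` output start with a two-letter architecture prefix (`TU`, `GA`, `AD`, `GB` for the four architectures listed above), and the helper name and the sample line are hypothetical.

```python
import re

# Chip-code prefixes for the architectures the tip lists as supported.
# Assumption: the prefix of the chip code identifies the architecture.
SUPPORTED_PREFIXES = {
    "TU": "Turing",
    "GA": "Ampere",
    "AD": "Ada Lovelace",
    "GB": "Blackwell",
}

# Matches chip codes such as TU116 or GA102 (two letters, three digits,
# optional trailing letters) in a line of lspci output.
CHIP_CODE = re.compile(r"\b([A-Z]{2})\d{3}[A-Z]*\b")


def supported_architecture(lspci_line):
    """Return the architecture name if the chip code is in the list above, else None."""
    match = CHIP_CODE.search(lspci_line)
    if match is None:
        return None
    return SUPPORTED_PREFIXES.get(match.group(1))


# Illustrative line only, not captured from a real machine.
sample = "01:00.0 VGA compatible controller: NVIDIA Corporation TU116 [GeForce GTX 1660] (rev a1)"
print(supported_architecture(sample))
```

A `None` result only means the chip code is not in this short table, so the compatible GPU table remains the reference for anything the sketch does not recognize.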
:::warning AI Performance
Even if your GPU architecture is supported, **low VRAM capacity may cause AI applications to fail**. Ensure your GPU has enough memory for your workloads.
@ -27,23 +28,30 @@ Even if your GPU architecture is supported, **low VRAM capacity may cause AI app
Olares supports three GPU allocation modes. Choosing the right mode helps optimize performance based on your needs.
### App Exclusive
In this mode, the GPU's full compute capacity and VRAM are allocated to a single application to ensure maximum performance.
### Memory Slicing
In this mode, GPU VRAM is allocated to multiple applications by specified VRAM quotas:
- Applications with assigned VRAM can run concurrently on the GPU.
- The total assigned VRAM must not exceed the GPU's physical VRAM.
### Time Slicing
In this mode, any number of applications can be bound to the same GPU:
In this mode, a GPU can be bound to multiple applications and rotates execution among them in time slices.
- At any instant, only one application fully occupies the GPU's compute and VRAM.
- VRAM contents of other applications are temporarily swapped out to system memory.
* At any instant, only one application uses all available compute and VRAM of the GPU.
* Other apps enter a wait queue, and their CUDA and VRAM contents are swapped out to system memory.
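As a mental model of Time Slicing, the sketch below rotates a GPU between apps. It assumes a simple round-robin order, which the text above does not specify, and the function and app names are hypothetical. It only shows that one app is active per slice while the rest wait.

```python
from collections import deque


def time_slice_schedule(apps, slices):
    """Yield (active_app, waiting_apps) for each time slice, in round-robin order."""
    queue = deque(apps)
    if not queue:
        return
    for _ in range(slices):
        active = queue.popleft()   # this app gets the whole GPU for one slice
        yield active, list(queue)  # the others wait, swapped out to system memory
        queue.append(active)       # the active app rejoins the end of the queue


for active, waiting in time_slice_schedule(["app-a", "app-b", "app-c"], 4):
    print(f"on GPU: {active}; waiting: {waiting}")
```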
### App Exclusive
In this mode, the entire GPU is allocated to a single application.
* During execution, the app can use all compute and VRAM of the bound GPU.
* There is no cross-app contention or scheduling overhead, so the app gets the best available performance.
### Memory Slicing
In this mode, the GPU's VRAM is partitioned into fixed quotas for multiple designated applications.
* You need to manually set a quota for each app.
* The sum of quotas must not exceed the physical VRAM of the bound GPU. Oversubscription is not supported.
* Apps with a quota assigned can run concurrently, each limited to its own quota.
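The quota rule can be checked with simple arithmetic. The following sketch is illustrative only: the function name, app names, and numbers are hypothetical, and it is not how Olares itself validates quotas.

```python
def validate_quotas(quotas_gb, physical_vram_gb):
    """Return the unassigned VRAM in GB, or raise ValueError if the quotas are invalid."""
    for app, quota in quotas_gb.items():
        if quota <= 0:
            raise ValueError(f"quota for {app} must be positive, got {quota}")
    total = sum(quotas_gb.values())
    if total > physical_vram_gb:
        # Oversubscription is not supported, so the total must fit on the card.
        raise ValueError(f"quotas total {total} GB, but the GPU has {physical_vram_gb} GB")
    return physical_vram_gb - total


# Two apps on a hypothetical 24 GB card: 8 + 12 = 20 GB assigned, 4 GB left.
print(validate_quotas({"app-a": 8, "app-b": 12}, 24))
```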
:::tip Multi-GPU aggregation
You can bind multiple GPUs in the same cluster to one application to give it more VRAM. In such scenarios, only the **App Exclusive** and **Memory Slicing** modes are supported.
:::
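To make the aggregation rule concrete, the sketch below sums the VRAM an app would see across several GPUs and rejects modes that do not allow aggregation. It is a hedged illustration: the function name and data shape are hypothetical, and it assumes each entry gives the VRAM available to the app on that GPU (the whole card in App Exclusive, the app's quota in Memory Slicing).

```python
# Modes in which one app may be bound to multiple GPUs, per the tip above.
AGGREGATION_MODES = {"App Exclusive", "Memory Slicing"}


def aggregated_vram_gb(bindings):
    """bindings: list of (gpu_mode, vram_gb_available_to_app). Return the total VRAM in GB."""
    for mode, _ in bindings:
        if mode not in AGGREGATION_MODES:
            raise ValueError(f"{mode} mode does not support multi-GPU binding")
    return sum(vram for _, vram in bindings)


# One full hypothetical 24 GB card plus a 12 GB quota on a second card.
print(aggregated_vram_gb([("App Exclusive", 24), ("Memory Slicing", 12)]))
```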
## View GPU status
@ -52,8 +60,10 @@ To view your GPU status:
1. Navigate to **Settings** > **GPU**. The GPU list shows each GPU's model, associated node, total VRAM, and current GPU mode.
2. Click a specific GPU to view its details.
![GPU overview](/images/manual/olares/gpu-overview.png#bordered)
::: tip Note
If your Olares only has one GPU, navigating to the GPU section will take you directly to the GPU details page. If you have multiple GPUs, you will see a list first.
If your Olares only has one GPU, navigating to the GPU section will take you directly to the GPU details page.
:::
## Configure GPU mode
@ -69,23 +79,19 @@ On the **GPU details** page, select your desired mode from the **GPU mode** drop
:::tip Note
No manual pinning is required if you only have one GPU in your cluster.
:::
* **App Exclusive**
1. Select this mode from the GPU mode dropdown.
2. In the **Select exclusive app** dropdown, choose your target application.
3. Click **Confirm**.
![App exclusive](/images/manual/olares/gpu-app-exclusive.png#bordered)
* **Memory Slicing**
1. Select this mode from the dropdown.
2. In the **Allocate VRAM** section, click **Add an application**.
3. Select your target application and assign it a specific amount of VRAM (in GB).
4. Repeat for other applications and click **Confirm**.
![VRAM slicing](/images/manual/olares/gpu-memory-slicing.png#bordered)
::: tip Note
You can't assign more VRAM than the GPU's total VRAM.
:::
* **Memory Slicing**
1. Select this mode from the dropdown.
2. In the **Allocate VRAM** section, click **Add an application**.
3. Select your target application and assign it a specific amount of VRAM in GB.
4. Repeat for other applications and click **Confirm**.
![VRAM slicing](/images/manual/olares/gpu-memory-slicing.png#bordered)
## Learn more
- [Monitor GPU usage in Olares](../resources-usage.md)

Binary file not shown (image, 61 KiB).

Binary file not shown (image, 57 KiB).


@ -28,27 +28,37 @@ Olares supports only **NVIDIA GPUs**, and the architecture must be **Turing or newer**T
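Olares provides three allocation modes that you can choose from depending on your scenario.
### App Exclusive mode
In this mode, the compute power and VRAM of a single GPU are allocated to one application to ensure the best performance.
### Memory Slicing mode
In this mode, GPU VRAM can be allocated to multiple applications in specified amounts.
- All applications that are assigned VRAM can use the GPU at the same time.
- The total assigned VRAM must not exceed the total physical VRAM.
### Time Slicing mode
In this mode, any number of applications can be bound to the same GPU:
- At any given moment, only one application fully occupies the GPU's compute power and VRAM.
- Meanwhile, the VRAM contents of the other applications are temporarily swapped out to system memory.
In this mode, a single GPU is allocated to multiple applications in time slices.
- At any given moment, only one application occupies all of the compute power and available VRAM.
- The remaining applications enter a wait queue, and their CUDA and VRAM contents are swapped out to system memory.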
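### App Exclusive mode
In this mode, the compute power and VRAM of each GPU are allocated to a single application.
- While running, the application can use all of the GPU's compute power and VRAM.
- Applications running in this mode get the best performance.
### Memory Slicing mode
In this mode, the VRAM of each GPU is divided into fixed quotas that are allocated to multiple designated applications.
- You need to manually set a quota for each application.
- The sum of the quotas must not exceed the physical VRAM of the corresponding GPU. (Oversubscription is not currently supported.)
- Applications with a quota can run in parallel, and each can use only its own quota.
:::tip Multi-GPU aggregation
Within the same cluster, you can bind multiple GPUs to the same application to get more VRAM and compute power. In aggregation scenarios, only the App Exclusive and Memory Slicing modes are supported.
:::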
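## View GPU status
1. Go to **Settings > GPU**. The GPU list shows each GPU's model, node, total VRAM, and current allocation mode.
2. Click a GPU to open its details page.
![GPU overview](/images/zh/manual/olares/gpu-overview.png#bordered)
::: tip Note
If your Olares cluster has only one GPU, opening the GPU page takes you directly to the details page. If there are multiple GPUs, the GPU list is shown.
:::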