resolve comments

cal-weng 2025-10-11 21:45:15 +08:00
parent d8db9c458c
commit 5a434b5b50
2 changed files with 20 additions and 21 deletions

@@ -27,23 +27,23 @@ Even if your GPU architecture is supported, **low VRAM capacity may cause AI app
Olares supports three GPU allocation modes. Choosing the right mode helps optimize performance based on your needs.
### Time Slicing
In this mode, the GPU rotates across applications in time slices.
- At any instant, only one application uses the GPU's compute and VRAM.
- Other applications' VRAM contents are temporarily swapped out to system memory.
- Applications not assigned an exclusive GPU or dedicated VRAM are placed in the time-slicing queue by default.
### App Exclusive
In this mode, the GPU's full compute capacity and VRAM are allocated to a single application to ensure maximum performance.
### Memory Slicing
In this mode, GPU VRAM is allocated to multiple applications by specified quotas:
In this mode, GPU VRAM is allocated to multiple applications by specified VRAM quotas:
- Applications with a quota can run concurrently on the GPU.
- The sum of all quotas must not exceed the GPU's physical VRAM.
- Applications with assigned VRAM can run concurrently on the GPU.
- The sum of all assigned VRAM must not exceed the GPU's physical VRAM.
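The constraint above can be illustrated with a minimal sketch. It is not Olares code: the function name `validate_vram_quotas`, the MiB unit, and the dictionary layout are all assumptions made for illustration. It only shows the arithmetic behind the rule that the assigned VRAM must fit within the GPU's physical VRAM.

```python
def validate_vram_quotas(quotas_mib: dict[str, int], physical_vram_mib: int) -> int:
    """Return the unallocated VRAM in MiB, or raise ValueError if the plan is invalid.

    Hypothetical helper: `quotas_mib` maps an application name to the VRAM
    assigned to it, and `physical_vram_mib` is the GPU's physical VRAM.
    """
    for app, quota in quotas_mib.items():
        if quota <= 0:
            raise ValueError(f"{app}: assigned VRAM must be positive, got {quota} MiB")

    total = sum(quotas_mib.values())
    if total > physical_vram_mib:
        raise ValueError(
            f"assigned VRAM ({total} MiB) exceeds physical VRAM ({physical_vram_mib} MiB)"
        )
    return physical_vram_mib - total
```

For example, assigning 8192 MiB and 4096 MiB on a 16384 MiB GPU is valid and leaves 4096 MiB unassigned, while assigning 12288 MiB and 8192 MiB on the same GPU would be rejected.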
### Time Slicing
In this mode, any number of applications can be bound to the same GPU:
- At any instant, only one application fully occupies the GPU's compute and VRAM.
- VRAM contents of other applications are temporarily swapped out to system memory.
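As a rough mental model of time slicing, the sketch below rotates applications through a queue so that exactly one holds the GPU in each slice. The round-robin order, the function name `time_slice_schedule`, and the fixed slice count are assumptions for illustration only; the actual scheduling policy is not described here.

```python
from collections import deque


def time_slice_schedule(apps: list[str], num_slices: int) -> list[str]:
    """Return which application holds the GPU in each of `num_slices` slices.

    Hypothetical round-robin model: the application at the front of the queue
    runs for one slice, then moves to the back. While it runs, the other
    applications are considered swapped out to system memory.
    """
    queue = deque(apps)
    schedule: list[str] = []
    for _ in range(num_slices):
        if not queue:
            break  # nothing bound to the GPU
        current = queue.popleft()
        schedule.append(current)
        queue.append(current)
    return schedule
```

With three applications and five slices, the first two applications each get a second turn before the third does, which shows why per-application throughput drops as more applications are bound to the same GPU.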
## View GPU status

@@ -28,22 +28,21 @@ Olares supports only **NVIDIA GPUs** and requires a **Turing or newer** architecture T
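Olares provides three allocation modes that you can choose from depending on your scenario.
### Time Slicing
In this mode, multiple applications share the same GPU by rotating in time slices:
- At any instant, only one application uses the GPU's compute and VRAM.
- Meanwhile, the VRAM contents of other applications are temporarily swapped out to system memory.
- Applications not assigned an exclusive GPU or dedicated VRAM join the time-slicing queue by default.
### App Exclusive
In this mode, the entire GPU's compute and VRAM are allocated to a single application to ensure the best performance.
In this mode, the compute and VRAM of a single GPU are allocated to one application to ensure the best performance.
### Memory Slicing
In this mode, GPU VRAM can be allocated to multiple applications by specified quotas.
- Applications with a quota can use the GPU in parallel.
- The sum of the allocated quotas must not exceed the total physical VRAM.
In this mode, GPU VRAM can be allocated to multiple applications in specified amounts.
- All applications that have been allocated VRAM can use the GPU at the same time.
- The sum of the allocated VRAM must not exceed the total physical VRAM.
### Time Slicing
In this mode, any number of applications can be bound to the same GPU:
- At any instant, only one application fully occupies the GPU's compute and VRAM.
- Meanwhile, the VRAM contents of other applications are temporarily swapped out to system memory.
## View GPU status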