update wording

cal-weng 2025-10-11 21:23:52 +08:00
parent 861c5812b3
commit d8db9c458c
2 changed files with 17 additions and 10 deletions

@@ -29,17 +29,21 @@ Olares supports three GPU allocation modes. Choosing the right mode helps optimi
### Time Slicing
In this mode, the GPU rotates across applications in time slices; at any instant only one app runs on the GPU. Other GPU workloads are paused, with their VRAM evicted to system memory and restored on their slice.
Time Slicing mode provides a shared VRAM pool. Applications without an exclusive GPU or dedicated VRAM automatically join the time-sliced queue and use this pool when the GPU is available.
In this mode, the GPU rotates across applications in time slices.
- At any instant, only one application uses the GPU's compute and VRAM.
- Other applications' VRAM contents are temporarily swapped out to system memory.
- Applications not assigned an exclusive GPU or dedicated VRAM are placed in the time-slicing queue by default.
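
The rotation described above can be illustrated with a small round-robin simulation. This is only a sketch of the queueing idea, not Olares's actual scheduler: the function name `time_slice_schedule` and the app names are hypothetical, and real slice lengths and VRAM swapping are not modeled.

```python
from collections import deque


def time_slice_schedule(apps, slices):
    """Return which app holds the GPU in each successive time slice.

    Hypothetical model: one app owns the GPU per slice, then moves to
    the back of the queue while the next app takes its turn.
    """
    queue = deque(apps)
    order = []
    for _ in range(slices):
        if not queue:
            break
        current = queue.popleft()  # this app uses compute and VRAM for the slice
        order.append(current)
        queue.append(current)      # waits again; its VRAM may be swapped out
    return order


schedule = time_slice_schedule(["app-a", "app-b", "app-c"], 5)
print(schedule)
```

With three apps and five slices, the fourth slice returns to the first app, which is the point at which its swapped-out VRAM contents would need to be restored.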
### App Exclusive
In this mode, the GPU's full compute capacity and VRAM are allocated to a single application, eliminating cross-application contention and interference.
In this mode, the GPU's full compute capacity and VRAM are allocated to a single application to ensure maximum performance.
### Memory Slicing
In this mode, VRAM is statically partitioned into fixed quotas and bound to designated applications. The sum of quotas must not exceed the total physical VRAM. Each assigned app can access only its own partition with no cross-app interference.
In this mode, GPU VRAM is allocated to multiple applications by specified quotas:
- Applications with a quota can run concurrently on the GPU.
- The sum of all quotas must not exceed the GPU's physical VRAM.
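
The quota constraint above amounts to a simple sum check. The following is a minimal sketch, assuming quotas are expressed in MiB; the helper `quotas_fit`, the app names, and the 24 GiB card size are hypothetical and do not come from Olares.

```python
def quotas_fit(quotas_mib, physical_vram_mib):
    """Return True if the per-app VRAM quotas fit within physical VRAM.

    quotas_mib maps a (hypothetical) app name to its quota in MiB.
    """
    if any(q <= 0 for q in quotas_mib.values()):
        raise ValueError("each quota must be positive")
    return sum(quotas_mib.values()) <= physical_vram_mib


quotas = {"app-a": 8192, "app-b": 12288}  # 8 GiB + 12 GiB
print(quotas_fit(quotas, 24576))                      # fits on a 24 GiB card
print(quotas_fit({**quotas, "app-c": 8192}, 24576))   # 28 GiB requested: too much
```

The first call passes because 20480 MiB is within 24576 MiB; adding a third 8 GiB quota pushes the total past the physical limit, so the second call fails the check.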
## View GPU status

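@@ -30,17 +30,20 @@ Olares provides three allocation modes that you can choose among by scenario.
### Time Slicing Mode
In this mode, the GPU rotates execution across applications in time slices, and at any instant only one application occupies the GPU. Other GPU workloads are suspended, and the VRAM they use is swapped out to system memory and restored when their turn comes.
In this mode, the GPU provides a default VRAM resource pool. Applications not assigned an exclusive GPU or dedicated VRAM automatically use the GPU in Time Slicing mode.
In this mode, multiple applications share the same GPU by rotating in time slices.
- At any instant, only one application occupies the GPU's compute and VRAM.
- Meanwhile, the VRAM contents of other applications are temporarily swapped out to system memory.
- Applications not assigned an exclusive GPU or dedicated VRAM join the time-slicing queue by default.
### App Exclusive Mode
In this mode, the entire GPU's compute and VRAM are allocated to a single application, with no cross-application contention or interference.
In this mode, the entire GPU's compute and VRAM are allocated to a single application to ensure the best performance.
### Memory Slicing Mode
In this mode, GPU VRAM is statically partitioned by fixed quotas and bound to designated applications; the sum of the quotas must not exceed the physical VRAM. Each application can access only its allocated VRAM partition, so applications are isolated from one another and do not interfere.
In this mode, GPU VRAM can be allocated to multiple applications by specified quotas.
- Applications that receive a quota can use the GPU concurrently.
- The sum of the allocated quotas must not exceed the total physical VRAM.
## View GPU status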