From 5a434b5b50a171fd39a18f43edd0d372be1e7482 Mon Sep 17 00:00:00 2001
From: cal-weng
Date: Sat, 11 Oct 2025 21:45:15 +0800
Subject: [PATCH] resolve comments

---
 docs/manual/olares/settings/gpu-resource.md   | 20 +++++++++---------
 .../zh/manual/olares/settings/gpu-resource.md | 21 +++++++++----------
 2 files changed, 20 insertions(+), 21 deletions(-)

diff --git a/docs/manual/olares/settings/gpu-resource.md b/docs/manual/olares/settings/gpu-resource.md
index b448dcd9d..3076eaaeb 100644
--- a/docs/manual/olares/settings/gpu-resource.md
+++ b/docs/manual/olares/settings/gpu-resource.md
@@ -27,23 +27,23 @@ Even if your GPU architecture is supported, **low VRAM capacity may cause AI app
 
 Olares supports three GPU allocation modes. Choosing the right mode helps optimize performance based on your needs.
 
-### Time Slicing
-
-In this mode, the GPU rotates across applications in time slices.
-- At any instant, only one application uses the GPU’s compute and VRAM.
-- Other applications’ VRAM contents are temporarily swapped out to system memory.
-- Applications not assigned an exclusive GPU or dedicated VRAM are placed in the time-slicing queue by default.
-
 ### App Exclusive
 
 In this mode, the GPU’s full compute capacity and VRAM are allocated to a single application to ensure the maximized performance.
 
 ### Memory Slicing
 
-In this mode, GPU VRAM is allocated to multiple applications by specified quotas:
+In this mode, GPU VRAM is allocated to multiple applications by specified VRAM quotas:
 
-- Applications with a quota can run concurrently on the GPU.
-- The sum of all quotas must not exceed the GPU’s physical VRAM.
+- Applications with assigned VRAM can run concurrently on the GPU.
+- The sum of all assigned VRAMs must not exceed the GPU’s physical VRAM.
+
+### Time Slicing
+
+In this mode, any number of applications can be bound to the same GPU:
+
+- At any instant, only one application fully occupies the GPU’s compute and VRAM.
+- VRAM contents of other applications are temporarily swapped out to system memory.
 
 ## View GPU status
 
diff --git a/docs/zh/manual/olares/settings/gpu-resource.md b/docs/zh/manual/olares/settings/gpu-resource.md
index afdd5d410..010d9df45 100644
--- a/docs/zh/manual/olares/settings/gpu-resource.md
+++ b/docs/zh/manual/olares/settings/gpu-resource.md
@@ -28,22 +28,21 @@ Olares 仅支持 **NVIDIA 显卡**,且要求架构为 **Turing 或更新**(T
 
 Olares 提供三种分配方式,可按场景灵活选择。
 
-### 时间分片模式
-
-在此模式下,多应用按时间片轮转共享同一 GPU:
-- 任一时刻仅有一个应用占用 GPU 算力和显存。
-- 此时其他应用的显存内容会暂时换出至系统内存。
-- 未被分配独占 GPU 或专有显存的应用将默认加入时间分片队列。
-
 ### 应用独占模式
 
-在此模式下,整个 GPU 的算力和显存将分配给单个应用,以保证最佳性能。
+在此模式下,单张 GPU 的算力和显存将分配给一个应用,以保证最佳性能。
 
 ### 显存分片模式
 
-在此模式下,GPU 显存可按指定配额分配给多个应用。
-- 获得配额的应用可并行使用 GPU。
-- 所分配配额之和不得超过总物理显存。
+在此模式下,GPU 显存可按指定显存分配给多个应用。
+- 所有获得显存的应用可同时使用 GPU。
+- 所分配显存之和不得超过总物理显存。
+
+### 时间分片模式
+
+在此模式下,任意数量应用可绑定至同一 GPU:
+- 任一时刻仅有一个应用完全占用 GPU 算力和显存。
+- 此时其他应用的显存内容会暂时换出至系统内存。
 
 ## 查看显卡状态
 