resolve comments

cal-weng 2025-10-29 19:37:55 +08:00
parent 10ce9b44fc
commit 26910b80b9
2 changed files with 14 additions and 5 deletions


@ -34,7 +34,10 @@ In this mode, a GPU can be bound to multiple applications and rotates execution
* At any instant, only one application uses all available compute and VRAM of the GPU.
* Other apps enter a wait queue; their VRAM contents (e.g., CUDA context) may be temporarily swapped out to system memory.
* By default, GPUs run in time-slicing mode. Applications not assigned an exclusive GPU or dedicated VRAM will join the time-slicing queue when a time-slicing GPU is available.
:::info Default GPU allocation
By default, GPUs run in time-slicing mode. Applications without allocated GPU resources automatically join the time-sliced GPU queue. If no time-sliced GPU is available, the application pauses after a startup timeout. In this case, you need to allocate a GPU (for example, set a GPU to time-slicing mode, or assign a VRAM quota to the application), then manually resume the application.
:::
### App Exclusive
@ -96,8 +99,10 @@ On the **GPU details** page, select your desired mode from the **GPU mode** drop
4. Repeat for other applications and click **Confirm**.
![VRAM slicing](/images/manual/olares/gpu-memory-slicing.png#bordered)
:::tip Unbinding GPU allocation
After binding GPUs to an application, you can release GPU resources by performing an unbind operation under the corresponding GPU mode.
:::tip Unbinding
- After binding a GPU or its VRAM to an application, you can manually unbind it under the corresponding GPU mode to release GPU resources.
- When you switch a GPU's allocation mode, all applications allocated under that mode are unbound, and their containers restart.
:::


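@ -33,7 +33,10 @@ Olares provides three allocation modes that you can choose from based on your scenario.
In this mode, a single GPU is shared among multiple applications through time slicing.
- At any instant, only one application occupies all compute and available VRAM.
- Other applications enter a wait queue; their VRAM contents (e.g., CUDA context) may be temporarily swapped out to system memory.
- By default, GPUs run in time-slicing mode. Applications not assigned an exclusive GPU or dedicated VRAM join the time-slicing queue by default (when a time-slicing GPU is available).
::: tip Default GPU allocation
By default, GPUs run in time-slicing mode. Applications without allocated GPU resources automatically join the time-sliced GPU queue. If no time-sliced GPU is available in the system, the application is paused after a startup timeout. In this case, first allocate a GPU to the application (for example, set a GPU to time-slicing mode, or assign VRAM to the application), then manually resume the application.
:::
### App Exclusive mode
@ -86,7 +89,8 @@ Olares provides three allocation modes that you can choose from based on your scenario.
![VRAM slicing](/images/zh/manual/olares/gpu-memory-slicing.png#bordered)
:::tip Unbinding
After binding an application, you can release GPU resources by performing an unbind operation under the corresponding GPU mode.
- After binding an application, you can release GPU resources by manually performing an unbind operation under the corresponding GPU mode.
- When you switch a GPU's allocation mode, all applications allocated to the GPU under that mode are unbound, and the application containers restart.
:::
## Learn more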