mirror of https://github.com/beclab/Olares
synced 2026-05-24 09:18:23 +00:00
resolve comments
parent 10ce9b44fc
commit 26910b80b9
2 changed files with 14 additions and 5 deletions
@@ -34,7 +34,10 @@ In this mode, a GPU can be bound to multiple applications and rotates execution

* At any instant, only one application uses all available compute and VRAM of the GPU.
* Other apps enter a wait queue; their VRAM contents (e.g., CUDA context) may be temporarily swapped out to system memory.
* By default, GPUs run in time-slicing mode. Applications not assigned an exclusive GPU or dedicated VRAM will join the time-slicing queue when a time-slicing GPU is available.

:::info Default GPU allocation
By default, GPUs run in time-slicing mode. Applications without allocated GPU resources automatically join the time-sliced GPU queue. If no time-sliced GPU is available, the application pauses after a startup timeout. In this case, you need to allocate a GPU (for example, set a GPU to time-slicing mode, or assign a VRAM quota to the application), then manually resume the application.
:::

### App Exclusive

@@ -96,8 +99,10 @@ On the **GPU details** page, select your desired mode from the **GPU mode** drop
4. Repeat for other applications and click **Confirm**.


:::tip Unbinding GPU allocation
After binding GPUs to an application, you can release GPU resources by performing an unbind operation under the corresponding GPU mode.
:::tip Unbinding
- After binding a GPU or its VRAM to an application, you can manually unbind it under the corresponding GPU mode to release GPU resources.

- When you switch a GPU’s allocation mode, all applications allocated under that mode are unbound, and the application containers will restart.
:::

@@ -33,7 +33,10 @@ Olares 提供三种分配方式,可按场景灵活选择。
在此模式下,单张显卡按时间分片分配给多个应用。
- 任一时刻仅一个应用占用全部算力与可用显存。
- 其余应用进入等待队列,其显存内容(如 CUDA 上下文等)可被临时换出至系统内存。
- 显卡默认处于时间分配模式。未被分配独占 GPU 或专属显存的应用,将默认加入时间分片队列(当有可用的时间分片显卡时)。

::: tip 默认显卡分配
显卡默认处于时间分片模式。未被分配 GPU 资源的应用将自动加入时间分片显卡队列。若系统无可用时间分片显卡,应用会在启动超时后被暂停。此时,需先为应用分配显卡(如设置显卡为时间分片模式,或为应用分配显存)后,可手动恢复应用运行。
:::

### 应用独占模式

@@ -86,7 +89,8 @@ Olares 提供三种分配方式,可按场景灵活选择。


:::tip 解除绑定
绑定应用后,如需释放显卡资源,可在相应的显卡模式下执行解绑操作。
- 绑定应用后,如需释放显卡资源,可在相应的显卡模式下手动执行解绑操作。
- 切换某张显卡的分配模式时,显卡在该模式下分配的所有应用将被解除绑定,同时应用容器会重启。
:::

## 了解更多