diff --git a/docs/manual/olares/settings/gpu-resource.md b/docs/manual/olares/settings/gpu-resource.md
index c83672853..6d98d7f62 100644
--- a/docs/manual/olares/settings/gpu-resource.md
+++ b/docs/manual/olares/settings/gpu-resource.md
@@ -34,7 +34,10 @@ In this mode, a GPU can be bound to multiple applications and rotates execution
 
 * At any instant, only one application uses all available compute and VRAM of the GPU.
 * Other apps enter a wait queue; Their VRAM contents (e.g., CUDA context, etc.) may be temporarily swapped out to system memory.
-* By default, GPUs run in time-slicing mode. Applications not assigned an exclusive GPU or dedicated VRAM will join the time-slicing queue when a time-slicing GPU is available.
+
+:::info Default GPU allocation
+By default, GPUs run in time-slicing mode. Applications without allocated GPU resources automatically join the time-sliced GPU queue. If no time-sliced GPU is available, the application pauses after a startup timeout. In this case, you need to allocate a GPU (for example, set a GPU to time-slicing mode, or assign a VRAM quota to the application), then manually resume the application.
+:::
 
 ### App Exclusive
 
@@ -96,8 +99,10 @@ On the **GPU details** page, select your desired mode from the **GPU mode** drop
 4. Repeat for other applications and click **Confirm**.
 ![VRAM slicing](/images/manual/olares/gpu-memory-slicing.png#bordered)
 
-:::tip Unbinding GPU allocation
-After binding GPUs to an application, you can release GPU resources by performing an unbind operation under the corresponding GPU mode.
+:::tip Unbinding
+- After binding a GPU or its VRAM to an application, you can manually unbind it under the corresponding GPU mode to release GPU resources.
+
+- When you switch a GPU’s allocation mode, all applications allocated under that mode are unbound, and their containers restart.
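 :::
diff --git a/docs/zh/manual/olares/settings/gpu-resource.md b/docs/zh/manual/olares/settings/gpu-resource.md
index b8be318ff..6bb11f8a2 100644
--- a/docs/zh/manual/olares/settings/gpu-resource.md
+++ b/docs/zh/manual/olares/settings/gpu-resource.md
@@ -33,7 +33,10 @@ Olares provides three allocation modes that you can choose from based on your scenario.
 In this mode, a single GPU is shared among multiple applications through time slicing.
 - At any instant, only one application uses all of the compute and available VRAM.
 - Other applications enter a wait queue; their VRAM contents (such as CUDA contexts) may be temporarily swapped out to system memory.
-- By default, GPUs are in time-allocation mode. Applications not assigned an exclusive GPU or dedicated VRAM join the time-slicing queue by default (when a time-slicing GPU is available).
+
+::: tip Default GPU allocation
+By default, GPUs run in time-slicing mode. Applications without allocated GPU resources automatically join the time-sliced GPU queue. If no time-sliced GPU is available, the application is paused after a startup timeout. In that case, first allocate a GPU to the application (for example, set a GPU to time-slicing mode, or assign VRAM to the application), then manually resume the application.
+:::
 
 ### App Exclusive mode
 
@@ -86,7 +89,8 @@ Olares provides three allocation modes that you can choose from based on your scenario.
 ![VRAM slicing](/images/zh/manual/olares/gpu-memory-slicing.png#bordered)
 
 :::tip Unbinding
-After binding an application, you can release GPU resources by performing an unbind operation under the corresponding GPU mode.
+- After binding an application, you can release GPU resources by manually unbinding it under the corresponding GPU mode.
+- When you switch a GPU's allocation mode, all applications allocated to that GPU under that mode are unbound, and the application containers restart.
 :::
 
 ## Learn more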