Mirror of https://github.com/unslothai/unsloth (synced 2026-04-21 13:37:39 +00:00)
* Studio: anchor ctx-slider warning threshold at 4096 when weights exceed VRAM
The chat settings sheet's ctx slider reads `max_context_length` from
`/api/inference/status` and renders

    Exceeds estimated VRAM capacity (N tokens). The model may use
    system RAM.

when the user drags the slider above that value. For models whose
weights fit on some GPU subset, `_max_context_length` was already set
to the binary-search cap and the warning fired correctly.
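For illustration only, the fittable case can be sketched as a binary search over context length. This is not the Studio implementation: the function name, its parameters, and the linear per-token KV cost are assumptions made for the sketch. Only the 90% headroom figure comes from this message.

```python
from typing import Optional


def largest_fitting_ctx(
    native_ctx: int,
    free_bytes: int,
    weight_bytes: int,
    kv_bytes_per_token: int,  # hypothetical linear cost model
    headroom_pct: int = 90,   # "90% of free memory", as stated in this message
) -> Optional[int]:
    """Return the largest context length that fits, or None if weights alone do not."""
    budget = free_bytes * headroom_pct // 100
    if weight_bytes > budget:
        # Weights exceed the budget: no context length can fit on this subset.
        return None
    lo, hi = 0, native_ctx
    while lo < hi:
        mid = (lo + hi + 1) // 2  # upper midpoint so the loop always makes progress
        if weight_bytes + mid * kv_bytes_per_token <= budget:
            lo = mid       # mid fits, search higher
        else:
            hi = mid - 1   # mid does not fit, search lower
    return lo
```

A real estimator need not be linear in the token count; binary search only requires that memory use does not decrease as the context grows.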
For models whose weights exceed 90% of every GPU subset's free memory
(e.g. MiniMax-M2.7-GGUF at 131 GB on a 97 GB GPU), the ceiling-probe
loop never matched a subset, so `max_available_ctx` stayed at the
native context (e.g. 196608). The slider ran all the way to native
with no indication that any value above the 4096 spec default would
trigger `--fit on` and degrade performance.
Anchor `max_available_ctx` at `min(4096, native_context_length)` when
no subset fits, so the warning fires at the right threshold and the
user sees the correct safe-zone / warning-zone split:

    Before (MiniMax-M2.7 on 97 GB GPU):
      slider 0 .. 196608, warning threshold = 196608 (never fires)
    After:
      slider 0 .. 196608, warning threshold = 4096 (fires correctly)

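A minimal sketch of the anchoring rule, assuming the fitted cap is passed in as an optional value. The function name `warning_threshold` and the parameter `fitted_ctx` are hypothetical and do not appear in the backend. Only `min(4096, native_context_length)` is taken from this message.

```python
from typing import Optional

SPEC_DEFAULT_CTX = 4096  # the "4096 spec default" named in this message


def warning_threshold(native_ctx: int, fitted_ctx: Optional[int]) -> int:
    """Context length above which the slider should show the VRAM warning.

    fitted_ctx is the cap found for some GPU subset, or None when the
    weights fit on no subset (hypothetical parameter for this sketch).
    """
    if fitted_ctx is not None:
        # Some subset fits: keep the searched cap, never above native.
        return min(fitted_ctx, native_ctx)
    # No subset fits: anchor at the spec default, but do not report
    # more context than the model natively supports.
    return min(SPEC_DEFAULT_CTX, native_ctx)
```

With the numbers above, `warning_threshold(196608, None)` gives 4096, while a model with a native context of 2048 keeps 2048 as its threshold.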
No frontend changes required: `chat-settings-sheet.tsx` already
consumes `ggufMaxContextLength` (= status.max_context_length) as the
warning threshold and `ggufNativeContextLength` as the slider max.
Adds `tests/test_llama_cpp_max_context_threshold.py` covering
weights-exceed-VRAM (single / multi-GPU), a native ctx below the 4096
fallback (the threshold must not exceed the supported ctx),
fittable-model regressions (small / multi-GPU / tiny on huge GPU), and
the `max_context_length` property's fallback semantics.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
| | | |
|---|---|---|
| backend | ||
| frontend | ||
| __init__.py | ||
| install_llama_prebuilt.py | ||
| install_python_stack.py | ||
| LICENSE.AGPL-3.0 | ||
| setup.bat | ||
| setup.ps1 | ||
| setup.sh | ||
| Unsloth_Studio_Colab.ipynb | ||