Studio: Ollama support, recommended folders, Custom Folders UX polish (#5050)

* Studio: Ollama support, recommended folders, Custom Folders UX polish

Backend:
- Add _scan_ollama_dir that reads manifests/registry.ollama.ai/library/*
  and creates .gguf symlinks under <ollama_dir>/.studio_links/ pointing
  at the content-addressable blobs, so detect_gguf_model and llama-server
  -m work unchanged for Ollama models
- Filter entries under .studio_links from the generic models/hf/lmstudio
  scanners to avoid duplicate rows and leaked internal paths in the UI
- New GET /api/models/recommended-folders endpoint returning LM Studio
  and Ollama model directories that currently exist on the machine
  (OLLAMA_MODELS env var + standard paths, ~/.lmstudio/models, legacy
  LM Studio cache), used by the Custom Folders quick-add chips
- detect_gguf_model now uses os.path.abspath instead of Path.resolve so
  the readable symlink name is preserved as display_name (e.g.
  qwen2.5-0.5b-Q4_K_M.gguf instead of sha256-abc...)
- llama-server failure with a path under .studio_links or .cache/ollama
  surfaces a friendlier message ("Some Ollama models do not work with
  llama.cpp. Try a different model, or use this model directly through
  Ollama instead.") instead of the generic validation error
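
The manifest-to-symlink flow described above can be sketched roughly as follows. This is a minimal illustration, not the shipped scanner: the helper name and single-layer handling are hypothetical, and the real code also handles projector layers, name collisions, and read-only directories.

```python
import json
from pathlib import Path
from typing import Optional

def link_ollama_blob(ollama_dir: Path, manifest_rel: str) -> Optional[Path]:
    """Symlink one Ollama model blob under a readable .gguf name (sketch)."""
    manifest = json.loads((ollama_dir / "manifests" / manifest_rel).read_text())
    # The GGUF weights live in the layer with this mediaType.
    model_layer = next(
        (l for l in manifest.get("layers", [])
         if l.get("mediaType") == "application/vnd.ollama.image.model"),
        None,
    )
    if model_layer is None:
        return None
    # Manifests use "sha256:<hex>" digests; blobs are stored as "sha256-<hex>".
    blob = ollama_dir / "blobs" / model_layer["digest"].replace(":", "-")
    links = ollama_dir / ".studio_links"
    links.mkdir(exist_ok = True)
    # Name the link after the manifest path so it stays human-readable.
    link = links / (manifest_rel.replace("/", "-") + ".gguf")
    if not link.exists():
        link.symlink_to(blob)
    return link
```

Because the link carries a `.gguf` suffix and a readable name, both the extension check in detect_gguf_model and the display name fall out for free.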

Frontend:
- ListLabel supports an optional leading icon and collapse toggle; used
  for Downloaded (download icon), Custom Folders (folder icon), and
  Recommended (star icon)
- Custom Folders header gets folder icon on the left, and +, search,
  and chevron buttons on the right; chevron uses ml-auto so it aligns
  with the Downloaded and Recommended chevrons
- New recommended folder chips render below the registered scan folders
  when there are unregistered well-known paths; one click adds them as
  a scan folder
- Custom folder rows that are direct .gguf files (Ollama symlinks) load
  immediately via onSelect instead of opening the GGUF variant expander
  (which is for repos containing multiple quants, not single files)
- When loading a direct .gguf file path, send max_seq_length = 0 so the
  backend uses the model's native context instead of the 4096 chat
  default (qwen2.5:0.5b now loads at 32768 instead of 4096)
- New listRecommendedFolders() helper on the chat API

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review: log silent exceptions and support read-only Ollama dirs

Replace silent except blocks in _scan_ollama_dir and the
recommended-folders endpoint with narrower exception types plus debug
or warning logs, so failures are diagnosable without hiding signal.

Add _ollama_links_dir helper that falls back to a per-ollama-dir hashed
namespace under Studio's own cache (~/.unsloth/studio/cache/ollama_links)
when the Ollama models directory is read-only. This is common for system
installs at /usr/share/ollama/.ollama/models and
/var/lib/ollama/.ollama/models, where the Studio process has read but
not write access. Previously the scanner returned an empty list in that
case, so Ollama models silently failed to appear.

The fallback preserves the .gguf suffix on symlink names so
detect_gguf_model keeps recognising them. The prior "raw sha256 blob
path" fallback would have failed the suffix check, so those models
could not load.
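
A rough sketch of the fallback logic, under the stated assumption of a generic `cache_root` argument (the real helper pulls it from Studio's path utilities) and an illustrative function name:

```python
import hashlib
from pathlib import Path
from typing import Optional

def links_dir_for(ollama_dir: Path, cache_root: Path) -> Optional[Path]:
    """Pick a writable home for the .gguf links (sketch)."""
    primary = ollama_dir / ".studio_links"
    try:
        primary.mkdir(exist_ok = True)
        return primary
    except OSError:
        pass  # read-only install, e.g. /usr/share/ollama/.ollama/models
    # Hash the ollama_dir so two different Ollama roots never share a
    # fallback namespace. A cache path, not a security boundary.
    digest = hashlib.sha256(str(ollama_dir.resolve()).encode()).hexdigest()[:12]
    fallback = cache_root / "ollama_links" / digest
    try:
        fallback.mkdir(parents = True, exist_ok = True)
        return fallback
    except OSError:
        return None
```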

* Address review: detect mmproj next to symlink target for vision GGUFs

Codex P1 on model_config.py:1012: when detect_gguf_model returns the
symlink path (to preserve readable display names), detect_mmproj_file
searched the symlink's parent directory instead of the target's. For
vision GGUFs surfaced via Ollama's .studio_links/ -- where the weight
file is symlinked but any mmproj sidecar lives next to the real blob
-- mmproj was no longer detected, so the model was misclassified as
text-only and llama-server would start without --mmproj.

detect_mmproj_file now adds the resolved target's parent to the scan
order when path is a symlink. Direct (non-symlink) .gguf paths are
unchanged, so LM Studio and HF cache layouts keep working exactly as
before. Verified with a fake layout reproducing the bug plus a
regression check on a non-symlink LM Studio model.
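
The scan-order tweak reduces to a small rule; a minimal sketch (function name hypothetical, and the real detect_mmproj_file builds a longer scan list):

```python
from pathlib import Path

def mmproj_scan_dirs(gguf_path: Path) -> list:
    """Directories to search for an mmproj sidecar (sketch).

    When the weights path is a symlink (e.g. Ollama's .studio_links),
    the projector lives next to the *target* blob, so the resolved
    target's parent is added after the symlink's own parent.
    """
    dirs = [gguf_path.parent]
    if gguf_path.is_symlink():
        target_parent = gguf_path.resolve().parent
        if target_parent not in dirs:
            dirs.append(target_parent)
    return dirs
```

Non-symlink paths return a single directory, which is why the LM Studio and HF cache layouts behave exactly as before.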

* Address review: support all Ollama namespaces and vision projector layers

- Iterate over all directories under registry.ollama.ai/ instead of
  hardcoding the "library" namespace. Custom namespaces like
  "mradermacher/llama3" now get scanned and include the namespace
  prefix in display names, model IDs, and symlink names to avoid
  collisions.

- Create companion -mmproj.gguf symlinks for Ollama vision models
  that have an "application/vnd.ollama.image.projector" layer, so
  detect_mmproj_file can find the projector alongside the model.

- Extract symlink creation into _make_symlink helper to reduce
  duplication between model and projector paths.

* Address review: move imports to top level and add scan limit

- Move hashlib and json imports to the top of the file (PEP 8).
- Remove inline `import json as _json` and `import hashlib` from
  function bodies, use the top-level imports directly.
- Add `limit` parameter to `_scan_ollama_dir()` with early exit
  when the threshold is reached.
- Pass `_MAX_MODELS_PER_FOLDER` into the scanner so it stops
  traversing once enough models are found.

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review: Windows fallback, all registry hosts, collision safety

_make_link (formerly _make_symlink):
- Falls back to os.link() hardlink when symlink_to() fails (Windows
  without Developer Mode), then to shutil.copy2 as last resort
- Uses atomic os.replace via tmp file to avoid race window where the
  .gguf path is missing during rescan
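
The atomic-replace pattern can be sketched as below (helper name hypothetical; the real _make_link also tries hardlinks and handles cleanup). The point is that a concurrent rescan always observes either the old or the new link, never a missing path:

```python
import os
import uuid
from pathlib import Path

def atomic_symlink(link: Path, target: Path) -> None:
    """Create or refresh a symlink without a window where `link` is absent.

    Build the link under a unique temp name first, then rename it into
    place; os.replace is atomic on the same filesystem.
    """
    tmp = link.parent / f".{link.name}.tmp-{uuid.uuid4().hex[:8]}"
    tmp.symlink_to(target)
    os.replace(tmp, link)
```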

Scanner now handles all Ollama registry layouts:
- Uses rglob over manifests/ instead of hardcoding registry.ollama.ai
- Discovers hf.co/org/repo:tag and any other host, not just library/
- Filenames include a stable sha1 hash of the manifest path to prevent
  collisions between models that normalize to the same stem

Per-model subdirectories under .studio_links/:
- Each model's links live in their own hash-keyed subdirectory
- detect_mmproj_file only sees the projector for that specific model,
  not siblings from other Ollama models

Friendly Ollama error detection:
- Now also matches ollama_links/ (the read-only fallback cache path)
  and model_identifier starting with "ollama/"

Recommended folders:
- Added os.access(R_OK | X_OK) check so unreadable system directories
  like /var/lib/ollama/.ollama/models are not advertised as chips
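
The readability check amounts to the following sketch, where `candidates` stands in for the assembled LM Studio and Ollama paths:

```python
import os
from pathlib import Path

def readable_dirs(candidates) -> list:
    """Keep only directories the process can actually enumerate (sketch).

    R_OK is needed to read directory entries and X_OK to traverse into
    the directory; system installs like /var/lib/ollama often grant
    neither to other users, so they are dropped here.
    """
    out = []
    for p in candidates:
        p = Path(p)
        if p.is_dir() and os.access(p, os.R_OK | os.X_OK):
            out.append(str(p))
    return out
```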

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review: filter ollama_links from generic scanners

The generic scanners (models_dir, hf_cache, lmstudio) already filter
out .studio_links to avoid duplicate Ollama entries, but missed the
ollama_links fallback cache directory used for read-only Ollama
installs. Add it to the filter.

* Address review: idempotent link creation and path-component filter

_make_link:
- Skip recreation when a valid link/copy already exists (samefile or
  matching size check). Prevents blocking the model-list API with
  multi-GB copies on repeated scans.
- Use uuid4 instead of os.getpid() for tmp file names to avoid race
  conditions from concurrent scans.
- Log cleanup errors instead of silently swallowing them.

Path filter:
- Use os.sep-bounded checks instead of bare substring match to avoid
  false positives on paths like "my.studio_links.backup/model.gguf".

* [pre-commit.ci] auto fixes from pre-commit.com hooks

for more information, see https://pre-commit.ci

* Address review: drop copy fallback, targeted glob, robust path filter

_make_link:
- Drop shutil.copy2 fallback -- copying multi-GB GGUFs inside a sync
  API request would block the backend. Log a warning and skip the
  model when both symlink and hardlink fail.

Scanner:
- Replace rglob("*") with targeted glob patterns (*/*/* and */*/*/*)
  to avoid traversing unrelated subdirectories in large custom folders.

Path filter:
- Use Path.parts membership check instead of os.sep substring matching
  for robustness across platforms.
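
The component check is easy to get subtly wrong with substrings; roughly (helper name illustrative):

```python
from pathlib import Path

def is_internal_link_path(path: str) -> bool:
    """True when a model path lives under one of the internal link dirs.

    Testing whole path components avoids the substring false positive on
    e.g. "my.studio_links.backup/model.gguf", and Path.parts splits on
    the platform separator, so the same check works on Windows.
    """
    parts = Path(path).parts
    return ".studio_links" in parts or "ollama_links" in parts
```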

Scan limit:
- Skip _scan_ollama_dir when _generic already fills the per-folder cap.

* Address review: sha256, top-level uuid import, Path.absolute()

- Switch hashlib.sha1 to hashlib.sha256 for path hashing consistency.
- Move uuid import to the top of the file instead of inside _make_link.
- Replace os.path.abspath with Path.absolute() in detect_gguf_model
  to match the pathlib style used throughout the codebase.

* Address review: fix stale comments (sha1, rglob, copy fallback)

Update three docstrings/comments that still referenced the old
implementation after recent changes:
- sha1 comment now says "not a security boundary" (no hash name)
- "rglob" -> "targeted glob patterns"
- "file copies as a last resort" -> removed (copy fallback was dropped)

* Address review: fix stale links, support all manifest depths, scope error

_make_link:
- Drop size-based idempotency shortcut that kept stale links after
  ollama pull updates a tag to a same-sized blob. Only samefile()
  is used now -- if the link doesn't point at the exact same inode,
  it gets replaced.
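
The samefile-only idempotency check looks roughly like this (helper name hypothetical):

```python
import os
from pathlib import Path

def link_is_current(link: Path, target: Path) -> bool:
    """Reuse an existing link only when it points at the exact same inode.

    A size comparison is not enough: `ollama pull` can move a tag to a
    different blob of identical size, which would keep a stale link alive.
    """
    try:
        return link.exists() and os.path.samefile(link, target)
    except OSError:
        return False
```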

Scanner:
- Revert targeted glob back to rglob so deeper OCI-style repo names
  (5+ path segments) are not silently skipped.

Ollama error:
- Only show "Some Ollama models do not work with llama.cpp" when the
  server output contains GGUF compatibility hints (key not found,
  unknown architecture, failed to load). Unrelated failures like
  OOM or missing binaries now show the generic error instead of
  being misdiagnosed.
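
The hint matching reduces to a simple substring scan over the server's recent output; a sketch of the final behaviour (hint strings taken from the diff below, function name illustrative):

```python
GGUF_COMPAT_HINTS = (
    "key not found",
    "unknown model architecture",
    "failed to load model",
)

def looks_like_gguf_incompat(server_output: str) -> bool:
    """Only blame GGUF compatibility when llama-server's own output
    says so; OOM or a missing binary still gets the generic error."""
    tail = server_output.lower()
    return any(h in tail for h in GGUF_COMPAT_HINTS)
```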

---------

Co-authored-by: Daniel Han <info@unsloth.ai>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: danielhanchen <michaelhan2050@gmail.com>
Daniel Han, 2026-04-16 08:24:08 -07:00, committed by GitHub
parent ff23ce40b4
commit 05ec0f110b
GPG key ID: B5690EEEBB952194 (no known key found for this signature in database)
6 changed files with 500 additions and 35 deletions


@@ -1703,6 +1703,28 @@ class LlamaCppBackend:
        # Wait for llama-server to become healthy
        if not self._wait_for_health(timeout = 600.0):
            self._kill_process()
            _gguf = gguf_path or ""
            _is_ollama = (
                ".studio_links" in _gguf
                or os.sep + "ollama_links" + os.sep in _gguf
                or os.sep + ".cache" + os.sep + "ollama" + os.sep in _gguf
                or (self._model_identifier or "").startswith("ollama/")
            )
            # Only show the Ollama-specific message when the server
            # output indicates a GGUF compatibility issue, not for
            # unrelated failures like OOM or missing binaries.
            if _is_ollama:
                _output = "\n".join(self._stdout_lines[-50:]).lower()
                _gguf_compat_hints = (
                    "key not found",
                    "unknown model architecture",
                    "failed to load model",
                )
                if any(h in _output for h in _gguf_compat_hints):
                    raise RuntimeError(
                        "Some Ollama models do not work with llama.cpp. "
                        "Try a different model, or use this model directly through Ollama instead."
                    )
            raise RuntimeError(
                "llama-server failed to start. "
                "Check that the GGUF file is valid and you have enough memory."


@@ -5,8 +5,11 @@
Model Management API routes
"""
import hashlib
import json
import os
import sys
import uuid
from pathlib import Path

from fastapi import APIRouter, Body, Depends, HTTPException, Query
from typing import List, Optional
@@ -411,6 +414,267 @@ def _scan_lmstudio_dir(lm_dir: Path) -> List[LocalModelInfo]:
    return found


def _ollama_links_dir(ollama_dir: Path) -> Optional[Path]:
    """Return a writable directory for Ollama ``.gguf`` symlinks.

    Prefers ``<ollama_dir>/.studio_links/`` so the links sit next to the
    blobs they point at. Falls back to a per-ollama-dir namespace under
    Studio's own cache when the models directory is read-only (common
    for system installs under ``/usr/share/ollama`` or ``/var/lib/ollama``)
    so we still surface Ollama models in those environments.
    """
    from utils.paths.storage_roots import cache_root

    primary = ollama_dir / ".studio_links"
    try:
        primary.mkdir(exist_ok = True)
        return primary
    except OSError as e:
        logger.debug(
            "Ollama dir %s not writable for .studio_links (%s); "
            "falling back to Studio cache",
            ollama_dir,
            e,
        )
    # Fallback: namespace by a hash of the ollama_dir so two different
    # Ollama roots don't collide. This is a cache path, not a security
    # boundary.
    try:
        digest = hashlib.sha256(str(ollama_dir.resolve()).encode()).hexdigest()[:12]
    except OSError:
        digest = "default"
    fallback = cache_root() / "ollama_links" / digest
    try:
        fallback.mkdir(parents = True, exist_ok = True)
        return fallback
    except OSError as e:
        logger.warning(
            "Could not create Ollama symlink cache at %s: %s",
            fallback,
            e,
        )
        return None


def _scan_ollama_dir(
    ollama_dir: Path, limit: Optional[int] = None
) -> List[LocalModelInfo]:
    """Scan an Ollama models directory for downloaded models.

    Ollama stores models in a content-addressable layout::

        <ollama_dir>/manifests/<host>/<namespace>/<model>/<tag>
        <ollama_dir>/blobs/sha256-...

    The default host is ``registry.ollama.ai`` with namespace
    ``library`` (official models), but users can pull from custom
    namespaces (``mradermacher/llama3``) or entirely different hosts
    (``hf.co/org/repo:tag``). We iterate all manifest files via
    ``rglob`` so every layout depth is discovered.

    Each manifest is JSON with a ``layers`` array. The layer with
    ``mediaType == "application/vnd.ollama.image.model"`` contains the
    GGUF weights. Vision models also have a projector layer
    (``application/vnd.ollama.image.projector``). We read the config
    layer to extract family/size info.

    Since Ollama blobs lack a ``.gguf`` extension (which the GGUF
    loading pipeline requires), we create ``.gguf``-named links
    pointing at the blobs so the existing ``detect_gguf_model`` and
    ``llama-server -m`` paths work unchanged. Each model gets its
    own subdirectory under the links dir (keyed by a short hash of
    the manifest path) so that ``detect_mmproj_file`` only sees the
    projector for *that* model. Links are created as symlinks when
    possible, falling back to hardlinks (Windows without Developer
    Mode) as a last resort. The link dir lives under
    ``<ollama_dir>/.studio_links/`` when writable, otherwise under
    Studio's own cache directory.
    """
    manifests_root = ollama_dir / "manifests"
    if not manifests_root.is_dir():
        return []
    found: List[LocalModelInfo] = []
    blobs_dir = ollama_dir / "blobs"
    links_root = _ollama_links_dir(ollama_dir)
    if links_root is None:
        logger.warning(
            "Skipping Ollama scan for %s: no writable location for .gguf links",
            ollama_dir,
        )
        return []

    def _make_link(link_dir: Path, link_name: str, target: Path) -> Optional[str]:
        """Create a .gguf-named link to an Ollama blob.

        Tries symlink first, then hardlink (works on Windows without
        Developer Mode when target is on the same filesystem). Skips
        the model if neither works -- a full file copy of a multi-GB
        GGUF inside a synchronous API request would block the backend.
        Idempotent: skips recreation when a valid link already exists.
        """
        link_dir.mkdir(parents = True, exist_ok = True)
        link_path = link_dir / link_name
        resolved = target.resolve()
        # Skip if the link already points at the exact same blob.
        # Only use samefile -- size-based checks can reuse stale links
        # after `ollama pull` updates a tag to a same-sized blob.
        try:
            if link_path.exists() and os.path.samefile(str(link_path), str(resolved)):
                return str(link_path)
        except OSError as e:
            logger.debug("Error checking existing link %s: %s", link_path, e)
        tmp_path = link_dir / f".{link_name}.tmp-{uuid.uuid4().hex[:8]}"
        try:
            if tmp_path.is_symlink() or tmp_path.exists():
                tmp_path.unlink()
            try:
                tmp_path.symlink_to(resolved)
            except OSError:
                try:
                    os.link(str(resolved), str(tmp_path))
                except OSError:
                    logger.warning(
                        "Could not create link for Ollama blob %s "
                        "(symlinks and hardlinks both failed). "
                        "Skipping model to avoid blocking the API.",
                        target,
                    )
                    return None
            os.replace(str(tmp_path), str(link_path))
            return str(link_path)
        except OSError as e:
            logger.debug("Could not create Ollama link %s: %s", link_path, e)
            try:
                if tmp_path.is_symlink() or tmp_path.exists():
                    tmp_path.unlink()
            except OSError as cleanup_err:
                logger.debug(
                    "Could not clean up tmp path %s: %s", tmp_path, cleanup_err
                )
            return None

    try:
        for tag_file in manifests_root.rglob("*"):
            if not tag_file.is_file():
                continue
            rel = tag_file.relative_to(manifests_root)
            parts = rel.parts
            if len(parts) < 3:
                continue
            host = parts[0]
            repo_parts = list(parts[1:-1])
            tag = parts[-1]
            if (
                host == "registry.ollama.ai"
                and repo_parts
                and repo_parts[0] == "library"
            ):
                repo_name = "/".join(repo_parts[1:])
            elif host == "registry.ollama.ai":
                repo_name = "/".join(repo_parts)
            else:
                repo_name = "/".join([host] + repo_parts)
            if not repo_name:
                continue
            display = f"{repo_name}:{tag}"
            manifest_key = rel.as_posix()
            stem_hash = hashlib.sha256(manifest_key.encode()).hexdigest()[:10]
            try:
                manifest = json.loads(tag_file.read_text())
            except (json.JSONDecodeError, OSError) as e:
                logger.debug(
                    "Skipping unreadable/invalid Ollama manifest %s: %s",
                    tag_file,
                    e,
                )
                continue
            config_digest = manifest.get("config", {}).get("digest", "")
            model_type = ""
            file_type = ""
            if config_digest and blobs_dir.is_dir():
                config_blob = blobs_dir / config_digest.replace(":", "-")
                if config_blob.is_file():
                    try:
                        cfg = json.loads(config_blob.read_text())
                        model_type = cfg.get("model_type", "")
                        file_type = cfg.get("file_type", "")
                    except (json.JSONDecodeError, OSError) as e:
                        logger.debug(
                            "Could not parse Ollama config blob %s: %s",
                            config_blob,
                            e,
                        )
            model_link_dir = links_root / stem_hash
            gguf_link_path: Optional[str] = None
            quant = f"-{file_type}" if file_type else ""
            safe_name = repo_name.replace("/", "-")
            for layer in manifest.get("layers", []):
                media = layer.get("mediaType", "")
                digest = layer.get("digest", "")
                if not digest:
                    continue
                if media == "application/vnd.ollama.image.model":
                    candidate = blobs_dir / digest.replace(":", "-")
                    if candidate.is_file():
                        link_name = f"{safe_name}-{tag}{quant}.gguf"
                        gguf_link_path = _make_link(
                            model_link_dir, link_name, candidate
                        )
                elif media == "application/vnd.ollama.image.projector":
                    candidate = blobs_dir / digest.replace(":", "-")
                    if candidate.is_file():
                        mmproj_name = f"{safe_name}-{tag}-mmproj.gguf"
                        _make_link(model_link_dir, mmproj_name, candidate)
            if not gguf_link_path:
                continue
            suffix = ""
            if model_type:
                suffix += f" ({model_type}"
                if file_type:
                    suffix += f" {file_type}"
                suffix += ")"
            try:
                updated_at = tag_file.stat().st_mtime
            except OSError:
                updated_at = None
            found.append(
                LocalModelInfo(
                    id = gguf_link_path,
                    model_id = f"ollama/{repo_name}:{tag}",
                    display_name = display + suffix,
                    path = gguf_link_path,
                    source = "custom",
                    updated_at = updated_at,
                ),
            )
            if limit is not None and len(found) >= limit:
                return found
    except OSError as e:
        logger.warning("Error scanning Ollama directory %s: %s", ollama_dir, e)
    return found


@router.get("/local", response_model = LocalModelListResponse)
async def list_local_models(
    models_dir: str = Query(
@@ -493,11 +757,27 @@ async def list_local_models(
    for folder in custom_folders:
        folder_path = Path(folder["path"])
        try:
            custom_models = (
                _scan_models_dir(folder_path, limit = _MAX_MODELS_PER_FOLDER)
                + _scan_hf_cache(folder_path)
                + _scan_lmstudio_dir(folder_path)
            )[:_MAX_MODELS_PER_FOLDER]
            # Ollama scanner creates .studio_links/ with .gguf symlinks.
            # Filter those from the generic scanners to avoid duplicates
            # and leaking internal paths into the UI.
            _generic = [
                m
                for m in (
                    _scan_models_dir(folder_path, limit = _MAX_MODELS_PER_FOLDER)
                    + _scan_hf_cache(folder_path)
                    + _scan_lmstudio_dir(folder_path)
                )
                if not any(
                    p in (".studio_links", "ollama_links")
                    for p in Path(m.path).parts
                )
            ]
            custom_models = _generic
            if len(custom_models) < _MAX_MODELS_PER_FOLDER:
                custom_models += _scan_ollama_dir(
                    folder_path,
                    limit = _MAX_MODELS_PER_FOLDER - len(custom_models),
                )
        except OSError as e:
            logger.warning("Skipping unreadable scan folder %s: %s", folder_path, e)
            continue
@@ -575,6 +855,57 @@ async def remove_scan_folder_endpoint(
    return {"ok": True}


@router.get("/recommended-folders")
async def get_recommended_folders(
    current_subject: str = Depends(get_current_subject),
):
    """Return well-known model directories that exist on this machine.

    Lightweight alternative to ``browse-folders`` for showing quick-pick
    chips without the overhead of enumerating a directory tree. Returns
    paths that actually exist on disk (HF cache, LM Studio, Ollama,
    ``~/models``, etc.) so the frontend can offer them as one-click
    "Recommended" shortcuts in the Custom Folders section.
    """
    from utils.paths.storage_roots import lmstudio_model_dirs

    folders: list[str] = []
    seen: set[str] = set()

    def _add(p: Optional[Path]) -> None:
        if p is None:
            return
        try:
            resolved = str(p.resolve())
        except OSError:
            return
        if resolved in seen:
            return
        if Path(resolved).is_dir() and os.access(resolved, os.R_OK | os.X_OK):
            seen.add(resolved)
            folders.append(resolved)

    # LM Studio model directories
    try:
        for p in lmstudio_model_dirs():
            _add(p)
    except Exception as e:
        logger.warning("Failed to scan for LM Studio model directories: %s", e)

    # Ollama model directories
    ollama_env = os.environ.get("OLLAMA_MODELS")
    if ollama_env:
        _add(Path(ollama_env).expanduser())
    for candidate in (
        Path.home() / ".ollama" / "models",
        Path("/usr/share/ollama/.ollama/models"),
        Path("/var/lib/ollama/.ollama/models"),
    ):
        _add(candidate)

    return {"folders": folders}


# Heuristic ceiling on how many children to stat when checking whether a
# directory "looks like" it contains models. Keeps the browser snappy
# even when a directory has thousands of unrelated entries.


@@ -959,6 +959,20 @@ def detect_mmproj_file(path: str, search_root: Optional[str] = None) -> Optional
            scan_order.append(resolved)

    _add(start_dir)

    # When ``path`` is a symlink (e.g. Ollama's ``.studio_links/...gguf``
    # -> ``blobs/sha256-...``), the symlink's parent directory rarely
    # contains the mmproj sibling; the real mmproj file lives next to
    # the symlink target. Add the target's parent to the scan so vision
    # GGUFs that are surfaced via symlinks are still recognised as
    # vision models.
    try:
        if p.is_symlink() and p.is_file():
            target_parent = p.resolve().parent
            if target_parent.is_dir():
                _add(target_parent)
    except OSError:
        pass

    if search_root is not None:
        try:
            root_resolved = Path(search_root).resolve()
@@ -1006,7 +1020,10 @@ def detect_gguf_model(path: str) -> Optional[str]:
    if p.suffix.lower() == ".gguf" and p.is_file():
        if _is_mmproj(p.name):
            return None
        return str(p.resolve())
        # Use absolute (not resolve) to preserve symlink names -- e.g.
        # Ollama .studio_links/model.gguf -> blobs/sha256-... should
        # keep the readable symlink name, not the opaque blob hash.
        return str(p.absolute())

    # Case 2: directory containing .gguf files (skip mmproj)
    if p.is_dir():


@@ -27,6 +27,7 @@ import {
  listCachedModels,
  listGgufVariants,
  listLocalModels,
  listRecommendedFolders,
  listScanFolders,
  removeScanFolder,
} from "@/features/chat/api/chat-api";
@@ -49,7 +50,7 @@ import { checkVramFit, estimateLoadingVram } from "@/lib/vram";
import { Add01Icon, Cancel01Icon, Folder02Icon, Search01Icon } from "@hugeicons/core-free-icons";
import { HugeiconsIcon } from "@hugeicons/react";
import { FolderBrowser } from "./folder-browser";
import { Trash2Icon } from "lucide-react";
import { ChevronDownIcon, ChevronRightIcon, DownloadIcon, StarIcon, Trash2Icon } from "lucide-react";
import {
  type ReactNode,
  useCallback,
@@ -73,10 +74,35 @@ function normalizeForSearch(s: string): string {
  return s.toLowerCase().replace(/[\s\-_\.]/g, "");
}

function ListLabel({ children }: { children: ReactNode }) {
function ListLabel({
  children,
  icon,
  collapsed,
  onToggle,
}: {
  children: ReactNode;
  icon?: ReactNode;
  collapsed?: boolean;
  onToggle?: () => void;
}) {
  return (
    <div className="px-2.5 py-1.5 text-[10px] font-semibold uppercase tracking-wider text-muted-foreground">
      {children}
    <div className="flex items-center justify-between gap-1 px-2.5 py-1.5">
      <span className="flex items-center gap-1.5 text-[10px] font-semibold uppercase tracking-wider text-muted-foreground">
        {icon}
        {children}
      </span>
      {onToggle && (
        <button
          type="button"
          onClick={onToggle}
          aria-label={collapsed ? "Expand section" : "Collapse section"}
          className="shrink-0 rounded p-1 text-muted-foreground/60 transition-colors hover:text-foreground"
        >
          {collapsed
            ? <ChevronRightIcon className="size-3" />
            : <ChevronDownIcon className="size-3" />}
        </button>
      )}
    </div>
  );
}
@@ -489,6 +515,9 @@ export function HubModelPicker({
  // Delete confirmation dialog state
  const [deleteTarget, setDeleteTarget] = useState<string | null>(null);
  const [deleting, setDeleting] = useState(false);
  const [downloadedCollapsed, setDownloadedCollapsed] = useState(false);
  const [customFoldersCollapsed, setCustomFoldersCollapsed] = useState(false);
  const [recommendedCollapsed, setRecommendedCollapsed] = useState(false);

  // Cached (already downloaded) repos -- use module-level cache so
  // re-mounting the popover does not flash an empty "Downloaded" section.
@@ -514,6 +543,7 @@ export function HubModelPicker({
  const [showFolderInput, setShowFolderInput] = useState(false);
  const [folderLoading, setFolderLoading] = useState(false);
  const [showFolderBrowser, setShowFolderBrowser] = useState(false);
  const [recommendedFolders, setRecommendedFolders] = useState<string[]>([]);

  const refreshLocalModelsList = useCallback(() => {
    listLocalModels()
@@ -616,6 +646,9 @@ export function HubModelPicker({
    // Always refresh LM Studio + custom folder models (not gated by alreadyCached)
    refreshLocalModelsList();
    refreshScanFolders();
    listRecommendedFolders()
      .then(setRecommendedFolders)
      .catch(() => {});

    // Always refetch cached GGUF/model lists. The module-level caches give
    // an instant render with stale data (no spinner flash), but newly
@@ -893,8 +926,12 @@ export function HubModelPicker({
        (cachedGguf.length > 0 ||
          (!chatOnly && cachedModels.length > 0)) ? (
          <>
            <ListLabel>Downloaded</ListLabel>
            {cachedGguf.map((c) => (
            <ListLabel
              icon={<DownloadIcon className="size-3" />}
              collapsed={downloadedCollapsed}
              onToggle={() => setDownloadedCollapsed((v) => !v)}
            >Downloaded</ListLabel>
            {!downloadedCollapsed && cachedGguf.map((c) => (
              <div key={c.repo_id}>
                <ModelRow
                  label={c.repo_id}
@@ -922,7 +959,7 @@ export function HubModelPicker({
                )}
              </div>
            ))}
            {!chatOnly &&
            {!downloadedCollapsed && !chatOnly &&
              cachedModels.map((c) => (
                <div key={c.repo_id} className="flex items-center gap-0.5">
                  <div className="min-w-0 flex-1">
@@ -1001,20 +1038,12 @@ export function HubModelPicker({
          {!showHfSection ? (
            <>
              <div className="flex items-center justify-between gap-1 px-2.5 py-1.5">
                <span className="text-[10px] font-semibold uppercase tracking-wider text-muted-foreground">
              <div className="flex items-center gap-1 px-2.5 py-1.5">
                <span className="flex items-center gap-1.5 text-[10px] font-semibold uppercase tracking-wider text-muted-foreground">
                  <HugeiconsIcon icon={Folder02Icon} className="size-3" />
                  Custom Folders
                </span>
                <div className="flex items-center gap-0.5">
                  <button
                    type="button"
                    aria-label="Browse for a folder on the server"
                    title="Browse folders on the server"
                    onClick={() => setShowFolderBrowser(true)}
                    className="shrink-0 rounded p-1 text-muted-foreground/60 transition-colors hover:text-foreground"
                  >
                    <HugeiconsIcon icon={Search01Icon} className="size-3" />
                  </button>
                  <button
                    type="button"
                    aria-label={showFolderInput ? "Cancel adding folder" : "Add scan folder by path"}
@@ -1029,11 +1058,33 @@ export function HubModelPicker({
                  >
                    <HugeiconsIcon icon={showFolderInput ? Cancel01Icon : Add01Icon} className="size-3" />
                  </button>
                  <button
                    type="button"
                    aria-label="Browse for a folder on the server"
                    title="Browse folders on the server"
                    onClick={() => setShowFolderBrowser(true)}
                    className="shrink-0 rounded p-0.5 text-muted-foreground/60 transition-colors hover:text-foreground"
                  >
                    <HugeiconsIcon icon={Search01Icon} className="size-2.5" />
                  </button>
                </div>
                <div className="ml-auto">
                  <button
                    type="button"
                    aria-label={customFoldersCollapsed ? "Expand custom folders" : "Collapse custom folders"}
                    title={customFoldersCollapsed ? "Expand" : "Collapse"}
                    onClick={() => setCustomFoldersCollapsed((v) => !v)}
                    className="shrink-0 rounded p-1 text-muted-foreground/60 transition-colors hover:text-foreground"
                  >
                    {customFoldersCollapsed
                      ? <ChevronRightIcon className="size-3" />
                      : <ChevronDownIcon className="size-3" />}
                  </button>
                </div>
              </div>

              {/* Folder paths */}
              {scanFolders.map((f) => (
              {!customFoldersCollapsed && scanFolders.map((f) => (
                <div
                  key={f.id}
                  className="group flex items-center gap-1.5 px-2.5 py-0.5"
@@ -1056,8 +1107,31 @@ export function HubModelPicker({
                </div>
              ))}

              {/* Recommended folders */}
              {!customFoldersCollapsed && (() => {
                const registered = new Set(scanFolders.map((f) => f.path));
                const unregistered = recommendedFolders.filter((p) => !registered.has(p));
                if (unregistered.length === 0) return null;
                return (
                  <div className="flex flex-wrap gap-1 px-2.5 pb-0.5">
                    {unregistered.map((p) => (
                      <button
                        key={p}
                        type="button"
                        onClick={() => void handleAddFolder(p)}
                        disabled={folderLoading}
                        title={`Add ${p}`}
                        className="rounded-full border border-dashed border-border/50 px-2 py-0.5 font-mono text-[10px] text-muted-foreground/70 transition-colors hover:border-foreground/30 hover:bg-accent hover:text-foreground disabled:opacity-40"
                      >
                        <span className="text-[11px] font-semibold">+</span> {p.length > 30 ? `...${p.slice(-27)}` : p}
                      </button>
                    ))}
                  </div>
                );
              })()}

              {/* Add folder input */}
              {showFolderInput && (
              {!customFoldersCollapsed && showFolderInput && (
                <div className="px-2.5 pb-1 pt-0.5">
                  <div className="flex items-center gap-1">
                    <HugeiconsIcon icon={Folder02Icon} className="size-3 shrink-0 text-muted-foreground/40" />
@@ -1114,11 +1188,15 @@ export function HubModelPicker({
              {/* Models from custom folders */}
              {customFolderModels.map((m) => {
              {!customFoldersCollapsed && customFolderModels.map((m) => {
                const isGgufFile = m.path.toLowerCase().endsWith(".gguf");
                const isGguf =
                  isGgufFile ||
                  isGgufRepo(m.id) ||
                  isGgufRepo(m.display_name) ||
                  m.path.toLowerCase().endsWith(".gguf");
                  isGgufRepo(m.display_name);
                // Single .gguf files (e.g. Ollama blobs) load directly;
                // GGUF repos/directories expand to pick a variant.
                const isDirectGguf = isGgufFile;
                return (
                  <div key={m.id}>
                    <ModelRow
@@ -1126,7 +1204,13 @@ export function HubModelPicker({
                      meta={isGguf ? "GGUF" : "Local"}
                      selected={value === m.id}
                      onClick={() => {
                        if (isGguf) {
                        if (isDirectGguf) {
                          onSelect(m.id, {
                            source: "local",
                            isLora: false,
                            isDownloaded: true,
                          });
                        } else if (isGguf) {
                          setExpandedGguf((prev) =>
                            prev === m.id ? null : m.id,
                          );
@@ -1158,8 +1242,12 @@ export function HubModelPicker({
        {!showHfSection && cachedReady ? (
          <>
            <ListLabel>Recommended</ListLabel>
            {visibleRecommendedIds.length === 0 ? (
            <ListLabel
              icon={<StarIcon className="size-3" />}
              collapsed={recommendedCollapsed}
              onToggle={() => setRecommendedCollapsed((v) => !v)}
            >Recommended</ListLabel>
            {recommendedCollapsed ? null : visibleRecommendedIds.length === 0 ? (
              <div className="px-2.5 py-2 text-xs text-muted-foreground">
                No default models.
              </div>
@@ -1203,7 +1291,7 @@ export function HubModelPicker({
                );
              })
            )}
            {hasMoreRecommended && (
            {!recommendedCollapsed && hasMoreRecommended && (
              <>
                <div ref={recommendedSentinelRef} className="h-px" />
                <div className="flex items-center justify-center py-2">
@@ -1216,7 +1304,7 @@ export function HubModelPicker({
        {showHfSection && filteredRecommendedIds.length > 0 ? (
          <>
            <ListLabel>Recommended</ListLabel>
            <ListLabel icon={<StarIcon className="size-3" />}>Recommended</ListLabel>
            {filteredRecommendedIds.map((id) => {
              const vram = recommendedVramMap.get(id);
              return (


@@ -262,6 +262,12 @@ export interface BrowseFoldersResponse {
  model_files_here?: number;
}

export async function listRecommendedFolders(): Promise<string[]> {
  const response = await authFetch("/api/models/recommended-folders");
  const data = await parseJsonOrThrow<{ folders: string[] }>(response);
  return data.folders;
}

export async function browseFolders(
  path?: string,
  showHidden = false,


@@ -437,9 +437,10 @@ export function useChatModelRuntime() {
    const { chatTemplateOverride, kvCacheDtype, customContextLength, ggufContextLength, speculativeType } = useChatRuntimeStore.getState();
    // GGUF: use custom context length, or 0 = model's native context
    // Non-GGUF: use the Max Seq Length slider value
    const isDirectGgufFile = modelId.toLowerCase().endsWith(".gguf");
    const effectiveMaxSeqLength = customContextLength != null
      ? customContextLength
      : ggufVariant != null ? (ggufContextLength ?? 0) : maxSeqLength;
      : (ggufVariant != null || isDirectGgufFile) ? (ggufContextLength ?? 0) : maxSeqLength;

    const loadResponse = await loadModel({
      model_path: modelId,
      hf_token: hfToken,