mirror of https://github.com/unslothai/unsloth
synced 2026-04-21 13:37:39 +00:00
11 commits
7252410ccc
studio: stream export worker output into the export dialog (#4897)
* studio: stream export worker output into the export dialog
The Export Model dialog only showed a spinner on the "Exporting..."
button while the worker subprocess was doing the actual heavy lifting.
For Merged to 16bit and GGUF / Llama.cpp exports this meant several
minutes (or more, for large models) of opaque silence, with no way to
tell whether save_pretrained_merged, convert_hf_to_gguf.py, or
llama-quantize was making progress.
This adds a live terminal-style output panel inside the export dialog,
rendered just above the Cancel / Start Export buttons and scrollable
with auto-follow-tail. It shows stdout and stderr from both the worker
process itself and any child process it spawns (GGUF converter,
llama-quantize), coloured by stream.
Backend
- core/export/worker.py: new _setup_log_capture(resp_queue) installed
before LogConfig.setup_logging. It saves the original stdout/stderr
fds, creates pipes, os.dup2's the write ends onto fds 1 and 2 (so
every child process inherits the redirected fds), and spins up two
daemon reader threads. Each thread reads bytes from a pipe, echoes
them back to the original fd (so the server console keeps working),
splits on \n and \r, and forwards each line to the resp queue as
{"type":"log","stream":"stdout|stderr","line":...,"ts":...}.
PYTHONUNBUFFERED=1 is set so nested Python converters flush
immediately.
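The fd-level capture described above can be sketched as follows. This is a simplified, hypothetical version of _setup_log_capture (the real worker.py also installs LogConfig afterwards and timestamps each message); it assumes only that captured lines are forwarded onto a queue:

```python
import os
import threading
from queue import Queue

def setup_log_capture(resp_queue: Queue) -> None:
    """Redirect fds 1/2 into pipes; echo to the saved fds and forward lines."""
    for fd, stream in ((1, "stdout"), (2, "stderr")):
        saved = os.dup(fd)                  # keep the original console fd
        read_end, write_end = os.pipe()
        os.dup2(write_end, fd)              # children inherit the redirected fd
        os.close(write_end)

        def reader(read_end=read_end, saved=saved, stream=stream):
            buf = b""
            while chunk := os.read(read_end, 4096):
                os.write(saved, chunk)      # server console keeps working
                buf += chunk
                # split on \n and \r so carriage-return progress bars emit lines
                *lines, buf = buf.replace(b"\r", b"\n").split(b"\n")
                for line in lines:
                    if line:
                        resp_queue.put({"type": "log", "stream": stream,
                                        "line": line.decode(errors="replace")})

        threading.Thread(target=reader, daemon=True).start()

    os.environ["PYTHONUNBUFFERED"] = "1"    # nested Python converters flush promptly
```

Because the redirection happens at the fd level rather than by swapping sys.stdout, anything a child subprocess writes to fds 1/2 flows through the same pipes.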
- core/export/orchestrator.py:
- Thread-safe ring buffer (collections.deque, maxlen 4000) with a
monotonically increasing seq counter. clear_logs(),
get_logs_since(cursor), get_current_log_seq(), is_export_active().
- _wait_response handles rtype == "log" by appending to the buffer
and continuing the wait loop. Status messages are also surfaced as
a "status" stream so users see high level progress alongside raw
subprocess output.
- load_checkpoint, _run_export, and cleanup_memory now wrap their
bodies with the existing self._lock (previously unused), clear the
log buffer at the start of each op, and flip _export_active in a
try/finally so the SSE endpoint can detect idle.
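A minimal sketch of such a seq-cursored ring buffer (the class name is hypothetical; the real orchestrator exposes the same behavior through clear_logs / get_logs_since):

```python
import threading
from collections import deque

class LogBuffer:
    """Thread-safe ring buffer with a monotonically increasing seq cursor."""

    def __init__(self, maxlen=4000):
        self._lock = threading.Lock()
        self._buf = deque(maxlen=maxlen)   # (seq, line); old entries drop off
        self._seq = 0
        self._run_start_seq = 0

    def append(self, line):
        with self._lock:
            self._seq += 1
            self._buf.append((self._seq, line))
            return self._seq

    def clear(self):
        # seq is deliberately NOT reset, so reconnecting clients still make
        # progress; snapshot the run start so SSE clients can default there
        with self._lock:
            self._buf.clear()
            self._run_start_seq = self._seq

    def get_since(self, cursor):
        with self._lock:
            return [(s, l) for s, l in self._buf if s > cursor]
```

A client that last saw seq N simply asks for get_since(N); everything the ring still holds beyond its cursor comes back in order.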
- routes/export.py:
- Wrapped every sync orchestrator call (load_checkpoint,
cleanup_memory, export_merged_model, export_base_model,
export_gguf, export_lora_adapter) in asyncio.to_thread so the
FastAPI event loop stays free during long exports. Without this
the new SSE endpoint could not be served concurrently with the
blocking export POST.
- New GET /api/export/logs/stream SSE endpoint. Honors
Last-Event-ID and a since query param for reconnect, emits log /
heartbeat / complete / error events, uses the id field to carry
the log seq so clients can resume cleanly. On first connect
without an explicit cursor it starts from the current seq so old
lines from a previous run are not replayed.
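The reconnect-cursor rules and SSE framing can be illustrated framework-free; the function names here are hypothetical helpers, not the actual route code:

```python
def resolve_cursor(last_event_id, since, run_start_seq):
    """Pick the resume point: the Last-Event-ID header wins, then the
    ?since= query param; a fresh client starts at the current run's
    first seq so lines from previous runs are never replayed."""
    for value in (last_event_id, since):
        if value is not None:
            try:
                return int(value)
            except ValueError:
                pass
    return run_start_seq

def sse_event(event, data, event_id=None):
    """Format one Server-Sent Events frame; the id field carries the
    log seq so clients can resume cleanly after a reconnect."""
    lines = []
    if event_id is not None:
        lines.append(f"id: {event_id}")
    lines.append(f"event: {event}")
    lines.extend(f"data: {chunk}" for chunk in data.split("\n"))
    return "\n".join(lines) + "\n\n"
```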
Frontend
- features/export/api/export-api.ts: streamExportLogs() helper that
authFetches the SSE endpoint and parses id / event / data fields
manually (same pattern as streamTrainingProgress in train-api.ts).
- features/export/components/export-dialog.tsx:
- Local useExportLogs(exporting) hook that opens the SSE stream when
exporting transitions to true, accumulates up to 4000 lines in
component state, and aborts on cleanup.
- New scrollable output panel rendered above DialogFooter, only
shown for Merged to 16bit and GGUF / Llama.cpp (LoRA adapter is
a fast disk write with nothing to show). Dark terminal styling
(bg-black/85, emerald text, rose for stderr, sky for status),
max-height 14rem, auto-scrolls to the bottom on new output but
stops following if the user scrolls up. A small streaming / idle
indicator is shown next to the panel title.
- DialogContent widens from sm:max-w-lg to sm:max-w-2xl when the
output panel is visible so the logs have room to breathe.
Verified
- Python smoke test (tests/smoke_export_log_capture.py): spawns a
real mp.get_context("spawn") process, installs _setup_log_capture,
confirms that parent stdout prints, parent stderr prints, AND a
child subprocess invoked via subprocess.run (both its stdout and
stderr) are all captured in the resp queue. Passes.
- Orchestrator log helpers tested in isolation: _append_log,
get_logs_since (with and without a cursor), clear_logs not
resetting seq so reconnecting clients still progress. Passes.
- routes.export imports cleanly in the studio venv and /logs/stream
shows up in router.routes.
- bun run build: tsc -b plus vite build, no TypeScript errors.
No existing export behavior is changed. If the subprocess, the SSE
endpoint, or the frontend hook fails, the export itself still runs to
completion the same way it did before, with or without logs visible.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* export dialog: trim bootstrap noise, scope logs per screen, show realpath
Several follow-ups to the live export log work:
1. Worker bootstrap noise (transformers venv activation, Unsloth banner,
"Top GGUF/hub models" lists, vision detection, 2k-step weight load
bar) is dropped from the export-dialog stream. A threading.Event
gate in worker.py defaults closed and only opens once _handle_export
actually starts; until then the reader thread still echoes lines to
the saved console fd for debugging but does not push them onto the
resp_queue. The orchestrator already spawns a fresh subprocess for
every checkpoint load, so the gate is naturally reset between runs.
2. tqdm in non-tty mode defaults to a 10s mininterval, which makes
multi-step bars look frozen in the panel. Set TQDM_MININTERVAL=0.5
in the worker env so any tqdm-driven progress emits more often.
3. The dialog's useExportLogs hook now also clears its line buffer
when exportMethod or open changes, so re-opening the dialog into a
different action's screen no longer shows the previous action's
saved output. A useElapsedSeconds tick + "Working Xs" badge in the
log header gives users a visible sign that long single-step phases
(cache copies, GGUF conversion) are still running when no new lines
are arriving.
4. ExportBackend.export_{merged,base,gguf,lora} now return
(success, message, output_path); the worker forwards output_path on
each export_*_done response, the orchestrator's _run_export passes
it to routes/export.py, which surfaces it via
ExportOperationResponse.details.output_path. The dialog's Export
Complete screen renders the resolved on-disk realpath under "Saved
to" so users can find their exported model directly.
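The gate from item 1 can be sketched as below (hypothetical class; in worker.py the check lives inside the pipe reader threads, with the console echo going to the saved fd):

```python
import threading

class GatedForwarder:
    """Echo every line to the console, but only forward to the resp
    queue once the export actually starts (gate defaults closed)."""

    def __init__(self, resp_queue):
        self.resp_queue = resp_queue
        self.gate = threading.Event()   # opened by the export handler

    def handle_line(self, line, console):
        console.append(line)            # stand-in for echoing to the saved fd
        if self.gate.is_set():
            self.resp_queue.put({"type": "log", "line": line})
```

Because the orchestrator spawns a fresh subprocess per checkpoint load, the Event starts closed again on every run with no explicit reset needed.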
* fix(cli): unpack 3-tuple return from export backend
ExportOrchestrator.export_{merged,base,gguf,lora} now return
(success, message, output_path) so the studio dialog can show
the on-disk realpath. The CLI still unpacked 2 values, so every
`unsloth export --format ...` crashed with ValueError before
reporting completion. Update the four call sites and surface
output_path via a "Saved to:" echo.
* fix(studio): anchor export log SSE cursor at run start
The export dialog SSE defaulted its cursor to get_current_log_seq()
at connect time, so any line emitted between the POST that kicks
off the export and the client opening the stream was buffered with
seqs 1..k and then skipped (seq <= cursor). Long-running exports
looked silent during their first seconds.
Snapshot _log_seq into _run_start_seq inside clear_logs() and
expose it via get_run_start_seq(). The SSE default cursor now uses
that snapshot, so every line emitted since the current run began
is reachable regardless of when the client connects. Old runs
still can't leak in because their seqs are <= the snapshot.
* fix(studio): reconnect export log SSE on stream drop
useExportLogs launched streamExportLogs once per exporting
transition and recorded any drop in .catch(). Long GGUF exports
behind a proxy with an idle kill-timeout would silently lose the
stream for the rest of the run even though the backend already
supports Last-Event-ID resume. The "retry: 3000" directive emitted
by the backend is only meaningful to native EventSource; this
hook uses a manual fetch + ReadableStream parse so it had no
effect.
Wrap streamExportLogs in a retry loop that tracks lastSeq from
ExportLogEvent.id and passes it as since on reconnect. Backoff is
exponential with jitter, capped at 5s, reset on successful open.
The loop stops on explicit backend `complete` event or on effect
cleanup.
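The actual hook is TypeScript; the same retry shape in a Python sketch, where open_stream stands in for streamExportLogs and yields event ids:

```python
import random
import time

def stream_with_retry(open_stream, is_done, base=0.5, cap=5.0):
    """Reconnect loop: resume from the last seen seq, back off
    exponentially with jitter (capped), reset after a successful open."""
    last_seq = 0
    attempt = 0
    while not is_done():
        try:
            for event_id in open_stream(since=last_seq):
                attempt = 0                 # stream opened: reset backoff
                last_seq = max(last_seq, event_id)
        except ConnectionError:
            delay = min(cap, base * 2 ** attempt) * random.uniform(0.5, 1.0)
            attempt += 1
            time.sleep(delay)
    return last_seq
```

The key detail is passing last_seq back in as the since cursor, so a proxy-killed stream resumes exactly where it dropped instead of replaying or losing lines.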
* fix(studio): register a second command so Typer keeps `export` as a subcommand
The CLI export unpacking tests wrap `unsloth_cli.commands.export.export`
in a fresh Typer app with a single registered command. Typer flattens a
single-command app into that command, so the test's
`runner.invoke(cli_app, ["export", ckpt, out, ...])` treats the leading
`"export"` token as an unexpected extra positional argument -- every
parametrized case failed with:
Got unexpected extra argument (.../out)
Register a harmless `noop` second command so Typer preserves subcommand
routing and the tests actually exercise the 3-tuple unpack path they
were written to guard.
Before: 4 failed
After: 4 passed
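A self-contained reproduction of the Typer behavior (illustrative app, not the real CLI):

```python
import typer
from typer.testing import CliRunner

app = typer.Typer()

@app.command()
def export(ckpt: str, out: str):
    typer.echo(f"exporting {ckpt} -> {out}")

# With a single registered command, Typer flattens the app into that
# command, so invoking ["export", ...] treats "export" as a positional
# argument. Registering any second command restores subcommand routing.
@app.command()
def noop():
    """Placeholder so `export` stays addressable as a subcommand."""

runner = CliRunner()
result = runner.invoke(app, ["export", "ckpt.bin", "out_dir"])
```

Without the noop command, the same invoke fails with "Got unexpected extra argument", which is exactly the failure mode the commit message describes.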
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: studio-install <studio@local.install>
Co-authored-by: Roland Tannous <115670425+rolandtannous@users.noreply.github.com>
Co-authored-by: Lee Jackson <130007945+Imagineer99@users.noreply.github.com>
Co-authored-by: Roland Tannous <rolandtannous@gravityq.ai>
9a261aec5f
Studio: Expose openai and anthropic compatible external API end points (#4956)
* Studio: add API key authentication for programmatic access
External users want to hit the Studio API (chat completions with tool
calling, training, export, etc.) without going through the browser
login flow. This adds sk-unsloth- prefixed API keys that work as a
drop-in replacement for JWTs in the Authorization: Bearer header.
Backend:
- New api_keys table in SQLite (storage.py)
- create/list/revoke/validate functions with SHA-256 hashed storage
- API key detection in _get_current_subject before the JWT path
- POST/GET/DELETE /api/auth/api-keys endpoints on the auth router
Frontend:
- /api-keys page with create form, one-time key reveal, keys table
- API Keys link in desktop and mobile navbar
- Route registered with requireAuth guard
Zero changes to any existing route handler -- every endpoint that uses
Depends(get_current_subject) automatically works with API keys.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Use actual origin in API key usage examples
The examples on /api-keys were hardcoded to localhost:8888, which is
wrong for remote users. Use window.location.origin so the examples
show the correct URL regardless of where the user is connecting from.
* Add `unsloth studio run` CLI command for one-liner model serving
Adds a `run` subcommand that starts Studio, loads a model, creates an
API key, and prints a ready-to-use curl command -- similar to
`ollama run` or `vllm serve`.
Usage: unsloth studio run -m unsloth/Qwen3-1.7B-GGUF --gguf-variant UD-Q4_K_XL
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add end-to-end tests for `unsloth studio run` and API key usage
Tests the 4 usage examples from the API Keys page:
1. curl basic (non-streaming) chat completions
2. curl streaming (SSE) chat completions
3. OpenAI Python SDK streaming completions
4. curl with tools (web_search + python)
Also tests --help output, invalid key rejection, and no-key rejection.
All 7 tests pass against Qwen3-1.7B-GGUF.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add /v1/completions, /v1/embeddings, /v1/responses endpoints and --parallel support
- llama_cpp.py: accept n_parallel param, pass to llama-server --parallel
- run.py: plumb llama_parallel_slots through to app.state
- inference.py: add /completions and /embeddings as transparent
  proxies to llama-server, add /responses as application-level
  endpoint that converts to ChatCompletionRequest; thread n_parallel
  through load_model
- studio.py: set llama_parallel_slots=4 for `unsloth studio run` path
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Make /v1/responses endpoint match OpenAI Responses API format
The existing /v1/responses shim returned Chat Completions format,
which broke OpenAI SDK clients using openai.responses.create(). This
commit replaces the endpoint with a proper implementation that:
- Returns `output` array with `output_text` content parts instead of
  `choices` with `message`
- Uses `input_tokens`/`output_tokens` instead of
  `prompt_tokens`/`completion_tokens` in usage
- Sets `object: "response"` and `id: "resp_..."`
- Emits named SSE events for streaming (response.created,
  response.output_text.delta, response.completed, etc.)
- Accepts all OpenAI Responses API fields (tools, store, metadata,
  previous_response_id) without erroring -- silently ignored
- Maps `developer` role to `system` and `input_text`/`input_image`
  content parts to the internal Chat format
Adds Pydantic schemas for request/response models and 23 unit tests
covering schema validation, input normalisation, and response format.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Studio: add Anthropic-compatible /v1/messages endpoint (#4981)
* Add Anthropic-compatible /v1/messages endpoint with tool support
Translate Anthropic Messages API format to/from internal OpenAI
format and reuse the existing server-side agentic tool loop. Supports
streaming SSE (message_start, content_block_delta, etc.) and
non-streaming JSON. Includes offline unit tests and e2e tests in
test_studio_run.py.
* Add enable_tools, enabled_tools, session_id to /v1/messages endpoint
Support the same shorthand as /v1/chat/completions: enable_tools=true
with an optional enabled_tools list uses built-in server tools
without requiring full Anthropic tool definitions. session_id is
passed through for sandbox isolation. max_tokens is now optional.
* Strip leaked tool-call XML from Anthropic endpoint content
Apply _TOOL_XML_RE to content events in both streaming and
non-streaming tool paths, matching the OpenAI endpoint behavior.
* Emit custom tool_result SSE event in Anthropic stream
Adds a non-standard tool_result event between the tool_use block
close and the next text block, so clients can see server-side tool
execution results. Anthropic SDKs ignore unknown event types.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Split /v1/messages into server-side and client-side tool paths
enable_tools=true runs the existing server-side agentic loop with
built-in tools (web_search/python/terminal). A bare tools=[...] field
now triggers a client-side pass-through: client-provided tools are
forwarded to llama-server and any tool_use output is returned to the
caller with stop_reason=tool_use for client execution. This fixes
Claude Code (and any Anthropic SDK client) which sends tools=[...]
expecting client-side execution but was previously routed through
execute_tool() and failing with 'Unknown tool'.
Adds AnthropicPassthroughEmitter to convert llama-server OpenAI SSE
chunks into Anthropic SSE events, plus unit tests covering text
blocks, tool_use blocks, mixed, stop reasons, and usage.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Fix httpcore GeneratorExit in /v1/messages passthrough stream
Explicitly aclose aiter_lines() before the surrounding async with
blocks unwind, mirroring the prior fix in external_provider.py (
5557e1fd27
studio: unify Windows installer/setup logging style, verbosity controls, and startup messaging (#4651)
* refactor(studio): unify setup terminal output style and add verbose setup mode
* studio(windows): align setup.ps1 banner/steps with setup.sh (ANSI, verbose)
* studio(setup): revert nvcc path reordering to match main
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* studio(setup): restore fail-fast llama.cpp setup flow
* studio(banner): use IPv6 loopback URL when binding :: or ::1
* Fix IPv6 URL bracketing, try_quiet stderr, _step label clamp
- Bracket IPv6 display_host in external_url to produce clickable URLs
- Redirect try_quiet failure log to stderr instead of stdout
- Clamp _step label to column width to prevent negative padding
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add sandbox integration tests for PR #4494 UX fixes
Simulation harness (tests/simulate_pr4494.py) creates an isolated uv
venv, copies the real source files into it, and runs subprocess tests
for all three fixes with visual before/after demos and edge cases.
Standalone bash test (tests/test_try_quiet.sh) validates try_quiet
stderr redirect across 8 scenarios including broken-version contrast.
39 integration tests total (14 IPv6 + 15 try_quiet + 10 _step), all
existing 75 unit tests still pass.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Truncate step() labels in setup.sh to match PS1 and Python
The %-15s printf format pads short labels but does not truncate long
ones. Change to %-15.15s so labels wider than 15 chars are clipped,
matching the PowerShell .Substring(0,15) and Python label[:15] logic.
* Remove sandbox integration tests from PR
These test files are not part of the styling fix and should not ship
with this PR.
* Show error output on failure instead of suppressing it
- install_python_stack.py: restore _red for patch_package_file
  warnings (was downgraded to _dim)
- setup.ps1: capture winget output and show on failure for CUDA, Node,
  Python, and OpenSSL installs (was piped to Out-Null)
- setup.ps1: always show git pull failure warning, not just in verbose
  mode
* Show winget error output for Git and CMake installs on failure
Same capture-and-print-on-failure pattern already used for Node,
Python, CUDA, and OpenSSL winget installs.
* fix: preserve stderr for _run_quiet error messages in setup.sh
The step() helper writes to stdout, but _run_quiet's error header was
originally sent to stderr (>&2). Without the redirect, callers that
separate stdout/stderr would miss the failure headline while still
seeing the log body on stderr. Add >&2 to both step calls inside
_run_quiet to match main's behavior.
* feat: add --verbose flag to setup and update commands
Wire UNSLOTH_VERBOSE=1 through _run_setup_script() so that 'unsloth
studio update --verbose' (and the deprecated 'setup') passes the flag
to setup.sh / setup.ps1 / install_python_stack.py.
* fix(studio): honor verbose logging and keep llama.cpp failures non-blocking
* fix(studio): switch installer to 'studio update' and normalize Windows setup logs
* chore(studio): refine localhost tip and remove skip-base setup noise
* fix(studio): align Windows setup logs with Linux style and improve startup tips
* fix(studio): align Windows setup logs with Linux style
* refactor(windows-installer): align install/setup logs with Linux style and silence auto-launch output
* refactor(windows): align installer/setup output with Linux style and reduce default verbosity
* refactor(windows): match install.ps1 output style/colors to setup and quiet default logs
* fix(studio-banner): update personal-computer localhost tip
* fix(setup.sh): restore verbose llama.cpp build output while keeping default quiet mode
* fix(install.sh): align installer logging with setup style and restore POSIX-safe color output
* fix(install.sh): preserve installer reliability and launch visibility
Export verbose mode for child setup processes, harden install command
handling under set -e, and keep first-run studio launch non-silent so
users can always see URL and port fallback output.
* fix(windows installer): keep exit semantics and degraded status accurate
Use quiet command redirection that preserves native exit codes, keep
startup output visible on first launch, and report limited install
status when llama.cpp is unavailable.
* fix(setup.sh): improve log clarity and enforce GGUF degraded signaling
Restore clean default setup output, add verbose-only diagnostics, fail
fast on Colab dependency install errors, and return non-zero when GGUF
prerequisites or llama.cpp artifacts are unavailable.
* fix(installer): harden bash preflight and PowerShell GPU checks
Fail fast when bash is unavailable before invoking setup.sh, and
replace remaining nvidia-smi pipeline checks with stream redirection
patterns that preserve reliable native exit-code handling.
* fix(windows): keep verbose output visible while preserving exit codes
Ensure PowerShell wrapper helpers in install/update stream native
command output to the host without returning it as function output, so
npm logs no longer corrupt exit-code checks in verbose mode.
* fix(windows): avoid sticky UNSLOTH_VERBOSE and gate studio update verbosity
* Fix degraded llama.cpp exit code, PS verbose stderr, banner URLs, npm verbose
- setup.sh: Do not exit non-zero when llama.cpp is unavailable; the
  footer already reports the limitation, and install.sh runs under
  set -e so a non-zero exit aborts the entire install including
  PATH/shortcuts/launch.
- setup.ps1: Remove $? check in Invoke-SetupCommand verbose path; PS
  5.1 sets $? = $false when native commands write to stderr even with
  exit 0. Merge stderr into stdout with 2>&1 and rely solely on
  $LASTEXITCODE.
- startup_banner.py: Show the actual bound address when Studio is
  bound to a non-loopback interface instead of always showing
  127.0.0.1/localhost.
- setup.sh: Use run_quiet_no_exit instead of run_quiet_no_exit_always
  for npm install steps so --verbose correctly surfaces npm output.
* Fix install.ps1 verbose stderr, propagate UNSLOTH_VERBOSE, fix git clone verbose
- install.ps1: Apply the same Invoke-InstallCommand fix as setup.ps1 --
  merge stderr into stdout with 2>&1 and drop the $? check that
  misclassifies successful native commands on PS 5.1.
- install.ps1 + setup.ps1: Export UNSLOTH_VERBOSE=1 to the process env
  when --verbose is passed so child processes like
  install_python_stack.py also run in verbose mode.
- setup.sh: Use run_quiet_no_exit for git clone llama.cpp so --verbose
  correctly surfaces clone diagnostics during source-build fallback.
* Surface prebuilt llama.cpp output in verbose mode, remove dead code, fix banner
- setup.sh: Use tee in verbose mode for the prebuilt llama.cpp
  installer so users can see download/validation progress while still
  capturing the log for structured error reporting on failure.
- setup.ps1: Same fix for Windows -- use Tee-Object in verbose mode.
- setup.sh: Remove run_quiet_no_exit_always() which has no remaining
  callers.
- startup_banner.py: Avoid printing the same URL twice when Studio is
  bound to a specific non-loopback address that matches the display
  host.
* Fix run_install_cmd exit code after failed if-statement
The previous pattern 'if "$@"; then return 0; fi; _rc=$?' always
captured $? = 0 because $? reflects the if-statement result, not the
command's exit code. Switch to '"$@" && return 0; _rc=$?' which
preserves the actual command exit code on failure. Applies to both
verbose and quiet branches.
* Fix _run_quiet exit code, double uv install, missing --local flag
- setup.sh: Fix the _run_quiet verbose path that always captured exit
  code 0 due to $? resetting after if-then-fi with no else. Switch to
  the same '"$@" && return 0; exit_code=$?' pattern used in install.sh.
- setup.sh: Consolidate the two uv install branches (verbose + quiet)
  into a single attempt with conditional output. Previously, when
  verbose mode was on and the install failed, a second silent attempt
  was made.
- install.ps1: Pass the --local flag to 'unsloth studio update' when
  $StudioLocalInstall is true. Without this, studio.py's update()
  command overwrites STUDIO_LOCAL_INSTALL to "0", which could cause
  issues if setup.ps1 or install_python_stack.py later checks that
  variable.
* Revert SKIP_STUDIO_BASE change for --no-torch, restore install banners
- Revert SKIP_STUDIO_BASE from 0 to 1 for --no-torch. install.sh
  already installs unsloth+unsloth-zoo and no-torch-runtime.txt before
  calling setup.sh, so letting install_python_stack.py redo it was
  redundant and slowed down --no-torch installs for no benefit.
- Restore the "Unsloth Studio installed!" success banner and "starting
  Unsloth Studio..." launch message so users get clear install
  completion feedback before the server starts.
* Make llama.cpp build failure a hard error with proper cleanup
- setup.sh: Restore exit 1 when _LLAMA_CPP_DEGRADED is true. GGUF
  inference requires a working llama.cpp build, so this should be a
  hard failure, not a silent degradation.
- install.sh: Catch setup.sh's non-zero exit with '|| _SETUP_EXIT=$?'
  instead of letting set -e abort immediately. This ensures PATH
  setup, symlinks, and shortcuts still get created so the user can fix
  the build deps and retry with 'unsloth studio update'. After
  post-install steps, propagate the failure with a clear error
  message.
* Revert install.ps1 to 'studio setup' to preserve SKIP_STUDIO_BASE
'studio update' pops SKIP_STUDIO_BASE from the environment, which
defeats the fast-path version check added in PR #4667. When called
from install.ps1 (which already installed packages), SKIP_STUDIO_BASE=1
must survive into setup.ps1 so it skips the redundant PyPI check and
package reinstallation. 'studio setup' does not modify env vars.
* Remove deprecation message from 'studio setup' command
install.ps1 uses 'studio setup' (not 'studio update') to preserve
SKIP_STUDIO_BASE. The deprecation message was confusing during first
install since the user never typed the command.
* Fix stale env vars, scope degraded exit, generic error message for PR #4651
- install.ps1: Always set STUDIO_LOCAL_INSTALL and clear
  STUDIO_LOCAL_REPO when not using --local, to prevent stale values
  from a previous --local run in the same PowerShell session. Fix log
  messages to say 'setup' not 'update' since we call 'studio setup'.
- setup.sh: Only exit non-zero for degraded llama.cpp when called from
  the installer (SKIP_STUDIO_BASE=1). Direct 'unsloth studio update'
  keeps degraded installs successful since Studio is still usable for
  non-GGUF workflows and the footer already reports the limitation.
- install.sh: Make the setup failure error message generic instead of
  GGUF-specific, so unrelated failures (npm, Python deps) do not show
  misleading cmake/git recovery advice.
* Show captured output on failure in quiet mode for PR #4651
Both Invoke-InstallCommand (install.ps1) and Invoke-SetupCommand
(setup.ps1) now capture command output in quiet mode and display it in
red when the command fails. This matches the behavior of
run_install_cmd in install.sh where failure output is surfaced even in
quiet mode, making cross-platform error debugging consistent.
* Match degraded llama.cpp exit on Windows, fix --local recovery hint for PR #4651
- setup.ps1: Exit non-zero for degraded llama.cpp when called from
  install.ps1 (SKIP_STUDIO_BASE=1), matching setup.sh behavior. Direct
  'unsloth studio update' keeps degraded installs successful.
- install.sh: Show 'unsloth studio update --local' in the recovery
  message when the install was run with --local, so users retry with
  the correct flag instead of losing local checkout context.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
5bbfabb151
fix: [Studio] setup.ps1 update-flow for windows (#4667)
* fix: add PyPI version check to setup.ps1 for fast update path
Port the update-flow logic from setup.sh to setup.ps1 so that
`unsloth studio update` on Windows skips Python dependency reinstall
when the installed version already matches PyPI latest.
* fix: clear SKIP_STUDIO_BASE in update command
install.ps1 sets SKIP_STUDIO_BASE=1 which persists in the PowerShell
session. If the user runs `unsloth studio update` in the same
terminal, the env var causes the version check to be skipped. Clear it
explicitly in the update command.
* fix: harden version check and clear stale env vars in update flow
- Normalize $InstalledVer with Out-String + Trim() to avoid
  array/whitespace comparison issues in PowerShell 5.1 (python output
  can be captured as string[] instead of scalar string)
- Move Fast-Install --upgrade pip inside if (-not $SkipPythonDeps) so
  the fast path avoids unnecessary network round-trips
- Clear STUDIO_LOCAL_REPO when --local is not passed to prevent a
  previous --local session from leaking into a plain update
---------
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
0233fe7f9c
studio: setup log styling (#4494)
* refactor(studio): unify setup terminal output style and add verbose setup mode
* studio(windows): align setup.ps1 banner/steps with setup.sh (ANSI, verbose)
* studio(setup): revert nvcc path reordering to match main
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* studio(setup): restore fail-fast llama.cpp setup flow
* studio(banner): use IPv6 loopback URL when binding :: or ::1
* Fix IPv6 URL bracketing, try_quiet stderr, _step label clamp
- Bracket IPv6 display_host in external_url to produce clickable URLs
- Redirect try_quiet failure log to stderr instead of stdout
- Clamp _step label to column width to prevent negative padding
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Add sandbox integration tests for PR #4494 UX fixes
Simulation harness (tests/simulate_pr4494.py) creates an isolated uv
venv, copies the real source files into it, and runs subprocess tests
for all three fixes with visual before/after demos and edge cases.
Standalone bash test (tests/test_try_quiet.sh) validates try_quiet
stderr redirect across 8 scenarios including broken-version contrast.
39 integration tests total (14 IPv6 + 15 try_quiet + 10 _step), all
existing 75 unit tests still pass.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Truncate step() labels in setup.sh to match PS1 and Python
The %-15s printf format pads short labels but does not truncate long
ones. Change to %-15.15s so labels wider than 15 chars are clipped,
matching the PowerShell .Substring(0,15) and Python label[:15] logic.
* Remove sandbox integration tests from PR
These test files are not part of the styling fix and should not ship
with this PR.
* Show error output on failure instead of suppressing it
- install_python_stack.py: restore _red for patch_package_file
  warnings (was downgraded to _dim)
- setup.ps1: capture winget output and show on failure for CUDA, Node,
  Python, and OpenSSL installs (was piped to Out-Null)
- setup.ps1: always show git pull failure warning, not just in verbose
  mode
* Show winget error output for Git and CMake installs on failure
Same capture-and-print-on-failure pattern already used for Node,
Python, CUDA, and OpenSSL winget installs.
* fix: preserve stderr for _run_quiet error messages in setup.sh
The step() helper writes to stdout, but _run_quiet's error header was
originally sent to stderr (>&2). Without the redirect, callers that
separate stdout/stderr would miss the failure headline while still
seeing the log body on stderr. Add >&2 to both step calls inside
_run_quiet to match main's behavior.
* feat: add --verbose flag to setup and update commands
Wire UNSLOTH_VERBOSE=1 through _run_setup_script() so that 'unsloth
studio update --verbose' (and the deprecated 'setup') passes the flag
to setup.sh / setup.ps1 / install_python_stack.py.
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com>
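The truncation fix described above, in Python form: the same clamp the PowerShell and Python banners apply with .Substring(0,15) and label[:15] (helper name hypothetical):

```python
def step_label(label: str, width: int = 15) -> str:
    """Pad short labels and clip long ones, like printf '%-15.15s'
    (plain '%-15s' pads but never truncates)."""
    return f"{label[:width]:<{width}}"
```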
||
|
|
6d6008a1ef
|
Add PID file tracking and unsloth studio stop command (#4598)
* Add PID file tracking and `unsloth studio stop` command
On macOS the .app shortcut launches Studio via osascript into a Terminal window, then the launcher script exits. The server process runs outside of the launcher's context with no PID file, so there is no straightforward way to find or stop it. This adds:
- PID file at ~/.unsloth/studio/studio.pid, written after the server starts and removed on graceful shutdown or via atexit
- `unsloth studio stop` command that reads the PID file and sends SIGTERM (or taskkill on Windows) to shut down the server
The PID file is only removed if it still contains the current process ID, avoiding races when a new server instance replaces a crashed one.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Move atexit PID cleanup into run_server()
The atexit registration was only in the __main__ block, so it did not cover the `unsloth studio` CLI path that calls run_server() directly via studio_default(). Moving it into run_server() ensures the PID file is cleaned up on unexpected exit regardless of entry point.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> |
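The PID-file lifecycle described in this commit can be sketched as below. This is a minimal illustration, not the repo's code: the real path is ~/.unsloth/studio/studio.pid, but a temp directory is used here so the sketch is safe to run anywhere, and the function names are hypothetical.

```python
# Sketch of the PID-file write/cleanup with the stale-PID race guard.
import atexit
import os
import tempfile
from pathlib import Path

# Stand-in for ~/.unsloth/studio/studio.pid (temp dir keeps the sketch safe)
PID_FILE = Path(tempfile.mkdtemp()) / "studio.pid"

def write_pid_file() -> None:
    PID_FILE.parent.mkdir(parents=True, exist_ok=True)
    PID_FILE.write_text(str(os.getpid()))

def remove_pid_file() -> None:
    # Only delete the file if it still holds *our* PID; a new server instance
    # that replaced a crashed one rewrites the file, and must not be clobbered.
    try:
        if PID_FILE.read_text().strip() == str(os.getpid()):
            PID_FILE.unlink()
    except FileNotFoundError:
        pass

write_pid_file()
atexit.register(remove_pid_file)  # covers unexpected exits, per the commit
```

A `stop` command would then read the file and send SIGTERM (or taskkill on Windows) to the recorded PID.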
||
|
|
19e9c60a8e
|
Consolidate dual venvs and separate install from update (#4530)
* refactor: consolidate dual venvs into single ~/.unsloth/studio/unsloth_studio
* refactor: separate install.sh (first-time) from setup.sh (smart update with PyPI version check)
* fix: install.sh calls setup.sh directly, keep both setup and update CLI commands
* fix: use importlib.resources.files() directly without _path attribute
* fix: bootstrap uv before pip upgrade to handle uv venvs without pip
* fix: frontend 404 when launched via CLI, add global symlink to ~/.local/bin
* feat: add --local flag to install.sh and unsloth studio update for branch testing
* fix: resolve repo root from script location for --local installs
* feat: add --package flag to install.sh for testing with custom package names
* feat: add --package flag to unsloth studio update
* fix: always nuke venv in install.sh for clean installs
* revert: remove Windows changes, will handle in separate PR
* fix: error when --package is passed without an argument
* revert: restore Windows scripts to current main
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix: always explicitly set STUDIO_LOCAL_INSTALL and STUDIO_PACKAGE_NAME env vars
* fix: pass explicit STUDIO_LOCAL_REPO env var for --local installs
* fix: align banner box for Setup vs Update labels
* deprecate: hide 'unsloth studio setup' command, point users to update/install.sh
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix: check stdout not stdin for auto-launch detection (curl pipe fix)
* fix: update install URL to unsloth.ai/install.sh
* fix: update install.sh usage comments to unsloth.ai/install.sh
* fix: use --upgrade-package for base deps to preserve existing torch/CUDA installs
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix: --local install now also installs unsloth-zoo via base.txt before editable overlay
* fix: don't skip base packages for --local installs (editable needs unsloth-zoo)
* refactor: move --local full dep install to install.sh, keep SKIP_STUDIO_BASE for all paths
* feat: add migration support for old .venv and CWD-based installs in setup.sh
* Revert "feat: add migration support for old .venv and CWD-based installs in setup.sh"
This reverts commit
|
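One bullet above fixes auto-launch detection by checking stdout instead of stdin: under `curl ... | sh`, stdin is the pipe, so a stdin TTY test is always false even in an interactive terminal, while stdout still points at the TTY. The actual fix is in install.sh (shell's `[ -t 1 ]` rather than `[ -t 0 ]`); here is the same idea rendered in Python as a hedged sketch:

```python
# Under "curl ... | sh", stdin is the pipe (isatty() False), but stdout is
# still the terminal -- so interactivity must be detected on stdout.
import sys

def should_auto_launch() -> bool:
    # Hypothetical helper name; mirrors the "[ -t 1 ]" check in install.sh.
    return sys.stdout.isatty()
```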
||
|
|
797ddd201e
|
Fix Studio silently exiting on Windows without error output (#4527)
* Fix Studio silently exiting on Windows without error output
On Windows, `unsloth studio` launches a child process via subprocess.Popen to run the server in the studio venv. If the child crashes (e.g. due to a missing package), the parent just calls typer.Exit(rc) with no message -- the user sees "Launching Unsloth Studio... Please wait..." and then the prompt returns with zero feedback.
Root cause: `data_designer_unstructured_seed` is imported at the top level in seed.py. If this package is not installed in the studio venv, the entire import chain (seed.py -> routes/__init__.py -> main.py -> run_server()) crashes with ModuleNotFoundError. Since run.py has no try/except around run_server() and studio.py does not report nonzero exit codes, the failure is completely silent.
Changes:
- run.py: wrap run_server() in try/except, print clear error with traceback to stderr. Also reconfigure stderr encoding on Windows so tracebacks with non-ASCII paths do not cause secondary failures.
- studio.py: print an error message when the child process exits with a nonzero code on Windows, so the user knows something went wrong.
- seed.py: make the data_designer_unstructured_seed import optional with a try/except fallback. The server starts normally and only returns HTTP 500 if the unstructured seed endpoints are actually called.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Skip Anaconda/Miniconda Python when creating Studio venv on Windows
Conda-bundled CPython ships modified DLL search paths that prevent torch from loading c10.dll on Windows. The Studio server fails silently at startup because the venv was created with conda's Python. Standalone CPython (python.org, winget, uv) does not have this issue.
Both install.ps1 and setup.ps1 now skip any Python binary whose path contains conda, miniconda, anaconda, miniforge, or mambaforge when selecting the interpreter for the studio venv. If only conda Python is available, the scripts print an error with instructions to install standalone CPython.
* Fix multi-file preview crash and improve setup.ps1 Python discovery
Addresses review findings [10/10] and [8/10]:
1. seed.py: _read_preview_rows_from_multi_files() had a hard import of build_multi_file_preview_rows inside the function body, bypassing the optional-plugin guard. Moved it into the top-level try/except block and added a None guard matching the other functions.
2. setup.ps1: Python discovery now probes py.exe (Python Launcher) first, uses Get-Command -All to look past conda entries that shadow standalone CPython further down PATH, skips WindowsApps stubs, and resolves the actual executable path so venv creation does not re-resolve back to a conda interpreter.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Check sys.base_prefix to catch venvs created from conda Python
A venv created from conda Python (e.g. C:\Users\danie\.venv) has a path that does not contain "conda", but sys.base_prefix still points to the conda install (e.g. C:\Users\danie\miniconda3). The previous path-only check missed this case entirely.
Both install.ps1 and setup.ps1 now use a Test-IsConda helper that checks both the executable path AND sys.base_prefix against the conda/miniconda/anaconda/miniforge/mambaforge pattern. This catches:
- Direct conda Python executables
- Venvs created from conda Python (base_prefix reveals the origin)
* Fix install.ps1 passing version string to uv venv instead of resolved path
Find-CompatiblePython returned a bare version string (e.g. "3.13") which was passed to `uv venv --python 3.13`. uv performs its own interpreter discovery and can resolve that version string back to a conda Python, defeating the entire conda-skip logic. Now Find-CompatiblePython returns a hashtable with both .Version (for display) and .Path (the resolved absolute executable path). The venv is created with `uv venv --python <absolute-path>`, ensuring uv uses the exact interpreter we validated.
* Quote resolved Python path in uv venv call for paths with spaces
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> |
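The Test-IsConda idea above translates directly into Python terms: a venv created from conda Python has a clean-looking interpreter path, but sys.base_prefix still points into the conda install, so both must be checked. A hedged sketch (the function name is illustrative; the pattern list is the one from the commit):

```python
# Sketch of conda-origin detection: check both the interpreter path and
# base_prefix, since a venv made from conda Python only betrays its origin
# through base_prefix.
import re
import sys

_CONDA_PATTERN = re.compile(
    r"conda|miniconda|anaconda|miniforge|mambaforge", re.IGNORECASE
)

def looks_like_conda(executable: str, base_prefix: str) -> bool:
    # Direct conda interpreters match on the executable path; venvs created
    # from conda Python match on base_prefix instead.
    return bool(
        _CONDA_PATTERN.search(executable) or _CONDA_PATTERN.search(base_prefix)
    )

# For the current interpreter: looks_like_conda(sys.executable, sys.base_prefix)
```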
||
|
|
981f477e31
|
fix: reconfigure stdout to UTF-8 on Windows to prevent UnicodeEncodeError on startup (#4493)
* fix: reconfigure stdout UTF-8 on Windows to prevent UnicodeEncodeError from emoji
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* fix: default frontend_path when None to fix blank page when venv is pre-activated
* Restore Windows UTF-8 stdout fix dropped in earlier commit
The cp1252 console encoding on Windows cannot render emoji characters used in startup messages (e.g. print("✅ Frontend loaded ...")). This causes UnicodeEncodeError and crashes the server before it starts.
Place sys.stdout.reconfigure(encoding="utf-8", errors="replace") at the top of run_server(), unconditionally before any print() or structlog call, so all emoji output is covered -- including the frontend status messages and silent=True paths that the original placement missed.
Guarded by sys.platform == "win32" and a hasattr check, so it is a no-op on Linux/macOS and safe in non-standard stdout environments (Jupyter, piped IO).
* fix: preserve run_server(None) as headless, fix CLI frontend kwarg
Remove the frontend_path=None fallback in run_server() that changed None from "headless/API-only" to "mount bundled frontend", breaking backwards compatibility for embedders. The blank-page bug was actually caused by the CLI wrappers always passing frontend_path=frontend (even when frontend=None), which overrode run_server()'s default. Fix studio.py and ui.py to only pass frontend_path when the user explicitly sets --frontend.
* fix: use timeout loop for shutdown event in ui command
Match studio_default()'s shutdown loop that uses a 1-second timeout on Event.wait(). Without a timeout, the bare wait() blocks at the C level on Linux, preventing Python from delivering SIGINT (Ctrl+C).
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com> |
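The guarded reconfigure described above is small enough to show in full. This is a sketch of the pattern the commit describes, not a copy of run_server() (the wrapper function name is illustrative):

```python
# Sketch: make stdout emoji-safe on Windows before any print()/logging call.
import sys

def reconfigure_stdout_for_windows() -> None:
    # The hasattr guard keeps this a no-op for non-standard stdout objects
    # (Jupyter, piped IO) that lack reconfigure(); the platform guard keeps
    # Linux/macOS untouched.
    if sys.platform == "win32" and hasattr(sys.stdout, "reconfigure"):
        sys.stdout.reconfigure(encoding="utf-8", errors="replace")

reconfigure_stdout_for_windows()
print("✅ Frontend loaded ...")  # no UnicodeEncodeError on a cp1252 console
```

errors="replace" means any character the console still cannot map is substituted rather than crashing the server.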
||
|
|
6f129a214b
|
Fix Install commands for Windows + 1 line installs (#4447)
* One liner setup for unsloth studio
* Fix install scripts: system deps, activation bugs, curl/wget support
- install.sh: detect platform (macOS/Linux/WSL) and check for missing system dependencies (cmake, git, build-essential, libcurl4-openssl-dev). Prompt user once for permission to install all missing packages via brew (macOS) or sudo apt-get (Linux/WSL). Add wget fallback via download() helper since curl is not always present on minimal Linux installs. Fix nested curl|sh stdin stealing by downloading uv installer to a tempfile first. Replace venv activation (a no-op in a pipe subshell) with explicit --python flag for uv pip install and direct venv binary invocation. Add idempotency guard for venv creation. Redirect stdin on unsloth studio setup to prevent pipe consumption. On macOS, check for Xcode Command Line Tools and trigger install if missing.
- install.ps1: wrap script body in Install-UnslothStudio function so that errors use return instead of exit (exit kills the terminal when run via irm|iex). Remove activate.ps1 invocation entirely -- use explicit --python path for uv pip install and & $UnslothExe for studio setup. This avoids both the child-scope activation bug (& vs dot-source) and the execution policy error on default Windows systems. Add winget availability check with clear error message. Fix PATH refresh to append registry paths instead of replacing the session PATH. Add uv installer fallback via astral.sh PowerShell script if winget install does not put uv on PATH. Broaden Python version check to accept 3.11-3.13. Add idempotency guard for venv creation.
- README.md: add wget one-liner alternative for systems without curl.
* Fix Tailwind CSS v4 .gitignore bug on Windows (#4444)
- Add .gitignore hiding workaround to setup.ps1 (matching existing setup.sh logic) so venv .gitignore files containing "*" don't prevent Tailwind's oxide scanner from finding .tsx source files
- Add CSS size validation to setup.sh, setup.ps1, and build.sh to catch truncated Tailwind builds early
- Remove stray force-rebuild overrides that made the "skip build if current" cache check dead code in both setup scripts
- Add rm -rf dist to build.sh to force clean rebuilds for wheel packaging
* Change default port 8000 to 8888, fix installer bugs, improve UX
- Change default Studio port from 8000 to 8888 across all entry points (run.py, studio.py, ui.py, colab.py, vite.config.ts, setup scripts)
- Update launch banner: "Launching with studio venv..." to "Launching Unsloth Studio... Please wait..."
- Add "Open your web browser" banner and rename labels (Local -> Local Access, External -> Worldwide Web Address)
- Fix venv idempotency: check for bin/python instead of just directory existence, clean up partial venvs on retry
- Fix build.sh CSS validation: handle the empty-CSS case that silently bypassed the check with "integer expression expected"
- Fix install.sh sudo handling: try apt-get without sudo first (works when root), then escalate with per-package tracking and a user prompt
- Fix install.ps1: check exit code from studio setup, fail on error
- Add pciutils to WSL GGUF build dependencies
- Apply the same smart apt-get escalation pattern to studio/setup.sh
* Use detected Python version for venv, abort on non-apt Linux
- install.ps1: detect existing Python 3.11/3.12/3.13 and use that version for venv creation instead of always forcing 3.13
- install.sh: exit with an error on non-apt Linux distros when required packages cannot be auto-installed, instead of silently continuing
* Make sudo permission prompt more prominent with warning banner
* Add Accept [Y/n] sudo prompt to studio/setup.sh for consistency
* Fix native command exit code handling and sudo decline flow
install.ps1: Add $LASTEXITCODE checks after winget (Python), uv venv, and uv pip install calls. $ErrorActionPreference only catches PowerShell cmdlet errors, not native executable failures. The Python check also handles winget returning non-zero for "already installed".
setup.sh: Skip llama-server build when the user declines sudo or sudo is unavailable. Previously the script continued to section 8, which would fail with confusing errors (e.g. "gcc: command not found") since build-essential was never installed.
* Move rm -rf llama.cpp inside build branch to preserve existing install
When _SKIP_GGUF_BUILD is set (user declined sudo or sudo unavailable), the previous rm -rf would destroy an already-working llama-server before the skip check ran. Move it inside the else branch so existing builds are preserved when the rebuild is skipped.
---------
Co-authored-by: Daniel Han <danielhanchen@users.noreply.github.com>
Co-authored-by: Daniel Han <danielhanchen@gmail.com> |
||
|
|
0c8d407793
|
Rename cli/ to unsloth_cli/ to fix namespace collision with stringzilla (#4393)
* Rename cli/ to unsloth_cli/ to fix namespace collision with stringzilla
stringzilla installs a namespace package at cli/ (cli/split.py, cli/wc.py) in site-packages without an __init__.py. When unsloth is installed as an editable package (pip install -e .), the entry point script does `from cli import app`, which finds stringzilla's namespace cli/ first and fails with `ImportError: cannot import name 'app' from 'cli'`.
Non-editable installs happened to work because unsloth's cli/__init__.py overwrites the namespace directory, but this is fragile and breaks if stringzilla is installed after unsloth. Renaming to unsloth_cli/ avoids the collision entirely and fixes both editable and non-editable install paths.
* [pre-commit.ci] auto fixes from pre-commit.com hooks
for more information, see https://pre-commit.ci
* Update stale cli/ references in comments and license files
---------
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> |
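The collision mechanism can be reproduced in isolation: a directory named cli/ with no __init__.py is importable only as a namespace package, so `from cli import app` fails exactly as described. This sketch uses a temp directory standing in for stringzilla's site-packages entry, and assumes no regular `cli` package is installed in the environment:

```python
# Repro sketch of the namespace-package collision (PEP 420): a cli/ dir
# without __init__.py becomes a namespace package, which has no __file__
# and no `app` attribute to import.
import os
import sys
import tempfile

root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "cli"))  # like stringzilla's cli/: no __init__.py
sys.path.insert(0, root)

import cli                                     # resolves to the namespace package
assert getattr(cli, "__file__", None) is None  # namespace packages lack __file__

try:
    from cli import app                        # what unsloth's entry point did
except ImportError as e:
    print(e)                                   # e.g. cannot import name 'app' from 'cli'
```

Renaming the package sidesteps this entirely, since import resolution never has to choose between two `cli` candidates.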
* Rename cli/ to unsloth_cli/ to fix namespace collision with stringzilla stringzilla installs a namespace package at cli/ (cli/split.py, cli/wc.py) in site-packages without an __init__.py. When unsloth is installed as an editable package (pip install -e .), the entry point script does `from cli import app` which finds stringzilla's namespace cli/ first and fails with `ImportError: cannot import name 'app' from 'cli'`. Non-editable installs happened to work because unsloth's cli/__init__.py overwrites the namespace directory, but this is fragile and breaks if stringzilla is installed after unsloth. Renaming to unsloth_cli/ avoids the collision entirely and fixes both editable and non-editable install paths. * [pre-commit.ci] auto fixes from pre-commit.com hooks for more information, see https://pre-commit.ci * Update stale cli/ references in comments and license files --------- Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com> |