* fix(streaming): comply with OpenAI usage / stream_options spec (#8546)
LocalAI emitted `"usage":{"prompt_tokens":0,...}` on every streamed
chunk because `OpenAIResponse.Usage` was a value type without
`omitempty`. The official OpenAI Node SDK and its consumers
(continuedev/continue, Kilo Code, Roo Code, Zed, IntelliJ Continue)
filter on a truthy `result.usage` to detect the trailing usage chunk;
LocalAI's zero-but-non-null usage on every intermediate chunk made
that filter swallow every content chunk and surface an empty chat
response while the server log looked successful.
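The difference between the two field shapes can be reproduced with a
minimal sketch. The type names below (`usage`, `chunkByValue`,
`chunkByPointer`) are hypothetical stand-ins, not LocalAI's real
schema; only the `encoding/json` behaviour is what is being shown.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage is a simplified, hypothetical stand-in for the real usage struct.
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// chunkByValue models the old shape: a struct value is always serialized,
// so every chunk carries a zero-but-non-null "usage" object.
type chunkByValue struct {
	ID    string `json:"id"`
	Usage usage  `json:"usage"`
}

// chunkByPointer models the new shape: a nil pointer tagged with omitempty
// is dropped from the output entirely.
type chunkByPointer struct {
	ID    string `json:"id"`
	Usage *usage `json:"usage,omitempty"`
}

// mustJSON marshals v and panics on error (acceptable for a demo only).
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// {"id":"c1","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
	fmt.Println(mustJSON(chunkByValue{ID: "c1"}))
	// {"id":"c1"}
	fmt.Println(mustJSON(chunkByPointer{ID: "c1"}))
}
```

Note that adding `omitempty` to a struct-valued field would not have
been enough on its own: `encoding/json` never treats a struct value as
empty, which is why the field has to become a pointer.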
Changes:
- `core/schema/openai.go`: `Usage *OpenAIUsage \`json:"usage,omitempty"\``
so intermediate chunks no longer carry a `usage` key. Add
`OpenAIRequest.StreamOptions` with `include_usage` to mirror OpenAI's
request field.
- `core/http/endpoints/openai/chat.go` and `completion.go`: keep using
the `Usage` struct field as an in-process channel for the running
cumulative, but strip it before JSON marshalling. When the request
set `stream_options.include_usage: true`, emit a dedicated trailing
chunk with `"choices": []` and the populated usage (matching the
OpenAI spec and llama.cpp's server behavior).
- `chat_emit.go`: new `streamUsageTrailerJSON` helper; drop the
`usage` parameter from `buildNoActionFinalChunks` since chunks no
longer carry usage.
- Update `image.go`, `inpainting.go`, `edit.go` to wrap their Usage
values with `&` for the new pointer field.
- UI: send `stream_options:{include_usage:true}` from the React
(`useChat.js`) and legacy (`static/chat.js`) chat clients so the
token-count badge keeps populating now that the server is
spec-compliant.
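As an illustration of the trailing chunk described above, here is a
minimal sketch. `usageTrailer` and the types around it are hypothetical
and are not the actual `streamUsageTrailerJSON` helper, whose signature
is not shown here; the sketch only demonstrates the wire shape
(`"choices": []` plus a populated `usage`).

```go
package main

import (
	"encoding/json"
	"fmt"
)

// usage is a simplified, hypothetical stand-in for the real usage struct.
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// choice is a placeholder; the trailer never contains any choices.
type choice struct {
	Index int `json:"index"`
}

// chunk is a hypothetical, trimmed-down streaming chunk.
type chunk struct {
	ID      string   `json:"id"`
	Object  string   `json:"object"`
	Choices []choice `json:"choices"`
	Usage   *usage   `json:"usage,omitempty"`
}

// usageTrailer builds the JSON for a dedicated trailing usage chunk.
func usageTrailer(id string, u usage) ([]byte, error) {
	return json.Marshal(chunk{
		ID:     id,
		Object: "chat.completion.chunk",
		// A non-nil empty slice marshals as [] rather than null.
		Choices: []choice{},
		Usage:   &u,
	})
}

func main() {
	b, err := usageTrailer("chatcmpl-1", usage{PromptTokens: 3, CompletionTokens: 5, TotalTokens: 8})
	if err != nil {
		panic(err)
	}
	// Emit it as a server-sent-event data frame.
	fmt.Printf("data: %s\n\n", b)
}
```

The empty-but-non-nil `Choices` slice matters: a nil slice would be
serialized as `"choices":null`, which is not the shape named above.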
Tests:
- New `chat_stream_usage_test.go` pins the spec invariants:
intermediate chunks have no `usage` key, the trailer JSON has
`"choices":[]` and a populated `usage`, and `OpenAIRequest` parses
`stream_options.include_usage`.
- Update `chat_emit_test.go` to reflect that finals no longer embed
usage.
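The request-parsing invariant could be exercised with something like
the sketch below. The `request` and `streamOptions` types and the
`wantsUsage` helper are assumptions made for illustration; they are not
the real `OpenAIRequest` definition.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// streamOptions mirrors the "stream_options" request object (hypothetical type).
type streamOptions struct {
	IncludeUsage bool `json:"include_usage"`
}

// request is a hypothetical, trimmed-down chat request.
type request struct {
	Model         string         `json:"model"`
	Stream        bool           `json:"stream"`
	StreamOptions *streamOptions `json:"stream_options,omitempty"`
}

// wantsUsage reports whether the body asks for the trailing usage chunk.
func wantsUsage(body []byte) (bool, error) {
	var r request
	if err := json.Unmarshal(body, &r); err != nil {
		return false, err
	}
	return r.StreamOptions != nil && r.StreamOptions.IncludeUsage, nil
}

func main() {
	ok, err := wantsUsage([]byte(`{"model":"m","stream":true,"stream_options":{"include_usage":true}}`))
	if err != nil {
		panic(err)
	}
	fmt.Println(ok)
}
```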
Verified against the live LocalAI instance: before the fix Continue's
filter logic swallowed 16/16 token chunks; with the new shape it
yields 4/5 and routes usage through the dedicated trailer chunk.
Fixes #8546
Assisted-by: Claude:opus-4.7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <[email protected]>
* fix(streaming): silence errcheck on usage trailer Fprintf
The new spec-compliant `stream_options.include_usage` trailer writes
were flagged by errcheck since they're new code (golangci-lint runs
new-from-merge-base on master); the surrounding `fmt.Fprintf` data:
writes are grandfathered. Drop the return values explicitly to match
the linter's contract without adding a nolint shim.
Assisted-by: Claude:opus-4.7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
Co-authored-by: Ettore Di Giacinto <[email protected]>
This PR adds support for the 'reasoning' API field of the OpenAI
spec.
LocalAI now automatically extracts thinking tags in both SSE and
non-SSE mode. The Chat UI has been adapted accordingly: it now uses
the reasoning field to obtain the thinking process and display it in
the chat.
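As a rough illustration of what extracting thinking tags can look like,
the sketch below splits a completed response into reasoning and
content. The `<think>` tag name and the `splitReasoning` helper are
assumptions made for this example; the actual tags and extraction logic
used by LocalAI are not described here, and the sketch handles only the
first tagged block of a non-streamed response.

```go
package main

import (
	"fmt"
	"strings"
)

// splitReasoning separates the first <think>...</think> block from the rest
// of the text. The tag name is an assumption made for this sketch.
func splitReasoning(s string) (reasoning, content string) {
	const openTag, closeTag = "<think>", "</think>"
	start := strings.Index(s, openTag)
	if start < 0 {
		// No thinking block: everything is regular content.
		return "", s
	}
	rest := s[start+len(openTag):]
	end := strings.Index(rest, closeTag)
	if end < 0 {
		// Unclosed tag: treat everything after it as reasoning.
		return strings.TrimSpace(rest), strings.TrimSpace(s[:start])
	}
	reasoning = strings.TrimSpace(rest[:end])
	content = strings.TrimSpace(s[:start] + rest[end+len(closeTag):])
	return reasoning, content
}

func main() {
	r, c := splitReasoning("<think>check the units first</think>The answer is 42.")
	fmt.Printf("reasoning=%q content=%q\n", r, c)
}
```

In SSE mode the same split has to be done incrementally, since a tag
can be cut across two chunks; that case is out of scope for this
sketch.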
This fixes https://github.com/mudler/LocalAI/issues/7944
Signed-off-by: Ettore Di Giacinto <[email protected]>
Fixes a minor glitch where the header was not updated when switching
models from the chat pane. It also allows creating a new chat directly
by clicking on a model in the management pane.
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(agent): agent jobs
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Multiple webhooks, simplify
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Do not use cron with seconds
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Create separate pages for details
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Detect if no models have MCP configuration, show wizard
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Make the services tests run
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(ui): add watchdog settings
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Do not re-read env
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Some refactor, move other settings to runtime (p2p)
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Add API Keys handling
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Allow to disable runtime settings
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Documentation
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Small fixups
Signed-off-by: Ettore Di Giacinto <[email protected]>
* show MCP toggle in index
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Drop context default
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(importer): support ollama and OCI, unify code
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat: support importing from local file
Signed-off-by: Ettore Di Giacinto <[email protected]>
* also support YAML config files
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Correctly handle local files
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Extract importing errors
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Add importer tests
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Add integration tests
Signed-off-by: Ettore Di Giacinto <[email protected]>
* chore(UX): improve and specify supported URI formats
Signed-off-by: Ettore Di Giacinto <[email protected]>
* fail if backend does not have a runfile
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Adapt tests
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(gallery): add cache for galleries
Signed-off-by: Ettore Di Giacinto <[email protected]>
* fix(ui): remove handler duplicate
File input handlers are now handled by Alpine.js @change handlers in chat.html.
Removed duplicate listeners to prevent files from being processed twice
Signed-off-by: Ettore Di Giacinto <[email protected]>
* fix(ui): be consistent in attachments in the chat
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Fail if no importer matches
Signed-off-by: Ettore Di Giacinto <[email protected]>
* fix: propagate ops correctly
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Fixups
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Move management to separate section
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Make index to redirect to chat
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Use logo in index
Signed-off-by: Ettore Di Giacinto <[email protected]>
* work out the wizard in the front-page
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
* Support file inputs
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat: support multiple files
Signed-off-by: Ettore Di Giacinto <[email protected]>
* show preview of files
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
* chore(ui): drop set api key button
Signed-off-by: Ettore Di Giacinto <[email protected]>
* chore(ui): show in-progress installs in model view
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(ui): improve text to image view
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(ui): show more information in the chat view, minor adjustments to model gallery
Signed-off-by: Ettore Di Giacinto <[email protected]>
* fix(ui): UI improvements
Visual improvements and bugfixes including:
- disable pagination during search
- fix scrolling on new message
Signed-off-by: Ettore Di Giacinto <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
Makes the web app honour the `X-Forwarded-Prefix` HTTP request header, which a reverse proxy may send to inform the app that its public routes contain a path prefix.
For instance, this allows serving the webapp via a reverse proxy or ingress controller under a path prefix such as `/localai/`, while the regular LocalAI routes without the prefix remain usable when connecting directly to the LocalAI server.
Changes:
* Add new `StripPathPrefix` middleware to strip the path prefix (provided with the `X-Forwarded-Prefix` HTTP request header) from the request path prior to matching the HTTP route.
* Add a `BaseURL` utility function to build the base URL, honouring the `X-Forwarded-Prefix` HTTP request header.
* Generate the derived base URL into the HTML (`head.html` template) as `<base/>` tag.
* Make all webapp-internal URLs (within HTML+JS) relative in order to make the browser resolve them against the `<base/>` URL specified within each HTML page's header.
* Make font URLs within the CSS files relative to the CSS file.
* Generate redirect location URLs using the new `BaseURL` function.
* Use the new `BaseURL` function to generate absolute URLs within gallery JSON responses.
Closes #3095
TL;DR:
The header-based approach moves the path prefix configuration entirely to the reverse proxy or ingress, instead of requiring the prefix to be kept aligned between LocalAI, the reverse proxy, and potentially other internal LocalAI clients.
The gofiber swagger handler already supports path prefixes this way, see e2d9e9916d/swagger.go (L79)
Signed-off-by: Max Goltzsche <[email protected]>
* feat(ui): allow to set system prompt for chat
Also make the models in the index clickable, and display them as a table
Fixes #2257
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(vision): also support PNG with base64 input
Signed-off-by: Ettore Di Giacinto <[email protected]>
* feat(ui): support vision and upload of files
Signed-off-by: Ettore Di Giacinto <[email protected]>
* display the processed image
Signed-off-by: Ettore Di Giacinto <[email protected]>
* make trust remote code stand out
Signed-off-by: mudler <[email protected]>
* feat(ui): track in progress job across index/model gallery
Signed-off-by: mudler <[email protected]>
* minor fixups
Signed-off-by: mudler <[email protected]>
---------
Signed-off-by: Ettore Di Giacinto <[email protected]>
Signed-off-by: mudler <[email protected]>