LocalAI/core/http/static
LocalAI [bot] 8af963bdd9
fix(streaming): comply with OpenAI usage / stream_options spec (#9815)
* fix(streaming): comply with OpenAI usage / stream_options spec (#8546)

LocalAI emitted `"usage":{"prompt_tokens":0,...}` on every streamed
chunk because `OpenAIResponse.Usage` was a value type without
`omitempty`. The official OpenAI Node SDK and its consumers
(continuedev/continue, Kilo Code, Roo Code, Zed, IntelliJ Continue)
filter on a truthy `result.usage` to detect the trailing usage chunk;
LocalAI's zero-but-non-null usage on every intermediate chunk made
that filter swallow every content chunk and surface an empty chat
response while the server log looked successful.
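
The failure mode described above can be reproduced with a small sketch. The `chunk` type and `collectContent` function are hypothetical simplifications of a consumer's filter, not code from the OpenAI SDK or from any of the listed clients; the only assumption is that a chunk with a non-null `usage` object is treated as the usage trailer and its content is skipped.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// chunk is a hypothetical, trimmed-down view of a streamed chat chunk as a
// client might decode it. It is not the SDK's real type.
type chunk struct {
	Choices []struct {
		Delta struct {
			Content string `json:"content"`
		} `json:"delta"`
	} `json:"choices"`
	Usage *struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
		TotalTokens      int `json:"total_tokens"`
	} `json:"usage"`
}

// oldShape: every content chunk carries a zero-valued but non-null usage.
var oldShape = []string{
	`{"choices":[{"delta":{"content":"Hel"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}`,
	`{"choices":[{"delta":{"content":"lo"}}],"usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}`,
}

// newShape: content chunks omit usage; one trailer has empty choices.
var newShape = []string{
	`{"choices":[{"delta":{"content":"Hel"}}]}`,
	`{"choices":[{"delta":{"content":"lo"}}]}`,
	`{"choices":[],"usage":{"prompt_tokens":3,"completion_tokens":2,"total_tokens":5}}`,
}

// collectContent mimics a consumer that treats any chunk with a non-null
// usage object as the usage trailer and therefore skips its content.
func collectContent(lines []string) (string, error) {
	out := ""
	for _, l := range lines {
		var c chunk
		if err := json.Unmarshal([]byte(l), &c); err != nil {
			return "", err
		}
		if c.Usage != nil {
			continue // assumed to be the trailing usage chunk
		}
		for _, ch := range c.Choices {
			out += ch.Delta.Content
		}
	}
	return out, nil
}

func main() {
	before, _ := collectContent(oldShape)
	after, _ := collectContent(newShape)
	fmt.Printf("old shape: %q\n", before) // prints old shape: ""
	fmt.Printf("new shape: %q\n", after)  // prints new shape: "Hello"
}
```

With the old shape every chunk looks like a trailer, so the assembled reply is empty even though the server streamed all tokens.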

Changes:

- `core/schema/openai.go`: `Usage *OpenAIUsage \`json:"usage,omitempty"\``
  so intermediate chunks no longer carry a `usage` key. Add
  `OpenAIRequest.StreamOptions` with `include_usage` to mirror OpenAI's
  request field.
- `core/http/endpoints/openai/chat.go` and `completion.go`: keep using
  the `Usage` struct field as an in-process channel for the running
  cumulative token counts, but strip it before JSON marshalling. When
  the request sets `stream_options.include_usage: true`, emit a
  dedicated trailing chunk with `"choices": []` and the populated usage
  (matching the OpenAI spec and llama.cpp's server behavior).
- `chat_emit.go`: new `streamUsageTrailerJSON` helper; drop the
  `usage` parameter from `buildNoActionFinalChunks` since chunks no
  longer carry usage.
- Update `image.go`, `inpainting.go`, `edit.go` to wrap their Usage
  values with `&` for the new pointer field.
- UI: send `stream_options:{include_usage:true}` from the React
  (`useChat.js`) and legacy (`static/chat.js`) chat clients so the
  token-count badge keeps populating now that the server is
  spec-compliant.
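
The server-side changes listed above can be sketched roughly as follows. The struct and field names `OpenAIResponse`, `OpenAIUsage`, and `Usage` come from the commit message; the remaining fields and the helpers `marshalChunk` and `usageTrailer` are hypothetical stand-ins, not the actual code in `core/schema/openai.go` or `chat_emit.go`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OpenAIUsage mirrors the token counters named in the commit message.
type OpenAIUsage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
	TotalTokens      int `json:"total_tokens"`
}

// Delta and Choice are simplified, hypothetical shapes for illustration.
type Delta struct {
	Content string `json:"content"`
}

type Choice struct {
	Index int   `json:"index"`
	Delta Delta `json:"delta"`
}

// OpenAIResponse uses a pointer with omitempty so a nil Usage drops the key.
type OpenAIResponse struct {
	Object  string       `json:"object"`
	Choices []Choice     `json:"choices"`
	Usage   *OpenAIUsage `json:"usage,omitempty"`
}

// marshalChunk strips the in-process running totals before serialising an
// intermediate chunk. resp is a copy, so the caller's value is untouched.
func marshalChunk(resp OpenAIResponse) (string, error) {
	resp.Usage = nil
	b, err := json.Marshal(resp)
	return string(b), err
}

// usageTrailer builds the dedicated final chunk: empty (non-nil) choices so
// the JSON reads "choices":[] rather than null, plus the populated usage.
func usageTrailer(u OpenAIUsage) (string, error) {
	resp := OpenAIResponse{
		Object:  "chat.completion.chunk",
		Choices: []Choice{},
		Usage:   &u,
	}
	b, err := json.Marshal(resp)
	return string(b), err
}

func main() {
	running := &OpenAIUsage{PromptTokens: 3, CompletionTokens: 1, TotalTokens: 4}
	mid, _ := marshalChunk(OpenAIResponse{
		Object:  "chat.completion.chunk",
		Choices: []Choice{{Index: 0, Delta: Delta{Content: "Hi"}}},
		Usage:   running,
	})
	fmt.Println(mid)
	// prints {"object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"Hi"}}]}

	tail, _ := usageTrailer(OpenAIUsage{PromptTokens: 3, CompletionTokens: 2, TotalTokens: 5})
	fmt.Println(tail)
	// prints {"object":"chat.completion.chunk","choices":[],"usage":{"prompt_tokens":3,"completion_tokens":2,"total_tokens":5}}
}
```

The empty slice literal matters here: a nil `Choices` slice would marshal as `null`, which is not the `"choices": []` shape the commit describes.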

Tests:

- New `chat_stream_usage_test.go` pins the spec invariants:
  intermediate chunks have no `usage` key, the trailer JSON has
  `"choices":[]` and a populated `usage`, and `OpenAIRequest` parses
  `stream_options.include_usage`.
- Update `chat_emit_test.go` to reflect that finals no longer embed
  usage.
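
The request-parsing invariant pinned by the new test can be illustrated with a minimal sketch. `OpenAIRequest.StreamOptions` and `include_usage` are named in the commit; the pointer type, the other fields, and the `wantsUsageTrailer` helper are assumptions made for this example, and the real test may check different things.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// StreamOptions mirrors OpenAI's stream_options request object.
type StreamOptions struct {
	IncludeUsage bool `json:"include_usage"`
}

// OpenAIRequest is a trimmed-down, hypothetical version of the real struct.
type OpenAIRequest struct {
	Model         string         `json:"model"`
	Stream        bool           `json:"stream"`
	StreamOptions *StreamOptions `json:"stream_options,omitempty"`
}

// wantsUsageTrailer reports whether a request body opts in to the trailing
// usage chunk. Requiring stream=true as well is an assumption of this sketch.
func wantsUsageTrailer(body string) (bool, error) {
	var req OpenAIRequest
	if err := json.Unmarshal([]byte(body), &req); err != nil {
		return false, err
	}
	return req.Stream && req.StreamOptions != nil && req.StreamOptions.IncludeUsage, nil
}

func main() {
	optIn, _ := wantsUsageTrailer(`{"model":"m","stream":true,"stream_options":{"include_usage":true}}`)
	plain, _ := wantsUsageTrailer(`{"model":"m","stream":true}`)
	fmt.Println(optIn, plain) // prints true false
}
```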

Verified against the live LocalAI instance: before the fix Continue's
filter logic swallowed 16/16 token chunks; with the new shape it
yields 4/5 and routes usage through the dedicated trailer chunk.

Fixes #8546

Assisted-by: Claude:opus-4.7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

* fix(streaming): silence errcheck on usage trailer Fprintf

The new spec-compliant `stream_options.include_usage` trailer writes
were flagged by errcheck because they are new code (golangci-lint runs
new-from-merge-base on master); the surrounding `fmt.Fprintf` `data:`
writes are grandfathered. Discard the return values explicitly to
satisfy the linter without adding a nolint shim.
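
Discarding the return values explicitly presumably looks something like the sketch below. `writeTrailer` and `render` are hypothetical names, and the exact write in `chat.go` may differ; only the `fmt.Fprintf` call and the SSE `data:` prefix come from the commit text.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
)

// writeTrailer writes one SSE data line. Both return values of fmt.Fprintf
// (bytes written and error) are assigned to the blank identifier, which
// makes the decision to ignore them explicit.
func writeTrailer(w io.Writer, payload string) {
	_, _ = fmt.Fprintf(w, "data: %s\n\n", payload)
}

// render is a small helper for the example: it captures the output in memory.
func render(payload string) string {
	var buf bytes.Buffer
	writeTrailer(&buf, payload)
	return buf.String()
}

func main() {
	fmt.Printf("%q\n", render(`{"choices":[]}`))
	// prints "data: {\"choices\":[]}\n\n"
}
```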

Assisted-by: Claude:opus-4.7 [Claude Code]
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

---------

Signed-off-by: Ettore Di Giacinto <mudler@localai.io>
Co-authored-by: Ettore Di Giacinto <mudler@localai.io>
2026-05-14 08:53:46 +02:00
| Name | Last commit | Date |
| --- | --- | --- |
| assets | chore: refactor css, restyle to be slightly minimalistic (#7397) | 2025-11-29 22:11:44 +01:00 |
| animations.css | chore: refactor css, restyle to be slightly minimalistic (#7397) | 2025-11-29 22:11:44 +01:00 |
| chat.js | fix(streaming): comply with OpenAI usage / stream_options spec (#9815) | 2026-05-14 08:53:46 +02:00 |
| components.css | feat(ui): left navbar, dark/light theme (#8594) | 2026-02-18 00:14:39 +01:00 |
| favicon.svg | feat: rebrand - LocalAGI and LocalRecall joins the LocalAI stack family (#5159) | 2025-04-15 17:51:24 +02:00 |
| general.css | chore(ui): improve navigation and buttons placement (#8608) | 2026-02-19 23:41:05 +01:00 |
| image.js | chore(image-ui): simplify interface (#7882) | 2026-01-05 23:20:28 +01:00 |
| logo.png | feat: rebrand - LocalAGI and LocalRecall joins the LocalAI stack family (#5159) | 2025-04-15 17:51:24 +02:00 |
| logo_horizontal.png | feat: rebrand - LocalAGI and LocalRecall joins the LocalAI stack family (#5159) | 2025-04-15 17:51:24 +02:00 |
| p2panimation.js | chore(ux): allow to create and drag dots in the animation (#3287) | 2024-08-19 20:40:55 +02:00 |
| sound.js | feat(musicgen): add ace-step and UI interface (#8396) | 2026-02-05 12:04:53 +01:00 |
| talk.js | feat(realtime): WebRTC support (#8790) | 2026-03-13 21:37:15 +01:00 |
| theme.css | feat(ui): left navbar, dark/light theme (#8594) | 2026-02-18 00:14:39 +01:00 |
| tts.js | feat(ui): remove api key handling and small ui adjustments (#4948) | 2025-03-05 19:37:36 +01:00 |
| typography.css | chore: refactor css, restyle to be slightly minimalistic (#7397) | 2025-11-29 22:11:44 +01:00 |
| video.js | fix(videogen): drop incomplete endpoint, add GGUF support for LTX-2 (#8160) | 2026-01-22 14:09:20 +01:00 |