+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++

LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of a loaded model, and `/backend/shutdown` stops a model's backend process.

## Monitor API

- **Method**: GET
- **Endpoints**: `/backend/monitor`, `/v1/backend/monitor`

### Request

The request body is JSON:

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | Name of the model to monitor |

### Response

Returns a JSON object with the backend status:

| Field | Type | Description |
|-------|------|-------------|
| `state` | int | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory` | object | Memory usage information |
| `memory.total` | uint64 | Total memory usage in bytes |
| `memory.breakdown` | object | Per-component memory breakdown (key-value pairs) |
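Client code typically wants a readable label rather than the raw state integer. A minimal sketch of that mapping (the `state_label` helper name is illustrative; the codes mirror the table above):

```python
# Map the numeric backend state reported by /backend/monitor to a label.
# Codes taken from the table above; anything else is treated as unknown.
STATE_LABELS = {
    0: "uninitialized",
    1: "busy",
    2: "ready",
    -1: "error",
}

def state_label(state: int) -> str:
    """Return a human-readable label for a backend state code."""
    return STATE_LABELS.get(state, f"unknown ({state})")
```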

If the gRPC status call fails, the endpoint falls back to local process metrics:

| Field | Type | Description |
|-------|------|-------------|
| `memory_info` | object | Process memory info (RSS, VMS) |
| `memory_percent` | float | Memory usage percentage |
| `cpu_percent` | float | CPU usage percentage |

### Usage

```bash
curl http://localhost:8080/backend/monitor \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
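The same call can be made from Python with only the standard library. This is a sketch under stated assumptions: the base URL and model name are placeholders, and the `monitor_payload`/`backend_monitor` helper names are illustrative. The request method is set to GET explicitly to match the documented method, since a request with a body would otherwise default to POST:

```python
import json
import urllib.request

def monitor_payload(model: str) -> bytes:
    """Encode the JSON request body expected by /backend/monitor."""
    return json.dumps({"model": model}).encode("utf-8")

def backend_monitor(base_url: str, model: str) -> dict:
    """Query /backend/monitor for a model and return the parsed JSON status."""
    req = urllib.request.Request(
        f"{base_url}/backend/monitor",
        data=monitor_payload(model),
        headers={"Content-Type": "application/json"},
        method="GET",  # documented as GET, even though the request carries a body
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires a running LocalAI instance):
# status = backend_monitor("http://localhost:8080", "my-model")
```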

### Example response

```json
{
  "state": 2,
  "memory": {
    "total": 1073741824,
    "breakdown": {
      "weights": 536870912,
      "kv_cache": 268435456
    }
  }
}
```
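A response like the one above can be summarized client-side, for example by converting the per-component breakdown into percentages of the total. A minimal sketch, assuming the schema shown above (the `summarize_memory` name is illustrative):

```python
def summarize_memory(resp: dict) -> dict:
    """Compute each breakdown component's share of total memory (percent)."""
    total = resp["memory"]["total"]
    breakdown = resp["memory"].get("breakdown", {})
    return {name: round(100 * used / total, 1) for name, used in breakdown.items()}

# Using the example response from above:
example = {
    "state": 2,
    "memory": {
        "total": 1073741824,
        "breakdown": {"weights": 536870912, "kv_cache": 268435456},
    },
}
# weights is half of total, kv_cache a quarter
```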

## Shutdown API

- **Method**: POST
- **Endpoints**: `/backend/shutdown`, `/v1/backend/shutdown`

### Request

| Parameter | Type | Required | Description |
|-----------|------|----------|-------------|
| `model` | string | Yes | Name of the model to shut down |

### Usage

```bash
curl -X POST http://localhost:8080/backend/shutdown \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```

### Response

Returns `200 OK` with a shutdown confirmation message on success.

## Error Responses

| Status Code | Description |
|-------------|-------------|
| 400 | Invalid or missing model name |
| 500 | Backend error or model not loaded |
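Client code can translate these status codes into the descriptions above. A minimal sketch (the `describe_backend_error` helper name is illustrative; the messages mirror the table):

```python
def describe_backend_error(status: int) -> str:
    """Map /backend/monitor and /backend/shutdown HTTP status codes to messages."""
    if status == 200:
        return "success"
    if status == 400:
        return "invalid or missing model name"
    if status == 500:
        return "backend error or model not loaded"
    return f"unexpected status {status}"
```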