+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++
LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` stops a model's backend process.
## Monitor API

- **Method**: GET
- **Endpoints**: `/backend/monitor`, `/v1/backend/monitor`
### Request

The request body is JSON:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Name of the model to monitor |
### Response

Returns a JSON object with the backend status:

| Field | Type | Description |
|---|---|---|
| `state` | int | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory` | object | Memory usage information |
| `memory.total` | uint64 | Total memory usage in bytes |
| `memory.breakdown` | object | Per-component memory breakdown (key-value pairs) |
If the gRPC status call fails, the endpoint falls back to local process metrics:

| Field | Type | Description |
|---|---|---|
| `memory_info` | object | Process memory info (RSS, VMS) |
| `memory_percent` | float | Memory usage percentage |
| `cpu_percent` | float | CPU usage percentage |
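Because the fallback carries different field names than the normal status object, a client can tell which shape it received by inspecting the keys. A minimal sketch (the helper name is illustrative; the field names come from the tables above):

```python
def is_fallback_metrics(status: dict) -> bool:
    """True if /backend/monitor returned local process metrics
    (i.e. the gRPC status call failed) rather than backend state."""
    return "memory_info" in status or "cpu_percent" in status

# A normal response carries "state"/"memory"; the fallback carries
# process-level fields instead.
print(is_fallback_metrics({"state": 2, "memory": {"total": 0}}))          # False
print(is_fallback_metrics({"memory_percent": 3.1, "cpu_percent": 12.5}))  # True
```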
### Usage

```bash
curl http://localhost:8080/backend/monitor \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
### Example response

```json
{
  "state": 2,
  "memory": {
    "total": 1073741824,
    "breakdown": {
      "weights": 536870912,
      "kv_cache": 268435456
    }
  }
}
```
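The state codes and memory fields above can be decoded client-side. A short Python sketch using only the standard library (`STATE_NAMES` and `summarize_status` are illustrative names, not part of the API; the state codes are those listed in the response table):

```python
import json

# State codes from the monitor response table above.
STATE_NAMES = {0: "uninitialized", 1: "busy", 2: "ready", -1: "error"}

def summarize_status(body: str) -> str:
    """Turn a /backend/monitor JSON body into a one-line summary."""
    status = json.loads(body)
    state = STATE_NAMES.get(status.get("state"), "unknown")
    total = status.get("memory", {}).get("total", 0)
    return f"{state} ({total / 2**20:.0f} MiB)"

# Summarizing the example response above:
example = ('{"state": 2, "memory": {"total": 1073741824, '
           '"breakdown": {"weights": 536870912, "kv_cache": 268435456}}}')
print(summarize_status(example))  # ready (1024 MiB)
```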
## Shutdown API

- **Method**: POST
- **Endpoints**: `/backend/shutdown`, `/v1/backend/shutdown`
### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Name of the model to shut down |
### Usage

```bash
curl -X POST http://localhost:8080/backend/shutdown \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
### Response

Returns `200 OK` with a shutdown confirmation message on success.
## Error Responses
| Status Code | Description |
|---|---|
| 400 | Invalid or missing model name |
| 500 | Backend error or model not loaded |
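Putting the shutdown call together with the error codes above, here is a minimal client sketch using only the Python standard library (the base URL and helper names are assumptions for a local LocalAI instance; this is not an official client):

```python
import json
import urllib.error
import urllib.request

BASE = "http://localhost:8080"  # assumed local LocalAI instance

def backend_request(path: str, model: str) -> urllib.request.Request:
    """Build a JSON POST request for /backend/shutdown."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def shutdown(model: str) -> bool:
    """POST /backend/shutdown; True on 200 OK, False on 400/500."""
    try:
        with urllib.request.urlopen(backend_request("/backend/shutdown", model)) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # 400: invalid or missing model name; 500: backend error or model not loaded
        return False
```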