+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++
LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` stops a model's backend process.
## Monitor API

- **Method**: GET
- **Endpoints**: `/backend/monitor`, `/v1/backend/monitor`
### Request

The request body is JSON:

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Name of the model to monitor |
### Response

Returns a JSON object with the backend status:

| Field | Type | Description |
|---|---|---|
| `state` | int | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory` | object | Memory usage information |
| `memory.total` | uint64 | Total memory usage in bytes |
| `memory.breakdown` | object | Per-component memory breakdown (key-value pairs) |
If the gRPC status call fails, the endpoint falls back to local process metrics:

| Field | Type | Description |
|---|---|---|
| `memory_info` | object | Process memory info (RSS, VMS) |
| `memory_percent` | float | Memory usage percentage |
| `cpu_percent` | float | CPU usage percentage |
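Because the fallback carries different field names than the normal status object, a client can tell which shape it received by inspecting the keys. A minimal sketch (the helper name is illustrative; the field names come from the tables above):

```python
def is_fallback_metrics(status: dict) -> bool:
    """True if /backend/monitor returned local process metrics
    (i.e. the gRPC status call failed) rather than backend state."""
    return "memory_info" in status or "cpu_percent" in status

# A normal response carries "state"/"memory"; the fallback carries
# process-level fields instead.
print(is_fallback_metrics({"state": 2, "memory": {"total": 0}}))          # False
print(is_fallback_metrics({"memory_percent": 3.1, "cpu_percent": 12.5}))  # True
```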
### Usage

```bash
curl http://localhost:8080/backend/monitor \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
### Example response

```json
{
  "state": 2,
  "memory": {
    "total": 1073741824,
    "breakdown": {
      "weights": 536870912,
      "kv_cache": 268435456
    }
  }
}
```
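The state codes and memory fields above can be decoded client-side. A short Python sketch using only the standard library (`STATE_NAMES` and `summarize_status` are illustrative names, not part of the API; the state codes are those listed in the response table):

```python
import json

# State codes from the monitor response table above.
STATE_NAMES = {0: "uninitialized", 1: "busy", 2: "ready", -1: "error"}

def summarize_status(body: str) -> str:
    """Turn a /backend/monitor JSON body into a one-line summary."""
    status = json.loads(body)
    state = STATE_NAMES.get(status.get("state"), "unknown")
    total = status.get("memory", {}).get("total", 0)
    return f"{state} ({total / 2**20:.0f} MiB)"

# Summarizing the example response above:
example = ('{"state": 2, "memory": {"total": 1073741824, '
           '"breakdown": {"weights": 536870912, "kv_cache": 268435456}}}')
print(summarize_status(example))  # ready (1024 MiB)
```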
## Shutdown API

- **Method**: POST
- **Endpoints**: `/backend/shutdown`, `/v1/backend/shutdown`
### Request

| Parameter | Type | Required | Description |
|---|---|---|---|
| `model` | string | Yes | Name of the model to shut down |
### Usage

```bash
curl -X POST http://localhost:8080/backend/shutdown \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
### Response

Returns `200 OK` with a shutdown confirmation message on success.
## Error Responses
| Status Code | Description |
|---|---|
| 400 | Invalid or missing model name |
| 500 | Backend error or model not loaded |
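Putting the shutdown call together with the error codes above, here is a minimal client sketch using only the Python standard library (the base URL and helper names are assumptions for a local LocalAI instance; this is not an official client):

```python
import json
import urllib.error
import urllib.request

BASE = "http://localhost:8080"  # assumed local LocalAI instance

def backend_request(path: str, model: str) -> urllib.request.Request:
    """Build a JSON POST request for /backend/shutdown."""
    return urllib.request.Request(
        BASE + path,
        data=json.dumps({"model": model}).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def shutdown(model: str) -> bool:
    """POST /backend/shutdown; True on 200 OK, False on 400/500."""
    try:
        with urllib.request.urlopen(backend_request("/backend/shutdown", model)) as resp:
            return resp.status == 200
    except urllib.error.HTTPError:
        # 400: invalid or missing model name; 500: backend error or model not loaded
        return False
```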