+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++
LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` allows stopping a model's backend process.
## Monitor API
- **Method:** `GET`
- **Endpoints:** `/backend/monitor`, `/v1/backend/monitor`

### Request

The request body is JSON:

| Parameter | Type     | Required | Description                  |
|-----------|----------|----------|------------------------------|
| `model`   | `string` | Yes      | Name of the model to monitor |

### Response

Returns a JSON object with the backend status:

| Field              | Type     | Description                                                               |
|--------------------|----------|---------------------------------------------------------------------------|
| `state`            | `int`    | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory`           | `object` | Memory usage information                                                  |
| `memory.total`     | `uint64` | Total memory usage in bytes                                               |
| `memory.breakdown` | `object` | Per-component memory breakdown (key-value pairs)                          |

If the gRPC status call fails, the endpoint falls back to local process metrics:

| Field            | Type     | Description                    |
|------------------|----------|--------------------------------|
| `memory_info`    | `object` | Process memory info (RSS, VMS) |
| `memory_percent` | `float`  | Memory usage percentage        |
| `cpu_percent`    | `float`  | CPU usage percentage           |
### Usage
```bash
curl -X GET http://localhost:8080/backend/monitor \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
### Example response
```json
{
  "state": 2,
  "memory": {
    "total": 1073741824,
    "breakdown": {
      "weights": 536870912,
      "kv_cache": 268435456
    }
  }
}
```
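
The numeric `state` field is easier to work with once mapped to a label. A minimal client-side sketch (the helper name and summary format are illustrative, not part of the API) that turns the response above into a one-line summary:

```python
import json

# Labels for the `state` field documented above.
STATE_LABELS = {-1: "error", 0: "uninitialized", 1: "busy", 2: "ready"}

def summarize_status(payload: str) -> str:
    """Turn a /backend/monitor JSON response into a one-line summary."""
    status = json.loads(payload)
    label = STATE_LABELS.get(status.get("state"), "unknown")
    total_mib = status.get("memory", {}).get("total", 0) / (1024 * 1024)
    return f"state={label} memory={total_mib:.0f}MiB"

response = '{"state": 2, "memory": {"total": 1073741824, "breakdown": {}}}'
print(summarize_status(response))  # state=ready memory=1024MiB
```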
## Shutdown API
- **Method:** `POST`
- **Endpoints:** `/backend/shutdown`, `/v1/backend/shutdown`

### Request

| Parameter | Type     | Required | Description                    |
|-----------|----------|----------|--------------------------------|
| `model`   | `string` | Yes      | Name of the model to shut down |
### Usage
```bash
curl -X POST http://localhost:8080/backend/shutdown \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
### Response
Returns `200 OK` with a confirmation message on success.
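
The monitor and shutdown endpoints pair naturally: poll each loaded model, then stop any that are idle-ready but exceed a memory budget. A hypothetical selection helper (the budget, model names, and function are assumptions for illustration; the status shape matches `/backend/monitor`):

```python
def models_to_shutdown(statuses: dict, memory_budget: int) -> list:
    """Pick models whose backend is ready (state == 2) but whose
    total memory use exceeds the given budget in bytes."""
    return [
        model
        for model, status in statuses.items()
        if status["state"] == 2 and status["memory"]["total"] > memory_budget
    ]

# Example monitor results for two hypothetical models.
statuses = {
    "llama": {"state": 2, "memory": {"total": 4 * 1024**3}},    # 4 GiB
    "whisper": {"state": 2, "memory": {"total": 512 * 1024**2}}, # 512 MiB
}
print(models_to_shutdown(statuses, memory_budget=1024**3))  # ['llama']
```

Each selected model would then get a `POST /backend/shutdown` request as shown in the Usage section above.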

## Error Responses

| Status Code | Description                       |
|-------------|-----------------------------------|
| 400         | Invalid or missing model name     |
| 500         | Backend error or model not loaded |
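
A client can translate these documented codes into messages before surfacing them; anything else outside the 2xx range is unexpected. A small sketch (function name and wording are illustrative):

```python
# Client-side handling of the documented error codes.
def describe_error(status_code: int) -> str:
    errors = {
        400: "invalid or missing model name",
        500: "backend error or model not loaded",
    }
    if 200 <= status_code < 300:
        return "ok"
    return errors.get(status_code, "unexpected status")

print(describe_error(400))  # invalid or missing model name
print(describe_error(200))  # ok
```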