mirror of
https://github.com/mudler/LocalAI
synced 2026-04-21 13:27:21 +00:00
94 lines
2.8 KiB
Markdown
94 lines
2.8 KiB
Markdown
|
|
+++
|
||
|
|
disableToc = false
|
||
|
|
title = "Backend Monitor"
|
||
|
|
weight = 20
|
||
|
|
url = "/features/backend-monitor/"
|
||
|
|
+++
|
||
|
|
|
||
|
|
LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of loaded models, and `/backend/shutdown` allows stopping a model's backend process.
|
||
|
|
|
||
|
|
## Monitor API
|
||
|
|
|
||
|
|
- **Method:** `GET`
|
||
|
|
- **Endpoints:** `/backend/monitor`, `/v1/backend/monitor`
|
||
|
|
|
||
|
|
### Request
|
||
|
|
|
||
|
|
The request body is JSON:
|
||
|
|
|
||
|
|
| Parameter | Type | Required | Description |
|
||
|
|
|-----------|----------|----------|--------------------------------|
|
||
|
|
| `model` | `string` | Yes | Name of the model to monitor |
|
||
|
|
|
||
|
|
### Response
|
||
|
|
|
||
|
|
Returns a JSON object with the backend status:
|
||
|
|
|
||
|
|
| Field | Type | Description |
|
||
|
|
|----------------------|----------|-------------------------------------------------------|
|
||
|
|
| `state` | `int` | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
|
||
|
|
| `memory` | `object` | Memory usage information |
|
||
|
|
| `memory.total` | `uint64` | Total memory usage in bytes |
|
||
|
|
| `memory.breakdown` | `object` | Per-component memory breakdown (key-value pairs) |
|
||
|
|
|
||
|
|
If the gRPC status call fails, the endpoint falls back to local process metrics:
|
||
|
|
|
||
|
|
| Field | Type | Description |
|
||
|
|
|------------------|---------|--------------------------------|
|
||
|
|
| `memory_info` | `object`| Process memory info (RSS, VMS) |
|
||
|
|
| `memory_percent` | `float` | Memory usage percentage |
|
||
|
|
| `cpu_percent` | `float` | CPU usage percentage |
|
||
|
|
|
||
|
|
### Usage
|
||
|
|
|
||
|
|
```bash
|
||
|
|
curl http://localhost:8080/backend/monitor \
|
||
|
|
-H "Content-Type: application/json" \
|
||
|
|
-d '{"model": "my-model"}'
|
||
|
|
```
|
||
|
|
|
||
|
|
### Example response
|
||
|
|
|
||
|
|
```json
|
||
|
|
{
|
||
|
|
"state": 2,
|
||
|
|
"memory": {
|
||
|
|
"total": 1073741824,
|
||
|
|
"breakdown": {
|
||
|
|
"weights": 536870912,
|
||
|
|
"kv_cache": 268435456
|
||
|
|
}
|
||
|
|
}
|
||
|
|
}
|
||
|
|
```
|
||
|
|
|
||
|
|
## Shutdown API
|
||
|
|
|
||
|
|
- **Method:** `POST`
|
||
|
|
- **Endpoints:** `/backend/shutdown`, `/v1/backend/shutdown`
|
||
|
|
|
||
|
|
### Request
|
||
|
|
|
||
|
|
| Parameter | Type | Required | Description |
|
||
|
|
|-----------|----------|----------|---------------------------------|
|
||
|
|
| `model` | `string` | Yes | Name of the model to shut down |
|
||
|
|
|
||
|
|
### Usage
|
||
|
|
|
||
|
|
```bash
|
||
|
|
curl -X POST http://localhost:8080/backend/shutdown \
|
||
|
|
-H "Content-Type: application/json" \
|
||
|
|
-d '{"model": "my-model"}'
|
||
|
|
```
|
||
|
|
|
||
|
|
### Response
|
||
|
|
|
||
|
|
Returns `200 OK` with the shutdown confirmation message on success.
|
||
|
|
|
||
|
|
## Error Responses
|
||
|
|
|
||
|
|
| Status Code | Description |
|
||
|
|
|-------------|------------------------------------------------|
|
||
|
|
| 400 | Invalid or missing model name |
|
||
|
|
| 500 | Backend error or model not loaded |
|