+++
disableToc = false
title = "Backend Monitor"
weight = 20
url = "/features/backend-monitor/"
+++

LocalAI provides endpoints to monitor and manage running backends. The `/backend/monitor` endpoint reports the status and resource usage of a loaded model, and `/backend/shutdown` stops a model's backend process.

## Monitor API

- **Method:** `GET`
- **Endpoints:** `/backend/monitor`, `/v1/backend/monitor`

### Request

Although the method is `GET`, the model name is passed in a JSON request body:

| Parameter | Type     | Required | Description                    |
|-----------|----------|----------|--------------------------------|
| `model`   | `string` | Yes      | Name of the model to monitor   |

### Response

Returns a JSON object with the backend status:

| Field                | Type     | Description                                           |
|----------------------|----------|-------------------------------------------------------|
| `state`              | `int`    | Backend state: `0` = uninitialized, `1` = busy, `2` = ready, `-1` = error |
| `memory`             | `object` | Memory usage information                              |
| `memory.total`       | `uint64` | Total memory usage in bytes                           |
| `memory.breakdown`   | `object` | Per-component memory breakdown (key-value pairs)      |

If the gRPC status call fails, the endpoint falls back to local process metrics:

| Field            | Type    | Description                    |
|------------------|---------|--------------------------------|
| `memory_info`    | `object`| Process memory info (RSS, VMS) |
| `memory_percent` | `float` | Memory usage percentage        |
| `cpu_percent`    | `float` | CPU usage percentage           |
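
Because the endpoint can return either of the two shapes described above, a client may want to check which one it received before reading fields. The following is a minimal sketch, not part of LocalAI: `describe_status` and `STATE_NAMES` are hypothetical names, and the shape detection assumes the field names listed in the tables above.

```python
# Hypothetical helper: names and output format are illustrative assumptions.
STATE_NAMES = {0: "uninitialized", 1: "busy", 2: "ready", -1: "error"}


def describe_status(payload: dict) -> str:
    """Summarize a /backend/monitor response in one line."""
    if "state" in payload:
        # Shape 1: backend status with a state code and a memory object.
        code = payload["state"]
        state = STATE_NAMES.get(code, f"unknown ({code})")
        total = payload.get("memory", {}).get("total", 0)
        return f"state={state}, memory={total / 2**20:.1f} MiB"
    if "cpu_percent" in payload or "memory_percent" in payload:
        # Shape 2: fallback process metrics.
        cpu = payload.get("cpu_percent", 0.0)
        mem = payload.get("memory_percent", 0.0)
        return f"fallback metrics: cpu={cpu:.1f}%, memory={mem:.1f}%"
    return "unrecognized response"
```

Unknown state codes are reported rather than rejected, so the helper keeps working if further states are added.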

### Usage

```bash
curl -X GET http://localhost:8080/backend/monitor \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```
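
The same request can be issued from Python's standard library. This is a sketch under stated assumptions rather than an official client: `build_monitor_request` and `fetch_backend_status` are hypothetical helper names, and the base URL and model name are the placeholder values from the curl example. The method is set explicitly because `urllib` defaults to `POST` whenever a request body is attached.

```python
import json
import urllib.request


def build_monitor_request(base_url: str, model: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for /backend/monitor."""
    body = json.dumps({"model": model}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/backend/monitor",
        data=body,
        headers={"Content-Type": "application/json"},
        method="GET",  # explicit: urllib would otherwise use POST with a body
    )


def fetch_backend_status(base_url: str, model: str, timeout: float = 10.0) -> dict:
    """Send the request and decode the JSON response."""
    request = build_monitor_request(base_url, model)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a LocalAI instance listening locally, `fetch_backend_status("http://localhost:8080", "my-model")` would return the decoded response as a dictionary.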

### Example response

```json
{
  "state": 2,
  "memory": {
    "total": 1073741824,
    "breakdown": {
      "weights": 536870912,
      "kv_cache": 268435456
    }
  }
}
```
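
To read the breakdown at a glance, each component can be expressed as a share of `memory.total`. The sketch below is illustrative only: `memory_shares` and the `unattributed` label are hypothetical, the component names come from the example response, and nothing in this page states that the breakdown entries must sum to the total, so any remainder is reported separately.

```python
def memory_shares(memory: dict) -> dict:
    """Return each breakdown component as a percentage of memory.total."""
    total = memory.get("total", 0)
    breakdown = memory.get("breakdown", {})
    if total == 0:
        return {}  # avoid division by zero when no usage is reported
    shares = {name: 100 * size / total for name, size in breakdown.items()}
    # Assumption: memory not covered by any component is grouped under a
    # hypothetical "unattributed" key.
    remainder = total - sum(breakdown.values())
    if remainder > 0:
        shares["unattributed"] = 100 * remainder / total
    return shares
```

For the example response above, the weights account for half of the reported total and the KV cache for a quarter.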

## Shutdown API

- **Method:** `POST`
- **Endpoints:** `/backend/shutdown`, `/v1/backend/shutdown`

### Request

The request body is JSON:

| Parameter | Type     | Required | Description                     |
|-----------|----------|----------|---------------------------------|
| `model`   | `string` | Yes      | Name of the model to shut down  |

### Usage

```bash
curl -X POST http://localhost:8080/backend/shutdown \
  -H "Content-Type: application/json" \
  -d '{"model": "my-model"}'
```

### Response

Returns `200 OK` with a shutdown confirmation message on success.

## Error Responses

| Status Code | Description                                    |
|-------------|------------------------------------------------|
| 400         | Invalid or missing model name                  |
| 500         | Backend error or model not loaded              |
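
A client can map these status codes to readable messages when calling the shutdown endpoint. The following is a minimal sketch using only the Python standard library; `explain_status`, `shutdown_backend`, and `ERROR_HINTS` are hypothetical names, and the messages simply restate the table above. Codes not listed in the table are reported as unexpected rather than guessed at.

```python
import json
import urllib.error
import urllib.request

# Descriptions taken from the error table; the dictionary itself is illustrative.
ERROR_HINTS = {
    400: "invalid or missing model name",
    500: "backend error or model not loaded",
}


def explain_status(code: int) -> str:
    """Translate an HTTP status code into a short message."""
    if 200 <= code < 300:
        return "ok"
    return ERROR_HINTS.get(code, f"unexpected status {code}")


def shutdown_backend(base_url: str, model: str, timeout: float = 10.0) -> str:
    """POST to /backend/shutdown and describe the outcome."""
    request = urllib.request.Request(
        url=f"{base_url.rstrip('/')}/backend/shutdown",
        data=json.dumps({"model": model}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return explain_status(response.status)
    except urllib.error.HTTPError as err:
        # urllib raises HTTPError for 4xx and 5xx responses.
        return explain_status(err.code)
```

Connection failures (for example, when no server is listening) raise `urllib.error.URLError` and are deliberately left for the caller to handle.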