LocalAI is your complete AI stack for running AI models locally. It's designed to be simple, efficient, and accessible, providing a drop-in replacement for OpenAI's API while keeping your data private and secure.
LocalAI is a single binary (or container) that gives you everything you need:
- **OpenAI-compatible API** — Drop-in replacement for OpenAI, Anthropic, and Open Responses APIs
- **Built-in Web Interface** — Chat, model management, agent creation, image generation, and system monitoring
- **AI Agents** — Create autonomous agents with MCP (Model Context Protocol) tool support, directly from the UI
- **Multiple Model Support** — LLMs, image generation, text-to-speech, speech-to-text, vision, embeddings, and more
- **GPU Acceleration** — Automatic detection and support for NVIDIA, AMD, Intel, and Vulkan GPUs
- **Distributed Mode** — Scale horizontally with worker nodes, P2P federation, and model sharding
- **No GPU Required** — Runs on CPU with consumer-grade hardware
LocalAI integrates [LocalAGI](https://github.com/mudler/LocalAGI) (agent platform) and [LocalRecall](https://github.com/mudler/LocalRecall) (semantic memory) as built-in libraries — no separate installation needed.
LocalAI can be installed in several ways. **Docker is the recommended installation method** for most users as it provides the easiest setup and works across all platforms.
Then open **http://localhost:8080** to access the web interface, install models, and start chatting.
For GPU support, see the [Container images reference]({{% relref "getting-started/container-images" %}}) or the [Quickstart guide]({{% relref "getting-started/quickstart" %}}).
For complete installation instructions including Docker, macOS, Linux, Kubernetes, and building from source, see the [Installation guide](/installation/).