+++
disableToc = false
title = "Model compatibility table"
weight = 24
url = "/model-compatibility/"
+++

Besides llama-based models, LocalAI is also compatible with other architectures. The tables below list all the backends, the compatible model families, and the associated repositories.

{{% notice note %}}

LocalAI will attempt to automatically load models which are not explicitly configured for a specific backend. You can specify the backend to use by configuring a model with a YAML file. See [the advanced section]({{%relref "advanced" %}}) for more details.

 {{% /notice %}}
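
As a minimal sketch of such a configuration, the YAML file below pins a model to one backend instead of relying on automatic loading. The model name and file name are hypothetical placeholders, and the layout assumes the commonly used `name`, `backend`, and `parameters.model` fields; check the advanced section for the exact options your version supports.

```yaml
# Hypothetical model definition: "my-model" and the file name are placeholders.
name: my-model
# Backend to use for this model; assumed to match a backend listed in the tables below.
backend: transformers
parameters:
  # Model file or identifier to load (assumed field layout).
  model: my-model-file
```

With a file like this in place, requests that reference `my-model` would be served by the configured backend rather than by whichever backend the automatic loader picks.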

## Text Generation & Language Models

| Backend and Bindings                                                             | Compatible models     | Completion/Chat endpoint | Capability | Embeddings support                | Token stream support | Acceleration |
|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------|
| [llama.cpp]({{%relref "features/text-generation#llama.cpp" %}})        | Llama, Mamba, RWKV, Falcon, Starcoder, GPT-2, [and many others](https://github.com/ggerganov/llama.cpp?tab=readme-ov-file#description) | yes                      | GPT and Functions                        | yes | yes                  | CUDA 11/12, ROCm, Intel SYCL, Vulkan, Metal, CPU |
| [vLLM](https://github.com/vllm-project/vllm)        | Various GPTs and quantization formats | yes                      | GPT             | no | no                  | CUDA 12, ROCm, Intel |
| [transformers](https://github.com/huggingface/transformers) | Various GPTs and quantization formats  | yes                      | GPT, embeddings, Audio generation            | yes | yes*                  | CUDA 11/12, ROCm, Intel, CPU |
| [exllama2](https://github.com/turboderp-org/exllamav2)  | GPTQ                   | yes                       | GPT only                  | no                               | no                   | CUDA 12 |
| [MLX](https://github.com/ml-explore/mlx-lm)        | Various LLMs               | yes                       | GPT                        | no                                | no                   | Metal (Apple Silicon) |
| [MLX-VLM](https://github.com/Blaizzy/mlx-vlm)        | Vision-Language Models               | yes                       | Multimodal GPT                        | no                                | no                   | Metal (Apple Silicon) |
| [langchain-huggingface](https://github.com/tmc/langchaingo)                                                                    | Any text generators available on HuggingFace through API | yes                      | GPT                        | no                                | no                   | N/A |

## Audio & Speech Processing

| Backend and Bindings                                                             | Compatible models     | Completion/Chat endpoint | Capability | Embeddings support                | Token stream support | Acceleration |
|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------|
| [whisper.cpp](https://github.com/ggml-org/whisper.cpp)         | whisper               | no                       | Audio transcription                 | no                                | no                   | CUDA 12, ROCm, Intel SYCL, Vulkan, CPU |
| [faster-whisper](https://github.com/SYSTRAN/faster-whisper)         | whisper               | no                       | Audio transcription                 | no                                | no                   | CUDA 12, ROCm, Intel, CPU |
| [piper](https://github.com/rhasspy/piper) ([binding](https://github.com/mudler/go-piper))                                                                     | Any piper onnx model | no                      | Text to voice                        | no                                | no                   | CPU |
| [bark](https://github.com/suno-ai/bark)  | bark                   | no                       | Audio generation                  | no                               | no                   | CUDA 12, ROCm, Intel |
| [bark-cpp](https://github.com/PABannier/bark.cpp)        | bark               | no                       | Audio-Only                 | no                                | no                   | CUDA, Metal, CPU |
| [coqui](https://github.com/idiap/coqui-ai-TTS) | Coqui TTS    | no                       | Audio generation and Voice cloning    | no                               | no                   | CUDA 12, ROCm, Intel, CPU |
| [kokoro](https://github.com/hexgrad/kokoro) | Kokoro TTS    | no                       | Text-to-speech    | no                               | no                   | CUDA 12, ROCm, Intel, CPU |
| [chatterbox](https://github.com/resemble-ai/chatterbox) | Chatterbox TTS    | no                       | Text-to-speech    | no                               | no                   | CUDA 11/12, CPU |
| [kitten-tts](https://github.com/KittenML/KittenTTS) | Kitten TTS    | no                       | Text-to-speech    | no                               | no                   | CPU |
| [silero-vad](https://github.com/snakers4/silero-vad) with [Golang bindings](https://github.com/streamer45/silero-vad-go) | Silero VAD    | no                       | Voice Activity Detection    | no                               | no                   | CPU |
| [neutts](https://github.com/neuphonic/neuttsair) | NeuTTSAir    | no                       | Text-to-speech with voice cloning    | no                               | no                   | CUDA 12, ROCm, CPU |
| [mlx-audio](https://github.com/Blaizzy/mlx-audio) | MLX | no                       | Text-to-speech    | no                               | no                   | Metal (Apple Silicon) |

## Image & Video Generation

| Backend and Bindings                                                             | Compatible models     | Completion/Chat endpoint | Capability | Embeddings support                | Token stream support | Acceleration |
|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------|
| [stablediffusion.cpp](https://github.com/leejet/stable-diffusion.cpp)         | stablediffusion-1, stablediffusion-2, stablediffusion-3, flux, PhotoMaker               | no                       | Image                 | no                                | no                   | CUDA 12, Intel SYCL, Vulkan, CPU |
| [diffusers](https://github.com/huggingface/diffusers)  | SD and various other diffusion models                   | no                       | Image/Video generation    | no                               | no                   | CUDA 11/12, ROCm, Intel, Metal, CPU |
| [transformers-musicgen](https://github.com/huggingface/transformers)  | MusicGen                    | no                       | Audio generation                | no                               | no                   | CUDA, CPU |

## Specialized AI Tasks

| Backend and Bindings                                                             | Compatible models     | Completion/Chat endpoint | Capability | Embeddings support                | Token stream support | Acceleration |
|----------------------------------------------------------------------------------|-----------------------|--------------------------|---------------------------|-----------------------------------|----------------------|--------------|
| [rfdetr](https://github.com/roboflow/rf-detr) | RF-DETR    | no                       | Object Detection    | no                               | no                   | CUDA 12, Intel, CPU |
| [rerankers](https://github.com/AnswerDotAI/rerankers) | Reranking API    | no                       | Reranking   | no                               | no                   | CUDA 11/12, ROCm, Intel, CPU |
| [local-store](https://github.com/mudler/LocalAI) | Vector database    | no                       | Vector storage   | yes                               | no                   | CPU |
| [huggingface](https://huggingface.co/docs/hub/en/api) | HuggingFace API models    | yes                       | Various AI tasks   | yes                               | yes                   | API-based |

## Acceleration Support Summary

### GPU Acceleration
- **NVIDIA CUDA**: CUDA 11.7 and CUDA 12.0 are supported across most backends
- **AMD ROCm**: HIP-based acceleration for AMD GPUs
- **Intel oneAPI**: SYCL-based acceleration for Intel GPUs (F16/F32 precision)
- **Vulkan**: Cross-platform GPU acceleration
- **Metal**: Apple Silicon GPU acceleration (M1/M2/M3+)

### Specialized Hardware
- **NVIDIA Jetson (L4T)**: ARM64 support for embedded AI
- **Apple Silicon**: Native Metal acceleration for Mac M1/M2/M3+
- **Darwin x86**: Intel Mac support

### CPU Optimization
- **AVX/AVX2/AVX512**: Advanced vector extensions for x86
- **Quantization**: 4-bit, 5-bit, 8-bit integer quantization support
- **Mixed Precision**: F16/F32 mixed precision support

Note: any backend name listed above can be used in the `backend` field of the model configuration file (see [the advanced section]({{%relref "advanced" %}})).

- \* Token streaming is available only with CUDA and OpenVINO CPU/XPU acceleration.