mirror of https://github.com/mudler/LocalAI synced 2026-05-24 09:28:23 +00:00

Ettore Di Giacinto fe7b27eb66 test(ci): trigger faster-whisper rebuild to observe per-arch+merge	2026-05-08 22:09:46 +00:00

The PR that introduced the per-arch + manifest-merge pilot (#9727) only touched CI infrastructure files, so the path filter correctly skipped backend builds on its merge commit. To observe the new backend-merge-jobs flow assemble a real manifest list, this commit touches faster-whisper's Makefile so its two new per-arch entries schedule and the merge job runs. The trailing comment is the smallest possible diff and is harmless to the build.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Ettore Di Giacinto <mudler@localai.io>

ace-step	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
chatterbox	fix(chatterbox): install chatterbox-tts with --no-deps and pin runtime deps	2026-05-07 09:03:40 +00:00
common	fix(python-backend): make JIT subprocesses work on hosts of any size (#9679 )	2026-05-06 00:28:01 +02:00
coqui	chore(deps): bump packaging from 24.1 to 26.2 in /backend/python/coqui (#9594 )	2026-04-28 08:44:53 +02:00
diffusers	fix(diffusers): drop compel from requirements to unblock pip resolver (#9632 )	2026-05-01 14:45:14 +02:00
faster-qwen3-tts	feat: add distributed mode (#9124 )	2026-03-30 00:47:27 +02:00
faster-whisper	test(ci): trigger faster-whisper rebuild to observe per-arch+merge	2026-05-08 22:09:46 +00:00
fish-speech	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
insightface	feat: add biometrics UI (#9524 )	2026-04-24 08:50:34 +02:00
kitten-tts	feat: add distributed mode (#9124 )	2026-03-30 00:47:27 +02:00
kokoro	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
llama-cpp-quantization	feat: add distributed mode (#9124 )	2026-03-30 00:47:27 +02:00
mlx	feat: refactor shared helpers and enhance MLX backend functionality (#9335 )	2026-04-13 18:44:03 +02:00
mlx-audio	feat: add distributed mode (#9124 )	2026-03-30 00:47:27 +02:00
mlx-distributed	feat: refactor shared helpers and enhance MLX backend functionality (#9335 )	2026-04-13 18:44:03 +02:00
mlx-vlm	fix(mlx-vlm): pin upstream to v0.4.4 to unblock CUDA builds (#9568 )	2026-04-25 22:06:01 +02:00
moonshine	feat: add distributed mode (#9124 )	2026-03-30 00:47:27 +02:00
nemo	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
neutts	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
outetts	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
pocket-tts	feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629 )	2026-05-01 10:56:24 +02:00
qwen-asr	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
qwen-tts	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
rerankers	fix(ci): unbreak rerankers (torch bump) and vllm-omni on aarch64 (#9688 )	2026-05-06 17:07:24 +02:00
rfdetr	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
sglang	feat(sglang): wire engine_args, add cuda13 build, ship MTP gallery demos (#9686 )	2026-05-07 17:27:29 +02:00
speaker-recognition	feat: add biometrics UI (#9524 )	2026-04-24 08:50:34 +02:00
tinygrad	feat(backends/python): use tempfile.gettempdir() instead of hardcoded /tmp (#9629 )	2026-05-01 10:56:24 +02:00
transformers	chore(deps): bump sentence-transformers from 5.2.3 to 5.4.0 in /backend/python/transformers (#9342 )	2026-04-14 00:30:27 +02:00
trl	feat: add distributed mode (#9124 )	2026-03-30 00:47:27 +02:00
vibevoice	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
vllm	fix(python-backend): make JIT subprocesses work on hosts of any size (#9679 )	2026-05-06 00:28:01 +02:00
vllm-omni	fix(ci): unbreak rerankers (torch bump) and vllm-omni on aarch64 (#9688 )	2026-05-06 17:07:24 +02:00
voxcpm	feat(rocm): bump to 7.x (#9323 )	2026-04-12 08:51:30 +02:00
whisperx	chore(whisperx): drop ROCm/hipblas build target (#9474 )	2026-04-21 21:50:18 +02:00
README.md	chore: drop bark which is unmaintained (#8207 )	2026-01-25 09:26:40 +01:00

README.md

Python Backends for LocalAI

This directory contains Python-based AI backends for LocalAI, providing support for various AI models and hardware acceleration targets.

Overview

The Python backends use a unified build system based on libbackend.sh that provides:

Automatic virtual environment management with support for both uv and pip
Hardware-specific dependency installation (CPU, CUDA, Intel, MLX, etc.)
Portable Python support for standalone deployments
Consistent backend execution across different environments

Available Backends

Core AI Models

transformers - Hugging Face Transformers framework (PyTorch-based)
vllm - High-performance LLM inference engine
mlx - Apple Silicon optimized ML framework

Audio & Speech

coqui - Coqui TTS models
faster-whisper - Fast Whisper speech recognition
kitten-tts - Lightweight TTS
mlx-audio - Apple Silicon audio processing
chatterbox - TTS model
kokoro - TTS models

Computer Vision

diffusers - Stable Diffusion and image generation
mlx-vlm - Vision-language models for Apple Silicon
rfdetr - Object detection models

Specialized

rerankers - Text reranking models

Quick Start

Prerequisites

Python 3.10+ (default: 3.10.18)
uv package manager (recommended) or pip
Appropriate hardware drivers for your target (CUDA, Intel, etc.)

Installation

Each backend can be installed individually:

# Navigate to a specific backend
cd backend/python/transformers

# Install dependencies
make transformers
# or
bash install.sh

# Run the backend
make run
# or
bash run.sh

Using the Unified Build System

The libbackend.sh script provides consistent commands across all backends:

# Source the library in your backend script
source "$(dirname "$0")/../common/libbackend.sh"

# Install requirements (automatically handles hardware detection)
installRequirements

# Start the backend server
startBackend "$@"

# Run tests
runUnittests

Hardware Targets

The build system detects the target hardware and configures the build for it. Supported targets:

CPU - Standard CPU-only builds
CUDA - NVIDIA GPU acceleration (supports CUDA 12/13)
Intel - Intel XPU/GPU optimization
MLX - Apple Silicon (M1/M2/M3) optimization
HIP - AMD GPU acceleration

Target-Specific Requirements

Backends can specify hardware-specific dependencies:

requirements.txt - Base requirements
requirements-cpu.txt - CPU-specific packages
requirements-cublas12.txt - CUDA 12 packages
requirements-cublas13.txt - CUDA 13 packages
requirements-intel.txt - Intel-optimized packages
requirements-mps.txt - Apple Silicon packages
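
As a rough illustration of how the file naming above maps a target to its dependencies, the sketch below lists the requirements files that exist for one target, base file first. The function name `list_requirements_files` is hypothetical and is not part of libbackend.sh; the real resolution logic may consider more files or a different order.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of libbackend.sh): print the requirements
# files that exist for a given target, base file first.
list_requirements_files() {
    local backend_dir="$1"   # directory holding the requirements files
    local target="$2"        # e.g. cpu, cublas12, cublas13, intel, mps

    local candidate
    for candidate in \
        "${backend_dir}/requirements.txt" \
        "${backend_dir}/requirements-${target}.txt"; do
        # Skip files the backend does not ship for this target.
        if [ -f "$candidate" ]; then
            echo "$candidate"
        fi
    done
}
```

Each printed path could then be passed to an installer, for example `pip install -r <file>`.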

Configuration Options

Environment Variables

PYTHON_VERSION - Python version (default: 3.10)
PYTHON_PATCH - Python patch version (default: 18)
BUILD_TYPE - Force specific build target
USE_PIP - Use pip instead of uv (default: false)
PORTABLE_PYTHON - Enable portable Python builds
LIMIT_TARGETS - Restrict backend to specific targets
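
The documented defaults can be expressed with shell parameter expansion. This is a minimal sketch, assuming the variables are read from the environment; the function name `resolve_python_settings` and the printed format are hypothetical, and libbackend.sh may combine these values differently.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: apply the documented defaults only when the caller
# has not set a value, then report the resulting settings.
resolve_python_settings() {
    local version="${PYTHON_VERSION:-3.10}"   # documented default: 3.10
    local patch="${PYTHON_PATCH:-18}"         # documented default: 18
    local use_pip="${USE_PIP:-false}"         # documented default: false

    # Pick the installer command based on USE_PIP.
    local installer="uv pip"
    if [ "$use_pip" = "true" ]; then
        installer="pip"
    fi

    echo "python=${version}.${patch} installer=${installer}"
}
```

With no overrides this prints `python=3.10.18 installer=uv pip`; calling it as `USE_PIP=true resolve_python_settings` switches the installer to `pip`.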

Example: CUDA 12 Only Backend

# In your backend script
LIMIT_TARGETS="cublas12"
source "$(dirname "$0")/../common/libbackend.sh"

Example: Intel-Optimized Backend

# In your backend script
LIMIT_TARGETS="intel"
source "$(dirname "$0")/../common/libbackend.sh"
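
To make the effect of LIMIT_TARGETS concrete, here is a sketch of a membership check. It assumes the value is a space-separated list of target names and that an empty value means no restriction; both are assumptions, and `target_allowed` is a hypothetical name rather than a libbackend.sh function.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed when `target` appears in the space-separated
# `limit_targets` list, or when the list is empty (no restriction).
target_allowed() {
    local target="$1"
    local limit_targets="${2:-}"

    # An empty list means the backend is not restricted to any target.
    [ -z "$limit_targets" ] && return 0

    local allowed
    for allowed in $limit_targets; do    # intentional word splitting
        [ "$allowed" = "$target" ] && return 0
    done
    return 1
}
```

For example, `target_allowed cublas12 "$LIMIT_TARGETS"` succeeds for the CUDA 12 only backend above, while `target_allowed cpu "cublas12"` fails.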

Development

Adding a New Backend

Create a new directory in backend/python/
Copy the template structure from common/template/
Implement your backend.py with the required gRPC interface
Add appropriate requirements files for your target hardware
Use libbackend.sh for consistent build and execution
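
The first two steps can be sketched as a small copy helper. The function name `scaffold_backend` is hypothetical, and the paths in the comments are examples based on the directories named above, not a verified layout.

```shell
#!/usr/bin/env bash
# Hypothetical helper: copy a template directory to a new backend directory,
# refusing to overwrite an existing one.
scaffold_backend() {
    local template_dir="$1"   # e.g. backend/python/common/template
    local backend_dir="$2"    # e.g. backend/python/my-backend

    if [ ! -d "$template_dir" ]; then
        echo "template not found: $template_dir" >&2
        return 1
    fi
    if [ -e "$backend_dir" ]; then
        echo "refusing to overwrite: $backend_dir" >&2
        return 1
    fi

    # Copy the template's contents (including dotfiles) into the new directory.
    mkdir -p "$backend_dir"
    cp -R "$template_dir/." "$backend_dir/"
}
```

Refusing to overwrite keeps an accidental rerun from clobbering a backend that has already been edited.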

Testing

# Run backend tests
make test
# or
bash test.sh

Building

# Install dependencies
make <backend-name>

# Clean build artifacts
make clean

Architecture

Each backend follows a consistent structure:

backend-name/
├── backend.py          # Main backend implementation
├── requirements.txt    # Base dependencies
├── requirements-*.txt  # Hardware-specific dependencies
├── install.sh          # Installation script
├── run.sh              # Execution script
├── test.sh             # Test script
├── Makefile            # Build targets
└── test.py             # Unit tests
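
A quick way to compare a backend directory against this layout is sketched below. `check_backend_layout` is a hypothetical helper, and it treats every file in the tree as required, which is an assumption; the wildcard `requirements-*.txt` entry is not checked because it depends on the supported targets.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each file from the layout above that is missing
# in a backend directory; return non-zero if any file is missing.
check_backend_layout() {
    local backend_dir="$1"
    local missing=0
    local name

    for name in backend.py requirements.txt install.sh run.sh test.sh Makefile test.py; do
        if [ ! -f "${backend_dir}/${name}" ]; then
            echo "$name"
            missing=1
        fi
    done
    return "$missing"
}
```

Running `check_backend_layout backend/python/my-backend` prints nothing and succeeds when all seven files are present.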

Troubleshooting

Common Issues

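Missing dependencies: Check that the backend ships a requirements file for your target (for example, requirements-cublas12.txt for CUDA 12) in addition to requirements.txt
Hardware detection: If the wrong target is selected, set BUILD_TYPE explicitly to force the intended one
Python version: Verify that Python 3.10 or later is available
Virtual environment: Use ensureVenv to create or activate the backend's environment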
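
The Python version check can be scripted. The sketch below compares a version string against the 3.10 minimum stated in this README; `python_version_ok` is a hypothetical helper and is not provided by libbackend.sh.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed when a "major.minor[.patch]" version string
# is at least 3.10.
python_version_ok() {
    local version="$1"
    local major minor _rest
    IFS=. read -r major minor _rest <<< "$version"

    # Reject missing or non-numeric components.
    case "$major" in ''|*[!0-9]*) return 1 ;; esac
    case "$minor" in ''|*[!0-9]*) return 1 ;; esac

    [ "$major" -gt 3 ] || { [ "$major" -eq 3 ] && [ "$minor" -ge 10 ]; }
}
```

One way to feed it the interpreter's version is `python_version_ok "$(python3 -c 'import platform; print(platform.python_version())')"`.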

Contributing

When adding new backends or modifying existing ones:

Follow the established directory structure
Use libbackend.sh for consistent behavior
Include appropriate requirements files for all target hardware
Add comprehensive tests
Update this README if adding new backend types