- Remove all lite/full variant logic from frontend, API, shared constants,
docs, and tests (single unified Docker image only)
- Replace single QEMU multi-arch Docker build with per-architecture native
builds (amd64 + arm64) and manifest merge to fix disk space exhaustion
- Add disk cleanup step and per-platform build cache scopes
- Switch release trigger from push to workflow_dispatch
- Add GitHub issue templates and PR template
Merge CPU, CUDA, and lite Docker images into a single unified image.
One tag (latest) works on all platforms: amd64 (NVIDIA CUDA) and arm64 (CPU).
GPU auto-detected at runtime. All ML models and packages baked in.
Key changes:
- Platform-conditional Dockerfile (nvidia/cuda on amd64, node on arm64)
- tini as PID 1 for proper signal handling
- Fix FILES_STORAGE_PATH data loss bug
- Fix RealESRGAN upscaler (was broken, always fell back to Lanczos)
- Fix PaddleOCR language codes and stdout corruption
- Simplify CI/CD (single build, single tag)
- Expand model pre-download with verification
- Add shutdown timeout and improve health endpoint
- Remove unused lama-cleaner
GPU-dependent libraries (paddlepaddle-gpu, torch CUDA, realesrgan)
cannot be imported at Docker build time because the CUDA driver is
only available at runtime. Smoke test now verifies CPU-only imports
(rembg, cv2, numpy, mediapipe, seam_carving) and checks that model
files exist on disk. GPU imports are verified at runtime.
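The build-time check described above can be sketched roughly as follows. The module list comes from this change; the script layout and the model path are assumptions:

```python
import importlib
import pathlib

# CPU-safe modules that must import at Docker build time. GPU libraries
# (paddlepaddle-gpu, torch CUDA, realesrgan) are deliberately excluded:
# they dlopen libcuda.so.1, which only exists at runtime.
CPU_MODULES = ["rembg", "cv2", "numpy", "mediapipe", "seam_carving"]

# Model files that must already be baked into the image (path assumed).
MODEL_FILES = ["/opt/models/realesrgan/RealESRGAN_x4plus.pth"]

def smoke_test(modules=CPU_MODULES, model_files=MODEL_FILES):
    """Return a list of failure messages; an empty list means the image
    is good. The caller exits non-zero so any failure fails the build."""
    failures = []
    for name in modules:
        try:
            importlib.import_module(name)
        except ImportError as exc:
            failures.append(f"import {name}: {exc}")
    for path in model_files:
        if not pathlib.Path(path).is_file():
            failures.append(f"missing model file: {path}")
    return failures
```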
paddlepaddle-gpu requires libcuda.so.1 at import time, but no CUDA
driver exists during Docker build. The download script now catches
this ImportError and skips PaddleOCR model pre-download on amd64.
Models will download on first use at runtime when the CUDA driver
is available via nvidia-container-toolkit.
On arm64 (CPU paddlepaddle), models are still pre-downloaded at build
time as before.
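The guard amounts to a try/except around the import; a sketch, with the helper name invented (the show_log=False argument matches the constructor change noted elsewhere in this log):

```python
import sys

def predownload_paddleocr(langs):
    """Pre-download PaddleOCR models at build time, skipping gracefully
    when the GPU build of paddlepaddle cannot be imported (no libcuda.so.1
    during docker build)."""
    try:
        from paddleocr import PaddleOCR  # raises ImportError on the amd64 GPU build
    except ImportError as exc:
        # Models will be fetched on first use at runtime, once
        # nvidia-container-toolkit has injected the real CUDA driver.
        print(f"skipping PaddleOCR pre-download: {exc}", file=sys.stderr)
        return False
    for lang in langs:
        PaddleOCR(lang=lang, show_log=False)  # constructor downloads the models
    return True
```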
Also reverted CI to amd64-only Docker build test for speed. Multi-arch
build is tested on release via the release workflow.
paddlepaddle-gpu needs libcuda.so.1 at import time, but the real NVIDIA
driver is only injected at runtime by the container toolkit. Install
cuda-compat-12-6 which provides forward-compat stubs that satisfy the
dlopen without a real GPU. Also force CPU mode via env vars in the
download script.
paddlepaddle-gpu tries to load libcuda.so.1 on import, but no GPU
driver exists during Docker build. Set PADDLE_DEVICE=cpu, FLAGS_use_cuda=0,
and CUDA_VISIBLE_DEVICES="" before any ML imports to force CPU mode.
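In the download script this is a few lines at the very top of the file, before any ML import:

```python
import os

# Must run before the first paddle/torch import; otherwise the GPU build
# of paddlepaddle tries to dlopen libcuda.so.1 and the Docker build dies.
os.environ["PADDLE_DEVICE"] = "cpu"
os.environ["FLAGS_use_cuda"] = "0"
os.environ["CUDA_VISIBLE_DEVICES"] = ""

# Only now is it safe to pull in the ML stack, e.g.:
# import paddle
```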
pip's -i flag replaces the entire package index, so paddleocr could not be
found. Use --extra-index-url to add PaddlePaddle's index alongside PyPI, and
install paddleocr separately so it resolves from PyPI.
paddlepaddle-gpu==3.0.0 is not on PyPI; it is hosted on PaddlePaddle's
own package index. Added an -i flag pointing to the cu126 stable index
for the amd64 GPU build.
- Warn on startup if the deprecated STIRLING_VARIANT env var is set
- Broaden upscale.py exception handling to catch RuntimeError/OSError
for Lanczos fallback (not just ImportError)
- Add QEMU + multi-arch (amd64+arm64) to CI Docker build test
- Use .get() instead of .all() for single-row health check query
- Restore container_name in docker-compose.yml for backwards compat
PaddleOCR prints download/init messages to stdout which corrupts the
JSON result that the bridge expects. Same risk with basicsr/realesrgan.
Applied the same fd-level stdout redirect pattern already used in
remove_bg.py: redirect fd 1 to stderr during ML work, restore for
the JSON result. Also added show_log=False to PaddleOCR constructor.
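The fd-level redirect pattern reads roughly like this. The helper name is invented; the mechanism, duplicating fd 1, pointing it at stderr during ML work, and restoring it for the JSON result, is what the bridge scripts do:

```python
import contextlib
import json
import os
import sys

@contextlib.contextmanager
def stdout_to_stderr():
    """Redirect file descriptor 1 to stderr so that native-library chatter
    (PaddleOCR downloads, basicsr init) cannot corrupt the JSON on stdout."""
    sys.stdout.flush()
    saved = os.dup(1)            # keep a handle on the real stdout
    try:
        os.dup2(2, 1)            # fd 1 now writes to stderr
        yield
    finally:
        sys.stdout.flush()       # flush buffered noise to stderr
        os.dup2(saved, 1)        # restore stdout for the JSON result
        os.close(saved)

with stdout_to_stderr():
    print("noisy model init")    # ends up on stderr
print(json.dumps({"ok": True}))  # the only thing the bridge reads
```

A Python-level swap of sys.stdout would not catch output from C/C++ code inside the libraries, which is why the redirect happens at the file-descriptor level.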
node --import tsx requires tsx to be directly in node_modules/, but
pnpm hoists it differently. npx tsx works because it resolves through
pnpm's bin links. tini as PID 1 handles signal forwarding regardless.
basicsr has a known torchvision.transforms.functional_tensor compat
issue on arm64 with newer torchvision. On arm64, upscale.py falls back
to Lanczos via ImportError anyway. Smoke test still verifies the model
weights file exists on all platforms.
PaddleOCR uses its own language codes (ch, japan, korean, latin) not
ISO codes (zh, ja, ko, de, fr, es). The download script and ocr.py
now map API language codes to PaddleOCR codes correctly. German,
French, and Spanish all use the "latin" script model.
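The mapping is a small table. The six code pairs below are named in this change; "en" is an assumption to round out the seven supported languages:

```python
# API language codes (ISO-style) -> PaddleOCR's own model codes.
# German, French, and Spanish share the "latin" script model.
API_TO_PADDLE = {
    "en": "en",      # assumed seventh supported language
    "zh": "ch",
    "ja": "japan",
    "ko": "korean",
    "de": "latin",
    "fr": "latin",
    "es": "latin",
}

def to_paddle_lang(code):
    """Translate an API language code to PaddleOCR's code, rejecting
    anything outside the supported set."""
    if code not in API_TO_PADDLE:
        raise ValueError(f"unsupported OCR language: {code}")
    return API_TO_PADDLE[code]
```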
Rewrite docker-tags.md for single image with GPU auto-detection.
Update deployment.md to remove variant table and lite/cuda references.
Replace LaMa Cleaner references with OpenCV in architecture and AI docs.
Add migration notes for users on :lite and :cuda tags.
Remove 3-variant matrix (full/lite/cuda). Single build produces
a multi-arch manifest (amd64 + arm64) pushed to Docker Hub and GHCR.
Tags: latest, X.Y.Z, X.Y, X. CI builds native platform only (amd64)
for speed. Multi-arch only on release.
- Add 8s shutdown timeout to prevent indefinite hang when app.close()
stalls. Stays under Docker's default 10s stop_grace_period.
- Health endpoint now checks database connectivity, returns 503 when
DB is unreachable so Docker marks container unhealthy.
- Removed variant field from health response (single image now).
GPU is activated at runtime via --gpus all, not a separate compose file.
Added log rotation (10MB x 3 files) to prevent disk fill on long-running
instances. Removed docker-compose.gpu.yml.
- Remove VARIANT/GPU build args, single image for all platforms
- amd64: nvidia/cuda base with GPU Python packages
- arm64: node base with CPU Python packages
- Add tini as PID 1 for proper signal handling
- Replace npx tsx with node --import tsx
- Split pip install into base + tool layers for better caching
- Add NVIDIA_VISIBLE_DEVICES env vars for container toolkit
- Suppress Python ML library log noise
- Increase healthcheck start-period to 60s
- Remove STIRLING_VARIANT env var
- Remove lama-cleaner from pip installs
Downloads all six rembg models, the RealESRGAN_x4plus.pth weights, and
PaddleOCR models for all 7 supported languages, and verifies that MediaPipe
bundles its face detection models. Runs a final smoke test importing
every ML library. Any failure exits non-zero, failing the Docker build.
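The fail-fast shape of the download side might look like this; the URL handling and size threshold are placeholders, not the script's actual values:

```python
import pathlib
import sys
import urllib.request

def verify_weights(path, min_bytes=1_000_000):
    """A downloaded weights file must exist and be plausibly sized; a
    truncated download should fail the Docker build, not ship silently."""
    p = pathlib.Path(path)
    return p.is_file() and p.stat().st_size >= min_bytes

def fetch_or_die(url, dest):
    """Download one model file, exiting non-zero on any failure so the
    Docker build itself fails."""
    dest = pathlib.Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    try:
        urllib.request.urlretrieve(url, dest)
    except OSError as exc:
        print(f"download failed: {url}: {exc}", file=sys.stderr)
        sys.exit(1)
    if not verify_weights(dest):
        print(f"truncated download: {dest}", file=sys.stderr)
        sys.exit(1)
```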
lama-cleaner is pip-installed but never imported by any Python script;
inpaint.py uses OpenCV's TELEA inpainting instead. Removing it saves over
100 MB of image size. Also added seam-carving to requirements-gpu.txt,
where it was missing.
model_path was None, so the model had random weights and always fell back
to Lanczos. Now loads RealESRGAN_x4plus.pth from /opt/models/realesrgan/
(configurable via REALESRGAN_MODEL_PATH env var). Only falls back to
Lanczos on ImportError, not blanket Exception.
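A sketch of the corrected loading path. The fallback parameter is illustrative, and the RRDBNet arguments shown are the library's standard x4plus configuration:

```python
import os

# Baked-in default, overridable for custom deployments.
MODEL_PATH = os.environ.get(
    "REALESRGAN_MODEL_PATH", "/opt/models/realesrgan/RealESRGAN_x4plus.pth"
)

def upscale(image, scale=4, fallback=None):
    """Upscale with RealESRGAN. Fall back (e.g. to Lanczos) only when the
    GPU stack is not importable, instead of swallowing every exception."""
    try:
        from basicsr.archs.rrdbnet_arch import RRDBNet
        from realesrgan import RealESRGANer
    except ImportError:
        if fallback is None:
            raise
        return fallback(image, scale)
    model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                    num_block=23, num_grow_ch=32, scale=scale)
    # Passing model_path is the actual fix: with model_path=None the
    # network kept its random init weights.
    upsampler = RealESRGANer(scale=scale, model_path=MODEL_PATH, model=model)
    output, _ = upsampler.enhance(image)
    return output
```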
User-uploaded files were stored in /app/data/files (the container's writable
layer) instead of /data/files (the persistent volume) because FILES_STORAGE_PATH
was not set in the Dockerfile. Files were lost on container recreation.
Replace static llms.txt and llms-full.txt with auto-generated versions
that stay in sync with docs on every build. The plugin also generates
per-page .md files for individual page fetching by LLMs.
New tool for joining images horizontally or vertically,
distinct from the grid-based collage tool. Preserves aspect
ratios with fit/original resize modes, optional gap, and
multi-format output.