Commit graph

13 commits

Author SHA1 Message Date
Ashim
08a7ffe403 Enhance logging and error handling across tools; add full tool audit and Playwright tests
- Added model mismatch warnings in colorize, enhance-faces, and upscale routes.
- Improved error handling in colorize, enhance_faces, remove_bg, restore, and upscale scripts with detailed logging.
- Updated Dockerfile to align NCCL versions for compatibility.
- Introduced a new full tool audit script to test all tools for functionality and GPU usage.
- Created Playwright E2E tests for GPU-dependent tools to ensure proper functionality and performance.
2026-04-17 23:06:31 +08:00
stirling-image
8d8ab4bc45 fix(docker): close remaining airgap gaps for fully offline operation (#70)
Three fixes to ensure zero network access after docker pull:

1. rembg model allowlist: validate the model parameter against the 7
   pre-downloaded models, so that a raw API call cannot make rembg
   attempt to download an unknown model.

2. GFPGAN/CodeFormer auxiliary models: pre-download facexlib's
   detection_Resnet50_Final.pth and parsing_parsenet.pth at build time.
   These were previously downloaded on first use via basicsr. Symlinks
   in /app/gfpgan/weights/ ensure codeformer-pip also finds them.

3. OpenCV colorize models: pre-download the prototxt, caffemodel, and
   pts_in_hull.npy so the lightweight OpenCV colorizer fallback works
   in addition to the primary DDColor method.

Co-authored-by: stirling-image <stirling-image@users.noreply.github.com>
2026-04-14 16:51:46 +08:00
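The allowlist check described in item 1 of 8d8ab4bc45 could look roughly like the sketch below. The commit does not name the seven pre-downloaded models, so the set here is a placeholder built from model names that appear elsewhere in this log, and `ALLOWED_MODELS` and `validate_model` are hypothetical names, not the project's own.

```python
# Placeholder allowlist: the commit does not enumerate the 7 pre-downloaded
# models, so these entries are illustrative only.
ALLOWED_MODELS = frozenset({"u2net", "birefnet-general", "birefnet-general-lite"})


def validate_model(name: str) -> str:
    """Return the model name if it is allowlisted, otherwise raise ValueError."""
    if name not in ALLOWED_MODELS:
        allowed = ", ".join(sorted(ALLOWED_MODELS))
        raise ValueError(f"Unknown model {name!r}; expected one of: {allowed}")
    return name
```

Rejecting the name before it reaches rembg means an unknown value fails fast instead of triggering a download attempt inside an offline container.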
Siddharth Kumar Sah
3345cb266a feat: add Ultra quality mode with BiRefNet-matting, rename quality tiers
Ultra quality (People only):
- BiRefNet-matting ONNX (928MB) for true alpha matting with per-pixel
  transparency on hair wisps and fine edges
- Custom rembg session class, zero new Python dependencies
- Model pre-downloaded in Docker build for immediate availability

Quality tier labels: Fast / HD / Max / Ultra (shorter, fits 4-col grid)
2026-04-12 18:23:09 +08:00
Siddharth Kumar Sah
6c58f12262 feat: overhaul remove-background with effects pipeline, consolidate color tools
Remove Background:
- Two-phase flow: AI removes bg once, then effects adjust instantly
- Blur background effect with real-time CSS preview (portrait mode)
- Drop shadow effect with opacity control
- Gradient backgrounds with presets, custom colors, and angle
- Custom background image upload (including HEIC/HEIF)
- Solid color backgrounds moved from Python to Node.js/Sharp
- Effects-only API endpoint for instant re-renders without AI re-run
- HEIC/HEIF input support (decoded before passing to Python/rembg)
- Passport/ID photo checkbox defaults ON for People subject
- Before/after slider preserved when no effects active
- 15 comprehensive Playwright e2e tests

Color Tools:
- Consolidated 4 tools (brightness-contrast, saturation, color-channels,
  color-effects) into single "Adjust Colors" tool
- Added exposure, temperature, tint, hue, sharpness controls
- SVG filter-based live preview for all adjustments
- Backward-compatible URL redirects from old tool paths

Other fixes:
- Favicon tool: download button instead of auto-download
- Batch processing: HEIC filename extension fix
- File store: processedFilename field for proper batch downloads
2026-04-12 17:53:16 +08:00
stirling-image
2eb77fe0f2 fix: improve AI tool reliability for face detection and background removal (#25)
- Replace OpenCV Haar Cascades with MediaPipe for face detection, using
  short-range model first with full-range fallback for better accuracy
- Add auto-orient to remove-background route for EXIF-rotated photos
- Change default background removal model from u2net to birefnet-general-lite
- Fix flaky test by setting SQLite busy_timeout before journal_mode pragma

Co-authored-by: Siddharth Kumar Sah <siddharth123sk@gmail.com>
2026-04-06 22:00:48 +08:00
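The pragma ordering behind the flaky-test fix in 2eb77fe0f2 can be illustrated with Python's standard `sqlite3` module. This is only a sketch of the ordering: the project's own database code is not shown in this log, and `open_db` and its default timeout are assumptions. The likely reasoning is that changing the journal mode may need a lock, so setting the busy timeout first lets that statement wait rather than fail at once.

```python
import sqlite3


def open_db(path: str, busy_timeout_ms: int = 5000) -> sqlite3.Connection:
    """Open a database, setting busy_timeout before switching journal mode."""
    conn = sqlite3.connect(path)
    # Set the busy timeout first so the journal_mode change below waits for a
    # competing lock instead of failing immediately with "database is locked".
    conn.execute(f"PRAGMA busy_timeout = {int(busy_timeout_ms)}")
    conn.execute("PRAGMA journal_mode = WAL")
    return conn
```

Python's `sqlite3.connect` also applies its own `timeout` argument, so the explicit pragma here mainly serves to mirror the ordering the commit describes.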
Siddharth Kumar Sah
29a382e9e0 feat: add GPU/CUDA acceleration support (:cuda Docker tag)
Add a :cuda Docker image tag that auto-detects NVIDIA GPU at runtime
and falls back gracefully to CPU. Same pattern as Immich.

- New gpu.py shared utility for cached CUDA detection
- Background removal (rembg): pass CUDAExecutionProvider to ONNX Runtime
- Upscaling (Real-ESRGAN): use CUDA device + FP16 when GPU available
- OCR (PaddleOCR): enable use_gpu when CUDA detected
- Dispatcher reports GPU status at startup via readiness signal
- Admin health endpoint exposes GPU availability
- Dockerfile uses ARG GPU=false with conditional NVIDIA CUDA base image
- docker-compose.gpu.yml override for GPU users
- CI/CD workflows build and publish :cuda tag (amd64 only)

Three tags: :latest (CPU), :lite (no AI), :cuda (GPU with CPU fallback)
2026-04-05 19:12:45 +08:00
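A cached CUDA check with CPU fallback, as 29a382e9e0 describes for gpu.py, might be sketched as follows. The contents of the real gpu.py are not shown in this log; `gpu_available` and `onnx_providers` are hypothetical names, and probing through ONNX Runtime's provider list is one possible detection method, assumed here for illustration.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def gpu_available() -> bool:
    """Detect CUDA once and cache the answer for later callers."""
    try:
        import onnxruntime as ort
    except (ImportError, OSError):
        # No usable ONNX Runtime install: treat as CPU-only.
        return False
    return "CUDAExecutionProvider" in ort.get_available_providers()


def onnx_providers() -> list:
    """Provider list for an ONNX session, always ending with the CPU fallback."""
    if gpu_available():
        return ["CUDAExecutionProvider", "CPUExecutionProvider"]
    return ["CPUExecutionProvider"]
```

Keeping `CPUExecutionProvider` last in the list is what gives the graceful fallback: ONNX Runtime tries providers in order.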
Siddharth Kumar Sah
723842988b feat(ai): add emit_progress() calls to all Python AI scripts
2026-03-23 01:38:19 +08:00
Siddharth Kumar Sah
4807bd2726 feat: add semantic-release for automated versioning and help dialog
- Set up semantic-release with zero-touch CI pipeline on push to main
- Add version sync script to keep all package.json files and APP_VERSION
  constant in sync automatically
- Consolidate Docker publishing into single tag-triggered workflow that
  pushes to both Docker Hub and ghcr.io with semver tags
- Add help dialog with keyboard shortcuts, getting started guide, and
  resource links
- Sync all versions to 0.2.1 to match Docker Hub latest
2026-03-22 21:25:14 +08:00
Siddharth Kumar Sah
fb1d366d5f fix: use U2-Net as default model (fast, 2s) with BiRefNet as opt-in
BiRefNet-Lite times out on first load (~60s+ for 973MB model).
U2-Net works in 2 seconds. Users can still select BiRefNet for
higher quality when they're willing to wait. Added timing hints
in model descriptions and increased timeout for BiRefNet models.
2026-03-22 21:08:37 +08:00
Siddharth Kumar Sah
2bdc367767 fix: use BiRefNet-Lite as default model and fix JSON parsing
- Switch default from birefnet-general (973MB, 4min) to
  birefnet-general-lite (faster, still SOTA quality)
- Fix Python script stdout pollution: progress messages now go to
  stderr so the JSON result parser doesn't break
- Pre-download birefnet-general-lite in Docker build
2026-03-22 20:07:39 +08:00
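The stdout/stderr split from 2bdc367767 can be sketched like this. The name `emit_progress` appears in a later commit in this log, but its signature and the JSON field names used here are assumptions, and `emit_result` is a hypothetical helper.

```python
import json
import sys


def emit_progress(percent: int, stage: str) -> None:
    """Write a progress line to stderr so stdout stays clean."""
    print(json.dumps({"progress": percent, "stage": stage}), file=sys.stderr, flush=True)


def emit_result(result: dict) -> None:
    """Write the single JSON result to stdout for the caller to parse."""
    print(json.dumps(result), flush=True)
```

With this split, the calling process can parse stdout as one JSON document and read stderr separately for progress updates.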
Siddharth Kumar Sah
77ee8469d8 feat: upgrade to BiRefNet SOTA background removal model
- Switch default model from U2-Net to BiRefNet (state-of-the-art)
- Add 6 model options: BiRefNet, BiRefNet Lite, BiRefNet Portrait,
  BRIA RMBG, IS-Net, U2-Net
- Add animated progress bar with stage indicators (loading model,
  analyzing, removing, refining edges) and elapsed timer
- Add intuitive background color presets (Transparent, White, Black,
  Red, Green, Blue) as clickable buttons + custom color picker
- Handle background color compositing in Python (PIL alpha composite)
- Add checkerboard pattern to before/after slider for transparency
- Pre-bake BiRefNet model (973MB) in Docker image for instant use
2026-03-22 19:51:35 +08:00
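The background color compositing mentioned in 77ee8469d8 comes down to the standard "over" blend. The commit uses PIL's alpha composite; the sketch below swaps in a plain-Python, single-pixel version of the same arithmetic so it runs without Pillow. `composite_over` is a hypothetical name.

```python
def composite_over(fg_rgba, bg_rgb):
    """Blend one RGBA pixel over an opaque RGB background (8-bit channels)."""
    r, g, b, a = fg_rgba
    # out = fg * alpha + bg * (1 - alpha), done in integer math with rounding
    return tuple(
        (c * a + bg * (255 - a) + 127) // 255
        for c, bg in zip((r, g, b), bg_rgb)
    )


# Half-transparent red over white:
print(composite_over((255, 0, 0, 128), (255, 255, 255)))  # -> (255, 127, 127)
```

Fully transparent pixels take the background color and fully opaque pixels keep their own, which is what the color presets (White, Black, and so on) rely on.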
Siddharth Kumar Sah
ce03aad10f feat: production Docker, Playwright tests, settings API, and bug fixes
- Add user management endpoints (register, list, delete, change password)
- Add API key management (create, list, delete)
- Add settings persistence endpoints (get, put)
- Wire settings dialog to real backend (People, API Keys, System, Security)
- Fix login auth flow (window.location.href for full reload)
- Fix download URLs returning 401 (make public since UUIDs are unguessable)
- Fix border tool shadowColor validation (accept 6-8 hex digits)
- Fix remove-bg alpha matting fallback (retry without it on failure)
- Fix AI tool silent fallbacks (report errors instead of no-ops)
- Add checkerboard background to before/after slider for transparency
- Add progress bars to all AI tool components
- Add Playwright E2E test suite (131 tests across 9 test files)
- Rewrite Dockerfile for production (tsx runtime, pre-baked AI models)
- Add .dockerignore for faster builds
- Add proper accessible labels to login form
2026-03-22 19:28:57 +08:00
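The alpha matting fallback fixed in ce03aad10f (retry without alpha matting on failure, and report the error rather than silently doing nothing) might be sketched as below. `remove_with_fallback` and the `remove` callable's `(image, alpha_matting)` signature are hypothetical; the real remove-bg script is not shown in this log.

```python
import sys
from typing import Callable


def remove_with_fallback(remove: Callable[[bytes, bool], bytes], image: bytes) -> bytes:
    """Try with alpha matting first; on failure, retry once without it."""
    try:
        return remove(image, True)
    except Exception as exc:
        # Report the failure instead of hiding it, then retry without matting.
        print(f"alpha matting failed ({exc}); retrying without it", file=sys.stderr)
        return remove(image, False)
```

If the second attempt also fails, the exception propagates to the caller, so the error is surfaced rather than turned into a no-op.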
Siddharth Kumar Sah
5524939b6f feat: add Phase 4 AI tools with Python bridge and 6 new tools
Add Python bridge (packages/ai/src/bridge.ts) that calls Python scripts
via child_process with venv-first fallback to system python3. Implements
6 AI-powered tools:

- Remove Background: rembg-based with U2-Net/IS-Net models
- Image Upscaling: Real-ESRGAN with Lanczos fallback
- OCR/Text Extraction: Tesseract + PaddleOCR engines
- Face/PII Blur: MediaPipe face detection with configurable blur
- Object Eraser: LaMa inpainting with mask-based input
- Smart Crop: Sharp attention-based entropy cropping (no Python needed)

Each tool includes: Python script, TypeScript wrapper, API route,
and React settings component. All Python scripts handle ImportError
gracefully with clear installation messages.
2026-03-22 04:31:49 +08:00
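The graceful ImportError handling that 5524939b6f describes for the Python scripts could follow a pattern like this sketch. `require`, the JSON error shape, and the exit code are assumptions for illustration, not the project's actual helper.

```python
import importlib
import json
import sys


def require(module_name: str, pip_name: str):
    """Import a module, or exit with a JSON error naming the package to install."""
    try:
        return importlib.import_module(module_name)
    except ImportError:
        message = f"{module_name} is not installed. Install it with: pip install {pip_name}"
        print(json.dumps({"success": False, "error": message}))
        sys.exit(1)
```

A script would call something like `require("rembg", "rembg")` at startup, so a missing dependency produces a structured message the TypeScript wrapper can show to the user.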