mirror of https://github.com/ashim-hq/ashim synced 2026-04-21 13:37:52 +00:00

Siddharth Kumar Sah 85b1cfc10a chore: rename Stirling-Image to ashim across entire codebase

Complete rebrand from Stirling-Image to ashim following the project
move to https://github.com/ashim-hq/ashim.

Changes across 117 files:
- Package scope: @stirling-image/* → @ashim/*
- GitHub URLs: stirling-image/stirling-image → ashim-hq/ashim
- Docker Hub: stirlingimage/stirling-image → ashimhq/ashim
- GitHub Pages: stirling-image.github.io → ashim-hq.github.io
- All branding text: "Stirling Image" → "ashim"
- Docker service/volumes/user: stirling → ashim
- Database: stirling.db → ashim.db
- localStorage keys: stirling-token → ashim-token
- Environment variables: STIRLING_GPU → ASHIM_GPU
- Python cache dirs: .cache/stirling-image → .cache/ashim
- SVG filter IDs, test prefixes, and all other references

2026-04-14 20:55:42 +08:00

7 KiB

Raw Blame History

AI engine

The @ashim/ai package wraps Python ML models in TypeScript functions. A persistent Python dispatcher process pre-imports heavy ML libraries at startup and keeps them warm in memory, eliminating the cold-start latency that would otherwise occur on every request. If the dispatcher is unavailable, the bridge falls back to spawning a fresh subprocess per call.

All model weights are bundled in the Docker image during the build. No downloads happen at runtime.

::: tip GPU acceleration The Docker image includes CUDA-accelerated ML libraries on amd64. Add --gpus all to your Docker run command to enable GPU acceleration. The image auto-detects your GPU and falls back to CPU if none is available. :::

Background removal

Removes the background from an image and returns a transparent PNG.

Model: BiRefNet models via rembg

Parameter	Type	Description
`model`	string	Model name. Default: `birefnet-general`. Available models: `birefnet-general`, `birefnet-general-lite`, `birefnet-matting`, `birefnet-portrait`, `bria-rmbg`, `u2net`.
`alphaMatting`	boolean	Use alpha matting for finer edge detail
`alphaMattingForegroundThreshold`	number	Foreground threshold for alpha matting (0-255)
`alphaMattingBackgroundThreshold`	number	Background threshold for alpha matting (0-255)

A Phase 2 effects endpoint is also available at POST /api/v1/tools/remove-background/effects. After removing the background, you can apply post-processing effects such as replacement backgrounds, blur, and drop shadows.

Python script: packages/ai/python/remove_bg.py

Upscaling

Increases image resolution using AI super-resolution.

Model: RealESRGAN with Lanczos fallback

Parameter	Type	Description
`scale`	number	Upscale factor (2-8)
`model`	string	Model selection: `auto`, `realesrgan`, or `lanczos`. Default: `auto`.
`faceEnhance`	boolean	Enable face enhancement for better facial detail
`denoise`	number	Denoise strength (0-1). Higher values remove more noise but may lose detail.
`format`	string	Output format (e.g. `png`, `webp`, `jpeg`)
`quality`	number	Output quality for lossy formats

Returns the upscaled image along with the original and new dimensions.

Python script: packages/ai/python/upscale.py

OCR (text recognition)

Extracts text from images. Three quality tiers are available, each trading speed for accuracy.

Models:

fast - Tesseract for quick extraction
balanced - PaddleOCR PP-OCRv5 for general-purpose use
best - PaddleOCR-VL 1.5 vision-language model for maximum accuracy

Parameter	Type	Description
`quality`	string	Quality tier: `fast`, `balanced`, or `best`. Default: `balanced`.
`language`	string	Language code: `auto`, `en`, `de`, `fr`, `es`, `zh`, `ja`, `ko`. Default: `auto`.
`enhance`	boolean	Apply image preprocessing to improve recognition accuracy

Returns structured results with text content, bounding boxes, and confidence scores for each detected text region.

Python script: packages/ai/python/ocr.py

Face detection and blurring

Detects faces in an image and applies a blur to each detected region.

Model: MediaPipe Face Detection

Parameter	Type	Description
`blurRadius`	number	Blur radius for detected faces (1-100). Default: `30`.
`sensitivity`	number	Detection sensitivity (0-1). Lower values require higher confidence. Default: `0.5`.

Returns the blurred image along with metadata about each detected face region (bounding box coordinates and confidence score).

Python script: packages/ai/python/detect_faces.py

Object erasing (inpainting)

Removes objects from images by filling in the area with generated content that matches the surroundings.

Model: LaMa (Large Mask Inpainting) via ONNX Runtime

Takes an image and a mask file (white = area to erase, black = keep). Returns the inpainted image. GPU acceleration is available via ONNX CUDAExecutionProvider when a compatible GPU is detected.

Python script: packages/ai/python/inpaint.py

Content-aware resize (seam carving)

Intelligently resizes images by removing or inserting seams - paths of least visual importance. This preserves the main subject and structure of the image while changing its dimensions.

Engine: caire Go binary

Parameter	Type	Description
`width`	number	Target width in pixels
`height`	number	Target height in pixels
`protectFaces`	boolean	Use face detection to protect facial regions from seam removal
`blurRadius`	number	Gaussian blur radius for energy map computation (0-20)
`sobelThreshold`	number	Edge detection threshold for energy computation (1-20)
`square`	boolean	Force output to a square aspect ratio

Smart crop

Automatically crops images to focus on the most important region. Combines Sharp attention/entropy strategies with MediaPipe face detection to find the optimal crop area.

Models: Sharp (attention/entropy) + MediaPipe Face Detection

Three modes are available:

subject - Uses Sharp's attention strategy to find the most visually interesting region.
face - Uses MediaPipe face detection to center the crop on detected faces.
trim - Removes uniform borders and whitespace from the edges of the image.

Parameters vary by mode. See the interactive API reference at /api/docs for the full parameter list for each mode.

How the bridge works

The TypeScript bridge (packages/ai/src/bridge.ts) exposes a single function, runPythonWithProgress, that does the following for each AI call:

Writes the input image to a temp file in the workspace directory.
Sends a JSON request to the persistent Python dispatcher via stdin (packages/ai/python/dispatcher.py). If the dispatcher isn't running, falls back to spawning a fresh subprocess.
Parses JSON progress lines from stderr (e.g. {"progress": 50, "stage": "Processing..."}) and forwards them via an onProgress callback for real-time SSE streaming.
Reads the JSON response from stdout.
Reads the output image from the filesystem.
Cleans up temp files.

The persistent dispatcher pre-imports rembg, torch, PaddleOCR, MediaPipe, and the LaMa ONNX model at startup. This means the first AI call after container start is fast instead of waiting for library imports. The dispatcher handles requests sequentially (Python's GIL) and reports readiness via a {"ready": true} message on stderr.

GPU detection is handled by packages/ai/python/gpu.py, which checks for CUDA availability at startup and configures each model to use GPU or CPU accordingly.

If the Python process exits with a non-zero code, the bridge extracts a user-friendly error from stderr/stdout and throws. Timeouts default to 5 minutes.

7 KiB Raw Blame History