mirror of
https://github.com/ashim-hq/ashim
synced 2026-04-21 13:37:52 +00:00
docs: add crash recovery and robustness to spec and plan
Atomic model downloads (.downloading suffix + rename), file-based install lock (survives container restart), atomic JSON writes, startup recovery sequence, frontend double-click prevention, SSE fallback polling, disk space pre-checks.
This commit is contained in:
parent 08a7ffe403
commit 31424d4356
2 changed files with 1236 additions and 0 deletions

docs/superpowers/plans/2026-04-17-on-demand-ai-features.md (new file, +712 lines):
# On-Demand AI Feature Downloads Implementation Plan

> **For agentic workers:** REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking.

**Goal:** Reduce the Docker image from ~30 GB to ~5-6 GB by making AI features downloadable post-install via a UI-driven bundle system.

**Architecture:** Six feature bundles (Background Removal, Face Detection, Object Eraser & Colorize, Upscale & Enhance, Photo Restoration, OCR) are defined in a JSON manifest baked into the image. A Python install script handles pip installs and model downloads to a persistent volume. The backend exposes install/uninstall APIs with SSE progress. The frontend shows download badges on uninstalled tools and an install prompt on tool pages.

**Tech Stack:** Fastify (API), Zustand (frontend state), Python (install script), Docker (image restructuring), SSE (progress streaming)

**Spec:** `docs/superpowers/specs/2026-04-17-on-demand-ai-features-design.md`

---

## File Map

```
NEW FILES:
packages/shared/src/features.ts                              # Bundle definitions, tool-to-bundle map, types
docker/feature-manifest.json                                 # Authoritative manifest baked into image
apps/api/src/lib/feature-status.ts                           # Reads manifest + installed.json, provides status
apps/api/src/routes/features.ts                              # GET /features, POST install/uninstall, GET disk-usage
packages/ai/python/install_feature.py                        # Python install script (pip + model downloads)
apps/web/src/stores/features-store.ts                        # Zustand store for bundle statuses
apps/web/src/components/features/feature-install-prompt.tsx  # Install prompt card for tool pages
apps/web/src/components/settings/ai-features-section.tsx     # Settings panel section
tests/unit/features.test.ts                                  # Unit tests for feature logic

MODIFIED FILES:
packages/ai/src/bridge.ts                              # restartDispatcher(), FEATURE_NOT_INSTALLED handling
packages/ai/src/index.ts                               # Export restartDispatcher
packages/ai/python/dispatcher.py                       # Read installed.json, gate scripts by feature
packages/ai/python/colorize.py                         # Hard imports to lazy imports
packages/ai/python/restore.py                          # Hard imports to lazy imports
apps/api/src/index.ts                                  # Register feature routes, startup venv check
apps/api/src/routes/tool-factory.ts                    # Feature-installed guard before process()
apps/api/src/routes/batch.ts                           # Feature-installed check at gating point
apps/api/src/routes/pipeline.ts                        # Feature-installed check in pre-validation
apps/api/src/routes/tools/restore-photo.ts             # Feature-installed guard
apps/web/src/lib/api.ts                                # Extend parseApiError for FEATURE_NOT_INSTALLED
apps/web/src/components/common/tool-card.tsx           # Download badge on uninstalled AI tools
apps/web/src/pages/tool-page.tsx                       # Feature check then install prompt or "not enabled"
apps/web/src/components/layout/tool-panel.tsx          # Fetch features on mount
apps/web/src/pages/fullscreen-grid-page.tsx            # Fetch features on mount
apps/web/src/components/settings/settings-dialog.tsx   # Add AI Features nav item + section
docker/Dockerfile                                      # Remove ML packages/models, keep base
docker/entrypoint.sh                                   # Venv bootstrap, /data/ai/ setup
```

---
### Task 1: Shared Feature Types and Bundle Definitions

**Files:**
- Create: `packages/shared/src/features.ts`
- Modify: `packages/shared/src/index.ts`
- Test: `tests/unit/features.test.ts`

- [ ] **Step 1: Write the failing test for bundle definitions**

Create `tests/unit/features.test.ts`:

```ts
import { describe, expect, it } from "vitest";
import {
  FEATURE_BUNDLES,
  getBundleForTool,
  getToolsForBundle,
  TOOL_BUNDLE_MAP,
} from "@ashim/shared/features";
import { PYTHON_SIDECAR_TOOLS } from "@ashim/shared";

describe("Feature bundles", () => {
  it("every PYTHON_SIDECAR_TOOL maps to exactly one bundle", () => {
    for (const toolId of PYTHON_SIDECAR_TOOLS) {
      const bundle = getBundleForTool(toolId);
      expect(bundle, `${toolId} has no bundle`).toBeDefined();
    }
  });

  it("getBundleForTool returns null for non-AI tools", () => {
    expect(getBundleForTool("resize")).toBeNull();
    expect(getBundleForTool("crop")).toBeNull();
  });

  it("getToolsForBundle returns correct tools", () => {
    const tools = getToolsForBundle("background-removal");
    expect(tools).toContain("remove-background");
    expect(tools).toContain("passport-photo");
    expect(tools).not.toContain("upscale");
  });

  it("all 6 bundles are defined", () => {
    expect(Object.keys(FEATURE_BUNDLES)).toHaveLength(6);
    expect(FEATURE_BUNDLES["background-removal"]).toBeDefined();
    expect(FEATURE_BUNDLES["face-detection"]).toBeDefined();
    expect(FEATURE_BUNDLES["object-eraser-colorize"]).toBeDefined();
    expect(FEATURE_BUNDLES["upscale-enhance"]).toBeDefined();
    expect(FEATURE_BUNDLES["photo-restoration"]).toBeDefined();
    expect(FEATURE_BUNDLES["ocr"]).toBeDefined();
  });

  it("TOOL_BUNDLE_MAP covers all sidecar tools", () => {
    const mappedTools = Object.keys(TOOL_BUNDLE_MAP);
    for (const toolId of PYTHON_SIDECAR_TOOLS) {
      expect(mappedTools, `${toolId} missing from TOOL_BUNDLE_MAP`).toContain(toolId);
    }
  });
});
```

- [ ] **Step 2: Run test to verify it fails**

Run: `pnpm test:unit -- tests/unit/features.test.ts`
Expected: FAIL with a module-not-found error.
- [ ] **Step 3: Create the feature definitions module**

Create `packages/shared/src/features.ts`:

```ts
export interface FeatureBundleInfo {
  id: string;
  name: string;
  description: string;
  estimatedSize: string;
  enablesTools: string[];
}

export type FeatureStatus = "not_installed" | "installing" | "installed" | "error";

export interface FeatureBundleState {
  id: string;
  name: string;
  description: string;
  status: FeatureStatus;
  installedVersion: string | null;
  estimatedSize: string;
  enablesTools: string[];
  progress: { percent: number; stage: string } | null;
  error: string | null;
}

export const FEATURE_BUNDLES: Record<string, FeatureBundleInfo> = {
  "background-removal": {
    id: "background-removal",
    name: "Background Removal",
    description: "Remove image backgrounds with AI",
    estimatedSize: "700 MB - 1 GB",
    enablesTools: ["remove-background", "passport-photo"],
  },
  "face-detection": {
    id: "face-detection",
    name: "Face Detection",
    description: "Detect and blur faces, fix red-eye, smart crop",
    estimatedSize: "200-300 MB",
    enablesTools: ["blur-faces", "red-eye-removal", "smart-crop"],
  },
  "object-eraser-colorize": {
    id: "object-eraser-colorize",
    name: "Object Eraser & Colorize",
    description: "Erase objects from photos and colorize B&W images",
    estimatedSize: "600-800 MB",
    enablesTools: ["erase-object", "colorize"],
  },
  "upscale-enhance": {
    id: "upscale-enhance",
    name: "Upscale & Enhance",
    description: "AI upscaling, face enhancement, and noise removal",
    estimatedSize: "4-5 GB",
    enablesTools: ["upscale", "enhance-faces", "noise-removal"],
  },
  "photo-restoration": {
    id: "photo-restoration",
    name: "Photo Restoration",
    description: "Restore old or damaged photos",
    estimatedSize: "800 MB - 1 GB",
    enablesTools: ["restore-photo"],
  },
  ocr: {
    id: "ocr",
    name: "OCR",
    description: "Extract text from images",
    estimatedSize: "3-4 GB",
    enablesTools: ["ocr"],
  },
};

export const TOOL_BUNDLE_MAP: Record<string, string> = {};
for (const [bundleId, bundle] of Object.entries(FEATURE_BUNDLES)) {
  for (const toolId of bundle.enablesTools) {
    TOOL_BUNDLE_MAP[toolId] = bundleId;
  }
}

export function getBundleForTool(toolId: string): FeatureBundleInfo | null {
  const bundleId = TOOL_BUNDLE_MAP[toolId];
  return bundleId ? FEATURE_BUNDLES[bundleId] : null;
}

export function getToolsForBundle(bundleId: string): string[] {
  return FEATURE_BUNDLES[bundleId]?.enablesTools ?? [];
}
```
- [ ] **Step 4: Export from shared package**

Add to the end of `packages/shared/src/index.ts`:

```ts
export * from "./features.js";
```

- [ ] **Step 5: Run test to verify it passes**

Run: `pnpm test:unit -- tests/unit/features.test.ts`
Expected: PASS, all 5 tests green.

- [ ] **Step 6: Commit**

```bash
git add packages/shared/src/features.ts packages/shared/src/index.ts tests/unit/features.test.ts
git commit -m "feat: add shared feature bundle definitions and tool-to-bundle mapping"
```

---
### Task 2: Feature Manifest File

**Files:**
- Create: `docker/feature-manifest.json`

- [ ] **Step 1: Create the feature manifest**

Create `docker/feature-manifest.json` containing the full bundle definitions with exact package versions, pip flags, platform-specific packages, and model download URLs. Source exact versions from the current Dockerfile (lines 167-206) and model URLs from `docker/download_models.py`.

Key details: amd64 uses `--extra-index-url https://download.pytorch.org/whl/cu126` for torch/realesrgan; amd64 uses `paddlepaddle-gpu>=3.2.1` from `https://www.paddlepaddle.org.cn/packages/stable/cu126/`; arm64 uses `mediapipe==0.10.18`; `codeformer-pip==0.0.4` needs `--no-deps`; `postInstall` re-pins `numpy==1.26.4`.

The file should contain a top-level `manifestVersion`, `imageVersion`, `pythonVersion`, `basePackages` array, and `bundles` object with all 6 bundles. Each bundle has `name`, `description`, `estimatedSize`, `packages` (with `common`/`amd64`/`arm64` arrays), `pipFlags`, `postInstall`, `models` array, and `enablesTools` array.

Model entries use either: `{ "id", "url", "path", "minSize" }` for direct downloads, `{ "id", "downloadFn": "rembg_session", "args": [...] }` for rembg models, or `{ "id", "downloadFn": "hf_snapshot", "args": [repo_id, local_subpath] }` for HuggingFace snapshots.
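As an illustrative sketch only (not the authoritative manifest), one bundle entry following the structure described above might look like this. All versions, URLs, and sizes below are placeholders; the real values must be sourced from the Dockerfile and `docker/download_models.py`:

```json
{
  "manifestVersion": 1,
  "imageVersion": "0.0.0-example",
  "pythonVersion": "3.11",
  "basePackages": ["numpy==1.26.4", "Pillow", "opencv-python-headless"],
  "bundles": {
    "photo-restoration": {
      "name": "Photo Restoration",
      "description": "Restore old or damaged photos",
      "estimatedSize": "800 MB - 1 GB",
      "packages": {
        "common": ["codeformer-pip==0.0.4"],
        "amd64": [],
        "arm64": []
      },
      "pipFlags": ["--no-deps"],
      "postInstall": ["numpy==1.26.4"],
      "models": [
        {
          "id": "example-model",
          "url": "https://example.invalid/model.pth",
          "path": "restore/model.pth",
          "minSize": 100000000
        },
        { "id": "example-hf", "downloadFn": "hf_snapshot", "args": ["org/repo", "restore/hf"] }
      ],
      "enablesTools": ["restore-photo"]
    }
  }
}
```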
- [ ] **Step 2: Commit**

```bash
git add docker/feature-manifest.json
git commit -m "feat: add feature manifest with all 6 bundle definitions"
```

---
### Task 3: Backend Feature Status Service

**Files:**
- Create: `apps/api/src/lib/feature-status.ts`

- [ ] **Step 1: Create the feature status service**

Create `apps/api/src/lib/feature-status.ts`. This module reads/writes `/data/ai/installed.json`, provides `isFeatureInstalled(bundleId)`, `isToolInstalled(toolId)`, `getFeatureStates()`, `markInstalled()`, `markUninstalled()`, `setInstallProgress()`, and `ensureAiDirs()`.

Uses `FEATURE_BUNDLES` and `TOOL_BUNDLE_MAP` from `@ashim/shared`. Caches `installed.json` in memory with `invalidateCache()` for refresh after install/uninstall. Detects the Docker environment via `existsSync("/.dockerenv")`.

See spec section "Persistent Storage" for the directory structure: `/data/ai/venv/`, `/data/ai/models/`, `/data/ai/pip-cache/`, `/data/ai/installed.json`.

**Robustness requirements for this module:**

- **Atomic JSON writes:** `markInstalled()` and `markUninstalled()` must write to `installed.json.tmp` first, then `renameSync()` to `installed.json`. Never write directly to `installed.json`.
- **Corrupt JSON recovery:** `readInstalled()` wraps `JSON.parse` in try/catch. If the file is corrupt, treat it as empty `{ bundles: {} }` and log a warning.
- **File-based install lock:** Instead of just an in-memory `installInProgress` flag, use a `/data/ai/install.lock` file containing `{ bundleId, startedAt, pid }`. Create the lock before install, delete it on completion/failure. `getInstallingBundle()` reads from the lock file, not memory.
- **`recoverInterruptedInstalls()`** function called on startup:
  1. Delete any `*.downloading` files in `/data/ai/models/` (recursive glob)
  2. Delete `installed.json.tmp` if it exists
  3. Delete `/data/ai/venv.bootstrapping/` if it exists
  4. If `install.lock` exists: check if the PID is alive (via `process.kill(pid, 0)` in try/catch). If dead, delete the lock and log a warning. If alive, leave it (the install is still running from a previous container lifecycle — unlikely but possible with shared volumes).
  5. For each bundle in `installed.json`, verify model files exist and meet `minSize` from the feature manifest. If any model is missing/undersized, set the bundle's error field to "Some model files are missing. Reinstall this feature." but do NOT remove it from installed.json.
- **`acquireInstallLock(bundleId)`** and **`releaseInstallLock()`** functions that create/delete the lock file atomically.
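A minimal sketch of the atomic-write and lock helpers described above. The base directory is passed as a parameter here purely so the sketch is testable outside Docker; the real module would use `/data/ai`. Error handling and logging are trimmed:

```ts
import { readFileSync, renameSync, rmSync, writeFileSync } from "node:fs";
import { join } from "node:path";

export function writeJsonAtomic(path: string, value: unknown): void {
  const tmp = `${path}.tmp`;
  // Never write the target file directly; publish via atomic rename.
  writeFileSync(tmp, JSON.stringify(value, null, 2));
  renameSync(tmp, path); // atomic on the same filesystem
}

export function readInstalled(path: string): { bundles: Record<string, unknown> } {
  try {
    return JSON.parse(readFileSync(path, "utf8"));
  } catch {
    // Missing or corrupt file: treat as empty (caller logs a warning).
    return { bundles: {} };
  }
}

export function acquireInstallLock(aiDir: string, bundleId: string): boolean {
  try {
    // Flag "wx" fails if the file already exists, making create-if-absent atomic.
    writeFileSync(
      join(aiDir, "install.lock"),
      JSON.stringify({ bundleId, startedAt: Date.now(), pid: process.pid }),
      { flag: "wx" },
    );
    return true;
  } catch {
    return false; // another install already holds the lock
  }
}

export function releaseInstallLock(aiDir: string): void {
  rmSync(join(aiDir, "install.lock"), { force: true });
}
```

The "wx" flag is what makes the lock safe without a read-then-write race: creation and the existence check happen in a single `open(2)` call.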
- [ ] **Step 2: Commit**

```bash
git add apps/api/src/lib/feature-status.ts
git commit -m "feat: add backend feature status service for tracking installed bundles"
```

---
### Task 4: Feature API Routes

**Files:**
- Create: `apps/api/src/routes/features.ts`
- Modify: `apps/api/src/index.ts`

- [ ] **Step 1: Create the features route file**

Create `apps/api/src/routes/features.ts` with 4 endpoints:

1. `GET /api/v1/features` (any authenticated user) — returns `{ bundles: FeatureBundleState[] }`. In non-Docker environments, returns all features as installed.
2. `POST /api/v1/admin/features/:bundleId/install` (admin only) — validates the bundle exists, checks it is not already installed, checks no other install is in progress (409). Spawns `install_feature.py` as a child process via `spawn()`. Parses stderr JSON progress lines and updates progress via `updateSingleFileProgress()` from `progress.ts`. On success, calls `invalidateCache()` and `shutdownDispatcher()` (from `@ashim/ai`). Returns `{ jobId }`.
3. `POST /api/v1/admin/features/:bundleId/uninstall` (admin only) — removes the model files listed in the manifest, calls `markUninstalled()`, calls `shutdownDispatcher()`. Returns `{ ok: true }`.
4. `GET /api/v1/admin/features/disk-usage` (admin only) — returns `{ totalBytes }` by recursively sizing `/data/ai/`.

Note: Use `spawn()` from `node:child_process` (not `exec()`) for the install script to avoid shell injection. Pass arguments as array elements.

**Robustness requirements for the install endpoint:**
- Call `acquireInstallLock(bundleId)` before spawning the child process. If lock acquisition fails (the lock file already exists with a live PID), return 409.
- Check available disk space before starting: `import { statfsSync } from "node:fs"; const stats = statfsSync("/data"); const freeBytes = stats.bfree * stats.bsize;`. Compare against a rough estimate for the bundle. If insufficient, return 400 with disk space info.
- On the child process `close` event with code 0: call `releaseInstallLock()`, `invalidateCache()`, `shutdownDispatcher()`.
- On the child process `close` event with a non-zero code: call `releaseInstallLock()` and set the error state. Do NOT leave the lock file behind.
- On the child process `error` event (spawn failure): call `releaseInstallLock()` and return an error.
- The install endpoint returns `{ jobId }` immediately. The child process runs asynchronously; the HTTP response does not block on completion.
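The disk-space pre-check and the child-process lifecycle above can be sketched as follows. Function names are illustrative, not the real route module's API; lock release is routed through a single idempotent `finish` callback so no exit path leaves the lock behind:

```ts
import { statfsSync } from "node:fs";
import { spawn } from "node:child_process";

// Free bytes on the filesystem containing `path`; compare against a rough
// per-bundle size estimate before spawning install_feature.py.
export function freeDiskBytes(path: string): number {
  const stats = statfsSync(path);
  return stats.bfree * stats.bsize;
}

export function hasEnoughDisk(path: string, requiredBytes: number): boolean {
  return freeDiskBytes(path) >= requiredBytes;
}

// Spawn with an argv array (no shell, no injection). `onDone(null)` means the
// process could not even be spawned; otherwise it receives the exit code.
export function runInstall(
  script: string,
  args: string[],
  onDone: (code: number | null) => void,
) {
  let settled = false;
  const finish = (code: number | null) => {
    if (!settled) {
      settled = true;
      onDone(code); // caller releases the install lock here, on every path
    }
  };
  const child = spawn("python3", [script, ...args]);
  child.on("error", () => finish(null)); // spawn failure
  child.on("close", (code) => finish(code)); // success or non-zero exit
  return child;
}
```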
- [ ] **Step 2: Register feature routes in index.ts**

In `apps/api/src/index.ts`: import `registerFeatureRoutes` and call it after the settings routes registration. Also import and call `ensureAiDirs()` and `recoverInterruptedInstalls()` near the top of the startup sequence, after `runMigrations()`.

- [ ] **Step 3: Commit**

```bash
git add apps/api/src/routes/features.ts apps/api/src/index.ts
git commit -m "feat: add feature install/uninstall API routes with SSE progress"
```

---
### Task 5: Python Install Script

**Files:**
- Create: `packages/ai/python/install_feature.py`

- [ ] **Step 1: Create the install script**

Create `packages/ai/python/install_feature.py`. Takes 3 CLI args: `bundleId`, `manifestPath`, `modelsDir`. Reads the manifest JSON, detects architecture via `platform.machine()`, runs pip install for each package using `subprocess.run([sys.executable, "-m", "pip", "install", ...])`, and downloads models with retry logic (exponential backoff, 3 retries, file size assertions).

Progress is reported via stderr JSON lines: `{"progress": N, "stage": "..."}`. The result is written to stdout as JSON: `{"success": true, "bundleId": "...", "version": "...", "models": [...]}`.

Port the retry pattern from `docker/download_models.py` `_urlretrieve()` (lines 18-35). Handle rembg models via `rembg.new_session()` and HuggingFace models via `huggingface_hub.snapshot_download()`. Must be idempotent.

Writes to `/data/ai/installed.json` on success (matching the structure read by `feature-status.ts`).

**Robustness requirements for the install script:**

- **Atomic model downloads:** For each URL-based model:
  1. Check if the final path already exists and meets `minSize` — skip if so (idempotent)
  2. Delete any existing `<path>.downloading` file (orphan from a previous failed attempt)
  3. Download to `<path>.downloading`
  4. Verify the file size against `minSize`. If too small, delete and raise an error.
  5. `os.rename(<path>.downloading, <path>)` — atomic on the same filesystem
  6. Never leave a `.downloading` file behind on success
- **Atomic JSON writes:** When writing `installed.json`:
  1. Write to `installed.json.tmp`
  2. `os.rename()` to `installed.json`
- **Disk space pre-check:** Before starting, check available disk space via `shutil.disk_usage()`. If free space is less than the estimated bundle size, exit with a clear error message.
- **pip failure recovery:** If `pip install` fails for one package, emit the error and exit. The packages that were already installed remain (pip is idempotent — re-running skips them). The admin can retry.
- **Model failure isolation:** If one model fails to download after retries, continue downloading the other models. At the end, report which models failed. Exit with a non-zero code so the bundle is NOT marked as installed. On retry, only the failed models need downloading (the others pass the exists+size check).
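The atomic-download steps above can be sketched as a small helper. The `fetch` callable is injected here so the rename and size-check logic is testable without network access; the real script would pass its urllib-based retrying downloader:

```python
import os


def atomic_download(fetch, dest: str, min_size: int) -> bool:
    """Download to `<dest>.downloading`, verify size, then rename atomically.

    `fetch(tmp_path)` must write the payload to tmp_path. Returns True if a
    download happened, False if an existing valid file was kept (idempotent).
    """
    # 1. Skip if the final file already exists and is large enough.
    if os.path.exists(dest) and os.path.getsize(dest) >= min_size:
        return False
    tmp = dest + ".downloading"
    # 2. Remove any orphan from a previous failed attempt.
    if os.path.exists(tmp):
        os.remove(tmp)
    os.makedirs(os.path.dirname(dest) or ".", exist_ok=True)
    # 3. Download to the temporary path.
    fetch(tmp)
    # 4. Verify size before publishing the file.
    if os.path.getsize(tmp) < min_size:
        os.remove(tmp)
        raise RuntimeError(f"{dest}: downloaded file smaller than {min_size} bytes")
    # 5. Atomic on the same filesystem; no .downloading file survives success.
    os.rename(tmp, dest)
    return True
```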
- [ ] **Step 2: Commit**

```bash
git add packages/ai/python/install_feature.py
git commit -m "feat: add Python install script for feature bundles"
```

---
### Task 6: Tool Route Guards

**Files:**
- Modify: `apps/api/src/routes/tool-factory.ts`
- Modify: `apps/api/src/routes/batch.ts`
- Modify: `apps/api/src/routes/pipeline.ts`
- Modify: `apps/api/src/routes/tools/restore-photo.ts`

- [ ] **Step 1: Add feature guard to tool-factory.ts**

Import `isToolInstalled` from `../lib/feature-status.js` and `TOOL_BUNDLE_MAP`, `getBundleForTool` from `@ashim/shared`. Inside `createToolRoute`, after settings validation and before `config.process()`, add:

```ts
const bundleId = TOOL_BUNDLE_MAP[config.toolId];
if (bundleId && !isToolInstalled(config.toolId)) {
  const bundle = getBundleForTool(config.toolId);
  return reply.status(501).send({
    error: "Feature not installed",
    code: "FEATURE_NOT_INSTALLED",
    feature: bundleId,
    featureName: bundle?.name ?? bundleId,
    estimatedSize: bundle?.estimatedSize ?? "unknown",
  });
}
```

- [ ] **Step 2: Add feature guard to batch.ts**

Same imports. After `getToolConfig(toolId)` returns (around lines 35-37), add the same guard returning 501 with the `FEATURE_NOT_INSTALLED` code.

- [ ] **Step 3: Add feature guard to pipeline.ts**

Same imports. In both pre-validation loops (execute at lines 143-172, batch at lines 441-462), after a successful `getToolConfig(resolvedToolId)`, add the guard. Return 501 with the step number in the error message.

- [ ] **Step 4: Add feature guard to restore-photo.ts**

This tool uses its own route handler, not the factory. Import `isToolInstalled` and add the guard before `restorePhoto()` is called.

- [ ] **Step 5: Commit**

```bash
git add apps/api/src/routes/tool-factory.ts apps/api/src/routes/batch.ts apps/api/src/routes/pipeline.ts apps/api/src/routes/tools/restore-photo.ts
git commit -m "feat: add feature-installed guards to tool routes, batch, and pipeline"
```

---
### Task 7: Bridge and Python Sidecar Changes

**Files:**
- Modify: `packages/ai/python/dispatcher.py`
- Modify: `packages/ai/python/colorize.py`
- Modify: `packages/ai/python/restore.py`

- [ ] **Step 1: Add feature gating to dispatcher.py**

Add a `TOOL_BUNDLE_MAP` dict mapping Python script names (without `.py`) to bundle IDs: `remove_bg` -> `background-removal`, `detect_faces` -> `face-detection`, `face_landmarks` -> `face-detection`, `red_eye_removal` -> `face-detection`, `inpaint` -> `object-eraser-colorize`, `colorize` -> `object-eraser-colorize`, `upscale` -> `upscale-enhance`, `enhance_faces` -> `upscale-enhance`, `noise_removal` -> `upscale-enhance`, `restore` -> `photo-restoration`, `ocr` -> `ocr`.

Add `_get_installed_bundles()` that reads `/data/ai/installed.json` and returns a set of installed bundle IDs.

In `_run_script_main()`, before the `exec()` call, check whether the script's bundle is installed. If not, return a JSON error: `{"success": false, "error": "feature_not_installed", "feature": bundle_id, "message": "..."}`.

Also set `U2NET_HOME` to `/data/ai/models/rembg` on startup if `/data/ai/models` exists.
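A sketch of the gating helpers described above. The installed-file path is a parameter here only for testability (the real dispatcher would read `/data/ai/installed.json`), and the map below is an excerpt of the full mapping listed in Step 1:

```python
import json

# Script name (without .py) -> bundle id (excerpt of the full map above).
TOOL_BUNDLE_MAP = {
    "remove_bg": "background-removal",
    "detect_faces": "face-detection",
    "colorize": "object-eraser-colorize",
    "upscale": "upscale-enhance",
    "restore": "photo-restoration",
    "ocr": "ocr",
}


def _get_installed_bundles(installed_path: str) -> set:
    try:
        with open(installed_path) as f:
            data = json.load(f)
        return set(data.get("bundles", {}).keys())
    except (OSError, ValueError):
        return set()  # missing or corrupt file: treat as nothing installed


def check_feature(script: str, installed_path: str):
    """Return None if the script may run, else a JSON-serializable error dict."""
    bundle_id = TOOL_BUNDLE_MAP.get(script)
    if bundle_id is None or bundle_id in _get_installed_bundles(installed_path):
        return None  # unmapped scripts are not gated
    return {
        "success": False,
        "error": "feature_not_installed",
        "feature": bundle_id,
        "message": f"Feature '{bundle_id}' is not installed",
    }
```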
- [ ] **Step 2: Convert hard imports in colorize.py**

Move the module-level `import numpy as np`, `import cv2`, `from PIL import Image` (lines 10-12) inside each function that uses them (`colorize_ddcolor`, `colorize_opencv`, `main`).

- [ ] **Step 3: Convert hard imports in restore.py**

Move the module-level `import numpy as np`, `import cv2`, `from PIL import Image` (lines 13-15) inside each function that uses them.

- [ ] **Step 4: Commit**

```bash
git add packages/ai/python/dispatcher.py packages/ai/python/colorize.py packages/ai/python/restore.py
git commit -m "feat: add feature gating to Python dispatcher, convert hard imports to lazy"
```

---
### Task 8: Frontend Features Store and API Error Extension

**Files:**
- Create: `apps/web/src/stores/features-store.ts`
- Modify: `apps/web/src/lib/api.ts`
- Modify: `apps/web/src/hooks/use-tool-processor.ts`
- Modify: `apps/web/src/hooks/use-pipeline-processor.ts`

- [ ] **Step 1: Create the features store**

Create `apps/web/src/stores/features-store.ts` following the `settings-store.ts` pattern. Zustand store with `bundles: FeatureBundleState[]`, `loaded: boolean`, `fetch()` (one-shot), `refresh()` (force re-fetch), `isToolInstalled(toolId)`, `getBundleForTool(toolId)`. Fetches from `GET /api/v1/features`.

- [ ] **Step 2: Extend parseApiError for FEATURE_NOT_INSTALLED**

In `apps/web/src/lib/api.ts`, add a `FeatureNotInstalledError` interface export: `{ type: "feature_not_installed"; feature: string; featureName: string; estimatedSize: string }`.

Modify the `parseApiError` return type to `string | FeatureNotInstalledError`. Add an early return when `body.code === "FEATURE_NOT_INSTALLED"`.
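A sketch of the extended parser. The body shape matches the 501 payload added in Task 6; the rest of `parseApiError`'s existing behavior is stubbed here as a plain-string fallback, so this is illustrative rather than the real module:

```ts
export interface FeatureNotInstalledError {
  type: "feature_not_installed";
  feature: string;
  featureName: string;
  estimatedSize: string;
}

export function parseApiError(body: any, status: number): string | FeatureNotInstalledError {
  // Early return for the structured feature error from Task 6's 501 responses.
  if (body?.code === "FEATURE_NOT_INSTALLED") {
    return {
      type: "feature_not_installed",
      feature: body.feature,
      featureName: body.featureName ?? body.feature,
      estimatedSize: body.estimatedSize ?? "unknown",
    };
  }
  // Stub of the existing behavior: everything else becomes a display string.
  return body?.error ?? `Request failed with status ${status}`;
}
```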
- [ ] **Step 3: Update use-tool-processor.ts and use-pipeline-processor.ts**

In both hooks, where `parseApiError` is called and passed to `setError()`, add a type check:

```ts
const parsed = parseApiError(body, xhr.status);
if (typeof parsed === "object" && parsed.type === "feature_not_installed") {
  setError(`Feature "${parsed.featureName}" is not installed. Enable it in Settings.`);
} else {
  setError(parsed);
}
```

- [ ] **Step 4: Commit**

```bash
git add apps/web/src/stores/features-store.ts apps/web/src/lib/api.ts apps/web/src/hooks/use-tool-processor.ts apps/web/src/hooks/use-pipeline-processor.ts
git commit -m "feat: add frontend features store and FEATURE_NOT_INSTALLED error handling"
```

---
### Task 9: Frontend Tool Grid Badge

**Files:**
- Modify: `apps/web/src/components/common/tool-card.tsx`
- Modify: `apps/web/src/components/layout/tool-panel.tsx`
- Modify: `apps/web/src/pages/fullscreen-grid-page.tsx`

- [ ] **Step 1: Add download badge to ToolCard**

Import `useFeaturesStore`, `PYTHON_SIDECAR_TOOLS`, and the `Download` icon from lucide-react. Compute `showDownloadBadge` when the tool is an AI tool and not installed. Render a `<Download className="h-3.5 w-3.5 text-muted-foreground" />` icon after the experimental badge.

- [ ] **Step 2: Fetch features on app load**

In `tool-panel.tsx`, add `useFeaturesStore().fetch()` in a useEffect alongside the existing settings fetch. Do the same in `fullscreen-grid-page.tsx`.

- [ ] **Step 3: Commit**

```bash
git add apps/web/src/components/common/tool-card.tsx apps/web/src/components/layout/tool-panel.tsx apps/web/src/pages/fullscreen-grid-page.tsx
git commit -m "feat: add download badge to uninstalled AI tools in tool grid"
```

---
### Task 10: Frontend Tool Page Install Prompt

**Files:**
- Create: `apps/web/src/components/features/feature-install-prompt.tsx`
- Modify: `apps/web/src/pages/tool-page.tsx`

- [ ] **Step 1: Create the FeatureInstallPrompt component**

Props: `{ bundle: FeatureBundleState; isAdmin: boolean }`.

For non-admins: show a centered Download icon + "Feature Not Enabled" heading + "Ask your administrator" text.

For admins: show the Download icon + bundle name/description + "requires additional download (~{estimatedSize})" + an [Enable Feature] button. On click: POST to the install endpoint, open an EventSource for SSE progress, show a progress bar with stage text and percent. On completion: call `useFeaturesStore().refresh()` to trigger a re-render. On error: show the error message with a retry option.

Use the same Tailwind patterns as existing components: `bg-primary text-primary-foreground` for buttons, `Loader2 animate-spin` for loading, `text-destructive` for errors.

**Robustness requirements for the frontend:**

- **Double-click prevention:** Set `installing = true` immediately on first click (before the API call). The button must be `disabled={installing || bundle.status === "installing"}`. This prevents any re-click.
- **Browser close / navigate away:** The server-side install continues regardless. On component mount, check `bundle.status` from the features store. If it's `"installing"`, immediately show the progress bar and open an EventSource for the in-progress job (fetch `jobId` from the features endpoint or use the bundle's progress data).
- **SSE connection loss fallback:** If the EventSource fires `onerror`, close it and fall back to polling `GET /api/v1/features` every 3 seconds via `setInterval`. When the status changes from `"installing"` to `"installed"` or `"error"`, stop polling and update the UI.
- **Page refresh during install:** The features store's `fetch()` returns the current status. If a bundle is `"installing"`, the component renders the progress state immediately — no need for the user to click anything.
- **Multiple admin sessions:** All sessions see the same `"installing"` status from the shared `GET /api/v1/features` endpoint. The server's install lock prevents concurrent installs. Any session trying to install gets a 409.
- **Retry after error:** Show a "Retry" button when the status is `"error"`. On retry, call the install endpoint again (the lock is released on failure, so this works). The pip cache means previously-downloaded wheels aren't re-downloaded. Idempotent model downloads skip already-complete files.
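The SSE-fallback behavior can be sketched as a small polling helper. The status fetcher is injected (the real component would call `GET /api/v1/features` and extract this bundle's status), so the stop-on-settle logic is testable without a server; the function name is illustrative:

```ts
export type InstallStatus = "not_installed" | "installing" | "installed" | "error";

// Used after EventSource.onerror: poll until the bundle leaves "installing",
// then resolve with the settled status so the component can update its UI.
export function pollUntilSettled(
  fetchStatus: () => Promise<InstallStatus>,
  intervalMs = 3000,
): Promise<InstallStatus> {
  return new Promise((resolve, reject) => {
    const timer = setInterval(async () => {
      try {
        const status = await fetchStatus();
        if (status !== "installing") {
          clearInterval(timer); // stop polling once settled
          resolve(status);
        }
      } catch (err) {
        clearInterval(timer); // network failure: surface to the caller
        reject(err);
      }
    }, intervalMs);
  });
}
```

In the component, the `onerror` handler would close the EventSource and hand off to this helper, calling `useFeaturesStore().refresh()` with the resolved status.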
- [ ] **Step 2: Integrate into ToolPage**

In `tool-page.tsx`: import `useFeaturesStore`, `PYTHON_SIDECAR_TOOLS`, `useAuth`, and `FeatureInstallPrompt`. After the tool/registryEntry lookup, compute `isAiTool`, `toolInstalled`, `featureBundle`, `isAdmin`. After the "Tool not found" guard, add a guard that renders `<FeatureInstallPrompt>` wrapped in `<AppLayout>` when the tool is AI and not installed.

- [ ] **Step 3: Commit**

```bash
git add apps/web/src/components/features/feature-install-prompt.tsx apps/web/src/pages/tool-page.tsx
git commit -m "feat: add feature install prompt on uninstalled AI tool pages"
```

---
### Task 11: Settings AI Features Section

**Files:**
- Create: `apps/web/src/components/settings/ai-features-section.tsx`
- Modify: `apps/web/src/components/settings/settings-dialog.tsx`

- [ ] **Step 1: Create AiFeaturesSection component**

Follow the card-based layout of existing sections in `settings-dialog.tsx`. Use `useFeaturesStore()`. Render each bundle as a bordered card (`rounded-lg border border-border`) with: name, description, status indicator (green dot = installed, gray = not installed, spinning = installing), estimated size, Install/Uninstall button. Add an "Install All" button at the top. Show total disk usage at the bottom (fetched from `GET /api/v1/admin/features/disk-usage`). Reuse the toggle/button patterns from `ToolsSection`.

- [ ] **Step 2: Add section to settings-dialog.tsx**

Add `"ai-features"` to the `Section` type union. Add to `NAV_ITEMS` between `"api-keys"` and `"tools"`: `{ id: "ai-features", label: "AI Features", icon: Sparkles, requiredPermission: "settings:write" }`. Import `Sparkles` from lucide-react. Add `{section === "ai-features" && <AiFeaturesSection />}` to the conditional render block. Lazy-import `AiFeaturesSection` from `"./ai-features-section"`.

- [ ] **Step 3: Commit**

```bash
git add apps/web/src/components/settings/ai-features-section.tsx apps/web/src/components/settings/settings-dialog.tsx
git commit -m "feat: add AI Features settings panel for managing feature bundles"
```

---
### Task 12: Dockerfile Restructuring

**Files:**

- Modify: `docker/Dockerfile`
- Modify: `docker/entrypoint.sh`

- [ ] **Step 1: Modify the Dockerfile**

In `docker/Dockerfile` production stage:

1. **Keep**: base image selection, Node.js install, pnpm setup, system packages, Python venv creation with base packages (numpy, Pillow, opencv)
2. **Remove**: all ML pip install commands (lines 175-206: onnxruntime, rembg, realesrgan, paddlepaddle, mediapipe, codeformer)
3. **Remove**: download_models.py COPY and RUN (lines 219-231)
4. **Remove**: the `apt-get purge build-essential python3-dev` line (line 251) so build-essential stays for runtime pip installs
5. **Add**: `COPY docker/feature-manifest.json /app/docker/feature-manifest.json`
6. **Add**: `COPY packages/ai/python/install_feature.py /app/packages/ai/python/install_feature.py`
7. **Update** env vars: `PYTHON_VENV_PATH=/data/ai/venv`, add `MODELS_PATH=/data/ai/models`, add `DATA_DIR=/data`

- [ ] **Step 2: Update entrypoint.sh for venv bootstrap**

Add venv bootstrap after auth defaults and before the volume permission fix. Use an atomic directory rename to prevent a corrupt venv from a partial copy:

```sh
AI_VENV="/data/ai/venv"
AI_VENV_TMP="/data/ai/venv.bootstrapping"

# Clean up any interrupted bootstrap from a previous start
if [ -d "$AI_VENV_TMP" ]; then
  echo "Cleaning up interrupted venv bootstrap..."
  rm -rf "$AI_VENV_TMP"
fi

# Bootstrap AI venv from base image on first run
if [ ! -d "$AI_VENV" ] && [ -d "/opt/venv" ]; then
  echo "Bootstrapping AI venv from base image..."
  mkdir -p /data/ai/models /data/ai/pip-cache
  cp -r /opt/venv "$AI_VENV_TMP"
  mv "$AI_VENV_TMP" "$AI_VENV"
  echo "AI venv ready at $AI_VENV"
fi
```

The `cp -r` + `mv` pattern ensures `/data/ai/venv` is either fully present or absent — never half-copied. If the container is killed during `cp -r`, the `.bootstrapping` directory is cleaned up on next start.

- [ ] **Step 3: Build and verify**

```bash
docker build -f docker/Dockerfile -t ashim:dev .
docker images ashim:dev --format "{{.Size}}"
```

Expected: Image size ~5-6 GB (amd64) instead of ~30 GB.

- [ ] **Step 4: Commit**

```bash
git add docker/Dockerfile docker/entrypoint.sh
git commit -m "feat: restructure Dockerfile to remove ML packages and models

Base image now includes only Node.js + Sharp + Python with base deps.
AI features are downloaded on-demand via the feature install system.
Image reduced from ~30GB to ~5-6GB (amd64) / ~2-3GB (arm64)."
```

---

### Task 13: Integration Testing

**Files:**

- Create: `tests/e2e-docker/features.spec.ts`

- [ ] **Step 1: Create Docker e2e tests for feature system**

Create `tests/e2e-docker/features.spec.ts` using the existing `playwright.docker.config.ts` infrastructure:

```ts
import { expect, test } from "@playwright/test";

test.describe("On-demand AI features", () => {
  test("GET /api/v1/features returns all 6 bundles", async ({ request }) => {
    const response = await request.get("/api/v1/features");
    expect(response.ok()).toBeTruthy();
    const data = await response.json();
    expect(data.bundles).toHaveLength(6);
    for (const bundle of data.bundles) {
      expect(bundle).toHaveProperty("id");
      expect(bundle).toHaveProperty("name");
      expect(bundle).toHaveProperty("status");
      expect(bundle).toHaveProperty("enablesTools");
    }
  });

  test("AI tool returns 501 FEATURE_NOT_INSTALLED when bundle not installed", async ({ request }) => {
    const pngBuffer = Buffer.from(
      "iVBORw0KGgoAAAANSUhEUgAAAAEAAAABCAYAAAAfFcSJAAAADUlEQVR42mP8/5+hHgAHggJ/PchI7wAAAABJRU5ErkJggg==",
      "base64",
    );
    const response = await request.post("/api/v1/tools/remove-background", {
      multipart: {
        file: { name: "test.png", mimeType: "image/png", buffer: pngBuffer },
        settings: JSON.stringify({}),
      },
    });
    expect(response.status()).toBe(501);
    const body = await response.json();
    expect(body.code).toBe("FEATURE_NOT_INSTALLED");
    expect(body.feature).toBe("background-removal");
  });

  test("uninstalled AI tool page shows install prompt for admin", async ({ page }) => {
    await page.goto("/remove-background");
    await expect(page.getByText("Enable")).toBeVisible({ timeout: 10000 });
    await expect(page.getByText("additional download")).toBeVisible();
  });
});
```

- [ ] **Step 2: Commit**

```bash
git add tests/e2e-docker/features.spec.ts
git commit -m "test: add e2e tests for on-demand AI feature system"
```

---

### Task Summary

| Task | Description | Key Files |
|------|------------|-----------|
| 1 | Shared types and bundle definitions | `packages/shared/src/features.ts` |
| 2 | Feature manifest JSON | `docker/feature-manifest.json` |
| 3 | Backend feature status service | `apps/api/src/lib/feature-status.ts` |
| 4 | Feature API routes | `apps/api/src/routes/features.ts` |
| 5 | Python install script | `packages/ai/python/install_feature.py` |
| 6 | Tool route guards | `tool-factory.ts`, `batch.ts`, `pipeline.ts` |
| 7 | Bridge + Python sidecar changes | `dispatcher.py`, `colorize.py`, `restore.py` |
| 8 | Frontend features store + error handling | `features-store.ts`, `api.ts` |
| 9 | Frontend tool grid badge | `tool-card.tsx`, `tool-panel.tsx` |
| 10 | Frontend tool page install prompt | `feature-install-prompt.tsx`, `tool-page.tsx` |
| 11 | Settings AI Features section | `ai-features-section.tsx`, `settings-dialog.tsx` |
| 12 | Dockerfile restructuring | `Dockerfile`, `entrypoint.sh` |
| 13 | Integration testing | `tests/e2e-docker/features.spec.ts` |

# On-Demand AI Feature Downloads

**Date:** 2026-04-17
**Status:** Approved
**Goal:** Reduce Docker image from ~30 GB to ~5-6 GB (amd64) / ~2-3 GB (arm64) by making AI features downloadable post-install.

## Problem

The Docker image bundles all Python ML packages (~8-10 GB) and model weights (~5-8 GB) regardless of whether users need AI features. Users who only want basic image tools (resize, crop, convert) must pull ~30 GB.

## Design Decisions

- **Single Docker image** — no lite/full variants
- **Individual feature bundles** — users cherry-pick by feature name, not model name
- **Admin-only downloads** — only admins can enable/disable AI features
- **AI tools visible with badge** — uninstalled tools appear in the grid with a download indicator
- **Both tool-page and settings UI** — admins can download from the tool page or from a central management panel in settings

## Architecture

### Base Image Contents

The base image includes everything needed for non-AI tools plus the prerequisites for AI feature installation:

| Component | Rationale |
|-----------|-----------|
| Node.js 22 + pnpm + app source + frontend dist | Core application |
| Sharp, imagemagick, tesseract-ocr, potrace, libheif, exiftool | Non-AI image processing |
| caire binary | Content-aware resize |
| Python 3 + pip + build-essential | Required for pip install at runtime |
| numpy==1.26.4, Pillow, opencv-python-headless | Shared by all AI features, small (~300 MB) |
| CUDA runtime (amd64 only, from nvidia/cuda base) | Required for GPU-accelerated AI |

**Estimated size:** ~5-6 GB (amd64), ~2-3 GB (arm64)

### Feature Bundles

Six user-facing bundles, named by what they enable (not by model names). **Each tool belongs to exactly one bundle — no partial functionality.** When a bundle is installed, all its tools work fully. When it's not installed, those tools are locked entirely.

| Feature Name | Python Packages | Models | Tools Fully Enabled | Est. Size |
|---|---|---|---|---|
| **Background Removal** | rembg, onnxruntime(-gpu), mediapipe | birefnet-general-lite, blaze_face, face_landmarker | remove-background, passport-photo | ~700 MB - 1 GB |
| **Face Detection** | mediapipe | blaze_face, face_landmarker | blur-faces, red-eye-removal, smart-crop | ~200-300 MB |
| **Object Eraser & Colorize** | onnxruntime(-gpu) | LaMa ONNX, DDColor ONNX, OpenCV colorize | erase-object, colorize | ~600-800 MB |
| **Upscale & Enhance** | torch, torchvision, realesrgan, codeformer-pip (--no-deps), gfpgan, basicsr, lpips | RealESRGAN x4plus, GFPGANv1.3, CodeFormer (.pth), facexlib, SCUNet, NAFNet | upscale, enhance-faces, noise-removal | ~4-5 GB |
| **Photo Restoration** | onnxruntime(-gpu), mediapipe | LaMa ONNX, DDColor ONNX, CodeFormer ONNX, blaze_face, face_landmarker, OpenCV colorize | restore-photo | ~800 MB - 1 GB |
| **OCR** | paddlepaddle(-gpu), paddleocr | PP-OCRv5 (7 models), PaddleOCR-VL 1.5 | ocr | ~3-4 GB |

Notes:

- `passport-photo` is in the Background Removal bundle because it primarily needs rembg; mediapipe (for face landmarks) is included in the same bundle so the tool works fully
- `noise-removal` is in the Upscale & Enhance bundle because its quality/maximum tiers need PyTorch; all 4 tiers (including the OpenCV-based quick/balanced tiers) are locked until the bundle is installed
- `ocr` is fully locked until the OCR bundle is installed, including the Tesseract-based fast tier — this keeps the UX clean even though Tesseract is pre-installed in the base image
- `restore-photo` is its own bundle because it needs models from multiple domains (inpainting, face enhancement, colorization); all stages work when installed
- Some packages appear in multiple bundles (e.g., mediapipe in Background Removal, Face Detection, and Photo Restoration; onnxruntime in Background Removal, Object Eraser, and Photo Restoration). The install script skips already-installed packages — pip handles this naturally
- Some models appear in multiple bundles (e.g., blaze_face in both Background Removal and Face Detection). The install script skips already-downloaded model files

### Bundle Dependencies

```
Background Removal ──────── standalone
Face Detection ──────────── standalone
Object Eraser & Colorize ── standalone
Upscale & Enhance ───────── standalone
Photo Restoration ───────── standalone
OCR ─────────────────────── standalone
```

All bundles are independently installable. Shared packages (mediapipe, onnxruntime) and shared models (blaze_face, LaMa, etc.) are silently skipped if already present from another bundle.

### Single Venv Strategy

The current architecture uses a single venv at `/opt/venv` (set via `PYTHON_VENV_PATH`). The bridge (`bridge.ts`) constructs `${venvPath}/bin/python3` — it can only point to one interpreter. Having two venvs (base at `/opt/venv`, features at `/data/ai/venv/`) is fragile: C extensions and entry points reference their venv prefix, and `PYTHONPATH` hacks break in practice.

**Solution:** Use a single venv on the persistent volume at `/data/ai/venv/`.

- The Dockerfile creates `/opt/venv` with base packages (numpy, Pillow, opencv) as before
- The entrypoint script bootstraps `/data/ai/venv/` on first run by copying `/opt/venv` into it (fast file copy, ~300 MB)
- `PYTHON_VENV_PATH` is set to `/data/ai/venv/` so the bridge uses it
- Feature installs add packages to this same venv
- On container update, the entrypoint checks if base package versions changed and updates the venv accordingly (pip install from wheel cache)

This gives us one venv with all packages, living on a persistent volume, bootstrapped from the image's base packages.

### Persistent Storage

All AI data lives under `/data/ai/` on the existing Docker volume (no docker-compose changes):

```
/data/ai/
  venv/           # Single Python virtual environment (bootstrapped from /opt/venv, extended by feature installs)
  models/         # Downloaded model weight files (same structure as /opt/models/)
  pip-cache/      # Wheel cache for fast re-installs after updates
  installed.json  # Tracks installed bundles, versions, timestamps
```
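
The exact shape of `installed.json` is settled during implementation; as a minimal sketch (the field names here are assumptions, not a finalized schema), the install script might maintain it like this:

```python
from datetime import datetime, timezone


def make_install_record(bundle_id: str, image_version: str) -> dict:
    """Build the per-bundle record stored in /data/ai/installed.json.
    Field names are illustrative assumptions, not a finalized schema."""
    return {
        "bundleId": bundle_id,
        "installedVersion": image_version,  # image version the bundle was installed from
        "installedAt": datetime.now(timezone.utc).isoformat(),
        "status": "installed",
    }


def upsert_bundle(state: dict, record: dict) -> dict:
    """Merge one bundle record into the overall installed.json document."""
    bundles = dict(state.get("bundles", {}))
    bundles[record["bundleId"]] = record
    return {"schemaVersion": 1, "bundles": bundles}
```

Whatever the final schema, it should carry enough (version, timestamp, status) for the update-flow comparison and the startup recovery checks described below.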

### Feature Manifest

A `feature-manifest.json` file is baked into each Docker image at build time. It is the single source of truth for what each bundle installs:

```json
{
  "manifestVersion": 1,
  "imageVersion": "1.16.0",
  "pythonVersion": "3.12",
  "basePackages": ["numpy==1.26.4", "Pillow==11.1.0", "opencv-python-headless==4.10.0.84"],
  "bundles": {
    "background-removal": {
      "name": "Background Removal",
      "description": "Remove image backgrounds with AI",
      "packages": {
        "common": ["rembg==2.0.62"],
        "amd64": ["onnxruntime-gpu==1.20.1", "mediapipe==0.10.21"],
        "arm64": ["onnxruntime==1.20.1", "rembg[cpu]==2.0.62", "mediapipe==0.10.18"]
      },
      "pipFlags": {},
      "models": [
        {
          "id": "birefnet-general-lite",
          "downloadFn": "rembg_session",
          "args": ["birefnet-general-lite"]
        },
        {
          "id": "blaze-face-short-range",
          "url": "https://storage.googleapis.com/mediapipe-models/face_detector/blaze_face_short_range/float16/latest/blaze_face_short_range.tflite",
          "path": "mediapipe/blaze_face_short_range.tflite",
          "minSize": 100000
        },
        {
          "id": "face-landmarker",
          "url": "https://storage.googleapis.com/mediapipe-models/face_landmarker/face_landmarker/float16/latest/face_landmarker.task",
          "path": "mediapipe/face_landmarker.task",
          "minSize": 5000000
        }
      ],
      "enablesTools": ["remove-background", "passport-photo"]
    },
    "upscale-enhance": {
      "name": "Upscale & Enhance",
      "description": "AI upscaling, face enhancement, and noise removal",
      "packages": {
        "common": ["codeformer-pip==0.0.4", "lpips"],
        "amd64": [
          "torch torchvision --extra-index-url https://download.pytorch.org/whl/cu126",
          "realesrgan==0.3.0 --extra-index-url https://download.pytorch.org/whl/cu126"
        ],
        "arm64": ["torch", "torchvision", "realesrgan==0.3.0"]
      },
      "pipFlags": {
        "codeformer-pip==0.0.4": "--no-deps"
      },
      "postInstall": ["pip install numpy==1.26.4"],
      "models": [
        { "id": "realesrgan-x4plus", "url": "https://github.com/xinntao/Real-ESRGAN/releases/download/v0.1.0/RealESRGAN_x4plus.pth", "path": "realesrgan/RealESRGAN_x4plus.pth", "minSize": 67000000 },
        { "id": "gfpgan-v1.3", "url": "https://github.com/TencentARC/GFPGAN/releases/download/v1.3.0/GFPGANv1.3.pth", "path": "gfpgan/GFPGANv1.3.pth", "minSize": 332000000 },
        { "id": "codeformer-pth", "url": "https://github.com/sczhou/CodeFormer/releases/download/v0.1.0/codeformer.pth", "path": "codeformer/codeformer.pth", "minSize": 375000000 },
        { "id": "codeformer-onnx", "url": "hf://facefusion/models-3.0.0/codeformer.onnx", "path": "codeformer/codeformer.onnx", "minSize": 377000000 },
        { "id": "facexlib-detection", "url": "https://github.com/xinntao/facexlib/releases/download/v0.1.0/detection_Resnet50_Final.pth", "path": "gfpgan/facelib/detection_Resnet50_Final.pth", "minSize": 104000000 },
        { "id": "facexlib-parsing", "url": "https://github.com/xinntao/facexlib/releases/download/v0.2.2/parsing_parsenet.pth", "path": "gfpgan/facelib/parsing_parsenet.pth", "minSize": 85000000 },
        { "id": "scunet", "url": "https://github.com/cszn/KAIR/releases/download/v1.0/scunet_color_real_psnr.pth", "path": "scunet/scunet_color_real_psnr.pth", "minSize": 4000000 },
        { "id": "nafnet", "url": "hf://mikestealth/nafnet-models/NAFNet-SIDD-width64.pth", "path": "nafnet/NAFNet-SIDD-width64.pth", "minSize": 67000000 }
      ],
      "enablesTools": ["upscale", "enhance-faces", "noise-removal"]
    }
  }
}
```
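
To illustrate how the `packages` and `pipFlags` fields compose, here is a hedged sketch of turning a bundle entry into pip invocations. `resolve_pip_args` and the cache-dir flag placement are assumptions for illustration, not the actual `install_feature.py` API:

```python
def resolve_pip_args(bundle: dict, arch: str) -> list[list[str]]:
    """Turn a bundle's manifest entry into per-package pip argument lists.

    Package strings may embed extra flags (e.g. '--extra-index-url ...'),
    and pipFlags adds per-package flags such as '--no-deps'.
    """
    specs = bundle["packages"].get("common", []) + bundle["packages"].get(arch, [])
    flags = bundle.get("pipFlags", {})
    commands = []
    for spec in specs:
        args = spec.split()  # embedded flags become separate argv entries
        extra = flags.get(spec)
        if extra:
            args += extra.split()
        commands.append(["pip", "install", "--cache-dir", "/data/ai/pip-cache", *args])
    return commands
```

Keeping `--cache-dir` on the persistent volume is what makes retries and post-update reinstalls cheap.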

### Install Script

A Python script (`packages/ai/python/install_feature.py`) handles feature installation:

1. Reads the feature manifest from the image
2. Detects architecture (amd64/arm64) and GPU availability
3. Creates or reuses the venv at `/data/ai/venv/`
4. Runs pip install with the correct packages, flags, and index URLs per platform
5. Handles the numpy version conflict (--no-deps for codeformer, re-pin numpy)
6. Downloads model weights with retry logic (ported from `download_models.py`)
7. Updates `/data/ai/installed.json` with bundle status
8. Reports progress to stdout as JSON lines (consumed by the Node bridge)

The script must be idempotent — running it twice for the same bundle is a no-op.
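
Step 8's progress protocol might look like the following sketch; the exact field names are an assumption to be agreed with the Node bridge, which parses one JSON object per line:

```python
import json
import sys


def emit_progress(stage: str, percent: int, stream=sys.stdout) -> str:
    """Write one machine-readable progress line for the Node bridge.

    Flushing per line matters: the bridge reads the child's stream
    incrementally and must not wait for the process to exit.
    """
    line = json.dumps({"type": "progress", "stage": stage, "percent": percent})
    print(line, file=stream, flush=True)
    return line
```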

### Uninstall and Shared Package Strategy

Bundles share Python packages (e.g., onnxruntime in Background Removal, Object Eraser, and Photo Restoration). Naively pip-uninstalling a bundle's packages could break other installed bundles.

**v1 approach (simple):** Uninstall removes model files and updates `installed.json`. Orphaned pip packages stay in the venv — they use disk but don't cause issues. A "Clean up" button in the AI Features settings panel rebuilds the venv from scratch: it creates a fresh venv, installs only the packages needed by currently-installed bundles, and removes the old venv.

**Future improvement:** Reference counting — track which bundles need which packages, and only remove packages exclusively owned by the target bundle.

### Tool Route Registration for Uninstalled Features

Currently `registerToolRoutes()` either registers a route or doesn't (disabled tools get 404). For uninstalled AI features, we need routes that return a structured error instead of 404.

**Solution: Register ALL tool routes always, add a pre-processing guard.**

In `tool-factory.ts`, after settings validation (around line 198) and before calling `config.process()`, check feature installation status. The response follows the existing `{ error, details }` shape used by `formatZodErrors` (`apps/api/src/lib/errors.ts`) and consumed by `parseApiError` (`apps/web/src/lib/api.ts`):

```typescript
if (isAiTool(config.toolId) && !isFeatureInstalled(config.toolId)) {
  const bundle = getBundleForTool(config.toolId);
  return reply.status(501).send({
    error: "Feature not installed",
    code: "FEATURE_NOT_INSTALLED",
    feature: bundle.id,
    featureName: bundle.name,
    estimatedSize: bundle.estimatedSize,
  });
}
```

This also applies to:

- `restore-photo.ts` (uses its own route handler, not the factory)
- `batch.ts` — the `getToolConfig(toolId)` call at line 35 is the gating point. Add a feature-installed check alongside the existing 404 check.
- `pipeline.ts` — the pre-validation loops (lines 143-172 for execute, lines 441-462 for batch) already validate all tool IDs before processing starts. Extend them to also check feature installation.

**Frontend error detection:** Extend `parseApiError` in `apps/web/src/lib/api.ts` to detect the `FEATURE_NOT_INSTALLED` code and return structured data (bundle id, name, size) instead of a plain error string. This enables `use-tool-processor.ts` and `use-pipeline-processor.ts` (both already use `parseApiError`) to trigger the install prompt rather than showing a generic error.

The global Fastify error handler in `apps/api/src/index.ts` (lines 41-51) provides a safety net — any unhandled Python import errors will produce structured JSON rather than crashing.

### API Endpoints

New routes — the read endpoint is public (no `/admin/` prefix), mutation endpoints are admin-only:

```
GET /api/v1/features
  Returns: list of all bundles with install status, sizes, enabled tools
  Auth: any authenticated user (read-only, needed by frontend for badges/tool page state)
  Response: {
    bundles: [{
      id: "background-removal",
      name: "Background Removal",
      description: "Remove image backgrounds with AI",
      status: "not_installed" | "installing" | "installed" | "error",
      installedVersion: "1.15.3" | null,
      estimatedSize: "500-700 MB",
      enablesTools: ["remove-background"],
      progress: { percent: 45, stage: "Downloading models..." } | null,
      error: "pip install failed: ..." | null,
      dependencies: [] | ["upscale-enhance"]
    }]
  }

POST /api/v1/admin/features/:bundleId/install
  Starts background installation of a feature bundle.
  Auth: admin only
  Response: { jobId: "uuid" }
  SSE progress at: GET /api/v1/jobs/:jobId/progress

POST /api/v1/admin/features/:bundleId/uninstall
  Removes a feature bundle (pip packages + models).
  Auth: admin only
  Response: { ok: true, freedSpace: "500 MB" }

GET /api/v1/admin/features/disk-usage
  Returns total disk usage of /data/ai/.
  Auth: admin only
  Response: { totalBytes: 5368709120, byBundle: { "background-removal": 734003200, ... } }
```

### Background Job Mechanism

Feature installation runs as a background child process (not inline with the HTTP request):

1. `POST /admin/features/:bundleId/install` spawns the install script as a child process
2. Progress is streamed via stdout JSON lines → captured by the Node process → pushed to SSE listeners
3. The existing SSE infrastructure (`/api/v1/jobs/:jobId/progress`) is reused
4. Job status is persisted to the `jobs` table for recovery on restart
5. Only one install can run at a time (mutex). Concurrent install requests return 409 Conflict.

### Python Sidecar Changes

**dispatcher.py:**

- On startup, read `/data/ai/installed.json` to know which features are available
- Populate `available_modules` based on what's actually installed
- When a script is requested for an uninstalled feature, return a structured error: `{"error": "feature_not_installed", "feature": "background-removal", "message": "Background Removal is not installed"}`
- After a feature is installed, the dispatcher must be restarted (or sent a reload signal) to pick up new packages. The bridge handles this by killing and re-spawning the dispatcher.

**Python scripts:**

- Convert hard module-level imports in `colorize.py` and `restore.py` to lazy imports inside functions
- All scripts should check for their feature's models and return a clear "not installed" error if missing
- The `sys.path` must include `/data/ai/venv/lib/python3.X/site-packages/` (set by the dispatcher on startup based on installed.json)

**Bridge (bridge.ts):**

- Update the `PYTHON_VENV_PATH` logic to prefer `/data/ai/venv/` when it exists
- Add a `restartDispatcher()` function called after a feature install completes
- Handle the new `feature_not_installed` error type from the dispatcher

### Model Path Resolution

Currently models are at `/opt/models/`. With on-demand downloads, they'll be at `/data/ai/models/`. The resolution order:

1. `/opt/models/<model>` (Docker-baked, for backwards compatibility if someone builds a full image)
2. `/data/ai/models/<model>` (on-demand download location)
3. `~/.cache/ashim/<model>` (local dev fallback)

Environment variables (`U2NET_HOME`, etc.) are updated by the install script to point to `/data/ai/models/`.

### Dockerfile Changes

1. Remove all `pip install` commands for ML packages (lines 175-206)
2. Remove `download_models.py` COPY and RUN (lines 219-231)
3. Keep: Python 3 + pip + build-essential (do NOT purge build-essential)
4. Keep: numpy, Pillow, opencv-python-headless install (lightweight shared deps)
5. Add: COPY `feature-manifest.json` into the image
6. Add: COPY `install_feature.py` into the image
7. Update the entrypoint to set up the `/data/ai/` directory structure on first run
8. Update env vars: `MODELS_PATH=/data/ai/models` as the default, with fallback to `/opt/models`

### Frontend: Tool Page (Uninstalled State)

When a user navigates to an AI tool that isn't installed:

**For admins:**

- Show a card replacing the normal upload area:
  - Feature icon + name (e.g., "Background Removal")
  - "This feature requires an additional download (~500-700 MB)"
  - [Enable Feature] button
- After clicking: progress bar with stage text, estimated time
- On completion: the page automatically transitions to the normal tool UI

**For non-admins:**

- Show: "This feature is not enabled. Ask your administrator to enable it in Settings."

### Frontend: Tool Grid (Badge)

AI tools in the grid show a small download icon overlay when not installed. When installed, the icon disappears and the tool looks like any other tool.

Because each tool belongs to exactly one bundle, the badge clears as soon as that bundle is installed (e.g., passport-photo's badge clears when Background Removal is installed, since mediapipe ships in that bundle).

### Frontend: Settings Panel

New "AI Features" section in the settings dialog (admin only):

- List of all 6 feature bundles as cards
- Each card shows: name, description, status (installed/not installed/installing), disk usage
- Install/Uninstall buttons per bundle
- "Install All" button at the top
- Total AI disk usage summary at the bottom
- Progress bar during installation
- Dependency warnings if a bundle ever declares dependencies (all v1 bundles are standalone; the API's `dependencies` field reserves this)

### Container Update Flow

When a user does `docker pull` + restart:

1. **Pull:** Only app code layers changed → ~50-100 MB download
2. **Startup:** Backend reads the feature manifest from the new image + installed.json from the volume
3. **Comparison:**
   - If bundle package versions unchanged → no action, instant startup
   - If a package version bumped → `pip install --upgrade` from the wheel cache (seconds)
   - If a model URL/version changed → re-download that model only
   - If the Python major version changed → rebuild the venv from cached wheels (rare, ~2-5 min)
4. **Dispatcher restart** if any packages changed

This check runs at startup, not blocking the HTTP server. AI features show "Updating..." status until the check completes.
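
The comparison in step 3 might be sketched as follows, assuming `installed.json` records the resolved package lists per bundle at install time (that field is an assumption, not a finalized schema):

```python
def bundles_needing_update(manifest: dict, installed: dict) -> list[str]:
    """List installed bundles whose package set in the new image manifest
    differs from what was recorded at install time."""
    stale = []
    for bundle_id, record in installed.get("bundles", {}).items():
        if record.get("status") != "installed":
            continue
        current = manifest["bundles"].get(bundle_id, {}).get("packages")
        if current != record.get("packages"):
            stale.append(bundle_id)
    return stale
```

A finer-grained check could diff individual package specs so only the changed packages are reinstalled.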

### Robustness and Crash Recovery

The install system must handle every interruption gracefully: double-clicks, browser closes, container restarts mid-install, network failures, disk-full, corrupt downloads, and power loss.

#### Atomic Operations

**Model downloads** — never write directly to the final path:

1. Download to `<model_path>.downloading`
2. Verify the file size against `minSize` from the manifest
3. `os.rename()` to the final path (atomic on the same filesystem)
4. If the process dies mid-download, the `.downloading` file is an obvious orphan
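
A minimal sketch of the atomic download, with the fetch step injected so the retry/backoff and HTTP details stay out of the way (the function names are illustrative, not the actual `install_feature.py` API):

```python
import os


def download_model(fetch, final_path: str, min_size: int) -> str:
    """Download via fetch(tmp_path) to '<path>.downloading', verify size,
    then atomically rename. A crash leaves either nothing at the final path
    or an obvious '.downloading' orphan for the startup sweep to remove."""
    tmp_path = final_path + ".downloading"
    os.makedirs(os.path.dirname(final_path) or ".", exist_ok=True)
    fetch(tmp_path)  # e.g. urllib.request.urlretrieve(url, tmp_path)
    if os.path.getsize(tmp_path) < min_size:
        os.remove(tmp_path)
        raise RuntimeError(f"{final_path}: file smaller than {min_size} bytes, likely truncated")
    os.rename(tmp_path, final_path)  # atomic on the same filesystem
    return final_path
```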

**installed.json writes** — never write in-place:

1. Write to `installed.json.tmp`
2. `os.rename()` to `installed.json`
3. If the process dies mid-write, `installed.json` is intact (either the old version or doesn't exist yet)
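
The same temp-plus-rename pattern for `installed.json`, with an `fsync` so the data is on disk before the rename (a standard hardening step; whether the real implementation fsyncs is an open choice):

```python
import json
import os


def write_installed_json(state: dict, path: str = "/data/ai/installed.json") -> None:
    """Atomically replace installed.json: a crash leaves either the old
    file or the new one, never a torn write."""
    tmp = path + ".tmp"
    with open(tmp, "w") as fh:
        json.dump(state, fh, indent=2)
        fh.flush()
        os.fsync(fh.fileno())
    os.rename(tmp, path)
```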

**Venv bootstrap** (entrypoint) — same pattern:

1. Copy `/opt/venv` to `/data/ai/venv.bootstrapping/`
2. Rename to `/data/ai/venv/` on completion
3. If interrupted, the `.bootstrapping/` directory is cleaned up on the next start

#### File-Based Install Lock

In-memory `installInProgress` state is lost on container restart. Use a persistent lock file instead.

**`/data/ai/install.lock`** contains:

```json
{ "bundleId": "background-removal", "startedAt": "2026-04-17T12:00:00Z", "pid": 12345 }
```

- Created before the install starts, deleted on success or acknowledged failure
- On server startup, if the lock exists: check whether the PID is alive. If dead → the install was interrupted mid-flight
- If the lock is stale (PID dead), mark the bundle as needing cleanup and delete the lock

#### Startup Recovery Sequence

On server startup (in `apps/api/src/index.ts`, after `runMigrations()`), run a recovery check:

1. **Clean orphan temp files:** Delete any `*.downloading` files in `/data/ai/models/` (recursive)
2. **Clean orphan JSON:** If `installed.json.tmp` exists, delete it
3. **Clean orphan venv bootstrap:** If `/data/ai/venv.bootstrapping/` exists, delete it
4. **Check install lock:** If `/data/ai/install.lock` exists:
   - Read the PID from the lock file
   - If the PID is not running → the install was interrupted
   - Delete the lock file
   - Log: "Previous installation of {bundleId} was interrupted, cleaned up"
5. **Verify installed bundles:** For each bundle in `installed.json`, check that all model files exist and meet the minimum sizes from the manifest. If any model is missing or undersized:
   - Mark the bundle status as `"error"` with the message "Some model files are missing or corrupt. Please reinstall."
   - Do NOT automatically remove it from installed.json — let the admin decide to reinstall or uninstall
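
The filesystem half of the sweep (steps 1-2 and 5) can be sketched in Python; the real code runs in the Node server at startup, so the names and layout here are assumptions:

```python
from pathlib import Path


def clean_orphans(ai_root: str) -> list[str]:
    """Delete interrupted-download and temp-JSON orphans; return what was removed."""
    root = Path(ai_root)
    removed = []
    for orphan in root.glob("models/**/*.downloading"):
        orphan.unlink()
        removed.append(str(orphan))
    tmp = root / "installed.json.tmp"
    if tmp.exists():
        tmp.unlink()
        removed.append(str(tmp))
    return removed


def verify_models(models_root: str, models: list[dict]) -> list[str]:
    """Return ids of manifest models that are missing or below minSize."""
    bad = []
    for model in models:
        path = Path(models_root) / model["path"]
        if not path.is_file() or path.stat().st_size < model.get("minSize", 0):
            bad.append(model["id"])
    return bad
```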

#### Frontend Button Hardening

**Double-click prevention:**

- Disable the button on first click (set `installing = true` immediately, before the API call)
- The button should be `disabled={installing || bundle.status === "installing"}`
- Even if the component re-renders, the disabled state persists from the store

**Browser close / navigate away:**

- The install runs as a server-side child process — it completes regardless of browser state
- When the user returns to the page, `useFeaturesStore.fetch()` picks up the current status
- If an install is in progress, the UI shows the progress bar (driven by polling, not just SSE)

**SSE connection loss fallback:**
|
||||
- The `FeatureInstallPrompt` component uses `EventSource` for real-time progress
|
||||
- If the `EventSource` connection drops (`onerror`), fall back to polling `GET /api/v1/features` every 3 seconds
|
||||
- When the install completes (status changes from "installing" to "installed" or "error"), stop polling
|
||||
|
||||
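A minimal sketch of that polling fallback, with the fetch injected so the stop condition is visible. `fetchStatus` stands in for a wrapper around `GET /api/v1/features`; in the component, this would be started from the `EventSource` `onerror` handler:

```typescript
type Status = "installing" | "installed" | "error";

function startFallbackPolling(
  fetchStatus: () => Promise<Status>,
  onDone: (s: Status) => void,
  intervalMs = 3000, // the 3-second cadence from the design above
): () => void {
  const timer = setInterval(async () => {
    const status = await fetchStatus();
    if (status !== "installing") {
      clearInterval(timer); // terminal state reached: stop polling
      onDone(status);
    }
  }, intervalMs);
  return () => clearInterval(timer); // manual cancel, e.g. on component unmount
}
```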
**Page refresh during install:**

- On mount, the features store calls `fetch()`, which returns current bundle states including install progress
- If a bundle has `status: "installing"`, the component immediately shows the progress bar and opens an EventSource for the in-progress job

**Multiple admins:**

- Server mutex: only one install at a time (409 Conflict)
- The features store status reflects the global state — ALL admin sessions see "installing"
- The install lock file prevents even a container restart from allowing a concurrent install

#### Error Handling

| Scenario | Behavior |
|---|---|
| Double-click on Enable | Button disabled on first click. Second click is a no-op. |
| Browser closed mid-install | Server-side install continues. Status visible on next page load. |
| Container restart mid-install | Startup recovery detects the stale lock, cleans up `.downloading` files, marks the bundle as error. Admin can retry. |
| Network failure mid-pip-install | pip returns non-zero. Install script emits an error. Bundle marked as "error" with the pip output. Admin can retry (the pip cache means previously downloaded wheels aren't re-downloaded). |
| Network failure mid-model-download | `.downloading` file left behind. Retry 3 times with exponential backoff. On final failure, bundle marked as "error". On retry, the `.downloading` file is deleted and re-downloaded. |
| Disk full | Check available disk space at the START of install (before any pip/download). Return a clear error: "Not enough disk space. Need ~{estimatedSize}, only {available} available." If the disk fills mid-install, pip/download fails and the bundle is marked as error. |
| pip succeeds, models fail | Bundle is NOT marked as installed. Status is "error" with a message about which models failed. Packages remain in the venv (harmless). Admin can retry — pip install is idempotent (skips already-installed packages), and only failed models are re-downloaded. |
| Model file corrupt (downloads completely but data is bad) | Verify file size against `minSize` after download. If too small, delete and retry. For rembg/HuggingFace models, the library's own integrity checks apply. |
| installed.json corrupted | Atomic writes prevent this. If somehow corrupted (manual edit, etc.), `JSON.parse` fails; treat as empty (no bundles installed). Log a warning. |
| Power loss | Atomic operations ensure no file is in a half-written state. Startup recovery cleans up orphans. |
| No internet during install | pip fails immediately with a clear network error. Model downloads fail after retries. Bundle marked as "error". |

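The model-download retry policy (3 attempts, exponential backoff, partial-file cleanup) lives in the Python install script; this TypeScript sketch shows the same policy with the download function injected so the loop is testable. All names here are hypothetical:

```typescript
import * as fs from "node:fs";

async function downloadWithRetry(
  download: () => Promise<void>,
  partialPath: string, // the *.downloading file the download writes to
  attempts = 3,
  baseDelayMs = 1000,
): Promise<void> {
  for (let i = 0; i < attempts; i++) {
    try {
      await download();
      return;
    } catch (err) {
      // Drop the partial file so the next attempt starts clean.
      if (fs.existsSync(partialPath)) fs.unlinkSync(partialPath);
      if (i === attempts - 1) throw err; // final failure → bundle marked "error"
      await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
    }
  }
}
```

The real script additionally verifies the downloaded size against `minSize` after each attempt, per the corruption row above.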
### Testing Strategy

All testing runs against Docker containers using the existing `playwright.docker.config.ts` and `tests/e2e-docker/` infrastructure:

- **Unit tests:** Feature manifest parsing, version comparison logic, bundle dependency resolution (Vitest, excluded from e2e via `vitest.config.ts`)
- **Integration tests:** Install/uninstall API endpoints, status reporting, SSE progress (Vitest integration suite)
- **E2e-docker tests:** Add to `tests/e2e-docker/` alongside the existing `fixes-verification.spec.ts`:
  - Verify an uninstalled AI tool returns 501 with the `FEATURE_NOT_INSTALLED` code
  - Admin enables a feature from settings; the tool page transitions from "not installed" to working
  - Non-admin sees a "not enabled" message on an uninstalled tool page
  - Feature install/uninstall round-trip
- **Docker build test:** Verify the base image builds without ML packages and that `feature-manifest.json` is present (CI; `SKIP_MODEL_DOWNLOADS` already exists)

### Migration Path

Since the new image is fundamentally different (no ML packages baked in), existing users upgrading from the full image will need to re-download their AI features. The Python ML packages are no longer in the system venv, so even if old model weights exist at `/opt/models/`, the features won't work without the packages.

The first-run experience for upgrading users:

1. Detect that this is an upgrade: no `/data/ai/installed.json` exists, but user data exists in `/data`
2. Show a one-time banner in the UI: "We've reduced the image size from 30 GB to 5 GB! AI features are now downloaded on-demand. Visit Settings → AI Features to enable the ones you need."
3. No automatic downloads — let the admin choose what to install
4. Old model weights at `/opt/models/` are ignored (they won't exist in the new image anyway, since that layer is removed)

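The upgrade check in step 1 might look like the following sketch. "User data exists" is approximated here as a non-empty `/data` directory, which is an assumption; the real check could look for a specific database file instead:

```typescript
import * as fs from "node:fs";
import * as path from "node:path";

function isUpgradeFromFullImage(dataDir: string): boolean {
  const installedJson = path.join(dataDir, "ai", "installed.json");
  if (fs.existsSync(installedJson)) return false; // bundle system already in use
  if (!fs.existsSync(dataDir)) return false;      // no volume mounted yet
  return fs.readdirSync(dataDir).length > 0;      // pre-existing user data → upgrade
}
```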
### Frontend: Feature Status Propagation

The frontend needs to know which tools are installed for three purposes: tool grid badges, tool page state, and the settings panel.

**Features store** (`apps/web/src/stores/features-store.ts`):

- Zustand store fetched on app load (like `settings-store.ts`)
- Calls `GET /api/v1/features` to get bundle statuses
- Provides a derived mapping: `toolInstallStatus: Record<string, "installed" | "not_installed" | "installing">` (each tool maps to exactly one bundle, so there are no partial states)
- Provides `isToolInstalled(toolId): boolean` and `getBundleForTool(toolId): BundleInfo | null` helpers
- Refreshes on install/uninstall completion

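The derived mapping can be sketched as a plain function (the real store wraps this in Zustand). The `BundleInfo` shape and the choice to surface `"error"` as `"not_installed"` are illustrative assumptions:

```typescript
type BundleStatus = "installed" | "not_installed" | "installing" | "error";
interface BundleInfo { id: string; status: BundleStatus; tools: string[]; }

type ToolStatus = "installed" | "not_installed" | "installing";

function deriveToolStatus(bundles: BundleInfo[]): Record<string, ToolStatus> {
  const map: Record<string, ToolStatus> = {};
  for (const b of bundles) {
    const status: ToolStatus =
      b.status === "installing" ? "installing"
      : b.status === "installed" ? "installed"
      : "not_installed"; // assumption: "error" surfaces as not installed + a retry prompt
    // Each tool belongs to exactly one bundle, so no merging is needed.
    for (const tool of b.tools) map[tool] = status;
  }
  return map;
}
```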
**Tool grid integration:**

- `ToolCard` checks `isToolInstalled(tool.id)` from the features store
- If not installed: show a download icon badge (similar to the existing "Experimental" badge)
- The tool remains clickable (not disabled) — clicking navigates to the tool page, where the install prompt appears
- The `PYTHON_SIDECAR_TOOLS` constant determines which tools are AI tools (only AI tools can be "not installed")

**Tool page integration:**

- The `ToolPage` component checks feature status after the tool lookup
- If the user is an admin and the feature is not installed: render the `FeatureInstallPrompt` component instead of the normal tool UI
- If the user is a non-admin and the feature is not installed: render "This feature is not enabled. Contact your administrator."
- The install prompt shows the feature name, description, estimated size, and an "Enable" button
- After clicking "Enable": show a progress bar with SSE-streamed progress, then auto-transition to the normal tool UI on completion

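The branching above reduces to one small pure function, sketched here with hypothetical names so the admin/non-admin split is explicit:

```typescript
type ToolPageView = "tool" | "install_prompt" | "not_enabled";

function pickToolPageView(isAdmin: boolean, installed: boolean): ToolPageView {
  if (installed) return "tool";           // normal tool UI
  return isAdmin ? "install_prompt" : "not_enabled";
}
```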
### Development and Testing

All development and testing is done via Docker containers — the same environment users run. Build the image locally and run it with:

```bash
docker run -d --name ashim -p 1349:1349 -v ashim-data:/data ghcr.io/ashim-hq/ashim:latest
```

Auth can be disabled for development by passing `-e AUTH_ENABLED=false`.

### Scope Boundaries

**In scope:**

- Dockerfile restructuring to remove ML packages and models
- Feature manifest system
- Install/uninstall API + background job
- Python sidecar changes for dynamic feature detection
- Frontend: tool page download prompt, grid badge, settings panel
- Container update handling with version manifest

**Out of scope (future work):**

- Additional rembg model variants as sub-downloads within Background Removal
- Automatic feature recommendations based on usage
- Download from private/custom model registries
- Bandwidth throttling for downloads
- Multiple venv support (e.g., different Python versions)