refactor: Consolidate and optimize Docker image architecture (#233)
## Overview
This PR consolidates and optimizes the Docker build system, reducing
redundancy and improving CI/CD performance. The changes eliminate
duplicate Dockerfiles, introduce a flexible build template, and optimize
release builds to reuse CI artifacts.
## Changes Summary
### 🐳 Docker Images Restructured
**Before:** 5 Dockerfiles with significant overlap
**After:** 4 focused images + 1 utility
#### Final Structure:
1. **`operator/Dockerfile`** ✨ Updated
- **Standard operator image** for CI and release builds
- Minimal node image (accepts pre-built binaries)
- GHCR: `ghcr.io/datahaven-xyz/datahaven/datahaven` (CI)
- DockerHub: `datahavenxyz/datahaven` (releases)
2. **`docker/datahaven-build.Dockerfile`** (moved from
`operator/Dockerfile`)
- Full source-to-binary build for manual releases
- DockerHub: `datahavenxyz/datahaven:{label}`
- Supports custom RUSTFLAGS and fast-runtime feature
- Only used for manual workflow_dispatch builds
3. **`docker/datahaven-production.Dockerfile`** (kept)
- Binary builder for CPU-specific releases
- Used by build-prod-binary workflow template
- Supports custom target-cpu flags
4. **`docker/datahaven-dev.Dockerfile`** ✨ NEW (local dev only)
- **FOR LOCAL DEVELOPMENT/TROUBLESHOOTING ONLY**
- Includes debug tools: gdb, strace, vim, sudo
- Extra dependencies: librocksdb-dev, curl
- RUST_BACKTRACE enabled by default
- **DO NOT USE for CI or production builds**
5. **`test/docker/crossbuild-mac-libpq.dockerfile`** (kept)
- Utility for macOS → Linux cross-compilation
#### Removed (Redundant):
- ❌ `docker/datahaven.Dockerfile` → replaced by operator/Dockerfile
- ❌ `test/docker/datahaven-node-local.dockerfile` → replaced by
datahaven-dev.Dockerfile
---
### 🔄 Workflow Improvements
#### Enhanced `publish-docker` Template
- Supports both GHCR and DockerHub registries
- Flexible inputs: dockerfile, context, build-args, cache scope
- Auto-generates OCI-compliant labels
- Reduces code duplication (~70 lines → ~15 per workflow)
#### Refactored CI Pipeline
- **`docker-build-ci`**: Builds `operator/Dockerfile` → GHCR for CI/E2E
testing
- **`docker-build-release`**: Builds `operator/Dockerfile` → DockerHub
(main branch only)
- Both CI and release workflows now use the same minimal operator image
- Release builds **reuse CI binaries** instead of rebuilding from source
#### Optimized Release Workflow
The `task-docker-release` workflow now has dual modes:
**Mode 1: `workflow_call` (CI - main pushes)**
- ✅ Reuses binary from CI's build-operator task
- ✅ Uses lightweight `operator/Dockerfile`
- ✅ Tags: `latest`, `sha-{short}`
- ⚡ **Fast**: ~5 minutes (vs ~30 min previously)
**Mode 2: `workflow_dispatch` (Manual)**
- ✅ Full source build with `datahaven-build.Dockerfile`
- ✅ Custom branch and label selection
- ✅ Optional fast-runtime feature
- ✅ Tags: `PROD-{label}` or user-defined
---
### 🔧 Additional Optimizations
- Copy libpq5 from builder stage instead of reinstalling (smaller,
faster)
- Remove redundant protobuf-compiler package (use protoc v21.12
directly)
- Standardize user UID to 1000 across all runtime images
- Consistent OCI labeling and metadata
---------
Co-authored-by: Claude <noreply@anthropic.com>
2025-10-14 23:33:20 +00:00
# DataHaven Operator Image

FROM debian:stable-slim

LABEL version="0.4.0"
LABEL description="DataHaven Node - Release Build"
LABEL maintainer="steve@moonsonglabs.com"

ARG BINARY_PATH=./target/x86_64-unknown-linux-gnu/release/datahaven-node
# Install CA certificates and libpq5 for the release build
RUN apt-get update && \
    apt-get install -y --no-install-recommends \
        libpq5 \
        ca-certificates && \
    update-ca-certificates && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*
# Create datahaven user and directories
feat: ✨ Add Docker Compose setup for local DataHaven network (#314)
## 🎯 Overview
This PR introduces a comprehensive Docker Compose configuration for
running a complete local DataHaven network, making it significantly
easier for developers to spin up and test the entire stack locally.
## 🏗️ Architecture
```mermaid
graph TB
subgraph "DataHaven Network (Docker)"
subgraph "Consensus Layer"
Alice["🔷 Alice (Validator)<br/>:9944 RPC<br/>4 Keys: GRAN, BABE, IMON, BEEF"]
Bob["🔷 Bob (Validator)<br/>:9945 RPC<br/>4 Keys: GRAN, BABE, IMON, BEEF"]
Alice <-->|P2P/mDNS| Bob
end
subgraph "Storage Provider Layer"
MSP["💾 MSP (Charlie)<br/>:9946 RPC<br/>1 GiB Storage<br/>1 Key: BCSV"]
BSP01["💾 BSP01 (Dave)<br/>:9947 RPC<br/>1 GiB Storage<br/>1 Key: BCSV"]
BSP02["💾 BSP02 (Eve)<br/>:9948 RPC<br/>1 GiB Storage<br/>1 Key: BCSV"]
end
subgraph "Monitoring Layer"
Indexer["📊 Indexer<br/>:9949 RPC<br/>Full Mode<br/>No Keys"]
Fisherman["🎣 Fisherman (Gustavo)<br/>:9950 RPC<br/>Storage Monitor<br/>1 Key: BCSV"]
DB["🗄️ PostgreSQL<br/>:5432<br/>indexer/datahaven"]
end
Alice -.->|Syncs| MSP
Bob -.->|Syncs| MSP
Alice -.->|Syncs| BSP01
Bob -.->|Syncs| BSP02
Alice -.->|Syncs| Indexer
Bob -.->|Syncs| Fisherman
MSP -.->|Monitors| Fisherman
BSP01 -.->|Monitors| Fisherman
BSP02 -.->|Monitors| Fisherman
Indexer -->|Writes| DB
Fisherman -->|Writes| DB
end
style Alice fill:#4a90e2,stroke:#2e5c8a,color:#fff
style Bob fill:#4a90e2,stroke:#2e5c8a,color:#fff
style MSP fill:#50c878,stroke:#2d7a4a,color:#fff
style BSP01 fill:#50c878,stroke:#2d7a4a,color:#fff
style BSP02 fill:#50c878,stroke:#2d7a4a,color:#fff
style Indexer fill:#f5a623,stroke:#b87818,color:#fff
style Fisherman fill:#f5a623,stroke:#b87818,color:#fff
style DB fill:#9b59b6,stroke:#6c3a82,color:#fff
```
**Legend:**
- 🔷 Validators - Consensus and block production
- 💾 Storage Providers - File storage and retrieval
- 📊 Indexer - Full blockchain indexing
- 🎣 Fisherman - Storage provider monitoring
- 🗄️ PostgreSQL - Database for indexer/fisherman
- `<-->` P2P Communication | `-.->` Network Sync | `-->` Database
Connection
## ✨ What's Included
### Core Network (8 Services)
- **2 Validator Nodes** (Alice & Bob) - Consensus and block production
- **1 Main Storage Provider (MSP)** - Charlie with 1 GiB storage
capacity
- **2 Backup Storage Providers (BSPs)** - Dave and Eve with 1 GiB each
- **1 StorageHub Indexer** - Full blockchain indexer with PostgreSQL
- **1 Fisherman Node** - Gustavo monitoring storage provider behavior
- **1 PostgreSQL Database** - Shared database for indexer and fisherman
### Key Features
✅ **Automated Key Injection** - All validator and storage provider keys
automatically injected on startup
✅ **Health Checks** - Validators have health checks that verify the RPC
port is listening before dependent services start
✅ **Orchestrated Startup** - Services start in correct order with
health-based dependencies
✅ **Persistent Storage** - Chain data, keystores, and database all
persisted in Docker volumes
✅ **Unified Entrypoint Script** - Single script handles all node types
(validator, MSP, BSP, fisherman)
✅ **Proper User Permissions** - Root for setup, switches to datahaven
user for node execution
✅ **mDNS Peer Discovery** - Nodes automatically discover each other on
the Docker network
✅ **Comprehensive Documentation** - Full setup guide, troubleshooting,
and verification steps
### 🔄 Startup Sequence
The network starts in a carefully orchestrated sequence to ensure
stability:
1. **Alice (Validator)** starts first
- Injects 4 keys (GRAN, BABE, IMON, BEEF)
- Health check waits for RPC port 9944 to be listening
- Uses `/proc/net/tcp` for minimal-dependency port checking
2. **Bob (Validator)** waits for Alice to be healthy
- Ensures at least one validator is fully operational
- Enables block production to start immediately
3. **Storage Providers & Monitoring** wait for both validators to be
healthy
- MSP, BSP01, BSP02 start after validators are ready
- Indexer and Fisherman wait for validators before syncing
- PostgreSQL starts independently with its own health check
This dependency chain prevents race conditions and ensures reliable
network formation.
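The health-based gating described above amounts to retrying a check until it passes or the attempts run out. Below is a minimal shell sketch of that idea, not the project's actual mechanism. The helper name `wait_until` and the container name `alice` in the usage comment are assumptions made for illustration.

```shell
# Hypothetical helper: run a check command until it succeeds or attempts run out.
# Usage: wait_until <attempts> <interval_seconds> <command> [args...]
wait_until() {
    attempts=$1
    interval=$2
    shift 2
    i=1
    while [ "$i" -le "$attempts" ]; do
        # "$@" is the check command; exit status 0 means "ready".
        if "$@"; then
            return 0
        fi
        # Sleep between attempts, but not after the final failure.
        if [ "$i" -lt "$attempts" ]; then
            sleep "$interval"
        fi
        i=$((i + 1))
    done
    return 1
}

# Example (assumes a container named "alice" that defines a health check):
#   wait_until 30 2 sh -c \
#     'docker inspect --format "{{.State.Health.Status}}" alice | grep -qx healthy'
```

The example matches the whole line with `grep -qx` so that the status `unhealthy` is not mistaken for `healthy`.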
## 🚀 Quick Start
```bash
cd operator
# Build the binary (development mode with fast blocks)
./scripts/docker-prepare.sh --fast
# Start the entire network
docker-compose up -d
# View logs
docker-compose logs -f
# Check service health
docker-compose ps
# Stop the network and remove its volumes (-v deletes persisted chain data)
docker-compose down -v
```
## 📋 Port Mappings
| Service | RPC/WebSocket | Prometheus | P2P | Database/API |
|---------|---------------|------------|-----|--------------|
| Alice | 9944 | 9615 | 30333 | - |
| Bob | 9945 | 9616 | 30334 | - |
| MSP | 9946 | 9617 | 30335 | - |
| BSP01 | 9947 | 9618 | 30336 | - |
| BSP02 | 9948 | 9619 | 30337 | - |
| Indexer | 9949 | 9620 | 30338 | - |
| Fisherman | 9950 | 9621 | 30339 | - |
| PostgreSQL | - | - | - | 5432 |
## 🔑 Key Injection
All cryptographic keys are automatically injected on startup:
**Validators (Alice & Bob)** - 4 keys each:
- GRANDPA (ed25519) - Finality
- BABE (sr25519) - Block authoring
- ImOnline (sr25519) - Heartbeat
- BEEFY (ecdsa) - Bridge consensus
**Storage Providers (MSP, BSPs, Fisherman)** - 1 key each:
- BCSV (ecdsa) - Storage provider identity
**Indexer** - No keys required (non-validating node)
## 🩺 Health Checks
Validator nodes (Alice & Bob) implement health checks to ensure proper
startup sequencing:
**Health Check Method:**
- Reads `/proc/net/tcp` directly to check if RPC port is listening
- Zero dependencies - works in minimal debian:stable-slim containers
- Converts port to hex and searches for LISTEN state (0A)
**Configuration:**
- Start period: 30s (allows node initialization)
- Interval: 10s (check every 10 seconds)
- Timeout: 5s (per health check)
- Retries: 5 (marked unhealthy after 5 consecutive failed checks)
**Benefits:**
- Prevents dependent services from starting before validators are ready
- Eliminates race conditions during network formation
- No additional packages required in Docker image
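The port check described under "Health Check Method" can be sketched in shell as follows. This is an illustration under stated assumptions, not the contents of `docker-healthcheck.sh`: the function name `port_is_listening` and its optional second argument (an alternative table file, used for testing) are hypothetical, and `awk` is assumed to be available in the image.

```shell
# Hypothetical helper: succeed if some socket is in LISTEN state on the given port.
# Usage: port_is_listening <port> [tcp_table_file]
port_is_listening() {
    # /proc/net/tcp stores ports as 4-digit uppercase hex, e.g. 9944 -> 26D8.
    port_hex=$(printf '%04X' "$1")
    tcp_table=${2:-/proc/net/tcp}
    # Column 2 is local_address (HEXIP:HEXPORT); column 4 is the socket state,
    # where 0A means LISTEN. The first line of the file is a header.
    awk -v p=":$port_hex" '
        NR > 1 && $4 == "0A" && substr($2, length($2) - 4) == p { found = 1 }
        END { exit !found }
    ' "$tcp_table"
}
```

A health check could then call `port_is_listening 9944` and rely on its exit status. A node that binds only to an IPv6 socket appears in `/proc/net/tcp6` instead, which could be passed as the second argument.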
## 🍎 macOS Requirement
**Important:** On Docker Desktop for macOS, you must use the
**experimental DockerVMM** virtualization framework:
1. Open Docker Desktop settings
2. Go to "General" tab
3. Enable "Use experimental virtualization framework (DockerVMM)"
4. Restart Docker Desktop
The default Apple Virtualization Framework causes networking issues with
P2P connections.
## 📁 Files Changed
- `operator/docker-compose.yml` - Main orchestration configuration
- `operator/scripts/docker-entrypoint.sh` - Unified key injection script
- `operator/scripts/docker-prepare.sh` - Binary build helper
- `operator/scripts/docker-healthcheck.sh` - Health check script for
validators
- `operator/DOCKER-COMPOSE.md` - Comprehensive documentation
## 🔍 Testing
The configuration has been tested on:
- ✅ Docker Desktop for macOS (with DockerVMM)
- ✅ Docker on Linux/Ubuntu
All nodes successfully:
- Inject required keys
- Pass health checks
- Discover peers via mDNS
- Sync blocks and finalize
- Connect to PostgreSQL database (indexer/fisherman)
## 📝 Notes
- All settings are configured for **local development only**
- Uses well-known test seed phrase (⚠️ never use in production!)
- RPC exposed without authentication
- Unsafe flags enabled for convenience
---------
Co-authored-by: Claude <noreply@anthropic.com>
2025-11-22 13:07:46 +00:00
RUN useradd -m -u 1001 -U -s /bin/sh -d /datahaven datahaven && \
    mkdir -p /datahaven/.local/share && \
    chown -R datahaven:datahaven /datahaven/.local/share

USER datahaven
# Copy pre-built binary
COPY --chown=datahaven:datahaven $BINARY_PATH /usr/local/bin
# Make binary executable
RUN chmod uog+x /usr/local/bin/datahaven-node
# Expose ports
# 30333: p2p networking
# 9944: WebSocket/RPC
# 9615: Prometheus metrics
EXPOSE 30333 9944 9615

ENTRYPOINT ["datahaven-node"]