refactor: Consolidate and optimize Docker image architecture (#233)

## Overview
This PR consolidates and optimizes the Docker build system, reducing
redundancy and improving CI/CD performance. The changes eliminate
duplicate Dockerfiles, introduce a flexible build template, and optimize
release builds to reuse CI artifacts.
## Changes Summary
### 🐳 Docker Images Restructured
**Before:** 5 Dockerfiles with significant overlap
**After:** 4 focused images + 1 utility
#### Final Structure:
1. **`operator/Dockerfile`** ✨ Updated
- **Standard operator image** for CI and release builds
- Minimal node image (accepts pre-built binaries)
- GHCR: `ghcr.io/datahaven-xyz/datahaven/datahaven` (CI)
- DockerHub: `datahavenxyz/datahaven` (releases)
2. **`docker/datahaven-build.Dockerfile`** (moved from
`operator/Dockerfile`)
- Full source-to-binary build for manual releases
- DockerHub: `datahavenxyz/datahaven:{label}`
- Supports custom RUSTFLAGS and fast-runtime feature
- Only used for manual workflow_dispatch builds
3. **`docker/datahaven-production.Dockerfile`** (kept)
- Binary builder for CPU-specific releases
- Used by build-prod-binary workflow template
- Supports custom target-cpu flags
4. **`docker/datahaven-dev.Dockerfile`** ✨ NEW (local dev only)
- **FOR LOCAL DEVELOPMENT/TROUBLESHOOTING ONLY**
- Optional debug tools (installed when `DEBUG_MODE=true`): gdb, strace, vim, sudo
- Extra dependencies: librocksdb-dev, curl
- RUST_BACKTRACE enabled by default
- **DO NOT USE for CI or production builds**
5. **`test/docker/crossbuild-mac-libpq.dockerfile`** (kept)
- Utility for macOS → Linux cross-compilation
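
As a rough sketch of how the dev image from item 4 might be built locally, the helper below only prints a `docker build` command rather than running it. The Dockerfile path and the `DEBUG_MODE` build arg come from this change; the helper name `dev_build_cmd` and the default tag `datahaven-dev:local` are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: print the `docker build` command for the dev image.
# `dev_build_cmd` and the tag `datahaven-dev:local` are hypothetical names.
dev_build_cmd() {
    debug_mode="${1:-false}"              # value passed to the DEBUG_MODE build arg
    image_tag="${2:-datahaven-dev:local}" # local tag for the resulting image
    # The build context is the repository root, because the Dockerfile copies
    # the binary from ./operator/target/... relative to the context.
    printf 'docker build -f docker/datahaven-dev.Dockerfile --build-arg DEBUG_MODE=%s -t %s .\n' \
        "$debug_mode" "$image_tag"
}
```

Running `sh -c "$(dev_build_cmd true)"` from the repository root would then build with the debug tools included, provided the pre-built binary already exists at `./operator/target/x86_64-unknown-linux-gnu/release/datahaven-node`.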
#### Removed (Redundant):
- ❌ `docker/datahaven.Dockerfile` → replaced by operator/Dockerfile
- ❌ `test/docker/datahaven-node-local.dockerfile` → replaced by
datahaven-dev.Dockerfile
---
### 🔄 Workflow Improvements
#### Enhanced `publish-docker` Template
- Supports both GHCR and DockerHub registries
- Flexible inputs: dockerfile, context, build-args, cache scope
- Auto-generates OCI-compliant labels
- Reduces code duplication (~70 lines → ~15 per workflow)
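
To illustrate what OCI-compliant labels look like, here is a minimal sketch that prints `--label` flags using three standard OCI annotation keys. Which labels the `publish-docker` template actually emits, and how it generates them, is not shown in this change; the helper name `oci_label_args` and its arguments are hypothetical.

```shell
#!/bin/sh
# Sketch only: print `--label` flags built from standard OCI annotation keys.
# `oci_label_args` is a hypothetical helper, not part of the workflow template.
oci_label_args() {
    source_url="$1"   # repository URL
    revision="$2"     # commit SHA the image was built from
    created="$3"      # build time as an RFC 3339 timestamp
    printf '%s %s=%s ' --label org.opencontainers.image.source "$source_url"
    printf '%s %s=%s ' --label org.opencontainers.image.revision "$revision"
    printf '%s %s=%s\n' --label org.opencontainers.image.created "$created"
}
```

The printed flags could be spliced into a `docker build` invocation; this sketch assumes the values contain no whitespace.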
#### Refactored CI Pipeline
- **`docker-build-ci`**: Builds `operator/Dockerfile` → GHCR for CI/E2E
testing
- **`docker-build-release`**: Builds `operator/Dockerfile` → DockerHub
(main branch only)
- Both CI and release workflows now use the same minimal operator image
- Release builds **reuse CI binaries** instead of rebuilding from source
#### Optimized Release Workflow
The `task-docker-release` workflow now has two modes:
**Mode 1: `workflow_call` (CI - main pushes)**
- ✅ Reuses binary from CI's build-operator task
- ✅ Uses lightweight `operator/Dockerfile`
- ✅ Tags: `latest`, `sha-{short}`
- ⚡ **Fast**: ~5 minutes (vs ~30 min previously)
**Mode 2: `workflow_dispatch` (Manual)**
- ✅ Full source build with `datahaven-build.Dockerfile`
- ✅ Custom branch and label selection
- ✅ Optional fast-runtime feature
- ✅ Tags: `PROD-{label}` or user-defined
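
The tagging rules for the two modes can be sketched as a small shell function. This is an illustration under assumptions, not the workflow's actual logic: the helper name `release_tags` is hypothetical, the 7-character short SHA is a guess at what `sha-{short}` means, and the "user-defined" tag case is left out.

```shell
#!/bin/sh
# Sketch only: derive image tags for each release mode.
# `release_tags` is a hypothetical helper; the 7-character short SHA is an assumption.
release_tags() {
    mode="$1"
    case "$mode" in
        workflow_call)
            # CI on main: `latest` plus a tag derived from the commit SHA.
            short=$(printf '%s' "$2" | cut -c1-7)
            printf 'latest\nsha-%s\n' "$short"
            ;;
        workflow_dispatch)
            # Manual release: a single tag built from the chosen label.
            printf 'PROD-%s\n' "$2"
            ;;
        *)
            echo "unknown mode: $mode" >&2
            return 1
            ;;
    esac
}
```

For example, `release_tags workflow_dispatch v1` prints `PROD-v1`, while the `workflow_call` mode prints one tag per line.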
---
### 🔧 Additional Optimizations
- Copy libpq5 from builder stage instead of reinstalling (smaller,
faster)
- Remove redundant protobuf-compiler package (use protoc v21.12
directly)
- Standardize user UID to 1000 across all runtime images
- Consistent OCI labeling and metadata
---------
Co-authored-by: Claude <noreply@anthropic.com>
2025-10-14 23:33:20 +00:00
# DataHaven Development/Troubleshooting Image
#
# This image is ONLY for local development and troubleshooting purposes.
# It includes additional debugging tools and dependencies not needed in production.
#
# DO NOT USE for CI or production builds - use operator/Dockerfile instead.
#
# Build Args:
#   DEBUG_MODE - Set to "true" to include debugging tools (default: false)
#
# Expected Binary Location:
#   ./operator/target/x86_64-unknown-linux-gnu/release/datahaven-node
#
# Features:
#   - Ubuntu base with additional system tools
#   - librocksdb-dev for local development
#   - Optional gdb, strace, vim for debugging
#   - RUST_BACKTRACE enabled by default
#   - Additional directories (/specs, /storage) for testing

FROM ubuntu:noble

LABEL version="0.3.0"
LABEL description="DataHaven Node - Development/Troubleshooting Build"
LABEL maintainer="steve@moonsonglabs.com"

ARG DEBUG_MODE=false

# Install runtime dependencies
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        libpq-dev \
        librocksdb-dev && \
    # Optionally install debug tools
    if [ "$DEBUG_MODE" = "true" ]; then \
        apt-get install -y --no-install-recommends \
            sudo \
            gdb \
            strace \
            vim; \
    fi && \
    apt-get autoremove -y && \
    apt-get clean && \
    find /var/lib/apt/lists/ -type f -not -name lock -delete

# Create datahaven user and directories
RUN useradd -m -u 1001 -U -s /bin/sh -d /datahaven datahaven && \
    mkdir -p /data /datahaven/.local/share /specs /storage && \
    chown -R datahaven:datahaven /data /storage && \
    ln -s /data /datahaven/.local/share/datahaven-node

# Grant sudo access if debug mode is enabled
RUN if [ "$DEBUG_MODE" = "true" ]; then \
        echo "datahaven ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers && \
        chmod -R 777 /storage /data; \
    fi

USER datahaven

# Copy pre-built binary
COPY --chown=datahaven:datahaven ./operator/target/x86_64-unknown-linux-gnu/release/datahaven-node /usr/local/bin/datahaven-node
RUN chmod uog+x /usr/local/bin/datahaven-node

# Enable Rust backtraces for better debugging
ENV RUST_BACKTRACE=1

# Expose ports
# 30333: p2p networking
# 9944: WebSocket/RPC
# 9615: Prometheus metrics
EXPOSE 30333 9944 9615

VOLUME ["/data"]

ENTRYPOINT ["datahaven-node"]
CMD ["--tmp"]