datahaven/docker/datahaven-dev.Dockerfile
Steve Degosserie 9a5404de82
refactor: Consolidate and optimize Docker image architecture (#233)
## Overview

This PR consolidates and optimizes the Docker build system, reducing
redundancy and improving CI/CD performance. The changes eliminate
duplicate Dockerfiles, introduce a flexible build template, and optimize
release builds to reuse CI artifacts.

## Changes Summary

### 🐳 Docker Images Restructured

**Before:** 5 Dockerfiles with significant overlap
**After:** 4 focused images + 1 utility

#### Final Structure:

1. **`operator/Dockerfile`** (updated)
   - **Standard operator image** for CI and release builds
   - Minimal node image (accepts pre-built binaries)
   - GHCR: `ghcr.io/datahaven-xyz/datahaven/datahaven` (CI)
   - DockerHub: `datahavenxyz/datahaven` (releases)

2. **`docker/datahaven-build.Dockerfile`** (moved from
`operator/Dockerfile`)
   - Full source-to-binary build for manual releases
   - DockerHub: `datahavenxyz/datahaven:{label}`
   - Supports custom RUSTFLAGS and fast-runtime feature
   - Only used for manual workflow_dispatch builds

3. **`docker/datahaven-production.Dockerfile`** (kept)
   - Binary builder for CPU-specific releases
   - Used by build-prod-binary workflow template
   - Supports custom target-cpu flags

4. **`docker/datahaven-dev.Dockerfile`** (new, local dev only)
   - **FOR LOCAL DEVELOPMENT/TROUBLESHOOTING ONLY**
   - Optional debug tools (installed when `DEBUG_MODE=true`): gdb, strace, vim, sudo
   - Extra dependencies: librocksdb-dev, curl
   - RUST_BACKTRACE enabled by default
   - **DO NOT USE for CI or production builds**

5. **`test/docker/crossbuild-mac-libpq.dockerfile`** (kept)
   - Utility for macOS → Linux cross-compilation
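
As a minimal sketch, the dev image from item 4 could be built and run locally along these lines. The script only prints the `docker build` and `docker run` commands (a dry run) so they can be reviewed first. The image tag `datahaven-dev:local` and the function names are hypothetical. The Dockerfile path, the `DEBUG_MODE` build arg, the expected binary location, and the ports are taken from `docker/datahaven-dev.Dockerfile`. The repository root is assumed as build context, since the Dockerfile copies from `./operator/target/...`.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical local tag; not defined anywhere in the repository.
IMAGE_TAG="datahaven-dev:local"
# Binary path the Dockerfile expects, relative to the build context (repo root).
BINARY="operator/target/x86_64-unknown-linux-gnu/release/datahaven-node"

# Print the build command; pass "true" to include gdb, strace, vim and sudo.
dev_build_cmd() {
  local debug_mode="${1:-false}"
  echo "docker build -f docker/datahaven-dev.Dockerfile --build-arg DEBUG_MODE=${debug_mode} -t ${IMAGE_TAG} ."
}

# Print a run command that publishes the RPC (9944) and metrics (9615) ports.
# Publishing a port does not by itself make the node listen on external
# interfaces; that depends on node flags not covered here.
dev_run_cmd() {
  echo "docker run --rm -p 9944:9944 -p 9615:9615 ${IMAGE_TAG} --tmp"
}

# Warn early if the pre-built binary is missing; the COPY step would fail otherwise.
if [ ! -f "${BINARY}" ]; then
  echo "note: ${BINARY} not found; build the node first" >&2
fi

dev_build_cmd true
dev_run_cmd
```

Run it from the repository root and, once the printed commands look right, paste them into the shell or pipe them to `sh`.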

#### Removed (Redundant):
- `docker/datahaven.Dockerfile` → replaced by operator/Dockerfile
- `test/docker/datahaven-node-local.dockerfile` → replaced by
datahaven-dev.Dockerfile

---

### 🔄 Workflow Improvements

#### Enhanced `publish-docker` Template
- Supports both GHCR and DockerHub registries
- Flexible inputs: dockerfile, context, build-args, cache scope
- Auto-generates OCI-compliant labels
- Reduces code duplication (~70 lines → ~15 per workflow)
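
To illustrate what OCI-compliant labels look like on the command line, here is a hedged sketch that turns a source URL, commit, and timestamp into `--label` flags for `docker build`. The label keys are the pre-defined `org.opencontainers.image.*` annotation keys. The function name `oci_label_flags` and the example URL are hypothetical, and how the `publish-docker` template actually generates its labels is not shown in this PR description.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Emit one --label flag per line for the given source URL, commit and timestamp.
oci_label_flags() {
  local source_url="$1" revision="$2" created="$3"
  printf -- '--label org.opencontainers.image.source=%s\n' "${source_url}"
  printf -- '--label org.opencontainers.image.revision=%s\n' "${revision}"
  printf -- '--label org.opencontainers.image.created=%s\n' "${created}"
}

# Example values; the repository URL is a placeholder, not taken from the PR.
oci_label_flags \
  "https://example.com/org/repo" \
  "9a5404de82" \
  "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
```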

#### Refactored CI Pipeline
- **`docker-build-ci`**: Builds `operator/Dockerfile` → GHCR for CI/E2E
testing
- **`docker-build-release`**: Builds `operator/Dockerfile` → DockerHub
(main branch only)
- Both CI and release workflows now use the same minimal operator image
- Release builds **reuse CI binaries** instead of rebuilding from source

#### Optimized Release Workflow
The `task-docker-release` workflow now has dual modes:

**Mode 1: `workflow_call` (CI - main pushes)**
- Reuses binary from CI's build-operator task
- Uses lightweight `operator/Dockerfile`
- Tags: `latest`, `sha-{short}`
- **Fast**: ~5 minutes (vs ~30 min previously)

**Mode 2: `workflow_dispatch` (Manual)**
- Full source build with `datahaven-build.Dockerfile`
- Custom branch and label selection
- Optional fast-runtime feature
- Tags: `PROD-{label}` or user-defined
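
The tagging rules of the two modes can be sketched as a small shell function. This is illustrative only: the function name `release_tags` is hypothetical, the example label is made up, and taking the first 7 characters of the commit SHA for `sha-{short}` is an assumption, since the PR does not state the length.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the image tags for a release, one per line.
#   $1: trigger mode, "workflow_call" or "workflow_dispatch"
#   $2: commit SHA (workflow_call) or label (workflow_dispatch)
release_tags() {
  local mode="$1" value="$2"
  case "${mode}" in
    workflow_call)
      echo "latest"
      # Assumption: "short" means the first 7 characters of the SHA.
      echo "sha-${value:0:7}"
      ;;
    workflow_dispatch)
      echo "PROD-${value}"
      ;;
    *)
      echo "unknown mode: ${mode}" >&2
      return 1
      ;;
  esac
}

release_tags workflow_call 9a5404de82
release_tags workflow_dispatch example-label
```

User-defined tags from Mode 2 are left out here; they would bypass the `PROD-` prefix entirely.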

---

### 🔧 Additional Optimizations

- Copy libpq5 from builder stage instead of reinstalling (smaller,
faster)
- Remove redundant protobuf-compiler package (use protoc v21.12
directly)
- Standardize user UID to 1000 across all runtime images
- Consistent OCI labeling and metadata

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-10-15 01:33:20 +02:00


# DataHaven Development/Troubleshooting Image
#
# This image is ONLY for local development and troubleshooting purposes.
# It includes additional debugging tools and dependencies not needed in production.
#
# DO NOT USE for CI or production builds - use operator/Dockerfile instead.
#
# Build Args:
#   DEBUG_MODE - Set to "true" to include debugging tools (default: false)
#
# Expected Binary Location:
#   ./operator/target/x86_64-unknown-linux-gnu/release/datahaven-node
#
# Features:
#   - Ubuntu base with additional system tools
#   - librocksdb-dev for local development
#   - Optional gdb, strace, vim for debugging
#   - RUST_BACKTRACE enabled by default
#   - Additional directories (/specs, /storage) for testing
FROM ubuntu:noble
LABEL version="0.3.0"
LABEL description="DataHaven Node - Local Development/Troubleshooting Build"
LABEL maintainer="steve@moonsonglabs.com"
ARG DEBUG_MODE=false
# Install runtime dependencies
RUN apt-get update && \
    DEBIAN_FRONTEND=noninteractive apt-get install -y --no-install-recommends \
        ca-certificates \
        curl \
        libpq-dev \
        librocksdb-dev && \
    # Optionally install debug tools
    if [ "$DEBUG_MODE" = "true" ]; then \
        apt-get install -y --no-install-recommends \
            sudo \
            gdb \
            strace \
            vim; \
    fi && \
    apt-get autoremove -y && \
    apt-get clean && \
    find /var/lib/apt/lists/ -type f -not -name lock -delete
# Create datahaven user and directories
RUN useradd -m -u 1000 -U -s /bin/sh -d /datahaven datahaven && \
    mkdir -p /data /datahaven/.local/share /specs /storage && \
    chown -R datahaven:datahaven /data /storage && \
    ln -s /data /datahaven/.local/share/datahaven-node
# Grant sudo access if debug mode is enabled
RUN if [ "$DEBUG_MODE" = "true" ]; then \
        echo "datahaven ALL=(ALL) NOPASSWD:ALL" >> /etc/sudoers && \
        chmod -R 777 /storage /data; \
    fi
USER datahaven
# Copy pre-built binary
COPY --chown=datahaven:datahaven ./operator/target/x86_64-unknown-linux-gnu/release/datahaven-node /usr/local/bin/datahaven-node
RUN chmod a+x /usr/local/bin/datahaven-node
# Enable Rust backtraces for better debugging
ENV RUST_BACKTRACE=1
# Expose ports
# 30333: p2p networking
# 9944: WebSocket/RPC
# 9615: Prometheus metrics
EXPOSE 30333 9944 9615
VOLUME ["/data"]
ENTRYPOINT ["datahaven-node"]
CMD ["--tmp"]