datahaven/.github/workflows/task-docker.yml
Steve Degosserie 1f38b4e343
fix: Complete CI compatibility with self-hosted GitHub runners (#134)
## Summary

This PR resolves all CI failures following the migration to self-hosted
GitHub runners (`DH-Testing` group) by eliminating sudo dependencies and
fixing Docker connectivity issues.

## Key Changes

### 🔧 **Eliminated sudo requirements across all workflows**
- **Setup Environment**: Installed mold linker and system dependencies
in userspace without sudo
- **Tool Installation**: Replaced apt/system package installations with
direct binary downloads:
  - Kurtosis: Direct binary download from GitHub releases (v1.10.3)
  - Taplo: Direct binary installation for Cargo.toml formatting
  - cargo-nextest: Using `cargo install` instead of GitHub action
    (v0.9.100)
- **Runner Cleanup**: Skipped cleanup-runner action entirely on
self-hosted runners (bare-metal manages disk space externally)
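
The userspace installation pattern described in this list can be sketched roughly as follows. This is a minimal illustration and not the PR's actual script: the function name `install_to_user_bin` and the default `$HOME/.local/bin` target are assumptions made for the example.

```shell
#!/usr/bin/env bash
# Hypothetical helper: install an already-downloaded binary without sudo.
# The default target directory ($HOME/.local/bin) is an assumption, not taken from the PR.
install_to_user_bin() {
  local src="$1"
  local name="$2"
  local bin_dir="${3:-$HOME/.local/bin}"

  # A user-owned directory needs no root privileges.
  mkdir -p "$bin_dir"
  cp "$src" "$bin_dir/$name"
  chmod +x "$bin_dir/$name"

  # On GitHub Actions, appending a directory to the file named by $GITHUB_PATH
  # adds it to PATH for subsequent steps.
  if [ -n "${GITHUB_PATH:-}" ]; then
    echo "$bin_dir" >> "$GITHUB_PATH"
  fi
}
```

For instance, after fetching a release binary with `curl -fsSL -o ./tool <url>`, calling `install_to_user_bin ./tool tool` would make it available to later steps without touching system directories.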

### 🐳 **Fixed Docker connectivity for E2E tests**  
- **Enhanced dockerode configuration** with robust fallback logic for
different socket locations
- **Added DOCKER_HOST environment variable** to E2E workflow for
consistent Docker daemon access
- **Implemented connection testing** with detailed error diagnostics for
troubleshooting
- **Resolves FailedToOpenSocket errors** by supporting multiple socket
paths and connection methods
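
The socket fallback idea can be illustrated with a small shell analogue. According to the notes above, the PR implements this in its dockerode configuration; the sketch below is not that code. The function name `resolve_docker_host` is hypothetical, and the candidate paths in the usage note are common defaults chosen for illustration, not necessarily the ones the PR checks.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: choose a Docker endpoint with fallback across socket paths.
resolve_docker_host() {
  # An explicitly configured DOCKER_HOST always wins.
  if [ -n "${DOCKER_HOST:-}" ]; then
    printf '%s\n' "$DOCKER_HOST"
    return 0
  fi

  # Otherwise return the first candidate path that exists and is a Unix socket.
  local sock
  for sock in "$@"; do
    if [ -S "$sock" ]; then
      printf 'unix://%s\n' "$sock"
      return 0
    fi
  done

  # Report every path that was tried so failures are easier to diagnose.
  echo "No Docker socket found among: $*" >&2
  return 1
}
```

A step could then run something like `export DOCKER_HOST="$(resolve_docker_host /var/run/docker.sock "$HOME/.docker/run/docker.sock")"` before starting the E2E tests, so every tool in the job talks to the same daemon.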

### 🏷️ **Workflow optimizations**
- **Label-based targeting**: All heavy workloads (Rust builds, E2E
tests) now run on `DH-Testing` runners
- **Dependency management**: Used `install-deps: false` flag instead of
hardcoded runner detection
- **Permission fixes**: Corrected Docker build permissions and GHCR
organization names

---------

Co-authored-by: Claude <noreply@anthropic.com>
2025-09-09 21:18:50 +02:00


name: Docker Build & Publish

on:
  workflow_dispatch:
    inputs:
      binary-hash:
        description: "The hash of the operator binary"
        required: false
        type: string
  workflow_call:
    inputs:
      binary-hash:
        description: "The hash of the operator binary"
        required: true
        type: string
    outputs:
      image-tag:
        description: "The tag portion of the docker image (without registry)"
        value: "${{ jobs.build-test-push.outputs.image-tag }}"

permissions:
  contents: read
  packages: write

concurrency:
  group: docker-build-${{ github.ref }}
  cancel-in-progress: true

jobs:
  build-test-push:
    runs-on: ubuntu-latest
    outputs:
      image-tag: ${{ steps.last_tag_extractor.outputs.image-tag }}
    steps:
      - name: Checkout repository
        uses: actions/checkout@v4

      - name: Download binary artifact
        uses: actions/download-artifact@v4
        with:
          name: datahaven-node-${{ inputs.binary-hash }}
          path: ./operator/target/x86_64-unknown-linux-gnu/release/

      - name: Prepare binary
        run: |
          chmod +x ./operator/target/x86_64-unknown-linux-gnu/release/datahaven-node
          ls -la ./operator/target/x86_64-unknown-linux-gnu/release/

      - name: Docker meta
        id: meta
        uses: docker/metadata-action@v5
        with:
          images: ghcr.io/datahaven-xyz/datahaven/datahaven
          flavor: |
            latest=auto
          tags: |
            type=raw,value=ci-${{ github.run_id }}
            type=sha,format=short,prefix=sha-
            type=ref,event=tag
            type=ref,event=branch
            type=ref,event=pr

      - name: Extract tag for job output
        id: last_tag_extractor
        run: |
          FULL_TAG=$(echo '${{ steps.meta.outputs.json }}' | jq -r '.tags[-1]')
          TAG_ONLY=$(echo "$FULL_TAG" | sed 's|.*:||')
          echo "image-tag=$TAG_ONLY" >> $GITHUB_OUTPUT
          echo "image-name=ghcr.io/datahaven-xyz/datahaven/datahaven:$TAG_ONLY" >> $GITHUB_OUTPUT

      - name: Log Docker Metadata
        run: |
          echo "Generated tags: ${{ steps.meta.outputs.tags }}"
          echo "Generated labels: ${{ steps.meta.outputs.labels }}"
          echo "Generated JSON: ${{ steps.meta.outputs.json }}"

      - uses: docker/setup-qemu-action@v3

      - uses: docker/setup-buildx-action@v3
        with:
          driver-opts: |
            image=moby/buildkit:master
            network=host
          buildkitd-flags: |
            --allow-insecure-entitlement network.host
            --allow-insecure-entitlement security.insecure

      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Build and push Docker image
        id: build
        uses: docker/build-push-action@v5
        with:
          context: .
          file: ./test/docker/datahaven-node-local.dockerfile
          push: true
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          platforms: linux/amd64
          cache-from: type=gha,scope=datahaven-local-build
          cache-to: type=gha,mode=max,scope=datahaven-local-build
          provenance: mode=max
          sbom: true

      - name: Log build cache statistics
        run: |
          echo "Build cache statistics:"
          docker buildx du --verbose

      # --- Smoke tests ---
      - name: Pull and test node --help
        run: |
          docker pull ${{ steps.last_tag_extractor.outputs.image-name }}
          docker run --rm ${{ steps.last_tag_extractor.outputs.image-name }} --help

      - name: Integration test (dev chain starts)
        run: |
          docker run --rm -d -p 9944:9944 --name local-dh-node \
            ${{ steps.last_tag_extractor.outputs.image-name }} --dev --unsafe-rpc-external

      - name: Wait for node to be healthy and test
        run: |
          echo "Waiting for node to start..."
          for i in {1..30}; do # Retry for 30 * 5s = 150 seconds
            if curl --fail --location 'http://127.0.0.1:9944' \
              --header 'Content-Type: application/json' \
              --data '{"jsonrpc":"2.0","id":1,"method":"system_chain","params":[]}' ; then
              echo "Node is healthy!"
              docker logs local-dh-node --tail 100
              exit 0
            fi
            echo "Attempt $i: Node not ready yet, sleeping 5s..."
            sleep 5
          done
          echo "Node failed to start or respond in time."
          docker logs local-dh-node --tail 100
          exit 1

      - name: Cleanup integration test container
        if: always()
        run: docker rm -f local-dh-node