datahaven

mirror of https://github.com/datahaven-xyz/datahaven synced 2026-05-24 01:38:32 +00:00

Author	SHA1	Message	Date
Steve Degosserie	9a5404de82	refactor: Consolidate and optimize Docker image architecture (#233)	2025-10-15 01:33:20 +02:00

## Overview

This PR consolidates and optimizes the Docker build system, reducing redundancy and improving CI/CD performance. The changes eliminate duplicate Dockerfiles, introduce a flexible build template, and optimize release builds to reuse CI artifacts.

## Changes Summary

### 🐳 Docker Images Restructured

Before: 5 Dockerfiles with significant overlap
After: 4 focused images + 1 utility

#### Final Structure:

1. `operator/Dockerfile` ✨ Updated
   - Standard operator image for CI and release builds
   - Minimal node image (accepts pre-built binaries)
   - GHCR: `ghcr.io/datahaven-xyz/datahaven/datahaven` (CI)
   - DockerHub: `datahavenxyz/datahaven` (releases)
2. `docker/datahaven-build.Dockerfile` (moved from `operator/Dockerfile`)
   - Full source-to-binary build for manual releases
   - DockerHub: `datahavenxyz/datahaven:{label}`
   - Supports custom RUSTFLAGS and fast-runtime feature
   - Only used for manual workflow_dispatch builds
3. `docker/datahaven-production.Dockerfile` (kept)
   - Binary builder for CPU-specific releases
   - Used by build-prod-binary workflow template
   - Supports custom target-cpu flags
4. `docker/datahaven-dev.Dockerfile` ✨ NEW (local dev only)
   - FOR LOCAL DEVELOPMENT/TROUBLESHOOTING ONLY
   - Includes debug tools: gdb, strace, vim, sudo
   - Extra dependencies: librocksdb-dev, curl
   - RUST_BACKTRACE enabled by default
   - DO NOT USE for CI or production builds
5. `test/docker/crossbuild-mac-libpq.dockerfile` (kept)
   - Utility for macOS → Linux cross-compilation

#### Removed (Redundant):

- ❌ `docker/datahaven.Dockerfile` → replaced by operator/Dockerfile
- ❌ `test/docker/datahaven-node-local.dockerfile` → replaced by datahaven-dev.Dockerfile

---

### 🔄 Workflow Improvements

#### Enhanced `publish-docker` Template

- Supports both GHCR and DockerHub registries
- Flexible inputs: dockerfile, context, build-args, cache scope
- Auto-generates OCI-compliant labels
- Reduces code duplication (~70 lines → ~15 per workflow)

#### Refactored CI Pipeline

- `docker-build-ci`: Builds `operator/Dockerfile` → GHCR for CI/E2E testing
- `docker-build-release`: Builds `operator/Dockerfile` → DockerHub (main branch only)
- Both CI and release workflows now use the same minimal operator image
- Release builds reuse CI binaries instead of rebuilding from source

#### Optimized Release Workflow

The `task-docker-release` workflow now has dual modes:

Mode 1: `workflow_call` (CI - main pushes)

- ✅ Reuses binary from CI's build-operator task
- ✅ Uses lightweight `operator/Dockerfile`
- ✅ Tags: `latest`, `sha-{short}`
- ⚡ Fast: ~5 minutes (vs ~30 min previously)

Mode 2: `workflow_dispatch` (Manual)

- ✅ Full source build with `datahaven-build.Dockerfile`
- ✅ Custom branch and label selection
- ✅ Optional fast-runtime feature
- ✅ Tags: `PROD-{label}` or user-defined

---

### 🔧 Additional Optimizations

- Copy libpq5 from builder stage instead of reinstalling (smaller, faster)
- Remove redundant protobuf-compiler package (use protoc v21.12 directly)
- Standardize user UID to 1000 across all runtime images
- Consistent OCI labeling and metadata

---------

Co-authored-by: Claude <noreply@anthropic.com>
undercover-cactus	a9d0f7157a	feat: storage hub client (#149) Co-authored-by: Steve Degosserie <723552+stiiifff@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-09-10 08:15:27 +02:00
Steve Degosserie	1f38b4e343	fix: Complete CI compatibility with self-hosted GitHub runners (#134)	2025-09-09 21:18:50 +02:00

## Summary

This PR resolves all CI failures following the migration to self-hosted GitHub runners (`DH-Testing` group) by eliminating sudo dependencies and fixing Docker connectivity issues.

## Key Changes

### 🔧 Eliminated sudo requirements across all workflows

- Setup Environment: Installed mold linker and system dependencies in userspace without sudo
- Tool Installation: Replaced apt/system package installations with direct binary downloads:
  - Kurtosis: Direct binary download from GitHub releases (v1.10.3)
  - Taplo: Direct binary installation for Cargo.toml formatting
  - cargo-nextest: Using `cargo install` instead of GitHub action (v0.9.100)
- Runner Cleanup: Skipped cleanup-runner action entirely on self-hosted runners (bare-metal manages disk space externally)

### 🐳 Fixed Docker connectivity for E2E tests

- Enhanced dockerode configuration with robust fallback logic for different socket locations
- Added DOCKER_HOST environment variable to E2E workflow for consistent Docker daemon access
- Implemented connection testing with detailed error diagnostics for troubleshooting
- Resolves FailedToOpenSocket errors by supporting multiple socket paths and connection methods

### 🏷️ Workflow optimizations

- Label-based targeting: All heavy workloads (Rust builds, E2E tests) now run on `DH-Testing` runners
- Dependency management: Used `install-deps: false` flag instead of hardcoded runner detection
- Permission fixes: Corrected Docker build permissions and GHCR organization names

---------

Co-authored-by: Claude <noreply@anthropic.com>
Steve Degosserie	fe53cec524	fix: Use all self-hosted runners for now (#144)	2025-09-02 14:22:20 +02:00
Steve Degosserie	b1f21e7a96	fix: Resolve CI workflow configuration issues (#143)	2025-09-02 13:02:13 +02:00

## Summary

This PR resolves all CI failures following the migration to the new DataHaven GitHub & Docker Hub organizations, and correctly leverages self-hosted GitHub runners (`DH-runners` group) by eliminating sudo dependencies.

## Key Changes

### 🔧 Eliminated sudo requirements across all workflows

- Setup Environment: Installed mold linker and system dependencies in userspace without sudo
- Tool Installation: Replaced apt/system package installations with direct binary downloads:
  - Kurtosis: Direct binary download from GitHub releases (v1.10.3)
  - Taplo: Direct binary installation for Cargo.toml formatting
  - cargo-nextest: Using `cargo install` instead of GitHub action (v0.9.100)
- Runner Cleanup: Skipped cleanup-runner action entirely on self-hosted runners (bare-metal manages disk space externally)

### 🏷️ Workflow optimizations

- Group-based targeting: All heavy workloads (Rust builds, E2E tests) now run on `DH-runners` runners
- Dependency management: Used `install-deps: false` flag instead of hardcoded runner detection

Co-authored-by: Claude <noreply@anthropic.com>
undercover-cactus	01962656d7	ci: caching between run in the CI to make building faster (#114)	2025-07-14 14:48:20 +02:00

~~Improve CI runs by adding a caching action. Between each run crates should be cached to speed up building datahaven binary.~~

## Summary

In the end, the real issue was not that we were missing a caching action but that we were caching too much. We went way over the 10GB limit imposed by GitHub (we were using 60GB of cache), so the most recent caches were deleted right away or not cached at all.

The overuse of the cache happened because we were caching the sccache folder twice: once with `mozilla-actions/sccache-action` and again with `Swatinem/rust-cache`.

The solution was to keep caching sccache on GitHub with `mozilla-actions/sccache-action` only.

In this PR we also replace the `Swatinem/rust-cache` action with the more standard `actions/cache@v4`. This should help avoid unwanted breaking changes and security issues in the future.

---------

Co-authored-by: Ahmad Kaouk <56095276+ahmadkaouk@users.noreply.github.com>
Facundo Farall	935babe36a	ci: 👷 Add CI to check PAPI metadata (#107)	2025-06-19 19:12:04 -03:00

Add CI check for Polkadot-API metadata freshness

This PR adds a new CI workflow that ensures the Polkadot-API metadata file (`test/.papi/metadata/datahaven.scale`) is kept up-to-date when runtime changes are made.

Changes:

- Added task-check-metadata.yml workflow that:
  - Reuses the WASM artifact from the build-operator job (no duplicate compilation)
  - Runs `bun x papi add` to regenerate metadata
  - Fails if the metadata file has uncommitted changes
- Integrated the check into `CI.yml` as a second-tier job alongside `docker-build`

Why:

- Prevents TypeScript type definitions from becoming out of sync with the runtime
- Reminds developers to run `bun generate:types:fast` when making runtime changes
- Ensures consistent type safety across the codebase

The check provides clear error messages with instructions when metadata is outdated.
Tim B	1997c298a1	refactor: 🐳 Improve docker caching (again) (#86)	2025-05-27 16:14:15 +00:00

## Changes

- New CI file for making Docker Prod images
- Changed E2E tests to use an image built from a local dockerfile
- Some cargo build options to make it quicker
- Fix the cache hit rate
- Added `tsgo` preview to the project 😎
  - Can be invoked with `bun tsgo` to typecheck
  - Install in IDE [VSCode](https://code.visualstudio.com/docs/configure/extensions/extension-marketplace) & [Zed](https://github.com/zed-extensions/tsgo) for super-fast inline typechecking (as you type basically)

## Context

This PR attempts to make the frankly unacceptable CI times better. This achieves that aim by making a crappy image for day-to-day usage and let the prod issue take ages since that will be infrequently used.

The reason why the original design didn't work for us is because:

1) we are using the free GH runners
2) when we go to bare-metal runners we'll lose our rapid caching abilities which make using docker cheap.

Also, we add `tsgo` support to improve devex. The improvement is astounding.

```sh
hyperfine -n tsc "bun tsc --incremental false --extendedDiagnostics" -n tsgo "bun tsgo --incremental false --extendedDiagnostics"
Benchmark 1: tsc
  Time (mean ± σ):      5.500 s ±  0.221 s    [User: 8.939 s, System: 0.400 s]
  Range (min … max):    5.196 s …  5.845 s    10 runs

Benchmark 2: tsgo
  Time (mean ± σ):      99.1 ms ±   8.4 ms    [User: 392.8 ms, System: 54.1 ms]
  Range (min … max):    88.3 ms … 116.0 ms    29 runs

Summary
  tsgo ran 55.48 ± 5.22 times faster than tsc
```
Tim B	6aeece550b	ci: 🐳 Start Publishing Docker Images (#64)	2025-05-08 20:32:55 -03:00
Tim B	3776d80a2e	test: ⚡️ CI Refactor (#59)	2025-05-06 20:20:02 +00:00

Eventually our CI will be required to run two private blockchains locally plus associated relayers. This PR is to prepare for this fate by improving run times and refactoring our existing CIs so they are a bit easier to reason about.

### Refactors

- _We now run ALL CIs on every PR!_ This is so that we decomplexify the logic around conditional builds and fetching built binaries from another source. This reduces the surface area of code we have to maintain at the cost of execution time.
  - This penalty is ameliorated by a layered caching system. At best, it will be less than a minute to complete a build since everything will be cached. On GH runners this is about 6 minutes sadly.
  - We will no longer be at risk of important CIs being skipped erroneously, which hides true failures.
  - Caching is a low-risk approach because at worst it has to build from scratch. A bad cache hit will never imply the wrong thing gets built since cargo is smart enough to just throw away any inappropriate build artefacts.
- `setup-rust` action created so we have a unified way of setting up the runner and unifying our approach to caching
  - Use a unique caching key for different activities; it will fall back to the shared cache if there are no matches
- We are using the `mainnet` kurtosis config so that it works with relayer assumptions

### Additions

- We can specify the ethereum block time via a new cli arg `--slot-time <seconds>`
- We can specify arbitrary network_param args which get passed into the generated yaml
  - e.g. giving `bun cli --kurtosis-network-args="pet=cat food=fish"` will add:

```yml
network_params:
  # existing params...
  pet: cat
  food: fish
```

- We now have the ability to programmatically modify the yaml
  - This means we are back down to a single `minimal.yml` kurtosis config so we don't have to maintain changes between them
  - Flow is: `add new cli arg` -> `add if() block which mutates yaml` -> `profit`

---------

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Facundo Farall <37149322+ffarall@users.noreply.github.com>
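
The `--kurtosis-network-args` flow described in the CI Refactor commit (#59) above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual implementation: `parseNetworkArgs`, `applyNetworkArgs`, and the `KurtosisConfig` type are hypothetical names, and the sketch assumes the config has already been loaded from `minimal.yml` into a plain object.

```typescript
// Hypothetical types and helpers; the real CLI code is not shown in the commit.
type NetworkParams = Record<string, string>;

interface KurtosisConfig {
  network_params?: NetworkParams;
}

// Turn a string such as "pet=cat food=fish" into { pet: "cat", food: "fish" }.
function parseNetworkArgs(raw: string): NetworkParams {
  const params: NetworkParams = {};
  for (const pair of raw.trim().split(/\s+/)) {
    if (pair === "") continue; // empty input yields no params
    const eq = pair.indexOf("=");
    if (eq <= 0) {
      throw new Error(`Expected key=value, got "${pair}"`);
    }
    params[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return params;
}

// The "if() block which mutates yaml" step: merge CLI-supplied params
// over whatever network_params the base config already contains.
function applyNetworkArgs(config: KurtosisConfig, raw?: string): KurtosisConfig {
  if (raw) {
    config.network_params = { ...config.network_params, ...parseNetworkArgs(raw) };
  }
  return config;
}

// Usage: "existing" stands in for whatever params the base config holds.
const merged = applyNetworkArgs(
  { network_params: { existing: "value" } },
  "pet=cat food=fish",
);
console.log(merged.network_params);
```

Keeping the parsing separate from the mutation means each new CLI argument only needs its own small conditional on the loaded config, which matches the single-`minimal.yml` approach the commit describes.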

10 commits