datahaven

mirror of https://github.com/datahaven-xyz/datahaven synced 2026-05-24 09:50:01 +00:00

Author	SHA1	Message	Date
Steve Degosserie	62393dee23	ci: migrate from self-hosted to standard GitHub runners (#482 ) ## Summary - Self-hosted `DH-runners` have been decommissioned — all Rust build, test, and lint workflows now use `ubuntu-latest` - Removed `install-deps: false` overrides so workflows use the default apt-based dependency installation path - Updated `setup-env` action descriptions to remove self-hosted runner references ### Workflows updated - `task-build-operator.yml` - `task-build-static-operator.yaml` - `task-publish-binary.yml` - `task-rust-lint.yml` (3 jobs) - `task-rust-tests.yml` - `task-warm-sccache.yml` - `task-e2e.yml` ## Test plan - [x] Verify all Rust CI jobs pass on `ubuntu-latest` (build, lint, test, warm-cache) - [x] Confirm sccache and dependency installation work correctly on standard runners - [x] Ensure E2E workflow runs with Docker (instead of Podman) on standard runners --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-02 23:11:39 +02:00
Steve Degosserie	e04023ef11	ci: Add sccache warm-up job for better cache hit rates (#375) ## Summary Improve CI performance through better caching and simplified workflows: - Add a new `warm-sccache` job that runs before all Rust CI jobs to pre-populate the sccache cache - Cache locally installed tools (`~/.local/bin`, `~/.local/lib`, `~/.local/include`) with a version-based hash key - Simplify Rust tests by removing matrix partitioning and switching from cargo-nextest to cargo test ## Problem Poor sccache hit rates: - `rust-lint`, `unit-tests`, and `build-operator` ran in parallel with cold caches - Each job compiled dependencies independently - Cache was only saved at job completion (too late for parallel jobs to benefit) Redundant tool downloads: - Mold, LLVM/Clang, protoc, and libpq (~500MB+) were downloaded fresh on each job - No caching of locally installed tools Overcomplicated test setup: - 2-partition matrix for tests added complexity without significant benefit - cargo-nextest required installation step (~30s overhead) - Separate result-checker job wasn't necessary ## Solution ### 1. sccache warm-up job (`task-warm-sccache.yml`) - Runs first (Tier 0) before all Rust jobs - Compiles with release mode + all features (`fast-runtime`, `try-runtime`, `runtime-benchmarks`) - Compiles with debug mode to cover test builds - Uses `SKIP_WASM_BUILD=1` to minimize warm-up time ### 2. Local tools caching (`setup-env/action.yml`) - Define tool versions as env vars (`MOLD_VERSION`, `LLVM_VERSION`, `PROTOC_VERSION`, `LIBPQ_VERSION`) - Generate SHA256 hash from versions for cache key - Cache `~/.local/bin`, `~/.local/lib`, `~/.local/include` (not all of `~/.local` to avoid container storage) - Set up PATH and env vars immediately after cache restore ### 3. Simplified Rust tests (`task-rust-tests.yml`) - Remove 2-partition matrix strategy - Replace cargo-nextest with `cargo test --locked` - Remove separate tests-result-checker job ## CI Flow ``` ┌─ build-operator (warm sccache + cached tools) │ CI Start → warm-sccache ─┼─ rust-lint (warm sccache + cached tools) │ └─ unit-tests (warm sccache + cached tools) ``` ## Test plan - [x] CI workflow runs successfully - [x] Warm-sccache job completes and shows cache stats - [x] Local tools cache restores correctly (no permission errors) - [x] Downstream Rust jobs show improved cache hit rates - [x] Rust tests pass with simplified single-job setup --------- Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com> Co-authored-by: Ahmad Kaouk <56095276+ahmadkaouk@users.noreply.github.com> Co-authored-by: undercover-cactus <lola@moonsonglabs.com>	2026-01-08 13:44:45 +01:00
Steve Degosserie	746fce9328	security: 🛡️ Harden GitHub Actions workflows (#349) ## Summary This PR addresses several security vulnerabilities and applies hardening measures to the GitHub Actions workflows: - Replace `secrets: inherit` with explicit secret passing - Prevents unnecessary exposure of all repository secrets to called workflows - Add SHA256 checksum verification for downloaded binaries - Protects against supply chain attacks via compromised upstream releases - Add GitHub Environment protections for release workflows - Requires approval before publishing to Docker Hub or creating releases - Add explicit minimal permissions to all workflows - Follows principle of least privilege, removes unnecessary `packages: write` from CI.yml ## Changes by Category ### 1. Explicit Secret Passing \| Workflow \| Before \| After \| \|----------\|--------\|-------\| \| CI.yml → docker-build-ci \| `secrets: inherit` \| No secrets (GITHUB_TOKEN is automatic) \| \| CI.yml → docker-build-release \| `secrets: inherit` \| Explicit `DOCKERHUB_USERNAME`, `DOCKERHUB_TOKEN` \| \| CI.yml → e2e-tests \| `secrets: inherit` \| No secrets (GITHUB_TOKEN is automatic) \| ### 2. Binary Checksum Verification \| Workflow \| Binary \| SHA256 \| \|----------\|--------\|--------\| \| task-rust-lint.yml \| taplo 0.8.1 \| `c62baa73c9d7c1572...` \| \| task-e2e.yml \| kurtosis 1.11.99 \| `5e88e98c1b255362...` \| ### 3. Environment Protections \| Workflow \| Job \| Environment \| \|----------\|-----\|-------------\| \| task-docker-release.yml \| build-test-push \| `production` \| \| task-publish-binary.yml \| publish-draft-release \| `releases` \| \| task-publish-binary.yml \| docker-release-candidate \| `production` \| \| task-publish-runtime.yml \| publish-draft-release \| `releases` \| ### 4. Explicit Permissions All 14 workflow files now have explicit `permissions:` blocks with minimal required access. Co-authored-by: Claude <noreply@anthropic.com> Co-authored-by: Ahmad Kaouk <56095276+ahmadkaouk@users.noreply.github.com>	2025-12-12 09:52:50 +00:00
undercover-cactus	60d0e2c901	ci: remove nextest archive (#164) Remove the nextest archive steps from CI because the archive takes a long time to download. In the latest CI runs, the download took 17 minutes. --------- Co-authored-by: Ahmad Kaouk <56095276+ahmadkaouk@users.noreply.github.com>	2025-09-16 13:44:16 +02:00
undercover-cactus	a9d0f7157a	feat: storage hub client (#149 ) Co-authored-by: Steve Degosserie <723552+stiiifff@users.noreply.github.com> Co-authored-by: Claude <noreply@anthropic.com>	2025-09-10 08:15:27 +02:00
Steve Degosserie	1f38b4e343	fix: Complete CI compatibility with self-hosted GitHub runners (#134 ) ## Summary This PR resolves all CI failures following the migration to self-hosted GitHub runners (`DH-Testing` group) by eliminating sudo dependencies and fixing Docker connectivity issues. ## Key Changes ### 🔧 Eliminated sudo requirements across all workflows - Setup Environment: Installed mold linker and system dependencies in userspace without sudo - Tool Installation: Replaced apt/system package installations with direct binary downloads: - Kurtosis: Direct binary download from GitHub releases (v1.10.3) - Taplo: Direct binary installation for Cargo.toml formatting - cargo-nextest: Using `cargo install` instead of GitHub action (v0.9.100) - Runner Cleanup: Skipped cleanup-runner action entirely on self-hosted runners (bare-metal manages disk space externally) ### 🐳 Fixed Docker connectivity for E2E tests - Enhanced dockerode configuration with robust fallback logic for different socket locations - Added DOCKER_HOST environment variable to E2E workflow for consistent Docker daemon access - Implemented connection testing with detailed error diagnostics for troubleshooting - Resolves FailedToOpenSocket errors by supporting multiple socket paths and connection methods ### 🏷️ Workflow optimizations - Label-based targeting: All heavy workloads (Rust builds, E2E tests) now run on `DH-Testing` runners - Dependency management: Used `install-deps: false` flag instead of hardcoded runner detection - Permission fixes: Corrected Docker build permissions and GHCR organization names --------- Co-authored-by: Claude <noreply@anthropic.com>	2025-09-09 21:18:50 +02:00
Steve Degosserie	fe53cec524	fix: Use all self-hosted runners for now (#144 )	2025-09-02 14:22:20 +02:00
Steve Degosserie	b1f21e7a96	fix: Resolve CI workflow configuration issues (#143) ## Summary This PR resolves all CI failures following the migration to the new DataHaven GitHub & Docker Hub organizations, and correctly leverages self-hosted GitHub runners (`DH-runners` group) by eliminating sudo dependencies. ## Key Changes ### 🔧 Eliminated sudo requirements across all workflows - Setup Environment: Installed mold linker and system dependencies in userspace without sudo - Tool Installation: Replaced apt/system package installations with direct binary downloads: - Kurtosis: Direct binary download from GitHub releases (v1.10.3) - Taplo: Direct binary installation for Cargo.toml formatting - cargo-nextest: Using `cargo install` instead of GitHub action (v0.9.100) - Runner Cleanup: Skipped cleanup-runner action entirely on self-hosted runners (bare-metal manages disk space externally) ### 🏷️ Workflow optimizations - Group-based targeting: All heavy workloads (Rust builds, E2E tests) now run on `DH-runners` runners - Dependency management: Used `install-deps: false` flag instead of hardcoded runner detection Co-authored-by: Claude <noreply@anthropic.com>	2025-09-02 13:02:13 +02:00
Tim B	1997c298a1	refactor: 🐳 Improve docker caching (again) (#86) ## Changes - New CI file for making Docker Prod images - Changed E2E tests to use an image built from a local dockerfile - Some cargo build options to make it quicker - Fix the cache hit rate - added `tsgo` preview to the project 😎 - Can be invoked with `bun tsgo` to typecheck - Install in IDE [VSCode](https://code.visualstudio.com/docs/configure/extensions/extension-marketplace) & [Zed](https://github.com/zed-extensions/tsgo) for super-fast inline typechecking (as you type basically) ## Context This PR attempts to make the frankly unacceptable CI times better. It achieves that aim by making a crappy image for day-to-day usage and letting the prod image take ages, since that will be infrequently used. The original design didn't work for us because: 1) we are using the free GH runners 2) when we go to bare-metal runners we'll lose our rapid caching abilities which make using docker cheap. Also, we add `tsgo` support to improve devex. The improvement is astounding. ```sh hyperfine -n tsc "bun tsc --incremental false --extendedDiagnostics" -n tsgo "bun tsgo --incremental false --extendedDiagnostics" Benchmark 1: tsc Time (mean ± σ): 5.500 s ± 0.221 s [User: 8.939 s, System: 0.400 s] Range (min … max): 5.196 s … 5.845 s 10 runs Benchmark 2: tsgo Time (mean ± σ): 99.1 ms ± 8.4 ms [User: 392.8 ms, System: 54.1 ms] Range (min … max): 88.3 ms … 116.0 ms 29 runs Summary tsgo ran 55.48 ± 5.22 times faster than tsc ```	2025-05-27 16:14:15 +00:00
Tim B	ce59dd9625	fix: 🐳 Improve Docker Caching (#66) ## Summary This PR attempts to improve caching, and thus speeds, for Docker image generation. This is to dramatically reduce the times of building images by using: - cargo chef - cache mounts - sccache - cache dance ## Context As a result, changes that do not affect the code should (in theory) not trigger a new build. Changes that do touch the Rust code should also, in theory, build faster, as the dependencies are unlikely to have changed and so can be reused. In fact, some part of the process should always be reusable unless we do something drastic like changing rust-toolchain, and even then regenerating that part of the cache should only be a one-time cost. --------- Co-authored-by: Facundo Farall <37149322+ffarall@users.noreply.github.com>	2025-05-13 09:12:32 +00:00
Tim B	6aeece550b	ci: 🐳 Start Publishing Docker Images (#64 )	2025-05-08 20:32:55 -03:00
Steve Degosserie	d6f76f7fa3	feat(operator): ✨ Multi-runtime support (#38 ) Add support for multiple runtimes: `stagenet`, `testnet` & `mainnet`. - Moved common types to `datahaven-runtime-common` crate. - Made the node's command & service code generic over different runtimes. Each runtime has a `dev` & `local` genesis config preset. - More types / constants can be moved to the `datahaven-runtime-common` crate ... this will be done in subsequent PRs. --------- Co-authored-by: Gonza Montiel <gon.montiel@gmail.com> Co-authored-by: Gonza Montiel <gonzamontiel@users.noreply.github.com> Co-authored-by: Facundo Farall <37149322+ffarall@users.noreply.github.com>	2025-05-08 13:14:30 +00:00
Tim B	3776d80a2e	test: ⚡️ CI Refactor (#59) Eventually our CI will be required to run two private blockchains locally plus associated relayers. This PR is to prepare for this fate by improving run times and refactoring our existing CIs so they are a bit easier to reason about. ### Refactors - _We now run ALL CIs on every PR!_ This is so that we decomplexify the logic around conditional builds and fetching built binaries from another source. This reduces the surface area of code we have to maintain at the cost of execution time - This penalty is ameliorated by a layered caching system. At best, it will be less than a minute to complete a build since everything will be cached. On GH runners this is about 6 minutes sadly. - We will no longer be at risk of important CIs being skipped erroneously, which hides true failures. - Caching is a low-risk approach because at worst it has to build from scratch. A bad cache hit will never imply the wrong thing gets built since cargo is smart enough to just throw away any inappropriate build artefacts. - `setup-rust` action created so we have a unified way of setting up the runner and unifying our approach to caching - Use a unique caching key for different activities, which falls back to the shared cache if there are no matches - we are using `mainnet` kurtosis config so that it works with relayer assumptions ### Additions - We can specify the ethereum block time via a new cli arg `--slot-time <seconds>` - We can specify arbitrary network_param args which get passed into the generated yaml - e.g. giving `bun cli --kurtosis-network-args="pet=cat food=fish"` will add: ```yml network_params: # existing params... pet: cat food: fish ``` - We now have the ability to programmatically modify the yaml - This means we are back down to a single `minimal.yml` kurtosis config so we don't have to maintain changes between them - Flow is: `add new cli arg` -> `add if() block which mutates yaml` -> `profit` --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com> Co-authored-by: Facundo Farall <37149322+ffarall@users.noreply.github.com>	2025-05-06 20:20:02 +00:00

13 commits
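
The CI Refactor commit (#59) describes passing `--kurtosis-network-args="pet=cat food=fish"` so that the extra pairs end up under `network_params` in the generated yaml. A minimal sketch of how such a string could be parsed and merged is shown below. The names `KurtosisConfig`, `parseNetworkArgs`, and `applyNetworkArgs` are hypothetical and are not taken from the DataHaven repository, and the real CLI may handle quoting, types, and yaml serialization differently.

```typescript
// Hypothetical sketch: these types and helpers are illustrative assumptions,
// not the DataHaven CLI's actual implementation.
interface KurtosisConfig {
  network_params?: Record<string, string>;
  [key: string]: unknown;
}

/** Parse a string such as "pet=cat food=fish" into { pet: "cat", food: "fish" }. */
function parseNetworkArgs(raw: string): Record<string, string> {
  const params: Record<string, string> = {};
  for (const pair of raw.trim().split(/\s+/)) {
    if (pair === "") continue; // empty input yields no extra params
    const eq = pair.indexOf("=");
    if (eq <= 0) {
      throw new Error(`Expected key=value, got "${pair}"`);
    }
    // Everything after the first "=" is the value, so values may contain "=".
    params[pair.slice(0, eq)] = pair.slice(eq + 1);
  }
  return params;
}

/** Return a copy of the config with the extra params merged into network_params. */
function applyNetworkArgs(config: KurtosisConfig, raw: string): KurtosisConfig {
  return {
    ...config,
    network_params: { ...(config.network_params ?? {}), ...parseNetworkArgs(raw) },
  };
}

// Usage: "existing_param" stands in for whatever the base config already holds.
const base: KurtosisConfig = { network_params: { existing_param: "value" } };
const merged = applyNetworkArgs(base, "pet=cat food=fish");
console.log(merged.network_params);
```

Returning a new object instead of mutating the input keeps the base config reusable across runs; the commit itself describes an `if()` block that mutates the yaml, so this is a stylistic variation rather than a description of the original code.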