mirror of https://github.com/datahaven-xyz/datahaven synced 2026-05-24 09:50:01 +00:00

An EVM-compatible Substrate chain, powered by StorageHub and secured by EigenLayer


Steve Degosserie aa3409b239 feat(slashes): typed offence kinds, Perbill-to-WAD conversion, historical filtering, and liveness E2E test (#447)

## Summary

Introduces typed offence classification, a linear Perbill-to-WAD conversion for EigenLayer slashing, historical offence filtering, and a new E2E test proving end-to-end liveness detection through `pallet_im_online`.

---

### OffenceKind enum

New `OffenceKind` enum classifies consensus offences:

- `LivenessOffence`: missed heartbeats (ImOnline)
- `BabeEquivocation`: double block production
- `GrandpaEquivocation`: double finality votes
- `BeefyEquivocation`: double BEEFY votes / fork voting / future block voting
- `Custom(BoundedVec<u8, 256>)`: manual / governance slashes

Each variant carries a human-readable description string through the Snowbridge message to EigenLayer's `DatahavenServiceManager.slashValidatorsOperator()`.

### EquivocationReportWrapper

Generic wrapper around `ReportOffence` wired for BABE, GRANDPA, BEEFY, and ImOnline in all three runtimes:

1. Filters historical offences: discards reports whose session predates the bonding period, using `BondedEras` storage (analogous to `FilterHistoricalOffences` in `pallet_staking`, but adapted to this pallet's own era tracking).
2. Tags offence kind: stores the `OffenceKind` in the `PendingOffenceKind` double-map `(SessionIndex, ValidatorId)` before delegating to `pallet_offences`. The `on_offence` handler reads it via `take()` in the same block.
3. Cleans up on failure: removes stale `PendingOffenceKind` entries if the inner reporter returns an error (e.g. duplicate report), preventing them from leaking into unrelated future offences.
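The three wrapper steps above (filter, tag, clean up) can be illustrated with a small Python model. This is only a sketch of the control flow, not the pallet's Rust code: `ReportWrapper`, `oldest_bonded_session`, `take_kind`, and the choice to return `False` for discarded reports are all hypothetical names and behaviors chosen for illustration.

```python
class DuplicateReport(Exception):
    """Stand-in for an error returned by the inner reporter."""


class ReportWrapper:
    """Toy model of the filter / tag / clean-up flow (all names hypothetical)."""

    def __init__(self, inner_report, oldest_bonded_session):
        self.inner_report = inner_report              # models the wrapped reporter
        self.oldest_bonded_session = oldest_bonded_session
        self.pending_kind = {}                        # (session, validator) -> kind

    def report(self, kind, session, validators):
        # 1. Filter: drop reports older than the bonding window.
        if session < self.oldest_bonded_session:
            return False
        # 2. Tag: remember the offence kind for each offender.
        keys = [(session, v) for v in validators]
        for key in keys:
            self.pending_kind[key] = kind
        try:
            self.inner_report(session, validators)
        except Exception:
            # 3. Clean up: never leave tags behind for a failed report.
            for key in keys:
                self.pending_kind.pop(key, None)
            raise
        return True

    def take_kind(self, session, validator):
        # Models the handler reading the tag with take(): read and remove.
        return self.pending_kind.pop((session, validator), None)
```

In this model the inner reporter is expected to call `take_kind` while it runs, mirroring how the `on_offence` handler consumes the tag in the same block.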

### Perbill to WAD conversion and MaxSlashWad

#### How Substrate computes slash fractions

Each offence type in Substrate defines its own `slash_fraction(offenders_count)` returning a `Perbill`:

| Offence | Formula | Typical range |
|---|---|---|
| BABE equivocation | `min((3k/n)^2, 1)` | 1 offender / 100 validators: ~0.09%; 1/2: capped to 100% |
| GRANDPA equivocation | `min((3k/n)^2, 1)` | Same as BABE |
| BEEFY double-vote | `min((3k/n)^2, 1)` | Same as BABE/GRANDPA |
| BEEFY fork/future voting | Fixed `50%` | Always 50% |
| ImOnline liveness | `min(3(k - floor(n/10) - 1)/n, 1) * 7%` | 10% or fewer offline: 0%; ~33% offline: ~5%; ~43% offline: 7% (max) |

Where `k` = number of concurrent offenders, `n` = validator set size.

Key behavior for small validator sets (E2E): With n=2, the ImOnline threshold is `floor(2/10) + 1 = 1`. A single offender (`k=1`) fails `checked_sub(1)` giving `Perbill(0)`. This means no `Slashes` storage entry is created (since `compute_slash` returns `None` when the new fraction doesn't exceed the prior slash), but the `SlashReported` event is still emitted, proving the full detection pipeline works.
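To make the formulas in the table easier to check, here is a minimal Python sketch that evaluates them with exact rationals. It assumes the formulas exactly as written above; the function names are hypothetical, and real `Perbill` arithmetic rounds to parts per billion, so on-chain values may differ in the last digits.

```python
from fractions import Fraction

BEEFY_FORK_VOTING = Fraction(1, 2)  # fixed 50% from the table


def equivocation_fraction(k: int, n: int) -> Fraction:
    """min((3k/n)^2, 1): BABE, GRANDPA and BEEFY double-vote."""
    return min(Fraction(3 * k, n) ** 2, Fraction(1))


def liveness_fraction(k: int, n: int) -> Fraction:
    """min(3(k - floor(n/10) - 1)/n, 1) * 7%: ImOnline liveness."""
    above_threshold = k - (n // 10 + 1)
    if above_threshold < 0:
        return Fraction(0)  # at or below the tolerated offline share
    return min(Fraction(3 * above_threshold, n), Fraction(1)) * Fraction(7, 100)


def as_percent(fraction: Fraction) -> float:
    return float(fraction * 100)


# 1 equivocating validator out of 100 -> 0.09%
print(as_percent(equivocation_fraction(1, 100)))
# 1 of 2 validators offline -> 0.0, matching the E2E note above
print(as_percent(liveness_fraction(1, 2)))
```

With `n = 100`, the liveness fraction stays at 0 up to `k = 11` and reaches the 7% ceiling at `k = 45`, which matches the "~43% offline" row once the `floor(n/10) + 1` threshold is accounted for.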

#### Linear conversion to EigenLayer WAD

The Substrate `Perbill` is linearly mapped to a WAD value capped by `MaxSlashWad`:

```
WAD = perbill.deconstruct() * MaxSlashWad / 1_000_000_000
```

- `MaxSlashWad` default: 5e16 (= 5% in WAD format, where 1e18 = 100%)
- Governance-changeable dynamic runtime parameter (codec index 46)
- `Perbill(100%)` maps to exactly `MaxSlashWad` (the cap)
- `Perbill(0%)` maps to 0 (no slash sent to EigenLayer)

#### Concrete examples (with default MaxSlashWad = 5%)

| Scenario | Substrate Perbill | WAD sent to EigenLayer | EigenLayer % |
|---|---|---|---|
| BABE equivocation (1 of 100 validators) | ~0.09% | ~4.5e13 | ~0.0045% |
| BABE equivocation (1 of 2 validators) | 100% (capped) | 5e16 | 5% (max) |
| BEEFY fork voting | 50% | 2.5e16 | 2.5% |
| ImOnline liveness (1 of 2 offline) | 0% | 0 (no slash) | 0% |
| ImOnline liveness (~33% of large set offline) | ~5% | ~2.5e15 | ~0.25% |
| Manual `force_inject_slash` at 20% | 20% | 1e16 | 1% |
| Manual `force_inject_slash` at 100% | 100% | 5e16 | 5% (max) |

The same WAD value is applied uniformly to all configured strategies via the `SlashingRequest` struct sent through Snowbridge to `DatahavenServiceManager.slashValidatorsOperator()`.

### E2E liveness slashing test

New test scenario (`should detect and slash an unresponsive validator`) validates the full liveness detection pipeline:

1. Pauses bob's Docker container (preserving GRANDPA state via `docker pause`)
2. Waits 200s (>= 2 full sessions) for `pallet_im_online` to detect missed heartbeats
3. Unpauses bob to restore GRANDPA finality (2/2 validators needed)
4. Polls for `SlashReported` event (not `Slashes` storage; see slash fraction note above)
5. Verifies the event confirms the full pipeline: `pallet_im_online -> EquivocationReportWrapper -> pallet_offences -> on_offence`

The test uses `try/finally` to always unpause bob, `{ at: "best" }` queries for non-finalized chain state during the pause, and drains prior `SlashReported` events before starting.

### Tests

- 10 new unit tests: `PendingOffenceKind` double-map semantics, session isolation, wrapper historical filtering, error cleanup, WAD conversion (100%, 50%, 0%), offence kind description propagation
- New mock infrastructure: `MockInnerReporter`, `MockOffence`, `MockOkOutboundQueue` with slash data capture
- E2E: Updated `force_inject_slash` test to use `offence_kind` enum, new liveness detection test

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Gonza Montiel <gonzamontiel@users.noreply.github.com>
Co-authored-by: undercover-cactus <lola@moonsonglabs.com>

2026-03-04 14:25:17 +01:00
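As a rough illustration of the linear Perbill-to-WAD conversion described in the commit message above, the following Python sketch reproduces the rows of the examples table. It assumes integer (truncating) division and takes the `Perbill` as its raw parts-per-billion value; `perbill_to_wad` and `wad_to_percent` are hypothetical helper names, not functions from the pallet.

```python
PERBILL_DENOMINATOR = 1_000_000_000   # Perbill counts parts per billion
WAD = 10**18                          # 1e18 represents 100% in WAD format
DEFAULT_MAX_SLASH_WAD = 5 * 10**16    # default MaxSlashWad: 5%


def perbill_to_wad(perbill_parts: int,
                   max_slash_wad: int = DEFAULT_MAX_SLASH_WAD) -> int:
    """Linearly scale a Perbill (in parts per billion) onto [0, max_slash_wad]."""
    if not 0 <= perbill_parts <= PERBILL_DENOMINATOR:
        raise ValueError("Perbill must be between 0 and 1_000_000_000 parts")
    return perbill_parts * max_slash_wad // PERBILL_DENOMINATOR


def wad_to_percent(wad: int) -> float:
    """Express a WAD value as a percentage."""
    return wad * 100 / WAD


# BEEFY fork voting: 50% Perbill -> 2.5e16 WAD -> 2.5%
half = perbill_to_wad(500_000_000)
print(half, wad_to_percent(half))
```

Because the mapping is a plain rescaling, a 100% `Perbill` can never produce more than `MaxSlashWad`, so no separate clamping step is needed in this sketch.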
.github	feat: 🏁 pallet grandpa benchmarking (#442)	2026-03-03 09:20:04 +01:00
contracts	feat: ✨ Bump contracts version to v0.20.0 (#464)	2026-03-03 10:54:31 +01:00
deploy	fix: 🔨 Fix Kurtosis & Snowbridge relay configs for Fulu fork (#356)	2025-12-18 15:50:09 +01:00
docker	Revert "feat: statically build binary (#292)" (#330)	2025-12-02 15:42:43 +01:00
operator	feat(slashes): typed offence kinds, Perbill-to-WAD conversion, historical filtering, and liveness E2E test (#447)	2026-03-04 14:25:17 +01:00
resources	test: ✅ Add E2E Tests (#36)	2025-04-14 16:22:43 -03:00
specs	feat: implement weighted top-32 validator selection (#443)	2026-02-24 09:23:57 +01:00
test	feat(slashes): typed offence kinds, Perbill-to-WAD conversion, historical filtering, and liveness E2E test (#447)	2026-03-04 14:25:17 +01:00
tools	fix: 🔧 Fix publish runtime draft release (#226)	2025-10-12 23:59:32 +02:00
.gitignore	test: Update validator set e2e test (#126)	2025-10-02 11:23:40 +00:00
.gitmodules	build: ➖ ➕ Change Snowbridge contracts dependency from upstream to fork (#18)	2025-03-28 15:49:43 -03:00
biome.json	feat(test): expand Moonwall test coverage with balance and precompile tests (#414)	2026-02-02 15:42:14 +01:00
CLAUDE.md	misc: remove slasher middleware solidity contracts (#366)	2025-12-29 14:55:21 +01:00
file_header.txt	chore: ♻️ Add missing license header in operator & AVS contracts source code (#285)	2025-11-10 12:56:41 +01:00
LICENSE	chore: ♻️ Add missing license header in operator & AVS contracts source code (#285)	2025-11-10 12:56:41 +01:00
README.md	docs: Update README to feature StorageHub integration (#379)	2026-01-06 11:36:23 +01:00
taplo.toml	ci: 🐳 Start Publishing Docker Images (#64)	2025-05-08 20:32:55 -03:00

README.md

DataHaven 🫎

AI-First Decentralized Storage secured by EigenLayer — a verifiable storage network for AI training data, machine learning models, and Web3 applications.

Overview

DataHaven is a decentralized storage and retrieval network designed for applications that need verifiable, production-scale data storage. Built on StorageHub and secured by EigenLayer's restaking protocol, DataHaven separates storage from verification: providers store data off-chain while cryptographic commitments are anchored on-chain for tamper-evident verification.

Core Capabilities:

Verifiable Storage: Files are chunked, hashed into Merkle trees, and committed on-chain — enabling cryptographic proof that data hasn't been tampered with
Provider Network: Main Storage Providers (MSPs) serve data with competitive offerings, while Backup Storage Providers (BSPs) ensure redundancy through decentralized replication with on-chain slashing for failed proof challenges
EigenLayer Security: Validator set secured by Ethereum restaking — DataHaven validators register as EigenLayer operators with slashing for misbehavior
EVM Compatibility: Full Ethereum support via Frontier pallets for smart contracts and familiar Web3 tooling
Cross-chain Bridge: Native, trustless bridging with Ethereum via Snowbridge for tokens and messages

Architecture

DataHaven combines EigenLayer's shared security with StorageHub's decentralized storage infrastructure:

┌─────────────────────────────────────────────────────────────────────────────┐
│                              Ethereum (L1)                                  │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  EigenLayer AVS Contracts                                             │  │
│  │  • DataHavenServiceManager (validator lifecycle & slashing)           │  │
│  │  • RewardsRegistry (validator performance & rewards)                  │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
│                                    ↕                                        │
│                          Snowbridge Protocol                                │
│                    (trustless cross-chain messaging)                        │
└─────────────────────────────────────────────────────────────────────────────┘
                                     ↕
┌─────────────────────────────────────────────────────────────────────────────┐
│                          DataHaven (Substrate)                              │
│  ┌───────────────────────────────────────────────────────────────────────┐  │
│  │  StorageHub Pallets                     DataHaven Pallets             │  │
│  │  • file-system (file operations)        • External Validators         │  │
│  │  • providers (MSP/BSP registry)         • Native Transfer             │  │
│  │  • proofs-dealer (challenge/verify)     • Rewards                     │  │
│  │  • payment-streams (storage payments)   • Frontier (EVM)              │  │
│  │  • bucket-nfts (bucket ownership)                                     │  │
│  └───────────────────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────────────────┘
                                     ↕
┌─────────────────────────────────────────────────────────────────────────────┐
│                        Storage Provider Network                             │
│  ┌─────────────────────────────┐    ┌─────────────────────────────┐        │
│  │  Main Storage Providers     │    │  Backup Storage Providers   │        │
│  │  (MSP)                      │    │  (BSP)                      │        │
│  │  • User-selected            │    │  • Network-assigned         │        │
│  │  • Serve read requests      │    │  • Replicate data           │        │
│  │  • Anchor bucket roots      │    │  • Proof challenges         │        │
│  │  • MSP Backend service      │    │  • On-chain slashing        │        │
│  └─────────────────────────────┘    └─────────────────────────────┘        │
│  ┌─────────────────────────────┐    ┌─────────────────────────────┐        │
│  │  Indexer                    │    │  Fisherman                  │        │
│  │  • Index on-chain events    │    │  • Audit storage proofs     │        │
│  │  • Query storage metadata   │    │  • Trigger challenges       │        │
│  │  • PostgreSQL backend       │    │  • Detect misbehavior       │        │
│  └─────────────────────────────┘    └─────────────────────────────┘        │
└─────────────────────────────────────────────────────────────────────────────┘

How Storage Works

Upload: User selects an MSP, creates a bucket, and uploads files. Files are chunked (8KB default), hashed into Merkle trees, and the root is anchored on-chain.
Replication: The MSP coordinates with BSPs to replicate data across the network based on the bucket's replication policy.
Retrieval: MSP returns files with Merkle proofs that users verify against on-chain commitments.
Verification: BSPs face periodic proof challenges — failure to prove data custody results in on-chain slashing via StorageHub pallets.
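
The upload and retrieval steps above can be sketched in Python as chunking, Merkle root computation, and proof verification. This is a simplified illustration, not StorageHub's actual scheme: it assumes SHA-256, a plain binary Merkle tree that duplicates the last node on odd levels, and hypothetical function names.

```python
import hashlib

CHUNK_SIZE = 8 * 1024  # 8 KB default chunk size mentioned above


def split_chunks(data: bytes, size: int = CHUNK_SIZE) -> list[bytes]:
    """Split a file into fixed-size chunks (the last one may be shorter)."""
    return [data[i:i + size] for i in range(0, len(data), size)] or [b""]


def leaf_hash(chunk: bytes) -> bytes:
    return hashlib.sha256(b"\x00" + chunk).digest()


def node_hash(left: bytes, right: bytes) -> bytes:
    return hashlib.sha256(b"\x01" + left + right).digest()


def build_levels(chunks: list[bytes]) -> list[list[bytes]]:
    """Return every tree level, from leaf hashes up to the single root."""
    levels = [[leaf_hash(c) for c in chunks]]
    while len(levels[-1]) > 1:
        cur = levels[-1]
        if len(cur) % 2 == 1:
            cur = cur + [cur[-1]]  # pair the odd node with itself
        levels.append([node_hash(cur[i], cur[i + 1])
                       for i in range(0, len(cur), 2)])
    return levels


def merkle_root(levels: list[list[bytes]]) -> bytes:
    return levels[-1][0]


def prove(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Collect the sibling hashes on the path from leaf `index` to the root."""
    proof = []
    for level in levels[:-1]:
        sibling = index ^ 1
        if sibling >= len(level):
            sibling = index  # odd level: the node was paired with itself
        proof.append(level[sibling])
        index //= 2
    return proof


def verify(chunk: bytes, index: int, proof: list[bytes], root: bytes) -> bool:
    """Recompute the root from one chunk and its proof, then compare."""
    node = leaf_hash(chunk)
    for sibling in proof:
        node = node_hash(node, sibling) if index % 2 == 0 else node_hash(sibling, node)
        index //= 2
    return node == root
```

In this model the provider holds `levels` and serves `(chunk, proof)` pairs, while the user only needs the root that was anchored on-chain to call `verify`.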

Repository Structure

datahaven/
├── contracts/      # EigenLayer AVS smart contracts
│   ├── src/       # Service Manager, Rewards Registry, Slasher
│   ├── script/    # Deployment scripts
│   └── test/      # Foundry test suites
├── operator/       # Substrate-based DataHaven node
│   ├── node/      # Node implementation & chain spec
│   ├── pallets/   # Custom pallets (validators, rewards, transfers)
│   └── runtime/   # Runtime configurations (mainnet/stagenet/testnet)
├── test/           # E2E testing framework
│   ├── suites/    # Integration test scenarios
│   ├── framework/ # Test utilities and helpers
│   └── launcher/  # Network deployment automation
├── deploy/         # Kubernetes deployment charts
│   ├── charts/    # Helm charts for nodes and relayers
│   └── environments/ # Environment-specific configurations
├── tools/          # GitHub automation and release scripts
└── .github/        # CI/CD workflows

Each directory contains its own README with detailed information. See:

contracts/README.md - Smart contract development
operator/README.md - Node building and runtime development
test/README.md - E2E testing and network deployment
deploy/README.md - Kubernetes deployment
tools/README.md - Development tools

Quick Start

Prerequisites

Kurtosis - Network orchestration
Bun v1.3.2+ - TypeScript runtime
Docker - Container management
Foundry - Solidity toolkit
Rust - For building the operator
Helm - Kubernetes deployments (optional)
Zig - For macOS cross-compilation (macOS only)

Launch Local Network

The fastest way to get started is with the interactive CLI:

cd test
bun i                    # Install dependencies
bun cli launch           # Interactive launcher with prompts

This deploys a complete environment including:

Ethereum network: 2x EL clients (reth), 2x CL clients (lodestar)
Block explorers: Blockscout (optional), Dora consensus explorer
DataHaven node: Single validator with fast block times
Storage providers: MSP and BSP nodes for decentralized storage
AVS contracts: Deployed and configured on Ethereum
Snowbridge relayers: Bidirectional message passing

For more options and detailed instructions, see the test README.

Run Tests

cd test
bun test:e2e              # Run all integration tests
bun test:e2e:parallel     # Run with limited concurrency

Note: Setting the environment variable INJECT_CONTRACTS=true injects the contracts when the tests start, which speeds up setup.

Development Workflows

Smart Contract Development:

cd contracts
forge build               # Compile contracts
forge test                # Run contract tests

Node Development:

cd operator
cargo build --release --features fast-runtime
cargo test
./scripts/run-benchmarks.sh

After Making Changes:

cd test
bun generate:wagmi        # Regenerate contract bindings
bun generate:types        # Regenerate runtime types

Key Features

Verifiable Decentralized Storage

Production-scale storage with cryptographic guarantees:

Buckets: User-created containers managed by an MSP, summarized by a Merkle-Patricia trie root on-chain
Files: Deterministically chunked, hashed into Merkle trees, with roots serving as immutable fingerprints
Proofs: Merkle proofs enable verification of data integrity without trusting intermediaries
Audits: BSPs prove ongoing data custody via randomized proof challenges
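
One way to picture a randomized proof challenge is to derive the challenged chunk indices from a shared random seed, so that the provider and the verifier independently compute the same challenge. The Python sketch below is a hypothetical scheme for illustration only; the seed source, the SHA-256 hashing, and the function name are assumptions, not the StorageHub proofs-dealer design.

```python
import hashlib


def challenged_indices(seed: bytes, file_root: bytes,
                       chunk_count: int, challenges: int) -> list[int]:
    """Derive which chunks a provider must prove for one challenge round.

    The same (seed, file_root) pair always yields the same indices, so
    anyone can recompute which chunks should have been proven.
    """
    if chunk_count <= 0:
        raise ValueError("file must contain at least one chunk")
    indices = []
    for counter in range(challenges):
        digest = hashlib.sha256(
            seed + file_root + counter.to_bytes(4, "big")
        ).digest()
        # Reduce the 256-bit digest to a chunk index.
        indices.append(int.from_bytes(digest, "big") % chunk_count)
    return indices


# Example: three challenged chunks for a 1000-chunk file.
round_indices = challenged_indices(b"block-randomness", b"\x11" * 32, 1000, 3)
print(round_indices)
```

Because the provider cannot predict the seed in advance, it has to keep every chunk available to answer whichever indices come up.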

Storage Provider Network

Two-tier provider model balancing performance and reliability:

MSPs: User-selected providers offering data retrieval with competitive service offerings
BSPs: Network-assigned backup providers ensuring data redundancy and availability, with on-chain slashing for failed proof challenges
Fisherman: Auditing service that monitors proofs and triggers challenges for misbehavior
Indexer: Indexes on-chain storage events for efficient querying

EigenLayer Security

DataHaven validators secured through Ethereum restaking:

Validators register as operators via DataHavenServiceManager contract
Economic security through ETH restaking
Slashing for validator misbehavior is executed on Ethereum through EigenLayer (separate from BSP slashing, which is handled on the DataHaven chain by StorageHub pallets)
Performance-based validator rewards through RewardsRegistry

EVM Compatibility

Full Ethereum Virtual Machine support via Frontier pallets:

Deploy Solidity smart contracts
Use existing Ethereum tooling (MetaMask, Hardhat, etc.)
Compatible with ERC-20, ERC-721, and other standards

Cross-chain Communication

Trustless bridging via Snowbridge:

Native token transfers between Ethereum ↔ DataHaven
Cross-chain message passing
Finality proofs via BEEFY consensus
Three specialized relayers (beacon, BEEFY, execution)

Use Cases

DataHaven is designed for applications requiring verifiable, tamper-proof data storage:

AI & Machine Learning: Store training datasets, model weights, and agent configurations with cryptographic proofs of integrity — enabling federated learning and verifiable AI pipelines
DePIN (Decentralized Physical Infrastructure): Persistent storage for IoT sensor data, device configurations, and operational logs with provable data lineage
Real World Assets (RWAs): Immutable storage for asset documentation, ownership records, and compliance data with on-chain verification

Docker Images

Production images published to DockerHub.

Build optimizations:

sccache - Rust compilation caching
cargo-chef - Dependency layer caching
BuildKit cache mounts - External cache restoration

Build locally:

cd test
bun build:docker:operator    # Creates datahavenxyz/datahaven:local

Development Environment

VS Code Configuration

IDE configurations are excluded from version control so each developer can personalize them, but the following settings are recommended. Add them to your .vscode/settings.json:

Rust Analyzer:

{
  "rust-analyzer.linkedProjects": ["./operator/Cargo.toml"],
  "rust-analyzer.cargo.allTargets": true,
  "rust-analyzer.procMacro.enable": false,
  "rust-analyzer.server.extraEnv": {
    "CARGO_TARGET_DIR": "target/.rust-analyzer",
    "SKIP_WASM_BUILD": 1
  },
  "rust-analyzer.diagnostics.disabled": ["unresolved-macro-call"],
  "rust-analyzer.cargo.buildScripts.enable": false
}

Optimizations:

Links operator/ directory as the primary Rust project
Disables proc macros and build scripts for faster analysis (Substrate macros are slow)
Uses dedicated target directory to avoid conflicts
Skips WASM builds during development

Solidity (Juan Blanco's extension):

{
  "solidity.formatter": "forge",
  "solidity.compileUsingRemoteVersion": "v0.8.28+commit.7893614a",
  "[solidity]": {
    "editor.defaultFormatter": "JuanBlanco.solidity"
  }
}

Note: The Solidity version must match the one set in foundry.toml

TypeScript (Biome):

{
  "biome.lsp.bin": "test/node_modules/.bin/biome",
  "[typescript]": {
    "editor.defaultFormatter": "biomejs.biome",
    "editor.codeActionsOnSave": {
      "source.organizeImports.biome": "always"
    }
  }
}

CI/CD

Local CI Testing

Run GitHub Actions workflows locally using act:

# Run E2E workflow
act -W .github/workflows/e2e.yml -s GITHUB_TOKEN="$(gh auth token)"

# Run specific job
act -W .github/workflows/e2e.yml -j test-job-name

Automated Workflows

The repository includes GitHub Actions for:

E2E Testing: Full integration tests on PR and main branch
Contract Testing: Foundry test suites for smart contracts
Rust Testing: Unit and integration tests for operator
Docker Builds: Multi-platform image builds with caching
Release Automation: Version tagging and changelog generation

See .github/workflows/ for workflow definitions.

Contributing

Development Cycle

Make Changes: Edit contracts, runtime, or tests
Run Tests: Component-specific tests (forge test, cargo test)
Regenerate Types: Update bindings if contracts/runtime changed
Integration Test: Run E2E tests to verify cross-component behavior
Code Quality: Format and lint (cargo fmt, forge fmt, bun fmt:fix)

Common Pitfalls

Type mismatches: Regenerate with bun generate:types after runtime changes
Contract changes not reflected: Run bun generate:wagmi after modifications
Kurtosis issues: Ensure Docker is running and Kurtosis engine is started
Slow development: Use --features fast-runtime for shorter epochs/eras (block time stays 6s)
Network launch hangs: forge output can appear frozen, so check Blockscout to see whether the launch is still progressing

See CLAUDE.md for detailed development guidance.

License

GPL-3.0 - See LICENSE file for details