datahaven/test/e2e
Steve Degosserie aa3409b239
feat(slashes): typed offence kinds, Perbill-to-WAD conversion, historical filtering, and liveness E2E test (#447)
## Summary

Introduces typed offence classification, a linear Perbill-to-WAD
conversion for EigenLayer slashing, historical offence filtering, and a
new E2E test proving end-to-end liveness detection through
`pallet_im_online`.

---

### OffenceKind enum

New `OffenceKind` enum classifies consensus offences:
- `LivenessOffence` — missed heartbeats (ImOnline)
- `BabeEquivocation` — double block production
- `GrandpaEquivocation` — double finality votes
- `BeefyEquivocation` — double BEEFY votes / fork voting / future block
voting
- `Custom(BoundedVec<u8, 256>)` — manual / governance slashes

Each variant carries a human-readable description string through the
Snowbridge message to EigenLayer's
`DatahavenServiceManager.slashValidatorsOperator()`.
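
The shape of the enum can be sketched roughly as follows. This is a
standalone illustration, not the pallet's code: a plain `Vec<u8>` with a
length check stands in for `BoundedVec<u8, 256>`, and the `custom`
constructor, the `description` method, and the description strings are
hypothetical, since the PR does not quote the actual wording.

```rust
/// Maximum payload length for `Custom`, mirroring `BoundedVec<u8, 256>`.
pub const MAX_CUSTOM_LEN: usize = 256;

#[derive(Debug, Clone, PartialEq, Eq)]
pub enum OffenceKind {
    LivenessOffence,
    BabeEquivocation,
    GrandpaEquivocation,
    BeefyEquivocation,
    /// Assumption: `Vec<u8>` replaces `BoundedVec<u8, 256>` to avoid FRAME dependencies.
    Custom(Vec<u8>),
}

impl OffenceKind {
    /// Hypothetical helper: builds a `Custom` kind, rejecting payloads over 256 bytes.
    pub fn custom(bytes: &[u8]) -> Option<Self> {
        if bytes.len() <= MAX_CUSTOM_LEN {
            Some(OffenceKind::Custom(bytes.to_vec()))
        } else {
            None
        }
    }

    /// Hypothetical description strings; the real text is defined in the pallet.
    pub fn description(&self) -> String {
        match self {
            OffenceKind::LivenessOffence => "liveness offence".to_string(),
            OffenceKind::BabeEquivocation => "BABE equivocation".to_string(),
            OffenceKind::GrandpaEquivocation => "GRANDPA equivocation".to_string(),
            OffenceKind::BeefyEquivocation => "BEEFY equivocation".to_string(),
            // Custom payloads are treated as (possibly lossy) UTF-8 text.
            OffenceKind::Custom(bytes) => String::from_utf8_lossy(bytes).into_owned(),
        }
    }
}
```

Keeping `Custom` bounded keeps the encoded size of a manual slash reason
predictable when it is carried in the cross-chain message.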

### EquivocationReportWrapper

Generic wrapper around `ReportOffence` wired for BABE, GRANDPA, BEEFY,
and ImOnline in all three runtimes:

1. **Filters historical offences** — discards reports whose session
predates the bonding period, using `BondedEras` storage (analogous to
`FilterHistoricalOffences` in `pallet_staking`, but adapted to this
pallet's own era tracking).
2. **Tags offence kind** — stores the `OffenceKind` in
`PendingOffenceKind` double-map `(SessionIndex, ValidatorId)` before
delegating to `pallet_offences`. The `on_offence` handler reads it via
`take()` in the same block.
3. **Cleans up on failure** — removes stale `PendingOffenceKind` entries
if the inner reporter returns an error (e.g. duplicate report),
preventing them from leaking into unrelated future offences.
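
The three steps above can be sketched with an in-memory map in place of
pallet storage. Everything here is an illustrative assumption: the
`Wrapper` struct, the `first_bonded_session` field (the pallet derives
the cutoff from `BondedEras`), the closure standing in for the inner
`ReportOffence` implementation, and the choice to return `Ok(())` for
filtered historical reports.

```rust
use std::collections::HashMap;

pub type SessionIndex = u32;
pub type ValidatorId = u32;
/// Stand-in for the `PendingOffenceKind` double-map.
pub type Pending = HashMap<(SessionIndex, ValidatorId), Kind>;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Kind { Liveness, Babe, Grandpa, Beefy }

#[derive(Debug, PartialEq, Eq)]
pub enum ReportError { Duplicate }

pub struct Wrapper {
    pub pending: Pending,
    /// Hypothetical: first session still inside the bonding period.
    pub first_bonded_session: SessionIndex,
}

impl Wrapper {
    pub fn new(first_bonded_session: SessionIndex) -> Self {
        Wrapper { pending: HashMap::new(), first_bonded_session }
    }

    /// `inner` models the wrapped reporter; on success it may consume
    /// pending entries, as the `on_offence` handler does via `take()`.
    pub fn report<F>(
        &mut self,
        session: SessionIndex,
        offenders: &[ValidatorId],
        kind: Kind,
        inner: F,
    ) -> Result<(), ReportError>
    where
        F: FnOnce(&mut Pending) -> Result<(), ReportError>,
    {
        // 1. Filter historical offences: drop reports older than the bonding period.
        if session < self.first_bonded_session {
            return Ok(());
        }
        // 2. Tag the offence kind for every offender before delegating.
        for v in offenders {
            self.pending.insert((session, *v), kind);
        }
        // 3. Delegate, and clean up the tags if the inner reporter fails.
        let result = inner(&mut self.pending);
        if result.is_err() {
            for v in offenders {
                self.pending.remove(&(session, *v));
            }
        }
        result
    }
}
```

Keying the pending entries by `(SessionIndex, ValidatorId)` is what keeps
a tag from one session from being picked up by an offence in another.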

### Perbill to WAD conversion and MaxSlashWad

#### How Substrate computes slash fractions

Each offence type in Substrate defines its own
`slash_fraction(offenders_count)` returning a `Perbill`:

| Offence | Formula | Typical range |
|---|---|---|
| **BABE equivocation** | `min((3k/n)^2, 1)` | 1 offender / 100 validators: ~0.09%; 1/2: capped to 100% |
| **GRANDPA equivocation** | `min((3k/n)^2, 1)` | Same as BABE |
| **BEEFY double-vote** | `min((3k/n)^2, 1)` | Same as BABE/GRANDPA |
| **BEEFY fork/future voting** | Fixed `50%` | Always 50% |
| **ImOnline liveness** | `min(3*(k - floor(n/10) - 1)/n, 1) * 7%` | 10% or fewer offline: **0%**; ~33% offline: ~5%; ~43% offline: 7% (max) |

Where `k` = number of concurrent offenders, `n` = validator set size.

**Key behavior for small validator sets (E2E):** With n=2, the ImOnline
threshold is `floor(2/10) + 1 = 1`. A single offender (`k=1`) gives
`k - 1 = 0`, so the slash fraction evaluates to `Perbill(0)`. As a
result, no `Slashes` storage entry is created (`compute_slash` returns
`None` when the new fraction doesn't exceed the prior slash), but the
`SlashReported` event is still emitted, which shows that the full
detection pipeline works.
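
The formulas in the table can be checked with a small integer sketch
that works in parts per billion, the unit `Perbill` uses. The function
names are hypothetical, rounding is simple truncation and may differ
from Substrate's `Perbill` arithmetic in the last digits, and an empty
validator set is assumed to yield 0.

```rust
/// Parts-per-billion denominator, matching `Perbill`.
const BILLION: u128 = 1_000_000_000;

/// `min((3k/n)^2, 1)` in parts per billion (BABE, GRANDPA, BEEFY double-vote).
pub fn equivocation_fraction(k: u32, n: u32) -> u32 {
    if n == 0 {
        return 0; // assumption: empty validator set means no slash
    }
    let k = k as u128;
    let n = n as u128;
    let ppb = 9 * k * k * BILLION / (n * n);
    ppb.min(BILLION) as u32
}

/// `min(3*(k - floor(n/10) - 1)/n, 1) * 7%` in parts per billion (ImOnline).
pub fn liveness_fraction(k: u32, n: u32) -> u32 {
    if n == 0 {
        return 0; // assumption: empty validator set means no slash
    }
    let threshold = n / 10 + 1;
    match k.checked_sub(threshold) {
        // Fewer offenders than the threshold: no slash.
        None => 0,
        Some(excess) => {
            let excess = excess as u128;
            let n = n as u128;
            let x = (3 * excess * BILLION / n).min(BILLION);
            (x * 7 / 100) as u32
        }
    }
}
```

With `n = 2` and `k = 1`, `liveness_fraction` returns 0, which matches
the small-validator-set behavior described above, while
`equivocation_fraction(1, 100)` gives 900,000 parts per billion (0.09%).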

#### Linear conversion to EigenLayer WAD

The Substrate `Perbill` is linearly mapped to a WAD value capped by
`MaxSlashWad`:

```
WAD = perbill.deconstruct() * MaxSlashWad / 1_000_000_000
```

- `MaxSlashWad` default: **5e16** (= 5% in WAD format, where 1e18 =
100%)
- Governance-changeable dynamic runtime parameter (codec index 46)
- `Perbill(100%)` maps to exactly `MaxSlashWad` (the cap)
- `Perbill(0%)` maps to 0 (no slash sent to EigenLayer)
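
A minimal sketch of the conversion, assuming the `Perbill` is passed as
its raw parts-per-billion value (what `deconstruct()` returns) and that
`u128` is wide enough for the intermediate product. The function and
constant names are illustrative, not the pallet's.

```rust
/// Denominator of `Perbill`: parts per billion.
const PERBILL_DENOMINATOR: u128 = 1_000_000_000;

/// Default `MaxSlashWad` from the description: 5e16, i.e. 5% where 1e18 = 100%.
pub const DEFAULT_MAX_SLASH_WAD: u128 = 50_000_000_000_000_000;

/// Linearly maps a `Perbill` (as raw parts per billion) onto `[0, max_slash_wad]`.
pub fn perbill_to_wad(perbill_parts: u32, max_slash_wad: u128) -> u128 {
    // Clamp defensively; a valid `Perbill` never exceeds the denominator.
    let parts = (perbill_parts as u128).min(PERBILL_DENOMINATOR);
    // Multiply before dividing to keep precision; saturate rather than overflow.
    parts.saturating_mul(max_slash_wad) / PERBILL_DENOMINATOR
}
```

For instance, `perbill_to_wad(200_000_000, DEFAULT_MAX_SLASH_WAD)` (a 20%
`Perbill`) yields 1e16, which is 1% in WAD terms.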

#### Concrete examples (with default MaxSlashWad = 5%)

| Scenario | Substrate Perbill | WAD sent to EigenLayer | EigenLayer % |
|---|---|---|---|
| BABE equivocation (1 of 100 validators) | ~0.09% | ~4.5e13 | ~0.0045% |
| BABE equivocation (1 of 2 validators) | 100% (capped) | 5e16 | 5% (max) |
| BEEFY fork voting | 50% | 2.5e16 | 2.5% |
| ImOnline liveness (1 of 2 offline) | 0% | 0 (no slash) | 0% |
| ImOnline liveness (~33% of large set offline) | ~5% | ~2.5e15 | ~0.25% |
| Manual `force_inject_slash` at 20% | 20% | 1e16 | 1% |
| Manual `force_inject_slash` at 100% | 100% | 5e16 | 5% (max) |

The same WAD value is applied uniformly to all configured strategies via
the `SlashingRequest` struct sent through Snowbridge to
`DatahavenServiceManager.slashValidatorsOperator()`.

### E2E liveness slashing test

New test scenario (`should detect and slash an unresponsive validator`)
validates the full liveness detection pipeline:

1. Pauses bob's Docker container (preserving GRANDPA state via `docker
pause`)
2. Waits 200s (>= 2 full sessions) for `pallet_im_online` to detect
missed heartbeats
3. Unpauses bob to restore GRANDPA finality (2/2 validators needed)
4. Polls for `SlashReported` event (not `Slashes` storage — see slash
fraction note above)
5. Verifies the event confirms the full pipeline: `pallet_im_online ->
EquivocationReportWrapper -> pallet_offences -> on_offence`

The test uses `try/finally` to always unpause bob, `{ at: "best" }`
queries for non-finalized chain state during the pause, and drains prior
`SlashReported` events before starting.

### Tests

- **10 new unit tests**: `PendingOffenceKind` double-map semantics,
session isolation, wrapper historical filtering, error cleanup, WAD
conversion (100%, 50%, 0%), offence kind description propagation
- **New mock infrastructure**: `MockInnerReporter`, `MockOffence`,
`MockOkOutboundQueue` with slash data capture
- **E2E**: Updated `force_inject_slash` test to use `offence_kind` enum,
new liveness detection test

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Gonza Montiel <gonzamontiel@users.noreply.github.com>
Co-authored-by: undercover-cactus <lola@moonsonglabs.com>
2026-03-04 14:25:17 +01:00
framework test: integrate validator-set-submitter Docker container into E2E test (#453) 2026-02-24 18:31:49 +02:00
suites feat(slashes): typed offence kinds, Perbill-to-WAD conversion, historical filtering, and liveness E2E test (#447) 2026-03-04 14:25:17 +01:00