This commit is contained in:
Steve Degosserie 2026-05-12 15:03:26 +02:00 committed by GitHub
commit 3625e52a3c
No known key found for this signature in database
GPG key ID: B5690EEEBB952194
13 changed files with 6317 additions and 0 deletions

93
docs/README.md Normal file

@ -0,0 +1,93 @@
# DataHaven Node Operations Documentation
This directory contains comprehensive documentation for setting up and operating DataHaven and StorageHub nodes.
## Documentation Structure
### DataHaven Nodes
- [Bootnode Setup](./datahaven-bootnode.md) - Bootnode configuration and operations
- [Validator Setup](./datahaven-validator.md) - Validator node configuration and operations
- [Full Node Setup](./datahaven-fullnode.md) - Full node (RPC) configuration and operations
### StorageHub Nodes
- [MSP Setup](./storagehub-msp.md) - Main Storage Provider configuration and operations
- [BSP Setup](./storagehub-bsp.md) - Backup Storage Provider configuration and operations
- [Indexer Setup](./storagehub-indexer.md) - Indexer node configuration and operations
- [Fisherman Setup](./storagehub-fisherman.md) - Fisherman node configuration and operations
### Snowbridge Relays
- [Beacon Relay](./snowbridge-beacon-relay.md) - Ethereum beacon chain → DataHaven
- [BEEFY Relay](./snowbridge-beefy-relay.md) - DataHaven BEEFY finality → Ethereum
- [Execution Relay](./snowbridge-execution-relay.md) - Ethereum messages → DataHaven
- [Solochain Relay](./snowbridge-solochain-relay.md) - DataHaven messages → Ethereum
## Quick Reference
### Node Types Overview
| Node Type | Purpose | Keys Required | On-Chain Registration | Database Required |
|-----------|---------|---------------|----------------------|-------------------|
| **Bootnode** | Network peer discovery | None | No | No |
| **Validator** | Block production & consensus | 4 (BABE, GRANDPA, ImOnline, BEEFY) | Yes (session.setKeys) | No |
| **Full Node** | RPC endpoint, sync only | None | No | No |
| **MSP** | Main storage provider | 1 (BCSV ECDSA) | Yes (2-step: request + confirm) | Optional |
| **BSP** | Backup storage provider | 1 (BCSV ECDSA) | Yes (2-step: request + confirm) | No |
| **Indexer** | Blockchain data indexer | None | No | Yes (PostgreSQL) |
| **Fisherman** | Storage provider monitor | 1 (BCSV ECDSA) | No | Yes (PostgreSQL) |
### Snowbridge Relays Overview
| Relay | Direction | Keys Required | Persistent Storage |
|-------|-----------|---------------|-------------------|
| **Beacon Relay** | Ethereum → DataHaven | Substrate | Yes (datastore) |
| **BEEFY Relay** | DataHaven → Ethereum | Ethereum | No |
| **Execution Relay** | Ethereum → DataHaven | Substrate | Yes (datastore) |
| **Solochain Relay** | DataHaven → Ethereum | Ethereum + Substrate | Yes (datastore) |
### Common CLI Flags
All node types support standard Substrate flags:
- `--chain <CHAIN_SPEC>` - Chain specification (dev, local, stagenet-local, testnet-local, mainnet-local)
- `--base-path <PATH>` - Base directory for chain data
- `--name <NAME>` - Human-readable node name
- `--port <PORT>` - P2P network port (default: 30333)
- `--rpc-port <PORT>` - JSON-RPC port, serving both HTTP and WebSocket (default: 9944)
- `--rpc-external` - Listen on all network interfaces
- `--rpc-cors <ORIGINS>` - CORS origins for RPC (default: localhost)
- `--bootnodes <MULTIADDR>` - Bootstrap nodes for peer discovery
### Key Types Reference
| Key Type | Scheme | Purpose | Required For |
|----------|--------|---------|--------------|
| `gran` | ed25519 | GRANDPA finality | Validators |
| `babe` | sr25519 | BABE block authoring | Validators |
| `imon` | sr25519 | ImOnline heartbeat | Validators |
| `beef` | ecdsa | BEEFY bridge consensus | Validators |
| `bcsv` | ecdsa | Storage provider identity | MSP, BSP, Fisherman |
### Prerequisites
- [Docker](https://www.docker.com/) - Container runtime
- [Bun](https://bun.sh/) v1.2+ - For testing and tooling
- [Foundry](https://getfoundry.sh/) - For smart contract operations
- [PostgreSQL](https://www.postgresql.org/) - For Indexer and Fisherman nodes
### Getting Started
1. Choose your node type from the list above
2. Follow the specific setup guide for that node type
3. Generate or import keys as required
4. Configure CLI flags and environment
5. Start the node
6. Complete on-chain registration (if required)
### Support & Resources
- [Main Repository](https://github.com/Moonsong-Labs/datahaven)
- [StorageHub Repository](https://github.com/Moonsong-Labs/storage-hub)
- [Snowbridge Repository](https://github.com/datahaven-xyz/snowbridge) (solochain branch)
- [Snowbridge Documentation](https://docs.snowbridge.network)
- [E2E Testing Guide](../test/README.md)
- [Docker Compose Guide](../operator/DOCKER-COMPOSE.md)
- [Kubernetes Deployment](../deploy/charts/node/README.md)

337
docs/datahaven-bootnode.md Normal file

@ -0,0 +1,337 @@
# DataHaven Bootnode Setup
## Overview
A bootnode serves as an entry point for peer discovery in the DataHaven network. It maintains a stable network identity and helps new nodes discover peers.
## Purpose
- Provide stable peer discovery endpoint
- Maintain persistent network identity
- Facilitate initial network connections for new nodes
- No participation in consensus or block production
## Prerequisites
- DataHaven node binary or Docker image
- Persistent storage for node key
- Open network port (default: 30333)
## Hardware Requirements
Bootnodes have moderate hardware requirements as they only handle peer discovery and do not participate in consensus. Network bandwidth and uptime are the primary concerns.
### Minimum Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 physical cores @ 2.0 GHz |
| **RAM** | 8 GB DDR4 |
| **Storage** | 100 GB NVMe SSD |
| **Network** | 500 Mbit/s symmetric |
### Recommended Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 8 physical cores @ 3.0 GHz (Intel Ice Lake+ or AMD Zen3+) |
| **RAM** | 16 GB DDR4 |
| **Storage** | 250 GB NVMe SSD |
| **Network** | 1 Gbit/s symmetric |
### Important Considerations
- **High availability**: Bootnodes should have excellent uptime as they are entry points for the network
- **Geographic distribution**: Deploy bootnodes in multiple regions for network resilience
- **Static IP**: Required for stable multiaddress that other nodes can reference
- **DDoS protection**: Consider DDoS mitigation as bootnodes are publicly known endpoints
## Key Requirements
### Node Key
Bootnodes require a **persistent node key** to maintain a stable peer ID.
#### Generate Node Key
```bash
# Generate a new node key
datahaven-node key generate-node-key > node-key.txt
# View the generated peer ID
datahaven-node key inspect-node-key --file node-key.txt
```
The output will show:
```
12D3KooWXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
```
### No Session Keys Required
Bootnodes do **not** require session keys (BABE, GRANDPA, ImOnline, BEEFY) as they do not participate in consensus.
## Wallet Requirements
### No Wallet Required
Bootnodes do not submit transactions or participate in consensus, so no funded account is needed.
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--name <NODE_NAME> \
--node-key-file <PATH_TO_NODE_KEY>
```
### Important Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--chain <SPEC>` | Chain specification (stagenet-local, testnet-local, mainnet-local) | Required |
| `--name <NAME>` | Human-readable node name | Required |
| `--node-key-file <PATH>` | Path to node key file | Required |
| `--base-path <PATH>` | Base directory for chain data | `~/.local/share/datahaven-node` |
| `--port <PORT>` | P2P network port | `30333` |
| `--listen-addr <MULTIADDR>` | Listen address for P2P | `/ip4/0.0.0.0/tcp/30333` |
| `--public-addr <MULTIADDR>` | Public address to advertise | Auto-detected |
### Optional Flags
| Flag | Description |
|------|-------------|
| `--no-telemetry` | Disable telemetry reporting |
| `--log <TARGETS>` | Logging targets (e.g., `info,libp2p=debug`) |
| `--unsafe-rpc-external` | Allow external RPC access (not recommended) |
## Complete Setup Example
### 1. Generate Node Key
```bash
mkdir -p /data/bootnode
datahaven-node key generate-node-key > /data/bootnode/node-key.txt
```
### 2. Get Peer ID
```bash
PEER_ID=$(datahaven-node key inspect-node-key --file /data/bootnode/node-key.txt)
echo "Bootnode Peer ID: $PEER_ID"
```
### 3. Start Bootnode
```bash
datahaven-node \
--chain stagenet-local \
--name "Bootnode-01" \
--base-path /data/bootnode \
--node-key-file /data/bootnode/node-key.txt \
--port 30333 \
--listen-addr /ip4/0.0.0.0/tcp/30333 \
--public-addr /dns/bootnode.example.com/tcp/30333 \
--no-telemetry
```
### 4. Advertise Bootnode Address
Other nodes can connect using:
```bash
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/$PEER_ID
```
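The address in step 4 is the host, the P2P port, and the peer ID joined into one multiaddress. A minimal sketch of assembling it in a script follows; `bootnode_multiaddr` is a hypothetical helper, not part of `datahaven-node`, and it assumes the host is either a DNS name or an IPv4 address.

```bash
# Hypothetical helper: build a bootnode multiaddress from host, port and peer ID.
bootnode_multiaddr() {
  local host="$1" port="$2" peer_id="$3"
  local proto
  case "$host" in
    *[!0-9.]*) proto="dns" ;;  # contains something other than digits and dots
    *)         proto="ip4" ;;  # digits and dots only: treat as an IPv4 address
  esac
  printf '/%s/%s/tcp/%s/p2p/%s\n' "$proto" "$host" "$port" "$peer_id"
}
```

For example, `bootnode_multiaddr bootnode.example.com 30333 "$PEER_ID"` prints the value to pass to `--bootnodes`.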
## Docker Deployment
### Docker Compose
```yaml
version: '3.8'
services:
  bootnode:
    image: datahavenxyz/datahaven:latest
    container_name: datahaven-bootnode
    ports:
      - "30333:30333"
    volumes:
      - bootnode-data:/data
      - ./node-key.txt:/data/node-key.txt:ro
    command:
      - "--chain=stagenet-local"
      - "--name=Bootnode-01"
      - "--base-path=/data"
      - "--node-key-file=/data/node-key.txt"
      - "--port=30333"
      - "--listen-addr=/ip4/0.0.0.0/tcp/30333"
      - "--no-telemetry"
volumes:
  bootnode-data:
```
### Docker Run
```bash
docker run -d \
--name datahaven-bootnode \
-p 30333:30333 \
-v $(pwd)/bootnode-data:/data \
-v $(pwd)/node-key.txt:/data/node-key.txt:ro \
datahavenxyz/datahaven:latest \
--chain stagenet-local \
--name "Bootnode-01" \
--base-path /data \
--node-key-file /data/node-key.txt \
--port 30333 \
--no-telemetry
```
## Kubernetes Deployment
```yaml
apiVersion: v1
kind: Service
metadata:
  name: datahaven-bootnode
spec:
  type: LoadBalancer
  ports:
    - port: 30333
      targetPort: 30333
      name: p2p
  selector:
    app: datahaven-bootnode
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: datahaven-bootnode
spec:
  serviceName: datahaven-bootnode
  replicas: 1
  selector:
    matchLabels:
      app: datahaven-bootnode
  template:
    metadata:
      labels:
        app: datahaven-bootnode
    spec:
      containers:
        - name: bootnode
          image: datahavenxyz/datahaven:latest
          ports:
            - containerPort: 30333
              name: p2p
          volumeMounts:
            - name: data
              mountPath: /data
            - name: node-key
              mountPath: /data/node-key.txt
              subPath: node-key.txt
              readOnly: true
          args:
            - "--chain=stagenet-local"
            - "--name=Bootnode-01"
            - "--base-path=/data"
            - "--node-key-file=/data/node-key.txt"
            - "--port=30333"
            - "--no-telemetry"
      volumes:
        - name: node-key
          secret:
            secretName: bootnode-node-key
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 100Gi
```
## On-Chain Registration
### Not Required
Bootnodes do not require any on-chain registration or extrinsics.
## Monitoring
### Health Checks
```bash
# Check node health (peer count and sync status)
curl -H "Content-Type: application/json" \
  -d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
  http://localhost:9944
# Get the node's local peer ID
curl -H "Content-Type: application/json" \
  -d '{"id":1, "jsonrpc":"2.0", "method": "system_localPeerId"}' \
  http://localhost:9944
```
### Logs
```bash
# View logs with Docker
docker logs -f datahaven-bootnode
# Filter for connection events
docker logs datahaven-bootnode 2>&1 | grep -i "discovered\|connected"
```
## Troubleshooting
### Issue: Peers Cannot Connect
**Check:**
1. Port 30333 is open in firewall
2. Public address is correctly configured
3. DNS resolves correctly (if using DNS)
4. Node key file has correct permissions
### Issue: Node Key Not Found
**Solution:**
```bash
# Verify node key file exists
ls -la /data/bootnode/node-key.txt
# Restrict file permissions to the owner
chmod 600 /data/bootnode/node-key.txt
```
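The two checks above can be folded into one pre-start test. This is a sketch only: `key_file_mode_ok` is a hypothetical helper, and it assumes GNU coreutils `stat` (Linux); on other systems the `stat` flags differ.

```bash
# Hypothetical helper: succeed only if the key file exists and has mode 600.
# Assumes GNU `stat` (the -c '%a' form prints the octal permission bits).
key_file_mode_ok() {
  local file="$1"
  [ -f "$file" ] || { echo "missing: $file" >&2; return 1; }
  local mode
  mode=$(stat -c '%a' "$file") || return 1
  [ "$mode" = "600" ] || { echo "mode is $mode, expected 600: $file" >&2; return 1; }
}
```

A start script could run `key_file_mode_ok /data/bootnode/node-key.txt || exit 1` before launching the node.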
### Issue: Network Identity Changes
**Solution:**
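Always pass `--node-key-file` and keep the key file on persistent storage. Without it, the node uses a key stored under its base path, so the peer ID changes whenever that directory is lost or recreated. Avoid `--node-key`, which places the secret on the command line where it is easily exposed.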
## Security Considerations
1. **Node Key Protection**: Keep node key file secure with restricted permissions (600)
2. **RPC Access**: Do not expose RPC publicly on bootnodes
3. **DDoS Protection**: Implement rate limiting at network level
4. **Monitoring**: Set up alerts for unexpected downtime
## Best Practices
1. Run multiple bootnodes for redundancy
2. Use DNS names instead of IP addresses for flexibility
3. Monitor peer connections and network health
4. Keep node software updated
5. Backup node key securely
## Related Documentation
- [Validator Setup](./datahaven-validator.md)
- [Full Node Setup](./datahaven-fullnode.md)
- [Docker Compose Guide](../operator/DOCKER-COMPOSE.md)

468
docs/datahaven-fullnode.md Normal file

@ -0,0 +1,468 @@
# DataHaven Full Node Setup
## Overview
Full nodes synchronize with the DataHaven network and provide RPC endpoints for applications without participating in consensus or block production.
## Purpose
- Synchronize and maintain full blockchain state
- Provide RPC/WebSocket endpoints for applications
- Relay transactions to the network
- Query historical blockchain data
- No participation in consensus or validation
## Prerequisites
- DataHaven node binary or Docker image
- Sufficient storage for chain data
- Stable network connection
- Open network ports (30333, optionally 9944)
## Hardware Requirements
Full nodes have moderate hardware requirements as they sync the chain and serve RPC requests but do not participate in consensus.
### Minimum Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 physical cores @ 2.5 GHz |
| **RAM** | 16 GB DDR4 |
| **Storage** | 500 GB NVMe SSD |
| **Network** | 100 Mbit/s symmetric |
### Recommended Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 8 physical cores @ 3.0 GHz (Intel Ice Lake+ or AMD Zen3+) |
| **RAM** | 32 GB DDR4 |
| **Storage** | 1 TB NVMe SSD |
| **Network** | 500 Mbit/s symmetric |
### Important Considerations
- **Storage growth**: Plan for chain data growth over time; storage requirements will increase
- **RPC load**: If serving many RPC requests, consider higher CPU and RAM specifications
- **Archive node**: If running an archive node (full history), significantly more storage is required (2+ TB)
- **Cloud compatible**: Unlike validators, full nodes can run effectively on cloud VPS
## Key Requirements
### No Session Keys Required
Full nodes do **not** require session keys since they don't participate in consensus.
### Node Key (Optional)
A node key is optional but recommended for persistent peer identity:
```bash
# Generate node key
datahaven-node key generate-node-key > /data/fullnode/node-key.txt
```
## Wallet Requirements
### No Wallet Required
Full nodes do not submit transactions or participate in consensus, so no funded account is needed.
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--name <NODE_NAME>
```
### Important Full Node Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--chain <SPEC>` | Chain specification (stagenet-local, testnet-local, mainnet-local) | Required |
| `--name <NAME>` | Human-readable node name | Required |
| `--base-path <PATH>` | Base directory for chain data | `~/.local/share/datahaven-node` |
| `--port <PORT>` | P2P network port | `30333` |
| `--rpc-port <PORT>` | JSON-RPC port (HTTP and WebSocket) | `9944` |
| `--rpc-external` | Listen on all network interfaces | Localhost only |
| `--rpc-cors <ORIGINS>` | CORS origins for RPC | `localhost` |
| `--rpc-methods <METHOD>` | RPC methods allowed (`safe`, `unsafe`, `auto`) | `auto` |
| `--bootnodes <MULTIADDR>` | Bootstrap nodes for peer discovery | None |
### Pruning and Storage Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--pruning <MODE>` | State pruning mode (`archive`, `<number>`) | `256` blocks |
| `--blocks-pruning <MODE>` | Block pruning mode (`archive`, `archive-canonical`, `<number>`) | `archive-canonical` |
| `--state-cache-size <BYTES>` | State cache size in bytes | `67108864` (64 MiB) |
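The default `67108864` equals 64 × 1024 × 1024, which suggests the value is a byte count. Under that assumption, the sizes used in this guide can be derived with shell arithmetic; `mib_to_bytes` is a hypothetical helper name.

```bash
# Hypothetical helper: convert a size in MiB to bytes.
mib_to_bytes() {
  echo $(( $1 * 1024 * 1024 ))
}

mib_to_bytes 64    # 67108864
mib_to_bytes 128   # 134217728
mib_to_bytes 256   # 268435456
```

It can be used inline, for example `--state-cache-size "$(mib_to_bytes 128)"`.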
### Network Flags
| Flag | Description |
|------|-------------|
| `--public-addr <MULTIADDR>` | Public address to advertise |
| `--listen-addr <MULTIADDR>` | Listen address for P2P |
| `--reserved-nodes <MULTIADDR>` | Reserved peer addresses |
| `--reserved-only` | Only connect to reserved nodes |
| `--no-private-ip` | Disable private IP discovery |
### Optional Flags
| Flag | Description |
|------|-------------|
| `--prometheus-external` | Expose Prometheus metrics externally |
| `--prometheus-port <PORT>` | Prometheus metrics port (default: 9615) |
| `--telemetry-url <URL>` | Telemetry endpoint |
| `--log <TARGETS>` | Logging verbosity (e.g., `info,libp2p=debug`) |
| `--max-runtime-instances <N>` | Max WASM runtime instances |
| `--execution <STRATEGY>` | Execution strategy (`native`, `wasm`, `both`) |
## Complete Setup Examples
### 1. Basic Full Node
```bash
datahaven-node \
--chain stagenet-local \
--name "FullNode-01" \
--base-path /data/fullnode \
--port 30333 \
--rpc-port 9944 \
--rpc-external \
--rpc-cors all \
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/12D3KooW...
```
### 2. Archive Node
```bash
datahaven-node \
--chain stagenet-local \
--name "ArchiveNode-01" \
--base-path /data/archive \
--pruning archive \
--blocks-pruning archive \
--port 30333 \
--rpc-port 9944 \
--rpc-external \
--rpc-cors all \
--rpc-methods safe
```
### 3. RPC Node with High Performance
```bash
datahaven-node \
--chain stagenet-local \
--name "RPC-Node-01" \
--base-path /data/rpc \
--port 30333 \
--rpc-port 9944 \
--rpc-external \
--rpc-cors all \
--rpc-methods safe \
--state-cache-size 134217728 \
--max-runtime-instances 8 \
--execution wasm
```
## Docker Deployment
### Docker Compose
```yaml
version: '3.8'
services:
  fullnode:
    image: datahavenxyz/datahaven:latest
    container_name: datahaven-fullnode
    ports:
      - "30333:30333"
      - "9944:9944"
      - "9615:9615" # Prometheus metrics
    volumes:
      - fullnode-data:/data
    command:
      - "--chain=stagenet-local"
      - "--name=FullNode-01"
      - "--base-path=/data"
      - "--port=30333"
      - "--rpc-port=9944"
      - "--rpc-external"
      - "--rpc-cors=all"
      - "--rpc-methods=safe"
      - "--prometheus-external"
      - "--prometheus-port=9615"
      - "--bootnodes=/dns/bootnode/tcp/30333/p2p/12D3KooW..."
    restart: unless-stopped
volumes:
  fullnode-data:
```
### Docker Run
```bash
docker run -d \
--name datahaven-fullnode \
-p 30333:30333 \
-p 9944:9944 \
-v $(pwd)/fullnode-data:/data \
datahavenxyz/datahaven:latest \
--chain stagenet-local \
--name "FullNode-01" \
--base-path /data \
--port 30333 \
--rpc-port 9944 \
--rpc-external \
--rpc-cors all \
--rpc-methods safe
```
## Kubernetes Deployment
```yaml
apiVersion: v1
kind: Service
metadata:
  name: datahaven-fullnode
spec:
  type: LoadBalancer
  ports:
    - port: 30333
      targetPort: 30333
      name: p2p
    - port: 9944
      targetPort: 9944
      name: rpc
    - port: 9615
      targetPort: 9615
      name: metrics
  selector:
    app: datahaven-fullnode
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: datahaven-fullnode
spec:
  serviceName: datahaven-fullnode
  replicas: 1
  selector:
    matchLabels:
      app: datahaven-fullnode
  template:
    metadata:
      labels:
        app: datahaven-fullnode
    spec:
      containers:
        - name: fullnode
          image: datahavenxyz/datahaven:latest
          ports:
            - containerPort: 30333
              name: p2p
            - containerPort: 9944
              name: rpc
            - containerPort: 9615
              name: metrics
          volumeMounts:
            - name: data
              mountPath: /data
          resources:
            requests:
              memory: "8Gi"
              cpu: "2"
            limits:
              memory: "16Gi"
              cpu: "4"
          args:
            - "--chain=stagenet-local"
            - "--name=FullNode-01"
            - "--base-path=/data"
            - "--port=30333"
            - "--rpc-port=9944"
            - "--rpc-external"
            - "--rpc-cors=all"
            - "--rpc-methods=safe"
            - "--prometheus-external"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 200Gi
```
## On-Chain Registration
### Not Required
Full nodes do not require any on-chain registration or extrinsics.
## Monitoring
### Health Checks
```bash
# Check node health
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9944 | jq
# Check sync status
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_syncState"}' \
http://localhost:9944 | jq
# List connected peers (an unsafe RPC method; rejected when --rpc-methods safe is set)
curl -s -H "Content-Type: application/json" \
  -d '{"id":1, "jsonrpc":"2.0", "method": "system_peers"}' \
  http://localhost:9944 | jq
```
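For scripted health checks it helps to turn the `system_health` response into an exit code. The sketch below assumes the response carries an `isSyncing` field, as upstream Substrate nodes return. It matches on text to stay dependency-free; a `jq` filter would be more robust. `health_is_synced` is a hypothetical helper name.

```bash
# Hypothetical helper: succeed only when the system_health response
# reports "isSyncing": false.
health_is_synced() {
  # Remove whitespace so compact and pretty-printed JSON match alike.
  printf '%s' "$1" | tr -d '[:space:]' | grep -q '"isSyncing":false'
}

# Usage against a local node (same request as the health check above):
# response=$(curl -s -H "Content-Type: application/json" \
#   -d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
#   http://localhost:9944)
# health_is_synced "$response" && echo "node is synced"
```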
### Prometheus Metrics
Metrics are served at `http://localhost:9615/metrics` by default. Pass `--prometheus-external` to expose them on all network interfaces.
Key metrics:
- `substrate_block_height{status="best"}` - Current best block height
- `substrate_block_height{status="finalized"}` - Finalized block height
- `substrate_sub_libp2p_peers_count` - Connected peer count
- `substrate_ready_transactions_number` - Pending transactions
- `substrate_sync_blocks_total` - Total blocks synced
### Log Monitoring
```bash
# View logs with Docker
docker logs -f datahaven-fullnode
# Filter for errors
docker logs datahaven-fullnode 2>&1 | grep -i error
# Check sync progress
docker logs datahaven-fullnode 2>&1 | grep -i "Imported\|Syncing"
```
## RPC Usage Examples
### Query Chain Data
```bash
# Get latest block
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "chain_getBlock"}' \
http://localhost:9944 | jq
# Get the next transaction index (nonce) for an account
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_accountNextIndex", "params":["0x..."]}' \
http://localhost:9944 | jq
```
### Submit Transactions
```bash
# Submit extrinsic
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_submitExtrinsic", "params":["0x..."]}' \
http://localhost:9944 | jq
```
## Troubleshooting
### Issue: Slow Sync Speed
**Solutions:**
1. Increase `--max-runtime-instances` to 8-16
2. Increase `--state-cache-size` (requires more RAM)
3. Use faster storage (NVMe SSD)
4. Add more `--bootnodes` for better peer discovery
### Issue: High Memory Usage
**Solutions:**
1. Reduce `--state-cache-size`
2. Enable pruning (remove `--pruning archive`)
3. Reduce `--max-runtime-instances`
### Issue: RPC Connection Refused
**Check:**
1. `--rpc-external` flag is set
2. Port 9944 is open in firewall
3. `--rpc-cors` includes your origin
4. Node is fully started (check logs)
### Issue: No Peers Connecting
**Solutions:**
1. Verify bootnode addresses are correct
2. Check port 30333 is open
3. Use `--listen-addr /ip4/0.0.0.0/tcp/30333`
4. Check firewall rules
## Performance Tuning
### For RPC Workloads
```bash
datahaven-node \
--rpc-methods safe \
--rpc-max-connections 1000 \
--state-cache-size 134217728 \
--max-runtime-instances 16 \
--execution wasm
```
### For Archive Node
```bash
datahaven-node \
--pruning archive \
--blocks-pruning archive \
--state-cache-size 268435456
```
### Resource Requirements
| Node Type | CPU | RAM | Storage | Network |
|-----------|-----|-----|---------|---------|
| Full Node (Pruned) | 2-4 cores | 8-16 GB | 100-200 GB | 100 Mbps |
| Archive Node | 4-8 cores | 16-32 GB | 500+ GB | 100 Mbps |
| RPC Node (High Traffic) | 8-16 cores | 32-64 GB | 200-500 GB | 1 Gbps |
## Security Considerations
1. **RPC Security**: Use `--rpc-methods safe` for public endpoints
2. **CORS**: Restrict `--rpc-cors` to specific domains in production
3. **Rate Limiting**: Implement reverse proxy with rate limiting
4. **Firewall**: Restrict RPC access to known IPs
5. **Monitoring**: Set up alerts for unusual activity
## Best Practices
1. Use dedicated server for production RPC nodes
2. Enable Prometheus metrics for monitoring
3. Regular backups of chain data
4. Use load balancer for multiple RPC nodes
5. Keep node software updated
6. Monitor disk space usage
7. Implement log rotation
## Related Documentation
- [Bootnode Setup](./datahaven-bootnode.md)
- [Validator Setup](./datahaven-validator.md)
- [Docker Compose Guide](../operator/DOCKER-COMPOSE.md)
- [Kubernetes Deployment](../deploy/charts/node/README.md)

527
docs/datahaven-validator.md Normal file
View file

@ -0,0 +1,527 @@
# DataHaven Validator Node Setup
## Overview
Validator nodes participate in consensus, produce blocks, and secure the DataHaven network through EigenLayer AVS integration.
## Purpose
- Participate in BABE block production
- Sign GRANDPA finality votes
- Submit ImOnline heartbeats
- Participate in BEEFY bridge consensus
- Earn rewards for block production and consensus participation
## Prerequisites
- DataHaven node binary or Docker image
- ECDSA keypair for operator registration on EigenLayer AVS
- Persistent storage for chain data
- Stable network connection
- Open network ports (30333, optionally 9944)
## Hardware Requirements
Validators have the highest hardware requirements as they participate in block production and consensus. Single-threaded CPU performance is critical.
### Minimum Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 8 physical cores @ 3.4 GHz (Intel Ice Lake+ or AMD Zen3+) |
| **RAM** | 32 GB DDR4 ECC |
| **Storage** | 1 TB NVMe SSD (low latency) |
| **Network** | 500 Mbit/s symmetric |
### Recommended Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | Intel Xeon E-2386/E-2388 or AMD Ryzen 9 5950X/5900X |
| **RAM** | 64 GB DDR4 ECC |
| **Storage** | 2 TB NVMe SSD |
| **Network** | 1 Gbit/s symmetric |
### Important Considerations
- **Disable Hyper-Threading/SMT**: Single-threaded performance is prioritized over core count
- **Bare metal preferred**: Cloud VPS may have inconsistent performance due to shared resources
- **Dedicated server**: Do not run other applications on the validator machine
- **Docker not recommended**: Running in containers can significantly impact performance
- **Redundancy**: Consider primary and backup servers in different data centers
## Key Requirements
### Session Keys (4 Required)
Validators require **four session keys** for different consensus mechanisms:
| Key Type | Scheme | Purpose |
|----------|--------|---------|
| `gran` | ed25519 | GRANDPA finality gadget |
| `babe` | sr25519 | BABE block authoring |
| `imon` | sr25519 | ImOnline validator heartbeat |
| `beef` | ecdsa | BEEFY bridge consensus |
### Generate Session Keys
#### Method 1: Using RPC (Recommended)
```bash
# Start node first, then generate keys via RPC
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_rotateKeys"}' \
http://localhost:9944
# Returns: "0x<combined_public_keys_hex>"
```
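Before submitting the returned value on-chain, a quick length check can catch a truncated or empty response. This sketch assumes the result is the plain concatenation of the four public keys listed above: 32 bytes each for ed25519 and sr25519, and 33 bytes for a compressed ECDSA key. It checks only the `0x` prefix and the length. `session_keys_len_ok` is a hypothetical helper name.

```bash
# Hypothetical helper: check that a session keys string has the expected size.
# Assumes gran (32 B) + babe (32 B) + imon (32 B) + beef (33 B) = 129 bytes.
session_keys_len_ok() {
  local keys="$1"
  local expected_hex=$(( (32 + 32 + 32 + 33) * 2 ))  # 258 hex characters
  case "$keys" in
    0x*) ;;          # must carry the 0x prefix
    *) return 1 ;;
  esac
  local hex="${keys#0x}"
  [ "${#hex}" -eq "$expected_hex" ]
}
```

If the runtime defines a different set of session keys, the expected length changes accordingly.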
#### Method 2: CLI Key Insertion
```bash
# Generate a seed phrase first (JSON output avoids parsing the human-readable text)
SEED=$(datahaven-node key generate --output-type json | jq -r '.secretPhrase')
# Insert GRANDPA key (ed25519)
datahaven-node key insert \
--base-path /data/validator \
--chain stagenet-local \
--key-type gran \
--scheme ed25519 \
--suri "$SEED"
# Insert BABE key (sr25519)
datahaven-node key insert \
--base-path /data/validator \
--chain stagenet-local \
--key-type babe \
--scheme sr25519 \
--suri "$SEED"
# Insert ImOnline key (sr25519)
datahaven-node key insert \
--base-path /data/validator \
--chain stagenet-local \
--key-type imon \
--scheme sr25519 \
--suri "$SEED"
# Insert BEEFY key (ecdsa)
datahaven-node key insert \
--base-path /data/validator \
--chain stagenet-local \
--key-type beef \
--scheme ecdsa \
--suri "$SEED"
```
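The four insertions differ only in key type and scheme, so they can be driven from a single list. A minimal sketch follows; `insert_session_keys` and the `DRY_RUN` switch are hypothetical conveniences, and the flags are the same ones used in the commands above.

```bash
# Key type and scheme pairs, taken from the session key table above.
SESSION_KEY_SPECS="gran:ed25519 babe:sr25519 imon:sr25519 beef:ecdsa"

# Hypothetical wrapper around `datahaven-node key insert`.
# With DRY_RUN=1 it prints each command (seed redacted) instead of running it.
insert_session_keys() {
  local base_path="$1" chain="$2" seed="$3"
  local spec key_type scheme
  for spec in $SESSION_KEY_SPECS; do
    key_type="${spec%%:*}"
    scheme="${spec##*:}"
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "datahaven-node key insert --base-path $base_path --chain $chain --key-type $key_type --scheme $scheme --suri <redacted>"
    else
      datahaven-node key insert \
        --base-path "$base_path" \
        --chain "$chain" \
        --key-type "$key_type" \
        --scheme "$scheme" \
        --suri "$seed"
    fi
  done
}
```

Usage: `insert_session_keys /data/validator stagenet-local "$SEED"`, or prefix the call with `DRY_RUN=1` to review the commands first.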
#### Method 3: Docker Entrypoint (Automated)
Set environment variables and let the Docker entrypoint inject keys:
```bash
export NODE_TYPE=validator
export NODE_NAME=Alice
export SEED="your seed phrase here"
export CHAIN=stagenet-local
```
The entrypoint script (`operator/scripts/docker-entrypoint.sh`) automatically injects all 4 keys.
## Wallet Requirements
### Operator Account (ECDSA)
DataHaven validators are EigenLayer operators. The operator account is used to:
- Register as an operator on the DataHaven AVS (on Ethereum)
- Sign the `session.setKeys` extrinsic to associate session keys with the operator
**Important**:
- The account **does NOT need to be funded** on DataHaven - staking happens via EigenLayer delegation on Ethereum
- Token holders delegate stake to operators on EigenLayer, not on the DataHaven chain
- The same private key that controls the operator address on the AVS must sign the session keys transaction
### Generate Operator Account (ECDSA)
```bash
# Generate ECDSA keypair using datahaven-node
datahaven-node key generate --scheme ecdsa
# Output:
# Secret phrase: <your-seed-phrase>
# Network ID: substrate
# Secret seed: 0x...
# Public key (hex): 0x...
# Account ID: 0x... (20-byte Ethereum-style address)
# Derive the Ethereum address from the secret seed (the raw private key) using Foundry's cast
cast wallet address --private-key <secret_seed_hex>
# This gives you the Ethereum address to register on the AVS; it should match the Account ID above
```
### Alternative: Generate with cast (Foundry)
```bash
# Generate a new keypair
cast wallet new
# Or import from private key
cast wallet address --private-key 0x...
```
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--validator \
--name <NODE_NAME>
```
### Important Validator Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--chain <SPEC>` | Chain specification | Required |
| `--validator` | Run as validator | Required |
| `--name <NAME>` | Node name | Required |
| `--base-path <PATH>` | Base directory for data | `~/.local/share/datahaven-node` |
| `--port <PORT>` | P2P port | `30333` |
| `--rpc-port <PORT>` | WebSocket RPC port | `9944` |
| `--bootnodes <MULTIADDR>` | Bootstrap nodes | None |
### Optional Flags
| Flag | Description |
|------|-------------|
| `--rpc-external` | Listen on all interfaces |
| `--rpc-cors <ORIGINS>` | CORS origins (e.g., `all` or `http://localhost:3000`) |
| `--prometheus-external` | Expose Prometheus metrics externally |
| `--telemetry-url <URL>` | Telemetry endpoint |
| `--log <TARGETS>` | Logging verbosity (e.g., `info,runtime=debug`) |
| `--unsafe-force-node-key-generation` | Generate node key (dev only) |
## Complete Setup Example
### 1. Generate Operator Account (ECDSA)
```bash
# Generate ECDSA keypair for operator registration
datahaven-node key generate --scheme ecdsa
# Save the seed phrase and note the secret seed (the raw private key)
# Example output:
# Secret phrase: "word1 word2 ... word12"
# Secret seed: 0x...
# Public key (hex): 0x0123456789abcdef...
# Get the Ethereum address for AVS registration from the secret seed
OPERATOR_ETH_ADDRESS=$(cast wallet address --private-key 0x<secret_seed_hex>)
echo "Operator ETH Address: $OPERATOR_ETH_ADDRESS"
```
### 2. Register as Operator on EigenLayer AVS
Before setting session keys, register your operator address on the DataHaven AVS contract on Ethereum. See [On-Chain Registration](#on-chain-registration) for details.
### 3. Generate Session Keys
```bash
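# Start the node so session keys can be generated via RPC.
# Use the same --base-path as the validator in step 4: author_rotateKeys
# stores the new keys in this node's keystore.
datahaven-node \
  --chain stagenet-local \
  --base-path /data/validator \
  --validator \
  --name "TempValidator" \
  --rpc-port 9944 &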
# Wait for node to start
sleep 10
# Generate session keys
SESSION_KEYS=$(curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_rotateKeys"}' \
http://localhost:9944 | jq -r '.result')
echo "Session Keys: $SESSION_KEYS"
# Stop temporary node
pkill -f datahaven-node
```
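A fixed `sleep 10` can be too short on slow disks and wastes time on fast ones. One alternative is to poll the RPC endpoint until it answers. The sketch below uses two hypothetical helpers, `retry` and `rpc_ready`, and assumes the RPC port from the example above.

```bash
# Hypothetical helper: run a command up to $1 times, waiting $2 seconds
# between attempts, and stop at the first success.
retry() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Hypothetical probe: succeeds once the RPC endpoint answers system_health.
rpc_ready() {
  curl -sf -H "Content-Type: application/json" \
    -d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
    http://localhost:9944 > /dev/null
}

# Usage: retry 30 2 rpc_ready || { echo "RPC did not come up" >&2; exit 1; }
```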
### 4. Start Validator Node
```bash
datahaven-node \
--chain stagenet-local \
--validator \
--name "Validator-01" \
--base-path /data/validator \
--port 30333 \
--rpc-port 9944 \
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/12D3KooW... \
--telemetry-url "wss://telemetry.polkadot.io/submit/ 0" \
--log info
```
### 5. Set Session Keys On-Chain
See [On-Chain Registration](#on-chain-registration) section below.
## Docker Deployment
### Docker Compose
```yaml
version: '3.8'
services:
  validator:
    image: datahavenxyz/datahaven:latest
    container_name: datahaven-validator
    environment:
      NODE_TYPE: validator
      NODE_NAME: Alice
      SEED: "your seed phrase here"
      CHAIN: stagenet-local
      KEYSTORE_PATH: /data/keystore
    ports:
      - "30333:30333"
      - "9944:9944"
    volumes:
      - validator-data:/data
    command:
      - "--chain=stagenet-local"
      - "--validator"
      - "--name=Validator-01"
      - "--base-path=/data"
      - "--keystore-path=/data/keystore"
      - "--port=30333"
      - "--rpc-port=9944"
      - "--rpc-external"
      - "--rpc-cors=all"
volumes:
  validator-data:
```
## Kubernetes Deployment
See `deploy/charts/node/values.yaml` for full Helm configuration.
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: datahaven-validator
spec:
  serviceName: datahaven-validator
  replicas: 1
  selector:
    matchLabels:
      app: datahaven-validator
  template:
    metadata:
      labels:
        app: datahaven-validator
    spec:
      containers:
        - name: validator
          image: datahavenxyz/datahaven:latest
          env:
            - name: NODE_TYPE
              value: "validator"
            - name: NODE_NAME
              value: "Alice"
            - name: SEED
              valueFrom:
                secretKeyRef:
                  name: validator-seed
                  key: seed
            - name: CHAIN
              value: "stagenet-local"
          ports:
            - containerPort: 30333
              name: p2p
            - containerPort: 9944
              name: rpc
          volumeMounts:
            - name: data
              mountPath: /data
          args:
            - "--chain=stagenet-local"
            - "--validator"
            - "--name=Validator-01"
            - "--base-path=/data"
            - "--port=30333"
            - "--rpc-port=9944"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 200Gi
```
## On-Chain Registration
### Step 1: Register as Operator on EigenLayer AVS
Before setting session keys, you must register your operator address on the DataHaven AVS contract on Ethereum. This establishes your identity as a validator operator.
```solidity
// DataHavenServiceManager.sol
function registerOperatorToAVS(
address operator,
ISignatureUtils.SignatureWithSaltAndExpiry memory operatorSignature
) external;
```
See `contracts/` directory and `test/scripts/` for registration scripts.
### Step 2: Set Session Keys
After registering on the AVS, submit your session keys to the DataHaven chain. **Important**: The transaction must be signed with the same private key used for AVS registration, so the session keys are associated with your operator address.
Using Polkadot.js Apps or TypeScript:
```typescript
import { ApiPromise, WsProvider } from '@polkadot/api';
import { Keyring } from '@polkadot/keyring';
const api = await ApiPromise.create({
provider: new WsProvider('ws://localhost:9944')
});
// Use 'ethereum' keyring type for ECDSA keys
const keyring = new Keyring({ type: 'ethereum' });
// Use the same seed phrase as your AVS operator account
const operator = keyring.addFromUri('your operator seed phrase');
// Set session keys (from author_rotateKeys RPC)
const sessionKeys = '0x...'; // Combined public keys hex
const setKeysTx = api.tx.session.setKeys(sessionKeys, []);
await setKeysTx.signAndSend(operator);
```
### Step 3: Verify Registration
```bash
# Check session keys
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_hasSessionKeys", "params":["0x..."]}' \
http://localhost:9944
# Check node health (peer count and sync status)
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9944
```
## Monitoring
### Key Metrics
- Block production rate
- Finality lag
- Peer count
- Session key validity
- ImOnline heartbeats
### Prometheus Metrics
```bash
# Enable Prometheus endpoint
datahaven-node --validator --prometheus-external --prometheus-port 9615
# Access metrics
curl http://localhost:9615/metrics
```
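The finality lag listed under Key Metrics can be derived from the Prometheus output. The sketch below assumes the node exposes the standard Substrate `substrate_block_height` gauge with `status="best"` and `status="finalized"` labels; that metric name is an assumption about the DataHaven node, and the helper name `finality_lag` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: reads Prometheus text format from stdin and prints
# best_height - finalized_height. Exits non-zero if either metric is missing.
finality_lag() {
  awk '
    /^substrate_block_height[{]/ && /status="best"/      { best = $NF }
    /^substrate_block_height[{]/ && /status="finalized"/ { fin  = $NF }
    END {
      if (best == "" || fin == "") exit 1   # metric not found
      print best - fin
    }
  '
}

# Usage:
#   curl -s http://localhost:9615/metrics | finality_lag
```

A lag that keeps growing while the best block advances usually points at a finality problem rather than a sync problem.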
### Health Checks
```bash
# System health
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9944 | jq
# Chain info
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_chain"}' \
http://localhost:9944 | jq
```
## Troubleshooting
### Issue: Not Producing Blocks
**Check:**
1. Session keys are set on-chain via `session.setKeys`
2. Account is in the validator set
3. Node is fully synced
4. The keys in the node's keystore match the keys registered on-chain (check with `author_hasSessionKeys`)
### Issue: Session Keys Lost
**Solution:**
```bash
# Rotate keys and re-register
curl -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "author_rotateKeys"}' \
http://localhost:9944
# Then submit new keys via session.setKeys extrinsic
```
### Issue: Not in Active Validator Set
**Check:**
1. Operator is registered on the DataHaven AVS contract (Ethereum)
2. Operator has sufficient delegated stake on EigenLayer
3. Session keys are correctly associated with operator address
4. Not slashed on EigenLayer
5. Session transition period (changes take effect in the next session)
6. Maximum validator count not exceeded
### Issue: Session Keys Not Linked to Operator
**Check:**
1. The `session.setKeys` transaction was signed with the same private key registered on the AVS
2. Verify the signing address matches your operator address on the AVS contract
3. Use `author_hasSessionKeys` RPC to confirm keys are stored locally
**Solution:**
```bash
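# Derive your operator address and compare it with the one registered on the AVS
# (cast derives the address from the private key, not the public key)
cast wallet address --private-key <your_operator_private_key_hex>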
# Re-submit session.setKeys with the correct operator account
```
## Security Considerations
1. **Key Management**: Store seed phrase securely offline
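2. **Network Security**: Firewall the RPC port (9944); flags such as `--rpc-external` and `--rpc-cors=all` expose the RPC interface and should only be used on trusted networks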
3. **High Availability**: Implement monitoring and alerting
4. **Slashing Prevention**: Monitor validator performance
5. **Backup Strategy**: Regular backups of keystores
## Best Practices
1. Monitor network connectivity
2. Keep node software updated
3. Test key rotation procedures
4. Document incident response procedures
## Related Documentation
- [Bootnode Setup](./datahaven-bootnode.md)
- [Full Node Setup](./datahaven-fullnode.md)
- [EigenLayer AVS Integration](../contracts/README.md)
- [Rewards System](../operator/pallets/external-validators/README.md)


@ -0,0 +1,500 @@
# Snowbridge Beacon Relay
## Overview
The Beacon Relay syncs Ethereum beacon chain (consensus layer) finality to the DataHaven blockchain. It monitors the Ethereum beacon chain and submits sync committee updates and finality proofs to the `EthereumBeaconClient` pallet on DataHaven.
## Purpose
- Relay Ethereum beacon chain finality to DataHaven
- Submit sync committee updates for light client verification
- Enable trustless verification of Ethereum state on DataHaven
- Support cross-chain message verification from Ethereum
## Direction
```
Ethereum Beacon Chain → DataHaven
```
## Prerequisites
- Docker with `linux/amd64` platform support
- Access to Ethereum consensus layer (beacon) endpoint
- Access to DataHaven node WebSocket endpoint
- Substrate account with balance for transaction fees
- Persistent storage for relay datastore
## Hardware Requirements
### Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 cores |
| **RAM** | 8 GB |
| **Storage (Datastore)** | 10 GB SSD |
| **Network** | 100 Mbit/s symmetric |
### Important Considerations
- **Persistent storage**: The relay maintains a local datastore to track processed beacon updates; use persistent volumes in containerized deployments
- **Network latency**: Low latency connections to both beacon node and DataHaven node improve relay performance
- **Reliable RPC endpoints**: Use enterprise-grade or self-hosted beacon nodes for production deployments
## RPC Endpoint Requirements
### Beacon Node API
The relay requires access to a **stable, reliable Ethereum Beacon API endpoint**. Endpoint instability or downtime will prevent the relay from functioning correctly.
**Recommended providers:**
- Self-hosted beacon node (Lighthouse, Prysm, Teku, Nimbus, Lodestar)
- [Dwellir](https://www.dwellir.com/)
- [Chainstack](https://chainstack.com/)
- [QuickNode](https://www.quicknode.com/)
- [Alchemy](https://www.alchemy.com/)
**Requirements:**
- Full beacon API support (`/eth/v1/beacon/*` endpoints)
- State endpoint access for sync committee data
- Low latency (< 100ms recommended)
- High availability (99.9%+ uptime)
## Relay Redundancy
### Why Redundancy Matters
Running multiple relay instances provides fault tolerance and ensures continuous bridge operation even if one relay fails. The on-chain pallets have built-in deduplication, so only the first valid submission is accepted—redundant relays simply provide backup coverage.
### Configuring Redundant Relays
Deploy multiple relay instances pointing to **different RPC providers** for maximum fault tolerance:
**Instance 1 (Primary):**
```json
{
"source": {
"beacon": {
"endpoint": "https://beacon-provider-a.example.com",
"stateEndpoint": "https://beacon-provider-a.example.com"
}
},
"sink": {
"parachain": {
"endpoint": "wss://datahaven-rpc-1.example.com"
}
}
}
```
**Instance 2 (Backup):**
```json
{
"source": {
"beacon": {
"endpoint": "https://beacon-provider-b.example.com",
"stateEndpoint": "https://beacon-provider-b.example.com"
}
},
"sink": {
"parachain": {
"endpoint": "wss://datahaven-rpc-2.example.com"
}
}
}
```
### Best Practices for Redundancy
1. **Use different RPC providers**: Avoid single points of failure by using different beacon node providers for each relay instance
2. **Geographic distribution**: Deploy relays in different regions/data centers
3. **Independent infrastructure**: Run relays on separate machines or Kubernetes nodes
4. **Separate funding accounts**: Use different relay accounts to avoid nonce conflicts
5. **Monitor all instances**: Set up alerting for each relay independently
## Key Requirements
### Substrate Private Key
The Beacon Relay requires a **Substrate private key** to sign and submit extrinsics to the DataHaven chain.
| Key Type | Purpose |
|----------|---------|
| Substrate (sr25519/ecdsa) | Sign beacon update extrinsics on DataHaven |
### Account Funding
The relay account must be funded with HAVE tokens to pay for transaction fees when submitting beacon updates.
**Recommended Balance**: 100+ HAVE for continuous operations
For detailed operating cost estimates and optimization strategies, see the [Relay Operating Costs](./snowbridge-relay-costs.md) guide.
## CLI Flags
### Required Flags
| Flag | Description |
|------|-------------|
| `--config <PATH>` | Path to the JSON configuration file |
### Private Key Flags (One Required)
| Flag | Description |
|------|-------------|
| `--substrate.private-key <KEY>` | Substrate private key URI directly |
| `--substrate.private-key-file <PATH>` | Path to file containing the private key |
| `--substrate.private-key-id <ID>` | AWS Secrets Manager secret ID for the private key |
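When using `--substrate.private-key-file`, the key file should be readable only by the user running the relay. A minimal sketch, assuming the relay reads the raw key text from the file; the helper name `write_key_file` and the example path are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: write a private key to a file that only the
# current user can read or write (mode 600).
write_key_file() {
  local key="$1" path="$2"
  ( umask 077 && printf '%s' "$key" > "$path" )  # new files get restrictive perms
  chmod 600 "$path"                              # tighten perms if the file pre-existed
}

# Usage:
#   write_key_file "$SUBSTRATE_KEY" ./secrets/beacon-relay-substrate-key
```

Keeping the key in a file rather than on the command line also keeps it out of shell history and process listings.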
## Configuration File
### Structure
```json
{
"source": {
"beacon": {
"endpoint": "http://beacon-node:4000",
"stateEndpoint": "http://beacon-node:4000",
"spec": {
"syncCommitteeSize": 512,
"slotsInEpoch": 32,
"epochsPerSyncCommitteePeriod": 256,
"forkVersions": {
"deneb": 269568,
"electra": 364032,
"fulu": 411392
}
},
"datastore": {
"location": "/relay-data",
"maxEntries": 100
}
}
},
"sink": {
"parachain": {
"endpoint": "ws://datahaven-node:9944",
"maxWatchedExtrinsics": 8,
"headerRedundancy": 20
},
"updateSlotInterval": 30
}
}
```
### Fork Versions by Network
Despite its name, the `forkVersions` object maps each consensus layer fork to the epoch at which it becomes active, not to its fork version bytes. Use the values that match your target network:
#### Ethereum Mainnet
```json
"forkVersions": {
"deneb": 269568,
"electra": 364032,
"fulu": 411392
}
```
| Fork | Epoch | Activation Date | Fork Version Hex |
|------|-------|-----------------|------------------|
| Deneb | 269568 | March 13, 2024 | `0x04000000` |
| Electra | 364032 | May 7, 2025 | `0x05000000` |
| Fulu | 411392 | December 3, 2025 | `0x06000000` |
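The epochs in the table can be cross-checked against the activation dates with a little arithmetic: an epoch starts at `genesis_time + epoch * 32 slots * 12 seconds`. The sketch below uses the mainnet beacon chain genesis time (`1606824023`, not stated elsewhere in this guide); the function name `epoch_to_unix` is hypothetical.

```bash
#!/usr/bin/env bash
# Mainnet beacon chain constants (genesis time is a Unix timestamp).
MAINNET_GENESIS_TIME=1606824023
SECONDS_PER_SLOT=12
SLOTS_PER_EPOCH=32

# Print the Unix timestamp at which the given epoch starts.
epoch_to_unix() {
  local genesis="$1" epoch="$2"
  echo $(( genesis + epoch * SLOTS_PER_EPOCH * SECONDS_PER_SLOT ))
}

epoch_to_unix "$MAINNET_GENESIS_TIME" 269568   # Deneb   -> 1710338135
epoch_to_unix "$MAINNET_GENESIS_TIME" 364032   # Electra -> 1746612311
# With GNU date: date -u -d @1710338135   (March 13, 2024)
```

The same function works for other networks if you substitute that network's genesis time.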
#### Hoodi Testnet
```json
"forkVersions": {
  "deneb": 0,
  "electra": 2048,
  "fulu": 50688
}
```
| Fork | Epoch | Fork Version Hex |
|------|-------|------------------|
| Deneb | 0 (genesis) | `0x50000910` |
| Electra | 2048 | `0x60000910` |
| Fulu | 50688 | `0x70000910` |
**Note**: Hoodi is a merged-from-genesis testnet where Deneb is active from epoch 0. Check the [official Hoodi configuration](https://github.com/eth-clients/hoodi) for the latest values.
#### Local Development / Devnet
```json
"forkVersions": {
"deneb": 0,
"electra": 0,
"fulu": 0
}
```
For local development networks where all forks are active from genesis.
### Configuration Parameters
#### Source (Beacon Chain)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `source.beacon.endpoint` | Beacon chain HTTP API endpoint | `http://beacon-node:4000` |
| `source.beacon.stateEndpoint` | Beacon chain state endpoint (usually same as above) | `http://beacon-node:4000` |
| `source.beacon.spec.syncCommitteeSize` | Size of sync committee | `512` |
| `source.beacon.spec.slotsInEpoch` | Slots per epoch | `32` |
| `source.beacon.spec.epochsPerSyncCommitteePeriod` | Epochs per sync committee period | `256` |
| `source.beacon.spec.forkVersions.deneb` | Epoch when Deneb fork activated | `269568` (mainnet) |
| `source.beacon.spec.forkVersions.electra` | Epoch when Electra fork activated | `364032` (mainnet) |
| `source.beacon.spec.forkVersions.fulu` | Epoch when Fulu fork activated | `411392` (mainnet) |
| `source.beacon.datastore.location` | Path to persistent datastore | `/relay-data` |
| `source.beacon.datastore.maxEntries` | Maximum datastore entries | `100` |
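The `spec` values determine how beacon slots map to sync committee periods: with 32 slots per epoch and 256 epochs per period, one period spans 8192 slots (about 27.3 hours at 12 seconds per slot). A small sketch of that mapping, with a hypothetical function name:

```bash
#!/usr/bin/env bash
# Values from the example configuration above.
SLOTS_IN_EPOCH=32
EPOCHS_PER_SYNC_COMMITTEE_PERIOD=256

# Print the sync committee period that contains the given slot.
sync_committee_period() {
  local slot="$1"
  echo $(( slot / (SLOTS_IN_EPOCH * EPOCHS_PER_SYNC_COMMITTEE_PERIOD) ))
}

sync_committee_period 8191   # last slot of period 0 -> 0
sync_committee_period 8192   # first slot of period 1 -> 1
```

This is useful when reading relay logs: a sync committee update is expected roughly once per period.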
#### Sink (DataHaven)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `sink.parachain.endpoint` | DataHaven WebSocket endpoint | `ws://datahaven-node:9944` |
| `sink.parachain.maxWatchedExtrinsics` | Max concurrent watched extrinsics | `8` |
| `sink.parachain.headerRedundancy` | Header redundancy factor | `20` |
| `sink.updateSlotInterval` | Slot interval for updates | `30` |
## Initialization: Beacon Client Pallet
Before starting the Beacon Relay, the `EthereumBeaconClient` pallet must be initialized with a checkpoint.
### Generate Initial Checkpoint
```bash
# Create the host file first; otherwise Docker creates a directory at the bind-mount path
touch checkpoint.json
docker run --rm \
  -v "$(pwd)/beacon-relay.json:/app/beacon-relay.json:ro" \
  -v "$(pwd)/checkpoint.json:/app/dump-initial-checkpoint.json" \
  -v "$(pwd)/datastore:/data" \
  --platform linux/amd64 \
  datahavenxyz/snowbridge-relay:latest \
  generate-beacon-checkpoint --config beacon-relay.json \
  > beacon_checkpoint.hex
```
This outputs the raw checkpoint payload to `beacon_checkpoint.hex`.
### Submit Checkpoint to DataHaven
The checkpoint must be submitted via a sudo call to `EthereumBeaconClient.force_checkpoint`. There are two methods:
#### Option 1: Using Polkadot.js Apps (Recommended)
1. Open [Polkadot.js Apps](https://polkadot.js.org/apps/) and connect to your DataHaven node
2. Navigate to **Developer** > **Extrinsics**
3. Select **Decode** tab
4. Prepend `0x24003c00` to the contents of `beacon_checkpoint.hex` and paste the full hex string. If the payload itself starts with `0x`, drop that prefix first so the result contains a single leading `0x`.
   - `0x24` = Sudo pallet index
   - `0x00` = sudo call index
   - `0x3c` = EthereumBeaconClient pallet index
   - `0x00` = force_checkpoint call index
5. The UI should decode this as `sudo.sudo(ethereumBeaconClient.force_checkpoint(...))`
6. Select your Sudo account and submit the transaction
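The hex string for step 4 can be assembled on the command line instead of by hand. A minimal sketch, assuming the call indices listed above and a payload file that may or may not start with `0x`; the helper name `build_force_checkpoint_call` is hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helper: prefix the checkpoint payload with the call indices
# (Sudo.sudo = 0x2400, EthereumBeaconClient.force_checkpoint = 0x3c00).
build_force_checkpoint_call() {
  local prefix="24003c00"
  local payload
  payload=$(tr -d '[:space:]' < "$1")   # drop newlines and other whitespace
  payload="${payload#0x}"               # drop a leading 0x, if present
  echo "0x${prefix}${payload}"
}

# Usage:
#   build_force_checkpoint_call beacon_checkpoint.hex
```

Paste the printed string into the **Decode** tab and confirm that it decodes as described in step 5 before submitting.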
#### Option 2: Using Polkadot-API (TypeScript)
```typescript
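import { createClient } from 'polkadot-api';
import { getWsProvider } from 'polkadot-api/ws-provider/web';
import { datahaven } from '@polkadot-api/descriptors';
// Endpoint is an example; point it at your DataHaven node
const wsProvider = getWsProvider('ws://localhost:9944');
const client = createClient(wsProvider);
const api = client.getTypedApi(datahaven);
// `checkpoint` and `sudoSigner` must be supplied by the caller:
// the checkpoint parsed from dump-initial-checkpoint.json and a
// PolkadotSigner for the Sudo account.
const forceCheckpointCall = api.tx.EthereumBeaconClient.force_checkpoint({
  update: checkpoint
});
const tx = api.tx.Sudo.sudo({
  call: forceCheckpointCall.decodedCall
});
await tx.signAndSubmit(sudoSigner);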
```
## Running the Relay
### Docker Run
```bash
docker run -d \
--name snowbridge-beacon-relay \
--platform linux/amd64 \
--add-host host.docker.internal:host-gateway \
--network datahaven-network \
-v $(pwd)/beacon-relay.json:/configs/beacon-relay.json:ro \
-v $(pwd)/relay-data:/relay-data \
--pull always \
datahavenxyz/snowbridge-relay:latest \
run beacon \
--config /configs/beacon-relay.json \
--substrate.private-key "0x..."
```
### Docker Compose
```yaml
version: '3.8'
services:
  beacon-relay:
    image: datahavenxyz/snowbridge-relay:latest
    container_name: snowbridge-beacon-relay
    platform: linux/amd64
    restart: unless-stopped
    volumes:
      - ./configs/beacon-relay.json:/configs/beacon-relay.json:ro
      - beacon-relay-data:/relay-data
    command:
      - "run"
      - "beacon"
      - "--config"
      - "/configs/beacon-relay.json"
      - "--substrate.private-key-file"
      # Compose mounts secrets under /run/secrets/<name>
      - "/run/secrets/substrate-key"
    secrets:
      - substrate-key
volumes:
  beacon-relay-data:
secrets:
  substrate-key:
    file: ./secrets/beacon-relay-substrate-key
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dh-beacon-relay
spec:
  serviceName: dh-beacon-relay
  replicas: 1
  selector:
    matchLabels:
      app: dh-beacon-relay
  template:
    metadata:
      labels:
        app: dh-beacon-relay
    spec:
      containers:
        - name: beacon-relay
          image: datahavenxyz/snowbridge-relay:latest
          imagePullPolicy: Always
          args:
            - "run"
            - "beacon"
            - "--config"
            - "/configs/beacon-relay.json"
            - "--substrate.private-key-file"
            - "/secrets/dh-beacon-relay-substrate-key"
          volumeMounts:
            - name: config
              mountPath: /configs
              readOnly: true
            - name: secrets
              mountPath: /secrets
              readOnly: true
            - name: relay-data
              mountPath: /relay-data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
      volumes:
        - name: config
          configMap:
            name: beacon-relay-config
        - name: secrets
          secret:
            secretName: dh-beacon-relay-substrate-key
  volumeClaimTemplates:
    - metadata:
        name: relay-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
```
## Monitoring
### Health Checks
The relay logs sync committee updates and beacon chain state:
```bash
# View relay logs
docker logs -f snowbridge-beacon-relay
# Check for successful updates
docker logs snowbridge-beacon-relay 2>&1 | grep -i "submitted\|update"
```
### Key Metrics to Monitor
- Beacon chain slot lag
- Sync committee update frequency
- Transaction success rate
- Account balance (for fees)
## Troubleshooting
### Issue: Beacon Chain Not Ready
**Symptoms**: Relay fails to start or continuously retries
**Check**:
1. Beacon chain endpoint is accessible
2. Beacon chain has finalized blocks
3. Network connectivity between relay and beacon node
```bash
# Test beacon chain connectivity
curl http://beacon-node:4000/eth/v1/beacon/states/head/finality_checkpoints
```
### Issue: Checkpoint Submission Failed
**Check**:
1. Sudo account has sufficient balance
2. Checkpoint data is valid
3. DataHaven node is synced
### Issue: Transaction Failures
**Check**:
1. Relay account has sufficient HAVE balance
2. DataHaven node is accessible
3. No duplicate relayers submitting same updates
## Security Considerations
1. **Private Key Protection**: Store private keys securely (AWS Secrets Manager, Kubernetes secrets, or encrypted files)
2. **Network Security**: Restrict access to relay endpoints
3. **Access Control**: Use dedicated accounts with minimal required permissions
4. **Monitoring**: Set up alerts for relay failures
## Related Documentation
- [BEEFY Relay](./snowbridge-beefy-relay.md)
- [Execution Relay](./snowbridge-execution-relay.md)
- [Solochain Relay](./snowbridge-solochain-relay.md)
- [Relay Operating Costs](./snowbridge-relay-costs.md)
- [Snowbridge Documentation](https://docs.snowbridge.network)
- [DataHaven Snowbridge Repository](https://github.com/datahaven-xyz/snowbridge)
- [Ethereum Consensus Specs - Mainnet Config](https://github.com/ethereum/consensus-specs/blob/dev/configs/mainnet.yaml)
- [Hoodi Testnet Configuration](https://github.com/eth-clients/hoodi)
- [Ethereum Fork Timeline](https://ethereum.org/ethereum-forks/)


@ -0,0 +1,447 @@
# Snowbridge BEEFY Relay
## Overview
The BEEFY Relay submits DataHaven BEEFY (Bridge Efficiency Enabling Finality Yielder) finality proofs to the `BeefyClient` smart contract on Ethereum. This enables trustless verification of DataHaven state on Ethereum.
## Purpose
- Relay DataHaven BEEFY finality proofs to Ethereum
- Submit validator set commitments to BeefyClient contract
- Enable trustless verification of DataHaven state on Ethereum
- Support cross-chain message verification to Ethereum
## Direction
```
DataHaven → Ethereum
```
## Prerequisites
- Docker with `linux/amd64` platform support
- Access to DataHaven node WebSocket endpoint
- Access to Ethereum execution layer WebSocket endpoint
- Ethereum account with ETH for gas fees
- Deployed BeefyClient and Gateway contracts on Ethereum
## Hardware Requirements
### Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 cores |
| **RAM** | 8 GB |
| **Storage** | 5 GB SSD (ephemeral; no persistent datastore) |
| **Network** | 100 Mbit/s symmetric |
### Important Considerations
- **No persistent storage required**: BEEFY relay is stateless and recovers from on-chain state on restart
- **Gas optimization**: The relay batches BEEFY proofs when possible to reduce Ethereum gas costs
- **Network latency**: Low latency to Ethereum node is important for timely proof submission
- **Reliable RPC endpoints**: Use enterprise-grade or self-hosted nodes for production deployments
## RPC Endpoint Requirements
### Ethereum Execution Layer
The relay requires access to a **stable, reliable Ethereum WebSocket endpoint**. Endpoint instability or downtime will prevent the relay from functioning correctly.
**Recommended providers:**
- Self-hosted execution node (Geth, Nethermind, Besu, Erigon)
- [Dwellir](https://www.dwellir.com/)
- [Chainstack](https://chainstack.com/)
- [QuickNode](https://www.quicknode.com/)
- [Alchemy](https://www.alchemy.com/)
**Requirements:**
- WebSocket support (WSS for production)
- Low latency (< 100ms recommended)
- High availability (99.9%+ uptime)
### DataHaven Node
- Full node or archive node with WebSocket endpoint
- Low latency connection for monitoring BEEFY finality
## Relay Redundancy
### Why Redundancy Matters
Running multiple relay instances provides fault tolerance and ensures continuous bridge operation even if one relay fails. The BeefyClient contract handles duplicate submissions gracefully—only the first valid submission is processed.
### Configuring Redundant Relays
Deploy multiple relay instances pointing to **different RPC providers** for maximum fault tolerance:
**Instance 1 (Primary):**
```json
{
"source": {
"polkadot": {
"endpoint": "wss://datahaven-rpc-1.example.com"
}
},
"sink": {
"ethereum": {
"endpoint": "wss://eth-provider-a.example.com"
}
}
}
```
**Instance 2 (Backup):**
```json
{
"source": {
"polkadot": {
"endpoint": "wss://datahaven-rpc-2.example.com"
}
},
"sink": {
"ethereum": {
"endpoint": "wss://eth-provider-b.example.com"
}
}
}
```
### Best Practices for Redundancy
1. **Use different RPC providers**: Avoid single points of failure by using different Ethereum node providers for each relay instance
2. **Geographic distribution**: Deploy relays in different regions/data centers
3. **Independent infrastructure**: Run relays on separate machines or Kubernetes nodes
4. **Separate funding accounts**: Use different relay accounts to avoid nonce conflicts
5. **Monitor all instances**: Set up alerting for each relay independently
## Key Requirements
### Ethereum Private Key
The BEEFY Relay requires an **Ethereum private key** to sign and submit transactions to the BeefyClient contract.
| Key Type | Purpose |
|----------|---------|
| Ethereum (secp256k1) | Sign Ethereum transactions to BeefyClient contract |
### Account Funding
The relay account must be funded with ETH to pay for gas when submitting BEEFY proofs.
**Recommended Balance**: 0.5+ ETH for continuous operations (gas costs vary with network conditions)
For detailed operating cost estimates and optimization strategies, see the [Relay Operating Costs](./snowbridge-relay-costs.md) guide.
## CLI Flags
### Required Flags
| Flag | Description |
|------|-------------|
| `--config <PATH>` | Path to the JSON configuration file |
### Private Key Flags (One Required)
| Flag | Description |
|------|-------------|
| `--ethereum.private-key <KEY>` | Ethereum private key directly |
| `--ethereum.private-key-file <PATH>` | Path to file containing the private key |
| `--ethereum.private-key-id <ID>` | AWS Secrets Manager secret ID for the private key |
### Optional Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--on-demand` | Synchronize commitments on demand | `false` |
## Configuration File
### Structure
```json
{
"source": {
"polkadot": {
"endpoint": "ws://datahaven-node:9944"
}
},
"sink": {
"ethereum": {
"endpoint": "ws://ethereum-node:8546",
"gas-limit": ""
},
"descendants-until-final": 3,
"contracts": {
"BeefyClient": "0x4826533B4897376654Bb4d4AD88B7faFD0C98528",
"Gateway": "0x8f86403A4DE0BB5791fa46B8e795C547942fE4Cf"
}
},
"on-demand-sync": {
"max-tokens": 5,
"refill-amount": 1,
"refill-period": 3600
}
}
```
### Configuration Parameters
#### Source (DataHaven)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `source.polkadot.endpoint` | DataHaven WebSocket endpoint | `ws://datahaven-node:9944` |
#### Sink (Ethereum)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `sink.ethereum.endpoint` | Ethereum WebSocket endpoint | `ws://ethereum-node:8546` |
| `sink.ethereum.gas-limit` | Optional gas limit override | `""` (empty for auto) |
| `sink.descendants-until-final` | Blocks to wait for finality | `3` |
| `sink.contracts.BeefyClient` | BeefyClient contract address | `0x...` |
| `sink.contracts.Gateway` | Gateway contract address | `0x...` |
#### On-Demand Sync (Rate Limiting)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `on-demand-sync.max-tokens` | Maximum tokens for rate limiting | `5` |
| `on-demand-sync.refill-amount` | Tokens to refill per period | `1` |
| `on-demand-sync.refill-period` | Refill period in seconds | `3600` |
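These three parameters read like a standard token bucket: each on-demand sync spends one token, the bucket holds at most `max-tokens`, and `refill-amount` tokens are added every `refill-period` seconds. The sketch below models that interpretation only; it is an assumption about the relay's behavior rather than its actual implementation, and the function name `tokens_after` is hypothetical.

```bash
#!/usr/bin/env bash
# Example values from the configuration above.
MAX_TOKENS=5
REFILL_AMOUNT=1
REFILL_PERIOD=3600   # seconds

# tokens_after <current_tokens> <elapsed_seconds>
# Prints the number of tokens available after the elapsed time,
# counting only completed refill periods and capping at MAX_TOKENS.
tokens_after() {
  local tokens="$1" elapsed="$2"
  local refilled=$(( tokens + (elapsed / REFILL_PERIOD) * REFILL_AMOUNT ))
  echo $(( refilled > MAX_TOKENS ? MAX_TOKENS : refilled ))
}

tokens_after 0 3600    # one full period elapsed -> 1
tokens_after 4 36000   # capped at MAX_TOKENS   -> 5
```

Under this model the example values allow a burst of up to 5 syncs and a sustained rate of 1 sync per hour.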
## Prerequisites: BEEFY Protocol Ready
Before starting the BEEFY Relay, the BEEFY protocol must be active on DataHaven.
### Check BEEFY Status
```bash
# Using curl with JSON-RPC
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "beefy_getFinalizedHead"}' \
http://localhost:9944
# Should return a non-zero block hash when ready
```
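The manual check above can be turned into a polling loop for deployment scripts. A minimal sketch, assuming `curl` and `jq` are installed; the function names `is_nonzero_hash` and `wait_beefy_ready`, the default 60-second timeout, and the 2-second poll interval are illustrative choices.

```bash
#!/usr/bin/env bash
# Succeeds if the argument is a hex hash with at least one non-zero digit.
is_nonzero_hash() {
  local h="${1#0x}"
  [[ -n "$h" && "$h" =~ ^[0-9a-fA-F]+$ && "$h" =~ [1-9a-fA-F] ]]
}

# wait_beefy_ready <rpc_url> [timeout_seconds] [poll_interval_seconds]
# Polls beefy_getFinalizedHead until it returns a non-zero hash.
wait_beefy_ready() {
  local url="$1" timeout="${2:-60}" interval="${3:-2}" deadline head
  deadline=$(( $(date +%s) + timeout ))
  while (( $(date +%s) < deadline )); do
    head=$(curl -s -H "Content-Type: application/json" \
      -d '{"id":1, "jsonrpc":"2.0", "method": "beefy_getFinalizedHead"}' \
      "$url" | jq -r '.result // empty') || head=""
    if is_nonzero_hash "$head"; then
      echo "$head"
      return 0
    fi
    sleep "$interval"
  done
  return 1   # timed out
}

# Usage:
#   wait_beefy_ready http://localhost:9944 60 || echo "BEEFY not ready" >&2
```

Starting the relay only after this returns successfully avoids a restart loop while the chain is still warming up.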
### Automated Wait (Test Environment)
The test framework automatically waits for BEEFY with a 60-second timeout:
```typescript
const waitBeefyReady = async (pollIntervalMs: number, timeoutMs: number) => {
// Poll beefy_getFinalizedHead until it returns a non-zero hash
};
```
## Running the Relay
### Docker Run
```bash
docker run -d \
--name snowbridge-beefy-relay \
--platform linux/amd64 \
--add-host host.docker.internal:host-gateway \
--network datahaven-network \
-v $(pwd)/beefy-relay.json:/configs/beefy-relay.json:ro \
--pull always \
datahavenxyz/snowbridge-relay:latest \
run beefy \
--config /configs/beefy-relay.json \
--ethereum.private-key "0x..."
```
### Docker Compose
```yaml
version: '3.8'
services:
  beefy-relay:
    image: datahavenxyz/snowbridge-relay:latest
    container_name: snowbridge-beefy-relay
    platform: linux/amd64
    restart: unless-stopped
    volumes:
      - ./configs/beefy-relay.json:/configs/beefy-relay.json:ro
    command:
      - "run"
      - "beefy"
      - "--config"
      - "/configs/beefy-relay.json"
      - "--ethereum.private-key-file"
      # Compose mounts secrets under /run/secrets/<name>
      - "/run/secrets/ethereum-key"
    secrets:
      - ethereum-key
secrets:
  ethereum-key:
    file: ./secrets/beefy-relay-ethereum-key
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dh-beefy-relay
spec:
  replicas: 1
  selector:
    matchLabels:
      app: dh-beefy-relay
  template:
    metadata:
      labels:
        app: dh-beefy-relay
    spec:
      containers:
        - name: beefy-relay
          image: datahavenxyz/snowbridge-relay:latest
          imagePullPolicy: Always
          args:
            - "run"
            - "beefy"
            - "--config"
            - "/configs/beefy-relay.json"
            - "--ethereum.private-key-file"
            - "/secrets/dh-beefy-relay-ethereum-key"
          volumeMounts:
            - name: config
              mountPath: /configs
              readOnly: true
            - name: secrets
              mountPath: /secrets
              readOnly: true
          resources:
            requests:
              memory: "256Mi"
              cpu: "100m"
            limits:
              memory: "512Mi"
              cpu: "250m"
      volumes:
        - name: config
          configMap:
            name: beefy-relay-config
        - name: secrets
          secret:
            secretName: dh-beefy-relay-ethereum-key
```
Note: BEEFY Relay does **not** require persistent storage (no `volumeClaimTemplates`).
## Contract Requirements
### BeefyClient Contract
The BeefyClient contract must be deployed on Ethereum and initialized with:
- Initial validator set
- Initial BEEFY authority set ID
### Gateway Contract
The Gateway contract coordinates cross-chain message passing and interacts with BeefyClient for verification.
### Get Contract Addresses
Contract addresses are typically stored in a deployments file after contract deployment:
```typescript
const deployments = await parseDeploymentsFile();
const beefyClientAddress = deployments.BeefyClient;
const gatewayAddress = deployments.Gateway;
```
## Monitoring
### Health Checks
```bash
# View relay logs
docker logs -f snowbridge-beefy-relay
# Check for BEEFY proof submissions
docker logs snowbridge-beefy-relay 2>&1 | grep -i "submit\|proof\|commitment"
```
### Key Metrics to Monitor
- BEEFY finality lag (DataHaven blocks behind)
- Ethereum transaction success rate
- Gas costs and account balance
- BeefyClient contract state
### Ethereum Contract State
```bash
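# Read the latest BEEFY block accepted by BeefyClient (using cast from Foundry).
# The (uint64) suffix tells cast to decode the return value instead of printing raw hex.
cast call "$BEEFY_CLIENT" "latestBeefyBlock()(uint64)" --rpc-url "$ETH_RPC_URL"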
```
## Troubleshooting
### Issue: BEEFY Not Ready
**Symptoms**: Relay fails to start with "BEEFY protocol not ready"
**Check**:
1. DataHaven network has active validators
2. BEEFY pallet is enabled in runtime
3. Sufficient blocks have been produced for BEEFY finality
```bash
# Check BEEFY finalized head
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "beefy_getFinalizedHead"}' \
http://localhost:9944
```
### Issue: Ethereum Transaction Failures
**Check**:
1. Relay account has sufficient ETH for gas
2. BeefyClient contract is deployed and initialized
3. Gas price is appropriate for network conditions
4. No competing relayers submitting same proofs
### Issue: Rate Limiting
**Symptoms**: Relay slows down or stops submitting proofs
**Check**:
1. `on-demand-sync` configuration is appropriate
2. Increase `max-tokens` if needed for higher throughput
3. Ensure `refill-period` matches expected submission frequency
## Security Considerations
1. **Private Key Protection**: Store Ethereum private keys securely
2. **Gas Management**: Monitor gas costs and set appropriate limits
3. **Access Control**: Use dedicated accounts with minimal ETH
4. **Monitoring**: Set up alerts for transaction failures and low balance
## Economics
### Gas Costs
- BEEFY proof submission: ~0.0003 ETH per message (varies with gas price)
- Validator set updates: Higher gas cost (less frequent)
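Combining the per-submission estimate with the recommended 0.5 ETH balance gives a rough funding runway. The sketch below uses integer micro-ETH to avoid floating point; the function names are hypothetical, and the figure of 24 submissions per day is purely illustrative.

```bash
#!/usr/bin/env bash
# submissions_covered <balance_micro_eth> <cost_per_submission_micro_eth>
submissions_covered() {
  echo $(( $1 / $2 ))
}

# runway_days <balance_micro_eth> <cost_micro_eth> <submissions_per_day>
runway_days() {
  echo $(( $1 / $2 / $3 ))
}

submissions_covered 500000 300   # 0.5 ETH at ~0.0003 ETH each -> 1666
runway_days 500000 300 24        # assuming 24 submissions/day -> 69
```

Actual costs move with the gas price, so treat the result as an order-of-magnitude check and alert on the account balance rather than on elapsed time.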
### Incentives
Relayers can earn incentives for successful proof submissions. See Snowbridge documentation for incentive structure.
## Related Documentation
- [Beacon Relay](./snowbridge-beacon-relay.md)
- [Execution Relay](./snowbridge-execution-relay.md)
- [Solochain Relay](./snowbridge-solochain-relay.md)
- [Relay Operating Costs](./snowbridge-relay-costs.md)
- [Snowbridge Documentation](https://docs.snowbridge.network)
- [DataHaven Snowbridge Repository](https://github.com/datahaven-xyz/snowbridge)


@ -0,0 +1,554 @@
# Snowbridge Execution Relay
## Overview
The Execution Relay processes Ethereum execution layer events and delivers cross-chain messages to DataHaven. It monitors the Gateway contract on Ethereum and relays messages to the corresponding pallets on DataHaven.
## Purpose
- Relay Ethereum execution layer messages to DataHaven
- Monitor Gateway contract for outbound messages
- Submit message proofs to DataHaven for verification
- Enable cross-chain token transfers and message passing from Ethereum
## Direction
```
Ethereum Execution Layer → DataHaven
```
## Prerequisites
- Docker with `linux/amd64` platform support
- Access to Ethereum execution layer WebSocket endpoint
- Access to Ethereum consensus layer (beacon) HTTP endpoint
- Access to DataHaven node WebSocket endpoint
- Substrate account with balance for transaction fees
- Deployed Gateway contract on Ethereum
- Persistent storage for relay datastore
## Hardware Requirements
### Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 cores |
| **RAM** | 8 GB |
| **Storage (Datastore)** | 10 GB SSD |
| **Network** | 100 Mbit/s symmetric |
### Important Considerations
- **Persistent storage**: The relay maintains a local datastore to track processed messages; use persistent volumes in containerized deployments
- **Message throughput**: Storage requirements may increase with high message volumes
- **Network connectivity**: Requires connections to both Ethereum (execution + beacon) and DataHaven nodes
- **Reliable RPC endpoints**: Use enterprise-grade or self-hosted nodes for production deployments
## RPC Endpoint Requirements
### Ethereum Execution Layer
The relay requires access to a **stable, reliable Ethereum WebSocket endpoint**. Endpoint instability or downtime will prevent the relay from functioning correctly.
**Recommended providers:**
- Self-hosted execution node (Geth, Nethermind, Besu, Erigon)
- [Dwellir](https://www.dwellir.com/)
- [Chainstack](https://chainstack.com/)
- [QuickNode](https://www.quicknode.com/)
- [Alchemy](https://www.alchemy.com/)
**Requirements:**
- WebSocket support (WSS for production)
- Full event log access for Gateway contract monitoring
- Low latency (< 100ms recommended)
- High availability (99.9%+ uptime)
### Beacon Node API
The relay also requires access to the Ethereum Beacon API for constructing message proofs.
**Recommended providers:**
- Self-hosted beacon node (Lighthouse, Prysm, Teku, Nimbus, Lodestar)
- Same providers as execution layer (with beacon API support)
**Requirements:**
- Full beacon API support (`/eth/v1/beacon/*` endpoints)
- State endpoint access for proof construction
- Low latency (< 100ms recommended)
## Relay Redundancy
### Why Redundancy Matters
Running multiple relay instances provides fault tolerance and keeps the bridge operating if one relay fails. The on-chain pallets deduplicate submissions, so only the first valid submission is accepted; redundant relays simply provide backup coverage.
### Configuring Redundant Relays
Deploy multiple relay instances pointing to **different RPC providers** for maximum fault tolerance:
**Instance 1 (Primary):**
```json
{
"source": {
"ethereum": {
"endpoint": "wss://eth-provider-a.example.com"
},
"beacon": {
"endpoint": "https://beacon-provider-a.example.com"
}
},
"sink": {
"parachain": {
"endpoint": "wss://datahaven-rpc-1.example.com"
}
}
}
```
**Instance 2 (Backup):**
```json
{
"source": {
"ethereum": {
"endpoint": "wss://eth-provider-b.example.com"
},
"beacon": {
"endpoint": "https://beacon-provider-b.example.com"
}
},
"sink": {
"parachain": {
"endpoint": "wss://datahaven-rpc-2.example.com"
}
}
}
```
### Best Practices for Redundancy
1. **Use different RPC providers**: Avoid single points of failure by using different Ethereum node providers for each relay instance
2. **Geographic distribution**: Deploy relays in different regions/data centers
3. **Independent infrastructure**: Run relays on separate machines or Kubernetes nodes
4. **Separate funding accounts**: Use different relay accounts to avoid nonce conflicts
5. **Monitor all instances**: Set up alerting for each relay independently
## Key Requirements
### Substrate Private Key
The Execution Relay requires a **Substrate private key** to sign and submit extrinsics to DataHaven.
| Key Type | Purpose |
|----------|---------|
| Substrate (sr25519/ecdsa) | Sign message delivery extrinsics on DataHaven |
### Account Funding
The relay account must be funded with HAVE tokens to pay for transaction fees.
**Minimum Balance**: 100 HAVE (500 HAVE recommended for continuous operations)
For detailed operating cost estimates and optimization strategies, see the [Relay Operating Costs](./snowbridge-relay-costs.md) guide.
## CLI Flags
### Required Flags
| Flag | Description |
|------|-------------|
| `--config <PATH>` | Path to the JSON configuration file |
### Private Key Flags (One Required)
| Flag | Description |
|------|-------------|
| `--substrate.private-key <KEY>` | Substrate private key URI directly |
| `--substrate.private-key-file <PATH>` | Path to file containing the private key |
| `--substrate.private-key-id <ID>` | AWS Secrets Manager secret ID for the private key |
## Configuration File
### Structure
```json
{
  "source": {
    "ethereum": {
      "endpoint": "ws://ethereum-node:8546"
    },
    "contracts": {
      "Gateway": "0x8f86403A4DE0BB5791fa46B8e795C547942fE4Cf"
    },
    "beacon": {
      "endpoint": "http://beacon-node:4000",
      "stateEndpoint": "http://beacon-node:4000",
      "spec": {
        "syncCommitteeSize": 512,
        "slotsInEpoch": 32,
        "epochsPerSyncCommitteePeriod": 256,
        "forkVersions": {
          "deneb": 0,
          "electra": 0
        }
      },
      "datastore": {
        "location": "/relay-data",
        "maxEntries": 100
      }
    }
  },
  "sink": {
    "parachain": {
      "endpoint": "ws://datahaven-node:9944",
      "maxWatchedExtrinsics": 8,
      "headerRedundancy": 20
    }
  },
  "instantVerification": false,
  "schedule": {
    "id": null,
    "totalRelayerCount": 1,
    "sleepInterval": 1
  }
}
```
### Configuration Parameters
#### Source (Ethereum)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `source.ethereum.endpoint` | Ethereum execution layer WebSocket | `ws://ethereum-node:8546` |
| `source.contracts.Gateway` | Gateway contract address | `0x...` |
#### Source (Beacon Chain)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `source.beacon.endpoint` | Beacon chain HTTP API endpoint | `http://beacon-node:4000` |
| `source.beacon.stateEndpoint` | Beacon chain state endpoint | `http://beacon-node:4000` |
| `source.beacon.spec.*` | Beacon chain specification | See beacon spec parameters |
| `source.beacon.datastore.location` | Path to persistent datastore | `/relay-data` |
| `source.beacon.datastore.maxEntries` | Maximum datastore entries | `100` |
#### Sink (DataHaven)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `sink.parachain.endpoint` | DataHaven WebSocket endpoint | `ws://datahaven-node:9944` |
| `sink.parachain.maxWatchedExtrinsics` | Max concurrent watched extrinsics | `8` |
| `sink.parachain.headerRedundancy` | Header redundancy factor | `20` |
#### Relay Settings
| Parameter | Description | Example |
|-----------|-------------|---------|
| `instantVerification` | Enable instant verification mode | `false` |
| `schedule.id` | Relayer instance ID (for multi-instance) | `null` or `0` |
| `schedule.totalRelayerCount` | Total number of relayer instances | `1` |
| `schedule.sleepInterval` | Seconds between message checks | `1` |
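The parameters above can be sanity-checked before the relay starts. The following TypeScript sketch mirrors the JSON structure in a type and flags endpoints whose URL scheme does not match the transport named in the tables (WebSocket for Ethereum and DataHaven, HTTP for the beacon API). `ExecutionRelayConfig` and `checkExecutionRelayConfig` are hypothetical names used for illustration only; they are not part of the relay.

```typescript
// Hypothetical type mirroring the execution relay JSON config shown above.
interface ExecutionRelayConfig {
  source: {
    ethereum: { endpoint: string };
    contracts: { Gateway: string };
    beacon: {
      endpoint: string;
      stateEndpoint: string;
      spec: {
        syncCommitteeSize: number;
        slotsInEpoch: number;
        epochsPerSyncCommitteePeriod: number;
        forkVersions: { deneb: number; electra: number };
      };
      datastore: { location: string; maxEntries: number };
    };
  };
  sink: {
    parachain: { endpoint: string; maxWatchedExtrinsics: number; headerRedundancy: number };
  };
  instantVerification: boolean;
  schedule: { id: number | null; totalRelayerCount: number; sleepInterval: number };
}

// Illustrative pre-flight check (an assumption, not relay behaviour).
// Returns a list of human-readable problems; an empty list means no issue was found.
function checkExecutionRelayConfig(cfg: ExecutionRelayConfig): string[] {
  const problems: string[] = [];
  const isWs = (url: string): boolean => /^wss?:\/\//.test(url);
  const isHttp = (url: string): boolean => /^https?:\/\//.test(url);

  if (!isWs(cfg.source.ethereum.endpoint)) problems.push("source.ethereum.endpoint must use ws:// or wss://");
  if (!isHttp(cfg.source.beacon.endpoint)) problems.push("source.beacon.endpoint must use http:// or https://");
  if (!isHttp(cfg.source.beacon.stateEndpoint)) problems.push("source.beacon.stateEndpoint must use http:// or https://");
  if (!isWs(cfg.sink.parachain.endpoint)) problems.push("sink.parachain.endpoint must use ws:// or wss://");
  // An Ethereum address is 20 bytes: "0x" followed by 40 hex characters.
  if (!/^0x[0-9a-fA-F]{40}$/.test(cfg.source.contracts.Gateway)) problems.push("source.contracts.Gateway is not a 20-byte hex address");
  if (cfg.source.beacon.datastore.location === "") problems.push("source.beacon.datastore.location must not be empty");
  return problems;
}

// Usage with the values from the Structure example above.
const exampleConfig: ExecutionRelayConfig = {
  source: {
    ethereum: { endpoint: "ws://ethereum-node:8546" },
    contracts: { Gateway: "0x8f86403A4DE0BB5791fa46B8e795C547942fE4Cf" },
    beacon: {
      endpoint: "http://beacon-node:4000",
      stateEndpoint: "http://beacon-node:4000",
      spec: {
        syncCommitteeSize: 512,
        slotsInEpoch: 32,
        epochsPerSyncCommitteePeriod: 256,
        forkVersions: { deneb: 0, electra: 0 },
      },
      datastore: { location: "/relay-data", maxEntries: 100 },
    },
  },
  sink: { parachain: { endpoint: "ws://datahaven-node:9944", maxWatchedExtrinsics: 8, headerRedundancy: 20 } },
  instantVerification: false,
  schedule: { id: null, totalRelayerCount: 1, sleepInterval: 1 },
};
const exampleProblems = checkExecutionRelayConfig(exampleConfig); // empty for this config
```

The check only covers what this page states about each field; it does not validate the beacon `spec` values, which depend on the target network.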
## Multi-Instance Deployment
For high-availability or load distribution, multiple Execution Relayers can be deployed using the `schedule` configuration to coordinate between instances.
### Schedule Configuration Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `schedule.id` | `number` or `null` | Unique identifier for this relay instance (0-indexed). Set to `null` for single-instance deployments. |
| `schedule.totalRelayerCount` | `number` | Total number of relay instances in the deployment. All instances must use the same value. |
| `schedule.sleepInterval` | `number` | Seconds to wait between polling for new messages. Lower values = faster detection, higher resource usage. |
### How Multi-Instance Scheduling Works
When multiple relayers are deployed, the `schedule.id` and `totalRelayerCount` parameters work together to distribute message processing:
1. **Message assignment**: Messages are assigned to relayers based on `message_nonce % totalRelayerCount == schedule.id`
2. **Staggered processing**: Each relayer only processes messages assigned to its ID, preventing duplicate submissions
3. **Failover**: If a relayer fails, its messages will eventually be picked up by other relayers after a timeout
**Example with 3 relayers:**
- Instance 0 processes messages where `nonce % 3 == 0` (nonces: 0, 3, 6, 9, ...)
- Instance 1 processes messages where `nonce % 3 == 1` (nonces: 1, 4, 7, 10, ...)
- Instance 2 processes messages where `nonce % 3 == 2` (nonces: 2, 5, 8, 11, ...)
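The assignment rule above can be sketched in a few lines of TypeScript. This is a minimal illustration of the modulo rule only; `assignedInstance` and `isResponsible` are hypothetical helper names, nonces are modelled as plain numbers for brevity, and treating `schedule.id: null` as "handles every message" is an assumption based on the single-instance description.

```typescript
// Returns the schedule.id of the instance responsible for a message nonce,
// following the rule `message_nonce % totalRelayerCount == schedule.id`.
function assignedInstance(nonce: number, totalRelayerCount: number): number {
  if (totalRelayerCount < 1 || Math.floor(totalRelayerCount) !== totalRelayerCount) {
    throw new RangeError("totalRelayerCount must be a positive integer");
  }
  if (nonce < 0 || Math.floor(nonce) !== nonce) {
    throw new RangeError("nonce must be a non-negative integer");
  }
  return nonce % totalRelayerCount;
}

// True when the instance with the given schedule.id should process the nonce.
// Assumption: a null id (single-instance deployment) processes everything.
function isResponsible(nonce: number, scheduleId: number | null, totalRelayerCount: number): boolean {
  if (scheduleId === null) return true;
  return assignedInstance(nonce, totalRelayerCount) === scheduleId;
}

// Usage: which of the first nine nonces belong to instance 1 of 3?
const firstNonces = [0, 1, 2, 3, 4, 5, 6, 7, 8];
const forInstanceOne = firstNonces.filter((n) => isResponsible(n, 1, 3)); // [1, 4, 7]
```

The failover behaviour after a timeout is not modelled here, since this page does not specify how long that timeout is.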
### Configuration Examples
**Single Instance (default):**
```json
{
"schedule": {
"id": null,
"totalRelayerCount": 1,
"sleepInterval": 1
}
}
```
**Three-Instance Deployment:**
*Instance 0:*
```json
{
"schedule": {
"id": 0,
"totalRelayerCount": 3,
"sleepInterval": 1
}
}
```
*Instance 1:*
```json
{
"schedule": {
"id": 1,
"totalRelayerCount": 3,
"sleepInterval": 1
}
}
```
*Instance 2:*
```json
{
"schedule": {
"id": 2,
"totalRelayerCount": 3,
"sleepInterval": 1
}
}
```
### Sleep Interval Tuning
The `sleepInterval` parameter controls how frequently the relay polls for new messages:
| Value | Use Case | Trade-offs |
|-------|----------|------------|
| `1` | Low latency required | Higher RPC usage, faster message detection |
| `5` | Balanced | Good balance of latency and resource usage |
| `10` | Cost-sensitive | Lower RPC costs, slower message detection |
| `30` | Minimal activity | Very low resource usage, higher latency |
**Recommendation**: Start with `sleepInterval: 1` for production deployments where message latency is important. Increase if RPC rate limits become an issue.
### Deployment Checklist
1. **Unique IDs**: Each instance must have a unique `schedule.id` (0 to `totalRelayerCount - 1`)
2. **Consistent count**: All instances must use the same `totalRelayerCount` value
3. **Separate accounts**: Use different Substrate accounts to avoid nonce conflicts
4. **Independent storage**: Each instance needs its own persistent datastore volume
5. **Different RPC endpoints**: Point instances to different RPC providers for fault tolerance
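The checklist above lends itself to an automated check before rollout. The sketch below is one possible way to express it in TypeScript; `RelayInstancePlan`, `repeated` and `checkDeployment` are hypothetical names, and the account and volume strings in the usage are placeholders rather than real identifiers.

```typescript
// Hypothetical description of one planned relay instance.
interface RelayInstancePlan {
  name: string;
  scheduleId: number;
  totalRelayerCount: number;
  substrateAccount: string;
  datastoreVolume: string;
  ethereumEndpoint: string;
}

// Returns the values that occur more than once in the input.
function repeated(values: string[]): string[] {
  return values.filter((v, i) => values.indexOf(v) !== i);
}

// Applies the five checklist items and returns a list of problems.
function checkDeployment(instances: RelayInstancePlan[]): string[] {
  const problems: string[] = [];
  const count = instances.length;
  for (const inst of instances) {
    const id = inst.scheduleId;
    // Item 1: IDs run from 0 to totalRelayerCount - 1.
    if (Math.floor(id) !== id || id < 0 || id >= count) {
      problems.push(inst.name + ": schedule.id " + id + " is outside 0.." + (count - 1));
    }
    // Item 2: every instance declares the same total, equal to the instance count.
    if (inst.totalRelayerCount !== count) {
      problems.push(inst.name + ": totalRelayerCount should be " + count);
    }
  }
  if (repeated(instances.map((i) => String(i.scheduleId))).length > 0) problems.push("schedule.id values must be unique");
  if (repeated(instances.map((i) => i.substrateAccount)).length > 0) problems.push("Substrate accounts must differ");
  if (repeated(instances.map((i) => i.datastoreVolume)).length > 0) problems.push("datastore volumes must differ");
  if (repeated(instances.map((i) => i.ethereumEndpoint)).length > 0) problems.push("Ethereum RPC endpoints should differ");
  return problems;
}

// Usage: a three-instance plan that satisfies every item.
const threeInstancePlan: RelayInstancePlan[] = [0, 1, 2].map((id) => ({
  name: "execution-relay-" + id,
  scheduleId: id,
  totalRelayerCount: 3,
  substrateAccount: "relay-account-" + id,   // placeholder
  datastoreVolume: "relay-data-" + id,       // placeholder
  ethereumEndpoint: "wss://eth-provider-" + id + ".example.com",
}));
const planProblems = checkDeployment(threeInstancePlan); // empty
```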
## Running the Relay
### Docker Run
```bash
docker run -d \
  --name snowbridge-execution-relay \
  --platform linux/amd64 \
  --add-host host.docker.internal:host-gateway \
  --network datahaven-network \
  -v "$(pwd)/execution-relay.json:/configs/execution-relay.json:ro" \
  -v "$(pwd)/relay-data:/relay-data" \
  --pull always \
  datahavenxyz/snowbridge-relay:latest \
  run execution \
  --config /configs/execution-relay.json \
  --substrate.private-key "0x..."
```
### Docker Compose
```yaml
version: '3.8'
services:
  execution-relay:
    image: datahavenxyz/snowbridge-relay:latest
    container_name: snowbridge-execution-relay
    platform: linux/amd64
    restart: unless-stopped
    volumes:
      - ./configs/execution-relay.json:/configs/execution-relay.json:ro
      - execution-relay-data:/relay-data
    command:
      - "run"
      - "execution"
      - "--config"
      - "/configs/execution-relay.json"
      - "--substrate.private-key-file"
      # Compose mounts secrets under /run/secrets/<name> by default
      - "/run/secrets/substrate-key"
    secrets:
      - substrate-key
volumes:
  execution-relay-data:
secrets:
  substrate-key:
    file: ./secrets/execution-relay-substrate-key
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dh-execution-relay
spec:
  serviceName: dh-execution-relay
  replicas: 1
  selector:
    matchLabels:
      app: dh-execution-relay
  template:
    metadata:
      labels:
        app: dh-execution-relay
    spec:
      containers:
        - name: execution-relay
          image: datahavenxyz/snowbridge-relay:latest
          imagePullPolicy: Always
          args:
            - "run"
            - "execution"
            - "--config"
            - "/configs/execution-relay.json"
            - "--substrate.private-key-file"
            - "/secrets/dh-execution-relay-substrate-key"
          volumeMounts:
            - name: config
              mountPath: /configs
              readOnly: true
            - name: secrets
              mountPath: /secrets
              readOnly: true
            - name: relay-data
              mountPath: /relay-data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
      volumes:
        - name: config
          configMap:
            name: execution-relay-config
        - name: secrets
          secret:
            secretName: dh-execution-relay-substrate-key
  volumeClaimTemplates:
    - metadata:
        name: relay-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 10Gi
```
## Message Flow
### Ethereum → DataHaven Message Flow
1. User calls Gateway contract on Ethereum
2. Gateway emits `OutboundMessageAccepted` event
3. Execution Relay monitors for Gateway events
4. Relay constructs message proof using beacon chain state
5. Relay submits proof to DataHaven via `EthereumInboundQueue` pallet
6. DataHaven verifies proof against beacon client state
7. Message is dispatched to target pallet
### Supported Message Types
- Token transfers (ERC-20 tokens to DataHaven)
- Arbitrary cross-chain messages
- Smart contract calls
## Monitoring
### Health Checks
```bash
# View relay logs
docker logs -f snowbridge-execution-relay
# Check for message processing
docker logs snowbridge-execution-relay 2>&1 | grep -i "message\|submit\|proof"
```
### Key Metrics to Monitor
- Message queue depth
- Message delivery success rate
- Ethereum event processing lag
- Account balance (for fees)
- Beacon chain sync status
## Troubleshooting
### Issue: Messages Not Being Delivered
**Check**:
1. Gateway contract address is correct
2. Ethereum endpoint is accessible
3. Beacon chain is synced
4. DataHaven node is accessible
```bash
# Check Ethereum connectivity
curl -X POST -H "Content-Type: application/json" \
--data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
http://ethereum-node:8545
```
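The `eth_blockNumber` call above returns the block height as a hex-encoded string in the `result` field. A small TypeScript sketch for turning that response into a number and estimating how far the relay is behind the chain head follows; `parseBlockNumber` and `blocksBehind` are hypothetical helpers, and the "last processed block" value is assumed to come from your own monitoring (for example, parsed from relay logs).

```typescript
// Parses the JSON-RPC response body returned by eth_blockNumber.
function parseBlockNumber(body: string): number {
  const parsed = JSON.parse(body) as { result?: unknown; error?: { message?: string } };
  if (parsed.error) {
    throw new Error("RPC error: " + (parsed.error.message || "unknown"));
  }
  const result = parsed.result;
  // The result is a hex quantity such as "0x4b7".
  if (typeof result !== "string" || !/^0x[0-9a-fA-F]+$/.test(result)) {
    throw new Error("unexpected eth_blockNumber result");
  }
  return parseInt(result, 16);
}

// Number of blocks between the chain head and the last block the relay handled.
function blocksBehind(headBlock: number, lastProcessedBlock: number): number {
  return Math.max(0, headBlock - lastProcessedBlock);
}

// Usage with a sample response body.
const sampleBody = '{"jsonrpc":"2.0","id":1,"result":"0x4b7"}';
const head = parseBlockNumber(sampleBody);   // 1207
const lag = blocksBehind(head, 1200);        // 7
```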
### Issue: Proof Verification Failures
**Check**:
1. Beacon Relay is running and synced
2. `EthereumBeaconClient` pallet has recent updates
3. Beacon chain spec matches configuration
### Issue: High Latency
**Solutions**:
1. Reduce `sleepInterval` for faster message detection
2. Deploy multiple relay instances
3. Ensure low-latency connections to endpoints
## Security Considerations
1. **Private Key Protection**: Store Substrate private keys securely
2. **Network Security**: Use secure connections (WSS) when possible
3. **Access Control**: Use dedicated accounts with minimal permissions
4. **Monitoring**: Set up alerts for message delivery failures
## Economics
### Transaction Costs
- Message delivery: ~0.012 DOT equivalent per message, paid in HAVE on DataHaven (varies with message size)
- Relayers earn incentives for successful deliveries
### Incentive Structure
Relayers can claim incentives from the protocol for successful message deliveries. See Snowbridge documentation for details.
## Related Documentation
- [Beacon Relay](./snowbridge-beacon-relay.md)
- [BEEFY Relay](./snowbridge-beefy-relay.md)
- [Solochain Relay](./snowbridge-solochain-relay.md)
- [Relay Operating Costs](./snowbridge-relay-costs.md)
- [Snowbridge Documentation](https://docs.snowbridge.network)
- [DataHaven Snowbridge Repository](https://github.com/datahaven-xyz/snowbridge)


@ -0,0 +1,184 @@
# Snowbridge Relay Operating Costs
## Overview
This page provides guidance on funding requirements and operating cost estimates for running Snowbridge relays. These costs apply across all relay types (Beacon, BEEFY, Execution, and Solochain).
## Account Funding Requirements
### Ethereum Account (BEEFY, Solochain Relays)
Relays that submit transactions to Ethereum require funded Ethereum accounts for gas fees.
| Relay Type | Minimum | Recommended | Purpose |
|------------|---------|-------------|---------|
| BEEFY Relay | 0.5 ETH | 2.0 ETH | Submit BEEFY proofs to BeefyClient contract |
| Solochain Relay | 0.5 ETH | 2.0 ETH | Submit messages to Gateway, sync rewards |
### Substrate Account (Beacon, Execution, Solochain Relays)
Relays that submit extrinsics to DataHaven require funded Substrate accounts for transaction fees.
| Relay Type | Minimum | Recommended | Purpose |
|------------|---------|-------------|---------|
| Beacon Relay | 100 HAVE | 500 HAVE | Submit beacon updates to EthereumBeaconClient |
| Execution Relay | 100 HAVE | 500 HAVE | Deliver messages via EthereumInboundQueue |
| Solochain Relay | 100 HAVE | 500 HAVE | DataHaven operations |
## Gas Cost Breakdown (Ethereum)
Relays submitting to Ethereum incur gas costs for various operations:
| Operation | Relay Type | Estimated Gas | Frequency |
|-----------|------------|---------------|-----------|
| BEEFY commitment | BEEFY | 200,000-400,000 | Per commitment |
| Message delivery | Solochain | 150,000-300,000 | Per message |
| Reward sync update | Solochain | 100,000-200,000 | Per epoch/period |
## Annual Operating Cost Forecast
> **Disclaimer**: The cost estimates below are approximate projections based on typical network conditions and are provided for planning purposes only. Actual costs may vary significantly based on network congestion, gas price fluctuations, ETH price volatility, and message volume. **Always conduct your own cost analysis** based on current market conditions before budgeting for relay operations.
### Assumptions
- Average gas price: 30 gwei
- ETH price: $3,000 USD (as of December 2025)
- HAVE transaction fees: negligible compared to ETH costs
### BEEFY Relay Costs
BEEFY proofs are submitted periodically to keep the BeefyClient contract updated with DataHaven finality.
| Scenario | Commitments/Day | Gas/Year (ETH) | Annual Cost (USD) |
|----------|-----------------|----------------|-------------------|
| **Low activity** | 4-6 | 0.4-0.8 ETH | $1,200-$2,400 |
| **Medium activity** | 10-15 | 1.0-1.6 ETH | $3,000-$4,800 |
| **High activity** | 20-30 | 2.0-3.5 ETH | $6,000-$10,500 |
### Solochain Relay Costs
The Solochain Relay handles message delivery and reward synchronization.
| Scenario | Messages/Day | Gas/Year (ETH) | Annual Cost (USD) |
|----------|--------------|----------------|-------------------|
| **Low activity** | 50 | 0.8-1.5 ETH | $2,400-$4,500 |
| **Medium activity** | 100 | 1.5-3.0 ETH | $4,500-$9,000 |
| **High activity** | 200 | 3.0-6.0 ETH | $9,000-$18,000 |
### Beacon & Execution Relay Costs
These relays only incur HAVE transaction fees on DataHaven, which are minimal:
| Relay Type | Annual HAVE Estimate | Notes |
|------------|---------------------|-------|
| Beacon Relay | 50-200 HAVE | Sync committee updates |
| Execution Relay | 100-500 HAVE | Message delivery to DataHaven |
### Cost Calculation Formula
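```
Annual ETH Cost = (operations_per_day × avg_gas_per_operation × avg_gas_price_in_wei × 365) / 1e18
Annual USD Cost = Annual ETH Cost × ETH_price
```
The gas price must be expressed in wei for the division by 1e18 to yield ETH (1 gwei = 1e9 wei).
**Example (Solochain Relay, Medium Activity):**
```
= (100 messages × 200,000 gas × 30e9 wei × 365 days) / 1e18
= 219 ETH/year
= ~$657,000 USD/year at $3,000/ETH
```
At 0.3 gwei the same inputs give 2.19 ETH/year (~$6,570 USD), which is the order of magnitude of the scenario tables on this page. Re-run the formula with the gas price you currently observe before budgeting.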
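The formula can be written as a small TypeScript helper, which makes the unit handling explicit. This is a sketch under the stated formula only: `annualEthCost` and `annualUsdCost` are hypothetical names, the gas price is taken in gwei and converted with 1 ETH = 1e9 gwei, and the inputs in the usage lines are the scenario values from this page, not measurements.

```typescript
const GWEI_PER_ETH = 1e9;
const DAYS_PER_YEAR = 365;

// Annual gas spend in ETH for a relay submitting `operationsPerDay` transactions.
function annualEthCost(operationsPerDay: number, avgGasPerOperation: number, gasPriceGwei: number): number {
  const gasPerYear = operationsPerDay * avgGasPerOperation * DAYS_PER_YEAR;
  const gweiPerYear = gasPerYear * gasPriceGwei;
  return gweiPerYear / GWEI_PER_ETH;
}

// Converts an annual ETH figure to USD at a given ETH price.
function annualUsdCost(annualEth: number, ethPriceUsd: number): number {
  return annualEth * ethPriceUsd;
}

// Usage: 100 messages/day at 200,000 gas each.
const ethAt30Gwei = annualEthCost(100, 200000, 30);     // 219 ETH/year
const usdAt30Gwei = annualUsdCost(ethAt30Gwei, 3000);   // 657,000 USD/year
const ethAt03Gwei = annualEthCost(100, 200000, 0.3);    // about 2.19 ETH/year
```

Because the result scales linearly with the gas price, feeding in a current gas price reading is the quickest way to refresh any of the estimates on this page.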
## Total Operating Costs (Full Relay Stack)
Running a complete Snowbridge relay infrastructure requires all four relays. Here's a combined cost estimate:
| Scenario | ETH/Year | HAVE/Year | Annual USD (ETH only) |
|----------|----------|-----------|----------------------|
| **Low activity** | 1.2-2.3 ETH | 200-400 HAVE | $3,600-$6,900 |
| **Medium activity** | 2.5-4.6 ETH | 400-800 HAVE | $7,500-$13,800 |
| **High activity** | 5.0-9.5 ETH | 800-1,500 HAVE | $15,000-$28,500 |
> **Note**: These estimates assume a single relay instance per type. Running redundant relays (recommended for production) will multiply costs proportionally.
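The combined figures are the sum of the BEEFY and Solochain ranges, since only those two relays spend ETH. A short TypeScript sketch of that arithmetic, including the proportional scaling for redundant instances mentioned in the note, is shown below; `EthRange`, `addRanges` and `scaleRange` are hypothetical helpers, and the input ranges are the medium-activity rows from the tables above.

```typescript
// A cost range in ETH per year (or USD per year after scaling by a price).
interface EthRange {
  min: number;
  max: number;
}

// Adds two ranges bound by bound.
function addRanges(a: EthRange, b: EthRange): EthRange {
  return { min: a.min + b.min, max: a.max + b.max };
}

// Multiplies both bounds, e.g. by an instance count or an ETH price.
function scaleRange(r: EthRange, factor: number): EthRange {
  return { min: r.min * factor, max: r.max * factor };
}

// Medium-activity rows from the BEEFY and Solochain tables.
const beefyMedium: EthRange = { min: 1.0, max: 1.6 };
const solochainMedium: EthRange = { min: 1.5, max: 3.0 };

const stackMedium = addRanges(beefyMedium, solochainMedium);   // about 2.5 to 4.6 ETH/year
const stackMediumUsd = scaleRange(stackMedium, 3000);          // about $7,500 to $13,800
const twoInstancesEach = scaleRange(stackMedium, 2);           // about 5.0 to 9.2 ETH/year
```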
## Cost Optimization Strategies
### 1. Gas Price Optimization
- **Monitor gas prices**: Use services like [ETH Gas Station](https://ethgasstation.info/) or [Etherscan Gas Tracker](https://etherscan.io/gastracker)
- **Off-peak submissions**: Non-urgent operations can wait for lower gas prices
- **Gas price limits**: Configure maximum gas price thresholds in relay settings
### 2. Batching Operations
- BEEFY relay batches commitments when possible
- Solochain relay batches reward updates
- Reduces per-operation overhead
### 3. Right-Size Your Deployment
| Network Activity | Recommended Setup |
|------------------|-------------------|
| Low volume | Single instance per relay type |
| Medium volume | 2 instances with different providers |
| High volume/Production | 3+ instances across regions |
### 4. Infrastructure Cost Savings
- **Shared RPC endpoints**: Use the same provider subscription across relays
- **Self-hosted nodes**: Higher upfront cost but eliminates per-request fees
- **Cloud cost optimization**: Use reserved instances or spot pricing where appropriate
## Balance Monitoring & Alerts
### Recommended Alert Thresholds
| Account Type | Low Balance Alert | Critical Alert |
|--------------|-------------------|----------------|
| Ethereum | 0.2 ETH | 0.1 ETH |
| Substrate (HAVE) | 50 HAVE | 20 HAVE |
### Monitoring Setup
```bash
# Check Ethereum balance
cast balance $RELAY_ETH_ADDRESS --rpc-url $ETH_RPC_URL
# Check HAVE balance (using subxt or polkadot-js)
# Monitor via your preferred Substrate tooling
```
### Automated Top-Up
Consider implementing automated funding from a treasury account when balances fall below thresholds. This prevents relay downtime due to insufficient funds.
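The alert thresholds above can be applied with a few lines of TypeScript. This is a sketch, not a monitoring integration: `classifyBalance` and `weiToEth` are hypothetical helpers, treating a balance exactly at a threshold as triggering the alert is an assumption, and the conversion assumes the balance is reported as a decimal wei string (which is how `cast balance` prints it unless asked to format in ether).

```typescript
type AlertLevel = "ok" | "low" | "critical";

interface BalanceThresholds {
  low: number;
  critical: number;
}

// Thresholds from the table above.
const ETH_THRESHOLDS: BalanceThresholds = { low: 0.2, critical: 0.1 };
const HAVE_THRESHOLDS: BalanceThresholds = { low: 50, critical: 20 };

// Converts a decimal wei string to ETH. Precision is limited to a double,
// which is sufficient for threshold comparisons but not for accounting.
function weiToEth(weiDecimal: string): number {
  return Number(weiDecimal) / 1e18;
}

// Maps a balance to an alert level; balances at or below a threshold trigger it.
function classifyBalance(balance: number, thresholds: BalanceThresholds): AlertLevel {
  if (balance <= thresholds.critical) return "critical";
  if (balance <= thresholds.low) return "low";
  return "ok";
}

// Usage: 0.15 ETH reported in wei, and 500 HAVE.
const ethLevel = classifyBalance(weiToEth("150000000000000000"), ETH_THRESHOLDS); // "low"
const haveLevel = classifyBalance(500, HAVE_THRESHOLDS);                          // "ok"
```

An automated top-up job could run this check on a schedule and fund the account whenever the level is not `"ok"`.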
## Cost Variability Factors
### Ethereum Gas Prices
Gas prices can vary dramatically:
| Condition | Typical Gas Price | Impact |
|-----------|-------------------|--------|
| Low congestion | 10-20 gwei | 50% below estimates |
| Normal | 20-40 gwei | Within estimates |
| High congestion | 50-100 gwei | 2-3x estimates |
| Extreme (NFT mints, etc.) | 100-500+ gwei | 5-10x estimates |
### ETH Price Volatility
ETH price directly affects USD costs:
| ETH Price | Annual Cost (Medium Activity) |
|-----------|------------------------------|
| $2,000 | ~$4,400 |
| $3,000 | ~$6,600 |
| $4,000 | ~$8,800 |
| $5,000 | ~$11,000 |
## Related Documentation
- [Beacon Relay](./snowbridge-beacon-relay.md)
- [BEEFY Relay](./snowbridge-beefy-relay.md)
- [Execution Relay](./snowbridge-execution-relay.md)
- [Solochain Relay](./snowbridge-solochain-relay.md)
- [Snowbridge Documentation](https://docs.snowbridge.network)


@ -0,0 +1,611 @@
# Snowbridge Solochain Relay
## Overview
The Solochain Relay handles DataHaven-specific operations, including relaying outbound messages from DataHaven to Ethereum and managing validator reward distributions. This relay is specific to the DataHaven solochain implementation of Snowbridge.
## Purpose
- Relay DataHaven outbound messages to Ethereum
- Submit messages to the Gateway contract on Ethereum
- Handle validator reward synchronization
- Enable cross-chain token transfers from DataHaven to Ethereum
## Direction
```
DataHaven → Ethereum (with bidirectional monitoring)
```
## Prerequisites
- Docker with `linux/amd64` platform support
- Access to DataHaven node WebSocket endpoint
- Access to Ethereum execution layer WebSocket endpoint
- Access to Ethereum consensus layer (beacon) HTTP endpoint
- Ethereum account with ETH for gas fees
- Substrate account for DataHaven operations
- Deployed BeefyClient, Gateway, and RewardsRegistry contracts on Ethereum
- Persistent storage for relay datastore
## Hardware Requirements
The Solochain Relay handles more operations than other relays (bidirectional messaging + rewards), so additional resources are recommended.
### Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 cores |
| **RAM** | 8 GB |
| **Storage (Datastore)** | 20 GB SSD |
| **Network** | 100 Mbit/s symmetric |
### Important Considerations
- **Persistent storage**: The relay maintains a local datastore to track processed messages and reward operations; use persistent volumes in containerized deployments
- **Bidirectional operations**: Handles both DataHaven → Ethereum messages and reward synchronization
- **Network connectivity**: Requires connections to Ethereum (execution + beacon) and DataHaven nodes simultaneously
- **Higher resource usage**: May use more resources during high message volumes or reward distribution periods
- **Reliable RPC endpoints**: Use enterprise-grade or self-hosted nodes for production deployments
## RPC Endpoint Requirements
### Ethereum Execution Layer
The relay requires access to a **stable, reliable Ethereum WebSocket endpoint**. Endpoint instability or downtime will prevent the relay from functioning correctly.
**Recommended providers:**
- Self-hosted execution node (Geth, Nethermind, Besu, Erigon)
- [Dwellir](https://www.dwellir.com/)
- [Chainstack](https://chainstack.com/)
- [QuickNode](https://www.quicknode.com/)
- [Alchemy](https://www.alchemy.com/)
**Requirements:**
- WebSocket support (WSS for production)
- Full event log access for contract monitoring
- Low latency (< 100ms recommended)
- High availability (99.9%+ uptime)
### Beacon Node API
The relay also requires access to the Ethereum Beacon API for finality verification.
**Recommended providers:**
- Self-hosted beacon node (Lighthouse, Prysm, Teku, Nimbus, Lodestar)
- Same providers as execution layer (with beacon API support)
**Requirements:**
- Full beacon API support (`/eth/v1/beacon/*` endpoints)
- State endpoint access for sync committee data
- Low latency (< 100ms recommended)
### DataHaven Node
- Full node or archive node with WebSocket endpoint
- Low latency connection for monitoring outbound messages
## Relay Redundancy
### Why Redundancy Matters
Running multiple relay instances provides fault tolerance and keeps the bridge operating if one relay fails. The Gateway contract and on-chain pallets handle duplicate submissions gracefully: only the first valid submission is processed.
### Configuring Redundant Relays
Deploy multiple relay instances pointing to **different RPC providers** for maximum fault tolerance. Use the `schedule` configuration to coordinate between instances:
**Instance 1 (Primary):**
```json
{
"source": {
"ethereum": {
"endpoint": "wss://eth-provider-a.example.com"
},
"solochain": {
"endpoint": "wss://datahaven-rpc-1.example.com"
},
"beacon": {
"endpoint": "https://beacon-provider-a.example.com"
}
},
"sink": {
"ethereum": {
"endpoint": "wss://eth-provider-a.example.com"
}
},
"schedule": {
"id": 0,
"totalRelayerCount": 2,
"sleepInterval": 10
}
}
```
**Instance 2 (Backup):**
```json
{
"source": {
"ethereum": {
"endpoint": "wss://eth-provider-b.example.com"
},
"solochain": {
"endpoint": "wss://datahaven-rpc-2.example.com"
},
"beacon": {
"endpoint": "https://beacon-provider-b.example.com"
}
},
"sink": {
"ethereum": {
"endpoint": "wss://eth-provider-b.example.com"
}
},
"schedule": {
"id": 1,
"totalRelayerCount": 2,
"sleepInterval": 10
}
}
```
### Best Practices for Redundancy
1. **Use different RPC providers**: Avoid single points of failure by using different Ethereum and DataHaven node providers for each relay instance
2. **Geographic distribution**: Deploy relays in different regions/data centers
3. **Independent infrastructure**: Run relays on separate machines or Kubernetes nodes
4. **Separate funding accounts**: Use different relay accounts (both Ethereum and Substrate) to avoid nonce conflicts
5. **Coordinate with schedule IDs**: Use unique `schedule.id` values for each instance
6. **Monitor all instances**: Set up alerting for each relay independently
## Key Requirements
### Both Ethereum and Substrate Private Keys
The Solochain Relay requires **both** an Ethereum private key and a Substrate private key.
| Key Type | Purpose |
|----------|---------|
| Ethereum (secp256k1) | Sign Ethereum transactions to Gateway contract |
| Substrate (sr25519/ecdsa) | Sign DataHaven operations |
### Account Funding
The Solochain Relay requires funded accounts on both Ethereum and DataHaven to operate continuously.
| Account | Minimum | Recommended | Purpose |
|---------|---------|-------------|---------|
| Ethereum | 0.5 ETH | 2.0 ETH | Gas fees for Gateway contract calls |
| Substrate (HAVE) | 100 HAVE | 500 HAVE | Transaction fees on DataHaven |
For detailed operating cost estimates, annual forecasts, and cost optimization strategies, see the [Relay Operating Costs](./snowbridge-relay-costs.md) guide.
## CLI Flags
### Required Flags
| Flag | Description |
|------|-------------|
| `--config <PATH>` | Path to the JSON configuration file |
### Ethereum Private Key Flags (One Required)
| Flag | Description |
|------|-------------|
| `--ethereum.private-key <KEY>` | Ethereum private key directly |
| `--ethereum.private-key-file <PATH>` | Path to file containing the private key |
| `--ethereum.private-key-id <ID>` | AWS Secrets Manager secret ID for the private key |
### Substrate Private Key Flag
| Flag | Description |
|------|-------------|
| `--substrate.private-key <KEY>` | Substrate private key URI |
## Configuration File
### Structure
```json
{
  "source": {
    "ethereum": {
      "endpoint": "ws://ethereum-node:8546"
    },
    "solochain": {
      "endpoint": "ws://datahaven-node:9944"
    },
    "contracts": {
      "BeefyClient": "0x4826533B4897376654Bb4d4AD88B7faFD0C98528",
      "Gateway": "0x8f86403A4DE0BB5791fa46B8e795C547942fE4Cf"
    },
    "beacon": {
      "endpoint": "http://beacon-node:4000",
      "stateEndpoint": "http://beacon-node:4000",
      "spec": {
        "syncCommitteeSize": 512,
        "slotsInEpoch": 32,
        "epochsPerSyncCommitteePeriod": 256,
        "forkVersions": {
          "deneb": 0,
          "electra": 0
        }
      },
      "datastore": {
        "location": "/relay-data",
        "maxEntries": 100
      }
    }
  },
  "sink": {
    "ethereum": {
      "endpoint": "ws://ethereum-node:8546"
    },
    "contracts": {
      "Gateway": "0x8f86403A4DE0BB5791fa46B8e795C547942fE4Cf"
    }
  },
  "schedule": {
    "id": 0,
    "totalRelayerCount": 1,
    "sleepInterval": 10
  },
  "reward-address": "0x4c5859f0F772848b2D91F1D83E2Fe57935348029",
  "ofac": {
    "enabled": false,
    "apiKey": ""
  }
}
```
### Configuration Parameters
#### Source Configuration
| Parameter | Description | Example |
|-----------|-------------|---------|
| `source.ethereum.endpoint` | Ethereum WebSocket endpoint | `ws://ethereum-node:8546` |
| `source.solochain.endpoint` | DataHaven WebSocket endpoint | `ws://datahaven-node:9944` |
| `source.contracts.BeefyClient` | BeefyClient contract address | `0x...` |
| `source.contracts.Gateway` | Gateway contract address | `0x...` |
| `source.beacon.*` | Beacon chain configuration | See beacon spec |
#### Sink Configuration
| Parameter | Description | Example |
|-----------|-------------|---------|
| `sink.ethereum.endpoint` | Ethereum WebSocket endpoint | `ws://ethereum-node:8546` |
| `sink.contracts.Gateway` | Gateway contract address | `0x...` |
#### Schedule Configuration
| Parameter | Description | Example |
|-----------|-------------|---------|
| `schedule.id` | Relayer instance ID (for multi-instance) | `0` |
| `schedule.totalRelayerCount` | Total number of relayer instances | `1` |
| `schedule.sleepInterval` | Seconds between message checks | `10` |
#### Rewards Configuration
| Parameter | Description | Example |
|-----------|-------------|---------|
| `reward-address` | RewardsRegistry contract address | `0x...` |
#### OFAC Compliance (Optional)
| Parameter | Description | Example |
|-----------|-------------|---------|
| `ofac.enabled` | Enable OFAC sanctions screening | `false` |
| `ofac.apiKey` | API key for OFAC screening service | `""` |
## Contract Requirements
### Required Contracts
1. **BeefyClient**: Verifies BEEFY finality proofs on Ethereum
2. **Gateway**: Handles cross-chain message passing
3. **RewardsRegistry**: Manages validator reward distribution
### Get Contract Addresses
```typescript
const deployments = await parseDeploymentsFile();
const beefyClientAddress = deployments.BeefyClient;
const gatewayAddress = deployments.Gateway;
const rewardsRegistryAddress = deployments.RewardsRegistry;
```
## Running the Relay
### Docker Run
```bash
docker run -d \
  --name snowbridge-solochain-relay \
  --platform linux/amd64 \
  --add-host host.docker.internal:host-gateway \
  --network datahaven-network \
  -v "$(pwd)/solochain-relay.json:/configs/solochain-relay.json:ro" \
  -v "$(pwd)/relay-data:/relay-data" \
  --pull always \
  datahavenxyz/snowbridge-relay:latest \
  run solochain \
  --config /configs/solochain-relay.json \
  --ethereum.private-key "0x..." \
  --substrate.private-key "0x..."
```
### Docker Compose
```yaml
version: '3.8'
services:
  solochain-relay:
    image: datahavenxyz/snowbridge-relay:latest
    container_name: snowbridge-solochain-relay
    platform: linux/amd64
    restart: unless-stopped
    volumes:
      - ./configs/solochain-relay.json:/configs/solochain-relay.json:ro
      - solochain-relay-data:/relay-data
    command:
      - "run"
      - "solochain"
      - "--config"
      - "/configs/solochain-relay.json"
      - "--ethereum.private-key-file"
      # Compose mounts secrets under /run/secrets/<name> by default
      - "/run/secrets/ethereum-key"
      - "--substrate.private-key"
      - "${SUBSTRATE_PRIVATE_KEY}"
    secrets:
      - ethereum-key
    environment:
      - SUBSTRATE_PRIVATE_KEY=${SUBSTRATE_PRIVATE_KEY}
volumes:
  solochain-relay-data:
secrets:
  ethereum-key:
    file: ./secrets/solochain-relay-ethereum-key
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: dh-solochain-relay
spec:
  serviceName: dh-solochain-relay
  replicas: 1
  selector:
    matchLabels:
      app: dh-solochain-relay
  template:
    metadata:
      labels:
        app: dh-solochain-relay
    spec:
      containers:
        - name: solochain-relay
          image: datahavenxyz/snowbridge-relay:latest
          imagePullPolicy: Always
          args:
            - "run"
            - "solochain"
            - "--config"
            - "/configs/solochain-relay.json"
            - "--ethereum.private-key-file"
            - "/secrets/dh-solochain-relay-ethereum-key"
            - "--substrate.private-key"
            - "$(SUBSTRATE_PRIVATE_KEY)"
          env:
            - name: SUBSTRATE_PRIVATE_KEY
              valueFrom:
                secretKeyRef:
                  name: dh-solochain-relay-substrate-key
                  key: private-key
          volumeMounts:
            - name: config
              mountPath: /configs
              readOnly: true
            - name: secrets
              mountPath: /secrets
              readOnly: true
            - name: relay-data
              mountPath: /relay-data
          resources:
            requests:
              memory: "512Mi"
              cpu: "250m"
            limits:
              memory: "1Gi"
              cpu: "500m"
      volumes:
        - name: config
          configMap:
            name: solochain-relay-config
        - name: secrets
          secret:
            secretName: dh-solochain-relay-ethereum-key
  volumeClaimTemplates:
    - metadata:
        name: relay-data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            # Matches the 20 GB datastore recommendation in Hardware Requirements
            storage: 20Gi
```
## Message Flow
### DataHaven → Ethereum Message Flow
1. User submits outbound message on DataHaven
2. `EthereumOutboundQueue` pallet queues the message
3. Solochain Relay monitors for outbound messages
4. Relay constructs message proof using BEEFY finality
5. Relay submits proof to Gateway contract on Ethereum
6. Gateway verifies proof against BeefyClient
7. Message is executed on Ethereum
### Reward Distribution Flow
1. Validators earn rewards on DataHaven
2. Reward data is synchronized to RewardsRegistry contract
3. Operators can claim rewards on Ethereum
## Multi-Instance Deployment
For high-availability or load distribution, multiple Solochain Relayers can be deployed using the `schedule` configuration to coordinate between instances.
### Schedule Configuration Parameters
| Parameter | Type | Description |
|-----------|------|-------------|
| `schedule.id` | `number` | Unique identifier for this relay instance (0-indexed). |
| `schedule.totalRelayerCount` | `number` | Total number of relay instances in the deployment. All instances must use the same value. |
| `schedule.sleepInterval` | `number` | Seconds to wait between polling for new messages. Lower values = faster detection, higher resource usage. |
### How Multi-Instance Scheduling Works
When multiple relayers are deployed, the `schedule.id` and `totalRelayerCount` parameters work together to distribute message processing:
1. **Message assignment**: Messages are assigned to relayers based on `message_nonce % totalRelayerCount == schedule.id`
2. **Staggered processing**: Each relayer only processes messages assigned to its ID, preventing duplicate submissions
3. **Failover**: If a relayer fails, its messages will eventually be picked up by other relayers after timeout
**Example with 2 relayers:**
- Instance 0 processes messages where `nonce % 2 == 0` (nonces: 0, 2, 4, 6, ...)
- Instance 1 processes messages where `nonce % 2 == 1` (nonces: 1, 3, 5, 7, ...)
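The assignment rule can be sketched as a small function. This is only an illustration of the `nonce % totalRelayerCount == schedule.id` rule described here, not the relay's actual source; `isAssignedTo` and `assignedNonces` are hypothetical helper names.

```typescript
// Hypothetical helpers illustrating the modulo assignment rule.
// They are not part of the relay; names and signatures are assumptions.
interface Schedule {
  id: number;                // schedule.id (0-indexed)
  totalRelayerCount: number; // schedule.totalRelayerCount
}

// True when the message with this nonce belongs to the given instance.
function isAssignedTo(nonce: number, schedule: Schedule): boolean {
  return nonce % schedule.totalRelayerCount === schedule.id;
}

// Lists the nonces in [0, upTo) that the given instance would process.
function assignedNonces(upTo: number, schedule: Schedule): number[] {
  const result: number[] = [];
  for (let nonce = 0; nonce < upTo; nonce++) {
    if (isAssignedTo(nonce, schedule)) {
      result.push(nonce);
    }
  }
  return result;
}

const instance0 = assignedNonces(8, { id: 0, totalRelayerCount: 2 }); // [0, 2, 4, 6]
const instance1 = assignedNonces(8, { id: 1, totalRelayerCount: 2 }); // [1, 3, 5, 7]
```

Because every nonce maps to exactly one instance ID, two healthy instances never pick the same message under this rule.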
### Configuration Examples
**Single Instance (default):**
```json
{
  "schedule": {
    "id": 0,
    "totalRelayerCount": 1,
    "sleepInterval": 10
  }
}
```
**Two-Instance Deployment:**
*Instance 0:*
```json
{
  "schedule": {
    "id": 0,
    "totalRelayerCount": 2,
    "sleepInterval": 10
  }
}
```
*Instance 1:*
```json
{
"schedule": {
"id": 1,
"totalRelayerCount": 2,
"sleepInterval": 10
}
}
```
### Sleep Interval Tuning
The `sleepInterval` parameter controls how frequently the relay polls for new messages:
| Value | Use Case | Trade-offs |
|-------|----------|------------|
| `1` | Low latency required | Higher RPC usage, faster message detection |
| `10` | Balanced (default) | Good balance of latency and resource usage |
| `30` | Cost-sensitive | Lower RPC costs, slower message detection |
**Recommendation**: The default `sleepInterval: 10` works well for most deployments. Decrease if message latency is critical; increase if RPC rate limits are a concern.
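To get a rough feel for the trade-off, the polling rate implied by each value can be estimated with simple arithmetic. The sketch below assumes one polling cycle per `sleepInterval` seconds and ignores the time the poll itself takes; a single cycle may issue several RPC calls, so treat the numbers as a lower bound on request volume. `pollsPerDay` is a hypothetical helper, not a relay function.

```typescript
const SECONDS_PER_DAY = 24 * 60 * 60;

// Approximate number of polling cycles per day for a given sleepInterval (seconds).
function pollsPerDay(sleepIntervalSeconds: number): number {
  if (sleepIntervalSeconds <= 0) {
    throw new Error("sleepInterval must be a positive number of seconds");
  }
  return Math.floor(SECONDS_PER_DAY / sleepIntervalSeconds);
}

const tradeOffs = [1, 10, 30].map((interval) => ({
  sleepInterval: interval,
  pollsPerDay: pollsPerDay(interval), // 86400, 8640, 2880
  // A message that arrives right after a poll waits roughly one full interval.
  worstCaseExtraDelaySeconds: interval,
}));
```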
### Deployment Checklist
1. **Unique IDs**: Each instance must have a unique `schedule.id` (0 to `totalRelayerCount - 1`)
2. **Consistent count**: All instances must use the same `totalRelayerCount` value
3. **Separate accounts**: Give each instance its own Ethereum and Substrate accounts to avoid transaction nonce conflicts
4. **Independent storage**: Each instance needs its own persistent datastore volume
5. **Different RPC endpoints**: Point instances to different RPC providers for fault tolerance
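The first two checklist items lend themselves to an automated pre-deployment check. The following is a minimal sketch, assuming the `schedule` objects have already been read from each instance's config file; `validateSchedules` is a hypothetical helper and does not cover accounts, storage, or RPC endpoints.

```typescript
interface Schedule {
  id: number;
  totalRelayerCount: number;
  sleepInterval: number;
}

// Returns a list of human-readable problems; an empty list means items 1 and 2 pass.
function validateSchedules(schedules: Schedule[]): string[] {
  if (schedules.length === 0) {
    return ["no instances configured"];
  }
  const errors: string[] = [];
  const expectedCount = schedules[0].totalRelayerCount;
  const seenIds: { [id: number]: boolean } = {};

  for (const s of schedules) {
    if (s.totalRelayerCount !== expectedCount) {
      errors.push(`instance ${s.id}: totalRelayerCount ${s.totalRelayerCount} differs from ${expectedCount}`);
    }
    if (!Number.isInteger(s.id) || s.id < 0 || s.id >= s.totalRelayerCount) {
      errors.push(`instance ${s.id}: id must be between 0 and totalRelayerCount - 1`);
    }
    if (seenIds[s.id]) {
      errors.push(`instance ${s.id}: duplicate schedule.id`);
    }
    seenIds[s.id] = true;
  }
  if (schedules.length !== expectedCount) {
    errors.push(`expected ${expectedCount} instances, found ${schedules.length}`);
  }
  return errors;
}

const okDeployment = validateSchedules([
  { id: 0, totalRelayerCount: 2, sleepInterval: 10 },
  { id: 1, totalRelayerCount: 2, sleepInterval: 10 },
]); // []
```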
## Monitoring
### Health Checks
```bash
# View relay logs
docker logs -f snowbridge-solochain-relay
# Check for message processing
docker logs snowbridge-solochain-relay 2>&1 | grep -i "message\|submit\|reward"
```
### Key Metrics to Monitor
- Outbound message queue depth
- Message delivery success rate
- Reward synchronization status
- Ethereum and Substrate account balances
- BEEFY finality status
## Troubleshooting
### Issue: Messages Not Being Delivered
**Check**:
1. BeefyClient has recent commitments (BEEFY Relay running)
2. Gateway contract address is correct
3. Both Ethereum and Substrate endpoints are accessible
4. Account balances are sufficient
### Issue: Reward Synchronization Failures
**Check**:
1. RewardsRegistry contract address is correct
2. Ethereum account has sufficient gas
3. Reward data is available on DataHaven
### Issue: OFAC Screening Failures
**Check**:
1. API key is valid (if OFAC enabled)
2. Network connectivity to OFAC service
3. Consider disabling OFAC for testing environments
## Security Considerations
1. **Private Key Protection**: Secure both Ethereum and Substrate keys
2. **Dual Account Management**: Monitor balances on both chains
3. **Network Security**: Use secure connections (WSS) when possible
4. **Access Control**: Use dedicated accounts with minimal permissions
5. **OFAC Compliance**: Enable OFAC screening for production if required
## Dependencies
The Solochain Relay depends on:
- **BEEFY Relay**: Must be running to provide finality proofs
- **Beacon Relay**: Must be running for Ethereum light client state
Ensure both relays are operational before starting the Solochain Relay.
## Related Documentation
- [Beacon Relay](./snowbridge-beacon-relay.md)
- [BEEFY Relay](./snowbridge-beefy-relay.md)
- [Execution Relay](./snowbridge-execution-relay.md)
- [Relay Operating Costs](./snowbridge-relay-costs.md)
- [Snowbridge Documentation](https://docs.snowbridge.network)
- [DataHaven Snowbridge Repository](https://github.com/datahaven-xyz/snowbridge)

676
docs/storagehub-bsp.md Normal file

@ -0,0 +1,676 @@
# StorageHub Backup Storage Provider (BSP) Setup
## Overview
Backup Storage Providers (BSPs) provide redundant storage for files in the StorageHub network, receiving files from Main Storage Providers (MSPs) and submitting proofs of storage.
## Purpose
- Store backup copies of files
- Submit proofs of storage periodically
- Charge fees from users for backup storage
- Handle bucket migrations
- Serve file download requests as backup
- Ensure data redundancy and availability
## Prerequisites
- DataHaven node binary or Docker image
- Funded account with sufficient balance for deposits
- Storage capacity (minimum 1 TB, recommended 2+ TB)
- Stable network connection
- Open network ports (30333, optionally 9944)
## Hardware Requirements
BSPs have similar hardware requirements to MSPs as they store backup data and must reliably submit proofs of storage.
### Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 8 physical cores @ 3.4 GHz (Intel Ice Lake+ or AMD Zen3+) |
| **RAM** | 32 GB DDR4 ECC |
| **Storage (System)** | 500 GB NVMe SSD (chain data) |
| **Storage (User Data)** | 1 TB NVMe SSD or HDD (minimum) |
| **Network** | 500 Mbit/s symmetric |
### Important Considerations
- **Separate storage volumes**: Keep chain data and user data on separate volumes for better I/O performance
- **Storage expansion**: Plan for growth; user data storage should be easily expandable
- **max-storage-capacity**: Set this CLI flag to **80% of available physical disk space** to leave headroom for filesystem overhead and temporary files
- **Cloud compatible**: BSPs can run effectively on cloud VPS with dedicated storage volumes
- **Proof submission**: Ensure reliable network connectivity for timely proof submissions
## Key Requirements
### BCSV Key (ECDSA - 1 Required)
BSPs require **one BCSV key** for storage provider identity.
| Key Type | Scheme | Purpose |
|----------|--------|---------|
| `bcsv` | ecdsa | Storage provider identity and signing |
### Generate BCSV Key
#### Method 1: CLI Key Insertion
```bash
# Generate seed phrase
SEED=$(datahaven-node key generate | grep "Secret phrase" | cut -d'`' -f2)
# Insert BCSV key (ecdsa)
datahaven-node key insert \
--base-path /data/bsp \
--chain stagenet-local \
--key-type bcsv \
--scheme ecdsa \
--suri "$SEED"
```
#### Method 2: Docker Entrypoint (Automated)
Set environment variables:
```bash
export NODE_TYPE=bsp
export NODE_NAME=bsp01
export SEED="your seed phrase here"
export CHAIN=stagenet-local
```
The entrypoint script automatically injects the BCSV key.
## Wallet Requirements
### Provider Account
- **Purpose**: BSP registration, transaction fees, and deposits
- **Required Balance**:
- Base deposit: 100 HAVE (`SpMinDeposit`)
- Deposit per GiB: 2 HAVE (`DepositPerData`)
- Transaction fees: ~10 HAVE
- **Funding**: Must be funded **before** BSP registration
- **Account Type**: Ethereum-style 20-byte address (AccountId20)
**Deposit Calculation by Capacity:**
| Storage Capacity | Deposit Required | Recommended Balance |
|------------------|------------------|---------------------|
| 800 GiB (1 TB disk) | ~1,700 HAVE | 1,800+ HAVE |
| 1.6 TiB (2 TB disk) | ~3,400 HAVE | 3,600+ HAVE |
| 4 TiB (5 TB disk) | ~8,300 HAVE | 8,500+ HAVE |
Formula: `100 + (capacity_in_gib × 2) + buffer`
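The formula can be applied directly to the byte value passed to `--max-storage-capacity`. This is a minimal sketch using the deposit constants quoted in this section (100 HAVE base, 2 HAVE per GiB); the helper names are hypothetical, and the runtime's exact rounding is not covered here, so treat the result as an estimate.

```typescript
const SP_MIN_DEPOSIT_HAVE = 100; // base deposit (SpMinDeposit), as stated in this section
const DEPOSIT_PER_GIB_HAVE = 2;  // per-GiB deposit (DepositPerData), as stated in this section
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Estimated deposit in HAVE for a capacity given in bytes.
function requiredDepositHave(capacityBytes: number): number {
  const capacityGib = capacityBytes / BYTES_PER_GIB;
  return SP_MIN_DEPOSIT_HAVE + capacityGib * DEPOSIT_PER_GIB_HAVE;
}

// Deposit plus a buffer for transaction fees.
function recommendedBalanceHave(capacityBytes: number, bufferHave: number): number {
  return requiredDepositHave(capacityBytes) + bufferHave;
}

const deposit800Gib = requiredDepositHave(858993459200);         // 1700
const balance800Gib = recommendedBalanceHave(858993459200, 100); // 1800
const deposit4Tib = requiredDepositHave(4096 * BYTES_PER_GIB);   // 8292
```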
### Generate Provider Account
```bash
# Generate new account from seed
SEED="your secure seed phrase here"
echo "$SEED" | datahaven-node key inspect --output-type json | jq
# Derive BSP account
echo "$SEED" | datahaven-node key inspect --output-type json | jq -r '.ss58PublicKey'
```
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--provider \
--provider-type bsp \
--max-storage-capacity <BYTES> \
--jump-capacity <BYTES>
```
### Core Provider Flags
| Flag | Description | Required | Default |
|------|-------------|----------|---------|
| `--provider` | Enable storage provider mode | Yes | false |
| `--provider-type bsp` | Set provider type to BSP | Yes | None |
| `--max-storage-capacity <BYTES>` | Maximum storage capacity | Yes | None |
| `--jump-capacity <BYTES>` | Jump capacity for new storage | Yes | None |
| `--storage-layer <TYPE>` | Storage backend (`rocksdb` or `memory`) | No | `memory` |
| `--storage-path <PATH>` | Storage path (required if rocksdb) | No | None |
**Example Values:**
- `--max-storage-capacity 858993459200` (800 GiB = 80% of 1 TB disk)
- `--max-storage-capacity 1717986918400` (1,600 GiB ≈ 1.56 TiB, 80% of a 2 TB disk)
- `--jump-capacity 107374182400` (100 GiB)
**Note**: Set `--max-storage-capacity` to approximately **80% of your available physical disk space** to leave headroom for filesystem overhead and temporary files.
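The byte values above can be derived rather than copied. The sketch below assumes the available space is expressed in GiB and, like the examples in this section, treats a "1 TB" disk as 1,000 GiB; `maxStorageCapacityBytes` is a hypothetical helper, not a node utility.

```typescript
const BYTES_PER_GIB = 1024 * 1024 * 1024;

// Returns the value for --max-storage-capacity: a percentage of the
// available space (default 80%), rounded down to a whole GiB, in bytes.
function maxStorageCapacityBytes(availableGib: number, percent: number = 80): number {
  if (availableGib <= 0 || percent <= 0 || percent > 100) {
    throw new Error("availableGib must be positive and percent must be in (0, 100]");
  }
  const usableGib = Math.floor((availableGib * percent) / 100);
  return usableGib * BYTES_PER_GIB;
}

const oneTbDisk = maxStorageCapacityBytes(1000); // 858993459200 (800 GiB)
const twoTbDisk = maxStorageCapacityBytes(2000); // 1717986918400 (1,600 GiB)
const jumpCapacity = 100 * BYTES_PER_GIB;        // 107374182400 (100 GiB)
```

Rounding down to a whole GiB keeps the value easy to compare against the deposit formula, which is expressed per GiB.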
### BSP-Specific Task Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--bsp-upload-file-task` | Enable file upload from MSP task | false |
| `--bsp-upload-file-max-try-count <N>` | Max retries for file uploads | 5 |
| `--bsp-upload-file-max-tip <AMOUNT>` | Max tip for upload file extrinsics | 0 |
| `--bsp-move-bucket-task` | Enable bucket migration task | false |
| `--bsp-move-bucket-grace-period <SECONDS>` | Grace period after bucket move | 300 |
| `--bsp-charge-fees-task` | Enable automatic fee charging | false |
| `--bsp-charge-fees-min-debt <AMOUNT>` | Minimum debt threshold to charge | 0 |
| `--bsp-submit-proof-task` | Enable proof submission task | false |
| `--bsp-submit-proof-max-attempts <N>` | Max attempts to submit proof | 3 |
### Remote File Handling Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--max-file-size <BYTES>` | Maximum file size | 10737418240 (10 GB) |
| `--connection-timeout <SECONDS>` | Connection timeout | 30 |
| `--read-timeout <SECONDS>` | Read timeout | 300 |
| `--follow-redirects <BOOL>` | Follow HTTP redirects | true |
| `--max-redirects <N>` | Maximum redirects | 10 |
| `--user-agent <STRING>` | HTTP user agent | "StorageHub-Client/1.0" |
| `--chunk-size <BYTES>` | Upload/download chunk size | 8192 (8 KB) |
| `--chunks-buffer <N>` | Number of chunks to buffer | 512 |
### Operational Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--extrinsic-retry-timeout <SECONDS>` | Extrinsic retry timeout | 60 |
| `--sync-mode-min-blocks-behind <N>` | Min blocks behind for sync mode | 5 |
| `--check-for-pending-proofs-period <N>` | Period to check pending proofs | 4 |
| `--max-blocks-behind-to-catch-up-root-changes <N>` | Max blocks to process for root changes | 10 |
## Complete Setup Example
### 1. Generate Keys and Account
```bash
# Generate seed phrase
SEED="your secure seed phrase here"
# Derive BSP account
BSP_ACCOUNT=$(echo "$SEED" | datahaven-node key inspect --output-type json | jq -r '.ss58PublicKey')
echo "BSP Account: $BSP_ACCOUNT"
# Insert BCSV key
datahaven-node key insert \
--base-path /data/bsp \
--chain stagenet-local \
--key-type bcsv \
--scheme ecdsa \
--suri "$SEED"
```
### 2. Fund Provider Account
```bash
# Transfer funds to BSP account
# For 800 GiB capacity: ~1,800 HAVE (1,700 deposit + 100 buffer)
# For 1.6 TiB capacity: ~3,600 HAVE (3,400 deposit + 200 buffer)
# Using Polkadot.js or a funded account, send HAVE tokens to $BSP_ACCOUNT
# Formula: 100 + (capacity_in_gib × 2) + buffer
```
### 3. Start BSP Node
```bash
datahaven-node \
--chain stagenet-local \
--name "BSP01" \
--base-path /data/bsp \
--provider \
--provider-type bsp \
--max-storage-capacity 858993459200 \
--jump-capacity 107374182400 \
--storage-layer rocksdb \
--storage-path /data/bsp/storage \
--bsp-upload-file-task \
--bsp-move-bucket-task \
--bsp-charge-fees-task \
--bsp-submit-proof-task \
--port 30333 \
--rpc-port 9946 \
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/12D3KooW...
```
### 4. Register BSP On-Chain
See [On-Chain Registration](#on-chain-registration) section below.
## Docker Deployment
### Docker Compose
```yaml
version: '3.8'
services:
  bsp:
    image: datahavenxyz/datahaven:latest
    container_name: storagehub-bsp
    environment:
      NODE_TYPE: bsp
      NODE_NAME: bsp01
      SEED: "your seed phrase here"
      CHAIN: stagenet-local
      KEYSTORE_PATH: /data/keystore
    ports:
      - "30334:30333"
      - "9946:9946"
    volumes:
      - bsp-data:/data
      - bsp-storage:/data/storage
    command:
      - "--chain=stagenet-local"
      - "--name=BSP01"
      - "--base-path=/data"
      - "--keystore-path=/data/keystore"
      - "--provider"
      - "--provider-type=bsp"
      - "--max-storage-capacity=858993459200"
      - "--jump-capacity=107374182400"
      - "--storage-layer=rocksdb"
      - "--storage-path=/data/storage"
      - "--bsp-upload-file-task"
      - "--bsp-move-bucket-task"
      - "--bsp-charge-fees-task"
      - "--bsp-submit-proof-task"
      - "--port=30333"
      - "--rpc-port=9946"
    restart: unless-stopped
volumes:
  bsp-data:
  bsp-storage:
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: storagehub-bsp
spec:
  serviceName: storagehub-bsp
  replicas: 1
  selector:
    matchLabels:
      app: storagehub-bsp
  template:
    metadata:
      labels:
        app: storagehub-bsp
    spec:
      containers:
        - name: bsp
          image: datahavenxyz/datahaven:latest
          env:
            - name: NODE_TYPE
              value: "bsp"
            - name: NODE_NAME
              value: "bsp01"
            - name: SEED
              valueFrom:
                secretKeyRef:
                  name: bsp-seed
                  key: seed
          ports:
            - containerPort: 30333
              name: p2p
            - containerPort: 9946
              name: rpc
          volumeMounts:
            - name: data
              mountPath: /data
            - name: storage
              mountPath: /data/storage
          resources:
            requests:
              memory: "4Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "4"
          args:
            - "--chain=stagenet-local"
            - "--provider"
            - "--provider-type=bsp"
            - "--max-storage-capacity=858993459200"
            - "--jump-capacity=107374182400"
            - "--storage-layer=rocksdb"
            - "--storage-path=/data/storage"
            - "--bsp-upload-file-task"
            - "--bsp-move-bucket-task"
            - "--bsp-charge-fees-task"
            - "--bsp-submit-proof-task"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 500Gi
    - metadata:
        name: storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 1000Gi
```
## On-Chain Registration
### BSP Registration Process
BSPs must be registered on-chain via the `Providers` pallet using a **2-step process**:
1. **Step 1**: Call `request_bsp_sign_up` - Initiates registration and reserves deposit
2. **Step 2**: Call `confirm_sign_up` - Completes registration after randomness verification
This two-step mechanism ensures security and prevents manipulation of provider IDs through randomness.
### Step 1: Request BSP Sign Up
```typescript
import { Binary, createClient } from 'polkadot-api';
import { getWsProvider } from 'polkadot-api/ws-provider/web';
import { withPolkadotSdkCompat } from 'polkadot-api/polkadot-sdk-compat';
import { datahaven } from '@polkadot-api/descriptors';

// Connect to DataHaven node
const client = createClient(
  withPolkadotSdkCompat(getWsProvider('ws://localhost:9944'))
);
const typedApi = client.getTypedApi(datahaven);

// BSP signer (using your BCSV key account)
const bspSigner = /* your polkadot-api signer */;

// BSP configuration
const capacity = BigInt(858_993_459_200); // 800 GiB (80% of 1 TB disk)
const multiaddresses = [
  '/ip4/127.0.0.1/tcp/30333',
  '/dns/bsp01.example.com/tcp/30333'
].map(addr => Binary.fromText(addr));

// Step 1: Request BSP sign up
const requestTx = typedApi.tx.Providers.request_bsp_sign_up({
  capacity: capacity,
  multiaddresses: multiaddresses,
  payment_account: bspSigner.publicKey // Account receiving payments
});

// Sign and submit the request. In polkadot-api, signAndSubmit resolves once
// the transaction is included in a finalized block.
console.log('Submitting BSP sign-up request. Waiting for finalization...');
const requestResult = await requestTx.signAndSubmit(bspSigner);
if (!requestResult.ok) {
  throw new Error('request_bsp_sign_up failed on-chain');
}
console.log('Request finalized! Deposit has been reserved.');
```
**What Happens in Step 1:**
- Validates multiaddresses format
- Calculates required deposit based on capacity (`SpMinDeposit + capacity * DepositPerData`)
- Verifies account has sufficient balance
- **Holds (reserves) the deposit** from your account
- Creates a pending sign-up request
- Emits `BspRequestSignUpSuccess` event
### Step 2: Confirm Sign Up
After requesting, you must wait for sufficient randomness to be available (controlled by `MaxBlocksForRandomness` parameter, typically 2 hours on mainnet).
```typescript
// Step 2: Confirm the sign-up (after waiting for randomness)
const confirmTx = typedApi.tx.Providers.confirm_sign_up({
  provider_account: undefined // Optional: omit to use signer's account
});

// Sign and submit confirmation. signAndSubmit resolves once the
// transaction is included in a finalized block.
console.log('Confirming BSP registration...');
const confirmResult = await confirmTx.signAndSubmit(bspSigner);
if (!confirmResult.ok) {
  throw new Error('confirm_sign_up failed on-chain');
}
console.log('BSP registration confirmed and active!');
```
**What Happens in Step 2:**
- Verifies randomness is sufficiently fresh
- Checks request hasn't expired
- Generates Provider ID using randomness
- Registers BSP in the system
- Applies sign-up lock period (90 days on testnet/mainnet via `BspSignUpLockPeriod`)
- Emits `BspSignUpSuccess` event
- Deposit remains held for duration of BSP operation
### Timing Requirements
| Parameter | Testnet | Mainnet | Description |
|-----------|---------|---------|-------------|
| Min wait time | ~2 minutes | ~2 hours | Wait after `request_bsp_sign_up` for randomness |
| Max wait time | Set by `MaxBlocksForRandomness` | Typically 2 hours | Request expires if not confirmed in time |
| Sign-up lock | 90 days | 90 days | Period before BSP can sign off after registration |
### Verify Registration
```typescript
// Check BSP registration status
const bspAccount = bspSigner.publicKey;
const registeredBspId = await typedApi.query.Providers.AccountIdToBackupStorageProviderId.getValue(
  bspAccount
);

if (registeredBspId) {
  console.log('Registered BSP ID:', registeredBspId);

  // Get full BSP details
  const bspInfo = await typedApi.query.Providers.BackupStorageProviders.getValue(
    registeredBspId
  );
  console.log('BSP Info:', bspInfo);
} else {
  console.log('BSP not yet registered or confirmation pending');
}
```
### Cancel Pending Request
If you change your mind before confirming:
```typescript
const cancelTx = typedApi.tx.Providers.cancel_sign_up();
await cancelTx.signAndSubmit(bspSigner);
console.log('Sign-up request cancelled, deposit returned');
```
### Development/Testing: Force Sign Up (Requires Sudo)
For development and testing environments with sudo access, you can bypass the 2-step process:
```typescript
// Single-step registration for testing (requires sudo)
const sudoSigner = /* sudo account signer */;

const bspCall = typedApi.tx.Providers.force_bsp_sign_up({
  who: bspAccount,
  bsp_id: /* pre-generated provider ID */,
  capacity: BigInt(858_993_459_200), // 800 GiB
  multiaddresses: multiaddresses,
  payment_account: bspAccount,
  weight: undefined // Optional weight parameter
});

const sudoTx = typedApi.tx.Sudo.sudo({ call: bspCall.decodedCall });
await sudoTx.signAndSubmit(sudoSigner);
```
### Registration Parameters
| Parameter | Type | Description | Example |
|-----------|------|-------------|---------|
| `capacity` | StorageDataUnit | Storage capacity in bytes | `858993459200` (800 GiB) |
| `multiaddresses` | Vec<Bytes> | P2P network addresses | `[Binary.fromText("/ip4/...")]` |
| `payment_account` | AccountId | Account receiving payments | `0x...` (20-byte) |
### Deposit Requirements
- **Base Deposit**: 100 HAVE (`SpMinDeposit`)
- **Per GiB**: 2 HAVE (`DepositPerData`)
- **Formula**: `100 + (capacity_in_gib × 2)`
**Examples:**
- 800 GiB capacity: `100 + (800 × 2) = 1,700 HAVE`
- 1.6 TiB capacity: `100 + (1,638 × 2) = 3,376 HAVE`
The deposit is **held (reserved)** from your account when you call `request_bsp_sign_up` and remains held while you operate as a BSP. The deposit is returned when you deregister as a BSP.
## Monitoring
### Health Checks
```bash
# Check node health
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9946 | jq
# Check provider status
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "storageprovider_getStatus"}' \
http://localhost:9946 | jq
```
### Key Metrics to Monitor
- Storage capacity usage
- Number of stored files
- Proof submission success rate
- File upload success rate from MSPs
- Fee collection status
- Bucket migration status
### Logs
```bash
# View BSP logs
docker logs -f storagehub-bsp
# Filter for storage events
docker logs storagehub-bsp 2>&1 | grep -i "storage\|proof\|file"
# Monitor proof submissions
docker logs storagehub-bsp 2>&1 | grep -i "proof"
```
## Proof Submission
### Automatic Proof Submission
BSPs automatically submit proofs when `--bsp-submit-proof-task` is enabled.
### Proof Submission Flow
1. **Challenge Received**: BSP receives storage proof challenge from ProofsDealer pallet
2. **Proof Generation**: BSP generates Merkle proof for challenged data
3. **Proof Submission**: BSP submits proof via `proofsDealer.submitProof` extrinsic
4. **Verification**: ProofsDealer pallet verifies proof on-chain
5. **Reward/Penalty**: BSP receives reward for valid proof or penalty for invalid/missing proof
### Monitor Proof Submission
```bash
# Check pending proofs
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "storageprovider_getPendingProofs"}' \
http://localhost:9946 | jq
```
## Troubleshooting
### Issue: Registration Failed
**Check:**
1. Account has sufficient balance for the deposit plus transaction fees (`100 + capacity_in_gib × 2` HAVE; for example, ~1,800 HAVE for 800 GiB)
2. BCSV key is correctly inserted
3. Capacity meets minimum (2 data units)
4. Provider ID is correctly calculated
### Issue: Not Receiving Files from MSP
**Check:**
1. BSP is registered on-chain
2. `--bsp-upload-file-task` flag is enabled
3. Storage capacity not exceeded
4. Node is fully synced
5. Network connectivity to MSPs
### Issue: Proof Submission Failing
**Check:**
1. `--bsp-submit-proof-task` flag is enabled
2. Node is fully synced
3. Sufficient balance for transaction fees
4. Files are correctly stored and accessible
5. Check logs for specific errors
### Issue: Fee Charging Not Working
**Check:**
1. `--bsp-charge-fees-task` flag is enabled
2. Users have sufficient debt to charge
3. Node is synced and connected
## Security Considerations
1. **Key Management**: Store seed phrase securely offline
2. **Storage Security**: Encrypt storage at rest
3. **Network Security**: Use firewall to restrict access
4. **Proof Integrity**: Ensure storage backend reliability
5. **Backup Strategy**: Regular backups of stored data
## Best Practices
1. Use production-grade storage (NVMe SSD recommended)
2. Monitor storage capacity proactively
3. Enable all BSP tasks for full functionality
4. Keep node software updated
5. Implement monitoring and alerting for proof submissions
6. Set reasonable `bsp-submit-proof-max-attempts` (3-5)
7. Document operational procedures
8. Monitor network connectivity to MSPs
## Performance Considerations
### Resource Requirements
| Component | Minimum | Recommended |
|-----------|---------|-------------|
| CPU | 2 cores | 4 cores |
| RAM | 4 GB | 8 GB |
| Storage (Chain Data) | 100 GB | 200 GB |
| Storage (Files) | 10 GB | 500+ GB |
| Network | 100 Mbps | 1 Gbps |
### Storage Backend Comparison
| Backend | Pros | Cons | Use Case |
|---------|------|------|----------|
| `memory` | Fast, simple | Not persistent | Testing only |
| `rocksdb` | Persistent, production-ready | Slower than memory | Production |
## Related Documentation
- [MSP Setup](./storagehub-msp.md)
- [Indexer Setup](./storagehub-indexer.md)
- [Fisherman Setup](./storagehub-fisherman.md)
- [StorageHub Pallets](https://github.com/Moonsong-Labs/storage-hub)
- [Proofs Dealer Pallet](https://github.com/Moonsong-Labs/storage-hub/tree/main/pallets/proofs-dealer)


@ -0,0 +1,602 @@
# StorageHub Fisherman Node Setup
## Overview
Fisherman nodes monitor and validate storage provider behavior, detecting violations and submitting challenges to ensure network integrity.
## Purpose
- Monitor storage provider behavior and compliance
- Detect storage proof violations
- Validate provider availability
- Submit challenges for non-compliant providers
- Ensure data integrity and provider accountability
- Earn rewards for successful violation detection
## Prerequisites
- DataHaven node binary or Docker image
- Funded account with sufficient balance for challenges
- PostgreSQL 14+ database (can share with Indexer)
- Sufficient storage for chain data
- Stable network connection
- Open network ports (30333, optionally 9944)
## Hardware Requirements
Fisherman nodes have moderate hardware requirements. They rely on a PostgreSQL database (typically shared with an Indexer node) to monitor provider behavior.
### Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 physical cores @ 2.5 GHz |
| **RAM** | 8 GB DDR4 |
| **Storage (Chain Data)** | 200 GB NVMe SSD |
| **Storage (Database)** | Shared with Indexer |
| **Network** | 100 Mbit/s symmetric |
### Important Considerations
- **Database dependency**: Fisherman requires a running Indexer node in `fishing` or `full` mode
- **Shared database**: Can share PostgreSQL with Indexer to reduce resource overhead
- **Network reliability**: Stable connection required for timely challenge submissions
- **Cloud compatible**: Works well on cloud VPS
## Key Requirements
### BCSV Key (ECDSA - 1 Required)
Fishermen require **one BCSV key** for signing challenge submissions.
| Key Type | Scheme | Purpose |
|----------|--------|---------|
| `bcsv` | ecdsa | Fisherman identity and challenge signing |
### Generate BCSV Key
#### Method 1: CLI Key Insertion
```bash
# Generate seed phrase
SEED=$(datahaven-node key generate | grep "Secret phrase" | cut -d'`' -f2)
# Insert BCSV key (ecdsa)
datahaven-node key insert \
--base-path /data/fisherman \
--chain stagenet-local \
--key-type bcsv \
--scheme ecdsa \
--suri "$SEED//Gustavo"
```
#### Method 2: Docker Entrypoint (Automated)
Set environment variables:
```bash
export NODE_TYPE=fisherman
export NODE_NAME=Gustavo
export SEED="your seed phrase here"
export CHAIN=stagenet-local
```
The entrypoint script automatically injects the BCSV key.
## Wallet Requirements
### Fisherman Account
- **Purpose**: Challenge submission and transaction fees
- **Required Balance**:
- Transaction fees: ~10 HAVE per challenge
- **Recommended**: 100+ HAVE for continuous operations
- **Funding**: Must be funded to submit challenges
- **Account Type**: Ethereum-style 20-byte address (AccountId20)
### Generate Fisherman Account
```bash
# Generate new account from seed
SEED="your secure seed phrase here"
echo "$SEED" | datahaven-node key inspect --output-type json | jq
# Derive Fisherman account (common derivation: //Gustavo)
echo "$SEED//Gustavo" | datahaven-node key inspect --output-type json | jq -r '.ss58PublicKey'
```
## Database Requirements
### PostgreSQL Setup
Fisherman nodes **require** a PostgreSQL database, which can be shared with an Indexer node.
#### Install PostgreSQL
```bash
# Ubuntu/Debian
sudo apt update
sudo apt install postgresql-14 postgresql-contrib
# macOS
brew install postgresql@14
# Docker
docker run -d \
--name fisherman-postgres \
-e POSTGRES_PASSWORD=indexer \
-e POSTGRES_USER=indexer \
-e POSTGRES_DB=datahaven \
-p 5432:5432 \
-v fisherman-db:/var/lib/postgresql/data \
postgres:14
```
#### Database Connection String
```
postgresql://indexer:indexer@localhost:5432/datahaven
```
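When the credentials come from separate settings (for example, secrets injected by an orchestrator), the connection string can be assembled programmatically. This is a minimal sketch; `buildDatabaseUrl` and the `DbConfig` shape are hypothetical. It percent-encodes the user, password, and database name so that characters such as `@` or `/` in a password do not break the URL.

```typescript
interface DbConfig {
  user: string;
  password: string;
  host: string;
  port: number;
  database: string;
}

// Builds a PostgreSQL connection URL of the form
// postgresql://user:password@host:port/database
function buildDatabaseUrl(config: DbConfig): string {
  const user = encodeURIComponent(config.user);
  const password = encodeURIComponent(config.password);
  const database = encodeURIComponent(config.database);
  return `postgresql://${user}:${password}@${config.host}:${config.port}/${database}`;
}

const localUrl = buildDatabaseUrl({
  user: "indexer",
  password: "indexer",
  host: "localhost",
  port: 5432,
  database: "datahaven",
}); // "postgresql://indexer:indexer@localhost:5432/datahaven"
```

The resulting string can be passed to `--fisherman-database-url` or exported as `FISHERMAN_DATABASE_URL`.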
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--fisherman \
--fisherman-database-url <DATABASE_URL>
```
### Core Fisherman Flags
| Flag | Description | Required | Default |
|------|-------------|----------|---------|
| `--fisherman` | Enable fisherman service | Yes | false |
| `--fisherman-database-url <URL>` | PostgreSQL connection URL | Yes* | None |
| `--fisherman-incomplete-sync-max <N>` | Max incomplete sync requests to process | No | 10000 |
| `--fisherman-incomplete-sync-page-size <N>` | Page size for pagination | No | 256 |
| `--fisherman-sync-mode-min-blocks-behind <N>` | Min blocks behind for sync mode | No | 5 |
*Can also use `FISHERMAN_DATABASE_URL` environment variable
### Standard Node Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--chain <SPEC>` | Chain specification | Required |
| `--name <NAME>` | Node name | Required |
| `--base-path <PATH>` | Base directory for chain data | `~/.local/share/datahaven-node` |
| `--port <PORT>` | P2P port | `30333` |
| `--rpc-port <PORT>` | WebSocket RPC port | `9944` |
| `--bootnodes <MULTIADDR>` | Bootstrap nodes | None |
### Optional Flags
| Flag | Description |
|------|-------------|
| `--pruning <MODE>` | State pruning mode |
| `--prometheus-external` | Expose Prometheus metrics |
| `--log <TARGETS>` | Logging verbosity |
## Important Constraints
### Cannot Run with Lite Indexer
**CRITICAL**: Fisherman nodes **cannot** be run alongside an Indexer node in `lite` mode. They require one of the following:
- A separate full Indexer node
- An Indexer in `fishing` mode
- An Indexer in `full` mode
### Cannot Run as Provider Simultaneously
A node **cannot** run as both a fisherman and a storage provider (MSP/BSP) at the same time.
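Both constraints can be checked before a node is started. The sketch below encodes only the two rules stated in this section; the `FishermanPlan` shape and `constraintViolations` helper are hypothetical and do not correspond to anything in the node binary.

```typescript
type IndexerMode = "full" | "lite" | "fishing";

interface FishermanPlan {
  fisherman: boolean;       // node will be started with --fisherman
  provider: boolean;        // node would also be started with --provider (MSP/BSP)
  indexerMode: IndexerMode; // mode of the Indexer that feeds the shared database
}

// Returns the violated constraints; an empty list means the plan is acceptable.
function constraintViolations(plan: FishermanPlan): string[] {
  const violations: string[] = [];
  if (!plan.fisherman) {
    return violations; // the constraints only apply to fisherman nodes
  }
  if (plan.indexerMode === "lite") {
    violations.push("fisherman requires an Indexer in fishing or full mode, not lite");
  }
  if (plan.provider) {
    violations.push("a node cannot be both a fisherman and a storage provider");
  }
  return violations;
}

const validPlan = constraintViolations({ fisherman: true, provider: false, indexerMode: "fishing" }); // []
const invalidPlan = constraintViolations({ fisherman: true, provider: true, indexerMode: "lite" });   // 2 violations
```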
## Complete Setup Examples
### 1. Generate Keys and Account
```bash
# Generate seed phrase
SEED="your secure seed phrase here"
# Derive Fisherman account
FISHERMAN_ACCOUNT=$(echo "$SEED//Gustavo" | datahaven-node key inspect --output-type json | jq -r '.ss58PublicKey')
echo "Fisherman Account: $FISHERMAN_ACCOUNT"
# Insert BCSV key
datahaven-node key insert \
--base-path /data/fisherman \
--chain stagenet-local \
--key-type bcsv \
--scheme ecdsa \
--suri "$SEED//Gustavo"
```
### 2. Fund Fisherman Account
```bash
# Transfer funds to Fisherman account
# Minimum: 100 HAVE for continuous operations
# Using Polkadot.js or a funded account, send HAVE tokens to $FISHERMAN_ACCOUNT
```
### 3. Setup Database
```bash
# Start PostgreSQL with Docker
docker run -d \
--name fisherman-postgres \
-e POSTGRES_PASSWORD=indexer \
-e POSTGRES_USER=indexer \
-e POSTGRES_DB=datahaven \
-p 5432:5432 \
-v fisherman-db:/var/lib/postgresql/data \
postgres:14
# Verify connection
psql postgresql://indexer:indexer@localhost:5432/datahaven -c "SELECT version();"
```
### 4. Start Fisherman Node
```bash
datahaven-node \
--chain stagenet-local \
--name "Fisherman-Gustavo" \
--base-path /data/fisherman \
--fisherman \
--fisherman-database-url postgresql://indexer:indexer@localhost:5432/datahaven \
--fisherman-incomplete-sync-max 10000 \
--fisherman-incomplete-sync-page-size 256 \
--fisherman-sync-mode-min-blocks-behind 5 \
--port 30333 \
--rpc-port 9948 \
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/12D3KooW...
```
## Docker Deployment
### Docker Compose (Full Stack)
```yaml
version: '3.8'
services:
  postgres:
    image: postgres:14
    container_name: fisherman-postgres
    environment:
      POSTGRES_DB: datahaven
      POSTGRES_USER: indexer
      POSTGRES_PASSWORD: indexer
    ports:
      - "5432:5432"
    volumes:
      - fisherman-db:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U indexer -d datahaven"]
      interval: 10s
      timeout: 5s
      retries: 5
  indexer:
    image: datahavenxyz/datahaven:latest
    container_name: storagehub-indexer
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      INDEXER_DATABASE_URL: postgresql://indexer:indexer@postgres:5432/datahaven
    ports:
      - "30335:30333"
      - "9947:9947"
    volumes:
      - indexer-data:/data
    command:
      - "--chain=stagenet-local"
      - "--name=Indexer-Fishing"
      - "--base-path=/data"
      - "--indexer"
      - "--indexer-mode=fishing"
      - "--port=30333"
      - "--rpc-port=9947"
    restart: unless-stopped
  fisherman:
    image: datahavenxyz/datahaven:latest
    container_name: storagehub-fisherman
    depends_on:
      postgres:
        condition: service_healthy
      indexer:
        condition: service_started
    environment:
      NODE_TYPE: fisherman
      NODE_NAME: Gustavo
      SEED: "your seed phrase here"
      CHAIN: stagenet-local
      KEYSTORE_PATH: /data/keystore
      FISHERMAN_DATABASE_URL: postgresql://indexer:indexer@postgres:5432/datahaven
    ports:
      - "30336:30333"
      - "9948:9948"
    volumes:
      - fisherman-data:/data
    command:
      - "--chain=stagenet-local"
      - "--name=Fisherman-Gustavo"
      - "--base-path=/data"
      - "--keystore-path=/data/keystore"
      - "--fisherman"
      - "--fisherman-incomplete-sync-max=10000"
      - "--fisherman-incomplete-sync-page-size=256"
      - "--port=30333"
      - "--rpc-port=9948"
    restart: unless-stopped
volumes:
  fisherman-db:
  indexer-data:
  fisherman-data:
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: storagehub-fisherman
spec:
  serviceName: storagehub-fisherman
  replicas: 1
  selector:
    matchLabels:
      app: storagehub-fisherman
  template:
    metadata:
      labels:
        app: storagehub-fisherman
    spec:
      containers:
        - name: fisherman
          image: datahavenxyz/datahaven:latest
          env:
            - name: NODE_TYPE
              value: "fisherman"
            - name: NODE_NAME
              value: "Gustavo"
            - name: SEED
              valueFrom:
                secretKeyRef:
                  name: fisherman-seed
                  key: seed
            - name: FISHERMAN_DATABASE_URL
              value: postgresql://indexer:indexer@fisherman-postgres:5432/datahaven
          ports:
            - containerPort: 30333
              name: p2p
            - containerPort: 9948
              name: rpc
          volumeMounts:
            - name: data
              mountPath: /data
          resources:
            requests:
              memory: "8Gi"
              cpu: "4"
            limits:
              memory: "16Gi"
              cpu: "8"
          args:
            - "--chain=stagenet-local"
            - "--name=Fisherman-Gustavo"
            - "--base-path=/data"
            - "--fisherman"
            - "--fisherman-incomplete-sync-max=10000"
            - "--fisherman-incomplete-sync-page-size=256"
            - "--port=30333"
            - "--rpc-port=9948"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 200Gi
```
## On-Chain Registration
### Not Required
Fisherman nodes do not require on-chain registration. They operate autonomously by monitoring blockchain data and submitting challenges as needed.
## Fisherman Operations
### Challenge Submission Flow
1. **Monitor**: The Fisherman monitors blockchain data through the Indexer-populated PostgreSQL database
2. **Detect**: Identifies storage provider violations:
   - Missing proofs
   - Invalid proofs
   - Storage capacity violations
   - Availability issues
3. **Verify**: Validates the violation independently
4. **Challenge**: Submits a challenge extrinsic to the ProofsDealer pallet
5. **Reward**: Receives a reward if the challenge is validated
### Types of Violations Detected
| Violation Type | Description | Extrinsic |
|----------------|-------------|-----------|
| Missing Proof | Provider failed to submit proof | `proofsDealer.challengeMissingProof` |
| Invalid Proof | Submitted proof is invalid | `proofsDealer.challengeInvalidProof` |
| Over Capacity | Provider exceeds declared capacity | `providers.challengeCapacity` |
| Unavailable | Provider is unreachable | `providers.challengeAvailability` |
### Reward System
- Successful challenges earn rewards from slashed provider deposits
- Failed challenges may result in fisherman penalties
- Reward amount depends on violation severity
## Monitoring
### Health Checks
```bash
# Check node health
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9948 | jq
# Check fisherman status
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "fisherman_getStatus"}' \
http://localhost:9948 | jq
```
### Database Queries
```sql
-- Get recent challenges submitted
SELECT * FROM challenges
WHERE fisherman_account = '0x...'
ORDER BY block_number DESC
LIMIT 10;
-- Get successful challenges
SELECT * FROM challenges
WHERE fisherman_account = '0x...'
AND status = 'validated'
ORDER BY block_number DESC;
-- Get violation statistics
SELECT violation_type, COUNT(*) as count
FROM challenges
WHERE fisherman_account = '0x...'
GROUP BY violation_type;
```
### Key Metrics
- Number of challenges submitted
- Challenge success rate
- Rewards earned
- Violations detected by type
- Account balance (for fees)
### Logs
```bash
# View Fisherman logs
docker logs -f storagehub-fisherman
# Filter for challenge events
docker logs storagehub-fisherman 2>&1 | grep -i "challenge\|violation"
# Monitor successful challenges
docker logs storagehub-fisherman 2>&1 | grep -i "challenge.*success"
```
## Troubleshooting
### Issue: Database Connection Failed
**Check:**
1. PostgreSQL is running: `docker ps | grep postgres`
2. Connection string is correct
3. Database is accessible from fisherman node
4. Indexer has populated database
### Issue: Not Detecting Violations
**Check:**
1. Indexer node is running and synced
2. Indexer mode is `fishing` or `full` (not `lite`)
3. Database has recent data
4. Fisherman account has sufficient balance
5. BCSV key is correctly inserted
### Issue: Challenge Submission Failing
**Check:**
1. Account has sufficient balance for fees
2. BCSV key is valid and inserted
3. Node is fully synced
4. Violation is still valid (not already challenged)
5. Check logs for specific error messages
### Issue: No Rewards Received
**Check:**
1. Challenges were validated successfully
2. Reward distribution period has passed
3. Check on-chain events for reward distribution
4. Verify fisherman account address
## Security Considerations
1. **Key Management**: Store seed phrase securely offline
2. **Account Security**: Monitor balance for unexpected drops
3. **Database Security**: Secure database access
4. **Network Security**: Use firewall to restrict access
5. **False Positives**: Ensure validation logic is accurate
## Best Practices
1. Run alongside a dedicated Indexer node
2. Monitor account balance and set up auto-refill
3. Set a reasonable `--fisherman-incomplete-sync-max` to avoid overload
4. Keep node software updated
5. Implement monitoring and alerting
6. Document operational procedures
7. Test challenge submission in development environment
8. Monitor provider behavior patterns
## Performance Considerations
### Tuning Parameters
```bash
# For high-volume monitoring
--fisherman-incomplete-sync-max 20000 \
--fisherman-incomplete-sync-page-size 512 \
--fisherman-sync-mode-min-blocks-behind 3
```
## Economic Considerations
### Operational Costs
- **Transaction Fees**: ~10 HAVE per challenge
- **False Challenge Penalty**: Varies by violation type
- **Monitoring Costs**: Infrastructure costs
### Revenue Potential
- **Successful Challenges**: Rewards from slashed deposits
- **Volume**: Depends on network size and provider behavior
- **Competition**: Multiple fishermen may detect same violations
### Break-Even Analysis
```
Monthly Revenue = (Successful Challenges × Reward per Challenge)
Monthly Costs = (Infrastructure Costs + Transaction Fees)
Net Profit = Monthly Revenue - Monthly Costs
```
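The formula above can be sketched as a small shell function. All inputs below are hypothetical figures expressed in whole HAVE (infrastructure costs would need to be converted to HAVE first); only the ~10 HAVE fee per challenge comes from this document. Transaction fees are modelled as `total challenges × fee per challenge`, since failed challenges also pay fees.

```bash
#!/usr/bin/env bash
# Hypothetical helper illustrating the break-even formula; integer HAVE only.
# Usage: net_profit <successful_challenges> <reward_per_challenge> \
#                   <infrastructure_cost> <total_challenges> <fee_per_challenge>
net_profit() {
  local successful=$1 reward=$2 infra=$3 total=$4 fee=$5
  local revenue=$(( successful * reward ))   # Monthly Revenue
  local costs=$(( infra + total * fee ))     # Monthly Costs
  echo $(( revenue - costs ))                # Net Profit (negative means a loss)
}
```

With illustrative inputs, `net_profit 20 50 300 25 10` evaluates `20 × 50 - (300 + 25 × 10)`. False-challenge penalties are not modelled here and would be an additional cost term.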
## Related Documentation
- [MSP Setup](./storagehub-msp.md)
- [BSP Setup](./storagehub-bsp.md)
- [Indexer Setup](./storagehub-indexer.md)
- [StorageHub Pallets](https://github.com/Moonsong-Labs/storage-hub)
- [Proofs Dealer Pallet](https://github.com/Moonsong-Labs/storage-hub/tree/main/pallets/proofs-dealer)
- [Docker Compose Guide](../operator/DOCKER-COMPOSE.md)

647
docs/storagehub-indexer.md Normal file

@ -0,0 +1,647 @@
# StorageHub Indexer Node Setup
## Overview
Indexer nodes index blockchain data into a PostgreSQL database, enabling efficient querying of storage operations, file metadata, and provider activities.
## Purpose
- Index blockchain data to PostgreSQL database
- Enable efficient querying of storage operations
- Support fisherman node operations
- Provide historical data analysis
- Track file system events and provider activities
## Prerequisites
- DataHaven node binary or Docker image
- PostgreSQL 14+ database server
- Sufficient storage for chain data and database
- Stable network connection
- Open network ports (30333, optionally 9944)
## Hardware Requirements
Indexer nodes have varying requirements depending on the indexing mode. Full mode requires more resources for complete historical data indexing.
### Lite/Fishing Mode Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 4 physical cores @ 2.5 GHz |
| **RAM** | 16 GB DDR4 |
| **Storage (Chain Data)** | 100 GB NVMe SSD |
| **Storage (Database)** | 100 GB NVMe SSD |
| **Network** | 100 Mbit/s symmetric |
### Full Mode Specifications (Recommended)
| Component | Requirement |
|-----------|-------------|
| **CPU** | 8 physical cores @ 3.0 GHz (Intel Ice Lake+ or AMD Zen3+) |
| **RAM** | 32 GB DDR4 |
| **Storage (Chain Data)** | 300 GB NVMe SSD |
| **Storage (Database)** | 500 GB NVMe SSD |
| **Network** | 500 Mbit/s symmetric |
### Important Considerations
- **Archive mode**: Full indexers should run with `--pruning archive` for complete historical data
- **Database performance**: Use NVMe SSD for PostgreSQL data directory
- **Separate volumes**: Keep chain data and database on separate volumes for better I/O
- **Database growth**: Plan for database growth; full mode can grow significantly over time
- **Cloud compatible**: Indexer nodes work well on cloud VPS with dedicated storage
## Key Requirements
### No Session Keys Required
Indexer nodes do **not** require session keys as they are non-signing nodes that only observe and index blockchain data.
### No BCSV Key Required
Indexer nodes do not participate in storage operations, so no BCSV key is needed.
## Wallet Requirements
### No Wallet Required
Indexer nodes do not submit transactions, so no funded account is needed.
## Database Requirements
### PostgreSQL Setup
#### Install PostgreSQL
```bash
# Ubuntu/Debian
sudo apt update
sudo apt install postgresql-14 postgresql-contrib
# macOS
brew install postgresql@14
# Docker
docker run -d \
--name indexer-postgres \
-e POSTGRES_PASSWORD=indexer \
-e POSTGRES_USER=indexer \
-e POSTGRES_DB=datahaven \
-p 5432:5432 \
-v indexer-db:/var/lib/postgresql/data \
postgres:14
```
#### Create Database
```bash
# Connect to PostgreSQL and create the database and user
psql -U postgres <<'SQL'
CREATE DATABASE datahaven;
CREATE USER indexer WITH ENCRYPTED PASSWORD 'indexer';
GRANT ALL PRIVILEGES ON DATABASE datahaven TO indexer;
SQL
```
#### Database Connection String
```
postgresql://indexer:indexer@localhost:5432/datahaven
```
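When the credentials live in separate variables or secrets, the URL can be assembled in the shell and exported through the `INDEXER_DATABASE_URL` environment variable. This is a minimal sketch; `pg_url` is a hypothetical helper, and it does not percent-encode its arguments, so a password containing characters such as `@`, `:` or `/` would need to be encoded beforehand.

```bash
#!/usr/bin/env bash
# Hypothetical helper: build a PostgreSQL connection URL from its parts.
# Usage: pg_url <user> <password> <host> <port> <database>
# Assumption: no argument needs percent-encoding.
pg_url() {
  local user=$1 password=$2 host=$3 port=$4 db=$5
  printf 'postgresql://%s:%s@%s:%s/%s\n' "$user" "$password" "$host" "$port" "$db"
}

# Export the URL so the node can read it instead of --indexer-database-url.
INDEXER_DATABASE_URL="$(pg_url indexer indexer localhost 5432 datahaven)"
export INDEXER_DATABASE_URL
```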
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--indexer \
--indexer-database-url <DATABASE_URL>
```
### Core Indexer Flags
| Flag | Description | Required | Default |
|------|-------------|----------|---------|
| `--indexer` | Enable indexer service | Yes | false |
| `--indexer-database-url <URL>` | PostgreSQL connection URL | Yes* | None |
| `--indexer-mode <MODE>` | Indexer mode (`full`, `lite`, `fishing`) | No | `full` |
*Can also use `INDEXER_DATABASE_URL` environment variable
### Indexer Modes
| Mode | Description | Data Indexed | Use Case |
|------|-------------|--------------|----------|
| `full` | Index all blockchain data | All events, storage, metadata | Complete historical data |
| `lite` | Index essential storage data | Storage operations, files, providers | Storage-focused queries |
| `fishing` | Index data for fisherman | Provider challenges, proofs, violations | Fisherman operations |
### Standard Node Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--chain <SPEC>` | Chain specification | Required |
| `--name <NAME>` | Node name | Required |
| `--base-path <PATH>` | Base directory for chain data | `~/.local/share/datahaven-node` |
| `--port <PORT>` | P2P port | `30333` |
| `--rpc-port <PORT>` | WebSocket RPC port | `9944` |
| `--bootnodes <MULTIADDR>` | Bootstrap nodes | None |
### Optional Flags
| Flag | Description |
|------|-------------|
| `--pruning <MODE>` | State pruning mode (recommend `archive` for indexer) |
| `--blocks-pruning <MODE>` | Block pruning mode (recommend `archive`) |
| `--prometheus-external` | Expose Prometheus metrics |
| `--log <TARGETS>` | Logging verbosity |
## Complete Setup Examples
### 1. Setup Database
```bash
# Start PostgreSQL with Docker
docker run -d \
--name indexer-postgres \
-e POSTGRES_PASSWORD=indexer \
-e POSTGRES_USER=indexer \
-e POSTGRES_DB=datahaven \
-p 5432:5432 \
-v indexer-db:/var/lib/postgresql/data \
postgres:14
# Verify connection
psql postgresql://indexer:indexer@localhost:5432/datahaven -c "SELECT version();"
```
### 2. Start Indexer Node (Full Mode)
```bash
datahaven-node \
--chain stagenet-local \
--name "Indexer-Full" \
--base-path /data/indexer \
--indexer \
--indexer-mode full \
--indexer-database-url postgresql://indexer:indexer@localhost:5432/datahaven \
--pruning archive \
--blocks-pruning archive \
--port 30333 \
--rpc-port 9947 \
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/12D3KooW...
```
### 3. Start Indexer Node (Lite Mode)
```bash
datahaven-node \
--chain stagenet-local \
--name "Indexer-Lite" \
--base-path /data/indexer-lite \
--indexer \
--indexer-mode lite \
--indexer-database-url postgresql://indexer:indexer@localhost:5432/datahaven \
--port 30333 \
--rpc-port 9947
```
### 4. Start Indexer Node (Fishing Mode)
```bash
datahaven-node \
--chain stagenet-local \
--name "Indexer-Fishing" \
--base-path /data/indexer-fishing \
--indexer \
--indexer-mode fishing \
--indexer-database-url postgresql://indexer:indexer@localhost:5432/datahaven \
--port 30333 \
--rpc-port 9947
```
## Docker Deployment
### Docker Compose (Full Stack)
```yaml
version: '3.8'
services:
  postgres:
    image: postgres:14
    container_name: indexer-postgres
    environment:
      POSTGRES_DB: datahaven
      POSTGRES_USER: indexer
      POSTGRES_PASSWORD: indexer
    ports:
      - "5432:5432"
    volumes:
      - indexer-db:/var/lib/postgresql/data
    healthcheck:
      test: ["CMD-SHELL", "pg_isready -U indexer -d datahaven"]
      interval: 10s
      timeout: 5s
      retries: 5
  indexer:
    image: datahavenxyz/datahaven:latest
    container_name: storagehub-indexer
    depends_on:
      postgres:
        condition: service_healthy
    environment:
      INDEXER_DATABASE_URL: postgresql://indexer:indexer@postgres:5432/datahaven
    ports:
      - "30335:30333"
      - "9947:9947"
    volumes:
      - indexer-data:/data
    command:
      - "--chain=stagenet-local"
      - "--name=Indexer-Full"
      - "--base-path=/data"
      - "--indexer"
      - "--indexer-mode=full"
      - "--pruning=archive"
      - "--blocks-pruning=archive"
      - "--port=30333"
      - "--rpc-port=9947"
      - "--rpc-external"
    restart: unless-stopped
volumes:
  indexer-db:
  indexer-data:
```
### Docker Run
```bash
# Start PostgreSQL
docker run -d \
--name indexer-postgres \
-e POSTGRES_PASSWORD=indexer \
-e POSTGRES_USER=indexer \
-e POSTGRES_DB=datahaven \
-p 5432:5432 \
postgres:14
# Wait until PostgreSQL accepts connections
until docker exec indexer-postgres pg_isready -U indexer -d datahaven; do sleep 1; done
# Start Indexer
docker run -d \
--name storagehub-indexer \
--link indexer-postgres:postgres \
-e INDEXER_DATABASE_URL=postgresql://indexer:indexer@postgres:5432/datahaven \
-p 30333:30333 \
-p 9947:9947 \
-v $(pwd)/indexer-data:/data \
datahavenxyz/datahaven:latest \
--chain stagenet-local \
--name "Indexer-Full" \
--base-path /data \
--indexer \
--indexer-mode full \
--port 30333 \
--rpc-port 9947
```
## Kubernetes Deployment
```yaml
apiVersion: v1
kind: Service
metadata:
  name: indexer-postgres
spec:
  ports:
    - port: 5432
      targetPort: 5432
  selector:
    app: indexer-postgres
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: indexer-postgres
spec:
  serviceName: indexer-postgres
  replicas: 1
  selector:
    matchLabels:
      app: indexer-postgres
  template:
    metadata:
      labels:
        app: indexer-postgres
    spec:
      containers:
        - name: postgres
          image: postgres:14
          env:
            - name: POSTGRES_DB
              value: datahaven
            - name: POSTGRES_USER
              value: indexer
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: indexer-db-secret
                  key: password
          ports:
            - containerPort: 5432
              name: postgres
          volumeMounts:
            - name: data
              mountPath: /var/lib/postgresql/data
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 200Gi
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: storagehub-indexer
spec:
  serviceName: storagehub-indexer
  replicas: 1
  selector:
    matchLabels:
      app: storagehub-indexer
  template:
    metadata:
      labels:
        app: storagehub-indexer
    spec:
      containers:
        - name: indexer
          image: datahavenxyz/datahaven:latest
          env:
            - name: INDEXER_DATABASE_URL
              value: postgresql://indexer:indexer@indexer-postgres:5432/datahaven
          ports:
            - containerPort: 30333
              name: p2p
            - containerPort: 9947
              name: rpc
          volumeMounts:
            - name: data
              mountPath: /data
          resources:
            requests:
              memory: "16Gi"
              cpu: "4"
            limits:
              memory: "32Gi"
              cpu: "8"
          args:
            - "--chain=stagenet-local"
            - "--name=Indexer-Full"
            - "--base-path=/data"
            - "--indexer"
            - "--indexer-mode=full"
            - "--pruning=archive"
            - "--blocks-pruning=archive"
            - "--port=30333"
            - "--rpc-port=9947"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 500Gi
```
**Note**: Database storage (200Gi in PostgreSQL StatefulSet) should be increased to 500Gi for full mode in production.
## On-Chain Registration
### Not Required
Indexer nodes do not require any on-chain registration or extrinsics.
## Database Schema
### Key Tables (Generated Automatically)
The indexer automatically creates and manages database tables:
- **blocks**: Block headers and metadata
- **extrinsics**: Extrinsic data per block
- **events**: Blockchain events
- **storage_providers**: MSP/BSP registration data
- **files**: File metadata and storage information
- **buckets**: Bucket ownership and configuration
- **proofs**: Proof submissions and challenges
- **payment_streams**: Payment stream data
### Query Examples
```sql
-- Get all MSPs
SELECT * FROM storage_providers WHERE provider_type = 'msp';
-- Get files stored by a specific MSP
SELECT * FROM files WHERE msp_id = '0x...';
-- Get recent proof submissions
SELECT * FROM proofs ORDER BY block_number DESC LIMIT 10;
-- Get total storage capacity by provider type
SELECT provider_type, SUM(capacity) as total_capacity
FROM storage_providers
GROUP BY provider_type;
```
## Monitoring
### Health Checks
```bash
# Check node health
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9947 | jq
# Check sync status
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_syncState"}' \
http://localhost:9947 | jq
```
### Database Health
```bash
# Check database connection
psql postgresql://indexer:indexer@localhost:5432/datahaven -c "SELECT COUNT(*) FROM blocks;"
# Check database size
psql postgresql://indexer:indexer@localhost:5432/datahaven -c "SELECT pg_size_pretty(pg_database_size('datahaven'));"
# Check table sizes
psql postgresql://indexer:indexer@localhost:5432/datahaven -c "
SELECT
schemaname,
tablename,
pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) AS size
FROM pg_tables
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;"
```
### Key Metrics
- Indexing lag (blocks behind chain tip)
- Database size and growth rate
- Query performance
- Connection pool usage
- Disk I/O performance
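Indexing lag, the first metric above, is the difference between the chain tip and the last block written to the database. The sketch below only does the arithmetic and the threshold check; how the two block numbers are obtained is left to the operator (for instance, `highestBlock` from `system_syncState` and a maximum over the `blocks` table, whose column names are not documented here and are therefore an assumption). Both function names are hypothetical.

```bash
#!/usr/bin/env bash
# Hypothetical helpers for alerting on indexing lag.

# indexing_lag <chain_tip_block> <last_indexed_block>
# Prints how many blocks the database is behind the chain tip (never negative).
indexing_lag() {
  local lag=$(( $1 - $2 ))
  if (( lag < 0 )); then
    lag=0
  fi
  echo "$lag"
}

# lag_status <lag> <threshold>
# Prints ALERT when the lag exceeds the threshold, OK otherwise.
lag_status() {
  if (( $1 > $2 )); then
    echo "ALERT"
  else
    echo "OK"
  fi
}
```

A cron job or monitoring probe could call `lag_status "$(indexing_lag "$TIP" "$INDEXED")" 20`, where the threshold of 20 blocks is an arbitrary example value.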
## Troubleshooting
### Issue: Database Connection Failed
**Check:**
1. PostgreSQL is running: `docker ps | grep postgres`
2. Connection string is correct
3. Database credentials are valid
4. Network connectivity between node and database
5. PostgreSQL logs: `docker logs indexer-postgres`
### Issue: Slow Indexing
**Solutions:**
1. Optimize PostgreSQL configuration (changes to `shared_buffers` and `wal_buffers` take effect only after a PostgreSQL restart):
```sql
ALTER SYSTEM SET shared_buffers = '4GB';
ALTER SYSTEM SET effective_cache_size = '12GB';
ALTER SYSTEM SET maintenance_work_mem = '1GB';
ALTER SYSTEM SET checkpoint_completion_target = 0.9;
ALTER SYSTEM SET wal_buffers = '16MB';
ALTER SYSTEM SET default_statistics_target = 100;
```
2. Add indexes to frequently queried columns
3. Use faster storage (NVMe SSD)
4. Increase database connection pool size
### Issue: Database Running Out of Space
**Solutions:**
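1. Confirm autovacuum has not been disabled (it is on by default); to re-enable it for a table: `ALTER TABLE <table> SET (autovacuum_enabled = true);`
2. Manual vacuum: `VACUUM FULL;` (rewrites each table, so it takes an exclusive lock and temporarily needs free disk space for the new copy)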
3. Archive old data
4. Increase disk space
### Issue: Indexer Not Catching Up
**Check:**
1. Node is fully synced: Check `system_syncState`
2. Database has sufficient resources
3. No errors in indexer logs
4. PostgreSQL is not overloaded
## Performance Tuning
### PostgreSQL Configuration
Edit `postgresql.conf`:
```ini
# Memory
shared_buffers = 4GB
effective_cache_size = 12GB
maintenance_work_mem = 1GB
work_mem = 256MB # Applies per sort/hash operation, not per server; size it against max_connections
# Checkpoints
checkpoint_completion_target = 0.9
wal_buffers = 16MB
max_wal_size = 4GB
# Connections
max_connections = 200
# Query Performance
random_page_cost = 1.1 # For SSD
effective_io_concurrency = 200
# Statistics
default_statistics_target = 100
```
### Indexer Node Configuration
```bash
# --state-cache-size is given in bytes (268435456 = 256 MiB)
datahaven-node \
--indexer \
--pruning archive \
--blocks-pruning archive \
--state-cache-size 268435456 \
--max-runtime-instances 8
```
## Security Considerations
1. **Database Security**: Use strong passwords, restrict network access
2. **Connection Encryption**: Use SSL for PostgreSQL connections
3. **Access Control**: Limit database access to indexer node only
4. **Backup Strategy**: Regular database backups
5. **Monitoring**: Set up alerts for connection failures
## Best Practices
1. Use dedicated PostgreSQL server for production
2. Enable regular database backups (daily recommended)
3. Monitor database size and plan for growth
4. Use archive mode for complete historical data
5. Implement connection pooling (e.g., PgBouncer)
6. Regular database maintenance (VACUUM, ANALYZE)
7. Set up monitoring and alerting
8. Document backup/restore procedures
## Backup and Restore
### Backup Database
```bash
# Full backup
pg_dump -U indexer -h localhost datahaven > datahaven-backup-$(date +%Y%m%d).sql
# Compressed backup
pg_dump -U indexer -h localhost datahaven | gzip > datahaven-backup-$(date +%Y%m%d).sql.gz
```
### Restore Database
```bash
# Restore from backup
psql -U indexer -h localhost datahaven < datahaven-backup-20250124.sql
# Restore from compressed backup
gunzip -c datahaven-backup-20250124.sql.gz | psql -U indexer -h localhost datahaven
```
## Related Documentation
- [MSP Setup](./storagehub-msp.md)
- [BSP Setup](./storagehub-bsp.md)
- [Fisherman Setup](./storagehub-fisherman.md)
- [PostgreSQL Documentation](https://www.postgresql.org/docs/14/)
- [Docker Compose Guide](../operator/DOCKER-COMPOSE.md)

671
docs/storagehub-msp.md Normal file

@ -0,0 +1,671 @@
# StorageHub Main Storage Provider (MSP) Setup
## Overview
Main Storage Providers (MSPs) are primary storage providers in the StorageHub network that manage user data, buckets, and coordinate with Backup Storage Providers (BSPs).
## Purpose
- Store and manage user files and buckets
- Charge storage fees from users
- Distribute files to BSPs for redundancy
- Manage bucket migrations
- Serve file download requests
- Submit proofs of storage
## Prerequisites
- DataHaven node binary or Docker image
- Funded account with sufficient balance for deposits
- Storage capacity (minimum 1 TB, recommended 2+ TB)
- Stable network connection
- Open network ports (30333, optionally 9944)
- Optional: PostgreSQL database for advanced features
## Hardware Requirements
MSPs have validator-level hardware requirements plus additional storage capacity for user data. Single-threaded CPU performance is important for block processing.
### Minimum Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | 8 physical cores @ 3.4 GHz (Intel Ice Lake+ or AMD Zen3+) |
| **RAM** | 32 GB DDR4 ECC |
| **Storage (System)** | 500 GB NVMe SSD (chain data) |
| **Storage (User Data)** | 1 TB NVMe SSD or HDD |
| **Network** | 500 Mbit/s symmetric |
### Recommended Specifications
| Component | Requirement |
|-----------|-------------|
| **CPU** | Intel Xeon E-2386/E-2388 or AMD Ryzen 9 5950x/5900x |
| **RAM** | 64 GB DDR4 ECC |
| **Storage (System)** | 1 TB NVMe SSD (chain data) |
| **Storage (User Data)** | 2+ TB NVMe SSD (expandable) |
| **Network** | 1 Gbit/s symmetric |
### Important Considerations
- **Disable Hyper-Threading/SMT**: Single-threaded performance is prioritized over core count
- **Separate storage volumes**: Keep chain data and user data on separate volumes for better I/O performance
- **Storage expansion**: Plan for growth; user data storage should be easily expandable
- **max-storage-capacity**: Set this CLI flag to **80% of available physical disk space** to leave headroom for filesystem overhead and temporary files
- **Bare metal preferred**: Cloud VPS may have inconsistent performance; bare metal provides better I/O predictability
## Key Requirements
### BCSV Key (ECDSA - 1 Required)
MSPs require **one BCSV key** for storage provider identity.
| Key Type | Scheme | Purpose |
|----------|--------|---------|
| `bcsv` | ecdsa | Storage provider identity and signing |
### Generate BCSV Key
#### Method 1: CLI Key Insertion
```bash
# Generate seed phrase
SEED=$(datahaven-node key generate | grep "Secret phrase" | cut -d'`' -f2)
# Insert BCSV key (ecdsa)
datahaven-node key insert \
--base-path /data/msp \
--chain stagenet-local \
--key-type bcsv \
--scheme ecdsa \
--suri "$SEED"
```
#### Method 2: Docker Entrypoint (Automated)
Set environment variables:
```bash
export NODE_TYPE=msp
export NODE_NAME=msp01
export SEED="your seed phrase here"
export CHAIN=stagenet-local
```
The entrypoint script automatically injects the BCSV key.
## Wallet Requirements
### Provider Account
- **Purpose**: MSP registration, transaction fees, and deposits
- **Required Balance**:
- Base deposit: 100 HAVE (`SpMinDeposit`)
- Deposit per GiB: 2 HAVE (`DepositPerData`)
- Transaction fees: ~10 HAVE
- **Funding**: Must be funded **before** MSP registration
- **Account Type**: Ethereum-style 20-byte address (AccountId20)
**Deposit Calculation by Capacity:**
| Storage Capacity | Deposit Required | Recommended Balance |
|------------------|------------------|---------------------|
| 800 GiB (1 TB disk) | ~1,700 HAVE | 1,800+ HAVE |
| 1.6 TiB (2 TB disk) | ~3,400 HAVE | 3,600+ HAVE |
| 4 TiB (5 TB disk) | ~8,300 HAVE | 8,500+ HAVE |
Formula: `100 + (capacity_in_gib × 2) + buffer`
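The formula can be evaluated with a short shell function. This is a sketch that hard-codes the two values listed above (`SpMinDeposit` = 100 HAVE, `DepositPerData` = 2 HAVE per GiB); the on-chain values should be checked before relying on it, and the function name `required_deposit` is hypothetical.

```bash
#!/usr/bin/env bash
# Values copied from the list above; verify them against the live runtime.
SP_MIN_DEPOSIT=100     # HAVE, base deposit (SpMinDeposit)
DEPOSIT_PER_GIB=2      # HAVE per GiB of capacity (DepositPerData)

# required_deposit <capacity_in_gib> [buffer_in_have]
# Prints the balance needed: base deposit + per-GiB deposit + optional buffer.
required_deposit() {
  local gib=$1 buffer=${2:-0}
  echo $(( SP_MIN_DEPOSIT + gib * DEPOSIT_PER_GIB + buffer ))
}
```

For the first table row, `required_deposit 800` gives the deposit alone and `required_deposit 800 100` adds a 100 HAVE buffer for fees.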
### Generate Provider Account
```bash
# Generate new account from seed
SEED="your secure seed phrase here"
echo "$SEED" | datahaven-node key inspect --output-type json | jq
# Derive MSP account
echo "$SEED//my_awesome_msp" | datahaven-node key inspect --output-type json | jq -r '.ss58PublicKey'
```
## CLI Flags
### Required Flags
```bash
datahaven-node \
--chain <CHAIN_SPEC> \
--provider \
--provider-type msp \
--max-storage-capacity <BYTES> \
--jump-capacity <BYTES> \
--msp-charging-period <BLOCKS>
```
### Core Provider Flags
| Flag | Description | Required | Default |
|------|-------------|----------|---------|
| `--provider` | Enable storage provider mode | Yes | false |
| `--provider-type msp` | Set provider type to MSP | Yes | None |
| `--max-storage-capacity <BYTES>` | Maximum storage capacity | Yes | None |
| `--jump-capacity <BYTES>` | Jump capacity for new storage | Yes | None |
| `--msp-charging-period <BLOCKS>` | Fee charging period in blocks | Yes | None |
| `--storage-layer <TYPE>` | Storage backend (`rocksdb` or `memory`) | No | `memory` |
| `--storage-path <PATH>` | Storage path (required if rocksdb) | No | None |
**Example Values:**
- `--max-storage-capacity 858993459200` (800 GiB = 80% of 1 TB disk)
- `--max-storage-capacity 1717986918400` (1.6 TiB = 80% of 2 TB disk)
- `--jump-capacity 107374182400` (100 GiB)
- `--msp-charging-period 100` (100 blocks)
**Note**: Set `--max-storage-capacity` to approximately **80% of your available physical disk space** to leave headroom for filesystem overhead and temporary files.
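The 80% rule can be computed rather than estimated by hand. The function below is a minimal sketch using integer arithmetic on a size in bytes; `max_storage_capacity` is a hypothetical helper name, not a `datahaven-node` command.

```bash
#!/usr/bin/env bash
# max_storage_capacity <available_disk_bytes>
# Prints 80% of the given size in bytes, suitable for --max-storage-capacity.
max_storage_capacity() {
  echo $(( $1 * 80 / 100 ))
}
```

On a system with GNU coreutils, the input could come from `df --output=avail -B1 <storage-path> | tail -n 1`; that invocation is an assumption about the host environment and is not required by the node.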
### MSP-Specific Task Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--msp-charge-fees-task` | Enable automatic fee charging | false |
| `--msp-charge-fees-min-debt <AMOUNT>` | Minimum debt threshold to charge | 0 |
| `--msp-move-bucket-task` | Enable bucket migration task | false |
| `--msp-move-bucket-max-try-count <N>` | Max retries for bucket moves | 5 |
| `--msp-move-bucket-max-tip <AMOUNT>` | Max tip for move bucket extrinsics | 0 |
| `--msp-distribute-files` | Enable file distribution to BSPs | false |
### Remote File Handling Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--max-file-size <BYTES>` | Maximum file size | 10737418240 (10 GiB) |
| `--connection-timeout <SECONDS>` | Connection timeout | 30 |
| `--read-timeout <SECONDS>` | Read timeout | 300 |
| `--follow-redirects <BOOL>` | Follow HTTP redirects | true |
| `--max-redirects <N>` | Maximum redirects | 10 |
| `--user-agent <STRING>` | HTTP user agent | "StorageHub-Client/1.0" |
| `--chunk-size <BYTES>` | Upload/download chunk size | 8192 (8 KiB) |
| `--chunks-buffer <N>` | Number of chunks to buffer | 512 |
### Operational Flags
| Flag | Description | Default |
|------|-------------|---------|
| `--extrinsic-retry-timeout <SECONDS>` | Extrinsic retry timeout | 60 |
| `--sync-mode-min-blocks-behind <N>` | Min blocks behind for sync mode | 5 |
| `--check-for-pending-proofs-period <N>` | Period to check pending proofs | 4 |
| `--max-blocks-behind-to-catch-up-root-changes <N>` | Max blocks to process for root changes | 10 |
## Complete Setup Example
### 1. Generate Keys and Account
```bash
# Set the seed phrase (placeholder; replace with your own)
SEED="your secure seed phrase here"
# Derive MSP account
MSP_ACCOUNT=$(echo "$SEED//msp01" | datahaven-node key inspect --output-type json | jq -r '.ss58PublicKey')
echo "MSP Account: $MSP_ACCOUNT"
# Insert BCSV key
datahaven-node key insert \
--base-path /data/msp \
--chain stagenet-local \
--key-type bcsv \
--scheme ecdsa \
--suri "$SEED//msp01"
```
### 2. Fund Provider Account
```bash
# Transfer funds to MSP account
# For 800 GiB capacity: ~1,800 HAVE (1,700 deposit + 100 buffer)
# For 1.6 TiB capacity: ~3,600 HAVE (3,400 deposit + 200 buffer)
# Using Polkadot.js or a funded account, send HAVE tokens to $MSP_ACCOUNT
# Formula: 100 + (capacity_in_gib × 2) + buffer
```
### 3. Start MSP Node
```bash
datahaven-node \
--chain stagenet-local \
--name "MSP01" \
--base-path /data/msp \
--provider \
--provider-type msp \
--max-storage-capacity 10737418240 \
--jump-capacity 1073741824 \
--msp-charging-period 100 \
--storage-layer rocksdb \
--storage-path /data/msp/storage \
--msp-charge-fees-task \
--msp-move-bucket-task \
--msp-distribute-files \
--port 30333 \
--rpc-port 9945 \
--bootnodes /dns/bootnode.example.com/tcp/30333/p2p/12D3KooW...
```
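
The capacity flags above take raw byte counts. As a small sketch (the helper name `gibToBytes` is illustrative and not part of any DataHaven tooling), the values can be derived from GiB figures like this:

```typescript
// Hypothetical helper: convert a whole number of GiB to bytes (1 GiB = 1024^3 bytes).
const BYTES_PER_GIB = BigInt(1024 * 1024 * 1024);

function gibToBytes(gib: number): bigint {
  if (!Number.isInteger(gib) || gib < 0) {
    throw new RangeError('gib must be a non-negative integer');
  }
  return BigInt(gib) * BYTES_PER_GIB;
}

// 10 GiB and 1 GiB match the --max-storage-capacity and --jump-capacity values above.
console.log('--max-storage-capacity', gibToBytes(10).toString());
console.log('--jump-capacity', gibToBytes(1).toString());
```

The same helper gives `858993459200` for 800 GiB, the capacity used in the registration examples in this guide.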
### 4. Register MSP On-Chain
See the [On-Chain Registration](#on-chain-registration) section below.
## Docker Deployment
### Docker Compose
```yaml
version: '3.8'
services:
  msp:
    image: datahavenxyz/datahaven:latest
    container_name: storagehub-msp
    environment:
      NODE_TYPE: msp
      NODE_NAME: msp01
      SEED: "your seed phrase here"
      CHAIN: stagenet-local
      KEYSTORE_PATH: /data/keystore
    ports:
      - "30333:30333"
      - "9945:9945"
    volumes:
      - msp-data:/data
      - msp-storage:/data/storage
    command:
      - "--chain=stagenet-local"
      - "--name=MSP01"
      - "--base-path=/data"
      - "--keystore-path=/data/keystore"
      - "--provider"
      - "--provider-type=msp"
      - "--max-storage-capacity=10737418240"
      - "--jump-capacity=1073741824"
      - "--msp-charging-period=100"
      - "--storage-layer=rocksdb"
      - "--storage-path=/data/storage"
      - "--msp-charge-fees-task"
      - "--msp-move-bucket-task"
      - "--msp-distribute-files"
      - "--port=30333"
      - "--rpc-port=9945"
    restart: unless-stopped
volumes:
  msp-data:
  msp-storage:
```
## Kubernetes Deployment
```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: storagehub-msp
spec:
  serviceName: storagehub-msp
  replicas: 1
  selector:
    matchLabels:
      app: storagehub-msp
  template:
    metadata:
      labels:
        app: storagehub-msp
    spec:
      containers:
        - name: msp
          image: datahavenxyz/datahaven:latest
          env:
            - name: NODE_TYPE
              value: "msp"
            - name: NODE_NAME
              value: "MSP01"
            - name: SEED
              valueFrom:
                secretKeyRef:
                  name: msp-seed
                  key: seed
          ports:
            - containerPort: 30333
              name: p2p
            - containerPort: 9945
              name: rpc
          volumeMounts:
            - name: data
              mountPath: /data
            - name: storage
              mountPath: /data/storage
          resources:
            requests:
              memory: "4Gi"
              cpu: "2"
            limits:
              memory: "8Gi"
              cpu: "4"
          args:
            - "--chain=stagenet-local"
            - "--provider"
            - "--provider-type=msp"
            - "--max-storage-capacity=10737418240"
            - "--jump-capacity=1073741824"
            - "--msp-charging-period=100"
            - "--storage-layer=rocksdb"
            - "--storage-path=/data/storage"
            - "--msp-charge-fees-task"
            - "--msp-move-bucket-task"
            - "--msp-distribute-files"
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 100Gi
    - metadata:
        name: storage
      spec:
        accessModes: [ "ReadWriteOnce" ]
        resources:
          requests:
            storage: 500Gi
```
## On-Chain Registration
### MSP Registration Process
MSPs must be registered on-chain via the `Providers` pallet using a **2-step process**:
1. **Step 1**: Call `request_msp_sign_up` - Initiates registration and reserves deposit
2. **Step 2**: Call `confirm_sign_up` - Completes registration after randomness verification
The two-step mechanism derives each Provider ID from on-chain randomness, which prevents applicants from choosing or manipulating their own ID.
### Step 1: Request MSP Sign Up
```typescript
import { createClient, Binary } from 'polkadot-api';
import { getWsProvider } from 'polkadot-api/ws-provider/web';
import { withPolkadotSdkCompat } from 'polkadot-api/polkadot-sdk-compat';
import { datahaven } from '@polkadot-api/descriptors';
// Connect to a DataHaven node (adjust the port to the node's --rpc-port, e.g. 9945 for the MSP node above)
const client = createClient(
  withPolkadotSdkCompat(getWsProvider('ws://localhost:9944'))
);
const typedApi = client.getTypedApi(datahaven);
// MSP signer (using your BCSV key account)
const mspSigner = /* your polkadot-api signer */;
// MSP configuration
const capacity = BigInt(858_993_459_200); // 800 GiB (roughly 80% of a 1 TiB disk)
const multiaddresses = [
  '/ip4/127.0.0.1/tcp/30333',
  '/dns/msp01.example.com/tcp/30333'
].map(addr => Binary.fromText(addr));
// Pricing: $0.20 / 50 GiB / month at HAVE = $0.01
// See "Calculating Storage Pricing" section for formula
const pricePerGibPerBlock = BigInt(926_000_000_000);
// Step 1: Request MSP sign up
const requestTx = typedApi.tx.Providers.request_msp_sign_up({
  capacity: capacity,
  multiaddresses: multiaddresses,
  value_prop_price_per_giga_unit_of_data_per_block: pricePerGibPerBlock,
  commitment: Binary.fromText('msp01'),
  value_prop_max_data_limit: BigInt(53_687_091_200), // 50 GiB
  payment_account: mspSigner.publicKey // Account receiving payments
});
// Sign and submit the request; signAndSubmit resolves once the transaction is finalized
console.log('MSP sign-up requested. Waiting for finalization...');
const requestResult = await requestTx.signAndSubmit(mspSigner);
if (!requestResult.ok) {
  console.error('Dispatch error:', requestResult.dispatchError);
  throw new Error('request_msp_sign_up failed');
}
console.log('Request finalized! Deposit has been reserved.');
```
**What Happens in Step 1:**
- Validates multiaddresses format
- Calculates required deposit based on capacity (`SpMinDeposit + capacity * DepositPerData`)
- Verifies account has sufficient balance
- **Holds (reserves) the deposit** from your account
- Creates a pending sign-up request
- Emits `MspRequestSignUpSuccess` event
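
Because a malformed multiaddress causes the request to be rejected, it can help to sanity-check the strings before submitting. The following is a rough client-side sketch only: the helper `looksLikeMultiaddr` and its pattern are assumptions made for illustration and are not the validation logic the `Providers` pallet applies.

```typescript
// Hypothetical pre-check for multiaddress strings such as /ip4/1.2.3.4/tcp/30333.
// It only covers the ip4/ip6/dns + tcp shapes used in this guide.
const MULTIADDR_PATTERN =
  /^\/(ip4|ip6|dns|dns4|dns6)\/[^/]+\/tcp\/(\d{1,5})(\/p2p\/[A-Za-z0-9]+)?$/;

function looksLikeMultiaddr(addr: string): boolean {
  const match = MULTIADDR_PATTERN.exec(addr);
  if (!match) return false;
  // Reject TCP ports outside the valid 1-65535 range.
  const port = Number(match[2]);
  return port >= 1 && port <= 65535;
}

const candidates = [
  '/ip4/127.0.0.1/tcp/30333',
  '/dns/msp01.example.com/tcp/30333',
  'msp01.example.com:30333', // host:port form, not a multiaddress
];
for (const addr of candidates) {
  console.log(addr, looksLikeMultiaddr(addr) ? 'ok' : 'rejected');
}
```

A check like this catches typos early; the chain remains the source of truth for what it accepts.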
### Step 2: Confirm Sign Up
After the request is finalized, you must wait for sufficient randomness to be available (controlled by the `MaxBlocksForRandomness` parameter, typically 2 hours on mainnet).
```typescript
// Step 2: Confirm the sign-up (after waiting for randomness)
const confirmTx = typedApi.tx.Providers.confirm_sign_up({
  provider_account: undefined // Optional: omit to use signer's account
});
// Sign and submit confirmation; signAndSubmit resolves once the transaction is finalized
console.log('Confirming MSP registration...');
const confirmResult = await confirmTx.signAndSubmit(mspSigner);
if (!confirmResult.ok) {
  console.error('Dispatch error:', confirmResult.dispatchError);
  throw new Error('confirm_sign_up failed');
}
console.log('MSP registration confirmed and active!');
```
**What Happens in Step 2:**
- Verifies randomness is sufficiently fresh
- Checks request hasn't expired
- Generates Provider ID using randomness
- Registers MSP in the system
- Emits `MspSignUpSuccess` event
- Deposit remains held for duration of MSP operation
### Timing Requirements
| Parameter | Testnet | Mainnet | Description |
|-----------|---------|---------|-------------|
| Min wait time | ~2 minutes | ~2 hours | Wait after `request_msp_sign_up` for randomness |
| Max wait time | Set by `MaxBlocksForRandomness` | Typically 2 hours | Request expires if not confirmed in time |
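
To plan when to submit `confirm_sign_up`, the wait times above can be translated into block counts. This is a minimal sketch that assumes the 6-second block time used elsewhere in this guide; the helper names are illustrative, and the actual window is determined by the chain's randomness, so treat the results as estimates.

```typescript
// Assumption: 6-second blocks, as used in the pricing section of this guide.
const BLOCK_TIME_SECONDS = 6;

// Number of blocks needed to cover a wait of `seconds`, rounded up.
function secondsToBlocks(seconds: number, blockTimeSeconds: number = BLOCK_TIME_SECONDS): number {
  return Math.ceil(seconds / blockTimeSeconds);
}

// Estimated first block at which confirm_sign_up could be attempted.
function earliestConfirmBlock(requestBlock: number, waitSeconds: number): number {
  return requestBlock + secondsToBlocks(waitSeconds);
}

console.log('Testnet wait (~2 minutes):', secondsToBlocks(2 * 60), 'blocks');
console.log('Mainnet wait (~2 hours):', secondsToBlocks(2 * 60 * 60), 'blocks');
console.log('Request at block 1000, testnet:', earliestConfirmBlock(1000, 2 * 60));
```

If the confirmation is rejected because randomness is not yet available, retrying a few blocks later is usually simpler than computing an exact block number.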
### Verify Registration
```typescript
// Check MSP registration status
const mspAccount = mspSigner.publicKey;
const registeredMspId = await typedApi.query.Providers.AccountIdToMainStorageProviderId.getValue(
  mspAccount
);
if (registeredMspId) {
  console.log('Registered MSP ID:', registeredMspId);
  // Get full MSP details
  const mspInfo = await typedApi.query.Providers.MainStorageProviders.getValue(
    registeredMspId
  );
  console.log('MSP Info:', mspInfo);
} else {
  console.log('MSP not yet registered or confirmation pending');
}
```
### Cancel Pending Request
If you change your mind before confirming:
```typescript
const cancelTx = typedApi.tx.Providers.cancel_sign_up();
await cancelTx.signAndSubmit(mspSigner);
console.log('Sign-up request cancelled, deposit returned');
```
### Development/Testing: Force Sign Up (Requires Sudo)
For development and testing environments with sudo access, you can bypass the 2-step process:
```typescript
// Single-step registration for testing (requires sudo)
const sudoSigner = /* sudo account signer */;
const mspCall = typedApi.tx.Providers.force_msp_sign_up({
  who: mspAccount,
  msp_id: /* pre-generated provider ID */,
  capacity: BigInt(858_993_459_200), // 800 GiB
  value_prop_price_per_giga_unit_of_data_per_block: BigInt(926_000_000_000),
  multiaddresses: multiaddresses,
  commitment: Binary.fromText('msp01'),
  value_prop_max_data_limit: BigInt(53_687_091_200), // 50 GiB
  payment_account: mspAccount
});
const sudoTx = typedApi.tx.Sudo.sudo({ call: mspCall.decodedCall });
await sudoTx.signAndSubmit(sudoSigner);
```
### Registration Parameters
| Parameter | Type | Description | Example |
|-----------|------|-------------|---------|
| `capacity` | StorageDataUnit | Storage capacity in bytes | `858993459200` (800 GiB) |
| `multiaddresses` | `Vec<Bytes>` | P2P network addresses | `[Binary.fromText("/ip4/...")]` |
| `value_prop_price_per_giga_unit_of_data_per_block` | Balance | Price per GiB per block (18 decimals) | `926_000_000_000` |
| `commitment` | Bytes | Service commitment identifier | `Binary.fromText("msp01")` |
| `value_prop_max_data_limit` | StorageDataUnit | Max data per value prop | `53687091200` (50 GiB) |
| `payment_account` | AccountId | Account receiving payments | `0x...` (20-byte) |
### Calculating Storage Pricing
The `value_prop_price_per_giga_unit_of_data_per_block` parameter sets your price per GiB of data stored per block. The value is expressed in the smallest HAVE unit (HAVE has 18 decimals).
**Formula:**
```
price_per_gib_per_block = (target_monthly_price / storage_gib / blocks_per_month) / have_price × 10^18
```
**Example Calculation:**
Given:
- HAVE token price: **$0.01**
- Target price: **$0.20 per 50 GiB per month**
- Block time: 6 seconds → **432,000 blocks per month** (30 days)
Step-by-step:
1. Price per GiB per month: `$0.20 / 50 GiB = $0.004 per GiB/month`
2. Price per GiB per block: `$0.004 / 432,000 = $9.26 × 10⁻⁹ per GiB/block`
3. Convert to HAVE: `$9.26 × 10⁻⁹ / $0.01 = 9.26 × 10⁻⁷ HAVE`
4. Apply 18 decimals: `9.26 × 10⁻⁷ × 10¹⁸ = 926,000,000,000`
**Result:** `value_prop_price_per_giga_unit_of_data_per_block: BigInt(926_000_000_000)`
| Target Price | HAVE @ $0.01 | Value (18 decimals) |
|-------------|--------------|---------------------|
| $0.10 / 50 GiB / month | 0.463 µHAVE/GiB/block | `463_000_000_000` |
| $0.20 / 50 GiB / month | 0.926 µHAVE/GiB/block | `926_000_000_000` |
| $0.50 / 50 GiB / month | 2.315 µHAVE/GiB/block | `2_315_000_000_000` |
| $1.00 / 50 GiB / month | 4.630 µHAVE/GiB/block | `4_630_000_000_000` |
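
The formula above can be turned into a small calculator. This is a sketch under the same assumptions as the worked example (6-second blocks, a 30-day month, 18 decimals); the function name `pricePerGibPerBlock` is illustrative. It uses floating-point arithmetic, which is adequate for estimating a price but is not exact to the last unit.

```typescript
// Hypothetical calculator for value_prop_price_per_giga_unit_of_data_per_block.
function pricePerGibPerBlock(
  targetUsdPerMonth: number, // e.g. 0.20
  storageGib: number,        // e.g. 50
  haveUsdPrice: number,      // e.g. 0.01
  blockTimeSeconds: number = 6
): bigint {
  const blocksPerMonth = (30 * 24 * 60 * 60) / blockTimeSeconds; // 432,000 at 6 s
  const havePerGibPerBlock =
    targetUsdPerMonth / storageGib / blocksPerMonth / haveUsdPrice;
  // Scale to 18 decimals and round to the nearest whole unit.
  return BigInt(Math.round(havePerGibPerBlock * 1e18));
}

const price = pricePerGibPerBlock(0.20, 50, 0.01);
console.log('Price per GiB per block:', price.toString());
```

For the $0.20 example this yields a value just under the `926_000_000_000` used in this guide, which is the same figure rounded to three significant digits.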
### Deposit Requirements
- **Base Deposit**: 100 HAVE (`SpMinDeposit`)
- **Per GiB**: 2 HAVE (`DepositPerData`)
- **Formula**: `100 + (capacity_in_gib × 2)`
**Examples:**
- 800 GiB capacity: `100 + (800 × 2) = 1,700 HAVE`
- 1.6 TiB capacity (≈ 1,638 GiB): `100 + (1,638 × 2) = 3,376 HAVE`
The deposit is **held (reserved)** from your account when you call `request_msp_sign_up` and remains held while you operate as an MSP. The deposit is returned when you deregister as an MSP.
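
The deposit formula can be sketched in code to estimate how much to fund the account with before registering. The constants below mirror the figures in this section (100 HAVE base, 2 HAVE per GiB, 18 decimals); the helper name `estimateMspDeposit` is illustrative, and the runtime may round per data unit differently, so treat the result as an estimate rather than the exact on-chain amount.

```typescript
// Assumed constants, taken from the figures in this guide.
const BASE_UNITS_PER_HAVE = BigInt('1000000000000000000'); // 10^18
const BYTES_PER_GIB = BigInt(1024 * 1024 * 1024);
const BASE_DEPOSIT_HAVE = BigInt(100);    // SpMinDeposit
const DEPOSIT_PER_GIB_HAVE = BigInt(2);   // DepositPerData, per GiB

// Estimated deposit in the smallest HAVE unit for a capacity given in bytes.
function estimateMspDeposit(capacityBytes: bigint): bigint {
  const capacityDeposit =
    (capacityBytes * DEPOSIT_PER_GIB_HAVE * BASE_UNITS_PER_HAVE) / BYTES_PER_GIB;
  return BASE_DEPOSIT_HAVE * BASE_UNITS_PER_HAVE + capacityDeposit;
}

const capacity800Gib = BigInt(800) * BYTES_PER_GIB;
const deposit = estimateMspDeposit(capacity800Gib);
console.log('Deposit for 800 GiB (whole HAVE):', (deposit / BASE_UNITS_PER_HAVE).toString());
```

Add a buffer for transaction fees on top of the computed deposit when funding the account.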
## Monitoring
### Health Checks
```bash
# Check node health
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "system_health"}' \
http://localhost:9945 | jq
# Check provider status
curl -s -H "Content-Type: application/json" \
-d '{"id":1, "jsonrpc":"2.0", "method": "storageprovider_getStatus"}' \
http://localhost:9945 | jq
```
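
The `system_health` response contains `peers`, `isSyncing`, and `shouldHavePeers`. As a sketch for scripted monitoring, the check below interprets that response; the readiness rule in `isNodeReady` and the function names are assumptions chosen for illustration, and `fetchSystemHealth` relies on a runtime with a global `fetch` (for example Node.js 18 or later).

```typescript
// Shape of the standard Substrate system_health result.
interface SystemHealth {
  peers: number;
  isSyncing: boolean;
  shouldHavePeers: boolean;
}

// Assumed readiness rule: the node is not syncing and has peers when it should.
function isNodeReady(health: SystemHealth): boolean {
  const hasPeers = health.peers > 0 || !health.shouldHavePeers;
  return hasPeers && !health.isSyncing;
}

// Query system_health over HTTP JSON-RPC (same request as the curl example above).
async function fetchSystemHealth(rpcUrl: string): Promise<SystemHealth> {
  const response = await fetch(rpcUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ id: 1, jsonrpc: '2.0', method: 'system_health', params: [] }),
  });
  const payload = (await response.json()) as { result: SystemHealth };
  return payload.result;
}

// Example: interpret a sample response without contacting a node.
console.log(isNodeReady({ peers: 3, isSyncing: false, shouldHavePeers: true }));
```

Calling `fetchSystemHealth('http://localhost:9945').then(isNodeReady)` would run the same check against the MSP node from this guide.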
### Key Metrics to Monitor
- Storage capacity usage
- Number of stored files
- Fee collection status
- Proof submission success rate
- Bucket migration status
- BSP distribution success rate
### Logs
```bash
# View MSP logs
docker logs -f storagehub-msp
# Filter for storage events
docker logs storagehub-msp 2>&1 | grep -i "storage\|bucket\|file"
```
## Troubleshooting
### Issue: Registration Failed
**Check:**
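1. Account balance covers the full deposit (`100 + capacity_in_gib × 2` HAVE, e.g. 1,700 HAVE for 800 GiB) plus transaction fees
2. BCSV key is correctly inserted
3. Capacity meets minimum (2 data units)
4. Provider ID is correctly calculated (only applies to `force_msp_sign_up`; the 2-step process generates it from randomness)
5. `confirm_sign_up` was submitted before the pending request expired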
### Issue: Not Accepting Files
**Check:**
1. MSP is registered on-chain
2. Storage capacity not exceeded
3. Node is fully synced
4. RPC endpoint is accessible
### Issue: Fee Charging Not Working
**Check:**
1. `--msp-charge-fees-task` flag is enabled
2. `--msp-charging-period` matches on-chain value
3. Users have accrued enough outstanding debt to be charged
## Security Considerations
1. **Key Management**: Store seed phrase securely offline
2. **Storage Security**: Encrypt storage at rest
3. **Network Security**: Use firewall to restrict access
4. **Access Control**: Limit RPC access to trusted sources
5. **Backup Strategy**: Regular backups of stored data
## Best Practices
1. Use production-grade storage (NVMe SSD recommended)
2. Monitor storage capacity proactively
3. Enable all MSP tasks for full functionality
4. Set a reasonable `--msp-charging-period` (100-1000 blocks)
5. Keep node software updated
6. Implement monitoring and alerting
7. Document operational procedures
## Related Documentation
- [BSP Setup](./storagehub-bsp.md)
- [Indexer Setup](./storagehub-indexer.md)
- [Fisherman Setup](./storagehub-fisherman.md)
- [StorageHub Pallets](https://github.com/Moonsong-Labs/storage-hub)