datahaven/.github/workflows/actions/setup-env/action.yml
Steve Degosserie e04023ef11
ci: Add sccache warm-up job for better cache hit rates (#375)
## Summary

Improve CI performance through better caching and simplified workflows:

- Add a new `warm-sccache` job that runs before all Rust CI jobs to
pre-populate the sccache cache
- Cache locally installed tools (`~/.local/bin`, `~/.local/lib`,
`~/.local/include`) with a version-based hash key
- Simplify Rust tests by removing matrix partitioning and switching from
cargo-nextest to cargo test

## Problem

**Poor sccache hit rates:**
- `rust-lint`, `unit-tests`, and `build-operator` ran in parallel with
cold caches
- Each job compiled dependencies independently
- Cache was only saved at job completion (too late for parallel jobs to
benefit)

**Redundant tool downloads:**
- Mold, LLVM/Clang, protoc, and libpq (~500MB+) were downloaded fresh in
each job
- No caching of locally installed tools

**Overcomplicated test setup:**
- 2-partition matrix for tests added complexity without significant
benefit
- cargo-nextest required an installation step (~30s overhead)
- Separate result-checker job wasn't necessary

## Solution

### 1. sccache warm-up job (`task-warm-sccache.yml`)
- Runs first (Tier 0) before all Rust jobs
- Compiles in release mode with all features (`fast-runtime`,
`try-runtime`, `runtime-benchmarks`)
- Compiles in debug mode to cover test builds
- Uses `SKIP_WASM_BUILD=1` to minimize warm-up time
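
As a rough illustration of the warm-up sequence described above, the sketch below lists the kind of commands such a job might run. The exact cargo invocations are assumptions (they are not taken from `task-warm-sccache.yml`), and `run`, `warm_up`, and `DRY_RUN` are hypothetical names introduced only for this example. By default it only prints the commands.

```shell
set -eu

# Route rustc invocations through sccache (RUSTC_WRAPPER is a standard Cargo variable).
export RUSTC_WRAPPER=sccache
# Skip the Wasm runtime build, as the description above states.
export SKIP_WASM_BUILD=1

# Hypothetical switch: DRY_RUN defaults to 1 so the sketch only prints commands.
# Set DRY_RUN=0 to actually execute them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

warm_up() {
  # Release profile with the feature set named in the description (assumed invocation).
  run cargo build --release --locked --features fast-runtime,try-runtime,runtime-benchmarks
  # Debug profile: compile the tests without running them (assumed way to cover test builds).
  run cargo test --locked --no-run
  # Print the sccache hit/miss counters.
  run sccache --show-stats
}

warm_up
```

Because downstream jobs only start after this job finishes, the cache saved here should be available to them on restore, which is the point of running it as Tier 0.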

### 2. Local tools caching (`setup-env/action.yml`)
- Define tool versions as env vars (`MOLD_VERSION`, `LLVM_VERSION`,
`PROTOC_VERSION`, `LIBPQ_VERSION`)
- Generate SHA256 hash from versions for cache key
- Cache `~/.local/bin`, `~/.local/lib`, `~/.local/include` (not all of
`~/.local`, to avoid caching container storage under `~/.local/share`)
- Set up PATH and env vars immediately after cache restore
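
The version-based key can be sketched as follows. The version values and the `name-version|...` string layout mirror the action's first step; `tools_hash` is a hypothetical helper name used only here. Deriving the string from the variables, rather than repeating the literals, is one way to keep the key in step with the versions.

```shell
set -eu

# Tool versions, as defined in the action's first step.
MOLD_VERSION="2.40.4"
LLVM_VERSION="18.1.8"
PROTOC_VERSION="28.3"
LIBPQ_VERSION="18.1-1"

# Hypothetical helper: join the versions into one string and keep the
# first 16 hex characters of its SHA-256 digest.
tools_hash() {
  printf 'mold-%s|llvm-%s|protoc-%s|libpq-%s\n' "$1" "$2" "$3" "$4" \
    | sha256sum | cut -c1-16
}

TOOLS_HASH="$(tools_hash "$MOLD_VERSION" "$LLVM_VERSION" "$PROTOC_VERSION" "$LIBPQ_VERSION")"
echo "Tools cache key hash: $TOOLS_HASH"
```

Bumping any one version changes the digest, so `actions/cache` misses on the old key and the tools are reinstalled and saved under the new one.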

### 3. Simplified Rust tests (`task-rust-tests.yml`)
- Remove 2-partition matrix strategy
- Replace cargo-nextest with `cargo test --locked`
- Remove separate tests-result-checker job

## CI Flow

```
                          ┌─ build-operator (warm sccache + cached tools)
                          │
 CI Start → warm-sccache ─┼─ rust-lint (warm sccache + cached tools)
                          │
                          └─ unit-tests (warm sccache + cached tools)
```

## Test plan

- [x] CI workflow runs successfully
- [x] Warm-sccache job completes and shows cache stats
- [x] Local tools cache restores correctly (no permission errors)
- [x] Downstream Rust jobs show improved cache hit rates
- [x] Rust tests pass with simplified single-job setup

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
Co-authored-by: Ahmad Kaouk <56095276+ahmadkaouk@users.noreply.github.com>
Co-authored-by: undercover-cactus <lola@moonsonglabs.com>
2026-01-08 13:44:45 +01:00


name: "Setup Rust Environment"
description: "Creates a Rust environment with the specified toolchain, cache, and dependencies"
inputs:
  cache-key:
    description: "Cache key used to retrieve built data. Usually matches the profile of the build"
    required: false
    default: "cache"
  install-deps:
    description: "Whether to install system dependencies. Set to false for self-hosted runners with pre-installed deps"
    required: false
    default: "true"
runs:
  using: "composite"
  steps:
    - name: Set tool versions and cache key
      shell: bash
      run: |
        # Define tool versions in one place
        MOLD_VERSION=2.40.4
        LLVM_VERSION=18.1.8
        PROTOC_VERSION=28.3
        LIBPQ_VERSION=18.1-1
        echo "MOLD_VERSION=$MOLD_VERSION" >> $GITHUB_ENV
        echo "LLVM_VERSION=$LLVM_VERSION" >> $GITHUB_ENV
        echo "PROTOC_VERSION=$PROTOC_VERSION" >> $GITHUB_ENV
        echo "LIBPQ_VERSION=$LIBPQ_VERSION" >> $GITHUB_ENV
        # Create a hash from the version variables so the cache key cannot drift from them
        TOOLS_HASH=$(echo "mold-${MOLD_VERSION}|llvm-${LLVM_VERSION}|protoc-${PROTOC_VERSION}|libpq-${LIBPQ_VERSION}" | sha256sum | cut -c1-16)
        echo "TOOLS_HASH=$TOOLS_HASH" >> $GITHUB_ENV
        echo "Tools cache key hash: $TOOLS_HASH"
    - name: Set Rust version
      shell: bash
      run: |
        echo "BUILD_RUST_VERSION=$(rustc --version)" >> $GITHUB_ENV
    - name: Run sccache-cache
      uses: mozilla-actions/sccache-action@v0.0.9
    - name: Show sccache stats
      shell: bash
      run: sccache --show-stats
    # Cache specific ~/.local subdirs for locally installed tools (mold, llvm, protoc, libpq)
    # Note: We don't cache all of ~/.local to avoid container storage in ~/.local/share
    - name: Cache local tools
      uses: actions/cache@v4
      with:
        path: |
          ~/.local/bin
          ~/.local/lib
          ~/.local/include
        key: ${{ runner.os }}-local-tools-${{ env.TOOLS_HASH }}
    # Set up PATH and env vars for cached tools (must run after cache restore)
    - name: Setup paths for cached tools
      shell: bash
      run: |
        # Add ~/.local/bin to PATH for cached tools
        echo "$HOME/.local/bin" >> $GITHUB_PATH
        export PATH="$HOME/.local/bin:$PATH"
        # Setup library paths if ~/.local/lib exists (from cached libpq/llvm)
        if [ -d "$HOME/.local/lib" ]; then
          echo "LD_LIBRARY_PATH=$HOME/.local/lib:${LD_LIBRARY_PATH:-}" >> $GITHUB_ENV
          echo "LIBRARY_PATH=$HOME/.local/lib:${LIBRARY_PATH:-}" >> $GITHUB_ENV
          echo "PKG_CONFIG_PATH=$HOME/.local/lib/pkgconfig:${PKG_CONFIG_PATH:-}" >> $GITHUB_ENV
          echo "LDFLAGS=-L$HOME/.local/lib ${LDFLAGS:-}" >> $GITHUB_ENV
          echo "RUSTFLAGS=-L $HOME/.local/lib" >> $GITHUB_ENV
        fi
        # Setup include paths if ~/.local/include exists
        if [ -d "$HOME/.local/include" ]; then
          echo "C_INCLUDE_PATH=$HOME/.local/include:${C_INCLUDE_PATH:-}" >> $GITHUB_ENV
          echo "CPLUS_INCLUDE_PATH=$HOME/.local/include:${CPLUS_INCLUDE_PATH:-}" >> $GITHUB_ENV
          echo "CPPFLAGS=-I$HOME/.local/include ${CPPFLAGS:-}" >> $GITHUB_ENV
        fi
        # Setup LLVM paths if libclang exists in cache
        if [ -d "$HOME/.local/lib" ] && ls $HOME/.local/lib/libclang* &>/dev/null; then
          echo "LIBCLANG_PATH=$HOME/.local/lib" >> $GITHUB_ENV
          echo "LLVM_CONFIG_PATH=$HOME/.local/bin/llvm-config" >> $GITHUB_ENV
        fi
        # Export mold path if it exists
        if command -v mold &>/dev/null; then
          echo "MOLD_PATH=$(which mold)" >> $GITHUB_ENV
        fi
    - uses: actions/cache@v4
      with:
        path: |
          ~/.cargo/bin/
          ~/.cargo/registry/index/
          ~/.cargo/registry/cache/
          ~/.cargo/git/db/
        key: ${{ runner.os }}-cargo-${{ hashFiles('**/Cargo.lock') }}
    - name: Setup Rust toolchain
      uses: dtolnay/rust-toolchain@stable
      with:
        components: rustfmt, clippy
    # Install mold - always install locally to avoid sudo requirements
    - name: Install mold
      shell: bash
      run: |
        # Check if mold is already available and the right version
        if ! command -v mold &> /dev/null || [[ $(mold --version 2>/dev/null | grep -oP '\d+\.\d+\.\d+' | head -1) != "$MOLD_VERSION" ]]; then
          echo "Installing mold $MOLD_VERSION in ~/.local"
          mkdir -p ~/.local/bin
          wget -q -O- "https://github.com/rui314/mold/releases/download/v${MOLD_VERSION}/mold-${MOLD_VERSION}-$(uname -m)-linux.tar.gz" | \
            tar -xzf - -C ~/.local --strip-components=1
          echo "$HOME/.local/bin" >> $GITHUB_PATH
          export PATH="$HOME/.local/bin:$PATH"
        else
          echo "mold is already installed: $(mold --version)"
        fi
        # Export the mold path for use in RUSTFLAGS
        MOLD_PATH=$(which mold)
        echo "MOLD_PATH=${MOLD_PATH}" >> $GITHUB_ENV
        echo "Mold installed at: ${MOLD_PATH}"
    # Install system dependencies only when install-deps is true
    - name: Install system dependencies
      if: inputs.install-deps == 'true'
      shell: bash
      run: sudo apt-get update && sudo apt-get install -y libpq-dev libclang-dev
    # Auto-install missing dependencies when install-deps is false (for self-hosted runners)
    # Note: PATH and env vars are already set up in "Setup paths for cached tools" step
    - name: Setup system dependencies (self-hosted)
      if: inputs.install-deps == 'false'
      shell: bash
      run: |
        echo "Checking and installing system dependencies locally if needed..."
        # Setup local directories
        LOCAL_DIR="$HOME/.local"
        LOCAL_BIN="$LOCAL_DIR/bin"
        LOCAL_LIB="$LOCAL_DIR/lib"
        LOCAL_INCLUDE="$LOCAL_DIR/include"
        LOCAL_PKG_CONFIG="$LOCAL_DIR/lib/pkgconfig"
        mkdir -p "$LOCAL_BIN" "$LOCAL_LIB" "$LOCAL_INCLUDE" "$LOCAL_PKG_CONFIG"
        # Check and install PostgreSQL client libraries if needed
        if ! pkg-config --exists libpq 2>/dev/null; then
          echo "Installing libpq $LIBPQ_VERSION..."
          ARCH=$(dpkg --print-architecture 2>/dev/null || echo "amd64")
          WORK_DIR="$LOCAL_DIR/tmp"
          mkdir -p "$WORK_DIR" && cd "$WORK_DIR"
          # Download PostgreSQL 18 packages
          wget -q "https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-18/libpq5_${LIBPQ_VERSION}.pgdg+2_${ARCH}.deb"
          wget -q "https://apt.postgresql.org/pub/repos/apt/pool/main/p/postgresql-18/libpq-dev_${LIBPQ_VERSION}.pgdg+2_${ARCH}.deb"
          # Extract and install
          for pkg in *.deb; do
            ar p "$pkg" data.tar.xz | tar -xJ --no-same-owner 2>/dev/null || true
          done
          cp -r usr/lib/*/libpq* usr/lib/libpq* "$LOCAL_LIB/" 2>/dev/null || true
          cp -r usr/include/* "$LOCAL_INCLUDE/" 2>/dev/null || true
          find usr -name "*.pc" -exec cp {} "$LOCAL_PKG_CONFIG/" \; 2>/dev/null || true
          # Fix pkg-config paths
          sed -i "s|/usr|$LOCAL_DIR|g" "$LOCAL_PKG_CONFIG"/*.pc 2>/dev/null || true
          cd "$HOME" && rm -rf "$WORK_DIR"
          echo "✓ libpq installed"
        else
          echo "✓ libpq found: $(pkg-config --modversion libpq 2>/dev/null || echo 'version unknown')"
        fi
        # Check and install LLVM/Clang if needed.
        # In bash, || and && have equal precedence and associate left to right, so the
        # braces below only make the existing grouping explicit: install when
        # (clang is missing OR no local libclang) AND no system-wide libclang is registered.
        if { ! command -v clang &> /dev/null || ! ls "$LOCAL_LIB"/libclang* &> /dev/null; } && ! ldconfig -p 2>/dev/null | grep -q libclang; then
          echo "LLVM/Clang not found, installing locally..."
          ARCH=$(uname -m)
          # Map architecture names
          case "$ARCH" in
            x86_64) LLVM_ARCH="x86_64-linux-gnu-ubuntu-22.04" ;;
            aarch64) LLVM_ARCH="aarch64-linux-gnu" ;;
            *) LLVM_ARCH="$ARCH-linux-gnu" ;;
          esac
          # Track which version actually gets installed (the fallback may replace it)
          LLVM_INSTALLED="$LLVM_VERSION"
          # Download pre-built LLVM/Clang binaries
          echo "Downloading LLVM ${LLVM_VERSION} for ${LLVM_ARCH}..."
          wget -q -O- "https://github.com/llvm/llvm-project/releases/download/llvmorg-${LLVM_VERSION}/clang+llvm-${LLVM_VERSION}-${LLVM_ARCH}.tar.xz" | \
            tar -xJf - -C /tmp --strip-components=1 || {
              echo "Failed to download pre-built binaries, trying alternative version..."
              # Try an older, more widely available version
              LLVM_FALLBACK="17.0.6"
              LLVM_INSTALLED="$LLVM_FALLBACK"
              wget -q -O- "https://github.com/llvm/llvm-project/releases/download/llvmorg-${LLVM_FALLBACK}/clang+llvm-${LLVM_FALLBACK}-${LLVM_ARCH}.tar.xz" | \
                tar -xJf - -C /tmp --strip-components=1
            }
          # Copy required files
          if [ -d "/tmp/bin" ]; then
            cp /tmp/bin/clang* "$LOCAL_BIN/" 2>/dev/null || true
            cp /tmp/bin/llvm-config "$LOCAL_BIN/" 2>/dev/null || true
            cp -r /tmp/lib/libclang* "$LOCAL_LIB/" 2>/dev/null || true
            cp -r /tmp/lib/clang "$LOCAL_LIB/" 2>/dev/null || true
            cp -r /tmp/include/clang* "$LOCAL_INCLUDE/" 2>/dev/null || true
            rm -rf /tmp/bin /tmp/lib /tmp/include /tmp/share
            # Set LIBCLANG_PATH for rust bindgen (in case not set by cache)
            echo "LIBCLANG_PATH=$LOCAL_LIB" >> $GITHUB_ENV
            echo "LLVM_CONFIG_PATH=$LOCAL_BIN/llvm-config" >> $GITHUB_ENV
          fi
          echo "✓ LLVM/Clang ${LLVM_INSTALLED} installed locally"
        else
          echo "✓ clang found: $(clang --version 2>/dev/null | head -n1 || echo 'installed')"
        fi
        echo "All required system dependencies ready!"
    # Install protoc only when install-deps is true
    - name: Install Protoc
      if: inputs.install-deps == 'true'
      uses: arduino/setup-protoc@v3
      with:
        repo-token: ${{ github.token }}
    # Auto-install protoc when install-deps is false (for self-hosted runners)
    - name: Setup Protoc (self-hosted)
      if: inputs.install-deps == 'false'
      shell: bash
      run: |
        echo "Checking and installing protoc locally if needed..."
        if ! command -v protoc &> /dev/null; then
          echo "protoc not found, installing locally..."
          ARCH=$(uname -m)
          # Map architecture names for protoc
          case "$ARCH" in
            x86_64) PROTOC_ARCH="x86_64" ;;
            aarch64) PROTOC_ARCH="aarch_64" ;;
            *) PROTOC_ARCH="$ARCH" ;;
          esac
          # Download and install protoc
          mkdir -p ~/.local/bin
          wget -q -O /tmp/protoc.zip "https://github.com/protocolbuffers/protobuf/releases/download/v${PROTOC_VERSION}/protoc-${PROTOC_VERSION}-linux-${PROTOC_ARCH}.zip"
          unzip -q -o /tmp/protoc.zip -d ~/.local
          rm /tmp/protoc.zip
          # Make sure it's executable
          chmod +x ~/.local/bin/protoc
          # Add to PATH if not already there
          echo "$HOME/.local/bin" >> $GITHUB_PATH
          export PATH="$HOME/.local/bin:$PATH"
          echo "✓ protoc ${PROTOC_VERSION} installed locally"
        else
          # Use a separate variable so the pinned PROTOC_VERSION is not shadowed
          FOUND_PROTOC_VERSION=$(protoc --version | grep -oP '\d+\.\d+(\.\d+)?')
          echo "✓ protoc found: version $FOUND_PROTOC_VERSION"
          # Check minimum version (3.0 or higher recommended)
          MAJOR_VERSION=$(echo "$FOUND_PROTOC_VERSION" | cut -d. -f1)
          if [ "$MAJOR_VERSION" -lt 3 ]; then
            echo "WARNING: protoc version $FOUND_PROTOC_VERSION is older than recommended (3.0+)"
          fi
        fi
        echo "Protoc ready!"