DataDesigner/.github/workflows/build-docs.yml
Andre Manoel 2564834a47
fix: cache notebook builds to avoid flaky upstream model failures (#370)
* fix: cache notebook builds to avoid failures from flaky upstream models

The build-notebooks CI executes all tutorial notebooks on every run.
When an upstream model (e.g. black-forest-labs/flux.2-pro) is down, the
entire docs build fails even if no notebooks changed.

Add per-notebook caching based on source file SHA-256 hashes. Unchanged
notebooks are served from cache, and only modified ones are re-executed.
On the first CI run (empty cache), the workflow seeds the cache from the
last successful build artifact.

Also add a minimal test script (test_flux_image_gen.py) to reproduce the
flux.2-pro health check failure locally.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review comments on notebook caching

- Don't write .sha256 during seeding so changed notebooks are detected
- Rename TMPDIR to SEED_TMPDIR to avoid shadowing the POSIX env var
- Use portable sha256 helper (sha256sum with shasum fallback)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: only seed cache when truly empty, restore hash writing

Skip artifact seeding when a partial cache was restored (it already has
correct per-file hashes). Only seed + write current hashes when the
cache dir is completely empty (true bootstrapping).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restrict artifact seed lookup to main branch

Prevents seeding from feature branch runs that may have different
notebook sources.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add actions:read permission for artifact seeding

The seed step uses gh run list and gh run download which require
actions:read. Without it, these calls silently fail and the cold-start
cache bootstrapping never executes.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: only use notebook cache when called from build-docs

Scheduled Monday runs and manual workflow_dispatch should execute all
notebooks to catch regressions (e.g. library changes that break a
notebook). Caching is only used via workflow_call (from build-docs)
where the goal is fast, resilient doc deployment.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use jq // empty to avoid "null" string on empty run list

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add use_cache input flag to notebook and docs workflows

Replace event_name-based cache logic with an explicit use_cache boolean
input. Defaults:
- build-notebooks: workflow_call=true, dispatch=false, schedule=false
- build-docs: dispatch=true (toggleable), release=false

This gives full control over caching from the GitHub Actions UI.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-05 12:30:14 -03:00

75 lines
2.6 KiB
YAML

name: Build docs
on:
workflow_dispatch:
inputs:
use_cache:
description: "Use cached notebooks for unchanged sources"
type: boolean
default: true
release:
types:
- published
jobs:
build-notebooks:
uses: ./.github/workflows/build-notebooks.yml
with:
use_cache: ${{ github.event_name == 'workflow_dispatch' && inputs.use_cache || false }}
secrets: inherit
deploy:
needs: build-notebooks
runs-on: ubuntu-latest
permissions:
contents: write
steps:
- name: Checkout repository
uses: actions/checkout@v2
- name: Install uv
uses: astral-sh/setup-uv@v6
with:
version: "0.9.5"
- name: Set up Python
run: uv python install 3.11
- name: Install dependencies for docs
run: uv sync --all-packages --group docs
- name: Download artifact from previous step
uses: actions/download-artifact@v5
with:
name: notebooks
path: docs/notebooks
- name: Find the latest existing release tag
id: get_release
run: |
if [ "${{ github.event_name }}" == "release" ]; then
LATEST_TAG="${{ github.event.release.tag_name }}"
else
echo "::notice::Running manually via workflow_dispatch. Fetching latest release tag..."
gh auth status || echo "GitHub CLI is not authenticated, relying on GITHUB_TOKEN."
# We use tr -d '\n' to remove the trailing newline for a clean tag string
LATEST_TAG=$(gh release view --json tagName -q .tagName 2>/dev/null)
if [ -z "$LATEST_TAG" ]; then
echo "::error::Could not find the latest published release tag. Ensure a release exists."
exit 1
fi
fi
echo "Latest release tag found: $LATEST_TAG"
echo "LATEST_TAG=$LATEST_TAG" >> $GITHUB_ENV
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Extract version from release tag
run: |
# Remove the 'v' prefix and any suffix after a space
VERSION=$(echo ${{ env.LATEST_TAG }} | sed 's/^v//' | sed 's/ .*$//')
echo "::notice::Extracted version: $VERSION"
echo "VERSION=$VERSION" >> $GITHUB_ENV
- name: Setup doc deploy
run: |
git fetch origin gh-pages --depth=1
git config --global user.name "github-actions[bot]"
git config --global user.email "41898282+github-actions[bot]@users.noreply.github.com"
- name: Build and deploy docs
run: uv run mike deploy --push --update-aliases ${{ env.VERSION }} latest