DataDesigner

mirror of https://github.com/NVIDIA-NeMo/DataDesigner synced 2026-05-24 09:48:29 +00:00

Andre Manoel 2564834a47 fix: cache notebook builds to avoid flaky upstream model failures (#370)	2026-03-05 12:30:14 -03:00

* fix: cache notebook builds to avoid failures from flaky upstream models

  The build-notebooks CI executes all tutorial notebooks on every run. When an upstream model (e.g. black-forest-labs/flux.2-pro) is down, the entire docs build fails even if no notebooks changed.

  Add per-notebook caching based on source file SHA-256 hashes. Unchanged notebooks are served from cache, and only modified ones are re-executed. On the first CI run (empty cache), the workflow seeds the cache from the last successful build artifact.

  Also add a minimal test script (test_flux_image_gen.py) to reproduce the flux.2-pro health check failure locally.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: address review comments on notebook caching

  - Don't write .sha256 during seeding so changed notebooks are detected
  - Rename TMPDIR to SEED_TMPDIR to avoid shadowing the POSIX env var
  - Use portable sha256 helper (sha256sum with shasum fallback)

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: only seed cache when truly empty, restore hash writing

  Skip artifact seeding when a partial cache was restored (it already has correct per-file hashes). Only seed + write current hashes when the cache dir is completely empty (true bootstrapping).

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: restrict artifact seed lookup to main branch

  Prevents seeding from feature branch runs that may have different notebook sources.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: add actions:read permission for artifact seeding

  The seed step uses gh run list and gh run download which require actions:read. Without it, these calls silently fail and the cold-start cache bootstrapping never executes.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: only use notebook cache when called from build-docs

  Scheduled Monday runs and manual workflow_dispatch should execute all notebooks to catch regressions (e.g. library changes that break a notebook). Caching is only used via workflow_call (from build-docs) where the goal is fast, resilient doc deployment.

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use jq // empty to avoid "null" string on empty run list

  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat: add use_cache input flag to notebook and docs workflows

  Replace event_name-based cache logic with an explicit use_cache boolean input. Defaults:

  - build-notebooks: workflow_call=true, dispatch=false, schedule=false
  - build-docs: dispatch=true (toggleable), release=false

  This gives full control over caching from the GitHub Actions UI.

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
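
The per-notebook caching described in the commit message can be illustrated with a minimal Python sketch. This is not the workflow's actual code (the commit describes a shell helper using sha256sum with a shasum fallback). The function names, the `*.py` glob for notebook sources, and the one-`.sha256`-file-per-notebook cache layout are all assumptions made for illustration.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def notebooks_to_rebuild(source_dir: Path, cache_dir: Path) -> list[Path]:
    """List notebook sources whose hash is missing from, or differs from, the cache.

    Hypothetical layout: cache_dir/<stem>.sha256 holds the digest recorded
    the last time that notebook was executed successfully.
    """
    stale = []
    for src in sorted(source_dir.glob("*.py")):  # assumed source extension
        recorded = cache_dir / f"{src.stem}.sha256"
        if not recorded.exists() or recorded.read_text().strip() != sha256_of(src):
            stale.append(src)
    return stale


def record_hash(src: Path, cache_dir: Path) -> None:
    """Store the current digest after a notebook has been executed."""
    cache_dir.mkdir(parents=True, exist_ok=True)
    (cache_dir / f"{src.stem}.sha256").write_text(sha256_of(src) + "\n")
```

In this sketch a notebook with no recorded hash is always treated as stale, which mirrors the review fix of not writing .sha256 files during seeding so that changed notebooks are still detected.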
build-docs.yml	fix: cache notebook builds to avoid flaky upstream model failures (#370)	2026-03-05 12:30:14 -03:00
build-notebooks.yml	fix: cache notebook builds to avoid flaky upstream model failures (#370)	2026-03-05 12:30:14 -03:00
check-colab-notebooks.yml	refactor: slim package refactor into three subpackages (#240)	2026-01-27 13:53:20 -05:00
ci.yml	chore: add publish script and update license headers (#253)	2026-01-28 08:47:34 -05:00
dco-assistant.yml	update branch for signatures (#19)	2025-11-06 16:56:47 -05:00
health-checks.yml	test: add provider health checks script and CI workflow (#301)	2026-02-06 15:18:35 -03:00
pack-tutorials.yml	chore: moving notebooks to jupytext and cleaning up workflows (#91)	2025-12-03 17:29:07 -03:00
semantic-pull-requests.yml	updating ci workflows	2025-10-31 11:29:06 -04:00