Commit graph

461 commits

Author SHA1 Message Date
Johnny Greco
fe13efcc89
Merge branch 'main' into andreatgretel/docs/remove-code-reference-docs 2026-05-21 17:44:52 -04:00
Nabin Mulepati
2a487cdc5c
feat: add dropped column preservation toggle (#691)
* feat: add dropped column preservation toggle

Closes #690

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* fix: reject dropped column policy resume mismatch

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

---------

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-21 13:19:20 -06:00
Johnny Greco
1fa29ad940
Merge branch 'main' into andreatgretel/docs/remove-code-reference-docs 2026-05-21 15:02:43 -04:00
Sai Asish Y
000fc09f94
fix(interface): reject duplicate names within output_processors (#675) (#697)
Signed-off-by: SAY-5 <say.apm35@gmail.com>
2026-05-21 12:13:51 -06:00
Eric W. Tramel
c0a4dcbb85
feat: implement async scheduling admission control (#661)
2026-05-20 20:58:05 -04:00
Nabin Mulepati
a83968f701
feat: preserve multimodal MCP tool results (#689)
* feat: preserve multimodal MCP tool results

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* fix: gate MCP generic image payload detection

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* fix: validate MCP image payload blocks

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

---------

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-20 14:49:22 -06:00
Nabin Mulepati
83694b2cdf
chore: refresh uv lockfile (#692)
Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-20 12:17:01 -06:00
Nabin Mulepati
0860d62e23
fix: restore chat completion multi-choice support (#672)
* fix chat completion multi-choice support

Restore the chat completion n request field and preserve all returned choices in the canonical response while keeping response.message as the first choice.

Add coverage for request forwarding, compatibility access, multi-choice parsing, and generate forwarding.

Fixes #620

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* strip n from generate requests

Prevent generate and agenerate from forwarding multi-choice requests that they cannot expose, while keeping completion() multi-choice support intact.

Add coverage for async parsing and Anthropic n exclusion.

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* strip configured n from generate requests

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* rename multiple choice completion flag

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* move choice sanitizer to private helpers

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* order private facade helpers

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

---------

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-20 11:40:10 -06:00
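To illustrate the behavior described in 0860d62e23, the sketch below shows one way a response object could keep every returned choice while still exposing the first one as `message`, and how `n` could be dropped from generate-style requests. `Choice`, `ChatResponse`, and `strip_n` are hypothetical names chosen for this example, not the project's actual classes or helpers.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Choice:
    """One completion choice (hypothetical stand-in for the real type)."""

    index: int
    content: str


@dataclass
class ChatResponse:
    """Canonical response that preserves all returned choices (hypothetical)."""

    choices: list[Choice] = field(default_factory=list)

    @property
    def message(self) -> str:
        # Compatibility accessor: always the content of the first choice.
        return self.choices[0].content


def strip_n(params: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the request parameters without the ``n`` field."""
    return {key: value for key, value in params.items() if key != "n"}


response = ChatResponse([Choice(0, "first"), Choice(1, "second")])
generate_params = strip_n({"n": 3, "temperature": 0.2})
```

Under these assumptions, callers that only read `response.message` keep working, while callers that need every choice can iterate over `response.choices`.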
Nabin Mulepati
bd0410bb05
fix(engine): actionable error when a Jinja field is missing/None/empty (#633)
* fix(engine): actionable error when a Jinja field is missing/None/empty

Empty-render and missing-attribute failures used to surface as the
generic "User provided prompt generation template is invalid." either
because `sanitize_user_exceptions` stripped the detail or because
Jinja's raw `UndefinedError` leaked through. Both now raise a new
`EmptyTemplateRenderError` carrying a row-level diagnostic that names
the offending chain and includes copy-pasteable Jinja conditional and
SkipConfig fix patterns.

Closes #629.

* fix(engine): address PR review feedback on EmptyTemplateRenderError

Addresses the open review comments on #633:

1. (Greptile P1) Gate expression in the suggested remediation template
   was one accessor too deep when the root variable was entirely absent
   from the record, causing the suggested fix to itself raise
   UndefinedError. Fall back to gating on the root name alone when
   sample_name is not in record.

2. (andreatgretel) The AST walker reported loop-local names as missing
   culprits (e.g. ``person`` in ``{% for person in people %}...{% endfor %}``).
   Filter extracted chains through ``meta.find_undeclared_variables`` to
   defer to Jinja's canonical scope tracking.

3. (andreatgretel follow-up) Empty collections used as loop iterables
   (``items=[]``) fell through to the no-culprit fallback. Add a new
   ``_CULPRIT_EMPTY_COLLECTION`` classification so they're surfaced.

4. Minor: add ``from exception`` to ``safe_render``'s UndefinedError
   re-raise for traceback consistency with the native engine path, and
   add a note on the load-bearing exception ordering in
   ``sanitize_user_exceptions``.
2026-05-20 09:51:21 -06:00
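Item 2 of the review feedback in bd0410bb05 relies on `jinja2.meta.find_undeclared_variables` to avoid reporting loop-local names. The sketch below shows that call in isolation. The helper names `undeclared_names` and `filter_chains`, and the dotted-string representation of accessor chains, are assumptions made for illustration and are not taken from the project's code.

```python
from jinja2 import Environment, meta

env = Environment()


def undeclared_names(template_source: str) -> set[str]:
    """Names the template expects from the render context."""
    ast = env.parse(template_source)
    return meta.find_undeclared_variables(ast)


def filter_chains(chains: list[str], template_source: str) -> list[str]:
    """Keep only chains whose root name is undeclared in the template.

    Chains are assumed to be dotted strings such as "person.name".
    """
    roots = undeclared_names(template_source)
    return [chain for chain in chains if chain.split(".")[0] in roots]


source = "{% for person in people %}{{ person.name }}{% endfor %} {{ city }}"
names = undeclared_names(source)
kept = filter_chains(["person.name", "people", "city"], source)
```

Here `person` is bound by the `for` loop, so Jinja does not report it as undeclared, and the `person.name` chain is filtered out.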
Nabin Mulepati
e181b4b3b2
docs: plan audio/video context support (#669)
* docs: plan audio/video context support

Closes #668

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* docs: address audio video context plan feedback

Tighten the plan around legacy image-context migration, audio/video auto-detection, config-layer canonical blocks, capability gating, ImageColumnConfig scope, and single-PR implementation rollout.

Refs #668

* docs: clarify legacy image context migration

* docs: resolve image context canonical migration

---------

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-20 09:42:22 -06:00
github-actions[bot]
aa69207cc7
chore(agentic-ci): declare numpy as direct dependency of data-designer-engine (#676)
The engine package imports numpy directly (e.g. `from numpy.typing import
NDArray` in `sampling_gen/constraints.py`) but only declared it
transitively via `data-designer-config`. Add `numpy>=1.23.5,<3` to the
engine's own `[project.dependencies]`, matching the specifier already
used by the config package. No runtime behavior changes.

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-20 09:32:24 -03:00
github-actions[bot]
383db51e6f
chore(agentic-ci): add future annotations import to dataset_metadata.py (#640)
* chore(agentic-ci): add future annotations import to dataset_metadata.py

Per AGENTS.md: every Python source file requires
`from __future__ import annotations`.

* chore: trigger PR checks

---------

Co-authored-by: agentic-ci[bot] <agentic-ci@users.noreply.github.com>
Co-authored-by: Andre Manoel <amanoel@nvidia.com>
Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-20 09:29:58 -03:00
Andre Manoel
ff5277088d
fix(ci): trust generated Agentic CI PRs (#643)
* fix(ci): trust generated agentic CI PRs

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix(ci): authorize generated PR checks

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix(ci): pin authorized agentic checks

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix(ci): narrow agentic CI trust

* fix(ci): reject stale agentic authorizations

* fix(ci): serialize agentic authorization

---------

Signed-off-by: Andre Manoel <amanoel@nvidia.com>
2026-05-20 09:27:04 -03:00
Andre Manoel
20555a751d
docs: address generated reference review 2026-05-20 12:14:50 +00:00
Steve Han
abb4a242df
docs: add retriever SDG toolkit dev note (#666)
* docs: add retriever SDG toolkit dev note

Signed-off-by: Steve Han <sthan@nvidia.com>

* docs: restyle retriever SDG pipeline diagram

Signed-off-by: Steve Han <sthan@nvidia.com>

* docs: fix retriever SDG pipeline flow order

Signed-off-by: Steve Han <sthan@nvidia.com>

* docs: address retriever SDG dev note review

Signed-off-by: Steve Han <sthan@nvidia.com>

* docs: clarify retriever SDG wording

Signed-off-by: Steve Han <sthan@nvidia.com>

---------

Signed-off-by: Steve Han <sthan@nvidia.com>
2026-05-19 16:19:03 -04:00
Johnny Greco
7f7c62fbc1
docs: update generated token badge (#678)
Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-19 15:11:13 -04:00
Mike Knepper
498e627d49
feat: Expose on_batch_complete via create method (#663)
2026-05-19 09:38:18 -05:00
Andre Manoel
6055290136
feat: add workflow chaining (#636)
* feat: add workflow chaining

* test: tidy workflow chaining coverage

* fix: harden workflow chaining concurrency

* docs: update workflow chaining plan

* feat: add workflow stage postprocessors

* feat: expose workflow stage outputs

* fix: align workflow selected output export

* fix: address workflow chaining review issues

* fix: align workflow parquet export selection

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix: preserve generated columns in drop validation

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix: clarify workflow output processors

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* docs: add workflow chaining page

* docs: align workflow chaining warning

* fix: address workflow review nits

---------

Signed-off-by: Andre Manoel <amanoel@nvidia.com>
2026-05-18 20:15:47 -03:00
Andre Manoel
08ccf3412d
docs: remove docs code reference 2026-05-18 21:34:15 +00:00
Nabin Mulepati
71997624b3
feat: track reasoning token usage (#670)
* feat: track reasoning token usage

Capture provider-reported reasoning-token breakdowns alongside output tokens without changing output token totals. Carry the field through model usage aggregation and add coverage for parsing, facade tracking, and deltas.

Refs #665

* fix: show reasoning tokens in usage summary

Include reasoning token counts in the local model usage summary while preserving output and total token semantics. Telemetry remains unchanged.

Refs #665

* fix: estimate missing reasoning token counts

When providers return reasoning content without a numeric usage breakdown, estimate reasoning tokens from that content while preserving provider-reported output and total token counts.

Refs #665

* fix: track reasoning token count source

* fix: simplify reasoning token source

* fix: omit unknown reasoning tokens from logs

* refactor: clarify reasoning token count helpers

* test: move token counting tests

* fix: enforce reasoning token source

* fix: address reasoning usage review
2026-05-18 12:15:31 -06:00
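The fallback described in 71997624b3 (use the provider-reported reasoning count when present, otherwise estimate from the reasoning content, and never alter output totals) could look roughly like the sketch below. All names here are hypothetical, and the four-characters-per-token heuristic is only a placeholder assumption; the commit does not say how the project estimates tokens.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReasoningUsage:
    """Usage record that tracks where the reasoning count came from."""

    output_tokens: int
    reasoning_tokens: Optional[int]
    source: Optional[str]  # "reported", "estimated", or None when unknown


def estimate_tokens(text: str) -> int:
    # Placeholder heuristic (assumption): roughly four characters per token.
    return max(1, len(text) // 4) if text else 0


def resolve_reasoning_usage(
    output_tokens: int,
    reported_reasoning_tokens: Optional[int],
    reasoning_content: Optional[str],
) -> ReasoningUsage:
    """Prefer the provider-reported count, then an estimate, then unknown."""
    if reported_reasoning_tokens is not None:
        return ReasoningUsage(output_tokens, reported_reasoning_tokens, "reported")
    if reasoning_content:
        estimate = estimate_tokens(reasoning_content)
        return ReasoningUsage(output_tokens, estimate, "estimated")
    # No breakdown and no reasoning content: leave the count unknown.
    return ReasoningUsage(output_tokens, None, None)


reported = resolve_reasoning_usage(100, 40, "some reasoning text")
estimated = resolve_reasoning_usage(100, None, "a" * 40)
unknown = resolve_reasoning_usage(100, None, None)
```

In every branch `output_tokens` is passed through unchanged, which mirrors the commit's stated goal of preserving provider-reported output and total token counts.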
dependabot[bot]
387be6f07d
ci: bump the all-actions group across 1 directory with 2 updates (#664)
Bumps the all-actions group with 2 updates in the / directory: [cloudflare/wrangler-action](https://github.com/cloudflare/wrangler-action) and [NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml](https://github.com/nvidia-nemo/fw-ci-templates).


Updates `cloudflare/wrangler-action` from 3.15.0 to 4.0.0
- [Release notes](https://github.com/cloudflare/wrangler-action/releases)
- [Changelog](https://github.com/cloudflare/wrangler-action/blob/main/CHANGELOG.md)
- [Commits](9acf94ace1...ebbaa15849)

Updates `NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml` from 1.1.0 to 1.2.0
- [Release notes](https://github.com/nvidia-nemo/fw-ci-templates/releases)
- [Changelog](https://github.com/NVIDIA-NeMo/FW-CI-templates/blob/main/CHANGELOG.md)
- [Commits](2dee428461...e58924ea30)

---
updated-dependencies:
- dependency-name: cloudflare/wrangler-action
  dependency-version: 4.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: all-actions
- dependency-name: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml
  dependency-version: 1.2.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: all-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-18 11:45:27 -03:00
Andre Manoel
cd604a57a4
ci: fix Fern devnotes artifact lookup (#667)
2026-05-15 17:45:51 -03:00
Andre Manoel
765fccfcb0
docs: fix Fern versioned publishing (#656)
* docs: fix Fern versioned plugin docs

* docs: guard Fern release version content

* docs: dedupe latest Fern release pages

* ci: require latest Fern nav on release

* docs: document Fern release prep

* ci: automate Fern release sync

* ci: publish Fern snapshots from docs branch

* docs: keep Fern archive on docs branch

* docs: harden Fern docs branch publishing

* ci: preview Fern docs from archive branch

* docs: include utility modules in Fern API reference

* ci: harden Fern devnotes publishing

* docs: keep Fern latest label stable

* docs: normalize Fern latest preview label

* docs: align Fern code reference nav

* docs: sync Fern code reference across versions

* docs: materialize Fern version pages

* ci: record Fern publish provenance

* docs: fix Fern generated API MDX

* docs: escape generated Fern API example

* ci: use stable Fern preview URL

* docs: flatten Fern API nav roots

* docs: use generated API overview pages

* ci: allow branch-dispatched Fern publish tests

* docs: update Fern CLI pin

* docs: dedupe release nav validation paths

* docs: address Fern review nits
2026-05-15 17:09:59 -03:00
Eric W. Tramel
a4085c441a
feat: add AIMD startup ramp (#638)
2026-05-13 16:25:03 -04:00
Nabin Mulepati
3c8394e783
fix(interface): silence registry-default deprecation when library auto-fills it (#655)
The ``ModelProviderRegistry.default is deprecated`` warning added in #594
fires for every fresh-install ``DataDesigner()`` construction, even when
the user wrote ``default=`` nowhere — neither in YAML, nor in Python, nor
in any ``ModelConfig``.

Root cause: ``resolve_model_provider_registry`` synthesises
``default=providers[0].name`` for the multi-provider case to satisfy
``check_implicit_default``. The auto-seeded
``~/.data-designer/model_providers.yaml`` ships three providers and no
``default:`` key, so this path is hit on every bare ``DataDesigner()``
call. ``_warn_on_explicit_default`` then attributes the warning to the
user's ``DataDesigner()`` line, with a remediation message ("Specify
provider= explicitly on each ModelConfig") that doesn't even apply when
the user hasn't built a ``ModelConfig`` (e.g. a UUID-only sampler config
with the GitHub plugin).

Fix: broaden the existing warning suppression in ``DataDesigner.__init__``
to also cover the ``model_providers is None`` case. The user is opting
into all defaults — the library is the one filling ``default=``, so the
deprecation nudge is misdirected. Users who hand-construct a
multi-provider list in Python still see the warning (they wrote the
multi-provider intent themselves), and direct
``ModelProviderRegistry(default="x")`` always warns — those are the
entry points #589 actually targets.
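
The suppression rule above can be illustrated with a minimal sketch. It assumes the warning is a `DeprecationWarning`; `resolve_registry` and `build_designer` are hypothetical stand-ins, not the library's actual functions, and only the `model_providers is None` case is modelled:

```python
import warnings


def resolve_registry(model_providers):
    """Hypothetical stand-in: auto-fills a default and emits the deprecation warning."""
    providers = model_providers or ["provider-a", "provider-b", "provider-c"]
    warnings.warn(
        "ModelProviderRegistry.default is deprecated",
        DeprecationWarning,
        stacklevel=2,
    )
    return {"providers": providers, "default": providers[0]}


def build_designer(model_providers=None):
    """Hypothetical constructor path: stay quiet only when the caller passed nothing."""
    if model_providers is None:
        # The library, not the user, is filling default=, so silence the nudge.
        with warnings.catch_warnings():
            warnings.simplefilter("ignore", DeprecationWarning)
            return resolve_registry(model_providers)
    # A hand-built provider list still surfaces the warning.
    return resolve_registry(model_providers)
```

Under these assumptions, a bare `build_designer()` call emits nothing, while `build_designer(["a", "b"])` still warns.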

New regression test pins the bare-``DataDesigner()`` quiet path so a
future tightening of the suppression can't silently re-introduce the
spurious warning.

Refs #589, follow-up to #594.

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-13 14:15:10 -06:00
Johnny Greco
ef761b824b
docs: add "Have It Your Way" plugin dev note (#608)
* docs: add plugins dev note

* docs: mention custom columns

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

* docs: update plugins dev note

* docs: refine plugins dev note

* docs: link v0.6.0 release

---------

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-13 15:01:37 -04:00
Eric W. Tramel
0fdea845ac
feat: add fair async task scheduling (#639)
2026-05-13 13:47:45 -04:00
Johnny Greco
d14c9b3ccc
feat(cli): add plugin catalog core (#618)
* feat(cli): add plugin catalog services

Add typed catalog and tap models, persistent tap storage, cached
catalog loading, compatibility evaluation, install plan generation,
and runtime plugin discovery helpers.

Refs #617

* feat(cli): add plugins command group

Wire list, search, info, install, installed, and tap management
commands through the existing command-controller CLI pattern.

Refs #617

* test(cli): cover plugin catalog workflows

Add regression coverage for tap caching, catalog compatibility,
installer command generation, local path resolution, and Typer command
delegation.

Refs #617

* fix(cli): align plugin taps with schema v2

Validate tap catalogs against the schema v2 contract used by
NVIDIA-NeMo/DataDesignerPlugins#36, including source union fields,
docs URLs, package paths, compatibility metadata, and unique runtime
plugin names.

Derive Git install targets as package-qualified PEP 508 direct
references so git tap entries install the package described by the
catalog source metadata.

Refs #617

* fix(cli): address plugin review feedback

- Invalidate import caches before post-install entry point verification
- Make tap aliases case-insensitive and cache catalogs by alias plus URL
- Prefer compatible catalog entries before falling back to forced installs
- Clarify unused --tap behavior and list installed entry points without imports
- Add direct controller coverage and update CLI plugin documentation

Refs #617

* fix(cli): gate incompatible plugin installs

Fetch install targets before compatibility filtering so the controller
owns the final --force decision and the incompatible install guard stays
reachable.

Refs #617

* style(cli): format plugin catalog files

Apply ruff formatting to the plugin command and tap repository tests so
CI format checks pass on the PR merge commit.

Refs #617

* fix(cli): reject duplicate plugin entry names

Key catalog duplicate detection by entry_point.name so distinct catalog
entries cannot register the same runtime plugin name.

Refs #617

* fix(cli): preserve GitHub tree tap paths

* fix(cli): verify plugin entry point names

* align plugin CLI with catalog schema

- adopt catalog terminology for plugin source aliases
- parse package-first plugin catalog metadata from the plugin repo
- install package requirements with optional catalog indexes

* tidy plugin catalog workflow docs

* align plugin catalog CLI with package contract

* add plugin package uninstall workflow

* test plugin package command targets

* document plugin package aliases

* address plugin catalog review feedback

* prefer runtime plugin lookup matches

* rename plugins command to plugin

* show plugin package descriptions

* rename plugin catalogs command

* add protected plugin package installs

* document plugin package install modes

* avoid building project during plugin installs

* harden plugin package installs

* tighten plugin catalog contracts

* fix no-args help exit code

* make plugin docs links robust

* document plugin CLI catalog workflows

* clarify plugin entry point verification

* simplify plugin CLI docs

* narrow plugin search fields

* hide plugin catalog cache ttl

* remove plugin catalog trust flag

* improve plugin CLI recovery UX

* polish plugin catalog table display

* stabilize plugin catalog table test

* tighten plugin catalog edge cases

* harden plugin catalog verification

- Escape catalog-provided Rich markup before rendering CLI output
- Reject runtime plugin names that collide after enum-key normalization
- Load installed runtime entry points in a subprocess before reporting success
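
The name-collision check in the second bullet might look roughly like the sketch below. The normalization rule (non-alphanumeric runs become `_`, then upper-case) is an assumption; the actual enum-key normalization is not described in this log, and both function names are hypothetical:

```python
import re
from collections import defaultdict


def enum_key(name: str) -> str:
    """Assumed normalization: collapse non-alphanumeric runs to '_' and upper-case."""
    return re.sub(r"[^0-9A-Za-z]+", "_", name).upper()


def find_collisions(names):
    """Group plugin names by normalized key and keep only keys with several members."""
    groups = defaultdict(list)
    for name in names:
        groups[enum_key(name)].append(name)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

With this rule, `my-plugin` and `my_plugin` share one key and would be rejected together, while exact-duplicate detection alone would let both through.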

* simplify plugin entry point verification

Load matching entry points directly after install instead of spawning a
separate Python process. This keeps the check package-scoped while still
catching broken entry-point targets and non-Plugin objects.

* require newer uv for plugin plans

Use uv >= 0.10.0 as the single supported uv requirement for
plugin package commands. Auto mode now falls back to a pip plan with
an upgrade warning when uv is unavailable or too old, while explicit
uv selection remains strict.
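
As an illustration of the selection rule described above, here is a minimal sketch. The function names, the mode strings, and the warning text are assumptions; only the `0.10.0` floor and the auto-versus-explicit behaviour come from the commit message:

```python
from typing import Optional, Tuple

MIN_UV = (0, 10, 0)  # floor stated in the commit message


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' string into a comparable tuple of ints."""
    return tuple(int(part) for part in text.strip().split(".")[:3])


def choose_installer(mode: str, uv_version: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (installer, warning). uv_version is None when uv is not available."""
    uv_ok = uv_version is not None and parse_version(uv_version) >= MIN_UV
    if mode == "uv":
        # Explicit uv selection stays strict.
        if not uv_ok:
            raise RuntimeError("uv >= 0.10.0 is required")
        return "uv", None
    if mode == "auto" and uv_ok:
        return "uv", None
    if mode == "auto":
        # Auto mode degrades to pip and tells the user why.
        return "pip", "uv is unavailable or too old; upgrade to uv >= 0.10.0"
    return "pip", None
```

Comparing integer tuples rather than strings matters here: as strings, `"0.9.30"` sorts after `"0.10.0"`.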

* verify pip fallback availability

* polish plugin CLI status markers

* clarify plugin compatibility labels

* simplify plugin info install details

* address plugin CLI review nits

* support versioned plugin package installs

* share plugin install metadata rendering

* show installed plugin packages

* harden versioned plugin installs

- Preserve catalog requirement constraints for versioned installs
- Remove stale install-plan metadata fields
- Expand parser, uv, controller, and local-catalog dry-run coverage

* harden plugin help tests

* show plugin package versions

Add package version metadata support for plugin catalogs and resolve current versions from exact requirements or simple indexes when catalog entries omit them.

Update plugin list/info/install metadata to show the plugin package version and Data Designer compatibility requirement while removing the separate Data Designer version line.

* format plugin catalog tests

* harden plugin package metadata checks

* harden plugin CLI test coverage

* add plugin discovery docs (#642)

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

---------

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-13 12:26:58 -04:00
Andre Manoel
1d203b1dda
feat(agentic-ci): decision-ready triage and daily PR fixes (#600)
* feat(agentic-ci): decision-ready triage and daily PR fixes

Reorganize the weekly issue-triage report around recommended actions
(close as resolved, close as duplicate, needs maintainer decision,
ready for assignment, stuck PR, duplicate PRs, stale) so each flagged
item carries action + evidence + rationale and can be resolved without
opening it. Multi-comment split with i/N markers and orphan
reconciliation when the report grows or shrinks.

Flip the four daily audit suites with mechanical fix categories from
read-only reports to opening one PR per run:

- docs-and-references: broken-link, docstring-drift, arch-ref-rename
- structure: missing-future, lazy-import
- dependencies: transitive-gap, unused
- code-quality: bare-except (draft until landing rate proven)

test-health stays report-only (all candidates require inferring intent).

The shared procedure - fix_backlog selection, finding-hash spec for
stable cross-run identification, attempted_fixes lifecycle with
two-strike escalation, allowlists, ranking, branch/PR conventions -
lives in .agents/recipes/_fix-policy.md. Each suite recipe declares
only its eligible categories, branch types, and test requirements.

Workflow runs claude twice per suite (audit, then conditionally fix),
each capped at the existing --max-turns 50. Fix call is gated on
non-empty fix_backlog and skipped entirely for test-health.

* fix(agentic-ci): address review findings before merge

- Map per-package test targets explicitly in _fix-policy.md (Makefile
  exposes test-config/test-engine/test-interface, not test-<package>).
- Use github-actions[bot] noreply identity for commits the recipes
  produce.
- Refresh fix_backlog.data when an id already exists so the fix phase
  cannot drive a PR from stale data after the underlying file changed.
- Stop time-pruning closed/abandoned attempted_fixes entries — pruning
  before the two-strike threshold erases the history needed to
  escalate. Single-strike entries now age out only via the 200-entry
  cap.
- Disambiguate bare-except findings within the same function by
  including a try-body hash in the finding id.
- Audit grep for code-quality now matches both `except:` and
  `except BaseException:`, in parity with the fix eligibility.
- Restrict transitive-gap fix eligibility to cases where a sibling
  package already declares the dep (avoids inventing version
  specifiers from scratch).
- Issue-triage workflow handles multi-part reports in both the fallback
  post step and the job summary; recipe always writes numbered parts.

* fix(agentic-ci): close residuals from review pass 2

- Replace remaining `make test-<package>` references with pointers to
  the mapping table; only the table itself uses that placeholder now.
- Fix `gh api --paginate | jq | length` returning per-page counts: slurp
  with `jq -s 'add // 0'` to get a single total.
- Compare posted-comment count to expected part count so a partial post
  (agent posted part 1 but not 2/3) triggers the fallback instead of
  being silently treated as success.
- Add `shell: bash` to triage steps using `shopt`/`mapfile` so they're
  not at the mercy of the runner's default shell.
- Disambiguate bare-except findings whose try-body hashes collide by
  adding a per-function ordinal to the canonical_key.
- Tie the 200-entry attempted_fixes cap eviction to `attempts[0].at`
  (the schema has no `first_seen` field).

* fix(agentic-ci): identity-based partial-post detection in triage fallback

Replace the count-only POSTED_COUNT >= EXPECTED_PARTS check with an
identity-based check that extracts every i/N marker seen in
today-dated bot comments and verifies each expected i is present.
A duplicate post of one part can no longer mask a missing other.
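
A small sketch of the identity-based check, written in Python rather than the workflow's shell and jq. The marker shape (`i/N` as bare digits) and the function name are assumptions; filtering to today-dated bot comments is left out:

```python
import re

# Assumed marker shape: "<part>/<total>", e.g. "2/3".
MARKER = re.compile(r"\b(\d+)/(\d+)\b")


def missing_parts(comment_bodies, expected_parts):
    """Return the part numbers that no comment carries a marker for."""
    seen = set()
    for body in comment_bodies:
        match = MARKER.search(body)
        # Bodies without a marker (unrelated comments) are skipped, not errors.
        if match and int(match.group(2)) == expected_parts:
            seen.add(int(match.group(1)))
    return [i for i in range(1, expected_parts + 1) if i not in seen]
```

Three comments marked `1/3`, `1/3`, and `3/3` would satisfy a count-only check, yet this returns `[2]`.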

* fix(agentic-ci): close remaining bot-review findings

- Exempt two-strike attempted_fixes entries from the 200-entry cap
  eviction. Cap now evicts non-two-strike oldest-first by
  attempts[0].at; two-strike entries are silently forgotten only in
  the pathological all-200-are-two-strike case (itself a signal).
- Specify the attempted_fixes PR-marker reconciliation algorithm:
  scan open PR bodies for the `<!-- agentic-ci finding=<id> -->`
  marker and back-fill missing entries.
- Tighten the daily workflow conditionals to gate on explicit step
  outcomes (steps.audit.outcome == 'success' rather than success())
  so a future pre-audit gate cannot accidentally trip the fix step.

* fix(agentic-ci): close Greptile pass-2 findings (timeout, re-verify wording)

- Bump daily-suite job timeout from 20 to 40 minutes. The split into
  two sequential `claude --max-turns 50` invocations can saturate a
  20-minute budget; a mid-fix SIGTERM would leave an orphaned branch
  and inconsistent runner-state.
- Disambiguate the `_phase-fix.md` "do NOT re-scan" rule. It forbids
  rebuilding fix_backlog from scratch but does NOT override the
  per-candidate re-verification step required by _fix-policy.md
  step 4.1 (re-grep / re-read the specific file the candidate points
  at). Single-candidate re-verification is required; whole-codebase
  re-scanning is forbidden.

* fix(agentic-ci): close Greptile pass-3 P1s in triage fallback

- Guard `jq capture()` with a `test()` select. `capture()` errors on
  non-match instead of returning empty, which would truncate
  SEEN_PARTS if any unrelated today-dated bot comment lacks the
  triage marker (e.g. from a sibling workflow). Adding the test()
  guard ensures capture() only runs on bodies that already match.
- Iterate the MISSING[] array when posting fallback parts, not the
  full PARTS[] array. Posting all parts when only some were missing
  was creating duplicate comments for the parts the agent already
  successfully posted.

* fix(agentic-ci): close johnnygreco review-pass warnings

Address the five Warnings from the 2026-05-07 review focused on the
trust boundary for autonomous PR generation. Five workflow/policy
adjustments shrink the surface where agent compliance is load-bearing:

- Workflow-level scope gate. After the fix step, re-derive the diff
  against `origin/main` and validate against the per-suite path
  allowlist (regex mirrored from `_fix-policy.md`), the 50-LOC cap, and
  the 3-file cap. On violation, close the PR with `--delete-branch`
  and flip the `attempted_fixes` entry from `open` to `abandoned` so
  two-strike logic still sees the failure. The recipe alone could not
  bind the agent's path choices; the workflow now does.
- Dependencies install-dev verification. For the dependencies suite
  only, re-run `make install-dev` after the scope gate so the agent's
  pyproject edit is exercised against the lockfile resolver. Closes
  the PR if `install-dev` fails — catches the failure mode where the
  per-package test target passed against the old cached lockfile.
- Flip matrix-job `cancel-in-progress` from true to false. A
  cancellation between the agent's git push and `gh pr create` would
  leave an orphaned branch with no `attempted_fixes` record;
  reconciliation only covers PRs that were opened. Queueing a
  duplicate run is the lesser evil. `_fix-policy.md` Atomicity
  section now documents the trade-off.
- Allow `/tmp/audit-{{suite}}.md` in `_phase-audit.md`'s "do not
  modify outside `{{memory_path}}/`" directive. A literal-minded
  agent could refuse to write the report file, which would break the
  job summary, artifact upload, and the fix phase's audit context.
- Always upload the agent log artifact (was `if: failure()` only) and
  include `runner-state.json`. For autonomous mode, the most
  interesting failure is "the workflow succeeded but the PR was
  wrong"; the stream-json log is the only way to look back days
  later.

Also takes johnnygreco's Suggestion 2: spell out in the policy doc
that the `draft_until_proven` flip is the sole human-gated
promotion step in the fix policy and must not be automated.

Greptile and the github-actions auto-reviewer's findings were
already closed in the prior pass-2/pass-3 commits; no action needed
on those.

* fix(agentic-ci): close Codex review-pass-2 findings on workflow gates

Codex flagged five issues in the prior commit's scope/lockfile gates.
This commit closes all five:

- HIGH: Wrong-PR targeting. Both gates selected the last globally-open
  attempted_fixes entry, which could match a stale orphan from a
  prior crashed run rather than the PR opened by *this* run. Adds a
  pre-fix snapshot step that captures `(id, attempts-length)` pairs
  before the fix runs, and changes the post-fix selectors to require
  that the entry's attempts count grew during this run.
- HIGH: Docstring-only enforcement gap on the docs-and-references
  suite. The .py path allowlist was at workflow level but the
  docstring-only caveat was still policy-only. Adds an AST-based
  check: for each .py file changed, parse the post-change tree,
  collect docstring line ranges (module/class/function), then verify
  every added line in the diff is either inside a docstring, a
  comment, or whitespace. Verified locally with both pass and fail
  fixtures.
- MEDIUM: Diff-ref mismatch. Gates diffed `origin/main...HEAD` rather
  than `origin/main...origin/$BRANCH`, so a misbehaving agent that
  left HEAD pointing elsewhere would have validated the wrong tree.
  Now fetches `origin/$BRANCH` first and prefers that ref. Falls
  back to HEAD only if fetch fails (with a warning).
- MEDIUM: FILE_COUNT bug. `grep -c '.' || echo 0` produced "0\n0" on
  empty diff, breaking the downstream integer comparison. Replaces
  with `mapfile -t FILE_ARR` + `${#FILE_ARR[@]}`, which is correct
  for any input including empty.
- LOW: Non-atomic JSON writes. The runner-state mutations could leave
  the file half-written if the workflow was cancelled mid-write.
  Switches both gates to the temp-file + os.replace pattern.
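
The temp-file + `os.replace` pattern named in the last bullet can be sketched as follows. The helper name is hypothetical and the flush/fsync step is an extra precaution not mentioned in the commit:

```python
import json
import os
import tempfile


def write_json_atomically(path: str, payload: dict) -> None:
    """Write JSON to a sibling temp file, then swap it into place in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    # Same directory as the target, so the rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle, indent=2)
            handle.flush()
            os.fsync(handle.fileno())
        # Readers see either the old file or the new one, never a partial write.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file on any failure, then re-raise.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

If the process is cancelled before `os.replace`, the previous file is left untouched and only a stray temp file may remain.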

Also: dependencies-lockfile gate now does an explicit
`git checkout --detach origin/$BRANCH` before re-running install-dev,
so verification runs against what was actually pushed rather than
relying on local working-tree state.
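
For the docstring-only check described in the second HIGH item, a minimal sketch with the standard `ast` module is below. The function names are hypothetical, and the caller is assumed to supply the post-change source plus the 1-based numbers of the added lines from the diff:

```python
import ast


def docstring_line_ranges(source: str) -> list:
    """Collect (start, end) line ranges of module, class, and function docstrings."""
    ranges = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, (ast.Module, ast.ClassDef, ast.FunctionDef, ast.AsyncFunctionDef)):
            continue
        first = node.body[0] if node.body else None
        if (
            isinstance(first, ast.Expr)
            and isinstance(first.value, ast.Constant)
            and isinstance(first.value.value, str)
        ):
            ranges.append((first.lineno, first.end_lineno))
    return sorted(ranges)


def added_lines_are_doc_only(source: str, added_line_numbers: list) -> bool:
    """True when every added line is whitespace, a comment, or inside a docstring."""
    ranges = docstring_line_ranges(source)
    lines = source.splitlines()
    for number in added_line_numbers:
        text = lines[number - 1].strip()
        if not text or text.startswith("#"):
            continue
        if any(start <= number <= end for start, end in ranges):
            continue
        return False
    return True
```

This sketch treats only full-line comments as comments; a trailing comment added to a code line counts as a code change.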

* fix(agentic-ci): gate fix + scope_gate steps on snapshot.outcome

Greptile review on 872d5617 flagged that the fix step's custom `if:`
expression bypasses GitHub Actions' implicit success() check. Without
explicitly referencing steps.snapshot.outcome, a snapshot failure
(corrupt runner-state, disk error) would let the fix step run anyway.
The scope gate's `jq --slurpfile prior /tmp/prior-attempted-fixes.json`
would then exit non-zero on the missing file, leave OPEN empty, and
hit the "nothing to validate" early-exit — silently approving whatever
the agent pushed.

Adds steps.snapshot.outcome == 'success' to both the fix step's
condition (the actual fix) and the scope_gate step's condition
(belt-and-suspenders against future refactors).

* fix(agentic-ci): harden daily fix gates

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix(agentic-ci): validate all grown fix attempts

* fix(agentic-ci): harden post-fix gates

---------

Signed-off-by: Andre Manoel <amanoel@nvidia.com>
2026-05-12 18:54:01 -03:00
Andre Manoel
46dc8b232a
docs: prepare Fern docs workflow (#622)
* docs: prepare fern generated artifacts

* docs: update fern migration artifacts

* docs: leave colab notebooks unchanged

* docs: add VLM recipe cards to Fern

* docs: trim Dev Notes sidebar

* docs: collapse older Dev Notes in sidebar

* docs: add Fern publishing workflows

* docs: gate Fern publishing on check

* docs: restrict hosted previews for fork PRs

* docs: clean Fern preview URL

* docs: cancel stale preview runs

* docs: clarify devnotes notebook reuse

* docs: clean older versions route

* docs: document Fern versioning conventions

* docs: add Fern release version guard

* docs: harden Fern release tag handling

* ci: let docs preview continue after fern failure

* ci: split docs preview deploy

* docs: clarify fern make commands

* ci: harden fern deploy workflows

* docs: render preview notebooks without outputs

* ci: keep docs preview deploy inline

* docs: align notebook code highlighting

* docs: show notebook snippet scrollbars

* docs: isolate fern preview check failures

* ci: align fern release docs behavior
2026-05-12 18:18:26 -03:00
Johnny Greco
da4875d510
chore: update vulnerable dependencies (#631)
* chore: update vulnerable dependencies

Raise security floors for python-multipart, Jupyter Server, JupyterLab, Mistune, and Notebook according to the May 2026 scanner guidance.

Regenerate uv.lock so the workspace resolves patched versions for the notebooks/docs and MCP dependency paths.

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

* chore: scope CVE floors to direct deps

---------

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-12 16:06:58 -04:00
Andre Manoel
2c6e6b5e0f
docs: add plan for workflow chaining (#552)
* docs: add plan for workflow chaining and allow_resize removal

Proposes replacing the in-place allow_resize mechanism with a Pipeline
class that chains multiple generation stages. Each stage gets a fresh
fixed-size tracker, and resize becomes a between-stage concern.

* docs: reframe plan - chaining is the primary goal, allow_resize removal is secondary

* docs: add to_config_builder convenience method and concrete use cases

* docs: address review feedback - data contract, resume safety, seed controls, edge cases

* docs: refresh plan against current main - deprecation already shipped, fingerprint feature available

- Update allow_resize framing: now logs DeprecationWarning and falls back to sync (#553), no longer hard-rejected. Async is default as of #592.
- Reference DataDesignerConfig.fingerprint() (#587) as the per-stage hash for resume invalidation.
- Rename _validate_async_compatibility() to _resolve_async_compatibility() to match current code.
- Mark Phase 2 step 1 as done; list the concrete docs that still need updates.

* docs: bake parallel-async carefulness into the plan - throttle invariant, on-disk handoffs, DAG-ready, acreate sidecar

- Resolve in-memory vs on-disk handoff to always-on-disk inside Pipeline; reserve in-memory for to_config_builder() notebook ergonomic.
- Add Composability section: parent DataDesigner reuse is a load-bearing API contract for throttle coordination across stages and parallel branches.
- Add Engine API surface section: acreate() as a small additive sidecar, independent of chaining v1 but a hard dependency for Phase 4.
- Promote DAG semantics from "future work" to "designed-in"; add Phase 4 (parallel branches via asyncio.gather over acreate); demote auto-chaining to Phase 5.
- New Resolved decisions section captures the three load-bearing API decisions; trim the Open questions list accordingly.
- Mention possible future external orchestration only as a vague composability constraint, no commitment.

* docs: align plan framing with cross-process orchestration discussion

- Soften "Door open for external orchestration" - drop throttle-backend-as-seam framing; cross-reference Future considerations.
- Make acreate() scope explicit (in-process); cross-process orchestration is not the same problem.
- Add Phase 4 scope clarifier - branch parallelism, not stage pipelining.
- New Future considerations section: external orchestration (vague, uncommitted) and pipelined execution of dependent stages.

* docs: address workflow chaining review comments

* docs: tighten workflow chaining resume semantics

* docs: validate callback seed paths on resume

* docs: define empty pipeline stage results

* docs: clarify composite workflow plan

* docs: require unique composite workflow stages
2026-05-12 12:56:04 -03:00
Nabin Mulepati
bbcd7d3995
fix: harden resume checkpoint handling (#624)
* fix: harden resume checkpoint handling

Persist config identity in metadata, make checkpoints atomic, and reject unsafe resume states so interrupted runs do not mix incompatible or post-processed data.

* fix: close resume edge cases

Let IF_POSSIBLE start fresh for resize configs and mark after-generation processing before mutation so interrupted processors cannot be resumed unsafely.

* refactor: drop dataset directory lock

Single-user CLI/notebook flows don't race on the artifact directory, and
the timestamped-directory fallback already handles the "ran it twice"
case. The lock added complexity (re-entrancy, stale cleanup, the
cached-property trap where IF_POSSIBLE→NEVER moves writes to a
timestamped directory while the lock stays pinned to the original) for
no real protection. Atomic metadata writes still cover the actual hazard
(crash mid-write).
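
The atomic-write idea referred to here can be sketched roughly as follows. This is an illustration under assumptions, not the project's code: the helper name `write_json_atomically` and the temp-file naming are hypothetical. It relies on `os.replace`, which swaps the file in one step when source and destination live on the same filesystem.

```python
import json
import os
import tempfile
from pathlib import Path


def write_json_atomically(path: Path, payload: dict) -> None:
    """Write JSON so readers see either the old file or the new one, never a partial file."""
    # Hypothetical helper: the temp file is created next to the target so that
    # os.replace stays on a single filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, prefix=path.name, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(payload, handle)
            handle.flush()
            os.fsync(handle.fileno())
        # The rename is the commit point: a crash before it leaves the old file intact.
        os.replace(tmp_name, path)
    except BaseException:
        # Remove the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

A crash at any point before `os.replace` leaves the previous `metadata.json` untouched, which is the "crash mid-write" hazard the message describes.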

Also fix a pre-existing test bug in
test_initial_actual_num_records_uses_actual_parquet_rows_for_partial_row_group
where the mocked scheduler hit the partial-completion path with
unconfigured Mock attributes.

* fix: address Greptile review on resume edge cases

* Drop the unreachable ResumeMode.IF_POSSIBLE branch in
  _post_generation_processed_resume_result. By the time this helper
  runs, build() has normalised IF_POSSIBLE to ALWAYS or NEVER, so the
  guard now matches reality. Tighten the docstring to document the
  three outcomes (no-op return / fall through / raise).

* Split the post-processed extension/raise into two cases. When
  num_records < prior_target the user just asked for fewer records than
  already exist; the previous "would mix pre- and post-processor
  records" message only describes the extension case. Mirror the
  wording used by _load_resume_state and add a regression test.

* Remove the dead _find_completed_row_group_ids wrapper now that
  _build_async uses _find_completed_row_groups directly. Rename the
  related test to match.

* refactor: unify sync + async resume around filesystem-derived progress

Both engines now derive `num_completed_batches` and `actual_num_records`
from `parquet-files/batch_*.parquet` via `_recover_progress_from_disk`.
`metadata.json` keeps describing the run *configuration* (`buffer_size`,
`target_num_records`, `original_target_num_records`, config fingerprint),
while the filesystem is the source of truth for *progress*. This closes
the sync engine's race window between `move_partial_result_to_final_file_path`
and the metadata write that follows it, matching the crash-recovery the
async engine already had.

The sync engine additionally rejects non-contiguous batch IDs (a hole can
only mean external mutation or a directory written by an incompatible
engine); the async engine continues to tolerate gaps from out-of-order
completion via `allow_holes=True`.

Existing sync resume tests now seed parquet files alongside metadata,
and two new tests cover the unified behaviour: filesystem progress wins
when metadata lags, and sync rejects non-contiguous IDs.
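
A minimal sketch of filesystem-derived progress, assuming batch files are named `batch_<digits>.parquet` and that IDs start at 0. The function name `find_completed_batch_ids` is hypothetical and is not the project's `_recover_progress_from_disk`; only the `allow_holes` flag is taken from the message above.

```python
import re
from pathlib import Path

# Assumed filename layout: batch_<digits>.parquet
_BATCH_PATTERN = re.compile(r"^batch_(\d+)\.parquet$")


def find_completed_batch_ids(parquet_dir: Path, allow_holes: bool = False) -> list[int]:
    """Return the sorted batch IDs found on disk, optionally rejecting gaps."""
    ids = sorted(
        int(match.group(1))
        for entry in parquet_dir.iterdir()
        if (match := _BATCH_PATTERN.match(entry.name))
    )
    # Sync-style check: IDs must be exactly 0..n-1 with no holes.
    if not allow_holes and ids != list(range(len(ids))):
        raise ValueError(f"Non-contiguous batch IDs on disk: {ids}")
    return ids
```

With `allow_holes=True` the scan tolerates gaps left by out-of-order completion, which mirrors the async behaviour described above. The default strict mode mirrors the sync engine's rejection of holes.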

* docs: clarify DatasetCreationResults observability scope on resume

`load_dataset`, `count_records`, `load_analysis`, `export`, and `push_to_hub`
all read from the artifact directory, so they reflect the cumulative dataset
(original + resume rows). `task_traces`, model-usage logs, and telemetry
events are scoped to the current invocation only because the original run's
in-memory state is not persisted. Document this in the class docstring,
the architecture note, and the Fern resume guide.

* docs: explain DeprecationWarning re-raise in create()/preview()

Future readers were puzzled by the ``except DeprecationWarning: raise``
short-circuits before the generic generation-error wrappers. Add a
comment in ``create()`` (with a back-reference from ``preview()``) to
record that strict warning filters (``pytest.warns``,
``-W error::DeprecationWarning``) turn the engine's
``warnings.warn(..., DeprecationWarning)`` calls — most notably the
``allow_resize=True`` deprecation in ``_resolve_async_compatibility`` —
into raised exceptions, and we want them to surface untouched instead of
being swallowed by ``DataDesignerGenerationError``.
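
The ordering issue can be illustrated with a small stand-alone sketch. `GenerationError`, `deprecated_step`, and this `create` are hypothetical stand-ins rather than the project's classes. The point is that `DeprecationWarning` is an `Exception` subclass, so a generic `except Exception` wrapper would catch it whenever warning filters escalate it to an error.

```python
import warnings


class GenerationError(Exception):
    """Stand-in for a generic generation-error wrapper."""


def deprecated_step() -> None:
    # Stand-in for an engine call that emits a deprecation warning.
    warnings.warn("allow_resize=True is deprecated", DeprecationWarning, stacklevel=2)


def create() -> str:
    try:
        deprecated_step()
        return "ok"
    except DeprecationWarning:
        # Under strict filters the warning arrives as a raised exception;
        # let it surface untouched instead of wrapping it.
        raise
    except Exception as exc:
        raise GenerationError("generation failed") from exc
```

Without the first `except` clause, running under `-W error::DeprecationWarning` would produce `GenerationError` instead of the original warning.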

* fix: close after-generation crash window and tighten metadata typing on resume

Address review feedback on resume hardening:

* Run after-generation processors unconditionally on the on-disk dataset
  rather than gating on the generation return value. The previous gate
  silently skipped after-generation when resume saw every row group
  already on disk, leaving a crash window between the final parquet write
  and the ``post_generation_state="started"`` marker write: in that
  window the dataset is complete but after-generation never ran, and the
  on-disk parquet files are still clean. The "started" short-circuit
  still rejects the other direction (crashed mid-rewrite, ambiguous
  state), so resume only re-runs after-generation when it is safe to do
  so.

* Raise ``DatasetGenerationError`` (instead of letting a raw
  ``TypeError`` leak out of ``num_records < prior_target``) when a
  post-processed dataset's metadata is missing ``target_num_records``.
  Mirrors the wording used by ``_load_resume_state``.

* Document the new behaviour in ``architecture/dataset-builders.md`` and
  the Fern resume invariants.

Tests:

* ``test_build_resume_complete_dataset_runs_after_generation_when_no_marker``
  covers the closed crash window via the public ``set_processor_runner``
  API.
* ``test_build_resume_post_generation_processed_missing_target_raises_clearly``
  covers the typed-error gap.
2026-05-11 11:44:46 -06:00
Nabin Mulepati
4b93f5b245
feat: let column configs declare all model aliases for the startup health check (#626)
* feat(engine): let column configs declare all model aliases for the startup health check

Plugin column configs that depend on more than one model alias (generator + judge,
critic, etc.) previously could not opt their secondary aliases into the standard
startup health check, and configs without a `model_alias` field crashed the
collection loop with AttributeError.

Add `SingleColumnConfig.get_model_aliases()` as the single override hook the
builder uses to enumerate aliases. The default returns the column's primary
`model_alias` (if any), so built-in LLM, embedding, and image columns work
unchanged. `CustomColumnConfig` overrides it to surface decorator-declared
aliases, replacing the special-case `isinstance` branch in the builder. Plugin
configs with multiple model fields override it to opt every endpoint into the
health check.

Fixes #606

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* fix(config): forward empty model_alias to startup health check

SingleColumnConfig.get_model_aliases() used `if alias` to filter, which
also dropped empty-string aliases. Empty model_alias values are accepted
by the config model and previously reached run_health_check, where they
failed fast with "No model config with alias '' found!". Treating them
as "no model endpoints" silently delayed that error to first generation.

Use `alias is not None` so only a truly missing attribute skips the
health check, and add a regression test that exercises an empty-string
model_alias on a built-in config.
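
A rough sketch of the override hook, using plain Python classes as stand-ins for the real config models. All class names ending in `Sketch` and the `collect_aliases` helper are hypothetical, and the de-duplication in the collector is an assumption.

```python
class SingleColumnConfigSketch:
    """Stand-in base class; not the project's real SingleColumnConfig."""

    def get_model_aliases(self) -> list[str]:
        # Default: report the primary alias if the config has such a field.
        alias = getattr(self, "model_alias", None)
        # `is not None` keeps an empty-string alias so the health check can reject it early.
        return [alias] if alias is not None else []


class LLMColumnSketch(SingleColumnConfigSketch):
    def __init__(self, model_alias: str) -> None:
        self.model_alias = model_alias


class SeedColumnSketch(SingleColumnConfigSketch):
    """A config without any model_alias field."""


class JudgedColumnSketch(SingleColumnConfigSketch):
    """Hypothetical plugin config with two model fields."""

    def __init__(self, generator_alias: str, judge_alias: str) -> None:
        self.generator_alias = generator_alias
        self.judge_alias = judge_alias

    def get_model_aliases(self) -> list[str]:
        # Opt every endpoint into the startup health check.
        return [self.generator_alias, self.judge_alias]


def collect_aliases(configs: list[SingleColumnConfigSketch]) -> list[str]:
    """Enumerate aliases via the hook, preserving first-seen order (assumed behaviour)."""
    seen: list[str] = []
    for config in configs:
        for alias in config.get_model_aliases():
            if alias not in seen:
                seen.append(alias)
    return seen
```

Because the collector only calls `get_model_aliases()`, a config with no `model_alias` attribute contributes nothing instead of raising `AttributeError`.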

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

---------

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-11 11:33:50 -06:00
Andre Manoel
16db8d61fa
fix(config): update OpenRouter vision model id (#630)
* fix(config): update OpenRouter vision model id

* fix(ci): harden provider health checks

* fix(config): use nano omni for OpenRouter vision

* docs: warn about hosted provider data handling

* fix(config): align OpenRouter vision params
2026-05-11 13:49:17 -03:00
Nabin Mulepati
405ddda698
chore(engine): rename correction-step counter for clarity (#627)
The local counter `curr_num_correction_steps` in `ModelFacade.generate`
and `agenerate` was incremented before each parse attempt, so it
actually counted parse attempts (initial + corrections), not corrections
taken. The name suggested the latter, which made the surrounding
`<= max_correction_steps` check read like a possible off-by-one even
though the math worked out.

Rename it to `parse_attempts` and add a short comment describing the
semantics. Pure refactor: no behavior change.
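
The counting semantics can be sketched as below. This is a simplified loop written for illustration, not the body of `ModelFacade.generate`. The `next_response` and `parse` callables are assumptions.

```python
from typing import Callable


def parse_with_corrections(
    next_response: Callable[[], str],
    parse: Callable[[str], int],
    max_correction_steps: int,
) -> tuple[int, int]:
    """Return (parsed value, number of parse attempts used)."""
    parse_attempts = 0
    while True:
        # Incremented before each parse, so it counts the initial attempt plus corrections.
        parse_attempts += 1
        try:
            return parse(next_response()), parse_attempts
        except ValueError:
            # Corrections taken so far equal parse_attempts - 1, so one more
            # correction is allowed while parse_attempts <= max_correction_steps.
            if parse_attempts <= max_correction_steps:
                continue
            raise
```

With `max_correction_steps=2` the loop makes at most three parse attempts, which is why the `<=` comparison is correct even though it can look like an off-by-one.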

Closes #371
2026-05-11 09:29:03 -06:00
Johnny Greco
320e723823
validate subcategory parent sampler type (#628)
(cherry picked from commit ec9eeeae1f7f7d5a7cf6ab58579d057099f2da81)

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-11 09:37:37 -04:00
Przemysław Boruta
810c681f7a
feat: resume interrupted dataset generation runs (sync + async engine) (#526)
* docs: add implementation plan for resume mechanism

Fixes #525

* feat(storage): add resume flag and clear_partial_results()

- ArtifactStorage gains a `resume: bool = False` field
- resolved_dataset_name skips timestamp logic when resume=True,
  returning the existing dataset folder name as-is
- Raises ArtifactStorageError on resume=True when the target folder
  is absent or empty (no data to resume from)
- New clear_partial_results() removes in-flight partial results
  left over from an interrupted run

Fixes #525

* feat(batch-manager): add start_batch param to start()

DatasetBatchManager.start() now accepts:
- start_batch: int = 0  — first batch index to process
- initial_actual_num_records: int = 0  — records already on disk

Both default to 0 so all existing call sites are unaffected.

Fixes #525

* feat(builder): implement resume logic in DatasetBuilder

- build() gains a resume: bool = False parameter
- _load_resume_state() reads metadata.json and validates that
  num_records and buffer_size match the original run
- _build_with_resume() skips completed batches, clears in-flight
  partial results, and continues from the first incomplete batch
- Raises DatasetGenerationError with clear messages for:
  - missing metadata.json (interrupted before first batch completes)
  - num_records mismatch
  - buffer_size mismatch
  - DATA_DESIGNER_ASYNC_ENGINE=1 (not yet supported)
- Logs a warning and returns early when dataset is already complete

Fixes #525

* feat(interface): expose resume on DataDesigner.create()

- create() gains resume: bool = False
- _create_resource_provider() passes resume to ArtifactStorage
- builder.build() receives the resume flag

Fixes #525

* test: add tests for resume mechanism

Covers:
- ArtifactStorage.resolved_dataset_name with resume=True
- ArtifactStorage.clear_partial_results()
- DatasetBatchManager.start() with start_batch and
  initial_actual_num_records
- DatasetBuilder.build(resume=True): missing metadata, num_records
  mismatch, buffer_size mismatch, already-complete detection

Fixes #525

* feat(builder): extend resume to async engine (DATA_DESIGNER_ASYNC_ENGINE=1)

- Add _find_completed_row_group_ids() to scan parquet-files/ for already-written
  row groups by parsing batch_*.parquet filenames
- _build_async() now accepts resume=True: loads metadata, finds completed row groups,
  clears partial results, and logs progress; returns early if all row groups are done
- _prepare_async_run() accepts skip_row_groups, initial_actual_num_records, and
  initial_total_num_batches so the scheduler only processes remaining row groups
  and RowGroupBufferManager starts from the correct counts
- RowGroupBufferManager.__init__ gains initial_actual_num_records and
  initial_total_num_batches params to seed the counters on resume
- finalize_row_group closure now writes incremental metadata after each checkpoint
  so any run (resume or not) can be resumed if interrupted mid-way
- Remove the guard that rejected resume=True with DATA_DESIGNER_ASYNC_ENGINE=1
- Add tests for all new paths

* fix(builder): skip after-generation processors when resume finds dataset already complete

_build_with_resume and _build_async now return False when the dataset is already
complete (early-return path), True otherwise. build() skips
_processor_runner.run_after_generation() on False, preventing processors from
calling shutil.rmtree and rewriting an already-finalized dataset.

Fixes the issue raised in review: greptile P1 comment on PR #526.

* fix(builder): use filesystem count for initial_total_num_batches on async resume

Metadata can lag by one row group if a crash occurs between
move_partial_result_to_final_file_path and write_metadata. Using
len(completed_ids) from the filesystem scan instead of
state.num_completed_batches ensures the final metadata reflects the
actual number of parquet files present, not the potentially stale
metadata count.

* feat(results): add export() method and --output-format CLI flag

Adds DatasetCreationResults.export(path, format=) supporting jsonl,
csv, and parquet. The CLI create command gains --output-format / -f
which writes dataset.<format> alongside the parquet batch files.

* fix(builder): handle resume when metadata.json missing (interrupted before first batch)

When a run is interrupted before any row group or batch completes, metadata.json
is never written. Previously resume=True would raise DatasetGenerationError in
this case. Now build() detects the missing file, logs an info message, clears
any leftover partial results and falls back to a clean fresh run.

This is the common scenario for small datasets (fewer records than buffer_size)
where all records fit in a single row group.

* docs(interface): fix resume docstring — async engine is supported

* fix(builder): derive initial_actual_num_records from filesystem in async resume

In the crash window (row group written to disk but write_metadata crashed before
updating the file), both initial_total_num_batches and initial_actual_num_records
now use the filesystem-discovered completed_ids as source of truth. Previously
initial_actual_num_records was read from potentially stale metadata, causing
actual_num_records in the final metadata to be undercounted by one row group.

Also adds a test covering the partial-resume crash-window scenario.

* feat(resume): replace resume: bool with ResumeMode enum (NEVER/ALWAYS/IF_POSSIBLE)

- Introduces ResumeMode(StrEnum) in artifact_storage.py for use across all layers
- Replaces resume: bool with resume: ResumeMode in DatasetBuilder.build(),
  DataDesigner.create(), ArtifactStorage, and _build_async()
- Adds _check_resume_config_compatibility() using config fingerprints to support
  IF_POSSIBLE: falls back to a fresh run when config has changed since last run
- Relaxes num_records validation from strict equality to num_records >= actual_num_records,
  allowing dataset extension on resume; buffer_size must still match exactly
- Preserves exception chain with 'from exc' on FileNotFoundError in _load_resume_state
- Exports ResumeMode from data_designer.interface for users to import
- Adds skip_row_groups assertion test and IF_POSSIBLE storage behavior tests

* fix(resume): invalidate resolved_dataset_name cache when IF_POSSIBLE downgrades to NEVER

ArtifactStorage's Pydantic model validator accesses base_dataset_path at
construction time, caching resolved_dataset_name under IF_POSSIBLE semantics
before build() can set resume=NEVER. Pop the stale cache entry so the property
re-resolves with the correct NEVER semantics (timestamped directory).

Also fixes _check_resume_config_compatibility() to use artifact_path/dataset_name
directly instead of base_dataset_path, and adds a regression test covering the
cache-bypass scenario.
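
The cache invalidation described in this commit can be illustrated with `functools.cached_property`, which stores its result in the instance `__dict__`. Whether the project uses exactly this decorator is an assumption. `StorageSketch`, `downgrade_to_never`, and the string mode values are hypothetical.

```python
from functools import cached_property


class StorageSketch:
    """Stand-in for a storage object whose dataset name depends on the resume mode."""

    def __init__(self, resume: str) -> None:
        self.resume = resume

    @cached_property
    def resolved_dataset_name(self) -> str:
        # Resuming reuses the existing folder; a fresh run gets a timestamped one.
        return "dataset" if self.resume != "never" else "dataset_<timestamp>"


def downgrade_to_never(storage: StorageSketch) -> None:
    """Switch to a fresh run and drop the value cached under the old mode."""
    storage.resume = "never"
    # cached_property keeps its value in the instance __dict__; popping it
    # forces the next access to re-resolve under the new mode.
    storage.__dict__.pop("resolved_dataset_name", None)
```

Changing `resume` alone is not enough: once the property has been read, the cached value is returned until the entry is removed.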

* fix(builder): move partial-completion warning before return in _build_async

* fix(builder): IF_POSSIBLE now starts fresh when no dataset directory exists

_check_resume_config_compatibility returned True when config_path was absent,
even when the dataset directory itself didn't exist. This caused IF_POSSIBLE to
upgrade to ALWAYS, which then raised ArtifactStorageError on the first-ever run
because ALWAYS requires an existing directory.

Fix: return False early when the dataset directory is absent. Also sets
actual_num_records on mock buffer managers in two async resume tests that
started failing after the partial-completion warning block was made reachable.

* fix(builder): use original target_num_records in async resume record count

When extending a non-aligned run (e.g. original num_records=5, buffer_size=2),
the last completed row group has 1 record, not buffer_size=2. Using new num_records
in the formula would overcount: min(2, 7-2*2)=2 instead of min(2, 5-2*2)=1.

Fix: capture state from _load_resume_state (previously discarded) and pass
state.target_num_records into the sum formula. Added target_num_records field to
_ResumeState, populated from metadata.json.

Test: test_build_async_resume_initial_actual_num_records_uses_original_target
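
The arithmetic in this fix can be checked with a short sketch. The function name `completed_record_count` is hypothetical. The per-group formula is the one quoted in the message.

```python
def completed_record_count(
    completed_ids: list[int], buffer_size: int, target_num_records: int
) -> int:
    """Sum the sizes of completed row groups for a run planned with the given target."""
    return sum(
        min(buffer_size, target_num_records - rg_id * buffer_size)
        for rg_id in completed_ids
    )
```

For an original run with `num_records=5` and `buffer_size=2`, the three completed groups hold 2, 2, and 1 records. Passing the original target gives 5, while passing an extended target of 7 would count the last group as 2 and report 6.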

* fix(builder): IF_POSSIBLE starts fresh on empty dataset directory

Empty directory (crash between mkdir and first file write) was treated as
compatible — _check_resume_config_compatibility returned True, IF_POSSIBLE
upgraded to ALWAYS, which then raised ArtifactStorageError.

Fix: treat empty directory the same as missing — return False from
_check_resume_config_compatibility when any(dir.iterdir()) is False.

Test: test_if_possible_starts_fresh_when_directory_is_empty

* fix(builder): ALWAYS raises DatasetGenerationError on config fingerprint mismatch

ResumeMode.ALWAYS was documented to raise when column/model config changed, but
_check_resume_config_compatibility() was only called in the IF_POSSIBLE branch.
A user resuming with ALWAYS after changing the config would silently mix records
from two different configs.

Fix:
- Refactor _check_resume_config_compatibility() to return _ConfigCompatibility
  enum (COMPATIBLE / INCOMPATIBLE / NO_PRIOR_DATASET) instead of bool so callers
  can distinguish 'no prior run' from 'configs differ'
- Call the check for both ALWAYS and IF_POSSIBLE before _write_builder_config()
- ALWAYS + INCOMPATIBLE → DatasetGenerationError
- IF_POSSIBLE + INCOMPATIBLE → silent fresh start (existing behaviour)
- IF_POSSIBLE + NO_PRIOR_DATASET → silent fresh start (existing behaviour)

Test: test_build_resume_always_raises_on_config_mismatch
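
The decision table above can be sketched as a small function. The enum values, the function name `resolve_resume_mode`, and the use of `ValueError` in place of `DatasetGenerationError` are assumptions made for a self-contained example. The ALWAYS + NO_PRIOR_DATASET case is not spelled out in this message, so the sketch leaves the mode unchanged there.

```python
from enum import Enum


class ResumeMode(str, Enum):
    NEVER = "never"
    ALWAYS = "always"
    IF_POSSIBLE = "if_possible"


class ConfigCompatibility(Enum):
    COMPATIBLE = "compatible"
    INCOMPATIBLE = "incompatible"
    NO_PRIOR_DATASET = "no_prior_dataset"


def resolve_resume_mode(requested: ResumeMode, found: ConfigCompatibility) -> ResumeMode:
    """Map the requested mode and the compatibility check onto the effective mode."""
    if requested is ResumeMode.NEVER:
        return ResumeMode.NEVER
    if found is ConfigCompatibility.COMPATIBLE:
        return ResumeMode.ALWAYS
    if requested is ResumeMode.IF_POSSIBLE:
        # INCOMPATIBLE or NO_PRIOR_DATASET: silent fresh start.
        return ResumeMode.NEVER
    if found is ConfigCompatibility.INCOMPATIBLE:
        # Stand-in for DatasetGenerationError.
        raise ValueError("config changed since the original run; refusing to resume")
    # ALWAYS + NO_PRIOR_DATASET: left unchanged in this sketch (assumption).
    return ResumeMode.ALWAYS
```

Returning a three-valued enum from the check is what lets the caller tell "no prior run" apart from "configs differ", which a plain bool could not express.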

* fix(resume): address nabinchha review — drop export collision, add CLI flag, fix edge cases

C1: drop commit 0bdf24ab — remove export() / --output-format from this PR; that feature
    belongs to #540 which has a superior streaming implementation
C2: add --resume / -r flag to data-designer create CLI, thread ResumeMode through
    GenerationController.run_create() into DataDesigner.create()
C3: fix already-complete warning text — replace stale "Remove resume=True" with
    "Use resume=ResumeMode.NEVER" in _build_with_resume and _build_async
C4: fix docstrings — ALWAYS does NOT raise when no checkpoint exists (silently
    restarts from scratch); clarify num_records >= actual semantics
C5: sync artifact_storage.resume = NEVER when no-metadata fallback fires so both
    state holders agree after the downgrade
C6: fix return_value=False → _ConfigCompatibility.INCOMPATIBLE in IF_POSSIBLE test;
    drop 3 direct _find_completed_row_group_ids tests (private API, covered by build())
W1: add logger.warning when builder_config.json is absent (silent COMPATIBLE was a footgun)
W2: narrow except Exception → (OSError, json.JSONDecodeError, ValidationError)
W3: run make check-all-fix — ruff reformatted test_if_possible_starts_fresh_when_directory_is_empty

* fix(builder): replace stdlib StrEnum with project compat shim for Python 3.10

* fix(builder): guard extension row groups in initial_actual_num_records formula on async resume

When extending an async run (num_records > state.target_num_records) and a crash
occurs after an extension row group is written to disk but before write_metadata,
the formula `min(buffer_size, state.target_num_records - rg_id * buffer_size)` yields
a zero or negative value for any extension row group (rg_id * buffer_size >= target), making
initial_actual_num_records silently undercount. The RowGroupBufferManager then starts
at the wrong offset, and the final metadata reports an incorrect actual_num_records
with a false partial-completion warning.

Fix: use state.target_num_records for original row groups and num_records for extension
row groups (guarded by rg_id * buffer_size < state.target_num_records). Covers the
scenario with a new regression test.

* fix(builder): pre-compute row-group list in _build_async to fix sizes on non-aligned extension resume

The partitioning loop in _prepare_async_run decremented remaining by
min(buffer_size, remaining) for every row group, including skipped ones.
For a non-aligned original run (e.g. target=5, buffer_size=2, last group
has 1 record), the loop deducted 2 for the skipped last group, leaving
remaining one short. Extension row groups received smaller sizes than
intended, so the generated dataset was silently short by the deficit and
a false partial-completion warning fired.

Fix: pre-compute the full row-group list with correct per-group sizes in
_build_async where state.target_num_records is available, then pass it to
_prepare_async_run as precomputed_row_groups (replacing the skip_row_groups
param). Original groups use min(buffer_size, target - rg*bs); extension
groups use min(buffer_size, extension_records - ext_idx*bs).

Also updates the skip_row_groups test to assert on precomputed_row_groups
and adds a regression test for the non-aligned extension case.
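
The pre-computation can be sketched as follows, using the two per-group formulas quoted above. The function name `plan_row_groups` is hypothetical, and the guard against shrinking the target reflects a separate fix in this PR, not part of this commit.

```python
import math


def plan_row_groups(original_target: int, num_records: int, buffer_size: int) -> list[int]:
    """Return the size of every row group: original groups first, then extension groups."""
    if num_records < original_target:
        # Guard added for the sketch; a separate fix in this PR raises in this case.
        raise ValueError("num_records is below the original target")
    num_original_groups = math.ceil(original_target / buffer_size)
    sizes = [
        min(buffer_size, original_target - rg * buffer_size)
        for rg in range(num_original_groups)
    ]
    extension_records = num_records - original_target
    num_extension_groups = math.ceil(extension_records / buffer_size)
    sizes += [
        min(buffer_size, extension_records - ext_idx * buffer_size)
        for ext_idx in range(num_extension_groups)
    ]
    return sizes
```

For `original_target=5`, `buffer_size=2`, and an extension to 7 records, the plan is `[2, 2, 1, 2]`: the short third group keeps its original size and the extension gets a full group of its own, so the sizes sum to the requested 7.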

* chore: remove stale implementation plan for #525

The plan described the initial resume: bool design which has since been
replaced by the full ResumeMode enum (NEVER/ALWAYS/IF_POSSIBLE), async
engine support, filesystem reconciliation, and config compatibility checks.
The PR description is the authoritative record of what shipped.

* fix(engine): fix false 'already complete' when extension fits in last group's slack

original_target=5, buffer_size=2 produces 3 groups [2,2,1]. Extending to
num_records=6: ceil(6/2)=3 equalled len(completed_ids)=3, triggering the
already-complete branch on both the async and sync paths — returning the
5-record dataset silently.

Fix (async): replace ceil(num_records/bs) with
  num_original_groups + ceil(extension_records/bs)
so any extension always adds new groups beyond num_original_groups.

Fix (sync): add num_records_list param to DatasetBatchManager.start() and
pass the correct per-batch sizes in _build_with_resume, giving the batch
manager the right total batch count (4 instead of 3 in the example).
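
The difference between the two group counts can be checked with a short sketch. Both function names are hypothetical. The formulas are the ones given in the message.

```python
import math


def naive_group_count(num_records: int, buffer_size: int) -> int:
    """Count that ignores the short last group of the original run."""
    return math.ceil(num_records / buffer_size)


def resume_group_count(original_target: int, num_records: int, buffer_size: int) -> int:
    """Original groups plus the groups needed for the extension records."""
    num_original_groups = math.ceil(original_target / buffer_size)
    extension_records = num_records - original_target
    return num_original_groups + math.ceil(extension_records / buffer_size)
```

For `original_target=5`, `buffer_size=2`, and `num_records=6`, the naive count is 3, which matches the 3 groups already on disk and wrongly signals completion. The resume-aware count is 4, so the extra record lands in a new group.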

* fix(engine): raise error when num_records is below original target on resume

Prevents negative extension_records in async path which silently truncated
the dataset and corrupted metadata without triggering a partial-completion warning.

* fix(storage): refresh MediaStorage path after IF_POSSIBLE → NEVER downgrade

When build() detected an incompatible config and downgraded resume from
IF_POSSIBLE to NEVER, _media_storage.base_path remained bound to the
original directory while all other path properties resolved to the new
timestamped directory — causing broken image references in image-column runs.

* fix(engine): preserve original_target_num_records across extension resume writes

After finalize_row_group successfully wrote incremental metadata during an
extension run, target_num_records in metadata was updated to the extension
target. A subsequent resume would read this as the original target, making
_rg_size() incorrect for all row groups and silently corrupting actual_num_records.

Stores original_target_num_records as an immutable field in metadata so the
original group boundaries are always recoverable regardless of how many
incremental writes have occurred.

---------

Co-authored-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-08 15:37:56 -06:00
dependabot[bot]
eb0b9d3226
ci: bump NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml (#621)
Bumps the all-actions group with 1 update: [NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml](https://github.com/nvidia-nemo/fw-ci-templates).


Updates `NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml` from 0.94.1 to 1.1.0
- [Release notes](https://github.com/nvidia-nemo/fw-ci-templates/releases)
- [Changelog](https://github.com/NVIDIA-NeMo/FW-CI-templates/blob/main/CHANGELOG.md)
- [Commits](211c302d64...2dee428461)

---
updated-dependencies:
- dependency-name: NVIDIA-NeMo/FW-CI-templates/.github/workflows/_semantic_pull_request.yml
  dependency-version: 1.1.0
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: all-actions
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-05-08 17:10:38 -03:00
Andre Manoel
6cbbb7d29b
fix: validate subcategory parents are sampler columns (#614)
* fix: validate subcategory parents are sampler columns

Subcategory sampler columns require a category-sampler parent. When the
parent column was a non-sampler type (e.g. llm-text), validation failed
deep inside the sampler-only DataSchema with the misleading message
"Column 'X' not found in schema" - the column does exist in the user's
config, just not in the sampler subset.

Add a model validator on DataDesignerConfig that has visibility into all
column types and raises a precise error naming the parent's actual
column type.

* fix: address PR review feedback on subcategory parent validator

- Tighten the error message to match what the validator actually checks
  ("sampler columns with sampler_type='category'") and mirror the wording
  the engine-level companion validator at schema.py uses.
- Rename `_check_subcategory_parents_are_samplers` to
  `_validate_subcategory_parents` to follow STYLEGUIDE convention
  (`validate_*` for check-style validators).
- Add a regression test pinning the deliberate scope: when the parent
  column name does not exist at all, this validator does not raise and
  defers to the existing engine-level "Column not found" path.

* chore: trim trailing whitespace introduced in merge resolution
2026-05-07 23:37:58 -03:00
Lawrence Lane
fba8f0b1b0
fix(docs): unbreak published Fern site (#615)
* fix(docs): shrink inline base64 images in notebook HTML outputs

The image notebooks (5, 6) emit `IPython.display.HTML` blocks containing
inline `data:image/png;base64,...` URIs to render side-by-side image
grids. Those bypassed the existing `image/png` MIME shrinker and shipped
multi-MB strings through the `text/html` branch, producing 1.8 MB and
4.6 MB .ts modules. Fern's hosted SSR bundler couldn't render the
version, taking every page down with a Server Components error.

Add `shrink_inline_b64_in_html()` so the html branch resizes embedded
base64 images through the same 800px JPEG q=82 path the standalone
image branch already uses. Apply in-place to the committed bundles:
notebook 5 1.8 MB → 423 KB, notebook 6 4.6 MB → 1.3 MB. Other outputs
preserved.
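The html-branch rewrite above can be sketched roughly as follows. This is not the project's `shrink_inline_b64_in_html()`; the function name, signature, and the injected `shrink` callback are assumptions made so the example stays stdlib-only. In the real converter the callback's role would be played by the 800px JPEG q=82 resize path.

```python
import base64
import re
from typing import Callable

# Matches inline PNG data URIs such as data:image/png;base64,iVBORw0...
_DATA_URI = re.compile(r"data:image/png;base64,([A-Za-z0-9+/=]+)")


def shrink_inline_b64(html: str, shrink: Callable[[bytes], tuple[str, bytes]]) -> str:
    """Replace each inline PNG data URI with the re-encoded bytes returned by `shrink`.

    `shrink` takes the decoded image bytes and returns (mime_type, new_bytes).
    """

    def _replace(match: re.Match) -> str:
        raw = base64.b64decode(match.group(1))
        mime, smaller = shrink(raw)
        encoded = base64.b64encode(smaller).decode("ascii")
        return f"data:{mime};base64,{encoded}"

    return _DATA_URI.sub(_replace, html)
```

Everything outside the matched data URIs passes through unchanged, which is consistent with the commit's note that other outputs are preserved.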

Signed-off-by: Lawrence Lane <llane@nvidia.com>

* fix(docs): migrate leftover MkDocs tab syntax to Fern Tabs

The agent-rollout-ingestion concept page still used PyMdown
`=== "Title"` tab blocks left over from the MkDocs source. Fern's MDX
runtime doesn't recognize the syntax, breaking the published page.
Convert the five tab blocks to Fern's <Tabs>/<Tab title="..."> JSX
components, preserving titles, intro text, and code snippets verbatim.

Signed-off-by: Lawrence Lane <llane@nvidia.com>

* fix(docs): convert leftover MkDocs admonition to Fern Tip

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-07 18:28:11 -03:00
Eric W. Tramel
417b0c715d
feat(cli): show version update notice (#602) 2026-05-07 15:20:18 -04:00
Johnny Greco
604fdd9602
fix: quote review-code skill argument hint (#616)
Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-07 14:55:10 -04:00
Eric W. Tramel
8d4d59303d
fix: normalize rollout timestamps before deriving started_at/ended_at (#556) 2026-05-07 14:13:10 -04:00
Lawrence Lane
7b5854ca36
docs: migrate documentation from MkDocs to Fern (#581)
* docs: migrate documentation from MkDocs to Fern

Adds a Fern Docs build under fern/ alongside the existing mkdocs site.
Production target docs.nvidia.com/nemo/datadesigner with floating-latest
pointer (latest.yml symlink) at v0.5.8. Migrated all concept, recipe, plugin,
dev-note, and tutorial pages to MDX with NVIDIA theme and custom components
(Authors, MetricsTable, TrajectoryViewer, NotebookViewer, BadgeLinks).
Tutorial notebooks now render via NotebookViewer with captured outputs (text,
DataFrames, inline images) - new make targets generate-fern-notebooks and
generate-fern-notebooks-with-outputs drive the .py -> executed .ipynb -> Fern
JSON+TS pipeline, pinning docs to Python 3.13 to dodge pyarrow wheel issues
on 3.14. Python API reference is configured via Fern `libraries:` pointing at
data-designer-config; output is gitignored and regenerated locally with
'fern docs md generate'.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: add datadesigner-docs agent skill

Captures the patterns established in the Fern migration so agents (and humans)
can maintain fern/ confidently. Modeled after NVIDIA-NeMo/Gym's
nemo-gym-docs SKILL.md, adapted for our floating-latest versioning,
notebook-with-outputs pipeline, dev-notes kit components, and the MDX gotchas
hit during migration (pymdown attr_list, --8<-- snippet syntax, frontmatter
authors-as-JSX-scope-variable, etc.). Routes triggers like "edit docs", "add
doc page", "regenerate notebooks", "update dev note", "add API reference" to
this skill.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: address PR review for Fern migration

- Delete stale fern/versions/_nav_order.yml (references non-existent
  ./versions/latest/pages/ — paths were never updated when latest/ was
  renamed to v0.5.8/, no consumer found in docs.yml or v0.5.8.yml).
- Remove unused custom components: Tag.tsx, CustomCard.tsx, Include.tsx
  (had its own untested markdown parser), ExpandableCode.tsx (broken in
  Fern SSR runtime). Drop expandable-code.css from docs.yml. Authors,
  BadgeLinks, MetricsTable, NotebookViewer, TrajectoryViewer remain
  (each has at least one call site).
- BadgeLinks: remove DEFAULT_BADGES with placeholder URLs; make `badges`
  prop required so we can never accidentally ship 'your-org/your-repo'.
- NotebookViewer: document the XSS trust boundary on output cells of
  format: "html". Outputs flow .py source → jupytext --execute → committed
  *.ts (review boundary). Add an inline comment at the dangerouslySetInnerHTML
  call site pointing back to the trust-model section.
- README: add Windows caveat on the latest.yml symlink — Windows users need
  core.symlinks=true before clone or Fern will reject the version config.
- Makefile: tighten generate-fern-notebooks source probe from `ls .../*.ipynb`
  (which can return success on non-file errors) to `[ -f docs/notebooks/1-the-basics.ipynb ]`,
  matching the reviewer's suggestion.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: address @aschilling-nv review on fern/docs.yml

Three suggestions from the Fern review, all matching Curator's docs.yml
conventions:

- instances[0].url: drop the https:// protocol prefix to match Curator's
  shape (e.g. nemo-curator.docs.buildwithfern.com/nemo/curator).
- logo.href: was '/'; now points at /nemo/datadesigner/getting-started/welcome
  (the actual landing page) so clicking the logo lands on real content
  instead of the bare basepath.
- experimental.basepath-aware: true — opts into Fern's basepath-aware
  routing so internal links don't double-prefix the /nemo/datadesigner
  segment.
- redirects: also fix /nemo/datadesigner/index.html → getting-started/welcome
  (was bouncing to /latest, which is just the version slug); add
  /getting-started → /getting-started/welcome to mirror Curator's
  /home → /home/welcome convention.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: put dev notes overview timestamps on separate lines

Signed-off-by: Kirit93 <kthadaka@nvidia.com>
Made-with: Cursor

* docs: redesign dev-notes index with BlogCard component

Replaces the generic <CardGroup>/<Card> grid (same green icon × 10, date
glued to bottom of description) with a purpose-built BlogCard for the
dev-notes landing page.

Each card now has:
- Hero image (16:9, lazy-loaded, click-to-zoom via Fern's rmiz wrapper)
- ALL-CAPS date eyebrow as proper subtitle styling
- Title, 3-line clamped description
- Author byline at the bottom: avatar stack (overlapping) + first author
  name + "+N", pulling from the existing devnotes/.authors.yml registry
- Hover: NVIDIA-green border + subtle lift

Posts without a hero image fall back to a deterministic hash-based
gradient placeholder + monogram (DJB2 hash of href → HSL hue, with the
muddy-yellow band 40–90° remapped). Same post always gets the same look.
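The deterministic placeholder colour can be illustrated with a small sketch. It is written in Python for consistency with the rest of this log, whereas the component itself is TSX. The 32-bit mask and the way the 40-90 degree band is remapped (a fixed 60 degree shift) are assumptions; the commit only says the band is remapped.

```python
def djb2(text: str) -> int:
    """Classic DJB2 string hash (hash * 33 + char), kept to 32 bits."""
    h = 5381
    for ch in text:
        h = ((h * 33) + ord(ch)) & 0xFFFFFFFF
    return h


def placeholder_hue(href: str) -> int:
    """Map an href to a stable hue in [0, 360), steering clear of 40-90 degrees."""
    hue = djb2(href) % 360
    if 40 <= hue <= 90:
        # Assumption: push hues in the avoided band forward by 60 degrees,
        # landing in 100-150. The real remapping is not given in the commit.
        hue += 60
    return hue
```

Since the hue depends only on the href, the same post always gets the same gradient across builds.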

Notes:
- Image prop is React.ReactNode (not string) — pass <img> JSX from MDX
  so Fern's link rewriter can resolve the src to /_local/... in dev and
  /nemo/datadesigner/assets/... in prod. Raw string props bypass the
  rewriter and 404 in dev.
- Card href runs through a small withBasepath() helper since the <a>
  also bypasses Fern's link rewriter.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: flush blog-card hero images to the top of the card

Fern's prose stylesheet applies a top margin to <img> tags, and the
click-to-zoom wrapper Fern injects around each image (<span data-rmiz>)
inherits that margin too. Result: a ~1rem gap between the card's top
edge and the hero image.

Reset margin/padding on the rmiz wrapper spans + the img itself inside
.blog-card__media so the image renders flush against the top border.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: stop blog-card hero from opening Fern's click-to-zoom modal

When an <img> appears in MDX, Fern auto-wraps it with a click-to-zoom
shell (<span data-rmiz>...). On the dev-notes index that shell intercepts
clicks meant for the card's <a> wrapper, so clicking a hero opens a
lightbox AND tries to navigate.

Set pointer-events: none on the rmiz spans + img inside .blog-card__media
so clicks bubble straight to the parent <a> and the card behaves as a
single, predictable link target. Hover still works because pointer-events
on children doesn't block :hover on the ancestor <a>.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: render notebook markdown at build time with markdown-it-py

Replaces NotebookViewer's hand-rolled JS markdown parser (the one with
the ^@BR^@ sentinel the reviewer flagged as fragile) with build-time
rendering in the converter.

ipynb-to-fern-json.py now uses markdown-it-py (CommonMark + tables +
strikethrough + raw HTML) to render each markdown cell's source into
source_html, mirroring how code cells already store Pygments-highlighted
source_html. NotebookViewer's markdown branch becomes a single
dangerouslySetInnerHTML on the pre-rendered HTML, with a plain-escape
fallback for old snapshots.

Removes the dead JS helpers (renderMarkdown, isSafeUrl, UL_CLASS,
OL_CLASS) — ~60 lines of brittle regex-based markdown parsing.

Fixes broken rendering of:
- Blockquotes (showed literal > characters before)
- Nested content inside blockquotes (e.g. blockquote with bullet list)
- Fenced code blocks
- Tables
- Multi-paragraph list items

Includes regenerated fern/components/notebooks/*.{json,ts} for all 6
tutorials.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs: rewrite recipes index + replace octicons download links with Fern Info callouts

The recipes/cards.mdx page was still in MkDocs Material format:
- <div class="grid cards" markdown> wrapper (no-op in MDX)
- :material-snake:, :material-database:, :material-tools:, etc. (rendered
  as literal text — Fern uses Font Awesome, not Material icons)
- !!! tip Prerequisite (mkdocs admonition syntax)
- [:material-book-open-page-variant: View Recipe] / [Download Code
  :octicons-download-24:] links with embedded icon shortcodes

Rewrite using Fern's native components: <CardGroup cols={2}> with <Card
title icon href> grouped by category (Code Generation, QA and Chat,
Trace Ingestion, MCP and Tool Use, Plugin Development). Each card has
one primary action (the recipe page); download lives on the recipe page
itself.

Replace the trailing "Download Code :octicons-download-24:" link on
every recipe page (and 2 dev notes) with a <Info title="Download Recipe">
callout pointing at the GitHub blob URL — matching PR #215's
convention. 12 occurrences across 12 files.

Also fixes 6 recipe pages whose frontmatter title was "Untitled"
(unfilled placeholder from auto-migration): text_to_python, basic_mcp,
pdf_qa, multi_turn_chat, product_info_qa, agent_rollout_distillation.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs(fern): mirror main's content updates into v0.5.8 MDX pages

Forward-port the doc changes that landed in main since this branch was
cut, translating MkDocs admonition syntax to Fern components. Three
product changes drove the updates:

PR #594 — deprecate implicit default-provider routing:
- concepts/models/configure-model-settings-with-the-cli.mdx: deprecate
  "Change default provider" workflow + inline mark on `data-designer
  config list` output
- concepts/models/custom-model-settings.mdx: warning that `provider=`
  is now required on every ModelConfig
- concepts/models/default-model-settings.mdx: warning that the
  registry-level default-provider concept is deprecated
- concepts/models/model-providers.mdx: same warning at the top of the
  ModelProvider overview
- concepts/models/inference-parameters.mdx: add explicit `provider=
  "openai"` to the dalle ModelConfig example

PR #592 — async engine becomes the default:
- concepts/architecture-and-performance.mdx: rewrite Execution Model
  intro to mention both engines, qualify "How It Works" as sync-engine
  semantics, update Concurrency Formula and Throttle notes from "Sync
  engine caveat" to "Engine paths", and add a full new "## Async
  Engine" section (per-model timeouts, run outcomes / Early Shutdown,
  opt-out via DATA_DESIGNER_ASYNC_ENGINE=0). Add `provider="nvidia"`
  to the my-model example.
- concepts/custom_columns.mdx: note that sync `cell_by_cell`
  generators dispatch concurrently under the async engine; mock with
  `MagicMock(spec=ModelFacade)` so async methods are auto-detected.
- concepts/processors.mdx: warning that the async engine enforces
  row-count invariance in process_before/after_batch.
- devnotes/posts/async-all-the-way-down.mdx: append an "Update" callout
  noting the engine is now default, with a link to the Architecture
  page anchor.

All `!!! warning|note|tip "Title"` admonitions converted to Fern
<Warning|Note|Tip title="..."> components. Internal links to mkdocs
relative paths (`../../concepts/foo.md#anchor`) rewritten to canonical
Fern URLs (`/concepts/foo#anchor`).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs(fern): address @andreatgretel review comments

Four issues from Andre's review pass:

1. /devnotes 404 (index.mdx:23) — section slug is /dev-notes, page slug
   is /dev-notes/overview. Fix the link in the landing page so visitors
   actually reach the dev notes index.

2. TrajectoryViewer.tsx final-answer body shown as literal markdown
   (line 66) — the renderer uses dangerouslySetInnerHTML but
   example-marcia.ts shipped raw markdown (**bold**, \n\n breaks). Visible
   on the deep-research devnote where the trajectory is defaultOpen.
   Pre-render body to HTML in the fixture (matches the original hand-coded
   format pre-migration); document the convention in the ToolCall.body
   doc comment so future fixtures don't regress.

3. Tutorials 5/6 (image generation/editing) ship with 0 captured outputs
   because Flux runs through OpenRouter and OPENROUTER_API_KEY isn't set
   at build time. Cannot regenerate without the key, so add a <Note> at
   the top of each wrapper page pointing readers at the Colab link to
   execute the cells live and see the generated images. Maintainers with
   the key in their environment should re-run
   `make generate-fern-notebooks-with-outputs` before merge to capture
   the snapshots.

4. Legacy nvidia-nemo.github.io/DataDesigner/* URLs in MDX prose (8
   occurrences across 5 files) rewritten to canonical Fern paths so
   visitors don't get sent back to the legacy GitHub Pages site once
   docs.nvidia.com/nemo/datadesigner becomes the production URL:
   - The single deep link in data-designer-got-skills.mdx →
     /concepts/models/default-model-settings
   - All other "documentation home" links (CONTRIBUTING ×2,
     async-all-the-way-down ×2, owning-the-model-stack, design-principles
     ×2) → /getting-started/welcome (the canonical landing slug, matches
     logo.href in docs.yml)

   Notebook .py source URLs are tracked separately as part of the
   notebook-regen work.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs(fern): regenerate notebook snapshots with Flux outputs captured

Re-ran make generate-fern-notebooks-with-outputs with NVIDIA_API_KEY +
OPENROUTER_API_KEY set, now that we have a NVIDIA key with permission
on nemotron-3-nano-30b-a3b. All 6 tutorials regenerated; the two image
tutorials (5 and 6) which had been shipping with 0 outputs now have
captured Flux generations:

  1 the-basics:                12/15 outputs
  2 structured-outputs:        13/17 outputs
  3 seeding-with-a-dataset:    10/13 outputs
  4 providing-images:          13/17 outputs (1 image)
  5 generating-images:          8/10 outputs (2 images) ← was 0/12
  6 image-to-image-editing:     9/12 outputs (10 images) ← was 0/14

The two `<Note title="Run in Colab to see ...">` workarounds I added on
the 5/6 wrapper pages are no longer needed — outputs render inline now.
NotebookViewer's own "Run in Google Colab" banner is still rendered
from the wrapper's `colabUrl` prop, so the live-execute path stays one
click away.

Bumps the diff size noticeably (notebook 6 .ts is ~22MB of base64-
encoded PNGs from 10 edited images), but that's intentional — these
images are the proof points for what the Flux/MCP image-context
tutorials actually produce.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs(fern): unbreak SSR — shrink notebook image outputs + fix BlogCard React import

Two server-side render bugs surfaced when running `fern docs md generate &&
fern docs dev` (the static-preview path):

1. The 22 MB notebook 6 .ts module (full-resolution Flux PNGs from 10 edited
   images) tripped Fern's SSR module-evaluation step. Once that module
   failed to evaluate, the shared component bundle failed to load on every
   page, replacing each MDX body with `<span data-intent="error">Something
   went wrong!</span>` while the layout chrome continued to render.

   Fix in fern/scripts/ipynb-to-fern-json.py: after extracting an
   image/png output, pass it through Pillow to (a) downscale so the
   longest edge is at most 800 px, (b) re-encode as JPEG q=82 progressive
   (Flux outputs are photographic — JPEG compresses 5–10× better than PNG
   for this content). NotebookViewer's CellOutput interface gains a
   `mime` field so the data URL uses the actual encoded MIME type. Result:

       notebook 6: 22 MB → 4.6 MB
       notebook 5: 3.8 MB → 1.8 MB
       notebook 4: 514 KB → 116 KB
       (notebooks 1–3 unaffected — no image outputs)

2. fern/components/BlogCard.tsx referenced `React.ReactNode` twice without
   importing React. Other components in the kit use `import type
   { ReactNode } from "react"`; BlogCard was the outlier. Aligned the
   import style — even though this didn't end up being the trigger, leaving
   the dangling reference would have eventually caused a strict-mode SSR
   regression.

Sweep test against http://localhost:3000/nemo/datadesigner/* — landing,
concepts, tutorials (including 5/6 image notebooks), dev notes, recipes,
and code-reference topic pages all render with their content; no error
spans.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

* docs(fern): add MkDocs-shape redirects for legacy URLs

The legacy site at https://nvidia-nemo.github.io/DataDesigner/ used
MkDocs-Material conventions (mkdocstrings + blog plugin + mkdocs-jupyter
+ directory URLs). Several path segments and page slugs differ from
Fern's slugified-title routing — search-engine indexed links and
copy-pasted bookmarks land on 404 without redirects.

Adds 30+ specific redirect rules covering every renamed surface:

- Tutorials: /notebooks/<filename>/ -> /tutorials/<title-slug>
  (page-title slugs differ from .ipynb filenames; one rule per notebook
   plus a README -> overview alias).

- Recipes: /recipes/<snake_subsection>/<snake_page>/ ->
  /recipes/<kebab-subsection>/<kebab-page>. Per-page rules for each of
  the 10 recipes (page titles diverged from .py filenames — e.g.
  basic_mcp -> basic-mcp-tool-use, search_agent -> nemotron-super-search-agent),
  followed by subsection :rest* fallbacks.

- Concepts: /concepts/mcp/* -> /concepts/tool-use-mcp/* (subsection
  rename, with & dropped, not -and-). Per-page rules for safety-and-limits
  -> safety-limits and configure-mcp-cli -> cli-configuration where
  page titles diverged from filenames.

- Code Reference: /code_reference/<module>/ ->
  /code-reference/topic-overviews/<module>. Per-page rules for the six
  underscored modules (column_configs, config_builder, run_config,
  sampler_params, validator_params, data_designer_config) since Fern's
  page-slug rule kebabs underscores.

- Plugins: filesystem_seed_reader -> file-system-seed-reader-plugins
  (Fern inserts hyphens between CamelCase words). example -> example-plugin,
  available -> available-plugin-list (page-title slugs).

- Dev Notes: blog plugin's /devnotes/posts/<slug>/ -> /dev-notes/<slug>.
  Per-page rules for text-to-sql -> text-to-sql-for-nemotron-super and
  rqa -> rqa-dataset (post titles diverged from filenames).

- /devnotes -> /dev-notes/overview (section landing).

MkDocs's directory-URL trailing-slash convention is handled natively by
Fern's runtime (both /foo and /foo/ return the same page), so no
explicit slash-strip rule is needed.

Smoke-tested all 34 legacy URLs against http://localhost:3000 — every
one resolves to a 200 page on the new structure.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: Lawrence Lane <llane@nvidia.com>

---------

Signed-off-by: Lawrence Lane <llane@nvidia.com>
Signed-off-by: Kirit93 <kthadaka@nvidia.com>
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
Co-authored-by: Kirit93 <kthadaka@nvidia.com>
Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-07 14:12:58 -03:00
Przemysław Boruta
0afe287a5f
feat(results): add export() method and --output-format CLI flag (#540)
* feat(results): add export() method and --output-format CLI flag

Adds DatasetCreationResults.export(path, format=) supporting jsonl,
csv, and parquet. The CLI create command gains --output-format / -f
which writes dataset.<format> alongside the parquet batch files.

* fix(cli): validate output_format before dataset generation

* fix(cli): remove top-level results import from create.py to preserve lazy loading

* fix(results): address andreatgretel review — error types, UX ordering, import hygiene

- Derive SUPPORTED_EXPORT_FORMATS from get_args(ExportFormat) so the two can't drift apart
- Replace ValueError with InvalidFileFormatError in export() — consistent with project error conventions
- Add date_format="iso" to to_json() for consistent datetime serialization across formats
- Add click.Choice(SUPPORTED_EXPORT_FORMATS) to --output-format CLI option for parse-time
  validation, better --help output, and tab completion
- Fix double load_dataset() in run_create: inline len() so the DataFrame ref dies before export
- Move success message after the export block to avoid "Dataset created" followed by "Export failed"
- Move imports to module level in test_results.py (json, Path, lazy already imported)
- Add controller-level tests for output_format happy path, bad format rejection, and export failure
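The first bullet's idea, deriving the runtime tuple from the type alias, can be shown with the standard library alone. The exact definition of `ExportFormat` is assumed here (a `Literal` over the three formats named in the commit), and `is_supported` is a hypothetical helper.

```python
from typing import Literal, get_args

# Assumed shape of the alias; the commit names it but does not show its definition.
ExportFormat = Literal["jsonl", "csv", "parquet"]

# Derived tuple, so adding a format to the Literal updates both at once.
SUPPORTED_EXPORT_FORMATS: tuple[str, ...] = get_args(ExportFormat)


def is_supported(fmt: str) -> bool:
    """Return True when `fmt` is one of the formats declared in ExportFormat."""
    return fmt in SUPPORTED_EXPORT_FORMATS
```

A tuple built this way can be handed to a choice-style CLI option, so the type checker and the command line accept the same values.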

* fix(results): correct Raises docstring — ValueError -> InvalidFileFormatError

* feat(results): stream batch files in export() to avoid OOM on large datasets

- Rewrite export() to read batch parquet files one at a time instead of
  materialising the full dataset via load_dataset(); peak memory is now
  proportional to a single batch regardless of dataset size
- Infer output format from file extension by default; format= parameter
  kept as an explicit override (e.g. writing .txt as JSONL)
- _export_parquet unifies schemas across batches (pa.unify_schemas) to
  handle type drift (e.g. int64 vs float64 in the same column)
- Drop format= from the controller's export() call — path already carries
  the correct extension
- Rewrite export tests around real batch parquet files (stub_batch_dir
  fixture); add tests for multi-batch output, schema unification, unknown
  extension, empty batch directory, and explicit format override

* fix(results): address nabinchha review — memory safety, error wrapping, UX

- Replace load_dataset() with count_records() in CLI to avoid OOM on
  large datasets; add count_records() method using pq.read_metadata
  (reads file metadata only, no data pages loaded)
- Remove redundant format validation in controller — click.Choice in
  create.py already rejects invalid values at parse time; dead code
  removed along with corresponding test
- Wrap pa.unify_schemas / table.cast ArrowInvalid as InvalidFileFormatError
  to normalize third-party exceptions at module boundaries per AGENTS.md
- Lowercase file extension before format lookup so .JSONL/.CSV/.PARQUET
  are accepted without error
- Add clarifying comment to trailing-newline guard in _export_jsonl
- Add tests: count_records(), uppercase extension, incompatible schemas
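Extension-based format inference with the lowercase step might look like the sketch below. The function name, the mapping table, and the use of `ValueError` are assumptions; the commits describe the project using its own InvalidFileFormatError for format problems, which is avoided here to keep the example stdlib-only.

```python
from pathlib import Path
from typing import Optional

_EXTENSION_TO_FORMAT = {".jsonl": "jsonl", ".csv": "csv", ".parquet": "parquet"}


def resolve_export_format(path: str, format: Optional[str] = None) -> str:
    """Return the explicit override if given, else the format implied by the extension."""
    if format is not None:
        return format
    # Lowercase first so .JSONL, .CSV, and .PARQUET resolve like their lowercase forms.
    suffix = Path(path).suffix.lower()
    try:
        return _EXTENSION_TO_FORMAT[suffix]
    except KeyError:
        raise ValueError(f"Cannot infer export format from extension {suffix!r}") from None
```

The explicit `format=` override keeps cases such as writing JSONL to a `.txt` path working.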

* fix(results): fix parquet export schema unification and controller path bug

- Use promote_options="permissive" in pa.unify_schemas so minor numeric
  type drift (int64 vs float64) is handled by promotion instead of raising
- Also catch ArrowTypeError from unify_schemas and ValueError from
  table.cast() — the actual exception types thrown by pyarrow for these
  cases (ArrowInvalid alone is not sufficient)
- Wrap base_dataset_path in Path() in generation_controller.run_create
  to guard against callers that return a str (the mock returns str, and
  str does not support the / operator)
- Update test_export_parquet_incompatible_schemas_raises to match the
  new error source: with permissive unification, different-column-name
  batches fail at cast() not at unify_schemas(), so the match string
  changes from "Cannot unify batch schemas" to "Cannot cast batch"

* fix(results,cli): address nabinchha review round 2

- Use public pa.ArrowInvalid/ArrowTypeError instead of pa.lib.* in _export_parquet
- Drop dead trailing-newline guard in _export_jsonl; skip empty batches with `if content`
- Rename num_records→actual_record_count after count_records() call to avoid shadowing
- Unlink partial export file before re-raising on export failure in run_create
- Export filename now uses dataset_name (<dataset-name>.<format>) instead of literal "dataset"
- Update help text and tests to match new export filename convention
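The "unlink partial export file before re-raising" behaviour can be sketched as a small wrapper. `export_with_cleanup` and its `write` callback are hypothetical names; the actual logic lives inside run_create and is not shown in the commit.

```python
from pathlib import Path
from typing import Callable


def export_with_cleanup(target: Path, write: Callable[[Path], None]) -> None:
    """Run `write(target)`; if it fails, delete any partial file and re-raise."""
    try:
        write(target)
    except Exception:
        # missing_ok covers the case where the writer failed before creating the file.
        target.unlink(missing_ok=True)
        raise
```

Re-raising after cleanup keeps the original error visible to the caller while ensuring no truncated export is left on disk.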

---------

Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-06 17:13:57 -06:00
Johnny Greco
8b8d748446
docs: graduate plugins out of experimental mode (#603)
* chore: add __init__.py to engine namespace subpackages

Griffe (used by mkdocstrings) skips directories without __init__.py
when resolving module paths, which prevented the new plugins code
reference from rendering SeedReader, FileSystemSeedReader, and
Processor. Adding empty __init__.py files in engine/resources/,
engine/processing/, and engine/processing/processors/ aligns with
the convention already used in engine/mcp/, engine/models/, etc.

* docs: flesh out docstrings on plugin extension-point classes

Plugin authors now see meaningful descriptions for every field and
method on the bases rendered in the plugins code reference:

- Plugin and PluginType: class docstrings + Attributes tables for
  fields and enum members; fix typo in config_qualified_name field
  description.
- SingleColumnConfig: document allow_resize.
- ProcessorConfig: document processor_type discriminator.
- SeedSource: document seed_type discriminator.
- FileSystemSeedSource: add class docstring + Attributes table for
  path / file_pattern / recursive.
- ColumnGeneratorFullColumn and ColumnGeneratorCellByCell: add
  class docstrings explaining when to use each base, plus method
  docstrings on the abstract generate() implementations.

* docs: graduate plugins out of experimental mode

Restructures plugin documentation around the now-stable extension
points (column generator, seed reader, processor) and treats plugins
as a first-class story for customizing Data Designer.

- Add code_reference/plugins.md: single-stop reference for the Plugin
  object and the config + implementation base classes used by all
  three plugin types.
- Add code_reference/generators.md: column generator implementation
  base classes, separated from column configs.
- Surface SingleColumnConfig in code_reference/column_configs.md.
- Add plugins/implement.md ("Build Your Own"): per-type implementation
  instructions across column generators, seed readers, and processors.
- Add plugins/processor.md: complete processor plugin package example.
- Rewrite plugins/overview.md: open with why plugins exist, drop the
  internal-helpers note (PluginRegistry / PluginManager), and focus
  the guide on what plugin builders need.
- Refresh plugins/available.md (Catalog) and
  plugins/filesystem_seed_reader.md to match the new structure.
- Delete plugins/example.md (replaced by per-type guides).
- Reorder Code Reference nav alphabetically and add the new pages.
- Minor link / wording fixes in concepts/processors.md and
  concepts/deployment-options.md.

* docs: simplify plugin docs structure

Replace the overview's how-to walkthrough and the per-type plugin
guides with a single Build Your Own page that covers all three
plugin types side-by-side. Add a dedicated Using Models in Plugins
guide and a seed_readers code reference, and trim the overview down
to what the plugin types are, how to use one, and how discovery
works.

- Rename plugins/implement.md to plugins/build_your_own.md.
- Delete plugins/filesystem_seed_reader.md and plugins/processor.md
  (their content is now in build_your_own.md and the per-type code
  references).
- Add plugins/models.md for model-backed column generator authoring.
- Add code_reference/seed_readers.md for seed reader implementation
  base classes.
- Rewrite plugins/overview.md: shorter intro, type bullets link to
  the relevant code reference, drop the multi-step "How do you
  create plugins" walkthrough in favor of a single Build a Plugin
  pointer, tighten Discovery troubleshooting.
- Refresh plugins/available.md (Available Plugins): point to the
  DataDesignerPlugins catalog and explain how to request a community
  listing.
- Update cross-page links in concepts/processors.md,
  concepts/seed-datasets.md, recipes/plugin_development/markdown_seed_reader.md,
  code_reference/plugins.md, and code_reference/generators.md to
  match the new structure.
- Update mkdocs.yml nav: rename to Build Your Own, add Using Models,
  add seed_readers code reference.

* docs: scroll wide tables horizontally instead of wrapping

Code-heavy reference tables (plugin bases, column generators, etc.)
were wrapping aggressively on narrow viewports, breaking long
identifiers across multiple lines. Switch the table container to
horizontal overflow and prevent code cells from wrapping so
identifiers stay readable.

* docs: address PR #603 review feedback

- Add an Implementation base section to code_reference/processors.md
  rendering the engine-side Processor class. This justifies the
  engine/processing/__init__.py files added earlier and gives
  processor plugin authors an auto-rendered API reference, matching
  the pattern used by code_reference/generators.md and seed_readers.md.
- build_your_own.md: replace the placeholder "x" emoji on the
  IndexMultiplier example with the actual multiplication sign.
- build_your_own.md: drop the manual `re.compile + apply(lambda)`
  pattern in the regex-filter processor in favor of the idiomatic
  `Series.str.contains(..., regex=True)`.
- build_your_own.md: add a kernel-restart caveat after the editable
  install instructions — PluginRegistry caches discovery on first
  import, so notebooks need a fresh kernel to pick up freshly
  installed plugins.
- build_your_own.md: state explicitly what `assert_valid_plugin`
  checks (config base + plugin-type-appropriate impl base).
- code_reference/plugins.md: link out to the processors code
  reference alongside generators and seed_readers.

* docs: split code reference by package

* docs: add interface code reference

* docs: add code reference overviews

* docs: refine code reference pages

* docs: improve code reference tables

* docs: correct reference docstrings

* docs: embed plugin catalog table

* docs: note plugin discovery restart caveat

* docs: explain generator base class choice

* docs: mention async cell generator examples

* docs: clarify plugin model usage

* docs: clarify plugin model aliases

* docs: address plugin review feedback

* docs: update available plugins page
2026-05-06 18:12:44 -04:00
Johnny Greco
9214637a5b
fix(engine): validate processor plugin impls (#609)
* fix(engine): validate processor plugin impls

Add the processor implementation base to assert_valid_plugin so
processor plugins are checked against Processor instead of only the
generic config contract. Keep plugin type validation table-driven and
raise explicit AssertionError messages so checks are not skipped under
optimized Python.
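
A minimal sketch of the table-driven check described above. The class names, the `REQUIRED_IMPL_BASE` mapping, and `check_impl` are hypothetical stand-ins, not the real `assert_valid_plugin` internals. It illustrates why an explicit `raise AssertionError(...)` survives `python -O` while a bare `assert` statement does not.

```python
# Hypothetical stand-ins for the engine-side implementation bases.
class ColumnGeneratorBase:
    pass


class ProcessorBase:
    pass


# Table-driven: one entry per plugin type, mapping to the required impl base.
REQUIRED_IMPL_BASE = {
    "column-generator": ColumnGeneratorBase,
    "processor": ProcessorBase,
}


def check_impl(plugin_type: str, impl_cls: type) -> None:
    """Raise AssertionError if impl_cls does not match its plugin type."""
    if plugin_type not in REQUIRED_IMPL_BASE:
        raise AssertionError(f"unknown plugin type: {plugin_type!r}")
    base = REQUIRED_IMPL_BASE[plugin_type]
    # A bare `assert issubclass(...)` would be compiled away under `python -O`;
    # an explicit raise is always executed.
    if not issubclass(impl_cls, base):
        raise AssertionError(f"{impl_cls.__name__} must subclass {base.__name__}")
```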

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

* test(engine): require plugin base map coverage

---------

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-06 14:31:12 -04:00
Nabin Mulepati
f73da1975c
feat(models): deprecate implicit default provider routing (#594)
* feat(models): deprecate implicit default provider routing

Emit DeprecationWarning whenever the legacy "implicit default
provider" path is exercised: `ModelConfig.provider=None`, the
registry-level `ModelProviderRegistry.default`, the YAML
`default:` key in `~/.data-designer/model_providers.yaml`, and
the CLI's "Change default provider" workflow.

`resolve_model_provider_registry` skips passing `default=` in the
single-provider case so the common construction path stays quiet.
Multi-provider registries still pass `default` (per
`check_implicit_default`) and warn accordingly.

Update docs, the package README, and test fixtures to specify
`provider=` explicitly on every `ModelConfig`. New tests cover
each warning entry point and pin the post-deprecation happy paths.

Refs #589

Made-with: Cursor

* fix(models): address PR #594 review feedback

Greptile P1: ProviderRepository.load emitted its DeprecationWarning
inside a `try/except Exception` block. Under
`filterwarnings("error", category=DeprecationWarning)` the warn would raise,
the except would swallow it, and `load()` would silently return None
(losing the registry). Move the warn outside the catch-all so the
strict-warning path no longer drops valid configs.
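
The failure mode can be reproduced with a small sketch. `load_buggy` and `load_fixed` are hypothetical functions, not the real `ProviderRepository.load`; they only illustrate how a warning promoted to an error is swallowed by a catch-all, and how moving the warn outside the `try` lets it surface.

```python
import warnings


def load_buggy(raw: dict):
    """Warn inside the catch-all: a warning promoted to an error is swallowed."""
    try:
        if "default" in raw:
            warnings.warn("'default' is deprecated", DeprecationWarning, stacklevel=2)
        return dict(raw)
    except Exception:
        # DeprecationWarning is an Exception subclass, so it lands here too
        # and the otherwise valid config is lost.
        return None


def load_fixed(raw: dict):
    """Parse inside the catch-all, warn outside it."""
    try:
        parsed = dict(raw)
    except Exception:
        return None
    if "default" in parsed:
        warnings.warn("'default' is deprecated", DeprecationWarning, stacklevel=2)
    return parsed
```

Under an `error` filter, `load_buggy` returns None for a valid config, while `load_fixed` lets the `DeprecationWarning` propagate to the caller.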

Greptile P2 / johnnygreco: `_warn_on_implicit_provider` and
`_warn_on_explicit_default` used `stacklevel=2`, which landed inside
pydantic v2's validator dispatch rather than at the user's
`ModelConfig(...)` / `ModelProviderRegistry(...)` call. That broke
both attribution (the source line was unhelpful) and Python's
once-per-location dedup (every call collapsed to the same
pydantic-internal key, suppressing all but the first warning).
Introduce `data_designer.config.utils.warning_helpers.warn_at_caller`,
which walks past the helper, validator, and any pydantic frames to
find the user's call site and emits via `warnings.warn_explicit` with
the user frame's `__warningregistry__`. Keeps attribution accurate
and dedup keyed on the user's (filename, lineno).
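
A rough sketch of how such a helper could work, assuming a frame walk keyed on each frame's module `__name__`. This is not the actual `warn_at_caller` source; the signature, the `skip_prefixes` parameter, and its default are assumptions for illustration.

```python
import sys
import warnings


def warn_at_caller(message, category=DeprecationWarning, skip_prefixes=("pydantic",)):
    """Emit a warning attributed to the first frame outside `skip_prefixes`."""

    def is_internal(module_name):
        # Exact match or dotted-prefix match against each skipped package.
        return any(
            module_name == p or module_name.startswith(p + ".") for p in skip_prefixes
        )

    frame = sys._getframe(1)  # start at whoever called this helper
    while frame.f_back is not None and is_internal(frame.f_globals.get("__name__", "")):
        frame = frame.f_back

    warnings.warn_explicit(
        message,
        category,
        filename=frame.f_code.co_filename,
        lineno=frame.f_lineno,
        module=frame.f_globals.get("__name__", "<unknown>"),
        # Passing the caller's registry keys once-per-location dedup on the
        # caller's line rather than on a library-internal line.
        registry=frame.f_globals.setdefault("__warningregistry__", {}),
    )
```

Unlike a fixed `stacklevel=N`, the walk does not depend on how many internal frames sit between the helper and user code.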

johnnygreco: align the `provider_repository.py` warning copy with the
sibling site in `default_model_settings.py` ("specify provider=
explicitly on each ModelConfig instead") so both YAML-default warning
sites give the same migration instruction. The previous wording
pointed users at "ModelConfig entries" inside `model_providers.yaml`,
where ModelConfig entries don't actually live.

johnnygreco: dedup the cascade in `DataDesigner.__init__`. With
`model_providers=None` and a YAML `default:`, the user previously saw
two DeprecationWarnings for the same root cause —
`get_default_provider_name()` warns about the YAML key, then
`resolve_model_provider_registry(...)` re-warns from
`_warn_on_explicit_default`. Suppress the registry-level duplicate in
the YAML-fallback branch via `warnings.catch_warnings()` so users see
exactly one warning per user action.
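
The dedup can be sketched as follows. The three functions are simplified, hypothetical stand-ins that borrow names from the text above; only the `warnings.catch_warnings()` scoping is the point.

```python
import warnings


def get_default_provider_name(yaml_cfg: dict):
    """Stand-in: warns about the YAML `default:` key."""
    if "default" in yaml_cfg:
        warnings.warn("YAML 'default:' is deprecated", DeprecationWarning, stacklevel=2)
    return yaml_cfg.get("default")


def build_registry(providers: list, default=None) -> dict:
    """Stand-in: warns only when a non-None registry default is passed."""
    if default is not None:
        warnings.warn("registry-level default is deprecated", DeprecationWarning, stacklevel=2)
    return {"providers": providers, "default": default}


def init_from_yaml(yaml_cfg: dict) -> dict:
    default = get_default_provider_name(yaml_cfg)  # the one warning the user sees
    with warnings.catch_warnings():
        # Same root cause as the warning above, so silence the duplicate
        # for the duration of this block only.
        warnings.simplefilter("ignore", DeprecationWarning)
        return build_registry(yaml_cfg["providers"], default=default)
```

The filter change is scoped to the `with` block, so warnings raised elsewhere are unaffected.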

johnnygreco: tighten `_warn_on_explicit_default` to fire only when
`default is not None`. Passing `default=None` explicitly is
semantically equivalent to omitting it (caller is opting *out* of a
registry-level default), and shouldn't trigger the deprecation
nudge.

johnnygreco: add a `model_validate({...})` regression test for
`ModelConfig` so the deserialization path (legacy on-disk configs)
is pinned alongside the construction path.

Tests:
- Update `test_load_exists` and `test_save` to omit `default=` so the
  roundtrip stops exercising the deprecated YAML-default path
  unguarded (Greptile note).
- Wrap `test_resolve_model_provider_registry_with_explicit_default`,
  `test_get_provider`, and
  `test_init_user_supplied_providers_preserve_first_wins_over_yaml_default`
  in `pytest.warns` so the suite stays green under
  `-W error::DeprecationWarning` (Greptile note).
- Add `test_explicit_default_none_does_not_emit_deprecation_warning`
  to pin the tightened predicate.
- Add `test_init_yaml_default_emits_single_deprecation_warning` to
  pin the cascade-dedup behavior.

Refs #589

Made-with: Cursor

* fix(models): make deprecation warnings visible under default filters

andreatgretel (PR #594): the YAML-default warning in
`get_default_provider_name` and the registry-default warning emitted
from inside DataDesigner helpers were attributing to data_designer
library frames, not user code. Python's default filter chain includes
`ignore::DeprecationWarning`, so library-attributed entries are
silenced — meaning a normal `DataDesigner()` call with a YAML
`default:` set showed nothing, and `resolve_model_provider_registry`
warnings were similarly invisible. Two related changes:

1. `warn_at_caller`: extend the default skip-list from `("pydantic",)`
   to `("pydantic", "pydantic_core", "data_designer")` so the walk
   escapes both pydantic's validator-dispatch frames and data_designer
   helper frames before attributing. Also tighten the prefix predicate
   to exact-or-dotted-prefix matching (`name == p or
   name.startswith(p + ".")`) so e.g. `pydantic_helpers` is not
   falsely matched as part of `pydantic` (johnnygreco nit). Allow
   callers to pass a custom `skip_prefixes` for flexibility. Drop the
   "skip frame 0+1 unconditionally" guard now that prefix matching
   covers it.

2. `get_default_provider_name`: switch from
   `warnings.warn(stacklevel=2)` to `warn_at_caller`. The previous
   stacklevel pointed into `default_model_settings.py`, which is a
   library file → silenced under default filters. Verified the fix
   empirically with `python -W default`: warning is now attributed to
   the user's call site and rendered.
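
The difference between a plain prefix check and the exact-or-dotted-prefix predicate from item 1 can be illustrated in a few lines. The function names `naive_match` and `is_internal` are hypothetical.

```python
def naive_match(name: str, prefixes: tuple) -> bool:
    """Plain startswith: also matches unrelated packages such as `pydantic_helpers`."""
    return any(name.startswith(p) for p in prefixes)


def is_internal(name: str, prefixes: tuple) -> bool:
    """Match the package itself or any dotted submodule of it, nothing else."""
    return any(name == p or name.startswith(p + ".") for p in prefixes)
```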

johnnygreco (PR #594): add the missing
`test_explicit_default_none_does_not_emit_deprecation_warning`
regression for the `self.default is not None` predicate landed in
the prior round.

Tests:
- New `test_warning_helpers.py` pins prefix-matching precision
  (rejects `pydantic_helpers` / `data_designer_other`), default
  skip-list contents, attribution past skip-prefix frames, and
  per-call-site dedup behavior.
- `test_get_default_provider_name_warning_attributes_to_user_frame`
  pins andreatgretel's repro for the YAML-default site.
- `test_explicit_default_warning_attributes_to_user_frame` pins the
  multi-frame case: construction goes through
  `resolve_model_provider_registry`, so the walk has to escape both
  pydantic and data_designer before landing on the test file.
- `test_explicit_default_none_does_not_emit_deprecation_warning`
  pins johnnygreco's predicate-tightening regression.

3,124 tests pass (548 config + 1,923 engine + 653 interface; +10 net
from this round).

Refs #589

Made-with: Cursor

* fix(models): apply warn_at_caller to remaining deprecation sites

greptile-apps (PR #594, r3189904028): `ProviderRepository.load`'s
YAML-default `DeprecationWarning` was using `warnings.warn(stacklevel=2)`,
which attributes to whichever data_designer frame called `load()` —
controllers, services, list/reset commands, agent introspection. Every
real call path lands on `data_designer.cli.*`, which falls under
Python's default `ignore::DeprecationWarning` filter and is silenced.
Audit found two more sites with the same problem:

- `DatasetBuilder._resolve_async_compatibility` (`allow_resize` /
  issue #552) — was using `stacklevel=4` to walk past
  `_resolve_async_compatibility -> build/build_preview -> interface ->
  user`. Brittle: any added frame (decorator, async wrapping, the
  `try/except DeprecationWarning: raise` boundary) shifts attribution
  silently. The existing test passed only because it used
  `simplefilter("always") + record=True`, which records warnings
  regardless of attribution.
- `ProviderController._handle_change_default` — was using
  `stacklevel=2`, which lands on the menu dispatcher in the same
  controller module. `print_warning` already shows the message
  visually, but programmatic observers (`pytest.warns`,
  `filterwarnings("error", ...)`) saw a library-attributed entry that
  default filters silenced.

All three migrated to `warn_at_caller` (the helper from 247fa30) so
attribution lands on the user's call site regardless of internal
chain shape. `data_designer` is already in
`DEFAULT_INTERNAL_PREFIXES`, so the walk escapes the entire library
in one pass.

Added attribution regression tests at each site asserting
`warning.filename == __file__`. A future regression to
`warnings.warn(stacklevel=N)` now fails CI instead of silently
silencing the user-facing nudge:

- `test_load_with_yaml_default_attributes_warning_to_caller`
  (test_provider_repository.py)
- `test_resolve_async_compatibility` extended with the same assertion
- `test_handle_change_default_emits_deprecation_warning` rewritten
  from `pytest.warns(...)` to a `catch_warnings(record=True)` block
  that filters for the message and asserts `filename == __file__`
  (`pytest.warns` does not check attribution, so the rewrite is
  required to actually catch the regression).
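
The shape of such an attribution assertion, sketched with a fake library compiled under its own filename. `LIBRARY_SOURCE`, `nudge`, and `recorded_filenames` are hypothetical. Both stacklevels produce a recorded warning, so a check on category and message alone passes either way; only the filename comparison tells them apart.

```python
import warnings

# A stand-in "library" compiled under its own filename so that attribution
# to library code versus calling code is observable.
LIBRARY_SOURCE = '''
import warnings

def nudge(stacklevel):
    warnings.warn("provider default is deprecated", DeprecationWarning,
                  stacklevel=stacklevel)
'''
library = {"__name__": "fake_library"}
exec(compile(LIBRARY_SOURCE, "fake_library.py", "exec"), library)


def recorded_filenames(stacklevel: int) -> list:
    """Call the library from this file and report where each warning was attributed."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")  # record regardless of attribution
        library["nudge"](stacklevel)
    return [w.filename for w in caught if "deprecated" in str(w.message)]
```

With `stacklevel=1` the recorded filename is `fake_library.py`; with `stacklevel=2` it is the file containing `recorded_filenames`.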

3,125 tests pass (548 config + 1,923 engine + 654 interface).

Refs #589
2026-05-05 13:39:12 -06:00
Johnny Greco
8fb132077a
fix(config): round-trip processors and profilers (#605)
* fix(config): round-trip processors and profilers

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

* test(config): cover from_config round trips

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

---------

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-05 13:57:55 -04:00