Commit graph

26 commits

Author SHA1 Message Date
Nabin Mulepati
2a487cdc5c
feat: add dropped column preservation toggle (#691)
* feat: add dropped column preservation toggle

Closes #690

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

* fix: reject dropped column policy resume mismatch

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>

---------

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-21 13:19:20 -06:00
Sai Asish Y
000fc09f94
fix(interface): reject duplicate names within output_processors (#675) (#697)
Signed-off-by: SAY-5 <say.apm35@gmail.com>
2026-05-21 12:13:51 -06:00
Eric W. Tramel
c0a4dcbb85
feat: implement async scheduling admission control (#661)
Some checks are pending
CI / End to end test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test Config (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Config (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Config (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / End to end test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
CI / Coverage Check (Python 3.11) (push) Blocked by required conditions
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
2026-05-20 20:58:05 -04:00
Mike Knepper
498e627d49
feat: Expose on_batch_complete via create method (#663)
Some checks are pending
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on macos-latest) (push) Waiting to run
CI / Coverage Check (Python 3.11) (push) Waiting to run
CI / End to end test (Python 3.11 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Lint and Format Check (push) Waiting to run
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Check License Headers (push) Waiting to run
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
2026-05-19 09:38:18 -05:00
Andre Manoel
6055290136
feat: add workflow chaining (#636)
Some checks are pending
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Coverage Check (Python 3.11) (push) Waiting to run
CI / End to end test (Python 3.10 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Lint and Format Check (push) Waiting to run
CI / Check License Headers (push) Waiting to run
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
Publish Fern devnotes / deploy (push) Waiting to run
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
* feat: add workflow chaining

* test: tidy workflow chaining coverage

* fix: harden workflow chaining concurrency

* docs: update workflow chaining plan

* feat: add workflow stage postprocessors

* feat: expose workflow stage outputs

* fix: align workflow selected output export

* fix: address workflow chaining review issues

* fix: align workflow parquet export selection

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix: preserve generated columns in drop validation

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* fix: clarify workflow output processors

Signed-off-by: Andre Manoel <amanoel@nvidia.com>

* docs: add workflow chaining page

* docs: align workflow chaining warning

* fix: address workflow review nits

---------

Signed-off-by: Andre Manoel <amanoel@nvidia.com>
2026-05-18 20:15:47 -03:00
Nabin Mulepati
3c8394e783
fix(interface): silence registry-default deprecation when library auto-fills it (#655)
The ``ModelProviderRegistry.default is deprecated`` warning added in #594
fires for every fresh-install ``DataDesigner()`` construction, even when
the user wrote ``default=`` nowhere — neither in YAML, nor in Python, nor
in any ``ModelConfig``.

Root cause: ``resolve_model_provider_registry`` synthesises
``default=providers[0].name`` for the multi-provider case to satisfy
``check_implicit_default``. The auto-seeded
``~/.data-designer/model_providers.yaml`` ships three providers and no
``default:`` key, so this path is hit on every bare ``DataDesigner()``
call. ``_warn_on_explicit_default`` then attributes the warning to the
user's ``DataDesigner()`` line, with a remediation message ("Specify
provider= explicitly on each ModelConfig") that doesn't even apply when
the user hasn't built a ``ModelConfig`` (e.g. a UUID-only sampler config
with the GitHub plugin).

Fix: broaden the existing warning suppression in ``DataDesigner.__init__``
to also cover the ``model_providers is None`` case. The user is opting
into all defaults — the library is the one filling ``default=``, so the
deprecation nudge is misdirected. Users who hand-construct a
multi-provider list in Python still see the warning (they wrote the
multi-provider intent themselves), and direct
``ModelProviderRegistry(default="x")`` always warns — those are the
entry points #589 actually targets.

New regression test pins the bare-``DataDesigner()`` quiet path so a
future tightening of the suppression can't silently re-introduce the
spurious warning.

Refs #589, follow-up to #594.

Signed-off-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-13 14:15:10 -06:00
Przemysław Boruta
0afe287a5f
feat(results): add export() method and --output-format CLI flag (#540)
* feat(results): add export() method and --output-format CLI flag

Adds DatasetCreationResults.export(path, format=) supporting jsonl,
csv, and parquet. The CLI create command gains --output-format / -f
which writes dataset.<format> alongside the parquet batch files.

* fix(cli): validate output_format before dataset generation

* fix(cli): remove top-level results import from create.py to preserve lazy loading

* fix(results): address andreatgretel review — error types, UX ordering, import hygiene

- Derive SUPPORTED_EXPORT_FORMATS from get_args(ExportFormat) so the two can't drift apart
- Replace ValueError with InvalidFileFormatError in export() — consistent with project error conventions
- Add date_format="iso" to to_json() for consistent datetime serialization across formats
- Add click.Choice(SUPPORTED_EXPORT_FORMATS) to --output-format CLI option for parse-time
  validation, better --help output, and tab completion
- Fix double load_dataset() in run_create: inline len() so the DataFrame ref dies before export
- Move success message after the export block to avoid "Dataset created" followed by "Export failed"
- Move imports to module level in test_results.py (json, Path, lazy already imported)
- Add controller-level tests for output_format happy path, bad format rejection, and export failure

* fix(results): correct Raises docstring — ValueError -> InvalidFileFormatError

* feat(results): stream batch files in export() to avoid OOM on large datasets

- Rewrite export() to read batch parquet files one at a time instead of
  materialising the full dataset via load_dataset(); peak memory is now
  proportional to a single batch regardless of dataset size
- Infer output format from file extension by default; format= parameter
  kept as an explicit override (e.g. writing .txt as JSONL)
- _export_parquet unifies schemas across batches (pa.unify_schemas) to
  handle type drift (e.g. int64 vs float64 in the same column)
- Drop format= from the controller's export() call — path already carries
  the correct extension
- Rewrite export tests around real batch parquet files (stub_batch_dir
  fixture); add tests for multi-batch output, schema unification, unknown
  extension, empty batch directory, and explicit format override

* fix(results): address nabinchha review — memory safety, error wrapping, UX

- Replace load_dataset() with count_records() in CLI to avoid OOM on
  large datasets; add count_records() method using pq.read_metadata
  (reads file metadata only, no data pages loaded)
- Remove redundant format validation in controller — click.Choice in
  create.py already rejects invalid values at parse time; dead code
  removed along with corresponding test
- Wrap pa.unify_schemas / table.cast ArrowInvalid as InvalidFileFormatError
  to normalize third-party exceptions at module boundaries per AGENTS.md
- Lowercase file extension before format lookup so .JSONL/.CSV/.PARQUET
  are accepted without error
- Add clarifying comment to trailing-newline guard in _export_jsonl
- Add tests: count_records(), uppercase extension, incompatible schemas

* fix(results): fix parquet export schema unification and controller path bug

- Use promote_options="permissive" in pa.unify_schemas so minor numeric
  type drift (int64 vs float64) is handled by promotion instead of raising
- Also catch ArrowTypeError from unify_schemas and ValueError from
  table.cast() — the actual exception types thrown by pyarrow for these
  cases (ArrowInvalid alone is not sufficient)
- Wrap base_dataset_path in Path() in generation_controller.run_create
  to guard against callers that return a str (mock returns str, Path
  does not support / with str operands)
- Update test_export_parquet_incompatible_schemas_raises to match the
  new error source: with permissive unification, different-column-name
  batches fail at cast() not at unify_schemas(), so the match string
  changes from "Cannot unify batch schemas" to "Cannot cast batch"

* fix(results,cli): address nabinchha review round 2

- Use public pa.ArrowInvalid/ArrowTypeError instead of pa.lib.* in _export_parquet
- Drop dead trailing-newline guard in _export_jsonl; skip empty batches with `if content`
- Rename num_records→actual_record_count after count_records() call to avoid shadowing
- Unlink partial export file before re-raising on export failure in run_create
- Export filename now uses dataset_name (<dataset-name>.<format>) instead of literal "dataset"
- Update help text and tests to match new export filename convention

---------

Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-06 17:13:57 -06:00
Nabin Mulepati
f73da1975c
feat(models): deprecate implicit default provider routing (#594)
Some checks failed
CI / Test (Python 3.10 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.11 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.12 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test Engine (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.10 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.11 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.12 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Coverage Check (Python 3.11) (push) Has been cancelled
CI / End to end test (Python 3.10 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.11 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.12 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.13 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Has been cancelled
* feat(models): deprecate implicit default provider routing

Emit DeprecationWarning whenever the legacy "implicit default
provider" path is exercised: `ModelConfig.provider=None`, the
registry-level `ModelProviderRegistry.default`, the YAML
`default:` key in `~/.data-designer/model_providers.yaml`, and
the CLI's "Change default provider" workflow.

`resolve_model_provider_registry` skips passing `default=` in the
single-provider case so the common construction path stays quiet.
Multi-provider registries still pass `default` (per
`check_implicit_default`) and warn accordingly.

Update docs, the package README, and test fixtures to specify
`provider=` explicitly on every `ModelConfig`. New tests cover
each warning entry point and pin the post-deprecation happy paths.

Refs #589

Made-with: Cursor

* fix(models): address PR #594 review feedback

Greptile P1: ProviderRepository.load emitted its DeprecationWarning
inside a `try/except Exception` block. Under
`filterwarnings("error", DeprecationWarning)` the warn would raise,
the except would swallow it, and `load()` would silently return None
(losing the registry). Move the warn outside the catch-all so the
strict-warning path no longer drops valid configs.

Greptile P2 / johnnygreco: `_warn_on_implicit_provider` and
`_warn_on_explicit_default` use `stacklevel=2`, which lands inside
pydantic v2's validator dispatch rather than at the user's
`ModelConfig(...)` / `ModelProviderRegistry(...)` call. That broke
both attribution (the source line was unhelpful) and Python's
once-per-location dedup (every call collapsed to the same
pydantic-internal key, suppressing all but the first warning).
Introduce `data_designer.config.utils.warning_helpers.warn_at_caller`,
which walks past the helper, validator, and any pydantic frames to
find the user's call site and emits via `warnings.warn_explicit` with
the user frame's `__warningregistry__`. Keeps attribution accurate
and dedup keyed on the user's (filename, lineno).

johnnygreco: align the `provider_repository.py` warning copy with the
sibling site in `default_model_settings.py` ("specify provider=
explicitly on each ModelConfig instead") so both YAML-default warning
sites give the same migration instruction. The previous wording
pointed users at "ModelConfig entries" inside `model_providers.yaml`,
where ModelConfig entries don't actually live.

johnnygreco: dedup the cascade in `DataDesigner.__init__`. With
`model_providers=None` and a YAML `default:`, the user previously saw
two DeprecationWarnings for the same root cause —
`get_default_provider_name()` warns about the YAML key, then
`resolve_model_provider_registry(...)` re-warns from
`_warn_on_explicit_default`. Suppress the registry-level duplicate in
the YAML-fallback branch via `warnings.catch_warnings()` so users see
exactly one warning per user action.

johnnygreco: tighten `_warn_on_explicit_default` to fire only when
`default is not None`. Passing `default=None` explicitly is
semantically equivalent to omitting it (caller is opting *out* of a
registry-level default), and shouldn't trigger the deprecation
nudge.

johnnygreco: add a `model_validate({...})` regression test for
`ModelConfig` so the deserialization path (legacy on-disk configs)
is pinned alongside the construction path.

Tests:
- Update `test_load_exists` and `test_save` to omit `default=` so the
  roundtrip stops exercising the deprecated YAML-default path
  unguarded (Greptile note).
- Wrap `test_resolve_model_provider_registry_with_explicit_default`,
  `test_get_provider`, and
  `test_init_user_supplied_providers_preserve_first_wins_over_yaml_default`
  in `pytest.warns` so the suite stays green under
  `-W error::DeprecationWarning` (Greptile note).
- Add `test_explicit_default_none_does_not_emit_deprecation_warning`
  to pin the tightened predicate.
- Add `test_init_yaml_default_emits_single_deprecation_warning` to
  pin the cascade-dedup behavior.

Refs #589

Made-with: Cursor

* fix(models): make deprecation warnings visible under default filters

andreatgretel (PR #594): the YAML-default warning in
`get_default_provider_name` and the registry-default warning emitted
from inside DataDesigner helpers were attributing to data_designer
library frames, not user code. Python's default filter chain includes
`ignore::DeprecationWarning`, so library-attributed entries are
silenced — meaning a normal `DataDesigner()` call with a YAML
`default:` set showed nothing, and `resolve_model_provider_registry`
warnings were similarly invisible. Two related changes:

1. `warn_at_caller`: extend the default skip-list from `("pydantic",)`
   to `("pydantic", "pydantic_core", "data_designer")` so the walk
   escapes both pydantic's validator-dispatch frames and data_designer
   helper frames before attributing. Also tighten the prefix predicate
   to exact-or-dotted-prefix matching (`name == p or
   name.startswith(p + ".")`) so e.g. `pydantic_helpers` is not
   falsely matched as part of `pydantic` (johnnygreco nit). Allow
   callers to pass a custom `skip_prefixes` for flexibility. Drop the
   "skip frame 0+1 unconditionally" guard now that prefix matching
   covers it.

2. `get_default_provider_name`: switch from
   `warnings.warn(stacklevel=2)` to `warn_at_caller`. The previous
   stacklevel pointed into `default_model_settings.py`, which is a
   library file → silenced under default filters. Verified the fix
   empirically with `python -W default`: warning is now attributed to
   the user's call site and rendered.

johnnygreco (PR #594): add the missing
`test_explicit_default_none_does_not_emit_deprecation_warning`
regression for the `self.default is not None` predicate landed in
the prior round.

Tests:
- New `test_warning_helpers.py` pins prefix-matching precision
  (rejects `pydantic_helpers` / `data_designer_other`), default
  skip-list contents, attribution past skip-prefix frames, and
  per-call-site dedup behavior.
- `test_get_default_provider_name_warning_attributes_to_user_frame`
  pins andreatgretel's repro for the YAML-default site.
- `test_explicit_default_warning_attributes_to_user_frame` pins the
  multi-frame case: construction goes through
  `resolve_model_provider_registry`, so the walk has to escape both
  pydantic and data_designer before landing on the test file.
- `test_explicit_default_none_does_not_emit_deprecation_warning`
  pins johnnygreco's predicate-tightening regression.

3,124 tests pass (540 config + 1,923 engine + 653 interface; +10 net
from this round).

Refs #589

Made-with: Cursor

* fix(models): apply warn_at_caller to remaining deprecation sites

greptile-apps (PR #594, r3189904028): `ProviderRepository.load`'s
YAML-default `DeprecationWarning` was using `warnings.warn(stacklevel=2)`,
which attributes to whichever data_designer frame called `load()` —
controllers, services, list/reset commands, agent introspection. Every
real call path lands on `data_designer.cli.*`, which falls under
Python's default `ignore::DeprecationWarning` filter and is silenced.
Audit found two more sites with the same problem:

- `DatasetBuilder._resolve_async_compatibility` (`allow_resize` /
  issue #552) — was using `stacklevel=4` to walk past
  `_resolve_async_compatibility -> build/build_preview -> interface ->
  user`. Brittle: any added frame (decorator, async wrapping, the
  `try/except DeprecationWarning: raise` boundary) shifts attribution
  silently. The existing test passed only because it used
  `simplefilter("always") + record=True`, which records warnings
  regardless of attribution.
- `ProviderController._handle_change_default` — was using
  `stacklevel=2`, which lands on the menu dispatcher in the same
  controller module. `print_warning` already shows the message
  visually, but programmatic observers (`pytest.warns`,
  `filterwarnings("error", ...)`) saw a library-attributed entry that
  default filters silenced.

All three migrated to `warn_at_caller` (the helper from 247fa30) so
attribution lands on the user's call site regardless of internal
chain shape. `data_designer` is already in
`DEFAULT_INTERNAL_PREFIXES`, so the walk escapes the entire library
in one pass.

Added attribution regression tests at each site asserting
`warning.filename == __file__`. A future regression to
`warnings.warn(stacklevel=N)` now fails CI instead of silently
silencing the user-facing nudge:

- `test_load_with_yaml_default_attributes_warning_to_caller`
  (test_provider_repository.py)
- `test_resolve_async_compatibility` extended with the same assertion
- `test_handle_change_default_emits_deprecation_warning` rewritten
  from `pytest.warns(...)` to a `catch_warnings(record=True)` block
  that filters for the message and asserts `filename == __file__`
  (`pytest.warns` does not check attribution, so the rewrite is
  required to actually catch the regression).

3,125 tests pass (548 config + 1,923 engine + 654 interface).

Refs #589
2026-05-05 13:39:12 -06:00
Andre Manoel
61cdeefb17
feat: make async engine the default execution path (#592)
Some checks failed
CI / Test Config (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Coverage Check (Python 3.11) (push) Waiting to run
CI / End to end test (Python 3.10 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
Publish devnotes / deploy (push) Has been cancelled
* feat: make async engine the default execution path

The async engine has been hardening as opt-in for several releases. Make it
the default and address the prerequisites flagged for the flip.

Default flip
- DATA_DESIGNER_ASYNC_ENGINE defaults to "1" at both consumption sites
- Set DATA_DESIGNER_ASYNC_ENGINE=0 for one transitional release to opt out
- allow_resize=True still falls back to sync with a DeprecationWarning

Python 3.10 support
- Replace asyncio.TaskGroup (3.11+) in async_concurrency.py with
  gather-with-explicit-cancel; semantics preserved because _run_task already
  swallows its own exceptions and uses _shutdown_event for sibling cancellation
- Remove the sys.version_info < (3, 11) runtime guard
- Remove the matching pytest skipif so the executor tests run on 3.10 too

Derived timeouts (replaces two hardcoded 300s constants)
- ThrottleManager.acquire_sync/async default to timeout=None (no deadline)
  instead of DEFAULT_ACQUIRE_TIMEOUT=300; HTTP request timeout already bounds
  actual work, queue waits scale with provider speed and AIMD
- _AsyncBridgedModelFacade derives the sync->async bridge timeout from the
  model's inference_parameters.timeout and the call's max_correction_steps;
  one knob (per-model timeout) drives both deadlines, no new config surface
- Add ModelFacade.request_timeout property so the bridge can read the
  effective timeout the client is configured with

Root-cause surfacing
- AsyncTaskScheduler captures the first non-retryable error and exposes it
  via first_non_retryable_error
- Interface threads it through DataDesignerGenerationError when 0 records
  are produced without early-shutdown, so deterministic failures (e.g. bad
  seed sources) surface their original message instead of a wrapped
  FileNotFoundError on the parquet path

Tests
- New: throttle no-deadline default behavior (sync+async), parametrized
  derived bridge timeout, restored async_concurrency tests on 3.10
- Updated: test_dataset_builder.py uses an autouse fixture to pin its
  Mock-based tests to the sync engine they cover; existing bridge tests
  set facade.request_timeout for the new derivation

Docs
- Replace the stale LiteLLM security notice in README with a short
  async-default heads-up and link to the migration guide
- Add docs/migration-async-default.md covering per-model timeouts,
  custom-column thread safety, mocking model calls, run outcomes, and
  the opt-out
- Append a short Update section to the async-all-the-way-down dev note

* test: extract _compute_bridge_timeout helper for direct testing

The parametrized bridge-timeout test was patching ``concurrent.futures.Future.result``
to capture the timeout the bridge passed in. That reaches into stdlib internals
(DEVELOPMENT.md "Mock at boundaries: Keep mocking shallow") and the ``ids=`` argument
on the parametrize was missing.

Extracts the formula into a module-level ``_compute_bridge_timeout`` helper. The test
now calls the helper directly with no mocking, and the parametrize gets readable ids.
Behavior is unchanged.

* test(e2e): align demo plugins with async engine contracts

The e2e demo plugins exercise plugin discovery and full DD lifecycle. Two
of them were written against sync-engine semantics that the async engine
restricts:

- DemoColumnGeneratorImpl was a ColumnGeneratorFullColumn with no
  required_columns. The async engine routes ``no-upstream`` columns
  through the from-scratch path, which passes an empty DataFrame to
  generators that aren't FromScratchColumnGenerator subclasses. The
  generator then produces 0 rows and the scheduler raises
  ``update_batch received 0 values``. Switching the plugin to
  FromScratchColumnGenerator with generate_from_scratch(num_records)
  matches what the plugin actually does (produces a constant column
  without input) and works on both engines.

- RegexFilterProcessor implemented process_before_batch with row-count
  changes. The async engine enforces row-count invariance in pre- and
  post-batch processor stages by design. Moving the filter to
  process_after_generation preserves the plugin's purpose (regex-based
  row filtering) at a stage that supports row-count changes on both
  engines. Test assertions check the final dataset, so the stage shift
  is transparent.

Both changes are demo-plugin updates only; no production code change.

* fix: address Codex review findings on async-default flip

Three bugs and two test-quality concerns surfaced by an independent review of
the prior commits. Each was real and worth fixing in the flip PR.

Bug fixes
- Sync-fallback path was creating async-only model clients. The default flip
  meant ``client_concurrency_mode = ASYNC`` for every default run, but the
  ``allow_resize=True`` path falls back to the sync engine — sync ``model.generate()``
  calls then hit ``SyncClientUnavailableError``. The resolution decision now
  lives at the DataDesigner interface level via
  ``_resolve_client_concurrency_mode``: it considers both the env var and the
  config (allow_resize forces sync clients) and is passed explicitly to
  ``create_resource_provider``. Direct callers of the factory still get the
  env-var default.

- Sync→async bridge timeout ignored the per-call ``timeout=`` override. A
  custom column calling ``model.generate(timeout=600)`` against a slow endpoint
  was being cancelled at the model-config default, not 600s. The bridge now
  prefers ``kwargs.get("timeout")`` over ``facade.request_timeout``.

- Bridge timeout formula missed ``max_conversation_restarts``. One logical
  generation can do ``(1 + max_conversation_restarts) × (1 + max_correction_steps)``
  HTTP requests; the formula now multiplies both, matching the worst-case
  attempt budget.

Engine routing fix (also surfaced by failing e2e plugin tests)
- ``_run_from_scratch`` else-branch passed an empty DataFrame to non-FromScratch
  generators classified as seeds (no upstream columns), so ``ColumnGeneratorFullColumn``
  with no required_columns produced 0 rows for an ``rg_size``-row buffer. Now
  passes an ``rg_size``-row snapshot of the row-group buffer, mirroring the
  sync engine's FULL_COLUMN contract.
- The earlier ``DemoColumnGeneratorImpl`` workaround (rewrite as ``FromScratchColumnGenerator``)
  is reverted; the engine fix subsumes it. The processor-plugin fix
  (``process_after_generation`` for the regex filter) stays — pre-batch
  row-count change is intentionally rejected by the async engine.

Test improvements
- Throttle no-deadline tests are parametrized over ``(timeout=0.0, raises)``
  and ``(timeout=None, waits)``, pinning that ``None`` is genuinely distinct
  from any finite default. Sync and async counterparts mirror.
- New regression tests for ``first_non_retryable_error`` surfacing covering
  both load-raises and load-returns-empty paths, asserting the original
  exception is chained via ``__cause__`` and that the typed
  ``DataDesignerEarlyShutdownError`` doesn't fire in this branch.
- New parametrized regression test for ``_resolve_client_concurrency_mode``
  covering all four (env × allow_resize) combinations.
- New parametrized test for the per-call ``timeout=`` override flowing into
  the bridge timeout calculation.
- Bridge formula tests extended with ``max_conversation_restarts`` cases.

* test: trim redundant parametrize cases in async-default tests

Three parametrize cases were duplicating coverage already provided by
existing standalone tests:

- ``test_acquire_*_timeout_branches`` parametrized over ``(0.0, raises)``
  and ``(None, waits)``. The ``raises`` half duplicates
  ``test_acquire_*_raises_timeout_when_at_capacity``. Replaced with two
  focused ``..._default_no_deadline_waits_for_release`` tests covering
  only the no-deadline branch.

- ``test_resolve_client_concurrency_mode_matches_engine_choice`` had four
  cases. The ``async-off + allow-resize`` case asserts ``SYNC`` because the
  env var alone forces it; the allow_resize input is moot. Dropped.

- ``test_async_bridge_honors_per_call_timeout`` had three cases. The
  "override below floor" case cross-products the per-call override flow
  with the floor-clamping behavior already covered by
  ``test_compute_bridge_timeout``. Dropped.

Net: -25 lines of test code with no loss of essential coverage.

* docs: fold migration page into existing concept docs

The standalone ``Migrating to the async default`` page didn't fit the
existing docs style — present tense, behavior over comparisons, content
in the natural concept home. Folding it in:

- ``architecture-and-performance.md`` gets a new ``Async Engine`` section
  covering per-model timeouts, run outcomes (partial completion +
  ``DataDesignerEarlyShutdownError``), and the transitional opt-out.
  Three stale ``async engine is landing soon`` callouts updated to
  reflect the flip.
- ``custom_columns.md`` gets two short notes: a thread-safety callout
  near Generation Strategies, and a mocking-with-spec note in
  Development Testing.
- ``async-all-the-way-down.md`` Update section now points at the new
  arch-and-perf section.
- README heads-up links to the same anchor.
- ``migration-async-default.md`` removed; mkdocs.yml entry dropped.

* docs: frame Execution Model as sync-engine specifics

Small targeted edits to make the user-facing concept docs consistent
with the post-flip state. No restructuring.

- ``architecture-and-performance.md``: the ``Execution Model`` callout
  now opens with two engines, links to the new ``Async Engine`` section,
  and frames the existing column-at-a-time description as sync-engine
  semantics. The ``Step 2: Process columns sequentially`` paragraph notes
  the async engine relaxes this. The ``Key Concepts`` table differentiates
  per-engine for ``Batching`` and ``Sequential columns``; ``Parallel cells``
  is the same on both.
- ``processors.md``: added a warning callout about the async engine's
  row-count invariance in pre- and post-batch stages, with the guidance
  to use ``process_after_generation()`` for row-filtering or expansion.

* fix: address review nits from PR #592 (Nabin)

Four targeted fixes from the review.

Worth-addressing (warning):
- ``test_acquire_async_default_no_deadline_waits_for_release`` was
  spawning the release task without holding a strong reference. The
  loop's weak-ref bookkeeping could GC it before the inner ``await``
  observes the release, producing a CI flake. Hold the task and
  ``await`` it in ``finally``.

Take-it-or-leave-it (applied):
- Root-cause error surfacing now includes the exception type name:
  ``f"🛑 {type(root_cause).__name__}: {root_cause}"`` so users see
  ``ValueError: ...`` instead of just the message string. The
  ``__cause__`` chain is preserved either way.
- Drop the defensive ``getattr(c, "allow_resize", False)`` in
  ``_resolve_client_concurrency_mode`` — every member of
  ``ColumnConfigT`` inherits ``allow_resize: bool = False`` from
  ``SingleColumnConfig``.
- One-line comment near the root-cause surfacing branch noting that
  ``actual_num_records == 0`` is async-only (sync runs leave it at
  ``-1``), so the branch is async-only by construction.

Not addressed in this PR (filing as follow-ups):
- ``SYNC_BRIDGE_TIMEOUT = 300`` still hardcoded in
  ``column_generators/generators/base.py:_run_coroutine_sync``. That
  bridge has no model-facade context to derive a timeout from, so the
  fix is a structural refactor outside this PR's scope.
- First-error capture loses subsequent-error context. The "first wins"
  heuristic is documented; richer aggregation is a follow-up.

* fix: drop SYNC_BRIDGE_TIMEOUT in _run_coroutine_sync

This was the third hardcoded 300s timeout (Nabin flagged it on PR #592).
The path is the generic sync→async bridge in ``ColumnGenerator.generate()``:
when a subclass overrides only ``agenerate()``, the sync entry point runs
the coroutine in a background thread.

Same philosophy we applied to the throttle queue wait elsewhere in the
PR: a defensive deadline on top of work that's already bounded by the
HTTP timeout doesn't add safety, it just produces spurious failures on
slow self-hosted endpoints. Drop the constant, the timeout exception
handling, and the ``timed_out`` bookkeeping. ``pool.shutdown(wait=True)``
becomes the simple cleanup.

Tests in ``test_async_generators.py`` exercise the happy path only and
don't depend on the timeout firing.

* Revert "fix: drop SYNC_BRIDGE_TIMEOUT in _run_coroutine_sync"

This reverts commit 7a0b77d44c.

* docs+feat: deprecate the sync-engine opt-out path

Nabin asked whether the docs should adopt explicit "deprecation" language
on the opt-out path. Doing both:

- Doc: ``architecture-and-performance.md``'s ``Opting out`` section now
  uses an ``!!! warning "Deprecated"`` admonition that names the env var
  as a deprecated escape hatch and notes the run-time warning.
- Code: ``DataDesigner._resolve_client_concurrency_mode`` emits a
  ``DeprecationWarning`` when ``DATA_DESIGNER_ASYNC_ENGINE=0`` is detected.
  Same precedent as the existing ``allow_resize=True`` warning. Auto-fallback
  via ``allow_resize`` does not double-warn here; the builder layer emits
  its own warning later.
- Test: parametrized regression now asserts ``pytest.warns(DeprecationWarning)``
  on the opt-out branch and treats any warning on the async-on branches as
  a failure (``simplefilter("error")`` inside the ``catch_warnings`` block).

* fix: emit logger.warning alongside DeprecationWarning on env-var opt-out

Parity fix from Nabin's re-review of PR #592. The ``allow_resize=True``
auto-fallback path in ``_resolve_async_compatibility`` emits both a
``logger.warning("⚠️ ...")`` and a ``DeprecationWarning``. The new
``DATA_DESIGNER_ASYNC_ENGINE=0`` opt-out path was only emitting the
``DeprecationWarning``, leaving users who run with default warning
filters silenced and inconsistent with the established precedent.

Match the pattern: same message body, both signals, same stacklevel.

* docs: breadcrumb explaining why SYNC_BRIDGE_TIMEOUT survives PR #592

Nabin's re-review pointed out that ``base.py`` is the lone place where
the 300s pattern survives, while ``custom.py`` and ``throttle_manager.py``
both retired theirs. Without a comment, a future reader (or a lint sweep)
could mistake this for an oversight and "consistency-fix" it the wrong way.

Add a short note at the constant naming the two retired siblings, the
reason this one stayed (no ``ModelFacade`` context to derive from), and
the fact that it's tracked for a structural follow-up.
2026-05-04 16:22:13 -03:00
Nabin Mulepati
5ec0e89972
fix(interface): don't leak YAML default provider into user-supplied list (#591)
Some checks are pending
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Coverage Check (Python 3.11) (push) Waiting to run
CI / End to end test (Python 3.10 on macos-latest) (push) Waiting to run
CI / Lint and Format Check (push) Waiting to run
CI / Check License Headers (push) Waiting to run
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
* fix(interface): don't leak YAML default provider into user-supplied list

When a caller passes ``model_providers`` to ``DataDesigner.__init__``, the
YAML's ``default:`` key from ``~/.data-designer/model_providers.yaml`` was
still being applied to the resulting ``ModelProviderRegistry``. This caused
two related problems:

1. Hard failure: if the YAML default named a provider absent from the
   user-supplied list, construction raised
   ``ValidationError: Specified default 'X' not found in providers list``.
2. Silent override: if the YAML default matched a non-first user-supplied
   provider, the documented "first wins" behavior was silently overridden.

Gate the YAML lookup on ``model_providers is None`` so that user-supplied
lists own their own default. Also expose ``model_provider_registry`` and
``run_config`` as public read-only properties on ``DataDesigner``, paired
with the existing ``secret_resolver`` property and ``set_run_config``
setter; tests now use these instead of the underscore-prefixed attributes.

Closes #588.

Made-with: Cursor

* fix(interface): tighten model_provider_registry docstring; pin YAML-fallback path

Address review feedback on PR #591:
- Clarify ``model_provider_registry`` docstring so it reflects the full
  fallback chain: user-supplied first → YAML default (when set) → first
  provider in the YAML list.
- Add ``test_init_no_user_providers_uses_yaml_default`` to lock the
  YAML-fallback contract that the #588 fix preserved but didn't pin.

Made-with: Cursor
2026-04-30 15:50:26 -06:00
Andre Manoel
47c72b3d87
fix(async): pack of fixes for async engine under degraded providers (#585)
Some checks are pending
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Coverage Check (Python 3.11) (push) Waiting to run
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Lint and Format Check (push) Waiting to run
CI / Check License Headers (push) Waiting to run
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
* fix(async): exclude all retryable errors from early-shutdown gate

The gate previously only excluded `ModelRateLimitError`, leaving
`ModelTimeoutError`, `ModelInternalServerError`, and
`ModelAPIConnectionError` to count toward the sliding-window error
rate. Under provider degradation these errors cluster in time
(concurrent in-flight requests time out together), so 5/10 in a row
is easy and trips the gate even when salvage could recover the rows.

Refs #575.

* feat(async): WARN log when provider showing degraded performance

Diagnostic A/Bs against build.nvidia.com showed runs failing silently
under provider degradation - no log indication that retryable errors
were piling up until the early-shutdown gate fired (or, post-fix,
until salvage exhaustion). Surfacing this earlier helps users
distinguish "DataDesigner is broken" from "the upstream provider is
slow today."

Tracks a separate sliding window over retryable-vs-not for every task
outcome (independent of the early-shutdown gate's window) and emits a
throttled WARN when the rolling fraction crosses the threshold.

Refs #575.

* fix(async): salvage partial row groups on early shutdown

Before: when the early-shutdown gate fired, any row group still in
flight stayed in `_rg_states` un-checkpointed. The buffer manager
later raised `FileNotFoundError` when the builder tried to read the
finalized parquet. User-visible result: `0 records produced`.

After: a new `_finalize_after_shutdown` step runs in `run()`'s finally
block, after `_cancel_workers` has drained in-flight tasks (Codex
caveat: in-flight `from_scratch`/`batch` tasks must not be allowed to
write into a buffer that's being finalized). For each remaining row
group it drops rows that aren't fully complete, then delegates to the
existing `_checkpoint_completed_row_groups` so the buffer manager's
zero-survivor handling (skip empty parquet, free buffer) kicks in
unchanged.

Also surfaces partial completion as a structured signal: scheduler
exposes `early_shutdown: bool` and `partial_row_groups: tuple[int, ...]`
properties so callers can detect partial completion programmatically
rather than parsing log lines. Builder uses this to emit a more
specific WARN distinguishing early shutdown from non-shutdown drops.

Refs #575.

* fix(throttle): reset consecutive_429s on non-rate-limit failure

In `release_failure`, the cascade counter wasn't reset, so a sequence
like 429 → 500 → 429 was treated as 2 consecutive 429s. The cascade
counter feeds AIMD's reduce-once-per-cascade logic; the second 429
should start a fresh cascade and trigger another concurrency reduction,
but currently doesn't.

Standalone bug surfaced during #575 investigation; not on the failure
path that drives the gate-trip outcome but worth fixing while we're
in this code.

* fix(custom): preserve retryability through CustomColumnGenerator wrap

A real-workload run of #575 showed the early-shutdown gate still trips
even with the gate-exclusion fix in place: the trigger is 10 timeouts
inside Anonymizer's QA-repair custom columns, all wrapped in
CustomColumnGenerationError (non-retryable) by the catch-all in
CustomColumnGenerator.

Two fixes here:

1. Re-raise RETRYABLE_MODEL_ERRORS unchanged before the wrap so the
   scheduler's _is_retryable correctly classifies them.

2. Surface _AsyncBridgedModelFacade timeouts as ModelTimeoutError
   instead of stdlib TimeoutError. Without this the sync bridge times
   out as the wrong exception type and is still classified non-retryable
   even after fix #1.

Also moves _RETRYABLE_MODEL_ERRORS from async_scheduler to
models/errors as the public RETRYABLE_MODEL_ERRORS tuple - both the
scheduler and the wrap site need it, and models/errors is the
appropriate home alongside the error class definitions.

Refs #575.

* feat(interface): typed DataDesignerEarlyShutdownError on zero-record runs

When the async scheduler hits early shutdown and produces zero
records, the buffer manager skips writing parquet (correctly), so
ArtifactStorage.load_dataset_with_dropped_columns() raises
FileNotFoundError. Previously this surfaced as a generic
DataDesignerGenerationError wrapping the FileNotFoundError, which is
ambiguous (could be missing files for any reason).

This commit:

- Adds DataDesignerEarlyShutdownError as a subclass of
  DataDesignerGenerationError so existing handlers still match while
  callers that want to react programmatically (retry on different
  alias, surface a degraded-provider message, etc.) can catch the
  specific type.
- Plumbs the scheduler's structured signals (early_shutdown,
  partial_row_groups) up through the builder so they're available at
  data_designer.create() time without re-introspecting the scheduler.
- create() raises the typed error in both failure modes (load fails
  or empty DataFrame returned) when builder.early_shutdown is True.

Refs #575.

* fix(async): emit first degraded-provider WARN regardless of clock state

  Initialize _last_degraded_warn_at to -inf so the first WARN is always
  emitted. The previous initialization to 0.0 suppressed the first WARN on
  fresh CI runners where time.monotonic() returns a small value (system
  boot uptime), making the throttle interval check (now - 0.0 < interval)
  true on the first attempt.

* fix(async): address review findings on early-shutdown salvage PR

Five real correctness issues caught in review of the original PR, plus a
few smaller cleanups and test simplifications.

Throttle - cascade reset (regression of existing AIMD invariant):
release_failure() now resets consecutive_429s only when in_flight == 0.
Resetting unconditionally broke "reduce once per cascade" when 429/500/429
arrived interleaved within a single in-flight burst - the second 429 was
treated as a new cascade and the limit got halved twice for what was
effectively one rate-limit event.

Interface - typed-error gating: DataDesignerEarlyShutdownError now fires
only when early_shutdown is true AND actual_num_records == 0. Without
this, a partial-salvage run that fails to load for unrelated reasons
(corrupt parquet, schema drift, disk hiccup) was misdiagnosed as "zero
records produced," hiding the real cause.

Async - WARN window scope: the degraded-provider warning was fed by every
task outcome, including samplers and non-LLM customs. In realistic
pipelines (one model column, several non-model columns) the rate stayed
under threshold even when every model call was failing, silencing the
WARN exactly when it mattered. Now gated on is_llm.

Async/builder - signal preservation across raises: scheduler.early_shutdown
and partial_row_groups are captured in a try/finally around future.result(),
so a processor failure during the salvage path doesn't drop the
structured signal. Both build() and build_preview() now reset per-run
state at the start so reused builders don't leak prior-run flags.

Async - dead code: dispatch_error capture in run() was unread (the post-
finally check is unreachable on the exception path). Removed.

Smaller cleanups:
- early-shutdown WARN says "non-retryable error rate exceeded threshold"
- bridge timeout WARN demoted to debug (ModelTimeoutError already surfaces
  it; the throttled degraded-provider WARN is the user-facing signal)
- TODO note for threading degraded_warn_* through RunConfig
- doc note in _finalize_after_shutdown clarifying that pre-batch processor
  isn't re-run on partial-salvage row groups

Tests:
- new regression tests for the cascade burst case, partial-salvage error
  gating, and LLM-only WARN window
- direct unit test for _reset_run_state
- dedup via _make_storage / _seed_plus_cell_setup helpers
- WARN emission cases parametrized into a single test
- shared parametrize lists hoisted to module-level constants
- redundant cascade test dropped in favor of the more thorough drain
  variant; redundant healthy-baseline test folded into the zero-survivor
  test

* chore(async): address Nabin's review comments

Style cleanups, parametrization, docstring polish, and one consistency
fix in the typed-error path. All non-blocking ("Ship it (with nits)").

interface/data_designer.py:
- preview() now raises DataDesignerEarlyShutdownError when shutdown
  produced zero records (parity with create()), and also gates on
  actual_num_records == 0 so partial-salvage runs that fail to load
  don't get misdiagnosed
- create()'s defensive empty-DF guard mirrors the load-failure guard
  with the same actual_num_records == 0 check

async_scheduler.py:
- _record_retryable_outcome docstring clarifies that the call site
  filters by is_llm; the function alone reads as if every outcome feeds
  the window

dataset_builder.py:
- moved _reset_run_state() down past the public methods to match the
  project's public-before-private convention

test_custom.py:
- flattened TestAsyncBridgedModelFacade class into module-level test
  functions (matches the rest of the file)
- hoisted inline imports (asyncio, threading, patch, _AsyncBridgedModelFacade,
  SyncClientUnavailableError) to top of file
- driven retryable-error parametrize off RETRYABLE_MODEL_ERRORS directly
  instead of the hand-rolled factory list, so new retryable types pick
  up coverage automatically
- dropped the redundant "Sanity" block in test_async_bridge_timeout_raises_
  model_timeout_error - pytest.raises already enforces the type, the
  duplicate block was running the same slow scenario twice

test_async_scheduler.py:
- parametrize over RETRYABLE_MODEL_ERRORS directly (same as above)

test_data_designer.py:
- added preview-path tests for the typed-error and partial-salvage
  fall-through cases
- updated the existing empty-DF test to also patch actual_num_records=0
  (otherwise the new gating in the empty-DF guard skips the typed error)

* test(interface): consolidate create() error-dispatch tests into a matrix

Five separate tests (two existing, three new from earlier in this PR)
all probed the same dispatch logic in create(): "given a load outcome
and a builder state, which error type should fire?" Pulled them into a
single parametrized matrix indexed by (load_side_effect, early_shutdown,
actual_num_records).

Net result: 5 named tests → 1 parametrized test with 6 cells, and the
previously-missing empty_df + shutdown + partial salvage cell is now
covered.

Test names retain readable IDs (load_fails_shutdown_zero_records etc.)
so failures still pinpoint the exact case in pytest output.
2026-04-30 14:43:35 -03:00
Eric W. Tramel
8be4ff787f
feat: add RunConfig jinja rendering engine (#557) 2026-04-17 15:06:27 -04:00
Johnny Greco
4a2813654a
fix: always return ISO-8601 from datetime postproc (#484) (#512)
* fix: always return ISO-8601 from datetime postproc (#484)

The DatetimeFormatMixin.postproc heuristics inferred output format from
value distribution, silently stripping date/time components for small
datasets or narrow date ranges. Replace with deterministic ISO-8601
output via vectorized strftime. Users who need custom formats can still
set convert_to on the SamplerColumnConfig.

* docs: update convert_to docstring and add DatetimeFormatMixin docstring

The SamplerColumnConfig.convert_to docstring incorrectly stated that
only "float", "int", or "str" are accepted. Datetime/timedelta samplers
accept strftime format strings. Also document the ISO-8601 default.

* test: add regression test for #484 via DataDesigner.preview API

Captures the exact reproducer from the issue: a single-record datetime
preview through the public DataDesigner.preview() interface must return
a full ISO-8601 timestamp, not a bare year string.

* test: trim redundant datetime tests, align reproducer with issue #484

- Remove postproc_same_day_records (subsumed by same_month + no_convert_to)
- Remove postproc_always_parseable (subsumed by stdlib_fromisoformat)
- Remove all_same_month integration test (subsumed by narrow_range_single_day)
- Update single_record test to use unit="h" matching the issue reproducer

* fix: address review nits — move datetime import to module scope, drop redundant isinstance
2026-04-09 12:50:40 -04:00
Eric W. Tramel
7891dd53cb
feat: add Hermes Agent rollout support (#500) 2026-04-07 12:39:49 -04:00
Eric W. Tramel
58870bb83f
feat: add ATIF rollout ingestion (#495) 2026-04-06 11:06:14 -04:00
Andre Manoel
c146af5146
chore: async engine follow-up - rename, preview, lifecycle, progress (#456)
* chore: rename ColumnWiseDatasetBuilder, wire async preview, unify row-group lifecycle

- Rename ColumnWiseDatasetBuilder to DatasetBuilder and
  column_wise_builder.py to dataset_builder.py, update all references
- Extract _prepare_async_run() factory shared by build and preview paths
- Add _build_async_preview() for async preview with no disk checkpoints
- Replace on_row_group_complete/on_checkpoint_complete with single
  on_finalize_row_group callback; caller handles checkpointing
- Add free_row_group() on RowGroupBufferManager for discard-without-write
- Free fully-dropped row groups instead of finalizing them
- Add consolidated AsyncProgressReporter for async generation logging

Closes #437, closes #442, closes #444

* feat: add consolidated async progress reporting with row group context

- Add AsyncProgressReporter: groups per-column progress into a single
  log block emitted at configurable intervals (default 5s)
- Add quiet mode to ProgressTracker to suppress per-tracker logging
  when used with the consolidated reporter
- Add ContextVar-based row group tagging (RG1, RG2, ...) for log
  messages emitted inside async tasks (samplers, expressions, seeds)
- Add progress_interval to RunConfig for user-configurable reporting
- Remove log_start_async from ProgressTracker (superseded by reporter)

Closes #443

* fix async preview and progress reporting

* feat: add opt-in sticky ANSI progress bars for generation

Add RunConfig.progress_bar setting that replaces periodic log-line
progress with sticky terminal bars that stay at the bottom while
logs scroll above. Pure ANSI escape codes, no new dependencies.

Disabled by default - existing log-based output unchanged.

* docs: add missing progress_interval docstring in RunConfig

* fix: update progress bar on every completion in async path

Skip the time gate when the progress bar is active so the bar
redraws on every record instead of every progress_interval seconds.

* fix: resolve row-group semaphore deadlock when all tasks are deferred

When all tasks for admitted row groups fail with transient errors,
the row-group semaphore never releases, blocking admission of new
row groups. Fix by salvaging stalled row groups inline - retrying
deferred tasks immediately so row groups can checkpoint and free
their semaphore slots.

Also updates row group log format to (x/X) with leading zeros.

* fix: eagerly salvage stalled row groups to avoid wasting semaphore slots

Run inline salvage after every checkpoint pass instead of only when
globally stalled. Row groups with 0 in-flight and only deferred tasks
are salvaged immediately, freeing their semaphore slot for new work.

* fix: address review findings from greptile and codex

- Use `with` statement for progress bar context (safe __exit__ on error)
- Check bar.is_active instead of bar is not None (non-TTY fallback)
- Record failures (not skips) for tasks that exhaust salvage retries
- Record skipped tasks when pre-batch filtering drops rows

* fix: stable progress bar width and accurate failure counts

- Pre-compute fixed stats width at bar creation to prevent bar
  resizing when failed count appears
- Cap displayed completed at total to avoid >100% on retries
- Exclude already-failed columns from skip recording to prevent
  double-counting in progress reporter

* fix: address Nabin's review - exclude_columns, dead code, docstring

- Add exclude_columns={task.column} on non-retryable batch/from_scratch
  drop path to prevent double-counting (same pattern as cell path)
- Simplify salvage drop to per-task exclude (is_dropped guard handles
  multi-column case)
- Remove dead _in_flight_for_rg method
- Fix context.py docstring to match actual (x/X) format

* fix: _drain_frontier exits before dispatching ready salvage tasks

After salvage discards a cell task from dispatched (making it
available in the frontier), _drain_frontier broke immediately
because nothing was in-flight yet. The task and its downstream
were never re-dispatched, leaving the row group incomplete.

Fix: only break when both ready and in-flight are empty.

* fix: salvage edge cases found by code review

- _drain_frontier: only break when both ready and in-flight are empty
- _salvage_rounds: re-mark sibling columns as dispatched after
  from_scratch retry to prevent duplicate dispatch
- _salvage_stalled_row_groups: separate exhausted tasks from new
  drain failures to avoid treating non-stalled tasks as permanent
- _checkpoint_completed_row_groups: clean up deferred tasks for
  checkpointed row groups
- Early shutdown: salvage stalled row groups before exiting

* fix: skip record_failure for already-dropped rows in salvage

When multiple columns fail for the same row, the first drop records
a skip for the other column. Without this guard, record_failure
fires again for the second column, double-counting it.
2026-03-25 14:50:48 -03:00
Eric W. Tramel
a0fb04ee07
feat: agent rollout trace ingestion (#399) 2026-03-20 09:17:35 -04:00
Eric W. Tramel
c88508790f
test: trim FileSystemSeedReader follow-up coverage (#432) 2026-03-19 09:47:55 -04:00
Eric W. Tramel
3d33680b7d
feat: support 1-to-many FileSystemSeedReader hydration (#424) 2026-03-17 20:56:19 -04:00
Eric W. Tramel
28c8345909
feat: add built-in filesystem seed readers (#421) 2026-03-16 17:40:27 -04:00
Nabin Mulepati
340087f4cf
fix: raise clear error when all records are dropped during generation (#383)
* fix: raise clear error when all records are dropped during generation (#382)

When every record fails during generation (e.g. LLM or image model errors),
the empty dataset previously reached the profiler, which raised a misleading
DatasetProfilerConfigurationError about missing columns. Now both preview()
and create() check for an empty dataset before profiling and raise a
DataDesignerGenerationError with an actionable message instead.

Made-with: Cursor

* fix: keep load_dataset_with_dropped_columns inside profiling try/except

Moves the call back inside the try/except block so ArtifactStorageError
is caught and re-raised as DataDesignerProfilingError, preserving the
documented API contract. Also reduces test num_records to 1.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* Address Andre's feedback

* more small feedback

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-10 10:18:16 -06:00
Johnny Greco
1439bbea7e
chore: Improve CLI startup with lazy heavy import cleanup (#330)
* perf: defer heavy imports to improve CLI startup time

Move expensive imports (engine, models, controllers) out of the module-level import path so that data-designer --help and other non-generation commands no longer pay the full startup cost.

Key changes:
- Defer controller imports to inside command functions
- Remove eager re-export chains from CLI package __init__ files
- Move default-settings bootstrap into load_config_builder() and DataDesigner.__init__() instead of running at import time
- Add lazy __getattr__ exports in interface/__init__.py
- Replace module-level tokenizer init with cached lazy getter
- Fix ModelProvider import to use config layer instead of engine
- Update test mock paths to match new import locations

Reduces CLI import-time from ~1.67s to ~0.46s.

* perf: defer pandas/numpy in io_helpers and add config_list benchmark

- Replace eager `from lazy_heavy_imports import pd, np` in io_helpers
  with module-level __getattr__ (for backwards-compatible external
  access / test mocks) and function-level imports in the 3 functions
  that actually use them (read_parquet_dataset, smart_load_dataframe,
  _convert_to_serializable). Importing io_helpers no longer triggers
  pandas/numpy loading.
- Defer heavy imports in list and reset CLI commands into function
  bodies to avoid loading repositories, Rich, and prompt_toolkit at
  module import time.
- Add `config_list` (data-designer config list) measurement to the
  CLI startup benchmark with isolated cold measurement in a separate
  venv and a --skip-config-list-check flag.
- Update test mock paths to match new import locations.

* Refine lazy import usage and TYPE_CHECKING cleanup

* Run license header updater on PR-touched files

* fix: update sqlfluff mock target for lazy imports in test_sql

* perf: cache globals() in lazy __getattr__ to avoid repeated lookups

Add globals() caching and explanatory comment to all three lazy
__getattr__ implementations (lazy_heavy_imports, config/__init__,
interface/__init__) so subsequent attribute accesses bypass __getattr__.

* perf: lazy CLI command loading and deferred heavy import evaluations

- Add LazyTyperGroup to defer command module loading until invocation, allowing module-level imports in all CLI command files

- Split DataFrameSeedSource into seed_source_dataframe.py to isolate pandas dependency from other seed source classes

- Move TypeVar/TypeAlias definitions (DataT, NumpyArray1dT, RadomStateT, EngineT) to TYPE_CHECKING blocks with runtime fallbacks

- Wrap module-level constants in lru_cache (phone_number parquet data, jsonschema validator) to defer I/O and heavy imports to first use

- Update test mock targets to patch at usage-site for module-level imports

* refactor: use direct pandas import in seed_source_dataframe

Drop lazy-loading for pandas in DataFrameSeedSource; use direct import
for simplicity.

* update lazy import pattern

* update tests to use lazy import namespace

Switch test modules to import data_designer.lazy_heavy_imports as lazy
and reference heavy libraries through that namespace. This keeps heavy
imports deferred during module import and aligns tests with the new
lazy-import usage pattern.

* tighten import perf test thresholds

Document recent baseline timings and lower the allowed average
import time and timeout so regressions are detected sooner.

* document pandas import requirement

Clarify that Pydantic needs DataFrame resolved at module load and
that keeping the direct import preserves IDE typing support.

* increase timeout time

* use lazy pandas imports in visualization tests

- replace direct pandas usage with lazy.pd in visualization tests to avoid eager imports
- add TYPE_CHECKING pandas import and keep CLI controller imports sorted

* fix lazy pandas runtime usage and preview mocks

Switch sample-record handling to lazy pandas types so runtime paths no longer
depend on TYPE_CHECKING imports. Align preview controller tests to patch the
module-local DataDesigner symbol, preventing real engine invocation in save
results scenarios.
2026-02-18 16:24:15 -05:00
Nabin Mulepati
1f172eaa28
chore: move ArtifactStorage to engine/storage/ module (#321) 2026-02-12 14:58:12 -07:00
Andre Manoel
429b558588
refactor: callback-based processor design (#294) 2026-02-11 21:32:24 -03:00
Johnny Greco
c19f35639f
chore: add publish script and update license headers (#253) 2026-01-28 08:47:34 -05:00
Johnny Greco
ae0665fa16
refactor: slim package refactor into three subpackages (#240)
* remove old structure

* major shuffle

* streamline project configs

* update make commands

* updates to make commands

* remove essentials

* initialize logger in interface

* uv lock

* ignore notepad

* update workflows

* fix e2e project config

* generate colab notebooks

* resolve default model settings in interface

* fix build commands

* update perf import make command

* cleaning up some slop

* update recipes

* move conftest files to tests/

* update subpackage readmes

* streamline config_logging

* use exports

* update perf import usage pattern

* update for IDE behavior with ruff

* remove engine's fixtures file

* add note to about lazy imports

* update dependencies

* update docs

* doc fixes

* uv lock

* updates to catch up with main

* clean up makefile

* remove package gitignores

* define deps only once

* isolate tests

* add test for protetion rule

* create temp dirs for isolated tests

* catch up to main

* update headers

* re apply changes

* better result summaries for isolated tests

* move exports into top-level init

* fix client importlib version syntax

* catch up with main
2026-01-27 13:53:20 -05:00