DataDesigner/tests_e2e/src
Andre Manoel 61cdeefb17
Some checks failed
CI / End to end test (Python 3.13 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Lint and Format Check (push) Waiting to run
CI / Check License Headers (push) Waiting to run
CI / Test Config (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Config (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Config (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Config (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Config (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Config (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Config (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Config (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Engine (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on macos-latest) (push) Waiting to run
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Waiting to run
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Waiting to run
CI / Coverage Check (Python 3.11) (push) Waiting to run
CI / End to end test (Python 3.10 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.11 on macos-latest) (push) Waiting to run
CI / End to end test (Python 3.12 on macos-latest) (push) Waiting to run
CI / Test (Python 3.10 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on macos-latest) (push) Blocked by required conditions
CI / Test (Python 3.10 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.11 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.12 on ubuntu-latest) (push) Blocked by required conditions
CI / Test (Python 3.13 on ubuntu-latest) (push) Blocked by required conditions
Publish devnotes / deploy (push) Has been cancelled
feat: make async engine the default execution path (#592)
* feat: make async engine the default execution path

The async engine has been hardening as opt-in for several releases. Make it
the default and address the prerequisites flagged for the flip.

Default flip
- DATA_DESIGNER_ASYNC_ENGINE defaults to "1" at both consumption sites
- Set DATA_DESIGNER_ASYNC_ENGINE=0 for one transitional release to opt out
- allow_resize=True still falls back to sync with a DeprecationWarning

Python 3.10 support
- Replace asyncio.TaskGroup (3.11+) in async_concurrency.py with
  gather-with-explicit-cancel; semantics preserved because _run_task already
  swallows its own exceptions and uses _shutdown_event for sibling cancellation
- Remove the sys.version_info < (3, 11) runtime guard
- Remove the matching pytest skipif so the executor tests run on 3.10 too

Derived timeouts (replaces two hardcoded 300s constants)
- ThrottleManager.acquire_sync/async default to timeout=None (no deadline)
  instead of DEFAULT_ACQUIRE_TIMEOUT=300; HTTP request timeout already bounds
  actual work, queue waits scale with provider speed and AIMD
- _AsyncBridgedModelFacade derives the sync->async bridge timeout from the
  model's inference_parameters.timeout and the call's max_correction_steps;
  one knob (per-model timeout) drives both deadlines, no new config surface
- Add ModelFacade.request_timeout property so the bridge can read the
  effective timeout the client is configured with

Root-cause surfacing
- AsyncTaskScheduler captures the first non-retryable error and exposes it
  via first_non_retryable_error
- Interface threads it through DataDesignerGenerationError when 0 records
  are produced without early-shutdown, so deterministic failures (e.g. bad
  seed sources) surface their original message instead of a wrapped
  FileNotFoundError on the parquet path

Tests
- New: throttle no-deadline default behavior (sync+async), parametrized
  derived bridge timeout, restored async_concurrency tests on 3.10
- Updated: test_dataset_builder.py uses an autouse fixture to pin its
  Mock-based tests to the sync engine they cover; existing bridge tests
  set facade.request_timeout for the new derivation

Docs
- Replace the stale LiteLLM security notice in README with a short
  async-default heads-up and link to the migration guide
- Add docs/migration-async-default.md covering per-model timeouts,
  custom-column thread safety, mocking model calls, run outcomes, and
  the opt-out
- Append a short Update section to the async-all-the-way-down dev note

* test: extract _compute_bridge_timeout helper for direct testing

The parametrized bridge-timeout test was patching ``concurrent.futures.Future.result``
to capture the timeout the bridge passed in. That reaches into stdlib internals
(DEVELOPMENT.md "Mock at boundaries: Keep mocking shallow") and the ``ids=`` argument
on the parametrize was missing.

Extracts the formula into a module-level ``_compute_bridge_timeout`` helper. The test
now calls the helper directly with no mocking, and the parametrize gets readable ids.
Behavior is unchanged.

* test(e2e): align demo plugins with async engine contracts

The e2e demo plugins exercise plugin discovery and full DD lifecycle. Two
of them were written against sync-engine semantics that the async engine
restricts:

- DemoColumnGeneratorImpl was a ColumnGeneratorFullColumn with no
  required_columns. The async engine routes ``no-upstream`` columns
  through the from-scratch path, which passes an empty DataFrame to
  generators that aren't FromScratchColumnGenerator subclasses. The
  generator then produces 0 rows and the scheduler raises
  ``update_batch received 0 values``. Switching the plugin to
  FromScratchColumnGenerator with generate_from_scratch(num_records)
  matches what the plugin actually does (produces a constant column
  without input) and works on both engines.

- RegexFilterProcessor implemented process_before_batch with row-count
  changes. The async engine enforces row-count invariance in pre- and
  post-batch processor stages by design. Moving the filter to
  process_after_generation preserves the plugin's purpose (regex-based
  row filtering) at a stage that supports row-count changes on both
  engines. Test assertions check the final dataset, so the stage shift
  is transparent.

Both changes are demo-plugin updates only; no production code change.

* fix: address Codex review findings on async-default flip

Three bugs and two test-quality concerns surfaced by an independent review of
the prior commits. Each was real and worth fixing in the flip PR.

Bug fixes
- Sync-fallback path was creating async-only model clients. The default flip
  meant ``client_concurrency_mode = ASYNC`` for every default run, but the
  ``allow_resize=True`` path falls back to the sync engine — sync ``model.generate()``
  calls then hit ``SyncClientUnavailableError``. The resolution decision now
  lives at the DataDesigner interface level via
  ``_resolve_client_concurrency_mode``: it considers both the env var and the
  config (allow_resize forces sync clients) and is passed explicitly to
  ``create_resource_provider``. Direct callers of the factory still get the
  env-var default.

- Sync→async bridge timeout ignored the per-call ``timeout=`` override. A
  custom column calling ``model.generate(timeout=600)`` against a slow endpoint
  was being cancelled at the model-config default, not 600s. The bridge now
  prefers ``kwargs.get("timeout")`` over ``facade.request_timeout``.

- Bridge timeout formula missed ``max_conversation_restarts``. One logical
  generation can do ``(1 + max_conversation_restarts) × (1 + max_correction_steps)``
  HTTP requests; the formula now multiplies both, matching the worst-case
  attempt budget.

Engine routing fix (also surfaced by failing e2e plugin tests)
- ``_run_from_scratch`` else-branch passed an empty DataFrame to non-FromScratch
  generators classified as seeds (no upstream columns), so ``ColumnGeneratorFullColumn``
  with no required_columns produced 0 rows for an ``rg_size``-row buffer. Now
  passes an ``rg_size``-row snapshot of the row-group buffer, mirroring the
  sync engine's FULL_COLUMN contract.
- The earlier ``DemoColumnGeneratorImpl`` workaround (rewrite as ``FromScratchColumnGenerator``)
  is reverted; the engine fix subsumes it. The processor-plugin fix
  (``process_after_generation`` for the regex filter) stays — pre-batch
  row-count change is intentionally rejected by the async engine.

Test improvements
- Throttle no-deadline tests are parametrized over ``(timeout=0.0, raises)``
  and ``(timeout=None, waits)``, pinning that ``None`` is genuinely distinct
  from any finite default. Sync and async counterparts mirror.
- New regression tests for ``first_non_retryable_error`` surfacing covering
  both load-raises and load-returns-empty paths, asserting the original
  exception is chained via ``__cause__`` and that the typed
  ``DataDesignerEarlyShutdownError`` doesn't fire in this branch.
- New parametrized regression test for ``_resolve_client_concurrency_mode``
  covering all four (env × allow_resize) combinations.
- New parametrized test for the per-call ``timeout=`` override flowing into
  the bridge timeout calculation.
- Bridge formula tests extended with ``max_conversation_restarts`` cases.

* test: trim redundant parametrize cases in async-default tests

Three parametrize cases were duplicating coverage already provided by
existing standalone tests:

- ``test_acquire_*_timeout_branches`` parametrized over ``(0.0, raises)``
  and ``(None, waits)``. The ``raises`` half duplicates
  ``test_acquire_*_raises_timeout_when_at_capacity``. Replaced with two
  focused ``..._default_no_deadline_waits_for_release`` tests covering
  only the no-deadline branch.

- ``test_resolve_client_concurrency_mode_matches_engine_choice`` had four
  cases. The ``async-off + allow-resize`` case asserts ``SYNC`` because the
  env var alone forces it; the allow_resize input is moot. Dropped.

- ``test_async_bridge_honors_per_call_timeout`` had three cases. The
  "override below floor" case cross-products the per-call override flow
  with the floor-clamping behavior already covered by
  ``test_compute_bridge_timeout``. Dropped.

Net: -25 lines of test code with no loss of essential coverage.

* docs: fold migration page into existing concept docs

The standalone ``Migrating to the async default`` page didn't fit the
existing docs style — present tense, behavior over comparisons, content
in the natural concept home. Folding it in:

- ``architecture-and-performance.md`` gets a new ``Async Engine`` section
  covering per-model timeouts, run outcomes (partial completion +
  ``DataDesignerEarlyShutdownError``), and the transitional opt-out.
  Three stale ``async engine is landing soon`` callouts updated to
  reflect the flip.
- ``custom_columns.md`` gets two short notes: a thread-safety callout
  near Generation Strategies, and a mocking-with-spec note in
  Development Testing.
- ``async-all-the-way-down.md`` Update section now points at the new
  arch-and-perf section.
- README heads-up links to the same anchor.
- ``migration-async-default.md`` removed; mkdocs.yml entry dropped.

* docs: frame Execution Model as sync-engine specifics

Small targeted edits to make the user-facing concept docs consistent
with the post-flip state. No restructuring.

- ``architecture-and-performance.md``: the ``Execution Model`` callout
  now opens with two engines, links to the new ``Async Engine`` section,
  and frames the existing column-at-a-time description as sync-engine
  semantics. The ``Step 2: Process columns sequentially`` paragraph notes
  the async engine relaxes this. The ``Key Concepts`` table differentiates
  per-engine for ``Batching`` and ``Sequential columns``; ``Parallel cells``
  is the same on both.
- ``processors.md``: added a warning callout about the async engine's
  row-count invariance in pre- and post-batch stages, with the guidance
  to use ``process_after_generation()`` for row-filtering or expansion.

* fix: address review nits from PR #592 (Nabin)

Four targeted fixes from the review.

Worth-addressing (warning):
- ``test_acquire_async_default_no_deadline_waits_for_release`` was
  spawning the release task without holding a strong reference. The
  loop's weak-ref bookkeeping could GC it before the inner ``await``
  observes the release, producing a CI flake. Hold the task and
  ``await`` it in ``finally``.

Take-it-or-leave-it (applied):
- Root-cause error surfacing now includes the exception type name:
  ``f"🛑 {type(root_cause).__name__}: {root_cause}"`` so users see
  ``ValueError: ...`` instead of just the message string. The
  ``__cause__`` chain is preserved either way.
- Drop the defensive ``getattr(c, "allow_resize", False)`` in
  ``_resolve_client_concurrency_mode`` — every member of
  ``ColumnConfigT`` inherits ``allow_resize: bool = False`` from
  ``SingleColumnConfig``.
- One-line comment near the root-cause surfacing branch noting that
  ``actual_num_records == 0`` is async-only (sync runs leave it at
  ``-1``), so the branch is async-only by construction.

Not addressed in this PR (filing as follow-ups):
- ``SYNC_BRIDGE_TIMEOUT = 300`` still hardcoded in
  ``column_generators/generators/base.py:_run_coroutine_sync``. That
  bridge has no model-facade context to derive a timeout from, so the
  fix is a structural refactor outside this PR's scope.
- First-error capture loses subsequent-error context. The "first wins"
  heuristic is documented; richer aggregation is a follow-up.

* fix: drop SYNC_BRIDGE_TIMEOUT in _run_coroutine_sync

This was the third hardcoded 300s timeout (Nabin flagged it on PR #592).
The path is the generic sync→async bridge in ``ColumnGenerator.generate()``:
when a subclass overrides only ``agenerate()``, the sync entry point runs
the coroutine in a background thread.

Same philosophy we applied to the throttle queue wait elsewhere in the
PR: a defensive deadline on top of work that's already bounded by the
HTTP timeout doesn't add safety, it just produces spurious failures on
slow self-hosted endpoints. Drop the constant, the timeout exception
handling, and the ``timed_out`` bookkeeping. ``pool.shutdown(wait=True)``
becomes the simple cleanup.

Tests in ``test_async_generators.py`` exercise the happy path only and
don't depend on the timeout firing.

* Revert "fix: drop SYNC_BRIDGE_TIMEOUT in _run_coroutine_sync"

This reverts commit 7a0b77d44c.

* docs+feat: deprecate the sync-engine opt-out path

Nabin asked whether the docs should adopt explicit "deprecation" language
on the opt-out path. Doing both:

- Doc: ``architecture-and-performance.md``'s ``Opting out`` section now
  uses an ``!!! warning "Deprecated"`` admonition that names the env var
  as a deprecated escape hatch and notes the run-time warning.
- Code: ``DataDesigner._resolve_client_concurrency_mode`` emits a
  ``DeprecationWarning`` when ``DATA_DESIGNER_ASYNC_ENGINE=0`` is detected.
  Same precedent as the existing ``allow_resize=True`` warning. Auto-fallback
  via ``allow_resize`` does not double-warn here; the builder layer emits
  its own warning later.
- Test: parametrized regression now asserts ``pytest.warns(DeprecationWarning)``
  on the opt-out branch and treats any warning on the async-on branches as
  a failure (``simplefilter("error")`` inside the ``catch_warnings`` block).

* fix: emit logger.warning alongside DeprecationWarning on env-var opt-out

Parity fix from Nabin's re-review of PR #592. The ``allow_resize=True``
auto-fallback path in ``_resolve_async_compatibility`` emits both a
``logger.warning("⚠️ ...")`` and a ``DeprecationWarning``. The new
``DATA_DESIGNER_ASYNC_ENGINE=0`` opt-out path was only emitting the
``DeprecationWarning``, leaving users who run with default warning
filters silenced and inconsistent with the established precedent.

Match the pattern: same message body, both signals, same stacklevel.

* docs: breadcrumb explaining why SYNC_BRIDGE_TIMEOUT survives PR #592

Nabin's re-review pointed out that ``base.py`` is the lone place where
the 300s pattern survives, while ``custom.py`` and ``throttle_manager.py``
both retired theirs. Without a comment, a future reader (or a lint sweep)
could mistake this for an oversight and "consistency-fix" it the wrong way.

Add a short note at the constant naming the two retired siblings, the
reason this one stayed (no ``ModelFacade`` context to derive from), and
the fact that it's tracked for a structural follow-up.
2026-05-04 16:22:13 -03:00
..
data_designer_e2e_tests feat: make async engine the default execution path (#592) 2026-05-04 16:22:13 -03:00