mirror of
https://github.com/NVIDIA-NeMo/DataDesigner
synced 2026-05-24 09:48:29 +00:00
Some checks failed
CI / Test (Python 3.10 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.11 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.12 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Lint and Format Check (push) Has been cancelled
CI / Check License Headers (push) Has been cancelled
CI / Test Config (Python 3.10 on macos-latest) (push) Has been cancelled
CI / Test Config (Python 3.11 on macos-latest) (push) Has been cancelled
CI / Test Config (Python 3.12 on macos-latest) (push) Has been cancelled
CI / Test Config (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test Config (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test Config (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test Config (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test Config (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.10 on macos-latest) (push) Has been cancelled
CI / Test Engine (Python 3.11 on macos-latest) (push) Has been cancelled
CI / Test Engine (Python 3.12 on macos-latest) (push) Has been cancelled
CI / Test Engine (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test Engine (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test Engine (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.10 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.11 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.12 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.13 on macos-latest) (push) Has been cancelled
CI / Test Interface (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / Test Interface (Python 3.13 on ubuntu-latest) (push) Has been cancelled
CI / Coverage Check (Python 3.11) (push) Has been cancelled
CI / End to end test (Python 3.10 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.11 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.12 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.13 on macos-latest) (push) Has been cancelled
CI / End to end test (Python 3.10 on ubuntu-latest) (push) Has been cancelled
CI / End to end test (Python 3.11 on ubuntu-latest) (push) Has been cancelled
CI / End to end test (Python 3.12 on ubuntu-latest) (push) Has been cancelled
CI / End to end test (Python 3.13 on ubuntu-latest) (push) Has been cancelled
* feat(models): deprecate implicit default provider routing
Emit DeprecationWarning whenever the legacy "implicit default
provider" path is exercised: `ModelConfig.provider=None`, the
registry-level `ModelProviderRegistry.default`, the YAML
`default:` key in `~/.data-designer/model_providers.yaml`, and
the CLI's "Change default provider" workflow.
`resolve_model_provider_registry` skips passing `default=` in the
single-provider case so the common construction path stays quiet.
Multi-provider registries still pass `default` (per
`check_implicit_default`) and warn accordingly.
Update docs, the package README, and test fixtures to specify
`provider=` explicitly on every `ModelConfig`. New tests cover
each warning entry point and pin the post-deprecation happy paths.
Refs #589
Made-with: Cursor
* fix(models): address PR #594 review feedback
Greptile P1: ProviderRepository.load emitted its DeprecationWarning
inside a `try/except Exception` block. Under
`filterwarnings("error", DeprecationWarning)` the warn would raise,
the except would swallow it, and `load()` would silently return None
(losing the registry). Move the warn outside the catch-all so the
strict-warning path no longer drops valid configs.
Greptile P2 / johnnygreco: `_warn_on_implicit_provider` and
`_warn_on_explicit_default` use `stacklevel=2`, which lands inside
pydantic v2's validator dispatch rather than at the user's
`ModelConfig(...)` / `ModelProviderRegistry(...)` call. That broke
both attribution (the source line was unhelpful) and Python's
once-per-location dedup (every call collapsed to the same
pydantic-internal key, suppressing all but the first warning).
Introduce `data_designer.config.utils.warning_helpers.warn_at_caller`,
which walks past the helper, validator, and any pydantic frames to
find the user's call site and emits via `warnings.warn_explicit` with
the user frame's `__warningregistry__`. Keeps attribution accurate
and dedup keyed on the user's (filename, lineno).
johnnygreco: align the `provider_repository.py` warning copy with the
sibling site in `default_model_settings.py` ("specify provider=
explicitly on each ModelConfig instead") so both YAML-default warning
sites give the same migration instruction. The previous wording
pointed users at "ModelConfig entries" inside `model_providers.yaml`,
where ModelConfig entries don't actually live.
johnnygreco: dedup the cascade in `DataDesigner.__init__`. With
`model_providers=None` and a YAML `default:`, the user previously saw
two DeprecationWarnings for the same root cause —
`get_default_provider_name()` warns about the YAML key, then
`resolve_model_provider_registry(...)` re-warns from
`_warn_on_explicit_default`. Suppress the registry-level duplicate in
the YAML-fallback branch via `warnings.catch_warnings()` so users see
exactly one warning per user action.
johnnygreco: tighten `_warn_on_explicit_default` to fire only when
`default is not None`. Passing `default=None` explicitly is
semantically equivalent to omitting it (caller is opting *out* of a
registry-level default), and shouldn't trigger the deprecation
nudge.
johnnygreco: add a `model_validate({...})` regression test for
`ModelConfig` so the deserialization path (legacy on-disk configs)
is pinned alongside the construction path.
Tests:
- Update `test_load_exists` and `test_save` to omit `default=` so the
roundtrip stops exercising the deprecated YAML-default path
unguarded (Greptile note).
- Wrap `test_resolve_model_provider_registry_with_explicit_default`,
`test_get_provider`, and
`test_init_user_supplied_providers_preserve_first_wins_over_yaml_default`
in `pytest.warns` so the suite stays green under
`-W error::DeprecationWarning` (Greptile note).
- Add `test_explicit_default_none_does_not_emit_deprecation_warning`
to pin the tightened predicate.
- Add `test_init_yaml_default_emits_single_deprecation_warning` to
pin the cascade-dedup behavior.
Refs #589
Made-with: Cursor
* fix(models): make deprecation warnings visible under default filters
andreatgretel (PR #594): the YAML-default warning in
`get_default_provider_name` and the registry-default warning emitted
from inside DataDesigner helpers were attributing to data_designer
library frames, not user code. Python's default filter chain includes
`ignore::DeprecationWarning`, so library-attributed entries are
silenced — meaning a normal `DataDesigner()` call with a YAML
`default:` set showed nothing, and `resolve_model_provider_registry`
warnings were similarly invisible. Two related changes:
1. `warn_at_caller`: extend the default skip-list from `("pydantic",)`
to `("pydantic", "pydantic_core", "data_designer")` so the walk
escapes both pydantic's validator-dispatch frames and data_designer
helper frames before attributing. Also tighten the prefix predicate
to exact-or-dotted-prefix matching (`name == p or
name.startswith(p + ".")`) so e.g. `pydantic_helpers` is not
falsely matched as part of `pydantic` (johnnygreco nit). Allow
callers to pass a custom `skip_prefixes` for flexibility. Drop the
"skip frame 0+1 unconditionally" guard now that prefix matching
covers it.
2. `get_default_provider_name`: switch from
`warnings.warn(stacklevel=2)` to `warn_at_caller`. The previous
stacklevel pointed into `default_model_settings.py`, which is a
library file → silenced under default filters. Verified the fix
empirically with `python -W default`: warning is now attributed to
the user's call site and rendered.
johnnygreco (PR #594): add the missing
`test_explicit_default_none_does_not_emit_deprecation_warning`
regression for the `self.default is not None` predicate landed in
the prior round.
Tests:
- New `test_warning_helpers.py` pins prefix-matching precision
(rejects `pydantic_helpers` / `data_designer_other`), default
skip-list contents, attribution past skip-prefix frames, and
per-call-site dedup behavior.
- `test_get_default_provider_name_warning_attributes_to_user_frame`
pins andreatgretel's repro for the YAML-default site.
- `test_explicit_default_warning_attributes_to_user_frame` pins the
multi-frame case: construction goes through
`resolve_model_provider_registry`, so the walk has to escape both
pydantic and data_designer before landing on the test file.
- `test_explicit_default_none_does_not_emit_deprecation_warning`
pins johnnygreco's predicate-tightening regression.
3,124 tests pass (540 config + 1,923 engine + 653 interface; +10 net
from this round).
Refs #589
Made-with: Cursor
* fix(models): apply warn_at_caller to remaining deprecation sites
greptile-apps (PR #594, r3189904028): `ProviderRepository.load`'s
YAML-default `DeprecationWarning` was using `warnings.warn(stacklevel=2)`,
which attributes to whichever data_designer frame called `load()` —
controllers, services, list/reset commands, agent introspection. Every
real call path lands on `data_designer.cli.*`, which falls under
Python's default `ignore::DeprecationWarning` filter and is silenced.
Audit found two more sites with the same problem:
- `DatasetBuilder._resolve_async_compatibility` (`allow_resize` /
issue #552) — was using `stacklevel=4` to walk past
`_resolve_async_compatibility -> build/build_preview -> interface ->
user`. Brittle: any added frame (decorator, async wrapping, the
`try/except DeprecationWarning: raise` boundary) shifts attribution
silently. The existing test passed only because it used
`simplefilter("always") + record=True`, which records warnings
regardless of attribution.
- `ProviderController._handle_change_default` — was using
`stacklevel=2`, which lands on the menu dispatcher in the same
controller module. `print_warning` already shows the message
visually, but programmatic observers (`pytest.warns`,
`filterwarnings("error", ...)`) saw a library-attributed entry that
default filters silenced.
All three migrated to `warn_at_caller` (the helper from 247fa30) so
attribution lands on the user's call site regardless of internal
chain shape. `data_designer` is already in
`DEFAULT_INTERNAL_PREFIXES`, so the walk escapes the entire library
in one pass.
Added attribution regression tests at each site asserting
`warning.filename == __file__`. A future regression to
`warnings.warn(stacklevel=N)` now fails CI instead of silently
silencing the user-facing nudge:
- `test_load_with_yaml_default_attributes_warning_to_caller`
(test_provider_repository.py)
- `test_resolve_async_compatibility` extended with the same assertion
- `test_handle_change_default_emits_deprecation_warning` rewritten
from `pytest.warns(...)` to a `catch_warnings(record=True)` block
that filters for the message and asserts `filename == __file__`
(`pytest.warns` does not check attribution, so the rewrite is
required to actually catch the regression).
3,125 tests pass (548 config + 1,923 engine + 654 interface).
Refs #589
228 lines
9.4 KiB
Python
228 lines
9.4 KiB
Python
# SPDX-FileCopyrightText: Copyright (c) 2025-2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
|
|
# SPDX-License-Identifier: Apache-2.0
|
|
|
|
import warnings
|
|
|
|
import pytest
|
|
|
|
from data_designer.config.mcp import LocalStdioMCPProvider
|
|
from data_designer.engine.errors import NoModelProvidersError
|
|
from data_designer.engine.model_provider import (
|
|
MCPProviderRegistry,
|
|
ModelProvider,
|
|
ModelProviderRegistry,
|
|
UnknownProviderError,
|
|
resolve_model_provider_registry,
|
|
)
|
|
|
|
|
|
@pytest.fixture
|
|
def stub_foo_provider():
|
|
return ModelProvider(name="foo", endpoint="https://foo.com", provider_type="foo")
|
|
|
|
|
|
@pytest.fixture
|
|
def stub_bar_provider():
|
|
return ModelProvider(name="bar", endpoint="https://bar.com", provider_type="bar")
|
|
|
|
|
|
def test_must_have_at_least_one_provider():
|
|
with pytest.raises(ValueError):
|
|
ModelProviderRegistry(providers=[], default="a")
|
|
|
|
with pytest.raises(ValueError):
|
|
ModelProviderRegistry(providers=[])
|
|
|
|
|
|
def test_defined_default_must_exist(stub_foo_provider: ModelProvider):
|
|
with pytest.raises(ValueError):
|
|
ModelProviderRegistry(providers=[stub_foo_provider], default="bar")
|
|
|
|
|
|
def test_multiple_providers_requires_explicit_default(
|
|
stub_foo_provider: ModelProvider, stub_bar_provider: ModelProvider
|
|
):
|
|
with pytest.raises(ValueError):
|
|
ModelProviderRegistry(providers=[stub_foo_provider, stub_bar_provider])
|
|
|
|
|
|
def test_implicit_default(stub_foo_provider: ModelProvider):
|
|
registry = ModelProviderRegistry(providers=[stub_foo_provider])
|
|
|
|
assert registry.get_provider(None) == stub_foo_provider
|
|
|
|
|
|
def test_no_duplicate_provider_names(stub_foo_provider: ModelProvider):
|
|
with pytest.raises(ValueError):
|
|
ModelProviderRegistry(providers=[stub_foo_provider, stub_foo_provider], default="foo")
|
|
|
|
|
|
def test_get_provider(stub_foo_provider: ModelProvider, stub_bar_provider: ModelProvider):
|
|
# Multi-provider construction with an explicit default exercises the #589
|
|
# deprecation path; wrap so this test stays green if the project ever runs
|
|
# with ``-W error::DeprecationWarning``.
|
|
with pytest.warns(DeprecationWarning, match="ModelProviderRegistry.default is deprecated"):
|
|
registry = ModelProviderRegistry(
|
|
providers=[stub_foo_provider, stub_bar_provider],
|
|
default="foo",
|
|
)
|
|
|
|
assert registry.get_provider(None) == stub_foo_provider
|
|
assert registry.get_provider("foo") == stub_foo_provider
|
|
assert registry.get_provider("bar") == stub_bar_provider
|
|
|
|
with pytest.raises(UnknownProviderError):
|
|
registry.get_provider("quux")
|
|
|
|
|
|
def test_resolve_model_provider_registry(stub_foo_provider: ModelProvider) -> None:
|
|
"""Test resolve_model_provider_registry creates a registry from providers."""
|
|
registry = resolve_model_provider_registry([stub_foo_provider])
|
|
|
|
assert len(registry.providers) == 1
|
|
assert registry.get_default_provider_name() == "foo"
|
|
|
|
|
|
def test_resolve_model_provider_registry_with_explicit_default(
|
|
stub_foo_provider: ModelProvider, stub_bar_provider: ModelProvider
|
|
) -> None:
|
|
"""Test resolve_model_provider_registry with explicit default.
|
|
|
|
The multi-provider/explicit-default path is the deprecated one (see #589),
|
|
so the construction emits a ``DeprecationWarning``. Wrap the call in
|
|
``pytest.warns`` so this test stays green if the project ever runs under
|
|
``-W error::DeprecationWarning``.
|
|
"""
|
|
with pytest.warns(DeprecationWarning, match="ModelProviderRegistry.default is deprecated"):
|
|
registry = resolve_model_provider_registry([stub_foo_provider, stub_bar_provider], default_provider_name="bar")
|
|
|
|
assert registry.get_default_provider_name() == "bar"
|
|
|
|
|
|
def test_resolve_model_provider_registry_empty_error() -> None:
|
|
"""Test resolve_model_provider_registry raises error for empty providers."""
|
|
with pytest.raises(NoModelProvidersError, match="At least one model provider"):
|
|
resolve_model_provider_registry([])
|
|
|
|
|
|
def test_explicit_default_emits_deprecation_warning(stub_foo_provider: ModelProvider) -> None:
|
|
"""Regression for #589: passing ``default=`` explicitly to ``ModelProviderRegistry``
|
|
must emit a ``DeprecationWarning``. The registry-level default field is on its
|
|
way out; users should specify ``provider=`` per ``ModelConfig`` instead.
|
|
"""
|
|
with pytest.warns(DeprecationWarning, match="ModelProviderRegistry.default is deprecated"):
|
|
ModelProviderRegistry(providers=[stub_foo_provider], default="foo")
|
|
|
|
|
|
def test_no_default_does_not_emit_deprecation_warning(stub_foo_provider: ModelProvider) -> None:
|
|
"""Pin the post-deprecation happy path: omitting ``default=`` (single-provider
|
|
case) must NOT emit a warning, since callers haven't opted into the deprecated
|
|
field.
|
|
"""
|
|
with warnings.catch_warnings():
|
|
warnings.simplefilter("error", DeprecationWarning)
|
|
ModelProviderRegistry(providers=[stub_foo_provider])
|
|
|
|
|
|
def test_explicit_default_none_does_not_emit_deprecation_warning(stub_foo_provider: ModelProvider) -> None:
|
|
"""Pin the predicate tightening from PR #594 review: passing ``default=None``
|
|
explicitly is semantically equivalent to omitting it (caller is opting *out*
|
|
of a registry-level default), so the deprecation must NOT fire.
|
|
"""
|
|
with warnings.catch_warnings():
|
|
warnings.simplefilter("error", DeprecationWarning)
|
|
ModelProviderRegistry(providers=[stub_foo_provider], default=None)
|
|
|
|
|
|
def test_explicit_default_warning_attributes_to_user_frame(
|
|
stub_foo_provider: ModelProvider, stub_bar_provider: ModelProvider
|
|
) -> None:
|
|
"""Regression for PR #594 review (andreatgretel): the ``default=`` deprecation
|
|
warning must attribute to the *user's* call site, not the pydantic-internal
|
|
or ``data_designer`` library frame that emits it. Library-attributed
|
|
``DeprecationWarning`` entries are silenced under Python's default
|
|
``ignore::DeprecationWarning`` filter, so attribution determines whether
|
|
the warning is actually visible.
|
|
|
|
Construction goes through ``resolve_model_provider_registry`` so the walk
|
|
has to escape both pydantic (validator dispatch) and ``data_designer``
|
|
(the helper that builds the registry) before landing on the test frame.
|
|
"""
|
|
with warnings.catch_warnings(record=True) as caught:
|
|
warnings.simplefilter("always", DeprecationWarning)
|
|
resolve_model_provider_registry([stub_foo_provider, stub_bar_provider], default_provider_name="bar")
|
|
|
|
matches = [w for w in caught if "ModelProviderRegistry.default is deprecated" in str(w.message)]
|
|
assert len(matches) == 1, [str(w.message) for w in caught]
|
|
assert matches[0].filename == __file__, (
|
|
f"Warning attributed to {matches[0].filename!r} (line {matches[0].lineno}) "
|
|
f"instead of the test file. Library-attributed DeprecationWarnings are "
|
|
f"silenced under default filters."
|
|
)
|
|
|
|
|
|
def test_resolve_single_provider_quiet_under_deprecation(stub_foo_provider: ModelProvider) -> None:
|
|
"""Pin the q3 tweak: ``resolve_model_provider_registry`` skips ``default=``
|
|
in the single-provider case so common construction paths stay quiet under
|
|
the #589 deprecation warning.
|
|
"""
|
|
with warnings.catch_warnings():
|
|
warnings.simplefilter("error", DeprecationWarning)
|
|
registry = resolve_model_provider_registry([stub_foo_provider])
|
|
|
|
assert registry.get_default_provider_name() == "foo"
|
|
|
|
|
|
def test_resolve_multi_provider_emits_deprecation_warning(
|
|
stub_foo_provider: ModelProvider, stub_bar_provider: ModelProvider
|
|
) -> None:
|
|
"""Multi-provider registries currently require ``default``, so
|
|
``resolve_model_provider_registry`` keeps passing it. That construction
|
|
path is the deprecated one users should migrate off; the warning fires
|
|
accordingly.
|
|
"""
|
|
with pytest.warns(DeprecationWarning, match="ModelProviderRegistry.default is deprecated"):
|
|
resolve_model_provider_registry([stub_foo_provider, stub_bar_provider])
|
|
|
|
|
|
def test_mcp_provider_registry_empty() -> None:
|
|
"""Test MCPProviderRegistry can be created empty."""
|
|
registry = MCPProviderRegistry()
|
|
|
|
assert len(registry.providers) == 0
|
|
|
|
|
|
def test_mcp_provider_registry_with_providers() -> None:
|
|
"""Test MCPProviderRegistry with providers."""
|
|
provider = LocalStdioMCPProvider(name="test-provider", command="test-cmd")
|
|
registry = MCPProviderRegistry(providers=[provider])
|
|
|
|
assert len(registry.providers) == 1
|
|
assert registry.get_provider("test-provider") == provider
|
|
|
|
|
|
def test_mcp_provider_registry_duplicate_names() -> None:
|
|
"""Test MCPProviderRegistry raises error for duplicate provider names."""
|
|
provider1 = LocalStdioMCPProvider(name="test-provider", command="test-cmd")
|
|
provider2 = LocalStdioMCPProvider(name="test-provider", command="test-cmd-2")
|
|
|
|
with pytest.raises(ValueError, match="duplicate"):
|
|
MCPProviderRegistry(providers=[provider1, provider2])
|
|
|
|
|
|
def test_mcp_provider_registry_unknown_provider() -> None:
|
|
"""Test MCPProviderRegistry raises error for unknown provider."""
|
|
registry = MCPProviderRegistry()
|
|
|
|
with pytest.raises(UnknownProviderError, match="registered"):
|
|
registry.get_provider("unknown-provider")
|
|
|
|
|
|
def test_mcp_provider_registry_is_empty() -> None:
|
|
"""Test MCPProviderRegistry is_empty method."""
|
|
empty_registry = MCPProviderRegistry()
|
|
assert empty_registry.is_empty() is True
|
|
|
|
provider = LocalStdioMCPProvider(name="test-provider", command="test-cmd")
|
|
registry_with_providers = MCPProviderRegistry(providers=[provider])
|
|
assert registry_with_providers.is_empty() is False
|