Commit graph

24 commits

Author SHA1 Message Date
Eric W. Tramel
7a539c0e3d fix: make display_tui the canonical run config flag 2026-05-21 21:51:40 -04:00
Eric W. Tramel
a867a66d58 fix: rename create progress flags to tui 2026-05-21 21:41:59 -04:00
Eric W. Tramel
a24edaa06a feat: add create progress override flags 2026-05-21 21:36:58 -04:00
Johnny Greco
d14c9b3ccc
feat(cli): add plugin catalog core (#618)
* feat(cli): add plugin catalog services

Add typed catalog and tap models, persistent tap storage, cached
catalog loading, compatibility evaluation, install plan generation,
and runtime plugin discovery helpers.

Refs #617

* feat(cli): add plugins command group

Wire list, search, info, install, installed, and tap management
commands through the existing command-controller CLI pattern.

Refs #617

* test(cli): cover plugin catalog workflows

Add regression coverage for tap caching, catalog compatibility,
installer command generation, local path resolution, and Typer command
delegation.

Refs #617

* fix(cli): align plugin taps with schema v2

Validate tap catalogs against the schema v2 contract used by
NVIDIA-NeMo/DataDesignerPlugins#36, including source union fields,
docs URLs, package paths, compatibility metadata, and unique runtime
plugin names.

Derive Git install targets as package-qualified PEP 508 direct
references so git tap entries install the package described by the
catalog source metadata.

Refs #617
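
As a rough illustration of the package-qualified direct reference described above, the sketch below builds a PEP 508 `name @ url` string for a Git source. The function name, its parameters, and the example repository URL are hypothetical and not taken from the CLI. Only the `name @ git+<url>@<ref>#subdirectory=<path>` shape follows the form pip accepts.

```python
from typing import Optional


def git_direct_reference(
    package: str,
    repo_url: str,
    ref: Optional[str] = None,
    subdirectory: Optional[str] = None,
) -> str:
    """Build a package-qualified PEP 508 direct reference for a Git source.

    Hypothetical helper: the real catalog fields and function names may differ.
    """
    url = f"git+{repo_url}"
    if ref:
        # Pin to a branch, tag, or commit.
        url += f"@{ref}"
    if subdirectory:
        # Point the installer at a package that lives below the repo root.
        url += f"#subdirectory={subdirectory}"
    return f"{package} @ {url}"


print(
    git_direct_reference(
        "my-plugin",
        "https://example.com/org/plugins.git",
        ref="v1.2.0",
        subdirectory="packages/my-plugin",
    )
)
```

Qualifying the URL with the package name lets the installer check that the repository really provides the package the catalog entry describes.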

* fix(cli): address plugin review feedback

- Invalidate import caches before post-install entry point verification
- Make tap aliases case-insensitive and cache catalogs by alias plus URL
- Prefer compatible catalog entries before falling back to forced installs
- Clarify unused --tap behavior and list installed entry points without imports
- Add direct controller coverage and update CLI plugin documentation

Refs #617

* fix(cli): gate incompatible plugin installs

Fetch install targets before compatibility filtering so the controller
owns the final --force decision and the incompatible install guard stays
reachable.

Refs #617

* style(cli): format plugin catalog files

Apply ruff formatting to the plugin command and tap repository tests so
CI format checks pass on the PR merge commit.

Refs #617

* fix(cli): reject duplicate plugin entry names

Key catalog duplicate detection by entry_point.name so distinct catalog
entries cannot register the same runtime plugin name.

Refs #617

* fix(cli): preserve GitHub tree tap paths

* fix(cli): verify plugin entry point names

* align plugin CLI with catalog schema

- adopt catalog terminology for plugin source aliases
- parse package-first plugin catalog metadata from the plugin repo
- install package requirements with optional catalog indexes

* tidy plugin catalog workflow docs

* align plugin catalog CLI with package contract

* add plugin package uninstall workflow

* test plugin package command targets

* document plugin package aliases

* address plugin catalog review feedback

* prefer runtime plugin lookup matches

* rename plugins command to plugin

* show plugin package descriptions

* rename plugin catalogs command

* add protected plugin package installs

* document plugin package install modes

* avoid building project during plugin installs

* harden plugin package installs

* tighten plugin catalog contracts

* fix no-args help exit code

* make plugin docs links robust

* document plugin CLI catalog workflows

* clarify plugin entry point verification

* simplify plugin CLI docs

* narrow plugin search fields

* hide plugin catalog cache ttl

* remove plugin catalog trust flag

* improve plugin CLI recovery UX

* polish plugin catalog table display

* stabilize plugin catalog table test

* tighten plugin catalog edge cases

* harden plugin catalog verification

- Escape catalog-provided Rich markup before rendering CLI output
- Reject runtime plugin names that collide after enum-key normalization
- Load installed runtime entry points in a subprocess before reporting success

* simplify plugin entry point verification

Load matching entry points directly after install instead of spawning a
separate Python process. This keeps the check package-scoped while still
catching broken entry-point targets and non-Plugin objects.

* require newer uv for plugin plans

Use uv >= 0.10.0 as the single supported uv requirement for
plugin package commands. Auto mode now falls back to a pip plan with
an upgrade warning when uv is unavailable or too old, while explicit
uv selection remains strict.
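
The selection rule above might look roughly like the following. This is a sketch under assumptions: the mode names `"auto"` and `"uv"`, the function names, and the warning text are illustrative, and the version parser only handles plain `X.Y.Z` strings.

```python
from typing import Optional, Tuple

MIN_UV_VERSION = (0, 10, 0)  # "uv >= 0.10.0" from the commit message


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse a plain 'X.Y.Z' version; pre-release suffixes are not handled."""
    return tuple(int(part) for part in text.strip().split(".")[:3])


def choose_installer(mode: str, uv_version: Optional[str]) -> Tuple[str, Optional[str]]:
    """Return (installer, warning). uv_version is None when uv is not installed."""
    if mode not in ("auto", "uv"):
        raise ValueError(f"unsupported mode: {mode!r}")
    uv_ok = uv_version is not None and parse_version(uv_version) >= MIN_UV_VERSION
    if uv_ok:
        return "uv", None
    if mode == "uv":
        # Explicit uv selection stays strict: no silent fallback.
        raise RuntimeError("uv >= 0.10.0 is required when uv is selected explicitly")
    # Auto mode falls back to pip and tells the user why.
    return "pip", "uv is unavailable or older than 0.10.0; falling back to pip"


print(choose_installer("auto", "0.9.5"))
print(choose_installer("auto", "0.10.2"))
```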

* verify pip fallback availability

* polish plugin CLI status markers

* clarify plugin compatibility labels

* simplify plugin info install details

* address plugin CLI review nits

* support versioned plugin package installs

* share plugin install metadata rendering

* show installed plugin packages

* harden versioned plugin installs

- Preserve catalog requirement constraints for versioned installs
- Remove stale install-plan metadata fields
- Expand parser, uv, controller, and local-catalog dry-run coverage

* harden plugin help tests

* show plugin package versions

Add package version metadata support for plugin catalogs and resolve current versions from exact requirements or simple indexes when catalog entries omit them.

Update plugin list/info/install metadata to show the plugin package version and Data Designer compatibility requirement while removing the separate Data Designer version line.

* format plugin catalog tests

* harden plugin package metadata checks

* harden plugin CLI test coverage

* add plugin discovery docs (#642)

Signed-off-by: Johnny Greco <jogreco@nvidia.com>

---------

Signed-off-by: Johnny Greco <jogreco@nvidia.com>
2026-05-13 12:26:58 -04:00
Przemysław Boruta
810c681f7a
feat: resume interrupted dataset generation runs (sync + async engine) (#526)
* docs: add implementation plan for resume mechanism

Fixes #525

* feat(storage): add resume flag and clear_partial_results()

- ArtifactStorage gains a `resume: bool = False` field
- resolved_dataset_name skips timestamp logic when resume=True,
  returning the existing dataset folder name as-is
- Raises ArtifactStorageError on resume=True when the target folder
  is absent or empty (no data to resume from)
- New clear_partial_results() removes in-flight partial results
  left over from an interrupted run

Fixes #525

* feat(batch-manager): add start_batch param to start()

DatasetBatchManager.start() now accepts:
- start_batch: int = 0  — first batch index to process
- initial_actual_num_records: int = 0  — records already on disk

Both default to 0 so all existing call sites are unaffected.

Fixes #525

* feat(builder): implement resume logic in DatasetBuilder

- build() gains a resume: bool = False parameter
- _load_resume_state() reads metadata.json and validates that
  num_records and buffer_size match the original run
- _build_with_resume() skips completed batches, clears in-flight
  partial results, and continues from the first incomplete batch
- Raises DatasetGenerationError with clear messages for:
  - missing metadata.json (interrupted before first batch completes)
  - num_records mismatch
  - buffer_size mismatch
  - DATA_DESIGNER_ASYNC_ENGINE=1 (not yet supported)
- Logs a warning and returns early when dataset is already complete

Fixes #525

* feat(interface): expose resume on DataDesigner.create()

- create() gains resume: bool = False
- _create_resource_provider() passes resume to ArtifactStorage
- builder.build() receives the resume flag

Fixes #525

* test: add tests for resume mechanism

Covers:
- ArtifactStorage.resolved_dataset_name with resume=True
- ArtifactStorage.clear_partial_results()
- DatasetBatchManager.start() with start_batch and
  initial_actual_num_records
- DatasetBuilder.build(resume=True): missing metadata, num_records
  mismatch, buffer_size mismatch, already-complete detection

Fixes #525

* feat(builder): extend resume to async engine (DATA_DESIGNER_ASYNC_ENGINE=1)

- Add _find_completed_row_group_ids() to scan parquet-files/ for already-written
  row groups by parsing batch_*.parquet filenames
- _build_async() now accepts resume=True: loads metadata, finds completed row groups,
  clears partial results, and logs progress; returns early if all row groups are done
- _prepare_async_run() accepts skip_row_groups, initial_actual_num_records, and
  initial_total_num_batches so the scheduler only processes remaining row groups
  and RowGroupBufferManager starts from the correct counts
- RowGroupBufferManager.__init__ gains initial_actual_num_records and
  initial_total_num_batches params to seed the counters on resume
- finalize_row_group closure now writes incremental metadata after each checkpoint
  so any run (resume or not) can be resumed if interrupted mid-way
- Remove the guard that rejected resume=True with DATA_DESIGNER_ASYNC_ENGINE=1
- Add tests for all new paths
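
A filename scan of the kind described above could be sketched as follows. The function name mirrors the commit message, but the body is an assumption: in particular, the exact `batch_<number>.parquet` pattern is inferred from the `batch_*.parquet` wording and may not match the real naming scheme.

```python
import re
from pathlib import Path
from typing import Set

# Assumed naming scheme: batch_<digits>.parquet (zero-padding allowed).
_BATCH_FILE = re.compile(r"^batch_(\d+)\.parquet$")


def find_completed_row_group_ids(parquet_dir: Path) -> Set[int]:
    """Collect row-group ids from finished batch files in parquet_dir."""
    if not parquet_dir.is_dir():
        # Nothing was written yet, so nothing is complete.
        return set()
    completed = set()
    for path in parquet_dir.iterdir():
        match = _BATCH_FILE.match(path.name)
        if match:
            completed.add(int(match.group(1)))
    return completed
```

Reading the ids from the filesystem rather than from metadata means a file that was written just before a crash still counts as complete.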

* fix(builder): skip after-generation processors when resume finds dataset already complete

_build_with_resume and _build_async now return False when the dataset is already
complete (early-return path), True otherwise. build() skips
_processor_runner.run_after_generation() on False, preventing processors from
calling shutil.rmtree and rewriting an already-finalized dataset.

Fixes the issue raised in review: greptile P1 comment on PR #526.

* fix(builder): use filesystem count for initial_total_num_batches on async resume

Metadata can lag by one row group if a crash occurs between
move_partial_result_to_final_file_path and write_metadata. Using
len(completed_ids) from the filesystem scan instead of
state.num_completed_batches ensures the final metadata reflects the
actual number of parquet files present, not the potentially stale
metadata count.

* feat(results): add export() method and --output-format CLI flag

Adds DatasetCreationResults.export(path, format=) supporting jsonl,
csv, and parquet. The CLI create command gains --output-format / -f
which writes dataset.<format> alongside the parquet batch files.

* fix(builder): handle resume when metadata.json missing (interrupted before first batch)

When a run is interrupted before any row group or batch completes, metadata.json
is never written. Previously resume=True would raise DatasetGenerationError in
this case. Now build() detects the missing file, logs an info message, clears
any leftover partial results and falls back to a clean fresh run.

This is the common scenario for small datasets (fewer records than buffer_size)
where all records fit in a single row group.

* docs(interface): fix resume docstring — async engine is supported

* fix(builder): derive initial_actual_num_records from filesystem in async resume

In the crash window (row group written to disk but write_metadata crashed before
updating the file), both initial_total_num_batches and initial_actual_num_records
now use the filesystem-discovered completed_ids as source of truth.  Previously
initial_actual_num_records was read from potentially stale metadata, causing
actual_num_records in the final metadata to be undercounted by one row group.

Also adds a test covering the partial-resume crash-window scenario.

* feat(resume): replace resume: bool with ResumeMode enum (NEVER/ALWAYS/IF_POSSIBLE)

- Introduces ResumeMode(StrEnum) in artifact_storage.py for use across all layers
- Replaces resume: bool with resume: ResumeMode in DatasetBuilder.build(),
  DataDesigner.create(), ArtifactStorage, and _build_async()
- Adds _check_resume_config_compatibility() using config fingerprints to support
  IF_POSSIBLE: falls back to a fresh run when config has changed since last run
- Relaxes num_records validation from strict equality to num_records >= actual_num_records,
  allowing dataset extension on resume; buffer_size must still match exactly
- Preserves exception chain with 'from exc' on FileNotFoundError in _load_resume_state
- Exports ResumeMode from data_designer.interface for users to import
- Adds skip_row_groups assertion test and IF_POSSIBLE storage behavior tests

* fix(resume): invalidate resolved_dataset_name cache when IF_POSSIBLE downgrades to NEVER

ArtifactStorage's Pydantic model validator accesses base_dataset_path at
construction time, caching resolved_dataset_name under IF_POSSIBLE semantics
before build() can set resume=NEVER. Pop the stale cache entry so the property
re-resolves with the correct NEVER semantics (timestamped directory).

Also fixes _check_resume_config_compatibility() to use artifact_path/dataset_name
directly instead of base_dataset_path, and adds a regression test covering the
cache-bypass scenario.

* fix(builder): move partial-completion warning before return in _build_async

* fix(builder): IF_POSSIBLE now starts fresh when no dataset directory exists

_check_resume_config_compatibility returned True when config_path was absent,
even when the dataset directory itself didn't exist. This caused IF_POSSIBLE to
upgrade to ALWAYS, which then raised ArtifactStorageError on the first-ever run
because ALWAYS requires an existing directory.

Fix: return False early when the dataset directory is absent. Also sets
actual_num_records on mock buffer managers in two async resume tests that
started failing after the partial-completion warning block was made reachable.

* fix(builder): use original target_num_records in async resume record count

When extending a non-aligned run (e.g. original num_records=5, buffer_size=2),
the last completed row group has 1 record, not buffer_size=2. Using new num_records
in the formula would overcount: min(2, 7-2*2)=2 instead of min(2, 5-2*2)=1.

Fix: capture state from _load_resume_state (previously discarded) and pass
state.target_num_records into the sum formula. Added target_num_records field to
_ResumeState, populated from metadata.json.

Test: test_build_async_resume_initial_actual_num_records_uses_original_target
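
The arithmetic in this fix can be checked with a small sketch. The function name is hypothetical, and it only covers row groups that belong to the original run (extension row groups are outside its scope).

```python
from typing import Iterable


def completed_record_count(
    completed_ids: Iterable[int], buffer_size: int, target_num_records: int
) -> int:
    """Sum per-row-group sizes using min(buffer_size, target - rg_id * buffer_size)."""
    return sum(
        min(buffer_size, target_num_records - rg_id * buffer_size)
        for rg_id in completed_ids
    )


completed = [0, 1, 2]  # row groups written by the original run
original_target, new_target, buffer_size = 5, 7, 2

# With the original target the last group contributes 1 record.
print(completed_record_count(completed, buffer_size, original_target))
# With the new target the last group is counted as 2 records.
print(completed_record_count(completed, buffer_size, new_target))
```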

* fix(builder): IF_POSSIBLE starts fresh on empty dataset directory

Empty directory (crash between mkdir and first file write) was treated as
compatible — _check_resume_config_compatibility returned True, IF_POSSIBLE
upgraded to ALWAYS, which then raised ArtifactStorageError.

Fix: treat empty directory the same as missing — return False from
_check_resume_config_compatibility when any(dir.iterdir()) is False.

Test: test_if_possible_starts_fresh_when_directory_is_empty

* fix(builder): ALWAYS raises DatasetGenerationError on config fingerprint mismatch

ResumeMode.ALWAYS was documented to raise when column/model config changed, but
_check_resume_config_compatibility() was only called in the IF_POSSIBLE branch.
A user resuming with ALWAYS after changing the config would silently mix records
from two different configs.

Fix:
- Refactor _check_resume_config_compatibility() to return _ConfigCompatibility
  enum (COMPATIBLE / INCOMPATIBLE / NO_PRIOR_DATASET) instead of bool so callers
  can distinguish 'no prior run' from 'configs differ'
- Call the check for both ALWAYS and IF_POSSIBLE before _write_builder_config()
- ALWAYS + INCOMPATIBLE → DatasetGenerationError
- IF_POSSIBLE + INCOMPATIBLE → silent fresh start (existing behaviour)
- IF_POSSIBLE + NO_PRIOR_DATASET → silent fresh start (existing behaviour)

Test: test_build_resume_always_raises_on_config_mismatch
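
The decision table above can be sketched as a small function. The enum member names come from the commit messages. The string values, the `resolve_resume` name, and the local error class are assumptions, and the `ALWAYS` + `NO_PRIOR_DATASET` case is passed through unchanged because this commit does not specify it.

```python
from enum import Enum


class ResumeMode(str, Enum):
    NEVER = "never"
    ALWAYS = "always"
    IF_POSSIBLE = "if_possible"


class ConfigCompatibility(Enum):
    COMPATIBLE = "compatible"
    INCOMPATIBLE = "incompatible"
    NO_PRIOR_DATASET = "no_prior_dataset"


class DatasetGenerationError(Exception):
    """Local stand-in for the project's error type."""


def resolve_resume(requested: ResumeMode, compat: ConfigCompatibility) -> ResumeMode:
    """Map the requested mode and the compatibility check to an effective mode."""
    if requested is ResumeMode.NEVER:
        return ResumeMode.NEVER
    if compat is ConfigCompatibility.INCOMPATIBLE:
        if requested is ResumeMode.ALWAYS:
            # Refuse to mix records generated from two different configs.
            raise DatasetGenerationError("config changed since the previous run")
        return ResumeMode.NEVER  # IF_POSSIBLE: silent fresh start
    if compat is ConfigCompatibility.NO_PRIOR_DATASET:
        # IF_POSSIBLE starts fresh; ALWAYS is left to downstream handling.
        return ResumeMode.NEVER if requested is ResumeMode.IF_POSSIBLE else requested
    return ResumeMode.ALWAYS  # compatible prior dataset: resume
```

Returning a three-valued result instead of a bool is what lets the caller tell "no prior run" apart from "configs differ".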

* fix(resume): address nabinchha review — drop export collision, add CLI flag, fix edge cases

C1: drop commit 0bdf24ab — remove export() / --output-format from this PR; that feature
    belongs to #540 which has a superior streaming implementation
C2: add --resume / -r flag to data-designer create CLI, thread ResumeMode through
    GenerationController.run_create() into DataDesigner.create()
C3: fix already-complete warning text — replace stale "Remove resume=True" with
    "Use resume=ResumeMode.NEVER" in _build_with_resume and _build_async
C4: fix docstrings — ALWAYS does NOT raise when no checkpoint exists (silently
    restarts from scratch); clarify num_records >= actual semantics
C5: sync artifact_storage.resume = NEVER when no-metadata fallback fires so both
    state holders agree after the downgrade
C6: fix return_value=False → _ConfigCompatibility.INCOMPATIBLE in IF_POSSIBLE test;
    drop 3 direct _find_completed_row_group_ids tests (private API, covered by build())
W1: add logger.warning when builder_config.json is absent (silent COMPATIBLE was a footgun)
W2: narrow except Exception → (OSError, json.JSONDecodeError, ValidationError)
W3: run make check-all-fix — ruff reformatted test_if_possible_starts_fresh_when_directory_is_empty

* fix(builder): replace stdlib StrEnum with project compat shim for Python 3.10

* fix(builder): guard extension row groups in initial_actual_num_records formula on async resume

When extending an async run (num_records > state.target_num_records) and a crash
occurs after an extension row group is written to disk but before write_metadata,
the formula `min(buffer_size, state.target_num_records - rg_id * buffer_size)` yields
a zero or negative value for any extension row group (rg_id * buffer_size >= target), making
initial_actual_num_records silently undercount. The RowGroupBufferManager then starts
at the wrong offset, and the final metadata reports an incorrect actual_num_records
with a false partial-completion warning.

Fix: use state.target_num_records for original row groups and num_records for extension
row groups (guarded by rg_id * buffer_size < state.target_num_records). Covers the
scenario with a new regression test.

* fix(builder): pre-compute row-group list in _build_async to fix sizes on non-aligned extension resume

The partitioning loop in _prepare_async_run decremented remaining by
min(buffer_size, remaining) for every row group, including skipped ones.
For a non-aligned original run (e.g. target=5, buffer_size=2, last group
has 1 record), the loop deducted 2 for the skipped last group, leaving
remaining one short.  Extension row groups received smaller sizes than
intended, so the generated dataset was silently short by the deficit and
a false partial-completion warning fired.

Fix: pre-compute the full row-group list with correct per-group sizes in
_build_async where state.target_num_records is available, then pass it to
_prepare_async_run as precomputed_row_groups (replacing the skip_row_groups
param). Original groups use min(buffer_size, target - rg*bs); extension
groups use min(buffer_size, extension_records - ext_idx*bs).

Also updates the skip_row_groups test to assert on precomputed_row_groups
and adds a regression test for the non-aligned extension case.
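
A minimal sketch of the pre-computed row-group list follows, using the two size formulas quoted above. The function name and the `(row_group_id, size)` tuple layout are assumptions, not the project's actual data structure.

```python
import math
from typing import List, Tuple


def plan_row_groups(
    original_target: int, num_records: int, buffer_size: int
) -> List[Tuple[int, int]]:
    """Return (row_group_id, size) pairs for the original run plus any extension."""
    if num_records < original_target:
        raise ValueError("num_records must not be below the original target")
    num_original = math.ceil(original_target / buffer_size)
    # Original groups keep the boundaries of the first run.
    groups = [
        (rg, min(buffer_size, original_target - rg * buffer_size))
        for rg in range(num_original)
    ]
    extension_records = num_records - original_target
    num_extension = math.ceil(extension_records / buffer_size)
    # Extension groups are sized from the extension alone.
    groups += [
        (num_original + i, min(buffer_size, extension_records - i * buffer_size))
        for i in range(num_extension)
    ]
    return groups


# Non-aligned original run (target=5, buffer_size=2) extended to 7 records.
print(plan_row_groups(5, 7, 2))
```

Because every size is fixed up front, skipping an already-completed group cannot change the sizes of the groups that follow it.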

* chore: remove stale implementation plan for #525

The plan described the initial resume: bool design which has since been
replaced by the full ResumeMode enum (NEVER/ALWAYS/IF_POSSIBLE), async
engine support, filesystem reconciliation, and config compatibility checks.
The PR description is the authoritative record of what shipped.

* fix(engine): fix false 'already complete' when extension fits in last group's slack

original_target=5, buffer_size=2 produces 3 groups [2,2,1]. Extending to
num_records=6: ceil(6/2)=3 equalled len(completed_ids)=3, triggering the
already-complete branch on both the async and sync paths — returning the
5-record dataset silently.

Fix (async): replace ceil(num_records/bs) with
  num_original_groups + ceil(extension_records/bs)
so any extension always adds new groups beyond num_original_groups.

Fix (sync): add num_records_list param to DatasetBatchManager.start() and
pass the correct per-batch sizes in _build_with_resume, giving the batch
manager the right total batch count (4 instead of 3 in the example).
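
The counting error is easy to reproduce in isolation. In this sketch the function names are illustrative; the two formulas are the ones quoted above.

```python
import math


def total_groups_naive(num_records: int, buffer_size: int) -> int:
    """Old count: ignores that the last original group may be partly empty."""
    return math.ceil(num_records / buffer_size)


def total_groups_fixed(original_target: int, num_records: int, buffer_size: int) -> int:
    """New count: original groups plus groups needed for the extension alone."""
    num_original_groups = math.ceil(original_target / buffer_size)
    extension_records = num_records - original_target
    return num_original_groups + math.ceil(extension_records / buffer_size)


# original_target=5, buffer_size=2, extended to num_records=6
print(total_groups_naive(6, 2))     # equals the 3 groups already on disk
print(total_groups_fixed(5, 6, 2))  # one more group than is on disk
```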

* fix(engine): raise error when num_records is below original target on resume

Prevents negative extension_records in async path which silently truncated
the dataset and corrupted metadata without triggering a partial-completion warning.

* fix(storage): refresh MediaStorage path after IF_POSSIBLE → NEVER downgrade

When build() detected an incompatible config and downgraded resume from
IF_POSSIBLE to NEVER, _media_storage.base_path remained bound to the
original directory while all other path properties resolved to the new
timestamped directory — causing broken image references in image-column runs.

* fix(engine): preserve original_target_num_records across extension resume writes

After finalize_row_group successfully wrote incremental metadata during an
extension run, target_num_records in metadata was updated to the extension
target. A subsequent resume would read this as the original target, making
_rg_size() incorrect for all row groups and silently corrupting actual_num_records.

Stores original_target_num_records as an immutable field in metadata so the
original group boundaries are always recoverable regardless of how many
incremental writes have occurred.

---------

Co-authored-by: Nabin Mulepati <nmulepati@nvidia.com>
2026-05-08 15:37:56 -06:00
Eric W. Tramel
417b0c715d
feat(cli): show version update notice (#602) 2026-05-07 15:20:18 -04:00
Przemysław Boruta
0afe287a5f
feat(results): add export() method and --output-format CLI flag (#540)
* feat(results): add export() method and --output-format CLI flag

Adds DatasetCreationResults.export(path, format=) supporting jsonl,
csv, and parquet. The CLI create command gains --output-format / -f
which writes dataset.<format> alongside the parquet batch files.

* fix(cli): validate output_format before dataset generation

* fix(cli): remove top-level results import from create.py to preserve lazy loading

* fix(results): address andreatgretel review — error types, UX ordering, import hygiene

- Derive SUPPORTED_EXPORT_FORMATS from get_args(ExportFormat) so the two can't drift apart
- Replace ValueError with InvalidFileFormatError in export() — consistent with project error conventions
- Add date_format="iso" to to_json() for consistent datetime serialization across formats
- Add click.Choice(SUPPORTED_EXPORT_FORMATS) to --output-format CLI option for parse-time
  validation, better --help output, and tab completion
- Fix double load_dataset() in run_create: inline len() so the DataFrame ref dies before export
- Move success message after the export block to avoid "Dataset created" followed by "Export failed"
- Move imports to module level in test_results.py (json, Path, lazy already imported)
- Add controller-level tests for output_format happy path, bad format rejection, and export failure

* fix(results): correct Raises docstring — ValueError -> InvalidFileFormatError

* feat(results): stream batch files in export() to avoid OOM on large datasets

- Rewrite export() to read batch parquet files one at a time instead of
  materialising the full dataset via load_dataset(); peak memory is now
  proportional to a single batch regardless of dataset size
- Infer output format from file extension by default; format= parameter
  kept as an explicit override (e.g. writing .txt as JSONL)
- _export_parquet unifies schemas across batches (pa.unify_schemas) to
  handle type drift (e.g. int64 vs float64 in the same column)
- Drop format= from the controller's export() call — path already carries
  the correct extension
- Rewrite export tests around real batch parquet files (stub_batch_dir
  fixture); add tests for multi-batch output, schema unification, unknown
  extension, empty batch directory, and explicit format override

* fix(results): address nabinchha review — memory safety, error wrapping, UX

- Replace load_dataset() with count_records() in CLI to avoid OOM on
  large datasets; add count_records() method using pq.read_metadata
  (reads file metadata only, no data pages loaded)
- Remove redundant format validation in controller — click.Choice in
  create.py already rejects invalid values at parse time; dead code
  removed along with corresponding test
- Wrap pa.unify_schemas / table.cast ArrowInvalid as InvalidFileFormatError
  to normalize third-party exceptions at module boundaries per AGENTS.md
- Lowercase file extension before format lookup so .JSONL/.CSV/.PARQUET
  are accepted without error
- Add clarifying comment to trailing-newline guard in _export_jsonl
- Add tests: count_records(), uppercase extension, incompatible schemas
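
The extension-based format lookup with lowercasing might look like this. It is a sketch only: `resolve_export_format` is a hypothetical name, and the error class is a local stand-in for the project's `InvalidFileFormatError`.

```python
from pathlib import Path
from typing import Optional, Union

SUPPORTED_EXPORT_FORMATS = ("jsonl", "csv", "parquet")


class InvalidFileFormatError(Exception):
    """Local stand-in for the project's error type."""


def resolve_export_format(path: Union[str, Path], format: Optional[str] = None) -> str:
    """Pick the export format: explicit override first, else the file extension."""
    if format is None:
        # Lowercase so .JSONL, .CSV, and .PARQUET are accepted.
        format = Path(path).suffix.lstrip(".").lower()
    if format not in SUPPORTED_EXPORT_FORMATS:
        raise InvalidFileFormatError(f"unsupported export format: {format!r}")
    return format


print(resolve_export_format("out/dataset.JSONL"))
print(resolve_export_format("out/dataset.txt", format="jsonl"))
```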

* fix(results): fix parquet export schema unification and controller path bug

- Use promote_options="permissive" in pa.unify_schemas so minor numeric
  type drift (int64 vs float64) is handled by promotion instead of raising
- Also catch ArrowTypeError from unify_schemas and ValueError from
  table.cast() — the actual exception types thrown by pyarrow for these
  cases (ArrowInvalid alone is not sufficient)
- Wrap base_dataset_path in Path() in generation_controller.run_create
  to guard against callers that return a str (the mock returns a str,
  and str does not support the / operator that Path provides)
- Update test_export_parquet_incompatible_schemas_raises to match the
  new error source: with permissive unification, different-column-name
  batches fail at cast() not at unify_schemas(), so the match string
  changes from "Cannot unify batch schemas" to "Cannot cast batch"

* fix(results,cli): address nabinchha review round 2

- Use public pa.ArrowInvalid/ArrowTypeError instead of pa.lib.* in _export_parquet
- Drop dead trailing-newline guard in _export_jsonl; skip empty batches with `if content`
- Rename num_records→actual_record_count after count_records() call to avoid shadowing
- Unlink partial export file before re-raising on export failure in run_create
- Export filename now uses dataset_name (<dataset-name>.<format>) instead of literal "dataset"
- Update help text and tests to match new export filename convention

---------

Co-authored-by: Andre Manoel <165937436+andreatgretel@users.noreply.github.com>
2026-05-06 17:13:57 -06:00
Nabin Mulepati
f73da1975c
feat(models): deprecate implicit default provider routing (#594)
* feat(models): deprecate implicit default provider routing

Emit DeprecationWarning whenever the legacy "implicit default
provider" path is exercised: `ModelConfig.provider=None`, the
registry-level `ModelProviderRegistry.default`, the YAML
`default:` key in `~/.data-designer/model_providers.yaml`, and
the CLI's "Change default provider" workflow.

`resolve_model_provider_registry` skips passing `default=` in the
single-provider case so the common construction path stays quiet.
Multi-provider registries still pass `default` (per
`check_implicit_default`) and warn accordingly.

Update docs, the package README, and test fixtures to specify
`provider=` explicitly on every `ModelConfig`. New tests cover
each warning entry point and pin the post-deprecation happy paths.

Refs #589

Made-with: Cursor

* fix(models): address PR #594 review feedback

Greptile P1: ProviderRepository.load emitted its DeprecationWarning
inside a `try/except Exception` block. Under
`filterwarnings("error", DeprecationWarning)` the warn would raise,
the except would swallow it, and `load()` would silently return None
(losing the registry). Move the warn outside the catch-all so the
strict-warning path no longer drops valid configs.
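
The failure mode is reproducible with the standard library alone, because `DeprecationWarning` is a subclass of `Exception`. The two loader functions below are simplified stand-ins, not the real `ProviderRepository.load`.

```python
import warnings

MESSAGE = "the YAML 'default:' key is deprecated"


def load_buggy(config: dict):
    """Warn inside the catch-all: an 'error' filter turns the warning into None."""
    try:
        if "default" in config:
            warnings.warn(MESSAGE, DeprecationWarning, stacklevel=2)
        return dict(config)
    except Exception:
        return None


def load_fixed(config: dict):
    """Warn after the catch-all so a strict filter surfaces the warning itself."""
    try:
        parsed = dict(config)
    except Exception:
        return None
    if "default" in parsed:
        warnings.warn(MESSAGE, DeprecationWarning, stacklevel=2)
    return parsed


with warnings.catch_warnings():
    warnings.simplefilter("error", DeprecationWarning)
    print(load_buggy({"default": "nvidia"}))  # valid config is silently lost
    try:
        load_fixed({"default": "nvidia"})
    except DeprecationWarning as exc:
        print(f"raised: {exc}")
```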

Greptile P2 / johnnygreco: `_warn_on_implicit_provider` and
`_warn_on_explicit_default` use `stacklevel=2`, which lands inside
pydantic v2's validator dispatch rather than at the user's
`ModelConfig(...)` / `ModelProviderRegistry(...)` call. That broke
both attribution (the source line was unhelpful) and Python's
once-per-location dedup (every call collapsed to the same
pydantic-internal key, suppressing all but the first warning).
Introduce `data_designer.config.utils.warning_helpers.warn_at_caller`,
which walks past the helper, validator, and any pydantic frames to
find the user's call site and emits via `warnings.warn_explicit` with
the user frame's `__warningregistry__`. Keeps attribution accurate
and dedup keyed on the user's (filename, lineno).
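A rough sketch of the frame walk described above, assuming a simplified skip rule (top-level package name). `warn_at_caller_sketch`, `skip_packages`, and the `fakelib` stand-in module are hypothetical and are not the real `warn_at_caller` implementation.

```python
import functools
import sys
import types
import warnings


def warn_at_caller_sketch(message, category=DeprecationWarning, skip_packages=("pydantic",)):
    """Emit a warning attributed to the first frame outside skip_packages."""
    frame = sys._getframe(1)  # the frame that called this helper
    while frame.f_back is not None:
        top_level = frame.f_globals.get("__name__", "").split(".")[0]
        if top_level not in skip_packages:
            break
        frame = frame.f_back
    warnings.warn_explicit(
        message,
        category,
        filename=frame.f_code.co_filename,
        lineno=frame.f_lineno,
        module=frame.f_globals.get("__name__", "<unknown>"),
        # The target frame's registry keeps once-per-location dedup keyed
        # on the user's call site rather than on a library frame.
        registry=frame.f_globals.setdefault("__warningregistry__", {}),
    )


# Stand-in for a library validator frame (hypothetical module "fakelib").
fakelib = types.ModuleType("fakelib")
exec(
    compile("def validate(warn):\n    warn('provider=None is deprecated')\n", "fakelib.py", "exec"),
    fakelib.__dict__,
)


def user_code():
    fakelib.validate(functools.partial(warn_at_caller_sketch, skip_packages=("fakelib",)))


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    user_code()

attributed_to = caught[0].filename  # the file containing user_code, not fakelib.py
```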

johnnygreco: align the `provider_repository.py` warning copy with the
sibling site in `default_model_settings.py` ("specify provider=
explicitly on each ModelConfig instead") so both YAML-default warning
sites give the same migration instruction. The previous wording
pointed users at "ModelConfig entries" inside `model_providers.yaml`,
where ModelConfig entries don't actually live.

johnnygreco: dedup the cascade in `DataDesigner.__init__`. With
`model_providers=None` and a YAML `default:`, the user previously saw
two DeprecationWarnings for the same root cause —
`get_default_provider_name()` warns about the YAML key, then
`resolve_model_provider_registry(...)` re-warns from
`_warn_on_explicit_default`. Suppress the registry-level duplicate in
the YAML-fallback branch via `warnings.catch_warnings()` so users see
exactly one warning per user action.
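The cascade dedup can be sketched as follows. The function bodies are hypothetical stand-ins that only mimic the two warning sites; the real `DataDesigner.__init__` may differ.

```python
import warnings


def get_default_provider_name():
    """Stand-in for the YAML-default site: warns once about the `default:` key."""
    warnings.warn("YAML default: is deprecated", DeprecationWarning, stacklevel=2)
    return "provider-a"  # hypothetical provider name


def build_registry(default=None):
    """Stand-in for the registry site: warns only when default is not None."""
    if default is not None:
        warnings.warn("registry default is deprecated", DeprecationWarning, stacklevel=2)
    return {"default": default}


def init_from_yaml():
    default = get_default_provider_name()
    # Suppress the registry-level duplicate: the YAML warning already fired.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", DeprecationWarning)
        return build_registry(default)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    registry = init_from_yaml()

messages = [str(w.message) for w in caught]
```

Only the YAML-default message is recorded; the filter change is scoped to the `with` block, so warnings raised elsewhere are unaffected.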

johnnygreco: tighten `_warn_on_explicit_default` to fire only when
`default is not None`. Passing `default=None` explicitly is
semantically equivalent to omitting it (caller is opting *out* of a
registry-level default), and shouldn't trigger the deprecation
nudge.

johnnygreco: add a `model_validate({...})` regression test for
`ModelConfig` so the deserialization path (legacy on-disk configs)
is pinned alongside the construction path.

Tests:
- Update `test_load_exists` and `test_save` to omit `default=` so the
  roundtrip stops exercising the deprecated YAML-default path
  unguarded (Greptile note).
- Wrap `test_resolve_model_provider_registry_with_explicit_default`,
  `test_get_provider`, and
  `test_init_user_supplied_providers_preserve_first_wins_over_yaml_default`
  in `pytest.warns` so the suite stays green under
  `-W error::DeprecationWarning` (Greptile note).
- Add `test_explicit_default_none_does_not_emit_deprecation_warning`
  to pin the tightened predicate.
- Add `test_init_yaml_default_emits_single_deprecation_warning` to
  pin the cascade-dedup behavior.

Refs #589

Made-with: Cursor

* fix(models): make deprecation warnings visible under default filters

andreatgretel (PR #594): the YAML-default warning in
`get_default_provider_name` and the registry-default warning emitted
from inside DataDesigner helpers were attributing to data_designer
library frames, not user code. Python's default filter chain includes
`ignore::DeprecationWarning`, so library-attributed entries are
silenced — meaning a normal `DataDesigner()` call with a YAML
`default:` set showed nothing, and `resolve_model_provider_registry`
warnings were similarly invisible. Two related changes:

1. `warn_at_caller`: extend the default skip-list from `("pydantic",)`
   to `("pydantic", "pydantic_core", "data_designer")` so the walk
   escapes both pydantic's validator-dispatch frames and data_designer
   helper frames before attributing. Also tighten the prefix predicate
   to exact-or-dotted-prefix matching (`name == p or
   name.startswith(p + ".")`) so e.g. `pydantic_helpers` is not
   falsely matched as part of `pydantic` (johnnygreco nit). Allow
   callers to pass a custom `skip_prefixes` for flexibility. Drop the
   "skip frame 0+1 unconditionally" guard now that prefix matching
   covers it.

2. `get_default_provider_name`: switch from
   `warnings.warn(stacklevel=2)` to `warn_at_caller`. The previous
   stacklevel pointed into `default_model_settings.py`, which is a
   library file → silenced under default filters. Verified the fix
   empirically with `python -W default`: warning is now attributed to
   the user's call site and rendered.
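The prefix predicate from item 1 can be compared against a plain `startswith` check. `naive_match` and `strict_match` are illustrative names, not functions from the codebase.

```python
def naive_match(name, prefixes):
    """Plain prefix check: also matches unrelated packages sharing a prefix."""
    return any(name.startswith(p) for p in prefixes)


def strict_match(name, prefixes):
    """Exact-or-dotted-prefix check: matches the package or its submodules only."""
    return any(name == p or name.startswith(p + ".") for p in prefixes)


prefixes = ("pydantic", "pydantic_core", "data_designer")
false_positive = naive_match("pydantic_helpers", prefixes)
rejected = strict_match("pydantic_helpers", prefixes)
submodule = strict_match("data_designer.config.models", prefixes)
```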

johnnygreco (PR #594): add the missing
`test_explicit_default_none_does_not_emit_deprecation_warning`
regression for the `self.default is not None` predicate landed in
the prior round.

Tests:
- New `test_warning_helpers.py` pins prefix-matching precision
  (rejects `pydantic_helpers` / `data_designer_other`), default
  skip-list contents, attribution past skip-prefix frames, and
  per-call-site dedup behavior.
- `test_get_default_provider_name_warning_attributes_to_user_frame`
  pins andreatgretel's repro for the YAML-default site.
- `test_explicit_default_warning_attributes_to_user_frame` pins the
  multi-frame case: construction goes through
  `resolve_model_provider_registry`, so the walk has to escape both
  pydantic and data_designer before landing on the test file.
- `test_explicit_default_none_does_not_emit_deprecation_warning`
  pins johnnygreco's predicate-tightening regression.

3,124 tests pass (548 config + 1,923 engine + 653 interface; +10 net
from this round).

Refs #589

Made-with: Cursor

* fix(models): apply warn_at_caller to remaining deprecation sites

greptile-apps (PR #594, r3189904028): `ProviderRepository.load`'s
YAML-default `DeprecationWarning` was using `warnings.warn(stacklevel=2)`,
which attributes to whichever data_designer frame called `load()` —
controllers, services, list/reset commands, agent introspection. Every
real call path lands on `data_designer.cli.*`, which falls under
Python's default `ignore::DeprecationWarning` filter and is silenced.
Audit found two more sites with the same problem:

- `DatasetBuilder._resolve_async_compatibility` (`allow_resize` /
  issue #552) — was using `stacklevel=4` to walk past
  `_resolve_async_compatibility -> build/build_preview -> interface ->
  user`. Brittle: any added frame (decorator, async wrapping, the
  `try/except DeprecationWarning: raise` boundary) shifts attribution
  silently. The existing test passed only because it used
  `simplefilter("always") + record=True`, which records warnings
  regardless of attribution.
- `ProviderController._handle_change_default` — was using
  `stacklevel=2`, which lands on the menu dispatcher in the same
  controller module. `print_warning` already shows the message
  visually, but programmatic observers (`pytest.warns`,
  `filterwarnings("error", ...)`) saw a library-attributed entry that
  default filters silenced.

All three migrated to `warn_at_caller` (the helper from 247fa30) so
attribution lands on the user's call site regardless of internal
chain shape. `data_designer` is already in
`DEFAULT_INTERNAL_PREFIXES`, so the walk escapes the entire library
in one pass.

Added attribution regression tests at each site asserting
`warning.filename == __file__`. A future regression to
`warnings.warn(stacklevel=N)` now fails CI instead of silently
silencing the user-facing nudge:

- `test_load_with_yaml_default_attributes_warning_to_caller`
  (test_provider_repository.py)
- `test_resolve_async_compatibility` extended with the same assertion
- `test_handle_change_default_emits_deprecation_warning` rewritten
  from `pytest.warns(...)` to a `catch_warnings(record=True)` block
  that filters for the message and asserts `filename == __file__`
  (`pytest.warns` does not check attribution, so the rewrite is
  required to actually catch the regression).
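A small sketch of why fixed `stacklevel` values drift and how a filename assertion catches it. The `fakelib` module, `load`, and `controller` are hypothetical stand-ins for library code.

```python
import types
import warnings

LIB_SOURCE = '''
import warnings

def load():
    warnings.warn("YAML default: is deprecated", DeprecationWarning, stacklevel=2)

def controller():
    load()  # extra library frame between the user and load()
'''

# Compile under a distinct filename so library frames are distinguishable.
fakelib = types.ModuleType("fakelib")
exec(compile(LIB_SOURCE, "fakelib.py", "exec"), fakelib.__dict__)


def attributed_filename(call):
    """Record the warning raised by call() and return the file it is attributed to."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        call()
    return caught[0].filename


direct = attributed_filename(fakelib.load)  # attributed to this file
via_controller = attributed_filename(fakelib.controller)  # attributed to fakelib.py
```

Both calls emit a `DeprecationWarning`, so a check that only looks at category and message passes either way; asserting on `filename` is what distinguishes them.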

3,125 tests pass (548 config + 1,923 engine + 654 interface).

Refs #589
2026-05-05 13:39:12 -06:00
Mike Knepper
98715dcd86
chore(cli): Add --org option to NGC download command (#604)
2026-05-05 08:03:49 -05:00
Eric W. Tramel
fc0365cada
feat(cli): add data-designer --version (#599) 2026-05-04 13:30:45 -04:00
Johnny Greco
a65903eb1a
chore: add ko_KR locale to nemotron personas datasets (#572)
* chore: add ko_KR locale to nemotron personas datasets

Register Korean (ko_KR, 2.66 GB) as an available managed persona
dataset locale, update related CLI/repository tests, and document the
new locale and its NGC download command.

* update person fields

* update fr_FR size

* docs: reconcile personas field tables with installed parquet schemas

Remove stale per-locale fields that no longer exist in any managed
parquet (commune, departement, prefecture), drop district from the
India-specific section since it's already listed in Core Fields,
rename digital_skills → digital_skill to match the actual ja_JP
column, and add sections for ko_KR, en_SG, and the en_US/en_SG
shared ethnic_background. Corrects the religion-family membership
to include en_SG.

* test: add missing fr_FR assertion in test_run_personas_with_all_flag

The test asserts all 9 locales were downloaded but only enumerates 8
in its per-locale checks — fr_FR has been missing since before the
ko_KR addition. Align the enumeration with the count.

* docs: add ko_KR to locale parameter list
2026-04-24 17:19:02 -04:00
Johnny Greco
0d10bf8dc6
feat: add fr_FR locale to nemotron personas datasets (#468)
* feat: add fr_FR locale to nemotron personas datasets

Register the France locale (fr_FR, 2.71 GB) in NEMOTRON_PERSONAS_DATASET_SIZES
and add 7 France-specific PII fields: first_name_heritage, name_heritage,
is_first_gen_immigrant, household_type, monthly_income_eur, commune, departement.

* fix: update download controller and service tests for fr_FR locale

Update hardcoded locale counts from 7 to 8 and add fr_FR assertions
in download controller and download service tests.

* fix: generate CLI locale help text dynamically from constants

The --locale help text was hardcoded and already stale (missing en_SG,
pt_BR, fr_FR). Build it from LOCALES_WITH_MANAGED_DATASETS so it stays
in sync automatically.

* refactor: add LOCALES_WITH_MANAGED_DATASETS_STR constant

Centralise the comma-joined locale list so it is defined once in
constants and reused in the CLI help text, PersonSamplerParams field
description, and locale validation error message.
2026-03-31 17:28:03 -04:00
Johnny Greco
164db0aeb4
refactor: simplify agent CLI to context, types, and state (#418) (#420)
* refactor: simplify agent CLI to context, types, and state subcommands

- Remove schema and builder subcommands and all supporting code
- Add description column (docstring first paragraph) to types table
- Add config_file per family (relative to data_designer package)
- Add config_package_path and library_version to context output
- Clean section hierarchy: ## for sections, ### for family sub-tables
- Add docstrings to ScalarInequalityConstraint and ColumnInequalityConstraint

* cleanup: remove dead code and fix redundant type discovery

- Remove unused get_import_path (only used by deleted schema/builder)
- Remove unused class_name from catalog dicts
- Fix N+1: get_family_source_file uses get_args directly instead of
  rediscovering all types via discover_family_types

* docs: update DropColumnsProcessorConfig docstring to prefer drop=True

* fix: address Greptile review feedback

- Add parameters:/params: to _SECTION_HEADERS for docstring parsing
- Fix config_package_path to return parent of data_designer package so
  Path(base) / relative_file resolves correctly
- Use last occurrence of data_designer in _get_source_file to handle
  nested paths (e.g. dev checkouts)
- Return list of deduplicated files per family (get_family_source_files)
  instead of assuming all types live in one file
- Add config_builder_file to context output

* fix: resolve config_builder_file dynamically and fix fragile test

- Use _get_source_file(DataDesignerConfigBuilder) instead of hardcoded
  string for config_builder_file, consistent with family file resolution
- Fix test assertion that assumed "config" in path (only true in dev)

* fix: return empty string for unresolvable source paths

- _get_source_file returns "" instead of absolute path when
  data_designer is not in the path, consistent with error branch
- Add Config Module section to context output pointing agent to
  the config module as the only part of the codebase to work with
- Rename config_package_path to config_module_path (returns config dir)

* refactor: remove ConfigBase.schema_text() and supporting helpers

Schema rendering is no longer needed in the config layer — the agent
CLI now provides file paths so agents can read source files directly.

* Improve agent context output and processor discoverability

- Redeclare `name: str` in DropColumnsProcessorConfig and
  SchemaTransformProcessorConfig so agents see the required field
  without reading the base class
- Add base config file path to agent context output
- Optimize agent context formatting: strip redundant path prefixes,
  remove family count summary, separate usable/unusable model aliases,
  rename sections for clarity

* fix: restore emoji literal in get_column_emoji

* fix: revert unnecessary name redeclarations and use posix paths

- Remove bare name: str redeclarations in processor configs that
  silently dropped the parent Field(description=...)
- Use Path.as_posix() in _get_source_file for consistent forward slashes

* docs: standardize config docstrings with (required) markers and Inherited Attributes

- Add (required) to all required parameters in Attributes sections
- Add Inherited Attributes section to all config subclasses listing
  fields from parent classes (SingleColumnConfig, ProcessorConfig, Constraint)
- Fix stale with_trace descriptions in LLM subclass inherited sections
- Remove discriminator fields from Attributes sections
- Remove redundant name: str redeclaration from ExpressionColumnConfig

* fix: address Greptile feedback on model aliases and test paths

- Show per-alias reason for unusable models instead of blanket
  "missing API keys" label
- Surface model_config_present: tell agent when no config file exists
- Fix test fixtures to use realistic data_designer/config/ paths that
  exercise _strip_config_prefix

* test: add coverage for model_config_present=false branch

* docs: put required attributes first in Inherited Attributes docstrings

Move `name (required)` to the top of the Inherited Attributes section
in LLMCodeColumnConfig, LLMStructuredColumnConfig, and LLMJudgeColumnConfig
so required fields appear before optional ones.

* fix: improve agent CLI output for clarity and agent comprehension

- Use {config_root}/file.py path syntax across all agent output
- Add config_root preamble to standalone `agent types` output
- Replace type_name (discriminator) with type (class name) in tables
- Show only usable model aliases; warn agent to surface config issues
- Add directive scoping agents to the config module only
- Reword import hint and config module description for directness

* fix: fall back to absolute path for plugin source files

_get_source_file() returned "" for types outside the data_designer
package (e.g., plugin configs). Now returns the absolute path so
the agent still gets a readable file reference.

* fix: remove unreachable model_config_present branch from formatter

main() calls ensure_cli_default_model_settings() before any agent
command, so model config is always seeded. The model_config_present=False
branch was dead code.

* test: add coverage for no-usable-model-aliases warning

Covers the remaining branch in _format_model_aliases_context where
all aliases are unusable and the agent gets a warning to surface to
the user.

* fix: add inherited attributes to section headers and use posix paths

Address two Greptile review comments:
- Add "inherited attributes:" to _SECTION_HEADERS so docstring parsing
  stops before that section even without a preceding blank line.
- Use .as_posix() in get_config_module_path() for consistent
  forward-slash paths across platforms.
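A hedged sketch of header-based docstring truncation. `SECTION_HEADERS` and `docstring_summary` are illustrative; the real `_SECTION_HEADERS` contents and parsing rules may differ.

```python
SECTION_HEADERS = (
    "args:",
    "attributes:",
    "inherited attributes:",
    "parameters:",
    "params:",
    "returns:",
    "raises:",
)


def docstring_summary(doc):
    """Return the text before the first section header, joined on one line."""
    summary = []
    for raw in doc.strip().splitlines():
        line = raw.strip()
        if line.lower() in SECTION_HEADERS:
            break  # stop even when no blank line precedes the header
        summary.append(line)
    return " ".join(part for part in summary if part)


DOC = """Drop columns from the output.
Inherited Attributes:
    name (required): Processor name.
"""
summary = docstring_summary(DOC)
```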
2026-03-17 09:30:06 -07:00
Johnny Greco
4c19dba74b
feat: agent CLI introspection (simplified) (#415)
* feat: add agent introspection cli

* refactor: remove agent cli schema version

* refactor: omit missing builder docstrings from context

* refactor: tighten agent cli contract

* feat: add schema_text() to ConfigBase for human-readable field summaries

ConfigBase.schema_text() returns a concise text representation including
the class docstring summary, field names, types, defaults, and
descriptions. Field descriptions added to column config types to
surface through this method.

* refactor: flatten agent CLI into plain functions with text output mode

Delete AgentController class and agent_command_defs module. Move all
logic into agent_introspection (data) and agent_text_formatter (display)
as plain functions. Add --json flag so commands default to human-readable
text using schema_text(), with JSON as opt-in. Unify _emit helper,
remove include_docstrings parameter, deduplicate catalog calls, and fix
N+1 discover_family_types in get_family_schemas.

* fix: port stale controller tests and consolidate command descriptions

Port test_agent_controller.py to use plain functions instead of deleted
AgentController. Extract AGENT_COMMANDS constant as single source for
operation descriptions, syncing with main.py help strings.

* style: fix ruff formatting in agent_introspection

* refactor: centralize agent command definitions

Extract AGENT_COMMANDS into agent_command_defs.py so main.py and
agent_introspection.py share a single source for command names,
help text, and metadata. The new module has no heavy dependencies,
keeping --help latency unaffected.

* fix: handle default_factory and empty providers in schema_text and introspection

- schema_text() now detects default_factory fields and renders e.g. "list()"
  instead of leaking PydanticUndefined
- Guard against IndexError when provider registry has an empty providers list
- Add 15 edge-case tests for schema_text covering default_factory, enum
  defaults, None defaults, scalar defaults, descriptions, and docstrings

* refactor: remove JSON output mode from agent CLI commands

Text-only output simplifies the interface. Structured output can be
added back trivially since the functions already return dicts.

* docs: update schema_text docstring to reflect agent focus

* fix: include builder section and import_path in agent text output

- format_context_text now renders a ## Builder section
- format_types_text now includes import_path column in tables

* refactor: drop import_path from types tables

All config objects are imported via dd.<ClassName>, so the full import
path is redundant noise in agent output.

* docs: add family definition and import hint to context output

* refactor: rename Types section to Families, drop redundant "types" from sub-headers

* fix: coerce None to empty string in table cells

row.get(col, '') returns None when the key exists with value None,
causing str(None) to render "None" in the output. Use `or ''` instead.

* refactor: move agent controller tests to utils as introspection integration tests

There is no controller layer — these tests exercise functions in
agent_introspection.py, so they belong in tests/cli/utils/.

* fix: only coerce None to empty string in table cells, not False

The previous `or ''` pattern treated all falsy values (including False)
as empty. Use an explicit None check so booleans render correctly.
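The difference between the `or ''` pattern and an explicit `None` check, as a minimal sketch (`cell_with_or` and `cell_with_none_check` are illustrative names):

```python
def cell_with_or(row, col):
    """Treats every falsy value (None, False, 0) as empty."""
    return str(row.get(col) or "")


def cell_with_none_check(row, col):
    """Only None becomes empty; booleans and zeros still render."""
    value = row.get(col)
    return "" if value is None else str(value)


row = {"required": False, "default": None}
lost_boolean = cell_with_or(row, "required")
kept_boolean = cell_with_none_check(row, "required")
empty_default = cell_with_none_check(row, "default")
```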

* style: address review nits from nabin

- Add explicit parentheses to and/or precedence in _build_agent_lazy_group
- Rename loop variable l to line in test_schema_text
- Move get_family_schema import to module level in test_agent_text_formatter

* fix: improve schema_text Literal display, builder signature quotes, and docstring parsing

- _format_annotation now renders Literal['value'] instead of bare Literal
- _format_signature strips quotes from stringified annotations caused by
  `from __future__ import annotations`
- _get_docstring_summary stops at any Google-style section header, not
  just Attributes:
2026-03-13 18:26:00 -04:00
Johnny Greco
b94b88b7a4
feat(cli): bootstrap default configs on CLI startup (#401)
* feat(cli): bootstrap default configs on command run

* fix(cli): use active interpreter in bootstrap warning

* refactor(cli): simplify bootstrap warning flow

* refactor(cli): bootstrap defaults in main entrypoint

* refactor(cli): keep bootstrap ownership in main

* test(cli): cover lazy dispatch and runtime failure flag

* refactor(cli): remove redundant bootstrap state

* test(cli): assert bootstrap warning includes error

* test: address cli bootstrap review feedback
2026-03-12 15:42:41 -04:00
Nabin Mulepati
e4857f62fa
feat: add Streamable HTTP transport support for remote MCP providers (#358)
* feat: add Streamable HTTP transport support for remote MCP providers (#357)

Add `streamable_http` as a supported transport type for `MCPProvider`,
enabling connections to MCP servers that use the Streamable HTTP protocol
(e.g. Tavily remote endpoints). Previously only SSE transport was supported,
causing silent 5-minute timeouts when connecting to incompatible endpoints.

- Expand `MCPProvider.provider_type` to `Literal["sse", "streamable_http"]`
  (default remains `"sse"` for backwards compatibility)
- Route `streamable_http` providers through `streamablehttp_client` from
  the MCP SDK in `MCPIOService._get_or_create_session()`
- Handle variable-length context manager results from MCP transport clients
- Add `DataDesigner.list_mcp_tool_names()` for discovering available tools
- Update CLI form builder and controller to support the new transport option
- Add tests for streamable_http config, session creation, and form builder

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* updates

* simplify import

* address greptile comments

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-04 08:11:54 -07:00
Johnny Greco
03b3d6c726
chore: address Andre's feedback on --save-results and CLI preview (#335)
* fix: suppress stdout when saving report and sample records to file

Console(record=True) still prints to stdout by default. Use
file=io.StringIO() to redirect output so save-path calls only
write to disk.

* refactor: --save-results skips terminal display

When --save-results is used, records and the analysis report are no
longer printed to the terminal. Extracted save logic into a dedicated
_save_preview_results method and updated option help text accordingly.

* feat: wrap-around navigation in sample records browser

Prev/next buttons and arrow keys now cycle back to the beginning/end
instead of clamping at boundaries.
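Wrap-around indexing is commonly done with a modulo. This is a sketch in Python for illustration only; the pager's actual navigation code is not shown in this log, and `next_index` is a hypothetical name.

```python
def next_index(current, step, total):
    """Move by step (+1 or -1) and wrap around at either end."""
    return (current + step) % total


after_last = next_index(4, 1, 5)    # next from the last record
before_first = next_index(0, -1, 5)  # prev from the first record
```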

* test: reuse record_series fixture in visualization tests

* feat: thread --theme through to sample records pager

The pager shell was hardcoded dark, so --theme light produced
light records inside a dark frame. Extract CSS variables into
dark/light constants and pass the theme from the controller.

* fix: cap terminal display width at display_width

The module-level Console() had no width limit, so tables with
expand=True stretched to the full terminal width. Cap terminal
output at min(terminal_width, display_width) and thread the
display_width parameter through the controller's display methods.

* docs: update --display-width and --theme help text

Remove "Only applies when --save-results is used" from
--display-width since it now also affects terminal output.

* fix: update generation controller tests to match display_width and save_results behavior
2026-02-18 20:17:03 -05:00
Johnny Greco
1439bbea7e
chore: Improve CLI startup with lazy heavy import cleanup (#330)
* perf: defer heavy imports to improve CLI startup time

Move expensive imports (engine, models, controllers) out of the module-level import path so that data-designer --help and other non-generation commands no longer pay the full startup cost.

Key changes:
- Defer controller imports to inside command functions
- Remove eager re-export chains from CLI package __init__ files
- Move default-settings bootstrap into load_config_builder() and DataDesigner.__init__() instead of running at import time
- Add lazy __getattr__ exports in interface/__init__.py
- Replace module-level tokenizer init with cached lazy getter
- Fix ModelProvider import to use config layer instead of engine
- Update test mock paths to match new import locations

Reduces CLI import-time from ~1.67s to ~0.46s.

* perf: defer pandas/numpy in io_helpers and add config_list benchmark

- Replace eager `from lazy_heavy_imports import pd, np` in io_helpers
  with module-level __getattr__ (for backwards-compatible external
  access / test mocks) and function-level imports in the 3 functions
  that actually use them (read_parquet_dataset, smart_load_dataframe,
  _convert_to_serializable). Importing io_helpers no longer triggers
  pandas/numpy loading.
- Defer heavy imports in list and reset CLI commands into function
  bodies to avoid loading repositories, Rich, and prompt_toolkit at
  module import time.
- Add `config_list` (data-designer config list) measurement to the
  CLI startup benchmark with isolated cold measurement in a separate
  venv and a --skip-config-list-check flag.
- Update test mock paths to match new import locations.

* Refine lazy import usage and TYPE_CHECKING cleanup

* Run license header updater on PR-touched files

* fix: update sqlfluff mock target for lazy imports in test_sql

* perf: cache globals() in lazy __getattr__ to avoid repeated lookups

Add globals() caching and explanatory comment to all three lazy
__getattr__ implementations (lazy_heavy_imports, config/__init__,
interface/__init__) so subsequent attribute accesses bypass __getattr__.
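A minimal sketch of a module-level `__getattr__` that caches into `globals()`. The module is built in memory so the example is self-contained; `lazy_demo`, `_LAZY`, `json_mod`, and `load_count` are hypothetical, and `json` stands in for a heavy dependency.

```python
import types

LAZY_SOURCE = '''
import importlib

_LAZY = {"json_mod": "json"}  # hypothetical alias -> module mapping
load_count = 0

def __getattr__(name):
    global load_count
    if name in _LAZY:
        load_count += 1
        module = importlib.import_module(_LAZY[name])
        globals()[name] = module  # cache so later lookups bypass __getattr__
        return module
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
'''

# Build the demo module in memory instead of writing a file to disk.
lazy = types.ModuleType("lazy_demo")
exec(LAZY_SOURCE, lazy.__dict__)

first = lazy.json_mod   # triggers __getattr__ and the real import
second = lazy.json_mod  # served from the module dict, no __getattr__ call
```

Module `__getattr__` is only consulted when normal attribute lookup fails, so after the first access the cached entry is found directly.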

* perf: lazy CLI command loading and deferred heavy import evaluations

- Add LazyTyperGroup to defer command module loading until invocation, allowing module-level imports in all CLI command files

- Split DataFrameSeedSource into seed_source_dataframe.py to isolate pandas dependency from other seed source classes

- Move TypeVar/TypeAlias definitions (DataT, NumpyArray1dT, RadomStateT, EngineT) to TYPE_CHECKING blocks with runtime fallbacks

- Wrap module-level constants in lru_cache (phone_number parquet data, jsonschema validator) to defer I/O and heavy imports to first use

- Update test mock targets to patch at usage-site for module-level imports

* refactor: use direct pandas import in seed_source_dataframe

Drop lazy-loading for pandas in DataFrameSeedSource; use direct import
for simplicity.

* update lazy import pattern

* update tests to use lazy import namespace

Switch test modules to import data_designer.lazy_heavy_imports as lazy
and reference heavy libraries through that namespace. This keeps heavy
imports deferred during module import and aligns tests with the new
lazy-import usage pattern.

* tighten import perf test thresholds

Document recent baseline timings and lower the allowed average
import time and timeout so regressions are detected sooner.

* document pandas import requirement

Clarify that Pydantic needs DataFrame resolved at module load and
that keeping the direct import preserves IDE typing support.

* increase timeout time

* use lazy pandas imports in visualization tests

- replace direct pandas usage with lazy.pd in visualization tests to avoid eager imports
- add TYPE_CHECKING pandas import and keep CLI controller imports sorted

* fix lazy pandas runtime usage and preview mocks

Switch sample-record handling to lazy pandas types so runtime paths no longer
depend on TYPE_CHECKING imports. Align preview controller tests to patch the
module-local DataDesigner symbol, preventing real engine invocation in save
results scenarios.
2026-02-18 16:24:15 -05:00
Johnny Greco
f2a1657870
feat: add --save-results option to preview command (#333)
* feat: add --save-report option to preview command

* feat: add save_path option to display_sample_record

Allow saving rendered sample records as HTML or SVG files via an
optional save_path parameter on both the standalone function and
the WithRecordSamplerMixin method.

* feat: replace --save-report with --save-results on preview command

Replace the single-file --save-report option with --save-results, which saves all preview artifacts (dataset parquet, analysis report HTML, and per-record sample HTMLs) into a timestamped directory under the artifact path. Add error handling around the save block, improve timestamp precision to microseconds, and expand test coverage for the new behavior.

* feat: add sample records pager with theme toggle, postMessage bridge, and UI polish

* feat: add dataset metadata subtitle to pager and clean up toolbar layout

* fix: address review findings for preview save-results feature

- Split try/except in generation_controller so report display errors
  don't produce misleading "failed to save" messages when not saving
- Add browser HTML path to save success output for discoverability
- Remove 5 unused CSS variables from pager theme constants
- Add "N of M" record counter to pager toolbar
- Add theme/display_width assertions to all preview_command tests
- Add dedicated test for custom theme and display_width passthrough
- Add tests for record counter and CSS variable cleanup

* fix: address code review findings and simplify pager

- Fix critical bug: analysis report now displays to console even when
  --save-results is active (was silently dropped via pass statement)
- Fix latent UnboundLocalError in display_sample_record when index is
  out of bounds (num_records computed before try block)
- Eliminate duplicated dark CSS between constant and theme listener script
- Simplify sample_records_pager: remove dual-theme system, postMessage
  bridge, and responsive media queries; restore GitHub link; reorder
  toolbar to put prev/next buttons on the far left
- Narrow except Exception to except OSError in save-results path
- Use case-insensitive extension check and lambda-based re.sub
- Collapse redundant preview command delegation tests into parametrize
- Add missing type annotations and remove tautological assertions
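
To illustrate two of the items above (the case-insensitive extension check and the lambda-based re.sub), here is a small sketch. The function names and the allowed extensions are assumptions for illustration, not the project's code. The point of the callable replacement is that re.sub inserts its return value verbatim, whereas a string replacement has backslash sequences such as \1 interpreted as group references.

```python
import re
from pathlib import Path


def has_allowed_extension(path, allowed=(".html", ".svg")):
    # Lower-case the suffix so "Report.HTML" and "report.html" are treated alike.
    return Path(path).suffix.lower() in allowed


def inject_before_head_close(html, snippet):
    # A callable replacement is used as-is, so backslashes in `snippet`
    # (common in CSS or JS) are never parsed as regex group references.
    return re.sub(r"</head>", lambda m: snippet + m.group(0), html, count=1)
```

With a plain string replacement, a snippet containing \1 would raise an "invalid group reference" error because the pattern has no groups.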

* style: move record counter to far right of pager toolbar

* refactor: remove dead theme-listener script and inline CSS constant

_THEME_LISTENER_SCRIPT and _SAMPLE_RECORD_DARK_CSS_INLINE became
orphaned after the pager simplification removed the postMessage
bridge. This removes both constants, drops the injection line,
switches the idempotency guard to the viewport meta tag, and
cleans up related test assertions.

* fix: move Path import out of TYPE_CHECKING block in test_visualization

* fix: rename _logger to logger to match codebase convention

* fix: remove unnecessary cast in preview command theme parameter

* refactor: extract DEFAULT_DISPLAY_WIDTH constant and make apply_html_post_processing public

* Update packages/data-designer-config/tests/config/utils/test_visualization.py

---------

Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
2026-02-18 15:58:35 -05:00
Johnny Greco
1514720596
feat: support loading config files from HTTP(S) URLs (#323)
* support loading config files from http urls

- allow config builder and CLI loader to load YAML/JSON configs from
  HTTP(S) URLs
- reject unsupported URL extensions and remote Python module URLs
- update CLI help text and add tests for URL success/failure paths

* harden remote config loading and deduplicate URL validation

- Add size limit (10 MB) when fetching configs from URLs
- Validate parsed YAML is a dict before returning
- Make is_http_url public and reuse it in CLI validate_url
- Replace local CONFIG_FILE_EXTENSIONS with shared constant
- Add tests for is_http_url, URL-with-no-extension edge cases
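
A rough sketch of the hardening steps listed above, under stated assumptions: the body of is_http_url here is a guess at the behavior, and read_limited and require_mapping are hypothetical helper names. The chunk iterable stands in for a streamed HTTP response so the example needs no network access.

```python
from urllib.parse import urlsplit

MAX_CONFIG_BYTES = 10 * 1024 * 1024  # 10 MB, the limit named in this commit


def is_http_url(value):
    # Assumed behavior: only absolute http/https URLs with a host qualify.
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value)
    except ValueError:  # e.g. malformed IPv6 host
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def read_limited(chunks, limit=MAX_CONFIG_BYTES):
    # Accumulate streamed chunks and stop as soon as the limit is exceeded.
    buf = bytearray()
    for chunk in chunks:
        buf.extend(chunk)
        if len(buf) > limit:
            raise ValueError(f"remote config exceeds {limit} bytes")
    return bytes(buf)


def require_mapping(parsed):
    # A parsed config must be a dict; lists and scalars are rejected.
    if not isinstance(parsed, dict):
        raise ValueError(f"expected a mapping, got {type(parsed).__name__}")
    return parsed
```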

* use requests for remote config loading

- replace urllib URL fetching with requests and status checks
- parse remote payloads via smart_load_yaml for consistent validation
- expand tests for HTTP errors, size limits, and non-dict payloads

* lower remote config size limit to 1 MB

* improve config URL HTTP error reporting

Add granular 401/403/404 and generic HTTP status errors for remote
config fetching to make failures actionable. Clarify that authenticated
config URL loading is not currently supported and update tests for
status-aware behavior.

* rewrite github blob URLs for remote loading

Handle GitHub blob links by rewriting them to raw content URLs for
config and dataframe HTTP loaders, preserving query params but avoiding
query token leaks in logs. This also fixes extension detection for URLs
with query strings and adds coverage for rewrite behavior.
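
The rewrite described above might look roughly like the sketch below. The function names are hypothetical and the real loaders may differ; the only outside fact relied on is the raw.githubusercontent.com/<owner>/<repo>/<ref>/<path> layout for raw file URLs.

```python
from pathlib import PurePosixPath
from urllib.parse import urlsplit, urlunsplit


def rewrite_github_blob_url(url):
    parts = urlsplit(url)
    if parts.netloc.lower() != "github.com":
        return url
    segments = parts.path.strip("/").split("/")
    # Expected shape: <owner>/<repo>/blob/<ref>/<path...>
    if len(segments) < 5 or segments[2] != "blob":
        return url
    owner, repo, _, *rest = segments
    raw_path = "/" + "/".join([owner, repo, *rest])
    # Query params are preserved on the rewritten URL.
    return urlunsplit(
        (parts.scheme, "raw.githubusercontent.com", raw_path, parts.query, parts.fragment)
    )


def url_extension(url):
    # Take the suffix from the path only, so "?token=..." cannot hide it.
    return PurePosixPath(urlsplit(url).path).suffix.lower()


def redact_query(url):
    # Version of the URL that is safe to log: no query string, no fragment.
    parts = urlsplit(url)
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))
```

Logging redact_query(url) rather than the URL itself is one way to keep query tokens out of log output while still fetching with the full URL.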

* remove validate_url wrapper in favor of is_http_url

The validate_url function in cli/utils was just a thin wrapper around
is_http_url from io_helpers. Remove it and have callers use is_http_url
directly for clarity and reduced indirection.

* fix optional type for artifact_path CLI option

* fix URL recursion in smart_load_yaml

- avoid treating remote payload strings as new URL inputs
- add regression test for URL string payloads from remote config

* rewrite huggingface blob URLs for remote loading
2026-02-11 15:12:52 -05:00
Johnny Greco
d3c4de76da
feat: add preview, create, and validate CLI commands (#313)
* feat: add preview, create, and validate CLI commands

Add three new top-level CLI commands for the data-designer workflow:
- `data-designer preview` - generate preview datasets for fast iteration
- `data-designer create` - create full datasets and save to disk
- `data-designer validate` - validate configuration files

Also includes:
- Move wait_for_navigation_key() UI primitive from preview.py to ui.py
- Add KeyPressEvent type annotations to all key binding handlers in ui.py
- Refactor cli/utils.py into cli/utils/ package with config_loader module
- Comprehensive test coverage for all new commands

* fix: update pythonjsonlogger import and clean up dev dependencies

- Update pythonjsonlogger import to use newer JsonFormatter API
- Consolidate dev-dependencies into [dependency-groups] dev section
- Remove unnecessary test cli/utils __init__.py

* small E

* address greptile feedback

* organize CLI commands into rich help panels

Group top-level commands under "Generation" and "Setup" panels
for clearer help output.

* refactor config loader to parse files directly and auto-detect config format

- Parse YAML/JSON files into dicts before passing to from_config,
  providing format-specific error messages for parse failures
- Auto-detect DataDesignerConfig format (columns at top level) and
  wrap it into BuilderConfig so users can provide either format
- Clean up Python module loading with try/except/finally for reliable
  sys.modules and sys.path cleanup
- Add comprehensive tests for parsing, validation, and auto-wrapping

* fix sys.path cleanup in config loader and simplify tests

- Use pop(0) instead of remove() to precisely undo the insert(0, ...)
  and avoid accidentally removing a different matching path entry
- Replace MagicMock with real DataDesignerConfigBuilder in tests

* move config format auto-detection into from_config

Centralize the shorthand DataDesignerConfig detection (columns at
top level without a data_designer wrapper) in
DataDesignerConfigBuilder.from_config so all callers benefit, not
just the CLI config loader. Simplify config_loader to delegate file
parsing and format normalization entirely to from_config.
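
As a sketch of the shorthand detection described above: a dict with columns at the top level and no data_designer wrapper gets wrapped before further processing. The function name normalize_config is hypothetical, and the real from_config presumably does more validation than this.

```python
def normalize_config(raw):
    """Wrap a shorthand config (columns at top level) in a data_designer key."""
    if not isinstance(raw, dict):
        raise ValueError("config must be a mapping")
    if "data_designer" in raw:
        # Already in the wrapped builder format; pass through unchanged.
        return raw
    if "columns" in raw:
        # Shorthand format: treat the whole dict as the data_designer section.
        return {"data_designer": raw}
    raise ValueError("config has neither 'data_designer' nor 'columns'")
```

Doing this in one place means every caller accepts either format without repeating the check.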

* extract GenerationController from CLI commands

Move shared generation logic (preview, validate, create) out of the
individual Typer command functions into a dedicated GenerationController,
matching the existing controller pattern (DownloadController, etc.).
The command functions now delegate to the controller, keeping them as
thin entry points. Tests updated accordingly — command tests verify
delegation while controller tests cover the full behavior.

* harden sys.path cleanup and add explanatory comments

Use sys.path.remove() instead of checking sys.path[0] so cleanup
succeeds even when exec_module inserts entries at index 0. Drop
unnecessary spec=DataDesignerConfigBuilder from test mocks.
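
The cleanup approach described above can be sketched as follows. This is an illustration under assumptions, not the project's config loader: the function name and the default module name are hypothetical. Removing the entry by value means cleanup still works if the executed module pushes its own entries onto the front of sys.path.

```python
import importlib.util
import sys
from pathlib import Path


def load_module_from_path(path, module_name="_user_config"):
    path = Path(path).resolve()
    parent = str(path.parent)
    # Let the user module import its siblings while it executes.
    sys.path.insert(0, parent)
    try:
        spec = importlib.util.spec_from_file_location(module_name, path)
        if spec is None or spec.loader is None:
            raise ImportError(f"cannot load module from {path}")
        module = importlib.util.module_from_spec(spec)
        sys.modules[module_name] = module
        spec.loader.exec_module(module)
        return module
    finally:
        # Always undo our side effects, whether or not execution succeeded.
        sys.modules.pop(module_name, None)
        try:
            # remove() finds our entry even if it is no longer at index 0.
            sys.path.remove(parent)
        except ValueError:
            pass
```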

* check stdout TTY in preview interactive mode detection

Previously only stdin was checked, so piping stdout (e.g.
`dd preview cfg.yaml | head`) would still attempt interactive
browsing. Now both stdin and stdout must be a TTY.
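
A minimal sketch of the check described above, assuming a hypothetical helper name is_interactive. The stream parameters exist only to make the function easy to test.

```python
import sys


def is_interactive(stdin=None, stdout=None):
    # Interactive browsing needs a terminal on both ends: keys are read
    # from stdin and the UI is drawn on stdout.
    stdin = sys.stdin if stdin is None else stdin
    stdout = sys.stdout if stdout is None else stdout
    return bool(stdin.isatty() and stdout.isatty())
```

When stdout is piped, isatty() on it returns False, so the command would fall back to non-interactive output.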
2026-02-11 14:06:06 -05:00
Eric W. Tramel
e6e58e692e
feat: MCP (Model Context Protocol) tool calling integration for LLM columns (#248) 2026-02-02 09:41:58 -05:00
Johnny Greco
c19f35639f
chore: add publish script and update license headers (#253) 2026-01-28 08:47:34 -05:00
Johnny Greco
ae0665fa16
refactor: slim package refactor into three subpackages (#240)
* remove old structure

* major shuffle

* streamline project configs

* update make commands

* updates to make commands

* remove essentials

* initialize logger in interface

* uv lock

* ignore notepad

* update workflows

* fix e2e project config

* generate colab notebooks

* resolve default model settings in interface

* fix build commands

* update perf import make command

* cleaning up some slop

* update recipes

* move conftest files to tests/

* update subpackage readmes

* streamline config_logging

* use exports

* update perf import usage pattern

* update for IDE behavior with ruff

* remove engine's fixtures file

* add note about lazy imports

* update dependencies

* update docs

* doc fixes

* uv lock

* updates to catch up with main

* clean up makefile

* remove package gitignores

* define deps only once

* isolate tests

* add test for protection rule

* create temp dirs for isolated tests

* catch up to main

* update headers

* reapply changes

* better result summaries for isolated tests

* move exports into top-level init

* fix client importlib version syntax

* catch up with main
2026-01-27 13:53:20 -05:00