Mirror of https://github.com/NVIDIA-NeMo/DataDesigner, synced 2026-05-24 09:48:29 +00:00
* refactor: simplify agent CLI to context, types, and state subcommands
- Remove schema and builder subcommands and all supporting code
- Add description column (docstring first paragraph) to types table
- Add config_file per family (relative to data_designer package)
- Add config_package_path and library_version to context output
- Clean section hierarchy: ## for sections, ### for family sub-tables
- Add docstrings to ScalarInequalityConstraint and ColumnInequalityConstraint
* cleanup: remove dead code and fix redundant type discovery
- Remove unused get_import_path (only used by deleted schema/builder)
- Remove unused class_name from catalog dicts
- Fix N+1: get_family_source_file uses get_args directly instead of
rediscovering all types via discover_family_types
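
As a rough illustration of the `get_args` change above: the sketch below reads the members of a union directly from the type, with no separate discovery pass. The classes and the `ColumnFamily` union are hypothetical stand-ins, not the project's actual types, and `family_member_names` is an invented helper name.

```python
from typing import Union, get_args


class CategoryColumn:
    """Hypothetical family member used only for this sketch."""


class ExpressionColumn:
    """Hypothetical family member used only for this sketch."""


# Hypothetical union standing in for one config family.
ColumnFamily = Union[CategoryColumn, ExpressionColumn]


def family_member_names(family: object) -> list[str]:
    # get_args returns the union's members in declaration order,
    # so the family can be inspected without scanning every type.
    return [member.__name__ for member in get_args(family)]


print(family_member_names(ColumnFamily))
```
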
* docs: update DropColumnsProcessorConfig docstring to prefer drop=True
* fix: address Greptile review feedback
- Add parameters:/params: to _SECTION_HEADERS for docstring parsing
- Fix config_package_path to return parent of data_designer package so
Path(base) / relative_file resolves correctly
- Use last occurrence of data_designer in _get_source_file to handle
nested paths (e.g. dev checkouts)
- Return list of deduplicated files per family (get_family_source_files)
instead of assuming all types live in one file
- Add config_builder_file to context output
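
The path handling described in this fix can be sketched roughly as follows, assuming POSIX-style paths. `split_at_last_package` is a hypothetical helper, not the project's `_get_source_file`; it only shows why anchoring on the last `data_designer` component keeps `Path(base) / relative_file` resolvable in a nested dev checkout.

```python
from pathlib import PurePosixPath

PACKAGE = "data_designer"  # package name taken from the commit text


def split_at_last_package(path: str, package: str = PACKAGE) -> tuple[str, str]:
    """Return (base, relative), where base is the parent of the package dir."""
    parts = PurePosixPath(path).parts
    indices = [i for i, part in enumerate(parts) if part == package]
    if not indices:
        # How the real code treats this case is outside this sketch.
        raise ValueError(f"{package!r} not found in {path!r}")
    # Use the last occurrence so a checkout directory that shares the
    # package name does not shadow the real package directory.
    last = indices[-1]
    base = PurePosixPath(*parts[:last]).as_posix()
    relative = PurePosixPath(*parts[last:]).as_posix()
    return base, relative


source = "/home/dev/data_designer/src/data_designer/config/column_configs.py"
base, relative = split_at_last_package(source)
print(base, relative)
```

Because `base` is the parent of the package directory, joining it with `relative` reproduces the original path.
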
* fix: resolve config_builder_file dynamically and fix fragile test
- Use _get_source_file(DataDesignerConfigBuilder) instead of hardcoded
string for config_builder_file, consistent with family file resolution
- Fix test assertion that assumed "config" in path (only true in dev)
* fix: return empty string for unresolvable source paths
- _get_source_file returns "" instead of absolute path when
data_designer is not in the path, consistent with error branch
- Add Config Module section to context output pointing agent to
the config module as the only part of the codebase to work with
- Rename config_package_path to config_module_path (returns config dir)
* refactor: remove ConfigBase.schema_text() and supporting helpers
Schema rendering is no longer needed in the config layer: the agent
CLI now provides file paths so agents can read source files directly.
* Improve agent context output and processor discoverability
- Redeclare `name: str` in DropColumnsProcessorConfig and
SchemaTransformProcessorConfig so agents see the required field
without reading the base class
- Add base config file path to agent context output
- Optimize agent context formatting: strip redundant path prefixes,
remove family count summary, separate usable/unusable model aliases,
rename sections for clarity
* fix: restore emoji literal in get_column_emoji
* fix: revert unnecessary name redeclarations and use posix paths
- Remove bare name: str redeclarations in processor configs that
silently dropped the parent Field(description=...)
- Use Path.as_posix() in _get_source_file for consistent forward slashes
* docs: standardize config docstrings with (required) markers and Inherited Attributes
- Add (required) to all required parameters in Attributes sections
- Add Inherited Attributes section to all config subclasses listing
fields from parent classes (SingleColumnConfig, ProcessorConfig, Constraint)
- Fix stale with_trace descriptions in LLM subclass inherited sections
- Remove discriminator fields from Attributes sections
- Remove redundant name: str redeclaration from ExpressionColumnConfig
* fix: address Greptile feedback on model aliases and test paths
- Show per-alias reason for unusable models instead of blanket
"missing API keys" label
- Surface model_config_present: tell agent when no config file exists
- Fix test fixtures to use realistic data_designer/config/ paths that
exercise _strip_config_prefix
* test: add coverage for model_config_present=false branch
* docs: put required attributes first in Inherited Attributes docstrings
Move `name (required)` to the top of the Inherited Attributes section
in LLMCodeColumnConfig, LLMStructuredColumnConfig, and LLMJudgeColumnConfig
so required fields appear before optional ones.
* fix: improve agent CLI output for clarity and agent comprehension
- Use {config_root}/file.py path syntax across all agent output
- Add config_root preamble to standalone `agent types` output
- Replace type_name (discriminator) with type (class name) in tables
- Show only usable model aliases; warn agent to surface config issues
- Add directive scoping agents to the config module only
- Reword import hint and config module description for directness
* fix: fall back to absolute path for plugin source files
_get_source_file() returned "" for types outside the data_designer
package (e.g., plugin configs). Now returns the absolute path so
the agent still gets a readable file reference.
* fix: remove unreachable model_config_present branch from formatter
main() calls ensure_cli_default_model_settings() before any agent
command, so model config is always seeded. The model_config_present=False
branch was dead code.
* test: add coverage for no-usable-model-aliases warning
Covers the remaining branch in _format_model_aliases_context where
all aliases are unusable and the agent gets a warning to surface to
the user.
* fix: add inherited attributes to section headers and use posix paths
Address two Greptile review comments:
- Add "inherited attributes:" to _SECTION_HEADERS so docstring parsing
stops before that section even without a preceding blank line.
- Use .as_posix() in get_config_module_path() for consistent
forward-slash paths across platforms.
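
A minimal sketch of the docstring parsing behavior described above, assuming the description is the first paragraph of the docstring and that parsing stops at a blank line or a known section header. `SECTION_HEADERS` and `first_paragraph` are illustrative names; only the three header strings listed here are confirmed by the commit text, and the real `_SECTION_HEADERS` may contain more.

```python
# Illustrative header set; the actual _SECTION_HEADERS may differ.
SECTION_HEADERS = {"parameters:", "params:", "inherited attributes:"}


def first_paragraph(docstring: str) -> str:
    """Collect docstring lines up to the first blank line or section header."""
    collected = []
    for raw_line in docstring.strip().splitlines():
        line = raw_line.strip()
        # A section header ends the description even when no blank
        # line precedes it.
        if not line or line.lower() in SECTION_HEADERS:
            break
        collected.append(line)
    return " ".join(collected)


doc = """Drop columns from the output.
Inherited Attributes:
    name (required): Processor name.
"""
print(first_paragraph(doc))
```
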
| File |
|---|
| test_agent_introspection.py |
| test_agent_introspection_integration.py |
| test_agent_text_formatter.py |
| test_config_loader.py |
| test_sample_records_pager.py |