Development Guide
This document covers local setup, day-to-day development workflow, testing, and pre-commit usage for DataDesigner contributors.
For architectural invariants and project identity, see AGENTS.md. For code style, naming, and import conventions, see STYLEGUIDE.md. For the contribution workflow (issues, PRs, agent-assisted development), see CONTRIBUTING.md.
Prerequisites
Local Setup
Clone and Install
git clone https://github.com/NVIDIA-NeMo/DataDesigner.git
cd DataDesigner
# Install with dev dependencies
make install-dev
# Or, if you use Jupyter / IPython for development
make install-dev-notebooks
Verify Your Setup
make test && make check-all
If no errors are reported, you're ready to develop.
Day-to-Day Workflow
Branching
git checkout main
git pull origin main
git checkout -b <username>/<type>/<issue-number>-<short-description>
Branch name types: feat, fix, docs, test, refactor, chore, style, perf.
Example: nmulepati/feat/123-add-xyz-generator
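The naming convention above can be checked mechanically. The following is a minimal sketch, not part of the repository's tooling: the function name `is_valid_branch_name` is hypothetical, and the allowed characters for the username and description segments are assumptions, since this guide only specifies the overall shape and the list of types.

```python
import re

# Types listed in the branching convention above.
BRANCH_TYPES = ("feat", "fix", "docs", "test", "refactor", "chore", "style", "perf")

# Assumption: usernames may contain letters, digits, "_", "." and "-";
# descriptions are lowercase words joined by hyphens. The guide does not say.
BRANCH_PATTERN = re.compile(
    rf"[A-Za-z0-9_.-]+/(?:{'|'.join(BRANCH_TYPES)})/\d+-[a-z0-9]+(?:-[a-z0-9]+)*"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if `name` looks like <username>/<type>/<issue-number>-<short-description>."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    print(is_valid_branch_name("nmulepati/feat/123-add-xyz-generator"))
    print(is_valid_branch_name("nmulepati/feature/123-add-xyz-generator"))
```

The first call uses the example branch name from this guide; the second uses `feature`, which is not one of the listed types.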
Syncing with Upstream
If you're working from a fork, add the upstream remote first:
git remote add upstream https://github.com/NVIDIA-NeMo/DataDesigner.git
Then sync:
git fetch upstream
git merge upstream/main
Validation Before Committing
make check-all-fix # format + lint (ruff)
make test # run all test suites
Code Quality
Using Make (Recommended)
make lint # Run ruff linter
make lint-fix # Fix linting issues automatically
make format # Format code with ruff
make format-check # Check code formatting without changes
make check-all # Run all checks (format-check + lint)
make check-all-fix # Run all checks with autofix (format + lint-fix)
Direct Commands
uv run ruff check # Lint all files
uv run ruff check --fix # Lint with autofix
uv run ruff format # Format all files
uv run ruff format --check # Check formatting
Testing
Running Tests
make test runs all three package test suites in sequence (config, engine, interface). When iterating on a single package, run its tests directly:
# Run all tests (config + engine + interface)
make test
# Run a single package's tests
make test-config # data-designer-config
make test-engine # data-designer-engine
make test-interface # data-designer (interface)
# Run a specific test file
uv run pytest tests/config/test_sampler_constraints.py
# Run tests with verbose output
uv run pytest -v
# Run tests with coverage
make coverage
# View htmlcov/index.html in browser
# E2E and example tests (slower, require API keys — see README.md for setup)
make test-e2e # end-to-end tests
make test-run-tutorials # run tutorial notebooks as tests
make test-run-recipes # run recipe scripts as tests
Test Patterns
The project uses pytest with the following patterns:
- Flat test functions: Write standalone test_* functions, not class-based test suites. Use fixtures and parametrize for shared setup instead of class inheritance.
- Fixtures: Shared fixtures are provided via pytest_plugins from data_designer.config.testing.fixtures and data_designer.engine.testing.fixtures, plus local conftest.py files in each test directory
- Stub configs: YAML-based configuration stubs for testing (see stub_data_designer_config_str fixture)
- Mocking: Use unittest.mock.patch for external services and dependencies
- Async support: pytest-asyncio for async tests (asyncio_default_fixture_loop_scope = "session")
- HTTP mocking: pytest-httpx for mocking HTTP requests
- Coverage: Track test coverage with pytest-cov
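As an illustration of the flat-function and mocking patterns listed above, here is a minimal sketch that uses only the standard library. The helper `fetch_model_names` and the URL are hypothetical and do not exist in DataDesigner; they stand in for any code that calls an external service.

```python
import json
import urllib.request
from unittest.mock import MagicMock, patch


def fetch_model_names(url: str) -> list[str]:
    """Hypothetical helper: return the model names listed by a remote JSON endpoint."""
    with urllib.request.urlopen(url) as response:
        payload = json.loads(response.read())
    return [entry["name"] for entry in payload["models"]]


def test_fetch_model_names_returns_names() -> None:
    """Patch the HTTP boundary so the test never touches the network."""
    fake_response = MagicMock()
    fake_response.read.return_value = b'{"models": [{"name": "a"}, {"name": "b"}]}'
    # urlopen is used as a context manager, so __enter__ must hand back the fake.
    fake_response.__enter__.return_value = fake_response

    with patch("urllib.request.urlopen", return_value=fake_response):
        names = fetch_model_names("https://example.invalid/models")

    # Assert on the returned value, not on how many times the mock was called.
    assert names == ["a", "b"]
```

The test is a standalone, fully annotated function, and the only thing mocked is the external call. For code that uses httpx, the pytest-httpx plugin mentioned above would be the more natural choice.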
Test Guidelines
- Test public APIs only: Tests should exercise public interfaces, not _-prefixed functions or classes. If something is hard to test without reaching into private internals, consider refactoring the code to expose a public entry point
- Type annotations required: Test functions and fixtures must include type annotations: -> None for tests, typed parameters, and typed return values for fixtures
- Imports at module level: Follow the same import rules as production code: keep imports at the top of the file, not inside test functions
- Parametrize over duplicate: Use @pytest.mark.parametrize (with ids= for readable names) instead of writing multiple test functions for variations of the same behavior
- Minimal fixtures: Fixtures should be simple: one fixture, one responsibility, just setup with no behavior logic
- Shared fixtures in conftest.py: Place fixtures shared across a test directory in conftest.py
- Mock at boundaries: Mock external dependencies (APIs, databases, third-party services), not internal functions
- Test behavior, not implementation: Assert on outputs and side effects, not internal call counts (unless verifying routing)
- Keep mocking shallow: If a test requires deeply nested mocking, the code under test may need refactoring
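Several of the guidelines above (type annotations, parametrization with ids=, minimal fixtures, asserting on behavior) can be seen together in a small sketch. The function `slugify` and the fixture `default_title` are hypothetical and exist only for this illustration.

```python
import pytest


def slugify(title: str) -> str:
    """Hypothetical function under test: lowercase a title and join its words with hyphens."""
    return "-".join(title.lower().split())


@pytest.fixture
def default_title() -> str:
    """One fixture, one responsibility: provide a value, with no behavior logic."""
    return "Hello World"


@pytest.mark.parametrize(
    ("title", "expected"),
    [
        ("Hello World", "hello-world"),
        ("  padded  title ", "padded-title"),
        ("single", "single"),
    ],
    ids=["two-words", "extra-whitespace", "single-word"],
)
def test_slugify(title: str, expected: str) -> None:
    """One parametrized test replaces three near-identical test functions."""
    assert slugify(title) == expected


def test_slugify_default_title(default_title: str) -> None:
    """Fixtures are requested by name and carry a typed return value."""
    assert slugify(default_title) == "hello-world"
```

With ids= set, a failing case shows up in pytest output as, for example, test_slugify[extra-whitespace] rather than as an escaped parameter string.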
Example Test
from typing import Any

from data_designer.config.config_builder import DataDesignerConfigBuilder


def test_something(stub_model_configs: dict[str, Any]) -> None:
    """Test description."""
    builder = DataDesignerConfigBuilder(model_configs=stub_model_configs)
    # ... test implementation
    # `expected` and `actual` are placeholders for values from the behavior under test
    assert expected == actual
Pre-commit Hooks
The project uses pre-commit hooks to enforce code quality. Install them with:
uv run pre-commit install
Hooks include:
- Trailing whitespace removal
- End-of-file fixer
- YAML/JSON/TOML validation
- Merge conflict detection
- Debug statement detection
- Ruff linting and formatting
Common Tasks
make clean # Clean up generated files
make update-license-headers # Add SPDX headers to new files
make check-all-fix # Format + lint before committing
make test # Run all tests
make coverage # Run tests with coverage report
make perf-import # Profile import time
make perf-import CLEAN=1 # Clean cache first, then profile
make convert-execute-notebooks # Regenerate .ipynb from docs/notebook_source/*.py
make generate-colab-notebooks # Generate Colab-compatible notebooks
Import Performance
After adding heavy third-party dependencies, verify import performance:
make perf-import CLEAN=1
There is also a CI test (test_import_performance in packages/data-designer/tests/test_import_perf.py) that runs 5 import cycles (1 cold + 4 warm) and fails if the average exceeds 3 seconds. If your dependency causes a regression, add it to lazy_heavy_imports.py — see STYLEGUIDE.md for the lazy loading pattern.
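To get a rough feel for what such a check measures, the sketch below times imports in fresh interpreter processes and averages the results. It is not the project's test_import_performance implementation: the function names are hypothetical, it does not clear any caches (so it does not separate cold from warm runs), and the measured time includes interpreter startup. The 3-second threshold is the one quoted above.

```python
import statistics
import subprocess
import sys
import time


def time_import_once(module: str) -> float:
    """Return wall-clock seconds for a fresh interpreter to start and import `module`."""
    start = time.perf_counter()
    subprocess.run([sys.executable, "-c", f"import {module}"], check=True)
    return time.perf_counter() - start


def average_import_seconds(module: str, cycles: int = 5) -> float:
    """Average the import time over `cycles` fresh-interpreter runs."""
    return statistics.mean(time_import_once(module) for _ in range(cycles))


if __name__ == "__main__":
    # `json` is a stand-in; substitute the package you want to profile.
    average = average_import_seconds("json")
    print(f"average import time: {average:.3f}s")
    assert average < 3.0, "import time regression"
```

Running each import in its own subprocess matters: within a single process, a second import of the same module is served from sys.modules and would measure almost nothing.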