mirror of
https://github.com/NVIDIA-NeMo/DataDesigner
synced 2026-05-24 09:48:29 +00:00
* feat: support nested field access in schema transform templates
Enable {{ result.quality.score }} style dot notation in schema
transform Jinja2 templates, where result is a deserialized JSON column.
Previously, _json_escape_record flattened all dict values to escaped
JSON strings before Jinja2 saw them. This made the rendered output
valid JSON but prevented nested access since Jinja2 only saw strings.
The fix introduces TemplateValue, a wrapper that defers the choice
between "drill into nested dict" and "render as escaped string" to
template evaluation time. Jinja2 resolves dot notation via __getattr__
(returning a new TemplateValue for the nested value), and converts to
string via __str__ (delegating to a caller-provided str_fn). This is
necessary because plain dicts render as Python repr ({'key': 'val'})
which is invalid JSON - we need to control __str__ to produce properly
escaped JSON, and that requires a wrapper object.
Other Jinja2 consumers (prompt templates, expression columns) don't
need this - Jinja2 natively supports dot access on plain dicts via
getattr-to-getitem fallback, and plain str() is fine for text output.
Schema transform is unique because its output must be valid JSON.
* Address PR review comments
- Fix boolean serialization: add bool check before str in _escape_value_for_json
to produce JSON 'true'/'false' instead of Python 'True'/'False'
- Add class-level _record_str_fn annotation to WithJinja2UserTemplateRendering
- Rename skip_record_sanitization to _skip_record_sanitization (underscore prefix)
to signal internal-only usage, and document it in safe_render docstring
- Add defensive error handling in TemplateValue.__getitem__ and __iter__
- Promote test input data to parametrize column, removing brittle string scan
* Address second round of PR review comments
- Add __eq__ and __hash__ to TemplateValue so Jinja2 equality
conditionals (e.g. {% if result.label == "excellent" %}) work
- Add inline comment explaining deliberate double-encode in
_escape_value_for_json for dict/list values
- Default _record_str_fn to None at class level so accessing it
before prepare_jinja2_template_renderer doesn't mask the real error
* refactor: replace TemplateValue with Jinja2 finalize hook
Use Jinja2's built-in finalize hook for value-to-string conversion
and a getattr override for dict-key-priority lookup, eliminating the
custom TemplateValue wrapper class entirely.
* docs: add missing param docs to prepare_jinja2_template_renderer
|
||
|---|---|---|
| .. | ||
| __init__.py | ||
| test_drop_columns.py | ||
| test_registry.py | ||
| test_schema_transform.py | ||