mirror of
https://github.com/NVIDIA-NeMo/DataDesigner
synced 2026-05-24 09:48:29 +00:00
Preserves tree from previous docs-website head: 5e47d33ea8. This branch is a CI-managed publish artifact like gh-pages; source provenance is tracked in commit messages rather than Git ancestry.
19 lines
901 B
Markdown
# Seeds
Seed configs declare existing data used as input during generation. A [SeedConfig](#data_designer.config.seed.SeedConfig) combines a seed source with optional row sampling and selection settings. Seed source objects declare where seed data comes from; the engine reads them through seed readers.
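To make the "source plus optional sampling and selection" idea concrete, here is a minimal standard-library sketch. It does not import Data Designer, and every name in it (`ToySeedConfig`, `ToyIndexRange`, `iter_seed_rows`, the `"ordered"`/`"shuffle"` values) is hypothetical, chosen only for illustration; the real field names and options are the ones documented under `data_designer.config.seed`.

```python
import random
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class ToyIndexRange:
    """Hypothetical inclusive row range used to select a slice of the seed data."""
    start: int
    end: int


@dataclass(frozen=True)
class ToySeedConfig:
    """Hypothetical stand-in: a seed source plus optional sampling and selection."""
    source: list  # stand-in for a real seed source object; here just a list of row dicts
    sampling: Literal["ordered", "shuffle"] = "ordered"
    selection: Optional[ToyIndexRange] = None
    random_seed: int = 0


def iter_seed_rows(config: ToySeedConfig) -> list:
    """Apply the row selection first, then the sampling strategy."""
    rows = list(config.source)
    if config.selection is not None:
        rows = rows[config.selection.start : config.selection.end + 1]
    if config.sampling == "shuffle":
        # A seeded RNG keeps the shuffled order reproducible across runs.
        random.Random(config.random_seed).shuffle(rows)
    return rows


seed_rows = [{"topic": t} for t in ("physics", "history", "cooking", "music")]

# Keep only rows 1..2 (inclusive), in their original order.
ordered = iter_seed_rows(ToySeedConfig(source=seed_rows, selection=ToyIndexRange(1, 2)))

# Use every row, but in a reproducibly shuffled order.
shuffled = iter_seed_rows(ToySeedConfig(source=seed_rows, sampling="shuffle", random_seed=7))
```

The sketch treats selection and sampling as independent, optional settings layered on top of the source, which is the shape the paragraph above describes; whether the library applies them in this order is an assumption.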
Use these objects with `DataDesignerConfigBuilder.with_seed_dataset()`. Related pages: [Seed Datasets](../../concepts/seed-datasets.md) and [seed readers](../engine/seed_readers.md).

Built-in seed sources include local files, Hugging Face paths, in-memory DataFrames, directories, file contents, and agent rollout traces. Plugin seed sources can extend the same discriminated union through the plugin system.
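As a rough illustration of how a discriminated union can be extended by plugins, the following standard-library sketch dispatches on a tag field and lets new source types register themselves. It is not Data Designer's implementation: the `seed_type` discriminator, the `Toy*` classes, and the `register_source`/`parse_source` helpers are all assumptions made up for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyLocalFileSource:
    """Hypothetical built-in source pointing at a file on disk."""
    path: str
    seed_type: str = "local_file"  # hypothetical discriminator value


@dataclass(frozen=True)
class ToyInMemorySource:
    """Hypothetical built-in source holding rows directly."""
    rows: tuple
    seed_type: str = "in_memory"


# Maps each discriminator value to the class that handles it.
SOURCE_TYPES = {
    "local_file": ToyLocalFileSource,
    "in_memory": ToyInMemorySource,
}


def register_source(tag: str, cls: type) -> None:
    """Let a plugin add a new member to the union without editing the built-ins."""
    if tag in SOURCE_TYPES:
        raise ValueError(f"seed_type already registered: {tag!r}")
    SOURCE_TYPES[tag] = cls


def parse_source(payload: dict):
    """Pick the concrete class from the discriminator, then build it from the rest."""
    data = dict(payload)
    tag = data.pop("seed_type")
    if tag not in SOURCE_TYPES:
        raise ValueError(f"unknown seed_type: {tag!r}")
    return SOURCE_TYPES[tag](**data)


@dataclass(frozen=True)
class ToySqlSource:
    """Hypothetical plugin-provided source."""
    query: str
    seed_type: str = "sql"


register_source("sql", ToySqlSource)

local = parse_source({"seed_type": "local_file", "path": "seeds.csv"})
plugin = parse_source({"seed_type": "sql", "query": "SELECT * FROM seeds"})
```

The point of the pattern is that callers handle one union type, while the tag decides which concrete source is built; built-in and plugin sources go through the same lookup.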
## Seed Config
::: data_designer.config.seed

## Built-In Seed Sources
::: data_designer.config.seed_source

## DataFrame Seed Source
::: data_designer.config.seed_source_dataframe