DataDesigner/docs/quick-start.md
Nabin Mulepati cb0b1c6f6a
docs: docs for quickstart, cli, model settings (#37)
Co-authored-by: Johnny Greco <jogreco@nvidia.com>
2025-11-18 21:28:03 -07:00

Quick Start

Data Designer ships with built-in model providers and model configurations, so you can start generating synthetic data immediately. This guide walks through a first run using those defaults.

Prerequisites

Before you begin, you'll need an API key from one of the default providers, such as NVIDIA (from build.nvidia.com) or OpenAI.

Set your API key as an environment variable:

export NVIDIA_API_KEY="your-api-key-here"
# Or for OpenAI
export OPENAI_API_KEY="your-openai-api-key-here"
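
Before running the example, it can help to confirm that Python actually sees one of these keys. The following is a minimal sketch using only the standard library; the helper name `find_api_keys` is hypothetical and is not part of Data Designer.

```python
import os

# Environment variable names taken from the export commands above.
API_KEY_VARS = ("NVIDIA_API_KEY", "OPENAI_API_KEY")


def find_api_keys(env=None):
    """Return the names of the API key variables that are set and non-empty.

    `env` defaults to the process environment; passing a dict makes the
    helper easy to test. This is an illustrative helper, not a library API.
    """
    env = os.environ if env is None else env
    return [name for name in API_KEY_VARS if env.get(name, "").strip()]


if __name__ == "__main__":
    found = find_api_keys()
    if found:
        print("Found API keys:", ", ".join(found))
    else:
        print("No API key found; set one of:", ", ".join(API_KEY_VARS))
```

If the script reports no key, re-run the `export` command in the same shell session that launches Python.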

Example

Below we'll construct a simple Data Designer workflow that generates multilingual greetings.

import os

from data_designer.essentials import (
    CategorySamplerParams,
    DataDesigner,
    DataDesignerConfigBuilder,
    InfoType,
    LLMTextColumnConfig,
    SamplerColumnConfig,
    SamplerType,
)

# Set your API key from build.nvidia.com
# Skip this step if you've already exported your key as an environment variable
os.environ["NVIDIA_API_KEY"] = "your-api-key-here"

# Create a DataDesigner instance
# This automatically configures the default model providers
data_designer = DataDesigner()

# Print out all the model providers available
data_designer.info.display(InfoType.MODEL_PROVIDERS)

# Create a config builder
# This automatically loads the default model configurations
config_builder = DataDesignerConfigBuilder()

# Print out all the model configurations available
config_builder.info.display(InfoType.MODEL_CONFIGS)

# Add a sampler column to randomly select a language
config_builder.add_column(
    SamplerColumnConfig(
        name="language",
        sampler_type=SamplerType.CATEGORY,
        params=CategorySamplerParams(
            values=["English", "Spanish", "French", "German", "Italian"],
        ),
    )
)

# Add an LLM text generation column
# We'll use the built-in 'nvidia-text' model alias
config_builder.add_column(
    LLMTextColumnConfig(
        name="greetings",
        model_alias="nvidia-text",
        prompt="""Write a casual and formal greeting in '{{language}}' language.""",
    )
)

# Run a preview to generate sample records
preview_results = data_designer.preview(config_builder=config_builder)

# Display a sample record
preview_results.display_sample_record()

🎉 Congratulations, you've successfully run one iteration of designing your synthetic data. Follow along to learn more.
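
To build intuition for how the two columns in the example relate, here is a library-free sketch of the idea: pick a category value for each record, then substitute it into the prompt template. This is only an illustration and not Data Designer's implementation. The function name `build_prompts` is hypothetical, uniform sampling is an assumption, and plain string replacement stands in for whatever template rendering the library performs. No LLM is called; the sketch stops at the rendered prompt.

```python
import random

# Values and prompt copied from the example above.
LANGUAGES = ["English", "Spanish", "French", "German", "Italian"]
PROMPT_TEMPLATE = "Write a casual and formal greeting in '{{language}}' language."


def build_prompts(num_records, seed=None):
    """Sample a language per record and render the prompt for it.

    Assumes uniform sampling over LANGUAGES (an assumption for this sketch).
    """
    rng = random.Random(seed)  # seeded generator for reproducible samples
    records = []
    for _ in range(num_records):
        language = rng.choice(LANGUAGES)
        # Simple placeholder substitution standing in for template rendering.
        prompt = PROMPT_TEMPLATE.replace("{{language}}", language)
        records.append({"language": language, "prompt": prompt})
    return records


if __name__ == "__main__":
    for record in build_prompts(3, seed=0):
        print(record["prompt"])
```

In the real workflow, each rendered prompt would be sent to the model behind the `nvidia-text` alias, and the response would fill the `greetings` column.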

To learn more about the default providers and model configurations available, see the Default Model Settings guide.