mirror of https://github.com/NVIDIA-NeMo/DataDesigner synced 2026-05-24 09:48:29 +00:00

feat: add Streamable HTTP transport support for remote MCP providers (#358 )

* feat: add Streamable HTTP transport support for remote MCP providers (#357)

Add `streamable_http` as a supported transport type for `MCPProvider`,
enabling connections to MCP servers that use the Streamable HTTP protocol
(e.g. Tavily remote endpoints). Previously only SSE transport was supported,
causing silent 5-minute timeouts when connecting to incompatible endpoints.

- Expand `MCPProvider.provider_type` to `Literal["sse", "streamable_http"]`
  (default remains `"sse"` for backwards compatibility)
- Route `streamable_http` providers through `streamablehttp_client` from
  the MCP SDK in `MCPIOService._get_or_create_session()`
- Handle variable-length context manager results from MCP transport clients
- Add `DataDesigner.list_mcp_tool_names()` for discovering available tools
- Update CLI form builder and controller to support the new transport option
- Add tests for streamable_http config, session creation, and form builder

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* updates

* simplify import

* address greptile comments

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>

2026-03-04 08:11:54 -07:00

4.3 KiB

Raw Blame History

MCP (Model Context Protocol)

The mcp module defines configuration and execution classes for tool use via MCP (Model Context Protocol).

Configuration Classes

MCPProvider configures remote MCP servers via SSE or Streamable HTTP transport. LocalStdioMCPProvider configures local MCP servers as subprocesses via stdio transport. ToolConfig defines which tools are available for LLM columns and how they are constrained.

For user-facing guides, see:

MCP Providers - Configure local or remote MCP providers
Tool Configs - Define tool permissions and limits
Enabling Tools - Use tools in LLM columns
Traces - Capture full conversation history

Internal Architecture

Parallel Structure

Model Layer	MCP Layer	Purpose
`ModelProviderRegistry`	`MCPProviderRegistry`	Holds provider configurations
`ModelRegistry`	`MCPRegistry`	Manages configs by alias, lazy facade creation
`ModelFacade`	`MCPFacade`	Lightweight facade scoped to specific config
`ModelConfig.alias`	`ToolConfig.tool_alias`	Alias for referencing in column configs

MCPProviderRegistry

Holds MCP provider configurations. Can be empty (MCP is optional). Created first during resource initialization.

MCPRegistry

The central registry for tool configurations:

Holds ToolConfig instances by tool_alias
Lazily creates MCPFacade instances via get_mcp(tool_alias)
Manages shared connection pool and tool cache across all facades
Validates that tool configs reference valid providers

MCPFacade

A lightweight facade scoped to a specific ToolConfig. Key methods:

Method	Description
`tool_call_count(response)`	Count tool calls in a completion response
`has_tool_calls(response)`	Check if response contains tool calls
`get_tool_schemas()`	Get OpenAI-format tool schemas for this config
`process_completion_response(response)`	Execute tool calls and return messages
`refuse_completion_response(response)`	Refuse tool calls gracefully (budget exhaustion)

Properties: tool_alias, providers, max_tool_call_turns, allow_tools, timeout_sec

I/O Layer (mcp/io.py)

The io.py module provides low-level MCP communication with performance optimizations:

Single event loop architecture: All MCP operations funnel through a dedicated background daemon thread running an asyncio event loop. This allows:

Efficient concurrent I/O without per-thread event loop overhead
Natural session sharing across all worker threads
Clean async implementation for parallel tool calls

Session pooling: MCP sessions are created lazily and kept alive for the program's duration:

One session per provider (keyed by serialized config)
No per-call connection/handshake overhead
Graceful cleanup on program exit via atexit handler

Request coalescing: The list_tools operation uses request coalescing to prevent thundering herd:

When multiple workers request tools from the same provider simultaneously
Only one request is made; others wait for the cached result
Uses asyncio.Lock per provider key

Parallel tool execution: The call_tools_parallel() function executes multiple tool calls concurrently via asyncio.gather(). This is used by MCPFacade when the model returns parallel tool calls in a single response.

Integration with ModelFacade.generate()

The ModelFacade.generate() method accepts an optional tool_alias parameter:

output, messages = model_facade.generate(
    prompt="Search and answer...",
    parser=my_parser,
    tool_alias="my-tools",  # Enables tool calling for this generation
)

When tool_alias is provided:

ModelFacade looks up the MCPFacade from MCPRegistry
Tool schemas are fetched and passed to the LLM
After each completion, MCPFacade processes tool calls
Turn counting tracks iterations; refusal kicks in when budget exhausted
Messages (including tool results) are returned for trace capture

Config Module

::: data_designer.config.mcp

4.3 KiB Raw Blame History