mirror of
https://github.com/NVIDIA-NeMo/DataDesigner
synced 2026-05-24 09:48:29 +00:00
* plans for model facade overhaul
* update plan
* add review
* address feedback + add more details after several self reviews
* update plan doc
* address nits
* Add canonical objects
* self-review feedback + address
* add LiteLLMRouter protocol to strongly type bridge router param
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* simplify some things
* add a protocol for http response like object
* move HttpResponse
* update PR-1 architecture notes for lifecycle and router protocol
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Address PR #359 feedback: exception wrapping, shared parsing, test improvements
  - Wrap all LiteLLM router calls in try/except to normalize raw exceptions into canonical ProviderError at the bridge boundary (blocking review item)
  - Extract reusable response-parsing helpers into clients/parsing.py for shared use across future native adapters
  - Add async image parsing path using httpx.AsyncClient to avoid blocking the event loop in agenerate_image
  - Add retry_after field to ProviderError for future retry engine support
  - Fix _to_int_or_none to parse numeric strings from providers
  - Create test conftest.py with shared mock_router/bridge_client fixtures
  - Parametrize duplicate image generation and error mapping tests
  - Add tests for exception wrapping across all bridge methods
* Use contextlib to dry out some code
* Address Greptile feedback: HTTP-date retry-after parsing, docstring clarity
  - Parse RFC 7231 HTTP-date strings in Retry-After header (used by Azure and Anthropic during rate-limiting) in addition to numeric delay-seconds
  - Clarify collect_non_none_optional_fields docstring explaining why f.default is None is the correct check for optional field forwarding
  - Add tests for HTTP-date and garbage Retry-After values
* Address Greptile feedback: FastAPI detail parsing, comment fixes
  - Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
  - Handle list-format detail arrays in _extract_structured_message for FastAPI/Pydantic validation errors
  - Document scope boundary for vision content in collect_raw_image_candidates
* add PR-2 architecture notes for model facade overhaul
* save progress on pr2
* small refactor
* address feedback
* Address greptile comment in pr1
* refactor ProviderError from dataclass to regular Exception
  - Replace @dataclass + __post_init__ with explicit __init__ that calls super().__init__ properly, avoiding brittle field-ordering dependency
  - Store cause via __cause__ only, removing the redundant .cause attr
  - Update match pattern in handle_llm_exceptions for non-dataclass type
  - Rename shadowed local `fields` to `optional_fields` in TransportKwargs
* Address greptile feedback
* PR feedback
* track usage tracking in finally block for images
* pr feedback
* add native OpenAI adapter with retry and throttle infrastructure
  - Implement OpenAICompatibleClient using httpx with RetryTransport
  - Add ThrottleManager with AIMD concurrency control and structured logging
  - Route provider_type=openai to native adapter in client factory
  - Add extract_reasoning_content helper for vLLM field migration
  - Make ModelRegistry own ThrottleManager and RetryConfig explicitly
  - Support DATA_DESIGNER_MODEL_BACKEND=litellm_bridge env var override
  Made-with: Cursor
* Self CR
* fix claude slop
* Updates after self-review. Simplify use of ThrottleManager in light of plan 346 scheduler
* wrap facade close in try/catch
* clean up stray params
* fix: address review findings from model facade overhaul PR3
  - Fix metadata bug: drop unknown kwargs instead of passing them to ChatCompletionRequest (which has no metadata field), preventing a runtime TypeError.
  - Lazy-init httpx clients: sync and async clients are created on first use instead of eagerly in the constructor, with constructor injection for testability.
  - Remove defensive getattr on httpx.Response.status_code (always present).
  - Add comment clarifying throttle manager wiring is deferred.
  - Refactor tests to use constructor injection instead of private attribute mutation.
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix stray inclusion of metadata
* small regression fix
* address more feedback
* self review
* Fixes
* new test for aimd lifecycle
* update plan docs
* update plans with refs to prs
* fix: cap acquire_sync/acquire_async sleep to remaining budget to prevent timeout overshoot
  Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* test lazy init
* fix timeout for openaicompatibleadapter
* remove unused attr
* fix: address review findings from PR #402
  - Guard reasoning_content fallback with isinstance(str) check to prevent non-string provider values from violating the return type contract
  - Normalize provider_type comparison to lowercase so mixed-case configs (e.g. "OpenAI") route to the native adapter instead of silently falling through to the bridge
  - Document register-before-acquire ordering invariant on ThrottleManager
  - Add TODO to remove 429 from retryable_status_codes once ThrottleManager is wired via AsyncTaskScheduler (plan 346)
  Made-with: Cursor
* Address pr feedback
* fix method order
* feat: native Anthropic adapter with image block translation
  Implement AnthropicClient as a native httpx-based adapter for the Anthropic Messages API, following the same patterns as OpenAICompatibleClient. Route provider_type="anthropic" to the native adapter in the client factory, with unknown providers falling through to LiteLLMBridgeClient.
  Key adapter behaviors:
  - System message extraction to top-level `system` parameter
  - OpenAI image_url content blocks translated to Anthropic image/source format
  - tool_use / thinking content blocks normalized to canonical types
  - x-api-key auth (not Bearer), anthropic-version header
  - max_tokens defaults to 4096 when omitted
  - stop mapped to stop_sequences
  - Embeddings and image generation raise unsupported capability errors
  Made-with: Cursor
* fix: exclude OpenAI-specific params from Anthropic payload and drop unnecessary re.DOTALL
  Expand _ANTHROPIC_EXCLUDE to filter response_format, frequency_penalty, presence_penalty, and seed from TransportKwargs forwarding. Remove the unnecessary re.DOTALL flag from _DATA_URI_RE since base64 data never contains newlines.
  Made-with: Cursor
* Fix failing test
* fix: address PR #402 review findings
  - Add POST to RetryTransport allowed_methods so retries actually fire for all adapter endpoints (chat, embeddings, images are all POST)
  - Guard close()/aclose() with _init_lock to prevent TOCTOU race with lazy client initialization
  - Detect "unsupported parameter" patterns in 400 responses so the native path returns UNSUPPORTED_PARAMS instead of generic BAD_REQUEST
  - Return None from coerce_message_content for image-only content lists instead of leaking the Python repr via str(content)
  - Restore per-request timeout forwarding for LiteLLMBridgeClient via _with_timeout helper (broken when "timeout" was added to _META_FIELDS)
  - Normalize ModelProvider.provider_type to lowercase via field_validator so consumers don't need per-site .lower() calls
  - Fix unclosed httpx.AsyncClient in lazy-init test
  Made-with: Cursor
* updates to DRY out code between the two adapters
* refactor code to introduce HttpModelClient
* update tests
* fix: address PR #426 review findings
  - Release _aclient in close() to prevent async connection pool leak when both sync and async clients were initialized
  - Drop malformed image_url blocks (missing image_url key) instead of forwarding them unchanged to the Anthropic API
  - Preserve image blocks in system messages by returning Anthropic block-list format when non-text content is present
  - Rename extract_system_text to extract_system_content and add merge_system_parts helper for mixed string/block system parts
  Made-with: Cursor
* fix: improve error classification and surface provider messages for 400s
  - Handle /v1 in Anthropic endpoint gracefully to avoid path duplication
  - Add QUOTA_EXCEEDED provider error kind for credit/billing failures
  - Extend UNSUPPORTED_PARAMS detection for mutually exclusive params
  - Surface raw provider message in formatted errors for 400 status codes
  - Consolidate provider message helpers into single _attach_provider_message
  Made-with: Cursor
* fix: address PR #426 review findings (round 2)
  - Fix close() double-close of shared transport by closing self._transport directly instead of accessing private aclient._transport (critical)
  - Add TODO for threading.Lock → asyncio.Lock split (plan-346)
  - Remove unused _model_id from HttpModelClient and all callers
  - Export AnthropicClient from adapters __init__.py
  - Filter empty text blocks in translate_tool_result_content join
  - Move mock helpers to conftest.py with consistent json.dumps text default
  - Add __init__.py files to enable absolute imports from test conftest
  - Add bridge env override test for anthropic provider
  - Add ConnectionError and non-JSON response tests for AnthropicClient
  - Assert secret_resolver.resolve called with correct key ref in factory tests
  Made-with: Cursor
* Update license headers
* Fix anthropic tool call flow
* fix: add explicit UNSUPPORTED_CAPABILITY error mapping
  - Add ModelUnsupportedCapabilityError so unsupported operations (e.g. Anthropic embeddings/image-generation) surface a specific error instead of falling through to generic ModelAPIError
  - Forward the provider's operation-specific message in the cause
  - Add parametrized test case for the new error path
2 lines
137 B
Python
# SPDX-FileCopyrightText: Copyright (c) 2026 NVIDIA CORPORATION & AFFILIATES. All rights reserved.
# SPDX-License-Identifier: Apache-2.0