Commit graph

1 commit

Author: Nabin Mulepati
SHA1: a9a16ae61a
Message:
feat: Native Anthropic adapter with shared HTTP client infrastructure (#426)
* plans for model facade overhaul

* update plan

* add review

* address feedback + add more details after several self reviews

* update plan doc

* address nits

* Add canonical objects

* self-review feedback + address

* add LiteLLMRouter protocol to strongly type bridge router param

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* simplify some things

* add a protocol for HTTP-response-like objects

* move HttpResponse

* update PR-1 architecture notes for lifecycle and router protocol

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Address PR #359 feedback: exception wrapping, shared parsing, test improvements

- Wrap all LiteLLM router calls in try/except to normalize raw exceptions
  into canonical ProviderError at the bridge boundary (blocking review item)
- Extract reusable response-parsing helpers into clients/parsing.py for
  shared use across future native adapters
- Add async image parsing path using httpx.AsyncClient to avoid blocking
  the event loop in agenerate_image
- Add retry_after field to ProviderError for future retry engine support
- Fix _to_int_or_none to parse numeric strings from providers
- Create test conftest.py with shared mock_router/bridge_client fixtures
- Parametrize duplicate image generation and error mapping tests
- Add tests for exception wrapping across all bridge methods
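
The exception wrapping described in the first item above could look roughly like the sketch below. It assumes a context manager is used at the bridge boundary; `wrap_router_errors` and the minimal `ProviderError` stand-in are hypothetical names for illustration, not the project's actual code.

```python
from contextlib import contextmanager


class ProviderError(Exception):
    """Hypothetical stand-in for the canonical error type."""


@contextmanager
def wrap_router_errors(operation):
    """Normalize any raw exception raised inside the block into ProviderError."""
    try:
        yield
    except ProviderError:
        # Already canonical: let it pass through unchanged.
        raise
    except Exception as exc:
        # Chain the raw exception so the original traceback stays reachable.
        raise ProviderError(f"{operation} failed: {exc}") from exc


def call_router(router_call, operation):
    """Run one router call with errors normalized at the boundary."""
    with wrap_router_errors(operation):
        return router_call()
```

Using one context manager for every bridge method keeps the try/except in a single place, which is presumably what the later contextlib cleanup is about.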

* Use contextlib to dry out some code

* Address Greptile feedback: HTTP-date retry-after parsing, docstring clarity

- Parse RFC 7231 HTTP-date strings in Retry-After header (used by
  Azure and Anthropic during rate-limiting) in addition to numeric
  delay-seconds
- Clarify collect_non_none_optional_fields docstring explaining why
  f.default is None is the correct check for optional field forwarding
- Add tests for HTTP-date and garbage Retry-After values
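
A minimal sketch of the Retry-After handling described above, using only the standard library. The function name `parse_retry_after` and the injectable `now` parameter are assumptions for illustration; the real helper may differ.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def parse_retry_after(value, *, now=None):
    """Return the delay in seconds, or None if the header is missing or garbage."""
    if value is None:
        return None
    text = value.strip()
    if not text:
        return None
    # Form 1: delay-seconds, a non-negative integer.
    if text.isascii() and text.isdigit():
        return float(text)
    # Form 2: an HTTP-date such as "Wed, 21 Oct 2015 07:28:00 GMT".
    try:
        when = parsedate_to_datetime(text)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Treat a date without zone information as UTC.
        when = when.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    # A date in the past means "retry immediately", never a negative delay.
    return max(0.0, (when - current).total_seconds())
```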

* Address Greptile feedback: FastAPI detail parsing, comment fixes

- Fix misleading comment about prompt field defaults in _IMAGE_EXCLUDE
- Handle list-format detail arrays in _extract_structured_message for
  FastAPI/Pydantic validation errors
- Document scope boundary for vision content in collect_raw_image_candidates
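
For the list-format `detail` handling above, here is a hedged sketch. FastAPI/Pydantic validation errors arrive as a list of objects with `loc` and `msg` keys; the function name `extract_detail_message` and the `"; "` joining format are assumptions, not the behavior of `_extract_structured_message` itself.

```python
def extract_detail_message(payload):
    """Flatten a `detail` field (string or list of error objects) into one message."""
    detail = payload.get("detail") if isinstance(payload, dict) else None
    if isinstance(detail, str):
        return detail or None
    if not isinstance(detail, list):
        return None
    parts = []
    for item in detail:
        if isinstance(item, str):
            parts.append(item)
        elif isinstance(item, dict) and isinstance(item.get("msg"), str):
            loc = item.get("loc")
            # `loc` is a path such as ["body", "temperature"]; tolerate it being absent.
            path = ".".join(str(p) for p in loc) if isinstance(loc, (list, tuple)) else ""
            parts.append(f"{path}: {item['msg']}" if path else item["msg"])
    return "; ".join(parts) or None
```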

* add PR-2 architecture notes for model facade overhaul

* save progress on pr2

* small refactor

* address feedback

* Address greptile comment in pr1

* refactor ProviderError from dataclass to regular Exception

- Replace @dataclass + __post_init__ with explicit __init__ that calls
  super().__init__ properly, avoiding brittle field-ordering dependency
- Store cause via __cause__ only, removing the redundant .cause attr
- Update match pattern in handle_llm_exceptions for non-dataclass type
- Rename shadowed local `fields` to `optional_fields` in TransportKwargs

* Address greptile feedback

* PR feedback

* track usage in finally block for images

* pr feedback

* add native OpenAI adapter with retry and throttle infrastructure

- Implement OpenAICompatibleClient using httpx with RetryTransport
- Add ThrottleManager with AIMD concurrency control and structured logging
- Route provider_type=openai to native adapter in client factory
- Add extract_reasoning_content helper for vLLM field migration
- Make ModelRegistry own ThrottleManager and RetryConfig explicitly
- Support DATA_DESIGNER_MODEL_BACKEND=litellm_bridge env var override
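
AIMD (additive increase, multiplicative decrease) is the control rule named above: grow the concurrency limit by a fixed step on success and cut it by a factor when the provider signals overload. The sketch below shows only that rule; the class name `AimdLimit` and its default values are assumptions and say nothing about how ThrottleManager is actually structured.

```python
class AimdLimit:
    """Tracks a concurrency limit using the AIMD rule."""

    def __init__(self, initial=4, minimum=1, maximum=64, increase=1, decrease_factor=0.5):
        self.minimum = minimum
        self.maximum = maximum
        self.increase = increase
        self.decrease_factor = decrease_factor
        self.limit = max(minimum, min(maximum, initial))

    def on_success(self):
        # Additive increase: probe for more capacity one step at a time.
        self.limit = min(self.maximum, self.limit + self.increase)
        return self.limit

    def on_rate_limited(self):
        # Multiplicative decrease: back off quickly, but never below the floor.
        self.limit = max(self.minimum, int(self.limit * self.decrease_factor))
        return self.limit
```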

Made-with: Cursor

* Self CR

* fix claude slop

* Updates after self-review. Simplify use of ThrottleManager in light of plan 346 scheduler

* wrap facade close in try/catch

* clean up stray params

* fix: address review findings from model facade overhaul PR3

- Fix metadata bug: drop unknown kwargs instead of passing them to
  ChatCompletionRequest (which has no metadata field), preventing a
  runtime TypeError.
- Lazy-init httpx clients: sync and async clients are created on first
  use instead of eagerly in the constructor, with constructor injection
  for testability.
- Remove defensive getattr on httpx.Response.status_code (always present).
- Add comment clarifying throttle manager wiring is deferred.
- Refactor tests to use constructor injection instead of private attribute
  mutation.
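
Lazy initialization with constructor injection, as described above, can be sketched generically. `LazyClient` and its `factory` argument are hypothetical; in the adapter the factory would presumably build an httpx client, but nothing here depends on httpx.

```python
import threading


class LazyClient:
    """Creates the underlying client on first use, or uses an injected one."""

    def __init__(self, factory, client=None):
        self._factory = factory
        self._client = client  # tests can inject a fake here
        self._init_lock = threading.Lock()

    def get(self):
        if self._client is None:
            with self._init_lock:
                # Re-check under the lock so two threads never build two clients.
                if self._client is None:
                    self._client = self._factory()
        return self._client

    def close(self):
        # Detach under the same lock so close cannot race with initialization.
        with self._init_lock:
            client, self._client = self._client, None
        if client is not None:
            client.close()
```

With injection, tests pass a fake through the constructor instead of mutating a private attribute after construction.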

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix stray inclusion of metadata

* small regression fix

* address more feedback

* self review

* Fixes

* new test for aimd lifecycle

* update plan docs

* update plans with refs to prs

* fix: cap acquire_sync/acquire_async sleep to remaining budget to prevent timeout overshoot

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
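
The overshoot fix above amounts to never sleeping longer than the time left before the deadline. A minimal sketch, assuming a polling acquire loop: `acquire_with_budget`, `try_acquire`, and the injectable `clock`/`sleep` parameters are illustrative names, not the actual `acquire_sync` signature.

```python
import time


def acquire_with_budget(try_acquire, timeout, poll_interval=0.05, *,
                        clock=time.monotonic, sleep=time.sleep):
    """Poll try_acquire() until it succeeds or the timeout budget is spent."""
    deadline = clock() + timeout
    while True:
        if try_acquire():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Cap the sleep so the final wait cannot run past the deadline.
        sleep(min(poll_interval, remaining))
```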

* test lazy init

* fix timeout for openaicompatibleadapter

* remove unused attr

* fix: address review findings from PR #402

- Guard reasoning_content fallback with isinstance(str) check to prevent
  non-string provider values from violating the return type contract
- Normalize provider_type comparison to lowercase so mixed-case configs
  (e.g. "OpenAI") route to the native adapter instead of silently falling
  through to the bridge
- Document register-before-acquire ordering invariant on ThrottleManager
- Add TODO to remove 429 from retryable_status_codes once ThrottleManager
  is wired via AsyncTaskScheduler (plan 346)
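
The `isinstance(str)` guard from the first item above can be illustrated as follows. The source only names `reasoning_content` as the fallback field; treating `reasoning` as the preferred field is an assumption made for this sketch.

```python
def extract_reasoning_content(message):
    """Return reasoning text from a provider message dict, or None."""
    # Assumed preferred field name; only the fallback is named in the commit text.
    primary = message.get("reasoning")
    if isinstance(primary, str) and primary:
        return primary
    fallback = message.get("reasoning_content")
    # Guard the fallback: non-string values must not leak through the str | None contract.
    if isinstance(fallback, str) and fallback:
        return fallback
    return None
```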

Made-with: Cursor

* Address pr feedback

* fix method order

* feat: native Anthropic adapter with image block translation

Implement AnthropicClient as a native httpx-based adapter for the
Anthropic Messages API, following the same patterns as
OpenAICompatibleClient. Route provider_type="anthropic" to the native
adapter in the client factory, with unknown providers falling through
to LiteLLMBridgeClient.
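
The routing rule described above, together with the `DATA_DESIGNER_MODEL_BACKEND=litellm_bridge` override and lowercase normalization mentioned earlier, could be sketched like this. `select_backend` and the returned labels are hypothetical; the real factory returns client objects rather than strings.

```python
import os


def select_backend(provider_type, environ=None):
    """Pick a backend label for a provider type, honoring the env override."""
    env = os.environ if environ is None else environ
    if env.get("DATA_DESIGNER_MODEL_BACKEND", "").strip().lower() == "litellm_bridge":
        return "litellm_bridge"
    # Normalize so mixed-case configs such as "OpenAI" still reach the native adapter.
    normalized = provider_type.strip().lower()
    if normalized == "openai":
        return "openai_native"
    if normalized == "anthropic":
        return "anthropic_native"
    # Unknown providers fall through to the bridge.
    return "litellm_bridge"
```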

Key adapter behaviors:
- System message extraction to top-level `system` parameter
- OpenAI image_url content blocks translated to Anthropic image/source format
- tool_use / thinking content blocks normalized to canonical types
- x-api-key auth (not Bearer), anthropic-version header
- max_tokens defaults to 4096 when omitted
- stop mapped to stop_sequences
- Embeddings and image generation raise unsupported capability errors
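
Several of the behaviors listed above can be shown in one hedged sketch of request translation. All function names are illustrative, system handling is simplified to plain strings, and tool_use / thinking normalization is left out. The `image`/`source` block shapes, the `x-api-key` and `anthropic-version` headers, and `stop_sequences` follow the public Anthropic Messages API.

```python
import re

_DATA_URI_RE = re.compile(r"^data:(?P<media_type>[\w.+-]+/[\w.+-]+);base64,(?P<data>.+)$")
DEFAULT_MAX_TOKENS = 4096


def build_headers(api_key, version="2023-06-01"):
    # Anthropic authenticates with x-api-key, not an Authorization: Bearer header.
    return {"x-api-key": api_key, "anthropic-version": version,
            "content-type": "application/json"}


def translate_image_block(block):
    """Convert an OpenAI image_url block to an Anthropic image block, or None if malformed."""
    image_url = block.get("image_url")
    url = image_url.get("url") if isinstance(image_url, dict) else image_url
    if not isinstance(url, str) or not url:
        return None
    match = _DATA_URI_RE.match(url)
    if match:
        return {"type": "image", "source": {"type": "base64",
                "media_type": match["media_type"], "data": match["data"]}}
    return {"type": "image", "source": {"type": "url", "url": url}}


def build_anthropic_payload(model, messages, *, max_tokens=None, stop=None):
    system_parts, converted = [], []
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") == "system":
            # System messages move to the top-level `system` parameter.
            if isinstance(content, str):
                system_parts.append(content)
            continue
        if isinstance(content, list):
            blocks = []
            for block in content:
                if block.get("type") == "image_url":
                    translated = translate_image_block(block)
                    if translated is not None:
                        blocks.append(translated)
                else:
                    blocks.append(block)
            content = blocks
        converted.append({"role": msg["role"], "content": content})
    payload = {"model": model, "messages": converted,
               "max_tokens": DEFAULT_MAX_TOKENS if max_tokens is None else max_tokens}
    if system_parts:
        payload["system"] = "\n\n".join(system_parts)
    if stop is not None:
        # OpenAI's `stop` accepts a string or a list; Anthropic expects a list.
        payload["stop_sequences"] = [stop] if isinstance(stop, str) else list(stop)
    return payload
```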

Made-with: Cursor

* fix: exclude OpenAI-specific params from Anthropic payload and drop unnecessary re.DOTALL

Expand _ANTHROPIC_EXCLUDE to filter response_format, frequency_penalty,
presence_penalty, and seed from TransportKwargs forwarding. Remove the
unnecessary re.DOTALL flag from _DATA_URI_RE since base64 data never
contains newlines.

Made-with: Cursor

* Fix failing test

* fix: address PR #402 review findings

- Add POST to RetryTransport allowed_methods so retries actually fire
  for all adapter endpoints (chat, embeddings, images are all POST)
- Guard close()/aclose() with _init_lock to prevent TOCTOU race with
  lazy client initialization
- Detect "unsupported parameter" patterns in 400 responses so the
  native path returns UNSUPPORTED_PARAMS instead of generic BAD_REQUEST
- Return None from coerce_message_content for image-only content lists
  instead of leaking the Python repr via str(content)
- Restore per-request timeout forwarding for LiteLLMBridgeClient via
  _with_timeout helper (broken when "timeout" was added to _META_FIELDS)
- Normalize ModelProvider.provider_type to lowercase via field_validator
  so consumers don't need per-site .lower() calls
- Fix unclosed httpx.AsyncClient in lazy-init test

Made-with: Cursor

* updates to DRY out code between the two adapters

* refactor code to introduce HttpModelClient

* update tests

* fix: address PR #426 review findings

- Release _aclient in close() to prevent async connection pool leak
  when both sync and async clients were initialized
- Drop malformed image_url blocks (missing image_url key) instead of
  forwarding them unchanged to the Anthropic API
- Preserve image blocks in system messages by returning Anthropic
  block-list format when non-text content is present
- Rename extract_system_text to extract_system_content and add
  merge_system_parts helper for mixed string/block system parts

Made-with: Cursor

* fix: improve error classification and surface provider messages for 400s

- Handle /v1 in Anthropic endpoint gracefully to avoid path duplication
- Add QUOTA_EXCEEDED provider error kind for credit/billing failures
- Extend UNSUPPORTED_PARAMS detection for mutually exclusive params
- Surface raw provider message in formatted errors for 400 status codes
- Consolidate provider message helpers into single _attach_provider_message
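
The `/v1` handling in the first item above can be sketched as a small URL builder. `build_messages_url` is a hypothetical name; the only fact relied on is that the Anthropic Messages API lives at `/v1/messages`.

```python
def build_messages_url(endpoint):
    """Build the Messages URL whether or not the configured endpoint already ends in /v1."""
    base = endpoint.rstrip("/")
    if base.endswith("/v1"):
        # Strip the existing suffix so the result never contains /v1/v1.
        base = base[: -len("/v1")]
    return f"{base}/v1/messages"
```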

Made-with: Cursor

* fix: address PR #426 review findings (round 2)

- Fix close() double-close of shared transport by closing self._transport
  directly instead of accessing private aclient._transport (critical)
- Add TODO for threading.Lock → asyncio.Lock split (plan-346)
- Remove unused _model_id from HttpModelClient and all callers
- Export AnthropicClient from adapters __init__.py
- Filter empty text blocks in translate_tool_result_content join
- Move mock helpers to conftest.py with consistent json.dumps text default
- Add __init__.py files to enable absolute imports from test conftest
- Add bridge env override test for anthropic provider
- Add ConnectionError and non-JSON response tests for AnthropicClient
- Assert secret_resolver.resolve called with correct key ref in factory tests

Made-with: Cursor

* Update license headers

* Fix anthropic tool call flow

* fix: add explicit UNSUPPORTED_CAPABILITY error mapping

- Add ModelUnsupportedCapabilityError so unsupported operations
  (e.g. Anthropic embeddings/image-generation) surface a specific
  error instead of falling through to generic ModelAPIError
- Forward the provider's operation-specific message in the cause
- Add parametrized test case for the new error path
Date: 2026-03-19 11:18:40 -06:00