LLMClient implementation for Anthropic Claude models.
Uses the official Anthropic Java SDK (AnthropicOkHttpClient) for all API calls. SDK exceptions are mapped to the appropriate org.llm4s.error.LLMError subtypes before being returned.
== Message format adaptations ==
The Anthropic Messages API differs from the OpenAI convention in several ways that this client handles transparently:
Default system prompt: if the conversation contains no SystemMessage, the client injects "You are Claude, a helpful AI assistant." automatically. Supply an explicit SystemMessage to override this.
Tool results as user messages: the Anthropic API does not accept native tool-result messages in the same turn structure as OpenAI. ToolMessage values are therefore forwarded as user messages with the prefix "[Tool result for <toolCallId>]: ".
Assistant messages with tool calls are skipped: when an AssistantMessage carries pending tool calls, it is not forwarded — Anthropic infers the assistant turn from the subsequent tool-result user messages.
Schema sanitisation: OpenAI-specific fields (strict, additionalProperties) are stripped from tool schemas before sending, because Anthropic's API rejects them.
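The adaptations above can be sketched roughly as follows. This is an illustrative stand-in, not the actual llm4s message ADT: the type names and the `adaptForAnthropic` helper are hypothetical, but the three rules (default system prompt, tool results as prefixed user messages, skipped assistant turns with pending tool calls) follow the behaviour described above.

```scala
// Hypothetical message types, simplified for illustration.
sealed trait Message
final case class SystemMessage(content: String)                       extends Message
final case class UserMessage(content: String)                         extends Message
final case class AssistantMessage(content: String, toolCalls: List[String]) extends Message
final case class ToolMessage(toolCallId: String, content: String)     extends Message

// Returns the system prompt plus the adapted turn list.
def adaptForAnthropic(conversation: List[Message]): (String, List[Message]) = {
  // Default system prompt when the conversation contains no SystemMessage.
  val system = conversation
    .collectFirst { case SystemMessage(c) => c }
    .getOrElse("You are Claude, a helpful AI assistant.")
  val adapted = conversation.flatMap {
    case _: SystemMessage                             => Nil // lifted into `system`
    // Tool results are forwarded as prefixed user messages.
    case ToolMessage(id, content)                     => List(UserMessage(s"[Tool result for $id]: $content"))
    // Assistant turns with pending tool calls are skipped entirely.
    case AssistantMessage(_, calls) if calls.nonEmpty => Nil
    case other                                        => List(other)
  }
  (system, adapted)
}
```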
== Extended thinking ==
When CompletionOptions.reasoning is set, a thinking block is added to the request. The token budget is clamped to [1024, maxTokens - 1] to satisfy the Anthropic API constraint; the effective budget may therefore differ from what was requested.
maxTokens defaults to 2048 when not set in CompletionOptions because the Anthropic API requires the field.
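The clamping rule can be expressed as a one-liner; the function name is illustrative, not the client's internal API:

```scala
// Forces the requested thinking budget into [1024, maxTokens - 1],
// per the Anthropic API constraint described above.
def clampThinkingBudget(requested: Int, maxTokens: Int): Int =
  math.max(1024, math.min(requested, maxTokens - 1))
```

With the default maxTokens of 2048, for example, any budget request above 2047 is reduced to 2047 and any request below 1024 is raised to 1024.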
Value parameters
config
AnthropicConfig carrying the API key, model name, and base URL.
metrics
Receives per-call latency and token-usage events. Defaults to MetricsCollector.noop.
Executes a blocking completion request and returns the full response.
Sends the conversation to the LLM and waits for the complete response. Use when you need the entire response at once or when streaming is not required.
Value parameters
conversation
conversation history including system, user, assistant, and tool messages
options
configuration including temperature, max tokens, tools, etc. (default: CompletionOptions())
Attributes
Returns
Right(Completion) with the model's response, or Left(LLMError) on failure
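Handling the Either result typically looks like the sketch below. The `Completion`, `LLMError`, and `complete` definitions here are simplified stand-ins so the example is self-contained; the real llm4s types carry more fields.

```scala
// Stand-in types for illustration only.
final case class Completion(content: String)
final case class LLMError(message: String)

// Stand-in for the blocking API call.
def complete(prompt: String): Either[LLMError, Completion] =
  if (prompt.nonEmpty) Right(Completion(s"echo: $prompt"))
  else Left(LLMError("empty prompt"))

// Pattern-match on the result rather than throwing.
def render(result: Either[LLMError, Completion]): String = result match {
  case Right(c) => c.content
  case Left(e)  => s"Request failed: ${e.message}"
}
```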
Returns the maximum context window size supported by this model in tokens.
The context window is the total tokens (prompt + completion) the model can process in a single request, including all conversation messages and the generated response.
Attributes
Returns
total context window size in tokens (e.g., 4096, 8192, 128000)
Returns the number of tokens reserved for the model's completion response.
This value is subtracted from the context window when calculating available tokens for prompts. Corresponds to the max_tokens or completion token limit configured for the model.
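The subtraction itself is simple arithmetic; the helper name below is illustrative:

```scala
// Tokens left for the prompt once the completion reservation is
// subtracted from the total context window.
def promptBudget(contextWindow: Int, reserveCompletion: Int): Int =
  contextWindow - reserveCompletion
```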
Executes a streaming completion request, invoking a callback for each chunk as it arrives.
Streams the response incrementally, calling onChunk for each token/chunk received. Enables real-time display of responses. Returns the final accumulated completion on success.
Value parameters
conversation
conversation history including system, user, assistant, and tool messages
onChunk
callback invoked for each chunk; it is called synchronously, so avoid blocking operations inside it
options
configuration including temperature, max tokens, tools, etc. (default: CompletionOptions())
Attributes
Returns
Right(Completion) with the complete accumulated response, or Left(LLMError) on failure
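The accumulate-and-return contract can be sketched as below; the chunk type and callback signature are simplified stand-ins for the real streaming types:

```scala
// Invokes the callback for each chunk as it arrives, then returns the
// final accumulated text, mirroring the streamComplete contract.
def streamAccumulate(chunks: Iterator[String])(onChunk: String => Unit): String = {
  val acc = new StringBuilder
  chunks.foreach { c =>
    onChunk(c)   // called synchronously per chunk: keep this non-blocking
    acc.append(c)
  }
  acc.toString   // the complete accumulated completion
}
```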
Releases resources and closes connections to the LLM provider.
Call when the client is no longer needed. After calling close(), the client should not be used. Default implementation is a no-op; override if managing resources like connections or thread pools.
Validates client configuration and connectivity to the LLM provider.
May perform checks such as verifying API credentials, testing connectivity, and validating configuration. Default implementation returns success; override for provider-specific validation.
Attributes
Returns
Right(()) if validation succeeds, Left(LLMError) with details on failure
protected def withMetrics[A](provider: String, model: String, operation: => Result[A], extractUsage: A => Option[TokenUsage], extractCost: A => Option[Double]): Result[A]
Executes operation and records metrics for the call.
Latency and outcome (success or classified error) are recorded for every call regardless of result. Token counts and cost are recorded only on success — a Left result emits an org.llm4s.metrics.Outcome.Error event whose kind is derived from the org.llm4s.error.LLMError subtype via ErrorKind.fromLLMError.
Value parameters
extractCost
Extracts the pre-computed cost (USD) from a successful result; return None to skip cost recording.
extractUsage
Extracts prompt/completion token counts from a successful result; return None to skip token recording.
model
Model identifier forwarded to the collector.
operation
The LLM call to time and observe.
provider
Provider label forwarded to the collector (e.g. "openai").
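A stripped-down sketch of the withMetrics contract, with stand-in types (a plain `record` callback instead of the real MetricsCollector, and String as the error type): latency and outcome are recorded for every call, while the original result is passed through unchanged.

```scala
// Stand-in for org.llm4s.types.Result.
type Result[A] = Either[String, A]

// Times the operation, records latency and success/error outcome,
// and returns the result untouched.
def withMetrics[A](record: (Long, Boolean) => Unit)(operation: => Result[A]): Result[A] = {
  val start  = System.nanoTime()
  val result = operation
  val millis = (System.nanoTime() - start) / 1000000
  record(millis, result.isRight) // recorded regardless of success or failure
  result
}
```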