
ReliableClient

org.llm4s.reliability.ReliableClient

See the ReliableClient companion object

final class ReliableClient(underlying: LLMClient, providerName: String, config: ReliabilityConfig, collector: Option[MetricsCollector]) extends LLMClient

Wrapper that adds reliability features to any LLMClient.

Provides:

Retry with configurable policies (exponential backoff, linear, fixed)
Circuit breaker to fail fast when service is down
Deadline enforcement to prevent hanging operations
Metrics tracking for retry attempts and circuit breaker state
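The three retry policies named above differ only in how the delay grows between attempts. The following is a minimal sketch of that idea, not the llm4s implementation: `Backoff`, `delayFor`, and `retry` are hypothetical names introduced for illustration, and the real policy types live in ReliabilityConfig.

```scala
import scala.concurrent.duration._

// Hypothetical names for illustration only; not the llm4s API.
object RetrySketch {
  sealed trait Backoff
  final case class Fixed(delay: FiniteDuration) extends Backoff
  final case class Linear(base: FiniteDuration) extends Backoff
  final case class Exponential(base: FiniteDuration, cap: FiniteDuration) extends Backoff

  // Delay to wait after failed attempt number `attempt` (1-based).
  def delayFor(policy: Backoff, attempt: Int): FiniteDuration = policy match {
    case Fixed(d)  => d
    case Linear(b) => b * attempt.toLong
    case Exponential(b, cap) =>
      // Double the delay each attempt; bound the shift so the factor cannot overflow.
      val factor = 1L << math.min(attempt - 1, 20)
      (b * factor).min(cap)
  }

  // Runs `op` up to `maxAttempts` times, sleeping between failed attempts.
  def retry[E, A](maxAttempts: Int, policy: Backoff)(op: () => Either[E, A]): Either[E, A] = {
    @annotation.tailrec
    def loop(attempt: Int): Either[E, A] = op() match {
      case Left(_) if attempt < maxAttempts =>
        Thread.sleep(delayFor(policy, attempt).toMillis)
        loop(attempt + 1)
      case result => result // success, or the last failure
    }
    loop(1)
  }
}
```

A production policy would typically also add jitter and retry only on errors known to be transient; both are omitted here for brevity.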

Thread-safety: Uses AtomicInteger/AtomicReference for circuit breaker state management to ensure correct behavior under concurrent access.
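To illustrate how atomic variables can carry circuit breaker state without locks, here is a simplified sketch. It is not the ReliableClient source: the `State` cases, the `Breaker` class, and its parameters (`failureThreshold`, `recoveryMillis`, `now`) are assumptions made for the example.

```scala
import java.util.concurrent.atomic.{AtomicInteger, AtomicReference}

// Hypothetical names for illustration only; not the llm4s API.
object CircuitBreakerSketch {
  sealed trait State
  case object Closed extends State
  case object HalfOpen extends State
  final case class Open(sinceMillis: Long) extends State

  final class Breaker(failureThreshold: Int, recoveryMillis: Long, now: () => Long) {
    private val state    = new AtomicReference[State](Closed)
    private val failures = new AtomicInteger(0)

    def current: State = state.get()

    // True if a call may proceed. After the recovery window, exactly one
    // thread wins the compare-and-set that moves Open to HalfOpen.
    def allow(): Boolean = state.get() match {
      case Closed | HalfOpen => true
      case o @ Open(since) =>
        now() - since >= recoveryMillis && state.compareAndSet(o, HalfOpen)
    }

    // A successful call closes the circuit and clears the failure count.
    def onSuccess(): Unit = {
      failures.set(0)
      state.set(Closed)
    }

    // Trip the circuit at the threshold, or immediately if a HalfOpen probe fails.
    def onFailure(): Unit = {
      val count = failures.incrementAndGet()
      if (count >= failureThreshold || state.get() == HalfOpen) state.set(Open(now()))
    }
  }
}
```

The clock is passed in as a function so that tests can control time. This sketch lets every caller through while the breaker is HalfOpen; a stricter design would admit a single probe request.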

Value parameters

collector: Optional metrics collector for observability
config: Reliability configuration
providerName: Explicit provider name for stable metrics labels
underlying: The client to wrap

Attributes

Companion: object
Supertypes: trait LLMClient, trait AutoCloseable, class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Releases resources and closes connections to the LLM provider.

Call when the client is no longer needed. After calling close(), the client should not be used. Default implementation is a no-op; override if managing resources like connections or thread pools.

Attributes

Definition Classes: LLMClient -> AutoCloseable
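Because the client is an AutoCloseable, it can be scoped with `scala.util.Using` so that close() runs even if the body throws. The sketch below uses a stand-in `DemoClient` (hypothetical, defined only for this example) rather than a real LLMClient, so that it is self-contained.

```scala
import scala.util.Using

object CloseSketch {
  // Stand-in for a client; only the AutoCloseable contract matters here.
  final class DemoClient extends AutoCloseable {
    @volatile var closed: Boolean = false
    def ask(prompt: String): String = s"echo: $prompt"
    override def close(): Unit = closed = true
  }

  // Uses the client inside a managed scope, then reports whether it was closed.
  def run(): (String, Boolean) = {
    val client = new DemoClient
    val reply  = Using.resource(client)(_.ask("hi"))
    (reply, client.closed)
  }
}
```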

Executes a blocking completion request and returns the full response.

Sends the conversation to the LLM and waits for the complete response. Use when you need the entire response at once or when streaming is not required.

Value parameters

conversation: conversation history including system, user, assistant, and tool messages
options: configuration including temperature, max tokens, tools, etc. (default: CompletionOptions())

Attributes

Returns: Right(Completion) with the model's response, or Left(LLMError) on failure
Definition Classes: LLMClient

Get current circuit breaker state (for testing/monitoring).

Returns the maximum context window size supported by this model in tokens.

The context window is the total tokens (prompt + completion) the model can process in a single request, including all conversation messages and the generated response.

Attributes

Returns: total context window size in tokens (e.g., 4096, 8192, 128000)
Definition Classes: LLMClient

Returns the number of tokens reserved for the model's completion response.

This value is subtracted from the context window when calculating available tokens for prompts. Corresponds to the max_tokens or completion token limit configured for the model.

Attributes

Returns: number of tokens reserved for completion
Definition Classes: LLMClient

Reset circuit breaker state (for testing).

Executes a streaming completion request, invoking a callback for each chunk as it arrives.

Streams the response incrementally, calling onChunk for each token/chunk received. Enables real-time display of responses. Returns the final accumulated completion on success.

Value parameters

conversation: conversation history including system, user, assistant, and tool messages
onChunk: callback invoked for each chunk; called synchronously, avoid blocking operations
options: configuration including temperature, max tokens, tools, etc. (default: CompletionOptions())

Attributes

Returns: Right(Completion) with the complete accumulated response, or Left(LLMError) on failure
Definition Classes: LLMClient
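Since onChunk is called synchronously, a common pattern is to do as little as possible inside the callback, for example enqueue the chunk and return. The sketch below shows that pattern against a stand-in `fakeStream` function (hypothetical; it only mimics the callback-then-result shape described above and is not the llm4s signature).

```scala
import java.util.concurrent.ConcurrentLinkedQueue

object StreamingSketch {
  // Stand-in for a streaming call: emits each chunk synchronously,
  // then returns the accumulated text as a Right.
  def fakeStream(chunks: List[String])(onChunk: String => Unit): Either[String, String] = {
    chunks.foreach(onChunk)
    Right(chunks.mkString)
  }

  // Collects chunks through a cheap, non-blocking callback.
  def collect(chunks: List[String]): (List[String], Either[String, String]) = {
    val queue  = new ConcurrentLinkedQueue[String]()
    val result = fakeStream(chunks) { chunk => queue.add(chunk); () }
    (queue.toArray(new Array[String](0)).toList, result)
  }
}
```

In a real application a separate consumer would drain the queue for display or logging, which keeps slow work off the streaming path.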

Validates client configuration and connectivity to the LLM provider.

May perform checks such as verifying API credentials, testing connectivity, and validating configuration. Default implementation returns success; override for provider-specific validation.

Attributes

Returns: Right(()) if validation succeeds, Left(LLMError) with details on failure
Definition Classes: LLMClient

Inherited methods

Calculates available token budget for prompts after accounting for completion reserve and headroom.

Formula: (contextWindow - reserveCompletion) * (1 - headroom)

Headroom provides a safety margin for tokenization variations and message formatting overhead.

Value parameters

headroom: safety margin as percentage of prompt budget (default: HeadroomPercent.Standard ~10%)

Attributes

Returns: maximum tokens available for prompt content
Inherited from: LLMClient
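As a worked instance of the formula above, the sketch below computes the budget from plain numbers. The function name `contextBudget`, the use of a Double fraction for headroom, and truncation toward zero are assumptions for illustration; the library's HeadroomPercent type and rounding behavior may differ.

```scala
object ContextBudgetSketch {
  // Mirrors the documented formula: (contextWindow - reserveCompletion) * (1 - headroom).
  // `headroom` is a fraction in [0, 1); the result is truncated to whole tokens.
  def contextBudget(contextWindow: Int, reserveCompletion: Int, headroom: Double): Int =
    ((contextWindow - reserveCompletion) * (1.0 - headroom)).toInt
}
```

For example, with an 8192-token context window, 1024 tokens reserved for completion, and 10% headroom, the prompt budget is (8192 - 1024) * 0.9 = 6451.2, which truncates to 6451 tokens.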
