BaseLifecycleLLMClient

org.llm4s.llmconnect.BaseLifecycleLLMClient

Mixin trait providing standard lifecycle management and metrics wrapping for LLMClient implementations.

Tracks whether the client has been closed via an AtomicBoolean and provides:

  • validateNotClosed — a pre-check returning Left(ConfigurationError) once closed.
  • close() — an idempotent close that delegates to releaseResources() exactly once.
  • completeWithMetrics — combines lifecycle validation with metrics recording for Completion results, eliminating boilerplate in complete / streamComplete.

Concrete clients mix in this trait, supply providerName and modelName, and optionally override releaseResources() to free provider-specific resources (HTTP clients, SDK connections, etc.).
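A concrete client built on this trait might look like the following sketch. The `AcmeClient` name, the package paths of the imported model types, and the `callProviderApi` helper are illustrative assumptions; only the trait members documented on this page are taken as given, and other abstract `LLMClient` members (e.g. `getContextWindow`) are omitted for brevity.

```scala
import org.llm4s.llmconnect.BaseLifecycleLLMClient
import org.llm4s.llmconnect.model.{ Completion, CompletionOptions, Conversation }
import org.llm4s.types.Result

// Sketch only: AcmeClient and callProviderApi are hypothetical.
class AcmeClient(apiKey: String) extends BaseLifecycleLLMClient {

  override protected def providerName: String      = "acme"       // metrics label
  override protected def modelName: String         = "acme-large" // forwarded to the collector
  override protected def clientDescription: String = s"AcmeClient($modelName)"

  override def complete(conversation: Conversation, options: CompletionOptions): Result[Completion] =
    completeWithMetrics {
      callProviderApi(conversation, options) // provider-specific request logic
    }

  // Free provider-specific resources; invoked at most once, from close().
  override protected def releaseResources(): Unit = {
    // e.g. shut down an HTTP client or SDK connection here
  }

  private def callProviderApi(c: Conversation, o: CompletionOptions): Result[Completion] =
    ??? // actual transport elided
}
```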

Attributes

Supertypes
trait LLMClient
trait AutoCloseable
class Object
trait Matchable
class Any
Members list

Value members

Abstract methods

protected def clientDescription: String

Human-readable label used in the "already closed" error message.

protected def modelName: String

The model identifier forwarded to the metrics collector.

protected def providerName: String

Metrics label for this provider (e.g. "openai", "anthropic").

Concrete methods

override def close(): Unit

Releases resources and closes connections to the LLM provider.

Call when the client is no longer needed. After calling close(), the client should not be used. In this trait, close() is idempotent: the first call delegates to releaseResources() exactly once, and later calls do nothing. Override releaseResources() (rather than close()) to free resources like connections or thread pools.

Attributes

Definition Classes
LLMClient -> AutoCloseable
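
Because LLMClient extends AutoCloseable, the standard scala.util.Using combinator can drive the lifecycle; the names `client` and `conversation` below are placeholders:

```scala
import scala.util.Using

Using.resource(client) { c =>
  c.complete(conversation) // close() runs exactly once afterwards, even if this throws
}
```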
protected def completeWithMetrics(operation: => Result[Completion]): Result[Completion]

Validates that the client is open, executes the operation, and records standard completion metrics (latency, token usage, estimated cost).

Use this in complete and streamComplete implementations to avoid repeating the lifecycle-check + metrics-wrapping boilerplate.

Value parameters

operation

The provider-specific completion logic to execute. Called only when the client is open.

Attributes

Returns

The completion result with metrics recorded as a side-effect.
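
Both entry points of a provider client can delegate through this method so the closed-check and metrics wrapping are written once (sketch; `providerCall` and `providerStream` are hypothetical helpers):

```scala
override def complete(conversation: Conversation, options: CompletionOptions): Result[Completion] =
  completeWithMetrics(providerCall(conversation, options))

override def streamComplete(
  conversation: Conversation,
  options: CompletionOptions,
  onChunk: StreamedChunk => Unit
): Result[Completion] =
  completeWithMetrics(providerStream(conversation, options, onChunk))
```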

protected def releaseResources(): Unit

Hook for releasing provider-specific resources. Called at most once, inside close(). The default is a no-op.

protected def validateNotClosed: Result[Unit]

Pre-check run before each operation: returns Left(ConfigurationError) once the client has been closed, Right(()) otherwise.

Inherited methods

Calculates available token budget for prompts after accounting for completion reserve and headroom.

Formula: (contextWindow - reserveCompletion) * (1 - headroom)

Headroom provides a safety margin for tokenization variations and message formatting overhead.

Value parameters

headroom

safety margin as percentage of prompt budget (default: HeadroomPercent.Standard ~10%)

Attributes

Returns

maximum tokens available for prompt content

Inherited from:
LLMClient
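
With illustrative numbers, the formula above gives:

```scala
val contextWindow     = 128000 // getContextWindow()
val reserveCompletion = 4096   // tokens reserved for the model's response
val headroom          = 0.10   // HeadroomPercent.Standard (~10%)

val promptBudget = ((contextWindow - reserveCompletion) * (1 - headroom)).toInt
// (128000 - 4096) * 0.9 = 111513.6, so about 111513 tokens remain for prompt content
```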
def validate(): Result[Unit]

Validates client configuration and connectivity to the LLM provider.

May perform checks such as verifying API credentials, testing connectivity, and validating configuration. Default implementation returns success; override for provider-specific validation.

Attributes

Returns

Right(()) if validation succeeds, Left(LLMError) with details on failure

Inherited from:
LLMClient
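
A provider client might override this to fail fast on obviously bad configuration; the ConfigurationError constructor shape below is an assumption:

```scala
override def validate(): Result[Unit] =
  if (apiKey.trim.nonEmpty) Right(())
  else Left(ConfigurationError("acme API key is missing")) // constructor shape assumed
```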
protected def withMetrics[A](provider: String, model: String, operation: => Result[A], extractUsage: A => Option[TokenUsage], extractCost: A => Option[Double]): Result[A]

Executes operation and records metrics for the call.

Latency and outcome (success or classified error) are recorded for every call regardless of result. Token counts and cost are recorded only on success — a Left result emits an org.llm4s.metrics.Outcome.Error event whose kind is derived from the org.llm4s.error.LLMError subtype via ErrorKind.fromLLMError.

Value parameters

extractCost

Extracts the pre-computed cost (USD) from a successful result; return None to skip cost recording.

extractUsage

Extracts prompt/completion token counts from a successful result; return None to skip token recording.

model

Model identifier forwarded to the collector.

operation

The LLM call to time and observe.

provider

Provider label forwarded to the collector (e.g. "openai").

Attributes

Returns

The result of operation, unchanged.

Inherited from:
MetricsRecording
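
For a non-completion call the general form can be used directly, with the extractors returning None when a metric does not apply (sketch; `EmbeddingResponse` and the `embedWithMetrics` helper are hypothetical):

```scala
protected def embedWithMetrics(op: => Result[EmbeddingResponse]): Result[EmbeddingResponse] =
  withMetrics[EmbeddingResponse](
    provider     = providerName,
    model        = modelName,
    operation    = op,
    extractUsage = _ => None, // no prompt/completion token split recorded here
    extractCost  = _ => None  // skip cost recording
  )
```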

Inherited and Abstract methods

def complete(conversation: Conversation, options: CompletionOptions): Result[Completion]

Executes a blocking completion request and returns the full response.

Sends the conversation to the LLM and waits for the complete response. Use when you need the entire response at once or when streaming is not required.

Value parameters

conversation

conversation history including system, user, assistant, and tool messages

options

configuration including temperature, max tokens, tools, etc. (default: CompletionOptions())

Attributes

Returns

Right(Completion) with the model's response, or Left(LLMError) on failure

Inherited from:
LLMClient
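
Typical call-site usage, pattern-matching on the Result; the SystemMessage/UserMessage constructors and the `completion.message.content` accessor are assumptions about the surrounding model package:

```scala
val conversation = Conversation(Seq(
  SystemMessage("You are a terse assistant."),
  UserMessage("Summarise the request in one sentence.")
))

client.complete(conversation) match {
  case Right(completion) => println(completion.message.content)
  case Left(error)       => println(s"completion failed: $error")
}
```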
def getContextWindow(): Int

Returns the maximum context window size supported by this model in tokens.

The context window is the total tokens (prompt + completion) the model can process in a single request, including all conversation messages and the generated response.

Attributes

Returns

total context window size in tokens (e.g., 4096, 8192, 128000)

Inherited from:
LLMClient

Returns the number of tokens reserved for the model's completion response.

This value is subtracted from the context window when calculating available tokens for prompts. Corresponds to the max_tokens or completion token limit configured for the model.

Attributes

Returns

number of tokens reserved for completion

Inherited from:
LLMClient
protected def metrics: MetricsCollector

The org.llm4s.metrics.MetricsCollector that receives timing, token, and cost events.

Injected by each concrete provider client. Defaults to MetricsCollector.noop in all public constructors, so callers that do not need metrics do not pay an allocation cost.

Attributes

Inherited from:
MetricsRecording
def streamComplete(conversation: Conversation, options: CompletionOptions, onChunk: StreamedChunk => Unit): Result[Completion]

Executes a streaming completion request, invoking a callback for each chunk as it arrives.

Streams the response incrementally, calling onChunk for each token/chunk received. Enables real-time display of responses. Returns the final accumulated completion on success.

Value parameters

conversation

conversation history including system, user, assistant, and tool messages

onChunk

callback invoked for each chunk; called synchronously, avoid blocking operations

options

configuration including temperature, max tokens, tools, etc. (default: CompletionOptions())

Attributes

Returns

Right(Completion) with the complete accumulated response, or Left(LLMError) on failure

Inherited from:
LLMClient
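
Streaming usage; onChunk is called synchronously, so the callback should stay fast. The `content` field on StreamedChunk is an assumption:

```scala
val result = client.streamComplete(
  conversation,
  CompletionOptions(),
  onChunk = chunk => chunk.content.foreach(print) // print partial text as it arrives
)
// result is the fully accumulated Completion, or an LLMError on failure
```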