CachedEmbeddingClient

org.llm4s.llmconnect.caching.CachedEmbeddingClient
class CachedEmbeddingClient(baseClient: EmbeddingClient, cache: EmbeddingCache[Seq[Double]], keyGenerator: (String, String) => String)

A caching decorator for EmbeddingClient that avoids redundant provider calls.

On each embed call, the input texts are first looked up in the cache. Only texts without a cached vector are forwarded to the base client, in a single batched request. The new vectors are stored in the cache and merged with the cached hits, and the response preserves the original input order.

Errors returned by the base client are never cached, so transient failures are retried on the next call.
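The lookup, batch, store and merge flow described above can be sketched roughly as follows. This is a minimal illustration, not the library's implementation: `Provider`, `CachedEmbedder` and `keyOf` are hypothetical names, `Either[String, ...]` stands in for the library's result type, and deduplicating repeated inputs before the provider call is a choice made by this sketch only.

```scala
import scala.collection.mutable

object CachingSketch {
  // Hypothetical stand-in for the provider call: a batch of texts in,
  // either an error message or one vector per text out.
  type Provider = Seq[String] => Either[String, Seq[Seq[Double]]]

  final class CachedEmbedder(provider: Provider, keyOf: String => String) {
    private val cache = mutable.Map.empty[String, Seq[Double]]

    def embed(inputs: Seq[String]): Either[String, Seq[Seq[Double]]] = {
      // 1. Collect the distinct texts that have no cached vector yet.
      val misses = inputs.distinct.filterNot(t => cache.contains(keyOf(t)))

      // 2. Forward all misses in one batched call; skip the call if there are none.
      val fetched: Either[String, Seq[Seq[Double]]] =
        if (misses.isEmpty) Right(Seq.empty) else provider(misses)

      // A Left short-circuits here, so failures never reach the cache.
      fetched.flatMap { vectors =>
        if (vectors.size != misses.size)
          Left("provider returned an unexpected number of vectors")
        else {
          // 3. Store only successful results.
          misses.zip(vectors).foreach { case (t, v) => cache.update(keyOf(t), v) }
          // 4. Rebuild the output in the original input order.
          Right(inputs.map(t => cache(keyOf(t))))
        }
      }
    }
  }
}
```

Because a failed provider call stores nothing, repeating the same request later simply retries the missing texts.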

Note: EmbeddingClient is a concrete class rather than a trait, so this wrapper cannot be used as a drop-in substitute for EmbeddingClient in APIs that require that type. A follow-up issue should extract an embedding service interface to allow proper decorator substitution.

Value parameters

baseClient

The underlying client used to generate embeddings on cache misses.

cache

The storage backend for the embedding vectors.

keyGenerator

Function that maps (text, modelName) to a cache key. Defaults to a SHA-256 hash.
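A key generator of the required shape `(String, String) => String` might look like the sketch below. The name `sha256Key`, the zero-byte separator between model name and text, and the hex encoding are assumptions made for illustration; the library's actual default may derive its key differently.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object KeySketch {
  // Hypothetical key generator: hashes the model name and the text together
  // so the same text embedded with two models yields two distinct keys.
  def sha256Key(text: String, modelName: String): String = {
    val digest = MessageDigest.getInstance("SHA-256")
    digest.update(modelName.getBytes(StandardCharsets.UTF_8))
    digest.update(0.toByte) // separator, so ("ab", "c") and ("a", "bc") differ
    digest.update(text.getBytes(StandardCharsets.UTF_8))
    // Render the 32 digest bytes as a 64-character lowercase hex string.
    digest.digest().map(b => "%02x".format(b)).mkString
  }
}
```

A fixed-length hash keeps keys small and uniform regardless of how long the input text is, at the cost of not being able to recover the text from the key.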

Attributes

Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

Returns cache hit/miss statistics for this client.


def clearCache(): Unit

Clears all cached vectors and resets statistics.
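As a rough illustration of how hit/miss statistics and a clearing operation can fit together, the sketch below counts lookups on a small in-memory cache. `CacheStats`, `CountingCache` and `hitRate` are hypothetical names chosen for this example and are not taken from the library; the sketch is also not thread-safe.

```scala
import scala.collection.mutable

object StatsSketch {
  // Hypothetical statistics snapshot.
  final case class CacheStats(hits: Long, misses: Long) {
    def hitRate: Double =
      if (hits + misses == 0) 0.0 else hits.toDouble / (hits + misses)
  }

  // Hypothetical in-memory cache that counts every lookup.
  final class CountingCache[V] {
    private val store = mutable.Map.empty[String, V]
    private var hits = 0L
    private var misses = 0L

    def get(key: String): Option[V] = {
      val found = store.get(key)
      if (found.isDefined) hits += 1 else misses += 1
      found
    }

    def put(key: String, value: V): Unit = store.update(key, value)

    def stats: CacheStats = CacheStats(hits, misses)

    // Drops all entries and resets both counters.
    def clear(): Unit = {
      store.clear()
      hits = 0L
      misses = 0L
    }
  }
}
```

Resetting the counters together with the entries keeps the reported hit rate meaningful for the cache contents that exist after the clear.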


Generates embeddings for the provided request, serving cached vectors where available and forwarding all cache misses to the base client in a single call.


Value parameters

request

The embedding request containing one or more input strings.

Attributes

Returns

A Result containing an EmbeddingResponse with one vector per input, in the same order as EmbeddingRequest.input.