
CachedEmbeddingClient

org.llm4s.llmconnect.caching.CachedEmbeddingClient

class CachedEmbeddingClient(baseClient: EmbeddingClient, cache: EmbeddingCache[Seq[Double]], keyGenerator: (String, String) => String)

A caching decorator for EmbeddingClient that avoids redundant provider calls.

On each embed call, input texts are checked against the cache first. Only texts that are not cached are forwarded to the base client, and they are sent in a single batched request. The results are stored in the cache and merged with the cached hits before being returned, preserving the original input order.
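The lookup, batch, and merge flow described above can be sketched as follows. This is a minimal illustration, not the library's implementation: `CachingSketch`, `embedBatch`, and `keyOf` are hypothetical stand-ins for the real EmbeddingClient, EmbeddingCache, and key generator, whose method signatures are not shown on this page. It assumes the provider returns exactly one vector per requested text, in request order.

```scala
import scala.collection.mutable

// Hypothetical stand-in for CachedEmbeddingClient; names and types are assumptions.
final class CachingSketch(
  embedBatch: Seq[String] => Either[String, Seq[Seq[Double]]], // assumed provider call
  keyOf: String => String                                      // assumed cache-key function
) {
  private val cache = mutable.Map.empty[String, Seq[Double]]

  def embed(inputs: Seq[String]): Either[String, Seq[Seq[Double]]] = {
    // Texts that have no cached vector yet.
    val misses = inputs.filterNot(t => cache.contains(keyOf(t)))
    // Forward all misses in one batched call; skip the call when everything is cached.
    val fetched: Either[String, Seq[Seq[Double]]] =
      if (misses.isEmpty) Right(Seq.empty) else embedBatch(misses)
    fetched.map { vectors =>
      // Only successful results are stored; a Left never reaches this block.
      misses.zip(vectors).foreach { case (text, vec) => cache.update(keyOf(text), vec) }
      // Rebuild the output in the original input order from the cache.
      inputs.map(t => cache(keyOf(t)))
    }
  }
}
```

In this sketch a second call that mixes old and new texts sends only the new ones to the provider, while the returned vectors still line up with the input positions.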

Errors returned by the base client are never cached, so transient failures are retried on the next call.
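To illustrate why skipping error results matters, here is a small sketch under stated assumptions: `RetrySketch`, `embedOne`, and the `provider` function are hypothetical, and errors are modelled as a plain `Left[String]` rather than the library's Result type. A failed call leaves the cache untouched, so the next call reaches the provider again.

```scala
import scala.collection.mutable

// Hypothetical single-text cache used only to show the error-handling rule.
object RetrySketch {
  private val cache = mutable.Map.empty[String, Seq[Double]]

  // Caches only Right results; a Left is returned as-is and nothing is stored.
  def embedOne(
    text: String,
    provider: String => Either[String, Seq[Double]] // assumed provider call
  ): Either[String, Seq[Double]] =
    cache.get(text) match {
      case Some(vec) => Right(vec)
      case None =>
        val result = provider(text)
        // Either.foreach runs only for Right, so failures are never cached.
        result.foreach(vec => cache.update(text, vec))
        result
    }
}
```

With a provider that fails once and then succeeds, the first call returns the error, the second call retries and caches the vector, and later calls are served from the cache.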

Note: EmbeddingClient is a concrete class rather than a trait, so this wrapper cannot be used as a drop-in substitute for EmbeddingClient in APIs that require that type. A follow-up issue should extract an embedding service interface to allow proper decorator substitution.

Value parameters

baseClient: The underlying client used to generate embeddings on cache misses.
cache: The storage backend for the embedding vectors.
keyGenerator: Function that maps (text, modelName) to a cache key (defaults to a SHA-256 hash).
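A custom keyGenerator with the (String, String) => String shape could look like the sketch below. The exact input the default hashes is not documented here, so the choice to hash the model name and the text joined by a newline is an assumption, and `KeySketch` and `sha256Key` are hypothetical names.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object KeySketch {
  // Hypothetical key function: SHA-256 over "modelName\ntext", hex-encoded.
  val sha256Key: (String, String) => String = (text, modelName) => {
    val digest = MessageDigest.getInstance("SHA-256")
    // Including the model name keeps vectors from different models apart.
    val bytes = digest.digest((modelName + "\n" + text).getBytes(StandardCharsets.UTF_8))
    // Render each byte as two lowercase hex digits.
    bytes.map(b => "%02x".format(b)).mkString
  }
}
```

Keying on both arguments means the same text embedded with two different models occupies two separate cache entries, which avoids returning a vector of the wrong dimensionality or semantics.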

Attributes

Supertypes: class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Returns cache hit/miss statistics for this client.


Clears all cached vectors and resets statistics.


Generates embeddings for the provided request, serving cached vectors where available and forwarding all cache misses to the base client in a single call.

Value parameters

request: The embedding request containing one or more input strings.

Attributes

Returns: A Result containing an EmbeddingResponse with one vector per input, in the same order as EmbeddingRequest.input.
