RateLimitingMiddleware

org.llm4s.llmconnect.middleware.RateLimitingMiddleware

class RateLimitingMiddleware(requestsPerMinute: Int, burstCapacity: Int, timeSource: () => Long) extends LLMMiddleware

Middleware that enforces a local rate limit using a Token Bucket algorithm.

Prevents the application from overwhelming downstream providers or exceeding cost budgets.

Value parameters

requestsPerMinute: Maximum number of requests allowed per minute
burstCapacity: Maximum burst size (default: same as requestsPerMinute)
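
The token bucket idea behind these parameters can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name TokenBucket and the method tryAcquire are hypothetical, and the sketch assumes that timeSource returns the current time in milliseconds, which this page does not state.

```scala
// Hypothetical sketch of a token bucket; not taken from llm4s.
// Assumption: timeSource returns the current time in milliseconds.
final class TokenBucket(requestsPerMinute: Int, burstCapacity: Int, timeSource: () => Long) {
  require(requestsPerMinute > 0, "requestsPerMinute must be positive")
  require(burstCapacity > 0, "burstCapacity must be positive")

  // Tokens added per millisecond of elapsed time.
  private val refillPerMs: Double = requestsPerMinute.toDouble / 60000.0

  // The bucket starts full, so an initial burst of burstCapacity requests is allowed.
  private var tokens: Double = burstCapacity.toDouble
  private var lastRefillMs: Long = timeSource()

  /** Consumes one token and returns true if a request may proceed now; otherwise returns false. */
  def tryAcquire(): Boolean = synchronized {
    val now = timeSource()
    // Guard against a clock that moves backwards.
    val elapsedMs = math.max(0L, now - lastRefillMs)
    // Refill in proportion to elapsed time, capped at the burst capacity.
    tokens = math.min(burstCapacity.toDouble, tokens + elapsedMs * refillPerMs)
    lastRefillMs = now
    if (tokens >= 1.0) { tokens -= 1.0; true } else false
  }
}
```

Passing the clock in as a function makes the limiter testable with a fake clock: a test can advance a counter by hand instead of sleeping, which is presumably why the constructor accepts timeSource.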

Attributes

Supertypes: trait LLMMiddleware, class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Human-readable name for logging/debugging.

Attributes

Definition Classes: LLMMiddleware

Wrap the given LLMClient, returning a new client with added behavior.

Implementations should delegate all LLMClient methods to next, adding behavior before/after delegation as needed.

Value parameters

next: the next client in the pipeline

Attributes

Returns: a new LLMClient with this middleware's behavior added
Definition Classes: LLMMiddleware
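
The delegation pattern described above can be sketched with simplified stand-in types. SimpleClient, SimpleMiddleware, and SimpleRateLimiter are hypothetical names introduced only for illustration; the real LLMClient and LLMMiddleware traits are not reproduced on this page and have their own interfaces. Whether the real middleware rejects or delays a request that exceeds the limit is also not stated here; this sketch rejects it.

```scala
// Hypothetical stand-ins for illustration only; these are not the llm4s traits.
trait SimpleClient {
  def complete(prompt: String): Either[String, String]
}

trait SimpleMiddleware {
  def name: String
  def wrap(next: SimpleClient): SimpleClient
}

// Checks a permit before delegating to the next client in the pipeline.
// Assumption: an over-limit call is rejected with a Left rather than delayed.
final class SimpleRateLimiter(permit: () => Boolean) extends SimpleMiddleware {
  val name: String = "simple-rate-limiter"

  def wrap(next: SimpleClient): SimpleClient = new SimpleClient {
    def complete(prompt: String): Either[String, String] =
      if (permit()) next.complete(prompt) // added behavior runs before delegation
      else Left(s"$name: rate limit exceeded")
  }
}
```

Because wrap returns another client, middlewares of this shape compose by nesting: the result of one wrap call can be passed as next to another.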
