RateLimitingMiddleware

org.llm4s.llmconnect.middleware.RateLimitingMiddleware
class RateLimitingMiddleware(requestsPerMinute: Int, burstCapacity: Int, timeSource: () => Long) extends LLMMiddleware

Middleware that enforces a local rate limit using a Token Bucket algorithm.

Prevents the application from overwhelming downstream providers or exceeding cost budgets.

Value parameters

burstCapacity

Maximum burst size, i.e. the capacity of the token bucket (default: same as requestsPerMinute)

requestsPerMinute

Maximum allowable requests per minute
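
As a rough illustration of how the two documented parameters interact, here is a minimal stand-alone token bucket sketch. It is not the llm4s implementation: the class name TokenBucket and the method tryAcquire are hypothetical, and it assumes timeSource returns the current time in milliseconds, which this page does not state.

```scala
// Hypothetical sketch of a token bucket; not taken from llm4s.
// Assumption: timeSource returns the current time in milliseconds.
final class TokenBucket(requestsPerMinute: Int, burstCapacity: Int, timeSource: () => Long) {
  require(requestsPerMinute > 0, "requestsPerMinute must be positive")
  require(burstCapacity > 0, "burstCapacity must be positive")

  // Tokens added per elapsed millisecond (the sustained rate).
  private val refillPerMs: Double = requestsPerMinute / 60000.0
  // The bucket starts full, so an initial burst of burstCapacity requests is allowed.
  private var tokens: Double = burstCapacity.toDouble
  private var lastRefill: Long = timeSource()

  /** Consumes one token and returns true if a request may proceed now; otherwise returns false. */
  def tryAcquire(): Boolean = synchronized {
    val now = timeSource()
    // Guard against a clock that moves backwards.
    val elapsed = math.max(0L, now - lastRefill)
    // Refill in proportion to elapsed time, never exceeding the bucket capacity.
    tokens = math.min(burstCapacity.toDouble, tokens + elapsed * refillPerMs)
    lastRefill = now
    if (tokens >= 1.0) { tokens -= 1.0; true } else false
  }
}
```

Passing the clock in as a function, as the class signature above does with timeSource, makes this kind of logic testable with a fake clock instead of real sleeps.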

Attributes

Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Constructors

def this(requestsPerMinute: Int)

Concrete methods

override def name: String

Human-readable name for logging/debugging.

override def wrap(client: LLMClient): LLMClient

Wrap the given LLMClient, returning a new client with added behavior.

Implementations should delegate all LLMClient methods to client, adding behavior before or after delegation as needed.

Value parameters

client

the next client in the pipeline

Attributes

Returns

a new LLMClient with this middleware's behavior added

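
The wrap-and-delegate pattern described above can be sketched as follows. SimpleClient and SimpleMiddleware are hypothetical single-method stand-ins for LLMClient and LLMMiddleware, whose real definitions are not shown on this page. The sketch also assumes that a request over the limit is rejected with an error value; the actual middleware may block or behave differently.

```scala
// Hypothetical stand-ins; the real LLMClient and LLMMiddleware are not reproduced here.
trait SimpleClient {
  def complete(prompt: String): Either[String, String]
}

trait SimpleMiddleware {
  def name: String
  def wrap(client: SimpleClient): SimpleClient
}

// A middleware that checks a permit function before delegating to the wrapped client.
// Assumption: over-limit calls are rejected with Left rather than delayed.
final class RejectingRateLimiter(allow: () => Boolean) extends SimpleMiddleware {
  override def name: String = "rate-limiter-sketch"

  override def wrap(client: SimpleClient): SimpleClient = new SimpleClient {
    def complete(prompt: String): Either[String, String] =
      if (allow()) client.complete(prompt) // behavior before delegation, then delegate
      else Left("rate limit exceeded")
  }
}
```

In this sketch the permit function is injected, so any limiter (for example a token bucket) can supply the allow decision without the middleware knowing how it is computed.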