RateLimitingMiddleware
org.llm4s.llmconnect.middleware.RateLimitingMiddleware
class RateLimitingMiddleware(requestsPerMinute: Int, burstCapacity: Int, timeSource: () => Long) extends LLMMiddleware
Middleware that enforces a local rate limit using a Token Bucket algorithm.
Prevents the application from overwhelming downstream providers or exceeding cost budgets.
Value parameters
- burstCapacity: Maximum burst size (default: same as requestsPerMinute)
- requestsPerMinute: Maximum allowable requests per minute
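The token bucket idea behind the description above can be illustrated with a minimal sketch. This is not the llm4s implementation: the class name TokenBucketSketch and the method tryAcquire are hypothetical, and the sketch assumes that timeSource returns the current time in milliseconds, which this page does not state. The bucket starts full at burstCapacity, refills continuously at requestsPerMinute tokens per minute, and each permitted request consumes one token.

```scala
// Hypothetical sketch of a token bucket; NOT the llm4s implementation.
final class TokenBucketSketch(
  requestsPerMinute: Int,
  burstCapacity: Int,
  timeSource: () => Long // ASSUMPTION: returns the current time in milliseconds
) {
  require(requestsPerMinute > 0, "requestsPerMinute must be positive")
  require(burstCapacity > 0, "burstCapacity must be positive")

  // Number of tokens added to the bucket per elapsed millisecond.
  private val refillPerMs: Double = requestsPerMinute / 60000.0

  // The bucket starts full, so an initial burst of burstCapacity requests is allowed.
  private var tokens: Double = burstCapacity.toDouble
  private var lastRefill: Long = timeSource()

  /** Consumes one token and returns true if a request may proceed now; otherwise returns false. */
  def tryAcquire(): Boolean = synchronized {
    val now = timeSource()
    // Guard against a clock that moves backwards.
    val elapsed = math.max(0L, now - lastRefill)
    // Refill in proportion to elapsed time, capped at the burst capacity.
    tokens = math.min(burstCapacity.toDouble, tokens + elapsed * refillPerMs)
    lastRefill = now
    if (tokens >= 1.0) {
      tokens -= 1.0
      true
    } else {
      false
    }
  }
}
```

Passing the clock in as a function, as the timeSource parameter in the class signature suggests, makes this kind of limiter testable with a fake clock: a test can advance time manually and check that requests are rejected once the bucket is empty and accepted again after enough time has passed.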