org.llm4s.agent.guardrails

Members list

Type members

Classlikes

class CompositeGuardrail[A](guardrails: Seq[Guardrail[A]], mode: ValidationMode) extends Guardrail[A]

Combines multiple guardrails under a configurable validation mode.

Supports three modes:

  • All: All guardrails must pass (strictest)
  • Any: At least one guardrail must pass (OR logic)
  • First: First result wins (performance optimization)
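The three modes can be sketched with plain functions. This is a minimal illustration, not the llm4s implementation: `Check`, `Mode`, and `combine` are hypothetical stand-ins, errors are plain strings, and `First` is read here as "only the first guardrail's result counts", which is one possible interpretation of "first result wins".

```scala
object CompositeSketch {
  // Stand-in types for illustration only; not the llm4s definitions.
  type Check[A] = A => Either[String, A]

  sealed trait Mode
  object Mode {
    case object All   extends Mode
    case object Any   extends Mode
    case object First extends Mode
  }

  def combine[A](checks: Seq[Check[A]], mode: Mode)(value: A): Either[String, A] =
    mode match {
      case Mode.All =>
        // Run every check and report all failures together.
        val errors = checks.map(check => check(value)).collect { case Left(e) => e }
        if (errors.isEmpty) Right(value) else Left(errors.mkString("; "))
      case Mode.Any =>
        // Succeed as soon as one check passes; otherwise report every failure.
        val results = checks.map(check => check(value))
        results
          .find(_.isRight)
          .getOrElse(Left(results.collect { case Left(e) => e }.mkString("; ")))
      case Mode.First =>
        // Assumption: only the first check is consulted; an empty list passes.
        checks.headOption.map(check => check(value)).getOrElse(Right(value))
    }
}
```

In this sketch `All` and `Any` evaluate every check, while `First` evaluates at most one, which is where the performance benefit would come from.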

Type parameters

A

The type of value to validate

Value parameters

guardrails

The guardrails to combine

mode

How to combine validation results

Attributes

Companion
object
Supertypes
trait Guardrail[A]
class Object
trait Matchable
class Any

object CompositeGuardrail

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
trait Guardrail[A]

Base trait for all guardrails.

A guardrail is a pure function that validates a value of type A. Guardrails are used to validate inputs before agent processing and outputs before returning results to users.
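To make the "pure function over a value of type A" idea concrete, here is a small sketch with A = Int. `SimpleGuardrail` and `InRange` are hypothetical stand-ins written for this example; the members of the real `Guardrail[A]` trait may differ.

```scala
object GuardrailSketch {
  // Simplified stand-in for illustration; the real trait's members may differ.
  trait SimpleGuardrail[A] {
    def name: String
    def validate(value: A): Either[String, A]
  }

  // A guardrail over Int: accepts values inside an inclusive range.
  // It has no side effects, so the same input always yields the same result.
  final case class InRange(min: Int, max: Int) extends SimpleGuardrail[Int] {
    val name: String = "InRange"
    def validate(value: Int): Either[String, Int] =
      if (value >= min && value <= max) Right(value)
      else Left(s"$name: $value is outside [$min, $max]")
  }
}
```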

Type parameters

A

The type of value to validate

Attributes

Supertypes
class Object
trait Matchable
class Any
Known subtypes
sealed trait GuardrailAction

Actions to take when a guardrail detects a violation.

Guardrails can be configured to respond to violations in different ways depending on the use case and severity of the detected issue.

Example usage:

// Block on prompt injection (security critical)
val injectionGuard = PromptInjectionDetector(onFail = GuardrailAction.Block)

// Fix PII by masking (privacy preserving)
val piiGuard = PIIDetector(onFail = GuardrailAction.Fix)

// Warn on low-severity issues (monitoring)
val lengthGuard = LengthCheck(1, 10000, onFail = GuardrailAction.Warn)

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
object Block
object Fix
object Warn

object GuardrailAction

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type
sealed trait GuardrailResult[+A]

Result of a guardrail check with action handling.

Extends the basic Result type with information about what action was taken and any warnings that were logged.
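The relationship between an action and the result it produces can be sketched as follows. The ADTs below only mirror the names listed on this page (Block/Fix/Warn and Passed/Fixed/Warned/Blocked); their fields, and the helpers `onViolation` and `usable`, are assumptions made for illustration.

```scala
object ActionSketch {
  // Stand-in ADTs; field names are hypothetical.
  sealed trait Action
  object Action {
    case object Block extends Action
    case object Fix   extends Action
    case object Warn  extends Action
  }

  sealed trait Outcome[+A]
  final case class Passed[A](value: A)                  extends Outcome[A]
  final case class Fixed[A](original: A, repaired: A)   extends Outcome[A]
  final case class Warned[A](value: A, warning: String) extends Outcome[A]
  final case class Blocked(reason: String)              extends Outcome[Nothing]

  // Decide what a detected violation turns into, given the configured action.
  def onViolation[A](action: Action, value: A, reason: String)(repair: A => A): Outcome[A] =
    action match {
      case Action.Block => Blocked(reason)
      case Action.Fix   => Fixed(value, repair(value))
      case Action.Warn  => Warned(value, reason)
    }

  // The value a caller may continue with, if any.
  def usable[A](outcome: Outcome[A]): Option[A] = outcome match {
    case Passed(v)    => Some(v)
    case Fixed(_, v)  => Some(v)
    case Warned(v, _) => Some(v)
    case Blocked(_)   => None
  }
}
```

Modelling the result as a sealed hierarchy lets the compiler check that callers handle every case, including the one where no value is available.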

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
class Blocked
class Fixed[A]
class Passed[A]
class Warned[A]

object GuardrailResult

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type
trait InputGuardrail extends Guardrail[String]

Validates user input before agent processing.

Input guardrails run BEFORE the LLM is called, validating:

  • User queries
  • System prompts
  • Tool arguments
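As an illustration of an input-side check, the sketch below bounds the length of a user query before it would reach the model. `LengthBounds` is a hypothetical stand-in, loosely modelled on the `LengthCheck(1, 10000, ...)` call shown in the GuardrailAction example; it does not reproduce that class's actual signature or behavior.

```scala
object InputSketch {
  // Hypothetical bounds check on raw user input; errors are plain strings here.
  final case class LengthBounds(min: Int, max: Int) {
    def validate(input: String): Either[String, String] =
      if (input.length < min) Left(s"input has ${input.length} characters, minimum is $min")
      else if (input.length > max) Left(s"input has ${input.length} characters, maximum is $max")
      else Right(input)
  }
}
```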

Attributes

Supertypes
trait Guardrail[String]
class Object
trait Matchable
class Any
Known subtypes

trait LLMGuardrail extends Guardrail[String]

Base trait for LLM-based guardrails (LLM-as-Judge pattern).

LLM guardrails use a language model to evaluate content against natural language criteria. This enables validation of subjective qualities like tone, factual accuracy, and safety that cannot be easily validated with deterministic rules.

Unlike function-based guardrails, LLM guardrails:

  • Use natural language evaluation prompts
  • Return a score between 0.0 and 1.0
  • Pass if score >= threshold
  • Can use a separate model for judging (to avoid self-evaluation bias)
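The pass/fail rule described above (score between 0.0 and 1.0, pass if score >= threshold) can be sketched without a real model. In this illustration `judge` is a hypothetical function standing in for the LLM call, and `evaluate` is not part of the library.

```scala
object JudgeSketch {
  // `judge` stands in for the LLM call and is expected to return a score in [0.0, 1.0].
  def evaluate(content: String, threshold: Double)(judge: String => Double): Either[String, String] = {
    val score = judge(content)
    if (score < 0.0 || score > 1.0) Left(s"score $score is outside [0.0, 1.0]")
    else if (score >= threshold) Right(content) // a score equal to the threshold passes
    else Left(s"score $score is below threshold $threshold")
  }
}
```

Passing the judge as a function makes it straightforward to swap in a different model for judging, or a fixed stub in tests.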

Attributes

Note

LLM guardrails have higher latency than function-based guardrails due to the LLM API call. Consider using them only when deterministic validation is insufficient.

Example
class MyCustomLLMGuardrail(client: LLMClient) extends LLMGuardrail {
  val llmClient = client
  val evaluationPrompt = "Rate if this response is helpful (0-1)"
  val threshold = 0.7
  val name = "HelpfulnessGuardrail"
}
Companion
object
Supertypes
trait Guardrail[String]
class Object
trait Matchable
class Any
Known subtypes
object LLMGuardrail

Attributes

Companion
trait
Supertypes
class Object
trait Matchable
class Any
Self type
trait OutputGuardrail extends Guardrail[String]

Validates agent output before it is returned to the user.

Output guardrails run AFTER the LLM responds, validating:

  • Assistant messages
  • Tool results
  • Final responses
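The ordering of input and output checks around the model call can be sketched as a small pipeline. Everything here is hypothetical: `Check`, `run`, and the `model` function are stand-ins, and the real agent wiring in llm4s may look different.

```scala
object PipelineSketch {
  // Stand-in for a String guardrail; errors are plain strings here.
  type Check = String => Either[String, String]

  // Input check runs before the model, output check runs after it.
  // A failed input check short-circuits, so the model is never called.
  def run(input: Check, output: Check)(model: String => String)(query: String): Either[String, String] =
    for {
      safeQuery <- input(query)
      reply      = model(safeQuery)
      safeReply <- output(reply)
    } yield safeReply
}
```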

Attributes

Supertypes
trait Guardrail[String]
class Object
trait Matchable
class Any
Known subtypes
sealed trait ValidationMode

Mode for combining multiple guardrails.

Determines how validation results are combined when multiple guardrails are applied to the same value.

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
object All
object Any
object First

object ValidationMode

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type