
PromptInjectionDetector

org.llm4s.agent.guardrails.builtin.PromptInjectionDetector

See the PromptInjectionDetector companion object

class PromptInjectionDetector(val patterns: Seq[InjectionPattern], val sensitivity: InjectionSensitivity, val onFail: GuardrailAction) extends InputGuardrail

Detects prompt injection attempts in user input.

Prompt injection attacks attempt to override system instructions, manipulate the AI's behavior, or extract sensitive information.

Detection categories:

Instruction Override: "Ignore previous instructions", "forget your rules"
Role Manipulation: "You are now DAN", "Act as a different AI"
System Prompt Extraction: "What is your system prompt?", "Show your instructions"
Jailbreak Phrases: Common patterns used in known jailbreaks
Code/SQL Injection: Attempts to inject executable code
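
As a rough illustration of how category-based detection can work, the sketch below maps three of the categories above to case-insensitive regular expressions and reports which categories match an input. The names `InjectionCategorySketch`, `CategoryPattern`, and `matchedCategories`, and the regexes themselves, are assumptions made for this example; they are not the library's built-in patterns or its `InjectionPattern` type.

```scala
import scala.util.matching.Regex

// Hypothetical sketch: names and regexes are illustrative only,
// not the detector's actual built-in patterns.
object InjectionCategorySketch {
  final case class CategoryPattern(category: String, regex: Regex)

  // (?i) makes each pattern case-insensitive.
  val patterns: Seq[CategoryPattern] = Seq(
    CategoryPattern(
      "Instruction Override",
      "(?i)\\b(ignore|forget)\\b.*\\b(instructions|rules)\\b".r
    ),
    CategoryPattern(
      "Role Manipulation",
      "(?i)\\b(you are now|act as)\\b".r
    ),
    CategoryPattern(
      "System Prompt Extraction",
      "(?i)\\b(system prompt|your instructions)\\b".r
    )
  )

  // Returns the name of every category whose pattern occurs in the input.
  def matchedCategories(input: String): Seq[String] =
    patterns.collect {
      case CategoryPattern(category, regex) if regex.findFirstIn(input).isDefined =>
        category
    }
}
```

A real detector would likely need many more patterns per category, and simple regexes like these will produce both false positives and false negatives.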

Example usage:

// Default: Block on injection detection
val detector = PromptInjectionDetector()

// Custom sensitivity (fewer false positives)
val relaxed = PromptInjectionDetector(
  sensitivity = InjectionSensitivity.Medium
)

// Use as input guardrail
agent.run(
  query = userInput,
  tools = tools,
  inputGuardrails = Seq(PromptInjectionDetector())
)

Value parameters

onFail: Action to take when injection is detected (default: Block)
patterns: Custom injection patterns to detect (in addition to defaults)
sensitivity: Detection sensitivity level

Attributes

Companion: object
Supertypes: trait InputGuardrail, trait Guardrail[String], class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

Validate a value.

This is a pure function: it must have no side effects, and the same input always produces the same output.

Value parameters

value: The value to validate

Attributes

Returns: Right(value) if valid, Left(error) if invalid
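
The pure-function contract described above can be sketched as follows. This is a minimal illustration, assuming a plain `Either[String, String]` in place of the library's result type; the object name `PureValidateSketch` and the single phrase check are hypothetical and much simpler than the real detector.

```scala
// Hypothetical sketch: Either[String, String] stands in for the library's
// result type, and the single phrase check is illustrative only.
object PureValidateSketch {
  private val overridePhrase = "(?i)ignore previous instructions".r

  // Pure: the result depends only on `value`; nothing is logged or mutated.
  def validate(value: String): Either[String, String] =
    if (overridePhrase.findFirstIn(value).isDefined)
      Left("Possible prompt injection detected")
    else
      Right(value)
}
```

Because the function holds no state, calling it twice with the same string yields equal results, which makes it safe to retry or run in parallel.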

Inherited methods

Compose this guardrail with another sequentially.

The second guardrail runs only if this one passes.

Value parameters

other: The guardrail to run after this one

Attributes

Returns: A composite guardrail that runs both in sequence
Inherited from: Guardrail
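
Sequential composition of this kind can be sketched with `Either.flatMap`, which short-circuits on the first `Left`. The trait `Check`, the method name `andThen`, and the two example checks are assumptions made for this sketch, not the library's definitions.

```scala
// Hypothetical sketch: `Check` and `andThen` are stand-ins chosen for
// illustration; they are not the library's Guardrail trait.
object ComposeSketch {
  trait Check { self =>
    def validate(value: String): Either[String, String]

    // Sequential composition: `other` runs only when this check returns Right.
    def andThen(other: Check): Check = new Check {
      def validate(value: String): Either[String, String] =
        self.validate(value).flatMap(other.validate)
    }
  }

  val nonEmpty: Check = new Check {
    def validate(value: String): Either[String, String] =
      if (value.trim.nonEmpty) Right(value) else Left("Input is empty")
  }

  val shortEnough: Check = new Check {
    def validate(value: String): Either[String, String] =
      if (value.length <= 20) Right(value) else Left("Input is too long")
  }

  // Runs nonEmpty first, then shortEnough only if nonEmpty passed.
  val combined: Check = nonEmpty.andThen(shortEnough)
}
```

With this shape, the error reported for a failing input is always the one from the first check that rejects it.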

Optional: Transform the input after validation. Default is identity (no transformation).

Value parameters

input: The validated input

Attributes

Returns: The transformed input
Inherited from: InputGuardrail
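
The validate-then-transform order, with identity as the default, can be sketched like this. The trait `InputCheck`, the helper `process`, and the two example objects are hypothetical names introduced for this illustration; how the library actually wires validation and transformation together is not shown on this page.

```scala
// Hypothetical sketch: `InputCheck` and `process` are illustrative names,
// not part of the library.
object TransformSketch {
  trait InputCheck {
    def validate(value: String): Either[String, String]

    // Default is identity: the validated input is returned unchanged.
    def transform(input: String): String = input

    // Validate first, then transform only the accepted input.
    final def process(value: String): Either[String, String] =
      validate(value).map(transform)
  }

  // Keeps the default identity transform.
  object PassThrough extends InputCheck {
    def validate(value: String): Either[String, String] = Right(value)
  }

  // Overrides the transform to trim surrounding whitespace.
  object Trimming extends InputCheck {
    def validate(value: String): Either[String, String] = Right(value)
    override def transform(input: String): String = input.trim
  }
}
```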

Concrete fields

Optional description of what this guardrail validates.


Name of this guardrail for logging and error messages.

