Guardrails are validation functions that run before (input) and after (output) agent processing. They help ensure:

- **Safety** - Block harmful or inappropriate content
- **Quality** - Enforce response standards
- **Compliance** - Meet business requirements
- **Security** - Detect prompt injection and PII
```scala
agent.run(
  query = "User input",
  tools = tools,
  inputGuardrails = Seq(...),   // Validate before LLM call
  outputGuardrails = Seq(...)   // Validate after LLM response
)
```
## Built-in Guardrails

### Simple Validators
These guardrails run locally without LLM calls:
| Guardrail | Purpose | Example |
|-----------|---------|---------|
| `LengthCheck` | Enforce min/max length | `new LengthCheck(1, 10000)` |
| `ProfanityFilter` | Block profane content | `new ProfanityFilter()` |
| `JSONValidator` | Ensure valid JSON output | `new JSONValidator()` |
| `RegexValidator` | Pattern matching | `new RegexValidator("\\d{3}-\\d{4}")` |
| `ToneValidator` | Simple tone detection | `new ToneValidator(Tone.Professional)` |
| `PIIDetector` | Detect PII (email, SSN, etc.) | `new PIIDetector()` |
| `PIIMasker` | Mask detected PII | `new PIIMasker()` |
| `PromptInjectionDetector` | Detect injection attempts | `new PromptInjectionDetector()` |
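Each validator can also be exercised on its own, which is handy for unit tests. A minimal sketch, assuming the built-ins expose the same `validate(value: String): Result[String]` method that the custom guardrails later in this page implement:

```scala
import org.llm4s.agent.guardrails.builtin.LengthCheck

val lengthCheck = new LengthCheck(1, 10000)

// Assumption: built-in validators share the Result[String]
// validate signature shown in the custom guardrail examples below.
lengthCheck.validate("Hello") match {
  case Right(accepted) => println(s"Accepted: $accepted")
  case Left(error)     => println(s"Rejected: $error")
}
```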
### LLM-as-Judge Guardrails
These use an LLM to evaluate subjective qualities:
| Guardrail | Purpose | Example |
|-----------|---------|---------|
| `LLMSafetyGuardrail` | Content safety check | `new LLMSafetyGuardrail(client)` |
| `LLMFactualityGuardrail` | Verify factual accuracy | `new LLMFactualityGuardrail(client)` |
| `LLMQualityGuardrail` | Assess response quality | `new LLMQualityGuardrail(client)` |
| `LLMToneGuardrail` | Validate tone compliance | `new LLMToneGuardrail(client, "professional")` |
### RAG-Specific Guardrails
For retrieval-augmented generation:
| Guardrail | Purpose |
|-----------|---------|
| `GroundingGuardrail` | Verify answers are grounded in retrieved context |
| `ContextRelevanceGuardrail` | Check context relevance to query |
| `SourceAttributionGuardrail` | Ensure sources are cited |
| `TopicBoundaryGuardrail` | Prevent off-topic responses |
## Basic Usage

### Input Validation
```scala
import org.llm4s.agent.guardrails.builtin._

val result = agent.run(
  query = userInput,
  tools = tools,
  inputGuardrails = Seq(
    new LengthCheck(min = 1, max = 10000),
    new ProfanityFilter(),
    new PromptInjectionDetector()
  )
)

result match {
  case Left(GuardrailError(name, message)) =>
    println(s"Input rejected by $name: $message")
  case Right(state) =>
    println(state.lastAssistantMessage)
}
```
### Output Validation
```scala
val result = agent.run(
  query = "Generate a JSON response with user data",
  tools = tools,
  outputGuardrails = Seq(
    new JSONValidator(),
    new PIIMasker()
  )
)
```
### Safety Check

Use an LLM to evaluate content safety:

```scala
import org.llm4s.agent.guardrails.builtin.LLMSafetyGuardrail

val safetyGuardrail = new LLMSafetyGuardrail(client)

agent.run(
  query = "Write a story",
  tools = tools,
  outputGuardrails = Seq(safetyGuardrail)
)
```
### Factuality Check
Verify responses are grounded in source documents:
```scala
import org.llm4s.agent.guardrails.builtin.LLMFactualityGuardrail

val factualityGuardrail = LLMFactualityGuardrail.strict(
  client = client,
  sourceDocuments = Seq(
    "The capital of France is Paris.",
    "Paris has a population of 2.1 million."
  )
)

agent.run(
  query = "What is the capital of France?",
  tools = tools,
  outputGuardrails = Seq(factualityGuardrail)
)
```
### Tone Validation
```scala
import org.llm4s.agent.guardrails.builtin.LLMToneGuardrail

val toneGuardrail = new LLMToneGuardrail(
  client = client,
  targetTone = "professional and helpful"
)

agent.run(
  query = "Help with customer complaint",
  tools = tools,
  outputGuardrails = Seq(toneGuardrail)
)
```
## Composite Guardrails

Combine multiple guardrails with different strategies:

### All Must Pass (AND)
```scala
import org.llm4s.agent.guardrails.CompositeGuardrail

val strictValidation = CompositeGuardrail.all(Seq(
  new LengthCheck(1, 5000),
  new ProfanityFilter(),
  new PIIDetector()
))
// All guardrails must pass for input to be accepted
```
### Any Must Pass (OR)
```scala
val flexibleValidation = CompositeGuardrail.any(Seq(
  new RegexValidator("^[A-Z].*"), // Starts with capital
  new RegexValidator("^\\d.*")    // Starts with digit
))
// At least one guardrail must pass
```
### Sequential (Short-Circuit)
```scala
val sequentialValidation = CompositeGuardrail.sequential(Seq(
  new LengthCheck(1, 10000), // Check length first
  new ProfanityFilter(),     // Then profanity
  new PIIDetector()          // Then PII
))
// Stops at first failure, more efficient
```
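Because each composite is itself a guardrail, the strategies can be nested. A sketch, assuming the combinators return ordinary guardrail instances:

```scala
// Accept input only if it passes the length check AND
// starts with either a capital letter or a digit.
val nested = CompositeGuardrail.all(Seq(
  new LengthCheck(1, 10000),
  CompositeGuardrail.any(Seq(
    new RegexValidator("^[A-Z].*"),
    new RegexValidator("^\\d.*")
  ))
))
```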
## Custom Guardrails

### Custom Input Guardrail

```scala
import org.llm4s.agent.guardrails.InputGuardrail
import org.llm4s.types.Result

class KeywordRequirementGuardrail(requiredKeywords: Set[String]) extends InputGuardrail {
  val name: String = "keyword-requirement"

  def validate(value: String): Result[String] = {
    val found = requiredKeywords.filter(kw => value.toLowerCase.contains(kw.toLowerCase))
    if (found.nonEmpty) {
      Right(value)
    } else {
      Left(LLMError.validation(s"Input must contain at least one of: ${requiredKeywords.mkString(", ")}"))
    }
  }
}

// Usage
val guardrail = new KeywordRequirementGuardrail(Set("scala", "java", "kotlin"))
```
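The new guardrail then plugs into `agent.run` like any built-in:

```scala
agent.run(
  query = userInput,
  tools = tools,
  inputGuardrails = Seq(guardrail)
)
```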
### Custom Output Guardrail
```scala
import org.llm4s.agent.guardrails.OutputGuardrail

class MaxSentenceCountGuardrail(maxSentences: Int) extends OutputGuardrail {
  val name: String = "max-sentence-count"

  def validate(value: String): Result[String] = {
    val sentenceCount = value.split("[.!?]+").length
    if (sentenceCount <= maxSentences) {
      Right(value)
    } else {
      Left(LLMError.validation(s"Response has $sentenceCount sentences, max allowed is $maxSentences"))
    }
  }
}
```
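Usage mirrors the built-in output guardrails:

```scala
agent.run(
  query = "Summarize the quarterly report",
  tools = tools,
  outputGuardrails = Seq(new MaxSentenceCountGuardrail(5))
)
```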
### Custom LLM-Based Guardrail
```scala
import org.llm4s.agent.guardrails.LLMGuardrail

class CustomLLMGuardrail(client: LLMClient) extends LLMGuardrail(client) {
  val name: String = "custom-llm-check"

  override def buildPrompt(content: String): String = {
    s"""Evaluate if the following content is appropriate for a children's website.
       |Respond with only "PASS" or "FAIL" followed by a brief explanation.
       |
       |Content: $content""".stripMargin
  }

  override def parseResponse(response: String): Result[Boolean] = {
    if (response.trim.startsWith("PASS")) Right(true)
    else if (response.trim.startsWith("FAIL")) Right(false)
    else Left(LLMError.parsing("Unexpected response format"))
  }
}
```
## RAG Guardrails

Construct the RAG-specific guardrails with the context they need:

```scala
// Verify answer is grounded in retrieved context
val grounding = new GroundingGuardrail(
  client = client,
  retrievedContext = retrievedDocuments
)

// Check retrieved context is relevant to query
val relevance = new ContextRelevanceGuardrail(
  client = client,
  query = userQuery
)

// Ensure sources are properly cited
val attribution = new SourceAttributionGuardrail(
  client = client,
  sourceDocuments = sources
)

// Prevent off-topic responses
val topicBoundary = new TopicBoundaryGuardrail(
  client = client,
  allowedTopics = Set("programming", "software engineering")
)
```
## PII Masking

```scala
import org.llm4s.agent.guardrails.builtin.PIIMasker

val piiMasker = new PIIMasker()

// Replaces PII with [REDACTED_EMAIL], [REDACTED_SSN], etc.
agent.run(
  query = "Get user details",
  tools = tools,
  outputGuardrails = Seq(piiMasker)
)
```
### Supported PII Types

| Type | Pattern | Masked As |
|------|---------|-----------|
| Email | `user@domain.com` | `[REDACTED_EMAIL]` |
| SSN | `123-45-6789` | `[REDACTED_SSN]` |
| Credit Card | `4111-1111-1111-1111` | `[REDACTED_CC]` |
| Phone | `(555) 123-4567` | `[REDACTED_PHONE]` |
| IP Address | `192.168.1.1` | `[REDACTED_IP]` |
## Prompt Injection Protection
```scala
import org.llm4s.agent.guardrails.builtin.PromptInjectionDetector

val injectionDetector = new PromptInjectionDetector()

agent.run(
  query = userInput,
  tools = tools,
  inputGuardrails = Seq(injectionDetector)
)

// Detects patterns like:
// - "Ignore previous instructions..."
// - "System: You are now..."
// - "---\nNew instructions:"
// - Base64 encoded payloads
```
## Validation Modes

Control how the agent reacts when a guardrail fails:

```scala
import org.llm4s.agent.guardrails.ValidationMode

// Block on failure (default)
val blocking = ValidationMode.Block

// Warn only, continue processing
val warn = ValidationMode.Warn

// Log and continue
val log = ValidationMode.Log
```
## Best Practices

### 1. Layer Your Guardrails
```scala
// Fast, local checks first
val inputGuardrails = Seq(
  new LengthCheck(1, 10000),     // Cheapest first
  new ProfanityFilter(),         // Still fast
  new PromptInjectionDetector(), // Pattern matching
  new PIIDetector()              // More complex but local
)

// LLM checks for output only (expensive)
val outputGuardrails = Seq(
  new JSONValidator(),           // Fast, local
  new LLMSafetyGuardrail(client) // Expensive, last
)
```
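The layered sequences then slot directly into a single call:

```scala
agent.run(
  query = userInput,
  tools = tools,
  inputGuardrails = inputGuardrails,
  outputGuardrails = outputGuardrails
)
```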