
SecretLeakGuardrail

org.llm4s.agent.guardrails.builtin.SecretLeakGuardrail

See the SecretLeakGuardrail companion object

class SecretLeakGuardrail(val secretTypes: Seq[SecretType], val onFail: GuardrailAction) extends InputGuardrail, OutputGuardrail

Detects and redacts secrets / credentials in LLM input and output.

Prevents two classes of leak:

Input leak: a user accidentally pastes an API key into a prompt that is then logged, cached, or forwarded to third-party LLM providers.
Output leak: the LLM echoes a secret back in its response (e.g. when asked to summarise a config file that contains credentials).

Detected credential types (defaults):

OpenAI API keys (sk-... / sk-proj-...)
Anthropic API keys (sk-ant-...)
Google API keys (AIza...)
Voyage API keys (pa-...)
Langfuse keys (pk-lf-... / sk-lf-...)
AWS Access Key IDs (AKIA...)
JWT tokens (eyJ...eyJ...sig)
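
To make the prefix-based detection above concrete, here is a minimal sketch of how a few of these credential shapes could be matched with regular expressions. The pattern bodies, the type names, and the helper `detectedTypes` are assumptions made for illustration; the expressions the library actually uses are not shown on this page and may well differ.

```scala
import scala.util.matching.Regex

// Hypothetical patterns for illustration only; not the library's own.
val illustrativePatterns: Seq[(String, Regex)] = Seq(
  "ANTHROPIC_KEY" -> """sk-ant-[A-Za-z0-9_-]{20,}""".r,
  // The negative lookahead keeps sk-ant-... and sk-lf-... out of the OpenAI bucket.
  "OPENAI_KEY"    -> """sk-(?!ant-|lf-)(?:proj-)?[A-Za-z0-9_-]{20,}""".r,
  "GOOGLE_KEY"    -> """AIza[A-Za-z0-9_-]{35}""".r,
  "AWS_KEY"       -> """AKIA[A-Z0-9]{16}""".r,
  "JWT"           -> """eyJ[A-Za-z0-9_-]+\.eyJ[A-Za-z0-9_-]+\.[A-Za-z0-9_-]+""".r
)

// Names of every illustrative pattern that matches somewhere in the text.
def detectedTypes(text: String): Seq[String] =
  illustrativePatterns.collect {
    case (name, regex) if regex.findFirstIn(text).isDefined => name
  }
```

Because several providers share the `sk-` prefix, the order and exclusivity of the patterns matter; the lookahead here is one way to keep an Anthropic or Langfuse key from being reported as an OpenAI key.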

Behaviour is controlled by GuardrailAction:

Block (default) – reject the text and return a Left error.
Fix – replace secrets with typed placeholders and continue (e.g. [REDACTED_OPENAI_KEY]).
Warn – allow the text through unchanged; the caller may inspect the Right and decide what to log.

Example usage:

// Block any input that contains a credential
agent.run(
  query           = userInput,
  tools           = tools,
  inputGuardrails = Seq(SecretLeakGuardrail())
)

// Mask secrets automatically and let the query proceed
agent.run(
  query           = userInput,
  tools           = tools,
  inputGuardrails = Seq(SecretLeakGuardrail.masking)
)

// Also scrub LLM responses
agent.run(
  query            = userInput,
  tools            = tools,
  outputGuardrails = Seq(SecretLeakGuardrail.masking)
)

Value parameters

onFail: Action to take when a secret is detected (default: Block)
secretTypes: Secret types to detect (default: all common provider keys)

Attributes

Companion: object
Supertypes: trait OutputGuardrail, trait InputGuardrail, trait Guardrail[String], class Object, trait Matchable, class Any

Members list

Value members

Concrete methods

True if the text contains at least one detectable secret.


Returns a map of secret-type name → count for the given text.
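
As a rough illustration of the "type name → count" result described above, the sketch below counts matches per pattern. The two patterns, the type names, and the function name `countByType` are hypothetical, and dropping zero counts is an assumption; the real method may report them differently.

```scala
import scala.util.matching.Regex

// Hypothetical detectors; the library's own patterns and names may differ.
val samplePatterns: Map[String, Regex] = Map(
  "AWS_KEY"    -> """AKIA[A-Z0-9]{16}""".r,
  "GOOGLE_KEY" -> """AIza[A-Za-z0-9_-]{35}""".r
)

// Count matches per secret type, keeping only the types that occur at least once.
def countByType(text: String): Map[String, Int] =
  samplePatterns
    .map { case (name, regex) => name -> regex.findAllIn(text).size }
    .filter { case (_, count) => count > 0 }
```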


Replace every detected secret with its type-specific placeholder.

e.g. sk-abc123... → [REDACTED_OPENAI_KEY]

This is called automatically when onFail == Fix. It can also be called directly for unconditional sanitisation regardless of the guardrail mode.

Attributes

Definition Classes: OutputGuardrail -> InputGuardrail
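
The placeholder replacement described above can be sketched as a fold over pattern/placeholder pairs. Only `[REDACTED_OPENAI_KEY]` appears on this page; the other placeholder names, the patterns, and the function name `maskSecrets` are assumptions for illustration.

```scala
import scala.util.matching.Regex

// Hypothetical pattern/placeholder pairs; only [REDACTED_OPENAI_KEY] is taken from the docs.
val redactions: Seq[(Regex, String)] = Seq(
  """sk-ant-[A-Za-z0-9_-]{20,}""".r                   -> "[REDACTED_ANTHROPIC_KEY]",
  """sk-(?!ant-|lf-)(?:proj-)?[A-Za-z0-9_-]{20,}""".r -> "[REDACTED_OPENAI_KEY]",
  """AKIA[A-Z0-9]{16}""".r                            -> "[REDACTED_AWS_KEY]"
)

// Apply each redaction in turn; quoteReplacement guards against '$' or '\' in a placeholder.
def maskSecrets(text: String): String =
  redactions.foldLeft(text) { case (acc, (regex, placeholder)) =>
    regex.replaceAllIn(acc, Regex.quoteReplacement(placeholder))
  }
```

Typed placeholders keep the redacted text readable: a downstream reader can still tell that an OpenAI key was present without seeing its value.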

Validate the text for secrets.

Returns:

Right(original) if no secrets are found, or if onFail == Warn
Right(masked) if secrets are present and onFail == Fix
Left(error) if secrets are present and onFail == Block
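
The return contract above can be modelled in a few lines. This is a self-contained sketch, not the library's code: `Action`, `SketchError`, `validateSketch`, and the single AWS-only detector are stand-ins introduced here, and the real error type is not shown on this page.

```scala
// Stand-ins for the library's GuardrailAction and error type (assumptions).
enum Action { case Block, Fix, Warn }
final case class SketchError(message: String)

// A single toy detector keeps the example short.
val awsKey = """AKIA[A-Z0-9]{16}""".r

def validateSketch(text: String, onFail: Action): Either[SketchError, String] =
  if (awsKey.findFirstIn(text).isEmpty) Right(text) // nothing found: pass through
  else onFail match {
    case Action.Block => Left(SketchError("secret detected in text"))
    case Action.Fix   => Right(awsKey.replaceAllIn(text, "[REDACTED_AWS_KEY]"))
    case Action.Warn  => Right(text) // unchanged; the caller decides what to log
  }
```

Note that Warn and the no-secret case are indistinguishable from the Right value alone, so a caller that needs to log warnings would have to run a separate detection check on the text.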


Inherited methods

Compose this guardrail with another sequentially.

The second guardrail runs only if this one passes.

Value parameters

other: The guardrail to run after this one

Attributes

Returns: A composite guardrail that runs both in sequence
Inherited from: Guardrail
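
Sequential composition of this kind can be sketched with plain functions that return Either. The `Check` alias, `inSequence`, and the two toy checks are hypothetical; whether the library hands the first guardrail's output or the original text to the second is an assumption here (this sketch passes the first result along).

```scala
// A guardrail modelled as a function; an assumption made for illustration.
type Check = String => Either[String, String]

// Run `first`; only if it returns Right is `second` run, on the value `first` produced.
def inSequence(first: Check, second: Check): Check =
  text => first(text).flatMap(second)

val nonEmpty: Check    = t => if (t.trim.nonEmpty) Right(t) else Left("empty input")
val shortEnough: Check = t => if (t.length <= 20) Right(t) else Left("too long")

val combined: Check = inSequence(nonEmpty, shortEnough)
```

Short-circuiting on the first Left means a cheap check placed first can spare a more expensive one from running at all.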

Concrete fields

Optional description of what this guardrail validates.


Name of this guardrail for logging and error messages.

