Faithfulness

org.llm4s.rag.evaluation.metrics.Faithfulness

See the Faithfulness companion object

class Faithfulness(llmClient: LLMClient, batchSize: Int) extends RAGASMetric

Faithfulness metric: measures factual accuracy of the answer relative to the retrieved contexts.

Algorithm:

1. Extract factual claims from the generated answer using an LLM.
2. For each claim, verify whether it can be inferred from the contexts.
3. Score = number of supported claims / total number of claims.

A score of 1.0 means every claim in the answer can be verified from the retrieved contexts. Lower scores mean some claims are unsupported, which indicates possible hallucination.
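
The scoring step can be sketched as follows. This is a minimal illustration, not the library's implementation: `ClaimVerdict` and `faithfulnessScore` are hypothetical names, and returning `None` when no claims were extracted is an assumption made here to avoid dividing by zero.

```scala
// Hypothetical stand-in for one verified claim; not an llm4s type.
final case class ClaimVerdict(claim: String, supported: Boolean)

// Score = number of supported claims / total number of claims.
// Assumption: with no claims there is nothing to score, so return None.
def faithfulnessScore(verdicts: Seq[ClaimVerdict]): Option[Double] =
  if (verdicts.isEmpty) None
  else Some(verdicts.count(_.supported).toDouble / verdicts.size)

// Verdicts mirroring the Paris example: one supported claim, one unsupported.
val verdicts = Seq(
  ClaimVerdict("Paris is the capital of France.", supported = true),
  ClaimVerdict("Paris has a population of 2.1 million.", supported = false)
)

val score = faithfulnessScore(verdicts) // Some(0.5): 1 supported out of 2
```

In practice the verdicts come from an LLM, so the number of extracted claims, and therefore the score, can vary between runs.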

Value parameters

batchSize: Number of claims to verify per LLM call (default: 5)
llmClient: The LLM client for claim extraction and verification
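
To show how `batchSize` affects the number of verification calls, here is a small sketch using the standard library's `grouped`. Whether the class batches claims in exactly this way is an assumption; the claim strings are placeholders.

```scala
// Seven placeholder claims extracted from an answer.
val claims: Seq[String] = (1 to 7).map(i => s"claim $i")

val batchSize = 5

// grouped splits the sequence into chunks of at most batchSize elements.
// Under the assumption of one LLM call per chunk, this needs two calls.
val batches: Seq[Seq[String]] = claims.grouped(batchSize).toSeq

val batchSizes = batches.map(_.size) // Seq(5, 2)
```

A larger `batchSize` means fewer LLM calls but longer prompts per call.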

Attributes

Example

val faithfulness = Faithfulness(llmClient)
val sample = EvalSample(
 question = "What is the capital of France?",
 answer = "Paris is the capital of France and has a population of 2.1 million.",
 contexts = Seq("Paris is the capital and largest city of France.")
)
val result = faithfulness.evaluate(sample)
// Result: score ~0.5 (capital claim supported, population claim not supported)

Companion

object

Supertypes

trait RAGASMetric

class Object

trait Matchable

class Any

Members list

Value members

Concrete methods

Evaluate a single sample.

Value parameters

sample: The evaluation sample containing question, answer, contexts

Attributes

Returns: Score between 0.0 and 1.0, with optional details
Definition Classes: RAGASMetric

Inherited methods

Check if this metric can be evaluated for a given sample.

Attributes

Inherited from: RAGASMetric

Evaluate multiple samples.

Default implementation evaluates sequentially. Override for batch optimizations (e.g., batched LLM calls).

Value parameters

samples: The evaluation samples

Attributes

Returns: Results for each sample in order
Inherited from: RAGASMetric
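
Sequential evaluation that preserves input order can be sketched generically. The names and types below are hypothetical: `evaluateAll` is not the library's method, `Either[String, R]` stands in for whatever result type the library actually uses, and stopping at the first failure is an assumption.

```scala
// Evaluate samples one at a time, keeping results in input order.
// Assumption: the first failure aborts the run and is returned as Left.
def evaluateAll[S, R](samples: Seq[S])(
    evaluateOne: S => Either[String, R]
): Either[String, Seq[R]] =
  samples.foldLeft[Either[String, Vector[R]]](Right(Vector.empty)) { (acc, sample) =>
    // flatMap skips evaluateOne once acc is already a Left.
    acc.flatMap(results => evaluateOne(sample).map(results :+ _))
  }

// Toy evaluator: the "score" is half the sample's integer value.
val allOk = evaluateAll[Int, Double](Seq(1, 2, 3))(n => Right(n * 0.5))

// Toy evaluator that fails on the second sample.
val failed = evaluateAll[Int, Double](Seq(1, 2, 3)) { n =>
  if (n == 2) Left("evaluation failed") else Right(n.toDouble)
}
```

An override aimed at batch optimization would replace the per-sample loop with grouped LLM calls while still returning results in the same order.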

Concrete fields

Human-readable description of what this metric measures.

Unique name of this metric (e.g., "faithfulness", "answer_relevancy"). Used as an identifier in results and configuration.

Which inputs this metric requires from an EvalSample. Used to skip metrics when required inputs are missing.
