Faithfulness

org.llm4s.rag.evaluation.metrics.Faithfulness
See the Faithfulness companion object
class Faithfulness(llmClient: LLMClient, batchSize: Int) extends RAGASMetric

Faithfulness metric: measures factual accuracy of the answer relative to the retrieved contexts.

Algorithm:

  1. Extract factual claims from the generated answer using the LLM
  2. For each claim, verify whether it can be inferred from the contexts
  3. Score = number of supported claims / total number of claims

A score of 1.0 means every claim in the answer can be verified from the retrieved contexts. Lower scores mean some claims are unsupported, which may indicate hallucination.
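
The scoring step can be sketched as a small pure function. This is only an illustration under assumptions: `faithfulnessScore` is a hypothetical helper, not part of the library, and the source does not say how an answer with no extracted claims is scored, so the sketch returns `None` in that case rather than guessing.

```scala
// Hypothetical helper, not the library's implementation.
// verdicts(i) is true when claim i was judged supported by the contexts.
def faithfulnessScore(verdicts: Seq[Boolean]): Option[Double] =
  if (verdicts.isEmpty) None // assumption: no claims means no defined score
  else Some(verdicts.count(identity).toDouble / verdicts.size)

// One supported claim out of two, as in the Paris example on this page.
val parisScore = faithfulnessScore(Seq(true, false))
```

Returning an `Option` keeps the empty case explicit; the real metric may handle it differently.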

Value parameters

batchSize

Number of claims to verify per LLM call (default: 5)

llmClient

The LLM client for claim extraction and verification
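
To illustrate what batchSize controls, the following sketch splits a list of claims into groups, one group per verification call. `verificationBatches` is a hypothetical name used for illustration; how the library actually groups claims internally is not stated here.

```scala
// Hypothetical helper: each inner Seq stands for the claims sent in one LLM call.
def verificationBatches(claims: Seq[String], batchSize: Int): Seq[Seq[String]] = {
  require(batchSize > 0, "batchSize must be positive")
  claims.grouped(batchSize).toSeq
}

// Twelve claims with the default batch size of 5 give three calls (5, 5 and 2 claims).
val claims = (1 to 12).map(i => s"claim $i")
val batches = verificationBatches(claims, batchSize = 5)
```

Under this reading, a larger batchSize means fewer LLM calls per answer, at the cost of longer prompts per call.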

Attributes

Example
val faithfulness = Faithfulness(llmClient)
val sample = EvalSample(
  question = "What is the capital of France?",
  answer = "Paris is the capital of France and has a population of 2.1 million.",
  contexts = Seq("Paris is the capital and largest city of France.")
)
val result = faithfulness.evaluate(sample)
// Result: score ~0.5 (capital claim supported, population claim not supported)
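
The `evaluate` call above returns a `Result[MetricResult]`, so the score has to be unwrapped before use. The sketch below assumes `Result` behaves like an `Either` with the error on the left, and that `MetricResult` exposes a `score` field. The `MetricResult` and `Result` definitions here are simplified stand-ins for illustration, not the library's own types, and `describe` is a hypothetical helper.

```scala
// Stand-ins for illustration only; the real types are defined by llm4s.
final case class MetricResult(score: Double)
type Result[A] = Either[String, A]

// Turn an evaluation outcome into a short report line.
def describe(result: Result[MetricResult]): String =
  result match {
    case Right(metric) => s"faithfulness = ${metric.score}"
    case Left(error)   => s"evaluation failed: $error"
  }
```

Handling the failure branch matters here because the metric depends on LLM calls, which can fail independently of the sample.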
Companion
object
Supertypes
trait RAGASMetric
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

override def evaluate(sample: EvalSample): Result[MetricResult]

Evaluate a single sample.

Value parameters

sample

The evaluation sample, containing the question, answer, and contexts

Attributes

Returns

Score between 0.0 and 1.0, with optional details

Definition Classes

Inherited methods

def canEvaluate(sample: EvalSample): Boolean

Check if this metric can be evaluated for a given sample.

Attributes

Inherited from:
RAGASMetric
def evaluateBatch(samples: Seq[EvalSample]): Result[Seq[MetricResult]]

Evaluate multiple samples.

Default implementation evaluates sequentially. Override for batch optimizations (e.g., batched LLM calls).

Value parameters

samples

The evaluation samples

Attributes

Returns

Results for each sample in order

Inherited from:
RAGASMetric
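
A sequential default for `evaluateBatch` could look roughly like the sketch below. It is not the library's code: the types are simplified stand-ins, `evaluateSequentially` is a hypothetical name, and stopping at the first failed sample is an assumption, since the source only says that evaluation is sequential and that results come back in order.

```scala
// Stand-ins for illustration only; the real types are defined by llm4s.
final case class EvalSample(question: String, answer: String, contexts: Seq[String])
final case class MetricResult(score: Double)
type Result[A] = Either[String, A]

// Evaluate samples one by one, preserving order.
// Assumption: the first failure aborts the batch and later samples are not evaluated.
def evaluateSequentially(
  samples: Seq[EvalSample],
  evaluate: EvalSample => Result[MetricResult]
): Result[Seq[MetricResult]] =
  samples.foldLeft[Result[Vector[MetricResult]]](Right(Vector.empty)) { (acc, sample) =>
    for {
      done <- acc              // a Left here skips the evaluate call below
      next <- evaluate(sample)
    } yield done :+ next
  }
```

An override that batches LLM calls across samples would replace the per-sample `evaluate` call while keeping the same ordered result.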

Concrete fields

override val description: String

Human-readable description of what this metric measures.

override val name: String

Unique name of this metric (e.g., "faithfulness", "answer_relevancy"). Used as an identifier in results and configuration.

override val requiredInputs: Set[RequiredInput]

Which inputs this metric requires from an EvalSample. Used to skip metrics when required inputs are missing.
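
The skip logic described here can be pictured as a subset check between what a metric requires and what a sample provides. Everything in this sketch is hypothetical: the members of `RequiredInput`, the `Sample` class, and the `available` helper are stand-ins chosen for illustration, and the library's own definitions may differ.

```scala
// Hypothetical stand-in for the library's RequiredInput type.
sealed trait RequiredInput
object RequiredInput {
  case object Question extends RequiredInput
  case object Answer extends RequiredInput
  case object Contexts extends RequiredInput
}

// Simplified stand-in for EvalSample.
final case class Sample(question: String, answer: String, contexts: Seq[String])

// Inputs that are actually present (non-empty) in a sample.
def available(sample: Sample): Set[RequiredInput] =
  Seq[(RequiredInput, Boolean)](
    RequiredInput.Question -> sample.question.nonEmpty,
    RequiredInput.Answer   -> sample.answer.nonEmpty,
    RequiredInput.Contexts -> sample.contexts.nonEmpty
  ).collect { case (input, true) => input }.toSet

// A metric can run only when everything it requires is present.
def canEvaluate(required: Set[RequiredInput], sample: Sample): Boolean =
  required.subsetOf(available(sample))
```

With this shape, a sample without retrieved contexts would be skipped by any metric that lists contexts as required, instead of producing a misleading score.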
