AnswerRelevancy

org.llm4s.rag.evaluation.metrics.AnswerRelevancy
See the AnswerRelevancy companion object.
class AnswerRelevancy(llmClient: LLMClient, embeddingClient: EmbeddingClient, modelConfig: EmbeddingModelConfig, numGeneratedQuestions: Int) extends RAGASMetric

Answer Relevancy metric: measures how well the answer addresses the question.

Algorithm:

  1. Generate N questions that the provided answer would address
  2. Compute embedding for the original question
  3. Compute embeddings for the generated questions
  4. Calculate cosine similarity between original and generated question embeddings
  5. Score = average similarity across generated questions

The intuition: if the answer is relevant to the question, then questions generated from the answer should be semantically similar to the original question.
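Steps 2–5 reduce to averaging cosine similarities between embedding vectors. A minimal sketch of that scoring step, assuming embeddings are plain vectors of doubles (the actual embedding client types may differ):

def cosine(a: Vector[Double], b: Vector[Double]): Double = {
  // Dot product divided by the product of the vector norms.
  val dot   = a.zip(b).map { case (x, y) => x * y }.sum
  val normA = math.sqrt(a.map(x => x * x).sum)
  val normB = math.sqrt(b.map(x => x * x).sum)
  if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}

// Score = mean similarity of the original question to each generated question.
def relevancyScore(original: Vector[Double], generated: Seq[Vector[Double]]): Double =
  if (generated.isEmpty) 0.0
  else generated.map(cosine(original, _)).sum / generated.size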

Value parameters

llmClient: LLM client for generating questions from the answer
embeddingClient: Client for computing embeddings
modelConfig: Embedding model configuration
numGeneratedQuestions: Number of questions to generate (default: 3)

Example
val metric = AnswerRelevancy(llmClient, embeddingClient, modelConfig)
val sample = EvalSample(
 question = "What is machine learning?",
 answer = "Machine learning is a subset of AI that enables systems to learn from data.",
 contexts = Seq("...") // contexts not used for this metric
)
val result = metric.evaluate(sample)
// High score if generated questions are similar to "What is machine learning?"
Companion: object AnswerRelevancy
Supertypes
trait RAGASMetric
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

override def evaluate(sample: EvalSample): Result[MetricResult]

Evaluate a single sample.

Value parameters

sample: The evaluation sample containing question, answer, and contexts

Returns: Score between 0.0 and 1.0, with optional details
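
A hedged usage sketch for consuming the result, assuming Result is an Either-style type and that MetricResult exposes a score field (both assumptions, not confirmed by this page):

// Assumes Result[A] unwraps like Either and MetricResult carries a `score`.
metric.evaluate(sample) match {
  case Right(metricResult) => println(s"answer_relevancy = ${metricResult.score}")
  case Left(error)         => println(s"evaluation failed: $error")
}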

Inherited methods

def canEvaluate(sample: EvalSample): Boolean

Check if this metric can be evaluated for a given sample.

Inherited from: RAGASMetric

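For example, a caller can guard evaluation on this check (a sketch using only members shown on this page):

// Evaluate only when the sample carries the inputs this metric needs.
val maybeResult =
  if (metric.canEvaluate(sample)) Some(metric.evaluate(sample))
  else None
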
def evaluateBatch(samples: Seq[EvalSample]): Result[Seq[MetricResult]]

Evaluate multiple samples.

Default implementation evaluates sequentially. Override for batch optimizations (e.g., batched LLM calls).

Value parameters

samples: The evaluation samples

Returns: Results for each sample in order

Inherited from: RAGASMetric
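
The sequential default described above could look roughly like the following fold; an illustrative sketch only, assuming Result is a right-biased Either-style alias:

// Illustrative: evaluate each sample in turn, collecting results in order and
// short-circuiting on the first error (Either-style semantics assumed).
def evaluateAll(metric: RAGASMetric, samples: Seq[EvalSample]): Result[Seq[MetricResult]] =
  samples.foldLeft[Result[Seq[MetricResult]]](Right(Vector.empty)) { (acc, sample) =>
    for {
      done <- acc
      next <- metric.evaluate(sample)
    } yield done :+ next
  }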

Concrete fields

override val description: String

Human-readable description of what this metric measures.

override val name: String

Unique name of this metric (e.g., "faithfulness", "answer_relevancy"). Used as an identifier in results and configuration.

override val requiredInputs: Set[RequiredInput]

Which inputs this metric requires from an EvalSample. Used to skip metrics when required inputs are missing.
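
For example, a runner might filter metrics by comparing these required inputs against what a sample actually provides; a sketch in which availableInputs is a hypothetical helper and no concrete RequiredInput values are assumed:

// Hypothetical helper mapping an EvalSample to the inputs it actually carries.
def availableInputs(sample: EvalSample): Set[RequiredInput] = ???

// Run only the metrics whose declared required inputs the sample satisfies.
def runnableMetrics(metrics: Seq[RAGASMetric], sample: EvalSample): Seq[RAGASMetric] =
  metrics.filter(m => m.requiredInputs.subsetOf(availableInputs(sample)))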