ContextPrecision

org.llm4s.rag.evaluation.metrics.ContextPrecision
See the ContextPrecision companion object
class ContextPrecision(llmClient: LLMClient) extends RAGASMetric

Context Precision metric: measures whether relevant retrieved contexts are ranked at the top.

Algorithm:

  1. For each retrieved context, determine if it's relevant to the question/ground_truth
  2. Calculate precision@k for each position where a relevant doc appears
  3. Score = Average Precision (AP) = sum of (precision@k * relevance@k) / total_relevant

The intuition: if your retrieval system ranks relevant documents at the top, you get a higher score. Documents ranked lower contribute less to the score.
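
For illustration, here is a minimal sketch of the scoring step, assuming relevance has already been judged for each retrieved context. The method and names below are illustrative only and are not part of the class's API:

// Average Precision over ranked contexts, given per-position relevance
// judgments (true = relevant). Illustrative sketch, not the class's internals.
def averagePrecision(relevance: Seq[Boolean]): Double = {
  val totalRelevant = relevance.count(identity)
  if (totalRelevant == 0) 0.0
  else {
    val precisionAtHits = relevance.zipWithIndex.collect { case (true, idx) =>
      // precision@k = relevant docs within the top (idx + 1) positions / (idx + 1)
      relevance.take(idx + 1).count(identity).toDouble / (idx + 1)
    }
    precisionAtHits.sum / totalRelevant
  }
}

// Relevant contexts at positions 1 and 3 (1-based): AP = (1/1 + 2/3) / 2 ≈ 0.83
averagePrecision(Seq(true, false, true))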

Value parameters

llmClient

The LLM client for relevance assessment

Attributes

Example
val metric = ContextPrecision(llmClient)
val sample = EvalSample(
  question = "What is the capital of France?",
  answer = "Paris is the capital of France.",
  contexts = Seq(
    "Paris is the capital and largest city of France.", // relevant
    "France has beautiful countryside.",                 // less relevant
    "Paris has the Eiffel Tower."                        // relevant
  ),
  groundTruth = Some("The capital of France is Paris.")
)
val result = metric.evaluate(sample)
// The score rewards relevant contexts ranked near the top (here positions 1 and 3);
// the same contexts scattered toward the bottom would score lower.

Companion
object
Supertypes
trait RAGASMetric
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

override def evaluate(sample: EvalSample): Result[MetricResult]

Evaluate a single sample.

Value parameters

sample

The evaluation sample containing question, answer, contexts

Attributes

Returns

Score between 0.0 and 1.0, with optional details
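
A hedged usage sketch. It assumes Result unwraps like an Either and that MetricResult exposes a numeric score field; both are assumptions made for illustration, not confirmed by this page:

// Illustrative only: Result treated as Either-like, `result.score` assumed.
metric.evaluate(sample) match {
  case Right(result) => println(f"context_precision = ${result.score}%.2f")
  case Left(error)   => println(s"evaluation failed: $error")
}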

Inherited methods

def canEvaluate(sample: EvalSample): Boolean

Check if this metric can be evaluated for a given sample.

Attributes

Inherited from:
RAGASMetric
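
A small hedged sketch of using canEvaluate as a guard before running the metric (samples and metric here are placeholder values for illustration):

// Skip samples that lack the inputs this metric needs (e.g., no retrieved contexts).
val evaluable = samples.filter(metric.canEvaluate)
val results   = evaluable.map(metric.evaluate)
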
def evaluateBatch(samples: Seq[EvalSample]): Result[Seq[MetricResult]]

Evaluate multiple samples.

Default implementation evaluates sequentially. Override for batch optimizations (e.g., batched LLM calls).

Value parameters

samples

The evaluation samples

Attributes

Returns

Results for each sample in order

Inherited from:
RAGASMetric
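
A hedged sketch of where such an override would hook in. The subclass name is hypothetical, and the body merely mirrors the sequential default under the assumption that Result composes like an Either; a real override would replace the per-sample calls with a single batched LLM request:

// Hypothetical subclass; shown only to illustrate the override point.
class BatchedContextPrecision(llmClient: LLMClient) extends ContextPrecision(llmClient) {
  override def evaluateBatch(samples: Seq[EvalSample]): Result[Seq[MetricResult]] =
    // Placeholder body: fold sequentially, short-circuiting on the first error.
    // Batch optimizations (e.g., one LLM call scoring all contexts) would go here.
    samples.foldLeft[Result[Seq[MetricResult]]](Right(Seq.empty)) { (acc, sample) =>
      for {
        done <- acc
        next <- evaluate(sample)
      } yield done :+ next
    }
}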

Concrete fields

override val description: String

Human-readable description of what this metric measures.

override val name: String

Unique name of this metric (e.g., "faithfulness", "answer_relevancy"). Used as an identifier in results and configuration.

override val requiredInputs: Set[RequiredInput]

Which inputs this metric requires from an EvalSample. Used to skip metrics when required inputs are missing.

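
A hedged sketch of how an evaluation harness might use this to skip metrics up front; availableInputs is a hypothetical helper, not part of the library:

// Run only the metrics whose required inputs the sample can actually provide.
def availableInputs(sample: EvalSample): Set[RequiredInput] = ??? // hypothetical helper
val runnable = metrics.filter(m => m.requiredInputs.subsetOf(availableInputs(sample)))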