BenchmarkRunner

org.llm4s.rag.benchmark.BenchmarkRunner
See the BenchmarkRunner companion object
class BenchmarkRunner(llmClient: LLMClient, embeddingClient: EmbeddingClient, resolveEmbeddingProvider: String => Result[EmbeddingProviderConfig], datasetManager: DatasetManager, val options: BenchmarkRunnerOptions)

Main execution engine for RAG benchmarks.

Orchestrates the full benchmark workflow:

  1. Load dataset
  2. For each experiment configuration:
     a. Create a RAG pipeline with the config
     b. Index documents
     c. Run queries and generate answers
     d. Evaluate with RAGAS metrics
     e. Collect timing and results
  3. Aggregate results and generate reports
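
A minimal construction sketch using the primary constructor shown above; the in-scope client values and the no-argument BenchmarkRunnerOptions() default are assumptions, not guaranteed by this API:

val runner = new BenchmarkRunner(
  llmClient,                 // LLMClient for answer generation and evaluation
  embeddingClient,           // default EmbeddingClient
  resolveEmbeddingProvider,  // String => Result[EmbeddingProviderConfig]
  datasetManager,            // DatasetManager for dataset loading
  BenchmarkRunnerOptions()   // assumed default options; adjust fields as needed
)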

Value parameters

datasetManager

Dataset loading manager

embeddingClient

Default embedding client

llmClient

LLM client for answer generation and evaluation

options

Runner configuration options

Attributes

Example
val runner = BenchmarkRunner(llmClient, embeddingClient, resolveEmbeddingProvider)
val suite = BenchmarkSuite.chunkingSuite("data/datasets/ragbench/test.jsonl")
val results = runner.runSuite(suite)
println(BenchmarkReport.console(results))
Companion
object
Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def compareConfigs(config1: RAGExperimentConfig, config2: RAGExperimentConfig, datasetPath: String, sampleCount: Option[Int]): Result[ExperimentComparison]

Compare two configurations head-to-head.

Value parameters

config1

First configuration

config2

Second configuration

datasetPath

Path to dataset

sampleCount

Number of samples (optional)

Attributes

Returns

Comparison result
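
A hedged usage sketch against the signature above; the two RAGExperimentConfig values are illustrative and assumed to be defined elsewhere:

val comparison: Result[ExperimentComparison] =
  runner.compareConfigs(
    config1 = baselineConfig,    // first RAGExperimentConfig (assumed to exist)
    config2 = candidateConfig,   // second RAGExperimentConfig to compare against
    datasetPath = "data/datasets/ragbench/test.jsonl",
    sampleCount = Some(20)       // optional sample limit; None presumably uses the full dataset
  )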

def quickTest(config: RAGExperimentConfig, datasetPath: String, sampleCount: Int): Result[ExperimentResult]

Run a quick validation with minimal samples.

Value parameters

config

Experiment configuration

datasetPath

Path to dataset

sampleCount

Number of samples to test

Attributes

Returns

Experiment result
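
A usage sketch for the signature above; the config value is an assumption and not part of this API reference:

val quick: Result[ExperimentResult] =
  runner.quickTest(
    config = someConfig,   // an existing RAGExperimentConfig (assumed)
    datasetPath = "data/datasets/ragbench/test.jsonl",
    sampleCount = 5        // small sample count for a fast sanity check
  )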

Run a single experiment.

Value parameters

config

Experiment configuration

dataset

Evaluation dataset

Attributes

Returns

Experiment result
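
The signature for this member is not shown on this page; the sketch below assumes a method named runExperiment and an already loaded evaluation dataset, both of which are hypothetical where not documented:

// Hypothetical member name: the signature line is missing from this page.
// Assumes the dataset was loaded beforehand via the DatasetManager.
val result: Result[ExperimentResult] = runner.runExperiment(someConfig, dataset)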

Run a complete benchmark suite.

Value parameters

suite

The benchmark suite to run

Attributes

Returns

Aggregated results for all experiments
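
As in the class-level Example above, a suite run produces aggregated results that can be rendered with BenchmarkReport.console; a minimal sketch (error handling omitted, dataset path illustrative):

val suite = BenchmarkSuite.chunkingSuite("data/datasets/ragbench/test.jsonl")
val results = runner.runSuite(suite)
println(BenchmarkReport.console(results))   // plain-text summary of all experiments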

Concrete fields

val options: BenchmarkRunnerOptions

Runner configuration options