BenchmarkResults

org.llm4s.rag.benchmark.BenchmarkResults
See the BenchmarkResults companion object
final case class BenchmarkResults(suite: BenchmarkSuite, results: Seq[ExperimentResult], startTime: Long, endTime: Long)

Results from running a complete benchmark suite.

Value parameters

endTime

When the benchmark completed

results

Results for each experiment

startTime

When the benchmark started

suite

The benchmark suite that was run

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

def averageScores: Map[String, Double]

Get average scores across all experiments for each metric.
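
The averaging described above can be sketched as follows. This is a minimal illustration, not the llm4s implementation: `DemoResult` and its `metrics` field are hypothetical stand-ins for `ExperimentResult`, and the sketch assumes each metric is averaged only over the experiments that report it.

```scala
object AverageScoresSketch {
  // Hypothetical stand-in for ExperimentResult; the real class may store metrics differently.
  final case class DemoResult(name: String, metrics: Map[String, Double])

  // Average each metric over the experiments that actually report it.
  def averageScores(results: Seq[DemoResult]): Map[String, Double] =
    results
      .flatMap(_.metrics.toSeq)   // (metric, score) pairs from every experiment
      .groupMap(_._1)(_._2)       // metric -> all scores recorded for it
      .map { case (metric, scores) => metric -> scores.sum / scores.size }

  def main(args: Array[String]): Unit = {
    val results = Seq(
      DemoResult("baseline", Map("faithfulness" -> 0.5, "answer_relevancy" -> 0.75)),
      DemoResult("reranked", Map("faithfulness" -> 1.0, "answer_relevancy" -> 0.25))
    )
    averageScores(results).toSeq.sortBy(_._1).foreach { case (metric, score) =>
      println(s"$metric: $score")
    }
  }
}
```

Whether the real method includes failed experiments in the average is not stated here, so check the source before relying on either behavior.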

def compare(experiment1: String, experiment2: String): Option[(Double, String)]

Compare two experiments by name. Returns a pair of (difference in RAGAS score, comparison details).
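
A comparison with this return shape might look like the sketch below. The stand-in `DemoResult` type, the sign convention (first experiment minus second), the wording of the details string, and the choice to return `None` when either experiment or its score is missing are all assumptions made for illustration.

```scala
object CompareSketch {
  // Hypothetical stand-in for ExperimentResult; a failed run is modelled as having no score.
  final case class DemoResult(name: String, ragasScore: Option[Double])

  // Returns None if either experiment is missing or lacks a score (assumed behavior).
  def compare(results: Seq[DemoResult], first: String, second: String): Option[(Double, String)] =
    for {
      r1 <- results.find(_.name == first)
      r2 <- results.find(_.name == second)
      s1 <- r1.ragasScore
      s2 <- r2.ragasScore
    } yield {
      val diff = s1 - s2 // assumed sign convention: positive means `first` scored higher
      val details =
        if (diff > 0) s"$first scores higher than $second"
        else if (diff < 0) s"$second scores higher than $first"
        else s"$first and $second score the same"
      (diff, details)
    }
}
```

Returning an `Option` lets the caller pattern match on the missing case instead of handling an exception.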

All failed results

def failureCount: Int

Number of failed experiments

def getResult(experimentName: String): Option[ExperimentResult]

Get result for a specific experiment.

def metricTable: Map[String, Map[String, Double]]

Get metric comparison table. Returns a map from experiment name to a map from metric name to score.
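
To make the nested-map shape concrete, the sketch below builds such a table from hypothetical `DemoResult` values and renders it as tab-separated rows. The `render` helper is not part of the documented API; it only shows one way a caller might consume a `Map[String, Map[String, Double]]`.

```scala
object MetricTableSketch {
  // Hypothetical stand-in for ExperimentResult.
  final case class DemoResult(name: String, metrics: Map[String, Double])

  // experiment name -> (metric name -> score)
  def metricTable(results: Seq[DemoResult]): Map[String, Map[String, Double]] =
    results.map(r => r.name -> r.metrics).toMap

  // Illustrative helper: one header row, then one row per experiment, "-" for missing metrics.
  def render(table: Map[String, Map[String, Double]]): Seq[String] = {
    val metrics = table.values.flatMap(_.keys).toSeq.distinct.sorted
    val header  = ("experiment" +: metrics).mkString("\t")
    val rows = table.toSeq.sortBy(_._1).map { case (name, scores) =>
      (name +: metrics.map(m => scores.get(m).map(_.toString).getOrElse("-"))).mkString("\t")
    }
    header +: rows
  }
}
```

Looking up a single cell is then `table.get(experimentName).flatMap(_.get(metricName))`, which yields `None` when either key is absent.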

Get results ranked by RAGAS score (highest first).

def successCount: Int

Number of successful experiments

All successful results

def totalDurationMs: Long

Total benchmark duration in milliseconds

def totalDurationSeconds: Double

Total benchmark duration in seconds
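
The two duration accessors are presumably derived from `startTime` and `endTime`. The sketch below assumes both are epoch timestamps in milliseconds, which this page does not state; `Timing` is a hypothetical stand-in that carries only those two fields.

```scala
object DurationSketch {
  // Hypothetical stand-in holding only the two timestamp fields (assumed to be epoch millis).
  final case class Timing(startTime: Long, endTime: Long) {
    def totalDurationMs: Long = endTime - startTime
    // Divide by a Double so fractional seconds are kept.
    def totalDurationSeconds: Double = totalDurationMs / 1000.0
  }

  def main(args: Array[String]): Unit = {
    val start = System.currentTimeMillis()
    Thread.sleep(50) // placeholder for running the benchmark suite
    val timing = Timing(start, System.currentTimeMillis())
    println(s"took ${timing.totalDurationMs} ms (${timing.totalDurationSeconds} s)")
  }
}
```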

def winner: Option[ExperimentResult]

Get the best performing experiment.
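
Ranking by RAGAS score and picking a winner can be sketched as below. `DemoResult` is a hypothetical stand-in for `ExperimentResult`, and the sketch assumes that experiments without a score are excluded from the ranking; the actual library may handle failures differently.

```scala
object RankingSketch {
  // Hypothetical stand-in for ExperimentResult; a failed run has no RAGAS score.
  final case class DemoResult(name: String, ragasScore: Option[Double])

  // Highest score first; results without a score are dropped (assumed behavior).
  def ranked(results: Seq[DemoResult]): Seq[DemoResult] =
    results
      .collect { case r @ DemoResult(_, Some(score)) => (score, r) }
      .sortBy(_._1)(Ordering.Double.TotalOrdering.reverse)
      .map(_._2)

  // The winner is the head of the ranking, or None if nothing was scored.
  def winner(results: Seq[DemoResult]): Option[DemoResult] =
    ranked(results).headOption
}
```

Because `sortBy` is stable, experiments with equal scores keep their original relative order in this sketch.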

Inherited methods

def productElementNames: Iterator[String]

Attributes

Inherited from:
Product

def productIterator: Iterator[Any]

Attributes

Inherited from:
Product