org.llm4s.eval.dataset

Members list

Type members

Classlikes

final case class Dataset[I, O](id: DatasetId, name: String, description: String, inputSchema: Option[Value], outputSchema: Option[Value], examples: List[Example[I, O]], createdAt: Instant, tags: Set[String])

A named, versioned collection of Example instances.

Type parameters

I

input type

O

output type

Value parameters

createdAt

creation timestamp

description

purpose and content of the dataset

examples

ordered list of examples

id

unique identifier

inputSchema

optional JSON Schema describing the expected input structure

name

human-readable name

outputSchema

optional JSON Schema describing the expected referenceOutput structure

tags

dataset-level labels for discovery

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
final case class DatasetId(value: String) extends AnyVal

Opaque identifier for a Dataset. Wraps a UUID string.

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class AnyVal
trait Matchable
class Any
object DatasetId

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
DatasetId.type
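
The identifier types in this package are value classes that wrap a UUID string. A minimal sketch of that pattern follows, using a stand-in `MyId` type. The `random()` and `parse` factory names are hypothetical, since the members of the real companion objects are not listed on this page.

```scala
import java.util.UUID
import scala.util.Try

// Stand-in for DatasetId / ExampleId / SnapshotId; not the library's own type.
final case class MyId(value: String) extends AnyVal

object MyId {
  // Hypothetical factory: create an identifier backed by a fresh random UUID.
  def random(): MyId = MyId(UUID.randomUUID().toString)

  // Hypothetical parser: accept only strings that java.util.UUID can parse.
  def parse(raw: String): Option[MyId] =
    Try(UUID.fromString(raw)).toOption.map(u => MyId(u.toString))
}
```

Because the wrapper extends `AnyVal`, the compiler can usually represent it as the underlying `String` at runtime while still keeping dataset, example, and snapshot identifiers distinct at the type level.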
final case class DatasetSnapshot[I, O](snapshotId: SnapshotId, datasetId: DatasetId, examples: List[Example[I, O]], createdAt: Instant)

An immutable point-in-time copy of the examples in a Dataset.

Snapshots are created by DatasetStore.createSnapshot and are unaffected by subsequent mutations to the originating dataset.
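
The isolation described above follows naturally from immutable data. A minimal sketch, assuming simplified stand-in types (`SimpleExample`, `SimpleSnapshot`, `MutableDataset`) rather than the library's own classes: because Scala's `List` is immutable, a snapshot that captures the example list at creation time cannot observe later changes to the dataset.

```scala
import java.time.Instant

// Simplified stand-ins for Example and DatasetSnapshot (illustrative only).
final case class SimpleExample(id: String, input: String)
final case class SimpleSnapshot(datasetId: String, examples: List[SimpleExample], createdAt: Instant)

// A mutable holder playing the role of a dataset inside a store.
final class MutableDataset(val id: String) {
  private var current: List[SimpleExample] = Nil

  // Appending builds a new list; previously captured lists are untouched.
  def add(e: SimpleExample): Unit = current = current :+ e
  def examples: List[SimpleExample] = current

  // Capture the list as it is right now.
  def snapshot(): SimpleSnapshot = SimpleSnapshot(id, current, Instant.now())
}

object SnapshotDemo {
  val dataset = new MutableDataset("ds-1")
  dataset.add(SimpleExample("e1", "What is 2 + 2?"))
  val snapshot: SimpleSnapshot = dataset.snapshot()
  dataset.add(SimpleExample("e2", "Name a prime number."))
}
```

After the second `add`, `SnapshotDemo.dataset.examples` holds two entries while `SnapshotDemo.snapshot.examples` still holds only the first.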

Type parameters

I

input type

O

output type

Value parameters

createdAt

snapshot creation timestamp

datasetId

the dataset from which the snapshot was taken

examples

the frozen example list at snapshot time

snapshotId

unique identifier for this snapshot

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
trait DatasetStore[F[_]]

Algebra for managing labelled evaluation datasets.

The effect type F[_] is left unconstrained so that implementations can range from the trivial cats.Id (synchronous, in-memory) to any async effect (e.g. Future, IO) without requiring cats-effect as a core dependency.

All methods use ujson.Value for both input and output, making the store format-agnostic; callers handle (de)serialisation at their own boundary.
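
A minimal sketch of how an unconstrained `F[_]` plays out in practice. `MiniStore` and its two methods are hypothetical stand-ins (the members of `DatasetStore` are not shown on this page), and `Id` is defined locally instead of importing `cats.Id`.

```scala
import scala.collection.mutable
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._

object StoreSketch {
  // Local equivalent of cats.Id: the "effect" is just the value itself.
  type Id[A] = A

  // Hypothetical two-method algebra; F[_] carries no type class constraint.
  trait MiniStore[F[_]] {
    def put(key: String, value: String): F[Unit]
    def get(key: String): F[Option[String]]
  }

  // Synchronous interpreter: results are returned directly.
  final class SyncStore extends MiniStore[Id] {
    private val data = mutable.Map.empty[String, String]
    def put(key: String, value: String): Unit = synchronized { data.update(key, value) }
    def get(key: String): Option[String] = synchronized { data.get(key) }
  }

  // Asynchronous interpreter: the same operations wrapped in Future.
  final class AsyncStore(implicit ec: ExecutionContext) extends MiniStore[Future] {
    private val underlying = new SyncStore
    def put(key: String, value: String): Future[Unit] = Future(underlying.put(key, value))
    def get(key: String): Future[Option[String]] = Future(underlying.get(key))
  }

  // Runs the same write-then-read sequence against both interpreters.
  def demo(): (Option[String], Option[String]) = {
    import ExecutionContext.Implicits.global
    val sync = new SyncStore
    sync.put("k", "v")
    val async = new AsyncStore
    val viaFuture = Await.result(async.put("k", "v").flatMap(_ => async.get("k")), 5.seconds)
    (sync.get("k"), viaFuture)
  }
}
```

Leaving `F[_]` unconstrained keeps the trait free of any effect-library dependency; code that needs to sequence calls generically would add its own constraint (for example a `Monad[F]`) at the call site.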

Type parameters

F

effect wrapper (e.g. cats.Id, Future, IO)

Attributes

Supertypes
class Object
trait Matchable
class Any
Known subtypes
class InMemoryDatasetStore
final case class Example[I, O](id: ExampleId, input: I, referenceOutput: Option[O], tags: Set[String], metadata: Map[String, String])

A single labelled example in a dataset.

Type parameters

I

input type

O

output type

Value parameters

id

unique identifier for this example

input

the model input (generic; often ujson.Value)

metadata

arbitrary string key-value annotations

referenceOutput

optional ground-truth output to compare model responses against

tags

free-form labels for filtering (e.g. "qa", "rag")

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
final case class ExampleId(value: String) extends AnyVal

Opaque identifier for an Example within a dataset. Wraps a UUID string.

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class AnyVal
trait Matchable
class Any
object ExampleId

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
ExampleId.type
sealed trait ExampleSelector

Selects a subset of examples from a dataset.

Pattern-match exhaustively over the three cases to handle all variants:

 selector match {
   case ExampleSelector.All        => // return everything
   case ExampleSelector.ByTags(ts) => // filter by tag intersection
   case ExampleSelector.ByIds(ids) => // filter by exact ID membership
 }
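
A runnable sketch of how such a match might be applied to a list of examples. The ADT below is a local reconstruction: the field types of `ByTags` and `ByIds` are assumed to be sets of strings, and "tag intersection" is assumed to mean that an example matches when it shares at least one tag with the selector.

```scala
// Local stand-ins; the real Example and ExampleSelector live in org.llm4s.eval.dataset.
final case class TaggedExample(id: String, tags: Set[String])

sealed trait Selector
object Selector {
  case object All extends Selector
  final case class ByTags(tags: Set[String]) extends Selector
  final case class ByIds(ids: Set[String]) extends Selector
}

object SelectorSketch {
  def select(examples: List[TaggedExample], selector: Selector): List[TaggedExample] =
    selector match {
      case Selector.All        => examples
      // Assumed semantics: keep examples sharing at least one tag with the selector.
      case Selector.ByTags(ts) => examples.filter(e => (e.tags intersect ts).nonEmpty)
      case Selector.ByIds(ids) => examples.filter(e => ids.contains(e.id))
    }
}
```

Because the trait is sealed, the compiler can warn when a `match` omits one of the three cases.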

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
object All
class ByIds
class ByTags
object ExampleSelector

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type
ExampleSelector.type
class InMemoryDatasetStore extends DatasetStore[Id]

A synchronous, in-memory implementation of DatasetStore backed by mutable Scala maps.

Intended for unit tests and local experimentation. All public methods are synchronized on the store instance to provide basic thread safety within a single JVM.

Obtain an instance via the companion object factory:

 val store = InMemoryDatasetStore() 
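
The locking strategy described above can be sketched as follows. `TinyStore` is a hypothetical stand-in, not the library's class: every public method takes the instance monitor, so a compound read-modify-write on the mutable map stays consistent under concurrent access.

```scala
import scala.collection.mutable

final class TinyStore {
  private val examplesByDataset = mutable.Map.empty[String, List[String]]

  // Read-modify-write under the instance lock, so concurrent appends are not lost.
  def append(datasetId: String, example: String): Unit = synchronized {
    val existing = examplesByDataset.getOrElse(datasetId, Nil)
    examplesByDataset.update(datasetId, example :: existing)
  }

  def count(datasetId: String): Int = synchronized {
    examplesByDataset.get(datasetId).map(_.size).getOrElse(0)
  }
}

object TinyStoreDemo {
  // Four threads each append 250 examples to the same dataset.
  def run(): Int = {
    val store = new TinyStore
    val threads = (1 to 4).map { t =>
      new Thread(() => (1 to 250).foreach(i => store.append("ds", s"$t-$i")))
    }
    threads.foreach(_.start())
    threads.foreach(_.join())
    store.count("ds")
  }
}
```

This coarse-grained locking is simple and adequate for tests; it serialises all access to one store instance and offers no coordination across JVMs.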

Attributes

Companion
object
Supertypes
trait DatasetStore[Id]
class Object
trait Matchable
class Any
object InMemoryDatasetStore

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
InMemoryDatasetStore.type

Lightweight structural validator for JSON values against a JSON Schema subset.

Only the following schema keywords are recognised; all others are silently ignored:

  • type — checks the JSON type of the value (object, array, string, number, boolean, null); unknown type strings are skipped
  • required — when the value is a JSON object, verifies that each named key is present
  • properties — when the value is a JSON object, recursively validates each listed property against its sub-schema

This is intentionally not a full JSON Schema implementation (no if/then/else, anyOf, $ref, etc.). It is sufficient for validating the structure of Example inputs and outputs in evaluation datasets.
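
The three recognised keywords can be illustrated with a small dependency-free sketch. Everything here is an assumption made for illustration: the library operates on `ujson.Value`, whereas this sketch uses a tiny local `Json` model, and the `validate` signature and error-list return type are hypothetical.

```scala
// Minimal JSON model so the sketch has no dependencies.
sealed trait Json
case object JNull extends Json
final case class JBool(b: Boolean) extends Json
final case class JNum(n: Double) extends Json
final case class JStr(s: String) extends Json
final case class JArr(items: List[Json]) extends Json
final case class JObj(fields: Map[String, Json]) extends Json

object MiniValidator {
  private val knownTypes = Set("object", "array", "string", "number", "boolean", "null")

  // Returns error messages; an empty list means the value passed every check.
  def validate(value: Json, schema: JObj, path: String = "$"): List[String] = {
    // "type": compare the declared type name with the value's JSON type.
    val typeErrors = schema.fields.get("type") match {
      case Some(JStr(t)) =>
        val ok = (t, value) match {
          case ("object", _: JObj)   => true
          case ("array", _: JArr)    => true
          case ("string", _: JStr)   => true
          case ("number", _: JNum)   => true
          case ("boolean", _: JBool) => true
          case ("null", JNull)       => true
          case (known, _) if knownTypes(known) => false
          case _                     => true // unknown type strings are skipped
        }
        if (ok) Nil else List(s"$path: expected $t")
      case _ => Nil
    }
    // "required" and "properties" apply only when the value is an object.
    val objectErrors = value match {
      case JObj(fields) =>
        val missing = schema.fields.get("required") match {
          case Some(JArr(keys)) =>
            keys.collect { case JStr(k) if !fields.contains(k) => s"$path: missing required key '$k'" }
          case _ => Nil
        }
        val nested = schema.fields.get("properties") match {
          case Some(JObj(props)) =>
            props.toList.flatMap {
              case (k, sub: JObj) => fields.get(k).toList.flatMap(v => validate(v, sub, s"$path.$k"))
              case _              => Nil
            }
          case _ => Nil
        }
        missing ++ nested
      case _ => Nil
    }
    typeErrors ++ objectErrors
  }
}
```

Any keyword other than `type`, `required`, and `properties` is never looked up, which mirrors the "silently ignored" behaviour described above.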

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
object JsonlCodec

Codec for reading and writing Example values as JSONL (newline-delimited JSON).

Each line produced by encode is a compact, single-line JSON object. decode is the inverse: it parses one such line back into an Example, returning None for any malformed or structurally invalid input without throwing.

Intended for batch import/export via DatasetStore.importJsonl and DatasetStore.exportJsonl.
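
A minimal sketch of the encode/decode contract, assuming the `ujson` library is on the classpath. The wire format below (field names `id`, `input`, `referenceOutput`, `tags`) is an assumption for illustration; the layout actually used by `JsonlCodec` is not documented on this page. `Row` is a simplified stand-in for `Example[ujson.Value, ujson.Value]`.

```scala
import scala.util.Try

// Simplified stand-in for Example[ujson.Value, ujson.Value].
final case class Row(
  id: String,
  input: ujson.Value,
  referenceOutput: Option[ujson.Value],
  tags: Set[String]
)

object MiniJsonl {
  // ujson.write without an indent argument emits compact, single-line JSON.
  def encode(row: Row): String = {
    val obj = ujson.Obj(
      "id"    -> ujson.Str(row.id),
      "input" -> row.input,
      "tags"  -> ujson.Arr.from(row.tags.toList.sorted)
    )
    row.referenceOutput.foreach(out => obj("referenceOutput") = out)
    ujson.write(obj)
  }

  // A missing "tags" field means no tags; a non-array or non-string entry is invalid.
  private def decodeTags(field: Option[ujson.Value]): Option[Set[String]] =
    field match {
      case None => Some(Set.empty[String])
      case Some(v) =>
        v.arrOpt.flatMap { items =>
          val strings = items.flatMap(_.strOpt)
          if (strings.size == items.size) Some(strings.toSet) else None
        }
    }

  // Returns None for malformed JSON or a structurally invalid object; never throws.
  def decode(line: String): Option[Row] =
    for {
      json   <- Try(ujson.read(line)).toOption
      fields <- json.objOpt
      id     <- fields.get("id").flatMap(_.strOpt)
      input  <- fields.get("input")
      tags   <- decodeTags(fields.get("tags"))
    } yield Row(id, input, fields.get("referenceOutput"), tags)
}
```

Wrapping the parse in `Try` and threading every field lookup through `Option` is one way to satisfy the "returns None without throwing" contract; a codec that needs to report why a line failed would return an `Either` instead.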

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
JsonlCodec.type
final case class SnapshotId(value: String) extends AnyVal

Opaque identifier for a DatasetSnapshot. Wraps a UUID string.

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class AnyVal
trait Matchable
class Any
object SnapshotId

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
SnapshotId.type