Generates ground-truth evaluation datasets from documents using an LLM.
Creates question-answer pairs with supporting context that can be used for RAGAS evaluation, and supports multiple generation strategies for different testing scenarios.
Value parameters
llmClient
The LLM client used to generate question-answer pairs
options
Options controlling generation (e.g. strategy and question count)
Attributes
Example
val generator = GroundTruthGenerator(llmClient)

// Generate question-answer pairs from source documents
val dataset = generator.generateFromDocuments(
  documents = Seq(doc1, doc2, doc3),
  questionsPerDoc = 5,
  datasetName = "my-test-set"
)

// Save the dataset for later evaluation runs
TestDataset.save(dataset, "data/generated/my-test-set.json")
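To illustrate the shape of the generated data, the following is a minimal, self-contained sketch. The `QAPair` and `Dataset` case classes and the `factualStrategy` helper are assumptions for illustration, not the library's actual API; a real strategy would call the LLM client rather than build template questions.

```scala
// Hypothetical data shapes for a generated dataset (names are assumptions).
case class QAPair(question: String, groundTruth: String, contexts: Seq[String])
case class Dataset(name: String, pairs: Seq[QAPair])

// A toy "factual" strategy: derive N placeholder questions per document,
// keeping the source document as the retrieval context for each pair.
// A real implementation would prompt the LLM here instead.
def factualStrategy(docs: Seq[String], questionsPerDoc: Int): Seq[QAPair] =
  docs.flatMap { doc =>
    (1 to questionsPerDoc).map { i =>
      QAPair(
        question = s"Question $i about: ${doc.take(20)}",
        groundTruth = "(answer generated by the LLM)",
        contexts = Seq(doc)
      )
    }
  }

val dataset = Dataset(
  name = "my-test-set",
  pairs = factualStrategy(Seq("doc one text", "doc two text"), questionsPerDoc = 2)
)
println(dataset.pairs.size) // 2 documents x 2 questions = 4 pairs
```

Keeping the context alongside each question-answer pair is what makes the dataset usable for RAGAS metrics such as context precision and faithfulness, which compare the generated answer against the retrieved context.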