Modular RAG Example

This guide introduces a practical, modular Retrieval-Augmented Generation (RAG) example for LLM4S.

The sample lives in:

modules/samples/src/main/scala/org/llm4s/samples/rag/modular/ModularRAGExample.scala
modules/samples/src/main/scala/org/llm4s/samples/rag/modular/RAGModules.scala

Architecture

The example intentionally separates the pipeline into modules:

IngestionModule: document ingestion from filesystem (.txt, .md, .pdf, .docx) or direct text.
RetrievalModule: similarity retrieval (topK) from indexed chunks.
GenerationModule: grounded answer generation from retrieved contexts.

This keeps each stage's responsibilities narrow and makes it easy to evolve or test each stage independently.
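The separation described above can be sketched roughly as follows. This is a minimal, self-contained illustration, not the code in RAGModules.scala: the trait signatures, the `Chunk` type, and the toy implementations (`LineIngestion`, `OverlapRetrieval`, `EchoGeneration`) are all hypothetical. Word overlap stands in for embedding similarity, and string concatenation stands in for an LLM call.

```scala
// Hypothetical sketch: these names and signatures are illustrative only,
// not the API defined in RAGModules.scala.
final case class Chunk(id: String, text: String)

trait IngestionModule {
  def ingest(text: String): Vector[Chunk]
}

trait RetrievalModule {
  def retrieve(query: String, chunks: Vector[Chunk], topK: Int): Vector[Chunk]
}

trait GenerationModule {
  def generate(query: String, contexts: Vector[Chunk]): String
}

// Toy ingestion: one chunk per non-empty line of direct text.
object LineIngestion extends IngestionModule {
  def ingest(text: String): Vector[Chunk] =
    text.split("\n").toVector
      .map(_.trim)
      .filter(_.nonEmpty)
      .zipWithIndex
      .map { case (line, i) => Chunk(s"chunk-$i", line) }
}

// Toy retrieval: shared-word count stands in for embedding similarity.
object OverlapRetrieval extends RetrievalModule {
  private def words(s: String): Set[String] =
    s.toLowerCase.split("\\W+").filter(_.nonEmpty).toSet

  def retrieve(query: String, chunks: Vector[Chunk], topK: Int): Vector[Chunk] = {
    val q = words(query)
    chunks
      .map(c => (c, (words(c.text) & q).size))
      .filter { case (_, score) => score > 0 }
      .sortBy { case (_, score) => -score }
      .take(topK)
      .map { case (chunk, _) => chunk }
  }
}

// Toy generation: concatenates the retrieved contexts instead of calling an LLM.
object EchoGeneration extends GenerationModule {
  def generate(query: String, contexts: Vector[Chunk]): String =
    s"Q: $query\nContext: ${contexts.map(_.text).mkString(" | ")}"
}

object ModularSketch {
  // Wires the three stages together; each stage can be swapped or tested alone.
  def run(corpus: String, query: String, topK: Int): String = {
    val chunks   = LineIngestion.ingest(corpus)
    val contexts = OverlapRetrieval.retrieve(query, chunks, topK)
    EchoGeneration.generate(query, contexts)
  }
}
```

Because each stage sits behind its own trait in this sketch, a test can exercise retrieval with hand-built chunks, or swap the toy generator for a real one, without touching the other stages.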

sbt "samples/runMain org.llm4s.samples.rag.modular.ModularRAGExample"

If no document path is passed, the sample ingests a small built-in corpus. To query your own documents, pass a document path followed by the question:

sbt "samples/runMain org.llm4s.samples.rag.modular.ModularRAGExample ./docs \"What are the key reliability patterns?\""
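One way the positional arguments shown above might be handled is sketched below. The names `CorpusSource`, `BuiltInCorpus`, `DocumentPath`, and `ArgsSketch` are hypothetical, and treating the question as optional is an assumption; the actual sample may parse its arguments differently.

```scala
// Hypothetical sketch of positional-argument handling; the real sample may differ.
sealed trait CorpusSource
case object BuiltInCorpus extends CorpusSource
final case class DocumentPath(path: String) extends CorpusSource

object ArgsSketch {
  // args(0): optional document path; args(1): optional question (assumed optional).
  def parse(args: Array[String]): (CorpusSource, Option[String]) = {
    val source: CorpusSource =
      args.headOption.map(p => DocumentPath(p)).getOrElse(BuiltInCorpus)
    (source, args.lift(1))
  }
}
```

With no arguments, this sketch falls back to the built-in corpus, mirroring the behavior described for the sample.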

The ingestion stage supports file extraction through the core RAG API, including PDF.

The sample resolves embedding and LLM settings from Llm4sConfig:

Embeddings are required.
LLM is optional. If the LLM configuration is missing, the sample still runs ingestion and retrieval but skips answer generation.
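The required-versus-optional rule can be illustrated with a small sketch. `SampleSettings` and `OptionalGeneration` are hypothetical names invented for this illustration and are not part of Llm4sConfig; the sketch only models which stages would run for a given combination of settings.

```scala
// Hypothetical sketch: SampleSettings and its fields are illustrative,
// not the Llm4sConfig API.
final case class SampleSettings(
  embeddingModel: Option[String],
  llmModel: Option[String]
)

object OptionalGeneration {
  // Returns the stages that would run, or an error when embeddings are missing.
  def plan(settings: SampleSettings): Either[String, List[String]] =
    settings.embeddingModel match {
      case None =>
        Left("embedding configuration is required")
      case Some(_) =>
        val base = List("ingestion", "retrieval")
        // Generation is appended only when an LLM is configured.
        Right(settings.llmModel.fold(base)(_ => base :+ "generation"))
    }
}
```

Modeling the LLM as an `Option` makes the degraded mode (retrieval without generation) an ordinary code path instead of an error.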

This modular RAG layout is useful for real applications because you can: