Modular RAG Example
This guide introduces a practical, modular Retrieval-Augmented Generation (RAG) example for LLM4S.
The sample lives in:
modules/samples/src/main/scala/org/llm4s/samples/rag/modular/ModularRAGExample.scalamodules/samples/src/main/scala/org/llm4s/samples/rag/modular/RAGModules.scala
Architecture
The example intentionally separates the pipeline into modules:
IngestionModule: document ingestion from filesystem (.txt,.md,.pdf,.docx) or direct text.RetrievalModule: similarity retrieval (topK) from indexed chunks.GenerationModule: grounded answer generation from retrieved contexts.
This keeps cross-cutting concerns small and makes each stage easy to evolve or test independently.
Run
1) Retrieval + generation (if LLM provider configured)
1
sbt "samples/runMain org.llm4s.samples.rag.modular.ModularRAGExample"
If no document path is passed, the sample ingests a small built-in corpus.
2) Use your own document directory
1
sbt "samples/runMain org.llm4s.samples.rag.modular.ModularRAGExample ./docs \"What are the key reliability patterns?\""
The ingestion stage supports file extraction through the core RAG API, including PDF.
Configuration
The sample resolves embedding and LLM settings from Llm4sConfig:
- Embeddings are required.
- LLM is optional. If LLM config is missing, the sample still runs ingestion + retrieval and skips answer generation.
Why this pattern
This modular RAG layout is useful for real applications because you can:
- swap retrieval strategy without changing ingestion/generation wiring,
- test retrieval quality independently,
- run retrieval-only workflows when generation is disabled or unavailable.