High-level RAG (Retrieval-Augmented Generation) pipeline.
Provides a unified interface for:
- Document ingestion (from files, directories, or raw text)
- Semantic search with hybrid fusion
- Answer generation with retrieved context
Example

```scala
// Create a pipeline. Note: .toOption.get throws if build() did not succeed,
// so handle the failure case explicitly in real code.
val rag = RAG.builder()
  .withEmbeddings(EmbeddingProvider.OpenAI)
  .withChunking(ChunkerFactory.Strategy.Sentence, 800, 150)
  .build()
  .toOption.get

// Ingest documents
rag.ingest("./docs")

// Search
val results = rag.query("What is X?")

// With answer generation (requires an LLM client)
val ragWithLLM = RAG.builder()
  .withEmbeddings(EmbeddingProvider.OpenAI)
  .withLLM(llmClient)
  .build()
  .toOption.get

val answer = ragWithLLM.queryWithAnswer("What is X?")
```

Companion: object
Supertypes: trait Closeable, trait AutoCloseable, class Object, trait Matchable, class Any
Members list
Value members
Concrete methods
Number of chunks indexed
Close resources.
Definition Classes: Closeable -> AutoCloseable
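Because the class lists Closeable and AutoCloseable among its supertypes, it can be managed with `scala.util.Using` from the standard library so that resources are released even when an operation fails. The sketch below uses a hypothetical stand-in class, `FakePipeline`, rather than the real pipeline; only the AutoCloseable contract is assumed.

```scala
import scala.util.Using

object CloseSketch {
  // Hypothetical stand-in for the pipeline; only AutoCloseable matters here.
  final class FakePipeline extends AutoCloseable {
    var closed: Boolean = false
    def chunkTotal: Int = 42 // placeholder value, not a real API
    override def close(): Unit = closed = true
  }

  // Using.resource runs the body and then calls close(), even if the body throws.
  def countThenClose(p: FakePipeline): Int =
    Using.resource(p)(_.chunkTotal)
}
```

The same pattern should apply to any value that implements AutoCloseable, since the standard library provides the required `Releasable` instance.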
Delete a specific document and its chunks.
Value parameters
- docId: Document ID to delete
Returns: Unit on success
Number of documents ingested
Ingest documents from a file path, which may point to a single file or to a directory.
Supports .txt, .md, .pdf, .docx, and other text-like formats.
Value parameters
- metadata: Additional metadata to attach to all chunks
- path: Path to file or directory
Returns: Number of chunks created
Ingest a document from a Path.
Ingest documents from a DocumentLoader.
Value parameters
- loader: The document loader to ingest from
Returns: Loading statistics with success/failure counts
Async version of ingest with parallel document processing.
Processes documents in batches with configurable parallelism, using the parallelism and batchSize settings from LoadingConfig.
Value parameters
- ec: Execution context for async operations
- loader: The document loader to ingest from
Returns: Future with loading statistics
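To illustrate batched parallel processing, here is a minimal sketch built on standard Scala futures. It is not the library's implementation: `Stats`, `ingestAll`, and `ingestOne` are hypothetical names, and the sketch assumes batches run one after another while the items inside a batch run concurrently. How LoadingConfig actually combines parallelism and batchSize is not stated on this page.

```scala
import scala.concurrent.{ExecutionContext, Future}

object BatchIngestSketch {
  // Hypothetical result type; the real loading statistics type is not shown here.
  final case class Stats(succeeded: Int, failed: Int)

  // Batches are processed sequentially; items within one batch run concurrently.
  def ingestAll[A](items: Seq[A], batchSize: Int)(ingestOne: A => Unit)(
      implicit ec: ExecutionContext): Future[Stats] = {
    require(batchSize > 0, "batchSize must be positive")
    items.grouped(batchSize).foldLeft(Future.successful(Stats(0, 0))) { (accF, batch) =>
      accF.flatMap { acc =>
        // Each attempt yields true on success and false on a non-fatal failure.
        val attempts = batch.map { item =>
          Future(ingestOne(item)).map(_ => true).recover { case _ => false }
        }
        Future.sequence(attempts).map { oks =>
          Stats(acc.succeeded + oks.count(identity), acc.failed + oks.count(!_))
        }
      }
    }
  }
}
```

Recovering each failure into a flag keeps one bad document from failing the whole run, which matches the idea of returning success and failure counts.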
Ingest pre-chunked content (for advanced use cases).
Ingest a document from a Path with metadata.
Ingest raw text content.
Value parameters
- content: The text content to ingest
- documentId: Unique identifier for this document
- metadata: Additional metadata
Returns: Number of chunks created
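The number of chunks created depends on the chunking settings. The class example passes 800 and 150 to `withChunking`; the sketch below assumes these are a target chunk size and an overlap measured in characters, which this page does not confirm. It also uses a plain sliding window instead of the sentence-based strategy, purely to show how size and overlap determine the chunk count. `ChunkSketch.chunk` is a hypothetical helper.

```scala
object ChunkSketch {
  // Fixed-size sliding window: each chunk starts (size - overlap) characters
  // after the previous one, so neighbours share `overlap` characters.
  def chunk(text: String, size: Int, overlap: Int): Vector[String] = {
    require(size > 0 && overlap >= 0 && overlap < size, "need 0 <= overlap < size")
    val step = size - overlap
    if (text.isEmpty) Vector.empty
    else
      Iterator
        .iterate(0)(_ + step)
        // Stop once the previous chunk has already reached the end of the text.
        .takeWhile(start => start == 0 || start + overlap < text.length)
        .map(start => text.substring(start, math.min(start + size, text.length)))
        .toVector
  }
}
```

Under these assumptions, a 2000-character text with size 800 and overlap 150 yields three chunks starting at offsets 0, 650, and 1300.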
Check if a document needs updating based on version.
Value parameters
- doc: Document to check
Returns: true if document is new or changed
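This page does not say how a document version is computed. One common approach, shown here only as an assumption, is to hash the content and compare it with the hash recorded at the last ingestion. `contentVersion`, `needsUpdate`, and the in-memory `registry` map are hypothetical names for illustration.

```scala
import java.nio.charset.StandardCharsets
import java.security.MessageDigest

object VersionSketch {
  // Hex-encoded SHA-256 of the content, used as the "version" (an assumption).
  def contentVersion(content: String): String =
    MessageDigest
      .getInstance("SHA-256")
      .digest(content.getBytes(StandardCharsets.UTF_8))
      .map(b => "%02x".format(b))
      .mkString

  // registry maps document ID -> version recorded at the last ingestion.
  // A document needs updating if it is unknown or its version differs.
  def needsUpdate(registry: Map[String, String], docId: String, content: String): Boolean =
    !registry.get(docId).contains(contentVersion(content))
}
```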
Search for relevant chunks.
Value parameters
- query: The search query
- topK: Override default topK (optional)
Returns: Ranked search results
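The class overview mentions hybrid fusion, but this page does not say which fusion method is used. A widely used option is reciprocal rank fusion (RRF), which merges several ranked lists by summing 1 / (k + rank) for each item. The sketch below shows that technique on plain chunk IDs; `FusionSketch.rrf` and the default k = 60 are illustrative choices, not the library's API.

```scala
object FusionSketch {
  // Reciprocal rank fusion: score(id) = sum over rankings of 1 / (k + rank),
  // where rank starts at 1 for the best hit in each ranking.
  def rrf(rankings: Seq[Seq[String]], k: Int = 60, topK: Int = 5): Seq[(String, Double)] = {
    val scores: Map[String, Double] = rankings
      .flatMap(_.zipWithIndex.map { case (id, idx) => id -> 1.0 / (k + idx + 1) })
      .groupMapReduce(_._1)(_._2)(_ + _)
    // Highest score first; ties broken by ID so the order is deterministic.
    scores.toSeq
      .sortWith((a, b) => a._2 > b._2 || (a._2 == b._2 && a._1 < b._1))
      .take(topK)
  }
}
```

For example, fusing a vector ranking `Seq("a", "b", "c")` with a keyword ranking `Seq("b", "d", "a")` places `b` first, because it ranks highly in both lists.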
Search and generate an answer using LLM.
Requires an LLM client to be configured.
Value parameters
- question: The question to answer
- topK: Override default topK (optional)
Returns: Answer with supporting contexts
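Answer generation with retrieved context usually comes down to placing the top-ranked chunks in the prompt sent to the LLM. The following sketch shows one way such a prompt could be assembled. The `Retrieved` type, the `buildPrompt` helper, and the prompt wording are all hypothetical; the library's actual prompt is not documented on this page.

```scala
object PromptSketch {
  // Hypothetical shape of a retrieved chunk; the real result type is not shown here.
  final case class Retrieved(source: String, text: String)

  // Number the contexts so an answer can refer to them as [1], [2], ...
  def buildPrompt(question: String, contexts: Seq[Retrieved]): String = {
    val numbered = contexts.zipWithIndex
      .map { case (c, i) => s"[${i + 1}] (${c.source}) ${c.text}" }
      .mkString("\n")
    Seq(
      "Answer the question using only the context below.",
      "If the context is insufficient, say so.",
      "",
      "Context:",
      numbered,
      "",
      s"Question: $question"
    ).mkString("\n")
  }
}
```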
Full refresh - re-process all documents.
Clears the registry and re-ingests all documents. Use this when you want to ensure a clean slate.
Value parameters
- loader: The document loader to refresh from
Returns: Sync statistics (all documents counted as "added")
Async full refresh.
Clears all data and re-ingests from the loader with parallel processing.
Value parameters
- ec: Execution context for async operations
- loader: The document loader to refresh from
Returns: Future with sync statistics
Sync with a loader - only process changed documents.
Compares document versions to detect:
- New documents (added)
- Changed documents (updated - old chunks removed, new chunks added)
- Deleted documents (removed from source)
- Unchanged documents (skipped)
Value parameters
- loader: The document loader to sync with
Returns: Sync statistics
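The four categories above can be computed by comparing two maps from document ID to version: one for what was previously ingested and one for what the source currently contains. This is a minimal sketch under that assumption; `SyncStats` and `diff` are hypothetical names, and the real statistics type may carry different fields.

```scala
object SyncSketch {
  // Hypothetical statistics shape mirroring the four categories listed above.
  final case class SyncStats(added: Int, updated: Int, deleted: Int, unchanged: Int)

  // Both maps go from document ID to a version string.
  def diff(registry: Map[String, String], source: Map[String, String]): SyncStats = {
    val added   = source.keySet.diff(registry.keySet).size   // only in the source
    val deleted = registry.keySet.diff(source.keySet).size   // only in the registry
    val common  = source.keySet.intersect(registry.keySet)   // present in both
    val updated = common.count(id => source(id) != registry(id))
    SyncStats(added, updated, deleted, common.size - updated)
  }
}
```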
Async sync with parallel change detection.
Performs change detection in parallel, but applies updates sequentially to avoid conflicts in the vector store.
Value parameters
- ec: Execution context for async operations
- loader: The document loader to sync with
Returns: Future with sync statistics
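The split between parallel detection and sequential application can be sketched with standard Scala futures. In this illustration, detection is read-only and runs concurrently, while changes are applied one at a time with a fold. The `Change` types, `detect`, and `applyAll` are hypothetical, and an in-memory map stands in for the vector store and registry.

```scala
import scala.concurrent.{ExecutionContext, Future}

object AsyncSyncSketch {
  sealed trait Change
  final case class Upsert(id: String, version: String) extends Change
  final case class Remove(id: String) extends Change

  // Phase 1: compare each source document with the registry concurrently.
  def detect(registry: Map[String, String], source: Map[String, String])(
      implicit ec: ExecutionContext): Future[Seq[Change]] = {
    def check(id: String, v: String): Option[Change] =
      if (registry.get(id).contains(v)) None else Some(Upsert(id, v))
    val upserts = Future.traverse(source.toSeq) { case (id, v) => Future(check(id, v)) }
    val removes: Seq[Change] = registry.keySet.diff(source.keySet).toSeq.map(id => Remove(id))
    upserts.map(_.flatten ++ removes)
  }

  // Phase 2: apply changes one at a time so writes never interleave.
  def applyAll(registry: Map[String, String], changes: Seq[Change]): Map[String, String] =
    changes.foldLeft(registry) {
      case (reg, Upsert(id, v)) => reg.updated(id, v)
      case (reg, Remove(id))    => reg - id
    }
}
```

Keeping the write phase sequential trades some throughput for a simpler consistency story, which is the rationale the description gives for avoiding conflicts in the vector store.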