SourceBackedLoader

org.llm4s.rag.loader.SourceBackedLoader
See the SourceBackedLoader companion object
final case class SourceBackedLoader(source: DocumentSource, extractor: DocumentExtractor, additionalMetadata: Map[String, String], defaultHints: Option[DocumentHints]) extends DocumentLoader

Bridge between DocumentSource and DocumentLoader.

SourceBackedLoader converts any DocumentSource into a DocumentLoader, enabling documents from S3, GCS, databases, or any custom source to be used with the RAG pipeline.

The loader:

  1. Lists documents from the source
  2. Reads document content (bytes)
  3. Extracts text using DocumentExtractor
  4. Creates Document objects with appropriate metadata and hints
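
The four steps above can be sketched in a self-contained form. Every type here (Ref, Doc, Source, Extractor) is a hypothetical, simplified stand-in rather than the real llm4s API, and the "sourceId" metadata key is an assumption made for illustration. The point is only to show how list, read, extract, and build compose while keeping per-document failures as values.

```scala
object PipelineSketch {
  // Hypothetical stand-ins; the real llm4s types carry more information.
  final case class Ref(id: String)
  final case class Doc(id: String, text: String, metadata: Map[String, String])

  trait Source {
    def list(): Iterator[Ref]                       // step 1: list documents
    def read(ref: Ref): Either[String, Array[Byte]] // step 2: read raw bytes
  }

  trait Extractor {
    def extract(bytes: Array[Byte]): Either[String, String] // step 3: bytes to text
  }

  // Step 4: build one result per listed reference. A failure in read or
  // extract becomes a Left for that document and does not stop the iterator.
  def load(
      source: Source,
      extractor: Extractor,
      extra: Map[String, String]
  ): Iterator[Either[String, Doc]] =
    source.list().map { ref =>
      for {
        bytes <- source.read(ref)
        text  <- extractor.extract(bytes)
      } yield Doc(ref.id, text, extra + ("sourceId" -> ref.id))
    }

  // In-memory source used only to exercise the sketch.
  val demoSource: Source = new Source {
    private val data = Map("a.txt" -> "hello", "b.txt" -> "world")
    def list(): Iterator[Ref] =
      Iterator(Ref("a.txt"), Ref("b.txt"), Ref("missing.txt"))
    def read(ref: Ref): Either[String, Array[Byte]] =
      data.get(ref.id).map(_.getBytes("UTF-8")).toRight(s"not found: ${ref.id}")
  }

  val utf8Extractor: Extractor = new Extractor {
    def extract(bytes: Array[Byte]): Either[String, String] =
      Right(new String(bytes, "UTF-8"))
  }
}
```

Because the result is a lazy Iterator, nothing is read from the source until the caller starts consuming it.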

Usage:

// From S3
val s3Source = S3DocumentSource("my-bucket", "docs/")
val loader = SourceBackedLoader(s3Source)
rag.sync(loader)

// With a custom extractor (named differently so both examples compile in one scope)
val customLoader = SourceBackedLoader(source, customExtractor)

Value parameters

additionalMetadata

Extra metadata to add to all documents

defaultHints

Default processing hints for documents

extractor

Document extractor for text extraction (default: DefaultDocumentExtractor)

source

The document source to load from

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

override def description: String

Human-readable description of this loader.

Used for logging and debugging.

override def estimatedCount: Option[Int]

Estimated number of documents (if known).

Used for progress reporting and resource allocation. Returns None if count is unknown or expensive to compute.
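
A minimal sketch of how a caller might use an optional count for progress reporting. The progress helper is hypothetical and not part of llm4s. It only illustrates handling both the known and the unknown case.

```scala
object ProgressSketch {
  // `estimated` plays the role of the value returned by estimatedCount.
  def progress(done: Int, estimated: Option[Int]): String =
    estimated match {
      // Known, non-zero total: report a percentage.
      case Some(total) if total > 0 => s"$done/$total (${done * 100 / total}%)"
      // Unknown (None) or zero total: report only the running count.
      case _                        => s"$done processed (total unknown)"
    }
}
```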

override def load(): Iterator[LoadResult]

Load documents from this source.

Returns an iterator of LoadResult for streaming large document sets. Each result is either a successfully loaded document or a loading error. This allows processing to continue even when some documents fail.
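
The following sketch shows one way to consume such an iterator without stopping at the first failure. The Result, Loaded, and Failed types are hypothetical simplifications and do not reflect the actual shape of LoadResult, which this page does not show.

```scala
object LoadResultSketch {
  // Hypothetical simplified model: each result is a success or a failure.
  sealed trait Result
  final case class Loaded(id: String, text: String)   extends Result
  final case class Failed(id: String, reason: String) extends Result

  final case class Summary(loaded: Int, failures: List[String])

  // Consumes the iterator one element at a time, counting successes and
  // collecting failure messages instead of aborting on the first error.
  def summarize(results: Iterator[Result]): Summary =
    results.foldLeft(Summary(0, Nil)) {
      case (acc, Loaded(_, _))       => acc.copy(loaded = acc.loaded + 1)
      case (acc, Failed(id, reason)) => acc.copy(failures = acc.failures :+ s"$id: $reason")
    }
}
```

Folding over the iterator keeps memory use independent of the number of documents, apart from the collected failure messages.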

Attributes

Returns

Iterator of load results (successes and failures)

Create a new loader with a different extractor.

Create a new loader with default hints.

def withMetadata(metadata: Map[String, String]): SourceBackedLoader

Create a new loader with additional metadata.
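
Since the loader is a case class, a method like this can return a modified copy and leave the original untouched. The sketch below uses a hypothetical two-field Loader, not the real class. Whether the real method merges the new entries into the existing metadata or replaces it is not stated here, so the merge is an assumption.

```scala
object WithMetadataSketch {
  // Hypothetical reduced loader with only the fields needed for the example.
  final case class Loader(name: String, additionalMetadata: Map[String, String]) {
    // ASSUMPTION: new entries are merged into the existing map, and new
    // values win on key collisions. The real behavior may differ.
    def withMetadata(metadata: Map[String, String]): Loader =
      copy(additionalMetadata = additionalMetadata ++ metadata)
  }
}
```

The original instance is unchanged after the call, so a base loader can be shared and specialized per use.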

Inherited methods

Combine this loader with another.

Creates a composite loader that loads from both sources.
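
A rough sketch of what a composite loader could look like. The Loader trait, the method name combine, and the fixed helper are all hypothetical, because the inherited method's signature is not shown on this page. Summing the two estimates only when both are known is likewise an assumption.

```scala
object CompositeSketch {
  // Hypothetical reduced loader interface that yields plain strings.
  trait Loader { self =>
    def load(): Iterator[String]
    def estimatedCount: Option[Int]

    // Returns a loader that yields this loader's items, then the other's.
    def combine(other: Loader): Loader = new Loader {
      // Iterator's ++ takes its argument by name, so other.load() is only
      // called once the first iterator is exhausted.
      def load(): Iterator[String] = self.load() ++ other.load()
      // The combined estimate is known only if both parts are known.
      def estimatedCount: Option[Int] =
        for { a <- self.estimatedCount; b <- other.estimatedCount } yield a + b
    }
  }

  // Helper that builds a loader over a fixed set of items.
  def fixed(items: String*): Loader = new Loader {
    def load(): Iterator[String] = items.iterator
    def estimatedCount: Option[Int] = Some(items.size)
  }
}
```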

Attributes

Inherited from:
DocumentLoader
def productElementNames: Iterator[String]

Attributes

Inherited from:
Product
def productIterator: Iterator[Any]

Attributes

Inherited from:
Product