org.llm4s.vectorstore

Fusion strategy for combining vector and keyword search results.

Attributes

Companion: object
Supertypes: class Object, trait Matchable, class Any
Known subtypes: object KeywordOnly, class RRF, object VectorOnly, class WeightedScore

Attributes

Companion: trait
Supertypes: trait Sum, trait Mirror, class Object, trait Matchable, class Any
Self type: FusionStrategy.type
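
As an illustration of what an RRF strategy computes, here is a minimal, self-contained sketch of Reciprocal Rank Fusion. It is not the library's implementation: the object name `RrfSketch`, the `fuse` signature, and the constant `k = 60` (a commonly used value) are assumptions made for this example. Each ranked list contributes `1 / (k + rank)` for every document ID it contains, and the contributions are summed per ID.

```scala
object RrfSketch {
  /** Fuse several ranked ID lists (best first) with Reciprocal Rank Fusion.
    * Assumes each ID appears at most once per list. `k` dampens the
    * influence of top ranks; 60 is a common choice, not a library default.
    */
  def fuse(rankings: Seq[Seq[String]], k: Int = 60): Seq[(String, Double)] = {
    // One (id, contribution) pair per appearance; ranks are 1-based.
    val contributions = for {
      ranking     <- rankings
      (id, index) <- ranking.zipWithIndex
    } yield (id, 1.0 / (k + index + 1))

    contributions
      .groupMapReduce(_._1)(_._2)(_ + _)            // sum contributions per ID
      .toSeq
      .sortBy { case (id, score) => (-score, id) }  // highest score first, ties by ID
  }
}
```

Because only ranks are used, the vector scores and BM25 scores never need to be put on a common scale, which is the main reason RRF is a convenient default for hybrid search.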

Result from hybrid search combining vector and keyword results.

Value parameters

content: Document content
highlights: Keyword match highlights (if available)
id: Document ID
keywordScore: Original BM25 keyword score (if available)
metadata: Document metadata
score: Combined relevance score (higher is better)
vectorScore: Original vector similarity score (0-1, if available)

Attributes

Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Hybrid searcher combining vector similarity and keyword matching.

Provides unified search over both vector embeddings (semantic similarity) and keyword indexes (BM25 term matching). Results are fused using configurable strategies like RRF or weighted scoring.

Usage:

// embedding and queryEmbedding are assumed to be defined by the caller
for {
  vectorStore  <- VectorStoreFactory.inMemory()
  keywordIndex <- KeywordIndex.inMemory()
  searcher = HybridSearcher(vectorStore, keywordIndex)
  // Add documents to both stores
  _ <- vectorStore.upsert(VectorRecord("doc-1", embedding, Some("content")))
  _ <- keywordIndex.index(KeywordDocument("doc-1", "content"))
  // Search with hybrid fusion
  results <- searcher.search(queryEmbedding, "search terms", topK = 10)
} yield results

Attributes

Companion: object
Supertypes: class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: class Object, trait Matchable, class Any
Self type: HybridSearcher.type

Document to be indexed for keyword search.

Value parameters

content: Text content to index
id: Unique document identifier
metadata: Additional metadata (not indexed, but returned in results)

Attributes

Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Abstract interface for keyword-based document indexing and search.

Implementations use BM25 (Best Matching 25) scoring for relevance ranking. BM25 considers term frequency, document length, and inverse document frequency.
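
To make the three ingredients concrete, the following sketch scores a tiny in-memory corpus with textbook Okapi BM25. It is an illustration only, not how any KeywordIndex implementation computes scores: the tokenizer, the Lucene-style IDF formula, and the parameters `k1 = 1.2` and `b = 0.75` are assumptions, and `Bm25Sketch` is a hypothetical name.

```scala
object Bm25Sketch {
  /** Lowercase and split on non-word characters (a deliberately naive tokenizer). */
  def tokenize(text: String): Seq[String] =
    text.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  /** Score every document in `docs` (id -> text) against `query`. */
  def score(
    query: String,
    docs: Map[String, String],
    k1: Double = 1.2,
    b: Double = 0.75
  ): Map[String, Double] = {
    val tokenized = docs.map { case (id, text) => id -> tokenize(text) }
    val n         = tokenized.size
    val avgLen    = if (n == 0) 0.0 else tokenized.values.map(_.size).sum.toDouble / n
    val terms     = tokenize(query).distinct

    tokenized.map { case (id, tokens) =>
      // Document length relative to the corpus average (length normalization).
      val lengthRatio = if (avgLen == 0.0) 0.0 else tokens.size / avgLen
      val total = terms.map { term =>
        val tf  = tokens.count(_ == term).toDouble            // term frequency
        val df  = tokenized.values.count(_.contains(term))    // document frequency
        val idf = math.log(1.0 + (n - df + 0.5) / (df + 0.5)) // inverse document frequency
        idf * tf * (k1 + 1.0) / (tf + k1 * (1.0 - b + b * lengthRatio))
      }.sum
      id -> total
    }
  }
}
```

With `b > 0`, a term that occurs once in a short document scores higher than the same term occurring once in a long document, which is the length normalization described above.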

This trait is designed to complement VectorStore for hybrid search scenarios:

VectorStore: Semantic similarity via embeddings
KeywordIndex: Exact/partial term matching via BM25

The two can be combined using score fusion (RRF or weighted) for hybrid search.
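
Weighted fusion has to reconcile two different scales, since vector similarity and BM25 scores are not directly comparable. The sketch below shows one way to do that, under stated assumptions: min-max normalization per result list and a single `vectorWeight` parameter are choices made for this example, and `WeightedFusionSketch` is a hypothetical name rather than part of the library.

```scala
object WeightedFusionSketch {
  /** Rescale scores to [0, 1]; if all scores are equal, every entry maps to 1.0. */
  def minMax(scores: Map[String, Double]): Map[String, Double] =
    if (scores.isEmpty) scores
    else {
      val lo = scores.values.min
      val hi = scores.values.max
      scores.map { case (id, s) => id -> (if (hi == lo) 1.0 else (s - lo) / (hi - lo)) }
    }

  /** Combine normalized scores; an ID missing from one list contributes 0 for that list. */
  def fuse(
    vector: Map[String, Double],
    keyword: Map[String, Double],
    vectorWeight: Double = 0.5
  ): Seq[(String, Double)] = {
    require(vectorWeight >= 0.0 && vectorWeight <= 1.0, "vectorWeight must be in [0, 1]")
    val v  = minMax(vector)
    val kw = minMax(keyword)
    (v.keySet ++ kw.keySet).toSeq
      .map { id =>
        (id, vectorWeight * v.getOrElse(id, 0.0) + (1.0 - vectorWeight) * kw.getOrElse(id, 0.0))
      }
      .sortBy { case (id, score) => (-score, id) } // highest combined score first
  }
}
```

Unlike rank-based fusion, this approach preserves score gaps within each list, at the cost of being sensitive to outliers in either score distribution.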

Attributes

Companion: object
Supertypes: class Object, trait Matchable, class Any
Known subtypes: class PgKeywordIndex, class SQLiteKeywordIndex

Factory for creating KeywordIndex instances.

Attributes

Companion: trait
Supertypes: class Object, trait Matchable, class Any
Self type: KeywordIndex.type

Statistics about the keyword index.

Value parameters

avgDocumentLength: Average document length in tokens
indexSizeBytes: Approximate index size on disk (if applicable)
totalDocuments: Number of indexed documents
totalTokens: Approximate total token count across all documents

Attributes

Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Result from a keyword search operation.

Value parameters

content: Document content
highlights: Optional highlighted snippets showing match context
id: Document ID
metadata: Document metadata
score: BM25 relevance score (higher is more relevant)

Attributes

Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Filter for metadata-based queries.

Attributes

Companion: object
Supertypes: class Object, trait Matchable, class Any
Known subtypes: object All, class And, class Contains, class Equals, class HasKey, class In, class Not, class Or

Attributes

Companion: trait
Supertypes: trait Sum, trait Mirror, class Object, trait Matchable, class Any
Self type: MetadataFilter.type
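
The subtype names above suggest a small algebra of composable predicates. The sketch below models such an algebra and evaluates it against string metadata. Only the case names are taken from this page; the field names, the binary shape of `And` and `Or`, the `Map[String, String]` metadata type, and the `matches` function are assumptions for illustration and may differ from the real MetadataFilter API.

```scala
object FilterSketch {
  sealed trait Filter
  case object All                                             extends Filter
  final case class Equals(key: String, value: String)         extends Filter
  final case class Contains(key: String, substring: String)   extends Filter
  final case class HasKey(key: String)                        extends Filter
  final case class In(key: String, values: Set[String])       extends Filter
  final case class And(left: Filter, right: Filter)           extends Filter
  final case class Or(left: Filter, right: Filter)            extends Filter
  final case class Not(inner: Filter)                         extends Filter

  /** Evaluate a filter tree against one record's metadata. */
  def matches(filter: Filter, metadata: Map[String, String]): Boolean = filter match {
    case All            => true
    case Equals(k, v)   => metadata.get(k).contains(v)
    case Contains(k, s) => metadata.get(k).exists(_.contains(s))
    case HasKey(k)      => metadata.contains(k)
    case In(k, vs)      => metadata.get(k).exists(vs.contains)
    case And(l, r)      => matches(l, metadata) && matches(r, metadata)
    case Or(l, r)       => matches(l, metadata) || matches(r, metadata)
    case Not(f)         => !matches(f, metadata)
  }
}
```

A backend such as PostgreSQL or Qdrant would more likely translate the same tree into its native query language than evaluate it in memory; the in-memory evaluator only shows the intended semantics.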

PostgreSQL-based keyword index implementation using native full-text search.

Uses PostgreSQL's tsvector/tsquery for efficient text indexing and ranking. Provides BM25-like scoring via ts_rank_cd (cover density ranking).

Requirements:

PostgreSQL 16+ (18+ recommended for best performance)

Features:

Native PostgreSQL full-text search with tsvector
ts_rank_cd scoring for relevance ranking
ts_headline for snippet highlighting
GIN indexing for fast full-text lookups
JSONB metadata storage with GIN index
Connection pooling via HikariCP

Query syntax (via websearch_to_tsquery):

"hello world" - documents containing both terms
"hello OR world" - documents containing either term
"-hello" - exclude documents with hello
"\"hello world\"" - exact phrase match (the query itself contains the double quotes)

Value parameters

dataSource: HikariCP data source for connection pooling
language: PostgreSQL text search configuration (default: "english")
ownsDataSource: Whether to close dataSource on close()
tableName: Base table name (creates {tableName}_keyword table)

Attributes

Companion: object
Supertypes: trait KeywordIndex, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: class Object, trait Matchable, class Any
Self type: PgKeywordIndex.type

PostgreSQL + pgvector implementation of VectorStore.

Uses pgvector extension for efficient vector similarity search with support for IVFFlat and HNSW indexes.

Features:

Hardware-accelerated vector operations
HNSW indexing for fast approximate nearest neighbor search
Connection pooling via HikariCP
ACID transactions
Scalable to millions of vectors

Requirements:

PostgreSQL 16+ with pgvector extension (18+ recommended)
Run: CREATE EXTENSION IF NOT EXISTS vector;

Value parameters

dataSource: HikariCP connection pool
ownsDataSource: Whether to close dataSource on close() (default: true)
tableName: Name of the vectors table

Attributes

Companion: object
Supertypes: trait VectorStore, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: class Object, trait Matchable, class Any
Self type: PgVectorStore.type

Qdrant vector database implementation of VectorStore.

Uses Qdrant's REST API for vector similarity search with support for filtering, payload storage, and multiple distance metrics.

Features:

Cloud-native architecture with horizontal scaling
HNSW indexing for fast approximate nearest neighbor search
Rich filtering on payload (metadata) fields
Multiple distance metrics (Cosine, Euclid, Dot)
Snapshot and backup capabilities

Requirements:

Qdrant server running (docker or cloud)
REST API enabled (default port 6333)

Value parameters

apiKey: Optional API key for authentication
baseUrl: Base URL for Qdrant API (e.g., "http://localhost:6333")
collectionName: Name of the collection to use

Attributes

Companion: object
Supertypes: trait VectorStore, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: class Object, trait Matchable, class Any
Self type: QdrantVectorStore.type

SQLite FTS5-based keyword index implementation.

Uses SQLite's Full-Text Search 5 extension with BM25 scoring. FTS5 provides efficient text indexing and ranking capabilities.

Features:

BM25 relevance scoring
Snippet highlighting
Boolean query operators (AND, OR, NOT)
Phrase matching with quotes
Prefix matching with *

Query syntax examples:

"hello world" - documents containing both terms
"hello OR world" - documents containing either term
"hello NOT world" - documents with hello but not world
"\"hello world\"" - exact phrase match (the query itself contains the double quotes)
"hello*" - prefix match

Attributes

Companion: object
Supertypes: trait KeywordIndex, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: class Object, trait Matchable, class Any
Self type: SQLiteKeywordIndex.type

SQLite-based vector store implementation.

Uses SQLite for storage with in-memory cosine similarity computation. Suitable for development, testing, and small-to-medium datasets (up to ~100K vectors depending on dimensions).

Features:

File-based or in-memory storage
FTS5 full-text search fallback
ACID transactions
No external dependencies beyond SQLite

Limitations:

Vector similarity computed in Scala (not accelerated)
All embeddings loaded into memory for search
No built-in sharding or replication

For production with larger datasets, consider pgvector or Qdrant.
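
The limitations above follow from brute-force search: every stored embedding is compared with the query. The sketch below illustrates that approach with cosine similarity and a full sort. It is a simplified stand-in, not the SQLiteVectorStore code; `CosineSketch`, `cosine`, and `topK` are hypothetical names, and the choice of `Array[Float]` for embeddings is an assumption.

```scala
object CosineSketch {
  /** Cosine similarity of two equal-length vectors; 0.0 if either has zero norm. */
  def cosine(a: Array[Float], b: Array[Float]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    var dot   = 0.0
    var normA = 0.0
    var normB = 0.0
    var i     = 0
    while (i < a.length) {
      dot   += a(i).toDouble * b(i)
      normA += a(i).toDouble * a(i)
      normB += b(i).toDouble * b(i)
      i += 1
    }
    if (normA == 0.0 || normB == 0.0) 0.0
    else dot / (math.sqrt(normA) * math.sqrt(normB))
  }

  /** Score every record against the query and keep the k most similar. */
  def topK(query: Array[Float], records: Seq[(String, Array[Float])], k: Int): Seq[(String, Double)] =
    records
      .map { case (id, embedding) => (id, cosine(query, embedding)) }
      .sortBy(-_._2)
      .take(k)
}
```

Cost grows linearly with both the number of records and the embedding dimension, which is why an approximate index such as HNSW becomes attractive for larger collections.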

Value parameters

connection: The database connection
dbPath: Path to SQLite database file

Attributes

Companion: object
Supertypes: trait VectorStore, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: class Object, trait Matchable, class Any
Self type: SQLiteVectorStore.type

A record with its similarity score from a search.

Value parameters

record: The vector record
score: Similarity score (0.0 to 1.0, higher is more similar)

Attributes

Companion: object
Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: trait Product, trait Mirror, class Object, trait Matchable, class Any
Self type: ScoredRecord.type

A record stored in the vector store.

Value parameters

content: Optional text content (for display/debugging)
embedding: The vector embedding
id: Unique identifier for the record
metadata: Key-value metadata for filtering

Attributes

Companion: object
Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any

Attributes

Companion: class
Supertypes: trait Product, trait Mirror, class Object, trait Matchable, class Any
Self type: VectorRecord.type

Low-level vector storage abstraction for RAG and semantic search.

VectorStore provides a backend-agnostic interface for storing and searching vector embeddings. Implementations can be SQLite, pgvector, Qdrant, Milvus, Pinecone, or any other vector database.

This is the foundation layer - higher-level abstractions like MemoryStore can build on top of VectorStore for domain-specific functionality.

Key design principles:

Backend-agnostic: Same interface for all vector databases
Minimal API: Focus on core vector operations
Composable: Can be wrapped with additional functionality
Type-safe: Uses Result[A] for error handling
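
To show how Result-style error handling composes, the sketch below assumes that `Result[A]` behaves like an `Either` with an error on the left. Everything in it is hypothetical and defined locally: the `StoreError` hierarchy, the `TinyStore` class, and the `roundTrip` helper are stand-ins for this example, not llm4s types.

```scala
object ResultSketch {
  sealed trait StoreError { def message: String }
  final case class NotFound(message: String)     extends StoreError
  final case class InvalidInput(message: String) extends StoreError

  // Assumption: an Either-based alias, errors on the left, values on the right.
  type Result[+A] = Either[StoreError, A]

  /** A toy in-memory store whose operations never throw. */
  final class TinyStore {
    private var records = Map.empty[String, Vector[Float]]

    def upsert(id: String, embedding: Vector[Float]): Result[Unit] =
      if (embedding.isEmpty) Left(InvalidInput(s"empty embedding for $id"))
      else {
        records = records.updated(id, embedding)
        Right(())
      }

    def get(id: String): Result[Vector[Float]] =
      records.get(id).toRight(NotFound(s"no record with id $id"))
  }

  /** Chain two operations; the first Left short-circuits the rest. */
  def roundTrip(store: TinyStore, id: String, embedding: Vector[Float]): Result[Int] =
    for {
      _      <- store.upsert(id, embedding)
      stored <- store.get(id)
    } yield stored.length
}
```

This is the same for-comprehension shape as the HybridSearcher usage example: each step runs only if the previous one succeeded, and the caller handles a single error value at the end.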

Attributes

Supertypes: class Object, trait Matchable, class Any
Known subtypes: class PgVectorStore, class QdrantVectorStore, class SQLiteVectorStore

Factory for creating VectorStore instances.

Supports creating stores from configuration or explicit parameters. Backend selection is based on the provider name.

Currently supported backends:

"sqlite" - SQLite-based storage (default)
"pgvector" - PostgreSQL with pgvector extension
"qdrant" - Qdrant vector database

Future backends (roadmap):

"milvus" - Milvus vector database
"pinecone" - Pinecone cloud service

Attributes

Supertypes: class Object, trait Matchable, class Any
Self type: VectorStoreFactory.type

Statistics about a vector store.

Value parameters

dimensions: Set of embedding dimensions in the store
sizeBytes: Approximate size in bytes (if available)
totalRecords: Total number of records

Attributes

Supertypes: trait Serializable, trait Product, trait Equals, class Object, trait Matchable, class Any
