org.llm4s.vectorstore

Members list

Type members

Classlikes

sealed trait FusionStrategy

Fusion strategy for combining vector and keyword search results.

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
object KeywordOnly
class RRF
object VectorOnly
object FusionStrategy

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type
FusionStrategy.type
final case class HybridSearchResult(id: String, content: String, score: Double, vectorScore: Option[Double], keywordScore: Option[Double], metadata: Map[String, String], highlights: Seq[String])

Result from hybrid search combining vector and keyword results.

Value parameters

content

Document content

highlights

Keyword match highlights (if available)

id

Document ID

keywordScore

Original BM25 keyword score (if available)

metadata

Document metadata

score

Combined relevance score (higher is better)

vectorScore

Original vector similarity score (0-1, if available)

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
final class HybridSearcher

Hybrid searcher combining vector similarity and keyword matching.

Provides unified search over both vector embeddings (semantic similarity) and keyword indexes (BM25 term matching). Results from the two sources are fused using a configurable strategy, such as Reciprocal Rank Fusion (RRF) or weighted scoring.

Usage:

// `embedding` and `queryEmbedding` are embedding vectors assumed to be in scope
for {
  vectorStore  <- VectorStoreFactory.inMemory()
  keywordIndex <- KeywordIndex.inMemory()
  searcher      = HybridSearcher(vectorStore, keywordIndex)
  // Add each document to both stores under the same ID
  _ <- vectorStore.upsert(VectorRecord("doc-1", embedding, Some("content")))
  _ <- keywordIndex.index(KeywordDocument("doc-1", "content"))
  // Search with hybrid fusion
  results <- searcher.search(queryEmbedding, "search terms", topK = 10)
} yield results
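
To make the fusion step more concrete, Reciprocal Rank Fusion can be sketched as below. This is a minimal illustration under stated assumptions, not the library's implementation: the names `RrfSketch` and `fuse` are hypothetical, the inputs are plain ranked lists of document IDs, and the smoothing constant `k = 60` is a commonly used default rather than a value taken from this API.

```scala
// Hypothetical sketch of Reciprocal Rank Fusion (RRF); names are illustrative only.
object RrfSketch {

  /** Fuses ranked ID lists. Each list contributes 1 / (k + rank) per document,
    * where rank starts at 1 for the best hit in that list.
    */
  def fuse(rankings: Seq[Seq[String]], k: Int = 60): Seq[(String, Double)] = {
    val contributions = for {
      ranking     <- rankings
      (id, index) <- ranking.zipWithIndex
    } yield (id, 1.0 / (k + index + 1))

    contributions
      .groupMapReduce(_._1)(_._2)(_ + _) // sum the contributions per document ID
      .toSeq
      // highest fused score first; ties are broken by ID for a stable order
      .sortWith((a, b) => a._2 > b._2 || (a._2 == b._2 && a._1 < b._1))
  }
}
```

A call such as `RrfSketch.fuse(Seq(vectorIds, keywordIds))` would rank a document that appears in both lists above one that appears in only one of them at a similar position, which is the behavior hybrid search relies on.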

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
object HybridSearcher

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
HybridSearcher.type
final case class KeywordDocument(id: String, content: String, metadata: Map[String, String])

Document to be indexed for keyword search.

Value parameters

content

Text content to index

id

Unique document identifier

metadata

Additional metadata (not indexed, but returned in results)

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
trait KeywordIndex

Abstract interface for keyword-based document indexing and search.

Implementations rank results with BM25 (Best Matching 25) scoring or a BM25-like equivalent, depending on the backend. BM25 weighs three signals: term frequency, document length, and inverse document frequency.

This trait is designed to complement VectorStore for hybrid search scenarios:

  • VectorStore: Semantic similarity via embeddings
  • KeywordIndex: Exact/partial term matching via BM25

The two can be combined using score fusion (RRF or weighted) for hybrid search.
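
The three BM25 signals mentioned above can be illustrated with a small scorer. This is a sketch of the standard BM25 formula over pre-tokenized documents, not the scoring code of any backend in this package; `Bm25Sketch`, `score`, and the defaults `k1 = 1.2` and `b = 0.75` are assumptions chosen for illustration.

```scala
// Hypothetical BM25 sketch over pre-tokenized documents; not the library's scorer.
object Bm25Sketch {

  /** Returns one BM25 score per document for the given query terms. */
  def score(
      docs: Seq[Seq[String]],
      query: Seq[String],
      k1: Double = 1.2,
      b: Double = 0.75
  ): Seq[Double] = {
    val n      = docs.size.toDouble
    val avgLen = if (docs.isEmpty) 0.0 else docs.map(_.size).sum / n

    // Inverse document frequency: terms found in fewer documents weigh more.
    def idf(term: String): Double = {
      val containing = docs.count(_.contains(term)).toDouble
      math.log(1.0 + (n - containing + 0.5) / (containing + 0.5))
    }

    docs.map { doc =>
      query.distinct.map { term =>
        val tf = doc.count(_ == term).toDouble // term frequency in this document
        // Length normalization: longer-than-average documents are penalized.
        val lengthNorm = if (avgLen == 0.0) 1.0 else 1.0 - b + b * doc.size / avgLen
        idf(term) * tf * (k1 + 1.0) / (tf + k1 * lengthNorm)
      }.sum
    }
  }
}
```

Documents that do not contain any query term score 0.0, and a term that occurs in nearly every document contributes less than a rare one.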

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
class PgKeywordIndex
class SQLiteKeywordIndex
object KeywordIndex

Factory for creating KeywordIndex instances.

Attributes

Companion
trait
Supertypes
class Object
trait Matchable
class Any
Self type
KeywordIndex.type
final case class KeywordIndexStats(totalDocuments: Long, totalTokens: Option[Long], avgDocumentLength: Option[Double], indexSizeBytes: Option[Long])

Statistics about the keyword index.

Value parameters

avgDocumentLength

Average document length in tokens

indexSizeBytes

Approximate index size on disk (if applicable)

totalDocuments

Number of indexed documents

totalTokens

Approximate total token count across all documents

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
final case class KeywordSearchResult(id: String, content: String, score: Double, metadata: Map[String, String], highlights: Seq[String])

Result from a keyword search operation.

Value parameters

content

Document content

highlights

Optional highlighted snippets showing match context

id

Document ID

metadata

Document metadata

score

BM25 relevance score (higher is more relevant)

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
sealed trait MetadataFilter

Filter for metadata-based queries.

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any
Known subtypes
object All
class And
class Contains
class Equals
class HasKey
class In
class Not
class Or
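
The subtype names listed above suggest a small boolean algebra over metadata maps. The sketch below shows how such a filter could be evaluated against a `Map[String, String]`. It is a standalone illustration: `FilterSketch`, the constructor fields, and `matches` are all assumptions, and the real `MetadataFilter` constructors may take different parameters.

```scala
// Hypothetical filter ADT mirroring the subtype names above; all fields are assumed.
object FilterSketch {
  sealed trait Filter
  case object All                                        extends Filter
  final case class Equals(key: String, value: String)    extends Filter
  final case class Contains(key: String, part: String)   extends Filter
  final case class HasKey(key: String)                   extends Filter
  final case class In(key: String, values: Set[String])  extends Filter
  final case class And(left: Filter, right: Filter)      extends Filter
  final case class Or(left: Filter, right: Filter)       extends Filter
  final case class Not(inner: Filter)                    extends Filter

  /** Evaluates a filter against one record's metadata. */
  def matches(filter: Filter, metadata: Map[String, String]): Boolean =
    filter match {
      case All                  => true // matches every record
      case Equals(key, value)   => metadata.get(key).contains(value)
      case Contains(key, part)  => metadata.get(key).exists(_.contains(part))
      case HasKey(key)          => metadata.contains(key)
      case In(key, values)      => metadata.get(key).exists(values.contains)
      case And(left, right)     => matches(left, metadata) && matches(right, metadata)
      case Or(left, right)      => matches(left, metadata) || matches(right, metadata)
      case Not(inner)           => !matches(inner, metadata)
    }
}
```

Because the trait is sealed, the compiler can check that the pattern match covers every filter shape, which is one reason to model filters as an ADT rather than as raw query strings.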
object MetadataFilter

Attributes

Companion
trait
Supertypes
trait Sum
trait Mirror
class Object
trait Matchable
class Any
Self type
MetadataFilter.type
final class PgKeywordIndex extends KeywordIndex

PostgreSQL-based keyword index implementation using native full-text search.

Uses PostgreSQL's tsvector/tsquery for efficient text indexing and ranking. Provides BM25-like scoring via ts_rank_cd (cover density ranking).

Requirements:

  • PostgreSQL 16+ (18+ recommended for best performance)

Features:

  • Native PostgreSQL full-text search with tsvector
  • ts_rank_cd scoring for relevance ranking
  • ts_headline for snippet highlighting
  • GIN indexing for fast full-text lookups
  • JSONB metadata storage with GIN index
  • Connection pooling via HikariCP

Query syntax (via websearch_to_tsquery):

  • "hello world" - documents containing both terms
  • "hello OR world" - documents containing either term
  • "-hello" - exclude documents with hello
  • "\"hello world\"" - exact phrase match (the phrase itself is wrapped in double quotes)

Value parameters

dataSource

HikariCP data source for connection pooling

language

PostgreSQL text search configuration (default: "english")

ownsDataSource

Whether to close dataSource on close()

tableName

Base table name (creates {tableName}_keyword table)

Attributes

Companion
object
Supertypes
trait KeywordIndex
class Object
trait Matchable
class Any
object PgKeywordIndex

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
PgKeywordIndex.type
final class PgVectorStore extends VectorStore

PostgreSQL + pgvector implementation of VectorStore.

Uses pgvector extension for efficient vector similarity search with support for IVFFlat and HNSW indexes.

Features:

  • Hardware-accelerated vector operations
  • HNSW indexing for fast approximate nearest neighbor search
  • Connection pooling via HikariCP
  • ACID transactions
  • Scalable to millions of vectors

Requirements:

  • PostgreSQL 16+ with pgvector extension (18+ recommended)
  • Run: CREATE EXTENSION IF NOT EXISTS vector;

Value parameters

dataSource

HikariCP connection pool

ownsDataSource

Whether to close dataSource on close() (default: true)

tableName

Name of the vectors table

Attributes

Companion
object
Supertypes
trait VectorStore
class Object
trait Matchable
class Any
object PgVectorStore

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
PgVectorStore.type
final class QdrantVectorStore extends VectorStore

Qdrant vector database implementation of VectorStore.

Uses Qdrant's REST API for vector similarity search with support for filtering, payload storage, and multiple distance metrics.

Features:

  • Cloud-native architecture with horizontal scaling
  • HNSW indexing for fast approximate nearest neighbor search
  • Rich filtering on payload (metadata) fields
  • Multiple distance metrics (Cosine, Euclid, Dot)
  • Snapshot and backup capabilities

Requirements:

  • Qdrant server running (docker or cloud)
  • REST API enabled (default port 6333)

Value parameters

apiKey

Optional API key for authentication

baseUrl

Base URL for Qdrant API (e.g., "http://localhost:6333")

collectionName

Name of the collection to use

Attributes

Companion
object
Supertypes
trait VectorStore
class Object
trait Matchable
class Any
object QdrantVectorStore

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
QdrantVectorStore.type
final class SQLiteKeywordIndex extends KeywordIndex

SQLite FTS5-based keyword index implementation.

Uses SQLite's Full-Text Search 5 extension with BM25 scoring. FTS5 provides efficient text indexing and ranking capabilities.

Features:

  • BM25 relevance scoring
  • Snippet highlighting
  • Boolean query operators (AND, OR, NOT)
  • Phrase matching with quotes
  • Prefix matching with *

Query syntax examples:

  • "hello world" - documents containing both terms
  • "hello OR world" - documents containing either term
  • "hello NOT world" - documents with hello but not world
  • "\"hello world\"" - exact phrase match (the phrase itself is wrapped in double quotes)
  • "hello*" - prefix match

Attributes

Companion
object
Supertypes
trait KeywordIndex
class Object
trait Matchable
class Any
object SQLiteKeywordIndex

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
SQLiteKeywordIndex.type
final class SQLiteVectorStore extends VectorStore

SQLite-based vector store implementation.

Uses SQLite for storage with in-memory cosine similarity computation. Suitable for development, testing, and small-to-medium datasets (up to ~100K vectors depending on dimensions).

Features:

  • File-based or in-memory storage
  • FTS5 full-text search fallback
  • ACID transactions
  • No external dependencies beyond SQLite

Limitations:

  • Vector similarity computed in Scala (not accelerated)
  • All embeddings loaded into memory for search
  • No built-in sharding or replication

For production with larger datasets, consider pgvector or Qdrant.
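
The in-memory search described above amounts to a brute-force scan: compute the cosine similarity between the query and every stored embedding, then keep the best `k`. The following sketch illustrates that idea only; `CosineSketch`, `cosine`, and `topK` are hypothetical names, and the code returns the raw cosine value in [-1, 1] without any rescaling the store itself may apply.

```scala
// Hypothetical brute-force cosine search; names are illustrative, not the library's API.
object CosineSketch {

  /** Cosine similarity of two equal-length embeddings; 0.0 if either is all zeros. */
  def cosine(a: Array[Float], b: Array[Float]): Double = {
    require(a.length == b.length, "embeddings must have the same dimension")
    var dot   = 0.0
    var normA = 0.0
    var normB = 0.0
    var i     = 0
    while (i < a.length) {
      dot += a(i).toDouble * b(i)
      normA += a(i).toDouble * a(i)
      normB += b(i).toDouble * b(i)
      i += 1
    }
    if (normA == 0.0 || normB == 0.0) 0.0
    else dot / (math.sqrt(normA) * math.sqrt(normB))
  }

  /** Scores every record against the query and keeps the k most similar. */
  def topK(
      query: Array[Float],
      records: Seq[(String, Array[Float])],
      k: Int
  ): Seq[(String, Double)] =
    records
      .map { case (id, embedding) => (id, cosine(query, embedding)) }
      .sortWith(_._2 > _._2) // most similar first
      .take(k)
}
```

The cost of each query grows linearly with the number of stored vectors and their dimension, which is why this approach suits small-to-medium datasets and why indexed backends are suggested for larger ones.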

Value parameters

connection

The database connection

dbPath

Path to SQLite database file

Attributes

Companion
object
Supertypes
trait VectorStore
class Object
trait Matchable
class Any
object SQLiteVectorStore

Attributes

Companion
class
Supertypes
class Object
trait Matchable
class Any
Self type
SQLiteVectorStore.type
final case class ScoredRecord(record: VectorRecord, score: Double)

A record with its similarity score from a search.

Value parameters

record

The vector record

score

Similarity score (0.0 to 1.0, higher is more similar)

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
object ScoredRecord

Attributes

Companion
class
Supertypes
trait Product
trait Mirror
class Object
trait Matchable
class Any
Self type
ScoredRecord.type
final case class VectorRecord(id: String, embedding: Array[Float], content: Option[String], metadata: Map[String, String])

A record stored in the vector store.

Value parameters

content

Optional text content (for display/debugging)

embedding

The vector embedding

id

Unique identifier for the record

metadata

Key-value metadata for filtering

Attributes

Companion
object
Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any
object VectorRecord

Attributes

Companion
class
Supertypes
trait Product
trait Mirror
class Object
trait Matchable
class Any
Self type
VectorRecord.type
trait VectorStore

Low-level vector storage abstraction for RAG and semantic search.

VectorStore provides a backend-agnostic interface for storing and searching vector embeddings. Implementations can be SQLite, pgvector, Qdrant, Milvus, Pinecone, or any other vector database.

This is the foundation layer - higher-level abstractions like MemoryStore can build on top of VectorStore for domain-specific functionality.

Key design principles:

  • Backend-agnostic: Same interface for all vector databases
  • Minimal API: Focus on core vector operations
  • Composable: Can be wrapped with additional functionality
  • Type-safe: Uses Result[A] for error handling

Attributes

Supertypes
class Object
trait Matchable
class Any
Known subtypes
class PgVectorStore
class QdrantVectorStore
class SQLiteVectorStore
object VectorStoreFactory

Factory for creating VectorStore instances.

Supports creating stores from configuration or explicit parameters. Backend selection is based on the provider name.

Currently supported backends:

  • "sqlite" - SQLite-based storage (default)
  • "pgvector" - PostgreSQL with pgvector extension
  • "qdrant" - Qdrant vector database

Future backends (roadmap):

  • "milvus" - Milvus vector database
  • "pinecone" - Pinecone cloud service
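
Selecting a backend by provider name can be pictured as a simple pattern match that rejects unknown names instead of throwing. The sketch below is illustrative only: `BackendSelectionSketch`, `Backend`, and `select` are hypothetical, the use of `Either[String, _]` stands in for the library's own error type, and the trimming and lowercasing of the name is an assumption rather than documented behavior.

```scala
// Hypothetical provider-name dispatch; not the factory's real signature.
object BackendSelectionSketch {
  sealed trait Backend
  case object Sqlite   extends Backend
  case object PgVector extends Backend
  case object Qdrant   extends Backend

  /** Maps a provider name to a backend, or returns an error message. */
  def select(provider: String): Either[String, Backend] =
    provider.trim.toLowerCase match {
      case "sqlite"   => Right(Sqlite)
      case "pgvector" => Right(PgVector)
      case "qdrant"   => Right(Qdrant)
      // Roadmap names such as "milvus" or "pinecone" land here until supported.
      case other      => Left(s"Unsupported vector store provider: $other")
    }
}
```

Returning an error value for unsupported names keeps configuration mistakes visible to the caller without relying on exceptions.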

Attributes

Supertypes
class Object
trait Matchable
class Any
Self type
VectorStoreFactory.type
final case class VectorStoreStats(totalRecords: Long, dimensions: Set[Int], sizeBytes: Option[Long])

Statistics about a vector store.

Value parameters

dimensions

Set of embedding dimensions in the store

sizeBytes

Approximate size in bytes (if available)

totalRecords

Total number of records

Attributes

Supertypes
trait Serializable
trait Product
trait Equals
class Object
trait Matchable
class Any