KeywordIndex

org.llm4s.vectorstore.KeywordIndex
See the KeywordIndex companion object
trait KeywordIndex

Abstract interface for keyword-based document indexing and search.

Implementations use BM25 (Best Matching 25) scoring for relevance ranking. BM25 considers term frequency, document length, and inverse document frequency.
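
To make the three ingredients concrete, here is a minimal sketch of the standard Okapi BM25 formula in Scala. It is not this library's implementation: the object name `Bm25Sketch`, the whitespace tokenizer, and the constants `k1 = 1.2` and `b = 0.75` (commonly used defaults) are all assumptions made for illustration.

```scala
// Illustrative sketch only; not the library's BM25 implementation.
object Bm25Sketch {
  // Conventional BM25 tuning constants (assumed defaults).
  val k1 = 1.2
  val b  = 0.75

  // Naive tokenizer: lowercase, split on non-word characters.
  def tokenize(text: String): Seq[String] =
    text.toLowerCase.split("\\W+").filter(_.nonEmpty).toSeq

  // Returns one BM25 score per document, in the same order as `docs`.
  def score(query: String, docs: Seq[String]): Seq[Double] = {
    val tokenized = docs.map(tokenize)
    val n         = tokenized.size.toDouble
    val avgLen    = if (tokenized.isEmpty) 0.0 else tokenized.map(_.size).sum / n
    val terms     = tokenize(query).distinct

    tokenized.map { doc =>
      terms.map { term =>
        val tf  = doc.count(_ == term).toDouble               // term frequency
        val df  = tokenized.count(_.contains(term)).toDouble  // document frequency
        val idf = math.log(1.0 + (n - df + 0.5) / (df + 0.5)) // inverse document frequency
        // Length normalisation: longer-than-average documents are penalised.
        val norm = tf + k1 * (1.0 - b + b * doc.size / avgLen)
        if (tf == 0.0) 0.0 else idf * tf * (k1 + 1.0) / norm
      }.sum
    }
  }
}
```

In this sketch a document that contains none of the query terms scores exactly 0, and a rarer term contributes more than a common one because of the IDF factor.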

This trait is designed to complement VectorStore for hybrid search scenarios:

  • VectorStore: Semantic similarity via embeddings
  • KeywordIndex: Exact/partial term matching via BM25

The two can be combined using score fusion (RRF or weighted) for hybrid search.
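
As an illustration of score fusion, the following sketch implements reciprocal rank fusion (RRF) over plain lists of document IDs. It is a sketch under stated assumptions, not the library's fusion code: the name `RrfSketch.fuse` is hypothetical, and `k = 60` is the constant commonly used with RRF rather than a value taken from this page.

```scala
// Illustrative sketch only; names and the default k are assumptions.
object RrfSketch {
  // Each inner Seq is one ranking (best first), e.g. one from a vector
  // store and one from a keyword index. Returns IDs with fused scores,
  // highest first.
  def fuse(rankings: Seq[Seq[String]], k: Int = 60): Seq[(String, Double)] = {
    val contributions = for {
      ranking   <- rankings
      (id, idx) <- ranking.zipWithIndex
    } yield (id, 1.0 / (k + idx + 1)) // idx is 0-based, RRF ranks are 1-based

    contributions
      .groupBy(_._1)
      .map { case (id, hits) => (id, hits.map(_._2).sum) }
      .toSeq
      .sortBy { case (id, score) => (-score, id) } // ties broken by ID
  }
}
```

A document that appears in both rankings accumulates two contributions, so it tends to outrank a document that appears high in only one of them.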

Attributes

Companion
object
Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Abstract methods

def clear(): Result[Unit]

Clear all indexed documents.

Returns

Unit on success, or error

def close(): Unit

Close the index and release resources.

def count(): Result[Long]

Count total indexed documents.

Returns

Document count

def delete(id: String): Result[Unit]

Delete a document by ID.

Value parameters

id

Document ID

Returns

Unit on success, or error

def deleteBatch(ids: Seq[String]): Result[Unit]

Delete multiple documents.

Value parameters

ids

Document IDs to delete

Returns

Unit on success, or error

def deleteByPrefix(prefix: String): Result[Long]

Delete all documents with IDs starting with the given prefix.

Value parameters

prefix

The ID prefix to match

Returns

Number of documents deleted
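
The prefix semantics can be sketched with an in-memory map. This is only an illustration: the class `PrefixDeleteSketch`, the `docN#M` ID scheme, and the use of `Either[String, Long]` as a stand-in for the library's `Result[Long]` are all assumptions.

```scala
import scala.collection.mutable

// Hypothetical stand-in for an index; only IDs and raw content are tracked.
final class PrefixDeleteSketch(initial: Seq[(String, String)]) {
  private val store = mutable.Map.empty[String, String]
  store ++= initial

  // IDs currently held by the sketch.
  def ids: Set[String] = store.keySet.toSet

  // Removes every entry whose ID starts with `prefix` and reports how many
  // entries were removed. Either stands in for the library's Result type.
  def deleteByPrefix(prefix: String): Either[String, Long] = {
    val matching = store.keys.filter(_.startsWith(prefix)).toList
    matching.foreach(id => store.remove(id))
    Right(matching.size.toLong)
  }
}
```

In this sketch a prefix that matches nothing is not an error and yields a count of 0, while an empty prefix matches every ID. Whether real implementations behave the same way is not stated here.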

def get(id: String): Result[Option[KeywordDocument]]

Get a document by ID.

Value parameters

id

Document ID

Returns

Document if found, None if not found, or error

def index(doc: KeywordDocument): Result[Unit]

Index a single document.

Value parameters

doc

Document to index

Returns

Unit on success, or error

def indexBatch(docs: Seq[KeywordDocument]): Result[Unit]

Index multiple documents in batch.

Value parameters

docs

Documents to index

Returns

Unit on success, or error

def search(query: String, topK: Int, filter: Option[MetadataFilter]): Result[Seq[KeywordSearchResult]]

Search for documents matching a query.

Uses BM25 scoring for relevance ranking.

Value parameters

filter

Optional metadata filter

query

Search query; which query operators are supported depends on the implementation

topK

Maximum number of results to return

Returns

Ranked search results, or error

def searchWithHighlights(query: String, topK: Int, snippetLength: Int, filter: Option[MetadataFilter]): Result[Seq[KeywordSearchResult]]

Search with highlighted snippets.

Value parameters

filter

Optional metadata filter

query

Search query

snippetLength

Target length for highlight snippets

topK

Maximum number of results

Returns

Results with highlighted matches
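
One way to picture what a highlighted snippet could look like is the sketch below. It is not the library's highlighter: the helper `HighlightSketch.snippet`, the single-term matching, and the `<b>…</b>` markers are assumptions chosen for illustration. It cuts a window of roughly `snippetLength` characters around the first match and marks every occurrence of the term inside that window.

```scala
import java.util.regex.Pattern

// Illustrative sketch only; the marker format and windowing are assumptions.
object HighlightSketch {
  def snippet(content: String, term: String, snippetLength: Int): Option[String] =
    if (term.isEmpty) None
    else {
      // Case-insensitive literal match of the term.
      val pattern = Pattern.compile(Pattern.quote(term), Pattern.CASE_INSENSITIVE)
      val m       = pattern.matcher(content)
      if (!m.find()) None
      else {
        // Centre the window on the first match, clamped to the content bounds.
        val pad    = math.max(0, snippetLength - term.length) / 2
        val start  = math.max(0, m.start() - pad)
        val end    = math.min(content.length, start + math.max(snippetLength, term.length))
        val window = content.substring(start, end)
        // $0 refers to the whole match, so the original casing is preserved.
        Some(pattern.matcher(window).replaceAll("<b>$0</b>"))
      }
    }
}
```

The snippet length here is a target for the unmarked text, so the returned string is longer once the markers are added.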

Get index statistics.

Returns

Index statistics

Concrete methods

def update(doc: KeywordDocument): Result[Unit]

Update a document (re-index with new content).

Value parameters

doc

Updated document

Returns

Unit on success, or error
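
Because update is listed as a concrete method, it is presumably built from the abstract ones. The source does not show its body, so the sketch below is only one plausible shape: delete the old version, then index the new one. The types `Doc`, `MiniIndex`, and `InMemoryIndex` are hypothetical stand-ins, and `Either[String, A]` stands in for the library's `Result[A]`.

```scala
import scala.collection.mutable

// Illustrative sketch only; not the library's actual update implementation.
object UpdateSketch {
  type Result[A] = Either[String, A]                // stand-in for Result
  final case class Doc(id: String, content: String) // stand-in for KeywordDocument

  trait MiniIndex {
    def index(doc: Doc): Result[Unit]
    def delete(id: String): Result[Unit]

    // Concrete method: remove the old version, then index the new one.
    // The for-comprehension short-circuits on the first error.
    def update(doc: Doc): Result[Unit] =
      for {
        _ <- delete(doc.id)
        _ <- index(doc)
      } yield ()
  }

  // Minimal in-memory implementation used to exercise update.
  final class InMemoryIndex extends MiniIndex {
    private val docs = mutable.Map.empty[String, Doc]
    def index(doc: Doc): Result[Unit]    = { docs.update(doc.id, doc); Right(()) }
    def delete(id: String): Result[Unit] = { docs.remove(id); Right(()) }
    def get(id: String): Option[Doc]     = docs.get(id)
  }
}
```

With this shape, an implementation that can replace a document more efficiently is free to override update, while simple implementations get it without extra code.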