ContextManager

org.llm4s.context.ContextManager
See the ContextManager companion object
class ContextManager(tokenCounter: ConversationTokenCounter, config: ContextConfig, llmClient: Option[LLMClient], artifactStore: Option[ArtifactStore])

Orchestrates the 4-step context management pipeline for llm4s conversations.

ContextManager is the primary entry point for keeping a conversation within a model's token limit. Each step is applied in order of increasing cost; the pipeline exits early as soon as the conversation fits the requested budget.
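The early-exit behaviour described above can be sketched with toy types. This is only an illustration, not the library's implementation: `PipelineSketch`, `Step`, `countTokens`, and the three toy steps are hypothetical names, and tokens are approximated by whitespace-separated words.

```scala
object PipelineSketch {
  // Hypothetical stand-in: a conversation is just a list of message strings.
  type Conversation = List[String]

  // Rough token estimate: count whitespace-separated words (an assumption,
  // not how a real tokenizer works).
  def countTokens(conv: Conversation): Int =
    conv.map(_.split("\\s+").count(_.nonEmpty)).sum

  // A step has a name and a transformation that should shrink the conversation.
  final case class Step(name: String, run: Conversation => Conversation)

  // Apply steps in order. Once the conversation fits the budget, every
  // remaining step is skipped. Returns the result and the names of steps that ran.
  def runPipeline(
      steps: List[Step],
      conv: Conversation,
      budget: Int
  ): (Conversation, List[String]) =
    steps.foldLeft((conv, List.empty[String])) { case ((current, applied), step) =>
      if (countTokens(current) <= budget) (current, applied)
      else (step.run(current), applied :+ step.name)
    }

  // Toy steps ordered from least to most aggressive.
  val steps: List[Step] = List(
    Step("capLongMessages", _.map(m => m.split("\\s+").take(5).mkString(" "))),
    Step("dropOldest", _.drop(1)),
    Step("keepLastOnly", _.takeRight(1))
  )
}
```

With a generous budget no step runs at all; with a tight one, the later and more aggressive steps run only if the earlier ones did not shrink the conversation enough.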

==Compressor Comparison==

Strategy                | Cost                  | Quality | Latency | What it touches
------------------------|-----------------------|---------|---------|-------------------------
DeterministicCompressor | Free                  | Lower   | Fast    | Tool outputs only
HistoryCompressor       | Free                  | Medium  | Fast    | Older history → digest
LLMCompressor           | 1 LLM call per digest | High    | Slow    | Digest messages only

==4-Step Pipeline==

Each step exits immediately if the budget is already met:

  1. '''ToolDeterministicCompaction''' (DeterministicCompressor): Shrinks and caps tool outputs (JSON, logs, binary content) without modifying user or assistant messages. No API calls; always runs first.

  2. '''HistoryCompression''' (HistoryCompressor): Keeps the last config.maxSemanticBlocks semantic blocks verbatim and replaces older blocks with compact [HISTORY_SUMMARY] digests, capped to config.summaryTokenTarget tokens. No API calls.

  3. '''LLMHistorySqueeze''' (LLMCompressor): If still over budget and config.enableLLMCompression is true, compresses only the digest messages further via one LLM inference call per digest.

  4. '''FinalTokenTrim''' (TokenWindow): Hard-trims the conversation to budget tokens, applying the headroom set by config.headroomPercent. [HISTORY_SUMMARY] messages are always pinned, so they are never dropped.
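A minimal sketch of how a final trim with pinned summaries might work. It rests on several assumptions that the text above does not confirm: headroomPercent is treated as a fraction of the budget held in reserve, newer messages are preferred over older ones, and tokens are approximated by word count. `TrimSketch`, `effectiveBudget`, and `trim` are hypothetical names, not TokenWindow's API.

```scala
object TrimSketch {
  val SummaryTag = "[HISTORY_SUMMARY]"

  // Rough token estimate by word count (an assumption for illustration).
  def tokens(msg: String): Int = msg.split("\\s+").count(_.nonEmpty)

  // Assumption: headroomPercent is a fraction in [0, 1) reserved out of the budget.
  def effectiveBudget(budget: Int, headroomPercent: Double): Int =
    (budget * (1.0 - headroomPercent)).toInt

  // Keep every pinned summary, then keep the newest other messages that still fit.
  def trim(conv: List[String], budget: Int, headroomPercent: Double): List[String] = {
    val limit      = effectiveBudget(budget, headroomPercent)
    val pinnedCost = conv.filter(_.startsWith(SummaryTag)).map(tokens).sum

    // Walk from newest to oldest, collecting the indices of messages to keep.
    val (_, keptIdx) = conv.zipWithIndex.reverse
      .foldLeft((limit - pinnedCost, Set.empty[Int])) {
        case ((remaining, kept), (msg, i)) =>
          if (msg.startsWith(SummaryTag)) (remaining, kept + i) // pinned: always kept
          else if (tokens(msg) <= remaining) (remaining - tokens(msg), kept + i)
          else (-1, kept) // first miss: drop this and all older unpinned messages
      }

    // Restore the original order.
    conv.zipWithIndex.collect { case (msg, i) if keptIdx(i) => msg }
  }
}
```

Pinned messages are charged against the limit up front, so in this sketch they can crowd out ordinary messages but are never removed themselves.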

==Usage==

// Quick setup with defaults (??? is a placeholder; handle the failure case in real code):
val manager = ContextManager.withDefaults(tokenCounter).getOrElse(???)
val result  = manager.manageContext(conversation, budget = 8000)
result.foreach(managed => println(managed.summary))

// With an LLM client for Step 3:
val llmManager = ContextManager
  .create(tokenCounter, ContextConfig.default, Some(llmClient))
  .getOrElse(???)
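The `???` placeholders in the usage above throw `NotImplementedError` if construction fails. One way to avoid that is to chain the calls and handle the failure once at the end. The sketch below assumes only the Either-like shape suggested by `getOrElse` and `foreach`; `StubManager`, `Managed`, and the `String` error type are hypothetical stand-ins, not the real llm4s types.

```scala
object UsageSketch {
  // Hypothetical stand-ins for the real result and manager types.
  final case class Managed(summary: String)

  final class StubManager {
    def manageContext(conversation: List[String], budget: Int): Either[String, Managed] =
      if (budget <= 0) Left(s"invalid budget: $budget")
      else Right(Managed(s"${conversation.size} messages managed within $budget tokens"))
  }

  // Stand-in for a factory that can fail, like withDefaults/create in the usage above.
  def withDefaults(counterAvailable: Boolean): Either[String, StubManager] =
    if (counterAvailable) Right(new StubManager) else Left("no token counter")

  // Chain construction and use; the first failure short-circuits the rest.
  def run(counterAvailable: Boolean, budget: Int): String = {
    val outcome = for {
      manager <- withDefaults(counterAvailable)
      managed <- manager.manageContext(List("hi", "hello"), budget)
    } yield managed.summary

    // Handle both outcomes in one place instead of throwing.
    outcome.fold(err => s"failed: $err", identity)
  }
}
```

The for-comprehension keeps the happy path linear while leaving the decision about how to report errors to a single `fold` at the end.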

Value parameters

artifactStore

Optional store for externalized binary/large content from Step 1; defaults to an in-memory store if None

config

Pipeline configuration — controls headroom, semantic block count, and which steps are enabled

llmClient

Optional LLM client; required for Step 3 (LLMHistorySqueeze); Step 3 is skipped if None

tokenCounter

Token counter calibrated to the target model's tokenizer

Attributes

See also

DeterministicCompressor for Step 1 implementation

HistoryCompressor for Step 2 implementation

LLMCompressor for Step 3 implementation

TokenWindow for Step 4 implementation

ContextConfig for all configuration options

Companion
object
Supertypes
class Object
trait Matchable
class Any

Members list

Value members

Concrete methods

Apply 4-step context management pipeline to fit conversation within budget

Attributes