Memory System

Persistent context and knowledge for agents across conversations.

Table of contents

  1. Overview
  2. Quick Start
  3. Memory Types
  4. Memory Manager
    1. Configuration Options
  5. Recording Memory
    1. User Facts
    2. Entity Knowledge
    3. External Knowledge
    4. Task Outcomes
    5. Conversation Messages
  6. Retrieving Memory
    1. Relevant Context
    2. Conversation History
    3. Entity Context
    4. User Context
  7. Memory Stores
    1. In-Memory Store
    2. SQLite Store
    3. Vector Store
  8. Memory with Agents
    1. Injecting Context
    2. Automatic Recording
  9. Semantic Search
    1. With Embeddings
    2. Search with Filters
  10. Memory Consolidation
  11. Entity Extraction
    1. Manual Extraction
  12. Memory Statistics
  13. Persistence Patterns
    1. Save and Load
    2. Cross-Session Memory
  14. Best Practices
    1. Set Appropriate Importance
    2. Use Specific Memory Types
    3. Manage Context Token Budget
    4. Clean Up Old Memories
  15. Examples
  16. Next Steps

Overview

The LLM4S Memory System provides:

  • Short-term memory - Conversation context within a session
  • Long-term memory - Persistent facts and knowledge
  • Semantic search - Find relevant context using embeddings
  • Entity tracking - Remember information about people, places, things
  • Multiple backends - In-memory, SQLite, vector stores

Quick Start

import org.llm4s.agent.memory._

// Create a memory manager
val result = for {
  manager <- SimpleMemoryManager.empty

  // Record user facts
  m1 <- manager.recordUserFact(
    content = "Prefers Scala over Java",
    userId = Some("user-123"),
    importance = Some(0.9)
  )

  // Record entity knowledge
  m2 <- m1.recordEntityFact(
    entityId = EntityId("anthropic"),
    content = "AI company that created Claude",
    importance = Some(0.8)
  )

  // Get relevant context for a query
  context <- m2.getRelevantContext("Tell me about Scala programming")
} yield context

result match {
  case Right(ctx) => println(s"Context: $ctx")
  case Left(err) => println(s"Error: $err")
}
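
The Quick Start chains each call in a for-comprehension and matches on Right and Left at the end, which suggests the result type behaves like Either. The following standalone sketch shows that pattern with plain scala.Either. TinyManager and its record method are hypothetical stand-ins, not LLM4S types; the point is only that each step works on the manager returned by the previous step and that the first failure ends the chain.

```scala
// Standalone sketch: TinyManager is a hypothetical stand-in, not an LLM4S type.
object QuickStartSketch {
  final case class TinyManager(facts: Vector[String]) {
    // Returns a new manager on success, or an error message on failure.
    def record(content: String): Either[String, TinyManager] =
      if (content.trim.isEmpty) Left("content must not be empty")
      else Right(copy(facts = facts :+ content))
  }

  // Each step uses the manager returned by the previous step.
  val ok: Either[String, Vector[String]] = for {
    m1 <- TinyManager(Vector.empty).record("Prefers Scala over Java")
    m2 <- m1.record("AI company that created Claude")
  } yield m2.facts

  // The first Left stops the chain; the second record call never runs.
  val failed: Either[String, Vector[String]] = for {
    m1 <- TinyManager(Vector.empty).record("")
    m2 <- m1.record("never recorded")
  } yield m2.facts
}
```

Because every recording call returns a new manager value, reusing an earlier variable (for example `manager` instead of `m1`) would silently drop the memories recorded in between.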

Memory Types

Type          Purpose                   Example
Conversation  Chat history              “User asked about weather”
UserFact      User preferences/info     “Prefers dark mode”
Entity        Knowledge about entities  “Paris is capital of France”
Knowledge     External knowledge        “Scala 3 released in 2021”
Task          Task outcomes             “Generated report successfully”
Custom        Application-specific      Any custom memory type

Memory Manager

The MemoryManager is the main interface for working with memory:

import org.llm4s.agent.memory._

// Create with default in-memory store
val manager = SimpleMemoryManager.empty

// Create with configuration
val configuredManager = SimpleMemoryManager(
  config = MemoryConfig(
    autoRecordMessages = true,
    autoExtractEntities = false,
    defaultImportance = 0.5,
    contextTokenBudget = 2000,
    consolidationEnabled = true
  ),
  store = new InMemoryStore()
)

Configuration Options

Option                Default  Description
autoRecordMessages    true     Automatically record conversation turns
autoExtractEntities   false    Extract entities from messages via LLM
defaultImportance     0.5      Default importance score (0-1)
contextTokenBudget    2000     Max tokens for context retrieval
consolidationEnabled  true     Enable memory consolidation
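
As an illustration of how these defaults and ranges fit together, here is a minimal sketch of a configuration value with the same option names and defaults as the table, plus a range check. SketchConfig and validate are hypothetical and are not the LLM4S MemoryConfig; whether the library validates these values itself is not stated here.

```scala
object ConfigSketch {
  // Hypothetical mirror of the options table; not the LLM4S MemoryConfig class.
  final case class SketchConfig(
    autoRecordMessages: Boolean = true,
    autoExtractEntities: Boolean = false,
    defaultImportance: Double = 0.5,
    contextTokenBudget: Int = 2000,
    consolidationEnabled: Boolean = true
  )

  // Collects every problem instead of stopping at the first one.
  def validate(c: SketchConfig): Either[List[String], SketchConfig] = {
    val errors = List(
      if (c.defaultImportance < 0.0 || c.defaultImportance > 1.0)
        Some("defaultImportance must be between 0 and 1")
      else None,
      if (c.contextTokenBudget <= 0)
        Some("contextTokenBudget must be positive")
      else None
    ).flatten
    if (errors.isEmpty) Right(c) else Left(errors)
  }
}
```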

Recording Memory

User Facts

// Record a user preference
val result = manager.recordUserFact(
  content = "Prefers functional programming",
  userId = Some("user-123"),
  importance = Some(0.9)
)

Entity Knowledge

// Record knowledge about an entity
val result = manager.recordEntityFact(
  entityId = EntityId("scala-lang"),
  content = "Scala is a JVM language combining OOP and FP",
  importance = Some(0.8)
)

External Knowledge

// Record external knowledge
val result = manager.recordKnowledge(
  content = "The latest LLM4S version is 0.5.0",
  source = Some("release-notes"),
  importance = Some(0.7)
)

Task Outcomes

// Record a task result
val result = manager.recordTask(
  content = "Successfully generated quarterly report",
  taskId = Some("task-456"),
  importance = Some(0.6)
)

Conversation Messages

import org.llm4s.llmconnect.model._

// Record a single conversation turn
val messageResult = manager.recordMessage(
  message = UserMessage("What's the weather in Paris?"),
  conversationId = Some(ConversationId("conv-789"))
)

// Record an entire conversation
// (`conversation` is a conversation value you already hold, such as state.conversation from an agent run)
val conversationResult = manager.recordConversation(
  conversation = conversation,
  conversationId = ConversationId("conv-789")
)

Retrieving Memory

Relevant Context

Get context relevant to a query using semantic search:

val context = manager.getRelevantContext(
  query = "Tell me about Scala programming",
  maxTokens = Some(1000),
  memoryTypes = Some(Set(MemoryType.UserFact, MemoryType.Knowledge))
)

Conversation History

val history = manager.getConversationContext(
  conversationId = ConversationId("conv-789"),
  limit = Some(10)
)

Entity Context

val entityInfo = manager.getEntityContext(
  entityId = EntityId("anthropic"),
  limit = Some(5)
)

User Context

val userInfo = manager.getUserContext(
  userId = "user-123",
  limit = Some(10)
)

Memory Stores

In-Memory Store

Fast but volatile; all data is lost on restart:

import org.llm4s.agent.memory.InMemoryStore

val store = new InMemoryStore()
val manager = SimpleMemoryManager(store = store)

SQLite Store

Persistent local storage:

import org.llm4s.agent.memory.SQLiteMemoryStore

// File-based (persistent)
val fileStore = SQLiteMemoryStore.file("/tmp/memory.db")

// In-memory SQLite (fast, volatile)
val inMemoryStore = SQLiteMemoryStore.inMemory()

// Pass whichever store you need to the manager
val manager = SimpleMemoryManager(store = fileStore)

Vector Store

Semantic search with embeddings:

import org.llm4s.agent.memory.VectorMemoryStore
import org.llm4s.agent.memory.EmbeddingService

// Create embedding service
val embeddingService = new EmbeddingService(embeddingClient)

// Create vector store
val store = new VectorMemoryStore(embeddingService)

val manager = SimpleMemoryManager(store = store)

Memory with Agents

Injecting Context

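// Assumes userQuery, tools, conversationId, and loadMemoryManager() are defined by your application
val result = for {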
  providerConfig <- Llm4sConfig.provider()
  client <- LLMConnect.getClient(providerConfig)
  agent = new Agent(client)

  // Get memory manager with existing data
  manager <- loadMemoryManager()

  // Get relevant context for the query
  context <- manager.getRelevantContext(userQuery)

  // Build enhanced system message with context
  systemMessage = s"""You are a helpful assistant.
    |
    |Relevant context from memory:
    |$context""".stripMargin

  // Run agent with context-enhanced system message
  state <- agent.run(
    query = userQuery,
    tools = tools,
    systemMessage = Some(SystemMessage(systemMessage))
  )

  // Record the conversation
  _ <- manager.recordConversation(state.conversation, conversationId)
} yield state

Automatic Recording

val manager = SimpleMemoryManager(
  config = MemoryConfig(
    autoRecordMessages = true  // Automatically record all messages
  )
)

// Messages are recorded automatically when using this manager

Semantic Search

With Embeddings

import org.llm4s.agent.memory._

// Setup embedding service
val embeddingService = new EmbeddingService(embeddingClient)
val store = new VectorMemoryStore(embeddingService)
val manager = SimpleMemoryManager(store = store)

// Record knowledge
for {
  m1 <- manager.recordKnowledge("Paris is the capital of France")
  m2 <- m1.recordKnowledge("Berlin is the capital of Germany")
  m3 <- m2.recordKnowledge("Rome is the capital of Italy")

  // Semantic search finds relevant memories
  results <- m3.store.search("European capitals", topK = 2)
} yield results
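
To make the idea of embedding-based search concrete, the sketch below ranks stored texts by cosine similarity to a query vector and keeps the top K. It is a standalone illustration, not the VectorMemoryStore implementation: the similarity measure is an assumption (cosine is a common choice), and the two-dimensional vectors are hand-picked toy values rather than real embeddings.

```scala
object SemanticSearchSketch {
  // Cosine similarity of two equal-length vectors; 0.0 if either is all zeros.
  def cosine(a: Vector[Double], b: Vector[Double]): Double = {
    require(a.length == b.length, "vectors must have the same dimension")
    val dot = a.zip(b).map { case (x, y) => x * y }.sum
    val norm = math.sqrt(a.map(x => x * x).sum) * math.sqrt(b.map(x => x * x).sum)
    if (norm == 0.0) 0.0 else dot / norm
  }

  // Returns the topK (text, score) pairs, highest similarity first.
  def search(
    query: Vector[Double],
    memories: Seq[(String, Vector[Double])],
    topK: Int
  ): Seq[(String, Double)] =
    memories
      .map { case (text, embedding) => (text, cosine(query, embedding)) }
      .sortBy { case (_, score) => -score }
      .take(topK)

  // Toy "embeddings": the first axis loosely stands for "European capitals".
  val memories: Seq[(String, Vector[Double])] = Seq(
    "Paris is the capital of France" -> Vector(1.0, 0.1),
    "Berlin is the capital of Germany" -> Vector(0.9, 0.2),
    "Scala runs on the JVM" -> Vector(0.0, 1.0)
  )

  val results: Seq[(String, Double)] = search(Vector(1.0, 0.0), memories, topK = 2)
}
```

With these toy vectors, the two capital-city texts score highest and the unrelated text is dropped by `topK = 2`.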

Search with Filters

val filter = MemoryFilter(
  memoryTypes = Some(Set(MemoryType.Knowledge)),
  minImportance = Some(0.7),
  userId = Some("user-123"),
  entityId = None,
  afterTimestamp = None,
  beforeTimestamp = None
)

val results = store.search(
  query = "programming languages",
  topK = 5,
  filter = Some(filter)
)
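
One plausible reading of a filter made of optional fields is that a None field places no constraint and every Some field must match. The sketch below implements that reading on hypothetical Mem and Filter types whose field names follow the example above; it is not the LLM4S MemoryFilter, and the assumption that timestamps are java.time.Instant values is mine.

```scala
import java.time.Instant

object FilterSketch {
  // Hypothetical record and filter shapes, for illustration only.
  final case class Mem(kind: String, importance: Double, userId: Option[String], timestamp: Instant)

  final case class Filter(
    kinds: Option[Set[String]] = None,
    minImportance: Option[Double] = None,
    userId: Option[String] = None,
    afterTimestamp: Option[Instant] = None,
    beforeTimestamp: Option[Instant] = None
  )

  // Option.forall is true for None, so unset fields never reject a memory.
  def matches(f: Filter, m: Mem): Boolean =
    f.kinds.forall(_.contains(m.kind)) &&
      f.minImportance.forall(min => m.importance >= min) &&
      f.userId.forall(id => m.userId.contains(id)) &&
      f.afterTimestamp.forall(t => m.timestamp.isAfter(t)) &&
      f.beforeTimestamp.forall(t => m.timestamp.isBefore(t))
}
```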

Memory Consolidation

Automatically summarize and consolidate old memories:

val manager = SimpleMemoryManager(
  config = MemoryConfig(
    consolidationEnabled = true
  )
)

// Manually trigger consolidation
val result = manager.consolidateMemories(
  olderThan = java.time.Duration.ofDays(7),
  maxToConsolidate = 100
)
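
The sketch below shows one way the two parameters could interact: memories older than the cutoff are selected oldest first, capped at maxToConsolidate, and replaced by a single summary entry. This is an assumption about the general technique, not a description of what consolidateMemories does internally. The Mem type is hypothetical, and the "summary" here is plain concatenation where a real system would presumably summarize with an LLM.

```scala
import java.time.{Duration, Instant}

object ConsolidationSketch {
  final case class Mem(content: String, importance: Double, timestamp: Instant)

  // Replaces up to maxToConsolidate memories older than the cutoff with one
  // summary memory and keeps everything else unchanged.
  def consolidate(all: Seq[Mem], olderThan: Duration, maxToConsolidate: Int, now: Instant): Seq[Mem] = {
    val cutoff = now.minus(olderThan)
    val selected = all
      .filter(_.timestamp.isBefore(cutoff))
      .sortBy(_.timestamp.toEpochMilli) // oldest first
      .take(maxToConsolidate)
    if (selected.size < 2) all // nothing worth merging
    else {
      val summary = Mem(
        content = selected.map(_.content).mkString("; "),
        importance = selected.map(_.importance).max, // keep the highest importance
        timestamp = selected.last.timestamp          // newest of the merged entries
      )
      all.filterNot(m => selected.contains(m)) :+ summary
    }
  }
}
```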

Entity Extraction

Extract entities from messages using an LLM:

val result = for {
  providerConfig <- Llm4sConfig.provider()
  client <- LLMConnect.getClient(providerConfig)

  manager = SimpleMemoryManager(
    config = MemoryConfig(
      autoExtractEntities = true
    ),
    llmClient = Some(client)  // Required for entity extraction
  )

  // Record message - entities extracted automatically
  m1 <- manager.recordMessage(
    UserMessage("I work at Anthropic in San Francisco")
  )
  // Entities "Anthropic" and "San Francisco" are automatically extracted
} yield m1

Manual Extraction

val entities = manager.extractEntities(
  content = "Claude is an AI assistant created by Anthropic"
)
// Returns: Seq(EntityId("claude"), EntityId("anthropic"))
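
The identifiers in the comment above are lowercase forms of the entity names. If you need to build matching identifiers yourself, a normalization step along the following lines may help. This is a hypothetical helper, not part of LLM4S, and the exact normalization the library applies is an assumption; SketchEntityId stands in for the library's EntityId.

```scala
import java.util.Locale

object EntityIdSketch {
  // Stand-in for the library's EntityId; hypothetical.
  final case class SketchEntityId(value: String)

  // Lowercases the name, turns runs of other characters into single hyphens,
  // and trims hyphens from both ends.
  def fromName(name: String): SketchEntityId =
    SketchEntityId(
      name.trim
        .toLowerCase(Locale.ROOT)
        .replaceAll("[^a-z0-9]+", "-")
        .replaceAll("^-+|-+$", "")
    )
}
```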

Memory Statistics

val stats = manager.stats()

println(s"Total memories: ${stats.totalCount}")
println(s"By type: ${stats.countByType}")
println(s"Oldest: ${stats.oldestTimestamp}")
println(s"Newest: ${stats.newestTimestamp}")
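
For readers who want to see what these four fields summarize, here is a minimal sketch that derives them from a plain list of memories. The Mem and Stats types are hypothetical; only the field names are taken from the example above, and the use of Option for the timestamps (to cover an empty store) is an assumption.

```scala
import java.time.Instant

object StatsSketch {
  final case class Mem(kind: String, timestamp: Instant)

  final case class Stats(
    totalCount: Int,
    countByType: Map[String, Int],
    oldestTimestamp: Option[Instant],
    newestTimestamp: Option[Instant]
  )

  def stats(memories: Seq[Mem]): Stats = {
    val sorted = memories.map(_.timestamp).sortBy(_.toEpochMilli)
    Stats(
      totalCount = memories.size,
      countByType = memories.groupBy(_.kind).map { case (kind, ms) => kind -> ms.size },
      oldestTimestamp = sorted.headOption, // None when the store is empty
      newestTimestamp = sorted.lastOption
    )
  }
}
```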

Persistence Patterns

Save and Load

// Using SQLite for persistence
val result = for {
  // Create persistent store
  store <- SQLiteMemoryStore.file("/path/to/memory.db")
  manager = SimpleMemoryManager(store = store)

  // Record memories (automatically persisted)
  m1 <- manager.recordUserFact("User preference", Some("user-1"))

  // On next session, create manager with same store path
  // All memories are automatically loaded
} yield m1

Cross-Session Memory

object PersistentMemory {
  private val dbPath = "/path/to/memory.db"

  def getManager(): Result[SimpleMemoryManager] = {
    for {
      store <- SQLiteMemoryStore.file(dbPath)
    } yield SimpleMemoryManager(store = store)
  }
}

// Session 1
val result1 = for {
  manager <- PersistentMemory.getManager()
  m <- manager.recordUserFact("Likes Scala", Some("user-1"))
} yield m

// Session 2 (later)
val result2 = for {
  manager <- PersistentMemory.getManager()
  context <- manager.getUserContext("user-1")
  // context includes "Likes Scala" from session 1
} yield context

Best Practices

1. Set Appropriate Importance

// High importance - core user preferences
manager.recordUserFact("Primary programming language is Scala", importance = Some(0.9))

// Medium importance - useful but not critical
manager.recordKnowledge("Attended ScalaDays 2024", importance = Some(0.6))

// Low importance - ephemeral information
manager.recordTask("Ran tests at 10am", importance = Some(0.3))

2. Use Specific Memory Types

// Don't use generic Knowledge for everything
// Use specific types for better retrieval

// For user preferences
manager.recordUserFact(...)

// For entity-specific information
manager.recordEntityFact(...)

// For external knowledge
manager.recordKnowledge(...)

3. Manage Context Token Budget

val manager = SimpleMemoryManager(
  config = MemoryConfig(
    contextTokenBudget = 2000  // Limit context size
  )
)

// Context retrieval respects the budget
val context = manager.getRelevantContext(query)
// Returns <= 2000 tokens of context
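
A budget like this is typically enforced by taking the most relevant memories first and skipping any that would push the total over the limit. The sketch below illustrates that greedy approach. It is not the LLM4S implementation: the four-characters-per-token estimate is a rough assumption, and real tokenizers will give different counts.

```scala
object TokenBudgetSketch {
  // Crude estimate of roughly 4 characters per token, never less than 1.
  def estimateTokens(text: String): Int = math.max(1, (text.length + 3) / 4)

  // Walks memories in descending score order and keeps each one that still
  // fits in the remaining budget; memories that do not fit are skipped.
  def select(scored: Seq[(String, Double)], budget: Int): Seq[String] = {
    val ranked = scored.sortBy { case (_, score) => -score }
    ranked
      .foldLeft((Vector.empty[String], 0)) { case ((kept, used), (text, _)) =>
        val cost = estimateTokens(text)
        if (used + cost <= budget) (kept :+ text, used + cost) else (kept, used)
      }
      ._1
  }
}
```

Skipping an oversized memory instead of stopping at it lets smaller, lower-ranked memories use the leftover budget; stopping at the first miss is the simpler alternative if strict ranking order matters more.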

4. Clean Up Old Memories

// Consolidate old memories periodically
manager.consolidateMemories(
  olderThan = java.time.Duration.ofDays(30),
  maxToConsolidate = 500
)

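// Or delete old, low-importance memories
// (assumes filter timestamps are java.time.Instant values)
val thirtyDaysAgo = java.time.Instant.now().minus(java.time.Duration.ofDays(30))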
store.deleteMemories(
  filter = MemoryFilter(
    maxImportance = Some(0.3),
    beforeTimestamp = Some(thirtyDaysAgo)
  )
)

Examples

Example                    Description
BasicMemoryExample         Getting started with memory
ConversationMemoryExample  Conversation history management
MemoryWithAgentExample     Integrating memory with agents
SQLiteMemoryExample        Persistent SQLite storage
VectorMemoryExample        Semantic search with embeddings



Next Steps