Agent Framework

Build sophisticated AI agents with tools, guardrails, memory, and multi-agent coordination.

Table of contents

  1. Overview
  2. Quick Start
    1. Basic Agent
    2. Agent with Tools
    3. Multi-Turn Conversations
  3. Safety Defaults
  4. Core Concepts
    1. Agent State
    2. Agent Lifecycle
    3. Tool Execution Strategies
  5. Features
    1. Guardrails
    2. Memory System
    3. Handoffs
    4. Streaming Events
  6. Built-in Tools
  7. Context Window Management
  8. Conversation Persistence
  9. Reasoning Modes
  10. Examples
  11. Design Documents
  12. Next Steps

Overview

The LLM4S Agent Framework provides a production-ready foundation for building LLM-powered agents with:

  • Tool Calling - Type-safe tools with automatic schema generation
  • Guardrails - Input/output validation for safety and quality
  • Memory - Short and long-term context with semantic search
  • Handoffs - Agent-to-agent delegation for specialist routing
  • Streaming - Real-time events for responsive UIs
  • Orchestration - Multi-agent workflows with DAG execution

Quick Start

Basic Agent

import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.agent.Agent
import org.llm4s.toolapi.ToolRegistry

// Create an agent and run a query
val result = for {
  providerConfig <- Llm4sConfig.provider()
  client <- LLMConnect.getClient(providerConfig)
  agent = new Agent(client)
  state <- agent.run(
    query = "What is the capital of France?",
    tools = ToolRegistry.empty
  )
} yield state

result match {
  case Right(state) => println(state.lastAssistantMessage)
  case Left(error) => println(s"Error: $error")
}

Agent with Tools

import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.agent.Agent
import org.llm4s.toolapi.{ToolRegistry, ToolFunction}

// Define a tool
def getWeather(location: String): String = {
  s"The weather in $location is sunny, 72°F"
}

val weatherTool = ToolFunction(
  name = "get_weather",
  description = "Get current weather for a location",
  function = getWeather _
)

// Run agent with tools
val result = for {
  providerConfig <- Llm4sConfig.provider()
  client <- LLMConnect.getClient(providerConfig)
  agent = new Agent(client)
  tools = new ToolRegistry(Seq(weatherTool))
  state <- agent.run("What's the weather in Paris?", tools)
} yield state

Multi-Turn Conversations

import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.agent.Agent
import org.llm4s.toolapi.ToolRegistry

// Functional multi-turn pattern
val result = for {
  providerConfig <- Llm4sConfig.provider()
  client <- LLMConnect.getClient(providerConfig)
  agent = new Agent(client)
  tools = ToolRegistry.empty

  // First turn
  state1 <- agent.run("Tell me about Scala", tools)

  // Follow-up (preserves context)
  state2 <- agent.continueConversation(state1, "How does it compare to Java?")

  // Another follow-up
  state3 <- agent.continueConversation(state2, "What about performance?")
} yield state3

Safety Defaults

  • Agent step limit: Agent.run(...) defaults to maxSteps = Some(50) to prevent infinite loops. Pass maxSteps = None to allow unlimited steps.
  • HTTPTool methods: HttpConfig() defaults to GET and HEAD only. Use HttpConfig.withWriteMethods() or HttpConfig().withAllMethods to allow write methods.
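The step limit can be pictured as a simple bound around the agent loop. Below is a minimal, self-contained sketch of the idea; `Step` and `runStep` are illustrative stand-ins, not llm4s APIs:

```scala
// Sketch of a bounded agent loop: each step either finishes
// or continues, and the loop aborts once the limit is hit.
sealed trait Step
case object Continue extends Step
case class Done(answer: String) extends Step

def runBounded(maxSteps: Option[Int], runStep: Int => Step): Either[String, String] = {
  val limit = maxSteps.getOrElse(Int.MaxValue) // None means unlimited
  @annotation.tailrec
  def loop(n: Int): Either[String, String] =
    if (n >= limit) Left(s"Step limit of $limit reached")
    else runStep(n) match {
      case Done(a)  => Right(a)
      case Continue => loop(n + 1)
    }
  loop(0)
}
```

With `maxSteps = Some(50)`, a tool loop that never converges surfaces as an error instead of running forever.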

Core Concepts

Agent State

The AgentState is an immutable container that tracks:

  • Conversation history - All messages exchanged
  • Available tools - Tools the agent can call
  • Status - InProgress, WaitingForTools, Complete, Failed, or HandoffRequested
  • System message - Instructions for the LLM
  • Completion options - Temperature, max tokens, etc.
import org.llm4s.llmconnect.model.UserMessage

// Agent state is immutable - operations return new states
val newState = state.addMessage(UserMessage("Follow-up question"))
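The copy-on-write pattern behind this can be sketched with a plain case class (field names simplified for illustration; this is not the actual AgentState definition):

```scala
// Simplified immutable state: addMessage returns a new value,
// the original is never mutated.
final case class ChatState(messages: Vector[String], status: String) {
  def addMessage(msg: String): ChatState = copy(messages = messages :+ msg)
}

val s0 = ChatState(Vector("system prompt"), "InProgress")
val s1 = s0.addMessage("Follow-up question")
// s0 still holds one message; s1 holds two
```

Because every operation returns a fresh state, earlier states remain valid snapshots you can branch from or persist.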

Agent Lifecycle

Initial Query
      |
      v
+------------+     LLM Call     +------------------+
| InProgress | ---------------> | WaitingForTools  |
+------------+                  +------------------+
      ^                                 |
      |          Tool Execution         |
      +---------------------------------+
      |
      v  (no more tool calls)
+------------+
|  Complete  |
+------------+
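The transitions above can be modelled as a small state machine. A self-contained sketch (status names mirror those listed under Agent State; the transition functions are illustrative, not the framework's internals):

```scala
// Illustrative state machine for the agent lifecycle diagram above.
sealed trait AgentStatus
case object InProgress      extends AgentStatus
case object WaitingForTools extends AgentStatus
case object Complete        extends AgentStatus

// An LLM call either requests tools or finishes.
def afterLlmCall(requestedTools: Boolean): AgentStatus =
  if (requestedTools) WaitingForTools else Complete

// Tool execution always hands control back to the LLM.
def afterToolExecution(status: AgentStatus): AgentStatus = status match {
  case WaitingForTools => InProgress
  case other           => other
}
```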

Tool Execution Strategies

Control how multiple tool calls are executed:

import org.llm4s.agent.ToolExecutionStrategy

// Sequential (default) - one at a time, safest
agent.run(query, tools)

// Parallel - all at once, fastest
agent.runWithStrategy(query, tools, ToolExecutionStrategy.Parallel)

// Parallel with limit - balance speed and resources
agent.runWithStrategy(query, tools, ToolExecutionStrategy.ParallelWithLimit(3))
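ParallelWithLimit can be approximated by batching: run at most `limit` calls concurrently, with batches executing one after another. A minimal sketch using standard-library futures (an illustration of the idea, not the framework's implementation):

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration._
import ExecutionContext.Implicits.global

// Run tool calls in batches of at most `limit` at a time:
// each batch executes concurrently, batches run sequentially.
def runWithLimit[A, B](calls: Seq[A], limit: Int)(exec: A => B): Seq[B] =
  calls.grouped(limit).flatMap { batch =>
    Await.result(Future.traverse(batch)(a => Future(exec(a))), 30.seconds)
  }.toSeq
```

With `limit = 1` this degenerates to the sequential default; a large limit approaches fully parallel execution.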

Features

Guardrails

Validate inputs and outputs for safety:

import org.llm4s.agent.guardrails.builtin._

agent.run(
  query = "Generate JSON data",
  tools = tools,
  inputGuardrails = Seq(
    new LengthCheck(1, 10000),
    new ProfanityFilter()
  ),
  outputGuardrails = Seq(
    new JSONValidator()
  )
)

Learn more about guardrails →

Memory System

Persistent context across conversations:

import org.llm4s.agent.memory._

val result = for {
  manager <- SimpleMemoryManager.empty
  m1 <- manager.recordUserFact("Prefers Scala", Some("user-1"), Some(0.9))
  context <- m1.getRelevantContext("programming preferences")
} yield context

Learn more about memory →

Handoffs

Delegate to specialist agents:

import org.llm4s.agent.Handoff

agent.run(
  query = "Complex physics question",
  tools = tools,
  handoffs = Seq(
    Handoff.to(physicsAgent, "Physics expertise required")
  )
)

Learn more about handoffs →

Streaming Events

Real-time execution feedback:

import org.llm4s.agent.streaming._

agent.runWithEvents(query, tools) {
  case TextDelta(text, _) => print(text)
  case ToolCallStarted(_, name, _, _) => println(s"Calling $name...")
  case ToolCallCompleted(_, name, result, _, _, _) => println(s"$name: $result")
  case AgentCompleted(state, steps, ms, _) => println(s"Done in $steps steps")
  case _ => ()
}

Learn more about streaming →


Built-in Tools

LLM4S provides pre-built tools for common tasks:

import org.llm4s.toolapi.builtin.BuiltinTools

// Core tools (always safe)
BuiltinTools.core          // DateTime, Calculator, UUID, JSON

// Safe for most use cases
BuiltinTools.safe()        // + web search, HTTP

// With file access (read-only)
BuiltinTools.withFiles()   // + read-only file access

// All tools (use with caution)
BuiltinTools.development() // All tools including write access

Available tools:

Tool            Description
DateTimeTool    Current date/time, timezone conversion
CalculatorTool  Mathematical calculations
UUIDTool        Generate unique identifiers
JSONTool        Parse and format JSON
HTTPTool        Make HTTP requests
WebSearchTool   Search the web
FileReadTool    Read files (with restrictions)
ShellTool       Execute shell commands (development only)

Context Window Management

Handle long conversations automatically:

import org.llm4s.agent.{ContextWindowConfig, PruningStrategy}

val config = ContextWindowConfig(
  maxMessages = Some(20),
  preserveSystemMessage = true,
  minRecentTurns = 2,
  pruningStrategy = PruningStrategy.OldestFirst
)

// Use with runMultiTurn for automatic pruning
val queries = Seq("Question 1", "Question 2", "Question 3")
agent.runMultiTurn(queries, tools, contextConfig = Some(config))

Pruning Strategies:

Strategy            Behavior
OldestFirst         Remove oldest messages first (FIFO)
MiddleOut           Keep first and last messages, remove middle
RecentTurnsOnly(n)  Keep only the last N conversation turns
Custom(fn)          User-defined pruning function
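The OldestFirst strategy amounts to dropping from the front of the history until the cap is met, while keeping the system message and the most recent messages intact. A simplified sketch over a flat message list (real pruning operates on typed messages and whole conversation turns):

```scala
// Simplified OldestFirst pruning over a flat message list.
// msgs.head is assumed to be the system message when preserveSystem is true.
def pruneOldestFirst(msgs: List[String], maxMessages: Int, preserveSystem: Boolean): List[String] =
  if (msgs.length <= maxMessages) msgs
  else if (preserveSystem && msgs.nonEmpty)
    msgs.head :: msgs.tail.takeRight(maxMessages - 1)
  else msgs.takeRight(maxMessages)
```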

Conversation Persistence

Save and resume conversations:

import org.llm4s.agent.AgentState

// Save state to disk
AgentState.saveToFile(state, "/tmp/conversation.json")

// Load and resume
val result = for {
  loadedState <- AgentState.loadFromFile("/tmp/conversation.json", tools)
  resumedState <- agent.continueConversation(loadedState, "Continue our conversation")
} yield resumedState
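The save/load round trip boils down to serializing the message history. A self-contained sketch writing one message per line (AgentState.saveToFile uses JSON; this only illustrates the round trip, and assumes messages contain no newlines):

```scala
import java.nio.file.{Files, Paths}
import scala.jdk.CollectionConverters._

// Round-trip a message history through a file, one message per line.
def saveMessages(path: String, msgs: Seq[String]): Unit =
  Files.write(Paths.get(path), msgs.asJava)

def loadMessages(path: String): Seq[String] =
  Files.readAllLines(Paths.get(path)).asScala.toSeq
```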

Reasoning Modes

Enable extended thinking for complex problems:

import org.llm4s.llmconnect.model.{CompletionOptions, ReasoningEffort}

val options = CompletionOptions()
  .withReasoning(ReasoningEffort.High)  // None, Low, Medium, High
  .copy(maxTokens = Some(4096))

// Use with agent
agent.run(query, tools, completionOptions = Some(options))

Supported by OpenAI o1/o3 and Anthropic Claude models.


Examples

Example                         Description
SingleStepAgentExample          Step-by-step debugging
MultiStepAgentExample           Complete execution flow
MultiTurnConversationExample    Functional multi-turn API
LongConversationExample         Context window pruning
ConversationPersistenceExample  Save and resume
AsyncToolAgentExample           Parallel tool execution
BuiltinToolsAgentExample        Built-in tools

Browse all examples →


Design Documents

For in-depth technical details:


Next Steps

  1. Guardrails Guide - Input/output validation
  2. Memory Guide - Persistent context
  3. Handoffs Guide - Agent delegation
  4. Streaming Guide - Real-time events
  5. Examples Gallery - Working code samples
