LLM4S - Large Language Models for Scala

A comprehensive, type-safe framework for building LLM-powered applications in Scala.




Why LLM4S?

LLM4S brings the power of large language models to the Scala ecosystem with a focus on type safety, functional programming, and production readiness.

import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.llmconnect.model.UserMessage

// Simple LLM call with automatic provider selection
val result = for {
  providerConfig <- Llm4sConfig.provider()
  client <- LLMConnect.getClient(providerConfig)
  response <- client.complete(
    messages = List(UserMessage("Explain quantum computing")),
    model = None  // Uses configured model
  )
} yield response

result match {
  case Right(completion) => println(completion.content)
  case Left(error) => println(s"Error: $error")
}

Key Features

Core LLM Platform

🔌 Multi-Provider Support

Connect seamlessly to OpenAI, Anthropic, Azure OpenAI, and Ollama with a unified API, with Google Gemini support planned. Switch providers with a single environment variable. Learn more →
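
For example, the same program can be pointed at a different provider purely through the environment. A minimal sketch, assuming the provider-prefixed model naming shown in the Configuration section; the exact Anthropic model string is illustrative:

# Same code, different provider -- the model string here is illustrative
export LLM_MODEL=anthropic/claude-3-5-sonnet-latest
export ANTHROPIC_API_KEY=sk-ant-...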

📡 Streaming Responses

Real-time token streaming with backpressure handling and error recovery. View examples →
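
The shape of a streaming call looks roughly like the sketch below. Note that streamComplete and the chunk type are assumed names, not a confirmed LLM4S signature; follow the examples link for the real API.

import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.llmconnect.model.UserMessage

// Sketch only: `streamComplete` / `onChunk` are assumed names, not a
// confirmed LLM4S signature -- see the streaming examples.
val streamed = for {
  providerConfig <- Llm4sConfig.provider()
  client         <- LLMConnect.getClient(providerConfig)
  completion     <- client.streamComplete(
    messages = List(UserMessage("Tell me a story")),
    onChunk  = chunk => print(chunk.content)  // invoked as tokens arrive
  )
} yield completion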

🔍 RAG & Embeddings

Complete RAG pipeline: vector storage (SQLite, pgvector, Qdrant), hybrid search with BM25 keyword matching (SQLite FTS5 or PostgreSQL native), Cohere cross-encoder reranking, and sentence-aware document chunking. For production deployment, see RAG in a Box. Vector stores → | Examples →
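
Hybrid search merges a keyword (BM25) ranking with a vector-similarity ranking. The snippet below is a conceptual illustration of one common merge strategy, reciprocal rank fusion, not LLM4S's internal scoring:

// Conceptual illustration of hybrid-search rank merging (reciprocal
// rank fusion); LLM4S's actual scoring may differ.
def rrf(rankings: Seq[Seq[String]], k: Int = 60): Seq[(String, Double)] =
  rankings
    .flatMap(_.zipWithIndex.map { case (doc, rank) => doc -> 1.0 / (k + rank + 1) })
    .groupMapReduce(_._1)(_._2)(_ + _)  // sum each document's scores
    .toSeq
    .sortBy(-_._2)

val bm25Hits   = Seq("doc3", "doc1", "doc7")     // keyword ranking
val vectorHits = Seq("doc1", "doc4", "doc3")     // embedding ranking
val merged     = rrf(Seq(bm25Hits, vectorHits))  // doc1 and doc3 rise to the top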

🖼️ Multimodal Support

Generate and analyze images, convert speech-to-text and text-to-speech, and work with multiple content modalities. Image generation → | Speech →

📊 Observability

Comprehensive tracing with Langfuse integration for debugging, monitoring, and production analytics. Learn more →

🛠️ Type-Safe Tool Calling

Define tools with automatic schema generation and type-safe execution. Supports both local tools and Model Context Protocol (MCP) servers. See examples →
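
The pattern, reduced to a hand-rolled sketch (this is not the LLM4S tool API; see the examples for the real one): parameters are declared once as a case class, the framework derives the JSON schema sent to the model, and the handler receives typed arguments.

// Conceptual stand-in for the tool-calling pattern; not the LLM4S API.
final case class Tool[P, R](name: String, description: String, run: P => Either[String, R])

final case class WeatherParams(city: String)

val weatherTool = Tool[WeatherParams, String](
  name        = "get_weather",
  description = "Look up the current weather for a city",
  run         = params => Right(s"Sunny in ${params.city}")
)

// The framework would derive the JSON schema for `WeatherParams` and
// route the model's tool-call arguments into `run` as a typed value.
val result = weatherTool.run(WeatherParams("Paris"))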

Agent Framework

🤖 Agent Framework

Build sophisticated single and multi-agent workflows with built-in tool calling, conversation management, and state persistence. Explore agents →

💬 Multi-Turn Conversations

Functional, immutable conversation management with automatic context window pruning and conversation persistence. View patterns →
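
Because conversations are immutable values, each turn produces a new conversation rather than mutating state. A minimal sketch; AssistantMessage is the assumed counterpart of the UserMessage/SystemMessage types shown above:

import org.llm4s.llmconnect.model._

// Each step yields a new value; earlier turns are never mutated, so
// any point in the history can be kept, replayed, or persisted.
val turn1 = List(
  SystemMessage("You are terse."),
  UserMessage("Hi")
)
// after a completion, append the reply and the next user message
val turn2 = turn1 :+ AssistantMessage("Hello!") :+ UserMessage("What is Scala?")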

🛡️ Guardrails & Validation

Declarative input/output validation framework for production safety. Built-in guardrails for length checks, profanity filtering, JSON validation, tone validation, and LLM-as-Judge. Learn more →
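
Conceptually, a guardrail is a pure validation function, so checks compose before and after the LLM call. The sketch below illustrates the pattern with a hand-rolled type, not the LLM4S guardrail API itself:

// Conceptual stand-in for the guardrail pattern (not the LLM4S API):
// Either-based validators compose with flatMap.
type Guardrail[A] = A => Either[String, A]

val nonEmpty: Guardrail[String] =
  s => if (s.trim.nonEmpty) Right(s) else Left("input is empty")

val maxLength: Guardrail[String] =
  s => if (s.length <= 4000) Right(s) else Left("input too long")

// run both input guardrails before sending the prompt to the model
val checkedInput = nonEmpty("Explain quantum computing").flatMap(maxLength)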

🔄 Agent Handoffs

LLM-driven agent-to-agent delegation for specialist routing. Simple API for handing off queries to domain experts with automatic context preservation. See examples →
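
The idea, reduced to a sketch: in LLM4S the routing decision is made by the LLM itself; here a naive keyword check stands in for it, and all names are illustrative.

// Conceptual stand-in for agent handoffs, not the LLM4S API.
final case class Specialist(name: String, answer: String => String)

val billing = Specialist("billing", q => s"[billing] $q")
val support = Specialist("support", q => s"[support] $q")

// In the real framework the LLM chooses the specialist; a keyword
// match plays that role here.
def route(query: String): Specialist =
  if (query.toLowerCase.contains("invoice")) billing else support

val reply = route("Where is my invoice?").answer("Where is my invoice?")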

🧠 Memory System

Short-term and long-term memory with entity tracking. In-memory, SQLite, and vector store backends for semantic search across conversations. Explore memory →
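
The backend-pluggability can be pictured as one interface with several implementations. A hand-rolled sketch, not the LLM4S memory API:

// Conceptual stand-in for a pluggable memory backend (not the LLM4S
// API): the same operations work whether entries live in memory, in
// SQLite, or in a vector store queried by embedding similarity.
trait MemoryStore {
  def add(entry: String): MemoryStore     // returns an updated store
  def search(query: String): Seq[String]  // keyword or semantic match
}

final case class InMemoryStore(entries: Vector[String] = Vector.empty) extends MemoryStore {
  def add(entry: String): MemoryStore = copy(entries :+ entry)
  def search(query: String): Seq[String] =
    entries.filter(_.toLowerCase.contains(query.toLowerCase))
}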

💭 Reasoning Modes

Extended thinking support for OpenAI o1/o3 and Anthropic Claude. Configure reasoning effort levels and access thinking content. Learn more →

Infrastructure

⚡ Built-in Tools

Pre-built tools for common tasks: DateTime, Calculator, UUID, JSON parsing, HTTP requests, web search, and file operations with security controls. Browse tools →

🐳 Secure Execution

Containerized workspace for safe tool execution with Docker isolation. Advanced topics →


Quick Start

Installation

Add LLM4S to your build.sbt:

libraryDependencies += "org.llm4s" %% "llm4s-core" % "0.0.0+116108a88-SNAPSHOT"

Current Version: `0.0.0+116108a88-SNAPSHOT`. Check Maven Central for the latest release.

Configuration

Set your API key and model:

export LLM_MODEL=openai/gpt-4o
export OPENAI_API_KEY=sk-...

Your First Program

import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.llmconnect.model._

object HelloLLM extends App {
  val result = for {
    providerConfig <- Llm4sConfig.provider()
    client <- LLMConnect.getClient(providerConfig)
    response <- client.complete(
      messages = List(
        SystemMessage("You are a helpful assistant."),
        UserMessage("What is Scala?")
      ),
      model = None
    )
  } yield response.content

  result.fold(
    error => println(s"Error: $error"),
    content => println(s"Response: $content")
  )
}

Complete installation guide →


Explore 69 working examples covering all features:

Basic Examples

Agent Examples

Guardrails & Safety

Handoffs & Memory

Tools & Streaming

Browse all examples →


Documentation

🤖 Agent Framework

Tools, guardrails, memory, handoffs

Learn agents →

📖 User Guide

RAG, vector stores, multimodal

Browse guides →

💻 Examples

69 working code examples

Browse examples →

🚀 Advanced Topics

Production readiness & optimization

Learn more →

📚 API Reference

Complete API documentation

View API docs →

📖 Scaladoc

Generated API documentation

Browse Scaladoc →

Why Scala for LLMs?

✅ Type Safety - Catch errors at compile time, not in production
✅ Functional Programming - Immutable data and pure functions for predictable systems
✅ JVM Ecosystem - Access to mature, production-grade libraries
✅ Concurrency - Advanced models for safe, efficient parallelism
✅ Performance - JVM speed with functional elegance
✅ Enterprise Ready - Seamless integration with JVM systems

Compatibility

Scala & JDK Support

Scala Version   JDK Version   Status
3.7.x           21, 17        ✅ Fully Supported
2.13.x          21, 17        ✅ Fully Supported
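
A typical sbt setup for cross-building against both series looks like the sketch below; the patch versions shown are illustrative, so pin the ones you need:

// build.sbt -- cross-compile for both supported Scala series;
// patch versions here are illustrative.
crossScalaVersions := Seq("3.7.1", "2.13.16")
libraryDependencies += "org.llm4s" %% "llm4s-core" % "0.0.0+116108a88-SNAPSHOT"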

LLM Provider Support

Provider        Status        Models
OpenAI          ✅ Complete   GPT-4o, GPT-4, GPT-3.5, o1, o3
Anthropic       ✅ Complete   Claude 3.5, Claude 3
Azure OpenAI    ✅ Complete   All Azure-hosted models
Ollama          ✅ Complete   Llama, Mistral, local models
Google Gemini   🚧 Planned    Coming soon
Cohere          🚧 Planned    Coming soon



Project Status

LLM4S is under active development: the core LLM platform is complete, and the agent framework is rolling out in phases.

Core Framework (Complete)

Category              Features
LLM Providers         OpenAI, Anthropic, Azure, Ollama
Content Generation    Text, Images, Speech (STT/TTS), Embeddings
Tools & Integration   Tool Calling, MCP Servers, Built-in Tools, Workspace Isolation
Infrastructure        Type-Safe Config, Result Error Handling, Langfuse Tracing

Agent Framework Phases

  • ✅ Phase 1.0-1.4: Core agents, conversations, guardrails, handoffs, memory
  • ✅ Phase 2.1-2.2: Event streaming, async tool execution
  • ✅ Phase 3.2: Built-in tools module
  • ✅ Phase 4.1, 4.3: Reasoning modes, session serialization
  • 🚧 Next: Enhanced observability, provider expansion
  • 📋 v1.0.0: Production readiness

View detailed roadmap →




Ready to get started? Install LLM4S →