Configuration Guide
Configure LLM4S for multiple providers, environments, and use cases.
Table of contents
- Configuration Hierarchy
- Environment Variables
- Application Configuration (HOCON)
- Provider-Specific Configuration
- Embeddings Configuration
- Tracing Configuration
- Multi-Provider Setup
- System Properties
- Environment-Specific Configurations
- Configuration Best Practices
- Validation Example
- Troubleshooting
- Next Steps
- Additional Resources
Configuration Hierarchy
LLM4S uses a hierarchical configuration system with the following precedence (highest to lowest):
- System properties (`-Dkey=value` JVM flags)
- Environment variables (`.env` file or shell exports)
- `application.conf` (HOCON format in your project)
- `reference.conf` (library defaults)
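LLM4S builds on Typesafe Config (HOCON), so you can observe the layering directly. A minimal sketch using Typesafe Config and a made-up key `llm4s.demo.value` to show a system property winning over file-based config:

```scala
import com.typesafe.config.ConfigFactory

object PrecedenceDemo extends App {
  // Simulate passing -Dllm4s.demo.value=... on the command line
  sys.props("llm4s.demo.value") = "from-system-property"
  ConfigFactory.invalidateCaches() // pick up properties set after JVM start

  // load() layers: system properties > application.conf > reference.conf
  val config = ConfigFactory.load()
  println(config.getString("llm4s.demo.value")) // prints "from-system-property"
}
```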
Environment Variables
The simplest way to configure LLM4S is through environment variables.
Core Settings
```bash
# Required: Model selection (determines provider)
LLM_MODEL=openai/gpt-4o

# Required: API key for your chosen provider
OPENAI_API_KEY=sk-proj-...
```
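The `provider/model` convention is what lets LLM4S route requests. As an illustration only, here is a hypothetical parser for the format (LLM4S does this internally; `parseModel` is not a library function):

```scala
// Hypothetical helper: split an LLM_MODEL value into provider and model name.
def parseModel(spec: String): Either[String, (String, String)] =
  spec.split("/", 2) match {
    case Array(provider, model) if provider.nonEmpty && model.nonEmpty =>
      Right((provider, model))
    case _ =>
      Left(s"Invalid model format: $spec (expected <provider>/<model-name>)")
  }

// parseModel("openai/gpt-4o")  => Right(("openai", "gpt-4o"))
// parseModel("gpt-4o")         => Left("Invalid model format: ...")
```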
Complete Environment Setup
Create a .env file in your project root:
```bash
# ===================
# Model Selection
# ===================
# Format: <provider>/<model-name>
# Supported providers: openai, anthropic, azure, ollama, gemini
LLM_MODEL=openai/gpt-4o

# ===================
# OpenAI Configuration
# ===================
OPENAI_API_KEY=sk-proj-...
OPENAI_BASE_URL=https://api.openai.com/v1   # Optional: Override endpoint
OPENAI_ORGANIZATION=org-...                 # Optional: Organization ID

# ===================
# Anthropic Configuration
# ===================
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://api.anthropic.com   # Optional
ANTHROPIC_VERSION=2023-06-01                   # Optional: API version

# ===================
# Azure OpenAI Configuration
# ===================
AZURE_API_KEY=your-azure-key
AZURE_API_BASE=https://your-resource.openai.azure.com
AZURE_DEPLOYMENT_NAME=gpt-4o
AZURE_API_VERSION=2024-02-15-preview   # Optional

# ===================
# Ollama Configuration
# ===================
OLLAMA_BASE_URL=http://localhost:11434
# No API key needed for Ollama

# ===================
# Google Gemini Configuration
# ===================
GOOGLE_API_KEY=your-google-api-key
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta   # Optional

# ===================
# Tracing Configuration
# ===================
TRACING_MODE=langfuse   # Options: langfuse, console, none

# Langfuse settings (if using TRACING_MODE=langfuse)
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_URL=https://cloud.langfuse.com   # or self-hosted

# ===================
# Embeddings Configuration (RAG)
# ===================
# Unified format: provider/model (recommended)
EMBEDDING_MODEL=openai/text-embedding-3-small   # Uses default base URL
# OPENAI_API_KEY is shared with LLM config above

# Or for Voyage embeddings:
# EMBEDDING_MODEL=voyage/voyage-3
# VOYAGE_API_KEY=pa-...

# Or for local embeddings with Ollama (no API key needed):
# EMBEDDING_MODEL=ollama/nomic-embed-text

# Optional: Override default base URLs
# OPENAI_EMBEDDING_BASE_URL=https://api.openai.com/v1
# VOYAGE_EMBEDDING_BASE_URL=https://api.voyageai.com/v1
# OLLAMA_EMBEDDING_BASE_URL=http://localhost:11434

# ===================
# Workspace Configuration (Optional)
# ===================
WORKSPACE_DIR=/tmp/llm4s-workspace
WORKSPACE_IMAGE=llm4s/workspace-runner:latest
WORKSPACE_PORT=8080
WORKSPACE_CONTAINER_NAME=llm4s-workspace
```
Load Environment Variables
Option 1: Source the file
```bash
source .env
sbt run
```
Option 2: Use sbt-dotenv plugin
Add to project/plugins.sbt:
```scala
addSbtPlugin("au.com.onegeek" %% "sbt-dotenv" % "2.1.233")
```
Variables automatically load when SBT starts!
Option 3: IntelliJ IDEA
- Run → Edit Configurations
- Environment variables → Add from file
- Select your `.env` file
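If you cannot use the plugin (for example, in a plain test harness), a hand-rolled loader is easy to sketch. This is an illustration, not an LLM4S utility; it ignores quoting and inline comments:

```scala
import scala.io.Source
import scala.util.Using

// Minimal .env parser: skips blanks and comments, splits on the first '='.
def loadDotEnv(path: String = ".env"): Map[String, String] =
  Using.resource(Source.fromFile(path)) { src =>
    src.getLines().map(_.trim)
      .filter(line => line.nonEmpty && !line.startsWith("#"))
      .flatMap { line =>
        line.split("=", 2) match {
          case Array(key, value) => Some(key.trim -> value.trim)
          case _                 => None
        }
      }
      .toMap
  }
```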
Application Configuration (HOCON)
For more complex configurations, use application.conf:
Create application.conf
Create src/main/resources/application.conf:
Note: in HOCON the last assignment wins, and a `${?VAR}` substitution is skipped when the variable is unset, so write the hardcoded default first and the optional environment override second:

```hocon
llm {
  # Model configuration
  model = "openai/gpt-4o"   # Default
  model = ${?LLM_MODEL}     # Env var overrides the default when set

  # Generation parameters
  temperature = 0.7
  max-tokens = 2000
  top-p = 1.0

  # Provider configurations
  providers {
    openai {
      api-key = ${?OPENAI_API_KEY}
      base-url = "https://api.openai.com/v1"   # Default
      base-url = ${?OPENAI_BASE_URL}
      organization = ${?OPENAI_ORGANIZATION}
    }

    anthropic {
      api-key = ${?ANTHROPIC_API_KEY}
      base-url = "https://api.anthropic.com"
      base-url = ${?ANTHROPIC_BASE_URL}
      version = "2023-06-01"
      version = ${?ANTHROPIC_VERSION}
    }

    azure {
      api-key = ${?AZURE_API_KEY}
      api-base = ${?AZURE_API_BASE}
      deployment-name = ${?AZURE_DEPLOYMENT_NAME}
      api-version = "2024-02-15-preview"
      api-version = ${?AZURE_API_VERSION}
    }

    ollama {
      base-url = "http://localhost:11434"
      base-url = ${?OLLAMA_BASE_URL}
    }
  }
}

# Tracing configuration
tracing {
  mode = "none"   # Default: disabled
  mode = ${?TRACING_MODE}

  langfuse {
    public-key = ${?LANGFUSE_PUBLIC_KEY}
    secret-key = ${?LANGFUSE_SECRET_KEY}
    url = "https://cloud.langfuse.com"
    url = ${?LANGFUSE_URL}
  }
}

# Context window management
context {
  max-messages = 50
  preserve-system-message = true
  pruning-strategy = "oldest-first"   # oldest-first, middle-out, recent-turns
}
```
Access Configuration in Code
```scala
import org.llm4s.config.Llm4sConfig

// Load provider-specific config
val providerConfig = Llm4sConfig.provider()

// Load tracing config
val tracingConfig = Llm4sConfig.tracing()

// Load embeddings config
val embeddingsConfig = Llm4sConfig.embeddings()
```
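Putting it together: a minimal end-to-end sketch, assuming the message types live in `org.llm4s.llmconnect.model` (matching the other examples in this guide):

```scala
import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.llmconnect.model.{Conversation, UserMessage}

// Load config, build a client, and run a single completion; every step
// returns a Result, so errors short-circuit the for-comprehension.
val response = for {
  providerConfig <- Llm4sConfig.provider()
  client         <- LLMConnect.getClient(providerConfig)
  completion     <- client.complete(Conversation(Seq(UserMessage("Hello!"))))
} yield completion

response.fold(
  error => println(s"Request failed: $error"),
  done  => println(done)
)
```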
Provider-Specific Configuration
OpenAI
```bash
# Model selection
LLM_MODEL=openai/gpt-4o          # Latest GPT-4o
LLM_MODEL=openai/gpt-4-turbo     # GPT-4 Turbo
LLM_MODEL=openai/gpt-3.5-turbo   # GPT-3.5

# Required
OPENAI_API_KEY=sk-proj-...

# Optional
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_ORGANIZATION=org-...
```
Available Models:
- `gpt-4o` - Latest GPT-4 Optimized
- `gpt-4-turbo` - Fast GPT-4
- `gpt-4` - Standard GPT-4
- `gpt-3.5-turbo` - Fast and cheap
Anthropic
```bash
# Model selection
LLM_MODEL=anthropic/claude-sonnet-4-5-latest
LLM_MODEL=anthropic/claude-opus-4-5-latest

# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional
ANTHROPIC_BASE_URL=https://api.anthropic.com
ANTHROPIC_VERSION=2023-06-01
```
Available Models:
- `claude-sonnet-4-5-latest` - Latest Claude 4.5 Sonnet
- `claude-opus-4-5-latest` - Most capable Claude 4.5 Opus
- `claude-3-haiku-latest` - Fastest
Azure OpenAI
```bash
# Model selection
LLM_MODEL=azure/gpt-4o

# Required
AZURE_API_KEY=your-azure-key
AZURE_API_BASE=https://your-resource.openai.azure.com
AZURE_DEPLOYMENT_NAME=gpt-4o   # Your deployment name

# Optional
AZURE_API_VERSION=2024-02-15-preview
```
Ollama (Local Models)
```bash
# Model selection
LLM_MODEL=ollama/llama2
LLM_MODEL=ollama/mistral
LLM_MODEL=ollama/codellama

# Required
OLLAMA_BASE_URL=http://localhost:11434

# No API key needed!
```
Install Ollama:
```bash
# Install
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama2

# Start server
ollama serve
```
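To verify the server is up before pointing LLM4S at it, you can hit Ollama's REST API directly. A sketch using the plain JDK HTTP client (not an LLM4S API; `GET /api/tags` lists locally pulled models):

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Health check against the local Ollama server.
val request  = HttpRequest.newBuilder(URI.create("http://localhost:11434/api/tags")).GET().build()
val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
println(s"Ollama responded ${response.statusCode()}: ${response.body().take(200)}")
```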
Google Gemini
```bash
# Model selection
LLM_MODEL=gemini/gemini-2.0-flash
LLM_MODEL=gemini/gemini-1.5-pro
LLM_MODEL=gemini/gemini-1.5-flash

# Required
GOOGLE_API_KEY=your-google-api-key

# Optional
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta
```
Available Models:
- `gemini-2.0-flash` - Latest Gemini 2.0 (fast, multimodal)
- `gemini-2.5-flash` - Gemini 2.5 Flash (thinking model)
- `gemini-1.5-pro` - High capability (1M token context)
- `gemini-1.5-flash` - Fast and efficient (1M token context)
Get API Key:
- Go to Google AI Studio
- Click “Get API Key”
- Create or select a project
- Copy the generated API key
Embeddings Configuration
LLM4S supports multiple embedding providers for RAG (Retrieval-Augmented Generation) workflows.
Unified Format (Recommended)
Use EMBEDDING_MODEL with the format provider/model-name, similar to LLM_MODEL:
```bash
# OpenAI embeddings
EMBEDDING_MODEL=openai/text-embedding-3-small
OPENAI_API_KEY=sk-...   # Same key as LLM

# Voyage AI embeddings
EMBEDDING_MODEL=voyage/voyage-3
VOYAGE_API_KEY=pa-...

# Ollama embeddings (local, no API key needed)
EMBEDDING_MODEL=ollama/nomic-embed-text
```
Default base URLs are used automatically:
- OpenAI: `https://api.openai.com/v1`
- Voyage: `https://api.voyageai.com/v1`
- Ollama: `http://localhost:11434`
Override base URLs if needed:
```bash
EMBEDDING_MODEL=openai/text-embedding-3-small
OPENAI_EMBEDDING_BASE_URL=https://custom.openai.com/v1
```
Available Models
OpenAI:
- `text-embedding-3-small` - Fast, cost-effective (1536 dimensions)
- `text-embedding-3-large` - Higher quality (3072 dimensions)
- `text-embedding-ada-002` - Legacy model (1536 dimensions)
Voyage AI:
- `voyage-3` - General purpose
- `voyage-3-large` - Higher quality
- `voyage-code-2` - Code-optimized
Ollama (local):
- `nomic-embed-text` - General purpose (768 dimensions)
- `mxbai-embed-large` - Higher quality (1024 dimensions)
- `all-minilm` - Lightweight (384 dimensions)
Install Ollama embedding model:
```bash
ollama pull nomic-embed-text
```
Legacy Format (Still Supported)
The legacy format using EMBEDDING_PROVIDER is still supported for backward compatibility:
```bash
# OpenAI
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_BASE_URL=https://api.openai.com/v1
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=sk-...

# Voyage AI
EMBEDDING_PROVIDER=voyage
VOYAGE_EMBEDDING_BASE_URL=https://api.voyageai.com/v1
VOYAGE_EMBEDDING_MODEL=voyage-3
VOYAGE_API_KEY=pa-...

# Ollama
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
```
Using Embeddings in Code
```scala
import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.EmbeddingClient
import org.llm4s.llmconnect.config.EmbeddingModelConfig
import org.llm4s.agent.memory.LLMEmbeddingService

// Create embedding client from typed config
val clientResult = for {
  (provider, cfg) <- Llm4sConfig.embeddings()
  client          <- EmbeddingClient.from(provider, cfg)
} yield client

// Or use the higher-level service
val embeddingServiceResult = for {
  client <- clientResult
  model  <- Llm4sConfig.textEmbeddingModel()
} yield LLMEmbeddingService(client, EmbeddingModelConfig(model.modelName, model.dimensions))
```
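Once you have vectors, ranking retrieved chunks in a RAG pipeline usually comes down to cosine similarity. A self-contained helper (pure math, independent of any LLM4S API):

```scala
// Cosine similarity between two embedding vectors; 1.0 = identical direction.
def cosine(a: Seq[Double], b: Seq[Double]): Double = {
  require(a.length == b.length, "vectors must have the same dimension")
  val dot   = a.zip(b).map { case (x, y) => x * y }.sum
  val normA = math.sqrt(a.map(x => x * x).sum)
  val normB = math.sqrt(b.map(x => x * x).sum)
  if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}
```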
System Properties (Alternative)
When running via sbt, use system properties for reliable configuration:
```bash
# Using unified format (recommended)
sbt -Dllm4s.embeddings.model=ollama/nomic-embed-text \
  "run"

# Or with legacy format
sbt -Dllm4s.embeddings.provider=ollama \
  -Dllm4s.embeddings.ollama.model=nomic-embed-text \
  "run"
```
Tracing Configuration
Console Tracing (Development)
```bash
TRACING_MODE=console
```
Output appears in stdout:
```
[TRACE] Completion request: UserMessage(What is Scala?)
[TRACE] Token usage: 150 tokens
[TRACE] Response time: 1.2s
```
Langfuse (Production)
```bash
TRACING_MODE=langfuse
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_URL=https://cloud.langfuse.com
```
Get keys from Langfuse:
- Sign up at langfuse.com
- Create a project
- Navigate to Settings → API Keys
- Copy public and secret keys
Disable Tracing
```bash
TRACING_MODE=none
```
Multi-Provider Setup
Switch between providers without code changes:
```bash
# Development: Use Ollama (free, local)
LLM_MODEL=ollama/llama2
OLLAMA_BASE_URL=http://localhost:11434

# Staging: Use OpenAI GPT-3.5 (cheap)
LLM_MODEL=openai/gpt-3.5-turbo
OPENAI_API_KEY=sk-...

# Production: Use Claude Sonnet (best quality)
LLM_MODEL=anthropic/claude-sonnet-4-5-latest
ANTHROPIC_API_KEY=sk-ant-...
```
Your code remains the same:
```scala
val client = Llm4sConfig.provider().flatMap(LLMConnect.getClient)
```
System Properties
Override any setting via JVM flags:
```bash
sbt -DLLM_MODEL=openai/gpt-4o \
  -DOPENAI_API_KEY=sk-... \
  run
```
Or in build.sbt:
```scala
javaOptions ++= Seq(
  "-DLLM_MODEL=openai/gpt-4o",
  s"-DOPENAI_API_KEY=${sys.env.getOrElse("OPENAI_API_KEY", "")}"
)
```
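Note that sbt only applies `javaOptions` to a forked JVM; with the default in-process `run` they are silently ignored:

```scala
// Required for javaOptions to reach the application's JVM
fork := true
```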
Environment-Specific Configurations
Development
Create .env.dev:
```bash
LLM_MODEL=ollama/llama2
OLLAMA_BASE_URL=http://localhost:11434
TRACING_MODE=console
```
Staging
Create .env.staging:
```bash
LLM_MODEL=openai/gpt-3.5-turbo
OPENAI_API_KEY=${STAGING_OPENAI_KEY}
TRACING_MODE=langfuse
LANGFUSE_PUBLIC_KEY=${STAGING_LANGFUSE_PUBLIC}
LANGFUSE_SECRET_KEY=${STAGING_LANGFUSE_SECRET}
```
Production
Create .env.prod:
```bash
LLM_MODEL=anthropic/claude-sonnet-4-5-latest
ANTHROPIC_API_KEY=${PROD_ANTHROPIC_KEY}
TRACING_MODE=langfuse
LANGFUSE_PUBLIC_KEY=${PROD_LANGFUSE_PUBLIC}
LANGFUSE_SECRET_KEY=${PROD_LANGFUSE_SECRET}
```
Load the appropriate file:
```bash
source .env.prod
sbt run
```
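To avoid booting with the wrong file loaded, a small startup guard can help. A sketch under an assumed convention (`APP_ENV` is not an LLM4S setting, just an illustrative variable):

```scala
// Hypothetical guard: require an explicit, known environment name.
val env = sys.env.getOrElse("APP_ENV", "dev")
require(Set("dev", "staging", "prod").contains(env), s"Unknown APP_ENV: $env")
println(s"Starting with the '$env' configuration")
```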
Configuration Best Practices
✅ DO
- Use environment variables for secrets (API keys)
- Use application.conf for non-sensitive defaults
- Add .env to .gitignore to prevent committing secrets
- Use different configs for dev/staging/prod
- Validate configuration at startup
```scala
import org.llm4s.config.Llm4sConfig

val config = Llm4sConfig.provider()
config.fold(
  error => {
    println(s"Configuration error: $error")
    System.exit(1)
  },
  _ => println("Configuration loaded successfully")
)
```
❌ DON’T
- Don’t hardcode API keys in source code
- Don’t commit .env files to version control
- Don’t use System.getenv() directly (use Llm4sConfig)
- Don’t mix production keys in development
- Don’t skip validation - fail fast on bad config
Validation Example
```scala
import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect

object ValidateConfig extends App {
  println("Validating LLM4S configuration...")

  val validation = for {
    providerConfig <- Llm4sConfig.provider()
    client         <- LLMConnect.getClient(providerConfig)
  } yield {
    println(s"✅ Model: ${providerConfig.model}")
    println(s"✅ Provider: ${providerConfig.getClass.getSimpleName}")
    println(s"✅ Client ready: ${client.getClass.getSimpleName}")
  }

  validation match {
    case Right(_) =>
      println("✅ Configuration valid!")
    case Left(error) =>
      println(s"❌ Configuration error: $error")
      System.exit(1)
  }
}
```
Troubleshooting
Problem: “API key not found”
Symptoms:
```scala
Left(ConfigurationError("Missing API key for provider: openai"))
```
Root causes:
- Environment variable not set
- Typo in variable name (case-sensitive)
- `.env` file not loaded in your shell session
- Using wrong shell (check with `echo $SHELL`)
Debug steps:
```bash
# Check if variable is set
echo $OPENAI_API_KEY

# List all LLM4S-related variables
env | grep -E '(LLM_MODEL|OPENAI|ANTHROPIC|AZURE|OLLAMA)'

# Source .env explicitly
source .env
echo $OPENAI_API_KEY   # Should show your key

# Run SBT with explicit env loading
sbt -Dconfig.override_with_env_vars=true run
```
Problem: “Invalid model format”
Symptoms:
```scala
Left(ConfigurationError("Invalid model format: gpt-4o"))
```
Fix: Models must include the provider prefix:
```bash
# ✅ Correct format: provider/model-name
LLM_MODEL=openai/gpt-4o
LLM_MODEL=anthropic/claude-3-5-sonnet-20241022
LLM_MODEL=ollama/llama3.2

# ❌ Wrong - missing provider
LLM_MODEL=gpt-4o
LLM_MODEL=claude-3-5-sonnet
```
Problem: “Configuration not loading from application.conf”
Symptoms: Environment variables work but application.conf is ignored.
Debug:
```scala
import com.typesafe.config.ConfigFactory

val config = ConfigFactory.load()
println(config.hasPath("llm4s.provider"))          // Should be true
println(config.getString("llm4s.provider.model"))  // Should show your model
```
Common causes:
- `application.conf` not in `src/main/resources/`
- HOCON syntax error (missing quotes, wrong nesting)
- Environment variable overriding config (env vars have higher precedence)
Fix: Validate HOCON syntax:
```bash
# Loading the config fails fast on HOCON syntax errors
sbt console
scala> com.typesafe.config.ConfigFactory.load()
```
Problem: “Provider mismatch”
Symptoms:
```scala
import org.llm4s.error.AuthenticationError

Left(AuthenticationError("Invalid API key"))
```
Root cause: Model provider doesn’t match API key provider.
```bash
# ✅ Check your configuration
echo $LLM_MODEL        # Should be openai/...
echo $OPENAI_API_KEY   # Should start with sk-proj-...

# ❌ Common mismatch:
LLM_MODEL=openai/gpt-4o
ANTHROPIC_API_KEY=sk-ant-...   # ❌ Wrong! Need OPENAI_API_KEY
OPENAI_API_KEY=                # ❌ Wrong - key is empty
```
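You can also assert the pairing at startup. A debugging-only sketch (it reads the raw environment, which the best practices above otherwise discourage; the provider-to-key mapping is an assumption based on the variables in this guide):

```scala
// Hypothetical check: the provider prefix in LLM_MODEL must have a matching key.
val provider = sys.env.getOrElse("LLM_MODEL", "").takeWhile(_ != '/')
val requiredKey = provider match {
  case "openai"    => Some("OPENAI_API_KEY")
  case "anthropic" => Some("ANTHROPIC_API_KEY")
  case "azure"     => Some("AZURE_API_KEY")
  case "gemini"    => Some("GOOGLE_API_KEY")
  case "ollama"    => None // no key needed
  case other       => sys.error(s"Unknown provider: '$other'")
}
requiredKey.foreach { key =>
  require(sys.env.get(key).exists(_.nonEmpty), s"$key must be set for provider '$provider'")
}
```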
Problem: “Rate limiting / 429 errors”
Symptoms:
```scala
import org.llm4s.error.RateLimitError

Left(RateLimitError("Rate limit exceeded"))
```
Solutions:
1. **Add retry logic:**

```scala
import org.llm4s.error.RateLimitError
import org.llm4s.types.Result

// Note: This example uses Thread.sleep for simplicity.
// In production, prefer scala.concurrent or cats-effect for non-blocking delays.
def retryWithBackoff[A](op: => Result[A], maxRetries: Int = 3): Result[A] =
  (1 to maxRetries).foldLeft(op) { (result, attempt) =>
    result match {
      case Left(_: RateLimitError) if attempt < maxRetries =>
        Thread.sleep(math.pow(2, attempt).toLong * 1000) // Exponential backoff
        op
      case other => other
    }
  }
```
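A hypothetical call site, reusing the client and message types from earlier examples:

```scala
// Retries only on RateLimitError; other errors pass through unchanged.
val completion = retryWithBackoff(
  client.complete(Conversation(Seq(UserMessage("Summarize this article"))))
)
```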
2. **Use a cheaper model for testing:**
```bash
# Development
LLM_MODEL=openai/gpt-4o-mini # 60x cheaper
# Production
LLM_MODEL=openai/gpt-4o
```

3. **Switch to Ollama for development:**

```bash
LLM_MODEL=ollama/llama3.2   # Free, no rate limits
```
Problem: “Slow responses or timeouts”
Symptoms: Requests take 30+ seconds or timeout.
Solutions:
1. **Increase timeout in `application.conf`:**

```hocon
llm4s {
  request-timeout = 60 seconds   # Default is 30s
}
```

2. **Use streaming for long responses:**

```scala
// Non-streaming: waits for full response
client.complete(Conversation(Seq(UserMessage("Your query"))))

// Streaming: get tokens as they arrive
client.streamComplete(
  Conversation(Seq(UserMessage("Your query")))
) { chunk =>
  chunk.content.foreach(print)
}
```
3. **Reduce response length:**
```scala
client.complete(
  Conversation(Seq(UserMessage("Your query"))),
  CompletionOptions(maxTokens = Some(500)) // Limit response length
)
```
Problem: “OutOfMemoryError with embeddings”
Symptoms: JVM crashes when processing large document collections.
Solutions:
1. **Batch embeddings instead of loading all at once:**

```scala
val documents: List[String] = loadDocuments()
val batchSize = 100

val embeddings = documents.grouped(batchSize).flatMap { batch =>
  embedder.embed(batch) match {
    case Right(emb) => emb
    case Left(err) =>
      println(s"Batch failed: $err")
      List.empty
  }
}.toList
```
2. **Increase JVM heap:**
```bash
export SBT_OPTS="-Xmx4G -Xss4M"
sbt run
```

3. **Use a local embedding model (smaller memory footprint):**

```bash
EMBEDDING_MODEL=ollama/nomic-embed-text
```
Next Steps
Configuration complete! Now you can:
- Explore features → Dive into agents, tools, and more
- Browse examples → See configuration in action
- User guide → Learn core concepts
- Observability → Set up tracing
Additional Resources
- Llm4sConfig API - Detailed API documentation
- Environment Variables Reference - Complete list
- Production Guide - Production best practices
- Discord Community - Get help with configuration
Ready to build? Explore what’s next →