Configuration Guide
Configure LLM4S for multiple providers, environments, and use cases.
Table of contents
- Configuration Hierarchy
- Environment Variables
- Application Configuration (HOCON)
- Provider-Specific Configuration
- Embeddings Configuration
- Tracing Configuration
- Multi-Provider Setup
- System Properties
- Environment-Specific Configurations
- Configuration Best Practices
- Validation Example
- Troubleshooting
- Next Steps
- Additional Resources
Configuration Hierarchy
LLM4S uses a hierarchical configuration system with the following precedence (highest to lowest):
- System properties (`-Dkey=value` JVM flags)
- Environment variables (`.env` file or shell exports)
- `application.conf` (HOCON format in your project)
- `reference.conf` (library defaults)
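LLM4S builds on Typesafe Config (HOCON), so you can observe the layering directly. A minimal sketch using Typesafe Config and a made-up key `llm4s.demo.value` to show a system property winning over file-based config:

```scala
import com.typesafe.config.ConfigFactory

object PrecedenceDemo extends App {
  // Simulate passing -Dllm4s.demo.value=... on the command line
  sys.props("llm4s.demo.value") = "from-system-property"
  ConfigFactory.invalidateCaches() // pick up properties set after JVM start

  // load() layers: system properties > application.conf > reference.conf
  val config = ConfigFactory.load()
  println(config.getString("llm4s.demo.value")) // prints "from-system-property"
}
```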
Environment Variables
The simplest way to configure LLM4S is through environment variables.
Core Settings
```bash
# Required: Model selection (determines provider)
LLM_MODEL=openai/gpt-4o

# Required: API key for your chosen provider
OPENAI_API_KEY=sk-proj-...
```
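The `provider/model` convention is what lets LLM4S route requests. As an illustration only, here is a hypothetical parser for the format (LLM4S does this internally; `parseModel` is not a library function):

```scala
// Hypothetical helper: split an LLM_MODEL value into provider and model name.
def parseModel(spec: String): Either[String, (String, String)] =
  spec.split("/", 2) match {
    case Array(provider, model) if provider.nonEmpty && model.nonEmpty =>
      Right((provider, model))
    case _ =>
      Left(s"Invalid model format: $spec (expected <provider>/<model-name>)")
  }

// parseModel("openai/gpt-4o")  => Right(("openai", "gpt-4o"))
// parseModel("gpt-4o")         => Left("Invalid model format: ...")
```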
Complete Environment Setup
Create a .env file in your project root:
```bash
# ===================
# Model Selection
# ===================
# Format: <provider>/<model-name>
# Supported providers: openai, anthropic, azure, ollama, gemini
LLM_MODEL=openai/gpt-4o

# ===================
# OpenAI Configuration
# ===================
OPENAI_API_KEY=sk-proj-...
OPENAI_BASE_URL=https://api.openai.com/v1   # Optional: Override endpoint
OPENAI_ORGANIZATION=org-...                 # Optional: Organization ID

# ===================
# Anthropic Configuration
# ===================
ANTHROPIC_API_KEY=sk-ant-...
ANTHROPIC_BASE_URL=https://api.anthropic.com   # Optional
ANTHROPIC_VERSION=2023-06-01                   # Optional: API version

# ===================
# Azure OpenAI Configuration
# ===================
AZURE_API_KEY=your-azure-key
AZURE_API_BASE=https://your-resource.openai.azure.com
AZURE_DEPLOYMENT_NAME=gpt-4o
AZURE_API_VERSION=2024-02-15-preview   # Optional

# ===================
# Ollama Configuration
# ===================
OLLAMA_BASE_URL=http://localhost:11434
# No API key needed for Ollama

# ===================
# Google Gemini Configuration
# ===================
GOOGLE_API_KEY=your-google-api-key
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta   # Optional

# ===================
# Tracing Configuration
# ===================
TRACING_MODE=langfuse   # Options: langfuse, console, none

# Langfuse settings (if using TRACING_MODE=langfuse)
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_URL=https://cloud.langfuse.com   # or self-hosted

# ===================
# Embeddings Configuration (RAG)
# ===================
# Unified format: provider/model (recommended)
EMBEDDING_MODEL=openai/text-embedding-3-small   # Uses default base URL
# OPENAI_API_KEY is shared with LLM config above

# Or for Voyage embeddings:
# EMBEDDING_MODEL=voyage/voyage-3
# VOYAGE_API_KEY=pa-...

# Or for local embeddings with Ollama (no API key needed):
# EMBEDDING_MODEL=ollama/nomic-embed-text

# Optional: Override default base URLs
# OPENAI_EMBEDDING_BASE_URL=https://api.openai.com/v1
# VOYAGE_EMBEDDING_BASE_URL=https://api.voyageai.com/v1
# OLLAMA_EMBEDDING_BASE_URL=http://localhost:11434

# ===================
# Workspace Configuration (Optional)
# ===================
WORKSPACE_DIR=/tmp/llm4s-workspace
WORKSPACE_IMAGE=llm4s/workspace-runner:latest
WORKSPACE_PORT=8080
WORKSPACE_CONTAINER_NAME=llm4s-workspace
```
Load Environment Variables
Option 1: Source the file
```bash
source .env
sbt run
```
Option 2: Use sbt-dotenv plugin
Add to project/plugins.sbt:
```scala
addSbtPlugin("au.com.onegeek" %% "sbt-dotenv" % "2.1.233")
```
Variables automatically load when SBT starts!
Option 3: IntelliJ IDEA
- Run → Edit Configurations
- Environment variables → Add from file
- Select your `.env` file
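If you cannot use the plugin (for example, in a plain test harness), a hand-rolled loader is easy to sketch. This is an illustration, not an LLM4S utility; it ignores quoting and inline comments:

```scala
import scala.io.Source
import scala.util.Using

// Minimal .env parser: skips blanks and comments, splits on the first '='.
def loadDotEnv(path: String = ".env"): Map[String, String] =
  Using.resource(Source.fromFile(path)) { src =>
    src.getLines().map(_.trim)
      .filter(line => line.nonEmpty && !line.startsWith("#"))
      .flatMap { line =>
        line.split("=", 2) match {
          case Array(key, value) => Some(key.trim -> value.trim)
          case _                 => None
        }
      }
      .toMap
  }
```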
Application Configuration (HOCON)
For more complex configurations, use application.conf:
Create application.conf
Create src/main/resources/application.conf:
Note: in HOCON the last assignment wins, and a `${?VAR}` substitution is skipped when the variable is unset, so write the hardcoded default first and the optional environment override second:

```hocon
llm {
  # Model configuration
  model = "openai/gpt-4o"   # Default
  model = ${?LLM_MODEL}     # Env var overrides the default when set

  # Generation parameters
  temperature = 0.7
  max-tokens = 2000
  top-p = 1.0

  # Provider configurations
  providers {
    openai {
      api-key = ${?OPENAI_API_KEY}
      base-url = "https://api.openai.com/v1"   # Default
      base-url = ${?OPENAI_BASE_URL}
      organization = ${?OPENAI_ORGANIZATION}
    }

    anthropic {
      api-key = ${?ANTHROPIC_API_KEY}
      base-url = "https://api.anthropic.com"
      base-url = ${?ANTHROPIC_BASE_URL}
      version = "2023-06-01"
      version = ${?ANTHROPIC_VERSION}
    }

    azure {
      api-key = ${?AZURE_API_KEY}
      api-base = ${?AZURE_API_BASE}
      deployment-name = ${?AZURE_DEPLOYMENT_NAME}
      api-version = "2024-02-15-preview"
      api-version = ${?AZURE_API_VERSION}
    }

    ollama {
      base-url = "http://localhost:11434"
      base-url = ${?OLLAMA_BASE_URL}
    }
  }
}

# Tracing configuration
tracing {
  mode = "none"   # Default: disabled
  mode = ${?TRACING_MODE}

  langfuse {
    public-key = ${?LANGFUSE_PUBLIC_KEY}
    secret-key = ${?LANGFUSE_SECRET_KEY}
    url = "https://cloud.langfuse.com"
    url = ${?LANGFUSE_URL}
  }
}

# Context window management
context {
  max-messages = 50
  preserve-system-message = true
  pruning-strategy = "oldest-first"   # oldest-first, middle-out, recent-turns
}
```
Access Configuration in Code
```scala
import org.llm4s.config.Llm4sConfig

// Load provider-specific config
val providerConfig = Llm4sConfig.provider()

// Load tracing config
val tracingConfig = Llm4sConfig.tracing()

// Load embeddings config
val embeddingsConfig = Llm4sConfig.embeddings()
```
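Putting it together: a minimal end-to-end sketch, assuming the message types live in `org.llm4s.llmconnect.model` (matching the other examples in this guide):

```scala
import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect
import org.llm4s.llmconnect.model.{Conversation, UserMessage}

// Load config, build a client, and run a single completion; every step
// returns a Result, so errors short-circuit the for-comprehension.
val response = for {
  providerConfig <- Llm4sConfig.provider()
  client         <- LLMConnect.getClient(providerConfig)
  completion     <- client.complete(Conversation(Seq(UserMessage("Hello!"))))
} yield completion

response.fold(
  error => println(s"Request failed: $error"),
  done  => println(done)
)
```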
Provider-Specific Configuration
OpenAI
```bash
# Model selection
LLM_MODEL=openai/gpt-4o          # Latest GPT-4o
LLM_MODEL=openai/gpt-4-turbo     # GPT-4 Turbo
LLM_MODEL=openai/gpt-3.5-turbo   # GPT-3.5

# Required
OPENAI_API_KEY=sk-proj-...

# Optional
OPENAI_BASE_URL=https://api.openai.com/v1
OPENAI_ORGANIZATION=org-...
```
Available Models:
- `gpt-4o` - Latest GPT-4 Optimized
- `gpt-4-turbo` - Fast GPT-4
- `gpt-4` - Standard GPT-4
- `gpt-3.5-turbo` - Fast and cheap
Anthropic
```bash
# Model selection
LLM_MODEL=anthropic/claude-sonnet-4-5-latest
LLM_MODEL=anthropic/claude-opus-4-5-latest

# Required
ANTHROPIC_API_KEY=sk-ant-...

# Optional
ANTHROPIC_BASE_URL=https://api.anthropic.com
ANTHROPIC_VERSION=2023-06-01
```
Available Models:
- `claude-sonnet-4-5-latest` - Latest Claude 4.5 Sonnet
- `claude-opus-4-5-latest` - Most capable Claude 4.5 Opus
- `claude-3-haiku-latest` - Fastest
Azure OpenAI
```bash
# Model selection
LLM_MODEL=azure/gpt-4o

# Required
AZURE_API_KEY=your-azure-key
AZURE_API_BASE=https://your-resource.openai.azure.com
AZURE_DEPLOYMENT_NAME=gpt-4o   # Your deployment name

# Optional
AZURE_API_VERSION=2024-02-15-preview
```
Ollama (Local Models)
```bash
# Model selection
LLM_MODEL=ollama/llama2
LLM_MODEL=ollama/mistral
LLM_MODEL=ollama/codellama

# Required
OLLAMA_BASE_URL=http://localhost:11434

# No API key needed!
```
Install Ollama:
```bash
# Install
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull llama2

# Start server
ollama serve
```
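To verify the server is up before pointing LLM4S at it, you can hit Ollama's REST API directly. A sketch using the plain JDK HTTP client (not an LLM4S API; `GET /api/tags` lists locally pulled models):

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}

// Health check against the local Ollama server.
val request  = HttpRequest.newBuilder(URI.create("http://localhost:11434/api/tags")).GET().build()
val response = HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
println(s"Ollama responded ${response.statusCode()}: ${response.body().take(200)}")
```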
Google Gemini
```bash
# Model selection
LLM_MODEL=gemini/gemini-2.0-flash
LLM_MODEL=gemini/gemini-1.5-pro
LLM_MODEL=gemini/gemini-1.5-flash

# Required
GOOGLE_API_KEY=your-google-api-key

# Optional
GEMINI_BASE_URL=https://generativelanguage.googleapis.com/v1beta
```
Available Models:
- `gemini-2.0-flash` - Latest Gemini 2.0 (fast, multimodal)
- `gemini-2.5-flash` - Gemini 2.5 Flash (thinking model)
- `gemini-1.5-pro` - High capability (1M token context)
- `gemini-1.5-flash` - Fast and efficient (1M token context)
Get API Key:
- Go to Google AI Studio
- Click “Get API Key”
- Create or select a project
- Copy the generated API key
Embeddings Configuration
LLM4S supports multiple embedding providers for RAG (Retrieval-Augmented Generation) workflows.
Unified Format (Recommended)
Use EMBEDDING_MODEL with the format provider/model-name, similar to LLM_MODEL:
```bash
# OpenAI embeddings
EMBEDDING_MODEL=openai/text-embedding-3-small
OPENAI_API_KEY=sk-...   # Same key as LLM

# Voyage AI embeddings
EMBEDDING_MODEL=voyage/voyage-3
VOYAGE_API_KEY=pa-...

# Ollama embeddings (local, no API key needed)
EMBEDDING_MODEL=ollama/nomic-embed-text
```
Default base URLs are used automatically:
- OpenAI: `https://api.openai.com/v1`
- Voyage: `https://api.voyageai.com/v1`
- Ollama: `http://localhost:11434`
Override base URLs if needed:
```bash
EMBEDDING_MODEL=openai/text-embedding-3-small
OPENAI_EMBEDDING_BASE_URL=https://custom.openai.com/v1
```
Available Models
OpenAI:
- `text-embedding-3-small` - Fast, cost-effective (1536 dimensions)
- `text-embedding-3-large` - Higher quality (3072 dimensions)
- `text-embedding-ada-002` - Legacy model (1536 dimensions)
Voyage AI:
- `voyage-3` - General purpose
- `voyage-3-large` - Higher quality
- `voyage-code-2` - Code-optimized
Ollama (local):
- `nomic-embed-text` - General purpose (768 dimensions)
- `mxbai-embed-large` - Higher quality (1024 dimensions)
- `all-minilm` - Lightweight (384 dimensions)
Install Ollama embedding model:
```bash
ollama pull nomic-embed-text
```
Legacy Format (Still Supported)
The legacy format using EMBEDDING_PROVIDER is still supported for backward compatibility:
```bash
# OpenAI
EMBEDDING_PROVIDER=openai
OPENAI_EMBEDDING_BASE_URL=https://api.openai.com/v1
OPENAI_EMBEDDING_MODEL=text-embedding-3-small
OPENAI_API_KEY=sk-...

# Voyage AI
EMBEDDING_PROVIDER=voyage
VOYAGE_EMBEDDING_BASE_URL=https://api.voyageai.com/v1
VOYAGE_EMBEDDING_MODEL=voyage-3
VOYAGE_API_KEY=pa-...

# Ollama
EMBEDDING_PROVIDER=ollama
OLLAMA_EMBEDDING_BASE_URL=http://localhost:11434
OLLAMA_EMBEDDING_MODEL=nomic-embed-text
```
Using Embeddings in Code
```scala
import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.EmbeddingClient
import org.llm4s.llmconnect.config.EmbeddingModelConfig
import org.llm4s.agent.memory.LLMEmbeddingService

// Create embedding client from typed config
val clientResult = for {
  (provider, cfg) <- Llm4sConfig.embeddings()
  client          <- EmbeddingClient.from(provider, cfg)
} yield client

// Or use the higher-level service
val embeddingServiceResult = for {
  client <- clientResult
  model  <- Llm4sConfig.textEmbeddingModel()
} yield LLMEmbeddingService(client, EmbeddingModelConfig(model.modelName, model.dimensions))
```
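Once you have vectors, ranking retrieved chunks in a RAG pipeline usually comes down to cosine similarity. A self-contained helper (pure math, independent of any LLM4S API):

```scala
// Cosine similarity between two embedding vectors; 1.0 = identical direction.
def cosine(a: Seq[Double], b: Seq[Double]): Double = {
  require(a.length == b.length, "vectors must have the same dimension")
  val dot   = a.zip(b).map { case (x, y) => x * y }.sum
  val normA = math.sqrt(a.map(x => x * x).sum)
  val normB = math.sqrt(b.map(x => x * x).sum)
  if (normA == 0.0 || normB == 0.0) 0.0 else dot / (normA * normB)
}
```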
System Properties (Alternative)
When running via sbt, use system properties for reliable configuration:
```bash
# Using unified format (recommended)
sbt -Dllm4s.embeddings.model=ollama/nomic-embed-text \
  "run"

# Or with legacy format
sbt -Dllm4s.embeddings.provider=ollama \
  -Dllm4s.embeddings.ollama.model=nomic-embed-text \
  "run"
```
Tracing Configuration
Console Tracing (Development)
```bash
TRACING_MODE=console
```
Output appears in stdout:
```
[TRACE] Completion request: UserMessage(What is Scala?)
[TRACE] Token usage: 150 tokens
[TRACE] Response time: 1.2s
```
Langfuse (Production)
```bash
TRACING_MODE=langfuse
LANGFUSE_PUBLIC_KEY=pk-lf-...
LANGFUSE_SECRET_KEY=sk-lf-...
LANGFUSE_URL=https://cloud.langfuse.com
```
Get keys from Langfuse:
- Sign up at langfuse.com
- Create a project
- Navigate to Settings → API Keys
- Copy public and secret keys
Disable Tracing
```bash
TRACING_MODE=none
```
Multi-Provider Setup
Switch between providers without code changes:
```bash
# Development: Use Ollama (free, local)
LLM_MODEL=ollama/llama2
OLLAMA_BASE_URL=http://localhost:11434

# Staging: Use OpenAI GPT-3.5 (cheap)
LLM_MODEL=openai/gpt-3.5-turbo
OPENAI_API_KEY=sk-...

# Production: Use Claude Sonnet (best quality)
LLM_MODEL=anthropic/claude-sonnet-4-5-latest
ANTHROPIC_API_KEY=sk-ant-...
```
Your code remains the same:
```scala
val client = Llm4sConfig.provider().flatMap(LLMConnect.getClient)
```
System Properties
Override any setting via JVM flags:
```bash
sbt -DLLM_MODEL=openai/gpt-4o \
  -DOPENAI_API_KEY=sk-... \
  run
```
Or in build.sbt:
```scala
javaOptions ++= Seq(
  "-DLLM_MODEL=openai/gpt-4o",
  s"-DOPENAI_API_KEY=${sys.env.getOrElse("OPENAI_API_KEY", "")}"
)
```
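Note that sbt only applies `javaOptions` to a forked JVM; with the default in-process `run` they are silently ignored:

```scala
// Required for javaOptions to reach the application's JVM
fork := true
```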
Environment-Specific Configurations
Development
Create .env.dev:
```bash
LLM_MODEL=ollama/llama2
OLLAMA_BASE_URL=http://localhost:11434
TRACING_MODE=console
```
Staging
Create .env.staging:
```bash
LLM_MODEL=openai/gpt-3.5-turbo
OPENAI_API_KEY=${STAGING_OPENAI_KEY}
TRACING_MODE=langfuse
LANGFUSE_PUBLIC_KEY=${STAGING_LANGFUSE_PUBLIC}
LANGFUSE_SECRET_KEY=${STAGING_LANGFUSE_SECRET}
```
Production
Create .env.prod:
```bash
LLM_MODEL=anthropic/claude-sonnet-4-5-latest
ANTHROPIC_API_KEY=${PROD_ANTHROPIC_KEY}
TRACING_MODE=langfuse
LANGFUSE_PUBLIC_KEY=${PROD_LANGFUSE_PUBLIC}
LANGFUSE_SECRET_KEY=${PROD_LANGFUSE_SECRET}
```
Load the appropriate file:
```bash
source .env.prod
sbt run
```
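To avoid booting with the wrong file loaded, a small startup guard can help. A sketch under an assumed convention (`APP_ENV` is not an LLM4S setting, just an illustrative variable):

```scala
// Hypothetical guard: require an explicit, known environment name.
val env = sys.env.getOrElse("APP_ENV", "dev")
require(Set("dev", "staging", "prod").contains(env), s"Unknown APP_ENV: $env")
println(s"Starting with the '$env' configuration")
```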
Configuration Best Practices
✅ DO
- Use environment variables for secrets (API keys)
- Use application.conf for non-sensitive defaults
- Add .env to .gitignore to prevent committing secrets
- Use different configs for dev/staging/prod
- Validate configuration at startup
```scala
import org.llm4s.config.Llm4sConfig

val config = Llm4sConfig.provider()
config.fold(
  error => {
    println(s"Configuration error: $error")
    System.exit(1)
  },
  _ => println("Configuration loaded successfully")
)
```
❌ DON’T
- Don’t hardcode API keys in source code
- Don’t commit .env files to version control
- Don’t use System.getenv() directly (use Llm4sConfig)
- Don’t mix production keys in development
- Don’t skip validation - fail fast on bad config
Validation Example
```scala
import org.llm4s.config.Llm4sConfig
import org.llm4s.llmconnect.LLMConnect

object ValidateConfig extends App {
  println("Validating LLM4S configuration...")

  val validation = for {
    providerConfig <- Llm4sConfig.provider()
    client         <- LLMConnect.getClient(providerConfig)
  } yield {
    println(s"✅ Model: ${providerConfig.model}")
    println(s"✅ Provider: ${providerConfig.getClass.getSimpleName}")
    println(s"✅ Client ready: ${client.getClass.getSimpleName}")
  }

  validation match {
    case Right(_) =>
      println("✅ Configuration valid!")
    case Left(error) =>
      println(s"❌ Configuration error: $error")
      System.exit(1)
  }
}
```
Troubleshooting
Problem: “API key not found”
Symptoms:
```scala
Left(ConfigurationError("Missing API key for provider: openai"))
```
Root causes:
- Environment variable not set
- Typo in variable name (case-sensitive)
- `.env` file not loaded in your shell session
- Using wrong shell (check with `echo $SHELL`)
Debug steps:
```bash
# Check if variable is set
echo $OPENAI_API_KEY

# List all LLM4S-related variables
env | grep -E '(LLM_MODEL|OPENAI|ANTHROPIC|AZURE|OLLAMA)'

# Source .env explicitly
source .env
echo $OPENAI_API_KEY   # Should show your key

# Run SBT with explicit env loading
sbt -Dconfig.override_with_env_vars=true run
```
Problem: “Invalid model format”
Symptoms:
```scala
Left(ConfigurationError("Invalid model format: gpt-4o"))
```
Fix: Models must include the provider prefix:
```bash
# ✅ Correct format: provider/model-name
LLM_MODEL=openai/gpt-4o
LLM_MODEL=anthropic/claude-3-5-sonnet-20241022
LLM_MODEL=ollama/llama3.2

# ❌ Wrong - missing provider
LLM_MODEL=gpt-4o
LLM_MODEL=claude-3-5-sonnet
```
Problem: “Configuration not loading from application.conf”
Symptoms: Environment variables work but application.conf is ignored.
Debug:
```scala
import com.typesafe.config.ConfigFactory

val config = ConfigFactory.load()
println(config.hasPath("llm4s.provider"))          // Should be true
println(config.getString("llm4s.provider.model"))  // Should show your model
```
Common causes:
- `application.conf` not in `src/main/resources/`
- HOCON syntax error (missing quotes, wrong nesting)
- Environment variable overriding config (env vars have higher precedence)
Fix: Validate HOCON syntax:
```bash
# Loading the config fails fast on HOCON syntax errors
sbt console
scala> com.typesafe.config.ConfigFactory.load()
```
Problem: “Provider mismatch”
Symptoms:
```scala
import org.llm4s.error.AuthenticationError

Left(AuthenticationError("Invalid API key"))
```
Root cause: Model provider doesn’t match API key provider.
```bash
# ✅ Check your configuration
echo $LLM_MODEL        # Should be openai/...
echo $OPENAI_API_KEY   # Should start with sk-proj-...

# ❌ Common mismatch:
LLM_MODEL=openai/gpt-4o
ANTHROPIC_API_KEY=sk-ant-...   # ❌ Wrong! Need OPENAI_API_KEY
OPENAI_API_KEY=                # ❌ Wrong - key is empty
```
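You can also assert the pairing at startup. A debugging-only sketch (it reads the raw environment, which the best practices above otherwise discourage; the provider-to-key mapping is an assumption based on the variables in this guide):

```scala
// Hypothetical check: the provider prefix in LLM_MODEL must have a matching key.
val provider = sys.env.getOrElse("LLM_MODEL", "").takeWhile(_ != '/')
val requiredKey = provider match {
  case "openai"    => Some("OPENAI_API_KEY")
  case "anthropic" => Some("ANTHROPIC_API_KEY")
  case "azure"     => Some("AZURE_API_KEY")
  case "gemini"    => Some("GOOGLE_API_KEY")
  case "ollama"    => None // no key needed
  case other       => sys.error(s"Unknown provider: '$other'")
}
requiredKey.foreach { key =>
  require(sys.env.get(key).exists(_.nonEmpty), s"$key must be set for provider '$provider'")
}
```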
Problem: “Rate limiting / 429 errors”
Symptoms:
```scala
import org.llm4s.error.RateLimitError

Left(RateLimitError("Rate limit exceeded"))
```
Solutions:
1. **Add retry logic:**

```scala
import org.llm4s.error.RateLimitError
import org.llm4s.types.Result

// Note: This example uses Thread.sleep for simplicity.
// In production, prefer scala.concurrent or cats-effect for non-blocking delays.
def retryWithBackoff[A](op: => Result[A], maxRetries: Int = 3): Result[A] =
  (1 to maxRetries).foldLeft(op) { (result, attempt) =>
    result match {
      case Left(_: RateLimitError) if attempt < maxRetries =>
        Thread.sleep(math.pow(2, attempt).toLong * 1000) // Exponential backoff
        op
      case other => other
    }
  }
```
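A hypothetical call site, reusing the client and message types from earlier examples:

```scala
// Retries only on RateLimitError; other errors pass through unchanged.
val completion = retryWithBackoff(
  client.complete(Conversation(Seq(UserMessage("Summarize this article"))))
)
```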
2. **Use a cheaper model for testing:**
```bash
# Development
LLM_MODEL=openai/gpt-4o-mini # 60x cheaper
# Production
LLM_MODEL=openai/gpt-4o
```

3. **Switch to Ollama for development:**

```bash
LLM_MODEL=ollama/llama3.2   # Free, no rate limits
```
Problem: “Slow responses or timeouts”
Symptoms: Requests take 30+ seconds or timeout.
Solutions:
1. **Increase timeout in `application.conf`:**

```hocon
llm4s {
  request-timeout = 60 seconds   # Default is 30s
}
```

2. **Use streaming for long responses:**

```scala
// Non-streaming: waits for full response
client.complete(Conversation(Seq(UserMessage("Your query"))))

// Streaming: get tokens as they arrive
client.streamComplete(
  Conversation(Seq(UserMessage("Your query")))
) { chunk =>
  chunk.content.foreach(print)
}
```
3. **Reduce response length:**
```scala
client.complete(
  Conversation(Seq(UserMessage("Your query"))),
  CompletionOptions(maxTokens = Some(500)) // Limit response length
)
```
Problem: “OutOfMemoryError with embeddings”
Symptoms: JVM crashes when processing large document collections.
Solutions:
1. **Batch embeddings instead of loading all at once:**

```scala
val documents: List[String] = loadDocuments()
val batchSize = 100

val embeddings = documents.grouped(batchSize).flatMap { batch =>
  embedder.embed(batch) match {
    case Right(emb) => emb
    case Left(err) =>
      println(s"Batch failed: $err")
      List.empty
  }
}.toList
```
2. **Increase JVM heap:**
```bash
export SBT_OPTS="-Xmx4G -Xss4M"
sbt run
```

3. **Use a local embedding model (smaller memory footprint):**

```bash
EMBEDDING_MODEL=ollama/nomic-embed-text
```
Next Steps
Configuration complete! Now you can:
- Explore features → Dive into agents, tools, and more
- Browse examples → See configuration in action
- User guide → Learn core concepts
- Observability → Set up tracing
Additional Resources
- Llm4sConfig API - Detailed API documentation
- Environment Variables Reference - Complete list
- Production Guide - Production best practices
- Discord Community - Get help with configuration
Ready to build? Explore what’s next →