ConversationTokenCounter

org.llm4s.context.ConversationTokenCounter
See the ConversationTokenCounter companion class

Factory methods for creating ConversationTokenCounter instances.

Provides model-aware counter creation that automatically selects the appropriate tokenizer based on the model name. Supports OpenAI, Anthropic, Azure, and Ollama models.

==Tokenizer Selection==

Different models use different tokenization schemes:

  • '''GPT-4o, o1''': Uses the o200k_base tokenizer
  • '''GPT-4, GPT-3.5''': Uses the cl100k_base tokenizer
  • '''Claude models''': Uses a cl100k_base approximation (counts may differ by 20-30%)
  • '''Ollama models''': Uses a cl100k_base approximation

Attributes

Example
// Model-aware creation (recommended)
val counter = ConversationTokenCounter.forModel("openai/gpt-4o")
// Direct tokenizer selection
val openAICounter = ConversationTokenCounter.openAI()
val gpt4oCounter = ConversationTokenCounter.openAI_o200k()
Companion
class ConversationTokenCounter
Supertypes
class Object
trait Matchable
class Any
Self type
ConversationTokenCounter.type

Members list

Value members

Concrete methods

Create a token counter for a specific tokenizer.

Value parameters

tokenizerId

The tokenizer to use (e.g., TokenizerId.CL100K_BASE)

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable
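Example (sketch). The factory's name is not shown on this page, so fromTokenizer below is a hypothetical stand-in for it; the import path for TokenizerId and the Either-style handling of Result are also assumptions.

import org.llm4s.context.ConversationTokenCounter
import org.llm4s.context.tokens.TokenizerId // assumed location of TokenizerId

// fromTokenizer is a hypothetical name for the tokenizer-specific factory.
// Result is assumed to be Either-like: Right(counter) on success, Left(error) otherwise.
ConversationTokenCounter.fromTokenizer(TokenizerId.CL100K_BASE) match {
  case Right(counter) => println(s"Counter ready: $counter")
  case Left(error)    => println(s"Tokenizer unavailable: $error")
}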

def forModel(modelName: String): Result[ConversationTokenCounter]

Create a token counter for a specific model name with automatic tokenizer selection.

This is the '''recommended''' way to create token counters as it automatically selects the appropriate tokenizer based on the model name and provider.

The model name should be in the format provider/model-name (e.g., openai/gpt-4o, anthropic/claude-3-sonnet). Plain model names are also supported.

Value parameters

modelName

The model identifier (e.g., "gpt-4o", "openai/gpt-4o", "claude-3-sonnet")

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable

See also

tokens.TokenizerMapping for the full model-to-tokenizer mapping
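Example (sketch). A minimal way to consume the returned Result, assuming it is an Either-style type with the error on the Left and the counter on the Right.

import org.llm4s.context.ConversationTokenCounter

// Result is assumed to be Either-like.
ConversationTokenCounter.forModel("anthropic/claude-3-sonnet") match {
  case Right(counter) => println(s"Counter created (cl100k_base approximation): $counter")
  case Left(error)    => println(s"Could not create a counter: $error")
}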

def openAI(): Result[ConversationTokenCounter]

Create a token counter using the OpenAI cl100k_base tokenizer.

Suitable for GPT-4, GPT-3.5-turbo, and most embedding models. This is the most common OpenAI tokenizer and a reasonable fallback for unknown models.

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable
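Example (sketch). Explicitly selecting the cl100k_base counter rather than going through forModel; Result is assumed to be Either-like.

import org.llm4s.context.ConversationTokenCounter

// cl100k_base covers GPT-4 / GPT-3.5-turbo and is a reasonable default for unknown models.
ConversationTokenCounter.openAI() match {
  case Right(counter) => println(s"cl100k_base counter: $counter")
  case Left(error)    => println(s"Tokenizer unavailable: $error")
}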

def openAI_o200k(): Result[ConversationTokenCounter]

Create a token counter using the OpenAI o200k_base tokenizer.

Suitable for GPT-4o and o1 series models, which use this newer tokenizer with a larger vocabulary (200k vs 100k tokens).

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable
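Example (sketch). Choosing between the two OpenAI tokenizers by model family when not using forModel; the model-name check is illustrative only, and Result is assumed to be Either-like.

import org.llm4s.context.ConversationTokenCounter

val modelName = "gpt-4o-mini" // illustrative model name

// GPT-4o and o1 use o200k_base; older GPT models use cl100k_base.
val counter =
  if (modelName.startsWith("gpt-4o") || modelName.startsWith("o1"))
    ConversationTokenCounter.openAI_o200k() // 200k-token vocabulary
  else
    ConversationTokenCounter.openAI() // cl100k_base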