ConversationTokenCounter

org.llm4s.context.ConversationTokenCounter
See the ConversationTokenCounter companion class

Factory methods for creating ConversationTokenCounter instances.

Provides model-aware counter creation that automatically selects the appropriate tokenizer based on the model name. Supports OpenAI, Anthropic, Azure, and Ollama models.

==Tokenizer Selection==

Different models use different tokenization schemes:

  • '''GPT-4o, o1''': Uses the o200k_base tokenizer
  • '''GPT-4, GPT-3.5''': Uses the cl100k_base tokenizer
  • '''Claude models''': Uses a cl100k_base approximation (counts may differ by 20-30%)
  • '''Ollama models''': Uses a cl100k_base approximation

Attributes

Example
// Model-aware creation (recommended)
val counter = ConversationTokenCounter.forModel("openai/gpt-4o")
// Direct tokenizer selection
val openAICounter = ConversationTokenCounter.openAI()
val gpt4oCounter = ConversationTokenCounter.openAI_o200k()
Companion
class ConversationTokenCounter
Supertypes
class Object
trait Matchable
class Any
Self type
ConversationTokenCounter.type

Members list

Value members

Concrete methods

Create a token counter for a specific tokenizer.

Value parameters

tokenizerId

The tokenizer to use (e.g., TokenizerId.CL100K_BASE)

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable
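Example (sketch). The factory's name is not shown on this page, so fromTokenizer below is a hypothetical stand-in for it; the import path for TokenizerId and the Either-style handling of Result are also assumptions.

import org.llm4s.context.ConversationTokenCounter
import org.llm4s.context.tokens.TokenizerId // assumed location of TokenizerId

// fromTokenizer is a hypothetical name for the tokenizer-specific factory.
// Result is assumed to be Either-like: Right(counter) on success, Left(error) otherwise.
ConversationTokenCounter.fromTokenizer(TokenizerId.CL100K_BASE) match {
  case Right(counter) => println(s"Counter ready: $counter")
  case Left(error)    => println(s"Tokenizer unavailable: $error")
}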

def forModel(modelName: String): Result[ConversationTokenCounter]

Create a token counter for a specific model name with automatic tokenizer selection.

This is the '''recommended''' way to create token counters as it automatically selects the appropriate tokenizer based on the model name and provider.

The model name should be in the format provider/model-name (e.g., openai/gpt-4o, anthropic/claude-3-sonnet). Plain model names are also supported.

Value parameters

modelName

The model identifier (e.g., "gpt-4o", "openai/gpt-4o", "claude-3-sonnet")

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable

See also

tokens.TokenizerMapping for the full model-to-tokenizer mapping
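Example (sketch). A minimal way to consume the returned Result, assuming it is an Either-style type with the error on the Left and the counter on the Right.

import org.llm4s.context.ConversationTokenCounter

// Result is assumed to be Either-like.
ConversationTokenCounter.forModel("anthropic/claude-3-sonnet") match {
  case Right(counter) => println(s"Counter created (cl100k_base approximation): $counter")
  case Left(error)    => println(s"Could not create a counter: $error")
}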

def openAI(): Result[ConversationTokenCounter]

Create a token counter using the OpenAI cl100k_base tokenizer.

Suitable for GPT-4, GPT-3.5-turbo, and most embedding models. This is the most common OpenAI tokenizer and a reasonable fallback for unknown models.

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable
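Example (sketch). Explicitly selecting the cl100k_base counter rather than going through forModel; Result is assumed to be Either-like.

import org.llm4s.context.ConversationTokenCounter

// cl100k_base covers GPT-4 / GPT-3.5-turbo and is a reasonable default for unknown models.
ConversationTokenCounter.openAI() match {
  case Right(counter) => println(s"cl100k_base counter: $counter")
  case Left(error)    => println(s"Tokenizer unavailable: $error")
}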

def openAI_o200k(): Result[ConversationTokenCounter]

Create a token counter using the OpenAI o200k_base tokenizer.

Suitable for GPT-4o and o1 series models, which use this newer tokenizer with a larger vocabulary (200k vs 100k tokens).

Attributes

Returns

A Result containing the counter, or an error if the tokenizer is unavailable
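Example (sketch). Choosing between the two OpenAI tokenizers by model family when not using forModel; the model-name check is illustrative only, and Result is assumed to be Either-like.

import org.llm4s.context.ConversationTokenCounter

val modelName = "gpt-4o-mini" // illustrative model name

// GPT-4o and o1 use o200k_base; older GPT models use cl100k_base.
val counter =
  if (modelName.startsWith("gpt-4o") || modelName.startsWith("o1"))
    ConversationTokenCounter.openAI_o200k() // 200k-token vocabulary
  else
    ConversationTokenCounter.openAI() // cl100k_base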