A unified Python library for counting tokens across multiple LLM providers. Tokemon provides a simple, consistent interface to count tokens for OpenAI, Anthropic, Google AI, and xAI models.
- Unified API for token counting across multiple providers
- Support for both synchronous and asynchronous operations
- Dynamic model discovery via provider APIs
- Type-safe responses returned as a dataclass
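
The "unified API" idea in the list above can be pictured with a small, self-contained sketch. This is not Tokemon's implementation: TokenCounter, CountResult, WhitespaceCounter, and CharChunkCounter are hypothetical names invented for illustration, and the two toy backends do not resemble real tokenizers. The sketch only shows why a shared method and a shared response shape let calling code stay the same when the backend changes.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass
class CountResult:
    """Hypothetical provider-independent response (illustration only)."""
    input_tokens: Optional[int]
    model: str
    provider: str


class TokenCounter(Protocol):
    """Hypothetical interface: every backend exposes the same method."""

    def count_tokens(self, text: str) -> CountResult: ...


class WhitespaceCounter:
    """Toy backend that counts whitespace-separated words."""

    def __init__(self, model: str) -> None:
        self.model = model

    def count_tokens(self, text: str) -> CountResult:
        return CountResult(len(text.split()), self.model, "toy-whitespace")


class CharChunkCounter:
    """Toy backend that counts fixed-size chunks of four characters."""

    def __init__(self, model: str) -> None:
        self.model = model

    def count_tokens(self, text: str) -> CountResult:
        chunks = -(-len(text) // 4)  # ceiling division
        return CountResult(chunks, self.model, "toy-chunks")


def report(counter: TokenCounter, text: str) -> str:
    """Calling code depends only on the shared interface."""
    result = counter.count_tokens(text)
    return f"{result.provider}/{result.model}: {result.input_tokens}"
```

Passing either toy backend to report() works without any change to report() itself, which is the property a unified counting interface is meant to provide.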

pip install tokemon

from tokemon import tokemon, ProviderName, Mode
# Create a tokenizer for your preferred provider
tokenizer = tokemon(
    model="gpt-4o",
    provider=ProviderName.OPENAI.value,
    mode=Mode.SYNC,
)
# Count tokens
response = tokenizer.count_tokens("Hello, world!")
print(response.input_tokens)  # Number of tokens
print(response.model)  # Model name
print(response.provider)  # Provider name

OpenAI tokenization uses tiktoken and works offline without an API key.

from tokemon import tokemon, ProviderName, Mode
tokenizer = tokemon(
    model="gpt-4o",
    provider=ProviderName.OPENAI.value,
    mode=Mode.SYNC,
)
response = tokenizer.count_tokens("Hello, world!")
print(f"Token count: {response.input_tokens}")

Note: Set the ANTHROPIC_API_KEY environment variable before using the Anthropic provider.

export ANTHROPIC_API_KEY="your-api-key"

from tokemon import tokemon, ProviderName, Mode
tokenizer = tokemon(
    model="claude-sonnet-4-5",
    provider=ProviderName.ANTHROPIC.value,
    mode=Mode.SYNC,
)
response = tokenizer.count_tokens("Hello, world!")
print(f"Token count: {response.input_tokens}")

Async example:

import asyncio
from tokemon import tokemon, ProviderName, Mode

async def main():
    tokenizer = tokemon(
        model="claude-sonnet-4-5",
        provider=ProviderName.ANTHROPIC.value,
        mode=Mode.ASYNC,
    )
    response = await tokenizer.count_tokens("Hello, world!")
    print(f"Token count: {response.input_tokens}")

asyncio.run(main())

Note: Set the GEMINI_API_KEY environment variable before using the Google AI provider.

export GEMINI_API_KEY="your-api-key"

from tokemon import tokemon, ProviderName, Mode
tokenizer = tokemon(
    model="gemini-2.5-flash",
    provider=ProviderName.GOOGLE.value,
    mode=Mode.SYNC,
)
response = tokenizer.count_tokens("Hello, world!")
print(f"Token count: {response.input_tokens}")

Async example:

import asyncio
from tokemon import tokemon, ProviderName, Mode

async def main():
    tokenizer = tokemon(
        model="gemini-2.5-flash",
        provider=ProviderName.GOOGLE.value,
        mode=Mode.ASYNC,
    )
    response = await tokenizer.count_tokens("Hello, world!")
    print(f"Token count: {response.input_tokens}")

asyncio.run(main())

Note: Set the XAI_API_KEY environment variable before using the xAI provider.

export XAI_API_KEY="your-api-key"

from tokemon import tokemon, ProviderName, Mode
tokenizer = tokemon(
    model="grok-3",
    provider=ProviderName.XAI.value,
    mode=Mode.SYNC,
)
response = tokenizer.count_tokens("Hello, world!")
print(f"Token count: {response.input_tokens}")

Async example:

import asyncio
from tokemon import tokemon, ProviderName, Mode

async def main():
    tokenizer = tokemon(
        model="grok-3",
        provider=ProviderName.XAI.value,
        mode=Mode.ASYNC,
    )
    response = await tokenizer.count_tokens("Hello, world!")
    print(f"Token count: {response.input_tokens}")

asyncio.run(main())

Use tokemon_models() to discover models supported by each provider at runtime:

from tokemon import tokemon_models, ProviderName, Mode
# Get a provider instance
provider = tokemon_models(
    provider=ProviderName.OPENAI.value,
    mode=Mode.SYNC,
)
# List available models
models = provider.models()
print(models)

Async example:

import asyncio
from tokemon import tokemon_models, ProviderName, Mode

async def main():
    provider = tokemon_models(
        provider=ProviderName.ANTHROPIC.value,
        mode=Mode.ASYNC,
    )
    models = await provider.models()
    print(models)

asyncio.run(main())

The count_tokens method returns a TokenizerResponse dataclass:

@dataclass
class TokenizerResponse:
    input_tokens: int | None  # Number of tokens in the input
    model: str  # Model name used for tokenization
    provider: str  # Provider name (openai, anthropic, google, xai)

- Python >= 3.10
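
Because input_tokens is typed as int | None in TokenizerResponse, calling code may want to handle a missing count explicitly. The sketch below is one way to do that under stated assumptions: it defines a local stand-in dataclass with the same three fields so it runs without Tokemon installed, and the helpers summarize() and fits_in_budget() are hypothetical names, not part of the library. The README does not say when None is returned, so the sketch treats it simply as "count unknown".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TokenizerResponse:
    """Local stand-in mirroring the documented fields (not the library class)."""
    input_tokens: Optional[int]
    model: str
    provider: str


def summarize(response: TokenizerResponse) -> str:
    """Format a response, making a missing count visible instead of printing None."""
    if response.input_tokens is None:
        count = "unknown"
    else:
        count = str(response.input_tokens)
    return f"{response.provider}:{response.model} -> {count} tokens"


def fits_in_budget(response: TokenizerResponse, budget: int) -> bool:
    """Return True only when a count is available and does not exceed the budget."""
    if response.input_tokens is None:
        # An unknown count is treated as not fitting, rather than as zero.
        return False
    return response.input_tokens <= budget
```

Treating None as "does not fit" is a conservative choice for budget checks; a caller that prefers to fail loudly could raise an exception in that branch instead.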
MIT License