Multi-protocol LLM proxy and Haskell client library. Connect to any LLM API (OpenAI, Anthropic, Gemini) using any SDK with automatic protocol translation.
- Protocol Translation: OpenAI ↔ Anthropic ↔ Gemini automatic conversion
- Dual Usage: Haskell library or standalone proxy server
- Streaming: Full SSE support with smart buffering
- Function Calling: Works across all protocols (JSON and XML formats)
- Vision: Multimodal image support
- Flexible Auth: Optional authentication for local vs cloud backends
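Protocol translation works by mapping every frontend request into a shared intermediate representation and rendering it back out for the backend. A rough sketch of the idea (the types here are hypothetical, not Louter's internals):

```haskell
-- Shared IR: a model name plus role/text turns (OpenAI-shaped).
data IR = IR { irModel :: String, irTurns :: [(String, String)] }
  deriving (Show, Eq)

-- Anthropic-style input: the system prompt is a separate field.
data AnthropicReq = AnthropicReq
  { aModel  :: String
  , aSystem :: Maybe String
  , aTurns  :: [(String, String)]
  }

-- Frontend adapter: fold the system prompt into the message list.
anthropicToIR :: AnthropicReq -> IR
anthropicToIR r = IR (aModel r) (sys <> aTurns r)
  where sys = maybe [] (\s -> [("system", s)]) (aSystem r)

-- Backend adapter: render the IR as OpenAI-style messages.
irToOpenAI :: IR -> [(String, String)]
irToOpenAI = irTurns

main :: IO ()
main = do
  let req = AnthropicReq "claude-3-5-sonnet" (Just "Be brief.")
              [("user", "Hello!")]
  print (irToOpenAI (anthropicToIR req))
```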
```bash
# Install
git clone https://github.com/junjihashimoto/louter.git
cd louter
cabal build all

# Configure
cat > config.yaml <<EOF
backends:
  llama-server:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-4: qwen/qwen2.5-vl-7b
EOF

# Run
cabal run louter-server -- --config config.yaml --port 9000
```

Now send OpenAI/Anthropic/Gemini requests to localhost:9000.
Test it:

```bash
curl http://localhost:9000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "gpt-4", "messages": [{"role": "user", "content": "Hello!"}]}'
```

Add to your project:
```yaml
# package.yaml
dependencies:
  - louter
  - text
  - aeson
```

Basic usage:
```haskell
{-# LANGUAGE OverloadedStrings #-}

import Louter.Client
import Louter.Client.OpenAI (llamaServerClient)

main :: IO ()
main = do
  client <- llamaServerClient "http://localhost:11211"
  response <- chatCompletion client $
    defaultChatRequest "gpt-4" [Message RoleUser "Hello!"]
  print response
```

Streaming:
```haskell
{-# LANGUAGE OverloadedStrings #-}

import Louter.Client
import Louter.Client.OpenAI (llamaServerClient)
import Louter.Types.Streaming
import qualified Data.Text as T
import System.IO (hFlush, stdout)

main :: IO ()
main = do
  client <- llamaServerClient "http://localhost:11211"
  let request = (defaultChatRequest "gpt-4"
        [Message RoleUser "Write a haiku"]) { reqStream = True }
  streamChatWithCallback client request $ \event -> case event of
    StreamContent txt   -> putStr (T.unpack txt) >> hFlush stdout
    StreamFinish reason -> putStrLn ("\n[Done: " <> show reason <> "]")
    StreamError err     -> putStrLn ("[Error: " <> T.unpack err <> "]")
    _                   -> pure ()
```

Function calling:
```haskell
{-# LANGUAGE OverloadedStrings #-}

import Data.Aeson (object, (.=))
import Data.Text (Text)

weatherTool :: Tool
weatherTool = Tool
  { toolName = "get_weather"
  , toolDescription = Just "Get current weather"
  , toolParameters = object
      [ "type" .= ("object" :: Text)
      , "properties" .= object
          [ "location" .= object
              [ "type" .= ("string" :: Text) ]
          ]
      , "required" .= (["location"] :: [Text])
      ]
  }

request :: ChatRequest
request = (defaultChatRequest "gpt-4"
  [Message RoleUser "Weather in Tokyo?"])
  { reqTools = [weatherTool]
  , reqToolChoice = ToolChoiceAuto
  }
```

| Frontend | Backend | Use Case |
|---|---|---|
| OpenAI SDK | Gemini API | Use OpenAI SDK with Gemini models |
| Anthropic SDK | Local llama-server | Use Claude Code with local models |
| Gemini SDK | OpenAI API | Use Gemini SDK with GPT models |
| Any SDK | Any Backend | Protocol-agnostic development |
Local model (no auth):
```yaml
backends:
  local:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-4: qwen/qwen2.5-vl-7b
```

Cloud API (with auth):
```yaml
backends:
  openai:
    type: openai
    url: https://api.openai.com
    requires_auth: true
    api_key: "${OPENAI_API_KEY}"
    model_mapping:
      gpt-4: gpt-4-turbo-preview
```

Multi-backend:
```yaml
backends:
  local:
    type: openai
    url: http://localhost:11211
    requires_auth: false
    model_mapping:
      gpt-3.5-turbo: qwen/qwen2.5-7b
  openai:
    type: openai
    url: https://api.openai.com
    requires_auth: true
    api_key: "${OPENAI_API_KEY}"
    model_mapping:
      gpt-4: gpt-4-turbo-preview
```

See examples/ for more configurations.
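Values like `"${OPENAI_API_KEY}"` are read from the environment at startup. A minimal sketch of `${VAR}` expansion (`expandWith` is a hypothetical helper; the exact substitution rules are an assumption, not Louter's actual code):

```haskell
import System.Environment (lookupEnv)
import Data.Maybe (fromMaybe)

-- Expand "${VAR}" occurrences in a config value; unresolved
-- variables are left intact.
expandWith :: (String -> Maybe String) -> String -> String
expandWith env ('$':'{':rest) =
  let (var, rest') = break (== '}') rest
  in fromMaybe ("${" <> var <> "}") (env var)
     <> expandWith env (drop 1 rest')
expandWith env (c:cs) = c : expandWith env cs
expandWith _   []     = []

main :: IO ()
main = do
  key <- lookupEnv "OPENAI_API_KEY"
  let env v = if v == "OPENAI_API_KEY" then key else Nothing
  putStrLn (expandWith env "Bearer ${OPENAI_API_KEY}")
```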
```haskell
-- Local llama-server (no auth)
import Louter.Client.OpenAI (llamaServerClient)

client <- llamaServerClient "http://localhost:11211"

-- Cloud APIs (with auth)
import Louter.Client.OpenAI (openAIClient)
import Louter.Client.Anthropic (anthropicClient)
import Louter.Client.Gemini (geminiClient)

client <- openAIClient "sk-..."
client <- anthropicClient "sk-ant-..."
client <- geminiClient "your-api-key"
```

```haskell
-- ChatRequest
data ChatRequest = ChatRequest
  { reqModel       :: Text
  , reqMessages    :: [Message]
  , reqTools       :: [Tool]
  , reqTemperature :: Maybe Float
  , reqMaxTokens   :: Maybe Int
  , reqStream      :: Bool
  }

-- Message
data Message = Message
  { msgRole    :: MessageRole -- RoleSystem | RoleUser | RoleAssistant
  , msgContent :: Text
  }

-- Tool
data Tool = Tool
  { toolName        :: Text
  , toolDescription :: Maybe Text
  , toolParameters  :: Value -- JSON schema
  }
```

```haskell
-- Non-streaming
chatCompletion :: Client -> ChatRequest -> IO (Either Text ChatResponse)

data ChatResponse = ChatResponse
  { respId      :: Text
  , respChoices :: [Choice]
  , respUsage   :: Maybe Usage
  }

-- Streaming
streamChatWithCallback :: Client -> ChatRequest -> (StreamEvent -> IO ()) -> IO ()

data StreamEvent
  = StreamContent Text      -- Response text
  | StreamReasoning Text    -- Thinking tokens
  | StreamToolCall ToolCall -- Complete tool call (buffered)
  | StreamFinish FinishReason
  | StreamError Text
```

```bash
# Build
docker build -t louter .

# Run with config
docker run -p 9000:9000 -v $(pwd)/config.yaml:/app/config.yaml louter

# Or use docker-compose
docker-compose up
```

```bash
# Python SDK integration tests (43+ tests)
python tests/run_all_tests.py

# Haskell unit tests
cabal test all
```

```
Client Request (Any Format)
        ↓
Protocol Converter
        ↓
Core IR (OpenAI-based)
        ↓
Backend Adapter
        ↓
LLM Backend (Any Format)
```
Key Components:
- SSE Parser: Incremental streaming with attoparsec
- Smart Buffering: Tool calls buffered until complete JSON
- Type Safety: Strict Haskell types throughout
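The SSE layer can be pictured as a classifier over `data:` lines. The real parser consumes chunks incrementally with attoparsec; `parseSSE` and `SSE` below are illustrative names in a pure sketch:

```haskell
import Data.List (stripPrefix)
import Data.Maybe (mapMaybe)

data SSE = Payload String | Done
  deriving (Show, Eq)

-- Classify each line of an SSE stream: JSON payloads, the [DONE]
-- sentinel, or noise (comments, blank keep-alives) to drop.
parseSSE :: String -> [SSE]
parseSSE = mapMaybe classify . lines
  where
    classify ln = case stripPrefix "data: " ln of
      Just "[DONE]" -> Just Done
      Just body     -> Just (Payload body)
      Nothing       -> Nothing

main :: IO ()
main = print (parseSSE "data: {\"x\":1}\n\ndata: [DONE]\n")
```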
Streaming Strategy:
- Content/Reasoning: Stream immediately (real-time output)
- Tool Calls: Buffer until complete (valid JSON required)
- State Machine: Track tool call assembly by index
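The buffering rule above can be sketched as a per-index accumulator that releases a tool call only once its JSON braces balance (`feed` is a hypothetical helper, not the proxy's actual state machine, and real JSON completeness checks must also handle braces inside strings):

```haskell
import qualified Data.Map.Strict as M

type Buffers = M.Map Int String

-- Feed one argument fragment for the tool call at a given index.
-- Returns the updated buffers and, if now complete, the full JSON.
feed :: Int -> String -> Buffers -> (Buffers, Maybe String)
feed ix frag bufs
  | complete acc = (M.delete ix bufs, Just acc)
  | otherwise    = (M.insert ix acc bufs, Nothing)
  where
    acc = M.findWithDefault "" ix bufs <> frag
    complete s = not (null s) && depth s == 0
    depth = foldl step (0 :: Int)
      where step d '{' = d + 1
            step d '}' = d - 1
            step d _   = d

main :: IO ()
main = do
  let (b1, r1) = feed 0 "{\"location\":" M.empty -- incomplete: buffered
      (_,  r2) = feed 0 "\"Tokyo\"}" b1          -- braces balance: emitted
  print (r1, r2)
```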
```python
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:9000/v1",
    api_key="not-needed"
)

response = client.chat.completions.create(
    model="gpt-4",  # Routed to qwen/qwen2.5-vl-7b
    messages=[{"role": "user", "content": "Hello!"}]
)
```

```yaml
# config.yaml
backends:
  gemini:
    type: gemini
    url: https://generativelanguage.googleapis.com
    requires_auth: true
    api_key: "${GEMINI_API_KEY}"
    model_mapping:
      claude-3-5-sonnet-20241022: gemini-2.0-flash
```

```bash
# Start proxy on Anthropic-compatible port
cabal run louter-server -- --config config.yaml --port 8000

# Configure Claude Code:
#   API Endpoint: http://localhost:8000
#   Model: claude-3-5-sonnet-20241022
```

Health check:
```bash
curl http://localhost:9000/health
```

JSON-line logging:

```bash
cabal run louter-server -- --config config.yaml --port 9000 2>&1 | jq .
```

Connection refused:

```bash
# Check backend is running
curl http://localhost:11211/v1/models
```

Invalid API key:

```bash
# Verify environment variable
echo $OPENAI_API_KEY
```

Model not found:
- Check `model_mapping` in config
- Frontend model (client requests) → Backend model (sent to API)
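The mapping itself is a plain lookup from frontend name to backend name. A sketch (assuming unmapped names pass through unchanged, which may not match the proxy's actual behavior):

```haskell
import qualified Data.Map.Strict as M

-- Route the client-requested model through a backend's model_mapping;
-- names with no entry fall through unchanged.
mapModel :: M.Map String String -> String -> String
mapModel mapping requested = M.findWithDefault requested requested mapping

main :: IO ()
main = do
  let mapping = M.fromList [("gpt-4", "qwen/qwen2.5-vl-7b")]
  putStrLn (mapModel mapping "gpt-4")
```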
See examples/ for configuration examples and use cases.
MIT License - see LICENSE file.