Transform OpenAI's Codex models into OpenAI-compatible endpoints using Cloudflare Workers. Access advanced reasoning capabilities and seamless API compatibility, powered by OAuth2 authentication and the same infrastructure that drives the official OpenAI Codex CLI.
- **OAuth2 Authentication** - Uses your OpenAI account credentials via Codex CLI
- **OpenAI-Compatible API** - Drop-in replacement for OpenAI endpoints
- **OpenAI SDK Support** - Works with official OpenAI SDKs and libraries
- **Advanced Reasoning** - Configurable reasoning effort with `think-tags` compatibility
- **API Key Security** - Optional authentication layer for endpoint access
- **Third-party Integration** - Compatible with Open WebUI, Cline, and more
- **Cloudflare Workers** - Global edge deployment with low latency
- **Smart Token Management** - Automatic token refresh with KV storage
- **Real-time Streaming** - Server-sent events for live responses
- **Ollama Compatibility** - Full Ollama API support for local model workflows
- **Flexible Tool Support** - OpenAI-compatible function calling
Choose your preferred deployment method:
- **Cloudflare Workers (Recommended)** - Serverless, global edge deployment
- **Docker** - Self-hosted with full control - see the Docker Guide
- OpenAI Account with Codex CLI access
- Cloudflare Account with Workers enabled
- Wrangler CLI installed (`npm install -g wrangler`)
You need OAuth2 credentials from the official OpenAI Codex CLI.
1. Install the OpenAI Codex CLI:

   ```bash
   npm install -g @openai/codex
   # Alternatively: brew install codex
   ```

2. Start Codex and authenticate:

   ```bash
   codex
   ```

   Select "Sign in with ChatGPT" when prompted. You'll need a Plus, Pro, or Team ChatGPT account to access the latest models, including gpt-5, at no extra cost to your plan.

3. Complete authentication: the login process starts a server on `localhost:1455`. Open the provided URL in your browser to complete the authentication flow.

4. Locate the credentials file:
   - Windows: `C:\Users\USERNAME\.codex\auth.json`
   - macOS/Linux: `~/.codex/auth.json`

5. Copy the credentials. The file contains JSON in this format:

   ```json
   {
     "tokens": {
       "id_token": "eyJhbGciOiJSUzI1NiIs...",
       "access_token": "sk-proj-...",
       "refresh_token": "rft_...",
       "account_id": "user-..."
     },
     "last_refresh": "2024-01-15T10:30:00.000Z"
   }
   ```
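Before pasting this JSON into a secret, it can help to sanity-check its shape. A minimal, illustrative TypeScript check (the function name is ours, not part of the wrapper):

```typescript
// Illustrative helper: returns true if a raw auth.json string contains the
// fields the wrapper expects. Not part of the wrapper itself.
function hasCodexAuthShape(raw: string): boolean {
  try {
    const parsed = JSON.parse(raw);
    const tokens = parsed.tokens ?? {};
    const keys = ["id_token", "access_token", "refresh_token", "account_id"];
    return (
      keys.every((k) => typeof tokens[k] === "string") &&
      typeof parsed.last_refresh === "string"
    );
  } catch {
    return false; // not valid JSON at all
  }
}
```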
If you've used the Codex CLI before:
- Update the CLI and ensure `codex --version` is 0.20.0 or later
- Delete `~/.codex/auth.json` (or `C:\Users\USERNAME\.codex\auth.json` on Windows)
- Run `codex` and authenticate again
If you're on a headless server or SSH'd into a remote machine:
Option 1: Copy credentials from your local machine

```bash
# Authenticate locally first, then copy the auth.json file
scp ~/.codex/auth.json user@remote:~/.codex/auth.json
```

Option 2: Port forwarding for remote authentication

```bash
# From your local machine, create an SSH tunnel
ssh -L 1455:localhost:1455 user@remote-host
# Then run codex in the SSH session and open localhost:1455 locally
```

You can also use your OpenAI API key instead:

```bash
export OPENAI_API_KEY="your-api-key-here"
```

To force API key usage even when ChatGPT auth exists:

```bash
codex --config preferred_auth_method="apikey"
```

Create a KV namespace for token caching:

```bash
wrangler kv namespace create "KV"
```

Note the namespace ID returned and update `wrangler.toml`:

```toml
kv_namespaces = [
  { binding = "KV", id = "your-kv-namespace-id" }
]
```

Create a `.dev.vars` file:

```bash
# Required: API key for client authentication
OPENAI_API_KEY=sk-your-secret-api-key-here

# Required: Codex CLI authentication JSON
OPENAI_CODEX_AUTH={"tokens":{"id_token":"eyJ...","access_token":"sk-proj-...","refresh_token":"rft_...","account_id":"user-..."},"last_refresh":"2024-01-15T10:30:00.000Z"}

# Required: ChatGPT API configuration
CHATGPT_LOCAL_CLIENT_ID=your_client_id_here
CHATGPT_RESPONSES_URL=https://chatgpt.com/backend-api/codex/responses

# Optional: Ollama integration
OLLAMA_API_URL=http://localhost:11434

# Optional: Reasoning configuration
REASONING_EFFORT=medium
REASONING_SUMMARY=auto
REASONING_COMPAT=think-tags

# Optional: Debug settings
VERBOSE=false
DEBUG_MODEL=
```

For production, set the secrets:

```bash
wrangler secret put OPENAI_API_KEY
wrangler secret put OPENAI_CODEX_AUTH
wrangler secret put CHATGPT_LOCAL_CLIENT_ID
wrangler secret put CHATGPT_RESPONSES_URL
```

```bash
# Install dependencies
npm install

# Deploy to Cloudflare Workers
npm run deploy

# Or run locally for development
npm run dev
```

For self-hosted deployment with Docker, see the comprehensive Docker Deployment Guide.
Quick Docker start with the pre-built image:

```bash
# Pull and run the latest image
docker pull ghcr.io/gewoonjaap/codex-openai-wrapper:latest

# Create environment file
echo "OPENAI_API_KEY=sk-your-api-key-here" > .env
echo "OPENAI_CODEX_AUTH={...your-auth-json...}" >> .env

# Run the container
docker run -d \
  --name codex-openai-wrapper \
  -p 8787:8787 \
  --env-file .env \
  ghcr.io/gewoonjaap/codex-openai-wrapper:latest
```

Or use Docker Compose for development:

```bash
git clone https://github.com/GewoonJaap/codex-openai-wrapper.git
cd codex-openai-wrapper
cp .dev.vars.example .dev.vars
# Edit .dev.vars with your configuration
docker-compose up -d
```

The service will be available at `http://localhost:8787`.
| Variable | Required | Description |
|---|---|---|
| `OPENAI_API_KEY` | ✅ | API key for client authentication |
| `OPENAI_CODEX_AUTH` | ✅ | OAuth2 credentials JSON from Codex CLI |
| `CHATGPT_LOCAL_CLIENT_ID` | ✅ | ChatGPT client ID |
| `CHATGPT_RESPONSES_URL` | ✅ | ChatGPT API endpoint URL |
| Variable | Default | Description |
|---|---|---|
| `REASONING_EFFORT` | `minimal` | Reasoning effort level: `minimal`, `low`, `medium`, `high` |
| `REASONING_SUMMARY` | `auto` | Reasoning summary mode: `auto`, `on`, `off` |
| `REASONING_COMPAT` | `think-tags` | Reasoning output format: `think-tags`, `standard` |
| Variable | Default | Description |
|---|---|---|
| `OLLAMA_API_URL` | `http://localhost:11434` | Ollama instance URL for local model integration |
| `DEBUG_MODEL` | - | Override model for debugging purposes |
| `VERBOSE` | `false` | Enable detailed debug logging |
- When `OPENAI_API_KEY` is set, all `/v1/*` and `/api/*` endpoints require authentication
- Clients must include the header: `Authorization: Bearer <your-api-key>`
- Recommended format: `sk-` followed by a random string (e.g., `sk-1234567890abcdef...`)
- Without this variable, endpoints are publicly accessible (not recommended for production)
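The check itself is straightforward; as an illustration (not the wrapper's actual code), a bearer-token gate could look like:

```typescript
// Illustrative sketch of the bearer-token gate described above.
// Names are ours; the wrapper's internal implementation may differ.
function isAuthorized(
  authHeader: string | null,
  apiKey: string | undefined
): boolean {
  // No OPENAI_API_KEY configured: endpoints are open (not recommended)
  if (!apiKey) return true;
  // Reject missing or malformed Authorization headers
  if (!authHeader?.startsWith("Bearer ")) return false;
  return authHeader.slice("Bearer ".length) === apiKey;
}
```

A production implementation would use a constant-time comparison to avoid timing side channels.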
- Automatic Refresh: Tokens are automatically refreshed when they expire or are older than 28 days
- KV Persistence: Refreshed tokens are stored in Cloudflare KV for persistence across requests
- Fallback Logic: Falls back from KV → environment → refresh → retry seamlessly
- Debug Logging: Comprehensive token source tracking for troubleshooting
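The fallback order above can be sketched as follows (illustrative types and names, assuming freshness is judged by the 28-day window mentioned above):

```typescript
// Illustrative sketch of the KV → environment → refresh fallback order.
interface Tokens {
  access_token: string;
  last_refresh: string; // ISO timestamp
}

const MAX_AGE_MS = 28 * 24 * 60 * 60 * 1000; // 28 days

function isFresh(t: Tokens, now: number = Date.now()): boolean {
  return now - Date.parse(t.last_refresh) < MAX_AGE_MS;
}

async function resolveToken(
  kvGet: () => Promise<Tokens | null>, // 1. KV cache
  envTokens: Tokens | null,            // 2. environment secret
  refresh: () => Promise<Tokens>       // 3. upstream refresh
): Promise<Tokens> {
  const cached = await kvGet();
  if (cached && isFresh(cached)) return cached;
  if (envTokens && isFresh(envTokens)) return envTokens;
  return refresh();
}
```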
| Binding | Purpose |
|---|---|
| `KV` | OAuth token caching and session management |
https://your-worker.your-subdomain.workers.dev
```http
POST /v1/chat/completions
Authorization: Bearer sk-your-api-key-here
Content-Type: application/json

{
  "model": "gpt-4",
  "messages": [
    {
      "role": "system",
      "content": "You are a helpful assistant."
    },
    {
      "role": "user",
      "content": "Explain quantum computing in simple terms"
    }
  ],
  "stream": true
}
```

Enable enhanced reasoning capabilities:
```json
{
  "model": "gpt-4",
  "messages": [
    {
      "role": "user",
      "content": "Solve this step by step: What is the derivative of x^3 + 2x^2 - 5x + 3?"
    }
  ],
  "reasoning": {
    "effort": "high",
    "summary": "on"
  }
}
```

```http
POST /v1/completions
Authorization: Bearer sk-your-api-key-here
Content-Type: application/json

{
  "model": "gpt-3.5-turbo-instruct",
  "prompt": "Write a Python function to calculate fibonacci numbers:",
  "max_tokens": 150,
  "stream": true
}
```

```http
GET /v1/models
Authorization: Bearer sk-your-api-key-here
```

Response:
```json
{
  "object": "list",
  "data": [
    {
      "id": "gpt-4",
      "object": "model",
      "created": 1708976947,
      "owned_by": "openai-codex"
    }
  ]
}
```

```http
POST /api/chat
Authorization: Bearer sk-your-api-key-here
Content-Type: application/json

{
  "model": "llama2",
  "messages": [
    {"role": "user", "content": "Hello!"}
  ],
  "stream": true
}
```

```http
GET /api/tags
Authorization: Bearer sk-your-api-key-here
```

```http
POST /api/show
Authorization: Bearer sk-your-api-key-here
Content-Type: application/json

{
  "name": "llama2"
}
```

`GET /health` - No authentication required

`GET /` - No authentication required
The wrapper supports OpenAI-compatible tool calling (function calling) with seamless integration.
```javascript
const response = await fetch('/v1/chat/completions', {
  method: 'POST',
  headers: {
    'Content-Type': 'application/json',
    'Authorization': 'Bearer sk-your-api-key-here'
  },
  body: JSON.stringify({
    model: 'gpt-4',
    messages: [
      { role: 'user', content: 'What is the weather in Tokyo?' }
    ],
    tools: [
      {
        type: 'function',
        function: {
          name: 'get_weather',
          description: 'Get current weather information for a location',
          parameters: {
            type: 'object',
            properties: {
              location: {
                type: 'string',
                description: 'City name'
              },
              unit: {
                type: 'string',
                enum: ['celsius', 'fahrenheit'],
                description: 'Temperature unit'
              }
            },
            required: ['location']
          }
        }
      }
    ],
    tool_choice: 'auto'
  })
});
```

Supported `tool_choice` values:

- `auto`: Let the model decide whether to call a function
- `none`: Disable function calling
- `{"type": "function", "function": {"name": "function_name"}}`: Force a specific function call
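On the client side, a returned tool call can be pulled out of the standard OpenAI response shape; a minimal sketch (types abbreviated, names ours):

```typescript
// Illustrative: extract the first tool call from an OpenAI-format completion.
interface ToolCall {
  id: string;
  function: { name: string; arguments: string };
}
interface Completion {
  choices: { message: { tool_calls?: ToolCall[] } }[];
}

function firstToolCall(
  c: Completion
): { name: string; args: unknown } | null {
  const call = c.choices[0]?.message.tool_calls?.[0];
  if (!call) return null;
  // Function arguments arrive as a JSON string and must be parsed
  return { name: call.function.name, args: JSON.parse(call.function.arguments) };
}
```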
Cline is a powerful AI assistant extension for VS Code:
1. Install Cline in VS Code from the Extensions marketplace
2. Configure OpenAI API settings:
   - Set API Provider to "OpenAI"
   - Set Base URL to: `https://your-worker.workers.dev/v1`
   - Set API Key to: `sk-your-secret-api-key-here`
3. Select models:
   - Use `gpt-4` for complex reasoning tasks
   - Use `gpt-3.5-turbo` for faster responses
1. Add as an OpenAI-compatible endpoint:
   - Base URL: `https://your-worker.workers.dev/v1`
   - API Key: `sk-your-secret-api-key-here`
2. Auto-discovery: Open WebUI will automatically discover available models through the `/v1/models` endpoint.
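Programmatic discovery works the same way; for example (endpoint URL and key are placeholders):

```typescript
// Illustrative: list model IDs from the /v1/models endpoint.
async function listModelIds(baseUrl: string, apiKey: string): Promise<string[]> {
  const res = await fetch(`${baseUrl}/v1/models`, {
    headers: { Authorization: `Bearer ${apiKey}` },
  });
  if (!res.ok) throw new Error(`models request failed: ${res.status}`);
  const body = (await res.json()) as { data: { id: string }[] };
  return body.data.map((m) => m.id);
}
```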
```python
from openai import OpenAI

# Initialize with your worker endpoint
client = OpenAI(
    base_url="https://your-worker.workers.dev/v1",
    api_key="sk-your-secret-api-key-here"
)

# Chat completion with reasoning
response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {"role": "system", "content": "You are a helpful coding assistant."},
        {"role": "user", "content": "Write a binary search algorithm in Python"}
    ],
    extra_body={
        "reasoning": {
            "effort": "high",
            "summary": "on"
        }
    },
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

```javascript
import OpenAI from 'openai';

const openai = new OpenAI({
  baseURL: 'https://your-worker.workers.dev/v1',
  apiKey: 'sk-your-secret-api-key-here',
});

const stream = await openai.chat.completions.create({
  model: 'gpt-4',
  messages: [
    { role: 'user', content: 'Explain async/await in JavaScript' }
  ],
  stream: true,
});

for await (const chunk of stream) {
  const content = chunk.choices[0]?.delta?.content || '';
  process.stdout.write(content);
}
```

```bash
# Chat completion
curl -X POST https://your-worker.workers.dev/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-secret-api-key-here" \
  -d '{
    "model": "gpt-4",
    "messages": [
      {"role": "user", "content": "Explain machine learning"}
    ]
  }'

# Ollama chat
curl -X POST https://your-worker.workers.dev/api/chat \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sk-your-secret-api-key-here" \
  -d '{
    "model": "llama2",
    "messages": [
      {"role": "user", "content": "Hello world!"}
    ]
  }'
```

LiteLLM works seamlessly with the wrapper:
```python
import litellm

# Configure LiteLLM to use your worker
litellm.api_base = "https://your-worker.workers.dev/v1"
litellm.api_key = "sk-your-secret-api-key-here"

# Use with reasoning capabilities
response = litellm.completion(
    model="gpt-4",
    messages=[
        {"role": "user", "content": "Solve this step by step: What is 15 * 24?"}
    ],
    extra_body={
        "reasoning": {
            "effort": "medium",
            "summary": "auto"
        }
    },
    stream=True
)

for chunk in response:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
```

The wrapper provides sophisticated reasoning capabilities with multiple configuration options:
- `minimal`: Basic reasoning with minimal token overhead
- `medium`: Balanced reasoning for most use cases
- `high`: Deep reasoning for complex problems
- `auto`: Automatically decide when to include reasoning summaries
- `on`: Always include reasoning summaries in responses
- `off`: Never include reasoning summaries
- `think-tags`: Wrap reasoning in `<think>` tags for DeepSeek R1-style output
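Clients that want the reasoning separated from the answer can split on those tags themselves; an illustrative helper (not part of the wrapper):

```typescript
// Illustrative: split think-tags output into reasoning and answer parts.
function splitThink(content: string): { reasoning: string; answer: string } {
  const match = content.match(/<think>([\s\S]*?)<\/think>/);
  return {
    reasoning: match ? match[1].trim() : "",
    answer: content.replace(/<think>[\s\S]*?<\/think>/, "").trim(),
  };
}
```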
Environment-level configuration (applies to all requests):
```bash
REASONING_EFFORT=high
REASONING_SUMMARY=on
REASONING_COMPAT=think-tags
```

Request-level overrides:

```json
{
  "model": "gpt-4",
  "messages": [...],
  "reasoning": {
    "effort": "high",
    "summary": "on"
  }
}
```

When reasoning is enabled, responses include structured thinking:
```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion.chunk",
  "created": 1708976947,
  "model": "gpt-4",
  "choices": [{
    "index": 0,
    "delta": {
      "content": "<think>\nLet me break this problem down step by step...\n</think>\n\nTo solve this equation..."
    },
    "finish_reason": null
  }]
}
```

**401 Authentication Error**
- Verify your `OPENAI_API_KEY` is correctly set
- Check that the client sends the `Authorization: Bearer <key>` header
- Ensure the API key format starts with `sk-`
**OAuth Token Refresh Failed**
- Check that your `OPENAI_CODEX_AUTH` credentials are valid
- Ensure the refresh token hasn't expired
- Verify the JSON format matches the expected structure
**KV Storage Issues**
- Confirm the KV namespace is correctly configured in `wrangler.toml`
- Check KV namespace permissions in the Cloudflare dashboard
- Verify the binding name matches (`KV`)
**Upstream Connection Errors**
- Check that `CHATGPT_RESPONSES_URL` is accessible
- Verify network connectivity from Cloudflare Workers
- Ensure OAuth tokens have proper scopes
```bash
# Check authentication status
curl -X POST https://your-worker.workers.dev/debug/auth \
  -H "Authorization: Bearer sk-your-api-key-here"

# Test token refresh
curl -X POST https://your-worker.workers.dev/debug/refresh \
  -H "Authorization: Bearer sk-your-api-key-here"
```

```mermaid
graph TD
    A[Client Request] --> B[Cloudflare Worker]
    B --> C[API Key Validation]
    C --> D{Valid API Key?}
    D -->|No| E[401 Unauthorized]
    D -->|Yes| F{Token in KV Cache?}
    F -->|Yes| G[Use Cached Token]
    F -->|No| H[Check Environment Token]
    H --> I{Token Valid?}
    I -->|Yes| J[Cache & Use Token]
    I -->|No| K[Refresh Token]
    K --> L[Cache New Token]
    G --> M[Call ChatGPT API]
    J --> M
    L --> M
    M --> N{Success?}
    N -->|No| O[Auto-retry with Refresh]
    N -->|Yes| P[Apply Reasoning]
    O --> P
    P --> Q[Stream Response]
    Q --> R[OpenAI Format]
    R --> S[Client Response]
```
The wrapper acts as a secure translation layer, managing OAuth2 authentication automatically while providing OpenAI-compatible responses with advanced reasoning capabilities.
- API Key Authentication: Configurable endpoint protection
- OAuth2 Token Management: Secure credential handling
- Automatic Token Refresh: Seamless session management
- KV Storage Encryption: Secure token persistence
- Environment Isolation: Separate dev/prod configurations
- CORS Protection: Configurable cross-origin policies
- Global Edge Deployment: Cloudflare's worldwide network
- Intelligent Caching: KV-based token management
- Streaming Responses: Real-time data delivery
- Connection Pooling: Optimized upstream connections
- Automatic Retries: Resilient error handling
- Fork the repository: https://github.com/GewoonJaap/codex-openai-wrapper
- Create a feature branch: `git checkout -b feature-name`
- Make your changes and add tests
- Run linting: `npm run lint`
- Test thoroughly: `npm test`
- Commit your changes: `git commit -am 'Add feature'`
- Push to the branch: `git push origin feature-name`
- Submit a pull request
```bash
git clone https://github.com/GewoonJaap/codex-openai-wrapper.git
cd codex-openai-wrapper
npm install
cp .dev.vars.example .dev.vars
# Edit .dev.vars with your configuration
npm run dev
```

```bash
npm run dev      # Start development server
npm run deploy   # Deploy to Cloudflare Workers
npm run lint     # Run ESLint and TypeScript checks
npm run format   # Format code with Prettier
npm test         # Run test suite
npm run build    # Build the project
```

This codebase is provided for personal use and self-hosting only.
Redistribution of the codebase, whether in original or modified form, is not permitted without prior written consent from the author.
You may fork and modify the repository solely for the purpose of running and self-hosting your own instance.
Any other form of distribution, sublicensing, or commercial use is strictly prohibited unless explicitly authorized.
- Inspired by the official OpenAI Codex CLI
- Built on Cloudflare Workers
- Uses Hono web framework
- Token management patterns from OpenAI SDK