OpenAI-compatible chat/completion proxy backed by https://chat.z.ai/.
## Features

- Supports `POST /v1/chat/completions`
- Supports `POST /v1/responses`
- Creates a fresh upstream chat for every request
- Preserves reasoning output separately from the final answer text
- Reuses `ZAI_SESSION_TOKEN` directly or refreshes it from `ZAI_JWT`
## Requirements

- Python 3.12+
- `uv`
- One of: `ZAI_JWT`, `ZAI_SESSION_TOKEN`
## Quick start

```sh
export ZAI_JWT='your-jwt'
uv run python -m zai2api
```

Or with the installed script:

```sh
export ZAI_JWT='your-jwt'
uv run zai2api
```

The default bind address is `0.0.0.0:8000`.
## Configuration

- `ZAI_JWT`: preferred auth source; used to fetch a fresh session token
- `ZAI_SESSION_TOKEN`: optional direct session token reuse
- `DEFAULT_MODEL`: defaults to `glm-5`
- Available public model ids: `glm-5`, `glm-5.1`, `glm-5-turbo`, and their `-nothinking` variants
- `HOST`: defaults to `0.0.0.0`
- `PORT`: defaults to `8000`
- `LOG_LEVEL`: defaults to `info`
- `REQUEST_TIMEOUT`: defaults to `120`
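As a sketch, the settings above can be combined by exporting them before launch (all values below are placeholders, not defaults):

```shell
# Reuse an existing session token instead of refreshing from a JWT,
# and move the server off the default port and log level.
export ZAI_SESSION_TOKEN='your-session-token'
export DEFAULT_MODEL='glm-5-turbo'
export PORT=9000
export LOG_LEVEL=debug
uv run zai2api
```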
## Examples

```sh
curl http://127.0.0.1:8000/v1/chat/completions \
  -H 'content-type: application/json' \
  -d '{
    "model": "glm-5",
    "messages": [
      {"role": "system", "content": "Be concise."},
      {"role": "user", "content": "Say hello."}
    ]
  }'
```

```sh
curl http://127.0.0.1:8000/v1/responses \
  -H 'content-type: application/json' \
  -d '{
    "model": "glm-5",
    "input": "Say hello."
  }'
```