1 unstable release

0.1.0	Apr 12, 2026

Apache-2.0

kapsl-sdk

Python client SDK for kapsl-runtime — the Rust-native AI model inference engine.

Supports Unix socket, TCP, shared-memory, and hybrid transports through a simple Python API.

Install

pip install kapsl-sdk

Pre-compiled abi3 wheels are available for Linux, macOS, and Windows on Python 3.9+.

Quick start

from kapsl_sdk import KapslClient

client = KapslClient()  # connects to /tmp/kapsl.sock by default

# Streaming LLM inference
prompt = "<|im_start|>user\nHello<|im_end|>\n<|im_start|>assistant\n"

for chunk in client.infer_stream(model_id=0, shape=[1, 1], dtype="string", data=prompt.encode()):
    print(chunk.decode("utf-8"), end="", flush=True)
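Calling `chunk.decode("utf-8")` on each chunk raises `UnicodeDecodeError` if a multi-byte character happens to be split across two chunks. Whether kapsl-runtime ever splits chunks that way is not stated here, so the sketch below is a defensive wrapper under that assumption. `decode_stream` is a hypothetical helper, not part of kapsl-sdk; it uses only the standard library's incremental decoder and accepts any iterable of `bytes`, such as the iterator returned by `client.infer_stream(...)`.

```python
import codecs
from typing import Iterable, Iterator


def decode_stream(chunks: Iterable[bytes]) -> Iterator[str]:
    """Yield text from byte chunks, buffering incomplete UTF-8 sequences."""
    decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
    for chunk in chunks:
        text = decoder.decode(chunk)
        if text:  # empty while the decoder waits for the rest of a character
            yield text
    # Flush the decoder; raises UnicodeDecodeError if the stream ended mid-character.
    tail = decoder.decode(b"", final=True)
    if tail:
        yield tail


# Stand-in for client.infer_stream(...): the two bytes of U+00E9 arrive in separate chunks.
fake_chunks = [b"caf", b"\xc3", b"\xa9!"]
assert "".join(decode_stream(fake_chunks)) == "caf\u00e9!"
```

With a real client, the loop would become `for text in decode_stream(client.infer_stream(...)): print(text, end="", flush=True)`.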

import numpy as np

# Standard tensor inference
data = np.array([[1.0, 2.0, 3.0, 4.0]], dtype=np.float32)
result = client.infer(model_id=0, shape=[1, 4], dtype="float32", data=data.tobytes())
output = np.frombuffer(result, dtype=np.float32)
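In the call above, `shape`, `dtype`, and `data` describe the same tensor three times, so a mismatch between them is easy to introduce. The sketch below shows the relationship without NumPy: a `[1, 4]` float32 tensor is 1 × 4 × 4 = 16 bytes. `pack_float32` and `unpack_float32` are hypothetical helpers, not part of kapsl-sdk, and they assume little-endian float32, which is what `ndarray.tobytes()` produces for a C-ordered `np.float32` array on a little-endian host.

```python
import math
import struct
from typing import List, Sequence


def pack_float32(values: Sequence[float], shape: Sequence[int]) -> bytes:
    """Pack a flat sequence as little-endian float32 after checking it fills `shape`."""
    expected = math.prod(shape)
    if len(values) != expected:
        raise ValueError(
            f"shape {list(shape)} needs {expected} values, got {len(values)}"
        )
    return struct.pack(f"<{len(values)}f", *values)


def unpack_float32(buf: bytes) -> List[float]:
    """Inverse of pack_float32: read little-endian float32 values from a buffer."""
    if len(buf) % 4:
        raise ValueError("buffer length is not a multiple of 4 bytes")
    return list(struct.unpack(f"<{len(buf) // 4}f", buf))


payload = pack_float32([1.0, 2.0, 3.0, 4.0], shape=[1, 4])
assert len(payload) == 16  # 1 * 4 elements * 4 bytes each
assert unpack_float32(payload) == [1.0, 2.0, 3.0, 4.0]
```

The resulting `payload` could be passed as `data=` together with `shape=[1, 4]` and `dtype="float32"`.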

Transports

| Client | Transport | Use case |
| --- | --- | --- |
| `KapslClient` | Unix socket / TCP | Default; local or remote |
| `KapslShmClient` | Shared memory | Lowest latency, co-located only |
| `KapslHybridClient` | Socket control + SHM data | Production throughput |

from kapsl_sdk import KapslClient, KapslShmClient, KapslHybridClient

# TCP
client = KapslClient("tcp://192.168.1.10:9096")

# Shared memory (same machine only)
client = KapslShmClient()

# Hybrid
client = KapslHybridClient()
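One way to read the table above is as a small decision rule. The sketch below encodes it; `pick_client_name` and its `priority` values are hypothetical and not part of kapsl-sdk. It also assumes that the hybrid client, because it moves data over shared memory, needs the runtime on the same machine, which the table does not state explicitly.

```python
def pick_client_name(co_located: bool, priority: str = "default") -> str:
    """Map deployment constraints to a client class name from the transports table."""
    if priority not in {"default", "latency", "throughput"}:
        raise ValueError(f"unknown priority: {priority!r}")
    if not co_located or priority == "default":
        # Unix socket / TCP is the only transport listed for remote use.
        return "KapslClient"
    if priority == "latency":
        return "KapslShmClient"
    # Assumption: hybrid (socket control + SHM data) requires co-location.
    return "KapslHybridClient"


assert pick_client_name(co_located=False, priority="latency") == "KapslClient"
assert pick_client_name(co_located=True, priority="latency") == "KapslShmClient"
assert pick_client_name(co_located=True, priority="throughput") == "KapslHybridClient"
```

The returned name can then be resolved against the installed package, for example `getattr(kapsl_sdk, pick_client_name(True, "latency"))()`.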

Authentication

client = KapslClient(api_token="your-token")
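Hard-coding the token as above is fine for a demo, but in practice it is usually read from the environment. The sketch below shows one way to do that. The variable name `KAPSL_API_TOKEN` and the helper `load_api_token` are assumptions made for illustration; kapsl-sdk does not define either.

```python
import os
from typing import Mapping, Optional

# Hypothetical variable name; kapsl-sdk does not define or read it.
TOKEN_ENV_VAR = "KAPSL_API_TOKEN"


def load_api_token(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API token from the environment, failing loudly if it is missing."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"set {TOKEN_ENV_VAR} before creating the client")
    return token


# An explicit mapping stands in for os.environ so the example is reproducible.
assert load_api_token({"KAPSL_API_TOKEN": " s3cret "}) == "s3cret"
```

The client would then be created with `KapslClient(api_token=load_api_token())`.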

kapsl-rag