WakeLLM bridges a local always-on Linux machine with an ephemeral cloud GPU pod on RunPod. It provisions the remote pod on demand, establishes a local SSH port-forwarding tunnel, and shuts the pod down automatically when it is no longer in use — keeping compute costs proportional to actual usage.
It runs on any Linux host with Docker: a Raspberry Pi, a home server, a VPS, or a workstation. The local machine acts as the always-on control plane; the GPU compute lives entirely in the cloud.
- A local cron job, CLI command, or HTTP request triggers WakeLLM.
- WakeLLM sends a resume mutation to the RunPod API.
- Once the pod reports ready, WakeLLM opens an SSH tunnel, forwarding configured remote ports to `localhost`.
- Local services connect to the remote Ollama or Open WebUI as if they were running natively.
- The idle monitor detects inactivity and tears down the pod automatically.
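
The tunnel step above can be illustrated with a minimal sketch that assembles an OpenSSH port-forwarding command. This is not WakeLLM's actual code: the function name, its parameters, and the example host, user, key path, and port pairs are all hypothetical. Only the `ssh` options themselves (`-N`, `-L`, `-i`, `-p`, `ExitOnForwardFailure`, `ServerAliveInterval`) are standard OpenSSH.

```python
from typing import List, Sequence, Tuple


def build_tunnel_command(
    host: str,
    ssh_port: int,
    user: str,
    key_path: str,
    forwards: Sequence[Tuple[int, int]],
) -> List[str]:
    """Build an OpenSSH argv that forwards (local_port, remote_port) pairs.

    Hypothetical helper for illustration; not taken from the WakeLLM source.
    """
    cmd = [
        "ssh",
        "-N",                              # no remote command, forwarding only
        "-o", "ExitOnForwardFailure=yes",  # exit if a forward cannot be set up
        "-o", "ServerAliveInterval=30",    # keepalives so a dead link makes ssh exit
        "-i", key_path,
        "-p", str(ssh_port),
    ]
    for local_port, remote_port in forwards:
        # The second "localhost" is resolved on the pod, i.e. the pod's loopback.
        cmd += ["-L", f"127.0.0.1:{local_port}:localhost:{remote_port}"]
    cmd.append(f"{user}@{host}")
    return cmd
```

The resulting list could be handed to `subprocess.Popen`; a non-`None` result from `poll()` would then indicate that the tunnel process has exited, which is one possible way to detect a tunnel crash. For example, `[(11434, 11434)]` would forward Ollama's default port.
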
- Ephemeral compute, persistent local state. Agent memory, databases, and credentials stay on the local machine. The cloud is used only for computation.
- SSH port forwarding. Uses native OpenSSH to bind remote ports (Ollama, Open WebUI, etc.) to `localhost`; no extra tooling required.
- Idle auto-kill. Polls Ollama's `/api/ps` endpoint. Shuts down when no model has been loaded for a configurable idle period.
- Hard uptime cap. Unconditional shutdown after a configurable total runtime, regardless of activity.
- Billing fail-safes. Pod start timeout, tunnel crash detection, and exception-triggered shutdown all call `podStop` before exiting.
- Local HTTP API. `POST /wake` and `GET /status` endpoints for programmatic control and status polling.
- Container-first. Runs as a Docker container. Startup gate runs unit tests and Trivy security scans before launching the application.
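
The idle auto-kill, hard uptime cap, and fail-safe behaviour listed above can be sketched as a small decision helper. This is an illustrative sketch only, not WakeLLM's implementation: `IdleMonitor`, `loaded_model_count`, `run_with_failsafe`, and the parameter names are hypothetical. The one external fact it relies on is that Ollama's `/api/ps` response carries a `models` list of currently loaded models.

```python
from typing import Callable, Optional


def loaded_model_count(ps_payload: dict) -> int:
    """Count entries in the "models" list of an Ollama /api/ps response body."""
    return len(ps_payload.get("models") or [])


class IdleMonitor:
    """Decides when the pod should stop (hypothetical names throughout)."""

    def __init__(self, idle_limit_s: float, max_uptime_s: float, started_at: float):
        self.idle_limit_s = idle_limit_s    # assumed config: allowed idle seconds
        self.max_uptime_s = max_uptime_s    # assumed config: hard runtime cap
        self.started_at = started_at
        self.last_active_at = started_at    # treat startup as the last activity

    def observe(self, now: float, ps_payload: dict) -> Optional[str]:
        """Record one poll result; return a shutdown reason or None."""
        if loaded_model_count(ps_payload) > 0:
            self.last_active_at = now
        if now - self.started_at >= self.max_uptime_s:
            return "uptime_cap"  # unconditional, even while a model is loaded
        if now - self.last_active_at >= self.idle_limit_s:
            return "idle"
        return None


def run_with_failsafe(work: Callable[[], None], stop_pod: Callable[[], None]) -> None:
    """Run work() and always call stop_pod() afterwards, even on an exception."""
    try:
        work()
    finally:
        stop_pod()
```

Passing timestamps in explicitly keeps the decision logic free of clock and network calls, so it can be exercised without a running pod. The `try`/`finally` wrapper mirrors the idea that every exit path should end in a stop call, so that a crash does not leave a billed pod running.
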
# Copy and fill in your config
cp env/config.env.example env/config.env
# edit env/config.env
chmod +x start-wake.sh
./start-wake.sh

`start-wake.sh` builds the image, runs unit tests and Trivy security scans in ephemeral containers, then starts WakeLLM. All checks must pass before the application starts.
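
The "all checks must pass" gate described above amounts to running steps in order and stopping at the first failure. The sketch below shows that pattern in Python purely for illustration; the real gate lives in `start-wake.sh`, and `run_gate`, its parameters, and any step names passed to it are hypothetical.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def run_gate(
    steps: Sequence[Tuple[str, List[str]]],
    run: Callable = subprocess.run,
) -> bool:
    """Run each (name, argv) step in order; stop at the first non-zero exit.

    Hypothetical helper: it illustrates the gating pattern only and does not
    reproduce the commands used by start-wake.sh.
    """
    for name, argv in steps:
        result = run(argv)
        if result.returncode != 0:
            print(f"gate failed at step: {name}")
            return False  # later steps, including app start, never run
    return True
```

Because later steps are skipped after a failure, placing the application start as the final step would ensure it only runs once the tests and scans have succeeded.
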
- Docker
- A RunPod account and API key
- A RunPod pod (not serverless) with `sshd` running and an SSH key registered
- An SSH private key corresponding to the key registered in the pod
| Document | Description |
|---|---|
| docs/architecture.md | Component map, state machine, lifecycle flow, threading model |
| docs/configuration.md | All configuration keys — environment variable reference |
| docs/api.md | HTTP API reference: POST /wake, GET /status |
| docs/deployment.md | Docker build and run instructions, expected startup output |
| docs/development.md | Test structure, how to add tests, design constraints |
| docs/openclaw.md | Integrating OpenClaw (chatbot + scheduled digest use cases) |
MIT License. See LICENSE.