WakeLLM

WakeLLM bridges a local always-on Linux machine with an ephemeral cloud GPU pod on RunPod. It resumes the remote pod on demand, establishes a local SSH port-forwarding tunnel, and shuts the pod down automatically when it is no longer in use, keeping compute costs proportional to actual usage.

It runs on any Linux host with Docker: a Raspberry Pi, a home server, a VPS, or a workstation. The local machine acts as the always-on control plane; the GPU compute lives entirely in the cloud.

How It Works

A local cron job, CLI command, or HTTP request triggers WakeLLM.
WakeLLM sends a resume mutation to the RunPod API.
Once the pod reports ready, WakeLLM opens an SSH tunnel, forwarding configured remote ports to localhost.
Local services connect to the remote Ollama or Open WebUI as if they were running natively.
The idle monitor detects inactivity and tears down the pod automatically.
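The resume and tunnel steps above can be sketched roughly as follows. This is a minimal illustration, not WakeLLM's actual code: the helper names `resume_payload` and `tunnel_command`, the `root` login user, and the example ports are assumptions, and the `podResume` field names reflect our reading of the RunPod GraphQL API and should be checked against the current RunPod documentation. The SSH flags (`-N`, `-L`, `-i`, `-p`, `-o ExitOnForwardFailure=yes`, `-o ServerAliveInterval=30`) are standard OpenSSH options.

```python
import json


def resume_payload(pod_id: str, gpu_count: int = 1) -> str:
    """Build a JSON request body for a podResume GraphQL mutation.

    The mutation shape is an assumption based on RunPod's public docs.
    """
    query = (
        "mutation { podResume(input: {podId: %s, gpuCount: %d}) "
        "{ id desiredStatus } }" % (json.dumps(pod_id), gpu_count)
    )
    return json.dumps({"query": query})


def tunnel_command(host, ssh_port, key_path, forwards, user="root"):
    """Return an argv list for an OpenSSH local port-forwarding tunnel.

    `forwards` is a list of (local_port, remote_port) pairs. The default
    user "root" is hypothetical; use whatever account the pod exposes.
    """
    cmd = [
        "ssh",
        "-N",                               # no remote command, tunnel only
        "-i", key_path,                     # private key registered with the pod
        "-p", str(ssh_port),
        "-o", "ExitOnForwardFailure=yes",   # fail fast if a port cannot bind
        "-o", "ServerAliveInterval=30",     # notice a dead connection
    ]
    for local_port, remote_port in forwards:
        cmd += ["-L", f"{local_port}:localhost:{remote_port}"]
    cmd.append(f"{user}@{host}")
    return cmd


if __name__ == "__main__":
    # 11434 is Ollama's default port; 8080 stands in for a web UI port.
    argv = tunnel_command("203.0.113.5", 22022, "/keys/id_ed25519",
                          [(11434, 11434), (8080, 8080)])
    print(" ".join(argv))
```

Returning an argv list rather than a shell string lets a supervisor pass it straight to `subprocess.Popen` and watch the process exit code, which is one plausible way to detect a crashed tunnel.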

Key Features

Ephemeral compute, persistent local state. Agent memory, databases, and credentials stay on the local machine. The cloud is used only for computation.
SSH port forwarding. Uses native OpenSSH to bind remote ports (Ollama, Open WebUI, etc.) to localhost; no extra tooling is required.
Idle auto-kill. Polls Ollama's /api/ps endpoint and shuts the pod down when no model has been loaded for a configurable idle period.
Hard uptime cap. Unconditional shutdown after a configurable total runtime, regardless of activity.
Billing fail-safes. Pod start timeout, tunnel crash detection, and exception-triggered shutdown all call podStop before exiting.
Local HTTP API. POST /wake and GET /status endpoints for programmatic control and status polling.
Container-first. Runs as a Docker container. A startup gate runs unit tests and Trivy security scans before launching the application.
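The idle auto-kill and hard uptime cap described above could be combined along these lines. This is a sketch under stated assumptions, not the project's implementation: `IdlePolicy`, `shutdown_reason`, and `loaded_model_count` are hypothetical names, and the real configuration keys are documented in docs/configuration.md. The only external fact relied on is that Ollama's `/api/ps` endpoint returns a JSON object with a `models` list of currently loaded models.

```python
import json
import urllib.request
from dataclasses import dataclass
from typing import Optional


def loaded_model_count(base_url: str = "http://localhost:11434",
                       timeout: float = 5.0) -> int:
    """Ask Ollama (through the tunnel) how many models are loaded."""
    with urllib.request.urlopen(f"{base_url}/api/ps", timeout=timeout) as resp:
        data = json.load(resp)
    return len(data.get("models") or [])


@dataclass
class IdlePolicy:
    idle_limit_s: float   # shut down after this long with no model loaded
    max_uptime_s: float   # unconditional cap on total runtime


def shutdown_reason(policy: IdlePolicy, started_at: float,
                    last_active_at: float, now: float) -> Optional[str]:
    """Return why the pod should stop now, or None to keep it running.

    The uptime cap is checked first so it applies regardless of activity.
    """
    if now - started_at >= policy.max_uptime_s:
        return "max_uptime"
    if now - last_active_at >= policy.idle_limit_s:
        return "idle"
    return None
```

A polling loop would call `loaded_model_count()` at a fixed interval, update `last_active_at` whenever the count is non-zero, and stop the pod as soon as `shutdown_reason` returns a value. Keeping the decision in a pure function makes it easy to unit test without a live pod.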

Quick Start

# Copy and fill in your config
cp env/config.env.example env/config.env
# edit env/config.env

chmod +x start-wake.sh
./start-wake.sh

start-wake.sh builds the image, runs unit tests and Trivy security scans in ephemeral containers, then starts WakeLLM. All checks must pass before the application starts.

Prerequisites

Docker
A RunPod account and API key
A RunPod pod (not serverless) with sshd running and an SSH key registered
An SSH private key corresponding to the key registered in the pod

Documentation

Document	Description
docs/architecture.md	Component map, state machine, lifecycle flow, threading model
docs/configuration.md	All configuration keys — environment variable reference
docs/api.md	HTTP API reference: POST /wake, GET /status
docs/deployment.md	Docker build and run instructions, expected startup output
docs/development.md	Test structure, how to add tests, design constraints
docs/openclaw.md	Integrating OpenClaw (chatbot + scheduled digest use cases)
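As a rough illustration of the POST /wake and GET /status endpoints listed above, a local script might call them with the standard library as shown here. The base URL `http://localhost:8000` is a placeholder, since the actual host and port come from your configuration, and the response format is not assumed; see docs/api.md for the authoritative reference.

```python
import urllib.request

BASE_URL = "http://localhost:8000"  # hypothetical; set to your WakeLLM address


def wake_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a POST /wake request with an empty body."""
    return urllib.request.Request(f"{base_url}/wake", data=b"", method="POST")


def status_request(base_url: str = BASE_URL) -> urllib.request.Request:
    """Build a GET /status request."""
    return urllib.request.Request(f"{base_url}/status", method="GET")


def send(req: urllib.request.Request, timeout: float = 10.0):
    """Send a request and return (HTTP status code, response body as text)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(send(wake_request()))
    print(send(status_request()))
```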

License

MIT License. See LICENSE.
