Tired of complex AI setups? llama.ui is an open-source desktop application that provides a beautiful, user-friendly interface for interacting with large language models (LLMs) powered by llama.cpp. Designed for simplicity and privacy, this project lets you chat with powerful quantized models on your local machine - no cloud required!
This repository is a fork of llama.cpp WebUI with:
- Fresh new styles
- Extra functionality
- Smoother experience
- Multi-Provider Support: Works with llama.cpp, LM Studio, Ollama, vLLM, OpenAI, and many more!
- Conversation Management:
  - IndexedDB storage for conversations
  - Branching conversation support (edit messages while preserving history)
  - Import/export functionality
- Rich UI Components:
  - Markdown rendering with syntax highlighting
  - LaTeX math support
  - File attachments (text, images, PDFs)
  - Theme customization with DaisyUI themes
  - Responsive design for mobile and desktop
- Advanced Features:
  - PWA support with offline capabilities
  - Streaming responses with Server-Sent Events (see the sketch after this list)
  - Customizable generation parameters
  - Performance metrics display
- Privacy Focused: All data is stored locally in your browser - no cloud required!
- Localized Interface: The most popular language packs are included in the app, and you can switch languages at any time.
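The providers above all speak an OpenAI-compatible HTTP API, and streamed replies arrive as Server-Sent Events. As a rough illustration (not llama.ui's internal code), a minimal TypeScript client against a local llama.cpp server could look like this; the port and endpoint follow llama.cpp's defaults:

```ts
// Rough sketch (not llama.ui's internal code): stream a chat completion from
// an OpenAI-compatible endpoint such as llama.cpp's llama-server.
const baseUrl = 'http://localhost:8080'; // assumption: default llama.cpp port

async function streamChat(prompt: string): Promise<void> {
  const res = await fetch(`${baseUrl}/v1/chat/completions`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({
      messages: [{ role: 'user', content: prompt }],
      stream: true, // ask the server to reply with Server-Sent Events
    }),
  });
  if (!res.ok || !res.body) throw new Error(`HTTP ${res.status}`);

  const reader = res.body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    const lines = buffer.split('\n');
    buffer = lines.pop() ?? ''; // keep any partial SSE line for the next chunk
    for (const line of lines) {
      if (!line.startsWith('data: ') || line.includes('[DONE]')) continue;
      const delta = JSON.parse(line.slice(6)).choices?.[0]?.delta?.content;
      if (delta) process.stdout.write(delta); // print tokens as they arrive
    }
  }
}

streamChat('Hello!').catch(console.error);
```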
- Open our hosted UI instance
- Click the gear icon → General settings
- Set "Base URL" to your local llama.cpp server (e.g. http://localhost:8080) - an optional reachability check is sketched below
- Start chatting with your AI!
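If the UI can't connect, a quick way to confirm the Base URL answers is a one-off script like the following (not part of llama.ui; /health is llama.cpp's endpoint, other providers may expose different routes):

```ts
// Optional, outside the UI: confirm the Base URL answers before chatting.
// /health is llama.cpp's endpoint; other providers may expose different routes.
const baseUrl = 'http://localhost:8080';

fetch(`${baseUrl}/health`)
  .then((res) => console.log(res.ok ? 'Server is up' : `Server replied ${res.status}`))
  .catch((err) => console.error('Cannot reach the server:', err));
```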
Need HTTPS magic for your local instance? Try this mitmproxy hack!
Uh-oh! Browsers block HTTP requests from HTTPS sites. Since llama.cpp uses HTTP, we need a bridge. Enter mitmproxy - our traffic wizard!
Local setup:
mitmdump -p 8443 --mode reverse:http://localhost:8080/
Docker quickstart:
docker run -it -p 8443:8443 mitmproxy/mitmproxy mitmdump -p 8443 --mode reverse:http://localhost:8080/
Pro-tip with Docker Compose:
services:
  mitmproxy:
    container_name: mitmproxy
    image: mitmproxy/mitmproxy:latest
    ports:
      - '8443:8443' # Port magic happening here!
    command: mitmdump -p 8443 --mode reverse:http://localhost:8080/
    # ... (other config)
Certificate Tango Time!
- Visit https://localhost:8443
- Click "Trust this certificate" π€
- Restart π¦ llama.ui page π
- Profit! πΈ
Voilà! You've hacked the HTTPS barrier!
- Grab the latest release from our releases page
- Unpack the archive (feel that excitement!)
- Fire up your llama.cpp server:
Linux/macOS:
./llama-server --host 0.0.0.0 \
--port 8080 \
--path "/path/to/llama.ui" \
-m models/llama-2-7b.Q4_0.gguf \
--ctx-size 4096
Windows:
llama-server ^
--host 0.0.0.0 ^
--port 8080 ^
--path "C:\path\to\llama.ui" ^
-m models\mistral-7b.Q4_K_M.gguf ^
--ctx-size 4096
- Visit http://localhost:8080 and meet your new AI buddy!
We're building something special together!
- PRs are welcome! (Seriously, we high-five every contribution!)
- Bug squashing? Yes please!
- Documentation heroes needed!
- Make magic with your commits! (Follow Conventional Commits - see the examples below.)
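New to Conventional Commits? Hypothetical messages in that style look like this (the scopes and descriptions are made up for illustration):

```text
feat(chat): add a model picker to the header
fix(storage): keep branch history when importing conversations
docs: clarify the HTTPS proxy setup
```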
Prerequisites:
- macOS/Windows/Linux
- Node.js >= 22
- A local llama.cpp server humming along
Build the future:
npm ci          # Grab dependencies
npm run build   # Craft the magic
npm start       # Launch dev server (http://localhost:5173) for live-coding bliss!

Planning to redistribute the app with opinionated settings out of the box? Any JSON under src/config is baked into immutable defaults at build time (see src/config/index.ts). If those baked defaults include a non-empty baseUrl, the inference server will auto-sync on first load, so model metadata is fetched without requiring manual input.
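For example, a redistributor could drop a file such as src/config/defaults.json into the tree (the file name is arbitrary - any JSON under src/config is picked up - and only baseUrl is documented above):

```json
{
  "baseUrl": "http://localhost:8080"
}
```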
- Frontend: React with TypeScript
- Styling: Tailwind CSS + DaisyUI
- State Management: React Context API
- Routing: React Router
- Storage: IndexedDB via Dexie.js
- Build Tool: Vite
- App Context: Manages global configuration and settings
- Inference Context: Handles API communication with inference providers
- Message Context: Manages conversation state and message generation
- Storage Utils: IndexedDB operations and localStorage management (see the sketch after this list)
- Inference API: HTTP client for communicating with inference servers
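To give a feel for the storage layer, here is a minimal Dexie.js sketch of persisting conversations in IndexedDB. The database name, table name, and record shape are hypothetical, not llama.ui's actual schema:

```ts
import Dexie, { type Table } from 'dexie';

// Hypothetical record shape - llama.ui's real schema may differ.
interface Conversation {
  id?: number;
  title: string;
  createdAt: number;
}

class ExampleDb extends Dexie {
  conversations!: Table<Conversation, number>;
  constructor() {
    super('example-db');
    // '++id' = auto-incremented primary key; 'createdAt' is a queryable index
    this.version(1).stores({ conversations: '++id, createdAt' });
  }
}

const db = new ExampleDb();
await db.conversations.add({ title: 'First chat', createdAt: Date.now() });
```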
llama.ui is proudly MIT licensed - go build amazing things! See LICENSE for details.
Made with ❤️ and ☕ by humans who believe in private AI