ClaraCore Banner

ClaraCore 🚀

License: MIT

💡 PATH Note (Linux/macOS): If the claracore command isn't found after installation, add the install directory to your PATH and reload your shell:

# Add to PATH and reload shell:
export PATH="$HOME/.local/bin:$PATH"
source ~/.bashrc

🛡️ Windows Security Notice: ClaraCore may be flagged by antivirus software; this is a false positive. See: Complete guide to antivirus warnings | Why is this flagged?

# Quick fix - add a Windows Defender exclusion (requires an elevated PowerShell):
Add-MpPreference -ExclusionPath "$env:LOCALAPPDATA\ClaraCore"

# Or unblock the file:
Unblock-File "$env:LOCALAPPDATA\ClaraCore\claracore.exe"

# Or run troubleshooter:
curl -fsSL https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/troubleshoot.ps1 | powershell

✅ Verify it's safe: Check SHA256 hash against official releases or build from source.
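
For example, you can compute the hash of a downloaded binary locally and compare it against the checksum published on the releases page (on Windows, PowerShell's Get-FileHash does the same):

sha256sum claracore          # Linux (use the filename you downloaded)
shasum -a 256 claracore      # macOS
# Compare the output against the checksum listed on the GitHub releases page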

Need help? See our Setup Guide or Container Guide



Auto-setup for llama.cpp - Point it at your GGUF models folder and get a complete AI inference server in seconds.

ClaraCore extends llama-swap with intelligent automation, bringing zero-configuration setup to llama.cpp deployments.

🔥 Quick Install

Native Installation

Linux/macOS (Recommended):

curl -fsSL https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.sh | bash

Windows:

irm https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.ps1 | iex

Then start immediately:

claracore --models-folder /path/to/your/gguf/models
# Visit: http://localhost:5800/ui/setup

🐳 Docker (CUDA/ROCm)

CUDA (NVIDIA):

docker run -d --gpus all -p 5800:5800 -v ./models:/models claracore:cuda --models-folder /models

ROCm (AMD):

docker run -d --device=/dev/kfd --device=/dev/dri -p 5800:5800 -v ./models:/models claracore:rocm --models-folder /models

📦 Optimized containers: 2-3 GB instead of the 8-12 GB full SDK images. See the Container Guide
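
For unattended deployments, the same CUDA container can run as a Compose service. A minimal docker-compose.yml sketch, reusing the image and flags from the example above:

# docker-compose.yml (sketch; image name and flags taken from the CUDA example above)
services:
  claracore:
    image: claracore:cuda
    ports:
      - "5800:5800"
    volumes:
      - ./models:/models
    command: ["--models-folder", "/models"]
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]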

Features include: Auto-setup, hardware detection, binary management, and production configs!

✨ What's New in ClaraCore

While maintaining 100% compatibility with llama-swap, ClaraCore adds:

  • 🎯 Auto-Setup Engine - Automatic GGUF detection and configuration
  • 🔍 Smart Hardware Detection - CUDA/ROCm/Vulkan/Metal optimization
  • ⬇️ Binary Management - Automatic llama-server downloads
  • ⚙️ Intelligent Configs - Production-ready settings out of the box
  • 🚀 Speculative Decoding - Automatic draft model pairing

🚀 Quick Start

# One command setup - that's it!
./claracore --models-folder /path/to/your/gguf/models --backend vulkan

# ClaraCore will:
# 1. Scan for GGUF files
# 2. Detect your hardware (GPUs, CUDA, etc.)
# 3. Download optimal binaries
# 4. Generate intelligent configuration
# 5. Start serving immediately

📦 Installation

Automated Installer (Recommended)

Linux/macOS:

curl -fsSL https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.sh | bash

Windows:

irm https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.ps1 | iex

The installer will:

  • Download the latest binary for your platform
  • Set up configuration files
  • Add to system PATH
  • Configure auto-start service (systemd/launchd/Windows Service)
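
Once the installer completes, you can confirm the background service is running. A quick check on Linux, assuming the systemd unit is named claracore:

# Check the auto-start service (unit name "claracore" is an assumption)
systemctl status claracore
# Follow its logs
journalctl -u claracore -f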

Manual Download

# Windows
curl -L -o claracore.exe https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-windows-amd64.exe

# Linux
curl -L -o claracore https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-linux-amd64
chmod +x claracore

# macOS Intel
curl -L -o claracore https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-darwin-amd64
chmod +x claracore

# macOS Apple Silicon
curl -L -o claracore https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-darwin-arm64
chmod +x claracore

Build from Source

git clone https://github.com/claraverse-space/ClaraCore.git
cd ClaraCore
python build.py  # Builds UI + Go backend with version info
# or: go build -o claracore .

🎛️ Core Features

All the power of llama-swap, plus intelligent automation:

From llama-swap (unchanged)

  • ✅ OpenAI API compatible endpoints
  • ✅ Automatic model swapping on-demand
  • ✅ Multiple models with groups
  • ✅ Auto-unload with ttl
  • ✅ Web UI for monitoring
  • ✅ Docker/Podman support
  • ✅ Full llama.cpp server control

ClaraCore Enhancements

  • ✅ Zero-configuration setup
  • ✅ Automatic binary downloads
  • ✅ Hardware capability detection
  • ✅ Intelligent resource allocation
  • ✅ Speculative decoding setup
  • ✅ Model metadata parsing

📖 Usage Examples

Automatic Setup

# Just point to your models
./claracore --models-folder ~/models

Manual Setup - for devices such as AMD Strix Halo, or for anyone who wants to customize the setup without relying on auto-detection

# Create a config file manually
./claracore -ram 64 -vram 24 -backend cuda
# ClaraCore downloads the binaries and generates a config automatically;
# the web UI then provides everything needed to manage models

API Usage

# List models
curl http://localhost:5800/v1/models

# Chat completion
curl http://localhost:5800/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
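
Because the endpoints are OpenAI-compatible, streaming also works with the standard stream flag:

# Streaming chat completion (tokens arrive as a server-sent event stream)
curl http://localhost:5800/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b",
    "stream": true,
    "messages": [{"role": "user", "content": "Hello!"}]
  }'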

🔧 Configuration

ClaraCore generates configurations automatically, but you can customize everything:

# Auto-generated by ClaraCore
models:
  "llama-3-70b":
    cmd: |
      binaries/llama-server/llama-server
      --model models/llama-3-70b-q4.gguf
      --host 127.0.0.1 --port ${PORT}
      --flash-attn auto -ngl 99
    draft: "llama-3-8b"  # Automatic speculative decoding
    proxy: "http://127.0.0.1:${PORT}"
    
groups:
  "large-models":
    swap: true
    exclusive: true
    members: ["llama-3-70b", "qwen-72b"]

📚 Documentation

API Reference

Key Features

  • OpenAI-Compatible Endpoints: /v1/chat/completions, /v1/embeddings, /v1/models
  • Configuration Management: /api/config/* - Manage models and settings
  • Model Downloads: /api/models/download - Download from Hugging Face
  • System Detection: /api/system/detection - Hardware and backend detection
  • Real-time Events: /api/events - SSE stream for live updates
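
For example, the event stream can be followed directly from a terminal:

# Follow live server events (SSE); -N disables curl's output buffering
curl -N http://localhost:5800/api/events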

Common Operations

# Get all available models
curl http://localhost:5800/v1/models

# Update model parameters with restart prompt
curl -X POST http://localhost:5800/api/config/model/llama-3-8b \
  -H "Content-Type: application/json" \
  -d '{"temperature": 0.8, "max_tokens": 1024}'

# Smart configuration regeneration
curl -X POST http://localhost:5800/api/config/regenerate-from-db \
  -H "Content-Type: application/json" \
  -d '{"options": {"forceBackend": "vulkan", "preferredContext": 8192}}'

# Monitor setup progress
curl http://localhost:5800/api/setup/progress

Web Interface

  • Setup Wizard: http://localhost:5800/ui/setup - Initial configuration
  • Model Management: http://localhost:5800/ui/models - Chat and model controls
  • Configuration: http://localhost:5800/ui/configuration - Edit settings
  • Downloads: http://localhost:5800/ui/downloads - Model download manager

🙏 Credits & Acknowledgments

ClaraCore is built on llama-swap by @mostlygeek

This project extends llama-swap's excellent proxy architecture with automatic setup capabilities. Approximately 90% of the core functionality comes from the original llama-swap codebase. We're deeply grateful for @mostlygeek's work in creating such a solid foundation.


🤝 Contributing

We welcome contributions, whether it's bug fixes, new features, or documentation improvements.

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

🏷️ Release Management

For maintainers creating releases:

# Quick release (interactive)
./release.bat        # Windows
./release.sh         # Linux/macOS

# Manual release
python release.py --version v0.1.1 --token-file .github_token

# Draft release for testing
python release.py --version v0.1.1 --token-file .github_token --draft

See RELEASE_MANAGEMENT.md for detailed release procedures.

📄 License

MIT License - Same as llama-swap. See LICENSE for details.



Built with ❤️ by the community, for the community

Standing on the shoulders of giants
