# Add to PATH and reload shell:
export PATH="$HOME/.local/bin:$PATH"
source ~/.bashrc

🛡️ Windows Security Notice

ClaraCore may be flagged by antivirus software - this is a false positive. See the complete guide to antivirus warnings for details on why this is flagged.
# Quick fix - Add Windows Defender exclusion:
Add-MpPreference -ExclusionPath "$env:LOCALAPPDATA\ClaraCore"
# Or unblock the file:
Unblock-File "$env:LOCALAPPDATA\ClaraCore\claracore.exe"
# Or run troubleshooter:
curl -fsSL https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/troubleshoot.ps1 | powershell

✅ Verify it's safe: Check the SHA256 hash against official releases or build from source.
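To check the hash on Windows, PowerShell's built-in `Get-FileHash` is enough. This is a minimal sketch that assumes the default install location; the expected checksum comes from the official releases page:

```powershell
# Compute the local SHA256 and compare it against the published checksum
Get-FileHash "$env:LOCALAPPDATA\ClaraCore\claracore.exe" -Algorithm SHA256
```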
Need help? See our Setup Guide or Container Guide
Auto-setup for llama.cpp - Point it at your GGUF models folder and get a complete AI inference server in seconds.
ClaraCore extends llama-swap with intelligent automation, bringing zero-configuration setup to llama.cpp deployments.
Linux/macOS (Recommended):
curl -fsSL https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.sh | bash

Windows:
irm https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.ps1 | iex

Then start immediately:
claracore --models-folder /path/to/your/gguf/models
# Visit: http://localhost:5800/ui/setup
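Once the server is running, a quick request to the models endpoint confirms the setup worked (ClaraCore listens on port 5800 by default):

```bash
# List the models ClaraCore discovered in your folder
curl http://localhost:5800/v1/models
```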
CUDA (NVIDIA):

docker run -d --gpus all -p 5800:5800 -v ./models:/models claracore:cuda --models-folder /models
ROCm (AMD):

docker run -d --device=/dev/kfd --device=/dev/dri -p 5800:5800 -v ./models:/models claracore:rocm --models-folder /models

📦 Optimized containers: 2-3GB vs 8-12GB full SDKs. See Container Guide
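If a container doesn't come up as expected, standard Docker commands show what happened during startup:

```bash
# Follow the logs of the most recently created container
docker logs -f $(docker ps -lq)
```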
✨ Features include: Auto-setup, hardware detection, binary management, and production configs!
While maintaining 100% compatibility with llama-swap, ClaraCore adds:
- 🎯 Auto-Setup Engine - Automatic GGUF detection and configuration
- 🔍 Smart Hardware Detection - CUDA/ROCm/Vulkan/Metal optimization
- ⬇️ Binary Management - Automatic llama-server downloads
- ⚙️ Intelligent Configs - Production-ready settings out of the box
- 🚀 Speculative Decoding - Automatic draft model pairing
# One command setup - that's it!
./claracore --models-folder /path/to/your/gguf/models --backend vulkan
# ClaraCore will:
# 1. Scan for GGUF files
# 2. Detect your hardware (GPUs, CUDA, etc.)
# 3. Download optimal binaries
# 4. Generate intelligent configuration
# 5. Start serving immediately

Linux/macOS:
curl -fsSL https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.sh | bash

Windows:
irm https://raw.githubusercontent.com/claraverse-space/ClaraCore/main/scripts/install.ps1 | iex

The installer will:
- Download the latest binary for your platform
- Set up configuration files
- Add to system PATH
- Configure auto-start service (systemd/launchd/Windows Service)
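On Linux you can verify the auto-start service with systemd's usual tooling. This sketch assumes the unit is installed as `claracore`, which may not match your setup:

```bash
# Check the service state and follow its logs (unit name is an assumption)
systemctl status claracore
journalctl -u claracore -f
```

Prefer to skip the installer? Download a release binary for your platform instead: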
# Windows
curl -L -o claracore.exe https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-windows-amd64.exe
# Linux
curl -L -o claracore https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-linux-amd64
chmod +x claracore
# macOS Intel
curl -L -o claracore https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-darwin-amd64
chmod +x claracore
# macOS Apple Silicon
curl -L -o claracore https://github.com/claraverse-space/ClaraCore/releases/latest/download/claracore-darwin-arm64
chmod +x claracore
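A manually downloaded binary isn't on your PATH yet. One common approach on Linux/macOS (the destination directory is a suggestion, not a requirement):

```bash
# Move the binary onto your PATH and confirm the shell resolves it
sudo mv claracore /usr/local/bin/
command -v claracore
```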
Build from source:

git clone https://github.com/claraverse-space/ClaraCore.git
cd ClaraCore
python build.py # Builds UI + Go backend with version info
# or: go build -o claracore .

All the power of llama-swap, plus intelligent automation:
- ✅ OpenAI API compatible endpoints
- ✅ Automatic model swapping on-demand
- ✅ Multiple models with `groups`
- ✅ Auto-unload with `ttl` (see the config sketch after these lists)
- ✅ Web UI for monitoring
- ✅ Docker/Podman support
- ✅ Full llama.cpp server control
- ✅ Zero-configuration setup
- ✅ Automatic binary downloads
- ✅ Hardware capability detection
- ✅ Intelligent resource allocation
- ✅ Speculative decoding setup
- ✅ Model metadata parsing
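To make `groups` and `ttl` concrete, here is a minimal sketch in the same YAML format as the generated config shown further below; the model name, path, and values are hypothetical:

```yaml
models:
  "qwen-7b":
    cmd: |
      binaries/llama-server/llama-server
      --model models/qwen-7b-q4.gguf
      --host 127.0.0.1 --port ${PORT}
    proxy: "http://127.0.0.1:${PORT}"
    ttl: 300          # auto-unload after 300 seconds of inactivity

groups:
  "small-models":
    swap: false       # members may stay loaded at the same time
    members: ["qwen-7b"]
```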
# Just point to your models
./claracore --models-folder ~/models

Manual Setup - for devices like Strix Halo, or for anyone who wants to customize the setup without relying on auto-detection:
# Create a config file
./claracore -ram 64 -vram 24 -backend cuda
# ClaraCore will download the binaries and create a config file automatically; the UI has all the features needed to manage models
# List models
curl http://localhost:5800/v1/models
# Chat completion
curl http://localhost:5800/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{
"model": "llama-3-8b",
"messages": [{"role": "user", "content": "Hello!"}]
}'
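Because the endpoints follow the OpenAI API, standard request options work unchanged; for example, the same chat request with streaming enabled (a sketch that relies only on OpenAI compatibility):

```bash
# Same chat request, streamed as it is generated
curl http://localhost:5800/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-3-8b",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": true
  }'
```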
ClaraCore generates configurations automatically, but you can customize everything:

# Auto-generated by ClaraCore
models:
"llama-3-70b":
cmd: |
binaries/llama-server/llama-server
--model models/llama-3-70b-q4.gguf
--host 127.0.0.1 --port ${PORT}
--flash-attn auto -ngl 99
draft: "llama-3-8b" # Automatic speculative decoding
proxy: "http://127.0.0.1:${PORT}"
groups:
"large-models":
swap: true
exclusive: true
members: ["llama-3-70b", "qwen-72b"]- Complete API Documentation - Full API reference with examples
- Quick API Reference - Concise API overview
- OpenAI-Compatible Endpoints: `/v1/chat/completions`, `/v1/embeddings`, `/v1/models`
- Configuration Management: `/api/config/*` - Manage models and settings
- Model Downloads: `/api/models/download` - Download from Hugging Face
- System Detection: `/api/system/detection` - Hardware and backend detection
- Real-time Events: `/api/events` - SSE stream for live updates
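The events endpoint can be watched directly from a terminal; curl's `-N` flag disables buffering so SSE messages appear as they arrive:

```bash
# Stream live server events
curl -N http://localhost:5800/api/events
```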
# Get all available models
curl http://localhost:5800/v1/models
# Update model parameters with restart prompt
curl -X POST http://localhost:5800/api/config/model/llama-3-8b \
-H "Content-Type: application/json" \
-d '{"temperature": 0.8, "max_tokens": 1024}'
# Smart configuration regeneration
curl -X POST http://localhost:5800/api/config/regenerate-from-db \
-H "Content-Type: application/json" \
-d '{"options": {"forceBackend": "vulkan", "preferredContext": 8192}}'
# Monitor setup progress
curl http://localhost:5800/api/setup/progress

- Setup Wizard: `http://localhost:5800/ui/setup` - Initial configuration
- Model Management: `http://localhost:5800/ui/models` - Chat and model controls
- Configuration: `http://localhost:5800/ui/configuration` - Edit settings
- Downloads: `http://localhost:5800/ui/downloads` - Model download manager
ClaraCore is built on llama-swap by @mostlygeek
This project extends llama-swap's excellent proxy architecture with automatic setup capabilities. Approximately 90% of the core functionality comes from the original llama-swap codebase. We're deeply grateful for @mostlygeek's work in creating such a solid foundation.
- @mostlygeek - Creator of llama-swap
- llama.cpp team - The inference engine
- Georgi Gerganov - Creator of llama.cpp
We welcome contributions, whether it's bug fixes, new features, or documentation improvements.
- Fork the repository
- Create your feature branch (`git checkout -b feature/amazing-feature`)
- Commit your changes (`git commit -m 'Add amazing feature'`)
- Push to the branch (`git push origin feature/amazing-feature`)
- Open a Pull Request
For maintainers creating releases:
# Quick release (interactive)
./release.bat # Windows
./release.sh # Linux/macOS
# Manual release
python release.py --version v0.1.1 --token-file .github_token
# Draft release for testing
python release.py --version v0.1.1 --token-file .github_token --draft

See RELEASE_MANAGEMENT.md for detailed release procedures.
MIT License - Same as llama-swap. See LICENSE for details.
Built with ❤️ by the community, for the community
Standing on the shoulders of giants